If you have any comments, questions, or general-purpose harassment you would like to give us about this book, then please use the comment form at the bottom of each page! We promise that we will try to incorporate any feedback you give (minus the profanity, of course), respond to your questions, and credit you appropriately. I certainly hope that readers have not made the cheap joke that this is the first chapter with even a bit of content!
Although Nuxeo EP 5 is described as an "Enterprise Content Management system," none of our lessons to this point have dealt at all with the content of documents! We have shown you how to create a new document type, associate schemas with it, handle events, control access, and so on, but have not discussed how to read even one byte of content! Why? There are two reasons. The first, and probably less important, reason we have favored discussing properties rather than content is that most developers are already familiar with reading bytes from a typical filesystem and manipulating them programmatically. Thus, the features of Nuxeo that involve manipulating a document's metadata and presentation were likely to be more interesting to the reader. The second, and probably more important, reason we have favored discussing Nuxeo's property features over content is simple: content is a property.
One of the supplied schemas in Nuxeo is file. This schema defines a filename property that is used, by convention, to store the filename of a file that has been imported into Nuxeo. The Nuxeo web UI follows this convention anytime you have the opportunity to "upload" a file. If you were to write a program that does a bulk import of files into Nuxeo, for example, you should follow this convention too. Also by convention, there is a property file:content that holds the bytes of a file that has been imported into Nuxeo. This is how "content" gets turned into a property. There is no rule that says that content must be stored on this property, but if you follow this convention your documents will work nicely with the web UI of Nuxeo. Finally, it should be clear by now that you may have multiple properties holding "the content" on a Nuxeo document. There is no notion of a distinguished property that always has "the sole content" of a document. You might, for example, have a document with multiple translations whose content lives on properties like version_english and version_française.
The type of the data stored in a content property such as file:content is a Blob. A Blob represents a large collection of unstructured bytes, so it fits the idea of content well. Blobs give Nuxeo's infrastructure a place to introduce various types of optimizations. The number and complexity of these is beyond the scope of this book, but a couple of examples may be helpful. If you have a large amount of content, Nuxeo will not load the content from the Blob until it is actually needed (fetched via an InputStream), even if you read the property's "value." Another example: if your Nuxeo installation is configured to have multiple servers, with the web front-end on one server and the content repository on another, Nuxeo will use a Blob that understands how to fetch data efficiently from the back-end to the front-end. All of this is hidden from you as an application developer; you simply manipulate Blob objects.
When you want to create content by means other than using the Web UI - such as in a test - there are a number of Blob implementations that will make your life easier. You can use the StringBlob to create a Blob of content from a Java String. In the tests for this lesson, we use the StringBlob to create content for a text file. Our tests for this lesson also use the FileBlob to create content from an existing file on the local filesystem.
Here is a listing of the key parts of the public API for a Blob. If you are familiar with Java's IO interfaces, much of this will be familiar to you.
public interface Blob {

    // how many bytes does the blob hold, if known
    long getLength();

    // cache the mime type
    String getMimeType();
    void setMimeType(String mimeType);

    // cache the digest (such as MD5 hash)
    String getDigest();
    void setDigest(String digest);

    // cache the originating filename
    String getFilename();
    void setFilename(String filename);

    // read content in various formats
    InputStream getStream() throws IOException;
    Reader getReader() throws IOException;
    byte[] getByteArray() throws IOException;
    String getString() throws IOException;

    // bulk transfer the entire blob content to various types of output
    void transferTo(OutputStream out) throws IOException;
    void transferTo(Writer out) throws IOException;
    void transferTo(File file) throws IOException;
}
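To make the shape of this API concrete, here is a minimal sketch of an in-memory blob written against plain java.io. The InMemoryBlob class is hypothetical and covers only a handful of the methods listed above; it illustrates the idea behind a string-backed blob and is not Nuxeo's StringBlob implementation.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.Charset;

// Hypothetical, trimmed-down stand-in for a string-backed blob (not a Nuxeo class).
final class InMemoryBlob {

    private static final Charset UTF8 = Charset.forName("UTF-8");

    private final byte[] bytes;
    private final String mimeType;

    InMemoryBlob(String text, String mimeType) {
        this.bytes = text.getBytes(UTF8);
        this.mimeType = mimeType;
    }

    // Number of bytes held by the blob.
    long getLength() {
        return bytes.length;
    }

    String getMimeType() {
        return mimeType;
    }

    // Each call returns a fresh stream positioned at the first byte.
    InputStream getStream() {
        return new ByteArrayInputStream(bytes);
    }

    String getString() {
        return new String(bytes, UTF8);
    }

    // Copy the whole content to the given output in fixed-size chunks.
    void transferTo(OutputStream out) throws IOException {
        InputStream in = getStream();
        byte[] buffer = new byte[4096];
        int n;
        while ((n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n);
        }
    }

    public static void main(String[] args) throws IOException {
        InMemoryBlob blob = new InMemoryBlob("Hello, Nuxeo", "text/plain");
        blob.transferTo(System.out);
    }
}
```

The point of the sketch is that callers only ever see streams, strings, and a length; where the bytes really live is the implementation's business, which is what lets a real implementation defer or relocate the loading.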
We have brought up the subject of content because in this lesson we will be creating relations between documents (see below) if and only if the documents are photos, and we decide whether a document is a photo by looking at the mime type of its content. A document's "mime type" is a string like "image/jpeg". The part before the slash indicates the main, or content, type of the document, and the part after the slash gives more specific information about the format of the content, or its subtype. For example, the MP3 files that contain your music typically have the mime type "audio/mp3" (the officially registered name is "audio/mpeg") and web pages are normally written using "text/html". When we do not know, or can't figure out, a document's mime type we use the mime type "application/octet" (the officially registered name is "application/octet-stream"), which should be interpreted as "bunch of bytes we don't know the format of." A complete list of mime types is hard to write because various programs and developers are constantly creating new ones; however, the Internet Assigned Numbers Authority (IANA) maintains a registry of the registered types.
Although some systems use the filename to determine the mime type of an object with simple rules, such as "all files that end with .mp3 have mime type audio/mp3", a more reliable method is to actually interrogate the content itself. There are various libraries available that know where to probe a file to see what mime type it is. These libraries work by knowing things of the form "files of type audio/mp3 always have a byte with content 0 at the 421st position and a byte with content 255 at the 97th position in the file."
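The idea of probing the content can be sketched in a few lines of plain Java. The byte signatures below are the well-known leading bytes of JPEG, PNG, and GIF files; the MagicSniffer class and its method names are hypothetical, and real detection libraries know many more formats and look at more than the first few bytes.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration only; not part of Nuxeo.
final class MagicSniffer {

    // Well-known leading bytes ("magic numbers") of three image formats.
    private static final Map<String, byte[]> SIGNATURES =
            new LinkedHashMap<String, byte[]>();
    static {
        SIGNATURES.put("image/jpeg",
                new byte[] {(byte) 0xFF, (byte) 0xD8, (byte) 0xFF});
        SIGNATURES.put("image/png",
                new byte[] {(byte) 0x89, 'P', 'N', 'G', 0x0D, 0x0A, 0x1A, 0x0A});
        SIGNATURES.put("image/gif", new byte[] {'G', 'I', 'F', '8'});
    }

    // Returns the first mime type whose signature prefixes the content,
    // or defaultType when nothing matches (or the content is null).
    static String sniff(byte[] content, String defaultType) {
        for (Map.Entry<String, byte[]> entry : SIGNATURES.entrySet()) {
            if (startsWith(content, entry.getValue())) {
                return entry.getKey();
            }
        }
        return defaultType;
    }

    private static boolean startsWith(byte[] content, byte[] prefix) {
        if (content == null || content.length < prefix.length) {
            return false;
        }
        for (int i = 0; i < prefix.length; i++) {
            if (content[i] != prefix[i]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        byte[] gifHeader = {'G', 'I', 'F', '8', '9', 'a'};
        System.out.println(sniff(gifHeader, "application/octet-stream"));
    }
}
```

Note that the filename never enters the decision: a JPEG renamed to "notes.txt" would still be recognized from its first three bytes.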
Nuxeo, naturally, provides a service that implements this functionality, the MimetypeRegistry. You access this service in the normal way, with Framework.getService(MimetypeRegistry.class). Once you have a reference to the registry, a block of code like the following can be used to determine (or at least try to determine) the mime type of a DocumentModel that you expect has the file schema associated with it:
protected String mimeTypeOfFileDocument(CoreSession session,
        DocumentModel documentModel) throws Exception {
    String filename = (String) documentModel.getProperty("file", "filename");
    if (filename == null) {
        return "application/octet";
    }
    MimetypeRegistry registry = Framework.getService(MimetypeRegistry.class);
    // this just gets the blob from the default place that the UI puts it
    Blob content = (Blob) documentModel.getProperty("file", "content");
    if (content == null) {
        return "application/octet";
    }
    String type = registry.getMimetypeFromStreamWithDefault(
            content.getStream(), "application/octet");
    log.info("Found mime type of " + documentModel.getPathAsString()
            + " is " + type);
    return type;
}
The previous two topics, content and mime types, have been brought in so that we can discuss photos about events. In particular, we are going to improve our code for handling the event that a new document has been created. The goal is to allow members of our special group of "social directors" to add photos about events; they are the type of people who would do this! These photos can be taken before the event, as a preview to encourage participation in the upcoming event, or they can be taken afterwards, in which case they are considered photos taken at the event. Here, more specifically, is the set of rules we would like to encode.
Create a relation between a new document D and an upcoming event if:
The owner of D is a member of the "socialButterflies" group OR the currently logged-in user is a member of that group AND
D and the upcoming event are in the same directory (i.e. have the same parent) in the repository AND
The mime type of D is a recognized image format (like JPEG or PNG)
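The rules above can be written down as a single boolean predicate. This is a minimal sketch in plain Java with hypothetical names; it assumes the OR applies only to the two group-membership checks, and the set of "recognized" image types is chosen purely for illustration. The lesson's own helper class may organize these checks differently.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical encoding of the three rules; not the lesson's helper class.
final class PhotoRelationRule {

    // Assumption: which mime types count as "recognized image formats".
    static final Set<String> IMAGE_TYPES = new HashSet<String>(
            Arrays.asList("image/jpeg", "image/png", "image/gif"));

    static boolean shouldRelate(boolean ownerInGroup,
                                boolean currentUserInGroup,
                                String documentParentPath,
                                String eventParentPath,
                                String mimeType) {
        // Rule 1: either the owner or the logged-in user is a social butterfly.
        boolean rightUser = ownerInGroup || currentUserInGroup;
        // Rule 2: both documents live in the same parent folder.
        boolean sameParent = documentParentPath != null
                && documentParentPath.equals(eventParentPath);
        // Rule 3: the content is a recognized image format.
        boolean isImage = mimeType != null && IMAGE_TYPES.contains(mimeType);
        return rightUser && sameParent && isImage;
    }

    public static void main(String[] args) {
        System.out.println(shouldRelate(true, false,
                "/workspaces/events", "/workspaces/events", "image/jpeg"));
    }
}
```

Keeping each rule in its own named boolean makes it easy to log which condition rejected a document, which is handy when a photo unexpectedly fails to get its relation.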
A relation in Nuxeo is a triple, that is, a tuple of three items. This triple is usually written as (SUBJECT, VERB, OBJECT). These relations allow two documents, the SUBJECT and the OBJECT, that would otherwise be considered quite distinct to show they are, well... "related" by the VERB. If the verb were something like "is translation of" then you can assume that the translation is the subject document and the original content is the object document. You can create these relations by hand using the Nuxeo EP 5 web UI with the "Relations" tab that is shown when you examine a document in detail, as is highlighted in this screen capture:

You can see in the darker portion of the screen capture that the author is creating a relation between a document called "Bowie With Cigarette" and another document in the repository that will be found via search. The type of relation is shown in the Predicate field (fancier than saying plain old "Verb" field!). Nuxeo ships with a set of five basic predicates, including the "Is based on" above. The other four are "conforms to," "references," "replaces," and "requires."
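If it helps to see the triple as a data structure, here is a minimal sketch in plain Java. The Triple class, the "Bowie Concert" title, and the "is basis for" inverse label are all hypothetical names used only for illustration; they are not Nuxeo types or labels.

```java
// Hypothetical value type modelling a (SUBJECT, VERB, OBJECT) relation.
final class Triple {

    final String subject;
    final String predicate;
    final String object;

    Triple(String subject, String predicate, String object) {
        this.subject = subject;
        this.predicate = predicate;
        this.object = object;
    }

    // Restate the relation from the other end by swapping subject and object.
    Triple inverse(String inversePredicate) {
        return new Triple(object, inversePredicate, subject);
    }

    @Override
    public String toString() {
        return "(" + subject + ", " + predicate + ", " + object + ")";
    }

    public static void main(String[] args) {
        Triple relation =
                new Triple("Bowie With Cigarette", "is based on", "Bowie Concert");
        System.out.println(relation);
        System.out.println(relation.inverse("is basis for"));
    }
}
```

Reading the printed triple aloud as a sentence is a quick check that the subject and object are the right way around.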
If you are interested in adding to or changing the set of relations that are shown in the Relations tab, you will need to look at the Nuxeo source code and then rebuild the Nuxeo server. This type of change is why we are open source! The Nuxeo source has a bundle called nuxeo-platform-relations-default-config. This bundle has a directory that defines the default set of relations. It has two files (of course!) in src/main/resources/directories that define the relations and their inverses, the inverse of a relation being the way of expressing that a document is the OBJECT of the relation. These two files reference, of course, labels like label.relation.predicate.References that are then translated to the user's preferred language. These labels, just as in our previous lesson, are resolved through messages_en.properties or similar files for other languages; for the default installation these can be found in the Nuxeo source code, in the bundle nuxeo-platform-lang in the src/main/resources/nuxeo.war/WEB-INF/classes directory. We hope that this continues to hammer home the point that your bundles and Nuxeo's bundles work in the same manner.
Although Nuxeo exposes the Relations tab, as shown above, those are the relations it expects users to create "by hand." There are two built-in "features" of Nuxeo that are actually just some user interface candy that hides relations. First, the Nuxeo comment system is implemented using relations. When you use the UI to create a comment, like "This is an example" in the snap above, the Nuxeo comment system swings into action. It creates a new document, stores the contents of the comment in a property, and then creates a relation between the new document and the one the comment is about. Then, when it displays the user interface for a document, it is easy to find the comments on the document being displayed since the pointers (relations) are already in place.
We hope it comes as no surprise to you that the Nuxeo comment
system defines a new schema type (in a file called comment.xsd!) with some fields in it like
comment, author, and creationTime. Further, the comment system uses
an event listener to become informed about documents getting
deleted. It uses this event to delete comment documents when the
document they are associated with gets removed. The only part of
the comment system that uses parts of the API we have not yet
covered is the User Interface presented on the comments tab. You
could build most of the Nuxeo comment system with the lessons you
have had to this point! Nuxeo has no
magic.
The other feature that makes use of relations in Nuxeo is the annotation system that was released with version 5.2 of Nuxeo EP. This system allows you to select regions of a document - either a rectangular region of an image document or some text from the document - and associate an annotation with it. Just as a comment is a subject document with a relation to the whole of an object document, an annotation is a subject document with a relation to a part of an object document. When you display an image, for example, Nuxeo's annotation system uses the relations to display an image like this:

As you can see in the image above, Nuxeo knows which part of the original image is the source of the annotation and the text that was originally entered as the annotation itself is displayed in the box to the side. (This image is from the Inauguration of President Obama, the person pictured is Chief Justice John Roberts, who goofed up his lines for the presidential oath of office.)
We have revamped the now somewhat busy DocumentCreationListener class to include a few new or refactored functions. One of these is the "middleman" that, when given a new document, model, checks to see whether any event is "related" to it:
private void checkDocumentForRelationToEvent(DocumentEventContext context,
        DocumentModel model) throws Exception {
    RelationHelper helper = new RelationHelper();
    CoreSession session = context.getCoreSession();
    log.info("Checking document for relation:" + model.getPathAsString());
    DocumentModel eventDoc = helper.isEventImage(session, model);
    if (eventDoc != null) {
        createBasicRelation(model, eventDoc, false);
    }
    eventDoc = helper.isPreviewImage(session, model);
    if (eventDoc != null) {
        createBasicRelation(model, eventDoc, true);
    }
    session.save();
}
This method is straightforward in most areas. First we create an instance of RelationHelper, a new class for the lesson, that implements the rules for creating a relation that we explained in the previous section. The helper has two methods, isEventImage and isPreviewImage, that implement the two different cases: photos from an event or images that are previews of the event. If the helper wants to indicate success, it returns a DocumentModel that represents the Upcoming document that the new (photo) DocumentModel is related to. It returns null if no relation should be created.
So, this method should be called anytime a new document is created, to see if it meets our criteria for being an image of the right kind and from a user in the right group. When this method wants to create a relation between two documents, it calls createBasicRelation, but to discuss that function we need to explain properties of relations.
Properties of relations can be thought of in one of two ways. The first, more straightforward, way is to think of a Map of properties, attached to the relation, that describes the relation itself. But who wants to do things the straightforward way? The more complex (or perhaps more sophisticated?) way is to think of a relation property as another relation in which the first relation is the subject. Seem strange? Returning to our discussion above of comments, think of an "is comment on" relation: if document A has a comment document B, then the "creationTime" property is really about the time that the relation was created. Both documents A and B may have existed for some time in the repository and would have their own respective creation time properties (part of the dublincore schema!). To finish this example, we could write this "meta-relation" of creation time as: ((B is comment on A) created on Tuesday at 5pm). No matter which formulation you prefer, you should consider this code sample:
private void createBasicRelation(DocumentModel imageDocumentModel,
        DocumentModel eventDocumentModel, boolean isPreview)
        throws Exception {
    QNameResource imageAsResource = getDocumentResource(imageDocumentModel);
    QNameResource eventDocumentAsResource = getDocumentResource(eventDocumentModel);

    ResourceImpl predFwd, predRev;
    if (isPreview) {
        predFwd = new ResourceImpl(REFERENCES_URI);
        predRev = new ResourceImpl(REFERENCES_URI);
    } else {
        predFwd = new ResourceImpl(BASED_ON_URI);
        predRev = new ResourceImpl(BASED_ON_URI);
    }

    String commentText;
    if (isPreview) {
        commentText = "About the event";
    } else {
        commentText = "From the event";
    }

    Statement fwd = new StatementImpl(imageAsResource, predFwd,
            eventDocumentAsResource);
    setProperties(fwd, "[Automatically Added]", new Date(), commentText);
    Statement rev = new StatementImpl(eventDocumentAsResource, predRev,
            imageAsResource);
    setProperties(rev, "[Automatically Added]", new Date(), commentText);

    ArrayList<Statement> stmtList = new ArrayList<Statement>();
    stmtList.add(fwd);
    stmtList.add(rev);
    getRelationManager().add(DEFAULT, stmtList);
}
The first thing you will notice is that we immediately turn both DocumentModels, imageDocumentModel and eventDocumentModel, into named resources. In the interest of simplicity, the QNameResource type can be thought of as a URL that describes the server and location in the repository of the given document. We then compute other "resources" that reference the verbs of the relation, one for the forward (Fwd) direction and one for the reverse (Rev). You can see that if the photo is a preview, we use the verb REFERENCES and if the photo is of the event itself we use the verb BASED_ON. These may not be ideal verbs for the relations we have, but they are known to the Nuxeo UI - each is one of the five built-in verbs - and the UI will display them correctly without modification.
You should see that we create some comments about the relations ("properties" of the relation in terms of the title of this subsection) to help the display be more informative. The critical Nuxeo type for creating relations is Statement, which we create via its StatementImpl implementation class. A Statement is the basic relation element in the Nuxeo system, so named because the SUBJECT VERB OBJECT relation can be read as a statement of fact (try it yourself!). You should see that we are adding some properties to each statement, again to help the user understand, when looking at the relation, that it was automatically created.
Finally, we retrieve the RelationManager object (not shown) as a service and add our two statements, as a list, to the graph named DEFAULT. This graph is the one displayed and manipulated by the Nuxeo EP 5 web user interface. If you are using the RelationManager to maintain multiple graphs of relations (a.k.a. statements) you should probably be reading the RDF spec, not this document! Most folks will want to stick to the DEFAULT graph of relations.
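The two ideas in this subsection, statements that carry their own properties and named graphs that collect statements, can be modelled in a few lines of plain Java. The SimpleStatement and SimpleGraphStore classes are hypothetical stand-ins written for illustration; they take the straightforward "Map" view of relation properties and are not how Nuxeo's Statement or RelationManager are implemented.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a statement; not Nuxeo's Statement type.
final class SimpleStatement {

    final String subject;
    final String predicate;
    final String object;
    // Properties describe the relation itself, not either document.
    final Map<String, String> properties = new HashMap<String, String>();

    SimpleStatement(String subject, String predicate, String object) {
        this.subject = subject;
        this.predicate = predicate;
        this.object = object;
    }
}

// Hypothetical stand-in for a store of named graphs; not Nuxeo's RelationManager.
final class SimpleGraphStore {

    private final Map<String, List<SimpleStatement>> graphs =
            new HashMap<String, List<SimpleStatement>>();

    // Add all statements to the named graph, creating the graph on first use.
    void add(String graphName, List<SimpleStatement> statements) {
        List<SimpleStatement> graph = graphs.get(graphName);
        if (graph == null) {
            graph = new ArrayList<SimpleStatement>();
            graphs.put(graphName, graph);
        }
        graph.addAll(statements);
    }

    // Statements in the named graph whose subject is the given resource.
    List<SimpleStatement> outgoing(String graphName, String subject) {
        List<SimpleStatement> result = new ArrayList<SimpleStatement>();
        List<SimpleStatement> graph = graphs.get(graphName);
        if (graph != null) {
            for (SimpleStatement statement : graph) {
                if (statement.subject.equals(subject)) {
                    result.add(statement);
                }
            }
        }
        return result;
    }
}
```

Storing both a forward and a reverse statement, as createBasicRelation does, means a lookup by subject finds the relation from either document without the store having to search by object as well.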
Since we have used relations that are understood by the Nuxeo EP 5 Web User Interface, the effects of our modified CreateDocumentListener and its new "helper" can be seen through your web browser. In the screen snap below, someone has created an Upcoming document related to an upcoming show by (some guy named) David Bowie. A member of the socialButterflies group has helpfully added two images to the repository to convince the (skeptical) masses to attend the concert by this unknown artist. The UI depicted in this snap is reached by clicking on the Relations tab of the Upcoming event:

The event document has two relations created for the incoming case (forward) and two for the reverse or outgoing case. It should be clear that the comments/author have been created by the method createBasicRelation above. Further, if you click on the links you will be presented with the document that is the "other end" of the relation, in this case the two photos that have been uploaded by one of the social directors.
We have supplied you with all the code to make this lesson work in this lesson's skeleton (lesson-relations in the usual svn repository), with one exception. We have not "hooked up" the middleman code above, the method checkDocumentForRelationToEvent, to the event handler code. You need to call this method in the right part of the event handler to make sure that the relations get created. Follow the event handler code carefully to find the right place, and be sure not to "miss" some photo documents.