
Streaming Document Content using DFC


Introduction

This article presents DctmInputStream and DctmOutputStream, a pair of streams based on java.io.InputStream and java.io.OutputStream,
respectively, that allow a Java programmer to view a docbase as a file
system containing folders and (named) files. Using these streams, getting and
setting content in docbase documents becomes as easy as reading and
writing files in a filesystem. All Documentum-specific code (such as DFC calls)
is hidden within the implementation of these classes.

The traditional way to access content

Accessing the content of documents in docbases is an oft-performed task.
Content is typically fetched and saved using one of several server api method calls.

The “standard” way to get content from a document stored in a docbase is via the following steps:

  • Obtain a connection to the docbase using a userid and password. This step results in the creation of a named session on the server. A reference to the session is returned to the client, which can subsequently perform operations on the docbase with it.
  • Invoke the getfile operation on the session, supplying the document id of the document as a parameter.
  • If all goes well, the server will create a temporary file on a filesystem accessible to the user, and return the file name.
  • The user then accesses the file.
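The retrieval steps above can be sketched roughly as follows. ApiSession is a hypothetical stand-in for a connected docbase session, not a DFC type; its single method only mirrors the apiGet( "getfile", ... ) call used later in this article. Reading the file as UTF-8 text and deleting it afterwards are assumptions made for the sake of a short example.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical stand-in for a connected docbase session (not a DFC type).
interface ApiSession {
    // Issues a server api command and returns its string result.
    String apiGet(String command, String args);
}

final class TraditionalGet {
    // Asks the server to export the content to a temporary file,
    // reads that file back in, and removes it afterwards.
    static String fetchContent(ApiSession session, String objectId) throws IOException {
        String tempFileName = session.apiGet("getfile", objectId);
        Path tempFile = Paths.get(tempFileName);
        try {
            // Assumption: the content is UTF-8 text.
            return new String(Files.readAllBytes(tempFile), StandardCharsets.UTF_8);
        } finally {
            Files.deleteIfExists(tempFile);
        }
    }
}
```

Even in this reduced form the caller has to know about sessions, object ids and temporary files, which is the bookkeeping the stream classes are meant to hide.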

The “standard” way to put content into a document stored in a Documentum docbase is via the following steps:

  • Obtain a connection to the docbase in the manner indicated above.
  • Write the content that we want to set into a file.
  • Invoke the setfile method on the session returned in the first step, passing the document id, the file name and an optional format name.
  • If all goes well, the server reads the content of the file and attaches it to the document.

Motivation for streams

Streams are standard Java language constructs to deal with sources and destinations of data. The java.io package contains an elegant set of stream classes that
abstract out the actual type of source or destination that data is fetched from or sent to (such as files and memory), and allow programmers to focus on what they
want to do with the data. The java.net package provides data input and
output from sockets via streams. Programming with streams leads to very elegant and compact code, with the added advantage that the actual mechanism
that provides or consumes data is abstracted out of the code.
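As a small illustration of that abstraction, the method below works on any InputStream. LineCounter and countLines are names made up for this sketch; the in-memory source in main could be replaced by a file or socket stream without touching countLines.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

final class LineCounter {
    // Counts lines from any InputStream; the method neither knows nor
    // cares whether the bytes come from memory, a file, or a socket.
    static int countLines(InputStream in) throws IOException {
        BufferedReader reader = new BufferedReader(
            new InputStreamReader(in, StandardCharsets.UTF_8));
        int lines = 0;
        while (reader.readLine() != null) {
            lines++;
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = "first\nsecond\nthird\n".getBytes(StandardCharsets.UTF_8);
        // An in-memory source; a FileInputStream would be used the same way.
        System.out.println(countLines(new ByteArrayInputStream(data))); // prints 3
    }
}
```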

Considering the widespread use of streams in Java programming, it is useful to extend the idea of streams to documents in Documentum docbases that have
associated content. Retrieving content from a document should be as easy as opening an InputStream to the document and reading from it.
By the same token, setting or updating content in a document should be as easy as opening an OutputStream to the document and writing to it. A
corollary is that dealing with document content through streams requires no explicit interaction with either server api programming or the DFC, which opens a
whole new world of application programming to generalist Java programmers who may not be familiar with either.
This is the motivation for the design of the DctmInputStream and DctmOutputStream classes presented in this article.

Before looking at DctmInputStream and DctmOutputStream in detail, let us see how instances of these classes are actually used.

Example: Getting the content of a document and printing it to the standard output

The first example shows how to get the content of a document named /Temp/Customer/Invoices/Invoice1 from a docbase. The content is printed to the standard output after it has been received. (This example assumes that a session, s, has been created beforehand; the complete source code for the streams presented here can be downloaded from our website.)

InputStream dctmIn = new DctmInputStream( s, "/Temp/Customer/Invoices/Invoice1" );
BufferedReader bufIn = new BufferedReader(
    new InputStreamReader(
        dctmIn ) );
while ( true )
{
    // readLine() returns null once the end of the content is reached.
    String NextLine = bufIn.readLine();
    if ( NextLine == null )
        break;
    System.out.println( NextLine );
}
dctmIn.close();

Notice how the DctmInputStream class seamlessly interoperates with other members of the java.io library. This should come as no surprise, DctmInputStream being, as it is, a subclass of java.io.InputStream.

Example 2: Getting the pdf content associated with a document and serving it to an http client

This example shows how to get the pdf content of a document named /Temp/Customer/Invoices/Invoice1 from a docbase. The content is served to an http client.

InputStream dctmIn = new DctmInputStream( s,
    "/Temp/Customer/Invoices/Invoice1",
    "pdf" );
// "application/pdf" is the registered MIME type for PDF content.
TheServletResponse.setContentType( "application/pdf" );
OutputStream TheServletOutputStream =
    TheServletResponse.getOutputStream();
byte[] Bytes = new byte[4096];
int nBytesRead = -1;
// Copy the content to the client one buffer at a time.
while ( ( nBytesRead = dctmIn.read( Bytes ) ) != -1 )
    TheServletOutputStream.write( Bytes, 0, nBytesRead );
dctmIn.close();
TheServletOutputStream.close();
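The copy loop in this example does not depend on either end being a docbase or a servlet, and factoring it into a helper makes that explicit. StreamCopy and copy are illustrative names for this sketch, not part of the downloadable classes.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

final class StreamCopy {
    // Copies everything from in to out in 4 KB chunks and returns the
    // number of bytes copied. Neither stream is closed here; that is
    // left to the caller.
    static long copy(InputStream in, OutputStream out) throws IOException {
        byte[] buffer = new byte[4096];
        long total = 0;
        int n;
        while ((n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n);
            total += n;
        }
        out.flush();
        return total;
    }
}
```

With such a helper, the loop in the servlet example would reduce to a single call that passes the docbase stream and the servlet output stream.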

Example 3: Receiving a word document sent by an http client and saving it in a document

In the code snippet shown below, the Word document has already been read into a byte array named Bytes.

DctmOutputStream dctmOut = new DctmOutputStream( s,
    "/Temp/Customer/Invoices/Invoice1", "msword" );
dctmOut.write( Bytes, 0, Bytes.length );
// The content is attached to the document when the stream is closed.
dctmOut.close();

Having seen how DctmInputStream and DctmOutputStream can simplify Java programming that deals with document content, let us look at how to implement these classes efficiently in the following sections.

DctmInputStream

DctmInputStream is a subclass of java.io.InputStream. This class has three constructors:

  • A protected zero argument constructor that does nothing. This constructor is protected since we do not want users to call it directly. At the same time, we want to let users extend this class, in which situation they might want access to the zero argument constructor.
  • A constructor that takes a Documentum session and a document name. This constructor is used to read the content associated with a document in its default format.
  • A constructor that takes a Documentum session, a document name and a format name. This is the general purpose constructor that is used to read the content associated with a document in the given format.

All constructors accepting a document name work either with a folder qualified document name (such as “/Temp/expenses/exp1.doc”),
or with an ObjectId.
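The constructors tell the two forms apart with DctmStreamUtils.isId, whose implementation is not shown in this article. One plausible check, offered purely as a guess, assumes that an object id is exactly 16 hexadecimal characters; IdCheck and looksLikeObjectId are hypothetical names.

```java
final class IdCheck {
    // Assumption: an object id is exactly 16 hexadecimal characters.
    // Anything else is treated as a (possibly folder qualified) name.
    static boolean looksLikeObjectId(String name) {
        if (name == null || name.length() != 16) {
            return false;
        }
        for (int i = 0; i < name.length(); i++) {
            char c = name.charAt(i);
            boolean hex = (c >= '0' && c <= '9')
                       || (c >= 'a' && c <= 'f')
                       || (c >= 'A' && c <= 'F');
            if (!hex) {
                return false;
            }
        }
        return true;
    }
}
```

A folder qualified name always starts with a slash, so it can never be mistaken for an id under this rule.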

Of these constructors, the first one has a trivial implementation: all it does is invoke the no argument constructor of its superclass, namely, java.io.InputStream. The second constructor delegates to the third constructor, passing a null value for the format. The third constructor is the worker constructor: this is where the interesting code lies. The following is an annotated explanation of the code.

public DctmInputStream( IDfSession s, String DocumentName, String Format ) throws IOException
{

We want to call our superclass's no argument constructor just so everything is set up for us properly.

super();
try
{

In the following lines, we initialize our member variables with the arguments supplied via the constructor, since we will need these later.

this.s = s;
DocName = new String( DocumentName );

We next want to check if the document name given is actually a document id as opposed to a folder qualified document name.

String IdStr = null;
if ( !DctmStreamUtils.isId( DocName ) )

We want to make sure that the document name is a folder qualified name, analogous to a full qualified filename. If not, we want to throw an IllegalArgumentException with an explanatory message.
The following if-block deals with the situation where we have a folder qualified document name.

{
// We want to separate the DocumentName into a folder name and
// a document name!
if ( DocumentName.indexOf("/") == -1 || !DocumentName.startsWith("/") )
{
// An absolute document name is required, but a relative
// document name has been given!
throw new IllegalArgumentException( getClass().getName() + ".DctmInputStream( s, " +
DocumentName + "): Document name must be absolute!" );
}

We want to parse the document name into a folder and a file name.

String FolderName = DocumentName.substring( 0, DocumentName.lastIndexOf("/") );
String FileName = DocumentName.substring( DocumentName.lastIndexOf("/") + 1 );

We now want to obtain the ObjectId of the document. We will use a DQL query via the DFC to obtain the object id. The following is staple DFC code that obtains an IDfCollection and parses it row by row.

// Determine the ObjectId of this document!
DfQuery NewQuery = new DfQuery();
NewQuery.setDQL( "select r_object_id from dm_sysobject where object_name='" +
FileName + "' and folder('" + FolderName + "')" );
IDfCollection Docs = NewQuery.execute( s, 0 ); // This is a ReadQuery!

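Vector Ret = new Vector();
while ( Docs.next() )
{
IDfTypedObject Next = Docs.getTypedObject();

IDfId ObjectId = Next.getId( "r_object_id" );
Ret.addElement( ObjectId.toString() );
break;
}
// Release the server-side collection once we have what we need.
Docs.close();

// Fail with a clear message instead of an index error when nothing matched.
if ( Ret.size() == 0 )
throw new IllegalArgumentException( getClass().getName() + ".DctmInputStream( s, " +
DocumentName + "): Document not found!" );
IdStr = ( String )Ret.elementAt(0);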

The else block handles the situation where we are given an object id.

}
else
IdStr = DocName;

At this point, we have the object id of the document and the content format that we want. We are ready to issue the getfile server api via our getContent member function:

getContent( IdStr, Format );
hIn = new BufferedInputStream(
new FileInputStream( TempFileName ) );
}

Our getContent member function fetches the content from the server. The server delivers the content into a temporary file that we can read. The getContent member function sets up a private member variable TempFileName with the name of this temporary file.

Next, we open a FileInputStream to this temporary file, buffered by a BufferedInputStream, and save the resulting stream for later use in a private data member.

If there was any error in the above, we want to give meaningful information back.

catch( Exception exc )
{
exc.printStackTrace( System.err );
throw new IOException( getClass().getName() + ".DctmInputStream(" +
DocumentName + ")" );
}
}
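The name handling in the constructor above can be isolated into a small helper, which also makes one edge case visible: a document or folder name containing a single quote would break the concatenated DQL string. The sketch below doubles such quotes, on the assumption that this is how a quote is escaped inside a DQL string literal; verify that against the DQL reference for your server. DocPath and its members are hypothetical names.

```java
final class DocPath {
    final String folder;
    final String name;

    private DocPath(String folder, String name) {
        this.folder = folder;
        this.name = name;
    }

    // Splits "/Temp/Customer/Invoices/Invoice1" into the folder
    // "/Temp/Customer/Invoices" and the name "Invoice1".
    static DocPath parse(String documentName) {
        if (documentName == null || !documentName.startsWith("/")) {
            throw new IllegalArgumentException(
                "Document name must be absolute: " + documentName);
        }
        int slash = documentName.lastIndexOf('/');
        return new DocPath(documentName.substring(0, slash),
                           documentName.substring(slash + 1));
    }

    // Assumption: a single quote inside a DQL string literal is escaped
    // by doubling it.
    static String quote(String value) {
        return "'" + value.replace("'", "''") + "'";
    }

    // Builds the same lookup query that the constructor issues.
    String toDql() {
        return "select r_object_id from dm_sysobject where object_name="
            + quote(name) + " and folder(" + quote(folder) + ")";
    }
}
```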

The getContent() function

The getContent( String ObjectId, String Format ) function is a private function that is invoked by the worker constructor. This function builds a comma delimited command parameter that is passed to the server with the getfile command:

private void getContent( String ObjectId, String Format ) throws DfException
{
StringBuffer Command = new StringBuffer();
Command.append( ObjectId );
if ( Format != null )
{
Command.append( "," );
Command.append( Format );
}
TempFileName = s.apiGet( "getfile", Command.toString());
}

The read functions

Every concrete java.io.InputStream class must implement the read() member function. In our case, the implementation is trivial: we merely delegate our read() to the read() member function of the FileInputStream data member we constructed earlier.

For efficiency, concrete java.io.InputStream classes are urged to implement a meaningful read( byte[] Bytes, int Offset, int Length ) member function. In our case, we delegate this method to the read(byte[],int,int) call of the FileInputStream data member we constructed earlier.
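The two read functions described above amount to plain delegation. The class below is an illustrative stand-in rather than the actual DctmInputStream: it wraps any InputStream in the way DctmInputStream is described as wrapping the stream over its temporary file.

```java
import java.io.IOException;
import java.io.InputStream;

// Illustrative only: DelegatingInputStream is a made-up name.
class DelegatingInputStream extends InputStream {
    private final InputStream delegate;

    DelegatingInputStream(InputStream delegate) {
        this.delegate = delegate;
    }

    // The one method every concrete InputStream has to provide.
    @Override
    public int read() throws IOException {
        return delegate.read();
    }

    // Without this override, InputStream's default implementation
    // would fall back to calling read() once per byte.
    @Override
    public int read(byte[] bytes, int offset, int length) throws IOException {
        return delegate.read(bytes, offset, length);
    }
}
```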

The close and finalize functions

The close member delegates to the close() method of the FileInputStream data member we constructed earlier.

The finalize() method (which is invoked when the JVM garbage collects us) deletes the temporary file that the server originally returned to us with the content.
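Finalizers run at an unpredictable time, if they run at all, and finalize() has been deprecated since Java 9. One alternative, sketched here as a suggestion rather than as the implementation in the downloadable source, is to delete the temporary file as part of close(). TempFileInputStream is a hypothetical name.

```java
import java.io.BufferedInputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FilterInputStream;
import java.io.IOException;

class TempFileInputStream extends FilterInputStream {
    private final File tempFile;

    TempFileInputStream(File tempFile) throws IOException {
        // FilterInputStream forwards all read calls to the wrapped stream.
        super(new BufferedInputStream(new FileInputStream(tempFile)));
        this.tempFile = tempFile;
    }

    @Override
    public void close() throws IOException {
        try {
            super.close();
        } finally {
            // Best effort: delete() returns false rather than throwing.
            tempFile.delete();
        }
    }
}
```

The temporary file then disappears as soon as the caller is done with the stream, instead of lingering until the garbage collector gets around to it.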

DctmOutputStream

DctmOutputStream is a subclass of java.io.OutputStream. This class has three constructors:

  • A protected zero argument constructor that does nothing. This constructor is protected since we do not want users to call it directly. At the same time, we want to let users extend this class, in which situation they might want access to the zero argument constructor.
  • A constructor that takes a Documentum session and a document name. This constructor is used to write the content associated with a document in its default format.
  • A constructor that takes a Documentum session, a document name and a format name. This is the general purpose constructor that is used to write the content associated with a document in the given format.

Just as for the DctmInputStream constructors, all DctmOutputStream constructors accepting a document name work either with a folder qualified document name (such as “/Temp/expenses/exp1.doc”),
or with an ObjectId.

Of these constructors, the first one has a trivial implementation: all it does is invoke the no argument constructor of its superclass, namely, java.io.OutputStream. The second constructor delegates to the third constructor, passing a null value for the format. The third constructor is the worker constructor: this is where the interesting code lies. The following is an annotated explanation of the code.

public DctmOutputStream( IDfSession s, String DocumentName, String Format ) throws IOException
{
super();

In the following lines, we initialize our member variables with the arguments supplied via the constructor, since we will need these later.

this.s = s;
this.DocName = new String( DocumentName );
if ( Format != null )
this.Format = new String( Format );

We want to make sure that the document name is a folder qualified name, analogous to a full qualified filename, or an ObjectId. If not, we want to throw an IllegalArgumentException with an explanatory message.

try
{
if ( !DctmStreamUtils.isId( DocName ) )
{
// We want to separate the DocumentName into a folder name and
// a document name!
if ( DocumentName.indexOf("/") == -1 || !DocumentName.startsWith("/") )
{
// An absolute document name is required, but a relative
// document name has been given!
throw new IllegalArgumentException( getClass().getName() + ".DctmOutputStream( s, " +
DocumentName + "): Document name must be absolute!" );
}

We want to parse the document name into a folder and a file name.

String FolderName = DocumentName.substring( 0, DocumentName.lastIndexOf("/") );
String FileName = DocumentName.substring( DocumentName.lastIndexOf("/") + 1 );

We now want to obtain the ObjectId of the document. We will use a DQL query via the DFC to obtain the object id. The following is staple DFC code that obtains an IDfCollection and parses it row by row.

// Determine the ObjectId of this document!
DfQuery NewQuery = new DfQuery();
NewQuery.setDQL( "select r_object_id from dm_sysobject where object_name='" +
FileName + "' and folder('" + FolderName + "')" );
IDfCollection Docs = NewQuery.execute( s, 0 ); // This is a ReadQuery!
Vector Ret = new Vector();
while ( Docs.next() )
{
IDfTypedObject Next = Docs.getTypedObject();
IDfId ObjectId = Next.getId( "r_object_id" );
Ret.addElement( ObjectId.toString() );
break;
}
if ( Ret.size() == 0 )
throw new IllegalArgumentException( getClass().getName() + ".DctmOutputStream( s, " +
DocumentName + "): Document not found!" );
ObjectId = ( String )Ret.elementAt(0);
}
else
// Mirror DctmInputStream: the name given was already an object id.
ObjectId = DocName;

At this point, we have the object id of the document and the content format that we want to write. We want to create a temporary file into which we shall be writing the content, saving the file name and a file handle (a FileOutputStream instance) to it for later use. For efficiency, we decorate the FileOutputStream with a BufferedOutputStream.

// Create a temp file and open an output stream to it!
TempFile = File.createTempFile( "pre", "suf",
new File( System.getProperties().getProperty( "user.home" ) ) );
bOut = new BufferedOutputStream(
new FileOutputStream( TempFile.toString() ) );
}
catch( Exception exc )
{
throw new IOException( exc.toString() );
}
}

The write functions

Every concrete java.io.OutputStream class must implement the write() member function. In our case, the implementation is trivial: we merely delegate our write() to the write() member function of the FileOutputStream data member to the temporary file that we constructed earlier.

For efficiency, concrete java.io.OutputStream classes are urged to implement a meaningful write( byte[] Bytes, int Offset, int Length ) member function. In our case, we delegate this method to the write(byte[],int,int) call of the FileOutputStream data member we constructed earlier.

The close functions

The close function is where the Documentum interaction takes place.

public final void close() throws IOException
{

We want to make sure close is not invoked more than once, since this would be illegal.

if ( bOut == null )
throw new IOException( getClass().getName() + ".close! File is not open!" );

In the following, we want to make sure other threads that may be calling close() on us do not interfere with us: basically, we want only one thread to execute the setcontent server api function.

synchronized( this )
{
try
{

We must flush the BufferedOutputStream and close it, thus effectively closing the file handle to the temporary file.

bOut.flush();
bOut.close();

We want to invoke the setcontent server api function.

// Invoke the server api!
setContent( DocName, TempFile.toString(), Format );
}
catch( Exception exc )
{
exc.printStackTrace( System.err );
throw new IOException( exc.toString() );
}
finally
{

We want to set the stream pointing to the file handle to null, so that we can throw an exception if our close function is reinvoked.

bOut = null;
}
}
}
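The write and close behaviour described above can be tried out without a docbase by replacing the server call with a callback. In this sketch ContentSink is a hypothetical interface standing in for the setContent call, and TempFileOutputStream is a made-up name. Unlike the listing above, the open check is made while holding the lock, so two threads cannot both get past it. The sketch follows the article in throwing on a second close(); java.io.Closeable documents a repeated close as having no effect, so returning silently would be an equally defensible choice.

```java
import java.io.BufferedOutputStream;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;

// Hypothetical callback standing in for the setContent call.
interface ContentSink {
    void store(File contentFile) throws IOException;
}

class TempFileOutputStream extends OutputStream {
    private final File tempFile;
    private final ContentSink sink;
    private OutputStream out;

    TempFileOutputStream(ContentSink sink) throws IOException {
        this.sink = sink;
        this.tempFile = File.createTempFile("dctm", ".tmp");
        this.out = new BufferedOutputStream(new FileOutputStream(tempFile));
    }

    @Override
    public synchronized void write(int b) throws IOException {
        ensureOpen();
        out.write(b);
    }

    @Override
    public synchronized void write(byte[] bytes, int offset, int length) throws IOException {
        ensureOpen();
        out.write(bytes, offset, length);
    }

    @Override
    public synchronized void close() throws IOException {
        ensureOpen(); // checked under the lock, so only one caller proceeds
        try {
            out.close(); // BufferedOutputStream flushes before closing
            sink.store(tempFile);
        } finally {
            out = null;
            tempFile.delete();
        }
    }

    private void ensureOpen() throws IOException {
        if (out == null) {
            throw new IOException("Stream is closed");
        }
    }
}
```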

Conclusion and parting comments

The DctmInputStream and DctmOutputStream classes abstract out document content handling and appear as well known
stream classes to Java programmers. At the same time, they are efficient implementations rivaling the footprint and speed of
hand coded Java functions that perform the same task. The author feels that these classes would be useful in classes, servlets
or JSPs where explicit references to DFC classes are not desired, and for expert Java programmers who are new
to or unfamiliar with Documentum programming.

Note that the Documentum content access functionality fundamentally delivers content not as
streams but as entire chunks. The shortcoming of this approach is that your local filesystem must have
enough space to accommodate content files: trying to get the content of an Excel spreadsheet that is 10MB in size,
when the amount of free space on your hard drive is only 2MB, is going to fail. The streams presented in
this article suffer from the same shortcoming, built as they are over Documentum content functionality.
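A caller who knows the content size in advance can at least fail early. The sketch below assumes that the size can be obtained from the docbase beforehand, which this article does not cover; SpaceCheck and hasRoomFor are hypothetical names.

```java
import java.io.File;

final class SpaceCheck {
    // Returns true if the partition holding dir reports at least
    // requiredBytes of usable space. The answer is only a hint: other
    // processes may consume the space before the content arrives.
    static boolean hasRoomFor(File dir, long requiredBytes) {
        return dir.getUsableSpace() >= requiredBytes;
    }
}
```

Such a check would have to be made against the directory into which the temporary content file is actually written.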
