Primary Document and Process Data

BPML
 
The BPML specification provides an abstract model and XML syntax for expressing business processes and supporting entities. BPML itself does not define any application semantics such as particular processes or application of processes in a specific domain; rather it defines an abstract model and grammar for expressing generic processes. This allows BPML to be used for a variety of purposes that include, but are not limited to, the definition of enterprise business processes, the definition of complex Web services, and the definition of multi-party collaborations.
 
BPML is a meta-language for the modeling of business processes, just as XML is a meta-language for the modeling of business data.
 
BPML code includes activities and elements. An activity is a step in a business process, and may be comprised of multiple elements. Elements are defined components of code that provide structure and instructions regarding the activity they embody.
 
 
Process
 
A process is a progressively continuing procedure that consists of a series of controlled activities systematically directed toward a particular result or end. A process performs activities of varying complexity.
Any entity that a process communicates with is defined as a participant. Participants can be business applications (e.g., ERP, CRM), customers, partners, and other processes. They can be either static or dynamic. A static participant is defined in the process and is referenced by activities in the process. A dynamic participant is retrieved from process data using an XPath expression.
 
Activity
 
An activity is a component that performs a specific function within the process, for example, invoking another process. Activities can be as simple as sending or receiving a message, or as complex as coordinating the execution of other processes and activities.
Activities are either atomic or complex.
 
Because a simple activity is always based on sending and receiving a message, and that message is always in XML format, it is important to understand the XML format and the ways XML can be handled by functions and services in a business process in GIS (Gentran Integration Suite).
 
Context
 
Activities always execute within a context. The context retains an association between the activities and information and services they utilize during execution, for example, properties that they can access, security credentials, transaction, exception handling, etc.
 
Context maintains the state of the business process from service to service. It contains, among other things, the document being manipulated by the business process. This is also where each service reports errors and status. The GIS infrastructure is designed to persist the context between steps.
The context contains several components:
  • Input Parameters - Retrieve parameters before beginning the operation
  • Workflow Document Body - Set up the document body
  • Error Reporting - Set up status and error reporting
Context can be process data, a message that the process sends to a participant, or a message that the process receives from a participant.
 
Messages
 
Messages are used to exchange information between a process and its participants. All messages are in XML format, and a schema is used to define the format. Upon receiving a message, an activity can use rules to decide whether it will consume the message. Consumed messages are transformed into process data using assignment. XPath is used to retrieve and transform messages.
 
 
Properties
 
A property definition is used to define a property within a particular context. The context can be part of a process.
 
A property is a named value. Activities can access a property’s value or establish a new value. Properties are accessed as part of the context in which an activity is executed, also known as its current context. Properties are communicated between loosely coupled processes by performing operations and mapping properties to the messages exchanged in these operations.
 
Components that do the job in a business process are Services or Adapters.
 
  • Every service has two areas at its disposal for processing: Process Data and Primary Document. Both areas are explained in the following sections.
  • A service can use the properties in the output message, which is the XML message sent from a business process to the service.
  • Every service also produces an XML message that we call the input message. We can work with the properties from an input message as well.
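
To make the two message directions concrete, here is an annotated sketch of a single operation. It follows the pattern used by the business process examples later in this document. The service name, the participant name and the parameter name are hypothetical placeholders, not a real GIS service configuration.

```xml
<operation name="Example Service">
  <!-- "ExampleService" is a hypothetical service configuration name. -->
  <participant name="ExampleService"/>

  <!-- Output message: the XML sent from the business process to the service. -->
  <output message="outmsg">
    <!-- A property set explicitly for this call (hypothetical parameter). -->
    <assign to="exampleParameter">exampleValue</assign>
    <!-- Copy everything currently in process data into the message. -->
    <assign to="." from="*"></assign>
  </output>

  <!-- Input message: the XML returned by the service to the business process. -->
  <input message="inmsg">
    <!-- Copy everything the service returned into process data. -->
    <assign to="." from="*"></assign>
  </input>
</operation>
```

The two `<assign to="." from="*">` statements are the reason the examples below end up with all service results in process data.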
 
 
Primary Document and Process Data
 
 
Definition and explanation of primary document and process data
 
Primary Document
 
The primary document is the core document in a business process, a document that is supposed to be processed in a business process or only transferred without changing.
 
It can be changed by translation services (such as Translation Service, XSLT Service or XML Encoder).
 
Primary Document can be transferred by the services through a business process and if we do not use any service that can change the content of the primary document, it will be the same in all the steps of a business process.
 
Process Data
 
Process data is a dedicated storage area for saving variables and evaluating expressions.
It is also the area where information from activities and services is saved during the execution of a business process, so all the data related to a business process is collected in process data. Process data is always in XML format and is saved under the root element <ProcessData>.
 
Services and activities put information in process data, and also they can access and use information from the process data to complete the business process activities.
 
Both areas, the primary document and the process data, are important for processing data in a business process.
Some services use only the primary document and do not use process data.
Some services only put information into process data and do nothing with the primary document.
 
Also, there are services that use primary document and process data and both areas are changed or updated after such services finish execution.
 
Short summary ...
 
Process Data can include the following:
 
  • data extracted from a primary document
  • data assigned in a business process explicitly by assign element
  •  data placed by a service
 
  •  Meaning and explanation of the name PrimaryDocument
 
    • About services that use PrimaryDocument
 
Some services use the primary document for processing or operate on it. Any service that needs a document for processing knows how to take the document named PrimaryDocument. Process data can contain many documents, but only one of them can be named PrimaryDocument; the others have other names. It is also possible that no document is named PrimaryDocument after processing.
If we have one or more documents with names other than PrimaryDocument, we have to rename one of them to PrimaryDocument; the service will then recognize it and take it for processing.
 
The following services work on the primary document and need a document with exactly the name PrimaryDocument in the process data:
 
-          Translation Service
-          XSLT Service (if configured to translate the primary document, not the process data)
-          SMTP Send Adapter
-          Document Extraction Service
-          Document Keyword Replace Service
-          FTP Client PUT Service
-          SFTP Client PUT Service
-          HTTP Client POST Service
-          CD Server CopyTo Service
-          Command Line Adapter (if the useInput parameter is set to Yes)
-          EDI Deenvelope Service
-          EDI Encode Service
-          EDI Envelope Service
-          XML Encoder (in modes: Encode a non-XML document and Use existing XML document)
-          XML Validation Service (if xml_input_from is set to PrimaryDocument)
 
... and some other services.
 
A typical error, which appears in the Status Report when a service expects a PrimaryDocument but cannot find one, is:

com.sterlingcommerce.woodstock.workflow.WorkFlowException: There is no Primary Document, this service operates on the Primary Document

... or for EDI Encode Service ...

Error encoding primary document - EDI ENCODER SERVICE: There is no document to encode.

… or for EDI Envelope Service …

com.sterlingcommerce.woodstock.workflow.WorkFlowException: No document to envelope

    • About services that need PrimaryDocument for an input and/or produce a PrimaryDocument
 
As mentioned above, some services use only the primary document and do not use process data, some services only put information into process data and do nothing with the primary document, and other combinations exist as well. Here are a few examples of the different usages from practice, mainly regarding the PrimaryDocument.
 
- Many services need the PrimaryDocument to operate on and produce only one document, whose name is always PrimaryDocument.
 
For example
 
  • Translation Service - when we translate a document with the Translation Service, it always uses the PrimaryDocument as the input for processing, and the result of processing also goes into the PrimaryDocument (unless we direct it into another document, but more on that in the section about Input/Output messages).
  • XSLT Service - when we translate a document with XSLT, the input can be the PrimaryDocument (although Process Data can also be used as the input for translation), and the result is saved in the PrimaryDocument.
  • XML Encoder - in the mode 'Encode a non-XML document', the XML Encoder also uses the PrimaryDocument as input and places the result into the PrimaryDocument.
  • Document Keyword Replace Service – uses the PrimaryDocument as input and places the result of the replacement back into the PrimaryDocument.
  • Mail Mime Service – depending on its settings, this service uses the PrimaryDocument to create a raw mail MIME message and puts the produced MIME message back into the PrimaryDocument.
 
 
- Some other services only need the PrimaryDocument as input for processing but do not change it; that is, they do not produce a new PrimaryDocument in the process data area as the output result.
 
For example
 
  • SMTP Service – simply uses the PrimaryDocument for sending mail, but does not produce any result back into the PrimaryDocument.
  • XML Encoder - in the mode 'Use existing XML document', the XML document is taken from the PrimaryDocument and placed into the process data. So the PrimaryDocument is used as input, but a new PrimaryDocument is not created as the output result.
  • XML Validation Service – validates the PrimaryDocument without changing it.
  • FTP Client PUT Service – takes the PrimaryDocument and sends it to an FTP server. The PrimaryDocument is unchanged after the service finishes processing.
  • HTTP Client POST Service – needs a PrimaryDocument for processing and neither produces a new one nor changes the existing one.
  • CD Server CopyTo Service – similar to any other communication service that sends out a document: it needs a PrimaryDocument for processing, but does not produce a new one after finishing.
 
 
    • About services that produce PrimaryDocument or documents with other names (different than the PrimaryDocument):
 
Some services need a PrimaryDocument to operate on. Other services, or even the same ones mentioned above, also produce as their result a document named PrimaryDocument or documents with other names.
 
Examples of services and processes that produce more than one document in the process data follow …
 
  1. File System Adapter
 
When the File System Adapter collects multiple files from the file system, more than one document will be placed in the process data with names FSA_Document1, FSA_Document2, …
 
Business process example:
 
<process name="default">
 <sequence>
    <operation name="File System Adapter">
      <participant name="FSA_name"/>
      <output message="FileSystemInputMessage">
        <assign to="Action">FS_COLLECT</assign>
        <assign to="collectionFolder">C:\collectionDirecory</assign>
        <assign to="collectMultiple">true</assign>
        <assign to="deleteAfterCollect">false</assign>
        <assign to="filter">*</assign>
        <assign to="." from="*"></assign>
      </output>
      <input message="inmsg">
        <assign to="." from="*"></assign>
      </input>
    </operation>
 
 </sequence>
</process>
 
Result of the File System Adapter (multiple collect) in the process data:
 
<?xml version="1.0" encoding="UTF-8"?>
<ProcessData>
 <FSA_Document1 SCIObjectID="serverName:578ceb:1184f7a12a5:e05" filename="fileName_1.txt"/>
 <FSA_Document2 SCIObjectID="serverName:578ceb:1184f7a12a5:e06" filename="filename_2.txt "/>
 <FSA_Document3 SCIObjectID="serverName:578ceb:1184f7a12a5:e07" filename="filename_3.txt "/>
 <FSA_DocumentCount>3</FSA_DocumentCount>
</ProcessData>
 
We can see that there is more than one document in the process data, and none of them has the name PrimaryDocument. Documents are saved under the tags FSA_Document[n], where n is the sequence number of the collected document. Every element also has a 'filename' attribute whose value corresponds to the file name in the file system.
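
As a small illustration of how these values could be read back with XPath, the following assign statements are a sketch only and have not been run against a GIS instance. The target names `collectedCount` and `firstFileName` are hypothetical, and wrapping the attribute in `string()` is an assumption made here so that the value is stored as text rather than copied as an attribute.

```xml
<!-- Hypothetical target names; the source elements come from the result above. -->

<!-- Number of collected documents, copied as text. -->
<assign to="collectedCount" from="FSA_DocumentCount/text()"></assign>

<!-- File name of the first collected document, read from its filename attribute. -->
<assign to="firstFileName" from="string(FSA_Document1/@filename)"></assign>
```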
 
  2. FTP Client GET Service
 
After a multiple GET (mget) of files from an FTP server, we also get more than one document. None of them has the name PrimaryDocument; they are saved under tags named after the original file names retrieved from the server.
 
 Business process example:
 
<process name="default">
 <sequence>
    <operation name="FTP Client Begin Session Service">
      <participant name="FTPClientBeginSession"/>
      <output message="FTPClientBeginSessionServiceTypeInputMessage">
        <assign to="FTPClientAdapter">FTPClientAdapter</assign>
        <assign to="RemoteHost">serverName</assign>
        <assign to="RemotePasswd">password</assign>
        <assign to="RemotePort">21</assign>
        <assign to="RemoteUserId">userId</assign>
        <assign to="UsingRevealedPasswd">true</assign>
        <assign to="." from="*"></assign>
      </output>
      <input message="inmsg">
        <assign to="." from="*"></assign>
      </input>
    </operation>
 
    <operation name="FTP Client GET Service">
      <participant name="FTPClientGet"/>
      <output message="FTPClientGetServiceTypeInputMessage">
        <assign to="RemoteFilePattern">*</assign>
        <assign to="SessionToken" from="SessionToken/text()"></assign>
        <assign to="." from="*"></assign>
      </output>
      <input message="inmsg">
        <assign to="." from="*"></assign>
      </input>
    </operation>
 
    <operation name="FTP Client End Session Service">
      <participant name="FTPClientEndSession"/>
      <output message="FTPClientEndSessionServiceTypeInputMessage">
        <assign to="SessionToken" from="SessionToken/text()"></assign>
        <assign to="." from="*"></assign>
      </output>
      <input message="inmsg">
        <assign to="." from="*"></assign>
      </input>
    </operation>
 
 </sequence>
</process>
 
Result of the FTP Client GET Service (mget), in the process data:
 
<?xml version="1.0" encoding="UTF-8"?>
<ProcessData>
 <SessionBeginTime>2008-02-26 11:59:36.218</SessionBeginTime>
 <SessionToken>FTPClientAdapter_FTPClientAdapter_node1_12040235762181001:5222</SessionToken>
 <Status>0</Status>
 <ServerResponse>
    <Code>230</Code>
    <Text>230 User userName logged in.</Text>
 </ServerResponse>
 <TranscriptDocumentId> serverName:578ceb:1184f7a12a5:e60</TranscriptDocumentId>
 <TranscriptDocument_1 SCIObjectID="serverName:578ceb:1184f7a12a5:e63"/>
 <TranscriptDocument_3 SCIObjectID="serverName:578ceb:1184f7a12a5:eb7"/>
 <fileName_1.txt SCIObjectID="serverName:578ceb:1184f7a12a5:eb8"/>
 <fileName_2.txt SCIObjectID="serverName:578ceb:1184f7a12a5:eb9"/>
 <fileName_3.txt SCIObjectID="serverName:578ceb:1184f7a12a5:eba"/>
 <Status>0</Status>
 <TranscriptDocumentId>serverName:578ceb:1184f7a12a5:ead</TranscriptDocumentId>
 <ServerResponse>
    <Code>226</Code>
    <Text>226 Transfer complete.</Text>
 </ServerResponse>
 <DocumentList>
    <DocumentId>serverName:578ceb:1184f7a12a5:e81</DocumentId>
    <DocumentId>serverName:578ceb:1184f7a12a5:e8a</DocumentId>
    <DocumentId>serverName:578ceb:1184f7a12a5:e8d</DocumentId>
 </DocumentList>
</ProcessData>
 
TranscriptDocument_[n] contains the response of the FTP server at a particular step. Every such document contains the response from one step in the FTP session, that is, the response to one FTP command.
Documents retrieved from the FTP site by the mget command are saved under tags named after the original file names.
There is also a list of DocumentId values under the DocumentList tag. These are the unique IDs under which the documents can be found in this particular workflow context. Some services can process a document based on its DocumentId. For example, the FTP Client PUT Service accepts a DocumentId parameter; based on that value, the service takes the document from the current workflow context and puts it on the FTP site. If no value for DocumentId is specified, the service puts the primary document on the remote server.
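
A PUT operation that passes a DocumentId could look roughly like the sketch below. The DocumentId parameter is the one described above. The participant name and the message name are assumptions modeled on the GET operation in this example, and any other parameters the service may require are left out. The operation would have to sit between the Begin Session and End Session operations so that a SessionToken is available.

```xml
<operation name="FTP Client PUT Service">
  <!-- Participant and message names are assumed by analogy with the GET operation. -->
  <participant name="FTPClientPut"/>
  <output message="FTPClientPutServiceTypeInputMessage">
    <!-- Send the first document from DocumentList instead of the primary document. -->
    <assign to="DocumentId" from="DocumentList/DocumentId[1]/text()"></assign>
    <assign to="SessionToken" from="SessionToken/text()"></assign>
    <assign to="." from="*"></assign>
  </output>
  <input message="inmsg">
    <assign to="." from="*"></assign>
  </input>
</operation>
```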
 
 
  3. B2B Mail Client Adapter
 
One more example shows documents retrieved from a mail server. We can find the body of the mail in the PrimaryDocument and the attachments under the Mail_Mime_DOC_[n] tags.
 
Business process example that parses a raw mail message taken from the mail server, containing a body and 3 attachments:
 
<process name="default">
 <sequence>
    <operation name="Mail Mime Service">
      <participant name="MailMimeService"/>
      <output message="MailMimeServiceInputMessage">
        <assign to="mail-mime-operation">parse</assign>
        <assign to="parse">true</assign>
        <assign to="." from="*"></assign>
      </output>
      <input message="inmsg">
        <assign to="." from="*"></assign>
      </input>
    </operation>
 
 </sequence>
</process>
 
Result of the Mail Mime Service that parses mail MIME message:
 
<?xml version="1.0" encoding="UTF-8"?>
<ProcessData>
 <Mail_Client>
    <Headers>
      <X-MimeOLE>Produced By Microsoft MimeOLE V6.00.2900.3198</X-MimeOLE>
      <To>&lt;user@localhost&gt;</To>
      <From>"sender" &lt;user@localhost&gt;</From>
      <Received>from [127.0.0.1]</Received>
      <Content-Type>multipart/mixed;
          boundary="----=_NextPart_000_0003_01C87889.481382B0"</Content-Type>
      <Date>Tue, 26 Feb 2008 15:07:19 +0100</Date>
      <Attachment_Count>3</Attachment_Count>
      <X-Mailer>Microsoft Outlook Express 6.00.2900.3138</X-Mailer>
      <MIME-Version>1.0</MIME-Version>
      <Message-ID>&lt;000701c87880$e6a291e0$8865a8c0@serverName&gt;</Message-ID>
      <Subject>a</Subject>
      <X-Priority>3</X-Priority>
      <X-MSMail-Priority>Normal</X-MSMail-Priority>
    </Headers>
    <Attachments>
      <Filenames>
        <Filename3>attachmentName_3.txt</Filename3>
        <Filename2>attachmentName_2.txt</Filename2>
        <Filename1>attachmentName_1.txt</Filename1>
      </Filenames>
      <FileExtensions>
        <FileExtension3>txt</FileExtension3>
        <FileExtension2>txt</FileExtension2>
        <FileExtension1>txt</FileExtension1>
      </FileExtensions>
      <ContentTypes>
        <Content_Type3>text/plain;
          name=" attachmentName_3.txt"</Content_Type3>
        <Content_Type2>text/plain;
          name=" attachmentName_2.txt"</Content_Type2>
        <Content_Type1>text/plain;
          name=" attachmentName_1.txt"</Content_Type1>
      </ContentTypes>
    </Attachments>
 </Mail_Client>
 <b2b-raw-message>true</b2b-raw-message>
 <b2b-protocol>smtp</b2b-protocol>
 <PrimaryDocument SCIObjectID="serverName:578ceb:1184f7a12a5:1eb4"/>
 <Mail_Mime_DOC_2 SCIObjectID="serverName:578ceb:1184f7a12a5:1eb7"/>
 <Mail_Mime_DOC_3 SCIObjectID="serverName:578ceb:1184f7a12a5:1eba"/>
 <Mail_Mime_DOC_4 SCIObjectID="servername:578ceb:1184f7a12a5:1ebd"/>
 <Mail_Mime>
    <Total_Message_Content>4</Total_Message_Content>
 </Mail_Mime>
</ProcessData>
 
 
This is one more example in which more than one document is created in the process data. In this case, however, we can find the PrimaryDocument as well as other documents. The body of the mail is placed in the PrimaryDocument, and the attachments can be found in the documents under the Mail_Mime_DOC_[n] tag names.
 
 
We saw 3 different examples where services produce more than one document in the process data.
 
If we want to give those documents, one by one, to a service that operates on the PrimaryDocument, we can create a loop. In every iteration of the loop, one document is renamed to PrimaryDocument so that the service that will operate on it can recognize the name. No service that operates on the PrimaryDocument can take a document named, e.g., FSA_Document1 from the process data. Such a name always has to be changed to PrimaryDocument before the document is given to a service that operates on a document.
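
The loop described above could be sketched in BPML roughly as follows, using the File System Adapter result from the first example. This is an untested outline, not a definitive implementation: the names `moreDocuments`, `counter`, `documentLoop`, `processOneDocument` and `nextDocument` are made up for illustration, and the service that works on the PrimaryDocument is left as a comment.

```xml
<process name="loop_over_collected_documents">
  <!-- True while counter has not passed the number of collected documents. -->
  <rule name="moreDocuments">
    <condition>number(counter) &lt;= number(FSA_DocumentCount)</condition>
  </rule>

  <sequence name="main">
    <!-- The File System Adapter collect operation would run here first. -->

    <!-- Start with FSA_Document1. -->
    <assign to="counter">1</assign>

    <choice name="documentLoop">
      <select>
        <case ref="moreDocuments" activity="processOneDocument"/>
      </select>

      <sequence name="processOneDocument">
        <!-- Save FSA_Document[counter] under the name PrimaryDocument. -->
        <assign to="PrimaryDocument"
                from="*[name() = concat('FSA_Document', //counter/text())]/@SCIObjectID"></assign>

        <!-- Place the operation that works on PrimaryDocument here. -->

        <!-- Move on to the next document and evaluate the rule again. -->
        <assign to="counter" from="number(counter) + 1"></assign>
        <repeat name="nextDocument" ref="documentLoop"/>
      </sequence>
    </choice>
  </sequence>
</process>
```

Matching on the full element name with `concat` rather than on a name prefix is deliberate: FSA_DocumentCount also starts with FSA_Document and must not be picked up as a document.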
 
A section about renaming documents in the process data follows.
 
 
  • Renaming document and its purpose
 
    • Taking a document from the PrimaryDocument and saving under another name
 
If we want to take a document saved under the PrimaryDocument tag and save it under another tag name, e.g. TEMP_Storage, it can be done by the following assign statement.
 
    • Assign statement
 
<assign to="TEMP_Storage" from="PrimaryDocument/@SCIObjectID"></assign>
 
    • Configuration in the Graphical Process Modeler:
 
 
    • Result in the process data
 
<?xml version="1.0" encoding="UTF-8"?>
<ProcessData>
 <PrimaryDocument SCIObjectID="serverName:4741d6:11859e01178:-3e47"/>
 <TEMP_Storage SCIObjectID="serverName:4741d6:11859e01178:-3e47"/>
</ProcessData>
 
    • Purpose for such an action
 
We now know that some services create a PrimaryDocument in the process data area. If we imagine two or more services that each place their result into the PrimaryDocument, we can see that the second service will overwrite the result of the first one in the PrimaryDocument, the third service will overwrite the result of the second one, and so on.
If we want to keep the result of a specific service, or prevent another service from overwriting it, we simply take the content of the PrimaryDocument and save the whole document under another tag name (in our example, TEMP_Storage). We can also see that both tags now exist in the process data. When we save a document under another name, nothing disappears from the process data; another element with a link to the document is simply added. If we want or need to remove the PrimaryDocument, we have to release it explicitly with the Release Service. So, strictly speaking, we do not rename a document in the process data; we create a new one. No action in a business process can remove an element from the process data except the Release Service.
 
 
    • Releasing the PrimaryDocument after creating the TEMP_Storage (if necessary)
 
Configuration of the Release Service to release the PrimaryDocument is:
 
<operation name="Release Service">
      <participant name="ReleaseService"/>
      <output message="ReleaseServiceTypeInputMessage">
        <assign to="TARGET">PrimaryDocument</assign>
        <assign to="." from="*"></assign>
      </output>
      <input message="inmsg">
        <assign to="." from="*"></assign>
      </input>
</operation>
 
 
 
    • Taking a document from a tag whose name is different from PrimaryDocument and saving it under PrimaryDocument
 
If we want to take a document saved under the TEMP_Storage tag and save it under the PrimaryDocument, it can be done by the following assign statement:
 
    • Assign statement
 
<assign to="PrimaryDocument" from="TEMP_Storage/@SCIObjectID"></assign>
 
    • Configuration in the Graphical Process Modeler:
 
 
    • Result in the process data
 
<?xml version="1.0" encoding="UTF-8"?>
<ProcessData>
 <TEMP_Storage SCIObjectID="MIRJANA:4741d6:11859e01178:-2b08"/>
 <PrimaryDocument SCIObjectID="MIRJANA:4741d6:11859e01178:-2b08"/>
</ProcessData>
 
    • Purpose for such an action
 
If the content of a document saved under a name other than PrimaryDocument has to be taken by a service in a business process, we first have to save it under the name PrimaryDocument. As we said before, any service in GIS can recognize only the PrimaryDocument name, not any other name in the process data area.
Once we have returned the content of the document saved under the TEMP_Storage tag into the PrimaryDocument, we may want to release the TEMP_Storage element where the document was saved.
 
    • Releasing the TEMP_Storage after creating the PrimaryDocument (if necessary)
 
Configuration of the Release Service to release the TEMP_Storage is:
 
<operation name="Release Service">
      <participant name="ReleaseService"/>
      <output message="ReleaseServiceTypeInputMessage">
        <assign to="TARGET">TEMP_Storage</assign>
        <assign to="." from="*"></assign>
      </output>
      <input message="inmsg">
        <assign to="." from="*"></assign>
      </input>
</operation>
 
 
 
In both of these cases, we do not rename a document, but create a new document in the process data under different tag name.

There are other ways to preserve a document without explicitly using assign to create a new document, but that topic will be explained in a separate document ... when I find time to write it ...