
Alfresco Transfer Service 3.3.3 - Getting Started

Rich Hart

As part of a customer requirement to expose publications to multiple Alfresco repositories, along with a recent version upgrade to Alfresco 3.3.3, we decided to implement and extend the Alfresco Transfer service.

We found it very easy to get going, and soon had an out-of-the-box transfer service running between two Alfresco instances, configured with one class (the custom action) and a couple of Spring context file changes to wire up the beans. What is particularly nice is that no configuration is required to turn on the transfer service at either the source or the target end.

As part of the transfer, a manifest file is created. It is similar to the XML file in an ACP and outlines the nodes to transfer, the metadata attached to each of them, and any associations. The manifest file is held temporarily at: <tomcat-home>\temp\Alfresco\longLife_transfer. If you have an error or failure that might be content related, this is a good place to start identifying the problem. Note that on a successful transfer the manifest file is automatically deleted after the commit event takes place.

The Process:

The metadata and the file content are not sent to the target at the same time. The source first gathers the metadata, association information, and parent folder locations of all the nodes, and creates the XML manifest file from this information. A conversation then takes place in which this manifest is sent to the target first (over HTTP or HTTPS), allowing the target to analyse it and tell the source which nodes' content it actually needs. For example, a node with the correct timestamp may already be present at the target and so not need re-transmitting from the source, which is obviously a great efficiency saving. Based on this feedback, the source then sends the actual content of the files. These files are sent in batches, again for performance and efficiency reasons. Content is prepared in a buffer and is “chunked” into packages of around 1MB, which is the default value in the ContentChunkerImpl class. When the buffer has reached its chunk size, it is flushed and the content is sent to the target.
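To make the buffering behaviour a little more concrete, here is a minimal sketch of the idea in plain Java. It is not the real ContentChunkerImpl: the class name ChunkerSketch, its methods, and the exact 1,000,000-byte default are all assumptions made for illustration. It only shows content being collected until the buffer reaches the chunk size, at which point it is flushed as one package.

```java
import java.io.ByteArrayOutputStream;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; this is NOT Alfresco's ContentChunkerImpl.
class ChunkerSketch {
    // Assumed stand-in for the "around 1MB" default described above.
    static final int DEFAULT_CHUNK_SIZE = 1_000_000;

    private final int chunkSize;
    private final ByteArrayOutputStream buffer = new ByteArrayOutputStream();
    // Stands in for "sent to the target": each entry is one flushed package.
    private final List<byte[]> sent = new ArrayList<>();

    ChunkerSketch(int chunkSize) {
        this.chunkSize = chunkSize;
    }

    // Add one file's content to the buffer; flush once the chunk size is reached.
    void addContent(byte[] content) {
        buffer.write(content, 0, content.length);
        if (buffer.size() >= chunkSize) {
            flush();
        }
    }

    // Send whatever is buffered (if anything) as one package and empty the buffer.
    void flush() {
        if (buffer.size() > 0) {
            sent.add(buffer.toByteArray());
            buffer.reset();
        }
    }

    List<byte[]> sentChunks() {
        return sent;
    }

    public static void main(String[] args) {
        ChunkerSketch chunker = new ChunkerSketch(DEFAULT_CHUNK_SIZE);
        chunker.addContent(new byte[600_000]); // buffered, below the chunk size
        chunker.addContent(new byte[600_000]); // buffer reaches 1.2MB, flushed
        chunker.addContent(new byte[100_000]); // buffered again
        chunker.flush();                       // final flush at end of transfer
        System.out.println("Packages sent: " + chunker.sentChunks().size());
    }
}
```

Because whole files are added before the size check, a package can end up somewhat larger than the configured chunk size, which is one way to read "around 1MB".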

On a successful transfer the service passes through multiple event stages, and once the transfer has been committed to the target repository you can search for the relevant nodes. The target nodes maintain all properties, including the node-uuid, cm:creator, cm:created and cm:modified properties. On a transfer failure the transaction will roll back, and no nodes will be transferred.

Transfers can be made synchronously or asynchronously, by calling the transferService.transfer(...) or the transferService.transferAsync(...) methods from the Transfer Service API. Be aware that permissions are checked, so a user cannot transfer without the appropriate permissions to read from the source and to create at the target. One way around this is to wrap the execution in a "RunAs" admin call, but you may stumble across problems if running in async mode, as you'll need to run the entire transaction as admin. There are ways around this too. Try running the transaction using the Alfresco RetryingTransactionHelper class and create a RetryingTransactionCallback; this way you can set the user for the entire transaction (AuthenticationUtil.setFullyAuthenticatedUser("admin");), to ensure the whole transaction runs as the specified user.

The existing ACP process has a different approach to transmitting details about associations. The ACP method only seems to maintain associations if both target and source are present in that package (although I’m not sure that was always the case). What was always the case was that nodes did not transmit source-association information, but only target-association information. This was inconvenient since it relied on the transmitter knowing the correct order in which to transmit nodes, otherwise associations could get lost. The Transfer Service API, on the other hand, transmits both source and target association node-refs, meaning that getting the correct order is not so important.

Customising your service

The crawler is one of several ways in which the service can be customised. The crawler builds a set of noderefs which are handed to the transfer service for transfer. The way this set is built is determined by the NodeFinder implementation and the NodeFilter applied to the crawler. For the customer implementation we were not looking at bulk transfers, but at transferring a node and its dependencies when a certain business rule was applied. Out of the box there is a ChildAssociatedNodeFinder, which will transfer the actioned-upon node and all of its children. We needed the opposite of this, and created a PrimaryParentAssociatedNodeFinder, which took a root space as a constructor argument and would return the actioned-upon node and all of its primary parents up to the configured root space. This ensured that no “orphans” would occur and the transfer would commit successfully.
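The parent walk behind the PrimaryParentAssociatedNodeFinder can be sketched roughly as below. This is a simplified, self-contained model and not our actual class: it does not implement the Alfresco NodeFinder interface, nodes are plain strings rather than NodeRefs, and the primary-parent lookup is a hypothetical in-memory map standing in for the repository. Whether the root space itself is included in the result is also an assumption here (the sketch includes it).

```java
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration of walking primary parents up to a root space.
class PrimaryParentFinderSketch {
    private final String rootSpace;
    // Stand-in for a repository lookup: child node -> its primary parent.
    private final Map<String, String> primaryParentOf;

    PrimaryParentFinderSketch(String rootSpace, Map<String, String> primaryParentOf) {
        this.rootSpace = rootSpace;
        this.primaryParentOf = primaryParentOf;
    }

    // Returns the actioned-upon node followed by each primary parent,
    // stopping once the configured root space has been added.
    Set<String> findFrom(String actionedUponNode) {
        Set<String> found = new LinkedHashSet<>();
        String current = actionedUponNode;
        // add() returns false for a node already seen, which guards against cycles.
        while (current != null && found.add(current)) {
            if (current.equals(rootSpace)) {
                break; // do not climb above the configured root space
            }
            current = primaryParentOf.get(current);
        }
        return found;
    }
}
```

Returning the parents along with the node is what avoids the “orphans”: every folder between the node and the root space is part of the same transfer set.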

The second area that can be quickly extended is the call-back classes. When calling the transfer service it is optional to pass in a call-back or list of call-backs. Call-backs will catch each event stage and enable you to update progress to a UI, cancel an event mid-flow, add a failed transfer to a queue for retry, or update a property on the node on completion.

//Code example implementation of TransferCallback method processEvent
    @Override
    public void processEvent(TransferEvent event) {
        switch (event.getTransferState()) {

        case ERROR :
            logger.error("Error in transfer, adding to queue: " + event.getMessage());
            // Add this to a queue for retry
            break;

        case SUCCESS :
            logger.debug("Successfully completed transfer");
            // Set some properties on completion, once the last event has arrived
            if (event.isLast()) {
                setTransferSuccess(actionedUponNode);
            }
            break;

        default :
            // Any other stage: just log the progress message
            logger.debug(event.getMessage());
        }
    }
//End of code example

In summary, the transfer service is a great mechanism for sharing content between multiple Alfresco instances. Its implementation is flexible and allows for customisations and extensions. Version 3.4 already has improvements on the initial release in 3.3, and I look forward to spending some time investigating these.