Alfresco: counting more than 1,000 elements

Many people need to count elements inside the repository. In a typical repository, having more than 1,000 elements of the same type or with the same aspect is a common scenario.

This blog post describes several ways of counting elements in an Alfresco repository.

Problem statement

How many nodes with the ust:businessDocument aspect are in the repository?

Let’s assume the following content model: a base aspect named ust:businessDocument and two inheriting aspects named ust:inboundDoc and ust:outboundDoc.

<aspects>
    <aspect name="ust:businessDocument">
        <properties>
            <property name="ust:docDate">
                <type>d:datetime</type>
            </property>
        </properties>
    </aspect>
    <aspect name="ust:inboundDoc">
        <parent>ust:businessDocument</parent>
        <properties>
            <property name="ust:receivedDate">
                <type>d:datetime</type>
            </property>
        </properties>
    </aspect>
    <aspect name="ust:outboundDoc">
        <parent>ust:businessDocument</parent>
        <properties>
            <property name="ust:sentDate">
                <type>d:datetime</type>
            </property>
        </properties>
    </aspect>
</aspects>

So every node that has any of these three aspects must be counted.

For this sample, I’ve prepared a repository with 2,403 nodes that have at least one of these aspects.

Using CMIS

A short Groovy script running a simple CMIS query is the obvious first attempt:

import org.apache.chemistry.opencmis.commons.*
import org.apache.chemistry.opencmis.commons.data.*
import org.apache.chemistry.opencmis.commons.enums.*
import org.apache.chemistry.opencmis.client.api.*
import org.apache.chemistry.opencmis.client.util.*

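String cql = "SELECT cmis:objectId FROM ust:businessDocument"

// Ask for a very large page so that any cap on the result is not imposed by the client
OperationContext opCon = session.createOperationContext()
opCon.setMaxItemsPerPage(1000000)

// Second argument: searchAllVersions = false
ItemIterable<QueryResult> results = session.query(cql, false, opCon)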

println "--------------------------------------"
println "Total number: ${results.totalNumItems}"
println "Has more: ${results.hasMoreItems}"
println "--------------------------------------"


--------------------------------------
Total number: 1000
Has more: true
--------------------------------------

However, CMIS (and also FTS) only retrieves 1,000 elements here: the script reports a total of 1,000 and signals that more items exist. You can play with paging and skipping, but there is no (simple) way to obtain more than 1,000.

Using Database

Querying the Alfresco database directly is not recommended, but this looks like the right occasion to do it.

Let’s start with a simple query to see what happens.

SELECT count(1)
FROM alf_node AS n,
  alf_node_aspects AS a,
  alf_qname AS q,
  alf_namespace AS ns,
  alf_store AS s
WHERE a.qname_id = q.id
  AND a.node_id = n.id
  AND q.ns_id = ns.id
  AND n.store_id = s.id
  AND s.protocol = 'workspace'
  AND s.identifier = 'SpacesStore'
  AND ns.uri = 'http://www.ust-global.com/model/business/1.0'
  AND q.local_name in ('businessDocument');
 count
-------
   801
(1 row)

It looks like parent aspects are not stored against the node: only the 801 nodes where ust:businessDocument itself was applied are counted. So we need to include every inheriting aspect in the query.

SELECT count(1)
FROM alf_node AS n,
  alf_node_aspects AS a,
  alf_qname AS q,
  alf_namespace AS ns,
  alf_store AS s
WHERE a.qname_id = q.id
  AND a.node_id = n.id
  AND q.ns_id = ns.id
  AND n.store_id = s.id
  AND s.protocol = 'workspace'
  AND s.identifier = 'SpacesStore'
  AND ns.uri = 'http://www.ust-global.com/model/business/1.0'
  AND q.local_name in ('businessDocument', 'inboundDoc', 'outboundDoc');
 count
-------
  2403
(1 row)
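
The list of local names in the IN clause was written by hand from the content model. For a deeper hierarchy it could be derived instead. The following is only a sketch using plain JDK collections: the class name AspectHierarchy and the parent map are illustrative assumptions, not part of Alfresco, and the map is expected to be filled from the model definition shown earlier.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical helper, not an Alfresco class.
class AspectHierarchy {

    /**
     * parentOf maps each aspect name to its direct parent.
     * Returns the root aspect plus every aspect inheriting from it,
     * directly or through intermediate aspects.
     */
    static Set<String> withDescendants(String root, Map<String, String> parentOf) {
        Set<String> found = new LinkedHashSet<>();
        Deque<String> pending = new ArrayDeque<>();
        pending.add(root);
        while (!pending.isEmpty()) {
            String current = pending.poll();
            if (!found.add(current)) {
                continue; // already visited
            }
            // Queue every aspect whose direct parent is the current one
            for (Map.Entry<String, String> entry : parentOf.entrySet()) {
                if (current.equals(entry.getValue())) {
                    pending.add(entry.getKey());
                }
            }
        }
        return found;
    }

    public static void main(String[] args) {
        Map<String, String> parentOf = Map.of(
                "ust:inboundDoc", "ust:businessDocument",
                "ust:outboundDoc", "ust:businessDocument");
        System.out.println(withDescendants("ust:businessDocument", parentOf));
    }
}
```

The resulting set contains the three aspects used in the query above; the order of the two inheriting aspects is not guaranteed.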

So we have the number we were looking for, but we are scanning the alf_node table to get it: database performance could be degraded!

Using SOLR

Bypassing the Alfresco layer entirely, we can query the SOLR engine directly against the alfresco core:

https://localhost/solr/alfresco/afts?q=ASPECT:%22ust:businessDocument%22

<response>
	<lst name="responseHeader">
		<int name="status">0</int>
		<int name="QTime">6</int>
		<lst name="params">
			<str name="q">ASPECT:"ust:businessDocument"</str>
		</lst>
	</lst>
	<result name="response" numFound="2403" start="0">
		<doc>
			<str name="id">_DEFAULT_!8000000000000019!8000000000005279</str>
			<long name="_version_">0</long>
			<long name="DBID">21113</long>
		</doc>
	</result>
	<bool name="processedDenies">false</bool>
</response>

So numFound reports 2,403, obtained with a simple query and without degrading Alfresco performance.
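
If the count has to be read from a program rather than from the browser, the numFound attribute can be extracted from the XML response. This is a minimal sketch using only the JDK XML parser; the class name SolrCountParser is an assumption, and fetching the response over HTTPS (including whatever client certificate the SOLR endpoint requires) is left out.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper, not part of Alfresco or SOLR.
class SolrCountParser {

    /** Returns the numFound attribute of the first <result> element. */
    static long parseNumFound(String solrXml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            // The response never needs a DOCTYPE, so refuse one
            factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
            DocumentBuilder builder = factory.newDocumentBuilder();
            Document doc = builder.parse(
                    new ByteArrayInputStream(solrXml.getBytes(StandardCharsets.UTF_8)));
            NodeList results = doc.getElementsByTagName("result");
            if (results.getLength() == 0) {
                throw new IllegalArgumentException("No <result> element in response");
            }
            Element result = (Element) results.item(0);
            return Long.parseLong(result.getAttribute("numFound"));
        } catch (IllegalArgumentException e) {
            throw e;
        } catch (Exception e) {
            throw new IllegalArgumentException("Unreadable SOLR response", e);
        }
    }

    public static void main(String[] args) {
        String sample = "<response><result name=\"response\" numFound=\"2403\" start=\"0\"/></response>";
        System.out.println(parseNumFound(sample)); // prints 2403
    }
}
```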

REST API

After talking for a while with Younes Regaieg and Axel Faust on the Alfresco IRC channel, I realised that there is also a way to trigger a SOLR query through the REST API. When the query cannot be processed as a transactional metadata query (TMQ), Alfresco also counts the elements when the REST API is used.

This is why name:* is added to the query in the following snippet.

[POST]

https://localhost/alfresco/api/-default-/public/search/versions/1/search

[Payload]

{
  "query": 
  {
     "language": "afts", 
     "query": "name:* AND ASPECT:\"ust:businessDocument\""
  }
}

The count is returned in the totalItems field of the response:

{
    "list": {
        "pagination": {
            "count": 100,
            "hasMoreItems": true,
            "totalItems": 2403,
            "skipCount": 0,
            "maxItems": 100
        },
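
The same call can be prepared from Java. The sketch below assumes Java 11 or later for java.net.http; the class name SearchCountSketch is illustrative, authentication is deliberately omitted, and totalItems is pulled out with a regular expression only to avoid depending on a JSON library.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the Alfresco API.
class SearchCountSketch {

    // Matches "totalItems": <digits> inside the pagination object
    private static final Pattern TOTAL_ITEMS =
            Pattern.compile("\"totalItems\"\\s*:\\s*(\\d+)");

    /** Builds the AFTS payload; name:* keeps the query from being answered as a TMQ. */
    static String buildPayload(String aspect) {
        String afts = "name:* AND ASPECT:\"" + aspect + "\"";
        // Escape backslashes and quotes so the query is a valid JSON string value
        String escaped = afts.replace("\\", "\\\\").replace("\"", "\\\"");
        return "{\"query\":{\"language\":\"afts\",\"query\":\"" + escaped + "\"}}";
    }

    /** Prepares the POST request; credentials still have to be added by the caller. */
    static HttpRequest buildRequest(String baseUrl, String aspect) {
        return HttpRequest.newBuilder()
                .uri(URI.create(baseUrl + "/alfresco/api/-default-/public/search/versions/1/search"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(buildPayload(aspect)))
                .build();
    }

    /** Reads pagination.totalItems from the response body. */
    static long extractTotalItems(String responseJson) {
        Matcher matcher = TOTAL_ITEMS.matcher(responseJson);
        if (!matcher.find()) {
            throw new IllegalArgumentException("totalItems not found in response");
        }
        return Long.parseLong(matcher.group(1));
    }
}
```

The request returned by buildRequest("https://localhost", "ust:businessDocument") can then be sent with HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString()) and the body passed to extractTotalItems.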

The same result can be obtained in different ways, but there is usually one alternative that is better than the others. Some experimentation in test environments can save you a lot of trouble in your production service!

How to create a Site for Alfresco using Java API

Some time ago, Site creation in Alfresco was driven by the Share web application. As a result, calling SiteService.createSite from the repository Java API is not enough to make the site accessible from Share.

Some useful snippets are listed below.

Creating the Site

// Default site preset
String sitePreset = "site-dashboard";
SiteInfo siteInfo = siteService.createSite(sitePreset,
    "test-site", 
    "Test Site",
    "Test Site description", 
    SiteVisibility.PRIVATE);

Adding site members

Including at least a Manager is advisable.

siteService.setMembership("test-site", 
    "admin",
    SiteRole.SiteManager.toString());

Creating dashboard

This step is required to provide access from Share.

private void createDefaultDashboard(SiteInfo siteInfo) {
    
    NodeService nodeService = serviceRegistry.getNodeService();
    FileFolderService fileFolderService = serviceRegistry.getFileFolderService();
    ContentService contentService = serviceRegistry.getContentService();
    
    FileInfo surfConfig = fileFolderService.create(siteInfo.getNodeRef(), "surf-config", ContentModel.TYPE_FOLDER);
    Map<QName, Serializable> properties = new HashMap<QName, Serializable>();
    properties.put(ContentModel.PROP_CASCADE_HIDDEN, Boolean.TRUE);
    properties.put(ContentModel.PROP_CASCADE_INDEX_CONTROL, Boolean.TRUE);
    nodeService.addAspect(surfConfig.getNodeRef(), ContentModel.ASPECT_HIDDEN, properties);
    // Hint from Bertrand Forest
    properties = new HashMap<QName, Serializable>();
    properties.put(ContentModel.PROP_IS_INDEXED, Boolean.FALSE);
    properties.put(ContentModel.PROP_IS_CONTENT_INDEXED, Boolean.FALSE);
    nodeService.addAspect(surfConfig.getNodeRef(), ContentModel.ASPECT_INDEX_CONTROL, properties);
    
    FileInfo pages = fileFolderService.create(surfConfig.getNodeRef(), "pages", ContentModel.TYPE_FOLDER);
    FileInfo site = fileFolderService.create(pages.getNodeRef(), "site", ContentModel.TYPE_FOLDER);
    FileInfo siteName = fileFolderService.create(site.getNodeRef(), siteInfo.getShortName(), ContentModel.TYPE_FOLDER);
    
    Map<QName, Serializable> props = new HashMap<QName, Serializable>(1);
    props.put(ContentModel.PROP_NAME, "dashboard.xml");

    NodeRef node = nodeService.createNode(
            siteName.getNodeRef(),
            ContentModel.ASSOC_CONTAINS,
            QName.createQName(NamespaceService.CONTENT_MODEL_1_0_URI, "dashboard.xml"),
            ContentModel.TYPE_CONTENT,
            props).getChildRef();
                        
    ContentWriter writer = contentService.getWriter(node, ContentModel.PROP_CONTENT, true);
    writer.setMimetype(MimetypeMap.MIMETYPE_XML);
    writer.setEncoding("UTF-8");
    // TODO Create dashboard.xml file by using an external file resource instead of a hand-coded String
    writer.putContent("<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n" + 
            "<page>\n" + 
            "      <title>Collaboration Site Dashboard</title>\n" + 
            "      <title-id>page.siteDashboard.title</title-id>\n" + 
            "      <description>Collaboration site's dashboard page</description>\n" + 
            "      <description-id>page.siteDashboard.description</description-id>\n" + 
            "      <authentication>user</authentication>\n" + 
            "      <template-instance>dashboard-2-columns-wide-left</template-instance>\n" + 
            "      <properties>\n" + 
            "        <sitePages>[{\"pageId\": \"documentlibrary\"}]</sitePages>\n" + 
            "      <theme/><dashboardSitePage>true</dashboardSitePage></properties>\n" + 
            "    <page-type-id>generic</page-type-id></page>");
    
}
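
The TODO in the method above suggests moving the dashboard XML out of the Java source. One possible approach, sketched with the JDK only (Java 9 or later for readAllBytes), is to read it from a classpath resource. The class name DashboardTemplateLoader and any resource path passed to it are assumptions made for illustration.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of the Alfresco API.
class DashboardTemplateLoader {

    /** Reads the whole stream as UTF-8 text and closes it. */
    static String readUtf8(InputStream in) {
        try (InputStream stream = in) {
            return new String(stream.readAllBytes(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Loads a classpath resource such as a dashboard.xml bundled with the module. */
    static String loadResource(String path) {
        InputStream in = DashboardTemplateLoader.class.getResourceAsStream(path);
        if (in == null) {
            throw new IllegalArgumentException("Resource not found: " + path);
        }
        return readUtf8(in);
    }
}
```

With such a helper, the hand-coded String passed to writer.putContent could be replaced by the result of loadResource, given a resource path of your choosing.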

Creating Site containers

Several containers can be created (wiki, forum,…), but a Document Library is required for many use cases.

NodeRef documentLibraryNodeRef = siteService.createContainer(siteInfo.getShortName(), SiteService.DOCUMENT_LIBRARY, null, null);

Some operations are not described in the Alfresco documentation, but they can be worked out by inspecting the internal structure of the repository.

Alfresco 5: using only alfresco-global.properties

Deploying Alfresco artifacts in several environments requires different parameters in configuration files.

Some time ago I was working with Alfresco the Maven way: packaging properties together with the artifact by using profiles. This method requires several packages of the same module: one for each environment.

Experienced Alfresco developers, and the Alfresco documentation itself, suggest keeping module properties only in alfresco-global.properties for repo artifacts and in share-config-custom.xml for Share artifacts.

However, this is one more file than required. Let me show you how to use alfresco-global.properties for Share modules with the following example.

Properties

Include the property values in alfresco-global.properties.

customHeaderModule.url=http://intranet.keensoft.local/alfresco/cabecera.html
customHeaderModule.height=216px

Repo project

Expose these properties as a REST service.

1. Describing the REST service
/src/main/amp/config/alfresco/extension/templates/webscripts/es/keensoft/custom-actions/headerParams.get.desc.xml

<webscript>
  <shortname>Get header params</shortname>
  <description>Get header params
  To test: curl -v "http://localhost:8080/alfresco/service/keensoft/header/params"
  </description>
  <family>keensoft</family>
  <url>/keensoft/header/params</url>
  <format default="json"/>
  <authentication>none</authentication>
  <transaction>none</transaction>
</webscript>

2. Declaring Spring bean for the Web Script
/src/main/amp/config/alfresco/module/custom-actions/context/services-context.xml

<!-- Return params from alfresco-global.properties -->
<bean id="webscript.es.keensoft.custom-actions.headerParams.get" 
      class="es.keensoft.alfresco.action.webscript.HeaderParamsWebScript" parent="webscript">
    <property name="url" value="${customHeaderModule.url}" />
    <property name="height" value="${customHeaderModule.height}" />
</bean>

3. Implementing logic for the service
/src/main/java/es/keensoft/alfresco/action/webscript/HeaderParamsWebScript.java

package es.keensoft.alfresco.action.webscript;

import java.io.IOException;

import org.json.simple.JSONObject;
import org.springframework.extensions.webscripts.AbstractWebScript;
import org.springframework.extensions.webscripts.WebScriptRequest;
import org.springframework.extensions.webscripts.WebScriptResponse;
import org.springframework.http.MediaType;

public class HeaderParamsWebScript extends AbstractWebScript {
	
	private String url;
	private String height;
	
	@SuppressWarnings("unchecked")
	@Override
	public void execute(WebScriptRequest request, WebScriptResponse response) throws IOException {
		
		// Expose both configured values as a small JSON object
		JSONObject obj = new JSONObject();
		obj.put("customHeaderModuleHeight", height);
		obj.put("customHeaderModuleUrl", url);

		String jsonString = obj.toString();
		response.setContentEncoding("UTF-8");
		response.setContentType(MediaType.APPLICATION_JSON.toString());
		response.getWriter().write(jsonString);

	}

	public void setUrl(String url) {
		this.url = url;
	}

	public void setHeight(String height) {
		this.height = height;
	}

}
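
The ${customHeaderModule.url} and ${customHeaderModule.height} placeholders in the bean declaration of step 2 are filled in from alfresco-global.properties before the setters of this class are called. The following is only a rough JDK-only illustration of that kind of substitution, not Spring's actual implementation; the class name PlaceholderSketch is an assumption.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration; Spring performs the real resolution.
class PlaceholderSketch {

    // Matches ${key} and captures the key
    private static final Pattern PLACEHOLDER = Pattern.compile("\\$\\{([^}]+)\\}");

    /** Parses properties-file text such as the two lines shown earlier. */
    static Properties parse(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    /** Replaces each ${key} with its value; unknown keys are left untouched. */
    static String resolve(String value, Properties props) {
        Matcher matcher = PLACEHOLDER.matcher(value);
        StringBuilder out = new StringBuilder();
        while (matcher.find()) {
            String replacement = props.getProperty(matcher.group(1), matcher.group(0));
            matcher.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        matcher.appendTail(out);
        return out.toString();
    }
}
```

Note that java.util.Properties keeps trailing whitespace in values, so a stray space at the end of a line in the properties file ends up in the resolved URL.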

Share project

Retrieving the alfresco-global.properties values through the REST service exposed by the repo module.

1. Declaring Surf customization
/src/main/amp/config/alfresco/web-extension/site-data/extensions/custom-header.xml

<extension>
  <modules>
    <module>
      <id>Custom Header</id>
      <version>1.0</version>
      <auto-deploy>true</auto-deploy>
      <customizations>
          <customization>
              <targetPackageRoot>org.alfresco.share.header</targetPackageRoot>
              <sourcePackageRoot>es.keensoft.share.header</sourcePackageRoot>
          </customization>
      </customizations>
    </module>
  </modules>
</extension>

2. Retrieving values from the REST service
/src/main/amp/config/alfresco/web-extension/site-webscripts/es/keensoft/share/header/share-header.get.js

function main() {
	var headerParams = jsonConnection("/keensoft/header/params");
	model.url = headerParams.customHeaderModuleUrl;
	model.height = headerParams.customHeaderModuleHeight;
}

main();

function jsonConnection(url) {
	
	var connector = remote.connect("alfresco"),
		result = connector.get(url);

	if (result.status == 200) {
		// The body is JSON produced by our own web script
		return eval('(' + result + ')');
	} else {
		return null;
	}
}

3. Extending view by using FTL
/src/main/amp/config/alfresco/web-extension/site-webscripts/es/keensoft/share/header/share-header.get.html.ftl

<@markup id="custom-header-resources" action="before" target="html">
	<iframe id="ifHeader" name="ifHeader" scrolling="auto" frameborder="0"
	    width="100%" 
	    height="${height}" 
	    src="${url}">
	</iframe>
</@markup>

Result

A new header is shown above the original Alfresco header. The url and height parameters are retrieved from alfresco-global.properties.

[Image: alfresco-header-customized]