This blog post was originally published on the keensoft blog on 4 August 2018.
keensoft UST Global started using Alfresco Community for internal purposes in early 2013 on Ubuntu 12.04. We started with 4.0.d but soon upgraded to 4.2.c. Since then, we have shared every upgrade process on our blog:
- Upgrade from 4.2.c to 5.0.c
- Upgrade from 5.0.c to 5.1.g (201605)
- Upgrade from 5.1.g (201605) to 5.2.c (201702)
A few days ago, Alfresco released the first Alfresco 6 GA (identified as 201806-GA). Since we need some hands-on experience before spreading the word to our customers, we started planning the upgrade of our internal server.
Starting point
Our server ran the following base software:
- Ubuntu 14.04
- Alfresco CE 201707
- PostgreSQL 9.4.12
With 16 GB of RAM, memory was distributed as follows:
- 6 GB for Tomcat (Alfresco & Share)
- 2 GB for Jetty (SOLR)
- 4 GB for PostgreSQL
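As a quick check of the figures above, the three allocations add up to 12 GB of the 16 GB available. The sketch below only redoes that arithmetic; the variable names are invented for illustration, and the idea that the remainder serves the operating system and file cache is an assumption rather than something measured.

```shell
# Memory budget of the old server, in GB (figures from the list above).
total_gb=16
tomcat_gb=6     # Alfresco & Share
solr_gb=2       # Jetty (SOLR)
postgres_gb=4   # PostgreSQL

allocated_gb=$((tomcat_gb + solr_gb + postgres_gb))
headroom_gb=$((total_gb - allocated_gb))

# Assumption: the headroom is what remains for the OS and file cache.
echo "allocated=${allocated_gb}GB headroom=${headroom_gb}GB"
```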
Some other features:
- Authentication and synchronisation with our corporate OpenLDAP
- Many custom addons as AMPs and JARs, mainly home-made
Size of the repository:
$ cd /opt/alfresco/alf_data/contentstore
$ du --max-depth=1 -h
3.0G ./2013
14G ./2014
6.4G ./2015
14G ./2016
11G ./2017
20G ./2018
66G .
The new server
As our software was still running on the original 2012 server, we decided to set up a new server with the same specifications, but with a Docker service included by default:
- Ubuntu 18.04
- Docker 17.05
- 16 GB RAM
From release 6 onwards, Alfresco recommends installing the product from Docker images, so we wrote a simple Docker Compose file with the required settings.
Our Docker base images catalog is:
- alfresco/alfresco-content-repository-community:6.0.7-ga
- alfresco/alfresco-share:6.0
- alfresco/alfresco-search-services:1.1.1
- alfresco/alfresco-content-app:latest
- postgres:10.1
- jbarlow83/ocrmypdf:v7.0.0
However, we had to apply some configuration to these images to adapt them to our requirements.
version: '3'
services:
  httpd:
    build: ./httpd
    ports:
      - 80:80
    links:
      - alfresco
      - share
      - solr6
  adf:
    build: ./adf
    depends_on:
      - alfresco
    ports:
      - 3000:3000
  alfresco:
    build: ./alfresco
    privileged: true
    environment:
      JAVA_OPTS: '
        -Ddb.driver=org.postgresql.Driver
        -Ddb.username=alfresco
        -Ddb.password=alfresco
        -Ddb.url=jdbc:postgresql://postgres:5432/alfresco
        -Dsolr.host=solr6
        -Dsolr.port=8983
        -Dsolr.secureComms=none
        -Dsolr.base.url=/solr
        -Dindex.subsystem.name=solr6
        -Ddeployment.method=DOCKER_COMPOSE
        -Dcsrf.filter.enabled=false
        -Xmx6g -Xms6g
      '
    volumes:
      - ./data/alf-repo-data:/usr/local/tomcat/alf_data
      - ./data/ocr_input:/ocr_input
      - ./data/ocr_output:/ocr_output
    ports:
      - 21:21 #FTP port
      - 25:25 #SMTP port
      - 143:143 #IMAP port
      - 445:445 #CIFS
      - 137:137/udp #CIFS
      - 138:138/udp #CIFS
      - 139:139 #CIFS
  share:
    build: ./share
    environment:
      - REPO_HOST=alfresco
      - REPO_PORT=8080
      - 'CATALINA_OPTS= -Xms2g -Xmx2g'
  postgres:
    image: postgres:10.1
    environment:
      - POSTGRES_PASSWORD=alfresco
      - POSTGRES_USER=alfresco
      - POSTGRES_DB=alfresco
    # From pg_tune (4 GB + 2 CPUs + 300 connections)
    command: '
      postgres -c max_connections=300
      -c shared_buffers=1GB
      -c effective_cache_size=3GB
      -c maintenance_work_mem=256MB
      -c checkpoint_completion_target=0.7
      -c wal_buffers=16MB
      -c default_statistics_target=100
      -c random_page_cost=1.1
      -c effective_io_concurrency=200
      -c work_mem=3495kB
      -c min_wal_size=1GB
      -c max_wal_size=2GB
      -c max_worker_processes=2
      -c max_parallel_workers_per_gather=1
      -c max_parallel_workers=2
      -c log_min_messages=LOG
    '
    volumes:
      - ./data/postgres-data:/var/lib/postgresql/data
    ports:
      - 5432:5432
  solr6:
    image: alfresco/alfresco-search-services:1.1.1
    environment:
      - SOLR_ALFRESCO_HOST=alfresco
      - SOLR_ALFRESCO_PORT=8080
      - SOLR_SOLR_HOST=solr6
      - SOLR_SOLR_PORT=8983
      - SOLR_CREATE_ALFRESCO_DEFAULTS=alfresco,archive
      - SOLR_JAVA_MEM=-Xms2g -Xmx2g
    # Set permissions for user with uid 1000 ('isadm' in host, 'solr' in container)
    volumes:
      - ./data/solr-data:/opt/alfresco-search-services/data
  ocrmypdf:
    build: ./ocrmypdf
    hostname: ocrmypdf
    volumes:
      - ./data/ocr_input:/ocr_input
      - ./data/ocr_output:/ocr_output
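Before the first start, the host directories that the Compose file bind-mounts need to exist with suitable ownership, as the comment in the solr6 service hints. The following is a minimal sketch of that preparation, not the exact procedure we followed: the function name prepare_data_dirs is hypothetical, and the directory names are taken from the volumes entries of the Compose file.

```shell
# Create the host directories that the Compose file bind-mounts.
# $1 is the directory that holds docker-compose.yml (defaults to ".").
prepare_data_dirs() {
    base="${1:-.}"
    for dir in alf-repo-data ocr_input ocr_output postgres-data solr-data; do
        mkdir -p "$base/data/$dir" || return 1
    done
}

# Usage, from the directory that holds docker-compose.yml:
#   prepare_data_dirs .
#   sudo chown -R 1000 ./data/solr-data   # uid 1000 is 'solr' in the container
#   docker-compose config -q              # validate the file before starting it
```

Creating the directories up front avoids letting the Docker daemon create them as root on first start, which would leave the Solr data directory unwritable for uid 1000.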
To serve the ADF app over HTTPS under the /adf context path, we made some changes to its source code.
package.json
"build": "npm run server-versions && ng build --prod --base-href /adf/"
proxy.conf.js
"target": "https://0.0.0.0:443",
"secure": true,
src/app.config.json
"ecmHost": "https://{hostname}{:port}",
Finally, we wrote a small systemd unit to start Docker Compose on boot.
[Unit]
Description=Alfresco 6 Server
After=docker.service
Requires=docker.service

[Service]
ExecStart=/usr/local/bin/docker-compose -f /opt/alfresco/docker-compose.yml up
ExecStop=/usr/local/bin/docker-compose -f /opt/alfresco/docker-compose.yml down -v

[Install]
WantedBy=multi-user.target
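For systemd to pick the unit up on boot, it still has to be registered and enabled. A minimal sketch of those steps follows, assuming the unit was saved as /etc/systemd/system/alfresco.service; that file name and the enable_unit helper are illustrative choices, not taken from our server.

```shell
# Hypothetical unit name; adjust it to the file placed in /etc/systemd/system.
UNIT_NAME="alfresco.service"

# Reload systemd, enable the unit on boot and start it right away.
# Set RUNNER=echo to print the commands instead of running them.
enable_unit() {
    runner="${RUNNER:-sudo}"
    $runner systemctl daemon-reload &&
    $runner systemctl enable "$UNIT_NAME" &&
    $runner systemctl start "$UNIT_NAME"
}

# Usage:
#   RUNNER=echo enable_unit   # dry run
#   enable_unit               # real run, needs sudo rights
```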
Once the base installation was successful, we tested every addon and extension on Alfresco 6 (taking special care with repository developments) and adapted some of them.
At this point, we knew that Alfresco 6 was running on the new server with every customisation in place. So we made a first data import to start with capacity planning.
Capacity planning
As this was a new infrastructure, despite having the same resources, we ran some capacity tests to measure the quality of the service.
We considered three different scenarios:
- Alfresco 201707 installed by components: Tomcat, Jetty and PostgreSQL
- Alfresco 6 on Docker Compose
- Alfresco 6 on Docker Compose, but with the database installed natively on the host
One of the mantras most often repeated by Docker experts is that databases perform better when installed directly on a server than when running inside a Docker container. Our last scenario would help us test this hypothesis.
Once the results were analysed, we found that Docker Compose performed as well as (or even better than) the classic installation by components. We also found that installing the database natively had little impact on performance. So we confirmed our initial idea of running Alfresco 6 on Docker Compose with all the services as containers.
Upgrading
At this point, upgrading was easy, since we had already tested every element.
Without stopping the Alfresco 5 service, we started copying the content.
$ rsync -avzh -e "ssh -p 22" root@alfresco5:/opt/alfresco/alf_data/content* /opt/alfresco/data/alf-repo-data/
Once the synchronisation was complete, we stopped the Alfresco 5 service and performed a database dump.
$ sudo service alfresco stop
$ sudo su - postgres
$ cd /opt/alfresco/postgresql/bin
$ ./pg_ctl start
$ ./pg_dump alfresco > /tmp/exportFile-20180801.dmp
Password:
$ ./pg_ctl stop
$ exit
We copied the file to the new server and we restored the database in Alfresco 6.
$ docker-compose up postgres
$ cat exportFile-20180801.dmp | docker exec -i docker_postgres_1 psql -U alfresco -d alfresco
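Since the dump travels between two servers before it is restored, it can be worth confirming that the copy is intact. The sketch below compares SHA-256 digests; the helper names file_digest and same_digest are hypothetical, and this check is a suggestion rather than a step we documented.

```shell
# Print only the SHA-256 digest of a file (sha256sum is part of GNU coreutils).
file_digest() {
    sha256sum "$1" | cut -d ' ' -f 1
}

# Succeed only when both files have the same digest.
same_digest() {
    [ "$(file_digest "$1")" = "$(file_digest "$2")" ]
}

# Usage: run this on both servers and compare the printed values.
#   file_digest /tmp/exportFile-20180801.dmp
#
# After the restore, a quick sanity check is to count the rows of alf_node,
# the table that holds the repository nodes:
#   docker exec -i docker_postgres_1 psql -U alfresco -d alfresco -c 'SELECT count(*) FROM alf_node;'
```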
After that, we ran a second synchronisation to pick up the latest changes.
$ rsync -avzh -e "ssh -p 22" root@alfresco5:/opt/alfresco/alf_data/content* /opt/alfresco/data/alf-repo-data/
Then we decided to remove the SOLR indexes and let the server re-index all the content before starting our brand new Alfresco 6 for good.
$ rm -rf /opt/alfresco/data/solr-data
$ docker-compose up -d
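A slightly more defensive variant of the index reset is sketched below. It refuses to run with an empty path and recreates the directory straight away, so that it is owned by the invoking user instead of being created by the Docker daemon on start. The function name reset_solr_data is hypothetical, and the sketch assumes it is run as the host user with uid 1000 ('isadm' in the Compose file comment).

```shell
# Empty the Solr data directory so the indexes are rebuilt from scratch.
reset_solr_data() {
    solr_dir="$1"
    # Refuse to run with an empty argument, so "rm -rf" never sees a blank path.
    [ -n "$solr_dir" ] || return 1
    rm -rf "$solr_dir" && mkdir -p "$solr_dir"
}

# Usage:
#   reset_solr_data /opt/alfresco/data/solr-data
#   docker-compose up -d
#   docker-compose logs -f solr6   # follow the re-indexing in the Solr logs
```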
Finally, we created a Git repository to manage all the Docker configuration.
Upgrading to Alfresco 6 is more or less the same as with previous releases, but the new Docker-based infrastructure allows us to improve the performance and the elasticity of the service.