End-to-end tests journey & Integration into a GitLab pipeline

Intro to end-to-end tests & GitLab integration

Over the past years of our microservices journey, we have faced several issues when developing, maintaining, and integrating end-to-end tests into our CI/CD pipeline. This article looks at the most important topics we encountered and, where applicable, how we solved them.


In order to make the best use of this article, at least a basic understanding of Linux, Docker and Docker Compose is needed.

This article assumes that all microservices are packaged as Docker images and published to a private or public Docker registry.

The end-to-end test framework must support JUnit-like runners that can be launched from Gradle, Maven, or the command line.


The sample software is an enterprise application developed on a Java tech stack, with Spring Boot on the backend and Angular on the frontend. The system is composed of multiple microservices deployed on Open Telekom Cloud as Docker containers. We use Hazelcast as the integration backbone and MySQL as persistent storage.

Because there are multiple environments, we have set up a GitLab pipeline for continuous integration and delivery. We have developed a consistent test suite composed of unit tests, integration tests, security tests, performance tests, and end-to-end tests. Most of them are executed on our environment branches on every merge. We work with feature branches that get merged into the environment branches. There are five environments: two development systems, one feature preview system, one staging system, and the production environment.

As an end-to-end test framework, we use Serenity because it provides strong support for automated tests using Selenium 2. Serenity’s purpose is to produce living documentation. The tests are written in the given-when-then style and deliver illustrated narrative reports.

The aggregated reports can be organized by using features, stories, steps, scenarios, and tests.
[Figure 01: Interaction journey]
[Figure 02: Test count]

What are end-to-end tests?

End-to-end tests are basically user interface tests. They are at the top of the test pyramid: they are the most expensive to develop and maintain, and the slowest to execute. Therefore, only a small percentage of the whole test suite consists of end-to-end tests.
[Figure 03: End-to-end tests pyramid]

Why write end-to-end tests?

The end-to-end tests are a safety net for refactoring and continuous enhancements. They increase the software quality by giving feedback sooner, before going into production.

Last but not least, end-to-end tests are a necessity because they increase the team’s confidence in each delivery.

When to write end-to-end tests?

Sometimes unit tests are hard to implement or to start with, especially for legacy applications where refactoring is needed in order to make the code more testable. In this case, it is safer to start with end-to-end tests.

End-to-end tests can also be used as acceptance tests. If the UI mockups are clear, the tests can be written in parallel with the story implementation.

How many tests to write?

End-to-end tests should cover only business-critical user journeys. Only happy paths need to be included, meaning that we don’t have to test error cases or validations.

GitLab basics

GitLab is Berg Software’s choice for hosting the source code and running the continuous integration and delivery pipelines. Its plans range from free to enterprise. You can create private or public repositories.

Some of GitLab’s nicest out-of-the-box features are:

  • CI/CD support
  • protected branches
  • code review support
  • predefined variables
  • custom runners support
  • integration with different identity providers
  • fine-grained permissions model
  • scheduled jobs

You can define multiple groups, each with its own projects and repositories.

Pipelines can be triggered manually, or automatically by other pipelines or by user commits. Each pipeline consists of multiple stages, and each stage contains one or more jobs. Each job belongs to a stage and can have its own Docker image, which is executed on the latest code checked out from the specified branch.

There are predefined environment variables accessible to the user inside the jobs, like branch name and commit message, which are very helpful during the build. There are also user-defined variables that can store sensitive values, and which can be protected. Only protected branches have access to protected variables.

How to: set up the runners and caching

Each job is executed by a certain GitLab runner. The runners are container images executed on the user-provided machines. They pick up jobs and execute them. You can define as many runners as you want. There are also shared runners provided free of charge, but you have to wait for other users’ builds. We are not using shared runners. However, they can be found in GitLab in the project CI/CD settings section.
[Figure 04: How to set up the runners and caching]
Each runner must be registered and configured with certain tags. For each project, we can define which runners are allowed to execute the pipelines, based on the runners’ tags. You need to obtain a registration token from the GitLab group’s page in the CI/CD settings. We use a Docker Compose file to register and initialize each runner once:
version: '2'
services:
  # service name is illustrative; the original listing does not show it
  runner-main-01-register:
    restart: 'no'
    image: gitlab/gitlab-runner:alpine
    volumes:
    - /opt/docker_conf/runner-main-01:/etc/gitlab-runner
    command:
    - register
    - --non-interactive
    - --locked=false
    - --name=Group - prj - MAIN - 01
    - --executor=docker
    - --output-limit=409600
    - --docker-image=alpine:latest
    - --docker-volumes=/var/run/docker.sock:/var/run/docker.sock
    - --docker-volumes=/opt/docker_data/runner-main-01-cache:/opt/docker_data/runner-main-01-cache
    - --tag-list=otcmain
    - --docker-privileged
    - --docker-cache-dir=/opt/docker_data/runner-main-01-cache
    - --cache-type=s3
    - --cache-s3-server-address=10.6.yyy.xx:9000
    - --cache-s3-access-key=minioxxxxx
    - --cache-s3-secret-key=yyyy
    - --cache-s3-bucket-name=runner-main
    - --cache-shared
    - --cache-s3-insecure
    environment:
    - CI_SERVER_URL=https://gitlab.com/
    # the registration token is passed in as well (not shown in this listing)
The runner is a Docker container running on one of our cloud machines, started from the image gitlab/gitlab-runner:alpine. Jobs that do not specify their own image run in the default image configured with:
- --docker-image=alpine:latest
The command section defines what gets executed when the container starts; the registration is a one-time execution. We have defined the volume mappings for the configuration and for the cache:
    - /opt/docker_conf/runner-main-01:/etc/gitlab-runner
    - --docker-volumes=/opt/docker_data/runner-main-01-cache:/opt/docker_data/runner-main-01-cache
    - --docker-cache-dir=/opt/docker_data/runner-main-01-cache
The caching uses an AWS S3-compatible object storage to store the build dependencies. MinIO is used as the S3 provider:
    - --docker-cache-dir=/opt/docker_data/runner-main-01-cache
    - --cache-type=s3
    - --cache-s3-server-address=10.6.yyy.xx:9000
    - --cache-s3-access-key=minioxxxxx
    - --cache-s3-secret-key=yyyy
    - --cache-s3-bucket-name=runner-main
    - --cache-shared
    - --cache-s3-insecure
The MinIO container used for caching must run in the same Docker network as the runner.
version: '2'
services:
  minio1:
    image: minio/minio
    container_name: minio1
    restart: unless-stopped
    volumes:
    - /opt/docker_conf/minio-etc:/root/.minio
    - /opt/docker_data/minio-data:/data
    ports:
    - "9000:9000"
    environment:
    - MINIO_ACCESS_KEY=minioak
    command: server /data
# assumed structure: join the pre-existing bridge network
networks:
  default:
    external:
      name: prj-bridge-network
The --tag-list flag labels the runner; the tags are used to map the runner to certain projects.
- --tag-list=otcmain
After initialization, we can start the runners using Docker Compose on our code quality machines:
version: '2'
services:
  runner-main-01:
    restart: unless-stopped
    image: gitlab/gitlab-runner:alpine
    container_name: runner-main-01
    volumes:
    - /opt/docker_conf/runner-main-01:/etc/gitlab-runner
    - /opt/docker_data/runner-main-01-cache:/opt/docker_data/runner-main-01-cache
    - /var/run/docker.sock:/var/run/docker.sock

Each runner has its own cache and config folder, previously configured during the registration process.

To avoid blocking the main runners during the long execution of the end-to-end tests, we use dedicated trigger runners:

  • they are registered and started in the same way as above;
  • they start the end-to-end pipeline from another project’s pipeline, in a non-blocking way.

The startup script looks the same as above:

version: '2'
services:
  runner-guitrigger-01:
    restart: unless-stopped
    image: gitlab/gitlab-runner:alpine
    container_name: runner-guitrigger-01
    volumes:
    - /opt/docker_conf/runner-guitrigger-01:/etc/gitlab-runner
    - /opt/docker_data/runner-guitrigger-01-cache:/opt/docker_data/runner-guitrigger-01-cache
    - /var/run/docker.sock:/var/run/docker.sock

How to: set up the pipeline

Each project lives in its own git repository. In the project root folder, there is a file named .gitlab-ci.yml that configures the pipeline for that project. The file defines variables, cache settings, the before and after scripts to be executed, and the stages of the pipeline. The stages are basically the steps to be executed during the pipeline, together with the conditions that trigger their execution. The last step of our pipeline is the end-to-end test stage.
[Figure 05: How to set up the pipeline]
We have five environments, each with its own protected branch. When a commit is pushed to one of these branches, the application is built and deployed to the corresponding environment. For the production environment, we create a tag that follows a specific naming convention to trigger the deployment. You can define variables that can be used in the same file, or in all pipeline scripts:
variables:
  GRADLE_OPTS: "-Dorg.gradle.daemon=false"
The stages are defined as follows:
stages:
  - opssetupchmodfix
  - test

Here we have two stages. Not all stages are executed in one single run. We will detail each stage definition later on.

To speed up job execution, we cache the dependencies of each build so they are not downloaded every time. The cache key is taken from the predefined GitLab variable CI_COMMIT_REF_SLUG, which contains a URL-safe form of the git tag or branch name for which the project is built:

cache:
  key: "$CI_COMMIT_REF_SLUG"
  paths:
    - .gradle/wrapper
    - .gradle/caches
We use a per-branch cache. All dependencies are cached inside the project folder, and the cache can be reused on every job execution within the specific project. At the end of the job, the specified folders are zipped into an archive cache.zip and stored under the specified key on the machine where the GitLab runner is executed. We can define a script to be executed before the stages start:
before_script:
  - export GRADLE_USER_HOME=`pwd`/.gradle
The job labeled ‘guitest’ in the test stage starts the execution of the end-to-end tests:
guitest:
  image: registry1.projekt.de/prj-dind-chrome
  stage: test
  script:
    - ./ci/bin/guitest.sh "$EXECUTE_TEST_FOR_ENVIRONMENT" "$SRC_TRGR_BRANCH"
  tags:
    - guirunner
The test stage is associated with certain runners, based on their tags: only runners tagged ‘guirunner’ can pick up the execution of this stage. The image registry1.projekt.de/prj-dind-chrome is a custom Docker image based on selenium/standalone-chrome:latest, enhanced with bash, curl, openssl, git, an X11 server, a JDK and other tools, so that the end-to-end tests can be executed in a browser environment. Excerpt from the Dockerfile used to build the image:
RUN sudo apt-get -y install bash curl git openssl openssh-client openjdk-8-jdk libxpm4 libxrender1 libgtk2.0-0 libnss3 libgconf-2-4 xvfb gtk2-engines-pixbuf xfonts-cyrillic xfonts-100dpi xfonts-75dpi xfonts-base xfonts-scalable x11-xserver-utils x11-xkb-utils
The runner executing this stage starts a container from the defined image, checks out the code of the end-to-end project, and launches (inside the container) the script defined in the script section.

The script is obtained from the checked-out source code. Two environment variables are passed to the script; they are provided to the container by the caller. They are computed from the current branch or tag, which GitLab exposes as CI_COMMIT_REF_NAME. The script itself is explained later on.
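How the caller derives the target environment from the branch name is not shown in this article. A minimal sketch of one way to do it, assuming the branch names used later in the ‘only:’ section; the function name target_env_for_ref and the environment labels are hypothetical:

```shell
#!/usr/bin/env bash
# Hypothetical helper: map a branch name (as found in CI_COMMIT_REF_NAME)
# to the environment against which the end-to-end tests should run.
# The environment labels printed here are illustrative assumptions.
target_env_for_ref() {
    local ref="$1"
    case "$ref" in
        branch_dev1)    echo "dev1" ;;
        branch_dev2)    echo "dev2" ;;
        branch_preview) echo "preview" ;;
        story-*)        echo "dev1" ;;   # feature branches are tested against dev first
        *)              return 1 ;;      # unknown branch: let the caller decide
    esac
}

# Possible use in the calling pipeline:
#   EXECUTE_TEST_FOR_ENVIRONMENT=$(target_env_for_ref "$CI_COMMIT_REF_NAME") || exit 1
```

Returning a non-zero status for unknown branches keeps the decision with the caller instead of silently testing against a default environment.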

The execution can be triggered from another pipeline, which calls a trigger command in one of its stages.

The trigger command is available in the Docker image registry.gitlab.com/finestructure/pipeline-trigger. The ‘artifacts:’ section defines which files are made available for download after the job finishes, where those files come from, and how long they are preserved.
  artifacts:
    when: always
    paths:
      - target/*
      - /opt/selenium/config.json
    expire_in: 1 week
The stage is executed only for certain branches, defined in the ‘only:’ section. Additionally, the stage is not executed when the commit message given by the developer starts with the text ‘opssetup’, expressed as a regular expression. The two conditions are combined with a logical ‘and’.
  only:
    - branch_dev1
    - branch_dev2
    - branch_preview
    - master
    - /^story-/
  except:
    variables:
      - $CI_COMMIT_MESSAGE =~ /^opssetup-*/
The script guitest.sh can launch the end-to-end tests inside the runner container or on a dedicated remote machine. We started with the first approach but ended up running the tests on a dedicated machine. The first approach used the commands:
Xvfb -ac :99 -screen 0 1280x1024x16 &
export DISPLAY=:99

./gradlew -i clean runAParallelSuite aggregate -P webdriver.base.url=$TARGET_BASE_URL
The first step was to set up the in-memory display server and resolution. In the second step, we are launching a custom Gradle task. More details will follow. This approach doesn’t scale very well. We got frequent memory issues so we decided to start the tests on a dedicated machine using the commands:
# decrypt the SSH private key with the protected variable
openssl enc -aes-XXalgo -d -pass env:GRP_SSL_ENCSECRET -in ci/secrets-enc/id_rsa-usr-dockerexec.dat >| ci/secrets/id_rsa-usr-dockerexec

# launch the tests on the dedicated machine; the key path must match the file written above
ssh -i ci/secrets/id_rsa-usr-dockerexec -o StrictHostKeyChecking=no usr-dockerex@10.6.xx.xx "/opt/docker_exec/launch-tests.sh ${CI_JOB_ID} ${TARGET_LABEL} ${TARGET_BRANCH} ${TARGET_ENV}"

With the help of Ansible, we have set up the machine with the right user accounts and permissions to be able to launch the GUI tests from a remote machine. The user secret key is provided after it is decrypted using a protected environment variable GRP_SSL_ENCSECRET defined in the GitLab user interface. The script variables were computed based on the branch name and default GitLab variables.
Each job gets assigned a unique ID, exposed as CI_JOB_ID. The TARGET_LABEL indicates the Docker image tags to be used when running the tests. The other two variables indicate the branch from which to check out the test code and the environment to consider when downloading the images. The launch script is explained in a later section.

Another stage called ‘opssetupchmodfix’ is executed only when the commit message starts with the text ‘opssetup-chmod-fix’, and only on the runner tagged ‘otcprj’. The stage sets the execute permission on the bash scripts checked out from git into the current project workspace folder, and pushes the change back to git. By default, the files are not executable.

opssetupchmodfix:   # job name assumed to match the stage name
  stage: opssetupchmodfix
  image: registry1.projekt.de/prj-dind-base
  script:
    - chmod oga+x ./ci/bin/*.sh
    - ./ci/bin/fixexecuteflag.sh
  tags:
    - otcprj
  only:
    variables:
      - $CI_COMMIT_MESSAGE =~ /^opssetup-chmod-fix*/
This stage needs to be executed once when the files were committed from a Windows environment, where there is no easy way to set the execute permission on the bash files.

How to: isolate the tests

You might think that isolating the tests is a bad thing because you don’t get feedback from the real system, data, and users. But most of the time, the purpose of the end-to-end tests is to verify that the existing business-critical functionality still works as expected after a change. Verifying system consistency or performance should not be done with end-to-end tests.

How to: prepare the environment for running the tests

We have created a cloud machine for running the end-to-end tests using Terraform scripts. If multiple machines are needed, they can be created dynamically. The machine is configured using Ansible. The user ‘usr-dockerex’ is created and allowed (A) to connect via its SSH key, and (B) to execute the end-to-end test launch script. The script launch-tests.sh is copied during the machine setup.

We need to be able to run multiple end-to-end test runs in parallel, for different branches and environments. Therefore, we use the job ID as a suffix for everything started by one run. We use Docker Compose to start the containers.

To each container name we append the job ID like in the following:

container_name: ui-client${JOB_ID}
We create a Docker image in the pipeline of each microservice and push it to our Docker registry. From there, the end-to-end tests pull and start the images. Different branches produce different Docker images, with well-defined tags.
image: "registry1.projekt.de/ui-client:${UI_CLIENT_IMAGE_TAG}"
The tag is computed using the GitLab API, based on the source branch, the target environment, and the project that triggered the build. When working on a feature, multiple microservices and UI clients might be affected. The microservices that are not changed have to be started as well for the end-to-end test. For these, we need to specify a target environment from which to take the latest images. Each environment is bound to a protected branch.

Let’s suppose we have three microservices and only two are changed. Each one resides in its own git repository. The changes for the same story/feature are made in a branch with the same name in each repository, let’s say feature1. The first environment where the feature is deployed should be the dev environment, so the unchanged microservice3 has to use the latest image built for the dev environment. In this case, the images used are the following:
  • image: "registry1.projekt.de/microservice1:feature1"
  • image: "registry1.projekt.de/microservice2:feature1"
  • image: "registry1.projekt.de/microservice3:dev1-latest"
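The selection rule described above (use the feature tag when one exists for a service, otherwise the latest image of the target environment) can be sketched as a small pure function. This is only an illustration: select_image_tag is a hypothetical name, and the list of existing tags is passed in as arguments, whereas the article determines it through the GitLab API and docker inspect.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the image tag to use for one service.
#   $1    wanted tag, e.g. the feature branch name
#   $2    fallback tag, e.g. the latest image of the target environment
#   $3..  tags known to exist for this service (illustrative input)
select_image_tag() {
    local wanted="$1" fallback="$2"
    shift 2
    local tag
    for tag in "$@"; do
        if [ "$tag" = "$wanted" ]; then
            echo "$wanted"
            return 0
        fi
    done
    echo "$fallback"
}

# microservice1 has a feature1 image, microservice3 does not:
#   select_image_tag feature1 dev1-latest dev1-latest feature1   # prints feature1
#   select_image_tag feature1 dev1-latest dev1-latest            # prints dev1-latest
```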
If a feature1 branch existed for microservice3, its image would also be used for the end-to-end tests. We check the existence of a branch with the following script:
function check_branch_existence(){
    # $1: name of the variable that receives the HTTP status code
    local result=$1
    local project=$2
    local branch=$3
    local exists=$(curl -I -so /dev/null -w "%{http_code}" "https://gitlab.com/api/v4/projects/${project}/repository/branches/${branch}?private_token=${GRP_GITUSER_ACCESSTOKEN}")
    eval $result="'$exists'"
}
If the branch exists, the result variable contains the HTTP response code 204. We also have to check whether a Docker image tag exists, in order to launch the right container or fall back to a default image (the dev image). We use the following function:
function check_image_tag_existence(){
    local result=$1
    local project=$2
    local image_tag=$3
    local repo=$4
    echo "docker inspect --type=image ${repo}/${project}:${image_tag}"
    local exists=$(docker inspect --type=image "${repo}/${project}:${image_tag}" --format "{{.Id}}" > /dev/null 2>&1 && echo $? || echo $?)
    echo "Exists image? ${exists}"
    if [ "$exists" -eq 1 ]; then
        echo "try to pull image"
        local pullresult=$(docker pull "${repo}/${project}:${image_tag}";echo $?)
        echo "pull result ${pullresult}"
        echo "tried to pull image ${repo}/${project}:${image_tag}"
    fi
    # inspect again; after a successful pull the image is available locally
    exists=$(docker inspect --type=image "${repo}/${project}:${image_tag}" --format "{{.Id}}" > /dev/null 2>&1 && echo $? || echo $?)
    echo "after second inspect $exists"
    eval $result="'$exists'"
}
If the image has not been downloaded locally, the docker inspect command won’t find it. Therefore, a pull is performed before the second inspect command, and the result of the second inspect is returned. We iterate through all our GitLab microservice projects to compute the environment variables holding the container images and tags.
projects=("1233xxx" "1224xxxx"… )
for i in "${!projects[@]}"; do
    checkImageTag ${projects[$i]} ${project_names[$i]} ${project_vars[$i]} $imageTag "dockerregistry1.server" "dev1-latest"
done
The function checkImageTag exports the right environment variable for each microservice, or falls back to the default tag.
function checkImageTag(){
    local curProjId=$1
    local curProjName=$2
    local curProjVar=$3
    local curProjTag=$4
    local curRegistry=$5
    local envBranch=$6
    check_image_tag_existence imageexistresult $curProjName $curProjTag $curRegistry
    if [ $imageexistresult -eq 0 ]; then
        eval "export $(echo ${curProjVar}_IMAGE_TAG | tr 'a-z' 'A-Z')=$curProjTag"
    else
        # fall back to the default tag of the target environment
        check_image_tag_existence imageexistresult $curProjName $envBranch $curRegistry
        if [ $imageexistresult -eq 0 ]; then
            eval "export $(echo ${curProjVar}_IMAGE_TAG | tr 'a-z' 'A-Z')=$envBranch"
        else
            echo "image ${curProjTag} for project ${curProjName} does not exist!"
            exit 1
        fi
    fi
}
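The exported variable names follow a simple derivation from the service name, for example UI_CLIENT_IMAGE_TAG for ui-client. A minimal sketch of that derivation, which also avoids eval; the function names image_tag_var_name and export_image_tag are hypothetical and not part of the original scripts:

```shell
#!/usr/bin/env bash
# Hypothetical helper: derive the variable name from a service name by
# upper-casing it and replacing hyphens with underscores,
# e.g. ui-client becomes UI_CLIENT_IMAGE_TAG.
image_tag_var_name() {
    local service="$1"
    echo "${service}_IMAGE_TAG" | tr 'a-z-' 'A-Z_'
}

# Hypothetical helper: export the computed tag under the derived name,
# without going through eval.
export_image_tag() {
    local service="$1" tag="$2"
    export "$(image_tag_var_name "$service")=$tag"
}

# Usage:
#   export_image_tag ui-client feature1
#   echo "$UI_CLIENT_IMAGE_TAG"
```

Passing a NAME=value string to export keeps the tag value from being re-evaluated by the shell, which is the main risk of the eval-based form.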
For each microservice we have now exported one variable of the form SERVICE_VAR_NAME_IMAGE_TAG. By convention, the variable name is the upper-case service name with hyphens replaced by underscores.

So far, we have computed (A) container names that are unique for each job, and (B) the right Docker image tags based on the branch name and target environment. What is still missing is a proper cleanup before launching the new containers for the tests. An end-to-end test run takes less than one hour, so we can stop all containers that were started more than one hour ago and are still running. The test containers share the name prefix ‘ad’. We filter those containers and count them; if there are any, we call docker stop with the list of filtered container IDs.
if [ `docker ps --filter "status=running" --filter "name=ad" | grep 'hour.* ago' | awk '{print $1}' | wc -l ` -gt 0 ] ; then
    docker stop $(docker ps --filter "status=running" --filter "name=ad" | grep 'hour.* ago' | awk '{print $1}')
fi
We will delete the containers afterward.
if [ `docker ps --filter "status=exited" | grep 'hour.* ago' | awk '{print $1}' | wc -l ` -gt 0 ] ; then
    docker rm $(docker ps --filter "status=exited" | grep 'hour.* ago' | awk '{print $1}')
fi
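The filter above matches only lines that mention hours, so a container left over from a previous day would be missed. A minimal sketch of a broader filter, assuming the human-readable ages that docker ps prints (“About an hour ago”, “2 hours ago”, “3 days ago” and so on); the function name stale_container_ids is hypothetical:

```shell
#!/usr/bin/env bash
# Hypothetical helper: read `docker ps` output on stdin and print the IDs
# (first column) of containers whose age is given in hours or larger units.
stale_container_ids() {
    grep -E '(hour|day|week|month|year)s? ago' | awk '{print $1}'
}

# Possible use on the test machine:
#   ids=$(docker ps --filter "status=running" --filter "name=ad" | stale_container_ids)
#   [ -n "$ids" ] && docker stop $ids
```

Reading from stdin keeps the filter independent of Docker, so it can be checked against sample output without a running daemon.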
Depending on the resources of the test machine(s), we can limit the number of test runs executed in parallel on one machine. It does not matter which service is counted; any of them will do.
if [ `docker inspect --format='{{.Name}} ' $(docker ps -aq --filter "status=running" --filter "name=admicroservice1*" )  | cut -d r -f 3 | wc -l` -lt 5 ] ; then
    echo 'allowing not more than 5 running gui tests'
else
    echo 'there are already 5 gui tests running in parallel, please retry later'
    exit 1
fi
A new network is created for every job ID.
export NETWORK_BRIDGE=nwkbguitest${JOB_ID}
[ ! "$(docker network ls | grep $NETWORK_BRIDGE)" ] && docker network create -d bridge $NETWORK_BRIDGE || echo "Network present!"
If the containers are already running for the same job id, we stop and remove them via the following commands:
vardockers=`docker ps -a --filter "name=*${JOB_ID}"| wc -l`
# a count of 1 means that only the header line was printed
if [ $vardockers -eq 1 ]
then
    echo "no containers"
else
    echo "stop and remove all containers"
    docker stop $(docker ps -a -q --filter "name=*${JOB_ID}")
    docker rm $(docker ps -a -q --filter "name=*${JOB_ID}")
fi
After the initial cleanup is performed, we are starting all containers with Docker Compose in the background.
docker-compose -f /opt/docker_compose/docker-compose-apps.yml up -d
Afterwards, the containers’ IPs are logged to the console, and the log output of the end-to-end test container is followed on the console.
docker inspect --format='{{.Name}} {{range .NetworkSettings.Networks}}{{.IPAddress}}{{end}}' $(docker ps -q)
docker logs -f endtoend-tests${JOB_ID}
When the tests are finished, the final cleanup will be performed and the result code will be evaluated and returned.
result=`docker inspect $(docker ps -aq --filter "name=endtoend-tests${JOB_ID}") --format='{{.State.ExitCode}}'`
docker-compose -f /opt/docker_compose/docker-compose-apps.yml down
docker network rm $NETWORK_BRIDGE
docker network prune -f
docker image prune -f
docker volume prune -f
exit $result
The Docker Compose .yml file will be explained in a further section.

How to: prepare the data for the tests

We use the Spock framework for system testing of our REST API. Spock is a testing and specification framework for Java and Groovy applications that can be easily integrated with IDEs and CI pipelines via the JUnit runner. Spock allows us to write specifications that describe the features expected from a system under test. By using the labeled blocks “given”, “when” and “then”, the tests can produce documentation for a wider audience than developers, as in the following:
given: "an admin user"
// ...code
when: "his company settings are changed"
// ...code
then: "the user is notified"
The produced reports look like this:
[Figure 06: Reports]
The tests are organized in test suites and specs. Each spec deals with a particular domain of our APIs.
[Figure 07: Reports]
Basically, we use the system tests to populate an empty database on which the end-to-end tests then run. A small number of system tests are written specifically for end-to-end testing. This is necessary where some data is needed, or some workflows must be prepared in advance, because they cannot be created from the end-to-end tests alone. The tests created to support the end-to-end test execution should be clearly distinguished from the others.

Dedicated users are used for testing, in order to isolate the tests as much as possible from the users active on the system. At the beginning of the execution, the test data is prepared; at the end, the cleanup is performed.

The Docker Compose file mentioned above contains the MySQL service, which executes an SQL script on startup to create the empty databases:
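    db-mysql:   # service name assumed from the dependency declared by the backend service
        container_name: prj-mysql${JOB_ID}
        image: mysql:5.8
        volumes:
            - "./setmode.cnf:/etc/mysql/conf.d/setmode.cnf"
        ports:   # key assumed; the original listing omits it
            - "3306"
        entrypoint:   # key assumed; the original listing omits it
             sh -c "
               echo 'CREATE DATABASE IF NOT EXISTS at_projects  /*!40100 DEFAULT CHARACTER SET utf8 */; CREATE DATABASE IF NOT EXISTS at_server; …' > /docker-entrypoint-initdb.d/init.sql;
               /usr/local/bin/docker-entrypoint.sh --character-set-server=utf8mb4 --collation-server=utf8mb4_unicode_ci
             "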

How to: run the tests with Docker Compose

All microservices and required runtime dependencies are declared in the Docker Compose file. The container names are unique, while the image tags are computed from the launching script and provided as environment variables. All services for one job are started in the same network including the database server.
      name: ${NETWORK_BRIDGE}
Each container must have some resource limits specified (like memory and CPU). A typical back end service looks like this:
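  prj-server:
    image: "registry1/prj-server:${PRJ_SERVER_IMAGE_TAG}"
    container_name: prj-server${JOB_ID}
    environment:
      - "SPRING_PROFILES_ACTIVE=otcdev,swagger"
      - "SQL_SERVER_IP=db-mysql${JOB_ID}"
    ports:   # key assumed; the original listing omits it
      - 8080
    restart: always
    mem_limit: 2500m
    depends_on:   # key assumed; the original listing omits it
        - db-mysql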
A typical front end service looks like this:
  ui-client:
    image: "registry1/ui-client:${UI_CLIENT_IMAGE_TAG}"
    container_name: ui-client${JOB_ID}
    environment:
      - "BACKEND_SERVER_URL=http://prj-server${JOB_ID}:8080"
      - "BACKEND_CHECKLIST_URL=http://prj-checklists${JOB_ID}:8084"
    ports:   # key assumed; the original listing omits it
      - 8085
    restart: always
    command: /bin/bash -c "envsubst < /opt/enviroment-endpoints.template > /opt/enviroment-endpoints.json && cp /opt/enviroment-endpoints.json /usr/share/nginx/html/assets/enviroment-endpoints.json && exec nginx -g 'daemon off;'"
One interesting detail is the substitution of the backend URL environment variables into a file that is served by the nginx web server. The generated file is copied into the UI assets, so that the UI under test points to the right backend. At this point, we only need one instance of each service.

Running the tests is performed in a special container started from a Chrome image. The image is prepared with all the dependencies needed to run the end-to-end tests.
    image: registry1.projekt-adis.de/adis-dind-chrome
    container_name: endtoend-tests${JOB_ID}
The volumes section maps a host folder to a container folder. The host folder persists even after the container finishes its work. This is where we store the end-to-end test reports.
    volumes:
      - /opt/docker_data/guitest:/repo/tests
The entry point defines the command to run when starting the container. It consists of multiple Linux commands separated by semicolons:
entrypoint: >					           
        /bin/sh -c "
First, the git user is configured to appear when committing.
git config --global user.name \"${GRP_GITUSER_NAME}\";	
git config --global user.email \"${GRP_GITUSER_EMAIL}\";
Unique folders for each job id are created, and access rights are granted.
sudo mkdir repo;
sudo chmod a+rwx repo;
cd repo;
sudo mkdir tests/${JOB_ID};
sudo chmod a+rwx tests/${JOB_ID};
sudo rm -rf adis-systemtest;
The code for the system tests is checked out and executed on the fresh environment, in order to populate the end-to-end test data.
git -C . clone --branch ${PRJ_SYSTEMTEST_BRANCH} https://${GRP_GITUSER_LOGON}:${GRP_GITUSER_ACCESSTOKEN}@gitlab.com/company/projects/cust/prj/prj-systemtest.git;
        chmod oga+x ./prj-systemtest/ci/bin/systemtest.sh;
        cd prj-systemtest;
The tests for populating the data are launched, giving them the right back end URLs.
./ci/bin/systemtest.sh ISOLATED http://prj-server${JOB_ID}:8080 http://prj-projects${JOB_ID}:8081 http://prj-checklists${JOB_ID}:8084;
The test results are parsed, and the job is failed if there are failures.
cp -R build/test-results/test /repo/tests/${JOB_ID}/test-results-system;
cat /repo/tests/${JOB_ID}/test-results-system/TEST-specs.TestSuite.xml;
cat /repo/tests/${JOB_ID}/test-results-system/*.xml >> /repo/tests/${JOB_ID}/test-results-system/systestresult.log;
[ `cat /repo/tests/${JOB_ID}/test-results-system/systestresult.log | grep 'failures=.[^0].' | wc -l` -eq 0 ] && echo 'system test success' || exit 1;
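The check above looks only at the failures attribute. As a standalone illustration (not part of the entry point string), here is a minimal sketch of a stricter check that also considers the errors attribute and treats a missing report as a failure. It assumes JUnit-style XML reports with double-quoted attributes; the function name junit_reports_ok is hypothetical:

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed only if the directory contains at least one
# JUnit-style XML report and none of them declares a non-zero
# failures or errors count.
junit_reports_ok() {
    local dir="$1"
    local reports=("$dir"/*.xml)
    # Treat a missing report as a failure rather than silently passing.
    [ -e "${reports[0]}" ] || return 1
    ! grep -Eq '(failures|errors)="[1-9][0-9]*"' "${reports[@]}"
}

# Usage:
#   junit_reports_ok build/test-results/test && echo 'system test success' || exit 1
```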
The end-to-end tests are checked out from git (i.e. from the computed branch).
cd ..;
sudo rm -rf prj-guitest;
git -C . clone --branch ${PRJ_GUITEST_BRANCH} https://${GRP_GITUSER_LOGON}:${GRP_GITUSER_ACCESSTOKEN}@gitlab.com/company/projects/client/prj/prj-guitest.git;
chmod oga+x ./prj-guitest/ci/bin/*.sh;
cd prj-guitest;
The end-to-end tests are launched, with the URLs of the user interface clients as parameters.
./ci/bin/guitest.sh ISOLATED http://prj-client${JOB_ID} http://prj-admin-client${JOB_ID};
The results are evaluated, parsed, and copied to a location where the nginx server can serve them to the outside.
cp -R target/site /repo/tests/${JOB_ID}/test-site;
        cp -R build/test-results/runAParallelSuite /repo/tests/${JOB_ID}/test-results;
        cp -R build/reports/tests/runAParallelSuite /repo/tests/${JOB_ID}/test-reports;
        cat /repo/tests/${JOB_ID}/test-site/serenity/results.csv;
        [ `tail -n +2 /repo/tests/${JOB_ID}/test-site/serenity/results.csv | grep -v SUCCESS | wc -l` -eq 0 ] && echo 'gui test successful' || exit 1;
        "
    mem_limit: 7500m
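The evaluation of the Serenity results.csv can also be written as a small reusable function. A minimal sketch under the same assumption the entry point makes, namely that every data row of a successful run contains the word SUCCESS; the function name serenity_results_ok is hypothetical:

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed only if the CSV file exists, is not empty,
# and every row after the header line contains SUCCESS.
serenity_results_ok() {
    local csv="$1"
    [ -s "$csv" ] || return 1
    local failed
    # grep -c prints the count and returns non-zero when the count is 0,
    # hence the "|| true".
    failed=$(tail -n +2 "$csv" | grep -vc SUCCESS || true)
    [ "$failed" -eq 0 ]
}

# Usage:
#   serenity_results_ok target/site/serenity/results.csv && echo 'gui test successful' || exit 1
```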

How to: clean up after the tests

After running the test suites, we execute the cleanup step. The cleanup can fail for various reasons, such as abrupt program termination due to a lack of resources, corrupted data, a network failure, a bad cleanup procedure, or a redeployment in progress. One way to deal with this is to also do the cleanup at the beginning of the test and, if it does not succeed, to fail the test. Sometimes created resources cannot be deleted at all, because no delete API is available or the cleanup is too expensive. Our solution is to always use a clean database for each end-to-end test execution and to populate it with the system tests. In order to do that, we needed an isolated environment.

How to: make the tests as fast as possible

The end-to-end tests are executed in parallel to minimize execution time. To make that possible, the tests are split into independent test suites, meaning that the data consumed or produced by one suite does not alter the data of other suites running in parallel. Pay attention to user actions that affect the running sessions of all other users, for instance changing the user language, changing the preference to hide the menu, or changing privileges. As seen above, a Gradle task extends the standard test task and receives the application URLs that the browser should point to:
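task runAParallelSuite(type: Test) {
    // Base URL the browser is pointed at, passed in as a project property
    systemProperty "webdriver.base.url", findProperty("webdriver.base.url")
    // Run up to three test processes in parallel
    maxParallelForks = 3
    // Start a fresh JVM for every test class
    forkEvery = 1
    include '**/**TestSuite.class'
    testLogging.showStandardStreams = true
}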
The three test suites run in parallel, each in its own forked test process (maxParallelForks = 3). Other factors that influence performance are logging and the capturing of screenshots. We have reduced the logging to an acceptable level and have set up the screenshots to be captured only in case of failure. These settings live in the serenity.properties file, located in the root of the end-to-end project.
To avoid interfering with the browser windows, the tests must also run in headless mode. On Linux, a virtual terminal with a fixed resolution has to be set up, as shown above.
chrome.switches=--headless --disable-extensions --disable-gpu --disable-dev-shm-usage --no-sandbox
At some point, we got a lot of invalid session errors in headless mode, which were fixed by adding the two flags --disable-gpu and --disable-dev-shm-usage. Inside the end-to-end tests, locators based on ids have to be used for maximum performance.

How to: maintain the test suite

User journeys are written for business-important scenarios. In a journey, the user performs different actions to achieve a certain goal. The tasks are performed by interacting with the user interface. The test then asks questions or makes assertions about the visible result of the task. Group the code into reusable tasks or actions performed by the user:
public abstract class BaseLoginTask implements Task
public class OpenMyProfilePage implements Task
public class EnterValueIntoField implements Task
Write reusable user interface questions or queries:
public class ConfirmationMessageAppears implements Question<String>
public class ValueInColumnOfTable implements Question<List<String>>
To keep the test suite maintainable, it is important to use locators based on ids rather than XPaths based on parent/child relations within the HTML. Layout changes happen very often and should not affect the end-to-end tests. Use constructs like this:
public static final Target ADD_TASK = Target.the("AddTask button").located(By.id("btn-add-tasks"));
instead of:
Target.the("The second selected interaction").locatedBy("//div/span/app-link-or-text/a");
Another thing is to not introduce hard-coded wait times (like “wait for 20 seconds”). Instead, use constructs that wait for certain conditions to happen and provide a timeout:
WaitUntil.the(ChecklistTemplates.ADD_CHECKLIST_TEMPLATE_BUTTON, WebElementStateMatchers.isEnabled())
        .forNoMoreThan(TimeConstants.SECONDS_10).seconds()
        .performAs(actor);
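
To make the difference from a fixed wait concrete, here is a plain-Java sketch of the principle behind such a construct. It is not Serenity's implementation; `ConditionWait` and `waitUntil` are hypothetical names used only for illustration. The call returns as soon as the condition holds and gives up once the timeout has elapsed.

```java
import java.time.Duration;
import java.util.function.BooleanSupplier;

/** Hypothetical illustration of "wait for a condition, with a timeout". */
final class ConditionWait {

    /** Polls the condition until it holds or the timeout elapses; returns whether it held. */
    static boolean waitUntil(BooleanSupplier condition, Duration timeout, Duration pollInterval) {
        long deadline = System.nanoTime() + timeout.toNanos();
        while (true) {
            if (condition.getAsBoolean()) {
                return true;                        // condition met: no time is wasted
            }
            if (System.nanoTime() - deadline >= 0) {
                return false;                       // timeout reached: give up
            }
            try {
                Thread.sleep(pollInterval.toMillis());
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt for the caller
                return false;
            }
        }
    }
}
```

With a fixed "wait for 20 seconds", a fast page still costs 20 seconds and a slow page still fails; with a condition and a timeout, the fast case finishes early and the timeout only bounds the worst case.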
Several levels of timeouts should be defined as constants and used everywhere. In addition, a factor that scales all timeouts at once should be defined and passed as a system property when running in a slower environment.
public static final int SECONDS_1 = 1 * factor;
public static final int SECONDS_3 = 3 * factor;
public static final int SECONDS_10 = 10 * factor;
public static final int LOADER = 5 * factor;
public static final int LONG_LIST = 10 * factor;
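
How the factor is obtained is not shown in the constants above. One possible way, sketched here under assumptions, is to read it from a system property when the class is loaded. The property name `timeout.factor` and the `parseFactor` helper are hypothetical; the constant names follow the listing above.

```java
/** Sketch of timeout constants scaled by a factor taken from a system property. */
final class TimeConstants {

    /** Parses the factor; falls back to 1 for missing, non-numeric or non-positive values. */
    static int parseFactor(String raw) {
        if (raw == null) {
            return 1;
        }
        try {
            int value = Integer.parseInt(raw.trim());
            return value > 0 ? value : 1;
        } catch (NumberFormatException e) {
            return 1;
        }
    }

    // "timeout.factor" is an assumed property name, e.g. -Dtimeout.factor=2
    private static final int factor = parseFactor(System.getProperty("timeout.factor"));

    public static final int SECONDS_1 = 1 * factor;
    public static final int SECONDS_3 = 3 * factor;
    public static final int SECONDS_10 = 10 * factor;
    public static final int LOADER = 5 * factor;
    public static final int LONG_LIST = 10 * factor;

    private TimeConstants() {
    }
}
```

Note that Gradle runs tests in a forked JVM, so a property given on the Gradle command line has to be forwarded to the test task (for example with a systemProperty entry) before the tests can see it.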
In serenity.properties we have:
#How long webdriver waits for elements to appear by default, in milliseconds.
#How long webdriver waits by default when you use a fluent waiting method, in milliseconds.
#How long should the driver wait for elements not immediately visible, in milliseconds.
For slow pages, if possible, we advise introducing a progress bar or a loader indicator in the user interface that the tests can check at the beginning and end of long-running operations:
WaitUntil.the(MainUIUtilities.LOADER, WebElementStateMatchers.isVisible());
WaitUntil.the(MainUIUtilities.LOADER, WebElementStateMatchers.isNotPresent());


  • Private GitLab runners have to be registered for job execution.
  • Minio can be used as a pipeline cache provider.
  • The end-to-end tests must have their own GitLab project.
  • The execution can be triggered from any project, from a stage defined in the gitlab-ci.yml file, using the pipeline-trigger container image.
  • The tests should be executed on a dedicated, provisioned environment, isolated from external influence for better stability.
  • The GitLab API can be leveraged for complex situations.
  • For the best performance, the tests should be broken into independent test suites that run in parallel and headless, with screenshots on failure only.
  • User journeys must be defined to test only the business-critical happy paths.
  • The backend integration tests can be used to populate the data for the end-to-end tests.
  • For maintainability, ids must be used everywhere in the UI, in addition to code review and code reuse practices.
  • Waiting for a condition, with a timeout, must be used instead of a fixed wait time.
  • A cleanup of the test data should be performed on every run.


How do you approach end-to-end testing / pipeline integration? Got any experience that you can share? Let us know!
