To develop reliable code collaboratively, it is important to put in place an infrastructure for automated code versioning, continuous integration, and code execution. To this end, we created two pipelines, one for continuous integration and one for data analysis, as discussed in the following.
1. Continuous Integration Pipeline
The continuous integration pipeline automates code versioning, static analysis, testing, documentation generation, package building, and integration with the data analysis pipeline. The figure below presents the pipeline created to automate the code development process for both the API and the notebooks.
We chose the following tools and technologies:
PyCharm IDE (Integrated Development Environment) - desktop software suited for the development of Python code. PyCharm comes with a number of features facilitating code development (code completion, code quality checks, execution of unit tests, code formatting, etc.). The Professional edition supports editing of notebooks and better management of matplotlib images.
GitLab CI/CD - we use a GitLab repository for code versioning and for execution of the continuous integration pipeline for the API and notebooks. SWAN provides only a read-only sharing feature, and therefore we share code through GitLab. The HWC and operation notebooks are versioned in a dedicated repository (lhc-sm-hwc). The scheduling of signal monitoring applications is carried out by Apache Airflow, with code synchronized with a repository.
SWAN is used for development and prototyping of analysis and signal monitoring notebooks.
NXCALS cluster is used for code execution;
Apache Airflow schedules execution of signal monitoring applications;
EOS and HDFS are used as persistent storage for computation results.
Project backlog is available on JIRA (https://its.cern.ch/jira/projects/SIGMON)
Note that we rely on industry-standard tools for the automation of the software development process. Most of these services (all except the PyCharm IDE and the Python Package Index) are supported by the CERN IT department, which reduces the maintenance effort.
1.1. Pipeline for
The pipeline consists of several stages:
doctest - executes code examples in the documentation (it is allowed to fail);
test_dev - executes all unit and integration tests with GitLab CI;
type_checking - performs an analysis of input arguments and return types with the mypy package (it is allowed to fail);
pages - creates the code documentation based on docstrings describing modules, classes and functions. The documentation is created with the Sphinx package;
sonar - performs static code analysis with SonarQube;
deploy_documentation - copies the documentation to EOS;
deploy_production - publishes the lhcsmapi package on the Python Package Index (PyPI);
deploy_production_eos - copies the lhcsmapi package into the EOS virtual environment.
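The effect of marking a stage as allowed to fail can be illustrated with a short Python sketch. This is not the project's GitLab CI configuration: the stage names are taken from the list above, while `Stage`, `STAGES`, and `pipeline_succeeded` are hypothetical names introduced only for this illustration, and the order of the entries is simply the order of the list.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class Stage:
    """A pipeline stage and whether its failure is tolerated."""
    name: str
    allow_failure: bool = False


# Stage names as listed above; only doctest and type_checking may fail.
STAGES = [
    Stage("doctest", allow_failure=True),
    Stage("test_dev"),
    Stage("type_checking", allow_failure=True),
    Stage("pages"),
    Stage("sonar"),
    Stage("deploy_documentation"),
    Stage("deploy_production"),
    Stage("deploy_production_eos"),
]


def pipeline_succeeded(results: Dict[str, bool]) -> bool:
    """Return True if every stage that must pass has passed.

    `results` maps a stage name to True (passed) or False (failed).
    A stage missing from `results` is treated as failed.
    """
    return all(
        results.get(stage.name, False) or stage.allow_failure
        for stage in STAGES
    )
```

With this model, a failing doctest or type_checking stage does not change the overall outcome, whereas a failing test_dev stage does.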
The EOS virtual environment is updated in case of:
- a tagged commit on the master branch (eos/project/l/lhcsm/venv)
- a commit on a protected branch (eos/project/l/lhcsm/venv_dev)
- a commit on the
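The two fully specified update rules can be summarized as a small decision function. The sketch below is only an illustration under stated assumptions: `target_environment` and its parameters are hypothetical names, the paths are copied as written in the list, the tagged-master rule is assumed to take precedence over the protected-branch rule, and any other kind of commit is simply reported as not matching either rule.

```python
from typing import Optional

# Paths exactly as given in the list above.
VENV_PRODUCTION = "eos/project/l/lhcsm/venv"
VENV_DEVELOPMENT = "eos/project/l/lhcsm/venv_dev"


def target_environment(branch: str, is_tagged: bool, is_protected: bool) -> Optional[str]:
    """Return the EOS virtual environment updated by a commit, if any.

    Assumption: a tagged commit on master is checked first, so it always
    maps to the production environment even if master is also protected.
    """
    if branch == "master" and is_tagged:
        return VENV_PRODUCTION
    if is_protected:
        return VENV_DEVELOPMENT
    # Neither of the two modelled rules applies.
    return None
```

For example, `target_environment("master", True, True)` returns the production path, while an untagged commit on a protected branch returns the `venv_dev` path.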
1.2. Pipeline for
The pipeline consists of a single stage executed on a master-branch tagged commit. It copies all notebooks into the EOS directory (eos/project/l/lhcsm/hwc/lhc-sm-hwc).
2. Release of a New Version
2.1. A new lhc-sm-hwc version with local git repository
Clone or check out the repository
To clone the repository locally, execute
git clone https://gitlab.cern.ch/lhcdata/lhc-sm-api.git
To check an existing local repository, execute
2.2. A new lhc-sm-hwc version with central GitLab repository
The instructions below are performed for the lhc-sm-api repository. The same holds true for
Go to https://gitlab.cern.ch/LHCData/lhc-sm-api and open Web IDE
You should see the IDE view.
Perform a change to the repository: delete/create/edit a desired file. For example, in
Naturally, you may perform more modifications as needed.
Update the package version in lhcsmapi/__init__.py by incrementing the least significant version index (below, from 1.4.72 to 1.4.73).
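Incrementing the least significant version index can also be done programmatically. The following Python sketch shows one way to do it; it is not part of lhcsmapi. The helper names `bump_patch` and `bump_version_line` are hypothetical, and the sketch assumes that the version is stored in a line of the form `__version__ = "1.4.72"`, which may differ from the actual content of lhcsmapi/__init__.py.

```python
import re


def bump_patch(version: str) -> str:
    """Increment the last dot-separated index of a version string."""
    parts = version.split(".")
    if not all(part.isdigit() for part in parts):
        raise ValueError(f"Unsupported version format: {version!r}")
    parts[-1] = str(int(parts[-1]) + 1)
    return ".".join(parts)


# Assumed layout of the version line: __version__ = "<version>"
_VERSION_LINE = re.compile(r"^(__version__\s*=\s*[\"'])([^\"']+)([\"'])", re.MULTILINE)


def bump_version_line(source: str) -> str:
    """Return `source` with the first __version__ assignment incremented."""

    def replace(match: "re.Match[str]") -> str:
        return match.group(1) + bump_patch(match.group(2)) + match.group(3)

    new_source, count = _VERSION_LINE.subn(replace, source, count=1)
    if count == 0:
        raise ValueError("No __version__ assignment found")
    return new_source
```

Calling `bump_patch("1.4.72")` returns `"1.4.73"`, which matches the example above. The index is treated as an integer, so `"1.4.9"` becomes `"1.4.10"` rather than `"1.5.0"`.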
Commit to the **master** branch and type the commit message. Confirm by clicking
You may consult your commit at the link (https://gitlab.cern.ch/LHCData/lhc-sm-api/commit/92666404)
The last step needed to publish a new version is to create a tag. In other words, you need to tag the commit with the version name (1.4.73). To this end, go to Repository and then Tags (https://gitlab.cern.ch/LHCData/lhc-sm-api/-/tags). Click
Then enter the version name and click Create Tag. You may optionally add a message and release notes.
This creates a tag associated with the most recent commit. As a result, a pipeline is triggered that updates the virtual environment of the project on EOS.
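Because the tag has to carry the version name, it can be useful to compare the two before creating the tag. The Python sketch below is a hypothetical pre-tagging check, not something the pipeline is known to perform; the function name `tag_matches_version` and the three-index format `X.Y.Z` (inferred from the example 1.4.73) are assumptions.

```python
import re

# Assumed tag format: three dot-separated integers, as in 1.4.73.
TAG_PATTERN = re.compile(r"\d+\.\d+\.\d+")


def tag_matches_version(tag: str, package_version: str) -> bool:
    """Return True if `tag` is well formed and equals the package version."""
    if TAG_PATTERN.fullmatch(tag) is None:
        return False
    return tag == package_version
```

For instance, `tag_matches_version("1.4.73", "1.4.73")` is true, while a tag such as `v1.4.73` or a tag that lags behind the package version is rejected.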
The change will be visible once the pipeline associated with the tag is completed (https://gitlab.cern.ch/LHCData/lhc-sm-api/-/pipelines).