15 January 2026
Towards reproducible builds via Docker
When maintaining software, it is important to be able to build new versions quickly and
with minimal effort. For free software, it is also relevant to make each step reproducible.
Since I maintain several free software projects, these questions and their answers are unavoidable for me.
Recently, I came across
Docker,
which has been famous in the open source world for many years, but I had not been very interested in it
because I used my own toolset. While working on the evaluation of a nice mathematical
project, I tried to install their software on my
workstation without success. The solution was to use Docker, exactly as the project authors officially suggested.
The reason for my failure was that Python had to be a specific version (3.11) to ensure support for some underlying modules.
(My Python, shipped with Ubuntu 24.04, was 3.12. Bad luck!)
You may say it is crazy that a minor version change can make such a difference. But this is reality.
Thirty-five years ago, near the birth of the open source movement, you usually had no such problems: software changes
were much easier to follow than today because of the small number of existing software packages.
Now, however, to ensure that people run the same underlying software, especially
when a new version of your tool is packaged, one may need to pin every single prerequisite.
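This is exactly the kind of pinning that Docker makes easy. A minimal sketch (the image tag, the pinned module and the file names are illustrative, not the actual requirements of that project):

```dockerfile
# Pin the exact interpreter version, independently of the host distribution.
FROM python:3.11-slim
WORKDIR /app
# Pin the prerequisite modules as well, so the build stays repeatable.
RUN pip install --no-cache-dir "numpy==1.26.4"
COPY . /app
CMD ["python3", "main.py"]
```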
Fifteen years ago, when I became responsible for packaging software at
GeoGebra, I was happy to use
VirtualBox
and
VMware to ensure such details in a fully virtualized environment.
Today, however, there are much better approaches, and this is where Docker comes into the picture.
The big problem with Docker is that many people use it in a different way than it was designed for. It was the
same with the above mentioned mathematical project. In fact, Docker should be considered a set of command line tools that support
Linux-hosted virtualization, rather than a single simple command. This was my first misunderstanding.
Now I would rather compare it to Git, because both tools support certain workflows, and for each workflow
you may need a different scenario to select the appropriate tool for each step.
Docker has a kind of philosophy that may differ from the natural approach of a typical system administrator or package maintainer.
In my case, I had to understand the difference between images and containers: first you build an image
with
docker build, and then you run that image to create a container with
docker run.
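In its simplest form (the tag bibref-cli is just an illustrative name):

```bash
# Build an image from the Dockerfile in the current folder and give it a tag.
docker build -t bibref-cli .
# Create and start a container from that image;
# -it makes the run interactive, --rm removes the container on exit.
docker run -it --rm bibref-cli
```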
Interaction between the container and your host system is not always trivial: sometimes your container
exits before you can get data out of it, or you need to pass information from the host to
the container in the way the Docker authors designed.
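Two idioms cover most of these needs (the container name and the paths are made up for the example):

```bash
# Copy an artifact out of a container (works even after it has exited).
docker cp mycontainer:/build/bibref ./artifacts/
# Or share a host folder with the container, so results land on the host directly.
docker run --rm -v "$PWD/artifacts:/out" bibref-cli cp /build/bibref /out/
```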
Luckily, I had quite helpful guidance from AI tools and grasped the main concepts quickly enough
without reading the whole documentation in advance. Still, for Docker it could be an important hint to
read the docs first and clarify the main
concepts before running into crazy issues. Nowadays, however, it is very rare that anyone reads the docs before
giving a tool a try. “If everything else fails, read the documentation”: this well-known piece of wisdom
is certainly a modification of a quote by
Ralph Waldo Emerson.
As of today, I would use another modification: “Before everything else fails, ask the AI, but always check the result.”
I am now three weeks into my experiments. The first steps included
compiling Krita via Docker.
This was inspired by my son Benedek, who came up with the idea of developing a plugin for Krita that shows
a hexagonal grid to support game masters in
Dungeons & Dragons.
I was surprised how flawlessly the compilation process worked under Docker virtualization. Clearly,
its maintainer
Dmitry Kazakov
did a wonderful job of putting the tools together into an exceptionally well-working
development environment.
For my learning process I chose my current project
bibref,
with the idea of simplifying its build process and making it even more deterministic.
Formerly, I had difficulties on Windows because of using the rolling distribution
MSYS2: since
the versions of some libraries changed quite spontaneously, the build stopped working at some point.
I experienced the same problem with
GitHub Actions
builds, also when building for macOS. In fact, Docker has nothing
to do with Mac builds, but it has good support for cross-compiling code for Windows. Also,
it turned out that building for the
WebAssembly platform is a good use case for Docker.
Surprisingly (although this was actually clear from the Krita example), it is also possible to
run a Linux-based graphical application via Docker, but I was unsure how well it is supported.
Luckily, I managed to solve
all the problems I faced during my experiments.
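For the record, the usual trick is to share the host's X11 socket with the container; a sketch, assuming an X11 session on the host (the image name is illustrative):

```bash
# Allow local clients, including containers, to talk to the host's X server.
xhost +local:
# Pass the display name and the X11 socket into the container.
docker run --rm \
  -e DISPLAY="$DISPLAY" \
  -v /tmp/.X11-unix:/tmp/.X11-unix \
  bibref-gui
```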
I created 5
Dockerfiles.
I learned that in my case it was important to put them in separate folders
(otherwise it may take a long time to send the whole content of the folder to Docker as the build context, unnecessarily).
Each
Dockerfile is responsible for one specific build of bibref. There are common parts
in the
Dockerfiles: most notably, the native version must be built first to ensure that
the database cache is pre-generated. I guess advanced Docker users would separate the common parts
into a different image and compose them together, but I was happy with my solution at this stage.
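With Docker's multi-stage builds, such a refactoring could look roughly like this (a hypothetical sketch, not my current setup):

```dockerfile
# Shared stage: install the toolchain, build the native version
# and pre-generate the database cache.
FROM ubuntu:24.04 AS base
RUN apt-get update && apt-get install -y build-essential cmake git
# ... native build steps and cache generation would go here ...

# Variant stage: one specific build of bibref on top of the common layers.
FROM base AS cli
# ... variant-specific build steps would go here ...
```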
Five containers
I started with building the command line (cli) version. It was the simplest scenario, but I already faced
the problem of running a container interactively. Then I continued with building the
Qt
version for Linux, which provides a graphical user interface (gui).
The first working version of the
Dockerfile was not much longer than the one for the cli version, but the final
version is quite long, because the code has to be patched to fully support icons, and
the application has to be run with a number of environment variables to ensure a proper startup of Chromium
inside Qt via
QtWebEngine. (Here I found AI support extremely helpful
when debugging the issues.)
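To give an idea of the kind of variables involved (QTWEBENGINE_DISABLE_SANDBOX and QTWEBENGINE_CHROMIUM_FLAGS are real QtWebEngine variables, but the exact set I needed differs; the image name is illustrative):

```bash
# Chromium's sandbox usually cannot start inside an unprivileged container,
# so QtWebEngine has to be told to run without it.
docker run --rm \
  -e QTWEBENGINE_DISABLE_SANDBOX=1 \
  -e QTWEBENGINE_CHROMIUM_FLAGS="--no-sandbox" \
  -e DISPLAY="$DISPLAY" \
  -v /tmp/.X11-unix:/tmp/.X11-unix \
  bibref-gui
```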
Then I went on with compiling the web version, first with its command line interface.
Here I needed to get
Emscripten and compile the
SWORD library for WebAssembly as well. It was
surprisingly easy to run a web server inside the container and connect to it from the host
(or from another workstation on the same network).
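Publishing the server's port is all it takes (the port number and image name are illustrative):

```bash
# Map the container's web server port to the host, so the host
# (and other machines on the LAN) can reach it.
docker run --rm -p 8080:8080 bibref-wasm
# Then browse to http://<host-ip>:8080/ from any workstation on the network.
```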
The next project was the same with the Qt version; here I had to recompile the whole Qt stack
from scratch, both for the container host and
for WebAssembly. This was quite challenging but still manageable, since I had lots of experience
from last December when I successfully
did
the same natively (without Docker). For this 4th
Dockerfile, however,
I did not manage to connect to the WebAssembly-based Qt application from an arbitrary
workstation, only from the Linux host, but that was acceptable for me since my final scenario
was to put the deployment on a public server.
The fifth project was to cross-compile the gui version for Windows. Here I made several attempts until
I succeeded. The first dead end was to use
Wine
and install MSYS2 on Wine, everything inside Docker.
I chose this approach because of a
positive
result from an earlier experiment. I had to disable the signature checking, since it would involve
unimplemented features in Wine, but finally I had to give up this approach because of its terrible slowness.
Unfortunately, the installation
of
libxml2 got stuck for an indefinite time, and I have no idea why.
The second dead end was to use MINGW64 and compile everything from scratch. I managed to build
Graphviz,
Boost, Qt and
zlib from source with moderate pain. Unfortunately,
it turned out that bibref needs some non-trivial features of Graphviz to display
statement graphs, including
rsvg and
pango. These would finally
have resulted in recompiling almost the complete
GTK stack, which seemed
overkill for the project. Here, a workaround was to copy the binaries of the required Graphviz
dependencies from the MSYS2 repository, and surprisingly, this option worked for the full project.
But finally I chose to take all possible dependencies from MSYS2
(by using its
MINGW64 packages)
instead of compiling Graphviz, Boost and zlib from source. This resulted in selecting 36 packages,
which are available in
zstd format.
(This format has also been chosen by the
pacman
package manager of
Arch Linux.)
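As an illustration, pinning and unpacking one such package can be done along these lines (the package name, version and target folder are examples; the mirror URL pattern is the standard MSYS2 one, and curl, tar and zstd are assumed to be installed in the image):

```dockerfile
ARG GRAPHVIZ_VER=12.2.1-1
# Download a pinned MINGW64 package from the MSYS2 mirror and unpack it.
RUN curl -LO "https://mirror.msys2.org/mingw/mingw64/mingw-w64-x86_64-graphviz-${GRAPHVIZ_VER}-any.pkg.tar.zst" \
 && mkdir -p /opt/mingw64 \
 && tar --zstd -xf "mingw-w64-x86_64-graphviz-${GRAPHVIZ_VER}-any.pkg.tar.zst" -C /opt/mingw64
```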
These packages, however, do not include the Qt binaries.
Indeed, it was a sad decision to leave the Qt packages out, but I did not manage to resolve
a linker problem: in the final step I got an error related to some mismatches with the Qt library
taken from the MSYS2/MINGW64 build. So, at the end of the day,
Qt had to be built from scratch for the Linux/MINGW64 platform, which took a huge amount of time.
The final challenge was to collect all dependencies of each MSYS2/MINGW64 package
(this is how the 36 packages were finally selected) and lay out the required folders
for the final .EXE to find. This included the folders
platforms,
styles
and
imageformats from the Qt distribution (which I had built earlier).
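Laid out with plain copies, roughly like this (the paths and plugin names are illustrative; the exact plugin set depends on the Qt version):

```bash
# Qt on Windows loads its plugins from folders next to the executable.
mkdir -p dist/platforms dist/styles dist/imageformats
cp qt-build/plugins/platforms/qwindows.dll         dist/platforms/
cp qt-build/plugins/styles/qmodernwindowsstyle.dll dist/styles/
cp qt-build/plugins/imageformats/qsvg.dll          dist/imageformats/
```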
As the last step, I managed to run the
Inno Setup
utility (after installing it
under Wine inside
xvfb)
and create the Windows installer .EXE as well,
completely automatically.
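The headless invocation is essentially this (the paths and script name are illustrative; ISCC.exe is Inno Setup's command line compiler):

```bash
# Run the Inno Setup compiler under Wine, inside a virtual framebuffer.
xvfb-run wine 'C:\Program Files (x86)\Inno Setup 6\ISCC.exe' bibref.iss
```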
Reproducible?
Reproducibility is only partial at the moment. First, cloning a GitHub repository without
a fixed tag or commit hash is always questionable. This should be fixed in the future.
For one
Dockerfile I used Subversion to get the latest version: this should also be
changed to use a fixed version. Luckily, Docker has a feature to override preconfigured settings
via the
ARG command,
so this might be used in the future to fine-tune the required version to build.
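A sketch of how this could work (the repository URL, owner and tags are placeholders):

```dockerfile
# Default to a fixed tag, overridable at build time.
ARG BIBREF_REF=v1.0.0
RUN git clone --depth 1 --branch "${BIBREF_REF}" https://github.com/OWNER/bibref.git
```

and then, on the host:

```bash
# Override the pinned revision without touching the Dockerfile.
docker build --build-arg BIBREF_REF=v1.1.0 -t bibref-cli .
```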
The packages used from MSYS2 are already set via ARG commands. MSYS2 being a rolling release,
it is quite natural that it will refuse to serve outdated versions of the dependent packages
after a while, hopefully not earlier than in a couple of years.
Fortunately, the package versions can be overridden with the above mentioned idea, so this part is already
in good shape; no change is required in the Dockerfile. (Each change in the Dockerfile
usually results in a full rebuild of all steps that come after the modified one, including a re-run of the changed
step itself, of course. So one should avoid changing the Dockerfile if it is not necessary.)
For the moment, I am happy that all the projects are buildable and the created artifacts
can easily be copied from the container to the host system. The Linux versions
are still built automatically by the
Snapcraft machinery and the
Flatpak ecosystem,
and for the Mac version I still need my own Mac Mini (borrowed from the GeoGebra guys),
but I have already changed the release workflow for the WebAssembly ports and the Windows version.
Comparison
Here is a short comparison of the 5 containers in tabular form:
| No. | Target OS | Application variant | Compiler | Guest OS | Dockerfile length (chars) | Build time (mins) | Image size (GB) |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | Linux | cli | gcc | Ubuntu 24.04 | 1259 | 4 | 1.6 |
| 2 | Linux | gui | gcc | Debian Trixie | 4149 | 7 | 3.25 |
| 3 | WebAssembly | cli | emscripten/clang | Ubuntu 24.04 | 2873 | 13 | 3.57 |
| 4 | WebAssembly | gui | emscripten/clang | Ubuntu 24.04 | 4926 | 96 | 69.2 |
| 5 | Windows | gui | mingw64/gcc | Debian Trixie | 14225 | 100+ | 79.6 |
You can try all of these variants (and also the Mac version) on the web site of the project,
here.
If you are interested, the
Dockerfiles include several comments to help their users.
Most importantly, each
Dockerfile comes with step-by-step instructions at the top of the file.
Zoltán Kovács
Linz School of Education
Johannes Kepler University
Altenberger Strasse 69
A-4040 Linz