Wednesday, April 4, 2018

gitlab-runner cache key bug: cache not being created anymore in gitlab steps


Besides the Docker image maven:3-jdk-8 being auto-updated to its most recent version, the Gitlab runners were also always updated to the most recent version.

Again not a best practice, and again I learned the hard way: builds suddenly started to fail.

The issue and workarounds/solutions

Indeed, the Docker image build of the application suddenly started to fail with this error:

Step 9/12 : ADD service1/target/*.jar app.jar
ADD failed: no source files were specified
ERROR: Job failed: exit code 1

It turned out the service1/target directory was empty (or missing, I forgot).
Investigating some more showed that the cache produced in previous steps in Gitlab was not there anymore. The runner version in which this started appearing was:

Running with gitlab-runner 10.1.0 (c1ecf97f)
on gitlab-runner-hosted (70e74c0e)

From older successful builds I saw that at the end of the previous step, when the cache is created, you see something like this:

Creating cache developbranch:1...
WARNING: target/: no matching files
service1/target/: found 87 matching files
service2/target/: found 33 matching files
untracked: found 200 files
Created cache
Job succeeded

But those log lines were now suddenly missing! And I noticed that the gitlab-runner version had changed between the last successful build and this failing one.
So I had an area to focus on: the caching didn't work anymore.

The cache definition in the gitlab-ci.yml was:

cache:
  key: "$CI_COMMIT_REF_NAME"
  paths:
    - target/
    - service1/target/
    - service2/target/
  untracked: true

I suspected that the environment variable used in the key: field might be empty.
But when I added logging in other script steps, the $CI_COMMIT_REF_NAME variable was filled with the value 'developbranch'. So it was not empty.
Then I had an epiphany and prefixed the above environment variable with a string, making the cache key: definition look like this:

cache:
  key: prefix-"$CI_COMMIT_REF_NAME"
  paths:
    - target/
    - service1/target/
    - service2/target/
  untracked: true

In the above you can see I prefixed the key with the hardcoded string "prefix-". And indeed that did it: cache creation worked again and looked like this:

Creating cache prefix-developbranch:1...
WARNING: target/: no matching files
service1/target/: found 87 matching files
service2/target/: found 33 matching files
untracked: found 200 files
Created cache
Job succeeded

So here too I learned (what I really already knew): don't auto-update, but update in a controlled way, so you know when to expect potentially failing builds.

Friday, March 30, 2018

Gitlab maven:3-jdk-8 Cannot get the revision information from the scm repository, cannot run program "git" in directory error=2, No such file or directory


In a Gitlab project setup the most recent Docker image for maven was always retrieved from the internet before each run by having specified image: maven:3-jdk-8 in the gitlab-ci.yml. The image details can be found here.

Of course this is not a best practice; your build can suddenly start failing at a certain point because an update to the image changed something internally.
What you want is controlled updates. That way you can anticipate failing builds and plan the upgrades in your schedule.

The issue and workarounds/solutions

And indeed, suddenly on March 29, 2018 our builds started failing with this error:

[ERROR] Failed to execute goal org.codehaus.mojo:buildnumber-maven-plugin:1.4:create (useLastCommittedRevision) on project abc: Cannot get the revision information from the scm repository :
[ERROR] Exception while executing SCM command.: Error while executing command. Error while executing process. Cannot run program "git" (in directory "/builds/xyz"): error=2, No such file or directory

That message is quite unclear: is git missing? Or is the directory wrong? Or could the maven buildnumber plugin not find the SCM repository?
After lots of investigation it turned out the maven:3-jdk-8 image had indeed changed about 18 hours before.
And after running the maven command in a local version of that Docker image, indeed the same error occurred! Awesome, the error was reproducible.
And after installing git again in the image with:

- apt-get update
- apt-get install git -y

the error disappeared! But a new one appeared:

[ERROR] The forked VM terminated without properly saying goodbye. VM crash or System.exit called?

This also hadn't happened before. After some searching it turned out it might be the surefire and failsafe plugins being outdated.
So I updated them to 2.21.0 and indeed the build succeeded.
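
A quick way to confirm that a tool such as git really is missing from an image is to ask the image itself. The following is only a minimal sketch, assuming the docker CLI is available locally and the image contains a POSIX sh; the function name image_has_command is made up for this example.

```shell
#!/bin/sh
# image_has_command IMAGE TOOL
# Succeeds (exit code 0) when TOOL is on the PATH inside IMAGE, fails otherwise.
# The entrypoint is overridden with sh so the image's own entrypoint cannot interfere.
image_has_command() {
  image="$1"
  tool="$2"
  docker run --rm --entrypoint sh "$image" -c 'command -v "$1" >/dev/null' sh "$tool"
}

# Example call (commented out so that sourcing this file has no side effects):
# image_has_command maven:3-jdk-8 git || echo "git is missing from the image"
```

Running this check before and after an image update shows right away whether the update removed the tool.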

Here's the issue reported in the Docker Maven github. UPDATE: it is caused by an openjdk issue (the maven:3-jdk-8 image is based on openjdk).

This issue made us realize we really need an internal Docker repository. And so we implemented that :)

One disadvantage of Docker images is that you can't specify a commit hash to use. Yes, you can specify a digest instead of a tag, but that is only a SHA-256 content hash. You can no longer tell from that hash which tag it relates to.
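
One way to keep the tag name visible next to the digest is to write both into a single reference. This is a sketch under assumptions: the function names are invented for this example, the docker CLI must be installed, and as far as I know Docker resolves such a combined reference by its digest and treats the tag as informational only, so verify that against your Docker version.

```shell
#!/bin/sh
# pinned_reference TAGGED_IMAGE REPO_DIGEST
# Combines a tag (e.g. maven:3-jdk-8) with the digest part of a repo digest
# (e.g. maven@sha256:<hex>) into one reference: maven:3-jdk-8@sha256:<hex>.
# Pure string handling, docker is not needed for this step.
pinned_reference() {
  tagged_image="$1"
  repo_digest="$2"
  printf '%s@%s\n' "$tagged_image" "${repo_digest#*@}"
}

# current_repo_digest TAGGED_IMAGE
# Pulls the image and prints the first repo digest docker has recorded for it.
current_repo_digest() {
  docker pull "$1" >/dev/null &&
    docker inspect --format '{{index .RepoDigests 0}}' "$1"
}

# Example call (commented out so that sourcing this file has no side effects):
# pinned_reference maven:3-jdk-8 "$(current_repo_digest maven:3-jdk-8)"
```

The printed reference could then be used in place of the bare tag, so updates only happen when somebody deliberately changes the digest.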

Saturday, February 3, 2018

How to prevent Chromium from rebooting a Raspberry Pi 3 model B Rev 1.2

I tried to use a Raspberry Pi 3 model B Rev 1.2 as a dashboard for monitoring a couple of systems using Chromium as browser.

Tip: use this to keep the display from ever turning off:
sudo xset s off
sudo xset -dpms
sudo xset s noblank

I had only two tabs open all the time and was using the Revolver browser extension to rotate the tabs. One tab had the default Datadog page open, another a custom dashboard within Kibana that refreshed every 15 minutes.
Using all default settings, within a few hours the Pi would reboot on its own! So something caused it to do that.

It seemed the browser (tabs) or the JavaScript in them was leaking so much memory that the Pi ran out of memory. I tried multiple times with the same default setup, but the behavior was the same each time.

So I tried a couple of other things:
  1. Have the Revolver plugin fully reload the page. Still a reboot of the Pi, though it took a bit longer

  2. Added --process-per-site to the startup shortcut of Chromium. This causes Chromium to create fewer processes, which should reduce the memory usage a bit. But still a reboot of the Pi, though again it took a bit longer.
    Note that this also comes with its own weaknesses.

  3. Added --disable-gpu-program-cache to the startup shortcut of Chromium. The Pi still rebooted after a while.

  4. Tried other browsers like Midori and Firefox Iceweasel.  Midori does not have a Revolver-like plugin, so it didn't fit the requirements. Firefox's only add-on that should work gave some kind of "invalid format" error (don't remember exactly) when trying to install it. The other add-ons for Firefox were not compatible with Iceweasel.

So in the end I did not find a solution :( I just built a cron job that restarts the browser every 5 hours.
If you found a way to fix this problem, let the world know in the comments!
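
The cron-based workaround mentioned above could look roughly like the sketch below. It is not the exact script I used: the process pattern chromium, the start command chromium-browser, the display :0 and the script path in the crontab line are all assumptions that need to be adapted to your Pi.

```shell
#!/bin/sh
# restart_browser PATTERN COMMAND [ARGS...]
# Stops every process whose name matches PATTERN, then starts COMMAND in the background.
restart_browser() {
  pattern="$1"
  shift
  # cron jobs have no X display set; :0 is assumed to be the Pi's desktop session
  DISPLAY="${DISPLAY:-:0}"
  export DISPLAY
  # pkill exits non-zero when nothing matched; that is fine, e.g. right after boot
  pkill "$pattern" || true
  sleep 2   # give the old processes a moment to exit
  "$@" >/dev/null 2>&1 &
}

# Example call (commented out so that sourcing this file has no side effects):
# restart_browser chromium chromium-browser --process-per-site
#
# Matching crontab entry (added with crontab -e), using a hypothetical script path:
# 0 */5 * * * /home/pi/restart-browser.sh
```

Note that */5 in the hour field fires at 0:00, 5:00, 10:00, 15:00 and 20:00, so the last interval of the day is 4 hours instead of 5.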