Some Background
It has been a while since the last high-profile vulnerabilities, such as Heartbleed or the Rowhammer family, were published. But last Friday another exploit made international headlines: CVE-2021-44228 or, as it is more commonly known, Log4Shell.
This exploit falls into the category of unauthenticated remote code execution (RCE), meaning it allows remote attackers to execute arbitrary code. What makes it particularly bad is that the affected software, log4j, is the most common logging framework in the Java world and is present pretty much everywhere.
In addition to log4j being omnipresent, attackers do not need any special form of access to the target application. It is enough to cause a user-defined string to be logged, for example when a webserver writes your browser's user agent to an access log as you visit a website.
For more information on this exploit please visit https://www.lunasec.io/docs/blog/log4j-zero-day
Mitigation
The issue in cases like this is usually not waiting for a fixed release of the impacted product – log4j in this case – but rather the time it takes for this release to trickle down through the entire stack until every last component you are running is fixed.
This means a lot of companies providing software that depends on log4j are now trying to provide interim fixes as quickly as possible to mitigate the wait until all necessary upstream releases are available.
The majority of the tools managed by the Stackable platform are written in Java, so quite a few of them are also vulnerable. We immediately started looking for a solution to provide to our customers.
Since our workloads are containerized, we are in the favorable position of being able to add build steps to our containers that also influence the environment those tools run in, i.e. the container. This gives us a large degree of flexibility when dealing with these types of situations.
Because we manage the container version independently of the actual product version, we can update the container without changing the version of the product that runs inside it. This allows our customers to simply perform a rolling restart of their services, regardless of the version they are currently running, and have them come back up with a fixed version (fixed according to the current best practices).
As a first step, we simply added an environment variable to all our Dockerfiles that disables the vulnerable functionality in supported versions of log4j (>= 2.10). However, this only fixes later versions of log4j, and even then it is not a perfect fix, as indicated by a subsequent CVE that has been opened for this parameter.
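To make the version caveat concrete, here is a minimal sketch of which log4j 2 releases such a flag can help with. It assumes the variable in question is LOG4J_FORMAT_MSG_NO_LOOKUPS (the post does not name it), that log4j honors it from 2.10 onward, and that 2.15 and later already disable message lookups by default. The function names and status strings are made up for illustration.

```python
def parse_version(version):
    """Turn '2.14.1' into (2, 14, 1); assumes plain dotted integers."""
    return tuple(int(part) for part in version.split("."))


def env_flag_status(version):
    """Classify a log4j version with respect to the no-lookups flag.

    Assumption: the flag is LOG4J_FORMAT_MSG_NO_LOOKUPS=true, honored from
    2.10 onward; releases from 2.15 on no longer need it.
    """
    parsed = parse_version(version)
    if parsed < (2, 10):
        return "not covered by the flag"
    if parsed < (2, 15):
        return "flag disables message lookups"
    return "message lookups already off by default"


if __name__ == "__main__":
    for candidate in ("2.9.1", "2.14.1", "2.15.0"):
        print(candidate, "->", env_flag_status(candidate))
```

Anything that lands in the first bucket needs a different mitigation, which is why the flag alone was not sufficient.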
So we went a step further and after some investigation chose a way forward that is based on a solution that Cloudera has recommended to their users.
They provide a script that iterates over all paths known to contain Java libraries for the platforms supported by Cloudera, checks every single jar file for the vulnerable .class file and removes this file.
This is a quick and workable solution, but it has a few drawbacks. The main one is that it modifies the software distributed by the platform, which means that the changes will be overwritten by any subsequent update, or even when new servers are added to the system.
We chose to incorporate an adapted version of this script into our container build process so that all container images shipped as part of our platform are cleaned of the vulnerable code.
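The general idea of scanning for jar files and removing the vulnerable class can be sketched as follows. This is neither Cloudera's script nor the one used in our image builds. It assumes the class to remove is JndiLookup.class inside log4j-core, and the function names and the example path in the comment are hypothetical.

```python
import os
import shutil
import tempfile
import zipfile

# Assumed location of the vulnerable class inside log4j-core jars.
VULNERABLE_ENTRY = "org/apache/logging/log4j/core/lookup/JndiLookup.class"


def strip_jndi_lookup(jar_path):
    """Rewrite jar_path without the vulnerable class; return True if it was removed."""
    with zipfile.ZipFile(jar_path) as src:
        if VULNERABLE_ENTRY not in src.namelist():
            return False
        # zipfile cannot delete entries in place, so copy everything else
        # into a temporary archive next to the original.
        fd, tmp_path = tempfile.mkstemp(
            suffix=".jar", dir=os.path.dirname(jar_path) or "."
        )
        os.close(fd)
        with zipfile.ZipFile(tmp_path, "w", zipfile.ZIP_DEFLATED) as dst:
            for item in src.infolist():
                if item.filename != VULNERABLE_ENTRY:
                    dst.writestr(item, src.read(item.filename))
    shutil.copymode(jar_path, tmp_path)  # keep the original file permissions
    shutil.move(tmp_path, jar_path)
    return True


def clean_tree(root):
    """Walk root, clean every .jar file found, and return the paths that changed."""
    cleaned = []
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            if name.endswith(".jar"):
                path = os.path.join(dirpath, name)
                if strip_jndi_lookup(path):
                    cleaned.append(path)
    return cleaned


# Hypothetical usage during an image build:
#   clean_tree("/opt/product/lib")
```

Note that this sketch only looks at top-level jar files. Jars nested inside other archives (fat jars, war files) would need to be unpacked and handled separately.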
We will closely monitor the situation going forward and take additional measures as needed to ensure the security of our platform.