Wednesday, February 14, 2024

Kotlin: how to mock a static Companion method using MockK

Introduction

How do you mock a static companion method in Kotlin using MockK?

I wanted to mock this (static) method in the companion object of my Kotlin class MyClass:

  companion object {
    fun isNegativeAndFromSavingsAccount(amount: BigDecimal, accountType: AccountType) = amount < BigDecimal.ZERO && accountType == AccountType.SAVINGS
  }

 Trying it with a regular 'every' like this doesn't work:

      every { MyClass.isNegativeAndFromSavingsAccount(any(), any()) } returns false

Note it does compile fine!
But when running the test, you'll get this error:

    io.mockk.MockKException: Failed matching mocking signature for left matchers: [any(), any()]
    at io.mockk.impl.recording.SignatureMatcherDetector.detect(SignatureMatcherDetector.kt:97)

Solution

This is the way it does work:

import io.mockk.mockkObject

    mockkObject(MyClass.Companion) {
      every { MyClass.isNegativeAndFromSavingsAccount(any(), any()) } returns false
    }

Note this did not work, got the same error:

    mockkObject(MyClass::class)
    every { MyClass.isNegativeAndFromSavingsAccount(any(), any()) } returns false

I found several posts, but none of them gave a clear answer and/or were using some older version of MockK. E.g: https://github.com/mockk/mockk/issues/61
and this StackOverflow post.

Some more examples and variants of solutions can be found here, e.g when using @JvmStatic.
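
For completeness, here is a minimal self-contained test sketch that ties the working approach together (the class and method names follow the example above; the JUnit 5 setup and imports are my own assumptions):

    import io.mockk.every
    import io.mockk.mockkObject
    import io.mockk.unmockkAll
    import org.junit.jupiter.api.AfterEach
    import org.junit.jupiter.api.Assertions.assertFalse
    import org.junit.jupiter.api.Test
    import java.math.BigDecimal

    class MyClassCompanionTest {

      @AfterEach
      fun tearDown() {
        // Always undo object mocks, otherwise they leak into other tests
        unmockkAll()
      }

      @Test
      fun `companion method can be mocked via mockkObject on the Companion`() {
        mockkObject(MyClass.Companion)
        every { MyClass.isNegativeAndFromSavingsAccount(any(), any()) } returns false

        // Even a clearly negative savings account amount now returns the mocked value
        assertFalse(MyClass.isNegativeAndFromSavingsAccount(BigDecimal("-10"), AccountType.SAVINGS))
      }
    }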

Tuesday, December 26, 2023

OWASP DependencyCheck returns 403 Forbidden accessing NVD API using API key

Introduction

Recently the NVD (National Vulnerability Database), from which the OWASP dependency check plugin gets its data to check for vulnerabilities, has introduced the use of an API key. That allows them to better control access and throttling - imagine how many companies and organizations hit that API every time a dependency check build is performed, especially those that don't cache the NVD database and retrieve it again on each run. And be aware: "... previous versions of dependency-check utilize the NVD data feeds which will be deprecated on Dec 15th, 2023. Versions earlier then 9.0.0 are no longer supported and could fail to work after Dec 15th, 2023."

But this introduction hasn't gone without hiccups. For example, it is possible to still get HTTP 403 Forbidden responses, even though you have a valid key. Here's my research while trying to fix it.

Setup:

  • Gradle 7.x
  • Dependency Check v9.0.6 (issue applies at least for versions > 8.4.3)
  • Configuration:

    dependencyCheck {
        failBuildOnCVSS = 6
        failOnError = true
        suppressionFile = '/bamboo/owasp/suppressions.xml'
        nvd.apiKey = '<yourkey>'
    }

    You can also set it dynamically via a system property like this:

    dependencyCheck {
      nvd {
        apiKey = System.getProperty("ENV_NVD_API_KEY")
      }
    }

  • Via commandline you can invoke it like this:

    ./gradlew dependencyCheckAggregate -DENV_NVD_API_KEY=<yourkey>

 

Solution

First you should check if your API key is valid by executing this command:

curl -H "Accept: application/json" -H "apiKey: xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx" -v https://services.nvd.nist.gov/rest/json/cves/2.0\?cpeName\=cpe:2.3:o:microsoft:windows_10:1607:\*:\*:\*:\*:\*:\*:\*
 

(where xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx is your NVD API key)

That should return JSON (and not a 404). Now you know your API key is valid.
 

Some people have had success with setting a longer delay:

    nvd {
        apiKey = System.getProperty("ENV_NVD_API_KEY")
        delay = 6000 // milliseconds, default is 2000 with API key, 8000 without
    }

Commandline version:

--nvdDelay 6000
 

You can also increase the validForHours option, but that doesn't help if you construct completely new Docker containers for each build - you lose that history.

All NVD options you can pass to DependencyCheck are found here.

But currently (27 December 2023) all the above efforts don't always fix the 403 problem. Sometimes it works for a while, but then it fails again. If you build many projects in your company at about the same time, you still have a chance of getting throttled, of course.

The best solution is to create a local cache so you are less dependent on NVD API calls (and thus the throttling).

 

Other causes mentioned

  • Being behind a proxy with your whole company, see https://github.com/jeremylong/DependencyCheck/issues/6127 and "If you have multiple builds happening at the same time - using the same API key you could hit the NVD rate limiting threshold. Ideally, in an environment with multiple builds you would implement some sort of caching strategy". See: https://github.com/jeremylong/DependencyCheck/issues/6195
  • Use --data argument to control cache location.
  • It appears the NVD has put a temporary block on all requests that use a virtualMatchString of "cpe:2.3:a" - outstanding issue.



Saturday, November 4, 2023

MapStruct library: setting a field with an expression to a ZonedDateTime

Introduction

Setup:

- Mapstruct 1.5.5
- Kotlin 1.8
- IntelliJ 2023.2.3
 

 

For a mapping between two classes, the source class did not have a value for a mandatory (non-null, Kotlin!) field of the target class. I wanted to fill it with the current ZonedDateTime, using the Europe/Amsterdam timezone. That specific ZoneId is a constant named MY_DEFAULT_TIME_ZONE in my Application.kt class.

Solution

So I looked at the expression field in @Mapping found here.
I got this solution working quite fast with: 

@Mapping(target = "created", expression = "java(ZonedDateTime.now())"). 

But as you see, the ZoneId constant is still missing.

Potential other solutions

I had to try quite a few things to get that working, because the class ZoneId was not recognized in the mapper implementation generated by MapStruct.
In the end this worked:

@Mapper(componentModel = MappingConstants.ComponentModel.SPRING, imports = [ZoneId::class, Application::class])
interface MyMapper {

  @Mapping(target = "created", expression = "java(ZonedDateTime.now(my.package.Application.MY_DEFAULT_TIME_ZONE))")
  fun myDtoToResponseDto(myDto: MyDto): ResponseDto
  ...
}
Note the imports field, which makes the ZoneId class and the constant available (imported) in the implementation class generated by MapStruct.

In the Application.kt you then have to make the MY_DEFAULT_TIME_ZONE constant available to Java, since that's what MapStruct uses as language:

// Application.kt
class Application {
  companion object {

    @JvmField
    val MY_DEFAULT_TIME_ZONE: ZoneId = ZoneId.of("Europe/Amsterdam")
    ...
  }
}

I also tried this: 

@Mapping(target = "created", source = ".", qualifiedByName = ["getValue"]) 

with a function:

@Named(value = "getValue")
fun getValue(myDto: MyDto): ZonedDateTime {
  return ZonedDateTime.now(MY_DEFAULT_TIME_ZONE)
}

The advantage of this solution is that you can use Kotlin code and you don't have to wait and see if your expression has the correct syntax and will compile.
But then I got this error: ZonedDateTime does not have an accessible constructor. I also tried to wrap the created field in a small class, but that didn't work either (could be me :)
See this and this for more details on how that should work.

I also tried with the @JvmDefault annotation, but that is deprecated + it requires you to use the -Xjvm-default property, which I couldn't get to work in IntelliJ with Gradle.
And it is not always guaranteed to work, see here and here and here:

I'm definitely still a beginner in using MapStruct. So probably one of the other methods could work too... Any tips are welcome :)

Wednesday, September 27, 2023

SpringDoc OpenAPI Swagger generated Swagger API shows incorrect class with same name

Introduction

When you have multiple classes with the same name on your classpath, SpringDoc with Swagger API annotations potentially picks the wrong one when generating the Swagger UI documentation.


Suppose you have these classes:

  • org.example.BookDto
  • org.example.domain.BookDto
     

And you specified your endpoint like this, where you want to have it use org.example.BookDto:

  @Operation(summary = "Get a list of books for a given shop")
  @ApiResponses(
    value = [
      ApiResponse(
        responseCode = "200",
        description = "A list of books",
        content = [Content(mediaType = "application/json",
                    array = ArraySchema(schema = Schema(implementation = BookDto::class)))]
      )
    ]
  )
  @GetMapping("/books/{shopId}")
  fun getBooksByShopId(
    @Parameter(description = "Shop to search for")
    @PathVariable shopId: Long
  ): List<BookDto> {
    return bookService.getBooksByShopId(shopId)
      .map { BooksMapper.mapDto(it) }
  }

Then whatever it finds first on the classpath will be visible in https://localhost:8080/swagger-ui.html. Not necessarily the class you meant, it might pick org.example.domain.BookDto.  

Setup:

  • Spring Boot 3
  • Kotlin 1.8
  • Springdoc OpenAPI 2.2.0
     

Solution

Several solutions exist:

Solution 1

Specify in your application.yml:

springdoc:
 use-fqn: true

 

Disadvantage: the Swagger documentation in the swagger-ui.html endpoint then shows the fully qualified package + class name. Looks ugly.

Solution 2

Setting it in the @Bean configuration:

import io.swagger.v3.core.jackson.TypeNameResolver
  @Bean
  fun openAPI(): OpenAPI? {

    TypeNameResolver.std.setUseFqn(true)
    return OpenAPI()
      .addServersItem(Server().url("/"))
      .info(
        Info().title("Books Microservice")
          .description("The Books Microservice")
          .version("v1")
      )
      .externalDocs(
        ExternalDocumentation()
          .description("Books Microservice documentation")
          .url("https://github.com/myproject/README.md")
      )
  }

Disadvantage: also with this solution, the Swagger documentation in the swagger-ui.html endpoint then shows the fully qualified package + class name. Looks ugly.

Solution 3

You can create your own ModelConverters, but that is much more work. Examples here:  https://github.com/swagger-api/swagger-core/wiki/Swagger-2.X---Extensions#extending-core-resolver and https://groups.google.com/g/swagger-swaggersocket/c/kKM546QXGY0

Solution 4

Make sure for each endpoint you specify the response class with full class package path:

@Operation(summary = "Get a list of books for a given shop")
  @ApiResponses(
    value = [
      ApiResponse(
        responseCode = "200",
        description = "A list of books",
        content = [Content(mediaType = "application/json",
                    array = ArraySchema(schema = Schema(implementation = org.example.BookDto::class)))]
      )
    ]
  )
  @GetMapping("/books/{shopId}")
  fun getBooksByShopId(
    @Parameter(description = "Shop to search for")
    @PathVariable shopId: Long
  ): List<BookDto> {
    return bookService.getBooksByShopId(shopId)
      .map { BooksMapper.mapDto(it) }
  }

 See the fully qualified Schema implementation value for what changed.


 

 

Wednesday, August 23, 2023

Unknown application error occurred Runtime.Unknown - Startup issue AWS Serverless Lambda

Introduction

Trying to invoke a deployed AWS Serverless Lambda, I was getting this error in CloudWatch when the lambda was invoked via an SQS event published by another service in my landscape:
 
2023-08-15T15:20:44.047+02:00 START RequestId: ab924ff5-236c-5b09-8a29-12a0b9447e41 Version: $LATEST
2023-08-15T15:20:45.223+02:00 Unknown application error occurred
  Runtime.Unknown
  Unknown application error occurred Runtime.Unknown
2023-08-15T15:20:45.223+02:00 END RequestId: ab924ff5-236c-5b09-8a29-12a0b9447e41

 

That's all. No more details. Nothing appeared in Datadog, to which my CloudWatch logging is forwarded. But the lambda ran fine when running it locally in IntelliJ using the AWS Toolkit (SAM), with me logged in with my IAM role.
Adding logging or a try/catch wouldn't do anything, since this error appears already before the lambda even gets invoked.
 
Setup:
  • AWS Serverless Lambda
  • IAM
  • IntelliJ AWS Toolkit
  • Kotlin 1.8.10
  • CloudWatch
  • Datadog
  • AWS Parameter Store
  • KMS
  • SSM
  • SQS
     
 

Solution

Then I tried to trigger the lambda via the AWS console by manually creating the SQS event and sending it on the SQS queue the lambda is listening to. There I did get the details of the error shown:

{
  "errorMessage": "User: arn:aws:sts::100004:assumed-role/my-lambda-role-acc/my-lambda is not authorized to perform: ssm:GetParameter on resource: arn:aws:ssm:eu-west-1:100004:parameter/abc/apikey because no identity-based policy allows the ssm:GetParameter action (Service: AWSSimpleSystemsManagement; Status Code: 400; Error Code: AccessDeniedException; Request ID: 657c62f2-3527-42e0-8ee4-xxxxxxxx; Proxy: null)",
  "errorType": "com.amazonaws.services.simplesystemsmanagement.model.AWSSimpleSystemsManagementException"
}

See this screenshot: 

 
 
The reason it worked locally is probably because there I'm logged in with a different IAM account (with more permissions) than when the lambda is deployed in the AWS cloud.

Then, after fixing that by adding the parameter path abc/apikey as a resource to the policy, I got this error:
{
  "errorMessage": "User: arn:aws:sts::100004:assumed-role/my-lambda-role-acc/my-lambda is not authorized to perform: kms:Decrypt on resource: arn:aws:kms:eu-west-1:100004:key/ff841b70-5038-6f0b-8621-xxxxxx because no identity-based policy allows the kms:Decrypt action (Service: AWSKMS; Status Code: 400; Error Code: AccessDeniedException; Request ID: aaa5a8d0-d26e-5051-7ac0-xxxxxxxx; Proxy: null) (Service: AWSSimpleSystemsManagement; Status Code: 400; Error Code: AccessDeniedException; Request ID: f807b8d7-826e-4d4c-9b5c-xxxxxxx; Proxy: null)",
  "errorType": "com.amazonaws.services.simplesystemsmanagement.model.AWSSimpleSystemsManagementException"
}


So the KMS decrypt is not allowed (no permission for) on that specific AWS Parameter Store entry abc/apikey.

The fix here was to add the kms:Decrypt action for the correct KMS resource, as shown in the statement below:

  statement {
    sid    = "AllowReadingParameterStoreParameters"
    effect = "Allow"

    actions = [
      "ssm:DescribeParameters",
      "ssm:GetParameter",
      "kms:Decrypt"          
    ]

    resources = [
      "arn:aws:ssm:eu-west-1:100004:parameter/abc/apikey",
      "arn:aws:kms:eu-west-1:100004:key/*"
    ]
  }

Note that the error message already gave away a little of how to name that resource. Be aware that this way you potentially give more Decrypt access than you want...
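
For reference, the call that triggers these permissions is a plain SSM GetParameter with decryption. Here is a minimal sketch of such a lookup with the AWS SDK v1 (the SDK matching the exception class in the logs above; the parameter name is just the example from this post):

    import com.amazonaws.services.simplesystemsmanagement.AWSSimpleSystemsManagementClientBuilder
    import com.amazonaws.services.simplesystemsmanagement.model.GetParameterRequest

    fun readApiKey(): String {
        val ssm = AWSSimpleSystemsManagementClientBuilder.defaultClient()
        // withWithDecryption(true) is what additionally requires the kms:Decrypt permission
        val request = GetParameterRequest()
            .withName("/abc/apikey")
            .withWithDecryption(true)
        return ssm.getParameter(request).parameter.value
    }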

Misc:
Other tips to try if you ever have this Runtime.Unknown error (I did not try these myself): Instrumenting Java code in AWS Lambda.
And some more generic tips for troubleshooting during/before invocation.
And while executing: https://docs.aws.amazon.com/lambda/latest/dg/troubleshooting-execution.html
 

Friday, July 21, 2023

Too many open files in AWS Lambda serverless troubleshooting

Introduction

One of my Kotlin lambdas was throwing Too many open files exceptions and thus logging errors, but only nightly, when it got bursts of SQS messages to process. 



To find out what was causing these errors, I followed these steps:
  1. Read up on what can be the causes
  2. Determine what/where in the code is not closing the file descriptors
  3. Find a solution to the issue
  4. Then fix the issue
After reading up, it turned out that the cause can be actual files not getting closed, but open connections also use file descriptors, so they count towards the total number of open file descriptors (FDs).

I also found out that for AWS Lambda Serverless, the maximum number of open file descriptors is fixed at 1024. Normally on Linux systems you can modify that limit, e.g. with the ulimit command, but not in the lambda execution runtime. Thus a quick fix of increasing the open files limit wasn't possible.

Important to know too when analyzing this problem is that "... Lambda doesn't send more than one invocation at a time to the same container. The container is not used for a second invocation until the first one finishes. If a second request arrives while a first one is running, the second one will run in a different container. When Lambda finishes processing the first request, this execution environment can then process additional requests for the same function. In general, each instance of your execution environment can handle at most 10 requests per second. This limit applies to synchronous on-demand functions, as well as functions that use provisioned concurrency. If you're unfamiliar with this limit, you may be confused as to why such functions could experience throttling in certain scenarios." Partially from: https://docs.aws.amazon.com/lambda/latest/dg/lambda-concurrency.html  My addition: note that even at the above-mentioned 10 requests per second, those 10 requests are still handled sequentially!

Note that no events were lost in the end when the exceptions occurred; AWS lambda recovered by itself by providing the events again from the queue, since these were not processed due to the exception. It also scaled up the number of instances significantly, probably due to the burst and it detecting that messages were not getting processed sufficiently quick.

To determine the cause of the open file descriptors, I tried several options to find out how many and which files are opened by what:
  1. Try by using a Java MXBean
  2. Try the File Leak Detector library
  3. Try via Linux commands
  4. Force Garbage Collections
  5. Examine the code for potential spots where files and connections are reopened over and over again
Not tried, but could be an option to explore: track the network connections created, e.g. get the open connections count from the OkHttpClient, something like OkHttpClient().newBuilder().build().dispatcher().runningCallsCount() (see the sketch below).
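
A sketch of what that could look like (this assumes OkHttp 4; in real code you would inspect the client instance your application already uses, not a new one):

    import okhttp3.OkHttpClient

    val client = OkHttpClient()
    // The dispatcher tracks in-flight/queued calls, the connection pool tracks open (reusable) connections
    logger.info(
        "Running calls: ${client.dispatcher.runningCallsCount()}, " +
        "queued calls: ${client.dispatcher.queuedCallsCount()}, " +
        "pooled connections: ${client.connectionPool.connectionCount()}"
    )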

Setup
  • AWS lambda

  • Kotlin 1.8.10

  • Java 17

  • Retrofit2

  • IntelliJ

  • Gradle


Java MXBean open files detection

This option only shows the number of open file descriptors, not which part(s) of the lambda have a given file descriptor in use.

import java.lang.management.ManagementFactory
import java.lang.management.OperatingSystemMXBean
import com.sun.management.UnixOperatingSystemMXBean

val os: OperatingSystemMXBean = ManagementFactory.getOperatingSystemMXBean()
if (os is UnixOperatingSystemMXBean) {
  logger.info("Number of open fd: " + os.openFileDescriptorCount)  // smart cast applies here
}

Found via: https://stackoverflow.com/questions/16360720/how-to-find-out-number-of-files-currently-open-by-java-application

Note the call will fail at the moment the Too many files error starts to happen, because logging and many other calls require a file descriptor; and probably the MXBean call itself too. So all you can see is the number of open file descriptors increasing and increasing up to the exception.

File Leak Detector open files detection

I used v1.13 since v1.15 was not available on Maven Central.
First you have to get this library on the command line when starting the Lambda. But after supplying the java agent to the command line of the AWS lambda like this:

java -javaagent:lib/file-leak-detector.jar

the error during startup was:

Failed to find Premain-Class manifest attribute in lib/file-leak-detector-1.13.jar

Error occurred during initialization of VM

agent library failed to init: instrument

That error shows up because the MANIFEST.MF file is missing the Premain-Class entry, which tells the runtime what the main method is to start the agent.


I tried some other paths to verify the path was correct; if the path is incorrect you get a message like “Error opening zip file or JAR manifest missing”.

(note that I already had a -javaagent argument on the command line for Datadog. Both added caused the deployment to fail with a timeout; didn't further investigate why, I just removed that Datadog -javaagent for now)

And indeed, when I looked inside the MANIFEST.MF of the file-leak-detector-v1.13.jar, there was no such entry. I then downloaded the source code of the library from Github and noticed another jar file getting created: file-leak-detector-1.16-SNAPSHOT-jar-with-dependencies.jar

(note here I switched to v1.16-snapshot just to have the latest)

And in there, the Premain-Class is set!

Premain-Class: org.kohsuke.file_leak_detector.AgentMain

I then decided to add the new jar locally to the build of the lambda, for testing purposes, as described here: https://stackoverflow.com/questions/20700053/how-to-add-local-jar-file-dependency-to-build-gradle-file

The jar was put in the 'libs' directory which I created in the root directory of the (IntelliJ) Gradle project. Gradle dependency: implementation files('libs/file-leak-detector-1.16-SNAPSHOT-jar-with-dependencies.jar')

After that, the File Leak Detector started up fine, as can be seen from these messages:

File leak detector installed

Could not load field socket from SocketImpl: java.lang.NoSuchFieldException: socket

Could not load field serverSocket from SocketImpl: java.lang.NoSuchFieldException: serverSocket

Note the last two messages are due to Java17+ not allowing this anymore, you can find more details about this when searching for those exact error messages in the File Leak Detector source code.

I then did get SocketExceptions at the nightly runs too, like “Caused by: java.net.SocketException: Too many open files”, so I couldn't tell too much yet. It seems lib-file-leak-detector is then not dumping the open files, probably because of the above mentioned Java 17+ issue. Or something else went wrong; at least I couldn't see any dumps in AWS CloudWatch.

So I set up my own listener from the library, so I could then dump the open files whenever I wanted. It is possible, but no full example is given; the *Demo.java examples give some ideas away. Here's what I used:

logger.info("Is leak detector agent installed = " + Listener.isAgentInstalled()")
if (Listener.isAgentInstalled()) {
  try {
    val b = ByteArrayOutputStream()
    Listener.dump(b)
    logger.info("The current open files Listener dump = $b")
    val currentOpenFiles = Listener.getCurrentOpenFiles()
    logger.info("The current open files Listener list size = ${currentOpenFiles.size}")
    
    var jarFilesCounter = 0
    currentOpenFiles.forEach {
      when (it) {
        is Listener.FileRecord -> {
          if (!it.file.name.endsWith("jar")) {
            logger.info("File named " + it.file + " is opened by thread:" + it.threadName)
          } else {
            jarFilesCounter++
          }
        }
        else -> logger.info("Found record by Listener is not a file record, skipping")
      }
    }
    logger.info("Of the open files, $jarFilesCounter are .jar files, those were skipped in the logging of the list of files currently open")
    b.close()
  } catch (ex: Exception) {
    logger.error("Dump of current open files failed with exception: ", ex)
  }
} else {
  logger.info("Leak detector agent is not installed, so not dumping open files")
}
Note I skipped the jars during logging, which I noticed account for a lot of the open files listed.

The Listener.dump() lists all open files, instead of showing how many times a given file is opened. I couldn't find anything mentioning that the library supports this; it would be a very useful feature.

I noticed the open files count was always lower than when using the MXBean. My guess is that the MXBean does count the open Socket connections too. And thus is much more precise. 

Linux commands open files detection

There are two ways in Kotlin (and Java) to execute a command on the command line:

    p = Runtime.getRuntime().exec(command)

and

    val pb = ProcessBuilder(command, arguments)
    val startedProcess = pb.start()

I tried both ways. My goal was to use the 'lsof' command, but that was not available in the lambda runtime. Then I tried to get the user of the process, and the process ID of the lambda itself. Via /proc/<pid>/fd one could then find what files are kept open by a given PID.

These commands worked:

      val pb = ProcessBuilder("sh", "-c", "echo $$")

      val startedProcess = pb.start()


      val pb = ProcessBuilder("sh", "-c", "ls -al")
      val startedProcess = pb.start()

      val pb = ProcessBuilder("sh", "-c", "ls -al /proc")
      val startedProcess = pb.start()

      p = Runtime.getRuntime().exec("id -u $userName")


These didn't work:

      p = Runtime.getRuntime().exec("/proc/self/status")
      p = Runtime.getRuntime().exec("echo PID  = $$")
      p = Runtime.getRuntime().exec("$$")
      val pb = ProcessBuilder("echo", "$$")
      val pb = ProcessBuilder("$$")
      p = Runtime.getRuntime().exec(command)  // Exited with value 2, so probably invalid command

When I got the PID, I tried this to get the open FDs by this PID in different ways, but that failed:

      val pb = ProcessBuilder("sh", "-c", "ls -al /proc/$pidFound/fd")

After listing all files in the current directory via this command:

      val pb = ProcessBuilder("sh", "-c", "ls -al /proc")

I saw that none of the numbers (PIDs) listed there were matching the found PID!  At this point I stopped further exploring this option, since then I wouldn't find the open files in /proc/$pidFound/fd anyway....
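
For reference, this is roughly how the output of those ProcessBuilder commands can be captured and logged (a sketch; the command is just an example):

      // Run a shell command and log its combined stdout/stderr
      val pb = ProcessBuilder("sh", "-c", "ls -al /proc")
      pb.redirectErrorStream(true)
      val startedProcess = pb.start()
      val output = startedProcess.inputStream.bufferedReader().readText()
      val exitCode = startedProcess.waitFor()
      logger.info("Command exited with $exitCode, output:\n$output")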

Force GC

A theory why the Too many open files error is appearing was that the Java runtime doesn't get enough time to clean up (garbage collect) the opened file descriptors. 
So to test this theory, I forced a Garbage Collect after each 50 invocations of the lambda instance. Of course calling System.gc() doesn't fully guarantee it will happen right at that moment, e.g when the runtime is too busy it will happen later.
To cater for that I also added a Thread.sleep() call.  Yes this solution normally is a potential performance killer, but an option to verify the theory. This is the code I used:

    nrOfInvocationsOfThisLambdaInstance++
    if (nrOfInvocationsOfThisLambdaInstance % 50L == 0L) {
      logger.info("$nrOfInvocationsOfThisLambdaInstance % 50 == 0, so going to garbage collect")
      try {
        System.gc()
      } catch (e: Throwable) {
        logger.error("Unable to garbage collect due to: ${e.message}. Full details in the exception.", e)
      }
      logger.info("$nrOfInvocationsOfThisLambdaInstance % 50 == 0, so going to runFinalization")
      try {
        System.runFinalization()
      } catch (e: Throwable) {
        logger.error("Unable to runFinalization due to: ${e.message}. Full details in the exception.", e)
      }

      logger.info("Going to sleep() so it hopefully gets time to GC...")
      try {
        Thread.sleep(5000)
      } catch (i: InterruptedException) {
        // ignore
      }
    }

And indeed, the Too many open files error was gone!  But this couldn't really be the final acceptable solution of course.

Examine the code for keeping files open

See below.

Solution

So at this point I only had some idea of how many open files there were at a certain point. I saw the number using the MXBean solution go up to about 1023 and then the Too many open files error started to appear.
In the code I did find a spot where it was opening and reading a configuration file on each incoming request!  After moving that code into an object class (or init{} block, or val variable at class level), the Too many open files error started to appear already much later (as in: the number of open files count went up much slower and the exception occurred less).
So I was moving in the right direction! Also, the errors were all SocketExceptions now. After investigating the code more and more, I noticed the OkHttpClient was getting created each time an HTTP request to an external third party was made (which is also relatively slow of course). After also moving this part into an object class, the error was completely gone!
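
As an illustration, this is roughly what that last change looks like: the OkHttpClient is created once per lambda container in an object declaration and reused for every invocation (a sketch with made-up names, assuming OkHttp 4):

    import okhttp3.OkHttpClient
    import okhttp3.Request

    // Created once when the class is first loaded, instead of once per request,
    // so its connection pool (and thus file descriptors) are reused.
    object HttpClientHolder {
        val client: OkHttpClient = OkHttpClient()
    }

    fun callThirdParty(url: String): String? {
        val request = Request.Builder().url(url).build()
        HttpClientHolder.client.newCall(request).execute().use { response ->
            return response.body?.string()
        }
    }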

Conclusion: the tools gave some more insights on what was going on, and I learned quite few things on how/where/when file descriptors are used in lambdas, but in the end the problem was found during plain old code examination :)

Friday, April 21, 2023

OWASP Dependency Check plugin suppressions.xml examples

Introduction

One of the features of the OWASP dependency check plugin is to be able to suppress reported vulnerabilities, for example because they are false-positives for your configuration, or no new version is available yet, so you want to suppress the alert for a certain period of time.



Those suppressions you specify in the suppressions.xml file. The format is specified here.

But not all possibilities of suppressing have examples. Especially those where you just want to exclude a whole set of packages, e.g. everything of the Spring framework starting with 'spring-', like 'spring-webflux', 'spring-web' etc for a given version.

After some trial and error I came up with some additional useful examples.

Solution

Setup

  • Kotlin 1.8.10
  • Gradle
  • Spring Boot 2.7.9
  • failBuildOnCVSS set to 7
  • OWASP plugin versions tested: 7.2.1, 8.0.0

Examples

Reported vulnerabilities as HIGH

  • logback-core-1.3.0.jar
  • logback-classic-1.3.0.jar
Suppressions:
  • <packageUrl regex="true">^pkg:maven/ch\.qos\.logback/logback-core@1.3.*$</packageUrl>
    Will not show logback-core anymore in the report as HIGH.

  • <packageUrl regex="true">^pkg:maven/ch\.qos\.logback/logback.*@1.3.*$</packageUrl>
    Will report neither logback-core nor logback-classic as vulnerabilities anymore.
Full example of the suppression:

    <suppress until="2023-10-01Z">
        <notes><![CDATA[
        No new version exists yet for any version after this version.
        ]]></notes>
        <packageUrl regex="true">^pkg:maven/ch\.qos\.logback/logback.*@1.3.*$</packageUrl>
        <vulnerabilityName>CVE-2021-42550</vulnerabilityName>
    </suppress>


Reported vulnerabilities as HIGH

  • spring-webflux-5.3.25.jar
  • spring-messaging-5.3.25.jar
Suppressions:
    • <packageUrl regex="true">^pkg:maven/org\.springframework/spring-.*@5.3.25$</packageUrl>
      Will report neither of the two as vulnerabilities anymore.
    Notice the ".*" used!

    Full example of the suppression:

        <suppress until="2023-10-01Z">
            <notes><![CDATA[
            No new version exists yet for any version after 5.3.26, which has the same issue.
            ]]></notes>
            <packageUrl regex="true">^pkg:maven/org\.springframework/spring-.*@5.3.25$</packageUrl>
            <vulnerabilityName>CVE-2023-20860</vulnerabilityName>
            <vulnerabilityName>CVE-2016-1000027</vulnerabilityName>
        </suppress>


    And here are some examples that don't work:
    • <packageUrl regex="true">^pkg:maven/ch\.qos\.logback/logback-*@1.3.*$</packageUrl>
      Shows both logback-core and logback-classic again in the report.

    • <packageUrl regex="true">^pkg:maven/ch\.qos\.logback/logback*@1.3.*$</packageUrl>
      Shows both logback-core and logback-classic again in the report.
    Another thing to know from here: the vulnerabilities report can show an issue as MEDIUM, while the vulnerability scores 8.5 in the CVSSv2 ranking and the CVSSv3 rates it at 6.6. So the report seems to take only the CVSSv3 value into account for the Highest Severity level.


    Thursday, March 16, 2023

    Datadog: Malformed _X_AMZN_TRACE_ID value Root - also known as X-Amzn-Trace-Id

    Introduction

    Since 14 March 2023 suddenly my AWS lambdas started to log this error:

    datadog: Malformed _X_AMZN_TRACE_ID value: Root=1-6411cb3d-e6a0db584029dba86a594b7e;Parent=8c34f5ad8f92d510;Sampled=0;Lineage=f627d632:0

    Note that the lambda processing was finishing normally; this metrics logging to Datadog apparently happens in the background.



    Investigation

    After a lot of searching, I found out that the datadog-lambda-java library was causing the issue, since that same day the issue was reported here in its Github repository.
    That issue also seems to point to the code that is the culprit: the length != 3 check assumes that the trace field will always consist of exactly 3 parts. But its specification allows for more, and it seems AWS has added another part.
    The definition of the header can be found here and here, where the examples still have 3 elements (parts separated by a ';'), but there can now be 4.
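
    To illustrate the issue: a tolerant way to handle such a header value is to split it into key/value pairs instead of assuming exactly 3 parts. A sketch (not the library's actual code):

        // Parses e.g. "Root=1-6411cb3d-...;Parent=8c34f5ad8f92d510;Sampled=0;Lineage=f627d632:0"
        fun parseAmznTraceId(value: String): Map<String, String> =
            value.split(";")
                .mapNotNull { part ->
                    val kv = part.split("=", limit = 2)
                    if (kv.size == 2) kv[0].trim() to kv[1].trim() else null
                }
                .toMap()

        // Known keys are looked up by name; extra parts such as Lineage are simply ignored
        val parts = parseAmznTraceId("Root=1-6411cb3d-e6a0db584029dba86a594b7e;Parent=8c34f5ad8f92d510;Sampled=0;Lineage=f627d632:0")
        val root = parts["Root"]
        val sampled = parts["Sampled"]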

    Solution

    A patch has been posted, but as the 3rd comment says, the library is deprecated anyway. Here is the upgrade guide.
     
    UPDATE 24 March 2023: the patch has been applied and a new release has been made! See https://github.com/DataDog/datadog-lambda-java/pull/90 for the 1.4.10 release.

     
     
     

    Wednesday, February 15, 2023

    AWS SAM CLI FileNotFoundError: WinError 3: The system cannot find the path specified .class class Kotlin 1.7 Windows 10

    Introduction

    The AWS SAM CLI command 

    sam.cmd build MyFunction --template C:\techie\workspace\my-function\local\template.yaml --build-dir C:\techie\workspace\my-function\local\.aws-sam\build --debug 

    fails in an IntelliJ commandline terminal due to this 

    FileNotFoundError: [WinError 3] The system cannot find the path specified 

    error.



    Setup

    - Windows 10 Pro laptop

    - IntelliJ 2022

    - Kotlin 1.7

    - Java 11 at least for compilation

    - Serverless lambda written in Kotlin

    - AWS SAM CLI, version 1.67.0

    - AWS Toolkit plugin for IntelliJ 

    Investigation

    First I tried to install the AWS SAM CLI using Homebrew in WSL 1 (Ubuntu) under Windows 10 using these steps: 

    https://docs.aws.amazon.com/serverless-application-model/latest/developerguide/install-homebrew.html

    But that failed during Homebrew installation. Probably upgrading to WSL 2 would fix that. But then I also realized: IntelliJ then doesn't know about that at all, since SAM CLI is then installed in WSL, not in Windows.

    It kept on saying after running brew postinstall --verbose --debug gcc

    ==> Installing aws-sam-cli from aws/tap

    Error: An exception occurred within a child process:

      Errno::EFAULT: Bad address - /home/linuxbrew/.linuxbrew/bin/gcc-12

    And also:

    Warning: The post-install step did not complete successfully

    You can try again using:

      brew postinstall gcc

    Also trying brew postinstall --verbose --debug gcc didn't succeed.  This error mentioned here was also applicable: https://github.com/orgs/Homebrew/discussions/4052

    I also didn't dare wsl --update because other configurations I already had set up might fail after that. Guess I will do that at a more quiet moment :)

    So then I went for the manual installation in Windows, as found here: https://docs.aws.amazon.com/serverless-application-model/latest/developerguide/install-sam-cli.html

    In IntelliJ you then have to set the path to the executable to where you installed it:


    So IntelliJ will use the Windows AWS SAM CLI, not the one in the terminal (WSL 1).

    Then I ran my command, first outside IntelliJ to be able to control the parameters more easily:

    C:\Users\techie>C:\Amazon\AWSSAMCLI\bin\sam.cmd build MyFunction --template C:\techie\workspace\my-function\local\template.yaml --build-dir C:\techie\workspace\my-function\local\.aws-sam\build --debug

    But that gave this error:

    FileNotFoundError: [WinError 3] The system cannot find the path specified: 'C:\\Users\\techie\\AppData\\Local\\Temp\\tmpmjdhug40\\7c98ad184709dded6b1c874ece2a0edea9c55b0a\\build\\classes\\kotlin\\test\\com\\mycompany\\myfunction\\domain\\MyServiceTest$should register valid make this a long text to exceed this successfully$2.class'

    First I thought it was due to spaces in the test method name. But replacing them with an underscore didn't work either. Or maybe case sensitivity; but my test method name is all lowercase.

    I tried many things to exclude the whole test-class from the process, because why is it even included during the sam build command? 

    Options I tried were:

    1. Setting GRADLE_OPTS before running the command -x test, similar to the MAVEN_OPTS example here: https://github.com/aws/aws-sam-cli/issues/1105#issuecomment-777703158
    2. Excluding the test-file or even just all tests in build.gradle, like:

      jar {
          sourceSets {
              main {
                  java {
                      exclude '**/TestExcludeClass.java'
                  }

                  kotlin {
                      exclude '**/TestExcludeKotlinClass.kt'
                  }
              }
          }
      }

      Note that excluding everything with exclude '**/*.kt' did make the sam build fail, so the changes were taken into account.
    3. In build.gradle add: excludeTestsMatching "com.mycompany.myfunction.lambda.MyServiceTest"
    4. test.onlyIf { ! Boolean.getBoolean("skipTests") } and then specifying GRADLE_OPTS="-DskipTests=true"

    It seems the initial step of the sam build (JavaGradleWorkflow:JavaGradleCopyArtifacts) just takes all .class files anyway, even test classes.

    I tried turning off Telemetry. That had no effect on the error.

    Solution

    Then I found this comment: https://github.com/aws/aws-sam-cli/issues/4031#issuecomment-1173730737

    Could it be that the path is just too long? That's a typical Windows limit, so that could definitely be it.

    And yes, after applying the below command in PowerShell as an Administrator

    New-ItemProperty -Path "HKLM:\SYSTEM\CurrentControlSet\Control\FileSystem" -Name "LongPathsEnabled" -Value 1 -PropertyType DWORD -Force

    it worked!

    Detailed explanation: https://learn.microsoft.com/en-us/windows/win32/fileio/maximum-file-path-limitation?tabs=powershell#enable-long-paths-in-windows-10-version-1607-and-later

    The command ran fine! This is then the output:

    Build Succeeded

    Built Artifacts  : ..\..\techie\workspace\my-function\local\.aws-sam\build

    Built Template   : ..\..\techie\workspace\my-function\local\.aws-sam\build\template.yaml

    Commands you can use next

    =========================

    [*] Validate SAM template: sam validate

    [*] Invoke Function: sam local invoke -t ..\..\techie\workspace\my-function\local\.aws-sam\build\template.yaml

    [*] Test Function in the Cloud: sam sync --stack-name {{stack-name}} --watch

    [*] Deploy: sam deploy --guided --template-file ..\..\techie\workspace\my-function\local\.aws-sam\build\template.yaml



    Wednesday, December 21, 2022

    Where to find .gitattributes on Windows 10/11 using Git, IntelliJ 2022 and WSL Ubuntu to fix CRLF (\r\n) command not found: convert to LF line endings on checkout

    Introduction

    Problem: when checking out a project in IntelliJ in Windows, all files are checked out with Window's newline CRLF (\r\n).

    But if you then open a terminal in IntelliJ which runs WSL (Ubuntu) and you want to run a bash script like this shell script, you'll get this error:

    #!/bin/sh

    set -e

    [unrelated stuff deleted]

    It will fail with: 

    ./deploy.sh: line 2: $'\r': command not found

    : invalid optione 3: set: -

    set: usage: set [-abefhkmnptuvxBCHP] [-o option-name] [--] [arg ...]

    Those errors are caused by the shell script having CRLF instead of just LF, which Ubuntu/Mac OS expects.

    Sidenote: I made a softlink from /bin/sh to bash on my system because dash does not support several features, see here.


    So I tried setting IntelliJ's Code Style to use \n as the line separator, also for new projects, as mentioned here.

    But still, after creating a new project from Git, all files including the above script were set to CRLF. You can see this at the bottom of the screen in this screenshot:


    I also realised that I don't want all files to have just LF, because when I changed that manually I had to commit those files too, as the newlines changed. I didn't want to bother other team members on MacBooks with this unnecessary change; it's similar to what happens when you do a dos2unix on a file.
    So the next thing to try was to get Git on my machine to handle it correctly, and thus hopefully IntelliJ too.
    The .gitattributes file seemed a very good candidate. Another suggestion was to change the git config 'core.autocrlf', but if I understood correctly that would still change all the local files, which I didn't want.

    Solution

    It took a lot of effort to find out where the gitattributes file currently is, and I wanted to change it for all my Git commands in the future, not just for one project.
    What I wanted to change it to is mentioned here:

    # Convert to LF line endings on checkout.
    *.sh text eol=lf
    **/*.sh eol=lf

    Those lines specify that all line endings (newlines) of files ending with .sh should be checked out with an LF (\n).
    I added the 3rd line myself, to cover subdirectories, but maybe that was not needed.

    It was hard to find the correct location for the gitattributes file (or .gitattributes, which made it all the more confusing); this official Git text wasn't super-clear on it.

    Finally I found my gitattributes file on Windows here:

    C:\git\etc\gitattributes

    C:\git is actually where I installed the Git client. And it is the one IntelliJ uses (but not the one the terminal shell in IntelliJ uses!) 

    And there were already quite a few settings in there like:

    *.RTF diff=astextplain
    *.doc diff=astextplain
    *.DOC diff=astextplain

    I just appended these:

    **/*.sh eol=lf
    *.sh eol=lf

    Restarted IntelliJ to be sure. And yes, after a completely fresh checkout of my Git project, Java and Kotlin files still had CRLF line endings, and my shell script had the LF ending.

    Sunday, December 18, 2022

    Docker build with Git command running in CircleCI failing with: Fatal: No names found, cannot describe anything, invalid argument, for "-t, --tag" flag: invalid reference format

    Introduction

    Context: Docker, CircleCI, Github.

    The Docker build command 

    docker build -f .circleci/Dockerfile -t $AWS_ACCOUNT_ID.ecr.$AWS_DEFAULT_REGION.amazonaws.com/${CIRCLE_PROJECT_REPONAME}:`git describe --tags` -t $AWS_ACCOUNT_ID.ecr.$AWS_DEFAULT_REGION.amazonaws.com/${CIRCLE_PROJECT_REPONAME}:${CIRCLE_BUILD_NUM} -t $AWS_ACCOUNT_ID.ecr.$AWS_DEFAULT_REGION.amazonaws.com/${CIRCLE_PROJECT_REPONAME}:${CIRCLE_SHA1} .

    was failing with this message:

    fatal: No names found, cannot describe anything.

    invalid argument "********************************************/my-project:" for "-t, --tag" flag: invalid reference format

    See 'docker build --help'.

    Exited with code exit status 125

    Solution

    You'd expect the Docker command itself to be syntactically incorrect. But the error message refers to something else: it turns out the git describe --tags command gave the fatal message.
    The cause was that there was no git tag set at all on the Github project yet. After manually creating a release (including a tag) on Github and running the above build command again, the docker build command succeeded.

    Wednesday, November 30, 2022

    Flyway FlywayValidateException Validate failed: Migrations have failed validation. Detected failed migration to version 1 solution, please remove any half-completed changes then run repair to fix the schema history

    Introduction

    Setup:
    • Spring boot 2.7.4
    • Kotlin 1.7.20

    Flyway dependencies:

    plugins {
        id "org.flywaydb.flyway" version "9.6.0"
    }

    project(":application") {
      dependencies {

          dependencies {

              implementation "org.flywaydb:flyway-core:9.6.0"
              implementation "org.flywaydb:flyway-mysql:9.6.0"
      }
    }

    With only this Flyway script V1__initial_script present in the Spring Boot application:

    create table order
    (
        id                 varchar(255) primary key,
        orderId        bigint(20)    not null
    );

    create index order_id_idx on order(orderId);

    create table order_entry
    (
        id              varchar(255) primary key,
        orderid     bigint(20) not null,
        desc              varchar(255) not null,
        UNIQUE (oderId),
        constraint fk_orderId foreign key (orderId) references order (id)
    );

    gave this error:

    Invocation of init method failed; nested exception is org.flywaydb.core.api.exception.FlywayValidateException: Validate failed: Migrations have failed validation
    Detected failed migration to version 1 (initialize schema).
    Please remove any half-completed changes then run repair to fix the schema history.
    Need more flexibility with validation rules? Learn more: https://rd.gt/3AbJUZE

    See also the below screenshot:


    So the error is very vague. I also knew for sure that the schema script was the very first Flyway script to be run for that database, so it could not be the case that I edited the script after it had already run successfully before.

    Solution

    The error code documentation doesn't tell much either. No details at all. 

    But when I ran the contents of the above script on the SQL command line in DBeaver, I got the exact details of the problem:

    org.jkiss.dbeaver.model.sql.DBSQLException: SQL Error [3780] [HY000]: Referencing column 'orderId' and referenced column 'id' in foreign key constraint 'fk_orderId' are incompatible.
    at org.jkiss.dbeaver.model.impl.jdbc.exec.JDBCStatementImpl.executeStatement(JDBCStatementImpl.java:133)
    ...

    So Flyway just doesn't log the details of the error at all. I also tried the Flyway validate in a bean, to see if I could inspect the exception hopefully returned in the validateWithResult variable:

      @Bean
      fun myFlyway(dataSource: DataSource): Flyway {
        val flyway = Flyway.configure().dataSource(dataSource).load()
        val validateWithResult = flyway.validateWithResult()
        logger.info("Flyway validate = $validateWithResult")
        return flyway
      }

    But that only showed again the high level error:

    Migrations have failed validation
    Error code: FAILED_VERSIONED_MIGRATION
    Error message: Detected failed migration to version 1 (initialize schema). Please remove any half-completed changes then run repair to fix the schema history.
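
    If you want to inspect the individual migration results programmatically, something like this can be added to the bean above (a sketch against the Flyway 9 API; in my case it still only surfaced the same high-level error details as shown above):

      val result = flyway.validateWithResult()
      result.invalidMigrations.forEach { invalid ->
          // errorDetails holds the same FAILED_VERSIONED_MIGRATION code and message as above
          logger.info("Migration ${invalid.version} (${invalid.description}): " +
              "${invalid.errorDetails.errorCode} - ${invalid.errorDetails.errorMessage}")
      }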

    I found out that older Flyway versions used to have a bug of not showing the whole exception,
    but even in this recent version 9.6.0 the full exception is not logged.

    I also tried configuring Flyway with outputting JSON as mentioned here and here:

    flyway {
        outputType="json"
    }

    But outputType can't be used here. I didn't investigate further how to specify that parameter; the @Bean has no option for it. Maybe it is possible via application.yml...

    Though it seems it should be possible via configuration files and environment variables.

    Thus in the end the solution for me was to run the script manually against the database to see what the exact error is. Running repair is also not the correct suggestion, because the script just contained some syntax/semantic errors. It looks like a Flyway bug that the exact error details aren't logged by default, and I couldn't get it to do that either.