
Thursday, December 28, 2017

Logback DBAppender sometimes gives error on AWS Aurora: IllegalStateException: DBAppender cannot function if the JDBC driver does not support getGeneratedKeys method *and* without a specific SQL dialect

Logback DBAppender IllegalStateException


Sometimes when starting a Spring Boot application with Logback DBAppender configured for PostgreSQL or AWS Aurora in logback-spring.xml, it gives this error:

java.lang.IllegalStateException: Logback configuration error detected: ERROR in ch.qos.logback.core.joran.spi.Interpreter@22:16 - RuntimeException in Action for tag [appender] java.lang.IllegalStateException: DBAppender cannot function if the JDBC driver does not support getGeneratedKeys method *and* without a specific SQL dialect

The error is quite confusing. According to the documentation, Logback should be able to detect the dialect from the driver class.

But apparently it sometimes doesn't. After investigating, it turns out that this error is also given when the driver can't connect to the database. Without a connection, Logback can't read the database metadata it uses to detect the dialect, so you get this error in that case too!
A confusing error message indeed.
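
Since the real cause is often a failed connection, a quick way to see it is to try the same JDBC URL outside Logback. Below is a minimal sketch using only java.sql. The class name JdbcPreflight, the URL and the credentials are made up for illustration, and it assumes the JDBC driver is on the classpath. It reads the same kind of metadata (product name, getGeneratedKeys support) that Logback, as far as I can tell, relies on for dialect detection.

```java
import java.sql.Connection;
import java.sql.DatabaseMetaData;
import java.sql.DriverManager;
import java.sql.SQLException;

/** Hypothetical helper, not part of Logback: checks what DBAppender needs from the JDBC driver. */
final class JdbcPreflight {

    /** Returns a one-line diagnosis instead of throwing, so it can be printed before Logback starts. */
    static String check(String url, String user, String password) {
        try (Connection connection = DriverManager.getConnection(url, user, password)) {
            DatabaseMetaData metaData = connection.getMetaData();
            return "connected: product=" + metaData.getDatabaseProductName()
                    + ", supportsGetGeneratedKeys=" + metaData.supportsGetGeneratedKeys();
        } catch (SQLException e) {
            // This is the underlying problem that the DBAppender message does not show.
            return "connection failed: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        // Placeholder URL and credentials; use the values from logback-spring.xml instead.
        System.out.println(check("jdbc:postgresql://localhost:5432/logs", "user", "secret"));
    }
}
```

If this prints "connection failed: ...", fix the URL, credentials or network access first; the DBAppender error should then disappear as well.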

A suggestion in some post was to specify the <sqlDialect> tag, but that is no longer needed in recent Logback versions. In fact, recent versions reject it: putting it in the logback-spring.xml file, either below <password> or below <connectionSource>, gives one of these errors:

ERROR in ch.qos.logback.core.joran.spi.Interpreter@25:87 - no applicable action for [sqlDialect], current ElementPath  is [[configuration][appender][connectionSource][dataSource][sqlDialect]]
or
ERROR in ch.qos.logback.core.joran.spi.Interpreter@27:79 - no applicable action for [sqlDialect], current ElementPath  is [[configuration][appender][sqlDialect]]
To get a better error message, it's better to set up the Logback DBAppender in code instead of in logback-spring.xml. For examples, see here and here.




Wednesday, December 9, 2015

com.datastax.driver.core.exceptions.NoHostAvailableException: All host(s) tried for query failed (tried: /10.255.235.17 (Timeout during read)

In a recent project, this exception occurred seemingly at random when running a CQL 'select' statement from a Spring Boot application against Cassandra:

com.datastax.driver.core.exceptions.NoHostAvailableException: All host(s) tried for query failed (tried: /10.255.235.17 (Timeout during read), /10.255.235.16 (Timeout during read))
...


After a lot of research, it turned out that some people had reported the same issue, but there was no clear answer anywhere, except that some Cassandra driver versions might be the cause: they mark (all) the node(s) as down and don't notice when a node becomes available again.

But the strange thing is that we have over 10 (micro)services running, all with at least 2 instances, and only one of these services had this timeout problem. So it almost couldn't be the driver... It did seem to be related to not using the connection for a while, though, because our end-to-end tests often just ran fine, time after time. But after a few hours, the tests would just fail. At that point we didn't see the pattern yet...

As a test, we decided to let nobody use the environment against which the end-to-end tests run for a few hours, especially because some of the articles below mention setting the heartbeat (keep-alive) of the driver as a solution.

And indeed, the end-to-end tests started failing again after that quiet period. Then we realized it: all our services have a Spring Boot health check implemented, which is called every X seconds. EXCEPT the service that had the timeouts; it had only recently been connected to Cassandra!

After adding a health check to that service as well, the error disappeared! Of course, depending on the health check to keep a connection alive is not the ideal solution. A better solution is probably to set the heartbeat interval on the driver when creating the Cluster:

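// C# driver syntax (using Cassandra;); the Java driver's PoolingOptions has comparable settings.
var poolingOptions = new PoolingOptions()
  .SetCoreConnectionsPerHost(HostDistance.Local, 1)
  .SetHeartBeatInterval(10000); // in milliseconds
var cluster = Cluster
  .Builder()
  .AddContactPoints(hosts)
  .WithPoolingOptions(poolingOptions)
  .Build();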


In the end, the culprit was the firewall, which resets all TCP connections every two hours!
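
To make the keep-alive idea concrete: what the health check did by accident, and what a driver heartbeat does on purpose, is to run a cheap probe at an interval shorter than the firewall timeout. Below is a minimal sketch with plain java.util.concurrent, assuming the firewall only drops connections that have been idle. The KeepAlive class and the probe are hypothetical; in a real service the probe would be something like a trivial CQL query, and the driver's own heartbeat setting is still the better option.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

/** Hypothetical illustration of a heartbeat: runs a probe periodically so a connection never sits idle. */
final class KeepAlive implements AutoCloseable {

    // Daemon thread, so a forgotten KeepAlive never keeps the JVM from exiting.
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor(runnable -> {
                Thread thread = new Thread(runnable, "keep-alive");
                thread.setDaemon(true);
                return thread;
            });

    KeepAlive(Runnable probe, long interval, TimeUnit unit) {
        scheduler.scheduleWithFixedDelay(() -> {
            try {
                probe.run();
            } catch (RuntimeException e) {
                // An uncaught exception would cancel all future runs, so swallow it here
                // (a real service would log it).
            }
        }, interval, interval, unit);
    }

    @Override
    public void close() {
        scheduler.shutdownNow();
    }

    public static void main(String[] args) throws InterruptedException {
        // Demo probe: just counts beats; a real probe would touch the database connection.
        CountDownLatch threeBeats = new CountDownLatch(3);
        try (KeepAlive keepAlive = new KeepAlive(threeBeats::countDown, 10, TimeUnit.MILLISECONDS)) {
            System.out.println("three heartbeats seen: " + threeBeats.await(2, TimeUnit.SECONDS));
        }
    }
}
```

The interval just has to be comfortably below the firewall's cut-off, so with a two-hour reset even a probe every few minutes would be enough.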

References

Tips to analyse the problem:

Similar error reports