Wednesday, December 9, 2015

com.datastax.driver.core.exceptions.NoHostAvailableException: All host(s) tried for query failed (tried: /10.255.235.17 (Timeout during read)

In a recent project, seemingly randomly, this exception occurred when doing a CQL 'select' statement from a Spring Boot project to Cassandra:

com.datastax.driver.core.exceptions.NoHostAvailableException: All host(s) tried for query failed (tried: /10.255.235.17 (Timeout during read), /10.255.235.16 (Timeout during read))
...


After a lot of research, some people seemed to have reported the same issue. But no clear answer anywhere. Except that some Cassandra driver versions might be the cause of it: they mark (all) the node(s) as down and don't recognize it when it becomes available again.

But, the strange this is we have over 10 (micro) services running, all running at with least 2 instances. But only one of these services had this timeout problem. So it almost couldn't be the driver.... Though it did seem to be related with not using the connection for a while, because often our end-to-end tests just ran fine, time after time. But after a few hours, the tests would just fail. Then we didn't see the pattern yet...

But, as a test, we decided to let nobody use the environment against which the end-to-end tests run for a few hours; especially also because some of the below articles do mention as a solution to set the heartbeat (keep-alive) of the driver.

And indeed, the end-to-end tests started failing again after the grace period. Then we realized it: all our services have a Spring Boot health-check implemented, which is called every X seconds. EXCEPT the service that has the timeouts; it only recently got connected with Cassandra!

After fixing that, the error disappeared! Of course depending on the healthcheck for a connection staying alive is not the ideal solution. A better solution is probably setting the heartbeat interval on the driver on Cluster creation:

var poolingOptions = new PoolingOptions()
  .SetCoreConnectionsPerHost(1)
  .SetHeartBeatInterval(10000);
var cluster = Cluster
  .Builder()
  .AddContactPoints(hosts).
  .WithPoolingOptions(poolingOptions)
  .Build();


In the end it was the firewall which resets all TCP connections every two hours!

References

Tips to analyse the problem:

Similar error reports



Wednesday, December 2, 2015

iOS9 / iOS 9 / iOS 9.1 / ATS 9: An SSL error has occurred and a secure connection to the server cannot be made due to old sha1 signed certificate

In iOS 9 and higher apps, a higher level of ciphers is required for a certificate for Forward Secrecy.

Before iOS 9, it was possible to let a site within a webview forward/redirect to another SSL protected site.
For example it was possible to let another site redirect to this one in iOS 8: https://www.securesuite.co.uk/

But since iOS 9 it is not allowed anymore and you'll get an error like:

An SSL error has occurred and a secure connection to the server cannot be made." UserInfo={NSUnderlyingError=0x7f9855dcb520 {Error Domain=kCFErrorDomainCFNetwork Code=-1200 "An SSL error has occurred and a secure connection to the server cannot be made." UserInfo={_kCFStreamErrorDomainKey=3, NSLocalizedRecoverySuggestion=Would you like to connect to the server anyway?, _kCFNetworkCFStreamSSLErrorOriginalValue=-9802, _kCFStreamPropertySSLClientCertificateState=0, NSLocalizedDescription=An SSL error has occurred and a secure connection to the server cannot be made., NSErrorFailingURLKey=https://www.securesuite.co.uk


You'll just see a blank/white screen in the webview; no errors or whatsoever on the screen.

The by-default supported list in iOS 9 and higher can be found here: https://developer.apple.com/library/prerelease/ios/documentation/General/Reference/InfoPlistKeyReference/Articles/CocoaKeys.html#//apple_ref/doc/uid/TP40009251-SW35

You can test easily whether a webpage/site is accepted at this site: https://www.ssllabs.com/ssltest/analyze.html

So when run for the above mentioned securesuite site, it tells us even in red the signature is still using the old SHA-1:


The site also even checks different clients like browsers and mobile operating systems like Android and iOS. And see the error in the Handshake Simulation section for ATS 9/ iOS 9: Client requires SHA2 certificate signatures

Running it against https://www.mastercard.com, which has SHA-2 as signature algorithm, the forwarding does work in iOS 8 and ios 9+: 
And the iOS 9 client also likes it:

To still be able to have iOS 9 and higher apps work with those less-secure sites which still use SHA-1, you can specify which domains are "ok-ish", i.e whitelist per domain. 
In the sections in https://developer.apple.com/library/prerelease/ios/documentation/General/Reference/InfoPlistKeyReference/Articles/CocoaKeys.html#//apple_ref/doc/uid/TP40009251-SW35 you can read how to whitelist: "Allowing Insecure Connection to a Single Server" and "Allowing Lowered Security to a Single Server" and "Using ATS For Your Servers and Allowing Insecure Connections Elsewhere".