caddy/caddytls
Matthew Holt 0e34c7c970
tls: Fix background certificate renewals that use TLS-SNI challenge
The loop which performs renewals in the background obtains a read lock
on the certificate cache map, so that it can be safely iterated. Before
this fix, it performed the renewals while holding that read lock. This has been
fine, except that the TLS-SNI challenge, when invoked after Caddy has
already started, requires adding a certificate to the cache. Doing this
requires an exclusive write lock. But it cannot obtain a write lock
because a read lock is already held higher in the stack, while the loop
iterates. In other words, it's a deadlock.
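
The lock ordering problem described above can be sketched in a few lines of Go. This is only an illustration: certCache, tryAdd and renewWhileIterating are hypothetical names, not Caddy's actual identifiers, and TryLock (Go 1.18+) stands in for the blocking Lock call so that the example reports the conflict instead of hanging.

```go
package main

import (
	"fmt"
	"sync"
)

// certCache is a hypothetical stand-in for a certificate cache
// guarded by a sync.RWMutex.
type certCache struct {
	mu    sync.RWMutex
	certs map[string]string
}

// tryAdd attempts to take the exclusive write lock and add a certificate.
// It returns false if the write lock is unavailable. A blocking Lock()
// in its place would wait forever when the caller already holds RLock.
func (c *certCache) tryAdd(name, cert string) bool {
	if !c.mu.TryLock() {
		return false
	}
	defer c.mu.Unlock()
	c.certs[name] = cert
	return true
}

// renewWhileIterating mimics the pre-fix shape: the "renewal" (which
// needs to add a challenge certificate) runs inside the read-locked loop.
// It returns how many additions were blocked by the read lock.
func renewWhileIterating(c *certCache) int {
	blocked := 0
	c.mu.RLock()
	defer c.mu.RUnlock()
	for name := range c.certs {
		if !c.tryAdd(name+".challenge", "temporary") {
			blocked++
		}
	}
	return blocked
}

func main() {
	c := &certCache{certs: map[string]string{
		"a.example": "cert-a",
		"b.example": "cert-b",
	}}
	fmt.Println("blocked inside read lock:", renewWhileIterating(c))
	fmt.Println("added after read lock released:", c.tryAdd("c.example", "cert-c"))
}
```

Every attempt to write inside the loop is refused because the read lock is still held; the same write succeeds once the read lock has been released.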

I was able to reproduce this issue consistently locally, after jumping
through many hoops to force a renewal in a short time that bypasses
Let's Encrypt's authz caching. I was also able to verify that by queuing
renewals (like we do deletions and OCSP updates), lock contention is
relieved and the deadlock is avoided.
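
The queuing approach mentioned above might look roughly like the following Go sketch. It is not the actual code in maintain.go: certCache, the needsRenewal map and renewQueued are hypothetical, and the "renewal" is reduced to clearing a flag. The point is only the shape: collect work under the read lock, release it, then do the work that needs the write lock.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// certCache is a hypothetical cache; the map value records whether
// the named certificate is due for renewal.
type certCache struct {
	mu           sync.RWMutex
	needsRenewal map[string]bool
}

// renewQueued queues the names that need renewal while holding only the
// read lock, releases it, and then processes the queue. Each renewal is
// free to take the write lock because no read lock is held any longer.
func renewQueued(c *certCache) []string {
	var queue []string

	c.mu.RLock()
	for name, due := range c.needsRenewal {
		if due {
			queue = append(queue, name)
		}
	}
	c.mu.RUnlock()

	// Map iteration order is random; sort for a stable result.
	sort.Strings(queue)

	for _, name := range queue {
		c.mu.Lock() // stands in for work that must modify the cache
		c.needsRenewal[name] = false
		c.mu.Unlock()
	}
	return queue
}

func main() {
	c := &certCache{needsRenewal: map[string]bool{
		"a.example": true,
		"b.example": false,
		"c.example": true,
	}}
	fmt.Println("renewed:", renewQueued(c))
}
```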

This only affects background renewals where the TLS-SNI(-01) challenge
is used. Users report seeing strange errors in the logs after this
happens ("tls: client offered an unsupported, maximum protocol version
of 301"), but I was not able to reproduce these locally. I was also not
able to reproduce the leak of sockets which are left in CLOSE_WAIT.
I am not sure if those are symptoms of running in production on Linux
and are related to this bug, or not.

Either way, this is an important fix. I do not yet know the ripple
effects this will have on other symptoms we've been chasing. But it
definitely resolves a deadlock during renewals.
2017-01-21 14:39:36 -07:00
storagetest Refactor and improve TLS storage code (related to locking) 2016-09-19 17:24:34 -06:00
certificates.go tls: Fix background certificate renewals that use TLS-SNI challenge 2017-01-21 14:39:36 -07:00
certificates_test.go fix typo 2016-08-09 14:57:17 +09:00
client.go Add support for OCSP Must-Staple for Let's Encrypt certs (#1221) 2016-10-29 08:44:49 -06:00
client_test.go Rewrote Caddy from the ground up; initial commit of 0.9 branch 2016-06-04 17:00:29 -06:00
config.go Add support for OCSP Must-Staple for Let's Encrypt certs (#1221) 2016-10-29 08:44:49 -06:00
config_test.go Refactor and improve TLS storage code (related to locking) 2016-09-19 17:24:34 -06:00
crypto.go Minor text fixes ;) 2016-08-23 15:47:23 -06:00
crypto_test.go Remove dead code, do struct alignment, simplify code 2016-10-25 19:19:54 +02:00
filestorage.go Refactor and improve TLS storage code (related to locking) 2016-09-19 17:24:34 -06:00
filestorage_test.go Pluggable TLS Storage (#913) 2016-07-08 07:32:31 -06:00
handshake.go Remove dead code, do struct alignment, simplify code 2016-10-25 19:19:54 +02:00
handshake_test.go Rewrote Caddy from the ground up; initial commit of 0.9 branch 2016-06-04 17:00:29 -06:00
httphandler.go Set listenHost to localhost if empty; fixes test on Windows 2016-12-23 10:28:00 -07:00
httphandler_test.go ACME challenge proxy now accounts for ListenHost (bind); fixes #1296 2016-12-23 09:40:03 -07:00
maintain.go tls: Fix background certificate renewals that use TLS-SNI challenge 2017-01-21 14:39:36 -07:00
setup.go Add support for OCSP Must-Staple for Let's Encrypt certs (#1221) 2016-10-29 08:44:49 -06:00
setup_test.go Add support for OCSP Must-Staple for Let's Encrypt certs (#1221) 2016-10-29 08:44:49 -06:00
storage.go Refactor and improve TLS storage code (related to locking) 2016-09-19 17:24:34 -06:00
tls.go Refactor and improve TLS storage code (related to locking) 2016-09-19 17:24:34 -06:00
tls_test.go Refactor and improve TLS storage code (related to locking) 2016-09-19 17:24:34 -06:00
user.go Fix small misspellings 2017-01-10 13:09:24 -08:00
user_test.go tls: Improve flaky test depending on CPU scheduling (I think) 2016-11-28 23:37:22 -07:00