Proactively Monitoring TLS Certificate Deployment
2022-09-15 Update
This is a copy of my article from Datto’s engineering blog, originally published on 2021-09-13. As of this update, the original page responds with an HTTP 502 error. The original text can be found on the Internet Archive.
Unless you’re using the ACME protocol with a certificate authority such as Let’s Encrypt, you’re probably well aware of the annoyance of certificate rotation. Here at Datto, we use certificates in many places with a validity period of around a year, depending on the certificate authority. Last February, we noticed that several production hosts were serving expired certificates for one of our major Internet-facing domains - a mistake that many other companies make as well. This caused several problems, and once the issues were addressed, we decided to take a much more proactive stance in monitoring certificates for all of our TLS-enabled services. I will not dive into the details of why the certificates weren’t properly rotated, but rather into what we’re doing from now on so that this sort of issue never occurs again.
If you search for something along the lines of “https certificate monitor” with your Internet search engine of choice, you’ll be met with no shortage of results. However, every result that I evaluated had one thing in common - the user needed to provide a static list of fully qualified domain names to be checked. For a company that is constantly in flux (and constantly adding new domain names to its DNS zones), I’m not sure how this method would succeed unless the list were manually updated whenever a new domain name was added.
Even discounting the use of static lists, external certificate monitoring services may have other issues. For one, external services won’t have visibility into private networks. Secondly, if your network has load balancers that don’t terminate TLS sessions, any unaware scanner will most likely trust the first answer of the load balancer.
Finding a Source of Truth
With the above in mind, I wanted to derive a perfect mapping of IP -> hostname(s) information for all subdomains of a given company-managed domain. The solution? Ask your local DNS server.
To that end, I set out to build my own solution. It would have to perform the following tasks:
- Discover all DNS zones that we serve
- Fetch all DNS records for each discovered zone
- Resolve all CNAME records to IP addresses
- For each fully qualified domain name that resolves to a given IP address (through any A or CNAME records), initialize a TLS session on each user-specified port and check whether or not the served certificate is valid
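The mapping step in the list above can be sketched in a few lines of standard-library Python. This is only an illustration under stated assumptions: the record format (name, type, value) and the function name map_ips_to_hostnames are hypothetical and not taken from the project, only A and CNAME records are considered, and CNAME targets that are not present in the supplied records are silently skipped (a real tool would need to resolve them).

```python
from collections import defaultdict


def map_ips_to_hostnames(records, max_depth=10):
    """Map each IP address to every hostname that reaches it via A or CNAME records.

    `records` is an iterable of (name, record_type, value) tuples; this shape
    is an assumption made for the example.
    """
    a_records = defaultdict(set)  # hostname -> set of IP addresses
    cnames = {}                   # alias -> target hostname

    for name, rtype, value in records:
        name = name.rstrip(".").lower()
        value = value.rstrip(".").lower()
        if rtype == "A":
            a_records[name].add(value)
        elif rtype == "CNAME":
            cnames[name] = value

    ip_to_hosts = defaultdict(set)
    for name in set(a_records) | set(cnames):
        # Follow the CNAME chain, with a depth limit to guard against loops.
        target, depth = name, 0
        while target in cnames and depth < max_depth:
            target = cnames[target]
            depth += 1
        # .get() avoids creating empty entries for unresolved targets.
        for ip in a_records.get(target, ()):
            ip_to_hosts[ip].add(name)

    return {ip: sorted(hosts) for ip, hosts in ip_to_hosts.items()}
```

Keying the result by IP address keeps every hostname that lands on a given host together, so each (IP, hostname) pair can later be scanned with the right SNI value.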
My implementation of this process does the above by taking in a list of DNS zone names and their authoritative nameservers. From there, it requests zone transfers (AXFRs) from each authoritative nameserver, according to a local configuration file. Note that the host running this tool must be approved to request zone transfers from each nameserver (unless using output of a previous zone transfer in JSON format). This privilege should not be taken lightly. Although zone transfers are extremely useful in this context, they’re also a fantastic resource for any attackers that can communicate with the DNS server. Be vigilant in your DNS server configuration, and keep a lean list of hosts approved to request zone transfers.
Evaluating TLS Certificates
With an export of all DNS records from zone transfers, we’re now ready to start checking TLS certificates. The project is written in Python, so I started evaluating my options for validating certificates. Many of the SSL libraries that I investigated didn’t seem to tell me why a given certificate chain was invalid, or which certificate in the chain was found to be invalid (if applicable). For example, Python’s ssl library simply raises SSLErrors, and the included reason is typically not very useful.
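For comparison, here is a minimal sketch of what a check with only the standard library might look like. The function name check_certificate and its (ok, message) return shape are assumptions for illustration and are not part of the project. A failed handshake surfaces as a single exception for the whole connection, checked only against the host's own trust store, which is the limitation described above.

```python
import socket
import ssl


def check_certificate(hostname, port=443, timeout=5.0):
    """Return (ok, message) after attempting a verified TLS handshake."""
    # The default context verifies the chain and the hostname against
    # the trust store of the machine running the check.
    context = ssl.create_default_context()
    try:
        with socket.create_connection((hostname, port), timeout=timeout) as sock:
            with context.wrap_socket(sock, server_hostname=hostname):
                return True, "certificate chain and hostname verified"
    except ssl.SSLCertVerificationError as exc:
        # One message for the whole handshake; no per-certificate detail.
        return False, f"verification failed: {exc.verify_message}"
    except ssl.SSLError as exc:
        return False, f"TLS error: {exc.reason}"
    except OSError as exc:
        # Covers refused connections, timeouts, and DNS failures.
        return False, f"connection error: {exc}"
```

Calling check_certificate("example.com") would return a tuple such as (True, "...") or (False, "verification failed: ..."), depending on what the server presents.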
There’s also pyOpenSSL, which seems to be of use when you want to validate a single certificate or have the entire certificate chain on hand. I was looking for a more turn-key approach. In a perfect world, the library I was searching for would tell me exactly why the certificate chain was found to be invalid, with little effort on my part.
As I searched for more libraries, I came across SSLyze. Not only can this library evaluate whole TLS certificate chains and give you verbose error messages about certificate errors, but it also has a plethora of plugins. Of particular note to us is the CERTIFICATE_INFO ScanCommand, which provides detailed information about the host’s certificate chain. Here’s the output of using this ScanCommand on an HTTPS server, via SSLyze’s command-line interface:
$ sslyze 'datto.com:443' --certinfo
CHECKING HOST(S) AVAILABILITY
-----------------------------
datto.com:443 => 184.50.211.211
SCAN RESULTS FOR DATTO.COM:443 - 184.50.211.211
-----------------------------------------------
* Certificates Information:
Hostname sent for SNI: datto.com
Number of certificates detected: 1
Certificate #0 ( _RSAPublicKey )
SHA1 Fingerprint: 670582095c617956e986e330785ae85f164d60e8
Common Name: *.datto.com
Issuer: DigiCert SHA2 Secure Server CA
Serial Number: 17247695375665899256019678832627664436
Not Before: 2021-05-05
Not After: 2022-05-10
Public Key Algorithm: _RSAPublicKey
Signature Algorithm: sha256
Key Size: 2048
Exponent: 65537
DNS Subject Alternative Names: ['*.datto.com', 'datto.com']
Certificate #0 - Trust
Hostname Validation: OK - Certificate matches server hostname
Android CA Store (9.0.0_r9): OK - Certificate is trusted
Apple CA Store (iOS 14, iPadOS 14, macOS 11, watchOS 7, and tvOS 14):OK - Certificate is trusted
Java CA Store (jdk-13.0.2): OK - Certificate is trusted
Mozilla CA Store (2021-01-24): OK - Certificate is trusted
Windows CA Store (2021-02-08): OK - Certificate is trusted
Symantec 2018 Deprecation: OK - Not a Symantec-issued certificate
Received Chain: *.datto.com --> DigiCert SHA2 Secure Server CA
Verified Chain: *.datto.com --> DigiCert SHA2 Secure Server CA --> DigiCert Global Root CA
Received Chain Contains Anchor: OK - Anchor certificate not sent
Received Chain Order: OK - Order is valid
Verified Chain contains SHA1: OK - No SHA1-signed certificate in the verified certificate chain
Certificate #0 - Extensions
OCSP Must-Staple: NOT SUPPORTED - Extension not found
Certificate Transparency: OK - 3 SCTs included
Certificate #0 - OCSP Stapling
OCSP Response Status: SUCCESSFUL
Validation w/ Mozilla Store: OK - Response is trusted
Responder Key Hash: b'\x0f\x80a\x1c\x821a\xd5/(\xe7\x8dF8\xb4,\xe1\xc6\xd9\xe2'
Cert Status: GOOD
Cert Serial Number: 17247695375665899256019678832627664436
This Update: 2021-09-05
Next Update: 2021-09-12
SCAN COMPLETED IN 0.46 S
------------------------
Of note, the CERTIFICATE_INFO ScanCommand checks the certificate chain against multiple trust stores, instead of just using the host’s built-in trust store. This information may be useful in the fringe case that a given trust store distrusts one of our certificates, but another does not.
Putting it All Together
As mentioned above, I wanted my program to start by collecting information from DNS zone transfers, and end by checking TLS certificates for all identified hosts, for each mapped IP address (be it through A or CNAME record). DNS Certificate Checker does just this.
In the execution below, the program takes in a DNS zone transfer JSON file and checks all of the corresponding TLS services as defined in the local configuration file. Of course, the program can also run without a local zone transfer JSON file and fetch the AXFR results at run-time, but my machine doesn’t have privileges to make AXFR queries to the DNS servers in question, so I’m using results from a previous zone transfer.
$ python3.8 dns_cert_checker.py --from-zones-json datto.com.json -o out.csv
The output file, called out.csv in this execution, will contain one row for each warning or error, as (predominantly) determined by SSLyze. If the certificate is valid but within a configured time to expiration, a warning is raised. Otherwise, if SSLyze raises an exception or labels the certificate as invalid, a detailed error is raised. These findings can be seen in either the log output or the output CSV in this format:
+------------+------+------------+---------+---------------------------------------------------+
| ip_address | port | fqdn       | status  | message                                           |
+------------+------+------------+---------+---------------------------------------------------+
| [REDACTED] | 443  | [REDACTED] | warning | "certificate "[REDACTED]" expiring at [REDACTED]" |
| [REDACTED] | 443  | [REDACTED] | error   | subject does not match hostname                   |
| [REDACTED] | 443  | [REDACTED] | error   | unable to get local issuer certificate            |
| [REDACTED] | 443  | [REDACTED] | error   | certificate has expired                           |
| [REDACTED] | 443  | [REDACTED] | error   | certificate chain does not have valid order       |
| [REDACTED] | 443  | [REDACTED] | error   | self signed certificate                           |
| [REDACTED] | 443  | [REDACTED] | error   | BUG_IN_SSLYZE                                     |
+------------+------+------------+---------+---------------------------------------------------+
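The warning-versus-error logic and the CSV layout above can be illustrated with a short standard-library sketch. The function names, the 30-day default threshold, and the exact message wording are assumptions made for this example; the actual tool takes its threshold from configuration and gets most of its error messages from SSLyze.

```python
import csv
import io
from datetime import datetime, timedelta, timezone

# Column names taken from the table above.
FIELDS = ["ip_address", "port", "fqdn", "status", "message"]


def classify_expiry(common_name, not_after, now, warn_within=timedelta(days=30)):
    """Return (status, message), or None when the certificate needs no attention.

    The 30-day default is a hypothetical value for illustration.
    """
    if not_after <= now:
        return "error", "certificate has expired"
    if not_after - now <= warn_within:
        return "warning", f'certificate "{common_name}" expiring at {not_after.isoformat()}'
    return None


def write_findings(findings, stream):
    """Write finding dicts (keyed by FIELDS) to a file-like object as CSV."""
    writer = csv.DictWriter(stream, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(findings)
```

Passing an open file instead of an io.StringIO buffer to write_findings would produce an out.csv-style file with one row per finding.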
Shortcomings
DNS is a great source of truth, but it does not know everything about the hosts to which it points. For example, hosts with wildcard records may serve any number of certificates depending on which hostname is used for a given connection. My program doesn’t identify wildcards, and instead checks the * subdomain as if it were the sole valid name for the record. To truly evaluate the certificate practices of a host with a wildcard record, you’d need to check every possible hostname covered by that record.
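One possible way to narrow this gap, sketched here purely as an illustration and not as something the tool does today, is to separate wildcard records from concrete ones and probe each wildcard with a caller-supplied list of labels. The function names and the idea of a sample label list are hypothetical.

```python
def split_wildcards(names):
    """Separate wildcard record names (e.g. '*.example.com') from concrete ones."""
    wildcards, concrete = [], []
    for name in names:
        name = name.rstrip(".").lower()
        (wildcards if name.startswith("*.") else concrete).append(name)
    return wildcards, concrete


def expand_wildcard(wildcard, sample_labels):
    """Build concrete hostnames to probe by substituting each label for the '*'."""
    suffix = wildcard[2:]  # drop the leading '*.'
    return [f"{label}.{suffix}" for label in sample_labels]
```

For example, expand_wildcard("*.example.com", ["app", "mail"]) yields app.example.com and mail.example.com. This still only samples the namespace; it cannot prove that every possible hostname serves a valid certificate.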
Your DNS zones may have A or CNAME records that point to hosts that are outside of your control. This tool will provide expiration warnings and certificate errors for such hosts, as it has no idea which records are truly owned by the organization.
Conclusion
At this point, the program is only used for evaluating certificates for standing errors, proximity to expiration, and lack of trust from a given trust store. However, there’s no reason that it can’t be expanded to process findings from other SSLyze plugins, such as checking supported cipher suites against a user-configurable list of acceptable options. If desired, SSLyze can even check for SSL/TLS vulnerabilities such as Heartbleed and ROBOT.
Because DNS Certificate Checker outputs findings to both a Python logger and, if specified, a CSV file, those findings can easily be sent to human-alerting mechanisms, which makes it far easier to act on certificates that are misconfigured or close to expiring.
My primary takeaway from this exercise is that finding the right library or program for a task deserves far more of your time than writing a solution from scratch. SSLyze is a fantastic library, and if I hadn’t found it, I’d likely have spent far too long pursuing worse solutions.
If you’re interested in using this for your own organization, you can find the source code for DNS Certificate Checker at our GitHub project.