Proactively Monitoring TLS Certificate Deployment

Sep 15, 2022

2022-09-15 Update

This is a copy of my article from Datto’s engineering blog, originally published on 2021-09-13. At the time of this update, the original page returns an HTTP 502. The original text can be found on the Internet Archive.

Unless you’re using the ACME protocol with a certificate authority such as Let’s Encrypt, you’re probably well aware of the annoyance of certificate rotation. Here at Datto, we use certificates in many places with a validity period of around a year, depending on the certificate authority. Last February, we noticed that several production hosts were serving expired certificates for one of our major Internet-facing domains - a mistake that many other companies make as well. This caused several problems, and once the immediate issues were addressed, we decided to take a far more proactive stance in monitoring certificates for all of our TLS-enabled services. I will not dive into why the certificates weren’t properly rotated, but rather into what we’re doing from now on so that this sort of issue never occurs again.

If you search for something along the lines of “https certificate monitor” with your Internet search engine of choice, you’ll be met with no shortage of results. However, every result that I evaluated had one thing in common - the user needed to provide a static list of fully qualified domain names to be checked. For a company that is constantly in flux (and constantly adding new domain names to its DNS zones), I don’t see how this method would succeed unless the list were manually updated whenever a new domain name was added.

Even discounting the use of static lists, external certificate monitoring services may have other issues. For one, external services won’t have visibility into private networks. Secondly, if your network has load balancers that don’t terminate TLS sessions, a scanner that is unaware of them will most likely accept whichever backend answers first and never check the certificates served by the others.

Finding a Source of Truth

Given the above problems, I set out to build my own solution. I wanted to derive a perfect mapping of IP -> hostname(s) information for all subdomains of a given company-managed domain. The solution? Ask your local DNS server.

The tool would have to perform the following tasks:

Discover all DNS zones that we serve
Fetch all DNS records for each discovered zone
Resolve all CNAME records to IP addresses
For each fully qualified domain name that resolves to a given IP address (through any A or CNAME records), initialize a TLS session on each user-specified port and check whether or not the served certificate is valid

My implementation of this process does the above by taking in a list of DNS zone names and their authoritative nameservers, as defined in a local configuration file. From there, it requests a zone transfer (AXFR) for each zone from its authoritative nameserver. Note that the host running this tool must be approved to request zone transfers from each nameserver (unless it is working from the JSON output of a previous zone transfer). This privilege should not be taken lightly. Although zone transfers are extremely useful in this context, they’re also a fantastic resource for any attacker who can communicate with the DNS server. Be vigilant in your DNS server configuration, and keep a lean list of hosts approved to request zone transfers.
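To make the mapping step concrete, here is a minimal sketch of turning a flat list of zone records into an IP -> hostname(s) map by following CNAME chains. It is not the tool's actual code: the `(name, type, value)` record format, the function name, and the sample data are assumptions for illustration, and the records are assumed to have already been obtained (for example, from a zone transfer).

```python
from collections import defaultdict

# Hypothetical record format: (owner name, record type, value).
# The real tool's JSON layout is not shown in this article.
RECORDS = [
    ("www.example.com.", "CNAME", "lb.example.com."),
    ("lb.example.com.", "A", "192.0.2.10"),
    ("api.example.com.", "A", "192.0.2.10"),
    ("mail.example.com.", "A", "192.0.2.25"),
    ("ext.example.com.", "CNAME", "cdn.example.net."),  # target outside our zones
]


def map_ips_to_names(records, max_depth=10):
    """Return {ip: sorted names that reach it through A/AAAA or CNAME records}."""
    addresses = defaultdict(set)  # name -> IPs from A/AAAA records
    cnames = {}                   # alias -> canonical target
    for name, rtype, value in records:
        name = name.lower()
        if rtype in ("A", "AAAA"):
            addresses[name].add(value)
        elif rtype == "CNAME":
            cnames[name] = value.lower()

    ip_to_names = defaultdict(set)
    for name in set(addresses) | set(cnames):
        target, depth = name, 0
        # Follow the CNAME chain, with a depth cap so loops terminate.
        while target in cnames and depth < max_depth:
            target = cnames[target]
            depth += 1
        # Names whose chain ends outside the known records are skipped.
        for ip in addresses.get(target, ()):
            ip_to_names[ip].add(name)
    return {ip: sorted(names) for ip, names in ip_to_names.items()}


result = map_ips_to_names(RECORDS)
```

In this sketch, a CNAME that points outside the transferred zones (like `ext.example.com.` above) is dropped because there is no address record to resolve it against; a real implementation would need to resolve such targets through a regular DNS query.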

Evaluating TLS Certificates

With an export of all DNS records from zone transfers, we’re now ready to start checking TLS certificates. The project is written in Python, so I started to evaluate my options for validating certificates. Many of the SSL libraries that I investigated didn’t seem to tell me why a given certificate chain was invalid, or which certificate in the chain was found to be invalid (if applicable). For example, Python’s ssl library simply raises SSLErrors, and the included reason is typically not very useful.
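For reference, a standard-library-only check might look like the sketch below. It connects to a specific IP address while sending the hostname via SNI, so each address behind a name can be checked individually. The function names and the `(ip, port, fqdn)` tuple layout are assumptions for illustration, not the tool's code. Note that a failed verification surfaces as a single exception with one short reason string, with no indication of which certificate in the chain was at fault.

```python
import socket
import ssl


def build_targets(ip_to_names, ports):
    """Expand {ip: [fqdn, ...]} into (ip, port, fqdn) tuples to probe."""
    return [
        (ip, port, fqdn.rstrip("."))
        for ip, names in sorted(ip_to_names.items())
        for fqdn in names
        for port in ports
    ]


def probe(ip, port, fqdn, timeout=5.0):
    """Handshake with a specific IP while sending fqdn as the SNI hostname.

    Returns (ok, message): the certificate's notAfter string on success,
    or the failure reason otherwise.
    """
    context = ssl.create_default_context()  # host trust store, hostname check on
    try:
        with socket.create_connection((ip, port), timeout=timeout) as sock:
            with context.wrap_socket(sock, server_hostname=fqdn) as tls:
                return True, tls.getpeercert()["notAfter"]
    except ssl.SSLCertVerificationError as exc:
        # One short reason string; nothing about which chain member failed.
        return False, exc.verify_message
    except (ssl.SSLError, OSError) as exc:
        return False, str(exc)


targets = build_targets(
    {"192.0.2.10": ["api.example.com.", "www.example.com."]}, [443, 8443]
)
```

Calling `probe(*target)` for each entry in `targets` would perform the actual network checks; the addresses above are documentation placeholders and will not answer.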

There’s also pyOpenSSL, which seems to be of use when you want to validate a single certificate or have the entire certificate chain on hand. I was looking for a more turn-key approach. In a perfect world, the library I was searching for would tell me exactly why the certificate chain was found to be invalid, with little effort on my part.

As I searched for more libraries, I came across SSLyze. Not only can this library evaluate whole TLS certificate chains and give you verbose error messages about certificate errors, but it also has a plethora of plugins. Of particular note to us is the CERTIFICATE_INFO ScanCommand, which provides detailed information about the host’s certificate chain. Here’s the output of using this ScanCommand on an HTTPS server, via SSLyze’s command-line interface:

$ sslyze 'datto.com:443' --certinfo

CHECKING HOST(S) AVAILABILITY
 -----------------------------

   datto.com:443                       => 184.50.211.211 


 SCAN RESULTS FOR DATTO.COM:443 - 184.50.211.211
 -----------------------------------------------

 * Certificates Information:
       Hostname sent for SNI:             datto.com
       Number of certificates detected:   1

     Certificate #0 ( _RSAPublicKey )
       SHA1 Fingerprint:                  670582095c617956e986e330785ae85f164d60e8
       Common Name:                       *.datto.com
       Issuer:                            DigiCert SHA2 Secure Server CA
       Serial Number:                     17247695375665899256019678832627664436
       Not Before:                        2021-05-05
       Not After:                         2022-05-10
       Public Key Algorithm:              _RSAPublicKey
       Signature Algorithm:               sha256
       Key Size:                          2048
       Exponent:                          65537
       DNS Subject Alternative Names:     ['*.datto.com', 'datto.com']

     Certificate #0 - Trust
       Hostname Validation:               OK - Certificate matches server hostname
       Android CA Store (9.0.0_r9):       OK - Certificate is trusted
       Apple CA Store (iOS 14, iPadOS 14, macOS 11, watchOS 7, and tvOS 14):OK - Certificate is trusted
       Java CA Store (jdk-13.0.2):        OK - Certificate is trusted
       Mozilla CA Store (2021-01-24):     OK - Certificate is trusted
       Windows CA Store (2021-02-08):     OK - Certificate is trusted
       Symantec 2018 Deprecation:         OK - Not a Symantec-issued certificate
       Received Chain:                    *.datto.com --> DigiCert SHA2 Secure Server CA
       Verified Chain:                    *.datto.com --> DigiCert SHA2 Secure Server CA --> DigiCert Global Root CA
       Received Chain Contains Anchor:    OK - Anchor certificate not sent
       Received Chain Order:              OK - Order is valid
       Verified Chain contains SHA1:      OK - No SHA1-signed certificate in the verified certificate chain

     Certificate #0 - Extensions
       OCSP Must-Staple:                  NOT SUPPORTED - Extension not found
       Certificate Transparency:          OK - 3 SCTs included

     Certificate #0 - OCSP Stapling
       OCSP Response Status:              SUCCESSFUL
       Validation w/ Mozilla Store:       OK - Response is trusted
       Responder Key Hash:                b'\x0f\x80a\x1c\x821a\xd5/(\xe7\x8dF8\xb4,\xe1\xc6\xd9\xe2'
       Cert Status:                       GOOD
       Cert Serial Number:                17247695375665899256019678832627664436
       This Update:                       2021-09-05
       Next Update:                       2021-09-12


 SCAN COMPLETED IN 0.46 S
 ------------------------

Of note, the CERTIFICATE_INFO ScanCommand checks the certificate chain against multiple trust stores, instead of just using the host’s built-in trust store. This information may be useful in the fringe case that a given trust store distrusts one of our certificates, but another does not.

Putting it All Together

As mentioned above, I wanted my program to start by collecting information from DNS zone transfers, and end by checking TLS certificates for all identified hosts, for each mapped IP address (be it through A or CNAME record). DNS Certificate Checker does just this.

In the following execution, the program takes in a DNS zone transfer JSON and checks all of the corresponding TLS services as defined in the local configuration file. Of course, the program can run without a local zone transfer JSON and fetch the AXFR results at run-time, but my machine doesn’t have privileges to make AXFR queries to the DNS servers in question, so I’m using results from a previous zone transfer.

$ python3.8 dns_cert_checker.py --from-zones-json datto.com.json -o out.csv

The output file, called out.csv in this execution, will contain one row for each warning or error, as (predominantly) determined by SSLyze. If the certificate is valid but within a configured time to expiration, a warning is raised. Otherwise, if SSLyze raises an exception or otherwise labels the certificate as invalid, a detailed error is raised. These findings can be seen in either the log output or the output CSV in this format:

+------------+------+------------+---------+---------------------------------------------------+
| ip_address | port | fqdn       | status  | message                                           |
+------------+------+------------+---------+---------------------------------------------------+
| [REDACTED] | 443  | [REDACTED] | warning | "certificate "[REDACTED]" expiring at [REDACTED]" |
| [REDACTED] | 443  | [REDACTED] | error   | subject does not match hostname                   |
| [REDACTED] | 443  | [REDACTED] | error   | unable to get local issuer certificate            |
| [REDACTED] | 443  | [REDACTED] | error   | certificate has expired                           |
| [REDACTED] | 443  | [REDACTED] | error   | certificate chain does not have valid order       |
| [REDACTED] | 443  | [REDACTED] | error   | self signed certificate                           |
| [REDACTED] | 443  | [REDACTED] | error   | BUG_IN_SSLYZE                                     |
+------------+------+------------+---------+---------------------------------------------------+
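
The warning/error split described above can be sketched with the standard library alone. This is an illustration, not the tool's code: the 30-day threshold, the function names, and the exact message wording are assumptions, and in the real tool the error text comes (predominantly) from SSLyze.

```python
import csv
import io
from datetime import datetime, timedelta, timezone

FIELDS = ["ip_address", "port", "fqdn", "status", "message"]


def classify(ip, port, fqdn, not_after, error=None,
             warn_within=timedelta(days=30), now=None):
    """Return a finding row (dict), or None if the certificate needs no attention.

    `error` is a validation error string reported by the scanner, if any.
    `warn_within` is an assumed default; the real threshold is configurable.
    """
    now = now or datetime.now(timezone.utc)
    if error is None and not_after <= now:
        error = "certificate has expired"
    if error is not None:
        status, message = "error", error
    elif not_after - now <= warn_within:
        status, message = "warning", f"certificate expiring at {not_after:%Y-%m-%d}"
    else:
        return None
    return {"ip_address": ip, "port": port, "fqdn": fqdn,
            "status": status, "message": message}


def write_findings(rows, handle):
    """Write one CSV row per finding, skipping healthy certificates (None)."""
    writer = csv.DictWriter(handle, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(row for row in rows if row is not None)


now = datetime(2021, 9, 13, tzinfo=timezone.utc)
rows = [
    classify("192.0.2.10", 443, "www.example.com",
             datetime(2021, 10, 1, tzinfo=timezone.utc), now=now),
    classify("192.0.2.11", 443, "api.example.com",
             datetime(2022, 5, 10, tzinfo=timezone.utc), now=now),
    classify("192.0.2.25", 443, "mail.example.com", None,
             error="self signed certificate", now=now),
]
buffer = io.StringIO()
write_findings(rows, buffer)
```

Here the first certificate is inside the warning window, the second is healthy and produces no row, and the third carries a scanner-reported error.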

Shortcomings

DNS is a great source of truth, but it does not know everything about the hosts to which it points. For example, hosts with wildcard records may serve any number of certificates depending on which hostname they’re using for a given connection. My program doesn’t identify wildcards, and instead checks the * subdomain as if it were the sole valid name for the record. Conversely, if you truly wanted to evaluate the certificate practices for a host with a wildcard record, you’d need to check every possible hostname for the wildcard record.
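
One possible mitigation, which the tool does not currently implement, is to detect wildcard records and probe a handful of representative hostnames in their place. The sketch below assumes a user-supplied list of sample labels; the function names and the `sample_labels` parameter are hypothetical.

```python
def is_wildcard_record(name):
    """True if the leftmost label of a DNS owner name is '*'."""
    return name.rstrip(".").split(".")[0] == "*"


def expand_wildcards(names, sample_labels=("www",)):
    """Replace each wildcard record with concrete sample hostnames to probe.

    This cannot cover every hostname a wildcard might serve; it only swaps
    the literal '*' label for names a client could plausibly request.
    """
    expanded = []
    for name in names:
        bare = name.rstrip(".")
        if is_wildcard_record(name):
            _, _, base = bare.partition(".")
            if base:  # ignore a bare "*" with no parent domain
                expanded.extend(f"{label}.{base}" for label in sample_labels)
        else:
            expanded.append(bare)
    return expanded


hosts = expand_wildcards(
    ["*.apps.example.com.", "www.example.com."], sample_labels=("a", "b")
)
```

The trade-off is that coverage depends entirely on how well the sample labels reflect the hostnames actually in use.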

Your DNS zones may have A or CNAME records that point to hosts that are outside of your control. This tool will provide expiration warnings and certificate errors for such hosts, as it has no idea which records are truly owned by the organization.

Conclusion

At this point, the program is only used for evaluating certificates for standing errors, proximity to expiration, and lack of trust from a given trust store. However, there’s no reason that it can’t be expanded to process findings from other SSLyze plugins, such as checking supported cipher suites against a user-configurable list of acceptable options. If desired, SSLyze can even check for SSL/TLS vulnerabilities such as Heartbleed and ROBOT.

By outputting findings to both a Python logger and (if specified) a CSV, DNS Certificate Checker makes it easy to send its findings to human-alerting mechanisms, which in turn makes it far easier to act on certificates that are misconfigured or close to expiring.

My primary takeaway from this exercise is that it is worth spending far more time looking for the right library or program for your task than writing a solution from scratch. SSLyze is a fantastic library, and if I hadn’t found it, I’d likely have spent far too long pursuing worse solutions.

If you’re interested in using this for your own organization, you can find the source code for DNS Certificate Checker at our GitHub project.