Sorry, Troy, HTTPS is NOT Easy!

Bad news, everyone. Troy Hunt lied to us! …Kind of, sort of… Okay, maybe not really.

This post is about the difficulties I had when moving my website from HTTP to HTTPS.

For anyone who might not know, Troy Hunt is behind a lot of security resources, including haveibeenpwned.com which allows users to check if their details have been compromised in a data breach. He’s trusted the world over, with many government organisations and financial institutions also making use of Troy’s resources.

So, when he launched httpsiseasy.com – a quick guide to securing your website with an SSL certificate – I mentally bookmarked it for the time when I might want to switch to HTTPS, trusting that, with this guide, it would indeed be easy. Unfortunately, that wasn’t the case for me.

Context

To give some context to this tale, the website that you’re on now (www.cassandrahl.com) is the one I wanted to secure, and have now secured, after some struggles that I’ve detailed below. I considered it a fairly simple website, until I came across numerous problems that I might not have had, had it really been so simple.

Why Move to HTTPS?

I’ve had this website for about three years now. When I first started it, I thought about setting it up as HTTPS, but I honestly didn’t know how successful the website would be or how long I would want to keep it running. I took a YAGNI (you ain’t gonna need it) approach and decided to go without, despite having seen warnings that it might be tricky to switch later. And to be really honest, I didn’t want to pay for something extra that I didn’t think I would ever need.

I decided to switch to HTTPS for the launch of Identity Stories – a collaborative blog series that depends on submissions from other people. Even though I planned to collect submissions using a secure Google Form, I wanted to secure the site too, to give contributors full confidence in the security of their data and submissions.

Website Setup and Features

It’s a self-hosted WordPress site, which means that I have a domain and hosting provider that’s separate to WordPress. I use CPanel to manage the web host, which includes the software Softaculous that I then use to install and run WordPress. So there are already three tools in use before adding CloudFlare – the SSL provider recommended in Troy’s guide. I was also using a multi-site installation of WordPress. I have no recollection of why I set it up this way, since I only have one WordPress site.

The website isn’t static or purely informational. It’s a blog with features for social sharing and commenting, plus a contact form that sends messages to the website email address. These little extras resulted in some unexpected consequences that I’ll share towards the end of the post.

Why HTTPS Wasn’t Easy for Me

I started by following the first three of four parts in the guide (technically, only the first should be necessary). This was fairly easy to do, considering that the videos clearly show all the steps to go through. I was following the instructions of someone I trust, so I didn’t do what I usually would and look into details of all available options at each step before selecting one. This made things a lot quicker to set up initially, but the time it took to actually get things working was a lot longer. Almost two weeks. Not so easy after all.

Um, Did It Work? (Only Accessible via HTTP)

According to Troy’s videos, once the SSL is showing as active on CloudFlare, the new DNS settings should be propagated and the site should be secure. But this was not the case. I confirmed the DNS changes via whois, cleared my browser’s cache, tried different browsers, tried a different device, and even flushed my device’s DNS settings. Nothing worked – the site was still only accessible via HTTP, and trying to access the HTTPS URL directly in Chrome returned a browser error that the connection wasn’t private. This was the first of many errors.

It didn’t make sense. Everything looked like it should be working, and I had already turned on the setting for “Always Use HTTPS”. It shouldn’t have been possible to access the HTTP version.

I Googled the error, searched official documentation, and scoured the CloudFlare community forums. Not only did I not find any useful information, but I discovered that a lot of CloudFlare’s own documentation and UI is outdated and refers to settings or elements that have changed or no longer exist. When I found a page that could have useful information on what to look for in your host management system, there was a disclaimer along the lines of, “this method won’t work with CPanel,” with absolutely no other information to help CPanel users.

Eventually, I tagged onto someone else’s forum post where other people had been offered solutions directly, and asked for any ideas on how to solve my issue. Seemingly, HTTPS was working for at least the person who responded, as was the redirect. It was weird, but at least it was working for someone.

I thought that maybe I just needed to be more patient. But that didn’t make sense either, given that I’d already cleared caches and DNS settings. What else was there to wait for? Four days later, without doing anything different, HTTPS finally started working for me. Kind of.

Where Did It Go? (Intermittent Loss of Access)

It turns out that HTTP-only access was the least of my problems. After I was finally able to access the HTTPS version of the site, it disappeared. Then it reappeared and disappeared again, about twenty times. I hadn’t changed any of the settings in CPanel or CloudFlare and didn’t understand why the site had suddenly become intermittently inaccessible.

More Browser Errors, and Broken Redirects

I decided it was time to reach out to CloudFlare’s support team. While I waited for a response from a real person, their automated response suggested that adding a certificate to my origin server – part four in Troy’s guide – might help. I’d previously avoided doing this because it’s not something I had experience with, and it seemed more error prone. I didn’t want to make a mistake and mess anything up. But at this point, things were already messed up, so I struggled a bit more with official documentation – or lack thereof – and eventually got it set up correctly.

That didn’t help. I still had intermittent issues accessing my site, and saw various browser errors along the way, including:

NET::ERR_CERT_COMMON_NAME_INVALID
NET::ERR_CERT_AUTHORITY_INVALID
SEC_ERROR_UNKNOWN_ISSUER
and others that I didn’t make a note of anywhere
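
For anyone decoding these: NET::ERR_CERT_COMMON_NAME_INVALID is about names rather than encryption. It means the certificate the browser received doesn't list the hostname in the address bar. The other two mean the browser doesn't trust whoever issued the certificate. The Python sketch below illustrates only the name check, using a simplified wildcard rule (a `*.` pattern covers exactly one extra label). The function names and the `example.test` hostnames are made up for illustration, and real browsers apply more rules than this.

```python
def hostname_matches(pattern: str, hostname: str) -> bool:
    """Simplified check of one certificate name against a hostname."""
    pattern = pattern.lower().rstrip(".")
    hostname = hostname.lower().rstrip(".")
    if pattern.startswith("*."):
        suffix = pattern[1:]  # "*.example.test" -> ".example.test"
        if not hostname.endswith(suffix):
            return False
        left = hostname[: -len(suffix)]
        # The wildcard stands for exactly one label: "www" is fine, "a.b" is not.
        return left != "" and "." not in left
    return pattern == hostname


def cert_covers(cert_names: list[str], hostname: str) -> bool:
    """True if any name listed on the certificate covers the hostname."""
    return any(hostname_matches(name, hostname) for name in cert_names)
```

With this rule, a certificate listing only `example.test` would not cover `www.example.test`, which is one way the www and naked versions of a site can behave differently. That is a general illustration, not a diagnosis of what happened here.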

I noticed that different browsers returned different error information, and that the redirects between http, http://www, https and https://www no longer worked. Trying to access each of these returned different results too. Then I noticed a difference between trying to access the website over WiFi or mobile network as well. The discrepancies were starting to pile up.

HSTS Preload

I started to panic a little. What was going on? Why couldn’t I access the site?

At some point, I stumbled across a Google support page that warned against switching on HSTS Preload straight away, and recommended gradually increasing the term only after HTTPS access is shown to be stable, or else risk irreversibly prolonging any issues for visitors. I realised that I’d already switched on this setting at the maximum twelve-month term, as in Troy’s guide. What had I done? Had I accidentally made the website inaccessible for as long as twelve months for some visitors? Is that how HSTS works? I honestly don’t know, but I didn’t have time to stop and research it – I needed to reverse it!

Thankfully, my request on hstspreload.org was still pending and I was able to remove the site from the preload list. Disaster averted; now back to the immediate access issue.
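
To answer the "is that how HSTS works?" question in general terms: HSTS is a response header (`Strict-Transport-Security`) that tells a browser to use only HTTPS for the site for `max-age` seconds after it last saw the header. A long `max-age` on a broken HTTPS setup can therefore keep returning visitors locked out until it expires, or until the site manages to serve them a working response with a shorter value. Preloading goes further by shipping the domain inside browsers' built-in lists. Here's a rough Python sketch that parses such a header and checks it against the submission requirements published on hstspreload.org (a `max-age` of at least one year, plus `includeSubDomains` and `preload`). The function names are illustrative and nothing here is taken from the settings described in this post.

```python
PRELOAD_MIN_MAX_AGE = 31_536_000  # one year in seconds, per hstspreload.org


def parse_hsts(header_value: str) -> dict:
    """Parse a Strict-Transport-Security header value into a small dict."""
    policy = {"max_age": None, "include_subdomains": False, "preload": False}
    for part in header_value.split(";"):
        name, _, value = part.strip().partition("=")
        name = name.strip().lower()
        value = value.strip().strip('"')
        if name == "max-age" and value.isdigit():
            policy["max_age"] = int(value)
        elif name == "includesubdomains":
            policy["include_subdomains"] = True
        elif name == "preload":
            policy["preload"] = True
    return policy


def preload_eligible(policy: dict) -> bool:
    """Check the three header requirements for a preload submission."""
    return (
        policy["max_age"] is not None
        and policy["max_age"] >= PRELOAD_MIN_MAX_AGE
        and policy["include_subdomains"]
        and policy["preload"]
    )
```

A cautious rollout might start with something like `max-age=300` (five minutes) and only raise it once HTTPS has been stable for a while. `preload_eligible(parse_hsts("max-age=300"))` is false, so a short trial value like that can't end up on the preload list by accident.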

More Redirects, and Response Codes

I spent literally hours speaking to both CloudFlare and CPanel support personnel. After a lot of back and forth, the only real cause for the intermittent access issues that they could think of was a problem with redirects. I used Varvy to try and see what redirects were present on the four URL variations. Again, I was getting a different result almost every time. However, I did notice a pattern. There was usually one URL with no redirects, two with one redirect and one URL with two redirects. The final response codes were also inconsistent and varied between a 200 and 301 – regardless of whether I could actually access the site in a real browser.

Out of interest, and because I’d been experimenting with Postman recently, I decided to set up tests against the four URLs in Postman, to see if that might quickly tell me whether each URL was accessible at a given time. Again – different results. Most of the time, I got a 200 response even if I couldn’t actually access the site in the browser; other times I just got an error in the collection runner that the site wasn’t available.

Automation did not help me!
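
One likely reason a tool can report a 200 while a browser struggles is that many HTTP clients follow redirects automatically and only show the final response. The Python sketch below takes the opposite approach: it makes one request at a time, never follows a redirect on its own, and records every hop. It also notes the `X-Redirect-By` header, which, as far as I know, WordPress 5.1 and later add to redirects they issue. The function names are my own, and `fetch` is injectable so the tracing logic can be tried without touching the network.

```python
import http.client
from urllib.parse import urljoin, urlsplit

REDIRECT_CODES = (301, 302, 303, 307, 308)


def fetch_once(url: str, timeout: float = 10.0):
    """Send a single HEAD request; http.client never follows redirects."""
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=timeout)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=timeout)
    try:
        conn.request("HEAD", parts.path or "/")
        response = conn.getresponse()
        headers = {key.lower(): value for key, value in response.getheaders()}
        return response.status, headers
    finally:
        conn.close()


def trace_redirects(url: str, fetch=fetch_once, max_hops: int = 10) -> list:
    """Follow redirects by hand and return one record per request made."""
    hops = []
    for _ in range(max_hops):
        status, headers = fetch(url)
        location = headers.get("location")
        hops.append({
            "url": url,
            "status": status,
            "location": location,
            "redirect_by": headers.get("x-redirect-by"),
        })
        if status not in REDIRECT_CODES or not location:
            break
        url = urljoin(url, location)  # Location may be relative
    return hops
```

Calling `trace_redirects("http://cassandrahl.com/")` would then list each hop with its status code, so a 301 followed by a 200 is visible as two separate steps instead of being collapsed into the final 200.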

Again with the Redirects

Apart from the CloudFlare “Always Use HTTPS” setting, I didn’t have any redirects set up in CloudFlare or CPanel, so we all started trying to figure out where a problematic redirect might be coming from. For clarity, the goal was to have all traffic going to https://www.cassandrahl.com – the HTTPS and www. version.

A cURL command ($ curl -svo /dev/null https://cassandrahl.com) revealed that the redirect was coming from WordPress. I didn’t have any redirects set up in WordPress directly, or via any plugins, but we checked them all, just in case.

We also:

- Checked the wpkw_links table in the database (where Namecheap / CPanel support said any redirects would be) – this was completely empty and therefore not the source of the redirect
- Checked the .htaccess file for any HTTP to HTTPS or HTTPS to HTTP redirects – there were none
- Changed the siteurl and home values in the wpkw_options database table to the naked domain – that broke the site and it failed to load with a “redirects are not working properly” error
- Went into the wp-config.php file of the multi-site WordPress files and changed the domain there to the naked domain; same result as above
- Changed the WordPress multi-site installation to a single-site installation (CPanel thought this might make the redirect easier to find; it did not), then changed the initial profile URL in the CPanel Softaculous installer, and the WordPress UI settings (WordPress Address (URL) and Site Address (URL)), to the naked domain – the site was still up and running, but with “https://cassandrahl.com” in the browser address bar instead of “https://www.cassandrahl.com”. I’ve always had Google Webmaster tools set up to prefer www (which doesn’t rely on redirects), so this essentially broke that
- Installed the CloudFlare WordPress plugin, which promises to fix redirect loops – this just showed the same “Automatic HTTPS Rewrites” option to fix mixed media issues that I already had turned on directly in my CloudFlare account

We still couldn’t figure out where the problematic redirect was coming from. Then I had an idea. The cURL request we’d used before showed details of the redirect on https://cassandrahl.com. I decided to run this command against all four versions of the URL and note the results after each change I tried. As the Varvy tool showed previously, the redirect only existed on three of the four variations – all but the target URL (https://www.cassandrahl.com). This is also the URL that was used in the WordPress UI settings, which I later discovered are directly linked to the siteurl and home values in the wpkw_options database table, but with some validation to avoid site breakages.

When I changed this to “https://cassandrahl.com” and re-ran the cURL commands, I found that this displaced the redirect to the https://www URL. It didn’t solve the issue, but it did confirm where the redirect was coming from. It just wasn’t obvious because those settings aren’t labelled as redirects within the WordPress UI.
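
The pattern described above (one URL with no redirects, two with one, one with two) is what you'd expect from two independent rules stacked on top of each other: an HTTP to HTTPS rule at the edge, and a "send everything to the configured host" rule in WordPress. The Python sketch below models that with no network access at all. The rule functions are my own simplification and make no claim about how CloudFlare or WordPress implement this internally.

```python
from urllib.parse import urlsplit


def force_https(url: str):
    """Model of an 'Always Use HTTPS' rule: upgrade http, else do nothing."""
    parts = urlsplit(url)
    if parts.scheme == "http":
        return parts._replace(scheme="https").geturl()
    return None


def make_canonical_host_rule(canonical_host: str):
    """Model of a site-address rule: redirect any other host to the canonical one."""
    def rule(url: str):
        parts = urlsplit(url)
        if parts.netloc != canonical_host:
            return parts._replace(netloc=canonical_host).geturl()
        return None
    return rule


def count_redirects(url: str, rules, max_hops: int = 10):
    """Apply the first matching rule repeatedly; return (hops, final_url)."""
    hops = 0
    while hops < max_hops:
        for rule in rules:
            target = rule(url)
            if target is not None:
                url = target
                hops += 1
                break
        else:
            break  # no rule fired, so this URL is the final destination
    return hops, url


def url_variants(domain: str) -> list:
    """The four combinations of scheme and www prefix for a bare domain."""
    return [f"{scheme}://{prefix}{domain}/"
            for scheme in ("http", "https") for prefix in ("", "www.")]
```

With `rules = [force_https, make_canonical_host_rule("www.cassandrahl.com")]`, the four results from `url_variants("cassandrahl.com")` need 2, 1, 1 and 0 redirects respectively. Swapping the canonical host to `cassandrahl.com` moves the zero to the naked HTTPS URL and gives https://www one redirect, which matches the "displaced" redirect described above.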

Trolled by Time

I relayed the results of my experiments to CloudFlare and explained that I didn’t think redirects were the problem here, or else all their WordPress users would have the intermittent access issue too. They had one more idea. They asked me to send them HAR files for the site – one when it’s fully accessible and working normally, and another when it’s inaccessible. The website was in a period of working at the time, so I checked back every so often and waited for it to become inaccessible again.

And waited… And waited. The intermittent issue disappeared!

Since initially trying to set up HTTPS, the only things I’d actually done were to add an origin CA (which didn’t affect the issues I’d had), switch off HSTS Preload (which I’d managed to do before it was actually in place) and switch to a single-site WordPress installation (which I might as well have not done, as it didn’t change anything). All the other changes I’d made to try and solve the access issues didn’t work and were reversed. Why did the intermittent access issue suddenly disappear after almost two weeks?

I still don’t fucking know. ~~Fixed~~ Trolled by time.

It Ain’t Over ‘Til It’s Over (Other Unexpected Consequences)

So that was fun… Not. Apart from the site being inaccessible, here are some other things I came across during my not-so-easy HTTPS journey – none of which were mentioned in Troy’s guide, since the site used there is much simpler and doesn’t include everything that mine does.

DNS Records Managed Only in CloudFlare

Maybe this is obvious to some, but it wasn’t to me. When my DNS records from CPanel were automatically retrieved in CloudFlare, I was under the impression that this was only so they could be orange-clouded to activate the proxy. Like I said, I just followed Troy’s videos and didn’t initially do any further digging. It was only when I tried adding the HTTPS URLs to my Google Webmaster account that I realised I had to add the verification record in CloudFlare, and not in CPanel, where I’d added it before and still had to manage other things, such as the database files and email deliverability settings. Again, this was completely different to all the documentation I’d seen, even when searching for my specific setup. Oh well.

Emails Stop Working?

While I searched for solutions for the intermittent access issue, I stumbled upon posts from people saying that their website emails had also stopped working. I tested my emails. Also not working.

Long story short – I know that my emails had stopped working at some point, but I’m not sure at which point they started working again, or how many of the things I did to get them working were actually necessary.

I retrieve my website emails through a linked POP account on a different Gmail address. When I tried to test my website emails with my Gmail email as the sender or receiver, Gmail basically messed things up and stopped me from being able to see any of the test emails. It looked like most of my tests failed, but only some actually did.

Here’s how I made sure that website emails definitely work for me:

In CloudFlare, the following records exist:
- A for naked domain and mail domain
- CNAME for www
- A for mail domains
- MX for mail servers
- TXT for domain key, dmarc and SPF
In CloudFlare, the following records are grey-clouded:
- mail
- webmail
- WHM
- autodiscover
For the SPF record, get the correct SPF value from CPanel > Email Deliverability > Manage > SPF and copy the recommended SPF value there (I was looking for it in my CPanel DNS records, but it doesn’t exist there)
For the dmarc record, the default value is used: “v=DMARC1; p=none”
Test website emails only with accounts that are completely separate and can’t be used to access or send website emails OR when using a linked Gmail account, find hidden emails in the folder More > All Mail
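
The checklist above can be partly sanity-checked in code. CloudFlare's proxy (the orange cloud) only passes web traffic, so mail-related hostnames need to stay DNS-only, and the SPF and DMARC values need to at least be well formed. This is a small Python sketch under stated assumptions: the record dictionaries are a hypothetical shape I made up (not CloudFlare's API format), and the SPF string in the usage note is a placeholder rather than my real value.

```python
MAIL_HOSTS = {"mail", "webmail", "whm", "autodiscover"}  # from the list above


def proxied_mail_hosts(records: list) -> list:
    """Names of mail-related records that are proxied but should be DNS-only."""
    return [r["name"] for r in records
            if r["name"].lower() in MAIL_HOSTS and r.get("proxied")]


def parse_tags(record: str) -> dict:
    """Split a 'key=value; key=value' TXT record (DMARC style) into a dict."""
    tags = {}
    for part in record.split(";"):
        key, sep, value = part.strip().partition("=")
        if sep:
            tags[key.strip().lower()] = value.strip()
    return tags


def looks_like_dmarc(record: str) -> bool:
    """Minimal shape check: version tag plus a recognised policy."""
    tags = parse_tags(record)
    return tags.get("v") == "DMARC1" and tags.get("p") in {"none", "quarantine", "reject"}


def looks_like_spf(record: str) -> bool:
    """Minimal shape check: an SPF record starts with the v=spf1 term."""
    terms = record.split()
    return bool(terms) and terms[0].lower() == "v=spf1"
```

For example, `looks_like_dmarc("v=DMARC1; p=none")` passes, and `p=none` asks receivers to report failures without acting on them. A placeholder such as `"v=spf1 +a +mx ~all"` passes `looks_like_spf`, but the real value should still come from CPanel as described above.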

Since the example site used in Troy’s guide didn’t include any email services, having to deal with this was quite a surprise to me, and it took some digging and experiments for me to be comfortable that it was finally working as it should be.

Social Shares Disappear

At last, we come to the final issue, which, unfortunately, is still not resolved. And I’m kind of over it at this point.

I use a couple of plugins to enable social sharing on my blog posts. There are counters for certain social platforms that track how many times a post has been directly shared on social media. Some of my posts had accumulated over 300 social shares.

They are all gone.

The social shares are gone because they were based on the original HTTP URL, and the HTTPS one is classed, by the social platforms, as a separate URL to share. Apparently, depending on which social sharing tool you use, there are ways to aggregate the numbers and get the share counts back. However, none of the ones I tried worked and uh, yeah… I’m done.
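
The underlying problem is that share counts are keyed by the exact URL, so the HTTP and HTTPS versions of a post count as two different pages. The aggregation those tools attempt boils down to something like the Python sketch below, which ignores the scheme and a leading `www.` when grouping counts. The numbers and paths in the usage note are invented for illustration, and this says nothing about how any particular plugin or platform does it.

```python
from urllib.parse import urlsplit


def share_key(url: str) -> str:
    """Reduce a URL to host + path so http/https and www variants group together."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    path = parts.path.rstrip("/") or "/"
    return host + path


def merge_share_counts(counts: dict) -> dict:
    """Sum per-URL share counts across scheme and www variants of the same page."""
    merged = {}
    for url, shares in counts.items():
        key = share_key(url)
        merged[key] = merged.get(key, 0) + shares
    return merged
```

Given made-up counts of 300 for `http://www.example.test/post/` and 12 for `https://www.example.test/post/`, `merge_share_counts` returns a single entry of 312 under `example.test/post`.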

To Conclude…

HTTPS is hard. Or at least it was for me. I think I might not have had some of the problems that I did if my website was super basic, or if I’d taken the time to properly understand and choose all the setup options like I normally would. However, I still don’t think that would have prevented the main issue I had – the website being intermittently unavailable.

It’s kind of annoying that I still don’t know the cause of that issue, or what apparently fixed it, but the fact that CloudFlare and CPanel support teams couldn’t figure it out either makes me feel a bit better, or at least a bit less dumb.

I probably shouldn’t have blindly followed the instructions in the guide I used, which I know is super obvious and not something I normally would have done. But I want to openly admit to that, and some other mistakes I made, because I believe that you can only truly be open to learning when you’re truly open about your mistakes. I may not have gotten to the bottom of every mystery, but it’s been a great opportunity for me to learn more about security certificates, redirects, tooling, DNS records, email configurations, and more. I count that as a win.

Have you had any problems moving to HTTPS? What did you learn from the last mistake you made? Share your success / learning stories in the comments.

One thought to “Sorry, Troy, HTTPS is NOT Easy!”

someone says:

9th May 2019 at 9:28 am

Yes, HTTPS can be hard. But that sounds more like an cloudfare than an https issue. using a global cdn with several layers of security is another can of worms. disclaimer: i do not like commercial CDNs because of certain experiences and the idea of an free internet.