This post is the first in a series covering privacy, anonymity and security on the internet in recent times, with a focus on real issues affecting people in the real world. Censorship and pervasive state-sponsored surveillance are a daily reality for hundreds of millions of people around the world.
China, Russia and Iran are well-known examples, but they are not the only countries employing censorship. There are also indications that it is becoming more common, especially now that turn-key censorship software such as Symantec BlueCoat has become a commodity.
Surveillance is more difficult to measure and estimate, and what is known about it often comes from whistleblowers and information leaks. Censorship, by contrast, can be felt very directly in many countries, where some of the most common sites on the internet are completely blocked. In an attempt to beat censorship, many have turned to the Tor Project.
The Tor Project
The Tor Project promotes and develops software to protect the privacy and anonymity of online users. It manages the Tor network, in which volunteers run relays on their own devices to carry the multi-hop, encrypted and anonymized traffic of users in the network.
In order to detect and monitor censorship on the internet, Tor also runs the Open Observatory of Network Interference (OONI) project. Probes are run by volunteers and attempt to access blocked websites, at some risk to those running them. The volunteers have no real anonymity in front of their respective ISP or any surveillance technology possibly deployed against them. Such exposure is unavoidable, as the intention is primarily to test whether normal internet user traffic and protocols, chiefly DNS, HTTP and HTTPS traffic, are blocked.
The data collected by OONI leaves no margin for speculation: censorship is real, it is widespread and it affects a great many people in a major way. Tor provides free plug-and-play tools such as the Tor Browser Bundle, which includes a browser with privacy extensions and optimal configurations pre-loaded. The browser is configured to leak as little information as possible (such as DNS queries) outside the encrypted, anonymized Tor tunnel.
The Importance of TLS & SNI
TLS, the successor to SSL, is a cryptographic protocol designed to provide integrity and privacy between applications. The most common usage of TLS is within HTTPS, which is HTTP over TLS (or SSL).
With TLS, two endpoints can establish communication over the internet that prevents eavesdroppers from observing, modifying or spoofing messages between them. Tor itself uses TLS between onion routers it has recently communicated with to protect Tor protocol communication.
TLS is also used in QUIC, a protocol originally designed by Google in 2012 and still under development, which is planned to take over from TCP and TLS as the transport underneath HTTP. QUIC doesn’t rely on TCP for the transport layer but opts to use UDP instead.
The latest version of the TLS protocol is TLS 1.3, which was approved in 2018. TLS 1.3 contains very important changes: most importantly, it removes old, deprecated and insecure cipher suites that should not be in use in 2018, and it speeds up connection setup with a one-round-trip handshake and 0-RTT resumption. It also makes it mandatory for compliant implementations to support an extension known as Server Name Indication (SNI), one of the extensions specified in RFC 4366, which was written all the way back in 2006.
There are real, solid reasons why SNI is required, but it also has a major downside: SNI sends the hostname unencrypted in the ClientHello, so the hostname leaks whenever a TLS 1.3 connection is established. There is more to be said about SNI and how to fix it, and this will be addressed more fully in a post of its own; for now, note that the IETF Survey of Worldwide Censorship Techniques (draft 07) has marked it as an Achilles heel for years.
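To make the leak concrete, here is a minimal Python sketch that builds a ClientHello in memory (no network connection is made) and checks whether the hostname appears in it as plain bytes. It uses only the standard ssl module; the function name client_hello_bytes and the hostname example.com are our own illustrative choices, and the exact contents of the ClientHello depend on the local OpenSSL build.

```python
import ssl


def client_hello_bytes(server_name: str) -> bytes:
    """Return the first TLS flight (the ClientHello) for server_name.

    Nothing is sent over the network: the handshake is driven against
    in-memory buffers and stops as soon as it needs a server reply.
    """
    context = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    incoming = ssl.MemoryBIO()  # bytes "from the server" (stays empty)
    outgoing = ssl.MemoryBIO()  # bytes the client wants to send
    tls = context.wrap_bio(incoming, outgoing, server_hostname=server_name)
    try:
        tls.do_handshake()
    except ssl.SSLWantReadError:
        # Expected: the ClientHello has been written and the client is
        # now waiting for a ServerHello that never arrives.
        pass
    return outgoing.read()


if __name__ == "__main__":
    hello = client_hello_bytes("example.com")
    # Check whether the hostname is present as unencrypted ASCII.
    print(b"example.com" in hello)
```

Anything an on-path observer needs in order to read the hostname is in these bytes, before any encryption keys have been agreed.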
Pluggable Transports and Censorship
Multiple countries have been documented attempting to both monitor and block Tor traffic, and incorrect usage of Tor has led to the documented downfall of multiple users over the years. For example, connecting to the Tor network without a bridge is easily detectable, and can be used for attribution or for narrowing users down to a small target group.
Tor’s cat-and-mouse game with censors and monitoring led to the development of bridge relays, which rely on various techniques and protocols to bypass censorship. These techniques are called pluggable transports.
One crucial pluggable transport is meek, and to explain what meek is, we first need to explain domain fronting, the technique meek leverages to provide privacy.
What is Domain Fronting?
Domain fronting is a technique to obfuscate the SNI field of a TLS connection, effectively hiding the target domain of a connection. It requires finding a hosting provider or CDN whose certificate covers multiple domains (listed as SANs, subject alternative names). One of these is a common domain that the client pretends to be targeting, placing it in the SNI field during connection establishment; the other is the actual target of the connection, named only in the HTTP request that follows.
The following image shows an example of the google.com certificate, which has many SAN domains, among them *.appengine.google.com.
Once that’s done, domain fronting can be attempted. A quick test to see if domain fronting works for a pair of domains is to use cURL, sending the hidden host (android.com in this case) as an HTTP header and specifying the target as the domain we’re hiding behind (google.com in this case). cURL will specify that domain in the SNI field.
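With cURL, such a test would look something along the lines of curl -H "Host: android.com" https://google.com/. The same idea can be sketched in Python using only the standard library. The function names below are ours, and the sketch is an illustration under the assumption that the provider routes requests by Host header; it is not a description of how meek or any other tool implements fronting.

```python
import socket
import ssl


def build_fronted_request(hidden_host: str, path: str = "/") -> bytes:
    """Build the HTTP request that travels inside the TLS tunnel.

    The Host header names the hidden host. It is encrypted on the wire
    and only visible to the provider that terminates TLS.
    """
    return (
        f"GET {path} HTTP/1.1\r\n"
        f"Host: {hidden_host}\r\n"
        "Connection: close\r\n"
        "\r\n"
    ).encode("ascii")


def fetch_status_line(front_host: str, hidden_host: str,
                      timeout: float = 10.0) -> str:
    """Connect with SNI=front_host, send Host=hidden_host, return the status line."""
    context = ssl.create_default_context()
    with socket.create_connection((front_host, 443), timeout=timeout) as raw:
        # server_hostname sets the SNI field and the name the certificate
        # is verified against: the front domain, not the hidden one.
        with context.wrap_socket(raw, server_hostname=front_host) as tls:
            tls.sendall(build_fronted_request(hidden_host))
            response = b""
            while b"\r\n" not in response:
                chunk = tls.recv(4096)
                if not chunk:
                    break
                response += chunk
    return response.split(b"\r\n", 1)[0].decode("latin-1")
```

Calling fetch_status_line("google.com", "android.com") requires network access, and the result depends entirely on how the provider handles a Host header that differs from the SNI at the time of the test. An observer on the path sees only google.com in the handshake either way.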
And here is another demonstration of the flow that hopefully makes it a bit clearer.
Pluggable Transports: meek
On the 14th of August 2014, the Tor Project announced the release of the meek pluggable transport. Meek uses domain fronting to hide the target bridge relay behind a very popular domain. For example, it could use google.com as a cover for xyz-meek-relay.appspot.com.
This allowed the creation of meek bridge relays on large clouds such as Google App Engine, Amazon CloudFront/EC2 and Microsoft Azure, hiding the actual target hostname behind domains such as google.com, amazon.com or various static asset CDNs.
Domain fronting was nothing short of revolutionary for Tor users in high-risk countries.
- It made Tor traffic look exactly the same as normal HTTPS (with some caveats: bad usage can still make connections stand out).
- Blocking meek has very expensive side effects for most censors: blocking Akamai, Amazon or Google either partially or completely in a country is not an act that goes unnoticed.
Meek is not a silver bullet, as there are scenarios such as China blocking access to Google; regardless, meek still had a huge impact and utility, and more providers were being discovered and researched. It was not unrealistic to expect a situation where blocking all meek bridges completely would require blocking a large chunk of the internet.
Domain fronting was adopted by other privacy-seeking service providers, notably Signal and Telegram, and proved itself when Signal was blocked in Egypt, Oman, Qatar and the UAE. Signal remained accessible thanks to domain fronting on Google App Engine, because these countries were not willing to go as far as blocking access to google.com just to block the service.
Malicious Use of Domain Fronting: APT29
Domain fronting isn’t only used for good purposes; unfortunately, hiding the target domain is also a valuable tool for attackers looking to hide connections to their command and control servers and other assets, as was the case with the hacking group APT29, also known as Cozy Bear and The Dukes.
On March 27th, 2017 Mandiant/FireEye reported that they had detected the Russian nation-state backed APT29 group employing domain fronting, and that the group had been doing so for at least two years. Domain fronting received quite a lot of attention in the cybersecurity community around this time as a result.
The Demise of Domain Fronting
On April 14th, 2018 a bug report was opened on the Tor bug tracker regarding breakage in the meek-google transport.
It was quickly discovered that Google’s infrastructure had begun responding with an HTTP 502 error carrying the message “This HTTP request has a Host header that is not covered by the TLS certificate used. Due to an infrastructure change, this request cannot be processed”.
Google thus silently killed off domain fronting on its infrastructure. Two weeks later, Amazon followed suit, blocking domain fronting and publishing a blog post on the subject.
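We do not know how Google implemented this change. Purely as an illustration of the kind of check the error message describes, the Python sketch below rejects a request whose Host header is not covered by the names on the certificate that was served. The function names, the example certificate names and the choice of single-label wildcard matching are all our assumptions.

```python
from typing import Iterable


def name_matches(pattern: str, host: str) -> bool:
    """Match a certificate name against a host.

    A leading "*." is treated as a wildcard for exactly one label.
    """
    pattern = pattern.lower().rstrip(".")
    host = host.lower().rstrip(".")
    if pattern.startswith("*."):
        suffix = pattern[1:]  # e.g. ".cdn.example.com"
        if not host.endswith(suffix):
            return False
        label = host[: -len(suffix)]
        return bool(label) and "." not in label
    return pattern == host


def host_covered(host: str, cert_names: Iterable[str]) -> bool:
    """True if any certificate name covers the host."""
    return any(name_matches(name, host) for name in cert_names)


def status_for_request(host_header: str, cert_names: Iterable[str]) -> int:
    """Return 502 if the Host header falls outside the certificate.

    200 stands in for "hand the request on as usual". IPv6 literals in
    the Host header are not handled in this sketch.
    """
    host = host_header.split(":", 1)[0]  # drop an optional port
    return 200 if host_covered(host, cert_names) else 502


if __name__ == "__main__":
    # Hypothetical names on the certificate chosen via SNI.
    names = ["www.example.com", "*.cdn.example.com"]
    print(status_for_request("img.cdn.example.com", names))
    print(status_for_request("hidden.example.net", names))
```

Under a rule like this, a fronted request only gets through when the hidden host is itself named on the front domain's certificate, which removes the cover for arbitrary hidden hosts on the same infrastructure.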
The use of domain fronting peaked in late April 2018. Amazon announced the blocking of domain fronting on April 27th. In the same week as Amazon’s announcement, Signal announced that their Amazon CloudFront account was frozen. Amazon pointed to Signal’s blog and GitHub account as proof of the alleged violation of Amazon’s ToS.
Signal had never attempted to conceal its usage of domain fronting, as it had announced the feature for users in Egypt and the UAE in 2016. Telegram also took a major hit, especially in Russia, where a ban on the app was upheld in court, affecting 15 million Telegram users at the time.
A week after Amazon had joined Google in blocking domain fronting, the Tor Project published “Domain Fronting is Critical to the Open Web”, a treatise on the importance of domain fronting to internet privacy that also detailed the move to Microsoft Azure.
As of April 2019, domain fronting still works on Microsoft Azure and serves as a critical lifeline for those relying on meek. While Microsoft’s cloud is smaller than those of either Amazon or Google, the effect of blocking it entirely would be immense for most censors. Blocking access to Azure would affect first-party cloud services owned by Microsoft such as Office 365 and Outlook, and could even disrupt vital services such as Windows Update. On top of that, there is an unknown (but definitely large) number of legitimate third-party services hosted on Azure that would also take a hit.
Closing Thoughts on Vendor Responsibility
In closing, we would like to say we do not think Google or Amazon dropping domain fronting should be seen as an attempt by them to harm privacy on the internet. Domain fronting is an awkward but clever trick to side-step a flaw (or lack of feature) of the TLS protocol. As there has also been malicious usage of domain fronting, the most prominent example being APT29, it is certainly a liability in some respects. It has been suggested that political pressure may have been applied on these companies to phase out domain fronting; however, neither Google nor Amazon has commented on such speculation.
Also, as we will cover in our next post, Google is one of the vendors working on solving this issue, along with Cloudflare, Fastly, Apple and other members of the TLS working group that are involved with eSNI.
It is commendable in our eyes that Microsoft continues to provide domain fronting on Azure. There is no doubt that it is crucial to meek and other services that use the technique to protect the privacy of their users.
In Our Next Episode…
In our next post we will take a look at encrypted SNI, a proposed extension to TLS 1.3 currently being discussed and reviewed, and at how it will hopefully step in to solve the problems domain fronting was used for.