Geoff Jones 2 July, 2013

Shared Dictionary Compression Over HTTP (SDCH) - Bypassing Your Filtering Devices

Following Cyberis’ recent articles on bypassing perimeter filtering devices (e.g. proxies, IDS and next-generation firewalls) by manipulating HTTP response headers, we’ve taken a closer look at some more obscure Content-Encoding mechanisms. This article discusses Shared Dictionary Compression over HTTP (SDCH), and the implications for perimeter security controls designed to protect your network from unwanted content.

What Is SDCH And How Does It Work?

SDCH is a content-encoding method which was proposed back in 2008 by Google, and is implemented in Chrome and supported by a number of Google servers. The full proposal is publicly available; rather than replicate the contents of the document in this blog post, I’ll try to summarise it as concisely as possible:

The whole idea of the protocol is to reduce redundancy across HTTP connections. The amount of ‘common data’ across HTTP responses is obviously significant - for example you will often see a website use common header/footers across a number of HTML pages. If the client were to store this common data locally in a ‘dictionary’, the server would only need to instruct the client how to reconstruct the page using that dictionary.

A much simplified overview of the process is outlined below:

  1. An SDCH-capable web browser (e.g. Chrome) requests a page from an SDCH-enabled web server.
  2. If the client already has the dictionary, it’ll say so in its request. If it hasn’t, the server will tell the client to go and get the dictionary in the background.

GET /ResponseCoder/sdch/sdch_exe.php HTTP/1.1
User-Agent: Chrome
Accept-Encoding: gzip,deflate,sdch
Avail-Dictionary: cNrqfIFl

  3. The server responds with a VCDIFF-encoded response.

HTTP/1.1 200 OK
Date: Tue, 02 Jul 2013 09:59:47 GMT
Server: Apache
X-Frame-Options: SAMEORIGIN
Content-Disposition: attachment; filename=helloworld.exe
Content-Encoding: sdch
Transfer-Encoding: chunked
Content-Type: application/octet-stream

0cgiuvts............@...... .u. p...4.r.4.r.4.r.4.s.y.r.....7.r..w..-.r..w..>.r..w..n.r..v..5.r..v..5.r.Rich4

  4. The client reconstructs the page using the dictionary.

VCDIFF encoding is relatively complex to understand at the operation level (at least compared to normal HTTP), although in layman’s terms you could look at it as replacing long common strings with the instructions a client needs to reconstruct the original page from a shared dictionary. Obviously, if there are multiple common strings across many responses, significant bandwidth can be saved.
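To make the idea concrete, here is a minimal Python sketch of dictionary-based reconstruction. It is a toy model only: the "COPY" and "ADD" tuples are loosely inspired by VCDIFF's instruction types, but this is not the real VCDIFF wire format, and the names HEADER, FOOTER and reconstruct are invented for illustration.

```python
# Toy model of dictionary-based reconstruction.
# NOT the real VCDIFF format; instruction tuples are an assumption for illustration.

HEADER = b"<header>Site banner</header>"
FOOTER = b"<footer>Copyright</footer>"

# The shared dictionary holds content common to many pages.
dictionary = HEADER + FOOTER


def reconstruct(dictionary: bytes, instructions) -> bytes:
    """Rebuild a page from a dictionary and a list of COPY/ADD instructions."""
    out = bytearray()
    for ins in instructions:
        if ins[0] == "COPY":
            # ("COPY", offset, length): reuse bytes already held in the dictionary
            _, offset, length = ins
            out += dictionary[offset:offset + length]
        elif ins[0] == "ADD":
            # ("ADD", data): literal bytes that are unique to this response
            out += ins[1]
        else:
            raise ValueError("unknown instruction: %r" % (ins[0],))
    return bytes(out)


# Only the short unique paragraph travels as literal data in this response.
instructions = [
    ("COPY", 0, len(HEADER)),
    ("ADD", b"<p>Hello</p>"),
    ("COPY", len(HEADER), len(FOOTER)),
]

page = reconstruct(dictionary, instructions)
print(page.decode("ascii"))
```

The point to take away is that the response on the wire contains offsets and lengths rather than the common content itself, which is exactly what makes life difficult for a device inspecting that response in isolation.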


Now this technology has been designed with security in mind, at least to a certain degree. The dictionary is hashed both at the client and server side, mitigating simple attempts to maliciously intercept or otherwise modify dictionary contents - a compromised dictionary will hash to a different identifier and the server should not recognise it. A ‘true’ man-in-the-middle attack will still succeed of course, as the attacker will have all the constituent parts required to fool both sides into believing the dictionary is ‘correct’.
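As a rough illustration of the hashing, the sketch below derives two short identifiers from a dictionary. It assumes, based on my reading of the proposal, that the identifiers are taken from the first 96 bits of a SHA-256 digest (48 bits for the client, the next 48 for the server) and URL-safe base64 encoded; treat those details, and the function name dictionary_ids, as assumptions rather than a statement of the specification.

```python
import base64
import hashlib


def dictionary_ids(dictionary: bytes):
    """Return (client_id, server_id) for a dictionary.

    Assumption: IDs are bytes 0-5 and 6-11 of the SHA-256 digest,
    URL-safe base64 encoded (6 bytes encode to 8 characters, no padding).
    """
    digest = hashlib.sha256(dictionary).digest()
    client_id = base64.urlsafe_b64encode(digest[0:6]).decode("ascii")
    server_id = base64.urlsafe_b64encode(digest[6:12]).decode("ascii")
    return client_id, server_id


# Hypothetical dictionary contents, for illustration only.
original = b"Domain: example.test\nPath: /\n\ncommon page content"
tampered = original + b"<script>evil()</script>"

print(dictionary_ids(original))
print(dictionary_ids(tampered))
```

Under these assumptions an identifier is eight characters long, the same shape as the Avail-Dictionary value shown in the request earlier, and any change to the dictionary yields a different identifier.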

The scope of a dictionary has also been carefully considered, and closely follows the HTTP cookie model, where a cookie (in this case a dictionary) is only valid for the specified domain and path. In other words, to manipulate the contents of the dictionary, you would either need to be sat in the middle of the connection, have control over the client, or have control over the server - there are many more interesting attacks if any of these cases were true.
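A minimal sketch of what cookie-style scoping means in practice is shown below. The function dictionary_in_scope and its matching rules are simplified assumptions for illustration, not the exact rules from the proposal.

```python
def dictionary_in_scope(dict_domain: str, dict_path: str,
                        host: str, request_path: str) -> bool:
    """Simplified cookie-style check: does a dictionary apply to this request?"""
    domain = dict_domain.lower().lstrip(".")
    host = host.lower()
    # The host must be the dictionary's domain or a subdomain of it.
    domain_ok = host == domain or host.endswith("." + domain)
    # The request path must equal the dictionary path or sit beneath it.
    prefix = dict_path if dict_path.endswith("/") else dict_path + "/"
    path_ok = request_path == dict_path or request_path.startswith(prefix)
    return domain_ok and path_ok


# Hypothetical hosts and paths, for illustration only.
print(dictionary_in_scope(".example.test", "/sdch", "www.example.test", "/sdch/page.php"))
print(dictionary_in_scope(".example.test", "/sdch", "evil.test", "/sdch/page.php"))
```

The first call is in scope and the second is not, which is why a third-party site cannot simply supply a dictionary for someone else's domain.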

The security model then, from a client/server perspective, seems adequate. The real problem lies with intermediary filtering devices - something the authors of SDCH failed to consider fully when it was proposed.

Filtering Devices

Many organisations deploy perimeter filtering devices (e.g. web proxies and next-generation firewalls) to prevent unwanted content being delivered over HTTP and other common protocols. For example, it is common for certain categories of content, such as hacking tools, pornography, gambling sites and known malicious sites to be blocked from corporate users - typically with an appropriate warning message displayed indicating a breach of Acceptable Use Policy, or perhaps the dangers associated with the blocked content.

Now in this sense, SDCH poses a problem. Content can be shared with the client in the form of a dictionary once, and unless the dictionary becomes invalidated for some reason, the dictionary content may not be sent across the wire again. Think about the following scenarios:

  1. An intermediary filtering device fails to correctly process the contents of the dictionary when it is first shared
  2. An intermediary filtering device fails to cache the dictionary, meaning it cannot decode future SDCH encoded messages
  3. An intermediary filtering device replaces some content of an SDCH encoded response, effectively invalidating the encoded content when processed by the client
  4. What happens if there are a large number of dictionaries for a large number of clients? Should the filtering device cache them all?
  5. What happens if neither the dictionary nor the encoded response contains ‘unwanted’ content in isolation, but when reconstructed, the content poses a security risk?

The Exploit - Bypassing Executable Content Filtering

Cyberis has extended ResponseCoder to incorporate a proof-of-concept SDCH encoded Win32 executable. The first 128 bytes of the executable are located in the dictionary. As the dictionary also contains an amount of meta-data prior to the actual content, it is unlikely to be considered as executable content by any filtering device. Furthermore, the SDCH encoded response is obviously missing the first 128 bytes of the executable, meaning that it is also unlikely to trigger any filtering rules. If you are using Chrome (or any other user agent that supports SDCH), and your perimeter devices do not understand SDCH encoded content, an attacker now has a mechanism to introduce malicious content into your network.
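The effect can be sketched in a few lines of Python. This is not the ResponseCoder implementation: the filter looks_like_pe is a deliberately naive stand-in for a signature check, fake_exe is placeholder bytes rather than a real executable, and the dictionary header fields and the plain concatenation (in place of real VCDIFF decoding) are assumptions for illustration.

```python
SPLIT = 128  # number of leading bytes moved into the dictionary


def looks_like_pe(data: bytes) -> bool:
    """Naive stand-in for a content filter: flag data starting with 'MZ'."""
    return data[:2] == b"MZ"


# Placeholder bytes with an 'MZ' prefix; NOT a real executable.
fake_exe = b"MZ" + bytes(126) + b"rest of the file" * 8

# Dictionary: hypothetical meta-data header, a blank line, then the first 128 bytes.
dictionary = b"Domain: example.test\nPath: /\n\n" + fake_exe[:SPLIT]

# The encoded response carries only what follows the first 128 bytes.
encoded_body = fake_exe[SPLIT:]

# The client strips the meta-data and joins the two parts back together.
payload = dictionary.split(b"\n\n", 1)[1]
reconstructed = payload + encoded_body

print(looks_like_pe(dictionary))     # inspected when the dictionary is fetched
print(looks_like_pe(encoded_body))   # inspected when the response is delivered
print(looks_like_pe(reconstructed))  # what the client ends up with
```

Neither transfer matches the signature on its own, yet the reconstructed content on the client does, and it is byte-for-byte identical to the original file.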

You can test it in our labs here.

The Risks

The risk should be considered relatively low for a number of reasons. Firstly, the adoption of SDCH is relatively low - the Apache SDCH project has been inactive for some time. Secondly, an attacker would only need to exploit this if their intended victim is using Chrome and is protected by a filtering device that does not strip out SDCH content. Finally, whilst this issue may allow the associated payload of an attack to be delivered through filtering devices, an attacker must first identify and exploit a vulnerability in the victim’s browser (or at least perform an element of social engineering) - in other words it is dependent on other vulnerabilities being present.

Other risks should also be considered however - other unwanted content could be ‘obscured’ in a similar fashion. Intrusion Detection Systems may fail to alert to unwanted or potentially dangerous content. A malicious insider would also be able to introduce hacking tools into the environment by running their own SDCH-enabled web server.

The Solution

The protocol specification certainly addresses some of these risks, and arguably the easiest solution for web proxy/firewall vendors is to strip out all SDCH HTTP headers - effectively disabling SDCH as a valid encoding mechanism. Alternatively, full support for SDCH could be implemented, although this may not be an attractive option if throughput is an important consideration. Firewall administrators should also be aware of the limitations of perimeter filtering, particularly when dealing with protocols that have expanded beyond their original specification.
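As a sketch of the header-stripping approach, the Python function below removes SDCH from an outbound request before it reaches the server. The function name strip_sdch and the plain-dict representation of headers are assumptions for illustration; a real proxy would do this in its own header-rewriting configuration, and would likely also want to remove any SDCH-related response headers.

```python
def strip_sdch(headers: dict) -> dict:
    """Return a copy of the request headers with SDCH support removed."""
    cleaned = {}
    for name, value in headers.items():
        lname = name.lower()
        if lname == "avail-dictionary":
            # Never tell the server which dictionaries the client holds.
            continue
        if lname == "accept-encoding":
            codings = [c.strip() for c in value.split(",")]
            # Drop 'sdch' (with or without a ;q= parameter), keep the rest.
            codings = [c for c in codings
                       if c and c.split(";")[0].strip().lower() != "sdch"]
            # If nothing is left, ask for an unencoded response.
            value = ", ".join(codings) if codings else "identity"
        cleaned[name] = value
    return cleaned


request_headers = {
    "User-Agent": "Chrome",
    "Accept-Encoding": "gzip,deflate,sdch",
    "Avail-Dictionary": "cNrqfIFl",
}

print(strip_sdch(request_headers))
```

With the example request from earlier in this article, the server would see only gzip and deflate on offer and no Avail-Dictionary header, so it should have no reason to respond with SDCH encoded content.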

If you are concerned about the risks, you can test our SDCH encoded executable here, and obtain the source code to the application over at GitHub.
