
HTTP Compression: Compressing Files, How Low Can You Go?

Posted: February 23rd, 2010 | Filed under: IIS & HTTP, Performance Tools

HTTP compression on IIS is easy to enable with tools such as httpZip or ZipEnable and requires no client-side configuration to obtain benefits, making it a very smart way to get extra performance and a better user experience.

It’s well known that bandwidth is limited on most Internet connections, and anything IT administrators can do to accelerate site load time benefits not only the organization but users as well. HTTP compression, a function built into both browsers and servers, can substantially improve site performance by reducing the amount of time required to transfer data between the server and the client. When data is encoded using a compressed format like GZip or Deflate, it introduces complexity into the HTTP request/response interaction by necessitating a type of content negotiation. In this negotiation, the browser tells the server which encodings it can handle (through the Accept-Encoding request header), and the server sends the appropriate version of the resource, compressed or uncompressed, to the browser.
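
To make the negotiation concrete, here is a minimal Python sketch of the decision a server makes. It is an illustration only, not how IIS, httpZip, or ZipEnable implement it; the function names parse_accept_encoding and negotiate are made up for this example.

```python
import gzip
import zlib


def parse_accept_encoding(header_value):
    """Return the codings named in an Accept-Encoding value whose q-value is above 0."""
    accepted = set()
    for item in header_value.split(","):
        parts = [p.strip() for p in item.split(";")]
        coding = parts[0].lower()
        if not coding:
            continue
        q = 1.0  # a coding without an explicit q-value counts as fully acceptable
        for param in parts[1:]:
            if param.lower().startswith("q="):
                try:
                    q = float(param[2:])
                except ValueError:
                    q = 0.0  # treat an unparsable q-value as "not acceptable"
        if q > 0:
            accepted.add(coding)
    return accepted


def negotiate(body, accept_encoding):
    """Pick a representation of body (hypothetical helper).

    Returns (payload, content_encoding), where content_encoding is None
    when the uncompressed version is sent.
    """
    accepted = parse_accept_encoding(accept_encoding or "")
    if "gzip" in accepted:
        return gzip.compress(body), "gzip"
    if "deflate" in accepted:
        # The HTTP "deflate" coding is the zlib format, which zlib.compress produces.
        return zlib.compress(body), "deflate"
    return body, None
```

A client that sends no Accept-Encoding header gets the original bytes back with no Content-Encoding, which is why no client-side configuration is needed.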

Why use server side compression?

Most users’ knowledge of compression comes from compressed files such as .zip format that they download, extract, and open. However, compression can be used passively as well to compress documents as they are being transferred to a client’s browser. Because it’s a passive process, the server can reduce the size of the pages sent, consequently reducing the download time for users and their bandwidth usage.

You can typically reduce an HTML document to less than half of its original size (the exact percentage saved will depend on the degree of redundancy or repetition in the character sequences in the file), reducing both the time the client needs to download the page and the amount of bandwidth required. This can be accomplished without changing the way the site works: its page layout and content remain the same, and the only thing that changes is the way the information is transferred between server and browser.
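
If you want to see the effect of redundancy for yourself, a few lines of Python will do. This is a rough sketch using a synthetic, table-heavy page invented for the example; it is far more repetitive than a real page, so real-world ratios will be less dramatic.

```python
import gzip


def compression_ratio(data):
    """Compressed size divided by original size (smaller is better)."""
    return len(gzip.compress(data)) / len(data)


# Hypothetical page: the same table row repeated 200 times.
row = b"<tr><td class='cell'>item</td><td class='cell'>value</td></tr>\n"
html = b"<html><body><table>\n" + row * 200 + b"</table></body></html>\n"

ratio = compression_ratio(html)
print(f"{len(html)} bytes compress to {ratio:.1%} of the original size")
```

Running the same function over your own HTML, CSS, and JavaScript files gives a quick estimate of what compression would save on your site.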

Keep in mind that when looking at overall savings on a site, the savings of half or more on text files may be offset by image MIME types that cannot usefully be compressed.
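
As a back-of-the-envelope illustration, the calculation below uses made-up numbers (a page with 40 KB of text and 60 KB of images, with text compressing to half its size) to show how images dilute the overall saving. The function name overall_savings is an assumption for this example.

```python
def overall_savings(text_bytes, image_bytes, text_ratio):
    """Fraction of total bytes saved when only the text portion is compressed.

    text_ratio is compressed size / original size for the text content;
    images are assumed to be sent unchanged.
    """
    total = text_bytes + image_bytes
    compressed_total = text_bytes * text_ratio + image_bytes
    return 1 - compressed_total / total


# Hypothetical page: 40 KB of text, 60 KB of images, text halves in size.
saving = overall_savings(40_000, 60_000, 0.5)
print(f"Overall saving: {saving:.0%}")
```

In this example the text shrinks by half, yet the page as a whole is only about one-fifth lighter, because the images make up most of the bytes.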

Acceptable File Types

Some file formats cannot be compressed much further. For instance, files that are already compressed, such as JPEGs, GIFs, PNGs, movies, and ‘packaged content’ (e.g., Zip, Gzip, and bzip2 files), are not going to shrink significantly with a simple HTTP compression filter, so you will not get a noticeable benefit from compressing them.

If bandwidth savings is the primary goal, the strategy should be to compress all text-based output. Ideally, this should include not only static text files (such as HTML and CSS), but files that produce output in text media MIME types (such as ASP and ASP.NET files), as well as files that are text-based but of another media type (such as external JavaScript files). Heavily formatted pages, for example those that make heavy use of tables (repetitive formatting content), may compress even further, sometimes to as little as one-third of the original size.
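
A compression filter usually makes this choice by looking at the MIME type of the response. The sketch below shows one way such a rule could look; the list of extra types and the function name is_compressible are assumptions for illustration, not the settings shipped with any particular product.

```python
# Text-based types that do not live under "text/" (illustrative list, not exhaustive).
EXTRA_COMPRESSIBLE_TYPES = {
    "application/x-javascript",
    "application/javascript",
}


def is_compressible(content_type):
    """Return True if a response with this Content-Type is worth compressing."""
    # Drop parameters such as "; charset=utf-8" before comparing.
    mime = content_type.split(";", 1)[0].strip().lower()
    return mime.startswith("text/") or mime in EXTRA_COMPRESSIBLE_TYPES
```

With this rule, "text/html; charset=utf-8" and "application/x-javascript" are compressed, while "image/jpeg" and "application/zip" are sent as they are.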

What tool works the best for you?

On Microsoft IIS 4 and 5 Web servers, httpZip is the best solution for compression as it addresses a number of shortcomings in functionality, customization, and reporting on Windows 2000 that gave rise to a third party tools market. httpZip is also ideal in certain cases on IIS 6.0. However, with the launch of Windows Server 2003 and IIS 6.0, Microsoft chose to make compression functionality a priority, and their internal IIS 6.0 compression software works — though you must delve into the IIS metabase to manage it beyond a simple “on/off” decision (and there is no browser compatibility checking). You should use ZipEnable to safely unlock and greatly enhance the configuration and management options for IIS 6.0 built-in compression.

/ Port80


HTTP Compression : Smaller, Faster… Better

Posted: August 17th, 2009 | Filed under: IIS & HTTP, Performance Tools, Web Design, Development, & Usability

What is HTTP compression and how does it work?

HTTP compression is a long-established Web standard in which a GZip or Deflate encoding method is applied to the payload of an HTTP response, significantly compressing the resource before it is transported across the Web.

When data is encoded using a compressed format like GZip or Deflate, it introduces complexity into the HTTP request/response interaction by necessitating a type of content negotiation. Whenever compression is applied, it creates two versions of a resource: a compressed version and an uncompressed version.


Webinar on Web Performance Solutions (Microsoft IIS and More — Archived Version)

Posted: May 21st, 2008 | Filed under: Performance Tools

Learn how to send less data, less often in this new archived webinar from Port80 Software!

Port80 Software invites you to view a webinar on Web performance solutions. We recently reviewed the Web acceleration market, where Port80 Software solutions like httpZip, ZipEnable, CacheRight, and w3compiler fit into IIS Web server performance, and how to analyze HTTP compression and expiry-based cache control solutions with HTTP analysis tools.

Agenda — Web Performance Solutions for Microsoft IIS Web Servers:

- Common Web Performance Challenges
- The Web Acceleration Market
- Analyzing the HTTP Request/Response Cycle
- HTTP Compression
- Expiry Based Cache Control
- Questions and Answers…

Login to Access the Archived Webinar:

- Go to https://www119.livemeeting.com/cc/port80/view?id=Web_Performance_Solutions_1.
- Enter your name, the Recording ID if not already entered (Web_Performance_Solutions_1), bypass the Recording Key (this is not required), and then click View Recording.
- You will be presented with two format options on the next screen; we recommend choosing the second version (“meeting replay”), as the first version does not include video.
- Click the Windows Media icon under ‘View’ for the Microsoft Office Live Meeting Replay version, and this will launch the Webinar in Windows Media Player (http://www.microsoft.com/windows/windowsmedia/default.mspx).
- If you have any trouble logging into the Webinar, please just ask for help at support@port80software.com.

Thanks for watching, and please let us know if you have a topic for our next webinar event or if we can answer any questions on IIS performance!

Best regards,
Port80 Software


HTTP Compression and the Google AdSense Crawler Bot

Posted: February 22nd, 2008 | Filed under: Performance Tools

FACT: HTTP Compression really improves Web serving.

FACT: Big sites like Google and Yahoo! use compression.

UNFORTUNATE FACT: Some services are not aware enough of compression and may break… unless you have a smart compression engine!

This underutilized technology transparently reduces the size of all text-based content served from a Web site or Web service, speeding up transmission across the Web, reducing bandwidth expenses, and freeing up Web server availability to handle more requests. Compression deployments are accelerating among business sites, and Google.com has been compressing responses for a long time (see this real-time report: http://www.port80software.com/tools/compresscheck?url=www.google.com).

Google’s Googlebot, the Web crawler that indexes sites to form the basis of search results, also likes to see compressed content. At a search engine conference a few years back, search guru Danny Sullivan spent some time focusing on this: Google only indexes so much of a page, so if you send the Googlebot compressed content (which it asks for by the presence of the “accept-encoding: gzip, deflate” header in a request), you can theoretically get more content indexed. With HTTP compression, you also save bandwidth on that request from Googlebot and on all other requests from IE, Firefox, and other browsers and search bots. Very cool.

It is ironic then, given Google’s knowledge and use of HTTP compression, that Google’s AdSense program, which sells contextual advertising on third party sites, uses technology that is not compatible with HTTP compression. One of Port80 Software’s httpZip compression clients recently received an email from Google’s AdSense team explaining why the client’s contextual ad site was not getting indexed by the AdSense crawler bot (which goes by a user-agent name starting with “mediapartners-google”; a user agent is the Web client’s name, usually a browser or bot). This is part of the email from a Google AdSense rep to our client:

“I’ve reviewed your site and have determined that our crawler is having difficulty accessing your URL. Specifically, your webserver is sending our crawler HTML in a compressed format, which our crawler is unable to process.

We recommend that you speak with your web administrator to ensure your system does not send our crawler compressed data. You can determine our crawler by looking for user agents starting with ‘Mediapartners-Google’.

Additionally, please be aware that after you have turned off the encoding, it may be 1 or 2 weeks before the changes are reflected in our index. Until then, we may display less relevant or non-paying public service ads. You should expect your ad relevance to increase over time.”

So, the AdSense crawler bot does not like HTTP Compression. But the real question is — why are they asking for it? In the request to get compression from any Web server, a user agent must first have that “accept-encoding: gzip, deflate” header in the original request… if the AdSense bot cannot deal with compression, it should not be requested by the bot itself. That makes sense, right?

It looks like Google AdSense is asking clients to not compress responses to their bot to fix this issue, rather than fixing the decompression bug (an educated guess) in their bot code. So, for now, if you run a Web server, participate in the AdSense program from the serving side (you host Google AdSense ads on your own site), and still want to use compression for all other Web visitors, the fix is to make an exception for any request with a user-agent starting with “mediapartners-google”.
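
In code terms, the exception boils down to one extra check before the usual Accept-Encoding test. Here is a minimal Python sketch of that logic; it only illustrates the rule and is not how httpZip or ZipEnable are implemented. The names EXCLUDED_AGENT_PREFIXES and should_compress are made up for this example.

```python
# User-agent prefixes that must never receive compressed responses (lowercase).
EXCLUDED_AGENT_PREFIXES = ("mediapartners-google",)


def should_compress(request_headers):
    """Decide whether to compress the response to a request.

    request_headers is a plain dict of header name to value.
    """
    # HTTP header names are case-insensitive, so normalize them first.
    headers = {name.lower(): value for name, value in request_headers.items()}

    user_agent = headers.get("user-agent", "").lower()
    if user_agent.startswith(EXCLUDED_AGENT_PREFIXES):
        return False  # the bot asks for compression but cannot process it

    accept = headers.get("accept-encoding", "").lower()
    return "gzip" in accept or "deflate" in accept
```

Matching on the prefix rather than the full string means the rule keeps working across different software versions of the bot.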

Unfortunately, you cannot do this on Microsoft IIS 4 or 5 servers (NT or 2000) without a third party compression tool like httpZip from Port80 Software that can add a compression exclusion for a user agent. On IIS 6 (Windows 2003), you can use httpZip or ZipEnable to add such an exception or exclusion. We will be adding the default exception for this browser to a minor version upgrade of both products soon, but here is how to add an exception for this AdSense bot with httpZip and ZipEnable.

Excluding Google’s AdSense Bot from IIS Compression with httpZip:

- Install the free httpZip trial from www.httpzip.com/try.

- Once installed, confirm compression is working fine (http://www.port80software.com/products/httpzip/evaluation).

- Open the httpZip Settings Manager.

- On the compression tab, add a new Browser Exception for a MIME type: select “New” and, in the Add Browser Exception dialog, enter a name for the browser (like “AdSense Bot”) in the text box labeled “Browser Name.” Next, enter the search string used to identify the browser in the text box labeled “Search String” (use “mediapartners-google” to match all versions of the bot; this short version acts as a wildcard across specific software versions of the bot), then click OK. Please note: you will have to add this exception for each MIME type being requested by the bot, which should include the “text/html”, “text/css”, “text/javascript”, and “application/x-javascript” MIME types, and probably a few more, based on what you are serving and want to get indexed.

Picking a MIME type (text/html) to exclude the AdSense bot from compression:

[Screenshot: httpZip: Pick a MIME type first...]

Setting up the AdSense bot exception for the text/html MIME type:

[Screenshot: httpZip: Set up the exception for the AdSense Bot...]

- Apply your settings in the httpZip Settings Manager. Repeat the process for other MIME types that you want to get indexed (FYI, text/html should take care of most dynamic content output from ASP, ASP.NET, CFM, PHP, JSP, etc. files).

- You can use Wfetch, a free tool in the IIS 6 Resource Kit, to test that no responses will compress when requested by the AdSense bot (http://support.microsoft.com/kb/840671). Just add an “accept-encoding: gzip, deflate” header to a request in Wfetch, along with a user-agent header starting with “mediapartners-google”, and the response from the server with the new httpZip exclusion should not be compressed (it should have no “content-encoding: gzip” or “content-encoding: deflate” header).

- All your other requests from good browsers and bots will now be compressed, and you can feel safe that compression is not interfering with the Google AdSense bot. Remember, it may take a few weeks for the AdSense bot to reindex your site correctly.
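
If you would rather script the check than use Wfetch, the same test can be sketched in Python with the standard library. This is an assumption-laden example: the helper names are invented, and the URL in the usage comment is a placeholder for your own site.

```python
import urllib.request

# The AdSense email says the crawler's user agent starts with this string.
ADSENSE_USER_AGENT = "Mediapartners-Google"


def build_probe_request(url, user_agent):
    """Build a request that asks for compression while posing as user_agent."""
    return urllib.request.Request(
        url,
        headers={
            "User-Agent": user_agent,
            "Accept-Encoding": "gzip, deflate",
        },
    )


def content_encoding_of(url, user_agent):
    """Fetch url and return the Content-Encoding response header (or None)."""
    request = build_probe_request(url, user_agent)
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.headers.get("Content-Encoding")


def is_compressed(content_encoding):
    """True if a Content-Encoding value indicates gzip or deflate."""
    return (content_encoding or "").strip().lower() in ("gzip", "deflate")


# Example usage against your own server (placeholder URL):
#   is_compressed(content_encoding_of("http://www.example.com/", ADSENSE_USER_AGENT))
#   is_compressed(content_encoding_of("http://www.example.com/", "Mozilla/4.0"))
```

With the exclusion in place, the first call should report no compression while the second, posing as an ordinary browser, should still report a compressed response.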

You can add an exclusion to compression for requests from the AdSense bot on IIS 6 with ZipEnable by following the instructions above and adding an exclusion directly in ZipEnable; here is the documentation for that process in ZipEnable (http://www.port80software.com/products/zipenable/docs#adv_set_browser). You will also want to use something like Wfetch that allows you to alter your request headers, so you can spoof the user-agent and make sure you are getting no compression when the user agent starts with “mediapartners-google”. In ZipEnable, make sure the search string includes the wildcard, which is a bit different than in httpZip: “mediapartners-google*”.

We hope this helps clear up any confusion on Google AdSense and HTTP compression – please contact us for help here and for other tips on IIS performance boosts!

Best regards,
Port80 Software


Microsoft Flash Training Module: IIS 6 compression and ZipEnable

Posted: November 24th, 2004 | Filed under: IIS & HTTP, Performance Tools

Microsoft has created some handy Flash-based walk-throughs for HTTP compression in general, IIS 6.0 native compression and also how to use ZipEnable (pretty much covers everything in ZE except browser compatibility…).

These modules are also a great way to get non-technical folks familiar with HTTP compression concepts and how to make it work on Windows Server 2003.

Access the free training modules at http://www.microsoft.com/windowsserver2003/tryiis/training/compression.mspx.

Happy Thanksgiving, folks!
