CHAPTER 4, Rule 4: Gzip Components
- Client sends: Accept-Encoding: gzip, deflate
- Server compresses the response using one of the accepted methods and replies with a matching header (e.g. Content-Encoding: gzip)
- There is a cost to gzipping: it takes additional CPU cycles on the server to carry out the compression and on the client to decompress the gzipped file
- Image and PDF files should not be gzipped because they are already compressed.
- Generally, it’s worth gzipping any file larger than 1 or 2 KB
- Apache 1.3 uses mod_gzip for compressing while Apache 2.x uses mod_deflate
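
The rules above can be sketched as a small server-side decision function. This is a minimal illustration using Python's standard gzip module, not how mod_gzip or mod_deflate are implemented. The names accepts_gzip, maybe_gzip, MIN_SIZE and SKIP_TYPES are hypothetical, the 1 KB threshold is an assumption taken from the guideline above, and q-values in Accept-Encoding are ignored for brevity.

```python
import gzip

MIN_SIZE = 1024  # assumed threshold, following the "1 or 2K" guideline
SKIP_TYPES = ("image/", "application/pdf")  # formats that are already compressed


def accepts_gzip(accept_encoding):
    """Return True if 'gzip' is listed in the Accept-Encoding value (q-values ignored)."""
    tokens = [item.split(";")[0].strip().lower() for item in accept_encoding.split(",")]
    return "gzip" in tokens


def maybe_gzip(body, content_type, accept_encoding):
    """Compress body only when the client accepts gzip and compression is worthwhile."""
    headers = {"Content-Type": content_type}
    compressible = not content_type.startswith(SKIP_TYPES)
    if accepts_gzip(accept_encoding) and compressible and len(body) > MIN_SIZE:
        body = gzip.compress(body)  # this is where the server pays the CPU cost
        headers["Content-Encoding"] = "gzip"
    headers["Content-Length"] = str(len(body))
    return headers, body


page = b"<p>hello</p>" * 500
headers, payload = maybe_gzip(page, "text/html", "gzip, deflate")
print(headers.get("Content-Encoding"), len(page), len(payload))
```

A client that sent Accept-Encoding: gzip would reverse the step with gzip.decompress(payload), which is the client-side CPU cost mentioned above.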
Proxy Caching and Compressing
Imagine the following scenario:
- The first request to the proxy for a certain URL comes from a browser that does not support gzip (so the request doesn't have Accept-Encoding: gzip, deflate).
- The proxy cache is empty
- The proxy forwards that request to the web server
- The web server’s response is uncompressed (because the request doesn't have Accept-Encoding: gzip, deflate).
- The response will be cached by the proxy and sent on to the browser
- Now, suppose the second request to the proxy for the same URL comes from a browser that does support gzip
- The proxy responds with the (uncompressed) contents in its cache, so the second request misses the opportunity to get compressed content.
Now imagine the reverse scenario: the first request is from a browser that supports gzip and the second request is from a browser that doesn’t. In this case, the proxy has a compressed version of the contents in its cache and serves that version to all subsequent browsers whether they support gzip or not, so browsers without gzip support receive content they cannot decode.
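
Both scenarios can be reproduced with a toy simulation. The sketch below assumes a deliberately naive proxy that keys its cache on the URL alone. NaiveProxy and origin are hypothetical names, and the origin server is a stand-in function rather than a real web server.

```python
import gzip


def origin(url, accept_encoding):
    """Stand-in web server: gzips the body only if the request allows it."""
    body = ("content of " + url).encode() * 100
    if "gzip" in accept_encoding:
        return {"Content-Encoding": "gzip"}, gzip.compress(body)
    return {}, body


class NaiveProxy:
    """Caches one response per URL and ignores the request's Accept-Encoding."""

    def __init__(self):
        self.cache = {}

    def get(self, url, accept_encoding):
        if url not in self.cache:
            self.cache[url] = origin(url, accept_encoding)
        return self.cache[url]


# Scenario 1: a browser without gzip support fills the cache first.
proxy = NaiveProxy()
proxy.get("/app.js", "")
headers, _ = proxy.get("/app.js", "gzip, deflate")
print("gzip-capable browser got:", headers)  # no Content-Encoding header

# Scenario 2: a gzip-capable browser fills the cache first.
proxy = NaiveProxy()
proxy.get("/app.js", "gzip, deflate")
headers, _ = proxy.get("/app.js", "")
print("browser without gzip got:", headers)  # compressed body it cannot decode
```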
To solve this problem:
- The web server should tell the proxy to keep multiple cached responses for the same URL by sending the Vary header in the response (e.g. Vary: Accept-Encoding). This causes the proxy to cache multiple versions of the response, one for each value of the Accept-Encoding request header.
- You can also prevent the proxy from keeping a cached copy at all by setting Cache-Control: private in the response.
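
Both fixes can be illustrated with a toy proxy that honours these two response headers. This is a simplified sketch under stated assumptions, not a description of how any real proxy is implemented: VaryAwareProxy, origin and private_origin are hypothetical names, and the cache key is simply the URL plus the exact values of the request headers named in Vary.

```python
import gzip


def origin(url, request_headers):
    """Stand-in web server that always sends Vary: Accept-Encoding."""
    body = ("content of " + url).encode() * 100
    headers = {"Vary": "Accept-Encoding"}
    if "gzip" in request_headers.get("Accept-Encoding", ""):
        headers["Content-Encoding"] = "gzip"
        body = gzip.compress(body)
    return headers, body


def private_origin(url, request_headers):
    """Stand-in web server that forbids shared caching."""
    return {"Cache-Control": "private"}, b"per-user content"


class VaryAwareProxy:
    def __init__(self, fetch):
        self.fetch = fetch
        self.vary = {}   # url -> header names listed in the Vary response header
        self.cache = {}  # (url, tuple of request header values) -> (headers, body)

    def _key(self, url, request_headers):
        names = self.vary.get(url, [])
        return (url, tuple(request_headers.get(name, "") for name in names))

    def get(self, url, request_headers):
        if url in self.vary:
            key = self._key(url, request_headers)
            if key in self.cache:
                return self.cache[key]
        headers, body = self.fetch(url, request_headers)
        if "private" in headers.get("Cache-Control", ""):
            return headers, body  # pass through without storing
        self.vary[url] = [n.strip() for n in headers.get("Vary", "").split(",") if n.strip()]
        self.cache[self._key(url, request_headers)] = (headers, body)
        return headers, body


proxy = VaryAwareProxy(origin)
plain, _ = proxy.get("/app.js", {})
zipped, _ = proxy.get("/app.js", {"Accept-Encoding": "gzip, deflate"})
print(plain.get("Content-Encoding"), zipped.get("Content-Encoding"), len(proxy.cache))
```

Because the key uses the exact header value, requests sending "gzip" and "gzip, deflate" end up as separate cache entries in this sketch, which matches the "one for each value" behaviour described above.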