Monday, August 10, 2015

CH13 - High Performance Web Sites

CHAPTER 13, Rule 13: Configure ETags

ETags are a mechanism that web servers and browsers use to validate cached components.

Conditional GET Requests
  1. When a cached component expires (or the user explicitly reloads the page), the browser can’t reuse it without first checking that it is still valid.
  2. The browser sends a conditional GET request to the server to check whether the cached version is still valid.
  3. The server replies “304 Not Modified” if the cached version is still up to date, or "200 OK" with the new version of the content if the component has changed on the server.
  4. There are two ways in which the server determines whether the cached component matches the one on the origin server:
    1. By comparing the last-modified date
    2. By comparing the entity tag
Last-Modified Date
  1. The client sends a GET request.
  2. The server replies with Last-Modified: Tue, 12 Dec 2006 03:03:59 GMT
  3. When the component has expired, the client sends a GET request with If-Modified-Since: Tue, 12 Dec 2006 03:03:59 GMT
  4. The server replies "304 Not Modified" if the content has not been modified since that date.
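
The server side of the Last-Modified exchange above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how any particular web server implements it: the function name respond_last_modified and the plain dict of request headers are hypothetical, and the modification time is assumed to be a timezone-aware UTC datetime.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def respond_last_modified(request_headers, last_modified):
    """Return (status, headers) for a GET, honoring If-Modified-Since.

    request_headers: dict of request header names to values (hypothetical shape).
    last_modified: timezone-aware UTC datetime of the component's last change.
    """
    headers = {"Last-Modified": format_datetime(last_modified, usegmt=True)}
    since = request_headers.get("If-Modified-Since")
    if since is not None:
        try:
            since_dt = parsedate_to_datetime(since)
        except (TypeError, ValueError):
            since_dt = None  # unparsable date: ignore the header
        if since_dt is not None:
            if since_dt.tzinfo is None:
                # Treat a date without a zone as UTC so the comparison is valid.
                since_dt = since_dt.replace(tzinfo=timezone.utc)
            if last_modified <= since_dt:
                return 304, headers  # cached copy is still valid
    return 200, headers  # caller sends the full component
```

A first request (no If-Modified-Since) gets 200 plus the Last-Modified header; a later request that echoes that date back gets 304 as long as the component has not changed.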

Entity Tags
  1. ETags were added to provide a more flexible mechanism for validating entities than the last-modified date. They help, for example, when an entity varies based on permissions or on the User-Agent or Accept-Language headers.
  2. The client sends a GET request.
  3. The server replies with ETag: "10c24bc-4ab-457e1c1f".
  4. When the component has expired, the client sends a GET request with If-None-Match: "10c24bc-4ab-457e1c1f".
  5. The server replies "304 Not Modified" if the content has not been modified.
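
The ETag exchange above follows the same pattern, with an opaque tag in place of a date. Below is a minimal Python sketch of the server-side check. The function name respond_etag and the dict of request headers are hypothetical, and the sketch makes simplifying assumptions: it only handles strong ETags and splits the If-None-Match list naively on commas.

```python
def respond_etag(request_headers, current_etag):
    """Return (status, headers) for a GET, honoring If-None-Match.

    current_etag: the quoted ETag the server currently assigns to the
    component, e.g. '"10c24bc-4ab-457e1c1f"'.
    """
    headers = {"ETag": current_etag}
    if_none_match = request_headers.get("If-None-Match")
    if if_none_match is not None:
        # The header may carry several tags separated by commas, or "*".
        candidates = [tag.strip() for tag in if_none_match.split(",")]
        if "*" in candidates or current_etag in candidates:
            return 304, headers  # cached copy is still valid
    return 200, headers  # caller sends the full component
```

If the tag sent by the browser differs from the server's current tag in any character, the check fails and the full component is sent again.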
The Problem with ETags
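  1. They are typically constructed from attributes that are unique to the specific server hosting a site. In a cluster of servers, the ETags won’t match when a browser gets the original component from one server and later makes a conditional GET request that goes to a different server, which causes components to be downloaded again unnecessarily.
  2. Apache 1.3 through 2.2 by default build the ETag from the file's inode number, size, and last-modified timestamp (inode-size-timestamp). The inode differs from one server to the next even when the file itself is identical.
  3. IIS 5.0 and 6.0 use a different format, Filetimestamp:ChangeNumber, where ChangeNumber counts IIS configuration changes and is unlikely to be the same across servers.
  4. If both If-None-Match and If-Modified-Since are in the request, the origin server “MUST NOT return a response status of 304 (Not Modified)” unless both conditions are met. A mismatched ETag therefore forces a full "200 OK" response even when the last-modified date still matches.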
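
To make the cluster problem concrete, here is a small Python sketch. It builds ETags in the inode-size-timestamp style (the example tag "10c24bc-4ab-457e1c1f" decodes this way, with each field in hexadecimal) and applies the rule that 304 is only allowed when every conditional header agrees. The function names and the second server's inode value are hypothetical, chosen only for illustration.

```python
from email.utils import parsedate_to_datetime


def inode_size_mtime_etag(inode, size, mtime):
    """Build a quoted ETag in the inode-size-timestamp style (hex fields)."""
    return '"{:x}-{:x}-{:x}"'.format(inode, size, mtime)


def status_for(request_headers, etag, last_modified):
    """Return 304 only if every conditional header present is satisfied."""
    checks = []
    if "If-None-Match" in request_headers:
        checks.append(request_headers["If-None-Match"] == etag)
    if "If-Modified-Since" in request_headers:
        since = parsedate_to_datetime(request_headers["If-Modified-Since"])
        checks.append(parsedate_to_datetime(last_modified) <= since)
    return 304 if checks and all(checks) else 200


# The same file (same size, same timestamp) on two servers of a cluster.
MTIME = 1165892639  # Tue, 12 Dec 2006 03:03:59 GMT as a Unix timestamp
etag_a = inode_size_mtime_etag(0x10C24BC, 0x4AB, MTIME)
etag_b = inode_size_mtime_etag(0x2F9A001, 0x4AB, MTIME)  # hypothetical inode
```

A browser that cached the component from the first server sends etag_a back. If the conditional request lands on the second server, the tag does not match etag_b, so the server answers 200 and resends an unchanged file even though the last-modified date is identical.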
What to do
  1. If you have components that have to be validated based on something other than the last-modified date, ETags are a powerful way of doing that.
  2. On a single-server website, you can let the web server (e.g. Apache) generate ETags for you.
  3. On a cluster of servers, configure the ETag header yourself rather than letting the web server generate it, so that every server returns the same ETag for the same component.
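
One possible way to get identical ETags on every server in a cluster is to derive the tag from the component's bytes instead of from file-system attributes. The Python sketch below assumes that approach. It is not something the chapter prescribes, and the function name content_etag and the 16-character truncation are arbitrary choices made for the example.

```python
import hashlib


def content_etag(body: bytes) -> str:
    """Return a quoted ETag derived only from the component's content.

    Any server holding the same bytes produces the same tag, regardless of
    inode, owner, or other machine-specific attributes.
    """
    digest = hashlib.sha256(body).hexdigest()
    return '"{}"'.format(digest[:16])
```

Hashing costs CPU time for every component, so in practice the tag would usually be computed once at build or deploy time and stored next to the file.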
