Wednesday, 11 September 2013

Lost with web technologies and protocols? Let me help you to clarify differences between WebSockets, Google SPDY, HTTP 2.0 & Co

Web technologies are moving so fast today. Or maybe that's a feeling I've recently had because excepting few REST services here and there, I didn't have to handle much web technologies at my low-latency-finance-work since 2005 (whereas the web was my speciality at the time). And now what?!? even the core web protocols are changing?!? incredible...

If I now ask you 'what are the differences between WebSockets, Server Sent events, HTTP 2.0?' would you be able to answer me easily? If yes, you have better reading to do (this other post for instance). If not: sit down and relax...I'm gonna earn you some time. 

Nothing new in my post, since there are lots of information available down there. But simply a synthetic view that may be helpful to grasp the differences between Web sockets, Google SPDY (pronounced "SPeeDY') and HTTP 2.0 without googling hours like I did...

But before that, let's refresh our mind with core basics:


  • TCP: guaranteed in-order delivery, bi-directional and full-duplex, TCP is useful in lots of context, but is also the transport layer protocol on which HTTP is built-on
  • HTTP 1.0: request-response application-level protocol built upon TCP to exchange structured text documents that uses logical links. HTTP is said 'stateless' because a separate TCP connection to the same server is made for every resource request (the client initiate a new TCP connection with the server, the client makes one full HTTP request, the server gives its full HTTP response, and then the underlying TCP connection is closed). I don't know if the 3-seconds-goldfish-memory is a myth, but I can tell you that HTTP memory beyond this simple Q & A roundtrip would be peanuts without the existance of post-its... Er... I mean: cookies. Name/value pairs message headers are also joined to transmit meta-data and informations between client & server (with the X- prefix to indicate non-standard headers)
  • HTTP 1.1: Maintains the HTTP 1.0 request-response paradigm, but improves the latency by reusing a connection multiple time to download images, scripts, stylesheets, etc after the page has been delivered. HTTP 1.1 avoids lots of the overhead cost of TCP connection establishment, but maintains the one response per request paradigm
  • Short polling: in the standard HTTP model, a server cannot initiate a connection with a client. Therefore, in order to receive asynchronous events as soon as possible, the client needs to poll the server periodically for new content (with a javascript timer for instance). However, the frequency of the poll request can cause an unacceptable burden on the server, the network, or both (by forcing HTTP roundtrips even if no data is available on the server side). It can also be inefficient because it reduces the responsiveness of the applications since data is queued on the server side, until the server receives the next poll request from the client 
  • Long polling: kind of hacks to HTTP (1.x) in order to support data-push from the server to the client. With long polling, the client requests a page from the server in a way similar to a short and standard polling; however, it the server has no information available for the client, then instead of sending an empty response, the server holds (his breath) the request -responding partially with headers to the client request for instance- and waits for information to become available (or for a suitable timeout event), before sending a complete response to the client. The client then typically sends a new long poll request, either immediately of after a pause. The short duration of the pause between a long poll response and the next long poll request avoids the closing of idle connections (when HTTP 1.1 is used)
  • HTTP streaming: a set of technologies or techniques where the server keeps a request open indefinitely; that is, it never terminates the request or closes the connection, even after it pushes data to the client. This mechanism significantly reduces the network latency because it avoids to pay the cost of the underlying TCP connection establishment. This may work whether with HTTP 1.0 (using EOF as a streaming mechanism) or HTTP 1.1 (using whether chunked transfer or EOF to stream) as underlying protocol. The HTTP streaming mechanism is strongly based on the capability of the server to send several pieces of information on the same response, without terminating the request or the connection. Warning: some network intermediaries (proxies, gateways, ...) involved in the transmission between the server and the client may seriously hurt or prevent this streaming experience from working (by buffering the data until the entire response is published by the server)
  • Comet: a generic term which covers both the long polling and the HTTP streaming techniques in order to support push-based interactions from the server. Third parties comet libraries usually support multiple techniques and fallback strategies to try a maximize cross-browser and cross server support 

At the end, there were lots of reasons why we couldn't stick with the former HTTP 1.x family... But rather than being negative here, let's instead detail the use cases and objectives for those new protocols and technologies

Indeed, the technologies we will describe below were mainly introduced for the following objectives:
  1. To speed up the interactions over the web by reducing the latency (cause responsiveness is nowadays a must for end-users, especially for phone users)
  2. To allow servers to natively push data to their clients (to support event-driven)
  3. To allow web clients and web servers to exchange more than HTML pages (the initial purpose of HTTP, remember?)
  4. To secure the interactions over the web

Ok then... thanks for the overview. But can we have a look at the new stuffs now?!?

Ok, ok: but just calm down pal...  Here they are:
  • WebSockets: Coming both as an IETF protocol and a W3C API, WebSocket is a technology allowing full-duplex communications channels over a single TCP connection. WebSocket is a transport layer built-on TCP, but offering an HTTP friendly upgrade handshake. As a client, you initiate an HTTP connection, before requesting the server to upgrade the session from HTTP to the WebSocket protocol. Once this bootstrapping is done: there is no more HTTP between the client and the server! Some old-fashion socket interactions to be handled instead, where client and server are equal peers with the same capability to send messages at any time, and to handle their reception asynchronously
  • Server-Sent events: W3C standardization of an API for opening an HTTP connection for receiving push notifications from a server (using a text/event-stream MIME type) in the form of DOM events. The browser API is called the EvenSource API, and is part of the so called HTML 5...
  • SPDY: Already implemented and available Google proposal to extend HTTP with a more efficient wire protocol (still based on TCP) but maintaining all the former HTTP semantics (encoding, headers, cookies, request/response). SPDY replaces some parts of HTTP, but mostly augments it. In other words: SPDY =  {HTTP 1.x headers and methods + google connection management & data transfer format}. The name SPDY captures speed, but also the idea of compression. Some SPDY implementation details:
    • Usage of TCP as the underlying transport layer, so requires no changes to existing networking infrastructure (to ease the WW deployment)
    • Multiplexed requests: many concurrent HTTP requests to run across one TCP session
    • Usage of TLS (over TCP) as the standard transport protocol. Ok, it will improves the security of our web exchanges. But as a nice side effect, It will help the transparent deployment of SPDY. Indeed, no one on the network path between the server and the client will have access, and be able to mess with the SPDY bits (protected by TLS tunneling) 
    • Prioritized requests: to prevent high priority requests from being blocked by non-critical resources
    • Compressed HTTP headers: to save latency and bandwidth for pages with tons of sub-requests
    • Server push: the capability for a server to push data to clients via the X-Associated-Content header (informing the client that the server is pushing a resource to the client before the client has asked for it)
    • Server hint: the capability for the server to suggest the client that it should ask for specific resources (done via the X-Subresources header)
  • HTTP 2.0: is the next planned version of the HTTP network protocol. Currently a working draft by the IETF (working group last call is scheduled for April 2014), HTTP 2.0 is derived from SPDY -used here as a starting point- and defines an upgrade handshake and data framing very similar to the WebSocket standard. The main idea of this draft is to provide asynchronous connection, multiplexing, header compression, request-response pipelining (with possible prioritized streams) while preserving a full backwards compatibility with the transaction semantics of HTTP 1.1. One major difference though, between HTTP 2.0 and SPDY, is that HTTP 2.0 won't force the usage of TLS

Hum, interesting... But what kind of impacts for our website implementation today?


I would say that using SPDY right now, until HTTP 2.0 is finished and implemented seems to be an interesting path

Indeed, SPDY is already supported by Chrome, Firefox and Opera on the client side (today), and by Apache web server, nginx, Jetty, Netty, Node.js, etc. on the server side. This interesting path has also already been taken by some tiny web actors such as Twitter, Wordpress, Facebook (you can Google them if you want to find out what are those companies ;-) and Google.* of course!

On the other hand, you'd better-as usual with web technologies- cover your back with fallback strategies. Cause it's most unlikely that corporate infrastructure components (such as proxies, reverse proxies, firewalls, etc) will be compliant soon with those HTTP 2.0 like (and of course WebSocket!) technologies. I know that's a shame, but it's a fact that classic corporate (like banks for instance) aren't innovative as web giants...

Anyway, some hints if you decided to rely on SPDY (and with HTTP 2.0 in the future)

You should not do domain sharding anymore to improve the end-user response time (sharding domains being the mechanism used to minimize HTTP 1.x round-trip times by parallelizing downloads across various hostnames). With stream multiplexing and request-response pipelining, this is not useful anymore (and worst: degrades the end-user experience). As a consequence, set a switch on your server saying: if it is through SPDY, don't bother with sharding.

Also, avoid image spriting (i.e. to combine multiple images within the same file to load them without having too much HTTP 1.x roundtrips), this is not useful anymore, and even counter-productive (likewise web domain sharding).

Last but not least: let's finish this post with some real web mechanical sympathy

As you may notice, the web is full of resources explaining how to configure your server components and how to design your web applications in order to rely efficiently on SPDY. But I'd like to end this post with an excellent reference: a QCON session from Roberto Peon -one of the inventor of SPDY- where he explains a lot about HTTP 1.x limitations, SPDY and HTTP 2.0. A very clear session, available here on infoQ (.pdf slides included).

I hope that this post clarified a little bit this jungle of terms and concepts. Time for us to speed up the end-user responsiveness of our web apps ;-)

Happy coding!

2 comments:

  1. Great overview and refresh.
    Regarding SPDY vs HTTP 2.0, I support using drop-in solution to benefits from immediate performance gain + start discovering the architecture patterns it support. But I would delay any (significant) architectural change until HTTP 2.0 reaches a mature stage.
    I feel sharding is still useful for load balancing purposes for example, but it is no longer required to ensure low latency for customer.

    ReplyDelete
  2. Also, remember that since SPDY requires TLS, it will render useless any added-value intermediate proxy (such as a cache).
    I understand it allows the protocol to remain compatible with existing infra, but it also make the stream hidden and not transparent, which potentially permit proprietary protocol extensions, which is not really good!

    ReplyDelete