Is there any way to ensure the HTTP/2 client/server interaction is truly multiplexed and asynchronous?
I'm excited about HTTP/2, not for its ability to improve performance in a browser, but for its ability to improve performance in a data center.
I personally think a single machine doing simple REST operations in a low-latency environment should be able to handle hundreds of thousands of requests per second. Unfortunately, that has not been my experience in practice.
I've done some prototyping and found that a bottleneck in HTTP/1.1 is the number of IP packets required. Even with multiple connections open, each request and each response needs at least one IP packet, so the rate of concurrent requests/responses is capped by the packet rate your machine can sustain. In my testing, that has been on the order of tens of thousands of packets/s, and reaching even that level saturates the CPUs, leaving no capacity for useful work.
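To put rough numbers on that (illustrative figures, not measurements from my tests): if a machine tops out at about 50,000 packets/s and every HTTP/1.1 request and every response costs at least one packet, the hard ceiling is roughly 50,000 / 2 = 25,000 request/response pairs per second, no matter how many connections you open.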
HTTP/2 uses a single TCP connection per origin, which allows multiple requests and responses to be carried as frames within a single TCP/IP packet. That should be able to push the request rate to hundreds of thousands, or even millions, per second.
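To make concrete what I mean by "multiplexed", here's a minimal sketch in Go (purely illustrative, with a placeholder URL): it fires many concurrent requests at one HTTPS origin, relying on the standard library's transport to negotiate HTTP/2 via ALPN and coalesce all of them onto a single TCP connection, and checks resp.Proto to confirm the negotiated protocol.

```go
// Sketch: issue many concurrent GETs against one origin. When the server
// supports HTTP/2, Go's default transport multiplexes all of these requests
// over a single TCP connection. The URL is a placeholder, not a real endpoint.
package main

import (
	"fmt"
	"io"
	"net/http"
	"sync"
)

func main() {
	const url = "https://example.com/" // hypothetical endpoint
	client := &http.Client{}           // default transport negotiates HTTP/2 over TLS

	var wg sync.WaitGroup
	for i := 0; i < 100; i++ {
		wg.Add(1)
		go func(n int) {
			defer wg.Done()
			resp, err := client.Get(url)
			if err != nil {
				fmt.Printf("request %d: %v\n", n, err)
				return
			}
			defer resp.Body.Close()
			io.Copy(io.Discard, resp.Body) // drain the body so the stream can be reused
			// resp.Proto reports "HTTP/2.0" when the request was multiplexed over h2
			fmt.Printf("request %d: %s %s\n", n, resp.Proto, resp.Status)
		}(i)
	}
	wg.Wait()
}
```

What I'm trying to verify is that all 100 requests really do share one connection and proceed asynchronously as independent streams, rather than being serialized or silently spread across multiple connections.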