While reading https://grpc.io/blog/deadlines the other day, I was thinking about optimizing resource utilization and improving availability in my applications. Here is what I came up with: it always makes sense to handle some sort of deadline, especially in IO-intensive applications where one business function might consist of N different HTTP/database calls, likely with retries. Since the timings of all external parties can vary (for a bunch of different reasons), we can't enforce exact timeouts for every party. Without a concept like that, our load can be significantly amplified, because the client (an application, or a user with the almighty F5) will come back a moment later asking us to do the same thing again (and the request will very likely start queuing, then the client forgets about it and retries once more, and so on).
Which brings us to the first question: since Jetty itself manages all worker threads, do you think it would make sense to control this deadline concept from inside Jetty? Is it possible to handle it with already existing abstractions? If so, would you mind pointing me to them?
Next, we should somehow define the point in time after which we can perform our termination operation (throw an exception, stop the thread, etc.). I see a few possibilities here:
1) Since it's our application, we know our expected SLA best, so we can keep a Map of URI -> Timeout, look up the timeout, add it to the current timestamp, store the result in a ThreadLocal, check it at stop points, and terminate when it is exceeded.
2) We can ask the client to send a header with the timestamp at which it will stop waiting for the response, read it in an inbound filter, and then follow the option 1 mechanism.
3) We can stamp the request with a timestamp on the balancer/reverse proxy and then follow the option 2 mechanism.
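To make the mechanism concrete, here is a minimal sketch of what I have in mind, in plain Java with no Jetty types. Everything in it is hypothetical: the `RequestDeadline` class, the paths and budgets in the SLA table, and the idea that the client's deadline arrives as epoch milliseconds (options 2 and 3 would parse that value from a header whose name is still to be decided). It also assumes client and server clocks are reasonably in sync, which may not hold in practice.

```java
import java.time.Duration;
import java.util.Map;
import java.util.concurrent.TimeUnit;

/** Hypothetical helper: keeps a per-thread deadline and checks it at stop points. */
final class RequestDeadline {
    /** Thrown by a stop point once the deadline has passed. */
    static final class DeadlineExceededException extends RuntimeException {
        DeadlineExceededException(String message) { super(message); }
    }

    // Option 1: assumed SLA table, URI path -> time budget (values are made up).
    private static final Map<String, Duration> BUDGETS = Map.of(
            "/orders", Duration.ofMillis(500),
            "/reports", Duration.ofSeconds(5));
    private static final Duration DEFAULT_BUDGET = Duration.ofSeconds(1);

    // Absolute deadline of the current thread on the System.nanoTime() scale.
    private static final ThreadLocal<Long> DEADLINE_NANOS = new ThreadLocal<>();

    static Duration budgetFor(String path) {
        return BUDGETS.getOrDefault(path, DEFAULT_BUDGET);
    }

    /** Option 1: call on request entry, e.g. from an inbound filter. */
    static void start(String path) {
        startAt(System.nanoTime() + budgetFor(path).toNanos());
    }

    /** Options 2 and 3: the deadline comes from outside, capped by our own SLA. */
    static void startFromEpochMillis(String path, long deadlineEpochMillis) {
        long externalBudget = TimeUnit.MILLISECONDS.toNanos(
                deadlineEpochMillis - System.currentTimeMillis());
        long budget = Math.min(externalBudget, budgetFor(path).toNanos());
        startAt(System.nanoTime() + budget);
    }

    static void startAt(long deadlineNanos) {
        DEADLINE_NANOS.set(deadlineNanos);
    }

    /** Remaining budget; Long.MAX_VALUE when no deadline is set on this thread. */
    static long remainingNanos() {
        Long deadline = DEADLINE_NANOS.get();
        return deadline == null ? Long.MAX_VALUE : deadline - System.nanoTime();
    }

    /** Stop point: call before each HTTP/database call or retry. */
    static void checkpoint() {
        if (remainingNanos() <= 0) {
            throw new DeadlineExceededException("request deadline exceeded");
        }
    }

    /** Must run when the request finishes, because worker threads are pooled. */
    static void clear() {
        DEADLINE_NANOS.remove();
    }
}
```

The filter would call `start(...)` (or `startFromEpochMillis(...)`) before delegating and `clear()` in a `finally` block, and `remainingNanos()` could also be used to shrink the timeout of each downstream call.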
Option 1 seems to be the most convenient from an architectural and security standpoint, but there is one nasty problem: the Filter executes after the request has been popped from the queue, and we don't know how long it was held there. If the server is suffering under increased load, these tail requests (from the queue) are most likely 'dead' as well: no client still needs them completed, and it would be extremely convenient to just drop them and process fresh ones.
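To illustrate the behaviour I am after, here is a rough sketch against a generic `java.util.concurrent` executor, not against Jetty. The `StaleTaskDropper` class and its `maxQueueMillis` parameter are hypothetical names of mine. The task is stamped when it is handed to the pool, and when a worker finally picks it up it is skipped if it has waited too long.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical wrapper that drops tasks which spent too long in the queue. */
final class StaleTaskDropper {
    /** Number of tasks skipped because they were already stale. */
    final AtomicInteger dropped = new AtomicInteger();
    private final long maxQueueNanos;

    StaleTaskDropper(long maxQueueMillis) {
        this.maxQueueNanos = TimeUnit.MILLISECONDS.toNanos(maxQueueMillis);
    }

    /** Call at submission time; the enqueue timestamp is captured here. */
    Runnable stamp(Runnable task) {
        final long enqueuedAt = System.nanoTime();
        return () -> {
            // Runs on the worker thread, after the task left the queue.
            long waited = System.nanoTime() - enqueuedAt;
            if (waited > maxQueueNanos) {
                dropped.incrementAndGet(); // nobody is waiting for this one anymore
                return;
            }
            task.run();
        };
    }
}
```

With an ordinary executor this would be used as `executor.execute(dropper.stamp(task))`. In Jetty's case the timestamp would have to be taken wherever the request is handed to the thread pool, and I don't know of a hook at that point.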
This leads us to the second question: is there any abstraction that executes before queue insertion or immediately after successful insertion (so we are sure the request doesn't exceed pool capacity and we still have a reference to it), and/or that can augment the request, or whatever else is suitable, so that we can retrieve that information later in the filter?