deadlock in ServerPushManager (RAP 3.20) ? [message #1856435] Fri, 09 December 2022 18:12
(Non-responsive requests to the ServerPushServiceHandler)
Gunnar Adams
Hi,
we are investigating an issue in our Eclipse RAP-based web application, which only occasionally (but still too often) occurs for one of our customers.
Mainly when the load on the server is rather high, it will happen for even multiple users at the same time (different HttpSessions).
The customer uses two JBOSS EAP 7.3 servers with httpds in front (all on the same machine) and mod_jk (AJP13 protocol). mod_jk in either httpd is configured to issue one retry after 5 minutes to the same JBOSS and then fail over to the JBOSS on the other machine.
This failover will then send the request to the other JBOSS, which knows nothing about the current HttpSession/UISession, resulting in a 403 error (which is not the issue here).

We were now able to pinpoint the problem using log output we added to the framework's ServerPushServiceHandler.service() method, which does not send back a response (with status code 200, interestingly) until either
1. the HTTP session times out (after 30 minutes) or
2. another request comes in to the ServerPushManager for the same cid (UISession)
So, due to 2., we see the retry request coming from the mod_jk load balancer terminating the first request, and due to 1., we find the response to that second request coming after exactly 30 minutes.

So I guess that affected user sessions are stuck in the ServerPushManager.processRequest() method, which uses the result of the canReleaseBlockedRequest() method to determine whether it can issue the response. Obviously this works in the case of the session invalidation and of another request coming in on a different thread.
I understand that the ServerPushServiceHandler's response to the GET request only indicates to the JavaScript framework that a request to the LifeCycleServiceHandler can be used to get an actual update.

I see that the mustBlockCallBackRequest() method checks the hasRunnables variable, which seems to be the only other way to trigger the response. The web application actively triggers updates to the browser via timer events, and a user action should also keep the session alive. Nevertheless, at times of high load, the session occasionally gets stuck: we see the request coming in, but the response is not sent for more than 5 minutes, rather than after at most 1 minute as several of our mechanisms should ensure.

Do you maybe have an idea on how to solve this problem or can explain those details of the PushSession to me?

We start the PushSession early-on in the application (from createUI()) and keep it running over that whole user session.

Our code uses calls to Display.syncExec(), Display.asyncExec() and Display.timerExec(). We also made sure in our application (by triggering a call to an empty client-side JavaScript function) that an update is/should be sent to the browser at least every 60 seconds. This was implemented early on to avoid request timeouts imposed by the web server.

We also checked the thread pools in the JBOSS server. They should be set to high-enough values (several hundreds of Threads). We also confirmed using thread dumps that no obvious blocking threads existed (except for the ones looping/waiting in the ServerPushManager).

Thank you very much for your insights.

Best regards
Gunnar

[Updated on: Mon, 12 December 2022 20:39]

Re: deadlock in ServerPushManager (RAP 3.20) ? [message #1856536 is a reply to message #1856435] Wed, 14 December 2022 10:51
Ivan Furnadjiev
Hi,

it's really hard to suggest anything here. Debugging a deadlock without a way to reproduce it is really hard.
Did you observe the same deadlock in a different servlet container? I remember similar problems in Jetty some years ago when the thread responsible for processing the request was blocked by the container.

Regards,
Ivan
Re: deadlock in ServerPushManager (RAP 3.20) ? [message #1856560 is a reply to message #1856536] Thu, 15 December 2022 10:24
Gunnar Adams
Hi Ivan,
thank you very much for your reply.
We may have found the root cause for our problem:

Because the ServerPush session does not seem to impose a limit on how long the request may take to complete, we added to our application a timer-based execution of a JS function call using client scripting. This "nop()" function was scheduled to be executed at a fixed rate using a ScheduledThreadPoolExecutor.
We think that the problem is caused by too many of those keep-alive timer tasks queueing up on too few threads (the pool size was limited to 4) to be handled in time. We were using one global ScheduledThreadPoolExecutor for all timer-based tasks of all the user sessions.
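
The starvation effect described above can be reproduced outside RAP. The following is a minimal sketch, not the application's code: the class name SharedPoolStarvationDemo and the method observedKeepAliveDelayMs are made up for illustration, and the durations are scaled down to milliseconds. It occupies every worker of a small ScheduledThreadPoolExecutor and then measures how late a short keep-alive task actually fires.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ScheduledThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class; none of these names come from RAP or the application.
final class SharedPoolStarvationDemo {

    /**
     * Keeps all poolSize workers busy for busyMs, then schedules a keep-alive
     * that is supposed to fire after intendedDelayMs.
     * Returns the delay that was actually observed, in milliseconds.
     */
    static long observedKeepAliveDelayMs(int poolSize, long busyMs, long intendedDelayMs) {
        ScheduledThreadPoolExecutor pool = new ScheduledThreadPoolExecutor(poolSize);
        try {
            CountDownLatch workersBusy = new CountDownLatch(poolSize);
            // Occupy every worker thread, as slow tasks of other sessions would.
            for (int i = 0; i < poolSize; i++) {
                pool.execute(() -> {
                    workersBusy.countDown();
                    sleepQuietly(busyMs);
                });
            }
            workersBusy.await();

            long scheduledAt = System.nanoTime();
            long[] firedAt = new long[1];
            CountDownLatch fired = new CountDownLatch(1);
            // The keep-alive becomes due after intendedDelayMs, but it can only
            // run once one of the busy workers is free again.
            pool.schedule(() -> {
                firedAt[0] = System.nanoTime();
                fired.countDown();
            }, intendedDelayMs, TimeUnit.MILLISECONDS);
            fired.await();
            return TimeUnit.NANOSECONDS.toMillis(firedAt[0] - scheduledAt);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while measuring", e);
        } finally {
            pool.shutdownNow();
        }
    }

    private static void sleepQuietly(long ms) {
        try {
            Thread.sleep(ms);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

Calling observedKeepAliveDelayMs(4, 400, 50), for example, should report a delay close to the 400 ms busy time rather than the intended 50 ms, which mirrors keep-alives intended for 55 seconds firing only after many minutes.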

The special load-balancing setup at the customer site then added complexity: because the timer tasks were executing not after 55 seconds as intended but after more than 10 minutes, the load balancer in the mod_jk component would issue a retry request on its own. This woke up the ServerPushManager, which discarded the previous request, but mod_jk did not forward that response (status 200 in this case) because it already assumed the request to be non-responsive. Nevertheless, that request thread would now be freed up again.

The retry request would also be stuck in the ServerPushManager for more than 5 minutes, until the httpd mod_jk component finally decided to retry the request, this time against a different JBOSS, which has no knowledge of the HttpSession / JSESSIONID. This then resulted in the 403 response, because the UISession was not found.
Our application is able to run several different UISessions in different browser tabs. So, while the failure was caused by the session in which the user was not doing anything, the session in which the user was still working used the same HttpSession (JSESSIONID cookie) and would also be moved to the wrong JBOSS. The user would notice that immediately, as the application became unresponsive.

It seems that an unexpected 403 response to a POST request is not handled by the RWT framework. We just see an exception in the browser console when the framework tries to interpret an undefined object as a non-JSON response.

I now have modified our code in two ways:

1. We are using now multiple ScheduledThreadPoolExecutors, essentially one per user session with a thread pool of 2.

2. I am testing a configurable timeout for the ServerPushManager, which will respond to the server push request after (depending on configuration) 60 seconds / 90 seconds / ... (= multiples of the 30-second wait time in the loop).
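
To make the idea in item 2 concrete, here is a simplified, standalone model of a blocking loop with an upper bound on the blocking time. It is a sketch under assumptions and not RAP's ServerPushManager: the class TimedPushGate, its fields, and its methods are hypothetical names, and only the general wait/notify pattern with a periodic check interval is taken from the description in this thread.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical model of a long-poll gate; not part of the RAP code base.
final class TimedPushGate {
    private final Object lock = new Object();
    private final long checkIntervalMs;
    private final long maxBlockMs;
    private boolean hasRunnables;

    TimedPushGate(long checkIntervalMs, long maxBlockMs) {
        if (checkIntervalMs <= 0 || maxBlockMs <= 0) {
            throw new IllegalArgumentException("intervals must be positive");
        }
        this.checkIntervalMs = checkIntervalMs;
        this.maxBlockMs = maxBlockMs;
    }

    /** Called from a background thread when UI work is pending. */
    void signalRunnables() {
        synchronized (lock) {
            hasRunnables = true;
            lock.notifyAll();
        }
    }

    /**
     * Blocks the calling (request) thread until work is pending or maxBlockMs
     * has elapsed. Returns true if released by pending work, false on timeout.
     */
    boolean awaitRelease() {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(maxBlockMs);
        synchronized (lock) {
            try {
                while (!hasRunnables) {
                    long remainingMs = TimeUnit.NANOSECONDS.toMillis(deadline - System.nanoTime());
                    if (remainingMs <= 0) {
                        return false; // upper bound reached: answer the request anyway
                    }
                    // Wake up at least once per check interval, never past the deadline.
                    lock.wait(Math.min(checkIntervalMs, remainingMs));
                }
                // Simplification of this model: the signal is consumed here.
                hasRunnables = false;
                return true;
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
    }
}
```

With new TimedPushGate(30_000, 60_000), a blocked request would be answered after roughly 60 seconds at the latest, which stays below the 5-minute retry of the mod_jk setup described above. In this model the client is expected to simply issue the next push request after a timeout response.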

As it is not uncommon for request timeouts to exist in web servers, I would like your opinion on whether something like 2. could be added to the RAP framework, or whether what I am trying to do here has an actual downside or may cause some kind of problem with the way the ServerPushSession is supposed to work.
Because our web application acts as a frontend for a backend server that can essentially trigger updates to the browser at any time, I see no other way than to have the server push session running over the whole duration of a session.

Best regards,
Gunnar

Re: deadlock in ServerPushManager (RAP 3.20) ? [message #1856695 is a reply to message #1856560] Fri, 23 December 2022 07:32
Ivan Furnadjiev
Hi,

to release the blocked ServerPush request you can simply add an empty asyncExec from a background thread at a fixed rate.
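
A minimal sketch of that suggestion could look as follows. To keep the example self-contained, it takes the posting function as a Consumer<Runnable> instead of depending on RAP directly; the class name PushKeepAlive and its constructor parameters are hypothetical. In a RAP application one would presumably pass display::asyncExec for the session's Display, with one instance per UI session.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

// Hypothetical helper; the Consumer<Runnable> stands in for Display.asyncExec(Runnable).
final class PushKeepAlive implements AutoCloseable {
    private static final Runnable EMPTY = () -> { };

    // One dedicated daemon thread per instance, so that keep-alives of one
    // session cannot be starved by tasks queued for other sessions.
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor(r -> {
                Thread t = new Thread(r, "push-keepalive");
                t.setDaemon(true);
                return t;
            });

    PushKeepAlive(Consumer<Runnable> asyncExec, long period, TimeUnit unit) {
        scheduler.scheduleAtFixedRate(() -> {
            try {
                // Posting an empty runnable is enough to mark UI work as pending.
                asyncExec.accept(EMPTY);
            } catch (RuntimeException e) {
                // For example, the display was disposed: stop posting.
                scheduler.shutdown();
            }
        }, period, period, unit);
    }

    @Override
    public void close() {
        scheduler.shutdownNow();
    }
}
```

The instance should be closed when the UI session ends, so that the background thread does not outlive the session.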

Best regards,
Ivan