thanks for the first feedback!
Yes, I’m sure the OutOfMemory occurs because of the sessions that never get garbage-collected; we lose a few kB with every opened session. Our web apps have an end-user-friendly, JMX-like monitoring view, and there I can see it pretty clearly. Our apps usually run for months and years without interruption, and we have no other memory-leak issues there.
No, up to now I have not been able to reproduce it - otherwise I would already have tackled it myself.
One more thing that should not, but might, interfere: the app keeps track of active sessions. It implements the HttpSessionListener interface, and our WebApp (extending ServletContextHandler) calls addEventListener() for itself and for the session manager to capture sessionCreated() and sessionDestroyed() and update its own session list. For some time it receives the events; later on it does not anymore.
I usually use that in our monitoring view to verify whether new changes solve the issue. Again, for some time I see sessions disappearing, and then they just remain: the view shows that maxInactiveInterval was set to something reasonable, yet the sessions stick around for days without any further usage/access.
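The bookkeeping behind that session list is nothing fancy. Reduced to a self-contained sketch (plain String ids instead of the real HttpSessionEvent callbacks; class and method names here are illustrative, not our actual code):

```java
import java.time.Instant;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Illustrative sketch of the session-tracking bookkeeping. In the real app
// these two methods sit behind HttpSessionListener's sessionCreated() /
// sessionDestroyed() callbacks registered via addEventListener().
class SessionTracker {
    private final Map<String, Instant> activeSessions = new ConcurrentHashMap<>();

    public void sessionCreated(String sessionId) {
        activeSessions.put(sessionId, Instant.now());
    }

    public void sessionDestroyed(String sessionId) {
        // This is the callback that stops firing once the scavenger hangs.
        activeSessions.remove(sessionId);
    }

    public int activeCount() {
        return activeSessions.size();
    }
}
```

When sessionDestroyed() stops arriving, activeCount() only ever grows, which is exactly what the monitoring view shows.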
WebApp / Session setup:
bdm.Server creates DefaultSessionIdManager and HouseKeeper as described in my previous mail
bdm.Server scans for .war-files and creates a bdm.WebApp extending ServletContextHandler
bdm.WebApp reads WEB-INF/web.xml, which contains one servlet with load-on-startup, one servlet filter, one listener-class, and the following session config,
which is applied to the WebApp’s getSessionHandler().getCookieConfig()
On a war-file change, the app is stopped, removed from the contexts, and a new one is added; this happens every few weeks. The hanging scavenger occurs regardless, even if we don’t update/redeploy any apps.
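For completeness, the wiring from the steps above, condensed into a sketch against the Jetty 9.4 API (the war scanning, web.xml parsing and our bdm.* wrapper classes are omitted; port, context path, interval and cookie setting are illustrative values, not our actual config):

```java
import org.eclipse.jetty.server.Server;
import org.eclipse.jetty.server.session.DefaultSessionIdManager;
import org.eclipse.jetty.server.session.HouseKeeper;
import org.eclipse.jetty.servlet.ServletContextHandler;

// Sketch of the server/session setup described above (Jetty 9.4 API).
public class ServerSetupSketch {
    public static void main(String[] args) throws Exception {
        Server server = new Server(8080);

        // DefaultSessionIdManager + HouseKeeper (the scavenger),
        // as described in the previous mail.
        DefaultSessionIdManager idManager = new DefaultSessionIdManager(server);
        server.setSessionIdManager(idManager);

        HouseKeeper houseKeeper = new HouseKeeper();
        houseKeeper.setSessionIdManager(idManager);
        houseKeeper.setIntervalSec(600); // scavenge period, illustrative
        idManager.setSessionHouseKeeper(houseKeeper);

        // One context per deployed .war (bdm.WebApp extends ServletContextHandler).
        ServletContextHandler webApp =
                new ServletContextHandler(ServletContextHandler.SESSIONS);
        webApp.setContextPath("/app");
        // The <session-config> from web.xml ends up on the cookie config, e.g.:
        webApp.getSessionHandler().getSessionCookieConfig().setHttpOnly(true);
        server.setHandler(webApp);

        server.start();
        server.join();
    }
}
```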
The app has a config option to change the session expiration, which is set to 3 hours by default. It’s applied when the session is created, via httpSession.setMaxInactiveInterval(inSeconds). As mentioned, in production we usually have up to 6 hours; for development/testing I usually change it to 1 or 2 minutes.
I can easily obtain heap dumps of machines right after startup, when scavenging works, and after some time, when it does not. I do see that the sessions are referenced by the session manager and by our session list, but by nothing else. How does that help me?
I am familiar with profilers and use them on a regular basis. I think debugging, or temporarily adding more debug info to the dump output, is more suitable in the given case. What/where should I look to gain any insights?
Are you sure that the OOM happens after the scavenger dies and not that the scavenger dies because of OOM?
I would connect to it with jconsole or similar tool and keep looking at the running threads and the memory and see when things start to go wrong.
Also turn up session logging to debug level to capture more info.
Also try with a profiler and see if you can grab a heap dump to analyse.
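A minimal sketch of how the session logging could be turned up programmatically, assuming Jetty 9.x with its built-in StdErrLog (with an SLF4J backend you would configure that backend instead; the system property -Dorg.eclipse.jetty.server.session.LEVEL=DEBUG achieves the same at startup):

```java
import org.eclipse.jetty.util.log.Log;
import org.eclipse.jetty.util.log.Logger;

// Enable DEBUG output for Jetty's session classes only (Jetty 9.x logging API).
public class EnableSessionDebug {
    public static void main(String[] args) {
        Logger sessionLog = Log.getLogger("org.eclipse.jetty.server.session");
        sessionLog.setDebugEnabled(true);
    }
}
```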
Can you reproduce in a test environment?
Can you tell me what all of the session related configuration is, including in your web.xml, and anything you do in code, and I'll see if I can reproduce.
jetty-dev mailing list
jetty-dev@xxxxxxxxxxx
To change your delivery options, retrieve your password, or unsubscribe from this list, visit