Thanks Jonah for the update, we will try the fix on our side and update!
> Please include text as text rather than (just) a screenshot so that I can copy the text.
Sorry for that.
Thanks,
Vishnu
From: Jonah Graham <jonah@xxxxxxxxxxxxxxxx>
Sent: 14 October 2024 21:13
To: Vishnu Sarath <vishnu.sarath@xxxxxxxxxxx>
Cc: CDT General developers list. <cdt-dev@xxxxxxxxxxx>; Nipin P <nipin.p@xxxxxxxxxxx>; Vinod Appu <vinod.appu@xxxxxxxxxxx>
Subject: Re: DSF views not getting updated after having intense UI updates for sometime
That looks like it could be the issue! If the runInSwtThread method returns (normally or exceptionally) while runnables is not null, then needsPosting will never be true when enqueue is called, so fDisplay.asyncExec will never be called again!
The simple fix is to log the error and keep going by using a safe runner.
    /** @since 2.3 */
    protected void runInSwtThread() {
        Runnable runnable;
        while ((runnable = getNextRunnable()) != null) {
            var finalRunnable = runnable;
            SafeRunner.run(new ISafeRunnable() {
                @Override
                public void run() throws Exception {
                    finalRunnable.run();
                }
            });
        }
    }
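To make the failure mode concrete, here is a minimal, self-contained model of the queueing logic. It is a sketch under assumptions, not the CDT source: ModelExecutor, drain() and the posts counter are hypothetical names, the drain runs synchronously instead of going through fDisplay.asyncExec, and a plain try/catch stands in for SafeRunner. Only enqueue, getNextRunnable and needsPosting mirror the names discussed above.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

/** Simplified stand-in for the executor's queueing logic; not the CDT class. */
class ModelExecutor {
    private final boolean safe;              // true: isolate each runnable's failure
    private LinkedList<Runnable> runnables;  // null means "no drain is scheduled"
    int posts;                               // how often a drain was "posted"
    final List<Throwable> logged = new ArrayList<>();

    ModelExecutor(boolean safe) {
        this.safe = safe;
    }

    /** Returns true when the caller has to post a new drain. */
    synchronized boolean enqueue(Runnable r) {
        boolean needsPosting = false;
        if (runnables == null) {
            runnables = new LinkedList<>();
            needsPosting = true;
        }
        runnables.add(r);
        return needsPosting;
    }

    private synchronized Runnable getNextRunnable() {
        Runnable r = runnables.poll();
        if (r == null) {
            runnables = null; // fully drained, the next enqueue posts again
        }
        return r;
    }

    /** Stand-in for runInSwtThread(). */
    void drain() {
        Runnable r;
        while ((r = getNextRunnable()) != null) {
            if (safe) {
                try {
                    r.run();
                } catch (RuntimeException e) {
                    logged.add(e); // log and keep going
                }
            } else {
                r.run(); // an exception here leaves runnables non-null
            }
        }
    }

    /** Models execute(): the drain is run inline instead of via asyncExec. */
    void execute(Runnable r) {
        if (enqueue(r)) {
            posts++;
            try {
                drain();
            } catch (RuntimeException e) {
                logged.add(e); // models the UI event loop logging the error
            }
        }
    }
}

class ExecutorStallDemo {
    /** Runs one failing task, then two normal ones; returns how many of the two ran. */
    static int runAfterFailure(boolean safe) {
        ModelExecutor ex = new ModelExecutor(safe);
        int[] ran = {0};
        ex.execute(() -> { throw new IllegalStateException("boom"); });
        ex.execute(() -> ran[0]++);
        ex.execute(() -> ran[0]++);
        return ran[0];
    }

    public static void main(String[] args) {
        System.out.println("unsafe drain: later tasks run = " + runAfterFailure(false));
        System.out.println("safe drain:   later tasks run = " + runAfterFailure(true));
    }
}
```

In this model the unsafe variant never runs anything after the first exception, because the queue field stays non-null and nothing posts a new drain. That matches the symptom that only an IDE restart helps, assuming the executor instance outlives the debug session.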
The safe runner should mean that things like restarting the debug session fix things, but it may still leave the current session broken. Therefore the bug in AbstractCachingVMProvider needs some attention too. I don't know if this NPE is a side effect of bad usage of the provider, or a bug in the provider itself, but there certainly seems to be a race condition. This code snippet (lines 1284-1292 of AbstractCachingVMProvider) shows pretty clearly that fProperties is unexpectedly being changed:
    // We are caching the result of this update. Copy the properties from the update
    // to the cached properties map.
    if (entry.fProperties == null) {
        entry.fProperties = new HashMap<>((getData().size() + 3) * 4 / 3);
        if (update.getProperties().contains(PROP_CACHE_ENTRY_DIRTY)) {
            entry.fProperties.put(PROP_CACHE_ENTRY_DIRTY, entry.fDirty);
        }
        // this is the line with NPE, so between the new HashMap a few lines above
        // and here AbstractCachingVMProvider.flush() is being called
        entry.fProperties.put(PROP_UPDATE_STATUS, new PropertiesUpdateStatus());
    }
Since you have a regular manual refresh of the UI, that is probably the source of the race condition.
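The interleaving suspected in the snippet can be modelled deterministically. This is only an illustrative sketch: ModelEntry, fillViaField, fillViaLocal and the "update_status" key are hypothetical, and flush() here simply nulls the field, which is an assumption about what the real flush does to the entry. The second variant shows one possible way to avoid the NPE (populate a local map and publish it with a single write). It has not been tried against AbstractCachingVMProvider and does not explain why the flush interleaves in the first place.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for a cache entry; not the CDT class. */
class ModelEntry {
    volatile Map<String, Object> fProperties;

    /** Assumed behaviour of a flush: the cached properties are dropped. */
    void flush() {
        fProperties = null;
    }

    /** Check-then-act on the field: fails if a flush lands between the steps. */
    void fillViaField(Runnable betweenSteps) {
        if (fProperties == null) {
            fProperties = new HashMap<>();
            betweenSteps.run();                      // models the interleaved flush()
            fProperties.put("update_status", "ok"); // NPE if the field was nulled
        }
    }

    /** Populate a local map first, then publish it with one field write. */
    void fillViaLocal(Runnable betweenSteps) {
        if (fProperties == null) {
            Map<String, Object> props = new HashMap<>();
            betweenSteps.run();
            props.put("update_status", "ok");
            fProperties = props;
        }
    }
}

class FlushRaceDemo {
    static boolean fieldVariantThrows() {
        ModelEntry entry = new ModelEntry();
        try {
            entry.fillViaField(entry::flush);
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    static boolean localVariantThrows() {
        ModelEntry entry = new ModelEntry();
        try {
            entry.fillViaLocal(entry::flush);
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("field variant throws NPE: " + fieldVariantThrows());
        System.out.println("local variant throws NPE: " + localVariantThrows());
    }
}
```

Even if something like the local-variable variant hides the NPE, finding out which caller triggers the flush concurrently still seems like the more important part of the fix.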
Please file an issue and create a PR with a fix once you have tested it. I have not run my change in runInSwtThread to see if it is correct.
PS Please include text as text rather than (just) a screenshot so that I can copy the text.
Attaching another log when we received the same issue.

Thanks,
Vishnu
Hi Jonah,
We have gone through the logs when we had this issue and saw an exception related to SimpleDisplayExecutor. Could you please have a look at the below screenshot.

And as always, thank you for the quick responses from your side 😊!
Thanks,
Vishnu
> Do you think Display.asyncExec being called multiple times by other threads could cause something similar?
I don't think there is an issue with the Display.asyncExec implementation, but there could be a bug in SimpleDisplayExecutor or in some other global state. I am not sure there is an easy way to find all the static data the DSF UI uses, but from your symptoms it seems likely that one of those fields is in a bad state.
Hopefully some long runs with tracing on (and perhaps a faster refresh interval?) will expose the case.
1) Are there any exceptions in the log? If the Display.asyncExec call in SimpleDisplayExecutor throws an exception, AFAICT that may break the implementation, so hopefully there would be a log entry in this case.
[Vinod] Nothing seen at first. We are trying to reproduce it, but the issue does not always occur.
2) You mentioned restarting Eclipse is required to restore correct behaviour. What about just closing/opening the affected views?
[Vinod] No, no change.
Do you think Display.asyncExec being called multiple times by other threads could cause something similar?
~Vinod
Hi Vishnu,
That sounds concerning, and it also sounds like it will be hard to identify where that fails. I hope you can find a way to get to the failure state faster than 15 mins.
I recommend you try running with various traces on as that should hopefully help identify where it is getting stuck. See screenshots below for these options files:
I would also do a thread dump once in the failed state (either pausing in the IDE or using jstack - visualvm may be helpful too?) to see if some secondary threads are deadlocked.
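If attaching jstack or visualvm is awkward, a thread dump can also be taken from inside the JVM with the standard java.lang.management API. The sketch below is generic Java and not CDT-specific; ThreadDumpSketch is a hypothetical helper name, and it only reports on the JVM it runs in, so it would have to be invoked from within the IDE process (for example from a small test plug-in).

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

class ThreadDumpSketch {
    /** Returns the ids of deadlocked threads, or an empty array when none are found. */
    static long[] deadlockedThreadIds() {
        ThreadMXBean bean = ManagementFactory.getThreadMXBean();
        long[] ids = bean.findDeadlockedThreads(); // null when no deadlock is detected
        return ids == null ? new long[0] : ids;
    }

    /** Builds a textual dump of all live threads in this JVM. */
    static String dumpAllThreads() {
        ThreadMXBean bean = ManagementFactory.getThreadMXBean();
        StringBuilder sb = new StringBuilder();
        // true, true: also report locked monitors and ownable synchronizers
        for (ThreadInfo info : bean.dumpAllThreads(true, true)) {
            sb.append(info); // toString() gives name, state and a stack excerpt
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println("deadlocked threads: " + deadlockedThreadIds().length);
        System.out.println(dumpAllThreads());
    }
}
```

Note that ThreadInfo.toString() truncates long stacks, so jstack remains the better tool when full stack traces are needed.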
Unfortunately I don't know where the problem may be, but I would start by looking at org.eclipse.cdt.dsf.ui.concurrent.SimpleDisplayExecutor to see whether things are being queued and not run, or whether they aren't being queued at all. The reason I recommend starting around this class is that very little state is preserved between sessions, and org.eclipse.cdt.dsf.ui.concurrent.SimpleDisplayExecutor.fExecutors is one of the few pieces that is.
The other couple of questions I would look to answer:
1) Are there any exceptions in the log? If the Display.asyncExec call in SimpleDisplayExecutor throws an exception, AFAICT that may break the implementation, so hopefully there would be a log entry in this case.
2) You mentioned restarting Eclipse is required to restore correct behaviour. What about just closing/opening the affected views?
Hi All,
I am facing an issue: DSF views stop getting updated after the UI thread has been used intensely for some time.
Some background on what is happening in my test setup:
* We have multiple views being refreshed at intervals of 1 s and 2 s (including Trace Compass views) during a CDT debug session.
* After running in this setup for around 15 minutes, none of the DSF views (such as the Debug view, Variables view, Registers view and other views which we extended from Variables) are working; the VMNodes are not getting updated. Non-DSF views are working fine.
* We can see that the debug operations (like Resume and Suspend) are still sending the proper commands to GDB, so I assume the DSF session executor thread is working fine (and of course the main (UI) thread too).
* Even if we terminate and restart the debug launch, the issue persists (none of the DSF views are getting updated). To recover, we need to restart the IDE.
I am a little confused about where to start with the analysis. Has anyone encountered a similar issue? I would really appreciate it if anyone could share some thoughts on which area I should be focusing on (or debugging) while root-causing the issue.
Thanks,
Vishnu