Re: [jakartaee-tck-dev] [glassfish-dev] Tracking usage data for EE4J working group CI cloud systems
On 10/1/20 10:48 AM, Ed Bratt wrote:
> I wasn't intending to point any fingers at the stability observations
> you made -- only to observe that we want to focus on improving the
> reliability so that we don't need to rely on re-runs, or waits, or other
> fixes that only treat the symptoms.

I agree 100%! I don't think that we need to make the memory tuning
changes just yet, as they will not improve reliability.

> One day, I'd like to see that we can
> initiate several test runs simultaneously -- and they all reliably
> complete -- perhaps taking more clock-time, but reliably and
> consistently repeating.

+1

> Ideally, we could fill this compute pipeline with running and waiting
> tasks and be confident that the system will reliably produce consistent
> results. It sounds like, for some unknown cause, we aren't there yet.
> Getting to the root cause of this would be my priority (and we don't
> know if that's a GlassFish issue, an infrastructure issue or even
> something else). But I'm not actually doing the work, so it's just my
> opinion.

+1

Part of the original discussion on
https://github.com/eclipse-ee4j/glassfish/issues/23191 suggested adding
the --verbose --debug options. I tried them on my local machine, where I
can reproduce a similar symptom that could be the same as the one on CI,
and I see `org.glassfish.flashlight.MonitoringRuntimeDataRegistry not found by
org.glassfish.main.admin.monitoring-core`, as shown in
https://gist.github.com/scottmarlow/eec11ca74b99d021346b270fa29ce4fa
(which has the full server output, including the exceptions).
I'm thinking that we could try to reproduce the failure on Jenkins with
the --verbose --debug options so that we see the actual cause on
Jenkins. Perhaps it will be the same exception call stack as I see locally.
I don't know the code involved in the
https://gist.github.com/scottmarlow/eec11ca74b99d021346b270fa29ce4fa
exception call stack well enough to debug it and find the cause. If you open
the gist, please search for `java.lang.ClassNotFoundException:
org.glassfish.flashlight.MonitoringRuntimeDataRegistry not found by
org.glassfish.main.admin.monitoring-core`. It looks to me like it could be a
startup race condition, but I really don't know.
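
To compare a Jenkins run against the local output, the same search could be scripted. This is only a minimal sketch: it assumes the server output has been saved as text and that the message appears on a single line in the form quoted above. The function name `find_missing_classes` and the regular expression are hypothetical helpers, not part of GlassFish or the TCK.

```python
import re

# Assumed message shape, taken from the text quoted in this thread:
#   java.lang.ClassNotFoundException: <class> not found by <bundle>
PATTERN = re.compile(
    r"java\.lang\.ClassNotFoundException:\s+(?P<cls>[\w.$]+)"
    r"\s+not found by\s+(?P<bundle>[\w.\-]+)"
)


def find_missing_classes(log_text):
    """Return (line_number, class_name, bundle) for each matching log line."""
    hits = []
    for number, line in enumerate(log_text.splitlines(), start=1):
        match = PATTERN.search(line)
        if match:
            hits.append((number, match.group("cls"), match.group("bundle")))
    return hits
```

Running it over the output of a local run and a Jenkins run and comparing the reported class and bundle names would show whether both fail on the same missing class.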
Scott