Hi,
We are still frequently hitting the GF start-domain failure in our TCK
runs. A failure in one of the suites causes the entire job to
run for a long time. Has anyone found a solution for the
start-domain issue yet?
Regards,
Alwin
On 12/06/20 1:17 am, arjan tijms wrote:
Hi,
Indeed, --verbose only logs to the console and will hang
the current process. It doesn't seem there's a port in use (it
would explicitly complain about that). Here the startup code
just doesn't detect the server process to be running. This
could mean the detection somehow fails, or the process, in
fact, doesn't start.
Often, if the GF process doesn't start, there are errors
in the log, but that's not the case here. The logs are a little
hard to retrieve, so it would possibly be easier to cat them
to the main Jenkins log as soon as the script detects a
failure to start.
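A minimal sketch of that idea, not the actual CI script: wrap the start command and, if it fails, cat the server log to stdout so it lands in the Jenkins console. The function name start_or_dump_log and the GF_HOME, DOMAIN and SERVER_LOG variables are made up for illustration. The log path assumes GlassFish's default domain layout under the install directory seen in the trace, and the sketch assumes asadmin exits non-zero when start-domain fails.

```shell
#!/bin/sh
# Sketch only. Paths are assumptions: the install dir is taken from the
# asadmin path in the trace, the log path from the default domain layout.
GF_HOME="${GF_HOME:-/root/ri/glassfish6/glassfish}"
DOMAIN="${DOMAIN:-domain1}"
SERVER_LOG="${SERVER_LOG:-$GF_HOME/domains/$DOMAIN/logs/server.log}"

# Run the given command; if it fails, print the given log file to stdout
# (and so to the Jenkins console), then return the command's exit code.
# Usage: start_or_dump_log <log-file> <command> [args...]
start_or_dump_log() {
    log_file="$1"
    shift
    status=0
    "$@" || status=$?   # '||' keeps this safe under 'set -e'
    if [ "$status" -ne 0 ]; then
        echo "===== start command failed (exit $status), dumping $log_file ====="
        if [ -r "$log_file" ]; then
            cat "$log_file"
        else
            echo "(no readable log at $log_file)"
        fi
        echo "===== end of $log_file ====="
    fi
    return "$status"
}
```

In the test script this could then be called as:

    start_or_dump_log "$SERVER_LOG" "$GF_HOME/bin/asadmin" --user admin --passwordfile /root/admin-password.txt start-domain

Since the original exit code is passed through, the existing retry and failure handling should keep working unchanged.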
Do you have any idea what it could be? A
race condition would be unlikely, since it
keeps failing on the same node if repeated.
So maybe it's something related to the node,
but I'm not sure.
Would port-in-use errors show in the console? Or
do we need to start GlassFish with the --verbose
option to see errors like that?
We encountered the same issue with the
jakartaee-tck platform run yesterday too,
on a couple of the nodes. But it went
through on all the other >30 nodes.
+ /root/ri/glassfish6/glassfish/bin/asadmin --user admin --passwordfile /root/admin-password.txt start-domain
Picked up JAVA_TOOL_OPTIONS: -Xmx6G
Waiting for domain1 to start
........................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................ No response from the Domain
Administration Server (domain1) after 600 seconds.
The command is either taking too long to complete or the server has failed.
Please see the server log files for command status.
Please start with the --verbose option in order to see early messages.
Command start-domain failed.
There was a stop-domain failure in the
GlassFish CI some time back, which was
fixed by correcting the Docker image.
Regards,
Alwin
On 11/06/20 10:52 pm, arjan tijms wrote:
Hi,
I just noticed an old issue
with the CI has resurfaced.
Seemingly randomly, GlassFish
will fail to start up:
12:11:19 ===== TEST RUN - STARTING GLASSFISH AND DB =====
12:21:20 Waiting for domain1 to start
.......................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................
12:21:20 No response from the Domain Administration Server (domain1) after 600 seconds.
12:21:20 The command is either taking too long to complete or the server has failed.
12:21:20 Please see the server log files for command status.
12:21:20 Please start with the --verbose option in order to see early messages.
Repeating the script (within the same test run) never helps;
the test already does this automatically. However, the
exact same build does start up on other nodes.