Re: [glassfish-dev] Issues with Jenkins

Hi Scott, 

Sorry, I forgot to send this answer yesterday. Today (Wednesday) one build passed; it contained changes in the Jenkinsfile: moved the "hungry" docs to GH Actions, added vmstat logging, increased timeouts.

---
Thank you. I don't know the Jenkins backend here, so thanks for pointing out what to monitor. However, we first noticed this problem perhaps a month ago, and its frequency kept increasing until it became unbearable. Today (Tuesday) not a single build has passed on Jenkins since the morning (CEST = UTC+2).

Regarding the TCK: I would not say moving, rather splitting, so we would not automatically test a TCK snapshot against a GF snapshot; instead each side would test its own changes against a solid release.
So in effect I would no longer break TCK10 builds with changes in 7.1.0-SNAPSHOT. Once Jenkins CI is healthy again, hopefully soon, I hope to fix the rest before the end of July (platform is already fixed).

I am not sure exactly how many resources we would need for TCK11 and 12, but I am sure TCK11 has much lower requirements than TCK10; you all did a lot of work there. Also, the TCK doesn't always need GF, so there is more freedom on both ends. By the way, I am already running TCK10 vs GF7 on a private Jenkins or locally, just using the commands documented here: https://github.com/eclipse-ee4j/glassfish/wiki/Jakarta-EE-10-TCK-Tests
For TCK11, we are not there yet, but I will get there soon ...

More resource packs would be nice; however, I am afraid that we cannot get more than 13 for now.

-- 
David Matejcek | OmniFish
david.matejcek@xxxxxxxxxxx

On 7/15/25 19:45, Scott Marlow wrote:
I looked at https://status.redhat.com, which shows a (database) problem mentioned today, but I am not sure how that impacts GlassFish testing.  There is also a history of prior incidents listed.  Eclipse CI has its own infrastructure layer as well.  Continuing the discussion in https://gitlab.eclipse.org/eclipsefdn/helpdesk/-/issues/6369 is the right path to follow, but sometimes it helps to know if there is an OpenShift outage.

In previous years we talked about moving GlassFish TCK testing to the GlassFish project which I think would require increasing the number of resource packs assigned to the GlassFish project.  If that seems important now, please start a new GlassFish thread for discussing how to do that.  Perhaps that discussion could be part of a new thread about EE 12 TCK testing for GlassFish.  

Scott

On Tue, Jul 15, 2025 at 10:46 AM David Matejcek via glassfish-dev <glassfish-dev@xxxxxxxxxxx> wrote:

Hi all, 

As you have probably noticed over the last two weeks or even more, we are struggling with Jenkins performance, which drops significantly and seemingly at random.

The slowdown is not entirely random, as I found once I started watching it; however, at this moment I don't see any simple solution that works without reducing automated quality assurance.

What we could do:

Preferred Solution
Backup Alternative 1
  • We will change the Jenkinsfile, so it will stop running expensive parts
    • Compiling asciidoc documentation
      • The most frequently killed part of the build
      • I tried to improve JVM settings, tried to reproduce issues from Jenkins, but nothing fixed that.
      • Documentation should be up to date and should pass the processing.
    • "Ant based zombie army", as I have been calling it for several years
      • Obsolete, slow, expensive part of the build, however covering some critical features
      • Has probably the highest score of "new bug kills"
      • Is slowly being converted to the application-tests Maven module, which is much faster. However, our conversion tempo is slow - help is welcome!
Backup Alternative 2
  • We can temporarily configure Jenkinsfile to mark failed builds just as warnings.
  • Then reviewers could rerun builds locally and report the result, or builds can be re-executed on Jenkins too, which is much slower.
Backup Alternative 3
  • Move from Jenkins to GitHub Actions
  • I am already rejecting it, but just for the record
    • The problem is that on GHA we have just a 4-thread CPU for the build, which is not suitable for integration testing.
    • It is even slower than Jenkins CI.
    • GHA applies some fairness policy, so jobs can be suspended for a while if they "eat" too much.
Combination

As I am thinking about it now, we can probably also combine all three alternatives:

  1. Move the compilation of the documentation to GitHub Actions.
  2. Temporarily configure Jenkinsfile to mark failures as warnings.
  3. Reviewers then decide how to interpret failed steps and what to do with that.
  4. Jenkins CI admins will fix that without blocking us.
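
Step 2 could be sketched roughly like this, assuming a declarative pipeline. This is only an illustration: the agent, the stage name and the Maven command are placeholders, not what our Jenkinsfile really contains.

```groovy
// Hypothetical fragment of a declarative Jenkinsfile.
// Agent, stage name and the shell command are placeholders.
pipeline {
  agent any
  stages {
    stage('expensive-tests') {
      steps {
        // If the wrapped step fails, the stage and the build are marked
        // UNSTABLE (a warning) instead of FAILURE, and the pipeline
        // continues with the following stages.
        catchError(buildResult: 'UNSTABLE', stageResult: 'UNSTABLE') {
          sh 'mvn -B verify'
        }
      }
    }
  }
}
```

Reviewers would then see a yellow build with the failing stage highlighted and could decide whether to rerun it locally. Reverting this later means just removing the catchError wrapper.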

What do you think about it? Did I forget something? Yeah, maybe I could check the Jenkins plugins to see if we can add some configuration, e.g. for monitoring the performance ...

-- 
David Matejcek | OmniFish
david.matejcek@xxxxxxxxxxx
_______________________________________________
glassfish-dev mailing list
glassfish-dev@xxxxxxxxxxx
To unsubscribe from this list, visit https://www.eclipse.org/mailman/listinfo/glassfish-dev