Hi Martin,
Thanks for the suggestion. We have been battling the git cloning speed for a while. Just this week I found the root cause, and a workaround has been running for a little while now. You can follow progress on this issue in Bug 560283.
We haven't had a single failure due to git since implementing an earlier version of the workaround described in Comment 18. I will be rolling out Comment 18 everywhere soon.
The summary of the underlying problem is that the container in which the git fetch was done is very limited: just 0.1 CPU and 256M RAM. With that little CPU and RAM the fetch sometimes simply fails, which is not surprising given that the pack file itself is > 200M and git likes memory a lot!
The workaround for the Jenkins bug (JENKINS-30600) is to use sh to do the git fetch in the main container, which has >1 CPU and >1G RAM. When the Kubernetes cluster is (presumably) not overloaded, the full-depth fetch takes < 30 seconds (see the git fetch stage in a recent build).
Finally, we can't do a shallow fetch on the check_cleanliness job, the one that was failing most, as that job needs the last commit time for each plug-in to generate and check version numbers.
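To illustrate why full history matters for that job, here is a minimal Python sketch (not the actual check_cleanliness script) that asks git for the time of the newest commit touching one plug-in directory. The function names and the example path are hypothetical; only the `git log -1 --format=%ct -- <path>` invocation is standard git. In a shallow clone the commit that last touched a plug-in may lie outside the fetched history, in which case git would typically report the shallow boundary commit instead.

```python
import subprocess
from typing import List, Optional


def last_commit_time_cmd(plugin_dir: str) -> List[str]:
    """Build the git command that prints the committer time (Unix seconds)
    of the newest commit touching plugin_dir."""
    return ["git", "log", "-1", "--format=%ct", "--", plugin_dir]


def last_commit_time(repo: str, plugin_dir: str) -> Optional[int]:
    """Run the command inside repo; return None if no commit touches the path."""
    out = subprocess.run(
        last_commit_time_cmd(plugin_dir),
        cwd=repo, check=True, capture_output=True, text=True,
    ).stdout.strip()
    return int(out) if out else None
```

For example, `last_commit_time(".", "some/plugin.dir")` (a made-up path) would return the Unix timestamp that a version qualifier could be derived from.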
----
That said - there have been a LOT of timeouts and slowdowns. This seems to have gotten much worse recently. I have been considering how to run far fewer tests in the gerrits - it is incredibly frustrating to have a gerrit that spends 1+ hour running tests that are irrelevant to the change and, worst of all, then reports unstable or worse!
For me it is very motivating that other people are interested in this, and I appreciate your feedback, such as the suggestion to change some of the gerrit plug-in settings. I think I will change the gerrit jobs so that the CDT UI and CDT Other ones don't run when they are useless (for example on cmake or terminal changes; the EPP project has a clever script that I am going to adapt). In the unlikely case that a regression gets merged, the test failure will be reported by the master build and we can take corrective action.
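The idea of skipping the UI jobs for unrelated changes could look roughly like the Python sketch below. This is only an illustration under my own assumptions, not the EPP script: the path prefixes and the function name are hypothetical, and the real job would need to get the list of changed files from git or Gerrit.

```python
from typing import Iterable

# Hypothetical path prefixes whose changes cannot affect the CDT UI / CDT Other tests.
UNRELATED_PREFIXES = ("cmake/", "terminal/")


def needs_full_tests(changed_files: Iterable[str]) -> bool:
    """Return True unless every changed file lies in an unrelated area."""
    files = list(changed_files)
    if not files:
        # Be conservative: if the change list is unknown or empty, run everything.
        return True
    return any(not f.startswith(UNRELATED_PREFIXES) for f in files)
```

A change touching only `cmake/...` files would skip the long test jobs, while a change that also touches anything else would still run them.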
Short summary - I don't want to see another unstable report on AutomatedIntegrationSuite.testBuildAfterSourcefileDelete or similar.
Please follow and contribute to this in Bug 499777 and its dependencies. I am just about to create one for running fewer tests, as I mentioned above.
Jonah