Evidence still being gathered, suggestions gratefully received.
Versions: 5.1.3.201810200350-r, 5.2.1.201812262042-r, 5.3.0.201903130848-r, 5.1.0.201809111528-r, 5.1.3.201810200350-r
Scenario: 6 different deploys in java 11 containers (mesos)
We have used jgit for years for a library that we use to poll github for chhanges to repos successfully.
In this program every minute we re pull and check for changes, and pass affected files list and shas up to the caller. This is done in a single threaded scheduled executor.
We are finding that after several days. the synchronization stops. Neither the scheduled thread nor jgit-workqueue look "suspicious"
In fact they look very benign.
"JGit-WorkQueue" - Thread t@67
java.lang.Thread.State: WAITING
at jdk.internal.misc.Unsafe.park(Native Method)
- parking to wait for <5e502c44> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:194)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2081)
at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:1170)
at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:899)
at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1054)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1114)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at java.lang.Thread.run(Thread.java:834)
Locked ownable synchronizers:
- None
config-update-0" - Thread t@87
java.lang.Thread.State: WAITING
at jdk.internal.misc.Unsafe.park(Native Method)
- parking to wait for <79fc6f28> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:194)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2081)
at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:1170)
at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:899)
at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1054)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1114)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at java.lang.Thread.run(Thread.java:834)
Locked ownable synchronizers:
- None
However, heavy debugging shows we go into this code and never "return"
LOG.trace("Configuration of credentials completed, setting remote {}", remoteIndex);
pull.setRemote("remote" + remoteIndex);
PullResult result = pull.call();
LOG.trace("Got result {}", result);
(The first log shows, but not the second)
Additional debugging - a custom ProgressMonitor shows beginTask is called, and then no further
A try/catch/throwable around the code gleans no obvious failure of the thread itself.
Additional: We've turned on full net debugging and seen odd finalizer errors
avax.net.ssl|WARNING|03|Finalizer|2019-03-29 12:26:57.882 UTC|SSLSocketImpl.java:494|SSLSocket duplex close failed (
"throwable" : {
java.net.SocketException: Socket is closed
at java.base/java.net.Socket.shutdownInput(Socket.java:1521)
at java.base/sun.security.ssl.BaseSSLSocketImpl.shutdownInput(BaseSSLSocketImpl.java:216)
at java.base/sun.security.ssl.SSLSocketImpl.shutdownInput(SSLSocketImpl.java:651)
at java.base/sun.security.ssl.SSLSocketImpl.bruteForceCloseInput(SSLSocketImpl.java:606)
at java.base/sun.security.ssl.SSLSocketImpl.duplexCloseOutput(SSLSocketImpl.java:566)
at java.base/sun.security.ssl.SSLSocketImpl.close(SSLSocketImpl.java:479)
at java.base/sun.net.www.http.HttpClient.closeServer(HttpClient.java:1058)
at java.base/sun.net.www.http.ClientVector.put(KeepAliveCache.java:252)
at java.base/sun.net.www.http.KeepAliveCache.put(KeepAliveCache.java:125)
at java.base/sun.net.www.protocol.https.HttpsClient.putInKeepAliveCache(HttpsClient.java:658)
at java.base/sun.net.www.http.HttpClient.finished(HttpClient.java:406)
at java.base/sun.net.www.http.KeepAliveStream.close(KeepAliveStream.java:99)
at java.base/sun.net.www.MeteredStream.justRead(MeteredStream.java:93)
at java.base/sun.net.www.MeteredStream.skip(MeteredStream.java:154)
at java.base/sun.net.www.http.KeepAliveStream.close(KeepAliveStream.java:89)
at java.base/sun.net.www.MeteredStream.finalize(MeteredStream.java:209)
at java.base/java.lang.System$2.invokeFinalize(System.java:2117)
at java.base/java.lang.ref.Finalizer.runFinalizer(Finalizer.java:87)
at java.base/java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:171)}
Additional: We've tried turning off TLS 1.3 (openjdk has known issues) with -Djdk.tls.client.protocols=TLSv1,TLSv1.1,TLSv1.2
and verified only TLSv1.2 is being used with tcpdump.
--
Michael Bell
Principal Software Engineer / Architect
1 Montgomery Tower, Suite 700
San Francisco, CA 94104
(415) 317-9559
mbell@xxxxxxxxxxxxx
data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAJoAAAAoCAIAAACJn3Q+AAAAGXRFWHRTb2Z0d2FyZQBBZG9iZSBJbWFnZVJlYWR5ccllPAAABw1JREFUeNrsm09MVFcUxhUxRHkYIjUwxDQ24rjTiK40EVd0obiAhUpSVkRT66IVa1cGkYSkVGhXNhBXNvXPQrsQF3YjY0JXMi3dSWtijGEghpbIIAEb6Y855vT2zXvPN8MMQ+k9IZPHnfPuO/d85zvn3DszaxcWFtZYWS1SZF1g4bSyQqU4h3PNxkdexUdmh395nZj4a3xCxzfs2VUSrdlQu3tj7e6iMsc6PX+ydum188108s9bd14O3Dch9E4FTqlTd2Bza8v6SJV1/UqEc+rm7cmr194kZzK6q/xYY0Vri2XqCoITUo53ds88/CnLLF9VWd19qSS63WJQeDjB8vnptrnfniypDXNKKy+cJ/1aGAoJZ06wVNl6pYcuySJRMDiffXQqV1gKR7de6bVZtzD7ThqfHGK5yPXkzERnt0WiAHC+Toz/cfVazo0gPuiQLRjLDWc+sFTSU5ItHkuU4oyo+fLej3myg5SbfDi06fCH79QcHh6+fv06r8nkIvzRaPTQoUMnTpwoKytbHpe1tbXFYrEABczDqoB7HccZHBz0VBgdHW1ububi7NmzcpEvOGdiQ3l109TNO8FwTk9P9/T0DAwMuNaP4MH+/n4/J+ZWMCNrBXlLAjH43gCdUHCS66THKXIczz4zf9TUCooNAUdF6ViqsHhiOYAWOZSGhgYzmDRJwDkuSBLLE1VBcL68d//F11f0rG7Dnl3V3Zdcng3f0JYfayw/3rg+UgU8pFBz5mB5FR/xO1W4e/euYllXV0fWqq6ulnFgFp8yyL8S4zgat+JcLiTk5V/P7C0XLgVuBCGeMjY2lkgkFKqGlIjOyZMn4/G4PHrv3r3pc4Z5rnljAGuxx1NfFihPWYQzGRua6PzK1Jj9+dfnp9ve/67vn5H4SEgst3z6cfnxpreNVplD/izZUfOs5VRIgvrBSS6ViyNHjly8eNHkys6dO3EriOJ04GTkxo0b6IM6I+oFsMHpJrcoYEylac1UkBrGiMymt9TW1hI9AXUa7I8ePWqOMAlPocCbg/v27dPrSCSCQgCorIXEo3aijw2SA/pTIkGDzmJnC3s8PQtlMyZ7VaViqULq3nS4PuR5k5+PhB/icde7rES7BpMWNB2KpSTkjo4O7UG4OHfunFmiTAUtciaWCFx0jaSn/fQRvB9wC0tjUaaprhIDYOa06BO++ITrR48e4RMAFoOL8KDfB1uvExOZwunHrdKDoQ5m50Z/91twcOLS0JZFmsxob2+/fPkyrFLvyIVSHAWcQifpSgMqsLyvr49XjYOAJWBhe0q4RWs59rsMI8dgFWogIZCj7xnHEj0sBIUHDx5wo+iLnXiDa54icOby4+sVKJrlwBsGJ1IixUbiHW8yIq7BZQy6WIKnJAKYQTKkH43M2KLMazrRiJRi7womlCU5e25+tF3AjOGU6CSSh5gH80BdjCymvJEhPQm6PlKZqfsow1s+O+2xyXkYapNTEq3xHJcQ1rKfTlDNsabLJHj1QrwsiVT1FUvffOM4ZuGUxidAKN5k7PAew+Aw08ZTkr6lYV1mJ7HITgBIfNHu9uyO7WE29S4hLKZu3naVz7nRJyE3OX67FNYsNJI6ZC5AMCatvbNLNNOdxgeuNLsSF35ZiLBEazzovhMqLSUBQqqn43OlDe+NCgWv8sLn6RsVUy/8Z1gvvvmWouvaqIS8lxjye4viL1EPwwhMz40KIJmNq7krB3J1K2rKWlypJ0pQdom7RvOsSjtkTziJP2nfsF/g1AgTefz4sTnIQvCA7lgYd+Whf+07IaJz8EDwMQK+Drn1nLp1h78s3LHRP2jwjkZ6LCV+zDCF3lWaIPUp/4ojJMXhSmbG+3IgAKtwtKenwojmBmYmheB6v+PA3t5e4pLHqWGuQORGLGGQWoCRqImdEjHAybvpRafITHRQkD+/jx5DbjayFsIl+NtDoCV9nWeFo/HzJJZZdWTHprNpV4mCsCqZkqWsQi0EreCjXZO4WC7ki6ZEdzgAhp2S/wGVCcU8zwYCWeeqQwGyrmJzdpwLKe990urXCr3Fu6SENhUGsJjJycn5+XlZf1NTU1dX17Zt28ykJ80OFJSiK3kYNfUCszHCq8yGApPjDfEmak+fPq2oqJBjCs3MCwsLqO3fv98ckcllZszDMO7llanOnDkD15kHI3mWIMFT5LBJ9hhiv1pVX18vjyYbV6SEkemUCC8xwFyIKZl9G2GisztPJ7dFTukHP3yfq+/26VkJlA1zhLZqJLPPOze3tuTJDvs9zQLASbOaD0SpmulHg1aykOIsaDQTG8rtV78qL5zP7ar0vLuAn1UVROwXM//HyVa3NGAQsOUPz8vIlx0WywKzUzlqf9SweuB8ewBkf3K0muBcY38QuMrgVLE/111VcFr5T3a2ViycVpZD/hZgAP9XlxwC+KCNAAAAAElFTkSuQmCCAAAAAAAAAAA=