[jgit-dev] More reliable StreamCopyThread flush() method

Hello JGit devs!

I've just spent two days debugging a fetch over SSH from GitHub that hangs.
The hanging thread has a stack trace like this:

  java.lang.Thread.State: WAITING
      at java.lang.Object.wait(Object.java:-1)
      at java.io.PipedInputStream.read(PipedInputStream.java:326)
      at java.io.PipedInputStream.read(PipedInputStream.java:377)
      at java.io.FilterInputStream.read(FilterInputStream.java:133)
      at org.eclipse.jgit.util.io.TimeoutInputStream.read(TimeoutInputStream.java:112)
      at org.eclipse.jgit.util.IO.readFully(IO.java:246)
      at org.eclipse.jgit.transport.PacketLineIn.readLength(PacketLineIn.java:186)
      at org.eclipse.jgit.transport.PacketLineIn.readString(PacketLineIn.java:138)
      at org.eclipse.jgit.transport.PacketLineIn.readACK(PacketLineIn.java:102)
      at org.eclipse.jgit.transport.BasePackFetchConnection.negotiate(BasePackFetchConnection.java:655)
      at org.eclipse.jgit.transport.BasePackFetchConnection.doFetch(BasePackFetchConnection.java:356)
      at org.eclipse.jgit.transport.BasePackFetchConnection.fetch(BasePackFetchConnection.java:301)
      at org.eclipse.jgit.transport.BasePackFetchConnection.fetch(BasePackFetchConnection.java:291)
      at org.eclipse.jgit.transport.FetchProcess.fetchObjects(FetchProcess.java:247)
      at org.eclipse.jgit.transport.FetchProcess.executeImp(FetchProcess.java:160)
      at org.eclipse.jgit.transport.FetchProcess.execute(FetchProcess.java:122)
      at org.eclipse.jgit.transport.Transport.fetch(Transport.java:1138)

It was waiting for a response from GitHub but never got one, so it
failed with a timeout exception. It got no reply because not all of
the data had been sent to GitHub.

That happens because the flush() method in StreamCopyThread doesn't work
reliably with JSch (a fine piece of engineering). The actual write to
the network happens in the com.jcraft.jsch.Session write() method, which
catches InterruptedException in several places and doesn't restore
the thread's interrupted status. So if someone calls
StreamCopyThread.flush() while the thread is inside dst.write(), chances
are the flush will be lost and some data won't be sent.
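
To illustrate the failure mode, here is a minimal standalone sketch. It
does not use JGit or JSch: LostFlushDemo, swallowingWrite() and the two
flushSeen...() methods are hypothetical names, and it assumes the flush
request reaches the copy thread as a Thread.interrupt(), as the
description above implies. The flag-based variant is only one way to make
the request survive and is not necessarily what the proposed change does.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicBoolean;

class LostFlushDemo {
    // Hypothetical stand-in for a write() that swallows interrupts.
    static void swallowingWrite(CountDownLatch started) {
        started.countDown();
        try {
            Thread.sleep(60_000); // pretend to block on the network
        } catch (InterruptedException e) {
            // Swallowed: the interrupted status is NOT restored here.
        }
    }

    // Starts the copier, sends the "flush" via the given action, waits for exit.
    static void runAndFlush(Thread copier, CountDownLatch started, Runnable flush) {
        copier.start();
        try {
            started.await();
            flush.run();
            copier.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    // Flush signalled only by interrupt: the request is lost.
    static boolean flushSeenViaInterrupt() {
        CountDownLatch started = new CountDownLatch(1);
        AtomicBoolean seen = new AtomicBoolean();
        Thread copier = new Thread(() -> {
            swallowingWrite(started);
            // The write already consumed the interrupt, so this is false.
            seen.set(Thread.interrupted());
        });
        runAndFlush(copier, started, copier::interrupt);
        return seen.get();
    }

    // Flush recorded in a flag; the interrupt is only a wake-up hint.
    static boolean flushSeenViaFlag() {
        CountDownLatch started = new CountDownLatch(1);
        AtomicBoolean flushRequested = new AtomicBoolean();
        AtomicBoolean seen = new AtomicBoolean();
        Thread copier = new Thread(() -> {
            swallowingWrite(started);
            // The flag survives even though the interrupt was swallowed.
            seen.set(flushRequested.getAndSet(false));
        });
        runAndFlush(copier, started, () -> {
            flushRequested.set(true); // record the request first
            copier.interrupt();       // then wake the copier
        });
        return seen.get();
    }

    public static void main(String[] args) {
        System.out.println("interrupt only:   " + flushSeenViaInterrupt());
        System.out.println("flag + interrupt: " + flushSeenViaFlag());
    }
}
```

The first line printed is false and the second is true: once the write
has swallowed the InterruptedException, the only trace of the flush
request is whatever state was recorded outside the interrupt flag.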

Please take a look at the possible fix at
https://git.eclipse.org/r/#/c/54324/ and let me know if there is a
better way to fix the problem.

Regards,
Dmitry