checkout arbitrary commit with JGit [message #1176826] Fri, 08 November 2013 15:27
R Shapiro
Messages: 386
Registered: June 2011
Senior Member
I would like to use JGit to check out a given repository clone at an arbitrary commit (yes, possibly leaving the repo in a detached-head state). I do not want to create a new branch.

I've tried all sorts of variants of the checkout command, looked at web docs, done the usual google searching. Still no joy.

Here's one example I tried:

git.checkout().setStartPoint(commit).setAllPaths(true).setCreateBranch(false).call();


where commit is the RevCommit instance I want to checkout.

This will run without errors but does nothing like a checkout. Instead it ends up marking every file as modified.


Obviously I'm misunderstanding the API -- I know JGit can do what I want, since EGit can do it.

What am I missing?
Re: checkout arbitrary commit with JGit [message #1176838 is a reply to message #1176826] Fri, 08 November 2013 15:39
Christian Halstrick
Messages: 106
Registered: July 2009
Senior Member
Just remove ".setAllPaths(true)". What you are doing with your code is a checkout that uses a pathspec (see the last variant in http://git-scm.com/docs/git-checkout), something like "git checkout <commit> -- ."

Ciao
Chris
Re: checkout arbitrary commit with JGit [message #1176986 is a reply to message #1176838] Fri, 08 November 2013 17:45
R Shapiro
I tried that variant earlier and got the error: "Branch name <null> is not allowed"

That was why I added setAllPaths(true) in the first place.

But now I think I get it: even though I told jgit not to create a branch it still requires me to provide a name.

So this works:


git.checkout().setStartPoint(commit).setCreateBranch(false).setName(commit.name()).call()


Thanks for the tip!
Re: checkout arbitrary commit with JGit [message #1178791 is a reply to message #1176986] Sat, 09 November 2013 22:40
Matthias Sohn
Messages: 578
Registered: July 2009
Senior Member
I think the CheckoutCommand should allow checking out a commit by specifying a RevCommit without the commit's name, since if you already have the RevCommit it's unnecessary overhead to render the RevCommit's name and reparse it in order to execute the checkout.
Re: checkout arbitrary commit with JGit [message #1179787 is a reply to message #1178791] Sun, 10 November 2013 14:48
R Shapiro
I agree that it's odd to require a name if a RevCommit is provided and no new branch is being made.

But I have a more serious problem now. I'm running checkout on a whole series of RevCommits. It chugs along fine for a while and then fails with "Checkout conflict". I know for sure there are no conflicts since I did exactly the same series of checkouts, in the same sequence, by launching a process that runs "git checkout" for each one:

String[] command = {
    "git",
    "checkout",
    commit.name()
};
ProcessBuilder builder = new ProcessBuilder(command);

etc.

This variant works fine.

So JGit is finding a checkout conflict where Git does not. The so-called conflict is apparently due to line-break differences. The repository configuration has core.autocrlf set to 'input' so I would have expected JGit to ignore line-breaks entirely when it's trying to determine whether or not a checkout is in conflict. As I understand it, the 'input' setting should leave the working files with unix line-breaks regardless of what's in any given commit.

For reference, this is jgit-3.1.0.201310021548
Re: checkout arbitrary commit with JGit [message #1179924 is a reply to message #1179787] Sun, 10 November 2013 16:53
Matthias Sohn
Are you using gitattributes for this repository?
Re: checkout arbitrary commit with JGit [message #1180069 is a reply to message #1176826] Sun, 10 November 2013 19:12
R Shapiro
Quote:
Are you using gitattributes for this repository?


We do use a .gitattributes file in one directory which contains some text files that need to preserve platform-specific line-breaks.

The file that JGit says is in conflict is an unrelated part of the tree.

[Updated on: Sun, 10 November 2013 19:27]


Re: checkout arbitrary commit with JGit [message #1180209 is a reply to message #1180069] Sun, 10 November 2013 21:23
Matthias Sohn
Can you share the repository or try to debug the problem?
Re: checkout arbitrary commit with JGit [message #1181200 is a reply to message #1180209] Mon, 11 November 2013 12:48
R Shapiro
Unfortunately I can't share the repo.

I'm not currently set up to debug JGit. I've been interested for some time in contributing but I never quite seem to get around to going through all the steps in the set-up procedure. Maybe I can debug just with a clone of the sources? I'll look into that.

In the meantime I'll see if I can construct a small generic repository that replicates the problem.
Re: checkout arbitrary commit with JGit [message #1181366 is a reply to message #1180209] Mon, 11 November 2013 15:01
R Shapiro
I can tell you a little more about the conflict but not much. I'm looking at jgit sources from tag 3.1.0.201310021548-r.

The conflict is flagged because WorkingTreeIterator#contentCheck() is returning true for the DirCacheEntry in question.
In other words getEntryObjectId().equals(entry.getObjectId()) is false.

I assume this happens due to DOS vs. Unix line-breaks.

contentCheck is being called because WorkingTreeIterator#isModified() thinks the DirCacheEntry is SMUDGED.

That's all I know so far.

Btw the @return javadoc for WorkingTreeIterator#contentCheck appears to be inverted.
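
To illustrate why a line-break difference alone would make the two object ids unequal, here is a minimal sketch that computes a Git blob id (SHA-1 over the header "blob <length>\0" plus the content) for the same text with Unix and DOS line-breaks. The class and method names are made up for this illustration and are not JGit API, and the CRLF-to-LF conversion is deliberately naive.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

class BlobIdSketch {
    /** Computes the id Git assigns to a blob: SHA-1 over "blob <length>\0" followed by the content. */
    static String blobId(byte[] content) {
        try {
            MessageDigest sha1 = MessageDigest.getInstance("SHA-1");
            sha1.update(("blob " + content.length + "\0").getBytes(StandardCharsets.US_ASCII));
            sha1.update(content);
            StringBuilder hex = new StringBuilder();
            for (byte b : sha1.digest()) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-1 is required by the Java platform", e);
        }
    }

    /** Naive CRLF -> LF conversion, for illustration only (hypothetical helper). */
    static byte[] toLf(byte[] content) {
        return new String(content, StandardCharsets.ISO_8859_1)
                .replace("\r\n", "\n")
                .getBytes(StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        byte[] unix = "line one\nline two\n".getBytes(StandardCharsets.US_ASCII);
        byte[] dos = "line one\r\nline two\r\n".getBytes(StandardCharsets.US_ASCII);
        System.out.println(blobId(unix).equals(blobId(dos)));       // false
        System.out.println(blobId(unix).equals(blobId(toLf(dos)))); // true
    }
}
```

So if the working tree file is hashed without first undoing the line-break conversion, the id cannot match the one stored in the index, even though the file is logically unchanged.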





Re: checkout arbitrary commit with JGit [message #1182665 is a reply to message #1181366] Tue, 12 November 2013 10:30
Christian Halstrick
When you are investigating why E/JGit reports a file as modified, you should inspect IndexDiffFilter.include() -> WorkingTreeIterator.isModified() -> WorkingTreeIterator.compareMetaData()

Very roughly the following happens when E/JGit does a "git status":

1  for each distinct path p we find in (workingtree, index, head)
2    if index(p).* differs from head(p).* mark p as dirty and continue loop
3    compare workingtree and index:
4      if index(p).assumeValid mark p as clean and continue loop
5      if index(p).isUpdateNeeded mark p as dirty and continue loop
6      if !index(p).isSmudged and index(p).length!=workingtree(p).length mark p as dirty and continue loop
7      if index(p).mode!=workingtree(p).mode mark p as dirty and continue loop
8      if index(p).lastModified != workingtree(p).lastModified do content comparison between index(p) and workingtree(p). report p as dirty/clean accordingly and continue loop
9      if index(p).lastModified == workingtree(p).lastModified and !index(p).isSmudged mark p as clean and continue loop
10     do content comparison between index(p) and workingtree(p). report p as dirty/clean accordingly


If you only work with E/JGit then this should be fine. E/JGit would check out ignoring any .gitattributes files. Even when the process forces E/JGit to do content checks, everything is fine.

Problems arise if you also use native git. E.g. you check out with native git (e.g. during an initial clone, pull, ...). Native git will modify the content before it writes to the working tree. This can even make the length of the file differ from the length recorded in the index. If you ask native git about the status, everything is OK: native git knows it modified the file while writing to the working tree and will do the reverse operation to find out whether a file is dirty or not. But if you ask E/JGit, things get problematic. Whenever E/JGit has to do the content check, it will report dirty files because of line 6 or 10.

One could think: even if native git does the checkout, wouldn't the algorithm above allow E/JGit to avoid a content check? Native git checks out and E/JGit should detect that the files have not been modified by looking purely at metadata (line 9). But line 6 and smudged entries ruin that.

Maybe we should move line 6 deeper down in the algorithm (after line 9). This may solve some of the problems. I'll try that.

But still: unless JGit learns about .gitattributes there will always be problems in this area.
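
The index versus working tree part of the numbered steps earlier in this post (lines 4 to 10) can be sketched roughly as follows. This is only an illustration of the described decision order: IndexEntry, WorkFile and isDirty are hypothetical stand-ins, not JGit classes, and the real logic lives in WorkingTreeIterator.

```java
import java.util.function.BooleanSupplier;

class StatusSketch {
    /** Hypothetical stand-in for an index entry; not a JGit type. */
    static final class IndexEntry {
        boolean assumeValid, updateNeeded, smudged;
        long length, lastModified;
        int mode;
    }

    /** Hypothetical stand-in for a working tree file; not a JGit type. */
    static final class WorkFile {
        long length, lastModified;
        int mode;
    }

    /**
     * Returns true if the path is considered dirty, following lines 4-10 of the list.
     * contentDiffers is only evaluated when the metadata alone is inconclusive.
     */
    static boolean isDirty(IndexEntry i, WorkFile w, BooleanSupplier contentDiffers) {
        if (i.assumeValid) return false;                       // line 4
        if (i.updateNeeded) return true;                       // line 5
        if (!i.smudged && i.length != w.length) return true;   // line 6
        if (i.mode != w.mode) return true;                     // line 7
        if (i.lastModified != w.lastModified)                  // line 8
            return contentDiffers.getAsBoolean();
        if (!i.smudged) return false;                          // line 9
        return contentDiffers.getAsBoolean();                  // line 10
    }

    public static void main(String[] args) {
        // Scenario from the post: same timestamp, but the file on disk has a
        // different length than the index records (e.g. CRLF on disk, LF in the index).
        IndexEntry index = new IndexEntry();
        index.length = 100;
        index.lastModified = 1_000L;
        index.mode = 0100644;
        WorkFile file = new WorkFile();
        file.length = 104;
        file.lastModified = 1_000L;
        file.mode = 0100644;
        // Line 6 fires before the timestamp check in line 9 is ever reached.
        System.out.println(isDirty(index, file, () -> false)); // true
    }
}
```

In this sketch, moving the length check of line 6 below line 9 would let the equal timestamps win in the scenario shown in main, which is the change proposed above.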









Ciao
Chris
Re: checkout arbitrary commit with JGit [message #1182915 is a reply to message #1182665] Tue, 12 November 2013 14:03
R Shapiro
Quote:
Problems arise if you also use native git. E.g. you check out with native git (e.g. during an initial clone, pull, ...)



That would explain it then. This specific repository includes a number of oversized commits which seem to be problematic for JGit. So I generally use EGit/JGit for local operations and also for push, while using Git for clone, pull and fetch. I also run 'git status' regularly.

Operationally this seems to work OK in a normal workflow: I can do checkouts with EGit without any issues. But given your description I can see how more exotic uses of checkout, like walking through an arbitrary list of commits and checking out each one in turn, could run into problems.

Thanks for the detailed explanation.
Re: checkout arbitrary commit with JGit [message #1182932 is a reply to message #1176986] Tue, 12 November 2013 14:18
Christian Halstrick
Regarding your initial question of how to check out a commit: don't use '.setStartPoint()' when you don't want to create a new branch. This is not described correctly, I agree. But we have tried to stick with the wording of the git man page. Look at http://git-scm.com/docs/git-checkout and search for "start_point" to see when a git checkout needs a "start_point": that's when creating a new branch, as in "git checkout -b my_new_branch HEAD~2". There you want to create a new branch my_new_branch, let it start from HEAD~2, and finally check out that new branch.

But in your case it's simpler:

git.checkout().setName(commit.name()).call()


If you checkout with JGit instead of native git you could also get around your problems with wrong status as I described before.


Ciao
Chris
Re: checkout arbitrary commit with JGit [message #1183130 is a reply to message #1182932] Tue, 12 November 2013 17:05
Matthias Sohn
I think we should add another method to simplify this to

git.checkout().setId(commit).call();
Re: checkout arbitrary commit with JGit [message #1183190 is a reply to message #1182932] Tue, 12 November 2013 18:00
R Shapiro
Quote:
git.checkout().setName(commit.name()).call()


Much nicer. It's not easy to deduce this from the API as described.

I attempted to do a new clone with jgit.sh but got an OutOfMemoryError (Java heap space).
I'm running it again with -Xmx2048m; it looks like it could be a while before it finishes. I'll follow up when it does.



Re: checkout arbitrary commit with JGit [message #1183252 is a reply to message #1183190] Tue, 12 November 2013 18:54
R Shapiro
Cloning with JGit is taking quite a long time. It quickly gets through the following:


Initialized empty Git repository in /private/tmp/xxx/.git
remote: Counting objects: 50398
remote: Compressing objects: 100% (9870/9870)
Receiving objects: 100% (50398/50398)
Resolving deltas: 100% (32026/32026)
Updating references: 100% (14/14)
remote: Total 50398 (delta 32026), reused 48736 (delta 30915)

At this point I see nothing further.

ctl-backslash shows the "main" thread sitting in a zip inflater, where it's been for the past 50+ minutes

at java.util.zip.Inflater.inflateBytes(Native Method)
at java.util.zip.Inflater.inflate(Inflater.java:256)
- locked <0x0000000121e45d98> (a java.util.zip.ZStreamRef)
at java.util.zip.InflaterInputStream.read(InflaterInputStream.java:152)
at java.util.zip.InflaterInputStream.skip(InflaterInputStream.java:208)
at java.io.BufferedInputStream.skip(BufferedInputStream.java:366)
- locked <0x000000016adc8068> (a java.io.BufferedInputStream)
at org.eclipse.jgit.lib.ObjectStream$Filter.skip(ObjectStream.java:199)
at org.eclipse.jgit.util.IO.skipFully(IO.java:330)
at org.eclipse.jgit.internal.storage.pack.DeltaStream.seekBase(DeltaStream.java:329)
at org.eclipse.jgit.internal.storage.pack.DeltaStream.read(DeltaStream.java:213)
at org.eclipse.jgit.internal.storage.pack.DeltaStream.read(DeltaStream.java:214)

Re: checkout arbitrary commit with JGit [message #1183436 is a reply to message #1183252] Tue, 12 November 2013 21:38
Matthias Sohn
You probably need to increase "core.packedGitLimit", the size of jgit's cache for buffering IO on pack files, and, if your repository contains large files, also increase "core.streamFileThreshold" to some value larger than the largest file you expect. This may require that you also increase the max heap size.
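
For reference, a sketch of how these two settings might look in the repository's .git/config. The numbers are placeholders only and have to be sized to the repository and to the heap given to the JVM.

```
[core]
	packedGitLimit = 1g
	streamFileThreshold = 500m
```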
Re: checkout arbitrary commit with JGit [message #1184409 is a reply to message #1183436] Wed, 13 November 2013 12:30
R Shapiro
Unfortunately still no luck after setting core.streamFileThreshold to 500m and core.packedGitLimit to 1g. Heap size in the sh script is 2048m. The clone hangs in the same place.

There are big binary files, in multiple versions, but no one file is any bigger than 150m. None of these gain anything by compression but I'm not clear how to turn it off in jgit.




Re: checkout arbitrary commit with JGit [message #1184576 is a reply to message #1184409] Wed, 13 November 2013 14:54
Christian Halstrick
Since we cannot try out the repo it's hard to give good hints. Have you seen http://dev.eclipse.org/mhonarc/lists/jgit-dev/msg00693.html? Here Shawn has more hints about config parameters which help JGit.

Ciao
Chris
Re: checkout arbitrary commit with JGit [message #1185686 is a reply to message #1184576] Thu, 14 November 2013 07:44
Matthias Sohn
Was this repository packed / GCed with jgit? Since jgit doesn't understand gitattributes yet, it has no reliable way to detect that your big binary files are binary. At the moment jgit uses the following heuristic to detect binary files: if a file contains a null byte in the first 8kB, jgit assumes it's a binary file. If one of your big binaries happens not to match this, jgit will try to compute deltas, which may lead to problems. You could try what Shawn proposed at the end of the above-mentioned mail on the list.
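
The heuristic described above can be sketched in a few lines. The class name, method name and the exact prefix size are assumptions made for this illustration (the post only says "8kB"); this is not the code jgit actually runs.

```java
class BinaryHeuristicSketch {
    /** Size of the inspected prefix; taken from the "8kB" in the post, the real constant may differ. */
    static final int PREFIX = 8 * 1024;

    /** Returns true if a NUL byte occurs within the first PREFIX bytes of the data. */
    static boolean looksBinary(byte[] data) {
        int limit = Math.min(data.length, PREFIX);
        for (int i = 0; i < limit; i++) {
            if (data[i] == 0) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        byte[] earlyNul = new byte[100];            // all zero bytes
        byte[] lateNul = new byte[20_000];
        java.util.Arrays.fill(lateNul, (byte) 'a');
        lateNul[10_000] = 0;                        // NUL only after the inspected prefix
        System.out.println(looksBinary(earlyNul));  // true
        System.out.println(looksBinary(lateNul));   // false
    }
}
```

The second case is the problematic one: a large binary whose first NUL byte comes late would be treated as text and become a candidate for delta computation.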