Re: [jgit-dev] Transient poopies after JGit commit

Bill Burdick <bill.burdick@xxxxxxxxx> wrote:
> 
> OK, I'm trying to use DirCache, but I must not be using it properly.  I have
> a boolean in my script to switch between using GitIndex and DirCache.
> GitIndex behaves like I would expect, but when I use DirCache to do two
> commits (after edits), the second one adds a commit to the list and updates
> the head like you would expect, but it appears to change the tree in the
> first commit and leaves the new commit empty.  Here is the if-clause that
> chooses between GitIndex and DirCache.  After that, I'm attaching the entire
> (updated) script.

You shouldn't use a builder like this.  Instead you need to use
an editor.

The builder API expects you to feed it the entire new index,
hopefully in path name order to avoid a sort during commit.  Right
now you seem to be feeding it only the new/modified paths, which
means you are telling it to effectively delete the unchanged paths.

By using an editor you instead feed a set of paths which you want
to modify with edit commands.  These are merge-sorted against the
current index in order to produce the new index state.  This is
the more common usage I think.
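
To make the difference concrete, here is a toy model in plain Java.
It is only a sketch of the semantics described above, not JGit code:
the "index" is just a sorted map from path to a content label, and the
names IndexUpdateSketch, rebuild and edit are made up for illustration.

```java
import java.util.Map;
import java.util.TreeMap;

// Toy model only. None of these names are JGit API.
class IndexUpdateSketch {

    // Builder-style update: whatever is fed in becomes the whole new index.
    // The current index is deliberately ignored, so unchanged paths vanish.
    static TreeMap<String, String> rebuild(Map<String, String> current,
                                           Map<String, String> fed) {
        return new TreeMap<>(fed);
    }

    // Editor-style update: the edits are merged into the current index,
    // so paths that were not mentioned are carried over untouched.
    static TreeMap<String, String> edit(Map<String, String> current,
                                        Map<String, String> edits) {
        TreeMap<String, String> next = new TreeMap<>(current);
        next.putAll(edits);
        return next;
    }

    public static void main(String[] args) {
        TreeMap<String, String> index = new TreeMap<>();
        index.put("README", "v1");
        index.put("src/Main.scala", "v1");

        // Only one file was modified since the last commit.
        TreeMap<String, String> changed = new TreeMap<>();
        changed.put("src/Main.scala", "v2");

        // prints: builder: {src/Main.scala=v2}
        System.out.println("builder: " + rebuild(index, changed));
        // prints: editor:  {README=v1, src/Main.scala=v2}
        System.out.println("editor:  " + edit(index, changed));
    }
}
```

Feeding only the changed path to the builder-style update drops README,
which is the "effectively delete the unchanged paths" behaviour.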

But it really depends.  A builder can be used in the middle of
a TreeWalk along with the right type of iterator to build up the
new index.  That lets the TreeWalk manage the merge between the
working tree iterator and the index iterator... but this has been
reported to be a bit slower than brute force use against GitIndex.
In theory it should perform better, since we can do the entire
thing in one pass... :-\
 
> When I make a DirCacheEntry, do I need to initialize it further than setting
> its path and fileMode?

You might want to also set its last modified time to match the
working tree, to give a better chance that the item in the index
matches the file later during a subsequent status operation.

> As an aside, it seems like if DirCacheBuilder had an add(path) method,
> creating new entries would be just as easy as it is with GitIndex.

It's close, yes.  But the problem with that is each addition is
O(log N) if it's actually a replacement, and O(N) if it's an actual
new addition or removal.

If you do more than one of these, it starts to get expensive as we
modify the index.  The builder/editor APIs allow us to aggregate
the mutations together and perform one rebuild of the structure,
with a lower total cost.

Though either API right now requires an O(N) cost for any change.
If you are only doing a handful of replacements/modifications, that
O(N) can far outweigh the O(R log N) cost of doing the R replacements
individually.  Which might explain why the TreeWalk case I mentioned
above doesn't perform as well as it should.
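
As a rough back-of-the-envelope illustration of those bounds, the
sketch below counts idealized "steps" for an index of N entries and R
changes.  It ignores constant factors and I/O entirely, and the class
and method names are hypothetical, not anything in JGit.

```java
// Idealized step counts only: constant factors and disk I/O are ignored.
class IndexCostSketch {

    // ceil(log2(n)) for n >= 1, standing in for one binary search.
    static int ceilLog2(int n) {
        return n <= 1 ? 0 : 32 - Integer.numberOfLeadingZeros(n - 1);
    }

    // R replacements done in place: one O(log N) lookup each.
    static long inPlaceReplacements(int r, int n) {
        return (long) r * ceilLog2(n);
    }

    // R insertions or removals done one at a time: each shifts O(N) entries.
    static long oneAtATimeInsertions(int r, int n) {
        return (long) r * n;
    }

    // Builder/editor: one O(N) rebuild no matter how many edits are batched.
    static long batchedRebuild(int n) {
        return n;
    }

    public static void main(String[] args) {
        int n = 100_000; // entries in the index
        int r = 5;       // paths being changed

        System.out.println(inPlaceReplacements(r, n));  // prints 85
        System.out.println(oneAtATimeInsertions(r, n)); // prints 500000
        System.out.println(batchedRebuild(n));          // prints 100000
    }
}
```

With these numbers the single rebuild beats five one-at-a-time
insertions, but loses badly to five in-place replacements.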

> if (useIndex) {
>   index = repository.getIndex
>   pathsDo(repoPath, _.getName != ".git") {index.add(repoPath, _)}
>   index.write
>   commit.setTree(repository.mapTree(index.writeTree))
> } else {
>   dircache = DirCache.lock(repository.getIndexFile)
>   val builder = dircache.builder
>
>   pathsDo(repoPath, _.getName != ".git") {f =>
>     val entry = new DirCacheEntry(f.getAbsolutePath.substring(repoPathLen))
>
>     entry.setFileMode(FileMode.REGULAR_FILE)
>     builder.add(entry)
>   }
>   builder.commit
>   commit.setTree(repository.mapTree(dircache.writeTree(new ObjectWriter(repository))))
> }

-- 
Shawn.
