Re: [jgit-dev] Fastest way of retrieving contents of new/changed files of all commits?


----- Original Message -----
> From: gt6@xxxxxxx
> To: jgit-dev@xxxxxxxxxxx
> Sent: Sunday, 19 Oct 2014 14:54:46
> Subject: [jgit-dev] Fastest way of retrieving contents of new/changed files of all commits?
> 
> Hello everyone,
> 
> For a research project I need the contents of all new and changed files,
> as well as the file paths of deleted files, for every commit of a
> particular branch in a repository. I currently do it as follows:
> 
...

> 6. From the diff above, I can determine the new, changed and deleted
> file paths and then read the contents of new/changed files from the
> working directory.
> 
> My problem now is that this is too slow for my purposes. I actually have
> to check out the files into the working dir so that I can read them,
> and for every commit I have to create another diff, etc. Plus, it's hard
> to parallelize this. I could just run N instances on different parts of
> the commit range (e.g. the 1st 1000, the 2nd 1000 and so on), but then I
> would need N temporary directories to check out the commits.
> 

Look at the source for the Log and Diff commands. Diff computes the differences
between any two commits. Log does the same (with the -p option) for every commit
in a range. Both commands work with a bare repository, so no checkout is needed.
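
As a rough illustration of that idea (a minimal sketch, not the actual Log or
Diff source), the code below walks all commits reachable from a revision,
diffs each commit's tree against its first parent's tree, and reads the
new/changed blobs straight from the object database. The class name
CommitChanges, the Change holder and the collect() method are made up for this
example. It assumes JGit 4.0 or later (for try-with-resources), leaves rename
detection off so only ADD/MODIFY/DELETE entries appear, and compares merge
commits against their first parent only.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.eclipse.jgit.diff.DiffEntry;
import org.eclipse.jgit.diff.DiffFormatter;
import org.eclipse.jgit.lib.FileMode;
import org.eclipse.jgit.lib.ObjectId;
import org.eclipse.jgit.lib.ObjectReader;
import org.eclipse.jgit.lib.Repository;
import org.eclipse.jgit.revwalk.RevCommit;
import org.eclipse.jgit.revwalk.RevWalk;
import org.eclipse.jgit.treewalk.AbstractTreeIterator;
import org.eclipse.jgit.treewalk.CanonicalTreeParser;
import org.eclipse.jgit.treewalk.EmptyTreeIterator;
import org.eclipse.jgit.util.io.DisabledOutputStream;

public class CommitChanges {

    /** Hypothetical holder for one changed path in one commit. */
    public static final class Change {
        public final String commit;
        public final DiffEntry.ChangeType type;
        public final String path;
        public final byte[] content; // null for deleted paths and submodule links

        Change(String commit, DiffEntry.ChangeType type, String path, byte[] content) {
            this.commit = commit;
            this.type = type;
            this.path = path;
            this.content = content;
        }
    }

    /** Collects the changes of every commit reachable from rev, newest first. */
    public static List<Change> collect(Repository repo, String rev) throws IOException {
        ObjectId start = repo.resolve(rev);
        if (start == null) {
            throw new IllegalArgumentException("cannot resolve " + rev);
        }
        List<Change> out = new ArrayList<>();
        try (RevWalk walk = new RevWalk(repo);
             ObjectReader reader = repo.newObjectReader();
             DiffFormatter df = new DiffFormatter(DisabledOutputStream.INSTANCE)) {
            df.setRepository(repo);
            walk.markStart(walk.parseCommit(start));
            for (RevCommit commit : walk) {
                // A root commit has no parent, so it is compared with an empty tree.
                AbstractTreeIterator oldTree = new EmptyTreeIterator();
                if (commit.getParentCount() > 0) {
                    RevCommit parent = commit.getParent(0);
                    walk.parseHeaders(parent); // make sure parent.getTree() is available
                    oldTree = new CanonicalTreeParser(null, reader, parent.getTree());
                }
                AbstractTreeIterator newTree =
                        new CanonicalTreeParser(null, reader, commit.getTree());
                for (DiffEntry e : df.scan(oldTree, newTree)) {
                    boolean deleted = e.getChangeType() == DiffEntry.ChangeType.DELETE;
                    String path = deleted ? e.getOldPath() : e.getNewPath();
                    byte[] content = null;
                    // Submodule links point at commits in another repository; skip them.
                    if (!deleted && !FileMode.GITLINK.equals(e.getNewMode().getBits())) {
                        content = reader.open(e.getNewId().toObjectId()).getBytes();
                    }
                    out.add(new Change(commit.name(), e.getChangeType(), path, content));
                }
            }
        }
        return out;
    }
}
```

Since nothing is written to a working directory, several such walks could in
principle run in parallel over different commit ranges of the same
repository, each with its own RevWalk and ObjectReader. Note that getBytes()
loads a whole blob into memory, so very large files would need the streaming
API instead.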

-- robin


