Re: [jgit-dev] Fastest way of retrieving contents of new/changed files of all commits?
----- Original message -----
> From: gt6@xxxxxxx
> To: jgit-dev@xxxxxxxxxxx
> Sent: Sunday, 19 Oct 2014 14:54:46
> Subject: [jgit-dev] Fastest way of retrieving contents of new/changed files of all commits?
>
> Hello everyone,
>
> For a research project I need the contents of all new and changed files,
> as well as the file paths of deleted files, for every commit of a
> particular branch in a repository. I currently do it as follows:
>
...
> 6. From the diff above, I can determine the new, changed and deleted
> file paths and then read the contents of new/changed files from the
> working directory.
>
> My problem now is that this is too slow for my purposes. I actually have
> to checkout the files into the working dir so that I can read the files
> and for every commit, I have to create another diff etc. Plus it's hard
> to parallelize this. I could just run N instances on different parts of
> the commit-range (e.g. the 1st 1000, the 2nd 1000 and so on) but then I
> would need N temporary directories to checkout the commits.
>
Look at the source for the Log and Diff commands. Diff compares any two
commits and produces the differences between them. Log does the same (with
the -p option) for every commit in a range. Both commands work with a bare
repository, so no checkout into a working directory is needed.
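
As a rough illustration of the same idea through the JGit API, the sketch below walks every commit reachable from a branch, diffs each commit's tree against its first parent, and reads new/changed file contents straight from the object database. It is not taken from the Log or Diff sources; the class name CommitChanges, the method collectChanges and the output line format are made up for this example, and merge commits are only compared against their first parent.

```java
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.eclipse.jgit.diff.DiffEntry;
import org.eclipse.jgit.diff.DiffFormatter;
import org.eclipse.jgit.lib.FileMode;
import org.eclipse.jgit.lib.ObjectId;
import org.eclipse.jgit.lib.Repository;
import org.eclipse.jgit.revwalk.RevCommit;
import org.eclipse.jgit.revwalk.RevTree;
import org.eclipse.jgit.revwalk.RevWalk;
import org.eclipse.jgit.storage.file.FileRepositoryBuilder;
import org.eclipse.jgit.util.io.DisabledOutputStream;

// Hypothetical helper class, for illustration only.
public class CommitChanges {

    /**
     * Returns one line per changed path, newest commit first:
     * "<commit> DELETE <path>" or "<commit> <ADD|MODIFY> <path> <size in bytes>".
     * gitDir is the repository's git directory (a bare repo, or the .git folder).
     */
    public static List<String> collectChanges(File gitDir, String branch) throws IOException {
        List<String> out = new ArrayList<>();
        try (Repository repo = new FileRepositoryBuilder().setGitDir(gitDir).build();
             RevWalk walk = new RevWalk(repo);
             // No textual patch is needed, so the formatter's output is discarded.
             DiffFormatter diff = new DiffFormatter(DisabledOutputStream.INSTANCE)) {
            diff.setRepository(repo);

            ObjectId start = repo.resolve(branch);
            if (start == null) {
                throw new IOException("Cannot resolve " + branch);
            }
            walk.markStart(walk.parseCommit(start));

            for (RevCommit commit : walk) {
                // A root commit has no parent: a null tree is treated as empty.
                RevTree parentTree = null;
                if (commit.getParentCount() > 0) {
                    RevCommit parent = commit.getParent(0);
                    walk.parseHeaders(parent); // make sure the parent's tree is loaded
                    parentTree = parent.getTree();
                }

                for (DiffEntry entry : diff.scan(parentTree, commit.getTree())) {
                    if (entry.getChangeType() == DiffEntry.ChangeType.DELETE) {
                        out.add(commit.name() + " DELETE " + entry.getOldPath());
                    } else if (entry.getNewMode() != FileMode.GITLINK) {
                        // Read the blob from the object database, not from a working dir.
                        // getBytes() loads the whole file into memory; very large blobs
                        // would need openStream() instead.
                        byte[] content = repo.open(entry.getNewId().toObjectId()).getBytes();
                        out.add(commit.name() + " " + entry.getChangeType() + " "
                                + entry.getNewPath() + " " + content.length);
                    }
                }
            }
        }
        return out;
    }
}
```

Since nothing is checked out, several such walks over different commit ranges could in principle run in parallel against the same repository, each with its own RevWalk and DiffFormatter; that part is not shown here.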
-- robin