[jgit-dev] Removal/replacing of streaming delta support
As I learned last week, JGit has a very hard time handling a delta once the object goes above “streamfilethreshold”. After searching around I found quite a few others who have the same problem (C Git also has issues).
http://www.eclipsecon.org/2013/sites/eclipsecon.org.2013/files/Scaling%20Up%20JGit%20-%20EclipseCon%202013.pdf also says “Remove streaming delta support (too slow)”.
Personally I would prefer that JGit throw an (OutOfMemory)Exception instead of entering a loop that takes > 12h to finish. The hard part is perhaps knowing in advance that the loop will take that long: some LargePackedDeltaObject instances actually finish within a reasonable time. You could perhaps add a check and simply give up and throw an Exception after running for > 5 minutes, but that sounds like a hack to me.
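For illustration, the "give up after N minutes" check could look roughly like the sketch below. The class name DeltaTimeBudget and the choice of IOException are assumptions made for this example; nothing like this exists in JGit as far as this post is concerned, and the loop that would call check() is not shown.

```java
import java.io.IOException;
import java.util.concurrent.TimeUnit;

// Hypothetical helper (not a JGit class): a wall-clock budget that a
// long-running delta loop could poll once per iteration.
final class DeltaTimeBudget {
    private final long deadlineNanos;

    DeltaTimeBudget(long timeout, TimeUnit unit) {
        // System.nanoTime() is monotonic, so clock adjustments do not matter.
        this.deadlineNanos = System.nanoTime() + unit.toNanos(timeout);
    }

    // Throws once the budget is used up; otherwise returns immediately.
    void check() throws IOException {
        if (System.nanoTime() - deadlineNanos > 0) {
            throw new IOException("delta resolution exceeded its time budget");
        }
    }

    public static void main(String[] args) throws IOException {
        DeltaTimeBudget budget = new DeltaTimeBudget(5, TimeUnit.MINUTES);
        for (int i = 0; i < 1000; i++) {
            budget.check(); // would sit inside the slow streaming loop
        }
        System.out.println("finished within budget");
    }
}
```

This only bounds the damage; it does not make the slow path any faster, which is why it still feels like a hack.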
Another idea I had was to use a temporary file when a delta-packed object is considered large (> “streamfilethreshold”). I actually managed to hack this together (using NIO and memory-mapped temporary files), and with this I can open objects that were 353MB while limiting the Java heap to 300MB and using the default (50M) “streamfilethreshold”. The performance was surprisingly good. I must admit that the temporary files were placed on /tmp, which is an SSD, and that I probably have enough RAM available that the OS doesn't have to write to disk very often, so the benefit might not be as large when you are low on RAM (remains to be tested).
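To make the idea concrete, here is a minimal sketch of the general technique: back the base and result buffers with memory-mapped temporary files so the bytes live outside the Java heap. This is not the hack described above and not JGit code; the class MappedScratch and its methods are hypothetical names, and the two operations in main only mimic the "copy from base" and "insert literal" commands of a Git delta.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper (not a JGit class): scratch buffers backed by
// memory-mapped temporary files instead of heap byte arrays.
final class MappedScratch {
    // Creates a temp file of 'size' bytes and maps it read/write.
    // A single mapping is limited to Integer.MAX_VALUE bytes.
    static MappedByteBuffer allocate(int size) throws IOException {
        Path tmp = Files.createTempFile("delta-scratch-", ".tmp");
        tmp.toFile().deleteOnExit();
        try (RandomAccessFile raf = new RandomAccessFile(tmp.toFile(), "rw")) {
            raf.setLength(size);
            // The mapping stays valid after the channel is closed.
            return raf.getChannel().map(FileChannel.MapMode.READ_WRITE, 0, size);
        }
    }

    // Appends base[offset, offset + length) to 'result' at its current position.
    static void copyRange(ByteBuffer base, int offset, int length, ByteBuffer result) {
        ByteBuffer src = base.duplicate(); // independent position/limit, same bytes
        src.limit(offset + length);
        src.position(offset);
        result.put(src);
    }

    public static void main(String[] args) throws IOException {
        MappedByteBuffer base = allocate(16);
        base.put("hello world".getBytes(StandardCharsets.US_ASCII));

        MappedByteBuffer result = allocate(11);
        copyRange(base, 6, 5, result);                              // copy from base
        result.put(" hello".getBytes(StandardCharsets.US_ASCII));   // insert literal

        result.flip();
        byte[] out = new byte[result.remaining()];
        result.get(out);
        System.out.println(new String(out, StandardCharsets.US_ASCII)); // prints "world hello"
    }
}
```

Two caveats with this sketch: the mapping is only released when the buffer is garbage collected, and on some platforms (Windows in particular) the temporary file may not be deletable while it is still mapped, so cleanup would need more care in real code.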
My code is far from ready to be submitted; it has tons of issues and is probably done in completely the wrong place, but it was fun doing it.
Would this (the idea, not my hack) be a possible replacement/alternative for the existing streamed delta support?