Performance issue to a shared network drive from Windows. [message #1850980]
Wed, 23 March 2022 16:51
Eclipse User
Hello,
I have a performance issue when using EGit/JGit (6.0.0.202111291000-r) to clone a project to a shared network drive from Windows. The network share is historical; it is used by a Linux build farm environment because of our colossal codebase. The Git repository is about 46 MB, and the clone takes approximately 47 minutes. To a local drive, it takes approximately 30 seconds. My first inclination was to contact our IT department to check the performance of the network share (it's either NetApp or Samba, I'm not sure). But before that, I tried Git for Windows from the command line, and it took approximately 18 seconds to the network share and 1.5 seconds locally. In the EGit/JGit log output I saw what appears to be a constant check of .gitconfig in my local HOME directory and of the local Git directory, done in a class called FileSnapshot. Any help is appreciated.
Regards,
Jim
Re: Performance issue to a shared network drive from Windows. [message #1851031 is a reply to message #1851022]
Fri, 25 March 2022 03:51
Eclipse User
Quote: Do you know why Git for Windows to the samba drive performance is better?
No, I don't know.
My first guess is that JGit accesses the git config file far too often. It does cache the config, but on each call to Repository.getConfig() it checks whether the config has changed on disk. That means getting the file attributes of at least three files: the repo config, the global config, and the system config. Accessing file attributes is expensive on Windows, and I imagine it's even more expensive with Samba.
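To illustrate the first guess, here is a minimal sketch of a "cache the content, but stat the file on every access" pattern. It is a simplified stand-in and not JGit's actual code: the class CachedConfigSketch, its counters, and the demo helper are all hypothetical names made up for this example.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.BasicFileAttributes;
import java.nio.file.attribute.FileTime;

/** Hypothetical illustration only; this is not how JGit is implemented. */
public class CachedConfigSketch {
    private final Path file;
    private String cached;          // cached file content
    private FileTime lastModified;  // mtime seen when the content was loaded
    private long size = -1;         // size seen when the content was loaded
    public int statCalls = 0;       // how often file attributes were fetched
    public int reloads = 0;         // how often the content was actually re-read

    public CachedConfigSketch(Path file) {
        this.file = file;
    }

    /** Returns the cached content, but asks the file system on every call. */
    public String get() {
        try {
            // This attribute lookup is the part that is paid on every access.
            BasicFileAttributes attrs =
                    Files.readAttributes(file, BasicFileAttributes.class);
            statCalls++;
            boolean changed = cached == null
                    || !attrs.lastModifiedTime().equals(lastModified)
                    || attrs.size() != size;
            if (changed) {
                cached = new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
                lastModified = attrs.lastModifiedTime();
                size = attrs.size();
                reloads++;
            }
            return cached;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Reads an unchanged temp file 'calls' times; returns {statCalls, reloads}. */
    public static int[] demo(int calls) {
        try {
            Path tmp = Files.createTempFile("config-sketch", ".txt");
            try {
                Files.write(tmp, "[core]\n".getBytes(StandardCharsets.UTF_8));
                CachedConfigSketch config = new CachedConfigSketch(tmp);
                for (int i = 0; i < calls; i++) {
                    config.get();
                }
                return new int[] { config.statCalls, config.reloads };
            } finally {
                Files.deleteIfExists(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        int[] result = demo(1000);
        System.out.println("stat calls: " + result[0] + ", reloads: " + result[1]);
    }
}
```

Even though the content is loaded only once, every access still costs one attribute lookup per file. If each lookup is a network round trip, that cost could add up quickly.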
My second guess would be that file timestamp resolution on a Samba drive is coarse, and JGit thus frequently considers files to be "racily clean" and re-loads them "just in case". ("Racily clean" basically means: if a file is modified twice within the file timestamp resolution, it may have been modified even if timestamp and size are the same, and the file must be re-loaded to catch it.)
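The "racily clean" idea can be sketched roughly as follows. This is a simplified illustration under my own assumptions, not JGit's FileSnapshot logic: the class, the method names, and the resolution values in main are hypothetical and chosen only to show the effect of a coarse timestamp resolution.

```java
import java.time.Duration;
import java.time.Instant;

/** Hypothetical illustration of a racy-clean check; not JGit's actual code. */
public class RacyCleanSketch {

    /**
     * A snapshot is treated as racily clean if the file was last modified
     * less than one timestamp-resolution interval before it was read. A
     * second modification in that window could leave the timestamp unchanged.
     */
    public static boolean isRacilyClean(Instant lastModified, Instant lastRead,
            Duration fsResolution) {
        return Duration.between(lastModified, lastRead).compareTo(fsResolution) < 0;
    }

    /** Decides whether a cached file has to be re-read. */
    public static boolean mustReload(Instant snapshotModified, long snapshotSize,
            Instant currentModified, long currentSize,
            Instant lastRead, Duration fsResolution) {
        if (!snapshotModified.equals(currentModified) || snapshotSize != currentSize) {
            return true; // visibly changed
        }
        // Looks unchanged, but cannot be trusted if the snapshot is racily clean.
        return isRacilyClean(snapshotModified, lastRead, fsResolution);
    }

    public static void main(String[] args) {
        Instant modified = Instant.parse("2022-03-25T03:51:00Z");
        Instant read = modified.plusMillis(500);
        // Assumed resolutions, for illustration only.
        Duration coarse = Duration.ofSeconds(2);
        Duration fine = Duration.ofMillis(1);
        System.out.println("coarse resolution, reload: "
                + mustReload(modified, 10, modified, 10, read, coarse));
        System.out.println("fine resolution, reload: "
                + mustReload(modified, 10, modified, 10, read, fine));
    }
}
```

With the same timestamps and size, the coarse resolution forces a reload while the fine one does not. If that guess applies here, it would match the slowdown on the share.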
But there may be other reasons why JGit is especially slow in this case.
Quote: My best option may be to support the plug-in on Linux.
What plug-in?
You wrote that the Samba share exists because of a build system. Most build systems I've seen are de-coupled from repository management. In the simplest case, there's a git server hosting "canonical" central repositories. Developers clone from there, and then work with their local clones. When they push, their changes go into the central repositories on the git server and the CI build is triggered. The CI build is just another client of that git server: it clones the repository or repositories it needs for its build, and then works with these local clones on the build machine.
If cloning during the build should not be possible, you could maybe share the directory where the git server holds the central repositories with the build machine and make the build work directly from those central repositories. But I've never come across such a case (and I have seen projects with "colossal codebases", too). The build can use shallow clones to speed up cloning.
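As a rough sketch of the shallow-clone idea, a build job could invoke the git command line with the --depth option. The class and method names, the URL, and the target directory below are hypothetical placeholders; only `git clone --depth <n> <url> <dir>` is the real command.

```java
import java.io.IOException;
import java.util.List;

/** Hypothetical helper showing how a build job might do a shallow clone. */
public class ShallowCloneSketch {

    /** Builds the argument list for "git clone --depth <depth> <url> <dir>". */
    public static List<String> shallowCloneCommand(String url, String targetDir,
            int depth) {
        if (depth < 1) {
            throw new IllegalArgumentException("depth must be >= 1");
        }
        return List.of("git", "clone", "--depth", String.valueOf(depth),
                url, targetDir);
    }

    /** Runs the command with git's output passed through; returns the exit code. */
    public static int run(List<String> command)
            throws IOException, InterruptedException {
        return new ProcessBuilder(command).inheritIO().start().waitFor();
    }

    public static void main(String[] args) {
        // Placeholder URL and directory; replace them with real values.
        List<String> command = shallowCloneCommand(
                "ssh://git.example.org/project.git", "/tmp/build/project", 1);
        // Only print the command here; call run(command) to actually clone.
        System.out.println(String.join(" ", command));
    }
}
```

A depth of 1 fetches only the most recent commit, which is usually enough for a CI build that does not need history.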
Re: Performance issue to a shared network drive from Windows. [message #1851252 is a reply to message #1851239]
Thu, 31 March 2022 15:00
Eclipse User
I'm beginning to see how your environment is set up. I once had to set up such a system, and my initial idea indeed was similar: use a Samba share to give Windows users access. I gave up quickly.
Instead we abandoned the idea that a Windows developer could build locally. Our application targeted Linux only anyway (and was written in Ada 95, not C), but developers had either Windows PCs/laptops or terminal access via ssh on a Linux machine. The build was Linux-only, and ran on Linux.
We used a git server with great pre-commit review and build support: Gerrit. We installed Gerrit and Jenkins in our Linux environment (different boxes). Windows users could clone from Gerrit, develop, and push their changes to Gerrit. Gerrit would trigger a Jenkins job on the Linux build machine that would clone the repo, check out the change, build it and run tests. Users got e-mails about the build and test results, and could see build logs in the Jenkins Web UI. Once successfully built and approved in the Gerrit code review, the change could be merged in the Gerrit Web UI, which triggered another job to build the product.
Users who preferred to use vim in a Linux terminal could SSH into Linux, use the git command line to clone into their Linux directory, work there, commit, and also push to Gerrit. Since those users were on Linux in the terminal, they could also compile their code manually, or even run tests. At a later stage we got a VNC server installed because we needed to be able to run UI tests in the Linux build. So one could even tunnel a VNC session through SSH to the Linux machine and start Eclipse there, but have the window on the PC and work that way. Those users got a Linux UI on their PC, remote-desktop style, and could likewise compile their code manually in a Linux terminal or run tests there. The UI was minimal (X Windows and mwm as window manager) but fully sufficient.
All rather hodge-podge, but it worked very well.