|Re: [jgit-dev] NPE in getAuthorIdent() on parsed RevCommit object|
Dariusz Luksza <dariusz.luksza@xxxxxxxxx> wrote: > > In EGit synchronization we have strange problem with > RevCommit.getAuthorIdent(). It occurs *only* on oldest commit that is > include in Synchronization view. > > First thing that I've discover was, that > remoteCommit.getParent(0).getAuthorIdent() in some cases was throw NPE > even that remoteCommit.getParent(0).getId() returns proper value. > Therefore I've use RevWalk.parseCommit(remoteCommit.getParrent(0)) to > re-parse commit object. After re-parsing getAuthorIdent() returns > proper value (there were no more NPE). Reference to remote commit > parent is stored in private final field so that it can be obtained in > various place of synchronization. Not entirely strange to NPE. RevWalk can (and does) discard the byte buffer that holds the commit data that is used to obtain this information. If that's discarded, *poof*, no data, and we NPE on access. > But re-parsing commit object didn't solve this problem. NPE was still > thrown in same place! For some reasons reference to author ident is > lost before we achieve this trouble spot. Currently I come up with > workaround for this issue by re-parsing commit exactly in trouble spot > but this isn't efficient. Yea, you really need to make sure its parsed right before you do the access. Traversals can cause a discard, especially if you have the default of setRetainBody(false), or if the commit in question ever got tagged with the UNINTERSTING flag (e.g. its reachable from something that was passed to markUninteresting). When you can't be certain the body is present, the correct pattern becomes: revWalk.parseBody(revCommit); revCommit.getAuthorIdent() We may need to stick a warning label on RevWalk. Its designed to be *fast*, not user-friendly to the application developer. When there is a choice between ease-of-use and raw-throughput, RevWalk favors raw-throughput. Consequently we don't check to see if the object has the data it needs on hand to answer getAuthorIdent(). Nor do we try to load it when its missing. We assume its there, because there are valid use cases where you can prove its going to be present and bypassing the checks saves time when dealing with 60k commits. If you can't prove it, you should make the extra call to verify the data is loaded. That is the parseBody method. If the body is already load, parseBody() basically only cost a few machine instructions (load the flag field, do a boolean and with the constant '1', check for != 0, return). So its fairly cheap, but if you can prove its not necessary, its good to avoid it. :-) -- Shawn.
Back to the top