Re: [jgit-dev] Improving ObjectIdSubclassMap performance

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] [List Home]

Re: [jgit-dev] Improving ObjectIdSubclassMap performance

From: Aliaksandr Radzivanovich <aradzivanovich@xxxxxxxxx>
Date: Wed, 9 Mar 2011 23:49:01 +0200
Delivered-to: jgit-dev@xxxxxxxxxxx
Domainkey-signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:in-reply-to:references:from:date:message-id:subject:to :cc:content-type:content-transfer-encoding; b=VtiLNdQ8b1wzRR7QVv0cjZlN4+zg5fABZO/22FiZitgCStR02Jh74EZSHJ2icOx03/ G0XhN3S58U4FMPyOhXP6zrDUctSORjaMK54U41pm1t4LoH0RJlEyMJtzG7aSauaVO4// l9AJj59I7hrfiOUpyNXYopGvvXuJoxOurDReg=
List-archive: <https://dev.eclipse.org/mailman/private/jgit-dev>
List-help: <mailto:jgit-dev-request@eclipse.org?subject=help>
List-subscribe: <https://dev.eclipse.org/mailman/listinfo/jgit-dev>, <mailto:jgit-dev-request@eclipse.org?subject=subscribe>
List-unsubscribe: <https://dev.eclipse.org/mailman/options/jgit-dev>, <mailto:jgit-dev-request@eclipse.org?subject=unsubscribe>

Yes, this is the way I implemented it too. In fact, I just copied the
original class and than made few modifications in it.

The algorithm is really, really simple. I won't bother posting it to
Gerrit, here are just a few snippets. Four hash functions variant:


	public V get(AnyObjectId toFind) {
		V v;
		if ((v = objects[index0(toFind)]) != null
				&& AnyObjectId.equals(v, toFind))
			return v;
		if ((v = objects[index1(toFind)]) != null
				&& AnyObjectId.equals(v, toFind))
			return v;
		if ((v = objects[index2(toFind)]) != null
				&& AnyObjectId.equals(v, toFind))
			return v;
		if ((v = objects[index3(toFind)]) != null
				&& AnyObjectId.equals(v, toFind))
			return v;
		return null;
	}

	private void insert(V v) {
		for (int n = 0; n < STEPS; n++) {
			if ((v = swap(index0(v), v)) == null)
				return;
			if ((v = swap(index1(v), v)) == null)
				return;
			if ((v = swap(index2(v), v)) == null)
				return;
			if ((v = swap(index3(v), v)) == null)
				return;
		}
		grow();
		insert(v);
	}

	private int index0(AnyObjectId id) {
		int q = objects.length >> 2;
		return (id.w1 >>> 1) % q;
	}

	private int index1(AnyObjectId id) {
		int q = objects.length >> 2;
		return (id.w2 >>> 1) % q + q;
	}

	private int index2(AnyObjectId id) {
		int q = objects.length >> 2;
		return (id.w3 >>> 1) % q + q * 2;
	}

	private int index3(AnyObjectId id) {
		int q = objects.length >> 2;
		return (id.w4 >>> 1) % q + q * 3;
	}

	private V swap(int i, V value) {
		V v = objects[i];
		objects[i] = value;
		return v;
	}


That's it.

I don't know, may be these index method can be made even faster by
using clever bit twiddling hacks, but I don't think it will improve
speed significantly.

On 9 March 2011 22:56, Shawn Pearce <spearce@xxxxxxxxxxx> wrote:
> On Wed, Mar 9, 2011 at 12:50, Johannes Schindelin
> <Johannes.Schindelin@xxxxxx> wrote:
>> My guess is that the hash calculation makes for the overhead. If that is
>> the case, you might want to use several parts of the SHA-1 as
>> different hashes (interpreting the 20-byte as 5 32-bit integers gives you
>> five hash functions for the price of one), because the SHA-1 has been
>> calculated anyway.
>
> That's what we do already. SHA-1s are already stored as 5 primitive
> ints in Java, by treating the raw 20 bytes as 5 network byte-order
> integers and converting them to primitive integer values.
>
> The current table maps that onto a slot in the hashtable with:
>
>  (id.w1 >>> 1) % obj_hash.length
>
> Which could be faster, the table length is currently always a power of
> 2. So we could rewrite this as:
>
>  id.w1 & table_mask
>
> Where table_mask is just:
>
>  table_mask = obj_hash.length - 1.
>
> :-)
>
> --
> Shawn.
>

References:
- [jgit-dev] Improving ObjectIdSubclassMap performance
  - From: Shawn Pearce
- Re: [jgit-dev] Improving ObjectIdSubclassMap performance
  - From: Aliaksandr Radzivanovich
- Re: [jgit-dev] Improving ObjectIdSubclassMap performance
  - From: Shawn Pearce
- Re: [jgit-dev] Improving ObjectIdSubclassMap performance
  - From: Johannes Schindelin
- Re: [jgit-dev] Improving ObjectIdSubclassMap performance
  - From: Shawn Pearce

Prev by Date: Re: [jgit-dev] [jgit-build] Build failed in Hudson: jgit #653
Next by Date: Re: [jgit-dev] Improving ObjectIdSubclassMap performance
Previous by thread: Re: [jgit-dev] Improving ObjectIdSubclassMap performance
Next by thread: Re: [jgit-dev] Improving ObjectIdSubclassMap performance
Index(es):
- Date
- Thread

Breadcrumbs