Skip to main content

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] [List Home]
Re: [jgit-dev] Improving ObjectIdSubclassMap performance

On Wed, Mar 9, 2011 at 16:28, Robin Rosenberg
<robin.rosenberg@xxxxxxxxxx> wrote:
> Have you measured how much time is spend doing linear searches? Seems
> only using half the table would mean lots of holes, but unlike other
> tables the penalty for having one could be quite large, because you
> steal buckets from the legitimate owners, and that could perhaps be much
> more expensive that you expect.
>
> Allowing more air than 50% in the table could give some hints about this
> behavior, or even adding counters just for measuring.

Sadly, we're sort of in a sweet spot at 50% load factor.

If I change the code to double the table when a linear chain gets
longer than 64, the same final table size is reached but its 1s
slower.

If I change the code to use 75% load factor, the same final table size
is reached, but its 3s slower.

If I change the code to use 25% load factor, the table is 33M entries
large (rather than 4M), and its 2s slower.


I have a few cleanups and micro optimizations that I'll upload in a
minute, these shaved a little bit off the running time, but we're
talking <1s for a 37s operation.


As far as chain lengths go, we're not that bad.

Here is the raw data. Objects is the number of items in the table at
the end, table is the size of the obj_hash array, wasted is the
difference. A hit is a successful get() returning an object, a miss is
a get that returned null and later turns into an insert. The number of
calls on chain lengths above 18 falls off fast so I elided it out.
With a 50% load factor, most operations have shorter than 7 items in
their chain. A wide majority are below 2.

objects: 1886362
table: 4194304
  (wasted 2307942)
chains (hit):
  length   0 : 42396217 get calls
  length   1 : 13675300 get calls
  length   2 : 6756795 get calls
  length   3 : 3759100 get calls
  length   4 : 2213449 get calls
  length   5 : 1413852 get calls
  length   6 : 1046289 get calls
  length   7 : 812226 get calls
  length   8 : 596076 get calls
  length   9 : 529995 get calls
  length  10 : 357039 get calls
  length  11 : 321895 get calls
  length  12 : 261752 get calls
  length  13 : 162623 get calls
  length  14 : 163727 get calls
  length  15 : 112538 get calls
  length  16 : 78401 get calls
  length  17 : 103203 get calls
  length  18 : 81563 get calls
  ...
  length >63 : 11553 get calls

chains (miss):
  length   0 : 872894 get calls
  length   1 : 345177 get calls
  length   2 : 187751 get calls
  length   3 : 117402 get calls
  length   4 : 78825 get calls
  length   5 : 55710 get calls
  length   6 : 41359 get calls
  length   7 : 31547 get calls
  length   8 : 24517 get calls
  length   9 : 19507 get calls
  length  10 : 15565 get calls
  length  11 : 12767 get calls
  length  12 : 10550 get calls
  length  13 : 8895 get calls
  length  14 : 7573 get calls
  length  15 : 6382 get calls
  length  16 : 5542 get calls
  length  17 : 4847 get calls
  length  18 : 4162 get calls
  ...
  length >63 : 1096 get calls

-- 
Shawn.


Back to the top