[tracecompass-dev] Heap storage consumption for traces
A while back, Matthew and I discussed heap storage utilization by traces. My traces have a number of attributes associated with each event, with the number of attributes varying by type of trace. I create ITmfEventField and TmfStateValue objects for each attribute.
I have a trace with about 88,000 events. In that trace, I ended up with close to 800,000 ITmfEventField and 500,000 TmfStateValue objects.
I created a hash table for ITmfEventField objects that contained a single instance for each unique field name and value pair. In this trace, I ended up with about 4,000 unique ITmfEventField objects.
I created a similar hash table for the TmfStateValue objects and ended up with 52 unique objects.
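For reference, the kind of hash table described above can be sketched roughly as follows. This is only a minimal illustration, not the code I actually use: FieldKey is a hypothetical stand-in for a name/value pair (it is not a Trace Compass class), and InternPool is a made-up helper name. It assumes the pooled objects are immutable and implement equals() and hashCode() by value.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

/** Hypothetical stand-in for a field: an immutable (name, value) pair compared by value. */
final class FieldKey {
    final String name;
    final Object value;

    FieldKey(String name, Object value) {
        this.name = name;
        this.value = value;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) {
            return true;
        }
        if (!(o instanceof FieldKey)) {
            return false;
        }
        FieldKey other = (FieldKey) o;
        return Objects.equals(name, other.name) && Objects.equals(value, other.value);
    }

    @Override
    public int hashCode() {
        return Objects.hash(name, value);
    }
}

/** Hypothetical helper: keeps one canonical instance per distinct (equal) object. */
final class InternPool<T> {
    private final Map<T, T> pool = new HashMap<>();

    /** Returns the previously stored equal instance, or stores and returns the candidate. */
    T intern(T candidate) {
        T existing = pool.putIfAbsent(candidate, candidate);
        return existing != null ? existing : candidate;
    }

    /** Number of distinct instances currently held. */
    int size() {
        return pool.size();
    }
}
```

Every newly built object is passed through intern(), and only the returned instance is kept, so duplicates become garbage right away. Note that this sketch holds strong references, so pooled objects live as long as the pool does, and it is not thread-safe.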
I ran my code with VisualVM, and I noticed that the top class for heap usage was org.eclipse.tracecompass.internal.statesystem.core.backend.historytree.HTInterval. From one run to the next, the total number of HTInterval objects varies somewhere between 1.5 million and 2.5 million. If I click the Perform-GC button in VisualVM, the count drops to around 700,000, fairly consistently.
I also see a large number of instances of the classes com.google.common.collect.ImmutableMapEntry (355,000), com.google.common.collect.ImmutableMapEntry (178,000), and com.google.common.collect.ImmutableMapEntry$NonTerminalImmutableMapEntry (355,000).
Looking at the references pane in VisualVM for a few (fewer than 10) instances of each of these classes, it looks like the ImmutableMapEntry and ImmutableMapEntry$NonTerminalImmutableMapEntry objects might be related to StateItem objects and the ImmutableMapEntry objects might be related to ITmfEventField objects.
I'm wondering whether it is possible to use a hash table to maintain single copies of unique instances of those objects, or whether the data in them guarantees that each instance is already unique. If it is possible, is the performance tradeoff of using a hash table worth it?