using NatTable trees without GlazedLists [message #1296282]
Mon, 14 April 2014 23:29
Rob Sigel Messages: 4 Registered: April 2014
Junior Member
I'm dealing with datasets using trees with around a million rows. Some of the datasets are shallow hierarchies (with most leaf nodes at the root and a minority of leaf nodes in 2 level hierarchies), and others are deeper (with almost all leaf nodes 20+ levels deep in their tens of thousands of hierarchies).
Due mostly to memory constraints (and some observed performance issues), I've been unable to use the GlazedLists tree implementation, so I'm attempting to create a non-Glazed implementation through the interfaces. Thankfully the upcoming v1.1 release appears to be adding many features I'd customized in my own v1.0.1 implementation, like expandAll/collapseAll.
Some thoughts/observations about the current v1.1 codebase when GlazedLists isn't being used:
Index vs Position in TreeLayer, ITreeData, and ITreeRowModel
Outside the Trees, index has a very specific meaning - the unique location of data in the bottom-most layer (DataLayer) of the stack. (Corresponding approximately to the key in a database.) In the TreeLayer, ITreeRowModel, and ITreeData, the "position" vs "index" distinction appears much less clear, unless I'm missing something. If a TreeLayer isn't required to be placed immediately above the DataLayer in the layer stack, then there can be multiple Position translations already involved before the TreeLayer sees the data. Thus it is unclear to me just what Index is supposed to mean for the ITreeRowModel and ITreeData. Shouldn't it still be a position instead of an index? Technically, are ITreeRowModel and ITreeData expected to perform their work according to the position of the data in the underlying layer, before the TreeLayer's own translations occur? Or are they working according to the DataLayer's true indexes, with appropriate translations between the TreeLayer position and the DataLayer's index expected to always be applied at the TreeLayer/ITreeRowModel boundary? Or is this up to the specific implementation to decide?
The situation is made more confusing since only some of the TreeLayer Events being sent up the Layer stack are passing collections of ITreeRowModel "Indexes" without translation (instead of TreeLayer Positions) -- see TreeLayer.expandTreeRow() and TreeLayer.expandAll() for example. Other Events are passing collections of translated TreeLayer positions -- see TreeLayer.collapseTreeRow() and TreeLayer.collapseAll() for example. Maybe this Event inconsistency is a bug?
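To make the index/position distinction concrete, here is a minimal sketch that is independent of the NatTable API. The class `HideRowsSketch` and its methods `hide`, `indexAt`, and `positionOf` are hypothetical names invented for illustration. It only shows why a row index (stable identity in the data) and a row position (location in one layer's current view) diverge as soon as a layer hides rows, which is why an event carrying untranslated indexes means something different from one carrying positions.

```java
import java.util.Set;
import java.util.TreeSet;

// Hypothetical illustration, not NatTable code: a single layer that hides
// rows on top of a data list whose rows have indexes 0..rowCount-1.
class HideRowsSketch {
    private final int rowCount;
    private final Set<Integer> hiddenIndexes = new TreeSet<>();

    HideRowsSketch(int rowCount) {
        this.rowCount = rowCount;
    }

    void hide(int rowIndex) {
        hiddenIndexes.add(rowIndex);
    }

    // Position -> index: find the n-th visible row, counted from the top.
    int indexAt(int position) {
        int visible = -1;
        for (int index = 0; index < rowCount; index++) {
            if (!hiddenIndexes.contains(index) && ++visible == position) {
                return index;
            }
        }
        return -1; // position lies outside the visible range
    }

    // Index -> position: count the visible rows above it; -1 if hidden.
    int positionOf(int index) {
        if (index < 0 || index >= rowCount || hiddenIndexes.contains(index)) {
            return -1;
        }
        int position = 0;
        for (int i = 0; i < index; i++) {
            if (!hiddenIndexes.contains(i)) {
                position++;
            }
        }
        return position;
    }
}
```

With five rows and indexes 1 and 2 hidden, `indexAt(1)` yields 3 and `positionOf(3)` yields 1, while `positionOf(2)` yields -1. Each additional layer between the data and the tree would add another translation of this kind.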
New in v1.1: collections of all descendants
I suspect the collections of collapsed/hidden positions/indexes should be Sets where possible, not Lists. When iterating tree nodes to collect subnodes, child nodes (or their indices/positions) are added to a single list for every ancestor that is visited during the collection process (see collapseAll() for example). So in deeper trees that already have a million nodes, the collection of "indexes" returned by collapseAll() can have multiple millions of entries in the list, with leaves (and branches) duplicated in the list according to how many ancestors/branches appear above them. Subsequent work, including the index-to-rowposition translations in collapseAll(), thus inefficiently happens multiple times per child.
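The duplication can be reproduced with a small standalone sketch. Everything here (`DescendantSketch`, `Node`, `addDescendants`, `collapseAll`, `chain`) is hypothetical and not taken from NatTable; it only assumes the collection is built the way the paragraph above describes, with every visited parent adding all of its descendants to one shared collection.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

// Hypothetical illustration, not NatTable code.
class DescendantSketch {
    static final class Node {
        final int index;
        final List<Node> children = new ArrayList<>();

        Node(int index) {
            this.index = index;
        }
    }

    // Adds the index of every descendant of 'node' to 'out'.
    static void addDescendants(Node node, Collection<Integer> out) {
        for (Node child : node.children) {
            out.add(child.index);
            addDescendants(child, out);
        }
    }

    // Mimics a collapse-all pass: every parent that is visited adds all of
    // its descendants to the same shared collection.
    static void collapseAll(Node node, Collection<Integer> out) {
        if (node.children.isEmpty()) {
            return;
        }
        addDescendants(node, out);
        for (Node child : node.children) {
            collapseAll(child, out);
        }
    }

    // Builds a single chain 0 -> 1 -> ... -> depth-1.
    static Node chain(int depth) {
        Node root = new Node(0);
        Node current = root;
        for (int i = 1; i < depth; i++) {
            Node next = new Node(i);
            current.children.add(next);
            current = next;
        }
        return root;
    }
}
```

For `chain(4)`, passing an `ArrayList` to `collapseAll` produces 6 entries (3 + 2 + 1), while passing a `HashSet` produces 3. In a chain of depth d the list grows as d(d-1)/2, which is where the "multiple millions of entries" in a 20+ level tree would come from.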
Clearing the TreeLayer.hiddenRowIndexes set
Could TreeLayer.hiddenRowIndexes simply be cleared during TreeLayer.expandAll(), instead of the call to hiddenRowIndexes.removeAll(treeRowModel.expandAll())? (Or are there expandAll() use cases where a simple hiddenRowIndexes.clear() would be incorrect?)
Should this set be cleared in response to StructuralChangeEvents, at least vertical ones? Isn't a vertical StructuralChangeEvent the correct way for the underlying DataLayer to indicate emptying the treetable and/or completely repopulating the treetable?
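The two variants in question can be compared in a standalone sketch. `HiddenRowsSketch` and its methods are hypothetical names, not the TreeLayer implementation, and the sketch assumes the hidden set only ever holds rows hidden by collapsing tree nodes. If another mechanism shared the same set, `clear()` would not be equivalent.

```java
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical illustration, not the NatTable TreeLayer.
class HiddenRowsSketch {
    final Set<Integer> hiddenRowIndexes = new TreeSet<>();

    // Variant 1: remove whatever the row model reports as newly expanded.
    void expandAllByRemoval(List<Integer> expandedIndexes) {
        hiddenRowIndexes.removeAll(expandedIndexes);
    }

    // Variant 2: after expanding everything, no row is hidden by the tree.
    void expandAllByClear() {
        hiddenRowIndexes.clear();
    }

    // Hypothetical reaction to a full reload of the underlying data:
    // the old indexes no longer refer to the same rows, so drop them.
    void onAllRowsReplaced() {
        hiddenRowIndexes.clear();
    }
}
```

Under the stated assumption both variants leave the set empty. One detail of the JDK worth knowing here: `AbstractSet.removeAll` iterates over the set and calls `contains` on the argument when the argument is at least as large as the set, so passing a long `List` full of duplicates can make the removal quadratic, whereas `clear()` does not look at the list at all.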
While there's no documentation for this, I'm assuming Layers (like the DataLayer) are allowed to trigger Events that weren't caused by Commands. Or are externally-caused dataset changes always expected to drive Commands down the entire NatTable layer stack, with the Events only happening in response to handled Commands?
Re: using NatTable trees without GlazedLists [message #1297722 is a reply to message #1296282]
Tue, 15 April 2014 21:47
Dirk Fauth Messages: 2902 Registered: July 2012
Senior Member
Two things before I try to answer some of the questions:
1. Starting such a discussion shortly before the upcoming release is called "bad timing"! If you had started this discussion a few weeks earlier, we could have tried to clean things up before the release.
2. You are the first person I know of who is using the tree functionality without GlazedLists. I always thought GlazedLists was used to get rid of performance issues. I have never seen an example of a NatTable tree without GlazedLists, which is possibly the reason for your investigation results. As far as I know, nobody has used this functionality before, or at least nobody has used it that intensively.
Index vs. Position
As I said before, I'm not aware of anybody who is using the tree without GlazedLists. And there the indexes are needed to operate on the TreeList. I'm not sure whether it is an "inconsistency bug" or intentional. But it works well with GlazedLists. Maybe you are just confused by the parameter names? Digging deeper into this would possibly result in refactoring, and that is something for the new architecture. So at this point I would answer: "It is up to the specific implementation to decide."
Collections of descendants
Well, I'm not 100% sure why the API is using a List instead of a Collection. Possibly the reason is ordering, so simply changing List to Set might cause issues later. But as I said, I'm not 100% sure, as I never used or tested trees without GlazedLists. And I don't want to change the API in the current architecture. Of course you could implement it internally with a Set and then return a correctly ordered List. I'm not sure whether that would be better in terms of performance.
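One way to use a Set internally and still hand back an ordered List is `java.util.LinkedHashSet`, which drops duplicates while keeping first-insertion order. The following is a minimal sketch under that assumption. `OrderedDedupSketch` and `dedupeKeepingOrder` are hypothetical names, and whether first-seen order is the ordering the callers actually need is not confirmed here.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration, not NatTable code.
class OrderedDedupSketch {
    // Collects indexes without duplicates, preserving first-seen order,
    // and returns them as a List so a List-based signature stays unchanged.
    static List<Integer> dedupeKeepingOrder(Iterable<Integer> visited) {
        Set<Integer> seen = new LinkedHashSet<>();
        for (Integer index : visited) {
            seen.add(index);
        }
        return new ArrayList<>(seen);
    }
}
```

Given the visited sequence 1, 2, 3, 2, 3, 3 this returns the list [1, 2, 3]. The extra cost is one hash lookup per visited node plus the final copy, which would have to be measured against the savings from not translating the same index several times.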
Clearing hiddenRowIndexes
clear() on expandAll() ... with GlazedLists there are no hiddenRowIndexes because the collapsing and expanding of rows is done in GlazedLists. Thinking about this, clear() should be fine. Did you test this? If it works for you, feel free to create a ticket and contribute the fix.
About the handling of StructuralChangeEvents, you might be correct. The RowHideShowLayer is checking for updates on such changes. Maybe the same handling needs to be done in case there are hidden row indices.
Trigger events
Yes, layers can fire events even if they are not triggered via commands.
That said, I'm not sure what I should do with your observations. The codebase for 1.1 is frozen and I will not perform major changes before the 1.1 release.
After the 1.1 release there will be no further active development on the 1.1 architecture. So the only option will then be to contribute.
Nevertheless, thanks for your observations. Maybe you will contribute some patches to make that feature better in the future.
Greez,
Dirk
Re: using NatTable trees without GlazedLists [message #1297800 is a reply to message #1297722]
Tue, 15 April 2014 22:57
Rob Sigel Messages: 4 Registered: April 2014
Junior Member
Thanks for the fast reply, Dirk!
Sorry about the bad timing, but it was due to unavoidable scheduling conflicts on my side. I'd had time to put together an initial prototype back in late November, then priority changes kept me away until the end of last week. At which point I saw in the forums that 1.1 was coming soon, so I updated my prototype to the current SNAPSHOT build and investigated my prior concerns, resulting in the above forum post.
If I'm the only one not using GlazedLists, I may take another look at getting that working acceptably with my dataset, from both a memory and performance perspective. I expect I'm not the only one with massive dataset hierarchies, so I may have done something incorrectly with my initial GlazedLists tree prototype. After all the deep code dives I've performed lately, I've now got a much better idea how things are tied together.
I'll be glad to contribute patches. I've been unable to compile the NatTable project from source so far due to various dependency problems. (I'm not familiar with Maven or Tycho for the command-line build, so I'm not sure how to interpret those dependency problems. In the Eclipse IDE I continually get manifest errors in the Target Platform/Target Definition editor for the target-platform.target file in the target-platform project, but I'll keep trying.)
Thanks again,
Rob