using NatTable trees without GlazedLists [message #1296282] Mon, 14 April 2014 23:29
Rob Sigel
I'm dealing with datasets using trees with around a million rows. Some of the datasets are shallow hierarchies (with most leaf nodes at the root and a minority of leaf nodes in 2 level hierarchies), and others are deeper (with almost all leaf nodes 20+ levels deep in their tens of thousands of hierarchies).

Due mostly to memory constraints (and some observed performance issues), I've been unable to use the GlazedLists tree implementation, so I'm attempting to create a non-Glazed implementation through the interfaces. Thankfully the upcoming v1.1 release appears to be adding many features I'd customized in my own v1.0.1 implementation, like expandAll/collapseAll.

Some thoughts/observations about the current v1.1 codebase when GlazedLists isn't being used:

Index vs Position in TreeLayer, ITreeData, and ITreeRowModel

Outside the Trees, index has a very specific meaning - the unique location of data in the bottom-most layer (DataLayer) of the stack. (Corresponding approximately to the key in a database.) In the TreeLayer, ITreeRowModel, and ITreeData, the "position" vs "index" distinction appears much less clear, unless I'm missing something. If a TreeLayer isn't required to be placed immediately above the DataLayer in the layer stack, then there can be multiple Position translations already involved before the TreeLayer sees the data. Thus it is unclear to me just what Index is supposed to mean for the ITreeRowModel and ITreeData. Shouldn't it still be a position instead of an index? Technically, are ITreeRowModel and ITreeData expected to perform their work according to the position of the data in the underlying layer, before the TreeLayer's own translations occur? Or are they working according to the DataLayer's true indexes, with appropriate translations between the TreeLayer position and the DataLayer's index expected to always be applied at the TreeLayer/ITreeRowModel boundary? Or is this up to the specific implementation to decide?

The situation is made more confusing since only some of the TreeLayer Events being sent up the Layer stack are passing collections of ITreeRowModel "Indexes" without translation (instead of TreeLayer Positions) -- see TreeLayer.expandTreeRow() and TreeLayer.expandAll() for example. Other Events are passing collections of translated TreeLayer positions -- see TreeLayer.collapseTreeRow() and TreeLayer.collapseAll() for example. Maybe this Event inconsistency is a bug?
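
To make the index/position distinction concrete, here is a minimal, self-contained sketch in plain Java. It is not NatTable code: `PositionIndexSketch` and its methods are hypothetical names, and it assumes a single layer that hides some rows on top of a flat data layer. An index identifies a row in the bottom-most layer, while a position is where that row currently appears after hidden rows are skipped.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical illustration only; none of these names are NatTable API.
final class PositionIndexSketch {

    // Indexes (stable identity in the bottom-most layer) of the rows that stay visible,
    // stored in display order. The list offset of an entry is its position.
    private final List<Integer> visibleIndexes = new ArrayList<>();

    PositionIndexSketch(int rowCount, Set<Integer> hiddenIndexes) {
        for (int index = 0; index < rowCount; index++) {
            if (!hiddenIndexes.contains(index)) {
                visibleIndexes.add(index);
            }
        }
    }

    // Position -> index: which underlying row is shown at this position?
    int getRowIndexByPosition(int position) {
        if (position < 0 || position >= visibleIndexes.size()) {
            return -1;
        }
        return visibleIndexes.get(position);
    }

    // Index -> position: where is this row currently shown? Returns -1 when hidden.
    int getRowPositionByIndex(int index) {
        return visibleIndexes.indexOf(index);
    }

    public static void main(String[] args) {
        // Five rows, with indexes 1 and 2 hidden: visible indexes are 0, 3, 4.
        PositionIndexSketch layer =
                new PositionIndexSketch(5, new TreeSet<>(Arrays.asList(1, 2)));
        System.out.println(layer.getRowIndexByPosition(1)); // 3
        System.out.println(layer.getRowPositionByIndex(4)); // 2
        System.out.println(layer.getRowPositionByIndex(1)); // -1
    }
}
```

With several such layers stacked, each one applies its own translation, which is why a value handed across a layer boundary is ambiguous unless it is clear whether it is an index or a position.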

New in v1.1: collections of all descendants

I suspect the collections of collapsed/hidden positions/indexes should be Sets where possible, not Lists. When iterating tree nodes to collect subnodes, child nodes (or their indices/positions) are added to a single list for every ancestor that is visited during the collection process (see collapseAll() for example). So in deeper trees that already have a million nodes, the collection of "indexes" returned by collapseAll() can have multiple millions of entries in the list, with leaves (and branches) duplicated in the list according to how many ancestors/branches appear above them. Subsequent work, including the index-to-rowposition translations in collapseAll(), thus inefficiently happens multiple times per child.
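
The duplication can be reproduced with a small, self-contained sketch. This is not the NatTable implementation; it only assumes the behaviour described above (every visited node appends all of its descendants to one shared collection). The flat `parentOf` array encoding and all names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.LinkedHashSet;

// Hypothetical illustration only; not NatTable code.
final class DescendantCollectionSketch {

    // parentOf[i] is the parent of node i, or -1 for a root.
    // Appends every descendant of 'node' to 'out' (linear scan per level, fine for a sketch).
    static void collectDescendants(int[] parentOf, int node, Collection<Integer> out) {
        for (int child = 0; child < parentOf.length; child++) {
            if (parentOf[child] == node) {
                out.add(child);
                collectDescendants(parentOf, child, out);
            }
        }
    }

    // Visits every node and appends its descendants to the same collection,
    // so a node is added once per ancestor above it.
    static Collection<Integer> collectForAllNodes(int[] parentOf, Collection<Integer> out) {
        for (int node = 0; node < parentOf.length; node++) {
            collectDescendants(parentOf, node, out);
        }
        return out;
    }

    public static void main(String[] args) {
        // A chain 0 -> 1 -> 2 -> 3: node 3 has three ancestors.
        int[] chain = {-1, 0, 1, 2};
        System.out.println(collectForAllNodes(chain, new ArrayList<>()).size());     // 6
        System.out.println(collectForAllNodes(chain, new LinkedHashSet<>()).size()); // 3
    }
}
```

For a chain of depth d the List grows roughly with d squared, while the Set stays at one entry per node, which is the difference that matters for trees 20+ levels deep.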

Clearing the TreeLayer.hiddenRowIndexes set

Could TreeLayer.hiddenRowIndexes simply be cleared during TreeLayer.expandAll(), instead of the call to hiddenRowIndexes.removeAll(treeRowModel.expandAll())? (Or are there expandAll() use cases where a simple hiddenRowIndexes.clear() would be incorrect?)
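
A minimal sketch of when the two approaches would agree or differ, using plain Java collections rather than the actual TreeLayer fields. The class and method names are hypothetical. The two only differ if the hidden set contains an index that the row model does not report as expanded.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical illustration only; not NatTable code.
final class ExpandAllClearSketch {

    // Mirrors hiddenRowIndexes.removeAll(expandedIndexes), applied to a copy.
    static Set<Integer> removeExpanded(Set<Integer> hidden, List<Integer> expanded) {
        Set<Integer> remaining = new TreeSet<>(hidden);
        remaining.removeAll(expanded);
        return remaining;
    }

    public static void main(String[] args) {
        // Every hidden index is reported (with duplicates): same result as clear().
        Set<Integer> hidden = new TreeSet<>(Arrays.asList(3, 4, 5));
        System.out.println(removeExpanded(hidden, Arrays.asList(3, 4, 5, 4, 5)));

        // Index 9 is hidden but not reported: removeAll keeps it, clear() would drop it.
        Set<Integer> stale = new TreeSet<>(Arrays.asList(3, 4, 9));
        System.out.println(removeExpanded(stale, Arrays.asList(3, 4)));
    }
}
```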

Should this set be cleared in response to StructuralChangeEvents, at least vertical ones? Isn't a vertical StructuralChangeEvent the correct way for the underlying DataLayer to indicate emptying the treetable and/or completely repopulating the treetable?

While there's no documentation for this, I'm assuming Layers (like the DataLayer) are allowed to trigger Events that weren't caused by Commands. Or are externally-caused dataset changes always expected to drive Commands down the entire NatTable layer stack, with the Events only happening in response to handled Commands?

Re: using NatTable trees without GlazedLists [message #1297722 is a reply to message #1296282] Tue, 15 April 2014 21:47
Dirk Fauth
Two things before I try to answer some of the questions:

1. Starting such a discussion shortly before the upcoming release is called "bad timing"! If you had started this discussion a few weeks earlier, we could have tried to clean things up before the release.
2. You are the first one I know of who is using the tree functionality without GlazedLists. I always thought GlazedLists is used to get rid of performance issues. I have never seen any example of a NatTable tree without GlazedLists, which is possibly the reason for your investigation results. Nobody used this functionality before AFAIK, or at least nobody has used it that intensively.

Index vs. Position
As I said before, I'm not aware of anybody who is using the tree without GlazedLists. And there the indexes are needed to operate on the TreeList. I'm not sure whether it is an "inconsistency bug" or intentional. But it works well with GlazedLists. Maybe you are just confused by the parameter names? Digging deeper into this would possibly result in refactoring, and that is something for the new architecture. So at this point I would answer: "It is up to the specific implementation to decide."

Collections of descendants
Well I'm not 100% sure why the API is using a List instead of Collection. But possibly the reason is about ordering. So simply changing List to Set might cause issues later. But as I said, I'm not 100% sure as I never used or tested trees without GlazedLists. And I don't want to change the API in the current architecture. Of course you could implement it internally to use a Set and then return a correctly ordered List. Not sure if that would be better in terms of performance.
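
The "Set internally, ordered List on return" idea could look roughly like the sketch below. It is plain Java, not the NatTable API, and it assumes that "correctly ordered" means ascending index order, which is not confirmed in this thread. `OrderedDescendantsSketch` and `toOrderedList` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;
import java.util.TreeSet;

// Hypothetical illustration only; not NatTable code.
final class OrderedDescendantsSketch {

    // Removes duplicates and sorts ascending via a TreeSet,
    // then copies into a List to satisfy a List-based API.
    static List<Integer> toOrderedList(Collection<Integer> collected) {
        return new ArrayList<>(new TreeSet<>(collected));
    }

    public static void main(String[] args) {
        System.out.println(toOrderedList(Arrays.asList(5, 3, 5, 1, 3))); // [1, 3, 5]
    }
}
```

Whether this is faster in practice depends on how many duplicates the traversal produces; the extra copy into a List is linear in the number of unique entries.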

Clearing hiddenRowIndexes
clear() on expandAll() ... with GlazedLists there are no hiddenRowIndexes because the collapsing and expanding of rows is done in GlazedLists. Thinking about this, clear() should be fine. Did you test this? If it is ok for you, feel free to create a ticket and contribute the fix.

About the handling of StructuralChangeEvents, you might be correct. The RowHideShowLayer is checking for updates on such changes. Maybe the same handling needs to be done in case there are hidden row indices.
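
As a rough sketch of that idea, the snippet below resets a hidden-index set when a vertical structural change arrives. It is self-contained plain Java: the `StructuralChange` interface is a hypothetical stand-in, not the NatTable event type, and clearing everything is a deliberate simplification. A real implementation would probably need to adjust individual indexes rather than drop them all.

```java
import java.util.Set;
import java.util.TreeSet;

// Hypothetical illustration only; not NatTable code.
final class HiddenRowsOnStructuralChangeSketch {

    // Stand-in for a structural change event (assumption, not the NatTable interface).
    interface StructuralChange {
        boolean isVerticalStructureChanged();
    }

    private final Set<Integer> hiddenRowIndexes = new TreeSet<>();

    void hideRow(int index) {
        hiddenRowIndexes.add(index);
    }

    int hiddenCount() {
        return hiddenRowIndexes.size();
    }

    // Drops all hidden indexes when the row structure changed; ignores other changes.
    void handleEvent(StructuralChange event) {
        if (event.isVerticalStructureChanged()) {
            hiddenRowIndexes.clear();
        }
    }

    public static void main(String[] args) {
        HiddenRowsOnStructuralChangeSketch layer = new HiddenRowsOnStructuralChangeSketch();
        layer.hideRow(7);
        layer.handleEvent(() -> false); // horizontal-only change: nothing is cleared
        System.out.println(layer.hiddenCount());
        layer.handleEvent(() -> true);  // vertical change: hidden set is reset
        System.out.println(layer.hiddenCount());
    }
}
```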

Trigger events
Yes, layers can fire events even if they are not triggered via commands.

That said, I'm not sure what I should do with your observations. The codebase for 1.1 is frozen and I will not perform major changes before the 1.1 release.
After the 1.1 release there will be no further active development on the 1.1 architecture. So the only option will then be to contribute.

Nevertheless, thanks for your observations. Maybe you will contribute some patches to make that feature better in the future.

Greez,
Dirk

Re: using NatTable trees without GlazedLists [message #1297800 is a reply to message #1297722] Tue, 15 April 2014 22:57
Rob Sigel
Thanks for the fast reply, Dirk!

Sorry about the bad timing, but it was due to unavoidable scheduling conflicts on my side. I'd had time to put together an initial prototype back in late November, then priority changes kept me away until the end of last week. At which point I saw in the forums that 1.1 was coming soon, so I updated my prototype to the current SNAPSHOT build and investigated my prior concerns, resulting in the above forum post.

If I'm the only one not using GlazedLists, I may take another look at getting that working acceptably with my dataset, from both a memory and performance perspective. I expect I'm not the only one with massive dataset hierarchies, so I may have done something incorrectly with my initial GlazedLists tree prototype. After all the deep code dives I've performed lately, I've now got a much better idea how things are tied together.

I'll be glad to contribute patches. I've been unable to compile the NatTable project from source so far due to various dependency problems. (I'm not familiar with Maven or Tycho for the commandline build, so I'm not sure how to interpret those dependency problems, and in the Eclipse IDE I continually get manifest errors in the Target Platform/Target Definition editor for the target-platform.target file in the target-platform project, but I'll keep trying.)

Thanks again,
Rob

Re: using NatTable trees without GlazedLists [message #1298305 is a reply to message #1297800] Wed, 16 April 2014 07:06
Dirk Fauth
Quote:
If I'm the only one not using GlazedLists


I don't know if you are the only one not using GlazedLists. I'm just not aware of any other user that is not using GlazedLists for trees. Slightly different meaning. ;)

Quote:
I'll be glad to contribute patches


That would be great! For the future we will need more contributions, as we will focus more on the new architecture. And you might know that there is so much code and functionality that can't be maintained by just one person.

I see 3 interesting things to start contributing:

1. Javadoc
Over the last months I tried to add Javadoc at several places, but of course I'm not finished. If you have a clearer understanding of some functionality you can try to add the Javadoc. I will review it of course, but that might be helpful.

2. An example for NatTable tree without GlazedLists.
I think that would be awesome for others to see how to achieve that.

3. Bugfixes and enhancements.
I think I don't need to say anything more about that. ;)

For small fixes that don't have an impact on other functionality, we can think about adding them to 1.1 now. But that needs to be done in the next few days. The release process is almost done.
For example, the tickets you opened recently were a bug in the TreeRowModel and an enhancement on expand handling. Small changes that are good to get into 1.1 as I see them as issues that can be fixed in frozen state. (I did that already). The event handling in TreeLayer to modify the hiddenRowIndexes would be something bigger, as it could have impact on other use cases like GlazedLists tree building (modifying the hiddenRowIndexes when they aren't used would hurt those use cases). So such a modification can not make it to 1.1.

For the 1.1 release I recently created a contribution guide. In fact it is just an enhancement to the old "Getting started". I just set up a new workspace on a new notebook, and everything works fine. Also the target resolution. Maybe you want to give it another try?

http://www.eclipse.org/nattable/documentation.php?page=contribution_guide

Greez,
Dirk