Home » Modeling » VIATRA » Rete Visualizer Node Explanations(How to express the nodes in the Rete Visualizer as basic Rete nodes (projection, filter, etc.)?)
Rete Visualizer Node Explanations [message #1829327] |
Wed, 01 July 2020 13:25 |
Hans van der Laan Messages: 34 Registered: February 2020 |
Member |
|
|
Hey,
I'm currently reading Zoltan's thesis [1] to get a better grip on how VIATRA works under the hood. I want to understand the engine better so that in my thesis I will be able to reason about the performance of the queries (or at least explain it to my supervisors / the company I'm doing my thesis at).
First of all, great explanation! You should provide a link somewhere in the documentation to this document. It really helps to get a grasp on the theory behind model queries!
I'm currently reading section 3 about graph pattern matching with Rete Networks. On page 29, you show the types of nodes the Rete network consists of (entity, relation, projection, filter, compute, join, anti-join and disjunction nodes). These explanations are very clear. However, I'm stuck trying to convert this knowledge back to the current implementation.
There are a few nodes I can kind-of imagine what they do, such as Join-Left / Uniqueness Enforcer. However, there are many nodes such as trimmer (some kind of filter?) and bucket (what is a discriminator?) for which I'm not sure what they do. I also don't understand the numbers next to the node names (e.g. I have a Trimmer node with the label: Trimmer: [1,2]/3)
Is there somewhere where I can find an explanation of what each node does (or translated back into the original node types) and what the numbers next to the node labels mean? If not, could you please provide such an explanation?
Kind regards,
Hans
[1] https://repozitorium.omikk.bme.hu/handle/10890/5357
PS: A while ago in a bug report, I asked if there was a way if I could see the contents of the tuples stored in the various nodes. I got an answer that this is sadly not possible. Is this only not possible via the API/UI or is there also not in the code a place somewhere where I could set a breakpoint and observe what is in the buckets?
|
|
|
Re: Rete Visualizer Node Explanations [message #1829373 is a reply to message #1829327] |
Thu, 02 July 2020 08:09 |
Zoltan Ujhelyi Messages: 392 Registered: July 2015 |
Senior Member |
|
|
Hi Hans,
thanks for the feedback about my thesis, I am really glad it was helpful to you. We were not linking it to the documentation as we did not consider it a very easy read, but we will think about it, and also because of the differences you have also found here. These differences were caused by either implementation concerns, e.g. by splitting or merging some node types we change readability and performance; or handling some corner cases effectively, e.g. when you want to duplicate an element of a tuple.
So far, we did not have the time to document in great length these elements (and this is one of the reasons we consider the Rete visualization an experimental component), but I try to address your questions here.
The numbers like []/2 or [1,2]/3 represent tuple masks - describing what elements to take from the parent node. E.g. [1,2]/3 means the parent is a 3-tuple, and the current node should contain elements 1 and 2 from it (in the given order).
The trimmer node in the code can be thought of the project node from the dissertation: the mask parameter is used to select which elements from the parent should be included (basically it is used to remove columns from the parent node, and in some case duplicate some others). A bucket is what its name is: basically elements might be grouped based on a key parameter - normally you can ignore that, it is only used for very low-level debugging reasons.
I hope these explanations helped to clear up the differences, if not, feel free to ask for more clarifications.
---
About accessing the contents of Rete node, it is indeed not possible in the UI, but if you really need to, you could access the Rete network at any point you have access to the ViatraQueryEngine, and go down through the query backend and query result providers, reaching the RetePatternMatcher classes that are already part of the Rete network. From that point, you can try to look into the nodes, but it will not be easy, e.g. some nodes are not caching for performance reasons and you will need some patience to figure out the things work internally.
Best regards,
Zoltán
|
|
|
Re: Rete Visualizer Node Explanations [message #1829383 is a reply to message #1829373] |
Thu, 02 July 2020 11:27 |
Hans van der Laan Messages: 34 Registered: February 2020 |
Member |
|
|
Yes, it does. Thank you for the explanation.
Quote: We were not linking it to the documentation as we did not consider it a very easy read, but we will think about it, and also because of the differences you have also found here.
To be honest, I was afraid that this would've been the case. When starting with VIATRA, I tried to learn how RETE worked by reading: Production Systems and Rete Algorithm Formalisation [1]. It was difficult to get through and they made a few mistakes in their example which had me really confused. Since then, I've put off diving into the algorithm (until now).
Probably I've not yet seen all the available nodes, however for the networks of my current queries there are still few nodes I'm unsure what they exactly do. Could you please explain what do the IX (Indexer), Aggregator Indexer, Count and Dispatch nodes do?
Next to this I was also wondering,: what kind of node do we get for custom aggregators? Do we get something equivalent to the count node?
[1] https://hal.inria.fr/inria-00099850/
[Updated on: Thu, 02 July 2020 11:29] Report message to a moderator
|
|
| |
Re: Rete Visualizer Node Explanations [message #1829390 is a reply to message #1829389] |
Thu, 02 July 2020 14:20 |
Gabor Bergmann Messages: 36 Registered: July 2009 |
Member |
|
|
Hi Hans,
my 1 minute Rete summary: think of relational algebra from CS intro courses, like union, project, join etc. Imagine that you store the (partial) result of each node of the relational algebra expression tree. An then imagine of you change the input relations, you can propagate deltas in this network and locally incrementally update each Rete node. Of course, we can actually save some memory by not actually storing the results at every node, only those where it really pays off from a performance perspective (e.g. it will be used for lookups).
As for concrete node types:
IX is Indexer. It Essentially stores tuples in some kind of bucketed hash lookup. So if there are 4 columns, then indexing tuples by mask [0,1]/4 means that they are sorted into buckets based on their first two elements. The most important reason to have such a node is to add IX nodes as parents to join nodes. If a new tuple arrives at the join node from one parent, then the join node can ask the other parent IX to look up which tuples are to be joined against the newly inserted one (due to having the same join key), and join them with the original update, and propagate this as its output delta.[/li]
From the point of view of the join node, an Aggregator Indexer has the same interface as regular IXers. But instead of returning all tuples in the bucket, it will return a single tuple with their aggregate value. This can be a neutral value if the bucket is empty (e.g. 0 in case of sum).[/li]
As a minor and irrelevant performance optimization, and also for legacy reasons, counting is implemented a bit differently than other aggregators. Hence the CountNode. [/li]
Dispatch nodes and Discriminator Buckets: it is a common use case to divide a big relation based on constant values of one column. E.g. if you have a pattern operation(umlClass, operationName, operationType, operationVisibility), then you can get all private operations by taking the subset of tuples where the last column is VisibilityKind::PRIVATE, similarly for PUBLIC, DEFAULT etc. Under the hood, a dispatch node receives all tuples of the operation, and dispatches them across many child nodes (the discriminator buckets) based on the value of a certain column. Internally, it just has a simple hash map as "routing table" that assigns the corresponding bucket node for each value of the column that was actually used in a pattern as a filtering constant.[/li]
|
|
| |
Goto Forum:
Current Time: Sat Nov 02 00:09:39 GMT 2024
Powered by FUDForum. Page generated in 0.04405 seconds
|