Re: [rdf4j-dev] Inferred and Explicit sail branch

From: Jeen Broekstra <jeen.broekstra@xxxxxxxxx>
Date: Fri, 14 Feb 2020 11:50:51 +1100

On Thu, Feb 13, 2020 at 9:39 PM Håvard Ottestad <hmottestad@xxxxxxxxx> wrote:

> Hi,
>
> I've been looking at the way inferred and explicit statements are handled. In the memory store I can see that all statements are stored in the same pile, with a boolean flag for each statement. I think it's done the same way for the native store, but I'm not sure.

Yes, native store does this the same way.

> Since there is an isolation level issue when using the union sail (1) I was wondering if it's time to do away with the dual sail source abstraction?
>
> This might also make queries over inferred data faster, since you don't end up executing the getStatements(...) method twice, once per sail source.
>
> What I'm wondering about then is how bad an idea this is. Will we need to add a flag to every statement? Will we need to have a tri-state flag for "inferred+explicit", "explicit only" and "inferred only"? How many breaking changes will this introduce, and how open are we to making those and moving to RDF4J 4.0 within the next 3-6 months?
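
To make the comparison in the question concrete, here is a minimal sketch of a single pile of statements with a per-statement boolean flag, queried either in one pass or as a union of an explicit and an inferred branch. All class and method names (FlaggedStatement, SinglePileStore, getStatementsViaUnion, scanBranch) are hypothetical and chosen for illustration. This is not the actual RDF4J SailSource or MemoryStore code, and it models the two branches as two scans over the same pile purely as an assumption.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical types for illustration only; these are not RDF4J classes.
final class FlaggedStatement {
    final String subject;
    final String predicate;
    final String object;
    final boolean explicit; // true = asserted directly, false = derived by an inferencer

    FlaggedStatement(String subject, String predicate, String object, boolean explicit) {
        this.subject = subject;
        this.predicate = predicate;
        this.object = object;
        this.explicit = explicit;
    }
}

final class SinglePileStore {
    private final List<FlaggedStatement> pile = new ArrayList<>();
    int scans = 0; // number of passes made over the pile so far

    void add(String subject, String predicate, String object, boolean explicit) {
        pile.add(new FlaggedStatement(subject, predicate, object, explicit));
    }

    // Single pass: the per-statement flag decides whether inferred statements are returned.
    List<FlaggedStatement> getStatements(String subject, boolean includeInferred) {
        scans++;
        List<FlaggedStatement> result = new ArrayList<>();
        for (FlaggedStatement st : pile) {
            if (st.subject.equals(subject) && (st.explicit || includeInferred)) {
                result.add(st);
            }
        }
        return result;
    }

    // Union style: one pass for the explicit branch, one for the inferred branch.
    List<FlaggedStatement> getStatementsViaUnion(String subject) {
        List<FlaggedStatement> result = scanBranch(subject, true);
        result.addAll(scanBranch(subject, false));
        return result;
    }

    private List<FlaggedStatement> scanBranch(String subject, boolean explicitBranch) {
        scans++;
        List<FlaggedStatement> result = new ArrayList<>();
        for (FlaggedStatement st : pile) {
            if (st.subject.equals(subject) && st.explicit == explicitBranch) {
                result.add(st);
            }
        }
        return result;
    }
}
```

Both query styles return the same statements in this sketch; the difference is only that the union style walks the pile once per branch, which is the cost the question refers to.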

I'm not against simplifying the SailSource structure in principle. It's a hard-to-follow part of the Sail design.

However, the notion of separating out inferred and explicit statements is rather central to its design. If you were to remove the notion of separate explicit and inferred SailSources, you would also have to change many things in other places, such as how SailSinks approve statements, to give just one example. It would certainly be a lot of work and involve a number of breaking changes. I don't think we'd really need a tri-state flag, or at least I don't see an immediate need for one.
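
One possible reading of the tri-state point, sketched under stated assumptions: the three cases ("explicit only", "inferred only", "inferred+explicit") can describe what a query asks for, while each stored statement still carries a single boolean. The enum StatementScope and its accepts method below are hypothetical names, not part of RDF4J. The sketch also assumes that a statement which is both asserted and derivable is stored once, as explicit.

```java
// Hypothetical query-side scope; not an RDF4J type.
enum StatementScope {
    EXPLICIT_ONLY,
    INFERRED_ONLY,
    EXPLICIT_AND_INFERRED;

    // Decides whether a statement with the given per-statement boolean flag is in scope.
    boolean accepts(boolean statementIsExplicit) {
        switch (this) {
            case EXPLICIT_ONLY:
                return statementIsExplicit;
            case INFERRED_ONLY:
                return !statementIsExplicit;
            default:
                return true; // EXPLICIT_AND_INFERRED takes both kinds
        }
    }
}
```

Under those assumptions the three-way choice lives in the query and the stored flag stays a plain boolean. Whether that holds for the real Sail API would need checking against the code.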

Having had a closer look, I actually think the problem might be a bug in the MemoryStore specifically, rather than in the logic in the UnionSailSource. You'll notice that if you execute the same tests using an inferencer on top of a native store, there is no transaction leak. And I find it rather odd that a flush on a SailSource object would have this effect. I also vaguely remember that we had problems before with the rather simple way (a global counter) in which the MemoryStore keeps track of snapshots, so I'd be more inclined to investigate that in more detail first, and see if we can fix it at that level.
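
For readers unfamiliar with the idea, here is a rough sketch of how snapshot tracking with a single global counter can work in general. It is not the MemoryStore implementation and does not reproduce the suspected bug. VersionedStatement, SnapshotCounterStore and all of their members are hypothetical names used only for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of counter-based snapshots; not RDF4J code.
final class VersionedStatement {
    final String value;
    final int sinceSnapshot;                 // first snapshot in which the statement is visible
    int tillSnapshot = Integer.MAX_VALUE;    // first snapshot in which it is no longer visible

    VersionedStatement(String value, int sinceSnapshot) {
        this.value = value;
        this.sinceSnapshot = sinceSnapshot;
    }

    boolean visibleIn(int snapshot) {
        return sinceSnapshot <= snapshot && snapshot < tillSnapshot;
    }
}

final class SnapshotCounterStore {
    private final List<VersionedStatement> statements = new ArrayList<>();
    private int currentSnapshot = 1; // the single global counter

    int currentSnapshot() {
        return currentSnapshot;
    }

    // Committing an addition bumps the counter and stamps the new statement with it.
    int commitAdd(String value) {
        currentSnapshot++;
        statements.add(new VersionedStatement(value, currentSnapshot));
        return currentSnapshot;
    }

    // Committing a removal bumps the counter and closes the statement's visibility range.
    int commitRemove(String value) {
        currentSnapshot++;
        for (VersionedStatement st : statements) {
            if (st.value.equals(value) && st.tillSnapshot == Integer.MAX_VALUE) {
                st.tillSnapshot = currentSnapshot;
            }
        }
        return currentSnapshot;
    }

    // A reader pins a snapshot number and sees only statements visible in it.
    List<String> read(int snapshot) {
        List<String> visible = new ArrayList<>();
        for (VersionedStatement st : statements) {
            if (st.visibleIn(snapshot)) {
                visible.add(st.value);
            }
        }
        return visible;
    }
}
```

In a scheme like this, isolation depends on every write being stamped with a snapshot number higher than any number a concurrent reader has pinned. A write stamped too early would become visible to that reader, which is the general kind of leak worth looking for.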

Cheers,

Jeen
