Re: [rdf4j-dev] 2 questions


From: "Jeen Broekstra" <jeen@xxxxxxxxxxxx>
Date: Tue, 29 Nov 2022 09:20:25 +1300

On Tue, 29 Nov 2022, at 08:27, Matthew Nguyen wrote:

Hey Jeen, thanks for the response.

> The thinking here is that the optimizer _attempts_ to change the query plan and fails, leaving the query plan unchanged. The optimizer failing should not lead to an immediate error however: it's possible that somehow the query can still be executed successfully. That's why the optimizers generally swallow exceptions.

Should the optimizer be able to fix errors or just optimize the query plan (maybe I'm not understanding what you mean by "immediate error")?

The point I'm making is that it's not the job of the optimizer to detect/handle this kind of error. The optimizer should attempt to optimize the query plan, and if any failure happens during that attempt, just back off, leave the original query plan in place, and exit normally.

If the problem the optimizer encountered is truly an error that should result in a query evaluation exception, it will be encountered again in the actual evaluation phase of the query execution, and that's where the responsibility sits for surfacing that exception up to the caller.
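A minimal sketch of that division of responsibility, using hypothetical stand-in types (`Plan`, `optimize`, `evaluate`) rather than the real RDF4J classes: the optimizer step backs off on failure and returns the original plan, and the same error then surfaces from the evaluation step.

```java
import java.util.function.UnaryOperator;

class BackOffOptimizerSketch {

    /** Hypothetical stand-in for a query plan; not an RDF4J type. */
    static final class Plan {
        final String expr;

        Plan(String expr) {
            this.expr = expr;
        }
    }

    /**
     * Attempts a rewrite. If the rewrite throws, the original plan is
     * returned unchanged and the method exits normally.
     */
    static Plan optimize(Plan original, UnaryOperator<Plan> rewrite) {
        try {
            return rewrite.apply(original);
        } catch (RuntimeException e) {
            // Back off: a genuine error will be hit again during evaluation.
            return original;
        }
    }

    /** Hypothetical evaluation step: errors raised here reach the caller. */
    static String evaluate(Plan plan) {
        if (plan.expr.startsWith("unknown:")) {
            throw new IllegalStateException("Unknown function '" + plan.expr + "'");
        }
        return "value(" + plan.expr + ")";
    }

    public static void main(String[] args) {
        Plan plan = new Plan("unknown:whatever");
        // A constant-folding style rewrite that fails on the unknown function.
        Plan optimized = optimize(plan, p -> new Plan(evaluate(p)));
        System.out.println("plan unchanged: " + (optimized == plan));
        try {
            evaluate(optimized);
        } catch (IllegalStateException e) {
            System.out.println("evaluation failed: " + e.getMessage());
        }
    }
}
```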

> Interesting find! I don't think that's caused by the optimizer swallowing the exception though. If you look at the server logs you will see that it actually does log an exception during query execution:

Right. It logs the error but doesn't pass it along. I think the console just dumps the logging (stack) to stdout so not sure it recognizes it as a failure.

It does though. Try just running the query directly from code, something like this:

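import org.eclipse.rdf4j.repository.sail.SailRepository;
import org.eclipse.rdf4j.sail.memory.MemoryStore;

String query = "...."; // your example query

var rep = new SailRepository(new MemoryStore());

try (var conn = rep.getConnection()) {
    // prepare the query, evaluate it, and print each result row
    conn.prepareTupleQuery(query).evaluate().forEach(System.out::println);
}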

This will fail with a query evaluation exception when you run it, as expected. What the console does is catch that exception, and print the message to the screen.

The problem you discovered in the Workbench is not that the optimizer swallows the exception, it's that the handling of the error later on in the execution process (when it does get thrown during the evaluation) somehow gets hidden in either the Server or the Workbench application.

There are probably certain classes of errors (maybe QueryEvaluationException?) that should be bubbled up from the optimizer?

I think the only kind of exception an optimizer should conceivably throw is if, by some bug in the optimizer itself, it gets the query plan in an unrecoverable state.

Anything else it should back out of, because it, in general, doesn't know if something further down the optimizer pipeline might do something clever that removes the exception (one example in this case might be some custom optimizer that recognizes certain "magic" function names and replaces them - the point is we don't know).
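To make the "magic function" scenario concrete, here is a small sketch under stated assumptions: `Call`, `foldConstant`, `rewriteMagic` and the function name `magic:shout` are all hypothetical and only illustrate the ordering argument. The first optimizer fails on the unknown function and backs off; a later one rewrites it, so evaluation succeeds. Had the first optimizer thrown, the second would never have run.

```java
import java.util.function.UnaryOperator;

class OptimizerPipelineSketch {

    /** Hypothetical plan: a single function call with one argument. */
    static final class Call {
        final String function;
        final String arg;

        Call(String function, String arg) {
            this.function = function;
            this.arg = arg;
        }
    }

    /** Hypothetical evaluator that only knows "upper" and "identity". */
    static String evaluate(Call call) {
        switch (call.function) {
            case "upper":
                return call.arg.toUpperCase();
            case "identity":
                return call.arg;
            default:
                throw new IllegalStateException("Unknown function '" + call.function + "'");
        }
    }

    /** Pre-evaluates the call; throws if the function is unknown. */
    static Call foldConstant(Call call) {
        return new Call("identity", evaluate(call));
    }

    /** Custom optimizer that replaces a "magic" name with a known function. */
    static Call rewriteMagic(Call call) {
        return call.function.equals("magic:shout") ? new Call("upper", call.arg) : call;
    }

    /** Applies one optimizer, keeping the current plan if it fails. */
    static Call tryOptimize(Call plan, UnaryOperator<Call> optimizer) {
        try {
            return optimizer.apply(plan);
        } catch (RuntimeException e) {
            return plan;
        }
    }

    static Call runPipeline(Call plan) {
        plan = tryOptimize(plan, OptimizerPipelineSketch::foldConstant);
        plan = tryOptimize(plan, OptimizerPipelineSketch::rewriteMagic);
        return plan;
    }

    public static void main(String[] args) {
        Call optimized = runPipeline(new Call("magic:shout", "foo"));
        System.out.println(evaluate(optimized));
    }
}
```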

> Whether that intention to keep these things decoupled still makes sense though is a fair question.

Lemme know what you guys want to do. I will probably have some cycles in December with holidays et al where I can help with this.

Thanks Matt. Might be worth logging an issue ticket for this so you can describe the proposal in a bit more detail and we can discuss. I'm aware there is also some work by Jerven on query compilation that sits in this same space. Jerven, you have any views on the refactor Matt is proposing?

Jeen

thx, matt

-----Original Message-----
From: Jeen Broekstra <jeen@xxxxxxxxxxxx>
To: rdf4j developer discussions <rdf4j-dev@xxxxxxxxxxx>
Sent: Fri, Nov 25, 2022 7:49 pm
Subject: Re: [rdf4j-dev] 2 questions

On Thu, 24 Nov 2022, at 03:29, Matthew Nguyen via rdf4j-dev wrote:
Hey folks, debugging some things and noticed the following that I wanted to get a take on:

1. Some of these optimizers are eating the exceptions (eg https://github.com/eclipse/rdf4j/blob/c607df2ace72eba12d6a3f9586b7fec4b8886129/core/queryalgebra/evaluation/src/main/java/org/eclipse/rdf4j/query/algebra/evaluation/impl/ConstantOptimizer.java#L248). Do we want that? So if I pass in an unregistered function, shouldn't the exception bubble up here?

The thinking here is that the optimizer _attempts_ to change the query plan and fails, leaving the query plan unchanged. The optimizer failing should not lead to an immediate error however: it's possible that somehow the query can still be executed successfully. That's why the optimizers generally swallow exceptions.

You can test it against the 4.2.1 workbench:

PREFIX fn: <http://example.com/>
SELECT ?whatever
WHERE {
BIND (fn:whatever("foo") AS ?whatever)
}

and it just silently fails.

Interesting find! I don't think that's caused by the optimizer swallowing the exception though. If you look at the server logs you will see that it actually does log an exception during query execution:

org.eclipse.rdf4j.query.QueryEvaluationException: Unknown function 'http://example.com/whatever'
at org.eclipse.rdf4j.query.algebra.evaluation.impl.StrictEvaluationStrategy.lambda$evaluate$0(StrictEvaluationStrategy.java:1524)
at java.util.Optional.orElseThrow(Optional.java:290)
at org.eclipse.rdf4j.query.algebra.evaluation.impl.StrictEvaluationStrategy.evaluate(StrictEvaluationStrategy.java:1524)
at org.eclipse.rdf4j.query.algebra.evaluation.impl.StrictEvaluationStrategy.evaluate(StrictEvaluationStrategy.java:1060)
at org.eclipse.rdf4j.query.algebra.evaluation.iterator.ExtensionIterator.convert(ExtensionIterator.java:47)
at org.eclipse.rdf4j.query.algebra.evaluation.iterator.ExtensionIterator.convert(ExtensionIterator.java:23)
...

Also if you run the query directly on a local repository (e.g. by using the Console client) it fails "normally" with the same exception.

So this looks like a potential bug in how the server sends evaluation errors back to the workbench (or possibly how the workbench handles receiving such errors).

2. Should we normalize these calls to the ValueExpr derived classes so we don't have these large if instanceof/else checks (eg https://github.com/eclipse/rdf4j/blob/7fc970aa1fe27308bb44ec20d106e56e6353fdc9/core/queryalgebra/evaluation/src/main/java/org/eclipse/rdf4j/query/algebra/evaluation/impl/StrictEvaluationStrategy.java#L919)? Looks like we would need one for evaluation/prepare if I'm reading this right. So something like:

expr.acceptEval(this, bindings) and expr.acceptPrepare(this, context)

and the derived classes would just call the appropriate evaluate/prepare through the accepted 'this' object. Would save a few cycles on the if/else checks I would think. Also a little cleaner, with less casting. And if these were made abstract in the base, any new expression class would need to implement them, so it's less likely to break b/c we forgot to add the corresponding if/else check.

This would probably be a good idea - however I haven't fully thought through if/how this would potentially lead to stronger coupling between the algebra and the evaluation strategy. I believe the original intent of the setup (back at the dawn of time) was that the algebra model itself would be kept fully independent from the evaluation strategy - that's probably why it's set up the way it is (either that or we simply didn't know better :)).

Whether that intention to keep these things decoupled still makes sense though is a fair question.
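The double-dispatch idea discussed above could look roughly like the sketch below. Everything in it is hypothetical and simplified: `Strategy`, `Constant`, `Var`, `Concat` and the `Map`-based bindings are illustrative stand-ins, not the RDF4J algebra or `StrictEvaluationStrategy`. Only the `acceptEval` name comes from the proposal.

```java
import java.util.Map;

class DoubleDispatchSketch {

    /** Hypothetical strategy: one overload per expression type, no instanceof chain. */
    interface Strategy {
        String evaluate(Constant expr, Map<String, String> bindings);

        String evaluate(Var expr, Map<String, String> bindings);

        String evaluate(Concat expr, Map<String, String> bindings);
    }

    /** Abstract in the base, so every new expression class must implement it. */
    static abstract class ValueExpr {
        abstract String acceptEval(Strategy strategy, Map<String, String> bindings);
    }

    static final class Constant extends ValueExpr {
        final String value;

        Constant(String value) {
            this.value = value;
        }

        @Override
        String acceptEval(Strategy strategy, Map<String, String> bindings) {
            return strategy.evaluate(this, bindings); // resolves to the Constant overload
        }
    }

    static final class Var extends ValueExpr {
        final String name;

        Var(String name) {
            this.name = name;
        }

        @Override
        String acceptEval(Strategy strategy, Map<String, String> bindings) {
            return strategy.evaluate(this, bindings);
        }
    }

    static final class Concat extends ValueExpr {
        final ValueExpr left;
        final ValueExpr right;

        Concat(ValueExpr left, ValueExpr right) {
            this.left = left;
            this.right = right;
        }

        @Override
        String acceptEval(Strategy strategy, Map<String, String> bindings) {
            return strategy.evaluate(this, bindings);
        }
    }

    static final class SimpleStrategy implements Strategy {
        @Override
        public String evaluate(Constant expr, Map<String, String> bindings) {
            return expr.value;
        }

        @Override
        public String evaluate(Var expr, Map<String, String> bindings) {
            String value = bindings.get(expr.name);
            if (value == null) {
                throw new IllegalStateException("Unbound variable '" + expr.name + "'");
            }
            return value;
        }

        @Override
        public String evaluate(Concat expr, Map<String, String> bindings) {
            return expr.left.acceptEval(this, bindings) + expr.right.acceptEval(this, bindings);
        }
    }

    public static void main(String[] args) {
        ValueExpr expr = new Concat(new Constant("Hello, "), new Var("name"));
        System.out.println(expr.acceptEval(new SimpleStrategy(), Map.of("name", "Matt")));
    }
}
```

Note that `ValueExpr` now has to reference the `Strategy` interface, which is exactly the algebra-to-evaluation coupling the reply raises as an open question.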

Jeen
