[rdf4j-dev] 4.1.0 is out the door

Hi!

I've just released 4.1.0. 

I would like to organise another planning workshop like the one that Jeen organised last year. It would be nice to start discussing our plans for the next major release. 

September would be a good time for me. Last year we held the meeting on a weekday at 12:00 CET. 

Here is a table of various time zones that might be relevant. 


I can create a doodle for us, but first I would appreciate some feedback on which dates and time zones to include.

Thank you to everyone for being involved in, and contributing to, RDF4J!

Cheers,
Håvard M. Ottestad

PS. I've attached the meeting notes that Jeen typed up from the previous workshop.



Hi all,

Thanks to everyone who participated in the live planning session yesterday. I personally found it a very useful and productive meeting. Here is a summary of the main outcomes, and further down the full meeting notes. Feel free to comment if anything was overlooked or misrepresented.

Present

Jeen Broekstra, Havard Ottestad, Bart Hanssens, Jerven Bolleman, Andreas Schwarte

Summary
  • Moving to Java 11 is agreed upon by all concerned, keeping open the possibility to support Java 8 backports of specific features as further 3.x releases (but only if specifically required by a stakeholder)
  • We agreed to remove deprecated/obsolete code drastically, erring on the side of removing too much. The observation being that if we remove something that people miss, it's easier to put it back in a following minor release than vice versa. In particular, SeRQL will be removed, as will SPIN (though SPIN functions and tuple function support will be preserved)
  • The main selling points of the 4.0 release (apart from cleanup and Java 11) will be large dataset validation with SHACL, and stabilization of RDF-star support
  • The query algebra and possibly the query engine internal details will be revised as part of the 4.0 release, with the following aims in mind:
    • setting us up for support of SPARQL 1.2 (including query annotations/hints)
    • improved support for concurrency / optimization
  • We discussed moving to a more "compiled" model for queries, allowing more drastic optimizations such as predicate / filter pushdown. For 4.0 we will focus on the API groundwork needed to make that possible. This will need to be spiked and formulated further, and relevant issues will be created to describe and track this work by the people most involved in it.
  • Several other possible improvements were discussed; the proposers will raise issues with a description and rationale to track these ideas.
  • It is agreed to make Elasticsearch / Solr / Lucene version upgrades (including reindexing) a user responsibility, offering relevant pointers in documentation/release notes where necessary, but not offering automated support from inside the RDF4J framework itself.
  • Timeline-wise we are looking at Q4 of 2021 at the earliest, depending on developer availability. The plan is to make 4.0 our focus immediately after the 3.7 release. In principle no further 3.x minor releases will be done, so we can focus on getting 4.0 ready as quickly as possible. If specific features require a release, however, we can opt to do another minor release by cherry-picking.
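
As a rough illustration of the predicate / filter pushdown idea mentioned in the summary, here is a minimal sketch in plain Java. All names in it (PushdownSketch, InMemoryStore, the two-argument getStatements overload) are hypothetical and are not RDF4J API; it only shows the difference between filtering after a lookup and handing the filter to the store.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical names throughout; none of this is RDF4J API.
final class PushdownSketch {

    static final class Triple {
        final String subject;
        final String predicate;
        final String object;

        Triple(String subject, String predicate, String object) {
            this.subject = subject;
            this.predicate = predicate;
            this.object = object;
        }
    }

    static final class InMemoryStore {
        private final List<Triple> triples = new ArrayList<>();
        // Counts triples handed back to the caller, to make the effect visible.
        int returned = 0;

        void add(String s, String p, String o) {
            triples.add(new Triple(s, p, o));
        }

        // Pattern-only lookup: the caller has to filter afterwards.
        List<Triple> getStatements(String predicate) {
            return getStatements(predicate, t -> true);
        }

        // Lookup with a pushed-down filter: non-matching triples never leave the store.
        List<Triple> getStatements(String predicate, Predicate<Triple> filter) {
            List<Triple> result = new ArrayList<>();
            for (Triple t : triples) {
                if (predicate.equals(t.predicate) && filter.test(t)) {
                    result.add(t);
                }
            }
            returned += result.size();
            return result;
        }
    }

    static InMemoryStore sampleStore() {
        InMemoryStore store = new InMemoryStore();
        store.add("alice", "name", "Alice");
        store.add("alice", "age", "34");
        store.add("bob", "age", "12");
        store.add("carol", "age", "51");
        return store;
    }

    static boolean isAdult(Triple t) {
        return Integer.parseInt(t.object) >= 18;
    }

    // The query layer fetches every "age" triple and filters on its own side.
    static List<Triple> adultsWithoutPushdown(InMemoryStore store) {
        List<Triple> result = new ArrayList<>();
        for (Triple t : store.getStatements("age")) {
            if (isAdult(t)) {
                result.add(t);
            }
        }
        return result;
    }

    // The query layer hands the filter to the store.
    static List<Triple> adultsWithPushdown(InMemoryStore store) {
        return store.getStatements("age", PushdownSketch::isAdult);
    }
}
```

Both variants produce the same answer; the difference is how many triples cross the boundary between the store and the query layer, which is where a real store could use an index or a remote query instead of a scan.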

Full Meeting notes

  1. Introductions
  2. main themes for RDF4J 4.0
    1. Java 11 migration
      1. All like it, prior constraints from commercial partners seem to be gone.
      2. We want to stay on LTS releases. But after Java 11 we can use multi-release jars.
      3. Keep the 3.x series around for bug fix releases, on an as-needed basis.
    2. removal of SeRQL, SPIN and other deprecated code
      1. SeRQL: should we keep it? None of us uses it, so let’s remove it. Fixes to the query algebra are easier without SeRQL around.
      2. SPIN sail: keep custom functions but remove most other parts. Some contributors left.
      3. Opinion: hard remove for 4.x, reintroduce if demanded for 4.1+.
      4. Also remove most default methods that were just there for minor releases. Exception: the query explanation API should stay.

    3. RDF-star stabilization
      1. Happening in minor releases. Could be the new selling point. Also depends on RDF-DEV CG/WG.
    4. query algebra revisions
      1. An opening to make it easier to support SPARQL 1.2, and also for performance tuning.
    5. Validating large datasets with SHACL
      1. Initial validation of a large dataset should be faster. Being able to validate a dataset on the fly. Looking at adding this to the SHACL sail, which is now much more modular. New approach: generating SPARQL queries, which would work for remote repositories as well.
  3. what's on your mind?
    1. What is thread-safe?
      1. Where do we guarantee behaviour?
      2. Can we get rid of synchronization of the close method on iterators? As it is today?
      3. Connection level thread safety? Can we have multiple reads on the same connection (in same transaction). (Jerven note: important for parallelizing query engine)
    2. Query compilation
      1. The current architecture basically allows this for the pure Java implementations.
      2. Major challenges are iterator removal and grouping/chunking. But this can be done without changes to end-user-visible behaviour.
      3. Open an issue on predicate (filter) pushdown into the getStatements method.
    3. How to let SAILs inform other layers that they support features, such as that a method would return a sorted iterator. Should we add default methods or more marker interfaces? Inference support is an example (not great example with a wrapper). 
    4. Query algebra: can we support hints?
    5. ES / Solr / Lucene upgrades and how to test these upgrades (ES/Solr/Lucene are releasing new minor and major versions fairly fast). Lucene and ES often suggest reindexing after upgrading to a new major version, and may also not be compatible with older/newer versions.
      1. Also: check stricter ES license for new ES versions: may not be compatible with RDF4J’s EPL
      2. Testing: could we use test containers?
      3. Q? Andreas who uses this.
      4. Conclusion: users should upgrade using the Solr/Lucene/ES tools outside of RDF4J. This includes reindexing if needed.
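
To make the question about how SAILs could advertise features (default methods versus marker interfaces) a bit more concrete, here is a minimal sketch in plain Java. Everything in it (CapabilitySketch, Store, SortedResults, Wrapper) is a hypothetical name invented for illustration and not RDF4J API, and the wrapper trade-off it shows is only one possible consideration, not a conclusion from the meeting.

```java
// Hypothetical names throughout; none of this is RDF4J API.
final class CapabilitySketch {

    interface Store {
        // Option A: a default method that implementations override to opt in.
        default boolean returnsSortedResults() {
            return false;
        }
    }

    // Option B: an empty marker interface that implementations add to opt in.
    interface SortedResults {
    }

    static final class MethodSortedStore implements Store {
        @Override
        public boolean returnsSortedResults() {
            return true;
        }
    }

    static final class MarkerSortedStore implements Store, SortedResults {
    }

    // A wrapping layer that delegates to another store.
    static final class Wrapper implements Store {
        private final Store delegate;

        Wrapper(Store delegate) {
            this.delegate = delegate;
        }

        // With option A the wrapper can forward the answer of its delegate.
        @Override
        public boolean returnsSortedResults() {
            return delegate.returnsSortedResults();
        }
    }

    static boolean sortedViaMethod(Store store) {
        return store.returnsSortedResults();
    }

    static boolean sortedViaMarker(Store store) {
        return store instanceof SortedResults;
    }
}
```

In this sketch a wrapped MethodSortedStore still reports sorted results because the wrapper forwards the call, while a wrapped MarkerSortedStore no longer passes the instanceof check, since the wrapper class itself does not carry the marker.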

Cheers,

Jeen
