[eclipselink-dev] Some ideas for EclipseLink 3.0.0


From: Patric Rufflar <patric@xxxxxxxxxxx>
Date: Tue, 27 Feb 2018 14:43:58 +0100
Delivered-to: eclipselink-dev@xxxxxxxxxxx
List-archive: <https://dev.eclipse.org/mailman/private/eclipselink-dev>
List-help: <mailto:eclipselink-dev-request@eclipse.org?subject=help>
List-subscribe: <https://dev.eclipse.org/mailman/listinfo/eclipselink-dev>, <mailto:eclipselink-dev-request@eclipse.org?subject=subscribe>
List-unsubscribe: <https://dev.eclipse.org/mailman/options/eclipselink-dev>, <mailto:eclipselink-dev-request@eclipse.org?subject=unsubscribe>
User-agent: Roundcube Webmail

Hello,

I'm not sure if this is the correct place for this kind of mail - if not, please let me know where to post it instead.

We have been working with EclipseLink for a long time, and I'd like to share some thoughts from our work with it and our proposals for the upcoming EclipseLink release 3.0.0.

I'm sorry that it got a little bit long, but hopefully it will be of some help to you.

Most of the proposals are performance or correctness related, and I'm quite confident that a significant number of users might benefit from the following improvements:

1. Implicitly use (OUTER) JOIN FETCH for EAGERly loaded ManyToOne references

If an entity contains a ManyToOne reference with FetchType.EAGER (the default), or when the corresponding weaving technique is disabled, EclipseLink issues a separate (sub-)query for each of those references when retrieving it. With complex hierarchies of such ManyToOne references, this may result in a dramatic performance drop, as the fetch of one simple query might cause many additional queries and round trips to the database.

IMHO, EclipseLink should implicitly use an (OUTER) JOIN FETCH for fetching EAGER references (maybe as a configurable option of the Session/EntityManager).

This should affect all queries irrespective of the source of such a query (e.g. find(), transparent indirection containers, Query).

To avoid JOIN FETCH queries that are too big or too deep, there should be a limit (ideally configurable) on how many references (or levels) are considered for implicit JOIN FETCH queries.

We had scenarios where the introduction of a JOIN FETCH increased single-threaded throughput by factors of 5-10. I would love to see that kind of "tuning" done automatically in EclipseLink.

(BTW, LAZY loaded references can only help here in cases where the ManyToOne reference is actually NOT accessed; otherwise the performance impact matches the one for eagerly fetched ManyToOne references.)
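
To illustrate where the round trips come from, here is a toy model in plain Java. It is not EclipseLink code: the class RoundTripModel and its methods are hypothetical, the "database" is an in-memory map, and it assumes that every ManyToOne reference is a cache miss. It only counts how many simulated queries each loading strategy issues.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy model that only counts simulated database round trips; not EclipseLink code. */
class RoundTripModel {
    /** Hypothetical child row: each one points at one parent row (ManyToOne). */
    static final class Child {
        final long id;
        final long parentId;
        Child(long id, long parentId) { this.id = id; this.parentId = parentId; }
    }

    private final List<Child> children = new ArrayList<>();
    private final Map<Long, String> parents = new HashMap<>();
    int roundTrips = 0;

    RoundTripModel(int childCount) {
        for (long i = 0; i < childCount; i++) {
            parents.put(i, "parent-" + i);
            children.add(new Child(i, i)); // assumption: every child has its own parent
        }
    }

    /** One query for the children, then one extra query per ManyToOne reference. */
    List<String> loadWithSeparateQueries() {
        roundTrips++; // stands for: SELECT ... FROM child
        List<String> result = new ArrayList<>();
        for (Child c : children) {
            roundTrips++; // stands for: SELECT ... FROM parent WHERE id = ?
            result.add(c.id + "->" + parents.get(c.parentId));
        }
        return result;
    }

    /** A single joined query returns children and parents together. */
    List<String> loadWithJoinFetch() {
        roundTrips++; // stands for: SELECT ... FROM child LEFT OUTER JOIN parent ...
        List<String> result = new ArrayList<>();
        for (Child c : children) {
            result.add(c.id + "->" + parents.get(c.parentId));
        }
        return result;
    }

    public static void main(String[] args) {
        RoundTripModel separate = new RoundTripModel(100);
        separate.loadWithSeparateQueries();
        RoundTripModel joined = new RoundTripModel(100);
        joined.loadWithJoinFetch();
        System.out.println(separate.roundTrips + " round trips vs " + joined.roundTrips);
    }
}
```

With deeper hierarchies the per-reference queries multiply at every level, which is why a configurable depth limit for the implicit join matters.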

2. Enable a way to provide query hints to transparent indirection containers

Currently, I'm not aware of a way to provide query hints for transparent indirection containers like IndirectSet. This prevents us from providing performance-related query hints like join/batch fetches or FetchSize (ideally with the option of specifying them globally, e.g. use a fetchSize of xyz for all indirect container loads).



3. Improve sequence caching

Using sequence caching is a good idea in many scenarios.
However, there's still room for improvement:

- Introduce a configuration option to share the sequence cache across transactions. Currently, it looks like each transaction has its own sequence cache. This lowers the cache efficiency (and causes unnecessary database round trips) for transactions with only a few persists. In most cases, especially where sequence values are retrieved from a globally synchronized database sequence generator, there's no reason why the sequence cache should not be shared across transactions/client sessions.

- Expose the sequence cache through an official API so that non-JPA code can also benefit from EclipseLink sequence caching, hence improving its efficiency even more.
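
The two points above could be sketched as a single thread-safe allocator that every transaction (and non-JPA code) draws from. This is only an illustration, not EclipseLink's implementation: the class SharedSequenceCache and the fetchBlockStart callback are hypothetical, and the sketch assumes that the database sequence is incremented by the allocation size on every call, so each fetched value starts a block of usable IDs.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.LongSupplier;

/** Sketch of a sequence cache shared by all callers; not EclipseLink's implementation. */
class SharedSequenceCache {
    private final LongSupplier fetchBlockStart; // hypothetical: one database round trip
    private final int allocationSize;
    private long next;     // next ID to hand out
    private long blockEnd; // exclusive end of the current block

    SharedSequenceCache(LongSupplier fetchBlockStart, int allocationSize) {
        if (allocationSize < 1) {
            throw new IllegalArgumentException("allocationSize must be >= 1");
        }
        this.fetchBlockStart = fetchBlockStart;
        this.allocationSize = allocationSize;
    }

    /** Thread-safe: all transactions draw from the same preallocated block. */
    synchronized long nextId() {
        if (next >= blockEnd) { // block exhausted (or nothing fetched yet)
            next = fetchBlockStart.getAsLong();
            blockEnd = next + allocationSize;
        }
        return next++;
    }

    public static void main(String[] args) {
        AtomicLong databaseSequence = new AtomicLong(1); // stands in for the database
        AtomicInteger roundTrips = new AtomicInteger();
        SharedSequenceCache cache = new SharedSequenceCache(() -> {
            roundTrips.incrementAndGet();
            return databaseSequence.getAndAdd(50);
        }, 50);
        for (int i = 0; i < 120; i++) {
            cache.nextId();
        }
        System.out.println("round trips: " + roundTrips.get());
    }
}
```

Because the block outlives any single transaction, many small transactions with one or two persists each share one round trip per block instead of paying one each.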


4. Make EntityManager.getReference() database roundtrip free

Currently, there is little difference in behavior between EntityManager.find() and EntityManager.getReference(). Both issue database calls, which is not required for the latter according to the JPA spec. This lowers the efficiency of getReference(). Imagine that you just want to link an entity (with a known PK) to a newly persisted one - getReference() would be perfect for that job without the need of fetching anything from the database (e.g. by returning a proxy).


Other JPA implementations do better here (Hibernate for example).
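
The proxy idea can be sketched in plain Java as a reference that knows its primary key and only reads state on first access. This is a minimal illustration, not how EclipseLink or Hibernate build their proxies: the class LazyReference and its loader callback are hypothetical, and the loader stands in for the database read.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.LongFunction;

/** Sketch of a reference that knows its primary key but loads state on demand. */
class LazyReference<T> {
    private final long id;
    private final LongFunction<T> loader; // hypothetical: performs the database read
    private T target;
    private boolean loaded;

    LazyReference(long id, LongFunction<T> loader) {
        this.id = id;
        this.loader = loader;
    }

    /** The primary key is already known, so no database access is needed. */
    long getId() {
        return id;
    }

    /** The first access to the state triggers the (simulated) database read. */
    synchronized T get() {
        if (!loaded) {
            target = loader.apply(id);
            loaded = true;
        }
        return target;
    }

    public static void main(String[] args) {
        AtomicInteger reads = new AtomicInteger();
        LazyReference<String> ref = new LazyReference<>(42L, id -> {
            reads.incrementAndGet();
            return "customer-" + id;
        });
        // Linking by foreign key only needs the ID, so nothing is read here.
        System.out.println("id=" + ref.getId() + ", reads=" + reads.get());
    }
}
```

Linking a new entity to a known PK only needs getId() for the foreign key column, so the loader is never invoked in that scenario.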


5. Add support for retrieving detached entities within transactions

Many entities retrieved by a JPA provider are not changed at all.

Often, this is known at development time. If EclipseLink supported some kind of hint (e.g. READ_DETACHED) to return a detached entity (even within a transaction), a significant amount of work and memory for building and managing the backup clone could be saved. Initially, I thought QueryHints.READ_ONLY would be exactly what I need; however, according to the documentation, it can only be used with the shared cache and for non-transactional queries. Neither applies in our use case.



6. Avoid StackOverflowError for certain entity models

Imagine a single entity which represents some kind of linked list element. Each record/entity references a previous record to define the chain (by using an EAGERly fetched ManyToOne reference).
The very first element has a null reference to the previous record.

Even though this kind of data structure is perfectly valid, EclipseLink has some issues with it: when the "chain" grows, EclipseLink will sooner or later throw a StackOverflowError. The reason for this is that eagerly fetched references are processed using a recursion-based approach. For each element, the stack grows, and we had cases where a 30-element chain already caused a StackOverflowError with default stack size settings in an enterprise-grade Linux environment. I'm not aware of any workaround (besides using a lazy previous reference or increasing the stack size) to avoid that issue.
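
The difference between the two approaches can be shown with a small plain-Java sketch. It is not EclipseLink's object-building code: the class ChainLoader, the Node type, and the map of "id -> previous id" rows are hypothetical, and the sketch assumes an acyclic chain whose first element has a null previous reference. The recursive variant uses one stack frame per element, while the iterative variant keeps the stack depth constant.

```java
import java.util.HashMap;
import java.util.Map;

/** Sketch: building a "previous" chain recursively versus iteratively. */
class ChainLoader {
    static final class Node {
        final long id;
        Node previous;
        Node(long id) { this.id = id; }
    }

    /** Recursive variant: stack depth grows with the length of the chain. */
    static Node loadRecursively(Map<Long, Long> previousIdById, long id) {
        Node node = new Node(id);
        Long previousId = previousIdById.get(id);
        if (previousId != null) {
            node.previous = loadRecursively(previousIdById, previousId);
        }
        return node;
    }

    /** Iterative variant: a loop follows the chain, so stack depth stays constant. */
    static Node loadIteratively(Map<Long, Long> previousIdById, long startId) {
        Node head = new Node(startId);
        Node current = head;
        Long previousId = previousIdById.get(startId);
        while (previousId != null) { // assumption: the chain is acyclic
            Node previous = new Node(previousId);
            current.previous = previous;
            current = previous;
            previousId = previousIdById.get(previousId);
        }
        return head;
    }

    /** Counts the elements reachable from head, including head itself. */
    static int length(Node head) {
        int count = 0;
        for (Node n = head; n != null; n = n.previous) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        Map<Long, Long> rows = new HashMap<>();
        rows.put(0L, null); // the very first element has no previous record
        for (long i = 1; i < 100_000; i++) {
            rows.put(i, i - 1);
        }
        System.out.println(length(loadIteratively(rows, 99_999L)));
    }
}
```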



7. Exploit parallelism of CPU bound tasks

Some tasks within EclipseLink are quite CPU intensive, e.g. change tracking calculations or creating backup clones in large transactions. Throughput can be significantly increased in certain scenarios if EclipseLink exploited the parallelism of such tasks by using multiple threads (should be configurable).
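
As a rough illustration of the idea, the comparison of working copies against backup copies is independent per object and can be spread across cores. The sketch below uses a parallel stream over plain strings; the class ParallelChangeDetection is hypothetical and is not related to EclipseLink's actual change-set calculation.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Objects;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

/** Sketch: comparing working copies against backup copies on multiple threads. */
class ParallelChangeDetection {
    /** Returns the indices (in ascending order) where working differs from backup. */
    static List<Integer> changedIndices(List<String> backup, List<String> working) {
        if (backup.size() != working.size()) {
            throw new IllegalArgumentException("lists must have the same size");
        }
        return IntStream.range(0, backup.size())
                .parallel() // each comparison is independent of the others
                .filter(i -> !Objects.equals(backup.get(i), working.get(i)))
                .boxed()
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> backup = Arrays.asList("a", "b", "c", "d");
        List<String> working = Arrays.asList("a", "x", "c", "y");
        System.out.println(changedIndices(backup, working));
    }
}
```

Whether this pays off depends on the size of the transaction; for small ones the thread hand-off costs more than it saves, which is one reason to make it configurable.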


8. Weaving: Eliminate need for a backup clone in certain scenarios

This is something more experimental:

For entities with a significant number of mappings, the backup clone adds significant CPU and memory overhead. In certain scenarios where no or only a few fields changed and where weaving is used, alternative methods might perform much better here, e.g. by storing the original values in the original clone before changing them (so that the original clone contains both database and changed values).
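
A minimal sketch of that idea, assuming the setters are the only write path (which is what weaving would have to guarantee): the entity records an attribute's original value the first time it is changed, so no separate backup clone is needed. The class TrackedPerson and its methods are hypothetical and only illustrate the bookkeeping; this is not what EclipseLink's weaver generates.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

/** Sketch: an entity that keeps original values itself instead of a backup clone. */
class TrackedPerson {
    private String name;
    private int age;
    // Created lazily, so an entity that is never modified pays almost nothing.
    private Map<String, Object> originalValues;

    TrackedPerson(String name, int age) {
        this.name = name;
        this.age = age;
    }

    void setName(String newName) {
        recordOriginal("name", name, newName);
        name = newName;
    }

    void setAge(int newAge) {
        recordOriginal("age", age, newAge);
        age = newAge;
    }

    /** Remembers the database value the first time an attribute actually changes. */
    private void recordOriginal(String attribute, Object oldValue, Object newValue) {
        if (Objects.equals(oldValue, newValue)) {
            return;
        }
        if (originalValues == null) {
            originalValues = new HashMap<>();
        }
        if (!originalValues.containsKey(attribute)) { // keep the first (database) value
            originalValues.put(attribute, oldValue);
        }
    }

    /** Attributes whose current value differs from the originally loaded value. */
    Map<String, Object> changedAttributes() {
        Map<String, Object> changes = new HashMap<>();
        if (originalValues == null) {
            return changes;
        }
        if (originalValues.containsKey("name") && !Objects.equals(originalValues.get("name"), name)) {
            changes.put("name", name);
        }
        if (originalValues.containsKey("age") && !Objects.equals(originalValues.get("age"), age)) {
            changes.put("age", age);
        }
        return changes;
    }

    public static void main(String[] args) {
        TrackedPerson person = new TrackedPerson("Ann", 30);
        person.setAge(31);
        System.out.println(person.changedAttributes());
    }
}
```

The memory cost is proportional to the number of changed attributes rather than to the number of mappings, which matches the scenario described above where only a few fields change.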


9. Address bug reports that affect correctness

Correctness should be the most crucial feature for any ORM.

Please have a look at the corresponding bug reports that affect correctness, e.g.:


349477 (42 votes)
391279 (35 votes)
371743 (16 votes)
247662 (15 votes)
416837 (12 votes)
467470 (12 votes)


10. Care about startup time

EclipseLink takes (relatively) long to start up with large persistence units and/or classpaths. Most of the time is spent in I/O operations, which can be avoided in many scenarios (maybe configurable).

In certain short-lived EntityManagerFactory scenarios (e.g. unit/automated testing), it matters significantly whether EclipseLink needs 1 or 3 seconds to start up (please also take a look at bug 352845).



11. Address open bug reports

Currently, there are ~155 open, unresolved and unassigned critical/blocker bug reports for EclipseLink in Bugzilla.

Same is true for feature requests with a significant number of votes.

I'm sure the community would appreciate it if some of them finally got a response.

Finally, I'd like to thank all involved developers/companies for their great work related to EclipseLink!



Regards,
Patric
