Eclipse Community Forums: EclipseLink » Optimizing EclipseLink generated SQL

Home » Eclipse Projects » EclipseLink » Optimizing EclipseLink generated SQL(Questions about the cartesian products it often generates)

Show: Today's Messages :: Show Polls :: Message Navigator

Optimizing EclipseLink generated SQL [message #521938]

Fri, 19 March 2010 11:50

Rodrigo Greselle

Messages: 12
Registered: March 2010

Junior Member

Hello, everybody.

I've recently posted a question on stackoverflow.com, about the SQL generated by EclipseLink: here.

As it is still lacks an answer from the good folks over there, I thought I might ask it here as well.

At the project I'm currently working on, we're updating the software from Java EE 5 to 6, due to many reasons, but mostly because we need the CDI support for upcoming modules. The previous version used Hibernate as the persistence provider, but the domain layer was "free" of provider-specific annotations, and so it was reused with few modifications to the mappings.

However, the SQL generated by EclipseLink for our previous JPQL queries contains several cartesian plans, instead of (inner) joins, despite paths being available on the tables to perform joins and squeeze better performance.

I can post code and examples if you want, but I would simply like to know if it's possible to fine tune this behaviour through configuration, rather than using annotations or query hints.

On a short note, even when I add the query hints to join or batch-fetch associations, EclipseLink still applies cartesian plans on its SQL queries.

Thanks in advance for any comments you might have on this issue.

Report message to a moderator

Re: Optimizing EclipseLink generated SQL [message #522390 is a reply to message #521938]

Mon, 22 March 2010 14:23

Rodrigo Greselle

Messages: 12
Registered: March 2010

Junior Member

Weird, no answer either here or in stackoverflow.com. Am I asking something too stupid? If that's the case, be kind to at least point me to the docs where I can get rid of my confusion.

Report message to a moderator

Re: Optimizing EclipseLink generated SQL [message #522503 is a reply to message #521938]

Mon, 22 March 2010 19:51

James Sutherland

Messages: 1939
Registered: July 2009
Location: Ottawa, Canada

Senior Member

Please give a specific example and include the JPQL or query, and the SQL that EclipseLink generates, and the SQL that you desire.

EclipseLink offers several mechanisms to optimize a query, including join fetching, batch fetching, pagination, sql hints, and many more. You can also use your own SQL for any query.

James : Wiki : Book : Blog : Twitter

Report message to a moderator

Re: Optimizing EclipseLink generated SQL [message #522506 is a reply to message #521938]

Mon, 22 March 2010 20:25

Rodrigo Greselle

Messages: 12
Registered: March 2010

Junior Member

Hello James,

I am already applying lazy pagination, lazy loading, and query batching on the queries. I am not interest in using native SQL - my main issue is with the predicate of queries.

As you've asked, here's an example. This is a query outputed by EclipseLink:

FINE: SELECT t0.ID, t0.type, t0.CREATION, t0.NAME, t0.TYPE, t0.jpa_version, t1.ID, t1.EMAIL FROM swa_assignable t0, swa_resource t1 WHERE ((t0.id = ?) AND ((t1.ID = t0.ID) AND (t0.type = ?)))
        bind => [6, R]

This is given by a simple EntityManager.find() query. The entities Assignable and Resource are related by Inheritance - Assignable is parent of Resource, and this is mapped using a PK-FK in the Resource table.

I simply posted the simplest example I could find; this happens to almost every query I have. For example, I have a WorkOrder entity with several @ManyToOne relationships; whenever I issue parameterized queries that deal with these relationships, instead of the appropriate joins, EclipseLink does these equals comparisons on a very large WHERE clause.

So I was looking for something (configuration-wise) that could instruct EclipseLink to use joins instead. Does that exist? (although by your reaction, I believe it does not...)

Thanks for replying!

Report message to a moderator

Re: Optimizing EclipseLink generated SQL [message #523194 is a reply to message #521938]

Thu, 25 March 2010 13:35

James Sutherland

Messages: 1939
Registered: July 2009
Location: Ottawa, Canada

Senior Member

Not sure I understand. The join is in the WHERE clause, and is correct and necessary. Do you instead want the join in the FROM clause? There should be no difference to the database. EclipseLink will put the join in the FROM clause for outer joins, but for normal join it puts it in the where clause as it is more common.

What database are you using, and are you seeing any difference between the two in the database's fetch plan?

James : Wiki : Book : Blog : Twitter

Report message to a moderator

Re: Optimizing EclipseLink generated SQL [message #523213 is a reply to message #523194]

Thu, 25 March 2010 14:43

Rodrigo Greselle

Messages: 12
Registered: March 2010

Junior Member

Hello again,

Yes James, I'd like JOINS to be placed in the FROM clause. My example was poor - you're right when you say that the query above is acceptable.

But looking at more complex entities, with several foreign keys, joins in the FROM clause will produce a better execution plan. If you consider using tables with hundreds of thousands of records - which is the case for the application I'm working on - the DBA will freak out when he detects such SQL.

The database at hand is MS-SQL. I'll try to post a more meaningful example in the next hours, when I get some free time. I'll also try to post some execution plan snapshots from the DB management tool.

Thanks for the attention so far.

[EDIT: typos]

[Updated on: Thu, 25 March 2010 16:19]

Report message to a moderator

Re: Optimizing EclipseLink generated SQL [message #523242 is a reply to message #523213]

Thu, 25 March 2010 16:06

Rodrigo Greselle

Messages: 12
Registered: March 2010

Junior Member

Hi again James,

As promised, the SQL samples, as generated by EclipseLink:

/* select clause abbreviated for clarity */
SELECT t1.*, t0.*, t2.*, t3.*, t4.*, t5.*
FROM swa_work_orders t1
    LEFT OUTER JOIN swa_network_element t5 ON (t5.ID = t1.ne_id),
    swa_priorities t4,
    swa_dispatch_project t3,
    swa_work_order_detail t2,
    swa_work_order_type t0 WHERE ((((t1.severity = 1)
    AND (t1.STATUS = 1))
    AND (t1.project_id = 10001))
    AND ((((t0.ID = t1.TYPE_ID)
    AND (t2.worder_id = t1.ID))
    AND (t3.id = t1.project_id))
    AND (t4.ID = t1.priority_id)))
    ORDER BY t1.CREATION DESC

And what we think should be generated:

SELECT t1.*, t0.*, t2.*, t3.*, t4.*, t5.*
FROM swa_work_orders t1
    LEFT OUTER JOIN swa_network_element t5 ON t5.ID = t1.ne_id
    inner join swa_priorities t4 on t1.priority_id = t4.id
    inner join swa_dispatch_project t3 on t1.project_id = t3.id
    inner join swa_work_order_detail t2 on t1.id = t2.worder_id
    inner join swa_work_order_type t0 on t1.type_id = t0.id
    WHERE t1.severity = 1
    AND t1.STATUS = 1
    AND t1.project_id = 10001
    ORDER BY t1.CREATION DESC

If you compare execution plans for both queries, you'll see that the second format perfoms better, *especially* under heavy load of data on the involved tables.

The JPQL I am using to generate such a query is dynamic, and is the following:

select w from WorkOrder w 
                inner join fetch w.type 
                inner join fetch w.details 
                inner join fetch w.project 
                inner join fetch w.priority 
                left join fetch w.networkElement 
where w.status = :status and w.severity = :severity

The relationships above are @ManyTone (type, project, priority) and @OneToOne (details), and the query itself is optimized with the usage of inner join fetch, since I need these dependencies to display data.

I'll try to attach the execution plans from the MS-SQL management application to support my case Smile

Regards,

Rodrigo

Report message to a moderator

Re: Optimizing EclipseLink generated SQL [message #523297 is a reply to message #521938]

Thu, 25 March 2010 19:29

James Sutherland

Messages: 1939
Registered: July 2009
Location: Ottawa, Canada

Senior Member

It is odd that MySQL is treating these two equivalent queries differently.

You can log an enhancement request to have all join be joined in the from clause. Most of the functionality is already there in EclipseLink to support outer joins, so it should not be too hard to implement.

James : Wiki : Book : Blog : Twitter

Report message to a moderator

Re: Optimizing EclipseLink generated SQL [message #523300 is a reply to message #523297]

Thu, 25 March 2010 20:02

Rodrigo Greselle

Messages: 12
Registered: March 2010

Junior Member

It's MS-SQL, from Micro$oft, and not MySQL.

I'll have the DBA take a look at the queries again, given your claim that they should perform equally despite the load on the database.

To be honest I tend to believe that the joins on the FROM clause are a better choice than doing this work on the predicate, simply because they will single data out of the query sooner in the query processing phase. With less data on its hands, the DBMS will be able perform the predicate evaluation faster.

Regards,
Rodrigo

Report message to a moderator