Skip to main content

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] [List Home]
Re: [eclipselink-dev] ANSI inner join after LEFT OUTER JOIN

Hi Dies,

  That is a good point.

At the moment, the only examples I can come up with relate to situations where the EclipseLink application does not completely control the contents of the database, and I think they are fairly unlikely, so it would likely be best to go ahead with the outer join and just include a note about it in the documentation for the SymfowarePlatform.

-Tom

Dies Koper wrote:
Hi Tom,

We tried to create a test application where we would a wrong result set if we replaced the JOIN with a LEFT OUTER JOIN in this case you mentioned, but didn't succeed. I think it was a bad example too.

Looking at the FROM clause again:

 > FROM {oj
 > CMP3_EMPLOYEE t1 LEFT OUTER JOIN CMP3_DEPT t0
 > ON (t0.ID = t1.DEPT_ID)
 > LEFT OUTER JOIN (
 > CMP3_EMP_PROJ t5 JOIN CMP3_PROJECT t3
 > ON (t3.PROJ_ID = t5.projects_PROJ_ID)
 > JOIN CMP3_LPROJECT t4 ON (t4.PROJ_ID = t3.PROJ_ID)
 > ) ON (t5.EMPLOYEES_EMP_ID = t1.EMP_ID)
 > }, CMP3_SALARY t2

Now that I changed Employee's mapping from Project to LargeProject (and not update the table names), the 'CMP3_EMP_PROJ' join table above is actually joining tables CMP3_EMPLOYEE and CMP3_LPROJECT, which means all its rows point to existing rows in the CMP3_LPROJECT and CMP3_PROJECT tables. That means that CMP3_EMP_PROJ LEFT OUTER JOIN CMP3_PROJECT will not return more rows than CMP3_EMP_PROJ JOIN CMP3_PROJECT. Joining this again with CMP3_LPROJECT with a LEFT OUTER JOIN instead, we'd still get the same results.

So it still seems LEFT OUTER JOIN is a valid and safe replacement for DB's that don't support ANSI inner join syntax.
(or can you come up with another example?)

Thanks,
Dies


On 29/01/2010 18:31, Dies Koper wrote:
Hi Tom,

Yes! That's a good example.

This is the generated SQL. It has two (inner) JOINs.
The first has a join table on its left side, so can be replaced by a
LEFT OUTER JOIN.
The second one however, needs the inner join to include only
LargeProjects. A left outer join would not exclude Project and
SmallProjects, so would give unexpected (too many) results.

SELECT t1.EMP_ID, t2.EMP_ID, t1.PAY_SCALE, t1.L_NAME, t1.STATUS,
t1.VERSION, t1.ROOM_NUM, t1.GENDER, t2.SALARY, t1.F_NAME,
t1.START_OVERTIME, t1.END_OVERTIME, t1.START_TIME, t1.END_TIME,
t1.ADDR_ID, t1.DEPT_ID, t1.MANAGER_EMP_ID, t1.FORMER_COMPANY,
t1.FORMER_START_DATE, t1.FORMER_END_DATE, t1.START_DATE, t1.END_DATE,
t0.ID, t0.NAME

FROM {oj
CMP3_EMPLOYEE t1 LEFT OUTER JOIN CMP3_DEPT t0
ON (t0.ID = t1.DEPT_ID)
LEFT OUTER JOIN (
CMP3_EMP_PROJ t5 JOIN CMP3_PROJECT t3
ON (t3.PROJ_ID = t5.projects_PROJ_ID)
JOIN CMP3_LPROJECT t4 ON (t4.PROJ_ID = t3.PROJ_ID)
) ON (t5.EMPLOYEES_EMP_ID = t1.EMP_ID)
}, CMP3_SALARY t2

WHERE (t2.EMP_ID = t1.EMP_ID)

I would like to refine the code to only replace the inner with a left
outer join when it returns the correct results.

The first JOIN is generated in
SQLSelectStatement#appendFromClauseForOuterJoin. Also from the variable
names and comments in that code it looks safe to replace it with a LEFT
OUTER JOIN for Symfoware. Would you agree?

The second JOIN is generated in printAdditionalJoins, and you just
proved it is not safe to use a LEFT OUTER JOIN here in all cases.

I've only tried with the above query in the debugger, but I assume the
'desc.hasInheritance()' call could be a key.
It could be used to determine whether this query is of your suggested type.
The fix could then be something like:

if(i < nDescriptorTables) {
// it's descriptor's own table
if (session.getPlatform().isSymfoware() && !desc.hasInheritance()) {
writer.write(" LEFT OUTER");
}
writer.write(" JOIN ");

So:
- for other DB's but Symfoware nothing would change
- for tables involved in inheritance nothing would change; leads to
SQLException on Symfoware: document limitation
- for secondary tables, or table per class inheritance, the same results
would be returned using LEFT OUTER JOIN.

What do you think?

Regards,
Dies


On 29/01/2010 03:59, Tom Ware wrote:
Hi Dies,

You are right that is not a good example.

Perhaps I am missing something but what would happen in the following
scenario:

Project has two subclasses (mapped with multiple tables), LargeProject
and SmallProject

Employee has a mapping called projects to LargeProject

You run a query like: "select e from Employee e left join e.projects p"

You can simulate something like this by changing the projects mapping in
Employee in our jpa advanced model to point to LargeProject instead of
Project.

-Tom

Dies Koper wrote:
Hi Tom,

This update won't affect that JPQL query, as its generated SQL query
has no (inner) JOIN.

The generated SQL for that query is:

SELECT t1.EMP_ID, t2.EMP_ID, t1.PAY_SCALE, t1.L_NAME, t1.STATUS,
t1.VERSION, t1.ROOM_NUM, t1.GENDER, t2.SALARY, t1.F_NAME,
t1.START_OVERTIME, t1.END_OVERTIME, t1.START_TIME, t1.END_TIME,
t1.MANAGER_EMP_ID, t1.ADDR_ID, t1.DEPT_ID, t1.START_DATE, t1.END_DATE,
t1.FORMER_COMPANY, t1.FORMER_START_DATE, t1.FORMER_END_DATE, t0.ID,
t0.NAME
FROM {oj CMP3_EMPLOYEE t1 LEFT OUTER JOIN CMP3_ADDRESS t4 ON
(t4.ADDRESS_ID = t1.ADDR_ID) LEFT OUTER JOIN CMP3_DEPT t0 ON (t0.ID =
t1.DEPT_ID)}, CMP3_EMP_PROJ t5, CMP3_PROJECT t3, CMP3_SALARY t2
WHERE (((t4.ADDRESS_ID > ?) AND (t2.EMP_ID = t1.EMP_ID)) AND
((t5.EMPLOYEES_EMP_ID = t1.EMP_ID) AND (t3.PROJ_ID =
t5.projects_PROJ_ID)))

To test this, I added the following test:

package: org.eclipse.persistence.testing.tests.jpa.advanced
test class: EntityManagerJUnitTestSuite

public void testJOIN() {
EntityManager em = createEntityManager();
List result = em.createQuery("Select e from Employee e left join
e.address a join e.projects p where a.id > 1").getResultList();
}

Please let me know if I didn't run it the way you intended.

Regards,
Dies


On 28/01/2010 02:08, Tom Ware wrote:
Hi Dies,

How will Symfoware deal with a query like:

Select e from employee e left join e.address a join e.projects p where
a.id > 1

How will your update deal with a query like that?

-Tom


Dies Koper wrote:
Hi Tom,

I am looking at resolving remaining open issues on Symfoware for the
LRG
test sets.

Currently my biggest issue is the 'Inner Join Keyword' we have
discussed
before (also on the Wiki). About 1000 tests in Core LRG fail
because of
it, and about 50 in Core JPA.

I'd like to understand more about your comments:

-----------------------------------------------
There was the suggestion that we could "fake" this feature by using an
OUTER JOIN instead. I recommend against that strategy because of the
two
reasons someone would choose to use joining

* A fetch join is used to improve the efficiency of a query. As more
outer joins are added to a query, the result set gets big at an
exponential rate. This removes any benefit one would get from the
FETCH join
* A non-fetch join is used to limit results. Adding an outer join
means that the results are no longer limited in the same way.
-----------------------------------------------

I have gone through all the failures, to check what kind of SQL
statement the inner join was used in.
It turns out that in all of them a LEFT OUTER JOIN would give the same
result set as an INNER JOIN:

* Use of secondary tables
E.g. complexJoinTest:

CMP3_EMPLOYEE t0 JOIN CMP3_SALARY t1 ON (t1.EMP_ID = t0.EMP_ID)

* Use of join tables with relationships
E.g. testProjectOuterJoinTeamMembersJoinAddress:

CMP3_EMP_PROJ t5 JOIN CMP3_EMPLOYEE t2 ON (t2.EMP_ID =
t5.EMPLOYEES_EMP_ID

* Combination of both of the above
E.g. testLargeProjects:

CMP3_EMP_PROJ t5 JOIN CMP3_PROJECT t3 ON (t5.PROJECTS_PROJ_ID =
t3.PROJ_ID) JOIN CMP3_LPROJECT t4 ON (t4.PROJ_ID = t3.PROJ_ID))

With the secondary table, for each row in the primary table the
secondary table will have a corresponding row, so there will never be
any 'unmapped' rows on the LEFT side.

With the join table, as it joins two tables, there are no rows where
the
foreign key is NULL, so no 'unmapped' rows on the LEFT side.

I have spent a few days trying to come up with a pattern where the
LEFT
OUTER JOIN would return a different result set, but no matter how I
play
around with the table order or combinations, the join table is always
placed on the left side.

Also, in the source where these SQL statements are built
(SQLSelectStatement#appendFromClauseForOuterJoin), we have the
following
line that prints the table on the left side of the JOIN:


writer.write(relationTable.getQualifiedNameDelimited(printer.getPlatform()));




The variable name 'relationTable' seems to be in line with my findings
in the test results that it is always a join table on the left.

I understood your comments to be based on the difference between inner
and outer joins in general, where outer joins can (and are generally
used for that purpose) return also unmapped rows from one of the
tables
in its result set, so if we return these unwanted results the queries
will be less efficient. Is that correct?
From what I have seen so far it appears that in this part of the code
there are never any unmapped rows, so it is safe to use a LEFT OUTER
JOIN for Symfoware.

Thanks,
Dies


_______________________________________________
eclipselink-dev mailing list
eclipselink-dev@xxxxxxxxxxx
https://dev.eclipse.org/mailman/listinfo/eclipselink-dev


Back to the top