Skip to main content

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] [List Home]
Re: [eclipselink-users] Batch read question: controlling no. ofsubqueries?

On Mon, Apr 6, 2009 at 7:45 PM, Sebastien Tardif <stardif@xxxxxxxxxxxx> wrote:
See
http://forums.oracle.com/forums/thread.jspa?threadID=874139&tstart=15

Oh, good; that's nice to hear.  For those following along, the link includes this juicy nugget:

TopLink [and hence Eclipselink] [...] supports "batch reading" which is much more optimal. On the find query you can set the query to batch read the Item's User [in an Item-to-Seller one-to-many association] so when you access any of the Item's users it will load all of the Item's Users in a single query. The query uses a join instead of a dynamic IN clause with each id, so it much more efficient, and also works for objects with composite primary keys. It also loads to specific objects related to the query, not a random set of objects, and does not require usage of proxies.

And that's very cool.  If I understand this right, this means that (to use this example) my find query would suck back Items whose Sellers collections might or might not be initialized (i.e. might or might not be marked as EAGER), but when those Seller collections are accessed, whatever query populates them would grab all of them?

Now, that still means that the total number of SQL queries is proportional to the size of the original OQL finder query's result list (if you get ten Items back, you will have run the Item finder query, obviously, and then you'll run at most 10 more subqueries--one each time you access an Items getSellers() method).  Right?

I have a guy in our company right now who is touting a home-grown solution based on the fact that he can guarantee the number of round trips to the database.  In this case, he can guarantee that there will be one SQL query to figure out what the Items are (SELECT * FROM Items i WHERE i.x = 'gug' AND i.z = 'Foo'), and there will be one SQL query to figure out what the Sellers are, by reusing the JOIN and WHERE conditions from the first query (SELECT i.* FROM Seller s INNER JOIN Item i ON i.itemID = s.itemID WHERE i.x = 'gug' AND i.z = 'Foo').  Then his home grown solution marries the...well, effectively under the covers it's the equivalent of a RowSet--he marries the "Item core data" RowSet with the "Seller data" RowSet, and then builds his ORM objects off of that.

I think I have that right.  It's late.

Pros to EclipseLink batch reading: if you never access a collection, we won't have populated unneeded data.  Cons: we have to run lots of queries in the case where you access even just one of the Sellers of every Item in the result list.  "It depends": If you have 10,000 Items, you'll run at most 10,001 queries.

Pros to this guy's home grown solution: two SQL queries, max.  Ever.  Cons: a whole lot of data crossing the wire when you might not need it.

Have I accurately stated the EclipseLink scenario?

Thanks again for any help anyone can offer, and thanks for the link.

Best,
Laird

Back to the top