Eclipse Community Forums
Forum Search:

Search      Help    Register    Login    Home
Home » Eclipse Projects » EclipseLink » Logging about time to create entity manager factory Impl should exist
Logging about time to create entity manager factory Impl should exist [message #1775069] Tue, 24 October 2017 10:04 Go to next message
Nuno G. de M. is currently offline Nuno G. de M.Friend
Messages: 30
Registered: September 2012
Member
Hi,

In my oppinion, in the following class:
org.eclipse.persistence.jpa.PersistenceProvider

The method:

....
protected EntityManagerFactory createContainerEntityManagerFactoryImpl(PersistenceUnitInfo info, Map properties, boolean requiresConnection) {
// Record that we are inside a JEE container to allow weaving for non managed persistence units.
JavaSECMPInitializer.setIsInContainer(true);


,,,
}

Should be given some sort of StopWatch.start() StopWatch.stop()
LOGGER.inf("Setup Entity Manager - Costed: {} ms", stopWatch.getElapsedTimeMs);

This method is too important and may cost way too much time on the dpeloyment of an application.
I have just been stuck in debugging session for 7 seconds:
Waiting for the code to transition from:
// if predeploy fails then emSetupImpl shouldn't be added to FactoryProvider
transformer = emSetupImpl.predeploy(info, nonNullProperties);
EntityManagerFactoryProvider.addEntityManagerSetupImpl(sessionName, emSetupImpl);


NOTE: If
transformer = emSetupImpl.predeploy(info, nonNullProperties);

can have such a relevant bootstraping cost as it appears to have in the application I am looking at, I want to know exactly what price I am paying for have the entity manager factory bootstraped.


Please consider adding some always on INFO log statements on the cost of creating the EntityManager factory and as well on the cost of creating the very first entity manager for a given arbitrary EntityMangagerFactory.

Having such logging would have helped me find out the hotspots where I need to drill in to see if I can help out eclipselink with additional metadata to reduce this 7 second cost.

Many thanks for taking this feedback into consideration.

Kindest regards.
Re: Logging about time to create entity manager factory Impl should exist [message #1775208 is a reply to message #1775069] Thu, 26 October 2017 10:32 Go to previous messageGo to next message
Nuno G. de M. is currently offline Nuno G. de M.Friend
Messages: 30
Registered: September 2012
Member
Hi,

First of all, compliments to the person that wrote the following documentation:
https://wiki.eclipse.org/EclipseLink/Building

The documentation there is great, and I believe that building eclipselink from sources (at least for the 2.6.X branch) is much easier then last time I tried to the same thing.
Not struggle whatsoever with libraries not being found.

I think the document above can be made even easier by doing a more "guide for dummies approach".

1. git clone REPO_URL

2. If working on windows and having problems with file system path :
git config --global


# load all branches you might want to work with
3. git fetch

etc...

Then move over to eclipse, and explain that java > build > class path varibales ECLIPSELINK_HOME is to be set

and so on.

I think the document is great, but that now with eclipselink on git, you can really make a guide that even dummies can even build the elcipslenk.jar library from source in a heartbeat.

I would also add some print screens of the:
extension.lib.external

To show off pretty clearly what type of libraries the person can expect to have to put in there, not just textualy.

With that said, I have now trivially tweaked the source code to give me the information that I wanted.
Here is an illustration:
####2017-10-26 12:14:54,272 ThreadId:236 INFO [logger: org.eclipse.persistence.internal.jpa.EntityManagerSetupImpl] - Weaving Settings:
weaveChangeTracking : true
weaveLazy: true
weaveEager: false
weaveFetchGroups: true
weaveInternal: true
weaveRest: true
<LogContext:none> <ServerService Thread Pool -- 94>
####2017-10-26 12:14:54,303 ThreadId:236 INFO [logger: org.eclipse.persistence.internal.jpa.EntityManagerSetupImpl] - Load Mapping files executed in 15 ms <LogContext:none> <ServerService Thread Pool -- 94>
####2017-10-26 12:14:54,303 ThreadId:236 INFO [logger: org.eclipse.persistence.internal.jpa.EntityManagerSetupImpl] - processEntityMappings executed in 0 ms <LogContext:none> <ServerService Thread Pool -- 94>
####2017-10-26 12:14:54,303 ThreadId:236 INFO [logger: org.eclipse.persistence.internal.jpa.EntityManagerSetupImpl] - processEntityMappings executed in 0 ms <LogContext:none> <ServerService Thread Pool -- 94>
####2017-10-26 12:14:54,319 ThreadId:236 INFO [logger: org.eclipse.persistence.internal.jpa.EntityManagerSetupImpl] - Creation of EntityManager Predeploy for PersistenceUnit: SOMEAPP_EXTERNAL_PU executed in: 109 ms <LogContext:none> <ServerService Thread Pool -- 94>
####2017-10-26 12:14:54,319 ThreadId:126 INFO [logger: org.eclipse.persistence.internal.jpa.EntityManagerSetupImpl] - Weaving Settings:
weaveChangeTracking : true
weaveLazy: true
weaveEager: false
weaveFetchGroups: true
weaveInternal: true
weaveRest: true
<LogContext:none> <ServerService Thread Pool -- 64>
####2017-10-26 12:14:54,335 ThreadId:236 INFO [logger: org.eclipse.persistence.jpa.PersistenceProvider] - Creation of EntityManagerFactory for PersistenceUnit: SOMEAPP_EXTERNAL_PU executed in: 234 ms <LogContext:none> <ServerService Thread Pool -- 94>
####2017-10-26 12:14:55,115 ThreadId:126 INFO [logger: org.eclipse.persistence.internal.jpa.EntityManagerSetupImpl] - Load Mapping files executed in 796 ms <LogContext:none> <ServerService Thread Pool -- 64>
####2017-10-26 12:14:56,240 ThreadId:126 INFO [logger: org.eclipse.persistence.internal.jpa.EntityManagerSetupImpl] - processEntityMappings executed in 1125 ms <LogContext:none> <ServerService Thread Pool -- 64>

####2017-10-26 12:14:56,726 ThreadId:126 INFO [logger: org.eclipse.persistence.jpa.PersistenceProvider] - Creation of EntityManagerFactory for PersistenceUnit: SOMEAPP_PU executed in: 2625 ms


The results are already quite interesting.
While I am not so hopeful that the time spent on:
processEntityMappings executed in 1125 ms

Can trivially be reduced - since most likely the logic behind this step is by far not trivial.

The step:
Load Mapping files executed in 796 ms
To me looks an unnaceptable bechmark. I would be surprised that this step here cannot be easily be tuned, but I might be wrong.

In any case, once again, many thanks for the document referred above.
It helped me a lot.

Kindest regards.
Re: Logging about time to create entity manager factory Impl should exist [message #1775231 is a reply to message #1775208] Thu, 26 October 2017 14:43 Go to previous messageGo to next message
Nuno G. de M. is currently offline Nuno G. de M.Friend
Messages: 30
Registered: September 2012
Member
Hi,

I have now be able to reduce at least a little bit the Time it takes to create an Entity Manager factory in the case of this application.

Currently the above Factory:

Is yielding me something like:
EntityManagerSetupImpl] - Creation of EntityManager Predeploy for PersistenceUnit: SOMEAPP_PU executed in: 1951 ms

Whereas before we were looking at:
Creation of EntityManagerFactory for PersistenceUnit: SOMEAPP_PU executed in: 2625 ms

I did not go after any fancy complicated logic of eclipselink since I have no intention of getting burned.

The following log statements are relevant for this:
####2017-10-26 16:22:10,022 ThreadId:127 INFO [logger: org.eclipse.persistence.internal.jpa.metadata.MetadataProcessor] - Map ArchieveURLs to ArchieveObjects computed in : 93 ms for 16 archives using 5 threads <LogContext:none> <ServerService Thread Pool -- 64>
####2017-10-26 16:22:10,256 ThreadId:127 INFO [logger: org.eclipse.persistence.internal.jpa.EntityManagerSetupImpl] - Load Mapping files executed in 327 ms <LogContext:none> <ServerService Thread Pool -- 64>

We gain something like 400 ms on this loadMapings logic simply be reworking the
public void loadMappingFiles(boolean throwExceptionOnFail) {

In the MetadataProcess.java



LOAD MAPPINGS PROBLEM:
The load mappings method is not as efficient as it can be.
It redoes non cheap operations at least twice and single threaded.
This means we waste time and available cores.

In the load mappings we have to calls that are not re-using computations done on previous calls.

loadStandardMappingFiles(MetadataHelper.JPA_ORM_FILE)
loadStandardMappingFiles(MetadataHelper.ECLIPSELINK_ORM_FILE)

We have these two calls.
And this is not good because in each of these calls there will be Loop mapping rootUrl -> Archive.
The building of archive using the archive factory does not come for free.
In a debugger session we can see that each time the archive factory is called the debugger stops for a bit.
If the debugger stops for a bit it is obvious the opeation does not come fore free because we feel it already.


So here is what the implementation looks in my sloppy work branch:

SNIPEET 1 - reworking loadMappingFiles
/**
* INTERNAL: This method is responsible for figuring out list of mapping files to read into XMLEntityMappings
* objects and store on the project. Note, the order the files are discovered and read is very important so do not
* change the order of invocation.
*/
public void loadMappingFiles(boolean throwExceptionOnFail) {
// (a) Start by building a map of ArchiveURL -> Archive object
Map<URL, Archive> archieveUrlToArchive = getArchiveUrlToArchive();

<------------------ Does this make sense to you. We compute once and only once in this algorithm the Map of RootURls to The corresponding Archive objects. Not twice per each fo the MATA-INF/ files to hunt for.

// Read all the standard XML mapping files first.
loadStandardMappingFiles(MetadataHelper.JPA_ORM_FILE, archieveUrlToArchive);

// Read all the explicitly specified mapping files second.
loadSpecifiedMappingFiles(throwExceptionOnFail);

// Read all the standard eclipselink files last (if the user hasn't
// explicitly excluded them). The eclipselink orm files will be
// processed last therefore allowing them to override and merge
// metadata where necessary. Note: we want the eclipselink orm
// metadata to merge into standard jpa files and not vice versa.
// Loading them last will ensure this happens.
Boolean excludeEclipseLinkORM = Boolean.valueOf((String) m_project.getPersistenceUnitInfo().getProperties()
.get(PersistenceUnitProperties.EXCLUDE_ECLIPSELINK_ORM_FILE));
if (!excludeEclipseLinkORM) {
loadStandardMappingFiles(MetadataHelper.ECLIPSELINK_ORM_FILE, archieveUrlToArchive);
}

if (m_metadataSource != null) {
XMLEntityMappings entityMappings = m_metadataSource.getEntityMappings(this.m_predeployProperties, m_loader,
m_session.getSessionLog());
if (entityMappings != null) {
m_project.addEntityMappings(entityMappings);
}
}
}




SNIPPET 2 - Trivial GETTER METHODS:
/**
* @return A collection of URLs (e.g. jar-files ins WEB-INF/lib/ of a WAR application mentonedin the
* persistence.xml)
*/
protected Collection<URL> getRootUrls() {
PersistenceUnitInfo puInfo = m_project.getPersistenceUnitInfo();
Collection<URL> rootUrls = new HashSet<URL>(puInfo.getJarFileUrls());
rootUrls.add(puInfo.getPersistenceUnitRootUrl());
return rootUrls;
} <--------------------- This logic comprises a small buch of coad that originall was in the loadStandardMappingFiles

/**
* @return Create a map of ArchiveUrl -> Corresponding Archive Object
*/
protected Map<URL, Archive> getArchiveUrlToArchive() {
// (a) determine the jar-file urls
Collection<URL> rootUrls = getRootUrls();
// (b) Setup the archieve factory to create Archive objects
final ArchiveFactory archiveFactory = PersistenceUnitProcessor.getArchiveFactory(m_loader);
// (c) run a parallel hunt of All Archieve objects
return computeJarArchivesInParallel(archiveFactory, rootUrls); <---------------------------- This is a new method that Build the Archives in parallel instead of single threaded
}






SNIPPET 3:
Computing in JDK 7 (in jdk 8 we sohuld use Parallel streams by the way)

/**
* A constant that tells us the max amount of time we are willing to wait fo the thread pool to be done with all the
* arhieve building steps.
*
* Dumb constant, should be something like MAX cost per archieve (2 seconds) per number of archieves. If eclipselink
* would take 2 seconds to process on archieve, this would be an unsable library. In any case, 2 minutes of archieve
* scanning is ample time.
*/
private static final int MAX_WAIT_TIME_SECONDS_FOR_BUILDING_ALL_ARCHIEVES_SECONDS = 120;




/**
* When eclipse link goes hunting for META-INF/eclipselink-orm.xml, it cannot depend on the classloader to an
* efficient: classLoader.getResources("META-INF/eclipselink-orm.xml") to find all the resources, because of
* possible class loader isolation. Instead, it is forced to create so called eclipselink archieves, which offer
* APIs to search for resources within the archieve.
*
* <P>
* Each archieve is relateively expensive to build. For this reason, we shall use a thread pool to parallelize this
* step.
*
*
* @param archiveFactory
* A tool that can help us create archieves
* @param archieveUrls
* A list of URLs (e.g. path to jar-files mentioned in the persistence.xml) that might have an
* eclipselink-orm.xml. This parameter may not be null.
* @return A map of archieveUrl to Corresponding archieve
*/
protected Map<URL, Archive> computeJarArchivesInParallel(final ArchiveFactory archiveFactory,
Collection<URL> archieveUrls) {
StopWatch stopWatch = new StopWatch();
stopWatch.start();
int threadPoolSize = Math.min(5, archieveUrls.size());
Map<URL, Archive> result = computeJarArchivesInParallelImpl(archiveFactory, archieveUrls, threadPoolSize);
stopWatch.stop();
LOGGER.info(String.format(
"Map ArchieveURLs to ArchieveObjects computed in : %1$s ms for %2$s archives using %3$s threads",
stopWatch.getElapsedTimeMs(), archieveUrls.size(), threadPoolSize));
return result;

}

/**
* see {@link #computeJarArchivesInParallel(ArchiveFactory, Collection)}
*/
protected Map<URL, Archive> computeJarArchivesInParallelImpl(final ArchiveFactory archiveFactory,
Collection<URL> archieveUrls, int threadPoolSize) {
// (a) Trivial case - there are not archieveUrls
if (archieveUrls.isEmpty()) {
return Collections.emptyMap();
}

// (b) prepare a thread pool
// since We are using JDK 7 there is no change of usse parallel streams
final ExecutorService fixedThreadPool = (ThreadPoolExecutor) Executors.newFixedThreadPool(threadPoolSize);
final Map<URL, Archive> result = new ConcurrentHashMap<URL, Archive>();
final List<Exception> unexpectedError = Collections.synchronizedList(new ArrayList<Exception>());

// (c) Create Mapping jobs of URL to Archieve
// we want to parallelize this because the operation is not cheap
for (final URL currentArchiveUrl : archieveUrls) {
fixedThreadPool.submit(new Runnable() {
@Override
public void run() {
try {
// (c.1) Execute the mapping
Archive currentArchiev = archiveFactory.createArchive(currentArchiveUrl, null);
result.put(currentArchiveUrl, currentArchiev);
} catch (Exception e) {
// (c.2) If things go wrong do not lose track of the error
unexpectedError.add(e);
}
}
});
}
fixedThreadPool.shutdown();
try {
fixedThreadPool.awaitTermination(MAX_WAIT_TIME_SECONDS_FOR_BUILDING_ALL_ARCHIEVES_SECONDS,
TimeUnit.SECONDS);
} catch (Exception e) {
String errMsg = String.format(
"Unexpected error. Building Archieves for URLS: %1$s took more than: %2$s seconds. This is not allowed.",
archieveUrls, MAX_WAIT_TIME_SECONDS_FOR_BUILDING_ALL_ARCHIEVES_SECONDS);
throw new RuntimeException(errMsg);
}

// (d) Make sure none of the slaves died trying to create archieve
if (!unexpectedError.isEmpty()) {
Exception firstExceptio = unexpectedError.get(0);
String errMsg = String.format("Unexpected error: %1$s", firstExceptio.getMessage());
throw new RuntimeException(firstExceptio);
}
return result;
}





SNIPPET 4: The Old Load Standard Mappings File is deprecated - we do not want to be computing the archive objects each time, we want them handed over.
So we add a new API that you have already seen used in SNIPPET 1 to do precisely this
The method is simpler now.
And since you have more methods, you have the potential of documenting the code better and have less sapghetti.

/**
* @deprecated Do not use this API - this api is inneficient because one should be avoiding to compute Archive
* objets all the time
* @param ormXMLFile
* Mapping file to hunt for
*/
@Deprecated
protected void loadStandardMappingFiles(String ormXMLFile) {
// (a) Compute a Map of ArchiveUrls to Archive objectes
Map<URL, Archive> archieveUrlToArchive = getArchiveUrlToArchive();
// (b) Load mapping
this.loadStandardMappingFiles(ormXMLFile, archieveUrlToArchive);
}

/**
* This method is hunting for an arbitrary xml file (e.g. META-INF/eclipselink-orm.xml).
*
*
*/
protected void loadStandardMappingFiles(String ormXMLFile, Map<URL, Archive> archieveUrlToArchive) {
// (a) Determine the set of URLs pointing to jar-file archieves referred by the persistence-xml.
Collection<URL> rootUrls = getRootUrls();

// (b) Process Each of the ArchieveUrls
for (URL rootURL : rootUrls) {
getSessionLog().log(SessionLog.FINER, SessionLog.METADATA, "searching_for_default_mapping_file",
new Object[] { ormXMLFile, rootURL }, true);
URL ormURL = null;

Archive par = null;
try {

// Old innefficient line of code
// par = PersistenceUnitProcessor.getArchiveFactory(m_loader).createArchive(rootURL, null);
par = archieveUrlToArchive.get(rootURL);

if (par != null) {
ormURL = par.getEntryAsURL(ormXMLFile);

if (ormURL != null) {
getSessionLog().log(SessionLog.FINER, SessionLog.METADATA, "found_default_mapping_file",
new Object[] { ormURL, rootURL }, true);

// Read the document through OX and add it to the project., pass persistence unit properties for
// any orm properties set there
XMLEntityMappings entityMappings = XMLEntityMappingsReader.read(ormURL, m_loader,
m_project.getPersistenceUnitInfo().getProperties());
entityMappings.setIsEclipseLinkORMFile(ormXMLFile.equals(MetadataHelper.ECLIPSELINK_ORM_FILE));
m_project.addEntityMappings(entityMappings);
}
}
} catch (Exception e) {
throw new RuntimeException(e);
} finally {
if (par != null) {
par.close();
}
}
}
}





I would suggest that this tunning be added to eclipselink branch 2.6.x and above.
I really need performance boost on the deployment, and eclipselink bites at the deployment time both at entity manager factory creation and then at the cost of the first enty manager.
SAo anything that makes eclipselink boostrapping cost cheaper will be a very nice thing.


Feel free to change modifty throw away any snippet posted above.
I would just be happy that I can update my 2.6 branch and get the tunning in the way you see fit.

NOTE:
- I think there is further room for imprvement. The processing of each XML file once it gets found could most likely be parallelized as well.
I would recommend that be looked into as well.
The loading and parsing of XML data should ideall be done un100 ms.



[Updated on: Thu, 26 October 2017 14:48]

Report message to a moderator

Re: Logging about time to create entity manager factory Impl should exist [message #1775234 is a reply to message #1775231] Thu, 26 October 2017 14:51 Go to previous messageGo to next message
Nuno G. de M. is currently offline Nuno G. de M.Friend
Messages: 30
Registered: September 2012
Member
Simple question,

Is it allowed to fork eclipselink branch and Upload to GitHub so that you can look at any code changes that may be relevant?

I suspect that this is not allowed, especially considering that the GIT repository conveniently comes with this Plugins/ directory that my have third party libraries with special licenses.

But personally, I would like to be able to Fork Eclipselink to be able to when needed create a branch with small tunnings that you would either take or not not take.

Thanks for any reply on this.
Re: Logging about time to create entity manager factory Impl should exist [message #1775396 is a reply to message #1775234] Sun, 29 October 2017 13:06 Go to previous messageGo to next message
Nuno G. de M. is currently offline Nuno G. de M.Friend
Messages: 30
Registered: September 2012
Member
Is it possible to get any feedback, as to whether there is any intention on behalf of the my eclipselink code committers to take a good look at the bootstraping cost of the entity manager factories for the PUnits delcared in the persistence.xml of a web application, and try to remove from it inneficiencies (e.g. redundant steps) and overall speed up the hole process.

When you consider the overall benefit of spending the time to do this, like illustrated above, it is really massive.
Just with the tivial tunning illustrated above we are getting 400 ms for a single Persistence Unit for a single deployment for a single person.
Multiplied by the number of deployments I need to do in a day multiplied by number of developpers working with eclipsleink, 400 ms when put into prespective is a massive time gain.

I would really be happy if I could see my eclipselink entity manager factor setup go down from 2 seconds to a second.
Perhaps I being to greedy, but it would be great that strating up eclipselink becomes irrelevant on the deployment time of a web application.

Today, the moment you add a Persistence.xml to you deployment, you immediately feel the impact at initial deployment and on the fist entity manager.
And this "stuckness" prception can be greatly reduced I believe.


I would be thankful for some feedback on this.

Kindest regards.
Re: Logging about time to create entity manager factory Impl should exist [message #1775487 is a reply to message #1775396] Mon, 30 October 2017 20:10 Go to previous messageGo to next message
Chris Delahunt is currently offline Chris DelahuntFriend
Messages: 1275
Registered: July 2009
Senior Member
Feel free to submit any improvements through bugzilla, and the project can always use additional committers.
Re: Logging about time to create entity manager factory Impl should exist [message #1775517 is a reply to message #1775487] Tue, 31 October 2017 10:01 Go to previous messageGo to next message
Nuno G. de M. is currently offline Nuno G. de M.Friend
Messages: 30
Registered: September 2012
Member
Hi,

Many Thanks,

Can you confirm that the following is the right refernece?
https://bugs.eclipse.org/bugs/



Kindest regards.
Re: Logging about time to create entity manager factory Impl should exist [message #1775777 is a reply to message #1775517] Mon, 06 November 2017 09:19 Go to previous message
Nuno G. de M. is currently offline Nuno G. de M.Friend
Messages: 30
Registered: September 2012
Member
Hi,

I have opened the following bug:
https://bugs.eclipse.org/bugs/show_bug.cgi?id=526869

Many thanks for all the help.
Kindest regards.
Previous Topic:@Version timestamp; select local or server time
Next Topic:Unable to retrieve timestampz from postgresql db
Goto Forum:
  


Current Time: Wed Dec 13 19:11:14 GMT 2017

Powered by FUDForum. Page generated in 0.02383 seconds
.:: Contact :: Home ::.

Powered by: FUDforum 3.0.2.
Copyright ©2001-2010 FUDforum Bulletin Board Software