EOFException in CrawlThread [message #652876]
Mon, 07 February 2011 08:01
Andrej Rosenheinrich (Junior Member; Messages: 22; Registered: August 2010)
Hi again,
looking through the log files, we regularly see the following exception:
2011-02-04 21:00:24,326 ERROR [Thread-39 ] impl.CrawlThread - Error while processing record with Id whatever of dataSourceId
org.eclipse.smila.connectivity.framework.CrawlerException: org.eclipse.smila.connectivity.framework.CrawlerException: java.io.EOFException
at org.eclipse.smila.connectivity.framework.crawler.web.WebCrawler.getMObject(WebCrawler.java:361)
at org.eclipse.smila.connectivity.framework.util.internal.DataReferenceImpl.getRecord(DataReferenceImpl.java:100)
at org.eclipse.smila.connectivity.framework.impl.CrawlThread.processDataReferences(CrawlThread.java:352)
at org.eclipse.smila.connectivity.framework.impl.CrawlThread.run(CrawlThread.java:235)
Caused by: org.eclipse.smila.connectivity.framework.CrawlerException: java.io.EOFException
at org.eclipse.smila.connectivity.framework.crawler.web.WebCrawler.deserializeIndexDocument(WebCrawler.java:830)
at org.eclipse.smila.connectivity.framework.crawler.web.WebCrawler.getRecord(WebCrawler.java:577)
at org.eclipse.smila.connectivity.framework.crawler.web.WebCrawler.getMObject(WebCrawler.java:359)
... 3 more
Caused by: java.io.EOFException
at java.io.ObjectInputStream$BlockDataInputStream.peekByte(ObjectInputStream.java:2553)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1296)
at java.io.ObjectInputStream.readObject(ObjectInputStream.java:350)
at java.util.ArrayList.readObject(ArrayList.java:593)
at sun.reflect.GeneratedMethodAccessor47.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:974)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1848)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1752)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1328)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1946)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1870)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1752)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1328)
at java.io.ObjectInputStream.readObject(ObjectInputStream.java:350)
at org.eclipse.smila.connectivity.framework.crawler.web.WebCrawler.deserializeIndexDocument(WebCrawler.java:828)
This exception appears whether we crawl with one thread or with several. The crawls run for several hours using the same dataSourceId, so if the config file were wrong, we would expect the exception to be thrown from the very beginning and much more often. Is there any explanation for this behavior? Is this a problem in SMILA or in the underlying operating system?
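For reference, the innermost part of the trace (an EOFException raised while ObjectInputStream is reading an ArrayList) is the kind of error that Java serialization typically produces when the serialized bytes end before the object is complete. The following is only a minimal sketch of that general behavior outside SMILA; the class name TruncatedStreamDemo and its helper methods are hypothetical and have nothing to do with the WebCrawler code, and it does not show where a shortened stream would come from in our setup.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical demo class, not part of SMILA.
class TruncatedStreamDemo {

    // Builds a string of the given length so the payload is easy to cut in the middle.
    static String longString(int length) {
        char[] chars = new char[length];
        Arrays.fill(chars, 'x');
        return new String(chars);
    }

    // Serializes the items as an ArrayList using standard Java serialization.
    static byte[] serialize(List<String> items) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(new ArrayList<>(items));
        }
        return buffer.toByteArray();
    }

    // Returns true if reading one object from the bytes ends in an EOFException.
    static boolean failsWithEof(byte[] data) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            in.readObject();
            return false;
        } catch (EOFException e) {
            return true;
        }
    }

    public static void main(String[] args) throws Exception {
        byte[] full = serialize(Collections.singletonList(longString(1000)));
        // Keep only the first half, which cuts the stream inside the string payload.
        byte[] truncated = Arrays.copyOf(full, full.length / 2);

        System.out.println("complete stream fails with EOF: " + failsWithEof(full));
        System.out.println("truncated stream fails with EOF: " + failsWithEof(truncated));
    }
}
```

If the same mechanism is at work in the crawler, it would mean that the data handed to deserializeIndexDocument is sometimes shorter than what was originally written, but that is an assumption on our side and not something the log confirms.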
Thanks,
Andrej