RE: [smila-user] RE: JDBC-Crawling Phenomenon

hi,

 

> what I meant was, that identical data goes through the [ADD Rule] sometimes and through the [ADD JDBC Rule] sometimes.

> And there is no obvious rule when which rule is chosen. That’s the problem.

 

Exactly! You need to either

- mark your records distinctly, so that the condition of only one rule selects them and not the other, OR

- put them into different queues and have the listeners listen on their respective queues.
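
A minimal sketch of the first option, assuming the listener conditions accept the same NOT(... LIKE ...) syntax already used for the feeds and xmldump exclusions in the configuration quoted further down in this thread. The rule names and attributes are copied from that configuration; only the extra NOT clause on the general rule is new, and it has not been tested against a running SMILA instance.

```xml
<!-- Unchanged: selects ADD records whose DataSourceID contains "kinkon" -->
<Rule Name="ADD JDBC Rule" WaitMessageTimeout="10" Threads="4" MaxMessageBlockSize="20">
  <Source BrokerId="broker1" Queue="SMILA.connectivity"/>
  <Condition>Operation='ADD' and DataSourceID LIKE '%kinkon%'</Condition>
  <Task>
    <Process Workflow="KinKonAddPipeline"/>
  </Task>
</Rule>

<!-- Changed (assumption): additionally excludes kinkon records, so the two conditions no longer overlap -->
<Rule Name="ADD Rule" WaitMessageTimeout="10" Threads="4" MaxMessageBlockSize="20">
  <Source BrokerId="broker1" Queue="SMILA.connectivity"/>
  <Condition>Operation='ADD' and NOT(DataSourceID LIKE '%feeds%') and NOT(DataSourceID LIKE '%xmldump%') and NOT(DataSourceID LIKE '%kinkon%')</Condition>
  <Task>
    <Process Workflow="AddPipeline"/>
  </Task>
</Rule>
```

For the second option, the Source of the ADD JDBC Rule would instead point at a dedicated queue (for example a hypothetical Queue="SMILA.connectivity.jdbc"), and the router would have to send the kinkon records there. The router side is not shown because its configuration does not appear in this thread.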

 

Kind regards

Thomas Menzel @ brox IT-Solutions GmbH

 

From: smila-user-bounces@xxxxxxxxxxx [mailto:smila-user-bounces@xxxxxxxxxxx] On Behalf Of Andreas.Schultz@xxxxxxxxxxx
Sent: Wednesday, 30 September 2009 09:07
To: smila-user@xxxxxxxxxxx
Subject: AW: [smila-user] RE: JDBC-Crawling Phenomenon

 

Hi Thomas,

 

What I meant was that identical data sometimes goes through the [ADD Rule] and sometimes through the [ADD JDBC Rule].

And there is no obvious pattern as to which rule is chosen when. That’s the problem.

 

At 2009-09-29 15:40:34,799:

- Record is routed with rule [Default Route Rule] and operation [null], record id=177c250f8e116110396aaa5b1dd51662d633f6517dab42801d98be7f1765f6    

- Closing JdbcCrawler...                                                                                                                             

- Unregistering crawling thread kinkon_bookmark_jdbc                                                                                                

- Crawling thread kinkon_bookmark_jdbc unregistered                                                                                                 

- Crawling thread kinkon_bookmark_jdbc stopped.                                                                                                      

- Record is processed by Listener with rule: [ADD Rule] and operation [ADD], record id=177c250f8e116110396aaa5b1dd51662d633f6517dab42801d98be7f1765f6

 

At 2009-09-29 15:40:58,391:

Record is routed with rule [Default Route Rule] and operation [null], record id=177c250f8e116110396aaa5b1dd51662d633f6517dab42801d98be7f1765f6         

Closing JdbcCrawler...                                                                                                                                  

Record is processed by Listener with rule: [ADD JDBC Rule] and operation [ADD], record id=177c250f8e116110396aaa5b1dd51662d633f6517dab42801d98be7f1765f6

 

As you may have noticed, the two runs are only about 24 seconds apart (see the timestamps above). As I mentioned, I put exactly the same data (a single data set) into the process both times.

I tried it several more times afterwards to find a pattern, but the behavior appears completely random. Always the same data!

 

Best

 

Andreas Schultz
Senior Software Developer

- - - - Please note my new contact details - - - -


Empolis GmbH  |  Meisenstr. 90 | 33607 Bielefeld  |  Germany
AN ATTENSITY GROUP COMPANY
Phone +49 (0)521 55 785 413|  Fax +49 (0)521 55 785 121
andreas.schultz@xxxxxxxxxxx

 

www.empolis.com
Registered office: Kaiserslautern  |  Local court (Amtsgericht) Kaiserslautern HRB 30711  |  Managing Directors: Dr. Stefan Wess, Dr. Peter Tepassé

 

………………………………………………………………………………………………………………………………………………………………………………………………………..

Know. Right. Now.

That is our philosophy. Empolis, an Attensity Group Company, offers an integrated suite of business applications
that use patented semantic information technologies to analyze, interpret, and automatically process the exponentially
growing volume of unstructured data. Decision-makers, experts, employees, and customers thus always receive
exactly the knowledge that is relevant to their work, tailored to the situation and the task.

………………………………………………………………………………………………………………………………………………………………………………………………………..

Subscribe to our monthly newsletter: http://www.empolis.de/newsletter.html

 

From: smila-user-bounces@xxxxxxxxxxx [mailto:smila-user-bounces@xxxxxxxxxxx] On Behalf Of Thomas Menzel
Sent: Tuesday, 29 September 2009 21:25
To: Smila project user mailing list
Subject: [smila-user] RE: JDBC-Crawling Phenomenon

 

hi andreas,

 

I'm not entirely sure what the problem or error is that you are seeing:

 

> both listeners take the record

This is not a bug, it’s a feature ;)

Both conditions match, so either listener can take the record. In a concurrent system you can't tell which listener gets which record.

 

 

> mimetype error , line 17

The default AddPipeline invokes the MIME type detection service, which needs a file extension to do its work. The extension is read from a field defined in config/../MimeTypeConfig.xml.

If the detection fails, the rest of the processing is skipped (see <if name="conditionIsText">… ) and hence nothing is added to the index.

 

Since I guess you read from the DB and don’t need MIME type detection, this can be ignored.

 

Kind regards

Thomas Menzel @ brox IT-Solutions GmbH

 

From: smila-user-bounces@xxxxxxxxxxx [mailto:smila-user-bounces@xxxxxxxxxxx] On Behalf Of Andreas.Schultz@xxxxxxxxxxx
Sent: Tuesday, 29 September 2009 17:44
To: smila-user@xxxxxxxxxxx
Subject: [smila-user] JDBC-Crawling Phenomenon

 

Hi all,

 

I have a really nice phenomenon using a JDBC DS:

 

After finally managing to connect to the DB (MSSQL with authentication via the Windows domain), which was really hard work,

I added an entry to the listener config to call my pipeline:

 

  <Rule Name="ADD JDBC Rule" WaitMessageTimeout="10" Threads="4" MaxMessageBlockSize="20">
    <Source BrokerId="broker1" Queue="SMILA.connectivity"/>
    <Condition>Operation='ADD' and DataSourceID LIKE '%kinkon%'</Condition>
    <Task>
      <Process Workflow="KinKonAddPipeline"/>
    </Task>
  </Rule>

  <Rule Name="ADD Rule" WaitMessageTimeout="10" Threads="4" MaxMessageBlockSize="20">
    <Source BrokerId="broker1" Queue="SMILA.connectivity"/>
    <Condition>Operation='ADD' and NOT(DataSourceID LIKE '%feeds%') and NOT(DataSourceID LIKE '%xmldump%')</Condition>
    <Task>
      <Process Workflow="AddPipeline"/>
    </Task>
  </Rule>

 

The new pipeline is a stripped-down copy of the normal AddPipeline.

The funny thing was the behavior of the indexing process: sometimes it succeeded, sometimes not!

If you look at the attached log file, you will find two sections: the first failed to put the content into the index, the second succeeded.

Obviously, the first one went through the ADD Rule,

“Record is processed by Listener with rule: [ADD Rule]”

and the second one through the expected

“Record is processed by Listener with rule: [ADD JDBC Rule]”

 

Is this a misuse/misconfiguration on my part, or a bug?

 

Best

 

 

Andreas Schultz
Senior Software Developer


 

