Re: [smila-dev] SMILA as Search engine
On 28.08.2012 13:02, Corinth, Rene wrote:
You can add more crawl job definitions, one for each web site.
Either add them to the jobs.json configuration file, or POST them to /smila/jobmanager/jobs.
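A rough sketch of the POST variant, in case it helps. It clones an existing crawl job definition once per web site and sends each copy to /smila/jobmanager/jobs. The following are assumptions, not taken from the SMILA documentation: the helper names make_crawl_jobs and post_job, the parameter name "startUrl", the generated job names, and the base URL http://localhost:8080. Take the template from a crawl job that already works in your jobs.json and adjust the names to match your setup.

```python
import copy
import json
import urllib.request
from urllib.parse import urlsplit


def make_crawl_jobs(template, start_urls, url_param="startUrl"):
    """Return one job definition per start URL, cloned from `template`.

    `url_param` is an assumed parameter name; use whatever your
    existing crawl job definition calls its start URL.
    """
    jobs = []
    for url in start_urls:
        job = copy.deepcopy(template)  # leave the template untouched
        host = urlsplit(url).hostname or "unknown"
        # Hypothetical naming scheme: crawl_<host with dots replaced>
        job["name"] = "crawl_" + host.replace(".", "_")
        job.setdefault("parameters", {})[url_param] = url
        jobs.append(job)
    return jobs


def post_job(job, base_url="http://localhost:8080"):
    """POST one job definition; base_url is an assumption."""
    request = urllib.request.Request(
        base_url + "/smila/jobmanager/jobs",
        data=json.dumps(job).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return response.status
```

Usage would be something like calling make_crawl_jobs(existing_job, ["http://www.example.com/", "http://example.org/docs"]) and passing each result to post_job while SMILA is running.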
Another option, which does this in a single job, is described at http://wiki.eclipse.org/SMILA/Documentation/Importing/CrawlingMultipleStartURLs.
For each crawled page, you could extract the domain part of the URL into a new attribute, and then add a filter to the search request that restricts the result to pages with the required domain attribute value.
On adding attributes to the index, see http://wiki.eclipse.org/SMILA/Documentation/Solr_3.5
On filtering, see http://wiki.eclipse.org/SMILA/Documentation/Search#Query_Parameters
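To illustrate the idea only (this is not SMILA pipelet code and not the SMILA search API), here is a small sketch of the two steps: deriving a domain attribute from each page URL, and restricting a result list to one domain. The attribute names "Url" and "Domain" and both function names are assumptions chosen for the example; in SMILA the extraction would run in the indexing pipeline and the filter would be part of the search request.

```python
from urllib.parse import urlsplit


def add_domain(record, url_attr="Url", domain_attr="Domain"):
    """Return a copy of `record` with the URL's host stored in a new attribute."""
    host = urlsplit(record[url_attr]).hostname or ""
    enriched = dict(record)
    enriched[domain_attr] = host  # hostname is already lower-cased by urlsplit
    return enriched


def filter_by_domain(records, domain, domain_attr="Domain"):
    """Keep only the records whose domain attribute equals `domain`."""
    return [r for r in records if r.get(domain_attr) == domain]


pages = [
    add_domain({"Url": "http://wiki.eclipse.org/SMILA"}),
    add_domain({"Url": "http://WWW.Example.com/a?b=1"}),
    add_domain({"Url": "http://wiki.eclipse.org/Main_Page"}),
]
eclipse_pages = filter_by_domain(pages, "wiki.eclipse.org")
```

Storing the exact host name keeps the filter a simple equality match; if www.example.com and example.com should count as the same site, normalize the value before indexing.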
Sorry, there is currently no complete tutorial on this.