SMILA

Unleashing the potential of unstructured data sources

The amount and diversity of information are growing exponentially (big data), mainly in the area of unstructured data such as emails, text files, blogs, and images. Poor data accessibility, missing integration of user rights, and the lack of semantic metadata are constraining factors for building next-generation enterprise search and other document-centric applications. Missing standards result in proprietary solutions with high short- and long-term costs.

SMILA is an extensible framework for building big data and search solutions that access and process unstructured information in the enterprise. Besides providing essential infrastructure components and services, SMILA also delivers ready-to-use add-on components, such as connectors to the most relevant data sources. Building on the framework enables developers to concentrate on creating higher-value solutions, such as semantics-driven applications.

SMILA Architecture

[Figure: SMILA architecture overview]

News

2014-12-01 - 1.3-M1 available

Today we published milestone M1 of the upcoming release 1.3. The milestone contains the new scripting engine based on JavaScript, which can be used for all kinds of synchronous processing. It can also be seen as a (recommended) alternative to BPEL, while being faster, more flexible and easier to use. Milestone M1 doesn't contain the Solr 4 integration yet; this will be finished in the coming weeks. As always, please try it out and give us your feedback.

July 2013 - Announcement for SMILA presentation in Berlin

On 20 August there will be a SMILA presentation at the Fraunhofer Heinrich-Hertz-Institut in Berlin: "Informationsgewinnung mit semantischen Technologien - die Anwendung SMILA" (Information extraction with semantic technologies: the SMILA application). Participation is free; feel free to register and attend the event.

April 2013 - SMILA 1.2 released!

Today we released SMILA 1.2! The major new features of this release are the integration of Apache Tika for extracting text from binary content and many enhancements to the importing components (JDBC crawler, web crawler, remote crawling). As always, please try it out and give us your feedback.

February 2013 - 1.2-M1 available!

Today we published milestone M1 of the upcoming release 1.2. The major new features are the integration of Apache Tika for extracting text from binary content and crawling enhancements for the web and JDBC crawlers. As always, please try it out and give us your feedback.

July 2012 - Release 1.1 is out!

We are proud to announce the SMILA release 1.1. The major new features are the migration of file, web, JDBC and feed crawler implementations to self-scaling ETL and the integration of Solr 3.5. As always, please try it out and give us your feedback.

June 2012 - YourKit donates free licenses for its Java Profiler

We are happy to announce that thanks to YourKit's donation we are now able to profile and further improve SMILA!

YourKit is kindly supporting open source projects with its full-featured Java Profiler. YourKit, LLC is the creator of innovative and intelligent tools for profiling Java and .NET applications. Take a look at YourKit's leading software products: YourKit Java Profiler and YourKit .NET Profiler.

June 2012 - 1.1 M1 is out!

Today, 6 June, we published the first milestone of the upcoming release 1.1. The major new features are the migration of the file system and web crawler implementations to self-scaling ETL and the integration of Solr 3.5. As always, please try it out and give us your feedback.

April 2012 - Announcement for SMILA conference in Berlin

Curious about what has been done in SMILA over the last two releases, and about who has been using SMILA and how? Then you should visit our SMILA conference in Berlin on 15 May. Make sure to register early at innovationszentrum@theseus-programm.de, since the number of participants is limited.

March 2012 - Announcement for SMILA tutorial in Berlin

Demand for SMILA tutorials is growing, so we decided to organize another one in May. This time the tutorial will take place at TIZ in Berlin, Germany, on 14 May, starting at 13:30. Participation is free, but since the number of participants is limited, make sure to register early at innovationszentrum@theseus-programm.de.
