GeoMesa graduates from LocationTech Incubation with 1.2.0 Release

GeoMesa is an open-source suite of tools that enables geospatial analytics on high-volume, high-velocity spatiotemporal data in Accumulo, HBase, Cassandra, Google Bigtable, and Kafka. The new 1.2 release lengthens this list of supported data stores and adds several other new features. With it, GeoMesa has graduated from incubation after a thorough review by the Eclipse Foundation’s LocationTech Working Group. Here’s a quick introduction to the tools GeoMesa provides.

[Figure: heat map]

Kafka Data Store for Streaming Near Real-time Analytics

For streaming data, GeoMesa provides an implementation of the GeoTools API on top of Apache Kafka called the KafkaDataStore (KDS). The KDS lets developers build Kafka producers and consumers of spatiotemporal data on top of the high-performance Kafka message queue, and play back portions of the event stream for analysis. GeoMesa also includes a simple and flexible Extract, Transform, and Load (ETL) library, which lets developers quickly integrate many different streaming data sources into the KDS. Visualization tools can then read this data via Open Geospatial Consortium (OGC) standards to create near real-time, interactive applications.
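The playback idea can be sketched in a few lines of plain Python. This is only an illustration of replaying a time window from an ordered log and rebuilding consumer state from it; it does not use Kafka or the KDS, and the names `Event`, `EventLog`, `replay`, and `latest_positions` are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Iterator, List, Tuple


@dataclass(frozen=True)
class Event:
    """One message in the log: a feature's position at a point in time."""
    timestamp: int      # epoch seconds (assumed unit for this sketch)
    feature_id: str
    lon: float
    lat: float


class EventLog:
    """Hypothetical stand-in for a Kafka topic: an append-only, ordered log."""

    def __init__(self) -> None:
        self._events: List[Event] = []

    def append(self, event: Event) -> None:
        # Producers append to the end; existing messages are never modified.
        self._events.append(event)

    def replay(self, start: int, end: int) -> Iterator[Event]:
        # Re-read, in log order, the messages whose timestamp is in [start, end].
        for event in self._events:
            if start <= event.timestamp <= end:
                yield event


def latest_positions(events: Iterable[Event]) -> Dict[str, Tuple[float, float]]:
    """Consumer-side state: the most recent position seen for each feature."""
    state: Dict[str, Tuple[float, float]] = {}
    for event in events:
        state[event.feature_id] = (event.lon, event.lat)
    return state


log = EventLog()
log.append(Event(100, "truck-1", -77.0, 38.9))
log.append(Event(110, "truck-2", -78.5, 38.0))
log.append(Event(120, "truck-1", -77.1, 39.0))
log.append(Event(130, "truck-2", -78.6, 38.1))

# Replay the window from t=100 to t=120 and rebuild the state as of t=120.
snapshot = latest_positions(log.replay(100, 120))
```

Because the log is never rewritten, the same window can be replayed any number of times, which is what makes after-the-fact analysis of a portion of the stream possible.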

[Figure: map]

Accumulo Data Store for High Volume Persistence, Query, and Retrieval

GeoMesa provides a mature persistence API on top of Apache Accumulo, a sorted, distributed key-value store. All vector geometry types are supported, as are spatiotemporal predicates such as within and intersects. GeoMesa supports arbitrary predicates on any attribute and can optimize query plans for predicates on high-cardinality attributes by dispatching scans to non-spatiotemporal attribute indexes.
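To make the index-selection idea concrete, here is a toy planner in Python. It is a sketch under stated assumptions, not GeoMesa's planner: the `Predicate` type, the `choose_index` function, the index names, and the selectivity estimates are all hypothetical. It shows only the general principle of sending a scan to whichever index is expected to touch the fewest rows.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Predicate:
    """One clause of a conjunctive (AND) query filter."""
    index: str          # name of the index able to serve this clause
    selectivity: float  # estimated fraction of rows matching, 0.0 to 1.0


def choose_index(predicates: List[Predicate], total_rows: int) -> str:
    """Pick the index whose clause is expected to scan the fewest rows."""
    if not predicates:
        return "full-table-scan"
    best = min(predicates, key=lambda p: p.selectivity * total_rows)
    return best.index


# A broad bounding box combined with an equality match on a near-unique attribute.
# Both selectivity values are invented for illustration.
query = [
    Predicate(index="spatiotemporal", selectivity=0.20),
    Predicate(index="attribute:vehicle_id", selectivity=0.00001),
]

plan = choose_index(query, total_rows=1_000_000_000)
```

In this example the attribute index wins because an equality match on a high-cardinality attribute narrows the scan far more than the large bounding box does. The remaining clause would then be applied as a filter on the rows that scan returns.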

GeoMesa performs relational projections so that a query transfers only the attributes required to satisfy the request. This can significantly speed up the rendering of maps that need only a small subset of attributes for styling. To speed up queries and perform interactive aggregations, GeoMesa can push filter predicates and computation down to the Accumulo tablet servers.
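The following Python sketch simulates both ideas with in-memory lists standing in for two servers. It is an illustration only: it does not use Accumulo, and the function names (`server_scan`, `server_density`, `client_merge`) and the sample rows are hypothetical. The first function filters and projects before anything is returned; the second returns per-cell counts instead of rows, so the client only has to merge small partial results.

```python
from collections import Counter
from typing import Any, Dict, Iterable, List, Tuple

Row = Dict[str, Any]
BBox = Tuple[float, float, float, float]  # (min_lon, min_lat, max_lon, max_lat)


def in_bbox(row: Row, bbox: BBox) -> bool:
    min_lon, min_lat, max_lon, max_lat = bbox
    return min_lon <= row["lon"] <= max_lon and min_lat <= row["lat"] <= max_lat


def server_scan(rows: Iterable[Row], bbox: BBox, attributes: List[str]) -> List[Row]:
    """Simulated server side: filter first, then keep only the requested attributes."""
    return [{name: row[name] for name in attributes} for row in rows if in_bbox(row, bbox)]


def server_density(rows: Iterable[Row], bbox: BBox, cell_size: float) -> Counter:
    """Simulated server side: return counts per grid cell rather than the rows."""
    min_lon, min_lat, _, _ = bbox
    counts: Counter = Counter()
    for row in rows:
        if in_bbox(row, bbox):
            col = int((row["lon"] - min_lon) // cell_size)
            line = int((row["lat"] - min_lat) // cell_size)
            counts[(col, line)] += 1
    return counts


def client_merge(partials: Iterable[Counter]) -> Counter:
    """Client side: sum the partial grids returned by each server."""
    total: Counter = Counter()
    for partial in partials:
        total.update(partial)
    return total


server_a = [
    {"id": "a1", "lon": 0.5, "lat": 0.5, "speed": 30, "driver": "Ann"},
    {"id": "a2", "lon": 1.5, "lat": 0.2, "speed": 55, "driver": "Bo"},
    {"id": "a3", "lon": 9.0, "lat": 9.0, "speed": 10, "driver": "Cy"},  # outside the box
]
server_b = [
    {"id": "b1", "lon": 0.7, "lat": 0.9, "speed": 42, "driver": "Di"},
    {"id": "b2", "lon": 1.1, "lat": 1.8, "speed": 61, "driver": "Ed"},
]
bbox = (0.0, 0.0, 2.0, 2.0)

# Projection: a map styled by speed needs only three of the five attributes.
styled = server_scan(server_a, bbox, ["lon", "lat", "speed"])

# Aggregation: each server returns a small grid; the client merges them into a heat map.
heat = client_merge(server_density(rows, bbox, cell_size=1.0) for rows in (server_a, server_b))
```

In both cases the amount of data crossing the network depends on the size of the answer (a few attributes, or a handful of grid cells) rather than on the number of stored rows.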

HBase, Google Bigtable, and Cassandra

GeoMesa 1.2.0 includes initial support for the Apache HBase and Apache Cassandra databases as well as Google’s Cloud Bigtable. Combined with the ability to consume streaming data from Kafka producers, these large-scale data stores broaden the possibilities even further.

Through multiple production deployments in different data domains, GeoMesa has evolved into a high-performance spatiotemporal analytics engine. Production instances hold hundreds of billions of IoT events, while GeoMesa’s stream processing tools keep latencies within stringent sub-second requirements. More and more users are running batch analytics over huge volumes of data with GeoMesa’s bindings for the MapReduce and Spark distributed computing frameworks. GeoMesa also powers interactive visualizations over this data: parallelized query-time aggregations produce heat maps, and compressed data representations make it possible to animate tens of millions of events.

For more information, see http://www.geomesa.org.

About the Authors