Re: [geowave-dev] Geowave Cassandra Proposal

I started a branch off of GEOWAVE-238 labeled GEOWAVE-DSRW.  The intent is to simplify the DataStore interface to three or four methods, providing more general-purpose utilities.  I believe the merge into GEOWAVE-238 should happen late next week.  I need to wait for GEOWAVE-50, which has some data store changes, to merge first.  The plan is to refactor these changes out of the data store implementations.

I created an issue for Cassandra, which we can use for discussion.  The wiki is a more appropriate place for design discussion, though.

On Wed, Oct 14, 2015 at 2:22 PM, Rich Fecher <rfecher@xxxxxxxxx> wrote:
Kartik,
To reiterate, I do think the community would greatly benefit from a GeoWave+Cassandra connector, and the core development team would love to support your contribution.

Here's a quick overview of how the extensibility model works for GeoWave, particularly as it relates to a Cassandra connector. We have geowave-core-index, which is primarily focused on preserving multi-dimensional locality within a key-value store, optimizing for range scans.  On top of that we have geowave-core-store, which supplies the core logic for storage, retrieval, and persisting associated metadata, applying geowave-core-index to a key-value store. We have geowave-core-geotime, which introduces the core geo-temporal concepts on top of the general-purpose "store."

As for the "extensions" directory, the "formats" subdirectory supports parsing specific formats for ingest; the "adapters" directory has data adapters (persistence models) for two important geospatial data types, a vector data adapter and a raster data adapter; and the "datastores" directory is where Accumulo and HBase will exist and where a Cassandra connector would exist.

The DataStore interface in GeoWave is how data is stored and retrieved throughout the system, although it relies on three important metadata stores: the IndexStore, AdapterStore, and DataStatisticsStore.  These stores persist instances of each respective object, and there is an Accumulo implementation as well as an in-memory implementation of each.  Various projects in the master branch still have Accumulo dependencies; those have been pulled out in the GEOWAVE-238 branch.  SPI is used to inject a datastore implementation into the rest of the system.  Here are the SPI services provided by Accumulo: https://github.com/ngageoint/geowave/tree/GEOWAVE-238/extensions/datastores/accumulo/src/main/resources/META-INF/services

The store "family" is used by the GeoTools plugins to tie the individual stores together, and each family is exposed as a datastore in GeoServer.  So whichever SPI store families are on GeoServer's classpath determine what is exposed as a data store.  Defining this interface fully for Cassandra will be the key to providing a GeoWave+Cassandra connector.
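
To make the SPI and store "family" idea a bit more concrete, here is a rough Java sketch of the general pattern. It is only an illustration under stated assumptions: the StoreFamily interface, the CassandraStoreFamily class, the StoreFamilyRegistry class, and the empty marker interfaces are hypothetical names made up for this example and are not GeoWave's actual API. The only real library call is java.util.ServiceLoader, which finds implementations listed in a META-INF/services file named after the interface.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.ServiceLoader;
import java.util.Set;

// Hypothetical stand-ins for the four stores named above (not GeoWave's real interfaces).
interface DataStore {}
interface IndexStore {}
interface AdapterStore {}
interface DataStatisticsStore {}

// Hypothetical "store family": a single SPI entry that ties the four stores together.
interface StoreFamily {
    String name();
    DataStore createDataStore(Map<String, String> options);
    IndexStore createIndexStore(Map<String, String> options);
    AdapterStore createAdapterStore(Map<String, String> options);
    DataStatisticsStore createDataStatisticsStore(Map<String, String> options);
}

// What a Cassandra family might look like; the bodies are placeholders only.
class CassandraStoreFamily implements StoreFamily {
    public String name() { return "cassandra"; }
    public DataStore createDataStore(Map<String, String> options) { return new DataStore() {}; }
    public IndexStore createIndexStore(Map<String, String> options) { return new IndexStore() {}; }
    public AdapterStore createAdapterStore(Map<String, String> options) { return new AdapterStore() {}; }
    public DataStatisticsStore createDataStatisticsStore(Map<String, String> options) {
        return new DataStatisticsStore() {};
    }
}

// Hypothetical registry: whatever families are on the classpath decide what gets exposed.
final class StoreFamilyRegistry {
    private final Map<String, StoreFamily> families = new LinkedHashMap<>();

    // Discovers every StoreFamily registered through a META-INF/services entry.
    static StoreFamilyRegistry fromClasspath() {
        StoreFamilyRegistry registry = new StoreFamilyRegistry();
        for (StoreFamily family : ServiceLoader.load(StoreFamily.class)) {
            registry.register(family);
        }
        return registry;
    }

    // Manual registration, handy for tests or when no services file is present.
    void register(StoreFamily family) {
        families.put(family.name(), family);
    }

    Set<String> names() {
        return families.keySet();
    }

    StoreFamily get(String name) {
        StoreFamily family = families.get(name);
        if (family == null) {
            throw new IllegalArgumentException("No store family registered as: " + name);
        }
        return family;
    }
}
```

In this sketch, a Cassandra module would ship a services file whose single line is the fully qualified name of its family class, and a consumer would call StoreFamilyRegistry.fromClasspath().get("cassandra") to obtain factories for all four stores at once.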

There are some ongoing changes occurring in GEOWAVE-238, which will be merged with a "secondary indexing" branch (GEOWAVE-50) and pulled together into a release candidate branch by the end of this week.  Bear with us as we iron that out.  In the short term, though, the general concepts and most of the interfaces will remain.  I suggest we put any discussion or design documentation that would be helpful somewhere on the wiki (https://github.com/ngageoint/geowave/wiki) as your team digs into it, but in the meantime feel free to follow up on this thread with any questions, or we can just get some interaction going on Gitter relating to this.  Additionally, we could set up a Google Hangout if that would be more helpful for your team or others interested.

Rich



On Mon, Oct 12, 2015 at 2:37 PM, Kartik Venkatesh <kartik@xxxxxxxxxxxxx> wrote:
Team,

I am the CTO for a Spatial Analytics Startup headquartered in Miami, with core engineering in Seattle. We are building our platform and utilizing Geowave+Accumulo as our core Data platform, with OrientDB as the storage for our Execution graph. 

As a company, we would gain a strategic advantage from using Cassandra/DSE as our core data storage engine, and in talking to @RFecher it looks like the community would greatly benefit from support for it as a data store. I also know that a GSoC project was started in this area but not completed. Others on our team and I know a lot about C* and DSE, and I have multiple contacts on the Cassandra team. Spatially would love to develop a C*/DSE connector for GeoWave, which would be a great contribution back to the community.

What would the process be for us to propose a solution and get it accepted to build? We would be ready to start immediately.

Thanks, Kartik 

_______________________________________________
geowave-dev mailing list
geowave-dev@xxxxxxxxxxxxxxxx
To change your delivery options, retrieve your password, or unsubscribe from this list, visit
https://www.locationtech.org/mailman/listinfo/geowave-dev


