Re: [geomesa-users] Feature writing issue...

Emilio, Anthony, Jim,

 

Thanks for your suggestions.  With regard to overwriting a GeoMesa feature with the same ID:

With my use case I would be conceptually using the same feature ID, location, and time; however, the location portion has double-precision data in it and thus may not be exactly equivalent to a previously indexed entry.  Because of the double-precision data within my default geometry attribute, I think it is best not to assume equality between my previously indexed {id, location, time} entry, and my new {id, location, time} entry.
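A minimal sketch of the double-precision concern, using only the Java standard library. The class name CoordinateEquality and the tolerance 1e-9 are assumptions chosen for illustration; nothing here is GeoMesa or GeoTools API.

```java
// Toy illustration only: exact versus tolerance-based comparison of doubles.
final class CoordinateEquality {

    /** True when a and b differ by at most epsilon (a caller-chosen tolerance). */
    static boolean nearlyEqual(double a, double b, double epsilon) {
        return Math.abs(a - b) <= epsilon;
    }

    public static void main(String[] args) {
        double stored = 0.3;
        double recomputed = 0.1 + 0.2; // mathematically equal, but not bit-for-bit

        System.out.println(stored == recomputed);                  // prints "false"
        System.out.println(nearlyEqual(stored, recomputed, 1e-9)); // prints "true"
    }
}
```

Two coordinates that are conceptually the same can therefore fail an exact comparison, which is why a query that relies on exact equality of the geometry is fragile.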

 

Jim suggested that doing a pure lookup by feature ID is slow, so to get around this slowness (since I also know the geometry and time), if I add a geospatial-temporal component to the Filter, will that ease the slowness concerns?

 

Below is some Java code for modification.  The code seems to functionally work, but:

A) Does combining the geospatialtemporalfilter with the idfilter offer speed improvements over a simple idfilter for larger data sets?

B) Is combining the geospatialtemporalfilter with the idfilter safe?  According to http://docs.geotools.org/stable/userguide/library/opengis/filter.html,  “Formally this style of Id matching is not supposed to be mixed with the traditional attribute based evaluation (such as a bounding box filter).”  This statement gives me cause for concern because I am mixing the id matching with a bounding box and time filter.

 

 

public boolean modifyFeatureByID(String featureName, String featureID, List<String> attributes, List<Object> values,
        String timeAttribute, Date startTime, Date endTime, String geometryAttribute,
        double startLat, double startLon, double endLat, double endLon)
{
    boolean wasSuccessful = true;

    // queryString is of the form: "( ( NOT (myTime AFTER 2014-06-20T15:23:03.952Z)) AND ( NOT (myTime BEFORE 2012-06-20T15:23:03.952Z)) ) AND BBOX(myGeometry,123.2,40.1,123.6,49.9)"
    String queryString = constructBoundedBoxAndTimeFrameQueryString(timeAttribute, startTime, endTime,
            geometryAttribute, startLat, startLon, endLat, endLon);
    Filter geospatialtemporalfilter = queryStringToFilter(queryString);

    logger.debug("Modifying feature with id: '" + featureID + "' from the '" + featureName + "' feature store");

    if (attributes == null && values != null)
    {
        throw new IllegalArgumentException("if attributes list is null, values list must also be null");
    }
    if (attributes != null && values == null)
    {
        throw new IllegalArgumentException("if values list is null, attributes list must also be null");
    }

    if (attributes != null && values != null)
    {
        int numAttributes = attributes.size();
        if (numAttributes != values.size())
        {
            throw new IllegalArgumentException("attributes and values lists must be the same size");
        }

        if (numAttributes > 0)
        {
            Name[] attributeNames    = new NameImpl[numAttributes];
            Object[] attributeValues = new Object[numAttributes];
            for (int k = 0; k < numAttributes; k++)
            {
                attributeNames[k]  = new NameImpl(attributes.get(k));
                attributeValues[k] = values.get(k);
            }

            DataStore dataStore = createDataStore();
            FeatureStore<SimpleFeatureType, SimpleFeature> featureStore = createFeatureStore(dataStore, featureName);

            FilterFactory2 ff = CommonFactoryFinder.getFilterFactory2();
            Filter idfilter = ff.id(Collections.singleton(ff.featureId(featureID)));
            Filter filter = ff.and(Arrays.asList(geospatialtemporalfilter, idfilter));

            try
            {
                featureStore.modifyFeatures(attributeNames, attributeValues, filter);
                logger.debug("Feature with id: '" + featureID + "' has been successfully modified within the '" + featureName + "' feature store");
            }
            catch (Exception e)
            {
                wasSuccessful = false;
                logger.error("Problem modifying feature with id: '" + featureID + "' within the '" + featureName + "' feature store", e);
            }

            dataStore.dispose();
        }
        else
        {
            logger.debug("modifyFeatureByID() invoked with empty attributes/values inputs, no features were modified for feature with id: " + featureID);
        }
    }
    else
    {
        logger.debug("modifyFeatureByID() invoked with null attributes/values inputs, no features were modified for feature with id: " + featureID);
    }

    return wasSuccessful;
}
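To make the semantics of the AND-combined filter concrete, here is a small standard-library sketch. It models only what the combined filter selects, not how GeoMesa plans or executes the query, so it does not answer the speed question. The Feature class and the helpers bbox, during, hasId and select are hypothetical stand-ins, not GeoTools types.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

final class CombinedFilterSketch {

    /** Hypothetical stand-in for a feature: an ID, a point, and a timestamp. */
    static final class Feature {
        final String id;
        final double lon;
        final double lat;
        final long timeMillis;

        Feature(String id, double lon, double lat, long timeMillis) {
            this.id = id;
            this.lon = lon;
            this.lat = lat;
            this.timeMillis = timeMillis;
        }
    }

    /** Inclusive bounding box, argument order as in BBOX(geom, minX, minY, maxX, maxY). */
    static Predicate<Feature> bbox(double minLon, double minLat, double maxLon, double maxLat) {
        return f -> f.lon >= minLon && f.lon <= maxLon && f.lat >= minLat && f.lat <= maxLat;
    }

    /** Inclusive time window, i.e. NOT AFTER end AND NOT BEFORE start. */
    static Predicate<Feature> during(long startMillis, long endMillis) {
        return f -> f.timeMillis >= startMillis && f.timeMillis <= endMillis;
    }

    static Predicate<Feature> hasId(String id) {
        return f -> f.id.equals(id);
    }

    /** Returns every feature that satisfies the filter. */
    static List<Feature> select(List<Feature> features, Predicate<Feature> filter) {
        List<Feature> matches = new ArrayList<>();
        for (Feature f : features) {
            if (filter.test(f)) {
                matches.add(f);
            }
        }
        return matches;
    }
}
```

Combining the three with Predicate.and, as in bbox(...).and(during(...)).and(hasId("a")), selects only features that pass all three tests. One consequence in this model: if the box or time window is drawn too tightly around a double-precision coordinate, the feature with the right ID is silently excluded and nothing is modified, so the box should be padded enough to cover the stored value.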

 

 

From: geomesa-users-bounces@xxxxxxxxxxxxxxxx [mailto:geomesa-users-bounces@xxxxxxxxxxxxxxxx] On Behalf Of Emilio Lahr-Vivaz
Sent: Friday, June 20, 2014 9:38 AM
To: geomesa-users@xxxxxxxxxxxxxxxx
Subject: Re: [geomesa-users] Feature writing issue...

 

Hi all,

A couple caveats:

In GeoMesa, the row key contains the temporal and spatial data for the feature. So if you try to overwrite a feature by keeping the same feature ID, but you change the location and/or time, it will create a new row and not overwrite the existing entry.
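As a rough illustration of why this happens, consider a toy sorted key-value table whose row key is built from location, time, and feature ID. The key layout in rowKey below is invented for this sketch and is not GeoMesa's actual row-key format; the class and method names are likewise hypothetical.

```java
import java.util.TreeMap;

final class RowKeySketch {

    // Sorted map standing in for a key-value table: row key -> attribute payload.
    private final TreeMap<String, String> rows = new TreeMap<>();

    /** Hypothetical layout: location, then time, then feature ID. */
    static String rowKey(String id, double lon, double lat, long timeMillis) {
        return lon + "," + lat + "|" + timeMillis + "|" + id;
    }

    /** Writes a feature; an identical row key replaces the earlier payload. */
    void write(String id, double lon, double lat, long timeMillis, String attributes) {
        rows.put(rowKey(id, lon, lat, timeMillis), attributes);
    }

    int rowCount() {
        return rows.size();
    }
}
```

In this model, writing feature "f1" twice with identical location and time leaves one row, while writing "f1" again with a different longitude produces a second row even though the feature ID is unchanged.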

If you do overwrite the exact same row, Accumulo by default will apply a VersioningIterator so that you only see the latest entry. You can change this behavior if you want:

http://accumulo.apache.org/1.5/accumulo_user_manual.html#_versioning_iterators_and_timestamps
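The effect of keeping only the newest version can be modelled with a small standard-library sketch. This is a toy model of timestamped versions, not Accumulo code; the class VersioningSketch and its methods are hypothetical.

```java
import java.util.Map;
import java.util.TreeMap;

final class VersioningSketch {

    // row key -> (timestamp -> value); every written version is kept until compact().
    private final Map<String, TreeMap<Long, String>> table = new TreeMap<>();

    void put(String rowKey, long timestamp, String value) {
        table.computeIfAbsent(rowKey, k -> new TreeMap<>()).put(timestamp, value);
    }

    /** Reads as a one-version view would: the newest timestamp wins. */
    String readLatest(String rowKey) {
        TreeMap<Long, String> versions = table.get(rowKey);
        return versions == null ? null : versions.lastEntry().getValue();
    }

    int storedVersions(String rowKey) {
        TreeMap<Long, String> versions = table.get(rowKey);
        return versions == null ? 0 : versions.size();
    }

    /** Models a compaction: physically drops everything except the newest version. */
    void compact() {
        for (TreeMap<Long, String> versions : table.values()) {
            // headMap is exclusive of lastKey, so only older versions are cleared.
            versions.headMap(versions.lastKey()).clear();
        }
    }
}
```

After two puts to the same row key, readLatest returns the newer value while both versions are still stored; compact() then removes the stale one.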

I would encourage people to check out the GeoMesa quick start tutorial:

http://geomesa.github.io/2014/05/28/geomesa-quickstart/

It lets you write and read just a few features at a time through the API. You can test out your exact scenario and determine what works for you.

Thanks,

- Emilio

On 06/20/2014 08:13 AM, Anthony Fox wrote:

Beau, Jim,

If you overwrite records with the same id, then you'll observe multiple records until Accumulo performs a table compaction or you set a versioning iterator.  Since this sounds like a valid use case, we can by default set a versioning iterator so you only see the most recent version of your record.  Table compactions will periodically remove stale data.

-Anthony

 

On Thu, Jun 19, 2014 at 6:27 PM, Jim Hughes <jnh5y@xxxxxxxx> wrote:

Hi Beau,

Initially, I wanted to say that #1 is the intended behavior.  I wanted to check things out before responding, and unfortunately, it has taken me a bit longer than I expected.

The first thing to point out is how GeoTools handles feature IDs.  Since several folks could be writing to a FeatureStore, you have to state that your feature IDs should be used for a given feature being written to the database.  For example...

feature.getUserData().put(Hints.USE_PROVIDED_FID, Boolean.TRUE);

(I might have the exact syntax off since I'm bouncing between Scala and Java.)  Without that hint, GeoMesa will pick a random UUID for the feature id. 

Just to make sure that #1 isn't (currently) possible, I tried writing two distinct features with the same id, and I ended up with two records with the same FID. 

As for the other two approaches, in the current implementation, looking up features by ID involves a table scan and hence is generally a bad idea.  We do have some work in progress which will make such queries faster/saner.  The last note along these lines is that to support this fully, we'll likely need to implement/override the DataStore's getFeatureWriter function. 

I mention that because this is the GeoTools way of doing #2.  At the minute, we are using an abstract implementation of this, and it should work correctly.  The filtering is done entirely on the client side, so it'll be slow.  If your data is small (say, a few thousand records), this sort of thing might be tenable.
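The cost difference between a scan and a keyed lookup can be sketched with the standard library alone. The idIndex map below is a hypothetical ID index added purely for contrast; the thread does not say how the in-progress work is implemented, and none of these names come from GeoMesa.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class IdLookupSketch {

    /** Counts how many rows the last lookups had to examine. */
    int rowsExamined = 0;

    // Feature IDs in row order; rows are not sorted by ID, so a scan cannot stop early.
    private final List<String> idsInRowOrder = new ArrayList<>();

    // Hypothetical secondary index: feature ID -> row position.
    private final Map<String, Integer> idIndex = new HashMap<>();

    void add(String id) {
        idIndex.put(id, idsInRowOrder.size());
        idsInRowOrder.add(id);
    }

    /** Scan-style lookup: examines rows one by one until the ID is found. */
    boolean findByScan(String id) {
        for (String candidate : idsInRowOrder) {
            rowsExamined++;
            if (candidate.equals(id)) {
                return true;
            }
        }
        return false;
    }

    /** Index-style lookup: a single keyed probe. */
    boolean findByIndex(String id) {
        rowsExamined++;
        return idIndex.containsKey(id);
    }
}
```

With 1,000 rows, finding the last ID by scan examines all 1,000 rows, while the keyed probe examines one. The gap grows linearly with table size, which is why the scan is only tenable for small data.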

I hope that helps clarify the matter; let me know if you have other questions.

Jim




On 06/18/2014 04:37 PM, Beau Lalonde wrote:

Jim, Others,

 

If we do use the same ID, can we count on the previous value getting “overwritten”/replaced? 

 

In other words, if I actually intend to overwrite/replace a feature with a specific ID (if it exists, otherwise create a new feature), which of the following is the best option:

1.       Act as if I am adding the feature, counting on any existing feature with the same ID to be overwritten/replaced

2.       Query GeoMesa for the existence of a feature with the specific ID, modify feature if it exists, add feature if it doesn’t exist

3.       Blindly attempt to remove the feature with the specific ID, add a new feature with the same ID
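For comparison, options 2 and 3 can be sketched against a plain in-memory map. This is a toy model only: a HashMap keyed by ID makes option 1 work trivially, which is exactly the behaviour being asked about and cannot be assumed of the real store. UpsertSketch and its methods are hypothetical names.

```java
import java.util.HashMap;
import java.util.Map;

final class UpsertSketch {

    // Feature ID -> attribute payload.
    private final Map<String, String> store = new HashMap<>();

    /** Option 2: query for the ID, modify if present, otherwise add. */
    String queryThenModifyOrAdd(String id, String value) {
        boolean exists = store.containsKey(id);
        store.put(id, value);
        return exists ? "modified" : "added";
    }

    /** Option 3: remove unconditionally, then add. */
    void removeThenAdd(String id, String value) {
        store.remove(id); // no-op when the ID is absent
        store.put(id, value);
    }

    String get(String id) {
        return store.get(id);
    }

    int size() {
        return store.size();
    }
}
```

Both options end in the same state here; the practical difference is that option 2 pays for a lookup before writing, while option 3 pays for a delete whether or not anything existed.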

 

Any suggestions for a recommended approach would be helpful.

 

Thanks,

Beau

 

 

From: geomesa-users-bounces@xxxxxxxxxxxxxxxx [mailto:geomesa-users-bounces@xxxxxxxxxxxxxxxx] On Behalf Of Jim Hughes
Sent: Wednesday, June 18, 2014 9:06 AM
To: Geomesa User discussions
Subject: Re: [geomesa-users] Feature writing issue...

 

Hi Adnan,

Great question!  GeoMesa uses the feature id as a unique identifier.  It sounds like you might be using the id field to identify/name a thing which is moving/changing shape/varying attributes through time.  If that's the use case, I'd suggest putting the information which identifies the object into a different field like 'name' or 'identifier'. 

As for documentation, I'd suggest checking out http://geomesa.github.io/ and looking through the tutorials.  We've integrated with GeoTools, so I'd also point to their documentation about DataStores/FeatureStores (http://docs.geotools.org/latest/userguide/library/api/datastore.html and http://docs.geotools.org/stable/userguide/library/data/featuresource.html).*

Let us know what other questions we can help with,

Jim

* In particular, I believe that you would see the same behavior with (most) other GeoTools FeatureStores. 

On 06/18/2014 06:14 AM, Adnan Yaqoob wrote:

Hello Everybody,

 

I am new to GeoMesa and am trying out its API. I have a question: how can I store features with the same id and geometry but with different timestamps and attribute values? I tried to write a feature with the same id but different attributes, and it overwrote the previous feature. I am stuck on this point; please help me understand. 

 

Is there any documentation for the GeoMesa API?

 

Regards,

Adnan



_______________________________________________
geomesa-users mailing list
geomesa-users@xxxxxxxxxxxxxxxx
http://www.locationtech.org/mailman/listinfo/geomesa-users

 


