Re: [geomesa-users] International date line dilemma for non-point geometries, possible bug?

Beau,

The answer is yes to all questions.

You are welcome and have a good weekend!

Hunter

On 07/11/2014 09:24 AM, Beau Lalonde wrote:
Hunter,

I think that all makes sense, thanks for the explanation.  I have a few more questions.

1) If I index using a geometry with longitudinal values between 160 and 200, and then I do a query at -170, will I get back a result?  In other words, is my indexed data internally represented in the appropriate [-180, 180] range even if I provided it outside the [-180, 180] range?  Or do I need to query outside of the [-180, 180] region if I indexed outside of the [-180, 180] region?

2) You said that BBOX filters are exceptions to the "longitudinal difference greater than 180" rule.  I think that is a fine rule, but it brings me to the question, how would I do a BBOX that spans the IDL?  Would I simply "unwrap" my bounding box beyond the [-180, 180] region?  For example, if I want to query using a bounding box from 150 degrees longitude, wrapping across the IDL, to -170 degrees longitude, would I really need to represent the bounding box as 150 to 190 degrees longitude?  If that's how it works, then I think we are good to go.

3) Is this implementation going to be in the official GeoMesa 1.0 release?
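
The bounding-box "unwrapping" asked about in question 2 can be sketched as follows. This is only an illustration of the arithmetic (150 to -170 becomes 150 to 190); the function name unwrap_bbox is hypothetical and is not part of GeoMesa or GeoTools.

```python
def unwrap_bbox(west, south, east, north):
    """Return a bounding box whose east edge is numerically >= its west edge.

    Hypothetical helper: if the box is given as crossing the date line
    (east < west), the east longitude is shifted by +360 so the box is
    expressed in "unwrapped" longitudes.
    """
    if east < west:
        east += 360.0
    return (west, south, east, north)


# A box from 150 degrees, across the IDL, to -170 degrees:
print(unwrap_bbox(150, -10, -170, 10))  # (150, -10, 190.0, 10)
```

A box that does not cross the date line (east >= west) is returned unchanged.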

Thanks again for all of your punctual work on this issue and have a great day!

Beau

-----Original Message-----
From: Hunter Provyn [mailto:fhp@xxxxxxxx]
Sent: Thursday, July 10, 2014 11:31 AM
To: Beau Lalonde; hunter@xxxxxxxx; Geomesa User discussions
Subject: Re: [geomesa-users] International date line dilemma for non-point geometries, possible bug?

(Resent again due to hideous, unintended line-wrapping of my previous
replies)

Beau,

I appreciate your feedback. I'm sorry for the delay in getting back to you. We've made some updates to GeoMesa and I wanted to clarify our approach to the International Date Line.

GeoMesa now passes each geometry (both index and query) through spatial4j to infer International Date Line spanning geometries.
Spatial4j infers an IDL crossing for any Polygon whose successive coordinates' longitudinal difference is greater than 180. In cases where this is not desired, it is necessary to insert waypoints, with one exception: BBOX filters.

In order to support geoserver queries (which can be large, non-IDL spanning BBOXes), we automatically add waypoints for all BBOX filters.
For anything else, users must add the waypoints. This is very new (it will be merged within a few days) and we will update our documentation soon.
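
As a rough illustration of what "adding waypoints" means here, the sketch below inserts intermediate vertices so that no two successive coordinates differ by more than a chosen longitude step. It is not GeoMesa or spatial4j code; the name add_waypoints, the max_lon_step default of 179, and the use of straight-line interpolation in lon/lat space are all assumptions made for the example.

```python
import math


def add_waypoints(coords, max_lon_step=179.0):
    """Densify a coordinate sequence of (lon, lat) tuples.

    Hypothetical helper: between each pair of successive vertices,
    evenly spaced waypoints are inserted so that every longitudinal
    step is at most max_lon_step degrees.
    """
    if not coords:
        return []
    out = [coords[0]]
    for (x0, y0), (x1, y1) in zip(coords, coords[1:]):
        # Number of sub-segments needed for this edge (0 if x0 == x1).
        n = math.ceil(abs(x1 - x0) / max_lon_step)
        for i in range(1, n):
            t = i / n
            out.append((x0 + t * (x1 - x0), y0 + t * (y1 - y0)))
        out.append((x1, y1))
    return out


# An edge running the "long way" from -170 to 170 gets a waypoint at 0,
# so neither step exceeds 180 degrees.
print(add_waypoints([(-170, 0), (170, 0)]))
```

For a polygon, the same function would be applied to the closed ring (first vertex repeated at the end).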

GeoMesa still supports data created through the technique you have been using to explicitly convey IDL crossing using longitudes outside of [-180, 180] for querying and indexing. You would just need to add waypoints as necessary per the rules above to avoid unintended IDL crossing.

Please let me know if you have any further questions or concerns.

Hunter

On 07/03/2014 10:31 AM, Beau Lalonde wrote:
Hunter,

I'm a bit confused as to what #1 and #2 mean.  Does this mean that users will be restricted to the [-180, 180] space, and that the spatial4j package will assume any difference in consecutive longitudes that exceeds 180 to actually be wrapped around the date line?

Addressing your #1 and #2:

1) Obviously we do not want to crop geometries outside of lon [-180, 180].  I am not sure how the validation error could occur if all longitudinal values are in [-180, 180] and unwrapping is being performed, because how would spatial4j know when a lon difference greater than 180 is intended vs. when to wrap around the international date line?

2) I think requiring additional waypoints between actual (i.e. unwrapped) longitudinal differences greater than 180 is a fair implementation for indexing geometries.  I'm sure there is a counterpoint use case out there, but I think for the vast majority of situations users will be indexing geometries that never have actual (i.e. unwrapped) longitudinal differences of greater than 180 degrees.

      a) If you ever think anyone would not want this behavior, perhaps you could provide GeoMesa with a HINT that can change this behavior.  Just a thought.


Other observations:

A) I am not sure how you are going about your implementation, but I think it's important to distinguish (at least in thought) between indexing geometries (i.e. the geometries used for the features that are stored in accumulo) vs. the geometries that may be used for querying.

      a) In my mind, to this point, we have been discussing the indexing geometry issue.

      b) I think for symmetry reasons, that whatever "international date line" solution is employed for indexing geometries should also be employed for querying geometries.  Do you think that's possible?

          b.i) Situation to consider:  What if a user wants to query with a polygon that spans almost the entire world?  Are you going to require additional waypoints be placed to avoid longitudinal jumps greater than 180?  If so, I think that is a workable solution (if documented) and it's the best solution I can think of because I like symmetry.

          b.ii) Situation to consider: What if the user wants to query a region that spans the international date line?


Thanks,

Beau


-----Original Message-----
From: Hunter Provyn [mailto:fhp@xxxxxxxx]
Sent: Wednesday, July 02, 2014 10:16 AM
To: Beau Lalonde; hunter@xxxxxxxx; Geomesa User discussions
Subject: Re: [geomesa-users] International date line dilemma for non-point geometries, possible bug?

Beau,

I have figured out more about what spatial4j does.
If I transform all coordinates from my test cases to have lon [-180, 180], spatial4j performs dateline wrapping as we expect with the following exceptions/drawbacks:

1) It will either crop geometries outside of lon [-180, 180] or throw a validation error attempting to unwrap if it also contains successive coordinates with lon difference greater than 180.

2) Also, the way spatial4j assumes dateline wrapping makes it impossible to define a shape with successive coordinates having lon difference greater than 180 - you can get around this by adding waypoints.

To get around 1, we can transform all coordinates to have lon [-180, 180] (by adding/subtracting 360 until they are within the interval - not by cropping).
The second issue remains, but it might not be significant.
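
The transformation described for issue 1 (adding or subtracting 360 until the longitude is inside the interval, rather than cropping) could look roughly like this. The function name wrap_longitude is made up for the example and is not the actual GeoMesa code.

```python
def wrap_longitude(lon):
    """Bring a longitude into [-180, 180] by repeated shifts of 360.

    Hypothetical helper: values already inside the interval, including
    the endpoints -180 and 180, are returned unchanged.
    """
    while lon > 180.0:
        lon -= 360.0
    while lon < -180.0:
        lon += 360.0
    return lon


print(wrap_longitude(190))   # -170.0
print(wrap_longitude(-200))  # 160.0
```

Note that after wrapping, a shape that crossed the date line will contain successive longitudes that differ by more than 180 (for example 179.3 followed by -178.7), which is the condition spatial4j uses to infer the crossing.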

What do you think about this approach?

Hunter

On 07/01/2014 02:10 PM, Beau Lalonde wrote:
Hunter,

Thanks for the update.

You are correct in that I would expect LINESTRING (-200 50, -160 60) to be equivalent to two linestrings (-180 55, -160 60) (160 50, 180 55).
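
For reference, the expected split can be reproduced with a small sketch. It cuts an unwrapped linestring wherever it crosses a longitude of the form 180 + 360k, interpolating the latitude linearly, and then shifts each piece back into [-180, 180]. The function names are hypothetical, this is not the GeoMesa or spatial4j implementation, and vertices that lie exactly on the date line are not treated specially.

```python
import math


def _shift_into_range(piece):
    """Shift a piece by a multiple of 360 so it lies in [-180, 180]."""
    lons = [x for x, _ in piece]
    mid = (min(lons) + max(lons)) / 2.0
    k = math.floor((mid + 180.0) / 360.0)
    return [(x - 360.0 * k, y) for x, y in piece]


def split_at_dateline(coords):
    """Split an unwrapped linestring, given as (lon, lat) tuples."""
    pieces = []
    current = [coords[0]]
    for (x0, y0), (x1, y1) in zip(coords, coords[1:]):
        lo, hi = min(x0, x1), max(x0, x1)
        # Collect date-line longitudes strictly between the endpoints.
        k = math.floor((lo - 180.0) / 360.0) + 1
        cuts = []
        while 180.0 + 360.0 * k < hi:
            cuts.append(180.0 + 360.0 * k)
            k += 1
        if x1 < x0:
            cuts.reverse()  # visit cuts in the direction of travel
        for b in cuts:
            t = (b - x0) / (x1 - x0)
            cut = (b, y0 + t * (y1 - y0))
            current.append(cut)
            pieces.append(current)
            current = [cut]
        current.append((x1, y1))
    pieces.append(current)
    return [_shift_into_range(p) for p in pieces]


# LINESTRING (-200 50, -160 60) yields (160 50, 180 55) and (-180 55, -160 60).
print(split_at_dateline([(-200, 50), (-160, 60)]))
```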

I will mention that in different parts of my code I am using LINESTRING geometries and in other parts of my code I am using POLYGONS (usually just a rectangle).  Given the location of the data, the LINESTRING or POLYGON geometries could cross the international date line and have longitudinal values that break out of the [-180, 180] region.  I allow the longitudinal values to "unwrap" so that there is no ambiguity in the Geometry.  That being said, it would be great if you somehow create a generic enough implementation such that you can handle any geometry that crosses the international date line.  In the generic sense, that could mean having to split a single geometry into N>=1 geometries that live in the [-180, 180] longitudinal space (concave polygons could be tricky).

It sounds like you are moving toward the exact implementation that I would find desirable:
    - an implementation in which I can index features with unwrapped geometries
    - an implementation in which my queries can remain in the [-180, 180] longitudinal space (meaning I don't have to query outside of [-180, 180] in order to get results for geometries that were indexed with longitudes that went beyond the [-180, 180] space).
    -- for example, if I index LINESTRING (-200 50, -160 60), and then query (160 50, 180 55), I should get results.

Let me know if we are on the same page.  I believe we are.

Thanks,
Beau

-----Original Message-----
From: Hunter Provyn [mailto:fhp@xxxxxxxx]
Sent: Tuesday, July 01, 2014 1:19 PM
To: Beau Lalonde; hunter@xxxxxxxx; Geomesa User discussions
Subject: Re: [geomesa-users] International date line dilemma for non-point geometries, possible bug?

Beau,

Yes, the ticket is GEOMESA-150.

https://geomesa.atlassian.net/browse/GEOMESA-150

I've got a PR up that handles part of this issue. We expect it to be resolved soon.
I was just looking into using spatial4j rather than the transformations that I have written.
However, spatial4j JtsGeometry turns LINESTRING (-200 50, -160 60) into LINESTRING (-180 55, -160 60).  So it fails the test I created that should turn that into two linestrings (-180 55, -160 60) (160 50, 180 55). This (result of my test) is what you would be expecting, right?

thanks,
Hunter


On 07/01/2014 12:48 PM, Beau Lalonde wrote:
Hunter,

Any update about this issue?

Thanks,

Beau

-----Original Message-----
From: geomesa-users-bounces@xxxxxxxxxxxxxxxx [mailto:geomesa-users-bounces@xxxxxxxxxxxxxxxx] On Behalf Of Hunter Provyn
Sent: Monday, June 23, 2014 9:45 AM
To: geomesa-users@xxxxxxxxxxxxxxxx
Subject: Re: [geomesa-users] International date line dilemma for non-point geometries, possible bug?

Beau,

Good catch. We are looking into this now.
I'll get back to you with some guidance and, if this turns out to be a bug, a JIRA tracking ticket.

thanks,
Hunter

On 06/20/2014 02:17 PM, Beau Lalonde wrote:
Hi,

I currently use GeoMesa to index non-point geometries (e.g. LineString, Polygon, MultiPoint), and I have recently switched from using a January version of GeoMesa to using a version that is current (as of the last week or so).

In the January version of GeoMesa, I had implemented an international date line solution for non-point geometries as follows:
- When indexing a non-point geometry (e.g. Polygon) that overlapped the international date line, I would make sure all of the longitudinal values were unwrapped.  In my implementation, this had the consequence of having my indexed geometries span anywhere from -540 degrees longitude to 540 degrees longitude (with at least a portion of the geometry in the [-180, 180] region).
- When querying using GeoMesa, I would actually perform multiple queries in the appropriate redundant regions in space [-540, -180], [-180, 180], and [180, 540] to make sure I captured any geometries that spanned the international date line.
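
To make the multiple-query workaround concrete, here is a minimal sketch of how the redundant query boxes could be generated from one box in [-180, 180]. The helper name shifted_query_boxes is hypothetical, and issuing the queries and merging their results is left out.

```python
def shifted_query_boxes(west, south, east, north, shifts=(-360.0, 0.0, 360.0)):
    """Return copies of a query box shifted by whole turns of longitude.

    Hypothetical helper: with the default shifts, a box inside
    [-180, 180] is repeated in [-540, -180] and [180, 540] so that
    geometries indexed with unwrapped longitudes can still be matched.
    """
    return [(west + s, south, east + s, north) for s in shifts]


for box in shifted_query_boxes(170, -40, 180, -37):
    print(box)
```

Each returned tuple is (west, south, east, north) and would be used for one query.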


With the latest version of GeoMesa, I get the following exception when I try to index a non-point geometry (by unwrapping the longitude) that spans the international date line:
Exception in thread "main" java.lang.Exception: ERROR:  Could not find a suitable 0-bit MBR for the target geometry:  POLYGON ((179.31530746783775 -39.87427143505418, 179.31530746783775 -37.87427143505418, 181.31530746783775 -37.87427143505418, 181.31530746783775 -39.87427143505418, 179.31530746783775 -39.87427143505418))
	at geomesa.utils.geohash.GeohashUtils$.getMinimumBoundingGeohash(GeohashUtils.scala:241)
	at geomesa.utils.geohash.GeohashUtils$.decomposeGeometry_(GeohashUtils.scala:535)
	at geomesa.utils.geohash.GeohashUtils$.decomposeGeometry(GeohashUtils.scala:564)
	at geomesa.core.index.IndexEncoder.encode(IndexEntry.scala:70)
	at geomesa.core.index.IndexSchema.encode(IndexSchema.scala:84)
	at geomesa.core.data.AccumuloFeatureWriter.writeToAccumulo(AccumuloFeatureWriter.scala:104)
	at geomesa.core.data.AppendAccumuloFeatureWriter.write(AccumuloFeatureWriter.scala:130)
	at org.geotools.data.AbstractFeatureStore.addFeatures(AbstractFeatureStore.java:324)
	at geomesa.core.data.AccumuloFeatureStore.addFeatures(AccumuloFeatureStore.scala:53)


When I try to query outside the [-180, 180] region I get the following exception (even if data does not exist outside of the [-180, 180] region):
Exception in thread "main" java.lang.IllegalStateException: getX called on empty Point
	at com.vividsolutions.jts.geom.Point.getX(Point.java:124)
	at geomesa.utils.geohash.GeohashUtils$.getCentroid(GeohashUtils.scala:256)
	at geomesa.utils.geohash.GeohashUtils$.getMinimumBoundingGeohash(GeohashUtils.scala:228)
	at geomesa.utils.geohash.GeohashUtils$.decomposeGeometry_(GeohashUtils.scala:535)
	at geomesa.utils.geohash.GeohashUtils$.decomposeGeometry(GeohashUtils.scala:564)
	at geomesa.utils.geohash.GeohashUtils$.getUniqueGeohashSubstringsInPolygon(GeohashUtils.scala:672)
	at geomesa.core.index.GeoHashPlanner$class.polyToGeoHashes(QueryPlanners.scala:275)
	at geomesa.core.index.GeoHashKeyPlanner.polyToGeoHashes(QueryPlanners.scala:308)
	at geomesa.core.index.GeoHashPlanner$class.polyToPlan(QueryPlanners.scala:280)
	at geomesa.core.index.GeoHashKeyPlanner.polyToPlan(QueryPlanners.scala:308)
	at geomesa.core.index.GeoHashPlanner$class.getKeyPlan(QueryPlanners.scala:302)
	at geomesa.core.index.GeoHashKeyPlanner.getKeyPlan(QueryPlanners.scala:308)
	at geomesa.core.index.GeoHashKeyPlanner.getKeyPlan(QueryPlanners.scala:309)
	at geomesa.core.index.CompositePlanner$$anonfun$9.apply(QueryPlanners.scala:396)
	at geomesa.core.index.CompositePlanner$$anonfun$9.apply(QueryPlanners.scala:396)
	at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
	at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
	at scala.collection.immutable.List.foreach(List.scala:318)
	at scala.collection.TraversableLike$class.map(TraversableLike.scala:244)
	at scala.collection.AbstractTraversable.map(Traversable.scala:105)
	at geomesa.core.index.CompositePlanner.getKeyPlan(QueryPlanners.scala:396)
	at geomesa.core.index.IndexQueryPlanner.planQuery(IndexQueryPlanner.scala:171)
	at geomesa.core.index.IndexQueryPlanner.getIterator(IndexQueryPlanner.scala:100)
	at geomesa.core.index.IndexSchema.query(IndexSchema.scala:98)
	at geomesa.core.data.AccumuloFeatureReader.<init>(AccumuloFeatureReader.scala:32)
	at geomesa.core.data.AccumuloDataStore.getFeatureReader(AccumuloDataStore.scala:294)
	at geomesa.core.data.AccumuloDataStore.getFeatureReader(AccumuloDataStore.scala:55)
	at org.geotools.data.AbstractDataStore.getFeatureReader(AbstractDataStore.java:369)
	at org.geotools.data.DefaultFeatureResults.reader(DefaultFeatureResults.java:215)
	at org.geotools.data.store.DataFeatureCollection.openIterator(DataFeatureCollection.java:231)
	at org.geotools.data.store.DataFeatureCollection.iterator(DataFeatureCollection.java:199)
	at org.geotools.data.store.DataFeatureCollection.features(DataFeatureCollection.java:188)
	at org.geotools.data.store.DataFeatureCollection.features(DataFeatureCollection.java:79)
	...


My questions are:
1) Is this behavior a bug?
2) If it's not a bug, what are some alternate approaches to indexing/querying non-point geometries that span the international date line?  Any direct guidance would be appreciated.


My ideal GeoMesa solution would be:
A) When I index geometries, I could provide GeoMesa unwrapped longitudes in order to remove ambiguity as to what the geometry represents in actual space (e.g. a small polygon crossing the international dateline vs. a large polygon covering almost the entire world).
B) When I query, I would prefer querying only the [-180, 180] longitudinal region and have GeoMesa/GeoTools be smart enough to detect overlaps with my unwrapped indexed geometries.

Thanks,
Beau
_______________________________________________
geomesa-users mailing list
geomesa-users@xxxxxxxxxxxxxxxx
http://www.locationtech.org/mailman/listinfo/geomesa-users