Re: [geomesa-users] Questions on UDF efficiency

Hi Evan,

Good question. The spatial predicates are based on a model called DE-9IM (https://en.wikipedia.org/wiki/DE-9IM). JTS is the library that computes the relationship between two geometries to evaluate predicates such as 'covers', 'contains', etc.

If you are querying for points in a (multi)polygon, most of these relationships should give the same result. The exception is a point lying exactly on the polygon boundary: 'within' and 'contains' exclude it, while 'covers' and 'intersects' include it. I'd just use st_intersects, but that's personal preference. Also, if you switch to working with non-point geometries and you are rendering data in a web client, you'll likely want to see things that are partially on the screen (intersects) rather than only the data that is completely contained (covers/within).
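To make the point-in-polygon case concrete, here is a minimal pure-Python sketch. It is not GeoMesa or JTS code: the helper names (locate, intersects, covers, within, contains) are made up for illustration, and it assumes a simple polygon ring with no holes. It classifies a point as interior, boundary or exterior, then derives each predicate from that single classification.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def on_segment(p: Point, a: Point, b: Point, eps: float = 1e-12) -> bool:
    """True if p lies on the closed segment a-b (within a small tolerance)."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    cross = (bx - ax) * (py - ay) - (by - ay) * (px - ax)
    if abs(cross) > eps:
        return False  # not collinear with the segment
    return (min(ax, bx) - eps <= px <= max(ax, bx) + eps
            and min(ay, by) - eps <= py <= max(ay, by) + eps)


def locate(p: Point, ring: List[Point]) -> str:
    """Classify p against a simple polygon ring (no holes) as
    'interior', 'boundary' or 'exterior' using ray casting."""
    px, py = p
    inside = False
    n = len(ring)
    for i in range(n):
        a, b = ring[i], ring[(i + 1) % n]
        if on_segment(p, a, b):
            return "boundary"
        (ax, ay), (bx, by) = a, b
        # Count edges crossed by a horizontal ray going right from p.
        if (ay > py) != (by > py):
            x_cross = ax + (py - ay) * (bx - ax) / (by - ay)
            if px < x_cross:
                inside = not inside
    return "interior" if inside else "exterior"


# Illustrative predicates for the point/polygon case only.
def intersects(p: Point, ring: List[Point]) -> bool:
    return locate(p, ring) != "exterior"


def covers(ring: List[Point], p: Point) -> bool:
    return locate(p, ring) != "exterior"


def within(p: Point, ring: List[Point]) -> bool:
    return locate(p, ring) == "interior"


def contains(ring: List[Point], p: Point) -> bool:
    return locate(p, ring) == "interior"


if __name__ == "__main__":
    square = [(0.0, 0.0), (4.0, 0.0), (4.0, 4.0), (0.0, 4.0)]
    for pt in [(2.0, 2.0), (4.0, 2.0), (5.0, 5.0)]:
        print(pt, locate(pt, square),
              intersects(pt, square), covers(square, pt),
              within(pt, square), contains(square, pt))
```

For a point against a polygon, all four predicates come from the same locate step, so in this sketch they cost the same and differ only in how a boundary point is treated.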

Since you've mentioned st_* functions, you are likely using Spark. The most important point is that when working with a GeoMesa DataStore in Spark, there is only one chance to push filters down to the database. If you know ahead of time which subset of the data you'd like to work with, you should express that subset as a filter in the first query, the one that reads from the underlying datastore.

Optimizing a Spark workflow in this manner is definitely a complex topic. If you are using the Spark SQL API, I recommend the 'explain' command: it shows when you are querying the underlying datastore, which work and filters are pushed down, and which work is done in memory.

Cheers,

Jim

On 2020-07-16 06:52, Yifan Wang wrote:
Hi,

I'm currently trying to filter points with Z-index using UDFs like
st_within, st_covers, st_contains to calculate their relationship with
a polygon, and I'm wondering which UDF is the most efficient? Thank
you!

Best Regards,
Evan
_______________________________________________
geomesa-users mailing list
geomesa-users@xxxxxxxxxxx
To unsubscribe from this list, visit
https://dev.eclipse.org/mailman/listinfo/geomesa-users

