Re: [geomesa-users] Questions on UDF efficiency
Hi Evan,

Good question. The spatial predicates are based on a model called DE-9IM (https://en.wikipedia.org/wiki/DE-9IM). JTS is the library that computes the relationship between two geometries to answer questions such as 'covers', 'contains', etc.
If you are querying for points in a (multi)polygon, most of these relationships should give the same result.* I'd just use st_intersects, but that's personal preference. Also, if you switch to working with non-point geometries and you are rendering data in a web client, you'll likely want to see things that are partially on the screen (intersects) rather than only the data that are completely contained (covers/within).
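To illustrate why the predicates mostly agree for points, here is a minimal pure-Python sketch. It is not JTS or GeoMesa code: `locate` and `predicates` are hypothetical helpers, and the polygon is assumed to be a simple ring without holes. The sketch classifies a point as interior, boundary, or exterior, then derives each predicate from that classification following the DE-9IM definitions. The answers differ only for a point lying exactly on the boundary.

```python
def _on_segment(p, a, b, eps=1e-12):
    """Return True if point p lies on the closed segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    cross = (bx - ax) * (py - ay) - (by - ay) * (px - ax)
    if abs(cross) > eps:
        return False  # not collinear with the segment
    return (min(ax, bx) - eps <= px <= max(ax, bx) + eps and
            min(ay, by) - eps <= py <= max(ay, by) + eps)


def locate(point, ring):
    """Classify a point against a simple polygon ring (list of (x, y), no holes)."""
    px, py = point
    inside = False
    n = len(ring)
    for i in range(n):
        a, b = ring[i], ring[(i + 1) % n]
        if _on_segment(point, a, b):
            return "boundary"
        (ax, ay), (bx, by) = a, b
        # Ray casting: toggle for each edge crossed by a ray going right from the point.
        if (ay > py) != (by > py):
            x_cross = ax + (py - ay) * (bx - ax) / (by - ay)
            if x_cross > px:
                inside = not inside
    return "interior" if inside else "exterior"


def predicates(point, ring):
    """Point-versus-polygon predicates derived from the point's location."""
    where = locate(point, ring)
    return {
        "intersects": where != "exterior",
        "covers": where != "exterior",    # polygon covers point
        "contains": where == "interior",  # polygon contains point
        "within": where == "interior",    # point within polygon
    }


square = [(0, 0), (4, 0), (4, 4), (0, 4)]
for pt in [(2, 2), (4, 2), (5, 5)]:
    print(pt, locate(pt, square), predicates(pt, square))
```

For the boundary point (4, 2), intersects and covers are true while contains and within are false; for the other two points all four predicates agree.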
Since you've mentioned st_* functions, you are likely using Spark. The most important point is that when working with a GeoMesa DataStore in Spark, there's only one chance to push filters down to the database. If you know ahead of time which subset of the data you'd like to work with, you should express that subset as a filter in the first query, the one that reads from the underlying datastore.
Optimizing a Spark workflow in this manner is definitely a complex topic. If you are using the Spark SQL API, I can recommend using the 'explain' command to understand when you are querying an underlying datastore, what work/filters are pushed down, and when work is being done in memory, etc.
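A rough sketch of that pattern in PySpark follows. It assumes the GeoMesa Spark SQL runtime is already on the classpath. The names `ds_params`, `feature_name`, and the geometry column `geom` are hypothetical placeholders, and so are the two helper functions. The `format("geomesa")` reader and the `geomesa.feature` option follow the GeoMesa Spark SQL documentation. Whether the filter is actually pushed down depends on your datastore and versions, which is what the `explain` output lets you check.

```python
def build_pushdown_query(view, geom_col, polygon_wkt):
    """Build the initial query, with the spatial filter in the WHERE clause."""
    return (
        f"SELECT * FROM {view} "
        f"WHERE st_intersects({geom_col}, st_geomFromWKT('{polygon_wkt}'))"
    )


def load_subset(spark, ds_params, feature_name, polygon_wkt, geom_col="geom"):
    """Read a feature type and apply the spatial filter in the first query.

    spark is an existing SparkSession; ds_params is a dict of datastore
    connection parameters (hypothetical, depends on your backend).
    """
    df = (
        spark.read.format("geomesa")
        .options(**ds_params)
        .option("geomesa.feature", feature_name)
        .load()
    )
    df.createOrReplaceTempView(feature_name)
    subset = spark.sql(build_pushdown_query(feature_name, geom_col, polygon_wkt))
    # Print the logical and physical plans to see which filters reach the
    # datastore relation and which are evaluated in Spark.
    subset.explain(True)
    return subset
```

Later transformations on `subset` run in Spark against whatever the first read returned, so any filter you can state up front belongs in that first query.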
Cheers,
Jim

On 2020-07-16 06:52, Yifan Wang wrote:
Hi,

I'm currently trying to filter points with Z-index using UDF like st_within, st_cover, st_contains to calculate relationship with polygon and I'm wondering which UDF has the highest efficiency?

Thank you!

Best Regards,
Evan