Hi Emilio,
thanks for the explanation. I mainly want to run queries of
the form: “All records within a bbox and between t0 and t1
which belong to simulation run x”. It sounds like the z3 or
xz3 index is just the right choice for me.
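For illustration, a query of that shape could be written as an ECQL filter string. This is only a sketch: the attribute names `geom`, `dtg` and `runId` are hypothetical placeholders for whatever the schema defines, and the class itself is not part of GeoMesa.

```java
import java.time.Instant;
import java.util.Locale;

public class RunQueryFilter {

    // Builds an ECQL filter string for "within bbox, between t0 and t1, in run x".
    // The attribute names geom, dtg and runId are assumptions, not taken from a real schema.
    static String build(double minX, double minY, double maxX, double maxY,
                        Instant t0, Instant t1, String runId) {
        // Double any single quote so the run id stays a valid ECQL string literal.
        String safeRunId = runId.replace("'", "''");
        return String.format(Locale.ROOT,
                "BBOX(geom, %s, %s, %s, %s) AND dtg DURING %s/%s AND runId = '%s'",
                minX, minY, maxX, maxY, t0, t1, safeRunId);
    }

    public static void main(String[] args) {
        System.out.println(build(13.0, 52.3, 13.8, 52.7,
                Instant.parse("2020-07-08T06:00:00Z"),
                Instant.parse("2020-07-08T09:00:00Z"),
                "run-1"));
    }
}
```

The resulting string would then be parsed with GeoTools' ECQL support and passed to a query. Note that DURING excludes the two end points of the period.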
I have an additional question regarding indexing. My
simulation data also contains records which have no spatial
or temporal attributes, like a list of cars used by the
simulated persons. My first attempt was to store them as
SimpleFeatures in GeoMesa as well, just without a geometry.
This way I wouldn’t have to use an additional storage
technology. However, it feels a little hacky, since I would
be treating a SimpleFeature as a plain database row. Is this
something I can do, or would you recommend a different
approach for storing such data?
Best Regards
Janek
The time
interval is used to break an unbounded dimension (time) into a
bounded dimension (e.g. seconds in a week). Then the bounded
dimension is used to create the z-index curve, which is used
to generate the scan ranges. Generally you would want to query
only over a few time intervals (z-curves), so having an hourly
interval would only be appropriate if you planned to query at
most 1-2 hours at a time.
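As a rough illustration of that binning step, here is a simplified sketch for a weekly interval. It is not the actual GeoMesa implementation (see BinnedTime for that); the class and method names are made up for this example, and it assumes the bin is counted in whole weeks since the epoch with the offset measured in seconds.

```java
import java.time.Duration;
import java.time.Instant;

public class WeeklyBinSketch {

    // Length of one time interval (bin) in seconds: 7 days = 604800 s.
    static final long SECONDS_PER_WEEK = Duration.ofDays(7).getSeconds();

    // Number of whole weeks since 1970-01-01T00:00:00Z: this selects the z-curve.
    static long bin(Instant t) {
        return Math.floorDiv(t.getEpochSecond(), SECONDS_PER_WEEK);
    }

    // Seconds elapsed inside that week: the bounded dimension fed into the z-curve.
    static long offset(Instant t) {
        return Math.floorMod(t.getEpochSecond(), SECONDS_PER_WEEK);
    }

    public static void main(String[] args) {
        Instant t = Instant.parse("2020-07-08T00:00:00Z");
        System.out.println("bin=" + bin(t) + " offset=" + offset(t));
    }
}
```

A query spanning several weeks touches one bin per week, which is why querying across only a few intervals keeps the number of scan ranges small.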
Even at a day interval, data will be indexed down to the
millisecond. You can see the breakdowns for different time
intervals here:
https://github.com/locationtech/geomesa/blob/main/geomesa-z3/src/main/scala/org/locationtech/geomesa/curve/BinnedTime.scala#L17
Because we only use a signed 2 byte short to store the
interval offset from the Java epoch (1970), we didn't
implement hourly intervals because the max date would be 1973.
However, that is an implementation detail, which could be
revisited if someone wanted to implement hourly intervals.
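The 1973 figure can be checked with a few lines of plain Java. This sketch only redoes the arithmetic and assumes the short counts whole intervals since the epoch; the class name is made up for the example.

```java
import java.time.Instant;
import java.time.temporal.ChronoUnit;

public class ShortIntervalRange {

    // Last instant reachable if a signed short (max 32767) counts `unit`s since the epoch.
    // Instant.plus only supports units up to DAYS, so use HOURS or DAYS here.
    static Instant maxInstant(ChronoUnit unit) {
        return Instant.EPOCH.plus(Short.MAX_VALUE, unit);
    }

    public static void main(String[] args) {
        // 32767 hours is about 3.7 years, so hourly bins run out in 1973.
        System.out.println("hour bins end at: " + maxInstant(ChronoUnit.HOURS));
        // 32767 days is about 89.7 years, so daily bins last until 2059.
        System.out.println("day bins end at:  " + maxInstant(ChronoUnit.DAYS));
    }
}
```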
Whether to use an attribute index vs a z3 index will really
depend on your query patterns. If you plan to query with both
a spatial and temporal component, then the z3 index will be
much more effective. Otherwise, you could consider an
attribute index on your date, but keep in mind you would not
be able to leverage a secondary spatial index unless your
temporal predicate was an equality filter, i.e. you are
querying for a specific second if you store time as a simple
long. You could potentially store the hour in the day, and
create an attribute index with a secondary z-index - then you
would be able to query for a particular hour.
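A minimal sketch of that last idea, assuming the time is stored as seconds since the start of the simulation. The attribute names `hour` and `geom` are hypothetical, and the helper class is not part of GeoMesa.

```java
import java.util.Locale;

public class HourBucket {

    // Derives the hour bucket to store alongside each record
    // (hours elapsed since the start of the simulation).
    static int hourOf(long secondsSinceStart) {
        return Math.toIntExact(Math.floorDiv(secondsSinceStart, 3600L));
    }

    // Equality on the (hypothetical) indexed `hour` attribute plus a bounding box,
    // which is the filter shape that can use a secondary spatial index.
    static String filter(int hour, double minX, double minY, double maxX, double maxY) {
        return String.format(Locale.ROOT,
                "hour = %d AND BBOX(geom, %s, %s, %s, %s)",
                hour, minX, minY, maxX, maxY);
    }

    public static void main(String[] args) {
        int hour = hourOf(27_000L); // 07:30 into the simulation falls in hour 7
        System.out.println(filter(hour, 13.0, 52.3, 13.8, 52.7));
    }
}
```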
Thanks,
Emilio
On 7/8/20 3:00 PM, Laudan, Janek wrote:
Hi Emilio,
Thanks for the quick reply. I will investigate this further
next week when I'm back from my vacation.
Since smaller intervals are not supported, will it even
make sense to use this type of index in my case or will the
database have to perform a sequential scan on every temporal
query?
The temporal component of my data is actually just seconds
from the start of the simulation represented as a simple
long. Would it make more sense in my case to use an
attribute index - of this very number - in combination with
a spatial one?
Thanks again for your help
Janek
--
Janek Laudan
Research Associate
Transport System Planning and Telematics
TU Berlin
Wednesday, 08 July 2020, 03:51 PM +02:00, from Emilio Lahr-Vivaz
elahrvivaz@xxxxxxxx:
Hello,
I'm actually surprised that you didn't get an
error. There isn't any implementation for hourly
intervals, so it either didn't index the time, or
it fell back to the default weekly interval. You
might be able to use the 'describe-schema' CLI
command to see if the user data was persisted as
you specified.
Thanks,
Emilio
On 7/8/20 4:14 AM, Laudan, Janek wrote:
Hi,
I just started working with geomesa. I’m planning to use it
for storing traffic simulation data generated with MATSim
(https://matsim.org). Because a simulation run usually covers
only a single day, I thought it would make sense to have a
Z-Index Time Interval of ‘hour’. Now, the documentation
(https://www.geomesa.org/documentation/user/datastores/index_config.html#configuring-z-index-time-interval)
says that only ‘day’, ‘week’, ‘month’ and ‘year’ are
supported. I tried to set the index of my schema to hourly
intervals like so:
sft.getUserData().put("geomesa.z3.interval", "hour");
and my test setup stored and retrieved the feature I
submitted to the store just fine.
Will geomesa silently fall back to another form of indexing,
or are intervals smaller than those covered by the
documentation supported?
Best Regards
Janek Laudan
-----------------------------------------------------------------------------------------------------------------------------------
Janek Laudan
Research Associate
Transport Systems Planning and Transport Telematics, TU Berlin
Website: https://vsp.tu-berlin.de
E-Mail: laudan@xxxxxxxxxxxx
Skype: live:janek.laudan
_______________________________________________
geomesa-users mailing list
geomesa-users@xxxxxxxxxxx
To unsubscribe from this list, visit https://dev.eclipse.org/mailman/listinfo/geomesa-users