Hello,
I'm quite new to GeoWave and key-value data stores in general, so apologies if these questions are a little basic. I have installed GeoWave 0.9.6 (HBase data store) on an HDP 2.6 sandbox and followed the GDELT ingestion tutorial (locationtech.github.io/geowave/walkthrough-vector.html).
This ingested the data into HBase, but in a simple single-family, single-column format. How can I ingest data that is indexed spatially, temporally, etc., while also specifying a schema for the values?
For example, I have a CSV file with the example line:
45.4706448 , -75.7374945 , 2018-02-09T22:54:50 , 1 , 3 , someText , 95333 , 453.5 , 5 , moreText , etc, etc,
The SQL schema would be:
float Latitude, float Longitude, string Timestamp, int CustomIndexKey1, int CustomIndexKey2, string StringField, int AnotherInteger, float AnotherFloat, int YetAnotherInteger, string AnotherString, etc.
I would like to ingest this into GeoWave, and index on the first 5 columns -- that is, Lat/Lon spatial index, temporal index on Timestamp, and two more integer attributes (CustomIndexKey1, CustomIndexKey2).
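To make the intended layout concrete, here is a minimal Python sketch (plain Python, not GeoWave code) that parses the example line into the typed columns listed above. The `SCHEMA` list and the `parse_row` helper are hypothetical names used only for illustration, the trailing "etc" columns are left out, and converting Timestamp to a datetime is an assumption about how it would be used for temporal indexing.

```python
import csv
import io
from datetime import datetime

# Column names and types copied from the schema above; the "etc" columns are omitted.
# Timestamp is parsed as an ISO-8601 datetime (an assumption; the schema lists it as a string).
SCHEMA = [
    ("Latitude", float),
    ("Longitude", float),
    ("Timestamp", datetime.fromisoformat),
    ("CustomIndexKey1", int),
    ("CustomIndexKey2", int),
    ("StringField", str),
    ("AnotherInteger", int),
    ("AnotherFloat", float),
    ("YetAnotherInteger", int),
    ("AnotherString", str),
]

def parse_row(line):
    """Parse one header-less CSV line into a dict of typed values."""
    cells = next(csv.reader(io.StringIO(line)))
    # Strip the padding around each cell before converting it.
    return {name: cast(cell.strip()) for (name, cast), cell in zip(SCHEMA, cells)}

row = parse_row(
    "45.4706448 , -75.7374945 , 2018-02-09T22:54:50 , 1 , 3 , "
    "someText , 95333 , 453.5 , 5 , moreText"
)

# The first five fields are the ones I would like indexed.
index_fields = [name for name, _ in SCHEMA[:5]]
```

Here `row["Latitude"]`, `row["Longitude"]` and `row["Timestamp"]` are the values I would want in the spatial and temporal index, and `row["CustomIndexKey1"]` and `row["CustomIndexKey2"]` are the two extra integer keys.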
So, my questions are:
1) If I use the GDELT tutorial as an example, I guess I would ingest this file type using something like "geowave ingest localtogw /mnt/myfile.csv myDataStore myfile-spatial -f geotools-vector". Is that right? This does not seem to do anything. I don't get any error messages, but nothing is imported into HBase. Well, I get error messages about missing extensions for GeoServer, etc., but those seem unrelated, as I got them when I was ingesting GDELT data as well and it worked.
2) How do I define which columns are lat, long, timestamp, and the desired secondary indices? I see that there is support for configuring the input by defining a JSON 'SIMPLE_FEATURE_CONFIG_FILE' (https://locationtech.github.io/geowave/userguide.html#ingest-plugins). Is that what I would use? I don't see anywhere to define the lat-long columns in the example, just numeric secondary indices and temporal indices.
3) Does the CSV file need to have a header, or can I specify the schema in the JSON feature config file?
Thank you,
Mladen