After getting the geomesa-gdelt-master code to work with one of the data files at
http://data.gdeltproject.org/events/index.html (http://data.gdeltproject.org/events/20140507.export.CSV.zip), I decided to
adapt the GDELTIngest.java, GDELTIngestMapper.java and GDELTQuery.java classes to work with a different data set (see sensor.csv).
I believe I have made only minimal changes to the “GDELT*.java” files to produce SensorIngest.java, SensorIngestMapper.java (both enclosed) and SensorQuery.java (not included). The pom.xml and core-site.xml files are unchanged.
My data (see sensor.csv) is defined in SensorIngest.java as:
    static List<String> attributes = Lists.newArrayList(
        "EventID:Integer",
        "Sensor:Integer",
        "Time:Date",              // Date => a "Unix" time (i.e., millis since 1970.01.01)
        "SensorGeo_Lat:Float",
        "SensorGeo_Long:Float",
        "Altitude:Integer",
        "MinAxis:Float",
        "MajAxis:Float",
        "Orientation:Float",
        "*geom:Point:srid=4326"   // SRID 4326 => standard coordinate system
    );
The critical fields are “Time:Date” (which is a Unix-like time) and the event lat/long (“SensorGeo_Lat/Long:Float”). I note that the GDELT data instead uses an “SQLDATE” field of the form “YYYYMMDD”, so the date handling differs. The lat/long fields, however, are similar to those in GDELT. I
even renamed them to “SensorGeo_Lat” and “SensorGeo_Long” on the theory that “…Geo_...” is a significant naming convention.
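
To make the date difference concrete, here is a minimal plain-Java sketch of the two conversions. It is not the GeoMesa code and not my actual mapper. The class and method names are hypothetical, and it assumes the sensor “Time” column holds epoch milliseconds as a decimal string and that the GDELT “SQLDATE” is interpreted as midnight UTC.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.TimeZone;

// Hypothetical helper class, for illustration only.
class SensorTimeSketch {

    // GDELT-style date: a "yyyyMMdd" string such as the SQLDATE column.
    // Assumption: the date is interpreted as midnight UTC.
    static Date parseSqlDate(String s) {
        SimpleDateFormat fmt = new SimpleDateFormat("yyyyMMdd");
        fmt.setTimeZone(TimeZone.getTimeZone("UTC"));
        fmt.setLenient(false);
        try {
            return fmt.parse(s.trim());
        } catch (ParseException e) {
            throw new IllegalArgumentException("Not a yyyyMMdd date: " + s, e);
        }
    }

    // Sensor-style date: milliseconds since 1970-01-01 UTC as a decimal string.
    // Assumption: the value is in milliseconds, not seconds.
    static Date parseEpochMillis(String s) {
        return new Date(Long.parseLong(s.trim()));
    }

    public static void main(String[] args) {
        Date fromGdelt = parseSqlDate("20140507");
        Date fromSensor = parseEpochMillis("1399420800000");
        // Both values represent the same instant if the assumptions above hold.
        System.out.println(fromGdelt.equals(fromSensor));
    }
}
```

If the mapper still parses the time column with a “yyyyMMdd” pattern, an epoch-millis value would presumably fail to parse or produce a wrong date, so this is one of the places I would expect to need a change.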
I am running on a RHEL 6.5 server (VMware Workstation VM), using Accumulo 1.5.1 and Hadoop 2.4.0.
Using the enclosed “./bin/data” Bash script, I run the command “data upload” to load the data into Hadoop, and then “data ingest” to ingest it into Accumulo (using Hadoop MapReduce).
I see no errors, but when I use the “accumulo shell” commands “table sensor” and “scan”, I see only 4 (metadata) entries in the target “sensor” table, rather than the 124 “SimpleFeature” entries I expected:
~METADATA_sensor attributes: [] EventID:Integer,Sensor:Integer,Time:Date,Latitude:Float,Longitude:Float,Altitude:Integer,MinAxis:Float,MajAxis:Float,Orientation:Float,geom:Point:srid=4326
~METADATA_sensor dtgfield: [] Time
~METADATA_sensor featureEncoding: [] avro
~METADATA_sensor schema: [] %~#s%99#r%sensor#cstr%0,3#gh%yyyyMMdd#d::%~#s%3,2#gh::%~#s%#id
I’m perplexed as to what I have omitted or done wrong here.
Any thoughts or suggestions would be greatly appreciated!
Bob Barnhart
Chief Systems Engineer | 858 826 5596 (Office) | 619 972 9489 (Mobile) |
Robert.M.Barnhart@xxxxxxxxxx