Hi Raffaele,
I have found our error(s). In our ingest driver, we try to ensure
the existence of a specific temp directory and push our jar to it.
Unfortunately, this code
(https://github.com/geomesa/geomesa-gdelt/blob/master/src/main/java/geomesa/gdelt/GDELTIngest.java#L136-149)
seems to be creating files instead of directories.
I am going to work on fixing the project this afternoon and I'll
respond with more details. If you want a quick work-around, you can
try these steps:
1. Remove the /tmp file and create a directory instead. (hadoop
dfs -rmr and -mkdir.)
2. Copy the jar created by the GDELT ingest project to a new name
with this command:
cp target/geomesa-gdelt-accumulo1.5-1.0-SNAPSHOT.jar \
   target/geomesa-gdelt-1.0-SNAPSHOT.jar
I don't think it is strictly necessary, but you may also create the
output directory with
hadoop dfs -mkdir /tmp/geomesa-gdelt-output
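As a rough sketch of step 1 and the optional output directory, a small shell helper along these lines should work. The function name ensure_hdfs_dir is made up for illustration and is not part of the geomesa-gdelt project. It assumes the hadoop command is on your PATH and relies only on the standard "hadoop fs -test", "-rm", and "-mkdir" subcommands.

```shell
#!/bin/sh
# Sketch only: make sure an HDFS path is a directory, replacing a stray file.
# ensure_hdfs_dir is a hypothetical helper, not part of geomesa-gdelt.
ensure_hdfs_dir() {
    path="$1"
    # Already a directory: nothing to do.
    if hadoop fs -test -d "$path"; then
        return 0
    fi
    # Exists but is not a directory (for example the 0-byte /tmp file):
    # remove it before creating the directory.
    if hadoop fs -test -e "$path"; then
        hadoop fs -rm "$path" || return 1
    fi
    hadoop fs -mkdir "$path"
}
```

After sourcing the snippet, you would call it as "ensure_hdfs_dir /tmp" and then "ensure_hdfs_dir /tmp/geomesa-gdelt-output". It deliberately uses a plain -rm rather than a recursive delete, so it cannot remove a populated directory by accident.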
With those notes, I was able to ingest a daily update file
successfully. I grabbed it with this command:
wget data.gdeltproject.org/events/20150213.export.CSV.zip
I mention that since our project targets the 'full' rather than
'reduced' version of the GDELT dataset.
As I said, I'll write again when I've had a chance to fix the bugs
in the project and to update the documentation.
Thanks,
Jim
On 02/14/2015 10:06 AM, Raffaele Palmieri wrote:
Thanks Jim,
I've done as you said, and the result is the same:
Name     | Type | Size | Replication | Block Size | Modification Time | Permission | Owner | Group
accumulo | dir  |      |             |            | 2015-02-13 14:56  | rwxr-xr-x  | root  | supergroup
gdelt    | dir  |      |             |            | 2015-02-13 17:12  | rwxr-xr-x  | root  | supergroup
tmp      | file | 0 B  | 1           | 128 MB     | 2015-02-14 16:02  | rw-r--r--  | root  | supergroup
So I think that's a bug in GDELTIngest.
Regards,
Raffaele.