Re: [geomesa-users] Error ingesting gdelt file

Hi Raffaele,

I have found our error(s).  In our ingest driver, we try to ensure that a specific temp directory exists and then push our jar to it.

Unfortunately, this code (https://github.com/geomesa/geomesa-gdelt/blob/master/src/main/java/geomesa/gdelt/GDELTIngest.java#L136-149) seems to be creating files instead of directories.
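
To illustrate the failure mode, here is a minimal Java sketch. It is not the GDELTIngest code, and it uses java.nio.file on a local filesystem as a stand-in for HDFS, an assumption made only so the example runs on its own. The class name EnsureDirectory and the helper ensureDirectory are hypothetical. The check creates a working directory when it is missing and refuses to continue when a regular file already occupies the path, which is the state /tmp ended up in.

```java
import java.io.IOException;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;

// Local-filesystem analogy only: java.nio.file stands in for HDFS here.
final class EnsureDirectory {

    // Hypothetical helper, not taken from GDELTIngest: returns dir once it
    // exists as a directory, and fails loudly if a regular file is in the way.
    static Path ensureDirectory(Path dir) throws IOException {
        if (Files.exists(dir) && !Files.isDirectory(dir)) {
            throw new FileAlreadyExistsException(
                dir.toString(), null, "exists but is not a directory");
        }
        // createDirectories does nothing if the directory already exists.
        return Files.createDirectories(dir);
    }

    public static void main(String[] args) throws IOException {
        Path root = Files.createTempDirectory("ensure-dir-demo");

        // Intended behaviour: create the working directory, then put the jar in it.
        Path workDir = ensureDirectory(root.resolve("tmp"));
        Files.createFile(workDir.resolve("demo.jar"));

        // The bug pattern: the path is created as an empty file, so nothing
        // can be created underneath it afterwards.
        Path badTmp = Files.createFile(root.resolve("tmp-as-file"));
        try {
            ensureDirectory(badTmp);
        } catch (FileAlreadyExistsException e) {
            System.out.println("refused: " + e.getMessage());
        }
    }
}
```

With the Hadoop API the same idea would presumably go through FileSystem.mkdirs rather than FileSystem.create, but that variant is untested here.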

I am going to work on fixing the project this afternoon and I'll respond with more details.  If you want a quick workaround, you can try these steps:

1.  Remove the /tmp file from HDFS and create a directory in its place (hadoop dfs -rmr, then hadoop dfs -mkdir).

2.  Copy the jar created by the GDELT ingest project to a new name with this command:
cp target/geomesa-gdelt-accumulo1.5-1.0-SNAPSHOT.jar target/geomesa-gdelt-1.0-SNAPSHOT.jar

I don't think it is strictly necessary, but you may also create the output directory with
hadoop dfs -mkdir /tmp/geomesa-gdelt-output

With those notes, I was able to ingest a daily update file successfully.  I grabbed it with this command:
wget data.gdeltproject.org/events/20150213.export.CSV.zip

I mention that since our project targets the 'full' rather than 'reduced' version of the GDELT dataset.

As I said, I'll write again when I've had a chance to fix the bugs in the project and to update the documentation.

Thanks,

Jim

On 02/14/2015 10:06 AM, Raffaele Palmieri wrote:
Thanks Jim,
I did as you suggested and the result is the same:

Name      Type  Size  Replication  Block Size  Modification Time  Permission  Owner  Group
accumulo  dir   -     -            -           2015-02-13 14:56   rwxr-xr-x   root   supergroup
gdelt     dir   -     -            -           2015-02-13 17:12   rwxr-xr-x   root   supergroup
tmp       file  0 B   1            128 MB      2015-02-14 16:02   rw-r--r--   root   supergroup

So I think it's a bug in GDELTIngest.
Regards,
Raffaele.

2015-02-14 14:35 GMT+01:00 Jim Hughes <jnh5y@xxxxxxxx>:
Hi Raffaele,

Yes, this is strange.  Many programs assume that /tmp is a directory and that they can create additional working directories under it.

I'd suggest moving /tmp out of the way, and trying the GDELT ingest project again.  If GeoMesa code is creating the file, then that's definitely a bug and I'd love to hear more details so that we can work out a fix.

Thanks,

Jim


On 02/14/2015 02:19 AM, Raffaele Palmieri wrote:
Hi Chris,
The following are the contents of the HDFS root file system:

accumulo  dir   -     -            -           2015-02-13 14:56   rwxr-xr-x   root   supergroup
gdelt     dir   -     -            -           2015-02-13 17:12   rwxr-xr-x   root   supergroup
tmp       file  0 B   1            128 MB      2015-02-13 17:43   rw-r--r--   root   supergroup

It seems that tmp was created as a file. Could this be the problem?
Thank you,
regards,
Raffaele.




2015-02-13 21:25 GMT+01:00 Chris Eichelberger <cne1x@xxxxxxxx>:
Raffaele,

Just to be sure, does the "/tmp" directory exist in your HDFS as a
directory with permissions that would allow the job submitter to write
to it?

Sincerely,
  -- Chris


On Fri, 2015-02-13 at 18:37 +0100, Raffaele Palmieri wrote:
> Dear GeoMesa users,
> When I try to ingest GDELT with the following command:
> # hadoop jar target/geomesa-gdelt-accumulo1.5-1.0-SNAPSHOT.jar \
>     geomesa.gdelt.GDELTIngest -instanceId accumulo \
>     -zookeepers localhost:2181 -user root -password password \
>     -tableName gdelt -featureName event \
>     -ingestFile hdfs:///gdelt/uncompressed/gdelt.tsv
> I get the following error:
>
>  org.apache.hadoop.fs.FileAlreadyExistsException: Parent path is not a directory: /tmp tmp
>         at org.apache.hadoop.hdfs.server.namenode.FSDirectory.mkdirs(FSDirectory.java:1916)
>         at org.apache.hadoop.hdfs.server.namenode.FSDirectory.addFile(FSDirectory.java:284)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInternal(FSNamesystem.java:2093)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInt(FSNamesystem.java:2012)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFile(FSNamesystem.java:1963)
>         at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.create(NameNodeRpcServer.java:491)
>         at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.create(ClientNamenodeProtocolServerSideTranslatorPB.java:301)
>         at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java:59570)
>         at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
>         at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2048)
>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2044)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:415)
>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
>         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2042)
>
>
> Can you suggest a solution for the problem?
> Thanks,
> Raffaele.
> _______________________________________________
> geomesa-users mailing list
> geomesa-users@xxxxxxxxxxxxxxxx
> To change your delivery options, retrieve your password, or unsubscribe from this list, visit
> http://www.locationtech.org/mailman/listinfo/geomesa-users

