Re: [geomesa-users] dateTime Converter

Thanks Austin and Anthony. Austin, that is awfully helpful. I am a little uncertain of my footing, as these code and concepts (with the exception of the PostGIS functions) are new to me. I was fearful I was going to have to do the regex from my db (because of familiarity) and move back to S3, which is a lot of data transfer time. Plus I did not want to have replicated data columns. As far as the TZ goes, that won't matter here, but I can see down the line where it will.

Anthony, thanks also. I was working backward: I knew the geom worked, but something was hanging up, so I went first with the simple attributes, tried the geom, then moved on to the date. Btw, I enjoy the blog posts and know your hard work at CCRi is going to be a big assist for us in transportation research. Also thanks for the responsiveness.

Cheers, this will most likely not be my last hang-up.

Byron Chigoy

Sent from my iPhone

On Mar 9, 2017, at 5:36 PM, Anthony Fox <anthony.fox@xxxxxxxx> wrote:

In addition to Austin's suggestion, I think you might be missing a geometry column. Something like

{ name = "geom", transform = "point($lon, $lat)" }

On Mar 9, 2017, at 6:13 PM, Austin Heyne <aheyne@xxxxxxxx> wrote:

Hi Byron,


Looking at your example, I see a couple of things you can do. Firstly, you are correct that the date times are going to have to be in that example format, "2014-04-01T08:33:35.000Z". This includes the ISO-8601 'T' time separator, which seems to be absent from your samples. This can be easily fixed with a regexReplace transform of the space to 'T', e.g.


{ name = "capdate", transform = "dateTime(regexReplace(' '::r, 'T', $3))" }


However, this still leaves the problem of the missing milliseconds. Here you have two options: you can either do another regexReplace of the '+' to '.000+', or you can use a custom date format.


The regexReplace would look something like:


{ name = "capdate", transform = "dateTime(regexReplace('\\\\+'::r, '.000+', $3))" }


Or you could use a custom date format like:


{ name = "capdate", transform = "date('YYYY-MM-dd HH:mm:ss', regexReplace('\\\\+\\\\d{2}'::r, '', $3))" }


But note that this strips off the timezone, as that's not supported by the custom date parser, so this may not be a desirable approach depending on your data.
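As an aside, the effect of these substitutions can be sketched in plain Python. This is only an illustration of what the regexes do to a sample value from the thread; in GeoMesa the actual work happens inside the converter's regexReplace transform:

```python
import re

raw = "2016-01-18 18:35:49+00"

# Replace the space with the ISO-8601 'T' separator
step1 = re.sub(r" ", "T", raw)             # "2016-01-18T18:35:49+00"

# Option A: insert the missing milliseconds before the offset
option_a = re.sub(r"\+", ".000+", step1)   # "2016-01-18T18:35:49.000+00"

# Option B: strip the offset entirely (what the custom-date-format
# approach effectively does, losing the timezone)
option_b = re.sub(r"\+\d{2}$", "", step1)  # "2016-01-18T18:35:49"
```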


To compose all of these options to handle a mixed dataset, you can use the 'try()' function, e.g.


{ name = "parsedDate", transform = "regexReplace(' '::r, 'T', $3)" }

{ name = "capdate", transform = "try(dateTime($parsedDate), date('YYYY-MM-dd\'T\'HH:mm:ss', regexReplace('\\\\+\\\\d{2}'::r, '', $3)))" }


If you need to nest multiple 'try()'s, one for each format, that will work fine.
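The try-each-format-in-turn idea behind chained 'try()' calls can be sketched in Python, using the sample values from later in this thread. Note that `parse_capdate` is a made-up helper for illustration, not part of GeoMesa:

```python
import re
from datetime import datetime


def parse_capdate(s):
    """Try each known format in order, like chained try() calls."""
    # Pad a bare "+00" offset to "+0000" so strptime's %z accepts it
    s = re.sub(r"([+-]\d{2})$", r"\g<1>00", s)
    for fmt in ("%Y-%m-%d %H:%M:%S.%f%z",   # with fractional seconds
                "%Y-%m-%d %H:%M:%S%z"):     # without fractional seconds
        try:
            return datetime.strptime(s, fmt)
        except ValueError:
            continue
    raise ValueError(f"unparseable date: {s!r}")


# Both sample shapes parse with the same fallback chain
parse_capdate("2016-01-18 18:35:49.33+00")
parse_capdate("2016-02-22 01:24:00+00")
```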


You can find more information about the available transformer functions here:

http://www.geomesa.org/documentation/user/convert/function_overview.html

You can find more information about the transformer functions' usage here:

http://www.geomesa.org/documentation/user/convert/function_usage.html


Let us know if you have any more trouble.


Thanks,

Austin



On 03/09/2017 05:03 PM, Byron Chigoy wrote:

Hi – my ingestion fails when bringing in a date attribute.  The date and time are in UTC, ISO-8601 format, example: "2014-04-01T08:33:35.000Z".  I have successfully set up the .sft and .convert, and the ingest will work when I format the date type as a “string”. However, the features fail to ingest when I do:

 

.convert:

 

{ name = "tripid",  transform = "$1::string"   }

{ name = "wp",      transform = "$2::int"      }

{ name = "capdate", transform = "dateTime($3)" }

{ name = "lon",     transform = "$4::double"   }

{ name = "lat",     transform = "$5::double"   }

 

And .sft:

 

{ name = "tripid",  type = "String",  index = true  }

{ name = "wp",      type = "Integer", index = true  }

{ name = "capdate", type = "Date",    index = false }

{ name = "lon",     type = "Double",  index = false }

{ name = "lat",     type = "Double",  index = false }

 

Sample date times are

"2016-01-18 18:35:49.33+00"

"2016-02-22 01:24:00+00"

"2016-02-29 22:24:21+00"

"2016-03-21 13:21:47.173+00"

"2016-01-18 14:36:33+00"

"2016-02-22 02:28:26.98+00"

"2016-02-22 15:29:58.15+00"

I am ingesting as CSV (skipping line 1) from an S3 bucket. I don’t know if it is because the millis are sometimes present and sometimes not (though Postgres does not seem bothered by this). Any hints?


Thanks,

Byron




_______________________________________________
geomesa-users mailing list
geomesa-users@xxxxxxxxxxxxxxxx
To change your delivery options, retrieve your password, or unsubscribe from this list, visit
https://dev.locationtech.org/mailman/listinfo/geomesa-users

