Re: [geomesa-users] Apply Geomesa function to standard Spark Dataset

Thanks a lot, Jim! It's working now. My next step is to write the data out to Accumulo. I tried this:

    val prepedData = spark.sql("""SELECT *, st_makePoint(Actor1Geo_Lat, Actor1Geo_Long) as geom FROM ingested_data""")

    prepedData.show(10)

    prepedData
      .write
      .format("geomesa")
      .options(dsParams)

However, the `write` part does not seem to do anything. I get no error, but there is also no data in Accumulo. Can you please let me know how to resolve this?
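My understanding is that Spark's `DataFrameWriter` is lazy: `format` and `options` only configure the writer, and nothing executes until a terminal action such as `save()` is called. So presumably the chain needs to end with something like this (untested sketch):

    prepedData
      .write
      .format("geomesa")
      .options(dsParams)
      .save() // terminal action; without it the write is never triggered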

I also have the feature definition available (partial code example):

  // GeoMesa feature type definition (truncated)
  var geoMesaSchema = Lists.newArrayList(
    "GLOBALEVENTID:Integer",
    "SQLDATE:Date",
    "MonthYear:Integer",
    "Year:Integer"
    // ... remaining attributes omitted
  )
Is there a way to add this as an option to the write function?
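From what I can see in the GeoMesa Spark SQL docs, the writer is pointed at an existing feature type by name through the `geomesa.feature` option, rather than by passing the attribute list itself, so the schema would need to already exist in the data store. Something like this, where "gdelt" stands in for the actual feature type name (untested sketch):

    prepedData
      .write
      .format("geomesa")
      .options(dsParams)
      .option("geomesa.feature", "gdelt") // placeholder; must match a schema already in the store
      .save()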

Best regards,
Diethard

On Thu, Jul 6, 2017 at 9:38 PM, Jim Hughes <jnh5y@xxxxxxxx> wrote:
Hi Diethard,

GeoMesa uses a private bit of the Spark API to add user-defined types and functions.  You'll want to make sure that the geomesa-spark-sql_2.11 jar is on the classpath, and then you can call

org.apache.spark.sql.SQLTypes.init(sqlContext)

Calling this function will add the geometric types, functions, and optimizations to the SQL Context.  As part of loading a GeoMesa dataset into Spark SQL, the code calls this function.  (This is why all these functions work when you use GeoMesa, etc.)
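For example, in a standalone job the setup might look like this (a minimal sketch, assuming geomesa-spark-sql is on the classpath):

    import org.apache.spark.sql.SparkSession

    val spark = SparkSession.builder()
      .appName("geomesa-example")
      .getOrCreate()

    // Registers GeoMesa's geometry types, st_* functions, and query
    // optimizations on this SQL context before any spatial SQL runs
    org.apache.spark.sql.SQLTypes.init(spark.sqlContext)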

As another alternative, you can use the GeoMesa converter library to load GDELT as a DataFrame.  You should be able to use a spark.read.format("geomesa").options(params).load() call to parse the GDELT CSVs straight into SimpleFeatures.  That'd save you from having to write SQL to munge columns into geometries, etc.
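Roughly along these lines (a sketch; the exact option keys for naming the converter and the input files are assumptions on my part, so check the converter documentation for the real ones):

    // Hypothetical option keys -- consult the GeoMesa converter docs
    val params = Map(
      "geomesa.converter"   -> "gdelt",                   // a predefined converter name
      "geomesa.input.files" -> "hdfs:///data/gdelt/*.csv" // input location
    )

    val gdeltDf = spark.read
      .format("geomesa")
      .options(params)
      .load()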

Cheers,

Jim


On 07/06/2017 04:11 PM, Diethard Steiner wrote:
Hi,

So I am sourcing some data with Spark SQL, and now I want to use the GeoMesa function `st_makePoint`:

    val ingestedData = (
      spark
        .read
        .option("header", "false")
        .option("delimiter","\\t")
        .option("ignoreLeadingWhiteSpace","true")
        .option("ignoreTrailingWhiteSpace","true")
        .option("treatEmptyValuesAsNulls","true")
        .option("dateFormat","yyyyMMdd")
        .schema(gdeltSchema)
        .csv(ingestFile)
      )

    ingestedData.show(10)
    println(ingestedData.getClass)

    ingestedData.createOrReplaceTempView("ingested_data")
    val prepedData = spark.sql("""SELECT *, st_makePoint(Actor1Geo_Lat, Actor1Geo_Long) as geom FROM ingested_data""")

I get the following error:

 Exception in thread "main" org.apache.spark.sql.AnalysisException: Undefined function: 'st_makePoint'. This function is neither a registered temporary function nor a permanent function registered in the database 'default'.;

How do I resolve this?

Note: the other example, which sources data directly from Accumulo via Spark SQL and applies functions to it, works in my environment. So I assume I just need a way to convert a plain Spark Dataset into one that the GeoMesa functions can work with.

Best regards,
Diethard


_______________________________________________
geomesa-users mailing list
geomesa-users@xxxxxxxxxxxxxxxx
To change your delivery options, retrieve your password, or unsubscribe from this list, visit
https://dev.locationtech.org/mailman/listinfo/geomesa-users




