Re: [geomesa-users] Geomesa Spark Java API

Hi, José.

It appears some of our META-INF/services files don't terminate with newlines. We haven't run into problems with this before, but it looks like it is what's causing your error.
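To illustrate what goes wrong, here is a minimal, self-contained sketch (the class names are invented for the demo) of two service files being concatenated when the first one lacks a trailing newline:

import java.io.BufferedReader;
import java.io.FileReader;
import java.io.FileWriter;

public class NewlineDemo {
    public static void main(String[] args) throws Exception {
        // Simulate a shade merge of two META-INF/services files where the
        // first file does not end with a newline.
        try (FileWriter w = new FileWriter("merged-services")) {
            w.write("com.example.FactoryA");   // no trailing '\n'
            w.write("com.example.FactoryB\n");
        }
        // ServiceLoader reads one provider class name per line, so the two
        // names fuse into a single name that resolves to no class.
        try (BufferedReader r = new BufferedReader(new FileReader("merged-services"))) {
            String line;
            while ((line = r.readLine()) != null) {
                System.out.println(line);      // com.example.FactoryAcom.example.FactoryB
            }
        }
    }
}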

Would you mind sharing your development environment?

OS?
JDK?
Maven version?
maven-shade-plugin version?

Thanks,

Tom

On Mar 16, 2017, at 11:41 AM, Jose Bujalance <joseab56@xxxxxxxxx> wrote:

Actually, I already had that block in my shade plugin configuration, but I am still getting those long lines in my services files. The good part is that when I edit them manually after the jar has been generated, everything works perfectly, and I get the same result as with the Scala code.
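In case it helps anyone else, this is roughly how I do the manual fix (the jar name here is just an example):

> jar xf my-shaded-app.jar META-INF/services/org.geotools.filter.expression.PropertyAccessorFactory
(edit the extracted file so that each provider class name sits on its own line)
> jar uf my-shaded-app.jar META-INF/services/org.geotools.filter.expression.PropertyAccessorFactory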

2017-03-16 16:33 GMT+01:00 Jose Bujalance <joseab56@xxxxxxxxx>:
Hi Jim,

Thanks for your answer. You are right! This is what I found in the generated META-INF/services/org.geotools.filter.expression.PropertyAccessorFactory:

org.locationtech.geomesa.convert.cql.ArrayPropertyAccessorFactoryorg.locationtech.geomesa.features.kryo.json.JsonPropertyAccessorFactory
org.geotools.filter.expression.SimpleFeaturePropertyAccessorFactory
org.geotools.filter.expression.ThisPropertyAccessorFactory
org.geotools.filter.expression.DirectPropertyAccessorFactory

And now that you mention it, I had the exact same problem with META-INF/services/org.locationtech.geomesa.spark.SpatialRDDProvider, which I modify manually every time I generate the jar (you can guess I am not very good with Maven ^^).

So, is this solved 'automatically' by adding the following block to the shade plugin in my pom?

                            <transformers>
                                <transformer implementation="org.apache.maven.plugins.shade.resource.ServicesResourceTransformer" />
                            </transformers>
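For reference, this is roughly where that block sits in my shade plugin configuration (the plugin version and execution details below are illustrative, not copied verbatim from my pom):

<plugin>
    <groupId>org.apache.maven.plugins</groupId>
    <artifactId>maven-shade-plugin</artifactId>
    <version>2.4.3</version>
    <executions>
        <execution>
            <phase>package</phase>
            <goals>
                <goal>shade</goal>
            </goals>
            <configuration>
                <transformers>
                    <!-- merges same-named META-INF/services files from all
                         shaded dependencies instead of keeping only one -->
                    <transformer implementation="org.apache.maven.plugins.shade.resource.ServicesResourceTransformer" />
                </transformers>
            </configuration>
        </execution>
    </executions>
</plugin>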
Thank you,
José

2017-03-16 15:46 GMT+01:00 Jim Hughes <jnh5y@xxxxxxxx>:
Hi José,

Does your project bundle a fat jar for use in Spark? You may need to add a block to the maven-shade-plugin (or whichever plugin you are using) to correctly merge the META-INF/services entries for org.geotools.filter.expression.PropertyAccessorFactory [1].

You can check out your jar with something like...

> jar xvf ./org/locationtech/geomesa/geomesa-convert-common_2.11/1.3.1/geomesa-convert-common_2.11-1.3.1.jar META-INF/services/org.geotools.filter.expression.PropertyAccessorFactory
> more META-INF/services/org.geotools.filter.expression.PropertyAccessorFactory
org.locationtech.geomesa.convert.cql.ArrayPropertyAccessorFactory

My guess is that your jar has one long line with two factories on it...
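If so, a sketch of the lookup that then fails (PropertyAccessorFactory is the real GeoTools interface, but the harness around it is illustrative):

import java.util.ServiceLoader;

import org.geotools.filter.expression.PropertyAccessorFactory;

public class LoaderCheck {
    public static void main(String[] args) {
        // Iterating forces ServiceLoader to parse each line of the merged
        // services file and resolve it with Class.forName; a fused provider
        // name raises the "Provider ... not found" ServiceConfigurationError.
        for (PropertyAccessorFactory f : ServiceLoader.load(PropertyAccessorFactory.class)) {
            System.out.println(f.getClass().getName());
        }
    }
}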

As another note, GeoMesa 1.3.x depends on GeoTools 15.1.  It might work with later versions of GeoTools.
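If you want to rule out version skew entirely, you could pin gt-main to the matching release, e.g. (only the version changes relative to the dependency you already declare):

<dependency>
    <groupId>org.geotools</groupId>
    <artifactId>gt-main</artifactId>
    <version>15.1</version>
</dependency>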

Cheers,

Jim

[1] https://github.com/locationtech/geomesa/blob/master/geomesa-accumulo/geomesa-accumulo-spark-runtime/pom.xml#L138-L140



On 03/16/2017 10:10 AM, Jose Bujalance wrote:
Hi again,

I am trying the new Java API for GeoMesa Spark provided in version 1.3.1, but I am having some trouble.

First of all, I have verified that everything works fine when querying my Accumulo datastore through the Spark shell using geomesa-accumulo-spark-runtime_2.11-1.3.1.jar. This is what my Scala code looks like:

import org.apache.hadoop.conf.Configuration
import org.apache.spark.rdd.RDD
import org.apache.spark.{SparkConf, SparkContext}
import org.locationtech.geomesa.spark._
import org.geotools.data.{DataStoreFinder, Query}
import org.geotools.factory.CommonFactoryFinder
import org.geotools.filter.text.ecql.ECQL
import scala.collection.JavaConversions._

// Accumulo datastore params
val params = Map(
  "instanceId" -> "hdp-accumulo-instance",
  "user"       -> "root",
  "password"   -> "praxedo",
  "tableName"  -> "Geoloc_Praxedo"
)

// set the configuration on the existing SparkContext
val conf = sc.getConf
conf.set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
conf.set("spark.kryo.registrator", classOf[GeoMesaSparkKryoRegistrator].getName)
val sc = SparkContext.getOrCreate(conf)

// create RDD with a geospatial query using Geomesa functions
val spatialRDDProvider = GeoMesaSpark(params)
val filter = ECQL.toFilter("BBOX(coords, 2.249294, 48.815215, 2.419337, 48.904295)")
val query = new Query("history_1M", filter)
val resultRDD = spatialRDDProvider.rdd(new Configuration, sc, params, query)

resultRDD.count

This code works fine, giving the expected result.
Now I am trying to do the same thing in Java. This is what my code looks like:

package com.praxedo.geomesa.geomesa_spark;

import org.apache.hadoop.conf.Configuration;
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;
import org.geotools.data.Query;
import org.geotools.filter.text.cql2.CQLException;
import org.geotools.filter.text.ecql.ECQL;
import org.locationtech.geomesa.spark.api.java.*;

import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

public class Test {

    private final static String ACCUMULO_INSTANCE = "hdp-accumulo-instance";
    private final static String ACCUMULO_ZOOKEEPERS = "hdf-sb-a.praxedo.net:2181,hdf-sb-b.praxedo.net:2181";
    private final static String ACCUMULO_USER = "root";
    private final static String ACCUMULO_PASSWORD = "password";
    private final static String GEOMESA_CATALOG = "Geoloc_Praxedo";
    private final static String GEOMESA_FEATURE = "history_1M";

    public static void main(String[] args) throws IOException, CQLException {
        // Spark configuration
        SparkConf conf = new SparkConf().setAppName("MyAppName").setMaster("local[*]");
        conf.set("spark.serializer", "org.apache.spark.serializer.KryoSerializer");
        conf.set("spark.kryo.registrator", "org.locationtech.geomesa.spark.GeoMesaSparkKryoRegistrator");
        JavaSparkContext jsc = new JavaSparkContext(conf);

        // Datastore configuration
        Map<String, String> parameters = new HashMap<>();
        parameters.put("instanceId", ACCUMULO_INSTANCE);
        parameters.put("zookeepers", ACCUMULO_ZOOKEEPERS);
        parameters.put("user", ACCUMULO_USER);
        parameters.put("password", ACCUMULO_PASSWORD);
        parameters.put("tableName", GEOMESA_CATALOG);

        JavaSpatialRDDProvider provider = JavaGeoMesaSpark.apply(parameters);
        String predicate = "BBOX(coords, 2.249294, 48.815215, 2.419337, 48.904295)";
        Query query = new Query(GEOMESA_FEATURE, ECQL.toFilter(predicate));
        JavaSpatialRDD resultRDD = provider.rdd(new Configuration(), jsc, parameters, query);

        System.out.println("Number of records: " + resultRDD.count());
        System.out.println("First record: " + resultRDD.first());
    }
}

And here are the dependencies I am importing with Maven:

<dependency>
    <groupId>junit</groupId>
    <artifactId>junit</artifactId>
    <version>3.8.1</version>
    <scope>test</scope>
</dependency>

<dependency>
    <groupId>org.apache.accumulo</groupId>
    <artifactId>accumulo-core</artifactId>
    <version>1.7.0</version>
</dependency>

<dependency>
    <groupId>org.apache.accumulo</groupId>
    <artifactId>accumulo-fate</artifactId>
    <version>1.7.0</version>
</dependency>

<dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-core_2.11</artifactId>
    <version>2.0.0</version>
</dependency>

<dependency>
    <groupId>org.locationtech.geomesa</groupId>
    <artifactId>geomesa-spark-core_2.11</artifactId>
    <version>1.3.1</version>
</dependency>

<dependency>
    <groupId>org.locationtech.geomesa</groupId>
    <artifactId>geomesa-spark-converter_2.11</artifactId>
    <version>1.3.1</version>
</dependency>

<dependency>
    <groupId>org.locationtech.geomesa</groupId>
    <artifactId>geomesa-security_2.11</artifactId>
    <version>1.3.1</version>
</dependency>

<dependency>
    <groupId>org.locationtech.geomesa</groupId>
    <artifactId>geomesa-spark-geotools_2.11</artifactId>
    <version>1.3.1</version>
</dependency>

<dependency>
    <groupId>org.locationtech.geomesa</groupId>
    <artifactId>geomesa-spark-sql_2.11</artifactId>
    <version>1.3.1</version>
</dependency>

<dependency>
    <groupId>org.locationtech.geomesa</groupId>
    <artifactId>geomesa-accumulo-datastore_2.11</artifactId>
    <version>1.3.1</version>
</dependency>

<dependency>
    <groupId>org.locationtech.geomesa</groupId>
    <artifactId>geomesa-accumulo-spark_2.11</artifactId>
    <version>1.3.1</version>
</dependency>

<dependency>
    <groupId>org.locationtech.geomesa</groupId>
    <artifactId>geomesa-utils_2.11</artifactId>
    <version>1.3.1</version>
</dependency>

<dependency>
    <groupId>org.locationtech.geomesa</groupId>
    <artifactId>geomesa-index-api_2.11</artifactId>
    <version>1.3.1</version>
</dependency>

<dependency>
    <groupId>org.geotools</groupId>
    <artifactId>gt-main</artifactId>
    <version>16.1</version>
</dependency>


I have successfully built the jar, but when I launch it on my cluster I get the following error:

Exception in thread "main" java.util.ServiceConfigurationError: org.geotools.filter.expression.PropertyAccessorFactory: Provider org.locationtech.geomesa.convert.cql.ArrayPropertyAccessorFactoryorg.locationtech.geomesa.features.kryo.json.JsonPropertyAccessorFactory not found
        at java.util.ServiceLoader.fail(ServiceLoader.java:239)
        at java.util.ServiceLoader.access$300(ServiceLoader.java:185)
        at java.util.ServiceLoader$LazyIterator.nextService(ServiceLoader.java:372)
        at java.util.ServiceLoader$LazyIterator.next(ServiceLoader.java:404)
        at java.util.ServiceLoader$1.next(ServiceLoader.java:480)
        at org.geotools.filter.expression.PropertyAccessors.<clinit>(PropertyAccessors.java:51)
        at org.geotools.filter.AttributeExpressionImpl.evaluate(AttributeExpressionImpl.java:213)
        at org.geotools.filter.AttributeExpressionImpl.evaluate(AttributeExpressionImpl.java:189)
        at org.geotools.filter.FilterAttributeExtractor.visit(FilterAttributeExtractor.java:130)
        at org.geotools.filter.AttributeExpressionImpl.accept(AttributeExpressionImpl.java:340)
        at org.geotools.filter.visitor.DefaultFilterVisitor.visit(DefaultFilterVisitor.java:214)
        at org.geotools.filter.spatial.BBOXImpl.accept(BBOXImpl.java:224)
        at org.geotools.data.DataUtilities.propertyNames(DataUtilities.java:413)
        at org.locationtech.geomesa.filter.FilterHelper$.propertyNames(FilterHelper.scala:469)
        at org.locationtech.geomesa.filter.visitor.FilterExtractingVisitor.keep(FilterExtractingVisitor.scala:44)
        at org.locationtech.geomesa.filter.visitor.FilterExtractingVisitor.visit(FilterExtractingVisitor.scala:133)
        at org.geotools.filter.spatial.BBOXImpl.accept(BBOXImpl.java:224)
        at org.locationtech.geomesa.filter.visitor.FilterExtractingVisitor$.apply(FilterExtractingVisitor.scala:28)
        at org.locationtech.geomesa.index.strategies.SpatioTemporalFilterStrategy$class.getFilterStrategy(SpatioTemporalFilterStrategy.scala:37)
        at org.locationtech.geomesa.accumulo.index.Z3Index$.getFilterStrategy(Z3Index.scala:21)
        at org.locationtech.geomesa.index.api.FilterSplitter$$anonfun$5.apply(FilterSplitter.scala:122)
        at org.locationtech.geomesa.index.api.FilterSplitter$$anonfun$5.apply(FilterSplitter.scala:122)
        at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:241)
        at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:241)
        at scala.collection.immutable.List.foreach(List.scala:381)
        at scala.collection.TraversableLike$class.flatMap(TraversableLike.scala:241)
        at scala.collection.immutable.List.flatMap(List.scala:344)
        at org.locationtech.geomesa.index.api.FilterSplitter.org$locationtech$geomesa$index$api$FilterSplitter$$getSimpleQueryOptions(FilterSplitter.scala:122)
        at org.locationtech.geomesa.index.api.FilterSplitter.getQueryOptions(FilterSplitter.scala:104)
        at org.locationtech.geomesa.index.api.StrategyDecider$$anonfun$1.apply(StrategyDecider.scala:52)
        at org.locationtech.geomesa.index.api.StrategyDecider$$anonfun$1.apply(StrategyDecider.scala:52)
        at org.locationtech.geomesa.utils.stats.MethodProfiling$class.profile(MethodProfiling.scala:26)
        at org.locationtech.geomesa.index.api.StrategyDecider.profile(StrategyDecider.scala:18)
        at org.locationtech.geomesa.index.api.StrategyDecider.getFilterPlan(StrategyDecider.scala:52)
        at org.locationtech.geomesa.index.api.QueryPlanner$$anonfun$4.apply(QueryPlanner.scala:135)
        at org.locationtech.geomesa.index.api.QueryPlanner$$anonfun$4.apply(QueryPlanner.scala:114)
        at org.locationtech.geomesa.utils.stats.MethodProfiling$class.profile(MethodProfiling.scala:26)
        at org.locationtech.geomesa.index.api.QueryPlanner.profile(QueryPlanner.scala:43)
        at org.locationtech.geomesa.index.api.QueryPlanner.getQueryPlans(QueryPlanner.scala:114)
        at org.locationtech.geomesa.index.api.QueryPlanner.planQuery(QueryPlanner.scala:61)
        at org.locationtech.geomesa.index.geotools.GeoMesaDataStore.getQueryPlan(GeoMesaDataStore.scala:464)
        at org.locationtech.geomesa.accumulo.data.AccumuloDataStore.getQueryPlan(AccumuloDataStore.scala:108)
        at org.locationtech.geomesa.jobs.accumulo.AccumuloJobUtils$.getMultipleQueryPlan(AccumuloJobUtils.scala:117)
        at org.locationtech.geomesa.spark.accumulo.AccumuloSpatialRDDProvider.rdd(AccumuloSpatialRDDProvider.scala:107)
        at org.locationtech.geomesa.spark.api.java.JavaSpatialRDDProvider.rdd(JavaGeoMesaSpark.scala:37)
        at com.praxedo.geomesa.geomesa_spark.Test.main(Test.java:43)

Maybe a missing dependency? Any idea what the problem is?
Thanks for your time.

José.


_______________________________________________
geomesa-users mailing list
geomesa-users@xxxxxxxxxxxxxxxx
To change your delivery options, retrieve your password, or unsubscribe from this list, visit
https://dev.locationtech.org/mailman/listinfo/geomesa-users

