Re: [geomesa-users] GeoMesa Docker EMR - Jupyter Notebook help

Byron,

What AWS region are you using?  We may need to replicate the bootstrap
script to each region in S3.  (Though I'd be surprised you got as far as you
did if you didn't have access to the bucket.)

Thanks,
Anthony
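
One way to answer the region question is to ask S3 where the bucket lives and compare that with the region the EMR cluster runs in. What follows is only a sketch: the helper names bucket_region and same_region are made up for illustration, and it assumes your credentials are allowed to call get_bucket_location on the bucket, which being able to list it does not guarantee.

```python
def bucket_region(s3_client, bucket):
    """Return the name of the region that hosts `bucket`.

    `s3_client` is expected to behave like boto3.client("s3").
    """
    location = s3_client.get_bucket_location(Bucket=bucket)["LocationConstraint"]
    if location is None:
        # S3 reports buckets in us-east-1 with a null LocationConstraint.
        return "us-east-1"
    if location == "EU":
        # Legacy value that S3 may still return for eu-west-1 buckets.
        return "eu-west-1"
    return location


def same_region(s3_client, bucket, cluster_region):
    """True if the bucket is hosted in the same region as the cluster."""
    return bucket_region(s3_client, bucket) == cluster_region
```

With boto3 installed, this could be called as same_region(boto3.client("s3"), "geomesa-docker", "us-east-1"), substituting the cluster's actual region for the last argument.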

Byron Chigoy <bchigoy@xxxxxxxxxxx> writes:

> Thanks Jason -
> That did not work. We checked with 'docker exec -t -i accumulo-master find /opt -name *.jar', and the file names there match the file names in geomesa_spark_scala/kernel.json. We are wondering about the appended -SNAPSHOT (my .jar files vs. yours).
>
> In order to get GeoMesa (at least ingestion) and GeoServer to work, we had to adjust the bootstrap script. Perhaps that is where we went wrong? This was due to the following:
>
> 1. Access error to s3://geomesa-docker/bootstrap-geodocker-accumulo.sh. We can list the bucket contents via boto and CLI ls:
>     import boto3
>     s3 = boto3.resource('s3')
>     my_bucket = s3.Bucket('geomesa-docker')
>     for obj in my_bucket.objects.all():
>         print(obj.key)
> But we are unable to grab the file via CLI cp or urllib.request.urlretrieve.
>
> 2. We adapted a copy from geowave-geomesa-comparative-analysis/analyze/bootstrap-geodocker-accumulo.sh. The relative changes were:
>
> IMAGE=quay.io/geomesa/accumulo-geomesa:latest
> vs
> IMAGE=quay.io/geodocker/accumulo:${TAG:-"latest"}
>
> AND
>
> DOCKER_OPT="-d --net=host --restart=always"
> if is_master ; then
>   docker pull $IMAGE
>   docker pull quay.io/geomesa/geoserver:latest
>   docker pull quay.io/geomesa/geomesa-jupyter:latest
>   docker run $DOCKER_OPT --name=accumulo-master $DOCKER_ENV $IMAGE master --auto-init
>   docker run $DOCKER_OPT --name=accumulo-monitor $DOCKER_ENV $IMAGE monitor
>   docker run $DOCKER_OPT --name=accumulo-tracer $DOCKER_ENV $IMAGE tracer
>   docker run $DOCKER_OPT --name=accumulo-gc $DOCKER_ENV $IMAGE gc
>   docker run $DOCKER_OPT --name=geoserver quay.io/geomesa/geoserver:latest
>   docker run $DOCKER_OPT --name=jupyter quay.io/geomesa/geomesa-jupyter:latest
> else # is worker
>   docker pull $IMAGE
>   docker run -d --net=host --name=accumulo-tserver $DOCKER_ENV $IMAGE tserver
> fi
>
> Versus
>
> DOCKER_OPT="-d --net=host --restart=always"
> if is_master ; then
>   docker run $DOCKER_OPT --name=accumulo-master $DOCKER_ENV $IMAGE master --auto-init
>   docker run $DOCKER_OPT --name=accumulo-monitor $DOCKER_ENV $IMAGE monitor
>   docker run $DOCKER_OPT --name=accumulo-tracer $DOCKER_ENV $IMAGE tracer
>   docker run $DOCKER_OPT --name=accumulo-gc $DOCKER_ENV $IMAGE gc
>   docker run $DOCKER_OPT --name=geoserver quay.io/geodocker/geoserver:latest
> else # is worker
>   docker run -d --net=host --name=accumulo-tserver $DOCKER_ENV $IMAGE tserver
> fi
>
> 3. Bootstrap config changes were: -i=quay.io/geomesa/accumulo-geomesa:latest, -n=gis, -p=secret, -e=TSERVER_XMX=10G, -e=TSERVER_CACHE_DATA_SIZE=6G, -e=TSERVER_CACHE_INDEX_SIZE=2G
>
> 4. The error from Jupyter startup read (IPs replaced with <IP>):
>
> bash: warning: setlocale: LC_ALL: cannot change locale (en_US.UTF-8)
> [I <IP> NotebookApp] Kernel started: 83a4cb2d-8004-4c69-ad21-ad46ac2b4a48
> Starting Spark Kernel with SPARK_HOME=/usr/local/spark
> bash: warning: setlocale: LC_ALL: cannot change locale (en_US.UTF-8)
> bash: warning: setlocale: LC_ALL: cannot change locale (en_US.UTF-8)
> (Scala,org.apache.toree.kernel.interpreter.scala.ScalaInterpreter@1bc715b8)
> (PySpark,org.apache.toree.kernel.interpreter.pyspark.PySparkInterpreter@292d1c71)
> (SparkR,org.apache.toree.kernel.interpreter.sparkr.SparkRInterpreter@2b491fee)
> (SQL,org.apache.toree.kernel.interpreter.sql.SqlInterpreter@3f1c5af9)
> 17/02/15 15:44:00 WARN toree.Main$$anon$1: No external magics provided to PluginManager!
> 17/02/15 15:44:04 WARN layer.StandardComponentInitialization$$anon$1: Locked to Scala interpreter with SparkIMain until decoupled!
> 17/02/15 15:44:04 WARN layer.StandardComponentInitialization$$anon$1: Unable to control initialization of REPL class server!
> [W 15:44:04.777 NotebookApp] Notebook GDELT+Analysis.ipynb is not trusted
> [W 15:44:04.803 NotebookApp] 404 GET /nbextensions/widgets/notebook/js/extension.js?v=20170215154320 (<IP>) 2.94ms referer=http://ec2<IP>.compute-1.amazonaws.com:8890/notebooks/GDELT%2BAnalysis.ipynb
> 17/02/15 15:44:04 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
> [W 15:44:06.577 NotebookApp] Timeout waiting for kernel_info reply from 83a4cb2d-8004-4c69-ad21-ad46ac2b4a48
> 17/02/15 15:44:06 WARN yarn.Client: Neither spark.yarn.jars nor spark.yarn.archive is set, falling back to uploading libraries under SPARK_HOME.
> 17/02/15 15:44:13 ERROR spark.SparkContext: Error initializing SparkContext.
> org.apache.spark.SparkException: Yarn application has already ended! It might have been killed or unable to launch application master.
>
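
Most of the output above is WARN noise; the lines that matter are the ERROR and the exception that follows it. A minimal sketch for pulling just those out of a captured log, assuming the Spark/Hadoop lines keep the "yy/MM/dd HH:mm:ss LEVEL" prefix shown above and the notebook server lines start with a bracketed level letter (the function name error_blocks is hypothetical):

```python
import re

# Spark/Hadoop prefix as seen above, e.g. "17/02/15 15:44:13 ERROR ..."
LOG4J_LINE = re.compile(r"^\d{2}/\d{2}/\d{2} \d{2}:\d{2}:\d{2} (?P<level>[A-Z]+) ")
# Notebook server prefix as seen above, e.g. "[W 15:44:06.577 NotebookApp] ..."
NOTEBOOK_LINE = re.compile(r"^\[[A-Z] ")


def error_blocks(lines):
    """Return each ERROR line together with the unprefixed lines after it."""
    blocks = []
    current = None
    for raw in lines:
        line = raw.rstrip("\n")
        match = LOG4J_LINE.match(line)
        if match:
            if match.group("level") == "ERROR":
                # Start a new block at every ERROR line.
                current = [line]
                blocks.append(current)
            else:
                # Any other prefixed line ends the current block.
                current = None
        elif NOTEBOOK_LINE.match(line):
            current = None
        elif current is not None and line.strip():
            # Unprefixed text (e.g. the exception message) continues the block.
            current.append(line)
    return blocks
```

Fed the log above with the leading "> " quoting removed, this would leave only the "Error initializing SparkContext" line and the SparkException beneath it.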
> Any feedback is greatly appreciated.
> Byron, Texas A&M Transportation Institute
>
>
>
>
>
> From: geomesa-users-bounces@xxxxxxxxxxxxxxxx <geomesa-users-bounces@xxxxxxxxxxxxxxxx> on behalf of Jim Hughes <jnh5y@xxxxxxxx>
> Sent: Tuesday, February 14, 2017 4:15 PM
> To: geomesa-users@xxxxxxxxxxxxxxxx
> Subject: Re: [geomesa-users] GeoMesa Docker EMR - Jupyter Notebook help
>
> Hi Byron,
>
> As it happens, I'm setting up a GeoMesa demo, and I have a quick fix.
>
> You'll want to connect to the Jupyter docker (say, with 'docker exec -it jupyter /bin/sh'), and edit this file: /var/lib/hadoop-hdfs/.local/share/jupyter/kernels/geomesa_spark_scala/kernel.json.
>
> The line with the Toree Spark opts should read...
>
>  "__TOREE_SPARK_OPTS__": "--driver-java-options=-Xmx4096M --driver-java-options=-Dlog4j.logLevel=info --master yarn --jars  file:///opt/geomesa/dist/spark/geomesa-accumulo-spark-runtime_2.11-1.3.0.jar,file:///opt/geomesa/dist/spark/geomesa-spark-converter_2.11-1.3.0.jar,file:///opt/geomesa/dist/spark/geomesa-spark-geotools_2.11-1.3.0.jar",
>
> One of the jars changed names (from geomesa-accumulo-spark_2.11-1.3.0-shaded.jar to geomesa-accumulo-spark-runtime_2.11-1.3.0.jar). That difference caused the issues; I need to sort out re-building the Docker images.
>
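
A quick way to catch this kind of rename is to read the jar list out of kernel.json and check each path on disk. The sketch below assumes the option string lives under the "env" key of the kernel spec (usual for Toree kernels, though not confirmed in this thread) and that it is run inside the Jupyter container; the function names are illustrative only.

```python
import json
import os
import shlex


def jars_from_kernel_spec(kernel_json_path):
    """Return the entries passed to --jars in __TOREE_SPARK_OPTS__."""
    with open(kernel_json_path) as handle:
        spec = json.load(handle)
    # Assumption: the option string sits under "env" in the kernel spec.
    opts = spec.get("env", {}).get("__TOREE_SPARK_OPTS__", "")
    tokens = shlex.split(opts)
    jars = []
    for index, token in enumerate(tokens):
        if token == "--jars" and index + 1 < len(tokens):
            # "--jars a.jar,b.jar" form: the list is the next token.
            jars.extend(tokens[index + 1].split(","))
        elif token.startswith("--jars="):
            # "--jars=a.jar,b.jar" form.
            jars.extend(token[len("--jars="):].split(","))
    return [jar for jar in jars if jar]


def missing_jars(jars):
    """Return the local paths from `jars` that do not exist on disk."""
    missing = []
    for jar in jars:
        # Strip the file:// scheme so file:///opt/x.jar becomes /opt/x.jar.
        path = jar[len("file://"):] if jar.startswith("file://") else jar
        if not os.path.exists(path):
            missing.append(path)
    return missing
```

Calling missing_jars(jars_from_kernel_spec(path)) with the kernel.json path given above should return an empty list once the file names line up; any path it returns is a jar that Spark will not find.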
> Let me know if that doesn't sort it out!
>
> Cheers,
>
> Jim
>
>
> On 02/14/2017 04:07 PM, Byron Chigoy wrote:
>
> Hi - probably pretty basic, but we are able to get the Docker Bootstrap tutorial working on AWS. We are pulling from https://quay.io/organization/geomesa.  Once started we can ingest the GDELT example and get the descriptive. We are also able to bring the GDELT example into GeoServer.
>
> However, while Jupyter gets docked, the GeoMesa Spark - Scala kernel fails (it just says kernel busy). We started the notebook on another port to see the error behavior and got a list of errors. See below; any help or clues would be most appreciated.
>
>
>
>
> (Scala,org.apache.toree.kernel.interpreter.scala.ScalaInterpreter@5ef0d29e)
> (PySpark,org.apache.toree.kernel.interpreter.pyspark.PySparkInterpreter@38f57b3d)
> (SparkR,org.apache.toree.kernel.interpreter.sparkr.SparkRInterpreter@51850751)
> (SQL,org.apache.toree.kernel.interpreter.sql.SqlInterpreter@3ce3db41)
> 17/02/14 20:50:29 WARN toree.Main$$anon$1: No external magics provided to PluginManager!
> 17/02/14 20:50:32 WARN layer.StandardComponentInitialization$$anon$1: Locked to Scala interpreter with SparkIMain until decoupled!
> 17/02/14 20:50:32 WARN layer.StandardComponentInitialization$$anon$1: Unable to control initialization of REPL class server!
> 17/02/14 20:50:33 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
> [W 20:50:34.769 NotebookApp] Timeout waiting for kernel_info reply from fe9c2776-f5d7-47bc-b5dd-d2769f631f2f
> 17/02/14 20:50:35 WARN yarn.Client: Neither spark.yarn.jars nor spark.yarn.archive is set, falling back to uploading libraries under SPARK_HOME.
> 17/02/14 20:50:42 ERROR spark.SparkContext: Error initializing SparkContext.
> org.apache.spark.SparkException: Yarn application has already ended! It might have been killed or unable to launch application master.
>
>
> Byron
>
>
>
> _______________________________________________ geomesa-users mailing list geomesa-users@xxxxxxxxxxxxxxxx To change your delivery options, retrieve your password, or unsubscribe from this list, visit https://dev.locationtech.org/mailman/listinfo/geomesa-users

