Re: [geomesa-users] About performance on writing data to HBase

Hello,

I can't really say how long I would expect it to take. A lot of factors affect it, including the size and hardware of your HBase cluster, your splitting-related configuration, how many indices you are creating, your data locality, your backing storage speeds, and any other concurrent load on your cluster.
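For a rough sense of scale, the reported 1 TB in 10 hours can be turned into an average write rate. This is only a back-of-the-envelope sketch: it assumes 1 TB means 10^12 bytes and that the rate was constant over the run, neither of which is stated in the original message. The function name is made up for illustration.

```python
def mean_throughput_mb_per_s(total_bytes: float, hours: float) -> float:
    """Average write rate in MB/s (1 MB = 10**6 bytes), assuming a constant rate."""
    seconds = hours * 3600
    return total_bytes / seconds / 1e6

# Assumption: "1 TB" is 10**12 bytes; the run took 10 hours.
rate = mean_throughput_mb_per_s(1e12, 10)
print(f"{rate:.1f} MB/s")  # 27.8 MB/s
```

That figure is for the whole cluster, so dividing it by the number of region servers gives a per-node rate to compare against your disk and network speeds.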

Some things that may help:

* Pre-splitting your tables[1]
* Disabling any indices you don't need[2]
* Enabling table compression[3]
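As a rough illustration, all three of these are set as user-data hints on the feature type before the schema is created. The sketch below only builds a spec string with such hints appended; it does not call GeoMesa. The user-data keys are written as I recall them from the linked documentation pages, so please check them against your GeoMesa version. The attribute names, the choice of indices, the date range, the bit count, and the helper name sft_spec are all hypothetical.

```python
def sft_spec(attributes: str, user_data: dict) -> str:
    """Append key='value' user-data hints to an attribute spec, separated by ';'."""
    hints = ",".join(f"{key}='{value}'" for key, value in user_data.items())
    return f"{attributes};{hints}" if hints else attributes

# Hypothetical values; the keys should be verified against the linked docs.
user_data = {
    # only create the indices you actually query with [2]
    "geomesa.indices.enabled": "z3,id",
    # pre-split the z3 table over the expected date range [1]
    "table.splitter.options": "z3.min:2020-01-01,z3.max:2020-03-31,z3.bits:4",
    # compress the HBase files [3]
    "geomesa.table.compression.enabled": "true",
    "geomesa.table.compression.type": "snappy",
}

spec = sft_spec("name:String,dtg:Date,*geom:Point:srid=4326", user_data)
print(spec)
```

These hints generally take effect when the schema is created, so they need to be in place before the ingest starts.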

The fastest approach will usually be to ingest offline[4] and then bulk import[5] the resulting files, instead of writing through Spark.

Thanks,

Emilio

[1]: https://www.geomesa.org/documentation/user/datastores/index_config.html#configuring-index-splits
[2]: https://www.geomesa.org/documentation/user/datastores/index_config.html#customizing-index-creation
[3]: https://www.geomesa.org/documentation/user/hbase/index_config.html#setting-file-compression
[4]: https://www.geomesa.org/documentation/user/hbase/commandline.html#bulk-ingest
[5]: https://www.geomesa.org/documentation/user/hbase/commandline.html#bulk-load

On 3/18/20 11:03 PM, Yifan Wang wrote:
Hi, 

I tried to write 1 TB of data to HBase through GeoMesa. A snapshot of the resources is as follows:
[image: image.png]
It took 10 hours to finish the whole writing process. I'm wondering whether 10 hours is normal in this situation, or whether there is anything I can do to improve the efficiency. Thanks!

Best Regards,
Evan

_______________________________________________
geomesa-users mailing list
geomesa-users@xxxxxxxxxxx
To unsubscribe from this list, visit https://dev.eclipse.org/mailman/listinfo/geomesa-users

