[geomesa-users] Re: Re: -EXT- problem with query data by spark
Hello:
In the scenario I described, the data should already exist on the server, as it can be read through DataStore.read().
I will continue to observe and test, and will contact you again if there are any new findings.
Thanks,
Mike
---- Original message ----
Hmm, GeoMesa uses the HBase MultiTableInputFormat for reading data from Spark. I don't see any indication that it wouldn't read entries in the memstore, although it's possible. Are you sure that the data has been flushed by your writer process to the region server?
HBase writers will cache data locally until they hit a threshold (size or time), so the data might not have actually been written yet in your testing.
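To make the buffering behavior described above concrete, here is a minimal toy sketch in Java. It is not the HBase or GeoMesa API: `ToyServer`, `ToyBufferedWriter`, and the threshold value are hypothetical names invented for illustration, and only a size threshold is modeled (no time-based flush). It only shows why rows held in a client-side buffer are invisible to any reader until the writer flushes or closes.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: these classes are hypothetical and are NOT the HBase or GeoMesa API.
class WriteBufferSketch {

    // Stands in for the region server: only rows that reach this list are visible to readers.
    static final class ToyServer {
        private final List<String> rows = new ArrayList<>();

        void accept(List<String> batch) {
            rows.addAll(batch);
        }

        int visibleRowCount() {
            return rows.size();
        }
    }

    // Stands in for a client-side writer that batches rows until a size threshold is reached.
    static final class ToyBufferedWriter implements AutoCloseable {
        private final ToyServer server;
        private final int flushThreshold;
        private final List<String> buffer = new ArrayList<>();

        ToyBufferedWriter(ToyServer server, int flushThreshold) {
            this.server = server;
            this.flushThreshold = flushThreshold;
        }

        // Rows are held locally; they are sent only once the buffer reaches the threshold.
        void write(String row) {
            buffer.add(row);
            if (buffer.size() >= flushThreshold) {
                flush();
            }
        }

        // Sends everything buffered so far to the server and empties the local buffer.
        void flush() {
            server.accept(new ArrayList<>(buffer));
            buffer.clear();
        }

        // Closing the writer flushes whatever is still buffered.
        @Override
        public void close() {
            flush();
        }
    }

    public static void main(String[] args) {
        ToyServer server = new ToyServer();
        ToyBufferedWriter writer = new ToyBufferedWriter(server, 100);
        writer.write("feature-1");
        writer.write("feature-2");
        System.out.println("visible before close: " + server.visibleRowCount());
        writer.close();
        System.out.println("visible after close: " + server.visibleRowCount());
    }
}
```

With a threshold of 100 and only two rows written, the toy server reports 0 visible rows before `close()` and 2 afterwards. In a real test, the equivalent check would be to make sure the writer has been closed or explicitly flushed before running the Spark query.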
Thanks,
Emilio Lahr-Vivaz
General Atomics, CCRi
From: geomesa-users <geomesa-users-bounces@xxxxxxxxxxx> on behalf of zhou lihuang via geomesa-users <geomesa-users@xxxxxxxxxxx>
Sent: Thursday, January 2, 2025 9:00 PM
To: Geomesa project user mailing list <geomesa-users@xxxxxxxxxxx>
Cc: zhou lihuang <zlh_0923@xxxxxxxxxxx>
Subject: [geomesa-users] 回复: -EXT- problem with query data by spark
WARNING: This message is from an external source. Evaluate the message carefully BEFORE clicking on links or opening attachments.
First of all, thank you for your reply.
I've found the possible reason after multiple tests:
When HBase stores only a small amount of data, the data exists only in the MemStore and hasn't been flushed to the StoreFile yet. In this case, when using Spark to query through spatialRDDProvider.rdd, no data will be obtained. However, once the data has been flushed to the StoreFile, the query results will be normal.
I'm not sure whether this is a bug in GeoMesa or not.
Best Regards,
Mike
From: geomesa-users <geomesa-users-bounces@xxxxxxxxxxx> on behalf of Lahr-Vivaz, Emilio via geomesa-users <geomesa-users@xxxxxxxxxxx>
Sent: January 2, 2025 21:33
To: Geomesa project user mailing list <geomesa-users@xxxxxxxxxxx>
Cc: Lahr-Vivaz, Emilio <Emilio.Lahr-Vivaz@xxxxxxxxxxx>
Subject: Re: [geomesa-users] -EXT- problem with query data by spark
Thanks,
Emilio Lahr-Vivaz
General Atomics, CCRi
From: geomesa-users <geomesa-users-bounces@xxxxxxxxxxx> on behalf of zhou lihuang via geomesa-users <geomesa-users@xxxxxxxxxxx>
Sent: Wednesday, December 25, 2024 4:31 AM
To: geomesa-users@xxxxxxxxxxx <geomesa-users@xxxxxxxxxxx>
Cc: zhou lihuang <zlh_0923@xxxxxxxxxxx>
Subject: -EXT-[geomesa-users] problem with query data by spark
Hello everyone:
I used the RDD provider to query data, but it returned 0 features (there should be 2).
When I query through a DataStore created by DataStoreFinder.getDataStore, it successfully returns the 2 features.
The code is as follows:
The environment is:
geomesa: 4.0.5
spark: 3.3.0
hbase: 2.2.0
I've tried changing the GeoMesa version and the dependency versions, but it didn't help.
How can I fix this problem now?
Thank you everyone.
Best Regards,
Mike