The problem remains.
I put the geomesa-accumulo-distributed-runtime jar in the main Accumulo lib directory:
[g.rinchin@netris-cassandra-stage60-04 lib]$ pwd
/opt/accumulo/lib
[g.rinchin@netris-cassandra-stage60-04 lib]$ ls | grep geomesa
geomesa-accumulo-distributed-runtime_2.12-3.2.2.jar
After this, the class org.locationtech.geomesa.accumulo.data.stats.StatsCombiner loads correctly:
root@accumulo> setiter -t examples.runners -p 10 -scan -minc -majc -n decStats -class org.locationtech.geomesa.accumulo.data.stats.StatsCombiner
Combiners apply reduce functions to multiple versions of values with otherwise equal keys
----------> set StatsCombiner parameter all, set to true to apply Combiner to every column, otherwise leave blank. if true, columns option will be ignored.:
I recreated the namespace like this:
root@accumulo> deletenamespace -f myNamespace
root@accumulo> createnamespace myNamespace
root@accumulo> grant NameSpace.CREATE_TABLE -ns myNamespace -u root
root@accumulo> config -ns myNamespace -s table.classpath.context=myNamespace
Then I ran the application, and it created the GeoMesa tables.
This is the config of my myNamespace.geomesa_stats table:
root@accumulo> config -t myNamespace.geomesa_stats
-----------+-------------------------------------------------------------+----------------------------------------------------------------
SCOPE      | NAME                                                        | VALUE
-----------+-------------------------------------------------------------+----------------------------------------------------------------
default    | table.balancer ............................................ | org.apache.accumulo.server.master.balancer.DefaultLoadBalancer
default    | table.bloom.enabled ....................................... | false
default    | table.bloom.error.rate .................................... | 0.5%
default    | table.bloom.hash.type ..................................... | murmur
default    | table.bloom.key.functor ................................... | org.apache.accumulo.core.file.keyfunctor.RowFunctor
default    | table.bloom.load.threshold ................................ | 1
default    | table.bloom.size .......................................... | 1048576
default    | table.cache.block.enable .................................. | false
default    | table.cache.index.enable .................................. | true
default    | table.classpath.context ................................... |
namespace  | @override .............................................. | myNamespace
default    | table.compaction.major.everything.idle .................... | 1h
default    | table.compaction.major.ratio .............................. | 3
default    | table.compaction.minor.idle ............................... | 5m
default    | table.compaction.minor.logs.threshold ..................... | 3
default    | table.compaction.minor.merge.file.size.max ................ | 0
table      | table.constraint.1 ........................................ | org.apache.accumulo.core.constraints.DefaultKeySizeConstraint
default    | table.durability .......................................... | sync
default    | table.failures.ignore ..................................... | false
default    | table.file.blocksize ...................................... | 0B
default    | table.file.compress.blocksize ............................. | 100K
default    | table.file.compress.blocksize.index ....................... | 128K
default    | table.file.compress.type .................................. | gz
default    | table.file.max ............................................ | 15
default    | table.file.replication .................................... | 0
default    | table.file.summary.maxSize ................................ | 256K
default    | table.file.type ........................................... | rf
default    | table.formatter ........................................... | org.apache.accumulo.core.util.format.DefaultFormatter
default    | table.groups.enabled ...................................... |
default    | table.interepreter ........................................ | org.apache.accumulo.core.util.interpret.DefaultScanInterpreter
table      | table.iterator.majc.stats-combiner ........................ | 10,org.locationtech.geomesa.accumulo.data.stats.StatsCombiner
table      | table.iterator.majc.stats-combiner.opt.all ................ | true
table      | table.iterator.majc.stats-combiner.opt.sep ................ | ~
table      | table.iterator.majc.stats-combiner.opt.sft-SignalBuilder .. | *geo:Point,time:Date,cam:String:keep-stats=true,imei:String,dir:Double,alt:Double,vlc:Double,sl:Integer,ds:Integer,dir_y:Double,poi_azimuth_x:Double,poi_azimuth_y:Double
table      | table.iterator.majc.vers .................................. | 20,org.apache.accumulo.core.iterators.user.VersioningIterator
table      | table.iterator.majc.vers.opt.maxVersions .................. | 1
table      | table.iterator.minc.stats-combiner ........................ | 10,org.locationtech.geomesa.accumulo.data.stats.StatsCombiner
table      | table.iterator.minc.stats-combiner.opt.all ................ | true
table      | table.iterator.minc.stats-combiner.opt.sep ................ | ~
table      | table.iterator.minc.stats-combiner.opt.sft-SignalBuilder .. | *geo:Point,time:Date,cam:String:keep-stats=true,imei:String,dir:Double,alt:Double,vlc:Double,sl:Integer,ds:Integer,dir_y:Double,poi_azimuth_x:Double,poi_azimuth_y:Double
table      | table.iterator.minc.vers .................................. | 20,org.apache.accumulo.core.iterators.user.VersioningIterator
table      | table.iterator.minc.vers.opt.maxVersions .................. | 1
table      | table.iterator.scan.stats-combiner ........................ | 10,org.locationtech.geomesa.accumulo.data.stats.StatsCombiner
table      | table.iterator.scan.stats-combiner.opt.all ................ | true
table      | table.iterator.scan.stats-combiner.opt.sep ................ | ~
table      | table.iterator.scan.stats-combiner.opt.sft-SignalBuilder .. | *geo:Point,time:Date,cam:String:keep-stats=true,imei:String,dir:Double,alt:Double,vlc:Double,sl:Integer,ds:Integer,dir_y:Double,poi_azimuth_x:Double,poi_azimuth_y:Double
table      | table.iterator.scan.vers .................................. | 20,org.apache.accumulo.core.iterators.user.VersioningIterator
table      | table.iterator.scan.vers.opt.maxVersions .................. | 1
default    | table.majc.compaction.strategy ............................ | org.apache.accumulo.tserver.compaction.DefaultCompactionStrategy
default    | table.replication ......................................... | false
default    | table.sampler ............................................. |
default    | table.scan.dispatcher ..................................... | org.apache.accumulo.core.spi.scan.SimpleScanDispatcher
default    | table.scan.max.memory ..................................... | 512K
default    | table.security.scan.visibility.default .................... |
default    | table.split.endrow.size.max ............................... | 10K
default    | table.split.threshold ..................................... | 1G
default    | table.suspend.duration .................................... | 0s
default    | table.walog.enabled ....................................... | true
-----------+-------------------------------------------------------------+----------------------------------------------------------------
root@accumulo>
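For reading the table.iterator.* rows in this config: each value has the form "priority,className", and Accumulo applies iterators with lower priority numbers first. The sketch below only splits those two values and compares them; the class name IteratorValue is hypothetical and not part of Accumulo or GeoMesa.

```java
import java.util.Objects;

/** Splits an Accumulo iterator property value of the form "priority,className". */
final class IteratorValue {
    final int priority;
    final String className;

    private IteratorValue(int priority, String className) {
        this.priority = priority;
        this.className = className;
    }

    /** Parses e.g. "10,org.example.MyIterator" into priority 10 and the class name. */
    static IteratorValue parse(String value) {
        Objects.requireNonNull(value, "value");
        int comma = value.indexOf(',');
        if (comma < 0) {
            throw new IllegalArgumentException("expected 'priority,className' but got: " + value);
        }
        int priority = Integer.parseInt(value.substring(0, comma).trim());
        String className = value.substring(comma + 1).trim();
        return new IteratorValue(priority, className);
    }

    public static void main(String[] args) {
        // Values copied from the config output of myNamespace.geomesa_stats.
        IteratorValue stats = parse("10,org.locationtech.geomesa.accumulo.data.stats.StatsCombiner");
        IteratorValue vers = parse("20,org.apache.accumulo.core.iterators.user.VersioningIterator");
        // Lower priority number means the iterator is applied earlier in the stack.
        System.out.println("StatsCombiner applied before VersioningIterator: "
                + (stats.priority < vers.priority));
    }
}
```

With the values shown (10 and 20), the StatsCombiner sits ahead of the VersioningIterator in all three scopes, so the iterator ordering itself looks as intended.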
But the statistics are still not gathered correctly for the first iteration.
I put 1000 geocoordinates; stats-count, first overall and then filtered by cam, returns:
~/bin/geomesa-accumulo_2.12-3.2.2/bin ./geomesa-accumulo stats-count -c myNamespace.geomesa -z 10.200.217.27 -i accumulo -u root -p qweasd123 -f SignalBuilder
Estimated count: 1000
~/bin/geomesa-accumulo_2.12-3.2.2/bin ./geomesa-accumulo stats-count -c myNamespace.geomesa -z 10.200.217.27 -i accumulo -u root -p qweasd123 -f SignalBuilder -q "cam='0000c1fe-a727-4a86-9eee-5b99d21038ea'"
Estimated count: 866
866 is the size of the last batch of events saved from the code.
Code log:
16.02.2022 12:16:21.199 INFO [pool-3-thread-4] r.netris.gps.sampler.GeoEventSampler - Saved 866 of 866 events
Events added after that are counted correctly in the stats:
~/bin/geomesa-accumulo_2.12-3.2.2/bin ./geomesa-accumulo stats-count -c myNamespace.geomesa -z 10.200.217.27 -i accumulo -u root -p qweasd123 -f SignalBuilder -q "cam='0000c1fe-a727-4a86-9eee-5b99d21038ea'"
Estimated count: 1866
~/bin/geomesa-accumulo_2.12-3.2.2/bin ./geomesa-accumulo stats-count -c myNamespace.geomesa -z 10.200.217.27 -i accumulo -u root -p qweasd123 -f SignalBuilder -q "cam='0000c1fe-a727-4a86-9eee-5b99d21038ea'"
Estimated count: 2866
~/bin/geomesa-accumulo_2.12-3.2.2/bin ./geomesa-accumulo stats-count -c myNamespace.geomesa -z 10.200.217.27 -i accumulo -u root -p qweasd123 -f SignalBuilder -q "cam='0000c1fe-a727-4a86-9eee-5b99d21038ea'"
Estimated count: 2866
~/bin/geomesa-accumulo_2.12-3.2.2/bin ./geomesa-accumulo stats-count -c myNamespace.geomesa -z 10.200.217.27 -i accumulo -u root -p qweasd123 -f SignalBuilder -q "cam='0000c1fe-a727-4a86-9eee-5b99d21038ea'"
Estimated count: 3866
~/bin/geomesa-accumulo_2.12-3.2.2/bin ./geomesa-accumulo stats-count -c myNamespace.geomesa -z 10.200.217.27 -i accumulo -u root -p qweasd123 -f SignalBuilder -q "cam='0000c1fe-a727-4a86-9eee-5b99d21038ea'"
Estimated count: 4866
~/bin/geomesa-accumulo_2.12-3.2.2/bin ./geomesa-accumulo stats-count -c myNamespace.geomesa -z 10.200.217.27 -i accumulo -u root -p qweasd123 -f SignalBuilder -q "cam='0000c1fe-a727-4a86-9eee-5b99d21038ea'"
Estimated count: 5866
~/bin/geomesa-accumulo_2.12-3.2.2/bin ./geomesa-accumulo stats-count -c myNamespace.geomesa -z 10.200.217.27 -i accumulo -u root -p qweasd123 -f SignalBuilder -q "cam='0000c1fe-a727-4a86-9eee-5b99d21038ea'"
Estimated count: 6866
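To make the pattern in these numbers explicit, here is a small arithmetic sketch. It assumes that every run wrote exactly 1000 features for this cam (I only stated that for the first run) and that the repeated 2866 was a second query with no write in between. The class name StatsDeficit is made up for the illustration.

```java
/** Compares the stats-count estimates from the post with the number of features written. */
final class StatsDeficit {
    /** Assumed number of features written per run for this cam. */
    static final int BATCH = 1000;

    /** Features written after the given number of runs minus the reported estimate. */
    static int deficit(int runs, int estimated) {
        return runs * BATCH - estimated;
    }

    public static void main(String[] args) {
        // Distinct estimates reported by stats-count, in order.
        int[] estimates = {866, 1866, 2866, 3866, 4866, 5866, 6866};
        for (int i = 0; i < estimates.length; i++) {
            int runs = i + 1;
            System.out.println("run " + runs + ": estimated=" + estimates[i]
                    + " missing=" + deficit(runs, estimates[i]));
        }
    }
}
```

Under those assumptions the missing amount is 134 on every line, so the loss happens once, during the first run, and nothing is lost afterwards.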
To work around this in code, I first write the first event on its own and then all the other events, something like this:
// Imports this method needs (GeoTools / OpenGIS):
import java.io.IOException;
import java.util.List;
import org.geotools.data.FeatureWriter;
import org.geotools.data.Transaction;
import org.geotools.util.factory.Hints;
import org.opengis.feature.simple.SimpleFeature;
import org.opengis.feature.simple.SimpleFeatureType;

private Integer writeDataInternalTest(List<GeoEvent> events) throws IOException {
    if (events == null || events.isEmpty()) {
        return 0;
    }
    int count = 0;
    // Note: remove(0) mutates the caller's list and requires a mutable List.
    GeoEvent firstEvent = events.remove(0);
    // First writer: write only the first event, then close the writer.
    try (FeatureWriter<SimpleFeatureType, SimpleFeature> writer = dataStore.getFeatureWriterAppend(
            SimpleFeatureUtils.TYPE.getTypeName(), Transaction.AUTO_COMMIT)) {
        SimpleFeature feature = SimpleFeatureUtils.toSimpleFeature(firstEvent);
        String eventId = feature.getID();
        if (!eventId.contains(firstEvent.getCam())) {
            log.info("Event id does not contain camId");
        }
        SimpleFeature toWrite = writer.next();
        toWrite.setAttributes(feature.getAttributes());
        // Keep the feature id generated by the application.
        toWrite.getUserData().put(Hints.PROVIDED_FID, eventId);
        toWrite.getUserData().putAll(feature.getUserData());
        writer.write();
        count++;
        log.info("Event id = {}, for event = {}", eventId, firstEvent);
    } catch (Exception e) {
        log.error("Geomesa write error", e);
    }
    // Second writer: write all remaining events.
    try (FeatureWriter<SimpleFeatureType, SimpleFeature> writer = dataStore.getFeatureWriterAppend(
            SimpleFeatureUtils.TYPE.getTypeName(), Transaction.AUTO_COMMIT)) {
        for (GeoEvent event : events) {
            SimpleFeature feature = SimpleFeatureUtils.toSimpleFeature(event);
            String eventId = feature.getID();
            if (!eventId.contains(event.getCam())) {
                log.info("Event id does not contain camId");
            }
            SimpleFeature toWrite = writer.next();
            toWrite.setAttributes(feature.getAttributes());
            toWrite.getUserData().put(Hints.PROVIDED_FID, eventId);
            toWrite.getUserData().putAll(feature.getUserData());
            writer.write();
            count++;
            log.info("Event id = {}, for event = {}", eventId, event);
        }
    } catch (Exception e) {
        log.error("Geomesa write error", e);
    }
    return count;
}
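The same "first event alone, then the rest" split can be written without removing anything from the input list. This is only a sketch: FirstThenRest and its writeBatch parameter are hypothetical, and writeBatch stands in for "open a feature writer, write every item, close the writer" so that the example runs without GeoMesa.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.function.ToIntFunction;

/** Writes the first item in its own batch and the remaining items in a second batch. */
final class FirstThenRest {
    /**
     * @param items      items to write; the list is not modified
     * @param writeBatch writes one batch and returns how many items it wrote
     * @return total number of items written
     */
    static <T> int write(List<T> items, ToIntFunction<List<T>> writeBatch) {
        if (items == null || items.isEmpty()) {
            return 0;
        }
        // First batch: only the first item.
        int count = writeBatch.applyAsInt(Collections.singletonList(items.get(0)));
        // Second batch: a view of everything after the first item.
        if (items.size() > 1) {
            count += writeBatch.applyAsInt(items.subList(1, items.size()));
        }
        return count;
    }

    public static void main(String[] args) {
        // Arrays.asList returns a fixed-size list, so remove(0) would throw here.
        List<String> events = Arrays.asList("e1", "e2", "e3");
        List<List<String>> batches = new ArrayList<>();
        int written = write(events, batch -> {
            batches.add(new ArrayList<>(batch)); // record what each "writer" received
            return batch.size();
        });
        // prints: 3 written in batches [[e1], [e2, e3]], input still has 3
        System.out.println(written + " written in batches " + batches
                + ", input still has " + events.size());
    }
}
```

This keeps the behaviour of the workaround (two separate writer sessions) while leaving the caller's list untouched and accepting read-only lists.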
With this change, the statistics after putting the first 1000 geoevents show 999:
~/bin/geomesa-accumulo_2.12-3.2.2/bin ./geomesa-accumulo stats-count -c myNamespace.geomesa -z 10.200.217.27 -i accumulo -u root -p qweasd123 -f SignalBuilder -q "cam='0000c1fe-a727-4a86-9eee-5b99d21038ea'"
Estimated count: 999
But if I run stats-analyze, it still resets the count by cam to 0:
~/bin/geomesa-accumulo_2.12-3.2.2/bin ./geomesa-accumulo stats-analyze -c myNamespace.geomesa -z 10.200.217.27 -i accumulo -u root -p qweasd123 -f SignalBuilder
INFO Running stat analysis for feature type SignalBuilder...
INFO Stats analyzed:
Total features: 1000
Bounds for geo: [ 37.598174, 55.736823, 37.681424, 55.820073 ] cardinality: 981
Bounds for time: [ 2022-02-27T08:26:42.000Z to 2022-02-27T09:00:00.000Z ] cardinality: 973
Bounds for cam: [ 0000c1fe-a727-4a86-9eee-5b99d21038ea to 0000c1fe-a727-4a86-9eee-5b99d21038ea ] cardinality: 1
INFO Use 'stats-histogram', 'stats-top-k' or 'stats-count' commands for more details
~/bin/geomesa-accumulo_2.12-3.2.2/bin ./geomesa-accumulo stats-count -c myNamespace.geomesa -z 10.200.217.27 -i accumulo -u root -p qweasd123 -f SignalBuilder -q "cam='0000c1fe-a727-4a86-9eee-5b99d21038ea'"
Estimated count: 0
Thanks.