Skip to main content

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] [List Home]
Re: [geomesa-users] problem with stats


Are you sure that you're writing distinct cam values for each feature? You could try running:

./geomesa-accumulo export -c myNamespace.geomesa -z -i accumulo -u root -p qweasd123 -f SignalBuilder --attributes cam --no-header | sort -u

and see how many unique cam values come back that way.



On 2/15/22 7:32 AM, Rinchin Gomboev wrote:
Hello, everyone.

I try to write an application using geomesa with accumulo.
I have a problem stats not gathered.
I have empty namespace in accumulo. Create a schema like from java code:
// build the type
private static SimpleFeatureType makeSFT(SimpleFeatureTypeBuilder builder) {
  SimpleFeatureType sft = builder.buildFeatureType();
                        "time(30 days)"); // Age-off filter by "time" field
  sft.getUserData().put("geomesa.indices.enabled", "z3,z2,attr:time,attr:cam:time");

  sft.getUserData().put("geomesa.z3.interval", "week");
  sft.getUserData().put("geomesa.table.partition", "time");
  sft.getUserData().put("geomesa.index.dtg", "time");
  sft.getUserData().put("geomesa.z.splits", "4");
  sft.getUserData().put("geomesa.attr.splits", "4");

  sft.getDescriptor("cam").getUserData().put("index", "true");
  sft.getDescriptor("time").getUserData().put("index", "true");
  sft.getDescriptor("geo").getUserData().put("index", "true");

  sft.getDescriptor("cam").getUserData().put("keep-stats", "true");

  return sft;

 ~/bin/geomesa-accumulo_2.12-3.2.2/bin  ./geomesa-accumulo describe-schema -c myNamespace.geomesa -z -i accumulo -u root -p qweasd123 -f SignalBuilder                                            
INFO  Describing attributes of feature 'SignalBuilder'
geo           | Point   (Spatio-temporally indexed) (Spatially indexed)
time          | Date    (Spatio-temporally indexed) (Attribute indexed)
cam           | String  (Attribute indexed)
imei          | String  
dir           | Double  
alt           | Double  
vlc           | Double  
sl            | Integer
ds            | Integer
dir_y         | Double  
poi_azimuth_x | Double  
poi_azimuth_y | Double  

User data:
  geomesa.attr.splits     | 4
  geomesa.feature.expiry  | time(30 days)
  geomesa.index.dtg       | time
  geomesa.indices         | z3:7:3:geo:time,z2:5:3:geo,attr:8:3:time,attr:8:3:cam:time
  geomesa.stats.enable    | true
  geomesa.table.partition | time
  geomesa.z.splits        | 4
  geomesa.z3.interval     | week

And put 1000 geocoordinates like this
  private Integer writeDataInternal(List<GeoEvent> events) throws IOException {

    if (events == null || events.isEmpty()) {
      return 0;

    int count = 0;

    //запись в geomesa
    try (FeatureWriter<SimpleFeatureType, SimpleFeature> writer = dataStore.getFeatureWriterAppend(
        SimpleFeatureUtils.TYPE.getTypeName(), Transaction.AUTO_COMMIT)) {

      for (GeoEvent event : events) {
        SimpleFeature feature = SimpleFeatureUtils.toSimpleFeature(event);
        String event_id = feature.getID();
        if (!event_id.contains(event.getCam())) {
"event not contain camId");
        SimpleFeature toWrite =;
        toWrite.getUserData().put(Hints.PROVIDED_FID, event_id);

        count++;"Event id = {}, for event = {}", event_id, event);

    } catch (Exception e) {
      log.error("Geomesa write error", e);
    return count;

the result

 ~/bin/geomesa-accumulo_2.12-3.2.2/bin  ./geomesa-accumulo stats-count -c myNamespace.geomesa -z -i accumulo -u root -p qweasd123 -f SignalBuilder                                                
Estimated count: 1000
 ~/bin/geomesa-accumulo_2.12-3.2.2/bin  ./geomesa-accumulo stats-count -c myNamespace.geomesa -z -i accumulo -u root -p qweasd123 -f SignalBuilder -q "cam='0000c1fe-a727-4a86-9eee-5b99d21038ea'"
Estimated count: 950
 ~/bin/geomesa-accumulo_2.12-3.2.2/bin                                                                                                                                                                           
 ~/bin/geomesa-accumulo_2.12-3.2.2/bin 
 ~/bin/geomesa-accumulo_2.12-3.2.2/bin 

after analyze it removes all statistics for that cam  
 ~/bin/geomesa-accumulo_2.12-3.2.2/bin  ./geomesa-accumulo stats-analyze -c myNamespace.geomesa -z -i accumulo -u root -p qweasd123 -f SignalBuilder                                              
INFO  Running stat analysis for feature type SignalBuilder...
INFO  Stats analyzed:
  Total features: 1000
  Bounds for geo: [ 37.598174, 55.736823, 37.681424, 55.820073 ] cardinality: 981
  Bounds for time: [ 2022-02-22T11:46:42.000Z to 2022-02-22T12:20:00.000Z ] cardinality: 957
  Bounds for cam: [ 0000c1fe-a727-4a86-9eee-5b99d21038ea to 0000c1fe-a727-4a86-9eee-5b99d21038ea ] cardinality: 1
INFO  Use 'stats-histogram', 'stats-top-k' or 'stats-count' commands for more details
 ~/bin/geomesa-accumulo_2.12-3.2.2/bin  ./geomesa-accumulo stats-count -c myNamespace.geomesa -z -i accumulo -u root -p qweasd123 -f SignalBuilder -q "cam='0000c1fe-a727-4a86-9eee-5b99d21038ea'"
Estimated count: 0

Maybe the reason is on accumulo server there is no spark?

How to get statistics? Thank you

Rinchin Gomboev

geomesa-users mailing list
To unsubscribe from this list, visit

Back to the top