 Hi Ben,
 Our feature iterators are wrappers around Accumulo BatchScanners, so
    they do fetch data in a lazy fashion. Generally they will pre-fetch
    a chunk of data, then wait for the iterator to be consumed before
    fetching more. There are some Accumulo settings that you can use to
    tweak this, but I can't find them at the moment...
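
 As a rough illustration of the chunked fetching described above, here
    is a minimal single-threaded sketch in plain Java. It is not GeoMesa
    or Accumulo code: ChunkSource, ChunkedIterator and rangeSource are
    hypothetical names, and a real BatchScanner fetches with background
    threads rather than on demand. The sketch only shows that a chunk is
    fetched when the consumer needs it.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

public class LazyChunkDemo {

    /** Hypothetical stand-in for a remote scan: up to `size` rows starting at `offset`. */
    interface ChunkSource<T> {
        List<T> fetch(int offset, int size);
    }

    /** Holds at most one chunk in memory; fetches the next chunk only when the buffer is drained. */
    static final class ChunkedIterator<T> implements Iterator<T> {
        private final ChunkSource<T> source;
        private final int chunkSize;
        private List<T> buffer = new ArrayList<>();
        private int position = 0;          // index into the current buffer
        private int offset = 0;            // rows received from the source so far
        private boolean exhausted = false; // true once a short (or empty) chunk arrives
        int fetchCount = 0;                // number of calls made to the source

        ChunkedIterator(ChunkSource<T> source, int chunkSize) {
            this.source = source;
            this.chunkSize = chunkSize;
        }

        @Override
        public boolean hasNext() {
            if (position < buffer.size()) {
                return true;               // still consuming the current chunk
            }
            if (exhausted) {
                return false;
            }
            buffer = source.fetch(offset, chunkSize);
            fetchCount++;
            offset += buffer.size();
            position = 0;
            if (buffer.size() < chunkSize) {
                exhausted = true;          // a short chunk means nothing is left
            }
            return !buffer.isEmpty();
        }

        @Override
        public T next() {
            if (!hasNext()) {
                throw new NoSuchElementException();
            }
            return buffer.get(position++);
        }
    }

    /** Hypothetical source that serves the integers 0 .. total-1 in chunks. */
    static ChunkSource<Integer> rangeSource(int total) {
        return (offset, size) -> {
            List<Integer> chunk = new ArrayList<>();
            for (int i = offset; i < Math.min(offset + size, total); i++) {
                chunk.add(i);
            }
            return chunk;
        };
    }

    public static void main(String[] args) {
        ChunkedIterator<Integer> it = new ChunkedIterator<>(rangeSource(10), 4);
        it.next(); // reading one element triggers a single fetch
        System.out.println("fetches after 1 element: " + it.fetchCount);
        while (it.hasNext()) {
            it.next();
        }
        System.out.println("fetches after full traversal: " + it.fetchCount);
    }
}
```

 With 10 rows and a chunk size of 4, the source is called once for the
    first element and three times over the whole traversal, so memory use
    is bounded by the chunk size rather than the result size.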
 
 One very large caveat is if you are sorting your result sets. In
    order to sort, we have to load the entire data set into memory.
 
 I can't tell from your message: are you wrapping the feature
    iterators in some way? The standard
    dataStore.getFeatureReader(query, Transaction.AUTO_COMMIT) and
    dataStore.getFeatureSource.getFeatures.features both return the same
    lazy iterator.
 
 Thanks,
 
 Emilio
 
 
 On 01/11/2017 12:37 PM, Benjamin Weaver wrote:
 
      
      
      
 Hi all,
 We are returning large result sets from Accumulo (1.7.2) while
          running GeoMesa 1.2.1. Our system frequently runs slowly and
          locks up. We believe our problems are due to our use, in our
          queries, of the non-lazy FeatureIterator. The problem seems to
          be that our query returns the entire result set and loads it
          into the FeatureIterator before we traverse that
          FeatureIterator, converting features into protobuf and
          returning them to the client.
          
 
 Our NATS server does not appear to be the bottleneck. Our
          metrics show that the query itself takes a substantial
          portion of the time.
 Is there an asynchronous or lazy GeoMesa driver or iterator
          that can stream large result sets as they are returned from
          Accumulo? Is such asynchronous/lazy functionality available
          in GeoMesa 1.2.1 or any newer version of GeoMesa?
 Any perspective is greatly appreciated! 
 Ben
 
 