Skip to main content

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] [List Home]
Re: [geomesa-users] GeoMesa Kryo serialization failed for SimpleFeature with certain length

Thanks for finding the issue, and taking the time to investigate it so thoroughly! Your solution seems correct, would you be willing to put it up as a PR? If you can share the values in the simple feature you were serializing that caused the error, we could also add a unit test to verify the fix.

Thanks,

Emilio

On 10/14/20 6:09 PM, Jun Cai wrote:
Hi All,

I am seeing serialization failure while writing features of certain length with GeoMesaFeatureWriter. Here is part of the stack trace:
Suppressed: java.lang.IllegalArgumentException: Error indexing feature '1804882973#10760608:MULTIPOLYGON
...
    at org.locationtech.geomesa.index.geotools.GeoMesaFeatureWriter$class.writeFeature(GeoMesaFeatureWriter.scala:55) ~[geomesa-hbase-spark-runtime_custom_2.11-2.4.1-20200904.jar:?]
    at org.locationtech.geomesa.index.geotools.GeoMesaFeatureWriter$TableFeatureWriter.writeFeature(GeoMesaFeatureWriter.scala:141) ~[geomesa-hbase-spark-runtime_custom_2.11-2.4.1-20200904.jar:?]
    at org.locationtech.geomesa.index.geotools.GeoMesaFeatureWriter$GeoMesaAppendFeatureWriter$class.write(GeoMesaFeatureWriter.scala:227) ~[geomesa-hbase-spark-runtime_custom_2.11-2.4.1-20200904.jar:?]
    at org.locationtech.geomesa.index.geotools.GeoMesaFeatureWriter$$anon$3.write(GeoMesaFeatureWriter.scala:108) ~[geomesa-hbase-spark-runtime_custom_2.11-2.4.1-20200904.jar:?]
    at org.locationtech.geomesa.utils.geotools.FeatureUtils$.write(FeatureUtils.scala:141) ~[geomesa-hbase-spark-runtime_custom_2.11-2.4.1-20200904.jar:?]
...
Caused by: java.lang.ArrayIndexOutOfBoundsException: 131072
    at org.locationtech.geomesa.features.kryo.impl.KryoFeatureSerialization$class.writeFeature(KryoFeatureSerialization.scala:91) ~[geomesa-hbase-spark-runtime_custom_2.11-2.4.1-20200904.jar:?]
    at org.locationtech.geomesa.features.kryo.impl.KryoFeatureSerialization$class.serialize(KryoFeatureSerialization.scala:42) ~[geomesa-hbase-spark-runtime_custom_2.11-2.4.1-20200904.jar:?]
    at org.locationtech.geomesa.features.kryo.KryoFeatureSerializer$MutableActiveSerializer.serialize(KryoFeatureSerializer.scala:78) ~[geomesa-hbase-spark-runtime_custom_2.11-2.4.1-20200904.jar:?]
    at org.locationtech.geomesa.index.api.WritableFeature$FeatureLevelWritableFeature$$anonfun$values$1$$anonfun$apply$1.apply(WritableFeature.scala:154) ~[geomesa-hbase-spark-runtime_custom_2.11-2.4.1-20200904.jar:?]
    at org.locationtech.geomesa.index.api.WritableFeature$FeatureLevelWritableFeature$$anonfun$values$1$$anonfun$apply$1.apply(WritableFeature.scala:154) ~[geomesa-hbase-spark-runtime_custom_2.11-2.4.1-20200904.jar:?]
...
Root Cause
With some debugging, I found the issue was in KryoFeatureSerialization at line 89 to 91:
var i = end
while (i > offset) {
buffer(i + shift) = buffer(i)
where it moves content in Kryo Output buffer by <shift> bytes starting from end position of current feature content which is acquired from line 78:
val end = output.position()
The output.position() here is the number of bytes that have not been flushed. In the output buffer, the zero-based index of the last byte is actually (end - 1). 
This means the code copies one more byte than needed and it becomes an issue when 
output.getBuffer.length = end + shift
which returns false for the buffer expansion check at line 82 
if (output.getBuffer.length < end + shift) {
and causes the ArrayIndexOutOfBoundsException when accessing buffer(i + shift).

An example from my use case, the feature content length is 131057 (end = 131058), shift is 14, output.getBuffer.length is 131072. When serializing this feature, 
it skipped the buffer expansion and tried to access buffer(131058 + 14), which is buffer(131072), and caused the ArrayIndexOutOfBoundsException. 
Proposed Fix
Change line 89 from
var i = end
to
var i = end - 1

          
Per my understanding of the code and local testing, this change shouldn't cause compatibility issue in feature serialization/deserialization. But would like to get it reviewed by more folks.

Regards,
Jun Cai


Back to the top