Geo_shape questions


gjdev
I posted a message yesterday, but somehow it didn't get on the list. Trying again....

I have some questions/remarks about the (incredibly useful) geo_shape type and filters.

- Are there plans to support "Pre-Indexed-Shapes" in documents as well, i.e. to reference a pre-indexed shape when indexing a new document instead of adding the geometry itself to that document? I would expect that in many use cases the same geometry will be indexed with multiple docs. Just as with filters/queries, performance could benefit quite a lot if the indexer could simply copy the hashes over from an already indexed geometry.

- Imho, allowing a geometry to be serialised as e.g. WKT would cut down not only the size of documents but also the work that elasticsearch needs to do to serialize/deserialize geometries. Polygons quickly become really big when expressed in JSON... Is this something that is being considered, and/or that would be accepted if provided in a decent pull request?

- A bit more documentation about how the combination of distance_error_pct and tree_levels affects the precision/results of filters would really be appreciated. From the docs and code I'm having a hard time understanding the consequences of altering both values on filters and indexes. What exactly does distance_error_pct do, and how does it affect e.g. an intersection filter?

- Quote from the docs: "Because of current limitations of the algorithm, very large indexed shapes are not deemed to intersect with very small filter shapes". Are there any plans to fix this? Assuming the algorithmic problem is that large shapes are only hashed up to a maximum depth, there are a couple of ways to fix this. E.g. the indexer could add an extra field containing only the hashes from the deepest hash level it uses for that geometry. The intersection filter could use this by extending the current filter (with a boolean OR) with a term filter on that field for all parents of the hashes it currently uses for searching. That way larger shapes that intersect will be included, and smaller shapes that only happen to share a parent won't be.

- The algorithm for "within" (in TermQueryPrefixStrategy) could be improved (imho). It's currently inconsistent for geometries that are equal to or just a tiny bit smaller than the filter geometry, and I think that could easily be fixed. I filed an issue about this yesterday, so I won't go further into it here, see https://github.com/elasticsearch/elasticsearch/issues/2552
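
To put a rough number on the WKT point above, here is a small Python sketch that serialises the same polygon ring once as a GeoJSON-style JSON string and once as WKT, and compares the lengths. It has nothing to do with elasticsearch itself: circle_ring, to_geojson and to_wkt are made-up helper names, and the ring is just a synthetic circle. JSON is written with compact separators so the comparison is not skewed by whitespace.

```python
import json
import math


def circle_ring(n, radius=1.0):
    """Closed ring of n + 1 (lon, lat) pairs approximating a circle (synthetic test data)."""
    pts = [
        (round(radius * math.cos(2 * math.pi * i / n), 6),
         round(radius * math.sin(2 * math.pi * i / n), 6))
        for i in range(n)
    ]
    return pts + [pts[0]]  # repeat the first point to close the ring


def to_geojson(ring):
    """Serialise the ring as a GeoJSON-style Polygon, without extra whitespace."""
    geometry = {"type": "Polygon", "coordinates": [[list(p) for p in ring]]}
    return json.dumps(geometry, separators=(",", ":"))


def to_wkt(ring):
    """Serialise the same ring as a WKT POLYGON."""
    return "POLYGON ((" + ", ".join(f"{x} {y}" for x, y in ring) + "))"


ring = circle_ring(1000)
geojson_size = len(to_geojson(ring))
wkt_size = len(to_wkt(ring))
print("GeoJSON chars:", geojson_size)
print("WKT chars:    ", wkt_size)
```

In this toy setup the saving is a fixed handful of characters per vertex (the brackets and commas), so how much it matters in practice depends on coordinate precision and on how the parsing cost compares, which this sketch does not measure.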
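
The extra-field idea from the "very large indexed shapes" point can be sketched in a few lines of Python. This is a toy model, not the actual elasticsearch/Lucene code: cells are plain quadtree-style strings, a doc is a dict with a "cells" field (every cell plus all its ancestors, which is what I assume the current strategy indexes) and a hypothetical "deepest" field (only the deepest cells used for that geometry). All function names are invented for illustration.

```python
def ancestors(cell):
    """All proper prefixes of a cell, e.g. '012' -> ['0', '01']."""
    return [cell[:i] for i in range(1, len(cell))]


def make_doc(leaf_cells):
    """Build the two term sets for a shape covered by leaf_cells."""
    cells = set()
    for c in leaf_cells:
        cells.add(c)
        cells.update(ancestors(c))
    # "deepest" is the proposed extra field: leaf cells only, no ancestors.
    return {"cells": cells, "deepest": set(leaf_cells)}


def intersects_current(query_cells, doc):
    """Assumed current behaviour: plain term match on the query's own cells."""
    return any(q in doc["cells"] for q in query_cells)


def intersects_proposed(query_cells, doc):
    """Current filter OR a term match of the query cells' parents on 'deepest'."""
    if intersects_current(query_cells, doc):
        return True
    parents = {a for q in query_cells for a in ancestors(q)}
    return bool(parents & doc["deepest"])


large = make_doc({"01"})        # big shape, hashed only to depth 2
small_far = make_doc({"0130"})  # small shape that merely shares parent '01'
small_hit = make_doc({"0122"})  # small shape in the same cell as the query
query = {"0122"}                # very small filter shape

for name, doc in [("large", large), ("small_far", small_far), ("small_hit", small_hit)]:
    print(name, intersects_current(query, doc), intersects_proposed(query, doc))
```

The point of keeping ancestors out of the "deepest" field is that the parent terms of the query can only hit shapes whose own hashing stopped at that coarser level, so small neighbours under the same parent are not pulled in.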

Thanks!

--