Hamming Distance on Binary strings



anahap
Hi there,

Does anyone know the best way to store fixed-length binary data and query it, scoring the results by Hamming distance?
A Hamming distance filter with a threshold would also be OK.


Thanks a lot. This would be useful for all kinds of similarity searches based on fingerprinting algorithms.
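
To make the request concrete, here is a minimal Python sketch of the operation being asked about: Hamming distance between fixed-length fingerprints, plus a threshold filter that ranks the survivors. It is a brute-force illustration outside Elasticsearch, not a description of any built-in feature. The names `hamming_distance`, `within_threshold`, and the sample fingerprints are hypothetical.

```python
def hamming_distance(a: bytes, b: bytes) -> int:
    """Count the bit positions in which two equal-length byte strings differ."""
    if len(a) != len(b):
        raise ValueError("fingerprints must have the same length")
    # XOR leaves a 1 bit wherever the inputs differ; count those bits.
    diff = int.from_bytes(a, "big") ^ int.from_bytes(b, "big")
    return bin(diff).count("1")


def within_threshold(query: bytes, candidates: dict, max_distance: int) -> list:
    """Return (doc_id, distance) pairs with distance <= max_distance, nearest first."""
    hits = []
    for doc_id, fingerprint in candidates.items():
        distance = hamming_distance(query, fingerprint)
        if distance <= max_distance:
            hits.append((doc_id, distance))
    return sorted(hits, key=lambda hit: hit[1])


# Hypothetical 8-bit fingerprints, for illustration only.
fingerprints = {
    "a": bytes([0b10110000]),
    "b": bytes([0b10110011]),
    "c": bytes([0b01001111]),
}
query = bytes([0b10110001])

print(within_threshold(query, fingerprints, max_distance=2))
# [('a', 1), ('b', 1)]
```

This scans every candidate, so it only shows the scoring and filtering semantics; it says nothing about how to index the data efficiently.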


 



Re: Hamming Distance on Binary strings

kimchy
Administrator
You can use fuzzy queries, which match on Levenshtein distance, but note that they are slow(er) in Lucene 3.3 and will be much faster in Lucene 4.0 (when it comes out).
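
One caveat worth illustrating: Levenshtein distance also allows insertions and deletions, so on equal-length bit strings it can be much smaller than the Hamming distance (it is never larger). The Python sketch below compares the two on a pair of shifted bit strings. The function names are hypothetical and the Levenshtein routine is the textbook dynamic-programming version, not Lucene's implementation.

```python
def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("strings must have the same length")
    return sum(x != y for x, y in zip(a, b))


def levenshtein(a: str, b: str) -> int:
    """Edit distance allowing insertions, deletions, and substitutions."""
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cur.append(min(
                prev[j] + 1,               # delete ca
                cur[j - 1] + 1,            # insert cb
                prev[j - 1] + (ca != cb),  # substitute, or match at no cost
            ))
        prev = cur
    return prev[-1]


a = "01010101"
b = "10101010"  # a shifted by one position

print(hamming(a, b))      # 8: every bit differs
print(levenshtein(a, b))  # 2: drop the leading bit, append one at the end
```

So a fuzzy query with a small edit budget could match fingerprints that are far apart in Hamming terms, which may or may not be acceptable for a given fingerprinting scheme.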

On Fri, Aug 26, 2011 at 1:50 PM, anahap <[hidden email]> wrote:
Hi there,

Does anyone know the best way to store fixed-length binary data and query it, scoring the results by Hamming distance?
A Hamming distance filter with a threshold would also be OK.


Thanks a lot. This would be useful for all kinds of similarity searches based on fingerprinting algorithms.


 




Re: Hamming Distance on Binary strings

Catalin Banu
In reply to this post by anahap
Hi,

Did you find a solution?

On Friday, August 26, 2011 1:50:51 PM UTC+3, anahap wrote:
Hi there,

Does anyone know the best way to store fixed-length binary data and query it, scoring the results by Hamming distance?
A Hamming distance filter with a threshold would also be OK.


Thanks a lot. This would be useful for all kinds of similarity searches based on fingerprinting algorithms.


 

