Extra filter on returned Facet values


Bob Sandiford
Hi,

We're currently using Solr, and one of our use cases requires us to
search for documents in which a specific field contains (or fuzzily
matches) the words in the search.  We have a facet field based on the
same original value.  However, rather than return ALL facets for that
field for the documents that match (the usual situation), we further
filter the returned facets to only those that match the search terms.

So, if we have a source document with an 'Author' source field (for
example), then we have two fields in our index document based on
Author: one is analyzed and searchable; the other is not analyzed, so
we use it only for facets.

If our original document had these values for Author:
  Smith, Joe
  Smith, Fred

and we did a search for 'joe smyth' (not an exact match, but a fuzzy
one), then what we want back in the Author-based facet is just the
"Smith, Joe" value (along with the count of documents in which that
facet value appears).

We did this in Solr by modifying the Solr code that puts the facets
together, with a special facet parameter when we want this behaviour.
The code uses a Lucene construct called a 'MemoryIndex' - a very fast
means for us to create an in-memory index, add one small document (one
of the Facet values), and run a search against it to see if there's a
match.
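
The per-value check described above can be sketched roughly as follows. This is not the Solr/Lucene code from the post: it is a plain Python stand-in that assumes a crude tokenizer and uses difflib similarity in place of Lucene fuzzy matching. The field name author_facet, the 0.7 threshold, and all function names are hypothetical.

```python
from collections import Counter
from difflib import SequenceMatcher
import re


def tokenize(text):
    """Lowercase and split on non-alphanumerics (crude stand-in for an analyzer)."""
    return [t for t in re.split(r"[^a-z0-9]+", text.lower()) if t]


def fuzzy_term_match(query_term, value_term, threshold=0.7):
    """True when the two terms are similar enough (threshold is an assumption)."""
    return SequenceMatcher(None, query_term, value_term).ratio() >= threshold


def value_matches(query, facet_value, threshold=0.7):
    """Treat one facet value as a tiny document: every query term must
    fuzzily match at least one token of that value."""
    value_terms = tokenize(facet_value)
    return all(
        any(fuzzy_term_match(q, v, threshold) for v in value_terms)
        for q in tokenize(query)
    )


def filtered_facets(query, docs, threshold=0.7):
    """Count facet values over the docs, then keep only values matching the query."""
    counts = Counter(value for doc in docs for value in doc["author_facet"])
    return {v: n for v, n in counts.items() if value_matches(query, v, threshold)}


# Hypothetical matched documents, each carrying its untokenized facet values.
docs = [
    {"author_facet": ["Smith, Joe", "Smith, Fred"]},
    {"author_facet": ["Smith, Joe"]},
]
result = filtered_facets("joe smyth", docs)
print(result)
```

With this sample data only "Smith, Joe" survives, with a count of 2; "Smith, Fred" is dropped because nothing in it is close to 'joe'.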

Anyway - in order to move to ElasticSearch, I need to know whether
there is some mechanism for achieving the same ultimate result -
i.e. returning only the facet values that match the original search.

Ideas, anyone?

Thanks!

Re: Extra filter on returned Facet values

Matt Weber
Take a look at facet filter.  I imagine you would do your fuzzy search, then apply a facet filter that does a fuzzy search against just the author field (actually a tokenized version of it) to filter out authors that don't match.
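
A request along the lines suggested here might look something like the sketch below, which only builds the JSON body and does not talk to a cluster. It assumes the legacy facets API (a terms facet with a facet_filter wrapping a query), so the exact syntax should be checked against the documentation for the ElasticSearch version in use. The field names author and author_facet, the facet name authors, and the helper name are hypothetical.

```python
import json


def facet_only_request(search_terms, main_query,
                       search_field="author", facet_field="author_facet"):
    """Build a search body that returns facets only, with a fuzzy facet_filter.

    Assumes the tokenized field lowercases its terms, so the search terms are
    lowercased here because fuzzy queries are not analyzed.
    """
    fuzzy_clauses = [
        {"fuzzy": {search_field: term}} for term in search_terms.lower().split()
    ]
    return {
        "size": 0,  # ask for no hits, only the facet section
        "query": main_query,
        "facets": {
            "authors": {
                "terms": {"field": facet_field},
                # Restricts which documents are counted for this facet.
                "facet_filter": {"query": {"bool": {"must": fuzzy_clauses}}},
            }
        },
    }


body = facet_only_request("joe smyth", {"match_all": {}})
print(json.dumps(body, indent=2))
```

Note that the filter selects documents, not individual facet values, so a document with several authors would still contribute all of its values to the facet.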

In reply to Bob Sandiford's message of Wednesday, May 16, 2012 at 7:43 AM.


RE: Extra filter on returned Facet values

rpsandiford
This post has NOT been accepted by the mailing list yet.

Thanks, Matt.

Hmmm.  I’ll have to think about that.  One thing I forgot to mention is that in this case we don’t actually want any search results, only the facets, so we don’t need to parse any search results.

If I’m understanding your suggestion correctly, I’d need to get back search results with the Author field values that matched the query (i.e. the tokenized version of the author field), and then manually filter the facet values by those returned values to find the ones I want.  But that would require ensuring that I get enough results that all the possible author field matches are returned in the result set, or I’ll end up removing too many values from the returned facets…

Unless I’m just overthinking this, and the facet filter can actually determine whether or not to return a (non-tokenized) facet value based on a match in a separate (tokenized) field?
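
The question above turns on whether a filter acts on documents or on individual facet values. The small simulation below illustrates the difference in plain Python; it is not ElasticSearch code, it uses exact token matching instead of fuzzy matching to keep the point isolated, and the field names and sample data are hypothetical.

```python
from collections import Counter

# Each document has a tokenized view and the untokenized facet values.
docs = [
    {"author_tokens": {"smith", "joe", "fred"},
     "author_facet": ["Smith, Joe", "Smith, Fred"]},
    {"author_tokens": {"smith", "joe"}, "author_facet": ["Smith, Joe"]},
    {"author_tokens": {"jones", "ann"}, "author_facet": ["Jones, Ann"]},
]
wanted = {"joe", "smith"}


def facet_counts(docs, doc_filter):
    """Count facet values over the documents that pass a document-level filter."""
    return Counter(v for d in docs if doc_filter(d) for v in d["author_facet"])


def value_tokens(value):
    """Split an untokenized facet value such as 'Smith, Joe' into lowercase tokens."""
    return set(value.lower().replace(",", " ").split())


# Document-level filter: keeps whole documents, so every value they carry is counted.
doc_level = facet_counts(docs, lambda d: wanted <= d["author_tokens"])

# Value-level filter: an extra pass that checks each facet value on its own.
value_level = {v: n for v, n in doc_level.items() if wanted <= value_tokens(v)}

print(dict(doc_level))
print(value_level)
```

Here the document-level result still contains "Smith, Fred" (count 1) alongside "Smith, Joe" (count 2), because the first document matches as a whole; only the value-level pass reduces the result to "Smith, Joe".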

 

Bob Sandiford | Lead Software Engineer SirsiDynix

In reply to Matt Weber's message of Wednesday, May 16, 2012, 12:04 PM.

Bob.