match phrase queries to highlighted values


Alexandra
Hi,

My use case is the following: I have a text segment and a list of terms, and I want to find only the terms that match the text exactly.
For example, if my segment text is "This is a simple text." and my terms are "texts", "this" and "text", I want to find and highlight only the terms "this" and "text".

I'm building the query with the Java API like this (the segment is indexed):

BoolQueryBuilder query = QueryBuilders.boolQuery();
for (TermDocument termCandidate : termCandidates) {
    // One named match_phrase clause per candidate term; slop(0) allows no gaps or reordering.
    query.should(QueryBuilders.matchPhraseQuery(ElasticsearchDocumentField.TEXT_CONTENT.getName(), termCandidate.getTermText())
            .slop(0)
            .queryName(termCandidate.getId())
            .analyzer(EN_ANALYZER));
}


If I also highlight the terms (because in the end I need the offsets), they will all be highlighted in the same way and I don't know which highlight belongs to which term, e.g. <em>This</em> is a simple <em>text</em>.
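
To make the result I am after concrete, here is a minimal plain-Java sketch of the expected output for the example above. It does not use Elasticsearch at all and ignores the analyzer: it only approximates "exact match" with case-insensitive whole-word regex matching. The class name ExactTermOffsets and the method findOffsets are hypothetical names used for illustration only.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ExactTermOffsets {

    /**
     * For each term that occurs in the text as a whole word (case-insensitive),
     * returns its [start, end) character offsets. Terms without a match are omitted.
     * Assumption: word boundaries (\b) stand in for the real analyzer's tokenization.
     */
    static Map<String, List<int[]>> findOffsets(String text, List<String> terms) {
        Map<String, List<int[]>> result = new LinkedHashMap<>();
        for (String term : terms) {
            Pattern pattern = Pattern.compile(
                    "\\b" + Pattern.quote(term) + "\\b",
                    Pattern.CASE_INSENSITIVE | Pattern.UNICODE_CASE);
            Matcher matcher = pattern.matcher(text);
            while (matcher.find()) {
                result.computeIfAbsent(term, k -> new ArrayList<>())
                      .add(new int[] {matcher.start(), matcher.end()});
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, List<int[]>> offsets =
                findOffsets("This is a simple text.", List.of("texts", "this", "text"));
        for (Map.Entry<String, List<int[]>> entry : offsets.entrySet()) {
            for (int[] span : entry.getValue()) {
                // Prints one line per occurrence: term -> [start, end)
                System.out.println(entry.getKey() + " -> [" + span[0] + ", " + span[1] + ")");
            }
        }
    }
}
```

For the example segment this reports "this" at [0, 4) and "text" at [17, 21), and nothing for "texts". What I need from Elasticsearch is the same kind of per-term association, but based on the indexed, analyzed text.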

So now, my questions:
1. Is there a way to highlight the terms from the query separately, and to associate an id with each of them so that I can match them back?
2. Is there a way to receive the token numbers for an indexed text without using the analyze API? (This is unrelated to the first question.)