highlight whole sentence

Guram Kajaia
Hello guys.

Please run this curl recreation.

As you can see, I'm searching for the word 'elasticsearch' in this text: ElasticSearch can be used to search all 1kind of documents. It provides a scalable search solution, has near real-time search and support for multitenancy.[5] ElasticSearch is distributed, which means that indices can be divided into shards and each shard can have zero or more replicas. Each node hosts one or more shards, and acts as a coordinator to delegate operations to the correct shard(s). Rebalancing and routing are done automatically.

The highlighted text returned is:
"<em>ElasticSearch</em> can be used to search all 1kind of documents. It provides a scalable search solution",
", has near real-time search and support for multitenancy.[5] <em>ElasticSearch</em> is distributed, which"

Is it possible to get highlights that cover the whole sentence containing the match, that is, the text between periods? I want to get this highlight:
"<em>ElasticSearch</em> can be used to search all 1kind of documents.",
"<em>ElasticSearch</em> is distributed, which means that indices can be divided into shards and each shard can have zero or more replicas."

Thanks.
GuriK.

Re: highlight whole sentence

dadoonet
Hi GuriK,

I think the last part answers your needs:

Boundary Characters

When highlighting a field that is mapped with term vectors, boundary_chars can be configured to define what constitutes a boundary for highlighting. It's a single string with each boundary character defined in it. It defaults to .,!? \t\n.

The boundary_max_size setting controls how far to look for boundary characters, and defaults to 20.


I have never used it myself, but I hope this helps.
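
For anyone who wants to try the options quoted above, here is a minimal sketch of a search request body, built as a Python dict so it can be serialized with json.dumps. The helper name build_highlight_request, the field name "message", the match query, and the scan distance of 200 are all assumptions for illustration. One more caveat: the passage above calls the scan option boundary_max_size, but as far as I know the name Elasticsearch actually reads is boundary_max_scan, so please check the highlighting documentation for your version. These options only adjust where a fragment's edges fall, so this is not guaranteed to return whole sentences.

```python
import json


def build_highlight_request(field, term, max_scan=200):
    """Build a search body whose highlight fragments break at sentence punctuation.

    `field`, `term`, and `max_scan` are illustrative values, not settings
    taken from the thread.
    """
    return {
        "query": {"match": {field: term}},
        "highlight": {
            "fields": {
                field: {
                    # Only sentence-ending punctuation counts as a boundary.
                    "boundary_chars": ".!?",
                    # Assumption: the option is named boundary_max_scan
                    # (the quoted passage calls it boundary_max_size).
                    "boundary_max_scan": max_scan,
                }
            }
        },
    }


body = build_highlight_request("message", "elasticsearch")
print(json.dumps(body, indent=2))
```

The printed JSON can be sent as the body of a search request against an index whose field is mapped with term vectors.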

--
David ;-)
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs


Replying to GuriK's message of 24 August 2012, 00:15.

Re: highlight whole sentence

Guram Kajaia
Hi David.

I mapped my field like this: "message" : { "type" : "string", "term_vector" : "with_positions_offsets" }, but the result is still not what I wanted.
All I want is to get the text between periods, no matter how long that text is.


-
GuriK


Replying to David Pilato's message of Fri, Aug 24, 2012, 5:43 AM.

Re: highlight whole sentence

dadoonet
Did you set boundary_max_size to 0 when highlighting?
--
David ;-)
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs

Replying to Guram Kajaia's message of 24 August 2012, 10:35.

Re: highlight whole sentence

Guram Kajaia
Yes. The result is the same.

Replying to David Pilato's message of Fri, Aug 24, 2012, 5:04 PM.

Re: highlight whole sentence

Nick Dunn
I don't understand the "between dots" part; I can't see the pattern between the actual output and the desired output. Are you saying you always want the highlighted term (wrapped in <em>) to be the first characters of the highlight excerpt, and not midway through the string?

Re: highlight whole sentence

dadoonet
I think he wants to highlight a full sentence, not just 5 words before and after the highlighted term.


--
David ;-)
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs


Replying to Nick Dunn's message of 25 August 2012, 08:23.

Re: highlight whole sentence

Guram Kajaia
You're right, David. I don't want to specify how many words to highlight before and/or after the matched text. I just want the full sentence that contains the matched text.

Replying to David Pilato's message of Sat, Aug 25, 2012, 1:08 PM.

Re: highlight whole sentence

Nick Dunn
In my opinion, this sounds like a task for your own application logic to parse and extract.
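
A rough sketch of what that application-side logic might look like, in Python: fetch the full field value, split it into sentences, and keep the sentences that contain the search term. The function name highlight_sentences and the splitting rule are assumptions, not anything Elasticsearch provides. The rule here breaks after '.', '!' or '?' (optionally followed by a footnote marker such as [5]) when whitespace follows, so abbreviations like "e.g." would be split incorrectly.

```python
import re

# Naive sentence boundary: sentence-ending punctuation, an optional
# footnote marker like "[5]", then whitespace. This is an assumption
# that fits the sample text, not a general sentence tokenizer.
SENTENCE_BREAK = re.compile(r"(?<=[.!?])(?:\[\d+\])?\s+")


def highlight_sentences(text, term, pre="<em>", post="</em>"):
    """Return every sentence containing `term`, with each match wrapped in tags."""
    pattern = re.compile(re.escape(term), re.IGNORECASE)
    results = []
    for sentence in SENTENCE_BREAK.split(text.strip()):
        if pattern.search(sentence):
            # Wrap the matched text as it appears in the source (case preserved).
            wrapped = pattern.sub(lambda m: pre + m.group(0) + post, sentence)
            results.append(wrapped)
    return results


text = (
    "ElasticSearch can be used to search all 1kind of documents. "
    "It provides a scalable search solution, has near real-time search "
    "and support for multitenancy.[5] ElasticSearch is distributed, which "
    "means that indices can be divided into shards and each shard can have "
    "zero or more replicas. Each node hosts one or more shards."
)
hits = highlight_sentences(text, "elasticsearch")
for hit in hits:
    print(hit)
```

On the sample text from the original question, this yields the two sentences GuriK asked for. Note that it matches the raw term only, so it will not follow analyzer behavior such as stemming or synonyms the way the built-in highlighter does.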

Re: highlight whole sentence

Guram Kajaia
Yes, maybe...
Is there any way to write a plugin for Elasticsearch that does what I want, in this case highlighting the whole sentence?

Replying to Nick Dunn's message of Sun, Aug 26, 2012, 8:04 AM.

Re: highlight whole sentence

phill
This is really a problem of plugging in a different Fragmenter:
http://lucene.apache.org/core/3_6_0/api/all/index.html
I do NOT believe that is an extension point in ES.

-Paul

Replying to Guram Kajaia's message of 8/27/2012, 4:45 AM.