term and string

term and string

Jason Wee
Can anybody explain the difference between a term and a string in the Elasticsearch context?

When we index using the default mapping (http://www.elastic.co/guide/en/elasticsearch/reference/current/indices-put-mapping.html), the default type is string.

But when we query, we use the word term (http://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-term-query.html) instead of string.

I searched the Lucene documentation, where a term is defined as:

A query is broken up into terms and operators. There are two types of terms: Single Terms and Phrases. A Single Term is a single word such as "test" or "hello". A Phrase is a group of words surrounded by double quotes such as "hello dolly".

but it makes no mention of string.

https://lucene.apache.org/core/4_10_3/queryparser/org/apache/lucene/queryparser/classic/package-summary.html#package_description

Thank you.

Jason

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [hidden email].
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/df898132-f7f8-4476-8a81-21e3891dfb1a%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Re: term and string

Doug Turnbull
In a purely technical sense, a term is an entry in the inverted index. It is a very low-level entity.

For example, if you tokenized and analyzed doc1: "Dougie Turnbull" using the English analyzer (which stems words to root forms, lowercases, etc.), you'd get an inverted index that looks something like:

doug 
     document: 1
       position 0
       freq 1
turnbul
     document: 1
       position 1
       freq 1

A "term query" therefore accesses terms directly. It's a bit of a low-level concern. You'd have to query "doug" directly even though the original text said "dougie".

However, people loosely use the phrase "search term" to mean the words users enter into a search bar.

"string" is a concept that just reflects the text being analyzed, i.e. "Dougie Turnbull". This type lives at the Elasticsearch level and is a peer of integer, float, double, etc. The type dictates how Elasticsearch understands the value passed from the client and converts it to the inverted index structure above. A string type will be analyzed, picked apart into terms, etc., based on the associated analyzer. Other types, such as the numeric types, have other low-level magic that helps convert them to the inverted index data structure.
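To make the lookup concrete, here is a rough sketch in plain Python of a toy inverted index and an exact term lookup. It is not Elasticsearch or Lucene code: `TOY_STEMS`, `analyze`, `build_index` and `term_lookup` are names made up for illustration, and the stem table is a hand-written stand-in for a real stemmer that only knows the two words from the example.

```python
from collections import defaultdict

# Hypothetical stand-in for a real stemmer: it only maps the two example
# words to the root forms shown in the listing above.
TOY_STEMS = {"dougie": "doug", "turnbull": "turnbul"}


def analyze(text):
    """Toy analyzer: lowercase, split on whitespace, stem each token."""
    return [TOY_STEMS.get(token, token) for token in text.lower().split()]


def build_index(docs):
    """Build a toy inverted index: term -> {doc_id: [positions]}."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for position, term in enumerate(analyze(text)):
            index[term].setdefault(doc_id, []).append(position)
    return index


def term_lookup(index, term):
    """Look the term up exactly as given, with no analysis applied."""
    return sorted(index.get(term, {}))


index = build_index({1: "Dougie Turnbull"})
print(term_lookup(index, "doug"))    # [1]
print(term_lookup(index, "dougie"))  # []
```

The second lookup comes back empty because "dougie" never made it into the index; only the analyzed form "doug" did.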
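As a rough illustration of the type deciding how a value becomes terms, here is a small Python sketch. Everything in it is hypothetical (`to_terms`, `analyze`, `TOY_STEMS`), and the numeric branch is only a placeholder: this thread does not describe how numeric values are really encoded, so the sketch simply keeps the value as a single unanalyzed term.

```python
# Hypothetical stand-in for a real stemmer, covering only the example words.
TOY_STEMS = {"dougie": "doug", "turnbull": "turnbul"}


def analyze(text):
    """Toy analyzer: lowercase, split on whitespace, stem each token."""
    return [TOY_STEMS.get(token, token) for token in text.lower().split()]


def to_terms(field_type, value):
    """Convert a client value into index terms based on the field type."""
    if field_type == "string":
        # String fields go through the analyzer and may yield several terms.
        return analyze(value)
    if field_type in ("integer", "float", "double"):
        # Placeholder only: the real numeric encoding is not covered here.
        return [str(value)]
    raise ValueError(f"unsupported field type: {field_type}")


print(to_terms("string", "Dougie Turnbull"))  # ['doug', 'turnbul']
print(to_terms("integer", 42))                # ['42']
```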

Hope that helps,
-Doug


--
Doug Turnbull | Search Relevance Consultant | OpenSource Connections, LLC | 240.476.9983 | http://www.opensourceconnections.com
Author: Taming Search from Manning Publications
This e-mail and all contents, including attachments, is considered to be Company Confidential unless explicitly stated otherwise, regardless of whether attachments are marked as such.

Re: term and string

Jason Wee
Yep, that helps, thanks Doug! :-)

Re: term and string

Jason Wee
Some of the terminology is explained at this link: http://www.elastic.co/guide/en/elasticsearch/reference/0.90/glossary.html
