Question about FS rivers, synonyms and the Elasticsearch Java API

yahia
I use rivers (the FS river and the JDBC river) to index documents, and the Java API for keyword search. I would like my searches to take synonyms into account.

For example, when I search for the word Application, documents that contain the word ios or windows should also appear in the list of results.

Can I load a synonym dictionary at search time?

I have the following query:

    QueryBuilder query = QueryBuilders.queryString(keyword);
    SearchResponse searchHits = esClient.prepareSearch()
            .setIndices(INDEX_NAME_DOC).setTypes(INDEX_TYPE_DOC)
            .setSearchType(SearchType.DFS_QUERY_THEN_FETCH)
            .setFrom(start).setSize(size)
            .setQuery(query)
            .addHighlightedField("name")
            .addHighlightedField("file")
            .execute().actionGet();

How can I modify it to take synonyms into account?

Cordially.


--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [hidden email].
For more options, visit https://groups.google.com/groups/opt_out.
 
 
Re: Question about FS rivers, synonyms and the Elasticsearch Java API

dadoonet
IMHO, to use synonyms you should define them in the mapping before indexing.

That way, a document containing the term "word" will be indexed under "microsoft", for example.
When searching, Elasticsearch will apply the same analyzer. If you search for "word", your search will be converted to "microsoft" and you will find your doc.
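
To make the point above concrete, here is a minimal, self-contained sketch in plain Java. It does not use Elasticsearch at all and is only a toy model of the idea, not how Elasticsearch is implemented. The class name `SynonymSketch`, its methods and the sample documents are invented for illustration, and the single rule mirrors the word => microsoft example. It shows why applying the same synonym rule at index time and at search time lets either spelling find the same documents.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

class SynonymSketch {
    // Hypothetical one-way rule, in the spirit of "word => microsoft".
    private static final Map<String, String> SYNONYMS = new HashMap<>();
    static {
        SYNONYMS.put("word", "microsoft");
    }

    // Toy inverted index: term -> ids of the documents containing it.
    private final Map<String, Set<Integer>> index = new HashMap<>();

    // Toy analyzer: split on whitespace, lowercase, then apply the synonym rule.
    static List<String> analyze(String text) {
        List<String> terms = new ArrayList<>();
        for (String token : text.trim().split("\\s+")) {
            if (token.isEmpty()) {
                continue;
            }
            String term = token.toLowerCase(Locale.ROOT);
            terms.add(SYNONYMS.getOrDefault(term, term));
        }
        return terms;
    }

    // Index time: the analyzed terms, not the raw words, go into the index.
    void add(int docId, String text) {
        for (String term : analyze(text)) {
            index.computeIfAbsent(term, k -> new TreeSet<>()).add(docId);
        }
    }

    // Search time: the query goes through the same analyzer before the lookup.
    Set<Integer> search(String query) {
        Set<Integer> hits = new TreeSet<>();
        for (String term : analyze(query)) {
            hits.addAll(index.getOrDefault(term, new TreeSet<>()));
        }
        return hits;
    }

    public static void main(String[] args) {
        SynonymSketch sketch = new SynonymSketch();
        sketch.add(1, "report written in Word");
        sketch.add(2, "Microsoft release notes");
        sketch.add(3, "unrelated text");
        System.out.println(analyze("Word"));        // prints [microsoft]
        System.out.println(sketch.search("word"));  // prints [1, 2]
    }
}
```

Because the rule is applied on both sides, a mapping that does not reference the synonym analyzer on the searched field would leave both the indexed terms and the query terms untouched, and the rule would have no effect.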

My 2 cents

-- 
David Pilato | Technical Advocate | Elasticsearch.com



In reply to Ammar Yahia's message of 15 March 2013, 10:02.



Re: Question about FS rivers, synonyms and the Elasticsearch Java API

yahia
Thanks for the reply. I use the FS river to index documents. How can I change the mapping so that it accepts synonyms when I create the river?

Re: Question about FS rivers, synonyms and the Elasticsearch Java API

dadoonet
You have to create the mapping before creating the river. See: https://github.com/dadoonet/fsriver#creating-your-own-mapping-analyzers

Note that the required steps are:
1/ create index with its analyzer
2/ create mapping that will use this analyzer
3/ create the river

HTH

-- 
David Pilato | Technical Advocate | Elasticsearch.com



In reply to Ammar Yahia's message of 15 March 2013, 10:15.

Re: Question about FS rivers, synonyms and the Elasticsearch Java API

yahia
I created an index with a simple synonym analyzer (application => applications):

curl -XPUT 'http://localhost:9200/mydocu/' -d '{
    "settings" : {"index" : {"analysis" : {
        "analyzer" : {"synonym" : {"tokenizer" : "whitespace", "filter" : ["synonym"]}},
        "filter" : {"synonym" : {"type" : "synonym", "ignore_case" : true, "synonyms" : ["application => applications"]}}
    }}}
}'

and I used the following mapping:

{
    "docu" : {
        "properties" : {
            "file" : {
                "type" : "attachment",
                "path" : "full",
                "fields" : {
                    "file" : {"type" : "string", "store" : "yes", "term_vector" : "with_positions_offsets", "index" : "analyzed", "analyzer" : "french"},
                    "author" : {"type" : "string"},
                    "title" : {"type" : "string", "store" : "yes"},
                    "name" : {"type" : "string"},
                    "date" : {"type" : "date", "format" : "dateOptionalTime"},
                    "keywords" : {"type" : "string"},
                    "content_type" : {"type" : "string"}
                }
            },
            "name" : {"type" : "string", "analyzer" : "keyword"},
            "pathEncoded" : {"type" : "string", "analyzer" : "keyword"},
            "postDate" : {"type" : "date", "format" : "dateOptionalTime"},
            "rootpath" : {"type" : "string", "analyzer" : "keyword"},
            "virtualpath" : {"type" : "string", "analyzer" : "keyword"}
        }
    }
}

and I created the following river:

curl -XPUT 'localhost:9200/_river/riverdocu/_meta' -d '{
  "type": "fs",
  "fs": {
    "name": "document river",
    "url": "C:\\tempDoc",
    "update_rate": 180000,
    "includes": [ "*.doc" , "*.xls", "*.pdf", "*.txt" ]
  },
  "index": {
    "index": "mydocu",
    "type": "docu"
  }
}'

but I get this error when the river tries to search for documents:

[2013-03-15 16:01:10,015][DEBUG][action.search.type       ] [Arcademan] [1] Failed to execute fetch phase
org.elasticsearch.transport.RemoteTransportException: [Loa][inet[/172.16.10.61:9301]][search/phase/fetch/id]
Caused by: org.elasticsearch.indices.TypeMissingException: [_river] type[riverdocu] missing: failed to find type loaded for doc [_meta]
        at org.elasticsearch.search.fetch.FetchPhase.execute(FetchPhase.java:165)
        at org.elasticsearch.search.SearchService.executeFetchPhase(SearchService.java:438)
        at org.elasticsearch.search.action.SearchServiceTransportAction$SearchFetchByIdTransportHandler.messageReceived(SearchServiceTransportAction.java:634)
        at org.elasticsearch.search.action.SearchServiceTransportAction$SearchFetchByIdTransportHandler.messageReceived(SearchServiceTransportAction.java:623)
        at org.elasticsearch.transport.netty.MessageChannelHandler$RequestHandler.run(MessageChannelHandler.java:268)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
        at java.lang.Thread.run(Thread.java:662)

Re: Question about FS rivers, synonyms and the Elasticsearch Java API

dadoonet
Did you clean the _river index before redoing all your tests?
It sounds like a _river document is still lingering somewhere in your cluster.

Are you running a multi-node cluster or a standalone node?
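
For reference, cleaning up a single river can be sketched as below. This assumes that a river is removed by deleting its type under the _river index, which is how river documentation of this era describes it. The class name `CleanRiver`, the helper `riverUrl` and the node address http://localhost:9200 are assumptions made for illustration; the river name riverdocu is the one used in this thread. The sketch uses only the JDK's HTTP classes rather than the Elasticsearch Java client.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;

class CleanRiver {
    // Each river is stored as a type under the _river index, so it has its own URL.
    static String riverUrl(String baseUrl, String riverName) {
        return baseUrl + "/_river/" + riverName + "/";
    }

    public static void main(String[] args) {
        // Assumed values: a local node and the river name from this thread.
        String url = riverUrl("http://localhost:9200", "riverdocu");
        try {
            HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
            conn.setRequestMethod("DELETE");
            // A 404 response most likely means there was nothing left to clean.
            System.out.println("DELETE " + url + " -> HTTP " + conn.getResponseCode());
            conn.disconnect();
        } catch (IOException e) {
            System.out.println("Could not reach the node: " + e.getMessage());
        }
    }
}
```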

-- 
David Pilato | Technical Advocate | Elasticsearch.com



In reply to Ammar Yahia's message of 15 March 2013, 16:11.

Re: Question about FS rivers, synonyms and the Elasticsearch Java API

yahia
I'm running standalone, and I use two FS rivers named newriver1 and mydocs. Should I delete those rivers before creating my new river, riverdocu?

Re: Question about FS rivers, synonyms and the Elasticsearch Java API

dadoonet
Hmmmm. No. I don't think you have to.
That said, if you are running tests, I suggest that you clean your environment each time before running new ones.

I can't say why this happens here.

Are you sending curl commands or doing this from Java?
Are you waiting a little (for the cluster to reach yellow status, for example) after creating the index?
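
Waiting for yellow status can be sketched with the cluster health API, which accepts `wait_for_status` and `timeout` parameters and blocks until the status is reached or the timeout expires. The class name `WaitForYellow`, the helper `healthUrl`, the node address http://localhost:9200 and the 30s timeout are assumptions for illustration; mydocu is the index name used in this thread. Only JDK classes are used here, not the Elasticsearch Java client.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;

class WaitForYellow {
    // Builds the cluster health URL that blocks until the index reaches the given status.
    static String healthUrl(String baseUrl, String index, String status, String timeout) {
        return baseUrl + "/_cluster/health/" + index
                + "?wait_for_status=" + status + "&timeout=" + timeout;
    }

    public static void main(String[] args) {
        // Assumed values: a local node, the index "mydocu" and a 30 second timeout.
        String url = healthUrl("http://localhost:9200", "mydocu", "yellow", "30s");
        try {
            HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
            conn.setRequestMethod("GET");
            // The JSON body reports the actual "status" and whether the wait "timed_out".
            System.out.println("GET " + url + " -> HTTP " + conn.getResponseCode());
            conn.disconnect();
        } catch (IOException e) {
            System.out.println("Could not reach the node: " + e.getMessage());
        }
    }
}
```

Calling this between the index creation and the mapping or river creation gives the shards time to be allocated before anything tries to use the index.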



-- 
David Pilato | Technical Advocate | Elasticsearch.com



In reply to Ammar Yahia's message of 15 March 2013, 16:28.

Re: Question about FS rivers, synonyms and the Elasticsearch Java API

yahia
It works: I cleaned the environment and now it works. But now I have another problem :-p

When I search for the word application, I want documents that contain the word applications to be returned as well, but the result list only contains documents that contain the word application.

I have the following index with its analyzer:

curl -XPUT 'http://localhost:9200/docum/' -d '{
    "settings" : {"index" : {"analysis" : {
        "analyzer" : {"synonym" : {"tokenizer" : "whitespace", "filter" : ["synonym"]}},
        "filter" : {"synonym" : {"type" : "synonym", "ignore_case" : true, "synonyms" : ["application => applications"]}}
    }}}
}'

and the following mapping:

{
    "mydocu" : {
        "properties" : {
            "file" : {
                "type" : "attachment",
                "path" : "full",
                "fields" : {
                    "file" : {"type" : "string", "store" : "yes", "term_vector" : "with_positions_offsets", "index" : "analyzed"},
                    "author" : {"type" : "string"},
                    "title" : {"type" : "string", "store" : "yes"},
                    "name" : {"type" : "string"},
                    "date" : {"type" : "date", "format" : "dateOptionalTime"},
                    "keywords" : {"type" : "string"},
                    "content_type" : {"type" : "string"}
                }
            },
            "name" : {"type" : "string", "analyzer" : "keyword"},
            "pathEncoded" : {"type" : "string", "analyzer" : "keyword"},
            "postDate" : {"type" : "date", "format" : "dateOptionalTime"},
            "rootpath" : {"type" : "string", "analyzer" : "keyword"},
            "virtualpath" : {"type" : "string", "analyzer" : "keyword"}
        }
    }
}

Did I make a mistake in the index or mapping declaration?

Re: Question about FS rivers, synonyms and the Elasticsearch Java API

yahia
I also use this code to search:

    QueryBuilder query = QueryBuilders.queryString(keyword);
    SearchResponse searchHits = esClient.prepareSearch()
            .setIndices(INDEX_NAME).setTypes(INDEX_TYPE_DOC)
            .setSearchType(SearchType.DFS_QUERY_THEN_FETCH)
            .setFrom(start).setSize(size)
            .setQuery(query)
            .addHighlightedField("name")
            .addHighlightedField("file")
            .setHighlighterPreTags("<span class='badge'>")
            .setHighlighterPostTags("</span>")
            .execute().actionGet();

