River plugin clarification

River plugin clarification

Janusz Dalecki

Hi,

The doc on the elasticsearch River plugin says:

“A river instance (and its name) is a type within the _river index. All different rivers implementations accept a document called _meta that at the very least has the type of the river (twitter / couchdb / …) associated with it.”

Isn’t the word “_meta” an ‘id’ or ‘_action’ according to the elasticsearch documentation?

http://host:port/[index]/[type]/[_action/id]

Can somebody give us a good annotated example, like:

 

curl -XPUT 'http://localhost:9200/_river/mongodb/_meta' -d '{
    "type": "mongodb",           // type
    "mongodb": {                 // mongodb instance – does it have to be the same as the URL type?
        "db": "testmongo",       // I think that's straightforward
        "collection": "person"   // I think that's straightforward
    },
    "index": {
        "name": "mongoindex",
        "type": "person"         // why do I have to repeat it (it's already defined as the collection)?
    }
}'

- _river – an index
- mongodb – a type
- _meta – an id
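
The breakdown above can be checked mechanically. The following is only a minimal Python sketch: `split_document_url` is a hypothetical helper written for this illustration, not part of Elasticsearch or of any river plugin. It splits the path of a document URL into its index, type, and id segments.

```python
from urllib.parse import urlsplit


def split_document_url(url):
    """Split a document URL of the form /index/type/id into its three parts."""
    # Take only the path, drop the surrounding slashes, and split on "/".
    segments = urlsplit(url).path.strip("/").split("/")
    if len(segments) != 3:
        raise ValueError(f"expected /index/type/id, got {segments!r}")
    index, doc_type, doc_id = segments
    return index, doc_type, doc_id


# The river registration URL from the example above.
print(split_document_url("http://localhost:9200/_river/mongodb/_meta"))
# ('_river', 'mongodb', '_meta')
```

Read this way, registering a river looks like indexing an ordinary document with the id `_meta` into the `_river` index.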

 

Regards,

Janusz


Re: River plugin clarification

Radu Gheorghe-2
Hi,

On Wed, Jan 2, 2013 at 11:27 AM, JD <[hidden email]> wrote:

> Hi,
>
> The doc on elasticsearch River plugin says:
>
> “A river instance (and its name) is a type within the _river index. All different rivers implementations accept a document called _meta that at the very least has the type of the river (twitter / couchdb / …) associated with it.”
>
> Isn’t the word “_meta” an ‘id’ or ‘_action’ according to the elasticsearch documentation?

Yes, "_meta" is the document ID, as far as I understand.

> http://host:port/[index]/[type]/[_action/id]
>
> Can somebody give us a good annotated example, like:
>
> curl -XPUT 'http://localhost:9200/_river/mongodb/_meta' -d '{
>     "type": "mongodb",           // type
>     "mongodb": {                 // mongodb instance – does it have to be the same as the URL type?

No, the URL type is the name of your river (which can be anything, AFAIK), while "mongodb" is a field that's required by the mongodb plugin.

>         "db": "testmongo",       // I think that's straightforward
>         "collection": "person"   // I think that's straightforward
>     },
>     "index": {
>         "name": "mongoindex",
>         "type": "person"         // why do I have to repeat it (it's already defined as the collection)?

The type here is the ES type that the data from your collection is indexed into. It doesn't have to have the same name; it can be anything.

>     }
> }'
>
> - _river – an index
> - mongodb – a type
> - _meta – an id
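
To make the distinction between the different names concrete, here is a minimal Python sketch. It only builds the request and sends nothing. The function `river_meta_request`, the river name `people_river`, and the ES type `people` are made up for this illustration; the field names are the ones from the example in this thread.

```python
import json


def river_meta_request(river_name, db, collection, index_name, index_type,
                       host="http://localhost:9200"):
    """Return (url, body) for registering a mongodb river; nothing is sent."""
    # The type segment of the URL is the river's own name.
    url = f"{host}/_river/{river_name}/_meta"
    body = {
        "type": "mongodb",                                   # which river plugin to run
        "mongodb": {"db": db, "collection": collection},     # where the data comes from
        "index": {"name": index_name, "type": index_type},   # where the data goes in ES
    }
    return url, json.dumps(body, indent=2)


# River name and ES type deliberately differ from "mongodb" and "person".
url, body = river_meta_request("people_river", "testmongo", "person",
                               "mongoindex", "people")
print(url)
print(body)
```

In this sketch only the `"type": "mongodb"` value and the `"mongodb"` field name are fixed by the plugin; the river name in the URL and the ES type under `"index"` are chosen freely.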

 



If you want more info about the mongodb river itself, I think the best place to look (if you haven't already :D) is here:
https://github.com/richardwilly98/elasticsearch-river-mongodb/wiki

Best regards,
Radu
--
http://sematext.com/ -- ElasticSearch -- Solr -- Lucene


Re: River plugin clarification

Janusz Dalecki
In reply to this post by Janusz Dalecki

Hi,

What I find a little confusing in the mongodb river doc is the lack of an example for a multi-collection setup.

The wiki says that you need to create a new river for each MongoDB collection and gives this example:

$ curl -XPUT "localhost:9200/_river/mongodb/_meta" -d '
{
  "type": "mongodb",
  "mongodb": { 
    "servers":
    [
      { "host": ${mongo.instance1.host}, "port": ${mongo.instance1.port} },
      { "host": ${mongo.instance2.host}, "port": ${mongo.instance2.port} }
    ],
    "options": { "secondary_read_preference" : true},
    "credentials":
    [
      { "db": "local", "user": ${mongo.local.user}, "password": ${mongo.local.password} },
      { "db": ${mongo.db.name}, "user": ${mongo.db.user}, "password": ${mongo.db.password} }
    ],
    "db": ${mongo.db.name}, 
    "collection": ${mongo.collection.name}, 
    "gridfs": ${mongo.is.gridfs.collection},
    "filter": ${mongo.filter}
  }, 
  "index": { 
    "name": ${es.index.name}, 
    "throttle_size": ${es.throttle.size},
    "type": ${es.type.name}
  }
}'

I tried it, and it did not work until I used a URL like this: "localhost:9200/_river/{collection_name_river}/_meta"

… so in other words, I need to replace the word “mongodb” in the URL with a unique river name for each collection, which I think is not clearly stated in the documentation. Am I right?

Regards,

Janusz



Re: River plugin clarification

Radu Gheorghe-2
Hi Janusz,


Yeah, I don't see anything like that in the documentation. It's the first time I've seen that reported, although I don't use the mongodb river plugin myself.

Either way, it seems to me that if you want to use the river with multiple collections, you'd have to set up one river for each collection.
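
As a rough illustration of the one-river-per-collection setup, the Python sketch below generates one registration request per collection and prints each as a curl command. Nothing is sent to a server. The helper names, the `<db>_<collection>_river` naming scheme, the `address` collection, and the reuse of the collection name as the ES type are all assumptions made for this example; they do not come from the plugin's documentation.

```python
import json
import shlex


def rivers_for_collections(db, collections, index_name,
                           host="http://localhost:9200"):
    """Build one (url, body) pair per MongoDB collection; nothing is sent."""
    rivers = []
    for collection in collections:
        # Hypothetical naming scheme; the thread suggests any unique name works.
        river_name = f"{db}_{collection}_river"
        body = {
            "type": "mongodb",
            "mongodb": {"db": db, "collection": collection},
            # Assumption: reuse the collection name as the ES type.
            "index": {"name": index_name, "type": collection},
        }
        rivers.append((f"{host}/_river/{river_name}/_meta", body))
    return rivers


def as_curl(url, body):
    """Render a request as a shell-safe curl command line."""
    return f"curl -XPUT {shlex.quote(url)} -d {shlex.quote(json.dumps(body))}"


for url, body in rivers_for_collections("testmongo", ["person", "address"],
                                        "mongoindex"):
    print(as_curl(url, body))
```

Each generated URL has a different type segment, which matches the observation above that the literal `mongodb` in the URL had to be replaced with a name unique to each river.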

Best regards,
Radu
--
http://sematext.com/ -- ElasticSearch -- Solr -- Lucene 

--