Index metadata

Index metadata

MC
Is it possible to retrieve index metadata via the API?  In terms of
metadata, I am looking to be able to dynamically retrieve things like:

1. The date/time at which an index was last updated
2. The field names (currently) within an index
3. The field types (string/boolean/date/numeric) currently in the
index
4. The field storage scheme (e.g. stored, indexed, etc.)

and the like.


Thanks,
Mayer

Re: Index metadata

Ivan Brusic
Mayer,

I do not think it is possible to get the date/time of the last time an
index was updated. Perhaps someone can prove otherwise. It would be a
bit tough in a distributed environment.

For info on the fields, you can use the mapping API:
http://www.elasticsearch.org/guide/reference/api/admin-indices-get-mapping.html

Fields that are not explicitly defined in your initial mapping will be
added dynamically to the mapping. The GET mapping call will contain
all fields.
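
As a rough sketch of how the field information could be pulled out of a mapping, the snippet below walks a mapping document of the kind the GET mapping call returns and lists each field with its type and its store/index settings. The sample mapping, the index name "myindex", the type name "doc", and the helper list_fields are all made up for illustration, and the exact response layout depends on the Elasticsearch version, so treat this as an example under those assumptions rather than the definitive format.

```python
import json

# Hypothetical response from GET http://localhost:9200/myindex/_mapping.
# The layout (index -> type -> "properties") is an assumption; check it
# against what your Elasticsearch version actually returns.
SAMPLE_MAPPING = json.loads("""
{
  "myindex": {
    "doc": {
      "properties": {
        "title":   {"type": "string", "store": "yes", "index": "analyzed"},
        "created": {"type": "date"},
        "author":  {
          "properties": {
            "name": {"type": "string"}
          }
        }
      }
    }
  }
}
""")


def list_fields(properties, prefix=""):
    """Yield (full_name, type, store, index) for every leaf field."""
    for name, spec in sorted(properties.items()):
        full_name = prefix + name
        if "properties" in spec:
            # Object field: recurse into its sub-fields.
            yield from list_fields(spec["properties"], full_name + ".")
        else:
            # Settings left at their defaults are absent, so they come back as None.
            yield (full_name, spec.get("type"), spec.get("store"), spec.get("index"))


fields = list(list_fields(SAMPLE_MAPPING["myindex"]["doc"]["properties"]))
for row in fields:
    print(row)
```

Settings that were never set explicitly may not appear in the returned mapping at all, which is why the sketch reports them as None instead of guessing a default.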

--
Ivan



Re: Index metadata

plaflamme
We manually store timestamps for creation and last update of our indices within the "_meta" tag of a mapping. Then we use "Get Mapping" to fetch these timestamps.

An alternative is to use the new timestamps[1] of the documents in the index and search on that field, maybe using the terms_stats facet to obtain the max...

Hope it helps,
Philippe

[1] http://www.elasticsearch.org/guide/reference/mapping/timestamp-field.html
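
A minimal sketch of reading index-level timestamps back out of the "_meta" section of a mapping could look like the following. The mapping below is a hand-written sample rather than real output, and the type name "doc", the "_meta" keys "created" and "last_update", and the helper read_index_timestamps are illustrative names, not anything defined by Elasticsearch.

```python
from datetime import datetime

# Hypothetical mapping for a type called "doc", shaped like a Get Mapping
# response. The "_meta" keys are illustrative; Elasticsearch does not
# interpret anything stored under "_meta".
sample_mapping = {
    "doc": {
        "_meta": {
            "created": "2011-12-01T08:00:00+00:00",
            "last_update": "2011-12-30T15:56:00+00:00",
        },
        "properties": {"title": {"type": "string"}},
    }
}


def read_index_timestamps(mapping, type_name):
    """Return (created, last_update) as datetimes, or None where missing."""
    meta = mapping.get(type_name, {}).get("_meta", {})

    def parse(key):
        value = meta.get(key)
        return datetime.fromisoformat(value) if value else None

    return parse("created"), parse("last_update")


created, last_update = read_index_timestamps(sample_mapping, "doc")
print(created.isoformat(), last_update.isoformat())
```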


Re: Index metadata

kimchy
Administrator
Storing the last update in the _meta part of the mapping? So does that mean that for each document indexed, you update the mapping? That's expensive...


Re: Index metadata

plaflamme
Yes, that's what we ended up with. At the time, the document timestamps didn't exist. I wasn't aware that updating the mapping was expensive. Is this the case even when it's not changing at all (except under _meta)?

Philippe


Re: Index metadata

kimchy
Administrator
Well, expensive in the sense that it needs to be broadcast to the cluster and persisted (the whole metadata). Even without the timestamp feature, you can add a timestamp field to each document you index and find the latest one; you don't need the timestamp feature to have that ability.
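
For illustration only, adding a timestamp to each document on the client side before indexing it might look like the sketch below. The field name "updated_at" and the helper with_timestamp are assumptions made for this example; any date field that is set consistently on every document would serve the same purpose.

```python
from datetime import datetime, timezone


def with_timestamp(doc, field="updated_at", now=None):
    """Return a copy of doc with an ISO 8601 UTC timestamp stored under `field`."""
    now = now or datetime.now(timezone.utc)
    stamped = dict(doc)  # shallow copy, so the caller's dict is left untouched
    stamped[field] = now.isoformat()
    return stamped


# A fixed time is passed here only to make the example reproducible.
doc = with_timestamp(
    {"title": "hello"},
    now=datetime(2011, 12, 30, 16, 15, tzinfo=timezone.utc),
)
print(doc)
```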


Re: Index metadata

plaflamme
Yes, we did consider adding timestamps to documents at the time, but it seemed more appropriate to use the _meta tag since it's an index-level property (in our case at least). Maybe the _meta tag could be made accessible from outside the mapping: since it's not used by ES itself, updating it shouldn't have any impact; just a thought.

What would be the most efficient way to query the "latest" (maximum) value when using a timestamp field on the documents?

Thanks,
Philippe


Re: Index metadata

kimchy
Administrator
It does not matter if _meta is extracted or not, it still needs to be broadcast and persisted. The best way to get the "latest" value is to sort based on the time field and ask for a single hit.
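
A sketch of such a request body, assuming the documents carry a date field (called "updated_at" here purely as a placeholder), could be built like this. The helper latest_hit_request is hypothetical; the body would be sent to the index's _search endpoint, and the latest value read from the single hit that comes back.

```python
import json


def latest_hit_request(time_field):
    """Build a search body that returns only the most recent document."""
    return {
        "query": {"match_all": {}},
        # Newest first, so the first (and only) hit carries the maximum value.
        "sort": [{time_field: {"order": "desc"}}],
        "size": 1,
    }


body = latest_hit_request("updated_at")
print(json.dumps(body, indent=2))
```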
