Statistical facet on multiple fields

Statistical facet on multiple fields

zohar
Hi
Is there a way to produce a statistical facet (sum, avg, min, max, etc.)
on multiple fields at once?

Cheers
Z

Re: Statistical facet on multiple fields

kimchy
Administrator
You mean to have the statistical information accumulated across several fields, or execute it against different fields and have the results per field?

On Sun, Oct 10, 2010 at 11:37 PM, Zohar <[hidden email]> wrote:
Hi
Is ther a way to produce a statistical facet ( sum avg min max etc )
on multiplie fields at once.

Cheers
Z


Re: Statistical facet on multiple fields

kimchy
Administrator
OK, I can actually give you an answer for both:

For the more obvious one: if you want stats on different fields, each computed separately, just execute a different facet (with a different facet name) against each field.

If you want combined statistical data over two fields, there is no "formal" way to do it, but you can get it by executing two stats facets on the two different fields and *naming the facets the same*. The results will be reduced into a single result. This is a "feature" that I did not plan for, to be honest, and discovered by mistake :). The proper way to implement it is to let the stats facet accept a list of fields, which would also be a tad faster. If you want, you can open a feature request for it (I already implemented something similar for the terms facet).

-shay.banon
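
To illustrate the reduce step described above, here is a minimal Python sketch (not Elasticsearch code) of how two partial statistical results can be merged into one combined result. The `Stats` class, `stats_of`, `merge` and the sample numbers are hypothetical names and data chosen for illustration; the only assumption is that each partial result carries a count, a total, a minimum and a maximum.

```python
from dataclasses import dataclass


@dataclass
class Stats:
    """Partial statistics for one field (hypothetical structure)."""
    count: int
    total: float
    minimum: float
    maximum: float

    @property
    def mean(self) -> float:
        return self.total / self.count if self.count else 0.0


def stats_of(values):
    """Compute partial stats for a non-empty sequence of numbers."""
    values = list(values)
    return Stats(len(values), sum(values), min(values), max(values))


def merge(a: Stats, b: Stats) -> Stats:
    """Reduce two partial results into one, as if they shared a facet name."""
    return Stats(
        count=a.count + b.count,
        total=a.total + b.total,
        minimum=min(a.minimum, b.minimum),
        maximum=max(a.maximum, b.maximum),
    )


price = stats_of([10.0, 20.0])      # stats over a first field
tax = stats_of([1.0, 2.0, 3.0])     # stats over a second field
combined = merge(price, tax)
print(combined, combined.mean)
```

Count, total, min and max can all be merged without revisiting the documents, which is why a stats result lends itself to this kind of reduction; the mean is derived from the merged count and total.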

On Sun, Oct 10, 2010 at 11:44 PM, Shay Banon <[hidden email]> wrote:
You mean to have the statistical information accumulated across several fields, or execute it against different fields and have the results per field?



Re: Statistical facet on multiple fields

Thiago Souza
Hi Shay,

    This is interesting.
    So basically, by breaking JSON semantics, one can achieve aggregation? And what happens if the facet is defined as an array with many objects?

Regards,
Thiago Souza

On Sun, Oct 10, 2010 at 19:48, Shay Banon <[hidden email]> wrote:
ok, I really can give you an answer for both:

I guess for the more obvious one, if you want to have stats on different fields and be computed differently, just execute  different facet (with a different facet name) against each field.

If you want to get a combined statistical data on two fields, there is no "formal" way to do, but, you can get it by executing two stats facet on the two different fields, just *name the facets the same*. The results will be reduced into a single result. This is a "feature" that I did not plan for, to be honest, and discovered it by mistake :).  A proper way to implement it is to have the ability to provide a list of fields to the stats facet, it will also be a tad faster this way. If you want, you can open a feature request for it (I already implemented something similar for terms facet).

-shay.banon



Re: Statistical facet on multiple fields

kimchy
Administrator
Yes, this breaks JSON semantics, since you need to define two facets under the same name. This "feature" was not planned; it's just a consequence of how the map/reduce nature of facets works. If it is going to be formal, then a proper JSON structure will need to be defined. In any case, if there is a need for a facet over multiple fields (like stats), that can be added specifically for that facet.
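
The JSON point can be seen from the client side: many JSON libraries silently drop a repeated key, so a request with two same-named facets usually cannot be built from an ordinary map and has to be written as a raw string. A small sketch with Python's standard `json` module follows; the facet name `stat`, the field names `a` and `b`, and the request layout are assumptions for illustration.

```python
import json

# Raw request text with the same facet name used twice (hypothetical names).
body = (
    '{"facets": {'
    '"stat": {"statistical": {"field": "a"}}, '
    '"stat": {"statistical": {"field": "b"}}'
    '}}'
)

# A standard parse keeps only the last value for a repeated key.
parsed = json.loads(body)
print(parsed["facets"]["stat"]["statistical"]["field"])

# object_pairs_hook=list returns every object as a list of (key, value)
# pairs, so both definitions remain visible.
pairs = json.loads(body, object_pairs_hook=list)
facet_names = [name for name, _ in pairs[0][1]]
print(facet_names)
```

This is why the trick only works when the server reads the request as a stream of fields rather than as a map, and why a formal version would need a different structure.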

On Mon, Oct 11, 2010 at 3:40 PM, Thiago Souza <[hidden email]> wrote:
Hi Shay,

    This is interesting.
    So basically, by breaking json semantics one can achieve aggregation? So, what happens if the facet is defined by an array with many objects?

Regards,
Thigo Souza



Re: Statistical facet on multiple fields

Thiago Souza
Hi Shay,

      But what happens if the facet is defined by an array with many objects?

Regards

On Mon, Oct 11, 2010 at 13:06, Shay Banon <[hidden email]> wrote:
Yea, this breaks json semantics since you need to define two facets under the same name. This "feature" was not planned, its just a matter of how the map reduce nature of facets work. If its going to be formal, then a proper json structure will need to be formalized. In any case, if there is a need for a facet over multiple fields (like stats), this feature can be added specifically for it.


Re: Statistical facet on multiple fields

kimchy
Administrator
You mean providing the facet definition as an array? It won't work; the format mentioned in the docs is the only one supported. As I mentioned in the previous email, if this is going to be a "formal" feature, then a different JSON structure needs to be defined.

On Mon, Oct 11, 2010 at 7:23 PM, Thiago Souza <[hidden email]> wrote:
Hi Shay,

      But what happens if the facet is defined by an array with many objects?

Regards


Re: Statistical facet on multiple fields

Thiago Souza
Ok, thanks!

On Mon, Oct 11, 2010 at 14:32, Shay Banon <[hidden email]> wrote:
You mean the format of the facet expected is provided as an array? It won't work, the format mentioned in the docs is the only one supported. I mentioned in the previous email, if this is going to be a "formal" feature, then a different json structure needs to be defined.



Re: Statistical facet on multiple fields

kimchy
Administrator
By the way, what facet type do you want this feature for? As I mentioned earlier in response to Zohar, a more optimized solution would be to execute the facet itself against several fields. The stats and terms facets are simple enough that aggregating across facets on different fields will work. Other facets, which accept several configuration options, might produce strange results (or even fail, like range facets that do not use the same range elements).
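
A rough sketch of the distinction drawn above, in plain Python rather than Elasticsearch internals: term counts from two fields can simply be added per term, while range counts can only be combined when both sides use identical buckets. The function names, the bucket labels and the decision to raise an error on mismatch are assumptions made for this illustration.

```python
from collections import Counter


def merge_term_counts(*partials):
    """Add up per-term counts coming from several fields."""
    merged = Counter()
    for partial in partials:
        merged.update(partial)
    return merged


def merge_range_counts(a, b):
    """Combine range counts; only meaningful when the buckets match."""
    if a.keys() != b.keys():
        raise ValueError("range buckets differ; counts cannot be combined")
    return {bucket: a[bucket] + b[bucket] for bucket in a}


tags = {"red": 3, "blue": 1}       # term counts from one field
labels = {"red": 2, "green": 4}    # term counts from another field
merged_terms = merge_term_counts(tags, labels)
print(merged_terms.most_common())

same_buckets = merge_range_counts({"0-10": 2, "10-20": 5}, {"0-10": 1, "10-20": 1})
print(same_buckets)
```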

On Mon, Oct 11, 2010 at 9:53 PM, Thiago Souza <[hidden email]> wrote:
Ok, thanks!



Re: Statistical facet on multiple fields

Thiago Souza
Hi Shay,

   Currently, for me, stats and terms are fine.

Regards

On Mon, Oct 11, 2010 at 17:25, Shay Banon <[hidden email]> wrote:
By the way, what facet type do you want this feature for? As I mentioned earlier in response to Zohar, a more optimized solution would be to be able to execute the facet itself against several fields. stats and terms facet are simple enough that doing aggregation across facets on different fields will work. Other facets, which accept several configuration options, might produce strange results (or even fail, like range facet and not using the same range elements).



Re: Statistical facet on multiple fields

kimchy
Administrator
Cool, OK. I will work on adding multi-field support for stats. Those two are the ones that make the most sense to do.

On Mon, Oct 11, 2010 at 10:40 PM, Thiago Souza <[hidden email]> wrote:
Hi Shay,

   Currently, for me, stats and terms is fine.

Regards



Re: Statistical facet on multiple fields

kimchy
Administrator
Just pushed support for statistical facet on more than one field: http://github.com/elasticsearch/elasticsearch/issues/issue/436.
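
The thread does not show the request syntax for the new feature, so the following is only a sketch under an assumption: that the statistical facet takes a `fields` list in place of the single `field` entry. The helper name, the facet name `combined` and the field names `price` and `tax` are hypothetical; check the linked issue for the actual format.

```python
import json


def statistical_facet_request(facet_name, fields):
    """Build a search body with one statistical facet over several fields.

    Assumes a "fields" list is accepted; this is not confirmed by the thread.
    """
    return {
        "query": {"match_all": {}},
        "facets": {
            facet_name: {"statistical": {"fields": list(fields)}}
        },
    }


body = statistical_facet_request("combined", ["price", "tax"])
print(json.dumps(body, indent=2))
```

Compared with the same-name trick, a body like this is valid JSON that any client library can build from an ordinary map.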

On Tue, Oct 12, 2010 at 12:51 AM, Shay Banon <[hidden email]> wrote:
cool, ok. I will work on adding multi field support for stats. The two are the ones the make most sense to do.



Re: Statistical facet on multiple fields

zohar
In reply to this post by kimchy
Hi
Coming back to this question having spent a bit of time playing around.
What I am trying to achieve :

we have an index with documents that look like :
{
  currency:'USD',
  product:'toaster',
  value: 23.4
},
{
  currency:'GBP',
  product:'toaster',
  value: 13.4
}


I want to produce facets for 'currency' and 'product' where the value of each term is not a count but a sum of values.

The expected result would be something like:

{
    "facets": {
        "currencyFacet": {
            "_type": "terms",
            "_field": "currency",
            "terms": [
                {
                    "term": "GBP",
                    "count": 1,
                    "value": 13.4
                },
                {
                    "term": "USD",
                    "count": 1,
                    "value": 23.4
                }
            ]
        },
        "productFacet": {
            "_type": "terms",
            "_field": "product",
            "terms": [
                {
                    "term": "toaster",
                    "count": 2,
                    "value": 36.8
                }
            ]
        }
    }
}

Aggregating some numeric field by a variety of non-numeric terms like this is very, very common for us.

Any ideas ?

Cheers
Z
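
For reference, the requested result can be computed client-side from the two sample documents. This is a plain Python sketch of the desired aggregation, not a facet that exists in Elasticsearch at this point in the thread; the function name `terms_with_sum` is hypothetical.

```python
from collections import defaultdict

# The two sample documents from the post above.
docs = [
    {"currency": "USD", "product": "toaster", "value": 23.4},
    {"currency": "GBP", "product": "toaster", "value": 13.4},
]


def terms_with_sum(docs, key_field, value_field):
    """Group documents by key_field; count them and sum value_field per term."""
    buckets = defaultdict(lambda: {"count": 0, "value": 0.0})
    for doc in docs:
        bucket = buckets[doc[key_field]]
        bucket["count"] += 1
        bucket["value"] += doc[value_field]
    # Sort by term so the output order is stable.
    return [{"term": term, **bucket} for term, bucket in sorted(buckets.items())]


facets = {
    "currencyFacet": terms_with_sum(docs, "currency", "value"),
    "productFacet": terms_with_sum(docs, "product", "value"),
}
print(facets)
```

Each document contributes to exactly one bucket per facet here, which relies on every document holding a single value for the key field and the value field.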

Re: Statistical facet on multiple fields

kimchy
Administrator
There isn't a built-in one to do that, but it sounds good; open a feature request?

On Thu, Nov 25, 2010 at 5:09 PM, zohar <[hidden email]> wrote:

Hi
Coming back to this question having spent a bit of time playing around.
What I am trying to achieve :

we have an index with documents that look like :
{
 currency:'USD',
 product:'toaster',
 value: 23.4
},
{
 currency:'GBP',
 product:'toaster',
 value: 13.4
}


I want to produce factes for 'currency' & 'product'  where the value of each
term is not a count , its a sum of values.

The expected result would be something like :

facets": {
               "currencyFacet": {
                       "_type": "terms",
                       "_field": "currency",
                       "terms": [
                               {
                                       "term": "GBP",
                                       "count": 1,
                                       "value": 13.4
                               },
                               {
                                       "term": "USD",
                                       "count": 1,
                                       "value": 23.4
                               }
}
                       ]
               },
               "productFacet": {
                       "_type": "terms",
                       "_field": "product",
                       "terms": [
                               {
                                       "term": "toaster",
                                       "count": 2,
                                       "value":36.8

                               },
]
               }
       }
}

this aggregate some numeric field by a variety of non numeric terms is very
very common for us.

Any ideas ?

Cheers
Z

--
View this message in context: http://elasticsearch-users.115913.n3.nabble.com/Statistical-facet-on-multiple-fields-tp1676993p1967528.html
Sent from the ElasticSearch Users mailing list archive at Nabble.com.


Re: Statistical facet on multiple fields

zohar
Created - https://github.com/elasticsearch/elasticsearch/issues/#issue/539
looking fwd to it.

Thanks
Z

Re: Statistical facet on multiple fields

kimchy
Administrator
Just wanted to make sure about one aspect of the feature: does each document contain a single value of currency and price, or can it have several of those? If it's the latter, then the feature is more complex to implement and requires some other features beforehand... (the ordering of the fields within a doc is not maintained when loaded for facets).

On Thu, Nov 25, 2010 at 9:37 PM, zohar <[hidden email]> wrote:

Created - https://github.com/elasticsearch/elasticsearch/issues/#issue/539
looking fwd to it.

Thanks
Z
--
View this message in context: http://elasticsearch-users.115913.n3.nabble.com/Statistical-facet-on-multiple-fields-tp1676993p1969052.html
Sent from the ElasticSearch Users mailing list archive at Nabble.com.

Re: Statistical facet on multiple fields

zohar
Yes they are single values

Re: Statistical facet on multiple fields

harelba
In reply to this post by kimchy
Hi,

I've been looking for a way to perform aggregations similar to the ones talked about in this thread, grouping the data according to an arbitrary set of fields (or better yet - an expression).

The ScriptHistogramFacet seemed like a good choice, allowing the key to actually be a "key_script", and skipping the "bucketing" stage. I thought that this would allow me to achieve this kind of aggregation, but then I saw that ScriptHistogramFacetCollector.doCollect() requires the value returned from key_script to be of type Number, even if interval==0. I know that currently you're using LongLong maps, but if it accepted other types as well (at least strings), that would be really great.

Am I getting it wrong? Is there a good way to do that? Your help would be much appreciated.

Thanks,
RL

btw, it would have been totally cool if the data collected by the StatisticalFacet would be integrated into the HistogramFacet (and its scripted brother). The StatisticalFacet is great, but often-times the statistical data is required per some kind of "group", and not only on some kind of filter over the whole data.
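
As a rough illustration of what is being asked for, the Python sketch below groups documents by an arbitrary key expression (here a string built from two fields) and computes statistics per group. The `grouped_stats` helper, the field names, and the sample documents are hypothetical; this is not how ScriptHistogramFacet is implemented.

```python
from collections import defaultdict


def grouped_stats(docs, key_fn, value_fn):
    """Compute count/total/min/max/mean of value_fn(doc), grouped by key_fn(doc)."""
    groups = defaultdict(list)
    for doc in docs:
        # The key may be any hashable value, including a string.
        groups[key_fn(doc)].append(value_fn(doc))
    return {
        key: {
            "count": len(values),
            "total": sum(values),
            "min": min(values),
            "max": max(values),
            "mean": sum(values) / len(values),
        }
        for key, values in groups.items()
    }


# Hypothetical documents for illustration.
docs = [
    {"product": "toaster", "currency": "USD", "price": 20.0},
    {"product": "toaster", "currency": "USD", "price": 30.0},
    {"product": "kettle", "currency": "EUR", "price": 10.0},
]

stats = grouped_stats(
    docs,
    key_fn=lambda d: d["product"] + "/" + d["currency"],
    value_fn=lambda d: d["price"],
)
```

This version keeps every value in memory for clarity; an optimized collector would keep only running accumulators per key.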


Re: Statistical facet on multiple fields

kimchy
Administrator
What you are after makes sense. The main challenge with facets is that they get really interesting once you start to combine them (as is the case in this thread with terms and stats). The problem is that the facet implementations are highly optimized, for the simple reason that they might end up running over hundreds of millions of docs. Implementing all the combinations in a generic fashion is certainly possible, but it will incur a performance overhead (in computation, and even more in serialization over the network).

One of the things lined up for 0.15 is to do some refactoring in facets and make them more pluggable. Once that's out of the way, people can write their own facet implementations.

Of course, there should be a good out-of-the-box set of facets that comes with ES. My current line of thought is that there will simply be a lot of facet types, all heavily optimized. There will be a terms_stats, a date_histogram, and others. I don't mind implementing all of those and having them as part of ES. Hopefully the community will help with it (or at the very least, help with coming up with good names for them :) ), so you will get a really rich and heavily optimized set of facets.
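
For readers following along, here is a hedged Python sketch of what a terms_stats style facet could look like conceptually: each shard collects compact per-term accumulators, and the partial results are then reduced into one. The `Stats` class and the `collect_shard` and `reduce_shards` functions are hypothetical names, and this is not the actual ES implementation.

```python
from dataclasses import dataclass


@dataclass
class Stats:
    """Mergeable accumulator; only these four numbers travel between shards."""
    count: int = 0
    total: float = 0.0
    minimum: float = float("inf")
    maximum: float = float("-inf")

    def add(self, value):
        self.count += 1
        self.total += value
        self.minimum = min(self.minimum, value)
        self.maximum = max(self.maximum, value)

    def merge(self, other):
        self.count += other.count
        self.total += other.total
        self.minimum = min(self.minimum, other.minimum)
        self.maximum = max(self.maximum, other.maximum)


def collect_shard(docs, term_field, value_field):
    """Per-shard pass: one Stats accumulator per distinct term."""
    per_term = {}
    for doc in docs:
        per_term.setdefault(doc[term_field], Stats()).add(doc[value_field])
    return per_term


def reduce_shards(shard_results):
    """Combine the per-shard results into a single per-term result."""
    merged = {}
    for per_term in shard_results:
        for term, stats in per_term.items():
            merged.setdefault(term, Stats()).merge(stats)
    return merged


# Hypothetical data spread over two shards.
shard_a = collect_shard(
    [{"product": "toaster", "price": 10.0}, {"product": "kettle", "price": 5.0}],
    "product", "price",
)
shard_b = collect_shard(
    [{"product": "toaster", "price": 20.0}, {"product": "toaster", "price": 30.0}],
    "product", "price",
)
merged = reduce_shards([shard_a, shard_b])
```

Keeping the accumulator to a few numbers per term is what keeps the serialization cost low; a fully generic combination of facets would have to ship more state per bucket.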

-shay.banon

On Tue, Dec 28, 2010 at 2:05 PM, harelba <[hidden email]> wrote:
Hi,

I've been looking for a way to perform aggregations similar to the
ones talked about in this thread, grouping the data according to an
arbitrary set of fields (or better yet - an expression).

The ScriptHistogramFacet seemed like a good choice, allowing the key
to actually be a "key_script", and skipping the "bucketing" stage. I
thought that this would allow me to achieve this kind of aggregation,
but then I saw that ScriptHistogramFacetCollector.doCollect() requires
the value returned from key_script to be of type Number, even if
interval==0. I know that currently you're using LongLong maps, but if
it accepted other types as well (at least strings), that would be
really great.

Am I getting it wrong? Is there a good way to do that? Your help would
be much appreciated.

Thanks,
RL

btw, it would have been totally cool if the data collected by the
StatisticalFacet would be integrated into the HistogramFacet (and its
scripted brother). The StatisticalFacet is great, but often-times the
statistical data is required per some kind of "group", and not only on
some kind of filter over the whole data.
