Aggregating query results nested documents by nested field

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
1 message Options
Reply | Threaded
Open this post in threaded view
|

Aggregating query results nested documents by nested field

ohabushi
This post has NOT been accepted by the mailing list yet.
I've been looking for a while for answer, couldn't find a similar use case in the internet. I have an index of customers with nested documents of addresses. Each Customer could have zero to many addresses. The addresses could contain duplicates which could be identified by a specific field - mergeID. This is not possible to know while inserting the data because the data is updated in different points in time. I wish to aggregate the nested documents when returning the results.

For example: a search query on David Anderson returns the following document:

     "hits": [
     {
        "_shard": 2,
        "_node": "YgEG9pNDRbexEN6Xl15UVA",
        "_index": "MyIndex",
        "_type": "PARTY_new2",
        "_id": "10042",
        "_score": 1.282629,
        "_source": {
         "TimeStamp": "2013-08-22T09:07:20.991382",
           "birthDate": "1960-02-01",
           "firstName": "David",
           "lastName": "Anderson"
           "addresses": [
              {
                 "region": "NE",
                 "postalCode": "2000",
                 "street": "My Street",
                 "houseNumber": "3",
                 "mergeID": "10631265043",
                 "country": "CH",
                 "city": "Basel"
              },
              {
                 "region": "GE",
                 "postalCode": "1234",
                 "street": "rue des Faises",
                 "mergeID": "30835445084",
                 "houseNumber": "4",
                 "country": "FR",
                 "city": "Paris"
              },
              {
                 "region": "NE",
                 "postalCode": "2000",
                 "street": "My Street",
                 "houseNumber": "1",
                 "mergeID": "10631265043",
                 "country": "CH",
                 "city": "Basel"
              }
           ],

        }
The documents contains 3 nested documents with 3 addresses, 2 of the 3 addresses are basically the same. I wish to aggregate the nested documents by the "mergeID" field, and perhaps display only the first address document for each mergeID. Any idea how this could be achieved in query? In this case the result should be the parent document with 2 nested addresses one for Paris and one for Basel. It is also important to mention that not all addresses might have a mergeID field, in this case we would want to display them too.

Thank a lot!