Timeseries data aggregation

Soumitra Kumar
Hello,

I have time-series metrics on memory usage by processes. Each record has these fields:

<date>, <user>, <processid>, <memoryMB>

Each record means that process <processid>, run by <user>, uses <memoryMB> MB of memory at time <date>. For example:

1, foo, 1, 100
1, foo, 2, 500
2, foo, 3, 100
2, bar, 4, 100

What is the best way to find the peak memory usage by any user in the last 15 days?
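
To make the goal concrete, here is a minimal Python sketch of the intended computation on the four sample rows above, outside Elasticsearch. It assumes that "peak" means the largest per-user total of memoryMB at any single time instant; the function name peak_memory_by_user is hypothetical and used only for illustration.

```python
from collections import defaultdict

# The four sample rows from the post: (date, user, processid, memoryMB).
rows = [
    (1, "foo", 1, 100),
    (1, "foo", 2, 500),
    (2, "foo", 3, 100),
    (2, "bar", 4, 100),
]

def peak_memory_by_user(rows):
    """Sum memoryMB per (user, date), then take the max total per user."""
    totals = defaultdict(int)
    for date, user, _processid, memory_mb in rows:
        totals[(user, date)] += memory_mb

    peaks = {}
    for (user, _date), total in totals.items():
        peaks[user] = max(peaks.get(user, 0), total)
    return peaks

print(peak_memory_by_user(rows))  # {'foo': 600, 'bar': 100}
```

Under that assumption, foo peaks at 600 MB (time 1, processes 1 and 2 together) and bar peaks at 100 MB.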

Here is my current query. Its aggregation sums memoryMB per user at every time instant; I then post-process the output to get the maximum memoryMB.

{
    "from":0,
    "size":0,
    "query":{"bool":{"must":[{"range":{"date":{"gte":"now-15d"}}}]}},
    "aggs": {
        "name": {
            "terms": {
                "field": "user",
                "size": 1000
            },
            "aggs": {
                "date": {
                    "terms": {
                        "field": "date",
                        "order" : { "_term" : "asc" },
                        "size": 15000
                    },
                    "aggs": {
                        "mb": {
                            "sum": { "field": "memoryMB" }
                        }
                    }
                }
            }
        }
    }
}
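
The post-processing step could look roughly like the Python sketch below. It assumes the search response has already been parsed into a dict and follows the usual nested-bucket layout for the aggregation names used in the query ("name", "date", "mb"). The function name peak_from_response and the hand-written sample_response are illustrative assumptions, not output captured from a real cluster.

```python
def peak_from_response(response):
    """Return {user: max summed memoryMB over all date buckets}."""
    peaks = {}
    for user_bucket in response["aggregations"]["name"]["buckets"]:
        # Each date bucket carries the "mb" sum for that user at that instant.
        sums = [b["mb"]["value"] for b in user_bucket["date"]["buckets"]]
        if sums:
            peaks[user_bucket["key"]] = max(sums)
    return peaks

# Hand-written response matching the four sample rows (assumed shape).
sample_response = {
    "aggregations": {
        "name": {
            "buckets": [
                {"key": "foo", "doc_count": 3, "date": {"buckets": [
                    {"key": 1, "doc_count": 2, "mb": {"value": 600.0}},
                    {"key": 2, "doc_count": 1, "mb": {"value": 100.0}},
                ]}},
                {"key": "bar", "doc_count": 1, "date": {"buckets": [
                    {"key": 2, "doc_count": 1, "mb": {"value": 100.0}},
                ]}},
            ]
        }
    }
}

print(peak_from_response(sample_response))
```

This works, but the client has to pull back every date bucket for every user (up to 1000 x 15000 buckets with the sizes in the query) just to keep one number per user.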

Thanks,
-Soumitra.