
RE: calculating amount of disk space used


RE: calculating amount of disk space used

Elastic Noob
Hi, good day,

I was wondering how I can calculate the amount of disk space used by Elasticsearch.

Here's an example:

I collected 1 million tweets into the index "tweets", and I indexed the "text" and "users" keys.


So my two questions are:
  1. In this situation, how do I find out the amount of disk space used by the "tweets" index?
  2. Is there any way I can find out the amount of disk space used by tweets that contain the hashtag "#YOLO", for example?

Thanks everyone!


Best.

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [hidden email].
For more options, visit https://groups.google.com/groups/opt_out.
 
 

Re: calculating amount of disk space used

Zachary Tong
  1. The Indices Stats API can help you out there.  Note that this is only the size of the primaries, not the replicas too (although that's easy to calculate on your own)

    $ curl localhost:9200/test/_stats
    {
       "ok": true,
       "_shards": {
          "total": 6,
          "successful": 6,
          "failed": 0
       }
       [...]
        "indices": {
             "test": {
                "primaries": {
                   "docs": {
                      "count": 7356989,
                      "deleted": 628485
                   },
                   "store": {
                      "size": "8.7gb",
                      "size_in_bytes": 9443315781,
                      "throttle_time": "0s",
                      "throttle_time_in_millis": 0
                   }
    [...]


  2. I do not believe this is possible, at least not through the API.  If you know the average size of each tweet, you could use the Count API and multiply the count by the average document size.
-Zach
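The two suggestions above can be sketched in a few lines of plain Python. This is a minimal, offline sketch: the stats dict mirrors the shape of the `_stats` output shown above, but the byte count, replica count, and #YOLO match count are made-up numbers standing in for what the Indices Stats and Count APIs would return for your cluster.

```python
# Estimate disk usage from an Indices Stats response, following the two
# suggestions above. All numbers here are hypothetical placeholders.

stats = {
    "indices": {
        "tweets": {
            "primaries": {
                "docs": {"count": 1_000_000, "deleted": 0},
                "store": {"size_in_bytes": 2_500_000_000},  # hypothetical
            }
        }
    }
}

primaries = stats["indices"]["tweets"]["primaries"]
primary_bytes = primaries["store"]["size_in_bytes"]
doc_count = primaries["docs"]["count"]

# 1. Total on-disk size including replicas: each replica is roughly a
#    full copy of the primaries, so scale by (1 + number_of_replicas).
number_of_replicas = 1
total_bytes = primary_bytes * (1 + number_of_replicas)

# 2. Rough size of a subset (e.g. tweets matching #YOLO): average doc
#    size times the subset count you would get from the Count API.
avg_doc_bytes = primary_bytes / doc_count
yolo_count = 12_345  # hypothetical Count API result
yolo_bytes_estimate = avg_doc_bytes * yolo_count

print(total_bytes)                  # 5000000000
print(round(yolo_bytes_estimate))   # 30862500
```

Note the subset estimate is only as good as the "average tweet size" assumption; tweets with long text or rich user objects will skew it.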




Re: calculating amount of disk space used

Elastic Noob
Hi,

thanks for the tip!

Best Regards,
Eugene


Re: calculating amount of disk space used

Elastic Noob
In reply to this post by Zachary Tong
Hi again,

I was wondering: does the size of the index reflect the total amount of disk space used by the dataset (including the raw _source data)?

Thanks again!


Re: calculating amount of disk space used

Drew Raines-2
Elastic Noob wrote:

> i was wondering if the size of the index reflect the size of the
> total amount of disk space used by the dataset? ( including the raw
> _source data ) ?

You want the total -> store -> size(_in_bytes).  It should reflect
the total usage of your index's shards across all disks (primaries +
replicas).

Here's a simple way to get this number without digging through the
JSON.

  % curl -s download.elasticsearch.org/es2unix/es >~/bin/es; chmod +x ~/bin/es
  % es indices -v wik
  status name pri rep  size      bytes   docs
  green  wiki   5   1 5.3gb 5731796207 753816

-Drew
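If you'd rather not install an extra tool, the same total -> store -> size(_in_bytes) path can be pulled out of the `_stats` JSON directly. A minimal sketch, assuming a response shaped like the output earlier in the thread (the dict here is a trimmed, made-up example using the numbers from the `es indices` table above):

```python
# Print a small table like `es indices` from an Indices Stats response.
# `resp` is a trimmed, hypothetical _stats payload for a "wiki" index.

resp = {
    "indices": {
        "wiki": {
            "total": {
                "docs": {"count": 753816},
                "store": {"size_in_bytes": 5731796207},
            }
        }
    }
}

print(f"{'name':<6}{'bytes':>12}{'docs':>8}")
for name, idx in resp["indices"].items():
    total = idx["total"]  # primaries + replicas, across all disks
    print(f"{name:<6}{total['store']['size_in_bytes']:>12}"
          f"{total['docs']['count']:>8}")
```

In a real script you would fetch `resp` with an HTTP GET of `localhost:9200/wiki/_stats` and parse the JSON body before the loop.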


Re: calculating amount of disk space used

Elastic Noob
WOW.

cool!

Thanks!
