CLOSE_WAIT Sockets

6 messages
CLOSE_WAIT Sockets

elasticsearcher
Hi all,

I'm running ElasticSearch backing up to HDFS with the following elasticsearch.yml:

index:
    store:
        fs:
            memory:
                enabled: true
gateway:
    type: hdfs
    hdfs:
        uri: hdfs://blah:54310
        path: elasticsearch/gateway

I've noticed that over the last week the number of sockets in the CLOSE_WAIT status has grown steadily.

$ netstat -aonp | grep CLOSE_WAIT | wc -l
6484

When I look at the sockets, the owning process is elasticsearch and the destination port is 50010, which I looked up: it is a Hadoop DataNode port.
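One quick way to quantify this is to group the CLOSE_WAIT sockets by remote port. A small sketch (assumes the Linux net-tools `netstat -tn` layout, with the foreign address in column 5 and the state in column 6):

```python
def close_wait_by_port(netstat_lines):
    """Count CLOSE_WAIT sockets per remote port.

    Assumes Linux net-tools `netstat -tn` output: foreign address in
    the fifth column, connection state in the sixth.
    """
    counts = {}
    for line in netstat_lines:
        fields = line.split()
        if len(fields) >= 6 and fields[5] == "CLOSE_WAIT":
            port = fields[4].rsplit(":", 1)[-1]   # "addr:port" -> "port"
            counts[port] = counts.get(port, 0) + 1
    return counts

# Usage:
#   import subprocess
#   out = subprocess.run(["netstat", "-tn"], capture_output=True,
#                        text=True).stdout
#   print(close_wait_by_port(out.splitlines()))
```

If nearly everything lands on 50010, that confirms the leak is on the DataNode connections rather than the client side.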

Has anyone seen this before/have any ideas about how to fix this?

Thanks!

Re: CLOSE_WAIT Sockets

ppearcy
There are settings to reduce the amount of time that a socket stays in this state after closing. On Windows this is two minutes and is registry-configurable; I'm not sure about other OSes.

However, sockets should get reused. What client are you using, and if it is REST/HTTP-based, is it using keep-alive?
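Worth noting: the OS timer tunables mentioned above apply to TIME_WAIT, the state a socket passes through after it has been fully closed on both sides. CLOSE_WAIT is different: it means the remote peer has closed its end and the local process has not yet called close(), so no timer will ever clear it. A minimal sketch of how a socket ends up there:

```python
import socket

# A peer-closed socket sits in CLOSE_WAIT until *this* process closes it.
server = socket.socket()
server.bind(("127.0.0.1", 0))        # loopback, any free port
server.listen(1)
port = server.getsockname()[1]

client = socket.socket()
client.connect(("127.0.0.1", port))
conn, _ = server.accept()
conn.close()                          # peer closes first: FIN sent

data = client.recv(16)                # blocks until EOF; returns b''
# The client socket is now in CLOSE_WAIT, and no OS timer will move it
# out of that state; only a local close() does:
client.close()
server.close()
```

So thousands of accumulating CLOSE_WAIT sockets point at a process holding descriptors it never closes, not at missing OS tuning.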


Re: CLOSE_WAIT Sockets

kimchy
Administrator
Hi,

   I'm not doing anything Hadoop-specific beyond using its public API, so I would expect the same socket to be reused when talking to the Hadoop cluster. Maybe you could dig into Hadoop a bit and see why this might be happening?

-shay.banon



Re: CLOSE_WAIT Sockets

elasticsearcher
In reply to this post by ppearcy
Our base client is based on the pyelasticsearch class from the elasticsearch website. As far as I can tell, the Python client uses keep-alive by default. I'm fairly sure the client isn't the problem, though, since the port numbers point to the connection between Hadoop and elasticsearch. I'll take a look at the interface between the two and see if I can find anything; if not, I'll try to find another way around it.

Thanks.

Re: CLOSE_WAIT Sockets

kimchy
Administrator
Yeah, it does look like CLOSE_WAIT sockets on the connection to Hadoop. In elasticsearch, a single FileSystem instance is used per node; I'm not sure how Hadoop manages sockets internally, so I can't say whether this might be expected behavior.



Re: CLOSE_WAIT Sockets

elasticsearcher
I've found something interesting:

$ ps aux | grep elasticsearch
<10600>
$ /usr/sbin/lsof -p 10600 | grep CLOSE_WAIT
<all destination 50010>
$ netstat -lp | grep 50010
<27091>
$ /usr/sbin/lsof -p 27091 | grep CLOSE_WAIT
<nothing>

Confirming:
$ /usr/sbin/lsof | grep CLOSE_WAIT | grep -v 10600
<nothing>

It appears that something is wrong either with the way elasticsearch uses the Hadoop API or with the Hadoop API itself, since I wouldn't expect thousands of CLOSE_WAIT sockets to be normal behavior. I'll keep looking and maybe post on the Hadoop forums.
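To track whether the leak keeps growing, the lsof check above can be scripted. A sketch that counts CLOSE_WAIT entries in captured `lsof -p <pid>` text (so it can be fed from a subprocess call or a saved file):

```python
def count_close_wait(lsof_output):
    """Count CLOSE_WAIT entries in the text output of `lsof -p <pid>`."""
    return sum(1 for line in lsof_output.splitlines()
               if "CLOSE_WAIT" in line)

# Usage (assumes lsof is installed; 10600 is the elasticsearch pid from
# the session above):
#   import subprocess
#   out = subprocess.run(["/usr/sbin/lsof", "-p", "10600"],
#                        capture_output=True, text=True).stdout
#   print(count_close_wait(out))
```

Sampling this periodically (say, once a minute) would show whether the count climbs steadily or plateaus, which would help distinguish a true descriptor leak from a bounded pool.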