Remote access about Spark and Elasticsearch


Jen-Ming Chung
Hi Everyone,

First, I set up the ES server on AWS with the following commands; TCP ports 9200 and 9300 are allowed in the security group:

wget https://download.elastic.co/elasticsearch/elasticsearch/elasticsearch-1.5.2.deb
sudo dpkg -i elasticsearch-1.5.2.deb

Then I ran sudo netstat -atnp on the server to confirm that both ports are listening:

tcp6       0      0 :::9200                 :::*                    LISTEN      
tcp6       0      0 :::9300                 :::*                    LISTEN
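
The netstat output shows the ports listening on the server itself, which is not quite the same as being reachable from the laptop. A minimal sketch of a client-side check, using only the JDK socket API (the `PortCheck` object and `canConnect` helper are made-up names for illustration, and the 3-second timeout is an arbitrary choice):

```scala
import java.net.{InetSocketAddress, Socket}
import scala.util.Try

object PortCheck {
  // Returns true if a TCP connection to host:port succeeds within timeoutMs.
  def canConnect(host: String, port: Int, timeoutMs: Int = 3000): Boolean = {
    val socket = new Socket()
    try {
      Try(socket.connect(new InetSocketAddress(host, port), timeoutMs)).isSuccess
    } finally {
      socket.close()
    }
  }

  def main(args: Array[String]): Unit = {
    // Address taken from the es.nodes setting in this post.
    Seq(9200, 9300).foreach { port =>
      println(s"52.68.202.80:$port reachable=${canConnect("52.68.202.80", port)}")
    }
  }
}
```

Running this from the machine that runs the Spark driver, against each address that shows up in the log, would show which addresses are reachable from there.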

Then, my Scala code:
    import org.apache.spark.{SparkConf, SparkContext}
    import org.elasticsearch.spark._  // adds esRDD to SparkContext

    val sparkConf = new SparkConf().setAppName("Test")
      .setMaster("local[2]")
      .set("es.nodes", "52.68.202.80:9200")
      .set("es.nodes.discovery", "false")
    val sc = new SparkContext(sparkConf)

    // total 267 hits
    val query = "{\"query\":{\"bool\":{\"must\":[{\"range\":{\"scan_time\":{\"from\":\"2014-09-01T00:00:00\",\"to\":\"2014-09-01T00:00:59\"}}}]}}}"
    val data = sc.esRDD("wifi-collection/final_data", query)
    // collect() pulls the matching documents back to the driver
    data.collect().foreach(println)

Strangely, when I run the code on my laptop (localhost), I always get the following output:

15/05/14 00:23:36 INFO SparkContext: Running Spark version 1.3.1
15/05/14 00:23:37 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
15/05/14 00:23:37 INFO SecurityManager: Changing view acls to: jeremy
15/05/14 00:23:37 INFO SecurityManager: Changing modify acls to: jeremy
15/05/14 00:23:37 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(jeremy); users with modify permissions: Set(jeremy)
15/05/14 00:23:37 INFO Slf4jLogger: Slf4jLogger started
15/05/14 00:23:37 INFO Remoting: Starting remoting
15/05/14 00:23:37 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://sparkDriver@orion:58665]
15/05/14 00:23:37 INFO Utils: Successfully started service 'sparkDriver' on port 58665.
15/05/14 00:23:37 INFO SparkEnv: Registering MapOutputTracker
15/05/14 00:23:37 INFO SparkEnv: Registering BlockManagerMaster
15/05/14 00:23:37 INFO DiskBlockManager: Created local directory at /var/folders/lz/bc5hqqsn1gvg2hl4b8svwd_w0000gn/T/spark-4d9ee290-78d6-4537-8975-33886ece0b86/blockmgr-469a0ac4-e62d-4e55-b848-c3ddc5bf121f
15/05/14 00:23:37 INFO MemoryStore: MemoryStore started with capacity 1966.1 MB
15/05/14 00:23:37 INFO HttpFileServer: HTTP File server directory is /var/folders/lz/bc5hqqsn1gvg2hl4b8svwd_w0000gn/T/spark-81548d67-dcb2-4e79-b382-79a2a0a32a76/httpd-e77bb5ee-c571-4a0b-bdab-8e8037a3205e
15/05/14 00:23:37 INFO HttpServer: Starting HTTP Server
15/05/14 00:23:37 INFO Server: jetty-8.y.z-SNAPSHOT
15/05/14 00:23:37 INFO AbstractConnector: Started SocketConnector@0.0.0.0:58666
15/05/14 00:23:37 INFO Utils: Successfully started service 'HTTP file server' on port 58666.
15/05/14 00:23:37 INFO SparkEnv: Registering OutputCommitCoordinator
15/05/14 00:23:37 INFO Server: jetty-8.y.z-SNAPSHOT
15/05/14 00:23:37 INFO AbstractConnector: Started SelectChannelConnector@0.0.0.0:4040
15/05/14 00:23:37 INFO Utils: Successfully started service 'SparkUI' on port 4040.
15/05/14 00:23:37 INFO SparkUI: Started SparkUI at http://orion:4040
15/05/14 00:23:38 INFO Executor: Starting executor ID <driver> on host localhost
15/05/14 00:23:38 INFO AkkaUtils: Connecting to HeartbeatReceiver: akka.tcp://sparkDriver@orion:58665/user/HeartbeatReceiver
15/05/14 00:23:38 INFO NettyBlockTransferService: Server created on 58667
15/05/14 00:23:38 INFO BlockManagerMaster: Trying to register BlockManager
15/05/14 00:23:38 INFO BlockManagerMasterActor: Registering block manager localhost:58667 with 1966.1 MB RAM, BlockManagerId(<driver>, localhost, 58667)
15/05/14 00:23:38 INFO BlockManagerMaster: Registered BlockManager
15/05/14 00:23:39 INFO Version: Elasticsearch Hadoop v2.1.0.Beta4 [2c62e273d2]
15/05/14 00:23:39 INFO ScalaEsRDD: Reading from [wifi-collection/final_data]
15/05/14 00:23:39 INFO ScalaEsRDD: Discovered mapping {wifi-collection=[mappings=[final_data=[bssid=STRING, gps_lat=DOUBLE, gps_lng=DOUBLE, imei=STRING, m_lat=DOUBLE, m_lng=DOUBLE, net_lat=DOUBLE, net_lng=DOUBLE, no=LONG, rss=DOUBLE, s_no=LONG, scan_time=DATE, source=STRING, ssid=STRING, trace=LONG]]]} for [wifi-collection/final_data]
15/05/14 00:23:39 INFO SparkContext: Starting job: collect at App.scala:19
15/05/14 00:23:39 INFO DAGScheduler: Got job 0 (collect at App.scala:19) with 5 output partitions (allowLocal=false)
15/05/14 00:23:39 INFO DAGScheduler: Final stage: Stage 0(collect at App.scala:19)
15/05/14 00:23:39 INFO DAGScheduler: Parents of final stage: List()
15/05/14 00:23:39 INFO DAGScheduler: Missing parents: List()
15/05/14 00:23:39 INFO DAGScheduler: Submitting Stage 0 (ScalaEsRDD[0] at RDD at AbstractEsRDD.scala:17), which has no missing parents
15/05/14 00:23:39 INFO MemoryStore: ensureFreeSpace(1496) called with curMem=0, maxMem=2061647216
15/05/14 00:23:39 INFO MemoryStore: Block broadcast_0 stored as values in memory (estimated size 1496.0 B, free 1966.1 MB)
15/05/14 00:23:39 INFO MemoryStore: ensureFreeSpace(1148) called with curMem=1496, maxMem=2061647216
15/05/14 00:23:39 INFO MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 1148.0 B, free 1966.1 MB)
15/05/14 00:23:39 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on localhost:58667 (size: 1148.0 B, free: 1966.1 MB)
15/05/14 00:23:39 INFO BlockManagerMaster: Updated info of block broadcast_0_piece0
15/05/14 00:23:39 INFO SparkContext: Created broadcast 0 from broadcast at DAGScheduler.scala:839
15/05/14 00:23:39 INFO DAGScheduler: Submitting 5 missing tasks from Stage 0 (ScalaEsRDD[0] at RDD at AbstractEsRDD.scala:17)
15/05/14 00:23:39 INFO TaskSchedulerImpl: Adding task set 0.0 with 5 tasks
15/05/14 00:23:39 INFO TaskSetManager: Starting task 0.0 in stage 0.0 (TID 0, localhost, ANY, 3352 bytes)
15/05/14 00:23:39 INFO TaskSetManager: Starting task 1.0 in stage 0.0 (TID 1, localhost, ANY, 3352 bytes)
15/05/14 00:23:39 INFO Executor: Running task 1.0 in stage 0.0 (TID 1)
15/05/14 00:23:39 INFO Executor: Running task 0.0 in stage 0.0 (TID 0)
15/05/14 00:24:54 INFO HttpMethodDirector: I/O exception (java.net.ConnectException) caught when processing request: Operation timed out
15/05/14 00:24:54 INFO HttpMethodDirector: Retrying request
15/05/14 00:24:54 INFO HttpMethodDirector: I/O exception (java.net.ConnectException) caught when processing request: Operation timed out
15/05/14 00:24:54 INFO HttpMethodDirector: Retrying request
15/05/14 00:26:09 INFO HttpMethodDirector: I/O exception (java.net.ConnectException) caught when processing request: Operation timed out
15/05/14 00:26:09 INFO HttpMethodDirector: Retrying request
15/05/14 00:26:09 INFO HttpMethodDirector: I/O exception (java.net.ConnectException) caught when processing request: Operation timed out
15/05/14 00:26:09 INFO HttpMethodDirector: Retrying request
15/05/14 00:27:25 INFO HttpMethodDirector: I/O exception (java.net.ConnectException) caught when processing request: Operation timed out
15/05/14 00:27:25 INFO HttpMethodDirector: Retrying request
15/05/14 00:27:25 INFO HttpMethodDirector: I/O exception (java.net.ConnectException) caught when processing request: Operation timed out
15/05/14 00:27:25 INFO HttpMethodDirector: Retrying request
15/05/14 00:28:40 ERROR NetworkClient: Node [Operation timed out] failed (172.31.14.100:9200); selected next node [52.68.202.80:9200]
15/05/14 00:28:40 ERROR NetworkClient: Node [Operation timed out] failed (172.31.14.100:9200); selected next node [52.68.202.80:9200]
15/05/14 00:28:40 INFO Executor: Finished task 0.0 in stage 0.0 (TID 0). 18184 bytes result sent to driver
15/05/14 00:28:40 INFO TaskSetManager: Starting task 2.0 in stage 0.0 (TID 2, localhost, ANY, 3352 bytes)
15/05/14 00:28:40 INFO Executor: Running task 2.0 in stage 0.0 (TID 2)
15/05/14 00:28:40 INFO TaskSetManager: Finished task 0.0 in stage 0.0 (TID 0) in 301234 ms on localhost (1/5)
15/05/14 00:28:40 INFO Executor: Finished task 1.0 in stage 0.0 (TID 1). 19532 bytes result sent to driver
15/05/14 00:28:40 INFO TaskSetManager: Starting task 3.0 in stage 0.0 (TID 3, localhost, ANY, 3352 bytes)
15/05/14 00:28:40 INFO Executor: Running task 3.0 in stage 0.0 (TID 3)
15/05/14 00:28:40 INFO TaskSetManager: Finished task 1.0 in stage 0.0 (TID 1) in 301315 ms on localhost (2/5)
15/05/14 00:29:55 INFO HttpMethodDirector: I/O exception (java.net.ConnectException) caught when processing request: Operation timed out
15/05/14 00:29:55 INFO HttpMethodDirector: Retrying request
15/05/14 00:29:56 INFO HttpMethodDirector: I/O exception (java.net.ConnectException) caught when processing request: Operation timed out
15/05/14 00:29:56 INFO HttpMethodDirector: Retrying request
15/05/14 00:31:11 INFO HttpMethodDirector: I/O exception (java.net.ConnectException) caught when processing request: Operation timed out
15/05/14 00:31:11 INFO HttpMethodDirector: Retrying request
15/05/14 00:31:11 INFO HttpMethodDirector: I/O exception (java.net.ConnectException) caught when processing request: Operation timed out
15/05/14 00:31:11 INFO HttpMethodDirector: Retrying request
15/05/14 00:32:26 INFO HttpMethodDirector: I/O exception (java.net.ConnectException) caught when processing request: Operation timed out
15/05/14 00:32:26 INFO HttpMethodDirector: Retrying request
15/05/14 00:32:26 INFO HttpMethodDirector: I/O exception (java.net.ConnectException) caught when processing request: Operation timed out
15/05/14 00:32:26 INFO HttpMethodDirector: Retrying request
15/05/14 00:33:42 ERROR NetworkClient: Node [Operation timed out] failed (172.31.14.100:9200); selected next node [52.68.202.80:9200]
15/05/14 00:33:42 ERROR NetworkClient: Node [Operation timed out] failed (172.31.14.100:9200); selected next node [52.68.202.80:9200]
15/05/14 00:33:42 INFO Executor: Finished task 2.0 in stage 0.0 (TID 2). 18177 bytes result sent to driver
15/05/14 00:33:42 INFO TaskSetManager: Starting task 4.0 in stage 0.0 (TID 4, localhost, ANY, 3352 bytes)
15/05/14 00:33:42 INFO Executor: Running task 4.0 in stage 0.0 (TID 4)
15/05/14 00:33:42 INFO TaskSetManager: Finished task 2.0 in stage 0.0 (TID 2) in 301864 ms on localhost (3/5)
15/05/14 00:33:42 INFO Executor: Finished task 3.0 in stage 0.0 (TID 3). 18176 bytes result sent to driver
15/05/14 00:33:42 INFO TaskSetManager: Finished task 3.0 in stage 0.0 (TID 3) in 301882 ms on localhost (4/5)
15/05/14 00:34:58 INFO HttpMethodDirector: I/O exception (java.net.ConnectException) caught when processing request: Operation timed out
15/05/14 00:34:58 INFO HttpMethodDirector: Retrying request
15/05/14 00:36:13 INFO HttpMethodDirector: I/O exception (java.net.ConnectException) caught when processing request: Operation timed out
15/05/14 00:36:13 INFO HttpMethodDirector: Retrying request
15/05/14 00:37:29 INFO HttpMethodDirector: I/O exception (java.net.ConnectException) caught when processing request: Operation timed out
15/05/14 00:37:29 INFO HttpMethodDirector: Retrying request
15/05/14 00:38:44 ERROR NetworkClient: Node [Operation timed out] failed (172.31.14.100:9200); selected next node [52.68.202.80:9200]
15/05/14 00:38:45 INFO Executor: Finished task 4.0 in stage 0.0 (TID 4). 19190 bytes result sent to driver
15/05/14 00:38:45 INFO TaskSetManager: Finished task 4.0 in stage 0.0 (TID 4) in 302573 ms on localhost (5/5)
15/05/14 00:38:45 INFO TaskSchedulerImpl: Removed TaskSet 0.0, whose tasks have all completed, from pool 
15/05/14 00:38:45 INFO DAGScheduler: Stage 0 (collect at App.scala:19) finished in 905.680 s
15/05/14 00:38:45 INFO DAGScheduler: Job 0 finished: collect at App.scala:19, took 905.884621 s
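
The TaskSetManager lines give each task's duration, so the stage total can be cross-checked against them. A minimal sketch, assuming `local[2]` behaves as two slots that each pick up the next task as soon as they are free (the `StageTimeline` object and `makespan` method are made-up names for illustration):

```scala
object StageTimeline {
  // Task durations in ms, copied from the TaskSetManager lines above (TID 0 to 4).
  val taskMillis: Seq[Long] = Seq(301234L, 301315L, 301864L, 301882L, 302573L)

  // Greedy schedule: each task goes to whichever slot becomes free first.
  // Returns the time at which the last slot finishes.
  def makespan(durations: Seq[Long], slots: Int): Long = {
    val finishTimes = durations.foldLeft(Vector.fill(slots)(0L)) { (free, d) =>
      val i = free.indexOf(free.min)
      free.updated(i, free(i) + d)
    }
    finishTimes.max
  }

  def main(args: Array[String]): Unit =
    println(makespan(taskMillis, slots = 2)) // 905671
}
```

The result, 905671 ms, is within a few milliseconds of the 905.680 s reported for Stage 0. The timestamps also show the connection attempts inside each task arriving about 75 seconds apart, four per task, which would account for the roughly 301 s each task takes before the client moves on to 52.68.202.80.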

* Finally, after a long wait, foreach(println) starts printing:

(AU1KNAQHKoN_e7xsz2J7,Map(no -> 38344049, s_no -> 2988722, source -> wifiscan2, bssid -> d850e6d5a770, ssid -> jasonlan, rss -> -91.0, gps_lat -> 24.99175413, gps_lng -> 121.28153416, net_lat -> -10000.0, net_lng -> -10000.0, m_lat -> 24.9916650318, m_lng -> 121.281452128, imei -> 352842060663324, scan_time -> Mon Sep 01 08:00:06 CST 2014, trace -> 1))

* and then it keeps trying to connect:

15/05/14 00:38:45 INFO SparkContext: Starting job: count at App.scala:20
15/05/14 00:38:45 INFO DAGScheduler: Got job 1 (count at App.scala:20) with 5 output partitions (allowLocal=false)
15/05/14 00:38:45 INFO DAGScheduler: Final stage: Stage 1(count at App.scala:20)
15/05/14 00:38:45 INFO DAGScheduler: Parents of final stage: List()
15/05/14 00:38:45 INFO DAGScheduler: Missing parents: List()
15/05/14 00:38:45 INFO DAGScheduler: Submitting Stage 1 (ScalaEsRDD[0] at RDD at AbstractEsRDD.scala:17), which has no missing parents
15/05/14 00:38:45 INFO MemoryStore: ensureFreeSpace(1464) called with curMem=2644, maxMem=2061647216
15/05/14 00:38:45 INFO MemoryStore: Block broadcast_1 stored as values in memory (estimated size 1464.0 B, free 1966.1 MB)
15/05/14 00:38:45 INFO MemoryStore: ensureFreeSpace(1121) called with curMem=4108, maxMem=2061647216
15/05/14 00:38:45 INFO MemoryStore: Block broadcast_1_piece0 stored as bytes in memory (estimated size 1121.0 B, free 1966.1 MB)
15/05/14 00:38:45 INFO BlockManagerInfo: Added broadcast_1_piece0 in memory on localhost:58667 (size: 1121.0 B, free: 1966.1 MB)
15/05/14 00:38:45 INFO BlockManagerMaster: Updated info of block broadcast_1_piece0
15/05/14 00:38:45 INFO SparkContext: Created broadcast 1 from broadcast at DAGScheduler.scala:839
15/05/14 00:38:45 INFO DAGScheduler: Submitting 5 missing tasks from Stage 1 (ScalaEsRDD[0] at RDD at AbstractEsRDD.scala:17)
15/05/14 00:38:45 INFO TaskSchedulerImpl: Adding task set 1.0 with 5 tasks
15/05/14 00:38:45 INFO TaskSetManager: Starting task 0.0 in stage 1.0 (TID 5, localhost, ANY, 3352 bytes)
15/05/14 00:38:45 INFO TaskSetManager: Starting task 1.0 in stage 1.0 (TID 6, localhost, ANY, 3352 bytes)
15/05/14 00:38:45 INFO Executor: Running task 0.0 in stage 1.0 (TID 5)
15/05/14 00:38:45 INFO Executor: Running task 1.0 in stage 1.0 (TID 6)
15/05/14 00:40:00 INFO HttpMethodDirector: I/O exception (java.net.ConnectException) caught when processing request: Operation timed out
15/05/14 00:40:00 INFO HttpMethodDirector: Retrying request


Oddly, when I run sbt package, deploy the jar to the ES server, and run it there with spark-submit, everything works with no waiting and no error messages.

I am sure I am missing something, but I can't figure out what. >_<

Thanks for your help!!


Re: Remote access about Spark and Elasticsearch

Curt Kohler

I'm having the same problem today as well, although I'm trying to access a server running in an AWS VPC (it runs correctly with everything on my laptop). In my case, it appears to try the private IP address first before failing over to the public one (which eventually works).

Curt
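
The private-versus-public distinction described above can be checked for the two addresses in the original log. A minimal sketch using the JDK's `InetAddress.isSiteLocalAddress` (the `AddressKind` object and `isPrivate` helper are made-up names for illustration):

```scala
import java.net.InetAddress

object AddressKind {
  // True for the private IPv4 ranges (10/8, 172.16/12, 192.168/16).
  // getByName does not perform a DNS lookup when given a literal IP address.
  def isPrivate(ip: String): Boolean =
    InetAddress.getByName(ip).isSiteLocalAddress

  def main(args: Array[String]): Unit = {
    // Both addresses appear in the NetworkClient error lines of the first post.
    Seq("172.31.14.100", "52.68.202.80").foreach { ip =>
      println(s"$ip private=${isPrivate(ip)}")
    }
  }
}
```

172.31.14.100 falls in the 172.16/12 private range, so it would only be routable from inside the VPC, which is consistent with the job running cleanly when submitted on the ES server itself. Later elasticsearch-hadoop releases document an `es.nodes.wan.only` setting aimed at this kind of setup; whether it is available in the 2.1.0.Beta4 build shown in the log is not something this thread confirms.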

On Wednesday, May 13, 2015 at 12:48:54 PM UTC-4, Jen-Ming Chung wrote:
Hi Everyone,

First, I setup the ES server on AWS by the following instructions and the tcp port 9200, 9300 is allowed on security group:

wget <a href="https://download.elastic.co/elasticsearch/elasticsearch/elasticsearch-1.5.2.deb" target="_blank" rel="nofollow" onmousedown="this.href='https://www.google.com/url?q\75https%3A%2F%2Fdownload.elastic.co%2Felasticsearch%2Felasticsearch%2Felasticsearch-1.5.2.deb\46sa\75D\46sntz\0751\46usg\75AFQjCNGyUFjZsQqm0C5WFGL9g9KFi18tsw';return true;" onclick="this.href='https://www.google.com/url?q\75https%3A%2F%2Fdownload.elastic.co%2Felasticsearch%2Felasticsearch%2Felasticsearch-1.5.2.deb\46sa\75D\46sntz\0751\46usg\75AFQjCNGyUFjZsQqm0C5WFGL9g9KFi18tsw';return true;">https://download.elastic.co/elasticsearch/elasticsearch/elasticsearch-1.5.2.deb
sudo dpkg -i elasticsearch-1.5.2.deb

And by using sudo netstat -atnp to make sure the above ports are listening:

tcp6       0      0 :::9200                 :::*                    LISTEN      
tcp6       0      0 :::9300                 :::*                    LISTEN

Then, my scala code:
    val sparkConf = new SparkConf().setAppName("Test")
      .setMaster("local[2]")
      .set("es.nodes", "<a href="http://52.68.202.80:9200" target="_blank" rel="nofollow" onmousedown="this.href='http://www.google.com/url?q\75http%3A%2F%2F52.68.202.80%3A9200\46sa\75D\46sntz\0751\46usg\75AFQjCNF2Y0M2PrNWOnFXLHvK6-0hwBh8eA';return true;" onclick="this.href='http://www.google.com/url?q\75http%3A%2F%2F52.68.202.80%3A9200\46sa\75D\46sntz\0751\46usg\75AFQjCNF2Y0M2PrNWOnFXLHvK6-0hwBh8eA';return true;">52.68.202.80:9200")
      .set("es.nodes.discovery", "false")
    val sc = new SparkContext(sparkConf)

    // total 267 hits
    val query = "{\"query\":{\"bool\":{\"must\":[{\"range\":{\"scan_time\":{\"from\":\"2014-09-01T00:00:00\",\"to\":\"2014-09-01T00:00:59\"}}}]}}}";
    val data = sc.esRDD("wifi-collection/final_data", query)
    data.collection().foreach(println)

It's weird that when I run the code on laptop (localhost), I always got the following message:

15/05/14 00:23:36 INFO SparkContext: Running Spark version 1.3.1
15/05/14 00:23:37 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
15/05/14 00:23:37 INFO SecurityManager: Changing view acls to: jeremy
15/05/14 00:23:37 INFO SecurityManager: Changing modify acls to: jeremy
15/05/14 00:23:37 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(jeremy); users with modify permissions: Set(jeremy)
15/05/14 00:23:37 INFO Slf4jLogger: Slf4jLogger started
15/05/14 00:23:37 INFO Remoting: Starting remoting
15/05/14 00:23:37 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://sparkDriver@orion:58665]
15/05/14 00:23:37 INFO Utils: Successfully started service 'sparkDriver' on port 58665.
15/05/14 00:23:37 INFO SparkEnv: Registering MapOutputTracker
15/05/14 00:23:37 INFO SparkEnv: Registering BlockManagerMaster
15/05/14 00:23:37 INFO DiskBlockManager: Created local directory at /var/folders/lz/bc5hqqsn1gvg2hl4b8svwd_w0000gn/T/spark-4d9ee290-78d6-4537-8975-33886ece0b86/blockmgr-469a0ac4-e62d-4e55-b848-c3ddc5bf121f
15/05/14 00:23:37 INFO MemoryStore: MemoryStore started with capacity 1966.1 MB
15/05/14 00:23:37 INFO HttpFileServer: HTTP File server directory is /var/folders/lz/bc5hqqsn1gvg2hl4b8svwd_w0000gn/T/spark-81548d67-dcb2-4e79-b382-79a2a0a32a76/httpd-e77bb5ee-c571-4a0b-bdab-8e8037a3205e
15/05/14 00:23:37 INFO HttpServer: Starting HTTP Server
15/05/14 00:23:37 INFO Server: jetty-8.y.z-SNAPSHOT
15/05/14 00:23:37 INFO AbstractConnector: Started <a href="http://SocketConnector@0.0.0.0:58666" target="_blank" rel="nofollow" onmousedown="this.href='http://www.google.com/url?q\75http%3A%2F%2FSocketConnector%400.0.0.0%3A58666\46sa\75D\46sntz\0751\46usg\75AFQjCNGozRmtCURYXn6BRws-sOTpGhFWhA';return true;" onclick="this.href='http://www.google.com/url?q\75http%3A%2F%2FSocketConnector%400.0.0.0%3A58666\46sa\75D\46sntz\0751\46usg\75AFQjCNGozRmtCURYXn6BRws-sOTpGhFWhA';return true;">SocketConnector@0.0.0.0:58666
15/05/14 00:23:37 INFO Utils: Successfully started service 'HTTP file server' on port 58666.
15/05/14 00:23:37 INFO SparkEnv: Registering OutputCommitCoordinator
15/05/14 00:23:37 INFO Server: jetty-8.y.z-SNAPSHOT
15/05/14 00:23:37 INFO AbstractConnector: Started <a href="http://SelectChannelConnector@0.0.0.0:4040" target="_blank" rel="nofollow" onmousedown="this.href='http://www.google.com/url?q\75http%3A%2F%2FSelectChannelConnector%400.0.0.0%3A4040\46sa\75D\46sntz\0751\46usg\75AFQjCNF2JSRYU1X_oLKXovTLAO5bvd0fvg';return true;" onclick="this.href='http://www.google.com/url?q\75http%3A%2F%2FSelectChannelConnector%400.0.0.0%3A4040\46sa\75D\46sntz\0751\46usg\75AFQjCNF2JSRYU1X_oLKXovTLAO5bvd0fvg';return true;">SelectChannelConnector@0.0.0.0:4040
15/05/14 00:23:37 INFO Utils: Successfully started service 'SparkUI' on port 4040.
15/05/14 00:23:37 INFO SparkUI: Started SparkUI at <a href="http://orion:4040" target="_blank" rel="nofollow" onmousedown="this.href='http://www.google.com/url?q\75http%3A%2F%2Forion%3A4040\46sa\75D\46sntz\0751\46usg\75AFQjCNEmeEx90tlYNETGh_kCM1vWQ4_oHw';return true;" onclick="this.href='http://www.google.com/url?q\75http%3A%2F%2Forion%3A4040\46sa\75D\46sntz\0751\46usg\75AFQjCNEmeEx90tlYNETGh_kCM1vWQ4_oHw';return true;">http://orion:4040
15/05/14 00:23:38 INFO Executor: Starting executor ID <driver> on host localhost
15/05/14 00:23:38 INFO AkkaUtils: Connecting to HeartbeatReceiver: akka.tcp://sparkDriver@orion:58665/user/HeartbeatReceiver
15/05/14 00:23:38 INFO NettyBlockTransferService: Server created on 58667
15/05/14 00:23:38 INFO BlockManagerMaster: Trying to register BlockManager
15/05/14 00:23:38 INFO BlockManagerMasterActor: Registering block manager localhost:58667 with 1966.1 MB RAM, BlockManagerId(<driver>, localhost, 58667)
15/05/14 00:23:38 INFO BlockManagerMaster: Registered BlockManager
15/05/14 00:23:39 INFO Version: Elasticsearch Hadoop v2.1.0.Beta4 [2c62e273d2]
15/05/14 00:23:39 INFO ScalaEsRDD: Reading from [wifi-collection/final_data]
15/05/14 00:23:39 INFO ScalaEsRDD: Discovered mapping {wifi-collection=[mappings=[final_data=[bssid=STRING, gps_lat=DOUBLE, gps_lng=DOUBLE, imei=STRING, m_lat=DOUBLE, m_lng=DOUBLE, net_lat=DOUBLE, net_lng=DOUBLE, no=LONG, rss=DOUBLE, s_no=LONG, scan_time=DATE, source=STRING, ssid=STRING, trace=LONG]]]} for [wifi-collection/final_data]
15/05/14 00:23:39 INFO SparkContext: Starting job: collect at App.scala:19
15/05/14 00:23:39 INFO DAGScheduler: Got job 0 (collect at App.scala:19) with 5 output partitions (allowLocal=false)
15/05/14 00:23:39 INFO DAGScheduler: Final stage: Stage 0(collect at App.scala:19)
15/05/14 00:23:39 INFO DAGScheduler: Parents of final stage: List()
15/05/14 00:23:39 INFO DAGScheduler: Missing parents: List()
15/05/14 00:23:39 INFO DAGScheduler: Submitting Stage 0 (ScalaEsRDD[0] at RDD at AbstractEsRDD.scala:17), which has no missing parents
15/05/14 00:23:39 INFO MemoryStore: ensureFreeSpace(1496) called with curMem=0, maxMem=2061647216
15/05/14 00:23:39 INFO MemoryStore: Block broadcast_0 stored as values in memory (estimated size 1496.0 B, free 1966.1 MB)
15/05/14 00:23:39 INFO MemoryStore: ensureFreeSpace(1148) called with curMem=1496, maxMem=2061647216
15/05/14 00:23:39 INFO MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 1148.0 B, free 1966.1 MB)
15/05/14 00:23:39 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on localhost:58667 (size: 1148.0 B, free: 1966.1 MB)
15/05/14 00:23:39 INFO BlockManagerMaster: Updated info of block broadcast_0_piece0
15/05/14 00:23:39 INFO SparkContext: Created broadcast 0 from broadcast at DAGScheduler.scala:839
15/05/14 00:23:39 INFO DAGScheduler: Submitting 5 missing tasks from Stage 0 (ScalaEsRDD[0] at RDD at AbstractEsRDD.scala:17)
15/05/14 00:23:39 INFO TaskSchedulerImpl: Adding task set 0.0 with 5 tasks
15/05/14 00:23:39 INFO TaskSetManager: Starting task 0.0 in stage 0.0 (TID 0, localhost, ANY, 3352 bytes)
15/05/14 00:23:39 INFO TaskSetManager: Starting task 1.0 in stage 0.0 (TID 1, localhost, ANY, 3352 bytes)
15/05/14 00:23:39 INFO Executor: Running task 1.0 in stage 0.0 (TID 1)
15/05/14 00:23:39 INFO Executor: Running task 0.0 in stage 0.0 (TID 0)
15/05/14 00:24:54 INFO HttpMethodDirector: I/O exception (java.net.ConnectException) caught when processing request: Operation timed out
15/05/14 00:24:54 INFO HttpMethodDirector: Retrying request
15/05/14 00:24:54 INFO HttpMethodDirector: I/O exception (java.net.ConnectException) caught when processing request: Operation timed out
15/05/14 00:24:54 INFO HttpMethodDirector: Retrying request
15/05/14 00:26:09 INFO HttpMethodDirector: I/O exception (java.net.ConnectException) caught when processing request: Operation timed out
15/05/14 00:26:09 INFO HttpMethodDirector: Retrying request
15/05/14 00:26:09 INFO HttpMethodDirector: I/O exception (java.net.ConnectException) caught when processing request: Operation timed out
15/05/14 00:26:09 INFO HttpMethodDirector: Retrying request
15/05/14 00:27:25 INFO HttpMethodDirector: I/O exception (java.net.ConnectException) caught when processing request: Operation timed out
15/05/14 00:27:25 INFO HttpMethodDirector: Retrying request
15/05/14 00:27:25 INFO HttpMethodDirector: I/O exception (java.net.ConnectException) caught when processing request: Operation timed out
15/05/14 00:27:25 INFO HttpMethodDirector: Retrying request
15/05/14 00:28:40 ERROR NetworkClient: Node [Operation timed out] failed (<a href="http://172.31.14.100:9200" target="_blank" rel="nofollow" onmousedown="this.href='http://www.google.com/url?q\75http%3A%2F%2F172.31.14.100%3A9200\46sa\75D\46sntz\0751\46usg\75AFQjCNH13KAOZtpwLGdKMSw_hVyxNpaqUA';return true;" onclick="this.href='http://www.google.com/url?q\75http%3A%2F%2F172.31.14.100%3A9200\46sa\75D\46sntz\0751\46usg\75AFQjCNH13KAOZtpwLGdKMSw_hVyxNpaqUA';return true;">172.31.14.100:9200); selected next node [<a href="http://52.68.202.80:9200" target="_blank" rel="nofollow" onmousedown="this.href='http://www.google.com/url?q\75http%3A%2F%2F52.68.202.80%3A9200\46sa\75D\46sntz\0751\46usg\75AFQjCNF2Y0M2PrNWOnFXLHvK6-0hwBh8eA';return true;" onclick="this.href='http://www.google.com/url?q\75http%3A%2F%2F52.68.202.80%3A9200\46sa\75D\46sntz\0751\46usg\75AFQjCNF2Y0M2PrNWOnFXLHvK6-0hwBh8eA';return true;">52.68.202.80:9200]
15/05/14 00:28:40 ERROR NetworkClient: Node [Operation timed out] failed (<a href="http://172.31.14.100:9200" target="_blank" rel="nofollow" onmousedown="this.href='http://www.google.com/url?q\75http%3A%2F%2F172.31.14.100%3A9200\46sa\75D\46sntz\0751\46usg\75AFQjCNH13KAOZtpwLGdKMSw_hVyxNpaqUA';return true;" onclick="this.href='http://www.google.com/url?q\75http%3A%2F%2F172.31.14.100%3A9200\46sa\75D\46sntz\0751\46usg\75AFQjCNH13KAOZtpwLGdKMSw_hVyxNpaqUA';return true;">172.31.14.100:9200); selected next node [<a href="http://52.68.202.80:9200" target="_blank" rel="nofollow" onmousedown="this.href='http://www.google.com/url?q\75http%3A%2F%2F52.68.202.80%3A9200\46sa\75D\46sntz\0751\46usg\75AFQjCNF2Y0M2PrNWOnFXLHvK6-0hwBh8eA';return true;" onclick="this.href='http://www.google.com/url?q\75http%3A%2F%2F52.68.202.80%3A9200\46sa\75D\46sntz\0751\46usg\75AFQjCNF2Y0M2PrNWOnFXLHvK6-0hwBh8eA';return true;">52.68.202.80:9200]
15/05/14 00:28:40 INFO Executor: Finished task 0.0 in stage 0.0 (TID 0). 18184 bytes result sent to driver
15/05/14 00:28:40 INFO TaskSetManager: Starting task 2.0 in stage 0.0 (TID 2, localhost, ANY, 3352 bytes)
15/05/14 00:28:40 INFO Executor: Running task 2.0 in stage 0.0 (TID 2)
15/05/14 00:28:40 INFO TaskSetManager: Finished task 0.0 in stage 0.0 (TID 0) in 301234 ms on localhost (1/5)
15/05/14 00:28:40 INFO Executor: Finished task 1.0 in stage 0.0 (TID 1). 19532 bytes result sent to driver
15/05/14 00:28:40 INFO TaskSetManager: Starting task 3.0 in stage 0.0 (TID 3, localhost, ANY, 3352 bytes)
15/05/14 00:28:40 INFO Executor: Running task 3.0 in stage 0.0 (TID 3)
15/05/14 00:28:40 INFO TaskSetManager: Finished task 1.0 in stage 0.0 (TID 1) in 301315 ms on localhost (2/5)
15/05/14 00:29:55 INFO HttpMethodDirector: I/O exception (java.net.ConnectException) caught when processing request: Operation timed out
15/05/14 00:29:55 INFO HttpMethodDirector: Retrying request
15/05/14 00:29:56 INFO HttpMethodDirector: I/O exception (java.net.ConnectException) caught when processing request: Operation timed out
15/05/14 00:29:56 INFO HttpMethodDirector: Retrying request
15/05/14 00:31:11 INFO HttpMethodDirector: I/O exception (java.net.ConnectException) caught when processing request: Operation timed out
15/05/14 00:31:11 INFO HttpMethodDirector: Retrying request
15/05/14 00:31:11 INFO HttpMethodDirector: I/O exception (java.net.ConnectException) caught when processing request: Operation timed out
15/05/14 00:31:11 INFO HttpMethodDirector: Retrying request
15/05/14 00:32:26 INFO HttpMethodDirector: I/O exception (java.net.ConnectException) caught when processing request: Operation timed out
15/05/14 00:32:26 INFO HttpMethodDirector: Retrying request
15/05/14 00:32:26 INFO HttpMethodDirector: I/O exception (java.net.ConnectException) caught when processing request: Operation timed out
15/05/14 00:32:26 INFO HttpMethodDirector: Retrying request
15/05/14 00:33:42 ERROR NetworkClient: Node [Operation timed out] failed (<a href="http://172.31.14.100:9200" target="_blank" rel="nofollow" onmousedown="this.href='http://www.google.com/url?q\75http%3A%2F%2F172.31.14.100%3A9200\46sa\75D\46sntz\0751\46usg\75AFQjCNH13KAOZtpwLGdKMSw_hVyxNpaqUA';return true;" onclick="this.href='http://www.google.com/url?q\75http%3A%2F%2F172.31.14.100%3A9200\46sa\75D\46sntz\0751\46usg\75AFQjCNH13KAOZtpwLGdKMSw_hVyxNpaqUA';return true;">172.31.14.100:9200); selected next node [<a href="http://52.68.202.80:9200" target="_blank" rel="nofollow" onmousedown="this.href='http://www.google.com/url?q\75http%3A%2F%2F52.68.202.80%3A9200\46sa\75D\46sntz\0751\46usg\75AFQjCNF2Y0M2PrNWOnFXLHvK6-0hwBh8eA';return true;" onclick="this.href='http://www.google.com/url?q\75http%3A%2F%2F52.68.202.80%3A9200\46sa\75D\46sntz\0751\46usg\75AFQjCNF2Y0M2PrNWOnFXLHvK6-0hwBh8eA';return true;">52.68.202.80:9200]
15/05/14 00:33:42 ERROR NetworkClient: Node [Operation timed out] failed (<a href="http://172.31.14.100:9200" target="_blank" rel="nofollow" onmousedown="this.href='http://www.google.com/url?q\75http%3A%2F%2F172.31.14.100%3A9200\46sa\75D\46sntz\0751\46usg\75AFQjCNH13KAOZtpwLGdKMSw_hVyxNpaqUA';return true;" onclick="this.href='http://www.google.com/url?q\75http%3A%2F%2F172.31.14.100%3A9200\46sa\75D\46sntz\0751\46usg\75AFQjCNH13KAOZtpwLGdKMSw_hVyxNpaqUA';return true;">172.31.14.100:9200); selected next node [<a href="http://52.68.202.80:9200" target="_blank" rel="nofollow" onmousedown="this.href='http://www.google.com/url?q\75http%3A%2F%2F52.68.202.80%3A9200\46sa\75D\46sntz\0751\46usg\75AFQjCNF2Y0M2PrNWOnFXLHvK6-0hwBh8eA';return true;" onclick="this.href='http://www.google.com/url?q\75http%3A%2F%2F52.68.202.80%3A9200\46sa\75D\46sntz\0751\46usg\75AFQjCNF2Y0M2PrNWOnFXLHvK6-0hwBh8eA';return true;">52.68.202.80:9200]
15/05/14 00:33:42 INFO Executor: Finished task 2.0 in stage 0.0 (TID 2). 18177 bytes result sent to driver
15/05/14 00:33:42 INFO TaskSetManager: Starting task 4.0 in stage 0.0 (TID 4, localhost, ANY, 3352 bytes)
15/05/14 00:33:42 INFO Executor: Running task 4.0 in stage 0.0 (TID 4)
15/05/14 00:33:42 INFO TaskSetManager: Finished task 2.0 in stage 0.0 (TID 2) in 301864 ms on localhost (3/5)
15/05/14 00:33:42 INFO Executor: Finished task 3.0 in stage 0.0 (TID 3). 18176 bytes result sent to driver
15/05/14 00:33:42 INFO TaskSetManager: Finished task 3.0 in stage 0.0 (TID 3) in 301882 ms on localhost (4/5)
15/05/14 00:34:58 INFO HttpMethodDirector: I/O exception (java.net.ConnectException) caught when processing request: Operation timed out
15/05/14 00:34:58 INFO HttpMethodDirector: Retrying request
15/05/14 00:36:13 INFO HttpMethodDirector: I/O exception (java.net.ConnectException) caught when processing request: Operation timed out
15/05/14 00:36:13 INFO HttpMethodDirector: Retrying request
15/05/14 00:37:29 INFO HttpMethodDirector: I/O exception (java.net.ConnectException) caught when processing request: Operation timed out
15/05/14 00:37:29 INFO HttpMethodDirector: Retrying request
15/05/14 00:38:44 ERROR NetworkClient: Node [Operation timed out] failed (<a href="http://172.31.14.100:9200" target="_blank" rel="nofollow" onmousedown="this.href='http://www.google.com/url?q\75http%3A%2F%2F172.31.14.100%3A9200\46sa\75D\46sntz\0751\46usg\75AFQjCNH13KAOZtpwLGdKMSw_hVyxNpaqUA';return true;" onclick="this.href='http://www.google.com/url?q\75http%3A%2F%2F172.31.14.100%3A9200\46sa\75D\46sntz\0751\46usg\75AFQjCNH13KAOZtpwLGdKMSw_hVyxNpaqUA';return true;">172.31.14.100:9200); selected next node [<a href="http://52.68.202.80:9200" target="_blank" rel="nofollow" onmousedown="this.href='http://www.google.com/url?q\75http%3A%2F%2F52.68.202.80%3A9200\46sa\75D\46sntz\0751\46usg\75AFQjCNF2Y0M2PrNWOnFXLHvK6-0hwBh8eA';return true;" onclick="this.href='http://www.google.com/url?q\75http%3A%2F%2F52.68.202.80%3A9200\46sa\75D\46sntz\0751\46usg\75AFQjCNF2Y0M2PrNWOnFXLHvK6-0hwBh8eA';return true;">52.68.202.80:9200]
15/05/14 00:38:45 INFO Executor: Finished task 4.0 in stage 0.0 (TID 4). 19190 bytes result sent to driver
15/05/14 00:38:45 INFO TaskSetManager: Finished task 4.0 in stage 0.0 (TID 4) in 302573 ms on localhost (5/5)
15/05/14 00:38:45 INFO TaskSchedulerImpl: Removed TaskSet 0.0, whose tasks have all completed, from pool 
15/05/14 00:38:45 INFO DAGScheduler: Stage 0 (collect at App.scala:19) finished in 905.680 s
15/05/14 00:38:45 INFO DAGScheduler: Job 0 finished: collect at App.scala:19, took 905.884621 s

* Finally, after a long wait, foreach(println) starts printing:

(AU1KNAQHKoN_e7xsz2J7,Map(no -> 38344049, s_no -> 2988722, source -> wifiscan2, bssid -> d850e6d5a770, ssid -> jasonlan, rss -> -91.0, gps_lat -> 24.99175413, gps_lng -> 121.28153416, net_lat -> -10000.0, net_lng -> -10000.0, m_lat -> 24.9916650318, m_lng -> 121.281452128, imei -> 352842060663324, scan_time -> Mon Sep 01 08:00:06 CST 2014, trace -> 1))

 * and then it keeps trying to connect:

15/05/14 00:38:45 INFO SparkContext: Starting job: count at App.scala:20
15/05/14 00:38:45 INFO DAGScheduler: Got job 1 (count at App.scala:20) with 5 output partitions (allowLocal=false)
15/05/14 00:38:45 INFO DAGScheduler: Final stage: Stage 1(count at App.scala:20)
15/05/14 00:38:45 INFO DAGScheduler: Parents of final stage: List()
15/05/14 00:38:45 INFO DAGScheduler: Missing parents: List()
15/05/14 00:38:45 INFO DAGScheduler: Submitting Stage 1 (ScalaEsRDD[0] at RDD at AbstractEsRDD.scala:17), which has no missing parents
15/05/14 00:38:45 INFO MemoryStore: ensureFreeSpace(1464) called with curMem=2644, maxMem=2061647216
15/05/14 00:38:45 INFO MemoryStore: Block broadcast_1 stored as values in memory (estimated size 1464.0 B, free 1966.1 MB)
15/05/14 00:38:45 INFO MemoryStore: ensureFreeSpace(1121) called with curMem=4108, maxMem=2061647216
15/05/14 00:38:45 INFO MemoryStore: Block broadcast_1_piece0 stored as bytes in memory (estimated size 1121.0 B, free 1966.1 MB)
15/05/14 00:38:45 INFO BlockManagerInfo: Added broadcast_1_piece0 in memory on localhost:58667 (size: 1121.0 B, free: 1966.1 MB)
15/05/14 00:38:45 INFO BlockManagerMaster: Updated info of block broadcast_1_piece0
15/05/14 00:38:45 INFO SparkContext: Created broadcast 1 from broadcast at DAGScheduler.scala:839
15/05/14 00:38:45 INFO DAGScheduler: Submitting 5 missing tasks from Stage 1 (ScalaEsRDD[0] at RDD at AbstractEsRDD.scala:17)
15/05/14 00:38:45 INFO TaskSchedulerImpl: Adding task set 1.0 with 5 tasks
15/05/14 00:38:45 INFO TaskSetManager: Starting task 0.0 in stage 1.0 (TID 5, localhost, ANY, 3352 bytes)
15/05/14 00:38:45 INFO TaskSetManager: Starting task 1.0 in stage 1.0 (TID 6, localhost, ANY, 3352 bytes)
15/05/14 00:38:45 INFO Executor: Running task 0.0 in stage 1.0 (TID 5)
15/05/14 00:38:45 INFO Executor: Running task 1.0 in stage 1.0 (TID 6)
15/05/14 00:40:00 INFO HttpMethodDirector: I/O exception (java.net.ConnectException) caught when processing request: Operation timed out
15/05/14 00:40:00 INFO HttpMethodDirector: Retrying request


Strangely, when I run sbt package, deploy the jar to the ES server, and run spark-submit there, everything works with no waiting and no error messages.

I am sure I am missing something, but I can't figure out what. >_<

Thanks for your help!!

--
Please update your bookmarks! We have moved to https://discuss.elastic.co/
---
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [hidden email].
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/4b3a091b-00e0-4bce-95e9-00ec3bd1941e%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Re: Remote access about Spark and Elasticsearch

Costin Leau
Hi,

As you pointed out, the issue is caused by using the AWS private IP instead of the public one. The connector queries the
nodes directly and uses the IP they advertise; in AWS, this is typically the private IP. As such, after the initial
discovery (which is done using the public IP), the connector (from Spark) tries to connect using the private IP.

Note that this issue is not specific to Elasticsearch; it affects any service in AWS where 'direct' connections are made
within the cluster.
You can address this through the network settings, as described here [1].

[1] https://www.elastic.co/guide/en/elasticsearch/reference/current/modules-network.html
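
For the client side, here is a minimal sketch of connector settings that keep every request on the declared public address. It assumes the `es.nodes.wan.only` option, which is documented for elasticsearch-hadoop 2.2 and later and may not be available in the 2.1.0.Beta4 build shown in the logs. The object name `EsWanSettings` is hypothetical; the address is the one from the original post.

```scala
// Sketch only: settings for reaching an ES node on AWS from outside its network.
object EsWanSettings {
  // Public address of the ES node, as used in the original post.
  val publicNode = "52.68.202.80:9200"

  // ASSUMPTION: "es.nodes.wan.only" requires elasticsearch-hadoop 2.2 or later.
  // When true, the connector should talk only to the nodes listed in "es.nodes"
  // instead of the addresses the nodes advertise (the private IP on AWS).
  val settings: Map[String, String] = Map(
    "es.nodes"           -> publicNode,
    "es.nodes.discovery" -> "false",
    "es.nodes.wan.only"  -> "true"
  )

  // Print the settings as key=value pairs for a quick check.
  def main(args: Array[String]): Unit =
    settings.foreach { case (key, value) => println(s"$key=$value") }
}
```

With Spark on the classpath, these could be applied with `new SparkConf().setAll(EsWanSettings.settings)`. The trade-off is that all traffic goes through the listed node, so reads are no longer spread across the nodes that hold the shards.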

On 5/15/15 10:33 PM, Curt Kohler wrote:

>
> I'm having the same problem today as well, although I am trying to access a server running in an AWS VPC (it runs
> correctly with everything on my laptop).  In my case, it appears to be trying to connect to the private IP address
> before failing over to the public one (which eventually works).
>
> Curt
>
> On Wednesday, May 13, 2015 at 12:48:54 PM UTC-4, Jen-Ming Chung wrote:

--
Costin
