I get the same problem as #247, and I changed the source location in lenet_memory_train_test to the HDFS path as @arundasan91 suggested.
However, I still hit the same problem:
```
17/05/03 18:50:15 INFO yarn.Client: Application report for application_1493801577689_0009 (state: RUNNING)
17/05/03 18:50:16 INFO yarn.Client: Application report for application_1493801577689_0009 (state: FINISHED)
17/05/03 18:50:16 INFO yarn.Client:
	 client token: N/A
	 diagnostics: N/A
	 ApplicationMaster host: 192.168.191.3
	 ApplicationMaster RPC port: 0
	 queue: default
	 start time: 1493808511908
	 final status: FAILED
	 tracking URL: http://sky:8088/proxy/application_1493801577689_0009/
	 user: hadoop
Exception in thread "main" org.apache.spark.SparkException: Application application_1493801577689_0009 finished with failed status
	at org.apache.spark.deploy.yarn.Client.run(Client.scala:1029)
	at org.apache.spark.deploy.yarn.Client$.main(Client.scala:1076)
	at org.apache.spark.deploy.yarn.Client.main(Client.scala)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:606)
	at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:731)
	at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
	at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
	at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
	at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
17/05/03 18:50:16 INFO util.ShutdownHookManager: Shutdown hook called
17/05/03 18:50:16 INFO util.ShutdownHookManager: Deleting directory /home/hadoop/deep_learning/spark-1.6.0-bin-hadoop2.6/spark-2136e9ab-1b64-4d32-85d0-a6eb6fce0ea1
```
I have two machines: 192.168.191.2 is the master (32 GB RAM, 8 cores) and 192.168.191.3 is the slave (32 GB RAM, 8 cores).
As step 8 says, I set `export SPARK_WORKER_INSTANCES=2` and `export DEVICES=1`.
The error on the YARN log page is: "Diagnostics: User class threw exception: java.lang.IllegalStateException: actual number of executors is not as expected". A sketch of how I understand the submit is supposed to be wired up follows.
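For completeness, this is only a sketch of the step-8 submit as I understand it from the guide: the jar path, prototxt names, and the CaffeOnSpark flags are from memory and are placeholders, not my exact command. My assumption is that `--num-executors` must match `SPARK_WORKER_INSTANCES`, since that seems to be the count the "actual number of executors is not as expected" check compares against.

```sh
# Sketch only: paths and CaffeOnSpark flags are placeholders / from memory.
# Assumption: Spark must actually allocate SPARK_WORKER_INSTANCES executors,
# so --num-executors is derived from the same variable.
export SPARK_WORKER_INSTANCES=2   # one executor per node (master + slave)
export DEVICES=1                  # devices per executor

spark-submit --master yarn --deploy-mode cluster \
    --num-executors ${SPARK_WORKER_INSTANCES} \
    --executor-cores 4 \
    --executor-memory 8g \
    --files data/lenet_memory_solver.prototxt,data/lenet_memory_train_test.prototxt \
    --class com.yahoo.ml.caffe.CaffeOnSpark \
    ${CAFFE_ON_SPARK}/caffe-grid/target/caffe-grid-*-jar-with-dependencies.jar \
    -train \
    -conf lenet_memory_solver.prototxt \
    -devices ${DEVICES} \
    -model hdfs:///mnist.model
```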
When I change this to `export SPARK_WORKER_INSTANCES=2` and `export DEVICES=2`,
the error on the log page is:
"Diagnostics: User class threw exception: org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 2.0 failed 4 times, most recent failure: Lost task 0.3 in stage 2.0 (TID 7, trc): ExecutorLostFailure (executor 4 exited caused by one of the running tasks) Reason: Container killed by YARN for exceeding memory limits. 46.3 GB of 4.2 GB virtual memory used. Consider boosting spark.yarn.executor.memoryOverhead.
Driver stacktrace:"
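Following the message's own suggestion, I assume the overhead can be raised at submit time, roughly like this (the 8g / 4096 values are guesses sized for my 32 GB, 8-core nodes, not numbers from the guide):

```sh
# Same submit command as in the sketch above, with the per-executor overhead
# that YARN accounts for raised explicitly, as the error message suggests.
# 4096 MB is a guess for a 32 GB node, not a value from the guide.
spark-submit --master yarn --deploy-mode cluster \
    --num-executors ${SPARK_WORKER_INSTANCES} \
    --executor-memory 8g \
    --conf spark.yarn.executor.memoryOverhead=4096 \
    --class com.yahoo.ml.caffe.CaffeOnSpark \
    ${CAFFE_ON_SPARK}/caffe-grid/target/caffe-grid-*-jar-with-dependencies.jar \
    -train \
    -conf lenet_memory_solver.prototxt \
    -devices ${DEVICES} \
    -model hdfs:///mnist.model
```

Since it is the virtual-memory check that trips, I am also wondering whether `yarn.nodemanager.vmem-check-enabled` in yarn-site.xml is relevant here, but that is just a guess.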