job not started #22

Open
joa-rodrigues opened this issue Sep 16, 2019 · 6 comments

Comments


joa-rodrigues commented Sep 16, 2019

Hello
When I run a job I'm getting:
Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources

joa-rodrigues changed the title from "network not found" to "job not started" on Sep 19, 2019
@henryzxj

I've got the same error.

command: docker ps

CONTAINER ID        IMAGE                           COMMAND                  CREATED             STATUS              PORTS                                                      NAMES
b3358be58cce        spydernaz/spark-worker:latest   "/bin/bash /start-wo…"   13 minutes ago      Up 13 minutes       8081/tcp                                                   docker-spark-cluster_spark-worker_1
08e637ed0527        spydernaz/spark-worker:latest   "/bin/bash /start-wo…"   13 minutes ago      Up 13 minutes       8081/tcp                                                   docker-spark-cluster_spark-worker_2
2f3606bea6da        spydernaz/spark-worker:latest   "/bin/bash /start-wo…"   13 minutes ago      Up 13 minutes       8081/tcp                                                   docker-spark-cluster_spark-worker_3
0bbed01935dc        spydernaz/spark-master:latest   "/bin/bash /start-ma…"   13 minutes ago      Up 13 minutes       6066/tcp, 0.0.0.0:7077->7077/tcp, 0.0.0.0:9090->8080/tcp   docker-spark-cluster_spark-master_1
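
(As the warning suggests, it is worth first confirming that the workers actually registered with the master. Given the 0.0.0.0:9090->8080 mapping shown above, the master's web UI should be reachable on the host at port 9090; a quick check might look like the sketch below, which assumes that port mapping and that the standalone master page reports "Alive Workers".)

# Sketch only: assumes the 0.0.0.0:9090->8080 mapping from docker ps above.
curl -s http://localhost:9090 | grep -i "alive workers"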

input: 1.py

from pyspark import SparkContext
logFile = "README.md"
spark = SparkContext('spark://10.68.50.149:7077','SimpleApp')
logData = spark.textFile(logFile).cache()
numAs = logData.filter(lambda s: 'a' in s).count()
numBs = logData.filter(lambda s: 'b' in s).count()

print("Lines with a: %i, lines with b: %i" % (numAs, numBs))
spark.stop()

output:

19/11/21 08:00:08 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
[Stage 0:>                                                          (0 + 0) / 2]19/11/21 08:00:29 WARN TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources

@hooliowobbits

I'm curious whether either of you got this going; I'm particularly interested in the submit command for the pyspark script.
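
(For reference, a minimal spark-submit invocation for the 1.py script above might look like the sketch below. The master URL is taken from the script earlier in the thread; the memory and core settings are assumptions chosen to keep the request within a small worker, not values confirmed anywhere in this issue, and the spark-submit path may differ on your setup.)

# Sketch only: master URL from 1.py above; resource flags are guesses.
spark-submit \
  --master spark://10.68.50.149:7077 \
  --executor-memory 512m \
  --total-executor-cores 1 \
  1.py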


javed2005 commented Oct 8, 2022

I am facing an issue: the job fails when I use this script.

/opt/spark/bin/spark-submit --master spark://spark-master:7077 \
  --jars /opt/spark-apps/postgresql-42.2.22.jar \
  --driver-memory 1G \
  --executor-memory 1G \
  /opt/spark-apps/main.py

(screenshot attached)

@javed2005

(screenshot attached)

@javed2005

Can anyone help?

@reginold

@javed2005
The error message says the CSV file is missing: the repository owner used .gitignore to exclude the CSV files, so they are not checked in.
You should create a data folder, download the first CSV file from the page below, and unzip it.
http://web.mta.info/developers/MTA-Bus-Time-historical-data.html
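
(Roughly along these lines; DATA_URL is a placeholder for whichever archive you pick from the MTA page above, not a real path, and the data folder location depends on where the repo expects it.)

# Sketch only: substitute the first archive linked on the MTA page for DATA_URL.
DATA_URL="<first-archive-from-the-MTA-page>"
mkdir -p data
cd data
curl -O "$DATA_URL"
unzip "$(basename "$DATA_URL")"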
