In reference.conf I set the following (along with some other minor options):
# The execution modes in Sparta are: local, mesos or marathon
sparta.config.executionMode = yarn
# Yarn cluster name
sparta.yarn.master = yarn
# Cluster or Client. If the user need more than one policy running is necessary use "cluster". Is the same as the variable spark.submit.deployMode
sparta.yarn.deployMode = cluster
I have a working workflow that runs in local mode, but after switching to yarn mode I get the logs below. It seems that Sparta cannot connect to the ResourceManager. Could anybody help with this issue?
02 Jul 2018 15:29:31.053 INFO c.s.s.s.c.a.ClusterLauncherActor Sparta submit options initialized correctly
02 Jul 2018 15:29:31.062 INFO c.s.s.s.c.a.ClusterLauncherActor Updating context d23359d0-de5b-4589-bb5a-236b1bde8eed with name test1:
Status: Failed ---> NotStarted
Status Information: The checker detects that the policy not start/stop correctly ---> Sparta submit options initialized correctly
Submission Id: undefined ---> undefined
Submission Status: LOST ---> LOST
Marathon Id: undefined ---> undefined
Last Error: undefined ---> undefined
Last Execution Mode: yarn-cluster ---> yarn-cluster
Resource Manager URL: undefined ---> undefined
02 Jul 2018 15:29:31.103 INFO c.s.s.s.c.a.ClusterLauncherActor Launching Sparta Job with options ...
Policy name: test1
Main Class: com.stratio.sparta.driver.SparkDriver
Driver file: http://0.0.0.0:9090/sparta/driver/driver-1.6.0-SNAPSHOT.jar
Master: yarn
Spark submit arguments: --deploy-mode -> cluster,--num-executors -> 1,--properties-file -> /etc/spark2/conf/spark-defaults.conf,--proxy-user -> hdfs
Spark configurations: spark.sql.parquet.binaryAsString -> true,spark.app.name -> test1-2018/07/02-03:29:30,spark.driver.memory -> 1G,spark.driver.cores -> 1,spark.mesos.driverEnv.SPARK_USER -> ,spark.executor.memory -> 1G,spark.executor.cores -> 1
Driver arguments: Map(plugins -> ICw=, clusterConfig -> eyJ5YXJuIjp7ImRlcGxveU1vZGUiOiJjbHVzdGVyIiwiZHJpdmVyQ29yZXMiOjEsImRyaXZlck1lbW9yeSI6IjFHIiwiZXhlY3V0b3JDb3JlcyI6MSwiZXhlY3V0b3JNZW1vcnkiOiIxRyIsImtpbGxVcmwiOiIvdjEvc3VibWlzc2lvbnMva2lsbCIsIm1hc3RlciI6Inlhcm4iLCJudW1FeGVjdXRvcnMiOjEsInByb3BlcnRpZXNGaWxlIjoiL2V0Yy9zcGFyazIvY29uZi9zcGFyay1kZWZhdWx0cy5jb25mIiwicHJveHktdXNlciI6ImhkZnMiLCJzcGFyayI6eyJzcWwiOnsicGFycXVldCI6eyJiaW5hcnlBc1N0cmluZyI6dHJ1ZX19fSwic3BhcmtIb21lIjoiL29wdC9jbG91ZGVyYS9wYXJjZWxzL1NQQVJLMi0yLjEuMC5jbG91ZGVyYTItMS5jZGg1LjcuMC5wMC4xNzE2NTgvbGliL3NwYXJrMiJ9fQ==, detailConfig -> eyJjb25maWciOnsiYWRkVGltZVRvQ2hlY2twb2ludFBhdGgiOmZhbHNlLCJhdXRvRGVsZXRlQ2hlY2twb2ludCI6dHJ1ZSwiYXdhaXRQb2xpY3lDaGFuZ2VTdGF0dXMiOiIxODBzIiwiYmFja3Vwc0xvY2F0aW9uIjoiL29wdC9zZHMvc3BhcnRhL2JhY2t1cHMiLCJjaGVja3BvaW50UGF0aCI6Ii90bXAvc3BhcnRhL2NoZWNrcG9pbnQiLCJkcml2ZXJQYWNrYWdlTG9jYXRpb24iOiIvb3B0L3Nkcy9zcGFydGEvZHJpdmVyIiwiZHJpdmVyVVJJIjoiaHR0cDovLzAuMC4wLjA6OTA5MC9zcGFydGEvZHJpdmVyL2RyaXZlci0xLjYuMC1TTkFQU0hPVC5qYXIiLCJleGVjdXRpb25Nb2RlIjoieWFybiIsImZyb250ZW5kIjp7InRpbWVvdXQiOjUwMDB9LCJwbHVnaW5QYWNrYWdlTG9jYXRpb24iOiIvb3B0L3Nkcy9zcGFydGEvcGx1Z2lucyIsInJlbWVtYmVyUGFydGl0aW9uZXIiOnRydWV9fQ==, storageConfig -> IA==, policyId -> d23359d0-de5b-4589-bb5a-236b1bde8eed, zookeeperConfig -> eyJ6b29rZWVwZXIiOnsiY29ubmVjdGlvblN0cmluZyI6IjEwLjAuMTEuMjI6MjE4MSwxMC4wLjExLjMwOjIxODEsMTAuMC4xMS4zMToyMTgxIiwiY29ubmVjdGlvblRpbWVvdXQiOjE1MDAwLCJyZXRyeUF0dGVtcHRzIjo1LCJyZXRyeUludGVydmFsIjoxMDAwMCwic2Vzc2lvblRpbWVvdXQiOjYwMDAwfX0=)
02 Jul 2018 15:29:31.128 INFO c.s.s.s.c.a.ClusterLauncherActor Sparta cluster job launched correctly
02 Jul 2018 15:29:31.131 INFO c.s.s.s.c.a.ClusterLauncherActor Updating context d23359d0-de5b-4589-bb5a-236b1bde8eed with name test1:
Status: NotStarted ---> Launched
Status Information: Sparta submit options initialized correctly ---> Sparta cluster job launched correctly
Submission Id: undefined ---> undefined
Submission Status: LOST ---> UNKNOWN
Marathon Id: undefined ---> undefined
Last Error: undefined ---> undefined
Last Execution Mode: yarn-cluster ---> yarn-cluster
Resource Manager URL: undefined ---> undefined
02 Jul 2018 15:29:31.205 INFO c.s.s.s.c.a.ClusterLauncherActor Cluster context listener added to test1 with id: d23359d0-de5b-4589-bb5a-236b1bde8eed
02 Jul 2018 15:29:31.218 INFO c.s.s.s.c.a.ClusterLauncherActor Starting scheduler task in awaitPolicyChangeStatus with time: 180s
02 Jul 2018 15:29:33.764 INFO c.s.s.s.c.a.ClusterLauncherActor Submission state changed to ... CONNECTED
02 Jul 2018 15:29:33.767 INFO c.s.s.s.c.a.ClusterLauncherActor Updating context d23359d0-de5b-4589-bb5a-236b1bde8eed with name test1:
Status: Launched ---> Launched
Status Information: Sparta cluster job launched correctly ---> Sparta cluster job launched correctly
Submission Id: undefined ---> undefined
Submission Status: UNKNOWN ---> CONNECTED
Marathon Id: undefined ---> undefined
Last Error: undefined ---> undefined
Last Execution Mode: yarn-cluster ---> yarn-cluster
Resource Manager URL: undefined ---> undefined
02 Jul 2018 15:29:34.299 INFO c.s.s.s.c.a.ClusterLauncherActor Submission state changed to ... LOST
02 Jul 2018 15:29:34.301 INFO c.s.s.s.c.a.ClusterLauncherActor Updating context d23359d0-de5b-4589-bb5a-236b1bde8eed with name test1:
Status: Launched ---> Launched
Status Information: Sparta cluster job launched correctly ---> Sparta cluster job launched correctly
Submission Id: undefined ---> undefined
Submission Status: CONNECTED ---> LOST
Marathon Id: undefined ---> undefined
Last Error: undefined ---> undefined
Last Execution Mode: yarn-cluster ---> yarn-cluster
Resource Manager URL: undefined ---> undefined
02 Jul 2018 15:29:51.657 INFO c.s.s.s.core.actor.StatusActor Updating context d23359d0-de5b-4589-bb5a-236b1bde8eed with name test1:
Status: Launched ---> Stopping
Status Information: Sparta cluster job launched correctly ---> Sparta cluster job launched correctly
Submission Id: undefined ---> undefined
Submission Status: LOST ---> LOST
Marathon Id: undefined ---> undefined
Last Error: undefined ---> undefined
Last Execution Mode: yarn-cluster ---> yarn-cluster
Resource Manager URL: undefined ---> undefined
02 Jul 2018 15:29:51.678 INFO c.s.s.s.c.a.ClusterLauncherActor Stopping message received from Zookeeper
02 Jul 2018 15:29:51.678 INFO c.s.s.s.c.a.ClusterLauncherActor The Sparta System don't have submission id associated to policy test1
02 Jul 2018 15:29:51.679 INFO c.s.s.s.c.a.ClusterLauncherActor Node cache to cluster context listener closed correctly
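The `clusterConfig`, `detailConfig` and `zookeeperConfig` values in the "Driver arguments" line start with `eyJ`, which is the Base64 encoding of `{"`, so they look like Base64-encoded JSON. Decoding them is a quick way to check what settings are actually handed to the driver. The sketch below is only an illustration: the helper name `decode_driver_arg` is hypothetical and not part of Sparta, and it assumes the values are UTF-8 text.

```python
import base64
import json


def decode_driver_arg(value: str):
    """Decode one Base64 driver argument.

    Returns the parsed JSON when the decoded text is valid JSON,
    otherwise the decoded text itself.
    """
    text = base64.b64decode(value).decode("utf-8")
    try:
        return json.loads(text)
    except ValueError:
        # Not JSON (for example an empty or placeholder value).
        return text


if __name__ == "__main__":
    # Short values copied from the "Driver arguments" log line.
    print(repr(decode_driver_arg("ICw=")))  # plugins
    print(repr(decode_driver_arg("IA==")))  # storageConfig
```

The longer values from the log can be pasted into `decode_driver_arg` in the same way to inspect the YARN, detail and Zookeeper settings.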
@compae please help!
In the ResourceManager log I found:
2018-07-02 19:00:48,944 INFO org.apache.hadoop.yarn.server.resourcemanager.ClientRMService: Allocated new applicationId: 2014
This means the driver could connect to the ResourceManager, right? But the driver process became DEAD within 2-3 seconds, and then "Submission Status" changed to LOST.
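One way to see why the application died so quickly is to ask the ResourceManager for its state and diagnostics through the YARN REST endpoint `/ws/v1/cluster/apps/{appid}`. The sketch below makes several assumptions: the ResourceManager web address (`rm-host:8088` is a placeholder), that the REST API is reachable from where the script runs, and the full application ID, which normally has the form `application_<clusterTimestamp>_<sequence>`. The log line above shows only `2014`, presumably the sequence part, so the timestamp part must be filled in from the cluster. The function names are hypothetical.

```python
import json
from urllib.request import urlopen


def application_url(rm_address: str, app_id: str) -> str:
    """Build the ResourceManager REST URL for a single application."""
    return f"http://{rm_address}/ws/v1/cluster/apps/{app_id}"


def summarize(response: dict) -> str:
    """Pick out the fields that usually explain why an application ended."""
    app = response.get("app", {})
    return (
        f"state={app.get('state')} "
        f"finalStatus={app.get('finalStatus')} "
        f"diagnostics={app.get('diagnostics')}"
    )


def fetch_summary(rm_address: str, app_id: str) -> str:
    """Query the ResourceManager and return a one-line summary."""
    with urlopen(application_url(rm_address, app_id), timeout=10) as resp:
        return summarize(json.load(resp))
```

For example, `fetch_summary("rm-host:8088", "application_<clusterTimestamp>_2014")` with both placeholders replaced by real values would return the state, final status and diagnostics text reported by YARN for that application.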