For a YARN cluster, how should category.spark.masterUrl be set, and does anything else need to be configured? #6

Open
monkeyshichi opened this issue May 23, 2022 · 1 comment

Comments

@monkeyshichi

For a YARN cluster, how should category.spark.masterUrl be set, and does anything else need to be configured?

@jingpeicomp
Owner

jingpeicomp commented May 27, 2022

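For YARN, the configuration falls into four main groups:

  1. masterUrl is always yarn; dependenceJar and the app name are set the same way as in standalone mode.
  2. The NameNode information for the HDFS cluster, because job data and logs are stored in HDFS.
  3. The ResourceManager information for the YARN cluster.
  4. Job-related resources.

Below is a configuration I have used. Set these properties on a SparkConf, then build the Spark context from that SparkConf. For reference: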
spark.hadoop.dfs.nameservices=hacluster
spark.hadoop.dfs.ha.namenodes.hacluster=33,34
spark.hadoop.dfs.namenode.rpc-address.hacluster.34=node-xxx2:9820
spark.hadoop.dfs.namenode.rpc-address.hacluster.33=node-xxx1:9820
spark.hadoop.dfs.client.failover.proxy.provider.hacluster=org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider
spark.hadoop.mapreduce.framework.name=yarn
spark.hadoop.yarn.resourcemanager.ha.enabled=true
spark.hadoop.yarn.resourcemanager.ha.rm-ids=46,47
spark.hadoop.yarn.resourcemanager.hostname.47=node-xx1
spark.hadoop.yarn.resourcemanager.hostname.46=node-xx2
spark.hadoop.yarn.resourcemanager.address.47=node-xx1:8032
spark.hadoop.yarn.resourcemanager.address.46=node-xx2:8032
spark.hadoop.yarn.resourcemanager.scheduler.address.46=node-xx2:8030
spark.hadoop.yarn.resourcemanager.scheduler.address.47=node-xx1:8030
spark.yarn.stagingDir=hdfs://hacluster/user/spark/stagingDir
spark.yarn.preserve.staging.files=true
spark.yarn.archive=hdfs://hacluster/user/spark/jars/spark-archive-2x.zip
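
As a rough illustration of "set these properties on a SparkConf, then build the context", here is a minimal Python sketch. It is not the project's own code: `parse_properties`, `build_spark_context`, `SAMPLE`, and the app name `yarn-demo` are hypothetical names introduced for the example, and only three of the properties above are repeated to keep it short. The PySpark calls used (`SparkConf.setMaster`, `setAppName`, `setAll`, and `SparkContext(conf=...)`) have counterparts in the Java and Scala APIs. Depending on the environment, PySpark may additionally need `HADOOP_CONF_DIR` or `YARN_CONF_DIR` set when the master is yarn.

```python
from typing import Dict


def parse_properties(text: str) -> Dict[str, str]:
    """Parse simple key=value lines, skipping blank lines and # comments."""
    props: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        # Split on the first '=' only, so values such as hdfs:// URLs stay intact.
        key, sep, value = line.partition("=")
        if not sep:
            raise ValueError(f"not a key=value line: {raw!r}")
        props[key.strip()] = value.strip()
    return props


def build_spark_context(props: Dict[str, str], app_name: str):
    """Create a SparkContext with master 'yarn' from the parsed properties."""
    # Imported lazily so the parser above can be used without PySpark installed.
    from pyspark import SparkConf, SparkContext

    conf = SparkConf().setMaster("yarn").setAppName(app_name)
    conf.setAll(list(props.items()))  # setAll takes (key, value) pairs
    return SparkContext(conf=conf)


# A shortened excerpt of the properties listed above (assumption: the full
# list would be loaded from a file or from application settings instead).
SAMPLE = """
spark.hadoop.dfs.nameservices=hacluster
spark.hadoop.yarn.resourcemanager.ha.enabled=true
spark.yarn.stagingDir=hdfs://hacluster/user/spark/stagingDir
"""

# Usage (requires PySpark and a reachable YARN cluster):
#   sc = build_spark_context(parse_properties(SAMPLE), "yarn-demo")
#   ...
#   sc.stop()
```

Keeping the parsing step separate from context creation makes it possible to inspect or validate the properties before anything contacts the cluster.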
