Spark Context
In Spark 1.x the entry points, SparkContext and SQLContext, are created explicitly; from version 2.0 onward they are unified under SparkSession.
Creating a Spark Context. First create a SparkConf object with the syntax below:
import org.apache.spark.{SparkConf, SparkContext}
val conf = new SparkConf().setMaster("local[*]").setAppName("SparkTestApp")
Initialize the SparkContext by passing the conf object:
val sc = new SparkContext(conf)
val sqlContext = new org.apache.spark.sql.SQLContext(sc) // deprecated since 2.0
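As a quick sanity check, the new context can be exercised with a trivial RDD (a minimal sketch; the values are illustrative):
val rdd = sc.parallelize(1 to 100) // distribute a local range as an RDD
println(rdd.count())               // 100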
In version 2.0 and later, use SparkSession instead:
import org.apache.spark.sql.SparkSession
val spark = SparkSession.builder.appName("Spark_Test").master("local[*]").enableHiveSupport().getOrCreate()
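Because the session unifies the older entry points, both the SparkContext and SQL functionality are reachable from it; a minimal sketch, assuming the spark session above:
val sc = spark.sparkContext          // the underlying SparkContext
spark.sql("SELECT 1 AS id").show()   // SQL without a separate SQLContext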
Getting All Configuration Values
spark.sparkContext.getConf.getAll.foreach(println) or spark.conf.getAll.foreach(println)
(spark.sql.catalogImplementation,hive)
(spark.driver.host,192.168.0.13)
(spark.app.name,SparkDataFrames)
(spark.master,local[*])
(spark.executor.id,driver)
(spark.driver.port,58959)
(spark.app.id,local-1582091700372)
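Individual settings can also be read one at a time; a minimal sketch, assuming the session above (the second form supplies a default for keys that may be unset):
println(spark.conf.get("spark.app.name"))         // SparkDataFrames
println(spark.conf.get("spark.logConf", "false")) // default returned if the key is unset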
Changing Spark Configurations
1. While defining the SparkSession (a combined sketch follows this list)
val spark = SparkSession.builder.config("spark.driver.cores", 16).appName("Spark_Test").master("local[*]").enableHiveSupport().getOrCreate()
2. Adding a configuration after initializing the SparkSession
spark.conf.set("spark.logConf", false)
3. Supplying it on the command line with spark-submit
./bin/spark-submit --name "My app" --master local[4] --conf spark.eventLog.enabled=false \
--conf "spark.executor.extraJavaOptions=-XX:+PrintGCDetails -XX:+PrintGCTimeStamps" myApp.jar
Different Spark Configurations
spark.app.name: the name of the application.
spark.driver.cores: number of cores for the driver process (cluster mode only).
spark.driver.memory: amount of memory for the driver process (e.g. "1g").
spark.executor.memory: amount of memory per executor process.
spark.extraListeners: comma-separated list of classes implementing SparkListener, registered at startup.
spark.driver.maxResultSize: limit on the total serialized size of results from all partitions for a single action such as collect().
spark.local.dir: directory used for scratch space (shuffle output, RDDs spilled to disk).
spark.logConf: logs the effective SparkConf as INFO when the SparkContext starts.
spark.master: the cluster manager to connect to (e.g. local[*], yarn, spark://host:7077).
spark.submit.deployMode: deploy mode of the driver, either "client" or "cluster".
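Several of these can be set together when building the session; a minimal sketch with illustrative values:
val spark = SparkSession.builder
  .appName("Spark_Test")                        // spark.app.name
  .master("local[*]")                           // spark.master
  .config("spark.executor.memory", "2g")
  .config("spark.driver.maxResultSize", "1g")
  .config("spark.logConf", "true")
  .getOrCreate()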