Table of Contents
  1. Install Java
  2. Install Scala
  3. Set up Spark 2.x on IntelliJ IDEA

It was a sunny day when I started setting up my Spark environment. After reading many online tutorials on configuring Spark 1.x in IntelliJ IDEA, I expected it to be easy. Still, I ran into some trouble, because Spark 2.x differs from Spark 1.x in a few ways. That is why I wrote this post: to explain how to set up Spark 2.x in IntelliJ IDEA. Five components are required:

  • Ubuntu OS
  • Java (JDK)
  • Spark 2.x (in my case, spark-2.0.2-bin-hadoop2.7)
  • Scala
  • IntelliJ IDEA

OK, let's go!

Install Java

The first step is to make sure Ubuntu has a Java JDK. After downloading the JDK, decompress the “.tar.gz” archive into the directory where you want to install it, for example /usr/local/lib/jvm/. Then add JAVA_HOME to the system environment. Open a terminal and type:

sudo gedit /etc/profile

Add the following lines to the end of the profile:

#JAVA path
export JAVA_HOME=/usr/local/lib/jvm/java
export JRE_HOME=${JAVA_HOME}/jre
export CLASSPATH=.:${JAVA_HOME}/lib:${JRE_HOME}/lib
export PATH=${JAVA_HOME}/bin:$PATH

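Save the file and return to the terminal. Changes to /etc/profile only take effect in a new login session, so either log out and back in or run “source /etc/profile” first. Then check that Java is installed: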

java -version
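
If “java -version” does not work, it can help to confirm that the directory used for JAVA_HOME really contains a JDK and not just a JRE. Here is a minimal shell sketch of such a check. The function name looks_like_jdk is hypothetical, and the check is only a heuristic: it tests that bin/java and bin/javac exist and are executable.

```shell
# Hypothetical helper (not part of any JDK tooling): succeed when the given
# directory looks like an unpacked JDK. A JDK ships both bin/java and
# bin/javac, while a bare JRE ships only bin/java.
looks_like_jdk() {
  [ -x "$1/bin/java" ] && [ -x "$1/bin/javac" ]
}

# Example usage, with the path from the profile entries above:
#   looks_like_jdk /usr/local/lib/jvm/java && echo "JAVA_HOME looks fine"
```

The message is printed only when both binaries are present, so a silent result suggests the archive was unpacked somewhere else or into a nested directory.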

Install Scala

Although Spark has Java and Python APIs, Spark itself is written in Scala, so we also need to install Scala. It can be downloaded from http://www.scala-lang.org/download. It is very important to choose the right version, because Spark does not support every Scala version. In my case, I started with Scala 2.12.x and my Spark code failed with an error, so I switched to Scala 2.11.x, which is the series the pre-built Spark 2.0.2 packages are compiled against. As with the JDK, decompress the Scala 2.11.x archive into the directory where you want to install it.
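
What has to match is the Scala series (the first two numbers of the version), not the exact patch release. As a small sketch of that idea, the hypothetical helper below reduces a full version string to its series so it can be compared against the 2.11 that Spark 2.0.2 expects. The function name is an assumption for illustration only.

```shell
# Hypothetical helper: reduce a full Scala version ("2.11.8") to its
# series ("2.11"), which is the part that must match Spark's build.
scala_binary_version() {
  echo "$1" | cut -d. -f1,2
}

# Example usage:
#   [ "$(scala_binary_version 2.11.8)" = "2.11" ] && echo "compatible series"
```

A 2.12.x version string yields “2.12” here, which is the mismatch that caused my original error.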

Set up Spark 2.x on IntelliJ IDEA

Next, download Spark from http://spark.apache.org/, decompress it as well, and remember its path. Open IntelliJ IDEA and select “File-Settings-Plugins”. Search for “scala” under “Browse Repositories” to find the Scala plugin, and install it.

Now we can select “File-New-Project” to create a Scala project. After that, there are three steps to complete in “File-Project Structure”:

  • Set the JDK path on the “Project” tab.
  • Set the Scala path on the “Global Libraries” tab.
  • Set the Spark path on the “Libraries” tab.
    Note that, unlike Spark 1.x, Spark 2.x does not ship “/lib/spark-assembly-XXXXX.jar”. Instead, there is a “jars” directory. So in this final step, add the path of the “jars” directory on the “Libraries” tab.
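
The “jars” directory also tells you which Scala series your Spark download was built for, because the series is part of the spark-core jar name (for example spark-core_2.11-2.0.2.jar). The sketch below reads it from the file name. The function name spark_scala_series is hypothetical, and it assumes the Spark 2.x layout described above.

```shell
# Hypothetical helper: print the Scala series a Spark 2.x distribution was
# built for, read from the spark-core jar name under <spark_home>/jars.
spark_scala_series() {
  for jar in "$1"/jars/spark-core_*.jar; do
    [ -e "$jar" ] || return 1   # pattern did not match: no spark-core jar
    name=${jar##*/}             # strip the directory part
    name=${name#spark-core_}    # drop the "spark-core_" prefix
    echo "${name%%-*}"          # keep the text before the first "-"
    return 0
  done
  return 1
}

# Example usage (replace the argument with your own Spark path):
#   spark_scala_series /path/to/spark-2.0.2-bin-hadoop2.7
```

If the printed series differs from the Scala version set on the “Global Libraries” tab, that mismatch is a likely cause of errors when the test program runs.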

OK, now run a simple Spark program to test the setup:

import org.apache.spark.{SparkConf, SparkContext}

/**
  * Created by BIGBAI on 18/11/2016.
  */
object TestScala {
  def main(args: Array[String]): Unit = {
    // Run locally with two worker threads; no cluster is needed.
    val conf = new SparkConf()
      .setAppName("first spark")
      .setMaster("local[2]")
    val sc = new SparkContext(conf)
    // Print the context while it is still active, then release it.
    println(sc)
    sc.stop()
  }
}

If everything is OK, Spark 2.x now runs successfully in IntelliJ IDEA. If you still get errors, please feel free to discuss them with me. Alternatively, you can try using SBT to set up Spark 2.x.
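
For the SBT route, a minimal sketch could look like the following. It is a shell function that writes a small build.sbt into a project directory. The function name write_build_sbt, the project name and the exact Scala release 2.11.8 are assumptions for illustration; any 2.11.x release should fit the Spark 2.0.2 setup described in this post.

```shell
# Hypothetical helper: write a minimal build.sbt into the given project
# directory so that SBT downloads Spark instead of using local jars.
write_build_sbt() {
  cat > "$1/build.sbt" <<'EOF'
name := "first-spark"

version := "0.1"

// Assumption: 2.11.8 is only an example of a 2.11.x release.
scalaVersion := "2.11.8"

// %% appends the Scala series (_2.11) to the artifact name.
libraryDependencies += "org.apache.spark" %% "spark-core" % "2.0.2"
EOF
}

# Example usage: write_build_sbt ~/my-spark-project
```

With this file in place, importing the directory into IntelliJ IDEA as an SBT project should make the manual “Libraries” step unnecessary, since SBT resolves the Spark jars itself.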