Setting up IntelliJ for Spark

Appuri is hiring data scientists, Java/Scala developers, and front-end devs proficient with modern frameworks like AngularJS. Please drop me a line at bilal at appuri dot com if you are interested in learning more.

As a “reborn” Java developer, I often find myself struggling with project setup. I am comfortable with Python and C# projects, but when it comes to the JVM, I get lost in a soup of Maven and sbt. Last night, we hosted the second Appuri Big Data Hackathon and set up our developer boxes for Spark development with IntelliJ. I hope these steps help you get started too:

Install Scala

You can install Scala from scala-lang.org, but I recommend Homebrew: run brew install scala, then brew install sbt to get the Simple (hah!) Build Tool.

Install IntelliJ and configure for Scala

IntelliJ is, hands down, the best Java and Scala editor out there. Download and install the Community Edition.

Next, open Preferences, go to the Plugins menu and install the Scala plugin.

Create a Scala project

Click File > New Project and choose Scala Module. You have to configure the Scala Home by pointing IntelliJ to the directory where the Scala libraries are installed.

Importing Spark dependencies

Now you will start seeing nice syntax highlighting for Scala’s built-in libraries. Unfortunately, if we add a Spark import, we see ugliness: IntelliJ cannot resolve the reference.

We have to do a little bit of mumbo jumbo (really, Java developers, why do you put up with this?) to get things to work. First, create a file named build.sbt under ~/.sbt/plugins and add these lines. Note the blank line between the two settings; sbt uses it to tell them apart.

resolvers += "Sonatype snapshots" at "http://oss.sonatype.org/content/repositories/snapshots/"

addSbtPlugin("com.github.mpeltonen" % "sbt-idea" % "1.3.0-SNAPSHOT")

Next, add a file named build.sbt in your project’s root directory with this content:

scalaVersion := "2.9.3"

libraryDependencies += "org.spark-project" %% "spark-core" % "0.7.3"

libraryDependencies += "org.spark-project" %% "spark-streaming" % "0.7.3"

resolvers ++= Seq(
  "Akka Repository" at "http://repo.akka.io/releases/",
  "Spray Repository" at "http://repo.spray.cc/")

Now, run sbt update and then run sbt. At the sbt prompt, type gen-idea to regenerate your IntelliJ project files. When you go back to IntelliJ, it will prompt you to reload the project. Once you do, you will see that the Spark references resolve correctly!
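To confirm that everything is wired up, a tiny program like the one below can help. This is only a sketch, not part of the original setup: the object name SparkSmokeTest and the file location (say, src/main/scala/SparkSmokeTest.scala) are made up for illustration, and it assumes the Spark 0.7.x API, where the classes live in the top-level spark package and SparkContext takes a master string and a job name.

```scala
// Assumption: Spark 0.7.x, where the package is `spark` (not `org.apache.spark`).
import spark.SparkContext

// Hypothetical smoke test: counts the even numbers in 1..1000 on a local Spark context.
object SparkSmokeTest {
  def main(args: Array[String]): Unit = {
    // "local" runs Spark in-process with a single worker thread; no cluster needed.
    val sc = new SparkContext("local", "SparkSmokeTest")

    // Distribute a small range and run a trivial filter + count job.
    val numbers = sc.parallelize(1 to 1000)
    val evens = numbers.filter(_ % 2 == 0).count()

    // Plain concatenation, since Scala 2.9 has no string interpolation.
    println("Even numbers: " + evens)

    sc.stop()
  }
}
```

If the import shows no red in IntelliJ and sbt run prints the count among Spark's log lines, the dependencies were picked up.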