The goal of this post is to introduce Wisp, a tool by Quantifind for visualizing data from Scala in a web browser. In a previous post, we loaded a weather data set into an RDD. To include Wisp in the project, add the following line to the sbt build definition:
libraryDependencies += "com.quantifind" %% "wisp" % "0.0.1"
We’re going to plot the monthly averages of
- tMinimum
- tAverage
- tMaximum
for all years.
The averages will be stored in mutable lists:
val tempAverage = new ListBuffer[Double]
val tempMinimum = new ListBuffer[Double]
val tempMaximum = new ListBuffer[Double]
We need to add imports for the Scala mutable list and the Wisp Highcharts API:
import scala.collection.mutable.ListBuffer
import com.quantifind.charts.Highcharts._
The averages are computed for each month using the Spark function “aggregate”:
for (month <- 1 to 12) {
  // Keep only the records for the current month
  val monthData = tempData.filter(_.month == month)
  // Each aggregate call returns a (sum, count) pair
  val tAve = monthData.map(_.tAverage).aggregate((0.0, 0.0))((p, q) => (p._1 + q, p._2 + 1), (p, q) => (p._1 + q._1, p._2 + q._2))
  val tMin = monthData.map(_.tMinimum).aggregate((0.0, 0.0))((p, q) => (p._1 + q, p._2 + 1), (p, q) => (p._1 + q._1, p._2 + q._2))
  val tMax = monthData.map(_.tMaximum).aggregate((0.0, 0.0))((p, q) => (p._1 + q, p._2 + 1), (p, q) => (p._1 + q._1, p._2 + q._2))
  // Average = sum / count
  tempMinimum += tMin._1 / tMin._2
  tempAverage += tAve._1 / tAve._2
  tempMaximum += tMax._1 / tMax._2
}
Each call to aggregate returns a tuple containing:
- the sum of the temperature values
- the number of elements
The ratio of the two values is the average.
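To make the roles of the two functions passed to aggregate more concrete, here is a minimal sketch that mimics the same sum-and-count pattern with plain Scala collections instead of Spark. The names AverageSketch, seqOp, combOp and the nested sequences standing in for partitions are hypothetical and only serve as an illustration; this is not how Spark schedules the work internally.

```scala
object AverageSketch {
  // Accumulator: (running sum, running count)
  type Acc = (Double, Double)

  // Folds one value into an accumulator (plays the role of the
  // first function passed to aggregate).
  def seqOp(acc: Acc, value: Double): Acc = (acc._1 + value, acc._2 + 1)

  // Merges two accumulators (plays the role of the second function
  // passed to aggregate).
  def combOp(a: Acc, b: Acc): Acc = (a._1 + b._1, a._2 + b._2)

  // Assumption: each inner Seq stands in for one partition of the data.
  def average(partitions: Seq[Seq[Double]]): Double = {
    val zero: Acc = (0.0, 0.0)
    val (sum, count) = partitions
      .map(_.foldLeft(zero)(seqOp)) // one accumulator per "partition"
      .foldLeft(zero)(combOp)       // merge the per-partition results
    sum / count
  }

  def main(args: Array[String]): Unit = {
    val partitions = Seq(Seq(10.0, 12.0), Seq(14.0))
    println(average(partitions)) // prints 12.0
  }
}
```

Note that with a (0.0, 0.0) starting value, an empty input yields 0.0 / 0.0, which is NaN, so a month without any records would show up as a gap rather than as zero.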
Let’s now use Wisp to plot the temperature profile:
line(1 to 12, tempMinimum)
hold()
line(1 to 12, tempAverage)
hold()
line(1 to 12, tempMaximum)
title("Temperature")
xAxis("Month")
yAxis("Temperature")
legend(List("Tminimum", "Taverage", "Tmaximum"))
Compile the code and run it. If all goes well, the console displays a URL:
Output written to http://machine-name:PORT
Navigate to the URL using a web browser and you should see a chart showing the monthly temperature averages for tMinimum, tAverage and tMaximum.