About MLlib, GraphX, and R

MLlib is Spark’s machine learning library. GraphX is Spark’s API for graphs and graph-parallel computation. SparkR exposes the Spark API to R and allows users to run jobs from the R shell on a cluster. In this course, you will learn how to work with each of these libraries.

Who should take this course?

Programmers and developers familiar with Apache Spark who wish to expand their skill sets.

Course Content

  • start the course
  • describe data types
  • recall basic statistics
  • describe linear SVMs
  • perform logistic regression
  • use naïve Bayes
  • create decision trees
  • use collaborative filtering with ALS
  • perform clustering with K-means
  • perform clustering with LDA
  • perform analysis with frequent pattern mining
  • describe the property graph
  • describe the graph operators
  • perform analytics with neighborhood aggregation
  • perform messaging with the Pregel API
  • build graphs
  • describe vertex and edge RDDs
  • optimize representation through partitioning
  • measure vertices with PageRank
  • install SparkR
  • run SparkR
  • use existing R packages
  • expose RDDs as distributed lists
  • convert existing RDDs into DataFrames
  • read and write Parquet files
  • run SparkR on a cluster
  • use the algorithms and utilities in MLlib
