Learn Spark | Data Engineering, Machine Learning


Master Spark for data cleaning, aggregation, and building ML models, with hands-on projects and practical insights from industry experts.

Video Courses, Big Data, Spark


Learn Spark by David Drummond and Judit Lantos


  • Learn to use Spark for cleaning and aggregating data
  • Explore Spark's ML capabilities and build ML models and pipelines
  • Run Spark on a distributed cluster in AWS, using both the AWS UI and the AWS CLI
  • Learn best practices for debugging and optimizing your Spark applications


This course is for anyone who wants to learn Spark and apply it to data engineering and machine learning. It is taught by industry experts David Drummond and Judit Lantos, who share hands-on experience and practical insights. It also includes projects based on real-world scenarios, so learners can apply their skills to practical problems.

How does GetVM work?




Simply install the browser extension and click to launch GetVM in the sidebar.



Choose your operating system, IDE, or application from our environment library and launch it instantly.



Practice your new skills in the VM, right in the sidebar of a tutorial or video. Save your work so you can continue learning later.


How to optimize storage costs using Amazon S3
Technical Tutorials, Cloud Computing
Optimize storage costs using Amazon S3 and gain valuable business insights at lower cost. Understand the 4 pillars of S3 cost optimization and leverage S3 features to monitor, analyze, and manage storage.

Big Data Analytics with Hadoop 3
Technical Tutorials, Big Data, Hadoop
Gain insights into big data analytics using the Hadoop platform. Learn data processing, analytics, and Hadoop ecosystem tools.

Cloudera Impala | Apache Hadoop Big Data Processing
Technical Tutorials, Big Data, Hadoop
Comprehensive guide to understanding and using Cloudera Impala for big data processing and analysis within the Hadoop ecosystem.

NoSQL Databases | Database Management, Big Data Processing
Technical Tutorials, Big Data, NoSQL
Comprehensive overview of NoSQL databases, including key-value stores, document databases, and column-oriented databases. Covers distributed data processing via MapReduce and real-world case studies.

Learning Spark: Lightning-Fast Data Analytics
Technical Tutorials, Big Data, Spark
Comprehensive guide to learning Apache Spark, a powerful open-source data processing engine. Covers the latest Spark 3.0 developments and provides hands-on examples.

Algorithms for Big Data | Harvard University CS 229R
University Courses, Big Data, Machine Learning
Dive into the theoretical foundations of efficient algorithms for processing big data. Relevant for internet search, machine learning, and scientific computing.

Big Data Analytics | Advanced Big Data Analytics - Columbia University
University Courses, Big Data, Data Analysis, Machine Learning
Gain in-depth knowledge on analyzing Big Data, including storage, processing, analysis, visualization, and application. Ideal for graduate students interested in Big Data and data analysis.

Data Mining | Machine Learning | Big Data Processing
University Courses, Machine Learning, MapReduce, Spark
Explore data mining and machine learning algorithms for analyzing large-scale data using MapReduce and Spark. Gain hands-on experience in data science and big data analysis.

Big Data Tutorials
Technical Tutorials, Big Data, Hadoop
Comprehensive big data tutorials covering Hadoop, Hive, and NoSQL databases. Master key technologies and techniques through practical, step-by-step lessons.