Gain real-world experience on Databricks as a Data Engineer


Kick off your Data Engineer career on Databricks


  • Access to a Databricks Workspace is required; if you don’t have one, we will create a free account during the course
  • All the code and step-by-step instructions are provided, but the skills below will greatly benefit your journey
  • Basic knowledge of Python and SQL


Are you looking for a sneak peek into what the job of a Data Engineer on Databricks looks like?

Would you like to gain some real-world experience on what it feels like to work with Databricks as a Data Engineer?

Are you eager to enhance your skills as a Data Engineer and prepare for the Databricks Certified Developer for Apache Spark 3.0 exam in Python?

Then look no further.

This course provides you with some rudimentary yet real-world experience of what a Data Engineering job involves, so that you can decide whether it suits you.

In no more than 1 hour, thanks to several practical examples, you will learn how to ingest data in different formats while working on Databricks (comma-separated text files, XML files, tab-separated text files, and fixed-width files).

By the end of this course you will know how to work with the DataFrame API and the Delta API just like a real Data Engineer does!
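In practice, the two APIs meet in a simple pattern: read a file with the DataFrame API, then persist it with the Delta API. The sketch below illustrates that pattern as a reusable function; the options, paths, and table names are assumptions for illustration, not the course's actual code, and it requires a running Spark session with Delta support (as on Databricks).

```python
def ingest_csv_to_delta(spark, source_path, table_name, sep=","):
    """Read a delimited text file and save it as a managed Delta table."""
    df = (
        spark.read
        .option("header", "true")        # first line holds column names
        .option("inferSchema", "true")   # let Spark guess column types
        .option("sep", sep)              # "," for CSV, "\t" for TSV
        .csv(source_path)
    )
    # The Delta API side: overwrite (or create) a managed Delta table.
    df.write.format("delta").mode("overwrite").saveAsTable(table_name)
    return df
```

On a Databricks notebook you would call it with the built-in `spark` session, e.g. `ingest_csv_to_delta(spark, "/path/to/file.csv", "my_table")`.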


  • In Notebook 1, you are given an overview of the project, the data, and the exercises.
  • In Notebook 2, you are going to walk through the code to perform data ingestion into Delta Tables.
  • In Notebook 3, you are going to walk through the code to combine all the datasets into one to answer a handful of business questions.
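Combining the datasets, as Notebook 3 does, usually means joining the ingested Delta tables on a shared key and aggregating the result. A rough sketch of that shape is below; the table and column names are assumptions for illustration only, and the function expects a Spark session with those tables available.

```python
def avg_indicator_by_region(spark):
    """Join indicator values to country metadata and average by region."""
    indicators = spark.table("indicators")   # assumed Delta table
    countries = spark.table("countries")     # assumed Delta table
    return (
        indicators.join(countries, on="country_code", how="inner")
        .groupBy("region")
        .avg("value")
    )
```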

The idea behind this project is to ingest data from a variety of file types and load it into Delta Tables for further analysis.

This project is self-contained, in that all the code required to complete this project is provided with it and you will just have to run each cell of each notebook.

To put it another way, this project simply walks you through several exercises that reflect the day-to-day work of a Data Engineer in real life, at least for the very first steps of the data ingestion part.

The data is about global educational indicators and comes from publicly available data from The World Bank. Please note that the author has massaged the data to some degree for the sake of simplicity. Therefore, the data does not represent actual data from the source; it is intended only to demonstrate how to work with PySpark on Databricks (take it for demonstration purposes only).

To close the project, you are going to answer a handful of simple business questions based on the combination of the data you previously loaded.

Who this course is for:

  • Anyone aiming to gain real-world experience on Databricks as a Data Engineer
  • Anyone looking for a career in Data Engineering with Databricks
  • Anyone who wants to learn how to ingest data into Delta Tables on Databricks
  • Data Engineers who want to study for the “Databricks Certified Developer for Apache Spark 3.0” certification
