Interviewing for any technical position generally requires preparing, studying and long-all day interviews. As a data engineer, what we need to study is not always clear. Some positions require Hadoop, others SQL, some roles require understanding statistics while still others require heavy amounts of system design.
We have gathered many of the resources that we have used to study and get jobs at companies int the FAANG family as well as other major tech companies. We have yet to find one that requires you to know anything about Hadoop during the interview so that has not bee included in this study guide.
However, as part of this study guide we have created the checklist. Sometimes, studying for interviews can feel like you are getting nothing done. Thus, this checklist will help you keep track so you know where to improve and what you have done. You can find the checklist here.Download The Data Engineering Interview Checklist
- SQL Pre-Video Problems
- SQL Videos
- SQL Post Video Problems
- Database, Data Warehouse And ETL Design
- Algorithm And Data Structures Programming Problems
- Operational Programming Problems
- System Design Videos
- Udemy Courses
As a data engineer it is almost inevitable that you will get some SQL questions. As someone who has participated in many interviews for a lot of top tech companies like Amazon and Capital One. They usually follow some similar patterns.
Typically there will be at least one question that requires an aggregation with a filter, another that requires a few joins and then one that requires a subquery. Along with that, there might be a few curve ball questions that require self-joins, recursions and analytic functions. So let’s look at a couple concepts that are good to cover
SQL Pre-Video Problems
These first few problems will help you gauge where you are on different concepts. That way you can take notes on the study guide and go back and review what you feel you were not comfortable with.
Join with Aggregation
Trips table holds all taxi trips. Each trip has a unique Id, while Client_Id and Driver_Id are both foreign keys to the Users_Id at the
Users table. Status is an ENUM type of (‘completed’, ‘cancelled_by_driver’, ‘cancelled_by_client’).
This is usually the 3rd or 4th style of question.
Sql Questions With Complex Logic
Sometimes, you get lucky and only have to deal with SQL questions that involve one point o logic. Sometimes it gets a little trickier. For instance this question asks you to both find cities with 3 consecutive rows and has populations over 100.
This is usually pretty easy to do when you are working, but can sometimes be a little difficult when you are in the middle of an interview.
Problem: X city built a new stadium, each day many people visit it and the stats are saved as these columns: id, visit_date, people
Please write a query to display the records which have 3 or more consecutive rows and the amount of people more than 100(inclusive).
Problem: Write a SQL query to find employees who earn the top three salaries in each of the department. For the above tables, your SQL query should return the following rows (order of rows does not matter).
Problem: Given a
Weather table, write a SQL query to find all dates’ Ids with higher temperature compared to its previous (yesterday’s) dates.
Problem: Mary is a teacher in a middle school and she has a table
seat storing students’ names and their corresponding seat ids.
The column id is continuous increment.
Mary wants to change seats for the adjacent students.
Can you write a SQL query to output the result for Mary?
Problem: Ketty gives Eve a task to generate a report containing three columns: Name, Grade and Mark. Ketty doesn’t want the NAMES of those students who received a grade lower than 8. The report must be in descending order by grade — i.e. higher grades are entered first. If there is more than one student with the same grade (8-10) assigned to them, order those particular students by their name alphabetically. Finally, if the grade is lower than 8, use “NULL” as their name and list them by their grades in descending order. If there is more than one student with the same grade (1-7) assigned to them, order those particular students by their marks in ascending order.
Write a query to help Eve.
Ranking, Row Numbers And Analytic Functions
Problem: Write a SQL query to get the nth highest salary from the
Complex Self Joins
Problem: You are given a table, Functions, containing two columns: X and Y.
Two pairs (X1, Y1) and (X2, Y2) are said to be symmetric pairs if X1 = Y2 and X2 = Y1.
Write a query to output all such symmetric pairs in ascending order by the value of X.
If you need more SQL try these one as well:
IQ15: 6 SQL Query Interview Questions
Post Video SQL Problems
Once you have finished watching the SQL videos above. Consider trying the new problems below. Try to see if you feel like you are improving. Again, note down any specific topics you feel weak on.
Databases, ETL and Data Warehouses
For database, ETL and data warehouse design questions we have gathered and even created some videos we hope will help you out when it comes to explaining your design in an interview. In addition, we have listed out a few plausible database/DW concepts you could attempt to design out on your own.
*We are working on making similar videos. Sign up to get future emails about our videos.
ETL Design Video
Self Practice Problems:
For this part of your interview practice we are going to list out a few business systems that you can try to design out. First we recommend designing a relational database, then thinking about how you would design an ETL and DW that rely on that relational DB.
*In addition, we have found it very common that interviewers will base their interview questions off of your design. So think about some of the questions you could answer with your DB and list them out.
Design a Database/ETL and DW for a:
- Dating App
- Bicycle Rental Service
- Music Streaming App
- Job Search Website
- Udemy like website
These are just a few ideas. We hope they help you have a clearer idea of what you can practice modeling and designing. Take some time to think about how users interact with these websites before getting started.
Data engineers do a significant amount of programming in there daily life. There are several specific languages data engineers use. In particular, Python is arguable the most common.
If the role requires a lot of Hadoop work, then Java is also a useful language to have. There are a few other useful languages like Java and Powershell (if you work at a Microsoft shop).
There are two types of questions we have experienced. Some interviewers will ask you more operational questions. Others will ask classic algorithm and data structure questions.
Below are list of them…
Algorithms And Data Structures
Before going to deep into data structure and algorithms. Let’s do a quick check to see how you are currently doing in this area. We have listed out 8 leetcode problems that vary in difficulty. Try these out and try to gauge yourself on how long it takes you as well as how many hints you needed. If you are following along with the study guide, then note this down. At the end of this list are a few more questions. So once you have watched all the videos, consider doing those problems and see if you feel like you are improving!
- 985. Sum of Even Numbers After Queries
- 657. Robot Return to Origin
- 961. N-Repeated Element in Size 2N Array
- 110. Balanced Binary Tree
- 3. Longest Substring Without Repeating Characters
- 19. Remove Nth Node From End of List
- 23. Merge k Sorted Lists
- 31. Next Permutation
Now that you have gone through these 8 questions, and shaken off the rust. Let’s start reviewing these concepts.
Big O Notation
Some Interview Walk Throughs
Post Study Problems
Once you have finished the videos above. Consider trying the algorithm and data structure problems below. Make sure you keep track of how comfortable you felt when working on the problems.
- Bigger Is Greater
- 6. ZigZag Conversion
- 7. Reverse Integer
- 40. Combination Sum II
- 43. Multiply Strings
- Larry’s Array
- Short Palindrome
- 65. Valid Number
If you still feel like you need help, then consider taking a course on algorithms and data structures.
Operational Programming Problems
Operational interview questions are harder to prep for. There are no “classic” interview questions here. However, they are also often easier to figure out on the spot. Algorithm interview questions usually have some sort of trick. Like the balanced brackets problem. If you don’t know you need to use queues, it will be very difficult to get to the correct answer.
Where as operational problems will be more focused on workflows, and businesses processes. So as long as you are good at walking through real problems, this should be easier. Here are some problems that are great for prepping. We find it is helpful to know how to use arrays and dictionaries. Beyond that, there isn’t too much more required.
System Design Questions
System design questions aren’t that common. However, this is a good skill to practice because you never know when an interviewer suddenly asks you to design a mobile app for the next Uber (it has happened to us). You don’t want to get stuck on pieces you don’t know when doing high level design. It quickly shows what you have 0 clue about. For instance, what is a CDN, what type of Webserver are you going to use for a ASP.NET site, where does the API layer go. All of these are basic questions you don’t want to get stumped on.
So spending a few minutes watching the videos below will help you be prepared in case your interviewer shocks you with their problem.
If you have finished all of the questions and videos above and still feel like you need to review, then consider taking the courses below. They will help you get some new perspective. The truth is, these courses will pretty much cover the same concepts covered in the videos we have provided. However, sometimes, hearing the same information from multiple sources is helpful.
Along with the courses, it might be helpful to read some programming, data warehousing design books. These are other key concepts worth covering.
We do hope this list will help you prepare for your next data engineering interview. If you need the checklist to help keep track then you can find the checklist here. Please let us know if you have any questions or need any future help. Please do sign up for our email list to continue to receive free content like this!
Are You Interested In Learning About Data Science Or Tech?