Machine Learning Engineer vs Data Scientist - Salary Gap
In this article we explore why the median Machine Learning salary is 15-40% higher than the median Data Science salary across all seniority levels. The data for this study comes from 9,261 jobs indexed by our Data Science Job Hunter between June and September 2023 from 1,605 companies worldwide.
Since Data Science and Machine Learning are often used interchangeably in the industry, this systematic trend seems puzzling.
We studied 9,261 job descriptions to uncover the actual difference in requirements driving the difference in salary.
Let's start with understanding the actual salary gap.
Machine Learning salary is 15-40% higher than Data Science salary
Machine Learning Engineer vs Data Scientist - Salary Gap, Source: Jobs-in-data.com
Regular Level Positions:
- Data Scientists earn a median salary of approximately $119,550.
- ML Engineers outperform this, with a median salary of $165,000.
- ML Ops professionals sit between the two, closer to ML Engineers than to Data Scientists, with a median salary of around $152,000.
In summary, ML Engineers at the regular level seem to have 38% higher median salary compared to Data Scientists.
Senior Level Positions:
- Senior Data Scientists have a median salary of $152,720.
- Senior ML Engineers again lead the pack with a median salary of $185,900.
- ML Ops senior roles offer a median salary of $185,800, virtually identical to ML Engineers.
Here, the salary gap between Data Scientists and the other roles is at 22%, with ML Engineers and ML Ops professionals almost neck-and-neck.
Manager/Lead Level Positions:
- Manager/Lead roles in Data Science command a median salary of about $175,637.
- For ML Engineers in Manager/Lead roles, the median salary jumps to $225,000.
- ML Ops Manager/Lead roles offer a median salary of $210,375.
ML Engineers appear to be the best compensated at this level of seniority (about 28% above Data Scientists), followed closely by ML Ops roles.
Director/VP Level Positions:
- Director/VP level Data Scientists earn a median salary of $185,710.
- At the same level, ML Engineers have a median salary of $210,000.
- ML Ops Director/VP roles have a median salary of $197,500.
At the Director/VP level, ML Ops roles surge ahead of Data Scientists but are still somewhat behind ML Engineers in terms of median salary.
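The premiums quoted in this section can be reproduced from the medians listed above. The following is a minimal sketch, not the pipeline used for the study; the names `MEDIAN_SALARY`, `premium_pct`, and `premiums_over_data_scientist` are introduced here for illustration only.

```python
# Median salaries (USD) exactly as quoted in the bullet lists of this section.
MEDIAN_SALARY = {
    "Regular": {"Data Scientist": 119_550, "ML Engineer": 165_000, "ML Ops": 152_000},
    "Senior": {"Data Scientist": 152_720, "ML Engineer": 185_900, "ML Ops": 185_800},
    "Manager/Lead": {"Data Scientist": 175_637, "ML Engineer": 225_000, "ML Ops": 210_375},
    "Director/VP": {"Data Scientist": 185_710, "ML Engineer": 210_000, "ML Ops": 197_500},
}


def premium_pct(salary, baseline):
    """Relative premium of `salary` over `baseline`, in percent."""
    return (salary / baseline - 1.0) * 100.0


def premiums_over_data_scientist(medians):
    """For each seniority level, premium of every other role over Data Scientist,
    rounded to whole percent."""
    result = {}
    for level, by_role in medians.items():
        baseline = by_role["Data Scientist"]
        result[level] = {
            role: round(premium_pct(value, baseline))
            for role, value in by_role.items()
            if role != "Data Scientist"
        }
    return result


PREMIUMS = premiums_over_data_scientist(MEDIAN_SALARY)
for level, by_role in PREMIUMS.items():
    print(level, by_role)
```

With these inputs the ML Engineer premium over Data Scientists comes out at roughly 38%, 22%, 28%, and 13% for the Regular, Senior, Manager/Lead, and Director/VP levels respectively.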
Other studies
Other studies also identified a salary gap between Data Science and Machine Learning roles:
- When working on our salary calculator, we found a 17-20% salary premium for ML Engineers / ML Ops over Data Scientists across different seniority levels, based on Kaggle Survey Data.
- This study based on Indeed data found a 30% premium.
In conclusion, we observe a systematic and significant (15-40%) premium for ML Engineers / ML Ops Engineers over Data Scientists across different studies and seniority levels.
Let's study job requirements to understand the root causes.
Machine Learning Engineer vs Data Scientist - Skills
We compared requirements for Data Science roles and Machine Learning roles in the following aspects:
- Required education
- Programming languages
- Core ML skills
- Data processing and database technologies
- Cloud skills
- Data visualization tools
- IDEs
These are the most significant differences that we observed in job requirements:
1. Education Requirements: A noticeable difference is observed in the demand for Ph.D. holders: ML Engineer positions mention a Ph.D. in 27% of job listings versus 23% for Data Scientist listings, a difference of 4 percentage points (about 17% in relative terms).
2. Programming Language Proficiency: While ML Engineers lean towards lower-level languages like C, C++, and Java, Data Scientists often utilize SQL and R. Nevertheless, Python is a Lingua Franca for the Machine Learning world, serving as a crucial skill set for both job categories.
3. Core ML Skills: While there is a significant overlap, distinctions exist in the core skillsets for ML Engineers and Data Scientists. The former primarily focuses on deep learning technologies and mastery of frameworks like PyTorch and TensorFlow. In contrast, Data Scientists need to be adept in statistics and data visualization.
4. Data Processing and Database Technologies: The MLOps roles distinctly stand out when it comes to experience with various data processing and database technologies, often requiring a broader and more in-depth skill set compared to the other two roles.
5. Cloud Skills: There is a discernible demand for cloud skills among ML Engineers, significantly higher than among Data Scientists.
6. Visualization Tools Proficiency: Data Scientists have a higher demand for proficiency in visualization tools, a requirement that is not as prevalent among ML Engineers.
7. IDE Usage: Interestingly, across all three roles - ML Engineers, MLOps, and Data Scientists - Microsoft Excel is a frequently listed requirement.
Let's dive deeper into the details.
Machine Learning Engineer vs Data Scientist: Required Education
Machine Learning Engineer vs Data Scientist - Required Education, Source: Jobs-in-data.com
In the Data Science and Machine Learning job market, educational qualifications play an important role, with a degree of some type being mentioned in approximately 80-90% of job descriptions.
For Data Scientist positions, with a total of 4,393 job listings observed, Bachelor’s degrees are mentioned in 28% of the listings, Master’s degrees in 37%, and PhDs in 23%.
In contrast, Machine Learning Engineer roles, which have 2,835 listings, display a different educational preference. Bachelor’s degrees are mentioned in 24% of listings, Master’s degrees in 28%, and notably, PhDs in 27%. The data indicates that there is a higher demand for PhD holders in Machine Learning Engineer roles compared to Data Scientist positions (27% versus 23%).
Regarding ML Ops jobs, of which there are 1,398 listings, Bachelor’s degrees are noted in 32% of job descriptions, Master’s degrees in 36%, and PhDs in 19%. MBAs are also mentioned across all three job types, but they represent a smaller percentage of the listings: 4% for Data Scientist roles, 1% for Machine Learning Engineer roles, and 2% for ML Ops roles.
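When comparing shares like these, it helps to separate the difference in percentage points from the relative difference. Here is a minimal sketch using the PhD shares quoted above; the helper name `compare_shares` is an illustrative assumption and not part of the original analysis.

```python
# Share of listings mentioning a PhD, in percent, as quoted in this section.
PHD_SHARE_PCT = {"Data Scientist": 23, "ML Engineer": 27, "ML Ops": 19}


def compare_shares(share_a, share_b):
    """Return (percentage-point difference, relative difference in percent) of a over b."""
    point_diff = share_a - share_b
    relative_diff = (share_a / share_b - 1.0) * 100.0
    return point_diff, relative_diff


points, relative = compare_shares(
    PHD_SHARE_PCT["ML Engineer"], PHD_SHARE_PCT["Data Scientist"]
)
print(f"{points} percentage points, {relative:.1f}% relative")
```

For ML Engineers versus Data Scientists this gives a gap of 4 percentage points, or about 17% in relative terms.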
Machine Learning Engineer vs Data Scientist: Programming Languages
Machine Learning Engineer vs Data Scientist - Programming Languages, Source: Jobs-in-data.com
In our analysis, Python indisputably stands out as a predominant language across all three job categories, being required in 65% of Data Scientist roles, 48% of ML Engineering positions, and a remarkable 75% of MLOps jobs. This underlines Python's unparalleled utility in data manipulation, modeling, and operational tasks in the ML lifecycle.
When we shift our focus to Machine Learning Engineers (and compare them to Data Scientists), there's a conspicuous demand for lower-level programming languages. C, C++, and Java appear in 21%, 13%, and 16% of ML Engineer listings respectively, more often than in Data Scientist job requirements. This trend reflects the need for ML Engineers to engage deeply with system-level tasks, algorithm optimization, and performance enhancement, areas where lower-level languages traditionally excel.
SQL is another language that finds significant application, especially for Data Scientists (44%) and MLOps roles (46%), emphasizing its importance in database querying and management tasks crucial for data-intensive jobs.
R also maintains a strong presence, particularly among Data Scientists (44%), highlighting its continuous relevance for statistical analysis and data visualization tasks. However, its demand drops noticeably in ML Engineering (14%) and MLOps (26%) roles.
Other technologies listed alongside the languages, such as Spark (a data processing engine rather than a language) and Scala, show variable demand across the three categories, with Spark being particularly sought after in MLOps jobs (34%). The demand for other languages like TypeScript, C#, MATLAB, Bash, Scratch, Rust, Ruby, Swift, Julia, and Perl is relatively low.
In a nutshell, while Python continues to be the lingua franca in the data and ML job market, there is a distinct pattern of language requirements that align with the unique responsibilities and technical demands of Data Scientist, ML Engineer, and MLOps roles.
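The percentages in this section are shares of listings per role that mention a given language. As a rough illustration of how such shares could be computed from raw job descriptions, here is a minimal sketch. It is not the method used for the study: the function `mention_share`, the regular expressions, and the toy listings are all hypothetical.

```python
import re
from collections import defaultdict


def mention_share(listings, skills):
    """Return {role: {skill: percent of that role's listings mentioning the skill}}.

    `listings` is an iterable of (role, description) pairs and `skills`
    maps a skill name to a regular expression that detects it.
    """
    totals = defaultdict(int)
    hits = defaultdict(lambda: defaultdict(int))
    for role, description in listings:
        totals[role] += 1
        text = description.lower()
        for skill, pattern in skills.items():
            # Count each listing at most once per skill.
            if re.search(pattern, text):
                hits[role][skill] += 1
    return {
        role: {skill: 100.0 * hits[role][skill] / totals[role] for skill in skills}
        for role in totals
    }


# Hypothetical toy data, made up for this example only.
toy_listings = [
    ("Data Scientist", "Strong Python and SQL skills; R is a plus."),
    ("Data Scientist", "Experience with SQL and Tableau."),
    ("ML Engineer", "Python, C++ and PyTorch."),
    ("ML Engineer", "Java or C++ for low-latency inference."),
]
skill_patterns = {
    "python": r"\bpython\b",
    "sql": r"\bsql\b",
    "c++": r"c\+\+",
}

shares = mention_share(toy_listings, skill_patterns)
print(shares)
```

Short names such as "R" or "C" need more careful patterns than the ones shown here, since a naive match would also fire on unrelated words.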
Machine Learning Engineer vs Data Scientist: Core ML skills
Machine Learning Engineer vs Data Scientist - Core ML Skills, Source: Jobs-in-data.com
In terms of notable differences in Core ML Skills, Data Scientists predominantly lean towards expertise in statistics and data visualization, while Machine Learning Engineers and ML Ops specialists find a higher demand for skills in deep learning, particularly with tools like PyTorch and TensorFlow.
At the forefront of skills for all three roles is Communication, with 73.9% of Data Scientist job postings emphasizing it. Interestingly, while ML Engineers rate it at 65.6%, it is even more critical for ML OPS roles, appearing in 78.7% of the listings. This indicates the cross-functional nature of these roles, necessitating constant interactions with other teams.
Model training, a foundational skill in this domain, is sought in 36.4% of Data Scientist roles, 38.7% for ML Engineers, and as high as 48.8% for ML OPS roles, highlighting their active involvement in the training phase.
Statistics is a key expertise for Data Scientists, with a notable 50.9% of jobs demanding it. This contrasts with 22.8% for ML Engineers and 31.0% for ML Ops roles. Data visualization, another essential skill for Data Scientists, is called for in 18.9% of their roles but drops significantly to 3.0% for ML Engineers and 9.6% for ML Ops.
Diving deeper into machine learning specifics, Deep Learning and NLP are vital across all roles, but it's frameworks like PyTorch and TensorFlow that are especially desired in ML Engineer and ML Ops roles, at around 25-27%. Computer Vision also maintains consistent importance across the spectrum, ranging from 15.4% to 16.4%.
ML Ops roles distinctively require Monitoring expertise, emphasized in 42.9% of listings. LLM (large language model) skills appear notably more often in ML Ops roles at 14.2%, followed by Scikit-learn at 12.4%, showcasing the evolving nature of these roles.
Skills like Keras, generative AI, Neural Networks, Feature engineering, and Inference range between 3% and 10% for all roles, indicating their niche yet valuable application. Interestingly, traditional algorithms like Random Forest see a significant dip in ML Engineer roles at 0.7% but maintain some importance for Data Scientists and ML Ops roles.
Incident handling, a vital operational skill, is most sought after in ML OPS roles.
Interestingly, Gradient Boosting libraries such as XGBoost, LightGBM, and CatBoost have a relatively modest presence in DS/ML/MLOps job listings, featuring in fewer than 5% of the offerings. This contrasts sharply with their known efficacy on supervised learning problems with tabular data, a discrepancy that warrants further exploration.
Machine Learning Engineer vs Data Scientist: Data processing and database technologies
Machine Learning Engineer vs Data Scientist - Data processing and database technologies, Source: Jobs-in-data.com
Proficiency across a range of data processing and database technologies distinctly characterizes ML Ops roles, setting them apart from positions in Data Science or Machine Learning Engineering.
A striking 40% of ML Ops job listings specifically call for experience with Apache Spark, far outpacing the 21% in Data Scientist roles and 13% in ML Engineer positions. Similarly, Databricks is a requirement in 18% of ML Ops jobs, compared to just 5% for Data Scientists and 3% for ML Engineers.
The focus on data processing technologies like Spark, Databricks, and even Snowflake—which appears in 17% of ML Ops listings—is likely due to the nature of Machine Learning Operations. ML Ops involves the end-to-end lifecycle of machine learning models, from development to deployment and scaling, which often requires robust data processing capabilities for handling large-scale, real-time data. Technologies like Apache Hadoop and Apache Kafka also show up more frequently in ML Ops listings, with Hadoop at 11% and Kafka at 15%, underscoring the need for distributed data processing and streaming services in this role.
Other databases and technologies like MongoDB, NoSQL, Hive, and even Amazon Redshift show a mixed distribution across the roles, but it's ML Ops that consistently demands a broader skill set. Interestingly, Amazon Redshift shows up in 8% of ML Ops jobs but is almost negligible in Data Scientist and ML Engineer roles.
Machine Learning Engineer vs Data Scientist: Cloud skills
Machine Learning Engineer vs Data Scientist - Cloud Skills, Source: Jobs-in-data.com
Cloud technology expertise is considerably more frequently required for roles in Machine Learning Engineering and Machine Learning Operations than it is for Data Scientist positions.
Among 4,325 Data Scientist job listings, cloud skills of any kind were required in 27.5% of the cases. Contrast this with Machine Learning Engineers, where a substantial 43.5% of 2,732 job postings required cloud knowledge, and the figure escalates dramatically to 69.0% for Machine Learning Operations roles out of 1,367 listings.
Digging deeper, AWS dominates across all three job categories, being a requirement in 26.3% of Data Scientist jobs, 28.6% for ML Engineers, and a notable 51.9% for ML Ops roles. Azure follows but lags behind AWS: it is a requirement in 12.0% of Data Scientist listings and 10.9% of ML Engineer listings, and only in ML Ops roles does it reach a substantial 36.3%. Google Cloud Platform (GCP) holds its own but remains a lesser player with 7.6% in Data Science roles, 10.2% in ML Engineering, and 23.4% in ML Ops.
What's notably intriguing is Oracle Cloud's near-absence from these job listings, showing up in only 0.1% of ML Ops roles and completely missing in Data Science and ML Engineering positions. Despite Oracle's aggressive marketing efforts to scale its cloud services, it's struggling to make a dent in these tech-intensive job markets. One reason could be the platform's late entry into the cloud space, which allowed AWS, Azure, and GCP to firmly establish their ecosystems. Additionally, Oracle Cloud might not offer the same range of machine learning and data processing tools that its competitors do, which is critical for these roles.
Machine Learning Engineer vs Data Scientist: Data visualization tools
Machine Learning Engineer vs Data Scientist - Data Visualization Tools, Source: Jobs-in-data.com
Data Scientist roles require experience in visualization tools three to six times more often than Machine Learning Engineer roles.
Among Data Scientist listings, 19.6% specifically sought expertise in Tableau, outpacing Machine Learning Engineers (3.3%) and Machine Learning Operations roles (9.7%) by a considerable margin. But that's just the tip of the iceberg. Moving down the list to Power BI, we see a similar gap relative to ML Engineers: 10.8% of Data Scientist jobs required proficiency in Power BI, compared to a mere 2.4% for ML Engineers, while ML Ops roles are nearly level at 10.2%.
While Matplotlib, PowerPoint, Plotly, and ggplot also made appearances in the job listings, their percentages ranged from 3.9% to as low as 0.3% across the different roles. But even here, data scientists led the way, with 3.9% of job postings asking for PowerPoint skills, and 3.0% calling for Matplotlib expertise.
On a humorous note, we take particular pride in working in an industry that prefers Matplotlib over PowerPoint for data visualization. Only someone who has worked with both can truly grasp the significance of this statement.
Machine Learning Engineer vs Data Scientist: IDEs
Machine Learning Engineer vs Data Scientist - IDE, Source: Jobs-in-data.com
MS Excel is required in 10.6% of Data Scientist roles, making it the most popular IDE for this profession. While Excel may not be traditionally thought of as an IDE, its powerful functionalities for data manipulation, visualization, and basic scripting via VBA extend its capabilities beyond just a spreadsheet software, thus making it versatile enough to be considered an IDE for data tasks.
This requirement for Excel significantly outpaces its demand in ML Engineer jobs at 5.5% and ML Ops at 5.0%, showcasing the distinct needs and work patterns in data science. Furthermore, when we look at other tools in this group, such as SAP, Jupyter Notebooks, Alteryx, PyCharm, Visual Studio Code, and Spyder, Data Scientists again lead in terms of requirements. SAP is a requirement in 4.0% of Data Scientist jobs, as opposed to just 1.4% in ML Engineer jobs and 3.7% in ML Ops roles. Jupyter Notebooks, commonly used in data analysis and machine learning, show up in 2.6% of Data Scientist job listings, compared to 2.1% for ML Engineers and a mere 1.2% for ML Ops.
Machine Learning Engineer vs Data Scientist: Additional insights
While comparing the job descriptions of Data Science and Machine Learning openings to explain the salary gap, we uncovered two additional phenomena that we wanted to share:
Machine Learning Engineer vs Data Scientist: Companies do not hire juniors or interns
Machine Learning Engineer vs Data Scientist - Open positions by seniority, Source: Jobs-in-data.com
This is unrelated to our main question, but it puzzles us nevertheless: our analysis of the Data job market reveals a glaring lack of opportunities for those looking to set foot in the industry. Out of 4,325 Data Scientist job listings, a mere 4% are targeted towards Junior or Intern positions. Similarly, of the 2,732 Machine Learning Engineer positions, only 3% were available for those at the junior or intern level. Even more striking is the scenario in ML Ops: with a sample of 1,367 jobs, only a scant 1% were reserved for newcomers.
Diving deeper into the numbers, regular-level roles account for 46% of Data Scientist positions, while for Machine Learning Engineer and ML Ops jobs the share is 48% and 43% respectively. Senior positions stand at 26% for Data Scientists, 23% for ML Engineers, and a slightly higher 27% for ML Ops roles. When it comes to managerial or lead roles, the distribution is fairly consistent across the board: 17% for Data Scientists, 20% for ML Engineers, and 18% for ML Ops. In the upper echelons, Director or VP roles constitute 7% of Data Scientist jobs, 6% of ML Engineer positions, and 11% of ML Ops jobs.
A question arises then: how to enter the field if there are very few junior / intern positions? We are going to continue to explore this topic in upcoming work.
Machine Learning Engineer vs Data Scientist: Remote opportunities grow with seniority
Machine Learning Engineer vs Data Scientist - Share of Remote Positions, Source: Jobs-in-data.com
The data reveals an intriguing pattern: remote work opportunities grow with seniority. For junior or intern level roles, remote positions accounted for just 4% of Data Scientist jobs, 16% of ML Engineer roles (based on a small sample), and 5% of ML Ops jobs. At the regular level, the numbers stand at 6% for Data Scientists, 6% for ML Engineers, and 7% for ML Ops roles. When we move to senior positions, the share of remote jobs climbs to 9% for Data Scientists and ML Ops and reaches 10% for ML Engineers.
However, it's at the managerial and leadership levels where this trend becomes most pronounced. For Manager/Lead roles, 11% of Data Scientist jobs are remote, close to the 10% for ML Engineers but outpaced by a notable 16% in ML Ops. Interestingly, at the Director/VP level, 9% of Data Scientist jobs offer remote options, while a substantial 14% of ML Engineer roles are remote, dropping to 7% for ML Ops.
While it does not seem to contribute to the salary difference between Data Scientists and ML / ML Ops Engineers (since the remote share is very similar for comparable seniorities), it is very interesting to see this upward trajectory as professionals ascend the career ladder. But why might this be the case? There are several plausible explanations that shed light on this phenomenon.
- Firstly, senior professionals, including managers and directors, often have a proven track record of delivering results, which builds a sense of trust with employers. This credibility might make companies more amenable to offering remote work options to seasoned experts who have consistently demonstrated their competence and reliability.
- Secondly, higher-level roles often entail more strategic responsibilities as opposed to hands-on, day-to-day tasks that might require physical presence, such as lab work or hardware maintenance. Roles like Manager/Lead or Director/VP often involve decision-making, planning, and oversight, tasks that can feasibly be conducted remotely.
- Thirdly, senior staff usually have better-established networks and communication skills, enabling them to manage teams and projects remotely with the same efficacy as they would on-site. Their experience in navigating organizational hierarchies and stakeholder management also plays a pivotal role in making remote work a viable option.
We see this as a career development opportunity: the pursuit of higher-level roles will not just come with greater responsibilities and higher pay, but also with an increased likelihood of being able to work remotely.
Machine Learning Engineer vs Data Scientist: About the dataset
The data for this study comes from 9,261 jobs in Data Science, Machine Learning and ML Ops, indexed by our Data Science Job Hunter between June and September 2023 from 1,605 companies worldwide. The salary insight is based on United States data.
The dataset encompasses three primary roles: Data Scientist, Machine Learning (ML) Engineer, and ML Ops jobs. Each job title is further broken down into five levels of seniority: Junior/Intern, Regular, Senior, Manager/Lead, and Director/VP.
The geographical distribution of indexed jobs:
Position | Country | Data Scientist jobs | ML Engineer jobs | ML Ops jobs | Total |
---|---|---|---|---|---|
1 | United States | 1971 | 1093 | 471 | 3535 |
2 | India | 417 | 213 | 185 | 815 |
3 | United Kingdom | 316 | 90 | 124 | 530 |
4 | Canada | 100 | 119 | 51 | 270 |
5 | Germany | 103 | 49 | 56 | 208 |
6 | France | 120 | 24 | 33 | 177 |
7 | Netherlands | 91 | 31 | 49 | 171 |
8 | not_available | 77 | 37 | 45 | 159 |
9 | Spain | 73 | 28 | 24 | 125 |
10 | Poland | 47 | 42 | 33 | 122 |
11 | Mexico | 49 | 54 | 14 | 117 |
12 | Australia | 30 | 65 | 19 | 114 |
13 | Brazil | 40 | 24 | 35 | 99 |
14 | Singapore | 44 | 30 | 15 | 89 |
15 | Portugal | 29 | 21 | 9 | 59 |
16 | Belgium | 33 | 11 | 7 | 51 |
17 | China | 36 | 11 | 2 | 49 |
18 | Switzerland | 25 | 18 | 3 | 46 |
19 | Colombia | 22 | 17 | 5 | 44 |
20 | Ireland | 19 | 17 | 7 | 43 |
Job groups definitions:
- Data Scientist jobs: Jobs with the following keywords in the Job Title: 'data science', 'data scientist'. Does not include ML Engineering or ML Ops jobs.
- Machine Learning Engineer jobs: Jobs with the following keywords in the Job Title: 'ml engineer', 'machine learning engineer', 'developer in ml', 'developer in machine learning', 'Software Engineer - Machine Learning', 'machine learning', 'ml/ai', 'ml'. Does not include Data Science or ML Ops jobs.
- ML OPS jobs: Jobs with the following keywords related to ML OPS: 'mlops', 'ml ops', 'ml operations', 'machine learning operations', 'machine learning ops', 'data ops', 'dataops'. The keyword search was both in the job title and job description. Does not include Data Science nor ML Engineering jobs.
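The grouping rules above can be sketched as a small keyword classifier. This is an illustration under stated assumptions, not the code used to build the dataset: the definitions do not say how a listing matching several groups is resolved or how keywords are matched, so the precedence (ML Ops first, then Data Scientist, then ML Engineer) and the whole-word matching are assumptions, and the names `contains_keyword` and `classify_job` are hypothetical.

```python
import re

# Keyword lists copied from the job group definitions above (lower-cased).
DATA_SCIENCE_TITLE_KEYWORDS = ["data science", "data scientist"]
ML_ENGINEER_TITLE_KEYWORDS = [
    "ml engineer", "machine learning engineer", "developer in ml",
    "developer in machine learning", "software engineer - machine learning",
    "machine learning", "ml/ai", "ml",
]
ML_OPS_KEYWORDS = [
    "mlops", "ml ops", "ml operations", "machine learning operations",
    "machine learning ops", "data ops", "dataops",
]


def contains_keyword(text, keywords):
    """True if any keyword occurs in `text` as a whole word or phrase."""
    text = text.lower()
    for keyword in keywords:
        # Assumption: require non-alphanumeric boundaries so that the short
        # keyword "ml" does not match inside words such as "html".
        pattern = r"(?<![a-z0-9])" + re.escape(keyword) + r"(?![a-z0-9])"
        if re.search(pattern, text):
            return True
    return False


def classify_job(title, description=""):
    """Assign a listing to one of the three job groups, or None if none applies."""
    # Assumption: ML Ops takes precedence; it is the only group that is
    # searched for in both the title and the description.
    if contains_keyword(title, ML_OPS_KEYWORDS) or contains_keyword(description, ML_OPS_KEYWORDS):
        return "ML Ops"
    if contains_keyword(title, DATA_SCIENCE_TITLE_KEYWORDS):
        return "Data Scientist"
    if contains_keyword(title, ML_ENGINEER_TITLE_KEYWORDS):
        return "ML Engineer"
    return None


print(classify_job("Senior Data Scientist"))
print(classify_job("Machine Learning Engineer"))
print(classify_job("Data Scientist", "Experience with MLOps pipelines"))
```

Under this assumed precedence, a Data Scientist title whose description mentions MLOps lands in the ML Ops group, which keeps the three groups mutually exclusive as the definitions require.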
Final notes:
The .pdf version of the entire report can be downloaded here.