Our Data Science Collective group recently discussed whether Data Scientists require a PhD. The popularity of this discussion led us to delve a little deeper into the topic, which threw up some interesting questions.
Does four years of further education outweigh the experience gained from going straight into a commercial environment? Does the answer boil down to the Data Scientist's role within the team?
Is there a difference in mind-set and approach from someone who has completed a PhD vs. someone who hasn’t?
We should start off by saying that there are a number of extremely talented and successful Data Scientists who do not have a PhD. Niall and I have each recruited in this space since 2013, so we are well aware of market trends and of what companies look for when hiring a Data Scientist.
Sam Savage mentioned in our Data Science Collective group:
“Phd = 3-4 years to solve 1 hard new problem,
DS = 1-6 months to wire together existing solutions and PROVE this really adds business value.”
Sam went on to write that you can solve pretty much any Data Science problem with a clear scientific methodology, i.e. a clear statement of the problem prior to the experiment, a rigorous approach to the experiment, transparency, reproducibility, and peer review.
So, do you need a PhD to do this?
In our experience, no, you do not need one. Someone who finishes their Masters and then goes straight into an analytically focussed role will have 4 years’ commercial experience by the time their colleague is finishing their PhD. Even if you do not tick all of the boxes that companies ask for in a Data Scientist position, you can still make that transition. In the retail industry, for example, you may be able to finish a Masters, start as a Data Analyst working alongside the Data Science team, build on your predictive modelling, machine learning and programming skills, and within a year or two be working as a Data Scientist.
However, will this person have as deep an understanding of “Data Science” as someone with a PhD? There are several reasons PhDs are valued highly in Data Science. There is the element of “storytelling”: a PhD is evidence of being trained in wrestling with data, knocking it into shape, wringing a story out of it, and telling that story to others. A PhD is also evidence that you can refine your work to a very high standard and learn the methodologies that enable groundbreaking development. It is often a sign that you can dive much deeper into the theory rather than simply applying out-of-the-box algorithms and solutions. Further still, you are likely to better understand the foundations of the techniques used, be able to customise them, and understand when and why they may not work.
Having said all of this, a deeper understanding can have its limitations in the commercial world. It is a common generalisation that Data Scientists from a strongly academic, research-orientated environment have a tendency to (excuse the expression) build ivory towers, and struggle to meet the demands of a fast-paced environment, such as working in short sprints. Personally, I feel this is too sweeping a statement, but it is understandable why many companies hesitate when bringing such people on board for their first commercial role. Courses and boot camps such as Science to Data Science, Advanced Skills Initiative, and Insight Data Science have been set up purely for this reason: to help bridge the gap between finishing a PhD and starting a job as a Data Scientist.
David Johnston (Principal Data Scientist at ThoughtWorks) recently posted an article on the shifting demographics within Data Science. One interesting point was the decline in Data Scientists with PhDs from 43% last year to 28% this year. This could start a whole separate discussion on the definition of those Data Scientist job titles, but we’ll leave that for another day.
Do you need a PhD to be a Data Scientist?
Absolutely not. Will your Data Science career be limited by not having a PhD? Absolutely not. Are there benefits to having a PhD as a Data Scientist? Absolutely. What is my overall advice when candidates who want to start a career as a Data Scientist ask whether or not to study for a PhD? I am afraid that I am still going to sit on the fence with that one…