What Do Data Scientists Actually Do?

The age-old question...*

Let's start by addressing the elephant in the room - there are thousands of blog articles out there that aim to answer this exact question. Why on earth am I adding another opinion to the mix? Well, as I explained in a recent-ish talk, data science is different now. Also, not everyone wants to read a 20-minute manifesto on the ins and outs of deep learning, let alone look at another bloody Venn diagram. I want to share a simple, digestible take on what data scientists do in practice.

OK, on to the article...

What do data scientists do?

I get asked this a lot. Like a lot a lot. And I've found that when I give an answer, it can feel a little unsatisfactory, both to me and to the person asking. I feel unsatisfied because it's a general answer - data is data, we analyse it. What data? All data. Any data. Which industry? Lots. Many. All...? Most. Being a generalist** is not the most popular choice in today's hyper-specialised workforce.

For the person asking, the answer is either too high-level, or the opposite. The question is phrased as what, but they usually mean how. Or if I answer how, it becomes a why. So let's try and give a well-rounded answer that still fits within the parameters of an elevator pitch.

High-level (the WHAT)

Data scientists create value from data. This may take the form of insights for decision makers, automation or optimisation of important processes, or general problem solving. We translate business problems into quantifiable objectives and build out technical solutions driven by data.

Low-level (the HOW)

Data scientists use a range of frameworks, algorithms, tools and techniques to analyse data. We write code in languages like Python, R and SQL to manipulate, transform and interrogate data, and run experiments, often for the purposes of surfacing useful insights or predicting future behaviour. Machine learning? You bet - it's one of our core techniques. Deep learning? Yes, that too, when needed. We wield a mix of statistics and computer science, backing it up with communication and data visualisation skills to deliver results in a way that our stakeholders can understand.
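To make that a little more concrete, here is a minimal sketch in Python of the "interrogate the data, then predict" loop. Everything in it is an assumption made for illustration: the numbers are invented, and the helper names (fit_line, predict) are hypothetical rather than taken from any particular library. Real projects would lean on proper tooling; this only uses the standard library.

```python
from statistics import mean

# Hypothetical toy data, invented for illustration:
# (month, advertising spend, revenue)
rows = [
    (1, 10.0, 25.0),
    (2, 20.0, 45.0),
    (3, 30.0, 65.0),
    (4, 40.0, 85.0),
]


def fit_line(xs, ys):
    """Ordinary least squares with one predictor: y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


def predict(slope, intercept, x):
    """Apply the fitted line to a new value of the predictor."""
    return slope * x + intercept


# Manipulate: pull out the columns we care about.
spend = [row[1] for row in rows]
revenue = [row[2] for row in rows]

# Interrogate: a simple descriptive summary.
average_revenue = mean(revenue)

# Predict: fit a trend and estimate revenue at a spend level we have not seen.
slope, intercept = fit_line(spend, revenue)
forecast = predict(slope, intercept, 50.0)

print(f"average revenue: {average_revenue}")
print(f"fitted line: revenue = {slope} * spend + {intercept}")
print(f"forecast at spend of 50: {forecast}")
```

The point is the shape of the work rather than the model itself: get the data into a usable form, summarise it, then use it to say something about a case you have not observed yet.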

The game (the WHY)

At least in the context of business, the why is to increase revenue or reduce costs. It couldn't be any simpler, or more obvious - yet we've seen too many projects where these objectives are either disconnected from the process or are off the table completely. Generic as this is, if the work does not have a clear purpose, what is the point?

A slightly different but related definition is that data science is there to support and improve decision-making. A business that makes good decisions is a business that excels. This is why stakeholder buy-in is so important to data science success. And this is why data science is so important.

So, do you need a data scientist? We'd love to talk.

* Not quite, since the term data science has only been in common parlance for the last decade. Yet somehow there are people walking around with >20 years of data science experience!

** Although it's kind of important here: https://www.linkedin.com/posts/luke-metcalfe-51394a_why-data-science-teams-need-generalists-activity-6628837226543448065-WjJ-