I am a data scientist.
Some great things about the job? It comes with a thick paycheck, flexible working hours, and a world of opportunity.
Some not-so-amazing things about the job? It can be difficult.
Data science roles are not as straightforward as what’s shown to you in YouTube videos and online courses.
Apart from creating machine learning models and having a sound knowledge of statistics, you also need to:
- Work with engineering teams to build data pipelines
- Collaborate with domain experts to understand your product
- Define business metrics
- Add value to the company’s bottom line
In this article, I am going to break down what I actually do every day as a data scientist and some common misconceptions about the field.
By the end of this article, you will have a clearer picture of a day in the life of a data scientist, which will help you decide if this field is for you.
I will also share some lesser-known tips on landing and succeeding in a data science role.
Let’s get into it!
Misconceptions about being a data scientist
When you think data science, you probably think “machine-learning models.”
That’s what I thought too, until I got a data science job.
I build machine learning models like… 10% of the time at work.
Maybe even less.
If you’ve taken a data science course before, you’ve probably learned that the process of building ML models involves:
- Exploratory data analysis
- Model selection
- Hyperparameter tuning
- Model evaluation
However, what most of these courses don’t teach you is this:
How to turn model performance metrics into insights that actually matter to the business.
At the end of the day, you are paid based on the amount of value you bring to the business.
This can be in the form of:
- Building an ML model to help a business acquire new customers
- Creating forecasts of future company performance
- Advising on product-launch decisions
I’ve worked on all three data science use-cases above.
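To make the gap between model metrics and business value concrete, here is a minimal sketch for the customer-acquisition case. All names and numbers are hypothetical: it assumes a classifier picks which prospects to contact, that each conversion is worth a fixed amount, and that each contact has a fixed cost. The point is that precision and recall describe the model, while profit and missed revenue are the figures a stakeholder can act on.

```python
from dataclasses import dataclass


@dataclass
class ConfusionCounts:
    """Hypothetical outcome counts from scoring a campaign audience."""
    true_positives: int   # contacted, and they converted
    false_positives: int  # contacted, but they did not convert
    false_negatives: int  # not contacted, but they would have converted


def precision(c: ConfusionCounts) -> float:
    """Share of contacted prospects who converted."""
    contacted = c.true_positives + c.false_positives
    return c.true_positives / contacted if contacted else 0.0


def recall(c: ConfusionCounts) -> float:
    """Share of all potential converters that the model found."""
    converters = c.true_positives + c.false_negatives
    return c.true_positives / converters if converters else 0.0


def campaign_profit(c: ConfusionCounts, value_per_conversion: float,
                    cost_per_contact: float) -> float:
    """Revenue from conversions minus the cost of contacting everyone flagged."""
    contacted = c.true_positives + c.false_positives
    return c.true_positives * value_per_conversion - contacted * cost_per_contact


def missed_revenue(c: ConfusionCounts, value_per_conversion: float) -> float:
    """Revenue left on the table from converters the model did not flag."""
    return c.false_negatives * value_per_conversion


# Assumed figures, for illustration only.
counts = ConfusionCounts(true_positives=120, false_positives=380, false_negatives=80)
profit = campaign_profit(counts, value_per_conversion=50.0, cost_per_contact=2.0)
missed = missed_revenue(counts, value_per_conversion=50.0)

print(f"Precision: {precision(counts):.0%}, recall: {recall(counts):.0%}")
print(f"Estimated campaign profit: ${profit:,.0f}")
print(f"Estimated missed revenue: ${missed:,.0f}")
```

A precision of 24% sounds poor in isolation, but under these assumed costs the campaign is still profitable, which is the kind of framing that tends to land better with a business audience.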
Apart from building machine learning models, here’s what I spend my time on:
1. Building Domain Expertise and Defining Business Metrics
As a data scientist, you need to generate metrics that matter to the business.
This is typically done by collaborating with multiple teams (product, domain experts, management) to decide on a metric definition that matters to the company.
You can then focus on monitoring and optimizing the metric using your analytical skills.
Also, as a data scientist, you must select or even create features that have business impact.
While you might’ve learned about feature selection from a data science perspective, you also need to use domain expertise to decide on the variables to be included in your model.
Simply defining success metrics and creating variables that add business impact can take days, if not weeks, since this process involves aligning with multiple teams.
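As an illustration of why metric definitions need alignment, here is a small sketch using a made-up "churn rate" metric. The customer data, the `churn_rate` function, and the 30-day and 90-day windows are all assumptions for the example; the takeaway is that the same data yields a different headline number depending on the definition the teams agree on.

```python
from datetime import date

# Hypothetical data: customer id -> date of most recent purchase.
last_purchase = {
    "c1": date(2024, 5, 28),
    "c2": date(2024, 5, 10),
    "c3": date(2024, 3, 15),
    "c4": date(2024, 1, 20),
    "c5": date(2023, 11, 5),
}


def churn_rate(purchases: dict, as_of: date, inactive_days: int) -> float:
    """Share of customers with no purchase in the last `inactive_days` days."""
    if not purchases:
        return 0.0
    churned = sum(
        1 for last in purchases.values() if (as_of - last).days > inactive_days
    )
    return churned / len(purchases)


as_of = date(2024, 6, 1)
for window in (30, 90):
    rate = churn_rate(last_purchase, as_of, window)
    print(f"Churn rate with a {window}-day inactivity window: {rate:.0%}")
```

With a 30-day window, three of the five customers count as churned; with a 90-day window, only two do. Neither definition is wrong, which is why the choice has to be settled with product and domain teams before anyone starts optimizing the number.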
2. Data Engineering
While you typically won’t be expected to have the core skill set of a data engineer, you will need to learn some engineering skills to succeed in a data science role.
This includes:
- How to perform ETL tasks and build data pipelines
- Version control
- Writing and optimizing SQL queries
- Basic cloud skills (storage, running compute jobs, cloud security)
Here is a recommended free course you can take to learn the above skills.
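For a feel of what ETL and SQL work looks like at a very small scale, here is a sketch that uses only Python's standard library. The CSV content, table name, and column names are hypothetical, and an in-memory SQLite database stands in for whatever warehouse your company actually uses.

```python
import csv
import io
import sqlite3

# Hypothetical raw export; in practice this might come from a file or an API.
RAW_CSV = """order_id,customer,amount,order_date
1,alice,20.50,2024-01-03
2,bob,,2024-01-04
3,alice,15.00,2024-01-10
4,carol,42.25,2024-02-01
"""


def extract(raw: str) -> list:
    """Extract: parse the raw CSV text into a list of dictionaries."""
    return list(csv.DictReader(io.StringIO(raw)))


def transform(rows: list) -> list:
    """Transform: drop rows with a missing amount and cast fields to proper types."""
    clean = []
    for row in rows:
        if not row["amount"]:
            continue  # skip incomplete records
        clean.append(
            (int(row["order_id"]), row["customer"], float(row["amount"]), row["order_date"])
        )
    return clean


def load(conn: sqlite3.Connection, rows: list) -> None:
    """Load: create the target table if needed and insert the cleaned rows."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders ("
        "order_id INTEGER PRIMARY KEY, customer TEXT, amount REAL, order_date TEXT)"
    )
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?, ?)", rows)
    conn.commit()


conn = sqlite3.connect(":memory:")
load(conn, transform(extract(RAW_CSV)))

# A typical analytical query: total revenue per customer, highest first.
revenue_by_customer = conn.execute(
    "SELECT customer, ROUND(SUM(amount), 2) AS revenue "
    "FROM orders GROUP BY customer ORDER BY revenue DESC"
).fetchall()
print(revenue_by_customer)
```

Real pipelines add scheduling, logging, and data-quality checks on top of this, but the extract, transform, load shape stays the same.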
3. Data Storytelling
Once you’ve gained the product and engineering skill set, you need to turn data into insight.
You’ve got to explain the results of your ML models to stakeholders across different teams.
This includes skills like building dashboards and presentations explaining your model’s results in a way that is easy to understand.
If you’d like to learn more about the topic, this tutorial breaks down the fundamentals of data storytelling.
4. Building Dashboards
You will often be expected to showcase company performance metrics and model results in the form of an interactive dashboard.
Most companies use Tableau or Power BI, and mastering these tools can help you communicate data quickly.
I’ve found that Power BI is easier to learn (especially at the initial stage), while Tableau can have a steeper learning curve.
Here is a free course I’d recommend if you’d like to learn Power BI, and here’s one on Tableau.
5. Excel
I use Excel a lot.
So do 1.5 billion people around the world.
The teams you collaborate with (management and business stakeholders) do a lot of their work on spreadsheets, and prefer to look at data in Excel.
Due to this, it is often easier to build a report comprising your model’s results and analyses in the form of a spreadsheet.
For more straightforward reporting, showcasing results in Excel saves you the time it would otherwise take to build dashboards and complex visualizations.
If you don’t already know Excel, this YouTube tutorial should help you get started.
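If your results live in Python, one low-effort way to get them in front of spreadsheet users is to export a CSV file, which Excel opens directly. The sketch below assumes some made-up per-segment model output; the segment names, field names, and the `to_csv_report` helper are hypothetical.

```python
import csv
import io

# Hypothetical model output: one row per customer segment.
segment_results = [
    {"segment": "New customers", "customers": 1200, "predicted_conversion_rate": 0.084},
    {"segment": "Returning customers", "customers": 3400, "predicted_conversion_rate": 0.031},
    {"segment": "Lapsed customers", "customers": 800, "predicted_conversion_rate": 0.125},
]


def to_csv_report(rows: list) -> str:
    """Render rows as CSV text with readable headers and formatted percentages."""
    buffer = io.StringIO(newline="")
    writer = csv.writer(buffer)
    writer.writerow(
        ["Segment", "Customers", "Predicted conversion rate", "Expected conversions"]
    )
    for row in rows:
        expected = round(row["customers"] * row["predicted_conversion_rate"])
        writer.writerow(
            [
                row["segment"],
                row["customers"],
                f"{row['predicted_conversion_rate']:.1%}",
                expected,
            ]
        )
    return buffer.getvalue()


report = to_csv_report(segment_results)
print(report)

# To share the report, write it to disk. The "utf-8-sig" encoding adds a
# byte-order mark, which helps Excel detect that the file is UTF-8.
# with open("model_report.csv", "w", newline="", encoding="utf-8-sig") as f:
#     f.write(report)
```

Renaming columns to plain-language headers and pre-formatting percentages is a small step, but it means the reader does not need to know anything about the model to use the file.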
Should you be a data scientist?
If you’ve taken a data science online course, built a machine learning model, or created a portfolio project, you might have an incomplete view of what a data scientist does.
Unfortunately, a huge part of the job is unsexy, and it requires a ton of attention to detail and collaboration.
Also, you’ve got to be able to pick up new tools and skill sets on the fly.
The skills I’ve listed above scratch the surface of what’s expected of a data scientist — you’ll often find yourself working in different cloud environments, with different types of databases and visualization tools.
In my opinion, here’s the one true defining trait of a data scientist:
You must be a lifelong learner.
Your journey doesn’t end with an online course, boot camp, or even after you’ve landed a job. You will be expected to learn and do new things every day.
If this excites you, then you should definitely attempt to enter the field.
Otherwise, you might want to find a domain that is more predictable — one that doesn’t move as quickly.
Natassha Selvaraj is a self-taught data scientist with a passion for writing. Natassha writes on everything data science-related, a true master of all data topics. You can connect with her on LinkedIn or check out her YouTube channel.
This post was originally published here.