What does a data scientist do? We spoke to one to find out more about this popular and lucrative field
Learn insights from a data scientist about what it’s like to work in the field and how to approach this career.
Data scientists process and interpret what are usually huge amounts of information to provide insights across a wide range of areas and disciplines, including marketing, social media, finance, sales and healthcare.
Data science is a growing, lucrative field with a lot of potential. Glassdoor even ranked data scientist as the best job in America for 2019 based on earning potential, job satisfaction, and number of openings. In fact, the average salary of data scientists comes to around $91K in the United States.
A career in data science doesn't just happen; the field attracts candidates with specific skills and analysis-focused backgrounds.
I spoke with one such data scientist, Sri Megha Vujjini, who works at Saggezza, a global managed services provider and technology consulting firm. She began her career at Deloitte, where she worked for a year, and then went back to school for her master's in data science. Originally interested in telecommunication engineering, she switched to data science after building algorithms for robotics.
Scott Matteson: You said that robotics sparked your interest in a career in data science. Can you tell me more about your work with algorithms for robotics and how it inspired you to enter data science?
Sri Megha Vujjini: One of the first things I did when I started working with robots was to automate a robot's direction. You could say that I built a self-driving car, but a smaller and less risky version. The concept behind it was still the same: it must move if it is safe and it must stop if it is not, almost a black-and-white situation. It becomes complicated when you add more functionality. For example, which direction should it go? Can it turn instead of stopping? Under what circumstances? All of these scenarios force you to think outside the box, because you have to consider every possibility and every factor that can influence the output.
Expand the scale and apply it to a business case, and you have a data science problem. For me it was like solving a puzzle. You ask many questions, such as "Why is this happening and how does it work?", then replicate the answers in lines of code and optimize that code. That's what led me to this field.
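The move-or-stop rule described in this answer can be sketched in a few lines of Python. This is only an illustration, not the code Vujjini wrote: the three distance readings, the 30 cm safety threshold, and the names Action and choose_action are all assumptions made for the example.

```python
from enum import Enum


class Action(Enum):
    FORWARD = "forward"
    TURN_LEFT = "turn_left"
    TURN_RIGHT = "turn_right"
    STOP = "stop"


# Hypothetical threshold: any obstacle closer than this counts as unsafe.
SAFE_DISTANCE_CM = 30.0


def choose_action(front_cm, left_cm, right_cm, safe_cm=SAFE_DISTANCE_CM):
    """Pick an action from three (assumed) distance-sensor readings."""
    # The black-and-white case: move if the path ahead is safe.
    if front_cm >= safe_cm:
        return Action.FORWARD
    # Added functionality: turn toward the more open side instead of stopping.
    side, distance = max(
        (("left", left_cm), ("right", right_cm)), key=lambda reading: reading[1]
    )
    if distance >= safe_cm:
        return Action.TURN_LEFT if side == "left" else Action.TURN_RIGHT
    # Nothing is safe: stop.
    return Action.STOP
```

Even in this toy version, one extra option (turning) already adds new conditions to check, which is the growth in complexity the answer describes.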
Scott Matteson: Can you give some examples of how you have focused on data mining, statistical modeling, pattern recognition and visualization methods during your career (or in your work today)?
Sri Megha Vujjini: A simple example is budgeting for a company, regardless of the industry. A budget is usually planned around the activities for the coming year, but it is also possible to use historical data statistically.
There was an opportunity for me to solve a puzzle in this regard. I work with the retail industry, and I was able to create a time series model around sales, promotions and external economic factors that would essentially predict sales for the coming years. With this as a baseline, numerous decisions and operations followed. The work involved recognizing the trends (more sales in March, and not just in November because of the holidays), visualizing them so they could be explained to the company, and then automating the entire solution for use whenever needed.
In short, this career is about understanding the company, understanding its problems and issues, and providing a solution using data as your backbone.
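To make the idea of forecasting sales from history concrete, here is a minimal sketch of a seasonal baseline in Python. It is not the model described in the interview, which also used promotions and external economic factors. The sales figures are invented, and the function names and the growth parameter are assumptions made for illustration.

```python
from statistics import mean


def seasonal_indices(history):
    """Return 12 monthly indices: each month's average sales / overall average.

    history is a list of years, each a list of 12 monthly sales figures.
    """
    overall = mean(value for year in history for value in year)
    return [mean(year[m] for year in history) / overall for m in range(12)]


def forecast_next_year(history, growth=0.0):
    """Forecast 12 months from the latest year's average level.

    The level is scaled by an assumed growth rate, then by each month's
    seasonal index.
    """
    level = mean(history[-1]) * (1.0 + growth)
    return [level * index for index in seasonal_indices(history)]


# Invented data: flat sales with peaks in March (index 2) and November (index 10).
year_1 = [100, 100, 140, 100, 100, 100, 100, 100, 100, 100, 180, 100]
year_2 = [110, 110, 154, 110, 110, 110, 110, 110, 110, 110, 198, 110]
history = [year_1, year_2]

indices = seasonal_indices(history)
forecast = forecast_next_year(history, growth=0.10)

# Months ranked by seasonal strength; the two peaks come first.
peak_months = sorted(range(12), key=lambda m: indices[m], reverse=True)[:2]
```

A baseline like this surfaces recurring peaks such as the March one mentioned above. A real model would add promotions and economic indicators as extra inputs.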
Scott Matteson: What is unique about data science? What kind of personality or character suits it best? What are the challenges?
Sri Megha Vujjini: Ironically, a unique thing about this field is that it does not have one specific definition. It is a broad field with divergent definitions in industry and academia. This is because it is a combination of mathematics, statistics, computer science, analysis, artificial intelligence and business. Data science is an enhanced combination of all these fields.
I don’t want to discourage anyone, but there are some traits and characteristics that make working in this area easier: problem solving, an affinity for math, probability or even puzzles, always thinking about the bigger picture, and thinking outside the box. Being organized sometimes helps, too. Data science problems can be chaotic, and the first step to solving them is usually to break them down and organize them in a kind of waterfall structure.
The only challenge, and I hope everyone in this area agrees with me, is data. The data is never perfect: it is incomplete or not what you need. It may be too small to give you any insight, or too broad to narrow down a solution. It’s always the data, but once we understand how to use it and how it works, we can use it in the best way to get all the insights we want.
Scott Matteson: What are some of the problems that are solved by data science?
Sri Megha Vujjini: Not world peace, at least not yet. But within industry, data science solutions have given us improved customer experiences and recommendation systems, faster deliveries, and smoother business operations at some companies. If we look at Amazon’s growth as an online retailer, we can identify some of the improvements and link them to the points mentioned above.
Outside of business, in daily life, we have constantly improving Google and Apple Maps, as well as groundbreaking research in medicine, physics, space travel and even self-driving cars. All these problems, and subsets of these problems, were solved with data science.
Scott Matteson: Which technological products or tools are used in this field?
Sri Megha Vujjini: A small proportion of jobs, reserved for industry veterans, do not require programming skills. Otherwise it is always good to know Python, R and SQL, because they make life easier. From a mathematical or statistical perspective we can use SAS, MATLAB, Python, R and all the rich libraries they offer. And because so much data is being moved to the cloud, it would be useful to know and understand cloud technologies. We have used Azure, AWS, Google Cloud and Snowflake, all in different capacities in the industry. In some cases, visualizations are also important, and they can be done with the help of Python and R. We can always go further and use tools such as Power BI or Tableau.