What is a Data Scientist?


Data science is the study of the generalizable extraction of knowledge from data, yet the key word is science. It builds on techniques and theories from many fields, including signal processing, mathematics, probability models, machine learning, computer programming, statistics, data engineering, pattern recognition and learning, visualization, uncertainty modeling, data warehousing, and high performance computing, with the goal of extracting meaning from data and creating data products. The term is also a buzzword: it is often used interchangeably with analytics or big data, and is often abused to market anything involving data processing, in particular to re-brand existing competitive intelligence and business analytics approaches. Data science need not always involve big data; however, the fact that data is scaling up makes big data an important aspect of data science.

A practitioner of data science is called a data scientist. Data scientists solve complex data problems by employing deep expertise in some scientific discipline. It is generally expected that data scientists are able to work with various elements of mathematics, statistics, and computer science, although expertise in all of these subjects is not required. In practice, a data scientist is most likely to be an expert in only one or two of the disciplines that data science draws on and proficient in another two or three. This means that data science must be practiced as a team whose members together have expertise and proficiency across all the disciplines.

Good data scientists are able to apply their skills to achieve a broad spectrum of end results. The abilities involved include finding and interpreting rich data sources; managing large amounts of data despite hardware, software, and bandwidth constraints; merging data sources; ensuring the consistency of datasets; creating visualizations to aid in understanding data; building mathematical models using the data; and presenting and communicating the insights and findings to specialists and scientists on their team and, if required, to a non-expert audience. The skill sets and competencies that data scientists employ vary widely. Data scientists are an integral part of competitive intelligence, a newly emerging field that encompasses a number of activities, such as data mining and analysis, that can help businesses gain a competitive edge.

Data science techniques affect how we access data and conduct research across various domains, including the biological sciences, medical informatics, the social sciences, and the humanities.

