What is Data Science?
Data science is surprisingly hard to define, especially given how ubiquitous the term has become.
Vocal critics have dismissed the term as a superfluous label (after all, what science doesn’t involve data?). But these critiques miss something important.
Data science is perhaps the best label we have for the cross-disciplinary set of skills that are becoming increasingly important in many applications across industry and academia. This cross-disciplinary piece is the key.
In VanderPlas’s opinion, the best existing definition of data science is illustrated by Drew Conway’s Data Science Venn Diagram (see the figure below), first published on his blog in September 2010.
The Data Science Venn Diagram above captures the essence of what people mean when they say “data science”: it is fundamentally an interdisciplinary subject. Data science comprises three distinct and overlapping areas:
- the skills of a statistician who knows how to model and summarize (big) datasets;
- the skills of a computer scientist who can design and use algorithms to efficiently store, process, and visualize this data; and
- the domain expertise (what we might think of as “classical” training in a subject) necessary both to formulate the right questions and to put their answers in context.
With this in mind, it is better to think of data science not as a new domain of knowledge to learn, but as a new set of skills that you can apply within your current area of expertise.
(If you want to get started with your data science journey and apply it in your area of expertise, check out this page for some useful resources that I have collected for you.)
References and Further Reading:
- VanderPlas, Jake. Python Data Science Handbook: Essential Tools for Working with Data. O’Reilly Media.
- Battle of the Data Science Venn Diagrams (pdf): a post that collects a number of diagrams relating to data science.
- A Modification of Drew Conway’s Data Science Venn Diagram (pdf)
- The Data Science Venn Diagram (pdf) (drewconway.com)
- The State of ML and Data Science 2017 | Kaggle (pdf)