John D. Kelleher

Data science seeks to improve decision-making through the analysis of data. Today, data science determines which emails are filtered into our spam folders, which adverts we see online, which books and movies are recommended to us, and even how much we pay for health insurance. This volume in the MIT Press Essential Knowledge series provides a concise overview of the field’s history, current applications, data infrastructure issues, and ethical challenges.

It has never been easier for organizations to gather, store, and process data. The use of data science is driven by, among other factors, the growth of big data and social media, the development of high-performance computing, and the emergence of powerful data analysis and modeling techniques such as deep learning. Data science encompasses a set of principles, problem definitions, algorithms, and processes for extracting non-obvious and useful patterns from large datasets. It is closely related to the fields of data mining and machine learning, but is broader in scope. This book provides a brief history of the discipline, introduces core data concepts, and describes the stages of a data science project. It considers data infrastructure and the challenges of integrating data from multiple sources, introduces the fundamentals of machine learning, and discusses how to apply machine learning expertise to real-world problems. The book also discusses ethical and legal concerns, changes in data governance, and computational methods for protecting privacy. It concludes by considering the potential future impact of data science and offering principles for successful data science projects.