Data science requires a versatile skill set, primarily for processing very large data sets, including the ‘big data’ that large enterprises produce, which may be structured, unstructured or semi-structured.
It builds on techniques and theories from many fields, including mathematics, statistics, data engineering, pattern recognition and learning, advanced computing, visualization, uncertainty modeling, data warehousing, and high-performance computing, with the goal of extracting meaning from data and creating data products.
Data Science offers an overview of the technologies needed to understand the mechanics of data science. At present, no single book describes how these technologies mesh together to provide the techniques and tools needed in this area. This book covers the mathematical background required to understand data science, including probability distributions, Bayes’ rule, random processes, Markov models, and linear and logistic regression.
It introduces programming skills commonly used for data science, including a quick introduction to R, pandas (Python data analysis), NLTK (the Python Natural Language Toolkit) and scikit-learn. It also provides a quick tour of the key concepts of information retrieval, machine learning, data mining, text analytics, artificial intelligence and predictive analytics in the context of data science, and connects the theory to practical data science problems involving these disciplines.
Having covered both the theoretical concepts and the practical programming knowledge needed for data science, the book starts you off on the “data science: what, where, and how” journey, with pointers to data science resources, courses, certifications, and applications.
As a primer on an interdisciplinary subject, ‘Data Science’ draws on scientific inquiry from a broad range of academic subject areas and guides you into areas of research such as:
- Cloud computing
- Databases and information integration
- Learning, natural language processing and information extraction
- Computer vision
- Information retrieval and web information access
- Knowledge discovery in social and information networks
- Data science security