Principles of Data Science - Third Edition: A Beginner's Guide to Essential Math and Coding Skills for Data Fluency and Machine Learning

Sinan Ozdemir

Language: English

Publisher: Packt Publishing

Published: Jan 30, 2024

Description:

Transform your data into insights with must-know techniques and mathematical concepts to unravel the secrets hidden within your data

Key Features

  • Learn practical data science combined with data theory to gain maximum insights from data
  • Discover methods for deploying actionable machine learning pipelines while mitigating biases in data and models
  • Explore actionable case studies to put your new skills to use immediately
  • Purchase of the print or Kindle book includes a free PDF eBook

Book Description

Principles of Data Science bridges mathematics, programming, and business analysis, empowering you to confidently pose and address complex data questions and construct effective machine learning pipelines. This book will equip you with the tools to transform abstract concepts and raw statistics into actionable insights.

Starting with cleaning and preparation, you'll explore effective data mining strategies and techniques before moving on to building a holistic picture of how every piece of the data science puzzle fits together. Throughout the book, you'll discover statistical models with which you can control and navigate even the densest or the sparsest of datasets and learn how to create powerful visualizations that communicate the stories hidden in your data.

With a focus on application, this edition covers advanced transfer learning and pre-trained models for NLP and vision tasks. You'll get to grips with advanced techniques for mitigating algorithmic bias in data as well as models and addressing model and data drift. Finally, you'll explore medium-level data governance, including data provenance, privacy, and deletion request handling.

By the end of this data science book, you'll have learned the fundamentals of computational mathematics and statistics, all while navigating the intricacies of modern ML and large pre-trained models like GPT and BERT.

What you will learn

  • Master the fundamental steps of data science through practical examples
  • Bridge the gap between math and programming using advanced statistics and ML
  • Harness probability, calculus, and models for effective data control
  • Explore transformative modern ML with large language models
  • Evaluate ML success with impactful metrics and MLOps
  • Create compelling visuals that convey actionable insights
  • Quantify and mitigate biases in data and ML models

Who this book is for

If you are an aspiring data scientist eager to expand your knowledge, this book is for you. Whether you have basic math skills and want to apply them in the field of data science, or you excel in programming but lack the necessary mathematical foundations, you'll find this book useful. Familiarity with Python programming will further enhance your learning experience.

Table of Contents

  1. Data Science Terminology
  2. Types of Data
  3. The Five Steps of Data Science
  4. Basic Mathematics
  5. Impossible or Improbable - A Gentle Introduction to Probability
  6. Advanced Probability
  7. What are the Chances? An Introduction to Statistics
  8. Advanced Statistics
  9. Communicating Data
  10. How to Tell if Your Toaster is Learning - Machine Learning Essentials
  11. Predictions Don't Grow on Trees, or Do They?
  12. Introduction to Transfer Learning and Pre-trained Models
  13. Mitigating Algorithmic Bias and Tackling Model and Data Drift
  14. AI Governance
  15. Navigating Real-World Data Science Case Studies in Action

About the Author

Sinan is an active lecturer focusing on large language models and a former lecturer of data science at Johns Hopkins University. He is the author of multiple textbooks on data science and machine learning, including "Quick Start Guide to LLMs". Sinan is currently the founder of LoopGenius, which uses AI to help people and businesses boost their sales, and was previously the founder of the acquired Kylie.ai, an enterprise-grade conversational AI platform with RPA capabilities. He holds a Master's Degree in Pure Mathematics from Johns Hopkins University and is based in San Francisco.