Description
Leverage the power of Python to build real-world feature engineering and machine learning pipelines ready to be deployed to production
Key Features:
- Craft powerful features from tabular, transactional, and time-series data
- Develop efficient and reproducible real-world feature engineering pipelines
- Optimize data transformation and save valuable time
- Purchase of the print or Kindle book includes a free PDF eBook
Book Description:
Streamline data preprocessing and feature engineering in your machine learning projects with the third edition of the Python Feature Engineering Cookbook, and make your data preparation more efficient.
This guide addresses common challenges, such as imputing missing values and encoding categorical variables, using practical solutions and open-source Python libraries.
You'll learn advanced techniques for transforming numerical variables, discretizing variables, and dealing with outliers. Each chapter offers step-by-step instructions and real-world examples, helping you understand when and how to apply various transformations for well-prepared data.
The book explores feature extraction from complex data types such as dates, times, and text. You'll see how to create new features through mathematical operations and decision trees and use advanced tools like Featuretools and tsfresh to extract features from relational data and time series.
By the end, you'll be ready to build reproducible feature engineering pipelines that can be easily deployed into production, optimizing data preprocessing workflows and enhancing machine learning model performance.
What You Will Learn:
- Discover multiple methods to impute missing data effectively
- Encode categorical variables while tackling high cardinality
- Find out how to properly transform, discretize, and scale your variables
- Automate feature extraction from date and time data
- Combine variables strategically to create new and powerful features
- Extract features from transactional data and time series
- Learn methods to extract meaningful features from text data
Who this book is for:
If you're a machine learning or data science enthusiast who wants to learn more about feature engineering, data preprocessing, and how to optimize these tasks, this book is for you. If you already know the basics of feature engineering and are looking to learn more advanced methods to craft powerful features, this book will help you. You should have basic knowledge of Python programming and machine learning to get started.
Table of Contents
- Imputing Missing Data
- Encoding Categorical Variables
- Transforming Numerical Variables
- Performing Variable Discretization
- Working with Outliers
- Extracting Features from Date and Time Variables
- Performing Feature Scaling
- Creating New Features
- Extracting Features from Relational Data with Featuretools
- Creating Features from a Time Series with tsfresh
- Extracting Features from Text Variables
About the Author
Soledad Galli is a lead data scientist with more than 10 years of experience in world-class academic institutions and renowned businesses. She has researched, developed, and put into production machine learning models for insurance claims, credit risk assessment, and fraud prevention. Soledad received a Data Science Leaders' award in 2018 and was named one of LinkedIn's voices in data science and analytics in 2019. She is passionate about enabling people to step into and excel in data science, which is why she mentors data scientists and speaks regularly at data science meetings. She also teaches online courses on machine learning on a prestigious Massive Open Online Course platform; these courses have reached more than 10,000 students worldwide.
Details
Year of publication: 2024
Genre: Imports, Computer Science
Category: Natural Sciences & Technology
Medium: Paperback
ISBN-13: 9781835883587
ISBN-10: 1835883583
Language: English
Binding: Softcover / Paperback
Author: Galli, Soledad
Edition: 3rd edition
Manufacturer: Packt Publishing
Responsible person for the EU: Libri GmbH, Europaallee 1, D-36244 Bad Hersfeld, gpsr@libri.de
Dimensions: 235 x 191 x 21 mm
By/With: Soledad Galli
Publication date: 30 August 2024
Weight: 0.737 kg
Item ID: 129963913
