148,95 €*
Free shipping via Post / DHL
Delivery time: 1-2 weeks
Machine learning, also known as data mining or data analytics, is a fundamental part of data science. It is used by organizations in a wide variety of arenas to turn raw data into actionable information.
Machine Learning for Business Analytics: Concepts, Techniques, and Applications in R provides a comprehensive introduction and an overview of this methodology. This best-selling textbook covers both statistical and machine learning algorithms for prediction, classification, visualization, dimension reduction, rule mining, recommendations, clustering, text mining, experimentation, and network analytics. Along with hands-on exercises and real-life case studies, it also discusses managerial and ethical issues for responsible use of machine learning techniques.
This is the second R edition of Machine Learning for Business Analytics. This edition also includes:
* A new co-author, Peter Gedeck, who brings over 20 years of experience in machine learning using R
* An expanded chapter focused on discussion of deep learning techniques
* A new chapter on experimental feedback techniques including A/B testing, uplift modeling, and reinforcement learning
* A new chapter on responsible data science
* Updates and new material based on feedback from instructors teaching MBA, Masters in Business Analytics and related programs, undergraduate, diploma and executive courses, and from their students
* A full chapter devoted to relevant case studies with more than a dozen cases demonstrating applications for the machine learning techniques
* End-of-chapter exercises that help readers gauge and expand their comprehension and competency of the material presented
* A companion website with more than two dozen data sets, and instructor materials including exercise solutions, slides, and case solutions
This textbook is an ideal resource for upper-level undergraduate and graduate level courses in data science, predictive analytics, and business analytics. It is also an excellent reference for analysts, researchers, and data science practitioners working with quantitative data in management, finance, marketing, operations management, information systems, computer science, and information technology.
Galit Shmueli, PhD, is Distinguished Professor and Institute Director at National Tsing Hua University's Institute of Service Science. She has designed and instructed business analytics courses since 2004 at University of Maryland, [...], The Indian School of Business, and National Tsing Hua University, Taiwan.
Peter C. Bruce is Founder of the Institute for Statistics Education at [...], and Chief Learning Officer at Elder Research, Inc.
Peter Gedeck, PhD, is Senior Data Scientist at Collaborative Drug Discovery and teaches at [...] and the UVA School of Data Science. His specialty is the development of machine learning algorithms to predict biological and physicochemical properties of drug candidates.
Inbal Yahav, PhD, is a Senior Lecturer in The Coller School of Management at Tel Aviv University, Israel. Her work focuses on the development and adaptation of statistical models for use by researchers in the field of information systems.
Nitin R. Patel, PhD, is Co-founder and Lead Researcher at Cytel Inc. He was also a Co-founder of Tata Consultancy Services. A Fellow of the American Statistical Association, Dr. Patel has served as a Visiting Professor at the Massachusetts Institute of Technology and at Harvard University, USA.
Foreword by Ravi Bapna xix
Foreword by Gareth James xxi
Preface to the Second R Edition xxiii
Acknowledgments xxvi
Part I Preliminaries
Chapter 1 Introduction 3
1.1 What Is Business Analytics? 3
1.2 What Is Machine Learning? 5
1.3 Machine Learning, AI, and Related Terms 5
1.4 Big Data 7
1.5 Data Science 8
1.6 Why Are There So Many Different Methods? 8
1.7 Terminology and Notation 9
1.8 Road Maps to This Book 11
Order of Topics 13
Chapter 2 Overview of the Machine Learning Process 17
2.1 Introduction 17
2.2 Core Ideas in Machine Learning 18
Classification 18
Prediction 18
Association Rules and Recommendation Systems 18
Predictive Analytics 19
Data Reduction and Dimension Reduction 19
Data Exploration and Visualization 19
Supervised and Unsupervised Learning 20
2.3 The Steps in a Machine Learning Project 21
2.4 Preliminary Steps 23
Organization of Data 23
Predicting Home Values in the West Roxbury Neighborhood 23
Loading and Looking at the Data in R 24
Sampling from a Database 26
Oversampling Rare Events in Classification Tasks 27
Preprocessing and Cleaning the Data 28
2.5 Predictive Power and Overfitting 35
Overfitting 36
Creating and Using Data Partitions 38
2.6 Building a Predictive Model 41
Modeling Process 41
2.7 Using R for Machine Learning on a Local Machine 46
2.8 Automating Machine Learning Solutions 47
Predicting Power Generator Failure 48
Uber's Michelangelo 50
2.9 Ethical Practice in Machine Learning 52
Machine Learning Software: The State of the Market (by Herb Edelstein) 53
Problems 57
Part II Data Exploration and Dimension Reduction
Chapter 3 Data Visualization 63
3.1 Uses of Data Visualization 63
Base R or ggplot? 65
3.2 Data Examples 65
Example 1: Boston Housing Data 65
Example 2: Ridership on Amtrak Trains 67
3.3 Basic Charts: Bar Charts, Line Charts, and Scatter Plots 67
Distribution Plots: Boxplots and Histograms 70
Heatmaps: Visualizing Correlations and Missing Values 73
3.4 Multidimensional Visualization 75
Adding Variables: Color, Size, Shape, Multiple Panels, and Animation 76
Manipulations: Rescaling, Aggregation and Hierarchies, Zooming, Filtering 79
Reference: Trend Lines and Labels 83
Scaling Up to Large Datasets 85
Multivariate Plot: Parallel Coordinates Plot 85
Interactive Visualization 88
3.5 Specialized Visualizations 91
Visualizing Networked Data 91
Visualizing Hierarchical Data: Treemaps 93
Visualizing Geographical Data: Map Charts 95
3.6 Major Visualizations and Operations, by Machine Learning Goal 97
Prediction 97
Classification 97
Time Series Forecasting 97
Unsupervised Learning 98
Problems 99
Chapter 4 Dimension Reduction 101
4.1 Introduction 101
4.2 Curse of Dimensionality 102
4.3 Practical Considerations 102
Example 1: House Prices in Boston 103
4.4 Data Summaries 103
Summary Statistics 104
Aggregation and Pivot Tables 104
4.5 Correlation Analysis 107
4.6 Reducing the Number of Categories in Categorical Variables 109
4.7 Converting a Categorical Variable to a Numerical Variable 111
4.8 Principal Component Analysis 111
Example 2: Breakfast Cereals 111
Principal Components 116
Normalizing the Data 117
Using Principal Components for Classification and Prediction 120
4.9 Dimension Reduction Using Regression Models 121
4.10 Dimension Reduction Using Classification and Regression Trees 121
Problems 123
Part III Performance Evaluation
Chapter 5 Evaluating Predictive Performance 129
5.1 Introduction 130
5.2 Evaluating Predictive Performance 130
Naive Benchmark: The Average 131
Prediction Accuracy Measures 131
Comparing Training and Holdout Performance 133
Cumulative Gains and Lift Charts 133
5.3 Judging Classifier Performance 136
Benchmark: The Naive Rule 136
Class Separation 136
The Confusion (Classification) Matrix 137
Using the Holdout Data 138
Accuracy Measures 139
Propensities and Threshold for Classification 139
Performance in Case of Unequal Importance of Classes 143
Asymmetric Misclassification Costs 146
Generalization to More Than Two Classes 149
5.4 Judging Ranking Performance 150
Cumulative Gains and Lift Charts for Binary Data 150
Decile-wise Lift Charts 153
Beyond Two Classes 154
Gains and Lift Charts Incorporating Costs and Benefits 154
Cumulative Gains as a Function of Threshold 155
5.5 Oversampling 156
Creating an Over-sampled Training Set 158
Evaluating Model Performance Using a Non-oversampled Holdout Set 159
Evaluating Model Performance If Only Oversampled Holdout Set Exists 159
Problems 162
Part IV Prediction and Classification Methods
Chapter 6 Multiple Linear Regression 167
6.1 Introduction 167
6.2 Explanatory vs. Predictive Modeling 168
6.3 Estimating the Regression Equation and Prediction 170
Example: Predicting the Price of Used Toyota Corolla Cars 171
Cross-validation and caret 175
6.4 Variable Selection in Linear Regression 176
Reducing the Number of Predictors 176
How to Reduce the Number of Predictors 178
Regularization (Shrinkage Models) 183
Problems 188
Chapter 7 k-Nearest Neighbors (kNN) 193
7.1 The k-NN Classifier (Categorical Outcome) 193
Determining Neighbors 194
Classification Rule 194
Example: Riding Mowers 195
Choosing k 196
Weighted k-NN 199
Setting the Cutoff Value 200
k-NN with More Than Two Classes 201
Converting Categorical Variables to Binary Dummies 201
7.2 k-NN for a Numerical Outcome 201
7.3 Advantages and Shortcomings of k-NN Algorithms 204
Problems 205
Chapter 8 The Naive Bayes Classifier 207
8.1 Introduction 207
Threshold Probability Method 208
Conditional Probability 208
Example 1: Predicting Fraudulent Financial Reporting 208
8.2 Applying the Full (Exact) Bayesian Classifier 209
Using the "Assign to the Most Probable Class" Method 210
Using the Threshold Probability Method 210
Practical Difficulty with the Complete (Exact) Bayes Procedure 210
8.3 Solution: Naive Bayes 211
The Naive Bayes Assumption of Conditional Independence 212
Using the Threshold Probability Method 212
Example 2: Predicting Fraudulent Financial Reports, Two Predictors 213
Example 3: Predicting Delayed Flights 214
Working with Continuous Predictors 218
8.4 Advantages and Shortcomings of the Naive Bayes Classifier 220
Problems 223
Chapter 9 Classification and Regression Trees 225
9.1 Introduction 226
Tree Structure 227
Decision Rules 227
Classifying a New Record 227
9.2 Classification Trees 228
Recursive Partitioning 228
Example 1: Riding Mowers 228
Measures of Impurity 231
9.3 Evaluating the Performance of a Classification Tree 235
Example 2: Acceptance of Personal Loan 236
9.4 Avoiding Overfitting 239
Stopping Tree Growth 242
Pruning the Tree 243
Best-Pruned Tree 245
9.5 Classification Rules from Trees 247
9.6 Classification Trees for More Than Two Classes 248
9.7 Regression Trees 249
Prediction 250
Measuring Impurity 250
Evaluating Performance 250
9.8 Advantages and Weaknesses of a Tree 250
9.9 Improving Prediction: Random Forests and Boosted Trees 252
Random Forests 252
Boosted Trees 254
Problems 257
Chapter 10 Logistic Regression 261
10.1 Introduction 261
10.2 The Logistic Regression Model 263
10.3 Example: Acceptance of Personal Loan 264
Model with a Single Predictor 265
Estimating the Logistic Model from Data: Computing Parameter Estimates 267
Interpreting Results in Terms of Odds (for a Profiling Goal) 270
10.4 Evaluating Classification Performance 271
10.5 Variable Selection 273
10.6 Logistic Regression for Multi-Class Classification 274
Ordinal Classes 275
Nominal Classes 276
10.7 Example of Complete Analysis: Predicting Delayed Flights 277
Data Preprocessing 282
Model-Fitting and Estimation 282
Model Interpretation 282
Model Performance 284
Variable Selection 285
Problems 289
Chapter 11 Neural Nets 293
11.1 Introduction 293
11.2 Concept and Structure of a Neural Network 294
11.3 Fitting a Network to Data 295
Example 1: Tiny Dataset 295
Computing Output of Nodes 296
Preprocessing the Data 299
Training the Model 300
Example 2: Classifying Accident Severity 304
Avoiding Overfitting 305
Using the Output for Prediction and Classification 305
11.4 Required User Input 307
11.5 Exploring the Relationship Between Predictors and Outcome 308
11.6 Deep Learning 309
Convolutional Neural Networks (CNNs) 310
Local Feature Map 311
A Hierarchy of Features 311
The Learning Process 312
Unsupervised Learning 312
Example: Classification of Fashion Images 313
Conclusion 320
11.7...
Year of publication: 2023
Genre: Mathematics
Category: Science & Technology
Medium: Book
Contents: 688 pages
ISBN-13: 9781119835172
ISBN-10: 1119835178
Language: English
Binding: Hardcover
Authors: Galit Shmueli, Inbal Yahav, Nitin R. Patel, Peter Gedeck, Peter C. Bruce
Publisher: John Wiley & Sons Inc
Dimensions: 261 x 185 x 43 mm
By: Galit Shmueli (et al.)
Publication date: February 8, 2023
Weight: 1.616 kg