Machine Learning in Action by Harrington P. (PDF)
Information about the book:
Title: Machine Learning in Action
Author: Harrington P.
Size: 4.2
Format: PDF
Year: 2012
Pages: 382
Book Contents:
acknowledgments
about this book
about the author
about the cover illustration
PART 1 CLASSIFICATION
1 Machine learning basics
1.1 What is machine learning?
Sensors and the data deluge
■ Machine learning will be more important in the future
1.2 Key terminology
1.3 Key tasks of machine learning
1.4 How to choose the right algorithm
1.5 Steps in developing a machine learning application
1.6 Why Python?
Executable pseudo-code
■ Python is popular
■ What Python has that other languages don’t have
■ Drawbacks
1.7 Getting started with the NumPy library
2 Classifying with k-Nearest Neighbors
2.1 Classifying with distance measurements
Prepare: importing data with Python
■ Putting the kNN classification algorithm into action
■ How to test a classifier
2.2 Example: improving matches from a dating site with kNN
Prepare: parsing data from a text file
■ Analyze: creating scatter plots with Matplotlib
■ Prepare: normalizing numeric values
■ Test: testing the classifier as a whole program
■ Use: putting together a useful system
2.3 Example: a handwriting recognition system
Prepare: converting images into test vectors
■ Test: kNN on handwritten digits
3 Splitting datasets one feature at a time: decision trees
3.1 Tree construction
Information gain
■ Splitting the dataset
■ Recursively building the tree
3.2 Plotting trees in Python with Matplotlib annotations
Matplotlib annotations
■ Constructing a tree of annotations
3.3 Testing and storing the classifier
Test: using the tree for classification
■ Use: persisting the decision tree
3.4 Example: using decision trees to predict contact lens type
4 Classifying with probability theory: naïve Bayes
4.1 Classifying with Bayesian decision theory
4.2 Conditional probability
4.3 Classifying with conditional probabilities
4.4 Document classification with naïve Bayes
4.5 Classifying text with Python
Prepare: making word vectors from text
■ Train: calculating probabilities from word vectors
■ Test: modifying the classifier for real-world conditions
■ Prepare: the bag-of-words document model
4.6 Example: classifying spam email with naïve Bayes
Prepare: tokenizing text
■ Test: cross validation with naïve Bayes
4.7 Example: using naïve Bayes to reveal local attitudes from personal ads
Collect: importing RSS feeds
■ Analyze: displaying locally used words
5 Logistic regression
5.1 Classification with logistic regression and the sigmoid function: a tractable step function
5.2 Using optimization to find the best regression coefficients
Gradient ascent
■ Train: using gradient ascent to find the best parameters
■ Analyze: plotting the decision boundary
■ Train: stochastic gradient ascent
5.3 Example: estimating horse fatalities from colic
Prepare: dealing with missing values in the data
■ Test: classifying with logistic regression
6 Support vector machines
6.1 Separating data with the maximum margin
6.2 Finding the maximum margin
Framing the optimization problem in terms of our classifier
■ Approaching SVMs with our general framework
6.3 Efficient optimization with the SMO algorithm
Platt’s SMO algorithm
■ Solving small datasets with the simplified SMO
6.4 Speeding up optimization with the full Platt SMO
6.5 Using kernels for more complex data
Mapping data to higher dimensions with kernels
■ The radial basis function as a kernel
■ Using a kernel for testing
6.6 Example: revisiting handwriting classification
7 Improving classification with the AdaBoost meta-algorithm
7.1 Classifiers using multiple samples of the dataset
Building classifiers from randomly resampled data: bagging
■ Boosting
7.2 Train: improving the classifier by focusing on errors
7.3 Creating a weak learner with a decision stump
7.4 Implementing the full AdaBoost algorithm
7.5 Test: classifying with AdaBoost
7.6 Example: AdaBoost on a difficult dataset
7.7 Classification imbalance
Alternative performance metrics: precision, recall, and ROC
■ Manipulating the classifier’s decision with a cost function
■ Data sampling for dealing with classification imbalance
PART 2 FORECASTING NUMERIC VALUES WITH REGRESSION
8 Predicting numeric values: regression
8.1 Finding best-fit lines with linear regression
8.2 Locally weighted linear regression
8.3 Example: predicting the age of an abalone
8.4 Shrinking coefficients to understand our data
Ridge regression
■ The lasso
■ Forward stagewise regression
8.5 The bias/variance tradeoff
8.6 Example: forecasting the price of LEGO sets
Collect: using the Google shopping API
■ Train: building a model
9 Tree-based regression
9.1 Locally modeling complex data
9.2 Building trees with continuous and discrete features
9.3 Using CART for regression
Building the tree
■ Executing the code
9.4 Tree pruning
Prepruning
■ Postpruning
9.5 Model trees
9.6 Example: comparing tree methods to standard regression
9.7 Using Tkinter to create a GUI in Python
Building a GUI in Tkinter
■ Interfacing Matplotlib and Tkinter
PART 3 UNSUPERVISED LEARNING
10 Grouping unlabeled items using k-means clustering
10.1 The k-means clustering algorithm
10.2 Improving cluster performance with postprocessing
10.3 Bisecting k-means
10.4 Example: clustering points on a map
The Yahoo! PlaceFinder API
■ Clustering geographic coordinates
11 Association analysis with the Apriori algorithm
11.1 Association analysis
11.2 The Apriori principle
11.3 Finding frequent itemsets with the Apriori algorithm
Generating candidate itemsets
■ Putting together the full Apriori algorithm
11.4 Mining association rules from frequent itemsets
11.5 Example: uncovering patterns in congressional voting
Collect: build a transaction data set of congressional voting records
■ Test: association rules from congressional voting records
11.6 Example: finding similar features in poisonous mushrooms
12 Efficiently finding frequent itemsets with FP-growth
12.1 FP-trees: an efficient way to encode a dataset
12.2 Build an FP-tree
Creating the FP-tree data structure
■ Constructing the FP-tree
12.3 Mining frequent items from an FP-tree
Extracting conditional pattern bases
■ Creating conditional FP-trees
12.4 Example: finding co-occurring words in a Twitter feed
12.5 Example: mining a clickstream from a news site
PART 4 ADDITIONAL TOOLS
13 Using principal component analysis to simplify data
13.1 Dimensionality reduction techniques
13.2 Principal component analysis
Moving the coordinate axes
■ Performing PCA in NumPy
13.3 Example: using PCA to reduce the dimensionality of semiconductor manufacturing data
14 Simplifying data with the singular value decomposition
14.1 Applications of the SVD
Latent semantic indexing
■ Recommendation systems
14.2 Matrix factorization
14.3 SVD in Python
14.4 Collaborative filtering–based recommendation engines
Measuring similarity
■ Item-based or user-based similarity?
■ Evaluating recommendation engines
14.5 Example: a restaurant dish recommendation engine
Recommending untasted dishes
■ Improving recommendations with the SVD
■ Challenges with building recommendation engines
14.6 Example: image compression with the SVD
15 Big data and MapReduce
15.1 MapReduce: a framework for distributed computing
15.2 Hadoop Streaming
Distributed mean and variance mapper
■ Distributed mean and variance reducer
15.3 Running Hadoop jobs on Amazon Web Services
Services available on AWS
■ Getting started with Amazon Web Services
■ Running a Hadoop job on EMR
15.4 Machine learning in MapReduce
15.5 Using mrjob to automate MapReduce in Python
Using mrjob for seamless integration with EMR
■ The anatomy of a MapReduce script in mrjob
15.6 Example: the Pegasos algorithm for distributed SVMs
The Pegasos algorithm
■ Training: MapReduce support vector machines with mrjob
15.7 Do you really need MapReduce?
appendix A Getting started with Python
appendix B Linear algebra
appendix C Probability refresher
appendix D Resources