

Machine Learning in Action by Peter Harrington (PDF)

Information about the book:

Title: Machine Learning in Action

Author: Peter Harrington

Size: 4.2 MB

Format: PDF

Year: 2012

Pages: 382

Book Contents:

acknowledgments 
about this book 
about the author 
about the cover illustration 
PART 1 CLASSIFICATION 
1 Machine learning basics 
1.1 What is machine learning? 
Sensors and the data deluge 
■ Machine learning will be more important in the future
1.2 Key terminology 
1.3 Key tasks of machine learning 
1.4 How to choose the right algorithm 
1.5 Steps in developing a machine learning application 
1.6 Why Python? 
Executable pseudo-code  
■ Python is popular
■ What Python has that other languages don’t have
■ Drawbacks
1.7 Getting started with the NumPy library
2 Classifying with k-Nearest Neighbors 
2.1 Classifying with distance measurements 
Prepare: importing data with Python  
■ Putting the kNN classification algorithm into action
■ How to test a classifier
2.2 Example: improving matches from a dating site with kNN 
Prepare: parsing data from a text file  
■ Analyze: creating scatter plots with Matplotlib
■ Prepare: normalizing numeric values
■ Test: testing the classifier as a whole program
■ Use: putting together a useful system
2.3 Example: a handwriting recognition system 
Prepare: converting images into test vectors 
■ Test: kNN on handwritten digits
3 Splitting datasets one feature at a time: decision trees 
3.1 Tree construction 
Information gain  
■ Splitting the dataset
■ Recursively building the tree
3.2 Plotting trees in Python with Matplotlib annotations 
Matplotlib annotations  
■ Constructing a tree of annotations
3.3 Testing and storing the classifier 
Test: using the tree for classification  
■ Use: persisting the decision tree
3.4 Example: using decision trees to predict contact lens type 
4 Classifying with probability theory: naïve Bayes 
4.1 Classifying with Bayesian decision theory 
4.2 Conditional probability 
4.3 Classifying with conditional probabilities 
4.4 Document classification with naïve Bayes 
4.5 Classifying text with Python 
Prepare: making word vectors from text  
■ Train: calculating probabilities from word vectors
■ Test: modifying the classifier for real-world conditions
■ Prepare: the bag-of-words document model
4.6 Example: classifying spam email with naïve Bayes 
Prepare: tokenizing text 
■ Test: cross validation with naïve Bayes
4.7 Example: using naïve Bayes to reveal local attitudes from personal ads 
Collect: importing RSS feeds 
■ Analyze: displaying locally used words
5 Logistic regression 
5.1 Classification with logistic regression and the sigmoid function: a tractable step function
5.2 Using optimization to find the best regression coefficients 
Gradient ascent  
■ Train: using gradient ascent to find the best parameters
■ Analyze: plotting the decision boundary
■ Train: stochastic gradient ascent
5.3 Example: estimating horse fatalities from colic 
Prepare: dealing with missing values in the data  
■ Test: classifying with logistic regression
6 Support vector machines 
6.1 Separating data with the maximum margin 
6.2 Finding the maximum margin 
Framing the optimization problem in terms of our classifier
■ Approaching SVMs with our general framework
6.3 Efficient optimization with the SMO algorithm
Platt’s SMO algorithm
■ Solving small datasets with the simplified SMO
6.4 Speeding up optimization with the full Platt SMO 
6.5 Using kernels for more complex data 
Mapping data to higher dimensions with kernels
■ The radial basis function as a kernel
■ Using a kernel for testing
6.6 Example: revisiting handwriting classification
7 Improving classification with the AdaBoost meta-algorithm 
7.1 Classifiers using multiple samples of the dataset 
Building classifiers from randomly resampled data: bagging
■ Boosting
7.2 Train: improving the classifier by focusing on errors 
7.3 Creating a weak learner with a decision stump 
7.4 Implementing the full AdaBoost algorithm 
7.5 Test: classifying with AdaBoost 
7.6 Example: AdaBoost on a difficult dataset 
7.7 Classification imbalance 
Alternative performance metrics: precision, recall, and ROC
■ Manipulating the classifier’s decision with a cost function
■ Data sampling for dealing with classification imbalance
PART 2 FORECASTING NUMERIC VALUES WITH REGRESSION 
8 Predicting numeric values: regression 
8.1 Finding best-fit lines with linear regression 
8.2 Locally weighted linear regression 
8.3 Example: predicting the age of an abalone 
8.4 Shrinking coefficients to understand our data 
Ridge regression 
■ The lasso
■ Forward stagewise regression
8.5 The bias/variance tradeoff 
8.6 Example: forecasting the price of LEGO sets 
Collect: using the Google shopping API 
■ Train: building a model
9 Tree-based regression 
9.1 Locally modeling complex data 
9.2 Building trees with continuous and discrete features 
9.3 Using CART for regression 
Building the tree 
■ Executing the code
9.4 Tree pruning
Prepruning
■ Postpruning
9.5 Model trees 
9.6 Example: comparing tree methods to standard regression 
9.7 Using Tkinter to create a GUI in Python 
Building a GUI in Tkinter 
■ Interfacing Matplotlib and Tkinter
PART 3 UNSUPERVISED LEARNING 
10 Grouping unlabeled items using k-means clustering 
10.1 The k-means clustering algorithm 
10.2 Improving cluster performance with postprocessing 
10.3 Bisecting k-means 
10.4 Example: clustering points on a map 
The Yahoo! PlaceFinder API 
■ Clustering geographic coordinates
11 Association analysis with the Apriori algorithm 
11.1 Association analysis 
11.2 The Apriori principle 
11.3 Finding frequent itemsets with the Apriori algorithm 
Generating candidate itemsets 
■ Putting together the full Apriori algorithm
11.4 Mining association rules from frequent item sets 
11.5 Example: uncovering patterns in congressional voting 
Collect: build a transaction data set of congressional voting records
■ Test: association rules from congressional voting records
11.6 Example: finding similar features in poisonous mushrooms 
12 Efficiently finding frequent itemsets with FP-growth 
12.1 FP-trees: an efficient way to encode a dataset 
12.2 Build an FP-tree 
Creating the FP-tree data structure 
■ Constructing the FP-tree
12.3 Mining frequent items from an FP-tree 
Extracting conditional pattern bases  
■ Creating conditional FP-trees
12.4 Example: finding co-occurring words in a Twitter feed 
12.5 Example: mining a clickstream from a news site 
PART 4 ADDITIONAL TOOLS 
13 Using principal component analysis to simplify data 
13.1 Dimensionality reduction techniques 
13.2 Principal component analysis 
Moving the coordinate axes  
■ Performing PCA in NumPy
13.3 Example: using PCA to reduce the dimensionality of semiconductor manufacturing data
14 Simplifying data with the singular value decomposition 
14.1 Applications of the SVD 
Latent semantic indexing 
■ Recommendation systems
14.2 Matrix factorization 
14.3 SVD in Python 
14.4 Collaborative filtering–based recommendation engines 
Measuring similarity  
■ Item-based or user-based similarity?
■ Evaluating recommendation engines
14.5 Example: a restaurant dish recommendation engine 
Recommending untasted dishes  
■ Improving recommendations with the SVD
■ Challenges with building recommendation engines
14.6 Example: image compression with the SVD 
15 Big data and MapReduce 
15.1 MapReduce: a framework for distributed computing 
15.2 Hadoop Streaming 
Distributed mean and variance mapper
■ Distributed mean and variance reducer
15.3 Running Hadoop jobs on Amazon Web Services 
Services available on AWS 
■ Getting started with Amazon Web Services
■ Running a Hadoop job on EMR
15.4 Machine learning in MapReduce 
15.5 Using mrjob to automate MapReduce in Python 
Using mrjob for seamless integration with EMR 
■ The anatomy of a MapReduce script in mrjob
15.6 Example: the Pegasos algorithm for distributed SVMs 
The Pegasos algorithm  
■ Training: MapReduce support vector machines with mrjob
15.7 Do you really need MapReduce? 
appendix A Getting started with Python 
appendix B Linear algebra 
appendix C Probability refresher 
appendix D Resources 
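
Chapter 1 pitches Python and NumPy as "executable pseudo-code," and chapter 2 opens with kNN classification. Purely to give a flavor of that style, here is a minimal sketch of a k-nearest-neighbors classifier; it is not code from the book, and the function name classify_knn and all variable names are illustrative only.

```python
import numpy as np

def classify_knn(query, dataset, labels, k=3):
    """Classify `query` by majority vote among its k nearest neighbors.

    query   -- 1-D array of feature values
    dataset -- 2-D array, one training example per row
    labels  -- sequence of class labels, one per training row
    k       -- number of neighbors that vote (odd values avoid ties)
    """
    # Euclidean distance from the query point to every training example
    distances = np.sqrt(((dataset - query) ** 2).sum(axis=1))
    # Indices of the k closest training examples
    nearest = distances.argsort()[:k]
    # Majority vote among their labels
    votes = {}
    for i in nearest:
        votes[labels[i]] = votes.get(labels[i], 0) + 1
    return max(votes, key=votes.get)

# Tiny usage example with made-up points
if __name__ == "__main__":
    data = np.array([[1.0, 1.1], [1.0, 1.0], [0.0, 0.0], [0.0, 0.1]])
    labels = ["A", "A", "B", "B"]
    print(classify_knn(np.array([0.1, 0.2]), data, labels, k=3))  # -> "B"
```

The same majority-vote idea, combined with the normalization and classifier-testing steps listed under sections 2.1 and 2.2, is what the book builds into its dating-site and handwriting-recognition examples.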

