Decision Trees, Random Forests

Decision Trees, Random Forests: get ready with Python

Learn to make and understand predictions with decision trees and random forests. Includes detailed Python demos.

Description

The lessons in this course help you master decision trees and random forests for your data analysis projects. The course focuses on decision tree classifiers and random forest classifiers because most of the successful machine learning applications appear to be classification problems.

The lessons explain:

• Decision trees for classification problems.

• Elements of growing decision trees.

• The sklearn parameters to define decision tree classifiers.

• Prediction with decision trees using Scikit-learn (fitting, pruning/tuning, investigating).

• The sklearn parameters to define random forest classifiers.

• Prediction with random forests using Scikit-learn (fitting, tuning, investigating).

• The ideas behind random forests for prediction.

• Characteristics of fitted decision trees and random forests.

• Importance of data and understanding prediction performance.

• How you can carry out a prediction project using decision trees and random forests.

Focusing on classification problems, the course uses the DecisionTreeClassifier and RandomForestClassifier classes of Python’s Scikit-learn library. It prepares you to use decision trees and random forests to make predictions and to understand the predictive structure of data sets.
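As a rough illustration of the workflow described above, here is a minimal sketch that fits one decision tree and one random forest with Scikit-learn. It is not taken from the course notebooks: the synthetic data set and the parameter values (max_depth=4, n_estimators=200) are assumptions chosen only to keep the example small.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic data stands in for a real data set (assumption for illustration).
X, y = make_classification(
    n_samples=500, n_features=8, n_informative=4, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# A single, depth-limited classification tree.
tree = DecisionTreeClassifier(max_depth=4, random_state=0)
tree.fit(X_train, y_train)

# A random forest; oob_score=True also computes the out-of-bag accuracy.
forest = RandomForestClassifier(n_estimators=200, oob_score=True, random_state=0)
forest.fit(X_train, y_train)

# Accuracy on the held-out test data.
tree_acc = tree.score(X_test, y_test)
forest_acc = forest.score(X_test, y_test)

print(f"tree test accuracy:   {tree_acc:.3f}")
print(f"forest test accuracy: {forest_acc:.3f}")
print(f"forest OOB accuracy:  {forest.oob_score_:.3f}")
print("forest feature importances:", forest.feature_importances_.round(3))
```

The fitted objects expose attributes such as feature_importances_ and, for the forest, oob_score_, which can be used to investigate the predictions.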

This is what is inside the lessons:

This course is for people who want to use decision trees or random forests for prediction with Scikit-learn. This requires practical experience, and the course provides Jupyter notebooks for reviewing and practicing the lessons’ topics.

Each lesson is a short video to watch. Most of the lessons explain something about decision trees or random forests with an example in a Jupyter notebook. The course materials include more than 50 Jupyter notebooks and the corresponding Python code. You can download the notebooks of the lessons for review. You can also use the notebooks to try other definitions of decision trees and random forests or other data for further practice.

Who is this course for?

• Professionals, students, anybody who wants to use decision trees and random forests for making predictions with data.

• Professionals, students, anybody who works with data on projects and wants to know more about decision trees or random forests after an initial experience using them.

• Professionals, students, anybody interested in doing prediction projects with the Python Scikit-learn library using decision trees or random forests.

Requirements

• You should be comfortable reading and following Python code in Jupyter notebooks that show data descriptions, estimation or model fitting, and data analysis output (using the Python libraries pandas, numpy, scikit-learn and matplotlib).

• To fully benefit from the course you should be able to run the Jupyter notebooks or Python programs of the lessons.

• You’ll need to know some elementary statistics to follow all the lessons (random variable, probability distribution, histogram, boxplot). The lessons are easier to follow if you already have some general idea of supervised learning or classification problems.

What you will learn

• Learn how decision trees and random forests make their predictions.

• Learn how to use Scikit-learn for prediction with decision trees and random forests and for understanding the predictive structure of data sets.

• Learn how to do your own prediction project with decision trees and random forests using Scikit-learn.

• Learn about each parameter of Scikit-learn’s DecisionTreeClassifier and RandomForestClassifier classes to define your decision tree or random forest.

• Learn to use the output of Scikit-learn’s DecisionTreeClassifier and RandomForestClassifier classes to investigate and understand your predictions.

• Learn how to work with imbalanced class values in the data and how noisy data can affect random forests’ prediction performance.

• Growing decision trees: node splitting, node impurity, Gini diversity, entropy, impurity reduction, feature thresholds.

• Improving decision trees: cross-validation, grid/randomized search, tuning and minimum cost-complexity pruning, evaluating feature importance.

• Creating random forests: bootstrapping, bagging, random feature selection, decorrelation of tree predictions.

• Improving random forests: cross-validation, grid/randomized search, tuning, out-of-bag scoring, calibration of probability estimates.
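To make the terms node impurity, Gini diversity, entropy and impurity reduction from the list above concrete, here is a small sketch in plain Python. The function names and the toy labels are hypothetical and not part of Scikit-learn or the course materials; the formulas are the standard definitions.

```python
from collections import Counter
from math import log2


def gini(labels):
    """Gini diversity index: 1 minus the sum of squared class proportions."""
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def entropy(labels):
    """Shannon entropy (in bits) of the class proportions."""
    n = len(labels)
    return -sum((count / n) * log2(count / n) for count in Counter(labels).values())


def impurity_reduction(parent, left, right, impurity=gini):
    """Parent impurity minus the size-weighted impurity of the two child nodes."""
    n = len(parent)
    weighted_children = (
        len(left) / n * impurity(left) + len(right) / n * impurity(right)
    )
    return impurity(parent) - weighted_children


# Toy node with 4 observations of each class, split into two child nodes.
parent = [0, 0, 0, 0, 1, 1, 1, 1]
left = [0, 0, 0, 1]
right = [0, 1, 1, 1]

print("Gini of parent:", gini(parent))
print("Entropy of parent:", entropy(parent))
print("Gini reduction of the split:", impurity_reduction(parent, left, right))
print(
    "Entropy reduction of the split:",
    impurity_reduction(parent, left, right, impurity=entropy),
)
```

A tree-growing algorithm compares candidate splits (feature and threshold pairs) by this kind of reduction and picks the largest one. Scikit-learn's min_impurity_decrease criterion additionally weights the reduction by the fraction of all samples that reach the node.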

This course includes:

• 3.5 hours of video on demand

• 62 downloadable resources

• Lifetime access

• Access on mobile devices and TV

• Certificate of completion

Course content

Classification and Decision Trees

• Introduction

• Software

• Classification

• Purposes of classification

• Classification and decision trees

• End of this section

Decision trees

• Introduction

• Introduction to decision trees

• Data partitioning

• Learning

• An additional node split

• Impurity

• Quality of node splits

• Another classification problem

• Data preparation

• Fitting the tree

• Plotting the tree

• Binary splits

• The Gini diversity index

• Growing a decision tree

• A note on the Random Forest Classifier

• The DecisionTreeClassifier method

• The criterion parameter

• The splitter parameter

• The max_depth parameter

• The min_samples_split parameter

• The min_samples_leaf parameter

• The class_weight parameter

• The min_weight_fraction_leaf parameter

• The random_state parameter

• The max_features parameter

• The max_leaf_nodes parameter

• The min_impurity_decrease parameter

• The ccp_alpha parameter

• Minimum cost-complexity pruning

• Prediction with a classification tree

• Cross-validation and prediction

• Pruning a tree and prediction

• Tuning and cross-validation

• Pruning a tree with ‘optimized’ parameters

• Feature importance

• Attributes of DecisionTreeClassifier

• The tree_ object of DecisionTreeClassifier

• Advantages and disadvantages of decision trees

• End of this section

Random Forests

• Introduction

• A bootstrap example

• Bagging 15 classification trees

• Random forests and decorrelation

• The RandomForestClassifier method

• The n_estimators parameter

• The bootstrap and oob_score parameters

• The max_samples parameter

• The warm_start parameter

• The n_jobs parameter

• The verbose parameter

• Tuning a random forest

• Attributes of the RandomForestClassifier method

• Advantages and disadvantages of Random Forests

• Random forests and logistic regression

• Random forests and probabilities

• Weighted random forests and imbalanced data

• Over-sampling and under-sampling

• Balanced random forests

• Random forests and noise in features

• Random forests and noise in class values

• End of this section

Application: online purchases

• Introduction

• Why predict?

• Available data

• A closer look at the data set

• A closer look at the analytics information

• Fitting and pruning a decision tree

• Extracting a prediction rule for one observation

• The contribution of using features for predicting a purchase

• Importance of features to the best decision tree

• Importance of features to selecting the best tree

• Fitting and tuning a random forest

• Dissecting a prediction by the random forest

• Importance of features to the random forest

• Importance of features to the random forest algorithm

• How the important features predict a purchase

• What if we use recall and balanced accuracy?

• Fitting a balanced random forest

• Purchase probabilities

• A simple prediction rule?

• Data measurement and selection issues

• Prediction in practice: two relevant issues

• End of this section

End of this course

• End of this course
