Random Forest is an ensemble method capable of performing both regression and classification tasks using multiple decision trees and a technique called Bootstrap Aggregation, commonly known as bagging. The basic idea is to combine the predictions of multiple decision trees when determining the final result, rather than relying on a single decision tree. Fitting works as follows:
1. Select K data points at random from the training set.
2. Build a decision tree associated with these K data points.
3. Choose the number of trees (Ntree) you want to build and repeat steps 1 and 2.
4. For a new data point, have each of your Ntree trees predict the Y value for that point, and assign the new data point the average of all predicted Y values.
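The steps above can be sketched in plain NumPy. This is a toy illustration of bootstrap aggregation only: it uses a hand-rolled depth-1 regression stump as the base learner instead of a full decision tree, and all names (`fit_stump`, `bagged_fit`) are illustrative, not part of any library.

```python
import numpy as np

def fit_stump(x, y):
    # Base learner: a depth-1 "tree" that tries every midpoint split
    # and keeps the one minimising the squared error.
    order = np.argsort(x)
    xs, ys = x[order], y[order]
    best = None
    for i in range(1, len(xs)):
        thr = (xs[i - 1] + xs[i]) / 2
        left, right = ys[:i].mean(), ys[i:].mean()
        err = ((ys[:i] - left) ** 2).sum() + ((ys[i:] - right) ** 2).sum()
        if best is None or err < best[0]:
            best = (err, thr, left, right)
    _, thr, left, right = best
    return lambda q: np.where(q < thr, left, right)

def bagged_fit(x, y, n_trees=25, seed=0):
    rng = np.random.default_rng(seed)
    trees = []
    for _ in range(n_trees):
        # Steps 1-2: draw a bootstrap sample and fit a tree to it.
        idx = rng.integers(0, len(x), len(x))
        trees.append(fit_stump(x[idx], y[idx]))
    # Step 4: the ensemble prediction is the average over all trees.
    return lambda q: np.mean([t(q) for t in trees], axis=0)

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([1.0, 1.0, 1.0, 5.0, 5.0, 5.0])
predict = bagged_fit(x, y)
print(predict(np.array([2.0, 5.0])))  # low prediction near 1, high near 5
```

Averaging over many bootstrap-trained trees is what reduces the variance of a single overfit tree; in practice you would use a real decision tree (e.g. scikit-learn's) as the base learner.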
Below is a step-by-step Python implementation. Step 1: Import the required libraries.
# Library import
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
Step 2: Import and print the dataset.
data = pd.read_csv('Salaries.csv')
print(data)
Step 3: Select all rows of column 1 from the dataset as x, and all rows of column 2 as y.
x = data.iloc[:, 1:2].values
print(x)
y = data.iloc[:, 2].values
Step 4: Fit the Random Forest regressor to the dataset.
# Fitting Random Forest Regression to the dataset
# import the regressor
from sklearn.ensemble import RandomForestRegressor
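To make this step self-contained, here is a minimal sketch of fitting `RandomForestRegressor`. Since `Salaries.csv` is not reproduced here, it uses fabricated position-level/salary data purely to exercise the API; the column values are assumptions, not the real dataset.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Fabricated stand-in for the Salaries.csv columns: position level vs. salary.
x = np.arange(1, 11).reshape(-1, 1)
y = np.array([45, 50, 60, 80, 110, 150, 200, 300, 500, 1000]) * 1000.0

# n_estimators is the Ntree from the steps above; random_state fixes the
# bootstrap sampling for reproducibility.
regressor = RandomForestRegressor(n_estimators=10, random_state=0)
regressor.fit(x, y)

# Predict the salary for a new position level.
pred = regressor.predict([[6.5]])
print(pred)
```

Each of the 10 trees is trained on a bootstrap sample, and `predict` returns the average of their individual predictions, exactly as described in step 4.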