
# Random forest regression in Python


Random forest is an ensemble method that can perform both regression and classification. It builds multiple decision trees using a technique called bootstrap aggregation, commonly known as bagging. The basic idea is to combine the outputs of many decision trees to determine the final result, rather than relying on a single decision tree.
Approach:

• Pick K data points at random (with replacement) from the training set.
• Build a decision tree on these K data points.
• Choose the number of trees, Ntree, that you want to build and repeat the first two steps for each tree.
• For a new data point, have each of the Ntree trees predict the Y value, and assign the new data point the average of all predicted Y values.
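The steps above can be sketched by hand with scikit-learn's `DecisionTreeRegressor`. This is only a minimal illustration of bagging, not how `RandomForestRegressor` is implemented internally. The toy data, `n_trees`, and `k` are assumptions made up for this example.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)

# Hypothetical toy data: one feature, noisy quadratic target.
X = np.linspace(1, 10, 40).reshape(-1, 1)
y = X.ravel() ** 2 + rng.normal(scale=2.0, size=X.shape[0])

n_trees = 25        # Ntree: number of trees to build (assumed value)
k = X.shape[0]      # K: size of each bootstrap sample (assumed value)

trees = []
for _ in range(n_trees):
    # Draw K row indices with replacement, then fit one tree on that sample.
    idx = rng.integers(0, X.shape[0], size=k)
    tree = DecisionTreeRegressor(random_state=0)
    trees.append(tree.fit(X[idx], y[idx]))

# For a new data point, average the predictions of all trees.
x_new = np.array([[6.5]])
per_tree = np.array([t.predict(x_new)[0] for t in trees])
y_hat = per_tree.mean()
print(y_hat)
```

Each tree sees a slightly different sample, so the individual predictions differ. Averaging them smooths out the variance of any single tree.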

Below is a step-by-step Python implementation.
Step 1: Import the required libraries.

```python
# Import libraries
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
```

Step 2: Import and print dataset

```python
data = pd.read_csv('Salaries.csv')
print(data)
```

Step 3: Select all rows of column 1 as x and all rows of column 2 as y

```python
x = data.iloc[:, 1:2].values
print(x)
y = data.iloc[:, 2].values
```
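The two `iloc` calls return different shapes, which matters because scikit-learn expects a 2-D feature array and a 1-D target. The sketch below shows the difference on a small stand-in frame. Its column names and values are made up for illustration and are not taken from `Salaries.csv`.

```python
import pandas as pd

# Hypothetical stand-in data: column names and values are assumptions.
data = pd.DataFrame({
    "Position": ["Analyst", "Manager", "Director"],
    "Level": [1, 2, 3],
    "Salary": [45000, 60000, 110000],
})

x = data.iloc[:, 1:2].values   # slice 1:2 keeps the column axis -> shape (n_rows, 1)
y = data.iloc[:, 2].values     # single index drops the column axis -> shape (n_rows,)

print(x.shape)  # (3, 1)
print(y.shape)  # (3,)
```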

Step 4: Fit the random forest regressor to the dataset

```python
# Fit random forest regression to the dataset
# import the regressor
from sklearn.ensemble import RandomForestRegressor

# create the regressor object
regressor = RandomForestRegressor(n_estimators=100, random_state=0)

# fit the regressor to the x and y data
regressor.fit(x, y)
```
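To connect the fitted model back to the averaging idea, the fitted trees can be inspected through the `estimators_` attribute. The sketch below uses made-up toy data (an assumption, not the salary dataset) and checks that the forest's prediction matches the mean of the individual tree predictions.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Hypothetical toy data standing in for the real dataset.
x = np.arange(1, 11, dtype=float).reshape(-1, 1)
y = x.ravel() ** 2

regressor = RandomForestRegressor(n_estimators=100, random_state=0)
regressor.fit(x, y)

x_new = np.array([[6.5]])

# Prediction of every individual tree in the forest.
tree_preds = np.array([tree.predict(x_new)[0] for tree in regressor.estimators_])

# Prediction of the forest as a whole.
forest_pred = regressor.predict(x_new)[0]

print(len(regressor.estimators_))
print(np.isclose(tree_preds.mean(), forest_pred))
```

`n_estimators` sets how many trees are built, and `random_state` fixes the bootstrap sampling so that repeated runs give the same forest.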

Step 5: Predict a new result

```python
# predict expects a 2-D array of shape (n_samples, n_features)
y_pred = regressor.predict(np.array([[6.5]]))  # check the output by changing the value
```
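In recent scikit-learn versions, `predict` expects a 2-D array of shape `(n_samples, n_features)`, so a bare scalar or 1-D array is rejected. The sketch below, on made-up toy data (an assumption for illustration), shows predictions for one value and for several values at once.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Hypothetical toy data standing in for the real dataset.
x = np.arange(1, 11, dtype=float).reshape(-1, 1)
y = x.ravel() ** 2
regressor = RandomForestRegressor(n_estimators=100, random_state=0).fit(x, y)

# One sample with one feature: shape (1, 1).
single = regressor.predict(np.array([[6.5]]))

# Several samples: reshape a flat list of values into a column.
levels = np.array([2.5, 6.5, 9.0]).reshape(-1, 1)
several = regressor.predict(levels)

print(single.shape, several.shape)

# A 1-D input raises an error in recent scikit-learn versions.
try:
    regressor.predict(np.array([6.5]))
except ValueError as err:
    print("1-D input rejected:", type(err).__name__)
```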

Step 6: Visualize the result

```python
# Visualize the random forest regression results

# arange creates a range of values from the minimum
# to the maximum of x, with a step of 0.01
# between two consecutive values
X_grid = np.arange(x.min(), x.max(), 0.01)

# reshape into an array of shape (len(X_grid), 1),
# i.e. make a column from the X_grid values
X_grid = X_grid.reshape((len(X_grid), 1))

# Scatter plot of the source data
plt.scatter(x, y, color='blue')

# Plot of the predicted data
plt.plot(X_grid, regressor.predict(X_grid), color='green')
plt.title('Random Forest Regression')
plt.xlabel('Position level')
plt.ylabel('Salary')
plt.show()
```
