# ML | Logistic Regression vs. Decision Tree Classification

We can compare the two algorithms on several criteria:

| Criteria | Logistic Regression | Decision Tree Classification |
|---|---|---|
| Interpretability | Less interpretable | More interpretable |
| Decision boundaries | Linear and single decision boundary | Bisects the space into smaller spaces |
| Ease of decision making | A decision threshold has to be set | Automatically handles decision making |
| Overfitting | Not prone to overfitting | Prone to overfitting |
| Robustness to noise | Robust to noise | Majorly affected by noise |
| Scalability | Requires a large enough training set | Can be trained on a small training set |
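To make the "decision threshold" row concrete, here is a minimal sketch of how the same fitted logistic regression model yields different labels depending on the cutoff applied to its predicted probabilities. The synthetic data from `make_classification`, the helper `predict_with_threshold`, and the 0.5 and 0.8 cutoffs are all assumptions chosen for illustration; they are not part of the experiment in this article.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Synthetic two-class data (an assumption for illustration, not the article's dataset)
X, y = make_classification(n_samples=200, n_features=4, random_state=0)

lr = LogisticRegression().fit(X, y)

# Predicted probability of the positive class for every sample
proba = lr.predict_proba(X)[:, 1]


def predict_with_threshold(proba, threshold):
    """Hypothetical helper: label a sample 1 when its probability is >= threshold."""
    return (proba >= threshold).astype(int)


default_preds = predict_with_threshold(proba, 0.5)  # the usual cutoff
strict_preds = predict_with_threshold(proba, 0.8)   # a stricter, illustrative cutoff

print("Positives at threshold 0.5:", default_preds.sum())
print("Positives at threshold 0.8:", strict_preds.sum())
```

Raising the threshold can only keep or reduce the number of samples labelled positive. A decision tree, by contrast, assigns a class directly at each leaf, so no such cutoff has to be chosen by hand.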

As a simple experiment, we run two models on the same dataset and compare their characteristics.

Step 1: Import the required libraries

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
```

Step 2: Read and clean the dataset

```python
# Assumes '_train.csv' is in the current working directory. The original
# changed to the file location first with:
#   cd C:\Users\Dev\Desktop\Kaggle\Sinking Titanic
df = pd.read_csv('_train.csv')

y = df['Survived']
X = df.drop('Survived', axis=1)
X = X.drop(['Name', 'Ticket', 'Cabin', 'Embarked'], axis=1)

# Encode the categorical 'male'/'female' values as integers
X = X.replace(['male', 'female'], [2, 3])

# Handle missing values by forward-filling from the previous row
X = X.ffill()
```

Step 3: Train and evaluate the logistic regression model

```python
# Hold out 30% of the data for testing
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0)

lr = LogisticRegression()
lr.fit(X_train, y_train)

# Mean accuracy on the test set
print(lr.score(X_test, y_test))
```

Step 4: Train and evaluate the decision tree classifier model

```python
criteria = ['gini', 'entropy']
scores = {}

for c in criteria:
    dt = DecisionTreeClassifier(criterion=c)
    dt.fit(X_train, y_train)
    test_score = dt.score(X_test, y_test)
    # Store the test accuracy under its splitting criterion
    scores[c] = test_score

print(scores)
```

Comparing the scores, we see that the logistic regression model performed better on this dataset, but this may not always be the case.
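One way to probe the claim that decision trees are prone to overfitting is to compare training and test accuracy for a fully grown tree and a depth-limited one. The sketch below is an illustration under stated assumptions, not part of the original experiment: the data is synthetic with deliberately noisy labels (`flip_y=0.2`), and the `max_depth=3` limit is an arbitrary choice.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic data with 20% of labels flipped at random (an assumption for illustration)
X_demo, y_demo = make_classification(
    n_samples=500, n_features=10, n_informative=4, flip_y=0.2, random_state=0)
Xd_train, Xd_test, yd_train, yd_test = train_test_split(
    X_demo, y_demo, test_size=0.3, random_state=0)

results = {}
for label, depth in [("unlimited depth", None), ("max_depth=3", 3)]:
    tree = DecisionTreeClassifier(max_depth=depth, random_state=0)
    tree.fit(Xd_train, yd_train)
    # Record (training accuracy, test accuracy) for this tree
    results[label] = (tree.score(Xd_train, yd_train),
                      tree.score(Xd_test, yd_test))

for label, (train_acc, test_acc) in results.items():
    print(f"{label}: train={train_acc:.3f}, test={test_acc:.3f}")
```

A large gap between the training and test accuracy of the unrestricted tree is the usual sign of overfitting. Limiting the depth is one common way to narrow that gap, although the best setting depends on the data.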
