Friends, welcome to my channel. Dhanesh here.
In this video, we will see how to develop a text classification model with multiple outputs.
We will be developing a text classification model that analyzes a textual comment and predicts multiple labels associated with the comment.
The multi-label classification problem is actually a special case of the multiple-output model.
At the end of this video you will be able to perform multi-label text classification on your data.
The approach explained in this video can be extended to perform general multi-label classification.
For instance you can solve a classification problem where you have an image as input and you want to predict the image category and image description.
At this point, it is important to explain the difference between a multi-class classification problem and a multi-label classification problem.
In a multi-class classification problem, an instance or a record can belong to one and only one of the multiple output classes.
For instance, in the sentiment analysis problem that we studied in the last video, a text review could be either "good", "bad", or "average".
It could not be both "good" and "average" at the same time.
On the other hand, in multi-label classification problems, an instance can have multiple output labels at the same time.
For instance, in the text classification problem that we are going to solve in this video, a comment can carry multiple tags at the same time, such as "toxic", "obscene", and "insult".
The Dataset

The dataset contains comments from Wikipedia's talk page edits.
There are six output labels for each comment: toxic, severe_toxic, obscene, threat, insult and identity_hate.
A comment can belong to all of these categories or a subset of these categories, which makes it a multi-label classification problem.
The dataset for this video can be downloaded from this Kaggle link.
We will only use the "train.csv" file, which contains about 160,000 records.
Download the CSV file into your local directory.
I have renamed the file as "toxic_comments.csv".
You can give it any name, but just be sure to use that name in your code.
Let's now import the required libraries and load the dataset into our application.
The following script imports the required libraries and loads the dataset into memory. It then displays the shape of the dataset and prints its header. The dataset contains 159,571 records and 8 columns.
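As a sketch, loading and inspecting the data could look like the following. The filename toxic_comments.csv comes from the renaming step above; the small stand-in frame in the fallback branch is purely an assumption for illustration, so the snippet stays runnable without the Kaggle file:

```python
import pandas as pd

try:
    # Load the Kaggle "train.csv", renamed to toxic_comments.csv earlier
    toxic_comments = pd.read_csv("toxic_comments.csv")
except FileNotFoundError:
    # Tiny stand-in frame with the same 8-column schema (illustrative only)
    toxic_comments = pd.DataFrame({
        "id": ["0001", "0002"],
        "comment_text": ["You are a nice person.", "You are an idiot!"],
        "toxic": [0, 1], "severe_toxic": [0, 0], "obscene": [0, 1],
        "threat": [0, 0], "insult": [0, 1], "identity_hate": [0, 0],
    })

print(toxic_comments.shape)   # (159571, 8) for the full dataset
print(toxic_comments.head())
```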
As the header shows, the comment_text column contains the text comments.
Let's print a random comment and then look at its labels.
This is clearly a toxic comment.
Let's look at the labels associated with this comment. Next, let's plot the comment count for each label.
To do so, we will first filter all the label or output columns.
Using the toxic_comments_labels dataframe, we will plot bar plots showing the total comment counts for the different labels.
You can see that the "toxic" label has the highest frequency of occurrence, followed by "obscene" and "insult", respectively.
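A sketch of filtering the label columns and plotting their counts; the small stand-in frame here is illustrative, and on the real data you would slice the loaded dataframe instead:

```python
import pandas as pd

# Stand-in frame with the dataset's schema (illustrative values only)
toxic_comments = pd.DataFrame({
    "id": ["0001", "0002", "0003"],
    "comment_text": ["ok", "bad", "worse"],
    "toxic": [0, 1, 1], "severe_toxic": [0, 0, 1], "obscene": [0, 0, 1],
    "threat": [0, 0, 0], "insult": [0, 1, 0], "identity_hate": [0, 0, 0],
})

# Keep only the six output (label) columns
toxic_comments_labels = toxic_comments[["toxic", "severe_toxic", "obscene",
                                        "threat", "insult", "identity_hate"]]

# Total number of comments per label
label_counts = toxic_comments_labels.sum(axis=0)
print(label_counts)

# Bar plot of the counts (requires matplotlib)
try:
    import matplotlib
    matplotlib.use("Agg")  # non-interactive backend
    ax = label_counts.plot.bar(figsize=(8, 6))
    ax.figure.savefig("label_counts.png")
except ImportError:
    pass  # plotting is optional for this sketch
```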
We have successfully analyzed our dataset. In the next section, we will create multi-label classification models using it.
Creating Multi-label Text Classification Models

There are two ways to create multi-label classification models: using a single dense output layer, or using multiple dense output layers.
In the first approach, we can use a single dense layer with six outputs, a sigmoid activation function, and the binary cross-entropy loss function.
Each neuron in the output dense layer will represent one of the six output labels.
The sigmoid activation function will return a value between 0 and 1 for each neuron.
If any neuron's output value is greater than 0.5, it is assumed that the comment belongs to the class represented by that particular neuron.
In the second approach, we will create one dense output layer for each label, giving a total of six dense output layers.
Each layer will have its own sigmoid activation function.
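The second architecture can be sketched with the Keras functional API. The vocabulary size, sequence length, and embedding dimension below are assumed placeholder values, not figures from the video:

```python
from tensorflow.keras.layers import Input, Embedding, LSTM, Dense
from tensorflow.keras.models import Model

vocab_size, maxlen, embed_dim = 10000, 200, 100  # assumed hyper-parameters

inputs = Input(shape=(maxlen,))
x = Embedding(vocab_size, embed_dim)(inputs)
x = LSTM(128)(x)

labels = ["toxic", "severe_toxic", "obscene", "threat", "insult", "identity_hate"]
# One sigmoid output layer (a single neuron) per label
outputs = [Dense(1, activation="sigmoid", name=label)(x) for label in labels]

model = Model(inputs=inputs, outputs=outputs)
# The same binary cross-entropy loss is applied to every output head
model.compile(loss="binary_crossentropy", optimizer="adam", metrics=["accuracy"])
print(len(model.outputs))  # 6
```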
Multi-label Text Classification Model with Single Output Layer

In this section, we will create a multi-label text classification model with a single output layer.
As always, the first step in the text classification model is to create a function responsible for cleaning the text.
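One possible cleaning function, using simple regular expressions; the exact rules here are an assumption, so adapt them to your data:

```python
import re

def preprocess_text(text):
    """Lower-case a comment and strip everything except plain words."""
    text = re.sub(r"[^a-zA-Z]", " ", text)      # remove punctuation and numbers
    text = re.sub(r"\b[a-zA-Z]\b", " ", text)   # remove single characters
    text = re.sub(r"\s+", " ", text)            # collapse multiple spaces
    return text.strip().lower()

print(preprocess_text("You're SO rude!!! 123"))  # -> "you re so rude"
```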
In the next step we will create our input and output set.
The input is the comment from the comment_text column.
We will clean all the comments and will store them in the X variable.
The labels or outputs have already been stored in the toxic_comments_labels dataframe.
We will use that dataframe's values to store the output in the y variable. Look at the following script. Note that we do not need to perform any one-hot encoding here, because our output labels are already in the form of binary 0/1 vectors.
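A sketch of building X and y; the small stand-in frame and the cleaning rules are assumptions for illustration, and on the real data you would use the loaded toxic_comments dataframe:

```python
import re
import pandas as pd

# Stand-in frame with the dataset's schema (illustrative values only)
toxic_comments = pd.DataFrame({
    "comment_text": ["You are a nice person.", "You are an IDIOT!!!"],
    "toxic": [0, 1], "severe_toxic": [0, 0], "obscene": [0, 1],
    "threat": [0, 0], "insult": [0, 1], "identity_hate": [0, 0],
})

def preprocess_text(text):
    text = re.sub(r"[^a-zA-Z]", " ", text)  # keep letters only
    text = re.sub(r"\s+", " ", text)        # collapse multiple spaces
    return text.strip().lower()

# Inputs: one cleaned comment per record
X = [preprocess_text(comment) for comment in toxic_comments["comment_text"]]

# Outputs: the six label columns, already binary 0/1 vectors
toxic_comments_labels = toxic_comments[["toxic", "severe_toxic", "obscene",
                                        "threat", "insult", "identity_hate"]]
y = toxic_comments_labels.values

print(X[1])       # "you are an idiot"
print(y.shape)    # (2, 6)
```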
In the next step, we will divide our data into training and test sets. We also need to convert the text inputs into embedded vectors.
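A sketch of the split and tokenization step; the toy corpus, the maximum length of 200 tokens, and the 10,000-word vocabulary are assumptions for illustration:

```python
import numpy as np
from sklearn.model_selection import train_test_split
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences

# Toy corpus standing in for the cleaned comments (X) and 6-label matrix (y)
X = ["you are a nice person", "you are an idiot",
     "what a lovely day", "shut up you fool"]
y = np.array([[0, 0, 0, 0, 0, 0], [1, 0, 0, 0, 1, 0],
              [0, 0, 0, 0, 0, 0], [1, 0, 1, 0, 1, 0]])

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42)

maxlen = 200  # assumed maximum comment length, in tokens
tokenizer = Tokenizer(num_words=10000)
tokenizer.fit_on_texts(X_train)  # build the vocabulary on training data only

# Convert words to integer indices and pad every sequence to the same length
X_train = pad_sequences(tokenizer.texts_to_sequences(X_train),
                        padding="post", maxlen=maxlen)
X_test = pad_sequences(tokenizer.texts_to_sequences(X_test),
                       padding="post", maxlen=maxlen)

print(X_train.shape)  # (3, 200)
```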
To understand word embeddings in detail, please refer to my previous videos. We will be using GloVe word embeddings to convert the text inputs into their numeric counterparts.
The following script creates the model.
Our model will have one input layer, one embedding layer, one LSTM layer with 128 neurons, and one output layer with 6 neurons, since we have 6 labels in the output.
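The single-output architecture can be sketched like this; the vocabulary size, sequence length, and embedding dimension are assumed placeholder values:

```python
from tensorflow.keras.layers import Input, Embedding, LSTM, Dense
from tensorflow.keras.models import Model

vocab_size, maxlen, embed_dim = 10000, 200, 100  # assumed hyper-parameters

inputs = Input(shape=(maxlen,))
# trainable=False would keep pretrained GloVe weights fixed; pass
# weights=[embedding_matrix] here once the matrix has been built
x = Embedding(vocab_size, embed_dim, trainable=False)(inputs)
x = LSTM(128)(x)
# One dense layer, six sigmoid neurons: one per label
outputs = Dense(6, activation="sigmoid")(x)

model = Model(inputs=inputs, outputs=outputs)
model.compile(loss="binary_crossentropy", optimizer="adam", metrics=["acc"])
model.summary()
```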
Let's print the model summary. We will train our model for 5 epochs.
You can train the model with more epochs and see if you get better or worse results.
After training for all 5 epochs, let's evaluate our model on the test set. Our model achieves an accuracy of around 98%, which is pretty impressive.
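An end-to-end training and evaluation sketch. The tiny random arrays below stand in for the real padded sequences and label matrix so that the snippet runs quickly; the batch size and epoch count are assumptions, and in practice you would use the real data with 5 epochs:

```python
import numpy as np
from tensorflow.keras.layers import Input, Embedding, LSTM, Dense
from tensorflow.keras.models import Model

# Tiny random stand-in data (replace with the real sequences and labels)
vocab_size, maxlen = 50, 20
X_train = np.random.randint(1, vocab_size, size=(32, maxlen))
y_train = np.random.randint(0, 2, size=(32, 6))
X_test = np.random.randint(1, vocab_size, size=(8, maxlen))
y_test = np.random.randint(0, 2, size=(8, 6))

inputs = Input(shape=(maxlen,))
x = Embedding(vocab_size, 16)(inputs)
x = LSTM(8)(x)
outputs = Dense(6, activation="sigmoid")(x)
model = Model(inputs, outputs)
model.compile(loss="binary_crossentropy", optimizer="adam", metrics=["acc"])

# Hold out part of the training data as a validation set
history = model.fit(X_train, y_train, epochs=1, batch_size=8,
                    validation_split=0.25, verbose=0)

loss, acc = model.evaluate(X_test, y_test, verbose=0)
print("Test loss:", loss, "Test accuracy:", acc)
```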
Finally, we will plot the loss and accuracy values for training and test sets to see if our model is overfitting.
From the output, you can see that the validation accuracy does not improve after the first epoch, while the difference between training and validation accuracy remains very small. This suggests that the model starts to overfit after the first epoch, so further training yields little additional improvement on the unseen test set.
Multi-label text classification is one of the most common text classification problems.
In this video, we studied two deep learning approaches for multi-label text classification.
In the first approach we used a single dense output layer with multiple neurons where each neuron represented one label.
In the second approach, we created separate dense layers for each label with one neuron.
Results show that, in our case, a single output layer with multiple neurons works better than multiple output layers.
As a next step, I would advise you to change the activation function and the train/test split to see if you can get better results than the ones presented in this video. Thanks for watching. Please like, share, and subscribe!