Leveraging context features and multitask learning (Building recommendation systems with TensorFlow)

WEI WEI: Hello.

Welcome back to our video series of building recommendation systems with TensorFlow.

My name is Wei, and I'm a developer advocate at Google.

In our last video, we showed you how to build a basic ranking model using TensorFlow Recommenders.

Now you know how to build a retrieval and ranking system.

But to make sure your recommender is effective, you need to do more to improve the accuracy of your models.

So in this video, we'll be discussing how to leverage context features and how to do multitask learning with TensorFlow Recommenders.

To improve our model accuracy, one of the things we can do is leverage context features, sometimes called side features.

Our previous examples using TensorFlow Recommenders did not incorporate context features.

We have relied purely on user and item IDs.

But if you have done any feature engineering, you know that context features are very important and can influence your model's accuracy quite a bit.

For example, day of the week may be an important feature when deciding whether to recommend a short clip or a movie.

Users may only have time to watch short content during the week, but can relax and enjoy a full-length movie during the weekend.

Similarly, timestamps may play an important role in modeling popularity dynamics.

One movie may be highly popular around the time of the release but decay quickly afterwards.

Conversely, other movies may be evergreens that are happily watched time and time again.

In addition, many recommendation datasets are sparse.

When very few observations are available for a given user or item, the model may struggle with finding a good representation for it.

This is particularly relevant to the cold-start problem, in which fresh items or users come into the system with no prior interactions.

So it is critical to leverage context features to tackle the cold-start problem.

So in our case, we will include two user context features, timestamp and normalized timestamp, and one movie context feature, movie title text, in our model.

If you have done any feature engineering before, these will look very familiar.

I also encourage you to include more context features in your own experiments to see how additional features affect your model's performance.

We're going to use a retrieval model to demonstrate how to do this.

First, we create the user model and set up the user ID embedding.

This is pretty straightforward by now.

Next, we use the Discretization preprocessing layer to bucketize timestamps, and the Normalization preprocessing layer to normalize them.

Now we can concatenate the user ID embedding, the timestamp embedding, and the normalized timestamp into a single vector.

This will be the input for the query tower.
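As a rough illustration of these preprocessing steps, here is a minimal NumPy sketch of bucketizing and normalizing timestamps and concatenating them with the user ID embedding. All sizes, timestamps, and embedding tables below are hypothetical; the real model uses tf.keras.layers.Discretization and Normalization with learned embeddings.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: Unix timestamps for a handful of interactions (made-up values).
timestamps = np.array([1.0e9, 1.1e9, 1.2e9, 1.3e9], dtype=np.float64)

# Discretization: split the timestamp range into equal-width buckets, then
# look up an embedding per bucket (mirrors tf.keras.layers.Discretization).
num_buckets = 3
bucket_edges = np.linspace(timestamps.min(), timestamps.max(), num_buckets + 1)[1:-1]
bucket_ids = np.digitize(timestamps, bucket_edges)          # shape: (4,)
timestamp_embedding_table = rng.normal(size=(num_buckets, 8))
timestamp_emb = timestamp_embedding_table[bucket_ids]        # shape: (4, 8)

# Normalization: zero mean, unit variance (mirrors tf.keras.layers.Normalization).
normalized_ts = (timestamps - timestamps.mean()) / timestamps.std()
normalized_ts = normalized_ts[:, None]                       # shape: (4, 1)

# User ID embedding, as in the basic model.
user_ids = np.array([0, 1, 0, 2])
user_embedding_table = rng.normal(size=(3, 8))
user_emb = user_embedding_table[user_ids]                    # shape: (4, 8)

# Concatenate everything into the query-tower input.
query_input = np.concatenate([user_emb, timestamp_emb, normalized_ts], axis=1)
print(query_input.shape)  # (4, 17)
```

In the Keras version, the Discretization and Normalization layers learn their bucket boundaries and statistics by calling adapt on the timestamp data first.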

Moving on to the movie model, we start with the movie title embedding.

Then we use the TextVectorization preprocessing layer, together with Embedding and GlobalAveragePooling1D layers, to map the movie title text to an embedding.

We then concatenate the two embeddings into a single vector as the movie item representation.

This is similar to what we did for the user features.
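Conceptually, the title pipeline tokenizes the text, embeds each token, and average-pools the token embeddings into one vector. Here is a minimal NumPy sketch with a made-up vocabulary and random embedding tables, standing in for the TextVectorization, Embedding, and GlobalAveragePooling1D layers:

```python
import numpy as np

rng = np.random.default_rng(0)

# A tiny vocabulary standing in for TextVectorization's learned vocab
# (hypothetical words; index 0 is reserved for out-of-vocabulary tokens).
vocab = {"the": 1, "matrix": 2, "toy": 3, "story": 4}
embedding_dim = 8
# Row 0 is the OOV embedding; rows 1..4 cover the vocabulary.
title_embedding_table = rng.normal(size=(len(vocab) + 1, embedding_dim))

def title_to_embedding(title: str) -> np.ndarray:
    """Tokenize, look up per-token embeddings, then average-pool them."""
    token_ids = [vocab.get(word, 0) for word in title.lower().split()]
    token_embs = title_embedding_table[token_ids]   # (num_tokens, embedding_dim)
    return token_embs.mean(axis=0)                  # (embedding_dim,)

title_text_emb = title_to_embedding("The Matrix")
print(title_text_emb.shape)  # (8,)

# Concatenate with the movie title ID embedding to form the item representation.
movie_id_emb = rng.normal(size=(embedding_dim,))
movie_repr = np.concatenate([movie_id_emb, title_text_emb])
print(movie_repr.shape)  # (16,)
```

Average pooling makes the representation independent of title length, which is why a single fixed-size vector comes out regardless of how many words the title has.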

Now we can define a MovieLens model.

In the init method, we add one more dense layer on top of the user model as our query model.

Similarly, we add one more dense layer on top of the movie model as our candidate model.

Now we define the retrieval task as we did before.

Lastly, we define the loss.
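Stripped of the Keras machinery, the two towers and the retrieval scoring amount to something like the following NumPy sketch. The layer sizes, weights, and batch sizes are all hypothetical; in the real model these are trainable tf.keras.layers.Dense layers and the loss comes from tfrs.tasks.Retrieval.

```python
import numpy as np

rng = np.random.default_rng(0)

def dense(x, w, b):
    # A single dense layer with ReLU, standing in for the extra
    # tf.keras.layers.Dense stacked on top of each tower.
    return np.maximum(x @ w + b, 0.0)

query_dim, candidate_dim, out_dim = 17, 16, 32  # hypothetical sizes

# Concatenated context features from each tower (random stand-ins).
query_features = rng.normal(size=(4, query_dim))          # batch of 4 users
candidate_features = rng.normal(size=(5, candidate_dim))  # 5 candidate movies

wq, bq = rng.normal(size=(query_dim, out_dim)), np.zeros(out_dim)
wc, bc = rng.normal(size=(candidate_dim, out_dim)), np.zeros(out_dim)

query_emb = dense(query_features, wq, bq)          # (4, 32)
candidate_emb = dense(candidate_features, wc, bc)  # (5, 32)

# The retrieval task scores every (query, candidate) pair by dot product;
# training pushes the score of the true pair above the others.
scores = query_emb @ candidate_emb.T
print(scores.shape)  # (4, 5)
```

The key point is that both towers now project their richer, concatenated feature vectors into the same embedding space, so retrieval still reduces to a dot product.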

Overall, the workflow is pretty much the same as before.

We only added additional context features to the two towers, and this should improve our model's performance.

I'm going to skip the performance benchmark parts, since we have those in our documentation.

Feel free to check it out.

Our next topic is multitask recommenders.

Multitask learning is not a new idea.

Back in 1997, Rich Caruana published a widely cited paper on multitask learning.

The idea is to solve multiple machine learning tasks at the same time by exploiting commonalities and differences across tasks.

This makes sense because, in many real-world applications, there are multiple sources of feedback to draw upon.

For example, on YouTube, users can provide a variety of different signals.

Users may watch some videos but skip others, which provides implicit feedback.

They may give the videos a thumbs up or down, comment on them, and even share them on a social network, like Twitter.

Integrating all these different forms of feedback is critical to building systems that users love to use and avoiding optimizing a single metric at the expense of overall performance.

In addition, building a joint model for multiple tasks may produce better results than building a number of task-specific models.

This is especially true when some data, like clicks, is abundant, and other data, like comments or shares, is sparse.

In those scenarios, a joint model may be able to use representations learned from the abundant task to improve its predictions on the sparse task, via transfer learning.

Now we understand why we want to use multitask learning in a recommender system.

Let's build one that includes a retrieval task and a ranking task, using both implicit and explicit feedback.

We first define a ranking task that leverages explicit feedback: the movie ratings from the MovieLens dataset.

Next, we define a retrieval task using the implicit feedback of movie watches.

The user model and movie model are defined as before, so we won't elaborate on them here.

The rating model is three dense layers stacked together as before.

We apply the rating model to the concatenation of user embeddings and movie embeddings.
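In plain NumPy, that rating head looks roughly like the sketch below. The layer widths and weights are hypothetical; in the real model these are trainable tf.keras.layers.Dense layers.

```python
import numpy as np

rng = np.random.default_rng(0)

def dense(x, w, b, relu=True):
    # A dense layer, optionally with ReLU activation.
    y = x @ w + b
    return np.maximum(y, 0.0) if relu else y

# Hypothetical tower outputs for a batch of 4 (user, movie) pairs.
user_emb = rng.normal(size=(4, 32))
movie_emb = rng.normal(size=(4, 32))

# The rating model: three stacked dense layers applied to the concatenated
# user and movie embeddings; the last layer outputs a single predicted
# rating with no activation.
x = np.concatenate([user_emb, movie_emb], axis=1)  # (4, 64)
w1, b1 = rng.normal(size=(64, 256)) * 0.1, np.zeros(256)
w2, b2 = rng.normal(size=(256, 64)) * 0.1, np.zeros(64)
w3, b3 = rng.normal(size=(64, 1)) * 0.1, np.zeros(1)

predicted_rating = dense(dense(dense(x, w1, b1), w2, b2), w3, b3, relu=False)
print(predicted_rating.shape)  # (4, 1)
```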

Lastly, we compute the rating loss and the retrieval loss separately, and then combine them by their respective weights.

These weights are hyperparameters that you need to tune to satisfy your own needs.
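The weighted combination itself is just a sum, as in the compute_loss method you override on tfrs.models.Model. The loss values and weights below are made up for illustration:

```python
# Hyperparameters: how much each task contributes to the total loss.
rating_weight = 1.0     # weight on the ranking (explicit feedback) task
retrieval_weight = 1.0  # weight on the retrieval (implicit feedback) task

# Hypothetical per-task losses from one training batch.
rating_loss = 0.84      # e.g. mean squared error on predicted ratings
retrieval_loss = 2.31   # e.g. softmax cross-entropy over candidates

total_loss = rating_weight * rating_loss + retrieval_weight * retrieval_loss
print(total_loss)

# Setting retrieval_weight = 0.0 recovers a pure rating model,
# and rating_weight = 0.0 a pure retrieval model.
```

Sweeping these two weights is how you trade off ranking quality against retrieval quality for your particular product needs.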

Thats all.

We can now call the standard Keras compile and fit methods to train our multitask recommender.

This assumes we have done the data preprocessing as well.

As you can see, it's actually fairly easy to build a multitask model within the framework of TensorFlow Recommenders.

Just to summarize: today, we discussed how to leverage context features and multitask learning to improve your model's accuracy.

I have put together some links for you to check out in case you want to learn more about them.

In the next video, we'll be taking you deeper into the deep and cross network.

See you next time.


