Pushing the Limits of the Two-Tower Model | by Samuel Flender

Where the assumptions behind the two-tower model architecture break, and how to go beyond them

(Image created by the author using generative AI)

Two-tower models are among the most common architectural design choices in modern recommender systems. The key idea is to have one tower that learns relevance, and a second, shallow tower that learns observational biases such as position bias.

In this post, we’ll take a closer look at two assumptions behind two-tower models, namely:

the factorization assumption, i.e. the hypothesis that we can simply multiply the probabilities computed by the two towers (or add their logits), and
the positional independence assumption, i.e. the hypothesis that the only variable that determines position bias is the position of the item itself, and not the context in which it is impressed.
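
To make the factorization assumption concrete, here is a minimal Python sketch of the two ways of combining the tower outputs mentioned above. The function names, the example logit values, and the use of a sigmoid to turn each tower's logit into a probability are illustrative assumptions, not taken from any specific implementation. Note that the two variants are related but not numerically identical.

```python
import math


def sigmoid(x: float) -> float:
    """Map a logit to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def click_prob_multiplicative(relevance_logit: float, bias_logit: float) -> float:
    """Variant 1: multiply the probabilities computed by the two towers."""
    return sigmoid(relevance_logit) * sigmoid(bias_logit)


def click_prob_additive(relevance_logit: float, bias_logit: float) -> float:
    """Variant 2: add the two towers' logits, then apply a single sigmoid."""
    return sigmoid(relevance_logit + bias_logit)


# Hypothetical outputs: one relevance logit for an item, and one bias logit
# per display position (position 1 is shown first).
relevance_logit = 1.2
bias_logit_by_position = {1: 2.0, 2: 0.5, 3: -1.0}

for position, bias_logit in bias_logit_by_position.items():
    p_mult = click_prob_multiplicative(relevance_logit, bias_logit)
    p_add = click_prob_additive(relevance_logit, bias_logit)
    print(f"position={position} multiplicative={p_mult:.3f} additive={p_add:.3f}")
```

In both variants the same item receives a lower predicted click probability as the bias logit drops, even though its relevance logit is unchanged. In both, the bias tower's contribution depends only on the position, which is exactly what the second assumption states.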

We’ll see where both of these assumptions break, and how to go beyond these limitations with newer algorithms such as the MixEM model, the Dot Product model, and XPA.

Let’s start with a very brief reminder.

Two-tower models: the story so far

The primary learning objective for the ranking models in recommender systems is relevance: we want the model to predict the best possible piece of content given the context. Here, context simply means everything that we’ve learned about the user, for example from their previous engagement or search histories, depending on the application.

However, ranking models usually exhibit certain observation biases, that is, the tendency for users to engage more or less with an impression depending on how it was presented to them. The most prominent observation bias is position bias: the tendency of users to engage more with items that are shown first.

The key idea in two-tower models is to train two “towers”, that is, neural networks, in parallel, the first tower for learning relevance, and…

Pushing the Limits of the Two-Tower Model | by Samuel Flender | Dec, 2023
