How Meta leverages generative AI to understand user intent

MT HANNACH
7 Min Read



Meta – the parent company of Facebook, Instagram, WhatsApp, Threads and more – runs one of the largest recommendation systems in the world.

In two recently published papers, its researchers revealed how generative models can be used to better understand and respond to user intentions.

Framing recommendation as a generative problem opens up approaches that are richer and more effective than traditional methods. It has important implications for any application that needs to retrieve documents, products, or other types of items.

Dense versus generative retrieval

The standard approach to building recommendation systems is to compute, store, and retrieve dense representations of items. For example, to recommend items to users, an application must train a model that can compute embeddings for both users and items. It then needs to build a large store of item embeddings.

At inference time, the recommender system tries to understand the user’s intent by finding one or more items whose embeddings are similar to the user’s. This approach requires ever more storage and compute as the number of items grows, because every item embedding must be stored, and every recommendation operation requires comparing the user’s embedding against the entire item store.
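To make this concrete, here is a minimal sketch of dense retrieval over a toy catalog. The embeddings and the cosine-similarity ranking are illustrative stand-ins for what a trained user/item embedding model would produce.

```python
import numpy as np

# Toy item catalog: each item is represented by a dense embedding.
# In a real system these come from a trained embedding model.
item_embeddings = np.array([
    [0.9, 0.1, 0.0],   # item 0
    [0.1, 0.8, 0.1],   # item 1
    [0.0, 0.2, 0.9],   # item 2
])

def recommend(user_embedding: np.ndarray, k: int = 2) -> list[int]:
    """Return indices of the k items whose embeddings are most
    similar (by cosine similarity) to the user embedding."""
    norms = np.linalg.norm(item_embeddings, axis=1) * np.linalg.norm(user_embedding)
    scores = item_embeddings @ user_embedding / norms
    return np.argsort(-scores)[:k].tolist()

user = np.array([0.85, 0.15, 0.05])
print(recommend(user))  # most similar item first
```

Note that every query scans the whole `item_embeddings` store, which is exactly the cost that grows with catalog size (real systems mitigate, but do not eliminate, this with approximate nearest-neighbor indexes).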

Dense retrieval (source: arXiv)

Generative retrieval is a newer approach that attempts to understand user intent and make recommendations by predicting the next item in a sequence instead of searching a database. Generative retrieval does not require storing item embeddings, and its inference and storage costs remain constant as the item list grows.

The key to making generative retrieval work is to compute “semantic IDs” (SIDs) that capture the contextual information about each item. Generative retrieval systems like TIGER work in two phases. First, an encoder model is trained to create a unique embedding for each item based on its description and properties. These embedding values become the SIDs and are stored with the item.
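One common way to turn a continuous embedding into a discrete, token-like SID is residual quantization, which TIGER uses. The sketch below is a hedged illustration with random stand-in codebooks; in a real system the codebooks are learned jointly with the encoder, and the sizes and names here are assumptions.

```python
import numpy as np

# Hypothetical sketch of residual quantization: an embedding becomes
# a short sequence of discrete codes (one per level), each level
# quantizing the residual left over from the previous level.
rng = np.random.default_rng(0)
codebooks = [rng.normal(size=(8, 4)) for _ in range(3)]  # 3 levels, 8 codes each

def to_semantic_id(embedding: np.ndarray) -> list[int]:
    """Map a 4-d embedding to a 3-token semantic ID."""
    sid, residual = [], embedding.astype(float)
    for book in codebooks:
        idx = int(np.argmin(np.linalg.norm(book - residual, axis=1)))
        sid.append(idx)
        residual = residual - book[idx]   # next level quantizes what's left
    return sid

print(to_semantic_id(np.array([0.5, -0.2, 1.1, 0.0])))
```

Because similar embeddings fall into the same codebook cells, items with related content end up sharing SID prefixes, which is what lets a sequence model treat them like tokens in a vocabulary.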

Generative retrieval (source: arXiv)

In the second phase, a Transformer model is trained to predict the next SID in an input sequence. The list of input SIDs represents the user’s past interactions with items, and the model’s prediction is the SID of the item to recommend. Generative retrieval removes the need to store and search over individual item embeddings. It also improves the ability to capture deeper semantic relationships within the data, and it provides other benefits of generative models, such as adjusting the temperature to tune the diversity of recommendations.
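The temperature knob mentioned above works the same way as in any generative model and can be sketched with a plain softmax over SID logits. The logits below are hard-coded stand-ins for what the sequence model would output.

```python
import numpy as np

def next_sid_distribution(logits: np.ndarray, temperature: float) -> np.ndarray:
    """Softmax over next-SID logits. Lower temperature sharpens the
    distribution (safer, less diverse recommendations); higher
    temperature flattens it (more diverse recommendations)."""
    scaled = logits / temperature
    scaled -= scaled.max()           # subtract max for numerical stability
    probs = np.exp(scaled)
    return probs / probs.sum()

logits = np.array([2.0, 1.0, 0.5])   # stand-in model outputs for 3 SIDs
sharp = next_sid_distribution(logits, temperature=0.5)
flat = next_sid_distribution(logits, temperature=2.0)
# sharp concentrates probability on the top SID; flat spreads it out
```

Sampling the next SID from `flat` rather than `sharp` is how a deployment would trade recommendation accuracy for variety without retraining anything.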

Advanced generative retrieval

Despite its lower storage and inference costs, generative retrieval suffers from some limitations. For example, it tends to overfit the items it saw during training, which means it has difficulty handling items added to the catalog after the model was trained. In recommender systems, this is often called the “cold start problem,” which concerns users and items that are new and have no interaction history.

To address these shortcomings, Meta developed a hybrid recommendation system called LIGER, which combines the computational and storage efficiency of generative retrieval with the robust embedding quality and ranking capabilities of dense retrieval.

During training, LIGER uses both a similarity-score objective and a next-token objective to improve the model’s recommendations. During inference, LIGER selects several candidates from the generative mechanism and supplements them with a few cold-start items, which are then ranked using the candidates’ embeddings.
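A rough sketch of that inference flow, with toy embeddings and hypothetical item names (not from the paper): candidates from the generative model are pooled with a cold-start item, and the pool is ranked by embedding similarity to the user.

```python
import numpy as np

# Toy embedding table. "new" is a cold-start item the generative
# model never saw during training, so it can only enter the pool
# via the dense-retrieval side of the hybrid.
item_embeddings = {
    "a": np.array([0.9, 0.1]),
    "b": np.array([0.2, 0.8]),
    "new": np.array([0.85, 0.2]),
}

def liger_rank(user_emb, generative_candidates, cold_start_items, k=2):
    """Pool generative candidates with cold-start items, then rank
    the pool by dot-product similarity to the user embedding."""
    pool = list(dict.fromkeys(generative_candidates + cold_start_items))  # dedupe, keep order
    scores = {item: float(item_embeddings[item] @ user_emb) for item in pool}
    return sorted(pool, key=lambda item: -scores[item])[:k]

user = np.array([1.0, 0.1])
print(liger_rank(user, ["a", "b"], ["new"]))
```

The point of the hybrid is visible here: the cold-start item can outrank a generatively retrieved one because the final ordering is embedding-based, while the generative step keeps the candidate pool small and cheap.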

LIGER combines generative and dense retrieval (source: arXiv)

The researchers note that “the fusion of dense and generative retrieval methods holds enormous potential for advancing recommendation systems” and that as the models evolve, “they will become increasingly practical for real-world applications, enabling more personalized and responsive user experiences.”

In another paper, the researchers present a multimodal generative retrieval method called Multimodal Preference Discriminator (Mender), a technique that allows generative models to capture implicit preferences from users’ interactions with different items. Mender builds on SID-based generative retrieval methods and adds components that can enrich recommendations with user preferences.

Mender uses a large language model (LLM) to translate user interactions into specific preferences. For example, if the user praised or complained about a specific item in a review, the model will summarize it into a preference for that product category.

The main recommendation model is trained to be conditioned on both the sequence of user interactions and user preferences when predicting the next semantic identifier in the input sequence. This gives the recommendation model the ability to generalize and learn in context and adapt to user preferences without being explicitly trained on them.
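One way to picture this conditioning, purely as an assumption about the interface: the natural-language preferences extracted by the LLM are prepended to the SID history that the sequence model completes. The tokens and format below are invented for illustration and are not Mender's actual input scheme.

```python
# Hypothetical sketch of Mender-style input construction: the model
# is conditioned on text preferences AND the semantic-ID history when
# predicting the next SID. All token names here are illustrative.
def build_model_input(preferences: list[str], interaction_sids: list[list[int]]) -> str:
    """Concatenate preference strings and SID history into one
    conditioning sequence, ending where the next SID is predicted."""
    pref_part = " ".join(f"<pref>{p}</pref>" for p in preferences)
    sid_part = " ".join("-".join(str(code) for code in sid) for sid in interaction_sids)
    return f"{pref_part} <history> {sid_part} <next>"

model_input = build_model_input(
    ["prefers noise-cancelling headphones"],
    [[3, 5, 1], [2, 0, 7]],   # SIDs of past interactions
)
print(model_input)
```

Because the preferences are plain text in the conditioning sequence, they can be swapped or updated at inference time, which is what lets the model adapt to a user's stated preferences without retraining.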

“Our contributions pave the way for a new class of generative retrieval models that enable the use of organic data to guide recommendations via user text preferences,” the researchers write.

Mender recommendation framework (source: arXiv)

Implications for enterprise applications

The efficiencies provided by generative retrieval systems can have important implications for enterprise applications. These advances translate into immediate practical benefits, including lower infrastructure costs and faster inference. The technology’s ability to keep storage and inference costs constant regardless of catalog size makes it particularly attractive to growing businesses.

The benefits extend across industries, from e-commerce to enterprise search. Generative retrieval is still in its early stages, and we can expect applications and frameworks to emerge as it matures.
