In today’s digital landscape, product teams are increasingly incorporating AI-driven features powered by Large Language Models (LLMs). These advanced capabilities enhance various system components, including search, question-answering, and data extraction.
However, a crucial preliminary step in delivering the right feature at the right time is accurately routing and interpreting user actions. Understanding user intent is therefore fundamental to the success of any LLM and Retrieval-Augmented Generation (RAG) based solution.
In this blog post, we will delve into how LLMs 🤖 can be utilized to effectively detect and interpret user intent from a diverse range of queries.
What is intent detection and why does it matter?
Intent detection, or intent recognition, is an NLP technique used to identify the purpose or goal behind a user’s query. Historically, this type of classification problem has been very important in search and recommendation engines[1].
Some key aspects of intent detection include:
- Natural Language Understanding: i.e. understanding the meaning behind what a user is saying.
- Context Analysis: i.e. considering the context in which the user query appears in order to accurately detect the intent. The context could be a document, part of a document, a chat, etc.
- Classification: i.e. mapping a user’s input to one of a set of pre-defined intent labels or categories.
As you can imagine, this is a crucial step for an LLM-based system using e.g. RAG, for reasons such as:
- Improving the user experience: by understanding users we can tailor responses and actions to individual needs, and personalization also improves the efficiency and relevancy of the responses to the user’s query.
- Automation: knowing or predicting the intent can lead to the automation of certain pre-defined routines or tasks based on the user’s query.
- Feature Activation: intent detection can be used to route the user to various parts of our system based on the predicted intent and thus incorporate context to respond to user queries promptly.
You may have heard about semantic routing[2], which is an adjacent concept. TL;DR: use intent detection to route your users to relevant features or parts of your system to provide them with a tailored and timely experience; a minimal sketch of this idea follows below.
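To make that concrete, once an intent label is available, routing can be as simple as a lookup table from intent to feature handler. The sketch below is purely hypothetical (the intent names and handlers are placeholders, not part of any real system):

```typescript
// Hypothetical intent labels and feature handlers, for illustration only
type Intent = 'SEARCH' | 'QA' | 'RECOMMENDATION';

const featureRoutes: Record<Intent, (query: string) => Promise<string>> = {
  SEARCH: async (query) => `running search for: ${query}`,
  QA: async (query) => `answering question: ${query}`,
  RECOMMENDATION: async (query) => `fetching recommendations for: ${query}`,
};

// Once an intent has been detected (by any of the methods discussed below),
// routing the query is a simple lookup
async function routeByIntent(intent: Intent, query: string): Promise<string> {
  return featureRoutes[intent](query);
}
```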
Intent Detection Method(s)
Now that we know what intent detection is, what it can be used for, and why it is important, let’s delve a bit further into the various methods for detecting intent.
A rough classification of these various methods is the following:
- Rule-based
- Traditional NLP methods
- Classification models
- Sequence models
- Transformer-based models
- Large Language models
We will not look into all the details of these various methods. If you want to learn more, numerous great sources are available online for each type of method/algorithm mentioned.
However, the table below summarizes some pros and cons of these methods:
| Method | Pros | Cons |
|---|---|---|
| Rule-based methods | Simple to implement and interpret; easy to maintain for small sets of rules | Limited flexibility and scalability; requires manual updates for new intents and vocabulary |
| Traditional NLP methods (Bag-of-Words, TF-IDF) | Simple and effective for basic text classification; quick and computationally inexpensive | Ignores word order and context; can lead to loss of meaning |
| Classification models (Naive Bayes, SVM, Random Forests) | Generally more accurate than rule-based systems; can handle a variety of inputs | Requires feature engineering; may not capture complex linguistic nuances |
| Sequence models (RNN, LSTM, GRU) | Effective for capturing context and handling long sequences; good at modeling temporal dependencies in text | Computationally intensive; requires large datasets |
| Transformer-based methods (BERT, GPT, SetFit, FastFit) | State-of-the-art performance on many NLP tasks; capable of understanding complex context and nuances | Requires significant computational power; needs substantial training data |
| Large Language Models (GPT-4o, Haiku, Mistral Large) | High accuracy and versatility across applications; can handle a wide range of tasks without extensive retraining | Very computationally expensive; potential issues with bias and interpretability |
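To make the trade-offs in the table a bit more tangible, a rule-based detector can be as small as a keyword lookup. The sketch below is illustrative only (the keywords and intent names are made up) and shows exactly why such rules are easy to write but hard to scale:

```typescript
// A deliberately naive, illustrative rule-based intent detector:
// easy to write and interpret, but every new intent or phrasing
// requires another hand-written rule
const intentRules: Array<{ intent: string; keywords: string[] }> = [
  { intent: 'TRANSLATE', keywords: ['translate', 'in french', 'in spanish'] },
  { intent: 'RECOMMENDATION', keywords: ['recommend', 'suggest', 'what should i'] },
];

function detectIntentByRules(query: string): string {
  const normalized = query.toLowerCase();
  const match = intentRules.find((rule) =>
    rule.keywords.some((keyword) => normalized.includes(keyword))
  );
  return match?.intent ?? 'FALLBACK';
}
```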
Intent Detection Using LLM(s)
Now that we know what intent detection is and why it is useful, let’s look at a real-world example of how you could detect intent using LLMs.
In this example, we will operate a fictional recipe chatbot where people can ask questions about a recipe in a QA fashion and also get recipe recommendations. Our end-users can either chat in the context of a recipe or in a more general QA context.

This means that we have two distinct `FoodContextType`s to consider based on the user’s actions:

- `RECIPE`: queries in the context of a recipe
- `GENERAL`: more general queries not strictly connected to a specific recipe
After talking with our product team and some clients, we understand that to start we need to be able to detect the following `FoodIntentType`s:

- `RECIPE_QA`: questions and answers about a recipe
- `RECIPE_SUGGEST`: suggestions for or changes to a recipe
- `RECIPE_TRANSLATE`: translate a recipe, or parts of a recipe, in some way or form
- `RECIPE_EXPLAIN`: explain a recipe in more detail, i.e. what type of cuisine is this, etc.
- `GENERAL_QA`: general QA with the chatbot
- `GENERAL_RECOMMENDATION`: queries related to getting recipe recommendations
To spice 🔥 it up a bit we will be using `TypeScript` for this example, mainly as I have been doing quite a lot of `TS` and `JS` lately at my current workplace.

The libraries we will be using:

- `@langchain/openai`
- `@langchain/core`
- `zod`

To start with, let’s see what `entities` we have to deal with; in `TS` we use `enums` to enumerate these:
```typescript
/* Enum for chat context */
export enum FoodContextType {
  RECIPE = "RECIPE",
  GENERAL = "GENERAL",
}

/* Enum for chat intent */
export enum FoodIntentType {
  RECIPE_QA = "RECIPE_QA",
  RECIPE_SUGGEST = "RECIPE_SUGGEST",
  RECIPE_TRANSLATE = "RECIPE_TRANSLATE",
  RECIPE_EXPLAIN = "RECIPE_EXPLAIN",
  GENERAL_QA = "GENERAL_QA",
  GENERAL_RECOMMENDATION = "GENERAL_RECOMMENDATION",
}
```
As `intentDetection` is a `classification` problem, we want to use an LLM to predict the user’s intent based on the following:

- `context`: the context the user is in
- `userQuery`: the actual user question or query

An `intent` can also carry multiple meanings, i.e. allowing or deducing only a single intent might be wrong. For instance, look at the query below:

“Please recommend me a french cooking recipe with instructions in french”
The intents we can deduce from the query above are the following:

- `GENERAL_RECOMMENDATION`: the user wants to cook 🇫🇷
- `RECIPE_TRANSLATE`: the user wants to cook French food in French 🇫🇷

To model this we want to use `zod`[3]. Luckily for us, many LLMs are good at `functionCalling` and extracting `structuredOutput` based on a provided schema.
A `zod` object for our task could look like the one below:
```typescript
import { z } from 'zod';

const zDetectFoodIntentResponse = z.object({
  foodIntent: z
    .array(
      z.object({
        foodContextType: z
          .nativeEnum(FoodContextType)
          .describe('Type of context the user is in'),
        foodIntentType: z
          .nativeEnum(FoodIntentType)
          .describe('Predicted food-related intent'),
        reasoning: z
          .string()
          .describe('Reasoning around the predicted intent'),
      })
    ),
});

/* Infer type */
type FoodIntentDetectionResponse = z.infer<typeof zDetectFoodIntentResponse>;
```
Many modern LLMs support tool calling / structured output out of the box, and orchestration libraries such as `langchain` make it very easy to get started. LangChain fairly recently updated how structured output extraction and function calling work across several different LLM providers; see the LangChain documentation for more details.
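As a rough sketch of that pattern (assuming an OpenAI API key is available in the environment, and reusing the schema defined above), binding the `zod` schema to a chat model looks roughly like this:

```typescript
import { ChatOpenAI } from '@langchain/openai';

// Bind the zod schema so the model returns objects matching it
// (sketch only; model name and key handling may differ in your setup)
const model = new ChatOpenAI({ modelName: 'gpt-4o', temperature: 0 });
const structuredModel = model.withStructuredOutput(zDetectFoodIntentResponse);

// structuredModel.invoke(...) now resolves to an object shaped like
// FoodIntentDetectionResponse instead of a raw text completion
```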
To continue, the next step is to create our `prompt` and build our `chain` of one or several `LLM` calls. If you want some tips and tricks on how to do multiple LLM calls for data extraction tasks, see my other blog post. And if you, like me, are a fan of `DSPy`, check out my other post on that.

Anyhow, see below for the initial starting point of our prompt:
```typescript
export const FoodIntentDetectionPromptTemplate = `
You are an expert restaurateur and online TV chef.
Based on the provided 'context' and 'userQuery', predict the user's 'intent'.
Make sure to follow the instructions.

# Instructions
1. Only use the 'foodContextTypes' specified in the schema.
2. Use the 'foodContextType' to determine what type of 'context' the user is in.
3. Based on the 'foodContextType' and 'userQuery', predict the 'foodIntentType'.
4. If the 'userQuery' is uncertain, unclear, or irrelevant, use 'GENERAL_QA' as the default intent.

# Food Context Input Type
{foodContextType}

# User Context
{context}

# User Query
{userQuery}
`;
```
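Before wiring the template into a chain, you can sanity-check how it renders by formatting it with some sample values (a small sketch; the sample recipe and query are made up for illustration):

```typescript
import { ChatPromptTemplate } from '@langchain/core/prompts';

// Render the template with sample values to inspect what the LLM will actually see
const prompt = ChatPromptTemplate.fromTemplate(FoodIntentDetectionPromptTemplate);

const formattedMessages = await prompt.formatMessages({
  foodContextType: FoodContextType.RECIPE,
  context: 'Coq au vin: chicken, red wine, mushrooms, ...',
  userQuery: 'Can you make this vegetarian?',
});
console.log(formattedMessages);
```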
As prompt engineering is still more of an art than a science (if you are not using frameworks such as `DSPy`), you will likely need to refine a prompt such as the one above for your use case. Anyhow, for this example this is good enough, so let’s proceed with building our chain.
First, we define a helper class to keep track of our chat messages:
```typescript
import { ChatPromptTemplate } from '@langchain/core/prompts';
import { ChatOpenAI } from '@langchain/openai';

/* MessageRole enum */
export enum MessageRole {
  ASSISTANT = 'ASSISTANT',
  SYSTEM = 'SYSTEM',
  USER = 'USER',
}

/* Messages object */
export class Messages {
  id: string;
  content: string;
  recipe?: string; // only present when chatting in the context of a recipe
  role: MessageRole;
}
```
Then we build our `predictIntent` function:
```typescript
export async function predictIntent(
  message: Messages
): Promise<FoodIntentDetectionResponse> {
  // unpack message
  const { content, recipe } = message;

  // the recipe (if any) acts as the user's context
  const userContext = recipe ?? '';

  // deduce foodContextType from the message
  const foodContextType = !recipe
    ? FoodContextType.GENERAL
    : FoodContextType.RECIPE;

  // the message content is the user's question
  const userQuery = content;

  // build chain
  const llm = new ChatOpenAI({
    temperature: 0,
    modelName: 'gpt-4o',
    openAIApiKey: process.env.apiKey,
  });

  const chain = ChatPromptTemplate
    .fromTemplate(FoodIntentDetectionPromptTemplate)
    .pipe(llm.withStructuredOutput(zDetectFoodIntentResponse));

  // invoke chain and parse response
  const response = await chain.invoke({
    context: userContext,
    foodContextType,
    userQuery: userQuery ?? '',
  });

  const parsedResponse = zDetectFoodIntentResponse.safeParse(response);

  if (!parsedResponse.success) {
    throw new Error('Failed to parse response...');
  }

  return parsedResponse.data;
}
```
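A hypothetical call site could then look like the sketch below (the message values are made up for illustration):

```typescript
// Hypothetical usage: a user asking a question while viewing a recipe
const message: Messages = {
  id: 'msg-1',
  content: 'Can I swap the cream for coconut milk?',
  recipe: 'Creamy mushroom pasta: 200g pasta, 150ml cream, ...',
  role: MessageRole.USER,
};

const intentResponse = await predictIntent(message);
console.log(JSON.stringify(intentResponse, null, 2));
```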
Not too hard, right? Using this function we might get output such as the below for different queries:

Query 1:

“Can you give me a good dinner recommendation that is fairly quick and easy, preferably japanese?”
Output 1:
```json
{
  "foodIntent": [
    {
      "foodContextType": "GENERAL",
      "foodIntentType": "GENERAL_RECOMMENDATION",
      "reasoning": "The user is asking for a recommendation of Japanese food that is easy and quick. Due to this, the predicted intent is GENERAL_RECOMMENDATION."
    }
  ]
}
```
Query 2:
“This recipe is great, but I would like to make it vegetarian, and also use the imperial system instead of the metric system for the ingredients”
Output 2:
```json
{
  "foodIntent": [
    {
      "foodContextType": "RECIPE",
      "foodIntentType": "RECIPE_QA",
      "reasoning": "The user ..."
    },
    {
      "foodContextType": "RECIPE",
      "foodIntentType": "RECIPE_TRANSLATE",
      "reasoning": "The user ..."
    }
  ]
}
```
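Since the schema allows multiple intents per query, the caller needs to decide how to act on each one. A simple, hypothetical dispatcher could iterate over the array and trigger the corresponding feature (the handlers here are placeholders, not part of the example above):

```typescript
// Sketch of acting on multiple predicted intents
async function handleIntents(response: FoodIntentDetectionResponse): Promise<void> {
  for (const intent of response.foodIntent) {
    switch (intent.foodIntentType) {
      case FoodIntentType.RECIPE_SUGGEST:
        // e.g. rewrite the recipe (vegetarian substitutions, portion changes, ...)
        break;
      case FoodIntentType.RECIPE_TRANSLATE:
        // e.g. translate the recipe text or convert units
        break;
      default:
        // fall back to general QA over the chat history
        break;
    }
  }
}
```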
Closing remarks
In this post, we explored intent detection in a bit more detail and why it is important for any AI or LLM-powered system, with the main goal of increasing the relevancy and accuracy of responses to a user’s query in a QA/search-based system.
We also demonstrated how you could use LLMs such as `gpt-4o` to detect intent in a fictional QA system. Intent detection is not just a technical necessity but a strategic advantage in building intelligent, user-centric LLM-powered systems.

Even though we used `gpt-4o` in this example, there are many relevant lower-latency alternatives, such as `haiku` from Anthropic. And if you have a few hundred labeled examples, other transformer-based approaches such as `FastFit` or `SetFit` are also interesting. Thanks for reading and until next time 👋!
- [3] This is somewhat similar to `Pydantic` in Python.