For many organizations, 2023 was the year of exploration, testing, and proof-of-concepts or deployment of smaller LLM-powered workflows and use cases, while 2024 will likely be the year when we see even more production systems leveraging LLMs. Compared to a traditional ML system, where data (examples, labels), the model, and its weights are the main artifacts, here prompts are the main artifacts. Prompts and prompt engineering are fundamental in driving the desired behavior of an assistant or agent for your use case.

Therefore, many of the large players as well as academia have provided guides on how to prompt LLMs effectively:

  1. 💻 OpenAI Prompt Engineering
  2. 💻 Guide to Anthropic’s prompt engineering resources
  3. 💻 Principled Instructions Are All You Need for Questioning LLaMA-1/2, GPT-3.5/4
  4. 💻 Prompt Engineering Guide

Many of these guides are quite generic and work across many different use cases. However, very few guides mention best practices for constructing prompts for Named Entity Recognition (NER). OpenAI has, for instance, a cookbook for NER, as well as this and this paper, which both suggest a method called Prompt-NER.

In this article, we will discuss some techniques that might be helpful when using LLMs for NER use cases. To set the stage, we will first define what Prompt Engineering and Named Entity Recognition are.

Prompt Engineering Link to this heading

Prompt Engineering sometimes feels more like an art than a science, but more and more best practices are showing up (see some of the references in the previous section).

One definition of Prompt Engineering is shown below:

Info

Prompt engineering is a relatively new discipline for developing and optimizing prompts to efficiently use language models (LMs) for a wide variety of applications and research topics. Prompt engineering skills help to better understand the capabilities and limitations of large language models (LLMs).

Prompt engineering is not just about designing and developing prompts. It encompasses a wide range of skills and techniques that are useful for interacting and developing with LLMs. It’s an important skill to interface, build with, and understand the capabilities of LLMs. You can use prompt engineering to improve the safety of LLMs and build new capabilities like augmenting LLMs with domain knowledge and external tools. — Prompt Engineering Guide1


  1. The above quote is excerpted from https://www.promptingguide.ai/ ↩︎

Like any other artifact, prompts may become “outdated” or “drift”, which is why it is important to have systems in place for:

  • Experiment tracking of prompts
  • Evaluation of your prompts (via the “golden dataset” approach, LLM-based evals, or both)
  • Observability of how your prompts are being used
  • Versioning of your prompts

Named Entity Recognition (NER) Link to this heading

Extracting entities or tags from data can be very useful for many different domains and businesses and can be used for many different things such as classification, knowledge retrieval, search etc.

See one definition below for Named Entity Recognition:

Info

Named Entity Recognition (NER) is a task of Natural Language Processing (NLP) that involves identifying and classifying named entities in a text into predefined categories such as person names, organizations, locations, and others. The goal of NER is to extract structured information from unstructured text data and represent it in a machine-readable format. Approaches typically use BIO notation, which differentiates the beginning (B) and the inside (I) of entities. O is used for non-entity tokens. — Prompt Engineering Guide1

Below is an example of the BIO notation, which is a common format used for NER:

Mac [B-PER]
Doe [I-PER]
ate [O]
a [O]
hamburger [O]
at [O]
Mcdonalds [B-LOC]

However, the BIO notation does not make sense for all use cases. If you are interested in extracting food entities, for example, then hamburger above might be a key entity or tag that you want to predict.

Usually, a “typical” NER pipeline 1 comprises the following steps:

  1. tokenizer: Turn text into tokens
  2. tagger: Assign part of speech tags
  3. parser: Assign dependency labels
  4. ner: Detect and label named entities

NER systems have previously been built using:

  1. Rule-based systems
  2. Statistical & ML systems
  3. Deep-learning systems
  4. A mix of (1)-(3).

Here, spaCy and Transformer architectures from e.g. Hugging Face, such as BERT and XLM-RoBERTa, have been the go-to methods and architectures.

Before we start with prompt engineering let’s set the stage with an example use case.

Food Entities from recipes Link to this heading

In the following section we will assume the following:

  1. We are a food tech startup that provides and sells custom smart purchase lists to retailers from online food recipes.
  2. We want to extract the following type of entities: FOOD, QUANTITY, UNIT, PHYSICAL_QUALITY, COLOR
  3. When we have the entities we want to populate smart purchase lists with recommendations on where to get the food.

The example recipe that we will be using:

### Chashu pork (for ramen and more)
Chashu pork is a classic way to prepare pork belly for Japanese dishes such as ramen.
While it takes a little time, it's relatively hands-off and easy, and the result is delicious.

### Ingredients
2 lb pork belly or a little more/less
2 green onions spring onions, or 3 if small
1 in fresh ginger (a chunk that will give around 4 - 6 slices)
2 cloves garlic
⅔ cup sake
⅔ cup soy sauce
¼ cup mirin
½ cup sugar
2 cups water or a little more as needed

### Instructions
Have some string/kitchen twine ready and a pair of scissors before you prepare the pork.
If needed, trim excess fat from the outside of the pork but you still want a layer over the
...

Technique #1 - Use temperature=0.0 Link to this heading

LLMs are non-deterministic by nature, and different generations from e.g. the chat completions APIs from OpenAI will return different responses. However, one way to mitigate this is to use the temperature parameter often provided in these types of APIs. In short, the lower the temperature, the more deterministic the results, in the sense that the most probable next token is always picked 2. Do note that this is still not a guarantee of deterministic output; rather, it is a best-effort attempt to always select the most likely token as model output.

This is of course useful in a NER case where you would like the same or similar input to produce the same or similar output.

Tip

Tip 1: Use 0.0 as temperature to get more deterministic output.
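As a sketch, a chat-completions request with the temperature pinned to 0.0 might be assembled like this. The parameter keys follow the OpenAI chat completions API, but the helper name, model name, and prompts are illustrative assumptions, and the actual API call is left commented out:

```python
def build_ner_request(recipe: str, model: str = "gpt-4") -> dict:
    """Assemble chat-completions kwargs with temperature pinned to 0.0.

    The keys follow the OpenAI chat completions API; the model name and
    prompts here are placeholders.
    """
    return {
        "model": model,
        "temperature": 0.0,  # best effort: always pick the most likely next token
        "messages": [
            {"role": "system", "content": "You are a food AI assistant ..."},
            {
                "role": "user",
                "content": f"Extract all food-related entities from the recipe below in backticks:\n```{recipe}```",
            },
        ],
    }

# With the real client (requires an API key), roughly:
# from openai import OpenAI
# client = OpenAI()
# response = client.chat.completions.create(**build_ner_request(recipe_text))
```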

Technique #2 - Set seed for your runs Link to this heading

Recently OpenAI added the functionality to set a seed to make runs more reproducible. If all other parameters are set to the same values (e.g. temperature=0), the seed is set, and the system_fingerprint3 is unchanged, then the output will be mostly deterministic.

Tip

Tip 2: Set the seed of each API call, e.g. seed=42, to get more deterministic output.
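A minimal sketch of combining the seed with the temperature setting, assuming the OpenAI chat completions API. The two helpers below are hypothetical, and the actual API call is commented out:

```python
from typing import Optional


def reproducible_params(seed: int = 42, temperature: float = 0.0) -> dict:
    """Parameters to pin for (mostly) reproducible generations."""
    return {"seed": seed, "temperature": temperature}


def fingerprint_changed(previous: Optional[str], current: str) -> bool:
    """Outputs are only comparable across runs if system_fingerprint is unchanged."""
    return previous is not None and previous != current


# With the real client, roughly:
# response = client.chat.completions.create(
#     model="gpt-4", messages=messages, **reproducible_params()
# )
# if fingerprint_changed(last_fingerprint, response.system_fingerprint):
#     ...  # backend config changed; determinism across runs is not expected
```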

Both the previous section and this section were more focused on model parameters. The following sections will instead focus on what Prompt Engineering techniques we can use to extract named entities.

Technique #3 - Use clear instructions Link to this heading

The next step is to give clear instructions to the agent. Normally:

  • The system prompt is used for instructions
  • The user prompt provides context and data
  • The (assistant) prompt provides examples

Starting with a prompt like the below:

System:
You are a food AI assistant who is an expert in natural language processing
and especially named entity recognition.

User:
Extract all food-related entities from the recipe below in backticks:
```{recipe}```
...

We get the following output:

#### extract_food_entities
I will now proceed to extract all the food-related entities from the given recipe.

#### extract_food_entities
The food-related entities present in the recipe are as follows:

* Pork belly
* Green onions
* Fresh ginger
* Garlic
* Sake
* Soy sauce
* Mirin
* Sugar
* Water
* Rice

These entities cover the main ingredients used for the Chashu pork recipe.

You can also use the Assistant message to prompt with some examples:

Assistant:
Example:
```json
{
  "food": "minced meat",
  "quantity": "500",
  "unit": "grams (g)",
  "physicalQuality": "minced",
  "color": "brown"
}
```
...
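Putting the three roles together, the messages might be assembled as in this sketch (the few-shot example values are illustrative, and the helper name is an assumption):

```python
import json


def build_messages(recipe: str) -> list:
    """Assemble system (instructions), assistant (example) and user (data) messages."""
    example = {
        "food": "minced meat",
        "quantity": "500",
        "unit": "grams (g)",
        "physicalQuality": "minced",
        "color": "brown",
    }
    return [
        {
            "role": "system",
            "content": (
                "You are a food AI assistant who is an expert in natural language "
                "processing and especially named entity recognition."
            ),
        },
        {
            "role": "assistant",
            "content": "Example:\n```json\n" + json.dumps(example, indent=2) + "\n```",
        },
        {
            "role": "user",
            "content": f"Extract all food-related entities from the recipe below in backticks:\n```{recipe}```",
        },
    ]
```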

Tip

Tip 3: Use clear instructions.

Technique #4 - Use functions or tools Link to this heading

When we use functions or tools, we prompt the model to provide input arguments for an actual downstream function. This is similar to what is mentioned in section 4.2 here 4. You can use these arguments as they are (as they will be valid JSON) or do some further processing by doing the function calling in your own logic. One example could be that we want to trigger certain actions or apply certain formatting based on the function arguments.

The functions will also be part of the system prompt. Many of the latest models have been fine-tuned to work with function calling and thus produce valid JSON output that way. To define a function, we define a jsonSchema as below:

{
    "name": "extract_food_entities",
    "description": "You are a food AI assistant. Your task is to extract food entities from a recipe based on the JSON schema. You are to return the output as valid JSON.",
    "parameters": {
      "type": "object",
      "properties": {
        "food-metadata": {
          "type": "object",
          "properties": {
            "food": {
              "type": "string",
              "description": "The name of the food item"
            },
            "quantity": {
              "type": "string",
              "description": "The quantity of the food item"
            },
            "unit": {
              "type": "string",
              "description": "The unit of the food item"
            },
            "physicalQuality": {
              "type": "string",
              "description": "The physical quality of the food item"
            },
            "color": {
              "type": "string",
              "description": "The color of the food item"
            }
          },
          "required": [
            "food",
            "quantity",
            "unit",
            "physicalQuality",
            "color"
          ]
        }
      },
      "required": [
        "food-metadata"
      ]
    }
  }

Example output from using extract_food_entities is shown below:

{
    "food-metadata": {
        "food": "Pork Belly",
        "quantity": "2",
        "unit": "lb",
        "physicalQuality": "-",
        "color": "-"
    }
}

Tip

Tip 4: Use tools or function calling with a jsonSchema to extract wanted metadata.
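As a sketch, the jsonSchema above can be wired into the tools parameter of the chat completions API like this. The schema is abbreviated here, the build_tools helper is hypothetical, and the actual API call is commented out:

```python
import json

# Abbreviated version of the jsonSchema defined above.
EXTRACT_FOOD_ENTITIES = {
    "name": "extract_food_entities",
    "description": "Extract food entities from a recipe and return valid JSON.",
    "parameters": {
        "type": "object",
        "properties": {
            "food-metadata": {
                "type": "object",
                "properties": {
                    "food": {"type": "string", "description": "The name of the food item"},
                    "quantity": {"type": "string", "description": "The quantity of the food item"},
                    "unit": {"type": "string", "description": "The unit of the food item"},
                },
                "required": ["food", "quantity", "unit"],
            }
        },
        "required": ["food-metadata"],
    },
}


def build_tools() -> list:
    """Wrap the function definition in the tools format."""
    return [{"type": "function", "function": EXTRACT_FOOD_ENTITIES}]


# With the real client, roughly:
# response = client.chat.completions.create(
#     model="gpt-4", messages=messages, tools=build_tools(),
#     tool_choice={"type": "function", "function": {"name": "extract_food_entities"}},
# )
# arguments = json.loads(response.choices[0].message.tool_calls[0].function.arguments)
```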

Technique #5 - Use domain prompts Link to this heading

As seen above, the jsonSchema gives us metadata in a structured format that we can use for downstream processing. However, there are limitations on the number of characters you can put in the description of each property in the jsonSchema. One way to give further instructions to the LLM is to add domain-specific instructions to e.g. the system prompt:

System:
You are a food AI assistant who is an expert in natural language processing
and especially named entity recognition. The entities we are interested in are: "food", "quantity", "unit", "physicalQuality" and "color".

See further instructions below for each entity:

"food": This can be both liquid and solid food such as meat, vegetables, alcohol, etc.

"quantity": The exact quantity or amount of the food that should be used in the recipe. Answer in both full units such as 1,2,3, etc but also fractions e.g. 1/3, 2/4, etc.

"unit": The unit being used e.g. grams, milliliters, pounds, etc. The unit must always be returned.

"physicalQuality": The characteristic of the ingredient (e.g. boneless for chicken breast, frozen for
spinach, fresh or dried for basil, powdered for sugar).

"color": The color of the food e.g. green, black, white. If no color is identified respond with colorless.

User:
Extract all food-related entities from the recipe below in backticks:
```{recipe}```
...

Example output with this updated prompt is shown below:

{
    "food-metadata": {
        "food": "pork belly",
        "quantity": "2 lb",
        "unit": "pound",
        "physicalQuality": "raw",
        "color": "colorless"
    }
}

{
    "food-metadata": {
        "food": "green onions",
        "quantity": "2",
        "unit": "pieces",
        "physicalQuality": "fresh",
        "color": "green"
    }
}

Tip

Tip 5: Incorporate domain knowledge to help the LLM with extracting the entities you are looking for.

Technique #6 - Use Chain-of-Thought Link to this heading

Info

Chain-of-Thought (CoT) is a prompting technique where each input question is followed by an intermediate reasoning step that leads to the final answer. This has been shown to improve the output from LLMs. There is also a slight variation of CoT called Zero-Shot Chain-of-Thought, where you introduce “Let’s think step by step” to guide the LLM’s reasoning.

An update to the prompt now using Zero-Shot Chain-of-Thought would be:

System:
You are a food AI assistant who is an expert in natural language processing
and especially named entity recognition. The entities we are interested in are: "food", "quantity", "unit", "physicalQuality" and "color".

See further instructions below for each entity:

"food": This can be both liquid and solid food such as meat, vegetables, alcohol, etc.

"quantity": The exact quantity or amount of the food that should be used in the recipe. Answer in both full units such as 1,2,3, etc but also fractions e.g. 1/3, 2/4, etc.

"unit": The unit being used e.g. grams, milliliters, pounds, etc. The unit must always be returned.

"physicalQuality": The characteristic of the ingredient (e.g. boneless for chicken breast, frozen for
spinach, fresh or dried for basil, powdered for sugar).

"color": The color of the food e.g. green, black, white. If no color is identified respond with colorless.

Let's think step-by-step.

User:
Extract all food-related entities from the recipe below in backticks:
```{recipe}```
...

By adding “Let’s think step by step” we can see some slight improvements for the extraction:

{
    "food-metadata": {
        "food": "pork belly",
        "quantity": "2 lb",
        "unit": "ounce",
        "physicalQuality": "trimmed",
        "color": "colorless"
    }
}

Trimmed is likely a better physicalQuality than raw to describe the pork belly, since the pork belly is sliced and used as a topping in e.g. a bowl of ramen.

Tip

Tip 6: Chain-of-thought may improve performance, especially for fields that need some calculation or reasoning steps.

Technique #7 - Use Prompt Chaining Link to this heading

Info

To improve the reliability and performance of LLMs, one of the important prompting engineering techniques is to break tasks into subtasks. Once those subtasks have been identified, the LLM is prompted with a subtask and then its response is used as input to another prompt. This is what’s referred to as prompt chaining where a task is split into subtasks with the idea to create a chain of prompt operations. — Prompt Engineering Guide1

Prompt Chaining is somewhat similar to CoT in that the problem is broken down into sub-problems or tasks. The main difference is that the response from one prompt is passed to the next prompt as refined context.

In our recipe example, we might construct a prompt looking like the below:

CONTEXT_PROMPT = """
You are a food AI assistant who is an expert in natural language processing
and especially named entity recognition. The entities we are interested in are: "food", "quantity", "unit", "physicalQuality" and "color".

Your task is to extract the entity, the value of the entity and the reasoning
for why the extracted value is the correct value.
If you cannot extract the entity add "N/A".

Make sure to output the results according to the output format below.

<Food-Metadata>:
"<metadata>":
- Entity: // the actual entity i.e. pork, onions etc
- Value: // Value of the metadata field
- Reasoning: // Reasoning for why the entity is correct.

Let's think step-by-step.
"""

The output from the first prompt is:

"pork":
- Entity: food
- Value: pork belly
- Reasoning: The text mentions preparing pork belly for Japanese dishes such as ramen, and the recipe specifically calls for 2 lb of pork belly.

"green onions":
- Entity: food
- Value: green onions
- Reasoning: The text mentions using 2 green onions and specifies that 3 should be used if they are small.

"ginger":
- Entity: food
- Value: ginger
- Reasoning: The recipe specifies using 1 inch of fresh ginger.

"garlic":
- Entity: food
- Value: garlic
- Reasoning: The recipe specifies using 2 cloves of garlic.

"sake":
- Entity: food
- Value: sake
- Reasoning: The recipe calls for ⅔ cup of sake.

"soy sauce":
- Entity: food
- Value: soy sauce
- Reasoning: The recipe calls for ⅔ cup of soy sauce.

"mirin":
- Entity: food
- Value: mirin
- Reasoning: The recipe calls for ¼ cup of mirin.

"sugar":
- Entity: food
- Value: sugar
- Reasoning: The recipe calls for ½ cup of sugar.

"water":
- Entity: food
- Value: water
- Reasoning: The recipe specifies using 2 cups of water.

We then use the output from the prompt above as input to our extract_food_entities prompt from before. This approach may be helpful when you have entities that need to be calculated with some reasoning around them, or when they may not be in the exact format that you have in your JSON schema.

Tip

Tip 7: Prompt chaining can help as an important pre-processing step to provide more relevant context.
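A sketch of the chain itself, where call_llm is a hypothetical stand-in for the actual chat-completions call: the free-text output of the reasoning prompt becomes part of the user message for the structured extraction prompt. Both prompt constants are abbreviated placeholders.

```python
from typing import Callable

CONTEXT_PROMPT = "You are a food AI assistant ... Let's think step-by-step."  # abbreviated
EXTRACTION_PROMPT = "Extract the food entities below according to the jsonSchema."  # illustrative


def chain(recipe: str, call_llm: Callable[[str, str], str]) -> str:
    """Two-step prompt chain: reasoning pass first, then structured extraction."""
    # Step 1: reasoning pass over the raw recipe.
    reasoning = call_llm(CONTEXT_PROMPT, recipe)
    # Step 2: structured extraction, with the reasoning output as refined context.
    refined_context = (
        f"Recipe:\n{recipe}\n\nExtracted entities with reasoning:\n{reasoning}"
    )
    return call_llm(EXTRACTION_PROMPT, refined_context)
```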

Closing Remarks Link to this heading

In this post, we walked through some useful prompt engineering techniques that might be helpful when tackling Named Entity Recognition (NER) with LLMs such as those from OpenAI.

Depending on your use case, one or several of these techniques may help improve your NER solution. In general, writing clear instructions and using CoT and/or prompt chaining together with tools or functions tend to improve the NER extraction.


  1. Language Processing Pipelines from https://spacy.io/usage/processing-pipelines ↩︎

  2. LLM Settings from https://www.promptingguide.ai/introduction/settings ↩︎

  3. The fingerprint represents the backend configuration that the model runs with. ↩︎

  4. OpenAI cookbook for NER: https://cookbook.openai.com/examples/named_entity_recognition_to_enrich_text ↩︎