Anyone who has worked with LLMs and generative AI recently has noticed that how you prompt an LLM matters. Slight changes to a prompt can lead to unexpected results, and it is often non-trivial to reuse the same prompts when switching the underlying LLM, for example when moving from OpenAI to Anthropic while using function calling. As a result, a lot of time goes into rewriting prompts, i.e. more prompt engineering. Luckily, there are some interesting frameworks out there, such as DSPy, that focus on ‘programming’ rather than ‘prompting’ your LLMs.
To get a good overview of DSPy, see the references below:
- Intro to DSPy: Goodbye Prompting, Hello Programming!
- DSPy: Compiling Declarative Language Model Calls into Self-Improving Pipelines
- DSPy Deep-Dive
In this post, we will try to use DSPy to extract metadata from recipes. For a recap of our previous approach, see e.g. this post.
DSPy - Declarative Self-improving Language Programs
DSPy, or Declarative Self-improving Language Programs, was first introduced in the paper listed as reference (2) above:
Info
DSPy is a framework for algorithmically optimizing LM prompts and weights, especially when LMs are used one or more times within a pipeline.
DSPy can routinely teach powerful models like GPT-3.5 or GPT-4 and local models like T5-base or Llama2-13b to be much more reliable at tasks, i.e. having higher quality and/or avoiding specific failure patterns. DSPy optimizers will “compile” the same program into different instructions, few-shot prompts, and/or weight updates (finetunes) for each LM.
Source: About DSPy, excerpted from https://dspy-docs.vercel.app/docs/intro
Note the use of the word optimizing above, which invites an analogy to optimizing neural networks with a framework such as PyTorch or TensorFlow. The nice thing about the optimizer is that DSPy lets us avoid focusing too much on prompt engineering for whatever LLM we choose. Instead, we can compile, i.e. optimize, the underlying instructions to work with any LLM.
In short, the DSPy programming model has the following abstractions:
- Signatures, which replace hand-written prompts and fine-tuning.
- Modules, which implement various prompt engineering techniques such as CoT, ReAct, etc.
- Optimizers, which automate manual prompt engineering based on given metrics.
A DSPy program combines these three abstractions with data to use in the optimization step. For a more thorough walk-through of DSPy, see e.g. references (1) and (2) from the introduction section.
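To make the idea of a declarative signature a bit more concrete, here is a minimal sketch in plain Python. It is not DSPy's implementation: `ToySignature` and `render_prompt` are hypothetical names, and the prompt layout is only an assumption modelled loosely on the kind of template a framework could generate from such a specification.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ToySignature:
    """Hypothetical stand-in for a declarative signature (not a DSPy class)."""
    instructions: str
    inputs: List[str]
    outputs: List[str]


def render_prompt(sig: ToySignature, **values: str) -> str:
    """Turn the declarative specification into a prompt string."""
    # Describe the expected layout once, using ${name} placeholders.
    layout = [f"{name.title()}: ${{{name}}}" for name in sig.inputs + sig.outputs]
    # Fill in the concrete input values and leave the first output open for the LM.
    filled = [f"{name.title()}: {values[name]}" for name in sig.inputs]
    filled.append(f"{sig.outputs[0].title()}:")
    return "\n".join(
        [sig.instructions, "---", "Follow the following format.", *layout, "---", *filled]
    )


sig = ToySignature(
    instructions="Extract food-related metadata from recipes.",
    inputs=["recipe"],
    outputs=["entities"],
)
prompt = render_prompt(sig, recipe="2 cloves garlic")
print(prompt)
```

The point of the sketch is that the author only states what goes in and what should come out; the wording of the final prompt is generated and can therefore be changed by an optimizer without touching the specification.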
NER using DSPy, extracting food-related entities
In the following sections, we will use this notebook to examine how you can use DSPy for NER use cases. As in the previous NER post, we want to extract different metadata for food items.
Setup environment
The first thing to do is to load the necessary libraries and do any setup of these libraries.
import dspy
from pydantic import BaseModel, Field
from dspy.functional import TypedPredictor
from IPython.display import Markdown, display
from typing import List, Optional, Union
from dotenv import load_dotenv
from devtools import pprint

# Load environment variables (such as the OpenAI API key) from a local .env file
assert load_dotenv()

# Two OpenAI chat models; gpt4 is configured as the default LM
gpt4 = dspy.OpenAI(model="gpt-4-turbo-preview", max_tokens=4096, model_type="chat")
gpt_turbo = dspy.OpenAI(model="gpt-3.5-turbo", max_tokens=4096, model_type="chat")
dspy.settings.configure(lm=gpt4)
Here we use OpenAI for the example. DSPy seems to support many of the big open- and closed-source providers. For implementations, see more here and here.
Data
As in this post, we will use the data shown below:
### Chashu pork (for ramen and more)
Chashu pork is a classic way to prepare pork belly for Japanese dishes such as ramen.
While it takes a little time, it's relatively hands-off and easy, and the result is delicious.

### Ingredients
2 lb pork belly or a little more/less
2 green onions spring onions, or 3 if small
1 in fresh ginger (a chunk that will give around 4 - 6 slices)
2 cloves garlic
⅔ cup sake
⅔ cup soy sauce
¼ cup mirin
½ cup sugar
2 cups water or a little more as needed

### Instructions
Have some string/kitchen twine ready and a pair of scissors before you prepare the pork.
If needed, trim excess fat from the outside of the pork but you still want a later over the
...
We will also use the following Pydantic data models as part of the problem:
# One extracted entity together with the reasoning behind it
class FoodMetaDataItem(BaseModel):
    reasoning: str = Field(description="Reasoning for why the entity is correct")
    value: Union[str, int] = Field(description="Value of the entity")
    entity: str = Field(description="The actual entity i.e. pork, onions etc")


# Wrapper holding all extracted items; this is the type used by the signature
class FoodMetaData(BaseModel):
    context: List[FoodMetaDataItem]
The first model above represents the “reasoning” object as part of the CoT step in the workflow.
class FoodEntity(BaseModel):
    food: str = Field(description="This can be both liquid and solid food such as meat, vegetables, alcohol, etc")
    quantity: int = Field(description="The exact quantity or amount of the food that should be used in the recipe")
    unit: str = Field(description="The unit being used e.g. grams, milliliters, pounds, etc")
    physical_quality: Optional[str] = Field(description="The characteristic of the ingredient")
    color: str = Field(description="The color of the food")


class FoodEntities(BaseModel):
    entities: List[FoodEntity]
The second model above is the schema for the actual metadata that we want to extract. Below is the resulting JSON schema for this object:
{
  "properties": {
    "food": {
      "description": "This can be both liquid and solid food such as meat, vegetables, alcohol, etc",
      "title": "Food",
      "type": "string"
    },
    "quantity": {
      "description": "The exact quantity or amount of the food that should be used in the recipe",
      "title": "Quantity",
      "type": "integer"
    },
    "unit": {
      "description": "The unit being used e.g. grams, milliliters, pounds, etc",
      "title": "Unit",
      "type": "string"
    },
    "physical_quality": {
      "anyOf": [
        {
          "type": "string"
        },
        {
          "type": "null"
        }
      ],
      "description": "The characteristic of the ingredient",
      "title": "Physical Quality"
    },
    "color": {
      "description": "The color of the food",
      "title": "Color",
      "type": "string"
    }
  },
  "required": [
    "food",
    "quantity",
    "unit",
    "physical_quality",
    "color"
  ],
  "title": "FoodEntity",
  "type": "object"
}
Finally, for the teleprompter/optimizer, we need to provide some training examples:
# create some dummy data for training
trainset = [
    dspy.Example(
        recipe="French omelette with 2 eggs, 500grams of butter and 10 grams gruyere",
        entities=[
            FoodEntity(food="eggs", quantity=2, unit="", physical_quality="", color="white"),
            FoodEntity(food="butter", quantity=500, unit="grams", physical_quality="", color="yellow"),
            FoodEntity(food="cheese", quantity=10, unit="grams", physical_quality="gruyere", color="yellow")
        ]
    ).with_inputs("recipe"),
    dspy.Example(
        recipe="200 grams of Ramen noodles bowl with one pickled egg, 500grams of pork, and 1 spring onion",
        entities=[
            FoodEntity(food="egg", quantity=1, unit="", physical_quality="pickled", color="ivory"),
            FoodEntity(food="ramen noodles", quantity=200, unit="grams", physical_quality="", color="yellow"),
            FoodEntity(food="spring onion", quantity=1, unit="", physical_quality="", color="white")
        ]
    ).with_inputs("recipe"),
    dspy.Example(
        recipe="10 grams of dutch orange cheese, 2 liters of water, and 5 ml of ice",
        entities=[
            FoodEntity(food="cheese", quantity=10, unit="grams", physical_quality="", color="orange"),
            FoodEntity(food="water", quantity=2, unit="liters", physical_quality="translucent", color=""),
            FoodEntity(food="ice", quantity=5, unit="milliliters", physical_quality="cold", color="white")
        ]
    ).with_inputs("recipe"),
    dspy.Example(
        # adjacent string literals are joined, so no stray whitespace ends up in the recipe
        recipe=(
            "Pasta carbonara, 250 grams of pasta 300 grams of pancetta, "
            "150 grams pecorino romano, 150grams parmesan cheese, 3 egg yolks"
        ),
        entities=[
            FoodEntity(food="pasta", quantity=250, unit="grams", physical_quality="dried", color="yellow"),
            FoodEntity(food="egg yolk", quantity=3, unit="", physical_quality="", color="orange"),
            FoodEntity(food="pancetta", quantity=300, unit="grams", physical_quality="pork", color=""),
            FoodEntity(food="pecorino", quantity=150, unit="grams", physical_quality="goat cheese", color="yellow"),
            FoodEntity(food="parmesan", quantity=150, unit="grams", physical_quality="cheese", color="yellow"),
        ]
    ).with_inputs("recipe"),
    dspy.Example(
        recipe="American pancakes with 250g flour, 1 tsp baking powder, 1 gram salt, 10g sugar, 100ml fat milk",
        entities=[
            FoodEntity(food="flour", quantity=250, unit="grams", physical_quality="", color="white"),
            FoodEntity(food="baking powder", quantity=1, unit="tsp", physical_quality="", color="white"),
            FoodEntity(food="salt", quantity=1, unit="grams", physical_quality="salty", color="white"),
            FoodEntity(food="milk", quantity=100, unit="ml", physical_quality="fat", color="white"),
        ]
    ).with_inputs("recipe")
]
Signatures
The next step is to create the dspy.Signature objects, where we need to specify an InputField(...) and an OutputField(...). To recap what a Signature is:
Info
A signature is a declarative specification of the input/output behavior of a DSPy module. Signatures allow you to tell the LM what it needs to do, rather than specify how we should ask the LM to do it.
Source: DSPy Signatures, excerpted from https://dspy-docs.vercel.app/docs/building-blocks/signatures
Below are the Signatures we will be using:
class RecipeToFoodContext(dspy.Signature):
    """You are a food AI assistant. Your task is to extract the entity, the value of the entity and the reasoning
    for why the extracted value is the correct value. If you cannot extract the entity, add null"""
    recipe: str = dspy.InputField()
    context: FoodMetaData = dspy.OutputField()


class RecipeToFoodEntities(dspy.Signature):
    """You are a food AI assistant. Your task is to extract food-related metadata from recipes."""
    recipe: str = dspy.InputField()
    entities: FoodEntities = dspy.OutputField()
Notice how compact and modular these definitions are compared to how they would look in other frameworks. Looking into the actual code, you will see that InputField and OutputField are wrappers around the Pydantic Field function:
def InputField(**kwargs):
    return pydantic.Field(**move_kwargs(**kwargs, __dspy_field_type="input"))


def OutputField(**kwargs):
    return pydantic.Field(**move_kwargs(**kwargs, __dspy_field_type="output"))
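The pattern in the wrappers above, attaching an "input" or "output" tag to an otherwise ordinary field, can be illustrated with the standard library alone. The following is only an analogy built on `dataclasses`, not how DSPy or Pydantic work internally; `input_field`, `output_field`, `ToyRecipeSignature` and `roles` are hypothetical names.

```python
from dataclasses import dataclass, field, fields


def input_field(**kwargs):
    """Hypothetical helper: a dataclass field tagged as an input."""
    return field(metadata={"role": "input"}, **kwargs)


def output_field(**kwargs):
    """Hypothetical helper: a dataclass field tagged as an output."""
    return field(metadata={"role": "output"}, **kwargs)


@dataclass
class ToyRecipeSignature:
    recipe: str = input_field(default="")
    entities: str = output_field(default="")


def roles(cls) -> dict:
    """Map each field name to the role stored in its metadata."""
    return {f.name: f.metadata["role"] for f in fields(cls)}


field_roles = roles(ToyRecipeSignature)
print(field_roles)
```

A framework can later read such tags back to decide which fields to fill from the caller and which ones to ask the LM for.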
Modules
The next thing to do is to select which Modules we want to use. To recap what Modules are:
Info
Each built-in module abstracts a prompting technique (like chain of thought or ReAct). Crucially, they are generalized to handle any DSPy Signature. Your init method declares the modules you will use. Your forward method expresses any computation you want to do with your modules.
Source: DSPy Modules, excerpted from https://dspy-docs.vercel.app/docs/building-blocks/modules
The Modules that we will be using are:
- TypedPredictor
- TypedChainOfThought
These are two functional modules that let us specify types via Pydantic schemas, which is useful for structured data extraction. They can be used either with dspy.Functional or with dspy.Module. However, before creating the actual modules, we will define one helper function to parse the output of the context call:
def parse_context(food_context: list) -> str:
    """Render each extracted item as '<entity>:' followed by its JSON representation."""
    context_str = ""
    for context in food_context:
        context_str += f"{context.entity}:\n" + context.model_dump_json(indent=4) + "\n"
    return context_str
This is mainly to extract the resulting context JSON object as a string for the next step of the chain.
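To see what kind of string such a helper produces without calling an LM, here is a minimal standard-library sketch. `ContextItem` and `render_context` are hypothetical stand-ins that use a dataclass and `json.dumps` instead of the Pydantic model, so the exact formatting of the real helper may differ slightly.

```python
import json
from dataclasses import asdict, dataclass
from typing import List


@dataclass
class ContextItem:
    """Hypothetical stand-in for one reasoning object (not the Pydantic model)."""
    reasoning: str
    value: str
    entity: str


def render_context(items: List[ContextItem]) -> str:
    """Render each item as '<entity>:' followed by its indented JSON."""
    parts = []
    for item in items:
        parts.append(f"{item.entity}:\n{json.dumps(asdict(item), indent=4)}\n")
    return "".join(parts)


rendered = render_context(
    [ContextItem(reasoning="2 cloves of garlic are needed.", value="2 cloves", entity="garlic")]
)
print(rendered)
```

Each entity becomes a small labelled JSON block, which is then passed as plain text to the next prediction step.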
Moving on to the actual Modules: using dspy.Module, we define it as:
class ExtractFoodEntities(dspy.Module):
    def __init__(self, temperature: int = 0, seed: int = 123):
        super().__init__()
        self.temperature = temperature
        self.seed = seed
        self.extract_food_context = dspy.TypedPredictor(RecipeToFoodContext)
        self.extract_food_context_cot = dspy.TypedChainOfThought(RecipeToFoodContext)
        self.extract_food_entities = dspy.TypedPredictor(RecipeToFoodEntities)

    def forward(self, recipe: str) -> FoodEntities:
        # Step 1: extract entities together with the reasoning behind each one
        food_context = self.extract_food_context(recipe=recipe).context
        # Step 2: serialize that context to text and extract the structured entities from it
        parsed_context = parse_context(food_context.context)
        food_entities = self.extract_food_entities(recipe=parsed_context)
        return food_entities.entities
Or, using dspy.Functional, we define it as:
from dspy.functional import FunctionalModule, predictor, cot


class ExtractFoodEntitiesV2(FunctionalModule):
    def __init__(self, temperature: int = 0, seed: int = 123):
        super().__init__()
        self.temperature = temperature
        self.seed = seed

    @predictor
    def extract_food_context(self, recipe: str) -> FoodMetaData:
        """You are a food AI assistant. Your task is to extract the entity, the value of the entity and the reasoning
        for why the extracted value is the correct value. If you cannot extract the entity, add null"""
        pass

    @cot
    def extract_food_context_cot(self, recipe: str) -> FoodMetaData:
        """You are a food AI assistant. Your task is to extract the entity, the value of the entity and the reasoning
        for why the extracted value is the correct value. If you cannot extract the entity, add null"""
        pass

    @predictor
    def extract_food_entities(self, recipe: str) -> FoodEntities:
        """You are a food AI assistant. Your task is to extract food entities from a recipe."""
        pass

    def forward(self, recipe: str) -> FoodEntities:
        food_context = self.extract_food_context(recipe=recipe)
        parsed_context = parse_context(food_context.context)
        food_entities = self.extract_food_entities(recipe=parsed_context)
        return food_entities
Using the functional API, we can use some nifty decorator functions, i.e. @predictor and @cot. Now that we have our Module, we might want to test it on some example data. DSPy also lets you open a dspy.context block in which you choose what LLM to use:
extract_food_entities = ExtractFoodEntities()

# Select the LM to use for the calls made inside this block
with dspy.context(lm=gpt4):
    entities = extract_food_entities(
        recipe="Ten grams of orange dutch cheese, 2 liters of water and 5 ml of ice"
    )
    pprint(entities)
This will result in the following entities:
FoodEntities(
    entities=[
        FoodEntity(
            food='orange dutch cheese',
            quantity=10,
            unit='grams',
            physical_quality=None,
            color='orange',
        ),
        FoodEntity(
            food='water',
            quantity=2000,
            unit='milliliters',
            physical_quality=None,
            color='clear',
        ),
        FoodEntity(
            food='ice',
            quantity=5,
            unit='milliliters',
            physical_quality=None,
            color='clear',
        ),
    ],
)
Optimize the program
Now we have all the components we need to start optimizing our program. To recap:
Info
A DSPy optimizer is an algorithm that can tune the parameters of a DSPy program (i.e., the prompts and/or the LM weights) to maximize the metrics you specify, like accuracy.
…
DSPy programs consist of multiple calls to LMs, stacked together as DSPy modules. Each DSPy module has internal parameters of three kinds: (1) the LM weights, (2) the instructions, and (3) demonstrations of the input/output behavior.
Given a metric, DSPy can optimize all of these three with multi-stage optimization algorithms.
Source: DSPy Optimizers, excerpted from https://dspy-docs.vercel.app/docs/building-blocks/optimizers
For the optimization, we will use BootstrapFewShot and the metric below:
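def validate_entities(example, pred, trace=None):
    """Check if the expected and predicted entities are equal"""
    # The program returns a FoodEntities object while the examples store a plain list,
    # so compare against the wrapped list when it is present
    predicted = getattr(pred, "entities", pred)
    return example.entities == predicted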
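Exact equality is a strict metric: a single differing unit or color makes the whole example count as a failure. As a possible alternative, here is a minimal sketch of a partial-credit score, assuming one is happy to compare entities as exact tuples and ignore their order. `ToyEntity` and `entity_f1` are hypothetical names that are not part of DSPy, and how such a score would be plugged into an optimizer is left out here.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ToyEntity:
    """Hypothetical stand-in for FoodEntity, reduced to three fields."""
    food: str
    quantity: int
    unit: str


def entity_f1(expected: List[ToyEntity], predicted: List[ToyEntity]) -> float:
    """F1 score over exact entity matches, ignoring order."""
    if not expected or not predicted:
        return 0.0
    matches = len(set(expected) & set(predicted))
    if matches == 0:
        return 0.0
    precision = matches / len(set(predicted))
    recall = matches / len(set(expected))
    return 2 * precision * recall / (precision + recall)


expected = [ToyEntity("cheese", 10, "grams"), ToyEntity("water", 2, "liters")]
predicted = [ToyEntity("cheese", 10, "grams"), ToyEntity("water", 2000, "milliliters")]
score = entity_f1(expected, predicted)
print(score)  # one of two entities matches on both sides -> 0.5
```

With a score like this, a prediction that expresses 2 liters as 2000 milliliters still gets credit for the entities it did match, although a real metric would probably also want to normalize units first.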
To run the optimization step, we use the compile method:
from dspy.teleprompt import BootstrapFewShot

teleprompter = BootstrapFewShot(metric=validate_entities)
compiled_ner = teleprompter.compile(ExtractFoodEntitiesV2(), trainset=trainset)
The compiled program is something we can store on disk and load again for later use.
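Conceptually, a bootstrapping optimizer runs the program on the training examples and keeps the runs that pass the metric as few-shot demonstrations for later prompts. The toy sketch below illustrates only that selection loop with plain Python; it is not DSPy's implementation, and `bootstrap_demos`, `toy_program` and the string-based examples are hypothetical simplifications.

```python
from typing import Callable, List, Tuple

Example = Tuple[str, str]  # (input text, expected output)


def bootstrap_demos(
    program: Callable[[str], str],
    trainset: List[Example],
    metric: Callable[[str, str], bool],
    max_demos: int = 4,
) -> List[Example]:
    """Keep (input, output) pairs whose program output passes the metric."""
    demos: List[Example] = []
    for text, expected in trainset:
        predicted = program(text)
        if metric(expected, predicted):
            demos.append((text, predicted))
        if len(demos) >= max_demos:
            break
    return demos


def toy_program(text: str) -> str:
    """Hypothetical stand-in for an LM call: returns the first word, lowercased."""
    return text.split()[0].lower()


toy_trainset = [("Eggs 2", "eggs"), ("Fresh ginger", "ginger"), ("Butter 500g", "butter")]
demos = bootstrap_demos(toy_program, toy_trainset, metric=lambda exp, pred: exp == pred)
print(demos)
```

Only the examples the program already solves correctly survive as demonstrations, which is why the choice of metric matters so much for what the compiled program ends up learning from.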
To use the compiled program on our dataset, we do:
# train_data is assumed to hold the recipe text shown in the Data section
pprint(compiled_ner(recipe=train_data))

FoodEntities(
    entities=[
        FoodEntity(
            food='pork belly',
            quantity=2,
            unit='lb',
            physical_quality=None,
            color='',
        ),
        FoodEntity(
            food='green onions',
            quantity=2,
            unit='items',
            physical_quality='or 3 if small',
            color='',
        ),
        FoodEntity(
            food='fresh ginger',
            quantity=1,
            unit='inch',
            physical_quality='chunk',
            color='',
        ),
        FoodEntity(
            food='garlic',
            quantity=2,
            unit='cloves',
            physical_quality=None,
            color='',
        ),
        FoodEntity(
            food='sake',
            quantity=2,
            unit='⅔ cup',
            physical_quality=None,
            color='',
        ),
        FoodEntity(
            food='soy sauce',
            quantity=2,
            unit='⅔ cup',
            physical_quality=None,
            color='',
        ),
        FoodEntity(
            food='mirin',
            quantity=1,
            unit='¼ cup',
            physical_quality=None,
            color='',
        ),
        FoodEntity(
            food='sugar',
            quantity=1,
            unit='½ cup',
            physical_quality=None,
            color='',
        ),
        FoodEntity(
            food='water',
            quantity=2,
            unit='cups',
            physical_quality='or a little more as needed',
            color='',
        ),
    ],
)
Not too bad after doing some coding for a couple of hours. Finally, to inspect the resulting prompt used by the program, each LM has an inspect_history method:
gpt4.inspect_history(n=1)
Which outputs the prompt below:
You are a food AI assistant. Your task is to extract food entities from a recipe.

---

Follow the following format.

Recipe: ${recipe}
Extract Food Entities: ${extract_food_entities}. Respond with a single JSON object. JSON Schema: {"$defs": {"FoodEntity": {"properties": {"food": {"description": "This can be both liquid and solid food such as meat, vegetables, alcohol, etc", "title": "Food", "type": "string"}, "quantity": {"description": "The exact quantity or amount of the food that should be used in the recipe", "title": "Quantity", "type": "integer"}, "unit": {"description": "The unit being used e.g. grams, milliliters, pounds, etc", "title": "Unit", "type": "string"}, "physical_quality": {"anyOf": [{"type": "string"}, {"type": "null"}], "description": "The characteristic of the ingredient", "title": "Physical Quality"}, "color": {"description": "The color of the food", "title": "Color", "type": "string"}}, "required": ["food", "quantity", "unit", "physical_quality", "color"], "title": "FoodEntity", "type": "object"}}, "properties": {"entities": {"items": {"$ref": "#/$defs/FoodEntity"}, "title": "Entities", "type": "array"}}, "required": ["entities"], "title": "FoodEntities", "type": "object"}

---

Recipe:
pork belly:
{
    "reasoning": "The recipe specifies using 2 lb of pork belly as the main ingredient for the chashu pork.",
    "value": "2 lb",
    "entity": "pork belly"
}
green onions:
{
    "reasoning": "The recipe calls for 2 green onions, or 3 if they are small, to be used in the cooking process.",
    "value": "2 or 3",
    "entity": "green onions"
}
fresh ginger:
{
    "reasoning": "A 1 inch chunk of fresh ginger is required, which will give around 4 - 6 slices for the recipe.",
    "value": "1 in",
    "entity": "fresh ginger"
}
garlic:
{
    "reasoning": "2 cloves of garlic are needed as part of the ingredients.",
    "value": "2 cloves",
    "entity": "garlic"
}
sake:
{
    "reasoning": "⅔ cup of sake is used in the cooking liquid for flavor.",
    "value": "⅔ cup",
    "entity": "sake"
}
soy sauce:
{
    "reasoning": "⅔ cup of soy sauce is added to the cooking liquid, contributing to the dish's savory taste.",
    "value": "⅔ cup",
    "entity": "soy sauce"
}
mirin:
{
    "reasoning": "¼ cup of mirin is included in the recipe for sweetness and depth of flavor.",
    "value": "¼ cup",
    "entity": "mirin"
}
sugar:
{
    "reasoning": "½ cup of sugar is used to sweeten the cooking liquid.",
    "value": "½ cup",
    "entity": "sugar"
}
water:
{
    "reasoning": "2 cups of water (or a little more as needed) are required to create the cooking liquid for the pork.",
    "value": "2 cups",
    "entity": "water"
}

Extract Food Entities: ```json
{
  "entities": [
    {
      "food": "pork belly",
      "quantity": 2,
      "unit": "lb",
      "physical_quality": null,
      "color": ""
    },
    {
      "food": "green onions",
      "quantity": 2,
      "unit": "items",
      "physical_quality": "or 3 if small",
      "color": ""
    },
    {
      "food": "fresh ginger",
      "quantity": 1,
      "unit": "inch",
      "physical_quality": "chunk",
      "color": ""
    },
    {
      "food": "garlic",
      "quantity": 2,
      "unit": "cloves",
      "physical_quality": null,
      "color": ""
    },
    {
      "food": "sake",
      "quantity": 2,
      "unit": "⅔ cup",
      "physical_quality": null,
      "color": ""
    },
    {
      "food": "soy sauce",
      "quantity": 2,
      "unit": "⅔ cup",
      "physical_quality": null,
      "color": ""
    },
    {
      "food": "mirin",
      "quantity": 1,
      "unit": "¼ cup",
      "physical_quality": null,
      "color": ""
    },
    {
      "food": "sugar",
      "quantity": 1,
      "unit": "½ cup",
      "physical_quality": null,
      "color": ""
    },
    {
      "food": "water",
      "quantity": 2,
      "unit": "cups",
      "physical_quality": "or a little more as needed",
      "color": ""
    }
  ]
}
```
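In the output above, the fractional cup amounts end up inside the unit string while quantity holds an integer, which follows from quantity being declared as an integer in the schema. One way to handle this would be to normalize amounts before or after extraction. Below is a minimal sketch using only the standard library; `parse_amount` is a hypothetical helper and assumes that whole numbers and fraction glyphs are separated by whitespace.

```python
import unicodedata
from fractions import Fraction


def parse_amount(text: str) -> Fraction:
    """Parse amounts such as '2', '1 ½', '3/4' or '¼' into an exact Fraction."""
    total = Fraction(0)
    for token in text.split():
        if token.isascii():
            # Plain numbers like '2', '0.5' or '3/4'
            total += Fraction(token)
        else:
            # Single vulgar-fraction glyphs like '½'; round away float noise
            total += Fraction(unicodedata.numeric(token)).limit_denominator(100)
    return total


quarter = parse_amount("¼")
one_and_half = parse_amount("1 ½")
print(quarter, one_and_half, float(one_and_half))
```

Declaring quantity as a float (or keeping the raw string next to a parsed value) would then let the schema represent such amounts without losing information.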
Closing remarks
The aim of this post was for me to familiarize myself a bit more with DSPy, which seems to be all the ‘rage’ lately. This is only an initial attempt to get a first understanding of what you can and cannot do. Hopefully, this will give you some insights into how you can get started with DSPy as well.
To summarize my first impressions:
- Easy to use and get started with
- Nice to not have to spend hours on prompt engineering
- Nice to treat this as a “typical” ML problem using optimization
- Still evolving and not “production ready” yet
- Needs better logging / tracing to make it easier to understand what is happening when you are debugging your programs
All in all, the approach of programming rather than prompting really resonates with me and is in line with the current trend of Compound AI systems. It will be exciting to follow how this package evolves and matures over time, as prompting in its current state is not likely the future of building scalable, non-fragile and resilient systems using LLMs.