Overview
Here is a brief overview of the different concepts and how they work together. Please see the relevant sections in the sidebar for more details.
In Opper, a call is the most primitive way of interacting with a generative model.
Opper calls are structured. This means that you need to declare input and output schemas whenever you want a model to do something.
Behind the scenes, the Opper API constructs prompts that get the model to generate outputs aligned with the declared schemas. The platform implements best and emerging practices for structured generation, ensuring that output consistently matches the desired schema: optimizing prompts per model, performing retries with intelligent feedback, and more. A useful property of the platform is that every action it takes is fully observable in the form of traces.
Here is an example of a structured call that translates text to a given language:
Python:

import asyncio

from opperai import AsyncOpper
from pydantic import BaseModel


class Translation(BaseModel):
    from_language: str
    to_language: str
    translated_text: str


async def translate(text: str, target_language: str):
    opper = AsyncOpper()

    result, _ = await opper.call(
        name="translate",
        model="anthropic/claude-3-haiku",
        instructions="Translate the given text to the target language.",
        input={
            "text": text,
            "target_language": target_language
        },
        output_type=Translation,
    )
    print(result.translated_text)

asyncio.run(translate("Hello, how are you?", "French"))
TypeScript:

import Client from "opperai";

const client = new Client();

(async () => {
  const { json_payload } = await client.call({
    name: "translate",
    input: {
      text: "Hello, how are you?",
      target_language: "French"
    },
    output_schema: {
      type: "object",
      properties: {
        from_language: { type: "string" },
        to_language: { type: "string" },
        translated_text: { type: "string" }
      },
      required: ["from_language", "to_language", "translated_text"]
    },
    model: "anthropic/claude-3-haiku",
    instructions: "Translate the given text to the target language.",
  });
  console.log(json_payload);
})();
This declarative nature of calls is quite different from standard model APIs, which are typically optimized for chat, with some structured output functionality added on top.
Defaulting to structured input and output brings significant benefits:
- They are model independent. Since you have clearly declared what you want, you can switch models without having to describe the task differently (see the sketch after this list).
- They work across modalities. Other modalities like voice, vision, and video don't fit chat as well as text. By starting from structure, you are setting yourself up for a better experience.
- They are much easier to test, debug, and optimize as they are very clear in their task, input, and output.
- They still allow you to build your own classes for chat messages and other scenarios, but in a way that fits your use case exactly.
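To illustrate the first point, here is a minimal sketch in Python: the same declared task is pointed at a different model by changing only the model argument. The second model identifier is illustrative; substitute any model available on your Opper account.

# A minimal sketch of model independence: the task declaration stays
# identical, only the model string changes. "openai/gpt-4o" is an
# illustrative identifier, not a recommendation.
import asyncio

from opperai import AsyncOpper
from pydantic import BaseModel


class Translation(BaseModel):
    from_language: str
    to_language: str
    translated_text: str


async def compare_models():
    opper = AsyncOpper()

    for model in ["anthropic/claude-3-haiku", "openai/gpt-4o"]:
        result, _ = await opper.call(
            name="translate",
            model=model,
            instructions="Translate the given text to the target language.",
            input={"text": "Hello, how are you?", "target_language": "French"},
            output_type=Translation,
        )
        print(f"{model}: {result.translated_text}")

asyncio.run(compare_models())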
In Opper, an index is essentially a database with advanced semantic retrieval capabilities.
Indexes allow you to store data, knowledge, and information, and to retrieve it based on semantic similarity. This means that you can query the database not only for keywords but also for related concepts.
For example, let's say you index large amounts of financial data and want to find the most relevant information for the question "What is our ARR?"
You would do something like this to retrieve the 3 most relevant chunks of information:
Python:

import asyncio

from opperai import AsyncOpper, AsyncIndex

opper = AsyncOpper()


async def prepare_index():
    index = await opper.indexes.create("financial_information")
    await index.add("Our ARR is 1,400,000 Euro as per the latest financials.")
    return index


async def call_with_index(index: AsyncIndex, user_question: str):
    # Retrieve the 3 most semantically relevant chunks for the question
    knowledge = await index.query(
        query=user_question,
        k=3
    )

    result, _ = await opper.call(
        name="respond",
        instructions="Respond to the user question using only the knowledge provided.",
        input={
            "user_question": user_question,
            "knowledge": knowledge
        },
        output_type=str,
    )
    print(result)


async def main():
    index = await prepare_index()
    await call_with_index(index, "What is our ARR?")

asyncio.run(main())
TypeScript:

import Client from "opperai";

const client = new Client();

(async () => {
  // Get the index, or create it if it does not exist yet
  let index = await client.indexes.get("financial-information");
  if (!index) {
    index = await client.indexes.create("financial-information");
  }

  await index.add({
    content: "Our ARR is 1,400,000 Euro as per the latest financials."
  });

  // Retrieve the 3 most semantically relevant chunks for the question
  const knowledge = await index.query({
    query: "What is our ARR?",
    k: 3
  });

  const { json_payload } = await client.call({
    name: "respond",
    input: {
      user_question: "What is our ARR?",
      knowledge: knowledge
    },
    output_schema: {
      type: "string"
    }
  });
  console.log(json_payload);
})();
A trace is a record of a call to a model. It contains the input, output, and other metadata. All interactions with the Opper platform are traced, and you can inspect traces to deeply understand, debug, and correct your AI implementation.
Traces are very flexible and can be used for things like session logging and debugging. For example, a trace of a complete end-user session in a chat app could look like this:
Python:

import asyncio
import uuid

from opperai import AsyncOpper
from pydantic import BaseModel

opper = AsyncOpper()


class Translation(BaseModel):
    from_language: str
    to_language: str
    translated_text: str


async def chat_session():
    session_id = uuid.uuid4()

    # Every call made inside this context manager is traced as part of the session
    async with opper.traces.start(f"session {session_id}") as session:
        result, _ = await opper.call(
            name="translate",
            instructions="Translate the given text to the target language.",
            input={
                "text": "Hello, how are you?",
                "target_language": "French"
            },
            output_type=Translation,
        )
        print(result.translated_text)

asyncio.run(chat_session())
TypeScript:

import Client from "opperai";

const client = new Client();

(async () => {
  const input = {
    text: "Hello, how are you?",
    target_language: "French"
  };

  // Start a trace that will hold all spans in this session
  const session = await client.traces.start({
    name: "session 1",
    description: "A chat session with the user",
    input: input,
  });

  const { json_payload } = await client.call({
    name: "translate",
    input: input,
    output_schema: {
      type: "object",
      properties: {
        from_language: { type: "string" },
        to_language: { type: "string" },
        translated_text: { type: "string" }
      },
      required: ["from_language", "to_language", "translated_text"]
    },
    // Attach this call to the session trace
    parent_span_uuid: session.uuid,
  });

  await session.end({
    output: json_payload
  });

  console.log(json_payload.translated_text);
})();
Having traces available from the start is a great benefit, as it lets you experiment with different prompts, models, and settings faster and see exactly what the models are doing. In production, traces are essential: they are the only way to see how your AI feature behaves and how it interacts with your users.
A metric in Opper is a piece of data you can attach to a trace to record how well a given call performed. It is a simple way to build a feedback loop between your feature and yourself.
Being able to attach metrics to traces allows you to build features such as thumbs-up buttons or custom evaluations of calls, and to save the results in the context of the call for later action.
For example, to attach a thumbs-up action to a call, you would do something like this:
Python:

import asyncio

from opperai import AsyncOpper

opper = AsyncOpper()


async def thumbs_up():
    result, response = await opper.call(
        name="translate",
        instructions="Translate the given text to the target language.",
        input={
            "text": "Hello, how are you?",
            "target_language": "French"
        },
    )
    print(result)

    # Attach a metric to the span of this call
    await response.span.save_metric("thumbs_up", 1, "User pressed thumbs up button")

asyncio.run(thumbs_up())
TypeScript:

import Client from "opperai";

const client = new Client();

(async () => {
  const { message, span_id } = await client.call({
    name: "translate",
    input: {
      text: "Hello, how are you?",
      target_language: "French"
    },
  });

  await client.spans.saveMetric(span_id, {
    dimension: "thumbs_up",
    value: 1,
    comment: "User pressed thumbs up button"
  });

  console.log(message);
})();
In Opper, an example is a record of a perfect execution of a call. An example always includes an input and output pair, and examples let you show the model what perfect execution looks like for a given task and set of inputs.
This technique is called few-shot prompting or in-context learning, and it is very powerful for steering the model to do what you want.
Opper supports examples in a call, but also offers a more automated way through datasets.
To provide examples with a call, you can do something like this:
Python:

import asyncio

from opperai import AsyncOpper
from opperai.types import Example

opper = AsyncOpper()


async def greet_user():
    result, _ = await opper.call(
        name="greet",
        instructions="Greet the user in a friendly manner.",
        input={
            "user_name": "John@example.com",
            "first_name": "John",
            "last_name": "Doe",
            "last_login": "2024-04-21"
        },
        examples=[
            Example(
                input={
                    "user_name": "emmy@example.com",
                    "first_name": "Emmy",
                    "last_name": "Emilisson",
                    "last_login": "2024-01-01"
                },
                output="Hello Emmy, welcome back to Opper!"
            ),
            Example(
                input={
                    "user_name": "Jane@example.com",
                    "first_name": "Jane",
                    "last_name": "Jones",
                    "last_login": None
                },
                output="Hello Jane, welcome to Opper!"
            )
        ],
        output_type=str,
    )
    print(result)

asyncio.run(greet_user())
TypeScript:

import Client from "opperai";

const client = new Client();

(async () => {
  const { message } = await client.call({
    name: "greet",
    instructions: "Greet the user in a friendly manner.",
    input: {
      user_name: "John@example.com",
      first_name: "John",
      last_name: "Doe",
      last_login: "2024-04-21"
    },
    examples: [
      {
        input: {
          user_name: "emmy@example.com",
          first_name: "Emmy",
          last_name: "Emilisson",
          last_login: "2024-01-01"
        },
        output: "Hello Emmy, welcome back to Opper!"
      },
      {
        input: {
          user_name: "Jane@example.com",
          first_name: "Jane",
          last_name: "Jones",
          last_login: null
        },
        output: "Hello Jane, welcome to Opper!"
      }
    ],
  });
  console.log(message);
})();
This lets the model see how you would like the output to look for similar inputs. The technique can be used to steer output format, as shown above, but it is even more powerful for reasoning-intensive tasks where you want to show the model how it should think, as in the sketch below.
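As a rough sketch of that reasoning-oriented use in Python: give the output type a field for intermediate thinking and fill that field in in your examples. The triage task, the TicketTriage model, and its reasoning field below are illustrative assumptions, not part of the Opper API, and we assume an Example's output accepts structured data just as its input does.

# A rough sketch of steering reasoning with examples. The task and the
# TicketTriage model (including the "reasoning" field) are illustrative
# assumptions; the pattern is an output type with a field for
# intermediate thinking, demonstrated in each example.
import asyncio

from opperai import AsyncOpper
from opperai.types import Example
from pydantic import BaseModel

opper = AsyncOpper()


class TicketTriage(BaseModel):
    reasoning: str  # how the priority was decided
    priority: str


async def triage_ticket():
    result, _ = await opper.call(
        name="triage",
        instructions="Assign a priority (low, medium, high) to the support ticket.",
        input={"ticket": "Our production API has been down for an hour."},
        examples=[
            Example(
                input={"ticket": "How do I change my avatar?"},
                output={
                    "reasoning": "Cosmetic question, nothing is blocked, no urgency.",
                    "priority": "low",
                },
            ),
            Example(
                input={"ticket": "Checkout fails for some users since this morning."},
                output={
                    "reasoning": "Revenue-impacting defect affecting live users.",
                    "priority": "high",
                },
            ),
        ],
        output_type=TicketTriage,
    )
    print(result.reasoning)
    print(result.priority)

asyncio.run(triage_ticket())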
For more advanced usage of examples, use Datasets with automatic few-shot retrieval, which populates calls with the most semantically relevant examples for each input.
Evaluations are AI-powered observations that the Opper platform performs on calls, identifying whether a call performs well based on its inputs, outputs, and instructions.
The benefit of evaluations is that they give you a view of high- and low-performing AI calls, and a way to see what is working and what is not.