The knowledge endpoint allows you to store information from text chunks, PDFs, and documents, making it available for retrieval in relevant segments. This is useful for building knowledge-aware assistants, providing memory for agents, or functioning as a datastore with enhanced semantic retrieval capabilities.
When you upload documents or data to the knowledge API, they are processed by chunking the content into manageable segments and generating vector embeddings for each chunk. This process may vary slightly depending on whether the content is a file or plain text.

When querying the knowledge store, the query is embedded using the same model as during the storage phase. The system then retrieves the chunks with the most similar vector embeddings, that is, the ones most semantically relevant to the query. Before returning the results, a reranking step is performed to reorder the retrieved elements so that the most relevant appear first.
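To make the chunk, embed, and retrieve flow concrete, here is a minimal, self-contained sketch. It is not how the knowledge API is implemented: the word-count "embedding", the `max_words` chunk size, and every function name below are assumptions chosen for illustration, and the reranking step is left out entirely.

```python
import math
import re
from collections import Counter


def chunk_text(text: str, max_words: int = 40) -> list[str]:
    """Split text into chunks of at most max_words words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def embed(text: str) -> Counter:
    """Toy embedding: a bag-of-words count vector standing in for a real embedding model."""
    return Counter(re.findall(r"[a-z0-9']+", text.lower()))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def build_index(documents: list[str], max_words: int = 40) -> list[tuple[str, Counter]]:
    """Storage phase: chunk every document and embed each chunk."""
    index = []
    for doc in documents:
        for chunk in chunk_text(doc, max_words):
            index.append((chunk, embed(chunk)))
    return index


def query_index(index: list[tuple[str, Counter]], query: str, top_k: int = 3) -> list[tuple[float, str]]:
    """Query phase: embed the query the same way and return the top_k most similar chunks."""
    query_vec = embed(query)
    scored = [(cosine_similarity(query_vec, vec), chunk) for chunk, vec in index]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


# Illustrative documents, not real data
documents = [
    "The user could not log in because the account credentials were rejected.",
    "The invoice was sent to the wrong billing address and had to be reissued.",
]
index = build_index(documents)
results = query_index(index, "log in credentials rejected", top_k=1)
print(results[0][1])
```

A real embedding model captures meaning rather than word overlap, so a query such as "Can't login" can match a chunk that shares no exact words with it. The toy version above only matches on shared tokens.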
You can upload files such as PDFs, or just plain text strings of arbitrary length.

Here is an example where we index a set of support tickets:
from opperai import Opper
# Our SDK supports Pydantic to provide structured output
from pydantic import BaseModel
from typing import Literal
import os

opper = Opper(http_bearer=os.getenv("OPPER_API_KEY"))

# Define the support ticket structure
class SupportTicket(BaseModel):
    ticket_id: str
    issue_description: str
    issue_resolution: str
    status: Literal['open', 'in_progress', 'resolved', 'closed']

def main():
    # Look up the knowledge base by name and create it if it does not exist
    kb = opper.knowledge.get_by_name(knowledge_base_name="Tickets")
    if not kb:
        kb = opper.knowledge.create(
            name="Tickets"
        )

    ticket = SupportTicket(
        ticket_id="123",
        issue_description="I'm having trouble accessing my account. Whenever I try to log in, I receive an error message stating that my credentials are incorrect. I have tried resetting my password multiple times, but the issue persists. Please assist in resolving this matter as soon as possible.",
        issue_resolution="The issue was resolved by verifying the user's identity and resetting the account credentials from the backend. The user was able to log in successfully after the credentials were reset.",
        status="resolved"
    )

    # Index the ticket as JSON, with metadata attached
    opper.knowledge.add(
        knowledge_base_id=kb.id,
        key=ticket.ticket_id,  # unique key, will overwrite existing data with that key
        content=ticket.model_dump_json(),
        metadata={
            "source": "our_ticket_system",
            "status": ticket.status
        }
    )

    # Retrieve the three most relevant entries for the query
    res = opper.knowledge.query(knowledge_base_id=kb.id, query="Can't login", top_k=3)
    print(res)

main()
This yields
{
    id='06ffa4f6-170c-4931-8675-9bc4f53c2a2d',
    key='06ffa4f6-170c-4931-8675-9bc4f53c2a2d',
    content='{"ticket_id":"123","issue_description":"I\'m having trouble accessing my account. Whenever I try to log in, I receive an error message stating that my credentials are incorrect. I have tried resetting my password multiple times, but the issue persists. Please assist in resolving this matter as soon as possible.","issue_resolution":"The issue was resolved by verifying the user\'s identity and resetting the account credentials from the backend. The user was able to log in successfully after the credentials were reset.","status":"resolved"}',
    metadata={'source': 'our_ticket_system', 'status': 'resolved', 'priority': 'high', 'customer_name': 'John Doe'},
    score=14.421875
}
For indexing files like PDFs and other documents, please refer to the API reference for more detailed instructions.
Retrieval results can be used as context for task completions, like this:
class SuggestResolution(BaseModel):
    thoughts: str
    message: str
    reference_ticket_ids: list[int]

# `tickets` holds the retrieval results from the knowledge query
completion = opper.call(
    name="suggest_resolution",
    instructions="Given a user question and a list of potentially relevant past tickets, provide a suggestion for a resolution to the support agent",
    input={
        "past_tickets": tickets,
        "user_issue": "Can't login"
    },
    output_schema=SuggestResolution
)

print(completion.json_payload)
This yields
{
    'thoughts': "The user's issue seems similar to a previous ticket where the problem involved login errors and resetting credentials. It would be prudent to verify the user's identity and consider backend credential resetting, as was effective in the past case.",
    'message': "Based on a similar past ticket, this issue might be resolved by verifying the user's identity and resetting their credentials at the backend. Once complete, guide the user to attempt logging in with the new credentials.",
    'reference_ticket_ids': [123]
}
Here we pass the retrieval results directly as context to the task. The results may include metadata that the task does not need, so it is often better to pass only the relevant parts of each result.
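One way to trim the context is sketched below. It assumes each result exposes `key`, `content`, `metadata`, and `score`, mirroring the printed output earlier on this page. The `RetrievedItem` class and the `to_task_context` helper are hypothetical stand-ins written for this illustration, not part of the SDK, and the sample ticket is made up.

```python
import json
from dataclasses import dataclass, field


@dataclass
class RetrievedItem:
    """Hypothetical stand-in for one retrieval result (fields mirror the printed output)."""
    key: str
    content: str
    metadata: dict = field(default_factory=dict)
    score: float = 0.0


def to_task_context(items, fields=("ticket_id", "issue_description", "issue_resolution")):
    """Keep only the ticket fields the task needs; drop metadata and scores."""
    context = []
    for item in items:
        ticket = json.loads(item.content)  # content was stored as a JSON string
        context.append({name: ticket[name] for name in fields if name in ticket})
    return context


# Illustrative retrieval result, not real data
retrieved = [
    RetrievedItem(
        key="123",
        content=json.dumps({
            "ticket_id": "123",
            "issue_description": "Cannot log in.",
            "issue_resolution": "Credentials were reset from the backend.",
            "status": "resolved",
        }),
        metadata={"source": "our_ticket_system", "status": "resolved"},
        score=14.4,
    )
]

past_tickets = to_task_context(retrieved)
print(past_tickets)
```

The resulting `past_tickets` list could then be passed as the `past_tickets` input of the task in place of the raw results, which keeps the prompt smaller and avoids exposing fields the task has no use for.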