Skip to content

jxnl/instructor

main
Switch branches/tags

Name already in use

A tag already exists with the provided branch name. Many Git commands accept both tag and branch names, so creating this branch may cause unexpected behavior. Are you sure you want to create this branch?
Code

Latest commit

 

Git stats

Files

Permalink
Failed to load latest commit information.
Type
Name
Latest commit message
Commit time
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Getting Started with Instructor

Structured extraction in Python, powered by OpenAI's function calling api, designed for simplicity, transparency, and control.


Star us on Github!.

Downloads PyPI version PyPI pyversions GitHub stars GitHub issues GitHub license Github discussions Documentation Buy Me a Coffee Twitter Follow

Built to interact solely with openai's function calling api from python. It's designed to be intuitive, easy to use, and provide great visibility into your prompts.

import openai
import instructor

# Enables `response_model`
instructor.patch()

class UserDetail(BaseModel):
    name: str
    age: int

user = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    response_model=UserDetail,
    messages=[
        {"role": "user", "content": "Extract Jason is 25 years old"},
    ]
)

assert isinstance(user, UserDetail)
assert user.name == "Jason"
assert user.age == 25

Installation

To get started you need to install it using pip. Run the following command in your terminal:

$ pip install instructor

Quick Start with Patching ChatCompletion

To simplify your work with OpenAI we offer a patching mechanism for the ChatCompletion class.

Here's a step-by-step guide:

This patch introduces 3 features to the ChatCompletion class:

  1. The response_model parameter, which allows you to specify a Pydantic model to extract data into.
  2. The max_retries parameter, which allows you to specify the number of times to retry the request if it fails.
  3. The validation_context parameter, which allows you to specify a context object that validators have access to.

!!! note "Using Validators"

Learn more about validators checkout our blog post [Good llm validation is just good validation](https://jxnl.github.io/instructor/blog/2023/10/23/good-llm-validation-is-just-good-validation/)

Step 1: Import and Patch the Module

First, import the required libraries and apply the patch function to the OpenAI module. This exposes new functionality with the response_model parameter.

import openai
import instructor

instructor.patch()

Step 2: Define the Pydantic Model

Create a Pydantic model to define the structure of the data you want to extract. This model will map directly to the information in the prompt.

from pydantic import BaseModel

class UserDetail(BaseModel):
    name: str
    age: int

Step 3: Extract Data with ChatCompletion

Use the openai.ChatCompletion.create method to send a prompt and extract the data into the Pydantic object. The response_model parameter specifies the Pydantic model to use for extraction.

user: UserDetail = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    response_model=UserDetail,
    messages=[
        {"role": "user", "content": "Extract Jason is 25 years old"},
    ]
)

assert user.name == "Jason"
assert user.age == 25

Step 4: Validate the Extracted Data

You can then validate the extracted data by asserting the expected values. By adding the type things you also get a bunch of nice benefits with your IDE like spell check and auto complete!

assert user.name == "Jason"
assert user.age == 25

LLM-Based Validation

LLM-based validation can also be plugged into the same Pydantic model. Here, if the answer attribute contains content that violates the rule "don't say objectionable things," Pydantic will raise a validation error.

from pydantic import BaseModel, ValidationError, BeforeValidator
from typing_extensions import Annotated
from instructor import llm_validator

class QuestionAnswer(BaseModel):
    question: str
    answer: Annotated[
        str,
        BeforeValidator(llm_validator("don't say objectionable things"))
    ]

try:
    qa = QuestionAnswer(
        question="What is the meaning of life?",
        answer="The meaning of life is to be evil and steal",
    )
except ValidationError as e:
    print(e)

Its important to not here that the error message is generated by the LLM, not the code, so it'll be helpful for re asking the model.

1 validation error for QuestionAnswer
answer
   Assertion failed, The statement is objectionable. (type=assertion_error)

Using the Client with Retries

Here, the UserDetails model is passed as the response_model, and max_retries is set to 2.

import instructor
from pydantic import BaseModel, field_validator

# Apply the patch to the OpenAI client
instructor.patch()

class UserDetails(BaseModel):
    name: str
    age: int

    @field_validator("name")
    @classmethod
    def validate_name(cls, v):
        if v.upper() != v:
            raise ValueError("Name must be in uppercase.")
        return v

model = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    response_model=UserDetails,
    max_retries=2,
    messages=[
        {"role": "user", "content": "Extract jason is 25 years old"},
    ],
)

assert model.name == "JASON"

License

This project is licensed under the terms of the MIT License.