Skip to content

LLM Utilities

pyconveyor's LLM utilities are importable independently for scripts that don't need the full pipeline runner.

from pyconveyor.llm import make_client, call_llm, probe_json_mode, extract_json
from pyconveyor.prompt import render_prompt, render_prompt_string

make_client

Creates an OpenAI-compatible client.

from pyconveyor.llm import make_client

client = make_client(
    base_url="https://api.openai.com/v1",
    api_key="sk-...",
)

Also accepts timeout and any keyword arguments that openai.OpenAI supports.


probe_json_mode

Tests whether a model supports JSON mode (response_format={"type": "json_object"}).

from pyconveyor.llm import make_client, probe_json_mode

client = make_client(base_url="...", api_key="...")
supported = probe_json_mode(client, "gpt-4o-mini", timeout=30)
# True or False

Sends a minimal test request. Falls back gracefully — returns False if the endpoint doesn't support it, rather than raising.


call_llm

Calls the model with a messages array and returns the raw response string.

from pyconveyor.llm import make_client, call_llm

client = make_client(base_url="...", api_key="...")

response = call_llm(
    client,
    messages=[{"role": "user", "content": "Extract the key points from: ..."}],
    model="gpt-4o-mini",
    timeout=120,
    json_mode=True,         # use response_format={"type": "json_object"}
    temperature=0.0,
    max_tokens=2048,
)
# response is a str — the model's raw output

Parameters

Parameter Description
client An openai.OpenAI client (from make_client)
messages List of {"role": ..., "content": ...} dicts
model Model name string
timeout Request timeout in seconds
json_mode Whether to use response_format={"type": "json_object"}
temperature Sampling temperature (optional)
top_p Top-p sampling (optional)
max_tokens Max response tokens (optional)
seed Random seed for reproducibility (optional)
extra_params Dict of additional parameters passed through to the API

extract_json

Extracts a JSON object from a string that may contain surrounding prose, markdown fences, or other noise.

from pyconveyor.llm import extract_json

raw = '''
Here is the extracted data:
```json
{"title": "Example", "key_points": ["Point one"]}
'''

data = extract_json(raw)

{"title": "Example", "key_points": ["Point one"]}

Handles common model output patterns:

- Fenced code blocks (` ```json ... ``` ` or ` ``` ... ``` `)
- Prose before/after the JSON object
- BOM characters
- Trailing commas (best-effort)

Raises `ValueError` if no valid JSON object can be found.

`extract_json` is the default parser for `llm` steps. You only need to call it directly if you're using the utilities standalone or writing a custom parser.

---

## `render_prompt`

Renders a Jinja2 template file with context variables.

```python
from pyconveyor.prompt import render_prompt

prompt = render_prompt(
    "prompts/",           # template directory
    "extract.j2",         # template filename
    document=text,        # keyword args become template variables
    mode="detailed",
)

The template receives all keyword arguments as top-level variables:

{# prompts/extract.j2 #}
Extract {{ mode }} information from:

{{ document }}

render_prompt_string

Renders a Jinja2 template from a string rather than a file.

from pyconveyor.prompt import render_prompt_string

template = "Extract information from: {{ document }}"
prompt = render_prompt_string(template, document=text)

Using utilities standalone

A complete extraction script without the pipeline runner:

from pyconveyor.llm import make_client, call_llm, probe_json_mode, extract_json
from pyconveyor.prompt import render_prompt
from schemas import ExtractionResult

client = make_client(
    base_url="https://api.openai.com/v1",
    api_key="sk-...",
)

json_mode = probe_json_mode(client, "gpt-4o-mini", timeout=30)

prompt = render_prompt("prompts/", "extract.j2", document=text)

raw = call_llm(
    client,
    messages=[{"role": "user", "content": prompt}],
    model="gpt-4o-mini",
    timeout=120,
    json_mode=json_mode,
    temperature=0.0,
)

data = extract_json(raw)
result = ExtractionResult(**data)
print(result.title)

This is exactly what the pipeline runner does internally for each llm step, minus the retry loop and schema feedback.