batchor
batchor is a durable OpenAI Batch runner for Python teams that want typed results, resumable runs, replayable artifacts, and a narrow operator CLI.
Why batchor?
- Durable by default: Runs are persisted through a control-plane store and can be rehydrated across fresh processes.
- Deterministic source checkpoints: Built-in CSV, JSONL, and Parquet sources can resume mid-ingest when the same run_id, source identity, and job config are reused.
- Typed structured results: Python API users can pass a Pydantic v2 model and receive parsed typed outputs instead of manually validating JSON strings.
- Operator controls without orchestration scope creep: Library callers can pause, resume, cancel, and read terminal results incrementally without turning batchor into a full workflow engine.
- Replayable request artifacts: Submitted request JSONL is stored as an artifact so retry and resume can replay the exact provider payload that was already prepared.
- Explicit retention model: Request artifacts and raw provider outputs can be exported or pruned deliberately instead of being hidden inside implicit cleanup logic.
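The checkpoint condition described above can be illustrated without batchor itself. The following is a minimal standard-library sketch, not batchor's implementation: `checkpoint_key`, `ingest`, and the JSON checkpoint file are hypothetical names invented for this example. It assumes a checkpoint is reusable only when the run_id, the source identity (approximated here by a hash of the file contents), and the job config all match.

```python
import csv
import hashlib
import json
from pathlib import Path


def checkpoint_key(run_id: str, source_path: Path, job_config: dict) -> str:
    """Hash the three things that must match before a checkpoint is reused."""
    # Assumption: source identity is a content hash of the file.
    source_identity = hashlib.sha256(source_path.read_bytes()).hexdigest()
    payload = json.dumps(
        {"run_id": run_id, "source": source_identity, "config": job_config},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def ingest(run_id, source_path, job_config, checkpoint_path, max_rows=None):
    """Read CSV rows, skipping rows that a matching checkpoint marks as done."""
    key = checkpoint_key(run_id, source_path, job_config)
    done = 0
    if checkpoint_path.exists():
        saved = json.loads(checkpoint_path.read_text())
        if saved["key"] == key:  # resume only when nothing relevant changed
            done = saved["rows_done"]

    ingested = []
    with source_path.open(newline="") as handle:
        for index, row in enumerate(csv.DictReader(handle)):
            if index < done:
                continue  # already ingested by an earlier process
            if max_rows is not None and len(ingested) >= max_rows:
                break  # simulate a process that stops mid-ingest
            ingested.append(row)
            # Record progress after each row so a later process can pick up here.
            checkpoint_path.write_text(
                json.dumps({"key": key, "rows_done": index + 1})
            )
    return ingested
```

Calling `ingest` a second time with the same three inputs continues from the recorded row. Changing the config (or the file) produces a different key, so the sketch starts over rather than resuming against mismatched state.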
The short version
batchor gives you four main concepts:
- BatchItem: one logical unit of work
- BatchJob: how to turn items into provider requests
- BatchRunner: the durable orchestrator
- Run: the handle you refresh, wait on, inspect, export, and prune
If that is the mental model you were missing from the generated docs, start with Architecture and then read Storage & Runs.
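To make the division of responsibilities concrete, here is a toy model of the four concepts. It is a sketch only: the `Toy*` classes, their fields, and the `start` method are hypothetical stand-ins written for this page and do not reflect batchor's real signatures. The in-memory dictionary stands in for a durable store, so nothing here survives a process restart.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass(frozen=True)
class ToyItem:
    """One logical unit of work (stands in for BatchItem)."""
    item_id: str
    payload: dict


@dataclass(frozen=True)
class ToyJob:
    """How to turn an item into a provider request (stands in for BatchJob)."""
    build_request: Callable[[ToyItem], dict]


@dataclass
class ToyRun:
    """Handle the caller inspects (stands in for Run)."""
    run_id: str
    requests: list = field(default_factory=list)
    status: str = "submitted"


class ToyRunner:
    """Orchestrator (stands in for BatchRunner); state lives in a dict, not a durable store."""

    def __init__(self):
        self._runs = {}

    def start(self, run_id, job, items):
        # Reusing a run_id returns the existing run instead of rebuilding requests.
        if run_id not in self._runs:
            requests = [job.build_request(item) for item in items]
            self._runs[run_id] = ToyRun(run_id=run_id, requests=requests)
        return self._runs[run_id]


# Usage: the job shapes each item into a request; the runner owns the run.
job = ToyJob(build_request=lambda item: {"custom_id": item.item_id, "body": item.payload})
runner = ToyRunner()
run = runner.start("demo", job, [ToyItem("a", {"text": "hello"})])
```

The point of the split is that items carry data, the job carries the transformation, and only the runner knows about run identity and state.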
Current surface
- Built-in provider: OpenAI Batch
- Durable storage: SQLite by default, Postgres as an opt-in control-plane backend
- Ephemeral storage: in-memory state store
- Artifact backend: local filesystem via LocalArtifactStore
- Deterministic built-in sources: CSV, JSONL, and Parquet
- Python-first control plane: pause, resume, cancel, incremental terminal-result reads/exports
- Narrow CLI: CSV/JSONL operator workflows only
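As a rough illustration of what a local filesystem artifact backend with replay and explicit pruning might look like, consider the sketch below. It is not batchor's LocalArtifactStore: the class `ToyLocalArtifacts` and its methods are hypothetical, and the file layout (one `.jsonl` file per name) is an assumption made for the example.

```python
import hashlib
import json
from pathlib import Path


class ToyLocalArtifacts:
    """Hypothetical filesystem store for request JSONL; not batchor's LocalArtifactStore."""

    def __init__(self, root: Path):
        self.root = root
        self.root.mkdir(parents=True, exist_ok=True)

    def put_requests(self, name: str, requests: list) -> str:
        """Serialize requests once as JSONL and return a SHA-256 digest of the bytes."""
        data = "".join(
            json.dumps(request, sort_keys=True) + "\n" for request in requests
        ).encode("utf-8")
        (self.root / f"{name}.jsonl").write_bytes(data)
        return hashlib.sha256(data).hexdigest()

    def get_bytes(self, name: str) -> bytes:
        """Return the stored payload unchanged so a retry can resend it as-is."""
        return (self.root / f"{name}.jsonl").read_bytes()

    def prune(self, name: str) -> None:
        """Delete an artifact on request; this sketch never cleans up implicitly."""
        (self.root / f"{name}.jsonl").unlink()
```

Storing the serialized bytes, rather than re-serializing from items on retry, is what lets a replay be checked against the original digest.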
Reading order
- Start with Installation if you are evaluating the package.
- Read Use Cases for concrete single-file, multi-file, and pipeline-style examples.
- Read Architecture for the runtime model and canonical diagrams.
- Read Storage & Runs for durable-run, resume, and artifact semantics.
- Use Python API or CLI Usage for concrete workflows.
- Use API Reference for symbols and signatures.
- Use the Design section for implementation details and extension boundaries.