# Pioneer AI — Documentation for LLMs

> Pioneer is a platform for fine-tuning, evaluating, and deploying small language models (SLMs) and LLMs.
> Base URL: https://api.pioneer.ai
> For the full interactive docs, visit https://agent.pioneer.ai/docs
> For the machine-readable version, see https://agent.pioneer.ai/llms.txt

## Authentication

Pioneer uses API keys to authenticate requests. Include your key in the `X-API-Key` header (Bearer auth is also supported). Keys start with `pio_sk_`.

```bash
curl -X GET https://api.pioneer.ai/base-models \
  -H "X-API-Key: YOUR_API_KEY"
```

## API Reference

### Inference (Pioneer format)

- POST /inference — Run inference on a model
- GET /base-models — Model catalog (filterable by training/inference/task_type)

Task types: `extract_entities`, `classify_text`, `extract_json`, `generate`. For `classify_text`, send `{"categories": ["label-a", "label-b"]}` as the schema object.

```bash
curl -X POST https://api.pioneer.ai/inference \
  -H "X-API-Key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model_id": "YOUR_TRAINING_JOB_ID",
    "task": "extract_entities",
    "text": "Apple announced the new MacBook Pro at WWDC in Cupertino.",
    "schema": ["person", "organization", "product", "location", "event"],
    "threshold": 0.5
  }'
```

Response format:

```json
{
  "type": "encoder",
  "inference_id": "a4c5f23b-9184-454f-b22e-774bd765ccfa",
  "result": {
    "entities": {
      "organization": ["Apple"],
      "product": ["MacBook Pro"],
      "location": ["Cupertino"]
    }
  },
  "model_id": "YOUR_TRAINING_JOB_ID",
  "latency_ms": 419.08,
  "token_usage": 9
}
```

### Inference (OpenAI-compatible)

- POST /v1/chat/completions — Chat completions
- POST /v1/completions — Text completions
- POST /v1/responses — Responses API
- GET /v1/models — List available models

Drop-in replacement for the OpenAI SDK. Set `base_url` to `https://api.pioneer.ai/v1` and use your Pioneer API key. Supports streaming. Pass Pioneer-specific fields like `schema` via `extra_body`.
```python
from openai import OpenAI

client = OpenAI(
    base_url="https://api.pioneer.ai/v1",
    api_key="YOUR_API_KEY",
)

response = client.chat.completions.create(
    model="YOUR_TRAINING_JOB_ID",
    messages=[{"role": "user", "content": "Extract entities from: Apple launched the iPhone in San Francisco."}],
    extra_body={"schema": ["organization", "product", "location"]},
)
print(response.choices[0].message.content)
```

### Inference (Anthropic-compatible)

- POST /v1/messages — Messages API

Drop-in replacement for the Anthropic SDK. Set `base_url` to `https://api.pioneer.ai` and use your Pioneer API key. Supports streaming.

```python
import anthropic

client = anthropic.Anthropic(
    base_url="https://api.pioneer.ai",
    api_key="YOUR_API_KEY",
)

message = client.messages.create(
    model="YOUR_TRAINING_JOB_ID",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Extract entities from: Apple launched the iPhone in San Francisco."}],
    extra_body={"schema": ["organization", "product", "location"]},
)
print(message.content[0].text)
```

### Inference History & Feedback

- GET /inferences — List past inferences (filters: limit, offset, model_id, task, project_id, training_job_id)
- GET /inferences/:id — Get inference details
- POST /inferences/:id/feedback — Submit correction feedback

### Datasets

- GET /felix/datasets — List all datasets
- GET /felix/datasets/:name — Get dataset versions
- DELETE /felix/datasets/:name — Delete a dataset

### Dataset Upload (Presigned URL)

```bash
# Step 1: Get a presigned S3 upload URL
curl -X POST https://api.pioneer.ai/felix/datasets/upload/url \
  -H "X-API-Key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"dataset_name":"YOUR_DATASET_NAME","dataset_type":"classification","format":"csv"}'

# Step 2: Upload the file directly to S3
curl -X PUT "$PRESIGNED_URL" \
  -H "Content-Type: application/octet-stream" \
  --data-binary @./dataset.csv

# Step 3: Trigger processing
curl -X POST https://api.pioneer.ai/felix/datasets/upload/process \
  -H "X-API-Key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"dataset_id":"DATASET_ID"}'
```

Valid `dataset_type` values: `ner`, `classification`, `custom`, `decoder`.

### Synthetic Data Generation

- POST /generate — Start a generation job
- GET /generate/jobs/:job_id — Poll generation job status

Required body fields: `task_type` (ner, classification, decoder), `dataset_name`, `num_examples`. Also accepts `labels`, `domain_description`, `classified_examples`, `prompt`.

### Label Existing Data

- POST /generate/ner/label-existing — Auto-label text for NER
- POST /generate/classification/label-existing — Auto-classify text

Send your own unlabeled text and get NER annotations or classifications back. Required fields: `labels` and `inputs` (1–1000 strings).

### Training

- POST /felix/training-jobs — Start a training job (requires `base_model`)
- GET /felix/training-jobs — List training jobs (filters: status, project_id)
- GET /felix/training-jobs/:id — Get training job status
- GET /felix/training-jobs/:id/logs — Get training logs
- GET /felix/training-jobs/:id/checkpoints — List checkpoints
- GET /felix/training-jobs/:id/download — Download trained model
- POST /felix/training-jobs/:id/stop — Stop a running job
- DELETE /felix/training-jobs/:id — Delete a training job
- GET /felix/trained-models — List all trained models

```bash
curl -X POST https://api.pioneer.ai/felix/training-jobs \
  -H "X-API-Key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model_name": "my-custom-model",
    "base_model": "fastino/gliner2-base-v1",
    "datasets": [{"name": "YOUR_DATASET_NAME"}],
    "training_type": "lora",
    "nr_epochs": 10,
    "learning_rate": 5e-5,
    "batch_size": 8
  }'
```

### Evaluations

- POST /felix/evaluations — Run an evaluation
- GET /felix/evaluations — List evaluations (filter: project_id)
- GET /felix/evaluations/:id — Get evaluation results
- DELETE /felix/evaluations/:id — Delete an evaluation
- GET /felix/baseline-models — List baseline LLM models

Evaluation `base_model` accepts a training job ID (unlike training, which requires a HuggingFace model ID).

```bash
curl -X POST https://api.pioneer.ai/felix/evaluations \
  -H "X-API-Key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"base_model": "YOUR_TRAINING_JOB_ID", "dataset_name": "YOUR_DATASET_NAME"}'
```

### Projects & Deployments

- GET /projects — List projects
- POST /projects — Create a project
- DELETE /projects/:project_id — Delete a project
- POST /projects/:project_id/deployments — Deploy a model to a project (requires training_job_id)
- GET /projects/:project_id/deployments — List deployment history
- POST /projects/:project_id/inference — Run inference on the project's model

### API Keys

- POST /create-api-key — Generate a new API key
- GET /list-api-keys — List your API keys
- DELETE /delete-api-key — Revoke an API key

## Recommended API Order

1. Create or upload a dataset, then confirm it is `ready`.
2. Start training with that exact dataset name.
3. Poll training status until it is `complete`.
4. Run an evaluation using the completed training job ID and the dataset name.
5. Run inference with `model_id` set to your training job ID (or a base model ID like `fastino/gliner2-base-v1`).
## Available Models

### Encoder Models (NER — GLiNER)

| Model ID | Label | Training | Inference |
|---|---|---|---|
| fastino/gliner2-base-v1 | GLiNER2 Base | LoRA, Full | On-demand |
| fastino/gliner2-large-v1 | GLiNER2 Large | LoRA, Full | On-demand |
| fastino/gliner2-multi-v1 | GLiNER2 Multi | LoRA, Full | On-demand |
| fastino/gliner2-multi-large-v1 | GLiNER2 Multi Large | LoRA, Full | On-demand |

### Decoder Models — Training (LoRA fine-tuning)

| Model ID | Label | Context |
|---|---|---|
| Qwen/Qwen3-32B | Qwen3 32B | 131K |
| Qwen/Qwen3-30B-A3B-Instruct-2507 | Qwen3 30B A3B Instruct | 262K |
| Qwen/Qwen3-30B-A3B | Qwen3 30B A3B | 131K |
| Qwen/Qwen3-8B | Qwen3 8B | 131K |
| Qwen/Qwen3-8B-Base | Qwen3 8B Base | 32K |
| Qwen/Qwen3-4B-Instruct-2507 | Qwen3 4B Instruct | 262K |
| Qwen/Qwen2.5-Coder-0.5B | Qwen2.5 Coder 0.5B | 32K |
| Qwen/Qwen2.5-7B-Instruct | Qwen2.5 7B Instruct | 131K |
| Qwen/Qwen2.5-14B-Instruct | Qwen2.5 14B Instruct | 131K |
| google/gemma-4-31b-it | Gemma 4 31B IT | 128K |
| meta-llama/Llama-3.3-70B-Instruct | Llama 3.3 70B Instruct | 131K |
| meta-llama/Llama-3.1-8B-Instruct | Llama 3.1 8B Instruct | 131K |
| meta-llama/Llama-3.1-70B-Instruct | Llama 3.1 70B Instruct | 131K |
| meta-llama/Llama-3.2-3B-Instruct | Llama 3.2 3B Instruct | 131K |
| meta-llama/Llama-3.2-1B-Instruct | Llama 3.2 1B Instruct | 131K |
| meta-llama/Llama-3.2-3B | Llama 3.2 3B | 131K |
| meta-llama/Llama-3.2-1B | Llama 3.2 1B | 32K |
| nvidia/NVIDIA-Nemotron-3-Nano-30B-A3B-BF16 | Nemotron 3 Nano 30B | 64K |
| openai/gpt-oss-120b | GPT-OSS 120B | 131K |
| openai/gpt-oss-20b | GPT-OSS 20B | 131K |
| deepseek-ai/DeepSeek-V3.1 | DeepSeek V3.1 | 163K |

### Decoder Models — Serverless Inference (no startup latency)

| Model ID | Label | Context |
|---|---|---|
| Qwen/Qwen3-235B-A22B-Instruct-2507 | Qwen3 235B A22B Instruct | 262K |
| Qwen/Qwen3-8B | Qwen3 8B | 131K |
| deepseek-ai/DeepSeek-V3.1 | DeepSeek V3.1 | 163K |
| openai/gpt-oss-120b | GPT-OSS 120B | 131K |
| openai/gpt-oss-20b | GPT-OSS 20B | 131K |
| meta-llama/Llama-3.3-70B-Instruct | Llama 3.3 70B Instruct | 131K |
| moonshotai/Kimi-K2-Thinking | Kimi K2 Thinking | 262K |

Serverless = pre-deployed, no startup latency, pay-per-token. On-demand = dedicated GPU deployments created after fine-tuning.

## Dataset Formats

Pioneer accepts JSON, JSONL, and CSV files up to 50 MB.

### NER (Named Entity Recognition)

```json
{
  "text": "Apple launched the iPhone in San Francisco.",
  "entities": [
    ["Apple", "ORG"],
    ["iPhone", "PRODUCT"],
    ["San Francisco", "LOC"]
  ]
}
```

### Text Classification

```json
// Single-label
{"text": "Great product, highly recommend!", "label": "positive"}

// Multi-label
{"text": "Apple announces new AI chip for data centers", "labels": ["technology", "business"]}
```

### JSON Extraction

Same schema as NER — `text` plus `entities` as `[span, label]` pairs.

### Decoder (Chat SFT)

```json
{"messages": [
  {"role": "system", "content": "You are a helpful coding assistant."},
  {"role": "user", "content": "Write a Python function to sort a list."},
  {"role": "assistant", "content": "def sort_list(items):\n return sorted(items)"}
]}
```

Allowed roles: `system` (optional, must be first), `user`, `assistant`. JSONL is recommended for large datasets.

Pioneer auto-detects and converts: OpenAI/ChatML, Alpaca (instruction/input/output), ShareGPT (conversations), Prompt/Output, and Instruction/Response formats.

## Training

Hyperparameters: learning_rate, nr_epochs, batch_size, training_type (lora/full), train/test split.

The `base_model` field is required. Use a model ID from GET /base-models or a checkpoint UUID from a previous training job.

Metrics tracked in real time: F1 score, precision, recall, loss.

## Evaluations

Metrics: F1 score (primary), precision, recall, and a per-entity/class breakdown. Compare fine-tuned models against base models, LLMs, and other SLMs side by side.
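For intuition, the headline metrics can be reproduced locally from predicted vs. expected `(span, label)` pairs. A minimal micro-averaged sketch (not Pioneer's actual scoring implementation):

```python
def prf1(expected: list, predicted: list) -> tuple[float, float, float]:
    """Micro-averaged precision, recall, and F1 over (span, label) pairs."""
    exp, pred = set(expected), set(predicted)
    tp = len(exp & pred)  # true positives: exact span + label matches
    precision = tp / len(pred) if pred else 0.0
    recall = tp / len(exp) if exp else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# One entity missed, none spurious: precision 1.0, recall 2/3, F1 0.8
expected = [("Apple", "ORG"), ("iPhone", "PRODUCT"), ("San Francisco", "LOC")]
predicted = [("Apple", "ORG"), ("iPhone", "PRODUCT")]
print(prf1(expected, predicted))
```

A per-entity breakdown is the same computation restricted to one label at a time.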
## Terminology

- **Small Language Models (SLMs)**: Models trained to be highly accurate on specific tasks. A 205M-parameter model trained on your data can outperform many LLMs on your use case, at a fraction of the cost and latency.
- **Datasets**: Labeled input/output pairs that capture what "correct" looks like for your task.
- **Training**: Fine-tuning — taking a pre-trained base model and continuing to update its weights on your dataset. Pioneer uses parameter-efficient methods (LoRA).
- **Evaluations**: Running a held-out test set through the model and scoring outputs against expected results.
- **Inference**: Using a trained model — pass input, get structured predictions back in milliseconds.
- **Continuous Adaptation**: A structured cycle: production logs accumulate, agents curate training data, fine-tuning runs, and the best checkpoint is evaluated before promotion.

## FAQ

**Do you charge for storage?**
No, we do not charge for dataset storage.

**Which plan is best for me?**
Free to experiment, Pro for production workloads (uncapped inference), and Custom for enterprise needs (HIPAA, private networking).

**Special pricing for students/non-profits/open source?**
Yes. Complete the intake form at https://docs.google.com/forms/d/e/1FAIpQLSdgxmKeS69UVk27cII_UNyDO5W2Uo_-T7TBjKnAvOeuKBl0sA/viewform

**Do you train on our data?**
Yes, with opt-out on Pro and Custom plans. Custom plans allow running in your VPCs.

**Can I share models with teammates using Teams?**
No, Teams are for shared billing only.