Local Providers
Ollama, LM Studio, vLLM, OVMS, and custom OpenAI-compatible endpoints.
Local and self-hosted providers are useful for evaluation, data-control experiments, and private model hosting. Facio treats them as OpenAI-compatible endpoints unless the provider has a dedicated backend.
Supported local routes
| Provider | Config key | Typical API base | API key behavior |
|---|---|---|---|
| Custom | `custom` | Your endpoint | Direct OpenAI-compatible endpoint; provide everything explicitly. |
| vLLM | `vllm` | Your vLLM `/v1` endpoint | `apiBase` is required; the API key depends on your deployment. |
| Ollama | `ollama` | `http://localhost:11434/v1` | Local provider; the key is often empty or a placeholder, depending on any proxy in front. |
| LM Studio | `lm_studio` | `http://localhost:1234/v1` | Local provider; the key is often empty or a placeholder, depending on the server. |
| OVMS | `ovms` | `http://localhost:8000/v3` | OpenVINO Model Server; direct local route. |
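Once one of these routes is up, a quick way to confirm the server really speaks the OpenAI-compatible API is to request its model list. This is a sketch: the helper name `models_url` is hypothetical, and the base URLs are just the typical defaults from the table above.

```shell
#!/bin/sh
# Sketch: build the model-listing URL for an OpenAI-compatible API base.
# models_url is a hypothetical helper, not part of Facio.
models_url() {
  # Strip a trailing slash, then append the standard /models route.
  printf '%s/models\n' "${1%/}"
}

# Probe a live server (uncomment against a running instance):
# curl -s "$(models_url http://localhost:11434/v1)"   # Ollama default
# curl -s "$(models_url http://localhost:1234/v1)"    # LM Studio default
models_url http://localhost:11434/v1
```

A server that answers this route with a JSON model list is generally usable as a `custom` or `vllm` endpoint.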
Docker networking note
Inside the Facio container, `localhost` refers to the Facio container itself, not the host machine. If Ollama, LM Studio, or vLLM runs on the host, expose it to the container through Docker networking or a reverse proxy, and use that reachable URL as `apiBase`.
For example, on Docker Desktop you can often use a host gateway address such as `http://host.docker.internal:11434/v1`, provided your host service accepts connections from it.
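On Linux engines without Docker Desktop, `host.docker.internal` does not exist by default; Docker's `extra_hosts` (or `--add-host`) host-gateway mapping provides it. A minimal Compose sketch, assuming the Facio service is named `facio`:

```yaml
# docker-compose.yml fragment; the service name "facio" is an assumption.
services:
  facio:
    extra_hosts:
      # Maps host.docker.internal to the host's gateway IP (Docker 20.10+).
      - "host.docker.internal:host-gateway"
```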
Configuration example
```json
{
  "providers": {
    "ollama": {
      "apiBase": "http://host.docker.internal:11434/v1"
    }
  },
  "agents": {
    "defaults": {
      "model": "llama3.2",
      "provider": "ollama"
    }
  }
}
```
When auto-routing helps
If a local provider has an `apiBase`, Facio can route plain local model names to it even when the model name does not contain a provider prefix. For Gemma-family names, configured local providers are preferred over Gemini, so open-weight local runs do not accidentally route to a hosted API.
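For example, giving a local provider an `apiBase` can be enough for a plain Gemma name to stay local. The host and model name below are assumptions for illustration:

```json
{
  "providers": {
    "vllm": {
      "apiBase": "http://vllm:8000/v1"
    }
  },
  "agents": {
    "defaults": {
      "model": "gemma-2-9b-it"
    }
  }
}
```

With `vllm` configured this way, the unprefixed Gemma model name routes to the local vLLM server rather than the hosted Gemini API.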
Checklist
- Confirm the endpoint is reachable from inside the Facio container.
- Use the correct OpenAI-compatible path, usually ending in `/v1`.
- Set `provider` explicitly while testing local routing.
- Run `/check` after setting the provider.
- Watch logs for provider errors; local servers often fail because of Docker networking, model names, or context length.
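Parts of this checklist can be scripted. A sketch to run from inside the Facio container; `API_BASE` is whatever you configured (the default below is illustrative), and the curl probe is left commented so it can be pointed at a live server:

```shell
#!/bin/sh
# Checklist sketch: validate the configured API base from inside the container.
API_BASE="${API_BASE:-http://host.docker.internal:11434/v1}"

# The OpenAI-compatible path usually ends in /v1.
case "$API_BASE" in
  */v1) echo "path ok: $API_BASE" ;;
  *)    echo "warning: $API_BASE does not end in /v1" ;;
esac

# Reachability from inside the container (uncomment against a live server):
# curl -sf --max-time 5 "$API_BASE/models" >/dev/null || echo "endpoint unreachable"
```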