All configuration lives in .env. Copy example.env and fill in your values.
Configuration Levels
Each integration has a configuration level indicating its importance:
| Level | Meaning | Behavior when not configured |
|---|---|---|
| Required | Core system dependency | System will error — chat and primary functions will not work |
| Recommended | Significant feature enabler | Graceful degradation — the feature is visibly unavailable but the system runs |
| Optional | Enhancement capability | Transparent degradation — system works fine, capability simply not present |
Note: Admin-configured models (Admin → Models page) can substitute for LLM environment variables. The health check considers both sources.
Frontend (Local Dev Only)
The frontend has a separate env file, used only for local development: frontend/.env.local.
This file is NOT used in Docker. Inside the Docker container, Next.js proxies /api/* to the Python backend internally (port 8000 is container-internal), so no frontend env file is needed.
For local dev, the defaults work out of the box — you do not need to create frontend/.env.local unless your backend runs on a non-default port.
If you need to override, create frontend/.env.local manually:
| Variable | Default | Description |
|---|---|---|
NEXT_PUBLIC_API_URL | http://localhost:8000 (auto) | Backend URL the browser uses for direct API calls (OAuth redirects, streaming). Auto-detected from window.location if unset — only override if your backend runs on a non-standard port locally. |
Build-time note: NEXT_PUBLIC_* variables are baked into the JS bundle at pnpm build time. Changing them at runtime (e.g. via the root .env) has no effect — this is why they live in frontend/.env.local for local dev only.
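For example, a minimal frontend/.env.local overriding the backend port (the port value here is illustrative):

```ini
# frontend/.env.local — local development only, not used in Docker
NEXT_PUBLIC_API_URL=http://localhost:8001
```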
LLM (Required)
| Variable | Required | Default | Description |
|---|---|---|---|
LLM_API_KEY | Yes | — | API key for the LLM provider |
LLM_BASE_URL | No | https://api.openai.com/v1 | Base URL of any OpenAI-compatible API |
LLM_MODEL | No | gpt-4o | Main model — used for planning, analysis, and ReAct agent |
FAST_LLM_MODEL | No | (falls back to LLM_MODEL) | Fast model — used for DAG step execution (cheaper, faster) |
LLM_TEMPERATURE | No | 0.7 | Default sampling temperature |
LLM_CONTEXT_SIZE | No | 128000 | Context window size for the main LLM |
LLM_MAX_OUTPUT_TOKENS | No | 64000 | Max output tokens per call for the main LLM |
FAST_LLM_API_KEY | No | (falls back to LLM_API_KEY) | API key for the fast model provider. Use when the fast model is hosted by a different provider than the main model |
FAST_LLM_BASE_URL | No | (falls back to LLM_BASE_URL) | Base URL for the fast model provider |
FAST_LLM_TEMPERATURE | No | (falls back to LLM_TEMPERATURE) | Sampling temperature for the fast model |
FAST_LLM_CONTEXT_SIZE | No | (falls back to LLM_CONTEXT_SIZE) | Context window size for the fast LLM |
FAST_LLM_MAX_OUTPUT_TOKENS | No | (falls back to LLM_MAX_OUTPUT_TOKENS) | Max output tokens per call for the fast LLM |
LLM_REASONING_EFFORT | No | (disabled) | Extended thinking level for supported models (OpenAI o-series, Gemini 2.5+, Claude). Values: low, medium, high. LiteLLM translates this to each provider’s native format automatically. The model’s chain-of-thought is surfaced in the UI “thinking” step. |
LLM_REASONING_BUDGET_TOKENS | No | (auto from effort) | Explicit token budget for Anthropic thinking (minimum 1024). For OpenAI/Gemini the effort level is used directly. Only effective when LLM_REASONING_EFFORT is set. |
LLM_JSON_MODE_ENABLED | No | true | Global toggle for response_format=json_object. Set to false if your provider rejects LiteLLM’s assistant prefill injection (e.g. AWS Bedrock relay → ValidationException on the 2nd+ agent iteration). When disabled, structured calls skip JSON mode and fall back to plain-text regex extraction — no quality loss. Applies to all models (ENV-configured and Admin-configured). |
LLM_TOOL_CHOICE_ENABLED | No | true | Global toggle for forced tool_choice in structured output extraction (Level 1 — Native Function Calling). Set to false if your model returns errors with forced tool selection (e.g. thinking-mode models that reject tool_choice='specified'). When disabled, structured calls skip native FC and start from JSON Mode. Per-model override available in Settings → Models → Advanced. |
REASONING_LLM_MODEL | No | (falls back to LLM_MODEL) | Model name for the reasoning tier. Used for tasks requiring deep analysis (e.g., DAG planning, plan analysis) |
REASONING_LLM_API_KEY | No | (falls back to LLM_API_KEY) | API key for the reasoning model provider |
REASONING_LLM_BASE_URL | No | (falls back to LLM_BASE_URL) | Base URL for the reasoning model provider |
REASONING_LLM_TEMPERATURE | No | (falls back to LLM_TEMPERATURE) | Sampling temperature for the reasoning model |
REASONING_LLM_CONTEXT_SIZE | No | (falls back to LLM_CONTEXT_SIZE) | Context window size for the reasoning model |
REASONING_LLM_MAX_OUTPUT_TOKENS | No | (falls back to LLM_MAX_OUTPUT_TOKENS) | Max output tokens per call for the reasoning model |
REASONING_LLM_EFFORT | No | (falls back to LLM_REASONING_EFFORT) | Reasoning effort level for the reasoning model tier. Values: low, medium, high |
REASONING_LLM_BUDGET | No | (falls back to LLM_REASONING_BUDGET_TOKENS) | Token budget for reasoning (primarily Anthropic). Overrides the auto-calculated budget for the reasoning tier |
LLM_SUPPORTS_VISION | No | true (optimistic) | Controls whether ENV-mode document OCR (via MarkItDown + markitdown-ocr) is attempted. Only applies when no active model group is configured in Admin → Models (pure ENV mode). When the default true is in effect, convert_to_markdown and RAG ingestion assume LLM_MODEL supports vision and call it for image OCR — this is the correct behavior for all common choices (gpt-4o, claude-3-5-sonnet, gemini-1.5-pro/flash). Set this to false when your ENV-configured LLM_MODEL does not support vision (e.g. deepseek-v3, qwen-chat, llama-3.1, gpt-3.5-turbo, o1-mini) to skip the failing vision call and go straight to text-only extraction. When an active model group exists in the Admin → Models panel, this flag is ignored and the group’s supports_vision flags take over — the admin-curated choice is always the source of truth in DB mode. |
Resolution order: User Preference → Admin Models (DB) → ENV Fallback. If an admin model with role “General” is configured in Admin → Models, these ENV vars serve as fallback only. The health check considers both sources.
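A minimal .env sketch for this section, assuming OpenAI as the provider. The key value is a placeholder; every omitted variable falls back as described in the table above.

```bash
# Required
LLM_API_KEY=sk-your-key-here
# Optional: any OpenAI-compatible endpoint and model choices
LLM_BASE_URL=https://api.openai.com/v1
LLM_MODEL=gpt-4o
FAST_LLM_MODEL=gpt-4o-mini
```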
MarkItDown OCR Resolution
The convert_to_markdown built-in tool and the RAG ingestion pipeline both use Microsoft’s MarkItDown + the official markitdown-ocr plugin to extract text from documents — including OCR on embedded images and scanned PDF pages when a vision-capable LLM is available.
Vision LLM resolution order (first match wins):
| # | Source | Priority rationale |
|---|---|---|
| 1 | Agent’s primary LLM if supports_vision=True | Consistency: same API key, same billing bucket, same rate-limit pool as the conversation. |
| 2 | Active ModelGroup → Fast Model if supports_vision=True | Fast models (gpt-4o-mini, claude-haiku, gemini-1.5-flash) are the ideal OCR workhorse — cheap, low-latency, usually multimodal. |
| 3 | Active ModelGroup → General Model if supports_vision=True | Quality fallback when the primary is not in the group. |
| 4 | ENV primary LLM (LLM_MODEL) | Optimistic fallback for pure ENV mode. Only taken when no active ModelGroup exists. Gated by LLM_SUPPORTS_VISION. |
Reasoning models (o1, o3-mini, DeepSeek-R1) historically lack vision support and are the wrong tool for OCR anyway — OCR is a perception task, not deliberation. If a workspace has only a reasoning model with supports_vision=True, it will still be picked up via the primary-LLM path, but the resolver does not actively rank it above fast/general.
Zero-regression fallback: when no vision-capable model is found at any level, OCR is silently disabled and MarkItDown runs in text-only mode. Word/PowerPoint/Excel embedded-image OCR becomes unavailable (same as before this feature shipped), but all other text extraction (headings, tables, paragraph text) continues to work unchanged. There is never a case where adding this feature made extraction worse than the previous behavior.
Non-OpenAI providers (Anthropic, Google Gemini, etc.) are supported transparently: the resolved LLM is wrapped in a LiteLLMOpenAIShim that routes chat.completions.create(...) calls through litellm.completion(), which handles the provider-specific message format translation (e.g. Anthropic’s source.type="base64" image block). One shim covers every provider LiteLLM supports — adding a new provider costs zero code changes in FIM One.
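The shim pattern can be sketched as below. This is an illustrative reconstruction, not FIM One's actual LiteLLMOpenAIShim; the injected completion_fn stands in for litellm.completion so the sketch stays self-contained and offline.

```python
from types import SimpleNamespace

class OpenAIStyleShim:
    """Expose an OpenAI-style client surface (client.chat.completions.create)
    while delegating to a provider-agnostic completion function. In the real
    shim the delegate would be litellm.completion, which translates
    provider-specific message formats; here it is injected for testability."""

    def __init__(self, completion_fn, model: str):
        self._completion_fn = completion_fn
        self._model = model
        # Mirror the OpenAI client attribute chain: client.chat.completions.create
        self.chat = SimpleNamespace(
            completions=SimpleNamespace(create=self._create)
        )

    def _create(self, messages, **kwargs):
        # The delegate receives model + messages, exactly like litellm.completion.
        return self._completion_fn(model=self._model, messages=messages, **kwargs)


# Usage with a stub delegate standing in for litellm.completion:
def fake_completion(model, messages, **kwargs):
    return {"model": model, "echo": messages[-1]["content"]}

client = OpenAIStyleShim(fake_completion, model="claude-3-5-sonnet")
resp = client.chat.completions.create(
    messages=[{"role": "user", "content": "hello"}]
)
```

Because the shim only has to speak the OpenAI client surface, swapping providers is a matter of changing the model string, which is why new LiteLLM providers need no code changes.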
Extended Thinking (Reasoning)
When LLM_REASONING_EFFORT is set, FIM One enables the model’s extended thinking capability so the internal chain-of-thought is surfaced in the UI “thinking” step. FIM One uses LiteLLM to translate the reasoning effort parameter into each provider’s native format automatically.
Supported providers
| Provider | LLM_BASE_URL | How it works | Reasoning content returned? |
|---|---|---|---|
| OpenAI (o1 / o3 / o4-mini) | https://api.openai.com/v1 | reasoning_effort sent natively | Yes |
| Anthropic (Claude 3.7+) | https://api.anthropic.com/v1/ | LiteLLM routes via native Anthropic API with thinking parameter | Yes |
| Google Gemini (2.5+) | https://generativelanguage.googleapis.com/v1beta/openai/ | reasoning_effort sent on compat endpoint | Yes |
The provider is detected from LLM_BASE_URL and the request is mapped to the correct API format. Unknown URLs are treated as OpenAI-compatible.
Important caveats
Temperature constraints with reasoning
Some providers impose temperature restrictions when reasoning is active:
- Anthropic: requires temperature=1 when extended thinking is enabled. If using Anthropic with extended thinking, set LLM_TEMPERATURE=1 — Anthropic rejects other values while thinking is on.
- OpenAI GPT-5.x: only supports temperature=1 at all times. LiteLLM’s drop_params filtering handles this automatically — unsupported temperature values are silently dropped, so no user action is needed for GPT-5.x.
How LLM_REASONING_BUDGET_TOKENS works
This variable is primarily meaningful for the Anthropic path. When set, it overrides the auto-calculated budget and is sent as budget_tokens in the thinking parameter via LiteLLM. When not set, the budget is derived from LLM_MAX_OUTPUT_TOKENS x effort ratio:
| LLM_REASONING_EFFORT | Budget ratio | Example (max_tokens = 64000) |
|---|---|---|
| low | 20% | 12,800 tokens |
| medium | 50% | 32,000 tokens |
| high | 80% | 51,200 tokens |
For OpenAI and Gemini, only the reasoning_effort level is sent — LLM_REASONING_BUDGET_TOKENS has no effect.
Agent Execution
ReAct Agent
| Variable | Required | Default | Description |
|---|---|---|---|
REACT_MAX_ITERATIONS | No | 20 | Max tool-call iterations per ReAct request. Higher = more thorough but slower and costlier |
REACT_MAX_TURN_TOKENS | No | 0 | Emergency circuit-breaker: max cumulative tokens (prompt + completion across all iterations) per single ReAct turn. Default 0 = unlimited. This is NOT for daily token control — use per-user token_quota for that. This is a last-resort safety valve for extreme scenarios like an agent stuck in an infinite tool-call loop. Hitting this limit aborts the task mid-execution, wasting all tokens consumed so far and returning an incomplete result. Keep at 0 unless you have a specific runaway-agent problem to contain |
REACT_TOOL_SELECTION_THRESHOLD | No | 12 | When the total number of registered tools exceeds this threshold, a lightweight LLM call selects the most relevant subset before each request |
REACT_TOOL_SELECTION_MAX | No | 6 | Max tools to keep after smart selection (only effective when tool count exceeds REACT_TOOL_SELECTION_THRESHOLD) |
REACT_SELF_REFLECTION_INTERVAL | No | 6 | Inject a self-reflection prompt every N tool calls to help the agent course-correct and avoid loops |
REACT_TOOL_OBS_TRUNCATION | No | 8000 | Max characters per tool observation when synthesizing the final answer. Higher values preserve more structured data (JSON, tables) at the cost of more tokens |
REACT_TOOL_RESULT_BUDGET | No | 40000 | Aggregate token budget for all tool results in a single session. When total tool-result tokens exceed this cap, new results are truncated with a notice. Prevents context bloat from large API responses (e.g., 5 connector calls returning 8K each). Set to 0 to disable the cap |
REACT_COMPLETION_CHECK_SKIP_CHARS | No | 800 | Skip the post-answer completion-check LLM call when the agent’s final answer exceeds this many characters. Long detailed answers don’t need a “did I miss anything?” verification round-trip. Set lower to skip more aggressively; set to a very large value to always run the check |
REACT_CYCLE_DETECTION_THRESHOLD | No | 2 | When the same tool is called with identical arguments this many times in a row, a deterministic warning is injected telling the agent to try a different approach. Unlike self-reflection (which relies on the LLM noticing the loop), this is a hash-based check that cannot be bypassed. Also applies to DAG steps |
REACT_COMPLETION_CHECK_MIN_TOOLS | No | 3 | Minimum number of tool calls before the completion checklist fires. Simple tasks (1-2 tool calls) skip verification to avoid unnecessary latency. Set to 1 for always-on verification. Also applies to DAG steps |
REACT_TURN_PROFILE_ENABLED | No | true | Emit per-turn phase-level timing logs (memory_load, compact, tool_schema_build, llm_first_token, llm_total, tool_exec). One structured log line per turn. Set to false to disable profiling entirely (zero overhead) |
LLM_RATE_LIMIT_PER_USER | No | true | Use per-user keyed rate-limit buckets instead of a single process-global bucket. Prevents one noisy user from starving all others on the same worker. The underlying rate is hardcoded at 60 requests/min and 100K tokens/min per bucket — this setting only controls whether the bucket is shared (global) or partitioned (per-user). Set to false to revert to the legacy global bucket (not recommended) |
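A hash-based repeat check like the one REACT_CYCLE_DETECTION_THRESHOLD describes can be sketched as follows. Class and method names are hypothetical; the point is that the check is deterministic rather than relying on the LLM to notice its own loop.

```python
import hashlib
import json

class CycleDetector:
    """Count consecutive identical tool calls by hashing the tool name plus
    its (canonically serialized) arguments. When the streak reaches the
    threshold, the caller injects a warning telling the agent to change
    approach."""

    def __init__(self, threshold: int = 2):  # REACT_CYCLE_DETECTION_THRESHOLD
        self.threshold = threshold
        self._last_hash = None
        self._streak = 0

    def record(self, tool_name: str, args: dict) -> bool:
        """Return True when the repeat warning should fire."""
        call_hash = hashlib.sha256(
            (tool_name + json.dumps(args, sort_keys=True)).encode()
        ).hexdigest()
        if call_hash == self._last_hash:
            self._streak += 1
        else:
            self._last_hash, self._streak = call_hash, 1
        return self._streak >= self.threshold

det = CycleDetector(threshold=2)
hits = [det.record("web_search", {"q": "foo"}),   # first call: no warning
        det.record("web_search", {"q": "foo"}),   # identical repeat: warn
        det.record("web_search", {"q": "bar"})]   # new args reset the streak
```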
DAG Planner
| Variable | Required | Default | Description |
|---|---|---|---|
MAX_CONCURRENCY | No | 5 | Max parallel steps in DAG executor |
DAG_STEP_MAX_ITERATIONS | No | 15 | Max tool-call iterations within each DAG step |
DAG_STEP_TIMEOUT | No | 600 | Step execution timeout in seconds. Steps exceeding this are marked as failed and their dependents are cascade-skipped |
DAG_MAX_REPLAN_ROUNDS | No | 3 | Max autonomous re-plan attempts when goal is not achieved. User interrupts (inject) are unlimited and do not count against this budget |
DAG_REPLAN_STOP_CONFIDENCE | No | 0.8 | Stop retrying when agent confidence that goal is unachievable exceeds this threshold (0.0 = never stop early, 1.0 = stop on any failure) |
DAG_VERIFY_TRUNCATION | No | 2000 | Max characters of step output sent to the step verifier LLM for quality judgment |
DAG_ANALYZER_TRUNCATION | No | 10000 | Max characters per step result when formatting for the post-execution analyzer |
DAG_REPLAN_RECENT_TRUNCATION | No | 500 | Max characters per step result from the most recent round when building re-plan context |
DAG_REPLAN_OLDER_TRUNCATION | No | 200 | Max characters per step result from older rounds when building re-plan context. Older rounds are more aggressively truncated to save context |
DAG_TOOL_CACHE | No | true | Cache identical tool calls within a single DAG execution. Only tools explicitly marked as cacheable (read-only tools like search, knowledge retrieval) are cached. Set to false to disable caching entirely |
DAG_STEP_VERIFICATION | No | false | Generic LLM-based quality check after each DAG step. On failure, the step retries once with feedback. Default off — adds latency on every step and is rarely needed; most step outputs are acceptable without re-checking. Use only when you observe frequent low-quality step results |
DAG_CITATION_VERIFICATION | No | true | Citation-accuracy check for specialist-domain steps. Prerequisite: the query must first be classified as a specialist domain by the LLM domain classifier (see ESCALATION_DOMAINS). When the domain is detected AND this flag is true, each completed step is scanned for legal/medical/financial citations and verified for accuracy — catching hallucinated article numbers, fabricated case references, and incorrect regulatory citations. If domain classification returns null (general query), citation verification does not run regardless of this setting |
DAG_CITATION_VERIFY_TRUNCATION | No | 6000 | Max characters of step result sent to the citation verification prompt |
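How MAX_CONCURRENCY-style gating might look with an asyncio semaphore. This is an illustrative sketch, not the real DAG executor; the step body is a stand-in, and the timeout path shows where DAG_STEP_TIMEOUT failures would be marked.

```python
import asyncio

MAX_CONCURRENCY = 5  # mirrors the MAX_CONCURRENCY setting

async def run_step(sem: asyncio.Semaphore, step_id: str, timeout: float) -> str:
    """Run one DAG step under the shared concurrency gate. A step that
    exceeds the timeout would be marked failed and its dependents
    cascade-skipped."""
    async with sem:
        try:
            await asyncio.wait_for(asyncio.sleep(0), timeout=timeout)
            return f"{step_id}: ok"
        except asyncio.TimeoutError:
            return f"{step_id}: failed (timeout)"

async def run_ready_steps(step_ids: list[str]) -> list[str]:
    # All steps whose dependencies are satisfied run in parallel,
    # capped at MAX_CONCURRENCY in flight at once.
    sem = asyncio.Semaphore(MAX_CONCURRENCY)
    return await asyncio.gather(
        *(run_step(sem, sid, timeout=600) for sid in step_ids)
    )

results = asyncio.run(run_ready_steps(["s1", "s2", "s3"]))
```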
Domain Classification
Controls the independent LLM-based domain detection layer that runs before both ReAct and DAG execution. When a query is classified as a specialist domain, the system activates domain-aware features: model escalation to the reasoning model, domain-specific SOP instructions, and citation verification (DAG only).
| Variable | Required | Default | Description |
|---|---|---|---|
ESCALATION_DOMAINS | No | legal,medical,financial,tax,compliance,patent | Comma-separated list of specialist domains. A fast LLM classifies each query against this list. When matched, the system: (1) upgrades to the reasoning model for higher accuracy, (2) injects domain-specific SOP instructions (e.g. verify citations via search before writing), (3) enables citation verification for DAG steps. Add custom domains as needed (e.g. legal,education,construction) |
Context Guard
Controls the automatic context window management that prevents conversations from exceeding the model’s limit.
| Variable | Required | Default | Description |
|---|---|---|---|
CONTEXT_GUARD_DEFAULT_BUDGET | No | 32000 | Default token budget for context window management. When the conversation exceeds this, older messages are compacted |
CONTEXT_GUARD_MAX_MSG_CHARS | No | 50000 | Hard character limit on any single message. Messages exceeding this are truncated as a safety net |
CONTEXT_GUARD_KEEP_RECENT | No | 4 | Number of most recent messages to preserve when compacting conversation history |
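The behavior these three variables describe can be sketched as below. This is a simplification: the real ContextGuard budgets in tokens, while this sketch only illustrates the per-message clipping safety net and the keep-recent rule.

```python
def compact_history(messages: list[dict], keep_recent: int = 4,
                    max_msg_chars: int = 50_000) -> list[dict]:
    """Clip oversized messages (CONTEXT_GUARD_MAX_MSG_CHARS safety net),
    then keep only the CONTEXT_GUARD_KEEP_RECENT most recent messages,
    replacing the rest with a compaction marker. Marker format is
    illustrative."""
    clipped = [
        {**m, "content": m["content"][:max_msg_chars]} for m in messages
    ]
    if len(clipped) <= keep_recent:
        return clipped
    dropped = len(clipped) - keep_recent
    summary = {"role": "system",
               "content": f"[{dropped} earlier messages compacted]"}
    return [summary] + clipped[-keep_recent:]

history = [{"role": "user", "content": f"msg {i}"} for i in range(10)]
compacted = compact_history(history, keep_recent=4)
```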
Agent Workspace
| Variable | Required | Default | Description |
|---|---|---|---|
WORKSPACE_OFFLOAD_THRESHOLD | No | 8000 | When a tool output exceeds this many characters, it is saved to a workspace file and a truncated preview is injected into the conversation context |
WORKSPACE_PREVIEW_CHARS | No | 2000 | Number of preview characters to include in truncated workspace references |
WORKSPACE_CLEANUP_MAX_HOURS | No | 72 | Workspace files older than this many hours are eligible for automatic cleanup |
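The offload flow can be sketched as follows; file naming and the reference format in the preview are assumptions.

```python
import tempfile
from pathlib import Path

OFFLOAD_THRESHOLD = 8_000   # WORKSPACE_OFFLOAD_THRESHOLD
PREVIEW_CHARS = 2_000       # WORKSPACE_PREVIEW_CHARS

def maybe_offload(output: str, workspace: Path) -> str:
    """If a tool output exceeds the threshold, persist the full text to a
    workspace file and return a truncated preview with a file reference;
    otherwise return the output unchanged."""
    if len(output) <= OFFLOAD_THRESHOLD:
        return output
    path = workspace / "tool_output_1.txt"  # hypothetical naming scheme
    path.write_text(output)
    return (output[:PREVIEW_CHARS]
            + f"\n... [truncated; full output saved to {path.name}]")

ws = Path(tempfile.mkdtemp())
small = maybe_offload("short result", ws)      # under threshold: unchanged
large = maybe_offload("x" * 10_000, ws)        # over threshold: offloaded
```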
System
| Variable | Required | Default | Description |
|---|---|---|---|
SYSTEM_PROMPT_RESERVE | — | — | Removed. Previously subtracted a fixed 4K reserve from the context budget for system prompts. This caused double-counting because ContextGuard already includes the system prompt when estimating message list tokens. The budget formula is now simply context_size - max_output_tokens, and the system prompt’s actual size is accounted for dynamically |
Web Tools (Optional)
| Variable | Required | Default | Description |
|---|---|---|---|
JINA_API_KEY | No | — | Jina API key. Acts as a shared fallback for search, fetch, embedding, and reranker when no service-specific key is set. Get yours at jina.ai |
TAVILY_API_KEY | No | — | Tavily Search API key (auto-selected if set and WEB_SEARCH_PROVIDER is unset) |
BRAVE_API_KEY | No | — | Brave Search API key (auto-selected if set and WEB_SEARCH_PROVIDER is unset) |
EXA_API_KEY | No | — | Exa Search API key (auto-selected if set and WEB_SEARCH_PROVIDER is unset). Get yours at exa.ai |
WEB_SEARCH_PROVIDER | No | jina | Search provider selector: jina / tavily / brave / exa |
WEB_FETCH_PROVIDER | No | jina (if key set, else httpx) | Fetch provider: jina (uses Jina Reader API) / httpx (direct HTTP request, no API key needed) |
Quick start tip: Setting just JINA_API_KEY enables web search, web fetch, embedding, and reranking all at once — one key, four services. You can override each service individually with the variables below.
RAG & Knowledge Base (Recommended)
Embedding
Embedding converts text into vectors for knowledge base search. FIM One uses the standard OpenAI-compatible/v1/embeddings endpoint, so it works with any provider that exposes this interface — not just Jina.
| Variable | Required | Default | Description |
|---|---|---|---|
EMBEDDING_API_KEY | No | (falls back to JINA_API_KEY) | API key for the embedding provider |
EMBEDDING_BASE_URL | No | https://api.jina.ai/v1 | Base URL for the embedding provider |
EMBEDDING_MODEL | No | jina-embeddings-v3 | Model identifier |
EMBEDDING_DIMENSION | No | 1024 | Vector dimension |
| Provider | EMBEDDING_BASE_URL | EMBEDDING_MODEL | EMBEDDING_DIMENSION |
|---|---|---|---|
| Jina (default) | https://api.jina.ai/v1 | jina-embeddings-v3 | 1024 |
| OpenAI | https://api.openai.com/v1 | text-embedding-3-small | 1536 |
| Voyage | https://api.voyageai.com/v1 | voyage-3 | 1024 |
| Ollama (local) | http://localhost:11434/v1 | nomic-embed-text | 768 |
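For example, a fully local setup using the Ollama row from the table above. The API key value is a placeholder; Ollama's OpenAI-compatible endpoint ignores it.

```bash
EMBEDDING_API_KEY=ollama
EMBEDDING_BASE_URL=http://localhost:11434/v1
EMBEDDING_MODEL=nomic-embed-text
EMBEDDING_DIMENSION=768
```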
Retrieval
| Variable | Required | Default | Description |
|---|---|---|---|
RETRIEVAL_MODE | No | grounding | grounding (full pipeline with citations and confidence scoring) or simple (basic RAG) |
Reranker
Reranker re-scores retrieved documents to improve relevance. Three providers are supported — select via RERANKER_PROVIDER or let the system auto-detect from available API keys.
| Variable | Required | Default | Description |
|---|---|---|---|
RERANKER_PROVIDER | No | (auto-detect) | jina / cohere / openai. If unset: uses Cohere if COHERE_API_KEY set, otherwise Jina |
RERANKER_MODEL | No | jina-reranker-v2-base-multilingual | Model identifier (applies to Jina and OpenAI providers) |
COHERE_API_KEY | No | — | Cohere API key (auto-selects Cohere reranker when set and RERANKER_PROVIDER is unset) |
COHERE_RERANKER_MODEL | No | rerank-multilingual-v3.0 | Cohere-specific reranker model |
Jina uses JINA_API_KEY (from Web Tools above). OpenAI reuses LLM_API_KEY / LLM_BASE_URL — no extra key needed. Cohere requires its own COHERE_API_KEY.
Reranker is optional — knowledge base search works without it using fusion scoring. Embedding is recommended for knowledge base features.
Vector Store
| Variable | Required | Default | Description |
|---|---|---|---|
VECTOR_STORE_DIR | No | ./data/vector_store | Directory for LanceDB vector store data (file-based, zero external services) |
Code Execution
| Variable | Required | Default | Description |
|---|---|---|---|
CODE_EXEC_BACKEND | No | local | local (direct host execution) or docker (isolated containers) |
DOCKER_PYTHON_IMAGE | No | python:3.11-slim | Docker image for Python execution |
DOCKER_NODE_IMAGE | No | node:20-slim | Docker image for Node.js execution |
DOCKER_SHELL_IMAGE | No | python:3.11-slim | Docker image for shell execution |
DOCKER_MEMORY | No | (Docker default) | RAM cap per container (e.g. 256m, 512m, 1g) |
DOCKER_CPUS | No | (Docker default) | CPU quota per container (e.g. 0.5, 1.0) |
SANDBOX_TIMEOUT | No | 120 | Default execution timeout in seconds |
DOCKER_HOST_DATA_DIR | No | (not set) | Host-side absolute path of the ./data volume mount. Required for DooD (Docker-outside-of-Docker) deployments; docker-compose.yml auto-sets via ${PWD}/data. |
Security: local mode runs AI-generated code directly on the host. For internet-facing or multi-user deployments, always set CODE_EXEC_BACKEND=docker.
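A hardened .env sketch for an internet-facing deployment; the resource limit values are illustrative.

```bash
CODE_EXEC_BACKEND=docker
DOCKER_MEMORY=512m
DOCKER_CPUS=1.0
SANDBOX_TIMEOUT=120
```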
Tool Artifacts
Size limits for files produced by tool execution (code execution, template rendering, image generation).
| Variable | Required | Default | Description |
|---|---|---|---|
MAX_ARTIFACT_SIZE | No | 10485760 (10 MB) | Max single artifact file size in bytes |
MAX_ARTIFACTS_TOTAL | No | 52428800 (50 MB) | Max total artifact size per session in bytes |
Document Processing (Optional)
Controls how uploaded PDF/DOCX files are processed for LLM consumption. Vision-capable models (GPT-4o, Claude 3/4, Gemini) can receive PDF pages as rendered images for higher fidelity.
| Variable | Required | Default | Description |
|---|---|---|---|
DOCUMENT_PROCESSING_MODE | No | auto | auto (vision if model supports it), vision (always render pages), text (always extract text only) |
DOCUMENT_VISION_DPI | No | 150 | DPI for PDF page rendering. Higher = better quality, more tokens |
DOCUMENT_VISION_MAX_PAGES | No | 20 | Maximum pages to render as images per PDF |
Note: Per-model vision support is configured via the supports_vision toggle in Admin → Models. When not explicitly set, the system auto-detects vision capability from the model name.
Image Generation (Optional)
| Variable | Required | Default | Description |
|---|---|---|---|
IMAGE_GEN_PROVIDER | No | google | google (Gemini native API) or openai (OpenAI-compatible /v1/images/generations) |
IMAGE_GEN_API_KEY | No | — | Google AI Studio key (google) or proxy/OpenAI API key (openai) |
IMAGE_GEN_MODEL | No | gemini-3.1-flash-image-preview | Image generation model (e.g. dall-e-3, gemini-nano-banana-2) |
IMAGE_GEN_BASE_URL | No | (per provider) | Google: https://generativelanguage.googleapis.com/v1beta; OpenAI: https://api.openai.com/v1 |
Email (SMTP) (Recommended)
Auto-registers the email_send built-in tool when SMTP_HOST, SMTP_USER, and SMTP_PASS are all set.
| Variable | Required | Default | Description |
|---|---|---|---|
SMTP_HOST | Cond. | — | SMTP server hostname |
SMTP_PORT | No | 465 | SMTP port |
SMTP_SSL | No | ssl | TLS mode: ssl (port 465) / tls (STARTTLS, port 587) / "" (plain) |
SMTP_USER | Cond. | — | SMTP login username |
SMTP_PASS | Cond. | — | SMTP login password |
SMTP_FROM | No | (uses SMTP_USER) | Sender address shown in From header |
SMTP_FROM_NAME | No | — | Display name shown in From header |
SMTP_REPLY_TO | No | — | Reply-To address; replies go here instead of SMTP_FROM |
SMTP_ALLOWED_DOMAINS | No | — | Comma-separated domain allowlist (e.g. example.com,corp.io); blocks recipients outside listed domains |
SMTP_ALLOWED_ADDRESSES | No | — | Comma-separated exact-address allowlist; combined with SMTP_ALLOWED_DOMAINS; leave both unset to allow any recipient (not recommended for shared mailboxes) |
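The allowlist combination can be sketched as below. Matching semantics such as case folding are assumptions; the real implementation may differ.

```python
def recipient_allowed(addr: str,
                      allowed_domains: str = "",
                      allowed_addresses: str = "") -> bool:
    """Combine SMTP_ALLOWED_DOMAINS and SMTP_ALLOWED_ADDRESSES: a recipient
    passes if it matches either list. When both are unset, any recipient is
    allowed, matching the documented default."""
    domains = {d.strip().lower()
               for d in allowed_domains.split(",") if d.strip()}
    addresses = {a.strip().lower()
                 for a in allowed_addresses.split(",") if a.strip()}
    if not domains and not addresses:
        return True
    addr = addr.strip().lower()
    domain = addr.rsplit("@", 1)[-1]
    return addr in addresses or domain in domains

checks = [
    recipient_allowed("bob@example.com", "example.com,corp.io"),  # allowed
    recipient_allowed("eve@evil.com", "example.com,corp.io"),     # blocked
    recipient_allowed("anyone@anywhere.io"),  # no lists set: allowed
]
```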
Connectors
| Variable | Required | Default | Description |
|---|---|---|---|
CONNECTOR_RESPONSE_MAX_CHARS | No | 50000 | Max characters for non-array JSON / plain-text connector responses |
CONNECTOR_RESPONSE_MAX_ITEMS | No | 10 | Max array items to keep when connector response is a JSON array |
CREDENTIAL_ENCRYPTION_KEY | No | (unset) | Fernet encryption key for connector credential blobs. When set, auth tokens stored in connector_credentials are encrypted at rest. If unset, credentials are stored as plaintext JSON (backward-compatible). Changing this key invalidates all existing encrypted credentials. |
CONNECTOR_TOOL_MODE | No | progressive | How connector tools are exposed to agents. progressive: single ConnectorMetaTool with discover/execute subcommands (~30 tokens/connector). classic: one tool per action (legacy, ~250 tokens/action). |
DATABASE_TOOL_MODE | No | progressive | How database connector tools are exposed to agents. progressive: single DatabaseMetaTool with list_tables/discover/query subcommands. legacy: one tool per action per database connector (3 tools each). |
MCP_TOOL_MODE | No | progressive | How MCP server tools are exposed to agents. progressive: single MCPServerMetaTool with discover/call subcommands. legacy: one tool per MCP server action (original individual tools). |
Platform
| Variable | Required | Default | Description |
|---|---|---|---|
DATABASE_URL | No | sqlite+aiosqlite:///./data/fim_one.db | Database connection string. SQLite (zero-config): sqlite+aiosqlite:///./data/fim_one.db. PostgreSQL (production): postgresql+asyncpg://user:pass@localhost:5432/fim_one. Docker Compose auto-sets PostgreSQL. |
JWT_SECRET_KEY | No | CHANGE_ME | Secret key for JWT token signing. Placeholder value CHANGE_ME (or any legacy default) triggers auto-generation of a secure 256-bit random key on first start, which is written back to .env. Set explicitly in production to keep tokens valid across restarts and replicas. |
CORS_ORIGINS | No | — | Comma-separated list of extra allowed CORS origins beyond the default localhost entries. Required when the frontend runs on a non-localhost domain (e.g. https://app.example.com). |
UPLOADS_DIR | No | ./uploads | Directory for uploaded files |
MAX_UPLOAD_SIZE_MB | No | 50 | Max file upload size in megabytes (backend enforcement) |
NEXT_PUBLIC_MAX_UPLOAD_SIZE_MB | No | 50 | Max file upload size shown in frontend UI. Build-time variable — must match MAX_UPLOAD_SIZE_MB. |
MCP_SERVERS | No | — | JSON array of MCP server configs (requires uv sync --extra mcp) |
ALLOW_STDIO_MCP | No | false | Allow stdio MCP servers. Set true only for trusted local deployments |
ALLOWED_STDIO_COMMANDS | No | npx,uvx,node,python,python3,deno,bun | Comma-separated list of allowed base commands for stdio MCP servers. Only effective when ALLOW_STDIO_MCP=true |
LOG_LEVEL | No | INFO | Logging level: DEBUG / INFO / WARNING / ERROR / CRITICAL |
REDIS_URL | No | — | Redis connection URL for cross-worker interrupt relay. Required when WORKERS>1 — without it, mid-stream interrupt/inject requests may hit a different worker and silently fail. Auto-configured by Docker Compose. |
WORKERS | No | 1 | Uvicorn worker processes. 1 is safe and needs no external services. For production multi-worker, use PostgreSQL (SQLite is single-writer). SQLite works for local dev under light load. Auth, OAuth, and file operations are fully multi-worker safe (JWT-based). Docker Compose auto-configures both PostgreSQL and Redis. |
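To pin JWT_SECRET_KEY explicitly for production, a 256-bit key can be generated with the standard library; this mirrors the auto-generation behavior in spirit, though the exact format FIM One writes back is not specified here.

```python
import secrets

# 32 random bytes = 256 bits, URL-safe encoded for use in .env
jwt_secret = secrets.token_urlsafe(32)
print(f"JWT_SECRET_KEY={jwt_secret}")
```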
Workflow Run Retention
Background cleanup task that automatically purges old workflow runs. Per-workflow overrides (configured in the workflow settings UI) take priority over these global defaults.
| Variable | Required | Default | Description |
|---|---|---|---|
WORKFLOW_RUN_MAX_AGE_DAYS | No | 30 | Delete workflow runs older than this many days |
WORKFLOW_RUN_MAX_PER_WORKFLOW | No | 100 | Keep at most this many runs per workflow (oldest deleted first) |
WORKFLOW_RUN_CLEANUP_INTERVAL_HOURS | No | 24 | How often the background cleanup task runs, in hours |
Channel Confirmation Request Expiry
Background sweeper that marks stale pending approval requests (produced by channel hooks like FeishuGateHook or the Approval Playground) as expired. Ensures a click days later on a forgotten card doesn’t flip agent state that has already been torn down.
| Variable | Required | Default | Description |
|---|---|---|---|
CHANNEL_CONFIRMATION_TTL_MINUTES | No | 1440 | Pending confirmations older than this are auto-expired (default: 24 hours) |
CHANNEL_CONFIRMATION_SWEEP_INTERVAL_SECONDS | No | 600 | How often the expiry sweeper runs (default: every 10 minutes) |
OAuth (Optional)
When both CLIENT_ID and CLIENT_SECRET are set for a provider, the login page automatically shows the corresponding OAuth button.
| Variable | Required | Default | Description |
|---|---|---|---|
GITHUB_CLIENT_ID | No | — | GitHub OAuth App client ID. Create at github.com/settings/developers → OAuth Apps |
GITHUB_CLIENT_SECRET | No | — | GitHub OAuth App client secret |
GOOGLE_CLIENT_ID | No | — | Google OAuth client ID. Create at console.cloud.google.com/apis/credentials |
GOOGLE_CLIENT_SECRET | No | — | Google OAuth client secret |
DISCORD_CLIENT_ID | No | — | Discord OAuth2 client ID. Create at discord.com/developers |
DISCORD_CLIENT_SECRET | No | — | Discord OAuth2 client secret |
FEISHU_APP_ID | No | — | Feishu (Lark) App ID. Create at open.feishu.cn. Requires contact:user.email:readonly permission |
FEISHU_APP_SECRET | No | — | Feishu (Lark) App Secret |
FRONTEND_URL | Prod | http://localhost:3000 | Where the browser lands after OAuth completes. Must be set in production (e.g. https://yourdomain.com) |
API_BASE_URL | Prod | http://localhost:8000 | Externally reachable backend URL, used to build OAuth callback URLs. Must be set in production |
NEXT_PUBLIC_API_URL | Prod | (auto-detected as <hostname>:8000) | Browser-side API base URL for OAuth redirects. This is a frontend build-time variable — set it in frontend/.env.local for local dev, or pass it as a Docker build arg for custom production deployments. Auto-detection works for standard reverse-proxy setups (port 80/443). |
Prod = optional locally (defaults work), but required for any internet-facing deployment.
OAuth Callback URLs to register with each provider
The backend constructs callback URLs as: {API_BASE_URL}/api/auth/oauth/{provider}/callback
| Provider | Callback URL to register |
|---|---|
| GitHub | https://yourdomain.com/api/auth/oauth/github/callback |
| Google | https://yourdomain.com/api/auth/oauth/google/callback |
| Discord | https://yourdomain.com/api/auth/oauth/discord/callback |
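The construction rule can be expressed directly; this is a trivial sketch of the documented pattern with a placeholder domain.

```python
API_BASE_URL = "https://yourdomain.com"  # externally reachable backend URL

def oauth_callback_url(provider: str) -> str:
    """Mirror the documented pattern:
    {API_BASE_URL}/api/auth/oauth/{provider}/callback"""
    return f"{API_BASE_URL}/api/auth/oauth/{provider}/callback"

github_cb = oauth_callback_url("github")
```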
Cloudflare Tunnel (Optional)
Route all traffic through Cloudflare’s network instead of exposing ports directly. Eliminates the need for Nginx, SSL certificates, and open firewall rules. See the Production Deployment section for setup instructions.
| Variable | Required | Default | Description |
|---|---|---|---|
CLOUDFLARE_TUNNEL_TOKEN | Yes (if using Tunnel) | — | Token from Cloudflare Zero Trust → Networks → Tunnels → your tunnel → Configure. Starts with eyJ.... Required by the cloudflared sidecar in docker-compose.tunnel.yml. |
Analytics (Optional)
All analytics providers are optional. Set any combination — all active providers load simultaneously. Leave all blank to disable analytics entirely (recommended for local dev).
| Variable | Required | Default | Description |
|---|---|---|---|
NEXT_PUBLIC_GA_MEASUREMENT_ID | No | — | Google Analytics 4 measurement ID (e.g. G-XXXXXXXXXX). Get yours at analytics.google.com |
NEXT_PUBLIC_UMAMI_SCRIPT_URL | No | — | Umami analytics script URL (e.g. https://your-umami.com/script.js). Self-hosted, privacy-friendly alternative — umami.is |
NEXT_PUBLIC_UMAMI_WEBSITE_ID | No | — | Umami website ID. Required when NEXT_PUBLIC_UMAMI_SCRIPT_URL is set |
NEXT_PUBLIC_PLAUSIBLE_DOMAIN | No | — | Plausible analytics domain (e.g. yourdomain.com). Lightweight, privacy-friendly — plausible.io |
NEXT_PUBLIC_PLAUSIBLE_SCRIPT_URL | No | https://plausible.io/js/script.js | Custom Plausible script URL for self-hosted instances |
All NEXT_PUBLIC_* analytics variables are build-time — changes require a frontend rebuild to take effect.