All configuration lives in .env. Copy example.env and fill in your values.
Configuration Levels
Each integration has a configuration level indicating its importance:
| Level | Meaning | Behavior when not configured |
|---|---|---|
| Required | Core system dependency | System will error — chat and primary functions will not work |
| Recommended | Significant feature enabler | Graceful degradation — the feature is visibly unavailable but the system runs |
| Optional | Enhancement capability | Transparent degradation — system works fine, capability simply not present |
Note: Admin-configured models (Admin → Models page) can substitute for LLM environment variables. The health check considers both sources.
Frontend (Local Dev Only)
The frontend has a separate env file, used only for local development: frontend/.env.local.
This file is NOT used in Docker. Inside the Docker container, Next.js proxies /api/* to the Python backend internally (port 8000 is container-internal), so no frontend env file is needed.
For local dev, the defaults work out of the box — you do not need to create frontend/.env.local unless your backend runs on a non-default port.
If you need to override, create frontend/.env.local manually:
| Variable | Default | Description |
|---|---|---|
NEXT_PUBLIC_API_URL | http://localhost:8000 (auto) | Backend URL the browser uses for direct API calls (OAuth redirects, streaming). Auto-detected from window.location if unset — only override if your backend runs on a non-standard port locally. |
Build-time note: NEXT_PUBLIC_* variables are baked into the JS bundle at pnpm build time. Changing them at runtime (e.g. via the root .env) has no effect — this is why they live in frontend/.env.local for local dev only.
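For example, a minimal frontend/.env.local overriding the backend port (the port value here is illustrative):

```ini
# frontend/.env.local — local development only, not used in Docker
NEXT_PUBLIC_API_URL=http://localhost:8001
```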
LLM (Required)
| Variable | Required | Default | Description |
|---|---|---|---|
LLM_API_KEY | Yes | — | API key for the LLM provider |
LLM_BASE_URL | No | https://api.openai.com/v1 | Base URL of any OpenAI-compatible API |
LLM_MODEL | No | gpt-4o | Main model — used for planning, analysis, and ReAct agent |
FAST_LLM_MODEL | No | (falls back to LLM_MODEL) | Fast model — used for DAG step execution (cheaper, faster) |
LLM_TEMPERATURE | No | 0.7 | Default sampling temperature |
LLM_CONTEXT_SIZE | No | 128000 | Context window size for the main LLM |
LLM_MAX_OUTPUT_TOKENS | No | 64000 | Max output tokens per call for the main LLM |
FAST_LLM_API_KEY | No | (falls back to LLM_API_KEY) | API key for the fast model provider. Use when the fast model is hosted by a different provider than the main model |
FAST_LLM_BASE_URL | No | (falls back to LLM_BASE_URL) | Base URL for the fast model provider |
FAST_LLM_TEMPERATURE | No | (falls back to LLM_TEMPERATURE) | Sampling temperature for the fast model |
FAST_LLM_CONTEXT_SIZE | No | (falls back to LLM_CONTEXT_SIZE) | Context window size for the fast LLM |
FAST_LLM_MAX_OUTPUT_TOKENS | No | (falls back to LLM_MAX_OUTPUT_TOKENS) | Max output tokens per call for the fast LLM |
LLM_REASONING_EFFORT | No | (disabled) | Extended thinking level for supported models (OpenAI o-series, Gemini 2.5+, Claude). Values: low, medium, high. LiteLLM translates this to each provider’s native format automatically. The model’s chain-of-thought is surfaced in the UI “thinking” step. |
LLM_REASONING_BUDGET_TOKENS | No | (auto from effort) | Explicit token budget for Anthropic thinking (minimum 1024). For OpenAI/Gemini the effort level is used directly. Only effective when LLM_REASONING_EFFORT is set. |
LLM_JSON_MODE_ENABLED | No | true | Global toggle for response_format=json_object. Set to false if your provider rejects LiteLLM’s assistant prefill injection (e.g. AWS Bedrock relay → ValidationException on the 2nd+ agent iteration). When disabled, structured calls skip JSON mode and fall back to plain-text regex extraction — no quality loss. Applies to all models (ENV-configured and Admin-configured). |
LLM_TOOL_CHOICE_ENABLED | No | true | Global toggle for forced tool_choice in structured output extraction (Level 1 — Native Function Calling). Set to false if your model returns errors with forced tool selection (e.g. thinking-mode models that reject tool_choice='specified'). When disabled, structured calls skip native FC and start from JSON Mode. Per-model override available in Settings → Models → Advanced. |
REASONING_LLM_MODEL | No | (falls back to LLM_MODEL) | Model name for the reasoning tier. Used for tasks requiring deep analysis (e.g., DAG planning, plan analysis) |
REASONING_LLM_API_KEY | No | (falls back to LLM_API_KEY) | API key for the reasoning model provider |
REASONING_LLM_BASE_URL | No | (falls back to LLM_BASE_URL) | Base URL for the reasoning model provider |
REASONING_LLM_TEMPERATURE | No | (falls back to LLM_TEMPERATURE) | Sampling temperature for the reasoning model |
REASONING_LLM_CONTEXT_SIZE | No | (falls back to LLM_CONTEXT_SIZE) | Context window size for the reasoning model |
REASONING_LLM_MAX_OUTPUT_TOKENS | No | (falls back to LLM_MAX_OUTPUT_TOKENS) | Max output tokens per call for the reasoning model |
REASONING_LLM_EFFORT | No | (falls back to LLM_REASONING_EFFORT) | Reasoning effort level for the reasoning model tier. Values: low, medium, high |
REASONING_LLM_BUDGET | No | (falls back to LLM_REASONING_BUDGET_TOKENS) | Token budget for reasoning (primarily Anthropic). Overrides the auto-calculated budget for the reasoning tier |
LLM_SUPPORTS_VISION | No | true (optimistic) | Controls whether ENV-mode document OCR (via MarkItDown + markitdown-ocr) is attempted. Only applies when no active model group is configured in Admin → Models (pure ENV mode). When the default true is in effect, convert_to_markdown and RAG ingestion assume LLM_MODEL supports vision and call it for image OCR — this is the correct behavior for all common choices (gpt-4o, claude-3-5-sonnet, gemini-1.5-pro/flash). Set this to false when your ENV-configured LLM_MODEL does not support vision (e.g. deepseek-v3, qwen-chat, llama-3.1, gpt-3.5-turbo, o1-mini) to skip the failing vision call and go straight to text-only extraction. When an active model group exists in the Admin → Models panel, this flag is ignored and the group’s supports_vision flags take over — the admin-curated choice is always the source of truth in DB mode. |
Resolution order: User Preference → Admin Models (DB) → ENV Fallback. If an admin model with role “General” is configured in Admin → Models, these ENV vars serve as fallback only. The health check considers both sources.
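A minimal .env sketch for this section, assuming OpenAI as the provider. The key value is a placeholder; every omitted variable falls back as described in the table above.

```bash
# Required
LLM_API_KEY=sk-your-key-here
# Optional: any OpenAI-compatible endpoint and model choices
LLM_BASE_URL=https://api.openai.com/v1
LLM_MODEL=gpt-4o
FAST_LLM_MODEL=gpt-4o-mini
```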
MarkItDown OCR Resolution
The convert_to_markdown built-in tool and the RAG ingestion pipeline both use Microsoft’s MarkItDown + the official markitdown-ocr plugin to extract text from documents — including OCR on embedded images and scanned PDF pages when a vision-capable LLM is available.
Vision LLM resolution order (first match wins):
| # | Source | Priority rationale |
|---|---|---|
| 1 | Agent’s primary LLM if supports_vision=True | Consistency: same API key, same billing bucket, same rate-limit pool as the conversation. |
| 2 | Active ModelGroup → Fast Model if supports_vision=True | Fast models (gpt-4o-mini, claude-haiku, gemini-1.5-flash) are the ideal OCR workhorse — cheap, low-latency, usually multimodal. |
| 3 | Active ModelGroup → General Model if supports_vision=True | Quality fallback when the primary is not in the group. |
| 4 | ENV primary LLM (LLM_MODEL) | Optimistic fallback for pure ENV mode. Only taken when no active ModelGroup exists. Gated by LLM_SUPPORTS_VISION. |
Reasoning models (o1, o3-mini, DeepSeek-R1) historically lack vision support and are the wrong tool for OCR anyway — OCR is a perception task, not deliberation. If a workspace has only a reasoning model with supports_vision=True, it will still be picked up via the primary-LLM path, but the resolver does not actively rank it above fast/general.
Zero-regression fallback: when no vision-capable model is found at any level, OCR is silently disabled and MarkItDown runs in text-only mode. Word/PowerPoint/Excel embedded-image OCR becomes unavailable (same as before this feature shipped), but all other text extraction (headings, tables, paragraph text) continues to work unchanged. There is never a case where adding this feature made extraction worse than the previous behavior.
Non-OpenAI providers (Anthropic, Google Gemini, etc.) are supported transparently: the resolved LLM is wrapped in a LiteLLMOpenAIShim that routes chat.completions.create(...) calls through litellm.completion(), which handles the provider-specific message format translation (e.g. Anthropic’s source.type="base64" image block). One shim covers every provider LiteLLM supports — adding a new provider costs zero code changes in FIM One.
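The shim pattern can be sketched as below. This is an illustrative reconstruction, not FIM One's actual LiteLLMOpenAIShim; the injected completion_fn stands in for litellm.completion so the sketch stays self-contained and offline.

```python
from types import SimpleNamespace

class OpenAIStyleShim:
    """Expose an OpenAI-style client surface (client.chat.completions.create)
    while delegating to a provider-agnostic completion function. In the real
    shim the delegate would be litellm.completion, which translates
    provider-specific message formats; here it is injected for testability."""

    def __init__(self, completion_fn, model: str):
        self._completion_fn = completion_fn
        self._model = model
        # Mirror the OpenAI client attribute chain: client.chat.completions.create
        self.chat = SimpleNamespace(
            completions=SimpleNamespace(create=self._create)
        )

    def _create(self, messages, **kwargs):
        # The delegate receives model + messages, exactly like litellm.completion.
        return self._completion_fn(model=self._model, messages=messages, **kwargs)


# Usage with a stub delegate standing in for litellm.completion:
def fake_completion(model, messages, **kwargs):
    return {"model": model, "echo": messages[-1]["content"]}

client = OpenAIStyleShim(fake_completion, model="claude-3-5-sonnet")
resp = client.chat.completions.create(
    messages=[{"role": "user", "content": "hello"}]
)
```

Because the shim only has to speak the OpenAI client surface, swapping providers is a matter of changing the model string, which is why new LiteLLM providers need no code changes.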
Extended Thinking (Reasoning)
When LLM_REASONING_EFFORT is set, FIM One enables the model’s extended thinking capability so the internal chain-of-thought is surfaced in the UI “thinking” step. FIM One uses LiteLLM to translate the reasoning effort parameter into each provider’s native format automatically.
Supported providers
| Provider | LLM_BASE_URL | How it works | Reasoning content returned? |
|---|---|---|---|
| OpenAI (o1 / o3 / o4-mini) | https://api.openai.com/v1 | reasoning_effort sent natively | Yes |
| Anthropic (Claude 3.7+) | https://api.anthropic.com/v1/ | LiteLLM routes via native Anthropic API with thinking parameter | Yes |
| Google Gemini (2.5+) | https://generativelanguage.googleapis.com/v1beta/openai/ | reasoning_effort sent on compat endpoint | Yes |
The provider is detected from LLM_BASE_URL and the request is mapped to the correct API format. Unknown URLs are treated as OpenAI-compatible.
Important caveats
Temperature constraints with reasoning
Some providers impose temperature restrictions when reasoning is active:
- Anthropic: requires temperature=1 when extended thinking is enabled. If using Anthropic with extended thinking, set LLM_TEMPERATURE=1 — Anthropic rejects other values while thinking is on.
- OpenAI GPT-5.x: only supports temperature=1 at all times. LiteLLM’s drop_params filtering handles this automatically — unsupported temperature values are silently dropped, so no user action is needed for GPT-5.x.
How LLM_REASONING_BUDGET_TOKENS works
This variable is primarily meaningful for the Anthropic path. When set, it overrides the auto-calculated budget and is sent as budget_tokens in the thinking parameter via LiteLLM. When not set, the budget is derived from LLM_MAX_OUTPUT_TOKENS x effort ratio:
| LLM_REASONING_EFFORT | Budget ratio | Example (max_tokens = 64000) |
|---|---|---|
| low | 20% | 12,800 tokens |
| medium | 50% | 32,000 tokens |
| high | 80% | 51,200 tokens |
For OpenAI and Gemini, only the reasoning_effort level is sent — LLM_REASONING_BUDGET_TOKENS has no effect.
Agent Execution
ReAct Agent
| Variable | Required | Default | Description |
|---|---|---|---|
REACT_MAX_ITERATIONS | No | 20 | Max tool-call iterations per ReAct request. Higher = more thorough but slower and costlier |
REACT_MAX_TURN_TOKENS | No | 0 | Emergency circuit-breaker: max cumulative tokens (prompt + completion across all iterations) per single ReAct turn. Default 0 = unlimited. This is NOT for daily token control — use per-user token_quota for that. This is a last-resort safety valve for extreme scenarios like an agent stuck in an infinite tool-call loop. Hitting this limit aborts the task mid-execution, wasting all tokens consumed so far and returning an incomplete result. Keep at 0 unless you have a specific runaway-agent problem to contain |
REACT_TOOL_SELECTION_THRESHOLD | No | 12 | When the total number of registered tools exceeds this threshold, a lightweight LLM call selects the most relevant subset before each request |
REACT_TOOL_SELECTION_MAX | No | 6 | Max tools to keep after smart selection (only effective when tool count exceeds REACT_TOOL_SELECTION_THRESHOLD) |
REACT_SELF_REFLECTION_INTERVAL | No | 6 | Inject a self-reflection prompt every N tool calls to help the agent course-correct and avoid loops |
REACT_TOOL_OBS_TRUNCATION | No | 8000 | Max characters per tool observation when synthesizing the final answer. Higher values preserve more structured data (JSON, tables) at the cost of more tokens |
REACT_TOOL_RESULT_BUDGET | No | 40000 | Aggregate token budget for all tool results in a single session. When total tool-result tokens exceed this cap, new results are truncated with a notice. Prevents context bloat from large API responses (e.g., 5 connector calls returning 8K each). Set to 0 to disable the cap |
REACT_COMPLETION_CHECK_SKIP_CHARS | No | 800 | Skip the post-answer completion-check LLM call when the agent’s final answer exceeds this many characters. Long detailed answers don’t need a “did I miss anything?” verification round-trip. Set lower to skip more aggressively; set to a very large value to always run the check |
REACT_CYCLE_DETECTION_THRESHOLD | No | 2 | When the same tool is called with identical arguments this many times in a row, a deterministic warning is injected telling the agent to try a different approach. Unlike self-reflection (which relies on the LLM noticing the loop), this is a hash-based check that cannot be bypassed. Also applies to DAG steps |
REACT_COMPLETION_CHECK_MIN_TOOLS | No | 3 | Minimum number of tool calls before the completion checklist fires. Simple tasks (1-2 tool calls) skip verification to avoid unnecessary latency. Set to 1 for always-on verification. Also applies to DAG steps |
REACT_TURN_PROFILE_ENABLED | No | true | Emit per-turn phase-level timing logs (memory_load, compact, tool_schema_build, llm_first_token, llm_total, tool_exec). One structured log line per turn. Set to false to disable profiling entirely (zero overhead) |
LLM_RATE_LIMIT_PER_USER | No | true | Use per-user keyed rate-limit buckets instead of a single process-global bucket. Prevents one noisy user from starving all others on the same worker. The underlying rate is hardcoded at 60 requests/min and 100K tokens/min per bucket — this setting only controls whether the bucket is shared (global) or partitioned (per-user). Set to false to revert to the legacy global bucket (not recommended) |
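A hash-based repeat check like the one REACT_CYCLE_DETECTION_THRESHOLD describes can be sketched as follows. Class and method names are hypothetical; the point is that the check is deterministic rather than relying on the LLM to notice its own loop.

```python
import hashlib
import json

class CycleDetector:
    """Count consecutive identical tool calls by hashing the tool name plus
    its (canonically serialized) arguments. When the streak reaches the
    threshold, the caller injects a warning telling the agent to change
    approach."""

    def __init__(self, threshold: int = 2):  # REACT_CYCLE_DETECTION_THRESHOLD
        self.threshold = threshold
        self._last_hash = None
        self._streak = 0

    def record(self, tool_name: str, args: dict) -> bool:
        """Return True when the repeat warning should fire."""
        call_hash = hashlib.sha256(
            (tool_name + json.dumps(args, sort_keys=True)).encode()
        ).hexdigest()
        if call_hash == self._last_hash:
            self._streak += 1
        else:
            self._last_hash, self._streak = call_hash, 1
        return self._streak >= self.threshold

det = CycleDetector(threshold=2)
hits = [det.record("web_search", {"q": "foo"}),   # first call: no warning
        det.record("web_search", {"q": "foo"}),   # identical repeat: warn
        det.record("web_search", {"q": "bar"})]   # new args reset the streak
```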
DAG Planner
| Variable | Required | Default | Description |
|---|---|---|---|
MAX_CONCURRENCY | No | 5 | Max parallel steps in DAG executor |
DAG_STEP_MAX_ITERATIONS | No | 15 | Max tool-call iterations within each DAG step |
DAG_STEP_TIMEOUT | No | 600 | Step execution timeout in seconds. Steps exceeding this are marked as failed and their dependents are cascade-skipped |
DAG_MAX_REPLAN_ROUNDS | No | 3 | Max autonomous re-plan attempts when goal is not achieved. User interrupts (inject) are unlimited and do not count against this budget |
DAG_REPLAN_STOP_CONFIDENCE | No | 0.8 | Stop retrying when agent confidence that goal is unachievable exceeds this threshold (0.0 = never stop early, 1.0 = stop on any failure) |
DAG_VERIFY_TRUNCATION | No | 2000 | Max characters of step output sent to the step verifier LLM for quality judgment |
DAG_ANALYZER_TRUNCATION | No | 10000 | Max characters per step result when formatting for the post-execution analyzer |
DAG_REPLAN_RECENT_TRUNCATION | No | 500 | Max characters per step result from the most recent round when building re-plan context |
DAG_REPLAN_OLDER_TRUNCATION | No | 200 | Max characters per step result from older rounds when building re-plan context. Older rounds are more aggressively truncated to save context |
DAG_TOOL_CACHE | No | true | Cache identical tool calls within a single DAG execution. Only tools explicitly marked as cacheable (read-only tools like search, knowledge retrieval) are cached. Set to false to disable caching entirely |
DAG_STEP_VERIFICATION | No | false | Generic LLM-based quality check after each DAG step. On failure, the step retries once with feedback. Default off — adds latency on every step and is rarely needed; most step outputs are acceptable without re-checking. Use only when you observe frequent low-quality step results |
DAG_CITATION_VERIFICATION | No | true | Citation-accuracy check for specialist-domain steps. Prerequisite: the query must first be classified as a specialist domain by the LLM domain classifier (see ESCALATION_DOMAINS). When the domain is detected AND this flag is true, each completed step is scanned for legal/medical/financial citations and verified for accuracy — catching hallucinated article numbers, fabricated case references, and incorrect regulatory citations. If domain classification returns null (general query), citation verification does not run regardless of this setting |
DAG_CITATION_VERIFY_TRUNCATION | No | 6000 | Max characters of step result sent to the citation verification prompt |
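How MAX_CONCURRENCY-style gating might look with an asyncio semaphore. This is an illustrative sketch, not the real DAG executor; the step body is a stand-in, and the timeout path shows where DAG_STEP_TIMEOUT failures would be marked.

```python
import asyncio

MAX_CONCURRENCY = 5  # mirrors the MAX_CONCURRENCY setting

async def run_step(sem: asyncio.Semaphore, step_id: str, timeout: float) -> str:
    """Run one DAG step under the shared concurrency gate. A step that
    exceeds the timeout would be marked failed and its dependents
    cascade-skipped."""
    async with sem:
        try:
            await asyncio.wait_for(asyncio.sleep(0), timeout=timeout)
            return f"{step_id}: ok"
        except asyncio.TimeoutError:
            return f"{step_id}: failed (timeout)"

async def run_ready_steps(step_ids: list[str]) -> list[str]:
    # All steps whose dependencies are satisfied run in parallel,
    # capped at MAX_CONCURRENCY in flight at once.
    sem = asyncio.Semaphore(MAX_CONCURRENCY)
    return await asyncio.gather(
        *(run_step(sem, sid, timeout=600) for sid in step_ids)
    )

results = asyncio.run(run_ready_steps(["s1", "s2", "s3"]))
```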
Domain Classification
Controls the independent LLM-based domain detection layer that runs before both ReAct and DAG execution. When a query is classified as a specialist domain, the system activates domain-aware features: model escalation to the reasoning model, domain-specific SOP instructions, and citation verification (DAG only).
| Variable | Required | Default | Description |
|---|---|---|---|
ESCALATION_DOMAINS | No | legal,medical,financial,tax,compliance,patent | Comma-separated list of specialist domains. A fast LLM classifies each query against this list. When matched, the system: (1) upgrades to the reasoning model for higher accuracy, (2) injects domain-specific SOP instructions (e.g. verify citations via search before writing), (3) enables citation verification for DAG steps. Add custom domains as needed (e.g. legal,education,construction) |
Context Guard
Controls the automatic context window management that prevents conversations from exceeding the model’s limit.
| Variable | Required | Default | Description |
|---|---|---|---|
CONTEXT_GUARD_DEFAULT_BUDGET | No | 32000 | Default token budget for context window management. When the conversation exceeds this, older messages are compacted |
CONTEXT_GUARD_MAX_MSG_CHARS | No | 50000 | Hard character limit on any single message. Messages exceeding this are truncated as a safety net |
CONTEXT_GUARD_KEEP_RECENT | No | 4 | Number of most recent messages to preserve when compacting conversation history |
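The behavior these three variables describe can be sketched as below. This is a simplification: the real ContextGuard budgets in tokens, while this sketch only illustrates the per-message clipping safety net and the keep-recent rule.

```python
def compact_history(messages: list[dict], keep_recent: int = 4,
                    max_msg_chars: int = 50_000) -> list[dict]:
    """Clip oversized messages (CONTEXT_GUARD_MAX_MSG_CHARS safety net),
    then keep only the CONTEXT_GUARD_KEEP_RECENT most recent messages,
    replacing the rest with a compaction marker. Marker format is
    illustrative."""
    clipped = [
        {**m, "content": m["content"][:max_msg_chars]} for m in messages
    ]
    if len(clipped) <= keep_recent:
        return clipped
    dropped = len(clipped) - keep_recent
    summary = {"role": "system",
               "content": f"[{dropped} earlier messages compacted]"}
    return [summary] + clipped[-keep_recent:]

history = [{"role": "user", "content": f"msg {i}"} for i in range(10)]
compacted = compact_history(history, keep_recent=4)
```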
Agent Workspace
| Variable | Required | Default | Description |
|---|---|---|---|
WORKSPACE_OFFLOAD_THRESHOLD | No | 8000 | When a tool output exceeds this many characters, it is saved to a workspace file and a truncated preview is injected into the conversation context |
WORKSPACE_PREVIEW_CHARS | No | 2000 | Number of preview characters to include in truncated workspace references |
WORKSPACE_CLEANUP_MAX_HOURS | No | 72 | Workspace files older than this many hours are eligible for automatic cleanup |
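The offload flow can be sketched as follows; file naming and the reference format in the preview are assumptions.

```python
import tempfile
from pathlib import Path

OFFLOAD_THRESHOLD = 8_000   # WORKSPACE_OFFLOAD_THRESHOLD
PREVIEW_CHARS = 2_000       # WORKSPACE_PREVIEW_CHARS

def maybe_offload(output: str, workspace: Path) -> str:
    """If a tool output exceeds the threshold, persist the full text to a
    workspace file and return a truncated preview with a file reference;
    otherwise return the output unchanged."""
    if len(output) <= OFFLOAD_THRESHOLD:
        return output
    path = workspace / "tool_output_1.txt"  # hypothetical naming scheme
    path.write_text(output)
    return (output[:PREVIEW_CHARS]
            + f"\n... [truncated; full output saved to {path.name}]")

ws = Path(tempfile.mkdtemp())
small = maybe_offload("short result", ws)      # under threshold: unchanged
large = maybe_offload("x" * 10_000, ws)        # over threshold: offloaded
```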
System
| Variable | Required | Default | Description |
|---|---|---|---|
SYSTEM_PROMPT_RESERVE | — | — | Removed. Previously subtracted a fixed 4K reserve from the context budget for system prompts. This caused double-counting because ContextGuard already includes the system prompt when estimating message list tokens. The budget formula is now simply context_size - max_output_tokens, and the system prompt’s actual size is accounted for dynamically |
Web Tools (Optional)
| Variable | Required | Default | Description |
|---|---|---|---|
JINA_API_KEY | No | — | Jina API key. Acts as a shared fallback for search, fetch, embedding, and reranker when no service-specific key is set. Get yours at jina.ai |
TAVILY_API_KEY | No | — | Tavily Search API key (auto-selected if set and WEB_SEARCH_PROVIDER is unset) |
BRAVE_API_KEY | No | — | Brave Search API key (auto-selected if set and WEB_SEARCH_PROVIDER is unset) |
EXA_API_KEY | No | — | Exa Search API key (auto-selected if set and WEB_SEARCH_PROVIDER is unset). Get yours at exa.ai |
WEB_SEARCH_PROVIDER | No | jina | Search provider selector: jina / tavily / brave / exa |
WEB_FETCH_PROVIDER | No | jina (if key set, else httpx) | Fetch provider: jina (uses Jina Reader API) / httpx (direct HTTP request, no API key needed) |
Quick start tip: Setting just JINA_API_KEY enables web search, web fetch, embedding, and reranking all at once — one key, four services. You can override each service individually with the variables below.
RAG & Knowledge Base (Recommended)
Embedding
Embedding converts text into vectors for knowledge base search. FIM One uses the standard OpenAI-compatible/v1/embeddings endpoint, so it works with any provider that exposes this interface — not just Jina.
| Variable | Required | Default | Description |
|---|---|---|---|
EMBEDDING_API_KEY | No | (falls back to JINA_API_KEY) | API key for the embedding provider |
EMBEDDING_BASE_URL | No | https://api.jina.ai/v1 | Base URL for the embedding provider |
EMBEDDING_MODEL | No | jina-embeddings-v3 | Model identifier |
EMBEDDING_DIMENSION | No | 1024 | Vector dimension |
| Provider | EMBEDDING_BASE_URL | EMBEDDING_MODEL | EMBEDDING_DIMENSION |
|---|---|---|---|
| Jina (default) | https://api.jina.ai/v1 | jina-embeddings-v3 | 1024 |
| OpenAI | https://api.openai.com/v1 | text-embedding-3-small | 1536 |
| Voyage | https://api.voyageai.com/v1 | voyage-3 | 1024 |
| Ollama (local) | http://localhost:11434/v1 | nomic-embed-text | 768 |
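For example, a fully local setup using the Ollama row from the table above. The API key value is a placeholder; Ollama's OpenAI-compatible endpoint ignores it.

```bash
EMBEDDING_API_KEY=ollama
EMBEDDING_BASE_URL=http://localhost:11434/v1
EMBEDDING_MODEL=nomic-embed-text
EMBEDDING_DIMENSION=768
```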
Retrieval
| Variable | Required | Default | Description |
|---|---|---|---|
RETRIEVAL_MODE | No | grounding | grounding (full pipeline with citations and confidence scoring) or simple (basic RAG) |
Reranker
Reranker re-scores retrieved documents to improve relevance. Three providers are supported — select via RERANKER_PROVIDER or let the system auto-detect from available API keys.
| Variable | Required | Default | Description |
|---|---|---|---|
RERANKER_PROVIDER | No | (auto-detect) | jina / cohere / openai. If unset: uses Cohere if COHERE_API_KEY set, otherwise Jina |
RERANKER_MODEL | No | jina-reranker-v2-base-multilingual | Model identifier (applies to Jina and OpenAI providers) |
COHERE_API_KEY | No | — | Cohere API key (auto-selects Cohere reranker when set and RERANKER_PROVIDER is unset) |
COHERE_RERANKER_MODEL | No | rerank-multilingual-v3.0 | Cohere-specific reranker model |
Jina uses JINA_API_KEY (from Web Tools above). OpenAI reuses LLM_API_KEY / LLM_BASE_URL — no extra key needed. Cohere requires its own COHERE_API_KEY.
Reranker is optional — knowledge base search works without it using fusion scoring. Embedding is recommended for knowledge base features.
Vector Store
| Variable | Required | Default | Description |
|---|---|---|---|
VECTOR_STORE_DIR | No | ./data/vector_store | Directory for LanceDB vector store data (file-based, zero external services) |
Code Execution
| Variable | Required | Default | Description |
|---|---|---|---|
CODE_EXEC_BACKEND | No | local | local (direct host execution) or docker (isolated containers) |
DOCKER_PYTHON_IMAGE | No | python:3.11-slim | Docker image for Python execution |
DOCKER_NODE_IMAGE | No | node:20-slim | Docker image for Node.js execution |
DOCKER_SHELL_IMAGE | No | python:3.11-slim | Docker image for shell execution |
DOCKER_MEMORY | No | (Docker default) | RAM cap per container (e.g. 256m, 512m, 1g) |
DOCKER_CPUS | No | (Docker default) | CPU quota per container (e.g. 0.5, 1.0) |
SANDBOX_TIMEOUT | No | 120 | Default execution timeout in seconds |
DOCKER_HOST_DATA_DIR | No | (not set) | Host-side absolute path of the ./data volume mount. Required for DooD (Docker-outside-of-Docker) deployments; docker-compose.yml auto-sets via ${PWD}/data. |
Security: local mode runs AI-generated code directly on the host. For internet-facing or multi-user deployments, always set CODE_EXEC_BACKEND=docker.
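A hardened .env sketch for an internet-facing deployment; the resource limit values are illustrative.

```bash
CODE_EXEC_BACKEND=docker
DOCKER_MEMORY=512m
DOCKER_CPUS=1.0
SANDBOX_TIMEOUT=120
```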
Tool Artifacts
Size limits for files produced by tool execution (code execution, template rendering, image generation).
| Variable | Required | Default | Description |
|---|---|---|---|
MAX_ARTIFACT_SIZE | No | 10485760 (10 MB) | Max single artifact file size in bytes |
MAX_ARTIFACTS_TOTAL | No | 52428800 (50 MB) | Max total artifact size per session in bytes |
Document Processing (Optional)
Controls how uploaded PDF/DOCX files are processed for LLM consumption. Vision-capable models (GPT-4o, Claude 3/4, Gemini) can receive PDF pages as rendered images for higher fidelity.
| Variable | Required | Default | Description |
|---|---|---|---|
DOCUMENT_PROCESSING_MODE | No | auto | auto (vision if model supports it), vision (always render pages), text (always extract text only) |
DOCUMENT_VISION_DPI | No | 150 | DPI for PDF page rendering. Higher = better quality, more tokens |
DOCUMENT_VISION_MAX_PAGES | No | 20 | Maximum pages to render as images per PDF |
Note: Per-model vision support is configured via the supports_vision toggle in Admin → Models. When not explicitly set, the system auto-detects vision capability from the model name.
Image Generation (Optional)
| Variable | Required | Default | Description |
|---|---|---|---|
IMAGE_GEN_PROVIDER | No | google | google (Gemini native API) or openai (OpenAI-compatible /v1/images/generations) |
IMAGE_GEN_API_KEY | No | — | Google AI Studio key (google) or proxy/OpenAI API key (openai) |
IMAGE_GEN_MODEL | No | gemini-3.1-flash-image-preview | Image generation model (e.g. dall-e-3, gemini-nano-banana-2) |
IMAGE_GEN_BASE_URL | No | (per provider) | Google: https://generativelanguage.googleapis.com/v1beta; OpenAI: https://api.openai.com/v1 |
Email (SMTP) (Recommended)
Auto-registers the email_send built-in tool when SMTP_HOST, SMTP_USER, and SMTP_PASS are all set.
| Variable | Required | Default | Description |
|---|---|---|---|
SMTP_HOST | Cond. | — | SMTP server hostname |
SMTP_PORT | No | 465 | SMTP port |
SMTP_SSL | No | ssl | TLS mode: ssl (port 465) / tls (STARTTLS, port 587) / "" (plain) |
SMTP_USER | Cond. | — | SMTP login username |
SMTP_PASS | Cond. | — | SMTP login password |
SMTP_FROM | No | (uses SMTP_USER) | Sender address shown in From header |
SMTP_FROM_NAME | No | — | Display name shown in From header |
SMTP_REPLY_TO | No | — | Reply-To address; replies go here instead of SMTP_FROM |
SMTP_ALLOWED_DOMAINS | No | — | Comma-separated domain allowlist (e.g. example.com,corp.io); blocks recipients outside listed domains |
SMTP_ALLOWED_ADDRESSES | No | — | Comma-separated exact-address allowlist; combined with SMTP_ALLOWED_DOMAINS; leave both unset to allow any recipient (not recommended for shared mailboxes) |
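The allowlist combination can be sketched as below. Matching semantics such as case folding are assumptions; the real implementation may differ.

```python
def recipient_allowed(addr: str,
                      allowed_domains: str = "",
                      allowed_addresses: str = "") -> bool:
    """Combine SMTP_ALLOWED_DOMAINS and SMTP_ALLOWED_ADDRESSES: a recipient
    passes if it matches either list. When both are unset, any recipient is
    allowed, matching the documented default."""
    domains = {d.strip().lower()
               for d in allowed_domains.split(",") if d.strip()}
    addresses = {a.strip().lower()
                 for a in allowed_addresses.split(",") if a.strip()}
    if not domains and not addresses:
        return True
    addr = addr.strip().lower()
    domain = addr.rsplit("@", 1)[-1]
    return addr in addresses or domain in domains

checks = [
    recipient_allowed("bob@example.com", "example.com,corp.io"),  # allowed
    recipient_allowed("eve@evil.com", "example.com,corp.io"),     # blocked
    recipient_allowed("anyone@anywhere.io"),  # no lists set: allowed
]
```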
Connectors
| Variable | Required | Default | Description |
|---|---|---|---|
CONNECTOR_RESPONSE_MAX_CHARS | No | 50000 | Max characters for non-array JSON / plain-text connector responses |
CONNECTOR_RESPONSE_MAX_ITEMS | No | 10 | Max array items to keep when connector response is a JSON array |
CREDENTIAL_ENCRYPTION_KEY | No | (unset) | Fernet encryption key for connector credential blobs. When set, auth tokens stored in connector_credentials are encrypted at rest. If unset, credentials are stored as plaintext JSON (backward-compatible). Changing this key invalidates all existing encrypted credentials. |
CONNECTOR_TOOL_MODE | No | progressive | How connector tools are exposed to agents. progressive: single ConnectorMetaTool with discover/execute subcommands (~30 tokens/connector). classic: one tool per action (legacy, ~250 tokens/action). |
DATABASE_TOOL_MODE | No | progressive | How database connector tools are exposed to agents. progressive: single DatabaseMetaTool with list_tables/discover/query subcommands. legacy: one tool per action per database connector (3 tools each). |
MCP_TOOL_MODE | No | progressive | How MCP server tools are exposed to agents. progressive: single MCPServerMetaTool with discover/call subcommands. legacy: one tool per MCP server action (original individual tools). |
Platform
| Variable | Required | Default | Description |
|---|---|---|---|
DATABASE_URL | No | sqlite+aiosqlite:///./data/fim_one.db | Database connection string. SQLite (zero-config): sqlite+aiosqlite:///./data/fim_one.db. PostgreSQL (production): postgresql+asyncpg://user:pass@localhost:5432/fim_one. Docker Compose auto-sets PostgreSQL. |
JWT_SECRET_KEY | No | CHANGE_ME | Secret key for JWT token signing. Placeholder value CHANGE_ME (or any legacy default) triggers auto-generation of a secure 256-bit random key on first start, which is written back to .env. Set explicitly in production to keep tokens valid across restarts and replicas. |
CORS_ORIGINS | No | — | Comma-separated list of extra allowed CORS origins beyond the default localhost entries. Required when the frontend runs on a non-localhost domain (e.g. https://app.example.com). |
UPLOADS_DIR | No | ./uploads | Directory for uploaded files |
MAX_UPLOAD_SIZE_MB | No | 50 | Max file upload size in megabytes (backend enforcement) |
NEXT_PUBLIC_MAX_UPLOAD_SIZE_MB | No | 50 | Max file upload size shown in frontend UI. Build-time variable — must match MAX_UPLOAD_SIZE_MB. |
MCP_SERVERS | No | — | JSON array of MCP server configs (requires uv sync --extra mcp) |
ALLOW_STDIO_MCP | No | false | Allow stdio MCP servers. Set true only for trusted local deployments |
ALLOWED_STDIO_COMMANDS | No | npx,uvx,node,python,python3,deno,bun | Comma-separated list of allowed base commands for stdio MCP servers. Only effective when ALLOW_STDIO_MCP=true |
LOG_LEVEL | No | INFO | Logging level: DEBUG / INFO / WARNING / ERROR / CRITICAL |
REDIS_URL | No | — | Redis connection URL for cross-worker interrupt relay. Required when WORKERS>1 — without it, mid-stream interrupt/inject requests may hit a different worker and silently fail. Auto-configured by Docker Compose. |
WORKERS | No | 1 | Uvicorn worker processes. 1 is safe and needs no external services. For production multi-worker, use PostgreSQL (SQLite is single-writer). SQLite works for local dev under light load. Auth, OAuth, and file operations are fully multi-worker safe (JWT-based). Docker Compose auto-configures both PostgreSQL and Redis. |
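To pin JWT_SECRET_KEY explicitly for production, a 256-bit key can be generated with the standard library; this mirrors the auto-generation behavior in spirit, though the exact format FIM One writes back is not specified here.

```python
import secrets

# 32 random bytes = 256 bits, URL-safe encoded for use in .env
jwt_secret = secrets.token_urlsafe(32)
print(f"JWT_SECRET_KEY={jwt_secret}")
```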
Workflow Run Retention
Background cleanup task that automatically purges old workflow runs. Per-workflow overrides (configured in the workflow settings UI) take priority over these global defaults.
| Variable | Required | Default | Description |
|---|---|---|---|
WORKFLOW_RUN_MAX_AGE_DAYS | No | 30 | Delete workflow runs older than this many days |
WORKFLOW_RUN_MAX_PER_WORKFLOW | No | 100 | Keep at most this many runs per workflow (oldest deleted first) |
WORKFLOW_RUN_CLEANUP_INTERVAL_HOURS | No | 24 | How often the background cleanup task runs, in hours |
Channel Confirmation Request Expiry
Background sweeper that marks stale pending approval requests (produced by channel hooks like FeishuGateHook or the Approval Playground) as expired. Ensures a click days later on a forgotten card doesn’t flip agent state that has already been torn down.
| Variable | Required | Default | Description |
|---|---|---|---|
CHANNEL_CONFIRMATION_TTL_MINUTES | No | 1440 | Pending confirmations older than this are auto-expired (default: 24 hours) |
CHANNEL_CONFIRMATION_SWEEP_INTERVAL_SECONDS | No | 600 | How often the expiry sweeper runs (default: every 10 minutes) |
OAuth (Optional)
When both CLIENT_ID and CLIENT_SECRET are set for a provider, the login page automatically shows the corresponding OAuth button.
| Variable | Required | Default | Description |
|---|---|---|---|
GITHUB_CLIENT_ID | No | — | GitHub OAuth App client ID. Create at github.com/settings/developers → OAuth Apps |
GITHUB_CLIENT_SECRET | No | — | GitHub OAuth App client secret |
GOOGLE_CLIENT_ID | No | — | Google OAuth client ID. Create at console.cloud.google.com/apis/credentials |
GOOGLE_CLIENT_SECRET | No | — | Google OAuth client secret |
DISCORD_CLIENT_ID | No | — | Discord OAuth2 client ID. Create at discord.com/developers |
DISCORD_CLIENT_SECRET | No | — | Discord OAuth2 client secret |
FEISHU_APP_ID | No | — | Feishu (Lark) App ID. Create at open.feishu.cn. Requires contact:user.email:readonly permission |
FEISHU_APP_SECRET | No | — | Feishu (Lark) App Secret |
FRONTEND_URL | Prod | http://localhost:3000 | Where the browser lands after OAuth completes. Must be set in production (e.g. https://yourdomain.com) |
API_BASE_URL | Prod | http://localhost:8000 | Externally reachable backend URL, used to build OAuth callback URLs. Must be set in production |
NEXT_PUBLIC_API_URL | Prod | (auto-detected as <hostname>:8000) | Browser-side API base URL for OAuth redirects. This is a frontend build-time variable — set it in frontend/.env.local for local dev, or pass it as a Docker build arg for custom production deployments. Auto-detection works for standard reverse-proxy setups (port 80/443). |
Prod = optional locally (defaults work), but required for any internet-facing deployment.
OAuth Callback URLs to register with each provider
The backend constructs callback URLs as: {API_BASE_URL}/api/auth/oauth/{provider}/callback
| Provider | Callback URL to register |
|---|---|
| GitHub | https://yourdomain.com/api/auth/oauth/github/callback |
| Google | https://yourdomain.com/api/auth/oauth/google/callback |
| Discord | https://yourdomain.com/api/auth/oauth/discord/callback |
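The construction rule can be expressed directly; this is a trivial sketch of the documented pattern with a placeholder domain.

```python
API_BASE_URL = "https://yourdomain.com"  # externally reachable backend URL

def oauth_callback_url(provider: str) -> str:
    """Mirror the documented pattern:
    {API_BASE_URL}/api/auth/oauth/{provider}/callback"""
    return f"{API_BASE_URL}/api/auth/oauth/{provider}/callback"

github_cb = oauth_callback_url("github")
```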
Cloudflare Tunnel (Optional)
Route all traffic through Cloudflare’s network instead of exposing ports directly. Eliminates the need for Nginx, SSL certificates, and open firewall rules. See the Production Deployment section for setup instructions.
| Variable | Required | Default | Description |
|---|---|---|---|
CLOUDFLARE_TUNNEL_TOKEN | Yes (if using Tunnel) | — | Token from Cloudflare Zero Trust → Networks → Tunnels → your tunnel → Configure. Starts with eyJ.... Required by the cloudflared sidecar in docker-compose.tunnel.yml. |
Analytics (Optional)
All analytics providers are optional. Set any combination — all active providers load simultaneously. Leave all blank to disable analytics entirely (recommended for local dev).
| Variable | Required | Default | Description |
|---|---|---|---|
NEXT_PUBLIC_GA_MEASUREMENT_ID | No | — | Google Analytics 4 measurement ID (e.g. G-XXXXXXXXXX). Get yours at analytics.google.com |
NEXT_PUBLIC_UMAMI_SCRIPT_URL | No | — | Umami analytics script URL (e.g. https://your-umami.com/script.js). Self-hosted, privacy-friendly alternative — umami.is |
NEXT_PUBLIC_UMAMI_WEBSITE_ID | No | — | Umami website ID. Required when NEXT_PUBLIC_UMAMI_SCRIPT_URL is set |
NEXT_PUBLIC_PLAUSIBLE_DOMAIN | No | — | Plausible analytics domain (e.g. yourdomain.com). Lightweight, privacy-friendly — plausible.io |
NEXT_PUBLIC_PLAUSIBLE_SCRIPT_URL | No | https://plausible.io/js/script.js | Custom Plausible script URL for self-hosted instances |
All NEXT_PUBLIC_* analytics variables are build-time — changes require a frontend rebuild to take effect.