All configuration is done via .env. Copy example.env and fill in your values:
cp example.env .env

Configuration Levels

Each integration has a configuration level indicating its importance:
| Level | Meaning | Behavior when not configured |
|---|---|---|
| Required | Core system dependency | System will error — chat and primary functions will not work |
| Recommended | Significant feature enabler | Graceful degradation — the feature is visibly unavailable but the system runs |
| Optional | Enhancement capability | Transparent degradation — the system works fine; the capability is simply not present |
Note: Admin-configured models (Admin → Models page) can substitute for LLM environment variables. The health check considers both sources.

Frontend (Local Dev Only)

The frontend has a separate env file only for local development: frontend/.env.local.
This file is NOT used in Docker. Inside the Docker container, Next.js proxies /api/* to the Python backend internally (port 8000 is container-internal), so no frontend env file is needed.
For local dev, the defaults work out of the box — you do not need to create frontend/.env.local unless your backend runs on a non-default port. If you need to override, create frontend/.env.local manually:
echo 'NEXT_PUBLIC_API_URL=http://localhost:9000' > frontend/.env.local
| Variable | Default | Description |
|---|---|---|
| NEXT_PUBLIC_API_URL | http://localhost:8000 (auto) | Backend URL the browser uses for direct API calls (OAuth redirects, streaming). Auto-detected from window.location if unset — only override if your backend runs on a non-standard port locally |
Build-time note: NEXT_PUBLIC_* variables are baked into the JS bundle at pnpm build time. Changing them at runtime (e.g. via root .env) has no effect — this is why they live in frontend/.env.local for local dev only.

LLM (Required)

| Variable | Required | Default | Description |
|---|---|---|---|
| LLM_API_KEY | Yes | | API key for the LLM provider |
| LLM_BASE_URL | No | https://api.openai.com/v1 | Base URL of any OpenAI-compatible API |
| LLM_MODEL | No | gpt-4o | Main model — used for planning, analysis, and the ReAct agent |
| FAST_LLM_MODEL | No | (falls back to LLM_MODEL) | Fast model — used for DAG step execution (cheaper, faster) |
| LLM_TEMPERATURE | No | 0.7 | Default sampling temperature |
| LLM_CONTEXT_SIZE | No | 128000 | Context window size for the main LLM |
| LLM_MAX_OUTPUT_TOKENS | No | 64000 | Max output tokens per call for the main LLM |
| FAST_LLM_API_KEY | No | (falls back to LLM_API_KEY) | API key for the fast model provider. Use when the fast model is hosted by a different provider than the main model |
| FAST_LLM_BASE_URL | No | (falls back to LLM_BASE_URL) | Base URL for the fast model provider |
| FAST_LLM_TEMPERATURE | No | (falls back to LLM_TEMPERATURE) | Sampling temperature for the fast model |
| FAST_LLM_CONTEXT_SIZE | No | (falls back to LLM_CONTEXT_SIZE) | Context window size for the fast LLM |
| FAST_LLM_MAX_OUTPUT_TOKENS | No | (falls back to LLM_MAX_OUTPUT_TOKENS) | Max output tokens per call for the fast LLM |
| LLM_REASONING_EFFORT | No | (disabled) | Extended thinking level for supported models (OpenAI o-series, Gemini 2.5+, Claude). Values: low, medium, high. LiteLLM translates this to each provider’s native format automatically. The model’s chain-of-thought is surfaced in the UI “thinking” step |
| LLM_REASONING_BUDGET_TOKENS | No | (auto from effort) | Explicit token budget for Anthropic thinking (minimum 1024). For OpenAI/Gemini the effort level is used directly. Only effective when LLM_REASONING_EFFORT is set |
| LLM_JSON_MODE_ENABLED | No | true | Global toggle for response_format=json_object. Set to false if your provider rejects LiteLLM’s assistant prefill injection (e.g. AWS Bedrock relay → ValidationException on the 2nd+ agent iteration). When disabled, structured calls skip JSON mode and fall back to plain-text regex extraction — no quality loss. Applies to all models (ENV-configured and Admin-configured) |
| LLM_TOOL_CHOICE_ENABLED | No | true | Global toggle for forced tool_choice in structured output extraction (Level 1 — Native Function Calling). Set to false if your model returns errors with forced tool selection (e.g. thinking-mode models that reject tool_choice='specified'). When disabled, structured calls skip native FC and start from JSON Mode. Per-model override available in Settings → Models → Advanced |
| REASONING_LLM_MODEL | No | (falls back to LLM_MODEL) | Model name for the reasoning tier. Used for tasks requiring deep analysis (e.g. DAG planning, plan analysis) |
| REASONING_LLM_API_KEY | No | (falls back to LLM_API_KEY) | API key for the reasoning model provider |
| REASONING_LLM_BASE_URL | No | (falls back to LLM_BASE_URL) | Base URL for the reasoning model provider |
| REASONING_LLM_TEMPERATURE | No | (falls back to LLM_TEMPERATURE) | Sampling temperature for the reasoning model |
| REASONING_LLM_CONTEXT_SIZE | No | (falls back to LLM_CONTEXT_SIZE) | Context window size for the reasoning model |
| REASONING_LLM_MAX_OUTPUT_TOKENS | No | (falls back to LLM_MAX_OUTPUT_TOKENS) | Max output tokens per call for the reasoning model |
| REASONING_LLM_EFFORT | No | (falls back to LLM_REASONING_EFFORT) | Reasoning effort level for the reasoning model tier. Values: low, medium, high |
| REASONING_LLM_BUDGET | No | (falls back to LLM_REASONING_BUDGET_TOKENS) | Token budget for reasoning (primarily Anthropic). Overrides the auto-calculated budget for the reasoning tier |
| LLM_SUPPORTS_VISION | No | true (optimistic) | Controls whether ENV-mode document OCR (via MarkItDown + markitdown-ocr) is attempted. Only applies when no active model group is configured in Admin → Models (pure ENV mode). When the default true is in effect, convert_to_markdown and RAG ingestion assume LLM_MODEL supports vision and call it for image OCR — correct for all common choices (gpt-4o, claude-3-5-sonnet, gemini-1.5-pro/flash). Set to false when your ENV-configured LLM_MODEL does not support vision (e.g. deepseek-v3, qwen-chat, llama-3.1, gpt-3.5-turbo, o1-mini) to skip the failing vision call and go straight to text-only extraction. When an active model group exists in Admin → Models, this flag is ignored and the group’s supports_vision flags take over — the admin-curated choice is always the source of truth in DB mode |
Resolution order: User Preference → Admin Models (DB) → ENV Fallback. If an admin model with role “General” is configured in Admin → Models, these ENV vars serve as fallback only. The health check considers both sources.
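A minimal .env sketch for the common single-provider case — only LLM_API_KEY is strictly required; the other values shown are the documented defaults made explicit:

```shell
# Minimal LLM configuration — only the key is required
LLM_API_KEY=sk-...
# Optional overrides (documented defaults shown)
LLM_BASE_URL=https://api.openai.com/v1
LLM_MODEL=gpt-4o
LLM_TEMPERATURE=0.7
```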

MarkItDown OCR Resolution

The convert_to_markdown built-in tool and the RAG ingestion pipeline both use Microsoft’s MarkItDown + the official markitdown-ocr plugin to extract text from documents — including OCR on embedded images and scanned PDF pages when a vision-capable LLM is available. Vision LLM resolution order (first match wins):
| # | Source | Priority rationale |
|---|---|---|
| 1 | Agent’s primary LLM if supports_vision=True | Consistency: same API key, same billing bucket, same rate-limit pool as the conversation |
| 2 | Active ModelGroup → Fast Model if supports_vision=True | Fast models (gpt-4o-mini, claude-haiku, gemini-1.5-flash) are the ideal OCR workhorse — cheap, low-latency, usually multimodal |
| 3 | Active ModelGroup → General Model if supports_vision=True | Quality fallback when the primary is not in the group |
| 4 | ENV primary LLM (LLM_MODEL) | Optimistic fallback for pure ENV mode. Only taken when no active ModelGroup exists. Gated by LLM_SUPPORTS_VISION |
Reasoning models are never preferred for OCR. Reasoning tiers (o1, o3-mini, DeepSeek-R1) historically lack vision support and are the wrong tool for OCR anyway — OCR is a perception task, not deliberation. If a workspace has only a reasoning model with supports_vision=True it will still be picked up via the primary-LLM path, but the resolver does not actively rank it above fast/general.

Zero-regression fallback: when no vision-capable model is found at any level, OCR is silently disabled and MarkItDown runs in text-only mode. Word/PowerPoint/Excel embedded-image OCR becomes unavailable (same as before this feature shipped), but all other text extraction (headings, tables, paragraph text) continues to work unchanged. Adding this feature never makes extraction worse than the previous behavior.

Non-OpenAI providers (Anthropic, Google Gemini, etc.) are supported transparently: the resolved LLM is wrapped in a LiteLLMOpenAIShim that routes chat.completions.create(...) calls through litellm.completion(), which handles the provider-specific message format translation (e.g. Anthropic’s source.type="base64" image block). One shim covers every provider LiteLLM supports — adding a new provider costs zero code changes in FIM One.
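The shim pattern can be sketched generically. This is a hypothetical illustration of the idea, not FIM One’s actual LiteLLMOpenAIShim: any callable with a litellm.completion-style signature is exposed behind an OpenAI-style client.chat.completions.create(...) surface (fake_completion below is a stub standing in for the real backend):

```python
from types import SimpleNamespace

class CompletionShim:
    """Hypothetical sketch of the shim pattern: expose an OpenAI-style
    client.chat.completions.create(...) surface that forwards to any
    backend completion callable (litellm.completion in FIM One's case)."""

    def __init__(self, completion_fn):
        self.chat = SimpleNamespace(
            completions=SimpleNamespace(create=completion_fn))

def fake_completion(model, messages, **kwargs):
    # Stand-in backend: a real shim would call litellm.completion(...) here
    return {"model": model, "content": messages[-1]["content"].upper()}

client = CompletionShim(fake_completion)
resp = client.chat.completions.create(
    model="claude-3-5-sonnet",
    messages=[{"role": "user", "content": "ocr this page"}])
print(resp["content"])  # OCR THIS PAGE
```

Callers written against the OpenAI client API never see which provider is behind the shim — that is what makes the one-shim-per-all-providers claim possible.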

Extended Thinking (Reasoning)

When LLM_REASONING_EFFORT is set, FIM One enables the model’s extended thinking capability so the internal chain-of-thought is surfaced in the UI “thinking” step. FIM One uses LiteLLM to translate the reasoning effort parameter into each provider’s native format automatically.

Supported providers

| Provider | LLM_BASE_URL | How it works | Reasoning content returned? |
|---|---|---|---|
| OpenAI (o1 / o3 / o4-mini) | https://api.openai.com/v1 | reasoning_effort sent natively | Yes |
| Anthropic (Claude 3.7+) | https://api.anthropic.com/v1/ | LiteLLM routes via the native Anthropic API with the thinking parameter | Yes |
| Google Gemini (2.5+) | https://generativelanguage.googleapis.com/v1beta/openai/ | reasoning_effort sent on the compat endpoint | Yes |
LiteLLM auto-detects the provider from LLM_BASE_URL and maps it to the correct API format. Unknown URLs are treated as OpenAI-compatible.

Important caveats

Third-party proxies / custom endpoints are not guaranteed. If your LLM_BASE_URL points to a third-party API proxy (e.g., OpenRouter, one-api, custom gateway), LiteLLM will attempt to route correctly based on the URL. However, if your proxy expects a non-standard format, reasoning may not work as expected. Consult the proxy’s documentation for their expected parameter format.

Temperature constraints with reasoning

Some providers impose temperature restrictions when reasoning is active:
  • Anthropic: Requires temperature=1 when extended thinking is enabled. If using Anthropic with extended thinking, you must set LLM_TEMPERATURE=1 — Anthropic rejects other values when thinking is enabled.
  • OpenAI GPT-5.x: Only supports temperature=1 at all times. LiteLLM’s drop_params filtering handles this automatically — unsupported temperature values are silently dropped. No user action is needed for GPT-5.x.

How LLM_REASONING_BUDGET_TOKENS works

This variable is primarily meaningful for the Anthropic path. When set, it overrides the auto-calculated budget and is sent as budget_tokens in the thinking parameter via LiteLLM. When not set, the budget is derived from LLM_MAX_OUTPUT_TOKENS × effort ratio:

| LLM_REASONING_EFFORT | Budget ratio | Example (max_tokens = 64000) |
|---|---|---|
| low | 20% | 12,800 tokens |
| medium | 50% | 32,000 tokens |
| high | 80% | 51,200 tokens |
The minimum budget is 1,024 tokens (Anthropic’s hard minimum). For OpenAI and Gemini, the provider handles token allocation internally based on the reasoning_effort level — LLM_REASONING_BUDGET_TOKENS has no effect.

Agent Execution

ReAct Agent

| Variable | Required | Default | Description |
|---|---|---|---|
| REACT_MAX_ITERATIONS | No | 20 | Max tool-call iterations per ReAct request. Higher = more thorough but slower and costlier |
| REACT_MAX_TURN_TOKENS | No | 0 | Emergency circuit-breaker: max cumulative tokens (prompt + completion across all iterations) per single ReAct turn. Default 0 = unlimited. This is NOT for daily token control — use per-user token_quota for that. This is a last-resort safety valve for extreme scenarios like an agent stuck in an infinite tool-call loop. Hitting this limit aborts the task mid-execution, wasting all tokens consumed so far and returning an incomplete result. Keep at 0 unless you have a specific runaway-agent problem to contain |
| REACT_TOOL_SELECTION_THRESHOLD | No | 12 | When the total number of registered tools exceeds this threshold, a lightweight LLM call selects the most relevant subset before each request |
| REACT_TOOL_SELECTION_MAX | No | 6 | Max tools to keep after smart selection (only effective when tool count exceeds REACT_TOOL_SELECTION_THRESHOLD) |
| REACT_SELF_REFLECTION_INTERVAL | No | 6 | Inject a self-reflection prompt every N tool calls to help the agent course-correct and avoid loops |
| REACT_TOOL_OBS_TRUNCATION | No | 8000 | Max characters per tool observation when synthesizing the final answer. Higher values preserve more structured data (JSON, tables) at the cost of more tokens |
| REACT_TOOL_RESULT_BUDGET | No | 40000 | Aggregate token budget for all tool results in a single session. When total tool-result tokens exceed this cap, new results are truncated with a notice. Prevents context bloat from large API responses (e.g. 5 connector calls returning 8K each). Set to 0 to disable the cap |
| REACT_COMPLETION_CHECK_SKIP_CHARS | No | 800 | Skip the post-answer completion-check LLM call when the agent’s final answer exceeds this many characters. Long detailed answers don’t need a “did I miss anything?” verification round-trip. Set lower to skip more aggressively; set to a very large value to always run the check |
| REACT_CYCLE_DETECTION_THRESHOLD | No | 2 | When the same tool is called with identical arguments this many times in a row, a deterministic warning is injected telling the agent to try a different approach. Unlike self-reflection (which relies on the LLM noticing the loop), this is a hash-based check that cannot be bypassed. Also applies to DAG steps |
| REACT_COMPLETION_CHECK_MIN_TOOLS | No | 3 | Minimum number of tool calls before the completion checklist fires. Simple tasks (1-2 tool calls) skip verification to avoid unnecessary latency. Set to 1 for always-on verification. Also applies to DAG steps |
| REACT_TURN_PROFILE_ENABLED | No | true | Emit per-turn phase-level timing logs (memory_load, compact, tool_schema_build, llm_first_token, llm_total, tool_exec). One structured log line per turn. Set to false to disable profiling entirely (zero overhead) |
| LLM_RATE_LIMIT_PER_USER | No | true | Use per-user keyed rate-limit buckets instead of a single process-global bucket. Prevents one noisy user from starving all others on the same worker. The underlying rate is hardcoded at 60 requests/min and 100K tokens/min per bucket — this setting only controls whether the bucket is shared (global) or partitioned (per-user). Set to false to revert to the legacy global bucket (not recommended) |
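The hash-based cycle check behind REACT_CYCLE_DETECTION_THRESHOLD can be sketched as follows — a hypothetical illustration of the documented behavior (fingerprint identical tool+arguments calls, fire after N consecutive repeats), not the actual implementation:

```python
import hashlib
import json

def call_fingerprint(tool: str, args: dict) -> str:
    """Deterministic fingerprint of a tool call: same tool + same args
    always hash to the same value, regardless of dict key order."""
    payload = json.dumps({"tool": tool, "args": args}, sort_keys=True)
    return hashlib.sha256(payload.encode()).hexdigest()

def is_cycle(history: list[str], fp: str, threshold: int = 2) -> bool:
    # Fires when this call would be the `threshold`-th identical call in a row
    tail = history[-(threshold - 1):]
    return len(tail) == threshold - 1 and all(h == fp for h in tail)

history: list[str] = []
flags = []
for _ in range(2):
    fp = call_fingerprint("web_search", {"q": "fim one docs"})
    flags.append(is_cycle(history, fp))  # True -> inject the warning
    history.append(fp)
print(flags)  # [False, True]
```

Because the check is a pure hash comparison, the agent cannot talk its way past it the way it can ignore a self-reflection prompt.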

DAG Planner

| Variable | Required | Default | Description |
|---|---|---|---|
| MAX_CONCURRENCY | No | 5 | Max parallel steps in the DAG executor |
| DAG_STEP_MAX_ITERATIONS | No | 15 | Max tool-call iterations within each DAG step |
| DAG_STEP_TIMEOUT | No | 600 | Step execution timeout in seconds. Steps exceeding this are marked as failed and their dependents are cascade-skipped |
| DAG_MAX_REPLAN_ROUNDS | No | 3 | Max autonomous re-plan attempts when the goal is not achieved. User interrupts (inject) are unlimited and do not count against this budget |
| DAG_REPLAN_STOP_CONFIDENCE | No | 0.8 | Stop retrying when agent confidence that the goal is unachievable exceeds this threshold (0.0 = never stop early, 1.0 = stop on any failure) |
| DAG_VERIFY_TRUNCATION | No | 2000 | Max characters of step output sent to the step-verifier LLM for quality judgment |
| DAG_ANALYZER_TRUNCATION | No | 10000 | Max characters per step result when formatting for the post-execution analyzer |
| DAG_REPLAN_RECENT_TRUNCATION | No | 500 | Max characters per step result from the most recent round when building re-plan context |
| DAG_REPLAN_OLDER_TRUNCATION | No | 200 | Max characters per step result from older rounds when building re-plan context. Older rounds are truncated more aggressively to save context |
| DAG_TOOL_CACHE | No | true | Cache identical tool calls within a single DAG execution. Only tools explicitly marked as cacheable (read-only tools like search, knowledge retrieval) are cached. Set to false to disable caching entirely |
| DAG_STEP_VERIFICATION | No | false | Generic LLM-based quality check after each DAG step. On failure, the step retries once with feedback. Default off — adds latency on every step and is rarely needed; most step outputs are acceptable without re-checking. Use only when you observe frequent low-quality step results |
| DAG_CITATION_VERIFICATION | No | true | Citation-accuracy check for specialist-domain steps. Prerequisite: the query must first be classified as a specialist domain by the LLM domain classifier (see ESCALATION_DOMAINS). When the domain is detected AND this flag is true, each completed step is scanned for legal/medical/financial citations and verified for accuracy — catching hallucinated article numbers, fabricated case references, and incorrect regulatory citations. If domain classification returns null (general query), citation verification does not run regardless of this setting |
| DAG_CITATION_VERIFY_TRUNCATION | No | 6000 | Max characters of step result sent to the citation verification prompt |
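The DAG_TOOL_CACHE behavior — identical calls to cacheable (read-only) tools served from an in-execution cache — can be sketched like this. An assumed illustration, not the real executor; the counter just proves the second call never reaches the tool:

```python
import json

cache: dict = {}
calls = {"count": 0}

def execute(name, args):
    calls["count"] += 1  # stand-in for the real tool runner
    return f"result:{name}:{args['q']}"

def run_tool(name, args, cacheable):
    """Identical (tool, args) pairs hit the cache when the tool is
    marked cacheable; non-cacheable tools always re-execute."""
    key = (name, json.dumps(args, sort_keys=True))
    if cacheable and key in cache:
        return cache[key]  # cache hit: no re-execution
    result = execute(name, args)
    if cacheable:
        cache[key] = result
    return result

run_tool("search", {"q": "alpha"}, cacheable=True)
run_tool("search", {"q": "alpha"}, cacheable=True)  # served from cache
print(calls["count"])  # 1
```

This is why only read-only tools are safe to cache: a write-effect tool served from cache would silently skip its side effect.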

Domain Classification

Controls the independent LLM-based domain detection layer that runs before both ReAct and DAG execution. When a query is classified as a specialist domain, the system activates domain-aware features: model escalation to reasoning model, domain-specific SOP instructions, and citation verification (DAG only).
| Variable | Required | Default | Description |
|---|---|---|---|
| ESCALATION_DOMAINS | No | legal,medical,financial,tax,compliance,patent | Comma-separated list of specialist domains. A fast LLM classifies each query against this list. When matched, the system: (1) upgrades to the reasoning model for higher accuracy, (2) injects domain-specific SOP instructions (e.g. verify citations via search before writing), (3) enables citation verification for DAG steps. Add custom domains as needed (e.g. legal,education,construction) |
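A sketch of how the gate might compose (the list parsing and the AND-condition are assumptions for illustration; the actual classification is an LLM call, not shown):

```python
import os

# For demonstration only — normally this comes from .env
os.environ["ESCALATION_DOMAINS"] = "legal,medical,financial,tax,compliance,patent"

def escalation_domains() -> list[str]:
    """Parse the comma-separated domain allowlist (assumed parsing)."""
    raw = os.environ.get("ESCALATION_DOMAINS", "")
    return [d.strip() for d in raw.split(",") if d.strip()]

def should_verify_citations(classified_domain, flag: bool = True) -> bool:
    # Citation verification fires only when the LLM classifier matched a
    # listed specialist domain AND DAG_CITATION_VERIFICATION is enabled
    return flag and classified_domain in escalation_domains()

print(should_verify_citations("legal"))  # True
print(should_verify_citations(None))     # False (general query)
```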

Context Guard

Controls the automatic context window management that prevents conversations from exceeding the model’s limit.
| Variable | Required | Default | Description |
|---|---|---|---|
| CONTEXT_GUARD_DEFAULT_BUDGET | No | 32000 | Default token budget for context window management. When the conversation exceeds this, older messages are compacted |
| CONTEXT_GUARD_MAX_MSG_CHARS | No | 50000 | Hard character limit on any single message. Messages exceeding this are truncated as a safety net |
| CONTEXT_GUARD_KEEP_RECENT | No | 4 | Number of most recent messages to preserve when compacting conversation history |
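The compaction policy can be sketched as follows — assumed logic, not the real ContextGuard, which uses a proper token estimator (chars/4 here is a crude stand-in):

```python
def compact(messages, keep_recent=4, budget_tokens=32000):
    """When the estimated token total exceeds the budget, keep the
    keep_recent newest messages and collapse everything older into
    one placeholder (a real implementation would summarize instead)."""
    est = lambda m: len(m) // 4  # crude token estimate
    if sum(est(m) for m in messages) <= budget_tokens:
        return messages
    older, recent = messages[:-keep_recent], messages[-keep_recent:]
    return [f"[compacted {len(older)} earlier messages]"] + recent

out = compact(["x" * 200000] + [f"msg{i}" for i in range(5)])
print(len(out))  # 5
print(out[0])    # [compacted 2 earlier messages]
```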

Agent Workspace

| Variable | Required | Default | Description |
|---|---|---|---|
| WORKSPACE_OFFLOAD_THRESHOLD | No | 8000 | When a tool output exceeds this many characters, it is saved to a workspace file and a truncated preview is injected into the conversation context |
| WORKSPACE_PREVIEW_CHARS | No | 2000 | Number of preview characters to include in truncated workspace references |
| WORKSPACE_CLEANUP_MAX_HOURS | No | 72 | Workspace files older than this many hours are eligible for automatic cleanup |
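The offload rule can be sketched as (assumed logic; the file path below is a hypothetical placeholder, not the real naming scheme):

```python
def maybe_offload(output: str, threshold: int = 8000,
                  preview_chars: int = 2000) -> str:
    """Oversized tool output is written to a workspace file and only a
    preview plus a reference enters the conversation context."""
    if len(output) <= threshold:
        return output  # small enough: pass through unchanged
    path = "workspace/tool_output.txt"  # hypothetical placeholder path
    # (a real implementation would write `output` to `path` here)
    return output[:preview_chars] + f"\n[truncated; full output saved to {path}]"

print(len(maybe_offload("a" * 10000).splitlines()[0]))  # 2000
```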

System

SYSTEM_PROMPT_RESERVE has been removed. It previously subtracted a fixed 4K reserve from the context budget for system prompts, which caused double-counting because ContextGuard already includes the system prompt when estimating message-list tokens. The budget formula is now simply context_size - max_output_tokens, and the system prompt’s actual size is accounted for dynamically.

Web Tools (Optional)

| Variable | Required | Default | Description |
|---|---|---|---|
| JINA_API_KEY | No | | Jina API key. Acts as a shared fallback for search, fetch, embedding, and reranker when no service-specific key is set. Get yours at jina.ai |
| TAVILY_API_KEY | No | | Tavily Search API key (auto-selected if set and WEB_SEARCH_PROVIDER is unset) |
| BRAVE_API_KEY | No | | Brave Search API key (auto-selected if set and WEB_SEARCH_PROVIDER is unset) |
| EXA_API_KEY | No | | Exa Search API key (auto-selected if set and WEB_SEARCH_PROVIDER is unset). Get yours at exa.ai |
| WEB_SEARCH_PROVIDER | No | jina | Search provider selector: jina / tavily / brave / exa |
| WEB_FETCH_PROVIDER | No | jina (if key set, else httpx) | Fetch provider: jina (uses the Jina Reader API) / httpx (direct HTTP request, no API key needed) |
Quick start tip: Setting just JINA_API_KEY enables web search, web fetch, embedding, and reranking all at once — one key, four services. You can override each service individually with the variables below.

Embedding

Embedding converts text into vectors for knowledge base search. FIM One uses the standard OpenAI-compatible /v1/embeddings endpoint, so it works with any provider that exposes this interface — not just Jina.
| Variable | Required | Default | Description |
|---|---|---|---|
| EMBEDDING_API_KEY | No | (falls back to JINA_API_KEY) | API key for the embedding provider |
| EMBEDDING_BASE_URL | No | https://api.jina.ai/v1 | Base URL for the embedding provider |
| EMBEDDING_MODEL | No | jina-embeddings-v3 | Model identifier |
| EMBEDDING_DIMENSION | No | 1024 | Vector dimension |
Provider examples — just set the three variables to switch:
| Provider | EMBEDDING_BASE_URL | EMBEDDING_MODEL | EMBEDDING_DIMENSION |
|---|---|---|---|
| Jina (default) | https://api.jina.ai/v1 | jina-embeddings-v3 | 1024 |
| OpenAI | https://api.openai.com/v1 | text-embedding-3-small | 1536 |
| Voyage | https://api.voyageai.com/v1 | voyage-3 | 1024 |
| Ollama (local) | http://localhost:11434/v1 | nomic-embed-text | 768 |
Changing the embedding model or dimension invalidates all existing knowledge base vectors. Old vectors were computed in a different embedding space — retrieval accuracy will degrade silently. You must rebuild all knowledge base indexes after switching.
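For example, switching from the Jina default to OpenAI embeddings is a sketch like the following (values taken from the provider table above; the key is a placeholder) — and it triggers the rebuild requirement just described:

```shell
# Switch embeddings to OpenAI — rebuild ALL knowledge base indexes after this
EMBEDDING_API_KEY=sk-...
EMBEDDING_BASE_URL=https://api.openai.com/v1
EMBEDDING_MODEL=text-embedding-3-small
EMBEDDING_DIMENSION=1536
```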

Retrieval

| Variable | Required | Default | Description |
|---|---|---|---|
| RETRIEVAL_MODE | No | grounding | grounding (full pipeline with citations and confidence scoring) or simple (basic RAG) |

Reranker

Reranker re-scores retrieved documents to improve relevance. Three providers are supported — select via RERANKER_PROVIDER or let the system auto-detect from available API keys.
| Variable | Required | Default | Description |
|---|---|---|---|
| RERANKER_PROVIDER | No | (auto-detect) | jina / cohere / openai. If unset: uses Cohere if COHERE_API_KEY is set, otherwise Jina |
| RERANKER_MODEL | No | jina-reranker-v2-base-multilingual | Model identifier (applies to the Jina and OpenAI providers) |
| COHERE_API_KEY | No | | Cohere API key (auto-selects the Cohere reranker when set and RERANKER_PROVIDER is unset) |
| COHERE_RERANKER_MODEL | No | rerank-multilingual-v3.0 | Cohere-specific reranker model |
Jina uses JINA_API_KEY (from Web Tools above). OpenAI reuses LLM_API_KEY / LLM_BASE_URL — no extra key needed. Cohere requires its own COHERE_API_KEY.
Reranker is optional — knowledge base search works without it using fusion scoring. Embedding is recommended for knowledge base features.

Vector Store

| Variable | Required | Default | Description |
|---|---|---|---|
| VECTOR_STORE_DIR | No | ./data/vector_store | Directory for LanceDB vector store data (file-based, zero external services) |

Code Execution

| Variable | Required | Default | Description |
|---|---|---|---|
| CODE_EXEC_BACKEND | No | local | local (direct host execution) or docker (isolated containers) |
| DOCKER_PYTHON_IMAGE | No | python:3.11-slim | Docker image for Python execution |
| DOCKER_NODE_IMAGE | No | node:20-slim | Docker image for Node.js execution |
| DOCKER_SHELL_IMAGE | No | python:3.11-slim | Docker image for shell execution |
| DOCKER_MEMORY | No | (Docker default) | RAM cap per container (e.g. 256m, 512m, 1g) |
| DOCKER_CPUS | No | (Docker default) | CPU quota per container (e.g. 0.5, 1.0) |
| SANDBOX_TIMEOUT | No | 120 | Default execution timeout in seconds |
| DOCKER_HOST_DATA_DIR | No | (not set) | Host-side absolute path of the ./data volume mount. Required for DooD (Docker-outside-of-Docker) deployments; docker-compose.yml auto-sets it via ${PWD}/data |
Security: local mode runs AI-generated code directly on the host. For internet-facing or multi-user deployments, always set CODE_EXEC_BACKEND=docker.
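A hedged .env sketch applying that advice (the variable names come from the table above; the memory/CPU numbers are illustrative, not recommendations):

```shell
# Harden code execution for a multi-user or internet-facing deployment
CODE_EXEC_BACKEND=docker
DOCKER_MEMORY=512m
DOCKER_CPUS=1.0
SANDBOX_TIMEOUT=120
```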

Tool Artifacts

Size limits for files produced by tool execution (code execution, template rendering, image generation).
| Variable | Required | Default | Description |
|---|---|---|---|
| MAX_ARTIFACT_SIZE | No | 10485760 (10 MB) | Max single artifact file size in bytes |
| MAX_ARTIFACTS_TOTAL | No | 52428800 (50 MB) | Max total artifact size per session in bytes |

Document Processing (Optional)

Controls how uploaded PDF/DOCX files are processed for LLM consumption. Vision-capable models (GPT-4o, Claude 3/4, Gemini) can receive PDF pages as rendered images for higher fidelity.
| Variable | Required | Default | Description |
|---|---|---|---|
| DOCUMENT_PROCESSING_MODE | No | auto | auto (vision if the model supports it), vision (always render pages), text (always extract text only) |
| DOCUMENT_VISION_DPI | No | 150 | DPI for PDF page rendering. Higher = better quality, more tokens |
| DOCUMENT_VISION_MAX_PAGES | No | 20 | Maximum pages to render as images per PDF |
Note: Per-model vision support is configured via the supports_vision toggle in Admin → Models. When not explicitly set, the system auto-detects vision capability from the model name.

Image Generation (Optional)

| Variable | Required | Default | Description |
|---|---|---|---|
| IMAGE_GEN_PROVIDER | No | google | google (Gemini native API) or openai (OpenAI-compatible /v1/images/generations) |
| IMAGE_GEN_API_KEY | No | | Google AI Studio key (google) or proxy/OpenAI API key (openai) |
| IMAGE_GEN_MODEL | No | gemini-3.1-flash-image-preview | Image generation model (e.g. dall-e-3, gemini-nano-banana-2) |
| IMAGE_GEN_BASE_URL | No | (per provider) | Google: https://generativelanguage.googleapis.com/v1beta; OpenAI: https://api.openai.com/v1 |

Email (Optional)

Auto-registers the email_send built-in tool when SMTP_HOST, SMTP_USER, and SMTP_PASS are all set.
| Variable | Required | Default | Description |
|---|---|---|---|
| SMTP_HOST | Cond. | | SMTP server hostname |
| SMTP_PORT | No | 465 | SMTP port |
| SMTP_SSL | No | ssl | TLS mode: ssl (port 465) / tls (STARTTLS, port 587) / "" (plain) |
| SMTP_USER | Cond. | | SMTP login username |
| SMTP_PASS | Cond. | | SMTP login password |
| SMTP_FROM | No | (uses SMTP_USER) | Sender address shown in the From header |
| SMTP_FROM_NAME | No | | Display name shown in the From header |
| SMTP_REPLY_TO | No | | Reply-To address; replies go here instead of SMTP_FROM |
| SMTP_ALLOWED_DOMAINS | No | | Comma-separated domain allowlist (e.g. example.com,corp.io); blocks recipients outside listed domains |
| SMTP_ALLOWED_ADDRESSES | No | | Comma-separated exact-address allowlist; combined with SMTP_ALLOWED_DOMAINS. Leave both unset to allow any recipient (not recommended for shared mailboxes) |
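The allowlist semantics can be sketched as follows (assumed logic: the two lists combine with OR, and both empty means allow-all):

```python
def recipient_allowed(addr: str,
                      allowed_domains: tuple = (),
                      allowed_addresses: tuple = ()) -> bool:
    """A recipient passes if it matches either allowlist; when both
    lists are empty, any recipient is allowed (the documented default)."""
    if not allowed_domains and not allowed_addresses:
        return True
    domain = addr.rsplit("@", 1)[-1].lower()
    return (addr.lower() in {a.lower() for a in allowed_addresses}
            or domain in {d.lower() for d in allowed_domains})

print(recipient_allowed("bob@example.com", allowed_domains=("example.com",)))  # True
print(recipient_allowed("eve@evil.io", allowed_domains=("example.com",)))      # False
```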

Connectors

| Variable | Required | Default | Description |
|---|---|---|---|
| CONNECTOR_RESPONSE_MAX_CHARS | No | 50000 | Max characters for non-array JSON / plain-text connector responses |
| CONNECTOR_RESPONSE_MAX_ITEMS | No | 10 | Max array items to keep when a connector response is a JSON array |
| CREDENTIAL_ENCRYPTION_KEY | No | (unset) | Fernet encryption key for connector credential blobs. When set, auth tokens stored in connector_credentials are encrypted at rest. If unset, credentials are stored as plaintext JSON (backward-compatible). Changing this key invalidates all existing encrypted credentials |
| CONNECTOR_TOOL_MODE | No | progressive | How connector tools are exposed to agents. progressive: single ConnectorMetaTool with discover/execute subcommands (~30 tokens/connector). classic: one tool per action (legacy, ~250 tokens/action) |
| DATABASE_TOOL_MODE | No | progressive | How database connector tools are exposed to agents. progressive: single DatabaseMetaTool with list_tables/discover/query subcommands. legacy: one tool per action per database connector (3 tools each) |
| MCP_TOOL_MODE | No | progressive | How MCP server tools are exposed to agents. progressive: single MCPServerMetaTool with discover/call subcommands. legacy: one tool per MCP server action (original individual tools) |

Platform

| Variable | Required | Default | Description |
|---|---|---|---|
| DATABASE_URL | No | sqlite+aiosqlite:///./data/fim_one.db | Database connection string. SQLite (zero-config): sqlite+aiosqlite:///./data/fim_one.db. PostgreSQL (production): postgresql+asyncpg://user:pass@localhost:5432/fim_one. Docker Compose auto-sets PostgreSQL |
| JWT_SECRET_KEY | No | CHANGE_ME | Secret key for JWT token signing. The placeholder value CHANGE_ME (or any legacy default) triggers auto-generation of a secure 256-bit random key on first start, which is written back to .env. Set explicitly in production to keep tokens valid across restarts and replicas |
| CORS_ORIGINS | No | | Comma-separated list of extra allowed CORS origins beyond the default localhost entries. Required when the frontend runs on a non-localhost domain (e.g. https://app.example.com) |
| UPLOADS_DIR | No | ./uploads | Directory for uploaded files |
| MAX_UPLOAD_SIZE_MB | No | 50 | Max file upload size in megabytes (backend enforcement) |
| NEXT_PUBLIC_MAX_UPLOAD_SIZE_MB | No | 50 | Max file upload size shown in the frontend UI. Build-time variable — must match MAX_UPLOAD_SIZE_MB |
| MCP_SERVERS | No | | JSON array of MCP server configs (requires uv sync --extra mcp) |
| ALLOW_STDIO_MCP | No | false | Allow stdio MCP servers. Set true only for trusted local deployments |
| ALLOWED_STDIO_COMMANDS | No | npx,uvx,node,python,python3,deno,bun | Comma-separated list of allowed base commands for stdio MCP servers. Only effective when ALLOW_STDIO_MCP=true |
| LOG_LEVEL | No | INFO | Logging level: DEBUG / INFO / WARNING / ERROR / CRITICAL |
| REDIS_URL | No | | Redis connection URL for cross-worker interrupt relay. Required when WORKERS>1 — without it, mid-stream interrupt/inject requests may hit a different worker and silently fail. Auto-configured by Docker Compose |
| WORKERS | No | 1 | Uvicorn worker processes. 1 is safe and needs no external services. For production multi-worker, use PostgreSQL (SQLite is single-writer). SQLite works for local dev under light load. Auth, OAuth, and file operations are fully multi-worker safe (JWT-based). Docker Compose auto-configures both PostgreSQL and Redis |
Multi-worker checklist (WORKERS>1):
  • Stop (abort streaming) — always works, no extra config needed (signal travels on the same TCP connection).
  • Inject (mid-stream follow-up) — requires REDIS_URL. Without Redis, the inject request may land on a different worker that has no knowledge of the running execution and silently fail.
  • Production: use PostgreSQL (DATABASE_URL). SQLite’s single-writer lock can cause contention under concurrent writes.
  • Local dev: SQLite + multi-worker is fine for light usage; just add REDIS_URL if you use the inject feature.
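Putting the checklist together, a production multi-worker .env sketch might look like this (hostnames and credentials are placeholders; the redis:// URL format is the conventional one and assumed here):

```shell
# Production multi-worker setup: PostgreSQL + Redis
WORKERS=4
DATABASE_URL=postgresql+asyncpg://user:pass@db.internal:5432/fim_one
REDIS_URL=redis://redis.internal:6379/0
```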

Workflow Run Retention

Background cleanup task that automatically purges old workflow runs. Per-workflow overrides (configured in the workflow settings UI) take priority over these global defaults.
| Variable | Required | Default | Description |
|---|---|---|---|
| WORKFLOW_RUN_MAX_AGE_DAYS | No | 30 | Delete workflow runs older than this many days |
| WORKFLOW_RUN_MAX_PER_WORKFLOW | No | 100 | Keep at most this many runs per workflow (oldest deleted first) |
| WORKFLOW_RUN_CLEANUP_INTERVAL_HOURS | No | 24 | How often the background cleanup task runs, in hours |
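The retention policy combines the two limits; a sketch of the selection logic (assumed; the real task runs against the database, not in-memory lists):

```python
from datetime import datetime, timedelta, timezone

def runs_to_delete(runs, max_age_days=30, max_per_workflow=100, now=None):
    """A run is purged when it is older than max_age_days OR falls outside
    the newest max_per_workflow runs. `runs` is a list of
    (run_id, created_at) tuples for one workflow, in any order."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=max_age_days)
    newest_first = sorted(runs, key=lambda r: r[1], reverse=True)
    doomed = {rid for rid, ts in newest_first if ts < cutoff}       # too old
    doomed |= {rid for rid, _ in newest_first[max_per_workflow:]}   # over the cap
    return doomed

now = datetime(2025, 1, 31, tzinfo=timezone.utc)
runs = [("a", now - timedelta(days=40)),
        ("b", now - timedelta(days=1)),
        ("c", now - timedelta(days=2))]
print(sorted(runs_to_delete(runs, max_per_workflow=2, now=now)))  # ['a']
```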

Channel Confirmation Request Expiry

Background sweeper that marks stale pending approval requests (produced by channel hooks like FeishuGateHook or the Approval Playground) as expired. Ensures a click days later on a forgotten card doesn’t flip agent state that has already been torn down.
| Variable | Required | Default | Description |
|---|---|---|---|
| CHANNEL_CONFIRMATION_TTL_MINUTES | No | 1440 | Pending confirmations older than this are auto-expired (default: 24 hours) |
| CHANNEL_CONFIRMATION_SWEEP_INTERVAL_SECONDS | No | 600 | How often the expiry sweeper runs (default: every 10 minutes) |

OAuth (Optional)

When both CLIENT_ID and CLIENT_SECRET are set for a provider, the login page automatically shows the corresponding OAuth button.
| Variable | Required | Default | Description |
|---|---|---|---|
| GITHUB_CLIENT_ID | No | | GitHub OAuth App client ID. Create at github.com/settings/developers → OAuth Apps |
| GITHUB_CLIENT_SECRET | No | | GitHub OAuth App client secret |
| GOOGLE_CLIENT_ID | No | | Google OAuth client ID. Create at console.cloud.google.com/apis/credentials |
| GOOGLE_CLIENT_SECRET | No | | Google OAuth client secret |
| DISCORD_CLIENT_ID | No | | Discord OAuth2 client ID. Create at discord.com/developers |
| DISCORD_CLIENT_SECRET | No | | Discord OAuth2 client secret |
| FEISHU_APP_ID | No | | Feishu (Lark) App ID. Create at open.feishu.cn. Requires the contact:user.email:readonly permission |
| FEISHU_APP_SECRET | No | | Feishu (Lark) App Secret |
| FRONTEND_URL | Prod | http://localhost:3000 | Where the browser lands after OAuth completes. Must be set in production (e.g. https://yourdomain.com) |
| API_BASE_URL | Prod | http://localhost:8000 | Externally reachable backend URL, used to build OAuth callback URLs. Must be set in production |
| NEXT_PUBLIC_API_URL | Prod | (auto-detected as <hostname>:8000) | Browser-side API base URL for OAuth redirects. This is a frontend build-time variable — set it in frontend/.env.local for local dev, or pass it as a Docker build arg for custom production deployments. Auto-detection works for standard reverse-proxy setups (port 80/443) |
Prod = optional locally (defaults work), but required for any internet-facing deployment.

OAuth Callback URLs to register with each provider

The backend constructs callback URLs as: {API_BASE_URL}/api/auth/oauth/{provider}/callback
| Provider | Callback URL to register |
|---|---|
| GitHub | https://yourdomain.com/api/auth/oauth/github/callback |
| Google | https://yourdomain.com/api/auth/oauth/google/callback |
| Discord | https://yourdomain.com/api/auth/oauth/discord/callback |
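The documented construction rule, as a one-liner (the trailing-slash strip is a defensive assumption, not documented behavior):

```python
def oauth_callback_url(api_base_url: str, provider: str) -> str:
    # Mirrors the documented pattern:
    # {API_BASE_URL}/api/auth/oauth/{provider}/callback
    return f"{api_base_url.rstrip('/')}/api/auth/oauth/{provider}/callback"

print(oauth_callback_url("https://yourdomain.com", "github"))
# https://yourdomain.com/api/auth/oauth/github/callback
```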

Cloudflare Tunnel (Optional)

Route all traffic through Cloudflare’s network instead of exposing ports directly. Eliminates the need for Nginx, SSL certificates, and open firewall rules. See the Production Deployment section for setup instructions.
Mainland China users: Cloudflare Free/Pro/Business plans have no PoPs in mainland China. Traffic is routed to overseas edges, causing frequent 502 errors. Do not use this if your primary users are in mainland China unless you have Cloudflare Enterprise with China Network.
| Variable | Required | Default | Description |
|---|---|---|---|
| CLOUDFLARE_TUNNEL_TOKEN | Yes (if using Tunnel) | | Token from Cloudflare Zero Trust → Networks → Tunnels → your tunnel → Configure. Starts with eyJ.... Required by the cloudflared sidecar in docker-compose.tunnel.yml |

Analytics (Optional)

All analytics providers are optional. Set any combination — all active providers load simultaneously. Leave all blank to disable analytics entirely (recommended for local dev).
| Variable | Required | Default | Description |
|---|---|---|---|
| NEXT_PUBLIC_GA_MEASUREMENT_ID | No | | Google Analytics 4 measurement ID (e.g. G-XXXXXXXXXX). Get yours at analytics.google.com |
| NEXT_PUBLIC_UMAMI_SCRIPT_URL | No | | Umami analytics script URL (e.g. https://your-umami.com/script.js). Self-hosted, privacy-friendly alternative — umami.is |
| NEXT_PUBLIC_UMAMI_WEBSITE_ID | No | | Umami website ID. Required when NEXT_PUBLIC_UMAMI_SCRIPT_URL is set |
| NEXT_PUBLIC_PLAUSIBLE_DOMAIN | No | | Plausible analytics domain (e.g. yourdomain.com). Lightweight, privacy-friendly — plausible.io |
| NEXT_PUBLIC_PLAUSIBLE_SCRIPT_URL | No | https://plausible.io/js/script.js | Custom Plausible script URL for self-hosted instances |
All NEXT_PUBLIC_* analytics variables are build-time — changes require a frontend rebuild to take effect.