## [Unreleased]

### Added

- Non-English docs now show the full API Reference (OpenAPI endpoints) and Channels configuration pages. Previously only the English site rendered these sections — the zh/ja/ko/de/fr nav was missing the Endpoints group and the Configuration > Channels sub-group, so readers on those locales couldn’t reach the auto-generated API playground or the Feishu channel setup guide. Fixed as a side effect of migrating `docs/docs.json` to a single English source of truth (`docs/nav.template.json` + `scripts/docs-nav-glossary.json`) that regenerates all six locales deterministically; manual per-locale sync is no longer required when adding new doc pages.
- Admin users: per-user unlimited quota + Registered column. Leaving the quota field empty now inherits the global limit; setting it to `0` grants the user unlimited usage. Previously both collapsed into the same state, so granting an individual unlimited access required editing the DB directly. The users table also gains a Registered column for onboarding audits.
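A minimal sketch of the quota semantics above (empty field inherits the global limit, `0` means unlimited) — names and the global default are illustrative, not FIM One’s actual implementation:

```python
from typing import Optional

GLOBAL_QUOTA = 100_000  # hypothetical org-wide default (tokens/day)

def effective_quota(user_quota: Optional[int]) -> Optional[int]:
    """Resolve a user's daily quota.

    None  -> quota field left empty: inherit the global limit.
    0     -> explicit unlimited: no cap at all.
    n > 0 -> per-user override.
    Returns None for "unlimited" so callers can skip the check entirely.
    """
    if user_quota is None:
        return GLOBAL_QUOTA
    if user_quota == 0:
        return None  # unlimited
    return user_quota
```

Keeping the two states distinct (`None` vs `0`) is what removes the need to edit the DB directly for unlimited grants.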
### Fixed

- Agent settings: bound resources no longer flicker as “(已删除)” (“deleted”) on entry. The KB / connector / MCP-server pickers used to derive the orphan badge synchronously from selected − available, so any agent whose inventory list fetch was slower than the agent fetch would briefly mark every linked resource as deleted before snapping back to the correct state — most noticeable for org agents with multiple connectors. The orphan badge now waits for each inventory fetch to settle before rendering.
- Toasts now follow the light/dark theme. The root toaster was hardcoded to dark regardless of user preference.
- Usage by-agent breakdown no longer duplicates “Direct Chat” rows. Conversations tied to deleted agents used to produce one row per orphaned `agent_id`; they now collapse into a single row.
- Self-hosted deploys no longer fail intermittently with `container name already in use`. A new idempotent `./deploy.sh` wrapper sweeps hash-prefixed zombie containers and DooD sandbox children before `docker compose up`, so repeat deploys stop colliding on the sandbox service name.
- No more redundant `chore(i18n): sync translations` commits after pushes. The pre-commit translation hook updated `.translation-cache.json` on disk but forgot to stage it, so CI would see stale cache entries, re-translate the same sections, and auto-commit a slightly different output due to LLM nondeterminism. The hook now stages the cache alongside the translated files — CI detects no diff and exits silently as intended.
- Agent chat no longer crashes when a confirmation-required tool runs in an unbound-agent session. Pure model conversations that invoked a built-in tool marked `requires_confirmation=true` hit a hard 500 (or a red “no agent_id” gate error in the UI) because the per-agent approval router had no configuration to consult. The gate now bows out gracefully when there is no agent to route for, and the chat stream no longer trips on a missing `agent_cfg`.
- Retry no longer fails on strict-alternation providers when history contains orphan user messages. Stopping and retrying a turn leaves a “stopped” user message in conversation history; replaying that history to Claude (which enforces alternating user/assistant turns) previously returned HTTP 400 and silently dropped earlier messages. Consecutive same-role messages are now collapsed into a single turn before dispatch, so every orphan is carried into the next call as context — retry works even if you stop, add “I wasn’t done — continue with this…”, and retry again.
- Playground retry no longer briefly shows the query twice. The retry flow simultaneously replayed the stopped turn from history and rendered a fresh user bubble for the in-flight turn; during the pre-stream window both were visible. The in-flight bubble is now suppressed whenever history already renders the same content.
- ReAct agent no longer retries paraphrased tool calls after an operator rejects an approval request — it acknowledges the rejection and stops.
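The same-role collapsing fix above can be sketched as follows — function name and message shape are illustrative, not the project’s actual code:

```python
def collapse_same_role(messages: list[dict]) -> list[dict]:
    """Merge consecutive messages with the same role into one turn.

    Strict-alternation providers (e.g. Anthropic's Messages API) reject
    histories with back-to-back user messages, which occur when a turn is
    stopped and retried. Contents are joined so no message is dropped.
    """
    out: list[dict] = []
    for msg in messages:
        if out and out[-1]["role"] == msg["role"]:
            out[-1] = {
                "role": msg["role"],
                "content": out[-1]["content"] + "\n\n" + msg["content"],
            }
        else:
            out.append(dict(msg))
    return out
```

Because orphan content is merged rather than filtered out, a stopped turn plus a follow-up correction both reach the model as one user turn.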
### Changed

- Translation authority shifted from locale files to glossary: translation rules now live in `scripts/translation-glossary.md` — a single source of truth loaded into every LLM translation call (JSON, MDX, README). The pre-commit hook now unconditionally refuses manual edits to generated locale files (removing the prior `ALLOW_LOCALE_EDIT=1` override), because scattered per-locale edits silently drift and get overwritten on full retranslation. To fix a mistranslation, edit the glossary (a permanent rule that applies to all five locales) and regenerate affected files with `--force`. Inline glossary rules previously embedded in three system prompts have been consolidated into the external glossary file.
## [v0.8.5] - 2026-04-23

### Added

- Contributor-friendly i18n workflow: contributors no longer need to configure `LLM_API_KEY` to submit PRs that touch English source files. If the local pre-commit translation step is skipped (no key), a new `.github/workflows/i18n-sync.yml` workflow translates EN → ZH/JA/KO/DE/FR on `master` after the PR is merged and auto-commits the result. The pre-commit hook also now refuses manual edits to generated locale files (with an `ALLOW_LOCALE_EDIT=1` override for legitimate translation fixes), preventing silent drift between EN sources and translated outputs.
- i18n CI fallback end-to-end verified: a smoke-test push (EN-only, local translate hook skipped) confirmed the GitHub Actions workflow detects the changed source, translates into all five locales, and auto-commits the result to master with `[skip ci]` to prevent recursion.
- Exa integration docs page: dedicated Integrations section in the sidebar with a first-class Exa page covering the full Exa search surface (neural / fast / deep-reasoning / instant), filtering, content retrieval, and three tuned presets for news monitoring, research paper retrieval, and deep-reasoning agents. Serves as the partner-facing landing page for the Exa integration directory.
- Xinchuang (信创) database support: the Database Connector now lists KingbaseES (人大金仓), HighGo (瀚高), and DM8 (达梦) alongside PostgreSQL/MySQL. KingbaseES and HighGo are PG-compatible and reuse `asyncpg` with no extra dependencies; DM8 uses the official `dmPython` vendor wheel. A standalone `scripts/test_xinchuang_dbs.py` lets operators verify live connectivity from the CLI.
- Feishu Channel + confirmation gate via IM: a new `Channel` resource type (org-scoped, credentials encrypted at rest) lets orgs connect a Feishu app for outbound messaging. Tools flagged `requires_confirmation=True` now send an Approve/Reject interactive card to the configured Feishu group instead of only showing the confirmation dialog in the portal — any authorized member of the group can approve or reject directly from Feishu. Covers Settings → Channels management UI (list, create/edit with dirty-state protection, details with copyable callback URL, test-send), CRUD API (`/api/channels`), and the Feishu event callback endpoint (`/api/channels/{id}/callback`) with signature verification and URL challenge support. First step of the v0.9 IM Channel Integration roadmap item, shipping ahead of schedule for the 2026-04-24 roadshow.
- Agent Hook System (skeleton): a new `PreToolUseHook`/`PostToolUseHook` abstraction in `src/fim_one/core/hooks/` lets deterministic logic run outside the LLM loop — the FeishuGateHook is the first concrete implementation, attached to the confirmation-gate flow. Full hook lifecycle + user-defined YAML hooks remain v0.9 scope.
- Hook System is now live in the ReAct and DAG runtime: agents that declare `hooks.class_hooks` in their `model_config_json` have those hooks instantiated and registered on every chat session. The first consumer — `FeishuGateHook` — fires automatically when an agent calls a tool whose connector action is flagged `requires_confirmation=True`, posts an Approve/Reject card to the org’s Feishu group, blocks the tool, and resumes or aborts based on the verdict. Previously the hook abstraction was in place but nothing in the web layer wired it to live chat.
- Channels and the Hook System architecture documented: a new `docs/architecture/hook-system.mdx` explains the three hook points, why hooks run outside the LLM loop, and walks through FeishuGateHook end-to-end. Existing architecture pages (system-overview, organization, react-engine, philosophy) cross-link to it. README now lists Messaging Channels as a first-class v0.8 capability and the Application Layer diagram includes IM targets alongside Portal/API/iframe.
- Hook Approval Playground: the Channels details sheet now has a “Test Approval Flow” action that simulates a sensitive tool call, pushes a real confirmation card to the linked Feishu group, and polls for the reviewer’s decision live. Unlike the existing preview button, this exercises the full production path (genuine `ConfirmationRequest` row, real Feishu callback, status transitions), so demos and pre-rollout rehearsals use the same code path a production hook would.
- Per-agent task completion notifications: agents can now push a summary card to the org’s channel (currently Feishu) when a long-running ReAct or DAG task finishes. Configurable per-agent in Settings → Agent → Notifications. First consumer of the generic outbound notification pattern.
- Configurable confirmation gate — inline or channel: every agent now has an “Approval” section in Settings with three routing modes (Auto / Inline only / Channel only), an approver-scope selector (initiator / agent owner / anyone in the org), a “require confirmation for every tool call” override, and an explicit approval-channel picker. Auto mode uses a linked channel if one exists and gracefully falls back to an inline approval card in the chat stream otherwise — so agents without any channel still get a real approval UX instead of silently failing. A new `POST /api/confirmations/{id}/respond` endpoint shares a single decision-recording path with the Feishu webhook, so every approval — whether clicked in chat or in a Feishu group — stamps the same `approver_user_id` and `decided_at` audit fields.
### Changed

- Playground loading indicators now use a subtle text shimmer instead of the fake-looking progress bar that froze at a pseudo-full width after 8 seconds. Unified the two existing shimmer implementations (`.shiny-text` and `.text-shimmer`) into a single theme-aware primitive with an optional warm preset.
- Connector cards now surface a “Private default” badge (with tooltip) when a connector has `allow_fallback` disabled, so owners can tell at a glance which connectors require every user to bring their own credentials. The help text under the Allow-Fallback toggle in the connector settings form also clarifies that the flag only gates sharing with other users — the owner can always use their own default credential regardless.
### Fixed

- Connector calls made by the owner of a connector with `allow_fallback=false` and only a default credential (no per-user credential) no longer 401 with “Requires authentication”. The owner is now exempt from the fallback gate — that flag only controls whether other users may borrow the owner’s default credential. Previously the owner’s own agents were silently sending unauthenticated requests, and the same issue also affected workflow `connector_action` nodes.
- Conversation export now shows the correct mode label (“Planner” / “规划”) for auto-routed DAG conversations instead of always displaying “Standard”.
- Export timestamps now respect the user’s configured timezone instead of displaying raw UTC.
- Uploaded file content no longer leaks into exported conversations; only the user’s message text is included.
- Parallel tool calls no longer collide when a provider reuses `index=0` for every streamed tool-call delta; the aggregator now detects boundaries via id or name change and remaps subsequent deltas to the correct slot.
- Settings → Channels now reflects the current user’s org role: members (non-admin/owner) see a disabled “New Channel” button, hidden Edit / Enable-Disable / Delete actions, a read-only banner, and a permission-aware empty state — instead of an enabled CTA that failed on submit with “Organization admin access required”.
- Session-expiry redirect now preserves the query string, so users land back on the exact tab / filter they were viewing after re-authenticating instead of the bare path.
- Feishu channel form no longer shows a spurious “discard unsaved changes” prompt when interacting with the chat picker that’s layered above the dialog.
- Feishu channel setup hints no longer duplicate Chinese labels when the UI itself is already in Chinese (e.g. previously rendered “事件与回调 (事件与回调)”, i.e. “Events & Callbacks” twice).
- “Annotate All” in the schema manager no longer returns 500 Internal Server Error — the full-annotate backend path had an unbound-variable bug that blocked every invocation.
- Editing a database connector now shows the `********` placeholder in the password field instead of three-bullet masked text, making it obvious that leaving the field blank keeps the stored password.
- Updating a connector action no longer collapses the detail panel — the edited action stays selected so users can keep iterating on it.
- AI connector editor now distinguishes success, partial failure, and complete failure instead of showing the same “completed” message for all three. Failure reasons are surfaced inline so users can see what actually went wrong.
- AI connector editor can no longer silently wipe multiple actions in one go. Bulk-delete (>2 actions) now requires an explicit destructive keyword in the user’s instruction (“rebuild”, “全部重建”, “wipe”, etc.); otherwise the operation is rejected with a clear error, protecting requires_confirmation / JMESPath settings from accidental loss.
- Confirmation cards in the portal chat now show whether the request was routed to a channel (e.g. Feishu) or handled inline, alongside a human-readable hint about who is allowed to approve (the initiator, the agent owner, or any org member). Channel-routed requests also produce an inline pending card so the user isn’t left wondering whether a notification was actually sent.
- Feishu approval cards now become read-only after the first decision: the `/callback` webhook returns a replacement card with the Approve/Reject buttons removed and the header coloured green (approved) or red (rejected), preventing repeated clicks. Duplicate clicks that still arrive from stale Feishu clients get a “This request was already approved/rejected.” toast and a fresh copy of the decided card so the stale view catches up.
- Restored the plain “Send Test Message” action on channel rows and the details sheet. The Approval Playground exercises the full hook round-trip, but a notification-only channel (no approval hook wired) still needs a quick credential/connectivity sanity check, which the plain test-send covers.
- Concurrent clicks on the same Feishu approval card can no longer both succeed. The `/callback` handler now flips the `ConfirmationRequest` status via a conditional `UPDATE ... WHERE status='pending'` and uses the affected rowcount to decide which caller “won”; previously two parallel requests could both read `pending` and race a write, potentially ending up with approved-then-rejected on the same row.
- Pending approval requests now auto-expire after `CHANNEL_CONFIRMATION_TTL_MINUTES` (default 24h) via a background sweeper. This prevents a stale click days later from flipping agent state that has already been torn down; the next click on an expired card gets a grey “Expired” decided card and a “no longer active” toast.
- Send Test Message now delivers a plain text notification (no Approve/Reject buttons) and lives only in the channel details sheet — not the row dropdown. Users who don’t intend to use approval hooks aren’t confused by interactive buttons on a “test” message. Approval round-trip testing remains available via the Approval Playground button.
- Channel details sheet tightened: “How to finish setup” is now a collapsible section collapsed by default (so it doesn’t dominate the sheet for already-configured channels), and the outer padding was reduced so content sits closer to the sheet edge.
- Builder AI no longer reports masked (`****`) credentials as missing — it now recognizes them as configured and skips false “credential missing” guidance.
- Playground agent list now shows all accessible agents instead of only published ones, so draft agents can be tested without publishing first.
- Chat image uploads no longer crash the stream on malformed `data:` URLs — the MIME extractor now safely falls back to `application/octet-stream` instead of raising an IndexError mid-generate.
- Playground image thumbnails now cancel in-flight fetches on unmount via `AbortController`, avoiding stale blob-URL assignments and wasted bandwidth during rapid navigation.
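The streamed tool-call collision fix above (providers reusing `index=0` for every delta) can be sketched like this — the delta shape and function name are illustrative assumptions, not the project’s actual aggregator:

```python
def aggregate_tool_deltas(deltas: list[dict]) -> list[dict]:
    """Aggregate streamed tool-call deltas into complete calls.

    Some providers reuse index=0 for every parallel tool call, so the
    provider-supplied index cannot be trusted. A delta carrying a new
    `id` or `name` marks the start of a new call; bare argument
    fragments are appended to the current slot.
    """
    calls: list[dict] = []
    for d in deltas:
        starts_new_call = d.get("id") is not None or d.get("name") is not None
        if starts_new_call or not calls:
            calls.append({
                "id": d.get("id"),
                "name": d.get("name"),
                "arguments": d.get("arguments", ""),
            })
        else:
            calls[-1]["arguments"] += d.get("arguments", "")
    return calls
```

Keying on id/name change rather than the stream index is what keeps two parallel calls from being interleaved into one garbled argument string.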
## [v0.8.4] - 2026-04-17

### Added

- Conversation recovery: synthetic tool_result rows now persist after an interrupted turn; clients can resume a disconnected SSE stream via `POST /chat/resume` with the last-seen cursor.
- Playground now auto-reconnects dropped SSE streams using the `/chat/resume` endpoint with exponential backoff (max 3 attempts) and shows a “Reconnecting…” indicator during recovery.
- Prompt cache observability: `cache_read_input_tokens` and `cache_creation_input_tokens` are captured from LLM responses, aggregated per turn in `TurnProfiler`, logged as a `turn_cache` summary line (read/create tokens + estimated savings), and surfaced in the chat `done_payload` under a new `cache` field. Enables verification that Anthropic prompt caching actually hits, and doubles as a detector for whether API relay stations honor the cache discount.
### Changed

- System prompts now use a memoized section registry with Anthropic prompt-caching breakpoints on the stable prefix — reducing per-turn token cost by ~60-80% on the cached prefix for Claude models. ReAct JSON mode, native function-calling mode, and synthesis all emit two system messages for cache-capable providers (Claude, Bedrock Anthropic, Vertex Claude) and fall back to a single concatenated message for every other provider.
### Fixed

- Thinking/reasoning tokens now persist across multi-turn conversations — the Anthropic `signature` field is captured and replayed per API requirements.
- Provider-aware reasoning replay policy: `reasoning_content` (from DeepSeek-R1, Qwen QwQ, Gemini thinking, OpenAI o-series) is no longer replayed back to non-Anthropic providers on subsequent turns. Previously the field was serialized unconditionally in `ChatMessage.to_openai_dict()`, which violated provider documentation (DeepSeek and Qwen both explicitly document “do not send `reasoning_content` back in message history”) and silently invalidated their automatic prefix / KV caches on every multi-turn exchange. Policy is centralized in `core/prompt/reasoning.py` — the Claude family (including Bedrock and Vertex proxies) still replays thinking blocks with signature as required.
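A minimal sketch of the replay policy above — the provider names and function are illustrative stand-ins for the centralized logic in `core/prompt/reasoning.py`:

```python
# Illustrative provider identifiers, not FIM One's actual registry.
CLAUDE_FAMILY = {"anthropic", "bedrock-anthropic", "vertex-claude"}

def serialize_for_replay(message: dict, provider: str) -> dict:
    """Drop reasoning_content when replaying history to non-Claude providers.

    DeepSeek and Qwen document that reasoning_content must not be sent back
    in message history; doing so also invalidates their prefix/KV caches.
    Claude-family providers still require thinking blocks (with signature)
    to be replayed, so the field is kept for them.
    """
    out = dict(message)  # copy: never mutate the stored message
    if provider not in CLAUDE_FAMILY:
        out.pop("reasoning_content", None)
    return out
```

The key design point is that the decision lives at serialization time, per provider, instead of being baked unconditionally into `to_openai_dict()`.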
## [v0.8.3] - 2026-04-16

### Added

- `convert_to_markdown` built-in tool — New general-purpose Agent tool that converts any file, URL, YouTube link, or data URI to clean Markdown using Microsoft’s MarkItDown. Supports PDF, Word (.docx), Excel (.xlsx/.xls), PowerPoint (.pptx), HTML, JSON, CSV, XML, ZIP, EPUB, Outlook .msg, images, audio (speech → text), and YouTube transcripts. Available to every agent by default — same tier as `web_fetch`. When a vision-capable LLM is configured, embedded images and scanned PDF pages are OCR’d automatically via the official `markitdown-ocr` plugin. Previously this capability was hidden inside the background RAG ingestion pipeline; agents now have it on the interactive conversation path.
- Document OCR via `markitdown-ocr` — Embedded images in DOCX / XLSX / PPTX and scanned PDF pages are now OCR’d using the same vision-capable LLM the rest of FIM One routes through. Applies to both the built-in `convert_to_markdown` tool and the RAG ingestion pipeline, so chat-time conversion and knowledge-base ingestion produce byte-identical Markdown for the same input.
- Universal vision provider support for document OCR — A new `LiteLLMOpenAIShim` duck-type wraps any FIM One `OpenAICompatibleLLM` in the openai SDK’s `.chat.completions.create(...)` API shape, then dispatches through `litellm.completion()`. MarkItDown (which hard-codes the openai SDK surface) can now consume Anthropic Claude, Google Gemini, Azure, Bedrock, and any other provider LiteLLM supports — no per-provider adapter code in FIM One.
- Vision-aware RAG ingestion — Knowledge-base uploads of Office documents and scanned PDFs now resolve the workspace’s default vision LLM (DB-first, ENV fallback) and pass it through to MarkItDown for OCR during ingestion. Zero-regression: when no vision-capable model is available, ingestion silently falls back to text-only mode — exactly the pre-feature behavior.
- Expanded MarkItDown format coverage — RAG now natively ingests `.pdf`, `.msg` (Outlook), `.epub`, `.mp3`, `.wav`, and `.m4a` via MarkItDown’s audio-transcription and outlook extras. YouTube URLs flow through `convert_to_markdown` via `markitdown[youtube-transcription]`.
- `LLM_SUPPORTS_VISION` env var — Optional opt-out (`=false`) for the ENV-mode document-OCR fallback. Default behavior is optimistic (`true`), which covers the common ENV setups (`gpt-4o`, `claude-3-5-sonnet`, `gemini-1.5-pro`/`flash`). Set to `false` only when your ENV-configured `LLM_MODEL` does not support vision (e.g. `deepseek-v3`, `qwen-chat`, `llama-3.1`, `gpt-3.5-turbo`, `o1-mini`) to skip a failing vision call on every document upload. Ignored entirely when an admin-curated ModelGroup is active — DB mode is always the source of truth when available.
- Turn-level profiler — Each ReAct turn now logs phase-level timings (`memory_load`, `compact`, `tool_schema_build`, `llm_first_token`, `llm_total`, `tool_exec`) in a single structured log line per turn. Toggleable via `REACT_TURN_PROFILE_ENABLED` (default: on; set to `false` for a zero-overhead no-op).
- Structured compact work card — Conversation compaction now parses its own 9-section markdown output into a typed `WorkCard` and merges new compacts into the previous one, so errors and pending tasks from earlier in a long session survive across multiple compaction rounds instead of being re-summarized from scratch.
### Changed

- Per-user rate limiting — The LLM-layer rate limiter now maintains a separate bucket per user instead of a single process-global bucket, preventing one noisy user from throttling all other users on the same worker. Toggleable via `LLM_RATE_LIMIT_PER_USER` (default: on).
### Fixed

- Dangling tool_use recovery — Conversations interrupted mid-tool-execution (user Stop, SSE disconnect, crash) previously left an assistant message with a `tool_use` block and no matching `tool_result`, causing the next turn to crash with an opaque HTTP 400 from the LLM API. `DbMemory.get_messages()` now detects and repairs these dangling blocks on the read path with a synthetic `[interrupted]` tool_result. The raw DB log is not mutated.
- Empty-content assistant messages with tool_calls are no longer dropped — The DbMemory load-path filter previously silently discarded any assistant row with empty text content, wiping out native function-calling intermediates (which carry only `tool_calls`, no text). The filter now requires BOTH empty content AND no `tool_calls`.
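The read-path repair described above can be sketched as follows — the flat message shape (`tool_use_id` / `tool_result_for` keys) is a simplifying assumption, not the actual `DbMemory` schema:

```python
def repair_dangling_tool_use(messages: list[dict]) -> list[dict]:
    """Insert a synthetic tool_result after any tool_use that was
    interrupted before its result was written (Stop, disconnect, crash).

    Runs on the read path only; the stored log is never mutated.
    """
    out: list[dict] = []
    for i, msg in enumerate(messages):
        out.append(msg)
        if msg.get("role") == "assistant" and "tool_use_id" in msg:
            nxt = messages[i + 1] if i + 1 < len(messages) else None
            answered = nxt is not None and nxt.get("tool_result_for") == msg["tool_use_id"]
            if not answered:
                # Synthetic result keeps the history well-formed for the API.
                out.append({
                    "role": "tool",
                    "tool_result_for": msg["tool_use_id"],
                    "content": "[interrupted]",
                })
    return out
```

Repairing on read rather than on write means crashes need no cleanup pass and the raw log stays an accurate record of what actually happened.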
## [v0.8.2] - 2026-04-10

### Added

- Intelligent Document Processing (Vision-Aware) — Adaptive document handling based on model capabilities. When the target LLM supports vision (GPT-4o, Claude 3/4, Gemini), PDF pages are rendered as images and sent via vision content blocks for full visual fidelity. Text-only models fall back to pdfplumber text extraction. Two modes: Vision and Text-only. Configurable via `DOCUMENT_PROCESSING_MODE`, `DOCUMENT_VISION_DPI`, `DOCUMENT_VISION_MAX_PAGES` env vars. Per-model `supports_vision` toggle in Admin.
- Document vision pipeline — DOCX, PPTX, and PDF files uploaded in chat now have their embedded images extracted and sent as vision content to the LLM when vision is enabled on the model.
- Multi-turn vision persistence — Vision content from uploaded documents and images persists across conversation turns, so the model retains visual context throughout the conversation.
- Smart PDF processing — Text-rich PDF pages extract text plus embedded images separately (saving tokens). Scanned or image-only pages render as full-page PNG for maximum fidelity.
- Pre-built sandbox image — `Dockerfile.sandbox` with common data-science packages (pdfplumber, Pillow, pandas, etc.) so AI code execution works out of the box in `--network=none` containers.
- Resource Fork completion — All five resource types now support fork: Agent, Connector, Workflow, MCP Server, and Skill. KB fork removed (inherently user-local).
### Changed
- Faster chat response completion — SSE stream now closes immediately after the agent finishes; title generation and follow-up suggestions run in the background instead of blocking the response.
- Smarter context compaction — Conversation compaction uses a structured 9-section format that better preserves key information (original request, errors, pending tasks) across long sessions.
- Reduced agent looping — Anti-loop instructions added to agent prompts; cycle detection threshold lowered so repeated identical tool calls are caught earlier.
- Faster request startup — LLM configuration lookups and domain classification now run concurrently, reducing per-request overhead by 400-1100ms.
- Better empty tool handling — Tools that return no output now produce a descriptive message instead of bare “(no output)”, preventing wasteful retries.
- Automatic old tool result cleanup — Tool results older than the 6 most recent are automatically cleared before context compaction, keeping conversations lean.
- Tool result aggregate budget — Total tool result tokens are capped at 40K per session; new results are truncated when the budget is exceeded, preventing context bloat from large API responses.
- Context overflow auto-recovery — When the LLM rejects a request due to context length overflow, the agent automatically compacts to 50% and retries, instead of crashing the entire conversation.
- Keyword-based tool selection — When a query obviously matches a specific tool by name or description keywords, the agent skips the LLM-based tool selection call, saving 200-500ms.
- LLM connection pooling — All LLM API calls now share a single connection pool with optimized keepalive settings, reducing connection overhead across the entire session.
- Smarter completion check — The post-answer verification step is skipped for long detailed answers (>200 tokens), eliminating an unnecessary LLM round-trip.
- Model fallback on provider outage — When the primary model is unavailable (rate limited, overloaded, or down), the agent automatically retries with the fast model instead of failing.
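The tool-result aggregate budget above (40K tokens per session, new results truncated once the budget is exceeded) can be sketched like this — the function, the 4-chars-per-token estimator, and the truncation markers are illustrative assumptions:

```python
TOOL_RESULT_BUDGET_TOKENS = 40_000  # per-session cap, as described above

def admit_tool_result(used_tokens: int, result: str,
                      count_tokens=lambda s: len(s) // 4) -> tuple[int, str]:
    """Truncate a new tool result so the session total stays under budget.

    `count_tokens` is a stand-in estimator (~4 chars/token); a real
    implementation would use the model's tokenizer. Returns the updated
    running total and the (possibly truncated) result text.
    """
    remaining = TOOL_RESULT_BUDGET_TOKENS - used_tokens
    if remaining <= 0:
        return used_tokens, "[truncated: tool result budget exhausted]"
    cost = count_tokens(result)
    if cost > remaining:
        result = result[: remaining * 4] + "\n[truncated]"
        cost = remaining
    return used_tokens + cost, result
```

Capping the aggregate rather than each result individually is what stops a handful of large API responses from crowding everything else out of context.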
### Fixed
- Agent hallucination on unreadable files — When the AI agent could not read a file (e.g., image-based PDF), it previously read other unrelated files and presented their content as the target file’s. A file integrity guardrail in the system prompt now prevents this.
- File ID injection for uploads — Uploaded files now include their UUID file_id in the message context, so the agent can directly access them via `read_uploaded_file` without guessing.
- Vision toggle reading from the new model structure — The `supports_vision` flag on model configs was not being read correctly from the ModelGroup/ModelProviderModel ORM structure. Fixed.
- Improved error messages for unreadable files — When files cannot be read, the tool now returns specific guidance (file type, vision suggestion) instead of generic errors.
## [v0.8.1] - 2026-03-29

### Added

- Timezone-aware admin notifications — Admin notification emails now display event times in each recipient’s configured timezone instead of always showing UTC.
- Progressive database tool disclosure — A single `database` meta-tool with `list_tables`/`discover`/`query` subcommands replaces individual per-table tools. Configurable via the `DATABASE_TOOL_MODE` env var (`progressive` default, `legacy` fallback).
- On-demand tool loading — When more than 12 tools are available, a `request_tools` meta-tool lets the agent dynamically load additional tools mid-conversation instead of being stuck with the initial selection.
- Progressive MCP tool disclosure — A single `mcp` meta-tool with `discover`/`call` subcommands replaces individual per-server tools. Configurable via the `MCP_TOOL_MODE` env var (`progressive` default, `legacy` fallback).
- Per-turn token budget circuit breaker — The `REACT_MAX_TURN_TOKENS` env var provides an emergency stop for runaway agent loops. Default `0` (unlimited) — use the per-user `token_quota` for daily cost control instead.
- Per-model Native Function Calling toggle — The `tool_choice_enabled` setting (ENV + Admin per-model) lets models that reject forced tool selection skip Level 1 and go directly to JSON Mode. Configurable in Settings → Models → Advanced.
- DAG quality overhaul — Five improvements: default model upgrade to the general model for non-fast steps; skill auto-discovery in planning; citation verifier for legal/medical/financial domains; structured content context preservation with a configurable truncation multiplier; domain classification in the router with domain-aware model selection.
- Domain model escalation in ReAct — Specialist domains (legal/medical/financial) auto-escalate to the reasoning model with mandatory web search and citation verification.
- File attachment download — File cards in chat messages are now clickable to download the original file.
- Admin notification master switch — Global on/off toggle for admin email notifications with runtime SMTP detection. Shows a warning banner when SMTP is not configured and disables all notification controls.
- SMTP Reply-To header — A new `SMTP_REPLY_TO` env var allows replies to go to a different address than the sender.
- Resource Fork Phase 1 (MCP Server + Skill) — `POST /api/mcp-servers/{id}/fork` and `POST /api/skills/{id}/fork` endpoints create user-owned deep copies with `visibility=personal` and `forked_from` lineage tracking. Encrypted env/headers are skipped on MCP Server fork; publish status is skipped on Skill fork. An Alembic migration adds the `forked_from` column to both tables. 41 tests.
- Workflow Connection Dep Auto-Subscribe — `DependencyAnalyzer._resolve_workflow` now recursively resolves sub-workflow dependencies with cycle detection (visited set). Agent and sub-workflow nodes are correctly added as content deps in dependency manifests. Missing resources are handled gracefully (log a warning, no failure). 19 tests.
- Prebuilt Solution Templates (Market Seed Content) — 8 vertical solution templates bootstrapped idempotently on first-user registration: Financial Audit, Contract Review, Data Reporting, IT Helpdesk, HR Onboarding, Sales Assistant, Content Writer, Meeting Summary. Each bundles an Agent + Skill with Chinese SOPs. Published to the Market org (`visibility=org`, `publish_status=approved`) for immediate marketplace availability. 4 tests.
- ReAct cycle detection — Deterministic detection of repeated identical tool calls. Injects a warning after 3 consecutive calls with the same arguments, preventing agents from looping on failing tools. Configurable via `REACT_CYCLE_DETECTION_THRESHOLD`.
- ReAct completion checklist — One-time verification prompt before accepting final answers when tools were used, reducing premature or incomplete responses. Toggleable per agent instance.
Changed
- Completion checklist min-tools threshold — Checklist now only fires when the agent has made 3+ tool calls (configurable via `REACT_COMPLETION_CHECK_MIN_TOOLS`). Simple 1-2 tool tasks skip verification to avoid unnecessary latency.
- Dynamic system prompt budgeting — Removed the fixed `SYSTEM_PROMPT_RESERVE` (4K tokens) from context budget calculation. ContextGuard now accounts for the system prompt dynamically, giving each iteration ~4K more usable context.
- Centralized tool truncation — All tool types now delegate truncation to a shared module. Defaults configurable via the `TOOL_OUTPUT_MAX_CHARS`, `TOOL_OUTPUT_MAX_ITEMS`, and `TOOL_OUTPUT_MAX_BYTES` env vars.
- Domain detection decoupled — Domain classification runs independently in each endpoint, no longer bundled with auto-routing. Domain SOP instructions softened to guide rather than mandate web search.
- `AUTO_ROUTING` env var removed — The Auto endpoint always classifies queries.
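A shared, env-driven truncation helper like the one above might look like this — a minimal sketch in which the env var names come from the changelog but the function name, defaults, and truncation markers are illustrative assumptions:

```python
import os

# Assumed defaults; only the env var names are from the changelog.
MAX_CHARS = int(os.getenv("TOOL_OUTPUT_MAX_CHARS", "8000"))
MAX_ITEMS = int(os.getenv("TOOL_OUTPUT_MAX_ITEMS", "50"))

def truncate_tool_output(output):
    """Single truncation point every tool delegates to."""
    if isinstance(output, list):
        if len(output) > MAX_ITEMS:
            # Keep the head of the list and say how much was dropped.
            return output[:MAX_ITEMS] + [f"... {len(output) - MAX_ITEMS} more items truncated"]
        return output
    text = str(output)
    if len(text) > MAX_CHARS:
        return text[:MAX_CHARS] + f"\n... [{len(text) - MAX_CHARS} chars truncated]"
    return text
```

Centralizing this means the limits can be tuned in one place instead of per tool type.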
Fixed
- Duplicate message submission — Chat input now uses a synchronous guard to prevent the same message from being submitted multiple times on rapid clicks.
- Structured output degradation chain — The 3-level fallback (native FC → JSON mode → plain text) now properly falls through all levels.
- `json_mode_enabled` DB value ignored — Models configured via Admin now correctly use their per-model setting instead of always falling back to the env var.
- DAG planning failure message — Now shows a user-friendly bilingual message instead of a raw pipeline error.
- MCP server owner bypass for `allow_fallback` — The server owner is no longer blocked by `allow_fallback=False`.
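The degradation-chain fix above is easiest to see as a fall-through loop. This is a hypothetical sketch (the changelog names the three levels; the `structured_call` helper and `regex_fallback` shown here are illustrative, not the project's API):

```python
import json
import re

def structured_call(levels, prompt):
    """Try each extraction level in order; degrade on any failure."""
    for extract in levels:
        try:
            result = extract(prompt)
            if result is not None:
                return result
        except Exception:
            continue  # fall through to the next level instead of surfacing the error
    return None

def regex_fallback(text):
    # Last resort: pull the first JSON object out of free-form text.
    match = re.search(r"\{.*\}", text, re.DOTALL)
    return json.loads(match.group(0)) if match else None
```

The bug class being fixed is exactly the missing `continue`: if a level raises and the exception escapes, the chain never reaches the later, more forgiving levels.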
[v0.8] - 2026-03-20
Added
- Marketplace redesign Phase 1 — Solutions + Components — Two-tier Market model (Solutions: Agent/Skill/Workflow; Components: Connector/MCP Server) with scope selector (Global Market / org). KB removed from Market scope. Unified subscription model.
- Smart file content injection + `read_uploaded_file` tool — Small uploads (<32K chars) auto-inlined into LLM context; large files get metadata + a tool hint. Dual-mode reading tool with pagination and regex search. `GET /api/files/{file_id}/content` endpoint.
- Workflow Blueprint System — Visual workflow editor for multi-step automation: 25 node types (Start, End, LLM, ConditionBranch, QuestionClassifier, Agent, KnowledgeRetrieval, Connector, HTTPRequest, VariableAssign, TemplateTransform, CodeExecution, Iterator, Loop, VariableAggregator, ParameterExtractor, ListOperation, Transform, DocumentExtractor, QuestionUnderstanding, HumanIntervention, SubWorkflow, ENV + more), React Flow v12 editor with drag-and-drop palette, auto-layout, SSE real-time execution, variable interpolation, condition/classifier branching, error strategies per node, per-node timeout, import/export/duplicate, version history with diff viewer, 14 built-in templates, 306 tests.
- Workflow Triggers — Cron scheduling with timezone support; public API keys (`wf_` prefix) for external execution without user auth; batch execution (up to 100 input sets, configurable parallelism).
- Workflow Operations — Real-time execution log viewer, trace viewer with variable snapshots, run replay overlay on canvas, run history export, analytics dashboard with daily trends and percentiles, per-node statistics panel, favorites/pinning, inline validation badges, canvas node search (`Cmd+F`), keyboard shortcuts, snap-to-grid.
- Workflow Admin + Templates — Admin management tab for all workflows, `WorkflowTemplate` model with admin CRUD and 5 seed templates, publish flow with org-level review gating, import conflict resolver for external references.
- Agent Skill System — On-demand skill loading: `Skill` model with CRUD/publish/review, `read_skill(name)` tool for progressive disclosure (~80% token reduction), per-agent `compact_instructions` for custom ContextGuard compaction. Full Skills UI with list page, editor, and agent skill selector.
- ConnectorMetaTool (Progressive Disclosure Phase 1-2) — Single meta-tool replaces per-action tools. The system prompt receives lightweight stubs (~30 tokens/connector); the agent calls `discover`/`execute` on demand. Feature flag `CONNECTOR_TOOL_MODE` for backward compatibility.
- Connector import/export/fork — Share connector templates via JSON export, clone and customize via fork. Backend sanitizes credentials on export.
- Connector credential encryption + per-user override — `connector_credentials` table with Fernet encryption, `allow_fallback` flag, `GET/PUT/DELETE /my-credentials` endpoints.
- Publish review UI — Org-level review system with approve/reject workflow, status badges on resource cards, a review notice in the publish dialog, and resubmit for rejected resources.
- Semantic schema annotations — 16 predefined semantic tags for connector fields with `description` and `pii` flags, surfaced in LLM tool descriptions.
- Agent mid-loop self-reflection — Goal-check prompt injected every 6 iterations in ReAct to prevent drift in long chains.
- Shadow Market org + resource subscriptions — Pull-based resource sharing: resources discovered via marketplace and explicitly subscribed. Market API for browse/subscribe/unsubscribe.
- Agent auto-discovery + sub-agent binding — `discoverable` flag + `sub_agent_ids` whitelist + `CallAgentTool` for one-level delegation.
- MCP server credentials + per-user override — `mcp_server_credentials` table with an `allow_fallback` flag for credential fallback behavior.
- Connector/KB toggle — Suspend/resume endpoints for both resource types.
- Standalone KB conversations — `kb_ids` field on conversations for direct KB chat without agent binding.
- Review log audit tab — Admin audit page with a system log / review log toggle and a filterable review trail per org/resource.
- Agent directive in synthesis — `agent_directive` parameter ensures final answers honour the agent's core purpose.
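The ConnectorMetaTool pattern above (one tool, two verbs, stubs in the prompt) can be sketched roughly like this — class name, registry shape, and stub wording are all illustrative assumptions, not the project's actual API:

```python
class ConnectorMetaTool:
    """One meta-tool instead of a tool per connector action."""

    def __init__(self, registry):
        # registry: {connector_name: {action_name: callable}}
        self.registry = registry

    def stub(self, name):
        # Lightweight system-prompt stub (~30 tokens) instead of full action schemas.
        return f"connector '{name}': call discover('{name}') to list its actions"

    def discover(self, connector):
        """First verb: reveal available actions only when the agent asks."""
        return sorted(self.registry[connector].keys())

    def execute(self, connector, action, **kwargs):
        """Second verb: run one action with the agent-supplied arguments."""
        return self.registry[connector][action](**kwargs)
```

The token saving comes from deferring schema detail: the prompt carries only one stub line per connector, and full action lists are fetched on demand.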
Changed
- Subscription-based visibility model — Simplified from 3-tier to 2-tier (own → subscribed). Auto-migration preserves existing access.
- Tool cache whitelist — Replaced the blacklist with an explicit `cacheable` property on tools. 11 read-only tools marked cacheable.
- DAG executor cascade failure — Failed steps now cascade-block dependents with transitive propagation.
- DAG planner improvements — Tool descriptions in planner, full re-plan history across all rounds, 14 engine constants parameterized as env vars.
- `stream_answer` observation truncation — Increased from 2000 to 8000 characters (configurable via `REACT_TOOL_OBS_TRUNCATION`).
- Evidence confidence UI — Amber warning cards, `[N]` citation badges with hover popovers, a conflict warning banner with side-by-side comparison.
- Workflow version change summaries — Auto-generated human-readable summaries from blueprint diffs on version save.
- Workflow run retention cleanup — Background cleanup task with configurable age/count limits. Env vars: `WORKFLOW_RUN_MAX_AGE_DAYS`, `WORKFLOW_RUN_MAX_PER_WORKFLOW`.
- Connector circuit breaker — Three-state machine (closed/open/half-open) with per-connector failure tracking and monitoring endpoints.
- Replaced elkjs with lightweight BFS auto-layout — `/workflows/[id]` bundle reduced from 473 kB to 43 kB.
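The closed/open/half-open machine named above follows the standard circuit-breaker shape; a minimal sketch, with illustrative thresholds and method names (the real per-connector implementation will differ):

```python
import time

class CircuitBreaker:
    def __init__(self, failure_threshold=5, reset_timeout=30.0):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self.failures = 0
        self.state = "closed"
        self.opened_at = 0.0

    def allow(self):
        """Should the next call be attempted?"""
        if self.state == "open":
            if time.monotonic() - self.opened_at >= self.reset_timeout:
                self.state = "half-open"  # let exactly one probe request through
                return True
            return False
        return True

    def record(self, success):
        """Report the outcome of an attempted call."""
        if success:
            self.failures, self.state = 0, "closed"
        else:
            self.failures += 1
            # A failed half-open probe re-opens immediately; otherwise open at threshold.
            if self.state == "half-open" or self.failures >= self.failure_threshold:
                self.state = "open"
                self.opened_at = time.monotonic()
```

One breaker instance per connector gives the per-connector failure tracking the entry describes; the monitoring endpoints would just expose `state` and `failures`.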
Fixed
- Workflow eval namespace flattening — Fixed short variable name resolution in ConditionBranch and VariableAssign.
- No re-review on `is_active` toggle — Toggling `is_active` no longer reverts `publish_status` from `approved` to `pending_review`.
- Cascade-skip for condition branches — Skipped nodes correctly deactivate outgoing edges.
- Dependency analyzer — Fixed `skill_ids` resolution; case-insensitive node type matching.
Removed
- Removed the `is_global` field and all global visibility concepts — replaced by the Market org + subscriptions.
- Removed global agent/MCP server admin endpoints.
[v0.7.5] - 2026-03-12
Added
- Free mode switching — Switch between Auto/ReAct/DAG mid-conversation. Per-turn mode tracking via `metadata.mode`.
- Three model roles — Independent env config for General, Fast, and Reasoning tiers. The Fast model no longer inherits the main model's settings.
- DAG engine improvements — `StepOutput` structured data, tool cache with async-lock stampede prevention, per-step LLM verification with retry (`DAG_STEP_VERIFICATION`), auto-routing via fast LLM classification (`AUTO_ROUTING`).
- Connector credential encryption — Auth tokens extracted to a `connector_credentials` table with Fernet encryption via `CREDENTIAL_ENCRYPTION_KEY`. Per-user credential override endpoints. `allow_fallback` flag.
- ModelConfig API key encryption at rest — Transparent encrypt-on-write / decrypt-on-read with backward-compatible plaintext detection.
Changed
- Skeleton screens — All list/grid pages show layout-aware skeletons during load instead of spinners.
Fixed
- Fast model no longer inherits settings from the main model.
- SSE routing event field names aligned with backend.
[v0.7.4] - 2026-03-12
Added
- Evaluation Center — Test dataset management, parallel eval runs with LLM grading, per-case pass/fail/latency/token results viewer with auto-polling.
- Admin: `json_mode_enabled` per-model flag — Explicit toggle preventing AWS Bedrock prefill issues. ENV models controlled by `LLM_JSON_MODE_ENABLED`.
- SSE Protocol v2 — Real-time streaming with `delta_reasoning` and `usage` fields, split `done`/`suggestions`/`title`/`end` events.
- AI Builder expansion — 7 new builder tools, `is_builder` flag, builder prompt auto-refresh, SSRF guard. Full ReAct agent dialog for connector management.
- Dual database support — SQLite (zero-config) + PostgreSQL (production). Docker Compose auto-provisions PG with health checks.
- Extended thinking / reasoning — `LLM_REASONING_EFFORT` and `LLM_REASONING_BUDGET_TOKENS` for OpenAI o-series, Gemini 2.5+, Claude.
- Admin: tool disable — Per-tool enable/disable toggles; disabled tools filtered from chat at runtime.
- Settings: Organizations tab — Create, join, manage orgs with member roles directly from Settings.
- Docker Compose deployment — Single image, named volumes, standalone Next.js output.
- Export: PDF format — Conversations exportable as PDF documents.
- Multi-worker support — `WORKERS=N` env var; Redis interrupt broker for cross-worker relay.
Changed
- LLM layer: LiteLLM — Replaced the direct `AsyncOpenAI` client for universal provider support.
- Structured output degradation — Unified `structured_llm_call()` with 3-level extraction (native FC → JSON mode → plain text + regex).
- Smart relay routing — Auto-detects the API protocol from URL path patterns for third-party relay platforms.
Fixed
- Docker sandbox (DooD) volume mount path translation.
- Security: sandbox AST dunder validation, MCP stdio defaults, SSRF DNS rebinding, shell metacharacter evasion, connector template injection.
- Admin dashboard stats crash on PostgreSQL.
- Docker: i18n file discovery, startup race condition, OAuth auto-detect for custom ports.
- Export: RFC 5987 filename for CJK.
[v0.7.3] - 2026-03-06
Added
- Global MCP servers — Admin-provisioned, loaded in all chat sessions.
- Structured audit logging — `write_audit()` helper with structured columns.
Fixed
- Invite code backward-compat for the legacy `registration_enabled` field.
[v0.7.2] - 2026-03-06
Added
- Invite-only registration — Three modes (open/invite/disabled) with invite code CRUD.
- Storage management — Per-user disk usage, clear, orphan cleanup.
- Per-user force logout — Admin token revocation.
- Conversation moderation — Admin list/delete all conversations.
[v0.7.1] - 2026-03-06
Added
- API health dashboard — System stats, connector metrics, token usage charts.
- JWT auth — Token-based SSE auth, conversation ownership.
- Admin API — Agent management, per-user token quota (429 enforcement).
[v0.7] - 2026-03-06
Added
- Admin Platform — User management, role toggle, password reset, account enable/disable.
- First-run setup wizard — Guided admin account creation.
- Personal Center — Per-user global instructions, language preference.
[v0.6.5] - 2026-03-05
Added
- Utility tools — `email_send`, `json_transform`, `template_render`, `text_utils`.
- Connector response filtering — `CONNECTOR_RESPONSE_MAX_CHARS` and `CONNECTOR_RESPONSE_MAX_ITEMS`.
- Embedding model options — Jina, OpenAI, and custom providers.
[v0.6] - 2026-03-01
Added
- Connector Platform — Full CRUD, ConnectorToolAdapter, per-user credential encryption, confirmation gate, circuit breaker, audit logging.
- MCP integration — Tool auto-discovery via protocol, process isolation.
[v0.5] - 2026-02-28
Added
- Full RAG pipeline — Jina embedding + LanceDB + FTS + RRF + reranker.
- Grounded Generation — Evidence-anchored citations, conflict detection, confidence scores.
- KB document management — Chunk-level CRUD, search, retry, schema migration.
- ContextGuard + Pinned Messages — Token budget manager.
- DAG Re-Planning — Up to 3 rounds; LLM Compact for memory.
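RRF in the pipeline above refers to Reciprocal Rank Fusion, the usual way to merge a vector ranking with an FTS ranking. A generic sketch (the function name is illustrative; `k=60` is the conventional constant from the RRF literature, not necessarily this project's setting):

```python
def rrf_fuse(rankings, k=60):
    """rankings: list of ranked doc-id lists (best first). Returns ids by fused score."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            # Each list contributes 1/(k + rank); documents high in any list win.
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)
```

Because only ranks are used, no score normalization between the embedding distance and the FTS score is needed, which is why RRF is a common glue between heterogeneous retrievers before a reranker.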
[v0.4] - 2026-02-25
Added
- Multi-turn conversations — DbMemory persistence, smart truncation.
- Tool step folding UI — Collapse/expand tool calls.
- HTTP request + shell exec tools.
- Agent management — Create, configure, publish with bound models/tools.
- JWT authentication.
[v0.3] - 2026-02-25
Added
- Web tools — `web_search` (Jina/Tavily/Brave), `web_fetch`.
- File operations + MCP client.
- DAG visualization — Interactive flow graph with live status.
- Code execution in Docker — `--network=none`, memory limits, timeout.
[v0.2] - 2026-02-24
Added
- Retry & rate limiting — Exponential backoff.
- Usage tracking — Per-request token/cost accounting.
- Native function calling — Direct model tool selection.
- Multi-model support — `FAST_LLM_MODEL` for DAG steps.
- Memory system — Window, Summary, Db memory.
- FastAPI backend — `/api/execute`, `/api/stream` (SSE).
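Retry with exponential backoff, as introduced here, typically adds jitter so concurrent clients don't retry in lockstep. A generic sketch with illustrative parameter names and defaults (not this project's actual helper):

```python
import random
import time

def retry(fn, attempts=5, base=0.5, cap=30.0):
    """Call fn(); on failure, sleep an exponentially growing, jittered delay and retry."""
    for attempt in range(attempts):
        try:
            return fn()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            # Full jitter: random delay up to the capped exponential bound.
            time.sleep(random.uniform(0, min(cap, base * 2 ** attempt)))
```

The cap keeps the worst-case wait bounded, and raising on the final attempt preserves the original exception for the caller.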
[v0.1] - 2026-02-22
Added
- ReActAgent — Reason → Act → Observe loop.
- DAGPlanner — LLM-generated dependency graphs, concurrent execution, result verification.
- Tools — Calculator, Python exec.
- Portal UI — Next.js with streaming, dark/light theme, KaTeX.