Goal: Build an AI-powered Connector Hub — Standalone (portal assistant), Copilot (embedded in host system), Hub (central cross-system orchestration). Principles: Provider-agnostic (no vendor lock-in), minimal-abstraction, protocol-first, connector-first (integration is the core value).
Product Vision
FIM One is an AI Connector Hub that serves three progressive modes:

| Step | Mode | What happens |
|---|---|---|
| Land | Copilot | Embed into one system, prove value inside their UI |
| Expand | Copilot → Hub | Roll out to more systems; Hub aggregates them |
Known Issues
Tracked bugs that are reproducible in production but not yet fixed. Each entry names the symptom, the suspected surface area, and the workaround (if any). Items move to a version section once a fix is scoped and scheduled.

- Agent editor shows unsaved-changes warning on entry without any edit. Opening an existing agent via `/agents/[id]` and immediately clicking back triggers the "Unsaved changes" dialog even when no field was touched. The dirty check diffs 20+ fields against the loaded agent payload, so one asymmetric default between state init and dirty compare is enough to cause a phantom mismatch — current suspicion is one of the nested `model_config_json` / notification / approval-routing fields, possibly from `undefined` vs `null` vs `""` normalization. Reproduces on org-scoped agents in particular. Workaround: dismiss the dialog (Discard and leave) — no data loss since nothing actually changed. Attempted fix (cb40c86a) removed a related orphan-badge flicker on the resource pickers but did not resolve this.
- Saving an agent edit can fail with `Input should be 'initiator', 'agent_owner' or 'org_members'`. Pydantic rejects the `confirmation_approver_scope` field at the `/api/agents/{id}` PUT boundary even though every stored value in the database is one of the three valid literals. Suspicion: the frontend `as "initiator" | "agent_owner" | "org_members"` cast is a compile-time-only promise, so a legacy or unexpected runtime string (possibly from a template, import, or older migration) can slip through `setConfirmationApproverScope` and be echoed back verbatim. Workaround: explicitly re-select a value in the Approval → Approver Scope dropdown before saving.
- Playground stop-and-retry shows transient visual artefacts that a page refresh always clears. Three concurrent render sources — `activeConversation.messages` (DB snapshot), the SSE `messages` stream, and the optimistic `pendingQuery` placeholder — are not collapsed into a single derived state, so between clicking "Retry" and the paired assistant response landing, the UI can (a) briefly render the same query twice in the pre-stream window, (b) drop prior orphan user bubbles from the retry history while `hasLiveMessages` is true and before the snapshot reloads, and (c) flicker in the narrow window between the SSE "done" event and the next `selectConversation` refresh. Data is never lost — every user message (including aborted retries) is persisted in `conversation.messages`, carried into the next LLM call via `normalize_alternating_messages`, and rendered correctly after refresh via `HistoryTurn.orphanUserContents` introduced in the 48ba08c6 render fix. For context, Claude's own web UI exhibits an analogous class of bug — stopping mid-response and immediately sending a follow-up query sometimes forks the follow-up as a sibling-edit branch of the first query rather than appending it as a new turn — so this is a known hard problem in optimistic-UI + SSE + persisted-history designs, not a FIM-One-specific defect. A proper fix requires collapsing the three render sources into a single derived state; deferred until a broader Playground state-machine refactor.
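The deferred fix is essentially a pure derivation over the three sources. A minimal sketch (written in Python for brevity; the real fix would be a memoized selector in the React layer, and the message shape with stable `id`s is an assumption):

```python
def derive_render_messages(snapshot, stream, pending):
    """Collapse the three render sources into one list: DB snapshot first,
    then live SSE messages, then the optimistic placeholder, deduplicated
    by message id so a retried query can never render twice."""
    merged, seen = [], set()
    for source in (snapshot, stream, pending):
        for msg in source:
            if msg["id"] not in seen:
                seen.add(msg["id"])
                merged.append(msg)
    return merged
```

Because every frame renders only the derived list, the duplicate-query and dropped-bubble windows disappear by construction.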
Shipped Versions
v0.1 (2026-02-22) — MVP: ReAct + DAG Planner
- ReActAgent with tools (calculator, python_exec, web_search)
- DAG Planner (LLM generates dependency graphs)
- Portal UI with streaming + KaTeX
v0.2 (2026-02-24) — Multi-Model + Memory
- Retry / rate limiting / usage tracking
- Native function calling (no JSON-only parsing)
- Multi-model support (fast + main LLM)
- Memory: WindowMemory, SummaryMemory
- FastAPI backend with SSE streaming
v0.3 (2026-02-25) — Web Tools + MCP
- Web tools (web_search, web_fetch) via Jina/Tavily/Brave
- File operations tool
- MCP client (standard tool integration)
- Tool auto-discovery + categories
- DAG visualization with click-to-scroll
- Code exec in Docker (`--network=none`)
v0.4 (2026-02-25) — Multi-Turn + Agents
- Multi-turn conversations (DbMemory)
- Tool step folding UI
- HTTP request + shell exec tools
- Agent management (create, configure, publish)
- JWT authentication
- Per-agent execution mode + temperature control
v0.5 (2026-02-28) — Full RAG + Grounded Gen
- Full RAG pipeline (embedding + vector store + FTS + RRF + reranker)
- Grounded Generation (citations, confidence scores)
- Knowledge base document management (CRUD, search, retry, schema migration)
- ContextGuard + pinned messages (token budget manager)
- DbMemory persistence + LLM Compact
- DAG Re-Planning (up to 3 rounds)
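The hybrid retrieval step in the v0.5 RAG pipeline fuses the vector-store ranking with the FTS ranking via Reciprocal Rank Fusion. A minimal sketch of the standard formula (k=60 is the conventional constant; the pipeline's exact weighting may differ):

```python
def rrf_fuse(rankings, k=60):
    """Reciprocal Rank Fusion: combine ranked doc-id lists (e.g. one from
    vector search, one from full-text search) into a single ordering.
    score(d) = sum over lists of 1 / (k + rank_of_d_in_list)."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)
```

The fused list then goes to the reranker, which sees a candidate pool that neither retriever could produce alone.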
v0.6 (2026-03-01) — Connector Platform
- Connector CRUD: create, read, update, delete
- ConnectorToolAdapter: converts Connector → BaseTool
- Per-user credentials: AES-GCM encryption
- Confirmation gate: write operation approval
- Audit logging: all tool calls recorded
- Circuit breaker: graceful degradation on failures
- Utility tools: email_send, json_transform, template_render, text_utils
- Embedding options: Jina, OpenAI, custom providers
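The circuit breaker above degrades gracefully by short-circuiting calls to a connector that keeps failing. A minimal sketch, assuming a consecutive-failure threshold and a cooldown before a half-open probe (thresholds and state names are illustrative, not the shipped implementation):

```python
import time

class CircuitBreaker:
    """Open after `threshold` consecutive failures, reject calls for
    `cooldown` seconds, then allow one probe call (half-open)."""
    def __init__(self, threshold=3, cooldown=30.0):
        self.threshold = threshold
        self.cooldown = cooldown
        self.failures = 0
        self.opened_at = None

    def allow(self):
        if self.opened_at is None:
            return True
        if time.monotonic() - self.opened_at >= self.cooldown:
            # half-open: permit a single probe; one more failure re-opens
            self.opened_at = None
            self.failures = self.threshold - 1
            return True
        return False

    def record(self, ok):
        if ok:
            self.failures = 0
            self.opened_at = None
        else:
            self.failures += 1
            if self.failures >= self.threshold:
                self.opened_at = time.monotonic()
```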
v0.7 (2026-03-06) — Admin Platform + Multi-Tenant
- Admin Platform: user management, role toggle, password reset, account enable/disable
- Invite-only registration: three modes (open/invite/disabled) + invite code CRUD
- Storage management: per-user disk usage, clear, orphan cleanup
- Conversation moderation: admin list/delete all
- Per-user force logout: revoke all tokens
- API health dashboard: system stats, connector metrics
- First-run setup wizard: guided admin account creation
- Personal Center: per-user global instructions, language preference
- JWT auth: token-based SSE auth, conversation ownership
- Global MCP servers: admin-provisioned, loaded in all sessions
- Backward-compat: registration_enabled → registration_mode auto-migration
v0.7.x (2026-03-07 to 2026-03-12) — Stability + Polish
- Invite code management
- Per-user quotas (429 enforcement)
- Structured audit logging
- Sensitive word filtering
- Admin login history
- Admin file browser
- Enhanced admin views (model_name, tools, kb_ids fields)
- Docker Compose deployment (single image, named volumes)
- OAuth auto-detection from window.location
- Extended thinking / reasoning support (`LLM_REASONING_EFFORT`, `LLM_REASONING_BUDGET_TOKENS`) for OpenAI o-series, Gemini 2.5+, Claude
- Admin per-tool enable/disable (disabled tools excluded from chat at runtime)
- MCP servers management moved to Connectors page
- Dual database support: SQLite (zero-config default) + PostgreSQL (production); Docker Compose auto-provisions PostgreSQL
- Models configuration documentation page with extended thinking setup per provider
- SSE Protocol v2: real-time answer streaming with `delta_reasoning` and `usage` fields, and split `done`/`suggestions`/`title`/`end` events; SQLite pool size 5 -> 20
- AI Builder expansion: 7 new builder tools (GetSettings, TestConnection, ImportOpenAPI for connectors; ListConnectors, AddConnector, RemoveConnector, SetModel for agents), `is_builder` flag on agents, builder prompt auto-refresh, SSRF guard
- SSE v2 frontend: streaming dot-pulse cursor, DAG re-plan round snapshots as collapsible cards, DAG layout decoupled from step states
- AI Builder concept documentation page with connector and agent builder guides
- Organization system: full CRUD with role-based membership (owner/admin/member), admin management UI
- Three-tier resource visibility (personal/org/global) for agents, connectors, knowledge bases, MCP servers
- Publish/unpublish API for all resource types; owner delegation for published agents
- Admin set-visibility endpoint (replaces clone-to-global); unified `build_visibility_filter()` query helper
- Database Connectors (Phase 1-3): direct SQL access to PG/MySQL/Oracle/SQL Server + Chinese legacy DBs; schema introspection, AI annotation, read-only query execution, encrypted credentials, 3 tools per connector (`list_tables`, `describe_table`, `query`)
- Evaluation Center: quantitative agent quality benchmarking — test dataset CRUD (prompt + expected behavior + assertions), eval runs (parallel execution + LLM grader + per-case pass/fail/latency/token results), results viewer with auto-polling; migration r8t0v2x4z567
- Three model roles (General/Fast/Reasoning) with per-tier env config isolation; fast model no longer inherits main model settings
- `StepOutput` dataclass replacing plain string step results for structured data and artifact passing
- Tool cache for DAG execution — identical tool calls cached per-run with async lock stampede prevention (`DAG_TOOL_CACHE`)
- Per-step LLM verification with 1 retry on failure (`DAG_STEP_VERIFICATION`)
- Auto-routing: fast LLM classifies queries as ReAct or DAG; `/api/auto` endpoint; frontend 3-way mode toggle (`AUTO_ROUTING`)
- Shadow Market Organization + Resource Subscriptions: Built-in Market org (shadow, no auto-join) replaces Platform org; resources discovered via marketplace browsing and explicitly subscribed (pull model); Market API for subscribing to shared resources; publish-to-Market always requires review; resource subscriptions table; org-based resource sharing replacing global visibility
- Agent Auto-discovery and Sub-agent Binding: `discoverable` flag on agents; `sub_agent_ids` whitelist; CallAgentTool for delegating tasks to specialist agents
- MCP Server Credentials + Per-User Override: `mcp_server_credentials` table; `PUT /api/mcp-servers/{id}/my-credentials` endpoint; `allow_fallback` flag for credential fallback behavior
- Connector/KB Toggle: `POST /api/connectors/{id}/toggle` and `POST /api/knowledge-bases/{id}/toggle` for suspending/resuming resources
- Standalone KB Conversations: `kb_ids` field on conversations for direct KB chat without agent binding
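The DAG tool cache relies on per-key async locks so that parallel steps issuing the identical call wait for one execution instead of stampeding. A minimal sketch with illustrative names (the shipped `DAG_TOOL_CACHE` implementation may differ):

```python
import asyncio

class DagToolCache:
    """Per-run cache for identical tool calls; one asyncio.Lock per key
    prevents a thundering herd when concurrent DAG steps request the
    same call before the first finishes."""
    def __init__(self):
        self._results = {}
        self._locks = {}

    async def call(self, tool_name, args_key, fn):
        key = (tool_name, args_key)
        lock = self._locks.setdefault(key, asyncio.Lock())
        async with lock:
            if key not in self._results:  # re-check inside the lock
                self._results[key] = await fn()
        return self._results[key]
```

Five parallel steps calling the same tool thus produce exactly one upstream invocation.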
v0.8 (2026-03-20) — Connector Declarative Config + Progressive Disclosure
- Database connectors: direct SQL access (PostgreSQL, MySQL, Oracle) (shipped in v0.7.x — Phase 1-3)
- RBAC: per-user/role connector access control (shipped in v0.7.x — org system + three-tier visibility)
- Connector credential encryption + per-user override: `connector_credentials` table, Fernet encryption via `CREDENTIAL_ENCRYPTION_KEY`, `allow_fallback` flag, `GET/PUT/DELETE /my-credentials` endpoints, per-user credential resolution in chat tool loading
- Publish review UI: Org-level publish review system — review toggle per org, ReviewsSheet with approve/reject workflow, status badges on resource cards, review notice in publish dialog, resubmit for rejected resources
- Connector Progressive Disclosure (Phase 1-2): single `ConnectorMetaTool` replaces per-action tools; system prompt receives lightweight stubs only (name + 1-line description, ~30 tokens/connector vs ~250 tokens/action); agent calls `discover(connector)` to load full action schema on demand — schema only loads when the model selects a connector, keeping the prompt prefix stable for caching; `execute` subcommand; feature flag for backward compatibility. Follows the deferred tool-loading pattern common in modern agent frameworks.
- Agent Skill System + Compact Instructions: On-demand skill loading for agent instructions — `Skill` model (name, content/SOP, optional scripts) attached to agents; referenced in system prompt by name only (~10 tokens/skill); agent calls `read_skill(name)` to load full content on demand. Reduces per-conversation instruction token cost by ~80% while allowing richer SOP libraries. Counterpart to ConnectorMetaTool's progressive disclosure applied at the instruction level. Enables the "instructions + tools + skills" (指令 + 工具 + 技能) differentiation story. Also adds `compact_instructions` field to the Agent model — a per-agent compression priority list injected into `ContextGuard` when compacting (e.g., "preserve order IDs and amounts, drop raw API responses"), replacing the current static generic prompt. Follows the Compact Instructions convention widely adopted in modern agent frameworks.
- Connector import/export: share connector templates
- Connector fork: clone + customize existing connectors
- Workflow Phase 2 Nodes: Iterator, Loop, VariableAggregator, ParameterExtractor, ListOperation, Transform, DocumentExtractor, QuestionUnderstanding, HumanIntervention — 9 advanced node types with full frontend + backend + 150 new tests (275 total). Node retry with exponential backoff, safe expression evaluation. Stats panel with success rate bar. 12 built-in templates. Pane context menu (Paste, Select All, Fit View, Auto Layout).
- Workflow Phase 3 Nodes: SubWorkflow + ENV — 2 new node types (25 nodes total), 14 new tests (306 total), 14 built-in templates. SubWorkflow: full DB-backed nested workflow executor with target workflow selection, variable mapping, and configurable depth limit to prevent infinite recursion. ENV: reads encrypted environment variables with key picker and fallback defaults. Full frontend (node components, config panels, palette entries, minimap colors). Per-node execution statistics panel (success rates, durations, failure counts sorted worst-first). `getNodeStats` API client + `NodeStatEntry` type. Keyboard shortcuts dialog (`?` key).
- Workflow Scheduled Triggers: Per-workflow cron configuration with timezone, default inputs, and next-run-at calculation. Preset cron buttons, 30 trigger tests.
- Workflow API Triggers: Public per-workflow API keys (`wf_` prefix) for external execution without user auth, with rate limiting. API key management dialog with generate/regenerate/revoke, trigger URL, and cURL/JS examples.
- Workflow Batch Execution: `POST /batch-run` with up to 100 input sets, configurable parallelism (1-10), collapsible per-item results, JSON export. 14 batch execution tests.
- Workflow Execution Log Viewer: Real-time chronological SSE event stream in the run panel with timestamps, color-coded badges, and event type filter toggles.
- Workflow Run Stats: Backend batch-fetches run counts and success rates via GROUP BY subquery; frontend displays stats on workflow cards with color-coded success rate indicators.
- Workflow Scheduler Daemon: Background async service polling every 60s for due cron-based workflows. Croniter timezone support, semaphore concurrency, `last_scheduled_at` tracking, webhook delivery. 14 tests.
- Workflow Import Conflict Resolver: Detects unresolved agent/connector/KB/MCP references during import. Batch DB queries with visibility filtering, frontend toast warnings. 17 tests.
- Workflow Test-Node Execution: Isolated single-node testing with mock variables, integrated into editor (config panel Test button + context menu). 23 tests.
- Workflow Version Diff: Side-by-side blueprint comparison with node/edge change detection, color-coded indicators (added/removed/modified).
- Workflow Run Management: Delete individual runs (`DELETE /runs/{run_id}`) and clear all completed runs (`DELETE /runs`), with frontend confirmation dialogs.
- Workflow Run Replay Overlay: "View on Canvas" button in run history to overlay past execution results on the canvas, showing per-node status and output without re-executing.
- Workflow Favorites/Pinning: Star/pin workflows to the top of the list with localStorage persistence.
- Workflow Run History Export: Export run history as JSON file download with full run metadata and per-node results.
- Admin Workflows Management: Admin panel tab for managing all workflows across users — list, toggle active/inactive, delete with confirmation. Batch endpoints for delete, toggle, and publish with audit logging.
- Workflow Templates System: `WorkflowTemplate` ORM model with admin CRUD, public listing/clone API, and 5 seed templates auto-inserted on first startup.
- Workflow Inline Validation Badges: Real-time per-node `ValidationBadge` on canvas with error/warning tooltips for immediate visual feedback during editing.
- Workflow Execution Trace Viewer: Timeline-based trace viewer Sheet with engine `trace_level` parameter and per-node variable snapshots for step-through debugging.
- Workflow Rate Limiting and Timeout: Per-user `WorkflowRateLimiter` (sliding window 10 runs/min, 3 concurrent) and default 10-minute global run timeout.
- Workflow Blueprint System: Visual workflow editor for designing and executing multi-step automation blueprints — `Workflow`/`WorkflowRun` ORM models, full CRUD + SSE execution API, import/export, duplicate, blueprint validation endpoint, `WorkflowEngine` with topological sort + semaphore-based concurrency + condition branching and 12 node types (Start, End, LLM, ConditionBranch, QuestionClassifier, Agent, KnowledgeRetrieval, Connector, HTTPRequest, VariableAssign, TemplateTransform, CodeExecution), `VariableStore` with `{{node_id.output}}` interpolation and `env.*` namespace, error strategies per node (STOP_WORKFLOW / CONTINUE / FAIL_BRANCH) with per-node timeout and advanced config UI, React Flow v12 visual editor with drag-and-drop palette + node config panel + variable picker combobox + add-node-on-edge + auto-layout (ELK.js) + run history sheet, Dify-style compact node design with ring-based run status styling and animated edge transitions, 4 built-in starter templates (Simple LLM Chain, Conditional Router, Knowledge-Augmented QA, HTTP API Pipeline) with template picker dialog and `GET /templates` + `POST /from-template` API, stats endpoint, `?run=true` URL param auto-open, subprocess-based code execution security, 105-test suite (templates, eval namespace flattening, blueprint validation warnings, node/edge deletion, import/export/duplicate, deadlock detection, multi-condition branching)
- Operation audit: detailed logging of who did what — admin review log audit tab added (publish review trail per org/resource)
- Semantic Schema Annotations: extend connector schema fields with `semantic_tag`, `description`, and `pii` flags; annotations surfaced in LLM tool descriptions so the agent understands field intent without guessing from column names
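The `VariableStore` interpolation named in the Workflow Blueprint item can be sketched as a small regex substitution (simplified; the real store also resolves the `env.*` namespace and non-string values):

```python
import re

def interpolate(template: str, outputs: dict) -> str:
    """Resolve {{node_id.output}}-style references against prior node
    results, e.g. {"llm1": {"output": "..."}}."""
    def repl(match):
        node_id, field = match.group(1), match.group(2)
        return str(outputs[node_id][field])
    return re.sub(r"\{\{\s*(\w+)\.(\w+)\s*\}\}", repl, template)
```

Each node's config strings are passed through this before execution, so downstream nodes see the upstream results inlined.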
v0.8.1 (2026-03-29) — Progressive Disclosure Maturity + ReAct Hardening
- Progressive disclosure for DB connectors (`DatabaseMetaTool`), MCP servers (`MCPServerMetaTool`), and on-demand tool loading (`request_tools` meta-tool)
- DAG quality overhaul (5 improvements: model upgrade, skill auto-discovery, citation verifier, structured content preservation, domain-aware routing)
- Domain model escalation in ReAct (specialist domains auto-escalate to reasoning model)
- Per-model Native Function Calling toggle (`tool_choice_enabled`)
- ReAct cycle detection (deterministic duplicate tool call prevention)
- ReAct completion checklist (pre-answer verification when tools were used)
- Resource Fork Phase 1 (MCP Server + Skill fork endpoints with lineage tracking)
- Workflow Connection Dep Auto-Subscribe (recursive sub-workflow dependency resolution)
- Prebuilt Solution Templates (8 vertical solutions seeded to Market on first registration)
- Admin notification improvements (timezone-aware, master switch, SMTP Reply-To)
- Per-turn token budget circuit breaker (`REACT_MAX_TURN_TOKENS`)
- Centralized tool truncation, dynamic system prompt budgeting
- File attachment download, duplicate message submission fix
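Deterministic cycle detection amounts to fingerprinting each tool call and capping repeats. A minimal sketch with illustrative names (the shipped threshold and reaction on detection may differ):

```python
import hashlib
import json

def call_fingerprint(tool_name, args):
    # Deterministic: same tool + same canonicalized args -> same hash
    payload = json.dumps({"tool": tool_name, "args": args}, sort_keys=True)
    return hashlib.sha256(payload.encode()).hexdigest()

class CycleDetector:
    """Flag the turn when the agent repeats an identical tool call
    more than `threshold` times."""
    def __init__(self, threshold=2):
        self.threshold = threshold
        self.counts = {}

    def check(self, tool_name, args):
        fp = call_fingerprint(tool_name, args)
        self.counts[fp] = self.counts.get(fp, 0) + 1
        return self.counts[fp] <= self.threshold
```

Because the check is a pure function of (tool, args), it cannot be argued around by the LLM the way prompt-based anti-loop instructions can.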
v0.8.2 (2026-04-10) — Agent Core Hardening + Vision Documents
- Agent Core Phase 0 — Compact prompt upgraded to 9-section structured format; empty tool result protection (descriptive message instead of `(no output)`); anti-loop prompt + cycle detection threshold lowered to 2; domain classifier + pre-flight DB config resolution parallelized (400–1100 ms saved per request); SSE `end` event sent immediately after answer, with title/suggestions moved to background tasks
- Agent Core Phase 1 (Context Anti-Bloat) — `MicroCompact` rule-based old tool result cleanup (keep last 6); `REACT_TOOL_RESULT_BUDGET=40000` aggregate cap; reactive compact on context overflow (auto-compact to 50% budget and retry instead of crashing)
- Agent Core Phase 2 (Speed) — Keyword-based tool pre-selection (skips LLM call on obvious matches, 200–500 ms saved); `SharedHttpClient` LLM connection pooling; completion check skipped for answers >200 tokens; `FallbackLLM` wraps primary+fast with automatic failover on 429/503/529/connection errors
- Intelligent Document Processing (Vision-Aware) — Adaptive document handling: PDF pages rendered as images via PyMuPDF for vision-capable models (GPT-4o, Claude 3/4, Gemini), text-only fallback via pdfplumber. Per-model `supports_vision` flag. Modes via `DOCUMENT_PROCESSING_MODE`, `DOCUMENT_VISION_DPI`, `DOCUMENT_VISION_MAX_PAGES`. DOCX/PPTX embedded image extraction. Multi-turn vision persistence across conversation turns. Smart PDF processing (text-rich pages extract text + images; scanned pages render as full-page PNG). Pre-built sandbox image (Dockerfile.sandbox) with common data-science packages for `--network=none` code execution
- Resource Fork completion — Agent / Connector / Workflow fork endpoints added, completing the five-type lineage tracking (KB fork removed — inherently user-local)
- File integrity guardrail — System prompt rule prevents the agent from substituting unrelated file contents when a target file is unreadable; uploaded files now include `file_id` in message context for direct `read_uploaded_file` access
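The `MicroCompact` keep-last-6 rule can be sketched as a pure function over the message list (simplified message shape; the real rule also interacts with the aggregate `REACT_TOOL_RESULT_BUDGET` cap):

```python
def micro_compact(messages, keep_last=6):
    """Replace tool-result contents older than the last `keep_last`
    with a short stub, preserving order and roles so the transcript
    stays structurally valid."""
    tool_idx = [i for i, m in enumerate(messages) if m["role"] == "tool"]
    stale = set(tool_idx[:-keep_last]) if len(tool_idx) > keep_last else set()
    return [
        {**m, "content": "[result compacted]"} if i in stale else m
        for i, m in enumerate(messages)
    ]
```

Being rule-based (no LLM call), it can run on every turn with negligible latency, unlike a full summarizing compact.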
v0.8.3 (2026-04-16) — Universal Document Conversion + Agent Core Phase 3
- Universal Document Conversion (`convert_to_markdown` + OCR) — Built-in Agent tool wrapping Microsoft MarkItDown; converts PDF, Word, Excel, PowerPoint, HTML, JSON, CSV, XML, ZIP, EPUB, Outlook .msg, images, audio, YouTube URLs to Markdown. `LiteLLMOpenAIShim` enables OCR via any vision-capable LLM (Claude, Gemini, Bedrock, Azure). Vision-aware RAG ingestion with zero-regression text-only fallback. `LLM_SUPPORTS_VISION` env var for opt-out
- Agent Core Phase 3 (Runtime Invariant Hardening) — Conversation recovery (dangling `tool_use` auto-repair); structured compact work card (`WorkCard` typed merge across compaction rounds); turn-level profiler (`REACT_TURN_PROFILE_ENABLED`); per-user rate limiting (`LLM_RATE_LIMIT_PER_USER`); empty-content assistant message with `tool_calls` no longer dropped
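Dangling `tool_use` auto-repair appends synthetic results for tool calls that never received one, so the next request is well-formed. A minimal sketch over an OpenAI-style message shape (illustrative; the shipped repair logic is more involved):

```python
def repair_dangling_tool_use(messages):
    """If a turn was interrupted after the assistant emitted tool calls
    but before any results were recorded, append a synthetic tool result
    per call so the provider does not reject the transcript."""
    repaired = list(messages)
    if repaired and repaired[-1]["role"] == "assistant":
        for call in repaired[-1].get("tool_calls", []):
            repaired.append({
                "role": "tool",
                "tool_call_id": call["id"],
                "content": "[interrupted: no result recorded]",
            })
    return repaired
```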
v0.8.4 (2026-04-17) — Prompt Cache + Reasoning Correctness
- System prompt section registry with cache breakpoints — Memoized `PromptRegistry` splits system prompts into stable prefix + dynamic suffix; cache-capable providers (Claude, Bedrock Anthropic, Vertex Claude) receive `cache_control: {"type": "ephemeral"}` on the prefix for ~60-80% per-turn input token savings. Non-cache providers get a single concatenated message (zero behavior change)
- Prompt cache observability — `cache_read_input_tokens` and `cache_creation_input_tokens` tracked through `UsageSummary` → `TurnProfiler` → `done_payload.cache` field. Structured `turn_cache` log line per turn. Doubles as relay cache-honesty probe
- Conversation recovery MVP — Synthetic `tool_result` rows persist after interrupted turns; `POST /chat/resume` replays cached SSE events from a monotonic cursor; frontend `useSseResume` hook auto-reconnects with exponential backoff (300ms → 1s → 3s, max 3 attempts) and "Reconnecting…" indicator
- Thinking-block persistence with signature — `reasoning_content` + Anthropic `signature` persisted in `metadata_["thinking"]` and replayed on subsequent turns; fixes HTTP 400 signature mismatch on Claude 4 multi-turn conversations
- Provider-aware reasoning replay policy — Centralized `reasoning_replay_policy()` in `core/prompt/reasoning.py` gates serialization per provider family: Claude replays thinking blocks with signature; DeepSeek-R1/Qwen-QwQ/Gemini-thinking/o-series drop `reasoning_content` on outbound (previously leaked, breaking provider KV caches and violating API docs)
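The stable-prefix/dynamic-suffix split can be sketched as follows. The content-block shape with `cache_control` follows Anthropic's prompt-caching convention; the provider names and function are illustrative, not the shipped `PromptRegistry` API:

```python
CACHE_CAPABLE = {"claude", "bedrock_anthropic", "vertex_claude"}

def build_system_blocks(stable_prefix: str, dynamic_suffix: str, provider: str):
    """Cache-capable providers get the stable prefix marked as an
    ephemeral cache breakpoint; others get one concatenated string
    (zero behavior change for them)."""
    if provider in CACHE_CAPABLE:
        return [
            {"type": "text", "text": stable_prefix,
             "cache_control": {"type": "ephemeral"}},
            {"type": "text", "text": dynamic_suffix},
        ]
    return [{"type": "text", "text": stable_prefix + "\n" + dynamic_suffix}]
```

The savings depend on the prefix being byte-stable across turns, which is exactly why the dynamic parts (dates, per-turn state) must live in the suffix.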
v0.8.5 (2026-04-23) — Channel Integration + Hook System + Contributor i18n
- Feishu Channel (Phase 1 subset) — Org-scoped `Channel` resource with Fernet-encrypted credentials; `FeishuChannel` supports interactive card send + callback (signature verification + URL challenge); Settings → Channels management UI (list, create/edit with dirty-state protection, details with copyable callback URL, test-send); CRUD API (`/api/channels`) and event callback endpoint (`/api/channels/{id}/callback`). Shipped early for 2026-04-24 roadshow
- Agent Hook System (live in ReAct + DAG runtime) — `PreToolUseHook`/`PostToolUseHook` abstraction in `src/fim_one/core/hooks/`; agents declaring `hooks.class_hooks` in `model_config_json` have hooks instantiated and registered per chat session. First consumer `FeishuGateHook` posts an Approve/Reject card to the linked Feishu group when an agent calls a `requires_confirmation=True` tool, blocks execution, and resumes or aborts based on verdict
- Configurable confirmation gate (inline OR channel) — Every agent gets an Approval section with three routing modes (Auto / Inline only / Channel only), approver-scope selector (initiator / owner / anyone in org), per-tool override, and explicit approval-channel picker. Auto mode gracefully falls back to an inline approval card when no channel is linked. `POST /api/confirmations/{id}/respond` shares a single decision-recording path with the Feishu webhook
- Per-agent task completion notifications — Long-running ReAct or DAG agents can push a summary card to the org's channel when a task finishes. First consumer of the generic outbound notification pattern
- Hook Approval Playground — Channels details sheet has a "Test Approval Flow" action that exercises the full production path (genuine `ConfirmationRequest` row, real Feishu callback, status transitions) — same code path a production hook uses
- Contributor-friendly i18n CI fallback — `.github/workflows/i18n-sync.yml` translates EN → ZH/JA/KO/DE/FR on master after PR merge and auto-commits with `[skip ci]`; contributors no longer need `LLM_API_KEY` locally. Pre-commit locale-edit guard refuses manual edits to generated locale files (`ALLOW_LOCALE_EDIT=1` override for legitimate translation fixes). End-to-end verified via smoke-test push
- Exa integration docs — Dedicated Integrations section with a first-class Exa page covering the full Exa search surface (neural / fast / deep-reasoning / instant), filtering, content retrieval, and three tuned presets
- Xinchuang (信创) database support — Database Connector now lists KingbaseES (人大金仓), HighGo (瀚高), and DM8 (达梦) alongside PostgreSQL/MySQL. PG-compatible drivers reuse `asyncpg`; DM8 uses `dmPython`. `scripts/test_xinchuang_dbs.py` verifies live connectivity from the CLI
- Channels + Hook System architecture docs — `docs/architecture/hook-system.mdx` explains the three hook points and walks through FeishuGateHook end-to-end; existing architecture pages cross-link; README lists Messaging Channels as a first-class capability
- Hardening — Duplicate Feishu callback clicks produce a replacement card instead of double-deciding; concurrent callback clicks resolved via conditional `UPDATE ... WHERE status='pending'` rowcount check; pending approvals auto-expire after `CHANNEL_CONFIRMATION_TTL_MINUTES` (default 24h) via background sweeper; Settings → Channels respects org role (members see read-only UI); parallel tool-call aggregator handles providers that reuse `index=0` for every delta; session-expiry redirect preserves query string
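The conditional-UPDATE hardening is what makes concurrent approval clicks safe: whichever click matches the `status='pending'` row first wins, and the loser matches zero rows. A minimal runnable sketch with SQLite standing in for the real database (table and column names are illustrative):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE confirmations (id TEXT PRIMARY KEY, status TEXT)")
conn.execute("INSERT INTO confirmations VALUES ('c1', 'pending')")

def decide(conn, confirmation_id: str, verdict: str) -> bool:
    # The WHERE status='pending' guard makes the decision atomic:
    # a duplicate or concurrent click matches zero rows and is ignored.
    cur = conn.execute(
        "UPDATE confirmations SET status = ? WHERE id = ? AND status = 'pending'",
        (verdict, confirmation_id),
    )
    return cur.rowcount == 1
```

Checking `rowcount` instead of doing a read-then-write avoids the race entirely, with no explicit locking.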
Planned Versions
v0.8.6 — Channel & Hook Polish
Goal: Close loose ends from the v0.8.5 Channel + Hook rollout before the v0.9 production-hardening wave lands. Scope is intentionally narrow — polish, not new capability.

- Per-hook config pass-through — `class_hooks` entries today are bare strings; to override `FeishuGateHook.timeout_seconds`, `poll_interval_seconds`, or `callback_base_url` per-agent, the schema needs to accept `{"name": "feishu_gate", "config": {...}}` objects that get forwarded as kwargs to the hook factory. Low-risk follow-up; current defaults (120s timeout / 1.5s poll / env-var callback URL) are acceptable in the meantime.
- DAG `tools_used` accuracy — the completion notification card currently derives `tools_used` from `plan.steps[*].tool_hint` (the planner's suggestion), not the real tool names the per-step ReAct loops chose. Plumb the actual chosen tool names out of the DAG executor's step-completion callback so notification cards reflect what was actually run.
- Hook inheritance policy for delegated sub-agents and Workflow AGENT nodes — today `CallAgentTool` children and Workflow `AGENT` nodes create fresh ReActAgents that do not inherit the parent's hook registry, so a sensitive tool call reached via delegation silently bypasses `feishu_gate`. Decide and document: do child agents inherit (default-secure, prevents gate bypass) or isolate (lets teams delegate non-approval-gated work to a child agent)? Eval Center runs stay opt-out by design (automation cannot block on human approval).
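The per-hook config pass-through is mostly a schema normalization on the accepting side. A minimal sketch (names from the item above; the factory wiring itself is illustrative):

```python
def normalize_class_hooks(entries):
    """Accept both legacy bare strings and {"name", "config"} objects,
    returning a uniform (name, config) list to feed the hook factory."""
    normalized = []
    for entry in entries:
        if isinstance(entry, str):
            normalized.append((entry, {}))  # legacy form, default config
        else:
            normalized.append((entry["name"], entry.get("config", {})))
    return normalized
```

Keeping the string form valid means existing `model_config_json` payloads need no migration.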
v0.9 — Observability + Production Hardening
Goal: Production-grade operations, debugging, and monitoring. Introduces the Hook System — a deterministic enforcement layer that sits below agent instructions and cannot be overridden by the LLM.

- Connector Progressive Disclosure (Phase 3-4): unified `ConnectorExecutor` interface (API/DB/MCP behind one abstraction); action parameter validation with `jsonschema`; protocol-agnostic discover/execute
- YAML/JSON connector config: platform auto-generates MCP server
- Database connectors Phase 4: enterprise drivers — Oracle (`oracledb`), SQL Server (`aioodbc`), [x] DM8 (达梦, native `dmPython`), [x] KingbaseES (人大金仓) + HighGo (瀚高) (PG-compatible, reuse `asyncpg`), GBase (南大通用, `aioodbc` + GBase ODBC)
- IM Channel Integration (Bidirectional): Phase 1 — Outbound push: Lark, WeCom, Slack, Email, Teams notification actions from Agent/Workflow results. Phase 2 — Inbound trigger: users @mention the Agent in IM group chats to trigger tasks without opening Portal; webhook receiver per channel; each IM channel modeled as a Connector with bidirectional actions (send + receive). Hub mode killer feature
- Feishu Channel (Phase 1 subset) — Org-scoped `Channel` resource with Fernet-encrypted credentials; `FeishuChannel` implementation supporting interactive card send + callback (signature verification + URL challenge); integrates with the confirmation gate so write-operation approvals land as Approve/Reject cards in the configured Feishu group chat; Settings → Channels management UI. Shipped early for 2026-04-24 roadshow. Remaining: WeCom, Slack, Email, Teams outbound + Phase 2 inbound triggers.
- Outbound notification patterns (generic, reusable across channel types): The same `BaseChannel` abstraction supports a catalog of outbound use cases beyond approval gating. Each pattern is implemented once against `BaseChannel` and automatically works for every concrete channel (Feishu today; Slack/WeCom/Teams/Email as they land).
  - Task completion notification: when a long-running DAG / Workflow / scheduled agent finishes, post a summary card to the org channel (or a user-chosen channel) with result snippet + artifact links. The minimum viable "Channel outbound" product — first consumer after `FeishuGateHook`
  - Exception / failure alerts: agent inference fails, an LLM provider errors out, a connector raises a 5xx, a DAG plan exhausts its re-plan budget → push a diagnostic card to the ops channel with trace ID and retry affordance
  - Cost / budget warnings: per-user or per-agent token/request budget reaches 80% / 100% thresholds → push to the admin channel (or @-mention the agent owner) with current usage vs. cap
  - Scheduled digests: agents or workflows emit periodic (daily / weekly) summary cards — KPI rollup, open-ticket triage, contract renewal list — directly into the channel without requiring users to open Portal
  - Escalation on stuck agent: agent has made no observable progress for N consecutive iterations (cycle detected, or `max_iterations` exceeded) → post a card asking a human operator to take over, with current context and a "resume with your note" action
  - Audit receipts for sensitive tool calls: independent of the approval gate — every call to a tool tagged `audit=true` emits a read-only receipt card to the compliance channel (who / what / when / args), providing a durable audit trail outside the database
  - Approval escalation: if a `feishu_gate` approval card receives no response within N minutes, automatically @-mention the group owner or forward the card to a higher-tier channel; configurable per tool/connector
- Task completion notification: when a long-running DAG / Workflow / scheduled agent finishes, post a summary card to the org channel (or a user-chosen channel) with result snippet + artifact links. The minimum viable “Channel outbound” product — first consumer after
- Feishu Channel (Phase 1 subset) — Org-scoped
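The “implement once against `BaseChannel`” claim can be illustrated with a minimal sketch. The interface names here (`BaseChannel.send_card`, `Card`, `InMemoryChannel`) are illustrative assumptions, not the shipped API:

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass

@dataclass
class Card:
    """Hypothetical card payload; the real schema is channel-specific."""
    title: str
    body: str
    actions: list[str]

class BaseChannel(ABC):
    """Assumed minimal surface: one outbound primitive every channel implements."""
    @abstractmethod
    def send_card(self, card: Card) -> None: ...

def notify_task_completion(channel: BaseChannel, task_name: str,
                           result_snippet: str, artifact_links: list[str]) -> Card:
    """Task-completion pattern written once; works for any concrete channel."""
    card = Card(
        title=f"Task finished: {task_name}",
        body=result_snippet + "\n" + "\n".join(artifact_links),
        actions=["Open run"],
    )
    channel.send_card(card)
    return card

class InMemoryChannel(BaseChannel):
    """Stand-in for FeishuChannel/SlackChannel, useful in tests."""
    def __init__(self) -> None:
        self.sent: list[Card] = []
    def send_card(self, card: Card) -> None:
        self.sent.append(card)
```

Because the pattern only depends on the abstract send primitive, adding a new channel type requires no changes to any notification pattern.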
Connector Authorization Layers (Data-level RBAC)
Existing RBAC controls resource visibility (who can see a Connector), not execution-time authorization (what the caller can do through it). When an admin wires up a DB/API with a high-privilege credential, every org member using that Connector inherits the admin’s blast radius. This subsection closes the gap across three distinct upstream capability tiers:

- Tier 1 — DB mode (full-admin + basic/legacy): admin supplies a single DB credential (root or a least-privileged service account) because most legacy systems cannot issue per-user DB accounts. Enforcement happens above the connection via `ConnectorScopeGuard` — a `PreToolUse` hook that filters `query`/`execute` calls per caller identity. Capabilities: block destructive verbs (DROP, TRUNCATE, unscoped DELETE/UPDATE); table-level allow/deny lists; column redaction driven by `pii=true` semantic annotations; auto-injection of scope predicates (e.g., append `AND dept_id = :caller_dept` to every `SELECT`). Config is a `scope_rules` JSON field on the Connector with role-based matching; default is deny-if-ambiguous.
- Tier 2 — Open API mode (per-user API key): the preferred path. Users bring their own API key (shipped in v0.8 — `connector_credentials` + `allow_fallback=false`); the upstream system’s native authz enforces scope naturally. Remaining work: per-connector admin UI to require per-user credentials (disable admin-fallback globally) and a health dashboard showing which org members still haven’t bound their own key.
- Tier 3 — Middle tier (login-ticket exchange): for frontend/backend-split systems with no user-scoped API key. Call the system’s login endpoint with user-provided credentials, cache the returned short-lived ticket, auto-refresh on expiry. New `LoginTicketCredential` type alongside API key / OAuth; connector spec declares `auth_type: login_ticket` with `login_endpoint`, `ticket_field`, and `refresh_strategy`. Adapter layer converts the ticket into the outbound auth header per request.
- Cross-tier auditability: every tool call stamped with `caller_user_id`, `effective_credential_source` (user / admin-fallback / ticket), and `scope_rules_applied` in `ConnectorCallLog`, so ops can answer “who actually ran what as whom” after an incident.
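A rough sketch of the Tier-1 idea: block destructive verbs and unscoped writes, enforce a table denylist, and append a scope predicate to reads. The function name and `scope_rules` keys are hypothetical, and a production `ConnectorScopeGuard` would need a real SQL parser rather than regexes:

```python
import re

# Naive patterns for illustration only; real enforcement needs a SQL parser.
DESTRUCTIVE = re.compile(r"^\s*(DROP|TRUNCATE)\b", re.IGNORECASE)
UNSCOPED_WRITE = re.compile(r"^\s*(DELETE|UPDATE)\b(?!.*\bWHERE\b)",
                            re.IGNORECASE | re.DOTALL)

def guard_query(sql: str, scope_rules: dict, caller: dict) -> str:
    """PreToolUse-style check: raise to block, or return a rewritten query."""
    if DESTRUCTIVE.search(sql) or UNSCOPED_WRITE.search(sql):
        raise PermissionError("blocked: destructive or unscoped write")
    for table in scope_rules.get("deny_tables", []):
        if re.search(rf"\b{re.escape(table)}\b", sql, re.IGNORECASE):
            raise PermissionError(f"blocked: table {table} is denied")
    # Auto-inject a scope predicate, e.g. "dept_id = {dept_id}" -> dept_id = 7
    predicate = scope_rules.get("scope_predicate")
    if predicate and sql.lstrip().upper().startswith("SELECT"):
        joiner = " AND " if " WHERE " in sql.upper() else " WHERE "
        return sql + joiner + predicate.format(**caller)
    return sql
```

Deny-if-ambiguous would extend this by raising whenever a statement cannot be confidently classified, rather than passing it through.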
Channel → Integration Promotion
Today Feishu is wired as a Channel + Connector pair — delivery pipe and API surface. Enterprise rollouts need a third role: Integration (the same third-party binding also provides SSO and org-graph sync). Landing in v0.9 because the existing Feishu binding already covers 3 of the 4 required facets, and the identity story unblocks Tier-2 authorization above (users can obtain their own upstream token at login instead of manually provisioning API keys).

- Channel → Integration model: promote `Channel` from “outbound-only delivery” into a `ThirdPartyIntegration` parent with three opt-in sub-capabilities — (a) Delivery (existing Channel behavior: send cards, gate confirmations); (b) Login (OIDC / custom SSO; “Login with Feishu” yields both a FIM session and the upstream platform token); (c) Org graph sync (mirror upstream departments/members into FIM org structure; scheduled or webhook-driven). Admins toggle each capability per binding.
- Feishu SSO as Integration capability: reuse the existing Feishu app binding (app_id/secret already on the Channel) to expose “Login with Feishu” to every user whose Feishu tenant is bound to an org. The token obtained at login becomes the user’s default credential for the Feishu Connector — removing the “go get your own API key” friction for Tier-2 enforcement.
- Org graph sync (Feishu → FIM org): pull Feishu departments + members into FIM; map Feishu tenant-admin / department-head / member roles to FIM owner/admin/member. Foundation for WeCom and DingTalk next, and for Kingdee / Yonyou / SAP ERP-OA adapters in v1.0.
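The promoted model with three opt-in sub-capabilities can be sketched as a simple per-binding record. Field names here are assumptions, not the shipped schema:

```python
from dataclasses import dataclass

@dataclass
class ThirdPartyIntegration:
    """Sketch of the Channel -> ThirdPartyIntegration promotion (names assumed)."""
    provider: str                  # e.g. "feishu"
    app_id: str
    delivery_enabled: bool = True  # (a) existing Channel behavior: cards, gates
    login_enabled: bool = False    # (b) SSO: FIM session + upstream platform token
    org_sync_enabled: bool = False # (c) mirror upstream departments/members

    def capabilities(self) -> set[str]:
        """Admins toggle each capability per binding."""
        caps = set()
        if self.delivery_enabled:
            caps.add("delivery")
        if self.login_enabled:
            caps.add("login")
        if self.org_sync_enabled:
            caps.add("org_sync")
        return caps
```

An existing Feishu Channel maps onto this model as a binding with only `delivery_enabled` set, which is what makes the promotion backward-compatible.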
Public API (Phase 2)
Phase 1 (shipped): API key authentication middleware, scope support, curated OpenAPI spec, Mintlify API Reference with interactive playground.

- Per-key rate limiting — Configurable requests/minute and requests/day limits per API key; `429 Too Many Requests` responses with `X-RateLimit-*` headers
- Per-key usage quota — Monthly token/request budgets with admin dashboard and threshold alerts
- Scope enforcement per endpoint — `require_scope("chat")` dependency on all protected endpoints; keys with `scopes=chat` can only access chat-related APIs
- API versioning (`/v1/...`) — Stable versioned API contract; deprecation headers for sunset endpoints
- Webhook callbacks — Register webhook URLs per API key; receive POST notifications for conversation completion, agent errors, and async task results
- SDK generation — Auto-generated Python and TypeScript client SDKs from OpenAPI spec; published to PyPI and npm
- Developer Portal — Interactive “Try it” panels in Mintlify docs; usage analytics visible to key owners
- API key rotation — One-click key rotation with grace period (old key valid for 24h after rotation)
- Batch / async API — `POST /api/batch` accepting up to 100 queries; returns a `batch_id` for polling results; useful for bulk KB queries or multi-agent orchestration
- Circuit breaker per external dependency — Prevent cascading failures when downstream LLM providers or connectors are unavailable; automatic fallback and recovery
Observability & Agent Runtime
- Agent Trace Layer (Observability++): Application-level run/trace/thread hierarchy for agent debugging — each conversation → `Trace`, each LLM call / tool call / DAG step → `Span` with input/output/tokens/timing. Frontend trace viewer with timeline and expandable LLM call payloads. This goes beyond OTel (infrastructure-level) to provide actionable agent-loop debugging for developers and enterprise clients. OpenTelemetry export as a data sink option. Modeled after LangSmith’s run/trace/thread concepts — the industry-validated pattern for agent observability.
- Metrics dashboard: latency, success rate, token usage, connector call analytics — per-agent, per-user, per-org breakdowns
- Circuit breaker: three-state machine (closed/open/half-open) with per-connector failure tracking, 5xx detection, and monitoring endpoints (shipped early — implemented in v0.8)
- Workflow run retention cleanup: background cleanup task with configurable max age and max count per workflow; per-workflow overrides; admin endpoint for manual trigger (shipped in v0.8.1)
- Workflow version change summaries: `compute_blueprint_diff()` auto-generates human-readable summaries from blueprint diffs on version save (shipped in v0.8.1)
- DAG quality overhaul: default model upgrade for non-fast steps; skill auto-discovery in planning; citation verifier for legal/medical/financial domains; structured content context preservation; domain classification in router with domain-aware model selection (shipped in v0.8.1)
- Domain model escalation in ReAct: specialist domains auto-escalate to reasoning model with mandatory web search and citation verification (shipped in v0.8.1)
- Per-model Native Function Calling toggle: `tool_choice_enabled` setting lets models skip forced tool selection and go directly to JSON Mode (shipped in v0.8.1)
- DatabaseMetaTool (Progressive Disclosure for DB connectors): single `database` meta-tool with `list_tables`/`discover`/`query` subcommands replaces 3N individual tools per database connector; configurable via `DATABASE_TOOL_MODE` env var (`progressive` default, `legacy` fallback) (shipped in v0.8.1)
- On-demand tool loading via `request_tools` meta-tool: when >12 tools are available after smart selection, LLM can dynamically load additional tools mid-conversation; works in both JSON and native function-calling modes (shipped in v0.8.1)
- MCPServerMetaTool (Progressive Disclosure for MCP): single `mcp` meta-tool with `discover`/`call` subcommands replaces N*M individual MCP tools; configurable via `MCP_TOOL_MODE` env var (`progressive` default, `legacy` fallback) (shipped in v0.8.1)
- Workflow Connection Dep Auto-Subscribe: Extend Market subscription cascade to auto-subscribe a Workflow’s connection dependencies (API Connectors, MCP Servers). Workflow nodes can reference Connectors, MCP Servers, Agents (which recursively reference more deps), and sub-Workflows — all connection deps in the full tree must be auto-subscribed on subscribe and cascade-cleaned on unsubscribe. More complex than Skill/Agent due to recursive sub-workflow resolution with cycle detection. Counterpart to Solution (Skill/Agent) connection dep auto-subscribe (shipped in v0.8.1)
- Workflow real executors: replaced MCP and BuiltinTool node executor stubs with full implementations (MCP server discovery + tool calling; ToolRegistry integration) (shipped in v0.8.1)
- Agent Hook System: A deterministic enforcement layer that runs outside the LLM loop — hooks execute automatically on tool events and cannot be bypassed by agent instructions. Three hook points: `PreToolUse` (validate / block before execution), `PostToolUse` (side effects after execution), `SessionStart` (inject dynamic context). Built-in hooks: auto-write `ConnectorCallLog` on every connector call (currently manual); block write operations when org is in read-only mode; auto-truncate oversized DB query results before they hit the agent; rate-limit per-connector call frequency. User-defined hooks: per-agent YAML config (`hooks:` field) declaring shell commands or Python callables triggered on matching tool events — a hook-based enforcement pattern shared across modern agent frameworks. Key design principle: hooks are for “must always happen” logic that should never depend on the LLM remembering to do it. Instructions say “record all calls”; hooks actually record them. Instructions say “don’t write in read-only mode”; hooks actually block it.
  - Hook System skeleton + FeishuGateHook — Class-based `PreToolUseHook`/`PostToolUseHook` abstractions wired below the ReAct loop; first concrete hook is `FeishuGateHook`, which intercepts tools flagged `requires_confirmation=True` and routes approval through the org’s Feishu Channel. Full hook lifecycle (user-defined YAML hooks, built-in audit/block/truncate hooks, `SessionStart`) remains v0.9 scope. Shipped early for the 2026-04-24 roadshow.
  - Hook Approval Playground — UI-driven round-trip tester: the Channels details sheet now launches a dialog that creates a real `ConfirmationRequest` row, sends the production card to Feishu, and polls the decision live. Exercises the exact code path a production hook would, making pre-rollout rehearsals and demos faithful.
  - Hook System live in ReAct + DAG runtime — `build_hook_registry_for_agent` resolves `agent.model_config_json.hooks.class_hooks` on every chat session; ReAct and DAG entry points both instantiate hooks before tool dispatch. Tool adapter exposes `requires_confirmation` as a public property so hook predicates can duck-type without adapter coupling. Paired with an end-to-end smoke script (`scripts/smoke_feishu_gate.py`) covering approve / reject paths through the real agent loop.
  - Per-hook config pass-through — `class_hooks` entries today are bare strings; to override `FeishuGateHook.timeout_seconds`, `poll_interval_seconds`, or `callback_base_url` per-agent, the schema needs to accept `{"name": "feishu_gate", "config": {...}}` objects that get forwarded as kwargs to the hook factory. Low-risk v0.8.5 follow-up; current defaults (120s timeout / 1.5s poll / env-var callback URL) are acceptable for v0.8.
- Agent Workspace (Persistent Agent Desktop): Three layers: (1) Tool Output Offloading — auto-save oversized tool responses (>8K chars) to `workspace://files`, return truncated preview + URI; builtin tools `read_workspace_file`, `list_workspace_files`, `write_workspace_file` for selective access and agent-generated artifacts. (2) Handoff Notes — `write_handoff(summary)` for context transitions across compression/session switches. (3) Workspace UI — frontend file browser panel per conversation (preview text/JSON/CSV, download, delete/rename); cross-session file retention; per-user storage quota integration. Extends `truncate_tool_output()` in adapters + `GET /api/conversations/{id}/workspace` endpoint
- Smart File Content Injection + `read_uploaded_file` tool: Small uploaded files (<32K chars) auto-inlined into LLM context; large files get metadata + tool hint. Dual-mode `read_uploaded_file` tool (paginated reading + regex search). `GET /api/files/{file_id}/content` endpoint, `.content` sidecar storage, `content_length` in file API responses
- Intelligent Document Processing (Vision-Aware): Adaptive document handling — PDF pages rendered as images via PyMuPDF for vision-capable models (GPT-4o, Claude 3/4, Gemini); text-only fallback via pdfplumber. Per-model `supports_vision` flag in Admin. Two modes (vision/text) configurable via `DOCUMENT_PROCESSING_MODE` env var. Smart PDF processing: text-rich pages extract text + embedded images (token-efficient), scanned pages render as full-page PNG. DOCX/PPTX embedded image extraction. Multi-turn vision persistence. Pre-built sandbox image (Dockerfile.sandbox). Cached page rendering with `.pages/` sidecar. (shipped in v0.8.2)
- Universal Document Conversion (`convert_to_markdown` + OCR) — Built-in Agent tool wrapping Microsoft MarkItDown, available to every agent by default. Converts PDF, Word, Excel, PowerPoint, HTML, JSON, CSV, XML, ZIP, EPUB, Outlook .msg, images, audio, YouTube URLs, and data URIs into clean Markdown in-conversation. Embedded images in Office files and scanned PDF pages are OCR’d automatically via the official `markitdown-ocr` plugin using the workspace’s vision-capable LLM (DB-first, ENV fallback). The same conversion kernel is shared by the RAG ingestion pipeline, so chat-time conversion and knowledge-base ingestion produce byte-identical Markdown. Non-OpenAI providers (Claude, Gemini, Bedrock, Azure) are supported transparently via a `LiteLLMOpenAIShim` that presents any FIM One LLM as an openai-SDK-shaped client, then routes through LiteLLM. Zero-regression fallback: when no vision model is available, text-only extraction continues exactly as before. (shipped in v0.8.3)
- Cross-session file management: file browser UI, storage quotas, auto-expiration cleanup
- Session-level file associations: track which files were used in which conversations
- Cross-session conversation recall: agent tools to list, search, and read past conversation history — `list_conversations`, `search_conversations` (keyword/regex across history), `read_conversation` (retrieve full thread). Enables “what did we discuss last time” and “find the API change I asked for last week” workflows. Paired with cross-session file management to form a complete long-term agent memory layer
- Sandbox hardening: v2 improvements to code execution isolation
- Performance testing: concurrent load benchmarks
- MCP Connection Pooling: per-request STDIO subprocess spawning doesn’t scale — 100 concurrent users = 100 subprocesses per MCP server. Pool STDIO connections with per-user env isolation (keyed by `(server_id, env_hash)`); SSE/HTTP transports share `httpx.AsyncClient` sessions. Target: ≤100 ms warm-start for pooled STDIO, O(1) HTTP connections per MCP server regardless of user count
- Internal Harness Benchmark — Built-in test suite for quantifying harness parameter changes (system prompt, self-reflection frequency, tool selection threshold) using EVAL CENTER infrastructure
- Workflow trigger identity observability: Add `ExecutionContext.trigger_source` (`"webhook" | "cron" | "manual" | "batch" | "sub"`) populated at each of the five `WorkflowEngine` instantiation sites (`trigger_workflow`, `run_workflow`, `batch_run_workflow`, scheduler, sub_workflow). Today those paths implicitly mix owner-identity (`wf.user_id` — webhook/cron) with caller-identity (`current_user.id` — manual/batch); the defaults are sensible but the convention is undocumented, so 401-class failures (“why did my cron not borrow my token?”) require reverse-engineering the trigger path. Surface `trigger_source` in connector call logs, run metadata, and the run panel so each run declares its own identity policy. Pairs with the `credential_policy` override below.
- ReAct Cycle Detection — Deterministic duplicate tool call detection with configurable threshold; prevents agent loops without relying on LLM self-awareness (shipped in v0.8.1)
- ReAct Completion Checklist — One-time pre-answer verification prompt when tools were used; reduces premature conclusions (shipped in v0.8.1)
- Agent Core Phase 3 — Runtime Invariant Hardening: four bug-fixes / observability upgrades addressing CC’s runtime invariants — (I.14) Conversation Recovery: detect and repair dangling `tool_use` blocks from interrupted turns in `DbMemory.get_messages()` to prevent HTTP 400 trajectory errors; (I.15) Structured Compact Work Card: parse 9-section compact output into a typed `WorkCard` and merge across successive compactions so errors/pending tasks survive multiple rounds; (I.16) Turn Profiler: per-turn phase-level timing logs (memory_load/compact/tool_schema_build/llm_first_token/llm_total/tool_exec) gated by `REACT_TURN_PROFILE_ENABLED`; (I.17) Per-user Rate Limiting: fix process-global rate limiter to per-user keyed buckets via `ContextVar` plumbing, gated by `LLM_RATE_LIMIT_PER_USER`. Prerequisite foundation for Agent Trace Layer. (shipped in v0.8.3)
- Conversation Recovery (partial — MVP): synthetic `tool_result` rows persisted as durable `role="tool"` DB rows; `POST /chat/resume` replays cached SSE events after a monotonic cursor; frontend playground auto-reconnects via `useSseResume` hook with exponential backoff (shipped in v0.8.4)
- System prompt section registry — memoized sections + cache breakpoints (MVP: ReAct only): `fim_one.core.prompt` module with `PromptRegistry`, `PromptSection`, `DYNAMIC_BOUNDARY`; ReAct JSON / native / synthesis paths split into static prefix + dynamic suffix; cache-capable providers receive `cache_control: {"type": "ephemeral"}` on the prefix. 91.9%–95.9% cache ratio on Claude 4 turns (shipped in v0.8.4)
- Thinking-block persistence with signature (correctness fix): assistant messages persist `reasoning_content` + `signature` in `metadata_["thinking"]` and replay on subsequent turns. Fixes Claude 4 HTTP 400 signature mismatch on multi-turn. `is_thinking_capable()` gate covers Claude 4, Anthropic proxies, DeepSeek R1, Qwen QwQ, o-series, GPT-5, Gemini 2.x thinking (shipped in v0.8.4)
- Provider-aware reasoning replay policy (modelless hardening): centralized `reasoning_replay_policy(model_id)` in `core/prompt/reasoning.py` returning `anthropic_thinking` / `informational_only` / `unsupported`. Single choke point at `OpenAICompatibleLLM._build_request_kwargs()`. 50-test cross-provider matrix with reverse assertions (shipped in v0.8.4)
- Prompt cache observability: `UsageSummary` tracks `cache_read_input_tokens` / `cache_creation_input_tokens`; `TurnProfiler.add_cache_hit()` emits a `turn_cache` log line per turn; `done_payload.cache` surfaces it to frontend. Doubles as an API relay cache-honesty probe (shipped in v0.8.4)
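The Agent Trace Layer hierarchy described above (conversation → `Trace`, each LLM call / tool call / DAG step → `Span` with input/output/tokens/timing) can be sketched with two small records plus a recording helper. The field set follows the text; everything else (method names, `record`) is illustrative:

```python
import time
import uuid
from dataclasses import dataclass, field

@dataclass
class Span:
    """One unit of agent work: an LLM call, a tool call, or a DAG step."""
    name: str
    kind: str            # "llm_call" | "tool_call" | "dag_step"
    input: str = ""
    output: str = ""
    tokens: int = 0
    started_at: float = 0.0
    ended_at: float = 0.0

@dataclass
class Trace:
    """One conversation's worth of spans (sketch; not the shipped schema)."""
    conversation_id: str
    trace_id: str = field(default_factory=lambda: uuid.uuid4().hex)
    spans: list[Span] = field(default_factory=list)

    def record(self, name: str, kind: str, fn, *args):
        """Run fn and capture its input/output/timing as a Span."""
        span = Span(name=name, kind=kind, input=repr(args),
                    started_at=time.monotonic())
        result = fn(*args)
        span.output, span.ended_at = repr(result), time.monotonic()
        self.spans.append(span)
        return result
```

An OpenTelemetry export sink would then map each `Span` onto an OTel span without changing this application-level model.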
Prompt Cache + Reasoning Follow-ups (from Batch A MVPs)
These items complete partial work shipped in Batch A (Conversation Recovery, System Prompt Registry, Thinking Blocks) and extend cache coverage to the remaining provider families.

- Gemini Context Cache Adapter: Google Gemini uses a separate REST API (`POST /v1beta/cachedContents` → returns `cacheName` → referenced via `cachedContent: "<cacheName>"` in subsequent `generateContent` calls) rather than the inline `cache_control` marker Anthropic uses. Requires a `GeminiCacheAdapter` with lifecycle management (pre-register prefix → reference cacheName → TTL-aware invalidation), integrated into the Gemini path of `OpenAICompatibleLLM` or LiteLLM’s Gemini provider. Read discount ~0.25×, minimum prefix 32,768 tokens (Gemini Pro) / 4,096 (Flash) — primary beneficiaries are long-context KB/RAG agents and document-heavy workflows.
- Prompt registry expansion to planner / verifier / domain classifier: extend the `PromptRegistry` + `DYNAMIC_BOUNDARY` pattern from ReAct to the remaining LLM call sites: `DAGPlanner`, `PlanAnalyzer`, `StepVerifier`, `CitationVerifier`, `DomainClassifier`, `ExecutionModeRouter`, `CompactUtils`. Currently these rebuild prompts from scratch on every invocation. Lower frequency than ReAct, so lower ROI, but completes the cache story.
- Per-agent `cache_ttl` config: let agent owners choose between `ephemeral` (5 min, default, cheap write) and `extended` (1 hour, 2× write cost but better for batch / scheduled workflows). Surface as a field on the Agent model and pass through `cache_control: {"type": "...", "ttl": "..."}` where supported.
- DAG step-level checkpoint table: the current A1 Conversation Recovery MVP persists synthetic tool_results and cached SSE events, but DAG intermediate step state lives in memory only. A new `dag_execution_step` table snapshots each step’s tool_calls, results, and artifact references so a mid-DAG disconnect can resume without re-executing completed steps. Paired with the frontend `useSseResume` hook for end-to-end continuity.
- Dedicated `tool_call_id` column on Message: today `tool_call_id` lives in `metadata_` JSON, requiring `json_extract(...)` / `::json->>` lookups for orphan-tool-use queries. For high-traffic deployments a first-class indexed column would let the recovery pass run O(log n) instead of O(n) scans. Low priority until scale demands it.
- Mid-stream thinking token reconstruction: current resume granularity is “next complete SSE event” — if the drop happens inside a thinking delta, the client restarts from the following event. Token-level resume would require re-emitting the in-flight thinking block’s buffered tokens. Niche improvement; only worth pursuing if users report thinking UX jitter on flaky connections.
- API relay cache-honesty probe: background tool (admin-triggered or scheduled) that sends two identical Claude requests through each configured relay, compares actual billed input vs `cache_read_input_tokens`, and flags relays that strip the `cache_control` marker or don’t pass through the 0.10× discount. Surfaced as a Workspace-level “relay health” signal — a useful operational tool for enterprises routing through Chinese API proxies.
Reliability Follow-ups (Agent Core Priority Matrix)
Bulk of the Agent Core integration blueprint (Phase 0–3, I.1–I.16) shipped across v0.8.2 and v0.8.3. The items below are from the parallel Priority Matrix that still need attention.

- Content replacement state persistence (streaming invariant #2): “once seen, fate frozen” — message content that was already emitted to the client must not be retroactively mutated across resume / reload. Requires a replacement ledger aligned with the SSE cursor from A1. Blocked on understanding actual user-visible glitches; no active complaint.
- Attachment context router: smarter attachment injection with `alreadySurfaced` + `readFileState` dedup, aggregate attachment budget, and liveness checks. Prevents resending the same 50KB PDF extract on every turn. Couples with workspace file offloading (already on the v0.9 plan).
- Side query workers (prompt worker pool): dedicated lightweight pools for recall / classification / summary / session-memory queries so they don’t contend with the main agent LLM call for rate-limit budget. Prerequisite: prompt registry expansion (above).
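The attachment-router dedup idea can be sketched as a per-turn selection pass that skips extracts already surfaced in the session and enforces the aggregate budget. The function and state names here are illustrative (only `alreadySurfaced` and the budget concept come from the text):

```python
def select_attachments(attachments: list[dict], already_surfaced: set[str],
                       budget_chars: int) -> list[dict]:
    """Pick which attachment extracts to inject this turn (sketch)."""
    chosen, used = [], 0
    for att in attachments:
        if att["id"] in already_surfaced:
            continue                                 # never resend an extract
        if used + len(att["extract"]) > budget_chars:
            continue                                 # aggregate budget exhausted
        chosen.append(att)
        used += len(att["extract"])
        already_surfaced.add(att["id"])              # mark as surfaced
    return chosen
```

A liveness check (drop extracts whose source file was deleted) would be one more filter in the same loop.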
Ecosystem & Scaling
- Scheduled jobs + Event-triggered Agents (Loop): cron-like background task triggers; `scheduled_jobs` + `job_runs` DB tables; APScheduler integration; job CRUD API + job history UI; result notification via message push connectors. Scope covers both time-triggered (cron) and event-triggered (webhook inbound) patterns — an agent running asynchronously in the background IS the async sub-agent use case for Hub mode.
- Prebuilt Solution Templates (Market Seed Content): 8 ready-to-use vertical Solutions published to Market on first-user registration — Financial Audit, Contract Review, Data Reporting, IT Helpdesk, HR Onboarding, Sales Assistant, Content Writer, Meeting Summary. Each bundles an Agent + Skill with Chinese SOPs; bootstrapped idempotently via `ensure_solution_templates()`, published to the Market org for immediate marketplace availability (shipped in v0.8.1)
- DB Schema Advanced Builder: AI-driven schema management agent for large-scale databases — strategic table annotation (pattern-based, SQL-execution-informed), bulk visibility management by domain prefix, iterative multi-round annotation for 1K–7K+ table deployments; complements the existing batch-annotation job with selectivity and business-context reasoning
- Resource Fork (Package Phase 1 — prerequisite for the v1.0 Package System): All per-resource fork endpoints implemented — MCP Server, Skill, Agent, Connector, Workflow. KB fork removed (inherently user-local). Each `POST /api/{type}/{id}/fork` creates a user-owned deep copy with `forked_from` lineage tracking. (completed in v0.8.1)
- Per-workflow `credential_policy` override (owner|caller|auto): The five workflow trigger paths currently hardcode whose identity runs connector actions — webhook/cron pass `wf.user_id` (owner), manual/batch pass `current_user.id` (caller). This matches the common “automations run as owner, manual runs as caller” expectation, but enterprise deployments occasionally need to override per workflow (e.g. a cron that must run under the current on-call engineer, or a shared template that must borrow the owner’s credentials even on manual run). Add a `credential_policy` field on the Workflow model, surfaced in the UI next to Schedule / API-Key config, that overrides the default `trigger_source → identity` mapping. Prerequisite: `trigger_source` observability above.
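The default `trigger_source → identity` convention plus the proposed per-workflow override reduces to a small pure function. The mapping for webhook/cron/manual/batch follows the text; the sub-workflow default and the function shape are assumptions:

```python
# Default convention described above: automations run as owner,
# interactive runs as caller. The "sub" default is an assumption;
# the document does not state it.
DEFAULT_IDENTITY = {
    "webhook": "owner",
    "cron": "owner",
    "manual": "caller",
    "batch": "caller",
    "sub": "owner",
}

def resolve_run_identity(trigger_source: str, owner_id: str, caller_id: str,
                         credential_policy: str = "auto") -> str:
    """Pick whose identity runs connector actions for this workflow run."""
    if credential_policy == "owner":
        return owner_id
    if credential_policy == "caller":
        return caller_id
    # "auto": fall back to the per-trigger convention
    return owner_id if DEFAULT_IDENTITY[trigger_source] == "owner" else caller_id
```

Centralizing the decision like this is what makes the convention documentable: each of the five `WorkflowEngine` instantiation sites would call one function instead of hardcoding an identity.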
v1.0 — Hot-Plug + Embeddable
Goal: Zero-restart connector addition, Package ecosystem, and embedded delivery.

- Connector Progressive Disclosure (Phase 5): Semantic-Guided Tool Selection (entity extraction from query → Ontology Registry lookup → connector set reduction; 90%+ token reduction for 50+ connector deployments); Scale mode for batch/ETL connectors; CLI-style universal `connector <name> <action> <params>` interface
- Cross-Connector Entity Alignment (Ontology Registry): define shared entity types (Customer, Order, Asset) with field mappings across connectors; DAGPlanner auto-resolves cross-system JOIN keys; enables cross-connector queries (e.g., “customers in Salesforce who ordered in Shopify”) without hardcoded field names
- Hot-plug connectors: upload OpenAPI spec, AI generates config, live in 5 minutes (no restart)
- Marketplace Redesign Phase 1 — Solutions + Components: Two-tier Market model (Solutions: Agent/Skill/Workflow; Components: Connector/MCP Server); scope selector (Global Market / org); unified subscription model (org auto-appear removed); KB removed from Market scope; data migration backfills subscriptions for existing org members
- Market Package System: Distributable resource bundles for the Marketplace — replaces the per-type “marketplace” with a unified packaging layer. A `fim-package.yaml` manifest declares: metadata (name, version, description, author, license, tags, `min_fim_version`), entry point (primary Skill or Agent), resource list (agents, skills, connectors, KBs, MCP servers, workflows) with config references, inter-package dependencies (semver ranges), required credentials (mapped to connector refs for install-time collection), and user-configurable variables with defaults. Two consumption modes: (1) install — batch-create all resources + auto-wire internal references via ID substitution; installation linked to source for version update notifications; `POST /api/market/packages/{id}/install`; (2) fork — clone as user-owned editable copies with no update link (this IS the template mode); `POST /api/market/packages/{id}/fork`. Additional endpoints: publish (`POST /api/market/packages` with review workflow), uninstall (`DELETE /packages/{id}/uninstall` with dependency check + modified-resource confirmation), version history (`GET /packages/{id}/versions`), upgrade (`POST /packages/{id}/upgrade` with per-resource diff preview). Dependency resolver for nested package requirements with conflict detection. `PackageInstallation` table tracks installed packages per user with resource ID mapping for uninstall/upgrade. Coexists with individual resource publishing — Package is a composition layer, not a replacement; a single Connector is still publishable standalone. Example dependency tree: `Package: contract-review` → `Skill: contract-review` (entry point) → `Agent: contract-analyst` + `Agent: risk-scorer` → `KB: legal-clauses` + `Connector: docusign-api` + `MCP: pdf-extractor` + `Workflow: contract-approval-flow`
- Creator Program: Marketplace monetization layer — creator profiles with portfolio pages, per-package analytics (installs, forks, active users, ratings/reviews), affiliate commission tracking when packages drive new subscriptions. Paid package tier with pricing, purchase flow, and approval workflow. Creator dashboard with install trends, revenue reporting, and user feedback. Public creator API for programmatic package publishing (CI/CD for package authors). Community features: package comments, Q&A, changelogs per version
- Embeddable widget: `<script src="fim-one.js">` injected into host page
- Page context injection: widget reads host page context (current ID, URL, DOM selectors)
- Advanced triggers: Webhook inbound events; scheduled job enhancements (multi-timezone, calendar-aware)
- Batch execution: process 1000+ items via DAG
- Enterprise security: IP whitelisting, encryption at rest, SSO
- KB Advanced Editor: Builder-mode agent for power users managing large knowledge bases — bulk URL ingestion, duplicate detection, gap analysis, document lifecycle management; extends existing KB AI chat with ReAct tool loop
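To make the Market Package System concrete, here is a sketch of what a `fim-package.yaml` for the contract-review example tree above might look like. Only the field categories are taken from the text; the exact keys and nesting are illustrative and not a finalized schema:

```yaml
# Illustrative manifest only — keys are assumptions, not the final schema.
name: contract-review
version: 1.2.0
description: Contract review solution with risk scoring
author: acme-labs
license: MIT
tags: [legal, contracts]
min_fim_version: "1.0"
entry_point:
  skill: contract-review          # primary Skill (or Agent)
resources:
  skills: [contract-review]
  agents: [contract-analyst, risk-scorer]
  knowledge_bases: [legal-clauses]
  connectors: [docusign-api]
  mcp_servers: [pdf-extractor]
  workflows: [contract-approval-flow]
dependencies:
  pdf-toolkit: ">=0.3 <1.0"       # inter-package dependency, semver range
credentials:
  - connector: docusign-api       # collected from the user at install time
variables:
  risk_threshold:
    default: 0.7                  # user-configurable with a default
```

On install, every name in `resources` would be batch-created and internal references rewritten via ID substitution; on fork, the same tree is cloned as editable user-owned copies with no update link.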
Frozen Features (Shipped, Maintain Only)
Per the Orthogonality Strategy, these features are shipped and working but will not receive new capabilities (bug fixes only):

| Feature | Version | Why frozen |
|---|---|---|
| ReAct Agent | v0.1, v0.9 | Models now have native tool calling. Mid-loop self-reflection (v0.9) prevents goal drift in long chains. Tool observation synthesis quality improved (8K chars, configurable via REACT_TOOL_OBS_TRUNCATION) |
| DAG Planning / Re-Planning | v0.1, v0.5, v0.7.5 | Model reasoning capabilities improving; decomposition becoming single-shot. Per-step verification shipped in v0.7.5 (DAG_STEP_VERIFICATION). Hardened: cascade failure propagation, verifier status fix, planner tool descriptions, full replan history, whitelist-based tool cache. 14 engine constants exposed as ENV vars — no further planning primitives planned |
| Memory (Window, Summary, Compact) | v0.2, v0.5 | Context windows growing (200K+); less need for external memory management |
| RAG pipeline | v0.5 | Providers building retrieval natively (OpenAI file_search, Gemini Search Grounding) |
| Grounded Generation | v0.5 | Models improving at citations; 5-stage pipeline adds diminishing value |
| ContextGuard / Pinned Messages | v0.5 | Shipping as-is; no new features |
Consider (Deferred Indefinitely)
Per the Orthogonality Strategy, these would be high-effort and face absorption risk:

| Feature | Why deferred |
|---|---|
| Multi-Agent Orchestration (deep hierarchies) | Providers building natively (OpenAI Swarm, Google A2A, and similar multi-agent offerings). FIM One’s CallAgentTool covers the one-level delegation case; event-triggered background agents are covered by Scheduled Jobs in v0.9 |
| Agent Self-modifying Skills (Procedural Memory) | Agents updating their own skill.md during execution — high complexity, safety/audit surface area. Depends on Agent Skill System (v0.8) shipping first. Re-evaluate if enterprise customers request self-improving agents explicitly |
| | Promoted to v0.9. The value is selective reading, not context capacity — cross-framework validation confirmed. Original deferral reasoning (“200K+ windows reduce urgency”) was wrong. |
| Cross-Session Long-Term Memory | Context windows growing rapidly (200K–2M); providers adding built-in memory (OpenAI memory, Gemini context caching); high implementation cost vs diminishing differentiation value. Re-evaluate when enterprise customers explicitly request it |
| Memory Lifecycle (TTL, quotas) | Depends on cross-session memory; deferred together |
| Active Context Compression Tool (agent-triggered) | Explicitly frozen with ContextGuard (v0.5). Context windows at 200K+ reduce value. Will not be revisited unless context costs become a major enterprise complaint |
| Browser Automation / Computer Use | High maintenance cost (DOM changes, anti-bot, sandboxing). Industry converging on Computer Use mode (Anthropic, OpenAI Operator, Google Mariner) and MCP browser tools (Puppeteer/Playwright MCP). Consume via MCP integration, don’t self-build. Re-evaluate when stable Computer Use MCP standard emerges |
| Web Push Notifications | Browser-native push via Service Worker + VAPID. Overlaps with IM Channel Integration (v0.8) which covers enterprise-preferred channels (Lark/Slack/WeCom/Email). IM push has higher enterprise value; Web Push is a nice-to-have for Portal-only users. Re-evaluate after IM Channel ships — if users request browser notifications beyond IM coverage |
| Multi-user workflow collaborative editing | Real-time co-editing of the same workflow blueprint (Figma/Notion style) with cursor awareness, conflict resolution, and per-node lock. High implementation cost (CRDT / OT, presence infra), unclear enterprise demand over today’s “one editor at a time + version diff” model. Re-evaluate if multiple enterprises specifically request shared live editing |
| Per-node workflow execution permissions (RBAC on run) | Fine-grained authorization inside a single workflow run — e.g. “node X requires role finance_approver to execute”. Today authorization happens at the workflow level (who can trigger) and at the connector level (whose credential runs); per-node RBAC adds a third axis with material complexity and no active customer request |
| Cross-org workflow sharing with live updates | Subscribe to a workflow from another org and receive upstream updates without re-forking. Today subscribe = fork (snapshot), so breaking upstream changes never propagate. Live updates would require upstream-compatible schema evolution + conflict resolution; high maintenance cost. Re-evaluate if enterprises ask for “shared workflows across subsidiaries” |
How Versions Align With Modes
| Version | Standalone | Copilot | Hub | Notes |
|---|---|---|---|---|
| v0.1–v0.3 | Working | Not yet | Not yet | Portal-only, single-user |
| v0.4 | Working | Not yet | Not yet | Multi-conversation, agent management |
| v0.5 | Working | Not yet | Not yet | Knowledge base + RAG |
| v0.6 | Working | Possible | Possible | Connectors ship; Copilot/Hub possible with manual wiring |
| v0.7 | Working | Ready | Ready | Admin platform; multi-tenant auth; ready for production |
| v0.8 | Working | Ready | Optimized | RBAC + audit log per-system; easier to onboard |
| v0.9 | Working | Ready | Production | Observability, performance, hardening |
| v1.0 | Working | Optimized | Enterprise | Package system, creator program, hot-plug, embeddable widget, webhooks, batch |
Resource Allocation (v0.8–v1.0)
The Orthogonality Strategy shapes where effort goes:

| Category | Allocation | Versions | Why |
|---|---|---|---|
| Connector Platform (v0.6+) | 50% | Ongoing | Core differentiation; no absorption risk |
| Enterprise Features (RBAC, audit, security, observability) | 30% | v0.8–v1.0 | Boring but durable; production requirement. Agent Trace Layer is commercial anchor |
| Agent Intelligence (Skill System, scheduled agents) | 15% | v0.8–v0.9 | “Instructions + tools + skills” differentiation story; low absorption risk — frameworks validate patterns, but enterprise SOPs are customer-specific |
| v0.1–v0.5 maintenance | 5% | Ongoing | Bug fixes only; no new features |
Metric-Driven Milestones
Success is measured by:

| Metric | v0.7 Target | v0.8 Target | v1.0 Target |
|---|---|---|---|
| Connectors deployed | 5 | 20+ | 100+ |
| Enterprise customers | 1–2 | 5–10 | 20+ |
| Avg connector setup time | 2 weeks | 2 days | 5 minutes (hot-plug) |
| Token efficiency (DAG vs ReAct-only) | 30% reduction | 40% reduction | 50% reduction |
| Uptime SLA | 99.5% | 99.9% | 99.95% |
| Support ticket themes | Integration, setup | Connector custom logic | Hot-plug, scaling |
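The token-efficiency row compares DAG planning against ReAct-only runs as a relative reduction. A sketch of the comparison, assuming the straightforward formula (the document does not define it; this is one plausible reading):

```python
def token_reduction(dag_tokens: int, react_tokens: int) -> float:
    """Relative token savings of a DAG-planned run vs a ReAct-only baseline.

    0.30 means the DAG run used 30% fewer tokens than the baseline.
    Assumes react_tokens > 0.
    """
    return 1 - dag_tokens / react_tokens
```

Under this reading, a task that costs 100K tokens ReAct-only must complete in at most 70K tokens via DAG planning to meet the v0.7 target, 60K for v0.8, and 50K for v1.0.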
Open Questions / TBD
- Marketplace moderation: How to validate community packages and individual resources? Automated scanning for credential leaks in package configs? (v1.0)
- Token economics: How to price multi-user, multi-agent scenarios? (v1.0)
- Package versioning: Breaking changes in installed packages — auto-upgrade with migration scripts, or manual approval per update? Dependency diamond problem resolution? (v1.0)
- Package pricing: Free vs paid tiers, commission rates for Creator Program, payment provider integration? (v1.0)
- Package credential UX: Install-time credential collection — wizard-style step-by-step or deferred setup? Credential sharing across packages that use the same connector type? (v1.0)
- Telemetry opt-out: How to honor privacy preferences? (v0.8)
- Connector versioning: How to manage breaking changes in connector APIs? (v0.8)
- Rate limiting: Per-user workflow rate limiting shipped (sliding window 10 runs/min, 3 concurrent). Per-connector and per-agent rate limiting TBD (v0.9)
- Connector authorization tier selection: how does an admin discover which tier applies to a given upstream system? Auto-probe (try per-user API key → fall back to login-ticket → fall back to shared-DB) vs. explicit declaration in the connector spec? How do we express “this connector supports Tier 2 but the admin chose to operate in Tier 1” in the UI without confusing non-technical admins? (v0.9)
- Integration vs Connector duality: when a Feishu binding is simultaneously an SSO provider AND an API-call surface, how do we present it in Settings? One object with three toggles, or three separate bindings that share a credential? Implications for uninstall semantics (does revoking SSO kill the Connector?) (v0.9)