All articlesFrontier

Mistral Medium 3.5 ships remote agents — a quick note on why we won't route production through them

Mistral Medium 3.5 added server-side tool execution. Useful for prototypes; not where we put production traffic. A short field note on the trade-off.

Apr 29, 2026 1 min read
mistralremote-agentsagentic-aitool-use

Mistral dropped Medium 3.5 yesterday, a mid-size model comparable on code tasks to Claude Sonnet 3.5, with one feature most coverage glossed over: remote agents that execute tool calls server-side. Send a prompt, Mistral runs the tools, you get the final answer. One API call instead of the usual prompt-tool-execute-feed-back loop. Python, web search, file access, and custom function calling, all sandboxed on Mistral's infrastructure.

For prototypes and one-off research tasks, this is great. For production traffic, we won't be using it. Three reasons.

Observability. Remote execution is a black box. When a tool fails, we want to know which payload triggered it and at what time, with full request and response. We log every tool call locally and route it through LangSmith and Helicone. Remote agents give you an error message if you're lucky.

State. Our agents share context across a single Postgres database. Remote execution means shipping that state back and forth or maintaining dual truth, neither of which is appealing. We'd rather query pgvector directly through the orchestration layer.

Tool surface. Mistral covers Python and web search. A typical deployment for us calls Twilio, ElevenLabs, Stripe, QuickBooks, HubSpot, Airtable, Make, n8n webhooks, and dozens of custom endpoints per client. Remote execution would mean Mistral supporting every SaaS we touch, or building a translation layer that defeats the purpose.

Where remote agents do make sense: one-off research tasks, edge-deployed agents in restricted runtimes (Cloudflare Workers, mobile shells) where you can't spin up Python locally.

Everything else, orchestrate locally. Remote agents make prototyping faster. They don't make production easier. They move the complexity from your code to the model provider's runtime, and when that runtime fails, you debug without logs.

/ 06 — Start hereOne business day response

Tell us what you'd like built.

Send us a paragraph about the workflow, phone line, or tool you want built. We'll reply within one business day with a one-page plan, a fixed price, and a delivery date you can put on a calendar.

  • 30-min scoping call, free
  • Written proposal within 48 hours
  • Fixed price before we start
  • Most builds delivered in 2–8 weeks