VentureBeat Apr 29, 06:37 PM
IBM launches Bob with multi-model routing and human checkpoints to turn AI coding into a secure production system

Bringing AI agents into the enterprise software development lifecycle is fast becoming the norm. As developers experiment with new platforms, organizations are exposed to potential security and orchestration failures. Systems that work in pilots may fail once the agents start working with real-time data.
Legacy tech giant IBM is one of several companies trying to address that gap by introducing more structure into how these workflows run. Yesterday, it announced the global launch of Bob, its AI-powered software development platform designed to write and test code across the development cycle. After starting with just 100 internal users in summer 2025, Bob is already in use by more than 80,000 IBM employees.
Bob introduces a structured layer that regularly pauses for human-led checkpoints. Even so, by harnessing AI models to perform agentic tasks, IBM says it has saved some teams up to 70% of time "on selected tasks...equaling an average time savings of 10 hours per week."
Supported models include IBM's own Granite series, Anthropic's Claude, models from French AI firm Mistral, and other smaller distilled models; Alibaba's Qwen and other fully open source models are not supported.
This reflects a shift in how enterprises want to handle AI-led development: building systems that not only produce applications but also execute complex, multi-step workflows without relying on a single model or a single orchestration framework. It offers a structured, guarded form of automation that seeks to center humans in the process and fill audit gaps.
Neal Sundaresan, general manager of Automation and AI at IBM, told VentureBeat in an exclusive interview that a large part of using AI for software development is being systematic.
“Model capability alone isn’t enough,” Sundaresan said. “How you deploy it, how you structure context, and how you keep humans in the loop is what determines whether AI actually delivers.”
That divide is shaping how enterprises choose AI tools: whether they prioritize flexibility and experimentation, or reliability and auditability.
Varying approaches to AI-led development
A growing class of open, autonomous agent systems has pushed the boundaries of what developers can do, running extended or stateful workflows without much human intervention.
The rise of OpenClaw showed enterprises how far experimentation can go, especially when agents are trained on local data and run in sandboxes. But it also exposed a trade-off between easier agent and workflow creation and security.
Some companies have embraced this spirit of experimentation.
Enterprise providers like Nvidia chose to embrace OpenClaw-like systems by adding a fence around the sandbox environment that runs autonomous agents, using NemoClaw. Kilo launched Kilo Claw, aimed at providing security for autonomous agents. OpenAI, in its updated Agents SDK, added support for sandboxed agent implementations that mirror many of the usage patterns of systems like OpenClaw.
Sundaresan said ente