Snowflake offers help to users and builders of AI agents | InfoWorld
Snowflake offers help to users and builders of AI agents 21 Apr 2026, 1:15 pm
Snowflake is enhancing Snowflake Intelligence and Cortex Code to create a unified experience connecting enterprise systems, data sources, and AI models with Snowflake data. It’s part of the company’s vision to become the control plane for the agentic enterprise, enabling enterprises to align data, tools, and workflows with AI agents built on its platform.
With these updates, the company said, Snowflake Intelligence becomes an adaptable personal work agent for business users, and Cortex Code expands as a builder layer for enterprise AI that provides governed, data-native development.
Enhancements to Snowflake Intelligence include automation of routine tasks by describing them in natural language, new Model Context Protocol (MCP) connectors, and reusable artifacts that let users save and share analyses, visualizations, and workflows, all of which will be generally available “soon.” In addition, a new iOS mobile app and multi-step reasoning with deep research that uses agentic architecture to reason across data will soon be in public preview.
The company said that all of these updates grew out of customer feedback, as well as insights gleaned from Project SnowWork, last month’s preview of an autonomous AI layer for its data cloud.
Cortex Code now supports additional external data sources, including AWS Glue, Databricks, and Postgres, connectivity with other AI agents via MCP and Agent Communication Protocol (ACP), a Claude Code plugin, and a new agent software development kit with support for Python and TypeScript. There are also enhancements to Cortex Code in Snowsight, Snowflake’s web interface, including Plan Mode to allow developers to preview and approve workflows, and Snap & Ask to enable interaction with data artifacts such as charts and tables.
Snowflake also announced the private preview of Cortex Code Sandboxes in Snowsight, a dedicated cloud environment where developers can execute code end-to-end with no setup.
Michael Leone, VP & principal analyst at Moor Insights & Strategy, thinks the roadmap is “ambitious,” noting the number of items announced that are “coming soon” or are in public preview. “These announcements are starting to blur together, with almost every vendor claiming their agents can reason, act, and transform the business,” he said, adding, “What makes this one worth slowing down on, at least for me, is that Snowflake is going after both halves of the enterprise at the same time. Intelligence is built for the business users who want answers and actions without writing SQL, and Cortex Code is built for the builders who actually have to put this into production.”
Most vendors pick one target, users or builders, and come back to the other later, he said, but Snowflake is putting both on the same governed data foundation. “[This] is a harder engineering problem, but I’d argue it’s a cleaner answer to the question enterprises are actually asking, which is how to open AI up to more people without losing control of the data underneath,” he said, noting that Snowflake has changed its approach from “let’s do it inside Snowflake,” to realizing that agentic AI only works if it’s interoperable with the rest of the stack.
Igor Ikonnikov, advisory fellow at Info-Tech Research Group, also sees the control plane play as part of an industry trend. “As always, the devil is in the details: what those platforms are composed of and how they offer to control AI agents,” he said. “Most platforms are built the old-fashioned way: All the controls are coded. Snowflake speaks about reusable analytics through saving the whole solution and reusing complete modules or models. It means that common semantics are still buried inside database models and code.”
All AI vendors are motivated by the same demand from the market, he said: “Move from Copilot-based generic chatbots to business-purpose-specific AI agents that understand business logic and can interact with one another.” With these updates, he sees Snowflake as having caught up with the competition, but not yet surpassing it.
Sanjeev Mohan, principal at SanjMo, said, “The good news for customers is the support for Databricks and AWS Glue. What Snowflake is saying is that even if your data lives in a competitor’s system, Snowflake AI coding agent can be used. And vice versa, the VS Code extension and Claude Code plugin can be used on Snowflake data. In other words, it reduces vendor lock-in fears.”
It’s also the right strategic direction, said Sanchit Vir Gogia, chief analyst at Greyhound Research. “Enterprise AI is moving from generation to orchestration to execution, and Snowflake’s focus on governed data as the foundation for action aligns with that shift,” he said.
“However, becoming the execution layer for enterprise AI requires more than integrating agents and expanding tooling,” he said. It also requires consistent semantics, reliable cross-system execution, strong governance, economic viability, and organisational readiness, as well as overcoming a structural constraint. “Control without ownership of the systems where work is executed introduces dependency that is difficult to fully resolve. This is the central tension in Snowflake’s strategy and will define how far it can realistically extend its influence,” he said. “Snowflake has taken a meaningful step in that direction. It has not yet proven that it can deliver this at scale. At this stage, it is one of the most credible contenders in a race that will be defined not by who builds the smartest AI, but by who can make that AI work reliably inside the enterprise.”
Amazon’s $5B Anthropic bet is really about compute, not just cash 21 Apr 2026, 11:37 am
Amazon on Monday said it was investing an additional $5 billion in Anthropic, a move that analysts say is aimed as much at easing the AI startup’s growing infrastructure bottlenecks as at deepening their strategic partnership.
As part of the deal, Anthropic will lock in up to 5 gigawatts of compute capacity across AWS’s Trainium chips, including the new Trainium 3 and upcoming Trainium 4, the companies said in a joint statement.
“Right now, users see limits like throttling and session caps because Anthropic is running out of capacity and must ration usage to avoid crashes. This deal helps fix that,” said Pareekh Jain, principal analyst at Pareekh Consulting.
“Over time, the expanded capacity will let Anthropic support more users at once, build bigger models, and reduce these limits, especially for paid and enterprise users,” Jain added.
The analyst was referring to Anthropic’s move to throttle usage across its Claude subscriptions, especially during peak demand hours, which also coincided with other concerns, such as complaints of degradation in Claude’s reasoning performance across complex tasks.
Scaling compute capacity
A significant portion of Trainium 3 capacity is expected to come online this year, the companies added. Anthropic already uses Trainium 2 via AWS’s Project Rainier, a cluster of nearly half a million chips, to train and run its models.
The agreement between Amazon and Anthropic also includes an expansion of inference capacity in Asia and Europe, which Jain said should improve Claude’s speed and reliability globally. Anthropic will also have the option to buy future generations of Trainium as they become available.
However, Anthropic isn’t alone when it comes to model providers trying to add compute capacity to train and run their models.
In February, rival OpenAI signed a deal with Amazon, Nvidia, and SoftBank to raise around $110 billion for additional compute infrastructure.
As part of the arrangement, OpenAI has committed to consuming at least 2GW of AWS Trainium-based compute tied to Amazon’s $50 billion investment, along with 3GW of dedicated inference capacity from Nvidia under its separate $30 billion commitment.
From funding to supply chain financing
Deals such as these, analysts say, reflect a broader shift in how AI infrastructure is financed.
“Rather than simple cash-for-equity, these deals bundle equity investment with massive cloud-spend or GPU-spend commitments, locking in customers, securing capex returns, and validating infrastructure buildouts in a single transaction. This isn’t venture capital anymore; it’s supply chain financing,” Jain said.
The pattern in these deals, Jain noted, is consistent across the ecosystem, and he cited Microsoft, Oracle, and Nvidia as examples.
“Microsoft invested tens of billions into OpenAI while simultaneously committing Azure capacity for training and inference, with OpenAI’s Azure spend now running at a multi-billion dollar annual rate,” Jain said.
“Oracle, too, signed a $30 billion cloud deal with OpenAI, then followed it with a staggering $300 billion five-year compute commitment starting in 2027. Nvidia took it further still with its $100 billion investment in OpenAI, which was paid in GPUs, not dollars — a model it replicated with xAI,” Jain added.
That framing, however, according to Greyhound Research chief analyst Sanchit Vir Gogia, may miss a deeper shift.
Such deals, Gogia said, are more about securing scarce compute supply ahead of competitors. “What capital does is improve your position. It allows you to commit earlier and at greater scale,” the analyst pointed out, adding that the real advantage lies in locking in infrastructure before others can.
On the flip side, though, long-term capacity commitments tend to anchor companies to specific providers, Gogia cautioned.
While model providers may operate across platforms and hyperscalers, their largest infrastructure commitments ultimately shape where they optimize workloads, build features, and direct spending, the analyst pointed out.
For Anthropic, the Amazon deal comes with equally significant long-term obligations. The company has committed to spending more than $100 billion on AWS over the next decade.
For Amazon, the $5 billion investment builds on its earlier $8 billion bet on Anthropic and comes with the potential to commit up to an additional $20 billion tied to certain commercial milestones, which were not revealed. Anthropic is also looking beyond AWS. The company recently said it plans to add capacity using Google’s TPUs. These chips are expected to come online by next year.
This article originally appeared on Network World.
From the engine room to the bridge: What the modern leadership shift means for architects like me 21 Apr 2026, 10:00 am
We all agree that the role of the technology leader is being rewritten in real time, and if you’re building the systems they depend on, you need to understand what they’re asking for now.
Let me be honest about something. For most of my career, the conversations I had with CIOs followed a pretty predictable script. They’d describe a pain point, I’d map it to a solution and we’d talk timelines and integration. Clean. Transactional. Technical. Very straightforward, right?
That script has been shredded.
Over the past couple of years, working across public sector agencies, global enterprises and mid-market companies in Latin America and now the US, I’ve watched the CIO role transform in a way that genuinely changes how I do my job as a solutions architect. The technology leader who used to care primarily about uptime and cost efficiency now walks into conversations asking about competitive differentiation, cultural change and workforce transformation. And they’re right to ask.
The shift isn’t cosmetic. It’s structural.
The problem hiding in plain sight
Here’s what I kept seeing in failed modernization projects, and I saw a lot of them: The technology worked fine. The architecture was sound. The implementation was clean. And the project still stalled or quietly died six months in. The root cause was seldom technical. It was a decision-making problem upstream of delivery. Strategy that hadn’t been translated into clear operating priorities. Conflicting stakeholder mandates that nobody had formally resolved. Organizational structures that pulled in different directions from the infrastructure teams trying to serve them.
What I’ve come to think of as “decision integrity,” the discipline of making sure strategy connects to execution, was missing. And the CIO, historically, wasn’t positioned to own that gap. They were downstream of it.
That’s changed. The CIOs I work with now are increasingly the ones driving that upstream clarity. They’re defining outcome frameworks, arbitrating tradeoffs and forcing the organizational alignment that makes technical delivery land. The architecture conversation I have with them today is as much about governance and organizational design as it is about platforms.
What this means if you’re building for them
From where I sit, designing solutions around open-source platforms, hybrid cloud and AI infrastructure, the practical consequence is this: The technology decisions my customers make are no longer primarily about technology.
A CIO investing in AI-ready infrastructure isn’t just buying a faster platform. They’re making a strategic bet that the organization can operate differently at scale. Which means the infrastructure must support not just the technical requirements (consistent data access, automated policy enforcement, and visibility across hybrid environments) but also the organizational ones.
Can non-technical stakeholders trust the system? Can the governance model hold up as scope expands? Can the platform absorb the messiness of real enterprise change without the whole thing collapsing?
Technical debt is where this gets painfully concrete. I’ve seen environments where 30–40% of engineering capacity is absorbed by legacy maintenance. Not because anyone made a bad choice, but because previous decisions compounded over time without a deliberate modernization strategy. When a CIO tells me they want to move fast on AI adoption, the first conversation we must have is about what’s sitting underneath that ambition. You can’t build a reliable AI pipeline on top of infrastructure you don’t trust.
The CIOs who are winning right now are the ones who dealt with that debt proactively, not by declaring a big-bang rewrite but by systematically creating the conditions where innovation can happen without adding to the entropy. That’s what I try to help architect.
The cultural piece is the hardest part, and it’s real
I’ll be straight here: When someone says, “cultural transformation,” my instinct is usually to translate it into something more concrete I can design for. Agile delivery models. Feedback loops. Automation that removes friction from the right places. That’s still my instinct.
However, I’ve had to sit with the fact that the cultural piece isn’t just a soft addendum to the technical work. It’s load-bearing.
Here’s the version I’ve watched play out more than once: You build a genuinely excellent automation platform. The tooling is solid. The pipelines work. And then adoption stalls because the teams who are supposed to use it don’t trust it, weren’t involved in defining it or are quietly protecting workflows that the new system would disrupt. The problem isn’t the platform. The problem is that nobody built the social infrastructure around it.
Gartner’s projection that 25% of IT work will be handled autonomously by AI by 2030 isn’t a threat or a promise; it’s a design constraint. If you’re architecting systems today, you have to ask: What does the human role look like in this workflow once AI is doing the routine work? What skills are you developing in the team? Where does judgment still belong to a person?
Those aren’t questions with clean technical answers. But they’re questions an architect must have an opinion on.
Both hands on the wheel
There’s a framing I keep coming back to, which describes the modern technology leader as both the navigator on the bridge and the engineer in the engine room at the same time. That’s exactly the tension I recognize from the field. The CIOs I most want to work with are the ones who haven’t abandoned either role. They’re genuinely curious about how the infrastructure works, not just what it delivers. And they’re genuinely accountable for business outcomes, not just technical ones. That dual orientation is rare and valuable, and when I find it, those tend to be the engagements where we build something worth building. And this is one thing that fascinates me about open source: The people who engage with it tend to be true tech experts.
For those of us on the architecture side, the implication is clear. We can’t show up to these conversations as purely technical resources anymore, either. The best solution I can design is useless if it doesn’t connect to the organizational reality my customer is operating in. Understanding the strategic pressure they’re under, the cultural conditions they’re working with, the decision-making constraints they’re navigating, that context shapes everything about how I recommend we build.
The engine room and the bridge have always been part of the same ship. It just took a while for the org charts to catch up.
This article is published as part of the Foundry Expert Contributor Network.
Want to join?
Enterprises are rethinking Kubernetes 21 Apr 2026, 9:00 am
For years, Kubernetes held an almost mythic place in enterprise IT. It was positioned as the control plane for the future, the standard abstraction for cloud-native systems, and the platform that would finally free enterprises from infrastructure lock-in. To be fair, some of that was true. Kubernetes brought discipline to container orchestration, enabled portable deployment models, and provided architects with a powerful framework for managing distributed applications at scale.
However, the market is changing, and so are enterprise expectations. The question is no longer whether Kubernetes is technically impressive. It clearly is. The question is whether it still represents the best fit for a growing number of mainstream enterprise use cases. In many cases, the answer is increasingly no. What we are seeing is not the death of Kubernetes but the end of its unquestioned dominance as the default strategic choice. Here’s why.
Too operationally expensive
As Kubernetes adoption grew, many organizations hesitated to admit that it introduced operational complexity and needed specialized skills, constant tuning, and strong governance. Running Kubernetes well requires mature engineering, observability, security, networking, and life-cycle management—much more than a side project. Many underestimated this burden.
What looked elegant in architectural diagrams became a real-world tax on operations teams. Clusters multiplied. Toolchains sprawled. Upgrades became risky. Policy enforcement became an engineering discipline in its own right. Enterprises realized they were not just adopting an orchestration platform. They were building and maintaining an internal product that required sustained investment and scarce expertise.
That might be acceptable for digital-native businesses whose scale and complexity justify the effort. It is a much harder sell for enterprises that want reliable deployments, resilient applications, and reasonable cloud costs. In those cases, Kubernetes can feel like overengineering disguised as strategic modernization. When a company spends more time managing the platform than delivering business value on top of it, the novelty wears off quickly.
Portability becomes less important
Kubernetes was marketed as a hedge against lock-in, enabling applications to run across on-premises, cloud, and edge. However, most enterprises faced ecosystem dependencies—storage, networking, security, identity, observability, CI/CD, managed services, and cloud-native databases—creating practical lock-in that Kubernetes didn’t eliminate.
What enterprises gained in workload portability, they often lost in ecosystem complexity. They standardized on Kubernetes while still depending heavily on a particular cloud provider’s managed services and operational conventions. The result was a strange middle ground: all the complexity of a highly abstracted platform without the full simplicity of using opinionated native services end-to-end.
This matters more now because boards and executive teams are less interested in theoretical architectural optionality and more focused on measurable business outcomes. They want speed, resilience, cost control, and lower risk. If a managed application platform, serverless environment, or provider-specific platform-as-a-service offering gets them there faster, many are willing to accept some level of dependency. Enterprises are becoming more candid about the trade-offs. They are realizing that strategic flexibility is valuable, but not at any cost.
This is where Kubernetes starts losing favor. Portability has value, but for many enterprises, it hasn’t justified the operational and organizational burden it entails. The promise exceeded the actual return.
Better abstractions are catching up
Perhaps the most important shift is that enterprises are moving away from buying raw technical primitives and toward consuming higher-level platforms that better align with developer productivity and business outcomes. Platform engineering teams increasingly hide Kubernetes behind internal developer platforms. Public cloud providers continue to improve managed container services, serverless offerings, and integrated application environments that reduce hands-on infrastructure management. Developers, meanwhile, do not want to become part-time cluster operators. They want fast paths to build, deploy, secure, and monitor applications without stitching together a dozen components.
In other words, Kubernetes may still be present under the hood, but it is becoming less visible and less central to strategic buying decisions. That is usually a sign of maturity. Technologies shift from being the headline to being plumbing. Enterprises are not asking, “How do we adopt Kubernetes?” as often as they are asking, “What is the fastest, safest, most cost-effective way to deliver modern applications?” That is a much healthier question.
The answer increasingly points to curated platforms, opinionated developer environments, and managed services that abstract away Kubernetes rather than exposing it. This is not a rejection of cloud-native principles. It is a rejection of unnecessary cognitive load. Enterprises are deciding they do not need to own every layer of complexity to realize the benefits of modern architecture.
Surrendering the spotlight
None of this means Kubernetes is disappearing. It remains important for large-scale, heterogeneous, and highly customized environments. It is still an excellent fit for organizations with strong platform maturity, regulatory constraints, or sophisticated multicloud operational needs. But that is a narrower slice of the market than the hype cycle once suggested.
What is losing popularity is not Kubernetes as a technology, but Kubernetes as the unquestioned standard for enterprises. This difference is important. Companies are becoming more selective about where to accept complexity and where to avoid it. They are less inclined to idealize infrastructure and more eager to choose simplicity when it exists.
That is probably a good thing. The job of enterprise architecture is not to admire elegant technology for its own sake. It is to align technology choices with operational realities, economic constraints, and business outcomes. By that standard, Kubernetes still has a place, but it no longer gets a free pass.
The cookbook for safe, powerful agents 21 Apr 2026, 9:00 am
As companies move from experimenting with AI agents to deploying them in production, one pattern becomes clear: capability without control is a liability.
Agents operate in long-running, stateful environments. They browse the web, read repositories, execute shell commands, call APIs and interact with internal systems. That power is transformative — and it meaningfully expands the attack surface.
In a recent interview, Jonathan Wall, CEO of Runloop, summarized the shift: “By default, agents should have access to very little. They need to do real work, but capabilities have to be layered on in a controlled way.” That framing reflects a broader industry reality: agent infrastructure must be designed around least privilege, explicit isolation and observable execution.
What follows is a practical control architecture for production agents.
The layered control model
A resilient agent deployment combines six explicit layers:
- Strong runtime isolation with a microVM
- Restrictive network policy with explicit egress allowlists
- Centralized credential management through a gateway
- Disciplined identity management with short-lived, scoped credentials
- Deliberate friction around sensitive actions and high-risk tools
- Continuous monitoring, logging and adversarial testing
Each layer addresses a different failure mode. Together, they contain blast radius when — not if — something breaks.
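The layered model lends itself to a declarative posture that starts with everything denied. A minimal Python sketch (all field names and defaults here are hypothetical, not from any particular product):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class AgentPolicy:
    """Illustrative security posture for one agent runtime.

    Defaults encode the deny-by-default starting state; capabilities
    are layered on explicitly by overriding individual fields.
    """
    runtime: str = "microvm"              # strong hardware-level isolation
    egress_allowlist: tuple = ()          # no outbound access by default
    inbound_allowed: bool = False         # no unsolicited inbound connections
    credential_source: str = "gateway"    # agents never hold raw provider keys
    token_ttl_seconds: int = 900          # short-lived, scoped credentials
    approval_required: tuple = ("send_email", "modify_production", "access_secrets")
    audit_log: bool = True                # monitoring is non-optional

# The default object is the constrained state every agent begins in.
default_policy = AgentPolicy()
```

Making the policy an explicit object (rather than scattered configuration) keeps each layer reviewable and diffable as capabilities are granted.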
Start with least privilege
A production-grade agent environment begins in a constrained state: an isolated runtime boundary, no inbound access, no outbound network access, and no implicit tool permissions.
The runtime boundary itself is part of least privilege. Containers provide efficient isolation for trusted or single-tenant workloads, but they share a host kernel. Real-world escape vulnerabilities have repeatedly shown that this boundary can fail under adversarial pressure. CVE-2019-5736 allowed attackers to overwrite the host runc binary from within a container; CVE-2022-0492 enabled breakout via cgroups misconfiguration; CVE-2024-21626 again exposed runc-based escape paths. These incidents do not render containers unusable — but they clarify the tradeoff. MicroVMs introduce a stronger hardware-level boundary, reducing blast radius when agents execute arbitrary or unvetted code.
Isolation is not a performance decision alone. It is a risk decision.
The modern agent threat model
Traditional SaaS systems process deterministic requests. Agent systems ingest untrusted content and generate probabilistic actions.
Prompt injection has demonstrated how fragile instruction boundaries can be.
In 2023, public experiments against Bing Chat showed that hidden instructions embedded in web pages could override system prompts. Academic research from Stanford and others has shown that tool-using agents can be coerced to leak credentials or proprietary data when external content is treated as trusted context.
The danger compounds when agents operate with broad credentials. Service accounts, long-lived API keys and shared internal tokens convert a successful injection from “unexpected output” into repository compromise, database access or SaaS abuse. System prompts that embed internal URLs or configuration data become reusable artifacts once exposed.
Retrieval-augmented systems and MCP-style integrations widen the surface further. When external documents are ingested without segmentation or role separation, attacker-controlled content can redirect behavior or induce data disclosure.
This is the environment the layered model must withstand.
Network policy as containment
Network controls are often treated as compliance checkboxes. In agent systems, they are containment mechanisms.
Agents typically require outbound access for documentation lookup, dependency installation or API interaction. Yet unrestricted egress provides the cleanest path for data exfiltration after injection. Restrictive allowlists — permitting only explicitly approved domains or endpoints — dramatically reduce blast radius.
If a model is tricked into reading a .env file, a strict egress policy can prevent the obvious next step: shipping those secrets to an attacker-controlled domain. Logging outbound traffic establishes behavioral baselines and highlights anomalies early.
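A deny-by-default egress check is small enough to sketch directly. The hosts below are placeholders for whatever an organization actually approves:

```python
from urllib.parse import urlparse

# Hypothetical allowlist: only explicitly approved hosts may be reached.
ALLOWED_HOSTS = {"pypi.org", "files.pythonhosted.org", "docs.python.org"}

def egress_permitted(url: str) -> bool:
    """Deny-by-default check applied before the agent makes any outbound request."""
    host = urlparse(url).hostname or ""
    # Exact match only: wildcard subdomains widen the surface and should be opt-in.
    return host in ALLOWED_HOSTS
```

In practice this check would sit in a network proxy or firewall rule rather than application code, but the semantics are the same: anything not listed is dropped and logged.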
Containment turns catastrophic compromise into a recoverable incident.
Ingress as an operational event
Most agent runtimes do not require unsolicited inbound connections. Leaving services exposed by default accumulates unnecessary risk.
When debugging or collaborative inspection is required, exposure should be temporary and scoped — authenticated tunnels opened deliberately and closed promptly. Ingress becomes an operational decision rather than a static configuration state.
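Scoped exposure maps naturally onto a context manager: the tunnel cannot outlive the block that opened it. This sketch only models the lifecycle; a real implementation would drive an actual tunnel service:

```python
import contextlib
import secrets
import time

@contextlib.contextmanager
def temporary_ingress(port: int, ttl_seconds: int = 600):
    """Open a deliberate, authenticated, time-bounded exposure window.

    Sketch only: 'session' stands in for a real tunnel handle.
    """
    session = {
        "port": port,
        "token": secrets.token_urlsafe(16),      # per-session auth token
        "expires_at": time.time() + ttl_seconds, # hard upper bound on exposure
        "open": True,
    }
    try:
        yield session
    finally:
        session["open"] = False  # exposure never outlives the block
```

Used as `with temporary_ingress(8080) as s: ...`, the ingress closes automatically even if the debugging session raises an exception.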
Ephemerality is a security control.
Governing model access
Large language models are external systems with cost, compliance and leakage implications. Allowing each runtime to independently manage model credentials fragments oversight.
A centralized gateway restores control. It can restrict approved models, enforce rate ceilings, log prompts and responses, and apply filtering or compliance checks. Agents no longer hold raw provider credentials directly.
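The gateway's responsibilities can be sketched in a few dozen lines. Model names, limits, and the response stub below are all hypothetical:

```python
import time
from collections import deque

class ModelGateway:
    """Single choke point for model access: allowlist, rate ceiling, audit log."""

    def __init__(self, approved_models, max_requests_per_minute=60):
        self.approved = set(approved_models)
        self.limit = max_requests_per_minute
        self.window = deque()   # timestamps of requests in the last minute
        self.audit_log = []     # prompts recorded for compliance review

    def complete(self, model: str, prompt: str) -> str:
        if model not in self.approved:
            raise PermissionError(f"model {model!r} is not approved")
        now = time.time()
        while self.window and now - self.window[0] > 60:
            self.window.popleft()
        if len(self.window) >= self.limit:
            raise RuntimeError("rate ceiling exceeded")
        self.window.append(now)
        self.audit_log.append({"model": model, "prompt": prompt})
        # The real provider call happens here, server-side, so agents
        # never see raw provider credentials.
        return f"<response from {model}>"

gw = ModelGateway(approved_models={"approved-model-a"})
```

Because every request flows through one object, filtering and compliance checks can be added in one place instead of per-agent.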
The lesson from both container escapes and prompt injection incidents is consistent: implicit trust boundaries erode. Centralized governance reinforces them.
Tooling, identity and friction by design
As agents integrate with repositories, CI systems, deployment pipelines and databases, tool governance becomes inseparable from identity discipline.
Dedicated identities per agent, short-lived tokens and strict RBAC or ABAC reduce the impact of compromise. Reusing human or root-level credentials collapses isolation entirely.
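Short-lived, scoped credentials can be illustrated with a signed-claims token. This is a simplified stand-in for a real token service (an STS or OIDC issuer); the key handling and format are assumptions for the sketch:

```python
import base64
import hashlib
import hmac
import json
import secrets
import time

SIGNING_KEY = secrets.token_bytes(32)  # hypothetical per-deployment signing key

def mint_agent_token(agent_id: str, scopes, ttl: int = 900) -> str:
    """Issue a short-lived credential scoped to one agent identity."""
    claims = {"sub": agent_id, "scopes": list(scopes), "exp": time.time() + ttl}
    body = base64.urlsafe_b64encode(json.dumps(claims).encode()).decode()
    sig = hmac.new(SIGNING_KEY, body.encode(), hashlib.sha256).hexdigest()
    return f"{body}.{sig}"

def token_allows(token: str, scope: str) -> bool:
    """Verify signature, expiry, and scope before permitting a tool call."""
    body, sig = token.rsplit(".", 1)
    expected = hmac.new(SIGNING_KEY, body.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(sig, expected):
        return False  # tampered or forged token
    claims = json.loads(base64.urlsafe_b64decode(body))
    return time.time() < claims["exp"] and scope in claims["scopes"]
```

The point is the shape: every tool invocation checks a credential that names one agent, one narrow set of scopes, and a near-term expiry, so a leaked token is worth little.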
Sensitive actions — sending email, modifying production code, accessing secrets, changing authentication — benefit from friction. Policy checks, approval workflows or out-of-band confirmations create deliberate pauses at high-risk boundaries.
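Friction can be as simple as a gate that refuses to execute a high-risk action without an explicit approver. The action names here are illustrative:

```python
# Hypothetical set of actions that must pause for out-of-band approval.
SENSITIVE_ACTIONS = {"send_email", "modify_production", "access_secrets"}

def execute(action, approved_by=None):
    """Run routine actions immediately; park sensitive ones until approved."""
    if action in SENSITIVE_ACTIONS and approved_by is None:
        # The agent's plan halts here; a human or policy service must sign off.
        return {"status": "pending_approval", "action": action}
    return {"status": "executed", "action": action, "approved_by": approved_by}
```

In a real system the "pending" branch would enqueue an approval request; the design point is that the pause is structural, not a prompt-level instruction an injection could talk the model out of.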
Secrets should not live in prompts. System prompts embedded with credentials have been shown to leak under injection pressure. External secret managers and strict separation between model-visible text and credential material materially reduce exposure.
Continuous adversarial testing
Container escape CVEs and public prompt injection demonstrations share a common lesson: systems fail at integration boundaries, not in isolation. Logging tool calls, data access and network egress creates behavioral baselines against which anomalies — unusual domains, atypical file reads, unexpected tool invocation patterns — can be detected early. Red-teaming and adversarial prompt fuzzing help surface injection paths before attackers do, forcing organizations to confront weaknesses under controlled conditions rather than in production.
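The baseline-and-anomaly idea from the paragraph above can be sketched with a simple frequency count over observed egress domains (a real deployment would use richer features and decay, but the mechanism is the same):

```python
from collections import Counter

class EgressMonitor:
    """Build a behavioral baseline of outbound domains and flag never-seen ones."""

    def __init__(self):
        self.baseline = Counter()

    def learn(self, domain: str) -> None:
        """Record an observed (approved) outbound destination."""
        self.baseline[domain] += 1

    def is_anomalous(self, domain: str) -> bool:
        """A domain the agent has never contacted before warrants review."""
        return self.baseline[domain] == 0

mon = EgressMonitor()
for d in ["pypi.org", "pypi.org", "docs.python.org"]:
    mon.learn(d)
```

An alert on `is_anomalous` will be noisy at first and quiet down as the baseline fills in, which is exactly the property you want for catching an injection-driven exfiltration attempt to a fresh domain.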
Agents can build, test, browse and execute arbitrary code. That capability is powerful — and dangerous when unconstrained. Production readiness is therefore defined not by what agents can do, but by how precisely their boundaries are defined, enforced and observed. The organizations that scale agents successfully will treat infrastructure as policy, isolation as a design decision and monitoring as a first-class requirement — not an afterthought.
This article is published as part of the Foundry Expert Contributor Network.
Want to join?
Addressing the challenges of unstructured data governance for AI 21 Apr 2026, 9:00 am
Large enterprises in regulated industries, especially in data-rich financial services and insurance, have invested significantly in data governance programs. Other businesses have been catching up as part of their efforts to become more data-driven organizations. Data governance often starts with defining policies, classifying data sources, establishing data catalogs, and communicating non-negotiables.
But look a little closer at the implementations, and you’ll see much of the focus has been on governing data warehouses, relational data, and other structured data sources. AI has elevated the importance of implementing data governance and establishing guardrails on unstructured data sources used to train language models and provide context to AI agents.
“Unstructured data now makes up the vast majority of enterprise information, and AI is redefining how organizations bring control, accessibility, and security to it,” says Ashish Mohindroo, general manager and senior vice president of Nutanix Database Service platform. “Leaders should ask themselves, ‘Who needs daily access to this data?’ and ‘How can we keep data safe from unauthorized access or accidental loss?’ ” Those are two key questions for all data sources, but they have historically been harder to answer for unstructured ones. I consulted several experts on these complexities and on how AI can ease unstructured data governance challenges.
Context as important as content
Joanne Friedman, CEO of ReilAI, says that organizations must ensure safety through governed autonomy, which requires shifting from static access control to contract-based safety. “Routing messages is not the same as reasoning about them, connecting assets is not the same as understanding them, and reactive telemetry is not the same as choreographed intelligence,” says Friedman.
Structured data sources are a mix of transactional and relational data, supported by mature technologies to improve data quality and manage metadata. Document stores and other NoSQL databases provided better management and search capabilities for unstructured data, but it wasn’t until vector databases and large language models (LLMs) emerged that we had tools to derive meaning from documents at scale.
“When I look at unstructured documents, I focus on the risk that lives inside the content because sensitive details hide in places people never review,” says Amanda Levay, CEO of Redactable. “I expect controls that stop those documents from entering unsafe workflows because exposure often happens before anyone knows the risk exists. I also push for systems that flag when a file carries information that shouldn’t move forward, so teams catch the problem at the moment it matters most.”
It’s a lot easier to define controls for accessing rows of structured financial transactions and customer records than to define rules for unstructured documents, such as contracts and health records. Friedman points out that the rules for unstructured documents are more dynamic, while Levay notes the scale and real-time complexities in evaluating documents.
Governance across the life cycle
Where should one begin implementing governance policies? There are many considerations for data pipelines, source data sets, consuming applications, AI models, and AI agents. Stéphan Donzé, founder and CEO of AODocs, says organizations need strong plumbing. He recommends a governed system that can perform the following tasks:
- Route content to the right models
- Enforce granular permissions
- Map relationships between extracted entities and other taxonomies
- Track implicit versions
- Call in humans when the stakes are high
“Without these capabilities, AI becomes another black box. With them, you unlock an auditable, secure, explainable insight layer for data governance, risk, compliance, and mission-critical decisions at enterprise scale,” says Donzé.
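A minimal sketch of that governed flow, assuming illustrative document fields (`classification`, `kind`, `risk`) that a real system would derive from its catalog and classifiers:

```python
def route(doc, user_perms):
    """Governed routing sketch: permission check, model routing by
    content type, and human escalation when the stakes are high."""
    if doc["classification"] not in user_perms:
        return "denied"                      # enforce granular permissions
    if doc.get("risk", 0) >= 0.8:
        return "human_review"                # call in humans for high stakes
    # route content to the right model for its type
    return "contract_model" if doc["kind"] == "contract" else "general_model"
```

The point is not the three-branch logic but where it lives: in an auditable layer outside the model, where every decision can be logged and replayed.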
Policies need to be implemented consistently across the full data lineage from source through consumption, including the creation of derivative data.
“One of the biggest security challenges with unstructured data is the lack of visibility and lineage as information moves across systems, clouds, and teams,” says Jack Berkowitz, chief data officer at Securiti. “When organizations cannot track where data originated, how it has changed—even what version is active or whether it is still relevant—they increase the risk of exposing sensitive or inaccurate data through genAI applications.”
Using AI to classify and categorize
Extracting knowledge from documents, categorizing them, and then classifying them for user entitlements is complex enough. Add to that the fact that documents are roll-ups of sections and subsections that need independent analysis and must then be related back to the full document’s context.
Consider building construction specifications, which often follow the CSI MasterFormat document standard. CSI MasterFormat has 50 divisions, such as general specifications, electrical, and plumbing. Now consider access controls for this document, given that security is covered in two separate divisions and may require different classifications than other sections, such as equipment. But even that’s not sufficient context, as a general contractor should have different policies for accessing the specifications for a nuclear power plant than for a small office building.
Complex classification challenges are being addressed with AI and advanced algorithms. “Enterprises are shifting toward commodity-driven, API-driven governance accelerators, especially in areas like classification, taxonomy management, and domain-specific labeling,” says Nandakumar Sivaraman, senior vice president and chief architect of enterprise data at Bridgenext. “Instead of manually applying categories, rules, and policies across thousands of assets, companies are now using AI-driven classification APIs to auto-tag and categorize data. They use machine learning–based pattern detection to assign taxonomies, product hierarchies, or entity domains, and implement lightweight governance microservices for real-time classification in ingestion pipelines.”
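Auto-tagging in an ingestion pipeline can be illustrated with a deliberately simple stand-in. A real system would call a trained classification model or API; the keyword taxonomy below is invented to keep the sketch self-contained.

```python
# Illustrative stand-in for AI-driven classification: auto-tag documents
# into taxonomy categories as they enter the pipeline.
TAXONOMY = {
    "electrical": {"voltage", "wiring", "circuit"},
    "plumbing": {"pipe", "drainage", "fixture"},
    "security": {"access control", "surveillance", "alarm"},
}

def auto_tag(text):
    """Return sorted taxonomy tags whose keywords appear in the text."""
    lowered = text.lower()
    return sorted(tag for tag, kws in TAXONOMY.items()
                  if any(kw in lowered for kw in kws))
```

Swapping the keyword test for a model call keeps the pipeline shape identical, which is why these governance microservices are easy to drop into existing ingestion flows.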
Another approach uses vision language models (VLMs) to analyze the document’s visual structure for additional contextual clues. Harpreet Sahota, hacker-in-residence at Voxel51, says VLMs can classify documents without training data, but the bigger issue is that most organizations don’t have consistent taxonomies to begin with. “A first step is to treat documents as images rather than just extracting text, which preserves layout information that is important for understanding structure,” recommends Sahota.
Managing versions and duplicates
Documents can have hundreds of versions and derivatives scattered across SharePoint sites, cloud storage areas, SaaS platforms, and email attachments. One of the more significant unstructured data governance challenges is identifying the latest, accurate versions to include in AI models, retrieval-augmented generation (RAG) systems, and AI agents.
“To improve document versioning, measure the semantic similarity between files and cluster documents that are likely versions of the same document,” says Reece Griffiths, field CTO for Collibra. “Once grouped, apply additional signals, such as last-modified date, metadata, or even title patterns to infer which document in each cluster is the most recent version.”
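Griffiths' approach can be sketched end to end. The example uses bag-of-words cosine similarity purely to stay self-contained; a production system would use embeddings, as would any vector database.

```python
import math
from collections import Counter

def cosine(a, b):
    """Cosine similarity between two texts as bag-of-words vectors."""
    ca, cb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(ca[w] * cb[w] for w in ca)
    norm = (math.sqrt(sum(v * v for v in ca.values()))
            * math.sqrt(sum(v * v for v in cb.values())))
    return dot / norm if norm else 0.0

def latest_version(docs, threshold=0.7):
    """Cluster docs similar to the first one, then pick the most
    recently modified as the presumed latest version."""
    anchor = docs[0]
    cluster = [d for d in docs if cosine(anchor["text"], d["text"]) >= threshold]
    return max(cluster, key=lambda d: d["modified"])
```

Last-modified date is only one signal; as Griffiths notes, metadata and title patterns (`_v2`, `_final`) can break ties when timestamps are unreliable.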
Determining document versions was once a rules-based system with controls for data owners and tools for handling exceptions. Modern systems now incorporate AI to automate or recommend the latest, most accurate documents and suggest which ones to archive.
“Agents excel at processing unstructured data, reading and analyzing the contents of presentations, videos, emails, and chat logs at scale,” says Dr. Michael Wu, chief AI strategist at PROS. “To manage versions, we must combine search and genAI to enhance the practice of ‘search first, search often’ with ‘read all before creating.’ This fosters continuous document evolution, where outdated or incorrect content is naturally updated or flagged for deprecation.”
Document retention policies
Even after duplication is addressed, a key data governance question remains: How to implement document retention policies? “Most organizations have well-defined retention rules for structured data, but applying those same rules to unstructured content has historically been very difficult,” says Griffiths of Collibra. “By performing AI-based tagging of every document according to a retention taxonomy, including record types and subtypes, companies can then query and manage unstructured data with the same precision they apply to structured data sets.”
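Once every document carries a retention tag, the query side becomes ordinary code. The taxonomy and field names below are invented for illustration; real retention periods come from legal requirements.

```python
from datetime import date

# Illustrative retention taxonomy: record type -> retention in years.
RETENTION_YEARS = {"invoice": 7, "marketing_draft": 1, "contract": 10}

def expired(docs, today):
    """Return names of documents past their retention window, assuming
    each doc has an AI-assigned 'record_type' tag and a 'created' date."""
    out = []
    for d in docs:
        years = RETENTION_YEARS.get(d["record_type"])
        if years is None:
            continue  # untagged record types need human review, not deletion
        cutoff = d["created"].replace(year=d["created"].year + years)
        if cutoff <= today:
            out.append(d["name"])
    return out
```

This is the payoff Griffiths describes: unstructured content becomes queryable with the same precision as a structured data set.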
Retention policies tend to follow legal guidelines with specific rules. A more difficult challenge is recognizing outdated information in documents that should no longer be used with AI models and agents.
“AI can age documents the way our minds naturally let older memories fade by noticing declining relevance signals, reduced connections to current work, and changing patterns of use,” says Jason Williamson, CEO of MythWorx. “Instead of a hard cutoff, it adapts continuously, helping organizations surface what’s still meaningful while gently retiring what no longer fits the present.”
Data security from start to finish
Three data disciplines are related: data governance protects the business, data privacy protects people, and data security protects the data. Implementing data security starts with considering how people create and manage documents.
“When you’re dealing with documents at scale, security and governance can’t be separate workflows with handoffs between teams; they become the same integrated workflow, with discovery, classification, and enforcement happening as one coordinated response,” says Rohan Sathe, cofounder and CEO at Nightfall. “Modern platforms need to quarantine inappropriately shared messages, emails, and files the moment they’re detected. They need to revoke over-permissioned access to sensitive documents, prevent unauthorized cloud sync operations, block risky CLI commands, and stop file uploads to unsanctioned destinations—all in real time.”
Since documents feed AI models and AI agents, a second data security consideration is which documents to include and how to protect the data embedded in AI. “The primary risk with AI isn’t just a traditional breach; it’s contextual leakage,” says Nico Dupont, founder and CEO of Cyborg. “Once you ground a model in your enterprise data, that model becomes a potential vector for surfacing sensitive information to unauthorized users, and you cannot rely on the model to be its own gatekeeper. True data security requires inference time governance and treating AI as a new tier of infrastructure where the security is built into the architecture and is as automated as the data cleaning itself.”
A third consideration is how data is protected as people interact with LLMs and AI agents. These must adhere to the user’s access policies and the usage context. “The primary security risk in AI document management is inference exposure, where an AI might correctly answer a question by accessing a sensitive document that the user technically shouldn’t see,” says James Urquhart, field CTO and developer evangelist at Kamiwaza AI. “To mitigate this risk, organizations must understand the relationships between different entities in their business ontologies and implement permission-aware indexing that ensures that AI and agentic systems respect the same access controls that a human would be subject to.”
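Permission-aware indexing can be reduced to one invariant: the retrieval layer never returns a document the querying user could not open directly. A minimal sketch, assuming each indexed hit carries an `acl` set of allowed groups (an illustrative field, not a specific product's schema):

```python
def permission_aware_search(query_hits, user_groups):
    """Filter retrieval hits so an AI or agent only sees documents
    the requesting user is entitled to read."""
    return [hit for hit in query_hits if hit["acl"] & user_groups]
```

Applying the filter at retrieval time, before anything reaches the model's context window, is what closes the inference-exposure gap Urquhart describes.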
One of the most challenging aspects of unstructured data governance is that regulations are evolving and AI capabilities are improving. Policies must evolve as businesses add more data sets, increase AI literacy across their employee base, and expand their AI use cases. Addressing the challenges of unstructured data governance will generate a growing backlog of work for the foreseeable future.
GitHub pauses new Copilot sign-ups as agentic AI strains infrastructure 21 Apr 2026, 8:57 am
GitHub has paused new sign-ups for several individual Copilot plans and tightened usage limits, saying newer agentic coding workflows are consuming far more compute than its original pricing and service model was built to handle.
The move is a reminder that as AI coding assistants grow more autonomous, vendors may have to balance developer demand against infrastructure cost and service reliability.
“As Copilot’s agentic capabilities have expanded rapidly, agents are doing more work, and more customers are hitting usage limits designed to maintain service reliability,” GitHub said in a blog post. “Without further action, service quality degrades for everyone.”
Under the changes, GitHub has paused new sign-ups for its Copilot Pro, Pro+, and Student plans, saying the move will help it better serve existing customers.
The company is also tightening usage limits on individual plans, while positioning Pro+ as the higher-capacity tier with more than five times the limits of Pro for users who need heavier usage.
At the same time, GitHub is narrowing model access: Opus models will no longer be available on Pro plans, while Opus 4.7 will remain on Pro+, and Opus 4.5 and 4.6 are also set to be removed from that tier.
GitHub said it will now show usage limits directly in VS Code and Copilot CLI so users can more easily track how close they are to those caps.
The company added that affected Pro and Pro+ users who contact support between April 20 and May 20 can request a refund and will not be charged for April usage if the updated plans do not meet their needs.
GitHub’s move comes as other AI vendors are also adjusting usage policies to manage capacity, with Anthropic last month changing how Claude’s timed limits work during peak hours while keeping weekly limits unchanged.
Charlie Dai, vice president and principal analyst at Forrester, said the move shows how agent-driven coding is shifting workloads toward longer-running and parallel sessions that create higher and less predictable compute demand.
“Cost structures built for lightweight assistance no longer hold, and this puts pressure on GPU capacity, reliability, and unit economics,” Dai said.
Dai added that similar usage restrictions by major model providers suggest capacity rationing is likely to become a structural feature of the industry as agentic development becomes more routine.
Impact for developers
GitHub said Copilot now operates with both session limits and weekly seven-day limits, and that those caps are based on token consumption and model multipliers rather than just raw request counts. Users may still have premium requests left and yet hit a usage limit, because the two systems are separate.
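Multiplier-based accounting explains why identical request counts can consume very different shares of a cap. The sketch below is a hypothetical illustration; the multiplier values and model names are invented, not GitHub's actual figures.

```python
# Hypothetical usage accounting: consumption is tokens weighted by a
# per-model multiplier, so heavier models burn through a cap faster.
MULTIPLIERS = {"base-model": 1.0, "premium-model": 10.0}

def weighted_usage(events):
    """Sum token usage scaled by each model's multiplier."""
    return sum(e["tokens"] * MULTIPLIERS[e["model"]] for e in events)

def hits_limit(events, weekly_cap):
    """True when weighted consumption reaches the weekly cap."""
    return weighted_usage(events) >= weekly_cap
```

Under this scheme, a long agentic session on a high-multiplier model can exhaust a weekly cap that the same number of lightweight completions would barely dent.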
In practice, that means developers using heavier agent-style workflows, especially long-running or parallel sessions, are more likely to hit limits than those using Copilot for simpler tasks.
GitHub is encouraging users nearing their caps to switch to lower-multiplier models, use plan mode in VS Code and Copilot CLI, and cut back on parallel workflows such as /fleet.
Analysts said the move also reflects a familiar pattern in the tech industry.
“First you give users access to a tool with relatively open usage, and then gradually start defining limits as adoption grows,” said Faisal Kawoosa, founder and chief analyst at Techarc. “GitHub has an unavoidable role in the developer world. A developer can live without an email ID, but not a GitHub account. Such is the depth of its integration. But at the same time, the rationalization of AI/Copilot in the ecosystem is inevitable, as resources are constrained.”
Kawoosa added that developers have now seen what Copilot can do, and there is little reason for GitHub to keep offering it without tighter limits. He said the next step is likely to be more differentiated plans that create clearer monetization opportunities among individual users. For enterprise engineering leaders, Dai said the episode is a reminder to evaluate AI coding tools as metered infrastructure rather than unlimited productivity layers. He said buyers should pay close attention to usage ceilings, downgrade behavior, model entitlements, and how clearly vendors communicate limits and cost controls to developers.
Hackers exploit Vercel’s trust in AI integration 20 Apr 2026, 12:13 pm
Frontend cloud platform Vercel, the creator of Next.js and Turborepo, has warned about a data breach after a compromised third-party AI application abused OAuth to access its internal systems.
A Vercel employee used the third-party app, identified as Context.ai, which allowed the attackers to take over their Google Workspace account and access some environment variables that the company said were not marked as “sensitive.”
“Environment variables marked as ‘sensitive’ in Vercel are stored in a manner that prevents them from being read, and we currently do not have evidence that those values were accessed,” Vercel said in a security post.
The incident exposed the Vercel credentials of what the company described as a “limited subset” of customers. Vercel said it has contacted those customers and asked them to rotate their credentials.
According to reports surfacing on the internet, a threat actor claiming to be ShinyHunters began attempting to sell the stolen data, which allegedly includes access keys, source code, and private databases, even before Vercel confirmed the breach publicly.
Hacking the access
Vercel’s disclosure confirmed that the initial access vector was Google Workspace OAuth tied to Context.ai. Once the application was compromised, attackers inherited the permissions granted to it, including access to the Vercel employee’s account.
It remains unclear whether Context.ai’s infrastructure was compromised, whether OAuth tokens were stolen, or whether a session/token leak within the AI workspace enabled attackers to abuse authenticated access into Vercel’s environments. Context.ai did not immediately respond to CSO’s request for comments.
“We have engaged Context.ai directly to understand the full scope of the underlying compromise,” Vercel said in the post. “We assess the attacker as highly sophisticated based on their operational velocity and detailed understanding of Vercel’s systems. We are working with Mandiant, additional cybersecurity firms, industry peers, and law enforcement.”
Vercel has urged its customers to review activity logs for suspicious behavior and to rotate environment variables, especially any unprotected secrets that may have been exposed. It also recommended enabling sensitive variable protections, checking recent deployments for anomalies, and strengthening safeguards by updating deployment protection settings and rotating related tokens where needed.
Sensitive secrets, including API keys, tokens, database credentials, and signing keys that were not marked as “sensitive,” should be treated as potentially exposed and rotated as a priority, Vercel emphasized.
For worried users, Vercel offered reassurance: “If you have not been contacted, we do not have reason to believe that your Vercel credentials or personal data have been compromised at this time,” the post said.
Allegedly breached by ShinyHunters
According to screenshots circulating on the internet, a threat actor has already claimed the breach on the dark web and is attempting to sell the spoils. “Greetings All, Today I am selling Access Key/ Source Code/ Database from Vercel company,” the actor said in one such post. “Give me a quote if you’re interested. This could be the largest supply chain attack ever if done right.”
The data was put up for $2 million on April 19.
In the screenshot, the threat actor uses a “BreachForums” domain, implicitly claiming to be ShinyHunters, one of the operators of the notorious hacking forum. Other giveaways include a Telegram channel “@Shinyc0rpsss” and an email address “shinysevy@tutamail.com” mentioned in the post.
While recent incidents have hinted at ShinyHunters resurfacing after takedowns and alleged arrests, it remains likely that this is an imposter leveraging the name to lend credibility, something that has precedent.
Making agents dull 20 Apr 2026, 9:00 am
I’ve been arguing for a while now that enterprise AI won’t really take off until it gets boring. Not boring in the sense of uninspired; no, I mean boring in the sense that enterprises can trust it, govern it, observe it, and hand it to rank-and-file employees without undue concern that things will go wrong.
We have no shortage of over-funded startups clamoring to be the next big thing in AI, but not nearly enough that are quietly doing the essential work to make AI safe for enterprise consumption. Enter Stacklok.
On the surface, this might look like yet another startup trying to surf the AI agent wave. It’s not. Stacklok is exciting precisely because its executive team is deeply experienced in being unexciting. Back at Google, Craig McLuckie and Joe Beda were instrumental in the creation of Kubernetes. They took the messy, chaotic world of container orchestration and built an abstraction layer that made it “boring” enough that the largest banks, telcos, and retailers in the world could rely on it with confidence. Now they’re bringing that ability to wring order out of chaos to agentic AI, and they recognize that the real problem in enterprise AI has more to do with operational accountability than model quality.
I interviewed McLuckie and Beda to better understand the opportunity to create a “Kubernetes moment” in agentic AI.
Targeting accountability
McLuckie founded Stacklok in early 2023. Beda, his Kubernetes and later Heptio counterpart, had “semi-retired” in 2022. Beda doesn’t need to make more money, and he’s not joining out of nostalgia. As he tells it, this is “an extraordinary moment in the industry,” with “an opportunity to bring deep expertise in developer platforms and enterprise-grade infrastructure” to solving key enterprise problems.
“The biggest problem,” McLuckie says, “is accountability.” He explains: “An agent, no matter how sophisticated, no matter how capable, no matter how useful, cannot be held accountable for the work it undertakes.” That’s exactly right. A large language model can write code, summarize a contract, file a ticket, or trigger a workflow, but if it mangles customer data, oversteps its permissions, or keeps running after the employee who launched it has left the company, nobody gets to shrug and blame the model. The enterprise still owns the outcome.
Even OpenAI, which has been slower to take the enterprise seriously than Anthropic, now recognizes that enterprises need AI to fit inside workflows, controls, deployment models, and day-to-day operations. It’s no longer just about raw model prowess, as Tom Krazit writes. In other words, the market is slowly rediscovering what infrastructure people have known for a long time: Enterprises may buy capability, but they deploy control.
A related issue, according to Beda, is that AI’s speed changes everything. Tasks that used to take a human days or weeks may soon be completed in minutes by an agent. That doesn’t just create productivity. It creates scale, and scale turns manageable sloppiness into operational disaster. As he puts it, “The volume dial is going to 11 across the board.” I recently said that humans don’t use most of their granted permissions, but agents will. That’s exactly why identity, authorization, and auditability suddenly stop being problems for the security team and become architecture.
This is where the Kubernetes analogy is actually useful, rather than just founder mythmaking.
AI’s Kubernetes moment
Too many people remember Kubernetes as a container story. Enterprises embraced it for a more practical reason: It gave them a common operating model across environments, plus an ecosystem of policy, security, observability, and workflow tools layered on top. Cloud Native Computing Foundation now says 82% of container users run Kubernetes in production, and the organization explicitly frames Kubernetes as the operating system for AI. In our interview, McLuckie describes Kubernetes’ deeper contribution as “self-determination.” That is, it gave enterprises a consistent substrate on premises, at the edge, and in the cloud. That consistency is what helped an ecosystem to flourish around it.
Beda goes one step further: “One of the core ideas in Kubernetes is that you describe what you want to happen, and then you have the system go make it happen.” This, he says, means that Kubernetes is essentially “control theory rendered into software.” Over time, an enterprise’s desired state moves into code, into version control, and into systems traceable back to accountable humans. Nerdy and sort of dull? Sure. But that’s the point. Enterprise AI doesn’t just need smarter models. It needs systems where humans declare intent, machines execute it, and the whole mess remains observable and auditable.
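The declarative pattern Beda describes fits in a few lines. This is a toy reconcile loop in the shape Kubernetes controllers use, with invented workload names; the real thing adds watches, retries, and status reporting.

```python
def reconcile(desired, actual):
    """Return the actions needed to converge actual state toward the
    declared desired state (workload name -> replica count)."""
    actions = []
    for name, want in desired.items():
        have = actual.get(name, 0)
        if have < want:
            actions.append(("scale_up", name, want - have))
        elif have > want:
            actions.append(("scale_down", name, have - want))
    return actions
```

Because intent lives in the `desired` declaration rather than in imperative scripts, it can sit in version control and be traced back to the human who approved it, which is exactly the accountability property enterprises need from agents.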
This is why I keep insisting that the biggest strategic question in agentic AI isn’t whether agents are cool. They are—or at least they can be. No, the real question is who owns the control plane. Stacklok matters because it is explicitly aiming at that layer. The company’s bet is that enterprises want to run and manage Model Context Protocol–based agent infrastructure on the Kubernetes they already know. They want policy, identity, isolation, and observability built in, not bolted on afterwards.
That last part matters because MCP is important, but it isn’t enough. Anthropic introduced MCP in November 2024 as an open standard for connecting AI systems to tools and data. Later, they donated it to the Linux Foundation’s Agentic AI Foundation to keep it neutral and community-driven. It worked. Anthropic reports there are now more than 10,000 active public MCP servers and support across ChatGPT, Cursor, Gemini, Microsoft Copilot, and VS Code.
That’s awesome, but it’s also not enough. Why? Because a protocol isn’t a platform. A protocol can help an agent talk to a tool, but it doesn’t, by itself, tell an enterprise who approved that agent, what data it can touch, how its actions are logged, or how to shut it down safely when the human who launched it has left the company.
Meeting users where they are
That’s where Stacklok’s self-hosted, Kubernetes-native bias starts to look smart rather than stodgy. (Though, again, “stodgy” isn’t a bad thing for risk-averse enterprises.) McLuckie is blunt: “If you’re an enterprise connecting agents to sensitive data, you are almost certainly not comfortable with that data egressing your security domain or being sent to a SaaS endpoint that a vendor controls.” We’ve seen this movie before. When your hosting, identity, tool integration, and policy layers all belong to the same vendor, “choice” starts to mean “replatform.”
No one wants that.
This is also where open source matters, though not in the simplistic sense that open source automatically wins. It doesn’t. Enterprises don’t buy ideology: they buy simplicity. But in a young market, they also value leverage. I’ve written before that open source doesn’t magically redistribute market power. What it can do is give customers options and some control over their fate. In AI, where model switching costs are still relatively low, that optionality matters. Talking with McLuckie and Beda, it’s clear they are open source true believers, but not obnoxiously so. That’s good, because enterprises don’t need a sermon on openness; they just need enough neutrality to avoid getting trapped while the market is still changing underneath them.
It’s all about meeting enterprises where they are and helping them to incrementally move to where they’d like to be. As McLuckie stresses, most enterprise AI teams are being asked to deliver more with AI while running with flat or capped headcount. They don’t need and can’t implement a grand theory of some idealized, fully autonomous enterprise. Instead, they need an accretive (golden) path from here to there using things they already understand, such as containers, isolation, OpenTelemetry, Kubernetes, existing identity systems, and existing observability stacks.
Sound boring? Good!
The opposite of “boring” in enterprise AI isn’t innovation. It’s slideware or demoware that looks great in a keynote but dies on contact with procurement, security review, compliance, and the first ugly bit of enterprise data. McLuckie captures this perfectly: “Vibe-coding a platform for two weeks can produce something plausible. It won’t produce something accurate, hardened, or enterprise-grade.”
Will Stacklok be the company that defines this layer? It’s way too early to say. Markets this young are littered with smart people who were directionally right and commercially wrong. But the company is aiming at the right problem, and that already puts it ahead of a depressingly large percentage of the AI industry.
Again, the next era of enterprise AI will be won by whoever makes agents governable, portable, observable, and boring enough to trust. Kubernetes helped do that for cloud-native infrastructure. Stacklok is betting the same playbook can work for agentic infrastructure. That’s not a nostalgic rerun of Kubernetes. It’s a recognition that enterprises still need what they’ve always needed: not more magic, but a way to control it.
Best practices for building agentic systems 20 Apr 2026, 9:00 am
Agentic AI has emerged as the software industry’s latest shiny thing. Going beyond smarter chatbots, AI agents operate with increasing autonomy, positioning them to drive efficiency gains across enterprises.
“Agentic refers to AI systems that can take actions on behalf of users, not just generate text or answer questions,” says Andrew McNamara, director of applied machine learning at Shopify. Agentic systems run continuously until a task is complete, he adds, citing Shopify’s Sidekick, a proactive agent for merchants.
Development of agentic AI now spans many business domains. According to Anthropic, a provider of large language models (LLMs), AI agents are most commonly deployed in software engineering, accounting for roughly half of use cases, followed by back-office automation, marketing, sales, finance, and data analysis.
“A concrete example is in IT incident resolution,” says Heath Ramsey, group VP of AI platform outbound product management at ServiceNow. In this context, AI agents surface contextual data across systems, check prior resolutions and policies, issue fixes, update records, and loop in team members, he says.
But agent-centered development demands a new form of systems thinking to avoid pitfalls such as indeterminism and token bloat. There are also pressing LLM-derived security gaps, such as a model’s willingness to lie or fabricate information to achieve a goal, a condition researchers call agentic misalignment.
For teams building agents that integrate with other systems and reason through various options to execute multi-step workflows, the proper upfront planning is table stakes. For these reasons and more, agentic architecture design requires a new playbook.
“Building agentic systems requires a fundamentally new architecture, one designed for autonomy, not just automation,” says Anurag Gurtu, CEO of AIRRIVED, an agentic AI platform provider. “Agents need a runtime, a brain, hands, memory, and guardrails.”
Although agentic AI shows promise, ROI from AI is a moving target. Less than half of organizations report a measurable impact from agentic AI experiments, according to Alteryx, with less than a third trusting AI for accurate decision-making.
So, what are the ingredients behind successful enterprise-grade agentic systems? Rather than focusing on how to build within a single vendor platform, let’s explore the common traits across agentic systems to surface practical guidance and lessons learned for developers and architects.
Architectural components of an agentic system
Agentic systems are composed of a handful of building blocks that make it all possible. Together, they form an interconnected web of software architecture, with different components serving different purposes. “Building an AI agent is like constructing a nervous system,” says Ari Weil, cloud evangelist at Akamai.
This system spans layers for reasoning, memory, context-gathering, coordination, validation, and human-in-the-loop guardrails. “Agentic systems rely on a combination of AI, workflow automation, and enterprise controls working together,” adds ServiceNow’s Ramsey.
Reasoning model
First off, if you break down agentic systems into their foundational components, you have to begin with the underlying model.
“A reasoning model sits at the core,” says Frank Kilcommins, head of enterprise architecture at Jentic, builders of an integration layer for AI. This reasoning engine performs the planning based on the user’s prompt, combined with the context-at-hand and available capabilities.
Some reasoning models are better suited than others. “We look for models that feel agentic,” says Shopify’s McNamara. “They have the right amount of tool calls, and have strong instruction following that’s easy to prompt and steer.”
Context and data
Next, an agent needs context. This may take the form of internal company data, institutional knowledge and policies, system prompts, external data, memory of past chats, and agentic metadata, i.e., the user prompts, reasoning steps, and interactions with tools and data sources that allow you to observe and debug the agent’s behavior.
According to Edgar Kussberg, product director for AI, agents, IDE, and devtools at Sonar, sources for data can include databases and APIs, retrieval-augmented generation (RAG) systems and vector databases, file systems and document stores, internal dashboards, or external systems like Google Drive.
Organizations are actively building agentic knowledge bases to organize such data and streamline the retrieval process. Simultaneously, patterns are emerging behind semantic retrieval processes that power agentic context management systems.
“For memory, most teams combine a vector store like pgvector with something structured like a data catalog or knowledge graph,” says Anusha Kovi, a business intelligence engineer at Amazon.
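The hybrid Kovi describes — semantic recall from a vector store plus exact lookups from something structured — can be sketched in plain Python. This is an illustrative toy (all names, embeddings, and data here are hypothetical), not pgvector or a real catalog, but it shows why teams keep both: similarity search answers fuzzy questions, while the structured side answers exact ones.

```python
import math

def cosine(a, b):
    # Cosine similarity between two embedding vectors.
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

class AgentMemory:
    def __init__(self):
        self.vectors = []   # (embedding, text) pairs -- semantic memory
        self.catalog = {}   # key -> value            -- structured facts

    def remember(self, embedding, text, **facts):
        self.vectors.append((embedding, text))
        self.catalog.update(facts)

    def recall(self, query_embedding, top_k=1):
        # Rank stored memories by similarity to the query embedding.
        ranked = sorted(self.vectors,
                        key=lambda item: cosine(item[0], query_embedding),
                        reverse=True)
        return [text for _, text in ranked[:top_k]]

memory = AgentMemory()
memory.remember([1.0, 0.0], "user prefers CSV exports", export_format="csv")
memory.remember([0.0, 1.0], "user is in the EU region", region="eu")

print(memory.recall([0.9, 0.1]))   # semantically nearest memory
print(memory.catalog["region"])    # exact structured lookup
```

In production the embeddings come from a model and the lists become a database, but the split between fuzzy recall and exact facts is the same.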
Tools and discovery
But for agents to be actionable, they need more than just static context — they need read and write access to databases, tools, and APIs.
“Some of the most important work being done to make agents more powerful is happening with the ways we connect AI and existing systems,” says Jackie Brosamer, head of data and AI at Block, the financial services company behind Square and Cash App.
To enable access to such capabilities, the industry has coalesced around the Model Context Protocol (MCP) as a universal connector between agents and systems. MCP registries are emerging to unify and catalog MCP capabilities for agents at scale.
There are numerous public case studies of MCP use within agentic architectures, including Block’s open-source goose agent for LLM-powered software development and Workato’s use of MCP for Claude-powered enterprise workflows.
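At its core, what a protocol like MCP standardizes is tool discovery and uniform invocation. The toy registry below sketches that shape in plain Python — it is not the MCP SDK, and the tool name and data are invented for illustration — but it shows the two operations an agent runtime needs: ask what capabilities exist, then call one through a single entry point regardless of the backing system.

```python
# Hypothetical sketch of the discovery/invocation pattern MCP standardizes.
TOOL_REGISTRY = {}

def tool(name, description):
    """Register a callable so an agent can discover and invoke it by name."""
    def decorator(fn):
        TOOL_REGISTRY[name] = {"description": description, "fn": fn}
        return fn
    return decorator

@tool("get_order_status", "Look up an order's status by ID")
def get_order_status(order_id: str) -> str:
    orders = {"A100": "shipped"}   # stand-in for a real database call
    return orders.get(order_id, "unknown")

def list_tools():
    # Discovery: the agent asks what capabilities are available.
    return {name: meta["description"] for name, meta in TOOL_REGISTRY.items()}

def call_tool(name, **kwargs):
    # Invocation: one uniform entry point, regardless of the backing system.
    return TOOL_REGISTRY[name]["fn"](**kwargs)

print(list_tools())
print(call_tool("get_order_status", order_id="A100"))  # -> shipped
```

A real MCP server adds typed schemas, transport, and auth on top, but the agent-facing contract — enumerate tools, call by name — is the same.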
Defined workflows
Another useful component is having clearly documented workflows for common procedures. These include multi-step actions that are interlinked between MCP servers or direct API calls.
“What matters is that these agents are coordinated through defined workflows,” says ServiceNow’s Ramsey, “so autonomy scales in a predictable and governed way rather than becoming chaotic.”
Jentic’s Kilcommins describes how this can be achieved using “clear, machine-readable capability definitions,” referencing the Arazzo specification, an industry standard from the OpenAPI Initiative, as a method to document such behaviors.
Multi-agent orchestration
On that note, agents must be equipped to integrate with each other and fit well into a continuous feedback loop.
Multi-agent systems typically become necessary at scale, says AIRRIVED’s Gurtu. “Instead of one generalist agent, you often have teams of specialized agents such as reasoning agents, retrieval agents, action agents, and validation agents.”
This reality necessitates connective tissue. “At the core, you need an orchestration layer for the plan-do-evaluate loop,” says Amazon’s Kovi.
Common components for orchestration, adds Kovi, include LangGraph, a low-level orchestration framework, CrewAI, a Python framework for multi-agent orchestration, and Bedrock Agents, for helping agents automate multi-step tasks.
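The plan-do-evaluate loop Kovi names is the control flow those frameworks wrap. A minimal, hypothetical sketch — the planner, executor, and evaluator here are trivial stand-ins for model calls and tool invocations — looks like this:

```python
def run_agent(goal, planner, executor, evaluator, max_steps=5):
    """Minimal plan-do-evaluate loop: plan a step, execute it, check the goal."""
    history = []
    for _ in range(max_steps):
        step = planner(goal, history)     # plan: decide the next action
        if step is None:                  # planner says we're done
            break
        result = executor(step)           # do: invoke a tool or sub-agent
        history.append((step, result))
        if evaluator(goal, history):      # evaluate: is the goal satisfied?
            break
    return history

# Toy components: take `goal` increment steps, then stop.
goal = 3
planner = lambda g, h: ("increment", len(h)) if len(h) < g else None
executor = lambda step: step[1] + 1
evaluator = lambda g, h: len(h) >= g

history = run_agent(goal, planner, executor, evaluator)
print(len(history))  # -> 3
```

Real orchestration layers add persistent graph state, retries, and parallel branches, but the bounded loop with an explicit stop condition is what keeps autonomy from running away.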
Open standards and protocols, like the A2A protocol for agent-to-agent communications, will also be important to enable AI agents to collaborate effectively.
Security and authorization
Given LLMs’ propensity to hallucinate and deviate from expectations, security is perhaps the most important element of building safe agentic systems.
“You’re no longer securing software that suggests, you’re securing software that acts,” says Gurtu. “Once agents can change access, trigger workflows, or remediate incidents, every decision becomes a potential control failure if it isn’t governed.”
According to Kilcommins, the potential blast radius for agentic actions is huge, especially for uncontrolled, chained executions. He recommends clearly defined permissions to avoid privilege escalation and sensitive data exposure.
In agentic systems, nuanced security methods are necessary. “An agent decides at run time what to query and what tools to call, so you can’t scope permissions the traditional way,” adds Kovi. Experts say that just-in-time authorization will be crucial to future-proof the non-human internet.
Kovi adds that safety rules, like “don’t query personal information columns,” shouldn’t live in the prompt window. “Guardrails belong in identity and access management policies and configuration, not just prompt instructions.”
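Kovi's point — guardrails in enforced policy, not prompt text — can be sketched as a check the runtime runs before every tool call. The agent name, tool, and column rules below are hypothetical, but the structure is the one she describes: even if a prompt injection convinces the model to request PII, the policy layer refuses.

```python
# Illustrative policy-as-configuration guardrail; names are hypothetical.
POLICY = {
    "analytics_agent": {
        "allowed_tools": {"run_query"},
        "denied_columns": {"ssn", "email"},   # "don't query PII" as config
    },
}

class PolicyViolation(Exception):
    pass

def authorize(agent, tool, columns=()):
    rules = POLICY.get(agent, {})
    if tool not in rules.get("allowed_tools", set()):
        raise PolicyViolation(f"{agent} may not call {tool}")
    blocked = set(columns) & rules.get("denied_columns", set())
    if blocked:
        raise PolicyViolation(f"{agent} may not read {sorted(blocked)}")

def run_tool(agent, tool, columns):
    authorize(agent, tool, columns)            # enforced before execution,
    return f"{tool} ok on {sorted(columns)}"   # no matter what the prompt says

print(run_tool("analytics_agent", "run_query", ["region", "revenue"]))
try:
    run_tool("analytics_agent", "run_query", ["ssn"])
except PolicyViolation as e:
    print("blocked:", e)
```

In a real deployment this check lives in the identity and access management layer, scoped just-in-time to the task at hand rather than granted up front.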
Human checkpoints
Even with advanced authentication and authorization, sensitive actions will require human approvals.
Shopify defaults to “human-in-the-loop by design,” says McNamara. They’ve adopted approval gates to prevent fully autonomous changes to production systems. This allows merchants to review Sidekick’s AI-generated content before it goes live.
Others take a similar stance, particularly for financial transactions. “Our general rule is that anything touching production systems needs human checkpoints,” says Block’s Brosamer, referring to how user confirmation is a key element of Moneybot, the agent inside Cash App.
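The approval-gate pattern these teams describe reduces to a simple rule: actions tagged as sensitive are held for a human instead of executed autonomously. A hypothetical sketch (the action names and the callback standing in for a review UI are invented):

```python
# Illustrative human-in-the-loop gate; action names are hypothetical.
SENSITIVE_ACTIONS = {"deploy_to_production", "issue_refund"}

def execute(action, approver=None):
    """Run an action, routing sensitive ones through a human checkpoint."""
    if action in SENSITIVE_ACTIONS:
        if approver is None or not approver(action):
            return f"{action}: held for human approval"
    return f"{action}: executed"

print(execute("draft_reply"))                             # low-risk: runs directly
print(execute("issue_refund"))                            # sensitive: held
print(execute("issue_refund", approver=lambda a: True))   # approved: runs
```

The design choice is that the sensitivity list is configuration owned by the platform team, so the agent cannot reclassify its own actions.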
Evaluation capabilities
Building agentic systems also requires a good deal of upfront testing to evaluate whether outcomes match the intended results.
For instance, Shopify performs rigorous pre-deployment evaluation on agentic outputs using both human testing and user simulation with specialized LLM-based judges. “Once your judge reliably matches human evaluators, you can trust it at scale,” says McNamara.
Others agree that evaluations are critical for enterprise-grade agentic systems. “Treat agents like regulated systems,” says Gurtu. “Sandbox changes, and test agents in simulation.”
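The calibration step McNamara describes — trust the LLM judge only once it matches human evaluators — is an agreement measurement over a shared labeled set. A sketch with invented labels and an illustrative threshold:

```python
def agreement_rate(human_labels, judge_labels):
    """Fraction of examples where the LLM judge matches the human label."""
    matches = sum(h == j for h, j in zip(human_labels, judge_labels))
    return matches / len(human_labels)

# Hypothetical evaluation set: humans and the judge label the same outputs.
human = ["pass", "fail", "pass", "pass", "fail"]
judge = ["pass", "fail", "pass", "fail", "fail"]

rate = agreement_rate(human, judge)
print(f"judge/human agreement: {rate:.0%}")   # -> 80%

THRESHOLD = 0.9  # illustrative bar, set per team
print("trust judge at scale" if rate >= THRESHOLD else "keep humans in the loop")
```

Teams often also break agreement down per category, since a judge can match humans overall while failing systematically on one class of output.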
Behavioral observability
Lastly, another core layer is observability. For agentic systems, this must go beyond traditional monitoring or failure detection to capture advanced signals, such as why agents failed, or why they picked certain actions over others.
“Observability must be built in from day one,” says Sonar’s Kussberg. “You need transparency into every step of execution: prompts, tool calls, intermediate decisions, and final outputs.”
With more observable agent behaviors, you can improve the system continuously over time. As Kussberg says, “transparency fuels improvement.”
Context optimization strategies
Nearly all experts agree: giving AI agents minimal, relevant data is far better than data overload. This is critical to avoid maxing out context windows and degrading output quality.
“Thoughtful data curation matters far more than data volume,” says Brosamer. “The quality of an agent’s output is directly tied to the quality of its context.”
At Block, engineers maintain clear README files, apply consistent documentation standards and well-structured project hierarchies, and adhere to other semantic conventions that help agents surface relevant information.
“Agentic systems don’t need more data, they need the right data at the right time,” adds Sonar’s Kussberg. “Effective systems give agents versatile discovery tools and allow them to run retrieval loops until they determine they have sufficient context.”
The prevailing philosophy is to adopt progressive disclosure of information. Shopify takes this to heart, using modular instruction delivery. “Just-in-time context delivery is key,” says McNamara. “Rather than overloading the system prompt, we return relevant context alongside tool data when it’s needed.”
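Kussberg's retrieval loop and McNamara's just-in-time delivery share one control structure: fetch incrementally, and stop as soon as the context is sufficient. A toy sketch (the corpus, the keyword "retrieval," and the sufficiency check are all hypothetical stand-ins for vector search and an LLM's own judgment):

```python
# Hypothetical mini-corpus standing in for a document store.
CORPUS = {
    "refund policy": "Refunds allowed within 30 days.",
    "shipping policy": "Ships in 2 business days.",
    "returns address": "Returns go to the Dublin warehouse.",
}

def retrieve(query, exclude):
    # Stand-in for vector search: first unseen doc whose title matches a query word.
    for title, text in CORPUS.items():
        if title not in exclude and any(w in title for w in query.split()):
            return title, text
    return None

def gather_context(query, is_sufficient, max_rounds=5):
    context = {}
    for _ in range(max_rounds):
        if is_sufficient(context):
            break                          # stop as soon as we have enough
        hit = retrieve(query, exclude=context)
        if hit is None:
            break                          # nothing more to fetch
        context[hit[0]] = hit[1]
    return context

# Sufficiency check: we need the refund policy before answering.
ctx = gather_context("refund returns policy",
                     is_sufficient=lambda c: "refund policy" in c)
print(sorted(ctx))
```

The point of the structure is the early exit: the agent pulls one document and stops, rather than stuffing all three into the prompt up front.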
Context should capture semantic nuances too, says Kovi. “If an agent doesn’t know ‘active users’ means something different in product versus marketing, it’ll give confident wrong answers,” she says. “That’s hard to catch.”
Architectural best practices
There are plenty of additional recommendations regarding agentic systems development. First is the realization that not everything needs to be agentified.
Pairing LLMs and MCP integrations is great for novel situations requiring highly scalable, situationally aware reasoning and responsiveness. But MCP can be overkill for repetitive, deterministic automation, especially when context is static and security is strict.
As such, Kilcommins recommends determining which behavior is adaptive versus deterministic and codifying the latter, allowing agents to initiate intentionally defined programmed behaviors and bringing more stability.
Determining the prime areas for agentic processes also comes down to finding reusable use cases. “Organizations that have successfully deployed agentic AI most often start by identifying a high-friction process,” says Ramsey. This could include employee service requests, new-hire onboarding, or customer incident response, he says.
Gurtu adds that agents perform best when they are given concrete business goals. “Start with decisions, not demos,” he says. “What doesn’t work is treating agents like stateless chatbots or replacing humans overnight.”
Others believe that narrowing an agent’s autonomy yields better results. “Agents work best as specialists, not generalists,” Kussberg says.
For instance, Shopify sets clear boundaries when scaling tools. “Somewhere between 20 and 50 tools the boundaries start to blur,” says McNamara. While some propose separating role boundaries with distinct task-specific agents, Shopify has opted for a sub-agent architecture with low-level tools.
“Our recommendation is actually to avoid multi-agent architectures early,” McNamara says. “We are now getting into sub-agents with the right approach, and one key principle is to build very low-level tools and teach the system to translate natural language to that low-level language, rather than building out tools scenario by scenario.”
Experts share other wisdom for designing and developing agentic systems:
- Use open infrastructure: Open agents and vendor-agnostic frameworks allow you to use the best fit-for-purpose models.
- Think API-first: Good API design and clear, machine-readable definitions better prepare an organization for AI agents.
- Keep data in sync: Keeping shared data in sync is another challenge. Event-driven architectures can keep data fresh.
- Balance access with control: Keeping agentic systems secure will require offensive security exercises, comprehensive audit logs, and defensive data validation.
- Continually improve: To avoid agent drift, agentic systems development will inevitably require ongoing maintenance as the industry and AI technology evolve.
The future for agentic systems
Agentic AI development has moved forward at a blistering pace. Now, we’re at the point where agentic system patterns are beginning to solidify.
Looking to the future, experts anticipate a turn toward more multi-agent systems development, guiding the need for more complex orchestration patterns and reliance upon open standards. Some forecast a substantial overhaul to knowledge work at large.
“I expect that in 2026, we will see experimentation with frameworks to structure ‘factories’ of agents to coordinate producing complex knowledge work, starting with coding,” says Block’s Brosamer. The most challenging aspect will be optimizing existing information flows for agentic use cases, she adds.
One aspect of that future could be more emphasis on alternative clouds and edge-based inference to move certain workloads out of centralized cloud architecture to reduce latency.
“The future of competitive AI demands proximity, not just processing power,” says Akamai’s Weil. “Agents need to act in the real world, interacting with users, devices, and data as events unfold.”
All in all, building agentic systems is a highly complex endeavor, and the practices are still maturing. It will take a combination of novel technologies, microservices-esque design thinking, and security guardrails to take these projects to fruition at scale in a meaningful and sustainable way — all while still granting agents meaningful autonomy.
The future looks agentic. But the smart system design underpinning agentic systems will set apart successful outcomes from failed pilots.
Oracle delivers semantic search without LLMs 17 Apr 2026, 5:07 pm
Oracle says its new Trusted Answer Search can deliver reliable results at scale in the enterprise by scouring a governed set of approved documents using vector search instead of large language models (LLMs) and retrieval-augmented generation (RAG).
Available for download or accessible through APIs, it works by having enterprises define a curated “search space” of approved reports, documents, or application endpoints paired with metadata, and then using vector-based similarity to match a user’s natural language query to the most relevant pre-approved target, said Tirthankar Lahiri, SVP of mission-critical data and AI engines at Oracle.
Instead of retrieving raw text and generating a response, as is typical in RAG systems that rely on LLMs, Trusted Answer Search’s underlying system deterministically maps the query to a specific “match document,” extracts any required parameters, and returns a structured, verifiable outcome such as a report, URL, or action, Lahiri said.
A feedback loop enables users to flag incorrect matches and specify the expected result.
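The mechanism Lahiri describes can be sketched as a nearest-neighbor lookup over a curated set of targets. This is a hedged illustration of the pattern, not Oracle's implementation — the embeddings, targets, and threshold below are toy values — but it shows the deterministic property: a query maps to one approved outcome, or to no answer at all, never to generated text.

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norms = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norms if norms else 0.0

# Curated "search space": each target pairs an embedding with a verifiable
# outcome (a report URL here). Embeddings are illustrative toy values.
SEARCH_SPACE = [
    ([1.0, 0.0], {"title": "Q3 revenue report", "url": "/reports/q3-revenue"}),
    ([0.0, 1.0], {"title": "Headcount dashboard", "url": "/dash/headcount"}),
]

def trusted_answer(query_embedding, min_score=0.5):
    scored = [(cosine(emb, query_embedding), target)
              for emb, target in SEARCH_SPACE]
    score, best = max(scored, key=lambda s: s[0])
    # Deterministic: the same query always maps to the same approved target,
    # or is rejected outright when nothing matches well enough.
    return best if score >= min_score else None

print(trusted_answer([0.95, 0.05]))
```

The rejection threshold is what buys auditability: every answer is traceable to one curated target, and low-confidence queries fail closed instead of hallucinating.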
Lahiri sees a growing enterprise need for more deterministic natural language query systems that eliminate inconsistent responses and provide auditability for compliance purposes.
Independent consultant David Linthicum agreed about the potential market for Trusted Answer Search.
“The buyer is any enterprise that values predictability over creativity and wants to lower operational risk, especially in regulated industries, such as finance and healthcare,” he said.
Trade-offs
That said, the approach comes with trade-offs that CIOs need to consider, according to Robert Kramer, managing partner at KramerERP. While Trusted Answer Search can reduce inference costs by avoiding heavy LLM usage, it shifts spending toward data curation, governance, and ongoing maintenance, he said.
Linthicum, too, sees enterprises adopting the technology having to spend on document curation, taxonomy design, approvals, change management, and ongoing tuning.
Scott Bickley, advisory fellow at Info-Tech Research Group, warned of the challenges of keeping curated data current.
“As the source data scales upwards to include externally sourced content such as regulatory updates or supplier certifications or market updates that are updated more frequently and where the documents may number in the many thousands, the risk increases,” he said.
“The issue comes down to the ability to provide precise answers across a massive data set, especially where documents may contradict one another across versions or when similar language appears different in regulatory contexts. The risk of being served up results that are plausible but wrong goes up,” Bickley added.
Oracle’s Lahiri, however, said some of these concerns may be mitigated by how Trusted Answer Search retrieves content.
Rather than relying solely on large volumes of static, curated documents that require constant updating, the system can treat “trusted documents” as parameterized URLs that pull in dynamically rendered content from underlying systems, according to Lahiri.
Live data sources
This enables it to generate answers from live data sources such as enterprise applications, APIs, or regularly updated web endpoints, reducing dependence on manually maintained document repositories, he said.
Linthicum was not fully convinced by Lahiri’s argument, agreeing only that Oracle’s approach could help reduce content churn.
“In fast-moving domains, keeping descriptions, synonyms, and mappings current still needs disciplined owners, approvals, and feedback review. It can scale to thousands of targets, but semantic overlap raises maintenance complexity,” he said.
Trusted Answer Search puts Oracle in contention with offerings from rival hyperscalers. Products such as Amazon Kendra, Azure AI Search, Vertex AI Search, and IBM Watson Discovery already support semantic search over enterprise data, often combined with access controls and hybrid retrieval techniques.
One key distinction, between these offerings and Oracle’s, according to Ashish Chaturvedi, leader of executive research at HFS Research, is that the rival products typically layer generative AI capabilities on top to produce answers.
Enterprises can evaluate Trusted Answer Search by downloading a package that includes components such as vector search, an embedding model to process user queries, and APIs for integration into existing applications and user interfaces. They can also run it through APIs or built-in GUI applications, included in the package as two APEX-based applications: an administrator interface for managing the system and a portal for end users.
Exciting Python features are on the way 17 Apr 2026, 9:00 am
Transformative new Python features are coming in Python 3.15. In addition to lazy imports and an immutable frozendict type, the new Python release will deliver significant improvements to the native JIT compiler and introduce a more explicit agenda for how Python will support WebAssembly.
Top picks for Python readers on InfoWorld
Speed-boost your Python programs with the new lazy imports feature
Starting with Python 3.15, Python imports can work lazily, deferring the cost of loading big libraries. And you don’t have to rewrite your Python apps to use it.
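For readers on current Python versions, the behavior being promised — defer a module's load until first use — has long been achievable with the standard library's documented `LazyLoader` recipe. The sketch below uses only stdlib machinery; how the 3.15 feature exposes this without wrapper code is described in the linked article.

```python
import importlib.util
import sys

def lazy_import(name):
    """Defer a module's load until first attribute access, using the
    stdlib importlib.util.LazyLoader recipe."""
    spec = importlib.util.find_spec(name)
    loader = importlib.util.LazyLoader(spec.loader)
    spec.loader = loader
    module = importlib.util.module_from_spec(spec)
    sys.modules[name] = module
    loader.exec_module(module)   # registers the module; body runs on first use
    return module

json = lazy_import("json")       # cheap: the module body has not executed yet
print(json.dumps({"ok": True}))  # first attribute access triggers the real load
```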
How Python is getting serious about Wasm
Python is slowly but surely becoming a first-class citizen in the WebAssembly world. A new Python Enhancement Proposal, PEP 816, describes how that will happen.
Get started with Python’s new frozendict type
A new immutable dictionary type in Python 3.15 fills a long-desired niche in Python — and can be used in more places than ordinary dictionaries.
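The new type isn't available to try here, but the immutability semantics the article describes can be previewed with today's closest stdlib analog, `types.MappingProxyType` — a read-only view of a dict (unlike a true frozendict, a proxy is not hashable and only wraps an underlying mutable dict):

```python
from types import MappingProxyType

# Read-only mapping: reads behave like a normal dict, writes are rejected.
config = MappingProxyType({"retries": 3, "timeout": 30})

print(config["retries"])          # reads work as usual
try:
    config["retries"] = 5         # any mutation attempt fails
except TypeError as e:
    print("immutable:", e)
```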
How to use Python dataclasses
Python dataclasses work behind the scenes to make your Python classes less verbose and more powerful all at once.
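A small standard-library example of what the decorator automates — `__init__`, `__repr__`, and field-wise `__eq__` are all generated from the declarations:

```python
from dataclasses import dataclass, field

@dataclass
class Order:
    order_id: str
    quantity: int = 1
    tags: list[str] = field(default_factory=list)  # safe mutable default

    def total(self, unit_price: float) -> float:
        return self.quantity * unit_price

a = Order("A100", quantity=2)
b = Order("A100", quantity=2)
print(a)            # auto-generated repr shows all fields
print(a == b)       # field-wise equality, also auto-generated -> True
print(a.total(10.0))
```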
More good reads and Python updates elsewhere
Progress on the “Rust for CPython” project
The plan to enhance the Python interpreter by using the Rust language stirred controversy. Now it’s taking a new shape: use Rust to build components of the Python standard library.
Profiling-explorer: Spelunk data generated by Python’s profilers
Python’s built-in profilers generate reports in the opaque pstats format. This tool turns those binary blobs into interactive, explorable views.
The many failures that led to the LiteLLM compromise
How did a popular Python package for working with multiple LLMs turn into a vector for malware? This article reveals the many weak links that made it possible.
Slightly off-topic: Why open source contributions sit untouched for months on end
CPython has more than 2,200 open pull requests. The fix, according to this blog, isn’t adding more maintainers, but “changing how work flows through the one maintainer you have.”
When cloud giants neglect resilience 17 Apr 2026, 9:00 am
In a recent article chronicling the history of Microsoft Azure and its intensifying woes, we see a narrative that has been building throughout the industry for years. As cloud computing evolved from a buzzword to the backbone of digital infrastructure, major providers like Microsoft, Amazon, and Google have had to make compromises. Their promises of near-perfect uptime shifted from an expectation to “good enough,” influenced by economic pressures that have seen the cloud giants prioritize cost cuts and staff reductions over previously non-negotiable service reliability.
Frankly, many who follow the cloud space closely, including myself, have been warning about this situation for some time. Cloud outages are no longer rare, freak events. They are ingrained in the model as accepted collateral for the rapid growth and relentless cost-cutting that define this era of cloud computing. The story of Azure, as discussed in the referenced Register piece, is simply the latest and most prominent example of a much larger, industrywide trend.
This is not to say that cloud computing is inherently unstable or that its advantages—agility, scalability, rapid deployment—are a mirage. Enterprises aren’t abandoning the cloud. Far from it. Adoption continues at pace, even as these high-profile outages occur. The question is not whether the cloud is worth it, but rather, how much unreliability is acceptable for all that innovation and efficiency?
The price of cost optimization
If you trace the decisions of major public cloud players, a clear theme emerges. Competitive pressure from rivals translates to constant cost control, rushing services to market, shaving operational budgets, automating wherever possible, and reducing (or outright eliminating) teams of deeply experienced engineering talent who once ensured continuity and institutional knowledge. The comments from a former Azure engineer clearly illustrate how an exodus of talent, paired with an almost single-minded focus on AI and automation, is having downstream effects on the platform’s stability and support.
The irony is sharp: As cloud providers trumpet their AI prowess and machine-driven automation, the human expertise that built and reliably ran these platforms is no longer considered mission-critical. Automation isn’t a cure-all; companies still need experienced architects and operators who understand system limits, manage dependencies, handle failures, and respond deftly to unpredictable failures. Recent major outages reflect the slow but sure loss of that critically embedded human knowledge. Meanwhile, engineering decisions are increasingly made by those tasked with juggling ever-larger portfolios, new feature launches, and cost-reduction mandates, rather than contributing a methodical focus on resilience and craftsmanship.
Azure faces growing pains at scale, with tens of thousands of AI-generated lines of code created, tested, and deployed daily, sometimes by other AI agents, creating a self-reinforcing cycle of complexity and opacity. The resulting “compute crunch” puts even more strain on infrastructure, which, despite its sophistication, now handles heavier loads with fewer people providing oversight.
Outages aren’t driving users away
A natural question emerges: With reliability clearly taking a back seat, why aren’t enterprises reconsidering cloud altogether? I’ve argued for years that the game has changed. The benefits of cloud centralization, automation, and connectivity have become so fundamental to operations that the industry has quietly recalibrated its tolerance for outages. Public cloud is so deeply embedded into the business and digital operations that stepping back would mean undoing years, and often decades, of progress.
Headline-grabbing outages are dramatic but usually survivable. Disaster recovery plans, multi-region deployments, and architectural workarounds are now essentials for all major cloud-based companies. Building with failure in mind is a standard cost, not an avoidable exception. For most CIOs, the persistent risk of downtime is a manageable variable, balanced against the unmatchable benefits of cloud agility and in-house scale.
Providers know this well, and their actions reflect it. Outages may sting a bit in the press, but the real-world consequences have yet to outweigh the benefits to companies that push further into the cloud. As such, the providers’ logic is simple: As long as customers accept outages, however grudgingly, there’s little incentive to switch to costlier, less scalable systems.
How enterprises can adapt
With outages now the price of admission, enterprises should recognize that neither staff cuts nor the blind pursuit of automation will stop anytime soon. Cloud providers may promise improvements, but their incentives will remain focused on cost control over reliability. Organizations must adapt to this new normal, but they can still make choices that reduce their risk.
First, enterprises should prioritize fault-resistant cloud architecture. Adopting multicloud and hybrid cloud strategies, while complex, reduces the technical risk associated with reliance on a single provider.
Second, it’s crucial to invest in in-house expertise that understands both the workloads and the nuances of cloud service behavior. While the providers may treat their operations talent as expendable, nothing will replace the value of an enterprise’s in-house team to independently monitor, test, and prepare for the unexpected.
Finally, enterprises must enforce strict vendor management. This means holding providers accountable for promised service-level agreements, monitoring transparency in communication and incident reporting, and leveraging contracted services to their fullest extent, especially as the cloud market matures and customer influence grows.
The era of the infallible cloud is over. As public cloud providers pursue operational efficiency and AI dominance, resilience has taken a hit, and both providers and users must adapt. The challenge for today’s enterprises is to strategically mitigate the most likely consequences before the next outage strikes.
Anthropic’s latest model is deliberately less powerful than Mythos (and that’s the point) 17 Apr 2026, 2:33 am
Anthropic today released a new, improved Claude model, Opus 4.7, but has deliberately built it to be less capable than the highly anticipated Claude Mythos.
Anthropic calls Opus 4.7 a “notable improvement” over Opus 4.6, offering advanced software engineering capabilities and improved visioning, memory, instruction-following, and financial analysis.
However, the yet-to-be-released (and inadvertently leaked) Mythos seems to overshadow the Opus 4.7 release. Interestingly, Anthropic itself is downplaying Opus 4.7 to an extent, calling it “not as advanced” and “less broadly capable” than the Claude Mythos Preview.
The Opus upgrade also comes on the heels of the launch of Project Glasswing, Anthropic’s security initiative that uses Claude Mythos Preview to identify and fix cybersecurity vulnerabilities.
“For once in technological history, a product is being released with a marketing message that is focused more on what it does not do than on what it does,” said technology analyst Carmi Levy. “Anthropic’s messaging makes it clear that Opus 4.7 is a safer model, with capabilities that are deliberately dialed down compared to Mythos.”
‘Not fully ideal’ in some safety scenarios
Anthropic touts Opus 4.7’s “substantially better” instruction-following compared to Opus 4.6, its ability to handle complex, long-running tasks, and the “precise attention” it pays to instructions. Users report that they’re able to hand off their “hardest coding work” to the model, whose memory is better than that of prior versions. It can remember notes across long, multi-session work and apply them to new tasks, thus requiring less up-front context.
Opus 4.7 has 3x more vision capabilities than prior models, Anthropic said, accepting high-resolution images of up to 2,576 pixels. This allows the model to support multimodal tasks requiring fine visual detail, such as computer-use agents analyzing dense screenshots or extracting data from complex diagrams.
Further, the company reported that Opus 4.7 is a more effective financial analyst, producing “rigorous analyses and models” and more professional presentations.
Opus 4.7 is relatively on par with its predecessor in safety, Anthropic said, showing low rates of concerning behavior such as “deception, sycophancy, and cooperation with misuse.” However, the company pointed out, while it improves in areas like honesty and resistance to malicious prompt injection, it is “modestly weaker” than Opus 4.6 elsewhere, such as in responding to harmful prompts, and is “not fully ideal in its behavior.”
Opus 4.7 comes amidst intense anticipation of the release of Claude Mythos, a general-purpose frontier model that Anthropic calls the “best-aligned” of all the models it has trained. Interestingly, in its release blog today, the company revealed that Mythos Preview scored better than Opus 4.7 on a few major benchmarks, in some cases by more than ten percentage points.
The Mythos Preview boasted higher scores on SWE-Bench Pro and SWE-Bench Verified (agentic coding); Humanity’s Last Exam (multidisciplinary reasoning); and agentic search (BrowseComp), while the two had relatively the same scores for agentic computer use, graduate-level reasoning, and visual reasoning.
Opus 4.7 is available in all Claude products and in its API, as well as in Amazon Bedrock, Google Cloud’s Vertex AI, and Microsoft Foundry. Pricing remains the same as Opus 4.6: $5 per million input tokens, and $25 per million output tokens.
What sets Opus 4.7 apart
Claude Opus is being branded in the industry as a “practical frontier” model, and represents Anthropic’s “most capable intelligent and multifaceted automation model,” said Yaz Palanichamy, senior advisory analyst at Info-Tech Research Group. Its core use cases include complex coding, deep research, and comprehensive agentic workflows.
The model’s core product differentiators have to do with how well-coordinated and composable its embedded algorithms are at scaling up various operational use case scenarios, he explained.
Claude Opus 4.7 is a “technically inclined” platform requiring a fair amount of deep personalization to fine-tune prompts and generate work outputs, he noted. It retains a strong lead over rival Google Gemini in applied engineering use cases, even though Gemini 3.1 Pro has a larger context window (2M tokens versus Claude’s 1M tokens); still, he said, “certain [comparable] models do tend to converge on raw reasoning.”
The 4.7 update moves Opus beyond basic chatbot workflows, and positions it as more of “a copilot for complex, technical roles,” Levy noted. “It’s more capable than ever, and an even better copilot for knowledge workers.” At the same time, it poses less risk, making it a “carefully calculated compromise.”
He also pointed out that the Opus 4.7 release comes just two months after Opus 4.6 was introduced. That itself is “a signal of just how overheated the AI development cycle has become, and how brutally competitive the market now is.”
A guinea pig for Mythos?
Last week, Anthropic also announced Project Glasswing, which applies Mythos Preview to defensive security. The company is working with enterprises like AWS and Google, as well as with 30-plus cybersecurity organizations, on the initiative, and claims that Glasswing has already discovered “thousands” of high-severity vulnerabilities, including some in every major operating system and web browser.
Anthropic is intentionally keeping Claude Mythos Preview’s release limited, first testing new cyber safeguards on “less capable models.” This includes Opus 4.7, whose cyber capabilities are not as advanced as those in Mythos. In fact, during training, Anthropic experimented to “differentially reduce” these capabilities, the company acknowledged.
Opus 4.7 has safeguards that automatically detect and block requests that suggest “prohibited or high-risk” cybersecurity uses, Anthropic explained. Lessons learned will be applied to Mythos models.
This is “an admission of sorts that the new model is somewhat intentionally dumber than its higher-end stablemate,” Levy observed, “all in an attempt to reinforce its cyber risk detection and blocking bona fides.”
From a marketing perspective, this allows Anthropic to position Opus 4.7 as an ideal balance between capability and risk, he noted, but without all the “cybersecurity baggage” of the limited-availability higher-end model.
Mythos may very well be the “ultimate sacrificial lamb” at the root of broader Opus 4.7 mass adoption, Levy said. Even in the “increasing likelihood” that Mythos is never publicly released, it will serve as “an ideal means of glorifying Opus as the one model that strikes the ideal compromise for most enterprise decision-makers.”
Palanichamy agreed, noting that Opus 4.7 could serve as a public-facing guinea pig to live-test and fine-tune the automated cybersecurity safeguards that will ultimately “become a mandatory precursory requirement for an eventual broader release of Mythos-class frontier models.”
This article originally appeared on Computerworld.
Salesforce launches Headless 360 to support agent-first enterprise workflows 16 Apr 2026, 9:00 am
Salesforce is packaging its developer and AI tooling, including its vibe coding environment Agentforce Vibes, into a new platform named Headless 360, designed to help enterprise teams build agent-first workflows.
The CRM software provider defines agent-first workflows as enterprise processes in which software agents, rather than human users, carry out tasks by directly invoking APIs, tools, and predefined business logic.
To support this approach, Headless 360 exposes Salesforce’s underlying data, workflows, and governance controls as APIs, MCP tools, and CLI commands, via its existing offerings, such as Data 360, Customer 360, and Agentforce, Joe Inzerillo, president of AI technology at Salesforce, said during a press briefing.
This allows agents to operate directly on the platform’s existing business logic and datasets, rather than relying on separate integrations or user interfaces, Inzerillo added.
Push to become a control layer for enterprise AI agents
Analysts, however, see Headless 360 as an effort by Salesforce to position itself as a central layer for managing agent-driven operations across different business functions in enterprises, moving from a system of record to being the system of execution.
“Salesforce knows the center of gravity is moving toward coding agents, conversational interfaces, agent harnesses, and external runtimes, so it is trying to keep Salesforce relevant as the system underneath,” said Dion Hinchcliffe, VP of the CIO practice at The Futurum Group.
With Headless 360, Hinchcliffe added, Salesforce is trying to move its positioning beyond “AI agents inside Salesforce” to framing “Salesforce as a programmable platform for agents operating across external tools, interfaces, and environments.”
Analysts warn that CIOs should exercise caution before adopting Headless 360.
Scott Bickley, advisory fellow at Info-Tech Research Group, said modern data stacks can replicate much of Headless 360’s functionality with more flexibility and less vendor concentration.
There are other issues that Bickley thinks should worry CIOs: “There is no mention of cost or the underlying licensing model for this ‘headless’ experience. Are all tools included at no cost?”
“Salesforce’s MO seems to be to announce new capabilities that require SKUs. CIOs should be asking about pricing now, before building in architectural dependencies on features that might land in a premium cost tier,” Bickley cautioned.
Also, the analyst pointed out that Salesforce’s announcement is silent on SLAs for operations such as MCP tool calls, which matter materially for real-time agent workflows.
Incremental gains for developers despite broader concerns
Despite these concerns, Bickley sees some of the new Headless 360 features, although undifferentiated from the competition, as offering practical benefits for developers in their daily tasks.
The analyst was referring to newer updates, such as new MCP tools that give external coding agents full access to Salesforce’s platform, the DevOps Center MCP, the Agentforce Experience Layer, and newer governance features.
Enabling full access for external coding agents, such as Claude Code and Codex, in particular, Bickley said, helps Salesforce meet developers where they are and lets them continue using the tool of their choice.
“Historically, developers were forced into Salesforce’s proprietary toolchain that included clunky VS Code extensions, painful metadata APIs, and quirky development pipelines that required Salesforce-specific expertise. Expanding the dev environment helps alleviate this pain,” Bickley pointed out.
The other updates, according to Hinchcliffe, should help curtail developer friction by helping avoid frequent switching between development tools, expanding real-time awareness of organization data, reducing the need for custom plumbing to expose business logic, and decreasing the effort needed to move from prototype to deployment.
Focusing specifically on the new DevOps Center MCP, which is a set of AI-powered tools that enable the use of natural language across the entire DevOps lifecycle, Bickley said that it will help developers alleviate pains around CI/CD processes.
“Salesforce development pipelines are notoriously fragile with metadata dependencies, org-specific configurations, artificial limits on work items, and UI response issues, among others,” Bickley added.
Concerns around the maturity of governance capabilities
The governance tools, specifically the updates to the Testing Center, Custom Scoring Evals, Session Tracing, and the A/B Testing API, also address real gaps that enterprise development teams face, according to Hinchcliffe, especially when moving agentic workflows or applications into production.
“Salesforce is correctly identifying that enterprise agent adoption will stall unless buyers can properly measure, govern, debug, and tune agent behavior over time,” the analyst said.
However, Bickley cautioned about the efficacy of these tools, as most are in the very early stages of their release. In fact, the analyst suggested that enterprises should expect to supplement them with their own evaluation frameworks for the next 12 to 18 months.
The analyst also flagged additional concerns around newer components such as the Agentforce Experience Layer, which is a new UI service that allows developers to decouple what an agent does from how it surfaces across various services and applications.
“Ironically, this adds yet another layer to contend with in the development process for what is already considered a painful development experience. Salesforce has a pattern of shipping v1 tools that work great in demos but fall short in real-world scenarios,” Bickley said.
“Development teams intending to avail themselves of these new feature sets should insist that Salesforce provide them an extended pilot and sandbox free of charge to validate the maturity level and ease of use of these new features,” Bickley added.
All the updates to Headless 360, Salesforce said, are expected to be released in phases. Generally available features include Agentforce Vibes 2.0, the DevOps Center MCP, Session Tracing, and the Agentforce Experience Layer. Features that are in early access include Custom Scoring Evals. Other features, such as the Testing Center and the Salesforce Catalog, are scheduled for rollout in May and June, respectively.
This story has been updated to correctly identify the Agentforce Experience Layer product and to remove remarks by an analyst about Headless 360’s software dependencies.
The agent tier: Rethinking runtime architecture for context-driven enterprise workflows 16 Apr 2026, 9:00 am
Most large enterprises run on deterministic software foundations. Business rules are embedded within workflows, state transitions are modeled explicitly, and escalation paths are defined up front. System behavior is specified in advance, making outcomes predictable. Meaningful scenarios are encoded as conditional branches and validated before release. For decades, this approach has delivered the reliability and control required for mission-critical operations.
This model assumes most situations can be anticipated and expressed in logic. It works well when variation is limited and conditions remain manageable. If new requirements can be added as workflow branches, the structure holds. It begins to strain when processes must respond to context — not just thresholds, but the broader circumstances of a case.
In my experience, customer onboarding in banking makes this tension visible. Onboarding sits at the intersection of digital channels, fraud detection, regulatory obligations and revenue goals. It must satisfy Know Your Customer (KYC) and Anti-Money Laundering (AML) requirements while minimizing abandonment and resisting synthetic identity attacks.
During my involvement in digital account opening initiatives at a major North American bank, cross-functional design sessions repeatedly surfaced the same trade-off. Product teams pushed to reduce friction and improve conversion while fraud teams responded to bot-driven account creation and mule schemes with additional safeguards. Compliance insisted regulatory standards be met without exception and engineering absorbed each new requirement into the orchestration framework. Individually, these decisions were rational. Collectively, they made the workflow more complex.
The underlying challenge was not a shortage of rules but the difficulty of expressing contextual judgment within a static branching structure. Differentiation occurred only at predefined checkpoints, and information was often collected in bulk rather than adapting to known facts. Collect too little and the institution risks regulatory exposure or fraud; collect too much and abandonment rises. Attempt to encode every variation as additional branches and the workflow becomes increasingly fragile.
Adaptive scoring and contextual models can complement deterministic logic. Rather than enumerating every scenario in advance, they help determine whether additional verification is warranted or whether progression can continue with existing evidence. Deterministic workflows still enforce regulatory requirements and final state transitions; the adaptive layer informs how the system navigates toward those outcomes.
Although onboarding illustrates the issue clearly, the same pattern appears in credit adjudication, claims processing and dispute management. As adaptive signals enter these workflows, the architectural question shifts from adding branches to deciding where contextual judgment should reside. In my view, what is missing is not another conditional path but a different runtime model — one that interprets context and determines the next appropriate action within defined limits. This architectural layer, which I refer to as the Agent Tier, separates contextual reasoning from deterministic execution.
Introducing the agent tier: Separating execution from contextual judgment
In many enterprises, orchestration logic does not reside in a formal workflow platform. It is embedded in SPA applications, implemented in APIs, supported by rule engines and coordinated through service calls across systems. User journeys are assembled through API calls in predefined sequences, with eligibility or routing conditions evaluated at specific checkpoints.
This approach works well for repeatable, well-understood paths. When inputs are complete, risk signals are low and no exception handling is required, the clean path can be executed deterministically. State transitions are known in advance. Service calls follow predictable patterns. Human tasks are invoked at predefined points.
The difficulty arises when the workflow encounters ambiguity. Inputs may be incomplete. Signals may require interpretation rather than simple threshold comparison. Multiple systems may need to be coordinated in a sequence not explicitly modeled. Attempting to encode every such situation into SPA logic or orchestration APIs leads to increasingly complex condition trees and harder-to-maintain code. Instead of expanding hard-coded branching indefinitely, the runtime separates into two complementary lanes: repeatable execution and contextual reasoning.
Conceptually, the enterprise runtime evolves into a two-lane structure, illustrated below.

[Diagram: the two-lane enterprise runtime, deterministic execution and contextual reasoning. Image: Nitesh Varma]
The deterministic lane retains control over authoritative state changes and rule enforcement. It manages eligibility checks, applies regulatory criteria, invokes known service sequences and finalizes cases in core systems. It continues to handle most predictable scenarios.
The runtime invokes the Agent Tier when contextual judgment is required. This may occur when additional evidence must be gathered before a rule can be evaluated, when multiple signals must be interpreted together rather than independently, or when coordination across systems cannot be expressed through a fixed sequence. The Agent Tier evaluates available actions and returns a bounded recommendation that allows deterministic execution to resume.
The movement between lanes is explicit. The deterministic workflow hands off when it reaches a point where static branching is insufficient. The Agent Tier performs synthesis or dynamic coordination. Once the Agent Tier produces a structured result, such as a completed evidence bundle, a validated set of inputs or a recommended next step, control returns to the deterministic lane for controlled progression and final state transition.
This separation allows incremental adoption. Existing SPA logic and orchestration APIs remain intact; ambiguity points can be redirected to the Agent Tier without destabilizing deterministic execution.
What happens inside the agent tier
The Agent Tier is not a single “AI decision.” It is a structured reasoning cycle that combines interpretation with controlled action.
When the deterministic workflow hands off a case, the Agent Tier interprets the current situation by assembling available context — user inputs, existing customer relationships, fraud signals, journey state and relevant policy constraints. Based on that composite view, it selects the next action from an approved set of enterprise capabilities. That action might involve retrieving additional information, invoking a verification service, requesting clarification from the user or coordinating multiple systems in sequence. Once the action completes, the result is evaluated and the cycle continues until deterministic execution can resume.
This alternating pattern of reasoning and action is common in agentic system design. In technical literature, it is often referred to as the ReAct (Reason and Act) pattern, which interleaves reasoning steps with structured action selection. Rather than attempting to reach a final answer in a single pass, the system gathers evidence, reassesses its position and proceeds incrementally. In enterprise settings, this pattern becomes a disciplined way to manage contextual interpretation.
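The reason-act cycle described above can be sketched in a few lines of Python. Everything here is illustrative: the context fields, the action names, and the toy selection policy are invented for the example, not taken from any real system.

```python
# Minimal sketch of a ReAct-style cycle for an onboarding case.
# Field names and actions are illustrative, not a real system's API.

def agent_tier_cycle(context, actions, max_steps=5):
    """Alternate reasoning (choose the next action) with acting (run it)
    until the case is ready to hand back to deterministic execution."""
    for _ in range(max_steps):            # bounded iteration, never open-ended
        action = choose_action(context, actions)  # reasoning step
        if action is None:                # no more evidence to gather
            break
        context = actions[action](context)        # action step
    return context

def choose_action(context, actions):
    # Toy policy: gather missing evidence before recommending progression.
    if not context.get("identity_verified") and "verify_identity" in actions:
        return "verify_identity"
    if context.get("fraud_score") is None and "score_fraud" in actions:
        return "score_fraud"
    return None

# Example governed action set (stubbed enterprise capabilities)
actions = {
    "verify_identity": lambda c: {**c, "identity_verified": True},
    "score_fraud": lambda c: {**c, "fraud_score": 0.12},
}

result = agent_tier_cycle(
    {"identity_verified": False, "fraud_score": None}, actions
)
```

The loop terminates either when the toy policy has nothing left to gather or when the step bound is hit, which mirrors the "bounded iteration" guardrail discussed later in the article.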
Reasoning in the Agent Tier does not involve free-form system access. It proceeds through approved operations exposed via governed interfaces. In practice, these tools are enterprise primitives such as:
- APIs that retrieve or update enterprise data
- event triggers that initiate downstream processing
- workflow actions that advance a case
- controlled service calls into core or third-party systems
Each operation is defined by explicit input/output contracts and permission boundaries and carries metadata describing its purpose and constraints. The runtime selects from this governed catalog — a mechanism commonly referred to as tool calling. Some frameworks further group related tools into higher-level capabilities known as skills, reusable functions for objectives such as identity verification or KYC evidence assembly.
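A governed catalog of this kind can be sketched as follows. The `Tool` fields, the `kyc_lookup` tool, and the role names are hypothetical, but the shape (explicit input/output contract, permission boundary, purpose metadata) follows the pattern described above.

```python
# Sketch of a governed tool catalog: each tool carries an explicit
# input contract and permission metadata. All names are illustrative.
from dataclasses import dataclass
from typing import Callable

@dataclass
class Tool:
    name: str
    purpose: str            # metadata describing intent and constraints
    required_inputs: set    # explicit input contract
    allowed_roles: set      # permission boundary
    run: Callable[[dict], dict]

def call_tool(catalog, name, payload, caller_role):
    """Tool calling through the governed catalog: check permissions and
    the input contract before the operation runs."""
    tool = catalog[name]
    if caller_role not in tool.allowed_roles:
        raise PermissionError(f"{caller_role} may not call {name}")
    missing = tool.required_inputs - payload.keys()
    if missing:
        raise ValueError(f"missing inputs: {missing}")
    return tool.run(payload)

catalog = {
    "kyc_lookup": Tool(
        name="kyc_lookup",
        purpose="Retrieve KYC evidence for an applicant",
        required_inputs={"applicant_id"},
        allowed_roles={"onboarding_agent"},
        run=lambda p: {"applicant_id": p["applicant_id"], "kyc_status": "clear"},
    ),
}

result = call_tool(catalog, "kyc_lookup", {"applicant_id": "A-17"}, "onboarding_agent")
```

A "skill" in this sketch would simply be a function that sequences several `call_tool` invocations toward one objective, such as assembling a KYC evidence bundle.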
Before control returns to the deterministic lane, the agentic runtime can also perform a structured self-check. It can verify that required conditions are satisfied, confirm alignment with policy constraints and ensure that any necessary approvals have been identified. In technical discussions, this is often described as reflection.
Taken together, these patterns do not introduce unchecked autonomy. They provide a structured way to manage contextual synthesis and dynamic coordination without allowing adaptive logic to diffuse across SPA code and orchestration services. Deterministic systems continue to enforce authoritative state transitions. The Agent Tier prepares the conditions under which those transitions occur.
In many implementations, the Agent Tier does not directly control the workflow. Instead, it recommends the next step based on the available context. The deterministic tier remains responsible for execution. After each step is completed — retrieving evidence, invoking a verification service or preparing a review case — the updated context is returned to the Agent Tier, which evaluates the new state and recommends the next action. In this model, contextual reasoning informs progression while deterministic systems continue to enforce authoritative state transitions.
Returning to the onboarding example, the Agent Tier changes how the journey adapts to each applicant. The deterministic tier still executes core steps such as creating the customer profile, enforcing regulatory checks and committing account state in core systems. The Agent Tier evaluates the evolving context — customer relationships, fraud signals, identity verification results and available documentation — and recommends whether the workflow can proceed along the clean path, trigger additional verification or escalate to manual review. The result is not a new onboarding process but a workflow that adapts its progression dynamically while preserving the deterministic controls required for regulated operations.
Conceptually, the interaction between contextual reasoning and deterministic execution can be understood as a simple runtime loop, as illustrated below.

[Diagram: the runtime loop between contextual reasoning and deterministic execution. Image: Nitesh Varma]
The workflow progresses through a continuous loop in which contextual reasoning recommends the next step, deterministic systems execute it and the resulting context feeds back into the next recommendation.
Governing adaptive systems without losing control
Separating contextual reasoning from deterministic execution clarifies responsibility but does not eliminate risk. In regulated environments, adaptive sequencing must operate within explicit governance boundaries.
The trust and operations overlay represents cross-cutting controls across the runtime: audit logging, approval gates, observability, security enforcement, and lifecycle management. Within this structure, authoritative state transitions remain deterministic. Core systems continue to create client profiles, enforce limits, record disclosures and apply regulatory thresholds. The Agent Tier may influence progression, but final state changes occur only through controlled interfaces.
This containment boundary preserves explainability. When progression changes — for example, when additional verification is triggered or escalation occurs — institutions must be able to reconstruct why. Which signals were assembled? Which tools were invoked? What reasoning produced the recommendation? Concentrating contextual evaluation within a defined runtime layer makes that traceability possible.
Operational experience reinforces the need for these guardrails. Engineering discussions of production agent systems emphasize constrained tool access, explicit action catalogs, bounded iteration and strong observability. In enterprise environments, contextual reasoning must likewise operate through governed tools and visible control points.
Approval gates remain part of this structure. High-risk actions such as credit issuance, account restrictions, large payments or regulatory filings may still require human authorization regardless of how the progression was determined. Reflection inside the Agent Tier can validate readiness, but authorization remains explicit.
Lifecycle discipline is equally important. Changes to models, identity providers, tool contracts or orchestration logic can alter workflow behavior. The Agent Tier should therefore operate as a governed platform capability with versioned reasoning logic, controlled tool catalogs and defined testing and rollback mechanisms.
The objective is not to eliminate probabilistic reasoning but to contain it within observable workflows and governed boundaries. As adaptive capabilities expand, the architectural question is not whether contextual reasoning will exist, but whether it is diffused across the stack or concentrated within a controlled runtime layer.
Architectural leadership in an adaptive era
Introducing an Agent Tier adds a new runtime component, but enterprise complexity is not new; it is already dispersed across channel code, orchestration services, rule engines and proliferating conditional branches. The architectural question is not whether complexity exists, but where it resides. As fraud models evolve, verification technologies improve and regulatory expectations shift, adaptive capabilities will continue to expand.
I believe architecture must evolve from enumerating state transitions to defining containment boundaries. Deterministic systems enforce regulatory and operational requirements and remain responsible for authoritative state changes. Adaptive reasoning operates within explicit policy constraints and informs how workflows progress toward those outcomes. Instead of encoding every possible path in advance, enterprises can move toward context-driven workflows in which deterministic execution handles authoritative actions while the Agent Tier determines the next appropriate step based on evolving context.
This evolution does not require wholesale reinvention. It can begin with a single high-impact workflow where contextual variability is already evident. By introducing a disciplined runtime layer that mediates uncertainty while preserving deterministic control, organizations can modernize incrementally. In that sense, the Agent Tier is not simply a new feature; it is a structural response to a changing runtime reality, one that allows adaptive systems to operate within clear architectural and governance boundaries.
This article is published as part of the Foundry Expert Contributor Network.
Ease into Azure Kubernetes Application Network 16 Apr 2026, 9:00 am
If you’re using Kubernetes, especially a managed version like Azure Kubernetes Service (AKS), you don’t need to think about the underlying hardware. All you need to do is build your application and it should run, its containers managed by the service’s orchestrator.
At least that’s the theory. However, implementing a platform that abstracts your code from the servers and network that support it brings its own problems, and a whole new discipline. Platform engineers fill the gap between software and hardware, supporting security and networking, as well as managing storage and other key services.
Kubernetes is part of an ecosystem of cloud-native services that provide the supporting framework for running and managing scalable distributed systems, including the tools needed to package and deploy applications, as well as components that extend the functionality of Kubernetes’ own nodes and pods.
Key components of this growing ecosystem are the various service meshes. These offer a way to manage connectivity between nodes and between your applications and the outside network, with tools for handling basic network security. Often implemented as “sidecar” containers, running alongside Kubernetes pods, these network proxies can consume added resources as your applications scale. That means more configuration and management, ensuring that configurations are kept up-to-date and that secrets are secure.
Istio goes ambient
One of the key service mesh implementations, Istio, has developed an alternate way of operating, what the project calls “ambient mode”. Here, instead of having individual sidecars for each pod, your service mesh is implemented as per-node proxies or as a single proxy that supports an entire Kubernetes namespace. It’s an approach that allows you to start implementing a service mesh without increasing the complexity of your platform, making it easy to go from a basic development Kubernetes implementation to a production environment without having to change your application pods.
It’s called ambient mode because there’s no need to add new service mesh elements as your application scales. Instead, the service mesh is always there, and your pods simply join it and take advantage of the existing configuration. The resulting implementation is both easier to use and easier to understand.
Microsoft has used Istio as part of Azure Kubernetes Service for many years. Istio is one of a suite of open-source tools that provide the backbone of Azure’s cloud-native computing platform.
Introducing Azure Kubernetes Application Network
So, it’s not surprising to learn that Microsoft is using Istio’s ambient mesh as the basis of Azure Kubernetes Application Network. The new service (available in preview) allows application developers to add managed network services to their applications without needing the support of a platform engineering team to implement a service mesh. It will even help you migrate away from the now-deprecated ingress-nginx by providing access to the recommended Kubernetes Gateway API without needing more sidecars and letting you use your existing ingress-nginx configurations while you complete your migration.
Microsoft describes the preview of Azure Kubernetes Application Network as “a fully managed, ambient-based service network solution for Azure Kubernetes Service (AKS).” The underlying data and control planes are managed by AKS, so all you need to do is connect your AKS clusters to an Application Network and AKS will then manage the service mesh for you, without any changes to your applications.
Like other implementations of Istio’s ambient mesh, there are two levels to Application Network: a core set of node-level application proxies that handle connectivity and security for application services, and an optional set of lower-level proxies that support routing and apply network policies, acting as a software-defined network inside your Kubernetes environment.
This approach lets you build and test a Kubernetes application on your local development hardware without using Application Network features, then deploy it to AKS along with the required network configuration — simplifying both development and deployment. It also reduces development overheads, both in compute and developer resources.
Using Azure Kubernetes Application Network
Once deployed, Application Network connects the services in your application securely, automatically managing encrypted connections and the required certificates. It can also support unencrypted connections for when you aren’t sending confidential data and don’t need the associated overhead. As the service is managed by AKS, new pods are automatically provisioned as they are deployed, with the ambient mesh supporting both scale-up and scale-down operations.
The architecture of Application Network is much like that of an Istio ambient mesh. The main difference is that the service’s management and control planes are managed by Azure, with application owners limited to working with the service’s data plane, configuring operations and setting policies for their application workloads. Azure’s control of the management plane automates certificate management, ensuring that connections stay secure and there is little risk of certificate expiration, using the tools built into Azure Key Vault.
The Application Network data plane holds proxies and gateways used by the service mesh, and these are deployed when the service is launched, along with the required Kubernetes configurations. The key to operation is ztunnel, a proxy that intercepts inter-service requests, secures the connection, and routes requests to another ztunnel running with the destination service. A gateway oversees connections between ztunnels running in remote clusters, allowing your service mesh to scale out with demand.
Building your first ambient service mesh in AKS
Getting started with Azure Kubernetes Application Network requires the Azure CLI. If you’re working with an existing AKS cluster, then you will need to enable integration with Microsoft Entra and enable OpenID Connect.
As the Application Network service is in preview, start by registering it in your account. This can take some time, but once it’s registered you can install the AppNet CLI extension that’s used to manage and control Application Network for your AKS clusters. You can now start to set up the ambient service mesh, either creating new clusters to use it, or adding the service mesh to existing AKS deployments.
Starting from scratch is the easiest way, as it ensures that you’re running in the same tenant. AKS clusters and Application Network can be in the same resource group if you want, but it’s not necessary. You’re free to use separate resource groups for management.
The appnet command makes it easy to create an Application Network from the command line; all you need is a name for the network, a resource group, a location, and an identity type. Once you’ve run the command to create your ambient mesh, wait for the mesh to be provisioned before joining a cluster to your network. Joining simply needs a name for the member along with the cluster’s name and resource group. At the same time, you define how the network will be managed, i.e., whether you manage upgrades yourself or leave Azure to manage them for you. Additional clusters can be added to the network the same way.
With an Application Network and member clusters in place, the next step is to use Kubernetes’ own tooling to add support for the ambient mesh to your applications. Microsoft provides a useful example that shows how to use Application Network with the Kubernetes Gateway API to manage ingress. You need to use kubectl and istioctl commands to enable gateways and verify their operation, adding services and ensuring that they are visible to each other through their respective ztunnels.
Securing applications with policies
Policies can be used to control access from the application ingress to specific services as well as between services, reducing the risk of breaches and ensuring that you control how traffic is routed in your application. These policies can be locked down so that only specific methods can be used, for example allowing only HTTP GET operations on a read-only service and POST where data needs to be delivered. Other options can be used to enforce OpenID Connect authorization at a mesh level.
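As a rough illustration, method-level policy of this kind reduces to a mapping from service to permitted methods. This is a toy sketch, not the actual Application Network policy schema, and the service names are invented.

```python
# Toy evaluation of method-level access policies, mirroring the idea of
# allowing only GET on a read-only service. Not the real policy format.

policies = {
    # service -> HTTP methods permitted from the application ingress
    "catalog-api": {"GET"},            # read-only service
    "orders-api": {"GET", "POST"},     # accepts submitted data
}

def is_allowed(service, method):
    """Deny by default: a service with no policy entry accepts nothing."""
    return method in policies.get(service, set())
```

The deny-by-default lookup is the important design choice: traffic a policy doesn’t explicitly permit is simply refused.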
Not all Azure Kubernetes clusters are supported in the preview, which is only available in Azure’s largest regions. For now, Application Network won’t work with private clusters or with Windows node pools. Once running, you can’t switch upgrade modes, and as it’s based on Istio, you can’t enable Istio service meshes in your cluster. These requirements aren’t showstoppers, and you should be able to start experimenting with the service while it’s still in preview.
AKS Application Network is a powerful tool that helps simplify and secure the process of building and running inter-cluster networks in an AKS application. As it is an ambient service, it scales as necessary and can provide secure bridges between clusters. By working at the Kubernetes level, Application Network can provide policy-driven production network rules, allowing developers to build and test code in unrestricted environments before moving to test and production clusters.
As Application Network uses familiar Kubernetes and Istio constructions, it’s possible to build configurations into Helm charts and other deployment tools, ensuring configurations are part of your build artifacts and that network configurations and policies are delivered with your code every time you push a new build – without needing platform engineering support.
The two-pass compiler is back – this time, it’s fixing AI code generation 16 Apr 2026, 9:00 am
If you came up building software in the 1990s or early 2000s, you remember the visceral satisfaction of determinism. You wrote code. The compiler analyzed it, optimized it, and emitted precisely the machine instructions you expected. Same input, same output. Every single time. There was an engineering rigor to it that shaped how an entire generation of developers thought about building systems.
Then large language models (LLMs) arrived and, almost overnight, code generation became a stochastic process. Prompt an AI model twice with identical inputs and you’ll get structurally different outputs—sometimes brilliant, sometimes subtly broken, occasionally hallucinated beyond repair. For quick prototyping that’s fine. For enterprise-grade software—the kind where a misplaced null check costs you a production outage at 2am—it’s a non-starter.
We stared at this problem for a while. And then something clicked. It felt familiar, like a pattern we’d encountered before, buried somewhere in our CS fundamentals. Then it hit us: the two-pass compiler.
A quick refresher
Early compilers were single-pass: read source, emit machine code, hope for the best. They were fast but brittle—limited optimization, poor error handling, fragile output. The industry’s answer was the multi-pass compiler, and it fundamentally changed how we build languages. The first pass analyzes, parses, and produces an intermediate representation (IR). The second pass optimizes and generates the final target code. This separation of concerns is what gave us C, C++, Java—and frankly, modern software engineering as we know it.

The structural parallel between classical two-pass compilation and AI-driven code generation.
WaveMaker
The analogy to AI code generation is almost eerily direct. Today’s LLM-based tools are, architecturally, single-pass compilers. You feed in a prompt, the model generates code, and you get whatever comes out the other end. The quality ceiling is the model itself. There’s no intermediate analysis, no optimization pass, no structural validation. It’s 1970s compiler design with 2020s marketing.
Applying the two-pass model to AI code generation
Here’s where it gets interesting. What if, instead of asking an LLM to go from prompt to production code in one shot, you split the process into two architecturally distinct passes—just like the compilers that built our industry?
Pass 1 is where the LLM does what LLMs are genuinely good at: understanding intent, decomposing design, and reasoning about structure. The model analyzes the design spec, identifies components, maps APIs, resolves layout semantics—and emits an intermediate representation, an IR. Not HTML. Not Angular or React. A well-defined meta-language markup that captures what needs to be built without committing to how.
This is critical. By constraining the LLM’s output to a structured meta-language rather than raw framework code, you eliminate entire categories of failure. The model can’t inject malformed tags if it’s not emitting HTML. It can’t hallucinate nonexistent React hooks if it’s outputting component descriptors. You’ve reduced the stochastic surface area dramatically.
Pass 2 is entirely deterministic. A platform-level code generator—no LLM involved—takes that validated intermediate markup and emits production-grade Angular, React, or React Native code. This is the pass that plugs in battle-tested libraries, enforces security patterns, and applies framework-specific optimizations. Same IR in, same code out. Every time.
First pass gives you speed. Second pass gives you reliability. The separation of concerns is what makes it work.
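To make the division of labor concrete, here is a toy sketch in JavaScript. The IR shape, the component names, and the HTML target are invented for illustration (a real Pass 2 would emit framework code, not HTML): unknown component types are stripped at the IR boundary, and the generator is a pure function, so the same IR always yields the same output.

```javascript
// Toy IR vocabulary: only these component types are considered valid.
const ALLOWED_TYPES = new Set(['button', 'text']);

// IR boundary: hallucinated component types never reach code generation.
function validateIR(ir) {
  return ir.filter((node) => ALLOWED_TYPES.has(node.type));
}

// Deterministic Pass 2: same validated IR in, same markup out, every time.
function generateMarkup(ir) {
  return validateIR(ir)
    .map((node) =>
      node.type === 'button'
        ? `<button>${node.label}</button>`
        : `<span>${node.label}</span>`
    )
    .join('');
}
```

A node like { type: 'hologram', label: 'x' } that an LLM might invent is simply dropped before generation, which is the point of validating at the IR boundary rather than patching generated code afterwards.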
Why this matters now
The advantages of this architecture compound in exactly the ways that matter for enterprise development. The meta-language IR becomes your durable context for iterative development—you’re not re-prompting the LLM from scratch every time you refine a component. Security concerns like script injection and SQL injection are structurally eliminated, not patched after the fact. Hallucinated properties and tokens get caught and stripped at the IR boundary before they ever reach generated code. And because Pass 2 is deterministic, you get reproducible, auditable, deployable output.
Pass 1 (LLM-powered):
- Translates design/spec to structured components and design tokens
- Enables iterative development with meta-markup as persistent context
- Eliminates script/SQL injection by design

Pass 2 (deterministic):
- Generates optimized, secure, performant framework code
- Validates and strips hallucinated markup and tokens
- Plugs in battle-tested libraries for reliability
If you’ve spent your career building systems where correctness isn’t optional, this should resonate. The industry spent decades learning that single-pass compilation couldn’t produce reliable software at scale. The two-pass architecture wasn’t just an optimization, but an engineering philosophy: separate understanding from generation, validate before you emit, and never let a single phase carry the entire burden of correctness.
We’re at the same inflection point with AI code generation right now. The models are powerful. The architecture around them has been naive. The fix isn’t to wait for a smarter model. It’s to apply the engineering discipline we’ve always known, and build systems where stochastic brilliance and deterministic reliability each do what they do best—in the right pass, at the right time.
Deterministic software engineering is cool again. Turns out it never really left.
—
New Tech Forum provides a venue for technology leaders—including vendors and other outside contributors—to explore and discuss emerging enterprise technology in unprecedented depth and breadth. The selection is subjective, based on our pick of the technologies we believe to be important and of greatest interest to InfoWorld readers. InfoWorld does not accept marketing collateral for publication and reserves the right to edit all contributed content. Send all inquiries to doug_dineley@foundryco.com.
MuleSoft Agent Fabric adds new ways to keep AI agents in line 15 Apr 2026, 6:20 pm
Salesforce first sought to tackle AI agent sprawl last year with Agent Fabric, a suite of capabilities and tools inside its MuleSoft Anypoint Platform. Now it’s seeking to further rein in unruly AI agents, both on its own platform and on those of other vendors, with new governance tools and deterministic controls.
When enterprises adopt multiple agentic AI products, they can end up with redundant or siloed workflows scattered across teams and platforms, undermining operational efficiency and complicating governance as they try to scale AI safely and responsibly.
Agent Fabric, introduced in September 2025, started out as a place for enterprises to register, view, interconnect and govern agents. In January it added a deterministic scripting tool and the ability to scan for new agents and add them to the registry.
But enterprises still need more help to bring their AI agents under control, so Salesforce is adding more features.
First up is an expansion of the deterministic controls in the form of Agent Script for Agent Broker. Agent Broker is an intelligent routing service inside Agent Fabric, designed to connect agents across domains by dynamically matching user tasks with the best-fit agent; Salesforce said the new controls will help developers codify workflows in multi-agent systems in order to ensure consistent and reliable outputs.
Rather than leave probabilistic agents to make all the decisions about how to resolve a problem, introducing an element of unpredictability, Agent Script for Agent Broker enables enterprises to steer some of the decision-making according to predetermined rules that require fewer computing resources than running a large language model.
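This is not MuleSoft’s actual Agent Script syntax, but the deterministic idea reduces to ordered rules evaluated before any model is consulted. A minimal sketch, with invented rule and agent names:

```javascript
// Deterministic routing: evaluate predetermined rules in order.
// Only tasks that match no rule fall through to the probabilistic broker.
function routeTask(task, rules) {
  for (const rule of rules) {
    if (rule.match(task)) return rule.agent;
  }
  return null; // null: defer to LLM-based routing
}

// Example rule set (names are hypothetical).
const rules = [
  { match: (t) => t.includes('refund'), agent: 'billing-agent' },
  { match: (t) => t.includes('reset password'), agent: 'it-helpdesk-agent' },
];
```

The win is exactly the trade described above: rule evaluation is cheap and predictable, and the LLM is only invoked for the genuinely ambiguous remainder.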
That’s welcome news for Robert Kramer, managing partner at KramerERP.
“Pure autonomous agents don’t necessarily work in production as enterprises need to ensure predictable outcomes. The deterministic controls should facilitate a secure handoff of control and rules while still allowing the model to engage in reasoning when it’s appropriate,” he said. “It’s a balance between control and flexibility, which is the norm for most real deployments.”
For Rebecca Wettemann, principal analyst at Valoir, providing both deterministic and probabilistic options within Agent Fabric enables developers and agent builders to take the lower-cost route to more accurate and predictable results from agentic systems.
Enterprises will have to wait to put this deterministic orchestration feature into production, though: Still in beta testing, it won’t be generally available until June 2026.
Centralized LLM governance tackles cost
Beyond orchestration, Salesforce has added a new LLM Governance capability in AI Gateway, the control layer within Agent Fabric that provides centralized visibility into token usage, costs, and data flows for third-party models.
Enterprises will be able to use LLM Governance, now generally available, to help them keep their AI operations on budget, Salesforce said.
This is becoming increasingly important as CIOs seek to bring disparate AI systems under centralized control and justify spiralling AI costs.
Info-Tech Research Group advisory fellow Scott Bickley warned that without centralized governance like this, different teams around a company may choose different models, negotiate their own API contracts, and manage token budgets locally.
“This results in sprawling costs, inconsistent security postures, and no enterprise-wide policy enforcement,” he said. “By positioning AI Gateway as the choke point through which all LLM traffic flows, enterprises gain visibility into AI usage patterns, the models in use, purpose of the usage, and cost data.”
MCP additions simplify integration
Salesforce is also adding new Model Context Protocol (MCP) features, including MCP Bridge, which makes it easier to access legacy APIs, and Informatica-hosted MCPs, which it says will simplify how agents interact with enterprise data and APIs.
These could save developers time and simplify the building of cross-environment, multi-agent systems.
Bickley said MCP Bridge will help enterprises with thousands of legacy APIs (REST, SOAP, GraphQL) built long before MCP existed.
“Agents speaking MCP cannot call those APIs natively so they require wrappers around the API endpoint; this would be a massive engineering lift. MCP Bridge allows these APIs to be exposed as MCP-compatible tools without modifying the underlying code,” he said.
And Wettemann said Informatica-hosted MCPs will further reduce development overhead by bringing built-in data quality and governance capabilities into agent workflows, which is particularly critical for enterprises in regulated industries and those with heightened risk concerns.
But Bickley added a note of caution. “APIs can behave oddly and have their own nuanced behavior,” he said. “Enterprises should test how MCP Bridge handles edge cases.”
Informatica-hosted MCPs will not be a miracle solution either, he warned: “Even if the Informatica data quality and governance capabilities are cleanly integrated in the Agent Fabric registry, these are not instantaneous operations. Checking data fields for accuracy, deduplication, and cross-system matching takes time and carries latency measured in milliseconds or even multiple seconds, and that is pre-integration.”
A pivot for MuleSoft?
Bickley sees the updates as part of a broader strategy by Salesforce to reposition MuleSoft, which it acquired in 2018 for $5.7 billion, from a traditional API integration platform to an infrastructure layer for enterprise AI agents.
By layering orchestration, governance, and connectivity into Agent Fabric, Salesforce appears to be trying to position MuleSoft as the system of record for how agents are discovered, routed, and governed across the enterprise, deepening its role beyond API management into core AI infrastructure, he said.
Not all CIOs will welcome that move.
“If your agent control plane runs on Agent Fabric, switching costs rise materially, and the more agents you register, the more orchestration rules and governance policies defined, the more difficult it becomes to move to an alternative solution,” the analyst said.
As with any critical infrastructure dependency, “CIOs need to ask: What is the exit path? What components of Agent Fabric are portable and what is locked in? What’s the pricing model? What is the integration depth with non-Salesforce agents and data sources?” he said.
For now, though, enterprises have plenty of AI agent orchestration options to choose from.
Tap into the AI APIs of Google Chrome and Microsoft Edge 15 Apr 2026, 9:00 am
With every passing year, local AI models get smaller, more efficient, and more comparable in power with their higher-end, cloud-hosted counterparts. You can run many of the same inference jobs on your own hardware, without needing an internet connection or even a particularly powerful GPU.
The hard part has been standing up the infrastructure to do it. Applications like ComfyUI and LM Studio offer ways to run models locally, but they’re big third-party apps that still require their own setup and maintenance. Wouldn’t it be great to run local AI models right in the browser?
Google Chrome and Microsoft Edge now offer that as a feature, by way of an experimental API set. With Chrome and Edge, you can perform a slew of AI-powered tasks, like summarizing a document, translating text between languages, or generating text from a prompt. All of these are accomplished with models downloaded and run locally on demand.
In this article I’ll show a simple example of Chrome and Edge’s experimental local AI APIs in action. While both browsers in theory implement the same set of experimental APIs, they support different subsets of functionality and use different models: for Chrome, it’s Gemini Nano; for Edge, it’s Phi-4-mini.
The following demo of the Summarizer API works on both browsers, although the performance may differ between them. In my experience, Summarizer ran significantly slower on Edge.
The available AI APIs in Chrome and Edge
Chrome and Edge share a common codebase — the Chromium project — and the AI APIs available to both stem from what that project supports. As of April 2026, the available AI APIs in Chrome are:
- Translator API: Translate text from one language to another, assuming a model is available for that language pair.
- Language Detector API: Determine the language for a given input text.
- Summarizer API: Condense text into headlines, summaries, and bullet-point rundowns.
All three of these APIs are available immediately to Chrome users. All but the Language Detector API are also available to Edge users; support for that one is planned for the future.
Several other APIs, which are in a more experimental state, are available in both browsers on an opt-in basis:
- Writer API: Generate text from a given prompt.
- Rewriter API: Rewrite an existing text based on instructions from a prompt.
- Prompt API: Make natural language requests directly to the model (e.g., “Search the web for up-to-date information about visiting Italy”).
- Proofreader API: Examine a text for spelling and grammatical errors and suggest corrections.
The long-term ambition is to have these APIs accepted as general web standards, but for now they’re specific to Chrome and Edge.
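Since these APIs are browser-specific for now, it’s worth feature-detecting them before use. A small sketch of the pattern: the function takes the global scope as a parameter (pass self in the browser), which also keeps the logic easy to exercise outside a browser. The three global names match the generally available APIs listed above.

```javascript
// Return the names of the built-in AI API globals present in the given scope.
// In the browser, call detectBuiltinAI(self); self works in both window
// and worker contexts.
function detectBuiltinAI(globalScope) {
  const apis = ['Translator', 'LanguageDetector', 'Summarizer'];
  return apis.filter((name) => name in globalScope);
}
```

An app can use the returned list to enable only the features the current browser actually supports.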
Using the Summarizer API
We’ll use the Summarizer API as an example of how to use these APIs generally. It’s available in both Chrome and Edge, and the way it’s used is a good model for how the other APIs work.
First, create a web page that you’ll access through some kind of local web server. If you have Python installed, you can create an index.html file in a directory, open that directory in the terminal, and use python -m http.server 8080 to serve the contents on port 8080. You can’t, and shouldn’t, open the web page as a local file, as that may cause content-restriction rules to kick in and break things.
Here’s the source code of the page to create:
<div style="display: flex;">
<textarea style="width:50%; height:24em" id="input" placeholder="Type text to be summarized"></textarea><br>
<textarea style="width:50%; height:24em" id="output" placeholder="Summarization results"></textarea><br>
</div>
<textarea style="width:100%; height:4em" id="context" placeholder="Additional context"></textarea>
<label for="type">Type of summarization:</label>
<select id="type" name="type">
<option value="teaser">Teaser</option>
<option value="tldr">tl;dr</option>
<option value="headline">Headline</option>
<option value="key-points">Key points</option>
</select>
<label for="length">Length:</label>
<select id="length" name="length">
<option value="short">Short</option>
<option value="medium">Medium</option>
<option value="long">Long</option>
</select>
<button type="button" onclick="go();">Start</button>
<div style="background-color:beige" id="log"></div>
<script>
const $log = document.getElementById("log")
const $input = document.getElementById("input")
const $output = document.getElementById("output")
const $context = document.getElementById("context")
const $type = document.getElementById("type")
const $length = document.getElementById("length")
function log(text) {
$log.innerHTML += text + "<br>";
}
async function summarize() {
$log.innerHTML = "";
if (!('Summarizer' in self)) {
log("Summarizer not available");
return false;
}
const availability = await Summarizer.availability();
log(`Summarizer status: ${availability}`);
const summarizer = await Summarizer.create({
sharedContext: $context.value,
type: $type.value,
length: $length.value,
format: 'markdown',
monitor(m) {
m.addEventListener('downloadprogress', (e) => {
log(`Downloaded ${e.loaded * 100}%`);
});
}
});
log("Summarizer created, starting summarization");
$output.value = "";
const stream = summarizer.summarizeStreaming($input.value)
for await (const chunk of stream) {
$output.value += chunk;
}
log("Finished.")
}
function go() {
summarize();
}
</script>
Most of what we want to pay attention to is in the summarize() function. Let’s walk through the steps.
Step 1: Verify the API is available
The line if (!('Summarizer' in self)) determines whether the Summarizer API is available in the browser at all. The follow-up, const availability = await Summarizer.availability();, returns the status of the model required for the API:
- downloadable: The model needs to be downloaded, so you’ll want to provide some kind of progress feedback for the download. (The above code has an example of how this could be implemented, via the monitor() function passed to the Summarizer.create() method.)
- available: The model is on the device and can be used right away.
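The availability enum also includes downloading and unavailable states, so it’s worth branching on all of them. A small sketch of the pattern, kept free of browser APIs so the logic is easy to test:

```javascript
// Map a Summarizer.availability() result to a UI action.
function availabilityAction(status) {
  switch (status) {
    case 'available':
      return 'run';           // model is on-device; summarize immediately
    case 'downloadable':
    case 'downloading':
      return 'show-progress'; // call create() and surface download progress
    default:
      return 'disable';       // 'unavailable' or unknown: hide the feature
  }
}
```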
Step 2: Create the Summarizer object
The next step is to create the Summarizer object, which can take several parameters:
- sharedContext: A text which gives the summarizer additional context for how to do its work (e.g., “Format the output as a bullet list of questions”).
- type: One of four values that describes the format for the summary. teaser tries to create interest in the text’s contents without revealing full details; tldr provides a quick and concise summary, no more than a sentence or two; headline generates a suitable headline for the text; and key-points produces a bullet list of takeaways.
- length: One of short, medium, or long; this parameter controls how long the output should be.
- format: The format of the input text. markdown is the default; the other allowed value is plain-text. If you are using HTML as your source, you may want to use .innerText to derive a text-only version of the input.
Step 3: Stream and iterate over the output
Most of the time, we want to see the output streamed a token at a time, so we have some sense that the model is working. To do this, we use const stream = summarizer.summarizeStreaming($input.value) to create an object we can iterate over ($input.value is the text to summarize). We then use for await (const chunk of stream){} to iterate over each chunk and add it to the $output field.
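The streaming loop reduces to accumulating chunks from an async iterable. The sketch below works with any async-iterable source standing in for the summarizeStreaming() result, invoking an optional callback per chunk (e.g., to append to the output textarea):

```javascript
// Accumulate an async-iterable stream of text chunks into one string,
// optionally reporting each chunk as it arrives.
async function collectStream(stream, onChunk) {
  let out = '';
  for await (const chunk of stream) {
    out += chunk;
    if (onChunk) onChunk(chunk); // e.g. $output.value += chunk
  }
  return out;
}
```

In the page above, this would be called as collectStream(summarizer.summarizeStreaming($input.value), ...), but a plain async generator works just as well for testing.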
Here’s an example of some input and output:

Example output for built-in text summarizer AI model in Chrome and Edge. The model runs entirely on the device hosting the browser and does not call out to an external service to deliver its results.
Foundry
Caveats for using Summarizer (and other local AI APIs)
The first thing to keep in mind is that the model will take some time to download on first use. The sizes of the models vary, but you can expect them to be in the gigabyte range. That’s why it’s a good idea to provide some kind of UI feedback for the download process. Ideally, you’d want to provide some way to run the model download process and then ping the user when it’s ready for use.
Once models are downloaded, there’s no programmatic interface to how they’re managed — at least, not yet. On Google Chrome there’s a local URL, chrome://on-device-internals/, that shows which models have been loaded and provides statistics about them. You can use this page to remove models manually or inspect their stats for the sake of debugging, but the JavaScript APIs don’t expose any such functionality.
When you start the inference process, there may be a noticeable delay between the time the summarization starts and the appearance of the first token. Right now there’s no way for the API to give us feedback about what’s happening during that time, so you’ll want to at least let the user know the process has started.
Finally, while Chrome and Edge support a small number of local AI APIs now, how the future of browser-based local AI will play out is still open-ended. For instance, we might see a more generic standard emerge for how local models work, rather than the task-specific versions shown here. But you can still get going right now.