Microsoft adds MCP support to Visual Studio to boost development of agentic applications 22 Aug 2025, 9:23 pm

Microsoft has added Model Context Protocol (MCP) support to Visual Studio, its flagship integrated development environment (IDE), in an effort to allow developers to connect their AI agents to tools and services available as MCP servers from directly within the IDE.

MCP, which was introduced by Anthropic in November 2024, is an open protocol that enables AI agents within applications to access external tools and data to complete a user request using a client-server mechanism, where the client is the AI agent and the server provides tools and data.

Agentic applications, which can perform tasks without manual intervention, have caught the attention of enterprises, as they allow them to do more with constrained resources.

In the case of Visual Studio, the IDE acts as the MCP client and connects to the required tools provided by MCP servers.

The protocol itself defines the message format for communication between clients and servers, which includes tool discovery, invocation, and response handling.
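
Consider, for instance, a tool invocation. MCP messages use JSON-RPC, so a client calling a server-side tool sends something roughly like the sketch below; the tool name and arguments here are invented for illustration, and the MCP specification documents the exact schema:

{
  "jsonrpc": "2.0",
  "id": 42,
  "method": "tools/call",
  "params": {
    "name": "search_issues",
    "arguments": { "query": "login failures", "max_results": 5 }
  }
}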

Boosting developer productivity and integration flexibility

Adding MCP support to Visual Studio, according to Stephanie Walter, analyst at HyperFRAME Research, will boost developer productivity and integration flexibility.

“MCP acts like a secure ‘universal adapter’ for connecting AI agents (like Copilot) to external tools, databases, code search engines, or deployment pipelines, so there are no more one-off integrations for every service,” Walter said.

“MCP helps connect to internal company tools securely, meaning that enterprise users can blend public AI advancements with proprietary processes while keeping sensitive data inside company boundaries,” Walter added.

Developers can leverage the MCP integration in Visual Studio, Walter said, by, for example, using the AI assistant in the IDE to query internal bug tracking systems, automate repetitive testing tasks across custom infrastructure, or fetch metrics from production databases, without switching context or custom scripting.

Support for both local and remote MCP Servers

Microsoft said that Visual Studio supports connections to both local and remote MCP servers, and developers can configure the connections via the .mcp.json file.
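
As a rough illustration, a .mcp.json file might look like the sketch below; the server names, command, and URL are placeholders, and the exact schema is described in Microsoft's documentation:

{
  "servers": {
    "remote-tools": {
      "type": "http",
      "url": "https://example.com/mcp"
    },
    "local-tools": {
      "type": "stdio",
      "command": "npx",
      "args": ["-y", "@example/mcp-server"]
    }
  }
}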

Developers have the flexibility to set up MCP servers by manually editing the configuration file or through the GitHub Copilot chat interface within Visual Studio, the company explained in its documentation.

Additionally, there’s an option for quick, one-click installation directly from the web, bypassing the need for manual configuration.

To provide governance and security, the MCP server setup comes with built-in administrative oversight policies as well as support for single sign-on (SSO) and OAuth authentication.

Walter sees the addition of these features as Microsoft’s way of easing access to MCP through the IDE.

“Visual Studio builds in a first-class UX for MCP, with GUI-driven server management and integrated flows for authentication, making MCP approachable for both individual coders and large teams,” Walter said.

“As a result, you can expect a rapid expansion of MCP-enabled tools and servers from the open-source and enterprise community, with new automations and AI-powered developer experiences,” Walter added.

MCP Servers are generally not secure

Although MCP adoption is picking up speed, nearly all MCP client providers and vendors that have added support for MCP, including Microsoft, warn users of the protocol’s inherent security risks.

A research report from security firm Pynt, which surveyed at least 281 MCP configurations, showed that MCPs are inherently vulnerable.

“MCPs are designed to be powerful, flexible, and modular. That makes them excellent tools for chaining actions across plugins and APIs, but also uniquely dangerous. The core issue isn’t any single plugin, but the combination of many,” the researchers wrote in the report.

The report further points out that, while Pynt was analyzing the configurations, it discovered that a single crafted Slack message or email could trigger background code execution with zero human involvement.


Generative AI dos, don’ts, and ‘undos’ 22 Aug 2025, 9:00 am

AI agents are undeniably powerful, but wielding that power responsibly is another story. This month’s picks dive into the real-world dos and don’ts of generative AI: Best practices for using genAI tools for code generation, how to avoid the trap of over-relying on a single tool, and why so many agents fail at understanding business context. And if you get things wrong? One company is offering a new ‘undo’ option for AI mistakes.

Top picks for generative AI readers on InfoWorld

A developer’s guide to code generation
According to a recent survey, 91% of developers are using AI for code generation. But how many are doing it right? This article features real-world talk from experts about how developers can use genAI and avoid its pitfalls.

Multi-agent AI workflows: The next evolution of AI coding
One emerging best practice of the genAI era is: Don’t treat any single tool as the solution to every problem. Just as humans have specific strengths, so do AI coding assistants.

Why AI fails at business context, and what to do about it
As organizations start turning AI agents loose on messy, real-world problems, it’s become clear they tend to miss important business context. Columnist Matt Asay says this is an engineering problem, not a philosophical one. Developers (and others) can solve it by implementing AI with care.

Is the generative AI bubble about to burst?
Developers have a front-row seat on the rollercoaster ride of generative AI. How we use it in our daily work tells us a lot about where the AI boom is headed.

GenAI news bites

More good reads and generative AI updates elsewhere

NIST’s attempts to secure AI yield many questions, no answers
NIST is taking an AI security approach that involves tweaking its current rules to accommodate AI, rather than starting from scratch. Does it go far enough?

Companies are pouring billions into AI—but it has yet to pay off
Eight in 10 companies have reported using generative AI, but just as many have reported “no significant bottom-line impact.”

Why you can’t trust a chatbot to talk about itself
When an AI tool does something surprising, your first instinct is probably to ask it why. But chatbots and other LLMs aren’t capable of introspection, and the answers they give can range from unhelpful to actively misleading.


From cloud migration to cloud optimization 22 Aug 2025, 9:00 am

The vision of the cloud as a cost-efficient solution captured the imagination of IT leaders during its years of mass adoption. Enterprises expected to save significantly by leveraging the scalability of public cloud infrastructure and paying only for the resources they used. However, as reflected in the 2025 Crayon IT Cost Optimization Report, the reality has been far more complex.

The report, based on insights from more than 2,000 IT leaders, reveals that a staggering 94% of global IT leaders struggle with cloud cost optimization. Many enterprises underestimate the complexities of managing public cloud resources and the inadvertent overspending that occurs from mismanagement, overprovisioning, or a lack of visibility into resource usage.

This inefficiency goes beyond just missteps in cloud adoption. It also highlights how difficult it is to align IT cost optimization with broader business objectives. More than half (57%) of survey participants in the Crayon Report pointed to cloud cost optimization as their top lever for maximizing IT spending. This growing focus sheds light on the rising importance of finops (financial operations), a practice aimed at bringing greater financial accountability to cloud spending.

Adding to this complexity is the increasing adoption of artificial intelligence and automation tools. These technologies drive innovation, but they come with significant associated costs. According to the Crayon Report, 40% of IT leaders anticipate that managing AI-related expenses will be their biggest financial challenge within the next three years. This underscores the need for robust cost-optimization strategies if enterprises are to sustain growth without breaking their budgets.

The move toward hybrid models

One of the most revealing insights from the Crayon Report is the growing interest in hybrid IT infrastructures, where enterprises balance workloads between public clouds and on-premises environments. A massive 94% of IT leaders surveyed expressed a willingness to invest in on-premises infrastructure, with plans to allocate approximately 37% of their IT budgets to these efforts. The primary factor driving this shift is cost. Many organizations have found that on-premises infrastructure is cheaper for hosting certain workloads, notably those with predictable resource requirements where cloud elasticity is less advantageous.

In addition to cost, data security and compliance were cited by 52% of respondents as another key reason for this hybrid shift. With increasing regulations requiring tighter control over data handling (especially in industries like healthcare and finance), hosting critical data on premises provides enterprises with an edge in compliance management while avoiding some cloud-associated costs.

Control over infrastructure was mentioned by 41% of IT leaders. The argument for greater control is not new, but it has gained renewed relevance when paired with cost optimization strategies. Simply put, enterprises are asking tough questions about whether the public cloud meets all their operational needs. For an increasing number of organizations, the answer is no.

AI spending on the cloud

Most AI deployments illustrate the challenges with public cloud costs. According to the Crayon Report, 60% of enterprises use AI to optimize IT process automation, while 45% deploy AI for predictive cost analytics. This move underscores how businesses are leaning on machine learning models to improve resource planning and forecasting. However, running AI workloads at scale in the cloud is expensive, especially for organizations that utilize large computational models or require GPUs for specialized tasks.

Public cloud providers such as AWS, Microsoft Azure, and Google Cloud have responded with AI-optimized services and product offerings, but those often come with hefty price tags. The synergy between AI and cloud has clearly driven massive innovation, but it has also made it harder to manage cloud spending effectively. This is why cloud optimization strategies that cut costs without sacrificing performance are now crucial for maintaining financial stability amid increasing technological complexity.

The future of cloud optimization

With 41% of respondents’ IT budgets still being directed to scaling cloud capabilities, it’s clear that the public cloud will remain a cornerstone of enterprise IT in the foreseeable future. Cloud services such as AI-powered automation remain integral to transformative business strategies, and public cloud infrastructure is still the preferred environment for dynamic, highly scalable workloads. Enterprises will need to make cloud deployments truly cost-effective. This means:

  • Streamlining multicloud strategies. Many organizations continue to adopt a multicloud approach in pursuit of reliability and flexibility. However, managing multiple cloud platforms often leads to redundant costs. IT leaders need effective governance models that optimize resource usage across all cloud providers.
  • Investing in finops. To gain tighter financial oversight and accountability in managing cloud resources, enterprises should look for tools that can provide granular visibility into cloud spending and identify opportunities to cut costs.
  • Adopting a workload-first strategy. Instead of migrating workloads en masse to the cloud, organizations must critically evaluate which workloads are most cost-effective to run in the public cloud versus in their on-premises infrastructure.

The core benefits that once made the cloud so appealing—scalability, flexibility, and efficiency—still hold value today. However, experience has taught enterprises a tough lesson: Public cloud adoption alone does not guarantee cost savings. Organizations now find themselves in a new phase of their cloud journeys; the focus has shifted from migration to optimization and hybridization. IT leaders must view their decisions not as binary choices between public cloud or on-premises environments, but as opportunities to strike an efficient balance between the two.

By adopting strategies that prioritize smart cloud spending, businesses can continue to leverage the power of the cloud without sacrificing their financial health. The era of cloud optimization and hybrid IT has only just begun.


Anthropic adds Claude Code to its Claude enterprise plans 22 Aug 2025, 3:34 am

Anthropic has bundled Claude Code, its agent-based command line interface (CLI) coding tool, into the enterprise plans for its generative AI chatbot Claude, in an effort to help streamline developer workflows as well as to take on rivals such as Gemini CLI and GitHub Copilot.

“Enterprise and Team [plan] customers can now upgrade to premium seats that include more usage and Claude Code — bringing our app and powerful coding agent together under one subscription,” the company wrote in a blog post.

This bundling will allow developers to collaborate with Claude across the entire development lifecycle — from researching frameworks and evaluating architectures to generating production-ready code directly in their terminal, the company added.

Analysts see the bundling as a natural inflection point for AI-based offerings geared towards developers.

While Forrester VP and principal analyst Charlie Dai sees the bundle as a reaction to high demand from enterprise customers seeking integrated AI coding solutions, Everest Group senior analyst Oishi Mazumder sees the move as Anthropic’s strategy to gain traction inside enterprises as they begin to scale AI coding tools.

Granular controls

Anthropic said that enterprises now have granular controls to manage these premium seats, including the ability to buy and allocate new seats, buy extra capacity, and set spending limits for users from the Claude admin panel. Administrators can also view Claude Code analytics, including metrics such as lines of code accepted, suggestion accept rate, and usage patterns, from within Claude.

Additionally, administrators can deploy and enforce settings across all Claude Code users to match internal policies, including tool permissions, file access restrictions, and MCP server configurations, the company said.

Analysts see these granular controls as a differentiator for Claude Code.

“Compared with GitHub Copilot’s integrated development environment (IDE) focus and Gemini CLI’s scale and openness, Anthropic’s Claude Code stands out for offering more granular controls such as SSO and role-based access. This positions Anthropic to appeal to enterprises that prioritize governance alongside coding productivity,” Mazumder said.

Better oversight

The bundling of Claude and Claude Code may give Anthropic a stronger value proposition compared to GitHub and Google Gemini, as it minimizes tool sprawl and simplifies oversight, Mazumder added.

Anthropic has also released a Compliance API to help enterprises conduct reviews of Claude usage data and customer content in real time.

“Administrators can integrate Claude data into existing compliance dashboards, automatically flag potential issues, and manage data retention through selective deletion capabilities,” the company wrote. “This provides the visibility and control organizations need to scale AI adoption while meeting regulatory obligations.”


Up and running with Azure Linux 3.0 21 Aug 2025, 9:00 am

Microsoft’s move to the cloud-native world means it’s now the custodian of several quite different Linux distributions. Some are internal tools that run deep parts of Azure’s networking infrastructure; others are part of Azure’s Internet of Things platform. However, one of the most important is almost invisible: Azure Linux.

First unveiled as CBL-Mariner, Azure Linux was designed to be a base Linux platform for Microsoft’s various container services, one that Microsoft controlled and that couldn’t, like CoreOS, be withdrawn with little advance warning. Since then, Azure Linux has provided an effective Linux tool for Microsoft projects, such as Windows Subsystem for Linux (WSL), that need a small, fast Linux with minimal CPU and memory demands. It also forms the basis of much of Azure Kubernetes Service (AKS).

Going cloud-native with Azure Linux 3.0

Azure Linux 3.0 arrived in the spring of 2025 and was at once available in AKS, as part of AKS version 1.32 and higher. Based on version 6.6 of the Linux kernel, Azure Linux 3.0 builds are available for both x64 and Arm platforms, so it will run on Azure’s high-density, front-end Arm-based Cobalt systems. There is even support from many familiar cloud-native platform tools such as Dapr and Terraform, so you can integrate them into AKS solutions running on Azure Linux 3.0.

Other updates in this release include new versions of ContainerD and SystemD, as well as support for the SymCrypt cryptography library, which will help get you ready for the switch to post-quantum cryptography algorithms.

Like much of Microsoft’s open source and free software development tools, you can find the Azure Linux repository on GitHub. Here you can download its source code to build your own installation from scratch or even create a custom fork. The source code and ready-built ISOs are available, and containers with a base Azure Linux 3.0 image are in Microsoft’s container registry.

Microsoft has recently integrated Azure Linux with its OS Guard tool, building on the idea of immutable containers and adding policy enforcement to ensure only trusted binaries can run in user space, restricting them to secure protected volumes and even to specific files. There’s now support for trusted launch, which verifies your Azure Linux environments with trusted boot components and keys held in an Azure vTPM. Only authorized code can run, significantly reducing the risks of compromise.

New releases come every three years, so Azure Linux 3.0 will be the basis of the operating system until 2028. Microsoft provides tools to move from Azure Linux 2.0 to 3.0 via an update to AKS node pools, with Azure Linux 2.0 losing support in November 2025. You can expect a similar lifespan and comparable migration tools for the shift from Azure Linux 3.0 to 4.0.

Unlike some other container Linuxes, Azure Linux is not completely immutable (though it can easily be run that way). It supports familiar package management tools based on the RPM package standard. You can use tdnf to update and upgrade packages, ensuring that your container stays secure. Even so, it’s best to download the latest base image each time you need to construct an application release or a container update.
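
In day-to-day use, a package refresh with tdnf looks something like the following sketch (the package names are illustrative):

tdnf update -y                        # refresh metadata and apply available updates
tdnf install -y ca-certificates vim   # add individual RPM packages as needed
tdnf list installed                   # review what the image currently contains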

Rolling your own Azure Linux

As the project’s GitHub repository has all the code to build a full release of Azure Linux, there’s always the possibility of building from source. Microsoft provides instructions on how to build both standard and custom images, going straight from code to a bootable Hyper-V virtual machine or to an ISO that can be installed on most virtualization platforms. The option of building a custom image is interesting as it allows you to add your own choice of packages, including specific tested versions or internally developed code that is not publicly available.

Having the ability to ship your own custom base container image is important as it ensures that you have everything you need without the complexity of supporting and merging different container definitions into a deployable image. However, it does mean having to rebuild images as security updates get pushed into the mainline codebase.

Running Azure Linux 3.0 in WSL

As part of looking at Azure Linux 3.0, I decided to run it as a WSL distribution. That’s easier said than done, as it’s only shipped as an ISO image or as a Docker-format container. However, Microsoft recently added support for tarball-based installs for custom WSL instances, making it a lot easier to go from a Docker container to a working WSL Azure Linux in surprisingly few steps. I tried out this new process by building an Azure Linux installation.

Starting in an Ubuntu WSL command line, I used Podman, the open alternative to Docker that shares the same commands, to download and prepare an Azure Linux release for use in WSL. You could use Docker Desktop on Windows, but I had problems with exports: It generated an unreadable tarball that was twice the size of one generated in Linux.

I first pulled the latest release from the Microsoft container repository and then had Podman run it as a named container, listing the file system contents to check that it was running correctly. I was then able to export the container contents from the image using the Podman export command, which creates a tarball from the target container’s file system.

As Windows 11 provides a direct link to installed WSL file systems from Windows, I navigated to my Ubuntu user directory and copied the Azure Linux tarball to Windows, where it was ready to import into WSL. Microsoft does provide instructions to build configuration scripts for full-scale WSL installs, but if you just want to look at a distribution and don’t mind running as root, you can simply create a target directory for your Linux file systems and use the WSL import command to create a new instance in that folder with a specific name from your tarball.
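
Condensed, the flow looked roughly like this (the image tag, container name, and Windows paths are placeholders; check Microsoft's container registry for the current Azure Linux 3.0 image name):

# Inside an existing WSL distribution such as Ubuntu
podman pull mcr.microsoft.com/azurelinux/base/core:3.0
podman run --name azl3 mcr.microsoft.com/azurelinux/base/core:3.0 ls /   # quick sanity check of the file system
podman export azl3 -o azurelinux3.tar    # dump the container's file system to a tarball

# Then, from a Windows prompt, import the tarball as a new WSL distribution
wsl --import AzureLinux C:\wsl\AzureLinux C:\wsl\azurelinux3.tar
wsl -d AzureLinux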

If you want to build a standard WSL image that can be installed across a team, it’s a good idea to create a package configuration using Microsoft’s recommended scripts. This will set up groups, force the creation of a local user, and add that user to the groups, ensuring, for example, access to sudo. You can improve integration between the Azure Linux install and Windows Terminal by creating a terminal profile with the official blue penguin logo; otherwise, all you get is the default name and icon as launch options.
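
A minimal profile entry in Windows Terminal's settings.json might look like this sketch (the distribution name and icon path are placeholders to adjust for your install):

{
  "name": "Azure Linux",
  "commandline": "wsl.exe -d AzureLinux",
  "icon": "C:\\wsl\\icons\\azure-linux.png",
  "startingDirectory": "~"
}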

Installing a small-form-factor Linux like Azure Linux takes a few seconds, and you can use the wsl command line to launch your new distribution. As Azure Linux is designed to host containers within Kubernetes, the first time it runs it will throw a handful of errors as WSL tries to mount the Windows file system. However, you don’t need Windows integration to experiment with Azure Linux, so you’re still ready to start.

Making Azure Linux part of your PC

You can even access the Azure Linux file system from inside Windows File Explorer, though you may need to reboot your PC to see it. With this in place, you can start to use tools like Visual Studio Code’s remote development extensions to make building on a local copy of Azure Linux part of your cloud-native development toolchain.

You’ve always been able to use an ISO to build an Azure Linux virtual machine in your choice of hypervisor, like Hyper-V or KVM, but having a version of Azure Linux accessible from the command line can help build and test container-based applications without having to work inside a Docker environment. This keeps resource demands to a minimum and ensures you have access to both the target environment and your development toolchain on a single machine.

With a defined life cycle and the ability to run copies in VMs or locally, Azure Linux is a useful tool for building and hosting cloud-native applications in Azure. Keeping the OS small and lightweight is key to delivering a base for containers, allowing you to add packages as needed. Running a local copy as part of your daily toolchain lets you familiarize yourself with its capabilities so you know how code interacts and what system resources your application will need to give your users the best experience possible.


The shift from AI code generation to true development partnership 21 Aug 2025, 9:00 am

When Anthropic announced its Claude 4 models, the marketing focused heavily on improved reasoning and coding capabilities. But having spent months working with AI coding assistants, I’ve learned that the real revolution isn’t about generating better code snippets — it’s about the emergence of genuine agency.

Most discussions about AI coding capabilities focus narrowly on syntactic correctness, benchmark scores or the ability to produce working code. But my hands-on testing of Claude 4 reveals something far more significant: the emergence of AI systems that can understand development objectives holistically, work persistently toward solutions and autonomously navigate obstacles – capabilities that transcend mere code generation.

Rather than rely on synthetic benchmarks, I decided to evaluate Claude 4’s agency through a real-world development task: building a functional OmniFocus plugin that integrates with OpenAI’s API. This required not just writing code, but understanding documentation, implementing error handling, creating a coherent user experience and troubleshooting issues — tasks that demand initiative and persistence beyond generating syntactically valid code.

What I discovered about agentic capabilities may fundamentally change how we collaborate with AI systems in software development.

3 models, 3 approaches to agency

Working with Opus 4: Beyond code generation to development partnership

My experience with Claude Opus 4 demonstrated that we’ve crossed an important threshold. Unlike previous AI systems that excel primarily at generating code snippets in response to specific instructions, Opus 4 exhibited genuine development agency — the ability to drive the development process toward a working solution independently.

When I encountered a database error, Opus 4 didn’t just fix the code I pointed out — it proactively identified the underlying cause:

“I see the problem — OmniFocus plugins require using the Preferences API for persistent storage rather than direct database access. Let me fix that for you.”

It then implemented a complete solution using OmniFocus’s Preferences API.

This illustrates the crucial difference between code generation and true agency. A code generator produces text that looks like code; an agent understands the development context, identifies problems and resolves them within the broader framework of the application’s requirements.

What impressed me most was how Opus 4 went beyond the explicit requirements. Without prompting, it enhanced the implementation with:

  • A configuration interface for API settings
  • Detailed error messages for debugging
  • Input validation to prevent invalid requests
  • Progress indicators during API calls

These additions weren’t requested — they emerged from Opus 4’s understanding of what makes a good developer experience, demonstrating comprehension beyond the immediate coding task.

Working with Sonnet 4: The cautious collaborator

Sonnet 4 demonstrated strong capabilities but required guidance to further develop its potential. Our interaction felt like working with a capable but cautious developer who needed regular check-ins.

The initial implementation showed good understanding of the task but contained minor errors in the API integration. When faced with these issues, Sonnet 4 asked clarifying questions:

“I notice OmniFocus has a specific way of handling HTTP requests. Can you point me to the documentation for its URL fetching capabilities?”

After receiving this information, it successfully fixed the implementation, although it took seven to eight iterations to reach a fully working solution.

Interestingly, Sonnet 4 made an unexpected pivot at one point — when struggling with the OpenAI integration, it temporarily suggested removing that functionality in favor of local analysis. While this showed initiative in trying to complete the task, it demonstrated less adherence to the specific requirements.

Working with Sonnet 3.7: The responsive tool

My experience with Sonnet 3.7 felt like using a coding assistant. It required explicit instructions and struggled to maintain the broader context of what I was building.

A typical exchange went like this:

  • Me: “The plugin needs to convert tasks to TaskPaper format before sending to OpenAI.”
  • Sonnet 3.7: “I’ll implement a function to convert tasks to TaskPaper format.” [Implements basic function without error handling]
  • Me: “Now we need to implement the OpenAI API integration.”
  • Sonnet 3.7: [Implements basic API call without proper error handling or user feedback]

When errors occurred, Sonnet 3.7 struggled to diagnose them independently:

  • Me: “I’m getting a ‘file is directory’ error.”
  • Sonnet 3.7: “That’s strange, can you show me the full error message?”
  • [I provide error details]
  • Sonnet 3.7: “This might be related to file paths. Let’s check how the plugin is being saved.”

After 10+ interactions, we still didn’t have a fully functional plugin.

The agency spectrum: Moving beyond code quality

This hands-on comparison revealed something important: the key differentiator between AI coding systems is increasingly not their ability to generate syntactically correct code, but their level of agency — their capacity to understand and work toward development objectives with minimal guidance.

Based on my testing, I’d place these models on an agency spectrum:

  • Code generators. Generate syntactically valid code in response to specific prompts, but lack persistence and contextual understanding.
  • Responsive assistants. Produce working code but require explicit guidance at each development stage, focusing on immediate instructions rather than overall objectives.
  • Collaborative agents. Balance following instructions with initiative, can work semi-autonomously with periodic guidance, but may need redirection.
  • Development partners. Internalize development objectives and work persistently toward them, proactively identifying and resolving obstacles without explicit guidance.

This spectrum represents a fundamental shift in how we should evaluate AI coding systems — moving beyond code quality metrics to assess their capacity for autonomous problem-solving in real development contexts.

What this means for development practices

The emergence of agency-capable AI systems has profound implications for development workflows:

From micro-instructions to development objectives

With agentic AI systems, effective collaboration shifts from providing detailed step-by-step instructions to communicating higher-level development objectives and context. I found myself giving Opus 4 instructions like:

“Build a plugin that sends OmniFocus tasks to OpenAI for analysis and summarization. It should handle errors gracefully and provide a good user experience.”

This high-level direction was sufficient for it to build a complete solution – something that would have been impossible with earlier code generation systems. 

Beyond token counting: A new economic calculus

The agency capabilities of the Claude 4 models introduce a new dimension to cost-benefit analysis. While Opus 4 costs more per token ($15/$75 input/output vs. Sonnet 4’s $3/$15), its ability to work autonomously toward solutions dramatically reduces the number of interactions required.

When I needed just three to four interactions with Opus 4 versus 10+ with Sonnet 3.7, the efficiency gain offset the higher per-token cost. More importantly, it saved my time and cognitive load as a developer — costs that rarely factor into model selection but have significant real-world impact.

Adapting development workflows to AI agency

As AI systems move beyond code generation to exhibit genuine agency, development workflows will evolve. My experience suggests a future where AI systems handle not just code writing but implementation planning, error diagnosis and quality assurance – freeing developers to focus on:

  • Architecture and system design
  • Defining objectives and quality standards
  • Critical evaluation of AI-generated solutions
  • The human and ethical aspects of software development

This doesn’t mean AI is replacing developers — rather, it’s elevating our role from writing routine code to higher-level direction and oversight.

The road ahead: Beyond current capabilities

Based on this rapid evolution in AI agency, several trends emerge:

  • Agency-specialized development systems. Future AI systems may optimize specifically for development agency rather than general intelligence, creating specialized partners for different development domains.
  • New collaboration interfaces. Current chat interfaces aren’t optimized for development collaboration. Expect tools that provide AI systems with greater autonomy to explore codebases, run tests and propose coherent solutions.
  • Evolving evaluation frameworks. As agency becomes the key differentiator, we’ll need new ways to evaluate AI systems beyond code generation benchmarks, focusing on their ability to understand and achieve development objectives.
  • Organizational adaptation. Development teams will need to rethink how they integrate agentic AI capabilities, potentially creating new roles focused on directing and evaluating AI contributions. 

Agency as the new frontier

New LLM models represent a significant milestone in the evolution of AI coding systems — not because they generate better code, but because they exhibit a level of agency that transforms the human-AI development relationship.

The most important insight from my testing is that the frontier has shifted from “can it write correct code?” to “can it understand what we’re trying to build?” New models demonstrate that we’re entering an era where AI systems can function as genuine development partners, not just sophisticated code generators.

This article is published as part of the Foundry Expert Contributor Network.


How to upload files using minimal APIs in ASP.NET Core 21 Aug 2025, 9:00 am

ASP.NET Core offers a simplified hosting model, called minimal APIs, that allows us to build lightweight APIs with minimal dependencies. We’ve discussed minimal APIs in several earlier posts here. We’ve examined how we can implement in-memory caching, route constraints, model binding, parameter binding, anti-forgery tokens, versioning, JWT authentication, identity authentication, authentication handler, logging, and testing in ASP.NET Core minimal API applications.

In this post, we’ll examine how we can upload files via minimal APIs in ASP.NET Core. To use the code examples provided in this article, you should have Visual Studio 2022 installed in your system. If you don’t already have a copy, you can download Visual Studio 2022 here.

Create an ASP.NET Core Web API project in Visual Studio 2022

To create an ASP.NET Core Web API project in Visual Studio 2022, follow the steps outlined below.

  1. Launch the Visual Studio 2022 IDE.
  2. Click on “Create new project.”
  3. In the “Create new project” window, select “ASP.NET Core Web API” from the list of templates displayed.
  4. Click Next.
  5. In the “Configure your new project” window, specify the name and location for the new project. Optionally check the “Place solution and project in the same directory” check box, depending on your preferences.
  6. Click Next.
  7. In the “Additional Information” window shown next, select “.NET 9.0 (Standard Term Support)” as the framework version and uncheck the check box that says “Use controllers,” as we’ll be using minimal APIs in this project.
  8. Elsewhere in the “Additional Information” window, leave the “Authentication Type” set to “None” (the default) and make sure the check boxes “Enable Open API Support,” “Configure for HTTPS,” and “Enable Docker” remain unchecked. We won’t be using any of those features here.
  9. Click Create.

We’ll use this ASP.NET Core Web API project to work with the code examples given in the sections below.

IFormFile and IFormFileCollection in ASP.NET Core

In the recent versions of ASP.NET Core, minimal APIs provide support for uploading files using the IFormFile and IFormFileCollection interfaces. While IFormFile is used to upload a single file, IFormFileCollection is used to upload multiple files. The following code snippet illustrates how you can upload a single file using IFormFile in your minimal API application.


app.MapPost("/upload", async (IFormFile file) =>
{
    var tempFile = Path.GetTempFileName();
    using var fileStream = File.OpenWrite(tempFile);
    await file.CopyToAsync(fileStream);
});

Note that the File.OpenWrite method accepts the path to a file in your file system as a parameter and returns a FileStream instance. As its name indicates, a FileStream object provides a Stream for a file, meaning a sequence of bytes.

Similarly, the following piece of code shows how you can upload multiple files using the IFormFileCollection interface.


app.MapPost("/upload_multiple_files", async (IFormFileCollection files) =>
{
    foreach (var file in files)
    {
        var tempFile = Path.GetTempFileName();
        using var fileStream = File.OpenWrite(tempFile);
        await file.CopyToAsync(fileStream);
    }
});

Often we will want to do more with a file than simply upload it. If we want to parse or manipulate the contents of a file, we can take advantage of the StreamReader class. StreamReader is a high-level class, built on top of FileStream, that allows us to read the characters from a byte stream. StreamReader can also handle character encoding (UTF-8, ASCII, etc.) if needed.

Let’s say you have a file that contains author records that you want to insert into a database table. Assuming each line of text in the file represents a different author record, you could include the following code in your Program.cs file to upload the contents of the file, line by line, to a minimal API endpoint.


app.MapPost("/author/upload", (IFormFile file,
    [FromServices] IAuthorRepository authorRepository) =>
{
    using var streamReader = new StreamReader(file.OpenReadStream());
    while (streamReader.Peek() >= 0)
        authorRepository.Create(streamReader.ReadLine() ?? string.Empty);
});

You might use the preceding code snippet to read a collection of author data represented as JSON, for example, and then insert those records into a database table. Note that I have omitted the source code of the IAuthorRepository interface and the classes that implement it here for brevity.

Avoiding anti-forgery errors when uploading files

When uploading files in ASP.NET Core, you may often encounter anti-forgery errors. ASP.NET Core issues these errors to warn of cross-site request forgery attacks.

Figure 1: ASP.NET Core may generate an anti-forgery error when uploading a file.

If your endpoint is safe and it doesn’t require anti-forgery protection, you can disable anti-forgery validation for the endpoint by using the DisableAntiforgery method, as shown in the following code.


app.MapPost("/upload", async (IFormFile file) =>
{
    var tempFile = Path.GetTempFileName();
    using var fileStream = File.OpenWrite(tempFile);
    await file.CopyToAsync(fileStream);
}).DisableAntiforgery();

Passing anti-forgery tokens in request headers

For endpoints that require anti-forgery protection, you can avoid such errors by generating an anti-forgery token and passing it in the request headers. To add anti-forgery tokens to your minimal APIs, follow these simple steps:

  1. Register the necessary services related to working with anti-forgery tokens in ASP.NET Core.
  2. Generate anti-forgery tokens with each response.
  3. Submit the anti-forgery token with each request.

Register anti-forgery services

To use anti-forgery tokens in your minimal APIs, you add the anti-forgery services to the services collection using the following line of code in the Program.cs file.


builder.Services.AddAntiforgery();

Then you can add anti-forgery services to the request processing pipeline using the line of code below.


app.UseAntiforgery();

Generate an anti-forgery token

To generate an anti-forgery token, you can use the GetAndStoreTokens method of the IAntiforgery interface. You pass a reference to the current HTTP context to the GetAndStoreTokens method as a parameter as shown in the code snippet given below.


app.MapGet("/generate-antiforgery-token", (IAntiforgery antiforgery, HttpContext context) =>
{
    var antiForgeryTokenSet = antiforgery.GetAndStoreTokens(context);
    var xsrfToken = antiForgeryTokenSet.RequestToken!;
    return Results.Ok(new { token = xsrfToken });
});

Pass the anti-forgery token in the request header

Finally, you can use a REST client or an HTTP client (such as Postman, for testing) to invoke the endpoint that generates the token and returns it in an HTTP response. For each subsequent POST request, you pass that token in the request header to avoid anti-forgery violation errors.
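
For example, a small test client built with HttpClient might look like the sketch below. The header name shown is ASP.NET Core’s default (it can be changed via the AddAntiforgery options), and the base address, file name, and endpoint paths are placeholders matching the examples above. The cookie container matters because the anti-forgery cookie issued alongside the token must accompany the POST request for validation to succeed.


using System.Net;
using System.Net.Http.Json;

var handler = new HttpClientHandler { CookieContainer = new CookieContainer() };
using var client = new HttpClient(handler) { BaseAddress = new Uri("https://localhost:5001") };

// 1. Fetch a token; the anti-forgery cookie is captured by the handler.
var tokenResponse = await client.GetFromJsonAsync<TokenResponse>("/generate-antiforgery-token");

// 2. Upload a file, passing the token in the request header.
using var form = new MultipartFormDataContent();
form.Add(new StreamContent(File.OpenRead("authors.txt")), "file", "authors.txt");

var request = new HttpRequestMessage(HttpMethod.Post, "/author/upload") { Content = form };
request.Headers.Add("RequestVerificationToken", tokenResponse!.Token);

var response = await client.SendAsync(request);
Console.WriteLine(response.StatusCode);

record TokenResponse(string Token);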

Final thoughts

You can avoid anti-forgery validation errors in minimal APIs while still protecting against cross-site request forgery (CSRF) attacks. However, you should be careful using anti-forgery tokens in load-balanced environments because of the way they are generated and validated. In such an environment, each server may have its own data protection keys. Hence, if a token has been generated on one server, the other servers will not be able to validate that token unless the keys have been shared. To solve this, you must use shared data protection keys in such environments.


GitHub launches Copilot agents panel on GitHub.com 21 Aug 2025, 4:07 am

GitHub has unveiled an agents panel, available on every page of github.com, that allows developers to delegate coding tasks to GitHub Copilot throughout the GitHub platform. Developers can assign tasks to Copilot and manage those tasks from a single interface, GitHub said.

Launched August 19, the agents panel is available in public preview for all paid Copilot users. The agents panel serves as a mission control center for agentic workflows on GitHub, allowing developers to assign background tasks without switching pages, monitor the progress of running tasks, and review the pull requests created by agents. Accessible via a new Agents button in the navigation bar, the agents panel works as a lightweight overlay that lets developers hand new tasks to Copilot and track existing tasks without navigating away from current work, according to GitHub.

GitHub said Copilot tasks can be started from the new agents panel with a simple prompt. Users can describe a goal in natural language and select the relevant repository. Copilot will take it from there and start creating a plan, drafting changes, running tests, and preparing a pull request. The coding agent runs in the background, works in parallel on multiple tasks, and issues a draft pull request when it’s done. GitHub introduced the Copilot coding agent in May.


JRebel Enterprise speeds configuration, code updates for cloud-based Java development 20 Aug 2025, 8:22 pm

Perforce Software has introduced JRebel Enterprise, software that promises to accelerate the configuration of cloud-based Java development environments, and that enables incremental code changes to Java applications, eliminating the need to redeploy entire applications for every change, the company said.

Announced August 19, JRebel Enterprise skips Java application redeploys for minor code changes and automatically configures Java environments to support changing Java development environments at enterprise scale, Perforce said. JRebel Enterprise offers the same capabilities of the JRebel code deployment tool, but is optimized for containerized, cloud environments, Perforce said.

JRebel Enterprise enables developers to seamlessly push code changes to remote environments without the need for lengthy rebuilds and redeploys, said Jeff Michael, Perforce’s senior director of product management, in a statement. The company said that, according to its 2025 Java Productivity Report published March 4, 73% of enterprise respondents use cloud-based or remote development environments. But increased complexity in these environments can create additional challenges for companies searching for development tools that ensure compatibility and efficiency within their own infrastructure. JRebel Enterprise eliminates redeploys, cutting the time needed to make a change in Java code and see that change reflected in the running environment, and it also removes the frequent developer-level reconfiguration brought on by dynamic, cloud-based Java development environments, Perforce said.

JRebel Enterprise includes the following features:

  • Accelerated cloud configuration from three to five minutes per server, per developer, to a one-time, one- to two-minute configuration for an entire Java development team.
  • Automatic detection and configuration of JRebel agents running on JVMs.
  • Support for Java 21 and newer versions and integration with the JetBrains IntelliJ IDEA integrated development environment.
  • Support for cloud providers including Amazon Web Services, Google Cloud Platform, and Microsoft Azure.


AWS blames bug for Kiro pricing glitch that drained developer limits 20 Aug 2025, 9:42 am

AWS has blamed a bug for all usage and pricing-related issues that developers have been facing on Kiro, its new agentic AI-driven integrated development environment (IDE), since it introduced a revised pricing structure last week.

“As we have dug into this, we have discovered that we introduced a bug when we rolled out pricing in Kiro, where some tasks are inaccurately consuming multiple requests. That’s causing people to burn through their limits much faster than expected,” Adnan Ijaz, director of product management for Agentic AI at AWS, posted on Kiro’s official Discord channel.

Further, Ijaz wrote that AWS was “actively” working to fix the issue in order to provide a resolution within a couple of days.

Pricing flip-flops and developer dissatisfaction

In July, AWS had to limit the usage of Kiro, just days after announcing it in public preview, due to the sheer number of developers flocking to try out the IDE, mainly driven by pricing changes and throttling issues in rival IDEs, such as Cursor and Claude Code.

It had also retracted details of the pricing tiers it planned for the service. AWS initially said it would offer three tiers of service for Kiro: free with a cap of 50 agentic interactions per month; Pro at $19 per month for up to 1,000 interactions, and Pro+ at $39 per month for up to 3,000 interactions.

However, last week, it introduced a revised pricing structure, moving away from simple interactions to vibe and spec requests: free with a cap of 50 vibe and 0 spec requests; Pro at $20 with 225 vibe and 125 spec requests; Pro+ at $40 with 450 vibe and 250 spec requests; and Power at $200 for 2,250 vibe and 1,250 spec requests.

Any use that breaches these limits in the paid tiers is to be charged at $0.04 per vibe and $0.20 per spec request, AWS said.

The vibe- and spec-driven pricing structure, which is unique to Kiro, did not go down well with developers, several of whom took to social media and various forums to express their disappointment.

Several users also took to Kiro’s GitHub page to raise concerns that the limits of the various pricing tiers were getting exhausted quickly, rendering the IDE unusable.

One GitHub user reported the issue of accelerated limit exhaustion as a bug. Another reported that a large number of vibe credits was being consumed even when the user wasn’t actively engaged in any conversation.

Reminiscent of Cursor’s episode from June

Many users are comparing Kiro’s pricing changes with Cursor’s pricing changes introduced in June, which left developers confused and dissatisfied.

After that pricing change, several developers took to social media platforms such as Reddit to express their dissatisfaction and say they were looking at alternatives because their cost of usage on the same plan had increased dramatically.

Although Cursor has attempted to clarify its Pro plan and has offered full refunds for surprise charges incurred between June 16 and July 4, many developers are still not clear on the changes in the plan.

Some had even sought clarity on the company’s official forum page, which at the time also housed several posts expressing user dissatisfaction.

AWS, too, is currently offering to reset limits for any users impacted by the bug.

Is Kiro bound to be more expensive than other IDEs anyway?

Despite the dissatisfaction among users and developers, analysts see Kiro driving more value when compared to rivals.

Kiro’s advantage over rivals, according to Moor Insights and Strategy principal analyst Jason Andersen, is rooted in its spec-driven approach, wherein a developer defines the entire application or task rather than conventional chat-oriented code generation or code reviews.

The spec-driven approach is also the primary reason behind Kiro being more expensive when compared to rivals, especially in request consumption, Andersen said.

“Spec-driven development can spawn many tasks simultaneously, so what appears as a single spec request actually may be many requests rolled into one. And since these requests are more complex, they ultimately use more GPU inference, hence more requests and a higher cost per request,” Andersen explained.

Further, Andersen sees the spec-driven development strategy as an effective playbook for AWS to disrupt rivals with the added condition that it educates developers.

“AWS came up with this split pricing model so they could offer a vibe toolset that was price-competitive and a price for spec that was reflective of its more powerful nature, which in turn uses a lot more resources. This was a sincere attempt to address the market conditions,” Andersen said.

However, he also pointed out that AWS should provide some insights or benchmarks for what a developer can expect when it comes to using spec-driven development.


Is the generative AI bubble about to burst? 20 Aug 2025, 9:00 am

One way to measure the scope of the generative AI boom is financially, and another is in terms of public awareness. Both are nearly unprecedented, even in the realm of high tech. Data center build-outs in support of AI expansion are expected to be in the region of $364 billion in 2025—an amount that makes the cloud “revolution” look more like an aperitif.

Beyond the charm and utility of chatbots, the AGI (artificial general intelligence) concept is the main driver of public attention. Many believe we are on the cusp of an explosion of compute power that will change history.

If the promise of AGI is one pole of the field in public perception of AI, the opposite pole is that generative AI is largely a stochastic hat trick. Some rather well-informed parties have pointed out the inherent limitations in the way large language models are built. In short, LLMs have certain drawbacks that cannot be mitigated by increasing the size of the models.

Somewhere between the two poles lies the truth. Because we are already using these tools in our daily work, software developers know better than most where AI tools shine and where they fall apart. Now is a great moment to reflect on the state of the AI boom from our position on its leading edge.

Rhymes with dotcom

In 2001, I was a junior engineer at a dotcom startup. I was walking through the familiar maze of cubicles one day when a passing thought froze me: “Is this a bubble?”

Now, I wasn’t especially prescient. I didn’t have a particular grasp of the economics or even the overall technology landscape. I was just glad to be programming for money. But there was something about the weird blend of college dorm and high tech, of carefree confidence and easy history-making, that caught my attention.

There was a palpable sense that “everything was different now.” We were in this bright new era where the limits on expansion had been overcome. All we had to do was continue exploring new possibilities as they opened before us. The ultimate apotheosis of tech married to finance was guaranteed, so long as we maintained enthusiasm.

Does any of this sound familiar to you?

Learning (or not) from the dotcom bust

Early this year, Goldman Sachs released a study comparing the dotcom tech boom to AI today. The study notes fundamental ways the current moment differs from the dotcom era, especially in the profits reaped by big tech. In essence, the “Magnificent 7” tech companies are pulling in AI-generated revenue that, according to Sachs, justifies the mental and financial extravagance of the moment.

“We continue to believe that the technology sector is not in a bubble,” says the report, because “while enthusiasm for technology stocks has risen sharply in recent years, this has not represented a bubble because the price appreciation has been justified by strong profit fundamentals.”

So, that’s the bullish case from the investment point of view. (For another view, the New Yorker recently published another comparison of the current AI boom and that of the dotcom era.)

The AI money trap

We don’t have to look far for a more contrarian perspective. Economist Paul Kedrosky, for one, notes that capital expenditure on data centers has driven 1.2% of national GDP, acting as a kind of stimulus program. Without that investment, he writes, the US economy would be in contraction.

Kedrosky describes an “AI datacenter spending program” that is “already larger than peak telecom spending (as a percentage of GDP) during the dot-com era, and within shouting distance of peak 19th century railroad infrastructure spending.”

Virtually all AI spend flows into Nvidia in one way or another. This is reflected in its recent valuation as the first publicly traded company to break $4 trillion in market capitalization (second up was Microsoft).

To put that number in context, market observers such as Forbes described it as being greater than the GDP of Canada or the annual global spending on defense.

Nvidia alone accounts for more than 7% of the value of the S&P 500. AI gadfly Ed Zitron calls it the “AI money trap.”

What developers know

Here’s where I believe programmers and others who use AI tools have an advantage. We are the early adopters, par excellence. We are also, as typical coders, quick to call it like it is. We won’t fudge if reality doesn’t live up to the promise.

Code generation is currently the killer app of AI. But for developers, that flavor of AI assistance is already turning stale. We’re already looking for the next frontier, be it agentic AI, infrastructure, or process automation.

The reality is that AI is useful for development, but it continues to exhibit many shortcomings. If you’re using it for work, you get a visceral sense of that balance. AI sometimes delivers incredible, time-saving insights and content. But then it will just as confidently introduce an inaccuracy or regression that eats up all the time it’s just saved you.

I think most developers have realized quickly that modern AI is more of a useful tool than a world-changing revelation. It won’t be tearing the roof off of traditional development practice anytime soon.

The limitations of LLMs

There is a growing ripple effect as developers realize we are basically doing what can be done with AI, while the industry presses forward as if there were an almost insatiable demand for more.

As an example, consider the sobering rumination from Gary Marcus, a longtime observer of AI, in a recent post dissecting the rather lackluster launch of ChatGPT 5. Looking into the heart of modern AI design, he identifies inherent shortcomings that cannot be addressed by increasing the availability of compute power and data. (His alternative architecture, Neurosymbolic AI, is worth a look.)

Marcus also references a recent report from Arizona State University, which delves into chain of thought (CoT) reasoning and the limitations of LLMs to perform inference. This is a structural limitation also highlighted in the June 2025 position paper from Apple, The Illusion of Thinking.

The basic message is that LLMs, when they appear to be reasoning, are actually just reflecting the patterns in their data, without the ability to generalize. According to this line of thought, what we see is what we get with LLMs; how they have worked in the last few years is what they are capable of—at least, without a thoroughgoing re-architecture.

If that is true, then we can expect a continuation of the incremental gains we are seeing now, even after throwing trillions of dollars' worth of data center infrastructure at AI. Some unpredictable breakthroughs may occur, but based on the facts on the ground today, they remain possibilities rather than something we can plan on.

What if AI is a bubble?

Artificial intelligence is a legitimately exciting new sphere of technology, and it is producing a massive build-out. But if the hype extends too far beyond what reality can support, it will contract in the other direction. The organizations and ideas that survive that round of culling will be the ones capable of supporting enduring growth.

This happened with blockchain. It went big, and some of those riding its expansion exploded spectacularly. Consider FTX, which lost some $8 billion in customer funds, or the Terra collapse, which is tough to fully quantify but included at least $35 billion lost in a single day. And these are just two examples among many.

However, many of the companies and projects that survived the crypto winter are now pillars of the crypto ecosystem, which is becoming ever more integrated directly into mainstream finance.

The same thing may be true of the current AI trend: Even if the bubble pops spectacularly, there will be survivors. And those survivors will promote lasting AI-driven changes. Some of the big tech companies at the forefront of AI are survivors of the dotcom collapse, after all.

In a recent interview with The Verge, OpenAI’s Sam Altman noted that, “When bubbles happen, smart people get overexcited about a kernel of truth. Are we in a phase where investors as a whole are overexcited about AI? My opinion is yes. Is AI the most important thing to happen in a very long time? My opinion is also yes.”

What do you think? As a software developer using AI in your work, are we in a bubble? If so, how big is it, and how long before it is corrected?


Your code is more strongly coupled than you think 20 Aug 2025, 9:00 am

In previous articles I introduced connascence—the idea that code coupling can be described and quantified—and discussed five kinds of static connascence. In this article, we'll wrap up our tour with a look at the deeper kind: dynamic connascence.

Dynamic connascence is visible only at runtime. Because it’s discovered late and it’s often non-local, dynamic connascence usually represents stronger coupling than static forms.

Connascence of execution

Connascence of execution occurs when code must execute in a certain order for the system to function correctly. It is often referred to as “temporal coupling.” Here is a simple example:


userRecord.FirstName = 'Alicia';
userRecord.LastName = 'Florrick';
userManager.addUser(userRecord);
userRecord.Birthday = new Date(1968, 6, 24);

This code adds the Birthday value after the user has been added. This clearly won’t work. This mistake can be caught by examining the code, but a more complex scenario could be harder to spot.

Consider this code:


sprocketProcessor.AddSprocket(SomeSprocket);
sprocketProcessor.ValidateSprocket();

Does the order of those two statements matter? Does the sprocket need to be added and then validated, or should it be validated before it is added? What happens if the sprocket doesn’t validate? It is hard to say, and someone not well-versed in the system might make the mistake of putting them in the wrong order. That is connascence of execution.

If you ever see a comment along the lines of, “This line of code MUST BE EXECUTED BEFORE the one below!!!!!”, then the developer before you probably ran into a problem caused by connascence of execution. The exclamation points tell you the error wasn’t easy to find. 

The key question: Would reordering lines of code break behavior? If so, then you have connascence of execution.
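
One common remedy, offered here only as a sketch rather than something from the original example, is to design the API so that the correct order is the only possible order. The hypothetical UserRecord and UserManager types below stand in for the objects in the first example: because addUser() accepts only a complete record, there is no "set a field after adding" path to get wrong.


interface UserRecord {
  firstName: string;
  lastName: string;
  birthday: Date;
}

class UserManager {
  private readonly users: UserRecord[] = [];

  // Accepting a fully built record removes the ordering question entirely:
  // there is nothing left to set after addUser() runs.
  addUser(record: UserRecord): void {
    this.users.push({ ...record });
  }
}

const manager = new UserManager();
manager.addUser({
  firstName: "Alicia",
  lastName: "Florrick",
  birthday: new Date(1968, 6, 24),
});

The coupling hasn't vanished, but it has moved into the type system, where the compiler enforces it instead of a comment.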

Connascence of timing

Connascence of timing occurs when the timing of execution makes a difference in the outcome of the application. The most obvious example of this is a threaded race condition, where two threads pursue the same resource, and only one of the threads can win the race.

Connascence of timing is notoriously difficult to find and diagnose, and it can reveal itself in unpredictable ways. Anyone who has delved deeply into debugging thread problems is all too aware of connascence of timing.

The key question: Would changing thread scheduling, network latency, or a timeout alter correctness? Then you have connascence of timing.
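
To make that concrete, here is a small, hypothetical TypeScript sketch (not taken from any real system) in which the final result depends entirely on how two concurrent operations happen to interleave:


let balance = 100;

const sleep = (ms: number) =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

async function withdraw(amount: number): Promise<void> {
  await sleep(Math.random() * 10); // simulate waiting on I/O before reading
  const current = balance;         // read the shared value
  await sleep(Math.random() * 10); // more simulated I/O between read and write
  balance = current - amount;      // write back a result based on a possibly stale read
}

async function main(): Promise<void> {
  await Promise.all([withdraw(60), withdraw(60)]);
  // Sometimes prints -20 (both withdrawals applied), sometimes 40 (one update
  // silently lost), depending on how the two read/write pairs interleave.
  console.log("Final balance:", balance);
}

main();

Nothing in the code states the dependency; correctness hinges on timing the code never mentions, which is exactly why this form of coupling is so painful to diagnose.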

Connascence of value

Connascence of value occurs when several values must be properly coordinated between modules. For instance, imagine you have a unit test that looks like this:


[Test]
void TestCheckoutValue {
  PriceScanner = new TPriceScanner();
  PriceScanner.Scan('Frosted Sugar Bombs');
  Assert.Equals(50, PriceScanner.CurrentBalance);
}

So we’ve written the test. Now, in the spirit of Test Driven Development, I’ll make the test pass as easily and simply as possible.


void TPriceScanner.Scan(aItem: string) {
  CurrentBalance = 50;
}

We now have tight coupling between TPriceScanner and our test. We obviously have connascence of name, because both classes rely on the name CurrentBalance. But that’s relatively low-level connascence, and perfectly acceptable. We have connascence of type, because both must agree on the type TPriceScanner, but again, that’s benign. And we have connascence of meaning, because both routines have a hard-coded dependency on the number 50. That should be refactored.

But the real problem is the connascence of value that occurs because both of the classes know the price—that is, the “value”—of Frosted Sugar Bombs. If the price changes, even our very simple test will break.

The solution is to refactor to a lower level of connascence. The first thing you could do is to refactor so that the price (the value) of Frosted Sugar Bombs is maintained in only one place:


void TPriceScanner.Scan(aItem: string; aPrice: integer) {
  CurrentBalance = CurrentBalance + aPrice;
}

Now our test can read as follows:


[Test]
void TestCheckoutValue {
  PriceScanner = new TPriceScanner;
  PriceScanner.Scan('Frosted Sugar Bombs', 50);
}

As a result, we no longer have connascence of value between the two modules, and our test still passes. The price is reflected in only one place, and any price will work for our test. Excellent.

The key question: Do multiple places need updating when a constant or a configuration setting changes? If so, then you have connascence of value.

Connascence of identity

Connascence of identity occurs when two components must refer to the same object. If the two components refer to the same object, and one changes that reference, then the other component must change to the same reference. This, too, is a subtle and difficult-to-detect form of connascence. In fact, connascence of identity is the most complex form of connascence.

As an example, consider the following code:


class ReportInfo {
  private _reportStuff = "";

  get reportStuff(): string {
    return this._reportStuff;
  }

  set reportStuff(value: string) {
    this._reportStuff = value;
  }
}

class InventoryReport {
  constructor(public reportInfo: ReportInfo) {}
}

class SalesReport {
  constructor(public reportInfo: ReportInfo) {}
}

function main(): void {
  const reportInfo = new ReportInfo();
  reportInfo.reportStuff = "Initial shared info";

  const inventoryReport = new InventoryReport(reportInfo);
  const salesReport = new SalesReport(reportInfo);

  // Do stuff with reports...
  console.log("Initially shared object?",
    inventoryReport.reportInfo === salesReport.reportInfo); // true

  // Change one side to point at a new identity
  const newReportInfo = new ReportInfo();
  newReportInfo.reportStuff = "New info for inventory only";
  inventoryReport.reportInfo = newReportInfo;

  // Now they refer to different ReportInfo instances (Connascence of Identity risk)
  console.log("Still shared after reassignment?",
    inventoryReport.reportInfo === salesReport.reportInfo); // false

  // Observable divergence
  console.log("Inventory.reportInfo.reportStuff:", inventoryReport.reportInfo.reportStuff);
  console.log("Sales.reportInfo.reportStuff:", salesReport.reportInfo.reportStuff);
}

main(); 

Here we have two reports, an inventory report and a sales report. The domain requires that the two reports always refer to the same instance of ReportInfo. However, as you can see in the code above, in the middle of the reporting process, the inventory report gets a new ReportInfo instance. This is fine, but the sales report must refer to this new ReportInfo instance as well.

In other words, if you change the reference in one report, then you must change it in the other report for the system to continue working correctly. This is called connascence of identity, as the two classes depend on the same identity of the reference.

The above example is simple to be sure, but it’s not hard to conceive of a situation where the references change in unknown or obfuscated ways, with unexpected results. 

The key question: Must two components always point to the same instance? Then you have connascence of identity.
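
One common way to weaken this coupling, again offered as a sketch rather than a prescription, is to give the shared reference a single owner, so that "which ReportInfo is current" is decided in exactly one place. Here is one way to do that, using a simplified ReportInfo:


class ReportInfo {
  reportStuff = "";
}

// The provider is the only thing that knows which ReportInfo is current.
class ReportInfoProvider {
  constructor(private current: ReportInfo) {}

  get(): ReportInfo {
    return this.current;
  }

  replace(next: ReportInfo): void {
    this.current = next;
  }
}

class InventoryReport {
  constructor(private readonly provider: ReportInfoProvider) {}
  get reportInfo(): ReportInfo {
    return this.provider.get();
  }
}

class SalesReport {
  constructor(private readonly provider: ReportInfoProvider) {}
  get reportInfo(): ReportInfo {
    return this.provider.get();
  }
}

const provider = new ReportInfoProvider(new ReportInfo());
const inventoryReport = new InventoryReport(provider);
const salesReport = new SalesReport(provider);

// Swapping in a new ReportInfo happens in one place, and both reports see it.
provider.replace(new ReportInfo());
console.log(inventoryReport.reportInfo === salesReport.reportInfo); // always true

Neither report can be pointed at a different instance on its own, so the "must be the same object" rule is enforced by the structure of the code rather than by developer discipline.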

Okay, so there we have it—the four kinds of dynamic connascence. Now, I’ll be the first to admit that this connascence stuff is quite nerdy and a little hard to understand. But much of it covers coding issues you already intuitively knew but perhaps never considered in a formal way. How many types of connascence are wrapped up in the “Don’t repeat yourself” (DRY) principle?

All I know is that I don’t like heavily coupled code, and anything I can learn that helps me avoid it isn’t too nerdy for me. 


PyApp: An easy way to package Python apps as executables 20 Aug 2025, 9:00 am

Every developer knows how hard it is to redistribute a Python program as a self-contained, click-and-run package. There are third-party solutions, but they all have drawbacks. PyInstaller, the oldest and best-known tool for this job, is crotchety to work with and requires a fair amount of trial-and-error to get a working redistributable. Nuitka, a more recent project, compiles Python programs to redistributable binaries, but the resulting artifacts can be massive and take a long time to produce.

A newer project, PyApp, takes an entirely different approach. It’s a Rust program you compile from source, along with information about the Python project you want to distribute. The result is a self-contained binary that, when run, unpacks your project into a directory and executes it from there. The end user doesn’t need to have Python on their system to use it.

Setting up PyApp

Unlike other Python distribution solutions, PyApp is not a Python library like PyInstaller. It’s also not a standalone program that takes in your program and generates an artifact from it. Instead, you create a custom build of PyApp for each Python program you want to distribute.

You’ll need to take care of a few prerequisites before using PyApp to deploy Python programs:

  • PyApp’s source: Make a clone of PyApp’s source and put it in a directory by itself, separate from any other projects.
  • The Rust compiler and any other needed infrastructure: If you’re unfamiliar with Rust or its tooling, you’ll need to understand at least enough of it to compile a Rust program from source. Check out my tutorial for getting started with Rust to see what you need.
  • Your Python program, packaged as a wheel: The “wheel,” or .whl file, is the binary format used to package Python programs along with any platform-specific components (such as precompiled libraries). If you don’t already have a wheel for the Python program you want to repackage, you’ll need to generate one. My video tutorial for creating Python wheels steps you through that process. You can also use wheels hosted on PyPI.

PyApp uses environment variables during the build process to figure out what Python project you want to compile into it and how to support it. The following variables are the ones most commonly used:

  • PYAPP_PROJECT_NAME: Used to define the name of the Python project you’ll be bundling. If you used pyproject.toml to define your project, it should match the project.name attribute. This can also be the name of a project on PyPI; if not, you will want to define the path to the .whl file to use.
  • PYAPP_PROJECT_VERSION: Use this to configure a specific version of the project if needed.
  • PYAPP_PROJECT_PATH: The path (relative or absolute) to the .whl file you’ll use for your project. Omit this if you’re just installing a .whl from PyPI.
  • PYAPP_EXEC_MODULE: Many Python packages can be run directly as a module. This variable lets you declare which module to execute, so if your program runs with python -m thisprogram, you’d set this variable to thisprogram.
  • PYAPP_EXEC_SPEC: For programs that have an entry-point script, you can specify it here. This matches the syntax in the project.scripts section of pyproject.toml. For instance, pyprogram.cmd:main would import the module pyprogram.cmd from your Python program’s modules, then execute the function main() from it.
  • PYAPP_EXEC_SCRIPT: This variable lets you supply a path to an arbitrary Python script, which is then embedded in the binary and executed at startup.
  • PYAPP_DISTRIBUTION_EMBED: Normally, when you create a PyApp binary, it downloads the needed Python distribution to run your program from the Internet when it’s first executed. If you set this variable to 1, PyApp will pre-package the needed Python distribution in the generated binary. The result is a larger binary, but one that doesn’t need to download anything; it can just unpack itself and go.

Many other options are available, but these should be enough for most projects you want to build.

To make things easy on yourself, you may want to create a shell script for each project that sets up these environment variables and runs the compilation process.

Building the PyApp binary

Once you’ve set the environment variables, go to the root directory of PyApp’s source, and build PyApp using the Rust compiler with the command:


cargo build --release 

This might take several minutes, as PyApp has many dependencies (363 as of this writing). Future compilation passes will take much less time, though, once Rust obtains and caches everything.

Note that while it’s possible to cross-compile for other platforms, it is not recommended or supported.

Once the compiling is done, the resulting binary will be in the target/release subdirectory of the PyApp project directory, as pyapp.exe on Windows (or pyapp on Linux and macOS). You can rename the resulting file anything you want, as long as it’s still an executable.

Running the PyApp binary

To test the binary, just run it from a console. If all is well, you should see prompts in the console as PyApp unpacks itself and prepares to run. If you see any failures in the console, take note of them and edit your environment variables; chances are you didn’t properly set up the entry point or startup script for the project.

When the PyApp binary runs for the first time, it’ll extract itself into a directory that’s usually a subdirectory of the user profile. On future runs, it’ll use that already-unpacked copy and so will launch much faster.

If you want to control where the binary unpacks itself, you can set an environment variable to control where the program writes and looks for its unpacked contents when it’s run:


PYAPP_INSTALL_DIR_<PROJECT_NAME>="path/to/directory"

Note that <PROJECT_NAME> is the uppercased value of the PYAPP_PROJECT_NAME variable. So, for the program conwaylife, we’d use the variable name PYAPP_INSTALL_DIR_CONWAYLIFE. The directory can be a full path or a relative one, so you could use a directory like ./app to indicate the application should be unpacked into the subdirectory app of the current working directory.

Also note that this setting is not persistent. If you don’t have this exact environment variable set when you run the program, it’ll default to the user-profile directory.

PyApp options

The deployed PyApp executable comes with a couple of convenient commands built into it:

  • pyapp self remove: Removes the unpacked app from the directory it was unpacked into.
  • pyapp self restore: Removes and reinstalls the app.

Note again that if you used the PYAPP_INSTALL_DIR_<PROJECT_NAME> variable to set where the project lives, you must also have it set when running these commands! It’s also important to note that PyApp-packaged apps can generate false positives with antivirus solutions on Microsoft Windows, because the resulting executables aren’t code-signed by default.


.NET 10 Preview 7 adds XAML generator 19 Aug 2025, 10:07 pm

The latest preview of Microsoft’s planned .NET 10 application development platform is now available, featuring a source generator for XAML and improved translation for parameterized collections in Entity Framework Core.

This preview was unveiled August 12 and can be downloaded from dotnet.microsoft.com. The production release of .NET 10 is expected in November.

For XAML, .NET MAUI (Multi-platform App UI) now has a source generator for XAML that improves build performance and enables better tools support, Microsoft said. The generator creates strongly typed code for XAML files at compile time, reducing runtime overhead and providing better IntelliSense support. The generator decorates generated types with the [Generated] attribute for better tool integration and debugging support.  

With Entity Framework Core 10, the next version of the object-relational mapper, a new default translation mode is introduced for parameterized collections, where each value in the collection is translated into its own scalar parameter. This allows collection values to change without resulting in different SQL, avoiding the cache misses and other performance problems that varying SQL can cause, Microsoft said.

Also introduced in .NET 10 Preview 7 is WebSocketStream, an API designed to simplify common WebSocket scenarios in .NET. Traditional WebSocket APIs are low-level and require significant boilerplate for tasks such as handling buffering and framing and managing encoding and decoding, Microsoft said. These complexities make it difficult to use WebSockets as a transport, especially for apps with streaming or text-based protocols. WebSocketStream addresses these issues by providing a Stream-based abstraction over a WebSocket, enabling seamless integration with existing APIs.

Other new features and improvements in .NET 10 Preview 7:

  • For Windows, ProcessStartInfo.CreateNewProcessGroup can be used to launch a process in a separate process group.
  • JsonSerializer.Deserialize now supports PipeReader, complementing existing PipeWriter support.
  • A new configuration option, ExceptionHandlerOptions.SuppressDiagnosticsCallback, has been added to the ASP.NET Core exception handler middleware to control diagnostic output.
  • APIs for passkey authentication in ASP.NET Core Identity have been updated and simplified.

No new features were added for the .NET runtime or for the Visual Basic, C#, and F# languages in Preview 7, Microsoft noted. .NET 10 Preview 7 follows .NET 10 Preview 6, which was released July 15 and featured improved JIT code generation. Preview 5 was released June 10 and featured C# 14 and runtime enhancements. .NET 10 Preview 1 arrived February 25, followed by Preview 2 on March 18, Preview 3 on April 10, and Preview 4 on May 13.


IBM can’t afford an unreliable cloud 19 Aug 2025, 9:00 am

On August 12, 2025, IBM Cloud experienced its fourth major outage since May, resulting in a two-hour service disruption that affected 27 services globally across 10 regions. This “Severity 1” event left enterprise customers locked out of critical resources due to authentication failures, with users unable to access IBM’s cloud console, CLI, or APIs. Such recurring failures reveal systemic weaknesses in IBM’s control plane architecture, the layer responsible for handling user access, orchestration, and monitoring.

This incident followed previous outages on May 20, June 3, and June 4, and further eroded confidence in IBM’s reliability. This does not reflect well on a provider that promotes itself as a leader in hybrid cloud solutions. For industries with strict compliance requirements or businesses that depend on cloud availability for real-time operations, these disruptions raise doubts about IBM’s ability to meet their needs on an ongoing basis. These recurring incidents give enterprises a reason to consider switching to platforms with more reliable track records, such as AWS, Microsoft Azure, or Google Cloud.

For enterprises that have entrusted IBM Cloud with hybrid strategies that balance on-premises systems with public cloud integration, these events strike at the heart of IBM’s value proposition. The hybrid cloud’s supposed benefit is resilience, giving businesses flexibility in handling workloads. A fragile control plane undermines this perceived advantage, leaving IBM’s multi-billion-dollar investments in hybrid systems on shaky ground.

Opening the door for competitors

IBM has traditionally been a niche player in the cloud market, holding a 2% global market share compared to AWS (30%), Microsoft Azure (21%), and Google Cloud (11%). IBM Cloud targets a specific enterprise audience with hybrid cloud integration and enterprise-grade features.

AWS, Azure, and Google Cloud have consistently demonstrated their reliability, operational efficiency, and capacity to scale. Since the control plane is crucial for managing cloud infrastructure, the Big Three hyperscalers have diversified their architectures to avoid single points of failure. Enterprises having issues with IBM Cloud might now consider switching critical data and applications to one of these larger providers that also offer advanced tools for AI, machine learning, and automation.

These outages couldn’t come at a worse time for IBM. With healthcare, finance, manufacturing, and other industries increasingly depending on AI-driven technologies, companies are focused on cloud reliability. AI workloads require real-time data processing, continuity, and reliable scaling to work effectively. For most organizations, disruptions caused by control-plane failures could lead to catastrophic AI system failures.

What IBM can do

IBM must make major changes if it wants to recover its credibility and regain enterprise trust. Here are several critical steps I would take if I were CTO of IBM:

  • Adopt a resilient control-plane architecture. Duh. IBM’s reliance on centralized control-plane management has become a liability. A distributed control plane infrastructure will allow individual regions or functions to operate independently and limit the scope of global outages.
  • Enhance IAM design with segmentation. Authentication failures have been at the core of the past four outages. Regionally segmented identity and access management (IAM) and distributed identity gateways must replace the globally entangled design currently in place.
  • Strengthen SLAs targeting control-plane uptime. Cloud customers demand operational guarantees. By implementing robust service-level agreements (SLAs) focused explicitly on control-layer reliability, IBM could reassure customers that their vital management functions will remain stable even under pressure.
  • Increase transparency and communication. IBM needs to be proactive with customers following outages. Offering incident reports, clear timelines for fixes, and planned updates to infrastructure can help rebuild trust, though it will take time. Silence, on the other hand, will only deepen dissatisfaction.
  • Accelerate stress-testing procedures. IBM must regularly perform extensive load and resilience testing to identify vulnerabilities before they impact customers. Routine testing in simulated high-pressure operating conditions should be a priority.
  • Develop hybrid systems with multi-control-plane options. IBM should adopt multi-control-plane designs to enable enterprises to manage workloads independently of centralized limitations. This would enable hybrid strategies to retain their resilience advantage.

Increasing enterprise resilience

For enterprises wary of any cloud provider’s reliability, there are several steps to build resilience into their operations:

  • Adopt a multicloud strategy. By distributing workloads across multiple cloud providers, enterprises reduce dependency on any single vendor. This ensures that even if one provider has a disruption, core business functions remain active.
  • Integrate disaster recovery automation. Automated failover systems and data backups across multiple regions and providers can minimize downtime when outages occur.
  • Demand stronger SLAs. Enterprises should negotiate contracts that prioritize uptime guarantees for control planes and include penalties for SLA violations.
  • Monitor and audit vendor reliability. Enterprises should actively track their cloud providers’ reliability performance metrics and plan for migration if vendors continuously fail to meet standards.

IBM has reached a critical juncture. In today’s competitive market, cloud reliability is the baseline expectation, not a value-added bonus. IBM’s repeated failures—particularly at the control-plane level—fundamentally undermine its positioning as a trusted enterprise cloud partner. For many customers, these outages may serve as the final justification to migrate workloads elsewhere.

To recover, IBM must focus on transforming its control-plane architecture, ensuring transparency, and reaffirming its commitment to reliability through clear, actionable changes. Meanwhile, enterprises should see this as a reminder that resilience must be built into their cloud strategies to safeguard their operations, regardless of provider.

In a world increasingly dependent on AI and automation, reliability isn’t optional—it’s essential. IBM has a lot of work ahead.


Retrieval-augmented generation with Nvidia NeMo Retriever 19 Aug 2025, 9:00 am

Nvidia was founded by three chip designers (including Jensen Huang, who became CEO) in 1993. By 1997 they had brought a successful high-performance 3D graphics processor to market; two years later the company invented the GPU (graphics processing unit), which caused a sea change in computer graphics, specifically for video games. In 2006 they introduced CUDA (Compute Unified Device Architecture), which allowed them to expand their market to scientific research and general-purpose computing. In 2012 they adapted GPUs to neural networks through specialized CUDA libraries (cuDNN) that could be called from Python, which expanded into support for large language models (LLMs).

In 2025, along with other products aimed at developers, Nvidia offers an entire suite of enterprise AI software products, including Nvidia NIM, Nvidia NeMo, and Nvidia RAG Blueprint. Together they can import your raw documents, create a vector-indexed knowledge base, and allow you to converse with an AI that knows and can reason from the contents of the knowledge base. Of course, these programs all take full advantage of Nvidia GPUs.

Nvidia NIM is a set of accelerated inference microservices that allow organizations to run AI models on Nvidia GPUs anywhere. Access to NIM generally requires an Nvidia AI Enterprise suite subscription, which typically costs about $4,500 per GPU per year, but the Essentials level of the suite comes gratis for three to five years with some server-class GPUs such as the H200.

Nvidia NeMo is an end-to-end platform for developing custom generative AI including large language models (LLMs), vision language models (VLMs), retrieval models, video models, and speech AI. NeMo Retriever, part of the NeMo platform, provides models for building data extraction and information retrieval pipelines, which extract structured (think tables) and unstructured (think PDF) data from raw documents.

The Nvidia RAG Blueprint demonstrates how to set up a retrieval-augmented generation (RAG) solution that uses Nvidia NIM and GPU-accelerated components. It provides a quick start for developers to set up a RAG solution using the Nvidia NIM services. The Nvidia AI-Q Research Assistant Blueprint expands on the RAG Blueprint to do deep research and report generation.

Nvidia AI Enterprise

Nvidia AI Enterprise consists of two types of software, application and infrastructure. The application software is for building AI agents, generative AI, and other AI workflows; the infrastructure software includes Nvidia GPU and networking drivers and Kubernetes operators.

Nvidia NIM

Nvidia NIM provides containers to self-host GPU-accelerated inferencing microservices for pre-trained and customized AI models. NIM containers improve AI inferencing for AI foundation models on Nvidia GPUs.

Nvidia NeMo

NeMo tools and microservices help you to customize and apply AI models from the Nvidia API catalog. Nvidia likes to talk about creating data flywheels with NeMo to continuously optimize AI agents with curated AI and human feedback. NeMo also helps to deploy models with guardrails and retrieval-augmented generation.

Nvidia NeMo Retriever

NeMo Retriever microservices accelerate multi-modal document extraction and real-time retrieval with, according to Nvidia, lower RAG costs and higher accuracy. NeMo Retriever supports multilingual and cross-lingual retrieval, and claims to optimize storage, performance, and adaptability for data platforms.

[Image: NeMo Retriever architecture diagram. Note how many NeMo components interact with each other to create the retriever flow. Source: Nvidia]

Nvidia AI Blueprints

Nvidia AI Blueprints are reference examples that illustrate how Nvidia NIM can be leveraged to build innovative solutions. There are 18 of them including the RAG Jupyter Notebook we’ll look at momentarily.

Nvidia Brev

Nvidia Brev is a cloud GPU development platform that allows you to run, build, train, and deploy machine learning models on VMs in the cloud. Its launchables are preconfigured, shareable environments that make it easy to distribute software along with the compute setup it needs.

Nvidia AI-Q Blueprint RAG and Research Assistant reference examples

Nvidia had me use a Brev launchable running on AWS to test out an AI-Q Blueprint reference example.

The Nvidia RAG blueprint (see figure below) serves as a reference solution for a foundational retrieval-augmented generation pipeline. As you probably know, RAG is a pattern that helps LLMs to incorporate knowledge that wasn’t included in their training and to focus on relevant facts; here it is designed to allow users to query an enterprise corpus.

The Nvidia RAG blueprint is a bit more complicated than you might expect, at least partially because it’s trying to handle a variety of input formats, including text, voice, graphics, and formatted pages (e.g. PDFs). It includes refinements such as re-ranking, which narrows relevancy lists; OCR, which extracts text from graphics; and guardrails, which protect against incoming query-based jailbreaks as well as against certain kinds of outgoing hallucinations.

[Image: Nvidia RAG blueprint architecture diagram. Note that this flow includes NeMo Retriever components, LLMs, an OCR component, an object store, and a vector database. The query processing block is based on the LangChain framework. Source: Nvidia]

The Nvidia AI-Q Research Assistant blueprint (see figure below) depends on the RAG blueprint and on an LLM-as-a-judge to verify the relevance of the results. Note the reasoning cycle that occurs prior to final report generation.

[Image: In addition to RAG functionality, the AI-Q Research Assistant creates a report plan, searches data sources for answers, writes a report, reflects on gaps in the report for further queries, and finishes the report with a list of sources. Note that Llama models are used to generate RAG results, to reason on the results, and to generate the report. Source: Nvidia]

The screenshots below are from my runs of the AI-Q Research Assistant blueprint.

[Image: Nvidia AI-Q Research Assistant blueprint starter page. The bar at the top shows the overall system diagram, and the text below shows a preview of the Jupyter Notebook that appears in the instance. Note that the AI-Q Research Assistant depends on the Nvidia RAG blueprint, shown a few diagrams up from here. Source: Foundry]

[Image: Starting the AI-Q Research Assistant. Source: Foundry]

[Image: Running AI-Q Research Assistant instance. The preview of the Jupyter Notebook has been replaced by the instance logs. Source: Foundry]

[Image: The Jupyter Notebook that appears when the AI-Q Research Assistant is running. The third item down in the directory is what we want. Source: Foundry]

[Image: The beginning of the “Get Started” Jupyter Notebook for the AI-Q Research Assistant blueprint. The first cell should look familiar from the preview of the launchable. Source: Foundry]

[Image: Report plan for Amazon 2023 financial performance. You can ask for changes to the plan at this point if you wish. Source: Foundry]

[Image: Amazon 2023 financial report generated by the AI-Q Research Assistant. You can dive into its process or ask for changes if you wish. Source: Foundry]

AI-powered research with RAG

The Nvidia AI-Q Research Assistant blueprint I tested did a better job than I expected at ingesting financial reports in PDF form and generating reports based on them in response to user queries. One of the surprises was how well the Llama-based models performed. In separate tests of Llama models in naive RAG designs my results were not nearly as good, so I have to conclude that the plan-reflect-refine architecture helped a lot.

I would have expected my tests of this system to take a day or less. In fact, they took me about a week and a half. My first problem turned out to be an error in the documentation. My second problem turned out to be a failure in a back-end process. I was assured that Nvidia has fixed both issues, so you’re not likely to encounter them.

Bottom line

The Nvidia AI-Q Research Assistant blueprint I tested did a better job than I expected at ingesting financial reports in PDF form and generating reports based on them in response to user queries. In separate tests of Llama models in naive RAG designs, my results were not nearly as good. Chalk one up for the plan-reflect-refine architecture.

Pros

  1. Able to create a credible deep research assistant that can run on-prem or in the cloud
  2. Models iterate on the report to refine it
  3. NeMo Retriever makes quick work of PDF ingestion
  4. Open source blueprint can be adapted to your own AI research applications

Cons

  1. The version tested still had a few bugs, which should now be fixed
  2. Very much tied to Nvidia GPUs

Cost

Contact your authorized Nvidia partner for pricing.

Platform

Docker Compose or Nvidia AI Workbench, server-class Nvidia GPU(s) with sufficient memory. Can be run on a cloud with Nvidia APIs, or on premises in containers.


The successes and challenges of AI agents 19 Aug 2025, 9:00 am

AI has changed a lot in just two years. In 2023, most companies were experimenting with large language models. These tools helped with writing, research, and support tasks. They were smart, but they waited for instructions and could not take action on their own.

In 2025, we are seeing something more powerful: AI agents. They are not just chat tools anymore. They can remember, plan, use tools, and act on their own. AI agents can take a broad goal, figure out the steps, and carry it out without needing help at every stage. Some can even fix problems along the way.

Early wins

These agents have moved beyond research and begun working inside real businesses. For example, ServiceNow uses AI agents to manage IT requests. If someone needs software installed or a license updated, the agent takes care of it from start to finish. There are no tickets to raise and no waiting time.

GitHub Copilot is another example. It now has a mode where the agent understands what the developer is trying to do, choosing tools, making decisions, and completing small coding tasks on its own. For developers, this saves time and removes repetitive work.

A final example is Cisco, which is using AI agents inside Webex to improve customer service. One agent speaks directly to customers, another supports human agents during live calls, and a third listens and creates a summary of the conversation with tone and sentiment analysis. These layers work together and make customer support faster and more accurate.

These applications of AI agents work well because the tasks are clear and follow a standard process. But agents are now being trained to handle more complex problems too.

Take this use case: A business analyst is trying to answer why sales dropped for a product last quarter. In the past, a human would explore the data, come up with possible reasons, test them, and suggest a plan. Now, an AI co-pilot is being trained to do most of that work. It pulls structured data, breaks it into groups, tests different ideas, and surfaces the insights. This kind of system is still in testing but shows what agents might be able to do soon.

A better approach

Even with these early wins, most companies are still trying to add agents to old workflows, which limits their impact. To really get the benefits, businesses will need to redesign the way work is done. The agent should be placed at the center of the task, with people stepping in only when human judgment is required.

There is also the issue of trust. If the agent is only giving suggestions, a person can check the results. But when the agent acts directly, the risks are higher. This is where safety rules, testing systems, and clear records become important. Right now, these systems are still being built.

One unexpected problem is that agents often think they are done when they are not. Humans know when a task is finished. Agents sometimes miss that. In some tests, over 30% of multi-agent failures occurred because one agent thought the task was completed too early.

To build agents, developers are using tools like LangChain and CrewAI to help create logic and structure. But when it comes to deploying and running these agents, companies rely on cloud platforms. In the future, platforms like AWS and Google Cloud may offer complete solutions to build, launch, and monitor agents more easily.

Today, the real barrier goes beyond just technology. It is also how people think about agents. Some overestimate what they can do; others are hesitant to try them. The truth lies in the middle. Agents are strong with goal-based and repeatable tasks. They are not ready to replace deep human thinking yet.

The value of agents

Still, the direction is clear. In the next two years, agents will become normal in customer support and software development. Writing code, checking it, and merging it will become faster. Agents will handle more of these steps with less need for back-and-forth. As this grows, companies may create new roles to manage agents, needing someone to track how they are used, make sure they follow rules, and measure how much value they bring. This role could be as common as a data officer in the future.

The hype over AI agents is loud, but the real change is quiet. Agents are not taking over the world; they are just taking over tasks. And in doing that, they are changing how work feels—slowly but surely.

Aravind Chandramouli is vice president, AI Center of Excellence, at Tredence.

Generative AI Insights provides a venue for technology leaders to explore and discuss the challenges and opportunities of generative artificial intelligence. The selection is wide-ranging, from technology deep dives to case studies to expert opinion, but also subjective, based on our judgment of which topics and treatments will best serve InfoWorld’s technically sophisticated audience. InfoWorld does not accept marketing collateral for publication and reserves the right to edit all contributed content. Contact doug_dineley@foundryco.com.


Go language previews performance-boosting garbage collector 18 Aug 2025, 11:54 pm

Go 1.25, the latest version of the Google-developed open source programming language, has been released. The update brings new capabilities including an experimental garbage collector that improves performance, a fix for a compiler bug that could delay pointer checks, and a package that provides support for testing concurrent code.

Announced August 12 by the Go team, Go 1.25 can be accessed at go.dev. The release includes enhancements across tools, the runtime, the standard library, the compiler, and the linker.

The new garbage collector has a design that improves performance of marking and scanning small objects through better locality and CPU scalability, according to the Go team. The team expects a 10% to 40% reduction in garbage collection overhead in real-world programs that heavily use the collector. Developers can enable the collector by setting GOEXPERIMENT=greenteagc at build time.

For the compiler, meanwhile, the release fixes a bug from Go 1.21 that could incorrectly delay nil pointer checks. Programs like the one below, which used to execute successfully when they shouldn’t, the Go team said, will now correctly panic with a nil-pointer exception.


package main

import "os"

func main() {
    f, err := os.Open("nonExistentFile")
    name := f.Name() // f is nil here when Open fails, so this call should panic
    if err != nil {
        return
    }
    println(name)
}

In the standard library, Go 1.25 has a new testing/synctest package that supports testing concurrent code. The Test function runs a test function in an isolated “bubble,” the team said. Within the bubble, time is virtualized: time package functions operate on a fake clock, and the clock moves forward instantaneously if all goroutines in the bubble are blocked. Also, the Wait function waits for all goroutines in the current bubble to block. This package first became available in Go 1.24 under GOEXPERIMENT=synctest, with a slightly different API. The experiment has graduated to general availability.

Go 1.25 follows Go 1.24, which was introduced in February with enhancements pertaining to generic type aliases and WebAssembly. The Go language has gained attention lately with Microsoft’s plan to port the TypeScript compiler and tools to the language, with the intent of boosting performance.

Also featured in Go 1.25:

  • An experimental JSON implementation, when enabled, provides an encoding/json/v2 package, which is a major revision of the encoding/json package, and the encoding/json/jsontext package, which provides lower-level processing of JSON syntax.
  • The go build -asan option now defaults to doing leak detection at program exit. This will report an error if memory allocated by C is not freed and is not referenced by any other memory allocated by either Go or C.
  • The compiler now can allocate the backing store for slices on the stack in more situations, improving performance.
  • The compiler and linker now generate debug information using DWARF (debugging with attributed record formats) Version 5.
  • The Go distribution will include fewer prebuilt tool binaries. Core toolchain binaries such as the linker and compiler still will be included, but tools not invoked by build or test operations will be built and run by go tool as needed.
  • The linker now accepts a -funcalign=N command line option that specifies the alignment of function entries. The default value is platform-dependent and unchanged in Go 1.25.
  • For cryptography, MessageSigner is a signing interface that can be implemented by signers that wish to hash the message to be signed themselves.


Google adds VM monitoring to Database Center amid enterprise demand 18 Aug 2025, 11:55 am

Google has updated its AI-powered database fleet management offering — Database Center — with the capability to monitor self-managed databases running on its own compute virtual machines (VMs).

Several enterprises run their databases, such as PostgreSQL and MySQL, on compute VMs as they offer more flexibility, scalability, and cost-effectiveness when compared to dedicated hardware.

Earlier, enterprises could use the Database Center to only monitor Google-managed databases, including Spanner, AlloyDB, and Bigtable.

This capability, according to Google, is a result of several enterprises demanding support for monitoring self-managed databases to gain full oversight of all their deployed databases.

“This holistic visibility helps identify critical security vulnerabilities, improve security posture, and simplify compliance,” said Charlie Dai, VP and principal analyst at Forrester.

These security vulnerabilities could include outdated minor versions, broad IP access range, having databases without a root password, and having databases that don’t have auditing enabled, Google executives wrote in a blog post.

This capability is currently in preview, and enterprises need to sign up for early access to use it.

Google has also added new capabilities to the Database Center, such as alerting for new resources and issues for all the databases, adding Gemini-powered natural language capabilities for folder-level fleet management, and historical fleet comparison up to 30 days.

Google said its alerting for new resources and issues for all databases will allow enterprise users to create custom alerts when new database resources are provisioned and also receive alerts via email, Slack, and Google chat messages for any new issue types detected by Database Center.

This capability will enable proactive monitoring and allow immediate action to enforce governance policies, prevent configuration drift, and mitigate risks before they impact applications, Dai said.

In order to simplify fleet monitoring at scale, Google has added Gemini-powered language capabilities to Database Center at the folder level.

“This means you can now have contextual conversations about your databases within a specific folder, making it easier to manage and troubleshoot databases, especially in large and complex organizational environments,” Google executives wrote in a blog post.

The 30-day historical fleet comparison feature, on the other hand, can help enterprises with capacity planning and with analyzing the health of their database fleets.

Earlier, Google offered a seven-day historical comparison for database inventory and issues, and now it offers three options: 1 day, 7 days, and 30 days.

Enterprises or database administrators can use the fleet comparison feature to get a detailed view of new database inventory and identify new operational and security issues that emerged during the selected period, Google executives wrote.

This should help database administrators in enterprises make data-driven decisions for fleet optimization, Dai said.

Google did not clarify whether the additional capabilities are already available for enterprise users.


How does the metrics layer enhance the power of advanced analytics? 18 Aug 2025, 9:00 am

Amid all the buzz around advanced AI-powered data analytics, one crucial component often goes unnoticed: the metrics layer. This is where metrics creation takes place, the process that defines and manages the metrics that turn data signals into actionable, meaningful insights.

Although it’s increasingly vital for effective analytics, metric creation often does not receive enough attention in the broader business intelligence (BI) package, and enterprises frequently fail to understand its role.

What is the metrics layer?

Metrics turn a concept into something that can be measured. They provide the framework upon which stakeholders can track changes in whatever they want to track. Until raw data signals are converted into metrics, there’s no way to measure improvements or degradation, and no way to identify patterns or trends.

The metrics layer, metrics creation, metrics store, metrics platform or headless BI are all different terms for creating, managing, defining, enforcing and delivering metrics. The bundle of best practices, features and tools resides between the data source and the apps that use the data and deliver insights — hence the term “metrics layer.”

A metrics layer:

  • Serves as a single source of truth for metrics across all your dashboards, reports, applications and more.
  • Holds information about how to calculate metrics and the attributes that should be used to evaluate KPIs, like data repositories do for data and GitHub does for code.
  • Translates requests for metrics into SQL queries, executes the requests and then returns the metrics to the users.
  • Defines key metrics, explains what the data represents, such as whether an increase is favorable or negative, and shows how metrics relate to each other.

According to Gartner, which was one of the first entities to use the phrase, metrics creation is a use case that “Enables organizations to connect to data, prepare data and define standardized metrics that can be shared throughout the organization.”

“A metrics layer allows an organization to standardize its metrics and how they are calculated. It builds a single source of truth for all metric or KPI definitions for all data sources in the organization,” explains Christina Obry, a product manager at Tableau.

Is a metrics layer essential for BI success?

Metrics creation is so critical that Gartner considers it a mandatory component for any BI platform. Without a strong metrics layer, BI platforms struggle to deliver useful business intelligence.

There’s simply too much data flooding into enterprises, but also, there are too many tools measuring and analyzing that data, resulting in inconsistent metrics. Even simple metrics can become muddled, with tools disagreeing about how to measure them.

Avi Perez, CTO and co-founder of Pyramid Analytics, says that “Mature organizations understand the need for a protocol that ensures formulas are calculated consistently, maximizing their usefulness to users across departments. They don’t promote self-service at the expense of a single source of truth, and they seek out mechanisms for standardizing metrics.”

Data only has value when it’s transmuted into insights, but those insights need to reach the right decision-makers together with the right context. A metrics layer enables the creation of a universal glossary of metrics that every business stakeholder can use to inform sound decisions.

The dangers of operating without a metrics layer

Imagine counting the number of active users for an app. Should they be measured weekly, monthly or annually? How long can users go between logins before they are no longer considered “active” users? What’s the best way to segment them geographically?

The gaps in how these questions are answered lead to wasted time, a loss of trust in the data and widespread confusion. Without a universally managed metrics layer, departments can become misaligned and measure the same metric differently. In an era of data-driven business decision-making, muddled or inconsistent data can lead to damagingly erroneous decisions.

Fixing these inconsistencies can be a nightmare. First, you have to find them all, scattered across all your data sources, analysis tools and custom queries. As they are reused without oversight, the inconsistencies grow. Changing the business logic definition for every tool, every department leader, every time, means that data teams waste time firefighting instead of working on tasks that deliver value.

“Your organization has multiple dashboards. It may have multiple BI tools, too. Do you really want to define the business logic for your metrics every single time in each of those outlets? What if the logic changes as the business grows? That increases the chances of one instance being slightly off or out of date by the time someone looks at it and makes a decision,” warns Chris Nguyen, a BI analyst at Keller Williams Realty International.

A centralized metrics layer is a way to define and store metrics in a single place, so that everyone in your organization uses the same logic, every time.
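
As a deliberately simplified illustration (a sketch in TypeScript, not any particular vendor's API, and with a made-up logins table), a metric definition in such a layer might pair a name and description with the single piece of logic that computes it, so every dashboard or report asks for the metric by name instead of re-implementing the calculation:


// One shared definition of "active users" that every consumer reuses.
interface MetricDefinition {
  name: string;
  description: string;
  toSql: (params: { startDate: string; endDate: string }) => string;
}

const activeUsers: MetricDefinition = {
  name: "active_users",
  description: "Distinct users with at least one login in the reporting period",
  toSql: ({ startDate, endDate }) =>
    `SELECT COUNT(DISTINCT user_id)
       FROM logins
      WHERE login_at BETWEEN '${startDate}' AND '${endDate}'`,
};

// A dashboard, a spreadsheet export, and an executive report all call the same
// definition, so a change to the logic propagates everywhere at once.
console.log(activeUsers.toSql({ startDate: "2025-07-01", endDate: "2025-07-31" }));

If the business later decides that "active" means two logins rather than one, the SQL changes in this one definition and every consumer picks up the new logic automatically.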

What are the benefits of a metrics layer?

Metrics creation delivers value that goes beyond the critical need for consistent metrics. By setting up a centralized repository for business metrics and KPIs, organizations enjoy numerous benefits:

  • More trust in data, thanks to consistency in the metrics used across the organization
  • Improved accessibility to vital metrics for line-of-business users who aren’t data experts
  • Increased scalability for business logic across the company
  • Shorter time to insights and real-time updates
  • Greater adaptability to changing business needs

IT consultant Sean Michael Kerner emphasizes that “metrics stores provide a consistent way for organizations to use and reuse metrics definitions and calculations across different data tools and teams.” Everyone can inspect metrics definitions at will, helping improve transparency and trust in data.

Integrating centralized metrics management with modern data architecture makes it easy to update definitions as business requirements evolve and then propagate them across the organization. This improves both scalability and collaboration, as the whole organization speaks the same data “language” without gaps or misunderstandings.

Metrics stores are built to integrate natively with open APIs, making it possible to surface metrics in the workflows and apps where LOB users need them most. Moreover, headless BI infrastructure enables real-time and near real-time updates, helping to keep decision-making relevant and informed.

A metrics layer is also a boon for software engineers. Because it translates metric definitions into code, it helps tech teams follow established best practices, such as version control, tracking and the DRY (don’t repeat yourself) principle. This increases efficiency and reduces repetitive work.

Metrics creation is advanced analytics’ crucial ingredient

Robust metrics creation is the glue that holds true advanced BI solutions together. Without this use case, data would languish unused, metrics would diverge across the organization, teams would struggle to coordinate and insights would arrive too late or not at all.

FAQs:

What is a metrics layer?

A metrics layer is a centralized data modeling layer that defines consistent business metrics across different tools and teams. It ensures everyone uses the same logic and calculations for analysis and reporting.

Why does a metrics layer matter for business analytics?

A metrics layer ensures consistency and accuracy in business analytics by standardizing metric definitions, reducing errors and enabling faster, relevant insights across teams. Without metrics creation, data would become confused and trust would drop.

What are the business benefits of a metrics layer?

The business benefits of a metrics layer include:

  • Consistent and accurate metrics across tools
  • Faster decision-making with trusted data
  • Minimal manual errors and duplicated logic
  • Improved collaboration between data and business teams

What are the use cases for a metrics layer in enterprise analytics?

Use cases for a metrics store in enterprise analytics include:

  • Different teams can collaboratively define, refine and use metrics
  • Consistent, real-time metrics for executive dashboards and operations
  • Uniform financial metrics for accurate reporting and forecasting
  • Centralized customer, product and HR metrics for deeper analysis
  • Headless commerce and supply chain optimization with API-driven, consistent metrics
  • A single source of truth for metrics used in regulatory compliance reporting and internal audits

This article is published as part of the Foundry Expert Contributor Network.
