Bringing databases and Kubernetes together | InfoWorld


How Agile practices ensure quality in GenAI-assisted development 9 Apr 2026, 9:00 am

Generative AI has revolutionized software development, enabling developers to write code at unprecedented speed. Tools such as GitHub Copilot, Amazon CodeWhisperer and ChatGPT have become a normal part of how engineers work. I have experienced this firsthand, in roles ranging from leading engineering teams at Amazon to building large-scale invoicing and compliance platforms: both the huge productivity boosts and the equally great risks that come with GenAI-assisted development.

The productivity promise of GenAI is compelling. Developers who use AI coding assistants report productivity gains of 15% to 55%. But this speed often comes with hidden dangers. Without good guardrails, AI-generated code can open security holes, create technical debt and introduce bugs that are difficult to detect through traditional code review. According to McKinsey research, GenAI tools make developers more productive, but they also require rethinking software development practices to maintain code quality and security.

The answer is not to abandon these tools. It is to combine them with reliable engineering practices that teams already know and trust. Properly applied, traditional Agile methodologies provide precisely the guardrails that let you benefit from GenAI while controlling its hazards. In this article, I consider five core Agile practices: test-driven development (TDD), behavior-driven development (BDD), acceptance test-driven development (ATDD), pair programming and continuous integration. Together, they provide the guardrails that make GenAI development not just quicker, but of higher quality.

The GenAI code quality crisis: Real-world issues

Before we jump into solutions, it is worth naming the problem. The issues with AI-generated code aren’t theoretical. They’re appearing in production systems across the industry:

  • Security vulnerabilities: In 2023, researchers at Stanford found that developers using AI assistants were more likely to introduce security vulnerabilities into their code, particularly injection flaws and insecure authentication mechanisms. A study published in IEEE Security & Privacy demonstrated that GitHub Copilot suggested vulnerable code approximately 40% of the time across common security-critical scenarios. At one major financial institution, an AI-generated SQL query bypassed parameterization, creating a critical injection vulnerability that wasn’t caught until penetration testing.
  • Hallucinated dependencies: AI models sometimes suggest libraries, functions or APIs that don’t exist. A team at a healthcare company spent three days chasing a compilation failure in a microservice, only to learn that the AI had suggested a nonexistent AWS SDK method. The code looked legitimate and passed initial review, but the method signature was completely made up.
  • Subtly incorrect business logic: Most misleading of all are business-logic mistakes that look correct on the surface but hide subtle defects. For example, we came across an AI-generated line-item tax calculation on an invoice that looked perfect, but on close inspection applied rounding at the level of each item rather than at the subtotal. A brief reading of the logic suggested it was correct, but the different rounding sequence would have made final invoice totals diverge from legal tax-reporting requirements, creating compliance risks and reconciliation errors across millions of transactions.
  • Technical debt accumulation: AI tools focus on producing working code rather than maintainable code. They frequently recommend deeply nested conditional logic, duplicated code patterns and excessively complex solutions when simpler alternatives are available. Gartner research warned that without strong governance, early GenAI adopters can accumulate cost, complexity and technical debt.
  • Compliance and licensing issues: AI models trained on public code repositories can generate code that is essentially a copy of source published under incompatible licenses. For heavily regulated industries such as healthcare and finance, this poses serious noncompliance risks. A pharmaceutical company, for example, found AI-generated code that closely resembled GPL-licensed open-source software; relying on it would have created legal exposure.

The root cause: Speed without clear specification

These problems share the same root cause: AI produces code based on patterns it has seen, without real understanding of requirements, context or correctness. It optimizes for probability (“what code pattern most likely follows this prompt”) rather than for correctness or suitability for the particular case.

Traditional code review, although essential, is not enough to protect against errors in AI-generated code. Reviewers find it difficult to spot subtle errors in code that looks legitimate, and the volume of AI-generated code can easily overwhelm review capacity. We need automated, systematic methods that check correctness, not just quick visual inspection.

Agile practices as GenAI guardrails

The answer lies in methods that predate GenAI by decades, yet are well suited to fixing its flaws. Each provides a different type of safety net:

1. Test-driven development (TDD): The correctness validator

The TDD cycle (Red, Green, Refactor) provides the most direct protection against incorrect AI-generated code. By writing tests first, you create an executable specification of correctness before the AI generates any implementation.

How it works with GenAI:

  • Red: Write a failing test that specifies the exact behavior you need. This test becomes your requirement specification in executable form.
  • Green: Ask the AI to generate code that makes the test pass. The AI now has a clear, unambiguous target.
  • Refactor: Use AI to suggest improvements to the working code while ensuring tests still pass.

Real-world impact: We applied strict TDD alongside GenAI-assisted development: before accepting any AI suggestion, developers write extensive unit tests that specify every aspect of the behavior. This caught a critical line-item tax calculation error. The AI suggested a simple multiplication that “looked” correct, but the test specifically checked the legal rounding requirement (rounding at the subtotal level rather than the line level), so the AI’s initial code failed immediately. Without TDD, this discrepancy would have reached production, resulting in significant compliance risks and revenue reconciliation failures.

TDD also solves the hallucinated-dependency problem. If the AI suggests a method or library that does not exist, the test will fail to compile or run, providing immediate feedback instead of a multi-day investigation.
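The Red step can be sketched as a test that encodes the subtotal-level rounding rule from the tax example. Everything below (`calculateInvoiceTax`, the 7.25% rate, the line amounts) is an illustrative assumption, not the platform’s real API:

```typescript
// All names here are illustrative assumptions, not real product code.
interface LineItem {
  amount: number;
}

// Subtotal-level rounding: sum the line amounts first, then round the
// computed tax once, as the legal requirement in the example demands.
function calculateInvoiceTax(items: LineItem[], rate: number): number {
  const subtotal = items.reduce((sum, item) => sum + item.amount, 0);
  return Math.round(subtotal * rate * 100) / 100;
}

// Red: this test encodes the rounding requirement. A naive per-line
// implementation (round each line's tax to cents, then sum) yields
// 0.08 + 0.08 + 0.08 = 0.24 here and fails; the subtotal-level answer is 0.23.
const items: LineItem[] = [{ amount: 1.05 }, { amount: 1.05 }, { amount: 1.05 }];
const tax = calculateInvoiceTax(items, 0.0725);
console.assert(tax === 0.23, `expected 0.23, got ${tax}`);
```

Because the test pins the rounding sequence, an AI-generated per-line implementation fails it immediately rather than slipping into production.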

2. Behavior-driven development (BDD): The business logic guardian

BDD extends TDD by focusing on system behavior from the user’s perspective using Given-When-Then scenarios. This is particularly powerful for GenAI-assisted development because it creates human-readable specifications that bridge the gap between business requirements and code.

BDD scenarios serve two critical functions with AI-generated code:

First, they provide context-rich prompts for the AI. Instead of asking “write a function to calculate tax,” you provide a complete scenario: “Given a customer in California, when they purchase a $100 item, then the tax should be $9.25.” The AI has more context to generate correct code.

Second, they create executable business logic tests that catch subtle errors humans might miss. The scenarios are written in plain language by product owners and domain experts, then automated using frameworks like Cucumber or Cypress.
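A minimal, framework-free sketch shows how such a scenario becomes a deterministic check against the tax logic; in practice a framework like Cucumber would map the plain-language steps to code. `calcTax` and its rate table are assumptions for illustration only:

```typescript
// Hypothetical tax helper; the rate table is an assumption for illustration.
function calcTax(state: string, amount: number): number {
  const rates: Record<string, number> = { CA: 0.0925 };
  return Math.round(amount * (rates[state] ?? 0) * 100) / 100;
}

// Given a customer in California,
// When they purchase a $100 item,
// Then the tax should be $9.25.
console.assert(calcTax("CA", 100) === 9.25, "California scenario failed");
```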

Real-world impact: Our compliance platform processes invoices across multiple tax jurisdictions. When we started using AI assistance, we first created comprehensive BDD scenarios covering all tax rules, edge cases and regulatory requirements. These scenarios, written by our tax compliance specialists, became the specification for AI code generation. AI-generated code that passed all BDD scenarios was correct 95% of the time, far higher than code generated from vague prompts.

3. Acceptance test-driven development (ATDD): The stakeholder alignment tool

ATDD involves customers and stakeholders early in defining automated acceptance tests before development begins. This practice is crucial when using GenAI because it ensures the AI is solving the right problem, not just generating plausible-looking code.

The ATDD workflow with GenAI:

  • Specification workshop: Product owners, developers and testers collaborate to define acceptance criteria in a testable format. This creates a shared understanding of “done.”
  • Test automation: Convert acceptance criteria into automated tests before writing implementation code. These tests represent the customer’s definition of success.
  • AI-assisted implementation: Use GenAI to implement features that satisfy the acceptance tests. The tests prevent the AI from drifting away from actual requirements.

Real-world impact: For a volume-based discount feature, we held ATDD workshops to define a specific requirement: “Buy 10, Get 10% Off” must apply only to the qualifying line items, not the entire invoice total. These became our automated acceptance tests. When developers used GenAI to implement the logic, the AI suggested a simple, global discount function that subtracted 10% from the final balance, a common coding pattern for retail, but incorrect for our B2B contractual rules. Because the ATDD test validated the discount at the line-item level, the AI’s “perfect-looking” code failed immediately. This prevented a logic error that would have resulted in significant over-discounting and lost revenue across thousands of bulk orders.
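That acceptance criterion can be expressed as an executable test. The types and `applyVolumeDiscount` below are illustrative assumptions; the point is that the test pins the discount to qualifying line items, so a global-discount implementation fails it:

```typescript
// Illustrative types; not the platform's real data model.
interface Line {
  sku: string;
  qty: number;
  unitPrice: number;
}

// Correct per the acceptance test: discount only lines that individually
// qualify (qty >= 10). A "global" implementation that takes 10% off the
// invoice total would also discount line B and fail the assertion below.
function applyVolumeDiscount(lines: Line[]): number {
  return lines.reduce((total, line) => {
    const gross = line.qty * line.unitPrice;
    return total + (line.qty >= 10 ? gross * 0.9 : gross);
  }, 0);
}

// Acceptance test: line A qualifies (10 × $5 = $50, discounted to $45);
// line B (2 × $20 = $40) does not, so the expected total is $85.
const invoiceTotal = applyVolumeDiscount([
  { sku: "A", qty: 10, unitPrice: 5 },
  { sku: "B", qty: 2, unitPrice: 20 },
]);
console.assert(invoiceTotal === 85, `expected 85, got ${invoiceTotal}`);
```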

4. Pair programming: The human-AI collaboration model

Traditional pair programming involves two developers working together, often one writing tests and the other writing implementation. With GenAI, this model evolves into a powerful three-way collaboration: Developer A writes tests, Developer B reviews AI-generated code and the AI serves as a rapid implementation assistant.

The enhanced pair programming workflow:

  • Navigator role: One developer focuses on writing comprehensive tests and thinking about edge cases, security implications and architectural fit. They are not distracted by implementation details.
  • Driver role: The other developer works with the AI to generate implementation code, critically evaluating each suggestion. They serve as the quality filter for AI output.
  • AI assistant: Generates implementation suggestions based on tests and context, accelerating the coding process while the human pair ensures quality.

Real-world impact: A recent study by GitClear found that code quality metrics declined when developers used AI tools in isolation but improved when used in pair programming contexts. We recommend pair programming for any AI-assisted development of critical systems. The navigator catches security issues and architectural mismatches that the driver, focused on AI output, might miss. We have seen a 60% reduction in post-deployment bugs compared to solo AI-assisted development.

5. Continuous integration (CI): The automated safety net

Continuous integration runs automated test suites every time code is merged. It becomes even more critical with GenAI-assisted development. CI provides the final safety net that catches issues before they reach production.

Enhanced CI pipeline for GenAI code:

  • Comprehensive test execution: Run all unit tests, integration tests, BDD scenarios and acceptance tests on every commit. AI-generated code must pass the entire suite.
  • Static analysis: Include additional static analysis tools that check for common AI-generated code issues like security vulnerabilities, code complexity metrics and licensing compliance.
  • Performance benchmarks: Automated performance tests catch AI-generated code that works correctly but performs poorly at scale.

Real-world impact: Our CI pipeline is configured with specialized checks designed to catch the unique risks of AI-assisted coding. For the invoicing platform, we integrated automated business-rule validators that specifically verify logic like tax rounding and discount applications.

The synergistic effect: Practices working together

The real power emerges when these practices work together. Each creates a different layer of protection:

  • TDD ensures the code works correctly for specified inputs.
  • BDD ensures it implements the right business behavior.
  • ATDD ensures it meets stakeholder expectations.
  • Pair programming ensures human oversight and critical thinking.
  • CI ensures all these checks run automatically and consistently.

Consider a typical user story for an e-commerce platform: “As a customer, I want to apply discount codes so that I can save money on purchases.”

Without Agile practices, a developer might prompt an AI: “Write a function to apply discount codes to shopping carts.” The AI generates plausible-looking code, the developer briefly reviews it and it ships. Hidden issues might include: discount stacking vulnerabilities, floating-point rounding errors, failure to validate expiration dates or SQL injection in the discount code lookup.

With Agile practices:

  • ATDD: Product owner, developer and tester define acceptance criteria: “Given a valid 10% discount code, When applied to a $100 cart, Then the total should be $90.”
  • BDD: Business analyst writes scenarios covering edge cases: expired codes, invalid codes, maximum discount limits, combination rules.
  • TDD: Developer pair writes unit tests first, including security tests for injection attacks, tests for decimal precision and tests for all edge cases.
  • Pair programming: One developer writes tests, the other works with AI to implement, both review the generated code critically.
  • CI: All tests run automatically on commit, plus static analysis for security issues, performance benchmarks and compliance checks.

This multi-layered approach catches issues at different stages: tests catch functional errors, pair programming catches architectural mismatches, CI catches regressions and security issues.
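As a sketch of how those layers translate into code, here are the kinds of tests the pair might write before prompting the AI for the discount-code story. All names (`applyDiscountCode`, the types) are hypothetical:

```typescript
// Hypothetical domain types for the discount-code user story.
interface Cart {
  total: number;
  appliedCodes: string[];
}

interface DiscountCode {
  code: string;
  percent: number;
  expires: Date;
}

// An implementation shaped by the tests: expired codes are rejected and the
// same code cannot be stacked, two of the hidden issues listed above.
function applyDiscountCode(cart: Cart, dc: DiscountCode, now: Date): Cart {
  if (now > dc.expires) throw new Error("expired code");
  if (cart.appliedCodes.includes(dc.code)) throw new Error("code already applied");
  return {
    total: Math.round(cart.total * (1 - dc.percent / 100) * 100) / 100,
    appliedCodes: [...cart.appliedCodes, dc.code],
  };
}

// ATDD criterion: a valid 10% code turns a $100 cart into $90.
const cart: Cart = { total: 100, appliedCodes: [] };
const save10: DiscountCode = { code: "SAVE10", percent: 10, expires: new Date("2099-01-01") };
const discounted = applyDiscountCode(cart, save10, new Date("2026-01-01"));
console.assert(discounted.total === 90, "valid code should yield $90");

// BDD edge case: applying the same code twice (stacking) must fail.
let stackingRejected = false;
try {
  applyDiscountCode(discounted, save10, new Date("2026-01-01"));
} catch {
  stackingRejected = true;
}
console.assert(stackingRejected, "stacking should be rejected");
```

In CI, these tests run on every commit alongside static analysis, so a regression in any layer is caught automatically.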

Implementation recommendations

Based on our experience implementing these practices across multiple teams, here are practical recommendations for organizations adopting GenAI development tools:

  • Start with TDD as the foundation: Make test-first development non-negotiable when using AI assistance. This single practice prevents the majority of AI-generated code issues. Invest in training developers on TDD if they’re not already proficient.
  • Enhance code review processes: Traditional code review checklists need updating for AI-generated code. Add specific review criteria: Does the code handle edge cases? Are there obvious security vulnerabilities? Does it match our architectural patterns? Is the complexity appropriate for the problem?
  • Invest in test infrastructure: Strong CI pipelines become even more important. Ensure your pipeline can run comprehensive test suites quickly. Slow test execution discourages frequent commits and reduces the effectiveness of CI as a safety net.
  • Create AI usage guidelines: Document when and how to use AI assistance. Some scenarios might be high risk (security-critical code, financial calculations) and require extra scrutiny. Others might be low-risk (boilerplate code, standard CRUD operations) and benefit most from AI acceleration.
  • Measure and monitor: Track metrics specific to AI-assisted development, such as defect rates in AI-generated vs. human-written code, test coverage trends, time-to-production and post-deployment issues. Use data to refine your practices.


Conclusion: Speed with safety

Generative AI is a fundamental change in how we write software, comparable to the introduction of high-level programming languages or integrated development environments. It brings real, substantial productivity gains. But speed without quality is not progress; it is technical debt accumulating at an accelerated rate.

The good news is that we don’t need to invent new methods to use GenAI safely. Agile practices refined over decades, such as TDD, BDD, ATDD, pair programming and CI, are exactly the safety measures we need. These practices build quality in through automation, collaboration and continuous verification. They are even more valuable with AI assistance because they provide objective, automated checks free of the pattern-matching biases that make humans poor reviewers of AI-generated code.

Companies that adopt GenAI tools while maintaining strict software development practices will get the best results: faster development without sacrificing quality, reduced defect rates despite increased velocity and sustainable productivity gains that don’t create future maintenance burdens.

The future of software development is not humans or AI alone. It is humans and AI cooperating within established quality assurance, protected by proven frameworks. Combining AI’s speed with the safety of Agile methods lets us build software that is both efficient and high quality at scale.

This article is published as part of the Foundry Expert Contributor Network.


Rethinking Angular forms: A state-first perspective 9 Apr 2026, 9:00 am

Forms remain one of the most important interaction surfaces in modern web applications. Nearly every product relies on them to capture user input, validate data, and coordinate workflows between users and back-end systems. Yet despite their importance, forms are also one of the areas where front-end complexity tends to accumulate quietly over time.

For simple scenarios, Angular forms feel straightforward to work with. A handful of controls, a few validators, and a submission handler can be implemented quickly and confidently. But the situation changes as applications grow. Nested form groups, dynamic controls, conditional validation, and cross-field dependencies gradually introduce layers of behavior that are difficult to visualize as a single coherent system.

Developers often reach a point where a form technically works but becomes difficult to explain. Adding a new rule or modifying a validation condition can require tracing through observables, validators, and control states that are spread across multiple components. The challenge is rarely a missing feature. Instead, it is the growing difficulty of reasoning about how the system behaves as a whole.

This is not a criticism of Angular forms themselves. The framework has evolved powerful abstractions that solve real problems: keeping view and model synchronized, enforcing validation rules, coordinating asynchronous operations, and maintaining accessibility. These capabilities are essential for production-scale applications.

The more interesting question is architectural rather than technical. What mental model should developers use when reasoning about form behavior in modern Angular applications?

In this article, we step away from specific APIs and instead examine forms from first principles. By looking at forms primarily as state systems rather than event pipelines, we can better understand where complexity originates and why newer reactive primitives such as Angular Signals align naturally with the underlying structure of form logic.

Over the past decade, Angular forms have been shaped primarily by event-driven abstractions and observable streams. As the framework evolves toward signal-based reactivity, it is worth reconsidering whether forms should continue to be modeled primarily around events at all.

Forms are not fundamentally event systems. They are state systems that happen to receive events. This distinction becomes clearer as front-end systems grow larger and validation logic becomes increasingly intertwined with application state.

Many front-end systems have gradually adopted an event-centric mental model, where application behavior is primarily expressed through chains of reactions and emissions. As discussed in my recent InfoWorld article, “We mistook event handling for architecture”, this approach can blur the distinction between reacting to change and representing the underlying state of an application. Forms are one of the areas where that distinction becomes particularly visible.

Why forms became complicated (and why that was reasonable)

To understand why a signal-first approach matters, it is worth briefly revisiting how Angular forms evolved and why complexity was an unavoidable outcome.

Early Angular forms were primarily about synchronization. Input elements need to remain synchronized with the model, and updates must flow in both directions. Template-driven forms relied heavily on two-way binding to achieve this. For small forms, this approach felt intuitive and productive. However, as forms grew larger and more complex, the need for structure became apparent. Validation rules, cross-field dependencies, conditional UI logic, and testability all pushed developers toward a more explicit model.

Reactive forms addressed this need by modeling forms as trees of controls. Each control encapsulated its own value, validation state, and metadata. RxJS observables provided a declarative way to respond to changes over time. Validators, both synchronous and asynchronous, could be attached to controls, and Angular automatically tracked interaction state, such as whether a control was dirty, touched, or pending.

This architecture solved many real problems. It also shifted the dominant mental model from state to events. That shift was reasonable at the time, but it also encouraged developers to think of form behavior primarily as a sequence of reactions rather than as a system defined by state. Developers began reasoning about forms in terms of streams: when a value emits, when a status changes, when a validator runs, and when subscriptions are triggered. In simple cases, this was manageable. In larger forms, it often became difficult to trace why a particular piece of logic executed or why a control entered a specific state.

The deeper issue is not that reactive forms rely on RxJS, but that they often conflate state with coordination. RxJS excels at coordinating asynchronous workflows and reacting to events. It is less well-suited to serve as the primary representation of state. Forms, however, are overwhelmingly state-driven. At any given moment, a form has a well-defined set of values, validation rules, derived errors, and UI flags. Much of this information can be computed deterministically, without reference to time or event ordering.

As form logic grows, the cost of mixing state representation with event coordination increases. Debugging requires tracing emissions across multiple observables. Understanding behavior requires knowing not only what the state is but also how it arrived at that state. This is the context in which Angular Signals becomes interesting, not as a replacement for RxJS, but as a better fit for modeling form state itself.

Defining form state from first principles

Before introducing any APIs or framework constructs, it is useful to strip the problem down to its essentials and ask a basic question: what is form state?

At its core, a form exists to collect data. This data is typically represented as a plain object composed of strings, numbers, booleans, or nested structures. These values form the canonical source of truth for everything else the form does. Without values, there is no form.

Validation rules operate on those values. They define constraints such as whether a field is required, whether a value conforms to a particular format, or whether multiple fields satisfy a cross-field condition. Importantly, validation rules do not store state. Given the same input values, they always produce the same outcome. They are pure functions of state, not state themselves.

From values and validation rules, we derive validity and error information. A field is either valid or invalid, and specific error messages may apply. At the form level, validity is typically derived by aggregating field-level results. This information is deterministic and can be recalculated at any time from the underlying values.

Forms also track interaction metadata. Whether a field has been touched or modified influences when feedback is shown to the user, but it does not affect the correctness of the data. This metadata exists to improve user experience, not to define business logic.

Finally, there are side effects. Submitting data to a server, persisting drafts, performing asynchronous validation, or navigating to another view are all reactions to state changes. These actions matter, but they are not the state. They are consequences of the state.

Seen through this lens, most of what we consider “form complexity” is not inherent complexity. It is organizational complexity. Derived information is often stored as mutable state. Validation logic is scattered across imperative callbacks. UI flags are toggled in response to events rather than derived from underlying conditions.

Signals encourage a different organization. They make it natural to treat values as the only mutable input, to express validity and UI state as derived data, and to isolate side effects as explicit reactions. This separation does not introduce new ideas, but it makes existing best practices easier to apply consistently.
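To make this concrete, here is a framework-agnostic sketch of that organization (it deliberately avoids the Angular Signals API, which the next article covers): values are the only mutable input, and errors are derived on demand by pure functions. All names are illustrative:

```typescript
// A validator is a pure function of the value: same input, same outcome.
type Validator = (value: string) => string | null;

interface FieldState {
  value: string;
  validators: Validator[];
}

// Derived, never stored: errors are recomputed from the current value,
// so there is no stale validation state to reconcile.
function errorsOf(field: FieldState): string[] {
  return field.validators
    .map((validate) => validate(field.value))
    .filter((error): error is string => error !== null);
}

const required: Validator = (v) => (v.trim() === "" ? "required" : null);
const email: Validator = (v) => (v.includes("@") ? null : "invalid email");

// The value is the only thing that changes; validity follows deterministically.
let emailField: FieldState = { value: "", validators: [required, email] };
console.assert(errorsOf(emailField).length === 2);

emailField = { ...emailField, value: "user@example.com" };
console.assert(errorsOf(emailField).length === 0);
```

With signals, `errorsOf` would become a `computed` that recalculates automatically when the value signal changes; the structure of the logic stays the same.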

Understanding this distinction is essential before adopting any signal-based form API. Without it, signals risk becoming just another abstraction layered on top of existing complexity. With it, they become a tool for simplifying how form behavior is expressed and understood.

The cost of treating state as events

As reactive forms evolved, event coordination came to dominate form logic. Value changes emitted events. Validation status changes emitted events. Asynchronous validators emitted events. Subscriptions responded to these emissions, producing additional side effects. This model is powerful, but it subtly shifts how developers reason about form behavior.

When form logic is expressed primarily through events, understanding behavior requires temporal reasoning. Developers must ask not only what the current state of the form is, but also how the form arrived at that state. Questions such as “Which emission triggered this validator?” or “Why did this error appear now?” become common. The answers often depend on subscription order, life-cycle timing, or intermediate states that no longer exist.

This event orientation creates an asymmetry in how form behavior can be inspected. Current state, values, errors, and validity can be logged or displayed. The sequence of events that produced that state cannot. Once an emission has passed, it leaves no trace beyond its effects. Debugging becomes an exercise in reconstruction rather than observation.

Over time, the focus on events leads to a common anti-pattern: derived information is promoted to mutable state. Validation results are stored rather than computed. UI flags are toggled imperatively rather than derived from underlying conditions. These shortcuts reduce immediate friction but increase long-term complexity. The form begins to carry not only its current state but also the historical residue of its manipulation.

The problem becomes more pronounced as forms grow. Cross-field validation introduces dependencies that span multiple controls. Conditional logic ties UI behavior to combinations of values and interaction states. At this scale, the cost of reasoning in terms of events compounds. Understanding behavior requires tracing emissions across multiple observables, each representing a partial view of the system.

This is not a failure of RxJS or reactive forms. RxJS excels at coordinating asynchronous workflows and reacting to external data streams. The issue arises when event-driven coordination is used as the primary representation of state. Forms, by their nature, are overwhelmingly state-driven. At any given moment, a form has a well-defined configuration of values, rules, and derived outcomes.

Recognizing this mismatch is an important step. It allows us to separate coordination concerns from state representation, and to ask whether some of the complexity we experience is inherent or simply a consequence of the mental model we apply.

Gaining a state-first perspective

Many of the challenges developers encounter when building complex forms are not the result of missing framework features. They arise from how form behavior is structured and reasoned about. When validation rules, UI state, and side effects are coordinated primarily through event flows, understanding the system often requires reconstructing the sequence of events that produced the current state.

A state-first perspective approaches the problem differently. Form values become the central source of truth. Validation rules operate deterministically on that state. Error messages, validity flags, and UI behavior emerge as derived information rather than independently managed pieces of mutable state.

This shift does not invalidate existing Angular Forms patterns, nor does it diminish the usefulness of RxJS where coordination of asynchronous workflows is required. Instead, it clarifies the distinction between two different concerns: representing the state and reacting to events.

Teams that model forms explicitly around state tend to build systems that are easier to inspect, easier to refactor, and easier to reason about as they grow. Angular’s evolving reactivity model opens the door to expressing these ideas more directly.

In the next article in this series, we will examine Angular Signals themselves—what they are, how they differ from observable-driven reactivity, and why their design aligns naturally with the way form state behaves in real applications. From there, the series will explore how signal-driven models can simplify validation, derived state, and large-scale form architecture.

Bringing databases and Kubernetes together 9 Apr 2026, 9:00 am

Running databases on Kubernetes is popular. For cloud-native organizations, Kubernetes is the de facto standard approach to running databases. According to Datadog, databases are the most popular workload to deploy in containers, with 45 percent of container-using organizations using this approach. The Data on Kubernetes Community found that production deployments were now common, with the most advanced teams running more than 75 percent of their data workloads in containers.

Kubernetes was not built for stateful workloads originally—the project had to develop multiple new functions like StatefulSets in Kubernetes 1.9 and Operator support for integration with databases later. With that work done over the first 10 years of Kubernetes, you might think that all the hard problems around databases on Kubernetes have been solved. However, that is not the case.

Embracing database as a service with Kubernetes

Today we can run databases in Kubernetes successfully, and run those database workloads alongside the application components that also live in containers. This makes the application development side easier, as all the infrastructure is in one place and can be controlled from one point. While that approach makes the “Day 1” issues around application development easier, it does not deal with many of the “Day 2” issues that still have to be addressed.

Day 2 issues include all the tasks that you need to have running so your application operates effectively over time. That includes looking at resiliency, security, operational management, and business continuity. For developers looking at databases, that means tasks like backup, availability, and failover. Some of these elements are easier in containers. Kubernetes was built to monitor containers in pods and to restart them if a problem took place. However, stateful databases require more planning than stateless applications.

Kubernetes Operators can automate some of these processes for you, allowing Kubernetes to instruct a database cluster to carry out a backup task automatically. But that doesn’t cover the whole process, and it relies on the developer making decisions about how best to follow that process. If you are not an expert in backup or availability, you might be tempted to hand all these concerns over to a third-party provider to take care of.
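
To make the Operator pattern concrete, here is a minimal Python sketch of what triggering a backup typically looks like: the developer creates a custom resource describing the desired backup, and the Operator watching that resource type does the rest. The group, version, and kind names below are hypothetical, not those of any particular Operator.

```python
# Sketch: asking a database Operator for a backup by creating a custom
# resource it watches. The CRD group/version/kind here are invented for
# illustration; real Operators define their own.

def make_backup_manifest(cluster_name: str, backup_name: str) -> dict:
    """Build a custom-resource manifest requesting a backup of a cluster."""
    return {
        "apiVersion": "db.example.com/v1",   # hypothetical CRD group/version
        "kind": "DatabaseBackup",            # hypothetical kind
        "metadata": {"name": backup_name},
        "spec": {"clusterName": cluster_name},
    }

manifest = make_backup_manifest("prod-pg", "prod-pg-nightly")
print(manifest["kind"])

# To submit it to a cluster, one would hand this dict to the official
# Kubernetes client, roughly:
#   kubernetes.client.CustomObjectsApi().create_namespaced_custom_object(
#       group="db.example.com", version="v1", namespace="default",
#       plural="databasebackups", body=manifest)
```

The point is that the developer declares intent; the Operator encodes the expert knowledge of how to actually perform a safe backup.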

That approach works. Cloud-based database as a service (DBaaS) offerings grew at nearly twice the rate of on-premises deployments according to Gartner (64% versus 36%) as developers went for the easy option. However, this locked them into that particular cloud provider or service. Even when developers chose an open source database to base their application on, they would still be tied to the provider and its way of doing things. From a mobility perspective, running a DBaaS rather than doing it yourself can carry a serious cost.

The future for Kubernetes and database as a service

Automating Kubernetes workloads with Operators can provide the same level of functionality as DBaaS, while still avoiding lock-in to a specific provider. This should fit into how teams want to run internal developer platforms or platform engineering for their developers. However, getting that to work for all databases in a consistent way is its own challenge.

At Percona, we did some of this work around our own project, Everest. But this only supported the databases that we are interested in — namely, MySQL, PostgreSQL and MongoDB. What about other databases that have Operators? How about other systems for managing and observing databases? While the idea of a fully open source database as a service option is great in theory, in practice it needs a community that is willing to get involved and support how that gets built out.

If you really love something, sometimes you have to set it free. Like Woody in Toy Story 3, you just have to say, “So long, partner” with the hope that things go “To Infinity and Beyond” a la Buzz Lightyear. That is what we have done with Everest — or to use its new name, OpenEverest. OpenEverest is now a fully open source project that anyone can get involved in. With the project donated to and accepted by the Cloud Native Computing Foundation, running databases on Kubernetes will become easier for everyone. Over time, OpenEverest will support more databases based on what the community wants to see and where the community can help with contributions or support.

For developers, bringing Kubernetes and databases together helps them be more productive. But for those running infrastructure or dealing with Day 2 problems around Kubernetes, databases remain challenging to manage. Dealing with edge cases, automation, and resilience at scale is still a significant hurdle for databases on Kubernetes, yet this approach remains essential if you want to implement platform engineering or internal developer platforms without lock-in. This new open source project is a strong starting point for delivering on that requirement. Making it a community open source software project under a foundation, rather than the preserve of one company, will help this approach succeed.

New Tech Forum provides a venue for technology leaders—including vendors and other outside contributors—to explore and discuss emerging enterprise technology in unprecedented depth and breadth. The selection is subjective, based on our pick of the technologies we believe to be important and of greatest interest to InfoWorld readers. InfoWorld does not accept marketing collateral for publication and reserves the right to edit all contributed content. Send all inquiries to doug_dineley@foundryco.com.

Yael Nardi Names Minimus as Chief Business Officer to Head Growth Strategy 9 Apr 2026, 4:25 am

Minimus, a provider of hardened, secure container images designed to reduce CVE risk, today announced the appointment of Yael Nardi as Chief Business Officer (CBO). In this newly created role, Nardi will lead the company’s next phase of operations, overseeing top-of-funnel growth strategy, strategic operations, and future corporate development.

As the market landscape evolves and AI affects customer acquisition, Minimus is implementing an operational model to scale marketing and strategic alliances, which will be managed by Nardi.

“We are entering a phase of aggressive expansion that requires rigorous execution and a completely new playbook. Traditional marketing strategies are no longer enough in today’s fast-moving environment. We need an operational powerhouse at the helm. Yael is a world-class operator accustomed to zero-error environments and high-stakes execution. We are choosing intelligence, speed, and strategic alignment, and there is no one I trust more to run this machine.” – Ben Bernstein, CEO at Minimus

Nardi brings a multidisciplinary background to Minimus, with over 15 years of experience advising high-growth startups, global investors, and technology corporations. Most recently, she served as Director at Meitar NY Inc. and Partner at Meitar Law Offices. She led the M&A transaction for Twistlock’s acquisition by Palo Alto Networks (PANW)—a foundational deal in the container image hardening and runtime security space—as well as transactions involving Wiz, JFrog, Salesforce, and others.

“I have worked with the Minimus team through some of their most critical milestones, and I know firsthand the massive potential of their technology. The demand for near-zero CVE container images and minimal container images with built-in security is only accelerating. Scaling a company in today’s environment requires the same 24/7 rigor, vendor accountability, and strategic precision as closing a major M&A deal. I am thrilled to step into this operational role and build the growth engine that will drive Minimus’s next chapter.” – Yael Nardi, Chief Business Officer, Minimus

Nardi holds a Bachelor of Laws (LLB) from Tel Aviv University and will be based in Minimus’s New York City office. She will work with the executive leadership team to execute the company’s growth targets.

About Minimus

Minimus provides hardened container images and hardened Docker images engineered to achieve near-zero CVE exposure. Built continuously from source with the latest patches and security updates, Minimus images undergo rigorous container image hardening and attack surface reduction, delivering secure container images with seamless supply chain security and built-in compliance for FedRAMP, FIPS 140-3, CIS, and STIG standards. Through automatically generated SBOMs and real-time threat intelligence, Minimus empowers teams to prioritize remediation and avoid over 97% of container vulnerabilities – making it a compelling Chainguard alternative for teams seeking production-hardened, distroless container images at scale. 

For more information, visit minimus.io.

Minimus Public Relations

contact@minimus.io

Visual Studio Code 1.115 introduces VS Code Agents app 9 Apr 2026, 2:49 am

Visual Studio Code 1.115, the latest release of Microsoft’s extensible code editor, previews a companion app called Visual Studio Code Agents, optimized for agent-native development. Additionally, the agent experience in the editor is improved for running terminal commands in the background, according to Microsoft.

Introduced April 8, Visual Studio Code 1.115 can be downloaded from the Visual Studio Code website for Windows, Mac, or Linux.

Available as a VS Code Insiders early-access capability, the VS Code Agents app allows developers to run agentic tasks across projects by kicking off multiple agent sessions across multiple repos in parallel. Developers can track session progress, view diffs inline, leave feedback for agents, and create pull requests without leaving the app, Microsoft said. Additionally, custom instructions, prompt files, custom agents, Model Context Protocol (MCP) servers, hooks, and plugins all work in the Agents app, along with VS Code customizations such as themes.

VS Code 1.115 also introduces two changes designed to improve the agent experience for running terminal commands in the background. First, a new send_to_terminal tool lets an agent continue interacting with background terminals. For example, if an SSH session times out while waiting for a password prompt, the agent still can send the required input to complete the connection. Previously, background terminals were read-only, with only the get_terminal_output tool available to the agent to check the terminal’s status. This was particularly limiting when a foreground terminal timed out and moved to the background, because the agent could no longer interact with it.

Second, a new experimental setting, chat.tools.terminal.backgroundNotifications, allows an agent to automatically be notified when a background terminal command finishes or requires user input. This also applies to foreground terminals that time out and are moved to the background. The agent then can take appropriate action, such as reviewing the output or providing input via the send_to_terminal tool. Previously, when a terminal command was running in the background, the agent had to manually call get_terminal_output to check the status. There was no way to know when the command completed or needed input.

Also in VS Code 1.115, when an agent invokes the browser tool, the tool calls now have a more descriptive label and a link to go directly to the target browser tab, Microsoft said. Plus, the Run Playwright Code tool has improved support for long-running scripts. Scripts that take longer than five seconds to run (by default) now return a deferred result for the agent to poll.

VS Code 1.115 follows VS Code 1.114 by a week, with that release featuring streamlined AI chat. Updates to VS Code now arrive weekly instead of monthly, a change in cadence that Microsoft introduced with the VS Code 1.111 release on March 9.

Microsoft announces end of support for ASP.NET Core 2.3 8 Apr 2026, 7:56 pm

Microsoft’s ASP.NET Core 2.3, a version of the company’s open source web development framework for .NET and C#, will reach end of support on April 7, 2027.

Following that date, Microsoft will no longer provide bug fixes, technical support, or security patches for ASP.NET Core 2.3, the company announced on April 7, exactly a year before the cessation date. ASP.NET Core 2.3 packages—the latest patched versions only—are supported currently on .NET Framework, following the support cycle for those .NET Framework versions. After April 7, 2027, this support will end regardless of the .NET Framework version in use, according to Microsoft. Support for ASP.NET Core 2.3 packages including the Entity Framework 2.3 packages will end on the same date.

Microsoft recommends upgrading to a currently supported version of .NET, such as .NET 10 LTS. To help with the upgrade process, Microsoft recommends using GitHub Copilot modernization, which provides AI-powered assistance in planning and executing migrations to a modern .NET version.

Microsoft detailed the release of ASP.NET Core 2.3 in February 2025. The company lists the following impacts of the end of support:

  • Applications will continue to run; end of support does not break existing applications.
  • No new security updates will be issued for ASP.NET Core 2.3.
  • Continuing to use an unsupported version may expose applications to security vulnerabilities.
  • Technical support will no longer be available for ASP.NET Core 2.3.
  • The ASP.NET Core 2.3 packages will be deprecated.

ASP.NET Core is the open-source version of ASP.NET that runs on macOS, Linux, and Windows. ASP.NET Core first was released in 2016 and is a re-design of earlier Windows-only versions of ASP.NET.

AWS turns its S3 storage service into a file system for AI agents 8 Apr 2026, 4:49 pm

Amazon Web Services is making its S3 object storage service easier for AI agents to access with the introduction of a native file system interface. The new interface, S3 Files, will eliminate a longstanding tradeoff between the low cost of S3 and the interactivity of a traditional file system or of Amazon’s Elastic File System (EFS).

“The file system presents S3 objects as files and directories, supporting all Network File System (NFS) v4.1+ operations like creating, reading, updating, and deleting files,” AWS principal developer advocate Sébastien Stormacq wrote in a blog post.

The file system can be accessed directly from any AWS compute instance, container, or function, spanning use cases from production applications to machine learning training and agentic AI systems, Stormacq said.

Analysts saw the change in accessibility as a strategic move by AWS to position S3 as a primary data layer for AI agents and modern applications, moving beyond its traditional use cases in data lakes and batch analytics.

“AWS is aligning S3 with AI, analytics, and distributed application needs where shared, low-latency file access is required on object-resident data. This addresses growing demand from machine learning training, agentic systems, and multi-node workloads that require concurrent read/write access without moving data out of S3,” said Kaustubh K, practice director at Everest Group.

Without a file system in S3, enterprises developing and deploying agentic systems and other modern applications typically had to either use a separate storage system or copy, synchronize, and stage data stored in S3, introducing latency, inconsistency, and operational overhead, said Pareekh Jain, principal analyst at Pareekh Consulting.

Some developers, said Kaustubh, turned to FUSE-based tools such as s3fs or Mountpoint to simulate file systems on top of S3, but these often lacked proper locking, consistency guarantees, and efficient update mechanisms.

In contrast, S3 Files addresses those limitations through native support for file operations, including permissions, locking, and incremental updates, Jain said.

This reduces friction for developers, he said, as they will no longer need to rewrite applications for object storage: existing file-based tools will just work. “Agents also become easier to build, as they can directly read and write files, store memory, and share data. Overall, it reduces the need for extra glue code like sync jobs, caching layers, and file adapters,” Jain said.

This also has implications for CIOs, as it simplifies data architecture by bringing everything, including data lakes, file systems, and staging layers, into Amazon S3.

“This approach lowers costs by removing duplication, reducing pipelines, and cutting operational overhead, while also improving governance with a single source of truth and no scattered copies,” Jain said.

S3 Files is now generally available and can be accessed through the AWS Management Console or the Command Line Interface (CLI), where users can create, mount, and deploy file systems.

Z.ai unveils GLM-5.1, enabling AI coding agents to run autonomously for hours 8 Apr 2026, 10:27 am

Chinese AI company Z.ai has launched GLM-5.1, an open-source coding model it says is built for agentic software engineering. The release comes as AI vendors move beyond autocomplete-style coding tools toward systems that can handle software tasks over longer periods with less human input.

Z.ai said GLM-5.1 can sustain performance over hundreds of iterations, an ability it argues sets it apart from models that lose effectiveness in longer sessions.

As one example, the company said GLM-5.1 improved a vector database optimization task over more than 600 iterations and 6,000 tool calls, reaching 21,500 queries per second, about six times the best result achieved in a single 50-turn session.

In a research note, Z.ai said GLM-5.1 outperformed its predecessor, GLM-5, on several software engineering benchmarks and showed particular strength in repo generation, terminal-based problem solving, and repeated code optimization. The company said the model scored 58.4 on SWE-Bench Pro, compared with 55.1 for GLM-5, and above the scores it listed for OpenAI’s GPT-5.4, Anthropic’s Opus 4.6, and Google’s Gemini 3.1 Pro on that benchmark.

GLM-5.1 has been released under the MIT License and is available through its developer platforms, with model weights also published for local deployment, the company said. That may appeal to enterprises looking for more control over how such tools are deployed.

Longer-running coding agents

Z.ai says long-running performance is a key differentiator for the company when compared to models that lose effectiveness in extended sessions.

Analysts say this is because many current models still plateau or drift after a relatively small number of turns, limiting their usefulness on extended, multi-step software tasks.

Pareekh Jain, CEO of Pareekh Consulting, said the industry is now moving beyond tools that can answer prompts toward systems that can carry out longer assignments with less supervision.

The question, Jain said, is no longer, “What can I ask this AI?” but, “What can I assign to it for the next eight hours?”

For enterprises, that raises the prospect of assigning an agent a ticket in the morning and receiving an optimized solution by day’s end, after it has run hundreds of experiments and profiled the code.

“This capability aligns with real needs such as large refactors, migration programs, and continuous incident resolution,” said Charlie Dai, VP and principal analyst at Forrester. “It suggests that long‑running autonomous agents are becoming more practical, provided enterprises layer in governance, monitoring, and escalation mechanisms to manage risk.”

Open-source appeal grows

GLM-5.1’s release under the MIT License could be significant, especially for companies in regulated or security-sensitive sectors.

“This matters in four key ways,” Jain said. “First, cost. Pricing is much lower than for premium models, and self-hosting lets companies control expenses instead of paying per use. Second, data governance. Sensitive code and data do not have to be sent to external APIs, which is critical in sectors such as finance, healthcare, and defense. Third, customization. Companies can adapt the model to their own codebases and internal tools without restrictions.”

The fourth factor, according to Jain, is geopolitical risk. Although the model is open source, its links to Chinese infrastructure and entities could still raise compliance concerns for some US companies.

Dai said the MIT license makes it easier for companies to run the model on their own systems while adapting it to internal requirements and governance policies. “For many buyers, this makes GLM‑5.1 a viable strategic option alongside commercial models, especially where regulatory constraints, IP sensitivity, or long‑term platform control matter most,” Dai said.

Benchmark credibility

Z.ai cited three benchmarks: SWE-Bench Pro, which tests complex software engineering tasks; NL2Repo, which measures repository generation; and Terminal-Bench 2.0, which evaluates real-world terminal-based problem solving.

“These benchmarks are designed to test coding agents’ advanced coding capabilities, so topping those benchmarks reflects strong coding performance, such as reliability in planning-to-execution, less prompt rework, and faster delivery,” said Lian Jye Su, chief analyst at Omdia. “However, they are still detached from typical enterprise realities.”

Su said public benchmarks still do not capture the messiness of proprietary codebases, legacy systems, and code review workflows. He added that benchmark results come from controlled settings that differ from production, though the gap is closing as more teams adopt agentic setups.

This article originally appeared in Computerworld.

Microsoft’s new Agent Governance Toolkit targets top OWASP risks for AI agents 8 Apr 2026, 9:38 am

Microsoft has quietly introduced the Agent Governance Toolkit, an open source project designed to monitor and control AI agents during execution as enterprises try to move them into production workflows.

The toolkit, which is a response to the Open Worldwide Application Security Project’s (OWASP) emerging focus on AI and LLM security risks, adds a runtime security layer that enforces policies to mitigate issues such as prompt injection and improves visibility into agent behavior across complex, multi-step workflows, Imran Siddique, principal group engineering manager at Microsoft, wrote in a blog post.

More specifically, the toolkit maps to OWASP’s top 10 risks for agentic systems, including goal hijacking, tool misuse, identity abuse, supply chain risks, code execution, memory poisoning, insecure communications, cascading failures, human-agent trust exploitation, and rogue agents.

The rationale behind the toolkit, Siddique wrote, stems from how AI systems increasingly resemble loosely governed distributed environments, where multiple untrusted components share resources, make decisions, and interact externally with minimal oversight.

That prompted Microsoft to apply proven design patterns from operating systems, service meshes, and site reliability engineering to bring structure, isolation, and control to these environments, Siddique added.

The result: Microsoft packaged these principles into a toolkit comprising seven components, available in Python, TypeScript, Rust, Go, and .NET.

The cross-language approach, Siddique explained, is aimed at meeting developers where they are and enabling integration across heterogeneous enterprise stacks.

As for the components, the toolkit includes modules such as a policy enforcement layer named Agent OS, a secure communication and identity framework named Agent Mesh, an execution control environment named Agent Runtime, and additional components, such as Agent SRE, Agent Compliance, and Agent Lightning, covering reliability, compliance, marketplace governance, and reinforcement learning oversight.

Beyond its modular design, Siddique further wrote that the toolkit is built to work with existing development ecosystems: “We designed the toolkit to be framework-agnostic from day one. Each integration hooks into a framework’s native extension points, LangChain’s callback handlers, CrewAI’s task decorators, Google ADK’s plugin system, Microsoft Agent Framework’s middleware pipeline, so adding governance doesn’t require rewriting agent code.”
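
To illustrate the idea of hooking governance into a framework’s native extension points, here is a rough, framework-agnostic sketch in Python. The policy, tool names, and decorator below are invented for illustration and are not the toolkit’s actual API.

```python
# Illustrative sketch only: a policy check applied at a tool's call site,
# the same kind of seam that callback handlers, task decorators, and
# middleware pipelines expose in agent frameworks.

BLOCKED_TOOLS = {"shell_exec"}   # example deny-list policy (hypothetical)

def governed(tool_name):
    """Decorator that enforces an allow/deny policy before a tool runs."""
    def wrap(fn):
        def inner(*args, **kwargs):
            if tool_name in BLOCKED_TOOLS:
                raise PermissionError(f"policy denies tool: {tool_name}")
            return fn(*args, **kwargs)
        return inner
    return wrap

@governed("web_search")
def web_search(query: str) -> str:
    return f"results for {query}"

@governed("shell_exec")
def shell_exec(cmd: str) -> str:
    return "never reached"

print(web_search("agent governance"))   # allowed by policy
try:
    shell_exec("rm -rf /")              # denied by policy before execution
except PermissionError as err:
    print(err)
```

Because the check wraps the call site rather than the agent’s reasoning loop, governance can be added or tightened without rewriting agent code, which is the property the toolkit’s framework-agnostic design is after.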

This approach, the senior executive explained, would reduce integration overhead and risk, allowing developers to introduce governance controls into production systems without disrupting existing workflows or incurring the cost and complexity of rearchitecting applications.

Siddique also pointed to several framework integrations that are already deployed in production workloads, including LlamaIndex’s TrustedAgentWorker integration.

For those wishing to explore the toolkit, which is currently in public preview, it is available under an MIT license and structured as a monorepo with independently installable components.

Microsoft, in the future, plans to transition the project to a foundation-led model and is already engaging with the OWASP agentic AI community to support broader governance and stewardship, Siddique wrote.

The winners and losers of AI coding 8 Apr 2026, 9:00 am

I don’t need to tell you that agentic coding is changing the world of software development. Things are happening so quickly that it’s hard to keep up. Internet years seem like eons compared to agentic coding years.  It seemed like just a few short weeks ago that everyone very suddenly stopped writing code and let Claude Code do all the work because, well, it was a few short weeks ago that it happened. 

It seems like new ideas, tools, and frameworks are popping up every day.

Despite things moving like a cheetah sprinting across the savanna, I am going to make a few predictions about where the cheetah is going to end up and what will happen when it gets there.

So long, legacy software

First, legacy software is going to become a thing of the past. You know what I’m talking about—those big balls of mud that have accreted over the last 30 years. The one started by your cousin’s friend who wrote that software for your dad’s laundromat and is now the software recommended by the Coin Laundry Association. The one with seven million lines of hopeless spaghetti code that no one person actually understands, that uses ancient, long-outdated technology, that is impossible to maintain but somehow still works. The one that depends on an entire team of developers and support people to keep running.

Well, someone is going to come along and write a completely fresh, new, unmuddy version of that ball of mud with a coding agent. The perfect example of this is happening in open source with Cloudflare’s EmDash project. Now don’t get me wrong. I have a deep respect for WordPress, the CMS that basically runs the internet. It’s venerable and battle-tested—and bloated and insecure and written in PHP.

EmDash is a “spiritual successor” to WordPress. Cloudflare basically asked, “What would WordPress look like if we started building it today?” Then they started building it using agentic coding, and basically did in a couple of months what WordPress took 24 years to do. Sure, they had WordPress as a template, but it was only because of agentic coding that they were even willing to attempt it. It’s long been thought foolish to say “Let’s rebuild the whole thing from scratch.” Now, with agentic coding, it seems foolish not to.

This is not the last creaky, old-school project that will be re-imagined in the coming days. If your business relies on a big ball of mud, it’s time to start looking at rebuilding it from the ground up before someone else beats you to it.

Ideas, implemented

Second, all those great application ideas you’ve been thinking about but could never find the time to do? Well, now you and millions of other developers can actually do them. I myself am nearing completion on six — six! — of the ideas I’ve been kicking around for years and never found the time to do. Yep, I’m building them all in parallel, with six different agents running at once. (Thank you, Garry Tan and gstack!)

Now, will there be a lot of slop that comes out of that? Sure. But will there be a huge supply of cool new software that will change the world? Yes, definitely. 

That project you’ve always wanted to do? You can do it now. 

Third, bespoke software will become the norm. Today, a business that needs accounting software will buy a product like QuickBooks or some other off-the-shelf solution and adapt it to its way of doing things. But going forward, those businesses can create their own accounting package designed specifically for the way they do business. No one knows their domain better than the small business owner. Instead of relying on someone who doesn’t understand the nuances of running your particular plumbing business, you can just talk to Claude Code and build your own solution.

This is happening today (the head of finance wrote the solution!). If you aren’t considering becoming more efficient via agentic coding, then you might find yourself dealing with competitors that are.

Legacy apps need rewriting. Those side projects need building. That app you need for your business isn’t going to build itself. Three months ago, it all seemed foolish and impossible. Today? You are either the cheetah or the gazelle.

Get started with Python’s new frozendict type 8 Apr 2026, 9:00 am

Only very rarely does Python add a new standard data type. Python 3.15, when it’s released later this year, will come with one—an immutable dictionary, frozendict.

Dictionaries in Python correspond to hashmaps in Java. They are a way to associate keys with values. The Python dict, as it’s called, is tremendously powerful and versatile. In fact, the dict structure is used by the CPython interpreter to handle many things internally.

But a dict has a big limitation: it’s not hashable. A hashable type in Python has a hash value that never changes during its lifetime. Strings, numerical values (integers and floats), and tuples are all hashable because they are immutable. Container types, like lists, sets, and, yes, dicts, are mutable, so they can’t guarantee they hold the same values over time.
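The dividing line is easy to check in any current interpreter: hashable values work with the built-in hash() function, while mutable containers raise a TypeError.

```python
# Hashable (immutable) values work with hash(); mutable containers do not.
print(hash("hello"))            # strings are hashable
print(hash((1, 2.5, "spam")))   # so are tuples of hashable values

for mutable in ([1, 2], {1, 2}, {"a": 1}):
    try:
        hash(mutable)
    except TypeError as err:
        print(err)   # e.g. "unhashable type: 'list'"
```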

Python has long included a frozenset type—a version of a set that doesn’t change over its lifetime and is hashable. Because sets are basically dictionaries with keys and no values, why not also have a frozendict type? Well, after much debate, we finally got just that. If you download Python 3.15 alpha 7 or later, you’ll be able to try it out.
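The frozenset precedent is worth seeing in action, since frozendict fills the same role for mappings. This runs in any current Python; the team-name example is invented for illustration.

```python
# frozenset is the existing precedent: an immutable, hashable set.
members = frozenset({"alice", "bob"})

# Because it's hashable, a frozenset can serve as a dict key:
team_names = {members: "platform team"}
print(team_names[frozenset({"bob", "alice"})])  # element order doesn't matter

# A plain (mutable) set still can't be a key:
try:
    {set(): "not ok"}
except TypeError as err:
    print(err)
```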

The basics of a frozendict

In many respects, a frozendict behaves exactly like a regular dictionary. The main difference is that you can’t use the dictionary literal (the {} syntax) to create one. You must use the frozendict() constructor:

my_frozendict = frozendict(
    x = 1, y = True, z = "Hello"
)

You can also take an existing dictionary and give it to the constructor:

my_frozendict = frozendict(
    {"x": 1, "y": True, "z": "Hello", "A string": "Another string"}
)

One big advantage of using a dict as the source is that you have more control over what the keys can be. In the above example, we can’t use "A string" as a key with the keyword-argument form, because it isn’t a valid Python identifier. But we can use any string we like as a dict key.

The new frozendict bears some resemblance to an existing type in the collections module, collections.frozenmap. But frozendict differs in several key ways:

  • frozendict is built-in, so doesn’t need to be imported from a module.
  • frozenmap does not preserve insertion order.
  • Key lookups in a frozenmap are potentially slower (O(log n)) than in a frozendict (O(1)).

Working with frozendicts

A frozendict behaves exactly like a regular dict as long as all you’re doing is reading values from it.

For instance, if you want to get a value using a key, it’s the same: use the syntax the_frozendict[the_key]. If you want to iterate through a frozendict, that works the same way as with a regular dict: for key in the_frozendict:. Likewise for key/value pairs: for key, value in the_frozendict.items(): will work as expected.

Another convenient aspect of frozendicts is that they preserve insertion order. This feature was added to regular dictionaries relatively recently, and can be used for things like building FIFO queues. That frozendict preserves the same behavior is very useful; it means you can iterate through a frozendict created from a regular dictionary and get the same items in the same sequence.
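frozendict itself is only in pre-release interpreters, but the insertion-order behavior it inherits is easy to demonstrate with a regular dict (guaranteed since Python 3.7):

```python
# Dicts iterate in insertion order, which enables FIFO-style usage.
jobs = {"build": 1, "test": 2, "deploy": 3}
assert list(jobs) == ["build", "test", "deploy"]

# Pop the oldest entry first, queue-style:
oldest = next(iter(jobs))
jobs.pop(oldest)
assert list(jobs) == ["test", "deploy"]
```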

What frozendicts don’t let you do

The one big thing you can’t do with a frozendict is change its contents in any way. You can’t add keys, reassign their values, or remove keys. That means all of the following code would be invalid:

# adding a new key
my_frozendict["x"] = 1
# reassigning an existing key
my_frozendict["q"] = 2
# removing an item
my_frozendict.pop()

Each of these would raise an exception. In the case of my_frozendict.pop(), note that the method .pop() doesn’t even exist on a frozendict.

While you can use merge and update operators on a frozendict, the way they work is a little deceptive. They don’t actually change anything; instead, they create a new frozendict object that contains the results of the merge or update. It’s similar to how “changing” a string or tuple really just means constructing a new instance of those types with the changes you want.

# Merge operation
my_frozendict = frozendict(x=1)
my_other_frozendict = frozendict(y=1)
new_fz = my_frozendict | my_other_frozendict

# Update operation (builds a new frozendict and rebinds new_fz)
new_fz |= frozendict(x=2)

Use cases for frozendicts

Since a frozendict can’t be changed, it obviously isn’t a substitute for a regular dictionary, and it isn’t meant to be. The frozendict will come in handy when you want to do things like:

  • Store key/value data that is meant to be immutable. For instance, if you collect key/value data from command-line options, you could store them in a frozendict to signal that they should not be altered over the lifetime of the program.
  • Use a dictionary in some circumstance where you need a hashable type. For instance, if you want to use a dictionary as a key in a dictionary, or as an element in a set, a frozendict fits the bill.
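Until frozendict is widely available, the usual workaround for that second case is to snapshot a dict into a hashable tuple of sorted items; it shows exactly the gap frozendict closes. (The freeze helper and cache example here are invented for illustration.)

```python
def freeze(d):
    """Snapshot a flat dict into a hashable, order-insensitive key."""
    return tuple(sorted(d.items()))

config = {"host": "localhost", "port": 8080}
cache = {freeze(config): "connection A"}

# Key order in the source dict doesn't matter after sorting:
assert cache[freeze({"port": 8080, "host": "localhost"})] == "connection A"
```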

It might be tempting to think a frozendict will provide better performance than a regular dict, considering it’s read-only. It’s possible, but not guaranteed, that eventual improvements in Python will enable better performance with immutable types. However, right now, that’s far from being a reason to use them.


GitHub Copilot CLI adds Rubber Duck review agent 7 Apr 2026, 11:17 pm

GitHub has introduced an experimental Rubber Duck mode in the GitHub Copilot CLI. The latest addition to the AI-powered coding tool uses a second model from a different AI family to provide a second opinion before enacting the agent’s plan.

The new feature was announced April 6. Introduced in experimental mode, Rubber Duck uses a model from a complementary AI family to act as an independent reviewer, assessing plans and work at the moments where feedback matters most, according to GitHub. Rubber Duck is a focused review agent, powered by a model from a family complementary to the primary Copilot session’s. Its job is to check the agent’s work and present a short, focused list of high-value concerns, including details the primary agent may have missed, assumptions worth questioning, and edge cases to consider.

Developers can use /experimental in the Copilot CLI to access Rubber Duck alongside other experimental features.

Evaluating Rubber Duck on SWE-Bench Pro, a benchmark of real-world coding problems drawn from open-source repositories, GitHub found that Claude Sonnet 4.6 paired with Rubber Duck running GPT-5.4 achieved a resolution rate approaching Claude Opus 4.6 running alone, closing 74.7% of the performance gap between Sonnet and Opus. GitHub said Rubber Duck tends to help more with difficult problems, ones that span three-plus files and would normally take 70-plus steps. On these problems, Sonnet plus Rubber Duck scores 3.8% higher than the Sonnet baseline and 4.8% higher on the hardest problems identified across three trials.

GitHub cited these examples of the kinds of problems Rubber Duck finds:

  • Architectural catch (OpenLibrary/async scheduler): Rubber Duck caught that the proposed scheduler would start and immediately exit, running zero jobs—and that even if fixed, one of the scheduled tasks was itself an infinite loop.
  • One-liner bug (OpenLibrary/Solr): Rubber Duck caught a loop that silently overwrote the same dict key on every iteration. Three of four Solr facet categories were being dropped from every search query, with no error thrown.
  • Cross-file conflict (NodeBB/email confirmation): Rubber Duck caught three files that all read from a Redis key which the new code stopped writing. The confirmation UI and cleanup paths would have been silently broken on deploy.


The Terraform scaling problem: When infrastructure-as-code becomes infrastructure-as-complexity 7 Apr 2026, 12:41 pm

Terraform promised us a better world. Define your infrastructure in code, version it, review it, and deploy it with confidence. For small teams running a handful of services, that promise holds up beautifully.

Then your organization grows. Teams multiply. Modules branch and fork. State files balloon. And suddenly, that clean declarative vision starts looking a lot like a sprawling monolith that nobody fully understands and everyone is afraid to touch.

If you’ve ever watched a Terraform plan run for 20 minutes, encountered a corrupted state file at 2 a.m., or inherited a Terraform codebase where half the resources are undocumented and a quarter are unmanaged, you know exactly what we’re talking about. This is the Terraform scaling problem, and it’s affecting engineering organizations of every size.

The numbers confirm it isn’t a niche concern. The 2023 State of IaC Report found that 90% of cloud users are already using infrastructure-as-code, with Terraform commanding 76% market share according to the CNCF 2024 Annual Survey. Yet the HashiCorp State of Cloud Strategy Survey 2024 showed that 64% of organizations report a shortage of skilled cloud and automation staff, creating a dangerous gap between Terraform’s adoption and the expertise required to operate it well at scale.

In this post, we break down where Terraform breaks down, why traditional solutions fall short, and how AI-assisted IaC management is offering a credible path forward.

The root causes of Terraform complexity at scale

Terraform’s design philosophy is fundamentally sound: Declarative infrastructure, idempotent operations and a provider ecosystem that covers nearly every cloud service imaginable. The problem isn’t the tool; it’s the gap between how Terraform was designed to work and how large engineering organizations actually operate.

State management becomes a full-time job

Terraform’s state file is both its greatest strength and its biggest liability at scale. State gives Terraform the ability to track what it has deployed and calculate diffs — but as infrastructure grows, that state file becomes a critical shared resource with no native support for distributed access patterns.

Teams running a monolithic state end up with a single point of contention. Engineers queue up to run plans and apply. Locking mechanisms in backends like S3 with DynamoDB help, but they don’t solve the underlying architectural issue: Everyone is competing for the same resource.

The HashiCorp State of Cloud Strategy Survey consistently places state management issues, corruption, drift and locking failures among the top pain points for Terraform users in organizations with more than 50 engineers. When a state file gets corrupted mid-apply, recovery can take hours and require deep expertise. The problem compounds as infrastructure grows: Organizations running more than 500 managed resources in a single workspace routinely report 15–30 minute plan times, turning what should be a fast feedback loop into a deployment bottleneck.

Module sprawl and dependency hell

Terraform modules are the right answer to code reuse. They’re also the source of some of the most painful debugging sessions in platform engineering.

As organizations scale, module libraries grow organically. Teams fork modules to meet specific requirements. Version pinning gets inconsistent. A security patch in a root module requires coordinated updates across dozens of dependent modules — a task that sounds simple until you’re dealing with circular dependencies, incompatible provider versions and module registries that weren’t designed for enterprise governance.

Adopting semantic versioning for Terraform modules has a measurable impact: According to a Moldstud IaC case study (June 2025), approximately 60% of organizations that enforce semantic versioning on module releases report a decrease in deployment failures over six months. Yet most teams don’t adopt this practice until after they’ve experienced the failure modes firsthand. The same research found that teams using peer reviews for Terraform code experience a 30% improvement in code quality, but this requires process investment that most fast-moving platform teams skip in the early stages.

The pattern is consistent: What starts as a tidy module hierarchy becomes a tangled dependency graph that requires tribal knowledge to navigate.

Plan times and blast radius

At a certain scale, the Terraform plan stops being a quick feedback loop and starts being a liability. Teams managing thousands of resources in a single workspace can wait 15–30 minutes for a plan to complete. More critically, the blast radius of a single apply expands proportionally.

A misconfigured security group rule in a small workspace affects a handful of resources. The same mistake in a large monolithic workspace can cascade across hundreds of resources before anyone can intervene. Terraform’s own declarative model means that configuration errors can trigger resource destruction, a risk that grows with workspace size. This reality pushes teams toward increasingly conservative change management processes, which defeats the core value proposition of IaC in the first place.

There’s a meaningful ROI case for solving this. The Moldstud IaC case study indicates that implementing automated IaC solutions can lead to a 70% reduction in deployment times. But capturing that return requires architectural decisions that prevent plan-time bottlenecks before they compound.

Drift: The silent killer

Infrastructure drift — where the actual state of your cloud environment diverges from what Terraform believes it to be — is among the most insidious challenges at scale. It accumulates slowly, through emergency console changes, partially applied runs and resources created outside of Terraform entirely.

The causes are well-documented: An on-call engineer hotfixes a security group at 3 a.m. and forgets to update the code; an autoscaling event modifies a resource configuration that Terraform manages; a third-party integration quietly changes a setting that Terraform has no visibility into. Each of these is a small divergence. Collectively, they erode the reliability of your entire IaC foundation. The Terraform Drift Detection Guide documents how teams across industries are consistently caught off guard by drift accumulation in environments they believed were fully under IaC control.

By the time drift becomes visible, it’s often embedded deep enough to make remediation genuinely risky. The DORA 2023 State of DevOps Report found that teams dealing with frequent configuration drift had 2.3× higher change failure rates than teams maintaining consistent IaC hygiene. The compounding effect is significant: Drift erodes confidence in your IaC, which leads to more manual changes, which causes more drift.

Why traditional approaches fall short

The conventional responses to Terraform scaling challenges are well-documented: Workspace decomposition, remote state backends, CI/CD pipelines with policy enforcement and module registries with semantic versioning. These are all necessary practices. They’re also insufficient on their own.

  • Workspace decomposition reduces blast radius but multiplies operational overhead. You’re trading one large problem for many smaller ones, each requiring its own state management, access controls and pipeline configuration. Managing 200 workspaces is a full-time engineering effort.
  • CI/CD enforcement catches policy violations after the fact. By the time a plan hits your pipeline, an engineer has already spent time writing code that may get rejected. Feedback loops are slow, and the root cause — the complexity of authoring correct IaC at scale — remains unsolved.
  • Manual code reviews don’t scale. Platform teams can become bottlenecks when every Terraform change requires expert review to validate correctness, security posture and compliance. The cognitive load required to review infrastructure changes accurately is substantial, and reviewers burn out. This bottleneck is only sharpened by the talent shortage: With 64% of organizations reporting a shortage of skilled cloud and automation staff, the supply of qualified reviewers isn’t growing fast enough to match Terraform’s adoption curve.

The honest assessment: These solutions manage Terraform complexity rather than resolving it. They require ongoing investment in tooling, process and expertise that many organizations struggle to maintain.

This is exactly the friction that StackGen’s Intent-to-Infrastructure Platform was designed to address. Rather than adding more manual process overhead, it introduces an intelligent layer that helps teams author, validate and govern Terraform configurations from the point of intent before complexity accumulates.

Emerging solutions: Where the industry is moving

The Terraform ecosystem is evolving rapidly in response to these challenges. The global IaC market reflects this urgency: Valued at $847 million in 2023, it’s projected to reach $3.76 billion by 2030 at a 24.4% compound annual growth rate, according to Grand View Research’s IaC Market Report. That growth isn’t just adoption — it’s investment in solving the complexity problems that widespread adoption creates.

Workspace automation and orchestration

Tools like Atlantis, StackGen, and Terraform Cloud are moving toward intelligent workspace orchestration: automatically managing dependencies between workspaces, ordering applies correctly and providing better visibility into cross-workspace impact. This reduces the manual coordination overhead that plagues large-scale Terraform operations.

The key shift is treating your collection of workspaces as a managed system rather than a set of independent units. When a shared networking module changes, an orchestration layer should automatically identify affected workspaces, calculate the propagation order and manage the apply sequence — rather than requiring a human to track and coordinate each dependency manually.
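A minimal sketch of that idea, using Python’s standard-library graphlib (the workspace names and dependency graph are invented): given which workspace changed, find everything affected and compute a safe apply order.

```python
from graphlib import TopologicalSorter

# Hypothetical graph: each workspace maps to the workspaces it depends on.
DEPS = {
    "networking": set(),
    "compute": {"networking"},
    "data": {"networking"},
    "app": {"compute", "data"},
}

def affected(changed: str, deps: dict[str, set[str]]) -> set[str]:
    """The changed workspace plus everything that transitively depends on it."""
    hit = {changed}
    grew = True
    while grew:
        grew = False
        for ws, requires in deps.items():
            if ws not in hit and requires & hit:
                hit.add(ws)
                grew = True
    return hit

def apply_order(changed: str, deps: dict[str, set[str]]) -> list[str]:
    """Topological apply order restricted to the affected workspaces."""
    hit = affected(changed, deps)
    return list(TopologicalSorter({ws: deps[ws] & hit for ws in hit}).static_order())

print(apply_order("networking", DEPS))  # networking first, app last
```

A real orchestrator layers state locking, failure handling and parallelism on top, but the dependency-propagation core is this small.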

Policy-as-code with earlier enforcement

Open Policy Agent (OPA) and HashiCorp Sentinel have matured significantly. More importantly, teams are learning to push policy enforcement left — validating Terraform plans against organizational policies before they hit a CI/CD pipeline, and ideally before they’re even submitted for review.

HashiCorp has reported that teams using Sentinel with pre-plan validation see a 45% reduction in policy violation-related build failures compared to teams running post-plan enforcement only. Earlier feedback means faster iteration and lower engineer frustration.

AI-assisted IaC management: The emerging frontier

This is where the most significant innovation is happening. AI-assisted infrastructure management addresses the problems that automation alone can’t solve: The cognitive complexity of understanding large IaC codebases, identifying drift patterns before they become critical and translating high-level intent into correct, compliant Terraform code.

Platforms like StackGen’s Intent-to-Infrastructure Platform represent a new paradigm here. Rather than requiring platform engineers to manually author and review every Terraform resource definition, StackGen interprets infrastructure intent — expressed in natural language or high-level policy — and generates compliant Terraform configurations, validates them against organizational standards and surfaces potential issues before they reach production. This directly addresses the bottleneck where expert review becomes a constraint on velocity.

The practical applications are concrete:

  • Drift detection and remediation: AI models trained on infrastructure patterns can identify anomalous drift, distinguishing between expected configuration changes and unauthorized modifications, and surface remediation recommendations with context about impact and risk. This is particularly powerful for teams managing hundreds of workspaces where manual drift monitoring isn’t practical.
  • Intelligent module recommendations: Rather than requiring engineers to navigate sprawling module registries manually, AI-assisted tooling can analyze an infrastructure request, identify the most appropriate existing modules and flag where new module development is needed. This reduces the “reinvent the wheel” pattern that causes module sprawl.
  • Natural language to IaC: For platform teams managing self-service infrastructure portals, AI translation layers allow development teams to request infrastructure in natural language and receive validated Terraform configurations that conform to organizational standards — without requiring deep Terraform expertise from every team consuming platform services.
  • Proactive complexity warnings: AI analysis of Terraform codebases can identify emerging complexity patterns before they become critical — detecting circular dependencies forming, state files approaching problematic size thresholds or module versioning patterns that suggest future compatibility issues.

Gartner predicts that by 2026, more than 40% of organizations will be using AI-augmented IaC tooling for some portion of their infrastructure management workflow — up from under 10% in 2023. The trajectory is clear, and the window for early-mover advantage is still open.

Practical guidance: Scaling Terraform without losing your mind

While AI-assisted tooling continues to mature, there are concrete architectural and process changes your team can adopt today.

  • Decompose by domain, not by team. Workspace boundaries should reflect infrastructure domains (networking, compute, data) rather than organizational team boundaries. Teams change; infrastructure domains are more stable. This reduces the reorganization tax you pay when teams restructure.
  • Treat state as infrastructure. Your state backend deserves the same reliability engineering as production systems. Remote state with versioning, automated backup verification and clear recovery runbooks should be non-negotiable before you’re managing more than a few dozen resources. The HashiCorp State of Cloud Strategy Survey shows that over 80% of enterprises already integrate IaC into their CI/CD pipelines — but pipeline integration doesn’t substitute for state backend reliability.
  • Invest in a private module registry early. Whether you use Terraform Cloud’s built-in registry or a self-hosted solution, a structured module registry with enforced semantic versioning pays compounding dividends as your module library grows. The cost of retrofitting governance onto an ungoverned module library is significantly higher than building in governance from the start.
  • Automate drift detection, not just drift remediation. Drift remediation is expensive; drift detection is cheap. Scheduled Terraform plan runs in CI/CD, combined with alerting on detected drift, give you an early warning system that prevents drift from compounding silently. For teams managing large environments where manual detection becomes impractical, automated drift tooling, whether native to HCP Terraform or third-party solutions, becomes essential infrastructure in its own right.
  • Build a paved road for Terraform consumers. If every application team needs to become a Terraform expert to consume platform services, your platform won’t scale. Build opinionated, simplified interfaces, whether that’s a service catalogue, a self-service portal or an AI-assisted request layer that allows development teams to get the infrastructure they need without requiring deep IaC expertise.
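One cheap building block for scheduled drift detection is the documented exit-code contract of terraform plan -detailed-exitcode: 0 means no changes, 1 means error, 2 means changes (including drift) are pending. A sketch of a checker around that contract (the function names are ours):

```python
import subprocess

# Exit-code contract of `terraform plan -detailed-exitcode`:
#   0 = no changes, 1 = error, 2 = changes pending (possible drift)

def classify_plan(returncode: int) -> str:
    """Translate the plan's exit status into a drift verdict."""
    return {0: "clean", 2: "drift"}.get(returncode, "error")

def check_workspace(workdir: str) -> str:
    """Run a read-only plan in one workspace directory and classify the result."""
    proc = subprocess.run(
        ["terraform", "plan", "-detailed-exitcode", "-input=false", "-lock=false"],
        cwd=workdir,
        capture_output=True,
        text=True,
    )
    return classify_plan(proc.returncode)

assert classify_plan(0) == "clean"
assert classify_plan(2) == "drift"
assert classify_plan(1) == "error"
```

Wired into a scheduled CI job that alerts on a "drift" verdict, this turns drift from a silent accumulation into an early-warning signal.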

The strategic inflection point

We’re at an inflection point in how the industry thinks about infrastructure-as-code. The original vision of IaC (infrastructure defined, versioned and managed like software) was correct. The execution, for large-scale organizations, has accumulated significant complexity debt.

The next wave of IaC tooling isn’t about replacing Terraform. Terraform’s declarative model, provider ecosystem and community are genuine strengths that won’t be supplanted quickly. The opportunity is in the layer above Terraform: Intelligent orchestration, AI-assisted authoring, proactive complexity management and intent-driven infrastructure interfaces that make IaC accessible to the full organization rather than just a specialized subset of platform engineers.

Teams that invest in this layer now, whether through emerging platforms, internal tooling or AI-assisted workflows, will build a meaningful operational advantage. Teams that continue fighting Terraform complexity with more Terraform will find themselves spending an increasing proportion of engineering capacity on infrastructure maintenance rather than product development.

The IaC market’s 24.4% CAGR reflects growing awareness that the tools and processes managing this complexity need to evolve as fast as the infrastructure they govern.

Key takeaways

The Terraform scaling problem is real, but it’s solvable. The path forward involves three parallel tracks: Architectural decisions that manage blast radius and reduce state contention; process investments in policy-as-code and module governance; and tooling that uses AI to address the cognitive complexity that has always been the hardest part of IaC at scale.

Your infrastructure code should accelerate your engineering organization, not constrain it. If it’s doing the latter, the problem isn’t your engineers; it’s the layer of tooling and process sitting between intent and deployed infrastructure.

Ready to explore how AI-assisted IaC management can reduce the complexity overhead in your Terraform workflows?

This article is published as part of the Foundry Expert Contributor Network.


Nvidia’s SchedMD acquisition puts open-source AI scheduling under scrutiny 7 Apr 2026, 12:13 pm

Nvidia’s recent acquisition of SchedMD, the company behind the Slurm workload manager, is raising concerns among AI industry executives and supercomputing specialists who fear the chip giant could use its new position to favour its own hardware over competing chips, whether through code prioritization or roadmap decisions.

The concern, as industry sources frame it, is straightforward: Nvidia now controls scheduling software that also runs on hardware from its rivals, including AMD and Intel. A vendor that controls workload scheduling software has significant leverage over how efficiently competing hardware performs within shared computing environments — whether it exercises that leverage or not, Reuters reported, citing five anonymous sources, three of whom work in the AI industry and two with knowledge of supercomputer operations.

Analysts who spoke to InfoWorld said Nvidia’s open-source commitment — the company said during the acquisition announcement that it would “continue to develop and distribute Slurm as open-source, vendor-neutral software” — may not be sufficient protection.

“Slurm’s open-source foundation offers safeguards such as transparent code, forking ability, and community governance, but SchedMD’s control gives Nvidia soft power rather than hard lock-in,” said Manish Rawat, semiconductor analyst at TechInsights. Rawat said Nvidia could subtly shape the roadmap, prioritising GPU-aware scheduling and topology optimisations that favour its own hardware, and that integration timelines already showed faster support for the CUDA ecosystem compared to alternatives such as AMD’s ROCm or Intel’s oneAPI – creating what he described as a “best-supported path effect.”

What is Slurm, and why does it matter?

Slurm, originally developed at Lawrence Livermore National Laboratory, runs on roughly 60% of the world’s supercomputers. The software is in active use at major AI companies, including Meta Platforms, French AI startup Mistral, and Anthropic for elements of AI model training, Reuters reported.

Government supercomputers used for weather forecasting and national security research also depend on it. Nvidia acquired Slurm developer SchedMD in December 2025 and described the deal as a push to strengthen its open-source ecosystem and help users adopt newer AI techniques alongside traditional supercomputing work.

Is the concern valid?

Dr. Danish Faruqui, CEO of Fab Economics, a US-based AI hardware and datacenter advisory, said the risk was real.

“The skepticism that Nvidia may prioritize its own hardware in future software updates, potentially delaying or under-optimizing support for rivals, is a feasible outcome,” he said. As the primary developer, Nvidia now controls Slurm’s official development roadmap and code review process, Faruqui said, “which could influence how quickly competing chips are integrated on new development or continuous improvement elements.”

Owning the control plane alongside GPUs and networking infrastructure such as InfiniBand, he added, allows Nvidia to create a tightly vertically integrated stack that can lead to what he described as “shallow moats, where advanced features are only available or performant on Nvidia hardware.”

One concrete test of that, industry observers say, will be how quickly Nvidia integrates support for AMD’s next-generation chips into Slurm’s codebase compared with how quickly it integrates its own forthcoming hardware and networking technologies, such as InfiniBand.

Does the Bright Computing precedent hold?

Analysts point to Nvidia’s 2022 acquisition of Bright Computing as a reference point, saying the software became optimized for Nvidia chips in ways that disadvantaged users of competing hardware. Nvidia disputed that characterization, saying Bright Computing supports “nearly any CPU or GPU-accelerated cluster.”

Rawat said the comparison was instructive but imperfect. “Nvidia’s acquisition of Bright Computing highlights its preference for vertical integration, embedding Bright tightly into DGX and AI Factory stacks rather than maintaining a neutral, multi-vendor orchestration role,” he said. “This reflects a broader strategic pattern — Nvidia seeks to control the full-stack AI infrastructure experience.”

However, he said Slurm presented a fundamentally different challenge. “Deeply entrenched in supercomputing centers and academia, and effectively community-governed, Slurm carries high switching costs,” Rawat said. “Nvidia may influence but is unlikely to replicate the same tightly integrated control in markets dominated by established, neutral, and community-driven platforms.”

The open-source safety valve and its limits

Faruqui acknowledged that Slurm’s open-source licensing under a GNU GPL v2.0 licence offers some protection, including the community’s right to fork the project if Nvidia’s stewardship is seen as biased. But he cautioned that the option carried its own risks. “Slurm’s open-source status provides a safety valve with its limitations, but it is not a complete shield against vendor-neutrality,” he said.

The acquisition brought many of the world’s leading Slurm developers inside Nvidia, he noted, meaning a community-led fork would struggle to sustain the same pace of development.

Rawat described the situation as “a strategic dependency risk, not a crisis,” and said organisations should diversify GPU procurement, benchmark workloads across multiple vendor ecosystems, and develop internal expertise to modify or switch orchestration tools if needed.

Faruqui recommended that enterprise buyers negotiating Slurm support agreements seek service-level guarantees that apply equally to non-Nvidia hardware, covering response times, bug fixes, and feature parity across heterogeneous clusters. On architecture, he said organisations should consider containerising AI workloads to isolate applications from the underlying scheduler, making migration to alternative schedulers such as Flux or Kubernetes more feasible if required.


Enterprise developers question Claude Code’s reliability for complex engineering 7 Apr 2026, 11:56 am

When a coding assistant starts looking like it’s cutting corners, developers notice. A senior director in AMD’s AI Group has publicly needled Anthropic’s Claude Code for what she calls a tendency to skim the hard bits, offering answers that land but don’t quite stick.

The gripe isn’t about outright failure so much as fading rigor: complex problems draw responses that seem quicker, lighter, and a little too eager to move on. The pattern has forced the executive and her team to stop using the pair programming tool for complex engineering tasks, such as debugging hardware and kernel-level issues.

The concerns were detailed in a GitHub issues ticket that Stella Laurenzo filed, where she claims that a February update of the tool might have resulted in quality regression issues around its reasoning capabilities for complex tasks.

The ticket stems from her quantitative analysis of 17,871 thinking blocks and 234,760 tool calls across 6,852 session files spanning January to March, covering both pre- and post-update periods for comparison.

In her analysis, Laurenzo pointed out that the model gradually stopped reading code before making changes to it, which she attributes to a loss of reasoning capability.

“When thinking is shallow, the model defaults to the cheapest action available: edit without reading, stop without finishing, dodge responsibility for failures, take the simplest fix rather than the correct one,” she wrote in the ticket.

The loss in reasoning, Laurenzo added, is a major hurdle for her team, which runs more than 50 concurrent agent sessions doing systems programming in C and on GPU drivers, with autonomous runs of more than 30 minutes that make complex multi-file changes.

Laurenzo is not alone in raising these concerns. Several users commented on the ticket saying that they were having similar experiences to those of Laurenzo and her team.

Another user pointed to multiple subreddits highlighting similar degradation concerns, a comment that itself drew visible support from other developers through upvotes on GitHub.

Capacity crunch meets developer patience

That growing chorus of complaints has not gone unnoticed by analysts, who connected the issue to Anthropic’s ongoing capacity constraints.

“This is primarily a capacity and cost issue. Complex engineering tasks require significantly more compute, including intermediate reasoning steps. As usage increases, the system cannot sustain this level of compute for every request,” said Chandrika Dutt, research director at Avasant.

“As a result, the system limits how long a task runs or how much reasoning depth is applied and how many such tasks can run simultaneously,” Dutt added.

This is not the first time Anthropic has had to deal with capacity constraints around Claude Code.

Last month, it started limiting usage across its Claude subscriptions to cope with rising demand that is stretching its compute capacity. The rationale then was that by accelerating how quickly users hit their session limits within these windows, Anthropic would be able to effectively redistribute access to prevent system overloads while still preserving overall weekly usage quotas.

Developers, much like in the case of the reasoning regression, had pushed back sharply against the rate limits imposed on Claude Code, arguing that the restrictions undercut its usefulness.

No exodus, but a slow erosion of trust

Taken together, the twin frustrations over rate limits and perceived reasoning regressions risk denting developer confidence in the platform. Rather than triggering a mass exodus, analysts say, they are slowing momentum and nudging enterprise users to hedge their bets with alternatives.

“This is not the kind of moment where users walk away overnight. It is far more subtle and far more dangerous than that. What is happening is a quiet shift in how much developers trust the system when the stakes are high. The loudest complaints are coming from teams that had already begun to rely on the system for serious, multi-step engineering work over extended sessions,” said Sanchit Vir Gogia, chief analyst at Greyhound Research.

“What has changed is not just the quality of outputs, but the way the system behaves while producing them. There is a noticeable drift from careful, step-by-step reasoning toward quicker, more reactive execution. That creates a cycle where engineers step in more often, interrupt more frequently, and end up doing the thinking the system was expected to handle,” Gogia pointed out.

That change, according to the analyst, will force teams to route complex or critical work elsewhere while keeping simpler tasks with Claude, which over time will erode the platform’s role from primary tool to optional tool.

Laurenzo, per her GitHub issues ticket, is already taking the route Gogia predicts: setting Claude Code aside until Anthropic delivers a fix and switching to an unnamed rival offering for now.

No easy escape hatch in a GPU-constrained world

However, Avasant’s Dutt doesn’t expect Laurenzo’s decision to pay off in the long run. She pointed out that rivals might start facing similar capacity constraints as Anthropic: “All frontier models operate under similar GPU and cost constraints. As usage scales, all providers will need to introduce throttling mechanisms, tiered access models, and trade-offs between speed, cost, and reasoning depth. This is structurally inevitable.”

The analyst sees that as especially likely for reasoning regression, because maintaining deep reasoning at scale is a difficult challenge. She points to recent SWE-EVO 2025 benchmarks on AI coding agents, which show success rates dropping sharply for multi-step tasks, with failure rates often in the 60%–80% range, especially for execution-heavy scenarios.

Pay more, see more: the emerging AI trade-off?

As a fallback, though, Laurenzo is optimistic that Anthropic can course-correct, even suggesting, in her ticket, that the company introduce premium tiers that allow users to pay for greater reasoning capacity.

That might soon become a reality, both Dutt and Gogia said, as the industry is moving toward a consumption model where basic usage is treated differently from heavy, reasoning-intensive workloads.

Analysts also support Laurenzo’s other suggestions to Anthropic, which included transparency around thinking token allocation.

“Users need to understand what the system is doing under the hood. Not every detail, but enough to know whether the system actually reasoned through a problem or simply produced a quick answer. Today, users are forced to infer that from outcomes, which is why you are seeing users analyzing logs and behavior patterns. That should not be necessary,” Gogia said.

For now, though, Anthropic has yet to respond to Laurenzo’s GitHub ticket or assign it to anyone.

However, developers hoping for a quick fix, especially around capacity, may want to lower their expectations until at least 2027, when new chips, in the form of Google TPUs manufactured by Broadcom, are due to be added to Anthropic’s fleet. Until more GPUs show up or the company decides who gets to use them at higher pricing, developers may be left refreshing threads, watching tokens get rationed, and waiting for reasoning to make a comeback.


The reckless temptation of AI code generation 7 Apr 2026, 9:00 am

Too many executives are cutting software engineering teams because they bought into the fantasy that AI can now build and maintain enterprise applications with only a few people around to supervise the machine. That idea isn’t bold. It isn’t visionary. It’s reckless, and the executives who act on it will suffer consequences that go well beyond a bad quarter.

Yes, AI can write code. That much is clear. The problem is that many vendors and leaders have taken this fact and exaggerated it into something absurd: the idea that software engineering has become essentially optional. They believe that if a model can generate application logic, then experienced developers, architects, and performance engineers are suddenly unnecessary expenses. This kind of thinking might seem clever in a boardroom presentation, but it falls apart in real-world production.

How this story unravels

The applications often work, which makes this approach deceptively effective. The demo succeeds, and, at first, the feature seems to function properly. Everyone congratulates themselves. But then the system is deployed at scale and the cloud bill skyrockets. What used to cost $10,000 a month on AWS suddenly jumps to $300,000 or more. In the worst cases, companies face multimillion-dollar monthly cloud costs for systems that should never have been built that way in the first place.

AI can generate code, but it doesn’t grasp efficiency like experienced engineers do. It doesn’t prioritize cost-efficient architecture. It doesn’t instinctively avoid wasteful service calls, excessive data movement, poor caching, bad concurrency patterns, noisy database behavior, or compute-heavy nonsense that might look good in a code sample but fails in real-world use. It produces something plausible. However, it doesn’t deliver something financially responsible.

Then comes my favorite bad argument from the AI hype crowd: “Just optimize it afterward.” Fine. With whom? These companies fired the experts who understood complex systems, leaving behind AI-generated code no one fully understands. The remaining humans didn’t build it, don’t know its structure, and can’t safely modify it. They are trapped with applications they can run at an exorbitant price but not reliably maintain.

That isn’t innovation. That’s self-inflicted technical debt on an industrial scale.

Normally, technical debt creeps in over time. A rushed release here, a shortcut there, an old dependency nobody wants to touch. With AI-generated enterprise software, companies are creating years of technical debt in a matter of months. It’s almost impressive, in the worst possible way. They are compressing entire failure cycles because AI lets them build faster than they can think.

And now the frantic calls begin. Why is the app slow? Why are users complaining? Why are outages harder to diagnose? Why is the cloud bill out of control? Why can’t anyone fix this without causing something else to fail? Why doesn’t the AI coding promise look anything like the sales pitch?

Know the pros and cons of AI

That doesn’t mean AI is useless—far from it. AI can absolutely help software teams move faster. It can help with scaffolding, documentation, repetitive coding tasks, test generation, and even architectural brainstorming. In the hands of strong engineering teams, it is a legitimate accelerator. But somewhere along the way, too many executives decided that “accelerator” meant “replacement,” and the bad decisions began.

Good engineers are not valuable because they can type code into an editor. Good engineers are valuable because they understand systems. They understand trade-offs. They understand why one design choice creates future operational pain and another choice avoids it. They understand how software behaves after launch, under load, across regions, inside complex security and compliance environments, and on top of public cloud pricing models that punish inefficiency. AI does not replace that. It imitates fragments of it.

What makes this even worse is that too many companies incentivize the short term. The market loves a cost-cutting story. Announce layoffs or say “AI transformation” often enough and you may get a nice temporary stock bump. Executives know that. They also know that if the real damage shows up three or four quarters later, they can always blame execution, market conditions, or “unexpected complexities.” Meanwhile, the company’s engineering foundation is being hollowed out.

Don’t be the company that finds out too late that it has painted itself into an AI corner. The old human-built systems will still be around, but the people who understood them are gone. The new AI-built systems are expensive, fragile, and opaque. Rebuilding will cost a fortune. Rehiring talent will be difficult. Some employees will not come back, and I wouldn’t blame them.

I said this before, and it still holds true: AI is nowhere near replacing software engineers at the scale being promised. Not even close. The leaders who think otherwise are gullible, not brave. Worse, they are risking their companies for marketing stories pushed by people who profit from overstating the future.

In the next few years, I anticipate some difficult case studies. Some companies will quietly change direction. Others will spend a lot of money trying to fix issues. A few might shut down entirely because they made a fatal management mistake: They bought into the hype, fired the people who knew what they were doing, and handed control of systems to individuals who couldn’t truly manage them.

If companies want to avoid that outcome, the answer is straightforward. Keep your engineers, use AI to enhance their capabilities, and assign experienced architects to lead, enforce governance, control costs, and ensure maintainability. Treat AI as a tool and not a replacement for human judgment.

It’s easy for hype cycles to make lots of magical claims. Reality is less exciting. Look past the marketing spin to long-term implications, because reality is what pays the cloud bill.


What enterprise devops teams should learn from SaaS 7 Apr 2026, 9:00 am

Many enterprise devops teams struggle to deploy frequently, increase test automation, and ensure reliable releases. What can they learn from SaaS companies, where developing and deploying software for thousands of customers is core to their revenue and business operations?

SaaS companies must have robust testing, observability, deployment, and monitoring capabilities. One bad deployment can disrupt customer operations, unravel sales opportunities, and attract negative media coverage.

What’s most challenging is that many SaaS platforms are configurable and have low-code development capabilities. These platforms require robust test data sets and real-time monitoring to ensure that deployments don’t break functionality for customers. Impacting even a tiny fraction of customers is an unacceptable outcome.

Validating data entry forms and end-to-end workflows is a combinatorial problem that requires building robust data sets and testing a statistically significant sample of input patterns. Further, developing, integrating, and deploying AI agents and language models adds new complexities. In enterprises, testing open-ended AI agents with non-deterministic responses becomes a greater challenge as more organizations use third-party AI agents and move AI experiments into production.
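To illustrate the combinatorial problem, here is a minimal sketch (field names and values are hypothetical) that samples a fixed number of form-input combinations from the full cartesian product, so a test suite can cover a representative slice rather than every permutation. For forms with many fields, teams typically use pairwise (all-pairs) tools instead, since the full product grows exponentially.

```python
import itertools
import random

def sample_form_inputs(field_options, k, seed=0):
    """Sample k input combinations from the cartesian product of field values."""
    combos = list(itertools.product(*field_options.values()))
    rng = random.Random(seed)  # fixed seed keeps test runs reproducible
    picked = rng.sample(combos, min(k, len(combos)))
    return [dict(zip(field_options.keys(), c)) for c in picked]

# Hypothetical invoice-form fields: 3 x 3 x 2 = 18 possible combinations
fields = {
    "country": ["US", "DE", "IN"],
    "currency": ["USD", "EUR", "INR"],
    "tax_exempt": [True, False],
}
cases = sample_form_inputs(fields, k=5)
```

Each entry in `cases` is a complete form submission that an end-to-end test can drive through the workflow under test.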

I asked SaaS providers to share some of their devops secrets. As more enterprise devops teams develop, deploy, and support mission-critical apps, integrations, and data pipelines, how can they improve resiliency? Look to the practices of SaaS providers.

Aim for smart ‘customer’ upgrades

If you deploy an upgrade, will end users take advantage of the new capabilities, or will they be frustrated by defects that leaked into production? Enterprises must aim for smart customer upgrades that are seamless for end users, are deployed frequently, have low defect rates, avoid security issues, and drive adoption of new capabilities.

To consistently meet these objectives, enterprise devops teams must embrace a shift in mindset, away from legacy IT norms. They must recognize that:

  1. End users are customers and disrupting workflows affects business operations. Improving deployment frequency but shipping with defects is not a win for SaaS, nor should it be for enterprise IT.
  2. Deploying capabilities that few people notice, try out, and adopt is highly problematic. It implies that the team invested time and resources without delivering business value and likely introduced new technical debt.
  3. Deploying software is never a one-time effort, and agile teams should communicate their release management plans.

“The most successful devops teams realize that their internal platform is actually a specialized SaaS product where the developers are the primary customers,” says Sergio Rodríguez Inclán, senior devops engineer at Jalasoft. “By replacing rigid project deadlines with a commitment to continuous reliability and self-service automation, IT shifts from being a corporate bottleneck to a competitive advantage.”

One transformation many enterprises are undertaking is a shift to product-based IT, which helps group applications into products and assigns product managers to oversee their roadmaps. Most SaaS companies assign product managers to communicate product vision, define user personas, understand customer needs, prioritize features, and measure business outcomes.

Esko Hannula, SVP of robotics at Copado, says, “Modern enterprise IT should adopt the SaaS mindset. Software isn’t a project with an end date but a continuously improving product delivered through frequent, incremental releases.”

Hannula recommends reviewing the devops practices used by SaaS teams, including advanced CI/CD, continuous testing, canary releases, A/B testing, and data-guided product management, to be able to release whenever needed. “These practices matter because they create the confidence, agility, and quality necessary for rapid response to business change—outcomes that naturally follow from treating software as a long-lived product rather than a one-off project,” Hannula says.

Code less and test more

Developers in enterprise IT are using AI code generators and experimenting with vibe coding tools. Research shows that these tools can improve developer productivity by 30% or more. But will productivity translate into devops teams deploying features faster and more reliably?

Enterprise IT has a long history of underfunding testing and targeting big-bang deployments. SaaS companies do the opposite. They apply analytics in test automation, build synthetic test data sets, and use feature flagging to reduce the risks of deploying more frequently. The more advanced SaaS companies adopt continuous deployment, but this may be challenging to implement for many enterprises.

“Test automation may feel like an upfront cost, but it pays off quickly because more resilient services lead to fewer incidents, fewer support tickets, and lower operational overhead,” says Nikhil Mungel, Director of AI R&D at Cribl. “SaaS teams often de-risk launches by releasing features to small groups first and using observability to watch system vitals and user experience before broad release, typically via feature flags and bucketing. IT devops teams can mirror this by enabling ‘power users’ to opt in early, improving satisfaction while reducing support burden.”
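The “bucketing” Mungel describes is commonly implemented as a deterministic hash of the user and feature name, so a given user stays in or out of a rollout consistently across sessions. A minimal sketch, with an allowlist for the early opt-in “power users” he mentions (all names here are illustrative, not from any specific feature-flagging product):

```python
import hashlib

POWER_USERS = {"alice", "bob"}  # hypothetical early opt-in group

def in_rollout(user_id: str, feature: str, percent: int) -> bool:
    """Deterministically place a user in a percentage-based rollout bucket."""
    if user_id in POWER_USERS:
        return True  # power users always see the feature first
    # Hash of feature + user gives a stable bucket in [0, 100)
    digest = hashlib.sha256(f"{feature}:{user_id}".encode()).hexdigest()
    bucket = int(digest[:8], 16) % 100
    return bucket < percent
```

Raising `percent` gradually widens the audience without reshuffling who is already in, which is what makes observability comparisons between cohorts meaningful.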

Secure in design, development, and deployment

The productivity improvement from code generators may come at a cost. The same study noted above, which showed improved developer productivity and code maintainability, also found that 23.7% more security vulnerabilities were introduced in the generated code.

Shifting security left sounds straightforward, but in reality, it’s a broad agenda for enterprise devops teams that want security on an equal footing with dev and ops priorities. To become devsecops, agile teams must address security and compliance beyond application security, including cloud infrastructure hardening, identity management, data security, and AI model bias.

“SaaS teams win when they embed security into their codebase with dependencies that upgrade seamlessly,” says David Mytton, CEO of Arcjet. Mytton recommends these four practices:

  1. Test suites with automated security and privacy checks that flag when dependencies break.
  2. Standardized observability with structured traces and context-rich logs.
  3. Privacy-aware data management and in-app PII redaction with CI/CD gates.
  4. Feature flags and canary rollouts to move fast without breaking customers or compliance.
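Mytton’s third item, in-app PII redaction with CI/CD gates, can start small: scrub records before they leave the process, and have a CI step fail the build if sample log output still contains matches. A minimal sketch (the regex covers only email addresses and is illustrative, not production-grade PII detection):

```python
import re

# Intentionally simple email matcher for illustration only
EMAIL_PATTERN = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")

def redact(record: str) -> str:
    """Replace email addresses in a log record with a placeholder."""
    return EMAIL_PATTERN.sub("[REDACTED_EMAIL]", record)

def ci_pii_gate(log_lines):
    """Return offending lines; a CI step fails the build if any remain."""
    return [line for line in log_lines if EMAIL_PATTERN.search(line)]
```

Real deployments would extend the pattern set (phone numbers, account IDs) and run the gate against captured staging logs in the pipeline.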

Priya Sawant, GM and VP of platform and infrastructure at ASAPP, adds that modern SaaS teams shift left by baking security, testing, access control, and observability into the design and CI/CD pipelines rather than patching them in at the end. “Automating permissions, enforcing golden-path pipelines, and delivering built-in observability removes friction, improves quality, and accelerates delivery. IT and devops teams that adopt this model move faster and scale more reliably than those stuck in manual approvals and reactive workflows,” Sawant says.

Start planning for resilient operations from Day 0

Back when I was a CIO, I once was asked what our Day 2 model was for a new application we were building. Day 2 is legacy terminology for when an application is deployed to production and requires support, as opposed to Day 1 (development) and Day 0 (planning).

SaaS teams have a very different mindset around operations, and they start planning for scalability, security, performance, and incident management from the outset in the architecture design.

For example, SaaS companies place a lot of emphasis on their developer experiences. Developers who are too busy tinkering with cloud configurations, manually patching components, or handling data management tasks can lose focus on customer needs.

“SaaS engineers use technologies that don’t overcomplicate things and let them move fast without wrestling with upgrade paths,” says Alejandro Duarte, developer relations engineer at MariaDB.

Duarte recommends choosing infrastructure that doesn’t slow down developers. For example, at the data layer, Duarte prioritizes systems that support native replication, vector storage, fast analytics, and automatic node recovery.

Define an observability strategy, then implement it

Another SaaS-inspired mindset shift is from Day 2 monitoring of applications as black boxes to Day 0 observability, providing ops teams with the details needed to aid incident management and root cause analysis. In enterprises, establishing observability standards is essential because operations teams track alerts and incidents across hundreds to thousands of applications.

“IT devops teams can learn from SaaS developers that observability isn’t just about monitoring systems after deployment—it’s about embedding real context into every stage of development,” says Noam Levy, founding engineer and field CTO at groundcover. “Modern observability tools, especially when paired with AI, help engineers anticipate regressions before they happen in production environments, guiding safer code changes and more reliable releases. This shift from reactive troubleshooting to proactive reliability mirrors how leading SaaS teams continuously refine and reinforce trust in their software.”

The importance of observability was a common theme among SaaS leaders, and many standardize it as a devops non-negotiable. But logging every bit of information can become expensive and complex, especially when AI agents log all interactions.

“As AI-driven systems generate exponentially more logs, metrics, and traces, tightly coupled observability stacks can’t keep enough data hot without driving up costs or offloading it into slow, hard-to-query cold storage,” says Eric Tschetter, chief architect at Imply. “With an observability warehouse as the scalable data layer, teams keep telemetry data accessible at scale without increasing costs.”

Ang Li, director of engineering at Observe, shares a good rule that SaaS teams use to decide what information to include in their standards. “SaaS engineering teams design observability around users and workflows, not just whether systems are up or down. IT devops can apply the same thinking, moving beyond uptime monitoring to instrumenting critical business transactions to better understand user impact, limit blast radius, and recover faster,” says Li.
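Instrumenting critical business transactions, as Li describes, can begin with something as small as a context manager that emits one structured event per transaction, capturing outcome and latency rather than just host health. A hedged sketch (the event fields are hypothetical, not from any particular observability product):

```python
import json
import time
from contextlib import contextmanager

@contextmanager
def business_transaction(name, user_id, sink=print):
    """Emit one structured JSON event per business transaction."""
    start = time.monotonic()
    status = "ok"
    try:
        yield
    except Exception:
        status = "error"
        raise  # re-raise so callers still see the failure
    finally:
        sink(json.dumps({
            "transaction": name,
            "user_id": user_id,
            "status": status,
            "duration_ms": round((time.monotonic() - start) * 1000, 2),
        }))
```

With events like these, a team can alert on the error rate or latency of, say, checkout transactions and estimate user impact directly, instead of inferring it from CPU or uptime metrics.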

Key takeaways

We can distill two key takeaways for enterprise devops teams from the recommendations our experts shared above. First, apply product management practices, focus on features that matter, and develop robust testing. Second, shift left in practices, not just culture, by considering observability, security, and resiliency as part of the solution’s architecture.



Rust team warns of WebAssembly change 7 Apr 2026, 4:51 am

WebAssembly targets for Rust will soon face a change that could risk breaking existing projects, according to an April 4 bulletin in the official Rust blog. The bulletin notes that all WebAssembly targets in Rust have been linked using the --allow-undefined flag to wasm-ld, but this flag is being removed.

Removing --allow-undefined on wasm targets is being done in rust-lang/rust#149868. That change is slated to land in nightly builds soon and will be released with Rust 1.96 on 2026-05-28. The bulletin explains that all WebAssembly binaries in Rust are created by linking with wasm-ld, which serves a similar purpose to ld, lld, and mold. Since the first introduction of WebAssembly targets in Rust, the --allow-undefined flag has been passed to wasm-ld.

However, by passing --allow-undefined on all WebAssembly targets, rustc introduces diverging behavior between other platforms and WebAssembly, the bulletin says. The main risk of --allow-undefined is that misconfiguration or mistakes in building can result in broken WebAssembly modules being produced, rather than compilation errors. The bulletin lists the following examples of problematic situations:

  • If mylibrary_init was mistakenly typed as mylibraryinit, then the final binary would import the mylibraryinit symbol instead of calling the linked mylibrary_init C symbol.
  • If mylibrary was mistakenly not compiled and linked into a final application, then the mylibrary_init symbol would end up imported rather than producing a linker error saying it’s undefined.
  • If external tools are used to process a WebAssembly module, such as wasm-bindgen or wasm-tools component new, they are likely to provide an error message that isn’t clearly connected back to the original source code from which the symbols were imported.
  • Web errors along the lines of Uncaught TypeError: Failed to resolve module specifier "env". Relative references must start with either "/", "./", or "../". can mean that "env" leaked into the final module unexpectedly, and the true error is the undefined symbol error, not the lack of "env" items provided.

All native platforms consider undefined symbols to be an error by default. Therefore, by passing --allow-undefined, rustc introduces surprising behavior on WebAssembly targets. The goal of the change is to remove this surprise so that WebAssembly behaves more like native platforms, the bulletin states.

In theory, however, not a lot is expected to break from this change, the bulletin concludes. If the final WebAssembly binary imports unexpected symbols, then it’s likely the binary won’t be runnable in the desired embedding, as the desired embedding probably doesn’t provide the symbol as a definition. Therefore, most of the time this change will not break users, but will instead provide better diagnostics.


Visual Studio Code 1.114 streamlines AI chat 6 Apr 2026, 9:13 pm

Microsoft has released Visual Studio Code 1.114. The update of Microsoft’s popular code editor streamlines the AI chat experience, offering previews of videos in the image carousel for chat attachments, adding a Copy Final Response command to the chat context menu, simplifying semantic searches of codebases by GitHub Copilot, and more.

Introduced April 1, VS Code 1.114 can be downloaded from the project website.

With VS Code 1.114, the image carousel, introduced in version 1.113, now also supports videos. Developers can play and navigate videos from chat attachments or the Explorer context menu. The viewer provides controls and navigation for images and videos using arrows or thumbnails. Also, there now is a Copy Final Response command in the chat context menu that copies the last Markdown section of the agent’s response, after tool calls have run.

To simplify workspace searches, the #codebase tool is now used exclusively for semantic searches. Previously, #codebase could fall back to less accurate and less efficient fuzzy text searches. The agent can still do text and fuzzy searches, but Microsoft intends to keep #codebase purely focused on semantic searches. Microsoft also simplified how the codebase index is managed.

Elsewhere in VS Code 1.114:

  • A preview feature for troubleshooting previous chat sessions allows developers to reference any previous chat session when troubleshooting. This makes it easier to investigate issues after the fact, without needing to reproduce them, Microsoft said.
  • TypeScript and JavaScript support now extends to TypeScript 6.0, which was introduced March 23.
  • The Python Environments extension now recommends the community Pixi extension when Pixi environments are detected, and includes Pixi in the environment manager priority order.
  • Administrators now can use a group policy to disable Anthropic Claude agent integration in chat. When this policy is applied, the github.copilot.chat.claudeAgent.enabled setting is managed by the organization and users cannot enable the Claude agent.
  • A proposed API for fine-grained tool approval allows language model tools with an approval flow to scope approval to a specific combination of arguments, so that users approve each command individually.

VS Code 1.114 reflects a change that began in March, when Microsoft started releasing weekly updates to VS Code instead of monthly ones. Under the new cadence, VS Code 1.115 is likely to be released any day now.

