Databricks pitches Lakewatch as a cheaper SIEM — but is it really? 26 Mar 2026, 12:42 pm
Databricks has previewed Lakewatch, a new open, agentic security information and event management (SIEM) platform, signaling the company’s first deliberate step beyond data warehousing into security analytics.
The data warehouse provider is pitching Lakewatch as a lower-cost alternative to traditional security tools, arguing that consolidating security analytics into its data platform can reduce overall spend.
“Right now, existing solutions’ (rival SIEMs) ingestion costs force teams to discard up to 75% of their data, so while attackers can use AI to attack anywhere, defenders only see a fraction of their own data. Our goal with Lakewatch is to close this gap… because our lakehouse architecture is uniquely built to handle massive amounts of data cheaply,” Andrew Krioukov, general manager of Lakewatch at Databricks, told InfoWorld.
“Unlike other SIEM platforms, we do not charge based on the amount of data ingested or stored, but rather on the compute that security teams use. This allows organizations to achieve up to an 80% reduction in total cost of ownership (TCO) while maintaining years of hot, queryable data for compliance and hunting,” Krioukov added.
Analysts agree with Krioukov, but only in part.
“The cost problem in SIEM is real. Many organizations often are forced to discard data because ingestion pricing makes full retention prohibitively expensive,” said Stephanie Walter, leader of the AI stack at HyperFRAME Research.
In contrast, Lakewatch can reduce costs in some cases, especially if enterprises want to retain large amounts of data, echoed Akshat Tyagi, associate practice leader at HFS Research.
However, analysts warned that savings may be less straightforward, with costs potentially shifting to compute and data processing rather than disappearing altogether.
“Costs don’t disappear; they shift. If usage isn’t controlled, compute can add up quickly. It can be more efficient, but not automatically cheaper,” said Robert Kramer, principal analyst at Moor Insights & Strategy.
Beyond costs, though, analysts say Lakewatch represents a meaningful structural shift in how enterprises conduct security operations, especially analytics.
The platform stitches together components such as Unity Catalog for governance and access control, Lakeflow Connect for ingesting and streaming security data, and the Open Cybersecurity Schema Framework (OCSF) to standardize disparate log formats, effectively turning the lakehouse into a centralized system of record for security operations, Walter said.
The added context from all the combined data in the lakehouse is also likely to act as an accelerant for helping enterprises automate security operations at scale with agents, Walter added.
That said, translating these benefits into near-term buy-in from CIOs and CISOs could prove challenging for Databricks.
“This is more likely to complement existing SIEMs than replace them. Early adoption will come from large enterprises already committed to Databricks, especially those seeking flexibility or cost control. It aligns with existing investments but remains new territory for operational security teams. Building trust through proven use cases will be key,” Kramer said.
Even so, Databricks is signaling serious intent with the acquisitions of two cybersecurity startups, Antimatter and SiftD.ai, which analysts say point to a broader security roadmap. “This looks like the foundation of a long-term security portfolio, not a one-off SIEM feature. Acquiring security-focused companies is less about adding features and more about importing credibility. Security buyers trust vendors with domain depth, not just infrastructure scale,” HyperFRAME Research’s Walter said.
Google targets AI inference bottlenecks with TurboQuant 26 Mar 2026, 10:22 am
Google says its new TurboQuant method could improve how efficiently AI models run by compressing the key-value cache used in LLM inference and supporting more efficient vector search.
In tests on Gemma and Mistral models, the company reported significant memory savings and faster runtime with no measurable accuracy loss, including a 6x reduction in memory usage and an 8x speedup in attention-logit computation on Nvidia H100 hardware.
For developers and enterprise AI teams, the technology offers a path toward reduced memory demands and better hardware utilization, along with the ability to scale inference workloads without a matching jump in infrastructure costs.
According to Google, TurboQuant targets two of the more expensive components in modern AI systems, specifically the key-value (KV) cache used during LLM inference and the vector search operations that underpin many retrieval-based applications.
By compressing these workloads more aggressively without affecting output quality, TurboQuant could allow developers to run more inference jobs on existing hardware and ease some of the cost pressure around deploying large models.
Significance in enterprise deployments
Whether this amounts to a meaningful breakthrough for enterprise AI teams will depend on how well the technique performs outside Google’s own tests and how easily it can be integrated into production software stacks.
“If these results hold in production systems, the impact is direct and economic,” said Biswajeet Mahapatra, principal analyst at Forrester. “Enterprises constrained by GPU memory rather than compute could run longer context windows on existing hardware, support higher concurrency per accelerator, or reduce total GPU spend for the same workload.”
Sanchit Vir Gogia, chief analyst at Greyhound Research, said the announcement addresses a real but often overlooked constraint in enterprise AI systems.
“Let’s call this what it is,” Gogia said. “Google is going after one of the most annoying, least talked about problems in AI systems today. Memory blow-up during inference. The moment you move beyond toy prompts and start working with long documents, multi-step workflows, or anything that needs context to persist, memory becomes the constraint.”
These gains matter because KV cache memory rises in step with context length. Any meaningful compression can directly let developers handle longer prompts, larger documents, and more persistent agent memory, all without having to redesign the underlying architecture.
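The relationship between context length and cache size can be made concrete with back-of-the-envelope arithmetic. The sketch below uses the standard KV-cache sizing estimate (two cached tensors, keys and values, per layer); the model dimensions are hypothetical, not taken from Google’s announcement:

```java
public class KvCacheEstimate {
    // Standard KV-cache sizing estimate: keys and values are both cached,
    // hence the factor of 2. All parameter values below are illustrative.
    static long kvCacheBytes(long layers, long kvHeads, long headDim,
                             long contextLen, long bytesPerElement) {
        return 2 * layers * kvHeads * headDim * contextLen * bytesPerElement;
    }

    public static void main(String[] args) {
        // Hypothetical mid-size model: 32 layers, 8 KV heads, head dim 128,
        // 32K-token context, 16-bit (2-byte) cache entries.
        long bytes = kvCacheBytes(32, 8, 128, 32_768, 2);
        System.out.println("KV cache at 32K context: "
                + bytes / (1024L * 1024 * 1024) + " GiB");
        // The cache grows linearly with context length, so a 6x compression
        // effectively allows ~6x longer contexts in the same memory budget.
        System.out.println("Same budget after 6x compression: "
                + (bytes / 6) / (1024L * 1024) + " MiB per 32K of context");
    }
}
```

Because the estimate is linear in `contextLen`, any fixed compression ratio translates directly into proportionally longer usable contexts, which is the point the paragraph above makes.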
However, Gogia cautioned that efficiency gains may not translate into lower spending.
“Efficiency gains rarely reduce spend,” Gogia said. “They increase usage. Teams don’t save money. They stretch systems further. Longer context, more queries, more experimentation. So the impact is real, but it shows up as scale, not savings.”
LLM inference to benefit
Google is positioning TurboQuant as a technology that could improve both LLM inference and vector search. Some analysts say the more immediate payoff is likely to come in LLM inference.
“The KV cache problem is already an acute cost and scaling limiter for enterprises deploying chat, document analysis, coding assistants, and agentic workflows, and TurboQuant directly compresses that runtime memory without retraining or calibration,” Mahapatra said. “Vector search also benefits from the same underlying compression techniques, but most enterprises already manage vector memory through sharding, approximate search, or storage tiering, which makes the pain less immediate.”
That distinction matters because inference memory pressure tends to hit enterprises where it hurts most: GPU sizing, latency, and cost per query. In other words, the problem is not theoretical. It affects the economics of running AI systems at scale today.
Gogia, however, sees the initial impact playing out differently, with retrieval and vector search systems likely to benefit first.
“Retrieval systems are modular,” Gogia said. “You can isolate them, tweak them, test them without breaking everything else. And they already depend on compression to function at scale. So any improvement here hits immediately. Storage footprint comes down. Index rebuilds get faster. Refresh cycles improve. That is operational value, not theoretical value.”
Gogia said Google’s announcement represents a solid piece of engineering that addresses a real problem and could deliver meaningful benefits in the right contexts. However, he added that it does not change the underlying constraints, noting that AI systems remain limited by infrastructure, power, cost, and the complexity of making all the components work together.
Swift 6.3 boosts C interoperability, Android SDK 26 Mar 2026, 9:00 am
Swift 6.3, the latest release of the Apple-driven language for multiple platforms, offers more flexible C interoperability and improvements for cross-platform build tools. The release also features the official SDK for Android mobile application development.
Announced March 24, Swift 6.3 can be accessed at swift.org. For C interoperability, Version 6.3 debuts the @c attribute for exposing Swift functions and enums to C code in a project. Annotating a function or enum with @c prompts Swift to include a corresponding declaration in the generated C header that can be included in C/C++ files.
Also, Swift 6.3 has a preview of the Swift Build system integrated into Swift Package Manager. This preview brings a unified build engine across all supported platforms for a more consistent cross-platform development experience. Improvements to Swift Package Manager in Version 6.3 include prebuilt Swift Syntax dependencies for macro libraries, flexible inherited documentation, and discoverable package traits.
On the Android front, the Swift SDK for Android enables development of native Android programs in Swift and lets existing Swift packages be updated to support building for Android. Developers can also use Swift Java and Swift Java JNI Core to integrate Swift code into existing Android applications written in Kotlin or Java.
Also in Swift 6.3:
- Module selectors are being introduced to specify which imported module Swift should look in for an API used in code.
- Embedded Swift has improvements ranging from enhanced C interoperability and better debugging support to meaningful steps toward a complete linkage model.
- For the core library, Swift Testing has improvements for areas including warning issues, test cancellation, and image attachments.
- Experimental capabilities are added to the DocC documentation compiler for markdown output, per-page static HTML content, and code block annotations.
- For performance control for library APIs, attributes were introduced that give library authors finer-grained control over compiler optimizations for clients of APIs.
A data trust scoring framework for reliable and responsible AI systems 26 Mar 2026, 9:00 am
Digital transformation today is more than just automating tasks or speeding up calculations. It’s reshaping how we make decisions. People used to rely on their own experience and negotiation skills, but now algorithms are often taking over. While this shift improves efficiency and scale, it also introduces a critical challenge: managing knowledge reliably across automated decision systems. If these systems end up using data that isn’t accurate, balanced or well-organized, mistakes and inequality can spread instead of smart solutions.
Artificial intelligence is only as good as the data it gets and the goals it’s built to reach. To create AI that people really trust, we need to make sure our data is reliable and fair. That’s why a data trust scoring framework matters. It helps turn ideas about fairness and responsibility into clear ratings for the data sets that power AI systems.
From human trust to algorithmic reliance
Trust is often viewed as a personal bond, where one person depends on another’s abilities, goodwill and honesty. When trust is broken in relationships, it feels like betrayal rather than just disappointment, because trust carries deeper expectations.
When considering AI, the situation becomes more complex. Many people attempt to apply human concepts of trust to machines, but this proves challenging. Skills can be assessed through accuracy, while safety measures substitute for goodwill. Integrity is more difficult to evaluate since machines lack moral judgement, so attention turns toward transparency and fairness within these systems. Recent studies recommend viewing trustworthy AI in social terms, considering its benefits for institutions instead of just focusing on the technology itself.
A practical strategy is to distinguish reliance from trust. Reliance involves expecting a system to perform based on evidence and previous results. True trust should be reserved for individuals and organizations capable of accepting responsibility. Therefore, data trust scoring ought to communicate clearly what AI systems are able and unable to accomplish, which helps users rely on them with justified confidence.
Mapping human trust attributes to data and models
If traditional trust is grounded in ability, benevolence and integrity, those ideas can be translated into an algorithmic setting as follows:
- Ability becomes technical performance and robustness. How accurate is the model on representative data, and how resilient is it under distribution shift or adversarial manipulation?
- Benevolence becomes alignment with human safety, rights and organizational purpose. Does the system’s behavior track the values it is supposed to embody, rather than merely its loss function?
- Integrity becomes process transparency, procedural fairness and traceability. Can one reconstruct how data was collected, processed and used? Can one explain what the model is doing in ways that are meaningful to affected stakeholders?
These translations are not perfect, but they create a bridge between relational trust and system level governance. They also motivate a more fine-grained view of dataset fitness, which is where the seven-dimensional taxonomy enters.
A seven-dimensional taxonomy of dataset fitness
The data trust scoring framework rates datasets across seven areas, using clear rubrics and producing a composite score for easier understanding:
- Accuracy: Checks if data matches true events, focusing on correct labels and avoiding systematic errors. Inaccurate labels can mislead models at scale.
- Completeness: Looks for missing data or gaps. Incomplete datasets, such as missing transaction records, skew model outcomes and risk estimates.
- Freshness: Assesses if data is up to date. Old data can misrepresent current trends, so this dimension highlights the importance of recent information.
- Bias Risk: Flags built-in prejudices, from sampling bias to historical discrimination. This ensures fairness is addressed from the start, not as an afterthought.
- Traceability: Focuses on clear records from data collection to final use. Without tracking, it’s hard to analyze failures or make corrections.
- Compliance: Evaluates alignment with regulatory and policy requirements, including privacy obligations under regimes such as GDPR, sector-specific mandates and emerging AI standards. The NIST AI Risk Management Framework has become a widely referenced guide for mapping, measuring and managing AI risks, while the EU AI Act is moving toward legally enforceable obligations for data quality and transparency in high-risk systems.
- Contextual Clarity: Assesses how well the dataset’s scope, limitations and intended uses are documented. Developers need enough metadata and narrative context to understand where the data is reliable and where it is not. This dimension guards against the silent repurposing of data in settings for which it was never appropriate.
Each dimension is scored, normalized and then combined into an overall trust score. One common aggregation formula is:

Trust_Score = Σ (Weight_i × Dimension_Score_i), for i = 1 … 7

where Dimension_Score_i is the normalized score for each of the seven dimensions, and Weight_i is the importance factor derived from stakeholder analysis.
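A minimal sketch of this aggregation, assuming each dimension score is normalized to [0, 1] and the weights sum to 1; the weight values and scores below are hypothetical, not part of the framework:

```java
public class TrustScore {
    // Hypothetical stakeholder-derived weights for the seven dimensions,
    // in order: accuracy, completeness, freshness, bias risk,
    // traceability, compliance, contextual clarity. They sum to 1.0.
    static final double[] WEIGHTS = {0.20, 0.15, 0.10, 0.20, 0.10, 0.15, 0.10};

    // Weighted sum of normalized dimension scores, each in [0, 1].
    static double compositeScore(double[] dimensionScores) {
        double score = 0.0;
        for (int i = 0; i < WEIGHTS.length; i++) {
            score += WEIGHTS[i] * dimensionScores[i];
        }
        return score;
    }

    public static void main(String[] args) {
        // Example ratings for a single dataset (made up for illustration).
        double[] scores = {0.9, 0.8, 0.7, 0.6, 0.9, 1.0, 0.5};
        System.out.printf("Composite trust score: %.3f%n",
                compositeScore(scores));
    }
}
```

Because the weights are normalized, the composite score stays in [0, 1], which makes scores comparable across datasets.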
Semantic integrity and generative AI
Traditional data quality principles were developed with structured data in mind. Large Language Models and other generative systems challenge these assumptions. They are trained on massive, heterogeneous corpora, yet can generate outputs that look fluent while being factually or logically incorrect.
To address this, the framework introduces semantic integrity constraints. These are declarative rules that extend classical database integrity constraints into the semantic domain. At a high level, they fall into two broad categories:
- Grounding constraints, which require that generated content be consistent with authoritative sources. This can be implemented through retrieval augmented generation, constrained decoding or post hoc validation against trusted knowledge bases.
- Soundness constraints, which evaluate whether the model’s reasoning is logically coherent. This is particularly relevant when LLMs are used to generate explanations, summaries of complex evidence or structured outputs such as JSON objects and code.
Metrics like SEMSCORE, which leverage neural embeddings to approximate human judgments of semantic similarity, and more structurally aware measures such as STED, which balance semantic flexibility against syntactic precision, offer partial but useful tools for quantifying semantic integrity in practice.
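These embedding-based metrics ultimately reduce to comparing vectors. The sketch below illustrates only that underlying idea, cosine similarity between two embedding vectors; it is not the actual SEMSCORE or STED implementation, and the vectors are made up:

```java
public class SemanticSimilarity {
    // Cosine similarity between two embedding vectors of equal length.
    static double cosine(double[] a, double[] b) {
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    public static void main(String[] args) {
        // Toy "embeddings"; real systems would obtain these from an encoder.
        double[] generated = {0.2, 0.7, 0.1};
        double[] reference = {0.25, 0.65, 0.05};
        System.out.printf("Similarity: %.3f%n", cosine(generated, reference));
    }
}
```

In practice the two vectors would come from a sentence encoder applied to a generated output and an authoritative reference, with a threshold on the similarity acting as a crude grounding check.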
Privacy preserving computation and mathematical trust
A key component of data trust is the protection of individual privacy. Traditional anonymization methods have proven vulnerable to reidentification attacks, especially when datasets are linked or auxiliary information is available. Differential privacy offers a more rigorous alternative: the core idea is to limit how much influence any single individual can have on the output of a computation.
Formally, for two datasets D1 and D2 that differ in exactly one record, and for a randomized mechanism K, epsilon differential privacy requires that for every possible output set S:

Pr[K(D1) ∈ S] ≤ exp(ε) × Pr[K(D2) ∈ S]

The parameter epsilon quantifies the privacy loss. Smaller values mean stronger privacy guarantees, but they also require more noise to be injected into the computation, which can reduce utility.
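A standard way to satisfy epsilon differential privacy for numeric queries is the Laplace mechanism, which adds noise with scale sensitivity/epsilon. The sketch below uses inverse transform sampling; the query and parameter values are illustrative:

```java
import java.util.Random;

public class LaplaceMechanism {
    // Noise scale for the Laplace mechanism: sensitivity / epsilon.
    static double scale(double sensitivity, double epsilon) {
        return sensitivity / epsilon;
    }

    // Sample Laplace(0, b) noise via inverse transform sampling.
    static double laplaceNoise(double b, Random rng) {
        double u = rng.nextDouble() - 0.5; // uniform in (-0.5, 0.5)
        return -b * Math.signum(u) * Math.log(1 - 2 * Math.abs(u));
    }

    public static void main(String[] args) {
        double trueCount = 1_000;   // e.g., a count query over a dataset
        double sensitivity = 1.0;   // one record changes a count by at most 1
        double epsilon = 0.5;       // smaller epsilon => stronger privacy
        double noisy = trueCount
                + laplaceNoise(scale(sensitivity, epsilon), new Random());
        System.out.println("Noisy count: " + noisy);
    }
}
```

Note the trade-off stated above is visible in the code: halving epsilon doubles the noise scale, strengthening privacy at the cost of utility.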
K-anonymity provides a more classical framework. It demands that each record in a released dataset be indistinguishable from at least K − 1 others with respect to a set of quasi-identifiers. While K-anonymity is vulnerable to various attacks if used alone, it remains useful when combined with additional safeguards, especially for generating synthetic datasets that preserve statistical properties while reducing the risk of reidentification.
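Checking K-anonymity reduces to counting how many records share each combination of quasi-identifier values. A minimal sketch, with made-up records and quasi-identifiers flattened into strings for simplicity:

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class KAnonymityCheck {
    // Returns true if every quasi-identifier combination appears
    // in at least k records, i.e., the release is k-anonymous.
    static boolean isKAnonymous(List<String> quasiIdentifiers, int k) {
        Map<String, Integer> counts = new HashMap<>();
        for (String qi : quasiIdentifiers) {
            counts.merge(qi, 1, Integer::sum);
        }
        return counts.values().stream().allMatch(c -> c >= k);
    }

    public static void main(String[] args) {
        // Hypothetical records, quasi-identifiers flattened to
        // "ageRange|zipPrefix" strings.
        List<String> records = List.of(
                "30-39|021", "30-39|021", "40-49|100", "40-49|100");
        System.out.println("2-anonymous: " + isKAnonymous(records, 2));
    }
}
```

Real implementations generalize or suppress quasi-identifier values until every equivalence class reaches size K; this sketch only performs the verification step.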
In the trust scoring framework, privacy preserving techniques contribute directly to the compliance and traceability dimensions and indirectly to bias and contextual clarity.
Regulatory alignment and operational guardrails
Data trust cannot be considered in isolation from the regulatory environment. Organizations deploying AI systems are increasingly expected to demonstrate not just that their models perform well, but that they manage risk responsibly across the entire lifecycle.
The NIST AI RMF offers a voluntary, but influential, structure for doing this. It organizes AI risk management into four functions: govern, map, measure and manage. The EU AI Act, by contrast, is a binding legal instrument. It classifies AI applications by risk level and imposes specific obligations on high-risk systems, including documentation of data quality, transparency measures and post-deployment monitoring. Some proposed implementations even contemplate minimum transparency index thresholds for models that affect fundamental rights.
A data trust scoring framework fits naturally into this landscape. It provides a concise, quantifiable summary of data fitness that can be linked to governance gates, deployment approvals and audit processes.
Operationalizing trust through KPIs and model cards
For a trust scoring framework to matter, it must move beyond design documents and into daily practice. That means integrating it with key performance indicators and the tools that teams already use.
Relevant KPIs include:
- Bias detection and mitigation rates, tracking both disparities discovered and time to remediation.
- Model drift detection times, measuring how quickly significant performance degradations are identified.
- Explanation coverage, estimating the percentage of model outputs for which meaningful explanations can be generated.
- Audit readiness scores, assessing the completeness and accessibility of documentation, lineage and decision logs.
Model cards provide a complementary artifact. As described in “Model Cards for Model Reporting,” they offer a structured template for documenting a model’s purpose, data foundations, design choices, limitations and monitoring plans. When every production model is accompanied by a model card and a current data trust score, AI governance shifts from retrospective justification to continuous, evidence-based stewardship.
Trust as a quantitative and institutional practice
The movement toward reliable and responsible AI is not a single project with a clear end state. It is an ongoing process of refinement in which technical capability, regulatory expectation and social norms evolve together. The data trust scoring framework is one contribution to that process. While it cannot remove difficult value judgments or eliminate ambiguity, it does make those judgments explicit, measurable and open to revision over time.
As AI systems become more autonomous and more deeply embedded in critical workflows, the question will not only be how powerful they are, but how well we can justify relying on them. Organizations that treat data trust as a quantifiable, governable property, rather than a vague aspiration, will be better positioned to answer that question convincingly to regulators, customers and their own staff. In the end, the durability of AI driven systems will depend less on raw model sophistication and more on the integrity of the data practices that sustain them.
This article is published as part of the Foundry Expert Contributor Network.
Rethinking VM data protection in cloud-native environments 26 Mar 2026, 9:00 am
After many years of being relatively static, the enterprise virtualization landscape is shifting under our feet. As organizations reassess their reliance on traditional hypervisors, driven by cost, licensing disruption, or a broader push toward modernization, many are exploring Kubernetes as the natural consolidation point for both containerized and virtual machine workloads. This is often referred to as “cloud-native virtualization” or “Kubernetes-native virtualization,” and it is enabled by KubeVirt and Containerized Data Importer (CDI), two open-source projects that together bring VMs into Kubernetes as first-class citizens. But running VM workloads on Kubernetes, a platform designed for distributed container orchestration, forces a fundamental rethinking of how those workloads are protected.
Many organizations still think of their Kubernetes environments as being stateless and not requiring backup. Whether or not this was true before (more often than not it wasn’t), it certainly isn’t true once VMs enter the picture.
VM data protection for traditional hypervisors has been mature for many years. It benefits from predictable methods and constructs, consistent snapshot semantics, and well-established approaches for application consistency and recovery. But things are different with KubeVirt. KubeVirt inherits Kubernetes’ management model, which is built around declarative management, resources, controllers, loosely coupled components, and pluggable storage drivers. Understanding how these architectural decisions reshape data protection is critical for anyone designing backup or disaster-recovery (DR) solutions for Kubernetes-native virtualization.
VMs defined by Kubernetes resources
The first big difference is in representation. In traditional virtualization systems, a VM is defined by an object or set of objects tightly controlled by the hypervisor. Its configuration, disk files, snapshots, and runtime state are all stored in a platform-specific way, enabling consistent backup semantics across different environments.
KubeVirt relies on the Kubernetes model instead. Virtual machines are defined using Kubernetes custom resources such as VirtualMachine, VirtualMachineInstance, and (with CDI) DataVolume, which are stored in the Kubernetes control plane. Their configuration is thus described declaratively in YAML, and their life cycle is managed by KubeVirt’s controllers. A VM definition in KubeVirt is therefore not a bundle of hypervisor objects, but a collection of Kubernetes resources describing compute, storage, networking, initialization, and storage volumes.
A generation of Kubernetes administrators has come to appreciate Kubernetes’ open, declarative model and YAML-based definitions, but for VM administrators it may be a bit confusing at first. More importantly for our purposes, the way this critical metadata is backed up and restored is entirely different. You’ll need Kubernetes-specific tools rather than the tools you’ve been using, and those tools will require at least a basic understanding of the Kubernetes control plane.
Storage and snapshot behavior governed by CSI
Storage is another area where virtualization teams might encounter architectural whiplash. Storage systems in traditional enterprise VM environments are largely managed through plugins, the prime example being VMware vCenter storage plugins. These plugins abstract important storage operations such as provisioning, health monitoring, and snapshot control. VMs running under Kubernetes rely, through CDI and Kubernetes persistent volumes, on drivers conforming to Kubernetes’ Container Storage Interface (CSI) for accessing storage. You can think of these as somewhat analogous to the storage plugins, but at present they are generally less capable and less uniform in the features they provide.
From a data protection perspective, this leads to a few important points. First, different CSI drivers support different degrees of snapshot capability, with some providing no snapshot capability at all. The behavior of a KubeVirt VM backup that uses snapshots is therefore determined by the StorageClass and associated provisioner (CSI driver) backing its Persistent Volume Claims (PVCs).
Second, multi-disk VMs can make things more complicated. A KubeVirt VM may include multiple disks that need to be snapshotted together for consistency. KubeVirt’s snapshot mechanism helps orchestrate consistency across the PVCs for these volumes, but its success can depend on the presence of the QEMU guest agent (to freeze VM file systems), and the underlying CSI driver’s snapshot capabilities. True atomic consistency across multiple disks without file-system freezing (fs-freeze command) requires Volume Group Snapshot capabilities, which are still maturing in Kubernetes.
Third, designing reliable VM protection requires understanding the capabilities, limitations, and performance characteristics of each StorageClass.
Finally, cross-cluster recovery raises additional challenges. Unlike traditional hypervisor environments where datastores are often standardized or abstracted, different Kubernetes clusters frequently have different StorageClasses and underlying CSI drivers. Restoring a VM into a new cluster may require remapping storage classes or modifying PVC parameters. Recovery workflows must therefore be prepared to handle heterogeneous storage rather than relying on uniform hypervisor primitives.
VM snapshots with KubeVirt
When a VM snapshot is taken in VMware vSphere (for example), the operation produces a set of delta files capturing VM disk state, usually assisted by optional guest quiescing. It can also capture VM memory state.
KubeVirt treats VM snapshots differently. A KubeVirt snapshot consists of a captured copy of the VM spec and a set of underlying volume snapshots that capture the state of each associated PVC. KubeVirt uses the CSI driver’s snapshot functionality for capturing storage state in this way. VMs need to use DataVolumes or PVCs backed by a StorageClass that supports snapshots, and snapshots must be configured properly for those StorageClasses.
KubeVirt VM snapshots are file-system consistent when using the QEMU guest agent, and crash consistent otherwise. Importantly, KubeVirt snapshots do not preserve VM memory state, nor do they provide application consistency. Application consistency (e.g., for databases) often requires additional custom application hooks.
When restoring a VM snapshot, KubeVirt reconstructs the VM by applying the stored spec and restoring/binding volume snapshots to newly created PVCs.
This design aligns with Kubernetes’ broader philosophy of operation, but it brings new engineering considerations. Application consistency often requires explicit hooks, or coordination with in-guest processes. Disaster recovery may require coordinating the restores of multiple resources rather than a single hypervisor action.
Also note that some common Kubernetes backup and DR tools such as Velero and CloudCasa do not use KubeVirt VM snapshots at all, but instead directly back up KubeVirt custom resources and orchestrate their own persistent volume snapshots using the CSI snapshot interface. This approach is better when the intention is to back the snapshots up off-cluster and allow restores to other clusters.
KubeVirt does more than just offer an alternative runtime environment for VMs. It promises the nirvana of a unified compute plane for both VMs and containerized workloads. But it also reshapes the whole model of VM life cycle management and protection. For architects and platform engineers, the transition requires new assumptions, new skills, and new tools, and this often includes new Kubernetes-specific protection and DR solutions.
—
New Tech Forum provides a venue for technology leaders—including vendors and other outside contributors—to explore and discuss emerging enterprise technology in unprecedented depth and breadth. The selection is subjective, based on our pick of the technologies we believe to be important and of greatest interest to InfoWorld readers. InfoWorld does not accept marketing collateral for publication and reserves the right to edit all contributed content. Send all inquiries to doug_dineley@foundryco.com.
Basic and advanced Java serialization 26 Mar 2026, 9:00 am
Serialization is the process of converting a Java object into a sequence of bytes so that the object can be written to disk, sent over a network, or stored outside of memory. Later, the Java virtual machine (JVM) reads those bytes and reconstructs the original object. This reverse process is called deserialization.
Under normal circumstances, objects exist only in memory and disappear when a program terminates. Serialization allows an object’s state to outlive the program that created it, or to be transferred between different execution contexts.
The Serializable interface
Java does not allow every object to be serialized. A class must explicitly opt in by implementing the Serializable interface, as shown here:
public class Challenger implements Serializable {
private Long id;
private String name;
public Challenger(Long id, String name) {
this.id = id;
this.name = name;
}
}
Serializable (java.io.Serializable) is a marker interface, meaning that it does not define any methods. By implementing it, the class signals to the JVM that its instances may be converted into bytes. If Java attempts to serialize an object whose class does not implement the Serializable interface, it fails at runtime with a NotSerializableException. There is no compile‑time warning.
Serialization traverses the entire object graph. Every non‑transient field must refer to an object that is itself serializable. If any referenced object cannot be serialized, the entire operation fails. All primitive wrapper types (Integer, Long, Boolean, and others), as well as String, implement Serializable, which is why they can be safely used in serialized object graphs.
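To see this failure mode concretely, here is a minimal sketch (the Session class and field names are my own, not from the article): a serializable class holding a Thread, which does not implement Serializable, fails the entire write at runtime.

```java
import java.io.*;

public class NonSerializableDemo {
    static class Session implements Serializable {
        private static final long serialVersionUID = 1L;
        String user = "Duke";
        Thread worker = new Thread(); // Thread is not Serializable
    }

    // Returns true when writing the graph fails with NotSerializableException.
    static boolean failsToSerialize() throws IOException {
        try (ObjectOutputStream out =
                 new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(new Session());
            return false; // would mean serialization unexpectedly succeeded
        } catch (NotSerializableException e) {
            return true;  // the whole operation failed at runtime
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(failsToSerialize()); // true
    }
}
```

Note that the code compiles without complaint; the problem only surfaces when writeObject runs.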
Limits of Java serialization
Serialization stores instance state and preserves reference identity within the object graph (shared references and cycles). It does not preserve behavior or JVM identity across runs. Remember the following guidelines when using serialization:
- Instance fields are written to the byte stream.
- Behavior is not serialized.
- Static fields are not serialized.
- Object identity is preserved.
Also note: If two fields reference the same object before serialization, that relationship is preserved after deserialization.
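The identity guarantee can be verified with a short round trip. This sketch (class and field names are my own) gives two fields the same array and checks that they are still one object after deserialization:

```java
import java.io.*;

public class SharedReferenceDemo implements Serializable {
    private static final long serialVersionUID = 1L;
    String[] left;
    String[] right;

    // Serializes an object whose two fields point at the same array,
    // then checks that they are still one object after deserialization.
    static boolean identityPreserved() throws IOException, ClassNotFoundException {
        SharedReferenceDemo original = new SharedReferenceDemo();
        original.left = new String[] {"Duke"};
        original.right = original.left; // shared reference

        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(original);
        }
        try (ObjectInputStream in = new ObjectInputStream(
                new ByteArrayInputStream(bytes.toByteArray()))) {
            SharedReferenceDemo copy = (SharedReferenceDemo) in.readObject();
            return copy.left == copy.right; // true: one array, two references
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(identityPreserved()); // true
    }
}
```

ObjectOutputStream achieves this by writing back-references into the stream instead of duplicating an object it has already written.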
A Java serialization example
As an example of serialization, consider the following Java Challengers player:
Challenger duke = new Challenger(1L, "Duke");
Let’s walk through what happens when this object is written to a file and read back.
1. Writing the object
First, Java verifies that the class implements Serializable, converts the object’s field values into bytes, and writes them to the file:
try (ObjectOutputStream out = new ObjectOutputStream(new FileOutputStream("duke.ser"))) {
out.writeObject(duke);
}
2. Reading the object back
During deserialization, the serializable class’s own constructor is not called. The JVM creates the object through an internal mechanism and assigns field values directly from the serialized data. However, if the class extends a nonserializable superclass, that superclass’s no‑argument constructor will run:
try (ObjectInputStream in = new ObjectInputStream(new FileInputStream("duke.ser"))) {
Challenger duke = (Challenger) in.readObject();
}
This behavior often surprises developers the first time they debug a deserialized object, because constructor-established invariants can be silently broken. This distinction is also important in class hierarchies, which we’ll discuss later in the article.
Serialization callbacks
Because the JVM controls object creation and field restoration during serialization, it also provides hooks that allow a class to customize how its state is written and restored. A class can define two private methods with these exact signatures:
private void writeObject(ObjectOutputStream out) throws IOException
private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException
These methods are not called directly by application code. They are JVM callbacks invoked automatically during serialization and deserialization. Calling them manually results in a NotActiveException, because they require an active serialization context managed by the JVM:
import java.io.*;
public class OrderSensitiveExample implements Serializable {
private static final long serialVersionUID = 1L;
void main() throws IOException, ClassNotFoundException {
OrderSensitiveExample example = new OrderSensitiveExample();
// Serialization: triggers writeObject(...)
try (ObjectOutputStream out = new ObjectOutputStream(new FileOutputStream("example.ser"))) {
out.writeObject(example);
}
// Deserialization: triggers readObject(...)
try (ObjectInputStream in = new ObjectInputStream(new FileInputStream("example.ser"))) {
in.readObject();
}
}
private void writeObject(ObjectOutputStream out) throws IOException {
out.defaultWriteObject();
out.writeObject("Duke");
out.writeObject("Juggy");
}
private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException {
in.defaultReadObject();
String first = (String) in.readObject();
String second = (String) in.readObject();
System.out.println(first + " " + second);
}
}
As noted above, writeObject and readObject are invoked by the JVM via reflection, as part of ObjectOutputStream.writeObject and ObjectInputStream.readObject, and cannot be meaningfully called by application code.
Using serialVersionUID for version control
Every serializable class has a version identifier called serialVersionUID. This value is written into the serialized data. During deserialization, the JVM compares the value stored in the serialized data with the value declared in the current version of the class. If they differ, deserialization fails with an InvalidClassException.
If you do not declare a serialVersionUID, Java generates one automatically based on the class structure. Adding a field, removing a method, or even recompiling with a different compiler can change it and break compatibility. This is why relying on the generated value is usually a mistake.
Choosing a serialVersionUID
For new classes, it is common and correct to start with the following declaration:
private static final long serialVersionUID = 1L;
IDEs often suggest long, generated values that mirror the JVM’s default computation. While technically correct, those values are frequently misunderstood. They do not distinguish objects, prevent name collisions, or identify individual instances. All objects of the same class share the same serialVersionUID.
The purpose of this value is to identify the class definition, not the object. It acts as a compatibility check during deserialization, ensuring that the class structure matches the one used when the data was written. This usually becomes a problem only after data has already been serialized and deployed.
The number itself has no special meaning; Java does not treat 1L differently from any other value. What matters is that the value is explicit, stable, and changed intentionally.
When to change serialVersionUID
You should change serialVersionUID when a class change causes previously serialized field values to have a different meaning for the current code.
Typical reasons include removing or renaming a serialized field, changing the type of a serialized field, changing the meaning of stored values such as status codes, introducing new constraints that old data may violate, or changing the class hierarchy or custom serialization logic.
In these cases, deserialization may still succeed, but the resulting object would represent an incorrect logical state. Changing the serialVersionUID ensures such data is rejected instead of silently misused.
If changes only add behavior or optional data, such as adding new fields or methods, the value usually does not need to change.
Excluding fields with transient
Some fields should not be serialized, such as passwords, cached values, or temporary data. In these cases, you can use the transient keyword:
public class ChallengerAccount implements Serializable {
private static final long serialVersionUID = 1L;
private String username;
private transient String password;
public ChallengerAccount(String username, String password) {
this.username = username;
this.password = password;
}
}
A field marked transient is skipped during serialization. When the object is deserialized, the field is set to its default value: null for object references, and zero or false for primitives.
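A round trip of the ChallengerAccount class above (reproduced here as a nested class so the sketch is self-contained; the field values are my own) shows the effect:

```java
import java.io.*;

public class TransientDemo {
    static class ChallengerAccount implements Serializable {
        private static final long serialVersionUID = 1L;
        String username;
        transient String password;
        ChallengerAccount(String username, String password) {
            this.username = username;
            this.password = password;
        }
    }

    // Serializes an account and returns the deserialized copy.
    static ChallengerAccount roundTrip() throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(new ChallengerAccount("duke", "s3cret"));
        }
        try (ObjectInputStream in = new ObjectInputStream(
                new ByteArrayInputStream(bytes.toByteArray()))) {
            return (ChallengerAccount) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        ChallengerAccount copy = roundTrip();
        System.out.println(copy.username); // duke
        System.out.println(copy.password); // null -- transient field skipped
    }
}
```

The password never reaches the byte stream, which is exactly the point for sensitive data.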
Serialization and inheritance
Serialization works across class hierarchies, but there are strict rules.
If a superclass does not implement Serializable, its fields are not serialized, and it must provide a no‑argument constructor. This failure tends to surface late, often after a seemingly harmless refactor of a base class:
class Person {
String name;
public Person() { this.name = "unknown"; }
}
class RankedChallenger extends Person implements Serializable {
private static final long serialVersionUID = 1L;
int ranking;
}
During deserialization, the superclass constructor runs and initializes its fields, while only the subclass fields are restored from the serialized data. If the no‑argument constructor is missing, deserialization fails at runtime.
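This behavior can be observed directly. In the sketch below (nested copies of the classes above, with field values of my own choosing), the superclass field is reset by Person’s constructor while the subclass field is restored from the stream:

```java
import java.io.*;

public class InheritanceDemo {
    static class Person {
        String name;
        Person() { this.name = "unknown"; }
    }

    static class RankedChallenger extends Person implements Serializable {
        private static final long serialVersionUID = 1L;
        int ranking;
    }

    // Round-trips a RankedChallenger whose superclass field was modified.
    static RankedChallenger roundTrip() throws IOException, ClassNotFoundException {
        RankedChallenger original = new RankedChallenger();
        original.name = "Duke"; // superclass field: NOT serialized
        original.ranking = 7;   // subclass field: serialized

        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(original);
        }
        try (ObjectInputStream in = new ObjectInputStream(
                new ByteArrayInputStream(bytes.toByteArray()))) {
            return (RankedChallenger) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        RankedChallenger copy = roundTrip();
        System.out.println(copy.name);    // unknown -- reset by Person()
        System.out.println(copy.ranking); // 7 -- restored from the stream
    }
}
```

The "Duke" assignment is silently lost because Person is outside the serializable portion of the hierarchy.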
Custom serialization with sensitive data
Revisiting the ChallengerAccount example we looked at earlier, the password field was marked as transient, so it is not included in default serialization and will be null after deserialization. In controlled environments, this behavior can be overridden by defining custom serialization logic.
In the example below, the writeObject and readObject methods are shown inline for clarity, but they must be declared as private methods inside the serializable class. Here’s how the password is explicitly written and then restored:
private void writeObject(ObjectOutputStream out) throws IOException {
out.defaultWriteObject();
out.writeObject(password);
}
private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException {
in.defaultReadObject();
this.password = (String) in.readObject();
}
This is considered custom serialization because the class explicitly writes and reads part of its state instead of relying entirely on the JVM’s default mechanism. The call to readObject() does not read a field by name. Java serialization is a linear byte stream, not a keyed structure.
The value returned here is simply the next object in the stream, which happens to be the password because it was written immediately after the default object data. For this reason, values must be read in the exact order they were written. Changing that order will corrupt the stream or cause deserialization to fail.
Transforming data during serialization
Custom serialization can also transform data before writing it. This is useful for derived values, normalization, or compact representations:
public class ChallengerProfile implements Serializable {
private static final long serialVersionUID = 1L;
private String username;
private transient LocalDate joinDate;
public ChallengerProfile(String username, LocalDate joinDate) {
this.username = username;
this.joinDate = joinDate;
}
}
The joinDate field is marked as transient, so it is not serialized by default. Although LocalDate is itself Serializable, marking it as transient and writing it as a single long demonstrates how custom serialization can transform a field into a different representation:
private void writeObject(ObjectOutputStream out) throws IOException {
out.defaultWriteObject();
out.writeLong(joinDate.toEpochDay());
}
During deserialization, the epoch day is converted back into a LocalDate:
private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException {
in.defaultReadObject();
this.joinDate = LocalDate.ofEpochDay(in.readLong());
}
The important point is not the specific transformation, but that writeObject and readObject must apply inverse transformations and read values in the exact order they were written. Here, toEpochDay and ofEpochDay are natural inverses: One converts a date to a number, and the other converts it back.
Restoring derived fields
Some fields are derived from others and should not be serialized:
public class ChallengerStats implements Serializable {
private static final long serialVersionUID = 1L;
private int wins;
private int losses;
private transient int score;
public ChallengerStats(int wins, int losses) {
this.wins = wins;
this.losses = losses;
this.score = calculateScore();
}
private int calculateScore() {
return wins * 3 - losses;
}
}
After deserialization, score will be zero. It can be restored as follows:
private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException {
in.defaultReadObject();
this.score = calculateScore();
}
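Putting the pieces together (the class is reproduced as a nested type so the sketch compiles standalone; the win/loss values are my own), the recomputed score survives the round trip:

```java
import java.io.*;

public class ChallengerStatsDemo {
    static class ChallengerStats implements Serializable {
        private static final long serialVersionUID = 1L;
        int wins;
        int losses;
        transient int score;

        ChallengerStats(int wins, int losses) {
            this.wins = wins;
            this.losses = losses;
            this.score = calculateScore();
        }

        int calculateScore() { return wins * 3 - losses; }

        private void readObject(ObjectInputStream in)
                throws IOException, ClassNotFoundException {
            in.defaultReadObject();
            this.score = calculateScore(); // restore the derived field
        }
    }

    static int scoreAfterRoundTrip(int wins, int losses)
            throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(new ChallengerStats(wins, losses));
        }
        try (ObjectInputStream in = new ObjectInputStream(
                new ByteArrayInputStream(bytes.toByteArray()))) {
            return ((ChallengerStats) in.readObject()).score;
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(scoreAfterRoundTrip(4, 2)); // 10, not zero
    }
}
```

Without the readObject override, the transient score would come back as zero.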
Why order matters in custom serialization logic
When writing custom serialization logic, the order in which values are written must exactly match the order in which they are read:
private void writeObject(ObjectOutputStream out) throws IOException {
out.defaultWriteObject();
out.writeInt(42);
out.writeUTF("Duke");
out.writeLong(1_000_000L);
}
private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException {
in.defaultReadObject();
int level = in.readInt();
String name = in.readUTF();
long score = in.readLong();
}
Because the stream is not keyed by field name, each read call simply consumes the next value in sequence. If readUTF were called before readInt, the stream would attempt to interpret the bytes of an integer as a UTF string, resulting in corrupted data or a deserialization failure. This is one of the main reasons custom serialization should be used sparingly. A useful mental model is to think of serialization as a tape recorder: Deserialization must replay the tape in exactly the order it was recorded.
Why serialization is risky
Serialization is fragile when classes change. Even small modifications can make previously stored data unreadable.
Deserializing untrusted data is particularly dangerous. Deserialization can trigger unexpected code paths on attacker‑controlled object graphs, and this has been the source of real‑world security vulnerabilities.
For these reasons, Java serialization should be used only in controlled environments.
When serialization makes sense
Java serialization is suitable only for a narrow set of use cases where class versions and trust boundaries are tightly controlled.
| Use case | Recommendation |
| Internal caching | Java serialization works well when data is short-lived and controlled by the same application. |
| Session storage | Acceptable with care, provided all participating systems run compatible class versions. |
| Long-term storage | Risky: Even small class changes can make old data unreadable. |
| Public APIs | Use JSON. It is language-agnostic, stable across versions, and widely supported. Java serialization exposes implementation details and is fragile. |
| System-to-system communication | Prefer JSON or schema-based formats such as Protocol Buffers or Avro. |
| Cross-language communication | Avoid Java serialization entirely. It is Java-specific and not interoperable with other platforms. |
Rule of thumb: If the data must survive class evolution, cross trust boundaries, or be consumed by non‑Java systems, prefer JSON or a schema‑based format over Java serialization.
Advanced serialization techniques
The mechanisms we’ve covered so far handle most practical scenarios, but Java serialization has a few additional tools for solving problems that default serialization cannot.
Preserving singletons with readResolve
Deserialization creates a new object. For classes that enforce a single instance, this breaks the guarantee silently:
public class GameConfig implements Serializable {
private static final long serialVersionUID = 1L;
private static final GameConfig INSTANCE = new GameConfig();
private GameConfig() {}
public static GameConfig getInstance() {
return INSTANCE;
}
private Object readResolve() throws ObjectStreamException {
return INSTANCE;
}
}
Without readResolve, deserializing a GameConfig would produce a second instance, and any identity check using == would fail. The method intercepts the deserialized object and substitutes the canonical one. The deserialized copy is discarded.
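A round trip makes the guarantee visible. With readResolve in place, the deserialized reference is the canonical instance (the class is reproduced as a nested type so the sketch is self-contained):

```java
import java.io.*;

public class SingletonDemo {
    static class GameConfig implements Serializable {
        private static final long serialVersionUID = 1L;
        private static final GameConfig INSTANCE = new GameConfig();
        private GameConfig() {}
        static GameConfig getInstance() { return INSTANCE; }
        private Object readResolve() throws ObjectStreamException {
            return INSTANCE; // substitute the canonical instance
        }
    }

    // Serializes the singleton and checks identity after deserialization.
    static boolean stillSameInstance() throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(GameConfig.getInstance());
        }
        try (ObjectInputStream in = new ObjectInputStream(
                new ByteArrayInputStream(bytes.toByteArray()))) {
            GameConfig copy = (GameConfig) in.readObject();
            return copy == GameConfig.getInstance(); // true with readResolve
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(stillSameInstance()); // true
    }
}
```

Deleting the readResolve method from this sketch would make the identity check return false.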
Substituting objects with writeReplace
Whereas readResolve controls what comes out of deserialization, writeReplace controls what goes into serialization. A class can define this method to substitute a different object before any bytes are written.
The two methods are often used together to implement a serialization proxy. One class represents the object’s runtime form, while another represents its serialized form.
In this example, ChallengerWriteReplace plays the role of the “real” object, while ChallengerProxy represents its serialized form:
public class ChallengerProxy implements Serializable {
private static final long serialVersionUID = 1L;
private final long id;
private final String name;
public ChallengerProxy(long id, String name) {
this.id = id;
this.name = name;
}
private Object readResolve() throws ObjectStreamException {
return new ChallengerWriteReplace(id, name);
}
}
class ChallengerWriteReplace implements Serializable {
private static final long serialVersionUID = 1L;
private long id;
private String name;
public ChallengerWriteReplace(long id, String name) {
this.id = id;
this.name = name;
}
private Object writeReplace() throws ObjectStreamException {
return new ChallengerProxy(id, name);
}
}
When a ChallengerWriteReplace instance is serialized, its writeReplace method substitutes it with a lightweight ChallengerProxy. The proxy is the only object that is actually written to the byte stream.
During deserialization, the proxy’s readResolve method reconstructs a new ChallengerWriteReplace instance, and the proxy itself is discarded. The application never observes the proxy object directly.
This technique keeps the serialized form decoupled from the internal structure of ChallengerWriteReplace. As long as the proxy remains stable, the main class can evolve freely without breaking previously serialized data. It also provides a controlled point where invariants can be enforced during reconstruction.
Filtering deserialized classes with ObjectInputFilter
I have explained why deserializing untrusted data is dangerous. Introduced in Java 9, the ObjectInputFilter API gives applications a way to restrict which classes are allowed during deserialization:
ObjectInputFilter filter = ObjectInputFilter.Config.createFilter(
"com.example.model.*;!*"
);
try (ObjectInputStream in = new ObjectInputStream(new FileInputStream("data.ser"))) {
in.setObjectInputFilter(filter); // must be set before readObject()
Object obj = in.readObject();
}
This filter allows only classes under com.example.model and rejects everything else. The pattern syntax supports allowlisting by package, as well as setting limits on array sizes, object graph depth, and total object count.
Java 9 made it possible to set a process-wide filter via ObjectInputFilter.Config.setSerialFilter or the jdk.serialFilter system property, ensuring that no ObjectInputStream would be left unprotected by default. Java 17 extended this further by introducing filter factories (ObjectInputFilter.Config.setSerialFilterFactory), which allow context‑specific filters to be applied per stream rather than relying on a single global policy. If your application deserializes data that crosses a trust boundary, an input filter is not optional; it is the minimum viable defense.
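Here is a minimal sketch of a rejection (the allowlist and the choice of java.util.Date as the rejected class are my own): a serialized Date is refused with an InvalidClassException before any of its state is restored, because the filter permits only java.lang classes.

```java
import java.io.*;
import java.util.Date;

public class FilterDemo {
    // Returns true when the filter rejects the serialized Date.
    static boolean filterRejectsDate() throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(new Date());
        }
        ObjectInputFilter filter =
                ObjectInputFilter.Config.createFilter("java.lang.*;!*");
        try (ObjectInputStream in = new ObjectInputStream(
                new ByteArrayInputStream(bytes.toByteArray()))) {
            in.setObjectInputFilter(filter); // must precede readObject()
            in.readObject();
            return false; // would mean the filter let java.util.Date through
        } catch (InvalidClassException e) {
            return true;  // rejected by the filter, as intended
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(filterRejectsDate()); // true
    }
}
```

The rejection happens at class-resolution time, so no readObject logic of the rejected class ever runs.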
Java records and serialization
Java records can implement Serializable, but they behave differently from ordinary classes in one critical way: During deserialization, the record’s canonical constructor is called. This means any validation logic in the constructor runs on deserialized data, which is a significant safety advantage:
public record ChallengerRecord(Long id, String name) implements Serializable {
public ChallengerRecord {
if (id == null || name == null) {
throw new IllegalArgumentException(
"id and name must not be null");
}
}
}
With a traditional Serializable class, a corrupted or malicious stream could inject null values into fields that the constructor would normally reject. With a record, the constructor acts as a gatekeeper even during deserialization.
Records do not support writeObject, readObject, or serialPersistentFields. Their serialized form is derived entirely from their components, a design decision that intentionally favors predictability and safety over customization.
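One way to see the canonical constructor running is a sketch with a static counter (the counter and class wrapper are my own, not from the article): the record is constructed once via new, and the compact constructor body executes a second time during deserialization.

```java
import java.io.*;
import java.util.concurrent.atomic.AtomicInteger;

public class RecordDemo {
    static final AtomicInteger CONSTRUCTOR_CALLS = new AtomicInteger();

    record ChallengerRecord(Long id, String name) implements Serializable {
        ChallengerRecord { // compact canonical constructor
            if (id == null || name == null) {
                throw new IllegalArgumentException("id and name must not be null");
            }
            CONSTRUCTOR_CALLS.incrementAndGet();
        }
    }

    // Round-trips a record and reports how often the constructor ran.
    static int callsAfterRoundTrip() throws IOException, ClassNotFoundException {
        CONSTRUCTOR_CALLS.set(0);
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(new ChallengerRecord(1L, "Duke"));
        }
        try (ObjectInputStream in = new ObjectInputStream(
                new ByteArrayInputStream(bytes.toByteArray()))) {
            in.readObject();
        }
        return CONSTRUCTOR_CALLS.get();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(callsAfterRoundTrip()); // 2: once via new, once on read
    }
}
```

An ordinary Serializable class put through the same round trip would increment such a counter only once.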
Alternatives to Java serialization
The Externalizable interface is an alternative to Serializable that gives the class complete control over the byte format. A class that implements Externalizable must define writeExternal and readExternal, and must provide a public no‑argument constructor:
public class ChallengerExt implements Externalizable {
private long id;
private String name;
public ChallengerExt() {} // required
public ChallengerExt(long id, String name) {
this.id = id;
this.name = name;
}
@Override
public void writeExternal(ObjectOutput out) throws IOException {
out.writeLong(id);
out.writeUTF(name);
}
@Override
public void readExternal(ObjectInput in) throws IOException {
this.id = in.readLong();
this.name = in.readUTF();
}
}
Unlike Serializable, no field metadata or field values are written automatically. The class descriptor (class name and serialVersionUID) is still written, but the developer is fully responsible for writing and reading all instance state.
Because writeExternal and readExternal work directly with primitives and raw values, fields should use primitive types where possible. Using a wrapper type such as Long with writeLong would throw a NullPointerException if the value were null, since auto‑unboxing cannot handle that case.
This approach can produce more compact output, but the developer is fully responsible for versioning, field ordering, and backward compatibility.
In practice, Externalizable is rarely used in modern Java. When full control over the wire format is needed, most teams choose Protocol Buffers, Avro, or similar schema‑based formats instead.
Conclusion
Java serialization is a low-level JVM mechanism for saving and restoring object state. Known for being powerful but unforgiving, serialization bypasses constructors, assumes stable class definitions, and provides no automatic safety guarantees. Used deliberately in tightly controlled systems, it can be effective. Used casually, it introduces subtle bugs and serious security vulnerabilities. Understanding the trade-offs discussed in this article will help you use serialization correctly and avoid accidental misuse.
Claude Code AI tool getting auto mode 25 Mar 2026, 10:22 pm
Anthropic is fitting its Claude Code AI-powered coding assistant with an auto mode that lets Claude handle permissions on the user’s behalf, with safeguards to monitor actions before they run.
Auto mode was announced March 24; instructions on getting started can be found in the introductory blog post. Currently launching in research preview for Claude Team users, the capability is due to roll out to enterprise and API users in the coming days, according to Anthropic. The company explained that Claude Code’s default permissions are conservative, with every file write and Bash command asking for approval. While this is a safe default, it means users cannot start a large task and walk away.
Some developers bypass permission checks with --dangerously-skip-permissions, but skipping permissions can lead to dangerous and destructive outcomes and should not be used outside of isolated environments. Auto mode offers a middle path: longer tasks run with fewer interruptions, while introducing less risk than skipping all permissions. Before each tool call runs, a classifier reviews it for potentially destructive actions such as mass file deletion, sensitive data exfiltration, or malicious code execution, Anthropic said. Actions deemed safe proceed, while risky ones are blocked, redirecting Claude to take a different approach.
Auto mode reduces risk compared to --dangerously-skip-permissions but does not eliminate it entirely. The classifier may still allow some risky actions: for example, if user intent is ambiguous or if Claude does not have enough context about an environment to know an action might create additional risk. It may also occasionally block benign actions. Anthropic plans to continue to improve the user experience over time.
PyPI warns developers after LiteLLM malware found stealing cloud and CI/CD credentials 25 Mar 2026, 11:13 am
PyPI is warning of possible credential theft from AI applications and developer pipelines after two malicious versions of the widely used Python middleware for large language models, LiteLLM, were briefly published.
“Anyone who has installed and run the project should assume any credentials available to the LiteLLM environment may have been exposed, and revoke/rotate them accordingly,” PyPI said in an advisory that linked the incident to an exploited Trivy dependency from the ongoing TeamPCP supply-chain attack.
According to a Sonatype analysis, the packages embedded a multi-stage payload designed to harvest sensitive data from developer environments, CI/CD pipelines, and cloud configurations, and were live on PyPI for roughly two hours before being taken down.
“Given the package’s three million daily downloads, the compromised LiteLLM could have seen significant exposure during that short time span,” Sonatype researchers said in a blog post. On top of serving as a stealer, the packages were also acting as droppers, enabling follow-on payloads and deeper system compromise.
Three-stage payload built for maximum reach
The compromise affected versions 1.82.7 and 1.82.8. Sonatype’s analysis noted the payload operating in three distinct stages. These included initial execution and data exfiltration, deeper reconnaissance and credential harvesting, and finally persistence with remote control capabilities.
The attack chain relied heavily on obfuscation, with base64-encoded Python code covering up the payload’s tracks. Once executed, the malware collected sensitive data, encrypted it using AES-256-CBC, and then secured the encryption key with an embedded RSA public key before sending everything to attacker-controlled servers.
The disclosure highlights a common approach among today’s attackers. Instead of activating immediately after installation, the malware quietly lingers to map the environment and establish a foothold before pulling credentials from local machines, cloud configs, and automation pipelines.
“It (payload) targets environment variables (including API keys and tokens), SSH Keys, cloud credentials (AWS, GCP, Azure), Kubernetes configs, CI/CD secrets, Docker configs, database credentials, and even cryptocurrency wallets,” said Wiz researchers, who are separately tracking the campaign, in a blog post. “Our data shows that LiteLLM is present in 36% of cloud environments, signifying the potential for widespread impact.”
Wiz also provided a way for its customers to check their environment for exposure via the Wiz Threat Center.
An expanding supply-chain campaign
The LiteLLM incident has been confirmed as part of the rapidly unfolding TeamPCP supply chain campaign that first compromised Trivy.
Trivy, developed by Aqua Security, is a widely used open-source vulnerability scanner designed to identify security issues in container images, file systems, and infrastructure-as-code (IaC) configurations. The ongoing attack, attributed to TeamPCP with reported links to LAPSUS$, involved attackers compromising publishing credentials and injecting credential-stealing code into official releases and GitHub Actions used in CI/CD pipelines.
The Trivy compromise was quickly followed by similar supply chain incidents, with attackers leveraging the same access and tactics to target other developer security tools like KICS and Checkmarx, extending the campaign’s reach across multiple CI/CD ecosystems.
The PyPI advisory tied the LiteLLM incident directly to the Trivy compromise. The malicious packages were uploaded “after an API Token exposure from an exploited Trivy dependency,” it said.
Ben Read, a lead researcher at Wiz, called it a systematic campaign that needs to be monitored for further expansion. “We are seeing a dangerous convergence between supply chain attackers and high-profile extortion groups like LAPSUS$,” he said. “By moving horizontally across the ecosystem – hitting tools like liteLLM that are present in over a third of cloud environments – they are creating a snowball effect.”
PyPI has advised users to rotate any secrets accessible to the affected LiteLLM environment, as researchers confirm active data exfiltration and potential exposure across cloud environments tied to the ongoing campaign.
The article originally appeared in InfoWorld.
Cloudflare launches Dynamic Workers for AI agent execution 25 Mar 2026, 10:37 am
Cloudflare has rolled out Dynamic Workers, an isolate-based runtime designed to run AI-generated code faster and more efficiently than traditional containers, as the company pushes lightweight, disposable execution environments as a foundation for enterprise AI applications.
The service enables enterprises to spin up execution environments in milliseconds, pointing to a transition away from container-heavy architectures toward more ephemeral runtimes designed for high-volume AI agent workloads.
For many enterprises, this points to a shift in how AI systems are built and executed. Instead of orchestrating predefined tools, organizations are beginning to let models generate and execute code on demand, a shift that raises new questions around security and cost.
Built on Cloudflare’s existing Workers platform, Dynamic Workers uses V8 isolates to execute code generated at runtime, often by LLMs, without requiring a full container or virtual machine.
“An isolate takes a few milliseconds to start and uses a few megabytes of memory,” Cloudflare said in a blog post. “That’s around 100x faster and 10x-100x more memory efficient than a typical container. That means that if you want to start a new isolate for every user request, on-demand, to run one snippet of code, then throw it away, you can.”
Cloudflare is pairing the runtime with its “Code Mode” approach, which encourages models to write short TypeScript functions against defined APIs instead of relying on multiple tool calls, a method the company says can reduce token usage and latency.
From an enterprise perspective, the platform includes controls such as outbound request interception for credential management, automated code scanning, and rapid rollout of V8 security patches. Cloudflare noted that isolate-based sandboxes have different security characteristics compared to hardware-backed environments.
Dynamic Workers are available in open beta under Cloudflare’s Workers paid plan. While pricing is set at $0.002 per unique Worker loaded per day, in addition to standard CPU and invocation charges, the per-Worker fee is waived during the beta period.
Enterprise runtime implications
For enterprise IT teams, the move to isolate-based execution could reshape how AI workloads are architected, especially for use cases that demand high concurrency and low-latency performance.
“Cloudflare is essentially looking to redefine the application lifecycle by pivoting away from the traditional ‘build-test-deploy’ cycle on centralized servers, which often relies on high-overhead, latency-heavy containers,” said Neil Shah, VP for research at Counterpoint Research. “The move to V8 reduces startup times from around 500 ms to under 5 ms, a roughly 100x improvement, making it significant for bursts of agentic AI requests that may require cold starts.”
This shift could also have cost implications. If AI agents can generate and execute scripts locally to produce outcomes, rather than repeatedly calling LLMs, enterprises may see improvements in both efficiency and latency.
However, Shah noted that the model introduces new security considerations that enterprise leaders cannot ignore.
“Allowing AI agents to generate and execute code on the fly introduces a new attack vector and risk,” Shah said. “While Dynamic Workers are sandboxed to limit the impact of a potential compromise, the unpredictability of AI-generated logic requires a robust security framework and clear guardrails.”
Others say these risks extend beyond sandboxing and require broader governance across the AI execution lifecycle. Nitish Tyagi, principal analyst at Gartner, said that while isolate-based environments improve containment, they do not eliminate risks.
“Running an AI agent and executing code in an isolated environment may seem very safe in theory, but it doesn’t ensure complete safety,” Tyagi said.
He pointed to risks such as vulnerabilities in AI-generated code, indirect prompt-injection attacks, and supply-chain threats, in which compromised external sources could lead agents to expose sensitive data or execute harmful actions.
Tyagi also warned of operational risks, including the risk of autonomous agents entering recursive execution loops, which can lead to cost escalation and resource exhaustion.
To mitigate these risks, Tyagi said enterprises need stronger governance mechanisms, including real-time monitoring of agent behavior, tighter control over outbound traffic, and better visibility into AI supply chains and dependencies.
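One such guardrail can be sketched as a hard budget on agent steps and spend, checked before every tool call or model invocation. The class and limits below are illustrative, not any vendor's API:

```python
class BudgetExceeded(Exception):
    """Raised when an agent run exhausts its allotted steps or spend."""
    pass

class AgentBudget:
    """Caps total steps and estimated cost for a single agent run."""

    def __init__(self, max_steps=50, max_cost_usd=5.0):
        self.max_steps = max_steps
        self.max_cost_usd = max_cost_usd
        self.steps = 0
        self.cost_usd = 0.0

    def charge(self, cost_usd):
        # Call this before each tool call or model invocation.
        self.steps += 1
        self.cost_usd += cost_usd
        if self.steps > self.max_steps:
            raise BudgetExceeded(f"step limit {self.max_steps} reached")
        if self.cost_usd > self.max_cost_usd:
            raise BudgetExceeded(f"cost limit ${self.max_cost_usd} reached")

# A runaway loop is halted at the budget boundary instead of escalating costs.
budget = AgentBudget(max_steps=3, max_cost_usd=1.0)
for _ in range(3):
    budget.charge(0.10)
try:
    budget.charge(0.10)  # fourth step exceeds the step limit
except BudgetExceeded as exc:
    print("halted:", exc)
```

A real deployment would pair a budget like this with the monitoring and egress controls Tyagi describes, but even this minimal check converts an unbounded recursion into a bounded, observable failure.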
Oracle adds pre-built agents to Private Agent Factory in AI Database 26ai 25 Mar 2026, 9:54 am
Oracle has added new prebuilt agents to Private Agent Factory, its no-code framework for building containerized, data-centric agents within AI Database 26ai.
These agents include a Database Knowledge Agent, a Structured Data Analysis Agent, and a Deep Data Research Agent.
While the Database Knowledge Agent translates natural-language prompts into queries to fetch specific facts, policies, or entities, the Deep Data Research Agent tackles more complex tasks by breaking them into steps and iterating across web sources, document libraries, or both, the company said.
The Structured Data Analysis Agent, meanwhile, is aimed at crunching tabular data — think SQL tables or CSV files — using tools like Python’s pandas library to generate charts, spot trends, flag anomalies, and summarize metrics, the company added.
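The kind of analysis described, summarizing metrics and flagging anomalies in tabular data, can be sketched in a few lines of pandas. The data, column names, and 2-sigma threshold below are illustrative, not Oracle's agent interface:

```python
import pandas as pd

# Illustrative revenue data; a real agent would load a SQL table or CSV.
df = pd.DataFrame({
    "month": ["Jan", "Feb", "Mar", "Apr", "May", "Jun"],
    "revenue": [100.0, 104.0, 98.0, 102.0, 310.0, 101.0],
})

# Summarize metrics.
summary = df["revenue"].agg(["mean", "std", "min", "max"])
print(summary)

# Flag anomalies: points more than 2 standard deviations from the mean.
z = (df["revenue"] - df["revenue"].mean()) / df["revenue"].std()
df["anomaly"] = z.abs() > 2
print(df[df["anomaly"]])  # the May spike is flagged
```

An agent wraps this kind of transformation behind a natural-language prompt, but the underlying mechanics are ordinary dataframe operations.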
The addition of these agents and Private Agent Factory to AI Database 26ai will help developers accelerate agent building in a secure and simplified manner, especially for enterprises operating in regulated industries, helping move pilots to production, analysts say.
“With AI Database Private Agent Factory, teams will be able to rapidly create AI agents or leverage pre-built ones, turning experimentation into production-ready solutions quickly. By embedding intelligence at the core of the database, Oracle is enabling a new era of agentic AI, where sophisticated, autonomous systems and applications can adapt and act at scale,” said Noel Yuhanna, principal analyst at Forrester.
Oracle’s rationale, Yuhanna added, reflects its broader strategy of making the database a central pillar of enterprise AI, given that execution ultimately depends on where the data resides.
That view is echoed by Stephanie Walter, practice leader of AI stack at HyperFRAME Research, who says Private Agent Factory is Oracle’s attempt to position itself as “the operational control layer” in enterprises rather than just the storage layer, by bringing data and AI closer together and reducing the need for data movement and external orchestration.
“Every major cloud provider is moving toward tighter coupling between data, models, and orchestration. Oracle’s differentiation is that it starts from the database outward, while hyperscalers typically start from the model or platform outward,” Walter said.
That differentiation is more than architectural nuance, according to Bradley Shimmin, lead of the data intelligence practice at The Futurum Group.
“By architecting agent orchestration directly into the database, Oracle is letting enterprises drop the duct-tape approach of complex, brittle data-movement pipelines that I would say continue to plague cloud-centric ecosystems, even those emphasizing zero-ETL capabilities,” Shimmin said.
That tighter integration also feeds directly into a more pragmatic concern for regulated industries: keeping sensitive data under control as AI agents move from experimentation into production.
“Most agent frameworks today assume you’re comfortable sending data to external LLM providers and orchestrating through cloud-hosted services. For regulated industries—including banking, healthcare, defense, and government—that assumption is a non-starter,” said Ashish Chaturvedi, leader of executive research at HFS Research.
“The Private Agent Factory meets those customers exactly where they are: behind the firewall, with the drawbridge up,” he added.
Stop worrying: Instead, imagine software developers’ next great pivot 25 Mar 2026, 9:00 am
My sister always says, “worry is just a lack of imagination.” By that, she means we always seem to worry about the worst-case scenarios — about things going badly. Why not worry, or imagine, that the best possible outcome will happen? You have a choice — choose to assume that everything will work out perfectly rather than disastrously.
This has never been more true than when you look at the folks who think all of us software developers are going to end up selling apples on street corners.
Don’t fear the coding agent
I get it. Software development has suddenly become incredibly efficient. Claude Code can write code vastly faster and more efficiently than we humans can, so it seems reasonable to fear that one person can now do (manage?) the work of 10 (50? 100?) people, that companies will get rid of everyone else, leaving them destitute. Seas of software developers will be standing in unemployment lines, their skills rendered moot by the blazing tokens of coding agents.
There’s the worst-case scenario. But what if we apply a bit of imagination?
Something similar happened during the Industrial Revolution. In the mid-19th century, steam engines were the leading technology, and as they became more efficient, coal miners grew concerned that demand for their services would drop as those engines used less and less coal.
But the coal miners lacked imagination — more efficient steam engines led to the unexpected result of an increase in the demand for coal. This counterintuitive outcome was noticed by economist William Stanley Jevons, who realized that cheaper, more efficient steam engines led to their more widespread use in ways that hadn’t yet been conceived, thus expanding the need for both coal miners and factory workers to build more and better steam engines. Everybody wins.
And why won’t the same thing be true for software? Can’t we imagine a world where the amount of software demanded and produced expands beyond what we think of today? The “programmers selling apples” scenario assumes that the demand for software remains constant. But if producing software becomes more efficient, won’t that lead to more software being produced?
Think of it this way: I bet most of us have a few side projects that we’d like to get done but never seem to find the time for. Your product manager certainly has a long list of features for your product that she’d like to build, but which never seem to make it onto the schedule. And small businesses all have bespoke software requirements that off-the-shelf solutions don’t meet.
Adapting to development disruption
Add to that the software that hasn’t even been conceived of yet, and it’s pretty easy to see — imagine — that there is no shortage of software waiting to be created. Making software easier to create won’t hold output at today’s projections; it will drastically increase the amount of software that gets produced. The floor just dropped out from under “we don’t have the time for that.”
Now, I’ll give you this: There may be a disruption in the type of work required to produce this software. Job descriptions change — this is a constant. We used to need people to write assembly and C. Procedural development gave way to object-oriented coding. Windows developers were left behind as the web rose to prominence. But we all have adapted, and we’ll do so again.
It turns out my sister is right. The best-case scenario is vastly more interesting than anyone bothers to imagine.
TypeScript 6.0 arrives 25 Mar 2026, 9:00 am
TypeScript 6.0 is slated to be the last release of the language based on the current JavaScript codebase and is now generally available. Version 6.0 acts as a bridge between TypeScript 5.9 and the planned TypeScript 7.0, which is close to completion and set to be speedier thanks to a rewrite in the Go language.
The 6.0 production release was unveiled on March 23, following the release candidate that arrived March 6. Developers can access TypeScript 6.0 via NPM with the following command: npm install -D typescript.
TypeScript has been established as JavaScript with syntax for types. Several changes were cited as noteworthy additions in the general production release of TypeScript 6.0, including an adjustment in type-checking for function expressions in generic calls, especially those occurring in generic JSX expressions. This typically will catch more bugs in existing code, although developers may find that some generic calls may need an explicit type argument, said Daniel Rosenwasser, principal product manager for TypeScript at Microsoft.
Also, Microsoft has extended its deprecation of import assertion syntax (i.e. import ... assert {...}) to import() calls like import(..., { assert: {...}}).
With the general release, Microsoft also has updated the DOM types to reflect the latest web standards, including some adjustments to the Temporal APIs. Other capabilities featured in TypeScript 6.0 include:
- There is less context sensitivity on this-less functions. If this is never actually used in a function, then it is not considered contextually sensitive, which means these functions will be seen as higher priority when it comes to type inference.
- A new flag has been introduced, called --stableTypeOrdering, which is intended to assist with migrations from TypeScript 6.0 to Version 7.0.
- TypeScript 6.0 adds support for the es2025 option for both target and lib. Although there are no new JavaScript language features in ES2025, this new target adds new types for built-in APIs and moves a few declarations from esnext into es2025.
- The contents of lib.dom.iterable.d.ts and lib.dom.asynciterable.d.ts are included in lib.dom.d.ts. Developers still can reference dom.iterable and dom.asynciterable in a configuration file's "lib" array, but they are now just empty files. TypeScript's lib option lets users specify which global declarations a target runtime has.
- In TypeScript 6.0, using module where namespace was expected is now a hard deprecation. This change was necessary because module blocks are a potential ECMAScript proposal that would conflict with the legacy TypeScript syntax.
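Concretely, adopting the new target is a small tsconfig.json change; a minimal sketch (the surrounding options are illustrative):

```json
{
  "compilerOptions": {
    "target": "es2025",
    "lib": ["es2025", "dom"]
  }
}
```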
The foundation of TypeScript 7.0, meanwhile, is set to be a compiler and language service written in Go that takes advantage of the speed of native code and shared-memory multi-threading. Version 7.0 is “extremely close to completion,” Rosenwasser said. It can be tried out from the Visual Studio Code editor or installed via NPM. “In fact, if you’re able to adopt TypeScript 6.0, we encourage you to try out the native previews of TypeScript 7.0,” Rosenwasser said.
Speed boost your Python programs with new lazy imports 25 Mar 2026, 9:00 am
When you import a module in Python, the module’s code must be evaluated completely before the program can proceed. For most modules, this isn’t an issue. But if a module has a long and involved startup process, it’s going to slow down the rest of your program at the point where it’s imported.
Python developers typically work around this issue by structuring imports so they don’t happen unless they are needed—for instance, by placing an import in the body of a function instead of at the top level of a module. But this is a clumsy workaround, and complicates the flow of the program.
Python 3.15 adds a new feature, lazy imports, that provides a high-level solution for slow-importing modules. Declaring an import as “lazy” means it will be evaluated when it is first used, not when it is first imported. The cost of a slow import can then be deferred until the code it contains is actually needed. And, while lazy imports introduce new syntax, you can future-proof existing code to use them without having to change any of its syntax.
Eager versus lazy imports
To start, it’s helpful to understand the problem addressed by lazy imports. So, let’s say we have two files in the same directory:
# main.py
print("Program starting")
from other import some_fn
print("Other module imported")
some_fn()
print("Program ended")

# other.py
print("Other module evaluation started")
from time import sleep
sleep(2)
# ^ This simulates a slow-loading module
print("Other module evaluation ended")

def some_fn():
    print("some_fn run")
If you run main.py, the output should look something like this:
Program starting
Other module evaluation started
[two-second delay]
Other module evaluation ended
Other module imported
some_fn run
Program ended
The mere act of importing other grinds our program to a near-halt before we can even do anything with the imported function, let alone continue with the rest of the program.
Now let’s see what happens if we modify main.py to use lazy imports (this will only work on Python 3.15 or higher):
print("Program starting")
lazy from other import some_fn
print("Other module imported")
some_fn()
print("Program ended")
When you run the program now, the behavior changes:
Program starting
Other module imported
Other module evaluation started
[two-second delay]
Other module evaluation ended
some_fn run
Program ended
Now, the import imposes no delay at all. We only see the delay when we try to run the function we imported from the module.
What’s happening under the hood? When Python detects a lazy import—typically triggered with the lazy keyword on the import line, as shown above—it doesn’t perform the usual import process. Instead, it creates a “proxy object,” or a stand-in, for the imported module. That proxy waits until the program tries to do something with the module. Then the actual import action triggers, and the module is evaluated.
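This proxy mechanism is conceptually similar to the lazy-loading stand-ins Python developers have long written by hand with importlib. A rough sketch of the idea, not the interpreter's actual implementation:

```python
import importlib

class LazyModuleProxy:
    """Stand-in that defers the real import until first attribute access."""

    def __init__(self, name):
        self._name = name
        self._module = None

    def __getattr__(self, attr):
        # Fires on first real use of the module; imports it exactly once.
        if self._module is None:
            self._module = importlib.import_module(self._name)
        return getattr(self._module, attr)

# No import cost is paid here...
json_proxy = LazyModuleProxy("json")
# ...the real import of `json` happens on first use.
print(json_proxy.dumps({"a": 1}))  # prints {"a": 1}
```

The key difference is that Python 3.15 builds this deferral into the import system itself, so no wrapper object ever appears in your code.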
The lazy keyword is always the first word on the line of an import you want to declare as lazy:
# lazily imports foo
lazy import foo
# lazily imports bar from foo
lazy from foo import bar
# same with the use of "as":
lazy import foo as foo1
lazy from foo import bar as bar1
Where to use lazy imports in Python
The most common scenario for using lazy imports is to replace the usual workaround for avoiding a costly import at program startup. As I mentioned previously, placing the import inside a function, instead of at the top level of a module, causes the import to happen only when the function runs. But it also means the import is limited to the function’s scope, and is therefore unavailable to the rest of the module unless you apply another workaround.
With a lazy import, you can keep the import in the top level of a module as you usually would. The only change you have to make is adding the lazy keyword to your code.
Using lazy imports automatically
It is also possible to enable lazy imports on an existing codebase automatically—without rewriting any import statements.
Python 3.15 adds new features to the sys module that control how lazy imports work. For instance, you can declare programmatically that every import from a given point forward in your program’s execution will be lazy:
import sys
sys.set_lazy_imports("all")
If sys.set_lazy_imports() is given "all", then every import in the program from that point on is lazy, whether or not it uses the lazy keyword. Passing "normal" restores the default behavior, in which only imports explicitly marked with the lazy keyword are handled lazily, and "none" disables lazy importing across the board.
Controlling lazy imports programmatically
You can also hook into lazy imports at runtime, which lets you do things like control which specific modules are lazy-imported:
import sys
def mod_filter(importing, imported, fromlist):
    return imported == "module"
sys.set_lazy_imports_filter(mod_filter)
sys.set_lazy_imports("all")
sys.set_lazy_imports_filter() lets you supply a function that takes in three parameters:
- The module where the import is being performed
- The module being imported
- A list of names being imported
With that, you can write logic to return True to allow a given import to be lazily imported, or False to force it to be imported normally. This lets you write allow-lists and block-lists for lazy imports as part of a test, or simply as part of how your program works.
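For example, an allow-list that lazily imports only a few heavyweight modules might look like the sketch below. The module names are illustrative, and the sys calls are guarded with hasattr so the sketch also runs on Pythons older than 3.15:

```python
import sys

# Hypothetical allow-list: only these modules are imported lazily.
LAZY_ALLOWED = {"numpy", "pandas", "matplotlib"}

def allowlist_filter(importing, imported, fromlist):
    # True: permit the lazy import; False: force an eager import.
    return imported in LAZY_ALLOWED

# Only enable the machinery on a Python that actually has it.
if hasattr(sys, "set_lazy_imports_filter"):
    sys.set_lazy_imports_filter(allowlist_filter)
    sys.set_lazy_imports("all")
```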
Two ways to get started with lazy imports
Python has a long-standing tradition of allowing newer features to be added gracefully to existing codebases. Lazy imports can be used the same way: You can check for the presence of the feature at program start, then apply lazy imports across your codebase automatically by using sys.set_lazy_imports().
To start, you can check the Python version number:
import sys
if sys.version_info >= (3, 15):
    ...  # set up lazy imports
Or you can test for the presence of the lazy import controls in sys:
import sys
if hasattr(sys, "set_lazy_imports"):
    ...  # set up lazy imports
New JetBrains platform manages AI coding agents 24 Mar 2026, 11:12 pm
Seeking to help developers control growing fleets of AI coding agents, JetBrains is introducing JetBrains Central, an agentic development platform for teams to manage and maintain visibility over these agents.
An early access program for JetBrains Central is set to begin in the second quarter of 2026 with a limited number of design partners participating. JetBrains describes the platform as the control and execution plane for agent-driven software production. JetBrains Central is intended to address the difficulties developers face in dealing with the growing number of agents. Developers are increasingly running into challenges with oversight, consistency, and control across these environments, according to JetBrains.
Announced March 24, JetBrains Central acts as a control layer across agentic workflows alongside tools such as the JetBrains’s Air agentic development environment and the Junie LLM-agnostic (large language model) coding agent. JetBrains Central connects developer tools, agents, and development infrastructure into a unified system where automated work can be executed and governed across teams and tools, JetBrains said. Developers can interact with agent workflows from JetBrains IDEs, third-party IDEs, CLI tools, web interfaces, or automated systems. Agents themselves can come from JetBrains or external ecosystems, including Codex, Gemini CLI, or custom agents.
JetBrains Central connects agents with the context needed, including repositories, documentation, and APIs. At the same time, agents operate within real delivery pipelines and infrastructure, interacting with Git repositories, CI/CD systems, cloud environments, and other services. When agents need guidance or complete a task, they interact with human teammates through the tools teams already use, such as Slack or Atlassian. This allows agent workflows to operate inside the same systems used by development teams today, rather than in isolated AI tools, according to JetBrains. Specific core capabilities include:
- Governance and control, including policy enforcement, identity and access management, observability, auditability, and cost attribution for agent-driven work. Some of these functionalities are already available via the JetBrains Central Console.
- Agent execution infrastructure, including cloud agent runtimes and compute provisioning, allowing agents to run reliably across development environments.
- Agent optimization and context features, providing shared semantic context across repositories and projects. This enables agents to access relevant knowledge and route tasks to the most appropriate models or tools.
New ‘StoatWaffle’ malware auto‑executes attacks on developers 24 Mar 2026, 12:01 pm
A newly disclosed malware strain dubbed “StoatWaffle” is giving fresh teeth to the notorious, developer-targeting “Contagious Interview” threat campaign.
According to NTT Security findings, the malware marks an evolution from the long-running campaign’s user-triggered execution to a near-frictionless compromise embedded directly in developer workflows. Attackers are using blockchain-themed project repositories as decoys, embedding a malicious VS Code configuration file that triggers code execution when the folder is opened and trusted by the victim.
“StoatWaffle is a modular malware implemented by Node.js and it has Stealer and RAT modules,” NTT researchers said in a blog post, adding that the campaign operator “WaterPlum” is “continuously developing new malware and updating existing ones.”
This means tracking Contagious Interview activity may now require widening the scope of detection efforts to include weaponized dev environments, not just malicious packages and interview lures.
Opening a folder is all it takes
StoatWaffle abuses developer trust within Visual Studio Code environments. Instead of relying on users to execute suspicious scripts, like in earlier attacks, attackers are embedding malicious configurations inside legitimate-looking project repositories, often themed around blockchain development, a lure theme that has been consistent with Contagious Interview campaigns.
The trick relies on a “.vscode/tasks.json” file configured with a “runOn: folderOpen” setting. Once a developer opens the project and grants trust, the payload executes automatically without any further clicks. The executed StoatWaffle malware operates a modular, Node.js-based framework that typically unfolds in stages. These stages include a loader, credential harvesting components, and then a remote access trojan (RAT) planted for persistence and pivoting access across systems.
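NTT's description maps onto VS Code's standard tasks schema. A hypothetical, defanged illustration of the kind of .vscode/tasks.json entry being abused (the command here is a placeholder, not the actual payload):

```json
{
  "version": "2.0.0",
  "tasks": [
    {
      "label": "setup",
      "type": "shell",
      "command": "node .vscode/setup.js",
      "runOptions": { "runOn": "folderOpen" }
    }
  ]
}
```

Because runOn: folderOpen tasks execute as soon as the workspace is trusted, a single click on VS Code's trust prompt is enough to launch the loader.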
The RAT module maintains regular communication with an attacker-controlled C2 server, executing commands to terminate its own process, change the working directory, list files and directories, navigate to the application directory, retrieve directory details, upload a file, execute Node.js code, and run arbitrary shell commands, among others.
StoatWaffle also exhibits custom behavior depending on the victim’s browser. “If the victim browser was Chromium family, it steals browser extension data besides stored credentials,” the researchers said. “If the victim browser was Firefox, it steals browser extension data besides stored credentials. It reads extensions.json and gets the list of browser extension names, then checks whether the designated keyword is included.”
For victims running macOS, the malware also targets Keychain databases, they added.
Contagious Interview, revisited
StoatWaffle isn’t an isolated campaign. It’s the latest chapter in the Contagious Interview attacks, widely attributed to North Korea-linked threat actors tracked as WaterPlum.
Historically, this campaign has targeted developers and job seekers through fake interview processes, luring them into running malicious code under the guise of technical assessments. Previously, the campaign weaponized npm packages and staged loaders like XORIndex and HexEval, often distributing dozens of malicious packages to infiltrate developer ecosystems at scale.
Team 8, one of the group’s sub-clusters, previously relied on malware such as OtterCookie, shifting to StoatWaffle around December 2025, the researchers said.
The disclosure also shared a set of IP-based indicators of compromise (IOCs), likely tied to C2 infrastructure observed during analysis, to support detection efforts.
This article originally appeared in CSO.
VS Code now updates weekly 24 Mar 2026, 9:00 am
With Microsoft now releasing stable updates to its Visual Studio Code editor weekly instead of just monthly, VS Code versions 1.112 and 1.111 have recently been released, featuring capabilities such as agent troubleshooting, integrated browser debugging, and Copilot CLI permissions. Also featured is the deprecation of VS Code’s Edit Mode.
VS Code 1.112 was released March 18, while VS Code 1.111 arrived on March 9. Both follow what was a monthly update, VS Code 1.110, released March 4. Download instructions for the editor can be found on the Visual Studio Code website.
Integrated browser debugging on VS Code 1.112 means developers can open web apps directly within VS Code and can start debugging sessions with the integrated browser. This allows interaction with the web app, setting of breakpoints, stepping through code, and inspecting variables without leaving VS Code.
With VS Code 1.111, Edit Mode was officially deprecated. Users can temporarily re-enable Edit Mode via the VS Code setting chat.editMode.hidden. This setting will remain supported until version 1.125; beginning with that release, Edit Mode will be fully removed, and it will no longer be possible to enable it via settings.
For Copilot CLI sessions in VS Code 1.112, meanwhile, developers can configure permissions for local agent sessions in chat to give agents more autonomy in their actions and to reduce the number of approval requests. Developers can choose between permission levels, including default permissions, bypass approvals, and autopilot.
To reduce the risks of locally running Model Context Protocol (MCP) servers, developers with VS Code 1.112 now can run locally configured stdio MCP servers in a sandboxed environment on macOS and Linux. Sandboxed servers have restricted file system and network access.
Also in VS Code 1.112, agents can now read image files from disk and binary files natively. This allows developers to use agents for a wider variety of tasks, such as analyzing screenshots, reading data from binary files, and more. Binary files are presented to the agent in a hexdump format.
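The release notes do not spell out the exact layout, but a conventional hexdump pairs an offset column, the bytes in hex, and a printable-ASCII gutter. A minimal Python sketch of that format, as an illustration rather than VS Code's implementation:

```python
def hexdump(data: bytes, width: int = 16) -> str:
    """Render bytes in the classic offset / hex / ASCII hexdump layout."""
    lines = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        hex_part = " ".join(f"{b:02x}" for b in chunk)
        ascii_part = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        lines.append(f"{offset:08x}  {hex_part:<{width * 3}} {ascii_part}")
    return "\n".join(lines)

print(hexdump(b"PNG\x00\x01\x02 binary data"))
```

Presenting binary content this way lets a text-only model reason about byte values and any embedded ASCII strings at the same time.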
VS Code 1.111, meanwhile, emphasizes agent capabilities. With this release, developers gained a benefit in agent troubleshooting. To help them understand and troubleshoot agent behavior, developers now can attach a snapshot of agent debug events as context in chat by using #debugEventsSnapshot. This can be used to ask the agent about loaded customizations, token consumption, or to troubleshoot agent behavior. Developers also can select the sparkle chat icon in the top-right corner of the Agent Debug panel to add the debug events snapshot as an attachment to the chat composer. Selecting the attachment opens the Agent Debug panel logs, filtered to the timestamp when the snapshot was taken.
Also in the agent vein, VS Code 1.111 adds a new permissions picker in the Chat view for controlling how much autonomy the agent has. The permission level applies only to the current session. Developers can change it at any time during a session by selecting a different level from the permissions picker.
Further in the agent space, the custom agent frontmatter in VS Code 1.111 adds support for agent-scoped hooks that are only run when a specific agent is selected or when it is invoked via runSubagent. This enables attachment of pre- and post-processing logic to specific agents without affecting other chat interactions.
VS Code 1.111 also featured a preview of an autopilot capability. This lets agents iterate autonomously until they complete their task.
When Windows 11 sneezes, Azure catches cold 24 Mar 2026, 9:00 am
If you look at Microsoft as a collection of product lines, it is easy to conclude that Windows 11 and Azure occupy different universes. One is a client operating system that has irritated its users, confused administrators, and pushed hardware refresh cycles in ways many customers did not want. The other is a hyperscale cloud platform selling compute, storage, data services, and AI infrastructure to enterprises. On paper, these are different businesses. In practice, they are part of the same trust system.
That is why the real question is not whether every unhappy Windows 11 user immediately stops buying Azure. They do not. The short-term connection is too indirect for that. The real issue is whether Microsoft is weakening the strategic gravity that has historically pulled enterprises toward the Microsoft stack. If Windows becomes less loved, less trusted, and less central, then Azure loses one of its quiet but important advantages: the assumption that Microsoft remains the default operating environment from endpoint to identity to server to cloud.
A cascade of Windows 11 problems
Windows 11 did not fail because of one mistake. It became controversial because Microsoft stacked friction on top of friction. The first issue was hardware eligibility. By tightening CPU support and enforcing TPM 2.0 and Secure Boot requirements, Microsoft effectively told a large installed base that perfectly usable machines were no longer good enough for the future of Windows. For many users and businesses, that translated into an involuntary hardware refresh rather than an upgrade. That remains one of the most damaging perception problems around Windows 11 because it turned operating system modernization into a capital expense conversation.
The second issue has been the aggressive insertion of AI features, especially Copilot, into the Windows experience. Recent reporting indicates Microsoft has been reassessing how deeply to push Copilot into Windows 11 after broad criticism that AI was being forced into core workflows rather than offered as a clearly optional capability. That matters because enterprise customers tend to reward optionality and punish coercion. When users believe the operating system is being used as a delivery vehicle for features they did not request, trust erodes quickly.
The third issue is cumulative quality perception. Even where individual complaints differ, the common narrative has been remarkably consistent: too much UX churn, too much product agenda, and not enough attention to core stability and utility. Once that story takes hold, it is no longer just about Windows 11. It becomes about Microsoft’s judgment.
The short-term impact on Azure
In the near term, I do not think the Windows 11 backlash materially dents Azure revenue in a dramatic, visible way. Azure buying decisions are still driven by enterprise agreements, migration road maps, data gravity, AI demand, regulatory requirements, and the practical realities of application modernization. A company does not walk away from its Azure footprint because employees dislike a desktop rollout.
There is also a structural reason the short-term effect is muted. Most Azure customers run a mixed environment already. Even in Microsoft-heavy enterprises, cloud workloads are often Linux-based, containerized, or managed through cross-platform tools. The Azure strategy today is less “run Windows everywhere” and more “meet customers where they are.” That makes the desktop operating system less immediately determinative than it was a decade ago.
However, that should not be confused with immunity. In the short run, Windows 11 can damage Microsoft’s credibility and affect adjacent buying decisions. If CIOs and architects see Microsoft overreaching on the client, they may become more skeptical of broader Microsoft platform bets. Skepticism does not always kill a deal, but it can slow expansion, increase competitive reviews, and make alternatives look more reasonable.
The risk of ecosystem decoupling
This is where the story gets serious. Microsoft’s power historically came from stack continuity. Windows on the desktop led to Windows Server, Active Directory, Microsoft management tools, Microsoft productivity software, Microsoft developers, and eventually Microsoft cloud. The company benefited from a kind of architectural momentum. Even when customers complained, they often stayed because the ecosystem fit together.
If Windows 11 reduces the footprint or strategic relevance of Windows on end-user devices, that continuity weakens. Lenovo is already shipping some lines of business laptops with both Windows and Linux options, a sign that major manufacturers see practical demand for more operating system flexibility. More broadly, mainstream business laptop coverage now treats Linux-capable systems from Lenovo and Dell as credible enterprise choices rather than edge cases. That shift matters. Once manufacturers normalize OS choice, Microsoft loses part of its distribution advantage.
A reduced Windows footprint does not automatically mean Azure declines, but it does make non-Microsoft infrastructure easier to justify. If the endpoint is no longer assumed to be Windows, then the organization becomes more comfortable with Linux-first operations, browser-based productivity, identity abstraction, cross-platform management, and container-native development. At that point, AWS and Google Cloud gain more than competitive parity. They gain narrative momentum.
Who benefits from Microsoft’s missteps
AWS has long benefited from being seen as the neutral default for cloud infrastructure. Google Cloud benefits from strength in data, AI, Kubernetes, and open source. Both providers become more attractive when enterprises want to avoid deeper entanglement with a single vendor’s ecosystem. If Microsoft weakens the emotional and operational case for staying inside that ecosystem, competitors have less resistance to overcome.
Then there is the rise of sovereign clouds and neo clouds. Sovereign cloud offerings are increasingly attractive to governments, regulated industries, and companies navigating regional data control requirements. Neo clouds, especially GPU-centric specialists, are capturing interest from organizations that want AI infrastructure without buying into a full legacy enterprise stack. These providers are not necessarily replacing Azure across the board, but they are fragmenting the market and redefining what “best fit” looks like.
That fragmentation becomes more dangerous for Microsoft if Windows no longer functions as an ecosystem anchor. Once customers accept heterogeneity at the edge, they become more comfortable buying heterogeneity in the cloud.
Microsoft still has time to stop this from spreading. The fix is not complicated, although it may be culturally difficult. Microsoft has to make Windows useful before it makes Windows strategic. That means reducing forced experiences, making Copilot clearly optional, restoring confidence in the value of core OS improvements, and acknowledging that hardware gating created real resentment. It also means understanding that endpoint trust is not a side issue. It is part of the company’s larger cloud positioning.
If Microsoft treats Windows 11 as merely a noisy consumer controversy, it will miss the enterprise lesson. Platforms are built on confidence. Confidence on the desktop influences confidence in the data center and the cloud. The short-term Azure impact may be modest, but the long-term risk is real: If Windows stops being the front door to the Microsoft universe, Azure stops being the default destination.
That is how desktop mistakes become cloud problems. Not all at once, but gradually and then faster than expected.
Designing self-healing microservices with recovery-aware redrive frameworks 24 Mar 2026, 9:00 am
Cloud-native microservices are built for resilience, but true fault tolerance requires more than automatic retries. In complex distributed systems, a single failure can cascade across multiple services, databases, caches or third-party APIs, causing widespread disruptions. Traditional retry mechanisms, if applied blindly, can exacerbate failures and create what is known as a retry storm, an exponential amplification of failed requests across dependent services.
This article presents a recovery-aware redrive framework, a design approach that enables self-healing microservices. By capturing failed requests, continuously monitoring service health and replaying requests only after recovery is confirmed, systems can achieve controlled, reliable recovery without manual intervention.
Challenges with traditional retry mechanisms
Retry storms occur when multiple services retry failed requests independently without knowledge of downstream system health. Consider the following scenario:
- Service A calls Service B, which is experiencing high latency.
- Both services implement automatic retries.
- Each failed request is retried multiple times across layers.
In complex systems where services depend on multiple layers of other services, a single failed request can be retried multiple times at each layer. This can quickly multiply the number of requests across the system, overwhelming downstream services, delaying recovery, increasing latency and potentially triggering cascading failures even in components that were otherwise healthy.
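The multiplication described above is easy to quantify. As a rough sketch (the numbers are illustrative, not drawn from any particular incident), with r retries per call and d dependent service layers, a single user request can fan out into (r + 1)^d attempts against the deepest service in the worst case:

```python
# Illustrative only: how per-layer retries multiply request volume.
# With r retries per call and d dependent service layers, one request
# can become (r + 1) ** d attempts at the deepest service.

def worst_case_attempts(retries_per_call: int, layers: int) -> int:
    """Upper bound on attempts reaching the deepest service."""
    return (retries_per_call + 1) ** layers

# 3 retries across 4 layers turns one request into 256 attempts.
for layers in range(1, 5):
    print(layers, worst_case_attempts(3, layers))
```

This exponential growth is why blind retries delay recovery rather than accelerate it: the recovering service sees orders of magnitude more traffic than the original workload.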
Recovery-aware redrive framework: System design
The recovery-aware redrive framework is designed to prevent retry storms while ensuring all failed requests are eventually processed. Its core design principles include:
- Failure capture: All failed requests are persisted in a durable queue (e.g., Amazon SQS) along with their payloads, timestamps, retry metadata and failure type. This guarantees exact replay semantics.
- Service health monitoring: A serverless monitoring function (e.g., AWS Lambda) evaluates downstream service metrics, including error rates, latency and circuit breaker states. Requests remain queued until recovery is confirmed.
- Controlled replay: Once system health indicates recovery, queued requests are replayed at a controlled rate. Failed requests during replay are re-enqueued, enabling multi-cycle recovery while avoiding retry storms. Replay throughput can be dynamically adjusted to match service capacity.

Anshul Gupta
Operational flow
The framework operates in three stages:
- Failure detection: Requests failing at any service are captured with full metadata in the durable queue.
- Monitoring and recovery detection: Health metrics are continuously analyzed. Recovery is considered achieved when all monitored metrics fall within predefined thresholds.
- Replay execution: Requests are replayed safely after recovery, with throughput limited to prevent overload. Failures during replay are returned to the queue for subsequent attempts.
This design ensures safe, predictable retries without amplifying failures. By decoupling failure capture from replay and gating retries based on real-time service health, the system prevents premature retries that could overwhelm recovering services. It also maintains end-to-end request integrity, guaranteeing that all failed requests are eventually processed while preserving the original payload and semantics. This approach reduces operational risk, avoids cascading failures and supports observability, allowing engineers to track failures, recovery events and replay activity in a controlled and auditable manner.
Implementation in cloud-native environments
A practical implementation involves:
- Failure capture function: Intercepts failed API calls and writes them to a queue.
- Monitoring function: Evaluates downstream service health continuously.
- Replay function: Dequeues messages at a controlled rate after recovery, re-queuing failures as necessary.
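The monitoring and replay functions above can be sketched together. In this hedged example, `healthy()` stands in for real metric checks (error rates, latency, circuit-breaker state) and `send()` for the actual downstream call; both are assumptions supplied by the caller:

```python
from collections import deque

def replay(queue: deque, send, healthy, max_per_cycle: int = 10) -> int:
    """Replay queued requests only once the service reports healthy.

    Throughput is capped per cycle so a recovering service is not
    overwhelmed, and failed replays are re-enqueued for a later cycle.
    """
    if not healthy():
        return 0                   # recovery not confirmed: keep queued
    sent = 0
    for _ in range(min(max_per_cycle, len(queue))):
        request = queue.popleft()
        if send(request):
            sent += 1
        else:
            queue.append(request)  # multi-cycle recovery
    return sent

q = deque([{"id": i} for i in range(25)])
drained = replay(q, send=lambda r: True, healthy=lambda: True, max_per_cycle=10)
print(drained, len(q))  # 10 replayed this cycle, 15 still queued
```

Gating on `healthy()` before touching the queue is the piece that distinguishes a redrive framework from a plain retry loop.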
This decoupling of failure capture from replay enables true self-healing microservices, reducing the need for human intervention during outages.
Benefits of recovery-aware redrive
Implementing a recovery-aware redrive framework offers several operational advantages that directly impact system reliability and resilience. By intelligently managing failed requests and controlling replay based on actual service health, this design not only prevents uncontrolled traffic amplification but also ensures that every request is eventually processed without manual intervention. In addition, it enhances visibility into system behavior, providing actionable insights for troubleshooting and capacity planning. These benefits make the framework particularly well-suited for modern cloud-native environments where stability, observability and cross-platform compatibility are critical.
- Prevents retry storms: Ensures request amplification is bounded.
- Maintains reliability: Guarantees that all failed requests are eventually processed.
- Supports observability: Logs all failures, replay attempts and system metrics for auditing and troubleshooting.
- Platform agnostic: Compatible with Kubernetes, serverless or hybrid cloud environments.
Best practices
- Design requests to be idempotent or safely deduplicated.
- Base monitoring on real system metrics rather than static timers.
- Throttle replay throughput dynamically according to system capacity.
- Maintain audit logs of failures and replay activities for operational transparency.
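The first best practice, idempotency, can be sketched minimally: each request carries a client-generated idempotency key, and the handler records processed keys so a replayed request becomes a no-op. The key and field names below are illustrative assumptions:

```python
processed: dict[str, str] = {}   # idempotency key -> cached result

def handle(request: dict) -> str:
    key = request["idempotency_key"]
    if key in processed:
        return processed[key]    # replay: return cached result, no new side effect
    result = f"charged order {request['order_id']}"   # pretend side effect
    processed[key] = result
    return result

first = handle({"idempotency_key": "abc-1", "order_id": 7})
replayed = handle({"idempotency_key": "abc-1", "order_id": 7})
print(first == replayed)  # the redrive can safely replay this request
```

Without this property, multi-cycle replay risks double-charging, double-sending, or otherwise repeating side effects.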
Conclusion
Self-healing microservices require more than traditional retries. A recovery-aware redrive framework provides a structured approach to capture failed requests, monitor downstream service health and replay them safely after recovery. This framework prevents retry storms, improves observability and enables cloud-native systems to recover autonomously from outages, delivering resilient and reliable services in complex distributed environments.
This article is published as part of the Foundry Expert Contributor Network.
Want to join?
7 safeguards for observable AI agents 24 Mar 2026, 9:00 am
Many organizations are under pressure to take their AI agent experiments and proofs of concept out of pilots and into production. Devops teams may have limited time to ensure these AI agents meet the non-negotiable requirements for production deployments, including observability, monitoring, and other agenticops practices.

One question devops teams must answer is what their minimum requirements are to ensure AI agents are observable. Teams can start by extracting fundamentals from devops observability practices and layering in dataops observability for data pipelines and modelops for AI models.
But organizations also must extend their observability standards, especially as AI agents take over role-based tasks, integrate with MCP servers for more complex workflows, and support both human-in-the-middle and autonomous operations.
A key observability question is: Who did what, when, why, and with what information, from where? The challenging part is centralizing this information and having an observability data standard that works regardless of whether the decision or action came from an AI agent or a person.
“Devops should apply the same content and quality processes to AI agents as they do for people by leveraging AI-powered solutions that monitor 100% of interactions from both humans and AI agents,” suggests Rob Scudiere, CTO at Verint. “The next step is observing, managing, and monitoring AI and human agents together because performance oversight and continuous improvement are equally critical.”
I asked experts to share key concepts and their best practices for implementing observable AI agents.
1. Define success criteria and operational governance
Observability is a bottom-up process for capturing data on an AI agent’s inputs, decisions, and operations. Before delving into non-functional requirements for AI agents and defining observability standards, teams should first review top-down goals, operational objectives, and compliance requirements.
Kurt Muehmel, head of AI strategy at Dataiku, says observable agents require three disciplines that many teams treat as afterthoughts:
- Define success criteria because engineers can’t determine what “good” looks like alone. Domain experts need to help build evaluation datasets that capture edge cases only they would recognize.
- Centralize visibility because agents are being built everywhere, including data platforms, cloud services, and across teams.
- Establish technical operational governance before deployment, including evaluation criteria, guardrails, and monitoring.
Observability standards should cover proprietary AI agents, those from top-tier SaaS and security companies, and those from growing startups. Regarding technical operational governance:
- Evaluation criteria can incorporate site reliability concepts around service-level objectives, but should include clear boundaries for poor, unacceptable, or dangerous performance.
- Guardrails should include deployment standards and release-readiness criteria.
- Monitoring should include clear communication and escalation procedures.
2. Define the information to track
Observability of AI agents is non-trivial for a handful of reasons:
- AI agents are not only stateful but have memory and feedback loops to improve decision-making.
- Actions may be triggered by people, autonomously by the AI agent, or orchestrated by another agent via an MCP server.
- Tracking the agent’s behavior requires versioning and change tracking for the underlying datasets, AI models, APIs, infrastructure components, and compliance requirements.
- Observability must account for additional context, including identities, locations, time considerations, and other conditions that can influence an agent’s recommendations.
Given the complexity, it’s not surprising that experts had many suggestions regarding what information to track.
“Teams should treat every agent interaction like a distributed trace with instrumentation at the various decision-making boundaries and capture the prompt, model response, the latency, and the resulting action in order to spot drift, latency issues, or unsafe behaviors in real time,” says Logan Rohloff, tech lead of cloud and observability at RapDev. “Combining these metrics with model-aware signals, such as token usage, confidence scores, policy violations, and MCP interactions enables you to detect when an agent is compromised or acting outside its defined scope.”
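Rohloff's suggestion to treat every agent interaction like a distributed trace can be sketched as a simple span record. This is a hedged, homegrown illustration, not an OpenTelemetry integration; the field names, `fake_model`, and `fake_act` are assumptions for the example:

```python
import time
from dataclasses import dataclass

@dataclass
class AgentSpan:
    prompt: str
    response: str
    action: str
    latency_ms: float
    token_usage: int   # rough word-count proxy, not a real tokenizer

def traced_call(prompt: str, model, act) -> AgentSpan:
    """Wrap one agent interaction, capturing prompt, response,
    latency, and the resulting action as a single trace-like record."""
    start = time.perf_counter()
    response = model(prompt)
    return AgentSpan(
        prompt=prompt,
        response=response,
        action=act(response),
        latency_ms=(time.perf_counter() - start) * 1000,
        token_usage=len(prompt.split()) + len(response.split()),
    )

fake_model = lambda p: "restart service payments-db"
fake_act = lambda r: f"executed: {r}"
span = traced_call("remediate failing health check", fake_model, fake_act)
print(span.action)
```

Emitting such records at every decision-making boundary is what makes drift, latency issues, or out-of-scope actions visible after the fact.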
Devops teams will need to extend microservice observability principles to support AI agents’ stateful, contextual interactions.
“Don’t overlook the bits around session, context, and workflow identifiers as AI agents are stateful, communicate with each other, and can store and rehydrate sessions,” says Christian Posta, global field CTO at Solo.io. “We need to be able to track causality and flows across this stateful environment, and with microservices, there was always a big challenge getting distributed tracing in place at an organization. Observability is not optional, and without it, there’s no way you can run AI agents and be compliant.”
Agim Emruli, CEO of Flowable, adds that “teams need to establish identity-based access controls, including unique agent credentials and defined permissions, because in multi-agent systems, traceability drives accountability.”
3. Identify errors, hallucinations, and dangerous recommendations
Instrumenting observable APIs and applications helps engineers address errors, identify problem root causes, improve resiliency, and research security and operational issues. The same is true for AI agents that autonomously complete tasks or make recommendations to human operators.
“When an AI agent hallucinates or makes a questionable decision, teams need visibility into the full trajectory, including system prompts, contexts, tool definitions, and all message exchanges,” says Andrew Filev, CEO and founder of Zencoder. “But if that’s your only line of defense, you’re already exposed because agentic systems are open-ended and operate in dynamic environments, requiring real-time verification. This shift started with humans reviewing every result and is now moving toward built-in self- and parallel verification.”
Autonomous verification will be needed as organizations add agents, integrate with MCP servers, and allow agents to connect to sensitive data sources.
“Observing AI agents requires visibility not only into model calls but into the full chain of reasoning, tools, and code paths they activate, so devops can quickly identify hallucinations, broken steps, or unsafe actions,” says Shahar Azulay, CEO and co-founder of Groundcover. “Real-time performance metrics like token usage, latency, and throughput must sit alongside traditional telemetry to detect degradation early and manage the real cost profile of AI in production. And because agents increasingly execute code and access sensitive data, teams need security-focused observability that inspects payloads, validates integrations like MCP, and confirms that every action an agent takes is both authorized and expected.”
4. Ensure AI agent observability addresses risk management
Organizations will recognize greater business value and ROI as they scale AI agents to operational workflows. The implication is that AI agent observability becomes a fundamental part of the organization’s risk management strategy.
“Make sure that observability of agents extends into tool use: what data sources they access, and how they interact with APIs,” says Graham Neray, co-founder and CEO, Oso. “You should not only be monitoring the actions agents are taking, but also categorizing risk levels of different actions and alerting on any anomalies in agentic actions.”
Risk management leaders will be concerned about rogue agents, data issues, and other IT and security risks that can impact AI agents. Auditors and regulators will expect enterprises to implement robust observability into AI agents and have remediation processes to address unexpected behaviors and other security threats.
5. Extend observability to security monitoring and threat detection
Another consumer of observability data will be security operation centers (SOCs) and security analysts. They will connect the information to data security posture management (DSPM) and other security monitoring tools used for threat detection.
“I expect real insight into how the agent reacts when it connects to external systems because integrations create blind spots that attackers target,” says Amanda Levay, CEO of Redactable. “Leaders need this level of observability because it shows where the agent strains under load, where it misreads context, and where it opens a path that threatens security.”
CISOs will need to extend their operational playbooks as threats from AI actors grow in scale and sophistication.
“Infosec and devops teams need clear visibility into the data transferred to agents, their actions on data and systems, and the requests made of them by users to look for signs of compromise, remediate issues, and perform root-cause analysis,” says Mike Rinehart, VP of AI at Securiti AI. “As AI and AI agents become part of important data pipelines, teams must fold governance into prompts, integrations, and deployments so security, privacy, and engineering leaders act from a shared view of the data landscape and the risks that come with it.”
6. Evaluate AI agent performance
Addressing risk management and security concerns is one reason to implement observability in AI agents. The other key role observability plays is gauging an AI agent’s performance and signaling when improvements are needed.
“When I evaluate AI agents, I expect visibility into how the agent forms its decisions because teams need a clear signal when it drifts from expected behavior,” says Levay of Redactable. “I watch for moments when the agent ignores its normal sources or reaches for shortcuts because those shifts reveal errors that slip past general observability tools.”
To evaluate performance, Tim Armandpour, CTO of PagerDuty, says technology leaders must prepare for AI agents that fail subtly rather than catastrophically. He recommends, “Instrument the full decision chain from prompt to output and treat reasoning quality and decision patterns as first-class metrics alongside traditional performance indicators. The teams succeeding at this treat every agent interaction as a security boundary and build observability contracts that make agent behavior auditable and explainable in production.”
7. Prepare for AI observability agents that take action
The natural evolution of observability is when devops organizations turn signals into actions using AI observability agents.
“Observability shouldn’t stop at recording; you should be able to take action if an agent is going astray easily,” says Neray of Oso. “Make sure you can easily restrict agentic actions by tightening access permissions, removing a particular tool, or even fully quarantining an agent to stop rogue behavior.”
Observability data will fuel the next generation of IT and security operational AI agents that will need to monitor a business’s agentic AI operations. The question is whether devops teams will have enough time to implement observability standards, or whether business demand to deploy agents will drive a new era of AI technical debt.
An architecture for engineering AI context 24 Mar 2026, 9:00 am
Ensuring reliable and scalable context management in production environments is one of the most persistent challenges in applied AI systems. As organizations move from experimenting with large language models (LLMs) to embedding them deeply into real applications, context has become the dominant bottleneck. Accuracy, reliability, and trust all depend on whether an AI system can consistently reason over the right information at the right time without overwhelming itself or the underlying model.
Two core architectural components of Empromptu’s end-to-end production AI system, Infinite Memory and the Adaptive Context Engine, were designed to solve this problem, not by expanding raw context windows but by rethinking how context is represented, stored, retrieved, and optimized over time.
The core problem: Context as a system constraint
Empromptu is designed as a full-stack system for building and operating AI applications in real-world environments. Within that system, Infinite Memory and Adaptive Context Engine work together to solve one specific but critical problem: how AI systems retain, select, and apply context reliably as complexity grows.
Infinite Memory provides the persistent memory layer of the system. It is responsible for retaining interactions, decisions, and historical context over time without being constrained by traditional context window limits.
The Adaptive Context Engine provides the attention and selection layer. It determines which parts of that memory, along with current data and code, should be surfaced for any given interaction so the AI can act accurately without being overwhelmed.
Together, these components sit beneath the application layer and above the underlying models. They do not replace foundation models or require custom training. Instead, they orchestrate how information flows into those models, making large, messy, real-world systems usable in production.
In practical terms, Infinite Memory answers the question: What can the system remember? The Adaptive Context Engine answers the question: What should the system pay attention to right now?
Both are designed as infrastructure primitives that plug into Empromptu’s broader platform, which includes evaluation, optimization, governance, and integration with existing codebases. This is what allows the system to support long-running sessions, large codebases, and evolving workflows without degrading accuracy over time.
Most modern AI systems operate within strict context limits imposed by the underlying foundation models. These limits force difficult trade-offs:
- Retain full interaction history and suffer from escalating latency, cost, and performance degradation.
- Periodically summarize past interactions and accept the loss of nuance, intent, and critical decision history.
- Reset context entirely between sessions and rely on users to restate information repeatedly.
These approaches may be acceptable in demos or chatbots, but they break down quickly in production systems that must operate over long time horizons, large document sets, or complex codebases.
In real applications, context is not a linear conversation. It includes prior decisions, system state, user intent, historical failures, domain constraints, and evolving requirements. Treating context as a flat text buffer inevitably leads to hallucinations, regressions, and brittle behavior.
The challenge is not how much context an AI system can hold at once, but how intelligently it can decide what context matters for any given action.
Infinite Memory: Moving beyond context windows
Infinite Memory represents a shift away from treating context as something that must fit inside a single prompt. Instead, it introduces a persistent memory layer that exists independently of the model’s immediate context window.
This memory layer captures all interactions, decisions, corrections, and system state over time. Importantly, Infinite Memory does not attempt to inject all of this information into every request. Instead, it stores information in structured, retrievable forms that can be selectively reintroduced when relevant.
From an architectural perspective, Infinite Memory functions more like a knowledge substrate than a conversation log. Each interaction contributes to a growing memory graph that records:
- User intent and preferences
- Historical decisions and their outcomes
- Corrections and failure modes
- Domain-specific constraints
- Structural information about code, data, or workflows
This allows the system to support conversations and workflows of effectively unlimited length without overwhelming the underlying model. The result is an AI system that never forgets, but also never blindly recalls everything.
Adaptive Context Engine: Attention as infrastructure
If Infinite Memory is the storage layer, the Adaptive Context Engine is the reasoning layer that decides what to surface and when to do so.
Internally, the Adaptive Context Engine is best understood as an attention management system. Its role is to continuously evaluate available memory and determine which elements are necessary for a specific request, task, or decision.
Unlike static prompt engineering approaches, the Adaptive Context Engine is dynamic and self-optimizing. It learns from usage patterns, outcomes, and feedback to improve its context selection over time. Rather than relying on predefined rules, it treats context selection as an evolving optimization problem.
Multi-level context management
The Adaptive Context Engine operates across multiple layers of abstraction, allowing it to manage both conversational and structural context.
Request harmonization
One of the most common failure modes in AI systems is request fragmentation. Users ask for changes, clarifications, and additions across multiple interactions, often referencing previous requests implicitly rather than explicitly.
Request harmonization addresses this by maintaining a continuously updated representation of the user’s cumulative intent. Each new request is merged into a harmonized request object that reflects everything the user has asked for so far, including constraints and dependencies.
This prevents the system from treating each interaction as an isolated command and allows it to reason over intent holistically rather than sequentially.
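The merge described above can be sketched as a simple fold over incoming requests. This is an illustrative assumption about shape, not Empromptu's actual schema: constraints accumulate as a set, while other fields are overridden by the latest request:

```python
def harmonize(harmonized: dict, new_request: dict) -> dict:
    """Merge a new request into the cumulative harmonized request."""
    merged = dict(harmonized)
    merged["constraints"] = sorted(
        set(harmonized.get("constraints", []))
        | set(new_request.get("constraints", []))
    )
    # Later requests override earlier ones field by field.
    for key, value in new_request.items():
        if key != "constraints":
            merged[key] = value
    return merged

state: dict = {}
state = harmonize(state, {"goal": "build signup form",
                          "constraints": ["dark mode"]})
state = harmonize(state, {"goal": "build signup form with SSO",
                          "constraints": ["GDPR consent"]})
print(state["goal"], "|", state["constraints"])
```

Downstream steps then reason over `state`, the total accumulated intent, rather than over the latest message in isolation.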
Synthetic history generation
Rather than replaying full interaction histories, the system generates what we refer to as synthetic histories. A synthetic history is a distilled representation of past interactions that preserves intent, decisions, and constraints while removing redundant or irrelevant conversational detail.
From the model’s perspective, it appears as though there has been a single coherent exchange that already incorporates everything learned so far. This dramatically reduces token usage while also maintaining reasoning continuity. Synthetic histories are regenerated dynamically, allowing the system to evolve its understanding as new information arrives.
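A minimal sketch of the distillation step, under the assumption that interaction turns have already been classified by kind (the tagging scheme here is illustrative, not Empromptu's implementation): decisions and constraints are carried forward, conversational filler is not.

```python
def synthesize(history: list[dict]) -> str:
    """Distill a raw interaction log into a compact synthetic history."""
    decisions = [t["text"] for t in history if t.get("kind") == "decision"]
    constraints = [t["text"] for t in history if t.get("kind") == "constraint"]
    # Chit-chat and redundant turns are simply not carried forward.
    return "\n".join([
        "Decisions: " + "; ".join(decisions),
        "Constraints: " + "; ".join(constraints),
    ])

log = [
    {"kind": "chitchat",   "text": "hi there!"},
    {"kind": "decision",   "text": "use Postgres for the order store"},
    {"kind": "constraint", "text": "all PII must stay in the EU region"},
    {"kind": "chitchat",   "text": "thanks, looks good"},
]
print(synthesize(log))
```

The model receives only the distilled text, which is why token usage drops while decisions and constraints survive across regenerations.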
Secondary agent control
For complex tasks, particularly those involving large codebases or document collections, a single monolithic context is inefficient and error-prone. The Adaptive Context Engine employs secondary agents that operate as context selectors.
These secondary agents analyze the task at hand and determine which files, functions, or documents require full expansion and which can remain summarized or abstracted. This selective expansion allows the system to reason deeply about specific components without loading entire systems into context unnecessarily.
CORE Memory: Recursive context expansion at scale
The most advanced component of the Adaptive Context Engine is what we call Centrally-Operated Recursively-Expanded Memory (CORE Memory). This system addresses the challenge of working with large codebases or complex systems by creating associative trees of information.
CORE Memory automatically analyzes functions, files, and documentation to create hierarchical tags and associations. When the AI needs specific functionality, it can recursively search through these tagged associations rather than loading entire codebases into context. This allows for expansion on classes of files by tag or hierarchy, enabling manipulation of specific parts of code without context overload.
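The recursive-expansion idea can be sketched over a toy tag hierarchy. The tree layout and file names below are illustrative assumptions, not CORE Memory's internal representation: a query expands only the matching subtree rather than the whole index.

```python
def expand(tree: dict, path: list[str]) -> list[str]:
    """Return all files under the subtree addressed by `path` of tags."""
    node = tree
    for tag in path:
        node = node.get(tag, {})
    files: list[str] = []

    def walk(n: dict) -> None:
        for key, value in n.items():
            if key == "_files":
                files.extend(value)   # leaf: collect concrete files
            else:
                walk(value)           # recurse into child tags

    walk(node)
    return files

index = {
    "billing": {
        "invoices": {"_files": ["invoice.py", "pdf_render.py"]},
        "payments": {"_files": ["stripe_client.py"]},
    },
    "auth": {"_files": ["session.py"]},
}
print(expand(index, ["billing", "payments"]))  # only the payments files
```

Expanding `["billing"]` pulls in the whole billing subtree, while `["billing", "payments"]` loads a single file's worth of context, which is the point: depth of reasoning without loading the entire codebase.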
A production-grade system
Infinite Memory and the Adaptive Context Engine were built specifically for production environments, not research demos. Several design principles differentiate them from experimental context management approaches.
Self-managing context
The system is capable of operating across hundreds of documents or files while maintaining high accuracy. In production deployments, it consistently handles more than 250 documents without degradation while still achieving accuracy levels approaching 98%. This is accomplished through selective expansion, continuous pruning, and adaptive optimization rather than brute-force context injection.
Continuous optimization
The Adaptive Context Engine learns from real-world usage. It tracks which context selections lead to successful outcomes and which lead to errors or inefficiencies. Over time, this feedback loop allows the system to refine its attention strategies automatically, reducing hallucinations and improving relevance without manual intervention.
Integration flexibility
The architecture is designed to integrate with existing codebases, data stores, and foundation models. It does not require retraining models or rewriting systems. Instead, it acts as an orchestration layer that enhances reliability and performance across diverse environments.
Real-world applications
Together, Infinite Memory and the Adaptive Context Engine enable capabilities that are difficult or impossible with traditional context management approaches.
Extended conversations
There are no artificial limits on conversation length or complexity. Context persists indefinitely, supporting long-running workflows and evolving requirements without loss of continuity.
Deep code understanding
The system can reason over large, complex codebases while maintaining awareness of architectural intent, historical decisions, and prior modifications.
Learning from failure
Failures are not discarded. The system retains memory of past errors, corrections, and edge cases, allowing it to avoid repeating mistakes and to improve over time.
Cross-session continuity
Context persists across sessions, users, and environments. This allows AI systems to behave consistently and predictably even as usage patterns evolve.
Architectural benefits
Empromptu’s approach with Infinite Memory and the Adaptive Context Engine offers several advantages over traditional context management techniques.
- Scalability without linear cost growth
- Improved reasoning accuracy under real-world constraints
- Adaptability based on actual usage rather than static rules
- Compatibility with existing AI infrastructure
Most importantly, it reframes context not as a hard constraint, but as an intelligent resource that can be managed, optimized, and leveraged strategically.
As AI systems move deeper into production environments, context management has become the defining challenge for reliability and trust. Infinite Memory and the Adaptive Context Engine represent a shift away from brittle prompt-based approaches toward a more resilient, system-level solution. By treating memory, attention, and context selection as first-class infrastructure, it becomes possible to build AI applications that scale in complexity without sacrificing accuracy.
The future of applied AI will not be defined by larger context windows alone, but by architectures that understand what matters and when.
—
New Tech Forum provides a venue for technology leaders—including vendors and other outside contributors—to explore and discuss emerging enterprise technology in unprecedented depth and breadth. The selection is subjective, based on our pick of the technologies we believe to be important and of greatest interest to InfoWorld readers. InfoWorld does not accept marketing collateral for publication and reserves the right to edit all contributed content. Send all inquiries to doug_dineley@foundryco.com.
