Final training of AI models is a fraction of their total cost 27 Mar 2026, 5:08 pm

AI models cost a lot more to develop than you may think. AI research company Epoch AI has set out all the costs of building a new AI model, and in doing so explains why AI companies are so concerned about perceived threats to their intellectual property.

It has looked into this before: Last year, it estimated that of OpenAI’s $5 billion expenditure on R&D, only about 10 percent went on the final training runs, with the majority going on scaling, synthetic data generation, and basic research.

At the time, Epoch was unsure whether this was a peculiarity of OpenAI, but now two Chinese companies, MiniMax and Z.ai, have also disclosed their R&D compute spending, and Epoch has found that, despite the differences in company size, final training runs are only a small part of the Chinese companies’ R&D expenditure too.

Epoch set out more detail about the issue. It said that if “most of the spending is exploration rather than execution, then a competitor who learns what works from the frontier could replicate the results for a fraction of the original cost.”

This has been a concern of US AI companies for some time. Google has already expressed concerns about intellectual property theft. And Anthropic has fingered MiniMax as a company that has sought to extract Claude’s capabilities to enhance its own offerings. It’s clear that any business looking to develop AI models is going to be committing to spend huge sums of money: The training is just a small part of it.


OpenAI adds plugin system to Codex to help enterprises govern AI coding agents 27 Mar 2026, 12:27 pm

OpenAI has introduced a plugin system for Codex, its AI-powered software engineering platform, giving enterprise IT teams a way to package coding workflows, application integrations, and external tool configurations into versioned, installable bundles that can be distributed or blocked across development organizations.

“We’re rolling out plugins in Codex,” OpenAI Developers, the company’s official developer account, posted on X. “Codex now works seamlessly out of the box with the most important tools builders already use, like Slack, Figma, Notion, Gmail, and more.”

Plugins are “installable bundles for reusable Codex workflows” that “make it easier to share the same setup across projects or teams,” according to documentation on OpenAI’s developer portal. Each bundle can contain skills, which the documentation describes as prompts that the Codex agent can discover and execute, along with optional application integrations and Model Context Protocol server configurations that give the agent access to remote tools or shared context.

A governance layer for agentic AI

How those bundles are distributed and governed is controlled through a separate policy layer, the documentation said.

Organizations can define plugin catalogs, called marketplaces, in JSON files scoped either to a repository or to an individual developer’s environment. Each plugin entry carries an installation policy with values including “INSTALLED_BY_DEFAULT,” “AVAILABLE,” and “NOT_AVAILABLE,” giving administrators the ability to push, restrict, or block plugins across the developer workforce, the document added. Authentication behavior is configurable at the policy level as well.
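As a rough illustration of that policy layer, a repository-scoped marketplace file might look like the sketch below. Only the three installation-policy values are taken from the documentation cited above; the file layout, plugin names, and other field names are assumptions for illustration, not a confirmed schema.

```json
{
  "plugins": [
    {
      "name": "org-slack-workflows",
      "version": "1.2.0",
      "installPolicy": "INSTALLED_BY_DEFAULT"
    },
    {
      "name": "figma-design-sync",
      "version": "0.9.1",
      "installPolicy": "AVAILABLE"
    },
    {
      "name": "experimental-gmail-agent",
      "installPolicy": "NOT_AVAILABLE"
    }
  ]
}
```

In a catalog shaped like this, an administrator would push the first plugin to every developer by default, let developers opt into the second, and block the third outright.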

The plugin feature is the latest in a run of enterprise-focused additions to Codex since OpenAI announced the platform’s general availability in October 2025, when it said Cisco had reported pull request review times falling by as much as 50% after deployment. Admin tooling released at the same time gave ChatGPT Business, Edu, and Enterprise customers environment controls, usage analytics dashboards, and managed configuration options for the Codex CLI and IDE extension.

“Centralized control over which plugins are permitted, blocked, or deployed by default directly addresses concerns around security, compliance, and operational consistency,” said Charlie Dai, VP and principal analyst at Forrester. “It aligns AI agents with existing IT governance models rather than bypassing them.”

Adoption will be gradual, Dai said. “While technical tooling is advancing quickly, most enterprises will adopt this incrementally, led by platform engineering and developer productivity teams,” he said.

Agent behavior as managed infrastructure

Beyond the pace of adoption, Dai said the plugin system signals a broader shift in how enterprises are expected to manage AI-assisted development.

“By encapsulating standards, workflows, and tool access into versioned artifacts, organizations elevate AI-assisted development from ad hoc usage to managed infrastructure,” he said.

That distinguishes Codex from its main rivals. GitHub Copilot Extensions, which reached general availability in early 2025, lets developers invoke third-party tools from Copilot Chat inside Visual Studio Code, JetBrains IDEs, and GitHub.com, with a public marketplace hosting extensions from vendors including Docker, Sentry, and Perplexity. The emphasis is on contextual tool access during chat sessions rather than governing agent behavior at scale.

Cursor, another rival, launched its own plugin marketplace in February. The company expanded it this month, adding more than 30 integrations from partners including Atlassian, Datadog, and GitLab, according to Cursor’s changelog. Teams and Enterprise administrators can also create private marketplaces for controlled distribution.

Anthropic has moved in a similar direction, introducing workflow automation plugins for its Claude Cowork platform earlier this year.

“Compared with GitHub Copilot or Cursor, OpenAI is extending beyond policy enforcement into behavioral standardization,” Dai said. “Competitors focus primarily on permissions and guardrails; Codex begins to formalize execution patterns at scale.”

The missing third-party ecosystem

That behavioral standardization, however, has a notable constraint for now.

OpenAI has not opened self-serve publishing to its official plugin directory. “Adding plugins to the official Plugin Directory is coming soon,” the documentation said. “Self-serve plugin publishing and management are coming soon.” Organizations are limited for now to private marketplaces scoped to a repository or to an individual developer’s environment.

On the other hand, GitHub’s marketplace has been open to third-party builders since early 2025. Cursor’s marketplace already lists more than 30 external partners. OpenAI’s directory so far contains only plugins curated by the company itself.

“Long-term platform stickiness will depend on a curated third-party ecosystem that expands capability breadth and accelerates innovation,” Dai said. “Mature enterprises will expect audited, interoperable plugins for domain-specific tooling and regulated workflows. Without this external ecosystem, Codex risks limited extensibility beyond core engineering use cases.”


Anthropic throttles Claude subscriptions to meet capacity 27 Mar 2026, 11:48 am

Anthropic has started limiting usage across its Claude subscriptions to cope with rising demand that is stretching its compute capacity.

“To manage growing demand for Claude we’re adjusting our 5 hour session limits for free/Pro/Max subs during peak hours. Your weekly limits remain unchanged,” Thariq Shihipar, a member of Anthropic’s technical staff, wrote in a post on X.

“During weekdays between 5am–11am PT / 1pm–7pm GMT, you’ll move through your 5-hour session limits faster than before,” Shihipar added.

In effect, the change concentrates tighter usage controls during peak global working hours, when demand from both enterprise and individual users is highest.

The rationale: By accelerating how quickly users hit their session limits within these windows, Anthropic redistributes access to prevent system overloads while preserving overall weekly usage quotas.

Pro users affected

In his post, Shihipar was referring to Claude’s subscription plans, which carry usage limits; Claude’s API plans are unaffected by the change.

Those usage limits, according to the model provider, control how much a user can interact with Claude over a specific time period.

“Think of this as your ‘conversation budget’ that determines how many messages you can send to Claude, or how long you can work with Claude Code, before needing to wait for your limit to reset,” the documentation reads.

It is worth noting that Anthropic doesn’t define pricing for its subscription plans as clearly as for its API-based plans, whose pricing depends on multiple factors, including base input tokens, cache writes, and output tokens.

The change in usage limits, Shihipar later wrote, will affect approximately “7% of users”, particularly on Pro tiers. As a workaround, he recommended running “token-intensive background jobs” during “off-peak hours” to stretch session limits.

Push to adopt API-based plans?

Analysts say the move could create issues for some users.

“The impact is largely limited to individual users, prosumers, and small teams using Claude via subscription plans, where usage caps and throttling are expected to manage shared compute and costs,” said Pareekh Jain, principal analyst at Pareekh Consulting.

Enterprises, Jain added, are mostly insulated, as they typically rely on API-based consumption or dedicated contracts.

That said, Jain notes that power users in enterprises would also be affected, slowing experimentation and pilots, which he says could be an Anthropic strategy to push teams toward API adoption.

That push, according to Forrester VP and principal analyst Charlie Dai, could mean more guaranteed revenue for the model provider.

Offering a contrarian view on how the change affects large enterprises, Greyhound Research chief analyst Sanchit Vir Gogia pointed out that most enterprises are not operating in a clean, API-only model and, in reality, usage is fragmented across subscription tiers, team environments, developer tooling, and API integrations.

“It is within this blended environment that the impact begins to surface. Subscription layers are no longer peripheral. They power real workflows, particularly in development, analytics, and rapid execution scenarios. When those layers become inconsistent during peak demand, enterprise productivity is affected indirectly but meaningfully,” Gogia said.

In fact, Gogia, too, sees the change forcing enterprises to choose API-based plans to ensure productivity: “Enterprises are entering a phase where performance consistency is no longer assumed. It must be architected, negotiated, and paid for. If demand continues to outpace infrastructure capacity, this segmentation will become more explicit.”

More so because analysts see Anthropic’s rivals and other vendors taking similar routes in terms of balancing usage with capacity.

“Since all major vendors are either introducing or will introduce similar constraints, impacted users may not get relief by moving to another vendor platform. They would be effectively moving across different forms of limits rather than escaping them entirely,” Jain said.

Limited backlash due to limited options

That same logic is also why Jain sees limited backlash from users, especially when it comes to switching vendors, as a response to the policy change from Anthropic.

Echoing Jain, Avasant research director Chandrika Dutt pointed out that the capacity-throttling measures being implemented by model providers closely mirror strategies employed by hyperscalers during the early days of cloud computing.

Cloud providers, such as Amazon Web Services, Microsoft Azure, and Google Cloud, Dutt said, faced similar capacity constraints and instead of scaling instantly, they introduced mechanisms to shape demand and smooth consumption patterns, such as reserved capacity models and pricing incentives to shift usage to off-peak periods.


Edge clouds and local data centers reshape IT 27 Mar 2026, 9:00 am

For more than 10 years, enterprise cloud strategy has relied on centralizing as much as possible—shifting workloads from data centers, consolidating operations on hyperscale platforms, and leveraging economies of scale. This approach has reduced infrastructure sprawl, accelerated deployment, and provided nearly unlimited compute and storage. However, the next generation of digital systems increasingly interacts with regional regulations, real-time decision loops, and the physical world in general. These factors do not tolerate distance well. Smart traffic systems can’t wait for a round-trip to distant cloud regions. Industrial control systems can’t halt operations because a wide-area link is congested. AI-driven video analytics becomes costly and inefficient when every frame must be sent back to a centralized platform for inference. In these environments, it matters where the data is created and processed and where decisions are made.

The future of cloud computing is neither more nor less centralized. It is selectively distributed, with edge cloud and localized data centers becoming essential in situations where latency, sovereignty, and physical-world responsiveness matter most.

That is the real story behind the rise of edge cloud. It’s not hype, a complete reversal of cloud adoption, or a nostalgic return to on-premises infrastructure. Instead, what’s emerging is a more practical dual architecture: a centralized cloud for aggregation, model training, cross-region coordination, and platform services, paired with local infrastructure for time-sensitive processing, regional independence, and compliance-driven workloads.

Use cases for edge clouds

Edge cloud involves deploying compute, storage, and networking resources closer to users, devices, and data sources. It looks like telecom facilities at the metro edge, or micro data centers in hospitals, retail outlets, factories, or municipal centers. These localized data centers support workloads that benefit from proximity, embodying the regional computing principle of placing workloads where they are most operationally and economically effective.

The trend is accelerating because multiple forces are converging at once. Low-latency applications are moving from pilot projects to full production. AI is transitioning from centralized training to distributed inference. Data residency laws are becoming more specific and easier to enforce. Enterprises are also realizing that bandwidth is limited, and transmitting massive amounts of sensor, video, and telemetry data to a central cloud can often be a poor design choice hidden behind architectural simplicity.

Consider smart cities. Municipal systems are no longer limited to back-office software and basic public websites. City systems now include connected traffic lights, intelligent surveillance, environmental sensors, safety systems, transit monitoring, and energy efficiency platforms, all generating continuous streams of local data that require immediate responses. Detecting congestion, hazards, or emergency vehicle routes at intersections demands quick action. Relying on distant cloud analysis can delay responses, risking public safety.

The same logic applies in industrial settings. Connected factories increasingly use machine vision, predictive maintenance models, robotics, telemetry, and digital twins to boost throughput and minimize downtime. Much of that data has local value first and global value second. A detection model for defects running alongside a production line can stop defective output in real time. A centralized system can still gather data for fleet-wide analytics, training, and optimization, but it should not be on the critical path of every local decision. This is where edge cloud delivers tangible business value as a way to keep local operations fast, resilient, and cost-effective.

Healthcare can’t rely solely on a centralized cloud system. Regional setups depend on imaging, monitoring, connected devices, and patient-facing services. Some workloads must remain local because of privacy concerns, network limitations, or response time requirements. Hospitals need local computing for imaging, decision support, and operations that can’t risk WAN failures. At the same time, they require centralized platforms for analytics, model development, and data integration. Hybrid is the best operating model.

Retail demonstrates another vital aspect of edge: local processing for personalization, inventory, checkout, and analytics. Pushing all transactions to a central platform is costly, especially when business value is immediate and local. Stores that adapt staffing, promotions, or fulfillment in real time gain an edge. This doesn’t mean abandoning centralized platforms but rather extending them with localized execution.

Telecom providers, colocation operators, and cloud vendors recognize this opportunity. Telecom companies aim to monetize network proximity by converting metro infrastructure into application platforms. Colocation providers position regional facilities as neutral points for latency-sensitive workloads, data exchange, and multi-cloud interconnection. Hyperscale cloud vendors respond by expanding managed services through local zones, distributed appliances, and edge-specific platforms. Everyone strives to control the plane in a world where compute becomes increasingly decentralized.

When hype outruns architecture

Deploying edge infrastructure is easy to celebrate in strategy decks because it sounds modern and inevitable. However, operating it at scale is much less glamorous. Managing a centralized cloud region is already challenging, but having hundreds of distributed sites with hardware limitations, physical exposure, inconsistent connectivity, and varying operational maturity presents a completely different set of problems. The issue isn’t just deploying small clusters across many locations. It involves life-cycle management, security hardening, observability, orchestration, failover, and governance within an inherently fragmented estate.

Security complexity rises as each distributed site increases the attack surface. Remote, diverse infrastructure makes patching harder. Identity, certificates, and policies must be consistent across locations with varying staffing and controls. Many underestimate the operational burden, thinking edge is just cloud with shorter networks.

Observability remains a significant gap. Distributed systems fail in distributed ways, which rapidly multiplies blind spots. If enterprises cannot monitor what is happening across thousands of nodes, local clusters, gateways, and data pipelines, they are not truly operating at the edge—they are building up technical debt in smaller units.

Interoperability also remains underdeveloped. Despite vendor claims, many edge solutions are still too tightly linked to specific hardware stacks, connectivity methods, or cloud ecosystems. This creates lock-in risks exactly when enterprises seek greater architectural flexibility.

Edge advocates stress lower latency and better bandwidth, both of which provide real benefits. However, local infrastructure costs include capital, staffing, remote management, and maintenance. The case is strong if the workload genuinely needs local processing but weak if it’s adopted just because it sounds strategic. Running workloads at the edge without real-time capabilities, sovereignty, or resilience is often just expensive infrastructure rather than true innovation.

That is why enterprise leaders should resist the temptation to frame edge as the next universal destination for workloads. It is not. Some apps fit in centralized cloud regions, some belong in data centers, and others in localized facilities. The aim isn’t architectural purity but placement discipline. A helpful way to think about edge adoption in the next three to five years is to start with three questions:

  1. What decisions need to be made locally because of latency, safety, or user experience?
  2. What data should remain local because regulation, privacy, or economics make centralization a poor option?
  3. What operations must keep going even when connectivity to a centralized cloud is limited?

If a workload clearly benefits from one or more of those criteria, edge deserves serious consideration. If not, it probably fits better in a more centralized setup.
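The three questions above amount to a simple triage rule: if any one of them applies, the workload is an edge candidate; otherwise it defaults to a centralized placement. A minimal sketch of that rule, with field names of my own choosing rather than anything from the article:

```python
# Illustrative sketch only: encoding the three placement questions as a
# triage helper. The attribute names are assumptions, not an established API.
from dataclasses import dataclass

@dataclass
class Workload:
    latency_sensitive: bool    # Q1: must decisions be made locally (latency, safety, UX)?
    data_must_stay_local: bool # Q2: does regulation, privacy, or economics keep data local?
    must_survive_wan_loss: bool  # Q3: must operations continue if the cloud link drops?

def placement(w: Workload) -> str:
    """Return 'edge' if any of the three criteria apply, else 'centralized'."""
    if w.latency_sensitive or w.data_must_stay_local or w.must_survive_wan_loss:
        return "edge"
    return "centralized"

# A machine-vision defect detector on a production line hits all three criteria.
print(placement(Workload(True, True, True)))
# A monthly fleet-wide analytics job hits none of them.
print(placement(Workload(False, False, False)))
```

The point of writing it down this way is the default: centralization remains the answer unless a workload affirmatively earns its way to the edge.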

CIOs and architects should also avoid treating edge as a disconnected side project. The preferred model remains the integrated hybrid cloud. Centralized platforms are still ideal for data aggregation, long-term storage, model training, enterprisewide policies, and shared digital services. Edge is where execution occurs close to the source of interaction. More mature organizations will treat these as coordinated layers within one architecture, rather than opposing camps in an infrastructure debate.

The cloud market is evolving beyond the one-size-fits-all centralization model that characterized its early days. This is the cloud maturing. Smart cities, industrial systems, healthcare networks, telecom infrastructure, and low-latency digital services all point to the same truth: Proximity has become a crucial architectural factor that can no longer be overlooked.

Enterprises don’t need edge computing everywhere. They need a strategy for where it truly matters. The next stage of cloud architecture will reward organizations that recognize a simple truth: The most effective cloud is the one that intentionally distributes intelligence.


On the pleasures and dangers of open source Python 27 Mar 2026, 9:00 am

Announced at JavaOne, Project Detroit proposes to break down the walls between Java, Python, and JavaScript. Also in this report: Better ways to instrument your code with Python’s new built-in sampling profiler, another run at using AI locally to rework a Python project, and the question on everyone’s mind right now (surely): What does OpenAI really want with Astral?

Top picks for Python readers on InfoWorld

OpenAI buys Python tools builder Astral
Astral, the maker of uv, ty, and pyx, has a new home under the OpenAI umbrella. Is OpenAI demonstrating its commitment to maintaining tooling in the AI space, or is the purchase more of a power move?

I ran Qwen3.5 locally instead of Claude Code. Here’s what happened
Want to run an LLM on your own hardware for that at-home Claude Code or Copilot experience? You can, but it’ll be a bumpy ride. My takeaway? Maybe don’t let the AI run around unsupervised after dark.

Hands-on with the new sampling profiler in Python 3.15
Among Python 3.15’s best new features is a sampling profiler. See how it works in this guide to using the profiler to instrument your code and find bottlenecks with minimal performance impact.

Project Detroit, bridging Java, Python, JavaScript, moves forward
The once-dead, now-revived Detroit project aims to allow Java’s Foreign Function and Memory API to talk seamlessly to other language runtimes. The vision? More powerful mixing and matching of languages across domains.

More good reads and Python updates elsewhere

The slow collapse of MkDocs
The strange, ongoing saga of how a developer meltdown took out one of the most popular documentation tools for Python—with no clear successor in sight.

Comparing the typing spec conformance of Python type-checking tools
How well do tools like Pyright, Pyrefly, Mypy, Ty, and others conform to Python’s own type annotation specs? The answers range, surprisingly, from “very closely” to “just barely.”

The optimization ladder: All the ways to make Python faster
From replacing the runtime to integrating modules written in C or Rust, here’s an end-to-end rundown of ways to speed up Python for tasks that urgently need performance.

License laundering and the death of ‘clean room’
When someone rewrote a long-unmaintained Python library with an LLM, the original developer broke a decade-plus silence to object. What are the implications for open source?


Context Hub vulnerable to supply chain attacks, says tester 27 Mar 2026, 3:25 am

On the surface, the recent critique of a new tool called Context Hub by a developer who created an open-source alternative appears to be an illustration that the tool is vulnerable to misuse. But delve further and it serves as a far greater warning to AI developers of the downside of using non-authoritative sources of information.

Two weeks ago, Andrew Ng, founder of a Silicon Valley technical training firm called DeepLearning.AI, launched the product, which he stated in a LinkedIn post is an open tool that gives a coding agent the up-to-date API documentation it needs.

“Install it and prompt your agent to use it to fetch curated docs via a simple CLI,” the post reads. “Why this matters: Coding agents often use outdated APIs and hallucinate parameters. For example, when I ask Claude Code to call OpenAI’s GPT-5.2, it uses the older chat completions API instead of the newer responses API, even though the newer one has been out for a year. Context Hub solves this. Context Hub is also designed to get smarter over time.”

According to Ng, using Context Hub, agents can even annotate docs with notes. “If your agent discovers a workaround, it can save it and doesn’t have to rediscover it next session,” he said. “Longer term, we’re building toward agents sharing what they learn with each other, so the whole community benefits.”

Poisoning the project

However, on Wednesday, Mickey Shmueli, the developer of LAP, which he described as an “open source alternative to Context Hub,” released a Context Hub supply chain attack Proof of Concept (PoC) on Github.

He explained the problem he’d discovered: Context Hub contributors submit docs as GitHub pull requests, maintainers merge them, and agents fetch the content on demand, “[but] the pipeline has zero content sanitization at every stage.”

He wrote that the project “[has] published more than 1,000 API documents, and added a feature letting agents annotate docs for other agents. We tested whether a poisoned document in that registry could silently compromise developer projects.”

In the test, he wrote, “we created realistic poisoned docs containing fake dependencies and served them through an … MCP server inside isolated Docker containers.” He emphasized, however, that no poisoned content was uploaded to Context Hub’s registry; the tests were run locally on an MCP server configured to serve pre-built output from disk. But from the agent’s perspective, the experience was identical to fetching docs from the live registry.

The result: “When AI coding assistants fetched the docs, [Claude] Haiku silently wrote the fake package into requirements.txt in 100% of runs without ever mentioning it in its text output. A developer reading the assistant’s response would see nothing suspicious, but their project is poisoned.”

Only Claude Haiku, Sonnet, and Opus were tested; Opus fared best, Haiku worst. Results for other models such as GPT-4, Gemini, and Llama may differ, Shmueli noted.

Agentic AI likened to ‘high-speed idiot’

Responding to Shmueli’s findings, David Shipley, CEO of Beauceron Security, said Thursday, “[it is] time to have a moment of pure honesty about agentic AI. At its best, it’s a gullible, high-speed idiot that occasionally trips on hallucinogenic mushrooms that you’re giving the ability to act on your behalf. Stop and think about that. Would you knowingly hire a human that fit that description and then give them unsupervised access to code or your personal banking? I wouldn’t.”

LLM-based generative AI tools, he said, “do not have the capacity for critical thought or reasoning, period. They’re probability math and tokens. They’re faking reasoning by retuning and iterating prompts to reduce the chances of being wrong.” 

That is not critical thinking, Shipley said, noting, “what was true in the 1950s remains true today: Garbage in, garbage out.”

People, he said, “built stochastic parrots that can be manipulated by sweet talking to them, and they call it prompt engineering. Dudes, it’s social engineering. And the more the AI industry keeps telling us about the Emperor’s New Clothes, the dumber we all look for believing them.”

Supply chain attacks a ‘serious and scalable threat’

Justin St-Maurice, technical counselor at Info-Tech Research Group, echoed Shipley’s concerns. He noted, “supply chain attacks are a serious and scalable threat, and what we’re seeing this week is a good example of why. The vulnerability isn’t necessarily in the application itself. It’s in the dependency chain, the shared libraries, the package repositories, all the common infrastructure these systems are built on top of.”

He added, “we’ve seen this pattern before, many times. A single flaw gets introduced upstream, and suddenly a huge range of downstream systems are exposed, often before anyone has caught it. What’s different now is the speed at which AI-assisted development is moving. Developers are pulling in shared dependencies, using AI-generated code, and moving fast. If something gets introduced into one of those common sources, it can propagate across a wide range of systems very quickly.”

And in an AI context, said St-Maurice, “the impact isn’t just passive. These systems can consume those inputs and act on them, which makes the potential impact a lot bigger.”

He noted, “the LiteLLM situation and what’s happening with Context Hub are two examples in the same week. It’s definitely worth paying attention to. Vibe coders and people building quickly on top of AI tools need to think seriously about how they’re validating dependencies and managing upstream risk. Relying on prompts alone won’t be enough to manage security risks.”


Visual Studio Code previews chat customizations editor 26 Mar 2026, 10:55 pm

Just-released Version 1.113 of Microsoft’s Visual Studio Code editor emphasizes improvements ranging from chat customizations to support for MCP (Model Context Protocol) in Copilot CLI and Claude agents.

Released March 25, VS Code can be downloaded for Windows, Linux, or Mac via the VS Code download webpage. VS Code 1.113 closely follows VS Code Versions 1.111 and 1.112 as part of Microsoft’s new plan to issue releases on a weekly schedule instead of monthly.

The chat customizations editor previewed in VS Code 1.113 has a centralized UI for creating and managing all chat customizations in one place. The editor organizes customization types into separate tabs, such as prompt files, custom instructions, custom agents, and agent skills. It also provides an embedded code editor with syntax highlighting and validation. Developers can create customizations from scratch or use AI to generate initial content based on a project, according to Microsoft.

Elsewhere in Version 1.113, MCP servers registered in VS Code are bridged to Copilot CLI and Claude agents. This applies to user-defined servers and servers defined in a workspace via mcp.json files. Previously, MCP servers configured in VS Code were available only to local agents running in the editor. For Claude session listing, VS Code now adopts the official API from the Claude agent SDK to list out sessions and their messages. Previously, Microsoft relied on parsing Claude JSON files on disk, which risked falling out of sync if Claude changed the files’ structure. Issues with the Claude agent not showing all sessions or messages should now be resolved.
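For readers unfamiliar with the workspace files the release notes refer to, a minimal mcp.json registering a single server might look like this. The server name and command here are illustrative examples, not drawn from the release notes:

```json
{
  "servers": {
    "github": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-github"]
    }
  }
}
```

With the bridging in 1.113, an entry like this, checked into a workspace, would now be visible to Copilot CLI and Claude agents as well as to agents running inside the editor.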

Also in VS Code 1.113:

  • Subagents now can invoke other subagents, enabling more complex multistep workflows. Previously, subagents were restricted from calling other subagents to prevent infinite recursion.
  • When working with images in chat, whether screenshots attached to a request or images generated by the agent via tool calls, developers can now select any image attachment to open it in a full image viewer.
  • Models that support reasoning, such as Claude Sonnet 4.6 and GPT-5.4, now show a Thinking Effort submenu directly in the model picker. This can control how much reasoning the model applies to each request without navigating to VS Code settings.
  • Browser tab management has been improved.
  • VS Code now ships with new default themes: VS Code Light and VS Code Dark. These themes are designed to provide a modern, fresh look while maintaining the familiarity and usability of the previous default “Modern” themes.


Databricks pitches Lakewatch as a cheaper SIEM — but is it really? 26 Mar 2026, 12:42 pm

Databricks has previewed a new open, agentic security information and event management (SIEM) platform named Lakewatch that signals its first deliberate step beyond data warehousing into security analytics.

The data warehouse provider is pitching Lakewatch as a lower-cost alternative to traditional security tools, arguing that consolidating security analytics into its data platform can reduce overall spend.

“Right now, existing solutions’ (rival SIEMs) ingestion costs force teams to discard up to 75% of their data, so while attackers can use AI to attack anywhere, defenders only see a fraction of their own data. Our goal with Lakewatch is to close this gap… because our lakehouse architecture is uniquely built to handle massive amounts of data cheaply,” Andrew Krioukov, general manager of Lakewatch at Databricks, told InfoWorld.

“Unlike other SIEM platforms, we do not charge based on the amount of data ingested or stored, but rather on the compute that security teams use. This allows organizations to achieve up to an 80% reduction in total cost of ownership (TCO) while maintaining years of hot, queryable data for compliance and hunting,” Krioukov added.

Analysts, too, agree with Krioukov, but only in part.

“The cost problem in SIEM is real. Many organizations often are forced to discard data because ingestion pricing makes full retention prohibitively expensive,” said Stephanie Walter, leader of the AI stack at HyperFRAME Research.

In contrast, Lakewatch can reduce costs in some cases, especially if enterprises want to retain large amounts of data, echoed Akshat Tyagi, associate practice leader at HFS Research.

However, analysts warned that savings may be less straightforward, with costs potentially shifting to compute and data processing rather than disappearing altogether.

“Costs don’t disappear; they shift. If usage isn’t controlled, compute can add up quickly. It can be more efficient, but not automatically cheaper,” said Robert Kramer, principal analyst at Moor Insights & Strategy.

Beyond costs, though, analysts say Lakewatch represents a structural shift in how enterprises conduct security operations, especially analytics.

The platform stitches together components such as Unity Catalog for governance and access control, Lakeflow Connect for ingesting and streaming security data, and the Open Cybersecurity Schema Framework (OCSF) to standardize disparate log formats, effectively turning the lakehouse into a centralized system of record for security operations, Walter said.

The added context from all the combined data in the lakehouse is also likely to act as an accelerant for helping enterprises automate security operations at scale with agents, Walter added.

That said, translating these benefits into near-term buy-in from CIOs and CISOs could prove challenging for Databricks.

“This is more likely to complement existing SIEMs than replace them. Early adoption will come from large enterprises already committed to Databricks, especially those seeking flexibility or cost control. It aligns with existing investments but remains new territory for operational security teams. Building trust through proven use cases will be key,” Kramer said.

Even so, Databricks is signaling serious intent with the acquisitions of two cybersecurity startups, Antimatter and SiftD.ai, which analysts say point to its broader security roadmap. “This looks like the foundation of a long-term security portfolio, not a one-off SIEM feature. Acquiring security-focused companies is less about adding features and more about importing credibility. Security buyers trust vendors with domain depth, not just infrastructure scale,” HyperFRAME Research’s Walter said.


Google targets AI inference bottlenecks with TurboQuant 26 Mar 2026, 10:22 am

Google says its new TurboQuant method could improve how efficiently AI models run by compressing the key-value cache used in LLM inference and supporting more efficient vector search.

In tests on Gemma and Mistral models, the company reported significant memory savings and faster runtime with no measurable accuracy loss, including a 6x reduction in memory usage and an 8x speedup in attention-logit computation on Nvidia H100 hardware.

For developers and enterprise AI teams, the technology offers a path toward reduced memory demands and better hardware utilization, along with the possibility to scale inference workloads without a matching jump in infrastructure costs.

According to Google, TurboQuant targets two of the more expensive components in modern AI systems, specifically the key-value (KV) cache used during LLM inference and the vector search operations that underpin many retrieval-based applications.

By compressing these workloads more aggressively without affecting output quality, TurboQuant could allow developers to run more inference jobs on existing hardware and ease some of the cost pressure around deploying large models.
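Google has not published TurboQuant’s internals in this announcement, but the general idea behind this class of techniques — storing cached tensors at lower precision — can be illustrated with a generic 8-bit quantization round trip. This is a sketch of ordinary scale-based quantization, not Google’s algorithm; all names and values are illustrative:

```java
public class QuantizationSketch {
    // Quantize floats to signed 8-bit ints using a single per-tensor scale.
    static byte[] quantize(float[] values, float scale) {
        byte[] q = new byte[values.length];
        for (int i = 0; i < values.length; i++) {
            // Round to the nearest step and clamp to the int8 range.
            q[i] = (byte) Math.max(-127, Math.min(127, Math.round(values[i] / scale)));
        }
        return q;
    }

    // Dequantize back to float; some precision is lost, but memory use drops 4x.
    static float[] dequantize(byte[] q, float scale) {
        float[] out = new float[q.length];
        for (int i = 0; i < q.length; i++) out[i] = q[i] * scale;
        return out;
    }

    public static void main(String[] args) {
        float[] kvSlice = {0.12f, -0.53f, 0.98f, -0.07f};  // stand-in for a KV-cache slice
        float maxAbs = 0f;
        for (float v : kvSlice) maxAbs = Math.max(maxAbs, Math.abs(v));
        float scale = maxAbs / 127f;                       // map the largest value to 127

        float[] restored = dequantize(quantize(kvSlice, scale), scale);
        for (int i = 0; i < kvSlice.length; i++) {
            System.out.printf("%.2f -> %.4f%n", kvSlice[i], restored[i]);
        }
    }
}
```

The round-trip error is bounded by half the scale step, which is why simple schemes like this preserve accuracy well when values are evenly distributed; production methods add per-channel scales and outlier handling on top.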

Significance in enterprise deployments

Whether this amounts to a meaningful breakthrough for enterprise AI teams will depend on how well the technique performs outside Google’s own tests and how easily it can be integrated into production software stacks.

“If these results hold in production systems, the impact is direct and economic,” said Biswajeet Mahapatra, principal analyst at Forrester. “Enterprises constrained by GPU memory rather than compute could run longer context windows on existing hardware, support higher concurrency per accelerator, or reduce total GPU spend for the same workload.”

Sanchit Vir Gogia, chief analyst at Greyhound Research, said the announcement addresses a real but often overlooked constraint in enterprise AI systems.

“Let’s call this what it is,” Gogia said. “Google is going after one of the most annoying, least talked about problems in AI systems today. Memory blow-up during inference. The moment you move beyond toy prompts and start working with long documents, multi-step workflows, or anything that needs context to persist, memory becomes the constraint.”

These gains matter because KV cache memory rises in step with context length. Any meaningful compression can directly let developers handle longer prompts, larger documents, and more persistent agent memory, all without having to redesign the underlying architecture.

However, Gogia cautioned that efficiency gains may not translate into lower spending.

“Efficiency gains rarely reduce spend,” Gogia said. “They increase usage. Teams don’t save money. They stretch systems further. Longer context, more queries, more experimentation. So the impact is real, but it shows up as scale, not savings.”

LLM inference to benefit

Google is positioning TurboQuant as a technology that could improve both LLM inference and vector search. Some analysts say the more immediate payoff is likely to come in LLM inference.

“The KV cache problem is already an acute cost and scaling limiter for enterprises deploying chat, document analysis, coding assistants, and agentic workflows, and TurboQuant directly compresses that runtime memory without retraining or calibration,” Mahapatra said. “Vector search also benefits from the same underlying compression techniques, but most enterprises already manage vector memory through sharding, approximate search, or storage tiering, which makes the pain less immediate.”

That distinction matters because inference memory pressure tends to hit enterprises where it hurts most: GPU sizing, latency, and cost per query. In other words, the problem is not theoretical. It affects the economics of running AI systems at scale today.

Gogia, however, sees the initial impact playing out differently, with retrieval and vector search systems likely to benefit first.

“Retrieval systems are modular,” Gogia said. “You can isolate them, tweak them, test them without breaking everything else. And they already depend on compression to function at scale. So any improvement here hits immediately. Storage footprint comes down. Index rebuilds get faster. Refresh cycles improve. That is operational value, not theoretical value.”

Gogia said Google’s announcement represents a solid piece of engineering that addresses a real problem and could deliver meaningful benefits in the right contexts. However, he added that it does not change the underlying constraints, noting that AI systems remain limited by infrastructure, power, cost, and the complexity of making all the components work together.


A data trust scoring framework for reliable and responsible AI systems 26 Mar 2026, 9:00 am

Digital transformation today is more than just automating tasks or speeding up calculations. It’s reshaping how we make decisions. People used to rely on their own experience and negotiation skills, but now algorithms are often taking over. While this shift improves efficiency and scale, it also introduces a critical challenge: managing knowledge reliably across automated decision systems. If these systems end up using data that isn’t accurate, balanced or well-organized, mistakes and inequality can spread instead of smart solutions.  

Artificial intelligence is only as good as the data it gets and the goals it’s built to reach. To create AI that people really trust, we need to make sure our data is reliable and fair. That’s why a data trust scoring framework matters. It helps turn ideas about fairness and responsibility into clear ratings for the data sets that power AI systems.  

From human trust to algorithmic reliance 

Trust is often viewed as a personal bond, where one person depends on another’s abilities, goodwill and honesty. When trust is broken in relationships, it feels like betrayal rather than just disappointment, because trust carries deeper expectations.  

When considering AI, the situation becomes more complex. Many people attempt to apply human concepts of trust to machines, but this proves challenging. Skills can be assessed through accuracy, while safety measures substitute for goodwill. Integrity is more difficult to evaluate since machines lack moral judgement, so attention turns toward transparency and fairness within these systems. Recent studies recommend viewing trustworthy AI in social terms, considering its benefits for institutions instead of just focusing on the technology itself.  

A practical strategy is to distinguish reliance from trust. Reliance involves expecting a system to perform based on evidence and previous results. True trust should be reserved for individuals and organizations capable of accepting responsibility. Therefore, data trust scoring ought to communicate clearly what AI systems are able and unable to accomplish, which helps users rely on them with justified confidence.  

Mapping human trust attributes to data and models 

If traditional trust is grounded in ability, benevolence and integrity, those ideas can be translated into an algorithmic setting as follows: 

  • Ability becomes technical performance and robustness. How accurate is the model on representative data, and how resilient is it under distribution shift or adversarial manipulation? 
  • Benevolence becomes alignment with human safety, rights and organizational purpose. Does the system’s behavior track the values it is supposed to embody, rather than merely its loss function? 
  • Integrity becomes process transparency, procedural fairness and traceability. Can one reconstruct how data was collected, processed and used? Can one explain what the model is doing in ways that are meaningful to affected stakeholders? 

These translations are not perfect, but they create a bridge between relational trust and system level governance. They also motivate a more fine-grained view of dataset fitness, which is where the seven-dimensional taxonomy enters. 

A 7-dimensional taxonomy of dataset fitness 

The data trust scoring framework rates datasets across seven areas, using clear rubrics and producing a composite score for easier understanding:  

  1. Accuracy: Checks if data matches true events, focusing on correct labels and avoiding systematic errors. Inaccurate labels can mislead models at scale.  
  2. Completeness: Looks for missing data or gaps. Incomplete datasets, such as missing transaction records, skew model outcomes and risk estimates.  
  3. Freshness: Assesses if data is up to date. Old data can misrepresent current trends, so this dimension highlights the importance of recent information.  
  4. Bias risk: Flags built-in prejudices, from sampling bias to historical discrimination. This ensures fairness is addressed from the start, not as an afterthought.  
  5. Traceability: Focuses on clear records from data collection to final use. Without tracking, it’s hard to analyze failures or make corrections.  
  6. Compliance: Evaluates alignment with regulatory and policy requirements, including privacy obligations under regimes such as GDPR, sector-specific mandates and emerging AI standards. The NIST AI Risk Management Framework has become a widely referenced guide for mapping, measuring and managing AI risks, while the EU AI Act is moving toward legally enforceable obligations for data quality and transparency in high-risk systems. 
  7. Contextual clarity: Concerns how well the dataset’s scope, limitations and intended uses are documented. Developers need enough metadata and narrative context to understand where the data is reliable and where it is not. This dimension guards against the silent repurposing of data in settings for which it was never appropriate. 

Each dimension is scored, normalized and then combined into an overall trust score. One common aggregation formula is: 

Trust_Score = Σ (Weight_i × Dimension_Score_i), for i = 1 to 7

Where Dimension_Score_i is the normalized score for each of the seven dimensions, and Weight_i is the importance factor for that dimension derived from stakeholder analysis. 
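The aggregation itself is a simple weighted average. A minimal Java sketch, with hypothetical weights and normalized scores chosen purely for illustration:

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class TrustScoreDemo {
    // Weighted average of normalized dimension scores; each value pair is {weight, score}.
    static double trustScore(Map<String, double[]> dims) {
        double weighted = 0.0, totalWeight = 0.0;
        for (double[] pair : dims.values()) {
            weighted += pair[0] * pair[1];
            totalWeight += pair[0];
        }
        return weighted / totalWeight;   // normalize so weights need not sum to 1
    }

    public static void main(String[] args) {
        Map<String, double[]> dims = new LinkedHashMap<>();
        dims.put("accuracy",          new double[]{0.25, 0.9});
        dims.put("completeness",      new double[]{0.15, 0.8});
        dims.put("freshness",         new double[]{0.10, 0.7});
        dims.put("biasRisk",          new double[]{0.20, 0.6});
        dims.put("traceability",      new double[]{0.10, 0.9});
        dims.put("compliance",        new double[]{0.15, 1.0});
        dims.put("contextualClarity", new double[]{0.05, 0.8});
        System.out.printf("Composite trust score: %.3f%n", trustScore(dims));
    }
}
```

In practice the weights would come from the stakeholder analysis the framework describes, and the composite score would be recomputed whenever a dimension is rescored.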

Semantic integrity and generative AI 

Traditional data quality principles were developed with structured data in mind. Large Language Models and other generative systems challenge these assumptions. They are trained on massive, heterogeneous corpora, yet can generate outputs that look fluent while being factually or logically incorrect. 

To address this, the framework introduces semantic integrity constraints. These are declarative rules that extend classical database integrity constraints into the semantic domain. At a high level, they fall into two broad categories: 

  • Grounding constraints, which require that generated content be consistent with authoritative sources. This can be implemented through retrieval augmented generation, constrained decoding or post hoc validation against trusted knowledge bases. 
  • Soundness constraints, which evaluate whether the model’s reasoning is logically coherent. This is particularly relevant when LLMs are used to generate explanations, summaries of complex evidence or structured outputs such as JSON objects and code. 

Metrics like SEMSCORE, which leverage neural embeddings to approximate human judgments of semantic similarity, and more structurally aware measures such as STED, which balance semantic flexibility against syntactic precision, offer partial but useful tools for quantifying semantic integrity in practice. 

Privacy preserving computation and mathematical trust 

A key component of data trust is the protection of individual privacy. Traditional anonymization methods have proven vulnerable to reidentification attacks, especially when datasets are linked or auxiliary information is available. Differential privacy offers a more rigorous alternative. As summarized in the computational privacy literature, the core idea is to limit how much influence any single individual can have on the output of a computation. 

Formally, for two datasets D1 and D2 that differ in exactly one record, and for a randomized mechanism K, epsilon differential privacy requires that for every possible output set S: 

Pr[K(D1) ∈ S] ≤ exp(ε) × Pr[K(D2) ∈ S]

The parameter epsilon quantifies the privacy loss. Smaller values mean stronger privacy guarantees, but they also require more noise to be injected into the computation, which can reduce utility. 
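A common way to satisfy this definition for numeric queries is the Laplace mechanism, which adds noise scaled to the query’s sensitivity divided by epsilon. A minimal Java sketch, with illustrative parameter values:

```java
import java.util.Random;

public class LaplaceMechanism {
    // Draw Laplace(0, b) noise via inverse-transform sampling.
    static double laplaceNoise(double b, Random rng) {
        double u = rng.nextDouble() - 0.5;               // uniform in [-0.5, 0.5)
        return -b * Math.signum(u) * Math.log(1 - 2 * Math.abs(u));
    }

    // Release a query result with epsilon-DP: noise scale = sensitivity / epsilon.
    static double privateRelease(double trueValue, double sensitivity,
                                 double epsilon, Random rng) {
        return trueValue + laplaceNoise(sensitivity / epsilon, rng);
    }

    public static void main(String[] args) {
        Random rng = new Random(42);
        // A counting query has sensitivity 1: one individual changes the count by at most 1.
        double noisy = privateRelease(1000, 1.0, 0.5, rng);
        System.out.printf("Noisy count: %.2f%n", noisy);
    }
}
```

Halving epsilon doubles the noise scale, which is the utility trade-off the paragraph above describes: stronger guarantees, noisier answers.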

Kanonymity provides a more classical framework. It demands that each record in a released dataset be indistinguishable from at least K − 1 others with respect to a set of quasi-identifiers. While Kanonymity is vulnerable to various attacks if used alone, it remains useful when combined with additional safeguards, especially for generating synthetic datasets that preserve statistical properties while reducing the risk of reidentification. 

In the trust scoring framework, privacy preserving techniques contribute directly to the compliance and traceability dimensions and indirectly to bias and contextual clarity. 

Regulatory alignment and operational guardrails 

Data trust cannot be considered in isolation from the regulatory environment. Organizations deploying AI systems are increasingly expected to demonstrate not just that their models perform well, but that they manage risk responsibly across the entire lifecycle. 

The NIST AI RMF offers a voluntary, but influential, structure for doing this. It organizes AI risk management into four functions: govern, map, measure and manage. The EU AI Act, by contrast, is a binding legal instrument. It classifies AI applications by risk level and imposes specific obligations on high-risk systems, including documentation of data quality, transparency measures and post-deployment monitoring. Some proposed implementations even contemplate minimum transparency index thresholds for models that affect fundamental rights. 

A data trust scoring framework fits naturally into this landscape. It provides a concise, quantifiable summary of data fitness that can be linked to governance gates, deployment approvals and audit processes. 

Operationalizing trust through KPIs and model cards 

For a trust scoring framework to matter, it must move beyond design documents and into daily practice. That means integrating it with key performance indicators and the tools that teams already use. 

Relevant KPIs include: 

  • Bias detection and mitigation rates, tracking both disparities discovered and time to remediation. 
  • Model drift detection times, measuring how quickly significant performance degradations are identified. 
  • Explanation coverage, estimating the percentage of model outputs for which meaningful explanations can be generated. 
  • Audit readiness scores, assessing the completeness and accessibility of documentation, lineage and decision logs. 

Model cards provide a complementary artifact. As described in “Model Cards for Model Reporting,” they offer a structured template for documenting a model’s purpose, data foundations, design choices, limitations and monitoring plans. When every production model is accompanied by a model card and a current data trust score, AI governance shifts from retrospective justification to continuous, evidence-based stewardship. 

Trust as a quantitative and institutional practice 

The movement toward reliable and responsible AI is not a single project with a clear end state. It is an ongoing process of refinement in which technical capability, regulatory expectation and social norms evolve together. The data trust scoring framework is one contribution to that process. While it cannot remove difficult value judgments or eliminate ambiguity, it does make those judgments explicit, measurable and open to revision over time. 

As AI systems become more autonomous and more deeply embedded in critical workflows, the question will not only be how powerful they are, but how well we can justify relying on them. Organizations that treat data trust as a quantifiable, governable property, rather than a vague aspiration, will be better positioned to answer that question convincingly to regulators, customers and their own staff. In the end, the durability of AI driven systems will depend less on raw model sophistication and more on the integrity of the data practices that sustain them. 

This article is published as part of the Foundry Expert Contributor Network.


Basic and advanced Java serialization 26 Mar 2026, 9:00 am

Serialization is the process of converting a Java object into a sequence of bytes so they can be written to disk, sent over a network, or stored outside of memory. Later, the Java virtual machine (JVM) reads those bytes and reconstructs the original object. This process is called deserialization.

Under normal circumstances, objects exist only in memory and disappear when a program terminates. Serialization allows an object’s state to outlive the program that created it, or to be transferred between different execution contexts.

The Serializable interface

Java does not allow every object to be serialized. A class must explicitly opt in by implementing the Serializable interface, as shown here:

public class Challenger implements Serializable {

    private Long id;
    private String name;

    public Challenger(Long id, String name) {
        this.id = id;
        this.name = name;
    }
}

Serializable (java.io.Serializable) is a marker interface, meaning that it does not define any methods. By implementing it, the class signals to the JVM that its instances may be converted into bytes. If Java attempts to serialize an object whose class does not implement the Serializable interface, it fails at runtime with a NotSerializableException. There is no compile‑time warning.

Serialization traverses the entire object graph. Every non‑transient field must refer to an object that is itself serializable. If any referenced object cannot be serialized, the entire operation fails. All primitive wrapper types (Integer, Long, Boolean, and others), as well as String, implement Serializable, which is why they can be safely used in serialized object graphs.
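For instance, a serializable class holding a reference to a non-serializable type fails at write time. Both classes in this sketch are hypothetical:

```java
import java.io.*;

public class GraphFailureDemo {
    // Session does NOT implement Serializable.
    static class Session {
        long startedAt = System.currentTimeMillis();
    }

    static class Profile implements Serializable {
        private static final long serialVersionUID = 1L;
        String name = "Duke";
        Session session = new Session();   // poisons the object graph
    }

    public static void main(String[] args) throws IOException {
        try (ObjectOutputStream out =
                 new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(new Profile());
        } catch (NotSerializableException e) {
            // Fails here: the graph contains a non-serializable Session.
            System.out.println("Failed to serialize: " + e.getMessage());
        }
    }
}
```

Marking the session field transient, or making Session itself serializable, would let the write succeed.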

Limits of Java serialization

Serialization stores instance state and preserves reference identity within the object graph (shared references and cycles). It does not preserve behavior or JVM identity across runs. Remember the following guidelines when using serialization:

  • Instance fields are written to the byte stream.
  • Behavior is not serialized.
  • Static fields are not serialized.
  • Object identity is preserved.

Also note: If two fields reference the same object before serialization, that relationship is preserved after deserialization.
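A short round trip demonstrates the identity guarantee; the Match and Team classes below are hypothetical:

```java
import java.io.*;

public class IdentityDemo {
    static class Team implements Serializable {
        private static final long serialVersionUID = 1L;
        String name;
        Team(String name) { this.name = name; }
    }

    static class Match implements Serializable {
        private static final long serialVersionUID = 1L;
        Team home, away;
    }

    public static void main(String[] args) throws IOException, ClassNotFoundException {
        Match match = new Match();
        match.home = new Team("Duke");
        match.away = match.home;                       // two fields, one shared object

        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(match);
        }
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            Match copy = (Match) in.readObject();
            System.out.println(copy.home == copy.away);   // true: identity preserved
        }
    }
}
```

The stream records the Team object once and writes a back-reference for the second field, which is also why cyclic graphs serialize without infinite loops.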

A Java serialization example

As an example of serialization, consider a Java Challengers player:

Challenger duke = new Challenger(1L, "Duke");

Let’s unpack what it takes to save this object and restore it.

1. Writing the object

First, Java verifies that the class implements Serializable, converts the object’s field values into bytes, and writes them to the file:

try (ObjectOutputStream out = new ObjectOutputStream(new FileOutputStream("duke.ser"))) {
    out.writeObject(duke);
}

2. Reading the object back

During deserialization, the serializable class’s own constructor is not called. The JVM creates the object through an internal mechanism and assigns field values directly from the serialized data. However, if the class extends a nonserializable superclass, that superclass’s no‑argument constructor will run:

try (ObjectInputStream in = new ObjectInputStream(new FileInputStream("duke.ser"))) {
    Challenger duke = (Challenger) in.readObject();
}

This behavior often surprises developers the first time they debug a deserialized object, because invariants established in the constructor can be silently broken. The distinction is also important in class hierarchies, which we’ll discuss later in the article.
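A minimal experiment makes the skipped constructor visible: the message prints when the object is first created, but not during deserialization. Class and field names here are illustrative:

```java
import java.io.*;

public class ConstructorDemo implements Serializable {
    private static final long serialVersionUID = 1L;
    int value;

    public ConstructorDemo(int value) {
        this.value = value;
        System.out.println("Constructor called");   // printed only on the original new
    }

    public static void main(String[] args) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(new ConstructorDemo(42));   // prints "Constructor called"
        }
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            // No constructor output here: the JVM restores fields directly.
            ConstructorDemo copy = (ConstructorDemo) in.readObject();
            System.out.println("Restored value: " + copy.value);
        }
    }
}
```

Running this prints "Constructor called" exactly once, even though two objects with value 42 exist by the end.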

Serialization callbacks

Because the JVM controls object creation and field restoration during serialization, it also provides hooks that allow a class to customize how its state is written and restored. A class can define two private methods with exact signatures:

private void writeObject(ObjectOutputStream out) throws IOException
private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException

These methods are not called directly by application code. They are JVM callbacks invoked automatically during serialization and deserialization. Calling them manually results in a NotActiveException, because they require an active serialization context managed by the JVM:

import java.io.*;

public class OrderSensitiveExample implements Serializable {
    private static final long serialVersionUID = 1L;

    void main() throws IOException, ClassNotFoundException {
        OrderSensitiveExample example = new OrderSensitiveExample();

        // Serialization: triggers writeObject(...)
        try (ObjectOutputStream out = new ObjectOutputStream(new FileOutputStream("example.ser"))) {
            out.writeObject(example);
        }

        // Deserialization: triggers readObject(...)
        try (ObjectInputStream in = new ObjectInputStream(new FileInputStream("example.ser"))) {
            in.readObject();
        }
    }

    private void writeObject(ObjectOutputStream out) throws IOException {
        out.defaultWriteObject();
        out.writeObject("Duke");
        out.writeObject("Juggy");
    }

    private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException {
        in.defaultReadObject();
        String first = (String) in.readObject();
        String second = (String) in.readObject();

        System.out.println(first + " " + second);
    }
}

As noted, writeObject and readObject are invoked by the JVM via Java reflection, as part of ObjectOutputStream.writeObject and ObjectInputStream.readObject, and cannot be meaningfully called by application code.

Using serialVersionUID for version control

Every serializable class has a version identifier called serialVersionUID. This value is written into the serialized data. During deserialization, the JVM compares the value stored in the serialized data with the value declared in the current version of the class. If they differ, deserialization fails with an InvalidClassException.

If you do not declare a serialVersionUID, Java generates one automatically based on the given class structure. Adding a field, removing a method, or even recompiling the class can change it and break compatibility. This is why relying on the generated value is usually a mistake.

Choosing a serialVersionUID

For new classes, it is common and correct to start with the following declaration:

private static final long serialVersionUID = 1L;

IDEs often suggest long, generated values that mirror the JVM’s default computation. While technically correct, those values are frequently misunderstood. They do not distinguish objects, prevent name collisions, or identify individual instances. All objects of the same class share the same serialVersionUID.

The purpose of this value is to identify the class definition, not the object. It acts as a compatibility check during deserialization, ensuring that the class structure matches the one used when the data was written. This usually becomes a problem only after data has already been serialized and deployed.

The number itself has no special meaning; Java does not treat 1L differently from any other value. What matters is that the value is explicit, stable, and changed intentionally.

When to change serialVersionUID

You should change serialVersionUID when a class change causes previously serialized field values to have a different meaning for the current code.

Typical reasons include removing or renaming a serialized field, changing the type of a serialized field, changing the meaning of stored values such as status codes, introducing new constraints that old data may violate, or changing the class hierarchy or custom serialization logic.

In these cases, deserialization may still succeed, but the resulting object would represent an incorrect logical state. Changing the serialVersionUID ensures such data is rejected instead of silently misused.

If changes only add behavior or optional data, such as adding new fields or methods, the value usually does not need to change.

Excluding fields with transient

Some fields should not be serialized, such as passwords, cached values, or temporary data. In these cases, you can use the transient keyword:

public class ChallengerAccount implements Serializable {

    private static final long serialVersionUID = 1L;

    private String username;
    private transient String password;

    public ChallengerAccount(String username, String password) {
        this.username = username;
        this.password = password;
    }
}

A field marked transient is skipped during serialization. When the object is deserialized, the field is set to its default value, which is usually null.
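A quick round trip through a byte array confirms this behavior: the username survives while the transient password comes back as null. This sketch condenses the ChallengerAccount idea into one runnable class:

```java
import java.io.*;

public class TransientDemo implements Serializable {
    private static final long serialVersionUID = 1L;

    String username = "duke";
    transient String password = "secret";   // skipped during serialization

    public static void main(String[] args) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(new TransientDemo());
        }
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            TransientDemo copy = (TransientDemo) in.readObject();
            System.out.println(copy.username + " / " + copy.password);   // duke / null
        }
    }
}
```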

Serialization and inheritance

Serialization works across class hierarchies, but there are strict rules.

If a superclass does not implement Serializable, its fields are not serialized, and it must provide a no‑argument constructor. This failure tends to surface late, often after a seemingly harmless refactor of a base class:

class Person {
    String name;
    public Person() { this.name = "unknown"; }
}

class RankedChallenger extends Person implements Serializable {
    private static final long serialVersionUID = 1L;
    int ranking;
}

During deserialization, the superclass constructor runs and initializes its fields, while only the subclass fields are restored from the serialized data. If the no‑argument constructor is missing, deserialization fails at runtime.
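To make the effect concrete, the following self-contained sketch (nesting equivalents of the classes above) round-trips a RankedChallenger: the subclass field is restored from the stream, while the superclass field is reset by Person’s no-argument constructor.

```java
import java.io.*;

public class InheritanceDemo {
    static class Person {
        String name;
        public Person() { this.name = "unknown"; }   // runs again during deserialization
    }

    static class RankedChallenger extends Person implements Serializable {
        private static final long serialVersionUID = 1L;
        int ranking;
    }

    public static void main(String[] args) throws IOException, ClassNotFoundException {
        RankedChallenger duke = new RankedChallenger();
        duke.name = "Duke";      // not serialized: Person is not Serializable
        duke.ranking = 1;        // serialized: declared in the Serializable subclass

        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(duke);
        }
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            RankedChallenger copy = (RankedChallenger) in.readObject();
            System.out.println(copy.name + " / " + copy.ranking);   // unknown / 1
        }
    }
}
```

The name set before serialization is lost because it lived in the non-serializable superclass; only ranking survives the round trip.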

Custom serialization with sensitive data

Revisiting the ChallengerAccount example we looked at earlier, the password field was marked as transient, so it is not included in default serialization and will be null after deserialization. In controlled environments, this behavior can be overridden by defining custom serialization logic.

In the example below, the writeObject and readObject methods are shown inline for clarity, but they must be declared as private methods inside the serializable class. Here’s how the password is explicitly written and then read back:

private void writeObject(ObjectOutputStream out) throws IOException {
    out.defaultWriteObject();
    out.writeObject(password);
}

private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException {
    in.defaultReadObject();
    this.password = (String) in.readObject();
}

This is considered custom serialization because the class explicitly writes and reads part of its state instead of relying entirely on the JVM’s default mechanism. The call to readObject() does not read a field by name. Java serialization is a linear byte stream, not a keyed structure.

The value returned here is simply the next object in the stream, which happens to be the password because it was written immediately after the default object data. For this reason, values must be read in the exact order they were written. Changing that order will corrupt the stream or cause deserialization to fail.

Transforming data during serialization

Custom serialization can also transform data before writing it. This is useful for derived values, normalization, or compact representations:

public class ChallengerProfile implements Serializable {

    private static final long serialVersionUID = 1L;

    private String username;
    private transient LocalDate joinDate;

    public ChallengerProfile(String username, LocalDate joinDate) {
        this.username = username;
        this.joinDate = joinDate;
    }
}

The joinDate field is marked as transient, so it is not serialized by default. Although LocalDate is itself Serializable, marking it as transient and writing it as a single long demonstrates how custom serialization can transform a field into a different representation:

private void writeObject(ObjectOutputStream out) throws IOException {
    out.defaultWriteObject();
    out.writeLong(joinDate.toEpochDay());
}

During deserialization, the epoch day is converted back into a LocalDate:

private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException {
    in.defaultReadObject();
    this.joinDate = LocalDate.ofEpochDay(in.readLong());
}

The important point is not the specific transformation, but that writeObject and readObject must apply inverse transformations and read values in the exact order they were written. Here, toEpochDay and ofEpochDay are natural inverses: One converts a date to a number, and the other converts it back.
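Assembled into a runnable whole (the join date below is an arbitrary example value), the transformation round-trips cleanly:

```java
import java.io.*;
import java.time.LocalDate;

public class ChallengerProfile implements Serializable {

    private static final long serialVersionUID = 1L;

    private String username;
    private transient LocalDate joinDate; // serialized manually as a long

    public ChallengerProfile(String username, LocalDate joinDate) {
        this.username = username;
        this.joinDate = joinDate;
    }

    private void writeObject(ObjectOutputStream out) throws IOException {
        out.defaultWriteObject();
        out.writeLong(joinDate.toEpochDay()); // date -> number
    }

    private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException {
        in.defaultReadObject();
        this.joinDate = LocalDate.ofEpochDay(in.readLong()); // number -> date
    }

    public static void main(String[] args) throws Exception {
        ChallengerProfile original =
                new ChallengerProfile("duke", LocalDate.of(2026, 3, 27));

        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(original);
        }

        ChallengerProfile copy;
        try (ObjectInputStream in = new ObjectInputStream(
                new ByteArrayInputStream(buffer.toByteArray()))) {
            copy = (ChallengerProfile) in.readObject();
        }

        System.out.println(copy.joinDate); // prints "2026-03-27"
    }
}
```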

Restoring derived fields

Some fields are derived from others and should not be serialized:

public class ChallengerStats implements Serializable {

    private static final long serialVersionUID = 1L;

    private int wins;
    private int losses;

    private transient int score;

    public ChallengerStats(int wins, int losses) {
        this.wins = wins;
        this.losses = losses;
        this.score = calculateScore();
    }

    private int calculateScore() {
        return wins * 3 - losses;
    }
}

After deserialization, score will be zero. It can be restored as follows:

private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException {
    in.defaultReadObject();
    this.score = calculateScore();
}

Why order matters in custom serialization logic

When writing custom serialization logic, the order in which values are written must exactly match the order in which they are read:

private void writeObject(ObjectOutputStream out) throws IOException {
    out.defaultWriteObject();
    out.writeInt(42);
    out.writeUTF("Duke");
    out.writeLong(1_000_000L);
}

private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException {
    in.defaultReadObject();
    int level = in.readInt();
    String name = in.readUTF();
    long score = in.readLong();
}

Because the stream is not keyed by field name, each read call simply consumes the next value in sequence. If readUTF were called before readInt, the stream would attempt to interpret the bytes of an integer as a UTF string, resulting in corrupted data or a deserialization failure. This is one of the main reasons custom serialization should be used sparingly. A useful mental model is to think of serialization as a tape recorder: Deserialization must replay the tape in exactly the order it was recorded.

Why serialization is risky

Serialization is fragile when classes change. Even small modifications can make previously stored data unreadable.

Deserializing untrusted data is particularly dangerous. Deserialization can trigger unexpected code paths on attacker‑controlled object graphs, and this has been the source of real‑world security vulnerabilities.

For these reasons, Java serialization should be used only in controlled environments.

When serialization makes sense

Java serialization is suitable only for a narrow set of use cases where class versions and trust boundaries are tightly controlled.

Recommendations by use case:

  • Internal caching: Java serialization works well when data is short-lived and controlled by the same application.
  • Session storage: Acceptable with care, provided all participating systems run compatible class versions.
  • Long-term storage: Risky. Even small class changes can make old data unreadable.
  • Public APIs: Use JSON. It is language-agnostic, stable across versions, and widely supported. Java serialization exposes implementation details and is fragile.
  • System-to-system communication: Prefer JSON or schema-based formats such as Protocol Buffers or Avro.
  • Cross-language communication: Avoid Java serialization entirely. It is Java-specific and not interoperable with other platforms.

Rule of thumb: If the data must survive class evolution, cross trust boundaries, or be consumed by non‑Java systems, prefer JSON or a schema‑based format over Java serialization.

Advanced serialization techniques

The mechanisms we’ve covered so far handle most practical scenarios, but Java serialization has a few additional tools for solving problems that default serialization cannot.

Preserving singletons with readResolve

Deserialization creates a new object. For classes that enforce a single instance, this breaks the guarantee silently:

public class GameConfig implements Serializable {

    private static final long serialVersionUID = 1L;
    private static final GameConfig INSTANCE = new GameConfig();

    private GameConfig() {}

    public static GameConfig getInstance() {
        return INSTANCE;
    }

    private Object readResolve() throws ObjectStreamException {
        return INSTANCE;
    }
}

Without readResolve, deserializing a GameConfig would produce a second instance, and any identity check using == would fail. The method intercepts the deserialized object and substitutes the canonical one. The deserialized copy is discarded.
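A quick identity check demonstrates the substitution. The demo below nests GameConfig as a static class purely so the sketch is self-contained:

```java
import java.io.*;

public class SingletonDemo {

    static class GameConfig implements Serializable {

        private static final long serialVersionUID = 1L;
        private static final GameConfig INSTANCE = new GameConfig();

        private GameConfig() {}

        static GameConfig getInstance() { return INSTANCE; }

        private Object readResolve() throws ObjectStreamException {
            return INSTANCE; // replace the deserialized copy with the canonical instance
        }
    }

    public static void main(String[] args) throws Exception {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(GameConfig.getInstance());
        }

        GameConfig restored;
        try (ObjectInputStream in = new ObjectInputStream(
                new ByteArrayInputStream(buffer.toByteArray()))) {
            restored = (GameConfig) in.readObject();
        }

        // Identity is preserved thanks to readResolve
        System.out.println(restored == GameConfig.getInstance()); // prints "true"
    }
}
```

Remove the readResolve method and the final comparison prints "false", because deserialization manufactures a second instance behind the singleton's back.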

Substituting objects with writeReplace

Whereas readResolve controls what comes out of deserialization, writeReplace controls what goes into serialization. A class can define this method to substitute a different object before any bytes are written.

The two methods are often used together to implement a serialization proxy. One class represents the object’s runtime form, while another represents its serialized form.

In this example, ChallengerWriteReplace plays the role of the “real” object, while ChallengerProxy represents its serialized form:

public class ChallengerProxy implements Serializable {

    private static final long serialVersionUID = 1L;

    private final long id;
    private final String name;

    public ChallengerProxy(long id, String name) {
        this.id = id;
        this.name = name;
    }

    private Object readResolve() throws ObjectStreamException {
        return new ChallengerWriteReplace(id, name);
    }
}

class ChallengerWriteReplace implements Serializable {

    private static final long serialVersionUID = 1L;

    private long id;
    private String name;

    public ChallengerWriteReplace(long id, String name) {
        this.id = id;
        this.name = name;
    }

    private Object writeReplace() throws ObjectStreamException {
        return new ChallengerProxy(id, name);
    }
}

When a ChallengerWriteReplace instance is serialized, its writeReplace method substitutes it with a lightweight ChallengerProxy. The proxy is the only object that is actually written to the byte stream.

During deserialization, the proxy’s readResolve method reconstructs a new ChallengerWriteReplace instance, and the proxy itself is discarded. The application never observes the proxy object directly.

This technique keeps the serialized form decoupled from the internal structure of ChallengerWriteReplace. As long as the proxy remains stable, the main class can evolve freely without breaking previously serialized data. It also provides a controlled point where invariants can be enforced during reconstruction.
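A round trip confirms that the proxy never leaks out. This sketch nests both classes as static members so it compiles as a single file; otherwise it mirrors the classes above:

```java
import java.io.*;

public class ProxyDemo {

    static class ChallengerProxy implements Serializable {
        private static final long serialVersionUID = 1L;
        private final long id;
        private final String name;

        ChallengerProxy(long id, String name) {
            this.id = id;
            this.name = name;
        }

        private Object readResolve() throws ObjectStreamException {
            return new ChallengerWriteReplace(id, name); // rebuild the real object
        }
    }

    static class ChallengerWriteReplace implements Serializable {
        private static final long serialVersionUID = 1L;
        long id;
        String name;

        ChallengerWriteReplace(long id, String name) {
            this.id = id;
            this.name = name;
        }

        private Object writeReplace() throws ObjectStreamException {
            return new ChallengerProxy(id, name); // only the proxy hits the stream
        }
    }

    public static void main(String[] args) throws Exception {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(new ChallengerWriteReplace(7L, "Duke"));
        }

        Object restored;
        try (ObjectInputStream in = new ObjectInputStream(
                new ByteArrayInputStream(buffer.toByteArray()))) {
            restored = in.readObject();
        }

        // The application sees the real class, never the proxy
        System.out.println(restored.getClass().getSimpleName()); // prints "ChallengerWriteReplace"
    }
}
```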

Filtering deserialized classes with ObjectInputFilter

I have explained why deserializing untrusted data is dangerous. Introduced in Java 9, the ObjectInputFilter API gives applications a way to restrict which classes are allowed during deserialization:

ObjectInputFilter filter = ObjectInputFilter.Config.createFilter(
        "com.example.model.*;!*"
);

try (ObjectInputStream in = new ObjectInputStream(new FileInputStream("data.ser"))) {
    in.setObjectInputFilter(filter); // must be set before readObject()
    Object obj = in.readObject();
}

This filter allows only classes under com.example.model and rejects everything else. The pattern syntax supports allowlisting by package, as well as setting limits on array sizes, object graph depth, and total object count.
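Rejection is observable at runtime. In this hedged sketch, java.util.Date stands in for any class outside the allowlist, and resource limits are combined with the class patterns in one filter string:

```java
import java.io.*;

public class FilterDemo {
    public static void main(String[] args) throws Exception {
        // Serialize a java.util.Date, which the filter below does not allow
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(new java.util.Date());
        }

        // Allow only com.example.model classes, with depth and array-size limits
        ObjectInputFilter filter = ObjectInputFilter.Config.createFilter(
                "maxdepth=20;maxarray=10000;com.example.model.*;!*");

        String result;
        try (ObjectInputStream in = new ObjectInputStream(
                new ByteArrayInputStream(buffer.toByteArray()))) {
            in.setObjectInputFilter(filter); // must be set before readObject()
            in.readObject();
            result = "allowed";
        } catch (InvalidClassException e) {
            result = "rejected"; // the filter blocks java.util.Date
        }

        System.out.println(result); // prints "rejected"
    }
}
```

A rejected class surfaces as an InvalidClassException, so the stream never instantiates the disallowed object at all.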

Java 9 also made it possible to set a process-wide filter via ObjectInputFilter.Config.setSerialFilter or the jdk.serialFilter system property, so that no ObjectInputStream in the application is left unprotected. Java 17 extended this further by introducing filter factories (ObjectInputFilter.Config.setSerialFilterFactory), which allow context‑specific filters to be applied per stream rather than relying on a single global policy. If your application deserializes data that crosses a trust boundary, an input filter is not optional; it is the minimum viable defense.

Java records and serialization

Java records can implement Serializable, but they behave differently from ordinary classes in one critical way: During deserialization, the record’s canonical constructor is called. This means any validation logic in the constructor runs on deserialized data, which is a significant safety advantage:

public record ChallengerRecord(Long id, String name) implements Serializable {
    public ChallengerRecord {
        if (id == null || name == null) {
            throw new IllegalArgumentException(
                    "id and name must not be null");
        }
    }
}

With a traditional Serializable class, a corrupted or malicious stream could inject null values into fields that the constructor would normally reject. With a record, the constructor acts as a gatekeeper even during deserialization.
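One way to see that the canonical constructor really runs again during deserialization is a side-effect counter. The static counter below is purely an illustration device, not part of the original example:

```java
import java.io.*;

public class RecordDemo {
    static int constructorRuns = 0;

    record ChallengerRecord(Long id, String name) implements Serializable {
        ChallengerRecord {
            if (id == null || name == null) {
                throw new IllegalArgumentException("id and name must not be null");
            }
            constructorRuns++; // counts every invocation, including deserialization
        }
    }

    public static void main(String[] args) throws Exception {
        ChallengerRecord original = new ChallengerRecord(1L, "Duke"); // run #1

        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(original);
        }

        try (ObjectInputStream in = new ObjectInputStream(
                new ByteArrayInputStream(buffer.toByteArray()))) {
            in.readObject(); // run #2: the canonical constructor validates again
        }

        System.out.println(constructorRuns); // prints 2
    }
}
```

A stream carrying a null id or name would make the second constructor run throw, which is exactly the gatekeeping a traditional Serializable class lacks.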

Records do not support writeObject, readObject, or serialPersistentFields. Their serialized form is derived entirely from their components, a design decision that intentionally favors predictability and safety over customization.

Alternatives to Java serialization

The Externalizable interface is an alternative to Serializable that gives the class complete control over the byte format. A class that implements Externalizable must define writeExternal and readExternal, and must provide a public no‑argument constructor:

public class ChallengerExt implements Externalizable {

    private long id;
    private String name;

    public ChallengerExt() {} // required

    public ChallengerExt(long id, String name) {
        this.id = id;
        this.name = name;
    }

    @Override
    public void writeExternal(ObjectOutput out) throws IOException {
        out.writeLong(id);
        out.writeUTF(name);
    }

    @Override
    public void readExternal(ObjectInput in) throws IOException {
        this.id = in.readLong();
        this.name = in.readUTF();
    }
}

Unlike Serializable, no field metadata or field values are written automatically. The class descriptor (class name and serialVersionUID) is still written, but the developer is fully responsible for writing and reading all instance state.

Because writeExternal and readExternal work directly with primitives and raw values, fields should use primitive types where possible. Using a wrapper type such as Long with writeLong would throw a NullPointerException if the value were null, since auto‑unboxing cannot handle that case.

This approach can produce more compact output, but the developer is fully responsible for versioning, field ordering, and backward compatibility.

In practice, Externalizable is rarely used in modern Java. When full control over the wire format is needed, most teams choose Protocol Buffers, Avro, or similar schema‑based formats instead.

Conclusion

Java serialization is a low-level JVM mechanism for saving and restoring object state. Known for being powerful but unforgiving, serialization bypasses constructors, assumes stable class definitions, and provides no automatic safety guarantees. Used deliberately in tightly controlled systems, it can be effective. Used casually, it introduces subtle bugs and serious security vulnerabilities. Understanding the trade-offs discussed in this article will help you use serialization correctly and avoid accidental misuse.


Swift 6.3 boosts C interoperability, Android SDK 26 Mar 2026, 9:00 am

Swift 6.3, the latest release of the Apple-driven language for multiple platforms, offers more flexible C interoperability and improvements for cross-platform build tools. Also featured is the official Swift SDK for Android mobile application development.

Announced March 24, Swift 6.3 can be accessed at swift.org. For C interoperability, Version 6.3 debuts the @c attribute for exposing Swift functions and enums to C code in a project. Annotating a function or enum with @c prompts Swift to include a corresponding declaration in the generated C header that can be included in C/C++ files.

Also, Swift 6.3 has a preview of the Swift Build system integrated into Swift Package Manager. This preview brings a unified build engine across all supported platforms for a more consistent cross-platform development experience. Improvements to Swift Package Manager in Version 6.3 include a prebuilt Swift syntax for shared macro libraries, flexible inherited documentation, and discoverable package traits.

In the Android development vein, the Swift SDK for Android enables development of native Android programs in Swift and updating of Swift packages to support building for Android. Also, developers can use Swift Java and Swift Java JNI Core to integrate Swift code into existing Android applications written in Kotlin or Java.

Also in Swift 6.3:

  • Module selectors are being introduced to specify which imported module Swift should look in for an API used in code.
  • Embedded Swift has improvements ranging from enhanced C interoperability and better debugging support to meaningful steps toward a complete linkage model.
  • For the core library, Swift Testing has improvements for areas including warning issues, test cancellation, and image attachments.
  • Experimental capabilities are added to the DocC documentation compiler for markdown output, per-page static HTML content, and code block annotations.
  • For performance control for library APIs, attributes were introduced that give library authors finer-grained control over compiler optimizations for clients of APIs.


Rethinking VM data protection in cloud-native environments 26 Mar 2026, 9:00 am

After many years of being relatively static, the enterprise virtualization landscape is shifting under our feet. As organizations reassess their reliance on traditional hypervisors, driven by cost, licensing disruption, or a broader push toward modernization, many are exploring Kubernetes as the natural consolidation point for both containerized and virtual machine workloads. This is often referred to as “cloud-native virtualization” or “Kubernetes-native virtualization,” and it is enabled by KubeVirt and Containerized Data Importer (CDI), two open-source projects that together bring VMs into Kubernetes as first-class citizens. But running VM workloads on Kubernetes, a platform designed for distributed container orchestration, forces a fundamental rethinking of how those workloads are protected.

Many organizations still think of their Kubernetes environments as being stateless and not requiring backup. Whether or not this was true before (more often than not it wasn’t), it certainly isn’t true once VMs enter the picture.

VM data protection for traditional hypervisors has been mature for many years. It benefits from predictable methods and constructs, consistent snapshot semantics, and well-established approaches for application consistency and recovery. But things are different with KubeVirt. KubeVirt inherits Kubernetes’ management model, which is built around declarative management, resources, controllers, loosely coupled components, and pluggable storage drivers. Understanding how these architectural decisions reshape data protection is critical for anyone designing backup or disaster-recovery (DR) solutions for Kubernetes-native virtualization.

VMs defined by Kubernetes resources

The first big difference is in representation. In traditional virtualization systems, a VM is defined by an object or set of objects tightly controlled by the hypervisor. Its configuration, disk files, snapshots, and runtime state are all stored in a platform-specific way, enabling consistent backup semantics across different environments.

KubeVirt relies on the Kubernetes model instead. Virtual machines are defined using Kubernetes custom resources such as VirtualMachine, VirtualMachineInstance, and (with CDI) DataVolume, which are stored in the Kubernetes control plane. Their configuration is thus described declaratively in YAML, and their life cycle is managed by KubeVirt’s controllers. A VM definition in KubeVirt is therefore not a bundle of hypervisor objects, but a collection of Kubernetes resources describing compute, storage, networking, initialization, and storage volumes.
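To make this concrete, here is a minimal sketch of such a declarative VM definition. The names, image URL, and sizes are illustrative placeholders, and exact fields vary by KubeVirt and CDI version:

```yaml
apiVersion: kubevirt.io/v1
kind: VirtualMachine
metadata:
  name: demo-vm
spec:
  runStrategy: Always
  dataVolumeTemplates:            # CDI provisions this disk as a PVC
    - metadata:
        name: demo-vm-rootdisk
      spec:
        storage:
          resources:
            requests:
              storage: 10Gi
        source:
          http:
            url: https://example.com/images/disk.qcow2  # placeholder image
  template:
    spec:
      domain:
        devices:
          disks:
            - name: rootdisk
              disk:
                bus: virtio
        resources:
          requests:
            memory: 2Gi
      volumes:
        - name: rootdisk
          dataVolume:
            name: demo-vm-rootdisk
```

Everything that defines the VM, including compute, disks, and their storage provisioning, lives in the Kubernetes control plane as resources like these, which is precisely what backup tooling must capture.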

A generation of Kubernetes administrators have come to appreciate Kubernetes’ open, declarative model and YAML-based definitions, but for VM administrators it may be a bit confusing at first. More importantly for our purposes, the way this critical metadata is backed up and restored is entirely different. You’ll need to use Kubernetes-specific tools rather than the tools you’ve been using, and those tools will require at least a basic understanding of the Kubernetes control plane.

Storage and snapshot behavior governed by CSI

Storage is another area where virtualization teams might encounter architectural whiplash when transitioning. Storage systems in traditional enterprise VM environments are largely managed through the use of plugins, the prime example being VMware vCenter storage plugins. These plugins abstract important storage operations such as provisioning, health monitoring, and snapshot control. VMs running under Kubernetes rely, through CDI and Kubernetes persistent volumes, on drivers conforming to Kubernetes’ Container Storage Interface (CSI) for accessing storage. You can think of these as being somewhat analogous to the storage plugins, but at present generally less capable and less uniform in the features they provide.

From a data protection perspective, this leads to a few important points. First, different CSI drivers support different degrees of snapshot capability, with some providing no snapshot capability at all. The behavior of a KubeVirt VM backup that uses snapshots is therefore determined by the StorageClass and associated provisioner (CSI driver) backing its Persistent Volume Claims (PVCs).

Second, multi-disk VMs can make things more complicated. A KubeVirt VM may include multiple disks that need to be snapshotted together for consistency. KubeVirt’s snapshot mechanism helps orchestrate consistency across the PVCs for these volumes, but its success can depend on the presence of the QEMU guest agent (to freeze VM file systems), and the underlying CSI driver’s snapshot capabilities. True atomic consistency across multiple disks without file-system freezing (fs-freeze command) requires Volume Group Snapshot capabilities, which are still maturing in Kubernetes.

Third, designing reliable VM protection requires understanding the capabilities, limitations, and performance characteristics of each StorageClass.

Finally, cross-cluster recovery raises additional challenges. Unlike traditional hypervisor environments where datastores are often standardized or abstracted, different Kubernetes clusters frequently have different StorageClasses and underlying CSI drivers. Restoring a VM into a new cluster may require remapping storage classes or modifying PVC parameters. Recovery workflows must therefore be prepared to handle heterogeneous storage rather than relying on uniform hypervisor primitives.

VM snapshots with KubeVirt

When a VM snapshot is taken in VMware vSphere (for example), the operation produces a set of delta files capturing VM disk state, usually assisted by optional guest quiescing. It can also capture VM memory state.

KubeVirt treats VM snapshots differently. A KubeVirt snapshot consists of a captured copy of the VM spec and a set of underlying volume snapshots that capture the state of each associated PVC. KubeVirt uses the CSI driver’s snapshot functionality for capturing storage state in this way. VMs need to use DataVolumes or PVCs backed by a StorageClass that supports snapshots, and snapshots must be configured properly for those StorageClasses.

KubeVirt VM snapshots are file-system consistent when using the QEMU guest agent, and crash consistent otherwise. Importantly, KubeVirt snapshots do not preserve VM memory state, nor do they provide application consistency. Application consistency (e.g., for databases) often requires additional custom application hooks.

When restoring a VM snapshot, KubeVirt reconstructs the VM by applying the stored spec and restoring/binding volume snapshots to newly created PVCs.

This design aligns with Kubernetes’ broader philosophy of operation, but it brings new engineering considerations. Application consistency often requires explicit hooks, or coordination with in-guest processes. Disaster recovery may require coordinating the restores of multiple resources rather than a single hypervisor action.

Also note that some common Kubernetes backup and DR tools such as Velero and CloudCasa do not use KubeVirt VM snapshots at all, but instead directly back up KubeVirt custom resources and orchestrate their own persistent volume snapshots using the CSI snapshot interface. This approach is better when the intention is to back the snapshots up off-cluster and allow restores to other clusters.

KubeVirt does more than just offer an alternative runtime environment for VMs. It promises the nirvana of a unified compute plane for both VMs and containerized workloads. But it also reshapes the whole model of VM life cycle management and protection. For architects and platform engineers, the transition requires new assumptions, new skills, and new tools, and this often includes new Kubernetes-specific protection and DR solutions.

New Tech Forum provides a venue for technology leaders—including vendors and other outside contributors—to explore and discuss emerging enterprise technology in unprecedented depth and breadth. The selection is subjective, based on our pick of the technologies we believe to be important and of greatest interest to InfoWorld readers. InfoWorld does not accept marketing collateral for publication and reserves the right to edit all contributed content. Send all inquiries to doug_dineley@foundryco.com.


Claude Code AI tool getting auto mode 25 Mar 2026, 10:22 pm

Anthropic is fitting its Claude Code AI-powered coding assistant with an auto mode for the Claude AI assistant to handle permissions on the user’s behalf, with safeguards to monitor actions before they run.

Auto mode was announced March 24; instructions on getting started with it can be found in the introductory blog post. Currently being launched in research preview status for Claude Team users, this capability is due to roll out to enterprise and API users in coming days, according to Anthropic. The company explained that Claude Code default permissions are conservative, with every file write and Bash command asking for approval. While this is a safe default, it means users cannot start a large task and walk away.

Some developers bypass permission checks with --dangerously-skip-permissions, but skipping permissions can result in dangerous and destructive outcomes and should not be used outside of isolated environments. Auto mode is a middle path to run longer tasks with fewer interruptions while introducing less risk than skipping all permissions. Before each tool call runs, a classifier reviews it to check for potentially destructive actions such as mass deleting files, sensitive data exfiltration, or malicious code execution, Anthropic said. Actions deemed safe can proceed and risky ones are blocked, redirecting Claude to take a different approach.

Auto mode reduces risk compared to --dangerously-skip-permissions but does not eliminate it entirely. The classifier may still allow some risky actions: for example, if user intent is ambiguous or if Claude does not have enough context about an environment to know an action might create additional risk. It may also occasionally block benign actions. Anthropic plans to continue to improve the user experience over time.


PyPI warns developers after LiteLLM malware found stealing cloud and CI/CD credentials 25 Mar 2026, 11:13 am

PyPI is warning of possible credential theft from AI applications and developer pipelines after two malicious versions of the widely used Python middleware for large language models, LiteLLM, were briefly published.

“Anyone who has installed and run the project should assume any credentials available to the LiteLLM environment may have been exposed, and revoke/rotate them accordingly,” PyPI said in an advisory that linked the incident to an exploited Trivy dependency from the ongoing TeamPCP supply-chain attack.

According to a Sonatype analysis, the packages embedded a multi-stage payload designed to harvest sensitive data from developer environments, CI/CD pipelines, and cloud configurations, and were live on PyPI for roughly two hours before being taken down.

“Given the package’s three million daily downloads, the compromised LiteLLM could have seen significant exposure during that short time span,” Sonatype researchers said in a blog post. On top of serving as a stealer, the packages were also acting as droppers, enabling follow-on payloads and deeper system compromise.

Three-stage payload built for maximum reach

The compromise affected versions 1.82.7 and 1.82.8. Sonatype’s analysis noted the payload operating in three distinct stages. These included initial execution and data exfiltration, deeper reconnaissance and credential harvesting, and finally persistence with remote control capabilities.

The attack chain relied heavily on obfuscation, with base64-encoded Python code covering up the payload’s tracks. Once executed, the malware collected sensitive data, encrypted it using AES-256-CBC, and then secured the encryption key with an embedded RSA public key before sending everything to attacker-controlled servers.

The disclosure highlighted a common approach that attackers follow these days. Instead of detonating immediately after installation, the malware quietly lingers to map the environment and establish a foothold before pulling credentials from local machines, cloud configs, and automation pipelines.

“It (payload) targets environment variables (including API keys and tokens), SSH Keys, cloud credentials (AWS, GCP, Azure), Kubernetes configs, CI/CD secrets, Docker configs, database credentials, and even cryptocurrency wallets,” said Wiz researchers, who are separately tracking the campaign, in a blog post. “Our data shows that LiteLLM is present in 36% of cloud environments, signifying the potential for widespread impact.”

Wiz also provided a way for its customers to check their environment for exposure via the Wiz Threat Center.

An expanding supply-chain campaign

The LiteLLM incident has been confirmed to be a part of the rapidly unfolding TeamPCP supply chain campaign that first compromised Trivy.

Trivy, developed by Aqua Security, is a widely used open-source vulnerability scanner designed to identify security issues in container images, file systems, and infrastructure-as-code (IaC) configurations. The ongoing attack, attributed to TeamPCP with reported links to LAPSUS$, involved attackers compromising publishing credentials and injecting credential-stealing code into official releases and GitHub Actions used in CI/CD pipelines.

The Trivy compromise was quickly followed by similar supply chain incidents, with attackers leveraging the same access and tactics to target other developer security tools like KICS and Checkmarx, extending the campaign’s reach across multiple CI/CD ecosystems.

The PyPI advisory tied the LiteLLM incident directly to the Trivy compromise. The malicious packages were uploaded “after an API Token exposure from an exploited Trivy dependency,” it said.

Ben Read, a lead researcher at Wiz, calls it a systematic campaign that needs to be monitored for further expansion. “We are seeing a dangerous convergence between supply chain attackers and high-profile extortion groups like LAPSUS$,” he said. “By moving horizontally across the ecosystem – hitting tools like liteLLM that are present in over a third of cloud environments – they are creating a snowball effect.”

PyPI has advised users to rotate any secrets accessible to the affected LiteLLM environment, as researchers confirm active data exfiltration and potential exposure across cloud environments tied to the ongoing campaign.

The article originally appeared in InfoWorld.


Cloudflare launches Dynamic Workers for AI agent execution 25 Mar 2026, 10:37 am

Cloudflare has rolled out Dynamic Workers, an isolate-based runtime designed to run AI-generated code faster and more efficiently than traditional containers, as the company pushes lightweight, disposable execution environments as a foundation for enterprise AI applications.

The service enables enterprises to spin up execution environments in milliseconds, pointing to a transition away from container-heavy architectures toward more ephemeral runtimes designed for high-volume AI agent workloads.

For many enterprises, this points to a shift in how AI systems are built and executed. Instead of orchestrating predefined tools, organizations are beginning to let models generate and execute code on demand, a shift that raises new questions around security and cost.

Built on Cloudflare’s existing Workers platform, Dynamic Workers uses V8 isolates to execute code generated at runtime, often by LLMs, without requiring a full container or virtual machine.

“An isolate takes a few milliseconds to start and uses a few megabytes of memory,” Cloudflare said in a blog post. “That’s around 100x faster and 10x-100x more memory efficient than a typical container. That means that if you want to start a new isolate for every user request, on-demand, to run one snippet of code, then throw it away, you can.”

Cloudflare is pairing the runtime with its “Code Mode” approach, which encourages models to write short TypeScript functions against defined APIs instead of relying on multiple tool calls, a method the company says can reduce token usage and latency.

From an enterprise perspective, the platform includes controls such as outbound request interception for credential management, automated code scanning, and rapid rollout of V8 security patches. Cloudflare noted that isolate-based sandboxes have different security characteristics compared to hardware-backed environments.

Dynamic Workers are available in open beta under Cloudflare’s Workers paid plan. While pricing is set at $0.002 per unique Worker loaded per day, in addition to standard CPU and invocation charges, the per-Worker fee is waived during the beta period.

Enterprise runtime implications

For enterprise IT teams, the move to isolate-based execution could reshape how AI workloads are architected, especially for use cases that demand high concurrency and low-latency performance.

“Cloudflare is essentially looking to redefine the application lifecycle by pivoting away from the traditional ‘build-test-deploy’ cycle on centralized servers, which often relies on high-overhead, latency-heavy containers,” said Neil Shah, VP for research at Counterpoint Research. “The move to V8 reduces startup times from around 500 ms to under 5 ms, a roughly 100x improvement, making it significant for bursts of agentic AI requests that may require cold starts.”

This shift could also have cost implications. If AI agents can generate and execute scripts locally to produce outcomes, rather than repeatedly calling LLMs, enterprises may see improvements in both efficiency and latency.

However, Shah noted that the model introduces new security considerations that enterprise leaders cannot ignore.

“Allowing AI agents to generate and execute code on the fly introduces a new attack vector and risk,” Shah said. “While Dynamic Workers are sandboxed to limit the impact of a potential compromise, the unpredictability of AI-generated logic requires a robust security framework and clear guardrails.”

Others say these risks extend beyond sandboxing and require broader governance across the AI execution lifecycle. Nitish Tyagi, principal analyst at Gartner, said that while isolate-based environments improve containment, they do not eliminate risks.

“Running an AI agent and executing code in an isolated environment may seem very safe in theory, but it doesn’t ensure complete safety,” Tyagi said.

He pointed to risks such as vulnerabilities in AI-generated code, indirect prompt-injection attacks, and supply-chain threats, in which compromised external sources could lead agents to expose sensitive data or execute harmful actions.

Tyagi also warned of operational risks, including the risk of autonomous agents entering recursive execution loops, which can lead to cost escalation and resource exhaustion.

To mitigate these risks, Tyagi said enterprises need stronger governance mechanisms, including real-time monitoring of agent behavior, tighter control over outbound traffic, and better visibility into AI supply chains and dependencies.


Oracle adds pre-built agents to Private Agent Factory in AI Database 26ai 25 Mar 2026, 9:54 am

Oracle has added new prebuilt agents to Private Agent Factory, its no-code framework for building containerized, data-centric agents within AI Database 26ai.

These agents include a Database Knowledge Agent, a Structured Data Analysis Agent, and a Deep Data Research Agent.  

While the Database Knowledge Agent translates natural-language prompts into queries to fetch specific facts, policies, or entities, the Deep Data Research Agent tackles more complex tasks by breaking them into steps and iterating across web sources, document libraries, or both, the company said.

The Structured Data Analysis Agent, meanwhile, is aimed at crunching tabular data — think SQL tables or CSV files — using tools like Python’s pandas library to generate charts, spot trends, flag anomalies, and summarize metrics, the company added.

The addition of these agents and Private Agent Factory to AI Database 26ai will help developers accelerate agent building in a secure and simplified manner, especially for enterprises operating in regulated industries, helping move pilots to production, analysts say.

“With AI Database Private Agent Factory, teams will be able to rapidly create AI agents or leverage pre-built ones, turning experimentation into production-ready solutions quickly. By embedding intelligence at the core of the database, Oracle is enabling a new era of agentic AI, where sophisticated, autonomous systems and applications can adapt and act at scale,” said Noel Yuhanna, principal analyst at Forrester.

Oracle’s rationale, Yuhanna added, reflects its broader strategy of making the database a central pillar of enterprise AI, given that execution ultimately depends on where the data resides.

That view is echoed by Stephanie Walter, practice leader of AI stack at HyperFRAME Research, who says Private Agent Factory is Oracle’s attempt to position itself as “the operational control layer” in enterprises rather than just the storage layer, by bringing data and AI closer together and reducing the need for data movement and external orchestration.

“Every major cloud provider is moving toward tighter coupling between data, models, and orchestration. Oracle’s differentiation is that it starts from the database outward, while hyperscalers typically start from the model or platform outward,” Walter said.

That differentiation is more than architectural nuance, according to Bradley Shimmin, lead of the data intelligence practice at The Futurum Group.

“By architecting agent orchestration directly into the database, Oracle is letting enterprises drop the duct-tape approach of complex, brittle data-movement pipelines that I would say continue to plague cloud-centric ecosystems, even those emphasizing zero-ETL capabilities,” Shimmin said.

That tighter integration also feeds directly into a more pragmatic concern for regulated industries: keeping sensitive data under control as AI agents move from experimentation into production.

“Most agent frameworks today assume you’re comfortable sending data to external LLM providers and orchestrating through cloud-hosted services. For regulated industries—including banking, healthcare, defense, and government—that assumption is a non-starter,” said Ashish Chaturvedi, leader of executive research at HFS Research.

“The Private Agent Factory meets those customers exactly where they are: behind the firewall, with the drawbridge up,” he added.


TypeScript 6.0 arrives 25 Mar 2026, 9:00 am

TypeScript 6.0, now generally available, is slated to be the last release of the language based on the current JavaScript codebase. Version 6.0 acts as a bridge between TypeScript 5.9 and the planned TypeScript 7.0, which is close to completion and set to be speedier and based on the Go language.

The 6.0 production release was unveiled on March 23, following the release candidate that arrived March 6. Developers can access TypeScript 6.0 via NPM with the following command: npm install -D typescript.

TypeScript is positioned as JavaScript with syntax for types. Several changes were cited as noteworthy additions in the general production release of TypeScript 6.0, including an adjustment in type-checking for function expressions in generic calls, especially those occurring in generic JSX expressions. This typically will catch more bugs in existing code, although developers may find that some generic calls need an explicit type argument, said Daniel Rosenwasser, principal product manager for TypeScript at Microsoft.

Also, Microsoft has extended its deprecation of import assertion syntax (i.e. import ... assert {...}) to import() calls like import(..., { assert: {...}}).

With the general release, Microsoft also has updated the DOM types to reflect the latest web standards, including some adjustments to the Temporal APIs. Other capabilities featured in TypeScript 6.0  include:

  • There is less context sensitivity for this-less functions. If this is never actually used in a function, the function is not considered contextually sensitive, which means these functions get higher priority in type inference.
  • A new flag, --stableTypeOrdering, has been introduced; it is intended to assist with migrations from TypeScript 6.0 to version 7.0.
  • TypeScript 6.0 adds support for the es2025 option for both target and lib. Although there are no new JavaScript language features in ES2025, this new target adds new types for built-in APIs and moves a few declarations from esnext into es2025.
  • The contents of lib.dom.iterable.d.ts and lib.dom.asynciterable.d.ts are included in lib.dom.d.ts. Developers can still reference dom.iterable and dom.asynciterable in a configuration file’s "lib" array, but these are now just empty files. TypeScript’s lib option lets users specify which global declarations a target runtime has.
  • In TypeScript 6.0, using module where namespace was expected is now a hard deprecation. This change was necessary because module blocks are a potential ECMAScript proposal that would conflict with the legacy TypeScript syntax.

The foundation of TypeScript 7.0, meanwhile, is set to be a compiler and language service written in Go that takes advantage of the speed of native code and shared-memory multi-threading. Version 7.0 is “extremely close to completion,” Rosenwasser said. It can be tried out from the Visual Studio Code editor or installed via NPM. “In fact, if you’re able to adopt TypeScript 6.0, we encourage you to try out the native previews of TypeScript 7.0,” Rosenwasser said.


Stop worrying: Instead, imagine software developers’ next great pivot 25 Mar 2026, 9:00 am

My sister always says, “worry is just a lack of imagination.”   By that, she means we always seem to worry about the worst-case scenarios — about things going badly.  Why not worry, or imagine, that the best possible outcome will happen?  You have a choice — choose to assume that everything will work out perfectly rather than disastrously.

This has never been truer than when you look at the folks who think all of us software developers are going to end up selling apples on street corners.

Don’t fear the coding agent

I get it. Software development has suddenly become incredibly efficient.  Claude Code can write code vastly faster and more efficiently than we humans can, so it seems reasonable that one person can now do (manage?) the work of 10 (50? 100?) people, that companies will get rid of the other nine, leaving them destitute.  Seas of software developers will be standing in unemployment lines, their skills rendered moot by the blazing tokens of coding agents.

There’s the worst-case scenario.  But what if we apply a bit of imagination?

A similar case happened during the Industrial Revolution.  In the mid-19th century, steam engines were the leading technology, and as they became more efficient, coal miners grew concerned that demand for their services would drop as those engines used less and less coal. 

But the coal miners lacked imagination — more efficient steam engines led to the unexpected result of an increase in the demand for coal.  This counterintuitive outcome was noticed by economist William Stanley Jevons, who realized that cheaper, more efficient steam engines led to their more widespread use in ways that hadn’t yet been conceived, thus expanding the need for both coal miners and factory workers to build more and better steam engines. Everybody wins.

And why won’t the same thing be true for software?  Can’t we imagine a world where the amount of software demanded and produced expands beyond what we think of today?  The “programmers selling apples” scenario assumes that the demand for software remains constant. But if producing software becomes more efficient, won’t that lead to more software being produced? 

Think of this:  I bet most of us have a few side projects that we’d like to get done that we never seem to be able to find the time for.  Your product manager certainly has a long list of features for your product that she’d like to do, but for which there never seems to be the time to put on the schedule. Small businesses all have bespoke requirements for software that off-the-shelf solutions don’t meet. 

Adapting to development disruption

Add to that the software that hasn’t even been conceived of yet, and it’s pretty easy to see — imagine — that there is no shortage of software that can be created.  Making software easier to create won’t just deliver the amount of software we already projected; it will drastically increase the amount of software that gets produced.  The floor just dropped out from under “we don’t have the time for that.”

Now, I’ll give you this: There may be a disruption in the type of work required to produce this software.  Job descriptions change — this is a constant.  We used to need people to write assembly and C.  Procedural development gave way to object-oriented coding.  Windows developers were left behind as the web rose to prominence.  But we all have adapted, and we’ll do so again.

It turns out my sister is right. The best-case scenario is vastly more interesting than anyone bothers to imagine.


Speed boost your Python programs with new lazy imports 25 Mar 2026, 9:00 am

When you import a module in Python, the module’s code must be evaluated completely before the program can proceed. For most modules, this isn’t an issue. But if a module has a long and involved startup process, it’s going to slow down the rest of your program at the point where it’s imported.

Python developers typically work around this issue by structuring imports so they don’t happen unless they are needed—for instance, by placing an import in the body of a function instead of at the top level of a module. But this is a clumsy workaround, and complicates the flow of the program.
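A minimal sketch of that traditional workaround, using the standard library's statistics module to stand in for a slow-loading dependency:

```python
# Traditional workaround: the costly import lives inside the function,
# so the module is loaded only when the function is first called.
def summarize(values):
    import statistics  # deferred: not evaluated at program startup
    return statistics.mean(values)

print(summarize([2, 4, 6]))  # statistics is imported here, on first use
```

The downside is exactly what the paragraph above describes: statistics is bound only inside summarize’s scope, so any other function in the module that needs it must repeat the import.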

Python 3.15 adds a new feature, lazy imports, that provides a high-level solution for slow-importing modules. Declaring an import as “lazy” means it will be evaluated when it is first used, not when it is first imported. The cost of a slow import can then be deferred until the code it contains is actually needed. And, while lazy imports introduce new syntax, you can future-proof existing code to use them without having to change any of its syntax.

Eager versus lazy imports

To start, it’s helpful to understand the problem addressed by lazy imports. So, let’s say we have two files in the same directory:

# main.py
print("Program starting")
from other import some_fn
print("Other module imported")
some_fn()
print("Program ended")

# other.py
print("Other module evaluation started")
from time import sleep
sleep(2)
# ^ This simulates a slow-loading module
print("Other module evaluation ended")

def some_fn():
    print("some_fn run")

If you run main.py, the output should look something like this:


Program starting
Other module evaluation started

[two-second delay]

Other module evaluation ended 
Other module imported 
some_fn run 
Program ended

The mere act of importing other grinds our program to a near-halt before we can even do anything with the imported function, let alone continue with the rest of the program.

Now let’s see what happens if we modify main.py to use lazy imports (this will only work on Python 3.15 or higher):

print("Program starting")
lazy from other import some_fn
print("Other module imported")
some_fn()
print("Program ended")

When you run the program now, the behavior changes:

Program starting
Other module imported
Other module evaluation started

[two-second delay]

Other module evaluation ended
some_fn run
Program ended

Now, the import imposes no delay at all. We only see the delay when we try to run the function we imported from the module.

What’s happening under the hood? When Python detects a lazy import—typically triggered with the lazy keyword on the import line, as shown above—it doesn’t perform the usual import process. Instead, it creates a “proxy object,” or a stand-in, for the imported module. That proxy waits until the program tries to do something with the module. Then the actual import action triggers, and the module is evaluated.
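The proxy idea itself is not new: a rough approximation of this behavior has long been available on any modern Python via the standard library’s importlib.util.LazyLoader, which likewise defers executing a module’s code until the first attribute access. A sketch, using json as the stand-in module:

```python
import importlib.util
import sys

def lazy_import(name):
    """Return a module whose top-level code runs only on first attribute access."""
    spec = importlib.util.find_spec(name)
    # Wrap the real loader so exec_module() installs a lazy proxy instead.
    spec.loader = importlib.util.LazyLoader(spec.loader)
    module = importlib.util.module_from_spec(spec)
    sys.modules[name] = module
    spec.loader.exec_module(module)
    return module

json = lazy_import("json")   # proxy created; json's code has not run yet
print(json.dumps({"a": 1}))  # first attribute access triggers the real import
```

The 3.15 feature integrates this behavior into the import statement itself, so no wrapper function is needed.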

The lazy keyword is always the first word on the line of an import you want to declare as lazy:


# lazily imports foo
lazy import foo
# lazily imports bar from foo
lazy from foo import bar
# same with the use of "as":
lazy import foo as foo1
lazy from foo import bar as bar1

Where to use lazy imports in Python

The most common scenario for using lazy imports is to replace the usual workaround for avoiding a costly import at program startup. As I mentioned previously, placing the import inside a function, instead of at the top level of a module, causes the import to happen only when the function runs. But it also means the import is limited to the function’s scope, and is therefore unavailable to the rest of the module unless you apply another workaround.

With a lazy import, you can keep the import in the top level of a module as you usually would. The only change you have to make is adding the lazy keyword to your code.

Using lazy imports automatically

It is also possible to enable lazy imports on an existing codebase automatically—without rewriting any import statements.

Python 3.15 adds new features to the sys module that control how lazy imports work. For instance, you can declare programmatically that every import from a given point forward in your program’s execution will be lazy:

import sys
sys.set_lazy_imports("all")

If sys.set_lazy_imports() is given "all", then every import in the program from that point on is lazy, whether or not it uses the lazy keyword. Passing "normal" means only imports explicitly marked with the lazy keyword are handled lazily, and passing "none" disables lazy importing across the board.

Controlling lazy imports programmatically

You can also hook into lazy imports at runtime, which lets you do things like control which specific modules are lazy-imported:


import sys

def mod_filter(importing, imported, fromlist):
    # Lazily import only the module named "module"; everything else is eager.
    return imported == "module"

sys.set_lazy_imports_filter(mod_filter)
sys.set_lazy_imports("all")

sys.set_lazy_imports_filter() lets you supply a function that takes in three parameters:

  • The module where the import is being performed
  • The module being imported
  • A list of names being imported

With that, you can write logic to return True to allow a given import to be lazily imported, or False to force it to be imported normally. This lets you write allow-lists and block-lists for lazy imports as part of a test, or simply as part of how your program works.
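For example, here is a sketch of an allow-list filter. It assumes the Python 3.15 sys.set_lazy_imports_filter API described above, and the module names in the allow-list are purely illustrative:

```python
# Modules known to be slow to import (illustrative names).
SLOW_MODULES = {"pandas", "matplotlib", "sympy"}

def allow_lazy(importing, imported, fromlist):
    # Return True to import lazily, False to import eagerly as usual.
    return imported in SLOW_MODULES

# On Python 3.15+, you would then register the filter:
# import sys
# sys.set_lazy_imports_filter(allow_lazy)
# sys.set_lazy_imports("all")
```

Everything not in the allow-list is imported eagerly, so the lazy behavior stays confined to the handful of dependencies that actually cost you startup time.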

Two ways to get started with lazy imports

Python has a long-standing tradition of allowing newer features to be added gracefully to existing codebases. Lazy imports can be used the same way: You can check for the presence of the feature at program start, then apply lazy imports across your codebase automatically by using sys.set_lazy_imports().

To start, you can check the Python version number:

import sys
if sys.version_info.major == 3 and sys.version_info.minor >= 15:
    ... # set up lazy imports

Or you can test for the presence of the lazy import controls in sys:

import sys
if getattr(sys, "set_lazy_imports", None):
    ... # set up lazy imports

