Why AI agents are so good at coding 10 Dec 2025, 9:00 am

I’ve written about how coding is so over. AI is getting smarter every day, and it won’t be long before large language models (LLMs) write better code than any human.

But why is coding the one thing that AI agents seem to excel at? The reasons are simple and straightforward. 

At their core, LLMs process text. They take in massive amounts of text, learn the patterns of that text, and then use all of that information to predict what the next word will be in a given sentence. These models take your question, parse it into text tokens, and then use the trillions (quadrillions?) of vectors they have learned to understand the question and give an answer, one word, or token, at a time. It seems wild, but it is literally that simple. An LLM produces its answer one word at a time.

Doing all this ultimately comes down to just a huge amount of vector math—staggering amounts of calculations. Fortunately, GPUs are really good at vector math, and that is why AI companies have an insatiable appetite for GPUs and why Nvidia is the most valuable company in the world right now. It seems weird to me that the technology used to generate amazing video games is the same that produces amazing text answers to our questions.

Code is text

And of course, code is just words, right? In fact, that is one of the basic tenets of coding—it’s all just text. Git is designed specifically to store and manage text, and to understand the differences between two chunks of text. The tool we all work in, an integrated development environment (IDE), is really a glorified text editor with a bunch of bells and whistles attached. Coding is all about words.

In addition to being words, those words are structured consistently and succinctly—much more so than the words we speak. Most text is messy, but all code by definition has patterns that are easier for an LLM to recognize than natural language. As a result, LLMs are naturally better at reading and writing code. LLMs can quite quickly and easily parse code, detect patterns, and reproduce those patterns on demand.

Code is plentiful

And there is an enormous amount of code out there. Just think of GitHub alone. A back-of-the-envelope calculation says there are around 100 billion lines of open-source code available for training AI. That’s a lot of code. A whole lot of code.

And if you need an explanation of how code works, there are something like 20 million questions and even more answers on Stack Overflow for AI to learn from. There’s a reason that Stack Overflow is a shell of its former self—we all are asking AI for answers instead of our fellow developers.

Code is verifiable

In addition, code is easily verified. First, does it compile? That is always the big first test, and then we can check via testing if it actually does what we want. Unlike other domains, AI’s code output can be checked and verified fairly easily. 

If you choose to, you can even have your AI write unit and integration tests beforehand, further clarifying and defining what the AI should do. Then, tell your AI to write code that passes the tests. Eventually, AI will figure out that test-driven development is the best path to writing good code and executing on your wishes, and you won’t even have to ask it to do that.
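To make that concrete, here is a minimal, hypothetical sketch of the workflow in Python: the tests define the behavior first, and the implementation (the part you would ask the agent to write) exists only to make them pass. The slugify example and its behavior are invented for illustration.

import re

def slugify(text: str) -> str:
    # The kind of implementation you would ask the agent to produce
    # so that the tests below pass.
    text = text.lower()
    text = re.sub(r"[^a-z0-9]+", "-", text)
    return text.strip("-")

def test_lowercases_and_hyphenates():
    assert slugify("Hello World") == "hello-world"

def test_strips_punctuation():
    assert slugify("C++ & Rust!") == "c-rust"

if __name__ == "__main__":
    test_lowercases_and_hyphenates()
    test_strips_punctuation()
    print("all tests pass")

Run it directly (or with pytest) and the passing tests are exactly the kind of easy verification described above.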

Welcome, Skynet

And finally, code is a great use case for AI agents because developers are generally unafraid of new technology and always seem ready to try out a new tool. This becomes a virtuous circle as AI companies produce coding agents, and developers embrace those coding agents. Software development is a huge part of the economy, and AI companies are strongly incentivized to lean into lucrative markets that are accepting and enthusiastic about using AI agents.

I for one welcome our new coding agent overlords. If there is one area that I’m fine with Skynet taking over, it’s the mundane job of writing structured text that has easily verifiable outcomes. Let the bots grind out code. I’m happy to do the fun part of thinking up and designing new tools and applications.


PythoC: A new way to generate C code from Python 10 Dec 2025, 9:00 am

Python and C share more than it might seem. The reference version of the Python interpreter is written in C, and many of the third-party libraries written for Python wrap C code. It’s also possible to generate C code from Python.

Generating C code with Python has typically involved libraries like Cython, which use type-annotated Python code to generate C extension modules for Python.

A new project, PythoC, takes a different approach. It uses type-hinted Python to programmatically generate C code—but chiefly for standalone use, and with many more compile-time code generation features than Cython has.

PythoC’s makers use the phrase “C level runtime, Python powered compile time” to describe their approach. The project is still in its early phases, but there are enough working features to make it worth a look.

A basic PythoC program

Here’s a simple program adapted from PythoC’s examples:


from pythoc import compile, i32

@compile
def add(x: i32, y: i32) -> i32:
    return x + y

if __name__ == "__main__":
    print(add(10, 20))

To indicate which functions in a module to compile to C, you use PythoC’s @compile decorator, supplying type hints for the result and each parameter. Note that you also need to import PythoC’s own i32 hint, instead of using Python’s native int. This means you’re using machine-native integers and not Python’s arbitrary-size ints.

When you run this program, you’ll get 30 as the output, after a delay. The C code is compiled on the fly each time you execute the program, hence the delay. PythoC doesn’t yet have a mechanism for re-using compiled code when it’s called from Python, the way Cython does.

At first this seems like a pretty big limitation. But it’s actually the point: You can use PythoC as a code generation system for C programs that run independently, rather than C modules imported into Python.

Generating standalone C programs

Here’s a new version of the same program, with different behavior.


from pythoc import compile, i32, ptr, i8
from pythoc.libc.stdio import printf

@compile
def add(x: i32, y: i32) -> i32:
    return x + y

@compile
def main(argc: i32, argv: ptr[ptr[i8]]) -> i32:
    printf("%u\n", add(10, 20))

if __name__ == "__main__":
    from pythoc import compile_to_executable
    compile_to_executable()

The first thing you’ll probably notice is the block at the bottom. The compile_to_executable() function is exactly what it sounds like. Call it, and the current module is compiled to an executable of the same name, with all the @compile-decorated functions included.

Another difference is that the main() function now has the same signature as the main() function in a C program. This means the compiled executable will automatically use that as its entry point.

Finally, when you run this program, the generated executable (which shows up in a build subdirectory) doesn’t run automatically; you have to run it yourself. The aim here is to build a standalone C program, indistinguishable from one you wrote by hand in C, but using Python’s syntax.

PythoC’s emulation of C features

With only a few exceptions, PythoC can generate code that fully utilizes C’s feature set and runtime.

You’ve already seen how to use type annotations to indicate primitive data types. You can likewise use the ptr[T] annotation to describe a pointer (also shown above), and use array[T, N] for N-dimensional arrays of type T. You can make structs, unions, and enums by decorating Python classes, and all the usual operators and control-flow operations (except for goto) will work. For switch/case, just use match/case, although fall-through cases aren’t available.
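As a rough sketch of the switch-style dispatch (based only on the features described above; how PythoC actually renders match/case in C isn’t shown here), a compiled function might look like this:

from pythoc import compile, i32

@compile
def day_kind(day: i32) -> i32:
    # match/case stands in for C's switch; there is no fall-through
    match day:
        case 0:
            return 0  # Sunday: weekend
        case 6:
            return 0  # Saturday: weekend
        case _:
            return 1  # weekday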

Something else that’s missing is variable-length arrays. In C, VLAs arrived in C99 and were made optional in C11, and compiler support for them varies, so it’s not surprising PythoC doesn’t support them yet.

Compile-time code generation

It’s possible to use Cython for compile-time code generation, which means you can produce different kinds of C code, or even fall back to Python code, depending on what happens at compile time. But PythoC’s compile-time code generation has abilities Cython lacks.

Here’s an example from PythoC’s documentation:


from pythoc import compile, struct, i32, f64

def make_point(T):
    @struct(suffix=T)
    class Point:
        x: T
        y: T

    @compile(suffix=T)
    def add_points(p1: Point, p2: Point) -> Point:
        result: Point = Point()
        result.x = p1.x + p2.x
        result.y = p1.y + p2.y
        return result

    return Point, add_points

Point_i32, add_i32 = make_point(i32)
Point_f64, add_f64 = make_point(f64)

The make_point(T) function takes a type annotation (i32, f64) and generates type-specialized versions of the Point class and the add_points function at compile time. The suffix parameter for @struct and @compile means “alter the name of the generated object so that the type is used in the name”—so, for example, Point becomes Point_i32 and Point_f64, which in C is one way to distinguish between multiple versions of the same function with a different type signature. It’s also possible to use this in conjunction with runtime dispatch to provide polymorphism.

Memory safety features

The bugs that can spring from C’s manual memory management are gloomily familiar to anyone who uses the language. Cython has memory safety features to address this, but PythoC offers unique type-based features in this vein.

One is a feature called linear types. The linear import lets you generate a “proof,” usually to accompany a memory allocation, that has to be “consumed” when the same memory is deallocated. If you don’t have a matching consume(prf) for each prf = linear(), the PythoC compiler raises a compile-time error. The project’s documentation shows how to create simple lmalloc()/lfree() functions to allocate and free memory safely. Nothing says you must use linear types rather than calling malloc()/free() manually, but they can automate much of that manual checking and centralize it at compile time rather than runtime.
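Here’s a rough sketch of how such a wrapper might be shaped. The linear() and consume() names come from PythoC’s documentation as described above; the pythoc.libc.stdlib import path, the exact signatures, and the way the proof is handled are assumptions for illustration, not PythoC’s documented API.

from pythoc import compile, linear, consume, ptr, i8, i32
from pythoc.libc.stdlib import malloc, free  # assumed import path, by analogy with pythoc.libc.stdio

@compile
def fill_buffer(n: i32) -> i32:
    buf: ptr[i8] = malloc(n)  # allocate raw memory
    prf = linear()            # proof that this allocation must eventually be released
    # ... work with buf here ...
    consume(prf)              # consumed exactly once; omit this and the build fails
    free(buf)
    return 0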

Another type-based safety feature is refinement types. The idea here is that you define a function to perform a certain kind of check—e.g., for a null pointer—with a boolean result. You can then use the refine() function to pass a value to that function and get back a type specific to that function, refined[func]. This lets the compiler ensure that the type is handled in some manner before being returned, and it allows common checks (again, for things like a non-null pointer) to be handled in a single place in your code. Cython’s type system mostly emulates C’s behaviors directly, and so doesn’t include anything like this.

Possible future directions for PythoC

PythoC is still quite new, so its future development is relatively open ended. One possibility is that it could integrate more closely with Python at runtime. For instance, a @cached decorator could compile modules once, ahead of time, and then re-use the compiled modules when they’re called from within Python, instead of being recompiled at each run. Of course, this would also require integration with Python’s existing module build system. While that level of integration might not be part of the project’s aim, it would make PythoC more immediately useful to those integrating C and Python.


Is vibe coding the new gateway to technical debt? 10 Dec 2025, 9:00 am

If programming were nothing more than taking an idea and turning it into running code, AI-assisted development might fully deliver on its promise. But AI’s ability to transform natural language prompts into runnable software ultimately highlights the importance of non-mechanical aspects of programming.

Rather than reducing or eliminating the human element in software development, AI reveals the human traits that are key to successful projects. One of these is the ability to sense when a piece of software is becoming unmanageable, and to mitigate the damage.

Such discernment differentiates novice and veteran developers. Increasingly, it also differentiates software written by humans from AI-generated code.

What vibe coding is not

By now, most developers know what vibe coding is. Many of us are using it in some part of our everyday workflow. AI assistance in the software development life cycle went from revolutionary to passé in about a week. It is extraordinary how quickly software developers can absorb shocks to the industry and just keep going.

But some wags keep missing this key factor: Developers are the ones adopting and applying the technology. We can’t be replaced because at the end of the day, we are the ones holding the reins that guide the tools.

The future is AI-empowered development, not AI-replaced developers. The age-old skills that make a good developer valuable become even more important when writing and pushing functioning code gets cheaper. Leave out the developer, and you just get mountains of technical debt.

You think requirements are easy?

The big idea in AI-driven development is that now we can just build applications by describing them in plain English. The funny thing is, describing what an application does is one of the hardest parts of software development; it’s called requirements gathering.

Anyone who has spent time defining requirements knows it’s a real swampland. It’s full of “I’ll know it when I see it,” which really means, “I’ll know that’s not it when I see it.” Bridging between the technical programming and end-user fit is notoriously hard to do. Good developers are the ones who can walk between those two worlds.

But now we are riding a vibe. A vibe, in this case, is an unwritten requirement. It is always changing—and with AI, we can keep manifesting these whims at a good clip. But while we are projecting our intentions into code that we don’t see, we are producing hidden effects that add up to masses of technical debt. Eventually, it will all come back to bite us.

What developers actually do

AI researcher Gary Marcus recently posted an article on Substack, Is vibe coding dying?, where he referenced a comment from a disillusioned AI coder:

I just want to say that I am giving up on creating anything anymore. I was trying to create my little project, but every time there are more and more errors and I am sick of it. I am working on it for about 3 months, I do not have any experience with coding and was doing everything through AI (Cursor, ChatGPT etc.). But everytime I want to change a liiiiitle thing, I kill 4 days debugging other things that go south.

So I do not have any more energy in me to work on this. It is hopeless. AI is still just soooooo stupid and it will fix one thing but destroy 10 other things in your code. I am really sad, because I was enjoying it in the beginnings but now it is just pain and rage. Hat down for those people, who can create something and it is working without coding knowledge.

The reason this quote is so perfect will be clear to most software developers. This is a non-coder who believes their project failed due to their inexperience and use of AI. But developers know that “I changed X, and way over there in Y and Z something broke” is a classic pitfall of software development.

Software is a bizarrely interrelated complex of things, like a quantum-entangled system. Managing this madness is a big part of what successful software developers get paid to do.

Vibe coding and technical debt

Don’t get me wrong: I appreciate the unbridled creativity that can be found in just riding a vibe on an AI platform. If you haven’t already tried it, I suggest sitting down with Roo Code and seeing how it feels to blast out working software with just a few keystrokes.

At first, vibe coding is intoxicating. You can rapidly produce all the basic application infrastructure without even thinking about it. It’s like driving a 4×4 that sails over speed bumps. Pretty soon, though, you will find yourself off-roading in the ravines of whack-a-mole fix-and-break, as the user quoted above did. Suddenly, that landscape of magically functional code becomes a yawning chasm of technical debt that you have to figure out. And if you don’t have the coding background to understand what you’re up against, you will drown in it.

Sure, you can try using AI to fix the things that are breaking, but have you tried it? Have you ever been stuck with an AI assistant confidently running you and your code around in circles? Even with something like Gemini CLI and DevTools integration (where the AI has access to the server and client-side outputs) it can so easily descend into a maddening cycle. In the end, you are mocked by your own unwillingness to roll up your sleeves and do some work.

It’s certainly one of the strangest experiences I’ve had with a computer: relying on my own hard-won instincts to ferret out root problems the AI itself obscures.

Be careful how you use it

Some might be tempted to accuse me of being anti-AI, which is not true. I love AI for coding. I’d even say (and have said) it brings back some of the joy and sense of possibility of the early days of the Internet. By dealing with a lot of the formality of coding, AI brings more ambitious ideas and efforts into scope. It lets developers spend more time in the zone of creativity.

If I had to choose one thing that is the most compelling about AI-coding, it would be the ability to quickly scale from nothing. The moment when I get a whole, functioning something based on not much more than an idea I described? That’s a real thrill.

Weirdly, AI also makes me feel less alone at times, like there is another voice in the room.

If in the end that is what we mean by “vibe coding,” I’m all for it. I like to be in the flow just as much as anyone else. The key is to be in the flow without unwittingly amassing junkyards of bad code. At the end of the day, AI will magnify whatever we put into it. It’s like trading a handsaw for a chainsaw: Better be careful how you use it.

The balancing act hasn’t changed

Programming has always been part engineering and part craft. A good software developer brings together these two parts in one mind. Modern AI helps with both the creative and mechanical aspects of software development, but a human is still needed to unite them with understanding.

The balancing act of software development hasn’t changed; it is just operating at a higher order with AI.


GitHub Action Secrets aren’t secret anymore: exposed PATs now a direct path into cloud environments 10 Dec 2025, 2:06 am

Many enterprises use GitHub Action Secrets to store and protect sensitive information such as credentials, API keys, and tokens used in CI/CD workflows. These secrets, and the private repositories that use them, are widely assumed to be safe and locked down.

But attackers are now exploiting that blind trust, according to new research from the Wiz Customer Incident Response Team. They found that threat actors are using exposed GitHub Personal Access Tokens (PATs) to access GitHub Action Secrets and sneak into cloud environments, then run amok.

“The root cause issue is the presence of these secrets in repos,” said David Shipley of Beauceron Security. “Cloud service provider access keys are gold, they can be extraordinarily long lived, and that’s what [attackers are] sniffing around for.”

GitHub Action Secrets aren’t secrets anymore

Wiz estimates that 73% of organizations using GitHub Action Secrets in private repositories store cloud service provider (CSP) credentials within them. When PATs, which allow developers and automation bots to interact with GitHub repositories and workflows, are exploited, attackers can easily move laterally to CSP control planes.

PATs can become a “powerful springboard” that allows attackers to impersonate developers and carry out a range of activities, explained Erik Avakian, technical counselor at Info-Tech Research Group. It’s like having a backstage pass into a company’s cloud environments, he said.

“Once they’re holding that valid PAT, they can do all sorts of things in GitHub that lead directly back into a company’s AWS, Azure, GCP, or other types of cloud services, because GitHub treats that PAT like the real developer,” he said.

With that access, threat actors can “poke around” various repositories and workflows and look for anything that hints at cloud access, configuration items, scripts, and hidden secrets, he noted. If they get access to real cloud credentials, they “have the keys to the company’s AWS bucket, Azure subscriptions, and other workflows.”

They can then spin up cloud resources, access databases, steal source code, install malicious files such as crypto miners, sneak in malicious workflows, or even pivot to other cloud services, while setting up persistence mechanisms so they can return whenever they want.

“At that point, basically anything you can do in the cloud, so can they,” said Avakian.

Easily evading detection

Wiz found that a threat actor with basic read permissions via a PAT can use GitHub’s API code search to discover secret names embedded directly in a workflow’s yaml code, accessed via “${{ secrets.SECRET_NAME }}.”

The danger is that this secret discovery method is difficult to monitor because search API calls are not logged. Further, GitHub-hosted Actions run from GitHub-managed resources that use legitimate, shared IP addresses not flagged as malicious. Attackers can abuse secrets, impersonate workflow origins to exploit trust, and potentially access other resources if code is misconfigured or reused elsewhere in the workflows. They can also maintain persistent access to the system.

In addition, if the exploited PAT has write permissions, attackers can execute malicious code and remove workflow logs and runs, pull requests, and ‘created branches’ (isolated copies of codebases for dev experimentation). Because workflow logs are rarely streamed into security incident and event management (SIEM) platforms, attackers can easily evade detection.

Also, notably, a developer’s PAT with access to a GitHub organization makes private repositories vulnerable; Wiz research found that 45% of organizations have plain-text cloud keys stored privately, while only 8% are in public repositories.

Shipley noted: “In some developers’ minds, a private repo equals safe, but it’s clearly not safe.”

How enterprise leaders can respond

To protect themselves against these threats, enterprises should treat PATs as they would any other privileged credentials, Avakian noted. Cloud infrastructure and cloud development environments should be properly locked down, essentially “zero trustifying” them through microsegmentation and privileged user management to contain them and prevent lateral pivoting.

“Like any other credentials, tokens are best secured when they have reasonable expiration dates,” said Avakian. “Making tokens expire, rotating them, and using short-lived credentials will help thwart these types of risks.”

Apply least privilege everywhere and give accounts only the rights they need, rather than taking an ‘admin everything’ approach, Avakian advised. More importantly, move cloud secrets out of GitHub workflows and ensure that proper monitoring and log review processes are in place to flag surprise or unexpected workflow or cloud creation events.

Beauceron’s Shipley agreed, saying that enterprises need a multi-pronged strategy, good monitoring, incident response plans, and developer training processes that are reinforced with “meaningful consequences” for non-compliance. Developers must be motivated to follow secure coding best practices; building a strong security culture in developer teams is huge. “You can’t buy a blinky box for that part of the problem,” he said.

“Criminals have stepped up their game,” said Shipley. “Organizations don’t have a choice. They have to invest in these areas, or they will pay.”

Also, stop blindly trusting GitHub repos, he added. “The nature of repos is that they live forever. If you don’t know if you have cloud secrets inside your repos, you need to go and find them. If they’re there, you need to change them yesterday, and you need to stop adding new ones.”
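One way to start that search is GitHub’s code search API. The sketch below is a minimal, hypothetical audit script: the org name is a placeholder, the token is read from the environment, and the query only surfaces workflow files that reference Action Secrets for a human to review.

import os
import requests

ORG = "your-org"                     # placeholder organization name
TOKEN = os.environ["GITHUB_TOKEN"]   # a read-scoped PAT; treat it like any other credential

# Search the organization's workflow files for references to Action Secrets.
resp = requests.get(
    "https://api.github.com/search/code",
    params={"q": f"secrets org:{ORG} path:.github/workflows"},
    headers={
        "Accept": "application/vnd.github+json",
        "Authorization": f"Bearer {TOKEN}",
    },
    timeout=30,
)
resp.raise_for_status()

# Each hit is a workflow that pulls in a secret; review what those secrets actually hold.
for item in resp.json().get("items", []):
    print(item["repository"]["full_name"], item["path"])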

If there is an upside, he noted, it’s that enterprises are “victims of their own success” as they’ve raised the bar with multi-factor authentication (MFA). Gains in general security awareness make it more difficult for criminals to obtain access and identities and compromise systems.

“In some ways, this is a good sign,” said Shipley. “In a hilarious kind of way, it means [the criminals] are now moving into deeper levels requiring more effort.”


Linux Foundation launches Agentic AI Foundation 10 Dec 2025, 12:50 am

The Linux Foundation has announced the formation of the Agentic AI Foundation (AAIF), which is intended to provide a neutral, open foundation to ensure that agentic AI evolves transparently and collaboratively.

Announced December 9, the AAIF is anchored by founding contributions including Anthropic’s Model Context Protocol, an open protocol for integrating LLM applications and external data sources and tools; Block’s goose, an AI coding agent; and OpenAI’s AGENTS.md, an open format for guiding coding agents. These inaugural projects lay the groundwork for a shared ecosystem of tools, standards, and community-driven innovation, according to the Linux Foundation. “Bringing these projects together under the AAIF ensures they can grow with the transparency and stability that only open governance provides,” said Linux Foundation Executive Director Jim Zemlin in a statement.

Founding AAIF members include Amazon Web Services, Anthropic, Bloomberg, Cloudflare, Google, IBM, JetBrains, Microsoft, OpenAI, and Salesforce. The advent of agentic AI represents a new era of autonomous decision-making and coordination across AI systems that will transform and revolutionize entire industries, the Linux Foundation said.


AWS is still chasing a cohesive enterprise AI story after re:Invent 9 Dec 2025, 2:58 pm

AWS kicked off re:Invent 2025 with a defensive urgency that is unusual for the cloud leader, arriving in Las Vegas under pressure to prove it can still set the agenda for enterprise AI.

With Microsoft and Google tightening their grip on CIOs’ mindshare through integrated AI stacks and workflow-ready agent platforms, AWS CEO Matt Garman and his lieutenants rolled out new chips, models, and platform enhancements, trying to knit the updates into a tighter pitch that AWS can still offer CIOs the broadest and most production-ready AI foundation.

Analysts remain unconvinced that AWS succeeded.

“We are closer, but not done,” said David Linthicum, independent consultant and retired chief cloud strategy officer at Deloitte.

Big swing but off target

Garman’s biggest swing, at least the one that got it “closer,” came in the form of Nova Forge, a new service with which AWS is attempting to confront one of its strategic weaknesses: the absence of a unified narrative that ties data, analytics, AI, and agents into a single, coherent pathway for enterprises to adopt.

It’s this cohesion that Microsoft has been selling aggressively to CIOs with its recently launched IQ set of offerings.

Unlike Microsoft’s IQ stack, which ties agents to a unified semantic data layer, governance, and ready-made business-context tools, Nova Forge aims to provide enterprises with raw frontier-model training power, in the form of a toolkit for building custom models with proprietary data, rather than a pre-wired, workflow-ready AI platform.

But it still requires too much engineering lift to adopt, analysts say.

AWS is finally positioning agentic AI, Bedrock, and the data layer as a unified stack instead of disconnected services, but according to Linthicum, “It’s still a collection of parts that enterprises must assemble.”

There’ll still be a lot of work for enterprises wanting to make use of the new services AWS introduced, said Phil Fersht, CEO of HFS Research.

“Enterprise customers still need strong architecture discipline to bring the parts together. If you want flexibility and depth, AWS is now a solid choice. If you want a fully packaged, single-pane experience, the integration still feels heavier than what some competitors offer,” he said.

Powerful tools instead of turnkey solutions

The engineering effort needed to make use of new features and services echoed across other AWS announcements, with the risk that they will confuse CIOs rather than simplify their AI roadmap.

On day two of the event, Swami Sivasubramanian announced new features across Bedrock AgentCore, Bedrock, and SageMaker AI to help enterprises move their agentic AI pilots to production, but these still focused on providing tools that accelerate tasks for developers rather than offering “plug-and-play agents” by default, Linthicum said.

The story didn’t change when it came to AWS’s update to vibe-coding tool Kiro or the new developer-focused agents it introduced to simplify devops, said Paul Nashawaty, principal analyst at The Cube Research.

“AWS clearly wants to line up against Copilot Studio and Gemini Agents. Functionally, the gap is closing,” said Nashawaty. “The difference is still the engineering lift. Microsoft and Google simply have tighter productivity integrations. AWS is getting there, but teams may still spend a bit more time wiring things together depending on their app landscape.”

Similarly, AWS made very little progress toward delivering a more unified AI platform strategy. Analysts had looked to the hyperscaler to address complexity around the fragmentation of its tools and services by offering more opinionated MLops paths, deeper integration between Bedrock and SageMaker, and ready-to-use patterns that help enterprises progress from building models to deploying real agents at scale.

Linthicum was dissatisfied with efforts by AWS to better document and support the connective tissue between Bedrock, SageMaker, and the data plane. “The fragmentation hasn’t vanished,” he said. “There are still multiple ways to do almost everything.”

The approach taken by AWS contrasts sharply with those of Microsoft and Google to present more opinionated end-to-end stories, Linthicum said, calling out Azure’s tight integration around Fabric and Google’s around its data and Vertex AI stack.

Build or buy?

CIOs who were waiting to see what AWS delivered before finalizing their enterprise AI roadmaps are back at a familiar fork: powerful primitives versus turnkey platforms.

They will need to assess whether their teams have the architectural discipline, MLops depth, and data governance foundation to fully capitalize on AWS’s latest additions to its growing modular stack, said Jim Hare, VP analyst at Gartner.

“For CIOs prioritizing long-term control and customization, AWS offers unmatched flexibility; for those seeking speed, simplicity, and seamless integration, Microsoft or Google may remain the more pragmatic choice in 2026,” Hare said.

The decision, as so often, comes down to whether the enterprise wants to build its AI platform or just buy one.

This article first appeared on CIO.com.


Spotlight: Making the most of multicloud 9 Dec 2025, 10:00 am

Download the December 2025 issue of the Enterprise Spotlight from the editors of CIO, Computerworld, CSO, InfoWorld, and Network World.


The hidden cost of Amazon Nova 2 9 Dec 2025, 9:00 am

Amazon’s Nova 2 announcement at AWS re:Invent 2025 is exactly the type of AI offering we expected from AWS and, frankly, exactly what should make thoughtful architects nervous. Nova 2 is positioned as a frontier-grade model, tightly integrated with Amazon Bedrock. It’s part of a growing ecosystem of “frontier agents” and the AgentCore framework unveiled at re:Invent 2025. The story is compelling: better models, better tools, and a single platform to build, deploy, and scale agentic AI.

And yet, there’s a problem. It isn’t that Nova 2 is technically weak. The problem is that it is strategically strong in all the wrong ways for customers who care about independence, portability, and long-term value. AWS is not just selling you a model; the company is selling you an entire worldview where your agentic fabric, data flows, and operational patterns are deeply rooted in one cloud.

Vendor lock-in versus actual value

Lock-in is a spectrum, and the ecosystem of Nova 2, Bedrock, and AgentCore pushes you far toward the “tightly coupled” end of that spectrum. On paper, you get convenience: native integrations, managed infrastructure, observability, and security primitives that understand the agentic constructs AWS has defined. In practice, you are anchoring the core of your emerging AI capability into APIs, runtimes, and orchestration semantics that exist only within AWS.

The question I want enterprises to ask is simple: Are you optimizing for the next six quarters or the next six years? It’s likely that during the next six quarters, Nova 2 and its ecosystem will make you more productive. But during the next six years, the cost of migrating away from this ecosystem—or even meaningfully using a second cloud for AI—will rise dramatically. Your agents will be written to AWS’s tool APIs, observability model, security posture, and the way AWS wires agents to data and events. That is not theoretical lock-in; it is baked into every line of code and every workflow you build.

If you view AI as a transient experiment, this may not bother you. If you believe, as I do, that agentic systems will become the operational nervous system of most enterprises, then concentrating that critical capability inside a single vendor’s ecosystem is a strategic risk, not a feature.

Agentic fabric: native versus portable

The notion of an “agentic fabric” is useful: a mesh of agents that can reason, act, and collaborate across data sources, applications, and infrastructure. AWS’s vision is a cloud-native fabric where agents are first-class citizens inside services like Bedrock, wired to Lambda, Step Functions, EventBridge, and a growing set of AI-ready data services. The fabric is smooth—as long as you stay inside their walls.

The alternative is a cloud-portable fabric. Instead of building directly against closed, vendor-specific agent frameworks, you define agents in terms of more open abstractions: model-agnostic interfaces, cross-cloud orchestration, and data access layers that do not assume a particular vendor’s storage or event model. You might still run agents on AWS, but you can also run them on other clouds, on-premises, or at the edge without rewriting them from scratch.
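As a rough illustration of what that abstraction looks like in code (the names here are illustrative, not any particular framework’s API), the agent logic targets a small interface rather than a vendor SDK:

from typing import Protocol

class ChatModel(Protocol):
    # The only contract the agent depends on: a prompt in, a completion out.
    def complete(self, prompt: str) -> str: ...

class EchoModel:
    # Stand-in provider used for local testing; a Bedrock, Vertex, or on-prem
    # adapter would implement the same one-method interface.
    def complete(self, prompt: str) -> str:
        return f"[echo] {prompt}"

def run_agent(model: ChatModel, task: str) -> str:
    # Agent logic written only against ChatModel; swapping clouds means
    # swapping the adapter, not rewriting the agent.
    return model.complete(f"Plan and execute: {task}")

if __name__ == "__main__":
    print(run_agent(EchoModel(), "summarize last week's deploy logs"))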

Nova 2 and the surrounding tools tilt you hard toward cloud-native and away from cloud-portable. When your agents depend on AWS-specific features—say, Bedrock’s proprietary agent orchestration primitives or AWS-only plug-in patterns—your portability story collapses. The cost to move is not just “change the model endpoint”; it becomes “re-implement how the agent thinks, acts, and integrates.” That type of cost kills multicloud strategies in practice, even when they look good on PowerPoint.

Operational burden or simplification

AWS is selling Nova 2 and AgentCore as simplifying complexity, and in some respects that is true. You get unified observability, integrated security, and pre-packaged patterns for building safe, production-grade agents. But let’s be very clear about what is happening: AWS is not removing complexity; it is encapsulating it inside black boxes you do not control.

When those black boxes malfunction, drift, or change, you are at the mercy of AWS’s release cadence and operational practices. You will still need teams who understand the behavior of your agents in detail, but you will be diagnosing problems in systems whose core behavior is defined by a vendor’s code and policies, not your own. That is a different kind of fragility. Instead of owning complexity you can see and manage, you’re renting complexity and hoping it behaves.

On top of that, operations teams now have to understand not only distributed cloud-native systems, but emergent, probabilistic agent behavior embedded within them. If your observability, governance, and control mechanisms are all bound to AWS-specific services, you lose the ability to build a unified operations view across clouds and on-prem systems. AWS wants to be your single pane of glass, but the reality is that most large enterprises need several panes, and those panes must interoperate.

Taking the long view

When you adopt Nova 2 and its ecosystem as your primary agentic platform, you are choosing a vertically integrated stack. The immediate upsides are undeniable: optimized performance, deep integrations, turnkey security patterns, and less glue code. For many teams, particularly those that are small, under-resourced, or deeply aligned with AWS already, this is a rational short-term decision.

But the downsides show up over time, and they show up at the architectural level, not in developer convenience. You lose leverage on pricing as your dependence on AWS-specific agent capabilities grows. You will find it harder to adopt innovations that emerge on other clouds or in open source communities, because your systems are built around a specific model of agents and tools. You will discover that “multicloud” has devolved into “one primary cloud for anything that matters and some residual workloads elsewhere,” which is exactly the outcome the big clouds are optimizing for.

If you want more open and portable approaches, you pay more up front. You build or adopt neutral orchestration layers, use frameworks that abstract model providers, and design observability that spans heterogeneous environments. You resist the gravitational pull of single-vendor AI fabrics, even when they look impressively polished. The payoff is flexibility: the ability to change direction when economics, regulation, or innovation demand it, without rewriting the nervous system of your enterprise.


Amazon Q Developer: Everything you need to know 9 Dec 2025, 9:00 am

When I reviewed Amazon Q Developer in 2024, I noted that it was able to generate whole functions in common programming languages with only a few fixes. It was useful for completing lines of code, doc strings, and if/for/while/try code blocks as you type. It could also scan for vulnerabilities and help you fix code problems. However, it could not generate full functions for some use cases, but instead reverted to line-by-line suggestions.

At the time, Amazon Q Developer was “powered by Amazon Bedrock” and trained on “high-quality AWS content.” I never knew what the first really meant, but that’s now moot: Amazon Q Developer now gives you a choice of Claude Sonnet versions, including 4.5, which is competitive with GPT-5 Codex.

At this point, completing lines of code qualifies as a “gimme,” the equivalent of a two-foot putt in golf. Generating whole functions is easy, generating complex applications is moderately difficult, and fixing reported bugs in large repositories ranges from moderately difficult to difficult.

Some of the current differentiators for coding agents are their ability to call tools (e.g. read files, run applications, show edit diffs, and understand Git and GitHub) and their ability to define and use Model Context Protocol (MCP) servers. MCP servers are tools that allow AI models to interact with external services, such as databases, APIs, and enterprise systems, using a standard, unified protocol. Another differentiator is the ability to run tools and projects in an isolated environment.

At risk of giving away the punch line, Amazon Q Developer supports tools, supports MCP servers but makes it harder than necessary to configure them, and doesn’t have the best implementation of isolated cloud environments. It’s still useful, however, and we can only hope that AWS will fill in the obvious gaps.

Amazon Q Developer competes with Google Jules, OpenAI Codex Cloud and OpenAI Codex CLI, Claude Code, Windsurf, Augment Code, Amp, Cline, Zencoder, and Gemini CLI. The current Amazon Q Developer versions as of this writing are Plug-in 1.104.0 and CLI 1.19.6.

Amazon Q Developer in editors

Amazon Q Developer runs in JetBrains IDEs, Visual Studio Code, Visual Studio, Eclipse, and the command line (CLI). By the way, the Q CLI has been rebranded as Kiro CLI with a version bump to 1.20; one change from the Q CLI / Kiro CLI I’ve been testing is the addition of an “auto” model option.

Amazon Q Developer generates real-time code suggestions ranging from snippets to full functions based on your comments and existing code. It also supports inline chat directly in the code editor, and CLI completions and natural language to Bash translation in the command line. Q Developer can write unit tests, optimize code, and scan for vulnerabilities. If you enable agentic coding, Q Developer can perform tasks on your behalf by reading and writing files, generating code diffs, and running shell commands, while incorporating your feedback and providing real-time updates along the way.

You can use chat and inline chat for any programming language. Inline code suggestions work best with C, C++, C#, Dart, Go, Java, JavaScript, Kotlin, Lua, PHP, PowerShell, Python, R, Ruby, Rust, Scala, Shell, SQL, Swift, SystemVerilog, TypeScript, CDK (TypeScript, Python), HCL (Terraform), JSON, and YAML. The chat works in at least 13 natural languages and automatically responds in the language you use for your query. This is a major improvement over last year’s language capabilities.

[Screenshot: Amazon Q information in Visual Studio Code. At the right is the plug-in description. At the left you can see the chat help text above the chat input box. Note the blue switch at the lower left, which lets you choose whether to use agentic coding for the current request. Also note the choice of model, Claude Sonnet 4.5.]

Testing Amazon Q Developer Chat

To compare Amazon Q Developer Chat with Codex Chat I tried essentially the same prompt to create a command-line program to calculate pi.

Generate a C++ program to calculate the value of pi to the number of decimal digits specified in the command line. Make sure the program will be accurate to at least 1000 digits. Use the algorithm that is most efficient at run-time, regardless of the amount of code required. Save the program in the AmazonQ folder.

I used the Claude Sonnet 4.5 model, which I expected to do well on this task. Amazon Q Developer responded with the following.

I'll create a C++ program using the Chudnovsky algorithm, which is one of the fastest known algorithms for calculating π. It uses the GMP library for arbitrary precision arithmetic.
Creating the AmazonQ directory if it doesn't exist
Creating the C++ program that uses the Chudnovsky algorithm for calculating pi to arbitrary precision
Creating a Makefile to simplify compilation of the pi calculator
Creating a README with build and usage instructions
I've created a C++ program that calculates π using the Chudnovsky algorithm, which is one of the most efficient algorithms for computing π (converges at ~14 digits per iteration).
Files created in /Users/martinheller/Code/AmazonQ/:
• pi_calculator.cpp - Main program using GMP for arbitrary precision
• Makefile - For easy compilation
• README.md - Build and usage instructions
To use:
1.	Install GMP: brew install gmp
2.	Build: make
3.	Run: ./pi_calculator 1000 (for 1000 digits)
The program accurately handles 1000+ digits and uses the fastest known algorithm for runtime efficiency.

That sounded more promising than Codex’s first try, but it didn’t compile. For background, you should know that GMP has both C and C++ libraries; I only installed the C library, but Amazon Q Developer didn’t know that. I asked it to fix the build bug after highlighting that output, but it didn’t take the context. No matter, it still fixed the problem:

Let me try to build it first to see the specific error:
Attempting to build the program to identify the specific build error
Removing the unnecessary gmpxx.h include and fixing the C constant calculation
Removing -lgmpxx from the Makefile since we don't need the C++ wrapper
Building the program again to verify the fix works
Testing the program with 100 digits to verify it works correctly
Testing the program with the correct path
Running the pi calculator to verify it works
Fixed. The build issue was the missing gmpxx.h header. I removed the unnecessary C++ wrapper include and simplified the constant calculation. The program now builds and runs correctly, calculating π to the specified precision.

I asked Amazon Q Developer Chat to time the program for 100 and 10,000 digits so that I could compare the efficiency with the MPFR/GMP implementation that Codex generated on its second try. Unsurprisingly, the calculation was a lot slower for 10,000 digits. I asked Q Developer how it might make this calculation more efficient for very high numbers of digits.

Yes, there are several optimizations for very high digit counts:
1. Binary Splitting - The most significant optimization. Instead of computing terms sequentially, recursively split the sum into halves and combine. This reduces complexity from O(n²) to O(n log n) for n terms.
2. Use MPFR instead of GMP - MPFR is optimized for floating-point operations and has better performance for transcendental functions.
3. Avoid Redundant Factorial Calculations - Your current code recalculates factorials from scratch each iteration. Binary splitting naturally avoids this.
4. Memory Management - Reuse mpz_t variables instead of allocating/deallocating in loops.
5. Parallel Computation - For extremely large calculations (millions of digits), parallelize the binary split recursion.
I can see you already have a more optimized version in Code/Codex/pi_mpfr.cpp that uses binary splitting with MPFR. That implementation will be significantly faster for large digit counts (10,000+).

Oops. I didn’t mean to leave that file open where Amazon Q Developer could see it. Nevertheless, Q Developer/Claude Sonnet 4.5 answered correctly for all five points.
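For readers curious what binary splitting actually looks like, here is a compact pure-Python rendition of the Chudnovsky scheme (my own sketch for illustration, not code generated by Q Developer; Python’s Decimal stands in for GMP/MPFR):

from decimal import Decimal, getcontext

def binsplit(a, b):
    # Recursively combine terms a..b of the Chudnovsky series, returning P, Q, T.
    if b - a == 1:
        if a == 0:
            P = Q = 1
        else:
            P = (6 * a - 5) * (2 * a - 1) * (6 * a - 1)
            Q = a * a * a * 10939058860032000  # 640320**3 // 24
        T = P * (13591409 + 545140134 * a)
        return P, Q, -T if a % 2 else T
    m = (a + b) // 2
    P1, Q1, T1 = binsplit(a, m)
    P2, Q2, T2 = binsplit(m, b)
    # Combining halves avoids recomputing factorials term by term.
    return P1 * P2, Q1 * Q2, Q2 * T1 + P1 * T2

def chudnovsky_pi(digits: int) -> Decimal:
    getcontext().prec = digits + 10  # a few guard digits
    terms = digits // 14 + 2         # the series yields ~14.18 digits per term
    _, Q, T = binsplit(0, terms)
    return Decimal(426880) * Decimal(10005).sqrt() * Q / T

if __name__ == "__main__":
    print(str(chudnovsky_pi(100))[:102])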

[Screenshot: Pi calculator in C++ generated by Amazon Q Developer Chat using the Claude Sonnet 4.5 model. This implementation uses the GMP library for high-precision integer arithmetic.]

Testing the Amazon Q Developer CLI

Before testing the Amazon Q Developer CLI (again) I checked its menu bar icon drop-down for updates. It was already up-to-date. Then I ran q in iTerm2 and asked it for help information.

[Screenshot: Amazon Q Developer CLI help information display. This is as of Q 1.19.4. With version 1.20, the Q CLI is rebranded as the Kiro CLI.]

For a first test I changed to the Cline repo directory, started the Q CLI, switched Q to the claude-sonnet-4.5 model (from the default 4.0), and asked it to explain the current repository.

[Screenshot: Amazon Q Developer CLI with claude-sonnet-4.5 explaining the Cline repository. This is a reasonable high-level summary. It’s not as detailed as the explanation given by Codex last month, but it is easier to understand.]

Then I repeated the experiment to have Q explain the amazon-q-developer-cli repository.

[Screenshot: Amazon Q Developer CLI explaining its own repository. Again, it gave a high-level summary that is easy to understand.]

Next, I cloned the websocket repository from my own fork of the project, switched to that directory, ran the Q CLI, switched to the claude-sonnet-4.5 model, and asked Q to pick an open issue in the websocket repo and fix it in a new branch. Q struggled a bit with some of the issues, but found a problem with the chat example that it was able to fix.

[Screenshot: Amazon Q Developer CLI explaining a fix it made to websocket. Issue #986 is about the chat example not working due to a missing CheckOrigin implementation. It was a simple three-line addition to the example. Q created the fix and checked it in to a new branch on my laptop.]

I had to push the branch to my GitHub repository myself:

martinheller@Mac websocket % git push
fatal: The current branch fix-chat-example-checkorigin has no upstream branch.
To push the current branch and set the remote as upstream, use

    git push --set-upstream origin fix-chat-example-checkorigin

To have this happen automatically for branches without a tracking
upstream, see 'push.autoSetupRemote' in 'git help config'.

martinheller@Mac websocket % git push --set-upstream origin fix-chat-example-checkorigin
Enumerating objects: 9, done.
Counting objects: 100% (9/9), done.
Delta compression using up to 12 threads
Compressing objects: 100% (5/5), done.
Writing objects: 100% (5/5), 686 bytes | 137.00 KiB/s, done.
Total 5 (delta 4), reused 0 (delta 0), pack-reused 0 (from 0)
remote: Resolving deltas: 100% (4/4), completed with 4 local objects.
remote:
remote: Create a pull request for 'fix-chat-example-checkorigin' on GitHub by visiting:
remote:      https://github.com/meheller/websocket/pull/new/fix-chat-example-checkorigin
remote:
To https://github.com/meheller/websocket.git
 * [new branch]      fix-chat-example-checkorigin -> fix-chat-example-checkorigin
branch 'fix-chat-example-checkorigin' set up to track 'origin/fix-chat-example-checkorigin'.
martinheller@Mac websocket %

I didn’t create a pull request for the upstream gorilla/websocket repo that I had forked from, since that repo hadn’t changed in eight months, which tells me that the repo is inactive.

Amazon Q Developer CLI with claude-sonnet-4.5 didn’t do as good a job on this task as Codex Cloud. Codex Cloud operated directly on the GitHub repository and was able to fix a bug that Sonnet misunderstood and skipped.

I looked at the GitHub blame view for the new branch of my fork of websocket. The fix is in lines 38 through 40.

[Screenshot: GitHub blame view for the fixed client.go in the chat example of my fork of gorilla/websocket.]

Amazon Q Developer agentic and MCP capabilities

The Amazon Q Developer CLI (newly renamed Kiro CLI) currently has the following built-in tools available to agents:

  • fs_read – Read files, directories, and images
  • fs_write – Create and edit files
  • execute_bash – Execute shell commands
  • use_aws – Make AWS CLI API calls
  • knowledge – Store and retrieve information across sessions
  • introspect – Provide information about Q CLI capabilities

In the built-in default agent, only fs_read can run without asking permission, but all tools are available, as well as legacy MCP servers. You can define your own agents, and set a default agent.

In addition, you can define and use MCP servers and tools, both in the CLI and the IDE. Oddly, the Amazon Q Developer plug-in for VS Code does not use any of the MCP or extensive tool capabilities of VS Code. Instead, it and the Q CLI / Kiro CLI use their own JSON MCP configuration files, which look about the same as Claude’s and the MCP specification’s config files. (If you’re interested, the MCP specification is published at modelcontextprotocol.io.)

Like Claude Code, the Q CLI / Kiro CLI has text commands to manage MCP server configurations; the Amazon Q Developer editor plug-ins have forms for that. Both require you to know the command that invokes the MCP server, but you can look that up for public MCP servers. The MCP standards organization maintains a registry of public MCP servers, as do GitHub and Anthropic (Claude).
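For context, writing an MCP server of your own takes only a few lines with the MCP Python SDK. The sketch below is a hypothetical example; the FastMCP import path and decorator reflect recent releases of the SDK, so treat the exact details as assumptions to check against the current documentation.

from mcp.server.fastmcp import FastMCP  # MCP Python SDK

# A toy MCP server exposing one tool that a client such as the Q CLI / Kiro CLI
# could call once the server is registered in its MCP configuration file.
mcp = FastMCP("build-info")

@mcp.tool()
def latest_build_status(project: str) -> str:
    """Return a canned build status for the named project (stub for illustration)."""
    return f"{project}: last build passed"

if __name__ == "__main__":
    mcp.run()  # serves the tool over stdio by default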

Amazon Q Developer productivity effects

According to AWS, their internal developers have improved productivity using Q both quantitatively and qualitatively. One huge win was to ingest their internal Amazon knowledge repository (millions of documents) into Amazon Q Business so that developers could get answers based on information spread across those repositories. They reported “we reduced the time Amazon developers spent waiting for technical answers by over 450k hours and reduced the interruptions to ‘flow state’ of existing team members.”

AWS also unlocked “possibilities for large-scale technical modernization that previously seemed impractical,” which “fundamentally changed how [they] think about technical debt and system modernization.” Another effect was reducing the time to learn new languages and codebases. “One developer reported cutting their typical three-week ramp-up time for learning a new programming language down to just one week using Q Developer.”

With the Amazon Q Developer CLI agent, another internal developer was able to work with an unfamiliar codebase to build and implement a non-trivial feature within two days using Rust, a programming language they didn’t know, stating, “If I’d done this ‘the old fashioned way,’ I would estimate it would have taken me five to six weeks due to language and codebase ramp up time. More realistically, I wouldn’t have done it at all, because I don’t have that kind of time to devote.”

Amazon Q Developer on AWS

AWS notes that Amazon Q Developer is an expert on AWS. Q is available in the AWS Management Console (and in Microsoft Teams and Slack) to help optimize users’ cloud costs and resources, provide guidance on architectural best practices, investigate operational incidents, and diagnose and resolve networking issues. Of course, it’s easy for AWS to train its own model on its own documentation, procedures, best practices, and APIs. It might be a bit harder for customers.

Amazon Q Developer .NET porting and Java upgrades

Amazon Q Developer has agents that can help port .NET code from Windows to Linux, and help upgrade code from Java 8 to Java 17. These agents have upgraded more than 1,000 production Java applications, according to AWS.

Amazon Q Developer, data, and AI

In addition to straight code, Amazon Q Developer knows about data integration workflows. It can generate ETL scripts, troubleshoot errors, translate natural language to SQL queries, and work with data across 20+ data sources. It probably doesn’t hurt that AWS offers 20+ data sources as paid services. Q Developer can help you build machine learning models and also reduce the time to build, train, evaluate, and deploy AI models in SageMaker Studio.

Amazon Q Developer pricing

The Amazon Q Developer perpetual Free Tier gives you 50 agentic chat interactions per month. You can also transform up to 1,000 lines of code per month. The Pro Tier has expanded limits for $19 per month per user. I was able to perform this review using the free tier.

Conclusion

Amazon Q Developer has improved significantly over the last year. I can certainly recommend it to AWS customers. The free tier was good enough for me, but if I were using it all day, every day, I’d most likely have to upgrade to the Pro tier.


IBM to buy Confluent to extend its data and automation portfolio 9 Dec 2025, 2:15 am

IBM has agreed to acquire cloud-native enterprise data streaming platform Confluent in a move designed to expand its portfolio of tools for building AI applications.

The company said Monday in a release that it sees Confluent as a natural fit for its hybrid cloud and AI strategy, adding that the acquisition is expected to “drive substantial product synergies” across its portfolio.

Confluent connects data sources and cleans up data. It built its service on Apache Kafka, an open-source distributed event streaming platform, sparing its customers the hassle of buying and managing their own server clusters in return for a monthly fee per cluster, plus additional fees for data stored and data moved in or out. 

IBM expects the deal, which it valued at $11 billion, to close by the middle of next year.

Confluent CEO and co-founder Jay Kreps stated in an email sent internally to staff about the acquisition, “IBM sees the same future we do: one in which enterprises run on continuous, event-driven intelligence, with data moving freely and reliably across every part of the business.”

It’s a good move for IBM, noted Scott Bickley, an advisory fellow at Info-Tech Research Group. “[Confluent] fills a critical gap within the watsonx platform, IBM’s next-gen AI platform, by providing the ability to monitor real-time data,” he said, adding that it is based on the industry standard for managing and processing real-time data streams.

He added, “IBM already has the pieces of the puzzle required to build and train AI models; Confluent provides the connective tissue to saturate those models with continuous live data from across an organization’s entire operation, regardless of the source. This capability should pave the road ahead for more complex AI agents and applications that will be able to react to data in real time.”

Bickley also pointed out that the company is playing the long game with this acquisition, which is its largest in recent history. “IBM effectively positions itself proactively to compete against the AI-native big data companies like Snowflake and Databricks, who are all racing towards the same ‘holy grail’ of realizing AI agents that can consume, process, and react to real-time data within the context of their clients’ trained models and operating parameters,” he said, adding that IBM is betting that a full-stack vertical AI platform, watsonx, will be more appealing to enterprise buyers than a composable solution assembled from various independent components.


The move, he noted, also complements previous acquisitions such as the $34.5 billion acquisition of Red Hat and the more recent $6.4 billion acquisition of HashiCorp, all of which are built upon dominant open source standards including Linux, Terraform/Vault, and Kafka. This allows IBM to offer a stand-alone vertical, hybrid cloud strategy with full-stack AI capabilities apart from the ERP vendor space and the point solutions currently available.

In addition, said Andrew Humphreys, senior director analyst at Gartner, with IBM MQ, IBM already competes with Confluent in the event broker market, the underpinning technology for event-driven architectures. “Although there is some overlap, IBM MQ and Kafka address different use cases and problems for customers, so IBM has the opportunity to bring these offerings together to deliver a comprehensive set of event broker offerings that address the full breadth of event-driven architecture use cases,” he said.

Vital layer in the watsonx stack filled

Mitch Ashley, VP and practice lead at Futurum Research, noted that the acquisition of Confluent fills a vital layer in the watsonx stack and gives IBM an open source-based backbone for real-time, governed data in motion. It also aligns IBM’s recent data acquisitions into a coherent architecture. “The value here is not just Kafka as a technology, but the ability to deliver fresh, contextual data into every part of IBM’s AI portfolio with consistency and control,” he said.

The acquisition, wrote Sanchit Vir Gogia, the chief analyst at Greyhound Research, in a report released soon after the purchase was announced, “marks a turning point that has little to do with price tags or portfolio expansion. What it truly reveals is a change in who controls the lifeblood of modern digital enterprises. That lifeblood is real-time data.”

It is not a tactical buy, he noted, it’s the strategic completion of an architecture years in the making. “For enterprise leaders, this changes the map,” he predicted. “AI no longer sits at the edge of architecture. It moves to the center, and Confluent becomes the layer that makes that center responsive, contextual, and live. This acquisition allows IBM to deliver AI that doesn’t just predict, but listens, grounded in data that is clean, connected, and always in motion.”

Added Stephen Catanzano, senior analyst, data & AI with Omdia, “all the major players are really building end to end data platforms at this point. … This is data in motion, so it really fills out the gap that [IBM] have to manage both moving data and static data, unstructured and structured.”

“People really want to apply generative AI and agentic AI with moving data and streaming data. And they (IBM) took the biggest player off the market,” he said.

In addition to all this, Bickley said, the timing was right in that Confluent has been experiencing a slowing of revenue growth and was reportedly shopping itself already.

“At the end of the day, this deal works for both parties,” he said. “IBM is now playing a high-stakes game and has placed its bet that having the best AI models is not enough; it is the control of the data flow that will matter.”

This story originally appeared on CIO.com.


JDK 26: The new features in Java 26 9 Dec 2025, 12:52 am

Java Development Kit (JDK) 26, a planned update to standard Java due March 17, 2026, has reached an initial rampdown phase for bug fixes, with the feature set now frozen. The following 10 features are officially targeted to JDK 26: a fourth preview of primitive types in patterns, instanceof, and switch, ahead-of-time object caching, an eleventh incubation of the Vector API, second previews of lazy constants and PEM (privacy-enhanced mail) encodings of cryptographic objects, a sixth preview of structured concurrency, warnings about uses of deep reflection to mutate final fields, improving throughput by reducing synchronization in the G1 garbage collector (GC), HTTP/3 for the Client API, and removal of the Java Applet API.

A short-term release of Java backed by six months of Premier-level support, JDK 26 follows the September 16 release of JDK 25, which is a Long-Term Support (LTS) release backed by several years of Premier-level support.

The latest feature to be added, primitive types in patterns, instanceof, and switch, is intended to enhance pattern matching by allowing primitive types in all pattern contexts, and to extend instanceof and switch to work with all primitive types. Now in a fourth preview, this feature was previously previewed in JDK 23, JDK 24, and JDK 25. The goals include enabling uniform data exploration by allowing type patterns for all types, aligning type patterns with instanceof and aligning instanceof with safe casting, and allowing pattern matching to use primitive types in both nested and top-level pattern contexts. Changes in this fourth preview include enhancing the definition of unconditional exactness and applying tighter dominance checks in switch constructs. The changes enable the compiler to identify a wider range of coding errors.
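Here is a minimal sketch of what the preview enables, based on examples in the JEP; it requires --enable-preview on a JDK 26 early-access build, and the method names are illustrative:

```java
// Preview feature: compile and run with --enable-preview on a JDK 26 EA build.
public class PrimitivePatterns {
    // instanceof with a primitive type acts as a safe, lossless conversion test.
    static void report(double measurement) {
        if (measurement instanceof int whole) {
            System.out.println("Exact integer reading: " + whole); // no loss of information
        } else {
            System.out.println("Fractional reading: " + measurement);
        }
    }

    // switch can now select over any primitive type, with type patterns and guards.
    static String classify(long bytes) {
        return switch (bytes) {
            case 0L -> "empty";
            case long b when b < 1_024 -> b + " B";
            case long b -> (b / 1_024) + " KiB or more";
        };
    }

    public static void main(String[] args) {
        report(42.0);                        // Exact integer reading: 42
        report(42.5);                        // Fractional reading: 42.5
        System.out.println(classify(0));     // empty
        System.out.println(classify(2_048)); // 2 KiB or more
    }
}
```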

With ahead-of-time object caching, the HotSpot JVM would gain improved startup and warmup times, so it can be used with any garbage collector including the low-latency Z Garbage Collector (ZGC). This would be done by making it possible to load cached Java objects sequentially into memory from a neutral, GC-agnostic format, rather than mapping them directly into memory in a GC-specific format. Goals of this feature include allowing all garbage collectors to work smoothly with the AOT (ahead of time) cache introduced by Project Leyden, separating AOT cache from GC implementation details, and ensuring that use of the AOT cache does not materially impact startup time, relative to previous releases.

The eleventh incubation of the Vector API introduces an API to express vector computations that reliably compile at run time to optimal vector instructions on supported CPUs. This achieves performance superior to equivalent scalar computations. The incubating Vector API dates back to JDK 16, which arrived in March 2021. The API is intended to be clear and concise, to be platform-agnostic, to have reliable compilation and performance on x64 and AArch64 CPUs, and to offer graceful degradation. The long-term goal of the Vector API is to leverage Project Valhalla enhancements to the Java object model.
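After eleven incubations, the API’s shape is well established. A simple element-wise kernel looks roughly like this (run with --add-modules jdk.incubator.vector):

```java
import jdk.incubator.vector.FloatVector;
import jdk.incubator.vector.VectorSpecies;

// Incubating API: run with --add-modules jdk.incubator.vector.
public class VectorKernel {
    private static final VectorSpecies<Float> SPECIES = FloatVector.SPECIES_PREFERRED;

    // Computes c[i] = a[i] * a[i] + b[i] * b[i], using SIMD lanes where the CPU allows.
    static void sumOfSquares(float[] a, float[] b, float[] c) {
        int i = 0;
        int upperBound = SPECIES.loopBound(a.length);
        for (; i < upperBound; i += SPECIES.length()) {
            FloatVector va = FloatVector.fromArray(SPECIES, a, i);
            FloatVector vb = FloatVector.fromArray(SPECIES, b, i);
            va.mul(va).add(vb.mul(vb)).intoArray(c, i);
        }
        // Scalar tail loop for any elements that don't fill a full vector.
        for (; i < a.length; i++) {
            c[i] = a[i] * a[i] + b[i] * b[i];
        }
    }
}
```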

Also on the docket for JDK 26 is another preview of an API for lazy constants, which had been previewed in JDK 25 via a stable values capability. Lazy constants are objects that hold unmodifiable data and are treated as true constants by the JVM, enabling the same performance optimizations enabled by declaring a field final. Lazy constants offer greater flexibility as to the timing of initialization.
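As a rough illustration of the idea, here is the shape of the JDK 25 stable-values preview that the lazy constants API renames; the exact class and method names may change in JDK 26, and Logger here is a hypothetical application class:

```java
// JDK 25 preview form (--enable-preview); JDK 26's second preview reworks the
// naming around "lazy constants," so the final class names may differ.
class OrderController {
    // Holds no value until first use; once set, the JVM can treat it as a constant,
    // much like a final field, while deferring the cost of initialization.
    private final StableValue<Logger> logger = StableValue.of();

    Logger logger() {
        // Computes the value at most once, even under concurrent access.
        return logger.orElseSet(() -> Logger.create(OrderController.class));
    }
}
```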

The second preview of PEM (privacy-enhanced mail) encodings calls for an API for encoding objects that represent cryptographic keys, certificates, and certificate revocation lists into the PEM transport format, and for decoding from that format back into objects. The PEM API was proposed as a preview feature in JDK 25. The second preview brings a number of changes: the PEMRecord class is now named PEM and includes a decode() method that returns the decoded Base64 content, and the encryptKey methods of the EncryptedPrivateKeyInfo class are now named encrypt and accept DEREncodable objects rather than PrivateKey objects, enabling the encryption of KeyPair and PKCS8EncodedKeySpec objects.
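A round trip through the API looks roughly like the following sketch, which uses the PEMEncoder and PEMDecoder class names from the JDK 25 preview; details may shift in the second preview:

```java
import java.security.KeyPair;
import java.security.KeyPairGenerator;

// Preview API (--enable-preview); class names follow the JDK 25 preview and
// may change slightly in JDK 26's second preview.
public class PemRoundTrip {
    public static void main(String[] args) throws Exception {
        KeyPair pair = KeyPairGenerator.getInstance("EC").generateKeyPair();

        // Encode the public key into PEM text...
        String pem = PEMEncoder.of().encodeToString(pair.getPublic());
        System.out.println(pem);

        // ...then decode it back into a DEREncodable object.
        var decoded = PEMDecoder.of().decode(pem);
        System.out.println(decoded.getClass());
    }
}
```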

The structured concurrency API simplifies concurrent programming by treating groups of related tasks running in different threads as single units of work, thereby streamlining error handling and cancellation, improving reliability, and enhancing observability. Goals include promoting a style of concurrent programming that can eliminate common risks arising from cancellation and shutdown, such as thread leaks and cancellation delays, and improving the observability of concurrent code.
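In code, the preview API looks roughly like this; the shape shown follows the JDK 25 (fifth) preview, which JDK 26 refines, and findUser and fetchOrder are hypothetical helpers:

```java
import java.util.concurrent.StructuredTaskScope;

// Preview API (--enable-preview); findUser() and fetchOrder() are hypothetical helpers.
String handle() throws InterruptedException {
    try (var scope = StructuredTaskScope.open()) {
        var user  = scope.fork(() -> findUser());    // runs in its own thread
        var order = scope.fork(() -> fetchOrder());  // runs concurrently

        scope.join(); // waits for both; if either fails, the other is cancelled

        // Both subtasks succeeded, so their results can be combined safely.
        return user.get() + " / " + order.get();
    }
}
```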

New warnings about uses of deep reflection to mutate final fields are intended to prepare developers for a future release that ensures integrity by default by restricting final field mutation (in other words, making final mean final), which will make Java programs safer and potentially faster. Application developers can avoid both current warnings and future restrictions by selectively enabling the ability to mutate final fields where essential.
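The pattern being warned about is ordinary core reflection, as in the small sketch below; it runs silently on current JDKs, while JDK 26 is expected to print a warning when the final field is mutated:

```java
import java.lang.reflect.Field;

public class FinalMutation {
    static class Config {
        private final String endpoint;
        Config(String endpoint) { this.endpoint = endpoint; }
    }

    public static void main(String[] args) throws Exception {
        Config config = new Config("https://prod.example.com");

        Field f = Config.class.getDeclaredField("endpoint");
        f.setAccessible(true);                        // deep reflection
        f.set(config, "https://staging.example.com"); // mutates a final field

        // Works today; JDK 26 is expected to warn here, and a future release
        // is expected to refuse the mutation unless it is explicitly enabled.
        System.out.println(f.get(config));
    }
}
```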

The G1 GC proposal is intended to improve application throughput and latency when using the G1 garbage collector by reducing the amount of synchronization required between application threads and GC threads. Goals include reducing the G1 garbage collector’s synchronization overhead, reducing the size of the injected code for G1’s write barriers, and maintaining the overall architecture of G1, with no changes to user interaction.

The G1 GC proposal notes that although G1, which is the default garbage collector of the HotSpot JVM, is designed to balance latency and throughput, achieving this balance sometimes impacts application performance adversely compared to throughput-oriented garbage collectors such as the Parallel and Serial collectors:

Relative to Parallel, G1 performs more of its work concurrently with the application, reducing the duration of GC pauses and thus improving latency. Unavoidably, this means that application threads must share the CPU with GC threads, and coordinate with them. This synchronization both lowers throughput and increases latency.

The HTTP/3 proposal calls for allowing Java libraries and applications to interact with HTTP/3 servers with minimal code changes. Goals include updating the HTTP Client API to send and receive HTTP/3 requests and responses; requiring only minor changes to the HTTP Client API and Java application code; and allowing developers to opt in to HTTP/3 as opposed to changing the default protocol version from HTTP/2 to HTTP/3.
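Because the change is an opt-in, existing HttpClient code should need little more than a version hint. A sketch of what that might look like follows; the HTTP_3 constant comes from the proposal and does not exist in shipping JDKs:

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Sketch of the opt-in model described in the proposal; the HTTP_3 version
// constant is taken from the JEP and is not yet available in shipping JDKs.
public class Http3Demo {
    public static void main(String[] args) throws Exception {
        HttpClient client = HttpClient.newBuilder()
                .version(HttpClient.Version.HTTP_3) // opt in; HTTP/2 remains the default
                .build();

        HttpRequest request = HttpRequest.newBuilder(URI.create("https://example.com/"))
                .GET()
                .build();

        HttpResponse<String> response =
                client.send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.version() + " -> " + response.statusCode());
    }
}
```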

HTTP/3 is considered a major version of the HTTP (Hypertext Transfer Protocol) data communications protocol for the web. Version 3 was built on the IETF QUIC (Quick UDP Internet Connections) transport protocol, which emphasizes flow-controlled streams, low-latency connection establishment, network path migration, and security among its capabilities.

Removal of the Java Applet API, now considered obsolete, is also targeted for JDK 26. The Applet API was deprecated for removal in JDK 17 in 2021. The API is obsolete because neither recent JDK releases nor current web browsers support applets, according to the proposal. There is no reason to keep the unused and unusable API, the proposal states.


R language is making a comeback – Tiobe 8 Dec 2025, 10:41 pm

The R language for statistical computing has crept back into the top 10 in Tiobe’s monthly index of programming language popularity.

In the December 2025 index, published December 7, R ranks 10th with a 1.96% rating. R has cracked the Tiobe index’s top 10 before, such as in April 2020 and July 2020, but not in recent years. The rival Pypl Popularity of Programming Language Index, meanwhile, has R ranked fifth this month with a 5.84% share.

“Programming language R is known for fitting statisticians and data scientists like a glove,” said Paul Jansen, CEO of software quality services vendor Tiobe, in a bulletin accompanying the December index. “As statistics and large-scale data visualization become increasingly important, R has regained popularity.”

Jansen noted that R is sometimes frowned upon by “traditional” software engineers due to an unconventional syntax and limited scalability for large production systems. But for domain experts R remains a powerful and elegant tool, and continues to thrive at universities and in research-driven industries, he added. Although data science rival Python has eclipsed R in terms of general adoption, Jansen said R has carved out a solid and enduring niche, excelling at rapid experimentation, statistical modeling, and exploratory data analysis.

“We have seen many Tiobe index top 10 entrants rising and falling,” Jansen wrote. “It will be interesting to see whether R can maintain its current position.”

The Tiobe Programming Community Index bases language popularity on a formula that assesses the number of skilled engineers worldwide, courses, and third-party vendors pertinent to a language. Popular websites including Google, Amazon, Wikipedia, Bing, and more than 20 others are used to calculate its ratings.

The Tiobe index top 10 for December 2025:

  1. Python, 23.64%
  2. C, 10.11%
  3. C++, 8.95%
  4. Java, 8.7%
  5. C#, 7.26%
  6. JavaScript, 2.96%
  7. Visual Basic, 2.81%
  8. SQL, 2.1%
  9. Perl, 1.97%
  10. R, 1.96%

The Pypl index analyzes how often language tutorials are searched on Google. The Pypl index top 10 for December 2025:

  1. Python, 26.91%
  2. C/C++, 13.02%
  3. Objective-C, 11.37%
  4. Java, 11.36%
  5. R, 5.84%
  6. JavaScript, 5.16%
  7. Swift, 3.53%
  8. C#, 3.18%
  9. PHP, 2.98%
  10. Rust, 2.6%


Apache Tika hit by critical vulnerability thought to be patched months ago 8 Dec 2025, 8:03 pm

A security flaw in the widely used Apache Tika XML document extraction utility, originally made public last summer, is wider in scope and more serious than first thought, the project’s maintainers have warned.

Their new alert relates to two entwined flaws: the first, CVE-2025-54988, disclosed in August and rated 8.4 in severity, and the second, CVE-2025-66516, made public last week and rated 10.

CVE-2025-54988 is a weakness in the tika-parser-pdf-module used to process PDFs in Apache Tika from version 1.13 up to and including version 3.2.1. It is one module in Tika’s wider ecosystem that is used to normalize data from 1,000 proprietary formats so that software tools can index and read them.

Unfortunately, that same document processing capability makes the software a prime target for campaigns using XML External Entity (XXE) injection attacks, a recurring issue in this class of utility.

In the case of CVE-2025-54988, this could have allowed an attacker to execute an XXE injection attack by hiding XML Forms Architecture (XFA) instructions inside a malicious PDF.

Through this, “an attacker may be able to read sensitive data or trigger malicious requests to internal resources or third-party servers,” said the CVE. Attackers could exploit the flaw to retrieve data from the tool’s document processing pipeline, exfiltrating it via Tika’s processing of the malicious PDF.

CVE superset

The maintainers have now realized that the XXE injection flaw is not limited to this module. It affects additional Tika components, namely Apache Tika tika-core, versions 1.13 to 3.2.1, and tika-parsers versions 1.13 to 1.28.5. In addition, legacy Tika parsers versions 1.13 to 1.28.5 are also affected.

Unusually – and confusingly – this means there are now two CVEs for the same issue, with the second, CVE-2025-66516, a superset of the first. Presumably, the reasoning behind issuing a second CVE is that it draws attention to the fact that people who patched CVE-2025-54988 are still at risk because of the additional vulnerable components listed in CVE-2025-66516.

So far, there’s no evidence that the XXE injection weakness in these CVEs is being exploited by attackers in the wild. However, the risk is that this will quickly change should the vulnerability be reverse engineered or proofs-of-concept appear.

CVE-2025-66516 is rated an unusual maximum 10.0 in severity, which makes patching it a priority for anyone using this software in their environment. Users should update to tika-core version 3.2.2, tika-parser-pdf-module version 3.2.2 (the standalone PDF module), or tika-parsers version 2.0.0 if on the legacy line.

However, patching will only help developers looking after applications known to be using Apache Tika. The danger is that its use might not be listed in all application configuration files, creating a blind spot whereby its use is not picked up. The only mitigation against this uncertainty would be for developers to turn off the XML parsing capability in their applications via the tika-config.xml configuration file.

This article originally appeared on CSOonline.


AWS takes aim at the PoC-to-production gap holding back enterprise AI 8 Dec 2025, 6:58 pm

Enterprises are testing AI in all sorts of applications, but too few of their proofs of concept (PoCs) are making it into production: just 12%, according to an IDC study.

Amazon Web Services is concerned about this too, with VP of agentic AI Swami Sivasubramanian devoting much of his keynote speech to it at AWS re:Invent last week.

The failures are not down to lack of talent or investment, but how organizations plan and build their PoCs, he said: “Most experiments and PoCs are not designed to be production ready.”

Production workloads, for one, require development teams to deploy not just a handful of agent instances, but often hundreds or thousands of them simultaneously — each performing coordinated tasks, passing context between one another, and interacting with a sprawling web of enterprise systems.

This is a far cry from most PoCs, which might be built around a single agent executing a narrow workflow.

Another hurdle, according to Sivasubramanian, is the complexity that agents in production workloads must contend with, including “a massive amount of data and edge cases”.  

This is unlike PoCs, which operate in artificially clean environments and run on sanitized datasets with handcrafted prompts and predictable inputs — all of which hide the realities of live data, such as inconsistent formats, missing fields, conflicting records, and unexpected behaviors.

Then there’s identity and access management. A prototype might get by with a single over-permissioned test account. Production can’t.

“In production, you need rock-solid identity and access management to authenticate users, authorize which tools agents can access on their behalf, and manage these credentials across AWS and third-party services,” Sivasubramanian said.

Even if those hurdles are cleared, the integration of agents into production workloads still remains a key challenge.

“And then of course as you move to production, your agent is not going to live in isolation. It will be part of a wider system, one that can’t fall apart if an integration breaks,” Sivasubramanian said.

Typically, in a PoC, engineers can manually wire data flows, push inputs, and dump outputs to a file or a test interface. If something breaks, they reboot it and move on. That workflow collapses under production conditions: Agents become part of a larger, interdependent system that cannot fall apart every time an integration hiccups.

Moving from PoC to production

Yet Sivasubramanian argued that the gulf between PoC and production can be narrowed.

In his view, enterprises can close the gap by equipping teams with tooling that bakes production readiness into the development process itself, focusing on agility while still being accurate and reliable.

To address concerns around the agility of building agentic systems with accuracy, AWS added an episodic memory feature to Bedrock AgentCore, which lifts the burden of building custom memory scaffolding off developers.

Instead of expecting teams to stitch together their own vector stores, summarization logic, and retrieval layers, the managed module automatically captures interaction traces, compresses them into reusable “episodes,” and brings forward the right context as agents work through new tasks.

In a similar vein, Sivasubramanian also announced the serverless model customization capability in SageMaker AI to help developers automate data prep, training, evaluation, and deployment.

This automation, according to Scott Wheeler, cloud practice leader at AI and data consultancy firm Asperitas, will remove the heavy infrastructure and MLops overhead that often stall fine-tuning efforts, accelerating agentic systems deployment.

The push toward reducing MLops didn’t stop there. Sivasubramanian said that AWS is adding Reinforcement Fine-Tuning (RFT) in Bedrock, enabling developers to shape model behaviour using an automated reinforcement learning (RL) stack.

Wheeler welcomed this, saying it will remove most of the complexity of building a RL stack, including infrastructure, math, and training-pipelines.

SageMaker HyperPod also gained checkpointless training, which enables developers to accelerate the model training process.

To address reliability, Sivasubramanian said that AWS is adding Policy and Evaluations capabilities to Bedrock AgentCore’s Gateway. While Policy will help developers enforce guardrails by intercepting tool calls, Evaluations will help developers simulate real-world agent behavior to catch issues before deployment.

Challenges remain

However, analysts warn that operationalizing autonomous agents remains far from frictionless.

Episodic memory, though a conceptually important feature, is not magic, said David Linthicum, independent consultant and retired chief cloud strategy officer at Deloitte. “Its impact is proportional to how well enterprises capture, label, and govern behavioural data. That’s the real bottleneck.”

“Without serious data engineering and telemetry work, it risks becoming sophisticated shelfware,” Linthicum said.

He also found fault with RFT in Bedrock, saying that though the feature tries to abstract complexity from RL workflows, it doesn’t remove the most complex parts of the process, such as defining rewards that reflect business value, building robust evaluation, and managing drift.

“That’s where PoCs usually die,” he said.

It is a similar story with the model customization capability in SageMaker AI.

Although it collapses MLOps complexity, it amplifies Linthicum’s and Wheeler’s concerns in other areas.

“Now that you have automated not just inference, but design choices, data synthesis, and evaluation, governance teams will demand line-of-sight into what was tuned, which data was generated, and why a given model was selected,” Linthicum said.

Wheeler said that industry sectors with strict regulatory expectations will probably treat the capability as an assistive tool that still requires human review, not a set-and-forget automation: “In short, the value is real, but trust and auditability, not automation, will determine adoption speed,” he said.


AI memory is really a database problem 8 Dec 2025, 9:00 am

The pace at which large language models (LLMs) evolve is making it virtually impossible to keep up. Allie Miller, for example, recently ranked her go-to LLMs for a variety of tasks but noted, “I’m sure it’ll change next week.” Why? Because one will get faster or come up with enhanced training in a particular area. What won’t change, however, is the grounding these LLMs need in high-value enterprise data, which means, of course, that the real trick isn’t keeping up with LLM advances, but figuring out how to put memory to use for AI.

If the LLM is the CPU, as it were, then memory is the hard drive, the context, and the accumulated wisdom that allows an agent to usefully function. If you strip an agent of its memory, it is nothing more than a very expensive random number generator. At the same time, however, infusing memory into these increasingly agentic systems also creates a new, massive attack surface.

Most organizations are treating agent memory like a scratchpad or a feature behind an SDK. We need to start treating it as a database—and not just any database, but likely the most dangerous (and potentially powerful) one you own.

The soft underbelly of agentic AI

Not long ago, I argued that the humble database is becoming AI’s hippocampus, the external memory that gives stateless models something resembling long-term recall. That was before the current wave of agentic systems really hit. Now the stakes are higher.

As my colleague Richmond Alake keeps pointing out in his ongoing “agent memory” work, there is a crucial distinction between LLM memory and agent memory. LLM memory is really just parametric weights and a short-lived context window. It vanishes when the session ends. Agent memory is different. It is a persistent cognitive architecture that lets agents accumulate knowledge, maintain contextual awareness, and adapt behavior based on historical interactions.

Alake calls the emerging discipline “memory engineering” and frames it as the successor to prompt or context engineering. Instead of just stuffing more tokens into a context window, you build a data-to-memory pipeline that intentionally transforms raw data into structured, durable memories: short term, long term, shared, and so on.

That may sound like AI jargon, but it is really a database problem in disguise. Once an agent can write back to its own memory, every interaction is a potential state change in a system that will be consulted for future decisions. At that point, you are not tuning prompts. You are running a live, continuously updated database of things the agent believes about the world.

If that database is wrong, your agent will be confidently wrong. If that database is compromised, your agent will be consistently dangerous. The threats generally fall into three buckets:

Memory poisoning. Instead of trying to break your firewall, an attacker “teaches” the agent something false through normal interaction. OWASP (Open Worldwide Application Security Project) defines memory poisoning as corrupting stored data so that an agent makes flawed decisions later. Tools like Promptfoo now have dedicated red-team plug-ins that do nothing but test whether your agent can be tricked into overwriting valid memories with malicious ones. If that happens, every subsequent action that consults the poisoned memory will be skewed.

Tool misuse. Agents increasingly get access to tools: SQL endpoints, shell commands, CRM APIs, deployment systems. When an attacker can nudge an agent into calling the right tool in the wrong context, the result looks indistinguishable from an insider who “fat-fingered” a command. OWASP calls this class of problems tool misuse and agent hijacking: The agent is not escaping its permissions; it is simply using them for the attacker’s benefit.

Privilege creep and compromise. Over time, agents accumulate roles, secrets, and mental snapshots of sensitive data. If you let an agent assist the CFO one day and a junior analyst the next, you have to assume the agent now “remembers” things it should never share downstream. Security taxonomies for agentic AI explicitly call out privilege compromise and access creep as emerging risks, especially when dynamic roles or poorly audited policies are involved.

New words, old problems

The point is not that these threats exist. The point is that they are all fundamentally data problems. If you look past the AI wrapper, these are exactly the things your data governance team has been chasing for years.

I’ve been suggesting that enterprises are shifting from “spin up fast” to “get to governed data fast” as the core selection criterion for AI platforms. That is even more true for agentic systems. Agents operate at machine speed with human data. If the data is wrong, stale, or mislabelled, the agents will be wrong, stale, and will misbehave much faster than any human could manage.

“Fast” without “governed” is just high-velocity negligence.

The catch is that most agent frameworks ship with their own little memory stores: a default vector database here, a JSON file there, a quick in-memory cache that quietly turns into production later. From a data governance perspective, these are shadow databases. They often have no schema, no access control lists, and no serious audit trail.

We are, in effect, standing up a second data stack specifically for agents, then wondering why no one in security feels comfortable letting those agents near anything important. We should not be doing this. If your agents are going to hold memories that affect real decisions, that memory belongs inside the same governed-data infrastructure that already handles your customer records, HR data, and financials. Agents are new. The way to secure them is not.

Revenge of the incumbents

The industry is slowly waking up to the fact that “agent memory” is just a rebrand of “persistence.” If you squint, what the big cloud providers are doing already looks like database design. Amazon’s Bedrock AgentCore, for example, introduces a “memory resource” as a logical container. It explicitly defines retention periods, security boundaries, and how raw interactions are transformed into durable insights. That is database language, even if it comes wrapped in AI branding.

It makes little sense to treat vector embeddings as some distinct, separate class of data that sits outside your core database. What’s the point if your core transactional engine can handle vector search, JSON, and graph queries natively? By converging memory into the database that already holds your customer records, you inherit decades of security hardening for free. As Brij Pandey notes, databases have been at the center of application architecture for years, and agentic AI doesn’t change that gravity—it reinforces it.

Yet, many developers still bypass this stack. They spin up standalone vector databases or use the default storage of frameworks like LangChain, creating unmanaged heaps of embeddings with no schema and no audit trail. This is the “high-velocity negligence” I mentioned above. The solution is straightforward: Treat agent memory as a first-class database. In practice this means:

Define a schema for thoughts. You typically treat memory as unstructured text, but that’s a mistake. Agent memory needs structure. Who said this? When? What is the confidence level? Just as you wouldn’t dump financial records into a text file, you shouldn’t dump agent memories into a generic vector store. You need metadata to manage the life cycle of a thought (see the sketch after this list).

Create a memory firewall. Treat every write into long-term memory as untrusted input. You need a “firewall” logic layer that enforces schema, validates constraints, and runs data loss prevention checks before an agent is allowed to remember something. You can even use dedicated security models to scan for signs of prompt injection or memory poisoning before the data hits the disk.

Put access control in the database, not the prompt. This involves implementing row-level security for the agent’s brain. Before an agent helps a user with “level 1” clearance (a junior analyst), it must be effectively lobotomized of all “level 2” memories (the CFO) for that session. The database layer, not the prompt, must enforce this. If the agent tries to query a memory it shouldn’t have, the database should return zero results.

Audit the “chain of thought.” In traditional security, we audit who accessed a table. In agentic security, we must audit why. We need lineage that traces an agent’s real-world action back to the specific memory that triggered it. If an agent leaks data, you need to be able to debug its memory, find the poisoned record, and surgically excise it.
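As a concrete, if simplified, illustration of the first point, a memory record with an explicit schema might look like the following sketch; the field names and access levels are illustrative, not any vendor’s API:

```java
import java.time.Instant;

// Illustrative only: a structured "thought" instead of a blob of text.
public record AgentMemory(
        String content,          // the remembered fact or summary
        String sourceAgent,      // which agent or tool wrote it
        String subjectUserId,    // whose interaction produced it
        Instant recordedAt,      // when it was written
        double confidence,       // how much the agent should trust it
        AccessLevel accessLevel  // enforced by the database layer, not the prompt
) {
    public enum AccessLevel { LEVEL_1, LEVEL_2 }
}
```

A query layer would then filter on accessLevel and recordedAt before any memory ever reaches the prompt, which is exactly the row-level security and life-cycle management described above.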

Baked-in trust

We tend to talk about AI trust in abstract terms: ethics, alignment, transparency. Those concepts matter. But for agentic systems operating in real enterprises, trust is concrete.

We are at the stage in the hype cycle where everyone wants to build agents that “just handle it” behind the scenes. That is understandable. Agents really can automate workflows and applications that used to require teams of people. But behind every impressive demo is a growing memory store full of facts, impressions, intermediate plans, and cached tool results. That store is either being treated like a first-class database or not.

Enterprises that already know how to manage data lineage, access control, retention, and audit have a structural advantage as we move into this agentic era. They do not have to reinvent governance. They only have to extend it to a new kind of workload.

If you are designing agent systems today, start with the memory layer. Decide what it is, where it lives, how it is structured, and how it is governed. Then, and only then, let the agents loose.


10 MCP servers for devops 8 Dec 2025, 9:00 am

Today’s AI coding agents are impressive. They can generate complex multi-line blocks of code, refactor according to internal style, explain their reasoning in plain English, and more. However, AI agents will take you only so far unless they also can interface with modern devops tools.

This is where the Model Context Protocol (MCP) comes in. MCP is a proposed universal standard for connecting AI assistants with external tools and data. Interest has heated up since the protocol’s debut in late November 2024, with major tech companies building MCP support into new releases and a strong community forming around the protocol.

For devops, MCP gives AI agents new abilities across common operations: Git version control, continuous integration and delivery (CI/CD), infrastructure as code (IaC), observability, accessing documentation, and more. By linking natural language commands to multi-step, back-end processes, MCP essentially enables “chatops 2.0.”

Below, we’ll explore official MCP servers that have emerged across popular devops tools and platforms, offering a cross-section of servers that cater to different devops capabilities. Most are straightforward to configure and authorize within MCP-compatible, AI-assisted development tools that support remote servers, like Claude Code, GitHub Copilot, Cursor, or Windsurf.

GitHub MCP server

It’s rare to meet a developer who doesn’t use GitHub in some form or fashion. As such, GitHub’s official MCP server is quickly becoming a popular way for AI agents to interact with code repositories.

GitHub’s remote MCP server exposes a range of tools that let agents perform repository operations, create or comment on issues, open or merge pull requests, and retrieve project metadata on collaborators, commits, or security advisories.

It also includes endpoints for CI/CD management through GitHub Actions. For example, a command like “cancel the current running action” could invoke the cancel_workflow_run tool within the GitHub Actions tool set.

Compared to other MCP servers, GitHub’s server offers unusually rich capabilities that mirror the APIs of the GitHub platform. However, for safety, you can always configure a --read-only flag to prevent agents from performing mutations.

Notion MCP server

Although not strictly devops at its core, Notion has become commonplace for team visibility across disciplines. For devops, the official Notion MCP server can help agents surface relevant notes and process documentation.

For instance, you could instruct an agent to reference internal style guides or operational runbooks stored in Notion, or issue a command like “Add a page titled ‘MCP servers we use’ under the page ‘DevOps’,” which would trigger a corresponding action through Notion’s API.

You can call Notion’s remote MCP server from your IDE, or build it locally and run it using the official Docker image. Notion’s MCP can be treated as a low-risk server as it has configurable scopes and tokens for managing Notion pages and blocks.

Atlassian Remote MCP server

Another interesting MCP server is the Atlassian Remote MCP server, which connects IDEs or AI agent platforms with Atlassian Cloud products such as Jira, the project management tool, and Confluence, the collaboration platform.

Atlassian’s MCP server, documented here, lets external AI tools interface with Jira to create, summarize, or update issues. It can also retrieve or reference Confluence pages and chain together related actions through the MCP client, like retrieving documentation from Confluence before updating a linked Jira issue.

You could imagine telling an agent, “Update my Jira issue on user testing for the payments app based on this latest bug report,” and pointing it to relevant logs. The server would then handle the update within Jira.

Currently in beta and available only to Atlassian Cloud customers, the Atlassian MCP server supports many MCP-compatible clients and uses OAuth 2.1 authorization for secure access.

Argo CD MCP server

The Argo CD MCP server is developed by Akuity, the original creators of Argo CD, the popular open-source CI/CD tool that powers many Kubernetes-native GitOps workflows. The MCP server wraps calls to the Argo CD API, and provides tools that allow users of AI assistants to interact with Argo CD in natural language.

Akuity’s MCP server has two main tools for applications (the deployments Argo CD manages) and resources (the underlying Kubernetes objects). The application management tool lets agents retrieve application information, create and delete applications, and perform other operations. The resource management tool allows agents to retrieve resource information, logs, and events for specific applications, and run actions on specific resources.

Using the Argo CD MCP server, you can do a lot of the same things you’d typically do in the Argo CD UI or CLI, but driven by natural language. For example, Akuity shares sample prompts such as “Show me the resource tree for guestbook” or “Sync the staging app.”

For such commands to work, you’ll need to integrate the Argo CD MCP server and have access to a running Argo CD instance with the proper credentials configured.

Lastly, although Argo CD is a popular choice, it’s not the only widely used CI/CD tool. Jenkins users may be interested to know that there is a community-maintained MCP Server Plugin for Jenkins.

Grafana MCP server

Grafana, the popular data visualization and monitoring tool, is a mainstay among devops and site reliability teams. Using the official MCP server for Grafana, agents can surface observability data to inform development and operations workflows.

The Grafana MCP server lets agents query full or partial details from dashboards, which combine system performance metrics and health data monitoring from various sources. It can also fetch information on data sources, query other monitoring systems, retrieve incident details, and more.

The tool set is configurable, so you can choose what permissions the agent has. Plus, Grafana has optimized how the MCP server structures responses to minimize context window usage and reduce runaway token costs.

For example, an MCP client might call the get_dashboard_property tool to retrieve a specific portion of a dashboard by its UID.

Terraform MCP server

Although alternatives have emerged, HashiCorp’s Terraform remains a leading choice for infrastructure as code. That makes its official MCP server an intriguing option for AI agents to generate and manage Terraform configurations.

The Terraform MCP server integrates with both the Terraform Registry APIs and Terraform Enterprise/HCP services, allowing agents to query module and provider metadata, inspect workspace states, and trigger runs with human approval. It also exposes Terraform resources such as runs, registries, providers, policies, modules, variables, and workspaces.

For example, a command like “generate Terraform code for a new run” could use the create_run operation, after which the agent might validate and plan the configuration before applying it.

The Terraform MCP server ships with an AGENTS.md file, which acts as a readme for agents to interpret tools. At the time of writing, the Terraform MCP is intended only for local use, rather than remote or hosted deployments.

Alternatively, if you’re using OpenTofu for IaC, consider checking out the OpenTofu MCP server. Some advantages of OpenTofu’s MCP are that it can be run locally or deployed in the cloud, it’s globally distributed on Cloudflare Workers, and it’s 100% open source.

GitLab MCP server

Another Git version control and devops platform is GitLab, which offers an MCP server for its Premium and Ultimate customers. The GitLab MCP server, currently in beta, enables AI agents to gather project information and perform operations on GitLab APIs in a secure way.

The GitLab MCP server allows some state changes, such as creating issues or merge requests. The other functions are mainly for data retrieval: retrieving information on issues, merge requests, commits, diffs, and pipeline information. It also includes a general search tool, which can handle a request like “Search issues for ‘failed test’ across GitLab.”

GitLab’s MCP documentation is thorough, with plenty of sample natural language expressions that the MCP server can satisfy. The server supports OAuth 2.0 Dynamic Client Registration.

Snyk MCP server

Snyk, maker of the Snyk security platform for developers, provides an MCP server with the ability to scan and fix vulnerabilities in code, open source dependencies, IaC code, containers, and software bill of materials (SBOM) files. It also supports creating an AI bill of materials (AIBOM) and other security-related operations.

For AI-assisted devsecops, integrating the Snyk MCP server could let an agent automatically run security scans as part of a CI/CD workflow. These scans can even be orchestrated across other MCP servers, like fetching repository details via the GitHub MCP server before initiating a Snyk scan.

A prompt like “Scan the repo ‘Authentication Microservice’ for security vulns” could instruct an agent to locate the repository using GitHub MCP, then invoke Snyk tools such as snyk_sca_scan or snyk_code_scan to identify known vulnerabilities, injection flaws, leaked credentials, and other risks.

The Snyk MCP server runs locally and uses the Snyk CLI to execute these commands through authenticated API calls. Snyk does not offer a hosted, remote version of the MCP server.

AWS MCP servers

The cloud hyperscalers have worked quickly to release MCP servers that integrate with their ecosystems. AWS, for instance, has rolled out dozens of specialized AWS MCP servers to allow AI agents to interact with all manner of AWS services. Some are provided as fully managed services by AWS, while others can be run locally.

For instance, the Lambda Tool MCP server allows agents to list and invoke Lambda functions, while the AWS S3 Tables MCP server could be used by an agent to query S3 table buckets or create new S3 tables from CSV files. The AWS Knowledge MCP server connects agents with all of the latest AWS documentation, API references, and architectural guidance.

A query to this knowledge server, like “pull up the API reference for AWS’s managed Prometheus tool” would correspond with the correct up-to-date information, optimized for agentic consumption.

Users of Microsoft Azure might want to evaluate the Azure DevOps MCP server. Other clouds, like Alibaba, Cloudflare, and Google, are currently experimenting with MCP servers as well.

Pulumi MCP server

Pulumi, another popular option for IaC, has also launched an official MCP server. The MCP server allows agents to query a Pulumi organization’s registry, which provides access to cloud resources and infrastructure, and execute Pulumi commands.

For example, in this walk-through, Pulumi shows how a developer could use its MCP server to provision an Azure Kubernetes Service (AKS) cluster. The developer issues natural-language instructions to an AI assistant, prompting the AI to execute MCP tools that invoke Pulumi CLI commands.

MCP caveats

Just as vibe coding isn’t a fit for every project, MCP isn’t the best option for every use case either. According to MCP experts, these servers can be unnecessary when they sidestep standard CLIs.

They can also introduce major security risks. This tracks with AI use in general, as 62% of IT leaders cite security and privacy risks as the top AI concern, according to the AI in DevOps report by Enterprise Management Associates (EMA).

As such, it’s best to test out these MCP servers with low-risk permissions, like read-only capabilities, before testing write functions. And use them only with trusted LLMs and trusted MCP clients.

Also, beware of exposing high-value, long-lived privileges to MCP clients. Because AI coding agents are based on nondeterministic LLMs, their behavior can be unpredictable. Throw in autonomous control over mutable devops functions, and you could land in all kinds of trouble, ranging from broken deployments to runaway token usage.

Lastly, choosing the official MCP servers above, as opposed to community-supported libraries, is a safer bet for longevity and ongoing maintenance, too.

Early MCP success stories

Although it’s still early days with MCP and agents, there’s a sense of cautious optimism as proven MCP workflows emerge.

Take Block’s journey. Through company-wide use of its MCP-compatible agent, Goose, 12,000 employees are now utilizing agents and MCP for “increasingly creative and practical ways to remove bottlenecks and focus on higher-value work,” writes Angie Jones, head of developer relations.

Other engineers report using MCP servers to enhance workflows that are devops-adjacent, like the Filesystem MCP server for accessing local files, the Linear MCP server for issue tracking, the Chrome DevTools MCP server for browser debugging, and the Playwright MCP server for continuous testing.

And beyond the official MCP servers mentioned above, many community-supported MCPs are emerging for Docker, Kubernetes, and other cloud-native infrastructure utilities.

Devops comes with toil and cost. So, the case to level it up with MCP is strong. As long as you keep controls safe, it should be fun to see how these MCP servers integrate into your work and impact your productivity. Happy MCP-opsing.


AI in CI/CD pipelines can be tricked into behaving badly 5 Dec 2025, 2:09 pm

AI agents embedded in CI/CD pipelines can be tricked into executing high-privilege commands hidden in crafted GitHub issues or pull request texts.

Researchers at Aikido Security have traced the problem back to workflows that pair GitHub Actions or GitLab CI/CD with AI tools such as Gemini CLI, Claude Code Actions, OpenAI Codex Actions, or GitHub AI Inference. They found that untrusted, user-supplied strings, such as issue bodies, pull request descriptions, or commit messages, could be fed straight into prompts for AI agents in an attack they are calling PromptPwnd.

Depending on what the workflow lets the AI do, this can lead to unintended edits to repository content, disclosure of secrets, or other high-impact actions.

“AI agents connected to GitHub Actions/GitLab CI/CD are processing untrusted user input, and executing shell commands with access to high-privilege tokens,” the researchers wrote in a blog post about PromptPwnd. They said they reproduced the problem in a test environment, and notified the affected vendors.

The researchers recommended running a set of open-source detection rules on suspected GitHub Action .yml files, or using their free code scanner on GitHub and GitLab repos.

Aikido Security said that Google had patched the issue in Gemini CLI upon being informed; Google did not immediately respond to a request for information about this.

Why PromptPwnd works

PromptPwnd exploits become possible when two flawed pipeline configurations occur together: when AI agents operating inside CI/CD workflows have access to powerful tokens (like GITHUB_TOKEN, cloud-access keys), and their prompts embed user-controlled fields.

Prompt injection becomes easier with such a setup, the researchers explained. An attacker can simply open an issue on a public repository and insert hidden instructions or seemingly innocent comments that double as commands for the model to pick up on. “Imagine you are sending a prompt to an LLM, and within that prompt, you are including the commit message,” the researchers said. “If that commit message is a malicious prompt, then you may be able to get the model to send back altered data.” The model’s response, if used directly inside commands to tools within CI/CD pipelines, can manipulate those tools to retrieve sensitive information.
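The vulnerable pattern is easy to sketch. In the simplified example below, readFromPullRequest, callModel, and runShell are hypothetical stand-ins for reading workflow input, calling an AI SDK, and running a privileged CI step:

```java
// Simplified sketch of the anti-pattern; readFromPullRequest(), callModel(),
// and runShell() are hypothetical stand-ins, not any vendor's API.
String commitMessage = readFromPullRequest();  // attacker-controlled text

// Untrusted input is interpolated straight into the prompt...
String prompt = "Summarize this commit and suggest a fix command:\n" + commitMessage;
String reply = callModel(prompt);

// ...and the model's output is executed with the workflow's privileged token.
runShell(reply); // hidden instructions in the commit message end up running here
```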

Aikido Security demonstrated this in a controlled environment (without real tokens) to show that Gemini CLI could be manipulated into executing attacker-supplied commands and exposing sensitive credentials through a crafted GitHub issue. “Gemini CLI is not an isolated case. The same architecture pattern appears across many AI-powered GitHub Actions,” the researchers said, adding that the list included Claude Code, OpenAI Codex, and GitHub AI Inference.

All of these tools can be tricked (via issue, pull-request description, or other user-controlled text) into producing instructions that the workflow then executes with its privileged GitHub Actions token.

Mitigation plan

Aikido has open-sourced detection rules via their “Opengrep” tool that allows developers and security teams to scan their YAML workflows automatically, revealing whether they feed untrusted inputs into AI prompts.

The researchers said that only a subset of workflows have confirmed exploit paths so far, and that Aikido is working with several other companies to address the underlying vulnerabilities. Some workflows can only be abused with collaborator-level access, while others can be triggered by anyone who files an issue or pull request.

Developer teams are advised to restrict what AI agents can do, avoid piping untrusted user content into prompts, treat AI output as untrusted code, and contain damage from compromised GitHub tokens.

Aikido Security said its code scanner can help flag these vulnerabilities by detecting unsafe GitHub Actions configurations (including risky AI prompt flows), identifying over-privileged tokens, and surfacing insecure CI/CD patterns via infrastructure-as-code scanning.

There are other best practices for securing CI/CD pipelines that enterprises can adopt, too.


Local clouds shape Europe’s AI future 5 Dec 2025, 9:00 am

It’s a foggy morning in Munich. Marie, CIO of a fictional, forward-thinking European healthcare startup, pores over proposals from cloud vendors. Her company is on the verge of launching AI-powered diagnostics but must keep every byte of patient data within EU borders to comply with strict regional privacy laws. On her desk are slick portfolios from Microsoft, AWS, and Google, all touting sovereign cloud options in the EU. Alongside them are proposals from national cloud providers—smaller, perhaps, but wholly grounded in local laws and run by European nationals. After consulting several legal teams, Marie chooses the local sovereign cloud, believing it’s the safer, smarter option for an EU-based company committed to secure, lawful AI.

Sovereignty is more than a checkbox

Europe has redefined digital sovereignty, emphasizing control, accountability, and operational independence. For European companies and governments, sovereignty is more than data location. Who controls access? Who is legally accountable? Do foreign governments have any claim—however remote—to sensitive business or personal information? European law is driven by values of privacy and autonomy and requires true digital self-determination beyond technical compliance.

The new “sovereign” offerings from US-based cloud providers like Microsoft, AWS, and Google represent a significant step forward. They are building cloud regions within the EU, promising that customer data will remain local, be overseen by European citizens, and comply with EU laws. They’ve hired local staff, established European governance, and crafted agreements to meet strict EU regulations. The goal is to reassure customers and satisfy regulators.

For European organizations facing tough questions, these steps often feel inadequate. Regardless of how localized the infrastructure is, most global cloud giants still have their headquarters in the United States, subject to US law and potential political pressure. There is always a lingering, albeit theoretical, risk that the US government might assert legal or administrative rights over data stored in Europe.

For companies operating in sensitive industries—healthcare, finance, government, and research—this gray area is unacceptable. Legal teams and risk officers across the continent are setting clear boundaries. For them, true sovereignty means that only nationals of their country, subject solely to their laws, can access and manage critical or sensitive data. This goes beyond data residency. They demand meaningful, enforceable autonomy with no loopholes or uncertainties.

Local cloud providers in the AI era

Enter Europe’s national and regional sovereign cloud providers. These companies might not have the global reach or the full range of advanced services that Microsoft or AWS offer; however, what they lack in size they more than compensate for with trustworthiness and compliance. Their infrastructure is entirely based and operated within the EU, often within a single country. Governance is maintained by boards made up of local nationals. Legal contracts are drafted under the authority of EU member states, not merely adapted from foreign templates to meet local rules.

This sense of ownership and local control is convincing many EU companies to choose local providers. When the stakes are high—a leak, breach, or accidental foreign intervention that could result in regulatory disaster, reputation damage, or legal action—these organizations feel they cannot risk compromise. Even the most remote possibility that a foreign government could access their sensitive data is a dealbreaker.

Some argue that only the largest cloud providers can deliver the scale and specialized services needed for ambitious artificial intelligence projects, but the European market is already demonstrating otherwise. Local sovereign cloud alliances, often built from federated national clouds, are pooling resources, investing in high-quality AI hardware, and collaborating with local universities and tech hubs to speed up machine learning research and application deployments.

The majority of European businesses are embarking on their AI journeys with applied AI, predictive analytics, or secure cloud-based automation. For these cases, the performance and scalability offered by local providers are more than sufficient. What’s more, they offer a level of transparency and adaptation to local expectations that the multinationals simply can’t match. When new rules or compliance demands emerge—inevitable in such a fast-moving regulatory landscape—European providers pivot quickly, working alongside regulators and industry leaders.

Big Cloud versus Europe’s offerings

As more European organizations pursue digital transformation and AI-driven growth, the evidence is mounting: The new sovereign cloud solutions launched by the global tech giants aren’t winning over the market’s most sensitive or risk-averse customers. Those who require freedom from foreign jurisdiction and total assurance that their data is shielded from all external interference are voting with their budgets for the homegrown players.

This puts the major cloud providers in a tricky spot. They have already built a strong sovereign cloud infrastructure. However, if corporate and government leaders remain unconvinced about the extent of their local control and security, these services may remain underused, outpaced by flexible, locally trusted providers. The cloud landscape is changing fast. True sovereignty—the kind demanded by European regulators, executives, and citizens—is about more than checklists or technology. EU laws and values are embedded at every level of digital infrastructure offered by EU providers. The companies that prioritize these things will choose providers whose roots, leadership, and accountability are all local.

In the months and years ahead, I predict that Europe’s own clouds—backed by strong local partnerships and deep familiarity with regulatory nuance—will serve as the true engine for the region’s AI ambitions. Global tech giants may continue to invest and adapt, but unless they fundamentally rethink their approach to local autonomy and legal accountability, their sovereign clouds are likely to remain on the sidelines.

For executives like the fictional Marie, the future is already clear: When it comes to sovereignty, local clouds are the best kind of cloud cover.

All I want for Christmas is a server-side JavaScript framework 5 Dec 2025, 9:00 am

A grumpy Scrooge of a developer might complain about the wealth of options in JavaScript, calling it “tech decision overwhelm.” But the truth is, the JavaScript ecosystem works. Because it encourages innovation, new tools are regularly introduced and naturally find their niche, and excellence is rewarded.

As developers, we get to sit back and mouse-wheel through hundreds of thousands of programmer hours of work. NPM is a vast repository of human creativity. What looks like chaos is a complex phylogeny, a family tree of code where tools evolve to find their role in the larger system.

Of course, when you are under deadline and the caffeine’s worn off, you don’t have time to explore your options. But when things are calm—perhaps during the holiday break—it is well worth taking a deep dive into the open source gifts under the JavaScript tree.

Top picks for JavaScript readers on InfoWorld

The complete guide to Node.js frameworks
Looking for inspiration to supercharge your server side? Get a whirlwind tour of some of the most popular and powerful back-end JavaScript frameworks. We survey the range, from Express and Next to Hono, SvelteKit, and more.
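As a taste of what that tour covers, here is a minimal sketch of an Express endpoint (illustrative only; it assumes the express package and its type definitions are installed, and the route and port are arbitrary):

```typescript
// Minimal Express sketch: one JSON endpoint on an arbitrary port
import express from 'express';

const app = express();

app.get('/hello', (_req, res) => {
  res.json({ message: 'Hello from Express' });
});

app.listen(3000, () => console.log('Express listening on http://localhost:3000'));
```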

Intro to Nest.js: Server-side JavaScript development on Node
If you like Angular’s architecture or the structure of Java’s Spring framework, Nest may be the Node framework for you. Decide for yourself, with this hands-on guide to building an API with Nest and TypeScript.
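To give a feel for that structure before you dive in, here is a hypothetical, stripped-down Nest app in TypeScript (a sketch only; it assumes the @nestjs/common and @nestjs/core packages and skips the CLI scaffolding the guide actually walks through):

```typescript
// Sketch of a tiny Nest app: one controller, one module, one GET route.
// Assumes @nestjs/common and @nestjs/core are installed and decorators are enabled in tsconfig.
import { Controller, Get, Module } from '@nestjs/common';
import { NestFactory } from '@nestjs/core';

@Controller('hello')
class HelloController {
  @Get()
  getHello() {
    return { message: 'Hello from Nest' }; // Nest serializes plain objects to JSON
  }
}

@Module({ controllers: [HelloController] })
class AppModule {}

async function bootstrap() {
  const app = await NestFactory.create(AppModule);
  await app.listen(3000); // GET http://localhost:3000/hello
}
bootstrap();
```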

10 JavaScript-based tools and frameworks for AI and machine learning
Modern JavaScript has a wealth of powerful AI tools. From the wide-ranging capability of TensorFlow.js to hidden gems like Brain.js, here’s a nice rundown of JavaScript tools for building neural nets, implementing retrieval-augmented generation (RAG), and tapping LLMs—all with no Python required.
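For a sense of scale, the simplest of those tools fits in a few lines. The sketch below uses TensorFlow.js to fit a one-neuron model to the line y = 2x - 1 (illustrative only; it assumes the @tensorflow/tfjs package and uses made-up sample data):

```typescript
// Tiny TensorFlow.js example: learn y ≈ 2x - 1 from four sample points
import * as tf from '@tensorflow/tfjs';

async function main() {
  const model = tf.sequential();
  model.add(tf.layers.dense({ units: 1, inputShape: [1] })); // single neuron
  model.compile({ optimizer: 'sgd', loss: 'meanSquaredError' });

  const xs = tf.tensor2d([0, 1, 2, 3], [4, 1]);
  const ys = tf.tensor2d([-1, 1, 3, 5], [4, 1]);

  await model.fit(xs, ys, { epochs: 250 });
  (model.predict(tf.tensor2d([10], [1, 1])) as tf.Tensor).print(); // roughly 19
}

main();
```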

Node.js tutorial: Get started with Node
After all the talk about options, it’s important to know the most central piece of the whole puzzle. Node was the original, breakthrough idea that put JavaScript on the server and remains the flagship runtime.
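That original idea still fits in a dozen lines. Here is the classic hello-world HTTP server, written here in TypeScript using only Node’s built-in http module (the port and message are arbitrary):

```typescript
// The canonical Node HTTP server, no frameworks involved
import { createServer } from 'node:http';

const server = createServer((req, res) => {
  res.writeHead(200, { 'Content-Type': 'text/plain' });
  res.end('Hello from Node\n');
});

server.listen(3000, () => {
  console.log('Node server listening on http://localhost:3000');
});
```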

More good reads and JavaScript updates elsewhere

Native type stripping in TypeScript 7.0
Microsoft has released the TypeScript 7 roadmap for early 2026, and it includes native type stripping. Following Node’s lead, TypeScript will aim to make the “build step” optional for development: rather than transforming your code, the toolchain simply deletes the type annotations, which makes it extremely fast.
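The idea is easy to picture: as long as a file uses only erasable type syntax, removing the annotations leaves plain JavaScript behind. A hypothetical example is below; today you can try the same thing in recent Node versions with the --experimental-strip-types flag, though the exact TypeScript 7 behavior may differ.

```typescript
// greet.ts — only erasable type syntax (an interface and annotations),
// so stripping the types yields valid JavaScript with no code transformation
interface User {
  name: string;
}

function greet(user: User): string {
  return `Hello, ${user.name}!`;
}

console.log(greet({ name: 'Ada' }));
// e.g. node --experimental-strip-types greet.ts
```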

Critical security vulnerability in React server components
The React team has disclosed a catastrophic, unauthenticated remote code execution vulnerability in React server components. Developers using Next.js, React Router, Waku, or Redwood with React 19.x are advised to update now. Patches are available for Next.js 16.0.7 and React 19.2.1.

Announcing Angular v21
Angular’s renaissance continues with version 21. The biggest shift is that Zone.js is gone by default for new applications, marking the official transition to a signal-first, higher-performance model.
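For readers who haven’t touched signals yet, the core primitives are small. A minimal sketch (assumes @angular/core, shown outside any component for brevity):

```typescript
// Angular's signal primitives in isolation: a writable signal plus a derived one
import { signal, computed } from '@angular/core';

const count = signal(0);
const doubled = computed(() => count() * 2);

count.set(1);             // replace the value
count.update(c => c + 1); // derive the next value from the current one
console.log(count(), doubled()); // 2 4
```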

State of React Survey, 2025 is open
Head over to the latest State of React survey to do your civic duty and contribute some data points to the present and future destiny of the most downloaded chunk of JavaScript software on Earth.

‘Futuristic’ Unison functional language debuts 4 Dec 2025, 7:34 pm

Unison, a statically typed functional language with type inference, an effect system, and advanced tooling, has reached its 1.0 release status.

Announced November 25, Unison 1.0 marks a point where the language, distributed runtime, and developer workflow have stabilized, according to Unison Computing. Billed as “a friendly programming language from the future,” Unison is purported to bring benefits in compilation and distributed system development. With Unison, a definition is identified by its actual contents, i.e., a hash of its syntax tree, rather than by a human-friendly name that may also refer to older versions of the definition, according to Unison Computing. As a result, each Unison definition has a unique and deterministic address. All named arguments are replaced by positionally numbered variable references, and all dependencies are replaced by their hashes. Thus, the hash of each definition uniquely identifies its exact implementation and pins down all of its dependencies, according to the company.
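Unison is its own language, but the content-addressing idea can be sketched in a few lines of TypeScript (purely illustrative and greatly simplified; this is not how Unison actually normalizes or hashes definitions):

```typescript
// Toy illustration of content-addressed definitions: hash a normalized body
// in which parameter names are replaced by positional slots, so the hash
// identifies the implementation rather than whatever name it happens to have.
import { createHash } from 'node:crypto';

function normalize(params: string[], body: string): string {
  // Naive textual substitution; a real system would walk a syntax tree.
  return params.reduce((acc, name, i) => acc.replaceAll(name, `#${i}`), body);
}

function contentHash(params: string[], body: string): string {
  return createHash('sha256').update(normalize(params, body)).digest('hex');
}

// The same implementation under different names gets the same address.
console.log(contentHash(['x', 'y'], 'x + y') === contentHash(['a', 'b'], 'a + b')); // true
```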

The Unison ecosystem leverages this core idea from the ground up. Benefits include never compiling the same code twice and limiting versioning conflicts. Further, Unison promises to simplify distributed programming. Because definitions in Unison are identified by a content hash, arbitrary computations can be moved from one location to another, with missing dependencies deployed on the fly, according to Unison Computing. Unison can be viewed as a descendant of Haskell, with similarities including type inference and pattern matching, but is smaller and simpler than Haskell, according to a Unison FAQ.

Download and installation instructions for Homebrew, Windows, Linux, and macOS can be found at the Unison website. Unison can be used like any other general-purpose language, or in conjunction with the Unison Cloud for building distributed systems. Unison code is stored as its abstract syntax tree in a database, i.e., the “codebase,” rather than in text files. Unison has “perfect” incremental compilation, with a shared compilation cache that is part of the codebase format. Despite the strong static typing, users are almost never waiting for code to compile, Unison Computing said. Unison’s hash-based, database-backed representation also changes how code is identified, versioned, and shared. The workflow, toolchain, and deployment model emerge naturally from the language’s design, enabling better tools for working with code, according to Unison Computing.
