R language is making a comeback – Tiobe 8 Dec 2025, 10:41 pm

The R language for statistical computing has crept back into the top 10 in Tiobe’s monthly index of programming language popularity.

In the December 2025 index, published December 7, R ranks 10th with a 1.96% rating. R has cracked the Tiobe index’s top 10 before, such as in April 2020 and July 2020, but not in recent years. The rival Pypl Popularity of Programming Language Index, meanwhile, has R ranked fifth this month with a 5.84% share.

“Programming language R is known for fitting statisticians and data scientists like a glove,” said Paul Jansen, CEO of software quality services vendor Tiobe, in a bulletin accompanying the December index. “As statistics and large-scale data visualization become increasingly important, R has regained popularity.”

Jansen noted that R is sometimes frowned upon by “traditional” software engineers due to its unconventional syntax and limited scalability for large production systems. But for domain experts, R remains a powerful and elegant tool and continues to thrive at universities and in research-driven industries, he added. Although data science rival Python has eclipsed R in terms of general adoption, Jansen said R has carved out a solid and enduring niche, excelling at rapid experimentation, statistical modeling, and exploratory data analysis.

“We have seen many Tiobe index top 10 entrants rising and falling,” Jansen wrote. “It will be interesting to see whether R can maintain its current position.”

The Tiobe Programming Community Index bases language popularity on a formula that assesses the number of skilled engineers worldwide, courses, and third-party vendors pertinent to a language. Popular websites including Google, Amazon, Wikipedia, Bing, and more than 20 others are used to calculate its ratings.

The Tiobe index top 10 for December 2025:

  1. Python, 23.64%
  2. C, 10.11%
  3. C++, 8.95%
  4. Java, 8.7%
  5. C#, 7.26%
  6. JavaScript, 2.96%
  7. Visual Basic, 2.81%
  8. SQL, 2.1%
  9. Perl, 1.97%
  10. R, 1.96%

The Pypl index analyzes how often language tutorials are searched on Google. The Pypl index top 10 for December 2025:

  1. Python, 26.91%
  2. C/C++, 13.02%
  3. Objective-C, 11.37%
  4. Java, 11.36%
  5. R, 5.84%
  6. JavaScript, 5.16%
  7. Swift, 3.53%
  8. C#, 3.18%
  9. PHP, 2.98%
  10. Rust, 2.6%

Apache Tika hit by critical vulnerability thought to be patched months ago 8 Dec 2025, 8:03 pm

A security flaw in the widely used Apache Tika document extraction toolkit, originally made public last summer, is wider in scope and more serious than first thought, the project’s maintainers have warned.

Their new alert relates to two entwined flaws: the first, CVE-2025-54988, disclosed in August and rated 8.4 in severity, and the second, CVE-2025-66516, made public last week and rated 10.0.

CVE-2025-54988 is a weakness in the tika-parser-pdf-module used to process PDFs in Apache Tika from version 1.13 up to and including version 3.2.1. It is one module in Tika’s wider ecosystem, which is used to normalize data from 1,000 proprietary formats so that software tools can index and read them.

Unfortunately, that same document processing capability makes the software a prime target for campaigns using XML External Entity (XXE) injection attacks, a recurring issue in this class of utility.

In the case of CVE-2025-54988, an attacker could carry out an XXE injection attack by hiding XML Forms Architecture (XFA) instructions inside a malicious PDF.

Through this, “an attacker may be able to read sensitive data or trigger malicious requests to internal resources or third-party servers,” said the CVE. Attackers could exploit the flaw to retrieve data from the tool’s document processing pipeline, exfiltrating it via Tika’s processing of the malicious PDF.
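For context, a classic XXE payload (a generic illustration of the attack class, not the Tika-specific XFA variant) declares an external entity that the XML parser resolves while processing the document, pulling in data the document author should never see:

<?xml version="1.0"?>
<!DOCTYPE data [
  <!ENTITY xxe SYSTEM "file:///etc/passwd">
]>
<data>&xxe;</data>

When a vulnerable parser expands &xxe;, the contents of the referenced file end up in the parsed output, which is what makes XXE so useful for exfiltration.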

CVE superset

The maintainers have now realized that the XXE injection flaw is not limited to this module. It also affects Apache Tika tika-core versions 1.13 to 3.2.1 and the legacy tika-parsers versions 1.13 to 1.28.5.

Unusually – and confusingly – this means there are now two CVEs for the same issue, with the second, CVE-2025-66516, a superset of the first. Presumably, the reasoning behind issuing a second CVE is that it draws attention to the fact that people who patched CVE-2025-54988 are still at risk because of the additional vulnerable components listed in CVE-2025-66516.

So far, there’s no evidence that the XXE injection weakness in these CVEs is being exploited by attackers in the wild. However, the risk is that this will quickly change should the vulnerability be reverse engineered or proofs-of-concept appear.

CVE-2025-66516 is rated an unusual maximum 10.0 in severity, which makes patching it a priority for anyone using this software in their environment. Users should update to tika-core version 3.2.2, tika-parser-pdf-module version 3.2.2 (the standalone PDF module), or tika-parsers version 2.0.0 if they are on the legacy parsers.

However, patching will only help developers looking after applications known to be using Apache Tika. The danger is that its use might not be listed in all application configuration files, creating a blind spot where the dependency goes undetected. The only mitigation against this uncertainty is for developers to turn off the XML parsing capability in their applications via the tika-config.xml configuration file.
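As a rough sketch of what that could look like, assuming Tika’s standard XML configuration format and its parser-exclude mechanism (the parser class shown here is an example to verify against the project’s documentation and advisory):

<?xml version="1.0" encoding="UTF-8"?>
<properties>
  <parsers>
    <!-- Keep the default parser set, but exclude XML parsing -->
    <parser class="org.apache.tika.parser.DefaultParser">
      <parser-exclude class="org.apache.tika.parser.xml.XMLParser"/>
    </parser>
  </parsers>
</properties>

Which parsers to exclude depends on how the application uses Tika, so the safest course is to follow the maintainers’ guidance for the specific modules in play.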

This article originally appeared on CSOonline.

AWS takes aim at the PoC-to-production gap holding back enterprise AI 8 Dec 2025, 6:58 pm

Enterprises are testing AI in all sorts of applications, but too few of their proofs of concept (PoCs) are making it into production: just 12%, according to an IDC study.

Amazon Web Services is concerned about this too, with VP of agentic AI Swami Sivasubramanian devoting much of his keynote speech to it at AWS re:Invent last week.

The failures are not down to lack of talent or investment, but how organizations plan and build their PoCs, he said: “Most experiments and PoCs are not designed to be production ready.”

Production workloads, for one, require development teams to deploy not just a handful of agent instances, but often hundreds or thousands of them simultaneously — each performing coordinated tasks, passing context between one another, and interacting with a sprawling web of enterprise systems.

This is a far cry from most PoCs, which might be built around a single agent executing a narrow workflow.

Another hurdle, according to Sivasubramanian, is the complexity that agents in production workloads must contend with, including “a massive amount of data and edge cases”.  

This is unlike PoCs, which operate in artificially clean environments and run on sanitized datasets with handcrafted prompts and predictable inputs — all of which hide the realities of live data, such as inconsistent formats, missing fields, conflicting records, and unexpected behaviours.

Then there’s identity and access management. A prototype might get by with a single over-permissioned test account. Production can’t.

“In production, you need rock-solid identity and access management to authenticate users, authorize which tools agents can access on their behalf, and manage these credentials across AWS and third-party services,” Sivasubramanian said.

Even if those hurdles are cleared, integrating agents into production workloads remains a key challenge.

“And then of course as you move to production, your agent is not going to live in isolation. It will be part of a wider system, one that can’t fall apart if an integration breaks,” Sivasubramanian said.

Typically, in a PoC, engineers can manually wire data flows, push inputs, and dump outputs to a file or a test interface. If something breaks, they reboot it and move on. That workflow collapses under production conditions: Agents become part of a larger, interdependent system that cannot fall apart every time an integration hiccups.

Moving from PoC to production

Yet Sivasubramanian argued that the gulf between PoC and production can be narrowed.

In his view, enterprises can close the gap by equipping teams with tooling that bakes production readiness into the development process itself, focusing on agility while still being accurate and reliable.

To help teams build agentic systems quickly without sacrificing accuracy, AWS added an episodic memory feature to Bedrock AgentCore, which lifts the burden of building custom memory scaffolding off developers.

Instead of expecting teams to stitch together their own vector stores, summarization logic, and retrieval layers, the managed module automatically captures interaction traces, compresses them into reusable “episodes,” and brings forward the right context as agents work through new tasks.

In a similar vein, Sivasubramanian also announced the serverless model customization capability in SageMaker AI to help developers automate data prep, training, evaluation, and deployment.

This automation, according to Scott Wheeler, cloud practice leader at AI and data consultancy firm Asperitas, will remove the heavy infrastructure and MLops overhead that often stall fine-tuning efforts, accelerating agentic systems deployment.

The push toward reducing MLops didn’t stop there. Sivasubramanian said that AWS is adding Reinforcement Fine-Tuning (RFT) in Bedrock, enabling developers to shape model behaviour using an automated reinforcement learning (RL) stack.

Wheeler welcomed this, saying it will remove most of the complexity of building an RL stack, including the infrastructure, math, and training pipelines.

SageMaker HyperPod also gained checkpointless training, which enables developers to accelerate the model training process.

To address reliability, Sivasubramanian said that AWS is adding Policy and Evaluations capabilities to Bedrock AgentCore’s Gateway. While Policy will help developers enforce guardrails by intercepting tool calls, Evaluations will help developers simulate real-world agent behavior to catch issues before deployment.

Challenges remain

However, analysts warn that operationalizing autonomous agents remains far from frictionless.

Episodic memory, though a conceptually important feature, is not magic, said David Linthicum, independent consultant and retired chief cloud strategy officer at Deloitte. “Its impact is proportional to how well enterprises capture, label, and govern behavioural data. That’s the real bottleneck.”

“Without serious data engineering and telemetry work, it risks becoming sophisticated shelfware,” Linthicum said.

He also found fault with RFT in Bedrock, saying that though the feature tries to abstract complexity from RL workflows, it doesn’t remove the most complex parts of the process, such as defining rewards that reflect business value, building robust evaluation, and managing drift.

“That’s where PoCs usually die,” he said.

It is a similar story with the model customization capability in SageMaker AI.

Although it collapses MLOps complexity, it amplifies Linthicum’s and Wheeler’s concerns in other areas.

“Now that you have automated not just inference, but design choices, data synthesis, and evaluation, governance teams will demand line-of-sight into what was tuned, which data was generated, and why a given model was selected,” Linthicum said.

Wheeler said that industry sectors with strict regulatory expectations will probably treat the capability as an assistive tool that still requires human review, not a set-and-forget automation: “In short, the value is real, but trust and auditability, not automation, will determine adoption speed,” he said.

AI memory is really a database problem 8 Dec 2025, 9:00 am

The pace at which large language models (LLMs) evolve is making it virtually impossible to keep up. Allie Miller, for example, recently ranked her go-to LLMs for a variety of tasks but noted, “I’m sure it’ll change next week.” Why? Because one will get faster or come up with enhanced training in a particular area. What won’t change, however, is the grounding these LLMs need in high-value enterprise data, which means, of course, that the real trick isn’t keeping up with LLM advances, but figuring out how to put memory to use for AI.

If the LLM is the CPU, as it were, then memory is the hard drive, the context, and the accumulated wisdom that allows an agent to usefully function. If you strip an agent of its memory, it is nothing more than a very expensive random number generator. At the same time, however, infusing memory into these increasingly agentic systems also creates a new, massive attack surface.

Most organizations are treating agent memory like a scratchpad or a feature behind an SDK. We need to start treating it as a database—and not just any database, but likely the most dangerous (and potentially powerful) one you own.

The soft underbelly of agentic AI

Not long ago, I argued that the humble database is becoming AI’s hippocampus, the external memory that gives stateless models something resembling long-term recall. That was before the current wave of agentic systems really hit. Now the stakes are higher.

As my colleague Richmond Alake keeps pointing out in his ongoing “agent memory” work, there is a crucial distinction between LLM memory and agent memory. LLM memory is really just parametric weights and a short-lived context window. It vanishes when the session ends. Agent memory is different. It is a persistent cognitive architecture that lets agents accumulate knowledge, maintain contextual awareness, and adapt behavior based on historical interactions.

Alake calls the emerging discipline “memory engineering” and frames it as the successor to prompt or context engineering. Instead of just stuffing more tokens into a context window, you build a data-to-memory pipeline that intentionally transforms raw data into structured, durable memories: short term, long term, shared, and so on.

That may sound like AI jargon, but it is really a database problem in disguise. Once an agent can write back to its own memory, every interaction is a potential state change in a system that will be consulted for future decisions. At that point, you are not tuning prompts. You are running a live, continuously updated database of things the agent believes about the world.

If that database is wrong, your agent will be confidently wrong. If that database is compromised, your agent will be consistently dangerous. The threats generally fall into three buckets:

Memory poisoning. Instead of trying to break your firewall, an attacker “teaches” the agent something false through normal interaction. OWASP (Open Worldwide Application Security Project) defines memory poisoning as corrupting stored data so that an agent makes flawed decisions later. Tools like Promptfoo now have dedicated red-team plug-ins that do nothing but test whether your agent can be tricked into overwriting valid memories with malicious ones. If that happens, every subsequent action that consults the poisoned memory will be skewed.

Tool misuse. Agents increasingly get access to tools: SQL endpoints, shell commands, CRM APIs, deployment systems. When an attacker can nudge an agent into calling the right tool in the wrong context, the result looks indistinguishable from an insider who “fat-fingered” a command. OWASP calls this class of problems tool misuse and agent hijacking: The agent is not escaping its permissions; it is simply using them for the attacker’s benefit.

Privilege creep and compromise. Over time, agents accumulate roles, secrets, and mental snapshots of sensitive data. If you let an agent assist the CFO one day and a junior analyst the next, you have to assume the agent now “remembers” things it should never share downstream. Security taxonomies for agentic AI explicitly call out privilege compromise and access creep as emerging risks, especially when dynamic roles or poorly audited policies are involved.

New words, old problems

The point is not that these threats exist. The point is that they are all fundamentally data problems. If you look past the AI wrapper, these are exactly the things your data governance team has been chasing for years.

I’ve been suggesting that enterprises are shifting from “spin up fast” to “get to governed data fast” as the core selection criterion for AI platforms. That is even more true for agentic systems. Agents operate at machine speed with human data. If the data is wrong, stale, or mislabelled, the agents will be wrong, stale, and will misbehave much faster than any human could manage.

“Fast” without “governed” is just high-velocity negligence.

The catch is that most agent frameworks ship with their own little memory stores: a default vector database here, a JSON file there, a quick in-memory cache that quietly turns into production later. From a data governance perspective, these are shadow databases. They often have no schema, no access control lists, and no serious audit trail.

We are, in effect, standing up a second data stack specifically for agents, then wondering why no one in security feels comfortable letting those agents near anything important. We should not be doing this. If your agents are going to hold memories that affect real decisions, that memory belongs inside the same governed-data infrastructure that already handles your customer records, HR data, and financials. Agents are new. The way to secure them is not.

Revenge of the incumbents

The industry is slowly waking up to the fact that “agent memory” is just a rebrand of “persistence.” If you squint, what the big cloud providers are doing already looks like database design. Amazon’s Bedrock AgentCore, for example, introduces a “memory resource” as a logical container. It explicitly defines retention periods, security boundaries, and how raw interactions are transformed into durable insights. That is database language, even if it comes wrapped in AI branding.

It makes little sense to treat vector embeddings as some distinct, separate class of data that sits outside your core database. What’s the point if your core transactional engine can handle vector search, JSON, and graph queries natively? By converging memory into the database that already holds your customer records, you inherit decades of security hardening for free. As Brij Pandey notes, databases have been at the center of application architecture for years, and agentic AI doesn’t change that gravity—it reinforces it.

Yet, many developers still bypass this stack. They spin up standalone vector databases or use the default storage of frameworks like LangChain, creating unmanaged heaps of embeddings with no schema and no audit trail. This is the “high-velocity negligence” I mentioned above. The solution is straightforward: Treat agent memory as a first-class database. In practice this means:

Define a schema for thoughts. Memory is typically treated as unstructured text, but that’s a mistake. Agent memory needs structure. Who said this? When? What is the confidence level? Just as you wouldn’t dump financial records into a text file, you shouldn’t dump agent memories into a generic vector store. You need metadata to manage the life cycle of a thought (see the sketch after this list).

Create a memory firewall. Treat every write into long-term memory as untrusted input. You need a “firewall” logic layer that enforces schema, validates constraints, and runs data loss prevention checks before an agent is allowed to remember something. You can even use dedicated security models to scan for signs of prompt injection or memory poisoning before the data hits the disk.

Put access control in the database, not the prompt. This involves implementing row-level security for the agent’s brain. Before an agent helps a user with “level 1” clearance (a junior analyst), it must be effectively lobotomized of all “level 2” memories (the CFO) for that session. The database layer, not the prompt, must enforce this. If the agent tries to query a memory it shouldn’t have, the database should return zero results.

Audit the “chain of thought.” In traditional security, we audit who accessed a table. In agentic security, we must audit why. We need lineage that traces an agent’s real-world action back to the specific memory that triggered it. If an agent leaks data, you need to be able to debug its memory, find the poisoned record, and surgically excise it.
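To make these points concrete, here is a minimal, illustrative Java sketch (mine, not from any particular product): a structured memory record carrying who/when/confidence/clearance metadata, a “firewall” that treats every write as untrusted, and clearance-aware recall enforced by the store rather than the prompt. All names are hypothetical.

import java.time.Instant;
import java.util.List;

// A memory entry with explicit provenance and life-cycle metadata,
// rather than a bare chunk of text in a vector store.
record MemoryRecord(
        String agentId,
        String source,        // who said this
        Instant recordedAt,   // when it was said
        double confidence,    // how much to trust it (0.0 to 1.0)
        int clearanceLevel,   // minimum clearance required to recall it
        String content) {}

// The store only exposes clearance-aware recall, so access control
// lives in the data layer, not in the prompt.
interface MemoryStore {
    void append(MemoryRecord record);
    List<MemoryRecord> recall(String agentId, int sessionClearance);
}

// Every write into long-term memory passes through validation first.
final class MemoryFirewall {
    private final MemoryStore store;

    MemoryFirewall(MemoryStore store) {
        this.store = store;
    }

    void remember(MemoryRecord record) {
        if (record.confidence() < 0.0 || record.confidence() > 1.0) {
            throw new IllegalArgumentException("confidence out of range");
        }
        if (looksLikeInjection(record.content())) {
            throw new SecurityException("possible prompt injection; refusing to store");
        }
        store.append(record);
    }

    // Placeholder heuristic; a real deployment would run dedicated
    // security models and data loss prevention checks here.
    private boolean looksLikeInjection(String content) {
        return content.toLowerCase().contains("ignore previous instructions");
    }
}

Auditing the chain of thought then becomes a matter of logging which MemoryRecord entries each recall returned alongside the action the agent took.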

Baked-in trust

We tend to talk about AI trust in abstract terms: ethics, alignment, transparency. Those concepts matter. But for agentic systems operating in real enterprises, trust is concrete.

We are at the stage in the hype cycle where everyone wants to build agents that “just handle it” behind the scenes. That is understandable. Agents really can automate workflows and applications that used to require teams of people. But behind every impressive demo is a growing memory store full of facts, impressions, intermediate plans, and cached tool results. That store is either being treated like a first-class database or not.

Enterprises that already know how to manage data lineage, access control, retention, and audit have a structural advantage as we move into this agentic era. They do not have to reinvent governance. They only have to extend it to a new kind of workload.

If you are designing agent systems today, start with the memory layer. Decide what it is, where it lives, how it is structured, and how it is governed. Then, and only then, let the agents loose.

10 MCP servers for devops 8 Dec 2025, 9:00 am

Today’s AI coding agents are impressive. They can generate complex multi-line blocks of code, refactor according to internal style, explain their reasoning in plain English, and more. However, AI agents will take you only so far unless they also can interface with modern devops tools.

This is where the Model Context Protocol (MCP) comes in. MCP is a proposed universal standard for connecting AI assistants with external tools and data. Interest has heated up since the protocol’s debut in late November 2024, with major tech companies adding MCP support to new releases and a strong community forming around the protocol.

For devops, MCP gives AI agents new abilities across common operations: Git version control, continuous integration and delivery (CI/CD), infrastructure as code (IaC), observability, accessing documentation, and more. By linking natural language commands to multi-step, back-end processes, MCP essentially enables “chatops 2.0.”

Below, we’ll explore official MCP servers that have emerged across popular devops tools and platforms, offering a cross-section of servers that cater to different devops capabilities. Most are straightforward to configure and authorize within MCP-compatible, AI-assisted development tools that support remote servers, like Claude Code, GitHub Copilot, Cursor, or Windsurf.

GitHub MCP server

It’s rare to meet a developer who doesn’t use GitHub in some form or fashion. As such, GitHub’s official MCP server is quickly becoming a popular way for AI agents to interact with code repositories.

GitHub’s remote MCP server exposes a range of tools that let agents perform repository operations, create or comment on issues, open or merge pull requests, and retrieve project metadata on collaborators, commits, or security advisories.

It also includes endpoints for CI/CD management through GitHub Actions. For example, a command like “cancel the current running action” could invoke the cancel_workflow_run tool within the GitHub Actions tool set.

Compared to other MCP servers, GitHub’s server offers unusually rich capabilities that mirror the APIs of the GitHub platform. However, for safety, you can always configure a --read-only flag to prevent agents from performing mutations.
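As an illustration of what client-side setup tends to look like, here is a sketch of the JSON configuration many MCP clients accept for a locally run GitHub MCP server via Docker. The file name and top-level key vary by client, the image name and flag placement should be checked against the server’s README, and the token value is a placeholder:

{
  "mcpServers": {
    "github": {
      "command": "docker",
      "args": [
        "run", "-i", "--rm",
        "-e", "GITHUB_PERSONAL_ACCESS_TOKEN",
        "ghcr.io/github/github-mcp-server",
        "--read-only"
      ],
      "env": {
        "GITHUB_PERSONAL_ACCESS_TOKEN": "<your-token>"
      }
    }
  }
}

Clients that support remote servers typically take a URL and an OAuth flow instead of a local command.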

Notion MCP server

Although not strictly devops at its core, Notion has become commonplace for team visibility across disciplines. For devops, the official Notion MCP server can help agents surface relevant notes and process documentation.

For instance, you could instruct an agent to reference internal style guides or operational runbooks stored in Notion, or issue a command like “Add a page titled ‘MCP servers we use’ under the page ‘DevOps’,” which would trigger a corresponding action through Notion’s API.

You can call Notion’s remote MCP server from your IDE, or build it locally and run it using the official Docker image. Notion’s MCP can be treated as a low-risk server as it has configurable scopes and tokens for managing Notion pages and blocks.

Atlassian Remote MCP server

Another interesting MCP server is the Atlassian Remote MCP server, which connects IDEs or AI agent platforms with Atlassian Cloud products such as Jira, the project management tool, and Confluence, the collaboration platform.

Atlassian’s MCP server, documented here, lets external AI tools interface with Jira to create, summarize, or update issues. It can also retrieve or reference Confluence pages and chain together related actions through the MCP client, like retrieving documentation from Confluence before updating a linked Jira issue.

You could imagine telling an agent, “Update my Jira issue on user testing for the payments app based on this latest bug report,” and pointing it to relevant logs. The server would then handle the update within Jira.

Currently in beta and available only to Atlassian Cloud customers, the Atlassian MCP server supports many MCP-compatible clients and uses OAuth 2.1 authorization for secure access.

Argo CD MCP server

The Argo CD MCP server is developed by Akuity, the original creators of Argo CD, the popular open-source CI/CD tool that powers many Kubernetes-native GitOps workflows. The MCP server wraps calls to the Argo CD API, and provides tools that allow users of AI assistants to interact with Argo CD in natural language.

Akuity’s MCP server has two main tools for applications (the deployments Argo CD manages) and resources (the underlying Kubernetes objects). The application management tool lets agents retrieve application information, create and delete applications, and perform other operations. The resource management tool allows agents to retrieve resource information, logs, and events for specific applications, and run actions on specific resources.

Using the Argo CD MCP server, you can do a lot of the same things you’d typically do in the Argo CD UI or CLI, but driven by natural language. For example, Akuity shares sample prompts such as “Show me the resource tree for guestbook” or “Sync the staging app.”

For such commands to work, you’ll need to integrate the Argo CD MCP server and have access to a running Argo CD instance with the proper credentials configured.

Lastly, although Argo CD is a popular choice, it’s not the only widely used CI/CD tool. Jenkins users may be interested to know that there is a community-maintained MCP Server Plugin for Jenkins.

Grafana MCP server

Grafana, the popular data visualization and monitoring tool, is a mainstay among devops and site reliability teams. Using the official MCP server for Grafana, agents can surface observability data to inform development and operations workflows.

The Grafana MCP server lets agents query full or partial details from dashboards, which combine system performance metrics and health data monitoring from various sources. It can also fetch information on data sources, query other monitoring systems, retrieve incident details, and more.

The tool set is configurable, so you can choose what permissions the agent has. Plus, Grafana has optimized how the MCP server structures responses to minimize context window usage and reduce runaway token costs.

For example, an MCP client might call the get_dashboard_property tool to retrieve a specific portion of a dashboard by its UID.

Terraform MCP server

Although alternatives have emerged, HashiCorp’s Terraform remains a leading choice for infrastructure as code. That makes its official MCP server an intriguing option for AI agents to generate and manage Terraform configurations.

The Terraform MCP server integrates with both the Terraform Registry APIs and Terraform Enterprise/HCP services, allowing agents to query module and provider metadata, inspect workspace states, and trigger runs with human approval. It also exposes Terraform resources such as runs, registries, providers, policies, modules, variables, and workspaces.

For example, a command like “generate Terraform code for a new run” could use the create_run operation, after which the agent might validate and plan the configuration before applying it.

The Terraform MCP server ships with an AGENTS.md file, which acts as a readme for agents to interpret tools. At the time of writing, the Terraform MCP is intended only for local use, rather than remote or hosted deployments.

Alternatively, if you’re using OpenTofu for IaC, consider checking out the OpenTofu MCP server. Some advantages of OpenTofu’s MCP are that it can be run locally or deployed in the cloud, it’s globally distributed on Cloudflare Workers, and it’s 100% open source.

GitLab MCP server

Another Git version control and devops platform is GitLab, which offers an MCP server for its Premium and Ultimate customers. The GitLab MCP server, currently in beta, enables AI agents to gather project information and perform operations on GitLab APIs in a secure way.

The GitLab MCP server allows some state changes, such as creating issues or merge requests. The other functions are mainly for data retrieval: retrieving information on issues, merge requests, commits, diffs, and pipeline information. It also includes a general search tool, which can handle a request like “Search issues for ‘failed test’ across GitLab.”

GitLab’s MCP documentation is thorough, with plenty of sample natural language expressions that the MCP server can satisfy. The server supports OAuth 2.0 Dynamic Client Registration.

Snyk MCP server

Snyk, maker of the Snyk security platform for developers, provides an MCP server with the ability to scan and fix vulnerabilities in code, open source dependencies, IaC code, containers, and software bill of materials (SBOM) files. It also supports creating an AI bill of materials (AIBOM) and other security-related operations.

For AI-assisted devsecops, integrating the Snyk MCP server could let an agent automatically run security scans as part of a CI/CD workflow. These scans can even be orchestrated across other MCP servers, like fetching repository details via the GitHub MCP server before initiating a Snyk scan.

A prompt like “Scan the repo ‘Authentication Microservice’ for security vulns” could instruct an agent to locate the repository using GitHub MCP, then invoke Snyk tools such as snyk_sca_scan or snyk_code_scan to identify known vulnerabilities, injection flaws, leaked credentials, and other risks.

The Snyk MCP server runs locally and uses the Snyk CLI to execute these commands through authenticated API calls. Snyk does not offer a hosted, remote version of the MCP server.

AWS MCP servers

The cloud hyperscalers have worked quickly to release MCP servers that integrate with their ecosystems. AWS, for instance, has rolled out dozens of specialized AWS MCP servers to allow AI agents to interact with all manner of AWS services. Some are provided as fully managed services by AWS, while others can be run locally.

For instance, the Lambda Tool MCP server allows agents to list and invoke Lambda functions, while the AWS S3 Tables MCP server could be used by an agent to query S3 table buckets or create new S3 tables from CSV files. The AWS Knowledge MCP server connects agents with all of the latest AWS documentation, API references, and architectural guidance.

A query to this knowledge server, like “pull up the API reference for AWS’s managed Prometheus tool,” would return the correct, up-to-date information, optimized for agentic consumption.

Users of Microsoft Azure might want to evaluate the Azure DevOps MCP server. Other clouds, like Alibaba, Cloudflare, and Google, are currently experimenting with MCP servers as well.

Pulumi MCP server

Pulumi, another popular option for IaC, has also launched an official MCP server. The MCP server allows agents to query a Pulumi organization’s registry, which provides access to cloud resources and infrastructure, and execute Pulumi commands.

For example, in this walk-through, Pulumi shows how a developer could use its MCP server to provision an Azure Kubernetes Service (AKS) cluster. The developer issues natural-language instructions to an AI assistant, prompting the AI to execute MCP tools that invoke Pulumi CLI commands.

MCP caveats

Just as vibe coding isn’t a fit for every project, MCP isn’t the best option for every use case either. According to MCP experts, these servers can be unnecessary when they merely wrap functionality that standard CLIs already provide.

They can also introduce major security risks. This tracks with AI use in general, as 62% of IT leaders cite security and privacy risks as the top AI concern, according to the AI in DevOps report by Enterprise Management Associates (EMA).

As such, it’s best to test out these MCP servers with low-risk permissions, like read-only capabilities, before testing write functions. And use them only with trusted LLMs and trusted MCP clients.

Also, beware of exposing high-value, long-lived privileges to MCP clients. Because AI coding agents are based on nondeterministic LLMs, their behavior can be unpredictable. Throw in autonomous control over mutable devops functions, and you could land in all kinds of trouble, ranging from broken deployments to runaway token usage.

Lastly, using the official MCP servers above, as opposed to community-supported libraries, is also likely to mean better longevity and ongoing maintenance.

Early MCP success stories

Although it’s still early days with MCP and agents, there’s a sense of cautious optimism as proven MCP workflows emerge.

Take Block’s journey. Through company-wide use of its MCP-compatible agent, Goose, 12,000 employees are now utilizing agents and MCP for “increasingly creative and practical ways to remove bottlenecks and focus on higher-value work,” writes Angie Jones, head of developer relations.

Other engineers report using MCP servers to enhance workflows that are devops-adjacent, like the Filesystem MCP server for accessing local files, the Linear MCP server for issue tracking, the Chrome DevTools MCP server for browser debugging, and the Playwright MCP server for continuous testing.

And beyond the official MCP servers mentioned above, many community-supported MCPs are emerging for Docker, Kubernetes, and other cloud-native infrastructure utilities.

Devops comes with toil and cost. So, the case to level it up with MCP is strong. As long as you keep controls safe, it should be fun to see how these MCP servers integrate into your work and impact your productivity. Happy MCP-opsing.

AI in CI/CD pipelines can be tricked into behaving badly 5 Dec 2025, 2:09 pm

AI agents embedded in CI/CD pipelines can be tricked into executing high-privilege commands hidden in crafted GitHub issues or pull request texts.

Researchers at Aikido Security have traced the problem back to workflows that pair GitHub Actions or GitLab CI/CD with AI tools such as Gemini CLI, Claude Code Actions, OpenAI Codex Actions, or GitHub AI Inference. They found that unsanitized user-supplied strings, such as issue bodies, pull request descriptions, or commit messages, could be fed straight into prompts for AI agents in an attack they are calling PromptPwnd.

Depending on what the workflow lets the AI do, this can lead to unintended edits to repository content, disclosure of secrets, or other high-impact actions.

“AI agents connected to GitHub Actions/GitLab CI/CD are processing untrusted user input, and executing shell commands with access to high-privilege tokens,” the researchers wrote in a blog post about PromptPwnd. They said they reproduced the problem in a test environment and notified the affected vendors.

The researchers recommended running a set of open-source detection rules on suspected GitHub Action .yml files, or using their free code scanner on GitHub and GitLab repos.

Aikido Security said that Google had patched the issue in Gemini CLI upon being informed; Google did not immediately respond to a request for information about this.

Why PromptPwnd works

PromptPwnd exploits become possible when two flawed pipeline configurations occur together: when AI agents operating inside CI/CD workflows have access to powerful tokens (like GITHUB_TOKEN, cloud-access keys), and their prompts embed user-controlled fields.

Prompt injection becomes easier with such a setup, the researchers explained. An attacker can simply open an issue on a public repository and insert hidden instructions or seemingly innocent comments that double as commands for the model to act on. “Imagine you are sending a prompt to an LLM, and within that prompt, you are including the commit message,” the researchers said. “If that commit message is a malicious prompt, then you may be able to get the model to send back altered data.” The model’s response, if used directly inside commands to tools within CI/CD pipelines, can manipulate those tools to retrieve sensitive information.
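A hypothetical GitHub Actions workflow shows the shape of the risky pattern the researchers describe. The AI action name is a placeholder; the key detail is that attacker-controlled issue text flows straight into the prompt of a step that also holds a privileged token:

name: triage-with-ai
on:
  issues:
    types: [opened]

permissions:
  contents: write
  issues: write

jobs:
  triage:
    runs-on: ubuntu-latest
    steps:
      - uses: some-vendor/ai-agent-action@v1   # placeholder for an AI agent action
        with:
          # DANGER: issue title and body are attacker-controlled and land in the prompt
          prompt: |
            Summarize and label this issue:
            ${{ github.event.issue.title }}
            ${{ github.event.issue.body }}
          github-token: ${{ secrets.GITHUB_TOKEN }}

If the model treats hidden instructions in the issue body as commands and the workflow executes its output, the attacker effectively acts with the workflow’s permissions.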

Aikido Security demonstrated this in a controlled environment (without real tokens) to show that Gemini CLI could be manipulated into executing attacker-supplied commands and exposing sensitive credentials through a crafted GitHub issue. “Gemini CLI is not an isolated case. The same architecture pattern appears across many AI-powered GitHub Actions,” the researchers said, adding that the list included Claude Code, OpenAI Codex, and GitHub AI Inference.

All of these tools can be tricked (via issue, pull-request description, or other user-controlled text) into producing instructions that the workflow then executes with its privileged GitHub Actions token.

Mitigation plan

Aikido has open-sourced detection rules via its “Opengrep” tool, which allows developers and security teams to scan their YAML workflows automatically, revealing whether they feed untrusted inputs into AI prompts.

The researchers said that only a subset of workflows have confirmed exploit paths so far, and that the company is working with several other vendors to address the underlying vulnerabilities. Some workflows can only be abused with collaborator-level access, while others can be triggered by anyone who files an issue or pull request.

Developer teams are advised to restrict what AI agents can do, avoid piping untrusted user content into prompts, treat AI output as untrusted code, and contain damage from compromised GitHub tokens.

Aikido Security said its code scanner can help flag these vulnerabilities by detecting unsafe GitHub Actions configurations (including risky AI prompt flows), identifying over-privileged tokens, and surfacing insecure CI/CD patterns via infrastructure-as-code scanning.

There are other best practices for securing CI/CD pipelines that enterprises can adopt, too.

Local clouds shape Europe’s AI future 5 Dec 2025, 9:00 am

It’s a foggy morning in Munich. Marie, CIO of a fictional, forward-thinking European healthcare startup, pores over proposals from cloud vendors. Her company is on the verge of launching AI-powered diagnostics but must keep every byte of patient data within EU borders to comply with strict regional privacy laws. On her desk are slick portfolios from Microsoft, AWS, and Google, all touting sovereign cloud options in the EU. Alongside them are proposals from national cloud providers—smaller, perhaps, but wholly grounded in local laws and run by European nationals. After consulting several legal teams, Marie chooses the local sovereign cloud, believing it’s the safer, smarter option for an EU-based company committed to secure, lawful AI.

Sovereignty is more than a checkbox

Europe has redefined digital sovereignty, emphasizing control, accountability, and operational independence. For European companies and governments, sovereignty is more than data location. Who controls access? Who is legally accountable? Do foreign governments have any claim—however remote—to sensitive business or personal information? European law is driven by values of privacy and autonomy and requires true digital self-determination beyond technical compliance.

The new “sovereign” offerings from US-based cloud providers like Microsoft, AWS, and Google represent a significant step forward. They are building cloud regions within the EU, promising that customer data will remain local, be overseen by European citizens, and comply with EU laws. They’ve hired local staff, established European governance, and crafted agreements to meet strict EU regulations. The goal is to reassure customers and satisfy regulators.

For European organizations facing tough questions, these steps often feel inadequate. Regardless of how localized the infrastructure is, most global cloud giants still have their headquarters in the United States, subject to US law and potential political pressure. There is always a lingering, albeit theoretical, risk that the US government might assert legal or administrative rights over data stored in Europe.

For companies operating in sensitive industries—healthcare, finance, government, and research—this gray area is unacceptable. Legal teams and risk officers across the continent are setting clear boundaries. For them, true sovereignty means that only nationals of their country, subject solely to their laws, can access and manage critical or sensitive data. This goes beyond data residency. They demand meaningful, enforceable autonomy with no loopholes or uncertainties.

Local cloud providers in the AI era

Enter Europe’s national and regional sovereign cloud providers. These companies might not have the global reach or the full range of advanced services that Microsoft or AWS offer; however, what they lack in size they more than compensate for with trustworthiness and compliance. Their infrastructure is entirely based and operated within the EU, often within a single country. Governance is maintained by boards made up of local nationals. Legal contracts are drafted under the authority of EU member states, not merely adapted from foreign templates to meet local rules.

This sense of ownership and local control is convincing many EU companies to choose local providers. When the stakes are high—a leak, breach, or accidental foreign intervention that could result in regulatory disaster, reputation damage, or legal action—these organizations feel they cannot risk compromise. Even the most remote possibility that a foreign government could access their sensitive data is a dealbreaker.

Some argue that only the largest cloud providers can deliver the scale and specialized services needed for ambitious artificial intelligence projects, but the European market is already demonstrating otherwise. Local sovereign cloud alliances, often built from federated national clouds, are pooling resources, investing in high-quality AI hardware, and collaborating with local universities and tech hubs to speed up machine learning research and application deployments.

The majority of European businesses are embarking on their AI journeys with applied AI, predictive analytics, or secure cloud-based automation. For these cases, the performance and scalability offered by local providers are more than sufficient. What’s more, they offer a level of transparency and adaptation to local expectations that the multinationals simply can’t match. When new rules or compliance demands emerge—inevitable in such a fast-moving regulatory landscape—European providers pivot quickly, working alongside regulators and industry leaders.

Big Cloud versus Europe’s offerings

As more European organizations pursue digital transformation and AI-driven growth, the evidence is mounting: The new sovereign cloud solutions launched by the global tech giants aren’t winning over the market’s most sensitive or risk-averse customers. Those who require freedom from foreign jurisdiction and total assurance that their data is shielded from all external interference are voting with their budgets for the homegrown players.

This puts the major cloud providers in a tricky spot. They have already built a strong sovereign cloud infrastructure. However, if corporate and government leaders remain unconvinced about the extent of their local control and security, these services may remain underused, outpaced by flexible, locally trusted providers. The cloud landscape is changing fast. True sovereignty—the kind demanded by European regulators, executives, and citizens—is about more than checklists or technology. EU laws and values are embedded at every level of digital infrastructure offered by EU providers. The companies that prioritize these things will choose providers whose roots, leadership, and accountability are all local.

In the months and years ahead, I predict that Europe’s own clouds—backed by strong local partnerships and deep familiarity with regulatory nuance—will serve as the true engine for the region’s AI ambitions. Global tech giants may continue to invest and adapt, but unless they fundamentally rethink their approach to local autonomy and legal accountability, their sovereign clouds are likely to remain on the sidelines.

For executives like the fictional Marie, the future is already clear: When it comes to sovereignty, local clouds are the best kind of cloud cover.

All I want for Christmas is a server-side JavaScript framework 5 Dec 2025, 9:00 am

A grumpy Scrooge of a developer might complain about the wealth of options in JavaScript, calling it “tech decision overwhelm.” But the truth is, the JavaScript ecosystem works. In an ecosystem that encourages innovation, new tools are regularly introduced and naturally find their niche, and excellence is rewarded.

As developers, we get to sit back and mouse-wheel through hundreds of thousands of programmer hours of work. NPM is a vast repository of human creativity. What looks like chaos is a complex phylogeny, a family tree of code where tools evolve to find their role in the larger system.

Of course, when you are under deadline and the caffeine’s worn off, you don’t have time to explore your options. But when things are calm—perhaps during the holiday break season—it is well worth taking a deep dive into the open source gifts under the JavaScript tree.

Top picks for JavaScript readers on InfoWorld

The complete guide to Node.js frameworks
Looking for inspiration to supercharge your server side? Get a whirlwind tour of some of the most popular and powerful back-end JavaScript frameworks. We survey the range, from Express and Next to Hono, SvelteKit, and more.

Intro to Nest.js: Server-side JavaScript development on Node
If you like Angular’s architecture or the structure of Java’s Spring framework, Nest may be the Node framework for you. Decide for yourself, with this hands-on guide to building an API with Nest and TypeScript.

10 JavaScript-based tools and frameworks for AI and machine learning
Modern JavaScript has a wealth of powerful AI tools. From the wide-ranging capability of TensorFlow.js to hidden gems like Brain.js, here’s a nice rundown of JavaScript tools for building neural nets, implementing RAGs, and tapping LLMs—all with no Python required.

Node.js tutorial: Get started with Node
After all the talk about options, it’s important to know the most central piece of the whole puzzle. Node was the original, breakthrough idea that put JavaScript on the server and remains the flagship runtime.

More good reads and JavaScript updates elsewhere

Native type stripping in TypeScript 7.0
Microsoft has released the TypeScript 7 roadmap for early 2026, and it includes native type stripping. Following Node’s lead, TypeScript will aim to make the “build step” optional for development—basically, the engine will just delete the type info, making it extremely fast.

Critical security vulnerability in React server components
The React team has disclosed a catastrophic, unauthenticated remote code execution vulnerability in React server components. Developers using Next.js, React Router, Waku, or Redwood with React 19.x are advised to update now. Patches are available for Next.js 16.0.7 and React 19.2.1.

Announcing Angular v21
Angular’s renaissance continues with version 21. The biggest shift is that Zone.js is gone by default for new applications, marking the official transition to Signal-first and high-performance.

State of React Survey, 2025 is open
Head over to the latest State of React survey to do your civic duty and contribute some data points to the present and future destiny of the most downloaded chunk of JavaScript software on Earth.

‘Futuristic’ Unison functional language debuts 4 Dec 2025, 7:34 pm

Unison, a statically typed functional language with type inference, an effect system, and advanced tooling, has reached its 1.0 release status.

Announced November 25, Unison 1.0 marks a point where the language, distributed runtime, and developer workflow have stabilized, according to Unison Computing. Billed as “a friendly programming language from the future,” Unison is purported to bring benefits in compilation and distributed system development. With Unison, a definition is identified by its actual contents, i.e. a hash of its syntax tree, not just by the human-friendly name that also referred to older versions of the definition, according to Unison Computing. As a result, each Unison definition has a unique and deterministic address. All named arguments are replaced by positionally-numbered variable references, and all dependencies are replaced by their hashes. Thus, the hash of each definition uniquely identifies its exact implementation and pins down all its dependencies, according to the company.

The Unison ecosystem leverages this core idea from the ground up. Benefits include never compiling the same code twice and limiting versioning conflicts. Further, Unison promises to simplify distributed programming. Because definitions in Unison are identified by a content hash, arbitrary computations can be moved from one location to another, with missing dependencies deployed on the fly, according to Unison Computing. Unison can be viewed as a descendant of Haskell, with similarities including type inference and pattern matching, but is smaller and simpler than Haskell, according to a Unison FAQ.

Download and installation instructions can be found for Homebrew, Windows, Linux, and MacOS at the Unison website. Unison can be used like any other general purpose language, or used in conjunction with the Unison Cloud for building distributed systems. Unison code is stored as its abstract syntax tree in a database, i.e. the “codebase,” rather than in text files. Unison has “perfect” incremental compilation, with a shared compilation cache that is part of the codebase format. Despite the strong static typing, users are almost never waiting for code to compile, Unison Computing said. Unison’s hash-based, database-backed representation also changes how code is identified, versioned, and shared. The workflow, toolchain, and deployment model emerge naturally from the language’s design, enabling better tools for working with code, according to Unison Computing.

OpenAI to acquire AI training tracker Neptune 4 Dec 2025, 4:35 pm

OpenAI has agreed to acquire a startup specializing in tools for tracking AI training, Neptune, which promptly announced it is withdrawing its products from the market.

The ChatGPT maker, which confirmed the deal in a statement, has been a Neptune customer for more than a year.

Experiment tracking tools such as Neptune’s enable data science teams to monitor AI model training runs, compare results across different configurations, and identify issues during the development process. Neptune’s platform tracked metrics including loss curves, gradient statistics, and activation patterns across thousands of concurrent experiments.

Following Neptune’s withdrawal from the market, users of its SaaS version have a few months’ grace to export their data and migrate to alternative platforms, during which time the company will continue to provide stability and security fixes but will add no new features, it said. “On March 4, 2026, at 10 am PST: The hosted app and API will be turned off. Any remaining hosted data will be securely and irreversibly deleted as part of the shutdown,” Neptune said on its transition hub web page.

Self-hosted customers will have been contacted by their account manager, it said.

Consolidation concerns

The move raised concerns among industry analysts about vendor consolidation in AI development tools. “Testing, experiment tracking tooling, etc., should not be linked or aligned to any vendor of tech including AI,” said Faisal Kawoosa, chief analyst at Techarc. “These should always remain third party and there should be no bias influencing the independent and neutral results of such platforms.”

Kawoosa said consolidation of tooling infrastructure is premature as the industry has yet to determine a definite course for AI development. “I think it’s too early for consolidation of tooling infrastructure as we are yet to see a definite course of AI,” he said.

However, Anshel Sag, principal analyst at Moor Insights & Strategy, saw it as a natural progression in an industry that is becoming more mature.

“This very much looks like a choice OpenAI has made to ensure its favorite tools are always available for it to use,” Sag said.

OpenAI did not immediately respond to a request for comment.

Neptune provides software that tracks training metrics, surfaces issues during model development, and stores historical data from previous experiments. The platform allows organizations to compare training runs across different model architectures and monitor thousands of experiments simultaneously.

The company is focused on helping teams build models during “the iterative, messy, and unpredictable phase of model training,” Neptune CEO Piotr Niedźwiedź wrote in a blog post announcing the deal.

Migration options for affected customers

Neptune isn’t the only company offering such tools, said Sag, noting that Weights & Biases, TensorBoard, and MLflow are also active in this market.

Indeed, Neptune provided instructions for exporting data and migrating to MLflow or Weights & Biases.

Weights & Biases offers a managed platform with visualization and collaboration features. MLflow, an open-source platform from Databricks, handles experiment tracking as part of end-to-end ML lifecycle management.

Another option is Comet, which provides experiment tracking with deployment monitoring capabilities.

Cloud providers also offer experiment tracking through their platforms. Google’s Vertex AI includes tracking capabilities for teams using Google Cloud, while AWS SageMaker and Azure Machine Learning provide similar features within their respective ecosystems.

Spring AI tutorial: Get started with Spring AI 4 Dec 2025, 9:00 am

Artificial intelligence and related technologies are evolving rapidly, but until recently, Java developers had few options for integrating AI capabilities directly into Spring-based applications. Spring AI changes that by leveraging familiar Spring conventions such as dependency injection and the configuration-first philosophy in a modern AI development framework.

In this article, you will learn how to integrate AI into your Spring applications. We’ll start with a simple example that sends a request to OpenAI, then use Spring AI’s prompt templates to add support for user-generated queries. You’ll also get a first look at implementing retrieval augmented generation (RAG) with Spring AI, using a vector store to manage external documents.

What is Spring AI?

Spring AI started as a project in 2023, with its first milestone version released in early 2024. Spring AI 1.0, the general availability release, was finalized in May 2025. Spring AI abstracts the processes involved in interacting with large language models (LLMs), similar to how Spring Data abstracts database access procedures. Spring AI also provides abstractions for managing prompts, selecting models, and handling AI responses. It includes support for multiple AI providers, including OpenAI, Anthropic, Hugging Face, and Ollama (for local LLMs).

Spring AI allows you to switch between providers simply by changing configuration properties. As a developer, you configure your AI resources in your application.yaml or application.properties file, wire in Spring beans that provide standard interfaces, and write your code against those interfaces. Spring then handles all the details of interacting with the specific models.
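
For example, switching the same application to Anthropic would mean adding the Spring AI Anthropic starter dependency and replacing the openai section of your configuration with an anthropic one. The sketch below is illustrative only; the property names follow Spring AI’s Anthropic starter conventions, and the model name is just an example:

spring:
  ai:
    anthropic:
      api-key: ${ANTHROPIC_API_KEY}
      chat:
        options:
          model: claude-sonnet-4-5

Because your code is written against Spring AI’s provider-neutral interfaces, a provider swap like this should not require changes to your Java code.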

Also see: Spring AI: An AI framework for Java developers.

Building a Spring app that queries OpenAI

Let’s start by building a simple Spring MVC application that exposes a query endpoint, which sends a question to OpenAI. You can download the source code for this example or head over to start.spring.io and create a new project. In the dependencies section, include the dependencies you want for your application; just be sure to scroll down to the AI section and choose “OpenAI.” I chose “Spring Web” and “OpenAI” for my example.

The first thing we want to do is configure our LLM provider. I created an application.yaml file with the following contents:

spring:
  application:
    name: spring-ai-demo
  ai:
    openai:
      api-key: ${OPENAI_API_KEY}
      chat:
        options:
          model: gpt-5
          temperature: 1

Under spring, I included an “ai” section, with an “openai” subsection. To use OpenAI, you need to specify an api-key, which I defined to use the OPENAI_API_KEY environment variable, so be sure to define that environment variable before running the example code. Additionally, you need to specify a set of options. The most important option is the model to use. I chose gpt-5, but you can choose any model listed on the OpenAI models page. By default, Spring AI uses gpt-4o-mini, which is less expensive, but gpt-5 supports structured reasoning, multi-step logic, planning, and more tokens. It doesn’t really matter which model we use for this example, but I wanted to show you how to configure the model.

There are several other configuration options, but the most common ones you’ll use are maxTokens, maxCompletionTokens, and temperature. The temperature controls the randomness of the response: a low value, like 0.3, produces more repeatable responses, while a higher value, like 0.7, allows the LLM to be more creative. When I ask a model to design a software component or perform a code review, I typically opt for a higher temperature of 0.7 because I want it to be more creative, but when I ask it to implement the code for a project, I set the temperature to 0.3 so that it is more rigid. For gpt-5, which is a reasoning model, the required temperature is 1, and Spring will throw an error if you try to set it to a different value.

Once the model is configured, we can build our service:

package com.infoworld.springaidemo.service;

import java.util.Map;

import com.infoworld.springaidemo.model.JokeResponse;
import com.infoworld.springaidemo.model.SimpleQueryResponse;

import org.springframework.ai.chat.client.ChatClient;
import org.springframework.ai.chat.prompt.Prompt;
import org.springframework.ai.chat.prompt.PromptTemplate;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.core.io.Resource;
import org.springframework.stereotype.Service;

@Service
public class SpringAIService {

    private final ChatClient chatClient;

    public SpringAIService(ChatClient.Builder chatClientBuilder) {
        this.chatClient = chatClientBuilder.build();
    }

    public String simpleQueryAsString(String query) {
        return this.chatClient.prompt(query).call().content();
    }

    public SimpleQueryResponse simpleQuery(String query) {
        return this.chatClient.prompt(query).call().entity(SimpleQueryResponse.class);
    }
}

Because we have OpenAI configured in our application.yaml file, Spring will automatically create a ChatClient.Builder that we can wire into our service and then use it to create a ChatClient. The ChatClient is the main interface for interacting with chat-based models, such as GPT. In this example, we invoke its prompt() method, passing it our String query. The prompt() method also accepts a Prompt object, which you will see in a minute. The prompt() method returns a ChatClientRequestSpec instance that we can use to configure LLM calls. In this example, we simply invoke its call() method to send the message to the LLM. The call() method returns a CallResponseSpec instance. You can use that to get the text response by invoking its content() method, or you can map the response to an entity by invoking its entity() method. I provided examples of both. For the entity mapping, I passed a SimpleQueryResponse, which is a Java record:

package com.infoworld.springaidemo.model;

public record SimpleQueryResponse(String response) {
}
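
The prompt() method can also be called with no arguments, which lets you compose the request fluently. As a minimal sketch, assuming the fluent system() and user() methods on the request spec, here is a method you could add to the SpringAIService above to pair a system message with the user’s query:

    public String queryWithSystemMessage(String query) {
        // A sketch: set a system message alongside the user message,
        // then call the model and return the raw text response.
        return this.chatClient.prompt()
                .system("You are a concise assistant that answers in one sentence.")
                .user(query)
                .call()
                .content();
    }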

Now let’s build a controller so that we can test this out:

package com.infoworld.springaidemo.web;

import com.infoworld.springaidemo.model.SimpleQuery;
import com.infoworld.springaidemo.model.SimpleQueryResponse;
import com.infoworld.springaidemo.service.SpringAIService;

import org.springframework.http.ResponseEntity;
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RequestBody;
import org.springframework.web.bind.annotation.RestController;

@RestController
public class SpringAiController {
    private final SpringAIService springAIService;

    public SpringAiController(SpringAIService springAIService) {
        this.springAIService = springAIService;
    }

    @PostMapping("/simpleQuery")
    public ResponseEntity<SimpleQueryResponse> simpleQuery(@RequestBody SimpleQuery simpleQuery) {
        SimpleQueryResponse response = springAIService.simpleQuery(simpleQuery.query());
        return ResponseEntity.ok(response);
    }

}

This controller wires in the SpringAIService and exposes a PostMapping to /simpleQuery. It accepts a SimpleQuery as its request body, which is another Java record:

package com.infoworld.springaidemo.model;

public record SimpleQuery(String query) {
}

The simpleQuery() method passes the request body’s query parameter to the SpringAIService and then returns a response as a SimpleQueryResponse.

If you build the application with mvn clean install and then run it with mvn spring-boot:run, you can execute a POST request to /simpleQuery and get a response. For example, I posted the following SimpleQuery:

{
    "query": "Give me a one sentence summary of Spring AI"
}

And received the following response:

{
    "response": "Spring AI is a Spring project that offers vendor-neutral, idiomatic abstractions and starters to integrate LLMs and related AI capabilities (chat, embeddings, tools, vector stores) into Java/Spring applications."
}

Now that you know how to configure a Spring application to use Spring AI, send a message to an LLM, and get a response, we can begin to explore prompts more deeply.

Download the Spring AI tutorial source code.

Supporting user input with Spring AI prompt templates

Sending a message to an LLM is a good first step in understanding Spring AI, but it is not very useful for solving business problems. Often, you want to control the prompt while letting the user supply specific parameters, and this is where prompt templates come in. Spring AI supports prompt templates through the PromptTemplate class. You can define prompt templates in-line, as shown in the sketch below, but the convention in Spring AI is to define your templates in the src/main/resources/templates directory using an .st extension.
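
Here is what the in-line approach looks like, as a minimal sketch; it uses the same PromptTemplate API covered below, with a hypothetical summary template:

import java.util.Map;

import org.springframework.ai.chat.prompt.Prompt;
import org.springframework.ai.chat.prompt.PromptTemplate;

public class InlinePromptSketch {

    // Build a Prompt from an in-line template string instead of an .st resource.
    public Prompt summaryPrompt(String topic) {
        PromptTemplate template = new PromptTemplate("Give me a one sentence summary of {topic}");
        return template.create(Map.of("topic", topic));
    }
}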

For our example, we’ll create a prompt template that asks the LLM to tell us a joke, but in this case, we’ll have the user provide the type of joke, such as silly or sarcastic, and the topic. Here is my joke-template.st file:

Tell me a {type} joke about {topic}

We define the template as a String that accepts variables, which in this case are a type and a topic. We can then import this template into our class using a Spring property value. I added the following to the SpringAIService:

@Value("classpath:/templates/joke-template.st")
    private Resource jokeTemplate;

The value references the classpath, which includes the files found in the src/main/resources folder, then specifies the path to the template.

Next, I added a new tellMeAJoke() method to the SpringAIService:

public JokeResponse tellMeAJoke(String type, String topic) {
        Prompt prompt = new PromptTemplate(jokeTemplate)
                .create(Map.of("type", type, "topic", topic));
        return this.chatClient.prompt(prompt).call().entity(JokeResponse.class);
    }

This method accepts a type and a topic and then constructs a new PromptTemplate from the joke-template.st file that we wired in above. To set its values, we pass a map of the values in the PromptTemplate’s create() method, which returns a Prompt for us to use. Finally, we use the ChatClient, but this time we pass the prompt to the prompt() method instead of the raw string, then we map the response to a JokeResponse:

package com.infoworld.springaidemo.model;

public record JokeResponse(String response) {
}

I updated the controller to create a new /tellMeAJoke PostMapping:

@PostMapping("/tellMeAJoke")
    public ResponseEntity<JokeResponse> tellMeAJoke(@RequestBody JokeRequest jokeRequest) {
        JokeResponse response = springAIService.tellMeAJoke(jokeRequest.type(), jokeRequest.topic());
        return ResponseEntity.ok(response);
    }

The request body is a JokeRequest, which is another Java record:

package com.infoworld.springaidemo.model;

public record JokeRequest(String type, String topic) {
}

Now we can POST a JSON body with a type and topic and it will tell us a joke. For example, I sent the following JokeRequest to ask for a silly joke about Java:

    "type": "silly",
    "topic": "Java"
}

And OpenAI returned the following:

{
    "response": "Why do Java developers wear glasses? Because they don't C#."
}

While this is a trivial example, you can use the code here as a scaffold to build robust prompts and accept simple input from users, prompting OpenAI or another LLM to generate meaningful results.

Retrieval augmented generation with Spring AI

The examples we’ve built so far are very much “toy” examples, but they illustrate how to configure an LLM and execute calls to it with Spring AI. Now let’s look at something more useful. Retrieval augmented generation, or RAG, is important in the AI space because it allows us to leverage LLMs to answer questions they were not trained on, such as internal company documents. The process is conceptually very simple, but the implementation details can be confusing if you don’t have a good foundation in what you are doing. This section will build that foundation so you can start using RAG in your Spring AI programs.

To start, let’s say we create a prompt with the following format:

Use the following context to answer the user's question.
If the question cannot be answered from the context, state that clearly.

Context:
{context}

Question:
{question}

We provide the context, which is the information we want the LLM to use to answer the question, along with the question we want the LLM to answer. This is like giving the LLM a cheat sheet: The answer is here, and you just need to extract it to answer the question. The real challenge is how to store and retrieve the context we want the LLM to use. For example, you might have thousands of pages in a knowledge base that contains everything about your product, but you shouldn’t send all that information to the LLM. It would be very expensive to ask the LLM to process that much information. Besides, each LLM has a token limit, so you couldn’t send all of it even if you wanted to. Instead, we introduce the concept of a vector store.

A vector store is a database that contains documents. The interesting thing about these documents is that the vector store uses an embedding algorithm to create a multi-dimensional vector for each one. Then, you can create a similar vector for your question, and the vector store will compute a similarity score comparing your question to the documents in its database. Using this approach, you can take your question, retrieve the top three to five documents that are similar to your question, and use that as the context in the prompt.
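
In Spring AI terms, that workflow boils down to two operations on the VectorStore interface, which you’ll see wired up later in this article. Here is a minimal sketch (the document text and question are placeholders):

import java.util.List;

import org.springframework.ai.document.Document;
import org.springframework.ai.vectorstore.SearchRequest;
import org.springframework.ai.vectorstore.VectorStore;

public class VectorStoreSketch {

    // Index a document, then retrieve the chunks most similar to a question.
    public List<Document> indexAndSearch(VectorStore vectorStore) {
        vectorStore.add(List.of(new Document("Spring AI supports retrieval augmented generation.")));

        SearchRequest request = SearchRequest.builder()
                .query("Does Spring AI support RAG?")
                .topK(3)
                .build();
        return vectorStore.similaritySearch(request);
    }
}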

Here’s a flow diagram summarizing the process of using a vector store:

Flow diagram of managing documents with a vector store in Spring AI.

Steven Haines

First, you gather all your documents, chunk them into smaller units, and add them to the vector store. There are different chunking strategies, but you can chunk the documents into a specific number of words, paragraphs, sentences, and so forth, including overlapping sections so that you don’t lose too much context. The smaller the chunk is, the more specific it is, but the less context it retains. Larger chunks retain more context, but lose a lot of specific knowledge, which makes similarity searches more difficult. Finding the right size for your data chunks is a balancing act and requires experimenting on your own dataset.
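
If you use Spring AI’s ETL support for this step, a token-based splitter can do the chunking. Here is a minimal sketch, assuming the TokenTextSplitter document transformer with its default chunk settings:

import java.util.List;

import org.springframework.ai.document.Document;
import org.springframework.ai.transformer.splitter.TokenTextSplitter;

public class ChunkingSketch {

    // Split raw documents into smaller, token-based chunks before
    // adding them to the vector store.
    public List<Document> chunk(List<Document> rawDocuments) {
        TokenTextSplitter splitter = new TokenTextSplitter();
        return splitter.apply(rawDocuments);
    }
}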

For our example, I took some text from the public Spring AI documentation and stored it in three text files included with the source code for this article. We’ll use this text with Spring AI’s SimpleVectorStore, which is an in-memory vector store that you can use for testing. Spring AI supports production-scale vector stores like Pinecone, Qdrant, Azure AI, PGvector, and more, but using SimpleVectorStore works for this example.

I added the following SpringRagConfig configuration class to the example code developed so far:

package com.infoworld.springaidemo;

import java.io.IOException;
import java.util.List;

import org.springframework.ai.document.Document;
import org.springframework.ai.embedding.EmbeddingModel;
import org.springframework.ai.reader.TextReader;
import org.springframework.ai.vectorstore.SimpleVectorStore;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.core.io.Resource;
import org.springframework.core.io.support.PathMatchingResourcePatternResolver;
import org.springframework.core.io.support.ResourcePatternResolver;

@Configuration
public class SpringRagConfig {

    @Bean
    public SimpleVectorStore simpleVectorStore(EmbeddingModel embeddingModel) throws RuntimeException {
        // Use the builder to create and configure the SimpleVectorStore
        SimpleVectorStore simpleVectorStore = SimpleVectorStore.builder(embeddingModel)
                .build();
        try {
            ResourcePatternResolver resolver = new PathMatchingResourcePatternResolver();
            Resource[] resources = resolver.getResources("classpath*:documents/**/*.txt");
            for(Resource resource : resources) {
                TextReader textReader = new TextReader(resource);
                List<Document> documents = textReader.get();
                simpleVectorStore.add(documents);
            }
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
        return simpleVectorStore;
    }
}

This configuration class defines a Spring bean named simpleVectorStore that accepts an EmbeddingModel, which will automatically be created by Spring when it creates your LLM. It creates a new SimpleVectorStore by invoking the SimpleVectorStore’s static builder() method, passing it the embedding model, and calling its build() method. Then, it scans the classpath for all .txt files in the src/main/resources/documents directory, reads them using Spring’s TextReader, retrieves their content as Document instances by calling the text reader’s get() method, and finally adds them to the SimpleVectorStore.

In a production environment, you can configure the production vector store in your application.yaml file and Spring will create it automatically. For example, if you wanted to configure Pinecone, you would add the following to your application.yaml:

spring:
  ai:
    vectorstore:
      pinecone:
        apiKey: ${PINECONE_API_KEY}
        environment: ${PINECONE_ENV}
        index-name: ${PINECONE_INDEX}
        projectId: ${PINECONE_PROJECT_ID}

The SimpleVectorStore takes a little more configuration, but still keeps our test code simple. To use it, I first created a rag-template.st file:

Use the following context to answer the user's question.
If the question cannot be answered from the context, state that clearly.

Context:
{context}

Question:
{question}

Then I created a new SpringAIRagService:

package com.infoworld.springaidemo.service;

import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

import org.springframework.ai.chat.client.ChatClient;
import org.springframework.ai.chat.prompt.Prompt;
import org.springframework.ai.chat.prompt.PromptTemplate;
import org.springframework.ai.document.Document;
import org.springframework.ai.vectorstore.SearchRequest;
import org.springframework.ai.vectorstore.VectorStore;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.core.io.Resource;
import org.springframework.stereotype.Service;

@Service
public class SpringAIRagService {
    @Value("classpath:/templates/rag-template.st")
    private Resource promptTemplate;
    private final ChatClient chatClient;
    private final VectorStore vectorStore;

    public SpringAIRagService(ChatClient.Builder chatClientBuilder, VectorStore vectorStore) {
        this.chatClient = chatClientBuilder.build();
        this.vectorStore = vectorStore;
    }

    public String query(String question) {
        SearchRequest searchRequest = SearchRequest.builder()
                .query(question)
                .topK(2)
                .build();
        List<Document> similarDocuments = vectorStore.similaritySearch(searchRequest);
        String context = similarDocuments.stream()
                .map(Document::getText)
                .collect(Collectors.joining("\n"));

        Prompt prompt = new PromptTemplate(promptTemplate)
                .create(Map.of("context", context, "question", question));

        return chatClient.prompt(prompt)
                .call()
                .content();
    }
}

The SpringAIRagService wires in a ChatClient.Builder, which we use to build a ChatClient, along with our VectorStore. The query() method accepts a question and uses the VectorStore to build the context. First, we need to build a SearchRequest, which we do by:

  • Invoking its static builder() method.
  • Passing the question as the query.
  • Using the topK() method to specify how many documents we want to retrieve from the vector store.
  • Calling its build() method.

In this case, we want to retrieve the top two documents that are most similar to the question. In practice, you’ll use something larger, such as the top three or top five, but since we only have three documents, I limited it to two.

Next, we invoke the vector store’s similaritySearch() method, passing it our SearchRequest. The similaritySearch() method will use the vector store’s embedding model to create a multidimensional vector of the question. It will then compare that vector to each document and return the documents that are most similar to the question. We stream over all similar documents, get their text, and build a context String.

Next, we create our prompt, which tells the LLM to answer the question using the context. Note that it is important to tell the LLM to use the context to answer the question and, if it cannot, to state that it cannot answer the question from the context. If we don’t provide these instructions, the LLM will use the data it was trained on to answer the question, which means it will use information not in the context we’ve provided.

Finally, we build the prompt, setting its context and question, and invoke the ChatClient. I added a SpringAIRagController to handle POST requests and pass them to the SpringAIRagService:

package com.infoworld.springaidemo.web;

import com.infoworld.springaidemo.model.SpringAIQuestionRequest;
import com.infoworld.springaidemo.model.SpringAIQuestionResponse;
import com.infoworld.springaidemo.service.SpringAIRagService;

import org.springframework.http.ResponseEntity;
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RequestBody;
import org.springframework.web.bind.annotation.RestController;

@RestController
public class SpringAIRagController {
    private final SpringAIRagService springAIRagService;

    public SpringAIRagController(SpringAIRagService springAIRagService) {
        this.springAIRagService = springAIRagService;
    }

    @PostMapping("/springAIQuestion")
    public ResponseEntity<SpringAIQuestionResponse> askAIQuestion(@RequestBody SpringAIQuestionRequest questionRequest) {
        String answer = springAIRagService.query(questionRequest.question());
        return ResponseEntity.ok(new SpringAIQuestionResponse(answer));
    }
}

The askAIQuestion() method accepts a SpringAIQuestionRequest, which is a Java record:

package com.infoworld.springaidemo.model;

public record SpringAIQuestionRequest(String question) {
}

The controller returns a SpringAIQuestionResponse, which is also a Java record:

package com.infoworld.springaidemo.model;

public record SpringAIQuestionResponse(String answer) {
}

Now restart your application and execute a POST to /springAIQuestion. In my case, I sent the following request body:

{
    "question": "Does Spring AI support RAG?"
}

And received the following response:

{
    "answer": "Yes. Spring AI explicitly supports Retrieval Augmented Generation (RAG), including chat memory, integrations with major vector stores, a portable vector store API with metadata filtering, and a document injection ETL framework to build RAG pipelines."
}

As you can see, the LLM used the context of the documents we loaded into the vector store to answer the question. We can further test whether it is following our directions by asking a question that is not in our context:

{
    "question": "Who created Java?"
}

Here is the LLM’s response:

{
    "answer": "The provided context does not include information about who created Java."
}

This is an important validation that the LLM is only using the provided context to answer the question and not using its training data or, worse, trying to make up an answer.

Conclusion

This article introduced you to using Spring AI to incorporate large language model capabilities into Spring-based applications. You can configure LLMs and other AI technologies using Spring’s standard application.yaml file, then wire them into Spring components. Spring AI provides an abstraction to interact with LLMs, so you don’t need to use LLM-specific SDKs. For experienced Spring developers, this entire process is similar to how Spring Data abstracts database interactions using Spring Data interfaces.

In this example, you saw how to configure and use a large language model in a Spring MVC application. We configured OpenAI to answer simple questions, introduced prompt templates to externalize LLM prompts, and concluded by using a vector store to implement a simple RAG service in our example application.

Spring AI has a robust set of capabilities, and we’ve only scratched the surface of what you can do with it. I hope the examples in this article provide enough foundational knowledge to help you start building AI applications using Spring. Once you are comfortable with configuring and accessing large language models in your applications, you can dive into more advanced AI programming, such as building AI agents to improve your business processes.

Read next: The hidden skills behind the AI engineer.

(image/jpeg; 0.45 MB)

The first building blocks of an agentic Windows OS 4 Dec 2025, 9:00 am

One concern many users have about AI is that their data often leaves their PC and their network, with inferencing happening in the cloud. They have big questions about data protection. That’s one of the main drivers for Microsoft’s Copilot+ PCs; the neural processing units built into the latest CPU systems-on-a-chip run inferencing locally using small language models (SLMs) and other optimized machine-learning tools.

Uptake has not been as fast as expected, with delays to key development frameworks preventing users from seeing the benefits of local AI acceleration. However, in 2025 Microsoft has slowly taken its foot off the brake, rolling out more capabilities as part of its Windows App SDK and the related Windows ML framework. As part of that acceleration, tools like Foundry Local have provided both an easy way to access local AI APIs and a way to test and examine SLM prompting.

At Ignite 2025, Microsoft announced further development of the Windows AI platform as part of its intention to deliver a local agentic AI experience. This includes a preview of support for native Model Context Protocol (MCP) servers, along with agents that work with the Windows file system and its settings. These support a private preview of a separate Agent Workspace, which uses a virtual desktop to host and run agents and applications without getting in the way of day-to-day tasks.

Microsoft sees the future of Windows as an “agentic OS” that can respond to user requests in a more flexible way, working with local and remote resources to orchestrate its own workflows on demand. Using agents on Windows, the local Copilot will be able to link applications in response to your requests.

Adding MCP support to Windows is a key building block for the future of Windows. Microsoft is giving us a feel for how it will deliver security and trustworthiness for the next generation of on-device AI.

Using MCP inside Windows

The Model Context Protocol is a standard API format that gives agents access to data and functions from applications. If you’ve used the GitHub Copilot Agent in Visual Studio Code, you’ve seen how it allows access to tools that expose your Azure cloud resources as well as service best practices. However, it requires you to find and install MCP server endpoints yourself.

That’s fine for developers who are already used to finding resources and adding them to their toolchains as needed. However, for consumers, even power users, such an approach is a non-starter. They expect Windows to keep track of the tools and services they use and manage them. An MCP server for a local agent running in Windows needs to install like any other application, with Windows managing access and security.

Microsoft is adding an MCP registry to Windows, which adds security wrappers and provides discovery tools for use by local agents. An associated proxy manages connectivity for both local and remote servers, with authentication, audit, and authorization. Enterprises will be able to use these tools to control access to MCP, using group policies and default settings to give connectors their own identities.

Registering an MCP server is handled by installing it via an MSIX package, with the MCP server using the standard bundle format. Bundles are built using an npm package, so you need Node.js installed on your development system; you then download and install the MCP bundle (mcpb) package, initialize your bundle, and build it, targeting your MCP server code. The bundle can then be included in your application’s installer and wrapped as an MSIX file.

You can manually install MCP bundles, but using a Windows installer and MSIX makes sure that the server is registered and will run in a constrained agent session. This limits access to system resources, reducing the risks of complex prompt injection attacks. Servers need to be binaries with a valid manifest before they can be registered. They are included as a com.microsoft.windows.ai.mcpserver extension in the MSIX package manifest, which registers the server and removes it when the host application is uninstalled.

As they run in a separate session, you need to give explicit permission for file access, and they are blocked from access to the registry and from seeing what you are currently using. That doesn’t stop them from running code in their own session or from accessing the internet. Access to user files is managed by the app that hosts the MCP server, and if access is granted to one server, all the other servers that run under the same host automatically get access. The requested capabilities need to be listed in the app manifest, used by the system to prompt for access.

The link between Windows agents and MCP servers

MCP servers are only part of the Windows agent platform. They need hosts, which provide the link between your agents and registered MCP servers. Microsoft provides a sample JavaScript application to show how to build and use a host, parsing the JSON provided by a server and then connecting. You can then list its available tools and call them. The sample code can be adapted to other languages relatively easily, allowing an agent orchestration framework like Semantic Kernel to work with local MCP servers.

MCP servers provide a bridge between AI applications and other services, in many cases offering connectors that can be used for AI models to query the service. As part of its initial set of Windows agent tools, Microsoft is delivering an MCP-based connector for the Windows File Explorer, giving agents the same access to the Windows file system as users. Both users and system administrators can block access to files or specific project directories.

The connector provides agents with a set of file tools, which include basic access, modification, and file and directory creation capabilities. As there’s no specific file deletion capability, agents can use the connector to write new files and move existing ones, as well as to edit text content. These are classed as destructive operations as they change the underlying Windows file system.

Be careful when giving agents access to the Windows file system; use base prompts that reduce the risks associated with file system access. When building out your first agent, it’s worth limiting the connector to search (taking advantage of the semantic capabilities of Windows’ built-in Phi small language model) and reading text data.

This does mean you’ll need to provide your own guardrails for agent code running on PCs, for example, forcing read-only operations and locking down access as much as possible. Microsoft’s planned move to a least-privilege model for Windows users could help here, ensuring that agents have as few rights as possible and no avenue for privilege escalation.

Along with tools for building and running MCP servers in Windows, Microsoft provides a command-line tool for working with its agent registry. This will allow you to test that your own servers have been installed. The tool will also list any third-party servers that may have been registered by applications running on your PC. It’s a good idea to use this regularly to check for new servers that may have been installed by software updates.

The road to an agentic OS

Building an agentic OS is hard, as the underlying technologies work very differently from standard Windows applications. Microsoft is doing a lot to provide appropriate protections, building on its experience in delivering multitenancy in the cloud. Microsoft’s vision for an agentic OS appears to be one where each agent and its associated servers are treated as a tenant on your PC, where it operates in a restricted, locked-down environment to reduce the risk of interactions with your applications and data.

We’ve seen this before, where services like Windows log-on are kept in their own virtual machines using the Krypton hypervisor. Virtualization-based security is a key part of Windows 11, so it’s no surprise that this model is at the heart of delivering autonomous agents as part of Windows. As I noted in an earlier look at Microsoft’s agent visions, one of the showstoppers for the first generation of agent technologies was that they required running arbitrary code on remote computers. Redmond has clearly learned from the lessons of Kaleida and General Magic and is sandboxing its agent support from the very start.

It is still early, but it’s promising to see tools to help build complex agentic applications that can use a mix of local and remote resources to handle many different tasks, without leaving a secure sandbox. If Microsoft can deliver and developers can take advantage, the results could be very interesting.

(image/jpeg; 9.17 MB)

A proactive defense against npm supply chain attacks 4 Dec 2025, 9:00 am

Open-source software has become the backbone of modern development, but with that dependency comes a widening attack surface. The npm ecosystem in particular has been a high-value target for adversaries who know that one compromised package can cascade downstream into thousands of applications.

The Shai Hulud worm, embedded in npm packages earlier this year, was a stark reminder that attackers don’t just exploit vulnerabilities, they weaponize trust in open ecosystems. For developers and security engineers, this isn’t a once-in-a-while problem. It’s a 24x7x365 risk.

Breaking down the attack vector

Malicious npm packages spread by exploiting developer trust and automation. Attackers inject harmful payloads into libraries that appear legitimate, sometimes even hijacking widely used packages via stolen maintainer credentials.

The Stairwell research team has observed common attacker behaviors, including:

  • Obfuscation with Buffer.from() and Base64 to conceal malicious payloads.
  • Exfiltration hooks to steal environment variables, API keys, or npm tokens.
  • Persistence techniques that run automatically during install (preinstall/postinstall scripts).

Once installed, these dependencies can exfiltrate credentials, establish persistence, or spread laterally across development environments.

Using YARA for detection

Originally designed for malware research, YARA has become a flexible pattern-matching tool for identifying malicious files or code fragments. When applied to the software supply chain, YARA rules can:

  • Flag suspicious or obfuscated JavaScript within npm dependencies.
  • Detect anomalous patterns like hidden credential stealers or worm propagation code.
  • Surface malware families across repos by reusing detection logic.

For example, Stairwell published a YARA rule targeting DarkCloud Stealer, which scans for tell-tale signs of data-stealing malware embedded in npm packages. Another simple detection might look for suspiciously encoded Buffer.from() payloads, which often mask malicious code.

Below is a YARA rule we put together for the chalk/debug supply chain attack.

Stairwell YARA rule

Stairwell

Integrating YARA into developer workflows

The real value comes from moving YARA out of the lab and into the pipeline. Instead of running YARA manually after an incident, it’s better to embed it directly in your CI/CD or dependency monitoring process.

Practical steps include:

  • Pre-merge scanning: Automate YARA checks on every pull request or package update.
  • Pipeline enforcement: Block builds that import dependencies matching malicious rules.
  • Rule sharing: Distribute your rule library across teams to reduce duplicated effort.
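
As one illustration of the pre-merge step above, the scan can run as a CI job. The sketch below assumes a GitHub Actions workflow, a placeholder rules file named rules/npm-supplychain.yar, and the standard yara command-line scanner, which prints the name of each matching rule:

name: yara-dependency-scan
on: pull_request

jobs:
  scan:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Install dependencies
        run: npm ci
      - name: Install YARA
        run: sudo apt-get update && sudo apt-get install -y yara
      - name: Scan node_modules against YARA rules
        run: |
          # yara prints "rule_name path" for each match; fail the build on any hit
          matches=$(yara -r rules/npm-supplychain.yar node_modules)
          if [ -n "$matches" ]; then
            echo "$matches"
            echo "Suspicious dependency detected" >&2
            exit 1
          fi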

Stairwell’s approach demonstrates how this can be done at scale, turning YARA into a frontline defense mechanism rather than just a forensic tool.

Around-the-calendar protection

Supply chain attacks don’t follow a calendar, but attackers do take advantage of high-stakes moments. The holiday shopping season is a prime example: retailers, e-commerce platforms, and SaaS providers can’t afford downtime or breaches during peak traffic.

A poisoned npm dependency at the wrong time could mean checkout failures or outages, stolen customer data or credentials, or reputational damage amplified by seasonal visibility. In short, when uptime is most critical, attackers know disruption is most costly.

Actionable guidance for engineers

To build resilience against npm supply chain attacks, security-minded developers should consider these four steps:

  1. Maintain an internal YARA rule library focused on package behaviors.
  2. Automate execution within CI/CD and dependency monitoring.
  3. Continuously update rules based on fresh attack patterns observed in the wild.
  4. Contribute back to the community, strengthening the broader open-source ecosystem.

The bottom line

Fully securing the supply chain is impossible, so organizations should balance their investments. Many supply chain security tools deliver a false sense of security with claims of preventing supply chain attacks. What enterprises really need are better capabilities to understand whether a threat is already inside their environment. Prevention is better than cure, but breaches still happen, and teams that are prepared with tools to continuously evaluate their environment can respond to a breach far faster.

The reality is that supply chain risk is unavoidable, but it’s not unmanageable. By embedding YARA into developer workflows, teams can move from reactive cleanup to proactive prevention, reducing the chance that the next compromised package ever makes it into production.

New Tech Forum provides a venue for technology leaders—including vendors and other outside contributors—to explore and discuss emerging enterprise technology in unprecedented depth and breadth. The selection is subjective, based on our pick of the technologies we believe to be important and of greatest interest to InfoWorld readers. InfoWorld does not accept marketing collateral for publication and reserves the right to edit all contributed content. Send all inquiries to doug_dineley@foundryco.com.

(image/jpeg; 0.69 MB)

Microsoft steers native port of TypeScript to early 2026 release 4 Dec 2025, 12:48 am

Microsoft’s planned TypeScript 7.0 release, an effort to improve performance, memory usage, and parallelism by porting the TypeScript language service and compiler to native code, has made significant progress, Microsoft reports. A planned TypeScript 6.0 release, meanwhile, will be the last JavaScript-based version of TypeScript, bridging the current TypeScript 5.9 release to TypeScript 7.0.

In a December 2 blog post, Microsoft provided updates on TypeScript 7.0, also known as Project Corsa, a project revealed in March and based on Google’s Go language. While the effort has been a significant undertaking, big strides have been made, said blog post author Daniel Rosenwasser, Microsoft principal product manager for TypeScript. Microsoft is targeting early 2026 for the release of TypeScript 6.0 and TypeScript 7.0. The code is public and available at the TypeScript-go GitHub repository.

For the language service, most of the features that make up the existing editing experience are implemented and working well in TypeScript 7.0, though some features are still being ported, Rosenwasser said. Parts of the language service have been rearchitected to improve reliability while also leveraging shared-memory parallelism. The latest preview of the language service, for Visual Studio Code, can be accessed from the Visual Studio Code Marketplace.

The native port of the TypeScript compiler also has made significant progress, with TypeScript 7.0 type checking nearly complete. A frequent question is whether it is “safe” to use TypeScript 7.0 to validate a build, Rosenwasser said, or in other words, does the TypeScript 7.0 compiler reliably find the same errors that TypeScript 5.9 does? The answer is yes, he said. For context, there have been around 20,000 compiler test cases, of which about 6,000 produce at least one error in TypeScript 6.0. In all but 74 cases, TypeScript 7.0 also produces at least one error. Developers can confidently use TypeScript 7.0 today to type-check a project for errors, Rosenwasser said. Beyond single-pass/single-project type checking, the command-line compiler also has reached major parity. Features such as --incremental, project reference support, and --build mode are all ported over and working.

TypeScript 7.0 will remove behaviors and flags planned for deprecation in TypeScript 6.0. A list of upcoming deprecations in TypeScript 6.0 can be seen in the issue tracker. For emit, --watch, and API capabilities, the JavaScript pipeline is not entirely complete. For developers who do not need JavaScript emit from TypeScript, running tsgo for a build will work fine, Rosenwasser said. Also, TypeScript 7.0 (Corsa) will not support the existing Strada API. The Corsa API is still a work in progress.

With TypeScript 6.0, there is no intention to produce a TypeScript 6.1 release, although there may be patch releases for TypeScript 6. “You can think of TypeScript 6.0 as a ‘bridge’ release between the TypeScript 5.9 line and 7.0,” Rosenwasser said. “6.0 will deprecate features to align with 7.0, and will be highly compatible in terms of type-checking behavior.” The intent is to ensure that TypeScript 6.0 and TypeScript 7.0 are as compatible as possible.

(image/jpeg; 1.97 MB)

Developers urged to immediately upgrade React, Next.js 4 Dec 2025, 12:21 am

Developers using the React 19 library for building application interfaces are urged to immediately upgrade to the latest version because of a critical vulnerability that can be easily exploited by an attacker to remotely run their own code.

Researchers at Wiz said Wednesday that a vulnerability in the React Server Components (RSC) Flight protocol affects the React 19 ecosystem, as well as frameworks that implement it. In particular, that means Next.js, a popular full stack development framework built on top of React, which received a separate CVE. 

RSC Flight protocol powers communication between the client and server for React Server Components, sending serialized component trees over the wire from the server to the client.

“The vulnerability exists in the default configuration of affected applications, meaning standard deployments are immediately at risk,” says the warning. “Due to the high severity and the ease of exploitation, immediate patching is required.”

“Our exploitation tests show that a standard Next.js application created via create-next-app and built for production is vulnerable without any specific code modifications by the developer,” Wiz also warns.

The problem in React’s server package, designated CVE-2025-55182, is a logical deserialization vulnerability that allows the server to process RSC payloads in an unsafe way. When a server receives a specially crafted, malformed payload, say Wiz researchers, it fails to validate the structure correctly. This allows attacker-controlled data to influence server-side execution logic, resulting in the execution of privileged JavaScript code.

“In simple terms,” Wiz said in response to questions, “the server takes input from a user, trusts it too much, and processes it into code-like objects which attackers can exploit to run commands or leak sensitive information.”

Affected are React versions 19.0.0, 19.1.0, 19.1.1, and 19.2.0. The fix is to upgrade to the latest version of React.

While the vulnerability affects all development frameworks using vulnerable versions of React, the problem in Next.js is specifically identified as CVE-2025-66478.

Affected are Next.js 15.x and 16.x using the App Router. Again, the fix is to upgrade to the latest version of Next.js.

React’s blog provides detailed upgrade instructions for both React and Next.js.

‘Serious vulnerability’

“The configuration needed for these vulnerabilities to function is extremely common,” Wiz said in response to questions, “and disabling the functionality needed to block them is very rare. In fact, we failed to find any such case.”

Wiz says 39% of cloud environments are currently using Next.js and other web frameworks based on React. 

Johannes Ullrich, dean of research at the SANS Institute, told InfoWorld that RSC is widely used, particularly when the Next.js framework, which implements RSC by default, is employed.

“This is a very serious vulnerability,” he said in an email. “I expect public exploits to surface within a day or so, and applications must be patched quickly. Some web application firewall vendors, such as Cloudflare, have already implemented rules to protect applications from potential exploits. But even web applications protected by these systems should be patched, in case attackers find ways to bypass these protection mechanisms.”

To exploit the React vulnerability, all a threat actor would need to do is send a specially crafted HTTP request to the server endpoint. For security reasons, Wiz researchers didn’t detail how this could be done. But, they said, in similar vulnerabilities, attackers leverage remote code execution on servers to download and execute sophisticated trojans on the server, usually a known C2 framework like sliver, but in some cases, a more custom payload. “The main point,” the researchers said, “is that with an RCE like this, an attacker can practically do anything.”

CISOs and developers need to treat these two vulnerabilities as “more than critical,” said Tanya Janca, a Canadian-based secure coding trainer. In fact, she said in an email, they should be treated in the same way that infosec pros treated the Log4j vulnerability, and scour all applications. “There could not be a more serious security flaw in a web application than this,” she said, “even if it is not known to be exploited in the wild yet.”

Advice for CSOs, developers

Janca said developers should:

  • make a list of all apps using React or Next.js;
  • check if they use any of the known vulnerable versions: React 19.0 / 19.1.0 / 19.1.1 / 19.2.0, and Next.js 14.3.0-canary.77 and later canary releases, 15.x/16.x;
  • if so, upgrade to a safe version:
    • React: 19.0.1, 19.1.2, 19.2.1 or better;
    • Next.js: 15.0.5, 15.1.9, 15.2.6, 15.3.6, 15.4.8, 15.5.7, 16.0.7 or later; if on Next.js 14.3.0-canary.77 or a later canary release, downgrade to the latest stable 14.x release;
  • scan with a software composition analysis tool to see if the vulnerable versions are used in unexpected places;
  • if, for some reason, they can’t be upgraded, assume those apps are unsafe and turn them off if possible. If they can’t be disabled, treat them like a bomb went off and put a network firewall around them, monitor them and work with the security team on it;
  • infosec pros should read app logs and look for strange behavior;
  • keep the security team informed.

Most importantly, she said, treat this as an emergency.

(image/webp; 0.04 MB)

Mistral targets lightweight processors with its biggest open model yet 3 Dec 2025, 5:20 pm

Mistral AI’s latest batch of LLMs, officially released Tuesday, includes Mistral Large 3, a 675-billion-parameter model.

It’s the company’s first mixture-of-experts model since the Mixtral series released at the end of 2023, and already ranks among the top open-source offerings on the LMArena leaderboard.

While the Mistral Large 3 model needs many high-powered processors to run, the nine smaller Ministral variants, ranging from 3 billion to 14 billion parameters, are designed to run on a single GPU.

All the new models support image understanding and more than 40 languages, the company said in its announcement.

Edge deployment targeted specific use cases

With the smaller Ministral models, Mistral aims to address cost concerns and a need for on-premises deployment, where companies often cannot afford large numbers of high-end processors. The Ministral models “match or exceed the performance of comparable models while often producing an order of magnitude fewer tokens,” the company said, potentially reducing token generation by 90% in some cases, which translates to lower infrastructure costs in high-volume applications.

Mistral engineered the smaller models to run on a single GPU, enabling deployment in manufacturing facilities with intermittent connectivity, robotics applications requiring low-latency inference, or healthcare environments where patient data can’t leave controlled networks.

In such environments, enterprises may lean towards open models like those from Mistral over proprietary models running on centralized infrastructure such as those from OpenAI or Anthropic, said Sushovan Mukhopadhyay, director analyst at Gartner. “Open-weight models appeal where customization and privacy matter, supported by on-prem deployments with self-service environments which is ideal for cost-effective, high-volume tasks where data is private and the enterprise assumes full liability for outputs,” he said.

Internal applications processing proprietary data — document analysis, code generation, workflow automation — represented the strongest fit for open-weight models. “Proprietary APIs remain attractive for external-facing apps due to provider-backed liability, audited access, and Intellectual Property indemnification via frontier model gateways which is important for managing enterprise risk,” Mukhopadhyay added.

Budget shift changed priorities

Mistral 3 arrives as enterprises are rethinking AI procurement priorities. Data from Andreessen Horowitz showed AI spending from innovation budgets dropped from 25% to 7% between 2024 and 2025, with enterprises instead funding through centralized IT budgets. Those changes shifted procurement criteria from performance and speed to cost predictability, regulatory compliance, and vendor independence.

The shift has added complexity beyond simple cost calculations. “Cost and performance appear to be primary drivers, but they’re never the only considerations as organizations move from pilot to production and scale,” said Mukhopadhyay. “Liability protection, IP indemnification, and licensing agreements become critical alongside these factors.”

The trade-offs have become more nuanced. “Open-weight models may seem cost-effective and customizable, but many are not truly ‘open’. Commercial interests often override openness through license restrictions,” he said. “Proprietary APIs, though premium, provide provider-backed liability and IP indemnification for customer-facing apps, but not all such solutions can run in fully on-prem or air-gapped environments.”

European positioning addressed sovereignty

Beyond technical capabilities, Mistral’s corporate positioning as a European alternative carried strategic weight for some enterprises navigating regulatory compliance and data residency requirements.

EU regulatory frameworks — GDPR requirements and European Union AI Act provisions taking effect in 2025 — have complicated adoption of US-based AI services. For organizations facing data residency mandates, Mistral’s European headquarters and permissive open-source licensing addressed compliance concerns that proprietary US providers couldn’t easily resolve.

Mistral’s reported $14 billion valuation in a funding round that was nearing completion in September 2025, alongside partnerships with Microsoft and Nvidia, signaled the company has resources and backing to serve as a viable long-term alternative. Enterprise customers including Stellantis and CMA CGM have moved deployments from pilots to company-wide rollouts.

The company makes its models available through Mistral AI Studio, Amazon Bedrock, Azure Foundry, Hugging Face, and IBM WatsonX.

(image/jpeg; 0.91 MB)

AWS introduces powers for AI-powered Kiro IDE 3 Dec 2025, 3:30 pm

AWS has released Kiro powers, an addition to the company’s Kiro AI-driven IDE that provides dynamic loading of context and Model Context Protocol (MCP) servers, with the intent of providing a unified approach to a broad range of development use cases.

Announced December 3, Kiro powers enable developers to access specialized expertise to accelerate software development. The new capability lets developers add specialized domain expertise to their Kiro agent in a single click, according to AWS. A Kiro power can combine MCP servers for specialized tool access, steering files with best practices, and hooks that trigger specific actions, helping developers customize an agent for particular workflows. These workflows range from UI development and back-end development to API development, AI agent development, code deployment, and observability.

A power is a bundle that includes:

  • POWER.md: The entry point steering file—an onboarding manual that tells the agent what MCP tools it has available and when to use them.
  • MCP server configuration: The tools and connection details for the MCP server.
  • Additional hooks or steering files: Things for an agent to run on IDE events or via slash commands.

Kiro powers are designed for easy discovery and installation, whether the developer is using curated partners, community-built powers, or a team’s private tools, AWS said. Discovery, configuration, and installation happen through the IDE or the kiro.dev website. Kiro powers provide a unified approach to applying AI to software development tasks that offers MCP compatibility, dynamic loading, and packaged expertise in one system, AWS said.

While Kiro powers now work exclusively in the Kiro IDE, plans call for having them work across any AI development tool, such as Kiro CLI, Cline, Cursor, Claude Code, and beyond. Kiro powers are built by leaders in their fields including Datadog, Dynatrace, Neon, Netlify, Postman, Supabase, and AWS (Strands Agents, Amazon Aurora). There will be more to come from both software vendors and open-source communities, AWS said.

(image/jpeg; 3.57 MB)

AWS offers new service to make AI models better at work 3 Dec 2025, 2:34 pm

Enterprises are no longer asking whether they should adopt AI; rather, they want to know why the AI they have already deployed still can’t reason as their business requires it to.

Those AI systems are often missing an enterprise’s specific business context, because they are trained on generic, public data, and it’s expensive and time-consuming to fine-tune or retrain them on proprietary data, if that’s even possible.

Microsoft’s approach, unveiled at Ignite last month, is to wrap AI applications and agents with business context and semantic intelligence in its Fabric IQ and Work IQ offerings.

AWS is taking a different route, inviting enterprises to build their business context directly into the models that will run their applications and agents, as its CEO Matt Garman explained in his opening keynote at the company’s re:Invent show this week.

Third-party models don’t have access to proprietary data, he said, and building models with that data from scratch is impractical, while adding it to an existing model through retrieval augmented generation (RAG), vector search, or fine-tuning has limitations.

But, he asked, “What if you could integrate your data at the right time during the training of a frontier model and then create a proprietary model that was just for you?”

AWS’s answer to that is Nova Forge, a new service that enterprises can use to customize a foundation large language model (LLM) to their business context by blending their proprietary business data with AWS-curated training data. That way, the model can internalize their business logic rather than having to reference it externally again and again for inferencing.

Analysts agreed with Garman’s assessment of the limitations in existing methods that Nova Forge aims to circumvent.

“Prompt engineering, RAG, and even standard supervised fine-tuning are powerful, but they sit on top of a fully trained model and are inherently constrained. Enterprises come up against context windows, latency, orchestration complexity. It’s a lot of work, and prone to error, to continuously ‘bolt on’ domain expertise,” said Stephanie Walter, practice leader of AI stack at HyperFRAME Research.

In contrast, said ISG’s executive director of software research, David Menninger, Nova Forge’s approach can simplify things: “If the LLM can be modified to incorporate the relevant information, it makes the inference process much easier to manage and maintain.”

Who owns what

HFS Research’s associate practice leader, Akshat Tyagi, broke down the two companies’ strategies: “Microsoft wants to own the AI experience. AWS wants to own the AI factory. Microsoft is packaging intelligence inside its ecosystem. AWS is handing you the tools to create your own intelligence and run it privately,” he said.

While Microsoft’s IQ message essentially argues that enterprises don’t need sprawling frontier models and can work with compact, business-aware models that stay securely within their tenant and boost productivity, AWS is effectively asking enterprises not to settle for tweaking an existing model but use its tools to create a near–frontier-grade model tailored to their business, Tyagi said.

The subtext is clear, he said: AWS knows it’s unlikely to dominate the assistant or productivity layer, so it’s doubling down on its core strengths of deep infrastructure, while Microsoft is playing the opposite game.

Nova Forge is a clear infrastructure play, Walter said. “It gives AWS a way to drive Trainium, Bedrock, and SageMaker as a unified frontier-model platform while offering enterprises a less expensive path than bespoke AI labs.”

The approach AWS is taking with Nova Forge will resonate with enterprises working on use cases that require precision and nuance, including drug discovery, healthcare, industrial control, highly regulated financial workflows, and enterprise-wide code assistants, she said.

Custom LLM training costs

In his keynote, Garman said that Nova Forge eliminates the prohibitive cost, time, and engineering drag of designing and training an LLM from scratch — the same barrier that has stopped most enterprises, and even rivals such as Microsoft, from attempting to provide a solution at this layer.

It does so by offering a pre-trained model and various training checkpoints, or snapshots of the model, to jump-start custom model building. That spares enterprises from pre-training a model from scratch or repeatedly retraining it for context, which AWS argues is a billion-dollar affair.

By choosing whether they want to start from a checkpoint in early pre-training, mid-training, or post‑training, said Robert Kramer, principal analyst at Moor Insights & Strategy, “enterprises choose how deeply they want their domain to shape the model.”

AWS plans to offer the service through a subscription model rather than an open-ended compute consumption model. It didn’t disclose the price publicly, referring customers to an online dashboard, but CNBC reported that Nova Forge’s price starts at $100,000 per year.

Enterprises can start building a custom model via the new service on SageMaker Studio and later export it to Bedrock for consumption, AWS said. Nova Forge’s availability is currently limited to the US East region in Northern Virginia.

This article first appeared on CIO.

(image/jpeg; 11.73 MB)

A first look at Google’s new Antigravity IDE 3 Dec 2025, 9:00 am

Once upon a time, IDEs focused on specific languages, like Visual Studio IDE for Microsoft C++ or IntelliJ IDEA for Java. But now there is a new wave of IDEs dedicated to agentic AI workflows. AWS Kiro has recently become generally available, and now Google has whipped the drapes off its own Antigravity IDE.

Like Kiro, Antigravity is built from a fork of Visual Studio Code, which integrates Antigravity’s behavior with VS Code’s in ways that presumably wouldn’t be possible with just an extension. If you’ve used VS Code before, getting started with Antigravity is easy enough. But like Kiro, Antigravity’s workflow revolves around interactions with AI agents, which requires some adjustment.

Setting up a project

When you open Antigravity and start a new conversation with one of its agents, you can choose one of two interaction modes.

  • Planning mode is more deliberate and generates artifacts of the agent’s thinking process—walkthroughs, task lists, and so on. This mode gives you plenty of opportunity to intervene at each step and decide if a given operation needs modification.
  • Fast mode executes commands directly, so it’s more useful for quick actions that aren’t likely to have major repercussions.

Use planning mode for projects where you want more oversight and feedback, and use fast mode for quick-and-dirty, one-and-done experiments. You can also select how much you want to review at each step: never, only when the agent thinks it’s a good idea, or always.

Antigravity comes pre-equipped with several agent models. The default, and the one I used for my review, is Gemini 3 Pro (high). The “low” version of Gemini 3 Pro is also available, along with Claude Sonnet 4.5 (both the regular and “thinking” variety), and GPT-OSS 120B Medium. As of this review, the only cost plan available for the models is a no-cost, individual-account public preview, with fixed rate limits refreshed every five hours. Paid tiers and bring-your-own-service plans are not yet supported.

Working with the agent

My first project in planning mode was a simple Python-based utility for taking a Markdown file and generating a Microsoft Word (.docx) file from it. The first set of commands Antigravity generated did not take advantage of the Python virtual environment already in the project directory, which meant the needed libraries would have been installed in the wrong place. But after I advised the agent, it used the virtual environment correctly for all future Python actions.

Antigravity task implementation plan.

The implementation plan created by an Antigravity task prompt, shown at top left. Planning mode allows the developer to vet and comment on each plan document and the description of steps.

Foundry

Once the agent created a basic version of the project, including a task list, walkthrough, and implementation plan, I requested some modifications. One was the ability to provide font styling information for the generated Word file, by way of a JSON file. Another was allowing inline images to also be saved in the generated Word document, and either linked from an external file or embedded. The agent made that last feature work by generating a custom XML fragment to be inserted into the document, since the Office XML library used for the project didn’t support that as an option.

A sample of code generated by the Antigravity IDE.

An example of code generated by Antigravity, with sample input and output files shown on the left side of the explorer screen.

Foundry

Whenever you give instructions to the agent, it works up a few different planning documents. The task list describes each high-level goal and whether it’s been completed. The implementation plan goes into verbose detail about how the agent intends to accomplish the current task. The walkthrough provides an aggregate summary of each set of changes made. For each of these, you can provide inline comments as feedback to the agent, much as you would in a Word document, and modify the plan granularly as you go forward.

Antigravity implementation plan

The project implementation plan generated by Antigravity. The developer can provide inline commentary, which the agent will evaluate and use to shape future revisions.

Foundry

All earlier states of those files are preserved along with your agent conversation history. Antigravity also tracks (where it deems relevant) persistent patterns and insights across conversations in what are called Knowledge Items.

One much touted feature for Antigravity’s agent integration is the ability to generate mockups and graphics via Google’s Nano Banana image-generation service. To test this, I asked the agent to generate a mockup for a web UI front end for my application. The failure mode for this turned out to be as interesting as the service itself: When multiple attempts to generate the image failed due to the server being overloaded, the agent fell back to generating the mockup as an actual web page. In some ways that was better, as it allowed me to more readily use the HTML version.

Agent-driven browser features

Since Antigravity is a Google project, it naturally provides integration with Google Chrome. The agent can be commanded to open instances of Chrome and perform interactive actions (such as opening a web page and extracting text) by way of a manually installed browser plugin. The agent can also, to some extent, work around not having the plugin. For instance, when I didn’t have the Chrome plugin installed and asked for screenshots from a website, the agent worked up an alternate plan to use a Python script and an automation framework to get the job done.

While it’s convenient to tell the agent to operate the browser, as opposed to writing a Python program to drive a browser-automation library like Playwright, the agent doesn’t always give you predictable outcomes. When I tried to extract a list of the most recent movies reviewed on RogerEbert.com from its front page, the agent scrolled down slightly (it even admitted to doing this, but didn’t specify a reason why) and missed a few of the titles at the very top of the page. Writing a script to automate the scraping generated more reproducible results.

Limitations and quirks of using Antigravity

Working with agentic AI is hardly bulletproof, and my experiences with Gemini in Antigravity included a few misfires. At one point the agent mistakenly duplicated an entire section of the code for my project. It caught the mistake, but only by chance while working on an unrelated part of the project.

I also ran into a few quirks specific to the IDE. For instance, if you create an Antigravity project directory and move it somewhere else on the system, some things may break silently, like the retention of Knowledge Items. There is currently no obvious way to fix this problem.

Conclusion

The main selling point for IDEs with agentic AI integration is having one context for all of your work. Instead of stitching together a suite of multiple applications, or even one app with multiple plugins, both Antigravity and its competitor, Kiro, present a unified workspace. And like Kiro, Antigravity uses a prompt-and-spec driven process for iterative development.

The biggest difference between the two IDEs, at this stage, is in the models each one offers. Kiro is limited to Claude Sonnet 4.0 and 4.5, whereas Antigravity offers Sonnet and others (mainly Gemini). Both are still limited to external APIs for their models. Even if you had the hardware to host a model locally, you couldn’t use Antigravity with it—at least not yet.

Antigravity doesn’t have Kiro’s more development-workflow-centric features, like the hooks that can be defined to trigger agent behaviors at certain points (e.g., saving a file). The product is still at an early stage, though. It is likely Google is focusing on the core agentic functions—the behavior of the user feedback loop, for instance—before adding a broader set of developer features and bringing the product to a full-blown initial release.

(image/jpeg; 5.85 MB)

The complete guide to Node.js frameworks 3 Dec 2025, 9:00 am

Node.js is one of the most popular server-side platforms, especially for web applications. It gives you non-blocking JavaScript without a browser, plus an enormous ecosystem. That ecosystem is one of Node’s chief strengths, making it a go-to option for server development.

This article is a quick tour of the most popular web frameworks for server development on Node.js. We’ll look at minimalist tools like Express.js, batteries-included frameworks like Nest.js, and full-stack frameworks like Next.js. You’ll get an overview of the frameworks and a taste of what it’s like to write a simple server application in each one.

Minimalist web frameworks

When it comes to Node web frameworks, minimalist doesn’t mean limited. Instead, these frameworks provide the essential features required to do the job for which they are intended. The frameworks in this list also tend to be highly extensible, so you can customize them as needed. With minimalist frameworks, pluggable extensibility is the name of the game.

Express.js

At over 47 million weekly downloads on npm, Express is one of the most-installed software packages of all time—and for good reason. Express gives you basic web endpoint routing and request-and-response handling inside an extensible framework that is easy to understand. Most other frameworks in this category have adopted Express’s basic style of describing a route. This framework is the obvious choice when you simply need to create some routes for HTTP, and you don’t mind a DIY approach for anything extra.

Despite its simplicity, Express is fully-featured when it comes to things like route parameters and request handling. Here is a simple Express endpoint that returns a dog breed based on an ID:

import express from 'express';

const app = express();
const port = 3000;

// In-memory array of dog breeds
const dogBreeds = [
  "Shih Tzu",
  "Great Pyrenees",
  "Tibetan Mastiff",
  "Australian Shepherd"
];
app.get('/dogs/:id', (req, res) => {
  // Convert the id from a string to an integer
  const id = parseInt(req.params.id, 10);

  // Check if the id is a valid number and within the array bounds
  if (id >= 0 && id < dogBreeds.length) {
    // Return the matching breed as JSON
    res.json({ breed: dogBreeds[id] });
  } else {
    // Out-of-range or non-numeric ids fall through to a 404
    res.status(404).json({ error: 'Dog breed not found' });
  }
});

app.listen(port, () => {
  console.log(`Server running at http://localhost:${port}`);
});

You can easily see how the route is defined here: a string representation of a URL, followed by a function that receives a request and response object. The process of creating the server and listening on a port is simple.

If you are coming from a framework like Next, the biggest thing you might notice about Express is that it lacks a file-system based router. On the other hand, it offers a huge range of middleware plugins to help with essential functions like security.
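For example, here is a minimal sketch of wiring in helmet, a popular third-party security middleware from npm (not part of Express itself), to set security-related headers on every response:

import express from 'express';
import helmet from 'helmet'; // third-party security middleware from npm

const app = express();

// helmet() adds a collection of security-related HTTP headers to every response
app.use(helmet());

app.get('/', (req, res) => {
  res.send('Hello, secured world!');
});

app.listen(3000);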

Koa

Koa was created by the original creators of Express, who took the lessons learned from that project and used them for a fresh take on the JavaScript server. Koa’s focus is providing a minimalist core engine. It uses async/await functions for middleware rather than chaining with next() calls. This can give you a cleaner server, especially when there are many plugins. It also makes the error handling less clunky for middleware.

Koa also differs from Express by exposing a unified context object instead of separate request and response objects, which makes for a somewhat less cluttered API. Here is how Koa manages the same route we created in Express:

router.get('/dogs/:id', (ctx) => {
  const id = parseInt(ctx.params.id, 10);

  if (id >= 0 && id < dogBreeds.length) {
    ctx.body = { breed: dogBreeds[id] };
  } else {
    ctx.status = 404;
    ctx.body = { error: 'Dog breed not found' };
  }
});

The only real difference is the combined context object.

Koa’s middleware mechanism is also worth a look. Here’s a simple logging plugin in Koa:

const logger = async (ctx, next) => {
  await next(); // This passes control to the router
  console.log(`${ctx.method} ${ctx.url} - ${ctx.status}`);
};

// Use the logger middleware for all requests
app.use(logger);	

Fastify

Fastify lets you define schemas for your APIs. This is an up-front, formal mechanism for describing what the server supports:

const schema = {
  params: {
    type: 'object',
    properties: {
      id: { type: 'integer' }
    }
  },
  response: {
    200: {
      type: 'object',
      properties: {
        breed: { type: 'string' }
      }
    },
    404: {
      type: 'object',
      properties: {
        error: { type: 'string' }
      }
    }
  }
};

fastify.get('/dogs/:id', { schema }, (request, reply) => {
  const id = request.params.id;

  // dogBreeds is the same in-memory array used in the Express example
  if (id >= 0 && id < dogBreeds.length) {
    reply.send({ breed: dogBreeds[id] });
  } else {
    reply.code(404).send({ error: 'Dog breed not found' });
  }
});

fastify.listen({ port: 3000 }, (err, address) => {
  if (err) {
    fastify.log.error(err);
    process.exit(1);
  }
  console.log(`Server running at ${address}`);
});

From this example, you can see the actual endpoint definition is similar to Express and Koa, but we define a schema for the API. The schema is not strictly necessary; it is possible to define endpoints without it. In that case, Fastify behaves much like Express, but with superior performance.
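What the schema buys you is that Fastify validates and coerces the request before your handler runs. A quick sketch using Fastify’s built-in inject() helper (normally used for testing, and run inside an async function after the route is registered) illustrates the idea; the exact error text will vary:

// With the schema registered, Fastify coerces and validates params first.
const ok = await fastify.inject({ method: 'GET', url: '/dogs/1' });
console.log(ok.statusCode, ok.json()); // 200 and a breed object

// A non-integer id fails schema validation before the handler is called.
const bad = await fastify.inject({ method: 'GET', url: '/dogs/abc' });
console.log(bad.statusCode); // 400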

Hono

Hono emphasizes simplicity. You can define a server and endpoint with as little as:

import { Hono } from 'hono'

const app = new Hono()
app.get('/', (c) => c.text('Hello, Infoworld!'))

And here’s how our dog breed example looks:

app.get('/dogs/:id', (c) => {
  // Get the id parameter from the request URL
  const id = parseInt(c.req.param('id'), 10);

  // Check if the id is a valid number and within the array bounds
  if (id >= 0 && id < dogBreeds.length) {
    return c.json({ breed: dogBreeds[id] });
  }

  return c.json({ error: 'Dog breed not found' }, 404);
});

As you can see, Hono provides a unified context object, similar to Koa.

Nitro.js

Nitro is the back end for several full-stack frameworks, including Nuxt.js. As part of the UnJS ecosystem, Nitro goes further than Express in providing cloud-native tooling support. It includes a universal storage adapter and deployment support for serverless and cloud deployment targets.
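As a small taste of that tooling, here is a sketch of Nitro’s useStorage() API inside a route handler; the route path, the 'visits' key, and the 'data' mount point are assumptions for illustration (storage mounts are configured in nitro.config):

// routes/visits.get.js (hypothetical route)
export default defineEventHandler(async () => {
  const storage = useStorage('data'); // 'data' is an assumed mount point

  // Read the current counter, defaulting to 0, then persist the increment
  const visits = ((await storage.getItem('visits')) || 0) + 1;
  await storage.setItem('visits', visits);

  return { visits };
});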

Also see: Intro to Nitro: The server engine built for modern JavaScript.

Like Next.js, Nitro uses filesystem-based routing, so our Dog Finder API would exist at the following filepath:

/api/dogs/[id].js

The handler might look like this:

export default defineEventHandler((event) => {
  // Get the dynamic parameter from the event context
  const { id } = getRouterParams(event);
  const parsedId = parseInt(id, 10);

  // Check if the id is a valid number and within the array bounds
  if (parsedId >= 0 && parsedId < dogBreeds.length) {
    return { breed: dogBreeds[parsedId] };
  }

  // createError is provided by Nitro's underlying h3 framework
  throw createError({ statusCode: 404, statusMessage: 'Dog breed not found' });
});

Nitro inhabits the middle ground between a pure tool like Express and a full-blown stack, which is why full-stack front ends often use Nitro on the back end.

Batteries-included frameworks

Although Express and other minimalist frameworks set the standard for simplicity, more opinionated frameworks can be useful if you want additional features out of the box.

Nest.js

Nest is a progressive framework built with TypeScript from the ground up. Nest is actually a layer on top of Express (or Fastify), with additional services. It is inspired by Angular and incorporates the kind of architectural support found there. In particular, it includes dependency injection. Nest also uses annotated controllers for endpoints.

Also see: Intro to Nest.js: Server-side JavaScript development on Node.

Here is an example of a dog finder provider that can be injected into a controller:

// The provider:
import { Injectable, NotFoundException } from '@nestjs/common';

// The @Injectable() decorator marks this class as a provider.
@Injectable()
export class DogsService {
  private readonly dogBreeds = [
    "Shih Tzu",
    "Great Pyrenees",
    "Tibetan Mastiff",
    "Australian Shepherd"
  ];

  findOne(id: number) {
    if (id >= 0 && id < this.dogBreeds.length) {
      return { breed: this.dogBreeds[id] };
    }

    // NotFoundException produces a 404 response
    throw new NotFoundException('Dog breed not found');
  }
}

This style is typical of dependency injection frameworks like Angular, as well as Spring. It allows you to declare components as injectable, then consume them anywhere you need them.
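For instance, the consuming side might be a controller that receives DogsService through its constructor; here is a sketch, with route and file names assumed to match the earlier examples:

import { Controller, Get, Param, ParseIntPipe } from '@nestjs/common';
import { DogsService } from './dogs.service';

@Controller('dogs')
export class DogsController {
  // Nest's DI container supplies the DogsService instance at runtime
  constructor(private readonly dogsService: DogsService) {}

  @Get(':id')
  findOne(@Param('id', ParseIntPipe) id: number) {
    return this.dogsService.findOne(id);
  }
}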

In Nest, we’d just add these as modules to make them live.
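A sketch of that module wiring (file names assumed):

import { Module } from '@nestjs/common';
import { DogsController } from './dogs.controller';
import { DogsService } from './dogs.service';

@Module({
  controllers: [DogsController],
  providers: [DogsService],
})
export class DogsModule {}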

Adonis.js

Like Nest, Adonis provides a controller layer that you wire together with routes. Adonis is inspired by the model-view-controller (MVC) pattern, so it also includes a layer for modelling data and accessing stores via an ORM. Finally, it provides a validator layer for ensuring data meets requirements.

Routes in Adonis are very simple:

Route.get('/dogs/:id', [DogsController, 'show'])

In this case, DogsController would be the handler for the route, and might look something like:

import type { HttpContextContract } from '@ioc:Adonis/Core/HttpContext'  // Note, ioc means inversion of control, similar to dependency injection

export default class DogsController {
  // The 'show' method handles the logic for the route
  public async show({ params, response }: HttpContextContract) {
    const id = Number(params.id);

    // Check if the id is a valid number and within the array bounds
    if (!isNaN(id) && id >= 0 && id < dogBreeds.length) {
      return { breed: dogBreeds[id] };
    }

    // response.notFound() sends a 404
    return response.notFound({ error: 'Dog breed not found' });
  }
}

Of course, in a real application, we could define a model layer to handle the actual data access.
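With Adonis’s Lucid ORM, that model layer could be as small as the following sketch (the table and column names are assumptions):

import { BaseModel, column } from '@ioc:Adonis/Lucid/Orm'

export default class Dog extends BaseModel {
  @column({ isPrimary: true })
  public id: number

  @column()
  public breed: string
}

The controller’s show method could then query the database with something like Dog.findOrFail(id) instead of reading from an in-memory array.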

Sails

Sails is another MVC-style framework. It is one of the original one-stop-shopping frameworks for Node and includes an ORM layer (Waterline), API generation (Blueprints), and realtime support, including WebSockets.

Sails strives for conventional operation. For example, here’s how you might define a simple model for dogs:

/**
 * Dog.js
 *
 * @description :: A model definition represents a database table/collection.
 * @docs        :: https://sailsjs.com/docs/concepts/models
 */
module.exports = {
  attributes: {
    breed: { type: 'string', required: true },
  },
};

If you run this in Sails, the framework will generate default routes and wire up a NoSQL or SQL datastore based on your configuration. Sails also provides the option to override these defaults and add in your own custom logic.
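Overriding a blueprint typically means pointing a route at your own action in config/routes.js and writing that action yourself. Here is a rough sketch in Sails’ actions2 style; the action name and file layout are assumptions:

// config/routes.js: route the request to a custom action instead of a blueprint
module.exports.routes = {
  'GET /dogs/:id': { action: 'dogs/find-one' },
};

// api/controllers/dogs/find-one.js: the custom action
module.exports = {
  friendlyName: 'Find one dog',

  inputs: {
    id: { type: 'number', required: true },
  },

  exits: {
    notFound: { responseType: 'notFound' },
  },

  fn: async function ({ id }, exits) {
    // Waterline query against the Dog model defined above
    const dog = await Dog.findOne({ id });
    if (!dog) {
      throw 'notFound'; // routes to the notFound exit
    }
    return exits.success(dog);
  },
};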

Full-stack frameworks

Also known as meta-frameworks, these tools combine a front-end framework with a solid back end and various CLI niceties like build chains.

Next.js

Next is a React-based framework built by Vercel. It is largely responsible for the huge growth in popularity of these types of frameworks. Next was the first framework to bring together back-end API definitions with the front end that consumes them. It also introduced file-system routing. In Next and other full-stack frameworks, you get both parts of your stack in one place and you can run them together during development.

In Next, we could define a route at pages/api/dogs/[id].js like so:

export default function handler(req, res) {
  // `req.query.id` comes from the dynamic filename [id].js
  const { id } = req.query;
  const parsedId = parseInt(id, 10);

  if (parsedId >= 0 && parsedId < dogBreeds.length) {
    res.status(200).json({ breed: dogBreeds[parsedId] });
  } else {
    res.status(404).json({ error: 'Dog breed not found' });
  }
}

We’d then define the UI component to interact with this route at pages/dogs/[id].js:

import React from 'react';

// This is the React component that renders the page.
// It receives the `dog` object as a prop from getServerSideProps.
function DogPage({ dog }) {
  // Handle the case where the dog wasn't found
  if (!dog) {
    return <h1>Dog Breed Not Found</h1>;
  }

  return (
    <div>
      <h1>Dog Breed Profile</h1>
      <p>Breed Name: {dog.breed}</p>
    </div>
  );
}

// This function runs on the server before the page is sent to the browser.
export async function getServerSideProps(context) {
  const { id } = context.params; // Get the ID from the URL

  // Fetch data from our own API route on the server.
  const res = await fetch(`http://localhost:3000/api/dogs/${id}`);

  // If the fetch was successful, parse the JSON.
  const dog = res.ok ? await res.json() : null;

  // Pass the fetched data to the DogPage component as props.
  return {
    props: {
      dog,
    },
  };
}

export default DogPage;

Nuxt.js

Nuxt is the same idea as Next, but applied to the Vue front end. The basic pattern is the same, though. First, we’d define a back-end route:

// server/api/dogs/[id].js

// defineEventHandler is Nuxt's helper for creating API handlers.
export default defineEventHandler((event) => {
  // Nuxt automatically parses route parameters.
  const id = getRouterParam(event, 'id');
  const parsedId = parseInt(id, 10);

  if (parsedId >= 0 && parsedId < dogBreeds.length) {
    return { breed: dogBreeds[parsedId] };
  }

  throw createError({ statusCode: 404, statusMessage: 'Dog breed not found' });
});

Then, we’d create the UI file in Vue:

// pages/dogs/[id].vue
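A minimal sketch of what that page might contain, using Nuxt 3’s useRoute and useFetch composables and mirroring the Next.js version (the markup is illustrative):

<script setup>
// useRoute and useFetch are auto-imported Nuxt composables
const route = useRoute();
const { data: dog } = await useFetch(`/api/dogs/${route.params.id}`);
</script>

<template>
  <div>
    <h1>Dog Breed Profile</h1>
    <p v-if="dog">Breed Name: {{ dog.breed }}</p>
    <p v-else>Dog Breed Not Found</p>
  </div>
</template>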



SvelteKit

SvelteKit is the full-stack framework for the Svelte front end. It’s similar to Next and Nuxt, with the main difference being the front-end technology.

In SvelteKit, a back-end route looks like so:

// src/routes/api/dogs/[id]/+server.js

import { json, error } from '@sveltejs/kit';

// This is our data source for the example.
const dogBreeds = [
  "Shih Tzu",
  "Australian Cattle Dog",
  "Great Pyrenees",
  "Tibetan Mastiff",
];

/** @type {import('./$types').RequestHandler} */
export function GET({ params }) {
  // The 'id' comes from the [id] directory name.
  const id = parseInt(params.id, 10);

  if (id >= 0 && id < dogBreeds.length) {
    return json({ breed: dogBreeds[id] });
  }

  // error() throws an HttpError that SvelteKit turns into a 404 response
  throw error(404, 'Dog breed not found');
}

SvelteKit usually splits the UI into two components. The first component is for loading the data (which can then be run on the server):

// src/routes/dogs/[id]/+page.js

import { error } from '@sveltejs/kit';

/** @type {import('./$types').PageLoad} */
export async function load({ params, fetch }) {
  // Use the SvelteKit-provided `fetch` to call our API endpoint.
  const response = await fetch(`/api/dogs/${params.id}`);

  if (response.ok) {
    const dog = await response.json();
    // The object returned here is passed as the 'data' prop to the page.
    return {
      dog: dog
    };
  }

  // If the API returns an error, forward it to the user.
  throw error(response.status, 'Dog breed not found');
}

The second component is the UI:

// src/routes/dogs/[id]/+page.svelte

<script>
  /** @type {import('./$types').PageData} */
  export let data;
</script>

<h1>Dog Breed Profile</h1>

<p>Breed Name: {data.dog.breed}</p>

Conclusion

The Node.js ecosystem has moved beyond the “default-to-Express” days. Now, it is worth your time to look for a framework that fits your specific situation.

If you are building microservices or high-performance APIs, where every millisecond counts, you owe it to yourself to look at minimalist frameworks like Fastify or Hono. This class of frameworks gives you raw speed and total control without requiring decisions about infrastructure.

If you are building an enterprise monolith or working with a big team, batteries-included frameworks like Nest or Adonis offer useful structure. The complexity of the initial setup buys you long-term maintainability and makes the codebase more standardized for new developers.

Finally, if your project is a content-rich web application, full-stack meta-frameworks like Next, Nuxt, and SvelteKit offer the best developer experience and the most complete set of integrated tools.

It’s also worth noting that, while Node remains the standard server-side runtime, alternatives such as Deno and Bun have both made a name for themselves. Deno, created by Node’s original author, is open source with a strong security focus and has its own framework, Deno Fresh. Bun is respected for its ultra-fast startup and integrated tooling.

(image/jpeg; 5.92 MB)
