Worm flooding npm registry with token stealers still isn’t under control 15 Nov 2025, 1:36 am

A coordinated token farming campaign continues to flood the open source npm registry, with tens of thousands of infected packages created almost daily to steal tokens from unsuspecting developers using the Tea Protocol to reward coding work.

On Thursday, researchers at Amazon said there were over 150,000 packages in the campaign. But in an interview on Friday, an executive at software supply chain management provider Sonatype, which wrote about the campaign in April 2024, told CSO that number has now grown to 153,000.

“It’s unfortunate that the worm isn’t under control yet,” said Sonatype CTO Brian Fox.

And while this payload merely steals tokens, other threat actors are paying attention, he predicted.

“I’m sure somebody out there in the world is looking at this massively replicating worm and wondering if they can ride that, not just to get the Tea tokens but to put some actual malware in there, because if it’s replicating that fast, why wouldn’t you?”

When Sonatype wrote about the campaign just over a year ago, it found a mere 15,000 packages that appeared to come from a single person.

With the swollen numbers reported this week, Amazon researchers wrote that it’s “one of the largest package flooding incidents in open source registry history, and represents a defining moment in supply chain security.”

This campaign is just the latest way threat actors are taking advantage of security holes in a number of open source repositories, a trend that risks damaging the reputation of sites like npm, PyPI, and others.


“The malware infestation in open-source repositories is a full-blown crisis, out of control and dangerously eroding trust in the open-source upstream supply chain,” said Dmitry Raidman, CTO of Cybeats, which makes a software bill of materials solution.

As evidence, he pointed to the Shai‑Hulud worm’s rapid exploitation of the npm ecosystem, which shows how quickly attackers can hijack developer tokens, corrupt packages, and propagate laterally across the entire dependency ecosystem. “What began as a single compromise explodes in a few hours, leaving the whole ecosystem and every downstream project in the industry at risk in a matter of days, regardless of whether it is open source or commercial.”

This past September, Raidman wrote about the compromise of the Nx build system after threat actors pushed malicious versions of the package into npm. Within hours, he wrote, developers around the world were unknowingly pulling in code that stole SSH keys, authentication tokens, and cryptocurrency wallets.

These and more recent large scale uploads of malicious packages into open source repositories are “just the beginning,” he warned, unless developers and repository maintainers improve security.

The Amazon and Sonatype reports aren’t the first to detect this campaign. Australian researcher Paul McCarty of SourceCodeRed confirmed to CSO that this is the worm he dubbed ‘IndonesianFoods’ in a blog post this week.

The Tea Protocol

The Tea Protocol is a blockchain-based platform that gives open-source developers and package maintainers tokens called Tea as rewards for their software work. These tokens are also supposed to help secure the software supply chain and enable decentralized governance across the network, say its creators on their website.

Developers embed Tea code that links their apps to the blockchain; the more an app is downloaded, the more Tea tokens they earn, which can then be cashed in through a fund. The worm scheme is an attempt to make the blockchain think apps created by the threat actors are highly popular and therefore earn a lot of tokens.

At the moment, the tokens have no value. But it is suspected that the threat actors are positioning themselves to receive real cryptocurrency tokens when the Tea Protocol launches its Mainnet, where Tea tokens will have actual monetary value and can be traded.

For now, says Sonatype’s Fox, the scheme wastes the time of npm administrators, who are trying to expel over 100,000 packages. But Fox and Amazon point out the scheme could inspire others to take advantage of other reward-based systems for financial gain, or to deliver malware.

What IT leaders and developers should do

To lower the odds of abuse, open source repositories should tighten their access control, limiting the number of users who can upload code, said Raidman of Cybeats. That includes the use of multi-factor authentication in case login credentials of developers are stolen, he said, and adding digital signing capabilities to uploaded code to authenticate the author.

IT leaders should insist that all code their firm uses has a software bill of materials (SBOM), so security teams can see the components. They also need to insist that developers know which versions of the open source code they include in their apps, and confirm that only approved, safe versions are being used and that they aren’t changed automatically just because a new version is downloaded from a repository.

Sonatype’s Fox said IT leaders need to buy tools that can intercept and block malicious downloads from repositories. Antivirus software is useless here, he said, because malicious code uploaded to repositories won’t contain the signatures that AV tools are supposed to detect.

In response to emailed questions, the authors of the Amazon blog, researchers Chi Tran and Charlie Bacon, said open source repositories need to deploy advanced detection systems to identify suspicious patterns like malicious configuration files, minimal or cloned code, predictable code naming schemes and circular dependency chains.

“Equally important,” they add, “is monitoring package publishing velocity, since automated tools create at speeds no human developer could match. In addition, enhanced author validation and accountability measures are crucial for prevention. This includes implementing stronger identity verification for new accounts, monitoring for coordinated publishing activity across multiple developer accounts, as seen in this campaign, and applying ‘guilt by association’ principles where packages from accounts linked to malicious activity receive heightened scrutiny. Repositories should also track behavioral patterns like rapid account creation followed by mass package publishing, which are hallmarks of automated abuse.”
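As a toy illustration of the publishing-velocity heuristic the researchers describe, the sketch below flags accounts that publish far more packages within an hour than a human plausibly could. The threshold, time window, and event format are illustrative assumptions, not Amazon’s or npm’s actual detection logic.

# Toy sketch: flag authors whose publishing rate exceeds a human-plausible pace.
from collections import defaultdict
from datetime import datetime, timedelta

def flag_high_velocity_authors(events, window=timedelta(hours=1), threshold=20):
    """events: iterable of (author, published_at) tuples."""
    by_author = defaultdict(list)
    for author, published_at in events:
        by_author[author].append(published_at)

    flagged = set()
    for author, times in by_author.items():
        times.sort()
        start = 0
        for end, t in enumerate(times):
            while t - times[start] > window:   # slide the window forward
                start += 1
            if end - start + 1 > threshold:    # too many publishes in one window
                flagged.add(author)
                break
    return flagged

# Example: an account pushing 50 packages minutes apart gets flagged; one
# publishing every few days does not.
base = datetime(2025, 11, 13, 12, 0)
events = [("bot-account", base + timedelta(seconds=30 * i)) for i in range(50)]
events += [("human-dev", base + timedelta(days=i)) for i in range(5)]
print(flag_high_velocity_authors(events))  # {'bot-account'}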

CISOs discovering these packages in their environments “face an uncomfortable reality,” the Amazon authors add: “Their current security controls had failed to detect a coordinated supply chain attack.”

SourceCodeRed’s McCarty said IT leaders need to protect developers’ laptops, as well as their automated continuous integration and delivery pipelines (CI/CD). Traditional security tools like EDR and SCA don’t scan for malware, he warned. “The number of people that buy Snyk thinking it does this is huge,” he said. 

McCarty has created two open source malware scanning tools. One, opensourcemalware.com, is an open database of malicious content such as npm packages, which can be queried to check whether a package in use is malicious. The second, MALOSS, is a scanner that automatically checks opensourcemalware.com and other sources, and can be used in a CI/CD pipeline or on a local workstation.

He also recommends the use of a commercial or open source package firewall, which effectively allows a developer to only install approved packages. 

“The enterprise has more options than I think they realize,” he told CSO. “They just often don’t realize that there are tools and solutions to address this risk.  Maturity is really low in this space.”


Red Hat Linux bolsters AI assistance 14 Nov 2025, 10:40 pm

Red Hat has released two updates of Red Hat Enterprise Linux (RHEL), versions 10.1 and 9.7, with the new releases emphasizing AI-powered Linux management and quantum threat mitigation.

Both versions were unveiled on November 12 and can be accessed now from access.redhat.com.

With AI-powered Linux management, which Red Hat considers foundational to RHEL, the RHEL command-line assistant now has an expanded context limit, making it easier to analyze very large log files and data streams for effective troubleshooting, Red Hat said. The assistant lowers the skills barrier to managing and troubleshooting connected systems, according to the company. Additionally, an offline version of the RHEL command-line assistant (available in developer preview) is a self-contained tool that runs locally, so users can receive AI-powered guidance for Linux tasks in disconnected environments. This is key for organizations in highly sensitive and regulated industries where cloud services are restricted, the company said.

Also with the new RHEL releases, Red Hat users now can more easily install validated drivers for leading AI accelerators from AMD, Intel, and Nvidia. This move will have RHEL delivering validated drivers to provide a secure foundation for emerging, mission-critical workloads, helping to reduce bottlenecks and accelerate the AI/ML life cycle, Red Hat said.

The company regularly releases two RHEL versions simultaneously to provide organizations with a choice of features, performance, and support life cycles. The latest RHEL version (10.1) offers the newest security patches and hardware support. In addition, key enhancements are backported to the stable, older version (9.7) so that organizations can get crucial updates without the disruption of a major version upgrade, Red Hat said.

RHEL 9.7 incorporates post-quantum cryptography algorithms to help deal with potential threats from future quantum computing. This follows capabilities introduced in RHEL 10, which arrived May 20. RHEL 10.1, meanwhile, enhances support for post-quantum cryptography in Transport Layer Security (TLS), providing protection for crucial data in transit, Red Hat said.

RHEL 10.1 now offers soft reboots, a new capability in image mode that lets administrators change system states without requiring a full kernel reboot. This makes for faster updates and patching with minimal disruption to business operations. And the OpenTelemetry Collector in RHEL 9 and RHEL 10 Cloud Images now supports Trusted Platform Module (TPM) on AWS, Microsoft Azure, and Google Cloud Platform. This enables sensitive operations to be performed within tamper-resistant hardware.

Finally, RHEL’s support for the Automated Certificate Management Environment (ACME) protocol is now generally available. ACME automates the manual and error-prone task of certificate issuance and renewal for production systems, helping maintain security and reliability.


Copy-paste vulnerability hits AI inference frameworks at Meta, Nvidia, and Microsoft 14 Nov 2025, 12:15 pm

Cybersecurity researchers have uncovered a chain of critical remote code execution (RCE) vulnerabilities in major AI inference server frameworks, including those from Meta, Nvidia, Microsoft, and open-source projects such as vLLM and SGLang.

According to Oligo Security, these vulnerabilities stand out for the way they propagated. Developers copied code containing insecure patterns across projects, effectively transplanting the same flaw into multiple ecosystems.

“These vulnerabilities all traced back to the same root cause: the overlooked unsafe use of ZeroMQ (ZMQ) and Python’s pickle deserialization,” said Avi Lumelsky, security researcher at Oligo. “As we dug deeper, we found that code files were copied between projects (sometimes line-for-line), carrying dangerous patterns from one repository to the next.”

Lumelsky noted in a blog post that Oligo has spent the past year uncovering similar RCE-grade flaws across widely used AI frameworks, pointing to a systemic security gap in the emerging inference ecosystem.

Code reuse contamination

In their investigation, Oligo’s researchers found that the initial trigger was exposed in Meta’s Llama Stack, where a function used ZeroMQ’s “recv_pyobj()” to receive data and then pass it directly to Python’s “pickle.loads().” This allowed arbitrary code execution over unauthenticated sockets.

“If you’ve worked with Python, you know pickle isn’t designed for security,” Lumelsky said. “It can execute arbitrary code during deserialization, which is fine in a tightly controlled environment, but far from fine if exposed over the network.”
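To make the pattern concrete, the sketch below is not taken from any of the affected projects; it contrasts the unsafe ZMQ-plus-pickle pattern described above with the safer JSON-based serialization Meta’s patch moved toward. The socket address and message handling are illustrative assumptions.

import json
import pickle  # shown only to contrast the unsafe pattern below

import zmq  # pyzmq

context = zmq.Context()
socket = context.socket(zmq.REP)
socket.bind("tcp://0.0.0.0:5555")  # illustrative: an unauthenticated, network-exposed socket

raw = socket.recv()

# Unsafe pattern (the root cause Oligo describes): deserializing untrusted
# bytes with pickle lets a crafted payload execute arbitrary code on load.
# obj = pickle.loads(raw)

# Safer pattern (what the patched frameworks moved toward): JSON carries plain
# data only, so nothing executes during parsing.
try:
    obj = json.loads(raw)
except json.JSONDecodeError:
    socket.send(b'{"error": "malformed message"}')
else:
    socket.send(json.dumps({"ok": True, "echo": obj}).encode())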

From Meta, the same insecure pattern appeared in other frameworks, including Nvidia’s TensorRT-LLM, vLLM, SGLang, and even the Modular Max Server. They all contained nearly identical code (sometimes with a header comment like “Adapted from vLLM”).

Oligo is calling this the “ShadowMQ” pattern, a hidden communication-layer flaw that jumps from one repository to another via copy-and-paste or minor adaptation, rather than fresh implementation. Because these frameworks are widely reused across the AI ecosystem, the contamination risk becomes systemic: a single vulnerable component can infect many downstream projects.

Oligo reported the flaw (CVE-2024-50050) to Meta in September 2024, and Meta swiftly patched the unsafe pickle usage with JSON-based serialization. Oligo subsequently flagged the flaw’s replication in vLLM (CVE-2025-30165), Nvidia TensorRT-LLM (CVE-2025-23254), and Modular Max Server (CVE-2025-60455), all of which have since been fixed with safer serialization logic.

Why this matters for AI infrastructure

The vulnerable inference servers form the backbone of many enterprise-grade AI stacks, processing sensitive prompts, model weights, and customer data. Oligo reported identifying thousands of exposed ZeroMQ sockets on the public internet, some tied to these inference clusters.

If exploited, an attacker could execute arbitrary code on GPU clusters, escalate privileges, exfiltrate model or customer data, or install GPU miners, turning an AI infrastructure asset into a liability.

SGLang has been adopted by several large enterprises, including xAI, AMD, Nvidia, Intel, LinkedIn, Cursor, Oracle Cloud, and Google Cloud, Lumelsky noted.

Oligo recommends upgrading to patched versions: Meta Llama Stack v0.0.41, Nvidia TensorRT-LLM 0.18.2, vLLM v0.8.0, and Modular Max Server v25.6, or later. It also advises restricting the use of pickle with untrusted data, adding HMAC and TLS authentication to ZMQ-based communication, and educating dev teams on the risks.
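As a rough sketch of the HMAC message authentication Oligo recommends for ZMQ traffic (key distribution, rotation, and the exact message framing are assumptions left out of scope here):

import hashlib
import hmac
import json
import os

# Assumption: both ends share a secret key distributed out of band.
SECRET_KEY = os.environ["ZMQ_SHARED_KEY"].encode()

def sign(payload: dict) -> bytes:
    """Serialize a payload as JSON and prepend an HMAC-SHA256 tag."""
    body = json.dumps(payload, sort_keys=True).encode()
    tag = hmac.new(SECRET_KEY, body, hashlib.sha256).hexdigest().encode()
    return tag + b"." + body

def verify(message: bytes) -> dict:
    """Reject any message whose tag doesn't match before parsing it."""
    tag, body = message.split(b".", 1)
    expected = hmac.new(SECRET_KEY, body, hashlib.sha256).hexdigest().encode()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("message failed HMAC verification")
    return json.loads(body)

# A sender would transmit sign({"task": "infer", "text": "..."}) over the ZMQ
# socket, and the receiver would call verify() before acting on the message.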


Tech mega-deals are a distraction, not a breakthrough 14 Nov 2025, 9:00 am

The tech world is abuzz with news of Amazon Web Services (AWS) and OpenAI signing a seven-year, $38 billion cloud computing deal, a partnership that promises high-powered AI advancements through massive infrastructure investments. AWS is providing OpenAI access to hundreds of thousands of Nvidia GPUs, including clusters of next-generation GB200 and GB300 chips networked via Amazon’s UltraServers. The rollout is already underway, aiming to scale tens of millions of CPUs and GPUs through 2026, with optional expansion into 2027.

The headlines may scream innovation, but let’s step back for a moment. While the tech press frames this deal as monumental, the reality is far less compelling for the vast majority of enterprises. At its core, this is a tech-to-tech agreement focused on infrastructure provisioning and back-end optimization of scale—hardly an everyday concern for most enterprises. Instead of delivering tangible solutions to business users or advancing practical enterprise use cases, mega-deals like this might actually distract both technology providers and enterprises from the outcomes that matter most.

Who benefits?

From the technology provider’s perspective, multi-billion-dollar mega-deals seem to make sense. AWS wants to strengthen its lead in cloud infrastructure while OpenAI seeks to ensure scalable compute for training and operating its generative AI models. The partnership gives OpenAI the GPU horsepower to maintain momentum for ChatGPT as well as future model development. In return, Amazon positions itself as a critical player in the generative AI race, potentially loosening the symbiotic ties OpenAI has with Microsoft. Both companies are seeking dominance in a competitive market.

But from the enterprise user’s perspective, this deal will have little, if any, immediate impact. The majority of enterprises aren’t concerned with OpenAI’s partners for GPUs. They aren’t clamoring for AWS UltraServers or Nvidia GB300 clusters. They care about solving day-to-day operational challenges: managing costs, adopting automation, easing the burden of IT operations, and delivering value to their customers. This deal won’t help them build better applications faster, nor does it address the kinds of problems CIOs and IT teams encounter in the trenches.

At best, these deals are neutral for enterprise users. At worst, they distract the very providers who should be focusing on innovating for their customers, not trying to impress their competitors.

A distraction from enterprise problems

In theory, partnerships like AWS and OpenAI’s should trickle down to the average enterprise in the form of better tools and services. But the ongoing pattern in the tech landscape is that whenever major providers form enormous partnerships, the focus inevitably shifts inward to infrastructure optimization, integration, and control of resources. The energy goes toward accommodating complex arrangements between the tech players themselves, not meeting the pressing needs of business users.

Enterprises, by contrast, operate in a world of brutal simplicity. They’re trying to migrate workloads to the cloud without downtime. They’re automating business workflows for better efficiency. They’re investing in data analytics to improve customer experience. Most simply need tools that don’t require a massive learning curve or specialized knowledge of GPU clusters or AI model hosting.

When mega tech partnerships dominate the tech industry’s attention, providers risk losing sight of the practical challenges facing their own customers. Instead of building technologies that make life easier for enterprises, they get caught up in perfecting back-end systems that have no visible impact on a company’s bottom line. Simply put, enterprises don’t care whether AWS inked a billion-dollar deal with OpenAI or any other AI leader; they just want the products they use to be better, more affordable, and easier to deploy.

Enterprises need relevance, not scale

The tech industry often conflates scale with accomplishment. The AWS-OpenAI deal is being touted as the largest AI infrastructure partnership to date, involving tens of millions of CPUs and GPUs. No doubt, that’s impressive from a technical standpoint. But the more critical question is whether this scale translates into relevant solutions for enterprises.

For most businesses, “AI” remains a buzzword. Companies are only beginning to implement chatbots, predictive analytics, and workflow optimization using AI. Smaller enterprises have yet to fully embrace cloud-native applications, let alone invest in customized AI deployments. In such a landscape, offering hundreds of thousands of GPUs and ultra-low-latency chips adds zero relevance to their lives. This isn’t a case of scaling solutions to meet demand; it’s a case of scaling capabilities that are far removed from the average enterprise reality.

Relevance, by contrast, looks like AI-backed tools that integrate seamlessly with email workflows or cloud management platforms. Relevance is offering midsize businesses affordable, out-of-the-box solutions that improve customer service, reduce operational scaling issues, or future-proof legacy systems. Relevance does not require $38 billion infrastructure partnerships. It requires listening to enterprises.

Tech users shouldn’t foot the bill

Another concern with deals of this scope is the eventual trickle-down cost to end-users. Building and maintaining the massive AI clusters outlined in the AWS-OpenAI agreement isn’t cheap, and big tech companies rarely absorb those costs themselves. Whether indirectly tied to higher cloud service rates, increased licensing costs, or premium enterprise tools, it’s the enterprise users who often bear the brunt of financing these power plays.

Rather than investing in smarter pricing models or lower-cost AI implementations, tech providers focus on inflating their infrastructure capabilities to gain competitive positioning. Generative AI is powerful and exciting, but these deals push it in directions that drive competition between providers rather than innovation for enterprises.

A better benchmark for progress

Ultimately, these mega-deals are a symptom of the tech industry’s obsession with scale and dominance. The press devotes excessive coverage to partnerships like AWS and OpenAI’s, as though they portend the next wave of technological innovation. It’s time to shift the conversation. Rather than celebrating contracts measured in billions of dollars, we should focus on what matters to business users: solving real-world problems, improving accessibility, and delivering tools that make operations more efficient, not more complex.

The average enterprise will never care about the back-end architecture of generative AI models or GPU provisioning. We, as an industry, need to let go of the hype and start delivering value where it counts. Mega-deals may keep tech providers in the headlines, but for most cloud users, they’re meaningless. Let’s stop pretending otherwise.


Python vs. Mojo (and Java, Go, Rust, and .NET) 14 Nov 2025, 9:00 am

Is Mojo still a contender for Python’s ML/AI crown? What languages outside of the Python ecosystem are good for data science? What’s the deal with Python dataclasses? And what’s this shiny new distributed-processing framework from the folks who gave us PyTorch? Find the answers to these questions and more, in this week’s report.

Top picks for Python readers on InfoWorld

Revisiting Mojo: A faster Python?
Until recently, it wasn’t possible to run Mojo on your own machine. Now that you can, it’s time for another look. Is Mojo still a contender for Python’s data science crown?

AI and machine learning outside of Python
There’s no question Python is the default choice for machine learning and data science. That doesn’t mean other options are off the table, though. Here’s a look at what you can do with Java, Rust, Go, and .NET.

How to use Python dataclasses
Python classes can be verbose, and even simple ones are often overloaded with boilerplate. Learn how to use dataclasses for more streamlined Python class creation.
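As a minimal illustration of the boilerplate that dataclasses remove (the Point class here is a made-up example):

from dataclasses import dataclass, field

@dataclass
class Point:
    x: float
    y: float
    tags: list[str] = field(default_factory=list)  # mutable defaults need a factory

# __init__, __repr__, and __eq__ are generated for you:
p = Point(1.0, 2.0)
print(p)                      # Point(x=1.0, y=2.0, tags=[])
print(p == Point(1.0, 2.0))   # True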

PyTorch team unveils framework for programming clusters
Monarch, as it’s called, lets you program entire clusters of machines in parallel with the same directness and power as the APIs used in PyTorch.

More good reads and Python updates elsewhere

Decompression is up to 30% faster in CPython 3.15
A future release of Python uses a markedly faster version of the Zstandard decompression library, for possible speedups when installing binary wheels and in other scenarios.

The future of Python web services looks GIL-free
How might free-threaded Python change a common use case like web services? Giovanni Barillari runs it down, with side-by-side tests and benchmarks, then shares his findings.

Using Python and Rust to build a fast Model Context Protocol server
A walkthrough of using PyO3 to build Rust-enhanced tooling for AI agents, with a convenient Python front end.

PyO3 now supports Python 3.14
Speaking of PyO3, everyone’s favorite Rust-and-Python binding tool now supports Python 3.14, phases out everything pre-Python 3.10, and introduces a new .cast() API for better type conversions.


Google BigQuery gets managed AI functions to simplify unstructured data analysis 14 Nov 2025, 7:16 am

Google has boosted its BigQuery data warehouse with three new managed AI-based SQL functions to help enterprises reduce the complexity of running large-scale analytics, especially on unstructured data.

The functions — AI.IF, AI.CLASSIFY, and AI.SCORE — are designed to enable LLM usage for analytical tasks for enterprises on both structured and unstructured data directly within SQL queries. These functions work without prompt tuning or external tools, Google wrote in a blog post.

The new AI functions can be used to filter and join data based on semantic meaning using AI.IF in WHERE or ON clauses, categorize unstructured text or images with AI.CLASSIFY in GROUP BY clauses, and rank rows by natural language criteria through AI.SCORE in ORDER BY clauses.
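For a sense of what that looks like in practice, here is a hedged sketch of a query submitted from Python with the google-cloud-bigquery client. The dataset, column names, and the exact argument forms of AI.IF and AI.SCORE are assumptions based on the descriptions above, not verified BigQuery syntax.

from google.cloud import bigquery

client = bigquery.Client()  # assumes default project and credentials are configured

query = """
SELECT
  review_text,
  AI.SCORE('How urgent is this customer review?', review_text) AS urgency  -- rank by natural-language criteria
FROM `my_project.support.reviews`                                           -- hypothetical dataset
WHERE AI.IF('The review mentions a billing problem', review_text)           -- semantic filter in WHERE
ORDER BY urgency DESC
LIMIT 20
"""

for row in client.query(query).result():
    print(row.review_text, row.urgency)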

Lowering the barrier to entry for data analysts

Traditionally, integrating LLMs into SQL workflows for AI-based reasoning of data has been a time-consuming, tedious, and costly affair as it requires data movement, prompt engineering, manual model selection, and parameter tuning, analysts pointed out.

The movement of data is typically required due to SQL’s inability to understand nuance and meaning of unstructured data, making advanced analysis, such as sentiment analysis or categorization, of customer reviews, support tickets, reports, etc., difficult, said Bradley Shimmin, lead of the data, analytics, and infrastructure practice at The Futurum Group.

To bypass this challenge, data analysts often had to export data from the warehouse, send it to a data scientist, and wait for the data scientist to send back enhanced, categorized data suitable for analysis using SQL, Shimmin noted, adding that the new AI functions “can literally collapse that entire workflow into a single query, using standard SQL syntax.”

The new managed AI Functions lower the entry barrier for data analysts as they can now adopt SQL-friendly syntax for AI-based reasoning of data without having to learn prompt engineering, HyperFRAME Research’s practice leader of AI stack, Stephanie Walter, pointed out.

“For enterprises, this means faster time-to-insight, less specialized skills required, and lower operational cost and risk,” Walter said.

Enterprises also gain from the managed nature of these new functions, Walter said, referring to Google’s backend automated handling of model selection, prompt optimization, query plan tuning, and endpoint management for these functions.

“This managed approach addresses the enterprise pain-point of complexity and operational risk: instead of analysts or teams having to decide which model variant to use, and optimize queries for latency and cost, Google abstracts that,” Walter said.

Growing demand for AI-based SQL functions

The integration of AI-based SQL functions in data warehouses is becoming a wider phenomenon. It can be viewed as a highly competitive space with all major vendors making comparable moves at varying stages of maturity.

While Databricks already offers AI Functions that can be used to apply generative-AI or LLM inference directly from SQL or Python, Snowflake provides AI_PARSE_DOCUMENT, AISQL, and Cortex functions for document parsing, semantic search, and AI-driven analytics.

Other warehouses, such as Oracle’s Autonomous Data Warehouse, also support AI workflows alongside SQL.

These integrations, according to Phil Fersht, CEO of HFS Research, point to a broader agentic evolution of data platforms.

“These managed functions act as foundational building blocks for more autonomous systems. Imagine agents that can query data, interpret results, and make decisions in real time, all without leaving the warehouse… giving enterprise data the ability to ‘think’ inside its own environment,” Fersht said.

The three new functions are currently in public preview.


Visual Studio Code unifies UI for managing coding agents 14 Nov 2025, 5:44 am

The latest update to the Microsoft Visual Studio Code editor centers on Agent HQ, featuring a single view to start, monitor, and review agent sessions, whether local or remote.

Also called the October 2025 release, Visual Studio Code 1.106 was published November 12. Downloads for Windows, macOS, and Linux can be accessed from visualstudio.com.

Agent HQ provides multiple capabilities including the Agent Sessions view, a centralized location for managing active chat sessions. This includes local sessions in VS Code and sessions created by background agents in other environments, such as the Copilot coding agent, GitHub Copilot, or  OpenAI Codex. The Agent Sessions view now is enabled by default.

Also featured as part of Agent HQ is a new plan agent that breaks down complex tasks step-by-step before any code is written. Selecting Plan from the agents dropdown in the Chat view gets this started. When tackling a multi-step implementation, VS Code prompts the user with clarifying questions and generates a detailed implementation plan to be approved first, ensuring all requirements and context are captured up front. Developers can create a custom plan agent tailored to a team’s specific workflow and tools. GitHub last month launched Agent HQ for managing AI agents, emphasizing its extension to VS Code.

Visual Studio Code 1.106 also features updates to cloud agent sessions in the editor. The Copilot coding agent integration has been migrated from the GitHub Pull Request extension into the Copilot Chat extension to provide a more native cloud agent experience in VS Code. The release also includes an initial integration with the Copilot CLI. Users can create new sessions and resume existing CLI agent sessions in a chat editor or an integrated terminal.

Code editing also gets attention in Visual Studio Code 1.106. Deleted code in the diff editor now is selectable. Previously, when code was deleted and the changes viewed in the diff editor, the deleted lines could not be copied. Now, developers can copy text from deleted lines in the diff editor when using the inline diff view. In addition, the Go to Line command now supports navigating to a specific character position in a file by using the :: syntax. This is useful when tools report errors at specific character offsets, such as “error at position 599.”

Other new features and improvements in Visual Studio Code 1.106:

  • The concept of advanced settings now is supported. These settings are meant for configuring specialized scenarios and are intended for fine-grained control over an environment.
  • The Manage Extension Account Preferences command has been made more discoverable.
  • Iconography has been refreshed. New icons have been refined with curves, new modifier designs, and more accurate metaphors to make them feel modern, friendly, and more legible.
  • Support has been introduced for managing VS Code policies on Linux systems using JSON files. This lets administrators enforce specific settings and configurations across all users on a Linux machine.
  • Model Context Protocol (MCP) servers and extension tools now can be trusted at the source level through the Allow button dropdown.
  •  A copy button now appears in diagnostic hovers (errors, warnings, info, and hints) to make it easier to copy error messages.
  • The Command Palette now ignores character accents when searching for commands. This makes it easier to find what is needed regardless of keyboard layout or language preferences.


Baidu launches new generation of Ernie AI 14 Nov 2025, 4:28 am

The AI marketspace is getting mighty crowded, and Chinese company Baidu is the latest player to launch its newest model into the world. At its Baidu World conference this week, it unveiled Ernie 5.0.

Baidu CTO and head of AI Group Haifeng Wang said (via translated subtitles supplied by the conference) that Ernie 5.0’s technical route was to “adopt a unified auto-regression architecture for native full multimodal modelling.” He said that this meant that “from the beginning of training, speech had been integrated, and images had been integrated, video, audio, and other multimodal data.”

While its predecessor, Ernie-4.5-VL-28B-A3B-Thinking, is supplied under an Apache license and is expected to provide an alternative to the likes of OpenAI, Ernie 5.0 is proprietary, built on the company’s PaddlePaddle deep learning framework.

In a written statement, Baidu said that Ernie 5.0 Preview was already in joint second place on the LMArena Text leaderboard, a well-established industry benchmarking organization (it has since fallen to eighth spot).

This claim was met with some skepticism by Thomas Randall, a research director at Info-Tech Research Group. “In my opinion, Ernie’s update is not a global game-changer. Claims of ‘outperforming’ or ‘matching’ other AI models must still be validated broadly in independent benchmarks and multiple languages/modalities. Having multimodal is good, but actual performance matters and we just don’t know yet whether Ernie substantially differs from other models.”

This emphasis on a multimodal approach is supported by Brandon Hoff, research director at  IDC. He said that Baidu’s focus on visual and STEM reasoning and visual grounding required a specialist approach. “These are all different workloads and targeted at different use cases, and they are getting smarter.”

He contrasted this to the OpenAI approach: “OpenAI is LLM chatbots that can address a wide variety of prompts.”

Hoff went on to say that this is an example of how quickly the AI world is moving. “I would categorize this announcement as one of many that we will see as AI is applied to new use cases and I would say that they are, in a way, expected or on the development curve,” he said. “Note that China has about half of the AI developers in the world, so we should expect lots of innovation coming out of China.”

However, AI researcher Ahmed Harhara, founder at Houston Home Tools, said that Baidu’s approach seemed heavily optimized for Chinese language context and local data. “It’s similar to how Google fine-tunes Bard [Gemini] for English-heavy datasets. The real difference isn’t just the model architecture; it’s the training environment and regulatory constraints each company operates under.”

He added that the importance of geographical fine-tuning in AI was underestimated. “It matters more than people think. It shapes how models interpret nuance, bias, and context. Baidu’s edge will likely be its integration with China’s digital ecosystem and search infrastructure, not just raw model performance.”

At the conference, Baidu EVP Dou Shen also announced two new processors aimed at powering the company’s advances in AI. The Kunlunxin M100, optimized for large scale AI inference, will be released at the beginning of 2026, with the M300, optimized for the training and inference of ultra-large-scale multimodal large models, following in early 2027.

In addition to these two new processors, Baidu also unveiled the Tianchi 256 and Tianchi 512 supernodes at the conference. They are expected to officially launch in 2026, with a single Tianchi 512 supernode, consisting of 512 Kunlunxin P800 chips, capable of training models with up to one trillion parameters, the company said.

Hoff noted that, in designing chips for their own workloads, the company was following in the footsteps of the US hyperscalers.  “Since they don’t have access to US AI platforms, China companies will focus on locally developed technology and are expected to do well,” he said.

The restrictions that the US has placed on exports to China are having an effect, he added, as they mean that China is now building on Chinese, not US, hardware platforms. “This is a threat to US leadership. Huawei is expected to be successful with their solutions, and they have tight relationships with many countries around the world. I would expect Huawei and Baidu to both be strong competitors to the US in countries outside the US.”


Google Agent Development Kit adds Go language support 13 Nov 2025, 9:45 pm

Google has added Go to the list of languages supported by the company’s Agent Development Kit (ADK), a modular framework for developing and deploying AI agents.

Introduced November 7, ADK for Go offers an idiomatic, performant way to build agents, said Toni Klopfenstein, a Google developer relations engineer for ADK for Go. Source code can be found on GitHub. Developers can use Go’s concurrency and strong typing to build robust, scalable agentic applications, Klopfenstein said. He described ADK for Go as an open-source, code-first toolkit for developers who need fine-grained control over AI agents. Go joins Java and Python as languages supported by the kit.

Key features cited for ADK for Go include the following:

  • Pre-built tools, custom functions, OpenAPI specs, and integration across the Google ecosystem.
  • Code-first development that allows developers to define agent logic, tools, and orchestration directly for flexibility, testability, and versioning.
  • The ability to design scalable applications by composing multiple specialized agents into flexible hierarchies.
  • A built-in development UI that lets users test, evaluate, debug, and showcase agents.
  • Support for the Agent2Agent (A2A) protocol, which allows a primary agent to orchestrate and delegate tasks to specialized sub-agents.

ADK moves the complexity of large language model orchestration, agent behavior, and tool use directly into code, providing developers with robust debugging, reliable versioning, and deployment freedom, Klopfenstein said.


Databricks fires back at Snowflake with SQL-based AI document parsing 13 Nov 2025, 11:27 am

Databricks and Snowflake are at it again, and the battleground is now SQL-based document parsing.

In an intensifying race to dominate enterprise AI workloads with agent-driven automation, Databricks has added SQL-based AI parsing capabilities to its Agent Bricks framework, just days after Snowflake introduced a similar ability inside its Intelligence platform.

The new abilities from Snowflake and Databricks are designed to help enterprises analyze unstructured data, preferably using agent-automated SQL, backed by their individual existing technologies, such as Cortex AISQL and Databricks’ AI Functions.

The ability to query unstructured data using relatively simple yet automated methods, rather than building and running costly ETL pipelines, is a critical cog in the common goal that cloud data warehouses like Snowflake and Databricks share: helping enterprises reduce cost and complexity by enabling unified queries across structured and unstructured data, a capability traditional warehouses lack because they are designed for analyzing structured data.

This goal is currently in sync with enterprises’ demand, said Mansi Gupta, practice director at Everest Group: “In today’s cost-conscious environment, enterprises want to leverage massive, complex datasets without driving up spend”.

Additionally, the ability to query structured and unstructured data simultaneously typically helps enterprises generate more accurate insights and accelerate decision-making.

What is Databricks’ new AI document parsing capability?

Databricks’ new capability, ai_parse_document, now in public preview, is an addition to Agent Bricks’ AI Functions, the subset of Databricks’ AI Functions targeted at helping enterprises create autonomous agents for specific use cases.

When invoked in an agent workflow via Agent Bricks, ai_parse_document parses an entire document, not just text, although it is currently limited to formats such as PDF, JPG/JPEG, PNG, DOC/DOCX, and PPT/PPTX.

“ai_parse_document captures tables, figures, and diagrams with AI-generated descriptions and spatial metadata, storing results in Unity Catalog. Your documents now behave like tables — searchable through vector search and actionable in Agent Bricks workflows,” Databricks’ Mosaic Research team wrote in a blog post.
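As a rough sketch of what invoking the function could look like from a Databricks notebook (the source table, column names, and the single-argument call are assumptions based on the description above, not documented syntax):

# Assumes a Databricks environment where a SparkSession named `spark` is available.
parsed = spark.sql("""
    SELECT
      file_path,
      ai_parse_document(content) AS parsed   -- AI-extracted text, tables, and figure descriptions
    FROM raw_documents                        -- hypothetical table of binary document contents
""")

# Persist the structured output so it can feed vector search and Agent Bricks workflows.
parsed.write.mode("overwrite").saveAsTable("docs.parsed_documents")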

Before the introduction of the feature, Databricks users had to rely on various approaches, such as OCR, regular expressions, and custom ETL scripts, to normalize unstructured text, said Charlie Dai, vice president and principal analyst at Forrester. “With ai_parse, parsing becomes declarative and model-driven, reducing engineering overhead,” Dai added.

Enterprises would also be able to extend the document parsing ability to as many documents as required with the help of an integration with Spark Declarative Pipelines, including the ability to parse documents automatically as they arrive, Databricks said.

“Large-scale, incremental document processing… allows seamless ingestion, retry logic, change detection, and orchestration of new documents arriving daily. This is invaluable for production AI, compliance, and business reporting, where data freshness and reliability are essential,” said Pareekh Jain, analyst at Jain Consulting.

Snowflake vs Databricks

Databricks’ ai_parse, at least to some extent, is similar to Snowflake’s recently showcased Agentic Document Analytics offering that is being marketed as a complementary approach to current RAG practices, allowing enterprises to query thousands of documents in one go via the use of data agents.

Snowflake’s Agentic Document Analytics combines the abilities of Snowflake’s existing Cortex AISQL functions, such as AI_PARSE_DOCUMENT, AI_EXTRACT, AI_FILTER, and AI_AGG, in the Intelligence platform to parse documents and analyze the contents, according to Baris Gultekin, vice president of AI at Snowflake.

Comparing Snowflake’s existing AI_PARSE_DOCUMENT function, which was introduced a year ago, to Agentic Document Analytics, Gultekin pointed out that while the parse function itself strengthens data quality for RAG by providing accurate retrieval context, Agentic Document Analytics enables quantitative and temporal analysis across those parsed results.

According to analysts, Databricks and Snowflake’s offerings would help enterprises cut down the complexity of workflows required to analyze unstructured data, especially documents.

Enterprises, historically, have had to build complex, slow, brittle OCR pipelines to bring data from documents, such as PDFs, into an AI workflow. That gave rise to RAG, which enabled semantic search over parsed text but still struggled with nuanced document structures like tables, said Bradley Shimmin, practice lead of data, analytics, and infrastructure at The Futurum Group.

To handle documents with tables, enterprises often chained additional LLM calls to extract and reconstruct tables as JSON, which was effective but risky due to hallucinations, Shimmin said, adding that instead of stitching together OCR, RAG, and custom extraction logic, Databricks’ ai_parse collapses the entire workflow into a single declarative SQL statement.

Databricks’ pitch for price performance

Databricks claims that its ai_parse function offers better price performance when compared to other similar functions from rivals, as well as vision language models.

“Price performance matters a lot. As an industry, we’re still figuring out how to optimize complex, agentic AI workflows, particularly in terms of how they manage context and memory assets over time. But even for basic data ingestion routines, this kind of effort can make a big difference, especially for enterprises that need to process millions, or even billions, of documents,” Shimmin said.

However, he warned that enterprises should do their own benchmarking tests and not just rely on Databricks’ claims.

Databricks’ pitch on price performance might give it an edge over Snowflake when it comes to enterprise customers, Shimmin said. “In a market where the two leaders have very similar top-line messaging, these kinds of cost savings for foundational workloads can make for a very compelling argument.”


OpenAI rolls out GPT-5.1 to refine ChatGPT with adaptive reasoning and personalization 13 Nov 2025, 10:05 am

OpenAI has introduced GPT-5.1, an update to its GPT-5 model, aiming to deliver faster responses, improved reasoning, and more flexible conversational controls as the company works to refine its ChatGPT experience for both consumer and enterprise users.

The release includes new Instant and Thinking variants designed to offer more adaptive reasoning and a broader range of personalization options, the company said in a blog post.

GPT-5.1 is available across ChatGPT’s free and paid tiers, with enterprise and education customers receiving a brief early-access period before it becomes the default model. OpenAI said the models are also accessible through the API.

In a separate Substack post, OpenAI CEO for Applications, Fidji Simo, said these chat models are “trained using the same stack as our reasoning models,” adding that they “score higher on factuality and complex problem-solving than GPT-5, while also introducing a more natural, conversational tone.”

The update expands ChatGPT’s personalization controls with presets such as professional, friendly, candid, quirky, efficient, cynical, and nerdy, and aims to improve the reliability of custom instructions so they persist across multi-turn conversations.

Implications for enterprises

Analysts point out that the real story with GPT-5.1 is not the headline features but the subtle behavioral shifts that remove a lot of the friction enterprises quietly learned to live with this past year.

“The model is quicker to pick up intent, less likely to wander, and noticeably better at keeping a steady voice from one message to the next,” said Sanchit Vir Gogia, chief analyst, founder, and CEO of Greyhound Research. “That alone cuts a surprising amount of hidden operational waste. Teams no longer need to rephrase the same request four different ways or spend half their time smoothing out tone inconsistencies that creep into customer-facing text. This is something CIOs would appreciate.”

The real test is whether it reduces manual clean-up. Enterprises want fewer correction cycles and escalations. They also want a model that does not start acting like a different personality mid-conversation.

“GPT-5.1 makes progress on all of that,” Gogia said. “Coding output is steadier, long-context reasoning is less fragile, and the model is less prone to slipping into verbose rambling or over-polite filler. You can feel the engineering underneath. The routing architecture and the dual-style reasoning pathways matter more than the marketing language around them.”

However, it may be challenging for CIOs to fully evaluate GPT-5.1’s improvements, as many changes focus on enhancing user experience through better tonality and reasoning, according to Anushree Verma, senior director analyst at Gartner.

“These updates increase the model’s immersive capabilities, capturing users’ attention and encouraging stronger engagement,” Verma added. “Other models have positioned tonality as a competitive differentiator, and GPT-5.1’s upgrades have been made with this in mind.”

OpenAI’s competitive position

The update comes as OpenAI faces stiff competition from rivals, including those in China, and after ChatGPT 5 faced criticism for issues during rollout.

“The shine of being the default choice has faded,” Gogia said. “Enterprises that once embraced OpenAI without hesitation are now mixing and matching models to suit workload, cost, and regulatory expectations. GPT-5.1 helps OpenAI reassert itself not by leaping ahead on raw capability, but by focusing on the fundamentals that shape enterprise confidence.”

But rivals like Claude, Gemini, Mistral, and emerging open-source models are vying for market share, and modern enterprise architectures increasingly assume a multi-model fabric as the norm.

“In such a situation, GPT-5.1 functions less as a single pillar and more as the strongest member of a larger toolkit,” Gogia said. “It will still be the preferred choice for deep analytical work and ambiguous, multi-step tasks, but it must now coexist with competitors that outperform it in cost-sensitive or domain-specific scenarios.”


Running managed Ray on Azure Kubernetes Service 13 Nov 2025, 9:00 am

The move to building and training AI models at scale has had interesting second-order effects; one of the most important is improving how we run and manage massively distributed computing applications. AI training and inferencing both require huge distributed applications to build and refine the models that are at the heart of modern machine learning systems.

As Brendan Burns, Microsoft CVP Azure OSS Cloud Native, notes in a recent Azure blog post, “Scaling from a laptop experiment to a production-grade workload still feels like reinventing the wheel.” Understanding how to break down and orchestrate these workloads takes time and requires significant work in configuring and deploying the resulting systems, even when building on top of existing platforms like Kubernetes.

Design decisions made in the early days of cloud-native development focused on orchestrating large amounts of data rather than managing significant amounts of compute, both CPU and GPU. Our tools make it hard to orchestrate these new workloads that need modern batch computing techniques. Burns’ blog post announced a new Azure partnership to help resolve these issues, working with Anyscale to use a managed version of Ray, its open source, Python-based tool, on Azure Kubernetes Service. You get the tools you need to build and run AI workloads without the work of managing the necessary infrastructure.

What is Ray?

Ray is a set of tools for building large-scale Python distributed applications, with a focus on AI. You can take existing code and quickly make it run on a distributed platform without changing how it works. It provides its own scheduling services for CPU and GPU operations and has a set of native libraries to help train and run AI models that work with familiar tools, including PyTorch.

Anyscale’s enterprise managed edition of Ray adds an enhanced runtime to speed up creating and running clusters and to improve how resources are allocated for both development and production. This all sits on top of AKS, using its provisioning and scaling features for the necessary distributed infrastructure. Thanks to its roots as an open source project founded by Anyscale’s team, it can provide direction and use Ray’s own ecosystem to extend the platform.

The combination makes sense. Microsoft has been using AKS to support both its own and third-party AI development and operations, and as a result, it has developed tools for working with GPU resources alongside the more familiar CPU options, with its own KAITO (Kubernetes AI Toolchain Operator) as well as with Ray.

The new service is currently available in a private preview, along with added support options for organizations using Anyscale’s commercially licensed managed Ray. In the meantime, Microsoft’s notes on using the open source version of Ray on Azure show how the company envisions users working with the combined platform.

Using Ray on Azure

As with any other open source Kubernetes project running on AKS, Microsoft doesn’t provide any support for Ray and redirects users to the Ray project. However, the build Microsoft offers has been compiled and tested by the AKS team and provides signed binaries and containers.

It’s used in conjunction with another open source tool: KubeRay, which provides a Kubernetes operator for Ray, allowing you to use familiar declarative techniques to configure and manage your installation. You’re not limited to using Ray for AI; any large-scale Python distributed application can take advantage of its core libraries. These help parallelize your code as well as build and deploy a cluster.

Ray provides a set of model libraries for AI development, each focused on a specific part of the model life cycle. For example, if you’re training a model in PyTorch, you’ll need to install Ray’s Train library alongside PyTorch. Ray provides functions that prepare both your model and your data for data-parallel operations, which can then be used inside a training function run by Ray’s TorchTrainer workers, ensuring available GPU resources are used to speed up training.
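A minimal sketch of that pattern, assuming Ray 2.x with the Train library installed (pip install "ray[train]" torch); the toy model, dataset, and two-GPU scaling configuration are placeholders for your own code:

import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

import ray.train.torch
from ray.train import ScalingConfig
from ray.train.torch import TorchTrainer


def train_loop_per_worker(config):
    # Toy data and model standing in for your real training code.
    dataset = TensorDataset(torch.randn(1024, 8), torch.randn(1024, 1))
    loader = DataLoader(dataset, batch_size=64, shuffle=True)
    model = nn.Linear(8, 1)

    # Ray prepares the model and data loader for data-parallel execution
    # (device placement, DistributedDataParallel wrapping, distributed sampling).
    model = ray.train.torch.prepare_model(model)
    loader = ray.train.torch.prepare_data_loader(loader)

    optimizer = torch.optim.SGD(model.parameters(), lr=1e-2)
    loss_fn = nn.MSELoss()
    for _ in range(config.get("epochs", 2)):
        for features, labels in loader:
            optimizer.zero_grad()
            loss = loss_fn(model(features), labels)
            loss.backward()
            optimizer.step()


# Each worker runs the loop above; the scaling config here assumes two GPU workers.
trainer = TorchTrainer(
    train_loop_per_worker,
    train_loop_config={"epochs": 2},
    scaling_config=ScalingConfig(num_workers=2, use_gpu=True),
)
result = trainer.fit()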

Other Ray tools allow you to quickly tune model hyperparameters, with functions that manage searches in just a few lines of code. Once a model has been trained and tuned, Ray supports running it in a scalable environment.

Start working with Ray on AKS

Getting started with Ray on AKS is simple enough, as Microsoft provides a set of samples that can automate the process of deploying a basic Ray implementation from a single shell script running in the Azure CLI’s Bash environment. It’s quick, but it’s a good idea to walk through the process manually to understand how KubeRay works. You’ll need some prerequisites: the Azure CLI with the AKS Preview extension, Helm, and a Terraform (or OpenTofu) client.

Along with enabling KubeRay, take your existing Ray-based PyTorch code and use it to build a Docker container that can be deployed across your AKS nodes. When your job runs, the container will be loaded by KubeRay and deployed to worker nodes ready for training.

If you’re deploying by hand, start with a KubeRay job description. This is a YAML file that contains descriptions of all the necessary resources needed to train your model: the number of pods, along with their CPUs, GPUs, and memory; the container that hosts the PyTorch model; and the number of Ray workers that will run the job. You can start relatively small, with as few as 8 virtual CPUs, adding more as job complexity increases or you need to run the job more quickly.

You can track training using Ray’s log files and then evaluate results using its dashboard. This will require configuring access and setting up a suitable Kubernetes ingress controller to provide dashboard access. If you’re tuning an existing model, you can use a similar architecture, with Azure storage for training and test data. Microsoft recommends blob storage, as it offers a good balance of performance and cost.

A platform for open source AI applications

PyTorch is one of the most popular tools for AI model development and tuning. With KubeRay and Ray on AKS, you can quickly work with models at scale, using code running on your laptop to train and tune your model in the cloud. You can also train and tune off-the-shelf, open source models from sites like Hugging Face and customize them for your specific use cases. This means you don’t have to invest in expensive GPUs or large data centers. Instead, you can treat Azure and Ray as a batch-processing environment that only runs when you need it, keeping costs down and letting you quickly deploy custom models in your own network.

There’s a lot more to modern AI than chatbots, and by supporting Ray, AKS becomes a place to train and tune computer vision and other models, using image data stored in Azure blobs or time-series operational data in Fabric, Azure’s big-data service. Once trained, those models can then be downloaded and used in your own applications. For example, you can use NPUs designed for computer vision to run custom-trained models that find flaws in products or that spot safety violations and then trigger warnings. Similar models working with log file data could spot fraud or request preemptive equipment maintenance.

By training and tuning on your own data and your own infrastructure, you get the model you need for a specific task that otherwise might be too expensive to implement. AKS and Ray provide an on-demand, cloud-native training environment, so you’re not only able to get that model in production quickly but also to keep it updated as you identify new source data that can make it more accurate or tuning parameters that will make it more responsive.

You can concentrate on building applications and let Microsoft manage your platform, ensuring you have an up-to-date, secure Kubernetes and Ray environment ready for your code and your users.


When will browser agents do real work? 13 Nov 2025, 9:00 am

In January 2025, OpenAI released Operator, the first large-scale agent powered by a computer-use model to control its own browser. The demo was impressive: an AI moving the mouse, clicking buttons, and performing actions like a human would. It was received as a major step toward general-purpose autonomy.

But just eight months later, in August, OpenAI quietly discontinued Operator and rolled it into ChatGPT’s new Agent Mode. Instead of a single, vision-only system, ChatGPT Agents gained access to both a visual browser and a text-based browser. The shift reflected a hard-earned truth: computer-use models don’t yet work reliably enough in production.

Computer-use models perceive and act like humans do. They analyze the browser screen as an image and issue clicks or text inputs at coordinates, which is powerful in theory, but fragile in practice. Rendering differences, latency, and the difficulty of parsing complex layouts all contribute to unreliability. For agents operating at enterprise scale, even a 1% failure rate can be unacceptable.


Vision-based agents

Vision-based agents treat the browser as a visual canvas. They look at screenshots, interpret them using multimodal models, and output low-level actions like “click (210, 260)” or “type ‘Peter Pan’.” This mimics how a human would use a computer: reading visible text, locating buttons visually, and clicking where needed.

The upside is universality: the model doesn’t need structured data, just pixels. The downside is precision and performance: visual models are slower, require scrolling through the entire page, and struggle with subtle state changes between screenshots (“Is this button clickable yet?”).


DOM-based agents

DOM-based agents, by contrast, operate directly on the Document Object Model (DOM), the structured tree that defines every webpage. Instead of interpreting pixels, they reason over textual representations of the page: element tags, attributes, ARIA roles, and labels.

A modern preprocessing technique called accessibility snapshots, popularized by Microsoft’s Playwright MCP server, transforms the live DOM into a structured, readable text form that language models can understand better than pure HTML. For example, a fragment of Google’s home page might look like:

- navigation [ref=e3]:
    - link "About" [ref=e4] -> https://about.google.com
    - link "Store" [ref=e5] -> https://store.google.com
- search [ref=e32]:
    - combobox "Search" [active]
    - button "Search by voice" [ref=e47]

This structured view lets models choose specific elements to act upon (“click ref=e47”) rather than guessing coordinates. DOM-based control is faster and more deterministic. Both are crucial for enterprise workflows that run thousands of browser sessions daily.


Hybrid agents: The current state of browser automation

In practice, both methods have their strengths. Vision models handle dynamic, canvas-based UIs (like dashboards or image-heavy apps). DOM-based models excel at text-rich sites like forms or portals. The best systems today combine both: using DOM actions by default and falling back to vision when necessary.
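As a rough illustration of that fallback pattern, the orchestration logic might look like the sketch below. The domAction and visionAction helpers are hypothetical stand-ins rather than a real library API; only page.screenshot() and page.mouse.click() are actual Playwright calls.

// Hypothetical hybrid step runner: try the DOM path first, fall back to vision.
async function runStep(page, step) {
  try {
    // Preferred path: act on a structured element reference from the accessibility snapshot.
    return await domAction(page, step.elementRef, step.action);
  } catch (err) {
    // Fallback path: ask a vision model for screen coordinates, then click them.
    const coords = await visionAction(await page.screenshot(), step.description);
    return await page.mouse.click(coords.x, coords.y);
  }
}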

OpenAI’s decision to deprecate Operator led directly to the creation of the new ChatGPT Agent, which embodies this hybrid approach. Under the hood, it can use either a text browser or a visual browser, choosing the most effective one per step. This is far more reliable than Operator’s pure computer-use model.

Models like Claude 4 and opencua-72b-preview show that visual grounding and faster perception are improving monthly. Computer-use models will continue advancing as multimodal architectures evolve. Eventually, pure vision agents may reach the precision and speed needed for mainstream deployment.

But in 2025, production systems are still hybrid. The most reliable browser agents orchestrate multiple techniques: DOM reasoning for structured elements, vision fallback for non-standard layouts, and deterministic scripting for validation and replay. The frontier isn’t yet a single model; it’s the composition of models, selectors, and orchestration frameworks that together make agents truly usable.

The future of browser agents lies not in vision or structure alone, but in orchestrating both intelligently.


Learning by doing: The next step for browser agents

Hybrid systems solve reliability for today, but the next challenge is adaptability. How can a browser agent not just complete a task once, but actually learn from experience and improve over time?

Running a browser agent once successfully doesn’t mean it can repeat the task reliably. The next frontier is learning from exploration: transforming first-time behaviors into reusable automations.

A promising strategy, now being deployed more and more, is to let agents explore workflows visually, then encode those paths into structured representations such as DOM selectors or code. Think of it as a two-stage process:

  1. Exploration phase: The agent uses computer-use or vision models to discover the structure of a new web page and record successful navigation paths.
  2. Execution phase: The agent compiles that knowledge into deterministic scripts (for example, Playwright, Selenium, or CDP (Chrome DevTools Protocol) commands) that repeat the process with high reliability, as sketched below.
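For instance, the compiled output of the execution phase might be a small, deterministic Playwright script like the one below. This is only an illustration: the URL, field labels, and success check are hypothetical values that would come from the exploration phase.

// A replay script of the kind the execution phase could generate (Node.js + Playwright).
const { chromium } = require('playwright');

(async () => {
  const browser = await chromium.launch();
  const page = await browser.newPage();

  await page.goto('https://example.com/login');                 // URL recorded during exploration
  await page.getByLabel('Email').fill('user@example.com');      // selectors discovered earlier
  await page.getByRole('button', { name: 'Sign in' }).click();
  await page.waitForURL('**/dashboard');                        // deterministic success check

  await browser.close();
})();

Because the result is ordinary code, the agent (or a human) can review it, version it, and re-run it without invoking a model at all.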

With new large language models excelling at writing and editing code, these agents can generate and improve their own scripts, creating a cycle of self-optimization. Over time, the system becomes similar to a skilled worker: slower on the first task, but dramatically faster on repeat executions.

This hybrid, self-improving approach—combining vision, structure, and code synthesis—is what makes browser automation increasingly robust. It’s not just about teaching models to click; it’s about enabling them to learn how to automate.

New Tech Forum provides a venue for technology leaders—including vendors and other outside contributors—to explore and discuss emerging enterprise technology in unprecedented depth and breadth. The selection is subjective, based on our pick of the technologies we believe to be important and of greatest interest to InfoWorld readers. InfoWorld does not accept marketing collateral for publication and reserves the right to edit all contributed content. Send all inquiries to doug_dineley@foundryco.com.


Java Stream API tutorial: How to create and use Java streams 13 Nov 2025, 9:00 am

You can think of a Java stream as a pipeline through which data flows. Instead of manually writing loops and conditionals to process a list, you tell Java what should happen to each element, and the Java Stream API takes care of how it happens internally.

A Java stream doesn’t hold data. Instead, it operates on an existing data source such as a List, Set, Map, or array. The stream applies a series of operations to the data source.

This article introduces you to Java streams. You’ll learn how to create streams from Java collections, get your first look at a stream pipeline, and see how lambdas, method references, and other functional programming elements work with Java streams. You’ll also learn how to combine collectors and optional chaining with Java streams, and when to use or not use streams in your programs.

Streams versus collections in Java

Many developers get tripped up by the difference between Java streams and Java collections:

  • Collections (like ArrayList or HashSet) are used for storage. They keep data in memory for you to access.
  • Streams are about behavior. They describe what to do with data, not how to store it.

As an analogy, consider that a collection is the cupboard holding ingredients, whereas a stream is the recipe for making them into a meal.

Streams give Java a functional and declarative feel by describing what to do instead of how to do it.

Why developers love Java streams

Java developers appreciate and use streams for a variety of reasons:

  • Cleaner code that replaces nested loops and conditionals.
  • Less boilerplate; no more manual for loops.
  • Readable logic—stream pipelines read like natural language.

We can begin to see these differences by comparing loops and streams.

Loops vs. streams

Streams often replace traditional loops in Java, and once you’ve started using them, it’s hard to go back. Here’s an example of a classic for loop:

List names = List.of("patrick", "mike", "james", "bill");

List result = new ArrayList();
for (String name : names) {
    if (name.length() > 4) {
        result.add(name.toUpperCase());
    }
}
Collections.sort(result);
System.out.println(result);

And here is the Java streams version:

List names = List.of("patrick", "mike", "james", "bill");

List result = names.stream()
        .filter(name -> name.length() > 4)
        .map(String::toUpperCase)
        .sorted()
        .toList();

System.out.println(result);

Unlike a loop, the Stream reads almost like English: “Take the names, filter by length, convert to uppercase, sort them, then collect to a list.”

When run, the output will be: [JAMES, PATRICK].

Creating Java streams from collections

Streams can start from many sources. Think of all the examples below as ways to “turn on the tap.”

Here’s how to create a Stream from a collection—in this case, a List of names:

List names = List.of("James", "Bill", "Patrick");
Stream nameStream = names.stream();

Here’s how to create a Stream from a Map:

Map<Integer, String> idToName = Map.of(1, "James", 2, "Bill");
Stream<Map.Entry<Integer, String>> entryStream = idToName.entrySet().stream();

And here is one created from an array:

String[] names = {"James", "Bill", "Patrick"};
Stream<String> nameStream = Arrays.stream(names);

You can also create a Stream using Stream.of():

Stream<Integer> numberStream = Stream.of(1, 2, 3, 4, 5);

Using Stream.of(), you can pass in any kind of value or object to create a Stream. It’s a simple way to quickly create a stream when you don’t already have a collection or array. Perfect for small, fixed sets of data or quick tests.

Using Stream.generate() (infinite streams)

The Stream.generate() method creates an infinite stream; it keeps producing values while the pipeline requests them:

Stream.generate(() -> "hello")
      .forEach(System.out::println);

This Stream never stops printing. Use limit() to control it:

Stream.generate(Math::random)
      .limit(5)
      .forEach(System.out::println);

Both Stream.generate() and Stream.iterate() can produce infinite sequences. Always limit or short-circuit them to avoid endless execution.
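Stream.iterate() isn’t shown above, so here is a minimal example. It starts from a seed value and applies a function repeatedly; as before, limit() keeps the sequence finite:

List<Integer> evens = Stream.iterate(0, n -> n + 2)
      .limit(5)
      .toList();

System.out.println(evens);

The output will be: [0, 2, 4, 6, 8].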

If you need to safely return an empty stream rather than null, use Stream.empty():

Stream<String> emptyStream = Stream.empty();

This avoids null checks and makes methods returning streams safer and cleaner.

Intermediate and lazy stream operations

Streams have intermediate (lazy) and terminal (executing) operations. Together, these two types of operations form your data pipeline.

Intermediate operations (transforming on the way)

Intermediate streams operations don’t trigger execution right away. They just add steps to the recipe:

  • map(): Transforms each element.
  • filter(): Keeps only elements that match a condition.
  • sorted(): Arranges elements in order.
  • distinct(): Removes duplicates.
  • limit()/skip(): Trims the stream.
  • flatMap(): Flattens nested structures (e.g., lists of lists) into one stream.
  • peek(): Lets you observe elements as they pass through (useful for debugging and logging; don’t rely on it for program logic).
  • takeWhile(predicate): Keeps taking elements until the predicate first fails (like a conditional limit).
  • dropWhile(predicate): Skips elements while the predicate is true, then keeps the rest. Both are shown in the short example after this list.
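Here is the short example of takeWhile() and dropWhile() mentioned above (both were added in Java 9):

List<Integer> numbers = List.of(1, 2, 3, 10, 4, 5);

System.out.println(numbers.stream().takeWhile(n -> n < 5).toList()); // [1, 2, 3]
System.out.println(numbers.stream().dropWhile(n -> n < 5).toList()); // [10, 4, 5]

Both stop evaluating the predicate at the first failure (the 10), which is why the 4 and 5 at the end are dropped by takeWhile() and kept by dropWhile().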

Streams are lazy

Streams prepare all their steps first (filtering, mapping, sorting), but nothing happens until a terminal operation triggers processing. This lazy evaluation makes them efficient by processing only what’s needed.

Take this stream pipeline, for example:

List names = List.of("james", "bill", "patrick", "guy");
names.stream()
     .filter(n -> n.length() > 3)  // keep names longer than 3 characters
     .map(String::toUpperCase)     // convert to uppercase
     .sorted();                    // sort alphabetically

System.out.println("List result: " + names);

The result will be: [james, bill, patrick, guy].

At first glance, it looks like this pipeline should:

  1. filter out "guy" (since its length isn’t greater than 3),
  2. map the rest to uppercase, and
  3. sort them.

But in reality, the pipeline does none of that.

The reason is that streams in Java are lazy.

  • All those calls (filter, map, sorted) are intermediate operations.
  • They don’t run immediately. Instead, they “record the plan.”
  • The plan only runs when you add a terminal operation like .toList(), forEach(), or count().

Since there’s no terminal operation in the above code, the pipeline is discarded and the original list prints unchanged.

Terminal operations (serving the dish)

Now we can look at the second kind of stream operation. Terminal operations trigger the stream to run and produce a result:

  • forEach(): Do something with each element.
  • collect(): Gather elements into a collection.
  • toList(): Collect all elements into an immutable List (Java 16+).
  • reduce(): Fold elements into a single result (sum, product, etc.).
  • count(): How many items?
  • findFirst(): Returns the first element that matches the filtering conditions (useful when order matters).
  • findAny(): Returns any matching element (especially useful in parallel streams where order is not guaranteed).
  • toArray(): Collect results into an array.
  • min(Comparator) / max(Comparator): Find the smallest or largest element based on a comparator.
  • anyMatch(predicate): Does any element match?
  • allMatch(predicate): Do all elements match?
  • noneMatch(predicate): Do no elements match?

Here’s an example of a stream with terminal operations:

List names = List.of("james", "bill", "patrick", "guy");

List result = names.stream()
     .filter(n -> n.length() > 3)
     .map(String::toUpperCase)
     .sorted()
     .toList();   // Terminal operation method triggers action here

System.out.println(result);

In this case, the output will be: [BILL, JAMES, PATRICK].
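The matching and searching operations from the list above follow the same pattern. Here’s a quick sketch using the same names list:

boolean anyLong = names.stream().anyMatch(n -> n.length() > 6);  // true ("patrick")
long count = names.stream().filter(n -> n.length() > 3).count(); // 3

String shortest = names.stream()
     .min(Comparator.comparingInt(String::length))
     .orElse("none");                                            // "guy"

Note that min() returns an Optional, which is covered in the Optional chaining section later in this article.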

Streams are single use

Once a stream has been processed, it’s consumed and can’t be reused. A terminal operation closes the stream:

List names = List.of("James", "Bill", "Patrick");

Stream s = names.stream();
s.forEach(System.out::println); // OK
s.count(); // IllegalStateException — already processed

In this code, the first call pulls all data through the pipeline, and after that it’s closed. Create a new one if needed:

long count = names.stream().count(); // OK: new stream instance

Flow of a stream pipeline

To conclude this section, here is a stream pipeline with both intermediate and terminal streams operations:

List<String> result = names.stream()     // Source
    .filter(n -> n.length() > 3)         // Intermediate operation
    .map(String::toUpperCase)            // Intermediate operation
    .sorted()                            // Intermediate operation
    .toList();                           // Terminal operation

Working with collectors

In addition to streams, Java 8 introduced collectors, which you can use to describe how to gather (collect) processed data.

Collecting to a list creates a new unmodifiable list of names longer than three characters. Immutable results make stream code safer and more functional:

List<String> list = names.stream()
    .filter(n -> n.length() > 3)
    .toList();  // Java 16+

Here, we collect results into a set, automatically removing duplicates. Use a set when uniqueness matters more than order:

Set<String> set = names.stream()
    .map(String::toUpperCase)
    .collect(Collectors.toSet());

Here, we collect to a Map, where each key is the String’s length and each value is the name itself:

Map<Integer, String> map = names.stream()
    .collect(Collectors.toMap(
        String::length,
        n -> n
    ));

If multiple names share the same length, a collision occurs. Handle it with a merge function:

Map<Integer, String> safeMap = names.stream()
    .collect(Collectors.toMap(
        String::length,
        n -> n,
        (a, b) -> a   // keep the first value if keys collide
    ));

Joining strings

Collectors.joining() merges all stream elements into one String using any delimiter you choose. You can use " | ", "; ", or even "\n" to separate values however you like:

List names = List.of("Bill", "James", "Patrick");

String result = names.stream()
    .map(String::toUpperCase)
    .collect(Collectors.joining(", "));

System.out.println(result);

The output here will be: BILL, JAMES, PATRICK.

Grouping data

Collectors.groupingBy() groups elements by key (here it’s string length) and returns a Map<Integer, List<String>>:

List names = List.of("james", "linus", "john", "bill", "patrick");

Map> grouped = names.stream()
    .collect(Collectors.groupingBy(String::length));

The output will be: {4=[john, bill], 5=[james, linus], 7=[patrick]}.

Summarizing numbers

You can also use collectors for summarizing:

List<Integer> numbers = List.of(3, 5, 7, 2, 10);

IntSummaryStatistics stats = numbers.stream()
    .collect(Collectors.summarizingInt(n -> n));

System.out.println(stats);

The output in this case will be: IntSummaryStatistics{count=5, sum=27, min=2, average=5.4, max=10}.

Or, if you want just the average, you could do:

double avg = numbers.stream()
    .collect(Collectors.averagingDouble(n -> n));

Functional programming with streams

Earlier, I mentioned that streams combine functional and declarative elements. Let’s look at some of the functional programming elements in streams.

Lambdas and method references

Lambdas define behavior inline, whereas method references reuse existing methods:

names.stream()
    .filter(name -> name.length() > 3)
    .map(String::toUpperCase)
    .forEach(System.out::println);

map() vs. flatMap()

As a rule of thumb:

  • Use map() when you have one input and want one output.
  • Use flatMap() when you have one input and want many outputs (flattened into a single stream).

Here is an example using map() in a stream:

List<List<String>> nested = List.of(
    List.of("james", "bill"),
    List.of("patrick")
);

nested.stream()
      .map(list -> list.stream())
      .forEach(System.out::println);

The output here will be:

java.util.stream.ReferencePipeline$Head@5ca881b5
java.util.stream.ReferencePipeline$Head@24d46ca6

There are two lines because there are two inner lists, so map() produces two Stream objects rather than their contents. Also note that the hash values will vary.

Here is the same stream with flatMap():

nested.stream()
      .flatMap(List::stream)
      .forEach(System.out::println);

In this case, the output will be:

james
bill
patrick

For deeper nesting, use:

List<List<List<String>>> deep = List.of(
    List.of(List.of("James", "Bill")),
    List.of(List.of("Patrick"))
);

List<String> flattened = deep.stream()
    .flatMap(List::stream)
    .flatMap(List::stream)
    .toList();

System.out.println(flattened);

The output in this case will be: [James, Bill, Patrick].

Optional chaining

Optional chaining is another useful operation you can combine with streams:

List names = List.of("James", "Bill", "Patrick");

String found = names.stream()
    .filter(n -> n.length() > 6)
    .findFirst()
    .map(String::toUpperCase)
    .orElse("NOT FOUND");

System.out.println(found);

The output will be: NOT FOUND.

findFirst() returns an Optional, which safely represents a value that might not exist. If nothing matches, .orElse() provides a fallback value. Methods like findAny(), min(), and max() also return Optional values for the same reason.

Conclusion

The Java Stream API transforms how you handle data. You can declare what should happen—such as filtering, mapping, or sorting—while Java efficiently handles how it happens. Combining streams, collectors, and optionals makes modern Java concise, expressive, and robust. Use streams for transforming or analyzing data collections, not for indexed or heavily mutable tasks. Once you get into the flow, it’s hard to go back to traditional loops.

As you get more comfortable with the basics in this article, you can explore advanced topics like parallel streams, primitive streams, and custom collectors. And don’t forget to practice. Once you understand the code examples here, try running them and changing the code. Experimentation will help you acquire real understanding and skills.


Red Hat OpenShift 4.20 boosts AI workloads, security 13 Nov 2025, 1:37 am

Red Hat OpenShift 4.20, the latest version of Red Hat’s Kubernetes-based hybrid cloud application platform, is now available. The Red Hat OpenShift 4.20 release features capabilities for accelerating AI workloads and strengthening security, according to Red Hat.

Introduced November 11, OpenShift 4.20 can be accessed from the Red Hat OpenShift product page. For AI, new capabilities are designed to streamline deployment and management of complex AI workloads, Red Hat said. The LeaderWorkerSet (LWS) API for AI workloads, for example, simplifies management of large, distributed AI workloads with automated orchestration and scaling, according to Red Hat. Deployment time is reduced using image volume source for AI workloads, allowing new models to be integrated in minutes without rebuilding application containers. These features together provide functionality for Red Hat OpenShift AI or other AI platforms to help users move more easily from experimentation to production. Cluster management is enabled via Model Context Protocol (MCP) using tools such as Visual Studio Code.

The 4.20 version of Red Hat OpenShift also helps secure the main traffic between control-plane components with initial support for post-quantum cryptography, delivering long-term cryptographic protection for critical communications, Red Hat said. The release also brings greater operational flexibility to the core platform and strengthens security capabilities for Red Hat OpenShift Platform Plus customers. This includes the availability of Red Hat Advanced Cluster Security 4.9 and enhancements to Red Hat Trusted Artifact Signer and Red Hat Trusted Profile Analyzer to help manage and analyze security data.

This release also adds CPU load-aware rebalancing and Arm support to boost performance and resource utilization for virtualized workloads, said the company. Expanded hybrid cloud support extends Red Hat OpenShift Virtualization to bare-metal deployments on Oracle Cloud, providing more control over infrastructure and the placement of data. And with enhanced storage offloading functionality, the migration toolkit for virtualization accelerates VM migration from legacy solutions to OpenShift Virtualization through existing storage resources.


Microsoft releases ‘AI-native’ Visual Studio 2026 12 Nov 2025, 10:27 pm

Visual Studio 2026, the latest version of Microsoft’s signature IDE, is now generally available. Touted as an “AI-native intelligent development environment,” the IDE also features faster startup and improved user experience.

Announced November 11, the update can be downloaded from the Visual Studio homepage. Visual Studio 2026 uses AI for tasks such as complex debugging, performance profiling, and application modernization, Microsoft said. AI removes friction and surfaces insights, enabling developers to move faster without disrupting flow, the company said. In addition to AI-driven development, the release announcement noted that Visual Studio 2026 features performance improvements and a redesigned user experience.

Visual Studio 2026 is compatible with projects and extensions from Visual Studio 2022. The IDE is decoupled from build tools, so developers can update to Visual Studio 2026 without affecting .NET or C++ compilers, said the company. Following the initial update, automatic monthly updates will be provided.

Visual Studio 2026 was released in preview in September 2025. The 2026 update includes the following changes and additions:

  • A refreshed interface aligned with the Fluent UI design system to improve usability, accessibility, and visual clarity.
  • Full support available for .NET 10 and C# 14.
  • Test code coverage, available in Visual Studio Community and Professional editions.
  • New C# and C++ agents, which can be used to improve precision and speed.
  • The new “Did You Mean” feature, which intelligently detects intent and suggests better matches to improve search performance and results.
  • A new “Editor Appearance” setting that focuses on the editor’s look and feel. This setting can be used to match the overall IDE theme but also works independently, allowing users to customize their coding environment without having to align with the broader IDE.

User feedback helped fix more than 5,000 bugs and add more than 300 feature requests leading up to the Visual Studio 2026 release, according to Microsoft.


Malicious npm package sneaks into GitHub Actions builds 12 Nov 2025, 12:00 pm

A malicious npm package named “@acitons/artifact” was found impersonating the legitimate “@actions/artifact” module, directly targeting the CI/CD pipelines within GitHub Actions workflows.

According to Veracode findings, the package was uploaded on November 7 and was designed to trigger during the build process of GitHub-owned repositories. Once executed inside a CI/CD runner, the payload captures any tokens available to that build environment and then uses those credentials to publish malicious artifacts, effectively impersonating GitHub itself.

“This incident isn’t just about a malicious npm package, it is about the blind trust many organizations place in the modern supply chain,” said Randolph Barr, CISO at Cequence Security. “Most organizations focus their controls on runtime environments, yet the CI/CD pipeline often runs with higher privilege than any developer. A single typosquatted dependency can silently execute code during a build, access repository tokens, and impersonate an organization, just as this attack attempted to do with GitHub’s own repositories.“

The malicious package picked up over 260k downloads before detection, and a total of six versions were uploaded, none detectable by "any popular anti-virus" products, Veracode researchers noted in a blog post.

GitHub says that the packages were uploaded internally as part of its red teaming efforts. “The packages referenced in Veracode’s blog were part of a tightly controlled exercise conducted by GitHub’s Red Team,” a GitHub spokesperson told CSO. “GitHub takes security seriously and regularly tests its security posture through rigorous, realistic Red Team exercises to ensure resilience against current threat actor techniques. At no point were GitHub systems or data at risk.”

Hijacking the GitHub Actions build process

On the surface, the @acitons/artifact package looked normal, with its metadata describing it as “actions artifact lib” and its homepage and repository URLs closely mirroring those of the legitimate GitHub project. But embedded inside was a post-install hook that downloaded and executed an obfuscated shell script named “harness.”

Veracode’s analysis showed that this script, compiled with a shell-script compiler tool, contained a time-based kill switch set to deactivate after November 6, 2025, likely to evade detection after a brief active window. Once invoked, the harness would fetch a JavaScript file (“verify.js”) meant to check whether the build environment belonged to GitHub and, if so, exfiltrate GitHub Actions tokens. These tokens could then be misused to impersonate GitHub and publish malicious releases.

“Typosquatting is a well-known and growing threat vector in software supply chains whereby attackers publish packages with similar names as legitimate ones and then wait for a mistake to happen, bringing the victim to their repository to install malicious code by mistake,” explained Boris Cipot, Senior Security Engineer at Black Duck. “This attack strategy is designed to exploit typos and to leverage the automated nature of CI/CD pipelines.”

Cipot added that the use of a post-install hook and a short-lived obfuscated payload shows a deliberate attempt to blend in with normal build activity.

Lessons in defense

Barr pointed out that higher privileges in CI/CD pipelines make them an ideal target. Attackers who compromise a build runner can inject code at the source, sign releases with legitimate credentials, or push authentic-looking artifacts.

Mitigations, Cipot recommended, would include short-lived, scoped tokens with regular secret rotations. Automated scanning for suspicious packages using tools like Socket.dev or Phylum might also help stay ahead of the threat. Other ways to verify package authenticity include checksum validation and emerging standards like Sigstore, he added.

Jason Soroko, senior fellow at Sectigo, advises an immediate response for teams potentially affected. “Search source code, lockfiles, caches, and registries for @acitons and 8jfiesaf83 then quarantine any runners that fetched them,” he said. “Rotate all tokens and review artifacts and package publish history for the period from October 29 to November 6, 2025.”


Meta’s SPICE framework pushes AI toward self-learning without human supervision 12 Nov 2025, 11:37 am

Meta researchers have unveiled a new reinforcement learning framework called SPICE (Self-Play in Corpus Environments) that enables large language models (LLMs) to improve their reasoning skills without human supervision.

Developed with the National University of Singapore, SPICE trains a single model to act both as a Challenger, which generates complex, document-based problems, and a Reasoner, which solves them.

By grounding the learning process in real-world text corpora rather than synthetic data, the system avoids the hallucination loops that have plagued earlier self-play methods. It achieves an average improvement of nearly 10% in mathematical and general reasoning benchmarks.

The researchers described the approach as a “paradigm shift” toward AI systems that can self-improve through interaction with the vast, verifiable knowledge embedded in web documents rather than static human-curated datasets.

Why self-improving AI is difficult

The idea of self-improving AI has begun to take shape with the rise of LLMs capable of reasoning. However, most existing methods face fundamental barriers after some initial progress.

“Without external grounding, models inevitably plateau or collapse due to two critical issues,” the researchers said in the paper. “(1) hallucination amplification, where factual errors in both generated questions and answers compound as models train on their own unverifiable synthetic data, and (2) information symmetry, where both the problem generator and solver share the same knowledge base, preventing genuine challenge and leading to simpler, more repetitive patterns.”

Even new techniques that try to keep training data diverse, such as variational synthesis, still run into limits. They can only work with what was already captured during pretraining, essentially remixing the same information in new ways.

What makes SPICE effective

SPICE is built on the concept that a single LLM assumes two alternating roles, one that creates challenges and another that tries to solve them.

In one phase, the model acts as the Challenger, drawing information from a large document corpus to generate complex, document-grounded questions. In the next phase, it switches roles to become the Reasoner, attempting to answer those questions without seeing the source material.

The Challenger earns higher rewards when it creates problems that sit right at the edge of what the Reasoner can handle, making the tasks difficult but still solvable. The Reasoner is rewarded for producing correct answers.

This back-and-forth process, supported by real-world data, allows the system to keep discovering new challenges and improving its ability to solve them without human supervision.

This approach removes the verification bottleneck that has limited earlier research to specialized areas such as mathematics and coding. Because the answers are based on real documents, the system can verify them against factual sources rather than relying on synthetic or assumed data.

What the tests show


When tested across different LLMs, the researchers found that SPICE showed clear and consistent improvements in reasoning performance.

On the Qwen3 4B model, performance rose from 35.8 to 44.9 percent, while the larger Qwen3 8B model climbed from 43.0 to 48.7 percent. A stronger impact was seen in OctoThinker models, with improvements from 14.7 to 25.2 percent on the 3B version and from 20.5 to 32.4 percent on the 8B version.

“The adversarial dynamics between Challenger and Reasoner create an automatic curriculum: the fixed Reasoner’s pass rate decreases from 55% to 35% as it learns to generate progressively harder problems, while the fixed Challenger’s pass rate increases from 55% to 85%, indicating successful co-evolution of both roles,” the study said. 

The researchers also found that grounding the training process in real documents was essential for lasting improvement.

Models trained without this external reference quickly hit a ceiling and stopped getting better. But when SPICE drew on real-world text, it kept progressing steadily, using fresh document material to generate new and more complex challenges throughout training.

Implications of the study

By using large document collections as external sources of knowledge, SPICE helps models improve instead of stagnating on their own data. Industry analysts say such frameworks could eventually influence how enterprises train domain-specific AI models, but adoption will come with new responsibilities.

“SPICE opens new possibilities for adaptive AI, but enterprises can’t afford to set it and forget it,” said Tulika Sheel, senior VP at Kadence International. “Self-improving systems need self-checking mechanisms. Human oversight, audit trails, and compliance guardrails must stay front and center.”

Sheel noted that while the Challenger–Reasoner setup could, in theory, be replicated with corporate data such as financial or legal documents, it would demand “deep infrastructure, clean datasets, and a strong focus on transparency.”

She also warned that autonomous learning loops introduce risks like bias amplification and compliance drift. “SPICE nudges AI closer to self-sufficiency, but autonomy without accountability is dangerous,” she said.

Anish Nath, practice director at Everest Group, suggested that enterprises would benefit more from frameworks like SPICE by treating them as a training capability, not autonomy in production.

“Run self-play in sandboxes with gated releases; start on low-risk/internal workflows, then graduate to critical processes as evidence accumulates,” Nath said. “Enforce guardrails: schema-constrained outputs, policy engine, least-privilege tool whitelists, drift/anomaly detection, signed actions + audit trails, rollback/kill-switches, and human approvals for high-impact actions.”

Nath added that self-generated training data does point toward autonomous development loops, but warned of risks such as model collapse, data poisoning, and untracked drift. “These can be mitigated with independent evaluation models, provenance tracking, versioned datasets, and human gates for capability upgrades,” he said. “Improvement has to remain controlled, auditable, and compliant.”


Revisiting Mojo: A faster Python? 12 Nov 2025, 9:00 am

When the Mojo language first appeared, it was promoted as being the best of two worlds, bringing the ease of use and clear syntax of Python, along with the speed and memory safety of Rust.

For some time, the only way to evaluate those claims was using an online notebook environment that ran Mojo code on remote servers. More recently, the Mojo compiler has been released as a standalone download for Mac and Linux. (Microsoft Windows is not yet supported directly, but it’s possible to run Mojo by way of WSL2.)

In this article, we’ll look at what it’s like to use Mojo locally. We’ll also discuss how Mojo resembles Python, how it’s different, and what it has to offer to programmers familiar with Python and other languages.

Mojo language basics

When Mojo was first announced, it was easy to think of it as a “superset” of Python, where existing Python programs would also be valid Mojo programs. It has since become clear that Mojo’s goal is not to provide inherent compatibility with Python—that is, it isn’t meant to be a runtime for existing Python programs. Instead, Mojo aims to provide a syntax that’s familiar and comfortable to Python users, but a feature set that’s more suited to lower-level programming than Python—for instance, to allow manual memory management in the style of Rust or C++.

Where full compatibility with Python programs and the Python ecosystem is required, Mojo can call out to the Python runtime and use it to handle those cases. The disadvantage there is performance, since calls to the Python runtime and anything in its sphere incur a cost. As a workaround, you could use Python in cases where you needed its ecosystem, and Mojo where your priority was performance.

Mojo’s most touted advantage over Python is that it is compiled ahead of time to machine-native code, using the LLVM toolchain. This gives Mojo programs an inherent speed advantage, although that advantage holds up best when working with features specific to Mojo.

Python features are likely to come at the cost of emulating Python’s dynamic behaviors, which are inherently slow—or again, by just using the Python runtime. That said, Mojo has native behaviors that can replace some of those things. For instance, in lieu of Python’s non-native integers of any size, Mojo can support integers of up to 256 bits that are actually aliases for fast SIMD operations.

Comparing syntax: Mojo vs. Python

Many of Mojo’s native language features do one of two things: They’re either entirely new features not found in Python or a more performant expansion of a Python feature. In the latter case, enhanced performance comes at the cost of losing Python’s dynamism.

In Python, for instance, there is no way to formally declare variable references; variables are just created as needed. In Mojo, you can use the keyword var to make an explicit variable declaration in the current scope. The two languages also handle variable scoping differently. In Mojo, a variable declared inside a nested code block, such as an if block, is scoped to that block (although you can pre-declare variables in the enclosing scope). In Python, a variable declared in an if block persists after the block ends.

Mojo also has its own struct keyword that stands in contrast to Python’s class. Classes are just Python classes, with all the dynamic (and slow) behaviors you’d expect. The struct types, though, are more like their C/C++ and Rust counterparts, with fixed layouts determined at compile time but optimized for machine-native speed.

Another Mojo keyword designed to distinguish Mojo’s behaviors from Python’s is fn. You can use def or fn to define a function, but fn will only allow you to raise errors when they are explicitly defined. Typically, a fn function handles all its own error conditions internally. This avoids generating potentially unneeded runtime error-handling code, which could impact performance.

The following code snippet, a struct for a point-like entity, shows all these principles in action:

struct Point:
    var x: Int
    var y: Int

    fn __init__(out self, x: Int, y: Int):
        self.x = x
        self.y = y

As a Python developer, you’ll notice immediately how closely Mojo code resembles Python’s—the use of whitespace for syntax, for instance. But the differences add up.

The Mojo compiler and toolchain

If you want to use Mojo on your own system, Modular provides downloadable binaries. If you’re already a Python user, the easiest way to get started with Mojo is to set up a Python virtual environment and install Mojo as if it were a Python package. This also makes cross-integration with Python easier, as any third-party Python libraries needed for a Mojo project can be installed and used in that venv. (More on this later.)

Once you have it set up, running Mojo programs is as simple as writing a file with a .mojo extension and using mojo filename.mojo to run it. The startup time for a Mojo program is noticeably longer than for a comparable Python program; that’s because the Mojo program is compiled to native code. If you use mojo build filename.mojo, that compiles the program in question to a standalone binary, and lets you reuse it without having to recompile. Compiled Mojo binaries can be quite compact; a simple “Hello, World!” program compiles to a 19K binary on Linux.

The Mojo toolchain can also be used with Modular’s package manager and project management tool, pixi. You can use pixi to manage Mojo-only projects as well as projects that blend Mojo and Python, as it supports all the metadata used in Python projects (e.g., pyproject.toml). The tool also provides lockfile and environment-management mechanisms. Some of the demo projects in the Mojo repository use pixi, but it’s entirely possible to keep using pip or uv if you’ve already invested in those tools.

Working with Python in Mojo

When you write Mojo code, the default assumption is that everything you type is Mojo. If you want to use Python features, you have to import them specifically. For instance, if you wanted to use NumPy from Python to do something specific to it (like integrating it with an existing workflow), you’d do something like this:

from python import Python

def main():
    np = Python.import_module("numpy")
    rand_array = np.random.rand(32)

The Mojo module python provides interop with the Python ecosystem as a whole, and the Python.import_module method works like Python’s own import mechanism. Every standard library item in Python, along with every third-party module installed into the virtual environment you’re using, can be imported this way. Imports do have some restrictions—you can’t import into the top level of a Mojo module, and you can’t emulate from x import y behaviors yet.

What’s key about Python/Mojo interop is that all native Python operations use an instance of the Python runtime. They aren’t emulated in Mojo itself. The advantage here is all Python behaviors are exactly what you’d expect them to be. But the downside is that every call across the Mojo/Python boundary incurs a performance cost. This isn’t peculiar to Mojo; any interop between Python and another language through Python’s foreign function interface sustains a per-call overhead. The common way to handle this is just to reduce the number of calls in either direction, by batching operations.

Mojo can also be called from Python in cases where you run Python programs in the same environment as Mojo. The mechanism for doing this is more complex, though, and requires some boilerplate. But the end result allows Python to see and use Mojo modules as Python extension modules, as if they were written in C or Rust, or by way of Python’s Cython project.

Could Mojo replace Python?

Mojo was originally touted as a language for data science and machine learning, but that pitch has since been refined. The About Mojo section of the language’s manual describes Mojo as “a systems programming language specifically designed for high-performance AI infrastructure and heterogeneous hardware.” Python, by contrast, is not intended to be a systems language, but has made a place for itself in the AI/ML world as a “glue language,” providing a convenient interface to things that by themselves aren’t convenient.

Replacing Python isn’t simply a matter of creating a faster language, because speed of execution was never Python’s draw to begin with. Speed of development and a robust ecosystem have always come first. For Mojo to replace Python, it would need to not only compete but excel in both of those categories. It would need to not just make use of existing Python libraries—since that comes at a potential performance cost—but create its own native equivalents for them. The same goes for having Mojo eclipse Python generally, such as in web development. Everything required to do that would take years to set in motion.

In the long run, Mojo is likely to be best used in the same way other languages are, as a complement to Python, enabling certain things that are still difficult to do in Python. In the meantime, Mojo can continue to grow and will eventually find its own niche.


Node.js tutorial: Get started with Node 12 Nov 2025, 9:00 am

Node.js is a popular and versatile cross-platform JavaScript runtime environment. Node was the first runtime to allow developers to run JavaScript outside the browser, opening a new world of possibilities in server-side JavaScript. Its ease of use, massive ecosystem and performance characteristics have continued to secure its place as one of the most important technologies of the modern web.

Anytime you need to run JavaScript on the server—be it for a systems utility, a REST API, data processing, or anything else—Node is an excellent choice. There are newer runtimes, namely Deno and Bun, but Node remains the standard for server-side JavaScript.

Also see: 10 JavaScript concepts you need to succeed with Node.

Getting started with Node

If you haven’t already experienced Node, this article will introduce you. We’ll step through installing Node and the NPM package manager, spinning up a simple web server, and using the Node cluster module to take advantage of multiple CPU cores.

We’ll also look at using the NPM package manager to install additional Node modules and other JavaScript packages. And we’ll dip a toe into using a Node framework, in this case the ubiquitous Express server, to create more feature-rich and flexible Node.js servers. Let’s get started!

Installing Node and NPM

There are a few ways to install Node, including the installer that Node itself provides, but the recommended way is with a version manager. The most common version manager is NVM. This makes it easy to install Node and change versions when you need to. (There is also a Microsoft Windows-specific version called nvm-windows.)

NVM can be installed with an installer or using a CLI. In the following example, we use curl:


$ curl -o- https://raw.githubusercontent.com/nvm-sh/nvm/v0.40.3/install.sh | bash

Once you have NVM installed, installing the most recent version of Node is simple:


$ nvm install node

The node argument is an alias for the latest version, so this installs the most recent release. Mine is Node 24.9.0, so I activate it with:

$ nvm use 24.9.0

Anytime you need to install another version of Node, you can use nvm install and nvm use to switch between them.

You should now see Node available at your command prompt:


$ node -v

v24.9.0

When you install Node this way, the Node package manager (NPM) is also installed:

$ npm -v

11.6.0

Note that using NVM also avoids potential permissions issues with NPM packages when using the installer.

A simple web server in Node

To start simply, we can use an example from the Node homepage. Copy the Synopsis example code as directed there and paste it into your code editor, then save it as example.js:


const http = require('node:http');

const hostname = '127.0.0.1';
const port = 3000;

const server = http.createServer((req, res) => {
  res.statusCode = 200;
  res.setHeader('Content-Type', 'text/plain');
  res.end('Hello, InfoWorld!\n');
});

server.listen(port, hostname, () => {
  console.log(`Server running at http://${hostname}:${port}/`);
});

Open a shell in the directory where you saved the file, and run the file from your command line:


$ node example.js

Server running at http://127.0.0.1:3000/

You can now go to the browser and check it out at 127.0.0.1:3000, and you should see a simple greeting. Back at the terminal, press Control-C to stop the running server.

Before we go further, let’s pull apart the code.

Creating a simple HTTP server with Node

We start with the command:

const http = require('node:http');

This is how you include a module in your code, in this case, the standard http module. (The http module ships with Node, so you don’t have to add it as a dependency.) This module provides the createServer and listen functions we’ll use later on.

You might have noted that this example used a CommonJS import. While older, this style of import is still very common in Node programs as well as some documentation. However, it’s gradually being phased out in favor of ES Modules (ESM), the standardized module system introduced in ECMAScript 2015. An ESM import would look like this:

import http from 'node:http';

After we import the http module, we define a couple of values we need (hostname and port):

const hostname = '127.0.0.1';

const port = 3000;

Next, we create the server:

const server = http.createServer((req, res) => {
  res.statusCode = 200;
  res.setHeader('Content-Type', 'text/plain');
  res.end('Hello, InfoWorld!\n');
});

The createServer method accepts a callback function, which we define using the fat arrow notation. The callback function receives two arguments, the request (req) and response (res) objects needed to handle HTTP requests. The req argument contains the incoming HTTP request, which in this case is ignored. The res.end method sets the response data to 'Hello, InfoWorld!\n' and tells the server that it is done creating the response.

Next, we have:

server.listen(port, hostname, () => {
  console.log(`Server running at http://${hostname}:${port}/`);
});

The server.listen function accepts three arguments. The first two are the port and hostname, and the third is a callback that is executed when the server is ready (in this case, it prints a message to the console).

Having all the event handlers defined as callbacks is one of the most subtle and powerful parts of Node. It’s key to Node’s asynchronous non-blocking architecture.

Node.js runs on an event loop, which always returns to handling events when not otherwise engaged. It’s like a busy order-taker who keeps picking up new orders and notifies each customer when their order is ready. Our code receives those notifications via callbacks.
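Here’s a tiny illustration of that non-blocking behavior. The synchronous statements run first, and the callback runs later, when the timer fires and the event loop picks it up:

console.log('taking order');

setTimeout(() => {
  console.log('order ready'); // delivered via callback when the timer fires
}, 1000);

console.log('taking next order');

The output will be “taking order”, then “taking next order”, and finally, about a second later, “order ready”.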

A multi-process web server with Node

Node’s asynchronous, non-blocking nature makes it good at handling many concurrent requests, but a single Node process runs JavaScript on one thread, so it isn’t truly parallel by default. There are a few ways to make a Node application use multiple CPU cores. One of the simplest is to use the PM2 project, which lets you run the same Node application in many processes.

By launching each application instance in its own process, the operating system can make use of multiple cores on the machine. This is not usually a concern at first, but it’s a key performance consideration to bear in mind.

You can install PM2 globally like so:

$ npm install -g pm2

For our example, we want to make it obvious that the different processes are handling requests. We can achieve that goal with a small change to the server:

res.end(`Hello, InfoWorld! Handled by ${process.pid}`);

The process.pid property comes from Node’s built-in process object and provides a unique ID for the currently running process. Once PM2 is installed and the app is updated, we can run it like so:

$ pm2 start example.js -i max

That should launch several instances of the same program, as shown here:

Screenshot of a multi-process Node-based web server running several instances of the same program.

Matthew Tyson

Then, if you open multiple windows, you can see the unique ID of each instance:

Screenshot of a Node-based multi-process web server showing the unique ID of each instance.

Matthew Tyson

An Express web server with Node

For our final example, we’ll look at setting up an Express web server in Node. This time we’ll use NPM to download Express and its dependencies. NPM is one of the greatest storehouses of software on the planet, with literally millions of libraries available. Knowing how to use it is essential for working with Node.

NPM works just like other package managers you may have used, letting you define and install dependencies in a structured way. To install Express, go to your project directory and type:

$ npm install express

NPM should respond with something like: added 68 packages in 5s.

You will notice several directories have been added to a /node_modules directory. Those are all the dependencies needed for Express. You usually don’t have to interact with node_modules yourself, but it’s good to know that’s where things are saved.

Now look at the package.json file, which will have something like this in it:

{
  "dependencies": {
	"express": "^5.1.0"
  }
}

This is how dependencies are defined in NPM. It says the application needs the express dependency at version 5.1.0 or any later compatible 5.x release (that’s what the caret means).

Setting up the Express server in Node

Express is one of the most-deployed pieces of software on the Internet. It is a minimalist server framework for Node that handles all the essentials of HTTP, and it’s also expandable using “middleware” plugins.

Since we’ve already installed Express, we can jump right into defining a server. Open the example.js file we used previously and replace the contents with this simple Express server:

import express from 'express';

const app = express();
const port = 3000;

app.get('/', (req, res) => {
  res.send('Hello, InfoWorld!');
});

app.listen(port, () => {
  console.log(`Express server at http://localhost:${port}`);
});

This program does the same thing as our earlier http module version. The most important change is that we’ve added routing. Express makes it easy for us to associate a URL path, like the root path (‘/’), with the handler function.

If we wanted to add another path, it could look like this:

app.get('/about', (req, res) => {
  res.send('This is the About page.');
});

Once we have the basic web server set up with one or more paths, we’ll probably need to create a few API endpoints that respond with JSON. Here’s an example of a route that returns a JSON object:

app.get('/api/user', (req, res) => {
  res.json({
	id: 1,
	name: 'John Doe',
	role: 'Admin'
  });
});

That’s a simple example, but it gives you a taste of working with Express in Node.

Conclusion

In this article you’ve seen how to install Node and NPM and how to set up both simple and more advanced web servers in Node. Although we’ve only touched on the basics, these examples demonstrate many elements that are required for all Node applications, including the ability to import modules.

Whenever you need a package to do something in Node, you will more than likely find it available on NPM. Visit the official site and use the search feature to find what you need. For more information about a package, you can use the npms.io tool. One useful signal of a project’s health is its weekly download count (visible on NPM for the package itself). You can also check a project’s GitHub page to see how many stars it has and how many times it’s been forked; both are good measures of success and stability. Another important signal is how recently and frequently the project is updated and maintained. That information is visible on a project’s GitHub Insights page.

