AI is causing developers to abandon Stack Overflow 12 Jan 2026, 9:20 pm

Since 2008, millions of developers around the world have found answers to their programming questions on the popular platform Stack Overflow.

Recently, however, activity has declined sharply: in December, only 3,862 questions were asked on Stack Overflow, a 78 percent decrease year over year.

According to Dev Class, the decline is primarily due to the increased use of AI, but many users have also grown tired of being treated like idiots when they ask questions on Stack Overflow.

In this context, it is worth mentioning that many of the AI responses are based on content from Stack Overflow itself, as companies such as OpenAI have signed cooperation agreements with the platform.

This article first appeared on Computer Sweden.

Stack thinking: Why a single AI platform won’t cut it 12 Jan 2026, 6:40 pm

When I started integrating AI into my workflows, I was seduced by the promise of “one tool to rule them all.” One login. One workflow. One platform that would manage research, writing, operations and communications — all in one neat package. In theory, it was elegant. But what I found, in the end, was a trap.

What broke first were nonnegotiables: depth, nuance and reliability. The moment I tried to force a single AI platform to do everything — from deep research to outreach copy to automation orchestration — I hit an invisible wall. Research became shallow, writing homogenized and operational workflows became brittle. That’s when I realized that no single AI tool could do everything I needed.

What followed was a shift in mindset: I swapped “one-platform thinking” for “stack thinking.” I started curating a bench of specialized tools — each assigned to a distinct job — and built workflows that were resilient, adaptable and far more effective in the real world.

How I fell into the one-platform trap

At first, using a single AI platform felt efficient. Everything was under one roof. No juggling accounts, no format drift. Just “AI, here I come.” It was neat. It felt modern.

But the cracks started to show quickly. The first to crumble was research depth. I’d task the platform with what I thought was “deep research” — reading about an executive’s background, pulling themes, summarizing context. Then I’d ask the same system to convert that into outreach copy, a positioning doc or automation steps.

On the surface, the output looked fine. But I slowly recognized a pattern:

  • The research was broad, but it missed edge-case nuance.
  • The writing felt safe, generic, unbranded and homogenized.
  • Operations workflows ran, but felt brittle or manual when stretched.

And — and this was the real problem — I was spending more time debugging the platform’s limits than I was accomplishing meaningful work.

The turning point came when I tried to run an “agentic” workload for my Chief of Staff agent — which I named Isla. The plan was simple: have one AI run end-to-end. It would read my email threads, parse context, draft replies and convert follow-ups into actionable tasks. It was a heavy load. What could possibly go wrong?

When I took a close look at how Isla was performing, the details were a mess. Context accuracy had collapsed. The AI mis-threaded conversations; summaries lost nuance; and follow-ups failed to capture essential subtleties. I tried patching it — full-thread retrieval, name-matching, confidence-scoring — but no matter how clever the prompts got, the central limitation remained. As it turned out, the single platform could not mimic the complex pipeline of human judgment, context and layered logic.

That’s when I stopped asking “How do I make one tool do everything?” and started asking, “What tool is built for the various jobs involved here?”

Stack thinking and the blind spot revelation

Once I gave myself permission to stop chasing “one tool to fit all,” I embraced stack thinking: letting a curated set of specialized tools work together and do what they do best.

The real revelation? The moment I added a dedicated research engine into the mix, it exposed gaps I never saw coming. Suddenly:

  • I uncovered contradictions between what an executive said last year versus last month.
  • I discovered niche angles hidden in small-circuit podcasts, obscure interviews and domain writings.
  • I detected unspoken strategic tensions that never surfaced in mainstream bios.

I didn’t just improve research. I changed the reality I was able to see. What I had thought were prompt errors turned out to be problems caused by using a single tool too broadly. After noticing that, I was sold on building a stack.

The discipline of curation — not accumulation

Stack thinking isn’t about collecting every shiny new tool. It’s about curation. Today I treat tools like hires: they need to specialize in a job and they must bring value.

Here are the questions I ask before giving a tool a seat on my bench:

  • What job is it uniquely better at? If it’s only “slightly better,” it doesn’t earn a slot.
  • Does it create compounding time savings? Weekly multipliers beat one-off wins.
  • Can it integrate without breaking my workflow rhythm? If adopting it means rewiring habits, I need a 10× payoff.

Most tools fail at the first question. They’re generalists pretending to be specialists. I don’t need another “pretty good at everything” model. I need a killer in one slot. Now the rule is blunt: if I can’t describe the tool’s unique role in one sentence, it doesn’t make the cut.

Managing the integration tax: Making many tools work as one

A multi-tool stack is elegant in theory, but messy in practice. Multiple tools bring context switching, format drift and data-handoff friction. That overhead is real and dangerous.

I’ve found that the only way to manage it is through rigid discipline and structured orchestration, sketched in code after the list below:

  • Define fixed input and output schemas between tools.
  • Use a small number of orchestrator prompts to translate between systems.
  • Avoid freeform tool-to-tool conversations. Everything passes through a framework — predictable, testable, swappable.
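
To make that concrete, here is a minimal sketch (my own illustration, not any particular product’s API) of a fixed handoff schema plus a single orchestrator; the tool functions are hypothetical stand-ins:

```python
# Illustrative only: one fixed envelope between tools plus one orchestrator.
# The tool functions are hypothetical stand-ins, not real product APIs.
from dataclasses import dataclass, field

@dataclass
class Handoff:
    task: str                                 # what the next tool should do
    payload: str                              # the content being passed along
    meta: dict = field(default_factory=dict)  # provenance, constraints, etc.

def research_tool(h: Handoff) -> Handoff:
    # Stand-in for a dedicated research engine.
    return Handoff("draft", f"findings on: {h.payload}",
                   {**h.meta, "source": "research"})

def writing_tool(h: Handoff) -> Handoff:
    # Stand-in for a specialized writing model.
    return Handoff("done", f"draft based on {h.payload}",
                   {**h.meta, "source": "writer"})

def orchestrate(request: str) -> Handoff:
    # Tools never talk to each other freeform; every hop goes through the
    # same schema, so any tool can be swapped without rewiring the workflow.
    return writing_tool(research_tool(Handoff("research", request)))

print(orchestrate("executive background brief").payload)
```

Because every tool consumes and produces the same envelope, replacing the research engine becomes a one-line change rather than a re-architecture.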

For example, when I built a large site on one platform (800+ MB on Replit) for speed and momentum, it got me moving fast — but it wasn’t the right environment for final hosting. I needed a different stack to handle production-ready architecture. Because I had built with a bench mentality, not a platform addiction, I was able to rip, transplant and rebuild.

I define freedom like this: vendor independence, portability and reliability. My workflow — and my business — runs on my terms, not a single tool’s roadmap.

Mapping specialization to function: What goes where

If you’re building your first AI toolbench, don’t start with tools. Start with functions. Map them carefully across these categories:

  • Research and sensing: breadth, retrieval, verification.
  • Synthesis and reasoning: ambiguity tolerance and multi-step logic.
  • Production: tone, format, media output.
  • Operations and automation: routing, triggers, task persistence.

Common mismatches happen when people expect a research engine to write like a marketing copywriter, or a writing engine to manage workflows, or automation tools to reason like humans. That’s how you get universal mediocrity — the “jack of all trades, master of none” curse.

In our own processes, we don’t rely on a single agent. We split functions across distinct agents.

Evolution over revolution: Versioning your stack the right way

Because AI is evolving rapidly, it’s tempting to chase every new launch or next-generation platform. But I don’t. I treat my toolbench like a product roadmap: methodical, practical and diverse.

Here’s how I approach new tools:

  • Identify friction or a ceiling in the current stack.
  • Test new tools in sandbox workflows — limited, controlled and isolated.
  • Measure real before-and-after performance based on leverage — not hype.
  • If a new tool overlaps heavily with an existing one but doesn’t beat it decisively, I pass.

This disciplined, iterative approach ensures that your architecture remains resilient and free from the inherent brittleness of a single, all-in-one system.

Build a bench, not a castle

If you’re building with AI today — for business operations, content, automation or long-term strategy — don’t fall for the one-platform myth. It’s seductive. It’s simple. But simplicity can be the mask of fragility.

Instead, adopt stack thinking. Be deliberate about which tool does which job. Prune the bench. Define your schemas. Standardize handoffs. Insist on workflow fit, not hype.

Your AI doesn’t need to be a monolith. It needs to be resilient, adaptable and designed for real-world friction. If you build a bench instead of a castle, you’ll find that what you gain isn’t just efficiency or output. It’s clarity, quality and freedom. And in a world of constant change, that’s a far more powerful advantage than any single tool ever will be.

This article is published as part of the Foundry Expert Contributor Network.

Postman snaps up Fern to reduce developer friction around API documentation and SDKs 12 Jan 2026, 3:12 pm

API platform Postman has acquired API documentation- and SDK-generation startup Fern to extend its support for developers around API adoption.

The acquisition targets common pain points, including poor documentation and brittle libraries, that slow adoption of APIs and drive up their integration and support costs.

“Postman already helps enterprises design, test, and validate APIs internally. What Fern solves is the next step and, often, the messier step of making those APIs easy for external or customer developers to understand, integrate, and trust,” said Akshat Tyagi, associate practice leader at HFS Research.

Fern has two main offerings: Fern Docs, which generates documentation from an API definition, and SDK Generator, which creates client SDKs for APIs in nine languages.

Fern Docs can help keep documentation current as an API evolves, rather than letting it fall out of sync and creating confusion among developers, broken integrations, and an increase in support tickets, Tyagi said.

Its integration with Postman will help the company address a popular developer demand, according to Tyagi. “Many developers don’t want to work directly with raw APIs. They want clean, language-specific SDKs that feel native in their environment,” he said.

Gaurav Dewan, research director at IT management consulting firm Avasant, said the acquisition could also help Postman customers sell their technology via APIs more easily.

“According to Stack Overflow, about 75% of developers are more likely to endorse a technology if it provides good API access. This suggests APIs are a key factor in tech adoption decisions,” he said.

Rising interest in frameworks like FastAPI, and in languages such as Python and JavaScript that are commonly used to build APIs, also indicates healthy API ecosystem growth.

The acquisition will also help Postman close a long-standing operational gap between vendors that specialize in developer experience and API management vendors, he said.

While API management platforms such as Apigee, MuleSoft, and Kong remain strong in runtime governance (policies, monetization, multi-gateway, hybrid), developer-experience specialists such as ReadMe, Redocly, and Stoplight remain strong in documentation and design-first workflows.

“By bringing idiomatic SDKs and tailored docs into the same ecosystem where APIs are designed and tested, Postman offers a developer-led lifecycle from design, test, docs, SDKs, and adoption, rather than gateway-first governance. This is a differentiated path that complements gateways,” Dewan said, noting that many enterprises still stitch together docs and SDK generation with separate vendors.

Why ‘boring’ VS Code keeps winning 12 Jan 2026, 9:00 am

Every few months, the developer tool hype machine finds a new hero. In 2023, it was GitHub Copilot, the AI pair programmer that made autocomplete feel like magic. In 2024, the vibe shifted to Cursor and the new class of AI-first editors. And now, at least on X, Google’s “agent-first” Antigravity is being pitched as the next inevitable thing.

Meanwhile, the model layer keeps whiplashing. First, everyone used ChatGPT. Then Gemini was catching up. Now, it seems Claude is the default brain for developers who want reasoning over speed.

So much churn, but churn doesn’t necessarily translate into sustained adoption. It turns out distribution beats novelty, especially once the enterprise shows up.

That is why, even in 2026, the gravitational center of day-to-day development still looks a lot like Microsoft, with Visual Studio Code as the workbench, GitHub as the workflow hub, and GitHub Copilot as the default assistant bolted onto both. Hence, the real question isn’t “Will AI replace the IDE?” but rather “Who owns the control plane when AI becomes part of the IDE?”

The stubborn persistence of VS Code

In all the talk about Cursor, Antigravity, etc., it’s easy to forget that VS Code’s popularity keeps increasing. According to a 2025 JetBrains survey of 24,534 developers, 85% reported they use AI tools, and 62% rely on at least one AI coding assistant, agent, or editor. The locus of all that AI use? According to the Stack Overflow 2025 Developer Survey, VS Code sits at 75.9% among all respondents and 76.2% among professional developers, up from 73.6% in 2024.

This isn’t collapse. It’s entrenchment.

Yes, Cursor shows up strongly at 17.9%, a massive leap for a newcomer. But here is the part that matters strategically. A lot of new AI editors are not replacing the VS Code ecosystem; they are riding it. Cursor’s own documentation highlights its seamless migration from VS Code because it is a fork of the VS Code code base. Google’s Antigravity is also a fork of VS Code that integrates Gemini 3, aimed at developers and enterprises who want agentic workflows in the editor.

Even when developers move to the hot new thing, the gravitational pull points back to the same platform primitives: extensions, keybindings, and repo integrations forged inside the VS Code universe. This is the “Intel Inside” problem for developer tools. Ecosystem is a hard habit to break.

GitHub Copilot isn’t going away

If you measure “winning” by how many developers actually touch the tool, GitHub Copilot is not a sideshow. GitHub CEO Thomas Dohmke recently reported 20 million GitHub Copilot users, up from 15 million just a quarter earlier, while also noting that 90% of the Fortune 100 use the tool. That’s incredible growth, especially given that alternatives dominate most of the hype. Of course, this doesn’t mean Copilot is universally beloved. It’s not. But it is everywhere.

In the enterprise, procurement, compliance, and “it is already in the tool chain” matter more than vibes.

This scale feeds the real advantage: distribution. GitHub is the workflow. VS Code is the workbench. GitHub Copilot is the default assistant bolted onto both. As I have argued previously regarding Oracle’s converged database strategy, enterprises prefer integrated suites over best-of-breed fragmentation because integration reduces the complexity tax. Microsoft is applying that same converged strategy to the developer experience.

Keeping developer trust

If there is a threat to Microsoft’s dominance, it isn’t features. It’s trust. 2025 was a bad year for GitHub’s reputation, not just because of outages, but because of a growing perception that Microsoft is prioritizing AI adoption over developer agency.

We see this most clearly in the friction around opting out. In 2025, Microsoft and GitHub challenged developer trust by pushing GitHub Copilot deeper into core workflows without giving maintainers clean, reliable control over it. For example, two of the most upvoted GitHub Community threads in the prior 12 months were requests to block Copilot-generated issues and pull requests, and to fix the inability to disable automatic Copilot code reviews.

Beyond this friction, GitHub has made ecosystem-level shifts that feel like rug pulls to integrators. In a move that shocked many, they announced a hard sunset for GitHub Copilot Extensions built as GitHub Apps, blocking new creation after September 24, 2025, and enforcing full disablement by November 10, 2025. By explicitly telling developers this was a replacement rather than a migration as they pivoted to Model Context Protocol servers, GitHub violated the cardinal rule of “boring” infrastructure. Stability is supposed to be the feature, not API churn.

And just to round it out, GitHub Copilot’s security posture took a very public hit when researchers disclosed “CamoLeak,” a critical Copilot Chat vulnerability that could exfiltrate secrets and private code from private repos via prompt injection and a Content Security Policy bypass, which GitHub mitigated in part by disabling image rendering in Copilot Chat. Put those together and the trust problem is not that AI exists; it’s the perception that GitHub Copilot is becoming unavoidable infrastructure while simultaneously being subject to churn and occasional sharp edges that are hard to justify when the product is supposed to be the boring, dependable layer.

Which maybe, just maybe, opens the door for Google.

Can Google sustain the hype?

Antigravity is a legitimate technical marvel. It represents a shift to “agent-first” development, where you delegate high-level tasks to Gemini 3 agents that run across the editor, terminal, and browser. It does this while borrowing VS Code’s familiarity.

But familiarity can be good and bad. Google’s historical weakness is not innovation; it is commitment. Developers (and the CTOs who approve their tools) have learned to fear the “killed by Google” roulette wheel. For an enterprise to rip out VS Code for Antigravity, they need to believe Antigravity will exist in 2030. Google’s track record makes that a hard bet to place. That said, lately Google has become much better at being boring and has become essential enterprise infrastructure with Google Cloud. If Google is able to manage AI security concerns better than Microsoft, it could become the new “boring.”

After all, the winners in this market won’t be the companies with the most hype; they’ll be the ones that can take a chaotic model market and turn it into a calm, governed, low-friction developer experience.

Microsoft has done that for decades, which puts it in a strong, but not unassailable, position. For developers, Microsoft owns the default workbench (VS Code), the default workflow hub (GitHub), and the enterprise rails that turn experimentation into standardization. Even when developers “switch” to Antigravity, they are often just moving to a different room in Microsoft’s house, as it were.

This doesn’t mean Microsoft will keep winning, but it does mean they’ve set the “boring” bar high.

How to succeed with AI-powered, low-code and no-code development tools 12 Jan 2026, 9:00 am

As agentic AI takes hold across the technology industry, development tools are rapidly integrating AI-powered features. Experts say there is a rising demand for AI-assisted low-code and no-code development tools.

“The demand is huge,” says Marc-Aurele Legoux, owner of Marcus-Aurelius Digital. “These tools allow anyone with zero to little coding knowledge to develop something that would otherwise either cost them a fortune or years of time, experience, and effort.”

Legoux says he frequently uses AI technology to create custom-coded tools that either help with user experience or to quickly set up an environment for clients to inspect or test.

Such technologies “can dramatically shorten development timelines, lower technical barriers for non-engineers, and enable rapid prototyping of niche, business-specific applications,” says Aaron Grando, vice president, creative innovation at Mod Op, a marketing and advertising agency.

Grando notes that AI-assisted coding has shifted the economics of software development. “Many problems that required significant engineering investment can now be executed by smaller teams with more focused domain knowledge, even individuals. When people who need solutions are empowered to build for themselves, they get to the core of the problem faster and solve it more holistically,” he says.

Mod Op has deployed AI coding assistants to engineers, as well as no-code agent builders for staff of all experience levels, “unlocking that speed and expertise” across the entire organization, Grando says.

Demand is surging for AI-augmented, low-code and no-code tools, says Ishan Amin, founder of WP Expert Services. He sees two reasons for the surge: “Instant application creation and powerful task automation,” he says.

“On the creation front, tools like Lovable.dev and Bolt.new now give users the ability to build entire standalone web or mobile applications without any coding knowledge,” Amin says. “A user can simply describe their needs in a chat, and the AI generates the front-end design, application logic, and the complete back end, including cloud database support.”

For businesses, “this is a game changer, as product managers, designers and developers can quickly develop full-scale apps,” Amin says.

As for task automation, platforms available on the market allow users to automate complex work tasks using simple drag-and-drop components, says Amin. “The days of manually scripting these connections are gone, as the AI now handles that scripting in the background.”

As a technology and product leader for more than 20 years, Amin is seeing “seismic, month-over-month changes.” Product managers “now have to move incredibly fast,” he says.

Another proponent of these tools is Sonu Kapoor, an independent software engineer. “These platforms are breaking down traditional developer barriers, allowing cross-functional teams to contribute directly to software creation while AI handles much of the scaffolding, validation, and logic suggestions,” he says.

Having architected AI-integrated systems for enterprises such as Citicorp, Sony Music Publishing, and Cisco, “I’ve seen firsthand how AI copilots are turning low-code platforms into intelligent development environments,” Kapoor says. “They’re no longer ‘toy tools.’ They’re becoming serious productivity engines.”

8 best practices for AI-powered low-code and no-code development

Development teams and organizations can take concrete steps to enhance the likelihood of success with AI-augmented, low-code and no-code development tools. Experts using these tools offered the following best practices for incorporating them into development workflows.

1. Create a governance strategy

“Establish governance and review pipelines early,” Kapoor says. “Even though AI copilots can enforce patterns and spot regressions, developers still need to validate scalability and maintainability.”

As part of governance, organizations need to manage their data boundaries carefully. “Many AI builders depend on user input and API calls that can inadvertently expose sensitive data,” Kapoor says. “Setting up strong data governance prevents that risk.”

Without governance, “low-code AI models become [a] liability,” says Nik Kale, principal engineer at networking and security provider Cisco Systems. “One of the first lessons we learned at Cisco was that low-code AI tools without built-in governance can quickly become unmanageable at enterprise scale.”

For Cisco’s Digital Adoption Platform (CDAP), governance is integrated directly into the development process, Kale says. “Every workflow or automation created by business teams undergoes automated checks for explainability, privacy impact, and performance before release,” he says. “This ‘governance-by-design’ approach helps prevent AI drift and ensures compliance with both internal and external standards.”

AI-assisted code generation “can accelerate prototyping, but code reviews and observability policies [overseen by humans] remain essential to maintain reliability,” says Akash Thakur, global site reliability engineering and cloud resilience architect at IT services and IT consulting firm Cognizant.

“Pair domain users with engineering mentors to ensure quality and performance,” Thakur says. “The biggest ROI comes when business intuition and technical discipline meet.”

Also see: How to start developing a balanced AI governance strategy.

2. Don’t assume AI replaces experience

Users of these tools need to have at least a basic understanding of how they work and the principles of software development.

One of the presumed benefits of low-code and no-code tools is that they are easy to use, thereby enabling people with little or no programming skills or experience to develop code. But it’s a mistake to assume that anyone can produce code quickly using these tools.

As someone who has been using vibe coding—a software development approach where a user describes needed functionality in natural language and an AI tool generates and refines code—Legoux says AI tools have their shortcomings.

“Forget the idea that you will be able to create a full-blown application within a few hours if you have zero experience of creating apps in the first place,” Legoux says. “This is probably the most common misconception I see every day. You need some sort of experience and knowledge before getting started.”

Also see: Is vibe coding the new gateway to technical debt?

3. Treat AI as a co-worker, not a replacement

Another best practice is to treat AI as a strategic planner and co-author, not a replacement for people, Grando says. “The best results come when humans with deep knowhow help the AI understand the problem completely,” he says. “AI tools don’t inherently understand product requirements, governance, or compliance. Human oversight is essential to finding a solution that checks all the boxes.”

Non-engineers and solo solution builders should start with narrowly defined problems in areas over which they have full control, such as their day-to-day routines, Grando says. “This lowers complexity and risk, builds confidence, and leads to more wins,” he says. “When a problem needs a solution that’s bigger than an individual or small team, or becomes a critical part of your process, that’s when it’s time to bring in engineers and architects.”

4. Measure outcomes tied to business value

“A successful no-code initiative isn’t measured by how many automations are built, but by what they achieve,” Kale says. “We use telemetry dashboards that correlate automation outcomes to key metrics such as case deflection, mean time to resolution, and customer satisfaction.”

By surfacing those metrics to both developers and business owners, adoption becomes self-sustaining rather than a one-time experiment, Kale says.

Across Cisco’s customer-facing platforms, its AI-augmented Digital Adoption and Support Fabric has delivered 22% faster first-touch resolutions vs. pre-launch baselines, and a 15% boost in engineer productivity through richer diagnostics and fewer repeat steps, among other benefits, Kale says.

5. Master prompting with clarity and context

Providing clear, specific prompted instructions and background context is critical, Grando says.

Users need to articulate desired outcomes, data sources, and reference materials. “Strong prompting and strategic context layered into building workflows leads to better code, fewer revisions, and more strategically aligned solutions,” Grando says.
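
As a hypothetical illustration (the field names and wording are mine, not Grando’s), a reusable template that forces those three elements into every prompt might look like this:

```python
# Hypothetical prompt template; the fields shown are illustrative only.
PROMPT_TEMPLATE = """Goal: {outcome}

Data sources you may use: {data_sources}
Reference material for tone and format: {references}

Constraints: cite the data source for each claim; flag anything uncertain.
"""

prompt = PROMPT_TEMPLATE.format(
    outcome="Summarize Q4 support tickets into three product themes",
    data_sources="tickets.csv (exported January 2026)",
    references="last quarter's themes memo",
)
print(prompt)
```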

6. Remain dedicated to the tasks at hand

Another important practice for succeeding with AI coding tools is to focus intensely on the problem at hand, as it’s very easy to get lost in innovative new technologies, Amin says.

“The tools available today are powerful and make it possible to build almost anything, but they won’t tell you what to build,” Amin says. “Knowing the specific problem you want to solve is critical to success.”

7. Focus on domain-specific training and feedback loops, not generic automation

“Generic low-code AI tools often fail because they lack context,” Kale says. “Within Cisco’s AI Support Fabric, we train models using domain-specific telemetry from customer support cases and security endpoints. This allows automation to understand intent—for example, diagnosing endpoint issues or predicting recurring incidents—rather than executing generic process steps.”

Organizations adopting similar domain-trained low-code approaches have reported significant reductions in escalation volume by aligning automation with their specific operational language, Kale says.

8. Understand the limitations of the tools

It is vital to know a tool’s limitations before you start, Amin says. “Users need to do thorough research to understand where they can expect to find ‘blockades,’” he says. “All platforms have them; you just have to know what they are and determine if they will be a hindrance to your specific project.”

Conclusion

AI-augmented low-code and no-code tools can help drive productivity and innovation when used correctly, based on these best practices:

  1. Create a governance strategy
  2. Don’t assume AI replaces experience
  3. Treat AI as a co-worker, not a replacement
  4. Measure outcomes tied to business value
  5. Master prompting with clarity and context
  6. Remain dedicated to the tasks at hand
  7. Focus on domain-specific training and feedback loops, not generic automation
  8. Understand the limitations of the tools

Visual Studio Code adds support for agent skills 9 Jan 2026, 11:45 pm

Visual Studio Code 1.108, the latest version of Microsoft’s popular code editor, introduces support for Agent Skills, a new feature that allows users to teach the GitHub Copilot coding agent new capabilities and provide domain-specific knowledge.

VS Code 1.108, also known as the December 2025 release, arrived January 8. Developers can access it for Windows, Linux, and Mac from code.visualstudio.com.

An experimental feature, Agent Skills are folders of scripts, instructions, and resources that GitHub Copilot can load when relevant to perform specialized tasks, according to Microsoft. Skills are stored in directories with a SKILL.md file that defines the skill’s behavior, and they are automatically detected from the .github/skills folder. They are then loaded on demand into the chat context when relevant to the developer’s request.
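
As a rough illustration (the folder path comes from the release notes; the frontmatter fields shown here are an assumption based on the published Agent Skills convention, not confirmed VS Code documentation), a skill might look like this:

```markdown
<!-- .github/skills/release-notes/SKILL.md (illustrative sketch only) -->
---
name: release-notes
description: How to draft release notes in this repository's house style.
---

When asked to draft release notes, read CHANGELOG.md, group entries by
feature area, and match the tone of the previous release.
```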

Also featured in VS Code 1.108 are improvements in the Agent Sessions view. These enhancements include keyboard access for actions such as archiving, changing read state, and opening a session, as well as support for archiving multiple sessions at once from the new group sections. VS Code also now bases the Quick Pick for chat sessions on the same information that drives the Agent Sessions view. Developers can access any previous chat session from there and perform actions such as archiving, renaming, or deleting it.

VS Code 1.108 follows the December 10, 2025, release of VS Code 1.107, which introduced multi-agent orchestration. Other improvements in VS Code 1.108 include the following:

  • Due to negative feedback from terminal power users, Microsoft has reworked the defaults for the recently rolled out terminal IntelliSense. The feature is still enabled by default, but instead of the control being shown automatically when typing, it must be explicitly triggered via Ctrl+Space. The status bar on the bottom and discoverability in general have also been improved.
  • A new setting, chat.tools.terminal.preventShellHistory, allows users to prevent commands run by the terminal tool from being included in shell history for bash, zsh, pwsh, and fish (see the settings example after this list).
  • For debugging, breakpoints now can be shown as a tree, grouped by their file.
  • The Accessible View now dynamically streams chat responses as they are generated.
  • Users now can import a settings profile by dragging and dropping a .code-profile file into VS Code. This makes it easier to share profiles with teammates or quickly set up a new environment.
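
For example, the terminal-history setting mentioned above would be enabled in settings.json roughly like this (the setting name comes from the release notes; the boolean value is an assumption):

```json
{
  "chat.tools.terminal.preventShellHistory": true
}
```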

Snowflake: Latest news and insights 9 Jan 2026, 3:09 pm

Snowflake (NYSE:SNOW) has rapidly become a staple for data professionals and has arguably changed how cloud developers, data managers and data scientists interact with data. Its architecture is designed to decouple storage and compute, allowing organizations to scale resources independently to optimize costs and performance.

For cloud developers, Snowflake’s platform is built to be scalable and secure, allowing them to build data-intensive applications without needing to manage underlying infrastructure. Data managers benefit from its data-sharing capabilities, which are designed to break down traditional data silos and enable secure, real-time collaboration across departments and with partners.

Data scientists have gravitated to Snowflake’s capability to handle large, diverse datasets and its integration with machine learning tools. The platform is designed to let them rapidly prepare raw data and build, train, and deploy models directly within it to produce actionable insights.

Watch this page for the latest on Snowflake.

Snowflake latest news and analysis

Snowflake to acquire Observe to boost observability in AIops

January 9, 2026: Snowflake plans to acquire AI-based SRE platform provider Observe to strengthen observability capabilities across its offerings and help enterprises with AIOps as they accelerate AI pilots into production.

Snowflake software update caused 13-hour outage across 10 regions

December 19, 2025: A software update knocked out Snowflake’s cloud data platform in 10 of its 23 global regions for 13 hours on December 16, leaving customers unable to execute queries or ingest data.

Snowflake to acquire Select Star to enhance its Horizon Catalog

November 21, 2025: Snowflake has signed an agreement to acquire startup Select Star’s team and context metadata platform to enhance its Horizon Catalog offering, the company said in a statement. Horizon Catalog is a unified data discovery, management, and governance suite inside the cloud-based data warehouse provider’s Data Cloud offering.

Databricks fires back at Snowflake with SQL-based AI document parsing

November 13, 2025: Databricks and Snowflake are at it again, and the battleground is now SQL-based document parsing. In an intensifying race to dominate enterprise AI workloads with agent-driven automation, Databricks has added SQL-based AI parsing capabilities to its Agent Bricks framework, just days after Snowflake introduced a similar ability inside its Intelligence platform.

Snowflake to acquire Datometry to bolster its automated migration tools

November 11, 2025: Snowflake will acquire San Francisco-headquartered startup Datometry, for an undisclosed sum, to bolster SnowConvert AI, one of its existing set of migration tools.

Snowflake brings analytics workloads into its cloud with Snowpark Connect for Apache Spark

July 29, 2025: Snowflake plans to run Apache Spark analytics workloads directly on its infrastructure, saving enterprises the trouble of hosting an Apache Spark instance elsewhere, and eliminating data transfer delays between it and the Snowflake Data Cloud.

Snowflake customers must choose between performance and flexibility

June 4, 2025: Snowflake is boosting the performance of its data warehouses and introducing a new adaptive technology to help enterprises optimize compute costs. Adaptive Warehouses, built atop Snowflake’s Adaptive Compute, is designed to lower the burden of compute resource management by maximizing efficiency through resource sizing and sharing.

Snowflake takes aim at legacy data workloads with SnowConvert AI migration tools

June 3, 2025: Snowflake is hoping to win business with a new tool for migrating old workloads. SnowConvert AI is designed to help enterprises move their data, data warehouses, business intelligence (BI) reports, and code to its platform without increasing complexity.

Snowflake launches Openflow to tackle AI-era data ingestion challenges

June 3, 2025: Snowflake introduced a multi-modal data ingestion service — Openflow — designed to help enterprises solve challenges around data integration and engineering in the wake of demand for generative AI and agentic AI use cases.

Snowflake acquires Crunchy Data to counter Databricks’ Neon buy

June 3, 2025: Snowflake plans to buy Crunchy Data, a cloud-based PostgreSQL database provider, for an undisclosed sum. The move is an effort to offer developers an easier way to build AI-based applications by offering a PostgreSQL database in its AI Data Cloud. The deal, according to the Everest Group, is an answer to rival Databricks’ acquisition of open source serverless Postgres company Neon.

Snowflake’s Cortex AISQL aims to simplify unstructured data analysis

June 3, 2025: Snowflake is adding generative AI-powered SQL functions to help organizations analyze unstructured data with SQL. The new AISQL functions will be part of Cortex, Snowflake’s managed service inside its Data Cloud that provides the building blocks for using LLMs without the need to manage complex GPU-based infrastructure.

Snowflake announces preview of Cortex Agent APIs to power enterprise data intelligence

February 12, 2025: Snowflake announced the public preview of Cortex Agents, a set of APIs built on top of the Snowflake Intelligence platform, a low-code offering that was first launched in November at Build, the company’s annual developer conference.

Snowflake open sources SwiftKV to reduce inference workload costs

January 16, 2025: Cloud-based data warehouse company Snowflake has open-sourced SwiftKV, a previously proprietary approach designed to reduce the cost of inference workloads for enterprises running generative AI-based applications. SwiftKV was launched in December.

Snowflake to acquire Observe to boost observability in AIops 9 Jan 2026, 10:09 am

Snowflake is planning to acquire AI-based site reliability engineer (SRE) platform provider Observe to strengthen observability capabilities across its offerings and help enterprises with AIOps as they accelerate AI pilots into production.

“Longer term, Snowflake is positioning itself as infrastructure for AI at scale. As AI agents generate exponentially more data, vertically integrated data and observability platforms become essential to running production AI reliably and economically,” Carl Perry, head of analytics at Snowflake, told InfoWorld.

Explaining further, Perry said that issues and bugs with AI-driven applications are harder to diagnose than in traditional software, increasing the pressure on enterprises to identify and resolve problems quickly, and Snowflake wants to tackle this by combining Observe’s telemetry, log, and trace analytics with Snowflake’s AI and Data Cloud.

That should give enterprises a unified view of data pipelines, model behavior, and infrastructure health — areas that are often fragmented across tools as AI systems move from experimentation to production, Perry pointed out.

The combined capabilities would help enterprises detect performance regressions, data drift, and cost anomalies earlier, while giving SRE and data teams a shared operational layer to manage reliability and governance for AI-driven applications, Perry said.

Moor Insights and Strategy principal analyst Robert Kramer agrees with Perry on the need for observability capabilities for managing AI applications at scale, terming them as “strategic” for CIOs.

Leaders who don’t adopt these capabilities “would be left holding a very expensive bag of science projects,” because the time for pilots is over and it is high time to realize tangible value from AI investments, said Bradley Shimmin, practice leader for data, AI, and infrastructure at The Futurum Group.

The analyst also noted that Snowflake’s acquisition of Observe could significantly ease pricing pressures for CIOs by introducing a more cost-efficient approach to observability and challenging players such as Splunk and Datadog.

Unlike traditional vendors such as Datadog and Splunk, which charge premium prices because they treat telemetry data (logs, metrics, traces) as specialized, proprietary data that requires their ecosystem for storage and analysis, Snowflake plans to treat telemetry as standard data within its Data Cloud, Shimmin said.

This shift not only reduces storage and processing costs but also simplifies integration with existing enterprise data strategies, Shimmin added.

Additionally, Kramer feels that Snowflake might be able to deliver more value to its customers if it can combine Observe’s capabilities with previous acquisitions, such as TruEra.

“If Snowflake can connect system observability from Observe with model monitoring from TruEra, it could enable unified visibility from pipeline to model to production infrastructure, expanding its platform capabilities,” Kramer said.

Observe currently offers three platforms — AI SRE, o11y.ai, and LLM Observability — with capabilities like log management, application performance monitoring, and infrastructure monitoring.

The startup, which was established in 2017 by Jacob Leverich, Jonathan Trevor, and Ang Li, rolled out its initial observability platform a year later, leveraging a centralized database on Snowflake.

That, according to Perry, will help Snowflake integrate Observe’s offerings relatively quickly. Snowflake has not disclosed the financial terms of the acquisition, which is subject to regulatory approvals.

Python starts 2026 with a bang 9 Jan 2026, 9:00 am

2026 is already popping with new Python goodies: an ultra-fast type checker from the makers of uv, Django 6, and a new way to generate C code with Python for faster-executing apps. Read on for these and other highlights.

Top picks for Python readers on InfoWorld

Reader picks: The most popular Python stories of 2025
What a year 2025 was. From free-threaded Python to integrations with Rust and Zig, recap the Python developments in 2025 that broke new ground and point the way towards a bigger and better Python world in 2026.

Python type checker ty now in beta
What Astral’s uv did for Python package management (made it wicked fast and powerful), Astral’s ty promises to do for Python type checking. And now ty is stable enough for everyone to try.

Django tutorial: Get started with Django 6
The most popular and influential Python web framework marches on! Learn how to get rolling with a freshly minted Django project, including new Django 6-only features.

PythoC: An alternative to Cython
Use Python as a C code generation system. It’s not just a Python-to-C compiler, but a kind of advanced macro platform for code generation that Cython alone can’t deliver.

More good reads and Python updates elsewhere

Python’s tail-calling interpreter rides again
The tail call compiler optimization in Python 3.14 failed to deliver the anticipated speedup, prompting an apology from developer Ken Jin. However, Jin subsequently found a way to enable this feature properly on Windows x86-64 builds, with striking results.

Notes on sandboxing untrusted Python
Python’s dynamism makes it difficult to run untrusted code safely. Developer Mohamed Diallo discusses some ways that Python interpreters could be made easier to isolate.

Edit Python AST trees while preserving source formatting
Pfst is a clever Python package that performs transformations on Python abstract syntax trees while providing access to comments, formatting (e.g., linebreaks within parentheses), and more. Example use: adding type annotations to type comments.

Microsoft open-sources XAML Studio 8 Jan 2026, 10:00 pm

Microsoft has open-sourced XAML Studio, a rapid prototyping tool for WinUI developers using XAML. Microsoft made the announcement on January 6.

XAML Studio lets developers prototype user interface ideas before integrating them into an app within the Visual Studio IDE. Developers can prototype UWP-based (Universal Windows Platform) XAML apps. Tools and helpers are provided, such as live edit and interaction, a binding debugger, a data context editor, IntelliSense, a documentation toolbox, and namespace helpers. These features can be found in XAML Studio 1.1, accessible from the Microsoft Store.

XAML Studio 2 is still in development but can now be built from source from the GitHub repository. New features include a new Fluent UI design, folder support with image loading and design data loading, a live property panel for editing, inspecting, and experimenting, and quick access preview options such as refresh, alignment grid, clipping, and theme toggle. XAML Studio requires Visual Studio 2022 or later and Windows 10 or newer. The project has adopted the Microsoft Open Source Code of Conduct.

Databricks says its Instructed Retriever offers better AI answers than RAG in the enterprise 8 Jan 2026, 3:37 pm

Databricks is joining the AI software vendors quietly admitting that old-fashioned deterministic methods can perform much better than generative AI’s probabilistic approach in many applications. Its new “Instructed Retriever” architecture combines old-fashioned database queries with the similarity search of RAG (retrieval-augmented generation) to offer more relevant responses to users’ prompts.

Everything about RAG’s architecture was supposed to be simple. It was the shortcut to enterprise adoption of generative AI: retrieve documents that may be relevant to the prompt using similarity search, pass them to a language model along with the rest of the prompt, and let the model do the rest.

But as enterprises push AI systems closer to production, that architecture is starting to break down. Real-world prompts come with instructions, constraints, and business rules that similarity search alone cannot enforce, forcing CIOs and development teams into trade-offs between latency, accuracy, and control.

Databricks has an answer to that problem, Instructed Retriever, which breaks down requests into specific search terms and filter instructions when retrieving documents to augment the generative prompt. That means, for example, that a request for product information with an instruction to “focus on reviews from the last year” can explicitly retrieve only reviews for which the metadata indicates they are less than a year old.

That’s in contrast to traditional RAG, which treats users’ instructions in a query as part of the prompt and leaves it to the model to reconcile them after data retrieval has occurred: it would retrieve documents containing words or concepts similar to “review” and “last year,” but that may be much older or not reviews at all.
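
Databricks has not published the implementation details, but the behavioral difference is easy to sketch. In the toy Python example below (illustrative documents and scores, not Databricks code), plain similarity search surfaces a stale roundup and an off-target spec sheet, while instruction-aware retrieval turns “reviews from the last year” into explicit metadata predicates applied before ranking:

```python
# Illustrative contrast only: similarity-only retrieval vs. retrieval that
# applies instruction-derived metadata filters first. Toy data throughout.
from datetime import datetime, timedelta

DOCS = [
    {"text": "Great product, five stars", "type": "review",
     "date": datetime.now() - timedelta(days=30), "similarity": 0.81},
    {"text": "Review roundup from 2019", "type": "review",
     "date": datetime.now() - timedelta(days=2000), "similarity": 0.90},
    {"text": "Product spec sheet", "type": "spec",
     "date": datetime.now() - timedelta(days=10), "similarity": 0.85},
]

def plain_rag(k=2):
    # Similarity only: the stale roundup and the spec sheet outrank the
    # one document the user actually asked for.
    return sorted(DOCS, key=lambda d: d["similarity"], reverse=True)[:k]

def instructed_retriever(k=2):
    # Instruction awareness: "reviews from the last year" becomes explicit
    # metadata predicates applied before ranking by similarity.
    cutoff = datetime.now() - timedelta(days=365)
    eligible = [d for d in DOCS
                if d["type"] == "review" and d["date"] >= cutoff]
    return sorted(eligible, key=lambda d: d["similarity"], reverse=True)[:k]

print([d["text"] for d in plain_rag()])
print([d["text"] for d in instructed_retriever()])
```

The point is structural: the filter runs before ranking, so ineligible documents never reach the model at all.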

By embedding instruction awareness directly into query planning and retrieval, Instructed Retriever ensures that user guidelines like recency and exclusions shape what is retrieved in the first place, rather than being retrofitted later, Databricks’ Mosaic Research team wrote in a blog post.

This architectural change leads to higher-precision retrieval and more consistent answers, particularly in enterprise settings where the relevance of a response is defined not just by text similarity in a user’s query, but also by explicit instructions, metadata constraints, temporal context, and business rules.

Not a silver bullet

Analysts and industry experts see Instructed Retriever addressing a genuine architectural gap.

“Conceptually, it addresses a real and growing problem. Enterprises are finding that simple retrieval-augmented generation breaks down once you move beyond narrow queries into system-level reasoning, multi-step decisions, and agentic workflows,” said Phil Fersht, CEO of HFS Research.

Akshay Sonawane, a machine learning engineering manager at Apple, said that Instructed Retriever acts as a bridge between the ambiguity of natural language and the deterministic nature of enterprise data. But for it to work, he said, enterprises may have to invest in data pipelines that maintain metadata consistency as new content is ingested and establish governance policies for who can query what, and how those permissions map to metadata filters.

Advait Patel, a senior site reliability engineer at Broadcom, echoed that, cautioning CIOs against seeing Instructed Retriever as a silver bullet.

“There is still meaningful work required to adopt an architecture like Instructed Retriever. Enterprises need reasonably clean metadata, well-defined index schemas, and clarity around the instructions the system is expected to follow,” Patel said.

Re-engineering retrieval

The re-engineering required to successfully use Instructed Retriever could place additional strain on CIO budgets, said Fersht.

“Adoption could mean continued investment in data foundations and governance before visible AI ROI with strain on talent as these systems would require hybrid skills across data engineering, AI, and domain logic,” he said.

Beyond cost and talent, there’s also the challenge of managing expectations. Tools like Instructed Retriever, Fersht said, risk creating the impression that enterprises can leapfrog directly to agentic AI. “In reality, they tend to expose process, data, and architectural debt very quickly,” he said.

That dynamic could lead to uneven adoption across enterprises.

Moor Insights and Strategy’s principal analyst Robert Kramer said Instructed Retriever assumes a level of data maturity, particularly around metadata quality and governance, that not every organization has yet reached.

In addition, the architecture implicitly requires businesses to encode their own reasoning into instructions and retrieval logic, demanding closer collaboration between data teams, domain experts, and leadership, which many enterprises find difficult to achieve, Kramer said.

Sonawane pointed to the need for observability in Instructed Retriever’s responses if it is to be adopted in regulated industries where transparency in how data is retrieved and filtered is critical for compliance and risk management.

“When a standard search fails, you know the keyword didn’t match. However, when an Instructed Retriever fails, it is unclear whether the model failed to reason or if the retrieval instruction itself was flawed,” Sonawane said.

In that sense, Instructed Retriever may serve as both a capability and a test. For CIOs, its value will depend less on how advanced the retrieval technology is, and more on whether organizations have the data maturity, governance, and internal alignment required to make instruction-aware AI systems work at scale.

Instructed Retriever, according to the Mosaic AI Research team, has been built into Agent Bricks, and enterprises can try it there, specifically in use cases where the Knowledge Assistant can be used.

The hidden devops crisis that AI workloads are about to expose 8 Jan 2026, 9:00 am

Devops used to be a simple process. Take one component of the stack, run some unit tests, check a microservice in isolation, confirm it passed integration tests, and ship it. The problem is, that doesn’t test what actually matters—whether the system as a whole can handle production workloads.

This simple approach breaks down fast when AI workloads start generating massive volumes of data that need to be captured, processed, and fed back into models in real time. If data pipelines can’t keep up, AI systems can’t perform. Traditional observability approaches can’t handle the volume and velocity of data that these systems now generate.

From component testing to platform thinking

Devops must evolve beyond simple CI/CD automation. That means teams need to build comprehensive internal platforms—what I think of as “paved roads”—that replicate whole production environments. For data-intensive applications, developers should be able to create dynamic data pipelines and immediately verify that what comes out the other end meets their expectations.

Testing for resilience needs to happen at every layer of the stack, not just in staging or production. Can your system handle failure scenarios? Is it actually highly available? We used to wait until upper environments to add redundancy, but that doesn’t work when downtime immediately impacts AI inference quality or business decisions.

The challenge is that many teams bolt on observability as an afterthought. They’ll instrument production but leave lower environments relatively blind. This creates a painful dynamic where issues don’t surface until staging or production, when they cost significantly more to fix.

The solution is instrumenting at the lowest levels of the stack, even in developers’ local environments. This adds tooling overhead up front, but it allows you to catch data schema mismatches, throughput bottlenecks, and potential failures before they become production issues.
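
As a minimal sketch of what that local instrumentation can look like (plain standard-library Python with hypothetical pipeline steps, rather than any specific observability product):

```python
# Illustrative only: timing and logging a local pipeline step so schema
# mismatches surface on a developer's machine, not in staging.
import functools
import logging
import time

logging.basicConfig(level=logging.INFO)

def instrumented(step_name):
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            try:
                return fn(*args, **kwargs)
            finally:
                elapsed_ms = (time.perf_counter() - start) * 1000
                logging.info("step=%s duration_ms=%.2f", step_name, elapsed_ms)
        return wrapper
    return decorator

@instrumented("validate_schema")
def validate_schema(record: dict) -> dict:
    # Catch schema mismatches here, before they become production issues.
    for required in ("id", "timestamp"):
        if required not in record:
            raise ValueError(f"missing required field: {required}")
    return record

validate_schema({"id": 1, "timestamp": 1736300000})
```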

Connecting technical metrics to business goals

It’s no longer enough to worry about whether something is “up and running.” We need to understand whether it’s running with sufficient performance to meet business requirements. Traditional observability tools that track latency and throughput are table stakes. They don’t tell you if your data is current, or whether streaming data is arriving in time to feed an AI model that’s making real-time decisions. True visibility requires tracking the flow of data through the system, ensuring that events are processed in order, that consumers keep up with producers, and that data quality is consistently maintained throughout the pipeline.

Streaming platforms should play a central role in observability architectures. When you’re processing millions of events per second, you need deep instrumentation at the stream processing layer itself. The lag between when data is produced and when it is consumed should be treated as a critical business metric, not just an operational one. If your consumers fall behind, your AI models will make decisions based on old data.
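
A toy example of treating that lag as a first-class metric (the threshold and timestamps are illustrative assumptions, not values from any real pipeline):

```python
# Illustrative only: flag when consumer lag exceeds a freshness budget.
# Event timestamps are assumed to be epoch seconds set by the producer.
import time

ALERT_THRESHOLD_SECONDS = 5.0  # hypothetical freshness budget for the model

def record_lag(event_produced_at: float) -> float:
    lag = time.time() - event_produced_at
    if lag > ALERT_THRESHOLD_SECONDS:
        # In a real pipeline this would page someone or gate inference;
        # here we just flag that the model would decide on stale data.
        print(f"WARNING: consumer lag {lag:.1f}s exceeds freshness budget")
    return lag

# Simulate consuming an event produced eight seconds ago.
record_lag(time.time() - 8)
```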

The schema management problem

Another common mistake is treating schema management as an afterthought. Teams hard-code data schemas in producers and consumers, which works fine initially but breaks down as soon as you add a new field. If producers emit events with a new schema and consumers aren’t ready, everything grinds to a halt. 

With a schema registry between producers and consumers, schema evolution happens automatically: the producer updates its schema version, the consumer detects the change, pulls down the new schema, and keeps processing, with no downtime required.
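
Here is a minimal sketch of that handshake using a toy in-memory registry (real deployments use a registry service, but the flow is the same):

```python
# Illustrative only: consumers look schemas up by version instead of
# hard-coding field names, so old and new events both keep flowing.

class SchemaRegistry:
    """Toy in-memory registry standing in for a real registry service."""
    def __init__(self):
        self._schemas = {}  # (subject, version) -> set of field names

    def register(self, subject, version, fields):
        self._schemas[(subject, version)] = set(fields)

    def fetch(self, subject, version):
        return self._schemas[(subject, version)]

registry = SchemaRegistry()
registry.register("orders", 1, {"id", "amount"})
registry.register("orders", 2, {"id", "amount", "currency"})  # field added

def consume(event):
    # The event carries its schema version; the consumer pulls the matching
    # schema on demand, so v1 and v2 events both process without downtime.
    fields = registry.fetch(event["subject"], event["version"])
    return {k: event["payload"].get(k) for k in fields}

print(consume({"subject": "orders", "version": 1,
               "payload": {"id": 7, "amount": 9.99}}))
print(consume({"subject": "orders", "version": 2,
               "payload": {"id": 8, "amount": 4.50, "currency": "EUR"}}))
```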

This kind of governance belongs at the foundation of data pipelines, not something added later. Without it, every schema change becomes a high-risk event.

The devops role is evolving

Implementing all these changes requires a different skill set. Rather than just coding infrastructure, you need to understand your organization’s business objectives and trace them back to operational decisions.

As AI handles more coding tasks, developers will have more bandwidth to apply this more holistic systems thinking. Instead of spending 30 minutes writing a function, they can spend one minute prompting an AI to do the same thing, and 29 minutes understanding why the function is needed in the first place. Junior developers who once owned a narrow slice of functionality will have time to understand the entire module they’re building.

As developers spend less time coding and more time orchestrating systems, everyone can start thinking more like an architect. That means AI is not eliminating jobs; it’s giving people more time to think about the “why” instead of just the “what.”

Making AI a copilot, not a black box

Developers will trust AI tools when they can see the reasoning behind the code being generated. That means showing the AI’s actual thought process, not just providing a citation or source link. Why did the AI choose a particular library? Which frameworks did it consider and reject?

Tools like Claude and Gemini are getting much better at exposing their reasoning, allowing developers to understand where a prompt might have led the AI astray and adjust accordingly. This transparency turns AI from a black box into more of a copilot. For critical operations, like production deployments and hotfixes, human approval is still essential. But explainability makes the collaboration between developers and AI tools actually work.

The path forward

Devops teams that cling to component-level testing and basic monitoring will struggle to keep pace with the data demands of AI. The teams that do well will be the ones that invest in comprehensive observability early on, instrument their entire stack from local development to production, and make it easy for engineers to see the connection between technical decisions and business outcomes.

This shift won’t be trivial. It will require cultural change, new tooling, and a willingness to slow down up front to move faster later on. But we’re past the point where we can hope our production applications behave like they did in staging. End-to-end observability will be the foundation for building resilient systems as AI continues to progress.

New Tech Forum provides a venue for technology leaders—including vendors and other outside contributors—to explore and discuss emerging enterprise technology in unprecedented depth and breadth. The selection is subjective, based on our pick of the technologies we believe to be important and of greatest interest to InfoWorld readers. InfoWorld does not accept marketing collateral for publication and reserves the right to edit all contributed content. Send all inquiries to doug_dineley@foundryco.com.


AI-built Rue language pairs Rust memory safety with ease of use 8 Jan 2026, 1:55 am

A former longtime contributor to Rust is now building a Rust-based language of his own, called Rue, which is intended to provide memory safety without garbage collection while being easier to use than Rust and Zig. He is leveraging Anthropic's Claude AI to develop it.

Written entirely in Rust, the language is in the early stages of development, with initial support for the standard library having just landed, developer Steve Klabnik said in an emailed response to questions from InfoWorld on January 7, 2026. But development is progressing quickly, Klabnik said. “My hope is that it will fit into a sweet spot that’s somewhere higher-level than Rust, but lower-level than Go,” Klabnik said. “Not as hard to use as Rust, but also has good performance, fast compile times, and is easier to learn.” Thus, the language probably will not be good for many of the low-level projects that Rust excels at, but it will make different tradeoffs and help with different kinds of projects, he added.

Claude is doing much of the heavy lifting, helping Klabnik get work done faster. “I’m much, much farther along than if I hand-wrote the code myself. I do read all of the code before it gets merged in, but Claude does all of the authoring,” he said.

As for the syntax, Rue aims for a gentle learning curve without sacrificing clarity. It compiles to x86-64 and Arm64 machine code. There is no garbage collector and no VM. The name, Rue, came about as a result of Klabnik having worked on development of both Rust and the Ruby on Rails framework. “A ‘rue’ can mean something like ‘to rue the day,’  but it’s also a family of flowers,” Klabnik said. “I like that there’s multiple ways of thinking about the name. It’s also short and easy to type.”


Microsoft acquires Osmos to ease data engineering bottlenecks in Fabric 7 Jan 2026, 11:27 am

Microsoft has acquired AI-based data engineering firm Osmos for an undisclosed sum, as part of its effort to reduce data engineering friction inside Fabric, its unified data and analytics offering, as enterprises continue to push analytics and AI projects into production.

Osmos’ technology, which applies agentic AI to turn raw data into analytics- and AI-ready assets in OneLake, will help customers bypass a common challenge most enterprises face: spending more time on data preparation than on analysis, Bogdan Crivat, corporate VP of Azure Data Analytics, wrote in a blog post.

In fact, Roy Hasson, senior director of product at Microsoft, pointed out in a separate social media post that the Seattle-headquartered startup had launched its AI data wrangler and AI data engineering agents on Microsoft Fabric as a native app almost two years ago, and that they became quite popular.

“We quickly realized that customers loved using Osmos on top of Fabric Spark, and it reduced their dev and maintenance efforts by >50%,” Hasson wrote.

The startup, before the acquisition, offered Osmos Data Agents for Microsoft Fabric, Osmos Data Agents for Databricks, and Osmos AI-Assist Suite (Uploaders, Pipelines, Datasets), which the company describes as a collection of AI-powered data ingestion and engineering tools that automate the process of bringing external, messy data into operational systems with minimal manual effort or coding.

What the Osmos deal means for enterprises

While Microsoft is yet to divulge more details about the product roadmap for integrating Osmos’ technology in Fabric, analysts say that the integration is likely to help both CIOs and development teams.

For CIOs, the benefit would revolve around operational efficiency and faster time-to-value for analytics and AI initiatives, especially in environments constrained by data engineering talent and budget, said Robert Kramer, principal analyst at Moor Insights and Strategy.

Another upside of the acquisition for CIOs, according to Stephanie Walter, practice leader of the AI stack at HyperFRAME Research, is enabling data engineering automation that is governed, reversible, and auditable.

“As AI moves from experimentation to enterprise scale, this level of controlled automation becomes essential for maintaining reliability, compliance, and trust,” Walter said.

However, Kramer, in contrast to Walter, cautioned that enterprises’ dependence on Osmos’ technology for data engineering inside Fabric may increase platform dependence, raising governance and risk questions about certifying agentic pipelines, auditing and rolling back changes, and aligning autonomous data engineering with regulatory and compliance expectations.

Reducing repetitive engineering work for developers

For developers, though, Kramer pointed out, the acquisition has the potential to improve productivity by reducing repetitive and low-value engineering work around messy data. 

“Tasks such as data wrangling, mapping inconsistent external feeds, pipeline scaffolding, and boilerplate Spark-style transformation code could be generated by agents rather than hand-built, allowing engineers to focus on architecture, performance, data quality, and guardrail design,” Kramer said. 

“The development lifecycle could tilt toward reviewing, testing, and hardening AI-generated pipelines and transformations, with observability, approval workflows, and reversibility becoming core design requirements,” Kramer added.

Complementing recent Fabric enhancements

Analysts also view Osmos’ acquisition as complementing recent Fabric enhancements, including the introduction of Fabric IQ.

“As Fabric expands with IQ, new databases, and deeper OneLake interoperability, the limiting factor shifts from data access to data readiness. Osmos addresses that gap by automating ingestion, transformation, and schema evolution directly within the Fabric environment,” Walter said.

“In the context of Fabric IQ, Osmos helps ensure that the data feeding the semantic and reasoning layers remains continuously curated and stable as upstream sources change. Semantic systems only work when the underlying data is consistent and explainable, and Osmos is designed to reduce the operational friction that otherwise undermines those efforts,” Walter added.

But what about Osmos’ products and customers?

However, it is not all good tidings for existing Osmos customers: the company is winding down its three offerings — Osmos Data Agents for Microsoft Fabric, Osmos Data Agents for Databricks, and the Osmos AI-Assist Suite — as standalone products this January. For now, Osmos’ technology will live only inside Fabric, and customers of the Databricks offering and the AI-Assist Suite will have to look for alternatives or find a way to work with Microsoft’s offerings.


What the loom tells us about AI and coding 7 Jan 2026, 9:00 am

In the early 19th century, the invention of the power loom threatened to turn the labor market upside down. Until then, cloth was made by skilled artisans, but the power loom enabled less-skilled workers to make more cloth more quickly. One could even argue that the Jacquard loom, which allowed for complex weaving patterns via punch cards, was the first computer.

This technology had a disruptive effect on the labor market and gave rise to the Luddites, a group who would physically destroy looms in factories. Jobs were lost, wages were depressed, and working conditions became more unpleasant. The loom led to social upheaval and drastic change in the short run. 

But in the long run, the benefits were many. Making textile workers more productive meant more and better clothing for everyone. The advances in textile production were a harbinger of capital accumulation, economies of scale, and complementary innovations in many other areas as well, and the Industrial Revolution began.

This is a roundabout way of saying that I can’t stop writing about AI and coding.

Looming apocalypse

Just as the loom worried individual weavers, large language models (LLMs), coding agents, and the accompanying tools are the cause of considerable concern for software developers—and rightly so. We are starting to see shifts in hiring trends, with a decrease in the number of junior developers being hired. I’ve written before about our ongoing need for junior developers, if for no other reason than there will eventually be no senior developers without them, but the impact on the labor market cannot be ignored.

Shifts are definitely happening, and it remains to be seen what the effects will be. While we are a more sophisticated economy than the one faced by the Luddites, the ultimate effects of AI on software developers remain unclear. If AI writes all the code, what, exactly, will junior developers do? Juniors learn from doing the work of executing on design. Without that, how will they grow? How will they learn what they need to know to be senior developers? The short-term impact is something to worry about, however unclear it may be.

Nevertheless, I’m not afraid to make a few predictions about the long term.

If you were to tell a new parent in 1820 that their child was going to grow up to be a telegraph operator, a train conductor, or a professional photographer, they would have looked at you like you had two heads. Those jobs were unheard of at the time. Never mind if you told them their grandchildren would be pilots, radio operators, or movie producers. 

Unknowable possibilities

The future is just as unknowable for our children. AI will open up new horizons and enable new technologies that we simply can’t predict. Most likely, my grandson will have a job title that doesn’t yet exist and that will astonish me in my dotage.

Even as recently as 25 years ago, I don’t think anyone could have conceived of many of the services we take for granted today. Services like Uber or DoorDash—the melding of mobile technology, GPS, and advanced broadband—came about through a magical confluence of technologies. AI coding will likely be a part of other technologies that we don’t foresee. It seems likely that AI will enable us to build new things that will make AI even more powerful and capable.

There is no doubt that AI and LLMs will make the development of software more productive and take it to new, more sophisticated levels. What AI will enable isn’t knowable by us. But what we can be sure of is that this new technology will, as technology always does, combine with human ingenuity to create something amazing and mind-boggling. 

There is no doubt about it. There will be jobs and technologies that will be commonplace in 2050 but aren’t yet even a twinkle in our eyes today. 


Generative UI: The AI agent is the front end 7 Jan 2026, 9:00 am

The advent of Model Context Protocol (MCP) APIs hints at a coming era of agent-driven architecture. In this architecture, the chat interface becomes the front end and creates UI controls on the fly. Welcome to “generative UI.”

The new portlet

Once upon a time, the portal concept promised to give every user a personalized view. Finally, the promise of a web where the user was also the controller could be realized. That didn’t quite work out the way we thought. But today, generative UI proposes to take UI personalization to a whole new level, by marrying bespoke, as-needed UI components with agentic MCP APIs.

The old way involved writing back ends that provide APIs to do things and writing user interfaces that allow humans to easily take action on those APIs. The new idea is that we’ll provide MCP definitions that allow agents to take actions on the back end, while the front end becomes a set of definitions (like Zod schemas) that expose these capabilities.

One of the greatest things about having observed the industry over a long stretch of time is a healthy skepticism. You’ve seen so many things arise and promise the moon. Sometimes they crash and burn.  Sometimes they become important. If they are useful, they are absorbed into the developer’s toolkit.

This skepticism isn’t even a conscious thing anymore; it’s an instinctual reaction. When someone tells me that AI is going to produce user interfaces on the fly as needed, I immediately begin raising objections, like performance and accuracy.

Then again, the overall impact of AI on development has been significant, so let’s take a closer look.

Hands-on with generative UI

I’m thinking about generative UI as a kind of evolution of the managed agentic environment (like Firebase Studio). With an agentic IDE, you can rapidly prototype UIs by typing in a description of what you want. GenUI is the logical next step, where the user prompts a hosted chatbot (like ChatGPT) to produce UI components that the user can interact with on the fly.

In a sense, if an AI tool, even something like Gemini or ChatGPT with Code Live Preview active, becomes powerful enough, it will push the person using it to wear the user hat rather than the developer hat. We’ll probably see that occur gradually, with us eventually spending more time designing than coding, and diving into developer mode only when things break or our ambitions grow.

To get hands-on with this idea, we can look at Vercel’s GenUI demo (or the Datastax mirror), which implements the streamUI function:

The streamUI function allows you to stream React Server Components along with your language model generations to integrate dynamic user interfaces into your application. Learn more about the streamUI hook from Vercel AI SDK.

Vercel’s GenUI demo will give you a taste of what is meant by on-the-fly UI components streamed alongside chat interaction:

Vercel GenUI demo (image: Foundry)

This is just a demo, and it does the job of getting the idea across. It also exhibits plenty of typical AI foolishness and limitations. For instance, when I ask to buy “some Solana” in a stock-buying chat, it replies “Invalid amount.” So then I ask to buy “10 Solana,” and it gives me a simple control with a Purchase button.

Of course, this is all for play, and there is no plumbing backing up that purchase. Creating that plumbing would be non-trivial (wiring up a wallet or bank account and all the attendant auth work). 

But my purpose is not really to find fault with the demo. Some of the issues could be cleaned up with concerted developer work. Others are down to current limitations of large language models. By that I mean, there is a strange collision between the initial feeling of vast potential you get when using an AI or agentic tool and the hangover of frustration that follows, when you suddenly find yourself with a mountain of AI-initiated “work” that will require hours of human concentration to master and wrangle.

It’s like you had a bit too much coffee and the caffeine wore off. Now you’ve got to roll up your sleeves and wrestle all of the big ideas into functioning software.

Vercel’s is not the only generative UI demo we can look at. Here’s another from Thesys:

Thesys GenUI demo (image: Foundry)

Microsoft’s AG-UI offers similar capabilities.

Is generative UI a good idea?

But let’s imagine that the genUI APIs and LLMs progress beyond their current state, and developers aren’t left with the heavy lifting. The main question is: Is a generative UI something we as human beings would ever actually want to use?

To be fair, the Vercel genUI is an API meant for use inside other apps. That is to say, it allows you to stream UI components into the flow of AI responses. So integrating on-demand React components via the streamUI API could really be just the thing in the right setting, inside another, well-considered UI.

It seems likely that a good UI with good UX will still account for the lion’s share of what people use. I mean, I might sometimes want to ask my AI to find good deals on a flight to Kathmandu and then have it pop up an interface for buying the ticket, but usually I will just go to Expedia or whatever.

Even if you could perfectly express intention as a UI, when you finally do get a perfectly useful UI, you probably won’t want to continue to modify it in deep ways. You will want to save it and reuse it.

Typing out intention in English (or Hindi or German) is great for certain things, especially researching, brainstorming, and working through ideas, but the visual UI has huge advantages of its own. After all, that’s why Windows supplanted DOS for many uses.

But I hasten to add I’m not dismissing the idea out of hand. Perhaps some hybrid of designed UI along with chatbot prompt that can modify it on the go is in the cards.

An essential insight here is that if the web becomes a cloud of agentic endpoints, a realm of MCP (or similar) capabilities that give action to AI, then it will be a kind of marketplace of possible actions we can take using a neutral language interface. And the on-demand, bespoke UI component will become an almost inevitable element of that landscape.

Instead of a vast collection of documents and data, the web would be a collection of actions that could be taken based on intention and meaning. 

Of course, the semantic web was supposed to make a web of meaning, but with AI a semantic web could be more practical. GenUI would be a new kind of way to provide tool definitions for engaging with that web.

Context architects

There is something here, but I don’t see genUI replacing UX and UI engineers anytime soon. Augmenting them, perhaps. Providing them with a new tool, maybe.

Similar to vibe coding, the idea that we’ll spend our time “architecting a context” using AI, rather than building interfaces, likely captures some of the character of the coming world of front-end development, but not the whole story.

The work of a UI developer in this model would consist of providing interface definitions that mediate between the chatbot and MCP servers. These definitions might look something like the snippet below.  Vercel’s API uses Zod. This is just a pseudo-example:

import { z } from 'zod';

// This Zod schema acts as the "Interior Interface" for the AI Agent
const cryptoPurchaseTool = {
  description: 'Show this UI ONLY when the user explicitly asks to buy',
  parameters: z.object({
    coin: z.enum(['SOL', 'BTC', 'ETH']),
    amount: z.number().min(0.1).describe('Amount to buy'),
  }),
  generate: async ({ coin, amount }) => {
    // The AI plays within this sandbox; PurchaseCard is a hypothetical
    // React component standing in for whatever UI you want to stream.
    return <PurchaseCard coin={coin} amount={amount} />;
  },
};
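Before a definition like this can drive anything, it has to be handed to the model loop. Here is a rough sketch of how the tool plugs into streamUI, based on the Vercel AI SDK’s documented shape; the model id, the prompt, and the PurchaseCard component are assumptions for illustration, and the code would live in a React server action:

import { streamUI } from 'ai/rsc';
import { openai } from '@ai-sdk/openai';

// Hypothetical React component rendered when the tool fires.
function PurchaseCard({ coin, amount }: { coin: string; amount: number }) {
  return <button>Purchase {amount} {coin}</button>;
}

export async function buyFlow() {
  // The model decides, turn by turn, whether to answer in text or to
  // stream the UI component described by the tool's Zod parameters.
  const result = await streamUI({
    model: openai('gpt-4o'), // assumed model id
    prompt: 'I want to buy 10 SOL',
    text: ({ content }) => <p>{content}</p>,
    tools: { buyCrypto: cryptoPurchaseTool }, // the definition above
  });
  return result.value; // a streamable React node the page can render
}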

In a sense, this schema becomes the “interior UI” available to the AI, and the AI itself becomes a kind of universal human-machine intermediary. Is this really where we’re going? Only time will tell.


AI won’t replace human devs for at least 5 years 7 Jan 2026, 6:14 am

Human coders may have a temporary reprieve from losing their jobs to AI: It will be between five and six years before we reach full coding automation, according to a new report from LessWrong. This pushes back the online community’s previous predictions that the milestone would be reached much sooner, between January 2027 and September 2028.

The extended timeline comes just eight months after LessWrong’s initial findings, underscoring the precarious, subjective, ever-shifting nature of AI forecasting.

“The future is uncertain, but we shouldn’t just wait for it to arrive,” the researchers wrote in a report on their findings. “If we try to predict what will happen, if we pay attention to the trends and extrapolate them, if we build models of the underlying dynamics, then we’ll have a better sense of what is likely, and we’ll be less unprepared for what happens.”

Building a more nuanced model

According to LessWrong’s new AI Futures Model, AI will reach the level of “superhuman coder” by February 2032, and could ascend to artificial superintelligence (ASI) within five years of that. A superhuman coder is an AI system that can run 30x as many agents as an organization has human engineers with 5% of its compute budget. It works autonomously at the level of a top human coder, performing tasks in 30x less time than the organization’s best engineer, the researchers explained.

This new revelation pushes the timeline out 3.5 to 5 years farther than in LessWrong’s initial forecast in April 2025. This, it said, is the result of numerous reconsiderations, reframings, and shifting research strategies.

Notably, the researchers were “less bullish” on speedups in AI R&D, and relied on a new framework for a software intelligence explosion (SIE) — that is, whether AI is rapidly improving its own capabilities without the need for more compute, and how quickly that may be occurring. They also focused more heavily on how well AI can set research direction and select and interpret experiments.

The LessWrong researchers analyzed several modeling approaches, eventually settling on “capability benchmark trend extrapolation,” which uses current performance trends and standardized tests to predict future AI capabilities. They estimated artificial general intelligence (AGI)-required compute using METR’s time horizon suite, METR-HRS.

“Benchmark trends sometimes break, and benchmarks are only a proxy for real-world abilities, but… METR-HRS is the best benchmark currently available for extrapolating to very capable AIs,” the researchers wrote.

But while the model pulled heavily from the METR graph, the researchers also adjusted for several other factors.

For instance, compute, labor, data, and other AI inputs won’t continue to grow at the same rate; there’s a “significant chance” they will slow due to limits in chip production, energy resources, and financial investments.

The researchers estimated a one-year slowdown in parameter updates and a two-year slowdown in AI R&D automation due to diminishing returns in software research; they ultimately described the model as “pessimistic” in this area. They also projected slower growth in the leading AI companies’ compute amounts and in their human workforce.

Further, they built the model to be less “binary,” in the sense that it gives a lower probability to very fast or very slow takeoffs. Instead, it computes increases and assumes incremental progress.

“The model takes into account what we think are the most important dynamics and factors, but it doesn’t take into account everything,” the researchers noted. At the end of the day, they analyzed the results and made adjustments “based on intuition and other factors.”

Ultimately, they acknowledged, “we don’t think this model, or any other model, should be trusted completely.”

Incremental steps to AGI

Artificial general intelligence (AGI) is typically understood as AI that has human-level cognitive capabilities and can do nearly everything humans can. But instead of making the full leap from human intelligence to AI to AGI, the LessWrong researchers break the evolution into distinct steps.

The superhuman coder, for instance, will quickly make way for the “superhuman AI researcher” that can fully automate AI R&D and make human researchers obsolete. That will then evolve to a “superintelligent AI researcher,” representing a step-change where AI outperforms human specialists 2x more than the specialists outperform their median researcher colleagues.

Beyond that is top-human-expert-dominating AI, where AI can perform as well as human specialists on nearly all cognitive tasks and ultimately replaces 95% of remote work jobs.

Lastly comes artificial superintelligence (ASI), another step-change where models perform much better than top humans at virtually every cognitive task. The researchers anticipate ASI will occur five years after superhuman coding capabilities are achieved.

“AGI arriving in the next decade seems a very serious possibility indeed,” noted LessWrong researcher Daniel Kokotajlo. He and his colleagues split their model’s progression into stages, the last approaching the understood limits of human intelligence. “Already many AI researchers claim that AI is accelerating their work,” they wrote.

But, they added, “the extent to which it is actually accelerating their work is unfortunately unclear.” Likely, it is a “nonzero,” but potentially very small, impact that could increase as AI becomes more capable. Eventually, this could allow AI systems to outperform humans at “super exponential” speeds, according to the researchers, introducing yet another factor for consideration.

What this means for enterprises

The altered timeline is an “important signal” for enterprises, noted Sanchit Vir Gogia, chief analyst at Greyhound Research. It shows that even sophisticated models are “extremely sensitive” to assumptions about feedback loops, diminishing returns, and bottlenecks.

“The update matters less for the year it lands on and more for what it quietly admits about how fragile forecasting in this space really is,” he said.

Benchmark-driven optimism must be handled with care, he emphasized. While time horizon style benchmarks are useful indicators of progression, they are “poor proxies” for enterprise readiness.

From a CIO perspective, this isn’t a disagreement about whether AI can code; that debate is over, said Gogia. Enterprises should be using AI “aggressively” to compress cycle times while keeping humans accountable for outcomes. To this end, he is seeing more bounded pilots, internal tooling, gated autonomy, and strong emphasis on auditability and security.

It is also critical to correct the “mental model” for the next two to three years, Gogia noted. The dominant shift will not be to fully autonomous coding, but to AI-driven acceleration of processes across the enterprise. “Value will come from redesigning workflows, not from removing people,” he said. “The organizations that succeed will treat AI as a force multiplier inside a disciplined delivery system, not as a replacement for that system.”

Ultimately, repeatable results will reveal whether AI systems can handle complex, multi-repository, long-lived software that doesn’t require constant human rescue, Gogia said. “Until then, the responsible enterprise stance is neither dismissal nor blind belief; it is preparation.”


Automated data poisoning proposed as a solution for AI theft threat 7 Jan 2026, 5:06 am

Researchers have developed a tool that they say can make stolen high-value proprietary data used in AI systems useless, a solution that CSOs may have to adopt to protect their sophisticated large language models (LLMs).

The technique, created by researchers from universities in China and Singapore, is to inject plausible but false data into what’s known as a knowledge graph (KG) created by an AI operator. A knowledge graph holds the proprietary data used by the LLM.

Injecting poisoned or adulterated data into a data system to protect against theft isn’t new. What’s new in this tool, dubbed AURA (Active Utility Reduction via Adulteration), is that authorized users have a secret key that filters out the fake data, so the LLM’s answer to a query is usable. If the knowledge graph is stolen, however, it’s unusable by the attacker unless they know the key, because the adulterants will be retrieved as context, degrading the LLM’s reasoning and leading to factually incorrect responses.

The researchers say AURA degrades the performance of unauthorized systems to an accuracy of just 5.3%, while maintaining 100% fidelity for authorized users, with “negligible overhead,” defined as a maximum query latency increase of under 14%. They also say AURA is robust against various sanitization attempts by an attacker, retaining 80.2% of the adulterants injected for defense, and the fake data it creates is hard to detect.

Why is all this important? Because KGs often contain an organization’s highly sensitive intellectual property (IP), they are a valuable target.

Mixed reactions from experts

However, the proposal has been greeted with skepticism by one expert and with caution by another.

“Data poisoning has never really worked well,” said Bruce Schneier, chief of security architecture at Inrupt Inc., and a fellow and lecturer at Harvard’s Kennedy School. “Honeypots, no better. This is a clever idea, but I don’t see it as being anything but an ancillary security system.”

Joseph Steinberg, a US-based cybersecurity and AI consultant, disagreed, saying, “in general this could work for all sorts of AI and non-AI systems.”

“This is not a new concept,” he pointed out. “Some parties have been doing this [injecting bad data for defense] with databases for many years.” For example, he noted, a database can be watermarked so that if it is stolen and some of its contents are later used (a fake credit card number, for example), investigators know where that piece of data came from. Unlike watermarking, however, which puts one bad record into a database, AURA poisons the entire database, so if it’s stolen, it’s useless.

AURA may not be needed in some AI models, he added, if the data in the KG isn’t sensitive. The unanswered question is what the real-world trade-off between application performance and security would be if AURA were used.

He also noted that AURA doesn’t solve the problem of an undetected attacker interfering with the AI system’s knowledge graph, or even its data.

“The worst case may not be that your data gets stolen, but that a hacker puts bad data into your system so your AI produces bad results and you don’t know it,” Steinberg said. “Not only that, you now don’t know which data is bad, or which knowledge the AI has learned is bad. Even if you can identify that a hacker has come in and done something six months ago, can you unwind all the learning of the last six months?”

This is why Cybersecurity 101 – defense in depth – is vital for AI and non-AI systems, he said. AURA “reduces the consequences if someone steals a model,” he noted, but whether it can jump from a lab to the enterprise has yet to be determined.

Knowledge graphs 101

A bit of background about knowledge graphs: LLMs use a technique called Retrieval-Augmented Generation (RAG) to search for information based on a user query and provide the results as additional reference for the AI system’s answer generation. In 2024, Microsoft introduced GraphRAG to help LLMs answer queries needing information beyond the data on which they have been trained. GraphRAG uses LLM-generated knowledge graphs to improve performance and lower the odds of hallucinations in answers when performing discovery on private datasets such as an enterprise’s proprietary research, business documents, or communications.

The proprietary knowledge graphs within GraphRAGs make them “a prime target for IP theft,” just like any other proprietary data, says the research paper. “An attacker might steal the KG through external cyber intrusions or by leveraging malicious insiders.”

Once an attacker has successfully stolen a KG, they can deploy it in a private GraphRAG system to replicate the originating system’s powerful capabilities, avoiding costly investments, the research paper notes.  

Unfortunately, the low-latency requirements of interactive GraphRAG make strong cryptographic solutions, such as homomorphic encryption of a KG, impractical. “Fully encrypting the text and embeddings would require decrypting large portions of the graph for every query,” the researchers note. “This process introduces prohibitive computational overhead and latency, making it unsuitable for real-world use.”

AURA, they say, addresses these issues, making stolen KGs useless to attackers.

AI is moving faster than AI security

As the use of AI spreads, CSOs have to remember that artificial intelligence and everything needed to make it work also make it much harder to recover from bad data being put into a system, Steinberg noted.

“AI is progressing far faster than the security for AI,” Steinberg warned. “For now, many AI systems are being protected in similar manners to the ways we protected non-AI systems. That doesn’t yield the same level of protection, because if something goes wrong, it’s much harder to know if something bad has happened, and it’s harder to get rid of the implications of an attack.”

The industry is trying to address these issues, as the researchers observe in their paper. One useful reference, they note, is the US National Institute of Standards and Technology (NIST) AI Risk Management Framework, which emphasizes the need for robust data security and resilience, including the importance of developing effective KG protection.

This article originally appeared on CSOonline.


Ruby 4.0.0 introduces ZJIT compiler, Ruby Box isolation 6 Jan 2026, 11:45 pm

Ruby 4.0.0 has arrived as the newest release of the interpreted, object-oriented Ruby programming language. The update features a new just-in-time compiler, ZJIT, and an experimental “Ruby Box” capability for in-process separation of classes and modules.

Released on December 25, 2025, Ruby 4.0.0 can be downloaded from ruby-lang.org.

Ruby Box is a new feature designed to provide separate spaces in a Ruby process for isolating code, libraries, and monkey patches. Anticipated use cases for Ruby Box include running test cases in a box to protect other tests when a test case uses monkey patches to override something, running web app boxes in parallel for blue-green deployments on an app server in a Ruby process, and running web app boxes in parallel to evaluate dependency updates for a specific time period by checking response diffs. Note that Ruby Box is currently experimental and comes with a few known issues.

Ruby 4.0.0 also introduces ZJIT, a new just-in-time compiler intended to be the next generation of YJIT. Built into Ruby’s YARV reference implementation, ZJIT is faster than the interpreter but not yet as fast as YJIT. Developers are encouraged to experiment with ZJIT but to hold off on deploying it in production for now; a more mature ZJIT is expected in Ruby 4.1.

Also in Ruby 4.0.0, Ruby’s parallel execution mechanism, Ractor, has received improvements, including a new class, Ractor::Port, to address issues pertaining to message sending and receiving, and Ractor.shareable_proc, to make it easier to share Proc objects between Ractors. For performance, many internal data structures in Ractor have been improved to reduce contention on a global lock, resulting in better parallelism. Ractors now also share less internal data, resulting in less CPU contention when running in parallel.

Ruby first emerged in 1995. Other features in Ruby 4.0.0 include the following:

  • *nil no longer calls nil.to_a, similar to how **nil does not call nil.to_hash.
  • For core classes, Array#rfind has been added as a more efficient alternative to array.reverse_each.find.
  • Enumerator.produce now accepts an optional size keyword argument to specify the enumerator size.
  • Kernel#inspect now checks for the existence of an #instance_variables_to_inspect method, allowing control over which instance variables are displayed in the #inspect string.


Open WebUI bug turns the ‘free model’ into an enterprise backdoor 6 Jan 2026, 11:28 am

Security researchers have flagged a high-severity flaw in Open WebUI, a self-hosted enterprise interface for large language models, that allows external model servers connected via its Direct Connections feature to inject malicious code and hijack AI workloads.

The issue, tracked as CVE-2025-64496, stems from unsafe handling of server-sent events (SSE), enabling account takeover and, in some cases with extended permissions, remote code execution (RCE) on backend servers.

According to Cato CTRL findings, if an employee connects Open WebUI to an attacker-controlled model endpoint, for example under the pretext of a “free GPT-4 alternative,” the frontend can be tricked into silently executing injected JavaScript. That code steals JSON Web Tokens (JWTs) from the browser context, giving attackers persistent access to the victim’s AI workspace, documents, chats, and embedded API keys.

The bug impacts Open WebUI versions up to 0.6.34 and is fixed in v0.6.35, with enterprises urged to patch production deployments without delay. 

Convenience feature turned into a crisis

Cato researchers said the problem lies in Direct Connections, a feature intended to let users connect Open WebUI to external, OpenAI-compatible model servers. The platform’s SSE handler trusts incoming events from these servers, especially those tagged as “{type: execute},” and executes their payload via a dynamic JavaScript constructor.

When a user connects to a malicious server (easily arranged through social engineering), that server can stream an SSE event carrying executable JavaScript. That script runs with full access to the browser’s storage layer, including the JWT used for authentication.

“Open WebUI stores the JWT token in localStorage,” Cato researchers said in a blog post. “Any script running on the page can access it. Tokens are long-lived by default, lack HttpOnly, and are cross-tab. When combined with the execute event, this creates a window for account takeover.”
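To see the shape of the bug, consider this schematic TypeScript sketch of the anti-pattern (an illustration, not Open WebUI’s actual source; the event field names are assumptions):

// Schematic illustration of the vulnerable pattern described above.
const source = new EventSource('https://models.example.com/v1/stream');

source.onmessage = (ev: MessageEvent<string>) => {
  const event = JSON.parse(ev.data);
  if (event.type === 'execute') {
    // Dynamic code construction: whatever the remote model server streamed
    // now runs with the page's privileges, including reads of the JWT
    // sitting in localStorage.
    new Function(event.code)();
  }
};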

The attack requires the victim to enable Direct Connections (disabled by default) and add the attacker’s malicious model URL, according to an NVD description.

Escalating to Remote Code Execution

The risk doesn’t stop at account takeover. If the compromised account has workspace.tools permissions, attackers can leverage that session token to push authenticated Python code through Open WebUI’s Tools API, which executes without sandboxing or validation.

This turns a browser-level compromise into full remote code execution on the backend server. Once an attacker gets Python execution, they can install persistence mechanisms, pivot into internal networks, access sensitive data stores, or run lateral attacks.

NVD assigned the flaw a high-severity base score of 8/10, while GitHub scored it 7.3/10. It was rated high rather than critical because exploitation requires the Direct Connections feature to be enabled and hinges on a user first being lured into connecting to a malicious external model server. The patch in Open WebUI v0.6.35 blocks “execute” SSE events from Direct Connections entirely, but any organization still on older builds remains exposed. Additionally, the researchers advised moving authentication to short-lived, HttpOnly cookies with rotation. “Pair with a strict CSP and ban dynamic code evaluation,” they added.
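A rough sketch of those mitigations, assuming an Express backend with the helmet middleware (the endpoint name and cookie lifetime are illustrative):

import crypto from 'node:crypto';
import express from 'express';
import helmet from 'helmet';

const app = express();

// Strict CSP: same-origin scripts only; dynamic code evaluation such as
// eval() or new Function() is blocked unless 'unsafe-eval' is added.
app.use(
  helmet.contentSecurityPolicy({
    directives: { defaultSrc: ["'self'"], scriptSrc: ["'self'"] },
  })
);

app.post('/login', (req, res) => {
  // ...authenticate the user here, then issue a short-lived session id...
  const sessionId = crypto.randomBytes(32).toString('hex');
  res.cookie('session', sessionId, {
    httpOnly: true, // invisible to page scripts, unlike localStorage
    secure: true,
    sameSite: 'strict',
    maxAge: 15 * 60 * 1000, // 15 minutes; rotate server-side on use
  });
  res.sendStatus(204);
});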

