Google pitches Agentic Data Cloud to help enterprises turn data into context for AI agents 23 Apr 2026, 4:49 pm
Google is recasting its data and analytics portfolio as the Agentic Data Cloud, an architecture it says is aimed at moving enterprise AI from pilot to production by turning fragmented data into a unified semantic layer that agents can reason over and act on more reliably at scale.
The new architecture builds on Google’s existing data platform strategy, bringing together services such as BigQuery, Dataplex, and Vertex AI, and elevating their capabilities in metadata, governance, and cross-cloud interoperability into what the company describes as a shared intelligence layer.
That intelligence layer is underpinned by the new Knowledge Catalog, an evolution of Dataplex Universal Catalog that, the company said, uses new capabilities to extend its metadata foundation into a semantic layer mapping business meaning and relationships across data sources.
These capabilities include native support for third-party catalogs and applications such as Salesforce, Palantir, Workday, SAP, and ServiceNow, as well as the option to move third-party data into Google’s lakehouse, where it is automatically mapped to the Knowledge Catalog.
To capture business logic more directly for data stored inside Google Cloud, the company is adding tools including a LookML-based agent, currently in preview, that can derive semantics from documentation, and a new feature in BigQuery, also in preview, that allows enterprises to embed that business logic for faster data analysis.
Beyond aggregation, the catalog itself is designed to continuously enrich semantic context by analyzing how data is used across an enterprise, senior Google executives wrote in a blog post.
This includes profiling structured datasets as well as tagging and annotating unstructured content stored in Google Cloud Storage, the executives pointed out, adding that the catalog’s underlying system can also infer missing structure in data by using its Gemini models to generate schemas and identify relationships.
Turning data into business context: the next battleground for AI
For analysts, Google’s focus on semantics targets one of the biggest barriers to production AI for enterprises.
“The hardest AI problem is inconsistent meaning,” said Dion Hinchcliffe, lead of the CIO practice at The Futurum Group, noting that a unified semantic layer could help CIOs establish consistent business context across systems while reducing the need for developers to manually stitch together metadata and lineage.
That focus on semantic context also reflects a broader shift in how hyperscalers are approaching enterprise AI. Microsoft with Fabric IQ and AWS with Nova Forge are pursuing similar strategies, building semantic context layers over enterprise data to make AI systems more consistent and easier to operationalize at scale.
While Microsoft’s approach is to wrap AI applications and agents with business context and semantic intelligence in its Fabric IQ and Work IQ offerings, AWS wants enterprises to blend business context into a foundational LLM by feeding it their proprietary data.
Mike Leone, principal analyst at Moor Insights and Strategy, said Google’s approach, though closer to Microsoft’s, places the data gravity one layer above the lakehouse, within its data catalog and semantic graph capabilities.
“Google and Microsoft are solving the same problem from different angles, Fabric through a unified data foundation and Google through a unified semantic and context layer,” Leone said.
Even data analytics software vendors are converging on the idea of offering a catalog that can map semantic context from a variety of data sources, Leone added, pointing to Databricks’ Unity Catalog and Snowflake’s Horizon Catalog.
Semantic accuracy could pose challenges for CIOs
However, Google’s approach to building an intelligent semantic layer, especially its evolved Knowledge Catalog, comes with its own set of risks for CIOs.
The new catalog’s automated semantic context refinement capability, according to Jim Hare, VP analyst at Gartner, could amplify governance challenges, especially around metadata management: “In complex enterprise domains, errors in inferred relationships or definitions will require ongoing human domain oversight to maintain trust.”
Hare also warned of operational and cost management challenges.
“Agent-driven workflows spanning analytical and operational data, potentially across clouds, will introduce new challenges in observability, debugging, and cost predictability,” he said. “Dynamic agent behavior can generate opaque consumption patterns, requiring chief data and analytics officers (CDAOs) to closely manage cost attribution, usage limits, and operational guardrails as these capabilities mature.”
Adopting Google’s new architectural approach could increase dependence at the orchestration layer, resulting in issues around portability, he warned: “Exiting Google-managed semantics, Gemini agents, or BigQuery abstractions may be harder than migrating data alone.”
Bi-directional federation as strategic play
Even so, the trade-offs may be acceptable for enterprises prioritizing tighter data integration over flexibility.
As part of the new architecture, Google is also offering cross-platform data interoperability via the Apache Iceberg REST Catalog that it says will allow bi-directional federation, in turn letting enterprises access, query, and govern data across environments such as Databricks, Snowflake, and AWS without requiring data movement or incurring egress fees.
For Stephanie Walter, practice leader of the AI stack at HyperFRAME Research, this interoperability will be strategically important for enterprises scaling agents in production, especially those with heterogeneous data environments.
Moor Insights and Strategy’s Leone, though, sees it as a different strategic play to address enterprises’ demand to access Databricks, Snowflake, and hyperscaler environments without costly data movement.
Google’s Agentic Data Cloud architecture also includes a Data Agent Kit, currently in preview, which the company says is designed to help enterprises build, deploy, and manage data-aware AI agents that can interact with governed datasets, apply business logic, and execute workflows across systems.
Robert Kramer, managing partner at KramerERP, said the Data Agent Kit will help data practitioners abstract away daily tasks, in turn lowering the barrier to operationalizing agentic AI across workflows.
However, Gartner’s Hare warned that enterprises should guard against over-delegating critical data management decisions to automated agents without sufficient observability, validation controls, and human review, particularly where downstream AI systems depend on these agents for continuous data operations.
Offer customers passkeys by default, UK’s NCSC tells enterprises 23 Apr 2026, 1:12 pm
The UK’s National Cyber Security Centre (NCSC) is recommending passkeys as the default authentication method for businesses to offer consumers, citing industry progress that now makes them a more secure and user-friendly alternative to passwords.
In a blog post published this week, the agency said passkeys can now be recommended to both the public and businesses as a primary authentication method.
“Passkeys should now be consumers’ first choice of login,” the UK cybersecurity authority said in a blog post, adding that passwords are “no longer resilient enough for the contemporary world.”
“Passkeys are a newer method for logging into online accounts which do much of the heavy lifting for users, only requiring user approval rather than needing to input a password. This makes passkeys quicker and easier to use and harder for cyber attackers to compromise,” the NCSC added in the blog.
The agency said passkeys should be used wherever supported, describing them as resistant to phishing and eliminating risks associated with password reuse.
Focus on phishing-resistant authentication
The guidance is based on the agency’s assessment of how authentication methods perform against real-world attacks.
The NCSC said its analysis examines common techniques, including phishing, credential reuse, and session hijacking, and evaluates how credentials are exposed across their lifecycle, from creation and storage to use.
“Passkeys are resistant to phishing attacks and remove the risks associated with password reuse,” the agency said.
In its accompanying technical paper, the NCSC said traditional authentication methods, including passwords combined with one-time codes, remain “inherently phishable.”
By contrast, FIDO2-based credentials such as passkeys are “as secure or more secure than traditional MFA against all common credential attacks observed in the wild,” the agency said.
However, the NCSC cautioned in the technical paper that “while much of the analysis in this paper also applies to enterprise authentication scenarios (for example staff authenticating to a Single Sign On), the different threat model and usage scenarios mean this paper is not intended for enterprise risk assessment.”
How passkeys change the attack model
The NCSC added that passkeys reduce risk by removing reliance on shared secrets and binding authentication to the legitimate service.
According to the agency, this prevents credential reuse and relay attacks, as authentication cannot be intercepted and reused by an attacker.
Passkeys use cryptographic key pairs stored on a user’s device, with authentication tied to device-based verification such as biometrics or PINs, the agency said.
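That origin binding is visible in the browser’s WebAuthn API itself. The TypeScript sketch below shows a minimal passkey sign-in; the challenge delivery and the relying-party ID are illustrative assumptions, not part of the NCSC guidance:

```typescript
// Minimal sketch of a browser-side passkey login via the WebAuthn API.
// The server endpoint supplying the challenge is assumed; option fields
// follow the standard PublicKeyCredentialRequestOptions dictionary.
async function signInWithPasskey(challengeFromServer: Uint8Array) {
  const assertion = (await navigator.credentials.get({
    publicKey: {
      challenge: challengeFromServer, // random, single-use value from the server
      rpId: "example.com",            // credential is scoped to this origin
      userVerification: "required",   // forces the device biometric/PIN check
      timeout: 60_000,
    },
  })) as PublicKeyCredential;

  // The signed assertion goes back to the server for signature verification.
  // A phishing site on another origin cannot obtain a valid signature,
  // because the browser only releases the credential for the matching rpId.
  const response = assertion.response as AuthenticatorAssertionResponse;
  return {
    credentialId: assertion.id,
    signature: new Uint8Array(response.signature),
    clientDataJSON: new Uint8Array(response.clientDataJSON),
    authenticatorData: new Uint8Array(response.authenticatorData),
  };
}
```

Because there is no shared secret for a user to type, there is nothing for an attacker to phish or replay, which is the property the agency’s analysis turns on.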
Shift in user-level authentication
For organizations that provide online services to customers, the guidance signals a shift in how authentication is implemented at the user interface level.
“This is a fundamental architectural change, not an incremental authentication upgrade,” said Madelein van der Hout, senior analyst at Forrester. “It moves organizations beyond the passwords-plus-MFA paradigm toward a phishing-resistant foundation.”
Van der Hout said passkeys eliminate risks associated with credential theft by using device-bound cryptographic authentication rather than shared secrets.
“Organizations that treat this as a credential swap will underinvest,” she said. “Those who treat it as a broader identity modernization opportunity will get ahead.”
The NCSC said organizations should also consider how authentication is implemented across the full user journey, including account recovery and fallback mechanisms.
While passkeys reduce reliance on passwords, the agency noted that weaker processes, such as password resets or account recovery flows, can still introduce risk if not properly secured.
Adoption challenges remain
The NCSC said passkeys are not yet universally supported and recommended password managers and multi-factor authentication where passkeys cannot be used.
“Where a particular service does not support passkeys, the NCSC’s advice to consumers is to use a password manager to create stronger passwords and keep using two-step verification,” the NCSC noted in the blog post.
Van der Hout said implementation challenges are likely, particularly for organizations operating across multiple platforms and user environments.
“Legacy systems and fragmented identity environments present significant obstacles,” she said.
She added that organizations must also consider non-human identities. “Any passkey strategy that ignores the machine identity layer will create new security gaps,” she said.
Device requirements and account recovery processes may also affect how passkeys are deployed, she said.
Hybrid model is expected during the transition
A full transition away from passwords is unlikely in the near term, analysts believe.
“Expect a hybrid model lasting several years,” van der Hout said, as organizations continue to support both passkeys and traditional authentication methods.
During this period, organizations will need to manage authentication across multiple login options while ensuring that fallback methods do not weaken overall security, she added.
The NCSC similarly advised maintaining strong authentication practices where passkeys are not yet available.
Policy signal strengthens shift toward passwordless login
The guidance adds to broader efforts to move away from passwords in consumer authentication.
“The guidance matters because it gives security leaders leverage,” van der Hout said, including in discussions with vendors and internal stakeholders.
The NCSC said that moving toward phishing-resistant authentication could reduce a major cause of cyber compromise, particularly in services that rely on user login credentials.
The article originally appeared in CSO.
Microsoft taps Anthropic’s Mythos to strengthen secure software development 23 Apr 2026, 9:28 am
Microsoft plans to integrate Anthropic’s Mythos AI model into its Security Development Lifecycle, a move that suggests advanced generative AI is beginning to play a direct role in how major software vendors identify vulnerabilities and harden code against attack.
The company said it will use Mythos Preview, along with other advanced models, as part of a broader push to strengthen secure coding and vulnerability detection earlier in the software development process.
The announcement comes as Anthropic’s Mythos heightens concerns that advanced AI models could dramatically shrink the time between finding a software flaw and exploiting it. Analysts say Mythos marks a notable leap in AI-driven vulnerability research, with the ability to uncover thousands of serious flaws across major operating systems and browsers.
OpenAI has also entered the space with GPT-5.4-Cyber, a version of its flagship model tailored for defensive cybersecurity work. Keith Prabhu, founder and CEO of Confidis, said a future OpenAI model, which he referred to as “Spud,” could emerge as an even stronger rival.
The move matters beyond Microsoft’s own engineering organization. For enterprise security leaders, it offers a clear sign that frontier AI models are starting to move from experimental use into core cybersecurity workflows.
That could change how software vendors build products and how defenders view the risks and benefits of using the same AI tools attackers may also exploit.
“This marks a seminal turning point in the secure software development lifecycle process,” Prabhu said. “While earlier tools were only capable of static code scanning for vulnerabilities, with AI, there is a possibility of a dynamically learning model which can also perform dynamic vulnerability and even penetration testing in real time.”
Over time, Prabhu said, the pressure to adopt AI-assisted security tools is likely to spread beyond the largest software vendors.
Why Microsoft’s move matters
Neil Shah, vice president for research at Counterpoint Research, said more than 95% of Fortune 500 companies use Microsoft Azure in some capacity, while Azure AI and the Copilot suite are entrenched across about 65% of those companies. Millions of businesses also rely on multiple Microsoft products and cloud services.
“Using Mythos in Microsoft’s Security Development Lifecycle could help strengthen and harden products like Windows, Azure, Microsoft 365, and developer tools,” Shah said. “Every enterprise running those products could benefit from the security improvement without needing direct Mythos access themselves.”
Prabhu noted that Microsoft said it had evaluated Mythos using its open-source benchmark for real-world detection engineering tasks, with results showing substantial improvements over prior models.
“Such a claim coming from Microsoft does suggest that these new AI models are becoming materially better at identifying exploitable flaws than earlier generations,” Prabhu added. “However, as with any AI tool, the strength of the tool lies in its ability to analyze code quickly based on past learning. There is a possibility that it could miss new types of vulnerabilities that only a ‘human-in-the-loop’ could identify.”
The article originally appeared in CSO.
How I doubled my GPU efficiency without buying a single new card 23 Apr 2026, 9:00 am
Late last year I got pulled into a capacity planning exercise for a global retailer that had wired a 70B model into their product search and recommendation pipeline. Every search query triggered an inference call. During holiday traffic their cluster was burning through GPU-hours at a rate that made their cloud finance team physically uncomfortable. They had already scaled from 24 to 48 H100s and latency was still spiking during peak hours. I was brought in to answer a simple question: Do we need 96 GPUs for the January sale or is something else going on?
I started where I always start with these engagements: profiling. I instrumented the serving layer and broke the utilization data down by inference phase. What came back changed how I think about GPU infrastructure.
During prompt processing — the phase where the model reads the entire user input in parallel — the H100s were running at 92% compute utilization. Tensor cores fully saturated. Exactly what you want to see on a $30K GPU. But that phase lasted about 200 milliseconds per request. The next phase, token generation, ran for 3 to 9 seconds. During that stretch the same GPUs dropped to 30% utilization. The compute cores sat idle while the memory bus worked flat out reading the attention cache.
We were paying H100-hour rates for peak compute capability and getting peak performance for roughly 5% of every request’s wall time. The other 95% was a memory bandwidth problem wearing a compute-priced GPU.
The pattern hiding in plain sight
Once I saw it, I couldn’t unsee it. LLM inference is two workloads pretending to be one. Prompt processing (the industry calls it prefill) is a dense matrix multiplication that lights up every core on the chip. Token generation (decode) is a sequential memory read that touches a fraction of the compute. They alternate on the same hardware inside the same scheduling loop. I’ve worked on carrier-scale Kubernetes clusters and high-throughput data pipelines, and I’ve never seen a workload profile this bimodal running on hardware this expensive.
If you ran a database this way — provisioning for peak write throughput and then using the server 90% of the time for reads — you’d split it into a write primary and read replicas without a second thought. But most teams serving LLMs haven’t made that connection yet.
The monitoring tools make it worse. Every inference dashboard I looked at reported a single “GPU utilization” number: The average of both phases blended together. Our cluster showed 55%. Looks fine. Nobody panics at 55%. But 55% was the average of 92% for a few hundred milliseconds and 30% for several seconds. The dashboards were hiding a bimodal distribution behind a single number.
Researchers at UC San Diego’s Hao AI Lab published a paper called DistServe at OSDI 2024 that laid out the problem with numbers I could have pulled from my own profiling. Their measurements on H100s showed the same pattern: Prefill at 90–95% utilization, decode at 20–40%. They also proposed the fix.
Splitting the work in two
The fix is called disaggregated inference. Instead of running both phases on the same GPU pool you stand up two pools: One tuned for compute throughput (prompt processing) and one tuned for memory bandwidth (token generation). A routing layer in front sends each request to the right pool at the right time and the attention cache transfers between them over a fast network link.
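A minimal sketch of the routing idea, assuming hypothetical pool endpoints and a made-up cache-handle protocol; production frameworks such as Dynamo, vLLM, and SGLang implement the handoff natively over RDMA rather than HTTP:

```typescript
// Illustrative routing layer for disaggregated inference. The endpoints
// and the kvCacheHandle field are hypothetical placeholders, not any
// specific framework's API.
interface InferenceRequest {
  prompt: string;
  maxTokens: number;
}

const PREFILL_POOL = "http://prefill-pool.internal/v1"; // compute-optimized GPUs
const DECODE_POOL = "http://decode-pool.internal/v1";   // bandwidth-optimized, large batches

async function serve(req: InferenceRequest): Promise<ReadableStream<Uint8Array>> {
  // Phase 1: prompt processing on the compute pool. Returns a handle to
  // the attention (KV) cache built while reading the prompt in parallel.
  const prefill = await fetch(`${PREFILL_POOL}/prefill`, {
    method: "POST",
    headers: { "content-type": "application/json" },
    body: JSON.stringify({ prompt: req.prompt }),
  }).then((r) => r.json()); // assumed shape: { kvCacheHandle: string }

  // Phase 2: token generation on the memory-bandwidth pool, which pulls
  // the cache over the fast interconnect and joins a large decode batch.
  const decode = await fetch(`${DECODE_POOL}/decode`, {
    method: "POST",
    headers: { "content-type": "application/json" },
    body: JSON.stringify({
      kvCacheHandle: prefill.kvCacheHandle,
      maxTokens: req.maxTokens,
    }),
  });
  return decode.body!; // stream generated tokens back to the caller
}
```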
When I first proposed this to the customer, they were skeptical. Two pools mean more operational complexity. A cache transfer protocol adds a network dependency that monolithic serving doesn’t have. Fair objections. So, I pointed them at who’s already running it.
Perplexity built their entire production serving stack on disaggregated inference using RDMA for cache transfers. Meta runs it. LinkedIn runs it. Mistral runs it. By early 2026 NVIDIA shipped an orchestration framework called Dynamo that treats prefill and decode as first-class pool types. The open-source engines — vLLM and SGLang — both added native disaggregated serving modes. Red Hat and IBM Research open-sourced a Kubernetes-native implementation called llm-d that maps the architecture onto standard cluster management workflows.
This isn’t a research prototype waiting for someone brave enough to try it. It’s the default architecture at the companies serving more LLM traffic than anyone else on the planet.
What changed when we split the pools
We ran a two-week proof of concept. I split the cluster into two pools: Eight GPUs dedicated to prompt processing and the remaining GPUs handling token generation. No new hardware, no new cluster — just a configuration change in the serving layer and a routing policy that sent each request to the right pool based on its inference phase. The prompt-processing pool hit 90–95% compute utilization consistently because that’s all it did. No token generation competing for scheduling slots. No decode requests sitting idle while a prefill burst hogged the cores.
The token-generation pool was the bigger surprise. By batching hundreds of concurrent decode requests together the memory reads got amortized across more work. Bandwidth utilization climbed above 70% — far better than the 30% we’d been seeing when decode requests were interleaved with prefill on the same GPU. Overall compute efficiency roughly doubled.
The cost math followed. The customer was spending about $2M annually on inference GPU-hours. After disaggregation they were on track to cut that by $600–800K while serving the same request volume at the same latency targets. No new hardware purchased. Same GPUs, same cluster, same model weights — different architecture.
The latency story was just as good. In the monolithic setup every time a new prompt arrived its processing burst would stall active token-generation requests. Users watching streaming responses would see the text pause mid-sentence while someone else’s prompt got processed. After the split: Steady token cadence with no prefill-induced stalls. P99 inter-token latency flattened out completely.
There are workloads where this doesn’t pay off. Short prompts under 512 tokens with short outputs don’t generate enough cache to justify a network transfer. Multi-turn conversations where 80%+ of the cache already lives on the decode worker from a previous turn are better served locally. And if you have fewer than a dozen GPUs the scheduling overhead of two pools can eat into whatever you save on utilization. But the teams complaining about GPU shortages and GPU bills are not running 4-GPU deployments with 512-token prompts. They’re running dozens to hundreds of GPUs at enterprise scale where the utilization waste adds up to millions per year.
The industry spends a lot of energy on the GPU supply side: Build more fabs, design better chips, negotiate bigger cloud contracts. Those things matter. But I keep coming back to what I saw in that profiling data. If the teams running monolithic LLM inference today switched to disaggregated serving the effective GPU supply would roughly double overnight. No new silicon required. The tools are ready. The proof points are in production. The only thing missing is the profiling step that makes the waste visible.
If you haven’t broken your inference utilization down by phase yet, do it this week. Add per-phase instrumentation to your serving layer. Plot prefill utilization and decode utilization separately over a 24-hour window. If the two lines look like they belong on different charts — and they will — you have your answer. You’ll stop paying for compute you’re not using.
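That per-phase breakdown can start as something very simple: tag each utilization sample with the phase the scheduler was in when it was taken, then aggregate the two series separately. A minimal TypeScript sketch of the aggregation step, with illustrative field names (the sampling hook itself, e.g. periodic DCGM or nvidia-smi polling, is assumed):

```typescript
// Per-phase utilization aggregation: instead of one blended "GPU
// utilization" number, bucket each sample by the active inference phase.
type Phase = "prefill" | "decode";

interface UtilSample {
  timestampMs: number;
  gpuUtilPct: number; // from whatever GPU telemetry you already poll
  phase: Phase;       // tagged from the serving layer's scheduler state
}

function perPhaseUtilization(samples: UtilSample[]): Record<Phase, number> {
  const totals: Record<Phase, { sum: number; n: number }> = {
    prefill: { sum: 0, n: 0 },
    decode: { sum: 0, n: 0 },
  };
  for (const s of samples) {
    totals[s.phase].sum += s.gpuUtilPct;
    totals[s.phase].n += 1;
  }
  return {
    prefill: totals.prefill.n ? totals.prefill.sum / totals.prefill.n : 0,
    decode: totals.decode.n ? totals.decode.sum / totals.decode.n : 0,
  };
}
```

Plot the two numbers as separate series and the bimodal profile the blended average was hiding becomes obvious.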
This article is published as part of the Foundry Expert Contributor Network.
Is your Node.js project really secure? 23 Apr 2026, 9:00 am
JavaScript and Node.js teams do not lack security tools. What they still lack is a dependency security workflow that developers will actually use before release.
That is the real gap. A package gets installed, CI (continuous integration) runs, a scanner executes somewhere in the pipeline, and eventually a report appears. From a distance, that can look like maturity. In practice, it often means developers learn about dependency risks too late, too indirectly, and with too little clarity to act while the fix is still easy.
The real problem in JavaScript and Node.js security is no longer detection. It is actionability.
That is why so many teams can say they scan dependencies and still struggle to answer the questions that matter right before release. What exactly is vulnerable? Is it direct or transitive? Is there a fixed version? Can I fix it in my own project, or am I blocked behind an upstream dependency? Which finding deserves attention first?
Those are not edge cases. That is the real work.
In Node.js projects, the problem is easy to hide. A team may manage a reasonable number of direct dependencies while shipping hundreds or thousands of resolved packages through a lockfile. At that point, the challenge is no longer whether a scanner can produce output. Most can. The challenge is whether the result is understandable enough, local enough, and actionable enough to help a developer make a release decision before the issue turns into pipeline noise or last-minute triage.
That is where many workflows still fail. Detection exists. Usability often does not. Node.js teams do not have a scanner shortage. They have a workflow shortage.
What is missing is a fixability-first view of dependency security. Teams do not just need to know that something is vulnerable. They need to know what is directly actionable now, what is buried in transitive dependencies, and what kind of remediation path they are actually dealing with.
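To make that concrete: the classification itself is mechanical once you have a lockfile. Resolved packages that sit at the top level of node_modules and appear in the project’s own manifest are direct; everything else is transitive. The TypeScript sketch below is not CVE Lite CLI’s implementation, just an illustration of the split a fixability-first tool has to compute before it can rank findings:

```typescript
// Rough sketch: classify resolved packages from an npm v3 lockfile as
// direct or transitive. Real tools also resolve full dependency paths.
import { readFileSync } from "node:fs";

interface LockfileV3 {
  packages: Record<string, { version?: string }>;
}

function classifyDependencies(lockfilePath: string, pkgJsonPath: string) {
  const lock: LockfileV3 = JSON.parse(readFileSync(lockfilePath, "utf8"));
  const pkg = JSON.parse(readFileSync(pkgJsonPath, "utf8"));
  const declared = new Set<string>([
    ...Object.keys(pkg.dependencies ?? {}),
    ...Object.keys(pkg.devDependencies ?? {}),
  ]);

  const findings: { name: string; version?: string; direct: boolean }[] = [];
  for (const [path, meta] of Object.entries(lock.packages)) {
    if (path === "") continue; // the root project entry
    // "node_modules/foo" is top-level; "node_modules/a/node_modules/foo" is nested
    const name = path.replace(/^.*node_modules\//, "");
    const direct = path === `node_modules/${name}` && declared.has(name);
    findings.push({ name, version: meta.version, direct });
  }
  return findings;
}
```

Everything marked direct maps to an upgrade decision the team can make today; everything else maps to a parent chain that has to be inspected first.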
What CVE Lite CLI does differently
This is the problem I have been exploring through CVE Lite CLI, an open source tool built around the local dependency workflow JavaScript developers actually need.
CVE Lite CLI is not trying to win the platform race. It is trying to solve the moment where a developer needs a clear answer before release.
Its scope is intentionally narrow. It does not try to do exploitability analysis, runtime reachability, container scanning, secret scanning, or infrastructure scanning. It focuses on a more practical job: scanning JavaScript and TypeScript projects locally from their lockfiles, identifying known OSV-backed dependency issues, separating direct from transitive findings, showing dependency paths, surfacing fixed-version guidance, and producing output a developer can actually use before release.
That narrower scope is not a weakness. It is the reason the tool is useful.
Too much security tooling is built around organizational visibility. The CVE Lite CLI workflow is built around developer decision-making. Its value is not simply that it tells you vulnerabilities exist. Its value is that it makes dependency risk understandable early enough to change developer behavior.
That distinction matters. A warning that arrives late in CI may be technically correct, but operationally weak. A warning that appears locally, with direct versus transitive separation and dependency paths, is much closer to a plan for a fix.
This is the gap CVE Lite CLI aims to address. It moves dependency security closer to the point where engineering decisions are actually made.
Recent work on CVE Lite CLI pushes that workflow further by surfacing the exact package command for direct fixes where available. That makes the tool more useful at the moment developers move from detection to action.
In the stronger cases, providing the package command turns the tool from a scanner into a local remediation loop: scan, apply the suggested package change, and rescan immediately without waiting for branch-and-pipeline feedback.
That shift is bigger than convenience. It changes the feel of dependency security from a distant report into an active engineering loop. It lets the developer stay in the same working session, make a change, verify the result, and keep moving.
Local-first vulnerability scanning with CVE Lite CLI
In April 2026, I ran CVE Lite CLI against three public open source projects: Nest, pnpm, and release-it. The goal was not to single out those projects. Well-maintained projects can still surface dependency issues, and scan results can change over time. The point was to test whether a local-first tool could give developers something concrete enough to shape action.
The Nest run has now evolved into a fuller case study that makes the larger point clearer: the value of a local-first tool is not just that it detects issues, but that it helps developers move from scan output to a realistic remediation path in the same working session.
In Nest, CVE Lite CLI parsed 1,626 packages from package-lock.json and found 25 packages with known OSV matches: one high-severity issue, four medium, and 20 low. More important than the count was the structure. Twelve findings looked directly fixable in the project. Thirteen were transitive.
That is the kind of distinction raw counts hide. Twenty-five findings may sound alarming, but the real engineering question is how many of those can be acted on immediately. A fixability-first workflow makes that visible.
What the fuller Nest case study shows is that remediation is often iterative, not one-and-done. In one dependency path, resolving the issue required several tar upgrades in sequence as the dependency graph changed after each install. That is exactly where a local scan-fix-rescan loop becomes more useful than a CI-only workflow. Instead of upgrading, pushing a branch, waiting for a pipeline scanner, and discovering the next required upgrade later, the developer can keep working through the path locally until the dependency state is clean.
One of the strongest findings was diff@2.2.3, a high-severity transitive issue appearing through gulp-diff. The same scan also surfaced diff@4.0.2 as a medium-severity direct dependency and diff@7.0.0 as a medium-severity transitive dependency through mocha. That is a realistic picture of Node.js dependency management: the same package appearing in multiple forms, through multiple parents, with different remediation implications.
A weaker tool would simply tell the developer that vulnerabilities were found. CVE Lite CLI did something more useful. It exposed the dependency paths clearly enough to show why the remediation work was different in each case.
The same Nest scan surfaced tar@6.2.1 as a medium-severity direct dependency with fixed-version guidance, and form-data@2.3.3 as a medium-severity transitive issue through request. Those are not the same category of problem. One points toward a direct upgrade decision. The other points toward upstream dependency pressure. That is where dependency scanning stops being a checklist exercise and starts becoming real engineering work.
And that is where this kind of local-first dependency workflow performs well. It does not just report that something is wrong. It shows the developer what kind of wrong they are dealing with.
The release-it scan reinforced the same point on a smaller scale. CVE Lite CLI parsed 545 packages and found 10 packages with known OSV matches: four medium-severity and six low. Six appeared directly fixable. Four were transitive.
Two direct findings stood out immediately: @isaacs/brace-expansion@5.0.0 and flatted@3.3.3. Those are the kinds of issues a developer can reason about quickly. But the scan also found two minimatch findings arriving transitively through different parent chains, one through @npmcli/map-workspaces and another through glob.
That matters because it shows the tool is not only useful in large, messy dependency graphs. It is also useful in smaller projects where the real value comes from turning a vague dependency concern into a specific, inspectable remediation path.
The pnpm scan mattered for the opposite reason. CVE Lite CLI parsed 563 packages from pnpm-lock.yaml and returned no known OSV matches. That kind of result is easy to undervalue, but it should not be. A serious local workflow should not exist only to generate alerts. It should also be able to give developers confidence quickly when there is nothing obvious to fix.
That clean-result case is one of the reasons a lightweight local tool belongs in the workflow. Developers do not just need early warning. They also need fast reassurance.
Bringing dependency security into the developer workflow
The larger lesson here is not that open source projects are failing. It is that the developer workflow around dependency security is still immature. Teams have learned how to collect results. They have not learned how to make those results usable at the point where developers choose packages, update lockfiles, and prepare releases.
That is why CVE Lite CLI matters beyond the tool itself. It addresses a workflow problem that many JavaScript teams still live with every day.
The bigger issue is not one project or one scanner. It is whether dependency security becomes a normal part of everyday engineering practice.
CVE Lite CLI takes steps in that direction. It gives developers a local release check instead of forcing them to wait for CI. It gives them direct versus transitive visibility instead of flattening everything into one alarming list. It gives them dependency paths instead of vague package names with no remediation context. It gives them fixed-version guidance where possible instead of leaving them to infer the next move.
And because CVE Lite CLI is intentionally lightweight and narrow in scope, it is easier to trust, easier to adopt, and easier to add to a normal Node.js toolchain.
That point matters. Developers are already overloaded with tooling. The next tool that earns a place in the workflow will not be the one that makes the biggest promises. It will be the one that solves a real problem cleanly, honestly, and without forcing teams into a larger platform commitment.
That is why CVE Lite CLI has real potential. It meets developers where they already work.
More importantly, it points toward a broader shift in how dependency security should be understood. Security tooling is moving from vulnerability detection to vulnerability interpretation, from counting issues to understanding risk in context. That is where developer workflow becomes more important than dashboard volume.
A missing link in the developer toolchain
Dependency security should not feel like a special event. It should feel like linting, testing, or checking build output before release. In other words, it should become a normal part of the engineering loop.
That is the strongest case for CVE Lite CLI. It helps move security from a distant control function into an everyday developer habit.
For dependency paths that require more than one adjustment, a local-first, scan-fix-rescan workflow can be materially faster than relying on repeated CI feedback alone. If developers can scan lockfile-backed dependency state locally, understand what is direct, understand what is transitive, see the dependency paths, and get a credible sense of what to fix before release, then dependency security stops being abstract policy and starts becoming practical engineering.
That is what the JavaScript ecosystem needs more of.
Node.js does not need more theatrical security output. It needs better developer workflow infrastructure. It needs tools that can give clear, immediate, low-friction answers while there is still time to act. It needs tools that make dependency risk visible in the same place where dependency decisions are made.
A local-first, lockfile-aware workflow points in that direction.
And if the goal is to make dependency security a real part of everyday software engineering practice, then local-first lockfile scanning should stop being treated as a niche extra. It should become a normal part of the developer toolchain.
How open source ideals must expand for AI 23 Apr 2026, 9:00 am
Open source has never been just a licensing model. Rather, it’s also a philosophy about shared effort, shared transparency, and shared agency. The shared goal is to make an impact in the world. In the age of AI‑assisted development and agents, there is a line of thinking that AI slop, specifically mass-produced and submitted code, is the downfall of open source projects. On the contrary, I think open source is headed for a resurgence like we’ve never seen before, as long as we emphasize the aspects of open source beyond code submissions.
In this new world, the philosophy, ethics, and morals of open source are more relevant than ever. However, the focus of open source needs to evolve past raw code: Specification files (spec files) and governance documents (constitutions) are becoming as important as the source itself. The challenge is not to choose between open source and AI, but to recognize that open source is now a community-based control and scope mechanism for open technologies.
Let me break this down further. Specs describe intent and outcomes, while code shows how that intent is actually realized. When something goes wrong, you still need to trace the path from spec to implementation. Trust is earned, not inferred, so the promise that, for example, an app values your privacy or that an agent never sends data to third parties must be backed by code and built on pipelines that anyone can inspect and verify (e.g. Acquacotta Constitution).
Governance complements spec files, in showing how a spec is created, enforced and followed. It’s the “people decisions” around a project — who makes the final decision? If there’s a vote on something, who votes? How do they vote? It’s these seemingly pedantic yet crucial decisions that emerge from governance, and they are the backbone for how spec files are created and followed when code contributions become as simple as writing a basic agent prompt.
Open means open, even for AI
The main criticism of AI in open source is that code contributions become open to everyone, not just those with deep technical knowledge. This makes the pillars of an open source community outside of the code more important than ever. Users fundamentally become contributors when anyone can create code, which means they need agency in specification. Additionally, this new class of community members needs to be able to help influence and change governance and spec, just like “normal” contributors in the pre-AI days. The spec files submitted with their AI-generated code must be open for inspection and reproducible, just like a more traditional code contribution. The ability to fork the implementation and run it on outside infrastructure also remains, enabling contributors to further refine their own customizations, integrations, and optimizations.
This is how organizations retain real agency in an era where code is a commodity. In other words, spec files broaden what we need to keep open; they do not replace the need to keep code, build systems, and dependency trees open and inspectable. The future is not specs instead of open source; it is open source plus open specs and open governance.
You version, review, and discuss specs in the same way you review and discuss code. You make architecture and governance artifacts part of the public record so that others can learn from, reuse, and improve them. This creates a richer set of open source assets. The repository does not just contain code; it contains the constitution of the project, the architectural reasoning, and the guardrails that keep AI tools and AI-driven code contributions on the rails.
The importance of this lowered barrier cannot be overstated. Truly anyone can now contribute, not just those who understand the code-based components of a project. Domain experts, designers, and operations specialists can propose changes at the spec level that AI agents then help implement. For a community that leads with open source values, and robust testing frameworks (coded in the constitution), this fuels the creation of high‑quality software while preserving the transparency, reviewability, and forkability that open source depends on.
‘Real’ open source
We do risk the convergence of open source and specs turning into a two-sided purity test. Some argue that if AI wrote most of the code, it is not “real” open source. This argument implies that the provenance of each line of syntax matters more than whether the system as a whole is transparent, forkable, and governed in the open. Meanwhile, the opposing side suggests that if we can regenerate the implementation from a spec, we no longer need to worry as much about licensing and code openness — as if an elegant constitution makes inspection of the actual machinery optional.
Both positions miss the point. If you care about user agency, security, and long‑term sustainability, as all open source projects should, you need both open code and open build pipelines, so anyone can inspect, reproduce, and harden what is running. You need open specs and governance, so anyone can understand what the system is supposed to do, how it is supposed to behave, and how decisions get made over time.
The new “definition” of open must consider implementation, specification, and governance as three critical factors that must be woven together. Open implementation means the source, dependencies, and build system are available under an open source license so you can rebuild, audit, and run the software yourself. Open specification means the requirements, architecture, and project constitution are documented, versioned, and public, so others can reuse them, learn from them, and adapt them to their own needs. Open governance means the processes by which changes are proposed, reviewed, and accepted — whether at the spec level or in code — are transparent and participatory.
The path forward for open source communities is not to retreat from spec‑driven, AI‑assisted development, nor to declare the old mission obsolete. It is to lead in defining and practicing what open specification, governance, and implementation look like together in an AI‑first world — and to do so with the confidence to dream bigger than incremental automation.
It’s this ability for individuals and organizations to dream bigger that makes it possible to tackle problems that were previously too big, too weird, or too niche to justify a full project team. A UX designer who’s never committed a line of code can suddenly create a serious tool. A security engineer can prototype a new threat‑hunting pipeline end‑to‑end. An architect can stand up a reference implementation of a safer integration pattern, then let others clone and extend it. In each case, AI doesn’t replace expertise; it amplifies it. The most valuable people in the room spend less time grinding out boilerplate and more time on intent, constraints, and trade‑offs.
AI tooling isn’t a one-way street
However, there’s one really important thing to remember about dreaming bigger with AI tools: You are not the only one who has easy access to them — so do your competitors. Perhaps more importantly, the bad guys have them too.
When it comes to the former, dreaming bigger means dreaming bigger than the folks who would take business away from you. It’s about making sure that your customers stay your customers and that you can attract new ones on a continual basis.
When it comes to the latter, it means considering that the people who would do you and your customers harm have these same tools. And, if anyone dreams bigger, it’s the people who want to break your systems, defraud your customers, and manipulate your users.
That’s one reason the traditional “open source = open code” framing is starting to feel too small. If attackers can continuously remix powerful AI tooling in the shadows, defenders need open, inspectable patterns for detection, response, and governance that anyone can adopt and improve.
Open source’s big tent
The same openness that once turned a loose collection of hackers into the engine of the modern software stack can now be applied to a new layer of specs and agents. This is a new way forward to make open source even more accessible to a new type of contributor. For open source, it’s not about fighting AI slop. Instead, we as a global community need to push the positives of AI-driven contributions forward, so that the good far outweighs the bad.
—
New Tech Forum provides a venue for technology leaders—including vendors and other outside contributors—to explore and discuss emerging enterprise technology in unprecedented depth and breadth. The selection is subjective, based on our pick of the technologies we believe to be important and of greatest interest to InfoWorld readers. InfoWorld does not accept marketing collateral for publication and reserves the right to edit all contributed content. Send all inquiries to doug_dineley@foundryco.com.
Claude Mythos signals a new era in AI-driven security, finding 271 flaws in Firefox 23 Apr 2026, 1:33 am
The Claude Mythos Preview appears to be living up to the hype, at least from a cybersecurity standpoint. The model, which Anthropic rolled out to a small group of users, including Firefox developer Mozilla, earlier this month, has discovered 271 vulnerabilities in version 148 of the browser. All have been fixed in this week’s release of Firefox 150, Mozilla emphasized.
These findings set a new precedent in AI’s ability to unearth bugs, and could turbocharge cybersecurity efforts.
“Nothing Mythos found couldn’t have been found by a skilled human,” said David Shipley of Beauceron Security. “The AI is not finding a new class of AI-exclusive super bugs. It’s just finding a lot of stuff that was missed.”
However, the news comes as Anthropic is reportedly investigating unauthorized use of Mythos by a small group that gained access via a third-party vendor environment, revealing the double-edged nature of AI.
Closing the fuzzing gap
Firefox has previously pointed AI tools, notably Anthropic’s Claude Opus 4.6, at its browser in a quest for vulnerabilities, but Opus discovered just 22 security-sensitive bugs in Firefox 148, while Mythos uncovered more than ten times that many.
Firefox CTO Bobby Holley described the sense of “vertigo” his team felt when they saw that number. “For a hardened target, just one such bug would have been red-alert in 2025,” he wrote in a blog post, “and so many at once makes you stop to wonder whether it’s even possible to keep up.”
Firefox uses a defense-in-depth strategy, with internal red teams applying multiple layers of “overlapping defenses” and automated analysis techniques, he explained. Teams run each website in a separate process sandbox.
However, no layer is impenetrable, Holley noted, and attackers combine bugs in the rendering code with bugs in the sandboxes in an attempt to gain privileged access. While his team has now adopted a more secure programming language, Rust, the developers can’t afford to stop and rewrite decades’ worth of existing C++ code, “especially since Rust only mitigates certain (very common) classes of vulnerabilities.”
While automated analysis techniques like fuzzing, which uncovers vulnerabilities or bugs in source code, are useful, some bits of code are more difficult to fuzz than others, “leading to uneven coverage,” Holley pointed out. Human teams can find bugs that automated tools can’t by reasoning through source code, but this is time-consuming and bottlenecked by limited human resources.
Now, Claude Mythos Preview is closing this gap, detecting bugs that fuzzing doesn’t surface.
“Computers were completely incapable of doing this a few months ago, and now they excel at it,” Holley noted. Mythos Preview is “every bit as capable” as human researchers, he asserted, and there is no “category or complexity” of vulnerability that humans can find that Mythos can’t.
Defenders now able to win ‘decisively’?
Gaps between human-discoverable and AI-discoverable bugs favor attackers, who can afford to concentrate months of human effort to find just one bug they can exploit, Holley noted. Closing this gap with AI can help defenders erode that long-term advantage.
The industry has largely been fighting security “to a draw,” he acknowledged, and security has been “offensively-dominant” due to the size of the attack surface, giving adversaries an “asymmetric advantage.” In the face of this, both Mozilla and security vendors have “long quietly acknowledged” that bringing exploits to zero was “unrealistic.”
But now with Mythos (and likely subsequent models), defenders have a chance to win, “decisively,” Holley asserted. “The defects are finite, and we are entering a world where we can finally find them all.”
What security teams should do now
Finding 271 flaws in a mature codebase like Firefox illustrates the fact that AI-driven vulnerability discovery is now operating at a scale and depth that can outpace traditional human-led review, noted Ensar Seker, CISO at cyber threat intelligence company SOCRadar.
Holley’s “vertigo,” he said, was because defenders are realizing the attack surface is larger, and “more rapidly discoverable than previously assumed.”
Security teams must respond by shifting from periodic testing to continuous validation, Seker advised. That means integrating AI-assisted code analysis into continuous integration/continuous delivery (CI/CD) pipelines, prioritizing “patch velocity over perfection,” and assuming that any externally reachable code path will eventually be discovered and weaponized.
“The goal is no longer just finding vulnerabilities first, but reducing the window between discovery and remediation,” he said.
Shipley agreed that any company building software must evaluate resourcing so it can quickly and proactively find and fix vulnerabilities. “But stuff will happen,” he acknowledged. So, in addition to doing proactive work, enterprises must regularly exercise their incident response playbooks.
“The next few years are going to be a marathon, not a sprint,” said Shipley.
Dual-use nature of AI is a challenge
However, the dual-use nature of these systems presents a big challenge. The same capability that helps defenders identify hundreds of flaws can be turned against them if the model or its outputs are exposed, Seker pointed out.
The reported unauthorized access to Mythos “reinforces that AI systems themselves are now high-value targets, effectively becoming part of the attack surface,” he said.
It’s not at all surprising that people found a way to access Mythos, Shipley agreed; it was inevitable. “Nor does Anthropic have some unique, insurmountable or exclusive AI capability for hacking,” he said, pointing out that OpenAI is already catching up in that regard, and others will “catch and surpass” Mythos.
Striking a balance requires treating AI models like privileged infrastructure, Seker noted. Enterprises need strict access controls, output monitoring, and isolation of sensitive workflows. Developers, meanwhile, must adapt by writing code that is resilient to automated scrutiny; this requires stronger input validation, safer defaults, and “fewer assumptions about obscurity.”
“In this paradigm, security isn’t just about defending systems; it’s about defending the tools that are now capable of breaking them at scale,” Seker emphasized.
This article originally appeared on CSOonline.
Malicious pgserve, automagik developer tools found in npm registry 23 Apr 2026, 12:17 am
Application developers are being warned that malicious versions of pgserve, an embedded PostgreSQL server for application development, and automagik, an AI coding tool, have been dropped into the npm JavaScript registry, where they could poison developers’ computers.
Downloading and using these versions will lead to the theft of data, tokens, SSH keys, and credentials, including those for Amazon Web Services (AWS), Microsoft Azure, and Google Cloud Platform (GCP), as well as cryptocurrency from browser wallets and browser-stored passwords. The malware also spreads to other connected PCs.
The warnings came this week from researchers at two security firms.
Researchers at Socket found fake packages aimed at app developers looking for pgserve, an embedded PostgreSQL server for application development and testing, and automagik, an AI coding and agent-orchestration CLI from Namastex.ai. The researchers said the attack contains similarities to a recent campaign dubbed CanisterWorm, a worm-enabled supply chain attack that replaced the contents of legitimate packages with malware on npm.
At the time of Socket’s review, the fake automagik/genie package showed 6,744 weekly downloads, and the fake pgserve package showed about 1,300 weekly downloads.
The phony versions of automagik were versions 4.260421.33 through 4.260421.39 when Socket posted its advisory, and additional malicious versions are still being published and identified. The full scope of affected releases, maintainers, or release-path compromise is still under investigation, the researchers said.
Separately, researchers at StepSecurity also found malicious versions of pgserve on npm, noting that the compromised versions (1.1.11, 1.1.12 and 1.1.13) inject a 1,143-line credential-harvesting script that runs via postinstall every time it is installed.
The last legitimate release of pgserve is v1.1.10, according to StepSecurity.
StepSecurity said that, unlike simple infostealers, this malware is a supply-chain worm: If it finds an npm publish token on the victim machine, it re-injects itself into every package that token can publish, further propagating the compromise. Stolen data is encrypted and exfiltrated to a decentralized Internet Computer Protocol (ICP) canister, a blockchain-hosted compute endpoint chosen specifically because it cannot be taken down by law enforcement or domain seizure.
Yet another supply chain attack
This is just the latest example of a software supply chain attack, in which threat actors hope that developers will download infected utilities and tools from an open source registry and use them in packages that will spread the malware widely.
In one of the most recent examples, hackers last month compromised the npm account of the lead maintainer of the Axios HTTP client library. And last summer, attackers compromised several JavaScript testing utilities on npm.
Advice to victimized developers
Developers who have downloaded the malicious versions of pgserve and automagik need to act fast, said Tanya Janca, head of Canadian secure coding consultancy SheHacksPurple.
“Rotate every credential you can think of, right now, before you do anything else,” she said. “Then harden your CI/CD network egress controls so your build runners can only reach the domains they explicitly need. Make sure your build runners and deployment runners use separate service accounts with separate permissions. The goal is to make sure that even if a malicious package runs in your build environment, it cannot reach an attacker’s infrastructure (for data and secret exfiltration) and also block it from pivoting into your deployment pipeline.”
To prevent being compromised by any malicious npm package, Janca said IT leaders should disable automatic postinstall script execution by default.
Developers should also run this command immediately: npm config set ignore-scripts true. Some legitimate packages will occasionally break as a result of this, she admitted. But the goal is to create an intentional point of friction to force developers to consciously decide a script is or is not allowed to run on their machines.
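For teams that want to see what that switch will affect before flipping it, a quick audit of which installed packages declare lifecycle install hooks is easy to script. A hedged sketch (it walks only the top level of node_modules; real software composition analysis tooling goes much deeper):

```typescript
// Sketch: list installed packages that declare npm lifecycle install
// hooks, the mechanism the postinstall credential stealer relies on.
import { readdirSync, readFileSync, existsSync } from "node:fs";
import { join } from "node:path";

const HOOKS = ["preinstall", "install", "postinstall"];

function flagInstallScripts(dir: string): string[] {
  const flagged: string[] = [];
  for (const entry of readdirSync(dir)) {
    const pkgDir = join(dir, entry);
    if (entry.startsWith("@")) {
      // scoped packages nest one level deeper
      flagged.push(...flagInstallScripts(pkgDir));
      continue;
    }
    const manifest = join(pkgDir, "package.json");
    if (!existsSync(manifest)) continue; // skips .bin and stray files
    const scripts = JSON.parse(readFileSync(manifest, "utf8")).scripts ?? {};
    for (const hook of HOOKS) {
      if (hook in scripts) flagged.push(`${pkgDir}: ${hook} -> ${scripts[hook]}`);
    }
  }
  return flagged;
}

console.log(flagInstallScripts("node_modules").join("\n"));
```

Anything this surfaces becomes a short, reviewable allowlist rather than an unbounded set of scripts that run silently on every install.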
In addition, she said, developers need tooling that checks whether what is published to npm actually matches what is in the source repository. “Not all software composition analysis tools do this,” Janca said, “so ask your vendor specifically whether the tool catches registry-to-repo mismatches.”
Finally, she advised, apply the principle of least privilege access to publishing tokens; scope them tightly, give them only the permissions they need for one specific package, and rotate them regularly — automatically, not manually.
More than just credential theft
“People tend to think of this as a credential theft incident,” Janca said. “It is actually a potential complete organizational takeover, and it can unfold in stages. First, the attacker gets your secrets on install: AWS keys, GitHub tokens, SSH keys, database passwords, everything sitting in your environment or home directory. Second, if you have an npm publish token, the worm immediately uses it to inject itself into every package you can publish, which means your downstream users are now also victims. Third, those stolen cloud credentials get used to pivot into your infrastructure: spinning up resources, exfiltrating data, moving laterally across accounts. Fourth, your CI/CD pipelines, which trust your runners and service accounts implicitly, welcome the attacker’s malicious code into production.”
She pointed out that it often takes a long time for developers to notice attacks like this, “and by that time, the attacker has potentially had access to source code, production systems, customer data, and the software your users count on.”
Shift in tactics
Janet Worthington, a senior security and risk analyst at Forrester Research, said that recent attacks such as the CanisterSprawl campaign and the compromise of the Namastex.ai npm packages show a shift from threat actors toward self-propagating malware that steals credentials and uses them to automatically infect other packages.
“This behavior echoes earlier outbreaks like the Shai-Hulud worm, which spread across hundreds of packages by harvesting npm tokens and republishing trojanized versions belonging to the compromised maintainer,” she said in an email.
While open registry platforms like npm are introducing stronger protections around publisher accounts and tokens, these incidents highlight the fact that compromises are no longer isolated to a single malicious package, she said. Instead, they cascade quickly through a registry ecosystem and even jump to other ecosystems. “Enterprises should ensure that only vetted open source and third party components are utilized by maintaining curated registries, automating SCA [software composition analysis] in pipelines and utilizing dependency firewalls to limit exposure and blast radius,” said Worthington.
Developers sit at the intersection of source code, cloud infrastructure, CI/CD pipelines, and publishing credentials, Janca pointed out, so compromising one developer can mean compromising every user of every package they maintain, or even an entire organization. This attack, like several others in recent months, also goes after personal crypto wallets alongside corporate credentials. “That tells us,” she said, “that attackers understand exactly the type of person they are hitting and they are optimizing for maximum yield from a single attack.”
Microsoft issues out-of-band patch for critical security flaw in update to ASP.NET Core 22 Apr 2026, 6:45 pm
Developers are advised to check their applications after Microsoft revealed that last week’s ASP.NET Core update inadvertently introduced a serious security flaw into the web framework’s Data Protection Library.
Microsoft describes the issue as a “regression,” coding jargon for an update that breaks something that was previously working correctly.
In this case, what was introduced was a CVSS 9.1-rated critical vulnerability, identified as CVE-2026-40372, that affects ASP.NET Core’s Data Protection library, distributed via the NuGet package manager. It impacts Linux, macOS, and other non-Windows OSes, as well as Windows systems where the developer explicitly opted into managed algorithms via the UseCustomCryptographicAlgorithms API.
A bug in the .NET 10.0.6 package, released as part of the Patch Tuesday updates on April 14, causes the ManagedAuthenticatedEncryptor library to compute the validation tag for the Hash-based Message Authentication Code (HMAC) using an incorrect offset.
Incorrect calculation of security hashes results in ASP.NET Core application cookies and tokens being validated and trusted when they shouldn’t be.
“In these cases, the broken validation could allow an attacker to forge payloads that pass DataProtection’s authenticity checks, and to decrypt previously-protected payloads in auth cookies, anti-forgery tokens, TempData, OIDC state, etc,” said Microsoft’s GitHub advisory.
When embedded in applications, these long-lived tokens confer the sort of power attackers quickly jump on. “If an attacker used forged payloads to authenticate as a privileged user during the vulnerable window, they may have induced the application to issue legitimately-signed tokens (session refresh, API key, password reset link, etc.) to themselves,” the advisory noted.
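The severity makes more sense with the failure mode shown in miniature. The Python sketch below is a simplified illustration of the bug class (not Microsoft’s actual code, and not the precise offset involved): when the tag is computed over the wrong slice of the protected payload, bytes outside that slice can be altered freely and validation still passes.

```python
import hmac, hashlib, os

KEY = os.urandom(32)

def broken_sign(payload: bytes) -> bytes:
    # Bug class: the tag is computed from the wrong offset, so the
    # first 16 bytes of the payload are never authenticated.
    return hmac.new(KEY, payload[16:], hashlib.sha256).digest()

def broken_validate(payload: bytes, tag: bytes) -> bool:
    expected = hmac.new(KEY, payload[16:], hashlib.sha256).digest()
    return hmac.compare_digest(expected, tag)

payload = os.urandom(16) + b"session-data"
tag = broken_sign(payload)

# An attacker who never sees KEY can alter the unauthenticated bytes...
tampered = os.urandom(16) + b"session-data"
assert broken_validate(tampered, tag)  # ...and validation still passes
```

The unauthenticated region is exactly where forged or replayed material can hide, which is why the advisory treats both forgery and decryption of protected payloads as realistic outcomes.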
This vulnerability arrives only six months after ASP.NET suffered one of its worst-ever flaws, October’s CVSS 9.9-rated CVE-2025-55315 in the Kestrel web server component. But somewhat alarmingly, the current advisory goes on to compare the issue to MS10-070, an emergency patch for CVE-2010-3332, an infamous zero-day vulnerability in the way Windows ASP.NET handled cryptographic errors that caused a degree of panic in 2010.
Not a simple update
Normally, when flaws are uncovered, the drill involves merely applying an update, workaround, or mitigation. In this case, the update itself should have already happened automatically for server builds, taking runtimes to the patched version 10.0.7.
However, for developers using the popular Docker container platform, things are more complicated. For those projects, the Data Protection Library is also embedded in built applications. Addressing this requires updating and rebuilding any ASP.NET Core applications created after the April 14 update.
In addition, those using the 10.0.x netstandard2.0 or net462 target framework assets from the flawed NuGet package (used for compatibility with older operating systems, including Windows) are also affected.
Detecting affected binaries
How will developers know if a vulnerable binary has been loaded? Microsoft’s security advisory offers the following advice:
“Check application logs. The clearest symptom is users being logged out and repeated ‘The payload was invalid’ errors in your logs after upgrading to 10.0.6. Check your project file. Look for a PackageReference to Microsoft.AspNetCore.DataProtection version 10.0.6 in your .csproj file (or in a package that depends on it). You can also run dotnet list package to see resolved package versions.”
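For larger codebases, that project-file check can be scripted. Here is a minimal sketch, assuming SDK-style project files where PackageReference carries Include and Version attributes (projects that set the version in a child element, or that pull the package in transitively, still need dotnet list package):

```python
import xml.etree.ElementTree as ET
from pathlib import Path

PACKAGE = "Microsoft.AspNetCore.DataProtection"
BAD_VERSION = "10.0.6"

def scan(root: str = ".") -> None:
    """Flag .csproj files that directly reference the vulnerable package."""
    for proj in Path(root).rglob("*.csproj"):
        try:
            tree = ET.parse(proj)
        except ET.ParseError:
            continue  # skip malformed project files
        for ref in tree.iter("PackageReference"):
            if (ref.get("Include") == PACKAGE
                    and ref.get("Version") == BAD_VERSION):
                print(f"{proj}: direct reference to {PACKAGE} {BAD_VERSION}")

if __name__ == "__main__":
    scan()
```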
In summary, developers should rebuild affected applications to apply the fixed version, expire all affected authentication cookies and tokens to remove forgeries, and rotate keys so that new ASP.NET Core Data Protection tokens are issued.
While there is no evidence that the issue has been exploited by attackers, good security hygiene mandates also checking for unexpected or unusual logins, errors, or authentication failures, Microsoft advised.
This article originally appeared on CSOonline.
SpaceX secures option to acquire AI coding startup Cursor for $60B 22 Apr 2026, 12:03 pm
SpaceX has obtained the right to acquire AI coding startup Cursor for $60 billion later this year, the two companies announced Tuesday.
The aerospace company disclosed the arrangement in a post on X. “SpaceXAI and cursor_ai are now working closely together to create the world’s best coding and knowledge work AI.”
SpaceX added that the deal would pair Cursor’s product with its Colossus AI training infrastructure.
“The combination of Cursor’s leading product and distribution to expert software engineers with SpaceX’s million H100 equivalent Colossus training supercomputer will allow us to build the world’s most useful models,” the post said. “Cursor has also given SpaceX the right to acquire Cursor later this year for $60 billion or pay $10 billion for our work together.”
Deepika Giri, AVP and head of AI research at IDC Asia/Pacific, said the contractual exposure is the immediate concern for enterprise buyers.
“Cursor’s existing zero-data-retention agreements with model providers like OpenAI and Anthropic could be challenged under the new SpaceX ownership, which might quite likely renegotiate or terminate subprocessor relationships,” Giri said. “It is likely that Cursor will cease to maintain model neutrality, which will work in favor of xAI.”
Cursor frames it as a compute deal
Anysphere, the San Francisco–based startup that develops Cursor, confirmed the tie-up in a short company blog post the same day.
“We’ve wanted to push our training efforts much further, but we’ve been bottlenecked by compute,” the company said. “With this partnership, our team will leverage xAI’s Colossus infrastructure to dramatically scale up the intelligence of our models.”
The blog post said Cursor released Composer less than six months ago as its first agentic coding model, that Composer 1.5 scaled reinforcement learning by more than 20 times, and that Composer 2 reached frontier-level performance at a fraction of the cost of other models.
Cursor co-founder and chief executive Michael Truell addressed the deal in a post on X. He said he was “excited to partner with the SpaceX team to scale up Composer,” calling the arrangement “a meaningful step on our path to build the best place to code with AI.”
Nitish Tyagi, principal analyst at Gartner, said Cursor’s in-house model carries a constraint that the announcement did not address. “Composer is fine-tuned on the Chinese base model Kimi 2.5, making it unsuitable for organizations with restrictive governance policies,” Tyagi said.
Cursor’s enterprise footprint
Cursor says more than half of the Fortune 500 use its product, with customers including Nvidia, Salesforce, Uber, Stripe, and PwC.
SpaceX brings an AI stack of its own to the partnership. Its February acquisition of xAI pulled Musk’s AI lab, the Grok chatbot, and the X social platform under the rocket company.
Cursor’s parent, Anysphere, has been scaling the product through a run of acquisitions and funding rounds. The company agreed to acquire code review startup Graphite in December, adding pull request and debugging capabilities to its enterprise stack.
Cursor is built on a fork of Visual Studio Code and competes with GitHub Copilot, Anthropic’s Claude Code, and OpenAI’s Codex. Anthropic and OpenAI supply frontier models that Cursor resells through its IDE, and both vendors have launched competing coding tools of their own, according to product documentation on their websites.
What it means for enterprise buyers
Cursor’s enterprise contracts include data-handling provisions tied to its current model providers, including a commitment to no training on customer data by Cursor or the LLM providers it routes to, according to the company’s enterprise page.
Giri said enterprise buyers should move on contract language before the option window closes. “CIOs should consider demanding change-of-control clauses with 90 to 180-day notice on any subprocessor or model routing changes,” she said. “For buyers looking for neutrality of stack, this acquisition completely takes away the neutrality that Cursor offers.”
Tyagi said the partnership’s roadmap is the next variable to watch. “While the partnership appears directionally logical, key uncertainties remain — specifically whether the roadmap will prioritize Grok, Composer, both, or an entirely new model,” he said.
He also pointed to an earlier precedent. “Model access restrictions often move faster than innovation,” Tyagi said, citing Anthropic’s decision to restrict Windsurf’s access amid OpenAI acquisition rumors. The current announcement “could backfire for Cursor if major providers like OpenAI or Anthropic limit model access,” he said.
SpaceX and Cursor did not immediately respond to requests for comment.
How AI is upending SaaS tools 22 Apr 2026, 9:00 am
It’s quite clear that agentic coding has completely taken over the software development world. Writing code will never be the same. Shoot, it won’t be long before we aren’t writing any code at all because agents can write it better and faster than we humans can. That may already be true today.
But there is more to software development than merely writing code, and those areas—source control, documentation, CI/CD, project management—are ripe for some serious disruption from AI as well. Those areas may well be hit harder than coding itself.
I would imagine that if you were in the business of analyzing data and providing dashboard-level insights into that data, then you would be very worried indeed about what AI is going to do to your value proposition. Much of the SaaS industry is in the business of analyzing existing data, and that is exactly what AI agents can do well. When a simple question can get straight to the heart of what a pricey dashboard provides, then companies have to question the value of paying for that kind of service.
Tools like LinearB, Jellyfish, and Swarmia provide deep and interesting insights into what is going on inside your repository, but if you can say to Claude Code, “What are the DORA metrics for this repository?”, well, then those businesses are definitely ripe for disruption, no?
Pivoting to AI
Those tools are already reacting by pivoting hard and leaning into the AI revolution. They are doing things like focusing on measuring AI processes instead of providing team insights. These tools are now pitching that they monitor not your development team but your AI development process, which is the kind of thing they have to do when the ground under their feet is shifting. The disruption is real, and they have to change or die.
Dashboards over existing data need to make a rapid change. But tools that produce underlying data need to change as well. Instead of producing dashboards for human consumption, these tools are turning hard towards providing Model Context Protocol (MCP) implementations that AI agents can consume.
One meta-coding area where I have found AI provides real value is in log examination. When a problem occurs, the first question that usually gets asked is, “Where is the log of that happening?” Back in the before times, you’d have to pore over the log, line by line, searching for clues to the source of the problem. But now? Give the log, however large, to an AI agent, and those answers appear in a matter of minutes.
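The mechanics are almost trivially simple now. As a minimal sketch (assuming an OpenAI-compatible API and crudely truncating rather than chunking oversized logs; the file path and model name are placeholders), first-pass log triage looks something like this:

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def triage_log(path: str, max_chars: int = 50_000) -> str:
    """Hand the tail of a log file to a model for a first-pass diagnosis.
    A production pipeline would chunk or filter instead of truncating."""
    with open(path, encoding="utf-8", errors="replace") as f:
        log_tail = f.read()[-max_chars:]  # recent entries matter most
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system",
             "content": "You are an SRE. Identify the errors in this log, "
                        "their likely root cause, and a first remediation step."},
            {"role": "user", "content": log_tail},
        ],
    )
    return response.choices[0].message.content

print(triage_log("/var/log/app/service.log"))  # hypothetical path
```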
Producing the log becomes the real value—displaying dashboards over that data becomes less important. A tool like Datadog owns the ingestion pipeline and the time-series production, and it creates valuable data, so its pivot is easier. Datadog need only create a tool that talks to an AI agent instead of a human. Its beachhead is solid. The real value of logs lies in an agent’s ability to peer into them in real time and take action based on what it sees. It won’t be long until, whenever a problem occurs, an MCP server will notify an AI agent and the agent will analyze the problem, fix it, and deploy the fix, all without human intervention.
Producing and owning the data beats being able to interpret the data. Tools that produce the data can lean into the AI revolution. Tools that merely read and display data from a different source—say, an existing repository—will have a much harder time surviving alongside AI agents.
The soul of a new user
Any provider of a software tool that is part of a development or operations workflow should be working very hard to provide an MCP or a CLI for an AI agent to use, because that is the future. A CI/CD system needs to be able to respond to events without a human being involved at all. Such tools become the data source and will have an entirely different front end. Instead of humans looking at dashboards, it will be AI agents making MCP queries into the tool.
This is where the disruption is really happening. One might even say your customer is no longer a software development manager but an AI agent’s MCP server. How long will it be before we have AI tools making purchasing decisions after running thousands of simulations against a set of potential new tools? Previously, software tool companies put a lot of energy into slick-looking UIs, web pages with solid copy, and all kinds of bells and whistles meant for human consumption.
But does any of that matter if you are actually selling to an AI agent? Does your MCP server actually return data that another MCP server can consume and use?
Everything that SaaS companies have learned to do to be successful is now being turned on its head. AI agents don’t care one whit about cool-looking websites and clever marketing copy. Selling to a machine that doesn’t care about your pitch, your carefully crafted brand, or your clever logo is a game that no one has ever played before.
Google’s Gemma 4 shines on local systems – both big and small 22 Apr 2026, 9:00 am
Google’s Gemma 4 comes touted as the latest evolution of Google’s multi-modal model offerings. Gemma 4 offers not only reasoning and tool use but also vision and audio functionality, and it’s available in a range of model sizes that target servers and local devices.
What’s striking about Gemma 4 is that even at the higher end of its size range, it’s still decently performant on personal hardware. Google claims this is due to innovations in the architecture of the model, but the proof is in the trying. Gemma 4 is quite responsive.
To that end, I took Gemma 4 for a spin on my own hardware to see how it fared for its advertised tasks.
Gemma 4 model sizes
Gemma 4 comes in four basic sizes or “densities”:
- E2B: 2.3 billion effective parameters, 5.1 billion total, 128K max context window.
- E4B: 4.5 billion effective parameters, 8 billion total, 128K max context window.
- 31B: 31 billion parameters (the “dense” version), 256K max context window. (You will probably not use this one on your own machine — it’s 62GB!)
- 26B A4B: A “mixture of experts” model with 4 billion “activated” parameters and 26 billion total parameters, 256K max context window.
Each of these model sizes is available in a slew of community-created editions, thanks to Gemma 4’s Apache 2 licensing. For instance, the 26B A4B model comes in a community edition with more compact quantizations (4-bit, 6-bit, etc.), which I used as one of the model mixes for this article.
The models I used:
- google/gemma-4-26b-a4b: 18GB model size, 4-bit quantization.
- lmstudio-community/gemma-4-E4B-it-GGUF: 6.3GB model size, 4-bit quantization.
- unsloth/gemma-4-E4B-it-GGUF: A popular alternate mix of the same model, available in multiple quantizations. I used the 4.84GB size with 4-bit quantization.
Test system and prompts
I ran each model using my now-standard test bed: LM Studio 0.4.10 on an AMD Ryzen 5 3600 6-core CPU (32GB RAM) and an Nvidia GeForce RTX 5060 (8GB VRAM).
For each model I ran a set of prompts:
- A vision-functionality prompt: “Create a caption for the attached image of no more than three sentences.” Another version of the prompt added: “… with no editorialization.” (The attached image is shown in one of the screenshots below.)
- Prompts intended to provoke web-search tool use and produce either detailed or simplified responses: “What is the copyright status of Franz Kafka’s works? Explain in detail” and “What did William Gibson think of Blade Runner?”
- A prompt for code generation and problem solving: “Python’s pip tool has a function ScriptMaker (accessed with from pip._vendor.distlib.scripts import ScriptMaker). On Microsoft Windows this is used to create an .exe stub launcher for a Python package’s entry points when it’s installed with pip. However, the icon created for this stub is the same generic icon used for the Python runtime itself. Let’s write a Python utility to allow the user to append their own custom icon to the .exe stub, but also preserve the stub’s appended archive and other metadata. The utility should use only the Python standard library, and should be kept as simple as possible.”
- Another code-related prompt: “I have attached a Python program that takes Python applications and packages them to run with a standalone instance of the Python runtime. One drawback of the program is that it’s not very modular. Analyze the program and make some suggestions about how to increase its modularity so it can be used as a library with hooks for various advanced behaviors.”
Gemma 4 in action
The 26B model was at the upper end of what I could run comfortably on my test hardware. I wasn’t able to fit the entire model into GPU memory, but I set the first 12 layers to run on the GPU (7.51GB VRAM), and I set the context length to 16,384 tokens (total: 18.76GB RAM).
Getting good performance out of models that don’t fit in VRAM is always a challenge. However, Gemma 4 has, courtesy of its “mixture of experts” design, a feature to boost performance. LM Studio exposes this feature through a setting currently tagged as experimental. You can choose how many layers of the model to “force MoE [Mixture of Experts] weights onto the CPU,” which conserves VRAM and can speed up inference.

The MoE (mixture of experts) experimental setting in LM Studio. For models that use an MoE design, this setting forces the weights for that aspect of the model to be run on the CPU instead of the GPU. With Gemma 4, this resulted in a major speed boost for models too big to fit in memory.
Foundry
Without the MoE forcing, the overall inference time and token generation speed cratered; the model could barely manage an average of 1.5 tokens per second even for simple queries. With MoE forcing turned on (with the maximum number of layers supported, 30), token generation speed jumped to anywhere from 5 to 13 tokens per second, depending on the rest of the system’s load. That’s still a far cry from the speed of the smaller models, but a lot more workable.
For faster time-to-first-token results, you can disable thinking, at the possible cost of less robust output. For the code-generation query, Gemma 4 spent 6 minutes 26 seconds thinking, and over 8 minutes generating the response (5,013 tokens, 9.55 tokens per second). The resulting code and explanation were not significantly more advanced or detailed than the non-thinking version.

Response from Gemma 4’s 26B parameter model to a query to generate code. This larger version of the model runs less quickly when it can’t fit entirely in memory, but its mixture-of-experts design helped offset that limitation.
Foundry
When I switched to the LM Studio Community edition of the E4B model, I put all 42 layers on the GPU and kept the context at 16,384, all of which fit comfortably in VRAM with room to spare. The results were a major jump in speed: 72 tokens per second. The smaller model was less specific for certain queries — the code-generation query in particular didn’t generate a comprehensive code example, only a conceptual framework for one — but still did a decent job of analyzing the problem and suggesting constructive approaches. The “unsloth” edition of the E4B model, despite being slightly smaller, was about as performant and useful.

Examples of Gemma 4’s 26B parameter version generating image captions. The smaller versions of the model tended not to editorialize. The larger version sometimes needed specific guidance to be less verbose or florid.
Foundry
For the “make this program more modular” prompt, I got roughly equivalent results across all incarnations of the model in terms of the advice given. The only major difference was that the smaller models ran far faster — 73.85 and 71.73 tokens per second vs. 9.3 for the big model.
Gemma 4 takeaways
The biggest takeaway from running Gemma 4 locally is how the mixture-of-experts design in one of the larger incarnations of the model makes it useful even on systems where the model doesn’t fit entirely into VRAM. The smaller incarnations of the model, even at lower quantizations, still work well, too. They also deliver results many times faster, and free up much more memory for larger context windows. Thus, the smaller models are well worth experimenting with as the first model of choice before moving up to their bigger brothers.
Snowflake offers help to users and builders of AI agents 21 Apr 2026, 1:15 pm
Snowflake is enhancing Snowflake Intelligence and Cortex Code to create a unified experience connecting enterprise systems, data sources, and AI models with Snowflake data. It’s part of the company’s vision to become the control plane for the agentic enterprise, enabling enterprises to align data, tools, and workflows with AI agents built on its platform.
With these updates, the company said, Snowflake Intelligence becomes an adaptable personal work agent for business users, and Cortex Code expands as a builder layer for enterprise AI that provides governed, data-native development.
Enhancements to Snowflake Intelligence include automation of routine tasks by describing them in natural language, new Model Context Protocol (MCP) connectors, and reusable artifacts that let users save and share analyses, visualizations, and workflows, all of which will be generally available “soon.” In addition, a new iOS mobile app and multi-step reasoning with deep research, which uses an agentic architecture to reason across data, will soon be in public preview.
The company said that all of these updates came out of customer feedback, as well as from insights gleaned from Project SnowWork, last month’s preview of an autonomous AI layer for its data cloud.
Cortex Code now supports additional external data sources, including AWS Glue, Databricks, and Postgres; connectivity with other AI agents via MCP and Agent Communication Protocol (ACP); a Claude Code plugin; and a new agent software development kit with support for Python and TypeScript. There are also enhancements to Cortex Code in Snowsight, Snowflake’s web interface, including Plan Mode, which lets developers preview and approve workflows, and Snap & Ask, which enables interaction with data artifacts such as charts and tables.
Snowflake also announced the private preview of Cortex Code Sandboxes in Snowsight, a dedicated cloud environment where developers can execute code end-to-end with no setup.
Michael Leone, VP & principal analyst at Moor Insights & Strategy, thinks the roadmap is “ambitious,” noting the number of items announced that are “coming soon” or are in public preview. “These announcements are starting to blur together, with almost every vendor claiming their agents can reason, act, and transform the business,” he said, adding, “What makes this one worth slowing down on, at least for me, is that Snowflake is going after both halves of the enterprise at the same time. Intelligence is built for the business users who want answers and actions without writing SQL, and Cortex Code is built for the builders who actually have to put this into production.”
Most vendors pick one target, users or builders, and come back to the other later, he said, but Snowflake is putting both on the same governed data foundation. “[This] is a harder engineering problem, but I’d argue it’s a cleaner answer to the question enterprises are actually asking, which is how to open AI up to more people without losing control of the data underneath,” he said, noting that Snowflake has changed its approach from “let’s do it inside Snowflake,” to realizing that agentic AI only works if it’s interoperable with the rest of the stack.
Igor Ikonnikov, advisory fellow at Info-Tech Research Group, also sees the control plane play as part of an industry trend. “As always, the devil is in the details: what those platforms are composed of and how they offer to control AI agents,” he said. “Most platforms are built the old-fashioned way: All the controls are coded. Snowflake speaks about reusable analytics through saving the whole solution and reusing complete modules or models. It means that common semantics are still buried inside database models and code.”
All AI vendors are motivated by the same demand from the market, he said: “Move from Copilot-based generic chatbots to business-purpose-specific AI agents that understand business logic and can interact with one another.” With these updates, he sees Snowflake as having caught up with the competition, but not yet surpassing it.
Sanjeev Mohan, principal at SanjMo, said, “The good news for customers is the support for Databricks and AWS Glue. What Snowflake is saying is that even if your data lives in a competitor’s system, Snowflake AI coding agent can be used. And vice versa, the VS Code extension and Claude Code plugin can be used on Snowflake data. In other words, it reduces vendor lock-in fears.”
It’s also the right strategic direction, said Sanchit Vir Gogia, chief analyst at Greyhound Research. “Enterprise AI is moving from generation to orchestration to execution, and Snowflake’s focus on governed data as the foundation for action aligns with that shift,” he said.
“However, becoming the execution layer for enterprise AI requires more than integrating agents and expanding tooling,” he said. It also requires consistent semantics, reliable cross-system execution, strong governance, economic viability, and organisational readiness, as well as overcoming a structural constraint. “Control without ownership of the systems where work is executed introduces dependency that is difficult to fully resolve. This is the central tension in Snowflake’s strategy and will define how far it can realistically extend its influence,” he said. “Snowflake has taken a meaningful step in that direction. It has not yet proven that it can deliver this at scale. At this stage, it is one of the most credible contenders in a race that will be defined not by who builds the smartest AI, but by who can make that AI work reliably inside the enterprise.”
Amazon’s $5B Anthropic bet is really about compute, not just cash 21 Apr 2026, 11:37 am
Amazon on Monday said it was investing an additional $5 billion in Anthropic, a move that analysts say is aimed as much at easing the AI startup’s growing infrastructure bottlenecks as at deepening their strategic partnership.
As part of the deal, Anthropic will lock in up to 5 gigawatts of compute capacity across AWS’s Trainium chips, including the new Trainium 3 and upcoming Trainium 4, the companies said in a joint statement.
“Right now, users see limits like throttling and session caps because Anthropic is running out of capacity and must ration usage to avoid crashes. This deal helps fix that,” said Pareekh Jain, principal analyst at Pareekh Consulting.
“Over time, the expanded capacity will let Anthropic support more users at once, build bigger models, and reduce these limits, especially for paid and enterprise users,” Jain added.
The analyst was referring to Anthropic’s move to throttle usage across its Claude subscriptions, especially during peak demand hours, which also coincided with other concerns, such as complaints of degradation in Claude’s reasoning performance across complex tasks.
Scaling compute capacity
A significant portion of Trainium 3 capacity is expected to come online this year, the companies added. Anthropic already uses Trainium 2 via AWS’s Project Rainier, a cluster of nearly half a million chips, to train and run its models.
The agreement between Amazon and Anthropic also includes an expansion of inference capacity in Asia and Europe, which Jain said should improve Claude’s speed and reliability globally. Anthropic will also have the option to buy future generations of Trainium as they become available.
However, Anthropic isn’t alone when it comes to model providers trying to add compute capacity to train and run their models.
In February, rival OpenAI signed a deal with Amazon, Nvidia, and SoftBank to raise around $110 billion to add infrastructure and increase compute capacity.
As part of the arrangement, OpenAI has committed to consuming at least 2GW of AWS Trainium-based compute tied to Amazon’s $50 billion investment, along with 3GW of dedicated inference capacity from Nvidia under its separate $30 billion commitment.
From funding to supply chain financing
In fact, deals such as these, analysts say, reflect a broader shift in how AI infrastructure is now financed.
“Rather than simple cash-for-equity, these deals bundle equity investment with massive cloud-spend or GPU-spend commitments, locking in customers, securing capex returns, and validating infrastructure buildouts in a single transaction. This isn’t venture capital anymore; it’s supply chain financing,” Jain said.
The pattern, Jain noted, is consistent across the ecosystem, and he cited Microsoft, Oracle, and Nvidia as examples.
“Microsoft invested tens of billions into OpenAI while simultaneously committing Azure capacity for training and inference, with OpenAI’s Azure spend now running at a multi-billion dollar annual rate,” Jain said.
“Oracle, too, signed a $30 billion cloud deal with OpenAI, then followed it with a staggering $300 billion five-year compute commitment starting in 2027. Nvidia took it further still with its $100 billion investment in OpenAI, which was paid in GPUs, not dollars — a model it replicated with xAI,” Jain added.
That framing, however, according to Greyhound Research chief analyst Sanchit Vir Gogia, may miss a deeper shift.
Such deals, Gogia said, are more about securing scarce compute supply ahead of competitors. “What capital does is improve your position. It allows you to commit earlier and at greater scale,” the analyst pointed out, adding that the real advantage lies in locking in infrastructure before others can.
On the flip side, though, long-term capacity commitments tend to anchor companies to specific providers, Gogia cautioned.
While model providers may operate across platforms and hyperscalers, their largest infrastructure commitments ultimately shape where they optimize workloads, build features, and direct spending, the analyst pointed out.
For Anthropic, the Amazon deal comes with equally significant long-term obligations. The company has committed to spending more than $100 billion on AWS over the next decade.
For Amazon, the $5 billion investment builds on its earlier $8 billion bet on Anthropic and comes with the potential to commit up to an additional $20 billion tied to certain commercial milestones, which were not revealed. Anthropic is also looking beyond AWS. The company recently said it plans to add capacity using Google’s TPUs. These chips are expected to come online by next year.
This article originally appeared on Network World.
From the engine room to the bridge: What the modern leadership shift means for architects like me 21 Apr 2026, 10:00 am
We all agree that the role of the technology leader is being rewritten in real time, and if you’re building the systems they depend on, you need to understand what they’re asking for now.
Let me be honest about something. For most of my career, the conversations I had with CIOs followed a pretty predictable script. They’d describe a pain point, I’d map it to a solution and we’d talk timelines and integration. Clean. Transactional. Technical. Very straightforward, right?
That script has been shredded.
Over the past couple of years, working across public sector agencies, global enterprises and mid-market companies in Latin America and now the US, I’ve watched the CIO role transform in a way that genuinely changes how I do my job as a solutions architect. The technology leader who used to care primarily about uptime and cost efficiency now walks into conversations asking about competitive differentiation, cultural change and workforce transformation. And they’re right to ask.
The shift isn’t cosmetic. It’s structural.
The problem hiding in plain sight
Here’s what I kept seeing in failed modernization projects, and I saw a lot of them: The technology worked fine. The architecture was sound. The implementation was clean. And the project still stalled or quietly died six months in. The root cause was seldom technical. It was a decision-making problem upstream of delivery. Strategy that hadn’t been translated into clear operating priorities. Conflicting stakeholder mandates that nobody had formally resolved. Organizational structures that pull in different directions from the infrastructure teams trying to serve them.
What I’ve come to think of as “decision integrity,” the discipline of making sure strategy connects to execution, was missing. And the CIO, historically, wasn’t positioned to own that gap. They were downstream of it.
That’s changed. The CIOs I work with now are increasingly the ones driving that upstream clarity. They’re defining outcome frameworks, arbitrating tradeoffs and forcing the organizational alignment that makes technical delivery land. The architecture conversation I have with them today is as much about governance and organizational design as it is about platforms.
What this means if you’re building for them
From where I sit, designing solutions around open-source platforms, hybrid cloud and AI infrastructure, the practical consequence is this: The technology decisions my customers make are no longer primarily about technology.
A CIO investing in AI-ready infrastructure isn’t just buying a faster platform. They’re making a strategic bet that the organization can operate differently at scale. Which means the infrastructure must support not just the technical requirements (consistent data access, automated policy enforcement, and visibility across hybrid environments) but also the organizational ones.
Can non-technical stakeholders trust the system? Can the governance model hold up as scope expands? Can the platform absorb the messiness of real enterprise change without the whole thing collapsing?
Technical debt is where this gets painfully concrete. I’ve seen environments where 30–40% of engineering capacity is absorbed by legacy maintenance. Not because anyone made a bad choice, but because previous decisions compounded over time without a deliberate modernization strategy. When a CIO tells me they want to move fast on AI adoption, the first conversation we must have is about what’s sitting underneath that ambition. You can’t build a reliable AI pipeline on top of infrastructure you don’t trust.
The CIOs who are winning right now are the ones who dealt with that debt proactively, not by declaring a big-bang rewrite, but by systematically creating the conditions where innovation can happen without adding to the entropy. That’s what I try to help architect.
The cultural piece is the hardest part, and it’s real
I’ll be straight here: When someone says, “cultural transformation,” my instinct is usually to translate it into something more concrete I can design for. Agile delivery models. Feedback loops. Automation that removes friction from the right places. That’s still my instinct.
However, I’ve had to sit with the fact that the cultural piece isn’t just a soft addendum to the technical work. It’s load-bearing.
Here’s the version I’ve watched play out more than once: You build a genuinely excellent automation platform. The tooling is solid. The pipelines work. And then adoption stalls because the teams who are supposed to use it don’t trust it, weren’t involved in defining it or are quietly protecting workflows that the new system would disrupt. The problem isn’t the platform. The problem is that nobody built the social infrastructure around it.
Gartner’s projection that 25% of IT work will be handled autonomously by AI by 2030 isn’t a threat or a promise; it’s a design constraint. If you’re architecting systems today, you have to ask: What does the human role look like in this workflow once AI is doing the routine work? What skills are you developing in the team? Where does judgment still belong to a person?
Those aren’t questions with clean technical answers. But they’re questions an architect must have an opinion on.
Both hands on the wheel
There’s a framing I keep coming back to, which describes the modern technology leader as both the navigator on the bridge and the engineer in the engine room at the same time. That’s exactly the tension I recognize from the field. The CIOs I most want to work with are the ones who haven’t abandoned either role. They’re genuinely curious about how the infrastructure works, not just what it delivers. And they’re genuinely accountable for business outcomes, not just technical ones. That dual orientation is rare and valuable, and when I find it, those tend to be the engagements where we build something worth building. And this is one thing that fascinates me about open source: The people who engage with it tend to be true tech experts.
For those of us on the architecture side, the implication is clear. We can’t show up to these conversations as purely technical resources anymore, either. The best solution I can design is useless if it doesn’t connect to the organizational reality my customer is operating in. Understanding the strategic pressure they’re under, the cultural conditions they’re working with, the decision-making constraints they’re navigating, that context shapes everything about how I recommend we build.
The engine room and the bridge have always been part of the same ship. It just took a while for the org charts to catch up.
This article is published as part of the Foundry Expert Contributor Network.
Enterprises are rethinking Kubernetes 21 Apr 2026, 9:00 am
For years, Kubernetes held an almost mythic place in enterprise IT. It was positioned as the control plane for the future, the standard abstraction for cloud-native systems, and the platform that would finally free enterprises from infrastructure lock-in. To be fair, some of that was true. Kubernetes brought discipline to container orchestration, enabled portable deployment models, and provided architects with a powerful framework for managing distributed applications at scale.
However, the market is changing, and so are enterprise expectations. The question is no longer whether Kubernetes is technically impressive. It clearly is. The question is whether it still represents the best fit for a growing number of mainstream enterprise use cases. In many cases, the answer is increasingly no. What we are seeing is not the death of Kubernetes but the end of its unquestioned dominance as the default strategic choice. Here’s why.
Too operationally expensive
As Kubernetes adoption grew, many organizations hesitated to admit that it introduced operational complexity and needed specialized skills, constant tuning, and strong governance. Running Kubernetes well requires mature engineering, observability, security, networking, and life-cycle management—much more than a side project. Many underestimated this burden.
What looked elegant in architectural diagrams became a real-world tax on operations teams. Clusters multiplied. Toolchains sprawled. Upgrades became risky. Policy enforcement became an engineering discipline in its own right. Enterprises realized they were not just adopting an orchestration platform. They were building and maintaining an internal product that required sustained investment and scarce expertise.
That might be acceptable for digital-native businesses whose scale and complexity justify the effort. It is a much harder sell for enterprises that want reliable deployments, resilient applications, and reasonable cloud costs. In those cases, Kubernetes can feel like overengineering disguised as strategic modernization. When a company spends more time managing the platform than delivering business value on top of it, the novelty wears off quickly.
Portability becomes less important
Kubernetes was marketed as a hedge against lock-in, enabling applications to run across on-premises, cloud, and edge. However, most enterprises faced ecosystem dependencies—storage, networking, security, identity, observability, CI/CD, managed services, and cloud-native databases—creating practical lock-in that Kubernetes didn’t eliminate.
What enterprises gained in workload portability, they often lost in ecosystem complexity. They standardized on Kubernetes while still depending heavily on a particular cloud provider’s managed services and operational conventions. The result was a strange middle ground: all the complexity of a highly abstracted platform without the full simplicity of using opinionated native services end-to-end.
This matters more now because boards and executive teams are less interested in theoretical architectural optionality and more focused on measurable business outcomes. They want speed, resilience, cost control, and lower risk. If a managed application platform, serverless environment, or provider-specific platform-as-a-service offering gets them there faster, many are willing to accept some level of dependency. Enterprises are becoming more candid about the trade-offs. They are realizing that strategic flexibility is valuable, but not at any cost.
This is where Kubernetes starts losing favor. Portability has value, but for many enterprises, it hasn’t justified the operational and organizational burden it entails. The promise exceeded the actual return.
Better abstractions are catching up
Perhaps the most important shift is that enterprises are moving away from buying raw technical primitives and toward consuming higher-level platforms that better align with developer productivity and business outcomes. Platform engineering teams increasingly hide Kubernetes behind internal developer platforms. Public cloud providers continue to improve managed container services, serverless offerings, and integrated application environments that reduce hands-on infrastructure management. Developers, meanwhile, do not want to become part-time cluster operators. They want fast paths to build, deploy, secure, and monitor applications without stitching together a dozen components.
In other words, Kubernetes may still be present under the hood, but it is becoming less visible and less central to strategic buying decisions. That is usually a sign of maturity. Technologies shift from being the headline to being plumbing. Enterprises are not asking, “How do we adopt Kubernetes?” as often as they are asking, “What is the fastest, safest, most cost-effective way to deliver modern applications?” That is a much healthier question.
The answer increasingly points to curated platforms, opinionated developer environments, and managed services that abstract away Kubernetes rather than exposing it. This is not a rejection of cloud-native principles. It is a rejection of unnecessary cognitive load. Enterprises are deciding they do not need to own every layer of complexity to realize the benefits of modern architecture.
Surrendering the spotlight
None of this means Kubernetes is disappearing. It remains important for large-scale, heterogeneous, and highly customized environments. It is still an excellent fit for organizations with strong platform maturity, regulatory constraints, or sophisticated multicloud operational needs. But that is a narrower slice of the market than the hype cycle once suggested.
What is losing popularity is not Kubernetes as a technology, but Kubernetes as the unquestioned standard for enterprises. This difference is important. Companies are becoming more selective about where to accept complexity and where to avoid it. They are less inclined to idealize infrastructure and more eager to choose simplicity when it exists.
That is probably a good thing. The job of enterprise architecture is not to admire elegant technology for its own sake. It is to align technology choices with operational realities, economic constraints, and business outcomes. By that standard, Kubernetes still has a place, but it no longer gets a free pass.
The cookbook for safe, powerful agents 21 Apr 2026, 9:00 am
As companies move from experimenting with AI agents to deploying them in production, one pattern becomes clear: capability without control is a liability.
Agents operate in long-running, stateful environments. They browse the web, read repositories, execute shell commands, call APIs and interact with internal systems. That power is transformative — and it meaningfully expands the attack surface.
In a recent interview, Jonathan Wall, CEO of Runloop, summarized the shift: “By default, agents should have access to very little. They need to do real work, but capabilities have to be layered on in a controlled way.” That framing reflects a broader industry reality: agent infrastructure must be designed around least privilege, explicit isolation and observable execution.
What follows is a practical control architecture for production agents.
The layered control model
A resilient agent deployment combines six explicit layers:
- Strong runtime isolation with a microVM
- Restrictive network policy with explicit egress allowlists
- Centralized credential management through a gateway
- Disciplined identity management with short-lived, scoped credentials
- Deliberate friction around sensitive actions and high-risk tools
- Continuous monitoring, logging and adversarial testing
Each layer addresses a different failure mode. Together, they contain blast radius when — not if — something breaks.
Start with least privilege
A production-grade agent environment begins in a constrained state: an isolated runtime boundary, no inbound access, no outbound network access, and no implicit tool permissions.
The runtime boundary itself is part of least privilege. Containers provide efficient isolation for trusted or single-tenant workloads, but they share a host kernel. Real-world escape vulnerabilities have repeatedly shown that this boundary can fail under adversarial pressure. CVE-2019-5736 allowed attackers to overwrite the host runc binary from within a container; CVE-2022-0492 enabled breakout via cgroups misconfiguration; CVE-2024-21626 again exposed runc-based escape paths. These incidents do not render containers unusable — but they clarify the tradeoff. MicroVMs introduce a stronger hardware-level boundary, reducing blast radius when agents execute arbitrary or unvetted code.
Isolation is not a performance decision alone. It is a risk decision.
The modern agent threat model
Traditional SaaS systems process deterministic requests. Agent systems ingest untrusted content and generate probabilistic actions.
Prompt injection has demonstrated how fragile instruction boundaries can be.
In 2023, public experiments against Bing Chat showed that hidden instructions embedded in web pages could override system prompts. Academic research from Stanford and others has shown that tool-using agents can be coerced to leak credentials or proprietary data when external content is treated as trusted context.
The danger compounds when agents operate with broad credentials. Service accounts, long-lived API keys and shared internal tokens convert a successful injection from “unexpected output” into repository compromise, database access or SaaS abuse. System prompts that embed internal URLs or configuration data become reusable artifacts once exposed.
Retrieval-augmented systems and MCP-style integrations widen the surface further. When external documents are ingested without segmentation or role separation, attacker-controlled content can redirect behavior or induce data disclosure.
This is the environment the layered model must withstand.
Network policy as containment
Network controls are often treated as compliance checkboxes. In agent systems, they are containment mechanisms.
Agents typically require outbound access for documentation lookup, dependency installation or API interaction. Yet unrestricted egress provides the cleanest path for data exfiltration after injection. Restrictive allowlists — permitting only explicitly approved domains or endpoints — dramatically reduce blast radius.
If a model is tricked into reading a .env file, a strict egress policy can prevent the obvious next step: shipping those secrets to an attacker-controlled domain. Logging outbound traffic establishes behavioral baselines and highlights anomalies early.
Containment turns catastrophic compromise into a recoverable incident.
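The allowlist logic itself is the easy part; what matters is enforcing it at a proxy or sidecar the agent runtime cannot bypass. A minimal Python sketch of the decision (the hosts listed are illustrative):

```python
from urllib.parse import urlparse

# Only destinations the agent demonstrably needs.
EGRESS_ALLOWLIST = {
    "pypi.org",
    "files.pythonhosted.org",
    "api.internal.example.com",  # hypothetical internal endpoint
}

def egress_allowed(url: str) -> bool:
    """Permit outbound requests only to explicitly approved hosts.
    Subdomains are rejected unless listed, to block lookalike bypasses."""
    host = (urlparse(url).hostname or "").lower()
    return host in EGRESS_ALLOWLIST

assert egress_allowed("https://pypi.org/simple/requests/")
assert not egress_allowed("https://exfil.attacker.example/steal")
```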
Ingress as an operational event
Most agent runtimes do not require unsolicited inbound connections. Leaving services exposed by default accumulates unnecessary risk.
When debugging or collaborative inspection is required, exposure should be temporary and scoped — authenticated tunnels opened deliberately and closed promptly. Ingress becomes an operational decision rather than a static configuration state.
Ephemerality is a security control.
Governing model access
Large language models are external systems with cost, compliance and leakage implications. Allowing each runtime to independently manage model credentials fragments oversight.
A centralized gateway restores control. It can restrict approved models, enforce rate ceilings, log prompts and responses, and apply filtering or compliance checks. Agents no longer hold raw provider credentials directly.
The lesson from both container escapes and prompt injection incidents is consistent: implicit trust boundaries erode. Centralized governance reinforces them.
Tooling, identity and friction by design
As agents integrate with repositories, CI systems, deployment pipelines and databases, tool governance becomes inseparable from identity discipline.
Dedicated identities per agent, short-lived tokens and strict RBAC or ABAC reduce the impact of compromise. Reusing human or root-level credentials collapses isolation entirely.
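What “short-lived and scoped” means is concrete. Here is a sketch using PyJWT (the claim names and scope strings are illustrative conventions, not a standard):

```python
import datetime
import jwt  # PyJWT

# In practice this comes from a secret manager, never from source code.
SIGNING_KEY = "replace-with-a-managed-secret"

def mint_agent_token(agent_id: str, scopes: list[str],
                     ttl_minutes: int = 15) -> str:
    """Issue a short-lived, narrowly scoped credential for one agent.
    The expiry bounds the damage window if the token leaks."""
    now = datetime.datetime.now(datetime.timezone.utc)
    claims = {
        "sub": agent_id,                 # one dedicated identity per agent
        "scope": scopes,                 # e.g. ["repo:read"], never "*"
        "iat": now,
        "exp": now + datetime.timedelta(minutes=ttl_minutes),
    }
    return jwt.encode(claims, SIGNING_KEY, algorithm="HS256")

token = mint_agent_token("ci-review-agent", ["repo:read"])
```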
Sensitive actions — sending email, modifying production code, accessing secrets, changing authentication — benefit from friction. Policy checks, approval workflows or out-of-band confirmations create deliberate pauses at high-risk boundaries.
Secrets should not live in prompts. System prompts embedded with credentials have been shown to leak under injection pressure. External secret managers and strict separation between model-visible text and credential material materially reduce exposure.
Continuous adversarial testing
Container escape CVEs and public prompt injection demonstrations share a common lesson: systems fail at integration boundaries, not in isolation. Logging tool calls, data access and network egress creates behavioral baselines against which anomalies — unusual domains, atypical file reads, unexpected tool invocation patterns — can be detected early. Red-teaming and adversarial prompt fuzzing help surface injection paths before attackers do, forcing organizations to confront weaknesses under controlled conditions rather than in production.
Agents can build, test, browse and execute arbitrary code. That capability is powerful — and dangerous when unconstrained. Production readiness is therefore defined not by what agents can do, but by how precisely their boundaries are defined, enforced and observed. The organizations that scale agents successfully will treat infrastructure as policy, isolation as a design decision and monitoring as a first-class requirement — not an afterthought.
This article is published as part of the Foundry Expert Contributor Network.
Addressing the challenges of unstructured data governance for AI 21 Apr 2026, 9:00 am
Large enterprises in regulated industries, especially in data-rich financial services and insurance, have invested significantly in data governance programs. Other businesses have been catching up as part of their efforts to become more data-driven organizations. Data governance often starts with defining policies, classifying data sources, establishing data catalogs, and communicating non-negotiables.
But look a little closer at the implementations, and you’ll see much of the focus has been on governing data warehouses, relational data, and other structured data sources. AI has elevated the importance of implementing data governance and establishing guardrails on unstructured data sources used to train language models and provide context to AI agents.
“Unstructured data now makes up the vast majority of enterprise information, and AI is redefining how organizations bring control, accessibility, and security to it,” says Ashish Mohindroo, general manager and senior vice president of Nutanix Database Service platform. “Leaders should ask themselves, ‘Who needs daily access to this data?’ and ‘How can we keep data safe from unauthorized access or accidental loss?’ ” Those are two key questions to address for all data sources, but they have historically been harder to answer for unstructured ones. I consulted with several experts on these complexities and on how AI can ease unstructured data governance challenges.
Context as important as content
Joanne Friedman, CEO of ReilAI, says that organizations must ensure safety through governed autonomy, which requires shifting from static access control to contract-based safety. “Routing messages is not the same as reasoning about them, connecting assets is not the same as understanding them, and reactive telemetry is not the same as choreographed intelligence,” says Friedman.
Structured data sources are a mix of transactional and relational data, supported by mature technologies to improve data quality and manage metadata. Document stores and other NoSQL databases brought better data management and search capabilities for unstructured data, but it wasn’t until vector databases and large language models (LLMs) emerged that we had tools to derive meaning from documents at scale.
“When I look at unstructured documents, I focus on the risk that lives inside the content because sensitive details hide in places people never review,” says Amanda Levay, CEO of Redactable. “I expect controls that stop those documents from entering unsafe workflows because exposure often happens before anyone knows the risk exists. I also push for systems that flag when a file carries information that shouldn’t move forward, so teams catch the problem at the moment it matters most.”
It’s a lot easier to define controls for accessing rows of structured financial transactions and customer records than to define rules for unstructured documents, such as contracts and health records. Friedman points out that the rules for unstructured documents are more dynamic, while Levay notes the scale and real-time complexities in evaluating documents.
Governance across the life cycle
Where should one begin implementing governance policies? There are many considerations across data pipelines, source data sets, consuming applications, AI models, and AI agents. Stéphan Donzé, founder and CEO of AODocs, says organizations need strong plumbing. He recommends a governed system that:
- Routes content to the right models
- Enforces granular permissions
- Maps relationships between extracted entities and other taxonomies
- Tracks implicit versions
- Calls in humans when the stakes are high
“Without these capabilities, AI becomes another black box. With them, you unlock an auditable, secure, explainable insight layer for data governance, risk, compliance, and mission-critical decisions at enterprise scale,” says Donzé.
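Donzé’s checklist maps naturally onto a single governed entry point. A minimal sketch, with the routes, classifications, and high-stakes threshold invented for the example:

```python
# Illustrative "strong plumbing": one governed entry point that checks
# permissions, routes by content type, and escalates high-stakes items.
ROUTES = {"contract": "legal-extraction-model", "invoice": "finance-model"}
HIGH_STAKES = {"contract"}

def process_document(doc: dict, user_perms: set[str]):
    if f"read:{doc['classification']}" not in user_perms:
        raise PermissionError("user lacks access to this classification")
    model = ROUTES.get(doc["type"], "general-model")   # route content
    if doc["type"] in HIGH_STAKES:
        queue_for_human_review(doc, model)             # call in humans
        return
    run_model(model, doc)

def queue_for_human_review(doc, model): ...  # escalation path stub
def run_model(model, doc): ...               # model dispatch stub
```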
Policies need to be implemented consistently across the full data lineage from source through consumption, including the creation of derivative data.
“One of the biggest security challenges with unstructured data is the lack of visibility and lineage as information moves across systems, clouds, and teams,” says Jack Berkowitz, chief data officer at Securiti. “When organizations cannot track where data originated, how it has changed—even what version is active or whether it is still relevant—they increase the risk of exposing sensitive or inaccurate data through genAI applications.”
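A hedged illustration of the kind of provenance record that restores that visibility; the schema here is an assumption, not any specific catalog’s format:

```python
import hashlib
import time

# Illustrative lineage record: every derivative carries pointers back to
# its sources, so consumers can check origin, transformation, and currency.
def make_lineage_record(content: bytes, source_ids: list[str],
                        transform: str) -> dict:
    return {
        "content_hash": hashlib.sha256(content).hexdigest(),
        "sources": source_ids,      # where this data came from
        "transform": transform,     # how it was produced
        "created_at": time.time(),
        "superseded_by": None,      # set when a newer version lands
    }

record = make_lineage_record(b"summary text...", ["doc-123"],
                             "llm-summarize-v2")
```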
Using AI to classify and categorize
Extracting knowledge from documents, categorizing them, and then classifying them for user entitlements is complex enough. Add to that the fact that documents are roll-ups of sections and subsections, each of which needs independent analysis and must then be related back to the full document’s context.
Consider building construction specifications, which often follow the CSI MasterFormat document standard. CSI MasterFormat has 50 divisions, such as general specifications, electrical, and plumbing. Now consider access controls for this document, given that security is covered in two separate divisions and may require different classifications than other sections, such as equipment. But even that’s not sufficient context, as a general contractor should have different policies for accessing the specifications for a nuclear power plant than for a small office building.
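A small attribute-based sketch of that idea, with the division name and sensitivity labels as illustrative stand-ins:

```python
# Illustrative attribute-based check for the MasterFormat example: access
# depends on both the document section (division) and the project's
# sensitivity, not just the user's role.
def can_view_section(user: dict, division: str, project: dict) -> bool:
    if project["sensitivity"] == "critical-infrastructure":
        # e.g. a nuclear plant: security divisions need extra clearance
        if division in {"28 Electronic Safety and Security"}:
            return "security-cleared" in user["attributes"]
    return "contractor" in user["roles"]

user = {"roles": ["contractor"], "attributes": []}
print(can_view_section(user, "28 Electronic Safety and Security",
                       {"sensitivity": "critical-infrastructure"}))  # False
```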
Complex classification challenges are being addressed with AI and advanced algorithms. “Enterprises are shifting toward commodity-driven, API-driven governance accelerators, especially in areas like classification, taxonomy management, and domain-specific labeling,” says Nandakumar Sivaraman, senior vice president and chief architect of enterprise data at Bridgenext. “Instead of manually applying categories, rules, and policies across thousands of assets, companies are now using AI-driven classification APIs to auto-tag and categorize data. They use machine learning–based pattern detection to assign taxonomies, product hierarchies, or entity domains, and implement lightweight governance microservices for real-time classification in ingestion pipelines.”
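A hedged sketch of such a governance microservice hook, where classify_text() stands in for whatever classification API the stack actually provides:

```python
# Illustrative ingestion hook: every new document is auto-tagged before it
# lands in the catalog. The taxonomy is invented for the example.
TAXONOMY = {"contract", "policy", "health-record", "specification"}

def classify_text(text: str) -> str:
    ...  # stand-in for the AI classification API; returns one label

def ingest(doc_id: str, text: str, catalog: dict):
    label = classify_text(text)
    if label not in TAXONOMY:
        label = "unclassified"   # never let unknown labels pass silently
    catalog[doc_id] = {"label": label, "text": text}
```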
Another approach uses vision language models (VLMs) to analyze the document’s visual structure for additional contextual clues. Harpreet Sahota, hacker-in-residence at Voxel51, says VLMs can classify documents without training data, but the bigger issue is that most organizations don’t have consistent taxonomies to begin with. “A first step is to treat documents as images rather than just extracting text, which preserves layout information that is important for understanding structure,” recommends Sahota.
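A minimal sketch of the documents-as-images step; pdf2image is a real rendering library, while vlm_classify() is an assumed wrapper around whichever VLM is available:

```python
# Illustrative "documents as images" step: render each page to an image,
# preserving layout, and hand it to a vision-language model zero-shot.
from pdf2image import convert_from_path

def classify_document(pdf_path: str) -> str:
    pages = convert_from_path(pdf_path, dpi=150)  # PIL images, layout intact
    # Zero-shot prompt: no training data, just candidate labels.
    return vlm_classify(pages[0],
                        labels=["contract", "invoice", "spec", "report"])

def vlm_classify(image, labels): ...  # assumed VLM wrapper, not a real API
```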
Managing versions and duplicates
Documents can have hundreds of versions and derivatives scattered across SharePoint sites, cloud storage areas, SaaS platforms, and email attachments. One of the more significant unstructured data governance challenges is identifying the latest, accurate versions to include in AI models, retrieval-augmented generation (RAG) systems, and AI agents.
“To improve document versioning, measure the semantic similarity between files and cluster documents that are likely versions of the same document,” says Reece Griffiths, field CTO for Collibra. “Once grouped, apply additional signals, such as last-modified date, metadata, or even title patterns to infer which document in each cluster is the most recent version.”
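Griffiths’ recipe translates fairly directly into code. A minimal sketch, assuming an embedding step has already produced a vector per file:

```python
import numpy as np

def cosine(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def cluster_versions(docs: list[dict], threshold: float = 0.9):
    """Greedy clustering: files above the similarity threshold are treated
    as versions of the same document. Each doc: {"path", "mtime", "vec"}."""
    clusters: list[list[dict]] = []
    for doc in docs:
        for cluster in clusters:
            if cosine(doc["vec"], cluster[0]["vec"]) >= threshold:
                cluster.append(doc)
                break
        else:
            clusters.append([doc])
    return clusters

def latest_version(cluster: list[dict]) -> dict:
    # Tie-break on last-modified time; title patterns or other metadata
    # could be layered on as further signals, per Griffiths.
    return max(cluster, key=lambda d: d["mtime"])
```

The 0.9 threshold is an arbitrary starting point; in practice it would be tuned against a labeled sample of known version pairs.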
Determining document versions was once the province of rules-based systems, with controls for data owners and tools for handling exceptions. Modern systems now incorporate AI to automate or recommend the latest, most accurate documents and suggest which ones to archive.
“Agents excel at processing unstructured data, reading and analyzing the contents of presentations, videos, emails, and chat logs at scale,” says Dr. Michael Wu, chief AI strategist at PROS. “To manage versions, we must combine search and genAI to enhance the practice of ‘search first, search often’ with ‘read all before creating.’ This fosters continuous document evolution, where outdated or incorrect content is naturally updated or flagged for deprecation.”
Document retention policies
Even after duplication is addressed, a key data governance question remains: How to implement document retention policies? “Most organizations have well-defined retention rules for structured data, but applying those same rules to unstructured content has historically been very difficult,” says Griffiths of Collibra. “By performing AI-based tagging of every document according to a retention taxonomy, including record types and subtypes, companies can then query and manage unstructured data with the same precision they apply to structured data sets.”
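A hedged sketch of what management becomes once documents carry AI-assigned record types; the taxonomy and retention periods below are invented for the example:

```python
from datetime import date
from dateutil.relativedelta import relativedelta

# Illustrative retention lookup: once every document is tagged with a
# record type, disposition becomes a simple, queryable computation.
RETENTION_YEARS = {"tax-record": 7, "contract": 10, "meeting-notes": 2}

def disposition_date(record_type: str, created: date) -> date:
    return created + relativedelta(years=RETENTION_YEARS[record_type])

print(disposition_date("tax-record", date(2026, 4, 21)))  # 2033-04-21
```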
Retention policies tend to follow legal guidelines with specific rules. A more difficult challenge is recognizing outdated information in documents that should no longer be used with AI models and agents.
“AI can age documents the way our minds naturally let older memories fade by noticing declining relevance signals, reduced connections to current work, and changing patterns of use,” says Jason Williamson, CEO of MythWorx. “Instead of a hard cutoff, it adapts continuously, helping organizations surface what’s still meaningful while gently retiring what no longer fits the present.”
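One hedged way to express that soft aging, with the weights and half-life as arbitrary assumptions rather than anything from MythWorx:

```python
import math
import time

# Illustrative relevance decay: score fades with age but is propped up by
# links to active work. Documents below a soft threshold get surfaced for
# archival review rather than hard-deleted.
def relevance(doc: dict, now: float | None = None,
              half_life_days: float = 365.0) -> float:
    now = now or time.time()
    age_days = (now - doc["last_accessed"]) / 86400
    decay = math.exp(-math.log(2) * age_days / half_life_days)
    link_boost = min(doc["links_from_active_projects"], 5) / 5
    return 0.7 * decay + 0.3 * link_boost

doc = {"last_accessed": time.time() - 400 * 86400,
       "links_from_active_projects": 1}
print(round(relevance(doc), 2))
```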
Data security from start to finish
Three data disciplines are related: data governance protects the business, data privacy protects people, and data security protects the data. Implementing data security starts with considering how people create and manage documents.
“When you’re dealing with documents at scale, security and governance can’t be separate workflows with handoffs between teams; they become the same integrated workflow, with discovery, classification, and enforcement happening as one coordinated response,” says Rohan Sathe, cofounder and CEO at Nightfall. “Modern platforms need to quarantine inappropriately shared messages, emails, and files the moment they’re detected. They need to revoke over-permissioned access to sensitive documents, prevent unauthorized cloud sync operations, block risky CLI commands, and stop file uploads to unsanctioned destinations—all in real time.”
Since documents feed AI models and AI agents, a second data security consideration is which documents to include and how to protect the data embedded in AI. “The primary risk with AI isn’t just a traditional breach; it’s contextual leakage,” says Nico Dupont, founder and CEO of Cyborg. “Once you ground a model in your enterprise data, that model becomes a potential vector for surfacing sensitive information to unauthorized users, and you cannot rely on the model to be its own gatekeeper. True data security requires inference-time governance and treating AI as a new tier of infrastructure where the security is built into the architecture and is as automated as the data cleaning itself.”
A third consideration is how data is protected as people interact with LLMs and AI agents. These must adhere to the user’s access policies and the usage context. “The primary security risk in AI document management is inference exposure, where an AI might correctly answer a question by accessing a sensitive document that the user technically shouldn’t see,” says James Urquhart, field CTO and developer evangelist at Kamiwaza AI. “To mitigate this risk, organizations must understand the relationships between different entities in their business ontologies and implement permission-aware indexing that ensures that AI and agentic systems respect the same access controls that a human would be subject to.”
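A minimal sketch of permission-aware retrieval under those assumptions (the index interface and the ACL metadata shape are illustrative):

```python
# Filter candidate chunks by the requesting user's ACLs *before* the model
# ever sees them, so the AI cannot "correctly" answer from a document the
# user may not read. index.search() and acl_groups are assumed interfaces.
def retrieve(query_vec, index, user_groups: set[str], k: int = 5):
    candidates = index.search(query_vec, k * 4)   # over-fetch, then filter
    allowed = [c for c in candidates
               if c.metadata["acl_groups"] & user_groups]  # set overlap
    return allowed[:k]
```

The key design choice is that filtering happens at indexing-metadata level, before generation, rather than asking the model to withhold what it has already read.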
One of the most challenging aspects of unstructured data governance is that regulations are evolving and AI capabilities are improving. Policies must evolve as businesses add more data sets, increase AI literacy across their employee base, and expand their AI use cases. Addressing the challenges of unstructured data governance will generate a growing backlog of work for the foreseeable future.
GitHub pauses new Copilot sign-ups as agentic AI strains infrastructure 21 Apr 2026, 8:57 am
GitHub has paused new sign-ups for several individual Copilot plans and tightened usage limits, saying newer agentic coding workflows are consuming far more compute than its original pricing and service model was built to handle.
The move is a reminder that as AI coding assistants grow more autonomous, vendors may have to balance developer demand against infrastructure cost and service reliability.
“As Copilot’s agentic capabilities have expanded rapidly, agents are doing more work, and more customers are hitting usage limits designed to maintain service reliability,” GitHub said in a blog post. “Without further action, service quality degrades for everyone.”
Under the changes, GitHub has paused new sign-ups for its Copilot Pro, Pro+, and Student plans, saying the move will help it better serve existing customers.
The company is also tightening usage limits on individual plans, while positioning Pro+ as the higher-capacity tier with more than five times the limits of Pro for users who need heavier usage.
At the same time, GitHub is narrowing model access: Opus models will no longer be available on Pro plans, while Pro+ will retain Opus 4.7 but lose Opus 4.5 and 4.6.
GitHub said it will now show usage limits directly in VS Code and Copilot CLI so users can more easily track how close they are to those caps.
The company added that affected Pro and Pro+ users who contact support between April 20 and May 20 can request a refund and will not be charged for April usage if the updated plans do not meet their needs.
GitHub’s move comes as other AI vendors are also adjusting usage policies to manage capacity, with Anthropic last month changing how Claude’s timed limits work during peak hours while keeping weekly limits unchanged.
Charlie Dai, vice president and principal analyst at Forrester, said the move shows how agent-driven coding is shifting workloads toward longer-running and parallel sessions that create higher and less predictable compute demand.
“Cost structures built for lightweight assistance no longer hold, and this puts pressure on GPU capacity, reliability, and unit economics,” Dai said.
Dai added that similar usage restrictions by major model providers suggest capacity rationing is likely to become a structural feature of the industry as agentic development becomes more routine.
Impact for developers
GitHub said Copilot now operates with both session limits and weekly seven-day limits, and that those caps are based on token consumption and model multipliers rather than just raw request counts. Users may still have premium requests left and yet hit a usage limit, because the two systems are separate.
In practice, that means developers using heavier agent-style workflows, especially long-running or parallel sessions, are more likely to hit limits than those using Copilot for simpler tasks.
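A back-of-the-envelope illustration of multiplier-based metering as GitHub describes it; the multipliers and weekly cap below are invented, not GitHub’s published numbers:

```python
# Usage is weighted token consumption, not raw request counts: heavier
# models burn the budget faster for the same number of tokens.
MODEL_MULTIPLIER = {"small-model": 0.25, "frontier-model": 3.0}
WEEKLY_CAP = 1_000_000  # weighted-token budget (illustrative)

def weighted_usage(sessions: list[dict]) -> float:
    return sum(s["tokens"] * MODEL_MULTIPLIER[s["model"]] for s in sessions)

sessions = [{"model": "frontier-model", "tokens": 200_000},
            {"model": "small-model", "tokens": 400_000}]
print(weighted_usage(sessions) / WEEKLY_CAP)  # 0.7 -> 70% of the weekly cap
```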
GitHub is encouraging users nearing their caps to switch to lower-multiplier models, use plan mode in VS Code and Copilot CLI, and cut back on parallel workflows such as /fleet.
Analysts said the move also reflects a familiar pattern in the tech industry.
“First you give users access to a tool with relatively open usage, and then gradually start defining limits as adoption grows,” said Faisal Kawoosa, founder and chief analyst at Techarc. “GitHub has an unavoidable role in the developer world. A developer can live without an email ID, but not a GitHub account. Such is the depth of its integration. But at the same time, the rationalization of AI/Copilot in the ecosystem is inevitable, as resources are constrained.”
Kawoosa added that developers have now seen what Copilot can do, and there is little reason for GitHub to keep offering it without tighter limits. He said the next step is likely to be more differentiated plans that create clearer monetization opportunities among individual users.
For enterprise engineering leaders, Dai said the episode is a reminder to evaluate AI coding tools as metered infrastructure rather than unlimited productivity layers. He said buyers should pay close attention to usage ceilings, downgrade behavior, model entitlements, and how clearly vendors communicate limits and cost controls to developers.
Hackers exploit Vercel’s trust in AI integration 20 Apr 2026, 12:13 pm
Frontend cloud platform Vercel, the creator of Next.js and Turborepo, has warned of a data breach after a compromised third-party AI application abused OAuth to access its internal systems.
A Vercel employee had connected the third-party app, identified as Context.ai, to their Google Workspace account. When the app was compromised, attackers took over that account and accessed some environment variables that the company said were not marked as “sensitive.”
“Environment variables marked as ‘sensitive’ in Vercel are stored in a manner that prevents them from being read, and we currently do not have evidence that those values were accessed,” Vercel said in a security post.
The incident compromised what the company described as a “limited subset” of customers whose Vercel credentials were exposed. Vercel said it has contacted those customers and asked them to rotate their credentials.
According to reports surfacing on the internet, a threat actor claiming to be ShinyHunters began attempting to sell the stolen data, which allegedly includes access keys, source code, and a private database, even before Vercel confirmed the breach publicly.
Hacking the access
Vercel’s disclosure confirmed that the initial access vector was Google Workspace OAuth tied to Context.ai. Once the application was compromised, attackers inherited the permissions granted to it, including access to the Vercel employee’s account.
It remains unclear whether Context.ai’s infrastructure was compromised, whether OAuth tokens were stolen, or whether a session or token leak within the AI workspace enabled attackers to abuse authenticated access to Vercel’s environments. Context.ai did not immediately respond to CSO’s request for comment.
“We have engaged Context.ai directly to understand the full scope of the underlying compromise,” Vercel said in the post. “We assess the attacker as highly sophisticated based on their operational velocity and detailed understanding of Vercel’s systems. We are working with Mandiant, additional cybersecurity firms, industry peers, and law enforcement.”
Vercel has urged its customers to review activity logs for suspicious behavior and to rotate environment variables, especially any unprotected secrets that may have been exposed. It also recommended enabling sensitive variable protections, checking recent deployments for anomalies, and strengthening safeguards by updating deployment protection settings and rotating related tokens where needed.
Sensitive secrets, including API keys, tokens, database credentials, and signing keys that were not marked as “sensitive,” should be treated as potentially exposed and rotated as a priority, Vercel emphasized.
For worried customers, Vercel offered a shortcut. “If you have not been contacted, we do not have reason to believe that your Vercel credentials or personal data have been compromised at this time,” the post reassured.
Allegedly breached by ShinyHunters
According to screenshots circulating on the internet, a threat actor has already claimed the breach on the dark web and is attempting to sell the spoils. “Greetings All, Today I am selling Access Key/ Source Code/ Database from Vercel company,” the actor said in one such post. “Give me a quote if you’re interested. This could be the largest supply chain attack ever if done right.”
The data was put up for $2 million on April 19.
In the screenshot, the threat actor can be seen using a “BreachForums” domain and implicitly claiming to be ShinyHunters, one of the operators of the notorious leak forum. Other giveaways include a Telegram channel “@Shinyc0rpsss” and an email address “shinysevy@tutamail.com” mentioned in the post.
While recent incidents have hinted at ShinyHunters resurfacing after takedowns and alleged arrests, it remains likely that this is an imposter leveraging the name to lend credibility, something that has precedent.