How Machine Identities Became Your Biggest Security Blind Spot
- 28.65 million hardcoded secrets were exposed in public GitHub commits in 2025 alone — a 34% year-over-year surge and the single largest annual jump on record, per GitGuardian's State of Secrets Sprawl 2026.
- 71% of organizations suffered at least one identity-related breach in the past 12 months, with weak non-human identity (NHI) management responsible for 40.6% of root causes, according to Sophos's survey of 5,000 security professionals.
- AI-assisted code commits expose secrets at a 3.2% rate — roughly double the baseline for standard commits — with over 1.27 million AI-service credentials scattered across public repositories in 2025.
- Gartner projects that 25% of enterprise GenAI applications will experience at least five minor security incidents per year by 2028, driven by immature agentic AI governance and emerging vectors like Model Context Protocol (MCP) configuration abuse.
The Evidence
28.65 million. That is the count of new hardcoded secrets (API keys, service tokens, database passwords, and machine credentials) committed to publicly accessible GitHub repositories during 2025. GitGuardian's State of Secrets Sprawl 2026 report, published in March 2026, describes this as a 34% year-over-year increase and the largest single-year volume its researchers have documented. SC Media's feature investigation, surfaced via Google News, connects this credential avalanche directly to a structural identity crisis reshaping enterprise attack surfaces: one where the most dangerous threat actor may already be inside the perimeter, wearing an AI agent's credentials.
The vector has shifted. It is no longer only the developer who forgets to scrub a token before a commit. AI-assisted code, generated or autocompleted with tools such as GitHub Copilot, leaks secrets at a 3.2% rate, approximately double the baseline for human-only commits. In 2025 that translated into 1,275,105 secrets tied specifically to AI services appearing in public repositories. A more granular finding compounds the problem: 24,008 unique secrets were extracted from Model Context Protocol (MCP) configuration files, a direct consequence of popular MCP setup tutorials instructing developers to embed API keys directly into config files and command-line arguments as a convenience shortcut.
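The MCP finding above can be illustrated with a minimal Python sketch that walks a config file and reports credentials written as literals rather than as references to environment variables. The `mcpServers` layout mirrors the shape several MCP clients use, but the server names, the flag name, the sensitive-name pattern, and the `${VAR}` reference convention are all assumptions made for illustration; whether a given client expands `${VAR}` needs to be checked against that client's documentation.

```python
import json
import re

# Key names that suggest a credential; an illustrative list, not exhaustive.
SENSITIVE_NAME = re.compile(r"(key|token|secret|password)", re.IGNORECASE)
# Values like ${DOCS_API_KEY} are treated as references resolved at runtime.
ENV_REFERENCE = re.compile(r"^\$\{[A-Za-z_][A-Za-z0-9_]*\}$")


def find_inline_secrets(config_text):
    """Return (server, location) pairs where a credential appears as a literal."""
    config = json.loads(config_text)
    findings = []
    for server, spec in config.get("mcpServers", {}).items():
        # Literal values stored under sensitive-looking names in the env block.
        for name, value in spec.get("env", {}).items():
            if SENSITIVE_NAME.search(name) and not ENV_REFERENCE.match(str(value)):
                findings.append((server, f"env.{name}"))
        # Credentials passed as --flag=value command-line arguments.
        for arg in spec.get("args", []):
            flag, sep, value = str(arg).partition("=")
            if sep and SENSITIVE_NAME.search(flag) and not ENV_REFERENCE.match(value):
                findings.append((server, f"args.{flag}"))
    return findings


# Hypothetical config: "search" embeds literals, "docs" uses a reference.
EXAMPLE = """
{
  "mcpServers": {
    "search": {
      "command": "npx",
      "args": ["example-search-server", "--api-key=abc123literal"],
      "env": {"SEARCH_TOKEN": "tok_live_0000"}
    },
    "docs": {
      "command": "npx",
      "args": ["example-docs-server"],
      "env": {"DOCS_API_KEY": "${DOCS_API_KEY}"}
    }
  }
}
"""

if __name__ == "__main__":
    for server, location in find_inline_secrets(EXAMPLE):
        print(f"{server}: literal credential at {location}")
```

A name-based check like this will miss credentials stored under innocuous key names, so it complements a dedicated scanner and does not replace one.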
Sophos's State of Identity Security 2026, drawn from interviews with 5,000 IT and security professionals completed in May 2026, attaches a breach rate to this pattern. Seventy-one percent of surveyed organizations reported at least one identity-related breach in the preceding twelve months. Non-human identities — service accounts, API tokens, OAuth grants, and AI agent credentials — were the root cause in 40.6% of those incidents. The math is not in defenders' favor.
What It Means for Your Organization's Security
The threat actor in the next major breach may not be an external adversary who phished a help-desk employee. It may be an AI agent your own engineering team provisioned last quarter — granted broad permissions, connected to production databases, and never subjected to the credential rotation schedule applied to human accounts. Gartner's April 2026 analyst advisory stated it directly: "Rapid adoption of agentic AI is outpacing the ability of enterprises to effectively secure it — rapid AI adoption amplifies long-standing IAM challenges, especially when a single AI agent requires multiple accounts with inherited rights and access control becomes nearly impossible to govern."
The scale of non-human identity (NHI) sprawl is what makes this problem structurally different from previous credential security challenges. Rubrik Zero Labs places the NHI-to-human identity ratio at 45:1 across modern enterprises. In cloud-native and DevOps environments, Entro Labs H1 2025 research puts that ratio at 144:1, with the average enterprise managing more than 250,000 NHIs across cloud infrastructure. Cybersecurity best practices built for human-scale identity governance — quarterly access reviews, mandatory MFA enrollment, manual password rotation — were never designed for this density.
Chart: Year-over-year growth in hardcoded secrets exposed in public GitHub commits, 2024 vs. 2025. Source: GitGuardian State of Secrets Sprawl 2026.
The data on unrevoked credentials reveals just how poorly existing controls are working. Only 15% of organizations report high confidence in their ability to prevent NHI-based attacks. Forty-seven percent of NHIs have not had credentials rotated in over a year. Most starkly: 64% of valid secrets that leaked in 2022 remained active and unrevoked as of 2026, according to combined GitGuardian and Cloud Security Alliance research. Each of those credentials represents a persistent open door that any threat actor with access to historical repository data can walk through.
The financial exposure makes the governance failure concrete. Sophos's May 2026 findings put the average remediation cost for a successful identity breach at $1.64 million in 2026, with nearly half of all victims experiencing data theft or ransomware as a direct outcome. Organizations with demonstrably weak NHI management paid approximately $150,000 more to recover than the average breach victim — pushing their total incident response cost toward $1.79 million. Effective threat intelligence programs and pre-built response playbooks can reduce that figure materially, but only after the NHI inventory problem is addressed at its root.
SC Media's feature framing summarizes the directional risk: a high-profile breach will trace back not to a human account, but to an AI agent or machine identity carrying excessive, unsupervised access — and enterprises integrating AI copilots, pipelines, and autonomous agents into production are constructing attack surfaces that traditional identity governance was never architected to manage. Security awareness training focused exclusively on human phishing behaviors is increasingly insufficient when the blast radius of a single over-privileged AI agent can span dozens of connected systems.
The AI Angle
The same AI systems generating developer productivity gains are simultaneously accelerating the rate at which credentials leak into repositories. GitGuardian's secrets detection platform and CyberArk's Conjur secrets management solution represent the category of tooling built specifically for this problem — continuous repository scanning paired with automated revocation workflows that can operate at the velocity AI-assisted commit pipelines demand. GitGuardian's own data infrastructure is what produced the 1,275,105 AI-service secret count and the 24,008 MCP configuration exposure figure cited in this analysis, demonstrating that automated threat intelligence tooling is already cataloguing exposures faster than most manual review processes can respond.
For teams building or operating agentic AI workloads — the production deployment patterns examined in depth by Smart AI Agents' breakdown of what separates real production systems from demos — the security implication is direct: every agent that touches an API, a database connection, or a third-party integration requires its own credential lifecycle, scoped to least privilege from provisioning. Gartner's projection that 25% of enterprise GenAI applications will log five or more minor security incidents annually by 2028 reflects a governance deficit that is not primarily a technology gap. It is a policy-as-code (automated security rules enforced programmatically at deployment) and runtime monitoring problem — one the emerging class of NHI security platforms is beginning to address at scale.
How to Act on This
Build a complete NHI inventory first. No compensating control works without a map of every non-human identity in your environment: service accounts, API tokens, OAuth applications, AI agent credentials, and machine certificates. Platforms such as Entro, Astrix Security, or CyberArk Identity can automate discovery across cloud accounts and SaaS environments. Flag every NHI older than 90 days without documented rotation, and any credential that surfaces in a version-controlled repository. This inventory is your blast radius assessment for the identity layer, and it is the prerequisite for every subsequent data protection and incident response action.
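As a rough sketch of the two flag rules in the inventory step, the Python below marks identities that have gone more than 90 days without a documented rotation or that have appeared in version control. The `MachineIdentity` record and its field names are hypothetical; a real inventory would come from a discovery platform's export, and the 90-day rule is interpreted here as "time since last rotation, or since creation if never rotated".

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional

ROTATION_WINDOW = timedelta(days=90)


@dataclass
class MachineIdentity:
    name: str
    kind: str                      # e.g. "service_account", "api_token"
    created: date
    last_rotated: Optional[date]   # None means no documented rotation
    found_in_repo: bool = False


def flag_identities(inventory, today):
    """Return {name: [reasons]} for identities matching either flag rule."""
    flagged = {}
    for nhi in inventory:
        reasons = []
        # Fall back to the creation date when no rotation was ever recorded.
        last_change = nhi.last_rotated or nhi.created
        if today - last_change > ROTATION_WINDOW:
            reasons.append("no rotation in over 90 days")
        if nhi.found_in_repo:
            reasons.append("credential present in version control")
        if reasons:
            flagged[nhi.name] = reasons
    return flagged


# Hypothetical inventory rows for illustration only.
SAMPLE = [
    MachineIdentity("ci-deployer", "service_account", date(2025, 1, 10), None),
    MachineIdentity("billing-api", "api_token", date(2025, 1, 10), date(2026, 4, 20)),
    MachineIdentity("support-agent", "ai_agent", date(2026, 5, 1), None, found_in_repo=True),
]

if __name__ == "__main__":
    for name, reasons in flag_identities(SAMPLE, date(2026, 5, 15)).items():
        print(name, "->", "; ".join(reasons))
```

Keeping the reasons as a list lets one identity carry both flags, which helps when prioritizing remediation.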
Gate commits with automated secret scanning. Manual code review cannot match the commit velocity of AI-assisted development. GitGuardian, TruffleHog, and GitHub Advanced Security's native secret scanning can run as pre-commit hooks, push protection, or pipeline gates that reject credential-containing code before it reaches a shared or public repository. Pay particular attention to MCP configuration files and environment variable handling in AI scaffolding templates; these represent a newly documented attack surface largely absent from existing cybersecurity best practices checklists. This control can often ship as a single pipeline configuration change with little development overhead, and it directly addresses the root cause behind the 24,008 MCP exposures recorded in 2025.
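To make the idea of a pipeline gate concrete, here is a minimal Python sketch of a regex-based check that returns a non-zero exit code when a file contains something that looks like a credential. It is not how GitGuardian, TruffleHog, or GitHub's scanner work internally; the three rules are an illustrative assumption (two well-known token prefixes plus one generic assignment rule), and production scanners use far larger rule sets along with validity checks.

```python
import re
import sys

# Illustrative rules only: an AWS access key ID, a classic GitHub personal
# access token, and a generic "sensitive name assigned a quoted value" rule.
PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "github_pat": re.compile(r"\bghp_[A-Za-z0-9]{36}\b"),
    "generic_assignment": re.compile(
        r"(?i)(api_key|secret|token|password)\w*[\"']?\s*[:=]\s*[\"'][^\"']{12,}[\"']"
    ),
}


def scan_text(path, text):
    """Return (path, line_number, rule) for every rule that matches a line."""
    findings = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for rule, pattern in PATTERNS.items():
            if pattern.search(line):
                findings.append((path, line_no, rule))
    return findings


def main(paths):
    """Scan the given files; return 1 if anything was found, else 0."""
    findings = []
    for path in paths:
        with open(path, encoding="utf-8", errors="ignore") as handle:
            findings.extend(scan_text(path, handle.read()))
    for path, line_no, rule in findings:
        print(f"{path}:{line_no}: possible secret ({rule})")
    return 1 if findings else 0


if __name__ == "__main__" and len(sys.argv) > 1:
    # A hook or CI step passes the changed files as arguments.
    sys.exit(main(sys.argv[1:]))
```

Wired into a pre-commit hook or CI step, the non-zero exit code is what blocks the commit or fails the build; the script itself only reports.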
Scope and expire every machine credential. AI agents routinely inherit permissions from the developer or service account that provisioned them, a pattern that structurally overprovisions access. Audit every active agent's permission scope against the principle of least privilege (each identity should hold only the minimum access required to execute its defined function, nothing more). Set credential TTLs (time-to-live limits, the maximum age before automatic expiration) of 30 days or fewer for all machine accounts, enforced through a secrets manager such as HashiCorp Vault, AWS Secrets Manager, or Azure Key Vault. Treat any NHI holding standing administrative rights as a critical escalation requiring immediate scope reduction. This single control addresses the 47% of NHIs whose credentials have gone unrotated for over a year, and removes the standing access that makes AI agent compromise so operationally damaging.
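The audit described above can be sketched as a small policy check in Python. It compares the scopes an agent was granted with the scopes it needs, looks for standing administrative access, and verifies the 30-day TTL ceiling. The `AgentCredential` record, the scope strings, and the `ADMIN_SCOPES` set are hypothetical; in practice the inputs would come from your secrets manager and IAM exports.

```python
from dataclasses import dataclass
from datetime import timedelta
from typing import Optional, Set

MAX_TTL = timedelta(days=30)
ADMIN_SCOPES = {"admin", "*"}   # hypothetical names for administrative access


@dataclass
class AgentCredential:
    agent: str
    granted_scopes: Set[str]
    required_scopes: Set[str]
    ttl: Optional[timedelta]    # None means the credential never expires


def audit(cred):
    """Return a list of policy violations for one credential."""
    issues = []
    # Least privilege: anything granted beyond what the agent needs.
    excess = cred.granted_scopes - cred.required_scopes
    if excess:
        issues.append(f"excess scopes: {sorted(excess)}")
    if cred.granted_scopes & ADMIN_SCOPES:
        issues.append("standing administrative access")
    if cred.ttl is None or cred.ttl > MAX_TTL:
        issues.append("TTL missing or longer than 30 days")
    return issues


# Hypothetical credentials for illustration only.
OVERPROVISIONED = AgentCredential(
    "report-bot", {"db:read", "db:write", "admin"}, {"db:read"}, None
)
WELL_SCOPED = AgentCredential(
    "etl-agent", {"queue:read"}, {"queue:read"}, timedelta(days=7)
)

if __name__ == "__main__":
    for cred in (OVERPROVISIONED, WELL_SCOPED):
        print(cred.agent, audit(cred) or "ok")
```

Running a check like this at deployment time is one way to express the policy as code; the enforcement itself still has to live in the secrets manager or IAM layer.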
Frequently Asked Questions
How do non-human identities actually cause enterprise data breaches in cloud environments?
Non-human identities — API keys, service account tokens, machine certificates, and AI agent credentials — are compromised through several distinct vectors: hardcoded secrets embedded in code repositories, permissions that significantly exceed what the service actually requires, credentials left unrotated for months or years after initial deployment, and configuration file exposure in tooling such as MCP setups. When a threat actor obtains a valid NHI credential, they can move laterally across connected systems, access sensitive data stores, exfiltrate records, or deploy ransomware — often without triggering detection rules calibrated around human login patterns and session behavior. Strong data protection controls, credential scoping, and automated rotation are the primary technical defenses.
What is NHI sprawl and why does it make enforcing cybersecurity best practices so difficult?
NHI sprawl describes the uncontrolled proliferation of machine identity credentials across an organization's cloud, on-premises, and SaaS footprint. As enterprises adopt DevOps automation, third-party API integrations, and now agentic AI pipelines, the number of machine identities can reach 45 to 144 times the human headcount — far exceeding the capacity of manual identity governance processes to track and maintain. Standard cybersecurity best practices like multi-factor authentication enrollment, quarterly access reviews, and password rotation policies were designed for human-scale identity management. NHI sprawl breaks those operational models entirely and requires automated discovery, continuous monitoring, and policy-as-code enforcement tooling to remain functional at enterprise scale.
How can small businesses protect themselves from AI agent security risks without a dedicated security team?
Small businesses can meaningfully reduce their exposure with three focused controls that require no dedicated security staff. First, enable secret scanning on your GitHub or GitLab repositories. GitHub provides it at no cost for public repositories, and GitLab includes pipeline secret detection in its free tier, so hardcoded credentials can be flagged before they propagate. Second, use a secrets manager to store and automatically rotate any API keys your software consumes. HashiCorp Vault has a free Community edition, and AWS Secrets Manager offers a 30-day free trial before per-secret pricing applies. Third, conduct a quarterly review of every SaaS integration and AI tool connected to your accounts, revoking access for anything inactive. These steps address the most common NHI attack vectors and establish a practical foundation for future incident response preparation without requiring specialized security awareness training infrastructure.
What does the Gartner GenAI security projection mean for enterprises currently deploying AI copilots and autonomous agents?
Gartner's April 2026 projection — that 25% of enterprise GenAI applications will log five or more minor security incidents per year by 2028 — signals a maturity gap between the speed of AI deployment and the readiness of security governance programs. For organizations deploying AI copilots or autonomous agents today, it is a directive to build security awareness and access control frameworks in parallel with the AI rollout rather than as a retrofit. Practically, this means defining and enforcing granular permission boundaries for every agent from day one, logging all agent actions to an auditable trail, establishing threat intelligence feeds for AI-specific credential exposure, and writing incident response runbooks specific to AI agent compromise scenarios before an incident occurs rather than during one.
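Two of the practices named above, permission boundaries and an auditable trail, can be combined in a thin wrapper around agent actions. The Python below is a minimal sketch under stated assumptions: the `AuditedAgent` class, the action names, and the use of an in-memory list as the log sink are all hypothetical, and a production trail would go to append-only, tamper-evident storage.

```python
import json
from datetime import datetime, timezone


class AuditedAgent:
    """Wraps agent actions with an allow-list check and an append-only log."""

    def __init__(self, name, allowed_actions, audit_log):
        self.name = name
        self.allowed_actions = frozenset(allowed_actions)
        self.audit_log = audit_log  # any object with append(); a list here

    def perform(self, action, target):
        allowed = action in self.allowed_actions
        # Record the attempt before acting so denied calls are visible too.
        self.audit_log.append(json.dumps({
            "time": datetime.now(timezone.utc).isoformat(),
            "agent": self.name,
            "action": action,
            "target": target,
            "allowed": allowed,
        }, sort_keys=True))
        if not allowed:
            raise PermissionError(f"{self.name} may not {action} {target}")
        return f"{action}:{target}"  # placeholder for the real side effect


if __name__ == "__main__":
    log = []
    agent = AuditedAgent("report-bot", {"read"}, log)
    agent.perform("read", "orders")
    try:
        agent.perform("delete", "orders")
    except PermissionError as error:
        print("blocked:", error)
    print("\n".join(log))
```

Logging denied attempts alongside permitted ones matters for the runbook: a burst of denials from one agent is often the earliest sign of a compromised credential.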
How do I find and revoke leaked API keys and secrets that may already be exposed in GitHub repositories?
Start with GitGuardian's free repository scanning tool, which audits public and private repositories for historical secret exposures including API keys, tokens, and service credentials. GitHub's built-in secret scanning — available on all public repositories and through GitHub Advanced Security for private repositories — provides real-time alerts and, for a growing list of supported services, can automatically notify the issuing provider to trigger revocation. For any exposed credential discovered during this process: revoke it immediately at the issuing service, rotate to a new scoped credential, and run a brief threat intelligence review of any access logs associated with the exposed key to assess whether unauthorized use occurred before revocation. The fact that 64% of secrets leaked in 2022 remain unrevoked as of 2026 reflects how consistently this final revocation step is skipped — treat it as a non-negotiable, time-bound element of your incident response procedure, not an optional cleanup task.
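The access-log review in the final step can be sketched as a simple filter: find every use of the exposed key between the moment it leaked and the moment it was revoked, coming from an address you do not recognize. The log schema here (`time`, `key_id`, `source_ip`) is an assumption made for illustration; real provider logs use their own field names and would need mapping first.

```python
from datetime import datetime, timezone


def suspicious_uses(entries, key_id, exposed_at, revoked_at, known_sources):
    """Return log entries that used key_id during the exposure window
    from a source address outside the expected set."""
    return [
        entry for entry in entries
        if entry["key_id"] == key_id
        and exposed_at <= entry["time"] <= revoked_at
        and entry["source_ip"] not in known_sources
    ]


def utc(*args):
    """Build a timezone-aware UTC datetime."""
    return datetime(*args, tzinfo=timezone.utc)


# Hypothetical log rows using documentation-reserved IP ranges.
LOG = [
    {"time": utc(2026, 3, 1, 9, 0), "key_id": "key-1", "source_ip": "192.0.2.10"},
    {"time": utc(2026, 3, 2, 4, 30), "key_id": "key-1", "source_ip": "203.0.113.77"},
    {"time": utc(2026, 3, 2, 5, 0), "key_id": "key-2", "source_ip": "203.0.113.77"},
    {"time": utc(2026, 3, 9, 12, 0), "key_id": "key-1", "source_ip": "203.0.113.77"},
]

HITS = suspicious_uses(
    LOG,
    "key-1",
    exposed_at=utc(2026, 3, 1),
    revoked_at=utc(2026, 3, 3),
    known_sources={"192.0.2.10"},
)

if __name__ == "__main__":
    for entry in HITS:
        print(entry["time"].isoformat(), entry["source_ip"])
```

Any hit inside the window means the incident is no longer a cleanup task but a potential breach, and the response should escalate accordingly.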
Disclaimer: This article is for informational purposes only and does not constitute professional security consulting advice. Always consult with a qualified cybersecurity professional for your specific needs.