OpenClaw, the open-source AI assistant formerly known as Clawdbot and then Moltbot, crossed 180,000 GitHub stars and drew 2 million visitors in a single week, according to creator Peter Steinberger.
Security researchers scanning the internet found over 1,800 exposed instances leaking API keys, chat histories, and account credentials. The project has been rebranded twice in recent weeks due to trademark disputes.
The grassroots agentic AI movement is also the biggest unmanaged attack surface that most security tools can’t see.
Enterprise security teams didn’t deploy this tool. Neither did their firewalls, EDR, or SIEM. When agents run on BYOD hardware, security stacks go blind. That’s the gap.
Why traditional perimeters can’t see agentic AI threats
Most enterprise defenses treat agentic AI as another development tool requiring standard access controls. OpenClaw proves that the assumption is architecturally wrong.
Agents operate within authorized permissions, pull context from attacker-influenceable sources, and execute actions autonomously. Your perimeter sees none of it. A wrong threat model means wrong controls, which means blind spots.
“AI runtime attacks are semantic rather than syntactic,” Carter Rees, VP of Artificial Intelligence at Reputation, told VentureBeat. “A phrase as innocuous as ‘Ignore previous instructions’ can carry a payload as devastating as a buffer overflow, yet it shares no commonality with known malware signatures.”
Simon Willison, the software developer and AI researcher who coined the term “prompt injection,” describes what he calls the “lethal trifecta” for AI agents. They include access to private data, exposure to untrusted content, and the ability to communicate externally. When these three capabilities combine, attackers can trick the agent into accessing private information and sending it to them. Willison warns that all this can happen without a single alert being sent.
OpenClaw has all three. It reads emails and documents, pulls information from websites or shared files, and acts by sending messages or triggering automated tasks. An organization’s firewall sees HTTP 200. SOC teams see their EDR monitoring process behavior, not semantic content. The threat is semantic manipulation, not unauthorized access.
Why this isn’t limited to enthusiast developers
IBM Research scientists Kaoutar El Maghraoui and Marina Danilevsky analyzed OpenClaw this week and concluded it challenges the hypothesis that autonomous AI agents must be vertically integrated. The tool demonstrates that “this loose, open-source layer can be incredibly powerful if it has full system access” and that creating agents with true autonomy is “not limited to large enterprises” but “can also be community driven.”
That’s exactly what makes it dangerous for enterprise security. A highly capable agent without proper safety controls creates major vulnerabilities in work contexts. El Maghraoui stressed that the question has shifted from whether open agentic platforms can work to “what kind of integration matters most, and in what context.” The security questions aren’t optional anymore.
What Shodan scans revealed about exposed gateways
Security researcher Jamieson O’Reilly, founder of red-teaming company Dvuln, identified exposed OpenClaw servers using Shodan by searching for characteristic HTML fingerprints. A simple search for “Clawdbot Control” yielded hundreds of results within seconds. Of the instances he examined manually, eight were completely open with no authentication. These instances provided full access to run commands and view configuration data to anyone discovering them.
O’Reilly found Anthropic API keys. Telegram bot tokens. Slack OAuth credentials. Complete conversation histories across every integrated chat platform. Two instances gave up months of private conversations the moment the WebSocket handshake completed. The network sees localhost traffic. Security teams have no visibility into what agents are calling or what data they’re returning.
Here’s why: OpenClaw trusts localhost by default with no authentication required. Most deployments sit behind nginx or Caddy as a reverse proxy, so every connection looks like it’s coming from 127.0.0.1 and gets treated as trusted local traffic. External requests walk right in. O’Reilly’s specific attack vector has been patched, but the architecture that allowed it hasn’t changed.
Why Cisco calls it a ‘security nightmare’
Cisco’s AI Threat & Security Research team published its assessment this week, calling OpenClaw “groundbreaking” from a capability perspective but “an absolute nightmare” from a security perspective.
Cisco’s team released an open-source Skill Scanner that combines static analysis, behavioral dataflow, LLM semantic analysis, and VirusTotal scanning to detect malicious agent skills. It tested a third-party skill called “What Would Elon Do?” against OpenClaw. The verdict was a decisive failure. Nine security findings surfaced, including two critical and five high-severity issues.
The skill was functionally malware. It instructed the bot to execute a curl command, sending data to an external server controlled by the skill author. Silent execution, zero user awareness. The skill also deployed direct prompt injection to bypass safety guidelines.
“The LLM cannot inherently distinguish between trusted user instructions and untrusted retrieved data,” Rees said. “It may execute the embedded command, effectively becoming a ‘confused deputy’ acting on behalf of the attacker.” AI agents with system access become covert data-leak channels that bypass traditional DLP, proxies, and endpoint monitoring.
Why security teams’ visibility just got worse
The control gap is widening faster than most security teams realize. As of Friday, OpenClaw-based agents are forming their own social networks. Communication channels that exist outside human visibility entirely.
Moltbook bills itself as “a social network for AI agents” where “humans are welcome to observe.” Posts go through the API, not through a human-visible interface. Astral Codex Ten’s Scott Alexander confirmed it’s not trivially fabricated. He asked his own Claude to participate, and “it made comments pretty similar to all the others.” One human confirmed their agent started a religion-themed community “while I slept.”
Security implications are immediate. To join, agents execute external shell scripts that rewrite their configuration files. They post about their work, their users’ habits, and their errors. Context leakage as table stakes for participation. Any prompt injection in a Moltbook post cascades into your agent’s other capabilities through MCP connections.
Moltbook is a microcosm of the broader problem. The same autonomy that makes agents useful makes them vulnerable. The more they can do independently, the more damage a compromised instruction set can cause. The capability curve is outrunning the security curve by a wide margin. And the people building these tools are often more excited about what’s possible than concerned about what’s exploitable.
What security leaders need to do on Monday morning
Web application firewalls see agent traffic as normal HTTPS. EDR tools monitor process behavior, not semantic content. A typical corporate network sees localhost traffic when agents call MCP servers.
“Treat agents as production infrastructure, not a productivity app: least privilege, scoped tokens, allowlisted actions, strong authentication on every integration, and auditability end-to-end,” Itamar Golan, founder of Prompt Security (now part of SentinelOne), told VentureBeat in an exclusive interview.
Audit your network for exposed agentic AI gateways. Run Shodan scans against your IP ranges for OpenClaw, Moltbot, and Clawdbot signatures. If your developers are experimenting, you want to know before attackers do.
Map where Willison’s lethal trifecta exists in your environment. Identify systems combining private data access, untrusted content exposure, and external communication. Assume any agent with all three is vulnerable until proven otherwise.
Segment access aggressively. Your agent doesn’t need access to all of Gmail, all of SharePoint, all of Slack, and all your databases simultaneously. Treat agents as privileged users. Log the agent’s actions, not just the user’s authentication.
Scan your agent skills for malicious behavior. Cisco released its Skill Scanner as open source. Use it. Some of the most damaging behavior hides inside the files themselves.
Update your incident response playbooks. Prompt injection doesn’t look like a traditional attack. There’s no malware signature, no network anomaly, no unauthorized access. The attack happens inside the model’s reasoning. Your SOC needs to know what to look for.
Establish policy before you ban. You can’t prohibit experimentation without becoming the productivity blocker your developers route around. Build guardrails that channel innovation rather than block it. Shadow AI is already in your environment. The question is whether you have visibility into it.
The bottom line
OpenClaw isn’t the threat. It’s the signal. The security gaps exposing these instances will expose every agentic AI deployment your organization builds or adopts over the next two years. Grassroots experimentation already happened. Control gaps are documented. Attack patterns are published.
The agentic AI security model you build in the next 30 days determines whether your organization captures productivity gains or becomes the next breach disclosure. Validate your controls now.
