TD Stuff

OpenClaw: The Dangerous Magic of Autonomous AI

Mar 23, 2026278

OpenClaw provides transformative automation but creates a 'Faustian bargain' where users trade their total digital security for the convenience of an autonomous AI assistant.

AI Agents Prompt Injection Supply Chain Security Sandboxing Cybersecurity

Agentic Systems

Snowflake Patches Critical Sandbox Escape and Malware Execution Flaw in Cortex AI

Mar 18, 2026266

Snowflake Cortex Code CLI was vulnerable to a sandbox escape and human-in-the-loop bypass that allowed unauthorized malware execution via indirect prompt injection.

Prompt Injection Sandboxing AI Agents Vulnerability Research Cybersecurity

Agentic Systems

Vetting the Blast Radius: The AI Skills Security Index

Mar 16, 2026

A security database that evaluates and ranks the instructional risks and permission levels of AI agent skills to prevent exploitation.

AI Agents Prompt Injection Cybersecurity AI Safety Vulnerability Research

Agentic Systems

NanoClaw and Docker: Hardened Isolation for AI Agent Teams

Mar 13, 2026149

NanoClaw leverages Docker Sandboxes to create a multi-layered, secure runtime that isolates AI agents from each other and the host system.

AI Agents Sandboxing Containerization Multi-Agent Systems Prompt Injection

Under the Hood

Defending RAG Systems Against Knowledge Base Poisoning

Mar 12, 2026

Knowledge base poisoning is a persistent threat to RAG systems that is best countered by detecting semantic anomalies during the data ingestion process.

Retrieval-Augmented Generation Prompt Injection AI Safety Vector Databases Cybersecurity

Damage Control

Autonomous AI Agent Breaches McKinsey’s Lilli Platform

Mar 11, 2026499

An autonomous AI agent hacked McKinsey’s internal AI platform in two hours, exposing millions of confidential records and highlighting the urgent need to secure the prompt layer.

Prompt Injection AI Agents Vulnerability Research Retrieval-Augmented Generation AI-Enabled Cybercrime

Agentic Systems

Design for Distrust: Securing AI Agents via Container Isolation

Feb 28, 2026344

Secure AI agent development requires a 'design for distrust' approach that uses container isolation and minimal code to contain potential damage.

AI Agents AI Safety Sandboxing Prompt Injection

Under the Hood

The $100 AI Prompt Injection Challenge

Feb 17, 2026369

A $100 bounty challenge invites hackers to leak a secret file from an AI assistant using email-based prompt injection.

Prompt Injection AI Safety Prompt Engineering AI Ethics

Damage Control

Moltbook: AI Theater, Not AGI—And a Security Wake-Up Call

Feb 10, 2026317

Moltbook is a flashy but hollow showcase of bot behavior—more human-run theater than autonomous intelligence—and a wake-up call about large-scale agent security risks.

AI Agents AI Hype AI Safety Prompt Injection

Agentic Systems

Test Your AI Agent Against Hidden Prompt Injections

Feb 6, 2026

A practical arena to benchmark and harden AI agents against hidden prompt injection attacks in web content.

Prompt Injection AI Agents AI Safety AI Benchmarks

Agentic Systems

Moltbook: The Wild, Risky Social Network for AI Agents

Jan 30, 2026193

Moltbook is a thrilling, risky showcase of autonomous AI agents’ power—and a warning that demand is outrunning safety.

AI Agents AI Safety Prompt Injection Open Source

Products & Announcements

OpenClaw: A Security-First, Local AI Agent Rebrand and Release

Jan 30, 2026667

OpenClaw is the new, security-focused, local-first AI agent platform that lives in your chat apps and is scaling with the community.

AI Agents Open Source Prompt Injection AI Safety Self-Hosting

Damage Control

Notion AI Pre-Approval Edits Enable Prompt-Injection Data Exfiltration

Jan 8, 2026206

Notion AI saves edits before consent, enabling prompt-injected external image loads that exfiltrate user data regardless of user approval.

Prompt Injection Data Privacy AI Safety Corporate Accountability Vulnerability Research

Under the Hood

Anthropic Confirms Claude 4.5 ‘Soul Doc’ Training, Tied to Better Prompt-Injection Defense

Dec 2, 2025342

Anthropic confirms Claude 4.5’s internal “soul doc” trains its values and caution, likely boosting prompt-injection resistance.

AI Safety Prompt Injection AI Ethics Model Fine-Tuning

Under the Hood

Engineer AI for Failure: Contain Prompt Injection

Sep 26, 2025115

Stop prompt-injection harm by engineering AI like machines: assume failure, isolate, constrain, and verify.

Prompt Injection AI Safety Sandboxing Defense in Depth