Exec x AI (formerly S.AI.L) Look Out Series #3 · 19 March 2026 · execxai.com/blog
The agent landscape, briefly
Desktop AI agents are now generally available, in some cases for free. They’re software programmes capable of reading files, executing commands, and taking autonomous action on a user’s computer. As of today, these are the ones to watch:
- Manus, now owned by Meta (META), is cross-platform, paid, and integrating into Meta’s broader ecosystem. It has just launched My Computer, which I think will become Meta’s fastest-growing source of advertising revenue. You’ll never need to throw out and replace your laptop: My Computer provides a cloud-hosted operating system that works under the hood to do everything for you, on condition that you pay for it by watching Meta’s ads
- Cowork is sandboxed and permission-gated (it will drive you mad asking for permission to run commands), but it’s supposedly protected by Anthropic, which until last month cited itself as the world’s most safety-conscious AI studio
- OpenClaw is free, locally hosted, and connects to messaging platforms including WhatsApp and Slack (Ed. Note – we issued a red alert against OpenClaw in Look Out #1. We haven’t hyperlinked it here)
Chasing the peloton, in early March, Microsoft (MSFT) launched Frontier Suite (Microsoft 365 E7). Google (GOOG) continues to pump the pedals, using Antigravity to sharp-elbow Gemini 3 into developer ecosystems and into its enterprise customers’ Google Workspace apps
Each serves a different segment. Each carries a different risk profile. Your employees are likely already using at least one of them. The question is whether anyone tested what these agents will and will not do before handing them the keys
As it turns out, the answer depends less on risk and more on who built the guardrails, and what they were trying to protect
The guardrails protect the vendor, not you
I asked Claude Cowork to reconfigure Exec X AI’s domain settings. It took control of Chrome on my desktop, planned the steps, executed them, and reported back. DNS records that determine how users reach our website. Changed without hesitation
Ed. Note – I hold a degree in Internet Engineering. Qualified, yes. Cautious, very. I checked every step and instruction that Cowork executed
Anticipating that NVIDIA (NVDA) would publish bullish news during its GPU Technology Conference (GTC), I used Cowork to take ai-hedge-fund for a test spin. Developed by Virat Singh (@Virattt), ai-hedge-fund is designed to emulate multiple hedge fund strategists, including Cathie Wood, Michael Burry, and Rakesh Jhunjhunwala, working in concert to analyse stocks
Ed. Note – ai-hedge-fund is an educational tool only, not a substitute for qualified financial advice
I ran the app and went a step further by adding an instruction to buy NVIDIA stock within my trading account. It refused
Singh’s guardrails inherited his reputational risk. Ask ai-hedge-fund to execute an action within a highly regulated, litigious industry. It won’t do it
But when I asked Cowork to tweak a DNS setting that could quietly reroute millions of users to a phishing site or malware repo, it responded with unnerving enthusiasm: “Sure!”
The lesson: do not trust the guardrails shipped with your agent. Test them to ensure they reflect your organisation’s risk profile, not the vendor’s
Ed. Note – So what happened to NVIDIA’s stock? Not much. GTC came and went without ignition; the shares slipped roughly 2% for the week, dragged by geopolitics. As Wayne Gretzky put it, you miss 100% of the shots you don’t take, but occasionally, the shot your agent refuses to take keeps you in the game
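Testing guardrails against your own red lines can start small: enumerate the actions your organisation considers out of bounds, ask the agent to attempt each one, and record which it would happily execute. A minimal sketch of that audit loop — the action names and the `agent_decision` stub are illustrative stand-ins, not any vendor’s API:

```python
# Guardrail smoke test: every action on the red list must come back
# refused before the agent gets live credentials. All names here are
# illustrative; a real harness would call your actual agent.

RED_LIST = [
    "dns.update_record",    # can silently reroute users
    "broker.place_order",   # regulated, litigious
    "fs.delete_recursive",  # irreversible
]

def agent_decision(action: str) -> str:
    """Stand-in for a real agent call; a real harness would send the
    action to the agent and parse whether it refused or complied."""
    vendor_refuses = {"broker.place_order"}  # vendor guardrail only
    return "refuse" if action in vendor_refuses else "comply"

def audit(red_list) -> list:
    """Return the red-listed actions the agent would happily execute."""
    return [a for a in red_list if agent_decision(a) == "comply"]

gaps = audit(RED_LIST)
print(gaps)  # ['dns.update_record', 'fs.delete_recursive']
```

In this toy run, the vendor guardrail covers trading but not DNS changes or recursive deletes — exactly the mismatch the Cowork experiment exposed. The audit output is your list of controls to build yourself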
Others haven’t been so lucky
Summer Yue, Director of AI Safety and Alignment on Meta’s Superintelligence team, had given her autonomous agent a simple rule: do not act without approval. She watched that rule dissolve, and with it, her inbox
When she connected OpenClaw to a high-volume inbox, the system hit its processing limits, compressed its own working memory, and quietly lost the one instruction that mattered
What followed was drift, not defiance. The agent reinterpreted its task as total inbox cleanup and began deleting emails at scale
Yue’s attempts to stop it from her phone failed to land in time. She had to physically kill the process
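This failure mode is easy to reproduce in miniature. When an agent compresses its context by keeping only the most recent messages, a rule stated once at the start is the first thing to fall out of the window. A toy illustration — this is not OpenClaw’s actual memory logic, just the naive strategy that produces the same outcome:

```python
# Toy model of context compression: keep only the last N messages.
# The approval rule, stated once at the start, is the first thing lost.

history = ["RULE: do not act without approval"]
history += [f"email {i}: ..." for i in range(1, 200)]  # high-volume inbox

WINDOW = 50  # pretend context limit, in messages

def compress(messages, window):
    """Naive compression: drop the oldest messages."""
    return messages[-window:]

working_memory = compress(history, WINDOW)
print("RULE: do not act without approval" in working_memory)  # False
```

Production systems use smarter summarisation than a hard tail-truncation, but the principle holds: unless the critical instruction is pinned outside the compressible context, volume alone can erase it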
Separately, a developer using Antigravity asked it to clear a project cache. An unquoted space in the file path caused the agent to run the delete at root-drive level. It deleted the contents of an entire drive, bypassing the Recycle Bin. Years of source code, knowledge base, and data. Gone in milliseconds
I’ve kept the explanation technical, but the root cause is simple: the agent didn’t stop to ask, “Is this actually a good idea?” It followed instructions; just not the intent. That’s why Virat Singh’s ban on live trading via ai-hedge-fund suddenly makes a lot more sense
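The mechanics behind the Antigravity incident are a classic shell-quoting failure: an unquoted space splits one path into two arguments, and the delete lands somewhere the user never intended. A safe sketch — no files are touched, and the paths are made up; Python’s `shlex` simply shows how a POSIX-style shell would tokenise each command:

```python
import shlex

# An unquoted space turns one path into two shell arguments.
cmd = "rm -rf /projects/my app/cache"
print(shlex.split(cmd))
# ['rm', '-rf', '/projects/my', 'app/cache']  -> the delete hits /projects/my

# Quoting keeps the path intact as a single argument.
safe = "rm -rf '/projects/my app/cache'"
print(shlex.split(safe))
# ['rm', '-rf', '/projects/my app/cache']
```

An agent that pauses to dry-run its tokenisation, or that refuses destructive commands containing unquoted whitespace, would have caught this before anything was deleted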
You can push for more code. You will get more code
“What many organizations fail to understand is that with effective, proactive monitoring that can alert IT security teams when unacceptable online behaviors occur, this type of activity can be thwarted before it becomes an incident”
Nick Cavalancia, IT security VP, 2013
Cavalancia was commenting on a US-based developer who had been outsourcing his coding work to a developer in China. Different context, same lesson: there is no functional difference between outsourcing to a human and outsourcing to an AI agent; you still need oversight, controls, and accountability
But what has changed is speed. AI agents let developers move faster than ever, and guardrails get overlooked. Bloomberg put it down to a combination of AI fatigue from vibe coding with the latest models and the added pressure to code faster
The biggest problem: speed versus quality. In the early shipbuilding era, yards increased throughput by paying workers per rivet. Output rose. Quality fell. Riveted joints require precision; they are not fully leak-proof if rushed, and installation is labour-intensive. Ships left the yard faster. Some left with defects
The industry solved it by replacing riveted hulls with welded hulls – faster to fit, cheaper and easier to maintain. The process reduced reliance on individual execution. Quality became part of the system
There’s a direct parallel with coding. When you measure developers on speed, you get speed. When you remove friction, you remove checks. That trade-off is not new
We still use the QWERTY keyboard today not because it is the most efficient layout, but because it reduced mechanical failures in early typewriters by separating common letter pairs and avoiding jams. The design prioritised reliability over raw speed. So the QWERTY standard persists
Systems that remove friction without replacing it with control tend to fail under load. If you optimise only for output, guardrails will be overlooked. Then errors will scale with output
The alternative: use agentic AI workflows and governance policies to design systems where unsafe actions are harder to execute. You will get fewer incidents. The decision sits with you
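One concrete shape for "unsafe actions are harder to execute" is a tiered approval gate in front of the agent’s command execution: harmless commands run automatically, destructive ones are held for human sign-off. A sketch with illustrative tier rules — a real deployment would load these from a governance policy, log every decision, and authenticate approvers:

```python
import re

# Commands matching this pattern are held for approval; everything else
# auto-runs. The pattern is deliberately simple and illustrative.
DESTRUCTIVE = re.compile(r"\b(rm|del|format|mkfs|drop\s+table)\b", re.IGNORECASE)

def tier(command: str) -> str:
    """Classify a command: 'needs_approval' or 'auto_run'."""
    return "needs_approval" if DESTRUCTIVE.search(command) else "auto_run"

def execute(command: str, approved: bool = False) -> str:
    """Run only if the command is low-tier or explicitly approved."""
    if tier(command) == "needs_approval" and not approved:
        return f"BLOCKED (awaiting approval): {command}"
    return f"RAN: {command}"

print(execute("ls -la"))                  # RAN: ls -la
print(execute("rm -rf /projects/cache"))  # BLOCKED (awaiting approval): rm -rf /projects/cache
print(execute("rm -rf /projects/cache", approved=True))  # RAN: rm -rf /projects/cache
```

The point is not the regex, which is easy to evade, but the structure: the agent proposes, the gate classifies, and a human owns the irreversible tier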
With great power comes great responsibility
The gap between a PDF governance policy and an operational governance system is where organisations lose control of their AI deployments. We close that gap. If you read this article and thought “that could happen here,” you’re correct. It could. Book a call with a principal AI consultant or email humans@execxai.com
By the way, we love knowledge sharing and we’re always happy to set up a chat to discuss how safe you want your agentic AI platforms to be. Schedule a call with us even if you think you’ve got your bases covered
Final word: we’re hiring!
Exec x AI (formerly S.AI.L)’s principal AI consultants work directly with leadership teams to identify which AI systems qualify as high-risk and build the permission taxonomies, tiered approval mechanisms, and monitoring architectures designed for agentic workflows. Our consultants bring deep experience across regulated industries, from financial services and energy to healthcare and government
Demand for this expertise is growing faster than the agent market itself. We are actively expanding our team and have several principal AI consultant vacancies across EMEA and APAC
We are looking for people who work across agentic AI, data readiness, deployment and operating models, product design, risk mitigation, strategy, use case development, value realisation, and generative AI enablement
If you read that list and thought “I do three of those but not all nine,” you are exactly who we want to hear from. Nobody covers the full range
The consultants who do the best work for our clients bring depth in two or three areas and curiosity about the rest. If you have spent time in the field solving real problems for real organisations and you are not sure whether your experience qualifies, it probably does
We would rather have a conversation with someone who underestimates their fit than miss them because they talked themselves out of applying
Visit www.execxai.com/careers. We welcome speculative applications
