A Brilliant Brain Without Eyes Is Still Blind: Why AI Security Needs a Better Data Layer


Last week, as we all left RSA, the Mythos leak dominated every conversation. Is it smarter? Is it a leap? Will it change everything? Maybe. But that’s not the question security teams should be asking right now.

The question is: smarter on what data?

Why AI gets security investigations wrong

There’s an analogy that gets used a lot in AI security pitches. The model is a brilliant detective. Give it a case, it solves the case. The problem is, nobody follows the analogy through.

A brilliant detective locked in a room with one alert, three hours of telemetry, and no case history doesn’t solve anything. They generate a theory. Real investigations get solved because someone pulled together identity logs, endpoint data, cloud activity, SaaS behavior, threat intel, six months of prior context, and stitched it into something you can reason across. The detective’s intelligence matters. But the evidence is what closes the case.

No model fixes missing data. Not Mythos. Not whatever comes next.

MCP in cybersecurity: the right idea, badly implemented

Model Context Protocol makes sense architecturally. You want models to have access to tools, queries, real data, not just a static knowledge base. The problem is almost every implementation ends up the same: a wrapper around product APIs.

So the model can query your SIEM, pull an alert, enrich an IP. Fine. But can it follow an attacker who moved laterally across four systems over eight months? Can it connect a low-priority alert from yesterday to a pattern that started last spring? Almost never. Not because the model can’t reason across that, but because the data isn’t there to reason across.

MCP doesn’t solve the AI security data layer problem. It just makes the gap more visible.

What’s actually broken

When AI-assisted investigations go wrong, it usually comes down to three things.

Fragmented access. Most environments are a patchwork: a SIEM here, a data lake there, a dozen point platforms each exposing a narrow slice of what they know. Fine for operational workflows, not for investigation, where the question spans multiple systems and months of history simultaneously.

No durable history. Attackers are patient. A lot of serious compromises look like nothing for months. But APIs weren’t built to be systems of record. Fields change, retention is inconsistent, access breaks. The attacker had time on their side. The infrastructure didn’t preserve it.

Schema chaos. src_ip. sourceAddress. client_ip. Every vendor has their own dialect. Without security telemetry normalization, you get joins that are structurally wrong, timelines that are slightly off, correlations that fall apart under scrutiny.
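To make the problem concrete, here is a minimal sketch of what telemetry normalization does: map every vendor dialect onto one canonical field name before any cross-source join is attempted. The alias table and event shapes below are illustrative, not any specific vendor's schema.

```python
# Map known vendor field aliases onto a canonical schema.
# This alias list is a hypothetical example, not exhaustive.
FIELD_ALIASES = {
    "src_ip": "source_ip",
    "sourceAddress": "source_ip",
    "client_ip": "source_ip",
    "dst_ip": "destination_ip",
    "destinationAddress": "destination_ip",
}

def normalize_event(event: dict) -> dict:
    """Rename recognized vendor fields to canonical names; pass others through."""
    return {FIELD_ALIASES.get(key, key): value for key, value in event.items()}

# Two events describing the same host, in two vendor dialects.
siem_event = {"src_ip": "10.0.0.4", "severity": "low"}
edr_event = {"sourceAddress": "10.0.0.4", "process": "rundll32.exe"}

# After normalization, both events share the same join key.
assert normalize_event(siem_event)["source_ip"] == normalize_event(edr_event)["source_ip"]
```

Without that canonical mapping, a join on `src_ip` simply never sees the event that arrived as `sourceAddress`, and the correlation silently drops evidence.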

Three problems. Same root cause: no coherent, consistent access to all the data.

[Image: Cognitive data investigation protocol: unified analytics versus schema incompatibility and missing-data dead ends.]

What a real AI security data layer requires

For a model to work as an investigator and not just a summarizer, the data layer needs three things.

Federated security analytics that treats your entire environment as one addressable dataset, regardless of whether data lives in a SIEM, a data lake, or a point platform. Durable history where “what happened six months before this alert” is as answerable as “what’s happening right now.” And consistent normalization so the model reasons across clean, unified data instead of navigating vendor dialects in real time.
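The three requirements compose into a single pattern: fan the same question out to every backend, normalize the answers, and merge them into one timeline that reaches as far back as the data does. The sketch below is a toy illustration of that shape under stated assumptions; the backend functions, field names, and events are invented stand-ins, not a real product API.

```python
from datetime import datetime

# Hypothetical backends: in reality each would push a query down to a
# SIEM, data lake, or point platform and return normalized events.
def query_siem(entity: str) -> list[dict]:
    return [{"ts": datetime(2025, 1, 10), "source": "siem",
             "event": "alert", "entity": entity}]

def query_lake(entity: str) -> list[dict]:
    return [{"ts": datetime(2024, 8, 2), "source": "lake",
             "event": "anomalous_login", "entity": entity}]

def federated_timeline(entity: str, since: datetime) -> list[dict]:
    """Treat all backends as one addressable dataset, merged by time."""
    events = []
    for backend in (query_siem, query_lake):
        events.extend(backend(entity))
    # Durable history: the eight-month-old event survives only if the
    # backend retained it and the merge window reaches back that far.
    return sorted((e for e in events if e["ts"] >= since),
                  key=lambda e: e["ts"])

timeline = federated_timeline("user:jdoe", since=datetime(2024, 6, 1))
```

The point of the sketch is the merge, not the backends: if retention in any one system is shorter than the attacker's patience, or a field name fails to normalize, the timeline is structurally incomplete and the model reasons over a gap it cannot see.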

That’s not a nice-to-have. That’s the minimum for investigation-grade AI in the SOC.

When it goes wrong

AI systems produce confident, well-structured, wrong narratives in real incidents because a field name was inconsistent, a timestamp was off, or the relevant history sat outside the retention window. These failures aren’t dramatic. They’re subtle. The timeline looks right. The narrative makes sense. And then an analyst digs in and has to undo everything the model said, which takes longer than if they’d just done the investigation themselves.

The real cost isn’t the wrong answer. It’s the erosion of trust that makes the tool unusable.

The test that matters

Vendors will keep benchmarking models. For CISOs, the test is simpler: can the model take a plain-English question, generate validated queries across all your data sources including historical archives, and return results that join into a coherent timeline?

If it can’t, you have a better summarizer. That’s useful. It’s not an investigator.

Build the data layer or stay blind

Mythos might be the most capable model the security industry has seen. But intelligence without visibility is just faster speculation.

The vendors worth your time aren’t the ones with the best model. They’re the ones who’ve solved what goes underneath it: federated access, durable memory, a shared schema.

Build the data layer and the model becomes an investigator. Skip it and you’ve got a very articulate guesser.

Want to see what complete visibility actually looks like? Explore Vega or request a demo.

Yonni Shelmerdine
Chief Product Officer