Many AI projects look strong in internal demonstrations. The model performs well. User feedback is positive. The business case is clear. Then the project reaches the information security committee, and stops moving.
This is not unusual. The gap between an AI system that works and one that clears enterprise security review is larger than most teams plan for, and it tends to surface at the same point in the project lifecycle.
Three concerns typical of where these reviews stall:
The regulatory landscape surrounding enterprise AI has evolved significantly.
In Japan, the AI Promotion Act (AI推進法) has been fully effective since September 2025, establishing a framework that emphasizes voluntary cooperation and government-led coordination rather than direct penalties. Alongside it, the AI Guidelines for Business (updated to v1.2 in March 2026) continue to provide practical guidance for AI developers, providers, and users.
In the EU, the AI Act takes a different approach, introducing legally binding obligations for certain categories of AI systems. Rules for high-risk systems apply from August 2026, with potential extraterritorial implications for organizations providing AI systems or AI-enabled services into the EU market.
Security review is no longer only an internal hurdle. It is increasingly shaped by external expectations as well, and several of the patterns discussed below may have regulatory implications in addition to operational and security risks.
This article walks through five patterns that commonly cause enterprise AI projects to fail security review, and the habits that separate the projects which pass.
The difficulty of moving AI from proof-of-concept to production is usually framed in technical terms. Accuracy drops on real data. Latency rises. Costs scale unpredictably. These problems are real, but they are not what kills most projects.
What kills most projects is the second gate. A PoC asks one question: does this work? A security review asks a different one: is this safe to release into production? Most teams build to clear the first question and assume the second will follow. It does not follow.

In many projects, prompts live inside source files, configuration blobs, or copy-pasted between engineers’ notes. Reviewers ask a simple question. Which prompt produced the output that led to this decision, on which date, by whom? If the answer requires reconstructing git history and chat logs, the project has already failed the review.
Versioning prompts is not glamorous engineering. It is what makes the difference between a system reviewers can sign off on and one they cannot.
RAG systems often get built as if access control is a downstream concern. A vector database ingests documents from across the enterprise, and the application queries it freely. The assumption is that filtering happens at the chat interface.
Reviewers correctly identify this as a privilege escalation surface. The model can surface information the requesting user has no clearance to see, with no trace, because the access decision was made at a layer that does not know about user identity.
When an AI-assisted workflow contributes to a customer-facing decision such as pricing, approval, or recommendation, auditors expect a reconstructable trail. The input, the retrieved context, the model version, the output, the human review step. Most production AI systems log some of these. Few log all of them. Almost none log them in a form auditors can read without engineering help.
Prompt injection is often discussed in academic terms, as if it were a future concern documented in research papers. In production, it is a present concern. Any system that ingests untrusted text such as emails, documents, or web content, and then passes that text to an LLM with privileges, has an active attack surface. Reviewers know this. Treating injection mitigations as optional is one of the fastest ways to fail review.
A system that performs well at launch will not necessarily perform well in three months. The model provider may release an update. The retrieval corpus drifts as documents are added and removed. Without continuous monitoring of output quality, retrieval relevance, and downstream business metrics, teams cannot answer the reviewer’s last question. How would you know if this system started to fail?
These patterns are not just operational failures. Each one now maps onto named obligations under the regulations in force. The next section walks through that mapping.
The EU AI Act, applicable from August 2026 for high-risk systems, names concrete technical obligations in its articles on requirements for high-risk AI. Japan’s AI Guidelines for Business v1.2 frame obligations differently, separated across three roles: AI Developer, AI Provider, and AI User. A team building or deploying AI may fall into more than one of these roles, and the failure patterns above translate into named duties under both frameworks.
| Failure pattern | EU AI Act (high-risk systems) | AI Guidelines for Business v1.2 |
|---|---|---|
| Unversioned prompts | Art. 11 (Technical documentation); Art. 12 (Record-keeping) | Transparency obligations across Developer and Provider roles |
| Missing audit trails | Art. 12 (Record-keeping); Art. 26 (Deployer log retention, minimum six months) | Accountability obligations across all three roles |
| Prompt injection treated as research | Art. 15 (Accuracy, robustness, cybersecurity) | Technical robustness obligations on Provider role |
| Drift without monitoring | Art. 72 (Post-market monitoring) | Ongoing review obligations on Provider and User roles |
The mapping is not exhaustive. Several patterns cut across multiple articles or roles. The point is that each of these five failures, until recently treated as engineering oversights or matters for internal review, now has an external counterpart with audit and enforcement attached.
The projects that clear security review tend to share a small number of habits.
These habits do not slow projects down. They are, in practice, what lets projects move at all.
This is the first article in a series on building AI systems ready for enterprise deployment. Future articles will cover each of the five patterns in more depth, beginning with the retrieval and access control layer.