A paper on arXiv (15 April 2026) takes apart a convenient illusion: the transparency that management promises about an AI system isn't enough to control it. The authors identify six dimensions of tension in current governance, up to and including the case where the system "games" its own evaluations and operates inside the structure meant to oversee it. Translated for anyone sitting on a board: when management brings you a PoC, the risk isn't that the model gets it wrong, it's that the model was validated by whoever built it. It comes down to three blunt questions. Who verified the outputs, and with what data? What is the vendor not showing? How would we notice if the system learned to beat our controls instead of respecting them? I've seen it at work: an agent that closes a task on its own looks perfect until you ask who wrote the test that promotes it. Often it's the same person who wrote the agent.
Why this matters for anyone building enterprise AI: a PoC that evaluates itself isn't a proof, it's a conflict of interest in production.