Oxford-based startup Aligned AI claims to have achieved a significant breakthrough in AI safety. The company's Algorithm for Concept Extrapolation (ACE) allows AI systems to form more sophisticated associations, akin to human concepts, addressing the problem of spurious correlations in current AI systems. By avoiding such misgeneralizations, ACE could improve the reliability of self-driving cars, robots, and other AI-based products that depend on accurate and safe decision-making. The algorithm could also find applications in content moderation and in robotics, where it could help robots transfer knowledge learned in simulators to real-world environments.
To demonstrate ACE's capabilities, Aligned AI tested the algorithm on a video game called CoinRun. The game, similar to Sonic the Hedgehog, challenges AI agents to navigate obstacles and hazards while seeking a gold coin and advancing to the next level. Previous AI software obtained the coin only 59% of the time, slightly better than taking random actions; agents trained with ACE obtained it 72% of the time. ACE works by noticing differences between the training data and new data, formulating two hypotheses about the true objective based on those differences, and then identifying the correct objective through repeated testing. Aligned AI aims to improve ACE further, with the goal of "zero-shot" learning, in which the AI system identifies the correct objective the first time it encounters new data.
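ACE itself is unpublished and patent-pending, so its internals are not public. Purely as an illustration of the hypothesis-disambiguation idea described above, the toy sketch below (all names and data are hypothetical) shows how two candidate objectives can be indistinguishable on training data where a spurious correlation holds, and how new data that breaks the correlation singles out the true objective:

```python
# Hypothetical toy sketch -- NOT Aligned AI's actual algorithm.
# In CoinRun's original training levels, the coin always sat at the end of
# the level, so "collect the coin" and "reach the level's end" were
# perfectly correlated and equally consistent with the reward signal.

training_episodes = [
    {"coin_pos": 10, "level_end": 10, "agent_final_pos": 10, "reward": 1},
    {"coin_pos": 12, "level_end": 12, "agent_final_pos": 12, "reward": 1},
]

# Two candidate objectives, both consistent with all training data:
hypotheses = {
    "reach_coin": lambda ep: ep["agent_final_pos"] == ep["coin_pos"],
    "reach_end":  lambda ep: ep["agent_final_pos"] == ep["level_end"],
}

# New data breaks the spurious correlation: the coin is placed mid-level.
test_episodes = [
    {"coin_pos": 5, "level_end": 10, "agent_final_pos": 5, "reward": 1},
    {"coin_pos": 7, "level_end": 12, "agent_final_pos": 7, "reward": 1},
]

def surviving_hypotheses(episodes, hypotheses):
    """Keep only the objectives that correctly predict reward on every episode."""
    return [
        name for name, predicts_reward in hypotheses.items()
        if all(predicts_reward(ep) == bool(ep["reward"]) for ep in episodes)
    ]

print(surviving_hypotheses(training_episodes, hypotheses))
# -> ['reach_coin', 'reach_end']  (both fit the training data)
print(surviving_hypotheses(test_episodes, hypotheses))
# -> ['reach_coin']  (only the true objective survives the new data)
```

The key point the sketch captures is that no amount of training data alone can separate the two hypotheses; only data that varies the spuriously correlated feature can.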
Aligned AI is currently seeking funding and has a patent pending for ACE. The algorithm not only enhances AI safety but also offers interpretability, allowing developers to understand the software's objectives. In the future, combining ACE with a language model could let the algorithm express those objectives in natural language, opening up new possibilities for AI systems.