The rapid expansion of artificial intelligence (AI) into all areas of life, from medicine to religion, is raising more questions about how these systems actually work. Even AI experts acknowledge that the internal processes of these “black boxes” remain largely unclear, despite their use in critically important domains. In response, scientists are developing new methods of studying AI that are inspired by biology. One approach, known as “mechanistic interpretability,” makes it possible to trace the processes occurring inside AI models as they carry out a task. Developers at Anthropic have created tools that visualize neural network activity, much as magnetic resonance imaging (MRI) is used to study brain function.
Another approach resembles the creation of organoids in biology (miniature versions of organs grown under laboratory conditions): researchers build specialized neural networks, such as sparse autoencoders, whose internal structure is simpler to understand and analyze than that of typical large language models (LLMs).
Yet another method is “chain-of-thought monitoring,” in which AI models explain the logic underlying their actions. This helps identify discrepancies between a model's behavior and its intended goals. Bowen Baker, a research scientist at OpenAI, noted that this method has been quite successful in detecting “undesirable” model actions.
Scientists worry that future AI models may become so complex, especially if they are developed by AI systems themselves, that understanding how they operate will become virtually impossible. Already, despite existing tools and methods, unexpected behavior patterns are appearing that do not align with human conceptions of truth and safety. Numerous reports describe people harming themselves after following AI advice. These cases are all the more concerning because the working principles of these systems are still so poorly understood.