The rapid expansion of artificial intelligence (AI) into all areas of life, from medicine to religion, is raising more questions about its underlying principles. Even AI experts acknowledge that the internal processes of these “black boxes” remain largely unclear, despite their use in critically important domains. To address this, scientists are developing new methods of studying AI that are inspired by biology. One approach, known as “mechanistic interpretability,” makes it possible to track the processes occurring inside AI models as they carry out a task. Researchers at Anthropic have created tools that visualize neural network activity, much as magnetic resonance imaging (MRI) is used to study brain function.
Another approach, analogous to the creation of organoids in biology (miniature versions of organs grown under laboratory conditions), involves building specialized neural networks such as sparse autoencoders. The internal structure of these networks is easier to understand and analyze than that of typical large language models (LLMs).
Yet another method is “chain-of-thought monitoring,” in which AI models explain the logic underlying their actions. This helps identify discrepancies between a model's behavior and its intended goals. Bowen Baker, a research scientist at OpenAI, noted that this method has been quite successful in detecting “undesirable” model actions.
Scientists worry that future AI models may become so complex, especially if they are developed by AI systems themselves, that understanding how they operate will become virtually impossible. Even now, despite existing tools and methods, models are showing unexpected behavior patterns that do not align with human conceptions of truth and safety. Numerous reports of people harming themselves after following AI advice illustrate the problem, and the limited understanding of how these systems work makes it all the more concerning.