We’re surrounded by Artificial Intelligence. Can we trust it? Do we know what to do with it? Will it help us make better decisions? The Trustworthy AI Lab is exploring the boundaries of AI in society and business. | Digital Data Design Institute at Harvard
The increasing complexity of AI systems has made understanding their behavior critical. Numerous interpretability methods have been developed to attribute model behavior to three key aspects: input features, training data, and internal model components; these lines of work emerged from explainable AI, data-centric AI, and mechanistic interpretability, respectively. However, these attribution methods have largely been studied and applied independently of one another, resulting in a fragmented landscape of methods and terminology... | arXiv.org
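To make the first of these three attribution targets concrete, here is a minimal sketch of input-feature attribution via occlusion (leave-one-feature-out). The linear `model` function is a hypothetical stand-in introduced only for illustration; real feature-attribution methods target arbitrary black-box models, and this is not the specific method of any paper cited above.

```python
def model(x):
    """Hypothetical scoring function: a fixed linear model (illustration only)."""
    weights = [0.5, -1.0, 2.0]
    return sum(w * xi for w, xi in zip(weights, x))

def occlusion_attribution(x, baseline=0.0):
    """Attribute the model's output to each input feature by replacing
    that feature with a baseline value and measuring the score change."""
    base_score = model(x)
    return [base_score - model(x[:i] + [baseline] + x[i + 1:])
            for i in range(len(x))]

x = [1.0, 2.0, 3.0]
print(occlusion_attribution(x))  # → [0.5, -2.0, 6.0]
```

For this linear model, each feature's occlusion score recovers exactly its weight times its value; for nonlinear models the scores instead reflect local interactions, which is why many competing attribution methods exist.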