Explainable AI for Adversarially Trained Deep-Learning Models

Deep learning models have demonstrated remarkable performance in various applications, particularly in computer vision tasks.

However, these models are vulnerable to adversarial attacks, which can significantly degrade their performance. To address this, adversarial training incorporates adversarially perturbed examples into the training process, as sketched below.
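As a rough illustration of what such training looks like, here is a minimal FGSM-style adversarial-training step in PyTorch. The names (`model`, `criterion`, `optimizer`, `epsilon`) and the 50/50 clean/adversarial loss mix are illustrative assumptions, not details taken from this project.

```python
import torch

def adversarial_training_step(model, criterion, optimizer, x, y, epsilon=8 / 255):
    """One training step on a mix of clean and adversarially perturbed inputs."""
    # 1. Craft an adversarial example by perturbing x along the loss gradient sign (FGSM).
    x_adv = x.clone().detach().requires_grad_(True)
    criterion(model(x_adv), y).backward()
    x_adv = (x_adv + epsilon * x_adv.grad.sign()).clamp(0, 1).detach()

    # 2. Update the model on both the clean and the adversarial batch.
    optimizer.zero_grad()
    loss = 0.5 * criterion(model(x), y) + 0.5 * criterion(model(x_adv), y)
    loss.backward()
    optimizer.step()
    return loss.item()
```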

Unfortunately, doing so often results in reduced model interpretability.

This project proposes the development of a visualization system that enables deep learning models to resist adversarial attacks while remaining interpretable through concept-based explanations.
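To make the concept-based angle concrete, the sketch below shows one common way to build such explanations: a TCAV-style concept activation vector fit on intermediate activations of the (possibly adversarially trained) model. The function names, the use of scikit-learn's logistic regression as the linear probe, and the inputs `concept_acts`, `random_acts`, and `class_logit_grads` are assumptions for illustration, not the project's actual pipeline.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def concept_activation_vector(concept_acts, random_acts):
    """Fit a linear probe separating concept activations from random ones;
    its (normalized) weight vector is the concept direction in activation space."""
    X = np.concatenate([concept_acts, random_acts])
    y = np.concatenate([np.ones(len(concept_acts)), np.zeros(len(random_acts))])
    probe = LogisticRegression(max_iter=1000).fit(X, y)
    cav = probe.coef_[0]
    return cav / np.linalg.norm(cav)

def tcav_score(class_logit_grads, cav):
    """Fraction of inputs whose class logit increases along the concept direction,
    a rough measure of how much the concept influences the prediction."""
    return float(np.mean(class_logit_grads @ cav > 0))
```

Comparing such concept scores before and after adversarial training is one way a visualization system could surface where interpretability is lost or preserved.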

Aesha Shah