Info
This work is part of my long-term collaboration with Angus Galloway from the Vector Institute and the University of Guelph. We were interested in understanding adversarial attacks in ML through the lens of information theory. We presented this work at the NIPS 2018 Workshop on Security in Machine Learning in Montreal – the poster is below and the paper is here.
Abstract
We analyze the adversarial examples problem in terms of a model’s fault tolerance with respect to its input. Whereas previous work focuses on arbitrarily strict threat models, e.g., ε-perturbations, we consider arbitrary valid inputs and propose an information-based characteristic for evaluating tolerance to diverse input faults.