On May 6th, Andrew Ilyas and colleagues published a paper outlining two sets of experiments.
Firstly, they showed that models trained on adversarial examples can transfer to real data,
and secondly that models trained on a dataset derived from the representations of robust neural networks
seem to inherit non-trivial robustness.
They proposed an intriguing interpretation of their results:
adversarial examples are due to "non-robust features" which are highly predictive but imperceptible to
humans.
The paper was received with intense interest and discussion
on social media, mailing lists, and reading groups around the world.
How should we interpret these experiments?
Would they replicate?
Adversarial example research is particularly vulnerable to a certain kind of non-replication among
disciplines of machine learning,
because it requires researchers to play both attack and defense.
It's easy for even very rigorous researchers to accidentally use a weak attack.
However, as we'll see, Ilyas et al.'s results have held up to initial scrutiny.
And if non-robust features exist… what are they?
To explore these questions, Distill decided to run an experimental "discussion article."
Running a discussion article is something Distill has wanted to try for several years.
It was originally suggested to us by Ferenc Huszár, who writes many wonderful discussions of papers on his blog.
Why not just have everyone write their own blog posts, like Ferenc?
Distill hopes that providing a more organized forum for many people to participate
can give more researchers license to invest energy in discussing others' work,
and ensure there's an opportunity for all parties to comment and respond before the final version is
published.
We invited a number of researchers
to write comments on the paper and organized discussion and responses from the original authors.
The machine learning community sometimes worries
that peer review isn't thorough enough.
In contrast to this, we were struck by how deeply respondents engaged.
Some respondents literally invested weeks in replicating results, running new experiments, and thinking
deeply about the original paper.
We also saw respondents update their views on non-robust features as they ran experiments, sometimes going back
and forth!
The original authors engaged just as deeply, discussing their results, clarifying misunderstandings, and
even running new experiments in response to comments.
We think this deep engagement and discussion is really exciting, and hope to experiment with more such
discussion articles in the future.
Discussion Themes
Clarifications:
Discussion between the respondents and original authors was able
to surface several misunderstandings and opportunities to sharpen claims.
The original authors summarize these in their rebuttal.
Successful Replication:
Respondents successfully reproduced many of the experiments in Ilyas et al.
This was significantly aided by the original authors' release of code, models, and datasets.
Gabriel Goh and Preetum Nakkiran both independently reimplemented and replicated
the non-robust dataset experiments.
Preetum reproduced the non-robust dataset experiment as described in the
paper, for both L2 and L-infinity attacks.
Gabriel reproduced both the D_rand and D_det variants of the experiment.
Preetum also replicated part of the robust dataset experiment by
training models on the provided robust dataset and finding that they appeared non-trivially robust.
It seems epistemically notable that both Preetum and Gabriel were initially skeptical.
Preetum emphasizes that he found the phenomenon easy to reproduce, and that it was robust to many of the variants
and hyperparameters he tried.
Exploring the Boundaries of Non-Robust Transfer:
Three of the comments focused on variants of the "non-robust dataset" experiment,
in which training on adversarial examples transfers to real data.
When, how, and why does this happen?
Gabriel Goh explores an alternative mechanism for the results,
Preetum Nakkiran shows a specific construction where it doesn't happen,
and Eric Wallace shows that transfer can happen for other kinds of incorrectly labeled data.
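As a caricature of how training on mislabeled adversarial examples can transfer to real data, here is a minimal linear sketch. This is an invented toy, not the paper's actual CIFAR pipeline: a mean-difference "classifier" stands in for a trained network, and all dimensions, sample sizes, and parameter values (`d`, `n`, `mu`, `sigma`, `eps`) are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup (all values invented): one predictive direction u in R^d,
# class means at +/- mu along u, isotropic Gaussian noise of scale sigma.
d, n, mu, sigma, eps = 10, 5000, 1.0, 1.0, 2.0
u = rng.normal(size=d)
u /= np.linalg.norm(u)

def sample(n):
    y = rng.choice([-1, 1], size=n)
    x = y[:, None] * mu * u + sigma * rng.normal(size=(n, d))
    return x, y

def fit(x, y):
    # Mean-difference "classifier" standing in for a trained network:
    # w points from the class -1 mean toward the class +1 mean.
    w = x[y == 1].mean(axis=0) - x[y == -1].mean(axis=0)
    return w / np.linalg.norm(w)

x_train, y_train = sample(n)
x_test, y_test = sample(n)

w_std = fit(x_train, y_train)  # "standard" model used to craft perturbations

# Non-robust-dataset construction: nudge each point toward the opposite
# class under w_std, then relabel it with that (incorrect) target class.
t = -y_train
x_adv = x_train + eps * t[:, None] * w_std

w_new = fit(x_adv, t)  # trained only on perturbed, mislabeled data

# ...yet it classifies clean, correctly labeled test data well above chance.
acc = float(np.mean(np.sign(x_test @ w_new) == y_test))
print(f"clean test accuracy: {acc:.2f}")
```

The perturbation injects a correlation with the target label that outweighs the original class signal whenever `eps` exceeds the class separation, which is the sense in which a model trained purely on mislabeled perturbed data can still generalize to clean data.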
Properties of Robust and Non-Robust Features:
The other three comments focused on the properties of robust and non-robust models.
Gabriel Goh explores what non-robust features might look like in the case of linear models,
while Dan Hendrycks and Justin Gilmer discuss how the results relate to the broader problem of robustness to
distribution shift,
and Reiichiro Nakano explores the qualitative differences of robust models in the context of style transfer.
Comments
Distill collected six comments on the original paper.
They are presented in alphabetical order by the author's last name,
with brief summaries of each comment and the corresponding response from the original authors.
Adversarial Example Researchers Need to Expand What is Meant by
“Robustness”
Justin and Dan discuss "non-robust features" as a special case
of models being non-robust because they latch on to superficial correlations,
a view often found in the distributional robustness literature.
For example, they discuss recent analysis of how neural networks behave in the frequency domain.
They emphasize that we should think about a broader notion of robustness.
Read Full Article
Comment from original authors:
The demonstration of models that learn from only high-frequency components of the data is
an interesting finding that provides us with another way our models can learn from data that
appears "meaningless" to humans.
The authors fully agree that studying a wider notion of robustness will become increasingly
important in ML, and will help us get a better grasp of the features we actually want our models
to rely on.
Robust Feature Leakage
Gabriel explores an alternative mechanism that could contribute to the non-robust transfer
results.
He establishes a lower bound showing that this mechanism does contribute slightly to one version of the
experiment,
but finds no evidence of it affecting the other.
Read Full Article
Comment from original authors:
This is a nice in-depth investigation that highlights (and neatly visualizes) one of the
motivations for designing the dataset.
Two Examples of Useful, Non-Robust Features
Gabriel explores what non-robust useful features might look like in the linear case.
He offers two constructions:
"contaminated" features that are only non-robust due to a non-useful feature being mixed in,
and "ensembles" that could be candidates for truly useful non-robust features.
Read Full Article
Comment from original authors:
These experiments with linear models are a great first step towards visualizing non-robust
features for real datasets (and thus a neat corroboration of their existence).
Moreover, the theoretical construction of "contaminated" non-robust features opens an
interesting direction for developing a more fine-grained definition of features.
Adversarially Robust Neural Style Transfer
Reiichiro shows that adversarial robustness makes neural style transfer
work by default on a non-VGG architecture.
He finds that matching robust features makes style transfer's outputs look perceptually better
to humans.
Read Full Article
Comment from original authors:
Very interesting results that highlight the potential role of non-robust features and the
utility of robust models for downstream tasks. We're excited to see what kind of impact robustly
trained models can have in neural network art!
Inspired by these findings, we also take a deeper dive into (non-robust) VGG, and find some
interesting links between robustness and style transfer.
Adversarial Examples are Just Bugs, Too
Preetum constructs a family of adversarial examples with no transfer to real data,
suggesting that some adversarial examples are "bugs" in the original paper's framing.
Preetum also demonstrates that adversarial examples can arise even when the underlying distribution
has no "non-robust features".
Read Full Article
Comment from original authors:
A fine-grained look at adversarial examples that neatly complements our thesis (i.e. that non-robust
features exist and adversarial examples arise from them; see Takeaway #1) while providing an
example of adversarial examples that arise from "bugs".
The fact that these constructed "bugs"-based adversarial examples don't transfer constitutes
further evidence for the link between transferability and (non-robust) features.
Learning from Incorrectly Labeled Data
Eric shows that training on a model's errors on its own training set,
or on its predictions for examples from an unrelated dataset,
can both transfer to the real test set.
These experiments are analogous to the original paper's non-robust transfer results: all three are examples of a kind of "learning from incorrectly labeled data."
Read Full Article
Comment from original authors:
These experiments are a creative demonstration of the fact that the underlying phenomenon of
learning features from "human-meaningless" data can actually arise in a broad range of
settings.
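The "training on a model's errors" idea above can likewise be caricatured with a linear toy. This is an invented sketch under assumed parameters, not Eric's actual experiments: every label the second model sees disagrees with the truth, yet because the mislabeled points still encode the first model's decision direction, the second model generalizes to clean data.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy setup (all values invented): class means at +/- mu along direction u.
d, n, mu, sigma = 10, 20000, 1.0, 1.0
u = rng.normal(size=d)
u /= np.linalg.norm(u)

def sample(n):
    y = rng.choice([-1, 1], size=n)
    x = y[:, None] * mu * u + sigma * rng.normal(size=(n, d))
    return x, y

def fit(x, y):
    # Mean-difference "classifier" standing in for a trained model.
    w = x[y == 1].mean(axis=0) - x[y == -1].mean(axis=0)
    return w / np.linalg.norm(w)

x_train, y_train = sample(n)
x_test, y_test = sample(n)

w_a = fit(x_train, y_train)      # model A, trained normally
pred = np.sign(x_train @ w_a)
errors = pred != y_train         # A's training mistakes (roughly 16% here)

# Model B only ever sees A's mistakes, labeled with A's wrong predictions,
# so every training label B receives is incorrect.
w_b = fit(x_train[errors], pred[errors])

# B still recovers A's decision direction and generalizes to clean data.
acc = float(np.mean(np.sign(x_test @ w_b) == y_test))
print(f"clean test accuracy: {acc:.2f}")
```

The misclassified points are exactly those lying on the wrong side of A's boundary, so grouping them by A's predictions still separates them along A's weight direction; B inherits that direction despite never seeing a correct label.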
Original Author Discussion and Responses
Discussion and Author Responses
The original authors describe their takeaways and some clarifications that resulted from the
conversation.
This article also contains their responses to each comment.
Read Full Article
Citation Information
If you wish to cite this discussion as a whole, citation information can be found below.
The author order is all participants in the conversation, listed alphabetically.
You can also cite individual comments or the author responses using the citation information provided at the
bottom of the corresponding article.
Editorial Note
This discussion article is an experiment organized by Chris Olah and Ludwig Schubert.
Chris Olah facilitated and edited the comments and the discussion process. Ludwig Schubert
assisted by assembling the responses into their current presentation.
We're extremely grateful for the time
and effort that both the authors of the responses and the authors of the original paper put into this
process, and for the patience they had with the editorial team as we experimented with this format.
Respondents were selected in two ways.
Some respondents came to our attention because they were actively working on better understanding the Ilyas
et al. results.
Other respondents were subject-matter experts we reached out to.
Distill is also grateful to Ferenc Huszár for encouraging us to
explore this style of article.
References
- Adversarial examples are not bugs, they are features [PDF]
Ilyas, A., Santurkar, S., Tsipras, D., Engstrom, L., Tran, B. and Madry, A., 2019. arXiv preprint arXiv:1905.02175.
Updates and Corrections
If you see mistakes or want to suggest changes, please create an issue on GitHub.
Reuse
Diagrams and text are licensed under Creative Commons Attribution CC-BY 4.0 with the source available on GitHub, unless noted otherwise. The figures that have been reused from other sources don't fall under this license and can be recognized by a note in their caption: "Figure from …".
Citation
For attribution in academic contexts, please cite this work as
Engstrom, et al., "A Discussion of 'Adversarial Examples Are Not Bugs, They Are Features'", Distill, 2019.
BibTeX citation
@article{engstrom2019a, author = {Engstrom, Logan and Gilmer, Justin and Goh, Gabriel and Hendrycks, Dan and Ilyas, Andrew and Madry, Aleksander and Nakano, Reiichiro and Nakkiran, Preetum and Santurkar, Shibani and Tran, Brandon and Tsipras, Dimitris and Wallace, Eric}, title = {A Discussion of 'Adversarial Examples Are Not Bugs, They Are Features'}, journal = {Distill}, year = {2019}, note = {https://distill.pub/2019/advex-bugs-discussion}, doi = {10.23915/distill.00019} }