Abstract

We introduce the structured projection of intermediate gradients optimization technique (SPIGOT), a new method for backpropagating through neural networks that include hard-decision structured predictions (e.g., parsing) in intermediate layers. SPIGOT requires no marginal inference, unlike structured attention networks (Kim et al., 2017) and some reinforcement learning-inspired solutions (Yogatama et al., 2017). Like so-called straight-through estimators (Hinton, 2012), SPIGOT defines gradient-like quantities associated with intermediate nondifferentiable operations, allowing backpropagation before and after them; SPIGOT's proxy aims to ensure that, after a parameter update, the intermediate structure will remain well-formed. We experiment on two structured NLP pipelines: syntactic-then-semantic dependency parsing, and semantic parsing followed by sentiment classification. We show that training with SPIGOT leads to a larger improvement on the downstream task than a modularly trained pipeline, the straight-through estimator, and structured attention, reaching a new state of the art on semantic dependency parsing.
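The core idea in the abstract, a straight-through-style gradient proxy that projects back onto the feasible set, can be sketched numerically. The sketch below uses Euclidean projection onto the probability simplex as a simple stand-in for the structured polytope used in the paper; the function names and the step size `eta` are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def project_simplex(v):
    """Euclidean projection of v onto the probability simplex,
    via the standard sort-and-threshold algorithm."""
    u = np.sort(v)[::-1]
    css = np.cumsum(u)
    rho = np.nonzero(u * np.arange(1, len(v) + 1) > (css - 1.0))[0][-1]
    theta = (css[rho] - 1.0) / (rho + 1.0)
    return np.maximum(v - theta, 0.0)

def spigot_backward(scores, grad_downstream, eta=1.0):
    """Gradient proxy in the spirit of SPIGOT: the forward pass makes a
    hard (argmax) decision; the backward pass takes a step against the
    downstream gradient, projects back onto the feasible set so the
    target stays well-formed, and returns the difference as a surrogate
    gradient with respect to the scores."""
    p = np.zeros_like(scores)
    p[np.argmax(scores)] = 1.0                      # hard one-hot decision
    target = project_simplex(p - eta * grad_downstream)
    return p - target                               # surrogate gradient
```

With a zero downstream gradient the projection returns the decision itself, so the surrogate gradient is zero; a nonzero downstream gradient pulls the target toward a different (still feasible) point, and the surrogate gradient pushes the scores in that direction. SPIGOT itself works over structured polytopes (e.g., the set of valid parse trees) rather than the simplex used here.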

Comment: ACL 2018


Original document

The different versions of the original document can be found at:

- https://arxiv.org/abs/1805.04658
- https://www.aclweb.org/anthology/P18-1173
- https://www.aclweb.org/anthology/P18-1173.pdf
- https://arxiv.org/pdf/1805.04658.pdf
- https://aclanthology.info/papers/P18-1173/p18-1173
- https://academic.microsoft.com/#/detail/2964263959 (under the CC-BY license)

Document information

Published on 01/01/2018

Volume 2018, 2018
DOI: 10.18653/v1/p18-1173
Licence: Other
