The Interplay Between Loss Functions and Structural Constraints in Dependency Parsing

Authors

  • Robin Kurtz Department of Computer and Information Science, Linköpng University, Sweden
  • Marco Kuhlmann Department of Computer and Information Science, Linköpng University, Sweden

DOI:

https://doi.org/10.3384/nejlt.2000-1533.19643

Abstract

Dependency parsing can be cast as a combinatorial optimization problem with the objective to find the highest-scoring graph, where edge scores are learnt from data. Several of the decoding algorithms that have been applied to this task employ structural restrictions on candidate solutions, such as the restriction to projective dependency trees in syntactic parsing, or the restriction to noncrossing graphs in semantic parsing. In this paper we study the interplay between structural restrictions and a common loss function in neural dependency parsing, the structural hingeloss. We show how structural constraints can make networks trained under this loss function diverge and propose a modified loss function that solves this problem. Our experimental evaluation shows that the modified loss function can yield improved parsing accuracy, compared to the unmodified baseline.

References

Berg-Kirkpatrick, Taylor, David Burkett, and Dan Klein. 2012. An Empirical Investigation of Statistical Signi_cance in NLP. In Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning, pages 995{1005. Jeju Island, Korea: Association for Computational Linguistics.

Cao, Junjie, Sheng Huang, Weiwei Sun, and Xiaojun Wan. 2017. Parsing to 1-Endpoint-Crossing, pagenumber-2 graphs. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 2110{2120. Vancouver, Canada: Association for Computational Linguistics. https://doi.org/10.18653/v1/P17-1193

Chen, Danqi and Christopher Manning. 2014. A fast and accurate dependency parser using neural networks. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 740{750. Doha, Qatar: Association for Computational Linguistics. https://doi.org/10.3115/v1/D14-1082

Chu, Yoeng-Jin and Tseng-Hong Liu. 1965. On the shortest arborescence of a directed graph. Scientia Sinica 14:1396{1400.

de Lhoneux, Miryam, Sara Stymne, and Joakim Nivre. 2017. Arc-hybrid non-projective dependency parsing with a static-dynamic oracle. In Proceedings of the 15th International Conference on Parsing Technologies, pages 99{104. Pisa, Italy: Association for Computational Linguistics.

Dozat, Timothy and Christopher D. Manning. 2017. Deep Bia_ne Attention for Neural Dependency Parsing. In 5th International Conference on Learning Representations, ICLR 2017, Toulon, France, April 24-26, 2017, Conference Track Proceedings. OpenReview.net.

Dozat, Timothy and Christopher D. Manning. 2018. Simpler but more accurate semantic dependency parsing. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), pages 484{490. Melbourne, Australia: Association for Computational Linguistics. https://doi.org/10.18653/v1/P18-2077

Edmonds, Jack. 1967. Optimum Branchings. Journal of Research of the National Bureau of Standards B 71(4):233{240. https://doi.org/10.6028/jres.071B.032

Eisner, Jason and Giorgio Satta. 1999. E_cient parsing for bilexical context-free grammars and head automaton grammars. In Proceedings of the 37th Annual Meeting of the Association for Computational Linguistics, pages 457{464. College Park, Maryland, USA: Association for Computational Linguistics. https://doi.org/10.3115/1034678.1034748

Flickinger, Dan, Jan Hajič, Angelina Ivanova, Marco Kuhlmann, Yusuke Miyao, Stephan Oepen, and Daniel Zeman. 2016. SDP 2014 & 2015: Broad Coverage Semantic Dependency Parsing LDC2016T10 .

Francis, W. Nelson and Henry Kučera. 1985. Frequency Analysis of English Usage: Lexicon and Grammar. Journal of English Linguistics 18(1):64{70. https://doi.org/10.1177/007542428501800107

Garg, Nikhil and James Henderson. 2011. Temporal restricted boltzmann machines for dependency parsing. In Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies, pages 11{17. Portland, Oregon, USA: Association for Computational Linguistics.

Greff, K., R. K. Srivastava, J. Koutník, B. R. Steunebrink, and J. Schmidhuber. 2017. LSTM: A search space odyssey. IEEE Transactions on Neural Networks and Learning Systems 28(10):2222{2232. https://doi.org/10.1109/TNNLS.2016.2582924

Hajič, Jan, Eva Hajičová, Jarmila Panevová, Petr Sgall, Ondřej Bojar, Silvie Cinková, Eva

Fučíková, Marie Mikulová, Petr Pajas, Jan Popelka, Jiří Semecký, Jana Šindlerová, Jan

Štěpánek, Josef Toman, Zdeňka Urešová, and Zdeněk _Zabokrtský. 2012. Announcing Prague Czech-English dependency treebank 2.0. In Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12), pages 3153{3160. Istanbul, Turkey: European Language Resources Association (ELRA).

Henderson, James. 2004. Discriminative training of a neural network statistical parser. In Proceedings of the 42nd Annual Meeting of the Association for Computational Linguistics (ACL-04), pages 95{102. Barcelona, Spain. https://doi.org/10.3115/1218955.1218968

Hochreiter, Sepp and Jurgen Schmidhuber. 1997. Long Short-term Memory. Neural computation 9:1735{80. https://doi.org/10.1162/neco.1997.9.8.1735

Ivanova, Angelina, Stephan Oepen, Lilja _vrelid, and Dan Flickinger. 2012. Who did what to whom? A contrastive study of syntacto-semantic dependencies. In Proceedings of the Sixth Linguistic Annotation Workshop, pages 2{11. Jeju, Republic of Korea: Association for Computational Linguistics.

Kingma, Diederik P. and Jimmy Ba. 2015. Adam: A Method for Stochastic Optimization. In Y. Bengio and Y. LeCun, eds., 3rd International Conference on Learning Representations, ICLR 2015, San Diego, CA, USA, May 7-9, 2015, Conference Track Proceedings.

Kiperwasser, Eliyahu and Yoav Goldberg. 2016. Simple and accurate dependency parsing using bidirectional LSTM feature representations. Transactions of the Association for Computational Linguistics 4:313{327. https://doi.org/10.1162/tacl_a_00101

Kübler, Sandra, Ryan T. McDonald, and Joakim Nivre. 2009. Dependency Parsing. Synthesis Lectures on Human Language Technologies. Morgan & Claypool Publishers. https://doi.org/10.2200/S00169ED1V01Y200901HLT002

Kuhlmann, Marco and Peter Jonsson. 2015. Parsing to noncrossing dependency graphs. Transactions of the Association for Computational Linguistics 3:559{570. https://doi.org/10.1162/tacl_a_00158

Kummerfeld, Jonathan K. and Dan Klein. 2017. Parsing with Traces: An O(n4) Algorithm and a Structural Representation. Transactions of the Association for Computational Linguistics 5. https://doi.org/10.1162/tacl_a_00072

Kurtz, Robin and Marco Kuhlmann. 2017. Exploiting structure in parsing to 1-Endpoint-Crossing graphs. In Proceedings of the 15th International Conference on Parsing Technologies, pages 78{87. Pisa, Italy: Association for Computational Linguistics.

Marcus, Mitchell P., Beatrice Santorini, and Mary Ann Marcinkiewicz. 1993. Building a large annotated corpus of English: The Penn Treebank. Computational Linguistics 19(2):313{330. https://doi.org/10.21236/ADA273556

Martins, André F. T. and Mariana S. C. Almeida. 2014. Priberam: A turbo semantic parser with second order features. In Proceedings of the 8th International Workshop on Semantic Evaluation (SemEval 2014), pages 471{476. Dublin, Ireland: Association for Computational Linguistics. https://doi.org/10.3115/v1/S14-2082

Mayberry III, Marshall R. and Risto Miikkulainen. 1999. SARDSRN: A Neural Network Shift-Reduce Parser. In Proceedings of the 16th Annual International Joint Conference on Arti_cial Intelligence (IJCAI-99), pages 820{825. San Francisco, CA: Kaufmann.

McDonald, Ryan, Fernando Pereira, Kiril Ribarov, and Jan Hajič. 2005. Non-projective dependency parsing using spanning tree algorithms. In Proceedings of Human Language Technology Conference and Conference on Empirical Methods in Natural Language Processing, pages 523{530. Vancouver, British Columbia, Canada: Association for Computational Linguistics. https://doi.org/10.3115/1220575.1220641

Miyao, Yusuke. 2006. From Linguistic Theory to Syntactic Analysis: Corpus-Oriented Grammar Development and Feature Forest Model. Ph.D. thesis, Department of Computer Science, University of Tokyo, Tokyo, Japan.

Moss, Henry, Andrew Moore, David Leslie, and Paul Rayson. 2019. FIESTA: Fast IdEntification of State-of-The-Art models using adaptive bandit algorithms. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pages 2920{2930. Florence, Italy: Association for Computational Linguistics. https://doi.org/10.18653/v1/P19-1281

Neubig, Graham, Chris Dyer, Yoav Goldberg, Austin Matthews, Waleed Ammar, Antonios Anastasopoulos, Miguel Ballesteros, David Chiang, Daniel Clothiaux, Trevor Cohn, Kevin Duh, Manaal Faruqui, Cynthia Gan, Dan Garrette, Yangfeng Ji, Lingpeng Kong, Adhiguna Kuncoro, Gaurav Kumar, Chaitanya Malaviya, Paul Michel, Yusuke Oda, Matthew Richardson, Naomi Saphra, Swabha Swayamdipta, and Pengcheng Yin. 2017. DyNet: The Dynamic Neural Network Toolkit. arXiv preprint arXiv:1701.03980.

Nivre, Joakim, Mitchell Abrams, Željko Agić, Lars Ahrenberg, Lene Antonsen, Katya Aplonova, Maria Jesus Aranzabe, Gashaw Arutie, Masayuki Asahara, Luma Ateyah, Mohammed Attia, Aitziber Atutxa, Liesbeth Augustinus, Elena Badmaeva, Miguel Ballesteros, Esha Banerjee, Sebastian Bank, Verginica Barbu Mititelu, Victoria Basmov, John Bauer, Sandra Bellato, Kepa Bengoetxea, Yevgeni Berzak, Irshad Ahmad Bhat, Riyaz Ahmad Bhat, Erica Biagetti, Eckhard Bick, Rogier Blokland, Victoria Bobicev, Carl Börstell, Cristina Bosco, Gosse Bouma, Sam Bowman, Adriane Boyd, Aljoscha Burchardt, Marie Candito, Bernard Caron, Gauthier Caron, Güulşen Cebiroğglu Eryiğit, Flavio Massimiliano Cecchini, Giuseppe G. A. Celano, Slavomír _ Čéplö, Savas Cetin, Fabricio Chalub, Jinho Choi, Yongseok Cho, Jayeol Chun, Silvie Cinková, Aurélie Collomb, Çağri Çöltekin, Miriam Connor, Marine Courtin, Elizabeth Davidson, Marie-Catherine de Marneffe, Valeria de Paiva, Arantza Diaz de Ilarraza, Carly Dickerson, Peter Dirix, Kaja Dobrovoljc, Timothy Dozat, Kira Droganova, Puneet Dwivedi, Marhaba Eli, Ali Elkahky, Binyam Ephrem, Tomaž Erjavec, Aline Etienne, Richárd Farkas, Hector Fernandez Alcalde, Jennifer Foster, Cláudia Freitas, Katarína Gajdoŝová, Daniel Galbraith, Marcos Garcia, Moa Gardenfors, Sebastian Garza, Kim Gerdes, Filip Ginter, Iakes Goenaga, Koldo Gojenola, Memduh Gök_rmak, Yoav Goldberg, Xavier Gómez Guinovart, Berta Gonzáles Saavedra, Matias Grioni, Grnüzitis, Bruno Guillaume, Céline Guillot-Barbance, Nizar Habash, Jan Hajič, Jan Hajič jr., Linh Hà My, Na-Rae Han, Kim Harris, Dag Haug, Barbora Hladká, Jaroslava Hlavácová, Florinel Hociung, Petter Hohle, Jena Hwang, Radu Ion, Elena Irimia, O. lájídé Ishola, Tomáš Jelínek, Anders Johannsen, Fredrik Jørgensen, Hüuner Kaşkara, Sylvain Kahane, Hiroshi Kanayama, Jenna Kanerva, Boris Katz, Tolga Kayadelen, Jessica Kenney, Václava Kettnerová, Jesse Kirchner, Kamil Kopacewicz, Natalia Kotsyba, Simon Krek, Sookyoung Kwak, Veronika Laippala, Lorenzo Lambertino, Lucia Lam, Tatiana Lando, Septina Dian Larasati, Alexei Lavrentiev, John Lee, Phuong Lě Hnông, Alessandro Lenci, Saran Lertpradit, Herman Leung, Cheuk Ying Li, Josie Li, Keying Li, Kyung-Tae Lim, Nikola Ljubešić, Olga Loginova, Olga Lyashevskaya, Teresa Lynn, Vivien Macketanz, Aibek Makazhanov, Michael Mandl, Christopher Manning, Ruli Manurung, Cătălina Mărănduc, David Mareček, Katrin Marheinecke, Héctor Martínez Alonso, André Martins, Jan Maŝek, Yuji Matsumoto, Ryan McDonald, Gustavo Mendonça, Niko Miekka, Margarita Misirpashayeva, Anna Missilä, Cătălin Mititelu, Yusuke Miyao, Simonetta Montemagni, Amir More, Laura Moreno Romero, Keiko Sophie Mori, Shinsuke Mori, Bjartur Mortensen, Bohdan Moskalevskyi, Kadri Muischnek, Yugo Murawaki, Kaili Müürisep, Pinkey Nainwani, Juan Ignacio Navarro Horñiacek, Anna Nedoluzhko, Gunta Nešpore-Bērzkalne, Luong Nguyên Thi. , Huynên Nguyêen Thi. Minh, Vitaly Nikolaev, Rattima Nitisaroj, Hanna Nurmi, Stina Ojala, Adédayo. _ Òlúòkun, Mai Omura, Petya Osenova, Robert Östling, Lilja Øvrelid, Niko Partanen, Elena Pascual, Marco Passarotti, Agnieszka Patejuk, Guilherme Paulino-Passos, Siyao Peng, Cenel-Augusto Perez, Guy Perrier, Slav Petrov, Jussi Piitulainen, Emily Pitler, Barbara Plank, Thierry Poibeau, Martin Popel, Lauma Pretkalni_na, Sophie Prévost, Prokopis Prokopidis, Adam Przepiòrkowski, Tiina Puolakainen, Sampo Pyysalo, Andriela Rääbis, Alexandre Rademaker, Loganathan Ramasamy, Taraka Rama, Carlos Ramisch, Vinit Ravishankar, Livy Real, Siva Reddy, Georg Rehm, Michael Rießler, Larissa Rinaldi, Laura Rituma, Luisa Rocha, Mykhailo Romanenko, Rudolf Rosa, Davide Rovati, Valentin Ros, ca, Olga Rudina, Jack Rueter, Shoval Sadde, Benoit Sagot, Shadi Saleh, Tanja Samardžić, Stephanie Samson, Manuela Sanguinetti, Baiba Saulite, Yanin Sawanakunanon, Nathan Schneider, Sebastian Schuster, Djamé Seddah, Wolfgang Seeker, Mojgan Seraji, Mo Shen, Atsuko Shimada, Muh Shohibussirri, Dmitry Sichinava, Natalia Silveira, Maria Simi, Radu Simionescu, Katalin Simkó, Mária Šimková, Kiril Simov, Aaron Smith, Isabela Soares-Bastos, Carolyn Spadine, Antonio Stella, Milan Straka, Jana Strnadová, Alane Suhr, Umut Sulubacak, Zsolt Szántó, Dima Taji, Yuta Takahashi, Takaaki Tanaka, Isabelle Tellier, Trond Trosterud, Anna Trukhina, Reut Tsarfaty, Francis Tyers, Sumire Uematsu, Zdeňka Urešová, Larraitz Uria, Hans Uszkoreit, Sowmya Vajjala, Daniel van Niekerk, Gertjan van Noord, Viktor Varga, Eric Villemonte de la Clergerie, Veronika Vincze, Lars Wallin, Jing Xian Wang, Jonathan North Washington, Seyi Williams, Mats Wir_en, Tsegay Woldemariam, Tak-sum Wong, Chunxiao Yan, Marat M. Yavrumyan, Zhuoran Yu, Zdeněk Zabokrtský, Amir Zeldes, Daniel Zeman, Manying Zhang, and Hanzhi Zhu. 2018. Universal Dependencies 2.3 .

Oepen, Stephan and Jan Tore Lønning. 2006. Discriminant-based MRS banking. In Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC'06). Genoa, Italy: European Language Resources Association (ELRA).

Peng, Hao, Sam Thomson, and Noah A. Smith. 2017. Deep multitask learning for semantic dependency parsing. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 2037{2048. Vancouver, Canada: Association for Computational Linguistics. https://doi.org/10.18653/v1/P17-1186

Pitler, Emily, Sampath Kannan, and Mitchell Marcus. 2013. Finding optimal 1-Endpoint- Crossing trees. Transactions of the Association for Computational Linguistics 1:13{24. https://doi.org/10.1162/tacl_a_00206

Schluter, Natalie. 2014. On maximum spanning DAG algorithms for semantic DAG parsing. In Proceedings of the ACL 2014 Workshop on Semantic Parsing, pages 61{65. Baltimore, MD: Association for Computational Linguistics. https://doi.org/10.3115/v1/W14-2412

Stenetorp, Pontus. 2013. Transition-based Dependency Parsing Using Recursive Neural Networks. In Deep Learning Workshop at the 2013 Conference on Neural Information Processing Systems (NIPS).

Taskar, Benjamin, Vassil Chatalbashev, Daphne Koller, and Carlos Guestrin. 2005. Learning structured prediction models: A large margin approach. In L. D. Raedt and S. Wrobel, eds., Machine Learning, Proceedings of the Twenty-Second International Conference (ICML 2005), Bonn, Germany, August 7-11, 2005 , vol. 119 of ACM International Conference Proceeding Series, pages 896{903. ACM.

Taskar, Benjamin, Carlos Guestrin, and Daphne Koller. 2003. Max-Margin Markov Networks. In S. Thrun, L. K. Saul, and B. Schölkopf, eds., Advances in Neural Information Processing Systems 16 [Neural Information Processing Systems, NIPS 2003, December 8-13, 2003, Vancouver and Whistler, British Columbia, Canada], pages 25{32. MIT Press.

Titov, Ivan and James Henderson. 2007. Fast and robust multilingual dependency parsing with a generative latent variable model. In Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL), pages 947{951. Prague, Czech Republic: Association for Computational Linguistics.

Varab, Daniel and Natalie Schluter. 30 09 { 2 10 2019. UniParse: A universal graphbased parsing toolkit. In Proceedings of the 22nd Nordic Conference on Computational Linguistics, pages 406{410. Turku, Finland: Linköping University Electronic Press.

Downloads

Published

2019-12-20