First Joint Workshop on Statistical Parsing of Morphologically Rich Languages and Syntactic Analysis of Non-Canonical Languages (SPMRL-SANCL 2014)
ENDORSED BY SIGPARSE
Co-located with Coling 2014, August 24 in Dublin, Ireland
SPMRL-SANCL 2014 will feature a shared task on semi-supervised parsing of morphologically rich languages.
Outline
Statistical parsing of morphologically-rich languages has repeatedly been shown to exhibit non-trivial challenges including, among others, sparse lexica in the face of rich inflectional systems, parsing deficiency in the face of free word order and treebank annotation idiosyncrasies in the face of morphosyntactic interactions.
Similar problems arise for parsing non-canonical languages. Besides technical issues such as lexical sparseness and ad-hoc structures, we also face theoretical problems, including constructions that do not occur, or occur only very rarely, in standard language, such as verbless sentences or complex hashtags.
The first joint SPMRL-SANCL workshop addresses the challenges of parsing both morphologically rich languages (MRLs) and non-canonical languages (NCLs). It provides a forum for research addressing the often overlapping issues of both fields, with the goal of identifying cross-cutting issues in the annotation and parsing methodology for such languages.
Areas of interest
The areas of interest of the SPMRL-SANCL workshop include, but are not limited to, the following list of topics:
- applying cutting-edge parsing techniques to new languages and domains
- strengths and weaknesses of current parsing techniques when applied to morphologically-rich and/or non-canonical language
- insights and techniques that are targeted at improving parsing quality for morphologically-rich and/or non-canonical language
- using insights from parsing and associated processing problems to motivate decisions in the creation of new syntactically annotated corpora
- annotation and parsing of data from domains and genres that are not yet covered for many languages
In addition to regular paper submissions, we ask for poster submissions addressing the syntactic analysis of frequent phenomena of non-canonical languages which are difficult to annotate and parse using conventional annotation schemes. Cases in point are the representation of verbless utterances in a dependency scheme, the pros and cons of different representations of disfluencies for statistical parsing, or the analysis of complex hashtags which incorporate and merge different syntactic arguments into one token. The posters should focus on one or more of a number of given issues, described in more detail at http://spmrl.org/sancl-posters2014.html, and will be presented at the workshop. More details on the submission categories for the poster session can be found below and on the website.
Important Dates (updated!)
Please note the new deadlines (old submission deadline was May 2nd)
Submission deadline | [extended] June 13, 2014 |
Author Notification | July 1, 2014 |
Camera ready copy | July 13, 2014 |
Workshop | August 24, 2014 |
How to Submit
We solicit the following submission categories:
- long papers (up to 11 pages excluding references)
- short papers (up to 6 pages excluding references)
- abstracts (500 words excluding examples/references, for SANCL poster topics)
- shared task paper submissions (deadline will be disclosed later)
Long papers are most appropriate for presenting substantial and completed research addressing a topic relevant to either SANCL or SPMRL.
Short papers are suited for presenting work in progress, position papers or short, focused contributions relevant to either SANCL or SPMRL (including the poster session topics described above and, in more detail, on the workshop website).
Both long and short papers should present original, unpublished research. They will be peer reviewed and presented either as an oral talk or as a poster at the workshop. Long/short papers will be included in the proceedings.
Abstract submissions are most appropriate for presenting an idea for an analysis of one or more of the poster topics. In contrast to long/short paper submissions, abstract submissions do not need to back up their ideas with experimental results. Abstract submissions will receive a yes/no review and will not be included in the proceedings.
Submissions will be accepted until June 6, 2014 (11:59 p.m. PST) in PDF format via the START system (https://www.softconf.com/coling2014/WS-13) and must be formatted using the Coling 2014 formatting instructions.
Organizers
Workshop
- Yoav Goldberg (Bar Ilan University, Israel)
- Yuval Marton (Microsoft Inc., US)
- Ines Rehbein (Potsdam University, Germany)
- Yannick Versley (Heidelberg University, Germany)
- Özlem Çetinoğlu (University of Stuttgart, Germany)
- Joel Tetreault (Yahoo! Labs, US)
SANCL Special Track
- Ines Rehbein (Potsdam University, Germany)
- Özlem Çetinoğlu (University of Stuttgart, Germany)
- Djamé Seddah (Université Paris Sorbonne & INRIA's Alpage Project, France)
- Joel Tetreault (Yahoo! Labs, US)
Shared task
- Sandra Kübler (Indiana University, US)
- Djamé Seddah (Université Paris Sorbonne & INRIA's Alpage Project, France)
- Reut Tsarfaty (Weizmann Institute of Science, Israel)
Program committee
- Bernd Bohnet (University of Birmingham, UK)
- Marie Candito (University of Paris 7, France)
- Aoife Cahill (Educational Testing Service, US)
- Jinho D. Choi (University of Massachusetts Amherst, US)
- Grzegorz Chrupala (Tilburg University, Netherlands)
- Markus Dickinson (Indiana University, US)
- Stefanie Dipper (Ruhr-Universität Bochum, Germany)
- Jacob Eisenstein (Georgia Institute of Technology, US)
- Richard Farkas (University of Szeged, Hungary)
- Jennifer Foster (Dublin City University, Ireland)
- Josef van Genabith (DFKI, Germany)
- Koldo Gojenola (University of the Basque Country, Spain)
- Spence Green (Stanford University, US)
- Samar Husain (Potsdam University, Germany)
- Sandra Kübler (Indiana University, US)
- Joseph Le Roux (Université Paris-Nord, France)
- John Lee (City University of Hong Kong, China)
- Wolfgang Maier (University of Düsseldorf, Germany)
- Takuya Matsuzaki (University of Tokyo, Japan)
- David McClosky (IBM Research, US)
- Detmar Meurers (University of Tübingen, Germany)
- Joakim Nivre (Uppsala University, Sweden)
- Kemal Oflazer (Carnegie Mellon University, Qatar)
- Adam Przepiorkowski (ICS PAS, Poland)
- Owen Rambow (Columbia University, US)
- Kenji Sagae (University of Southern California, US)
- Benoit Sagot (Inria, France)
- Djamé Seddah (Univ. Paris Sorbonne, France)
- Wolfgang Seeker (IMS Stuttgart, Germany)
- Anders Søgaard (University of Copenhagen, Denmark)
- Reut Tsarfaty (Weizmann Institute of Science, Israel)
- Lamia Tounsi (Dublin City University, Ireland)
- Daniel Zeman (Charles University, Czechia)
For general questions about the workshop, please email spmrl.sancl@gmail.com. For specific questions about the shared task, please email the shared task organizers at spmrl.sharedtask@gmail.com.
ENDORSEMENT
This workshop is endorsed by the ACL SIGPARSE interest group.
For their invaluable help in preparing the SPMRL 2013 and 2014 Shared Tasks and for allowing their data to be part of them, we warmly thank the Linguistic Data Consortium, the Knowledge Center for Processing Hebrew (MILA), Ben Gurion University, Bar Ilan University, Columbia University, the Institute of Computer Science (Polish Academy of Sciences), the Korea Advanced Institute of Science and Technology, the University of the Basque Country, Uppsala University, the University of Stuttgart, the University of Szeged and University Paris Diderot (Paris 7).