The successes and challenges of making low-data languages available in online automatic translation portals

Published August 17, 2016, 21:58
The majority of development and deployment work on machine translation (MT) technologies over the past several decades has been for international languages. Only a few projects for low-data languages (also called low-density, low-resource, sparse-data, less-prevalent, less commonly taught, or minority languages) have led to successful prototypes and products. A number of technical, logistical, social, educational, and other factors influence the potential success of implementing systems for such languages. This talk will cover many of the lessons learned from previous projects and some of the pitfalls to avoid. It will also demonstrate how the recent effort to make Haitian Creole available for Haiti Disaster Relief achieved a measure of success in record time because it could build upon previous work. Yet there were also obstacles that have been problematic and remain a concern for this language and for other less-prevalent languages. Lastly, the discussion will mention some ways to enable proactive, forward-thinking projects, using bootstrapping methods, to reduce the risk of the situations that can result from working in a primarily reactive mode. This will be an interactive dialogue with the audience, allowing for questions throughout the session, followed by additional question-and-answer time.