UW/Microsoft 19th Quarterly Symposium in Computational Linguistics

Microsoft Research

Published September 7, 2016, 17:39

Announcing the nineteenth Symposium in Computational Linguistics, sponsored by the UW Departments of Linguistics, Electrical Engineering, and Computer Science, Microsoft Research, and UW alumni at Microsoft. Come take advantage of this opportunity to connect with the computational linguistics community at Microsoft and the University of Washington. The symposium is a regular opportunity for computational linguists at both institutions to discuss topics in the field and to connect in a friendly, informal atmosphere. This time it consists of three talks from UW summer interns (and their MS mentors), followed by an informal reception.

Hitting the Right Paraphrases in Good Time
Stanley Kok (CSE) and Chris Brockett (MSR-NLP)

We present a random-walk-based approach to extracting paraphrases from bilingual parallel corpora. The corpora are represented as a graph in which a node corresponds to a phrase, and an edge exists between two nodes if their corresponding phrases are aligned. We sample random walks to compute the average number of steps it takes to reach each candidate phrase, which yields a ranking of paraphrases in which better ones are closer to the phrase of interest. This approach allows feature nodes that represent domain knowledge to be easily incorporated into the graph, and it includes techniques that keep the graph from growing too large to process efficiently. Current state-of-the-art approaches, by contrast, require the graph to be bipartite, are limited to finding paraphrases that are a path of length two away from a phrase, and do not generally permit easy incorporation of domain knowledge into the graph. Manual evaluation of the generated output shows that this approach outperforms the state of the art.
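The random-walk ranking described in the Kok and Brockett abstract can be illustrated with a small sketch. This is not the authors' implementation: the aligned phrase pairs are made-up toy data, the function names (`build_graph`, `truncated_hitting_times`, `rank_paraphrases`) and the `en:`/`de:` prefixes are hypothetical, transitions are assumed to be uniform over neighbors, and walks that never reach a phrase are assumed to count as `max_steps`.

```python
import random
from collections import defaultdict


def build_graph(aligned_pairs):
    """Undirected phrase graph: one node per phrase, one edge per aligned pair."""
    graph = defaultdict(set)
    for a, b in aligned_pairs:
        graph[a].add(b)
        graph[b].add(a)
    # Sorted neighbor lists keep sampling reproducible for a fixed seed.
    return {node: sorted(nbrs) for node, nbrs in graph.items()}


def truncated_hitting_times(graph, start, num_walks=2000, max_steps=10, seed=0):
    """Estimate the average number of steps to first reach each node from start.

    Assumption: uniform transition probabilities; a walk that never reaches
    a node within max_steps contributes max_steps for that node.
    """
    rng = random.Random(seed)
    totals = {node: 0 for node in graph if node != start}
    for _ in range(num_walks):
        first_hit = {}
        node = start
        for step in range(1, max_steps + 1):
            node = rng.choice(graph[node])
            if node != start and node not in first_hit:
                first_hit[node] = step
        for target in totals:
            totals[target] += first_hit.get(target, max_steps)
    return {target: total / num_walks for target, total in totals.items()}


def rank_paraphrases(graph, phrase, lang_prefix="en:", **kwargs):
    """Rank same-language phrases by ascending estimated hitting time."""
    times = truncated_hitting_times(graph, phrase, **kwargs)
    candidates = [(t, p) for p, t in times.items() if p.startswith(lang_prefix)]
    return [p for _, p in sorted(candidates)]


# Toy alignments (hypothetical): English and German phrases forming a chain.
ALIGNED = [
    ("en:under control", "de:unter kontrolle"),
    ("de:unter kontrolle", "en:in check"),
    ("en:in check", "de:im griff"),
    ("de:im griff", "en:in hand"),
]

graph = build_graph(ALIGNED)
times = truncated_hitting_times(graph, "en:under control")
ranking = rank_paraphrases(graph, "en:under control")
print(ranking)
```

In this toy chain, "in check" sits two edges from the query and "in hand" four, so the sketch ranks "in check" first while still reaching a candidate that a path-length-two method would miss. A fuller version would presumably weight transitions by alignment scores rather than sampling neighbors uniformly.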
Toward the Twuring Test: Conversation Modeling using Twitter
Alan Ritter (CSE) and Colin Cherry (MSR-NLP)

The growing popularity of social media has had an interesting side effect for language researchers: services such as Twitter have resulted in people having instant-messenger-style conversations in a public medium, where anyone can observe. This creates a unique opportunity to collect, study, and model large-scale conversation data. We present a method for mining conversations from Twitter's public feed. The resulting conversation corpus, which will be made publicly available, has more than 1.3 million conversations, 75 thousand of which have more than 5 turns, providing a rich resource for the study of both Twitter and internet chat. Furthermore, we present several methods that attempt to model the flow of conversation by discovering latent classes over Tweets. We show that a repurposed content model (Barzilay and Lee 2004) can discover meaningful dialogue acts, such as question and comment, which indicate not only the role a Tweet plays in its conversation, but also the sorts of Tweets that are likely to follow. This model is improved and extended by employing a Bayesian sampling-based approach, allowing us to model a conversation's topic and to introduce sparse priors during learning.

Joint Inference for Knowledge Extraction from Biomedical Literature
Hoifung Poon (CSE) and Lucy Vanderwende (MSR-NLP)

Automatically extracting knowledge from online repositories (e.g., PubMed) holds the promise of dramatically speeding up biomedical research and drug design, and represents an outstanding example of the great vision of knowledge extraction from the Web. After initially focusing on entity recognition and binary protein interactions, the community has recently shifted its attention towards the more ambitious goal of recognizing complex, nested event structures, which are ubiquitous in the literature. However, state-of-the-art systems still adopt a pipeline architecture and fail to leverage the relational structures among candidate entities for mutual disambiguation. In this paper, we present the first joint approach for bioevent extraction that obtains state-of-the-art results. Our system is based on Markov logic and jointly predicts events and their arguments. We evaluated it using the BioNLP-09 Shared Task and compared it to the participating systems. Experimental results demonstrate the advantage of our approach.
