Candidate talk: Domain Adaptation with Structural Correspondence Learning

Published September 6, 2016, 16:39
Statistical language processing tools are being applied to an ever-wider and more varied range of linguistic data. Researchers and engineers are using statistical models to organize and understand financial news, legal documents, biomedical abstracts, and weblog entries, among many other domains. Because language varies so widely, collecting and curating training sets for each different domain is prohibitively expensive. At the same time, differences in vocabulary and writing style across domains can cause the error of state-of-the-art supervised models to increase dramatically. This talk describes structural correspondence learning (SCL), a method for adapting models from resource-rich source domains to resource-poor target domains. SCL uses unlabeled data from both domains to induce a common feature representation for domain adaptation. We demonstrate SCL empirically for the task of sentiment classification, where it decreases error due to adaptation by more than 40%. We also give a uniform convergence bound on the error of a classifier trained in one domain and tested in another. Our bound confirms the intuitive result that a good feature representation for domain adaptation is one which makes domains appear similar, while maintaining discriminative power.
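The core of SCL is inducing the shared representation from unlabeled data: pick "pivot" features that occur frequently in both domains, train a linear predictor of each pivot from the remaining features, and take an SVD of the stacked predictor weights to obtain a low-dimensional projection that is appended to the original features. The sketch below illustrates that pipeline; all function names are hypothetical, and plain least squares stands in for the per-pivot classifiers used in the actual work, so this is a minimal illustration rather than the talk's implementation.

```python
import numpy as np

def scl_projection(X_src, X_tgt, pivot_idx, k=2):
    """Sketch of SCL feature induction (illustrative, not the paper's code).

    X_src, X_tgt : unlabeled binary feature matrices (n_docs x n_feats),
                   one per domain.
    pivot_idx    : indices of pivot features, assumed frequent in both domains.
    k            : dimensionality of the induced shared representation.
    """
    X = np.vstack([X_src, X_tgt]).astype(float)
    n_feats = X.shape[1]
    non_pivot = [j for j in range(n_feats) if j not in set(pivot_idx)]

    # For each pivot feature, fit a linear predictor of its occurrence
    # from the non-pivot features. (Least squares here is a stand-in
    # assumption; any linear classifier serves the same role.)
    A = X[:, non_pivot]
    W = np.zeros((len(non_pivot), len(pivot_idx)))
    for c, p in enumerate(pivot_idx):
        w, *_ = np.linalg.lstsq(A, X[:, p], rcond=None)
        W[:, c] = w

    # SVD of the weight matrix: the top-k left singular vectors define
    # the projection onto the shared low-dimensional space. Correlated
    # non-pivot features from different domains end up close together.
    U, _, _ = np.linalg.svd(W, full_matrices=False)
    theta = U[:, :k].T  # shape (k, n_non_pivot)
    return theta, non_pivot

def augment(x, theta, non_pivot):
    # A classifier for the target task is then trained on the original
    # features concatenated with the induced shared features.
    return np.concatenate([x, theta @ x[non_pivot]])
```

A target-task classifier trained on the augmented vectors can transfer across domains because the appended dimensions encode correspondences (e.g. domain-specific words that co-occur with the same pivots) rather than raw vocabulary.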