Inferring Information Status for Reference Generation in Open Domains [1/12]

Published September 6, 2016, 5:26
Multi-document summarization involves heavy information compression. Important events need to be summarized while, at the same time, the protagonists need to be described in sufficient detail for the reader to relate them to the story. This talk formalizes the trade-off between describing entities and events within the space constraints of a short summary. I present experiments on automatically acquiring the information status of entities and using it to direct the reference generation process for multi-document summarization. Information status broadly consists of three notions: whether an entity is hearer-old or hearer-new, whether a reference to it is discourse-old or discourse-new, and whether the entity is a major or minor character in the text. The discourse-old/new distinction is well understood and results in longer, descriptive initial references and shorter subsequent references. I show how, based on features extracted from the input documents to a typical summarization engine, it can be determined whether entities are assumed to be hearer-old or hearer-new, and whether a character is central to the summary or not (major/minor). This information is used to decide whether to refer to a character by name or generically, and to decide the level of detail required in the initial reference. The learned information status successfully models human decisions on reference generation, generating short references (e.g. President