Inferring Information Status for Reference Generation in Open Domains [1/12]

Microsoft Research

Published September 6, 2016, 5:26

Multi-document summarization involves heavy information compression. Important events need to be summarized, while at the same time the protagonists need to be described in sufficient detail that the reader can relate them to the story. This talk formalizes the trade-off between describing entities and describing events within the space constraints of a short summary. I present experiments on automatically acquiring the information status of entities and using it to direct the reference generation process for multi-document summarization. Information status broadly consists of three notions: whether an entity is hearer-old or hearer-new, whether a reference to it is discourse-old or discourse-new, and whether the entity is a major or minor character in that text. The discourse-old/new distinction is well understood, and results in longer, descriptive initial references and shorter subsequent references. I show how features extracted from the input documents to a typical summarization engine can be used to determine whether entities are assumed to be hearer-old or hearer-new, and whether a character is central to the summary or not (major/minor). This information is used to decide whether to refer to a character by name or generically, and to decide the level of detail required in the initial reference. The learned information status successfully models human decisions on reference generation, generating short references (e.g. President
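
As a rough illustration of how the three notions of information status could steer the choice of referring expression, here is a minimal Python sketch. It is not the system described in the talk: the `Entity` fields, the `choose_reference` function, the placeholder person, and the specific mapping from status to reference form are all assumptions made for illustration. The abstract only states that the status decides between naming and generic reference and sets the level of detail of the initial reference.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    """Hypothetical record for one person mentioned in the input documents."""
    full_name: str
    surname: str
    role: str          # short title, e.g. "Senator"
    description: str   # longer descriptive phrase for unfamiliar entities
    generic: str       # generic noun phrase used instead of a name
    hearer_old: bool   # assumed already familiar to the reader
    major: bool        # central character in the summary


def choose_reference(entity: Entity, discourse_old: bool) -> str:
    """Pick a reference form from information status (illustrative rules only)."""
    if not entity.major:
        # Assumption: minor characters are referred to generically, not by name.
        return entity.generic
    if discourse_old:
        # Subsequent references are shorter than initial ones.
        return entity.surname
    if entity.hearer_old:
        # Initial reference to a familiar entity: a short form suffices.
        return f"{entity.role} {entity.surname}"
    # Initial reference to an unfamiliar major entity: name plus description.
    return f"{entity.full_name}, {entity.description}"


# Placeholder entities, not taken from any real data set.
familiar = Entity("Jane Doe", "Doe", "Senator", "a senator from Ohio",
                  "a senator", hearer_old=True, major=True)
unfamiliar = Entity("John Roe", "Roe", "Judge", "a district court judge",
                    "a judge", hearer_old=False, major=True)
minor = Entity("Ann Poe", "Poe", "Officer", "a local police officer",
               "a police officer", hearer_old=False, major=False)

first_familiar = choose_reference(familiar, discourse_old=False)
later_familiar = choose_reference(familiar, discourse_old=True)
first_unfamiliar = choose_reference(unfamiliar, discourse_old=False)
minor_reference = choose_reference(minor, discourse_old=False)
```

In a learned system the `hearer_old` and `major` flags would be predicted by classifiers over features of the input documents rather than set by hand, and the hard-coded rules here stand in for whatever generation policy is actually used.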