Lahar: Warehousing Markovian Streams

Published August 17, 2016, 20:55
In this talk, I present Lahar, a warehousing system for a general class of imprecise, sequential data called Markovian streams. These streams are commonly used to model location sequences inferred from noisy sensors such as RFID or GPS, text inferred from spoken audio, and similar data. In the context of Lahar, I introduce algorithms that support sophisticated analytics on these streams (e.g., "How many coffee breaks did Bob take in May that lasted over an hour?" or "Find the start/end timestamps of every podcast snippet containing the phrase 'health care'."). The rich semantics of both queries and data in Lahar pose serious efficiency challenges. I present several techniques to address these challenges, including novel indexing and approximation approaches.
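To make the data model concrete, here is a minimal Python sketch of a toy Markovian stream and one simple event query over it. It is an illustration under stated assumptions, not Lahar's actual representation or algorithms: it assumes a stream can be described by an initial distribution over states plus a conditional probability table between consecutive timesteps, and every name in it (`marginals`, `prob_ever_in`, the states `office` and `coffee`, and all the probabilities) is hypothetical.

```python
from typing import Dict, List

State = str
Dist = Dict[State, float]        # P(state) at one timestep
Transition = Dict[State, Dist]   # P(next state | current state)


def marginals(initial: Dist, transitions: List[Transition]) -> List[Dist]:
    """Return the marginal distribution over states at every timestep."""
    dists = [dict(initial)]
    for cpt in transitions:
        nxt: Dist = {}
        for s, p in dists[-1].items():
            for t, q in cpt[s].items():
                nxt[t] = nxt.get(t, 0.0) + p * q
        dists.append(nxt)
    return dists


def prob_ever_in(initial: Dist, transitions: List[Transition], target: State) -> float:
    """Probability that the sequence visits `target` at least once.

    Tracks the probability mass of paths that have avoided `target` so far,
    so correlations between consecutive timesteps are respected.
    """
    avoid = {s: p for s, p in initial.items() if s != target}
    for cpt in transitions:
        nxt: Dist = {}
        for s, p in avoid.items():
            for t, q in cpt[s].items():
                if t != target:
                    nxt[t] = nxt.get(t, 0.0) + p * q
        avoid = nxt
    return 1.0 - sum(avoid.values())


# Hypothetical three-timestep stream for "Bob", inferred from noisy sensors.
initial = {"office": 0.8, "coffee": 0.2}
step = {
    "office": {"office": 0.7, "coffee": 0.3},
    "coffee": {"office": 0.4, "coffee": 0.6},
}
transitions = [step, step]

dists = marginals(initial, transitions)
p_coffee = prob_ever_in(initial, transitions, "coffee")
print(dists[2])   # marginal distribution at the last timestep
print(p_coffee)   # about 0.608
```

The query is answered from the transition tables rather than from the per-timestep marginals alone, because consecutive timesteps are correlated. Treating the marginals as independent would give a different, incorrect answer.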