Data Harvesting: A Random Coding Approach to Rapid Dissemination and Efficient Storage of Data

Published September 6, 2016, 5:06
In this talk, we will see how Random Linear Coding (RLC) based protocols can provide large gains in two related problems in large distributed systems: disseminating information rapidly in a decentralized manner, and efficiently storing a large file in a distributed manner. We will start with information dissemination, which is the primary focus of the talk. We consider the following general setting. There are N nodes in the network and K distinct messages spread through the system initially, but not all nodes have all the messages. The question we ask is: how quickly can all K messages be disseminated to all the nodes? For fully connected graphs with point-to-point, gossip-based communication, we will show that the time to disseminate the messages with an RLC-based protocol is order-optimal in the regime K=O(N). Simulation results demonstrating the large gains from RLC-based protocols for simultaneous dissemination of messages will be shown for different network topologies under point-to-point and point-to-multipoint communication.

We will then touch on an RLC-based strategy for storing a large file in a distributed manner. In the framework we consider, there are many storage locations, each with very limited storage space. Each storage location chooses a part of the file (or a coded version of the parts) without knowing what is stored at the other locations. We will show that, with RLC-based storage, the minimum number of storage locations a downloader must connect to in order to reconstruct the entire file can be very close to the number needed when there is complete coordination between the storage locations and the downloader.
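The dissemination setting can be illustrated with a small simulation. The following is a minimal sketch under stated assumptions, not the protocol analyzed in the talk: coding is over GF(2) (the abstract does not specify a field), rounds are synchronous, each node pushes one coded packet per round to a uniformly random other node, and message i starts at node i mod N. The function names are hypothetical. Only coefficient vectors are tracked, stored as integer bitmasks; message payloads are omitted because a node can decode exactly when its coefficient vectors reach rank K.

```python
import random


def insert_vector(basis, vec):
    """Add vec to a GF(2) basis stored as {pivot_bit: vector}.

    Returns True if vec was linearly independent of the basis.
    """
    while vec:
        pivot = vec.bit_length() - 1  # index of the highest set bit
        if pivot not in basis:
            basis[pivot] = vec
            return True
        vec ^= basis[pivot]  # cancel the leading bit and keep reducing
    return False


def random_combination(basis, rng):
    """XOR together a uniformly random subset of the basis vectors."""
    packet = 0
    for vec in basis.values():
        if rng.random() < 0.5:
            packet ^= vec
    return packet


def rlc_gossip_rounds(n, k, seed=0, max_rounds=10000):
    """Rounds until every one of n >= 2 nodes can decode all k messages.

    Returns None if max_rounds is reached first.
    """
    rng = random.Random(seed)
    nodes = [dict() for _ in range(n)]
    # Assumed initial placement: message i sits at node i % n, uncoded.
    for i in range(k):
        insert_vector(nodes[i % n], 1 << i)

    for round_number in range(1, max_rounds + 1):
        packets = []
        for sender in range(n):
            # Pick a receiver uniformly among the other n - 1 nodes.
            receiver = rng.randrange(n - 1)
            if receiver >= sender:
                receiver += 1
            packets.append((receiver, random_combination(nodes[sender], rng)))
        # Deliver after all packets are formed, so the round is synchronous.
        for receiver, packet in packets:
            insert_vector(nodes[receiver], packet)
        if all(len(basis) == k for basis in nodes):
            return round_number
    return None


if __name__ == "__main__":
    print(rlc_gossip_rounds(n=16, k=16))
```

In this sketch each packet raises the receiver's rank by at most one, so with N = K = 16 and one message per node initially, at least 15 rounds are needed regardless of the random choices. A larger field would make a received packet more likely to be useful; GF(2) is used here only to keep the arithmetic to XOR.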