UpSizeR: Synthetically Scaling Up a Given Database State

Published August 18, 2016, 23:01
E-commerce and social networking services must ensure that their systems are scalable. Engineering for rapid growth requires intensive testing with scaled-up datasets. Although such a larger dataset is synthetically generated, it must resemble a real dataset to be useful. This talk presents UpSizeR, a tool for scaling up relational databases. Given a database state D and a positive number s, UpSizeR generates a synthetic state D' that is s times the size of D, yet similar to D in terms of query results. UpSizeR does this by extracting inter-column and inter-row information from D. An enterprise can also use UpSizeR to make a synthetic copy (s=1) of its proprietary dataset for a vendor, or to scale down a production dataset (s<1) for non-production testing. Experiments with Flickr data show good agreement between crawled data and UpSizeR output for various sizes. However, UpSizeR currently cannot scale the social network topology in Flickr. This leads to the Attribute Value Correlation Problem: if D records data from a social network, how do the social interactions affect correlation among attribute values in D?
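To make the idea of scaling by a factor s concrete, here is a minimal Python sketch. It is not UpSizeR's actual algorithm, which the abstract does not detail. It assumes a hypothetical two-table database (users, and photos that reference an owner) and preserves only one inter-row property: the empirical distribution of photos per user. The function name `scale_up` and the table layout are illustrative assumptions.

```python
import random
from collections import Counter


def scale_up(users, photos, s, seed=0):
    """Toy scaling of a two-table state D by factor s (illustrative only).

    users:  list of user ids in D
    photos: list of (photo_id, owner_id) rows in D
    s:      positive scale factor (s > 1 scales up, s < 1 scales down)
    """
    if s <= 0:
        raise ValueError("s must be positive")
    rng = random.Random(seed)

    # Photos-per-user counts observed in D, including users with no photos.
    per_owner = Counter(owner for _, owner in photos)
    degrees = [per_owner.get(u, 0) for u in users]

    # Synthetic users get fresh ids; there are about s times as many as in D.
    new_users = list(range(round(s * len(users))))

    new_photos = []
    for u in new_users:
        # Resample this user's photo count from the distribution seen in D.
        for _ in range(rng.choice(degrees)):
            new_photos.append((len(new_photos), u))
    return new_users, new_photos


if __name__ == "__main__":
    d_users = [1, 2, 3, 4]
    d_photos = [(10, 1), (11, 1), (12, 2)]
    up_users, up_photos = scale_up(d_users, d_photos, s=3)
    print(len(up_users), "synthetic users,", len(up_photos), "synthetic photos")
```

Because photo counts are resampled per user, the photos table grows by roughly s in expectation rather than exactly. A real tool would also need to handle attribute values and correlations across several tables, which this sketch ignores.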