Google Cloud Platform (1.17 million subscribers)
Published April 23, 2019, 1:00
Twitter collects petabytes of data every day and faces the challenge of replicating it to multiple destinations based on users' use cases. One such destination is Google Cloud Storage, which acts as primary storage for tools such as BigQuery, Cloud Dataproc, and Cloud Dataflow. In this session, we take a deep dive into the design of this system and the challenges we faced at scale, and share what we learned in extending Twitter's Replication Service to Cloud Storage. We explain how this self-service system enables users to set up and manage replication of datasets to Google Cloud Storage. Today our Replication Service has transferred several tens of petabytes of data and is built to be used by thousands of users replicating hundreds of petabytes to Cloud Storage.
Build with Google Cloud → bit.ly/2KdoExq
Watch more:
Next '19 Data Analytics Sessions here → bit.ly/Next19DataAnalytics
Next '19 All Sessions playlist → bit.ly/Next19AllSessions
Subscribe to the GCP Channel → bit.ly/GCloudPlatform
Speaker(s): Lohit VijayaRenu
Session ID: DA300
product: Cloud - General; fullname: Lohit VijayaRenu; event: Google Cloud Next 2019;