AWS Supports You | Reducing Costs with EMR Serverless Design Patterns

Published on 23 Nov 2022, 14:46
We would love to hear your feedback about our show! Please take our survey here: amazonintna.qualtrics.com/jfe/...

AWS Supports You: Reducing Costs with EMR Serverless Design Patterns gives viewers on our twitch.tv/aws channel an overview of Amazon EMR Serverless: its core concepts, features, benefits, common usage patterns, and pricing. This episode originally aired on November 22, 2022.

Introduction 0:00
Amazon EMR Serverless Overview 4:26
Common Usage Patterns 18:22
Pricing 36:39
Demo Walkthrough 45:15
Conclusions 52:28

Helpful Links:

EMR Serverless Overview and features
docs.aws.amazon.com/emr/latest...

aws.amazon.com/blogs/big-data/...

Currently, the Hive and Spark frameworks are supported:
docs.aws.amazon.com/emr/latest...

docs.aws.amazon.com/emr/latest...

docs.aws.amazon.com/emr/latest...

EMR Serverless Workshop (hands-on, step-by-step guide):
catalog.us-east-1.prod.worksho...

aws.amazon.com/emr/serverless

docs.aws.amazon.com/emr/latest...

aws.amazon.com/blogs/big-data/...

EMR Serverless samples showing how to run jobs and monitor them using the Spark UI/Tez UI:
github.com/aws-samples/emr-ser...

Additional Info:
How can I include dependencies with jobs that I want to run on EMR Serverless?

For PySpark, you can package your Python dependencies using virtualenv and pass the archive file using the --archives option, which enables your workers to use the dependencies during the job run. For Scala or Java, you can package your dependencies as jars, upload them to Amazon S3, and pass them using the --jars or --packages options with your EMR Serverless job run.
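
As a rough illustration of the answer above, the sketch below assembles the spark-submit options into the job driver structure that a job run request carries. The helper names, bucket, and file paths are hypothetical and exist only for this example; the code builds a plain dictionary and does not call AWS.

```python
def build_spark_submit_parameters(archives=(), jars=(), packages=()):
    """Join dependency options into one spark-submit parameter string.

    Each argument is a sequence of strings; empty sequences are skipped.
    """
    parts = []
    if archives:
        parts += ["--archives", ",".join(archives)]   # e.g. a packed virtualenv
    if jars:
        parts += ["--jars", ",".join(jars)]           # jars uploaded to Amazon S3
    if packages:
        parts += ["--packages", ",".join(packages)]   # Maven coordinates
    return " ".join(parts)


def build_job_driver(entry_point, archives=(), jars=(), packages=()):
    """Return a sparkSubmit job driver dictionary (hypothetical helper)."""
    return {
        "sparkSubmit": {
            "entryPoint": entry_point,
            "sparkSubmitParameters": build_spark_submit_parameters(
                archives, jars, packages
            ),
        }
    }


# Hypothetical S3 locations, for illustration only.
pyspark_driver = build_job_driver(
    "s3://example-bucket/scripts/job.py",
    archives=["s3://example-bucket/deps/pyspark_venv.tar.gz#environment"],
)
```

A dictionary shaped like this could then be passed as the jobDriver argument when starting a job run, for example through the boto3 emr-serverless client's start_job_run call. Check the EMR Serverless documentation for the exact options your release supports.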

In how many ways can EMR Serverless help save costs?

There are three ways in which Amazon EMR Serverless can help you save costs.

First, there is no operational overhead of managing, securing, and scaling clusters.

Second, EMR Serverless automatically scales workers up at each stage of processing your job and scales them down when they’re not required. You’re charged for aggregate vCPU, memory, and storage resources used from the time a worker starts running until it stops, rounded up to the nearest second with a 1-minute minimum.

Third, EMR Serverless includes the Amazon EMR performance-optimized runtime for Apache Spark and Apache Hive. The Amazon EMR runtime is API-compatible and over twice as fast as standard open-source analytics engines, so your jobs run faster and incur fewer compute costs.
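
To make the billing rule in the second point concrete, here is a minimal sketch of the per-worker arithmetic: runtime is rounded up to the nearest second with a 1-minute minimum, then multiplied by the vCPU, memory, and storage the worker used. The function names are hypothetical, and the hourly rates are parameters you must supply from the pricing page for your Region; none of the numbers below are real prices.

```python
import math


def billable_seconds(runtime_seconds):
    """Round runtime up to the next whole second, with a 60-second minimum."""
    return max(60, math.ceil(runtime_seconds))


def worker_cost(runtime_seconds, vcpus, memory_gb, storage_gb,
                vcpu_hour_rate, gb_hour_rate, storage_gb_hour_rate):
    """Estimate the charge for one worker (hypothetical helper).

    Rates are per vCPU-hour, per GB-hour of memory, and per GB-hour of
    billable storage. They are caller-supplied assumptions, not real prices.
    """
    hours = billable_seconds(runtime_seconds) / 3600
    hourly = (vcpus * vcpu_hour_rate
              + memory_gb * gb_hour_rate
              + storage_gb * storage_gb_hour_rate)
    return hours * hourly


# A worker that ran for 12.3 seconds is still billed for one full minute.
short_run = billable_seconds(12.3)

# Placeholder rates, for illustration only.
one_hour = worker_cost(3600, vcpus=4, memory_gb=16, storage_gb=0,
                       vcpu_hour_rate=0.05, gb_hour_rate=0.005,
                       storage_gb_hour_rate=0.0001)
```

Summing this estimate over every worker in a job gives the aggregate figure the answer refers to. Because workers are scaled down when they are not needed, fewer worker-seconds accumulate than on a fixed-size cluster that runs for the whole job.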

Subscribe:
More AWS videos - bit.ly/2O3zS75
More AWS events videos - bit.ly/316g9t4

ABOUT AWS
Amazon Web Services (AWS) is the world’s most comprehensive and broadly adopted cloud platform, offering over 200 fully featured services from data centers globally. Millions of customers — including the fastest-growing startups, largest enterprises, and leading government agencies — are using AWS to lower costs, become more agile, and innovate faster.

#AWS #AmazonWebServices #CloudComputing