© 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Accelerate ML workloads using EC2
accelerated computing
Chetan Kapoor
Principal Product Manager – Amazon EC2
C M P 2 0 2
Amazon EC2 instance types
• General purpose: M5, T3
• Compute optimized: C5, C4
• Storage optimized: H1, I3, D2
• Memory optimized: X1e, R5
• Accelerated computing: F1, P3, G3
Choice of processors and architectures*
Right compute for each application and workload
Over 100 EC2 instances featuring:
• Intel Xeon processors
• AWS Graviton processors, based on the 64-bit Arm architecture
• AMD EPYC processors
Additional Amazon EC2 instances featuring NVIDIA GPUs and FPGAs
*Not all processors and architectures are available globally
Hardware acceleration for computationally demanding applications
• Machine learning: image recognition, natural language processing, speech recognition
• High performance computing: computational fluid dynamics, genomics, weather simulation, EDA
• Graphics-intensive: graphics workstations, video transcoding, game streaming
C5: Compute-optimized instances
• Custom 3.0 GHz Intel Xeon Scalable processors (Skylake)
• Up to 72 vCPUs and 144 GiB of memory (2:1 memory:vCPU ratio)
• 25 Gbps network bandwidth
• Support for Intel AVX-512 – great for ML inference
• C5d variant with local NVMe-based SSD storage
• Up to 50% instance savings over C4
• 25% price/performance improvement over C4
“We saw significant performance improvement on Amazon EC2 C5, with up to a 140% performance improvement in open standard CPU benchmarks over C4.”
“We are eager to migrate onto the AVX-512 enabled c5.18xlarge instance size… We expect to decrease the processing time of some of our key workloads by more than 30%.”
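The AVX-512 claim is easy to verify on a running Linux instance, because the feature bits show up in the CPU flags. A minimal stdlib-only sketch; the sample flags line is illustrative, not copied from a real C5:

```python
# Check whether a Linux instance exposes AVX-512 (as on C5) by parsing
# the "flags" line of /proc/cpuinfo. Pure stdlib; illustrative sketch.

def has_avx512(cpuinfo_text):
    """Return True if any 'flags' line lists an avx512 feature bit."""
    for line in cpuinfo_text.splitlines():
        if line.startswith("flags"):
            flags = line.split(":", 1)[1].split()
            if any(f.startswith("avx512") for f in flags):
                return True
    return False

# Abbreviated, illustrative flags line:
sample = "flags\t: fpu sse sse2 avx avx2 avx512f avx512dq avx512cd avx512bw avx512vl"
print(has_avx512(sample))  # -> True

# On a live instance you would read the file directly:
# with open("/proc/cpuinfo") as f:
#     print(has_avx512(f.read()))
```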
C5n: fastest networking in the cloud
• Featuring Intel Xeon Scalable processors
• 33% increased memory footprint over C5 instances
• 25 Gbps peak bandwidth on smaller instance sizes
• 100 Gbps network bandwidth on largest instance sizes
Faster analytics and big data workloads; lower costs for network-bound workloads; all of the elasticity, security, and scalability of AWS
z1d: high frequency for specialized workloads
• High-frequency instances with custom Intel Xeon Scalable processors running at sustained 4 GHz all-core turbo
• 8:1 GiB-to-vCPU ratio
• Up to 25 Gbps network bandwidth and up to 1.8 TB of local NVMe storage
• Six sizes, from z1d.large up to z1d.12xlarge (48 vCPUs, 384 GiB)
Use cases: electronic design automation, relational databases, gaming
CPUs vs. GPUs vs. FPGAs vs. ASICs for compute
CPU
• 10s–100s of processing cores
• Pre-defined instruction set and datapath widths
• Optimized for general-purpose computing
GPU
• 1,000s of processing cores
• Pre-defined instruction set and datapath widths
• Highly effective at parallel execution
FPGA
• Millions of programmable digital logic cells
• No predefined instruction set or datapath widths
• Hardware-timed execution
ASIC
• Optimized and custom designed for a particular use/function
• Predefined software experience exposed through an API
(Diagram: CPU and GPU blocks built from Control, ALU, Cache, and DRAM elements)
EC2 accelerated computing instances
P3: GPU compute instance
• Up to 8 NVIDIA V100 GPUs in a single instance, with NVLink for peer-to-peer GPU communication
• Supporting a wide variety of use cases including deep learning, HPC simulations, financial computing, and batch
rendering
G3: GPU graphics instance
• Up to 4 NVIDIA M60 GPUs, with GRID Virtual Workstation features and licenses
• Designed for workloads such as 3D rendering, 3D visualizations, graphics-intensive remote workstations, video
encoding, and virtual reality applications
F1: FPGA instance
• Up to 8 Xilinx Virtex UltraScale+ VU9P FPGAs in a single instance. Programmable via VHDL, Verilog, or OpenCL.
Growing marketplace of pre-built application accelerations.
• Designed for hardware-accelerated applications including financial computing, genomics, accelerated search, and
image processing
AWS Inferentia – ML Inference Chip
• High-performance machine learning inference chip, custom designed by AWS
• Designed for lower cost-per-inference across the full range of ML applications
Amazon EC2 P3 instances for
compute acceleration
Amazon EC2 P3 instances (October 2017)
One of the fastest, most powerful GPU instances in the cloud
• Up to eight NVIDIA Tesla V100 GPUs
• 1 petaflop of computational performance – up to 14x better than P2
• 300 GB/s GPU-to-GPU communication (NVLink) – 9x better than P2
• 16 GB GPU memory with 900 GB/s peak GPU memory bandwidth
Use cases for P3 instances
Machine learning/AI: natural language processing, image and video recognition, autonomous vehicle systems, recommendation systems
High performance computing: computational fluid dynamics, financial and data analytics, weather simulation, computational chemistry
The machine learning process
1. Business problem – ML problem framing
2. Data collection and integration
3. Data preparation and cleaning
4. Data visualization and analysis
5. Feature engineering
6. Model training and parameter tuning
7. Model evaluation
8. Are business goals met? If no, loop back through data augmentation, feature augmentation, and re-training; if yes, proceed
9. Model deployment
10. Monitoring and debugging – predictions (with re-training as needed)
Training machine learning models
AlexNet, 2012
• A large, deep convolutional neural network with five convolutional layers, 60
million parameters, and 650,000 neurons
• Created by Alex Krizhevsky, Ilya Sutskever, and Geoffrey Hinton
• Won the 2012 ILSVRC (ImageNet Large-Scale Visual Recognition Challenge)
• Used two NVIDIA GTX 580 GPUs
• Took nearly a week to train!
Source - https://papers.nips.cc/paper/4824-imagenet-classification-with-deep-convolutional-neural-networks.pdf
AWS P3 vs. P2 instances
GPU performance comparison
• P2 instances use the K80 accelerator (Kepler architecture)
• P3 instances use the V100 accelerator (Volta architecture)
(Charts comparing K80, P100, and V100 in TFLOPS: the V100 delivers roughly 1.7x the K80's FP32 performance, 2.6x its FP64 performance, and 14x its maximum FP32 performance when using mixed/FP16 precision)
P3 instance details

Instance size | GPUs | GPU peer-to-peer | vCPUs | Memory (GB) | Network bandwidth | Amazon EBS bandwidth | On-Demand price/hr* | 1-yr RI effective hourly* | 3-yr RI effective hourly*
P3.2xlarge | 1 | No | 8 | 61 | Up to 10 Gbps | 1.7 Gbps | $3.06 | $1.99 (35% disc.) | $1.23 (60% disc.)
P3.8xlarge | 4 | NVLink | 32 | 244 | 10 Gbps | 7 Gbps | $12.24 | $7.96 (35% disc.) | $4.93 (60% disc.)
P3.16xlarge | 8 | NVLink | 64 | 488 | 25 Gbps | 14 Gbps | $24.48 | $15.91 (35% disc.) | $9.87 (60% disc.)

Regional availability: P3 instances are generally available in the US East (Northern Virginia), US East (Ohio), US West (Oregon), EU (Ireland), Asia Pacific (Seoul), Asia Pacific (Tokyo), AWS GovCloud (US), and China (Beijing) Regions.

Framework support: P3 instances and their V100 GPUs are supported across all major frameworks (such as TensorFlow, MXNet, PyTorch, Caffe2, and CNTK).
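The Reserved Instance columns are roughly the On-Demand rate with the quoted discount applied. A quick arithmetic check of the 1-yr (35%) column; the 3-yr figures round slightly differently, so they are not checked here:

```python
# Sanity-check the 1-yr RI "effective hourly" column: it is approximately
# the On-Demand rate with the quoted 35% discount applied.

def effective_hourly(on_demand, discount):
    """Approximate RI effective hourly rate, rounded to cents."""
    return round(on_demand * (1 - discount), 2)

on_demand = {"P3.2xlarge": 3.06, "P3.8xlarge": 12.24, "P3.16xlarge": 24.48}
for size, price in on_demand.items():
    print(size, effective_hourly(price, 0.35))
# P3.2xlarge 1.99, P3.8xlarge 7.96, P3.16xlarge 15.91 - matching the table
```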
P3 instance details (continued)
• P3 instances provide GPU-to-GPU data transfer over NVLink
• P2 instances provided GPU-to-GPU data transfer over PCI Express
New larger P3 size – P3dn.24xlarge
Optimized for distributed ML training
• One of the most powerful GPU instances available in the cloud
• 100 Gbps of networking throughput
• 96 vCPUs using AWS custom Skylake CPUs and 768 GB of system memory
• Based on NVIDIA's latest Tesla V100 GPU with 32 GB of memory

Instance size | GPUs | GPU memory | GPU peer-to-peer | vCPUs | CPU type | Memory (GB) | Network bandwidth | Amazon EBS bandwidth | Local instance storage
P3.2xlarge | 1 x V100 | 16 GB/GPU | No | 8 | Broadwell | 61 | Up to 10 Gbps | 1.7 Gbps | NA
P3.8xlarge | 4 x V100 | 16 GB/GPU | NVLink | 32 | Broadwell | 244 | 10 Gbps | 7 Gbps | NA
P3.16xlarge | 8 x V100 | 16 GB/GPU | NVLink | 64 | Broadwell | 488 | 25 Gbps | 14 Gbps | NA
P3dn.24xlarge | 8 x V100 | 32 GB/GPU | NVLink | 96 | Skylake | 768 | 100 Gbps | 14 Gbps | 2 TB NVMe

• Latest NVIDIA V100 GPU with 32 GB of memory for large models and higher batch sizes
• 96 Skylake vCPUs with support for AVX-512 instructions for pre-processing of training data
• 100 Gbps of networking throughput for large-scale distributed training and fast data access
Scaling performance using distributed training
(Chart: training throughput in images/second for ResNet-50 on ImageNet, scaling from 1 to 64 GPUs across P3 instances)
• Using single P3 instances with Volta GPUs, customers can cut the training times of their machine learning models from days to a few hours.
• Using distributed training across multiple P3 instances with high-performance networking and storage solutions, customers can further cut their time-to-train from hours to minutes.
• Example – we have been able to train ResNet-50 to a Top-1 validation accuracy of 76% in 14 minutes using a cluster of P3.16xlarge instances.
https://aws.amazon.com/blogs/machine-learning/scalable-multi-node-deep-learning-training-using-gpus-in-the-aws-cloud/
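The scaling behavior the chart summarizes can be quantified as scaling efficiency: measured throughput divided by the ideal linear throughput of N GPUs. A sketch using hypothetical throughput numbers, not measured P3 results:

```python
# Scaling efficiency for distributed training: measured throughput divided
# by ideal linear throughput (N GPUs x single-GPU rate). The images/second
# figures below are illustrative placeholders, not measured P3 results.

def scaling_efficiency(throughput, n_gpus, single_gpu_rate):
    """Fraction of ideal linear scaling achieved (1.0 = perfectly linear)."""
    return throughput / (n_gpus * single_gpu_rate)

single = 750.0  # hypothetical ResNet-50 images/sec on one V100
measured = {8: 5700.0, 64: 43000.0}  # hypothetical multi-GPU throughputs
for n, tput in measured.items():
    print(n, "GPUs:", round(scaling_efficiency(tput, n, single), 2))
# 8 GPUs: 0.95, 64 GPUs: 0.9
```

Efficiencies near 1.0 indicate that networking and input pipelines are keeping up; a drop at high GPU counts usually points at communication or storage bottlenecks, which is where the 100 Gbps networking of P3dn.24xlarge helps.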
The broadest global availability
Available AWS Regions for P3 instances include:
• US East (N. Virginia)
• US East (Ohio)
• US West (Oregon)
• Canada (Central)
• Europe (Ireland)
• Europe (Frankfurt)
• Europe (London)
• Asia Pacific (Tokyo)
• Asia Pacific (Seoul)
• Asia Pacific (Sydney)
• Asia Pacific (Singapore)
• China (Beijing)
• China (Ningxia)
• AWS GovCloud (US)
Available AWS regions for P3dn.24xlarge instances include:
• US East (N. Virginia)
• US West (Oregon)
AWS storage options

Amazon S3: secure, durable, highly scalable object storage with fast access and low cost. For long-term durable storage of data in a readily accessible get/put access format. Use as the primary durable and scalable storage for data.

Amazon S3 Glacier: secure, durable, long-term, highly cost-effective object storage. For long-term storage and archival of data that is infrequently accessed. Use for long-term, lower-cost archival of data.

EC2 + EBS: create a Single-AZ shared file system using Amazon EC2 and Amazon EBS, with third-party or open source software (e.g., ZFS, Intel Lustre). For near-line storage of files optimized for high I/O performance. Use for high-IOPS, temporary working storage.

Amazon EFS: highly available, Multi-AZ, fully managed network-attached elastic file system. For near-line, highly available storage of files in a traditional NFS format (NFSv4). Use for read-often, temporary working storage.
Amazon FSx for Lustre
• High-performance file system optimized for
fast processing of workloads such as
machine learning, HPC, video processing,
financial modeling, and electronic design
automation
• Launch and run a file system that provides
submillisecond access to your data
• Enables you to read and write data at
speeds of up to hundreds of gigabytes per
second of throughput and millions of IOPS
Learn more at aws.amazon.com/fsx/lustre
AWS Deep Learning AMI
• Get started quickly with easy-to-launch tutorials
• Hassle-free setup and configuration
• Pay only for what you use—no additional charge for the
AMI
• Accelerate your model training and deployment
• Support for popular deep learning frameworks
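Launching a GPU instance from the Deep Learning AMI can be scripted with boto3's `run_instances` call. The AMI ID and key pair name below are placeholders; look up the current Deep Learning AMI ID for your Region before launching:

```python
# Launching a P3 instance from the AWS Deep Learning AMI with boto3.
# The AMI ID and key pair name are placeholders - substitute values
# for your own account and Region.

launch_params = {
    "ImageId": "ami-0123456789abcdef0",  # placeholder Deep Learning AMI ID
    "InstanceType": "p3.2xlarge",        # 1 x V100; p3dn.24xlarge for 8 x 32 GB V100
    "MinCount": 1,
    "MaxCount": 1,
    "KeyName": "my-key-pair",            # placeholder key pair name
}

# Uncomment to actually launch (requires AWS credentials and boto3):
# import boto3
# ec2 = boto3.client("ec2", region_name="us-east-1")
# response = ec2.run_instances(**launch_params)
# print(response["Instances"][0]["InstanceId"])

print(launch_params["InstanceType"])  # -> p3.2xlarge
```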
Amazon SageMaker:
Build, train, and deploy ML models at scale
(Slide series: SageMaker components, including RL Coach for reinforcement learning, with deployment targets such as Amazon EC2 C5 and AWS IoT Greengrass)
Amazon EC2 G3 instances for
graphics acceleration
AWS G3 GPU instances
• Up to four NVIDIA M60 GPUs
• Includes GRID Virtual Workstation features and licenses; supports up to four monitors with 4096x2160 (4K) resolution
• Includes NVIDIA GRID Virtual Application capabilities for application virtualization software like Citrix XenApp Essentials and VMware Horizon, supporting up to 25 concurrent users per GPU
• Hardware encoding to support up to 10 H.265 (HEVC) 1080p30 streams and up to 18 H.264 1080p30 streams per GPU
• Designed for workloads such as 3D rendering, 3D visualizations, graphics-intensive remote workstations, video encoding, and virtual reality applications

Instance size | GPUs | vCPUs | Memory (GiB) | Linux price per hour (IAD) | Windows price per hour (IAD)
g3s.xlarge | 1 | 4 | 30.5 | $0.75 | $0.93
g3.4xlarge | 1 | 16 | 122 | $1.14 | $1.88
g3.8xlarge | 2 | 32 | 244 | $2.28 | $3.75
g3.16xlarge | 4 | 64 | 488 | $4.56 | $7.50
Four modes of using G3 instances
Example configuration: g3.4xlarge – 1 x M60 GPU, 16 vCPUs, 122 GB memory, up to 10 Gbps network
1. EC2 instance with NVIDIA drivers and libraries – graphics rendering, simulations, video encoding
2. EC2 instance with NVIDIA GRID Virtual Workstation – professional workstation (single user)
3. EC2 instance with NVIDIA GRID Virtual Application – virtual apps (25 concurrent users)
4. EC2 instance with NVIDIA GRID for gaming – gaming services
G3 use cases
M&E – content creation; Auto – car configurators; E&P – analytics
• Seismic analysis, energy E&P, cloud GPU rendering & visualization, such as high-end car configurators, AR/VR
• Desktop and application virtualization
• Productivity and consumer apps
• Design and engineering
• Media and entertainment post-production
• Media and entertainment: video playout/broadcast, encoding/transcoding
• Cloud gaming
AWS G4 GPU instances
• Designed for machine learning inferencing,
video transcoding, remote graphics
workstation, and other demanding graphics
applications.
• Up to 8 NVIDIA T4 Tensor Core GPUs
• 2,560 CUDA cores and 320 Turing Tensor Cores per GPU, including support for ray-tracing technology
• Available in multiple sizes
• AWS-custom Intel CPUs (4–96 vCPUs)
• Available soon
Amazon EC2 F1 instances for
custom hardware acceleration
Parallel processing in FPGAs
An FPGA is effective at processing data of many types in parallel – for example, creating a complex pipeline of parallel, multistage operations on a video stream, or performing massive numbers of dependent or independent calculations for a complex financial model.
• An FPGA does not have an instruction set!
• Data can be any bit width (9-bit integer? No problem!)
• Complex control logic (such as a state machine) is easy to implement in an FPGA
Each FPGA in F1 has more than 2M of these programmable logic cells.
How FPGA acceleration works
In an accelerated application, the CPU handles the general application logic, while the FPGA handles compute-intensive, deeply pipelined, hardware-accelerated operations, described in an HDL such as Verilog:

module filter1 (clock, rst, strm_in, strm_out);
  integer i, j; // index for loops
  always @(posedge clock)
    for (i = 0; i < NUMUNITS; i = i + 1)
      tmp_kernel[j] = k[i*OFFSETX];
F1 FPGA instance types on AWS
▪Up to 8 Xilinx UltraScale+ 16 nm VU9P FPGA devices in a single instance
▪The f1.16xlarge size provides:
▪ 8 FPGAs, each with over 2 million customer-accessible FPGA programmable logic
cells and over 5000 programmable DSP blocks
▪ Each of the 8 FPGAs has 4 DDR-4 interfaces, with each interface accessing a 16
GiB, 72-bit wide, ECC-protected memory
Instance size | FPGAs | FPGA memory (GB) | vCPUs | Instance memory (GB) | NVMe instance storage (GB) | Network bandwidth
f1.2xlarge | 1 | 64 | 8 | 122 | 1 x 470 | Up to 10 Gbps
f1.4xlarge | 2 | 128 | 16 | 244 | 1 x 940 | Up to 10 Gbps
f1.16xlarge | 8 | 512 | 64 | 976 | 4 x 940 | 25 Gbps
Three methods to use F1 instances
1. Hardware engineers/developers
• Developers who are comfortable programming FPGAs
• Use the F1 Hardware Development Kit (HDK) to develop and deploy custom FPGA accelerations using Verilog and VHDL
2. Software engineers/developers
• Developers who are not proficient in FPGA design
• Use OpenCL to create custom accelerations
3. Software engineers/developers
• Developers who are not proficient in FPGA design
• Use pre-built, ready-to-use accelerations available in AWS Marketplace
FPGA acceleration development
An F1 deployment pairs an Amazon Machine Image (AMI) containing the CPU application with an Amazon FPGA Image (AFI) loaded onto the FPGA, connected over PCIe, with DDR controllers and DDR-4 attached memory on the FPGA side.
• Launch the instance and load the AFI
• An F1 instance can have any number of AFIs
• An AFI can be loaded into the FPGA in seconds
Developing custom accelerations
The FPGA Developer AMI
Use Xilinx Vivado and a hardware description language (Verilog or VHDL for RTL) with the HDK to
describe and simulate your FPGA logic
Xilinx Vivado for custom logic development Virtual JTAG for interactive debugging
OpenCL generally available for F1
▪ Familiar development experience to accelerate C/C++
applications
▪ 50+ F1 code examples available that span multiple
domains: security, image processing, and accelerated
algorithms
▪ Already supported on the FPGA Developer AMI, no
need to upgrade/install
AWS Marketplace
Discover, procure, deploy, and manage software in the cloud
Delivering FPGA partner solutions
Partners deliver solutions to customers via Amazon EC2 FPGA deployment through AWS Marketplace: an Amazon Machine Image (AMI) containing the CPU application, paired with an Amazon FPGA Image (AFI).
The AFI is secured, encrypted, and dynamically loaded into the FPGA – it can't be copied or downloaded.
AWS Inferentia
High-performance machine learning inference chip, custom designed by AWS
• Making predictions using a trained machine learning model – a process called inference – can drive as much as 90% of the compute costs of an application.
• AWS Inferentia is a machine learning inference chip designed to deliver high performance at low cost.
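The 90% figure is a statement about cost shares. A worked illustration with hypothetical monthly spend figures, not AWS data:

```python
# Illustration of the "inference can be ~90% of compute cost" point:
# given hypothetical monthly spend on training vs. serving predictions,
# compute inference's share of total ML compute cost.

def inference_share(training_cost, inference_cost):
    """Fraction of total ML compute spend that goes to inference."""
    return inference_cost / (training_cost + inference_cost)

# Hypothetical monthly figures (not AWS data):
training, inference = 1000.0, 9000.0
share = inference_share(training, inference)
print(f"{share:.0%}")  # -> 90%
```

Because serving runs continuously while training runs in bursts, even modest per-inference savings compound, which is the motivation behind a purpose-built inference chip.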
Summary
• Pick the right compute platform for accelerating your application
• You have a choice of compute-optimized CPU platforms, or GPU- and FPGA-accelerated platforms
• We aspire to provide you with the broadest and deepest set of products and services to support your workloads
Recap:
• C5: compute-optimized instances with custom 3.0 GHz Intel Xeon Scalable processors (Skylake) and support for Intel AVX-512 – great for ML inference
• z1d: high-frequency instances with custom Intel Xeon Scalable processors running at sustained 4 GHz all-core turbo
• C5n: fastest networking in the cloud – up to 100 Gbps
Thank you!
Chetan Kapoor
Principal Product Manager
Amazon EC2
Building enterprise solutions with blockchain technology - SVC217 - New York ...Building enterprise solutions with blockchain technology - SVC217 - New York ...
Building enterprise solutions with blockchain technology - SVC217 - New York ...
 
Databases on AWS - The right tool for the right job - ADB203 - Santa Clara AW...
Databases on AWS - The right tool for the right job - ADB203 - Santa Clara AW...Databases on AWS - The right tool for the right job - ADB203 - Santa Clara AW...
Databases on AWS - The right tool for the right job - ADB203 - Santa Clara AW...
 
Train once, deploy anywhere on the cloud and at the edge with Amazon SageMake...
Train once, deploy anywhere on the cloud and at the edge with Amazon SageMake...Train once, deploy anywhere on the cloud and at the edge with Amazon SageMake...
Train once, deploy anywhere on the cloud and at the edge with Amazon SageMake...
 
Setting up custom machine learning environments on AWS - AIM309 - New York AW...
Setting up custom machine learning environments on AWS - AIM309 - New York AW...Setting up custom machine learning environments on AWS - AIM309 - New York AW...
Setting up custom machine learning environments on AWS - AIM309 - New York AW...
 
AWS storage solutions for business-critical applications - STG301 - Chicago A...
AWS storage solutions for business-critical applications - STG301 - Chicago A...AWS storage solutions for business-critical applications - STG301 - Chicago A...
AWS storage solutions for business-critical applications - STG301 - Chicago A...
 
利用 Fargate - 無伺服器的容器環境建置高可用的系統
利用 Fargate - 無伺服器的容器環境建置高可用的系統利用 Fargate - 無伺服器的容器環境建置高可用的系統
利用 Fargate - 無伺服器的容器環境建置高可用的系統
 
Migration to AWS: The foundation for enterprise transformation - SVC210 - New...
Migration to AWS: The foundation for enterprise transformation - SVC210 - New...Migration to AWS: The foundation for enterprise transformation - SVC210 - New...
Migration to AWS: The foundation for enterprise transformation - SVC210 - New...
 
Get hands-on with AWS DeepRacer and compete in the AWS DeepRacer League - AIM...
Get hands-on with AWS DeepRacer and compete in the AWS DeepRacer League - AIM...Get hands-on with AWS DeepRacer and compete in the AWS DeepRacer League - AIM...
Get hands-on with AWS DeepRacer and compete in the AWS DeepRacer League - AIM...
 
Searching for patterns: Log analytics using Amazon ES - ADB205 - New York AWS...
Searching for patterns: Log analytics using Amazon ES - ADB205 - New York AWS...Searching for patterns: Log analytics using Amazon ES - ADB205 - New York AWS...
Searching for patterns: Log analytics using Amazon ES - ADB205 - New York AWS...
 
How SAP customers are benefiting from machine learning and IoT with AWS - MAD...
How SAP customers are benefiting from machine learning and IoT with AWS - MAD...How SAP customers are benefiting from machine learning and IoT with AWS - MAD...
How SAP customers are benefiting from machine learning and IoT with AWS - MAD...
 
Developing serverless applications with .NET using AWS SDK & tools - MAD311 -...
Developing serverless applications with .NET using AWS SDK & tools - MAD311 -...Developing serverless applications with .NET using AWS SDK & tools - MAD311 -...
Developing serverless applications with .NET using AWS SDK & tools - MAD311 -...
 
Optimize deep learning training and inferencing using GPU and Amazon SageMake...
Optimize deep learning training and inferencing using GPU and Amazon SageMake...Optimize deep learning training and inferencing using GPU and Amazon SageMake...
Optimize deep learning training and inferencing using GPU and Amazon SageMake...
 
CI/CD best practices for building modern applications - MAD310 - New York AWS...
CI/CD best practices for building modern applications - MAD310 - New York AWS...CI/CD best practices for building modern applications - MAD310 - New York AWS...
CI/CD best practices for building modern applications - MAD310 - New York AWS...
 
Build a VR experience in 60 minutes - SVC222 - New York AWS Summit
Build a VR experience in 60 minutes - SVC222 - New York AWS SummitBuild a VR experience in 60 minutes - SVC222 - New York AWS Summit
Build a VR experience in 60 minutes - SVC222 - New York AWS Summit
 
Need for Speed – Intro To Real-Time Data Streaming Analytics on AWS | AWS Sum...
Need for Speed – Intro To Real-Time Data Streaming Analytics on AWS | AWS Sum...Need for Speed – Intro To Real-Time Data Streaming Analytics on AWS | AWS Sum...
Need for Speed – Intro To Real-Time Data Streaming Analytics on AWS | AWS Sum...
 
Tech deep dive: Cloud data management with Veeam and AWS - SVC216-S - New Yor...
Tech deep dive: Cloud data management with Veeam and AWS - SVC216-S - New Yor...Tech deep dive: Cloud data management with Veeam and AWS - SVC216-S - New Yor...
Tech deep dive: Cloud data management with Veeam and AWS - SVC216-S - New Yor...
 
The Zen of governance - Establish guardrails and empower builders - SVC201 - ...
The Zen of governance - Establish guardrails and empower builders - SVC201 - ...The Zen of governance - Establish guardrails and empower builders - SVC201 - ...
The Zen of governance - Establish guardrails and empower builders - SVC201 - ...
 

Semelhante a Accelerate ML workloads using EC2 accelerated computing

AWS Compute Evolved Week: Deep Dive on Amazon EC2 Accelerated Computing
AWS Compute Evolved Week: Deep Dive on Amazon EC2 Accelerated ComputingAWS Compute Evolved Week: Deep Dive on Amazon EC2 Accelerated Computing
AWS Compute Evolved Week: Deep Dive on Amazon EC2 Accelerated ComputingAmazon Web Services
 
Deep Dive on Amazon EC2 Accelerated Computing
Deep Dive on Amazon EC2 Accelerated ComputingDeep Dive on Amazon EC2 Accelerated Computing
Deep Dive on Amazon EC2 Accelerated ComputingAmazon Web Services
 
Deep Dive on Amazon EC2 Accelerated Computing
Deep Dive on Amazon EC2 Accelerated ComputingDeep Dive on Amazon EC2 Accelerated Computing
Deep Dive on Amazon EC2 Accelerated ComputingAmazon Web Services
 
Deep Dive on Amazon EC2 Accelerated Computing - AWS Online Tech Talks
Deep Dive on Amazon EC2 Accelerated Computing - AWS Online Tech TalksDeep Dive on Amazon EC2 Accelerated Computing - AWS Online Tech Talks
Deep Dive on Amazon EC2 Accelerated Computing - AWS Online Tech TalksAmazon Web Services
 
Foundations of Amazon EC2 - SRV319
Foundations of Amazon EC2 - SRV319 Foundations of Amazon EC2 - SRV319
Foundations of Amazon EC2 - SRV319 Amazon Web Services
 
[Games on AWS 2019] AWS 입문자를 위한 초단기 레벨업 트랙 | AWS 레벨업 하기! : 컴퓨팅 - 조용진 AWS 솔루션즈...
[Games on AWS 2019] AWS 입문자를 위한 초단기 레벨업 트랙 | AWS 레벨업 하기! : 컴퓨팅 - 조용진 AWS 솔루션즈...[Games on AWS 2019] AWS 입문자를 위한 초단기 레벨업 트랙 | AWS 레벨업 하기! : 컴퓨팅 - 조용진 AWS 솔루션즈...
[Games on AWS 2019] AWS 입문자를 위한 초단기 레벨업 트랙 | AWS 레벨업 하기! : 컴퓨팅 - 조용진 AWS 솔루션즈...Amazon Web Services Korea
 
Amazon EC2 instances: Customizable cloud computing across workloads - DEM20-S...
Amazon EC2 instances: Customizable cloud computing across workloads - DEM20-S...Amazon EC2 instances: Customizable cloud computing across workloads - DEM20-S...
Amazon EC2 instances: Customizable cloud computing across workloads - DEM20-S...Amazon Web Services
 
Optimize your workloads with Amazon EC2 and AMD EPYC - DEM01-R - Atlanta AWS ...
Optimize your workloads with Amazon EC2 and AMD EPYC - DEM01-R - Atlanta AWS ...Optimize your workloads with Amazon EC2 and AMD EPYC - DEM01-R - Atlanta AWS ...
Optimize your workloads with Amazon EC2 and AMD EPYC - DEM01-R - Atlanta AWS ...Amazon Web Services
 
Optimizing your workloads with Amazon EC2 and AMD EPYC processors - DEM01-SR ...
Optimizing your workloads with Amazon EC2 and AMD EPYC processors - DEM01-SR ...Optimizing your workloads with Amazon EC2 and AMD EPYC processors - DEM01-SR ...
Optimizing your workloads with Amazon EC2 and AMD EPYC processors - DEM01-SR ...Amazon Web Services
 
Introducing Amazon EC2 P3 Instance - Featuring the Most Powerful GPU for Mach...
Introducing Amazon EC2 P3 Instance - Featuring the Most Powerful GPU for Mach...Introducing Amazon EC2 P3 Instance - Featuring the Most Powerful GPU for Mach...
Introducing Amazon EC2 P3 Instance - Featuring the Most Powerful GPU for Mach...Amazon Web Services
 
Optimize your workloads with Amazon EC2 and AMD EPYC - DEM03-SR - New York AW...
Optimize your workloads with Amazon EC2 and AMD EPYC - DEM03-SR - New York AW...Optimize your workloads with Amazon EC2 and AMD EPYC - DEM03-SR - New York AW...
Optimize your workloads with Amazon EC2 and AMD EPYC - DEM03-SR - New York AW...Amazon Web Services
 
Foundations of Amazon EC2 - SRV319 - Chicago AWS Summit
Foundations of Amazon EC2 - SRV319 - Chicago AWS SummitFoundations of Amazon EC2 - SRV319 - Chicago AWS Summit
Foundations of Amazon EC2 - SRV319 - Chicago AWS SummitAmazon Web Services
 
AWSome Day Online 2020_Module 2: Getting started with the cloud
AWSome Day Online 2020_Module 2: Getting started with the cloudAWSome Day Online 2020_Module 2: Getting started with the cloud
AWSome Day Online 2020_Module 2: Getting started with the cloudAmazon Web Services
 
Amazon EC2 Foundations - SRV319 - Anaheim AWS Summit
Amazon EC2 Foundations - SRV319 - Anaheim AWS SummitAmazon EC2 Foundations - SRV319 - Anaheim AWS Summit
Amazon EC2 Foundations - SRV319 - Anaheim AWS SummitAmazon Web Services
 
Amazon EC2 Foundations - SRV319 - Toronto AWS Summit
Amazon EC2 Foundations - SRV319 - Toronto AWS SummitAmazon EC2 Foundations - SRV319 - Toronto AWS Summit
Amazon EC2 Foundations - SRV319 - Toronto AWS SummitAmazon Web Services
 
Amazon EC2 Foundations (CMP208-R1) - AWS re:Invent 2018
Amazon EC2 Foundations (CMP208-R1) - AWS re:Invent 2018Amazon EC2 Foundations (CMP208-R1) - AWS re:Invent 2018
Amazon EC2 Foundations (CMP208-R1) - AWS re:Invent 2018Amazon Web Services
 

Semelhante a Accelerate ML workloads using EC2 accelerated computing (20)

AWS Compute Evolved Week: Deep Dive on Amazon EC2 Accelerated Computing
AWS Compute Evolved Week: Deep Dive on Amazon EC2 Accelerated ComputingAWS Compute Evolved Week: Deep Dive on Amazon EC2 Accelerated Computing
AWS Compute Evolved Week: Deep Dive on Amazon EC2 Accelerated Computing
 
Deep Dive on Amazon EC2 Accelerated Computing
Deep Dive on Amazon EC2 Accelerated ComputingDeep Dive on Amazon EC2 Accelerated Computing
Deep Dive on Amazon EC2 Accelerated Computing
 
Deep Dive on Amazon EC2 Accelerated Computing
Deep Dive on Amazon EC2 Accelerated ComputingDeep Dive on Amazon EC2 Accelerated Computing
Deep Dive on Amazon EC2 Accelerated Computing
 
Deep Dive on Amazon EC2 Accelerated Computing - AWS Online Tech Talks
Deep Dive on Amazon EC2 Accelerated Computing - AWS Online Tech TalksDeep Dive on Amazon EC2 Accelerated Computing - AWS Online Tech Talks
Deep Dive on Amazon EC2 Accelerated Computing - AWS Online Tech Talks
 
SRV319 Amazon EC2 Foundations
SRV319 Amazon EC2 FoundationsSRV319 Amazon EC2 Foundations
SRV319 Amazon EC2 Foundations
 
Foundations of Amazon EC2 - SRV319
Foundations of Amazon EC2 - SRV319 Foundations of Amazon EC2 - SRV319
Foundations of Amazon EC2 - SRV319
 
EC2 Foundations - Laura Thomson
EC2 Foundations - Laura ThomsonEC2 Foundations - Laura Thomson
EC2 Foundations - Laura Thomson
 
[Games on AWS 2019] AWS 입문자를 위한 초단기 레벨업 트랙 | AWS 레벨업 하기! : 컴퓨팅 - 조용진 AWS 솔루션즈...
[Games on AWS 2019] AWS 입문자를 위한 초단기 레벨업 트랙 | AWS 레벨업 하기! : 컴퓨팅 - 조용진 AWS 솔루션즈...[Games on AWS 2019] AWS 입문자를 위한 초단기 레벨업 트랙 | AWS 레벨업 하기! : 컴퓨팅 - 조용진 AWS 솔루션즈...
[Games on AWS 2019] AWS 입문자를 위한 초단기 레벨업 트랙 | AWS 레벨업 하기! : 컴퓨팅 - 조용진 AWS 솔루션즈...
 
Amazon EC2 instances: Customizable cloud computing across workloads - DEM20-S...
Amazon EC2 instances: Customizable cloud computing across workloads - DEM20-S...Amazon EC2 instances: Customizable cloud computing across workloads - DEM20-S...
Amazon EC2 instances: Customizable cloud computing across workloads - DEM20-S...
 
Amazon EC2 Foundations
Amazon EC2 FoundationsAmazon EC2 Foundations
Amazon EC2 Foundations
 
Optimize your workloads with Amazon EC2 and AMD EPYC - DEM01-R - Atlanta AWS ...
Optimize your workloads with Amazon EC2 and AMD EPYC - DEM01-R - Atlanta AWS ...Optimize your workloads with Amazon EC2 and AMD EPYC - DEM01-R - Atlanta AWS ...
Optimize your workloads with Amazon EC2 and AMD EPYC - DEM01-R - Atlanta AWS ...
 
Optimizing your workloads with Amazon EC2 and AMD EPYC processors - DEM01-SR ...
Optimizing your workloads with Amazon EC2 and AMD EPYC processors - DEM01-SR ...Optimizing your workloads with Amazon EC2 and AMD EPYC processors - DEM01-SR ...
Optimizing your workloads with Amazon EC2 and AMD EPYC processors - DEM01-SR ...
 
Introducing Amazon EC2 P3 Instance - Featuring the Most Powerful GPU for Mach...
Introducing Amazon EC2 P3 Instance - Featuring the Most Powerful GPU for Mach...Introducing Amazon EC2 P3 Instance - Featuring the Most Powerful GPU for Mach...
Introducing Amazon EC2 P3 Instance - Featuring the Most Powerful GPU for Mach...
 
Optimize your workloads with Amazon EC2 and AMD EPYC - DEM03-SR - New York AW...
Optimize your workloads with Amazon EC2 and AMD EPYC - DEM03-SR - New York AW...Optimize your workloads with Amazon EC2 and AMD EPYC - DEM03-SR - New York AW...
Optimize your workloads with Amazon EC2 and AMD EPYC - DEM03-SR - New York AW...
 
Foundations of Amazon EC2 - SRV319 - Chicago AWS Summit
Foundations of Amazon EC2 - SRV319 - Chicago AWS SummitFoundations of Amazon EC2 - SRV319 - Chicago AWS Summit
Foundations of Amazon EC2 - SRV319 - Chicago AWS Summit
 
AWSome Day Online 2020_Module 2: Getting started with the cloud
AWSome Day Online 2020_Module 2: Getting started with the cloudAWSome Day Online 2020_Module 2: Getting started with the cloud
AWSome Day Online 2020_Module 2: Getting started with the cloud
 
Amazon EC2 Foundations - SRV319 - Anaheim AWS Summit
Amazon EC2 Foundations - SRV319 - Anaheim AWS SummitAmazon EC2 Foundations - SRV319 - Anaheim AWS Summit
Amazon EC2 Foundations - SRV319 - Anaheim AWS Summit
 
Amazon EC2 Foundations - SRV319 - Toronto AWS Summit
Amazon EC2 Foundations - SRV319 - Toronto AWS SummitAmazon EC2 Foundations - SRV319 - Toronto AWS Summit
Amazon EC2 Foundations - SRV319 - Toronto AWS Summit
 
Amazon EC2 Foundations
Amazon EC2 FoundationsAmazon EC2 Foundations
Amazon EC2 Foundations
 
Amazon EC2 Foundations (CMP208-R1) - AWS re:Invent 2018
Amazon EC2 Foundations (CMP208-R1) - AWS re:Invent 2018Amazon EC2 Foundations (CMP208-R1) - AWS re:Invent 2018
Amazon EC2 Foundations (CMP208-R1) - AWS re:Invent 2018
 

Mais de Amazon Web Services

Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...Amazon Web Services
 
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...Amazon Web Services
 
Esegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS FargateEsegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS FargateAmazon Web Services
 
Costruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWSCostruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWSAmazon Web Services
 
Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot Amazon Web Services
 
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...Amazon Web Services
 
OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...Amazon Web Services
 
Microsoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows WorkloadsMicrosoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows WorkloadsAmazon Web Services
 
Database Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatareDatabase Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatareAmazon Web Services
 
Crea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJSCrea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJSAmazon Web Services
 
API moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e webAPI moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e webAmazon Web Services
 
Database Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatareDatabase Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatareAmazon Web Services
 
Tools for building your MVP on AWS
Tools for building your MVP on AWSTools for building your MVP on AWS
Tools for building your MVP on AWSAmazon Web Services
 
How to Build a Winning Pitch Deck
How to Build a Winning Pitch DeckHow to Build a Winning Pitch Deck
How to Build a Winning Pitch DeckAmazon Web Services
 
Building a web application without servers
Building a web application without serversBuilding a web application without servers
Building a web application without serversAmazon Web Services
 
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...Amazon Web Services
 
Introduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container ServiceIntroduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container ServiceAmazon Web Services
 

Mais de Amazon Web Services (20)

Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
 
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
 
Esegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS FargateEsegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS Fargate
 
Costruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWSCostruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWS
 
Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot
 
Open banking as a service
Open banking as a serviceOpen banking as a service
Open banking as a service
 
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
 
OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...
 
Microsoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows WorkloadsMicrosoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows Workloads
 
Computer Vision con AWS
Computer Vision con AWSComputer Vision con AWS
Computer Vision con AWS
 
Database Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatareDatabase Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatare
 
Crea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJSCrea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJS
 
API moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e webAPI moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e web
 
Database Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatareDatabase Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatare
 
Tools for building your MVP on AWS
Tools for building your MVP on AWSTools for building your MVP on AWS
Tools for building your MVP on AWS
 
How to Build a Winning Pitch Deck
How to Build a Winning Pitch DeckHow to Build a Winning Pitch Deck
How to Build a Winning Pitch Deck
 
Building a web application without servers
Building a web application without serversBuilding a web application without servers
Building a web application without servers
 
Fundraising Essentials
Fundraising EssentialsFundraising Essentials
Fundraising Essentials
 
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
 
Introduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container ServiceIntroduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container Service
 

Accelerate ML workloads using EC2 accelerated computing

  • 1. © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved. S U M M I T
Accelerate ML workloads using EC2 accelerated computing
Chetan Kapoor, Principal Product Manager – Amazon EC2
C M P 2 0 2
  • 2. Amazon EC2 instance types
General purpose: M5, T3
Compute optimized: C5, C4
Storage optimized: H1, I3, D2
Memory optimized: X1e, R5
Accelerated computing: F1, P3, G3
  • 3. Choice of processors and architectures* – the right compute for each application and workload
Over 100 EC2 instances featuring Intel Xeon processors, the AWS Graviton Processor based on the 64-bit Arm architecture, and AMD EPYC processors
Additional Amazon EC2 instances featuring NVIDIA GPUs and FPGAs
*Not all processors and architectures are available globally
  • 4. Hardware acceleration for computationally demanding applications
Machine learning: image recognition, natural language processing, speech recognition
High performance computing: computational fluid dynamics, genomics, weather simulation, EDA
Graphics intensive: graphics workstations, video transcoding, game streaming
  • 5. C5: Compute-optimized instances
Custom 3.0 GHz Intel Xeon Scalable processors (Skylake)
Up to 72 vCPUs and 144 GiB of memory (2:1 memory:vCPU ratio)
25 Gbps network bandwidth
Support for Intel AVX-512 – great for ML inference
C5d variant with local NVMe-based SSD storage
Up to 50%* savings over C4; 25% price/performance improvement over C4
"We saw significant performance improvement on Amazon EC2 C5, with up to a 140% performance improvement in open standard CPU benchmarks over C4."
"We are eager to migrate onto the AVX-512 enabled c5.18xlarge instance size… We expect to decrease the processing time of some of our key workloads by more than 30%."
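Since AVX-512 support is what makes C5 attractive for ML inference, one quick sanity check after launching an instance is to look for the AVX-512 flags in /proc/cpuinfo. A minimal sketch, assuming a Linux guest; the parsing helper and sample text below are illustrative, not an AWS-provided tool:

```python
def parse_cpu_flags(cpuinfo_text):
    """Extract the CPU feature flags from /proc/cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        if line.startswith("flags"):
            # Line looks like: "flags : fpu vme ... avx512f avx512dq ..."
            return set(line.split(":", 1)[1].split())
    return set()

def has_avx512(flags):
    """AVX-512F is the foundation subset every AVX-512 CPU exposes."""
    return "avx512f" in flags

if __name__ == "__main__":
    # On a real instance you would read the actual file:
    #   flags = parse_cpu_flags(open("/proc/cpuinfo").read())
    sample = "flags\t\t: fpu vme sse sse2 avx avx2 avx512f avx512dq avx512cd"
    print(has_avx512(parse_cpu_flags(sample)))
```

Inference frameworks such as TensorFlow typically pick up AVX-512 kernels automatically when built for the target CPU, so this check is mainly useful for confirming you landed on the hardware you expect.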
  • 6. C5n: fastest networking in the cloud
100 Gbps network bandwidth on the largest instance sizes; 25 Gbps peak bandwidth on smaller instance sizes
33% larger memory footprint than C5 instances
Featuring Intel Xeon Scalable processors
Faster analytics and big data workloads; lower costs for network-bound workloads
All of the elasticity, security, and scalability of AWS
  • 7. z1d: high frequency for specialized workloads
High-frequency instances with custom Intel Xeon Scalable processors running at a sustained 4 GHz all-core turbo
8:1 GiB-to-vCPU ratio
Up to 25 Gbps network bandwidth and up to 1.8 TB of local NVMe storage
Use cases: electronic design automation, relational databases, gaming
6 sizes, from z1d.large up to z1d.12xlarge (48 vCPUs, 384 GiB)
  • 8. CPUs vs. GPUs vs. FPGAs vs. ASICs for compute
CPU: 10s–100s of processing cores; pre-defined instruction set and datapath widths; optimized for general-purpose computing
GPU: 1,000s of processing cores; pre-defined instruction set and datapath widths; highly effective at parallel execution
FPGA: millions of programmable digital logic cells; no predefined instruction set or datapath widths; hardware-timed execution
ASIC: optimized, custom design for a particular use/function; predefined software experience exposed through an API
(Diagram: simplified CPU and GPU block layouts showing control logic, ALUs, cache, and DRAM)
  • 9. EC2 accelerated computing instances
P3: GPU compute instance – up to 8 NVIDIA V100 GPUs in a single instance, with NVLink for peer-to-peer GPU communication; supports a wide variety of use cases including deep learning, HPC simulations, financial computing, and batch rendering
G3: GPU graphics instance – up to 4 NVIDIA M60 GPUs, with GRID Virtual Workstation features and licenses; designed for workloads such as 3D rendering, 3D visualizations, graphics-intensive remote workstations, video encoding, and virtual reality applications
F1: FPGA instance – up to 8 Xilinx Virtex UltraScale+ VU9P FPGAs in a single instance, programmable via VHDL, Verilog, or OpenCL, with a growing marketplace of pre-built application accelerations; designed for hardware-accelerated applications including financial computing, genomics, accelerated search, and image processing
AWS Inferentia: high-performance machine learning inference chip, custom designed by AWS for lower cost-per-inference across the full range of ML applications
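Choosing among these families ultimately comes down to an instance-type string at launch time. A minimal sketch of building a launch request for a P3 instance; the AMI ID is a placeholder, and the commented-out call assumes the standard boto3 EC2 client:

```python
def p3_launch_params(ami_id, size="p3.2xlarge"):
    """Build the keyword arguments for an EC2 RunInstances call."""
    return {
        "ImageId": ami_id,      # e.g. an AWS Deep Learning AMI in your region
        "InstanceType": size,   # p3.2xlarge / p3.8xlarge / p3.16xlarge
        "MinCount": 1,
        "MaxCount": 1,
    }

params = p3_launch_params("ami-00000000000000000")
# With boto3 (not imported here), the actual launch would be:
#   ec2 = boto3.client("ec2")
#   ec2.run_instances(**params)
print(params["InstanceType"])
```

The same request shape works for G3 and F1: only the `InstanceType` string and the AMI (for example, an FPGA Developer AMI for F1) change.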
  • 10. Amazon EC2 P3 instances for compute acceleration
  • 11. Amazon EC2 P3 instances (October 2017) – one of the fastest, most powerful GPU instances in the cloud
Up to eight NVIDIA Tesla V100 GPUs
1 PetaFLOP of computational performance – up to 14x better than P2
300 GB/s GPU-to-GPU communication (NVLink) – 9x better than P2
16 GB of GPU memory with 900 GB/s peak GPU memory bandwidth
  • 12. Use cases for P3 instances
Machine learning/AI: natural language processing, image and video recognition, autonomous vehicle systems, recommendation systems
High performance computing: computational fluid dynamics, financial and data analytics, weather simulation, computational chemistry
  • 13. The machine learning process
  Business problem – ML problem framing → data collection → data integration → data preparation & cleaning → data visualization & analysis → feature engineering → model training & parameter tuning → model evaluation
  Are business goals met? If yes: model deployment, then monitoring & debugging – predictions. If no: iterate via data augmentation, feature augmentation, and re-training.
  • 14. Training machine learning models
  AlexNet, 2012
  • A large, deep convolutional neural network with five convolutional layers, 60 million parameters, and 650,000 neurons
  • Created by Alex Krizhevsky, Ilya Sutskever, and Geoffrey Hinton
  • Won the 2012 ILSVRC (ImageNet Large-Scale Visual Recognition Challenge)
  • Used two NVIDIA GTX 580 GPUs
  • Took nearly a week to train!
  Source - https://papers.nips.cc/paper/4824-imagenet-classification-with-deep-convolutional-neural-networks.pdf
  • 15. AWS P3 vs. P2 instance GPU performance comparison
  • P2 instances use the K80 accelerator (Kepler architecture)
  • P3 instances use the V100 accelerator (Volta architecture)
  [Charts: FP32, FP64, and mixed/FP16 performance (TFLOPS) for K80, P100, and V100 – 1.7x FP32, 2.6x FP64, and 14x mixed-precision over the K80's max FP32 performance]
  • 16. P3 instance details
  Instance size | GPUs | GPU peer to peer | vCPUs | Memory (GB) | Network bandwidth | Amazon EBS bandwidth | On-Demand price/hr.* | 1-yr RI effective hourly* | 3-yr RI effective hourly*
  P3.2xlarge | 1 | No | 8 | 61 | Up to 10 Gbps | 1.7 Gbps | $3.06 | $1.99 (35% disc.) | $1.23 (60% disc.)
  P3.8xlarge | 4 | NVLink | 32 | 244 | 10 Gbps | 7 Gbps | $12.24 | $7.96 (35% disc.) | $4.93 (60% disc.)
  P3.16xlarge | 8 | NVLink | 64 | 488 | 25 Gbps | 14 Gbps | $24.48 | $15.91 (35% disc.) | $9.87 (60% disc.)
  Regional availability: P3 instances are generally available in AWS US East (Northern Virginia), US East (Ohio), US West (Oregon), EU (Ireland), Asia Pacific (Seoul), Asia Pacific (Tokyo), AWS GovCloud (US), and China (Beijing) Regions
  Framework support: P3 instances and their V100 GPUs are supported across all major frameworks (such as TensorFlow, MXNet, PyTorch, Caffe2, and CNTK)
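The RI discount percentages quoted in the table can be checked directly against the On-Demand and effective hourly rates on the slide. A small sketch in Python (using only the prices from the table above):

```python
# Verify the Reserved Instance discounts implied by the P3 pricing table.
# All prices are the On-Demand and effective-hourly RI rates from the slide.
prices = {
    "p3.2xlarge":  {"od": 3.06,  "ri_1yr": 1.99,  "ri_3yr": 1.23},
    "p3.8xlarge":  {"od": 12.24, "ri_1yr": 7.96,  "ri_3yr": 4.93},
    "p3.16xlarge": {"od": 24.48, "ri_1yr": 15.91, "ri_3yr": 9.87},
}

def discount_pct(on_demand: float, effective: float) -> int:
    """Percentage discount of an effective hourly rate vs. On-Demand, rounded."""
    return round((1 - effective / on_demand) * 100)

for size, p in prices.items():
    print(size,
          f"1-yr RI: {discount_pct(p['od'], p['ri_1yr'])}% off,",
          f"3-yr RI: {discount_pct(p['od'], p['ri_3yr'])}% off")
```

Each size works out to roughly the 35% (1-yr) and 60% (3-yr) discounts stated on the slide.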
  • 17. P3 instance details (continued)
  • P3 instances provide GPU-to-GPU data transfer over NVLink
  • P2 instances provided GPU-to-GPU data transfer over PCI Express
  • 18. New larger P3 size – P3dn.24xlarge
  Optimized for distributed ML training
  • One of the most powerful GPU instances available in the cloud
  • 100 Gbps of networking throughput
  • 96 vCPUs using AWS custom Skylake CPUs and 768 GB of system memory
  • Based on NVIDIA's latest Tesla V100 GPU with 32 GB of memory
  Instance size | GPUs | GPU memory | GPU peer to peer | vCPUs | CPU type | Memory (GB) | Network bandwidth | Amazon EBS bandwidth | Local instance storage
  P3.2xlarge | 1 x V100 | 16 GB/GPU | No | 8 | Broadwell | 61 | Up to 10 Gbps | 1.7 Gbps | NA
  P3.8xlarge | 4 x V100 | 16 GB/GPU | NVLink | 32 | Broadwell | 244 | 10 Gbps | 7 Gbps | NA
  P3.16xlarge | 8 x V100 | 16 GB/GPU | NVLink | 64 | Broadwell | 488 | 25 Gbps | 14 Gbps | NA
  P3dn.24xlarge | 8 x V100 | 32 GB/GPU | NVLink | 96 | Skylake | 768 | 100 Gbps | 14 Gbps | 2 TB NVMe
  • Latest NVIDIA V100 GPU with 32 GB of memory for large models and higher batch sizes
  • 96 Skylake vCPUs with support for AVX-512 instructions for pre-processing of training data
  • 100 Gbps of networking throughput for large-scale distributed training & fast data access
  • 19. Scaling performance using distributed training
  [Chart: training throughput on P3 instances (ResNet-50, ImageNet, images/second), scaling from 1 to 64 GPUs]
  • Using a single P3 instance with Volta GPUs, customers can cut the training times of their machine learning models from days to a few hours.
  • Using distributed training across multiple P3 instances, with high performance networking and storage solutions, customers can further cut their time-to-train from hours to minutes.
  • Example – We have been able to train ResNet-50 to a Top-1 validation accuracy of 76% in 14 minutes using a cluster of P3.16xlarge instances.
  https://aws.amazon.com/blogs/machine-learning/scalable-multi-node-deep-learning-training-using-gpus-in-the-aws-cloud/
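A common way to read a chart like the one above is to compute scaling efficiency: measured throughput divided by perfect linear scaling from the single-GPU baseline. A minimal sketch, with purely hypothetical throughput figures (the real numbers depend on the model, framework, and network configuration):

```python
# Scaling efficiency of distributed training relative to linear scaling.
def scaling_efficiency(throughputs: dict) -> dict:
    """Map GPU count -> % of perfect linear scaling from the 1-GPU baseline.

    `throughputs` maps number of GPUs to measured images/second.
    """
    base = throughputs[1]  # single-GPU throughput is the baseline
    return {n: round(t / (base * n) * 100, 1) for n, t in throughputs.items()}

# Hypothetical ResNet-50 throughputs (images/second) -- illustrative only.
measured = {1: 800.0, 8: 6000.0, 64: 44000.0}
print(scaling_efficiency(measured))
```

Near-100% values at high GPU counts are what "near-linear scaling" means in practice; the high-bandwidth NVLink and 100 Gbps networking on P3/P3dn instances are what keep that number from collapsing as the cluster grows.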
  • 20. The broadest global availability
  Available AWS regions for P3 instances include: US East (N. Virginia), US East (Ohio), US West (Oregon), Canada (Central), Europe (Ireland), Europe (Frankfurt), Europe (London), Asia Pacific (Tokyo), Asia Pacific (Seoul), Asia Pacific (Sydney), Asia Pacific (Singapore), China (Beijing), China (Ningxia), and AWS GovCloud (US)
  Available AWS regions for P3dn.24xlarge instances include: US East (N. Virginia) and US West (Oregon)
  • 21. AWS storage options
  Amazon S3 – Secure, durable, highly scalable object storage with fast access and low cost. For long-term durable storage of data in a readily accessible get/put access format. Primary durable and scalable storage for data.
  Amazon S3 Glacier – Secure, durable, long-term, highly cost-effective object storage. For long-term storage and archival of data that is infrequently accessed. Use for long-term, lower-cost archival of data.
  EC2+EBS – Create a Single-AZ shared file system using Amazon EC2 and Amazon EBS, with third-party or open source software (e.g., ZFS, Intel Lustre, etc.). For near-line storage of files optimized for high I/O performance. Use for high-IOPS, temporary working storage.
  Amazon EFS – Highly available, Multi-AZ, fully managed network-attached elastic file system. For near-line, highly available storage of files in a traditional NFS format (NFSv4). Use for read-often, temporary working storage.
  • 22. Amazon FSx for Lustre
  • High-performance file system optimized for fast processing of workloads such as machine learning, HPC, video processing, financial modeling, and electronic design automation
  • Launch and run a file system that provides submillisecond access to your data
  • Enables you to read and write data at speeds of up to hundreds of gigabytes per second of throughput and millions of IOPS
  Learn more at aws.amazon.com/fsx/lustre
  • 23. AWS Deep Learning AMI
  • Get started quickly with easy-to-launch tutorials
  • Hassle-free setup and configuration
  • Pay only for what you use – no additional charge for the AMI
  • Accelerate your model training and deployment
  • Support for popular deep learning frameworks
  • 24. Amazon SageMaker: Build, train, and deploy ML models at scale
  • 26. Amazon SageMaker: Build, train, and deploy ML models at scale (RL Coach)
  • 28. Amazon SageMaker: Build, train, and deploy ML models at scale (AWS IoT Greengrass, Amazon EC2 C5)
  • 30. Amazon EC2 G3 instances for graphics acceleration
  • 31. AWS G3 GPU instances
  • Up to four NVIDIA M60 GPUs
  • Includes GRID Virtual Workstation features and licenses; supports up to four monitors with 4096x2160 (4K) resolution
  • Includes NVIDIA GRID virtual application capabilities for application virtualization software like Citrix XenApp Essentials and VMware Horizon, supporting up to 25 concurrent users per GPU
  • Hardware encoding to support up to 10 H.265 (HEVC) 1080p30 streams and up to 18 H.264 1080p30 streams per GPU
  • Designed for workloads such as 3D rendering, 3D visualizations, graphics-intensive remote workstations, video encoding, and virtual reality applications
  Instance size | GPUs | vCPUs | Memory (GiB) | Linux price per hour (IAD) | Windows price per hour (IAD)
  g3s.xlarge | 1 | 4 | 30.5 | $0.75 | $0.93
  g3.4xlarge | 1 | 16 | 122 | $1.14 | $1.88
  g3.8xlarge | 2 | 32 | 244 | $2.28 | $3.75
  g3.16xlarge | 4 | 64 | 488 | $4.56 | $7.50
  • 32. Four modes of using G3 instances
  Example configuration: G3.4xlarge – 16 vCPUs, 1 x M60 GPU, 122 GB memory, up to 10G network
  1. EC2 instance with NVIDIA drivers & libraries – graphics rendering, simulations, video encoding
  2. EC2 instance with NVIDIA GRID virtual workstation – professional workstation (single user)
  3. EC2 instance with NVIDIA GRID virtual application – virtual apps (25 concurrent users)
  4. EC2 instance with NVIDIA GRID for gaming – gaming services
  • 33. G3 use cases
  • Seismic analysis, energy E&P, cloud GPU rendering & visualization, such as high end car configurators, AR/VR
  • Desktop and application virtualization
  • Productivity and consumer apps
  • Design and engineering
  • Media and entertainment post-production
  • Media and entertainment: video playout/broadcast, encoding/transcoding
  • Cloud gaming
  Examples: M&E – content creation; Auto – car configurators; E&P – analytics
  • 34. AWS G4 GPU instances
  • Designed for machine learning inference, video transcoding, remote graphics workstations, and other demanding graphics applications
  • Up to 8 NVIDIA T4 Tensor Core GPUs
  • 2560 CUDA cores and 320 Turing Tensor Cores per GPU, including support for ray-tracing technology
  • Available in multiple sizes
  • AWS-custom Intel CPUs (4–96 vCPUs)
  • Available soon
  • 35. Amazon EC2 F1 instances for custom hardware acceleration
  • 36. Parallel processing in FPGAs
  An FPGA is effective at processing data of many types in parallel: for example, creating a complex pipeline of parallel, multistage operations on a video stream, or performing massive numbers of dependent or independent calculations for a complex financial model.
  • An FPGA does not have an instruction set!
  • Data can be any bit-width (9-bit integer? No problem!)
  • Complex control logic (such as a state machine) is easy to implement in an FPGA
  Each FPGA in F1 has more than 2M programmable logic cells
  • 37. How FPGA acceleration works
  The FPGA handles compute-intensive, deeply pipelined, hardware-accelerated operations; the CPU handles the rest of the application.
  Illustrative Verilog fragment from the slide:
  module filter1 (clock, rst, strm_in, strm_out)
  integer i,j; //index for loops
  for (i=0; i<NUMUNITS; i=i+1)
  always@(posedge clock)
  tmp_kernel[j] = k[i*OFFSETX];
  • 38. F1 FPGA instance types on AWS
  • Up to 8 Xilinx UltraScale+ 16 nm VU9P FPGA devices in a single instance
  • The f1.16xlarge size provides:
    • 8 FPGAs, each with over 2 million customer-accessible FPGA programmable logic cells and over 5,000 programmable DSP blocks
    • Each of the 8 FPGAs has 4 DDR-4 interfaces, with each interface accessing a 16 GiB, 72-bit wide, ECC-protected memory
  Instance size | FPGAs | FPGA memory (GB) | vCPUs | Instance memory (GB) | NVMe instance storage (GB) | Network bandwidth
  f1.2xlarge | 1 | 64 | 8 | 122 | 1 x 470 | Up to 10 Gbps
  f1.4xlarge | 2 | 128 | 16 | 244 | 1 x 940 | Up to 10 Gbps
  f1.16xlarge | 8 | 512 | 64 | 976 | 4 x 940 | 25 Gbps
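The FPGA memory column follows directly from the per-FPGA DDR-4 configuration described above (4 interfaces x 16 GiB per FPGA). A quick sketch to check the arithmetic against the table:

```python
# Each F1 FPGA has 4 DDR-4 interfaces, each attached to 16 GiB of memory,
# so per-FPGA memory is 4 * 16 = 64 GiB; instance totals scale with FPGA count.
DDR4_INTERFACES_PER_FPGA = 4
GIB_PER_INTERFACE = 16

def fpga_memory_gib(num_fpgas: int) -> int:
    """Total FPGA-attached DDR-4 memory for an F1 instance size."""
    return num_fpgas * DDR4_INTERFACES_PER_FPGA * GIB_PER_INTERFACE

for size, fpgas in {"f1.2xlarge": 1, "f1.4xlarge": 2, "f1.16xlarge": 8}.items():
    print(size, fpga_memory_gib(fpgas), "GiB")
```

This reproduces the 64/128/512 figures in the table above.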
  • 39. Three methods to use F1 instances
  1. Hardware engineers/developers
  • Developers who are comfortable programming FPGAs
  • Use the F1 Hardware Development Kit (HDK) to develop and deploy custom FPGA accelerations using Verilog and VHDL
  2. Software engineers/developers
  • Developers who are not proficient in FPGA design
  • Use OpenCL to create custom accelerations
  3. Software engineers/developers
  • Developers who are not proficient in FPGA design
  • Use pre-built, ready-to-use accelerations available in AWS Marketplace
  • 40. FPGA acceleration development
  Launch an EC2 F1 instance from an Amazon Machine Image (AMI) and load an Amazon FPGA Image (AFI); the application runs on the CPU while the FPGA (with PCIe, DDR controllers, and DDR-4 attached memory) runs the acceleration.
  • An F1 instance can have any number of AFIs
  • An AFI can be loaded into the FPGA in seconds
  • 41. Developing custom accelerations – the FPGA Developer AMI
  • Use Xilinx Vivado and a hardware description language (Verilog or VHDL for RTL) with the HDK to describe and simulate your FPGA logic
  • Xilinx Vivado for custom logic development
  • Virtual JTAG for interactive debugging
  • 42. OpenCL generally available for F1
  • Familiar development experience to accelerate C/C++ applications
  • 50+ F1 code examples available that span multiple domains: security, image processing, and accelerated algorithms
  • Already supported on the FPGA Developer AMI – no need to upgrade/install
  • 43. AWS Marketplace – Discover, procure, deploy, and manage software in the cloud
  • 44. Delivering FPGA partner solutions – Amazon EC2 FPGA deployment via AWS Marketplace
  Customers launch an Amazon Machine Image (AMI) whose application runs on the CPU alongside an Amazon FPGA Image (AFI).
  The AFI is secured, encrypted, and dynamically loaded into the FPGA – it can't be copied or downloaded.
  • 45. AWS Inferentia
  High-performance machine learning inference chip, custom designed by AWS
  • Making predictions using a trained machine learning model, a process called inference, can drive as much as 90% of the compute costs of the application.
  • AWS Inferentia is a machine learning inference chip designed to deliver high performance at low cost.
  • 46. Summary
  • Pick the right compute platform for accelerating your application
  • You have a choice of compute-optimized CPU platforms, GPU-accelerated platforms, or FPGA-accelerated platforms
  • We aspire to provide you with the broadest and deepest set of products and services to support your workload
  Recap:
  • High frequency instances with custom Intel Xeon Scalable processors running at a sustained 4 GHz all-core turbo; fastest networking in the cloud (up to 100 Gbps)
  • Compute-optimized instances with custom 3.0 GHz Intel Xeon Scalable Processors (Skylake) and support for Intel AVX-512 – great for ML inference
  • 47. EC2 accelerated computing instances (recap)
  P3: GPU compute instance
  • Up to 8 NVIDIA V100 GPUs in a single instance, with NVLink for peer-to-peer GPU communication
  • Supports a wide variety of use cases including deep learning, HPC simulations, financial computing, and batch rendering
  G3: GPU graphics instance
  • Up to 4 NVIDIA M60 GPUs, with GRID Virtual Workstation features and licenses
  • Designed for workloads such as 3D rendering, 3D visualizations, graphics-intensive remote workstations, video encoding, and virtual reality applications
  F1: FPGA instance
  • Up to 8 Xilinx Virtex UltraScale+ VU9P FPGAs in a single instance; programmable via VHDL, Verilog, or OpenCL, with a growing marketplace of pre-built application accelerations
  • Designed for hardware-accelerated applications including financial computing, genomics, accelerated search, and image processing
  AWS Inferentia: ML inference chip
  • High-performance machine learning inference chip, custom designed by AWS
  • Designed for lower cost-per-inference across the full range of ML applications
  • 48. Thank you! Chetan Kapoor, Principal Product Manager, Amazon EC2