3. We are in the Era of Big Data
Google processes 20 PB a day (2008)
Wayback Machine has 3 PB + 100 TB/month (3/2009)
eBay has 6.5 PB of user data + 50 TB/day (5/2009)
Facebook has 36 PB of user data + 80-90 TB/day (6/2010)
CERN’s LHC: 15 PB a year (any day now)
LSST: 6-10 PB a year (~2015)
From http://www.umiacs.umd.edu/~jimmylin/
4. Who are the “Big Data” Practitioners?
Data analysts
Report generation, data mining, ad optimization, …
Computational scientists
Computational biology, economics, journalism, …
Statisticians and machine-learning researchers
Systems researchers, developers, and testers
Distributed systems, networking, security, …
You!
5. Practitioners want a MAD System
Magnetic system
Users want to get fresh new data into the system quickly
Data may be of multiple formats, with missing fields, etc.
Agile system and analytics
Change (data, workload, needs) is constant, make it easy
Complex data gathering & processing pipelines (real-time)
Deep analytics
Sophisticated aggregation/statistical analysis
Users want to use interfaces they are familiar with or the best available: SQL, MapReduce, Java, Python, R, …
6. Hadoop is as MAD as it gets!
Magnetic:
Load data into HDFS as files
Load first, ask questions later
Agile:
Hadoop is extremely malleable: pluggable data formats, storage engines/filesystems, scheduler, instrumentation, …
Not just a querying tool: supports the end-to-end data pipeline
Built for elastic computing: fine-grained scheduler, highly fault tolerant, dynamic node addition and dropping
Deep:
Well integrated with programming languages
MapReduce is a powerful programming model, plus other interfaces (Pig Latin, HiveQL, JAQL) on top
7. MAD + Good Performance
Users want good performance without having to understand and tune system internals
Performance is multidimensional: time, cost, scalability
Learn from the troubled history of database tuning
Tuning a MAD system is highly challenging
Data is opaque until it is accessed
Data loaded/accessed as files (vs. organized DB stores)
MapReduce programs pose different challenges than SQL
Simpler in some ways, more complex in others
Heavy use of programming languages (e.g., Java/Python)
Elasticity is wonderful, but hard to achieve (Hadoop has many useful mechanisms, but policies are lacking)
Terabyte-scale data cycles
8. The Starfish Philosophy
Goal: A high-performance MAD system
Build on Hadoop’s strengths
Hadoop is MAD & has a rapidly growing user base
How can users get good performance automatically?
Without having to understand & tune system internals
Recall: Perf. is multidimensional (time, cost, scalability)
9. Starfish: Self-Tuning System
Our goal: Provide good performance automatically
NOT our goal: Improve Hadoop’s peak performance
[Architecture stack: clients (Java Client, Pig, Hive, Oozie, Elastic MR, …) on top of the analytics system; Starfish sits between the analytics system and Hadoop (MapReduce Execution Engine over a Distributed File System).]
11. Lifecycle of a MapReduce Job
Map function
Reduce function
Run this program as a MapReduce job
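The map/reduce contract on this slide can be sketched in plain Python, with an in-memory shuffle standing in for Hadoop's distributed one (a toy word-count illustration, not the Hadoop API):

```python
from collections import defaultdict

def map_fn(line):
    # Map: emit (word, 1) for every word in an input line.
    for word in line.split():
        yield (word, 1)

def reduce_fn(word, counts):
    # Reduce: sum all counts for one key.
    return (word, sum(counts))

def run_job(lines):
    # Shuffle: group map outputs by key (Hadoop does this across the cluster).
    groups = defaultdict(list)
    for line in lines:
        for key, value in map_fn(line):
            groups[key].append(value)
    return dict(reduce_fn(k, v) for k, v in groups.items())

print(run_job(["big data big deal", "big cluster"]))
# → {'big': 3, 'data': 1, 'deal': 1, 'cluster': 1}
```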
13. Lifecycle of a MapReduce Job
[Timeline: input splits feed map tasks (waves 1 and 2), whose output feeds reduce tasks (waves 1 and 2).]
How are the number of splits, number of map and reduce tasks, memory allocation to tasks, etc., determined?
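For the first of these, a simplified sketch of how the number of splits (and hence map tasks) falls out of input size and block size; real Hadoop also honors configurable min/max split sizes:

```python
import math

# Default HDFS block size assumed here: 64 MB, the common default in
# Hadoop releases of this era (newer releases default to 128 MB).
def num_splits(input_bytes, block_bytes=64 * 1024 * 1024):
    # Hadoop creates roughly one input split per HDFS block,
    # and schedules one map task per split.
    return max(1, math.ceil(input_bytes / block_bytes))

# A 1 GB input with 64 MB blocks yields 16 splits, hence 16 map tasks.
print(num_splits(1024 * 1024 * 1024))  # → 16
```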
16. Challenges faced by Practitioners
• Joe Public can now provision a 100-node Hadoop cluster in minutes. Joe may need answers to:
– How many reduce tasks to use in MapReduce job J for getting the best perf. on my 8-node production cluster?
– My current cluster needs more than 6 hours to process 1 day's worth of data. I want to reduce that to under 3 hours.
• Are the MapReduce job workflows running optimally?
• How many and what type of Amazon EC2 nodes to use?
17. [Workflow diagram: datasets Users (username, age, ipaddr), GeoInfo (ipaddr, region), and Clicks (username, url, value) are copied in from Amazon S3 storage. Jobs filter Clicks on value > 0, partition Users by age into <20, ≤25, ≤35, >35, and join the datasets; downstream jobs count users per age <20, users per region with age > 25, clicks per url type (after filtering age > 35 and url of type "Sports"), clicks per region and age, and clicks per age. Outputs I-VI are written back to S3 storage.]
18. Performance Vs. Price Tradeoff
[Bar charts: execution time (sec, 0-12000) and cost ($0.00-$6.00) for clusters of 2, 4, and 6 nodes, across node types m1.small, m1.large, and m1.xlarge on Amazon EC2.]
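The time/cost tradeoff behind these charts can be framed with a small cost model. Prices and runtimes below are made up for illustration (not actual EC2 rates or measurements); EC2 billed per instance-hour at the time, rounding partial hours up:

```python
import math

def job_cost(exec_time_sec, num_nodes, price_per_node_hour):
    # EC2 bills per instance-hour; partial hours round up.
    hours = math.ceil(exec_time_sec / 3600)
    return hours * num_nodes * price_per_node_hour

# Hypothetical numbers: bigger nodes can finish faster yet cost more overall.
small = job_cost(10000, 4, 0.10)  # ~2.8 h on 4 cheap nodes
large = job_cost(4000, 4, 0.40)   # ~1.1 h on 4 pricier nodes
print(round(small, 2), round(large, 2))  # → 1.2 3.2
```

So the fastest configuration is not automatically the cheapest, which is why performance here is treated as multidimensional.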
22. Job Configuration Parameters
Over 190 parameters
Many affect performance in complex ways
Impact depends on Job, Data, and Cluster properties
23. Current Approaches
Rules of thumb
mapred.reduce.tasks = 0.9 * number_of_reduce_slots
io.sort.record.percent = 16 / (16 + average_record_size)
Rules of thumb may not suffice
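The two heuristics above can be wrapped in a tiny calculator; both parameter names are real Hadoop settings, while the cluster numbers below are illustrative:

```python
def rules_of_thumb(num_reduce_slots, avg_record_size_bytes):
    # The two rule-of-thumb formulas from the slide, as Hadoop settings.
    return {
        "mapred.reduce.tasks": int(0.9 * num_reduce_slots),
        "io.sort.record.percent": 16.0 / (16 + avg_record_size_bytes),
    }

# Illustrative cluster: 100 reduce slots, 48-byte average map-output records.
print(rules_of_thumb(100, 48))
# → {'mapred.reduce.tasks': 90, 'io.sort.record.percent': 0.25}
```

Note that neither formula looks at the job's actual behavior, which is exactly why rules of thumb may not suffice.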
24. Just-in-Time Job Optimization
Just-in-Time Optimizer
Searches through the space of parameter settings
Profiler
Collects information about MapReduce job executions
Sampler
Collects statistics about input, intermediate, and output
key-value spaces of MapReduce jobs
What-if Engine
Uses mix of simulation and model-based estimation
Code is ready for release! Demo after the talk
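A toy flavor of what-if estimation, using a hypothetical Amdahl-style model (not Starfish's actual cost models) to answer "what if I change the number of reduce tasks?":

```python
def whatif_runtime(base_time_sec, base_reducers, new_reducers, parallel_frac=0.7):
    # Hypothetical model: only the parallel fraction of the job's
    # measured time shrinks as reduce tasks are added.
    parallel = base_time_sec * parallel_frac  # work that scales with reducers
    serial = base_time_sec - parallel         # fixed overhead (setup, merge, ...)
    return serial + parallel * base_reducers / new_reducers

# "What if I double the reducers on a 1000-second job?"
print(whatif_runtime(1000, base_reducers=8, new_reducers=16))  # → 650.0
```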
25. Job Profiler
Dynamic instrumentation
Monitors phases of MapReduce job execution
Benefits
Zero overhead when turned off
Works with unmodified MapReduce programs
Used to construct a job profile
Concise representation of job execution
Allows for in-depth analysis of job behavior
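One way to picture a "concise representation of job execution": a per-phase record of timings and data sizes. Field names here are made up for this sketch; the actual profile schema is richer:

```python
# Illustrative job profile, as collected by dynamic instrumentation.
profile = {
    "map":     {"time_sec": 120.0, "input_mb": 1024, "output_mb": 512},
    "shuffle": {"time_sec": 45.0,  "moved_mb": 512},
    "reduce":  {"time_sec": 80.0,  "output_mb": 64},
}

def total_time(p):
    # In-depth analysis starts from simple aggregates like this one.
    return sum(phase["time_sec"] for phase in p.values())

print(total_time(profile))  # → 245.0
```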
26. Insights from Job Profiles
WordCount A: few, large spills; combiner gave high data reduction; combiner made Mappers CPU bound
WordCount B: many, small spills; combiner gave smaller data reduction; better resource utilization in Mappers
27. Estimates from the What-if Engine
[Figure: the true response surface vs. the surface estimated by the What-if Engine.]
29. MapReduce Job Workflows
Producer-Consumer relationships among jobs
Data layout crucial for later jobs in the workflow
Avoid unbalanced data layouts
Make effective use of parallelism
30. Workflow-Aware Optimizer
Goal: Optimize overall performance of workflow
Select best data layout + job parameters
Overall approach is the same as the job-level optimizer's, but with a larger space of options
We hope to support Amazon Elastic MapReduce in the near future (summer project for a Duke undergrad)
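Selecting a data layout plus job parameters can be pictured as a search over a configuration space. A grid-search sketch with a made-up cost model (the real optimizer must handle far larger spaces than exhaustive enumeration allows):

```python
import itertools

def best_config(candidates, estimate):
    # Grid search: score every (layout, reducers) pair with a cost
    # estimator and keep the cheapest.
    return min(candidates, key=estimate)

layouts = ["unpartitioned", "partition_by_age"]
reducers = [8, 16, 32]

def estimate(cfg):
    # Made-up cost model: partitioned layouts help downstream joins;
    # more reducers add per-task overhead.
    layout, r = cfg
    base = 1000.0 if layout == "unpartitioned" else 700.0
    return base / r + 5.0 * r

print(best_config(list(itertools.product(layouts, reducers)), estimate))
# → ('partition_by_age', 16)
```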
34. Starfish: Self-Tuning System
Focus simultaneously on
Different workload granularities
Workload
Workflows
Jobs (procedural & declarative)
Across various decision points
Provisioning
Optimization
Scheduling
Data layout
We welcome your collaboration!