SlideShare uma empresa Scribd logo
1 de 55
Baixar para ler offline
Peter Aiken, Ph.D.
Data Modeling Fundamentals
• DAMA International President 2009-2013
• DAMA International Achievement Award 2001 (with
Dr. E. F. "Ted" Codd
• DAMA International Community Award 2005
Peter Aiken, Ph.D.
• 33+ years in data management
• Repeated international recognition
• Founder, Data Blueprint (datablueprint.com)
• Associate Professor of IS (vcu.edu)
• DAMA International (dama.org)
• 10 books and dozens of articles
• Experienced w/ 500+ data
management practices
• Multi-year immersions:

– US DoD (DISA/Army/Marines/DLA)

– Nokia

– Deutsche Bank

– Wells Fargo

– Walmart

– … PETER AIKEN WITH JUANITA BILLINGS
FOREWORD BY JOHN BOTTEGA
MONETIZING
DATA MANAGEMENT
Unlocking the Value in Your Organization’s
Most Important Asset.
The Case for the
Chief Data Officer
Recasting the C-Suite to Leverage
Your MostValuable Asset
Peter Aiken and
Michael Gorman
Copyright 2018 by Data Blueprint Slide #
Confidential and Proprietary. Do not distribute without Couchbase consent. © Couchbase 2017. All rights reserved.
Data Modeling with
Couchbase
Anuj Sahni| Director Product Marketing
April 2018
AGENDA 1. Couchbase Data Platform Architecture
2. Data Modeling with JSON
2
Data Modeling Approaches
NoSQL
Relaxed Normalization
schema implied by structure
fields may be empty, duplicate, or missing
Relational
Required Normalization
schema enforced by DB
same fields in all records
• Minimize data inconsistencies (one item = one
location)
• Reduced duplicated data
• Preserve storage resources
• Optimized based on access patterns
• Flexible, based on application requirements
• Supports clustered architecture
• Reduced server overhead
Couchbase
Data
Platform
Develop with Agility.
Deploy at any scale.
Couchbase - The Data Platform Architecture
5
COUCHBASE LITE SYNC GATEWAY COUCHBASE SERVER
Lightweight embedded NoSQL database with
full CRUD and
query functionality.
Secure web gateway with
synchronization, data access, and data
integration APIs for accessing,
integrating, and synchronizing data
over the web.
Highly scalable, highly available,
high performance NoSQL
database server.
Client Middle Tier StorageWAN LAN
Security
Built-in enterprise level security throughout the entire stack includes user authentication, user and role based data access control (RBAC), secure transport (TLS),
and 256-bit AES full database encryption.
Couchbase Server Cluster Service Deployment
STORAGE
Couchbase Server 1
SHARD
7
SHARD
9
SHARD
5
SHARDSHARDSHARD
Managed Cache
Cluster
ManagerCluster
Manager
Managed Cache
Storage
Data
Service STORAGE
Couchbase Server 2
Managed Cache
Cluster
ManagerCluster
Manager
Data
Service STORAGE
Couchbase Server 3
SHARD
7
SHARD
9
SHARD
5
SHARDSHARDSHARD
Managed Cache
Cluster
ManagerCluster
Manager
Data
Service STORAGE
Couchbase Server 4
SHARD
7
SHARD
9
SHARD
5
SHARDSHARDSHARD
Managed Cache
Cluster
ManagerCluster
Manager
Query
Service STORAGE
Couchbase Server 5
SHARD
7
SHARD
9
SHARD
5
SHARDSHARDSHARD
Managed Cache
Cluster
ManagerCluster
Manager
Query
Service STORAGE
Couchbase Server 6
SHARD
7
SHARD
9
SHARD
5
SHARDSHARDSHARD
Managed Cache
Cluster
ManagerCluster
Manager
Index
Service
Managed Cache
Storage
Managed Cache
Storage Storage
STORAGE
Couchbase Server 7
SHARD
7
SHARD
9
SHARD
5
SHARDSHARDSHARD
Managed Cache
Cluster
ManagerCluster
Manager
Index
Service
Storage
Managed Cache Managed Cache
SDK SDK
Managed Cache
Storage
Managed Cache
Storage
Data Modeling with JSON
Properties of Real-World Data
• Rich structure
• Attributes, Sub-structure
• Relationships
• To other data
• Value evolution
• Data is updated
• Structure evolution
• Data is reshaped
Customer
Name
DOB
Billing
Connections
Purchases
Modeling Data in Relational World
Billing
ConnectionsPurchases
Contacts
Customer
 Rich structure
 Normalize & JOIN Queries
 Relationships
 JOINS and Constraints
 Value evolution
 INSERT, UPDATE, DELETE
 Structure evolution
 ALTER TABLE
 Application Downtime
 Application Migration
 Application Versioning
JSON 101
{
"Name" : "Jane Smith",
"DOB" : "1990-01-30",
"Billing" : [
{
"type" : "visa",
"cardnum" : "5827-2842-2847-3909",
"expiry" : "2019-03"
},
{
"type" : "master",
"cardnum" : "6274-2842-2847-3909",
"expiry" : "2019-03"
}
],
"address" :
{
"Street" : "10, Downing Street",
"City" : "San Francico",
"State" : "California",
"zip" :94401
}
}
• Used to represent object data in text
• Representation
• "Key":"Value"
• Data Types:
• Number, Strings, Boolean, objects,
Arrays, NULL
• Hierarchical
Flexibility from JSON
{
"Name" : "Jane Smith",
"DOB" : "1990-01-30",
"Billing" : [
{
"type" : "visa",
"cardnum" : "5827-2842-2847-3909",
"expiry" : "2019-03"
},
{
"type" : "master",
"cardnum" : "6274-2842-2847-3909",
"expiry" : "2019-03"
}
],
"address" :
{
"Street" : "10, Downing Street",
"City" : "San Francico",
"State" : "California",
"zip" :94401
}
}
• Document is self describing
• Fields can be added or can be missing
• Data types can change
• Arrays give you flexibility in number of
items in an attribute
Using JSON to Store Data
{
"Name" : "Jane Smith",
"DOB" : "1990-01-30",
"Billing" : [
{
"type" : "visa",
"cardnum" : "5827-2842-2847-3909",
"expiry" : "2019-03"
},
{
"type" : "master",
"cardnum" : "6274-2842-2847-3909",
"expiry" : "2019-03"
}
],
"Connections" : [
{
"CustId" : "XYZ987",
"Name" : "Joe Smith"
},
{
"CustId" : "PQR823",
"Name" : "Dylan Smith"
}
{
"CustId" : "PQR823",
"Name" : "Dylan Smith"
}
],
"Purchases" : [
{ "id":12, item: "mac", "amt": 2823.52 }
{ "id":19, item: "ipad2", "amt": 623.52 }
]
}
CustomerID Name DOB
CBL2015 Jane Smith 1990-01-30
CustomerID Type Cardnum Expiry
CBL2015 visa 5827… 2019-03
CBL2015 master 6274… 2018-12
CustomerID ConnId Name
CBL2015 XYZ987 Joe Smith
CBL2015 SKR007 Sam Smith
CustomerID item amt
CBL2015 mac 2823.52
CBL2015 ipad2 623.52
CustomerID ConnId Name
CBL2015 XYZ987 Joe Smith
CBL2015 SKR007 Sam Smith
Contacts
Customer
Billing
ConnectionsPurchases
Models for Representing Data
Data Concern Relational Model
JSON Document Model
(NoSQL)
Rich Structure
 Multiple flat tables
 Constant assembly / disassembly
 Documents
 No assembly required!
Relationships
 Represented
 Queried (SQL)
 Represented
 N1QL (support ANSI JOIN)
Value Evolution  Data can be updated  Data can be updated
Structure Evolution
 Uniform and rigid
 Manual change (disruptive)
 Flexible
 Dynamic change
!3Copyright 2018 by Data Blueprint Slide #
Data Modeling Fundamentals
• Data Management Overview
• Motivation
– of Systems/components
– Data is a not well understood substructure
• Why data modeling & what is it?
– Model represents our understanding of the
– Fundamental, foundational system
characteristics
– Shared between system and human
• Fundamentals
– The power of the purpose statement
– Understanding data centric thinking
– Data modeling compliments other architecture/
engineering techniques, as well as
– Challenges beyond data modeling
• Take Aways, References & Q&A






UsesUsesReuses
What is data management?
!4Copyright 2018 by Data Blueprint Slide #
Sources


Data
Engineering


Data 

Delivery


Data

Storage
Specialized Team Skills
Data Governance
Understanding the current
and future data needs of an
enterprise and making that
data effective and efficient in
supporting 

business activities


Aiken, P, Allen, M. D., Parker, B., Mattia, A., 

"Measuring Data Management's Maturity: 

A Community's Self-Assessment" 

IEEE Computer (research feature April 2007)
Data management practices connect
data sources and uses in an
organized and efficient manner
• Engineering
• Storage
• Delivery
• Governance
When executed, 

engineering, storage, and 

delivery implement governance
Note: does not well-depict data reuse






















What is data management?
!5Copyright 2018 by Data Blueprint Slide #
Sources


Data
Engineering


Data 

Delivery


Data

Storage
More Specialized Team Skills


Resources

(optimized for reuse)

Data Governance
AnalyticInsight
!6Copyright 2018 by Data Blueprint Slide #
You can accomplish
Advanced Data Practices
without becoming proficient
in the Foundational Data
Management Practices
however this will:
• Take longer
• Cost more
• Deliver less
• Present 

greater

risk

(with thanks to Tom DeMarco)
Data Management Practices Hierarchy
Advanced 

Data 

Practices
• MDM
• Mining
• Big Data
• Analytics
• Warehousing
• SOA
Foundational Data Management Practices
Data Platform/Architecture
Data Governance Data Quality
Data Operations
Data Management Strategy
Technologies
Capabilities
Copyright 2018 by Data Blueprint Slide # !7
DMM℠ Structure of 

5 Integrated 

DM Practice Areas
Data architecture
implementation
Data 

Governance
Data 

Management

Strategy
Data 

Operations
Platform

Architecture
Supporting

Processes
Maintain fit-for-purpose data,
efficiently and effectively
!8Copyright 2018 by Data Blueprint Slide #
Manage data coherently
Manage data assets professionally
Data life cycle
management
Organizational support
Data 

Quality
Data Strategy is often
the weakest link
Data architecture
implementation
Data 

Governance
Data 

Management

Strategy
Data 

Operations
Platform

Architecture
Supporting

Processes
Maintain fit-for-purpose data,
efficiently and effectively
!9Copyright 2018 by Data Blueprint Slide #
Manage data coherently
Manage data assets professionally
Data life cycle
management
Organizational support
Data 

Quality
3 3
33
1
Data Management
Body of
Knowledge
!10Copyright 2018 by Data Blueprint Slide #
Data
Management
Functions
DAMA DM
BoK: Data
Development
!11Copyright 2018 by Data Blueprint Slide #
from The DAMA Guide to the Data Management Body of Knowledge © 2009 by DAMA International
Architecture: here, whether you like it or not
12Copyright 2018 by Data Blueprint Slide #
deviantart.com
• All organizations have
architectures
– Some are better
understood and
documented (and
therefore more useful
to the organization)
than others
Data
Architecture







and







Data Models
!13Copyright 2018 by Data Blueprint Slide #
http://www.architecturalcomponentsinc.com
• Architecture is higher level of abstraction
– Understanding/integration focused
• Models more downward facing
– Implementation/detail focused
Models are literally the translation 

between systems and people
!14Copyright 2018 by Data Blueprint Slide #
Data Modeling Fundamentals
• Data Management Overview
• Motivation
– of Systems/components
– Data is a not well understood substructure
• Why data modeling & what is it?
– Model represents our understanding of the
– Fundamental, foundational system
characteristics
– Shared between system and human
• Fundamentals
– The power of the purpose statement
– Understanding data centric thinking
– Data modeling compliments other architecture/
engineering techniques, as well as
– Challenges beyond data modeling
• Take Aways, References & Q&A
Data Models are about ...
• Things that someone cares

to keep information about
– Entities: persons, places, things
• The characteristics of the things
– Attributes: color, size, sequence

media code, product descriptions, quantity ordered
• How the entitles interact
– Relationships: accomplished

by cooperating (sharing key 

information)



An order is placed by one 

and only one customer
!15Copyright 2018 by Data Blueprint Slide #
What do we teach knowledge workers about data?
!16Copyright 2018 by Data Blueprint Slide #
What percentage of the deal with it daily?
What do we teach IT professionals about data?
!17Copyright 2018 by Data Blueprint Slide #
• 1 course
– How to build a
new database
• What
impressions do IT
professionals get
from this
education?
– Data is a technical
skill that is needed
when developing
new databases
• Slender, elegant and graceful
• World's 3rd longest suspension span
• Opened on July 1st, collapsed in a windstorm on
November 7,1940
• "The most dramatic failure in 

bridge engineering history"
• Changed forever how engineers 

design suspension bridges leading 

to safer spans today.
Tacoma Narrows Bridge/Gallopin' Gertie
!18Copyright 2018 by Data Blueprint Slide #
!19Copyright 2018 by Data Blueprint Slide #
Similarly data failures cost organizations
minimally 20-40% of their IT budget
Repeat 100s, thousands, millions of times ...
!20Copyright 2018 by Data Blueprint Slide #
Death by 1000 Cuts
!21Copyright 2018 by Data Blueprint Slide #
• How does maltreated data cost money?
• Consider the opposite question:
– Were your systems explicitly designed to 

be integrated or otherwise work together?
– If not then what is the likelihood that they 

will work well together?
• Organizations spend 20-40% of their IT

budget evolving data - including:
– Data migration
• Changing the location from one place to another
– Data conversion
• Changing data into another form, state, or product
– Data improving
• Inspecting and manipulating, or re-keying data to prepare it for 

subsequent use - John Zachman
Lack of data coherence is a hidden expense
!22
PETER AIKEN WITH JUANITA BILLINGS
FOREWORD BY JOHN BOTTEGA
MONETIZING
DATA MANAGEMENT
Unlocking the Value in Your Organization’s
Most Important Asset.
Copyright 2018 by Data Blueprint Slide #
Bad Data Decisions Spiral
!23Copyright 2018 by Data Blueprint Slide #
Bad data decisions
Technical deci-
sion makers are not
data knowledgable
Business decision
makers are not
data knowledgable
Poor organizational outcomes
Poor treatment of
organizational data
assets
Poor

quality

data
!24Copyright 2018 by Data Blueprint Slide #
Data Modeling Fundamentals
• Data Management Overview
• Motivation
– of Systems/components
– Data is a not well understood substructure
• Why data modeling & what is it?
– Model represents our understanding of the
– Fundamental, foundational system
characteristics
– Shared between system and human
• Fundamentals
– The power of the purpose statement
– Understanding data centric thinking
– Data modeling compliments other architecture/
engineering techniques, as well as
– Challenges beyond data modeling
• Take Aways, References & Q&A
How much data,

by the minute!
For the entirety of 2017,
every minute of every day:
• (almost) Seventy
thousand hours of Netflix
• (almost) a half million
tweets
• 15+ million texts
• 3.5+ million google
searches
• 103+ million email spams
!25Copyright 2018 by Data Blueprint Slide #
https://www.domo.com/learn/data-never-sleeps-5
!26Copyright 2018 by Data Blueprint Slide #
As articulated by Micheline Casey
There will
never be less
data than
right now!
USS Midway
& Pancakes
What is this excellent
engineering example?
• It is tall
• It has a clutch
• It was built in 1942
• It is still in regular use!
!27Copyright 2018 by Data Blueprint Slide #
You cannot architect after implementation!
!28Copyright 2018 by Data Blueprint Slide #
Good Engineering/
Architectural
Foundation?
!29Copyright 2018 by Data Blueprint Slide #
Poor Foundation =
!30Copyright 2018 by Data Blueprint Slide #
Unsuitable

for

Further

Investment
Data Modeling Definition
• Modeling = Analysis and design
method used to
– Define and analyze data requirements
– Design data structures that support these
requirements
• Model = set of data specifications
and related diagrams that reflect
requirements and designs
– Representation of something in our
environment
– Employs standardized text/symbols to
represent data attributes (grouped into
data elements) and the relationships
among them
– Integrated collection of specifications and
related diagrams that represent data
requirements and design
!31Copyright 2018 by Data Blueprint Slide #
from The DAMA Guide to the Data Management Body of Knowledge © 2009 by DAMA International
Data Modeling
• Modeling = complex process involving interaction
between people and with technology that don’t
compromise the integrity or security of the data
– Good data models accurately 

express and effectively communicate 

data requirements and 

quality solution design
• Modeling approach 

(guided by 2 formulas):
– Purpose + audience = deliverables
– Deliverables + resources + time = approach
!32Copyright 2018 by Data Blueprint Slide #
from The DAMA Guide to the Data Management Body of Knowledge © 2009 by DAMA International
from The DAMA Guide to the Data Management Body of Knowledge © 2009 by DAMA International
Data Models Facilitate
• Formalization
– Data model documents a single, 

precise definition of data requirements 

and data-related business rules
• Communication
– Data model is a bridge to understanding data 

between people with different levels and types of experience.
– Helps understand business area, existing application, or impact of
modifying an existing structure
– May also facilitate training new business and/or technical staff
• Scope
– Data model can help explain the data concept and scope of
purchased application packages
!33Copyright 2018 by Data Blueprint Slide #
ANSI-SPARK 3-Layer Schema
!34
For example, a changeover to a new
DBMS technology. The database
administrator should be able to change
the conceptual or global structure of the
database without affecting the users.
1. Conceptual - Allows independent
customized user views:
– Each should be able to access the same
data, but have a different customized
view of the data.
2. Logical - This hides the physical
storage details from users:
– Users should not have to deal with
physical database storage details. They
should be allowed to work with the data
itself, without concern for how it is
physically stored.
3. Physical - The database administrator
should be able to change the
database storage structures without
affecting the users’ views:
– Changes to the structure of an
organization's data will be required. The
internal structure of the database should
be unaffected by changes to the physical
aspects of the storage.
Copyright 2018 by Data Blueprint Slide #
Families of Modeling Notation Variants
!35Copyright 2018 by Data Blueprint Slide #
Eventually One, More
Eventually One
Exactly One
Zero, or More
One or More
Zero or One
Information Engineering
Pick one!
What is a Relationship?
• Natural associations between two or more entities
!36Copyright 2018 by Data Blueprint Slide #
Ordinality & Cardinality
• Defines mandatory/optional relationships using minimum/
maximum occurrences from one entity to another
!37Copyright 2018 by Data Blueprint Slide #
An order is
placed by one
and only one
customer
A customer
places zero
or more
orders
A product is contained on zero
or more orders
An order
contains at least
one or more
products
Q: What is the proper relationship for these entities?
!38Copyright 2018 by Data Blueprint Slide #
A: a relationship for these entities
!39Copyright 2018 by Data Blueprint Slide #
Eventually One, More
Eventually One
Exactly One
Zero, or More
One or More
Zero or One
Q: What is an Attribute?
!40Copyright 2018 by Data Blueprint Slide #
A: Attribute Definition
• Attributes describe an entity and attribute values describe
“instances of business things”
!41Copyright 2018 by Data Blueprint Slide #
Rigid Data Structure
!42Copyright 2018 by Data Blueprint Slide #
Person Job Class
Position
BR1) One EMPLOYEE
can be associated with one
PERSON
BR2) One EMPLOYEE can be
associated with one POSITION
Manual

Job Sharing
Manual

Moon Lighting
Employee
Flexible data structure
!43Copyright 2018 by Data Blueprint Slide #
Person Job Class
Employee Position
BR1) Zero, one, or more
EMPLOYEES can be associated
with one PERSON
BR2) Zero, one, or more EMPLOYEES
can be associated with one POSITION
Job Sharing
Moon Lighting
Everyone Shares Understanding
!44Copyright 2018 by Data Blueprint Slide #
Data structures must be specified prior
software development/acquisition
(Requires 2 structural loops more
than the more flexible data structure)
More flexible data structure Less flexible data structure
Understanding
• Definition:
– 'Understanding an architecture'
– Documented and articulated as a digital blueprint
illustrating the 

commonalities and 

interconnections 

among the 

architectural 

components
– Ideally the understanding 

is shared by systems and humans
!45Copyright 2018 by Data Blueprint Slide #
Modeling Procedures
1. Identify entities
2. Identify key for each
entity
3. Draw rough draft of
entity relationship
data model
4. Identify data
attributes
5. Map data attributes
to entities
!46Copyright 2018 by Data Blueprint Slide #
Models Evolution is good, at first ...
!47Copyright 2018 by Data Blueprint Slide #
Preliminary
activities
Modeling
cycles
Wrapup
activities
Evidence
collection &
analysis
Project
coordination
requirements
Target
system
analysis
Modeling
cycle
focus
Activity
Refinement
Collection
Analysis
Validation
Declining coordination requirements
Increasing amounts of targetsystem analysis
Preliminary
activities
Modeling
cycles
Wrapup
activities
Evidence
collection &
analysis
Project
coordination
requirements
Target
system
analysis
Modeling
cycle
focus
Activity
Refinement
Collection
Analysis
Validation
Declining coordination requirements
Increasing amounts of targetsystem analysis
Preliminary
activities
Modeling
cycles
Wrapup
activities
Evidence
collection &
analysis
Project
coordination
requirements
Target
system
analysis
Modeling
cycle
focus
Activity
Refinement
Collection
Analysis
Validation
Declining coordination requirements
Increasing amounts of targetsystem analysis
Preliminary
activities
Modeling
cycles
Wrapup
activities
Evidence
collection &
analysis
Project
coordination
requirements
Target
system
analysis
Modeling
cycle
focus
Activity
Refinement
Collection
Analysis
Validation
Declining coordination requirements
Increasing amounts of targetsystem analysis
Relative use of time allocated to tasks during Modeling
Preliminary
activities
Modeling
cycles
Wrapup
activities
Evidence
collection &
analysis
Project
coordination
requirements
Target
system
analysis
Modeling
cycle
focus
Activity
Refinement
Collection
Analysis
Validation
Declining coordination requirements
Increasing amounts of targetsystem analysis
!48Copyright 2018 by Data Blueprint Slide #
Don’t Tell Them You Are Modeling!
!49
• Just write some stuff down
• Then arrange it
• Then make some appropriate
connections between your
objects
Copyright 2018 by Data Blueprint Slide #
!50Copyright 2018 by Data Blueprint Slide #
Data Modeling Fundamentals
• Data Management Overview
• Motivation
– of Systems/components
– Data is a not well understood substructure
• Why data modeling & what is it?
– Model represents our understanding of the
– Fundamental, foundational system
characteristics
– Shared between system and human
• Fundamentals
– The power of the purpose statement
– Understanding data centric thinking
– Data modeling compliments other architecture/
engineering techniques, as well as
– Challenges beyond data modeling
• Take Aways, References & Q&A
Each model has a purpose
!51Copyright 2018 by Data Blueprint Slide #
Data Models are Developed in Response to Organizational Needs
!

 !

!

!

!52Copyright 2018 by Data Blueprint Slide #
Organizational Needs
become instantiated 

and integrated into an 

Data Models
Informa(on)System)
Requirements
authorizes and 

articulates
satisfyspecificorganizationalneeds
Standard definition reporting does not provide conceptual context
!53Copyright 2018 by Data Blueprint Slide #
Bed
Something you sleep in
Bed

Entity: BED
Purpose: This is a substructure within the room

substructure of the facility location. It 

contains information about beds within rooms.
Attributes: Bed.Description

Bed.Status

Bed.Sex.To.Be.Assigned

Bed.Reserve.Reason
Associations: >0-+ Room
Status: Validated
Keep them focused on data model purpose
!54
• The reason we are locked in
this room is to:
– Mission: Understand formal
relationship between soda and
customer
• Outcome: Walk out the door with a
data model this relationship
– Mission: Understand the
characteristics that differ
between our hospital beds
• Outcome: We will walk out the door
when we identify the top three traits that
represent the brand.
– Mission: Could our systems
handle the following business
rule tomorrow?
– "Is job-sharing permitted?"
• Outcomes: Confirm that it is possible to
staff a position with multiple employees
effective tomorrow
selects and pays forgiven to
Soda
Customer
selects
can be filled by zero or 1
Employee Position
has exactly 1
How does our
perspective change: 

the primary means of
tracking a patient
Copyright 2018 by Data Blueprint Slide #
Entity: BED
Data Asset Type: Principal Data Entity
Purpose: This is a substructure within the room

substructure of the facility location. It contains 

information about beds within rooms.
Source: Maintenance Manual for File and Table

Data (Software Version 3.0, Release 3.1)
Attributes: Bed.Description

Bed.Status

Bed.Sex.To.Be.Assigned

Bed.Reserve.Reason
Associations: >0-+ Room
Status: Validated
The Power of the Purpose Statement
!55Copyright 2018 by Data Blueprint Slide #
• A purpose statement describing
why the organization is
maintaining information about
this business concept
• Sources of information about it
• A partial list of the attributes or
characteristics of the entity
• Associations with other data
items; this one is read as "One
room contains zero or many
beds"
Data Modeling
Example #1
!56Copyright 2018 by Data Blueprint Slide #
from The DAMA Guide to the Data Management
Body of Knowledge © 2009 by DAMA International
Primary
deliverables
become reference
material
Model Purpose Statement:

This model codifies the official 

vocabulary to be used when 

describing aspects of any of the 

following organizational concepts:

– Subscriber

– Account

– Charge

– Bill
Data Modeling Example #2
fuel
rent-rate
phone-rate
phone-call
rental
agreement
customer
auto
repair
history
phone-unit
Source: Chikofsky 1990
Interpretations:
1. Car rental company
2. Rental agreement is central
3. No direct connection between
customer and contract
4. Contract must have a customer
5. Nothing structural prevents
autos from being rented to
multiple customers
6. Phone units are tied to rentals
!57Copyright 2018 by Data Blueprint Slide #
Model Purpose Statement:

This model codifies the official 

vocabulary to be used when 

describing aspects of any of the 

following organizational concepts:

– fuel

– customer

– auto

– rental agreement

– rent-rate

– phone-call

– phone-rate

– phone-unit

– repair history
It is documentation shown

during the on-

boarding process
Data Modeling
Example #3
salesperson
name
commission
rate
invoice # amount date paid
customer
name
addresscustomer #dateorder #
pricequantityorder #item #
quantity
on hand
descriptionsupplieritem # cost
SALESPERSON
INVOICE
ORDER
CATALOG
LINE ITEM
!58Copyright 2018 by Data Blueprint Slide #
• Sales commission-based pricing information
• Difficult to change a customer address
• Easy to implement variable pricing - difficult to implement
standard pricing - is standard pricing implemented
• Sales person information is not directly tied to the order
• Price not included in the catalog
• Do sales people sell things that are shipped quickly so they get
their commission quicker?
• Nothing prohibits a sales from having multiple
sales persons
• Multiple invoices are allowed for a single order
• Partial shipment is allowed
• Data base cannot tell what part of an order the
invoice pertains to
Model Purpose Statement:

This model codifies the official 

vocabulary and specific 

operational rules to be used when 

describing aspects of any of the 

following organizational concepts:
– salesperson

– invoice

– order

– line item

– catalog
!59
DISPOSITION Data Map
Copyright 2018 by Data Blueprint Slide #
Model Purpose Statement:

This model codifies the official 

vocabulary to be used when 

describing disposition related organizational concepts:

– user

– admission

– discharge

– encounter

– facility

– provider

– diagnosis
Data Model #4: DISPOSITION
• At least one but possibly more system USERS enter the
DISPOSITION facts into the system.
• An ADMISSION is associated with one and only one
DISCHARGE.
• An ADMISSION is associated with zero or more
FACILITIES.
• An ADMISSION is associated with zero or more
PROVIDERS.
• An ADMISSION is associated with one or more
ENCOUNTERS.
• An ENCOUNTER may be recorded by a system USER.
• An ENCOUNTER may be associated with a PROVIDER.
• An ENCOUNTER may be associated with one or more
DIAGNOSES.
• At least one but possibly more system USERS enter the
DISPOSITION facts into the system.
• An ADMISSION is associated with one and only one
DISCHARGE.
• An ADMISSION is associated with zero or more
FACILITIES.
• An ADMISSION is associated with zero or more
PROVIDERS.
• An ADMISSION is associated with one or more
ENCOUNTERS.
• An ENCOUNTER may be recorded by a system USER.
• An ENCOUNTER may be associated with a PROVIDER.
• An ENCOUNTER may be associated with one or more
DIAGNOSES.
!60
ADMISSION Contains information about patient admission
history related to one or more inpatient episodes
DIAGNOSIS Contains the International Disease Classification
(IDC) of code representation and/or description
of a patient's health related to an inpatient code
DISCHARGE A table of codes describing disposition types
available for an inpatient at a FACILITY
ENCOUNTER Tracking information related to inpatient
episodes
FACILITY File containing a list of all facilities in regional
health care system
PROVIDER Full name of a member of the FACILITY team
providing services to the patient
USER Any user with access to create, read, update,
and delete DISPOSITION data
Copyright 2018 by Data Blueprint Slide #
ADMISSION Contains information about patient admission
history related to one or more inpatient episodes
DIAGNOSIS Contains the International Disease Classification
(IDC) of code representation and/or description
of a patient's health related to an inpatient code
DISCHARGE A table of codes describing disposition types
available for an inpatient at a FACILITY
ENCOUNTER Tracking information related to inpatient
episodes
FACILITY File containing a list of all facilities in regional
health care system
PROVIDER Full name of a member of the FACILITY team
providing services to the patient
USER Any user with access to create, read, update,
and delete DISPOSITION data
ADMISSION Contains information about patient admission
history related to one or more inpatient episodes
DIAGNOSIS Contains the International Disease Classification
(IDC) of code representation and/or description
of a patient's health related to an inpatient code
DISCHARGE A table of codes describing disposition types
available for an inpatient at a FACILITY
ENCOUNTER Tracking information related to inpatient
episodes
FACILITY File containing a list of all facilities in regional
health care system
PROVIDER Full name of a member of the FACILITY team
providing services to the patient
USER Any user with access to create, read, update,
and delete DISPOSITION data
Death must be a disposition code!
Two Brilliant Einstein Quotes
• "The significant
problems we
face cannot be
solved at the
same level of
thinking we were
at when we
created them."
– Albert Einstein
!61Copyright 2018 by Data Blueprint Slide #
IT Project or Application-Centric Development
Original articulation from Doug Bagley @ Walmart
!62Copyright 2018 by Data Blueprint Slide #
Data/
Information
IT

Projects


Strategy
• In support of strategy, organizations
implement IT projects
• Data/information are typically
considered within the scope of IT
projects
• Problems with this approach:
– Ensures data is formed to the
applications and not around the
organizational-wide information
requirements
– Process are narrowly formed around
applications
– Very little data reuse is possible
Data-Centric Development
Original articulation from Doug Bagley @ Walmart
!63Copyright 2018 by Data Blueprint Slide #
IT

Projects
Data/

Information


Strategy
• In support of strategy, the organization
develops specific, shared data-based
goals/objectives
• These organizational data goals/
objectives drive the development of
specific IT projects with an eye to
organization-wide usage
• Advantages of this approach:
– Data/information assets are developed from an
organization-wide perspective
– Systems support organizational data needs and
compliment organizational process flows
– Maximum data/information reuse
theDataDoctrine.com
We are uncovering better ways of developing

IT systems by doing it and helping others do it.

Through this work we have come to value:



Data programmes preceding software development
Stable data structures preceding stable code
Shared data preceding completed software
Data reuse preceding reusable code

!64Copyright 2018 by Data Blueprint Slide #
theDataDoctrine.com
We are uncovering better ways of developing

IT systems by doing it and helping others do it.

Through this work we have come to value:

Data programmes preceding software development
Stable data structures preceding stable code
Shared data preceding completed software
Data reuse preceding reusable code
!65Copyright 2018 by Data Blueprint Slide #


That is, while there is value in the items on

the right, we value the items on the left more.
• "Everything should be
made as simple as
possible, but no
simpler."
– Albert Einstein
Two Brilliant Einstein Quotes
!66Copyright 2018 by Data Blueprint Slide #
Typically Managed Architectures
• Process Architecture
– Arrangement of inputs -> transformations = value -> outputs
– Typical elements: Functions, activities, workflow, events, cycles, products, procedures
• Systems Architecture
– Applications, software components, interfaces, projects
• Business Architecture
– Goals, strategies, roles, organizational structure, location(s)
• Security Architecture
– Arrangement of security controls relation to IT Architecture
• Technical Architecture/Tarchitecture
– Relation of software capabilities/technology stack
– Structure of the technology infrastructure of an enterprise, solution or system
– Typical elements: Networks, hardware, software platforms, standards/protocols
• Data/Information Architecture
– Arrangement of data assets supporting organizational strategy
– Typical elements: specifications expressed as entities, relationships, attributes,
definitions, values, vocabularies
!67Copyright 2018 by Data Blueprint Slide #
As Is Information

Requirements

Assets
As Is Data Design Assets As Is Data Implementation 

Assets
ExistingNew
Modeling in Various Contexts
O2 Recreate

Data Design
Reverse Engineering
Forward engineering
O5 Reconstitute

Requirements
O9
Reimplement
Data
To Be Data 

Implementation 

Assets
O8 

Redesign

Data
O4

Recon-

stitute

Data 

Design
O3 Recreate

Requirements
O6
Redesign
Data
To Be

Design 

Assets
O7 Re-

develop

Require-

ments
To Be
Requirements
Assets
O1 Recreate Data

Implementation
Metadata
!68Copyright 2018 by Data Blueprint Slide #
Information Architecture Component Reengineering Options
O-1 data implementation (e.g., by recreating descriptions of implemented file
layouts);
O-2 data designs (e.g., by recreating the logical system design layouts); or
O-3 information requirements (e.g., by recreating existing system specifications and
business rules).
O-4 data design assets by examining the existing data implementation (when
appropriate O-1 can facilitate O-4); and
O-5 system information requirements by reverse engineering the data design O-4.
(Note: if the data design doesn't exist O-4 must precede O-5.)
O-6 transforming as is data design assets, yielding improved to be data designs that
are based on reconstituted data design assets produced by O-2 or O-4 and
(possibly O-1);
O-7 transforming as is system requirements into to be system requirements that are
based on reconstituted system requirements produced by O-3 or O-5 and
(possibly O-2);
O-8 redesigning to be data design assets using the to be system requirements
based on reconstituted system requirements produced by O-7; and
O-9 re-implementing system data based on data redesigns produced by O-6 or O-8.
!69Copyright 2018 by Data Blueprint Slide #
Model Evolution Framework
!70Copyright 2018 by Data Blueprint Slide #
Conceptual Logical Physical






Goal
Validated
Not Validated
Every change can
be mapped to a
transformation in
this framework!
Model Evolution (better explanation)
!71Copyright 2018 by Data Blueprint Slide #
As-is To-be
Technology
Independent/
Logical
Technology
Dependent/
Physical
abstraction
Other logical
as-is data
architecture
components
• "Concern for man and
his fate must always
form the chief interest of
all technical endeavors.
Never forget this in the
midst of your diagrams
and equations."
– Albert Einstein
!72Copyright 2018 by Data Blueprint Slide #
Data Models Used to Support Strategy
• Flexible, adaptable data structures
• Cleaner, less complex code
• Ensure strategy effectiveness measurement
• Build in future capabilities
• Form/assess merger and acquisitions strategies
!73Copyright 2018 by Data Blueprint Slide #
Employee

Type
Employee
Sales

Person
Manager
Manager

Type
Staff

Manager
Line

Manager
Adapted from Clive Finkelstein Information Engineering Strategic Systems Development 1992
How do Data Models Support Organizational Strategy?
• Consider the opposite question:
– Were your systems explicitly designed to 

be integrated or otherwise work together?
– If not then what is the likelihood that they 

will work well together?
– In all likelihood your organization is spending between 20-40% of its
IT budget compensating for poor data structure integration
– They cannot be helpful as long as their structure is unknown
• Two answers
– Achieving efficiency and effectiveness goals
– Providing organizational dexterity for rapid implementation
!74Copyright 2018 by Data Blueprint Slide #
Typical focus of a
database modeling effort
Data Modeling Ensures Interoperability
!75Copyright 2018 by Data Blueprint Slide #
Program F
Program E
Program D
Program G
Program H
Application
domain 2Application
domain 3
Program I
Typical focus of a
software engineering effort
Program A
DataModel
DataModel
DataModel
DataModel
DataModel
DataModel
Program F
Program E
Program D
Program G
Program H
Program I
Application
domain 2Application
domain 3
DataModel
DataModel
DataModel
Data Model Focus has Great Potential Business Value
• How are decisions
about the range and
scope of common data
usage, made?
• Analysis scope is on
use of data to support a
process
• Problems caused by
data exchange or
interface problems
• Goals often connect
strategic and
operational
• One data model is ideal
!76Copyright 2018 by Data Blueprint Slide #
DataModel
Program A
!77Copyright 2018 by Data Blueprint Slide #
Data Modeling Fundamentals
• Data Management Overview
• Motivation
– of Systems/components
– Data is a not well understood substructure
• Why data modeling & what is it?
– Model represents our understanding of the
– Fundamental, foundational system
characteristics
– Shared between system and human
• Fundamentals
– The power of the purpose statement
– Understanding data centric thinking
– Data modeling compliments other architecture/
engineering techniques, as well as
– Challenges beyond data modeling
• Take Aways, References & Q&A
Use Models to
!78
• Store and formalize information
• Filter out extraneous detail
• Define an essential set of 

information
• Help understand complex system behavior
• Gain information from the process of developing and
interacting with the model
• Evaluate various scenarios or other outcomes indicated by
the model
• Monitor and predict system responses to changing
environmental conditions
Copyright 2018 by Data Blueprint Slide #
• Goal must be shared IT/business understanding
– No disagreements = insufficient communication
• Data sharing/exchange is largely and highly automated and 

thus dependent on successful engineering
– It is critical to engineer a sound foundation of data modeling basics 

(the essence) on which to build advantageous data technologies
• Modeling characteristics change over the course of analysis
– Different model instances may be useful to different analytical problems
• Incorporate motivation (purpose statements) in all modeling
– Modeling is a problem defining as well as a problem solving activity - both are inherent to
architecture
• Use of modeling is much more important than selection of a specific modeling method
• Models are often living documents
– It easily adapts to change
• Models must have modern access/interface/search technologies
– Models need to be available in an easily searchable manner
• Utility is paramount
– Adding color and diagramming objects customizes models and allows for a more engaging and
enjoyable user review process
Data Modeling for Business Value
!79
Inspired by: Karen Lopez http://www.information-management.com/newsletters/enterprise_architecture_data_model_ERP_BI-10020246-1.html?pg=2
Copyright 2018 by Data Blueprint Slide #
Why Modeling
!80Copyright 2018 by Data Blueprint Slide #
• Would you build a house without an
architecture sketch?
• Model is the sketch of the system to be
built in a project.
• Would you like to have an estimate how
much your new house is going to cost?
• Your model gives you a very good idea of
how demanding the implementation work
is going to be!
• If you hired a set of constructors from all
over the world to build your house, would
you like them to have a common
language?
• Model is the common language for the
project team.
• Would you like to verify the proposals of
the construction team before the work gets
started?
• Models can be reviewed before thousands
of hours of implementation work will be
done.
• If it was a great house, would you like to
build something rather similar again, in
another place?
• It is possible to implement the system to
various platforms using the same model.
• Would you drill into a wall of your house
without a map of the plumbing and electric
lines?
• Models document the system built in a
project. This makes life easier for the
support and maintenance!
Upcoming Events
Enterprise Data World 2018 (San Diego)

The First Year as a CDO

April 24, 2018 @ 1:30 PM ET
May Webinar:

Implementing the Data Maturity Model

May 8, 2018 @ 2:00 PM ET/11:00 AM PT
June Webinar:

Data Governance Strategies

June 12, 2018 @ 2:00 PM ET/11:00 AM PT
DGIQ 2018 (San Diego)

Keeping the Momentum Going in your Data Quality Program

June 11, 2018 @ 1:30 PM (PT)
Sign up for webinars at: www.datablueprint.com/webinar-schedule
!81Copyright 2018 by Data Blueprint Slide #Copyright 2018 by Data Blueprint Slide #
Brought to you by:
Join in the discussion - questions?
It’s your turn!
Use the chat feature or Twitter (#dataed) to submit
your questions to Peter now!
+ =
!82Copyright 2018 by Data Blueprint Slide #
10124 W. Broad Street, Suite C
Glen Allen, Virginia 23060
804.521.4056
Copyright 2018 by Data Blueprint Slide # !83

Mais conteúdo relacionado

Mais procurados

ADV Slides: Platforming Your Data for Success – Databases, Hadoop, Managed Ha...
ADV Slides: Platforming Your Data for Success – Databases, Hadoop, Managed Ha...ADV Slides: Platforming Your Data for Success – Databases, Hadoop, Managed Ha...
ADV Slides: Platforming Your Data for Success – Databases, Hadoop, Managed Ha...DATAVERSITY
 
DAS Slides: Data Virtualization – Separating Myth from Reality
DAS Slides: Data Virtualization – Separating Myth from RealityDAS Slides: Data Virtualization – Separating Myth from Reality
DAS Slides: Data Virtualization – Separating Myth from RealityDATAVERSITY
 
Data Architecture Strategies: Building an Enterprise Data Strategy – Where to...
Data Architecture Strategies: Building an Enterprise Data Strategy – Where to...Data Architecture Strategies: Building an Enterprise Data Strategy – Where to...
Data Architecture Strategies: Building an Enterprise Data Strategy – Where to...DATAVERSITY
 
The Importance of MDM - Eternal Management of the Data Mind
The Importance of MDM - Eternal Management of the Data MindThe Importance of MDM - Eternal Management of the Data Mind
The Importance of MDM - Eternal Management of the Data MindDATAVERSITY
 
Data Architecture Strategies
Data Architecture StrategiesData Architecture Strategies
Data Architecture StrategiesDATAVERSITY
 
Data-Ed Online: Unlock Business Value through Reference & MDM
Data-Ed Online: Unlock Business Value through Reference & MDMData-Ed Online: Unlock Business Value through Reference & MDM
Data-Ed Online: Unlock Business Value through Reference & MDMDATAVERSITY
 
DataEd Slides: Leveraging Data Management Technologies
DataEd Slides: Leveraging Data Management TechnologiesDataEd Slides: Leveraging Data Management Technologies
DataEd Slides: Leveraging Data Management TechnologiesDATAVERSITY
 
Cloud and Analytics -- 2020 sparksummit
Cloud and Analytics -- 2020 sparksummitCloud and Analytics -- 2020 sparksummit
Cloud and Analytics -- 2020 sparksummitMing Yuan
 
Webinar: Decoding the Mystery - How to Know if You Need a Data Catalog, a Dat...
Webinar: Decoding the Mystery - How to Know if You Need a Data Catalog, a Dat...Webinar: Decoding the Mystery - How to Know if You Need a Data Catalog, a Dat...
Webinar: Decoding the Mystery - How to Know if You Need a Data Catalog, a Dat...DATAVERSITY
 
Data-Ed Online: Unlock Business Value through Document & Content Management
Data-Ed Online: Unlock Business Value through Document & Content ManagementData-Ed Online: Unlock Business Value through Document & Content Management
Data-Ed Online: Unlock Business Value through Document & Content ManagementDATAVERSITY
 
Slides: Enterprise Architecture vs. Data Architecture
Slides: Enterprise Architecture vs. Data ArchitectureSlides: Enterprise Architecture vs. Data Architecture
Slides: Enterprise Architecture vs. Data ArchitectureDATAVERSITY
 
Data-Ed Webinar: The Importance of MDM
Data-Ed Webinar: The Importance of MDMData-Ed Webinar: The Importance of MDM
Data-Ed Webinar: The Importance of MDMDATAVERSITY
 
Lean Modeling for Any Methodology
Lean Modeling for Any MethodologyLean Modeling for Any Methodology
Lean Modeling for Any MethodologyDATAVERSITY
 
The Value of Metadata
The Value of MetadataThe Value of Metadata
The Value of MetadataDATAVERSITY
 
DAS Slides: Metadata Management From Technical Architecture & Business Techni...
DAS Slides: Metadata Management From Technical Architecture & Business Techni...DAS Slides: Metadata Management From Technical Architecture & Business Techni...
DAS Slides: Metadata Management From Technical Architecture & Business Techni...DATAVERSITY
 
Data-Ed Online Webinar: Business Value from MDM
Data-Ed Online Webinar: Business Value from MDMData-Ed Online Webinar: Business Value from MDM
Data-Ed Online Webinar: Business Value from MDMDATAVERSITY
 
Data Lake Architecture – Modern Strategies & Approaches
Data Lake Architecture – Modern Strategies & ApproachesData Lake Architecture – Modern Strategies & Approaches
Data Lake Architecture – Modern Strategies & ApproachesDATAVERSITY
 
Data Architecture - The Foundation for Enterprise Architecture and Governance
Data Architecture - The Foundation for Enterprise Architecture and GovernanceData Architecture - The Foundation for Enterprise Architecture and Governance
Data Architecture - The Foundation for Enterprise Architecture and GovernanceDATAVERSITY
 
Do-It-Yourself (DIY) Data Governance Framework
Do-It-Yourself (DIY) Data Governance FrameworkDo-It-Yourself (DIY) Data Governance Framework
Do-It-Yourself (DIY) Data Governance FrameworkDATAVERSITY
 
Using Data Platforms That Are Fit-For-Purpose
Using Data Platforms That Are Fit-For-PurposeUsing Data Platforms That Are Fit-For-Purpose
Using Data Platforms That Are Fit-For-PurposeDATAVERSITY
 

Mais procurados (20)

ADV Slides: Platforming Your Data for Success – Databases, Hadoop, Managed Ha...
ADV Slides: Platforming Your Data for Success – Databases, Hadoop, Managed Ha...ADV Slides: Platforming Your Data for Success – Databases, Hadoop, Managed Ha...
ADV Slides: Platforming Your Data for Success – Databases, Hadoop, Managed Ha...
 
DAS Slides: Data Virtualization – Separating Myth from Reality
DAS Slides: Data Virtualization – Separating Myth from RealityDAS Slides: Data Virtualization – Separating Myth from Reality
DAS Slides: Data Virtualization – Separating Myth from Reality
 
Data Architecture Strategies: Building an Enterprise Data Strategy – Where to...
Data Architecture Strategies: Building an Enterprise Data Strategy – Where to...Data Architecture Strategies: Building an Enterprise Data Strategy – Where to...
Data Architecture Strategies: Building an Enterprise Data Strategy – Where to...
 
The Importance of MDM - Eternal Management of the Data Mind
The Importance of MDM - Eternal Management of the Data MindThe Importance of MDM - Eternal Management of the Data Mind
The Importance of MDM - Eternal Management of the Data Mind
 
Data Architecture Strategies
Data Architecture StrategiesData Architecture Strategies
Data Architecture Strategies
 
Data-Ed Online: Unlock Business Value through Reference & MDM
Data-Ed Online: Unlock Business Value through Reference & MDMData-Ed Online: Unlock Business Value through Reference & MDM
Data-Ed Online: Unlock Business Value through Reference & MDM
 
DataEd Slides: Leveraging Data Management Technologies
DataEd Slides: Leveraging Data Management TechnologiesDataEd Slides: Leveraging Data Management Technologies
DataEd Slides: Leveraging Data Management Technologies
 
Cloud and Analytics -- 2020 sparksummit
Cloud and Analytics -- 2020 sparksummitCloud and Analytics -- 2020 sparksummit
Cloud and Analytics -- 2020 sparksummit
 
Webinar: Decoding the Mystery - How to Know if You Need a Data Catalog, a Dat...
Webinar: Decoding the Mystery - How to Know if You Need a Data Catalog, a Dat...Webinar: Decoding the Mystery - How to Know if You Need a Data Catalog, a Dat...
Webinar: Decoding the Mystery - How to Know if You Need a Data Catalog, a Dat...
 
Data-Ed Online: Unlock Business Value through Document & Content Management
Data-Ed Online: Unlock Business Value through Document & Content ManagementData-Ed Online: Unlock Business Value through Document & Content Management
Data-Ed Online: Unlock Business Value through Document & Content Management
 
Slides: Enterprise Architecture vs. Data Architecture
Slides: Enterprise Architecture vs. Data ArchitectureSlides: Enterprise Architecture vs. Data Architecture
Slides: Enterprise Architecture vs. Data Architecture
 
Data-Ed Webinar: The Importance of MDM
Data-Ed Webinar: The Importance of MDMData-Ed Webinar: The Importance of MDM
Data-Ed Webinar: The Importance of MDM
 
Lean Modeling for Any Methodology
Lean Modeling for Any MethodologyLean Modeling for Any Methodology
Lean Modeling for Any Methodology
 
The Value of Metadata
The Value of MetadataThe Value of Metadata
The Value of Metadata
 
DAS Slides: Metadata Management From Technical Architecture & Business Techni...
DAS Slides: Metadata Management From Technical Architecture & Business Techni...DAS Slides: Metadata Management From Technical Architecture & Business Techni...
DAS Slides: Metadata Management From Technical Architecture & Business Techni...
 
Data-Ed Online Webinar: Business Value from MDM
Data-Ed Online Webinar: Business Value from MDMData-Ed Online Webinar: Business Value from MDM
Data-Ed Online Webinar: Business Value from MDM
 
Data Lake Architecture – Modern Strategies & Approaches
Data Lake Architecture – Modern Strategies & ApproachesData Lake Architecture – Modern Strategies & Approaches
Data Lake Architecture – Modern Strategies & Approaches
 
Data Architecture - The Foundation for Enterprise Architecture and Governance
Data Architecture - The Foundation for Enterprise Architecture and GovernanceData Architecture - The Foundation for Enterprise Architecture and Governance
Data Architecture - The Foundation for Enterprise Architecture and Governance
 
Do-It-Yourself (DIY) Data Governance Framework
Do-It-Yourself (DIY) Data Governance FrameworkDo-It-Yourself (DIY) Data Governance Framework
Do-It-Yourself (DIY) Data Governance Framework
 
Using Data Platforms That Are Fit-For-Purpose
Using Data Platforms That Are Fit-For-PurposeUsing Data Platforms That Are Fit-For-Purpose
Using Data Platforms That Are Fit-For-Purpose
 

Semelhante a Data-Ed Webinar: Data Modeling Fundamentals

Essential Reference and Master Data Management
Essential Reference and Master Data ManagementEssential Reference and Master Data Management
Essential Reference and Master Data ManagementDATAVERSITY
 
Data Structures - The Cornerstone of Your Data’s Home
Data Structures - The Cornerstone of Your Data’s HomeData Structures - The Cornerstone of Your Data’s Home
Data Structures - The Cornerstone of Your Data’s HomeDATAVERSITY
 
A Tale of Two BI Standards
A Tale of Two BI StandardsA Tale of Two BI Standards
A Tale of Two BI StandardsArcadia Data
 
Analytics in a Day Virtual Workshop
Analytics in a Day Virtual WorkshopAnalytics in a Day Virtual Workshop
Analytics in a Day Virtual WorkshopCCG
 
Analytics in a Day Ft. Synapse Virtual Workshop
Analytics in a Day Ft. Synapse Virtual WorkshopAnalytics in a Day Ft. Synapse Virtual Workshop
Analytics in a Day Ft. Synapse Virtual WorkshopCCG
 
DataEd Webinar: Reference & Master Data Management - Unlocking Business Value
DataEd Webinar:  Reference & Master Data Management - Unlocking Business ValueDataEd Webinar:  Reference & Master Data Management - Unlocking Business Value
DataEd Webinar: Reference & Master Data Management - Unlocking Business ValueDATAVERSITY
 
Self Service Analytics and a Modern Data Architecture with Data Virtualizatio...
Self Service Analytics and a Modern Data Architecture with Data Virtualizatio...Self Service Analytics and a Modern Data Architecture with Data Virtualizatio...
Self Service Analytics and a Modern Data Architecture with Data Virtualizatio...Denodo
 
Metadata Strategies - Data Squared
Metadata Strategies - Data SquaredMetadata Strategies - Data Squared
Metadata Strategies - Data SquaredDATAVERSITY
 
DataEd Slides: Unlock Business Value Using Reference and Master Data Manageme...
DataEd Slides: Unlock Business Value Using Reference and Master Data Manageme...DataEd Slides: Unlock Business Value Using Reference and Master Data Manageme...
DataEd Slides: Unlock Business Value Using Reference and Master Data Manageme...DATAVERSITY
 
Architecting Data For The Modern Enterprise - Data Summit 2017, Closing Keynote
Architecting Data For The Modern Enterprise - Data Summit 2017, Closing KeynoteArchitecting Data For The Modern Enterprise - Data Summit 2017, Closing Keynote
Architecting Data For The Modern Enterprise - Data Summit 2017, Closing KeynoteCaserta
 
Demystifying Data Warehousing as a Service (GLOC 2019)
Demystifying Data Warehousing as a Service (GLOC 2019)Demystifying Data Warehousing as a Service (GLOC 2019)
Demystifying Data Warehousing as a Service (GLOC 2019)Kent Graziano
 
ADV Slides: Building and Growing Organizational Analytics with Data Lakes
ADV Slides: Building and Growing Organizational Analytics with Data LakesADV Slides: Building and Growing Organizational Analytics with Data Lakes
ADV Slides: Building and Growing Organizational Analytics with Data LakesDATAVERSITY
 
Demystifying Data Warehouse as a Service (DWaaS)
Demystifying Data Warehouse as a Service (DWaaS)Demystifying Data Warehouse as a Service (DWaaS)
Demystifying Data Warehouse as a Service (DWaaS)Kent Graziano
 
JSON Data Modeling - July 2018 - Tulsa Techfest
JSON Data Modeling - July 2018 - Tulsa TechfestJSON Data Modeling - July 2018 - Tulsa Techfest
JSON Data Modeling - July 2018 - Tulsa TechfestMatthew Groves
 
Data-Ed Webinar: Data Architecture Requirements
Data-Ed Webinar: Data Architecture RequirementsData-Ed Webinar: Data Architecture Requirements
Data-Ed Webinar: Data Architecture RequirementsDATAVERSITY
 
Data-Ed: Data Architecture Requirements
Data-Ed: Data Architecture Requirements  Data-Ed: Data Architecture Requirements
Data-Ed: Data Architecture Requirements Data Blueprint
 
Building a New Platform for Customer Analytics
Building a New Platform for Customer Analytics Building a New Platform for Customer Analytics
Building a New Platform for Customer Analytics Caserta
 
Analytics in a Day Virtual Workshop
Analytics in a Day Virtual WorkshopAnalytics in a Day Virtual Workshop
Analytics in a Day Virtual WorkshopCCG
 
Business Value Through Reference and Master Data Strategies
Business Value Through Reference and Master Data StrategiesBusiness Value Through Reference and Master Data Strategies
Business Value Through Reference and Master Data StrategiesDATAVERSITY
 
JSON Data Modeling - GDG Indy - April 2020
JSON Data Modeling - GDG Indy - April 2020JSON Data Modeling - GDG Indy - April 2020
JSON Data Modeling - GDG Indy - April 2020Matthew Groves
 

Semelhante a Data-Ed Webinar: Data Modeling Fundamentals (20)

Essential Reference and Master Data Management
Essential Reference and Master Data ManagementEssential Reference and Master Data Management
Essential Reference and Master Data Management
 
Data Structures - The Cornerstone of Your Data’s Home
Data Structures - The Cornerstone of Your Data’s HomeData Structures - The Cornerstone of Your Data’s Home
Data Structures - The Cornerstone of Your Data’s Home
 
A Tale of Two BI Standards
A Tale of Two BI StandardsA Tale of Two BI Standards
A Tale of Two BI Standards
 
Analytics in a Day Virtual Workshop
Analytics in a Day Virtual WorkshopAnalytics in a Day Virtual Workshop
Analytics in a Day Virtual Workshop
 
Analytics in a Day Ft. Synapse Virtual Workshop
Analytics in a Day Ft. Synapse Virtual WorkshopAnalytics in a Day Ft. Synapse Virtual Workshop
Analytics in a Day Ft. Synapse Virtual Workshop
 
DataEd Webinar: Reference & Master Data Management - Unlocking Business Value
DataEd Webinar:  Reference & Master Data Management - Unlocking Business ValueDataEd Webinar:  Reference & Master Data Management - Unlocking Business Value
DataEd Webinar: Reference & Master Data Management - Unlocking Business Value
 
Self Service Analytics and a Modern Data Architecture with Data Virtualizatio...
Self Service Analytics and a Modern Data Architecture with Data Virtualizatio...Self Service Analytics and a Modern Data Architecture with Data Virtualizatio...
Self Service Analytics and a Modern Data Architecture with Data Virtualizatio...
 
Metadata Strategies - Data Squared
Metadata Strategies - Data SquaredMetadata Strategies - Data Squared
Metadata Strategies - Data Squared
 
DataEd Slides: Unlock Business Value Using Reference and Master Data Manageme...
DataEd Slides: Unlock Business Value Using Reference and Master Data Manageme...DataEd Slides: Unlock Business Value Using Reference and Master Data Manageme...
DataEd Slides: Unlock Business Value Using Reference and Master Data Manageme...
 
Architecting Data For The Modern Enterprise - Data Summit 2017, Closing Keynote
Architecting Data For The Modern Enterprise - Data Summit 2017, Closing KeynoteArchitecting Data For The Modern Enterprise - Data Summit 2017, Closing Keynote
Architecting Data For The Modern Enterprise - Data Summit 2017, Closing Keynote
 
Demystifying Data Warehousing as a Service (GLOC 2019)
Demystifying Data Warehousing as a Service (GLOC 2019)Demystifying Data Warehousing as a Service (GLOC 2019)
Demystifying Data Warehousing as a Service (GLOC 2019)
 
ADV Slides: Building and Growing Organizational Analytics with Data Lakes
ADV Slides: Building and Growing Organizational Analytics with Data LakesADV Slides: Building and Growing Organizational Analytics with Data Lakes
ADV Slides: Building and Growing Organizational Analytics with Data Lakes
 
Demystifying Data Warehouse as a Service (DWaaS)
Demystifying Data Warehouse as a Service (DWaaS)Demystifying Data Warehouse as a Service (DWaaS)
Demystifying Data Warehouse as a Service (DWaaS)
 
JSON Data Modeling - July 2018 - Tulsa Techfest
JSON Data Modeling - July 2018 - Tulsa TechfestJSON Data Modeling - July 2018 - Tulsa Techfest
JSON Data Modeling - July 2018 - Tulsa Techfest
 
Data-Ed Webinar: Data Architecture Requirements
Data-Ed Webinar: Data Architecture RequirementsData-Ed Webinar: Data Architecture Requirements
Data-Ed Webinar: Data Architecture Requirements
 
Data-Ed: Data Architecture Requirements
Data-Ed: Data Architecture Requirements  Data-Ed: Data Architecture Requirements
Data-Ed: Data Architecture Requirements
 
Building a New Platform for Customer Analytics
Building a New Platform for Customer Analytics Building a New Platform for Customer Analytics
Building a New Platform for Customer Analytics
 
Analytics in a Day Virtual Workshop
Analytics in a Day Virtual WorkshopAnalytics in a Day Virtual Workshop
Analytics in a Day Virtual Workshop
 
Business Value Through Reference and Master Data Strategies
Business Value Through Reference and Master Data StrategiesBusiness Value Through Reference and Master Data Strategies
Business Value Through Reference and Master Data Strategies
 
JSON Data Modeling - GDG Indy - April 2020
JSON Data Modeling - GDG Indy - April 2020JSON Data Modeling - GDG Indy - April 2020
JSON Data Modeling - GDG Indy - April 2020
 

Mais de DATAVERSITY

Architecture, Products, and Total Cost of Ownership of the Leading Machine Le...
Architecture, Products, and Total Cost of Ownership of the Leading Machine Le...Architecture, Products, and Total Cost of Ownership of the Leading Machine Le...
Architecture, Products, and Total Cost of Ownership of the Leading Machine Le...DATAVERSITY
 
Data at the Speed of Business with Data Mastering and Governance
Data at the Speed of Business with Data Mastering and GovernanceData at the Speed of Business with Data Mastering and Governance
Data at the Speed of Business with Data Mastering and GovernanceDATAVERSITY
 
Exploring Levels of Data Literacy
Exploring Levels of Data LiteracyExploring Levels of Data Literacy
Exploring Levels of Data LiteracyDATAVERSITY
 
Building a Data Strategy – Practical Steps for Aligning with Business Goals
Building a Data Strategy – Practical Steps for Aligning with Business GoalsBuilding a Data Strategy – Practical Steps for Aligning with Business Goals
Building a Data Strategy – Practical Steps for Aligning with Business GoalsDATAVERSITY
 
Make Data Work for You
Make Data Work for YouMake Data Work for You
Make Data Work for YouDATAVERSITY
 
Data Catalogs Are the Answer – What is the Question?
Data Catalogs Are the Answer – What is the Question?Data Catalogs Are the Answer – What is the Question?
Data Catalogs Are the Answer – What is the Question?DATAVERSITY
 
Data Catalogs Are the Answer – What Is the Question?
Data Catalogs Are the Answer – What Is the Question?Data Catalogs Are the Answer – What Is the Question?
Data Catalogs Are the Answer – What Is the Question?DATAVERSITY
 
Data Modeling Fundamentals
Data Modeling FundamentalsData Modeling Fundamentals
Data Modeling FundamentalsDATAVERSITY
 
Showing ROI for Your Analytic Project
Showing ROI for Your Analytic ProjectShowing ROI for Your Analytic Project
Showing ROI for Your Analytic ProjectDATAVERSITY
 
How a Semantic Layer Makes Data Mesh Work at Scale
How a Semantic Layer Makes  Data Mesh Work at ScaleHow a Semantic Layer Makes  Data Mesh Work at Scale
How a Semantic Layer Makes Data Mesh Work at ScaleDATAVERSITY
 
Is Enterprise Data Literacy Possible?
Is Enterprise Data Literacy Possible?Is Enterprise Data Literacy Possible?
Is Enterprise Data Literacy Possible?DATAVERSITY
 
The Data Trifecta – Privacy, Security & Governance Race from Reactivity to Re...
The Data Trifecta – Privacy, Security & Governance Race from Reactivity to Re...The Data Trifecta – Privacy, Security & Governance Race from Reactivity to Re...
The Data Trifecta – Privacy, Security & Governance Race from Reactivity to Re...DATAVERSITY
 
Emerging Trends in Data Architecture – What’s the Next Big Thing?
Emerging Trends in Data Architecture – What’s the Next Big Thing?Emerging Trends in Data Architecture – What’s the Next Big Thing?
Emerging Trends in Data Architecture – What’s the Next Big Thing?DATAVERSITY
 
Data Governance Trends - A Look Backwards and Forwards
Data Governance Trends - A Look Backwards and ForwardsData Governance Trends - A Look Backwards and Forwards
Data Governance Trends - A Look Backwards and ForwardsDATAVERSITY
 
Data Governance Trends and Best Practices To Implement Today
Data Governance Trends and Best Practices To Implement TodayData Governance Trends and Best Practices To Implement Today
Data Governance Trends and Best Practices To Implement TodayDATAVERSITY
 
2023 Trends in Enterprise Analytics
2023 Trends in Enterprise Analytics2023 Trends in Enterprise Analytics
2023 Trends in Enterprise AnalyticsDATAVERSITY
 
Data Strategy Best Practices
Data Strategy Best PracticesData Strategy Best Practices
Data Strategy Best PracticesDATAVERSITY
 
Who Should Own Data Governance – IT or Business?
Who Should Own Data Governance – IT or Business?Who Should Own Data Governance – IT or Business?
Who Should Own Data Governance – IT or Business?DATAVERSITY
 
Data Management Best Practices
Data Management Best PracticesData Management Best Practices
Data Management Best PracticesDATAVERSITY
 
MLOps – Applying DevOps to Competitive Advantage
MLOps – Applying DevOps to Competitive AdvantageMLOps – Applying DevOps to Competitive Advantage
MLOps – Applying DevOps to Competitive AdvantageDATAVERSITY
 

Mais de DATAVERSITY (20)

Architecture, Products, and Total Cost of Ownership of the Leading Machine Le...
Architecture, Products, and Total Cost of Ownership of the Leading Machine Le...Architecture, Products, and Total Cost of Ownership of the Leading Machine Le...
Architecture, Products, and Total Cost of Ownership of the Leading Machine Le...
 
Data at the Speed of Business with Data Mastering and Governance
Data at the Speed of Business with Data Mastering and GovernanceData at the Speed of Business with Data Mastering and Governance
Data at the Speed of Business with Data Mastering and Governance
 
Exploring Levels of Data Literacy
Exploring Levels of Data LiteracyExploring Levels of Data Literacy
Exploring Levels of Data Literacy
 
Building a Data Strategy – Practical Steps for Aligning with Business Goals
Building a Data Strategy – Practical Steps for Aligning with Business GoalsBuilding a Data Strategy – Practical Steps for Aligning with Business Goals
Building a Data Strategy – Practical Steps for Aligning with Business Goals
 
Make Data Work for You
Make Data Work for YouMake Data Work for You
Make Data Work for You
 
Data Catalogs Are the Answer – What is the Question?
Data Catalogs Are the Answer – What is the Question?Data Catalogs Are the Answer – What is the Question?
Data Catalogs Are the Answer – What is the Question?
 
Data Catalogs Are the Answer – What Is the Question?
Data Catalogs Are the Answer – What Is the Question?Data Catalogs Are the Answer – What Is the Question?
Data Catalogs Are the Answer – What Is the Question?
 
Data Modeling Fundamentals
Data Modeling FundamentalsData Modeling Fundamentals
Data Modeling Fundamentals
 
Showing ROI for Your Analytic Project
Showing ROI for Your Analytic ProjectShowing ROI for Your Analytic Project
Showing ROI for Your Analytic Project
 
How a Semantic Layer Makes Data Mesh Work at Scale
How a Semantic Layer Makes  Data Mesh Work at ScaleHow a Semantic Layer Makes  Data Mesh Work at Scale
How a Semantic Layer Makes Data Mesh Work at Scale
 
Is Enterprise Data Literacy Possible?
Is Enterprise Data Literacy Possible?Is Enterprise Data Literacy Possible?
Is Enterprise Data Literacy Possible?
 
The Data Trifecta – Privacy, Security & Governance Race from Reactivity to Re...
The Data Trifecta – Privacy, Security & Governance Race from Reactivity to Re...The Data Trifecta – Privacy, Security & Governance Race from Reactivity to Re...
The Data Trifecta – Privacy, Security & Governance Race from Reactivity to Re...
 
Emerging Trends in Data Architecture – What’s the Next Big Thing?
Emerging Trends in Data Architecture – What’s the Next Big Thing?Emerging Trends in Data Architecture – What’s the Next Big Thing?
Emerging Trends in Data Architecture – What’s the Next Big Thing?
 
Data Governance Trends - A Look Backwards and Forwards
Data Governance Trends - A Look Backwards and ForwardsData Governance Trends - A Look Backwards and Forwards
Data Governance Trends - A Look Backwards and Forwards
 
Data Governance Trends and Best Practices To Implement Today
Data Governance Trends and Best Practices To Implement TodayData Governance Trends and Best Practices To Implement Today
Data Governance Trends and Best Practices To Implement Today
 
2023 Trends in Enterprise Analytics
2023 Trends in Enterprise Analytics2023 Trends in Enterprise Analytics
2023 Trends in Enterprise Analytics
 
Data Strategy Best Practices
Data Strategy Best PracticesData Strategy Best Practices
Data Strategy Best Practices
 
Who Should Own Data Governance – IT or Business?
Who Should Own Data Governance – IT or Business?Who Should Own Data Governance – IT or Business?
Who Should Own Data Governance – IT or Business?
 
Data Management Best Practices
Data Management Best PracticesData Management Best Practices
Data Management Best Practices
 
MLOps – Applying DevOps to Competitive Advantage
MLOps – Applying DevOps to Competitive AdvantageMLOps – Applying DevOps to Competitive Advantage
MLOps – Applying DevOps to Competitive Advantage
 

Último

Story boards and shot lists for my a level piece
Story boards and shot lists for my a level pieceStory boards and shot lists for my a level piece
Story boards and shot lists for my a level piececharlottematthew16
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticscarlostorres15106
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek SchlawackFwdays
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Scott Keck-Warren
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 3652toLead Limited
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii SoldatenkoFwdays
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsSergiu Bodiu
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Mark Simos
 
Training state-of-the-art general text embedding
Training state-of-the-art general text embeddingTraining state-of-the-art general text embedding
Training state-of-the-art general text embeddingZilliz
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr BaganFwdays
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenHervé Boutemy
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Manik S Magar
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebUiPathCommunity
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):comworks
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsMark Billinghurst
 
My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024The Digital Insurer
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Patryk Bandurski
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Commit University
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupFlorian Wilhelm
 

Último (20)

Story boards and shot lists for my a level piece
Story boards and shot lists for my a level pieceStory boards and shot lists for my a level piece
Story boards and shot lists for my a level piece
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
 
Training state-of-the-art general text embedding
Training state-of-the-art general text embeddingTraining state-of-the-art general text embedding
Training state-of-the-art general text embedding
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache Maven
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio Web
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):
 
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptxE-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
 
My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project Setup
 

Data-Ed Webinar: Data Modeling Fundamentals

  • 1. Peter Aiken, Ph.D. Data Modeling Fundamentals • DAMA International President 2009-2013 • DAMA International Achievement Award 2001 (with Dr. E. F. "Ted" Codd • DAMA International Community Award 2005 Peter Aiken, Ph.D. • 33+ years in data management • Repeated international recognition • Founder, Data Blueprint (datablueprint.com) • Associate Professor of IS (vcu.edu) • DAMA International (dama.org) • 10 books and dozens of articles • Experienced w/ 500+ data management practices • Multi-year immersions:
 – US DoD (DISA/Army/Marines/DLA)
 – Nokia
 – Deutsche Bank
 – Wells Fargo
 – Walmart
 – … PETER AIKEN WITH JUANITA BILLINGS FOREWORD BY JOHN BOTTEGA MONETIZING DATA MANAGEMENT Unlocking the Value in Your Organization’s Most Important Asset. The Case for the Chief Data Officer Recasting the C-Suite to Leverage Your MostValuable Asset Peter Aiken and Michael Gorman Copyright 2018 by Data Blueprint Slide #
  • 2. Confidential and Proprietary. Do not distribute without Couchbase consent. © Couchbase 2017. All rights reserved. Data Modeling with Couchbase Anuj Sahni| Director Product Marketing April 2018
  • 3. AGENDA 1. Couchbase Data Platform Architecture 2. Data Modeling with JSON 2
  • 4. Data Modeling Approaches NoSQL Relaxed Normalization schema implied by structure fields may be empty, duplicate, or missing Relational Required Normalization schema enforced by DB same fields in all records • Minimize data inconsistencies (one item = one location) • Reduced duplicated data • Preserve storage resources • Optimized based on access patterns • Flexible, based on application requirements • Supports clustered architecture • Reduced server overhead
  • 6. Couchbase - The Data Platform Architecture 5 COUCHBASE LITE SYNC GATEWAY COUCHBASE SERVER Lightweight embedded NoSQL database with full CRUD and query functionality. Secure web gateway with synchronization, data access, and data integration APIs for accessing, integrating, and synchronizing data over the web. Highly scalable, highly available, high performance NoSQL database server. Client Middle Tier StorageWAN LAN Security Built-in enterprise level security throughout the entire stack includes user authentication, user and role based data access control (RBAC), secure transport (TLS), and 256-bit AES full database encryption.
  • 7. Couchbase Server Cluster Service Deployment STORAGE Couchbase Server 1 SHARD 7 SHARD 9 SHARD 5 SHARDSHARDSHARD Managed Cache Cluster ManagerCluster Manager Managed Cache Storage Data Service STORAGE Couchbase Server 2 Managed Cache Cluster ManagerCluster Manager Data Service STORAGE Couchbase Server 3 SHARD 7 SHARD 9 SHARD 5 SHARDSHARDSHARD Managed Cache Cluster ManagerCluster Manager Data Service STORAGE Couchbase Server 4 SHARD 7 SHARD 9 SHARD 5 SHARDSHARDSHARD Managed Cache Cluster ManagerCluster Manager Query Service STORAGE Couchbase Server 5 SHARD 7 SHARD 9 SHARD 5 SHARDSHARDSHARD Managed Cache Cluster ManagerCluster Manager Query Service STORAGE Couchbase Server 6 SHARD 7 SHARD 9 SHARD 5 SHARDSHARDSHARD Managed Cache Cluster ManagerCluster Manager Index Service Managed Cache Storage Managed Cache Storage Storage STORAGE Couchbase Server 7 SHARD 7 SHARD 9 SHARD 5 SHARDSHARDSHARD Managed Cache Cluster ManagerCluster Manager Index Service Storage Managed Cache Managed Cache SDK SDK Managed Cache Storage Managed Cache Storage
  • 9. Properties of Real-World Data • Rich structure • Attributes, Sub-structure • Relationships • To other data • Value evolution • Data is updated • Structure evolution • Data is reshaped Customer Name DOB Billing Connections Purchases
  • 10. Modeling Data in Relational World Billing ConnectionsPurchases Contacts Customer  Rich structure  Normalize & JOIN Queries  Relationships  JOINS and Constraints  Value evolution  INSERT, UPDATE, DELETE  Structure evolution  ALTER TABLE  Application Downtime  Application Migration  Application Versioning
  • 11. JSON 101 { "Name" : "Jane Smith", "DOB" : "1990-01-30", "Billing" : [ { "type" : "visa", "cardnum" : "5827-2842-2847-3909", "expiry" : "2019-03" }, { "type" : "master", "cardnum" : "6274-2842-2847-3909", "expiry" : "2019-03" } ], "address" : { "Street" : "10, Downing Street", "City" : "San Francico", "State" : "California", "zip" :94401 } } • Used to represent object data in text • Representation • "Key":"Value" • Data Types: • Number, Strings, Boolean, objects, Arrays, NULL • Hierarchical
  • 12. Flexibility from JSON { "Name" : "Jane Smith", "DOB" : "1990-01-30", "Billing" : [ { "type" : "visa", "cardnum" : "5827-2842-2847-3909", "expiry" : "2019-03" }, { "type" : "master", "cardnum" : "6274-2842-2847-3909", "expiry" : "2019-03" } ], "address" : { "Street" : "10, Downing Street", "City" : "San Francico", "State" : "California", "zip" :94401 } } • Document is self describing • Fields can be added or can be missing • Data types can change • Arrays give you flexibility in number of items in an attribute
  • 13. Using JSON to Store Data { "Name" : "Jane Smith", "DOB" : "1990-01-30", "Billing" : [ { "type" : "visa", "cardnum" : "5827-2842-2847-3909", "expiry" : "2019-03" }, { "type" : "master", "cardnum" : "6274-2842-2847-3909", "expiry" : "2019-03" } ], "Connections" : [ { "CustId" : "XYZ987", "Name" : "Joe Smith" }, { "CustId" : "PQR823", "Name" : "Dylan Smith" } { "CustId" : "PQR823", "Name" : "Dylan Smith" } ], "Purchases" : [ { "id":12, item: "mac", "amt": 2823.52 } { "id":19, item: "ipad2", "amt": 623.52 } ] } CustomerID Name DOB CBL2015 Jane Smith 1990-01-30 CustomerID Type Cardnum Expiry CBL2015 visa 5827… 2019-03 CBL2015 master 6274… 2018-12 CustomerID ConnId Name CBL2015 XYZ987 Joe Smith CBL2015 SKR007 Sam Smith CustomerID item amt CBL2015 mac 2823.52 CBL2015 ipad2 623.52 CustomerID ConnId Name CBL2015 XYZ987 Joe Smith CBL2015 SKR007 Sam Smith Contacts Customer Billing ConnectionsPurchases
  • 14. Models for Representing Data Data Concern Relational Model JSON Document Model (NoSQL) Rich Structure  Multiple flat tables  Constant assembly / disassembly  Documents  No assembly required! Relationships  Represented  Queried (SQL)  Represented  N1QL (support ANSI JOIN) Value Evolution  Data can be updated  Data can be updated Structure Evolution  Uniform and rigid  Manual change (disruptive)  Flexible  Dynamic change
  • 15. !3Copyright 2018 by Data Blueprint Slide # Data Modeling Fundamentals • Data Management Overview • Motivation – of Systems/components – Data is a not well understood substructure • Why data modeling & what is it? – Model represents our understanding of the – Fundamental, foundational system characteristics – Shared between system and human • Fundamentals – The power of the purpose statement – Understanding data centric thinking – Data modeling compliments other architecture/ engineering techniques, as well as – Challenges beyond data modeling • Take Aways, References & Q&A 
 
 
 UsesUsesReuses What is data management? !4Copyright 2018 by Data Blueprint Slide # Sources 
 Data Engineering 
 Data 
 Delivery 
 Data
 Storage Specialized Team Skills Data Governance Understanding the current and future data needs of an enterprise and making that data effective and efficient in supporting 
 business activities

 Aiken, P, Allen, M. D., Parker, B., Mattia, A., 
 "Measuring Data Management's Maturity: 
 A Community's Self-Assessment" 
 IEEE Computer (research feature April 2007) Data management practices connect data sources and uses in an organized and efficient manner • Engineering • Storage • Delivery • Governance When executed, 
 engineering, storage, and 
 delivery implement governance Note: does not well-depict data reuse
  • 16. 
 
 
 
 
 
 
 
 
 
 
 What is data management? !5Copyright 2018 by Data Blueprint Slide # Sources 
 Data Engineering 
 Data 
 Delivery 
 Data
 Storage More Specialized Team Skills 
 Resources
 (optimized for reuse)
 Data Governance AnalyticInsight !6Copyright 2018 by Data Blueprint Slide #
  • 17. You can accomplish Advanced Data Practices without becoming proficient in the Foundational Data Management Practices however this will: • Take longer • Cost more • Deliver less • Present 
 greater
 risk
 (with thanks to Tom DeMarco) Data Management Practices Hierarchy Advanced 
 Data 
 Practices • MDM • Mining • Big Data • Analytics • Warehousing • SOA Foundational Data Management Practices Data Platform/Architecture Data Governance Data Quality Data Operations Data Management Strategy Technologies Capabilities Copyright 2018 by Data Blueprint Slide # !7 DMM℠ Structure of 
 5 Integrated 
 DM Practice Areas Data architecture implementation Data 
 Governance Data 
 Management
 Strategy Data 
 Operations Platform
 Architecture Supporting
 Processes Maintain fit-for-purpose data, efficiently and effectively !8Copyright 2018 by Data Blueprint Slide # Manage data coherently Manage data assets professionally Data life cycle management Organizational support Data 
 Quality
  • 18. Data Strategy is often the weakest link Data architecture implementation Data 
 Governance Data 
 Management
 Strategy Data 
 Operations Platform
 Architecture Supporting
 Processes Maintain fit-for-purpose data, efficiently and effectively !9Copyright 2018 by Data Blueprint Slide # Manage data coherently Manage data assets professionally Data life cycle management Organizational support Data 
 Quality 3 3 33 1 Data Management Body of Knowledge !10Copyright 2018 by Data Blueprint Slide # Data Management Functions
  • 19. DAMA DM BoK: Data Development !11Copyright 2018 by Data Blueprint Slide # from The DAMA Guide to the Data Management Body of Knowledge © 2009 by DAMA International Architecture: here, whether you like it or not 12Copyright 2018 by Data Blueprint Slide # deviantart.com • All organizations have architectures – Some are better understood and documented (and therefore more useful to the organization) than others
  • 20. Data Architecture
 
 
 
 and
 
 
 
 Data Models !13Copyright 2018 by Data Blueprint Slide # http://www.architecturalcomponentsinc.com • Architecture is higher level of abstraction – Understanding/integration focused • Models more downward facing – Implementation/detail focused Models are literally the translation 
 between systems and people !14Copyright 2018 by Data Blueprint Slide # Data Modeling Fundamentals • Data Management Overview • Motivation – of Systems/components – Data is a not well understood substructure • Why data modeling & what is it? – Model represents our understanding of the – Fundamental, foundational system characteristics – Shared between system and human • Fundamentals – The power of the purpose statement – Understanding data centric thinking – Data modeling compliments other architecture/ engineering techniques, as well as – Challenges beyond data modeling • Take Aways, References & Q&A
  • 21. Data Models are about ... • Things that someone cares
 to keep information about – Entities: persons, places, things • The characteristics of the things – Attributes: color, size, sequence
 media code, product descriptions, quantity ordered • How the entitles interact – Relationships: accomplished
 by cooperating (sharing key 
 information)
 
 An order is placed by one 
 and only one customer !15Copyright 2018 by Data Blueprint Slide # What do we teach knowledge workers about data? !16Copyright 2018 by Data Blueprint Slide # What percentage of the deal with it daily?
  • 22. What do we teach IT professionals about data? !17Copyright 2018 by Data Blueprint Slide # • 1 course – How to build a new database • What impressions do IT professionals get from this education? – Data is a technical skill that is needed when developing new databases • Slender, elegant and graceful • World's 3rd longest suspension span • Opened on July 1st, collapsed in a windstorm on November 7,1940 • "The most dramatic failure in 
 bridge engineering history" • Changed forever how engineers 
 design suspension bridges leading 
 to safer spans today. Tacoma Narrows Bridge/Gallopin' Gertie !18Copyright 2018 by Data Blueprint Slide #
  • 23. !19Copyright 2018 by Data Blueprint Slide # Similarly data failures cost organizations minimally 20-40% of their IT budget Repeat 100s, thousands, millions of times ... !20Copyright 2018 by Data Blueprint Slide #
  • 24. Death by 1000 Cuts !21Copyright 2018 by Data Blueprint Slide # • How does maltreated data cost money? • Consider the opposite question: – Were your systems explicitly designed to 
 be integrated or otherwise work together? – If not then what is the likelihood that they 
 will work well together? • Organizations spend 20-40% of their IT
 budget evolving data - including: – Data migration • Changing the location from one place to another – Data conversion • Changing data into another form, state, or product – Data improving • Inspecting and manipulating, or re-keying data to prepare it for 
 subsequent use - John Zachman Lack of data coherence is a hidden expense !22 PETER AIKEN WITH JUANITA BILLINGS FOREWORD BY JOHN BOTTEGA MONETIZING DATA MANAGEMENT Unlocking the Value in Your Organization’s Most Important Asset. Copyright 2018 by Data Blueprint Slide #
  • 25. Bad Data Decisions Spiral !23Copyright 2018 by Data Blueprint Slide # Bad data decisions Technical deci- sion makers are not data knowledgable Business decision makers are not data knowledgable Poor organizational outcomes Poor treatment of organizational data assets Poor
 quality
 data !24Copyright 2018 by Data Blueprint Slide # Data Modeling Fundamentals • Data Management Overview • Motivation – of Systems/components – Data is a not well understood substructure • Why data modeling & what is it? – Model represents our understanding of the – Fundamental, foundational system characteristics – Shared between system and human • Fundamentals – The power of the purpose statement – Understanding data centric thinking – Data modeling compliments other architecture/ engineering techniques, as well as – Challenges beyond data modeling • Take Aways, References & Q&A
  • 26. How much data,
 by the minute! For the entirety of 2017, every minute of every day: • (almost) Seventy thousand hours of Netflix • (almost) a half million tweets • 15+ million texts • 3.5+ million google searches • 103+ million email spams !25Copyright 2018 by Data Blueprint Slide # https://www.domo.com/learn/data-never-sleeps-5 !26Copyright 2018 by Data Blueprint Slide # As articulated by Micheline Casey There will never be less data than right now!
  • 27. USS Midway & Pancakes What is this excellent engineering example? • It is tall • It has a clutch • It was built in 1942 • It is still in regular use! !27Copyright 2018 by Data Blueprint Slide # You cannot architect after implementation! !28Copyright 2018 by Data Blueprint Slide #
  • 28. Good Engineering/ Architectural Foundation? !29Copyright 2018 by Data Blueprint Slide # Poor Foundation = !30Copyright 2018 by Data Blueprint Slide # Unsuitable
 for
 Further
 Investment
  • 29. Data Modeling Definition • Modeling = Analysis and design method used to – Define and analyze data requirements – Design data structures that support these requirements • Model = set of data specifications and related diagrams that reflect requirements and designs – Representation of something in our environment – Employs standardized text/symbols to represent data attributes (grouped into data elements) and the relationships among them – Integrated collection of specifications and related diagrams that represent data requirements and design !31Copyright 2018 by Data Blueprint Slide # from The DAMA Guide to the Data Management Body of Knowledge © 2009 by DAMA International Data Modeling • Modeling = complex process involving interaction between people and with technology that don’t compromise the integrity or security of the data – Good data models accurately 
 express and effectively communicate 
 data requirements and 
 quality solution design • Modeling approach 
 (guided by 2 formulas): – Purpose + audience = deliverables – Deliverables + resources + time = approach !32Copyright 2018 by Data Blueprint Slide # from The DAMA Guide to the Data Management Body of Knowledge © 2009 by DAMA International
  • 30. from The DAMA Guide to the Data Management Body of Knowledge © 2009 by DAMA International Data Models Facilitate • Formalization – Data model documents a single, 
 precise definition of data requirements 
 and data-related business rules • Communication – Data model is a bridge to understanding data 
 between people with different levels and types of experience. – Helps understand business area, existing application, or impact of modifying an existing structure – May also facilitate training new business and/or technical staff • Scope – Data model can help explain the data concept and scope of purchased application packages !33Copyright 2018 by Data Blueprint Slide # ANSI-SPARK 3-Layer Schema !34 For example, a changeover to a new DBMS technology. The database administrator should be able to change the conceptual or global structure of the database without affecting the users. 1. Conceptual - Allows independent customized user views: – Each should be able to access the same data, but have a different customized view of the data. 2. Logical - This hides the physical storage details from users: – Users should not have to deal with physical database storage details. They should be allowed to work with the data itself, without concern for how it is physically stored. 3. Physical - The database administrator should be able to change the database storage structures without affecting the users’ views: – Changes to the structure of an organization's data will be required. The internal structure of the database should be unaffected by changes to the physical aspects of the storage. Copyright 2018 by Data Blueprint Slide #
  • 31. Families of Modeling Notation Variants !35Copyright 2018 by Data Blueprint Slide # Eventually One, More Eventually One Exactly One Zero, or More One or More Zero or One Information Engineering Pick one! What is a Relationship? • Natural associations between two or more entities !36Copyright 2018 by Data Blueprint Slide #
  • 32. Ordinality & Cardinality • Defines mandatory/optional relationships using minimum/ maximum occurrences from one entity to another !37Copyright 2018 by Data Blueprint Slide # An order is placed by one and only one customer A customer places zero or more orders A product is contained on zero or more orders An order contains at least one or more products Q: What is the proper relationship for these entities? !38Copyright 2018 by Data Blueprint Slide #
  • 33. A: a relationship for these entities !39Copyright 2018 by Data Blueprint Slide # Eventually One, More Eventually One Exactly One Zero, or More One or More Zero or One Q: What is an Attribute? !40Copyright 2018 by Data Blueprint Slide #
  • 34. A: Attribute Definition • Attributes describe an entity and attribute values describe “instances of business things” !41Copyright 2018 by Data Blueprint Slide # Rigid Data Structure !42Copyright 2018 by Data Blueprint Slide # Person Job Class Position BR1) One EMPLOYEE can be associated with one PERSON BR2) One EMPLOYEE can be associated with one POSITION Manual
 Job Sharing Manual
 Moon Lighting Employee
  • 35. Flexible data structure !43Copyright 2018 by Data Blueprint Slide # Person Job Class Employee Position BR1) Zero, one, or more EMPLOYEES can be associated with one PERSON BR2) Zero, one, or more EMPLOYEES can be associated with one POSITION Job Sharing Moon Lighting Everyone Shares Understanding !44Copyright 2018 by Data Blueprint Slide # Data structures must be specified prior software development/acquisition (Requires 2 structural loops more than the more flexible data structure) More flexible data structure Less flexible data structure
  • 36. Understanding • Definition: – 'Understanding an architecture' – Documented and articulated as a digital blueprint illustrating the 
 commonalities and 
 interconnections 
 among the 
 architectural 
 components – Ideally the understanding 
 is shared by systems and humans !45Copyright 2018 by Data Blueprint Slide # Modeling Procedures 1. Identify entities 2. Identify key for each entity 3. Draw rough draft of entity relationship data model 4. Identify data attributes 5. Map data attributes to entities !46Copyright 2018 by Data Blueprint Slide #
  • 37. Models Evolution is good, at first ... !47Copyright 2018 by Data Blueprint Slide # Preliminary activities Modeling cycles Wrapup activities Evidence collection & analysis Project coordination requirements Target system analysis Modeling cycle focus Activity Refinement Collection Analysis Validation Declining coordination requirements Increasing amounts of targetsystem analysis Preliminary activities Modeling cycles Wrapup activities Evidence collection & analysis Project coordination requirements Target system analysis Modeling cycle focus Activity Refinement Collection Analysis Validation Declining coordination requirements Increasing amounts of targetsystem analysis Preliminary activities Modeling cycles Wrapup activities Evidence collection & analysis Project coordination requirements Target system analysis Modeling cycle focus Activity Refinement Collection Analysis Validation Declining coordination requirements Increasing amounts of targetsystem analysis Preliminary activities Modeling cycles Wrapup activities Evidence collection & analysis Project coordination requirements Target system analysis Modeling cycle focus Activity Refinement Collection Analysis Validation Declining coordination requirements Increasing amounts of targetsystem analysis Relative use of time allocated to tasks during Modeling Preliminary activities Modeling cycles Wrapup activities Evidence collection & analysis Project coordination requirements Target system analysis Modeling cycle focus Activity Refinement Collection Analysis Validation Declining coordination requirements Increasing amounts of targetsystem analysis !48Copyright 2018 by Data Blueprint Slide #
  • 38. Don’t Tell Them You Are Modeling! !49 • Just write some stuff down • Then arrange it • Then make some appropriate connections between your objects Copyright 2018 by Data Blueprint Slide # !50Copyright 2018 by Data Blueprint Slide # Data Modeling Fundamentals • Data Management Overview • Motivation – of Systems/components – Data is a not well understood substructure • Why data modeling & what is it? – Model represents our understanding of the – Fundamental, foundational system characteristics – Shared between system and human • Fundamentals – The power of the purpose statement – Understanding data centric thinking – Data modeling compliments other architecture/ engineering techniques, as well as – Challenges beyond data modeling • Take Aways, References & Q&A
  • 39. Each model has a purpose !51Copyright 2018 by Data Blueprint Slide # Data Models are Developed in Response to Organizational Needs ! ! ! ! !52Copyright 2018 by Data Blueprint Slide # Organizational Needs become instantiated 
 and integrated into an 
 Data Models Informa(on)System) Requirements authorizes and 
 articulates satisfyspecificorganizationalneeds
  • 40. Standard definition reporting does not provide conceptual context !53Copyright 2018 by Data Blueprint Slide # Bed Something you sleep in Bed
 Entity: BED Purpose: This is a substructure within the room
 substructure of the facility location. It 
 contains information about beds within rooms. Attributes: Bed.Description
 Bed.Status
 Bed.Sex.To.Be.Assigned
 Bed.Reserve.Reason Associations: >0-+ Room Status: Validated Keep them focused on data model purpose !54 • The reason we are locked in this room is to: – Mission: Understand formal relationship between soda and customer • Outcome: Walk out the door with a data model this relationship – Mission: Understand the characteristics that differ between our hospital beds • Outcome: We will walk out the door when we identify the top three traits that represent the brand. – Mission: Could our systems handle the following business rule tomorrow? – "Is job-sharing permitted?" • Outcomes: Confirm that it is possible to staff a position with multiple employees effective tomorrow selects and pays forgiven to Soda Customer selects can be filled by zero or 1 Employee Position has exactly 1 How does our perspective change: 
 the primary means of tracking a patient Copyright 2018 by Data Blueprint Slide #
  • 41. Entity: BED Data Asset Type: Principal Data Entity Purpose: This is a substructure within the room
 substructure of the facility location. It contains 
 information about beds within rooms. Source: Maintenance Manual for File and Table
 Data (Software Version 3.0, Release 3.1) Attributes: Bed.Description
 Bed.Status
 Bed.Sex.To.Be.Assigned
 Bed.Reserve.Reason Associations: >0-+ Room Status: Validated The Power of the Purpose Statement !55Copyright 2018 by Data Blueprint Slide # • A purpose statement describing why the organization is maintaining information about this business concept • Sources of information about it • A partial list of the attributes or characteristics of the entity • Associations with other data items; this one is read as "One room contains zero or many beds" Data Modeling Example #1 !56Copyright 2018 by Data Blueprint Slide # from The DAMA Guide to the Data Management Body of Knowledge © 2009 by DAMA International Primary deliverables become reference material Model Purpose Statement:
 This model codifies the official 
 vocabulary to be used when 
 describing aspects of any of the 
 following organizational concepts:
 – Subscriber
 – Account
 – Charge
 – Bill
  • 42. Data Modeling Example #2 fuel rent-rate phone-rate phone-call rental agreement customer auto repair history phone-unit Source: Chikofsky 1990 Interpretations: 1. Car rental company 2. Rental agreement is central 3. No direct connection between customer and contract 4. Contract must have a customer 5. Nothing structural prevents autos from being rented to multiple customers 6. Phone units are tied to rentals !57Copyright 2018 by Data Blueprint Slide # Model Purpose Statement:
 This model codifies the official 
 vocabulary to be used when 
 describing aspects of any of the 
 following organizational concepts:
 – fuel
 – customer
 – auto
 – rental agreement
 – rent-rate
 – phone-call
 – phone-rate
 – phone-unit
 – repair history It is documentation shown
 during the on-
 boarding process Data Modeling Example #3 salesperson name commission rate invoice # amount date paid customer name addresscustomer #dateorder # pricequantityorder #item # quantity on hand descriptionsupplieritem # cost SALESPERSON INVOICE ORDER CATALOG LINE ITEM !58Copyright 2018 by Data Blueprint Slide # • Sales commission-based pricing information • Difficult to change a customer address • Easy to implement variable pricing - difficult to implement standard pricing - is standard pricing implemented • Sales person information is not directly tied to the order • Price not included in the catalog • Do sales people sell things that are shipped quickly so they get their commission quicker? • Nothing prohibits a sales from having multiple sales persons • Multiple invoices are allowed for a single order • Partial shipment is allowed • Data base cannot tell what part of an order the invoice pertains to Model Purpose Statement:
 This model codifies the official 
 vocabulary and specific 
 operational rules to be used when 
 describing aspects of any of the 
 following organizational concepts: – salesperson
 – invoice
 – order
 – line item
 – catalog
  • 43. !59 DISPOSITION Data Map Copyright 2018 by Data Blueprint Slide # Model Purpose Statement:
 This model codifies the official 
 vocabulary to be used when 
 describing disposition related organizational concepts:
 – user
 – admission
 – discharge
 – encounter
 – facility
 – provider
 – diagnosis Data Model #4: DISPOSITION • At least one but possibly more system USERS enter the DISPOSITION facts into the system. • An ADMISSION is associated with one and only one DISCHARGE. • An ADMISSION is associated with zero or more FACILITIES. • An ADMISSION is associated with zero or more PROVIDERS. • An ADMISSION is associated with one or more ENCOUNTERS. • An ENCOUNTER may be recorded by a system USER. • An ENCOUNTER may be associated with a PROVIDER. • An ENCOUNTER may be associated with one or more DIAGNOSES. • At least one but possibly more system USERS enter the DISPOSITION facts into the system. • An ADMISSION is associated with one and only one DISCHARGE. • An ADMISSION is associated with zero or more FACILITIES. • An ADMISSION is associated with zero or more PROVIDERS. • An ADMISSION is associated with one or more ENCOUNTERS. • An ENCOUNTER may be recorded by a system USER. • An ENCOUNTER may be associated with a PROVIDER. • An ENCOUNTER may be associated with one or more DIAGNOSES. !60 ADMISSION Contains information about patient admission history related to one or more inpatient episodes DIAGNOSIS Contains the International Disease Classification (IDC) of code representation and/or description of a patient's health related to an inpatient code DISCHARGE A table of codes describing disposition types available for an inpatient at a FACILITY ENCOUNTER Tracking information related to inpatient episodes FACILITY File containing a list of all facilities in regional health care system PROVIDER Full name of a member of the FACILITY team providing services to the patient USER Any user with access to create, read, update, and delete DISPOSITION data Copyright 2018 by Data Blueprint Slide # ADMISSION Contains information about patient admission history related to one or more inpatient episodes DIAGNOSIS Contains the International Disease Classification (IDC) of code representation and/or description of a patient's health related to an inpatient code DISCHARGE A table of codes describing disposition types available for an inpatient at a FACILITY ENCOUNTER Tracking information related to inpatient episodes FACILITY File containing a list of all facilities in regional health care system PROVIDER Full name of a member of the FACILITY team providing services to the patient USER Any user with access to create, read, update, and delete DISPOSITION data ADMISSION Contains information about patient admission history related to one or more inpatient episodes DIAGNOSIS Contains the International Disease Classification (IDC) of code representation and/or description of a patient's health related to an inpatient code DISCHARGE A table of codes describing disposition types available for an inpatient at a FACILITY ENCOUNTER Tracking information related to inpatient episodes FACILITY File containing a list of all facilities in regional health care system PROVIDER Full name of a member of the FACILITY team providing services to the patient USER Any user with access to create, read, update, and delete DISPOSITION data Death must be a disposition code!
  • 44. Two Brilliant Einstein Quotes • "The significant problems we face cannot be solved at the same level of thinking we were at when we created them." – Albert Einstein !61Copyright 2018 by Data Blueprint Slide # IT Project or Application-Centric Development Original articulation from Doug Bagley @ Walmart !62Copyright 2018 by Data Blueprint Slide # Data/ Information IT
 Projects 
 Strategy • In support of strategy, organizations implement IT projects • Data/information are typically considered within the scope of IT projects • Problems with this approach: – Ensures data is formed to the applications and not around the organizational-wide information requirements – Process are narrowly formed around applications – Very little data reuse is possible
  • 45. Data-Centric Development Original articulation from Doug Bagley @ Walmart !63Copyright 2018 by Data Blueprint Slide # IT
 Projects Data/
 Information 
 Strategy • In support of strategy, the organization develops specific, shared data-based goals/objectives • These organizational data goals/ objectives drive the development of specific IT projects with an eye to organization-wide usage • Advantages of this approach: – Data/information assets are developed from an organization-wide perspective – Systems support organizational data needs and compliment organizational process flows – Maximum data/information reuse theDataDoctrine.com We are uncovering better ways of developing
 IT systems by doing it and helping others do it.
 Through this work we have come to value:
 
 Data programmes preceding software development Stable data structures preceding stable code Shared data preceding completed software Data reuse preceding reusable code
 !64Copyright 2018 by Data Blueprint Slide #
  • 46. theDataDoctrine.com We are uncovering better ways of developing
 IT systems by doing it and helping others do it.
 Through this work we have come to value:
 Data programmes preceding software development Stable data structures preceding stable code Shared data preceding completed software Data reuse preceding reusable code !65Copyright 2018 by Data Blueprint Slide # 
 That is, while there is value in the items on
 the right, we value the items on the left more. • "Everything should be made as simple as possible, but no simpler." – Albert Einstein Two Brilliant Einstein Quotes !66Copyright 2018 by Data Blueprint Slide #
  • 47. Typically Managed Architectures • Process Architecture – Arrangement of inputs -> transformations = value -> outputs – Typical elements: Functions, activities, workflow, events, cycles, products, procedures • Systems Architecture – Applications, software components, interfaces, projects • Business Architecture – Goals, strategies, roles, organizational structure, location(s) • Security Architecture – Arrangement of security controls relation to IT Architecture • Technical Architecture/Tarchitecture – Relation of software capabilities/technology stack – Structure of the technology infrastructure of an enterprise, solution or system – Typical elements: Networks, hardware, software platforms, standards/protocols • Data/Information Architecture – Arrangement of data assets supporting organizational strategy – Typical elements: specifications expressed as entities, relationships, attributes, definitions, values, vocabularies !67Copyright 2018 by Data Blueprint Slide # As Is Information
 Requirements
 Assets As Is Data Design Assets As Is Data Implementation 
 Assets ExistingNew Modeling in Various Contexts O2 Recreate
 Data Design Reverse Engineering Forward engineering O5 Reconstitute
 Requirements O9 Reimplement Data To Be Data 
 Implementation 
 Assets O8 
 Redesign
 Data O4
 Recon-
 stitute
 Data 
 Design O3 Recreate
 Requirements O6 Redesign Data To Be
 Design 
 Assets O7 Re-
 develop
 Require-
 ments To Be Requirements Assets O1 Recreate Data
 Implementation Metadata !68Copyright 2018 by Data Blueprint Slide #
  • 48. Information Architecture Component Reengineering Options O-1 data implementation (e.g., by recreating descriptions of implemented file layouts); O-2 data designs (e.g., by recreating the logical system design layouts); or O-3 information requirements (e.g., by recreating existing system specifications and business rules). O-4 data design assets by examining the existing data implementation (when appropriate O-1 can facilitate O-4); and O-5 system information requirements by reverse engineering the data design O-4. (Note: if the data design doesn't exist O-4 must precede O-5.) O-6 transforming as is data design assets, yielding improved to be data designs that are based on reconstituted data design assets produced by O-2 or O-4 and (possibly O-1); O-7 transforming as is system requirements into to be system requirements that are based on reconstituted system requirements produced by O-3 or O-5 and (possibly O-2); O-8 redesigning to be data design assets using the to be system requirements based on reconstituted system requirements produced by O-7; and O-9 re-implementing system data based on data redesigns produced by O-6 or O-8. !69Copyright 2018 by Data Blueprint Slide # Model Evolution Framework !70Copyright 2018 by Data Blueprint Slide # Conceptual Logical Physical 
 
 
 Goal Validated Not Validated Every change can be mapped to a transformation in this framework!
  • 49. Model Evolution (better explanation) !71Copyright 2018 by Data Blueprint Slide # As-is To-be Technology Independent/ Logical Technology Dependent/ Physical abstraction Other logical as-is data architecture components • "Concern for man and his fate must always form the chief interest of all technical endeavors. Never forget this in the midst of your diagrams and equations." – Albert Einstein !72Copyright 2018 by Data Blueprint Slide #
  • 50. Data Models Used to Support Strategy • Flexible, adaptable data structures • Cleaner, less complex code • Ensure strategy effectiveness measurement • Build in future capabilities • Form/assess merger and acquisitions strategies !73Copyright 2018 by Data Blueprint Slide # Employee
 Type Employee Sales
 Person Manager Manager
 Type Staff
 Manager Line
 Manager Adapted from Clive Finkelstein Information Engineering Strategic Systems Development 1992 How do Data Models Support Organizational Strategy? • Consider the opposite question: – Were your systems explicitly designed to 
 be integrated or otherwise work together? – If not then what is the likelihood that they 
 will work well together? – In all likelihood your organization is spending between 20-40% of its IT budget compensating for poor data structure integration – They cannot be helpful as long as their structure is unknown • Two answers – Achieving efficiency and effectiveness goals – Providing organizational dexterity for rapid implementation !74Copyright 2018 by Data Blueprint Slide #
  • 51. Typical focus of a database modeling effort Data Modeling Ensures Interoperability !75Copyright 2018 by Data Blueprint Slide # Program F Program E Program D Program G Program H Application domain 2Application domain 3 Program I Typical focus of a software engineering effort Program A DataModel DataModel DataModel DataModel DataModel DataModel Program F Program E Program D Program G Program H Program I Application domain 2Application domain 3 DataModel DataModel DataModel Data Model Focus has Great Potential Business Value • How are decisions about the range and scope of common data usage, made? • Analysis scope is on use of data to support a process • Problems caused by data exchange or interface problems • Goals often connect strategic and operational • One data model is ideal !76Copyright 2018 by Data Blueprint Slide # DataModel Program A
  • 52. !77Copyright 2018 by Data Blueprint Slide # Data Modeling Fundamentals • Data Management Overview • Motivation – of Systems/components – Data is a not well understood substructure • Why data modeling & what is it? – Model represents our understanding of the – Fundamental, foundational system characteristics – Shared between system and human • Fundamentals – The power of the purpose statement – Understanding data centric thinking – Data modeling compliments other architecture/ engineering techniques, as well as – Challenges beyond data modeling • Take Aways, References & Q&A Use Models to !78 • Store and formalize information • Filter out extraneous detail • Define an essential set of 
 information • Help understand complex system behavior • Gain information from the process of developing and interacting with the model • Evaluate various scenarios or other outcomes indicated by the model • Monitor and predict system responses to changing environmental conditions Copyright 2018 by Data Blueprint Slide #
  • 53. • Goal must be shared IT/business understanding – No disagreements = insufficient communication • Data sharing/exchange is largely and highly automated and 
 thus dependent on successful engineering – It is critical to engineer a sound foundation of data modeling basics 
 (the essence) on which to build advantageous data technologies • Modeling characteristics change over the course of analysis – Different model instances may be useful to different analytical problems • Incorporate motivation (purpose statements) in all modeling – Modeling is a problem defining as well as a problem solving activity - both are inherent to architecture • Use of modeling is much more important than selection of a specific modeling method • Models are often living documents – It easily adapts to change • Models must have modern access/interface/search technologies – Models need to be available in an easily searchable manner • Utility is paramount – Adding color and diagramming objects customizes models and allows for a more engaging and enjoyable user review process Data Modeling for Business Value !79 Inspired by: Karen Lopez http://www.information-management.com/newsletters/enterprise_architecture_data_model_ERP_BI-10020246-1.html?pg=2 Copyright 2018 by Data Blueprint Slide # Why Modeling !80Copyright 2018 by Data Blueprint Slide # • Would you build a house without an architecture sketch? • Model is the sketch of the system to be built in a project. • Would you like to have an estimate how much your new house is going to cost? • Your model gives you a very good idea of how demanding the implementation work is going to be! • If you hired a set of constructors from all over the world to build your house, would you like them to have a common language? • Model is the common language for the project team. • Would you like to verify the proposals of the construction team before the work gets started? • Models can be reviewed before thousands of hours of implementation work will be done. • If it was a great house, would you like to build something rather similar again, in another place? • It is possible to implement the system to various platforms using the same model. • Would you drill into a wall of your house without a map of the plumbing and electric lines? • Models document the system built in a project. This makes life easier for the support and maintenance!
  • 54. Upcoming Events Enterprise Data World 2018 (San Diego)
 The First Year as a CDO
 April 24, 2018 @ 1:30 PM ET May Webinar:
 Implementing the Data Maturity Model
 May 8, 2018 @ 2:00 PM ET/11:00 AM PT June Webinar:
 Data Governance Strategies
 June 12, 2018 @ 2:00 PM ET/11:00 AM PT DGIQ 2018 (San Diego)
 Keeping the Momentum Going in your Data Quality Program
 June 11, 2018 @ 1:30 PM (PT) Sign up for webinars at: www.datablueprint.com/webinar-schedule !81Copyright 2018 by Data Blueprint Slide #Copyright 2018 by Data Blueprint Slide # Brought to you by: Join in the discussion - questions? It’s your turn! Use the chat feature or Twitter (#dataed) to submit your questions to Peter now! + = !82Copyright 2018 by Data Blueprint Slide #
  • 55. 10124 W. Broad Street, Suite C Glen Allen, Virginia 23060 804.521.4056 Copyright 2018 by Data Blueprint Slide # !83