SlideShare uma empresa Scribd logo
1 de 49
Baixar para ler offline
Elements of Data Documentation
Adam Mack
Education and Human Development Incubator (EHDi)
Social Science Research Institute
October 1, 2015
Why Is Documentation Important?
• Describe the contents of the data
• Explain context in which data was collected
• Explain any manipulations performed on the
data
• Allow research data to be understood by
people outside of the original project
Do I Need to Document?
Back in the day… … and now.
Research:
Consequences of Insufficient
Documentation
Consequences of Insufficient
Documentation
• Data may be unusable
• May make inaccurate assumptions about data
– Manipulations performed on data may affect
results of analyses
– May be unclear how to interpret contents of a
variable
Consequences of Insufficient Documentation:
Example
• Assume each of the following prompts is answered
on a 1–5 agreement scale.
– Data management is great. (dmgreat) 5
– Data management is the greatest! (dmgrtst) 5
– I don’t like data management. (dmnolike) 1
• Dmnolike needs to be reversed scored (to 5) before a
scale score can be calculated from the variables.
• You can recode this value within the same variable,
but should you?
Elements of Data Documentation
• What are the most important elements to
document?
Elements of Data Documentation
• What are the most important elements to
document?
– Data elements
– Study elements
– Processes and decisions
Elements of Data Documentation
• Who will be using the documentation?
– Data managers
– Statisticians
– Researchers
– Outside users
Elements of Data Documentation
• When should documentation be created?
– Often, projects wait until data has been collected
before creating documentation such as
codebooks.
– Creating documentation early in the project has
numerous advantages.
Elements of Data Documentation
• How should these elements be documented?
Potential forms that documentation may take
include:
– Codebook
Elements of Data Documentation
• How should these elements be documented?
Potential forms that documentation may take
include:
– Annotated version of instrument
Elements of Data Documentation
• How should these elements be documented?
Potential forms that documentation may take
include:
– More descriptive, less structured forms of
documentation (data narratives)
Data-Level Documentation
• What are the most important elements to
document?
– Data elements
– Study elements
– Processes and decisions
Data-Level Documentation
Should include basic information needed to use
the data, including:
• Structural information about variable
– Name of variable
– Label (if applicable)
– Type of variable (numeric or character)
– Length of variable
Data-Level Documentation
• Information describing variable contents
– Question text (or text description of variable
contents)
– Valid values
– Coding of values
Data-Level Documentation
• Scales/derived variables
– Algorithm used to create variable
– Procedures for handling missing data
Data-Level Documentation
• Question routing (if skip patterns used)
– Identify number of participants asked each
question/path through survey
• Error checking/validation
Data-Level Documentation
• Reliability of scales
– Calculate Cronbach’s alpha for each scale included
in the data
– Compare values for your study to previously
reported values in the literature
Types of Data Documentation
• Tabular codebook (Excel)
– Good for organizing a large amount of information
concisely
– Sortable
– Filterable
– Customizable; can hide columns that may be
needed but are not of interest to a general
audience
Tabular Codebook
Types of Data Documentation
• Annotated instrument
– Contains basic variable and value information in
context
– Easy to interpret
– Difficult to integrate much additional detail; not
useful for some forms of data
Annotated Instrument
Study-Level Documentation
• What are the most important elements to
document?
– Data elements
– Study elements
– Processes and decisions
Study-Level Documentation
• Details about the source of the data
– Study design and purpose
– Collection method
– Information about the research sample
– Longitudinal time points (if applicable)
Study-Level Documentation
• Information about data files
– File name/version
– Date created
– Number of records
– Number of variables
– Changes since last version of file
Study-Level Documentation
• Information about measures used
– Description of measure
– Description of scales
– Source of measure, including references as
appropriate
Study-Level Documentation
Programs used to process/manipulate data
– Documentation within program (comments)
Study-Level Documentation
Programs used to process/manipulate data
– Documentation of what various programs do and
in what order they are used
Program Description
SSIS_01 Creates data set with 1st batch of data. Includes scoring
code for social skills and problem behavior scales and
subscales.
SSIS_01a Corrects scoring issue with problem behavior scale.
SSIS_02 Adds 2nd batch of data; adds assessment date and birth
date information to allow calculation of age-dependent
scores.
SSIS_03 Adds 3rd batch of data.
Study-Level Documentation
• Data narrative
– Good for measure/study-level information
Study-Level Documentation
• Data narrative (continued)
Decision and Process Documentation
• What are the most important elements to
document?
– Data elements
– Study elements
– Processes and decisions
Decision and Process Documentation
• By far, the least established area of research
documentation.
• Due to individual differences between
research projects, it can be difficult to identify
a standard template.
Decision and Process Documentation
Elements to include in documentation:
• Scope (variables/measures)
• Time (if multiple time points)
• Describe purpose of process or situation
requiring a decision being made
Decision and Process Documentation
Elements to include in documentation:
• Information from the data that describes or
affects the decision or process
• A description of the process itself, including:
– Any software or tools needed to complete the
process
– Any resources /references used
Decision and Process Documentation
• What sorts of decisions and processes should
be documented with this level of detail?
– Basic scales and processes that are commonly
utilized may not require this much detail
– Processes and procedures that are not well
established or that deviate significantly from the
standard method should be documented
Decision and Process Documentation
• Examples of processes that might need to be
documented
– Naming conventions for variables
– Naming conventions for data files
– Structure of data directories
– Version information
Decision and Process Documentation
• Examples of decisions that might need to be
documented
– Resolving discrepancies in data obtained from
multiple sources or at multiple time points
– Data transformations that require interpretation
Decision and Process Documentation
Tools for Documentation
• Statistical software packages (e.g. SAS, Stata)
– Variable information (PROC contents; describe)
– Provides a good starting point for a codebook
• Database management systems
Tools for Documentation
• Data collection instruments
– Paper forms
– Electronic/online collection
PROC CODEBOOK (SAS)
• PROC CODEBOOK is a SAS macro that creates
a codebook based on a SAS data set
PROC CODEBOOK (SAS)
• Requirements
– Labels on variables and data set
– Formats assigned to categorical values
– Minimum of 1 categorical/2 numeric variables
• Optional elements
– Ordering of variables (default is by variable name)
– ODS formatting of title text
PROC CODEBOOK (SAS)
• Can be useful when dealing with data sets that
include SAS formats
• If data set does not already have formats applied,
may take as much time to add them as to create
your own codebook (which has more flexibility)
• To download the SAS macro and access
documentation, visit
http://www.cpc.unc.edu/research/tools/data_an
alysis/proc_codebook
Documentation Standards
• How can we document the data in a way that
helps interested parties find the data?
• Dublin Core
– Includes 15 standard elements.
– Intended for describing a wide range of different web-
based or physical resources
• Data Documentation Initiative
– An international specification for describing data from the
social, behavioral, and economic sciences
– Supports the entire research data lifecycle
The Takeaway
• Good documentation is not just a product, it’s
an approach
Resources
• Inter-university Consortium for Political and
Social Research (ICPSR)
– Guide to Social Science Data Preparation and
Archiving
• Cornell Research Data Management Service
Group
– Guide to writing "readme" style metadata
• Duke University Libraries
Questions?
• Ask away!
• If you would like to talk more about
documentation for your own projects, contact
us at ehdidata@duke.edu.
• Thanks for coming!
Acknowledgements
For their help in putting together this workshop:
• Lorrie Schmid
• Chandler Thomas
And for helping keep you interested in the material:
• Darth Vader
• Success Kid
• Mark Wahlberg (and @ResearchMark)

Mais conteúdo relacionado

Mais procurados

Manuscript writing
Manuscript writing Manuscript writing
Manuscript writing MitalPatani
 
Steps of scientific research
Steps of scientific researchSteps of scientific research
Steps of scientific researchalbrwaz
 
Data Archiving and Sharing
Data Archiving and SharingData Archiving and Sharing
Data Archiving and SharingC. Tobin Magle
 
Understanding Research
Understanding ResearchUnderstanding Research
Understanding Researchayazqaiser
 
Data collection and analysis
Data collection and analysisData collection and analysis
Data collection and analysisAndres Baravalle
 
Systematic review and meta analaysis course - part 1
Systematic review and meta analaysis course - part 1Systematic review and meta analaysis course - part 1
Systematic review and meta analaysis course - part 1Ahmed Negida
 
Steps epi info_training_guide
Steps epi info_training_guideSteps epi info_training_guide
Steps epi info_training_guideNahom Girma
 
Increasing your citation count
Increasing your citation countIncreasing your citation count
Increasing your citation countHelen Singer
 
Introduction to Data Management
Introduction to Data ManagementIntroduction to Data Management
Introduction to Data ManagementAmanda Whitmire
 
Research Method
Research MethodResearch Method
Research MethodYeonYuRae
 
8. randomized control trials
8. randomized control trials8. randomized control trials
8. randomized control trialsNaveen Phuyal
 
Systematic Review: Beginner's Guide
Systematic Review: Beginner's Guide Systematic Review: Beginner's Guide
Systematic Review: Beginner's Guide Saee Deshpamde
 
Methods of data collection (research methodology)
Methods of data collection  (research methodology)Methods of data collection  (research methodology)
Methods of data collection (research methodology)Muhammed Konari
 
Systematic Reviews in the Health Sciences
Systematic Reviews in the Health SciencesSystematic Reviews in the Health Sciences
Systematic Reviews in the Health SciencesBecky Morin
 
Quantitative Research Presentation (1)
Quantitative Research Presentation (1)Quantitative Research Presentation (1)
Quantitative Research Presentation (1)Angelina Lapina
 

Mais procurados (20)

Manuscript writing
Manuscript writing Manuscript writing
Manuscript writing
 
Research methodology
Research methodology Research methodology
Research methodology
 
Steps of scientific research
Steps of scientific researchSteps of scientific research
Steps of scientific research
 
Data Archiving and Sharing
Data Archiving and SharingData Archiving and Sharing
Data Archiving and Sharing
 
Research methodology
Research methodologyResearch methodology
Research methodology
 
Understanding Research
Understanding ResearchUnderstanding Research
Understanding Research
 
Data Quality Control
Data Quality ControlData Quality Control
Data Quality Control
 
Data collection and analysis
Data collection and analysisData collection and analysis
Data collection and analysis
 
Systematic review and meta analaysis course - part 1
Systematic review and meta analaysis course - part 1Systematic review and meta analaysis course - part 1
Systematic review and meta analaysis course - part 1
 
Steps epi info_training_guide
Steps epi info_training_guideSteps epi info_training_guide
Steps epi info_training_guide
 
Increasing your citation count
Increasing your citation countIncreasing your citation count
Increasing your citation count
 
Introduction to Data Management
Introduction to Data ManagementIntroduction to Data Management
Introduction to Data Management
 
Reserch methodology
Reserch methodologyReserch methodology
Reserch methodology
 
Research Method
Research MethodResearch Method
Research Method
 
8. randomized control trials
8. randomized control trials8. randomized control trials
8. randomized control trials
 
Systematic Review: Beginner's Guide
Systematic Review: Beginner's Guide Systematic Review: Beginner's Guide
Systematic Review: Beginner's Guide
 
Methods of data collection (research methodology)
Methods of data collection  (research methodology)Methods of data collection  (research methodology)
Methods of data collection (research methodology)
 
Data Quality Presentation.ppt
Data Quality Presentation.pptData Quality Presentation.ppt
Data Quality Presentation.ppt
 
Systematic Reviews in the Health Sciences
Systematic Reviews in the Health SciencesSystematic Reviews in the Health Sciences
Systematic Reviews in the Health Sciences
 
Quantitative Research Presentation (1)
Quantitative Research Presentation (1)Quantitative Research Presentation (1)
Quantitative Research Presentation (1)
 

Semelhante a Elements of Data Documentation

DAXTrainingPresentation_July2015 (1)
DAXTrainingPresentation_July2015 (1)DAXTrainingPresentation_July2015 (1)
DAXTrainingPresentation_July2015 (1)Randy Bowman
 
DATA SCIENCE AND BIG DATA ANALYTICSCHAPTER 2 DATA ANA.docx
DATA SCIENCE AND BIG DATA ANALYTICSCHAPTER 2 DATA ANA.docxDATA SCIENCE AND BIG DATA ANALYTICSCHAPTER 2 DATA ANA.docx
DATA SCIENCE AND BIG DATA ANALYTICSCHAPTER 2 DATA ANA.docxrandyburney60861
 
Data Management Lab: Session 2 slides
Data Management Lab: Session 2 slidesData Management Lab: Session 2 slides
Data Management Lab: Session 2 slidesIUPUI
 
[AIIM17] Data Categorization You Can Live With - Monica Crocker
[AIIM17]  Data Categorization You Can Live With - Monica Crocker [AIIM17]  Data Categorization You Can Live With - Monica Crocker
[AIIM17] Data Categorization You Can Live With - Monica Crocker AIIM International
 
Database Systems - Lecture Week 1
Database Systems - Lecture Week 1Database Systems - Lecture Week 1
Database Systems - Lecture Week 1Dios Kurniawan
 
Technical Documentation 101 for Data Engineers.pdf
Technical Documentation 101 for Data Engineers.pdfTechnical Documentation 101 for Data Engineers.pdf
Technical Documentation 101 for Data Engineers.pdfShristi Shrestha
 
ISBB_Chapter4.pptx
ISBB_Chapter4.pptxISBB_Chapter4.pptx
ISBB_Chapter4.pptxpurnecole
 
data structures and its importance
 data structures and its importance  data structures and its importance
data structures and its importance Anaya Zafar
 
Documentation and Metdata - VA DM Bootcamp
Documentation and Metdata - VA DM BootcampDocumentation and Metdata - VA DM Bootcamp
Documentation and Metdata - VA DM BootcampSherry Lake
 
AIS PPt.pptx
AIS PPt.pptxAIS PPt.pptx
AIS PPt.pptxdereje33
 
2. Business Data Analytics and Technology.pptx
2. Business Data Analytics and Technology.pptx2. Business Data Analytics and Technology.pptx
2. Business Data Analytics and Technology.pptxnirmalanr2
 
Week 2 - Database System Development Lifecycle-old.pptx
Week 2 - Database System Development Lifecycle-old.pptxWeek 2 - Database System Development Lifecycle-old.pptx
Week 2 - Database System Development Lifecycle-old.pptxNurulIzrin
 
Support Your Data, Kyoto University
Support Your Data, Kyoto UniversitySupport Your Data, Kyoto University
Support Your Data, Kyoto UniversityStephanie Simms
 
System Analysis And Design
System Analysis And DesignSystem Analysis And Design
System Analysis And DesignLijo Stalin
 
Info 2402 irt-chapter_2
Info 2402 irt-chapter_2Info 2402 irt-chapter_2
Info 2402 irt-chapter_2Shahriar Rafee
 
Research Lifecycles and RDM
Research Lifecycles and RDMResearch Lifecycles and RDM
Research Lifecycles and RDMMarieke Guy
 
Data management: documentation and metadata
Data management: documentation and metadataData management: documentation and metadata
Data management: documentation and metadataStatistics Specialist
 
Who says you can't do records management in SharePoint?
Who says you can't do records management in SharePoint?Who says you can't do records management in SharePoint?
Who says you can't do records management in SharePoint?John F. Holliday
 

Semelhante a Elements of Data Documentation (20)

ITFT- Dbms
ITFT- DbmsITFT- Dbms
ITFT- Dbms
 
DAXTrainingPresentation_July2015 (1)
DAXTrainingPresentation_July2015 (1)DAXTrainingPresentation_July2015 (1)
DAXTrainingPresentation_July2015 (1)
 
DATA SCIENCE AND BIG DATA ANALYTICSCHAPTER 2 DATA ANA.docx
DATA SCIENCE AND BIG DATA ANALYTICSCHAPTER 2 DATA ANA.docxDATA SCIENCE AND BIG DATA ANALYTICSCHAPTER 2 DATA ANA.docx
DATA SCIENCE AND BIG DATA ANALYTICSCHAPTER 2 DATA ANA.docx
 
Data Management Lab: Session 2 slides
Data Management Lab: Session 2 slidesData Management Lab: Session 2 slides
Data Management Lab: Session 2 slides
 
[AIIM17] Data Categorization You Can Live With - Monica Crocker
[AIIM17]  Data Categorization You Can Live With - Monica Crocker [AIIM17]  Data Categorization You Can Live With - Monica Crocker
[AIIM17] Data Categorization You Can Live With - Monica Crocker
 
Database Systems - Lecture Week 1
Database Systems - Lecture Week 1Database Systems - Lecture Week 1
Database Systems - Lecture Week 1
 
Technical Documentation 101 for Data Engineers.pdf
Technical Documentation 101 for Data Engineers.pdfTechnical Documentation 101 for Data Engineers.pdf
Technical Documentation 101 for Data Engineers.pdf
 
ISBB_Chapter4.pptx
ISBB_Chapter4.pptxISBB_Chapter4.pptx
ISBB_Chapter4.pptx
 
data structures and its importance
 data structures and its importance  data structures and its importance
data structures and its importance
 
MEDIN data guidelines
MEDIN data guidelinesMEDIN data guidelines
MEDIN data guidelines
 
Documentation and Metdata - VA DM Bootcamp
Documentation and Metdata - VA DM BootcampDocumentation and Metdata - VA DM Bootcamp
Documentation and Metdata - VA DM Bootcamp
 
AIS PPt.pptx
AIS PPt.pptxAIS PPt.pptx
AIS PPt.pptx
 
2. Business Data Analytics and Technology.pptx
2. Business Data Analytics and Technology.pptx2. Business Data Analytics and Technology.pptx
2. Business Data Analytics and Technology.pptx
 
Week 2 - Database System Development Lifecycle-old.pptx
Week 2 - Database System Development Lifecycle-old.pptxWeek 2 - Database System Development Lifecycle-old.pptx
Week 2 - Database System Development Lifecycle-old.pptx
 
Support Your Data, Kyoto University
Support Your Data, Kyoto UniversitySupport Your Data, Kyoto University
Support Your Data, Kyoto University
 
System Analysis And Design
System Analysis And DesignSystem Analysis And Design
System Analysis And Design
 
Info 2402 irt-chapter_2
Info 2402 irt-chapter_2Info 2402 irt-chapter_2
Info 2402 irt-chapter_2
 
Research Lifecycles and RDM
Research Lifecycles and RDMResearch Lifecycles and RDM
Research Lifecycles and RDM
 
Data management: documentation and metadata
Data management: documentation and metadataData management: documentation and metadata
Data management: documentation and metadata
 
Who says you can't do records management in SharePoint?
Who says you can't do records management in SharePoint?Who says you can't do records management in SharePoint?
Who says you can't do records management in SharePoint?
 

Último

How to do quick user assign in kanban in Odoo 17 ERP
How to do quick user assign in kanban in Odoo 17 ERPHow to do quick user assign in kanban in Odoo 17 ERP
How to do quick user assign in kanban in Odoo 17 ERPCeline George
 
THEORIES OF ORGANIZATION-PUBLIC ADMINISTRATION
THEORIES OF ORGANIZATION-PUBLIC ADMINISTRATIONTHEORIES OF ORGANIZATION-PUBLIC ADMINISTRATION
THEORIES OF ORGANIZATION-PUBLIC ADMINISTRATIONHumphrey A Beña
 
ENGLISH6-Q4-W3.pptxqurter our high choom
ENGLISH6-Q4-W3.pptxqurter our high choomENGLISH6-Q4-W3.pptxqurter our high choom
ENGLISH6-Q4-W3.pptxqurter our high choomnelietumpap1
 
Computed Fields and api Depends in the Odoo 17
Computed Fields and api Depends in the Odoo 17Computed Fields and api Depends in the Odoo 17
Computed Fields and api Depends in the Odoo 17Celine George
 
Proudly South Africa powerpoint Thorisha.pptx
Proudly South Africa powerpoint Thorisha.pptxProudly South Africa powerpoint Thorisha.pptx
Proudly South Africa powerpoint Thorisha.pptxthorishapillay1
 
ENGLISH 7_Q4_LESSON 2_ Employing a Variety of Strategies for Effective Interp...
ENGLISH 7_Q4_LESSON 2_ Employing a Variety of Strategies for Effective Interp...ENGLISH 7_Q4_LESSON 2_ Employing a Variety of Strategies for Effective Interp...
ENGLISH 7_Q4_LESSON 2_ Employing a Variety of Strategies for Effective Interp...JhezDiaz1
 
Barangay Council for the Protection of Children (BCPC) Orientation.pptx
Barangay Council for the Protection of Children (BCPC) Orientation.pptxBarangay Council for the Protection of Children (BCPC) Orientation.pptx
Barangay Council for the Protection of Children (BCPC) Orientation.pptxCarlos105
 
Gas measurement O2,Co2,& ph) 04/2024.pptx
Gas measurement O2,Co2,& ph) 04/2024.pptxGas measurement O2,Co2,& ph) 04/2024.pptx
Gas measurement O2,Co2,& ph) 04/2024.pptxDr.Ibrahim Hassaan
 
Roles & Responsibilities in Pharmacovigilance
Roles & Responsibilities in PharmacovigilanceRoles & Responsibilities in Pharmacovigilance
Roles & Responsibilities in PharmacovigilanceSamikshaHamane
 
DATA STRUCTURE AND ALGORITHM for beginners
DATA STRUCTURE AND ALGORITHM for beginnersDATA STRUCTURE AND ALGORITHM for beginners
DATA STRUCTURE AND ALGORITHM for beginnersSabitha Banu
 
Like-prefer-love -hate+verb+ing & silent letters & citizenship text.pdf
Like-prefer-love -hate+verb+ing & silent letters & citizenship text.pdfLike-prefer-love -hate+verb+ing & silent letters & citizenship text.pdf
Like-prefer-love -hate+verb+ing & silent letters & citizenship text.pdfMr Bounab Samir
 
Q4 English4 Week3 PPT Melcnmg-based.pptx
Q4 English4 Week3 PPT Melcnmg-based.pptxQ4 English4 Week3 PPT Melcnmg-based.pptx
Q4 English4 Week3 PPT Melcnmg-based.pptxnelietumpap1
 
Inclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdf
Inclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdfInclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdf
Inclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdfTechSoup
 
Influencing policy (training slides from Fast Track Impact)
Influencing policy (training slides from Fast Track Impact)Influencing policy (training slides from Fast Track Impact)
Influencing policy (training slides from Fast Track Impact)Mark Reed
 
Choosing the Right CBSE School A Comprehensive Guide for Parents
Choosing the Right CBSE School A Comprehensive Guide for ParentsChoosing the Right CBSE School A Comprehensive Guide for Parents
Choosing the Right CBSE School A Comprehensive Guide for Parentsnavabharathschool99
 
Procuring digital preservation CAN be quick and painless with our new dynamic...
Procuring digital preservation CAN be quick and painless with our new dynamic...Procuring digital preservation CAN be quick and painless with our new dynamic...
Procuring digital preservation CAN be quick and painless with our new dynamic...Jisc
 
AMERICAN LANGUAGE HUB_Level2_Student'sBook_Answerkey.pdf
AMERICAN LANGUAGE HUB_Level2_Student'sBook_Answerkey.pdfAMERICAN LANGUAGE HUB_Level2_Student'sBook_Answerkey.pdf
AMERICAN LANGUAGE HUB_Level2_Student'sBook_Answerkey.pdfphamnguyenenglishnb
 
Incoming and Outgoing Shipments in 3 STEPS Using Odoo 17
Incoming and Outgoing Shipments in 3 STEPS Using Odoo 17Incoming and Outgoing Shipments in 3 STEPS Using Odoo 17
Incoming and Outgoing Shipments in 3 STEPS Using Odoo 17Celine George
 

Último (20)

How to do quick user assign in kanban in Odoo 17 ERP
How to do quick user assign in kanban in Odoo 17 ERPHow to do quick user assign in kanban in Odoo 17 ERP
How to do quick user assign in kanban in Odoo 17 ERP
 
THEORIES OF ORGANIZATION-PUBLIC ADMINISTRATION
THEORIES OF ORGANIZATION-PUBLIC ADMINISTRATIONTHEORIES OF ORGANIZATION-PUBLIC ADMINISTRATION
THEORIES OF ORGANIZATION-PUBLIC ADMINISTRATION
 
ENGLISH6-Q4-W3.pptxqurter our high choom
ENGLISH6-Q4-W3.pptxqurter our high choomENGLISH6-Q4-W3.pptxqurter our high choom
ENGLISH6-Q4-W3.pptxqurter our high choom
 
Computed Fields and api Depends in the Odoo 17
Computed Fields and api Depends in the Odoo 17Computed Fields and api Depends in the Odoo 17
Computed Fields and api Depends in the Odoo 17
 
Proudly South Africa powerpoint Thorisha.pptx
Proudly South Africa powerpoint Thorisha.pptxProudly South Africa powerpoint Thorisha.pptx
Proudly South Africa powerpoint Thorisha.pptx
 
ENGLISH 7_Q4_LESSON 2_ Employing a Variety of Strategies for Effective Interp...
ENGLISH 7_Q4_LESSON 2_ Employing a Variety of Strategies for Effective Interp...ENGLISH 7_Q4_LESSON 2_ Employing a Variety of Strategies for Effective Interp...
ENGLISH 7_Q4_LESSON 2_ Employing a Variety of Strategies for Effective Interp...
 
Barangay Council for the Protection of Children (BCPC) Orientation.pptx
Barangay Council for the Protection of Children (BCPC) Orientation.pptxBarangay Council for the Protection of Children (BCPC) Orientation.pptx
Barangay Council for the Protection of Children (BCPC) Orientation.pptx
 
Gas measurement O2,Co2,& ph) 04/2024.pptx
Gas measurement O2,Co2,& ph) 04/2024.pptxGas measurement O2,Co2,& ph) 04/2024.pptx
Gas measurement O2,Co2,& ph) 04/2024.pptx
 
Roles & Responsibilities in Pharmacovigilance
Roles & Responsibilities in PharmacovigilanceRoles & Responsibilities in Pharmacovigilance
Roles & Responsibilities in Pharmacovigilance
 
DATA STRUCTURE AND ALGORITHM for beginners
DATA STRUCTURE AND ALGORITHM for beginnersDATA STRUCTURE AND ALGORITHM for beginners
DATA STRUCTURE AND ALGORITHM for beginners
 
TataKelola dan KamSiber Kecerdasan Buatan v022.pdf
TataKelola dan KamSiber Kecerdasan Buatan v022.pdfTataKelola dan KamSiber Kecerdasan Buatan v022.pdf
TataKelola dan KamSiber Kecerdasan Buatan v022.pdf
 
Like-prefer-love -hate+verb+ing & silent letters & citizenship text.pdf
Like-prefer-love -hate+verb+ing & silent letters & citizenship text.pdfLike-prefer-love -hate+verb+ing & silent letters & citizenship text.pdf
Like-prefer-love -hate+verb+ing & silent letters & citizenship text.pdf
 
Model Call Girl in Tilak Nagar Delhi reach out to us at 🔝9953056974🔝
Model Call Girl in Tilak Nagar Delhi reach out to us at 🔝9953056974🔝Model Call Girl in Tilak Nagar Delhi reach out to us at 🔝9953056974🔝
Model Call Girl in Tilak Nagar Delhi reach out to us at 🔝9953056974🔝
 
Q4 English4 Week3 PPT Melcnmg-based.pptx
Q4 English4 Week3 PPT Melcnmg-based.pptxQ4 English4 Week3 PPT Melcnmg-based.pptx
Q4 English4 Week3 PPT Melcnmg-based.pptx
 
Inclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdf
Inclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdfInclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdf
Inclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdf
 
Influencing policy (training slides from Fast Track Impact)
Influencing policy (training slides from Fast Track Impact)Influencing policy (training slides from Fast Track Impact)
Influencing policy (training slides from Fast Track Impact)
 
Choosing the Right CBSE School A Comprehensive Guide for Parents
Choosing the Right CBSE School A Comprehensive Guide for ParentsChoosing the Right CBSE School A Comprehensive Guide for Parents
Choosing the Right CBSE School A Comprehensive Guide for Parents
 
Procuring digital preservation CAN be quick and painless with our new dynamic...
Procuring digital preservation CAN be quick and painless with our new dynamic...Procuring digital preservation CAN be quick and painless with our new dynamic...
Procuring digital preservation CAN be quick and painless with our new dynamic...
 
AMERICAN LANGUAGE HUB_Level2_Student'sBook_Answerkey.pdf
AMERICAN LANGUAGE HUB_Level2_Student'sBook_Answerkey.pdfAMERICAN LANGUAGE HUB_Level2_Student'sBook_Answerkey.pdf
AMERICAN LANGUAGE HUB_Level2_Student'sBook_Answerkey.pdf
 
Incoming and Outgoing Shipments in 3 STEPS Using Odoo 17
Incoming and Outgoing Shipments in 3 STEPS Using Odoo 17Incoming and Outgoing Shipments in 3 STEPS Using Odoo 17
Incoming and Outgoing Shipments in 3 STEPS Using Odoo 17
 

Elements of Data Documentation

  • 1. Elements of Data Documentation Adam Mack Education and Human Development Incubator (EHDi) Social Science Research Institute October 1, 2015
  • 2. Why Is Documentation Important? • Describe the contents of the data • Explain context in which data was collected • Explain any manipulations performed on the data • Allow research data to be understood by people outside of the original project
  • 3. Do I Need to Document? Back in the day… … and now. Research:
  • 5. Consequences of Insufficient Documentation • Data may be unusable • May make inaccurate assumptions about data – Manipulations performed on data may affect results of analyses – May be unclear how to interpret contents of a variable
  • 6. Consequences of Insufficient Documentation: Example • Assume each of the following prompts is answered on a 1–5 agreement scale. – Data management is great. (dmgreat) 5 – Data management is the greatest! (dmgrtst) 5 – I don’t like data management. (dmnolike) 1 • Dmnolike needs to be reversed scored (to 5) before a scale score can be calculated from the variables. • You can recode this value within the same variable, but should you?
  • 7. Elements of Data Documentation • What are the most important elements to document?
  • 8. Elements of Data Documentation • What are the most important elements to document? – Data elements – Study elements – Processes and decisions
  • 9. Elements of Data Documentation • Who will be using the documentation? – Data managers – Statisticians – Researchers – Outside users
  • 10. Elements of Data Documentation • When should documentation be created? – Often, projects wait until data has been collected before creating documentation such as codebooks. – Creating documentation early in the project has numerous advantages.
  • 11. Elements of Data Documentation • How should these elements be documented? Potential forms that documentation may take include: – Codebook
  • 12. Elements of Data Documentation • How should these elements be documented? Potential forms that documentation may take include: – Annotated version of instrument
  • 13. Elements of Data Documentation • How should these elements be documented? Potential forms that documentation may take include: – More descriptive, less structured forms of documentation (data narratives)
  • 14. Data-Level Documentation • What are the most important elements to document? – Data elements – Study elements – Processes and decisions
  • 15. Data-Level Documentation Should include basic information needed to use the data, including: • Structural information about variable – Name of variable – Label (if applicable) – Type of variable (numeric or character) – Length of variable
  • 16. Data-Level Documentation • Information describing variable contents – Question text (or text description of variable contents) – Valid values – Coding of values
  • 17. Data-Level Documentation • Scales/derived variables – Algorithm used to create variable – Procedures for handling missing data
  • 18. Data-Level Documentation • Question routing (if skip patterns used) – Identify number of participants asked each question/path through survey • Error checking/validation
  • 19. Data-Level Documentation • Reliability of scales – Calculate Cronbach’s alpha for each scale included in the data – Compare values for your study to previously reported values in the literature
  • 20. Types of Data Documentation • Tabular codebook (Excel) – Good for organizing a large amount of information concisely – Sortable – Filterable – Customizable; can hide columns that may be needed but are not of interest to a general audience
  • 22. Types of Data Documentation • Annotated instrument – Contains basic variable and value information in context – Easy to interpret – Difficult to integrate much additional detail; not useful for some forms of data
  • 24. Study-Level Documentation • What are the most important elements to document? – Data elements – Study elements – Processes and decisions
  • 25. Study-Level Documentation • Details about the source of the data – Study design and purpose – Collection method – Information about the research sample – Longitudinal time points (if applicable)
  • 26. Study-Level Documentation • Information about data files – File name/version – Date created – Number of records – Number of variables – Changes since last version of file
  • 27. Study-Level Documentation • Information about measures used – Description of measure – Description of scales – Source of measure, including references as appropriate
  • 28. Study-Level Documentation Programs used to process/manipulate data – Documentation within program (comments)
  • 29. Study-Level Documentation Programs used to process/manipulate data – Documentation of what various programs do and in what order they are used Program Description SSIS_01 Creates data set with 1st batch of data. Includes scoring code for social skills and problem behavior scales and subscales. SSIS_01a Corrects scoring issue with problem behavior scale. SSIS_02 Adds 2nd batch of data; adds assessment date and birth date information to allow calculation of age-dependent scores. SSIS_03 Adds 3rd batch of data.
  • 30. Study-Level Documentation • Data narrative – Good for measure/study-level information
  • 31. Study-Level Documentation • Data narrative (continued)
  • 32. Decision and Process Documentation • What are the most important elements to document? – Data elements – Study elements – Processes and decisions
  • 33. Decision and Process Documentation • By far, the least established area of research documentation. • Due to individual differences between research projects, it can be difficult to identify a standard template.
  • 34. Decision and Process Documentation Elements to include in documentation: • Scope (variables/measures) • Time (if multiple time points) • Describe purpose of process or situation requiring a decision being made
  • 35. Decision and Process Documentation Elements to include in documentation: • Information from the data that describes or affects the decision or process • A description of the process itself, including: – Any software or tools needed to complete the process – Any resources /references used
  • 36. Decision and Process Documentation • What sorts of decisions and processes should be documented with this level of detail? – Basic scales and processes that are commonly utilized may not require this much detail – Processes and procedures that are not well established or that deviate significantly from the standard method should be documented
  • 37. Decision and Process Documentation • Examples of processes that might need to be documented – Naming conventions for variables – Naming conventions for data files – Structure of data directories – Version information
  • 38. Decision and Process Documentation • Examples of decisions that might need to be documented – Resolving discrepancies in data obtained from multiple sources or at multiple time points – Data transformations that require interpretation
  • 39. Decision and Process Documentation
  • 40. Tools for Documentation • Statistical software packages (e.g. SAS, Stata) – Variable information (PROC contents; describe) – Provides a good starting point for a codebook • Database management systems
  • 41. Tools for Documentation • Data collection instruments – Paper forms – Electronic/online collection
  • 42. PROC CODEBOOK (SAS) • PROC CODEBOOK is a SAS macro that creates a codebook based on a SAS data set
  • 43. PROC CODEBOOK (SAS) • Requirements – Labels on variables and data set – Formats assigned to categorical values – Minimum of 1 categorical/2 numeric variables • Optional elements – Ordering of variables (default is by variable name) – ODS formatting of title text
  • 44. PROC CODEBOOK (SAS) • Can be useful when dealing with data sets that include SAS formats • If data set does not already have formats applied, may take as much time to add them as to create your own codebook (which has more flexibility) • To download the SAS macro and access documentation, visit http://www.cpc.unc.edu/research/tools/data_an alysis/proc_codebook
  • 45. Documentation Standards • How can we document the data in a way that helps interested parties find the data? • Dublin Core – Includes 15 standard elements. – Intended for describing a wide range of different web- based or physical resources • Data Documentation Initiative – An international specification for describing data from the social, behavioral, and economic sciences – Supports the entire research data lifecycle
  • 46. The Takeaway • Good documentation is not just a product, it’s an approach
  • 47. Resources • Inter-university Consortium for Political and Social Research (ICPSR) – Guide to Social Science Data Preparation and Archiving • Cornell Research Data Management Service Group – Guide to writing "readme" style metadata • Duke University Libraries
  • 48. Questions? • Ask away! • If you would like to talk more about documentation for your own projects, contact us at ehdidata@duke.edu. • Thanks for coming!
  • 49. Acknowledgements For their help in putting together this workshop: • Lorrie Schmid • Chandler Thomas And for helping keep you interested in the material: • Darth Vader • Success Kid • Mark Wahlberg (and @ResearchMark)