DATA ANALYSIS USING SPSS
Muhammad Ibrahim
Associate Professor of Statistics
Govt. MAO College Lahore
0300-4668681
Ibrahim.ap12@gmail.com
LEARNING OBJECTIVES
1.  Understand basic concepts of biostatistics
and computer software SPSS.
2.  Select appropriate statistical tests for
particular types of data.
3.  Recognize and interpret the output from
statistical analyses.
4.  Report statistical output in a concise and
appropriate manner.
BASIC TERMINOLOGY
Statistics, Biostatistics, Variable, Measurement
Scale, Data, Medical Data, type of data, Data
Analysis
VARIABLE, SCALE, DATA
A variable is a characteristic that varies from one subject to another,
and a scale is the device on which observations are recorded. Data are
the observations or measurements of a specific variable, taken from an
experiment, a survey, or an external source using an appropriate
measurement scale.
Statistics and Bio-statistics
Statistics is generally understood as the subject dealing with
numbers and data. More broadly, it covers the collection of data
from surveys or experiments, the summarization and management
of data, the presentation of results in a convincing format, and the
analysis of data to draw valid inferences from the findings.
Biostatistics is the application of statistical methods, techniques,
and tools to medical data: a collection of statistical procedures
particularly well suited to the analysis of healthcare-related data.
What is medical data?
Medical data is data related to patient care: numerical
information on patients' clinical characteristics, mortality
rates, survival rates, disease distribution, prevalence of
disease, efficacy of treatment, and similar information.
NATURE OF DATA
Data are the values obtained by observing (measuring,
counting, assessing, etc.) in an experiment or survey.
Data are either categorical or metric. Categorical data
are further divided into nominal and ordinal data, and
metric data into discrete and continuous (quantitative) data.
Nominal data
The data are divided into classes or categories that have no meaningful order,
for example blood type, sex, cause of disease, urban/rural, alive/dead,
infected/not infected, hair color, smoking status.
Ordinal data
The data are also divided into classes or categories, but the categories can be
put in a meaningful order.
For example, satisfaction level: very satisfied, satisfied, neutral, unsatisfied,
very unsatisfied. Pain: mild, moderate, severe. Socioeconomic status: poor,
middle, rich. Grade of breast cancer. Condition: better, same, worse.
Discrete data
Data obtained by counting, for example the number of patients in different
wards, the number of nurses, or the number of hospitals in different cities.
Continuous or quantitative data
Data obtained by measuring, for example height, weight, temperature,
uric acid, blood glucose, and serum level.
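Primary Scales of Measurement
Nominal
  Basic characteristics: Numbers identify and classify objects
  Common examples: Social Security numbers, numbering of football players
  Marketing examples: Brand numbers, store types
  Permissible descriptive statistics: Percentages, mode
  Permissible inferential statistics: Chi-square, binomial test
Ordinal
  Basic characteristics: Numbers indicate the relative positions of objects but not the magnitude of differences between them
  Common examples: Quality rankings, rankings of teams in a tournament
  Marketing examples: Preference rankings, market position, social class
  Permissible descriptive statistics: Percentile, median
  Permissible inferential statistics: Rank-order correlation, Friedman ANOVA
Interval
  Basic characteristics: Differences between objects can be compared
  Common examples: Temperature (Fahrenheit)
  Marketing examples: Attitudes, opinions, index
  Permissible descriptive statistics: Range, mean, standard deviation
  Permissible inferential statistics: Product-moment correlation
Ratio
  Basic characteristics: Zero point is fixed, ratios of scale values can be compared
  Common examples: Length, weight
  Marketing examples: Age, sales, income, costs
  Permissible descriptive statistics: Geometric mean, harmonic mean
  Permissible inferential statistics: Coefficient of variation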
Nominal Scale
• The numbers serve only as labels or tags for identifying and classifying
objects.
• When used for identification, there is a strict one-to-one correspondence
between the numbers and the objects.
• The numbers do not reflect the amount of the characteristic possessed by the
objects.
• The only permissible operation on the numbers in a nominal scale is counting.
• Examples: Social Security numbers and hockey players' numbers; in marketing
research, numbers assigned to respondents, brands, attributes, stores, and other objects.
ORDINAL SCALE
A ranking scale in which numbers are assigned to objects to
indicate the relative extent to which the objects possess
some characteristic. It can determine whether an object has
more or less of a characteristic than some other object, but
not how much more or less. Any series of numbers can be
assigned that preserves the ordered relationships between
the objects, so the scale conveys the relative position of
objects, not the magnitude of the differences between them.
In addition to the counting operation allowable for nominal
scale data, ordinal scales permit the use of statistics based
on percentiles, such as quartiles and the median. Ordinal
scales possess description and order, but not distance or origin.
INTERVAL SCALE
Numerically equal distances on the scale represent
equal values in the characteristic being measured.
The scale permits comparison of the differences between
objects: the difference between 1 and 2 is the same as
the difference between 2 and 3. The location of the zero
point is not fixed; both the zero point and the units of
measurement are arbitrary. Examples: the everyday
temperature scale and attitudinal data obtained on
rating scales. Interval scales do not possess the origin
characteristic (a true zero).
RATIO SCALE
The highest scale. It allows the researcher to identify objects,
rank-order them, and compare intervals or differences.
It is also meaningful to compute ratios of scale values.
It possesses all the properties of the nominal, ordinal, and
interval scales, and it has an absolute zero point.
Examples: height, weight, age, money. Sales, costs, market share,
and number of customers are variables measured on a ratio scale.
All statistical techniques can be applied to ratio data.
Data Analysis
After accurate and reliable data have been collected
from the source by an appropriate method, the next
step is to extract the pertinent and useful information
buried in the data for further manipulation and
interpretation. The process of performing calculations
and evaluations in order to extract relevant
information from data is called data analysis.
Data analysis may take several steps to reach a
conclusion. Simple data can be organized very
easily, while complex data require proper
processing. The word “processing” means
recasting and handling the data to make them
ready for analysis.
Steps in data analysis
• Questionnaire checking/Data preparation
• Coding
• Cleaning data
• Applying the most appropriate tools for analysis
QUESTIONNAIRE CHECKING
A questionnaire returned from the field may be
unacceptable for several reasons.
Parts of the questionnaire may be incomplete.
The pattern of responses may indicate that the respondent did not
understand or follow the instructions.
The responses show little variance.
One or more pages are missing.
The questionnaire is received after the pre-established cutoff date.
The questionnaire is answered by someone who does not qualify for
participation.
DATA PREPARATION
Preparation of data file
Raw data must be converted into a form usable for analysis
(with coding where needed), which means transferring the
information from the questionnaire to a computer database.
The analysis and results depend on the quality of the data.
Errors can arise in handling instruments, recording raw data,
transcribing, data entry, and assigning codes, values, and
value labels.
Data need to be cleaned so that they meet the conditions of the analysis.
CODING
Coding means assigning a code, usually a
number, to each possible response to each
question.
Data cleaning
• One of the first steps in analyzing data is to
“clean” it of any obvious data entry errors.
• Outliers? (really high or low numbers)
Example: Age = 110 (really 10 or 11?)
• Value entered that doesn’t exist for the variable?
Example: 2 entered where 1 = male, 0 = female
• Missing values?
Did the person not give an answer? Was the answer
accidentally not entered into the database?
• It may be possible to set defined limits when entering data.
This prevents entering a 2 when only 1, 0, or missing are
acceptable values.
• Univariate data analysis is a useful way to check the
quality of the data.
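The cleaning checks listed above can be sketched outside SPSS as well. The following Python fragment is only an illustration: the records, the plausible age range, and the function name check_record are assumptions, not part of the original slides.

```python
# Hypothetical records; None marks a value that was never entered.
records = [
    {"id": 1, "age": 34, "sex": 1},
    {"id": 2, "age": 110, "sex": 0},   # suspiciously high age
    {"id": 3, "age": 27, "sex": 2},    # 2 is not a defined code
    {"id": 4, "age": None, "sex": 1},  # missing age
]

VALID_SEX = {0, 1}      # 0 = female, 1 = male, as in the example above
AGE_LIMITS = (0, 100)   # assumed plausible range for this illustration


def check_record(rec):
    """Return a list of data-entry problems found in one record."""
    problems = []
    if rec["age"] is None:
        problems.append("age missing")
    elif not AGE_LIMITS[0] <= rec["age"] <= AGE_LIMITS[1]:
        problems.append("age out of range")
    if rec["sex"] is None:
        problems.append("sex missing")
    elif rec["sex"] not in VALID_SEX:
        problems.append("sex code undefined")
    return problems


# Keep only the records that have at least one problem.
flagged = {}
for rec in records:
    problems = check_record(rec)
    if problems:
        flagged[rec["id"]] = problems
```

Flagged records still need a human decision: a flagged age may be a typing error or a genuine outlier, so the check only points to where to look.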
SPSS
SPSS (Statistical Package for the Social Sciences) is a
statistical package for data analysis. It is very popular in
the social and medical sciences because it is easy to use.
Launching SPSS
Before starting this session, you should know how to run a program in the Windows operating system. Click the Start
button at the lower left of your screen and, among the programs listed, select SPSS 16.0 to launch the program.
When SPSS starts, an opening dialog box appears. Click Cancel if you want to enter data in a new file, or
click OK to open an existing file. A window known as the Data Editor will open in Variable View.
SPSS WINDOWS
There are a number of different types of windows in SPSS. The window in which you are currently working is called
the active window. Some of the frequently used windows are:
Data Editor Window: It displays the contents of the data file. This is the window that opens
automatically when you start an SPSS session. In this window, you can create new data files or modify existing ones.
When you open more than one data file, each data file has a separate Data Editor Window. The Data Editor Window
provides two views of the data:
Data View: It displays the data values. Each variable is a column. Each row is a case.
Variable View: It displays a table consisting of variable names and their attributes. You can modify the properties of
each variable or add new variables or delete existing variables in the Variable View Window.
Figure: Data View window and Variable View window
Viewer Window: It displays statistical results, tables, and charts. This window opens automatically the first time you
run a procedure that generates output
MORE ABOUT
WINDOWS
PULL-DOWN MENUS
Many tasks in SPSS are performed by selecting appropriate "pull-down" menus. Each window in SPSS has its own
menu bar with appropriate menu selections and toolbars. The Analyze and Graphs menus are available in all
windows. Here are some Data Editor Window menus and their uses:
File Menu: From the file menu you can open several different existing files or a database file such as
an excel file or read in a text file. You can also save any changes to the current file.
Edit Menu: From the Edit menu you can cut, copy, paste, insert variables, insert cases, or use Find in
the Data Editor window.
Data Menu: The data menu allows you to define variable properties, sort cases, merge files, split files,
select cases and use a variable to weight cases.
Transform Menu: The transform menu is where you will find the options to do some computations on
variables, to create new variables from existing ones or recode old variables.
Analyze Menu: The analyze menu is where all statistical analysis takes place, from descriptive statistics to
regression analysis to nonparametric tests.
Graphs Menu: The graph menu is where you can create high resolution plots and graphs to be edited in
the chart editor window or you can create interactive graphs.
Utilities Menu: The utilities menu is used to display information on the contents of SPSS data files or to
run scripts.
Add-Ons Menu: From the add-ons menu you can run other packages like conjoint, classification trees, or
Neural Networks. Also there are programmability extensions that allow you to integrate programs like R
and Python into SPSS. But you should keep in mind that if you want to run any of the add-ons listed here
you will have to purchase them separately.
Window: From the window menu you can change the active window. The window with a check mark is the
active one. In this case it is the data editor window.
Help: The help menu allows you to get help on topics in SPSS or to ask the statistics coach some basic
questions.
TOOLBARS
Each window in SPSS has its own toolbars that provide access to common tasks. Some windows have
more than one. When you put the mouse pointer on a tool, a brief description of what the tool
does appears. You can show, move, or hide a toolbar.
STATUS BARS
The status bar is at the bottom of each SPSS window and provides the following information:
Command Status: gives information about a procedure that is running.
Filter Status: Filter On shows when a subset of cases in the data is used for analysis.
Weight Status: Weight On indicates that a weight variable is being used in the analysis.
Split File Status: Split File On indicates that the file has been split into separate groups for analysis.
DIALOG BOXES
Many menu selections will open dialog boxes. In these dialog boxes, you select variables and options for analysis. The main
dialog box in any statistical procedure has the following parts:
Source variable list: A list of the variables in the working data file; only variable types allowed by the procedure are shown.
Target variable lists: One or more lists of variables needed for the analysis.
Command push buttons: Buttons that run the procedure or open a subdialog box for making
additional specifications. Some of the push buttons are:
OK : Click this button to run the procedure.
Paste: Click this button to generate command syntax from your selections. The command syntax is pasted into a syntax window,
where it can be modified for future analysis. This creates the code regularly known as SPSS programs.
Reset: Deselects any selections, and resets all specifications in the dialog box and any subdialog boxes to the default status.
Cancel: Cancels any change in the dialog box settings since the last time it was opened. This will close the dialog box.
Help: Provides help about the current dialog box.
Name
The name of each SPSS variable in a given file must be unique and must start with
a letter. Names may contain letters, numbers, and the underscore (_). Since
SPSS 12 a name may be up to 64 characters long; earlier versions allowed only 8.
Certain keywords are reserved and may not be used as variable names, e.g.,
"ALL", "AND", "BY", "TO", and "WITH". To change an existing
name, click in the cell containing the name, highlight the part you want to
change, and type in the replacement. To create a new variable name, click in the
first empty row under the name column and type a new (unique) variable name.
Notice that we can use "cat_dog" but not "cat-dog" and not "cat dog". The hyphen
gets interpreted as subtraction (cat minus dog) by SPSS, and the space confuses
SPSS as to how many variables are being named.
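The naming rules above can be checked mechanically. The Python sketch below is an illustration only: the function name is_valid_name is hypothetical, and the length limit and the reserved-word list are assumptions that should be adjusted to the SPSS version in use.

```python
import re

# Assumptions for this sketch: a 64-character limit and this list of
# reserved keywords. Check both against your SPSS version.
MAX_LENGTH = 64
RESERVED = {"ALL", "AND", "BY", "EQ", "GE", "GT", "LE",
            "LT", "NE", "NOT", "OR", "TO", "WITH"}

# A letter first, then letters, digits, or underscores only.
NAME_PATTERN = re.compile(r"[A-Za-z][A-Za-z0-9_]*")


def is_valid_name(name, existing=()):
    """Check a proposed variable name against the rules described above."""
    if len(name) > MAX_LENGTH:
        return False
    if NAME_PATTERN.fullmatch(name) is None:
        return False          # rejects "cat-dog", "cat dog", "2cats"
    if name.upper() in RESERVED:
        return False
    # Names must be unique within a file; the comparison ignores case.
    return name.lower() not in {other.lower() for other in existing}
```

For example, is_valid_name("cat_dog") is accepted, while "cat-dog" and "cat dog" are rejected by the pattern.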
TYPE
The two basic types of variables that you will use
are numeric and string. Numeric variables may only
have numbers assigned. String variables may
contain letters or numbers, but even if a string
variable happens to contain only numbers, numeric
operations on that variable (e.g., finding the mean,
variance, standard deviation, etc.) will not be
allowed. To change a variable type, click in
that cell on the grey box with ...
Decimals
The decimals property of a variable is the number of decimal places that SPSS will display. If more decimals have
been entered (or computed by SPSS), the additional information will be retained internally but not
displayed on screen. For whole numbers, you would reduce the number of decimals to zero. You can
change the number of decimal places by clicking in the decimals cell for the desired variable and
typing a new number, or you can use the arrow keys at the edge of the cell.
Label
The label of a variable is a string of text that identifies in more detail what the variable represents.
Unlike the name, the label may contain spaces and punctuation, and it may be up to 255
characters long. For instance, if there is a variable for each question on a questionnaire, you would
type the question as the variable label. To change or edit a variable label, simply click anywhere
within the cell.
Values
Although the variable label goes a long way to explaining what the variable represents, for categorical
data (discrete data of both nominal and ordinal levels of measurement), we often need to know which
numbers represent which categories. To indicate how these numbers are assigned, one can add labels to
specific values by clicking on the ... box in the values cell
Clicking here opens the Value Labels dialog box.
To assign the value 1.0 to cats and 2.0 to dogs, type 1.0 in the Value box and cats in the Value Label box, then click the Add button;
repeat for 2.0 and dogs. The following box will appear.
Clicking the grey box in the Type cell brings up the variable type menu:
If you select a numeric variable, you can then click in the width box or
the decimal box to change the defaults, which reserve 8 characters
for displaying numbers with 2 decimal places. For whole numbers, you
can drop the decimals down to 0.
If you select a string variable, you can tell SPSS how much "room" to
leave in memory for each value, indicating the number of characters
to be allowed for data entry in this string variable.
When you are satisfied with the definitions of each value, click on the OK button
The real beauty of value labels can be seen in the Data View by clicking on the "toe
tag" icon in the tool bar , which switches between the numeric values
and their labels
A view of different variables with their descriptions
Missing
When you click the button in the Missing cell, SPSS displays the Missing Values dialog box.
We sometimes want to signal to SPSS that data should be treated as missing, even though there is some
other numerical code recorded instead of the data actually being missing (in which case SPSS displays a
single period -- this is also called SYSTEM MISSING data). In this example, after clicking on the ... button in
the Missing cell, I declared "9", "99", and "999" all to be treated by SPSS as missing (i.e., these values will be
ignored)
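The effect of declaring codes as missing can be illustrated with a small Python sketch. The response values and the helper name valid_values are hypothetical; only the codes 9, 99, and 999 come from the example above.

```python
from statistics import mean

# Codes declared as user-missing in the example above.
USER_MISSING = {9, 99, 999}


def valid_values(values):
    """Drop system-missing entries (None) and user-missing codes."""
    return [v for v in values if v is not None and v not in USER_MISSING]


# Hypothetical answers on a 1-5 scale; 99 and 999 stand for "no answer".
responses = [3, 4, 99, 5, None, 2, 999, 4]

# Treating the codes as real numbers distorts the mean badly.
naive_mean = mean(v for v in responses if v is not None)

# Excluding them gives the mean of the genuine answers only.
clean_mean = mean(valid_values(responses))
```

The comparison shows why the codes must be declared: left undeclared, 99 and 999 would be averaged in as if they were real answers.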
Columns
The columns property tells SPSS how wide the column should be for each variable. Don't confuse this one
with width, which indicates how many digits of the number will be displayed. The column size indicates how
much space is allocated rather than the degree to which it is filled.
Align
The alignment property indicates whether the information in the Data View should be left-justified, right-
justified, or centered
Measure
The Measure property indicates the level of measurement. Since SPSS does not differentiate between
interval and ratio levels of measurement, both of these quantitative variable types are lumped together
as "scale". Nominal and ordinal levels of measurement, however, are differentiated
ENTERING
DATA SET
Into SPSS
Suppose we have a data set with several variables
that we need to enter into SPSS. The variables and
the data are listed below; the file is named “bp”.
Example
Data Set:
Professor Christopher conducted a study on subjects; the variables are described below, followed by the data.
Variable   Description
Sjcode     Subject code
Sex        Subject sex (0 = female, 1 = male)
Age        Subject age
Height     Height, in inches
Weight     Weight, in pounds
Race       Subject race (1 = Amer, 2 = Asian, 3 = black, 4 = Hispanic, 5 = white, 9 = none of above)
Med        Taking prescription medication (0 = no, 1 = yes)
Smoke      Does subject smoke? (0 = nonsmoker, 1 = smoker)
SBPCP      Systolic blood pressure with cold pressor
DBPCP      Diastolic blood pressure with cold pressor
HRCP       Heart rate with cold pressor
SBPMA      Systolic blood pressure while doing mental arithmetic
DBPMA      Diastolic blood pressure while doing mental arithmetic
HRMA       Heart rate while doing mental arithmetic
SBPREST    Systolic blood pressure at rest
DBPREST    Diastolic blood pressure at rest
PH         Parental hypertension (0 = no, 1 = yes)
MEDPH      Parent(s) on EH meds (0 = no, 1 = yes)
SJcode sex age height weight race meds smoking sbpcp dbpcp hrcp sbpma dbpma hrma sbrest dbrest Ph Medph
3 Female 19 65 155 White No Med Non smoker 126 65 88 135.667 81.333 76.667 116.25 60.75 PH+ Parent EH Yes
4 Female 18 63 132 White No Med Non smoker 125 80 96 130.667 82.667 92.667 115.75 76.375 PH+ Parent EH Yes
5 Female 19 66 138 White No Med Non smoker 149 90 91 135.333 90.333 64.333 120.5 65.375 PH+ Parent EH Yes
9 Female 18 66 130 White No Med Non smoker 113 89 88 128.333 82.333 85.667 113.625 72.125 PH- Parents EH No
10 Female 18 66 175 White No Med Non smoker 112 70 82 121.667 75.333 85 110 68.75 PH- Parents EH No
11 Female 18 62 113 White No Med Non smoker 125 70 73 133.333 82.333 74.333 119.75 73.5 PH- Parents EH No
13 Male 20 73 159 White No Med Smoker 162 62 58 145.667 68 74 130.75 57.125 PH+ Parent EH Yes
15 Male 18 70 155 White No Med Non smoker 123 73 53 137.333 78.667 53.667 126.375 65.625 PH+ Parent EH Yes
16 Male 19 69.5 185 White No Med Non smoker 139 66 48 148.667 81.667 78.667 127.625 67.375 PH+ Parent EH Yes
19 Male 18 70 164 White No Med Non smoker 133 65 85 134.333 58.667 66.667 121.75 56.5 PH- Parents EH No
20 Male 19 71 170 White No Med Non smoker 152 75 71 150.333 73 82.333 129.875 60 PH- Parents EH No
21 Male 18 76 179 Hispanic No Med Non smoker 128 70 63 121 71.333 71 121 68.5 PH- Parents EH No
23 Female 19 68.5 160 White No Med Non smoker 119 51 68 117 62.333 73.333 107.875 51.375 PH+ Parent EH Yes
24 Female 20 66 132 White No Med Non smoker 120 67 80 128.333 72.667 81 108 63.75 PH+ Parent EH Yes
25 Female 19 67.5 150 Black No Med Non smoker 129 95 70 121.333 71 77 110.25 62.875 PH- Parents EH No
26 Female 20 62 105 White Yes Med Non smoker 124 90 93 124 92.333 87 104.375 76.375 PH+ Parent EH Yes
29 Female 19 62 120 White No Med Non smoker 130 75 103 132.667 76 88.667 117.625 67.875 PH- Parents EH No
30 Female 18 67.5 143 White No Med Non smoker 130 95 93 120.667 83.667 98.333 111 77.375 PH- Parents EH No
32 Female 18 63.5 130 White No Med Non smoker 109 73 71 104 61 65.667 105.125 53.875 PH- Parents EH No
35 Male 20 66 127 White No Med Non smoker 129 68 107 124.333 63.667 93.333 117.75 62.75 PH- Parents EH No
Entering data into data editor
In this lesson our only goal is to learn how to enter, save, and edit data (the data sheet given above). The first step in
entering the data into the Data Editor is to define all the variables. Creating a variable requires us to name it,
specify the type of data (nominal, ordinal, scale), and assign labels to the variable and, if needed, to its data values.
• Move the cursor to the bottom of the Data Editor and click the tab named Variable View; a different grid appears.
• Move the cursor into the first empty cell in row 1 (under Name), type sjcode, then press Enter.
• When the cursor moves to the Type column, a small grey button marked with three dots
appears. Click it to see the Variable Type dialog box; Numeric is the default variable type, so click OK.
Note that the Measure column (far right column) should be set to Scale, because you chose Numeric as the variable
type. In SPSS, each variable can carry a descriptive label to help identify its meaning. To add a label, proceed
as follows:
•Move the cursor into the label column and type Subject Code.
This completes the definition of the first variable.
• Now let's create a variable to represent sex. Move to the first column of row 2 and name the variable
sex.
• Sex is a categorical (qualitative) variable, and we are going to represent it numerically for
data analysis purposes, because most SPSS procedures work on numeric codes. Since Numeric is the
default in the Type column, we skip it, go to Width and set the width we require, and in the
Decimals column reduce the value from 2 to 0.
• Label this variable Subject sex.
• Now we can assign text labels to our coded values (as discussed previously). In the Values column
click the grey box with three dots. A box will open as below.
Type 0 in the Value box and Female in the Value Label box,
then click Add.
Now type 1 in Value and Male in Label, click Add,
and then click OK. In a similar way we add all the variables; the Variable View window will look as follows.
Now switch to Data View by clicking the appropriate tab at the lower left of the screen.
• Move the cursor to the first cell below sjcode, type 3, and press Enter.
In the next cell type 4. When you have completed the subject codes, move to the top cell
under sex, type 0 for female and 1 for male, and go on. When you are done,
the Data Editor should look as follows.
On clicking the Value Labels button you will see the screen as below.
Saving the data file
It is wise to save all your work in a disk file. To save a file, click on the File menu, choose Save As..., type BP next to
File name, then click Save.
Editing the data file/value
To edit any value, open the data file, click the Edit menu, and
select the case or variable that needs editing.
Quitting SPSS
When you have completed your work, it is important to exit the program properly. Go
to the File menu, then click on Exit. Generally you will see a message asking if you wish to
save changes. Since we saved everything earlier, click No.
File management
Here we discuss issues such as transforming,
selecting, splitting, computing new variables,
recoding data, merging files, sorting,
transposing, and weighting cases.
Sorting data
This tool allows you to rearrange the data.
Open the file, then choose Data | Sort Cases,
select the variable, then click OK.
Replacing missing values
Missing values in a variable can be replaced by
different methods. If the variable is categorical,
the researcher replaces the value on the basis of
personal experience; if the variable is continuous,
SPSS can help through the Replace Missing Values
command. Open the file and look for missing
values using the Sort Cases command, then choose
Transform | Replace Missing Values and select the
desired option.
Creating Variables
Sometimes a new variable is needed that is based
on an existing variable or set of variables. The
procedure is as follows: choose Transform | Compute
Variable..., enter the name of the new variable in
Target Variable, and write the desired operation in
Numeric Expression, such as a square or a log.
Activity
Open the file “student”, convert weight into kg, then
find the BMI of the students. Use 1 kg = 2.20462 lb and
1 m = 39.3701 in, and compute BMI = weight (kg) / height (m)^2.
Compare this BMI with
BMI = 703 x weight (lb) / height (in)^2
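The two BMI formulas in this activity can be compared with a short Python sketch. The function names are hypothetical; the height and weight used (65 in, 155 lb) are those of the first subject in the “bp” data listed earlier.

```python
# Conversion factors given in the activity.
LB_PER_KG = 2.20462
IN_PER_M = 39.3701


def bmi_metric(weight_lb, height_in):
    """Convert to kg and m first, then apply BMI = kg / m^2."""
    weight_kg = weight_lb / LB_PER_KG
    height_m = height_in / IN_PER_M
    return weight_kg / height_m ** 2


def bmi_imperial(weight_lb, height_in):
    """Shortcut formula: BMI = 703 * lb / in^2."""
    return 703 * weight_lb / height_in ** 2


# First subject of the "bp" data: height 65 in, weight 155 lb.
metric = bmi_metric(155, 65)
imperial = bmi_imperial(155, 65)
difference = abs(metric - imperial)
```

The two results agree closely because 703 is a rounded form of 39.3701^2 / 2.20462, so the small difference is only rounding in the constant.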
Re-coding
To change existing codes (for example, to recode 1 as 5)
or to collapse numerical data into groups, use the
Recode tool. Open the data file. From the menus
choose: Transform | Recode | Into
Different Variables...
The Recode into Different Variables
dialog box appears.
Select the variable you want to recode. For this example select AAA, and click the
right arrow button (►) to move the variable into the Input Variable > Output
Variable box. The following appears in this box:
AAA >?
In the Output Variable group, enter a name for the new variable (e.g. AA1) in the Name
box, optionally give it a label such as Stillbirth Rate Category, and
click Change.
At this point the dialog box looks as shown below:
Click the Old and New Values... button; the following dialog box appears, in which you specify how to recode
the values.
For the lowest open class, select the 5th choice in the Old Value group and put 24 in the Lowest through box. In the
Value box under the New Value group, enter 1. Click Add.
For a closed class interval such as 25-29, select the 4th choice in the Old Value group, put 25 in the first box
and 29 in the through box, enter 2 in the Value box under New Value, then click Add. Repeat this
process, selecting the 4th choice each time, until you have entered 5 as the new value.
For the highest open class, select the 6th choice in the Old Value group and put 45
in the through highest box. In the Value box under the New Value group, enter 6, then click Add.
The final dialog box looks as shown below.
Click Continue and then OK. The XYZ - SPSS Data Editor, now containing the two variables AAA and AA1, looks as shown below,
one in Variable View and the other in Data View.
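The recoding scheme described above can be written out as a small Python sketch. The function name recode is hypothetical, and the intermediate classes 30-34, 35-39, and 40-44 are an assumption inferred from the first closed class (25-29) and the six codes; the slides only say "repeat this process".

```python
# Closed class intervals and their new codes. Only 25-29 -> 2 is stated in
# the text; the other three bands are assumed to continue in steps of 5.
CLOSED_BANDS = [(25, 29, 2), (30, 34, 3), (35, 39, 4), (40, 44, 5)]


def recode(value):
    """Map an old value to its category code (1 to 6)."""
    if value is None:
        return None              # missing stays missing
    if value <= 24:
        return 1                 # "Lowest through 24"
    if value >= 45:
        return 6                 # "45 through highest"
    for low, high, code in CLOSED_BANDS:
        if low <= value <= high:
            return code
    # A value that matches no rule (e.g. 24.5) is left without a code.
    return None


old_values = [10, 24, 25, 29, 30, 44, 45, 80]
new_values = [recode(v) for v in old_values]
```

Values that fall between two bands, such as 24.5, match no rule and stay uncoded, which is worth checking whenever the original variable is not a whole number.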
Specify Value Labels
Make the Data Editor the active window.
If the data view is displayed, double-click the variable name at the top of the column in
the data view or click the Variable View tab. Click the button in the values cell for the
variable that you want to define. For each value, enter the value and a label (the one
as seen below). Click Add to enter the value label, at last click OK.
Activity
For the BMI computed in the activity above, group the values as
Underweight: < 18.5
Normal: 18.5 – 22.9
Overweight: > 22.9
Also produce output for the groups.
Select cases
This tool is used to analyze the data for a sub-group
or a specific group, for example the mean of respondents
whose weight is above 85 kg.
Open the file, choose Data | Select Cases, click If,
and write the selection condition; for example, to
select males in the BP file write sex = 1.
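Selecting cases amounts to filtering rows by a condition before analysis. The Python sketch below illustrates the idea with a few subjects taken from the “bp” data listed earlier (codes 0 = female, 1 = male); the helper name select_cases is hypothetical.

```python
from statistics import mean

# A few subjects from the "bp" data: subject code, sex code, weight in pounds.
subjects = [
    {"sjcode": 3, "sex": 0, "weight": 155},
    {"sjcode": 4, "sex": 0, "weight": 132},
    {"sjcode": 13, "sex": 1, "weight": 159},
    {"sjcode": 15, "sex": 1, "weight": 155},
    {"sjcode": 16, "sex": 1, "weight": 185},
]


def select_cases(cases, condition):
    """Keep only the cases for which the condition is true."""
    return [case for case in cases if condition(case)]


# Equivalent of the condition "sex = 1": select the male subjects.
males = select_cases(subjects, lambda case: case["sex"] == 1)
male_mean_weight = mean(case["weight"] for case in males)

# Conditions can be combined, e.g. females heavier than 150 lb.
heavy_females = select_cases(
    subjects, lambda case: case["sex"] == 0 and case["weight"] > 150
)
```

Any statistic computed afterwards, such as the mean weight here, uses the selected cases only.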
Activity
Select the male cases in the “bp” file; also select the
females whose age is more than 50 years.
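Merging file
Two files may be merged either by variables or
by cases. Suppose we have 1000 respondents, each
with six variables, and two data entry operators
share the task. They can divide the work in two
ways: (1) split the cases, so that each operator
enters all variables for some respondents, or (2)
split the variables, so that each operator enters
some variables for all respondents. The resulting
files are then merged by adding cases or by adding
variables, respectively.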
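The two ways of merging can be illustrated with a small Python sketch. All records and the key name id are hypothetical; the point is only the difference between stacking cases and matching variables on a common identifier.

```python
# (1) Operators split the CASES: each file has all variables for some people.
operator_a = [{"id": 1, "age": 19, "weight": 155},
              {"id": 2, "age": 18, "weight": 132}]
operator_b = [{"id": 3, "age": 19, "weight": 138},
              {"id": 4, "age": 18, "weight": 130}]

# Adding cases simply stacks the rows of one file under the other.
merged_cases = operator_a + operator_b

# (2) Operators split the VARIABLES: each file has some variables for everyone.
part_ages = [{"id": 1, "age": 19}, {"id": 2, "age": 18}]
part_weights = [{"id": 2, "weight": 132}, {"id": 1, "weight": 155}]

# Adding variables matches rows on the key, whatever their order in each file.
by_id = {row["id"]: dict(row) for row in part_ages}
for row in part_weights:
    by_id.setdefault(row["id"], {"id": row["id"]}).update(row)
merged_variables = [by_id[key] for key in sorted(by_id)]
```

Matching on a key is what makes the second kind of merge safe: the rows of the two files do not have to be in the same order.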
Split file
A file can be split into groups according to the
categories of a variable. Choose Data | Split File,
select the grouping variable, and then run the
desired analysis.
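Splitting a file means running the same analysis separately within each group. A minimal Python sketch of the idea follows, using weights (in pounds) of a few subjects from the “bp” data listed earlier; the variable names are hypothetical.

```python
from collections import defaultdict
from statistics import mean

# (sex, weight in pounds) for a few subjects from the "bp" data.
cases = [
    ("Female", 155), ("Female", 132),
    ("Male", 159), ("Male", 155), ("Male", 185),
]

# Split: collect the weights of each group separately.
groups = defaultdict(list)
for sex, weight in cases:
    groups[sex].append(weight)

# Analyze: compute the same statistic within every group.
group_means = {sex: mean(weights) for sex, weights in groups.items()}
group_sizes = {sex: len(weights) for sex, weights in groups.items()}
```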
Data analysis
BASIC STRATEGY
The following strategy is adopted to analyze the data
• Description , counting, Proportion
•Prediction, relationship, Association
•Comparing , estimation (95% confidence interval)
Data analysis may be descriptive or inferential.
Descriptive analysis covers the mean, median, mode,
SD, regression, and correlation; inferential analysis
covers confidence intervals, hypothesis testing,
p-values, and ANOVA.
UNI-VARIATE DESCRIPTIVE ANALYSIS
Graphical Method
For nominal & ordinal data we use Bar or pie chart
For continuous data we use histogram
Numerical method
For nominal & ordinal data we use Frequency/proportions
For continuous data we use Mean , Standard deviation
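The numerical summaries named above can be sketched in Python as follows. The values are the smoking status and weight (in pounds) of four subjects, chosen for illustration in the style of the “bp” data; the variable names are hypothetical.

```python
from collections import Counter
from statistics import mean, stdev

# Categorical variable: summarize with frequencies and proportions.
smoking = ["Non smoker", "Non smoker", "Smoker", "Non smoker"]
frequencies = Counter(smoking)
proportions = {status: count / len(smoking)
               for status, count in frequencies.items()}

# Continuous variable: summarize with the mean and standard deviation.
weights = [155, 132, 138, 130]
average = mean(weights)
spread = stdev(weights)   # sample standard deviation (divides by n - 1)
```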
Summary Guide
Scale data
  Displaying data: Histogram, Box-plot
  Summarizing data: Mean, Median, SD
Nominal data
  Displaying data: Bar chart, Pie chart
  Summarizing data: Frequency table, Percentages, Proportion
Ordinal data
  Displaying data: Bar chart, Pie chart
  Summarizing data: Frequency table, Percentages, Proportion
GRAPHS FOR
CATEGORICAL DATA
MAKING BAR/PIE CHART
Open the file, then choose Graphs | Legacy Dialogs |
Bar (or Pie), select the variable, then click OK.
DATA SUMMARY
Open the file, then choose Analyze | Descriptive
Statistics | Frequencies, select the variable, and
click OK; the output window will appear.
GRAPH FOR CONTINUOUS
DATA
MAKING HISTOGRAM
Open the file, then choose Graphs | Legacy Dialogs |
Histogram, select the variable, then click OK.
DATA SUMMARY
Open the file, then choose Analyze | Descriptive
Statistics | Descriptives, select the variable, and
click OK; the output window will appear.
FOR ALL DESCRIPTIVE STATISTICS
AND 95% CONFIDENCE INTERVAL
Open the file, then choose Analyze | Descriptive
Statistics | Explore, select the variable, and
click OK; the output window will appear.
Summary Guide for appropriate analysis of
two variables
Categorical-categorical
  Graphical display: Multiple bar chart
  Relationship: Contingency table
Categorical-scale
  Graphical display: Box-plot
  Relationship: Descriptive statistics for each group
Scale-scale
  Graphical display: Scatter plot
  Relationship: Correlation
GRAPH FOR CATEGORICAL DATA
MULTIPLE BAR CHART
Open the file, then choose Graphs | Legacy Dialogs | Bar,
select one variable for the category axis and one for
the clusters, then click OK.
CONTINGENCY TABLE
Open the file, then choose Analyze | Descriptive Statistics |
Crosstabs and select the variables, one for the rows and one
for the columns. For cell proportions click Cells and tick Total;
for the chi-square test click Statistics and tick Chi-square.
Click OK; the output window will appear.
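To show what the chi-square option computes from a contingency table, here is a minimal Python sketch of the Pearson chi-square statistic (without continuity correction). The 2x2 counts are hypothetical and the function names are assumptions for illustration.

```python
def chi_square(table):
    """Pearson chi-square statistic for a table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if rows and columns were independent.
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed - expected) ** 2 / expected
    return statistic


def total_percentages(table):
    """Each cell as a percentage of the grand total."""
    n = sum(sum(row) for row in table)
    return [[100 * cell / n for cell in row] for row in table]


# Hypothetical counts: rows are two groups, columns are two outcomes.
observed = [[10, 20],
            [20, 10]]
statistic = chi_square(observed)
percentages = total_percentages(observed)
```

The statistic is compared with a chi-square distribution with (rows - 1) x (columns - 1) degrees of freedom to obtain the p-value that SPSS reports.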
GRAPH FOR CONTINUOUS
DATA
SCATTER PLOT
Open the file, then choose Graphs | Legacy Dialogs |
Scatter/Dot, assign the variables to the x-axis and
the y-axis, then click OK.
CORRELATION COEFFICIENT
Open the file, then choose Analyze | Correlate |
Bivariate, select the variables, and click OK;
the output window will appear.
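The correlation coefficient reported in the output is, by default, Pearson's r. A minimal Python sketch of the calculation follows; the function name pearson_r is hypothetical, and the heights and weights are those of five subjects from the “bp” data listed earlier.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Height (in) and weight (lb) of subjects 3, 4, 5, 13, and 15.
heights = [65, 63, 66, 73, 70]
weights = [155, 132, 138, 159, 155]
r = pearson_r(heights, weights)
```

The value always lies between -1 and +1; values near 0 indicate little linear relationship.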
SUMMARY ONE CATEGORICAL
ONE CONTINUOUS VARIABLE
When we have one categorical and one
continuous variable , then for descriptive
analysis we will use Explore command and for
graph we use Box-plot , suppose we have
gender and weight of respondents
DESCRIPTIVE STATISTICS
Open the file, go to Analyze, then select Descriptive
Statistics > Explore. A window will open; move the
continuous variable to the Dependent List
and the categorical variable to the Factor List, then click OK.
BOX PLOT
Open the file, click Graphs > Legacy Dialogs > Boxplot,
choose Simple, and click Define. Put the
continuous variable in Variable and the categorical variable (sex,
SES) in Category Axis, then click OK.
REGRESSION ANALYSIS
Regression predicts one variable on the basis of another variable or a
set of variables (be sure all variables are continuous), for
example predicting BP when a person's age is 55
years. The mathematical equation is
$Y(\mathrm{BP}) = a + b\,X(\mathrm{Age})$
where a and b are the coefficients of the equation.
CONT…
Open the file, click Analyze > Regression > Linear,
then put the dependent variable and the independent variable in
their respective boxes and click OK.
REGRESSION LINE
This is the regression line using the results of the previous
slide.
$Y(\mathrm{BP}) = 129.61 + 0.075\,(\mathrm{Age})$
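As a worked illustration, the fitted line can be used to predict BP at age 55, which gives about 133.7. The sketch below uses the slide's coefficients 129.61 and 0.075. The least-squares helper `fit_line` and its toy data are assumptions added to show how a and b are obtained; they do not reproduce SPSS's full regression output.

```python
def fit_line(x, y):
    """Ordinary least-squares intercept a and slope b for y = a + b*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

def predict_bp(age, a=129.61, b=0.075):
    """Predicted BP from the regression line on the slide: a + b * age."""
    return a + b * age

# Prediction for a 55-year-old using the slide's coefficients.
print(round(predict_bp(55), 3))

# Toy data lying exactly on y = 2 + 3x, to show the fitting step.
a_hat, b_hat = fit_line([0, 1, 2, 3], [2, 5, 8, 11])
print(a_hat, b_hat)
```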
MEASURE OF RISK
When we have an exposure and an outcome (a 2x2 table), the
odds ratio (OR) is obtained from the Crosstabs
command: open Crosstabs, click
Statistics, tick Risk, and click Continue.
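The Risk option's main quantities can be sketched from the four cell counts of a 2x2 table. The counts below are hypothetical, and the layout (exposure in rows, outcome in columns) is an assumption of this example.

```python
def odds_ratio(a, b, c, d):
    """Odds ratio for a 2x2 table:
                 outcome yes   outcome no
    exposed           a             b
    unexposed         c             d
    """
    return (a * d) / (b * c)

def relative_risk(a, b, c, d):
    """Risk of the outcome in the exposed divided by risk in the unexposed."""
    return (a / (a + b)) / (c / (c + d))

# Hypothetical counts: 20 of 100 exposed and 10 of 100 unexposed have the outcome.
or_value = odds_ratio(20, 80, 10, 90)
rr_value = relative_risk(20, 80, 10, 90)
print(f"OR = {or_value:.2f}, RR = {rr_value:.2f}")
```

An OR of 1 means the odds of the outcome are the same in both exposure groups.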
Activity
Open file “states”. For the variable “bac”, what percentage of states
use the 0.8 standard?
Open file “Aids”. Determine the shape of the distribution of AIDS cases
reported in 1994.
Open file “students”. Make side-by-side histograms of height to
compare males and females. Make a cross-tab (contingency
table) of gender and eye color, and compare the proportion with blue
eyes among males and females. Make a scatter plot of height against
weight and interpret the graph. Compute descriptive statistics for the
variable amount paid for a haircut.
Cont…
Open file “college” and focus on two variables, in-
state tuition and out-of-state tuition. Show which
varies more (calculate the coefficient of variation).
Construct a box plot of math score for public and
private schools and comment on the plot. On
average, in which subject (mathsat, verbsat) is the
score larger?
Cont…
Open file “GSS94” and answer the following questions.
Do females tend to watch more or less TV per day than males?
(Calculate descriptive statistics.)
Respondents were asked whether they are afraid to walk alone in their
neighborhood. Compare the mean age of those who said “yes” and “no”.
Make a contingency table for sex and race.
Make a cross-tab of the variables marital status and marnomar, and
find the probability that a person is married.
Cont…
Open file “bodyfat”. Calculate the correlation
between neck and chest circumference, and fit a
regression line of chest circumference on neck
circumference.
Investigate the normality of the variables “Fatperc”,
“age”, “weight”, and “neck” using an
appropriate test and graph.
Cont…
Open file “sleep”. Using appropriate descriptive and graphical
techniques, how would you establish the relationship between the amount
of sleep a species requires and the mean weight of the species?
Interpret the results. Make a frequency distribution of the variable
amount of sleep using appropriate intervals. Construct 95%
confidence intervals for total sleep and life span.
Open file “colleges”. Construct a 95% confidence interval for mean
room and board charges. What does it mean?
TESTING OF HYPOTHESIS
Here we will discuss
• One-sample t-test
• Two-sample t-test (independent groups, dependent
groups)
• One-way ANOVA (F-test)
ONE SAMPLE T-TEST
Open data file “bodyfat” and test the hypothesis that the
population mean body fat is 23 against the alternative that it is not
equal to 23.
Click Analyze > Compare Means > One-Sample T
Test, select the variable body fat, and enter 23 as the test
value. The results are as follows.
INTERPRETATION OF RESULTS
Here the sample mean is 19.15, the t-statistic is -7.30, and the
reported p-value is 0.000 (that is, p < 0.001). This suggests rejecting
the null hypothesis, and we conclude that the population mean body fat
is not 23.
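The t-statistic in this output is the distance of the sample mean from the test value, measured in standard errors. A minimal sketch follows, using a small hypothetical sample (not the “bodyfat” file, so it will not reproduce -7.30).

```python
import math
import statistics

def one_sample_t(values, mu0):
    """Return (t statistic, degrees of freedom) for H0: population mean = mu0."""
    n = len(values)
    mean = statistics.mean(values)
    se = statistics.stdev(values) / math.sqrt(n)  # standard error of the mean
    return (mean - mu0) / se, n - 1

# Hypothetical body fat percentages for 8 men.
body_fat = [18.2, 21.5, 16.9, 24.1, 19.8, 17.4, 20.6, 15.3]

t_stat, df = one_sample_t(body_fat, mu0=23)
print(f"t = {t_stat:.2f}, df = {df}")
```

A negative t means the sample mean lies below the test value, as in the slide's result.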
TWO (INDEPENDENT) SAMPLE T-TEST
Sometimes we focus on comparing the means of a variable of
interest in two different samples, for example whether the
height of boys differs from the height of girls. Open file
“students” and compare the height of boys and girls.
Open the file, click Analyze > Compare Means >
Independent-Samples T Test. A window will
open; select height as the test variable and gender
as the grouping variable. Click Define Groups
and enter the values for male and female,
then click OK.
(Output table: the t value and the p-value are marked.)
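The equal-variances row of the independent-samples output is based on a pooled variance. The sketch below shows that calculation on hypothetical heights. It covers only the pooled form, not the unequal-variances row.

```python
import math
import statistics

def pooled_t(group1, group2):
    """Equal-variance two-sample t statistic and its degrees of freedom."""
    n1, n2 = len(group1), len(group2)
    m1, m2 = statistics.mean(group1), statistics.mean(group2)
    v1, v2 = statistics.variance(group1), statistics.variance(group2)
    pooled_var = ((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2)
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (m1 - m2) / se, n1 + n2 - 2

# Hypothetical heights in cm.
boys = [172, 168, 180, 175, 169, 177]
girls = [160, 165, 158, 170, 162, 164]

t_stat, df = pooled_t(boys, girls)
print(f"t = {t_stat:.2f}, df = {df}")
```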
PAIRED T-TEST (DEPENDENT SAMPLES)
Sometimes observations are taken before and after some
treatment on the same respondents. For example, BP is
measured before and after medication. This type of sample is
called a paired sample. Open file “swimmer2”; we wish
to see whether the students' freestyle times differ between the two time points.
Open the file, click Analyze > Compare Means >
Paired-Samples T Test. A window will open;
select the two 100 meter freestyle variables and click OK.
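A paired t-test is a one-sample t-test on the within-person differences. A short sketch on hypothetical swim times (seconds) at two time points:

```python
import math
import statistics

def paired_t(first, second):
    """Paired t statistic and degrees of freedom, computed on first - second."""
    diffs = [a - b for a, b in zip(first, second)]
    n = len(diffs)
    se = statistics.stdev(diffs) / math.sqrt(n)
    return statistics.mean(diffs) / se, n - 1

# Hypothetical 100 meter freestyle times for the same five swimmers.
time1 = [68.4, 72.1, 65.9, 70.3, 74.8]
time2 = [67.1, 70.5, 66.2, 68.9, 72.4]

t_stat, df = paired_t(time1, time2)
print(f"t = {t_stat:.2f}, df = {df}")
```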
ONE-WAY ANOVA
For more than two independent groups we use one-way
ANOVA. Suppose we are interested in whether an off-
campus job affects students' GPA. Open file “student”
and test GPA with work category as the grouping variable. The
null hypothesis is that mean GPA is the same for all work
categories. If the null hypothesis is rejected, we run a post hoc
test (LSD).
PROCEDURE
Open the file, click Analyze > Compare Means >
One-Way ANOVA. The Dependent List variable is
GPA and the Factor variable is workcat. Click Options and,
under Statistics, select Descriptive. Then click Post Hoc; a
window will open. Select LSD and click OK.
(Output table: the post hoc test results are marked.)
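The F statistic in the ANOVA table is the ratio of the between-group mean square to the within-group mean square. A minimal sketch on hypothetical GPA values for three work categories (the post hoc LSD comparisons are not covered here):

```python
def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of groups of observations."""
    all_values = [v for g in groups for v in g]
    grand_mean = sum(all_values) / len(all_values)
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Hypothetical GPAs by work category (no job, part-time, full-time).
gpa = [
    [3.4, 3.1, 3.6, 3.3],
    [3.0, 3.2, 2.9, 3.1],
    [2.7, 2.9, 2.6, 3.0],
]

f_stat, df_b, df_w = one_way_anova(gpa)
print(f"F = {f_stat:.2f}, df = ({df_b}, {df_w})")
```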
Activity
Open file “GSS94” and test the null hypothesis that
adults in the United States watch an average of three hours
of TV daily. Test the hypothesis that males spend 3 hours a day
watching TV (use the Select Cases command).
Is there a statistically significant difference in the amount of
time men and women spend watching TV? Is there a
statistically significant difference in the amount of time
married and divorced respondents spend watching TV?
Cont…
Open file “students” and test the hypothesis: do commuters and residents
earn significantly different mean grades? Do car owners have
significantly fewer accidents on average than non-owners? Interpret
your results using the 95% confidence interval and the p-value.
Open file “BP” and test the hypothesis: do subjects with a parental history
of hypertension have significantly higher resting systolic and
diastolic BP than subjects with no parental history?
Open file “GSS94”. Does the amount of television viewing vary
by the respondent's race? (ANOVA)
Open file “BP”. Is systolic BP (sbpma) related to a person's sex,
parental hypertension (ph), or some combination of these factors?
Open file “group”. Is a subject's perception of co-workers related to
gender, group size, or a combination of these two factors?
Open file “bodyfat”. Consider a man whose chest measurement is
95 cm, whose abdomen is 85 cm, and whose weight is 158 pounds. Use
multiple regression to estimate this man's body fat percentage.
Also write the regression equation and interpret
the results.
Develop a multiple regression line to estimate body fat
percentage on the basis of the following variables: age, weight,
abdomen circumference, chest circumference, thigh circumference, and
wrist circumference, using a matrix plot, correlation matrix, or p-values.
Open file “salem”. Test whether the variables proparri and accuser are
independent (use the chi-square test).
Open file “students”. Test whether smokers tend to drink more beer than
nonsmokers (select a parametric or non-parametric test: the t-test or the
Mann-Whitney U test).
ADVANCED DATA ANALYSIS
The following are advanced tools:
• Logistic regression, survival analysis (KM curve)
• Factor analysis, reliability
• Repeated-measures ANOVA
• Time series analysis (forecasting)
Medical Data Analysis
  Univariate
    Categorical Data
      Descriptive Analysis: graphs (bar, pie charts); frequency (f), percentage (%), proportion
      Inferential Analysis: chi-square (χ2) test; Z-test
    Continuous Data
      Descriptive Analysis: histogram; mean ± S.D.
      Inferential Analysis: Z-test (n>30); t-test (n<30)
  Multivariate
    Categorical Data
      Descriptive Analysis: multiple bar charts; contingency table
      Inferential Analysis: association (χ2, OR, RR); prediction (logistic regression)
    Continuous Data
      Descriptive Analysis: scatter plot, box plot; relationship, regression, correlation
      Inferential Analysis: t-test; ANOVA, multiple regression

Data analysis using spss

  • 1.
  • 2. DATA ANALYSIS USING SPSS Muhammad Ibrahim Associate Professor of Statistics Govt. MAO College Lahore 0300-4668681 Ibrahim.ap12@gmail.com
  • 3. LEARNING OBJECTIVES 1.  Understand basic concepts of biostatistics and computer software SPSS. 2.  Select appropriate statistical tests for particular types of data. 3.  Recognize and interpret the output from statistical analyses. 4.  Report statistical output in a concise and appropriate manner.
  • 4. BASIC TERMINOLOGY Statistics, Biostatistics, Variable, Measurement Scale, Data, Medical Data, type of data, Data Analysis
  • 5. VARIABLE, SCALE, DATA Variable is a characteristics which varies and scale is a device on which observations are taken. Data is set of observations/measurements taken from experiment/survey or external source of a specific variable using some appropriate measurement scale
  • 6. Statistics and Bio-statistics Statistics is generally understood as the subject dealing with number and data, more broadly it involves activities such as collection of data from survey or experiment, summarization or management of data, presentation of results in a convincing format, analysis of data or drawing valid inferences from findings. Whereas Bio-Statistics is science which helps us in managing medical data with application of statistical methods/techniques/tools or a collection of statistical procedures particularly well-suited to the analysis of healthcare-related data
  • 7. What is medical data? The data which is related to patient care or numerical information regarding patient’s clinical characteristics, mortality rate survival rate, disease distribution, prevalence of disease, efficacy of treatment, and other such information is called medical data.
  • 8. NATURE OF DATA Data is the value you get from observing (measuring, counting, assessing etc.) from experiment or survey. Data is either categorical or metric. Categorical data is further divided into Nominal and ordinal, whereas metric into discrete and continuous (quantitative) data.
  • 9.
  • 10. Nominal data The data is divided into classes or categories with no meaningful order, e.g. blood type, sex, cause of disease, urban/rural, alive/dead, infected/not infected, hair color, smoking status. Ordinal data The data is also divided into classes or categories, but these can be put in a meaningful order. For example, satisfaction level: very satisfied, satisfied, neutral, unsatisfied, very unsatisfied; pain: mild, moderate, severe; socioeconomic status: poor, middle, rich; grade of breast cancer; condition: better, same, worse. Discrete data Data taken from a counting process, for example number of patients in different wards, number of nurses, number of hospitals in different cities. Continuous or quantitative data Data taken from a measuring process, for example height, weight, temperature, uric acid, blood glucose and serum level.
  • 11. Primary Scales of Measurement
      Scale | Basic Characteristics | Common Examples | Marketing Examples | Permissible Statistics: Descriptive | Permissible Statistics: Inferential
      Nominal | Numbers identify & classify objects | Social Security nos., numbering of football players | Brand nos., store types | Percentages, mode | Chi-square, binomial test
      Ordinal | Nos. indicate the relative positions of objects but not the magnitude of differences between them | Quality rankings, rankings of teams in a tournament | Preference rankings, market position, social class | Percentile, median | Rank-order correlation, Friedman ANOVA
      Interval | Differences between objects | Temperature (Fahrenheit) | Attitudes, opinions, index | Range, mean, standard | Product-moment
      Ratio | Zero point is fixed, ratios of scale values can be compared | Length, weight | Age, sales, income, costs | Geometric mean, harmonic mean | Coefficient of variation
  • 12. Nominal Scale The numbers serve only as labels or tags for identifying and classifying objects. When used for identification, there is a strict one-to-one correspondence between the numbers and the objects. The numbers do not reflect the amount of the characteristic possessed by the objects. The only permissible operation on the numbers in a nominal scale is counting. Examples: social security numbers, hockey players' numbers. In marketing research: respondents, brands, attributes, stores and other objects.
  • 13. ORDINAL SCALE A ranking scale in which numbers are assigned to objects to indicate the relative extent to which the objects possess some characteristic. We can determine whether an object has more or less of a characteristic than some other object, but not how much more or less. Any series of numbers can be assigned that preserves the ordered relationships between the objects, so the scale conveys the relative position of objects, not the magnitude of the difference between them. In addition to the counting operation allowable for nominal scale data, ordinal scales permit the use of statistics based on percentiles, quartiles and the median. Ordinal scales possess description and order, not distance or origin.
  • 14. INTERVAL SCALE Numerically equal distances on the scale represent equal values in the characteristic being measured. It permits comparison of the differences between objects. The difference between 1 & 2 is same as between 2 & 3 The location of the zero point is not fixed. Both the zero point and the units of measurement are arbitrary. Everyday temperature scale. Attitudinal data obtained on rating scales. Do not possess origin characteristics (zero and exact measurement)
  • 15. RATIO SCALE The highest scale: it allows us to identify objects, rank-order them, and compare intervals or differences. It is also meaningful to compute ratios of scale values. It possesses all the properties of the nominal, ordinal, and interval scales, and it has an absolute zero point. Examples: height, weight, age, money. Sales, costs, market share and number of customers are variables measured on a ratio scale. All statistical techniques can be applied to ratio data.
  • 16. Data Analysis After accurate and reliable data have been collected from the source by an appropriate method, the next step is to extract the pertinent and useful information buried in the data for further manipulation and interpretation. The process of performing calculations and evaluations in order to extract relevant information from data is called data analysis.
  • 17. Cont…… Data analysis may take several steps to reach a conclusion. Simple data can be organized very easily, while complex data require proper processing. The word “processing” means recasting and handling the data to make them ready for analysis.
  • 18. Steps in data analysis •Questionnaire checking/Data preparation •Coding •Cleaning data •Applying most appropriate tools for analysis
  • 19. QUESTIONNAIRE CHECKING A questionnaire returned from the field may be unacceptable for several reasons. Parts of the questionnaire may be incomplete. The pattern of responses may indicate that the respondent did not understand or follow the instructions. The responses show little variance. One or more pages are missing. The questionnaire is received after the pre-established cutoff date. The questionnaire is answered by someone who does not qualify for participation.
  • 20. DATA PREPARATION Preparation of data file It is important to convert raw data into a usable form for analysis (coding where needed), that is, to transfer the information from the questionnaire into a computer database. The analysis and results depend on the quality of the data. Errors are possible in handling instruments and raw data, in transcribing, in data entry, and in assigning codes, values and value labels. Data need to be cleaned to fulfill the conditions of the analysis.
  • 21. CODING Coding means assigning a code, usually a number, to each possible response to each question.
  • 22. Data cleaning •One of the first steps in analyzing data is to “clean” it of any obvious data entry errors: •Outliers? (really high or low numbers) Example: Age = 110 (really 10 or 11?) •Value entered that doesn’t exist for variable? Example: 2 entered where 1=male, 0=female •Missing values? Did the person not give an answer? Was answer accidentally not entered into the database?
  • 23. •May be able to set defined limits when entering data Prevents entering a 2 when only 1, 0, or missing are acceptable values •Univariate data analysis is a useful way to check the quality of the data Cont……
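The three cleaning checks listed above (outliers, undefined codes, missing values) can be sketched outside SPSS as well. The following Python fragment is only an illustration: the records, the age limits and the function name `find_problems` are assumptions made for the example, not part of the slides.

```python
# Hypothetical records; None marks a value that was never entered.
records = [
    {"id": 1, "age": 19, "sex": 0},
    {"id": 2, "age": 110, "sex": 1},   # suspicious age (really 10 or 11?)
    {"id": 3, "age": 18, "sex": 2},    # 2 is not a defined code for sex
    {"id": 4, "age": None, "sex": 1},  # missing age
]

VALID_SEX = {0, 1}      # coding used on the slide: 1 = male, 0 = female
AGE_LIMITS = (0, 100)   # assumed plausible range, chosen for illustration


def find_problems(rows):
    """Return (case id, issue) pairs for every failed check."""
    problems = []
    for row in rows:
        age, sex = row["age"], row["sex"]
        if age is None or sex is None:
            problems.append((row["id"], "missing value"))
        if age is not None and not (AGE_LIMITS[0] <= age <= AGE_LIMITS[1]):
            problems.append((row["id"], "age out of range"))
        if sex is not None and sex not in VALID_SEX:
            problems.append((row["id"], "invalid sex code"))
    return problems


problems = find_problems(records)
for case_id, issue in problems:
    print(case_id, issue)
```

Flagged cases should be checked against the original questionnaire rather than corrected by guesswork.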
  • 24.
  • 25. SPSS SPSS is a statistical package for data analysis. It is very popular software in the social and medical sciences because it is easy to use.
  • 26. Launching SPSS Before starting this session, you should know how to run a program in the Windows operating system. Click the Start button at the lower left of your screen, select SPSS 16.0 from the list of programs, and click to launch the program.
  • 27. When SPSS starts, this window opens. Click the Cancel button if you want to enter data in a new file, or click OK to open an existing file. A window known as the Data Editor then opens in Variable View.
  • 28. SPSS WINDOWS There are a number of different types of windows in SPSS. The window in which you are currently working is called the active window. Some of the frequently used windows are: Data Editor Window: It displays the contents of the data file. This is the window that opens automatically when you start an SPSS session. In this window, you can create new data files or modify existing ones. When you open more than one data file, each data file has a separate Data Editor Window. The Data Editor Window provides two views of the data: Data View: It displays the data values. Each variable is a column. Each row is a case. Variable View: It displays a table consisting of variable names and their attributes. You can modify the properties of each variable, add new variables or delete existing variables in the Variable View window. (Figures: Data View window, Variable View window.)
  • 29. Viewer Window: It displays statistical results, tables, and charts. This window opens automatically the first time you run a procedure that generates output
  • 31.
  • 32. PULL-DOWN MENUS Many tasks in SPSS are performed by selecting appropriate "pull-down" menus. Each window in SPSS has its own menu bar with appropriate menu selections and toolbars. The Analyze and Graphs menus are available in all windows. Here are some Data Editor Window menus and their uses: File Menu: From the file menu you can open several different existing files or a database file such as an excel file or read in a text file. You can also save any changes to the current file. Edit Menu: from the Edit menu, you can cut, copy, paste, insert variables, insert cases, or use find in the Data Editor window. Data Menu: The data menu allows you to define variable properties, sort cases, merge files, split files, select cases and use a variable to weight cases. Transform Menu: The transform menu is where you will find the options to do some computations on variables, to create new variables from existing ones or recode old variables. Analyze Menu: The analyze menu is where all statistical analysis takes place. From descriptive statistics to regression analysis to nonparametric tests
  • 33. Graphs Menu: The graph menu is where you can create high resolution plots and graphs to be edited in the chart editor window or you can create interactive graphs. Utilities Menu: The utilities menu is used to display information on the contents of SPSS data files or to run scripts. Add-Ons Menu: From the add-ons menu you can run other packages like conjoint, classification trees, or Neural Networks. Also there are programmability extensions that allow you to integrate programs like R and Python into SPSS. But you should keep in mind that if you want to run any of the add-ons listed here you will have to purchase them separately. Window: From the window menu you can change the active window. The window with a check mark is the active one. In this case it is the data editor window. Help: The help menu allows you to get help on topics in SPSS or to ask the statistics coach some basic questions. TOOLBARS Each window in SPSS has its own toolbars that provides access to common tasks. Some windows have more than one. When you put the mouse pointer on a tool, there is a brief description of what the tool does. You can show, move or hide a toolbar.
  • 34. STATUS BARS The status bar is at the bottom of each SPSS window and provides the following information: Command Status: gives information about a procedure that is running. Filter Status: Filter On shows when a subset of cases in the data is used for analysis. Weight Status: Weight On indicates that a weight variable is being used in the analysis. Split File Status: Split File On indicates that the file has been split into separate groups for analysis. DIALOG BOXES Many menu selections will open dialog boxes. In these dialog boxes, you select variables and options for analysis. The main dialog box in any statistical procedure has the following parts: Source variable list: A list of variable types (allowed by the procedure) from the working data file. Target variable lists: One or more lists of variables needed for the analysis. Command push buttons: Buttons that can be used to run the procedure by opening a subdialog box to make additional specifications. Some of the push buttons are: OK : Click this button to run the procedure. Paste: Click this button to generate command syntax from your selections. The command syntax is pasted into a syntax window, where it can be modified for future analysis. This creates the code regularly known as SPSS programs. Reset: Deselects any selections, and resets all specifications in the dialog box and any subdialog boxes to the default status. Cancel: Cancels any change in the dialog box settings since the last time it was opened. This will close the dialog box. Help: Provides help about the current dialog box.
  • 35.
  • 36. Name The name of each SPSS variable in a given file must be unique; it must start with a letter; it may have up to 64 characters (8 in versions before SPSS 12), including letters, numbers, and the underscore _. Note that certain keywords are reserved and may not be used as variable names, e.g., "all", "and", "by", "to", "with", and so forth. To change an existing name, click in the cell containing the name, highlight the part you want to change, and type in the replacement. To create a new variable name, click in the first empty row under the name column and type a new (unique) variable name. Notice that we can use "cat_dog" but not "cat-dog" and not "cat dog". The hyphen gets interpreted as subtraction (cat minus dog) by SPSS, and the space confuses SPSS as to how many variables are being named.
  • 37. TYPE THE TWO BASIC TYPES OF VARIABLES THAT YOU WILL USE ARE NUMERIC AND STRING. NUMERIC VARIABLES MAY ONLY HAVE NUMBERS ASSIGNED. STRING VARIABLES MAY CONTAIN LETTERS OR NUMBERS, BUT EVEN IF A STRING VARIABLE HAPPENS TO CONTAIN ONLY NUMBERS, NUMERIC OPERATIONS ON THAT VARIABLE WILL NOT BE ALLOWED (E.G., FINDING THE MEAN, VARIANCE, STANDARD DEVIATION, ETC...). TO CHANGE A VARIABLE TYPE, CLICK IN THAT CELL ON THE GREY BOX WITH ...
  • 38. Decimals The decimals property of a variable is the number of decimal places that SPSS will display. If more decimals have been entered (or computed by SPSS), the additional information is retained internally but not displayed on screen. For whole numbers, you would reduce the number of decimals to zero. You can change the number of decimal places by clicking in the decimals cell for the desired variable and typing a new number, or you can use the arrow keys at the edge of the cell. Label The label of a variable is a string of text that identifies in more detail what a variable represents. Unlike the name, the label is limited to 255 characters and may contain spaces and punctuation. For instance, if there is a variable for each question on a questionnaire, you would type the question as the variable label. To change or edit a variable label, simply click anywhere within the cell.
  • 39. Values Although the variable label goes a long way to explaining what the variable represents, for categorical data (discrete data of both nominal and ordinal levels of measurement), we often need to know which numbers represent which categories. To indicate how these numbers are assigned, one can add labels to specific values by clicking on the ... box in the values cell Clicking here opens up the Value Labels dialogue box.
  • 40. To assign the value 1.0 to cats and 2.0 to dogs, type 1.0 in the Value box and cats in the Value Label box, then click the Add button. The following box will appear.
  • 41. Clicking on this box will bring up the variable type menu: If you select a numeric variable, you can then click in the width box or the decimal box to change the default values of 8 characters reserved to displaying numbers with 2 decimal places. For whole numbers, you can drop the decimals down to 0. If you select a string variable, you can tell SPSS how much "room" to leave in memory for each value, indicating the number of characters to be allowed for data entry in this string variable.
  • 42. When you are satisfied with the definitions of each value, click on the OK button
  • 43. The real beauty of value labels can be seen in the Data View by clicking on the "toe tag" icon in the tool bar , which switches between the numeric values and their labels
  • 44. A view of different variables with their descriptions
  • 45. Missing When you click the ... button in the Missing cell, SPSS displays the dialog shown here. We sometimes want to signal to SPSS that data should be treated as missing even though some numerical code has been recorded in place of the data. (When the data are actually absent, SPSS displays a single period; this is called SYSTEM MISSING data.) In this example, after clicking on the ... button in the Missing cell, I declared "9", "99", and "999" all to be treated by SPSS as missing (i.e., these values will be ignored).
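The effect of declaring user-missing codes can be illustrated with a small Python sketch. The codes 9, 99 and 999 come from the slide; the response list and the helper name `valid_values` are hypothetical, and `None` stands in for SPSS system-missing data.

```python
from statistics import mean

USER_MISSING = {9, 99, 999}   # codes declared as missing on the slide


def valid_values(values):
    """Drop system-missing entries (None) and user-missing codes."""
    return [v for v in values if v is not None and v not in USER_MISSING]


scores = [4, 5, 99, 3, None, 9, 4]   # hypothetical responses
clean = valid_values(scores)
print(clean, mean(clean))
```

Without the filter, the codes 99 and 9 would be averaged in as if they were real responses.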
  • 46. Columns The columns property tells SPSS how wide the column should be for each variable. Don't confuse this one with width, which indicates how many digits of the number will be displayed. The column size indicates how much space is allocated rather than the degree to which it is filled. Align The alignment property indicates whether the information in the Data View should be left-justified, right- justified, or centered
  • 47. Measure The Measure property indicates the level of measurement. Since SPSS does not differentiate between interval and ratio levels of measurement, both of these quantitative variable types are lumped together as "scale". Nominal and ordinal levels of measurement, however, are differentiated
  • 49. Example Suppose we have a data set with several variables that we need to enter into SPSS. The variables and the data are given below; the file is named “bp”.
  • 50. Data Set: Professor Christopher conducted a study on subjects; the variables are described below, followed by the data.
      Variable | Description
      Sjcode | Subject code
      Sex | Subject sex (0 = female, 1 = male)
      Age | Subject age
      Height | Height in inches
      Weight | Weight in pounds
      Race | Subject race (1 = Amer, 2 = Asian, 3 = black, 4 = Hispanic, 5 = white, 9 = none of above)
      Med | Taking prescription medication (0 = No, 1 = Yes)
      Smoke | Does subject smoke? (0 = Nonsmoker, 1 = smoker)
      SBPCP | Systolic blood pressure with cold pressor
      DBPCP | Diastolic blood pressure with cold pressor
      HRCP | Heart rate with cold pressor
      SBPMA | Systolic blood pressure while doing mental arithmetic
      DBPMA | Diastolic blood pressure while doing mental arithmetic
      HRMA | Heart rate while doing mental arithmetic
      SBPREST | Systolic blood pressure at rest
      DBPREST | Diastolic blood pressure at rest
      PH | Parental hypertension (0 = No, 1 = Yes)
      MEDPH | Parent(s) on EH meds (0 = No, 1 = Yes)
  • 51.
      SJcode | sex | age | height | weight | race | meds | smoking | sbpcp | dbpcp | hrcp | sbpma | dbpma | hrma | sbrest | dbrest | Ph | Medph
      3 | Female | 19 | 65 | 155 | White | No Med | Non smoker | 126 | 65 | 88 | 135.667 | 81.333 | 76.667 | 116.25 | 60.75 | PH+ | Parent EH Yes
      4 | Female | 18 | 63 | 132 | White | No Med | Non smoker | 125 | 80 | 96 | 130.667 | 82.667 | 92.667 | 115.75 | 76.375 | PH+ | Parent EH Yes
      5 | Female | 19 | 66 | 138 | White | No Med | Non smoker | 149 | 90 | 91 | 135.333 | 90.333 | 64.333 | 120.5 | 65.375 | PH+ | Parent EH Yes
      9 | Female | 18 | 66 | 130 | White | No Med | Non smoker | 113 | 89 | 88 | 128.333 | 82.333 | 85.667 | 113.625 | 72.125 | PH- | Parents EH No
      10 | Female | 18 | 66 | 175 | White | No Med | Non smoker | 112 | 70 | 82 | 121.667 | 75.333 | 85 | 110 | 68.75 | PH- | Parents EH No
      11 | Female | 18 | 62 | 113 | White | No Med | Non smoker | 125 | 70 | 73 | 133.333 | 82.333 | 74.333 | 119.75 | 73.5 | PH- | Parents EH No
      13 | Male | 20 | 73 | 159 | White | No Med | Smoker | 162 | 62 | 58 | 145.667 | 68 | 74 | 130.75 | 57.125 | PH+ | Parent EH Yes
      15 | Male | 18 | 70 | 155 | White | No Med | Non smoker | 123 | 73 | 53 | 137.333 | 78.667 | 53.667 | 126.375 | 65.625 | PH+ | Parent EH Yes
      16 | Male | 19 | 69.5 | 185 | White | No Med | Non smoker | 139 | 66 | 48 | 148.667 | 81.667 | 78.667 | 127.625 | 67.375 | PH+ | Parent EH Yes
      19 | Male | 18 | 70 | 164 | White | No Med | Non smoker | 133 | 65 | 85 | 134.333 | 58.667 | 66.667 | 121.75 | 56.5 | PH- | Parents EH No
      20 | Male | 19 | 71 | 170 | White | No Med | Non smoker | 152 | 75 | 71 | 150.333 | 73 | 82.333 | 129.875 | 60 | PH- | Parents EH No
      21 | Male | 18 | 76 | 179 | Hispanic | No Med | Non smoker | 128 | 70 | 63 | 121 | 71.333 | 71 | 121 | 68.5 | PH- | Parents EH No
      23 | Female | 19 | 68.5 | 160 | White | No Med | Non smoker | 119 | 51 | 68 | 117 | 62.333 | 73.333 | 107.875 | 51.375 | PH+ | Parent EH Yes
      24 | Female | 20 | 66 | 132 | White | No Med | Non smoker | 120 | 67 | 80 | 128.333 | 72.667 | 81 | 108 | 63.75 | PH+ | Parent EH Yes
      25 | Female | 19 | 67.5 | 150 | Black | No Med | Non smoker | 129 | 95 | 70 | 121.333 | 71 | 77 | 110.25 | 62.875 | PH- | Parents EH No
      26 | Female | 20 | 62 | 105 | White | Yes Med | Non smoker | 124 | 90 | 93 | 124 | 92.333 | 87 | 104.375 | 76.375 | PH+ | Parent EH Yes
      29 | Female | 19 | 62 | 120 | White | No Med | Non smoker | 130 | 75 | 103 | 132.667 | 76 | 88.667 | 117.625 | 67.875 | PH- | Parents EH No
      30 | Female | 18 | 67.5 | 143 | White | No Med | Non smoker | 130 | 95 | 93 | 120.667 | 83.667 | 98.333 | 111 | 77.375 | PH- | Parents EH No
      32 | Female | 18 | 63.5 | 130 | White | No Med | Non smoker | 109 | 73 | 71 | 104 | 61 | 65.667 | 105.125 | 53.875 | PH- | Parents EH No
      35 | Male | 20 | 66 | 127 | White | No Med | Non smoker | 129 | 68 | 107 | 124.333 | 63.667 | 93.333 | 117.75 | 62.75 | PH- | Parents EH No
  • 52. Entering data into data editor In this lesson our only goal is to learn how to enter, save, and edit data (the data sheet given above). The first step in entering data into the Data Editor is to define all the variables. Creating a variable requires us to name it, specify the type of data (nominal, ordinal, scale) and assign labels to the variable and its data values if needed. •Move the cursor to the bottom of the Data Editor and click the tab named Variable View; a different grid appears. •Move the cursor into the first empty cell in row 1 (under Name), type sjcode, then press Enter. •When the cursor moves to the Type column, a small grey button marked with three dots appears. Click it to see the dialog box shown; Numeric is the default variable type, so click OK.
  • 53. Note that the Measure column (far right column) should be set to Scale, because you took Numeric as the variable type. In SPSS, each variable can carry a descriptive label to help identify its meaning. To add a label: •Move the cursor into the Label column and type Subject Code. This completes the definition of the first variable. •Now let us create a variable to represent sex: move to the first column of row 2 and name the variable sex. •Sex is a categorical (qualitative) variable, and we are going to represent it numerically because numeric codes are easier to analyze in SPSS. Since Numeric is the default in the Type column, we skip it and go to Width, setting the width as required; in the Decimals column reduce the value from 2 to 0. •Label this variable as Subject sex. •Now we can assign text labels to our coded values (as discussed previously). In the Values column click the grey box with three dots. A box will open as below.
  • 54. Type “0” in value box and type Female in the value label box.
  • 55. Then click Add. Now type 1 in Value and Male in Label, click Add and then click OK. In the same way we add all the other variables; the Variable View window will then look like this:
  • 56.
  • 57. Now switch to Data View by clicking the appropriate tab in the lower left of the screen. Move the cursor to the first cell below sjcode, type 3, and press Enter. In the next cell type 4. When you have completed the subject codes, move to the top cell under sex, type “0” for female and “1” for male, and so on. When you are done, the Data Editor should look as shown. On clicking the third button from the right (named Value Labels) you will see the screen below.
  • 58. Saving the data file It is wise to save all your work in a disk file. To save a file, click on the File menu, choose Save As…, type BP next to File name, then click Save.
  • 59.
  • 60. Editing the data file/value To edit any value, open the data file, click the Edit menu, and select the case or variable that needs editing. Quitting SPSS When you have completed your work, it is important to exit the program properly. Go to the File menu, then click on Exit. Generally you will see a message asking if you wish to save changes. Since we saved everything earlier, click No.
  • 61. File management Here we discuss tasks such as transform, select, split, compute new variables, re-coding of data, merging files, sorting, transpose, and weighting cases.
  • 62. Sorting data This tool allows you to rearrange the data. Open the file, then choose Data > Sort Cases, select the variable to sort by, then click OK.
  • 63.
  • 64.
  • 65. Replacing missing values If some values of a variable are missing, they can be replaced by different methods. If the variable is categorical, the value is filled in by the researcher on the basis of his/her experience; if the variable is continuous, SPSS can help through the Replace Missing Values command. Open the file and look for missing values, for example by using the Sort command.
  • 66. Cont……… Then go to the Transform menu and choose Replace Missing Values, selecting the replacement method you want.
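One replacement method for a continuous variable is to fill each gap with the mean of the observed values. The Python sketch below assumes that series-mean method; the weights are hypothetical, with one value deliberately set to missing, and the function name is made up for the example.

```python
from statistics import mean


def replace_with_series_mean(values):
    """Fill missing entries (None) with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in values]


weights = [155, 132, None, 130, 175]   # hypothetical weights, one missing
filled = replace_with_series_mean(weights)
print(filled)
```

Mean replacement keeps the variable's mean unchanged but shrinks its variance, so it is best used sparingly.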
  • 67.
  • 68. Creating Variables Sometimes a new variable is needed on the basis of an existing variable or set of variables. The procedure is: from the menus choose Transform > Compute Variable… Enter the name of the target variable and write the desired operation (square, log, etc.) in the numeric expression box.
  • 69.
  • 70. Activity Open file “student”, convert weight into kg, then find the BMI of the students. Use 1 kg = 2.20462 lb and 1 m = 39.3701 in, and compute BMI = weight (kg) / height (m)^2. Compare this with BMI = 703 × weight (lb) / height (in)^2.
  • 71. Re-coding If the researcher wants to re-code the data, for example to recode 1 as 5, or to turn numerical data into groups, we use the Recode tool. Open the data file. From the menus choose: Transform | Recode | Into Different Variables... The Recode into Different Variables dialog box appears.
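The two BMI formulas in this activity can be compared with a short Python sketch. The conversion factors are the ones given on the slide; the function names are illustrative, and the example values (65 in, 155 lb) are taken from the first subject of the “bp” data rather than from the “student” file, which is not reproduced here.

```python
LB_PER_KG = 2.20462     # 1 kg = 2.20462 lb
INCH_PER_M = 39.3701    # 1 m = 39.3701 in


def bmi_metric(weight_lb, height_in):
    """Convert to kg and metres first, then BMI = kg / m^2."""
    weight_kg = weight_lb / LB_PER_KG
    height_m = height_in / INCH_PER_M
    return weight_kg / height_m ** 2


def bmi_imperial(weight_lb, height_in):
    """Shortcut formula: BMI = 703 * lb / in^2."""
    return 703 * weight_lb / height_in ** 2


a = bmi_metric(155, 65)
b = bmi_imperial(155, 65)
print(round(a, 2), round(b, 2))
```

The two results agree to about two decimal places, because 703 is a rounded form of 39.3701² / 2.20462.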
  • 72.
  • 73. Select the variable you want to recode. For this example select AAA, and click the right arrow button (►) to move the variable into the Input Variable > Output Variable box, following sign appears in this box: AAA >? In the Output Variable group, enter an output variable name (e.g. AA1) in the Name box, and you may label it as Stillbirth Rate Category [optional] for new variable and click change. Up to now, the dialog box looks as under:
  • 74. Click the Old and New Values... button; the following dialog box appears, in which you specify how to recode values. In the Old Value group, select the 5th choice (Range, LOWEST through value) and put 24 in the box. In the Value box under the New Value group, input 1.
  • 75. Click the Add button. Similarly, for a closed class interval such as 25-29, select the 4th choice (Range) in the Old Value group, put 25 in the first box and 29 in the "through" box, input 2 in the Value box under New Value, then click Add. Repeat this process for each closed class, selecting the 4th choice each time, until you have input 5 as the new value. For the highest open class, select the 6th choice (Range, value through HIGHEST) in the Old Value group and put 45 in the box. In the Value box under the New Value group input 6, then click Add. The final dialog looks as shown. Click Continue and then OK. The XYZ-SPSS Data Editor, now containing the two variables AAA and AA1, looks as shown, once in Variable View and once in Data View.
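The recoding rules entered in the dialog amount to a lookup from value ranges to new codes. The Python sketch below mirrors that idea. The lowest class (through 24, code 1), the class 25-29 (code 2) and the highest class (45 and above, code 6) come from the slides; the intermediate classes 30-34, 35-39 and 40-44 are an assumption about what "repeat this process" covers.

```python
# (low, high, new code); None means an open end of the range.
RULES = [
    (None, 24, 1),   # lowest through 24
    (25, 29, 2),
    (30, 34, 3),     # assumed intermediate classes
    (35, 39, 4),
    (40, 44, 5),
    (45, None, 6),   # 45 through highest
]


def recode(value):
    """Return the new code for value, or None if no rule matches."""
    for low, high, new in RULES:
        if (low is None or value >= low) and (high is None or value <= high):
            return new
    return None   # e.g. 24.5 falls between two classes


old_values = [20, 27, 33, 44, 45, 60]
new_values = [recode(v) for v in old_values]
print(new_values)
```

A value that matches no rule is returned as `None` here, so gaps between class limits show up instead of being silently assigned to a class.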
  • 76. Specify Value Labels Make the Data Editor the active window. If the data view is displayed, double-click the variable name at the top of the column in the data view or click the Variable View tab. Click the button in the values cell for the variable that you want to define. For each value, enter the value and a label (the one as seen below). Click Add to enter the value label, at last click OK.
  • 77. Activity For the BMI computed in the activity above, form the groups Underweight < 18.5, Normal 18.5 – 22.9, Overweight > 22.9. Also produce output (a frequency table) for the groups.
  • 78. Select cases This tool is used to analyze data for a sub-group or a specific group, for example the mean of respondents whose weight is above 85 kg. Open the file, choose Data on the menu bar, then Select Cases, click on If and write the selection condition. For example, to select males in the BP file, the condition is gender=1.
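Selecting cases is a filter on a condition followed by the usual analysis on the rows that remain. A minimal Python sketch, using five rows (subject code, sex, age, weight in pounds) copied from the “bp” data and the coding 0 = female, 1 = male; the field names are illustrative:

```python
from statistics import mean

# Rows taken from the "bp" data: subjects 3, 4, 13, 15 and 16.
cases = [
    {"sjcode": 3, "sex": 0, "age": 19, "weight": 155},
    {"sjcode": 4, "sex": 0, "age": 18, "weight": 132},
    {"sjcode": 13, "sex": 1, "age": 20, "weight": 159},
    {"sjcode": 15, "sex": 1, "age": 18, "weight": 155},
    {"sjcode": 16, "sex": 1, "age": 19, "weight": 185},
]

# Equivalent of the condition sex = 1 in the Select Cases dialog.
males = [c for c in cases if c["sex"] == 1]
mean_weight_males = mean(c["weight"] for c in males)
print(len(males), mean_weight_males)
```

Conditions can be combined in the same way, for example `c["sex"] == 0 and c["age"] > 50` for the activity that follows.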
  • 79.
  • 80.
  • 81.
  • 82.
  • 83. Activity Select the male cases in the “bp” file; also select the females whose age is more than 50 years.
  • 84. Merging file Two files may be merged either by variables or by cases. Suppose we have 1000 respondents, each with six variables, and two data entry operators share the task. They can divide the work in two ways: (1) divide the cases between them, or (2) divide the variables between them. The two files are then merged by adding cases or by adding variables, respectively.
  • 85. Split file A file can be split into two or three groups by the categories of a variable: go to the Data menu, select Split File, choose the grouping variable, and then perform the analysis.
  • 87. BASIC STRATEGY The following strategy is adopted to analyze the data • Description , counting, Proportion •Prediction, relationship, Association •Comparing , estimation (95% confidence interval)
  • 88. DATA ANALYSIS MAY BE DESCRIPTIVE OR INFERENTIAL DESCRIPTIVE CONTAINS MEAN, MEDIAN , MODE, SD, REGRESSION, CORRELATION , ON THE OTHER HAND CONFIDENCE INTERVAL, TESTING OF HYPOTHESIS, P-VALUE, ANOVA RELATE TO INFERENTIAL
  • 89. UNI-VARIATE DESCRIPTIVE ANALYSIS Graphical Method For nominal & ordinal data we use Bar or pie chart For continuous data we use histogram Numerical method For nominal & ordinal data we use Frequency/proportions For continuous data we use Mean , Standard deviation
  • 90. Summary Guide
      | Scale | Nominal | Ordinal
      Displaying data | Histogram, Box-plot | Bar chart, Pie chart | Bar chart, Pie chart
      Summarizing data | Mean, Median, SD | Frequency table, Percentages, Proportion | Frequency table, Percentages, Proportion
  • 92. MAKING BAR/PIE CHART Open the file, then from the pull-down menus choose Graphs > Legacy Dialogs, click Bar or Pie, select the variable, then click OK.
  • 93.
  • 94.
  • 95.
  • 96. DATA SUMMARY Open the file, then from the pull-down menus choose Analyze > Descriptive Statistics > Frequencies, select the variable and click OK. The output window will appear.
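A frequency table is a count and a percentage per category. As a rough illustration of what the Frequencies output contains, the Python sketch below tabulates the sex of the 20 subjects in the “bp” data, listed in the order they appear in the table; the variable names are illustrative.

```python
from collections import Counter

# Sex of the 20 "bp" subjects in file order.
sex = ["Female"] * 6 + ["Male"] * 6 + ["Female"] * 7 + ["Male"]

counts = Counter(sex)
total = len(sex)

# category -> (frequency, percent)
table = {cat: (n, 100 * n / total) for cat, n in counts.items()}

for cat, (n, pct) in table.items():
    print(f"{cat:<8}{n:>4}{pct:>8.1f}%")
```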
  • 97.
  • 98.
  • 100. MAKING HISTOGRAM Open the file, then from the pull-down menus choose Graphs > Legacy Dialogs, click Histogram, select the variable and click OK.
  • 101.
  • 102.
  • 103.
  • 104. DATA SUMMARY Open the file, then from the pull-down menus choose Analyze > Descriptive Statistics > Descriptives, select the variable and click OK. The output window will appear.
  • 105.
  • 106.
  • 107. FOR ALL DESCRIPTIVE STATISTICS AND 95% CONFIDENCE INTERVAL Open the file, then from the pull-down menus choose Analyze > Descriptive Statistics > Explore, select the variable and click OK. The output window will appear.
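The mean, standard deviation and 95% confidence interval reported by Explore can be reproduced by hand. The Python sketch below uses the 20 weights (in pounds) from the “bp” data. The critical value 2.093 is the two-sided 95% point of Student's t with 19 degrees of freedom, hard-coded from a t-table because the Python standard library has no t-distribution.

```python
from math import sqrt
from statistics import mean, stdev

# Weights (lb) of the 20 subjects in the "bp" data.
weights = [155, 132, 138, 130, 175, 113, 159, 155, 185, 164,
           170, 179, 160, 132, 150, 105, 120, 143, 130, 127]

T_CRIT_DF19 = 2.093   # t(0.975, df = 19), taken from a t-table

n = len(weights)
m = mean(weights)
s = stdev(weights)        # sample SD (n - 1 in the denominator)
se = s / sqrt(n)          # standard error of the mean
lower = m - T_CRIT_DF19 * se
upper = m + T_CRIT_DF19 * se

print(f"n={n} mean={m:.2f} sd={s:.2f} 95% CI=({lower:.2f}, {upper:.2f})")
```

The interval is mean ± t × SE; a different sample size needs a different critical value.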
  • 108.
  • 109. Summary Guide for appropriate analysis for two variables
      Type of variables | Graphical display | Relationship
      Categorical-categorical | Multiple bar | Contingency table
      Categorical-Scale | Box-plot | Descriptive statistics for each group
      Scale-scale | Scatter plot | Correlation
  • 111. MULTIPLE BAR CHART Open the file, then from the pull-down menus choose Graphs > Legacy Dialogs, click Bar (clustered), move one variable to the category axis and one to the cluster box, then click OK.
  • 112.
  • 113.
  • 114.
  • 115. CONTINGENCY TABLE Open the file, then from the pull-down menus choose Analyze > Descriptive Statistics > Crosstabs. Select the variables, one for rows and one for columns. For cell proportions click Cells and tick Total; for the chi-square test click Statistics and tick Chi-square. Click OK and the output window will appear.
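The Pearson chi-square statistic compares each observed cell count with the count expected under independence, (row total × column total) / n. A minimal Python sketch follows; the 2×2 counts are hypothetical and the function name is made up for the example.

```python
def chi_square(table):
    """Pearson chi-square statistic and degrees of freedom for an r x c table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, df


# Hypothetical counts: rows = group, columns = outcome (yes / no).
observed = [[10, 20],
            [20, 10]]
stat, df = chi_square(observed)
print(round(stat, 3), df)
```

The p-value comes from the chi-square distribution with `df` degrees of freedom, which SPSS reports alongside the statistic.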
  • 116.
  • 117.
  • 118.
  • 119.
  • 121. SCATTER PLOT Open the file, then from the pull-down menus choose Graphs > Legacy Dialogs > Scatter/Dot, enter the variables for the x-axis and y-axis, then click OK.
  • 122.
  • 123.
  • 124.
  • 125. CORRELATION COEFFICIENT Open the file, then from the pull-down menus choose Analyze > Correlate > Bivariate, select the variables and click OK. The output window will appear.
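The Pearson correlation coefficient is the covariation of the two variables divided by the product of their spreads. A small Python sketch, written out by hand rather than relying on a library, using the heights (in) and weights (lb) of the first six subjects in the “bp” data; the function name is illustrative.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


heights = [65, 63, 66, 66, 66, 62]        # subjects 3, 4, 5, 9, 10, 11
weights = [155, 132, 138, 130, 175, 113]

r = pearson_r(heights, weights)
print(round(r, 3))
```

With only six cases the coefficient is very unstable; the sketch shows the calculation, not a finding about the data.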
  • 126.
  • 127.
  • 128. SUMMARY ONE CATEGORICAL ONE CONTINUOUS VARIABLE When we have one categorical and one continuous variable , then for descriptive analysis we will use Explore command and for graph we use Box-plot , suppose we have gender and weight of respondents
  • 129. DESCRIPTIVE STATISTICS Open the file, go to Analyze, then select Descriptive Statistics > Explore. A window will open; move the continuous variable to the Dependent List and the categorical variable to the Factor List, then click OK.
  • 130.
  • 131.
  • 132.
  • 133. BOX PLOT Open the file, click on Graphs, then Legacy Dialogs, then Boxplot. Click Simple, then Define. Move the continuous variable to the Variable box and the categorical variable (sex, SES) to the Category Axis box, and click OK.
  • 134.
  • 135.
  • 136.
  • 137. REGRESSION ANALYSIS Regression predicts one variable on the basis of another variable or a set of variables (be sure all variables are continuous), for example predicting BP when the age of a person is 55 years. The mathematical equation is $Y(\text{BP}) = a + b\,X(\text{Age})$, where a and b are the coefficients of the equation.
  • 138. CONT….. Open the file, choose Analyze > Regression > Linear, put the dependent variable and the independent variable in their respective boxes, then click OK.
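The coefficients a and b of a simple regression line are obtained by least squares: b is the ratio of the x-y covariation to the x variation, and a makes the line pass through the point of means. The Python sketch below shows the arithmetic on toy numbers that are purely hypothetical (they are not age or BP data from the slides); the function name `fit_line` is also made up.

```python
def fit_line(x, y):
    """Least-squares intercept a and slope b for y = a + b * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


# Toy values chosen so the answer is easy to check by hand (y = 1 + 2x).
x = [1, 2, 3]
y = [3, 5, 7]
a, b = fit_line(x, y)

prediction = a + b * 4   # predicted y at x = 4
print(a, b, prediction)
```

Prediction is then a matter of substituting the x value of interest, for example an age of 55, into the fitted equation.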
  • 139.
  • 140.
  • 141.
  • 142. REGRESSION LINE This is the regression line obtained from the results on the previous slide: $Y(\text{BP}) = 129.61 + 0.075\,(\text{Age})$
  • 143. MEASURE OF RISK When we have an exposure and an outcome (a 2x2 table), the Odds Ratio (OR) can be obtained from the Crosstabs command: open Crosstabs, click on Statistics, tick Risk and click Continue.
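For a 2×2 table the odds ratio is the cross-product ratio ad / bc. A minimal Python sketch with hypothetical counts; the cell layout (a = exposed with outcome, b = exposed without, c = unexposed with, d = unexposed without) is an assumption about how the table is arranged.

```python
def odds_ratio(a, b, c, d):
    """Odds ratio for a 2x2 table.

    a: exposed, outcome present      b: exposed, outcome absent
    c: unexposed, outcome present    d: unexposed, outcome absent
    """
    return (a * d) / (b * c)


# Hypothetical counts for illustration only.
value = odds_ratio(10, 20, 5, 40)
print(value)
```

An odds ratio of 1 means equal odds in the two groups; the result is undefined when b or c is zero.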
  • 144.
  • 145.
  • 146. ACTIVITY Open file “states”: for variable “bac”, what percentage of states use the 0.8 standard? Open file “Aids”: determine the shape of the distribution of AIDS cases reported in 1994. Open file “students”: make side-by-side histograms of height for males and females. Make a cross-tab (contingency table) of gender and eye color, and compare the proportion with blue eyes in males and females. Make a scatter plot of height against weight and interpret the graph. Compute descriptive statistics for the variable amount paid for a haircut.
  • 147. ACTIVITY (CONT…) Open file “college” and focus on two variables, in-state tuition and out-of-state tuition: show which varies more (calculate the coefficient of variation). Construct a box plot of math score for public and private schools and comment on the plot. On average, in which subject (mathsat, verbsat) is the score larger?
  • 148. ACTIVITY (CONT…) Open file “GSS94” and answer the following questions. Do females tend to watch more or less TV per day than males (calculate descriptive statistics)? For respondents asked whether they are afraid to walk alone in their neighborhood, compare the mean age of those who said “yes” and those who said “no”. Make a contingency table for sex and race. Make a cross-tab of the variables marital status and marnomar and find the probability that a person is married.
  • 149. ACTIVITY (CONT…) Open file “bodyfat”: calculate the correlation between neck and chest circumference, and fit a regression line of chest circumference on neck circumference. Investigate the normality of the variables “Fatperc”, “age”, “weight” and “neck” using an appropriate test and graph.
  • 150. ACTIVITY (CONT…) Open file “sleep”: using appropriate descriptive and graphical techniques, how would you establish the relationship between the amount of sleep a species requires and the mean weight of the species? Interpret the results. Make a frequency distribution of the variable amount of sleep using appropriate intervals. Construct 95% confidence intervals for total sleep and life span. Open file “colleges”: construct a 95% confidence interval for mean room and board charges. What does it mean?
  • 151. TESTING OF HYPOTHESIS Here we will discuss: • One-sample t-test • Two-sample t-test (independent groups, dependent groups) • One-way ANOVA (F-test)
  • 152. ONE SAMPLE T-TEST Open data file “bodyfat” and test the hypothesis that the population mean body fat is 23 against the alternative that it is not equal to 23. Go to Analyze, then Compare Means, then One-Sample T Test. Select the body fat variable and enter 23 as the test value. The results are as follows.
  • 153.
  • 154.
  • 155.
  • 156. INTERPRETATION OF RESULTS Here the sample mean is 19.15, the t-statistic is -7.30 and the reported p-value is 0.000 (that is, p < 0.001). This leads us to reject the null hypothesis and conclude that the population mean body fat is not 23.
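The t-statistic in a one-sample test is the difference between the sample mean and the test value divided by the standard error of the mean, so the -7.30 above is (19.15 − 23) divided by the standard error. A minimal sketch of that calculation follows. The sample values are hypothetical (the “bodyfat” file is not reproduced here), and the p-value is left to a t table or a statistics package.

```python
import math
from statistics import mean, stdev

def one_sample_t(sample, test_value):
    """Return (t statistic, degrees of freedom) for H0: population mean = test_value."""
    n = len(sample)
    standard_error = stdev(sample) / math.sqrt(n)
    t = (mean(sample) - test_value) / standard_error
    return t, n - 1

# Hypothetical body fat percentages, for illustration only.
body_fat = [12.0, 18.0, 20.0, 22.0, 16.0, 25.0, 19.0, 20.0]
t_stat, df = one_sample_t(body_fat, 23)
print(f"t = {t_stat:.3f} with {df} degrees of freedom")
```

A negative t means the sample mean lies below the test value; the further t is from zero, the stronger the evidence against the null hypothesis.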
  • 157. TWO (INDEPENDENT) SAMPLE T-TEST Sometimes we want to compare the means of a variable of interest in two different samples, for example whether the height of boys is different from the height of girls. Open file “students” and compare the heights of boys and girls.
  • 158. Open the file, go to Analyze, then Compare Means, then Independent-Samples T Test. In the window that opens, select height as the test variable and gender as the grouping variable. Click Define Groups and enter the values that code male and female, then click OK.
  • 159.
  • 160.
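The independent-samples output usually shows two versions of the test, one assuming equal variances and one not. Both statistics can be sketched as below. The heights are hypothetical and the function names `pooled_t` and `welch_t` are illustrative; p-values are again left to a t table or a statistics package.

```python
import math
from statistics import mean, variance

def pooled_t(x, y):
    """Student's t (equal variances assumed) and its degrees of freedom."""
    nx, ny = len(x), len(y)
    pooled_var = ((nx - 1) * variance(x) + (ny - 1) * variance(y)) / (nx + ny - 2)
    se = math.sqrt(pooled_var * (1 / nx + 1 / ny))
    return (mean(x) - mean(y)) / se, nx + ny - 2

def welch_t(x, y):
    """Welch's t (equal variances not assumed) with Satterthwaite df."""
    vx = variance(x) / len(x)
    vy = variance(y) / len(y)
    t = (mean(x) - mean(y)) / math.sqrt(vx + vy)
    df = (vx + vy) ** 2 / (vx ** 2 / (len(x) - 1) + vy ** 2 / (len(y) - 1))
    return t, df

# Hypothetical heights in inches, for illustration only.
boys = [68.0, 70.0, 72.0, 69.0, 71.0]
girls = [63.0, 65.0, 64.0, 66.0, 62.0]
t_pooled, df_pooled = pooled_t(boys, girls)
t_welch, df_welch = welch_t(boys, girls)
print(f"pooled: t = {t_pooled:.3f}, df = {df_pooled}")
print(f"Welch:  t = {t_welch:.3f}, df = {df_welch:.2f}")
```

When the two groups have equal sizes and equal sample variances, as in this toy data, the two versions agree; they diverge when the variances differ.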
  • 162. PAIRED T-TEST (DEPENDENT SAMPLES) Sometimes observations are taken on the same respondents before and after some treatment, for example BP measured before and after a medicine. This type of sample is called a paired sample. Open file “swimmer2”; we wish to see whether the students' 100 meter freestyle times differ between two time points.
  • 163. Open the file, go to Analyze, then Compare Means, then Paired-Samples T Test. In the window that opens, select the two 100 meter freestyle variables and click OK.
  • 164.
  • 165.
  • 166.
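A paired t-test is a one-sample t-test on the within-person differences, tested against zero. A minimal sketch follows, with hypothetical freestyle times and an illustrative function name `paired_t`.

```python
import math
from statistics import mean, stdev

def paired_t(first, second):
    """Return (t statistic, degrees of freedom) for paired measurements."""
    if len(first) != len(second):
        raise ValueError("paired samples must have the same length")
    diffs = [a - b for a, b in zip(first, second)]
    n = len(diffs)
    standard_error = stdev(diffs) / math.sqrt(n)
    return mean(diffs) / standard_error, n - 1

# Hypothetical 100 meter freestyle times (seconds) for the same four swimmers.
trial_1 = [62.0, 65.0, 70.0, 68.0]
trial_2 = [60.0, 64.0, 67.0, 66.0]
t_paired, df_paired = paired_t(trial_1, trial_2)
print(f"t = {t_paired:.3f} with {df_paired} degrees of freedom")
```

Because each swimmer serves as their own control, variation between swimmers drops out and only the variation of the differences matters.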
  • 167. ONE-WAY ANOVA For more than two independent groups we use one-way ANOVA. Suppose we are interested in whether an off-campus job affects students' GPA. Open file student and test GPA with work category as the grouping variable. The null hypothesis is that mean GPA is the same for all work categories. If the null hypothesis is rejected, we run a post hoc test (LSD).
  • 168. PROCEDURE Open the file, go to Analyze, then Compare Means, then One-Way ANOVA. Put GPA in the Dependent List and workcat as the Factor. Click Options and, under Statistics, select Descriptive. Then click Post Hoc, select LSD in the window that opens, and click OK.
  • 169.
  • 171.
  • 172.
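The F ratio in the ANOVA table is the between-groups mean square divided by the within-groups mean square. A minimal sketch of that calculation follows. The GPA values and group labels are hypothetical, and the LSD post hoc comparisons are not covered here.

```python
from statistics import mean

def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of groups of observations."""
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f = (ss_between / df_between) / (ss_within / df_within)
    return f, df_between, df_within

# Hypothetical GPAs by work category, for illustration only.
gpa_by_workcat = [
    [3.0, 3.2, 3.4],  # no job
    [2.8, 3.0, 3.2],  # part-time job
    [2.4, 2.6, 2.8],  # full-time job
]
f_stat, df_between, df_within = one_way_anova(gpa_by_workcat)
print(f"F({df_between}, {df_within}) = {f_stat:.3f}")
```

A large F means the group means differ by more than the spread within groups would suggest; the p-value comes from the F distribution with the two degrees of freedom shown.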
  • 173. ACTIVITY Open file “GSS94” and test the null hypothesis that adults in the United States watch an average of three hours of TV daily. Test the hypothesis that males spend 3 hours watching TV (use the Select Cases command). Is there a statistically significant difference in the amount of time men and women spend watching TV? Is there a statistically significant difference in the amount of time married and divorced respondents spend watching TV?
  • 174. ACTIVITY (CONT…) Open file “students”: do commuters and residents earn significantly different mean grades? Do car owners have significantly fewer accidents on average than non-owners? Interpret your results using the 95% confidence interval and p-value. Open file “BP”: do subjects with a parental history of hypertension have significantly higher resting systolic and diastolic BP than subjects with no parental history? Open file “GSS94”: does the amount of television viewing vary by respondent's race? (ANOVA)
  • 175. Open file “BP”: is systolic BP (sbpma) related to a person's sex, parental hypertension (ph) or some combination of these factors? Open file “group”: is a subject's perception of co-workers related to gender, group size or a combination of these two factors? Open file “bodyfat”: consider a man whose chest measurement is 95 cm, abdomen is 85 cm, and weight is 158 pounds; use the regression equation to estimate this man's body fat percentage (use multiple regression). Also write the regression equation and interpret the results.
  • 176. Develop a multiple regression line to estimate body fat percentage on the basis of the following variables: age, weight, abdomen circumference, chest circumference, thigh circumference and wrist circumference, using a matrix plot, correlation matrix and p-values. Open file “salem”: test whether the variables proparri and accuser are independent (use the chi-square test). Open file “students”: do smokers tend to drink more beer than nonsmokers? (Select a parametric or non-parametric test: t-test or Mann-Whitney U test.)
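For the chi-square test of independence mentioned above, the statistic compares observed counts with the counts expected if the two variables were independent. A minimal sketch follows, with a hypothetical 2×2 table rather than the “salem” data.

```python
def chi_square(table):
    """Return (chi-square statistic, degrees of freedom) for a contingency table.

    `table` is a list of rows, each a list of observed counts.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence: row total * column total / n.
            expected = row_totals[i] * col_totals[j] / total
            statistic += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df

# Hypothetical counts, for illustration only.
observed_counts = [[30, 20], [20, 30]]
chi2, chi2_df = chi_square(observed_counts)
print(f"chi-square = {chi2:.3f} with {chi2_df} degree(s) of freedom")
```

The statistic is then compared with the chi-square distribution on the stated degrees of freedom; the test is usually considered reliable when the expected counts are not too small.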
  • 177. ADVANCED DATA ANALYSIS The following are advanced tools: • Logistic regression, survival analysis (KM curve) • Factor analysis, reliability • Repeated-measures ANOVA • Time series analysis (forecasting)
  • 178. Medical Data Analysis
      Univariate
        Categorical Data
          Descriptive Analysis: graphs (bar, pie charts); frequency (f), percentage (%), proportion
          Inferential Analysis: chi-square (χ2) test, Z-test
        Continuous Data
          Descriptive Analysis: histogram; mean ± S.D.
          Inferential Analysis: Z-test (n>30), t-test (n<30)
      Multivariate
        Categorical Data
          Descriptive Analysis: multiple bar charts, contingency table
          Inferential Analysis: association (χ2, OR, RR); prediction (logistic regression)
        Continuous Data
          Descriptive Analysis: scatter plot, box plot; relationship (regression, correlation)
          Inferential Analysis: t-test, ANOVA, multiple regression