BACKGROUND:

Statistical analysis software is a valuable tool that helps researchers perform complex calculations. However, to use such a tool effectively, the study must be well designed. The social worker must understand all the relationships involved in the study. He or she must understand the study's purpose and select the most appropriate design. The social worker must correctly represent the relationship being examined and the variables involved. Finally, he or she must enter those variables correctly into the software package. This assignment will allow you to analyze in detail the decisions made in the "Social Work Research: Chi Square" case study (ATTACHED) and the relationship between study design and statistical analysis. Assume that the data have been entered into SPSS and you have been given the output of the chi-square results. (See Week 4 Handout: Chi-Square Findings) (ATTACHED).

ASSIGNMENT:

To prepare for this Assignment, review the **Week 4 Handout: Chi-Square Findings** (ATTACHED) and follow the instructions.

**Submit** a 1-page paper of the following:

- 1. An analysis of the relationship between study design and statistical analysis used in the case study that includes:
  - A. An explanation of why you think that the agency created a plan to evaluate the program
  - B. An explanation of why the social work agency in the case study chose to use a chi-square statistic to evaluate whether there is a difference between those who participated in the program and those who did not (**Hint: Think about the level of measurement of the variables**)
  - C. A description of the research design in terms of observations (O) and interventions (X) for each group
  - D. An interpretation of the chi-square output data: What do the data say about the program?

© 2014 Laureate Education, Inc. Page 1 of 5

Week 4: A Short Course in Statistics Handout

This information was prepared to call your attention to some basic concepts underlying statistical procedures and to illustrate what types of research questions can be addressed by different statistical tests. You may not fully understand these tests without further study. However, you are strongly encouraged to note distinctions related to type of measurement used in gathering data and the choice of statistical tests. Feel free to post questions in the “Contact the Instructor” section of the course.

Statistical symbols:

- µ mu (population mean)
- α alpha (degree of error acceptable for incorrectly rejecting the null hypothesis; the probability that results are unlikely to occur by chance)
- ≠ (not equal)
- ≥ (greater than or equal to)
- ≤ (less than or equal to)
- r (sample correlation)
- ρ rho (population correlation)
- t (t score)
- z (standard score based on standard deviation)
- χ² chi square (statistical test for variables that are not interval or ratio scale, i.e., nominal or ordinal)
- p (probability that results are due to chance)

Descriptives: Descriptives are statistical tests that summarize a data set. They include calculations of measures of central tendency (mean, median, and mode) and dispersion (e.g., standard deviation and range). Note: The measures of central tendency depend on the measurement level of the variable (nominal, ordinal, interval, or ratio). If you do not recall the definitions for these levels of measurement, see http://www.ats.ucla.edu/stat/mult_pkg/whatstat/nominal_ordinal_interval.htm You can only calculate a mean and standard deviation for interval or ratio scale variables. For nominal or ordinal variables, you can examine the frequency of responses. For example, you can calculate the percentage of participants who are male and female, or the percentage of survey respondents who are in favor, against, or undecided. Often nominal data are recorded with numbers, e.g., male = 1, female = 2. Sometimes people are tempted to calculate a mean using these coding numbers, but that would be meaningless. Many questionnaires (even course evaluations) use a Likert scale to represent attitudes along a continuum (e.g., Strongly like … Strongly dislike). These too are often assigned a number for data entry, e.g., 1–5. Suppose that most of the responses were in the middle of the scale (3 on a scale of 1–5). A researcher could observe that the mode is 3, but it would not be reasonable to say that the average (mean) is 3 unless the differences between 1 and 2, 2 and 3, etc. were exactly equal. The numbers on a scale such as this are ordered from low to high or high to low, but there is no way to say that there is a quantifiably equal difference between each of the choices. In other words, the responses are ordered, but not necessarily equally spaced. Strongly agree is not five times as large as strongly disagree. (See the textbook for differences between ordinal and interval scale measures.)
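To make this concrete, here is a minimal Python sketch (the Likert response codes below are hypothetical) contrasting the mode, which is a defensible summary for ordinal codes, with the mean, which treats the codes as if they were equally spaced:

```python
from statistics import mean, mode

# Hypothetical Likert responses coded 1-5 (1 = strongly dislike, 5 = strongly like).
# The codes are ordinal: ordered, but not necessarily equally spaced.
responses = [3, 3, 2, 3, 4, 3, 1, 3, 5, 3]

print(mode(responses))  # 3 — the most frequent category, a defensible summary
print(mean(responses))  # 3.0 — computable, but treats the codes as interval-scale
```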

Inferential Statistics: Statistical tests for the analysis of differences or relationships are inferential, allowing a researcher to infer relationships between variables. All statistical tests have what are called assumptions. These are essentially rules indicating that the analysis is appropriate for the type of data. Two key types of assumptions concern whether the samples are random and the measurement levels of the variables. Other assumptions have to do with whether the variables are normally distributed. The determination of statistical significance is based on the assumption of the normal distribution. A full course in statistics would be needed to explain this fully. The key point for our purposes is that some statistical procedures require a normal distribution and others do not.

Understanding Statistical Significance: Regardless of which statistical test you use to test hypotheses, you will be looking to see whether the results are statistically significant. The statistic p is the probability that the results of a study would occur simply by chance. Essentially, a p that is less than or equal to a predetermined alpha (α) level (commonly .05) means that we can reject the null hypothesis. A null hypothesis always states that there is no difference or no relationship between the groups or variables. When we reject the null hypothesis, we conclude (but do not prove) that there is a difference or a relationship. This is what we generally want to know.
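The decision rule described above can be sketched in a few lines of Python (the p values below are illustrative):

```python
# Reject the null hypothesis when p <= alpha; otherwise fail to reject it.
ALPHA = 0.05  # the conventional significance level mentioned above

def decide(p_value, alpha=ALPHA):
    return "reject H0" if p_value <= alpha else "fail to reject H0"

print(decide(0.779))  # fail to reject H0
print(decide(0.030))  # reject H0
```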

Parametric Tests: Parametric tests are tests that require variables to be measured at interval or ratio scale and for the variables to be normally distributed.


These tests compare the means between groups, which is why they require the data to be at an interval or ratio scale. They make use of the standard deviation to determine whether the results would be likely or very unlikely to occur in a normal distribution. If they are very unlikely to occur, they are considered statistically significant. This means that the results are unlikely to occur simply by chance.

The t Test. Common uses:

- To compare the mean from a sample group to a known mean from a population
- To compare the means between two samples
- To compare pre- and post-test scores for one sample

The research question for a t test comparing the mean scores between two samples is: Is there a difference in scores between group 1 and group 2? The hypotheses tested would be:

H0: µgroup1 = µgroup2
H1: µgroup1 ≠ µgroup2

The research question for a t test comparing the mean scores for a sample with pre- and post-tests is: Is there a difference in scores between time 1 and time 2? The hypotheses tested would be:

H0: µpre = µpost
H1: µpre ≠ µpost

Example of the form for reporting results: The results of the test were not statistically significant, t(57) = .282, p = .779; thus, the null hypothesis is not rejected. There is no difference between pre and post scores for participants on a measure of knowledge (for example).

An explanation: The t is a value calculated using means and standard deviations and their relationship to a normal distribution. If you calculated the t using a formula, you would compare the obtained t to a table of t values based on one less than the number of participants (n − 1); n − 1 represents the degrees of freedom. The obtained t must be greater than a critical value of t in order to be significant. For example, if statistical analysis software calculated that p = .779, this result is much greater than .05, the alpha level most researchers use to establish significance. In order for the t test to be significant, it would need to have a p ≤ .05.
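To make the calculation concrete, here is a sketch of the equal-variance independent-samples t statistic computed by hand in Python (the group scores are hypothetical; in practice software such as SPSS reports both t and p):

```python
from math import sqrt
from statistics import mean, variance

# Hypothetical scores for two independent groups.
group1 = [10, 12, 9, 11, 13, 12]
group2 = [14, 15, 13, 16, 14, 15]

n1, n2 = len(group1), len(group2)
# Pooled variance: a weighted average of the two sample variances,
# with n1 + n2 - 2 degrees of freedom.
sp2 = ((n1 - 1) * variance(group1) + (n2 - 1) * variance(group2)) / (n1 + n2 - 2)
t = (mean(group1) - mean(group2)) / sqrt(sp2 * (1 / n1 + 1 / n2))
df = n1 + n2 - 2
print(round(t, 2), df)  # -4.52 10
```

The obtained t would then be compared to the critical value from a t table at df = 10, or the software's p value would be compared to alpha.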

ANOVA (Analysis of Variance). Common uses: Similar to the t test; however, it can be used when there are more than two groups. The hypotheses would be:

H0: µgroup1 = µgroup2 = µgroup3 = µgroup4
H1: The means are not all equal (some may be equal)


Correlation. Common use: To examine whether two variables are related, that is, whether they vary together. The calculation of a correlation coefficient (r or rho) is based on means and standard deviations. This requires that both (or all) variables are measured at an interval or ratio level. The coefficient can range from −1 to +1. An r of 1 is a perfect correlation. A + means that as one variable increases, so does the other. A − means that as one variable increases, the other decreases. The research question for correlation is: Is there a relationship between variable 1 and one or more other variables? The hypotheses for a Pearson correlation are:

H0: ρ = 0 (there is no correlation)
H1: ρ ≠ 0 (there is a real correlation)
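As a sketch of the calculation (using hypothetical paired interval-scale measurements), the Pearson r can be computed from sums of squares:

```python
from math import sqrt
from statistics import mean

# Hypothetical paired interval-scale measurements.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

mx, my = mean(x), mean(y)
ssx = sum((xi - mx) ** 2 for xi in x)                      # sum of squares for x
ssy = sum((yi - my) ** 2 for yi in y)                      # sum of squares for y
ssxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))  # "covariance" sum of squares
r = ssxy / sqrt(ssx * ssy)
print(round(r, 3))  # 0.775
```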

Non-parametric Tests. Nonparametric tests are tests that do not require variables to be measured at an interval or ratio scale and do not require the variables to be normally distributed.

Chi Square. Common uses: Chi-square tests of independence and measures of association and agreement for nominal and ordinal data. The research question for a chi-square test of independence is: Is there a relationship between the independent variable and the dependent variable? The hypotheses are:

H0 (The null hypothesis): There is no difference in the proportions in each category of one variable between the groups (defined as categories of another variable). Or: The frequency distribution for variable 2 has the same proportions for both categories of variable 1.

H1 (The alternative hypothesis): There is a difference in the proportions in each category of one variable between the groups (defined as categories of another variable).

The calculations are based on comparing the observed frequency in each category to what would be expected if the proportions were equal. (If the observed frequencies equal the expected frequencies, then there is no difference.)
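The chi-square computation itself can be sketched as follows (the 2 × 3 table below is hypothetical; the p value would then be looked up from a chi-square distribution with the reported degrees of freedom, or read from software output such as SPSS):

```python
# Compare each observed cell count to the count expected if the
# proportions were equal across groups: chi2 = sum of (O - E)^2 / E.
observed = [
    [10, 15, 25],  # hypothetical group 1: e.g., not employed, part-time, full-time
    [20, 15, 15],  # hypothetical group 2
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand = sum(row_totals)

chi2 = 0.0
for i, row in enumerate(observed):
    for j, obs in enumerate(row):
        expected = row_totals[i] * col_totals[j] / grand
        chi2 += (obs - expected) ** 2 / expected

df = (len(observed) - 1) * (len(col_totals) - 1)
print(round(chi2, 3), df)  # 5.833 2
```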


See the SOCW 6311: Week 4 Working With Data Assignment Handout to explore the Crosstabs procedure for chi square analysis.

Other non-parametric tests: Spearman rho: A correlation test for rank ordered (ordinal scale) variables.


Week 4 Handout: Chi-Square Findings

The chi-square test for independence is used to determine whether there is a relationship between two variables that are categorical in their level of measurement. In this case, the variables are employment level and treatment condition. It tests whether there is a difference between groups. The research question for the study is: Is there a relationship between the independent variable, treatment, and the dependent variable, employment level? In other words, is there a difference in the number of participants who are not employed, employed part-time, and employed full-time between the program group and the control group (i.e., the waitlist group)? The hypotheses are:

H0 (The null hypothesis): There is no difference in the proportions of individuals in the three employment categories between the treatment group and the waitlist group. In other words, the frequency distribution for variable 2 (employment) has the same proportions for both categories of variable 1 (program participation).

** It is the null hypothesis that is actually tested by the statistic. A chi-square statistic that is found to be statistically significant (e.g., p < .05) indicates that we can reject the null hypothesis (understanding that there is less than a 5% chance that the observed relationship between the variables is due to chance).

H1 (The alternative hypothesis): There is a difference in the proportions of individuals in the three employment categories between the treatment group and the waitlist group.

** The alternative hypothesis states that there is a difference. It would allow us to say that the treatment (the vocational rehabilitation program) appears to be effective in increasing the employment status of participants.

Assume that the data have been collected to answer the above research question. Someone has entered the data into SPSS. A chi-square test was conducted, and you were given the following SPSS output data:


Statistics for Social Workers

J. Timothy Stocks

Statistics refers to a branch of mathematics dealing with the direct description of sample or population characteristics and the analysis of population characteristics by inference from samples. It covers a wide range of content, including the collection, organization, and interpretation of data. It is divided into two broad categories: descriptive statistics and inferential statistics.

Descriptive statistics involves the computation of statistics or parameters to describe a sample or a population. All the data are available and used in computation of these aggregate characteristics. This may involve reports of central tendency or variability of single variables (univariate statistics). It also may involve enumeration of the relationships between or among two or more variables (bivariate or multivariate statistics). Descriptive statistics are used to provide information about a large mass of data in a form that may be easily understood. The defining characteristic of descriptive statistics is that the product is a report, not an inference.

Inferential statistics involves the construction of a probable description of the characteristics of a population based on sample data. We compute statistics from a partial set of the population data (a sample) to estimate the population parameters. These estimates are not exact, but we can make reasonable judgments as to how precise our estimates are. Included within inferential statistics is hypothesis testing, a procedure for using mathematics to provide evidence for the existence of relationships between or among variables. This testing is a form of inferential argument.

Descriptive Statistics

Measures of Central Tendency

Measures of central tendency are individual numbers that typify the total set of scores. The three most frequently used measures of central tendency are the arithmetic mean, the mode, and the median.

Arithmetic Mean. The arithmetic mean usually is simply called the mean. It also is called the average. It is computed by adding up all of a set of scores and dividing by the number of scores in the set. The algebraic representation of this is


µ = ΣX / n,

where µ represents the population mean, X represents an individual score, and n is the number of scores being added.

The formula for the sample mean is the same except that the mean is represented by the variable letter with a bar above it:

X̄ = ΣX / n.

Following are the numbers of class periods skipped by 20 seventh-graders during 1 week: {1, 6, 2, 6, 15, 20, 3, 20, 17, 11, 15, 18, 8, 3, 17, 16, 14, 17, 0, 10}. We compute the mean by adding up the class periods missed and dividing by 20:

µ = ΣX / n = 219 / 20 = 10.95.
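The same computation in Python, using the 20 truancy scores above:

```python
# Mean = sum of the scores divided by the number of scores.
scores = [1, 6, 2, 6, 15, 20, 3, 20, 17, 11, 15, 18, 8, 3, 17, 16, 14, 17, 0, 10]
mu = sum(scores) / len(scores)
print(mu)  # 10.95
```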

Mode. The mode is the most frequently appearing score. It really is not so much a measure of centrality as it is a measure of typicalness. It is found by organizing scores into a frequency distribution and determining which score has the greatest frequency. Table 6.1 displays the truancy scores arranged in a frequency distribution.

TABLE 6.1 Truancy Scores

Score   Frequency
20      2
19      0
18      1
17      3
16      1
15      2
14      1
13      0
12      0
11      1
10      1
9       0
8       1
7       0
6       2
5       0
4       0
3       2
2       1
1       1
0       1

Because 17 is the most frequently appearing number, the mode (or modal number) of class periods skipped is 17.

Unlike the mean or median, a distribution o f scores can have more than one mode.

Median. If we take all the scores in a set of scores, place them in order from least to greatest, and count in to the middle, then the score in the middle is the median. This is easy enough if there is an odd number of scores. However, if there is an even number of scores, then there is no single score in the middle. In this case, the two middle scores are selected, and their average is the median.

There are 20 scores in the previous example. The median would be the average of the 10th and 11th scores. We use the frequency table to find these scores, which are 11 and 14. Thus, the median is 12.5.
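The median and mode can be checked in a couple of lines, using the same truancy scores:

```python
from statistics import median, mode

# median() sorts the 20 scores and averages the 10th and 11th;
# mode() returns the most frequent score.
scores = [1, 6, 2, 6, 15, 20, 3, 20, 17, 11, 15, 18, 8, 3, 17, 16, 14, 17, 0, 10]
print(sorted(scores))
print(median(scores))  # 12.5
print(mode(scores))    # 17
```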

Measures of Variability

Whereas measures of central tendency are used to estimate a typical score in a distribution, measures of variability may be thought of as a way in which to measure departure from typicalness. They provide information on how "spread out" the scores in a distribution are.

Range. The range is the easiest measure of variability to calculate. It is simply the distance from the minimum (lowest) score in a distribution to the maximum (highest) score. It is obtained by subtracting the minimum score from the maximum score.

Let us compute the range for the following data set:

{2, 6, 10, 14, 18, 22}.

The minimum is 2, and the maximum is 22:

Range = 22 − 2 = 20.

Sum of Squares. The sum of squares is a measure of the total amount of variability in a set of scores. Its name tells how to compute it: sum of squares is short for sum of squared deviation scores. It is represented by the symbol SS.

The formulas for the population and sample sums of squares are the same except for the population and sample mean symbols:

SS = Σ(X − µ)² (population) and SS = Σ(X − X̄)² (sample).

Using the data set for the range, the sum of squares would be computed as in Table 6.2.

Variance. Another name for variance is mean square, which is short for mean of squared deviation scores. It is obtained by dividing the sum of squares by the number of scores (n). It is a measure of the average amount of variability associated with each score in a set of scores. The population variance formula is

σ² = SS / n,

where σ² is the symbol for population variance, SS is the symbol for sum of squares, and n stands for the number of scores in the population.

The variance for the example we used to compute the sum of squares would be

σ² = 280 / 6 = 46.67.

TABLE 6.2 Computing the Sum of Squares

X     X − µ    (X − µ)²
2     −10      100
6     −6       36
10    −2       4
14    +2       4
18    +6       36
22    +10      100

NOTE: ΣX = 72; n = 6; µ = 12; Σ(X − µ)² = 280.

The sample variance is not an unbiased estimator of the population variance. If we compute the variances for samples using the SS/n formula, then the sample variances will average out smaller than the population variance. For this reason, the sample variance is computed differently from the population variance:

s² = SS / (n − 1).



The n − 1 is a correction factor for this tendency to underestimate. It is called degrees of freedom. If our example were a sample, then the variance would be

s² = 280 / (6 − 1) = 280 / 5 = 56.

Standard Deviation. Although the variance is a measure of the average variability associated with each score, it is on a different scale from the scores themselves. The variance measures the average squared deviation from the mean. To get a measure of average variability on the same scale as the original scores, we take the square root of the variance. The standard deviation is the square root of the variance. The formulas are

σ = √(SS / n) and s = √(SS / (n − 1)).

Using the same set of numbers as before, the population standard deviation would be

σ = √46.67 = 6.83,

and the sample standard deviation would be

s = √56 = 7.48.

For a normally distributed set of scores, approximately 68% of all scores will be within one standard deviation of the mean.
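The sum of squares, variance, and standard deviation computations above can be reproduced for the same data set {2, 6, 10, 14, 18, 22}:

```python
from math import sqrt

x = [2, 6, 10, 14, 18, 22]
n = len(x)
m = sum(x) / n                       # mean = 12
ss = sum((xi - m) ** 2 for xi in x)  # sum of squares = 280
pop_var = ss / n                     # population variance = 46.67
samp_var = ss / (n - 1)              # sample variance = 56
print(round(sqrt(pop_var), 2), round(sqrt(samp_var), 2))  # 6.83 7.48
```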

Measures of Relationship

Table 6.3 shows the relationship between the number of stressors experienced by a parent during a week and that parent's frequency of use of corporal punishment during the same week.

One can use regression procedures to derive the line that best fits the data. This line is referred to as a regression line (or line of best fit, or prediction line). Such a line has been calculated for the example plot. It has a Y intercept of −3.555 and a slope of +1.279. This gives us the prediction equation of

Ŷ = −3.555 + 1.279X,

where Y is frequency of corporal punishment and X is stressors. This is graphically depicted in Figure 6.1.

Slope is the change in Y for a unit increase in X. So, the slope of +1.279 means that an increase in stressors (X) of 1 will be accompanied by an increase in predicted frequency of corporal punishment (Y) of +1.279 incidents per week. If the slope were a negative number, then an increase in X would be accompanied by a predicted decrease in Y.

The equation does not give the actual value of Y (called the obtained or observed score); rather, it gives a prediction of the value of Y for a certain value of X. For example, if X were 3, then we would predict that Y would be −3.555 + 1.279(3) = −3.555 + 3.837 = 0.282.

[FIGURE 6.1 Frequency of Stressors and Use of Corporal Punishment: a scatter plot of the Table 6.3 data with the regression line Ŷ = −3.555 + 1.279X.]

TABLE 6.3 Frequency of Stressors and Use of Corporal Punishment

Stressors   Punishment
3           0
4           1
4           2
5           3
6           4
7           5
8           6
7           7
9           8
10          9

The regression line is the line that predicts Y such that the error of prediction is minimized. Error is defined as the difference between the obtained score and the predicted score. The equation for computing error is

E = Y − Ŷ.

When X = 4, there are two obtained values of Y: 1 and 2. The predicted value of Y is

Ŷ = −3.555 + 1.279(4) = −3.555 + 5.116 = 1.561.

The error of prediction is E = 1 − 1.561 = −0.561 for Y = 1, and E = 2 − 1.561 = +0.439 for Y = 2.

If we square each error difference score and sum the squares,

then we get a quantity called the error sum of squares, which is represented by

SSE = Σ(Y − Ŷ)².

The regression line is the one line that gives the smallest value for SSE.
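The prediction and error computations above can be reproduced directly from the example regression equation:

```python
# Prediction equation from the example: Y_pred = -3.555 + 1.279 * X.
def y_pred(x):
    return -3.555 + 1.279 * x

print(round(y_pred(3), 3))  # 0.282

# Errors at X = 4, where the obtained Y values are 1 and 2.
errors = [1 - y_pred(4), 2 - y_pred(4)]
print([round(e, 3) for e in errors])  # [-0.561, 0.439]
```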


The SSE is a measure of the total variability of obtained score values around their predicted values. There are two other sums of squares that are important to understanding correlation and regression.

The total sum of squares (SST) is a measure of the total variability of the obtained score values around the mean of the obtained scores. The SST is represented by

SST = Σ(Y − Ȳ)².

The remaining sum of squares is called the regression sum of squares (SSR) or the explained sum of squares. If we square each of the differences between predicted scores and the mean and then add them up, we get the SSR, which is represented by

SSR = Σ(Ŷ − Ȳ)².

The SSR is a measure of the total variability of the predicted score values around the mean of the obtained scores.

An important and interesting feature of these three sums of squares is that the sum of the SSR and SSE is equal to the SST:

SST = SSR + SSE.

This leads us to three other important statistics: the proportion of variance explained (PVE), the correlation coefficient, and the standard error of estimate.

Proportion of Variance Explained. The PVE is a measure of how well the regression line predicts obtained scores. The values of PVE range from 0 (no predictive value) to 1 (prediction with perfect accuracy). The equation for PVE is

PVE = SSR / SST.

There also is a computational equation for the PVE, which is

PVE = (SSXY)² / (SSX · SSY),

where

SSXY is the "covariance" sum of squares: Σ(X − X̄)(Y − Ȳ),
SSX is the sum of squares for variable X: Σ(X − X̄)², and
SSY is the sum of squares for variable Y: Σ(Y − Ȳ)².

The procedure for computing these sums of squares is outlined in Table 6.4. The proportion of variance in the frequency of corporal punishment that may be explained by stressors experienced is

PVE = (61.5)² / ((48.1)(82.5)) = 3782.25 / 3968.25 = 0.953.

TABLE 6.4 Computation of r² (PVE)

X     X − X̄    (X − X̄)²    Y     Y − Ȳ    (Y − Ȳ)²    (X − X̄)(Y − Ȳ)
3     −3.3     10.89       0     −4.5     20.25       +14.85
4     −2.3     5.29        1     −3.5     12.25       +8.05
4     −2.3     5.29        2     −2.5     6.25        +5.75
5     −1.3     1.69        3     −1.5     2.25        +1.95
6     −0.3     0.09        4     −0.5     0.25        +0.15
7     +0.7     0.49        5     +0.5     0.25        +0.35
8     +1.7     2.89        6     +1.5     2.25        +2.55
7     +0.7     0.49        7     +2.5     6.25        +1.75
9     +2.7     7.29        8     +3.5     12.25       +9.45
10    +3.7     13.69       9     +4.5     20.25       +16.65

NOTE: X̄ = 6.3; SSX = 48.1; Ȳ = 4.5; SSY = 82.5; SSXY = +61.5. (X is stressors and Y is corporal punishment incidents, as in the regression equation.)

The PVE sometimes is called the coefficient of determination and is represented by the symbol r².

Correlation Coefficient. A correlation coefficient also is a measure of the strength of relationship between two variables. The correlation coefficient is represented by the letter r and can take on values between −1 and +1 inclusive. The correlation coefficient always has the same sign as the slope. If one squares a correlation coefficient, then one will obtain the PVE. It is computed using the following formula:

r = SSXY / √(SSX · SSY).

For our example data, the correlation coefficient would be

r = +61.5 / √((48.1)(82.5)) = +61.5 / √3968.25 = +61.5 / 62.994 = +0.976.
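The sums of squares, r, and PVE can be recomputed from the Table 6.3 data (stressors paired with corporal punishment incidents):

```python
from math import sqrt
from statistics import mean

# The Table 6.3 pairs: X = stressors, Y = corporal punishment incidents.
stressors  = [3, 4, 4, 5, 6, 7, 8, 7, 9, 10]
punishment = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

mx, my = mean(stressors), mean(punishment)
ssx  = sum((x - mx) ** 2 for x in stressors)
ssy  = sum((y - my) ** 2 for y in punishment)
ssxy = sum((x - mx) * (y - my) for x, y in zip(stressors, punishment))
r = ssxy / sqrt(ssx * ssy)
print(round(ssx, 1), round(ssy, 1), round(ssxy, 1))  # 48.1 82.5 61.5
print(round(r, 3), round(r ** 2, 3))                 # 0.976 0.953
```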

Standard Error of Estimate. The standard error of estimate is the standard deviation of the prediction errors. It is computed like any other standard deviation: the square root of the SSE divided by the degrees of freedom.

The first step is to compute the variance error (s²_E):

s²_E = SSE / (n − 2).

Notice that the value for degrees of freedom is n − 2 rather than n − 1. The reason why we subtract 2 in this instance is that the variance error (and standard error of estimate) is a statistic describing characteristics of two variables. They deal with the error involved in the prediction of Y (one variable) from X (the other variable).

The standard error of estimate is the square root of the variance error:

s_E = √(s²_E).

The standard error of estimate tells us how spread out scores are with respect to their predicted values. If the error scores (E = Y − Ŷ) are normally distributed around the prediction line, then about 68% of actual scores will fall within ±1 s_E of their predicted values.

We can calculate the standard error of estimate using the following computing formula:

s_E = s_Y √((1 − r²)(n − 1) / (n − 2)),

where

s_Y is the standard deviation of Y,
r is the correlation coefficient for X and Y, and
n is the sample size.

For the example data, with s_Y = √(SSY / (n − 1)) = √(82.5 / 9) = 3.028, this would be

s_E = 3.028 √((1 − .953)(9/8)) = 3.028 √0.053 = (3.028)(0.230) ≈ 0.70.
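As a cross-check, the standard error of estimate can be computed directly as √(SSE / (n − 2)) from the Table 6.3 data and the prediction equation derived earlier:

```python
from math import sqrt

# Table 6.3 pairs: X = stressors, Y = corporal punishment incidents.
stressors  = [3, 4, 4, 5, 6, 7, 8, 7, 9, 10]
punishment = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
n = len(stressors)

# Prediction errors from the example regression equation, squared and summed.
preds = [-3.555 + 1.279 * x for x in stressors]
sse = sum((y - p) ** 2 for y, p in zip(punishment, preds))
se = sqrt(sse / (n - 2))
print(round(se, 2))  # 0.7
```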

Inferential Statistics: Hypothesis Testing

The Null and Alternative Hypotheses

Classical statistical hypothesis testing is based on the evaluation of two rival hypotheses: the null hypothesis and the alternative hypothesis.

We try to detect relationships by identifying changes that are unlikely to have occurred simply because of random fluctuations of dependent measures. Statistical analysis is the usual procedure for identifying such relationships.

The null hypothesis is the hypothesis that there is no relationship between two variables. This implies that if the null hypothesis is true, then any apparent relationship in sample
