1 Introduction

This tutorial offers some advice on what to consider when creating surveys and questionnaires, provides tips on visualizing survey data, and exemplifies how survey and questionnaire data can be analyzed. As this tutorial is introductory, issues relating to which software to use when creating a survey (e.g. SurveyMonkey, Qualtrics, GoogleForms, etc.) or how to program questionnaires or online experiments in Java or R are not covered.

This tutorial is aimed at beginners and intermediate users of R with the goal of showing how to visualize and analyze survey and questionnaire data using R. The aim is not to provide a fully-fledged analysis but rather to show and exemplify selected useful methods associated with surveys and questionnaires.

The entire R Notebook for the tutorial can be downloaded here. If you want to render the R Notebook on your machine, i.e. knitting the document to html or a pdf, you need to make sure that you have R and RStudio installed and you also need to download the bibliography file and store it in the same folder where you store the Rmd file.

How to use the R Notebook for this tutorial

As all calculations and visualizations in this tutorial rely on R, it is necessary to install R and RStudio. If these programs (or, in the case of R, environments) are not already installed on your machine, please search for them in your favorite search engine and add the term “download”. Open any of the first few links and follow the installation instructions (they are easy to follow, do not require any specifications, and are pretty much self-explanatory).

To follow this tutorial interactively (by using the R Notebook), follow the instructions listed below.

  1. Create a folder somewhere on your computer
  2. Download the R Notebook and save it in the folder you have just created
  3. Open R Studio
  4. Click on “File” in the upper left corner of the R Studio interface
  5. Click on “New Project…”
  6. Select “Existing Directory”
  7. Browse to the folder you have just created and click on “Create New Project”
  8. Now click on “Files” above the lower right panel
  9. Click on the file “surveys.Rmd”
  • The Markdown file of this tutorial should now be open in the upper left panel of R Studio. To execute the code blocks used in this session, simply click on the green arrows in the top right corner of the code boxes.
  • To render a PDF of this tutorial, simply click on “Knit” above the upper left panel in R Studio.

2 Design Basics

A survey is a research method for gathering information based on a sample of people. Questionnaires are a research instrument and typically represent a part of a survey, i.e. that part where participants are asked to answer a set of questions. Brown (2001, 6) defines questionnaires as “any written instruments that present respondents with a series of questions or statements to which they are to react either by writing out their answers or selecting among existing answers.”

Questionnaires elicit three types of data:

  • Factual
  • Behavioral
  • Attitudinal

While factual and behavioral questions are about what the respondent is and does, attitudinal questions tap into what the respondent thinks or feels.

The advantages of surveys are that they

  • offer a relatively cheap, quick, and efficient way to collect (targeted) data from a comparatively large set of people; and
  • can be distributed or carried out in various formats (face-to-face, by telephone, by computer or via social media, or by postal service).

Disadvantages of surveys are that they are prone to providing unreliable or unnatural data. Data gathered via surveys can be unreliable due to the social desirability bias, which is the tendency of respondents to answer questions in a manner that will be viewed favorably by others. Thus, the data that surveys provide may not necessarily be representative of actual natural behavior.

Questionnaires and surveys are widely used in language research and are thus one of the most common research designs. In this section, we will discuss what needs to be kept in mind when designing questionnaires and surveys, what pieces of software or platforms one can use, options for visualizing questionnaire and survey data, statistical methods that are used to evaluate questionnaire and survey data (reliability), and which statistical methods are used when analyzing the data.

3 Things to consider

Here are some rules to consider during the creation of questionnaires and before any survey is distributed.

  • Surveys should not be longer than they have to be, while they have to be long enough to collect all the data that are needed. It is crucial that a questionnaire collects all necessary information (including socio-demographic details).

  • The language should be simple and easy to understand - this means that jargon should be avoided. Also, leading questions and value judgements on the side of the creators of questionnaires should be avoided to prevent social desirability bias.

  • Before distributing a questionnaire, it should be piloted. Piloting is essential to check if respondents understand the questions as intended, and to check how long it takes to answer the questions. Also, the people who are involved in the piloting should be allowed to provide feedback to avoid errors.

  • When questions go beyond simply collecting socio-demographic details or if the data contains test and filler items, the order of questions (within blocks) should be quasi-randomized. Quasi-randomization means that test items are not asked in direct succession and that they do not appear as first or last items. Quasi-randomization helps to avoid fatigue effects or results that are caused by the ordering of questions. However, the questions should still follow an internally consistent logic so that related questions appear in the same block. Also, more specific questions should be asked after more general questions.

  • Questions must be unambiguous and they cannot ask about multiple aspects at once. Take, for example, the following question: “Do you consider UQ to be a good university with respect to teaching and research?” If the respondent answers positively, then no issues arise, but if the answer is negative, i.e. “No”, then we do not know if the respondent thinks that UQ is not a good university with respect to teaching OR with respect to research OR both! In such cases, questions should be split:

    • Do you consider UQ to be a good university with respect to teaching?

    • Do you consider UQ to be a good university with respect to research?

  • To check if respondents are concentrated, read the questions carefully, and answer truthfully, it is useful to include reverse questions, i.e. questions that have the opposite polarity. Reverse questions make it possible to check if respondents only tick “very satisfied” or “completely agree” without regard to the content of the question. Giving the same answer to questions which have opposite polarity would indicate that respondents do not read questions carefully or do not answer truthfully.

  • If questions are not open or unstructured, i.e. if different options to answer a question are provided, it is crucial that the options are fine-grained enough so that the data that is collected allows us to answer the research question that we want to investigate. In this context, the scaling of answer options is important. Scales reflect different types of answering options and they come in three basic forms: nominal and categorical, ordinal, or numeric.

    • Nominal and categorical scales: Nominal and categorical scales only list the members of a particular class. Nominal scales offer exactly two options (yes/no or on/off), while categorical scales offer several options (e.g. the state in which someone was born).

    • Ordinal scales: With ordinal scales it is possible to rank the values, but the distances between the ranks cannot be exactly quantified. An example of an ordinal scale is the ranking in a 100-meter run. The 2nd in a 100-meter run did not go twice as fast as the 4th. It is often the case that ordinal variables consist of positive integers (1, 2, 3, 4, etc.). In the context of surveys, ordinal scales are the most important as all Likert scales (after the psychologist Rensis Likert) are ordinal scales. The levels of the typical five-level Likert item could be: Strongly disagree (1), Disagree (2), Neither agree nor disagree (3), Agree (4), and Strongly agree (5). As such, the Likert scale is a bipolar scale that can be balanced, if there is an odd number of options with the center option being neutral, or unbalanced, if there is an even number of options which forces respondents to express a preference for either of the two poles (this is called a “forced choice” method).

    • (True) Numeric scales: There are two basic types of numeric scales: interval scales and ratio scales. For interval scales, the differences between levels are meaningful, but not the ratios between levels. For instance, 20 degrees Celsius is not twice as hot as 10 degrees Celsius. For ratio scales both the differences and the ratios between the levels are meaningful (e.g. the times in a 100-meter dash: 10 is exactly twice as high as 5 and half as much as 20).

Of these scales, numeric is the most informative and questionnaires should always aim to extract the most detailed information without becoming too long.
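
In R, these scale types map onto different data types. The following is a minimal sketch with made-up responses (the objects `born_in`, `agreement`, and `dash_time` are illustrative assumptions and not part of the tutorial data): categorical answers are stored as unordered factors, Likert answers as ordered factors, and ratio-scaled measurements as plain numbers.

```r
# categorical scale: unordered factor, the levels are mere labels
born_in <- factor(c("QLD", "NSW", "QLD", "VIC"))

# ordinal scale: ordered factor for a five-level Likert item
likert_levels <- c("Strongly disagree", "Disagree",
                   "Neither agree nor disagree", "Agree", "Strongly agree")
agreement <- factor(c("Agree", "Disagree", "Strongly agree", "Agree"),
                    levels = likert_levels, ordered = TRUE)

# ratio scale: plain numeric vector (fictitious times in seconds)
dash_time <- c(10.2, 11.5, 12.1, 10.9)

# rank comparisons are meaningful for ordinal data ...
agreement > "Disagree"
# ... and so is the median of the ranks
median(as.integer(agreement))
# ratios are only meaningful on the ratio scale
dash_time[2] / dash_time[1]
```

Storing Likert items as ordered factors keeps the level order explicit, so later plots and tables do not fall back to alphabetical ordering.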

4 Visualizing survey data

Just as the data that is provided by surveys and questionnaires can take diverse forms, there are numerous ways to display questionnaire data. In the following, we will have a look at some of the most common or useful ways in which survey and questionnaire data can be visualized. However, before we can begin, we need to set up our R session as shown below.

Preparation and session set up

To run the scripts shown below without errors, certain packages need to be installed from an R library. Before turning to the code below, please install the packages by running the code below this paragraph. If you have already installed the packages mentioned below, then you can skip ahead and ignore this section. To install the necessary packages, simply run the following code - it may take some time (between 1 and 5 minutes to install all of the libraries, so you do not need to worry if it takes some time).

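# install packages (those used in the code of this tutorial)
install.packages(c("tidyverse", "likert", "psych", "cowplot", "here", "ufs"))
# devtools::install_github("matherion/userfriendlyscience", dependencies=T)
# install klippy for copy-to-clipboard button in code chunks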

You can now activate the packages by running the code chunk below.

# set options
options(stringsAsFactors = F)         # no automatic data transformation
options("scipen" = 100, "digits" = 4) # suppress scientific notation
# activate packages
library(tidyverse)
library(likert)
library(psych)
library(here)
# library(userfriendlyscience)
# activate klippy for copy-to-clipboard button

Once you have installed R and RStudio and have also initiated the session by executing the code shown above, you are good to go.
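
If you are unsure whether everything is in place, a quick check along the following lines may help. This is only a sketch: the package list in `pkgs` is an assumption based on the packages named in this tutorial and should be adjusted to your own setup.

```r
# packages this tutorial refers to (assumed list, adjust as needed)
pkgs <- c("tidyverse", "likert", "psych", "cowplot", "here")

# requireNamespace() returns FALSE instead of failing if a package is absent
installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)

# names of the packages that still need to be installed
missing_pkgs <- pkgs[!installed]
missing_pkgs
```

An empty result means that all listed packages are available; otherwise the names that are printed can be passed to `install.packages()`.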

Line graphs for Likert-scaled data

A special case of line graphs is used when dealing with Likert-scaled variables (we will talk about Likert scales in more detail below). In such cases, the line graph displays the density of cumulative frequencies of responses. The difference between the cumulative frequencies of responses displays differences in preferences. We will only focus on how to create such graphs using the ggplot environment here as it has an in-built function (ecdf) which is designed to handle such data.

In a first step, we load a data set (ldat) which contains Likert-scaled variables. This data set represents fictitious ratings of students from three courses about how satisfied they were with their learning experience. The response to the Likert item is numeric so that strongly disagree/very dissatisfied would get the lowest (1) and strongly agree/very satisfied the highest numeric value (5).

# define color vectors
clrs3 <- c("firebrick4",  "gray70", "darkblue")
clrs5 <- c("firebrick4", "firebrick1", "gray70", "blue", "darkblue")
# load data
ldat <- base::readRDS(url("https://slcladal.github.io/data/lid.rda", "rb"))

Let’s briefly inspect the ldat data set.

The ldat data set has only two columns: a column labeled Course which has three levels (German, Japanese, and Spanish) and a column labeled Satisfaction which contains values from 1 to 5 which represent values ranging from very dissatisfied to very satisfied. Now that we have data resembling a Likert-scaled item from a questionnaire, we will display the data in a cumulative line graph.

# create cumulative density plot
ldat %>%
  ggplot(aes(x = Satisfaction, color = Course)) + 
  geom_step(aes(y = ..y..), stat = "ecdf") +
  labs(y = "Cumulative Density") + 
  scale_x_discrete(limits = c("1","2","3","4","5"), 
                   breaks = c(1,2,3,4,5),
                   labels=c("very dissatisfied", "dissatisfied", 
                            "neutral", "satisfied", "very satisfied")) + 
  scale_colour_manual(values = clrs3) + 
  theme_bw()

Satisfaction with the German course was the lowest, as the red line shows the highest density (frequency of responses) of “very dissatisfied” and “dissatisfied” ratings. The students in our fictitious data set were most satisfied with the Spanish course, as the blue line is the lowest for “very dissatisfied” and “dissatisfied” ratings, while the difference between the courses shrinks for “satisfied” and “very satisfied”. The Japanese language course is in between the German and the Spanish course.
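
To see what the cumulative density lines represent, it can help to compute the values by hand. The sketch below uses base R's `ecdf()` on a small made-up vector of ratings (`ratings` is fictitious and not part of ldat).

```r
# fictitious ratings on a 1-5 satisfaction scale
ratings <- c(1, 2, 2, 3, 3, 3, 4, 4, 5, 5)

# ecdf() returns a function that gives the share of values <= x
cum_density <- ecdf(ratings)

# cumulative share of responses at each scale point
data.frame(Satisfaction = 1:5, CumulativeDensity = cum_density(1:5))
```

A line that rises early reflects many low ratings, which is why the highest line at the low end of the scale indicates the lowest satisfaction.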

Pie charts

Most commonly, the data for visualization comes from tables of absolute frequencies associated with a categorical or nominal variable. The default way to visualize such frequency tables are pie charts and bar plots. In a first step, we modify the data to get counts and percentages.

# create bar plot data
bdat <- ldat %>%
  dplyr::group_by(Satisfaction) %>%
  dplyr::summarise(Frequency = n()) %>%
  dplyr::mutate(Percent = round(Frequency/sum(Frequency)*100, 1)) %>%
  # order the levels of Satisfaction manually so that the order is not alphabetical
  dplyr::mutate(Satisfaction = factor(Satisfaction, 
                                      levels = 1:5,
                                      labels = c("very dissatisfied",
                                                 "dissatisfied",
                                                 "neutral",
                                                 "satisfied",
                                                 "very satisfied")))

Let’s briefly inspect the new data set.
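
As a cross-check of what such a table of counts and percentages contains, the same kind of summary can be sketched in base R. The ratings below are made up for illustration and are not the ldat data.

```r
# fictitious ratings on a 1-5 scale
ratings <- c(1, 2, 2, 3, 3, 3, 4, 4, 5, 5)

# absolute frequencies; factor() keeps all five levels even if one is unused
freq <- table(factor(ratings, levels = 1:5))

# percentages rounded to one decimal place
pct <- round(as.vector(freq) / sum(freq) * 100, 1)

data.frame(Satisfaction = names(freq),
           Frequency = as.vector(freq),
           Percent = pct)
```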

Before creating bar plots, we will briefly turn to pie charts because pie charts are very common despite suffering from certain shortcomings. Consider the following example which highlights some of the issues that arise when using pie charts.

# create pie chart
bdat %>%
  ggplot(aes("", Percent, fill = Satisfaction)) + 
  geom_bar(stat="identity", width=1, color = "white") +
  coord_polar("y", start=0) +
  scale_fill_manual(values = clrs5) +
  theme_void()

If the slices of the pie chart are not labelled, it is difficult to see which slices are smaller or bigger compared to other slices. This problem can easily be avoided by using a bar plot instead. The issue can also be mitigated by adding labels to pie charts. The labeling of pie charts is, however, somewhat tedious as the positioning is tricky. Below is an example of adding labels without specifying their position.

# create pie chart
bdat %>%
  ggplot(aes("", Percent, fill = Satisfaction)) +
  geom_bar(stat="identity", width=1, color = "white") +
  coord_polar("y", start=0) +
  scale_fill_manual(values = clrs5) +
  theme_void() +
  geom_text(aes(y = Percent, label = Percent), color = "white", size=6)

To place the labels where they make sense, we will add another variable to the data called “Position”.

pdat <- bdat %>%
  dplyr::arrange(desc(Satisfaction)) %>%
  dplyr::mutate(Position = cumsum(Percent)- 0.5*Percent)

Let’s briefly inspect the new data set.
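
The arithmetic behind the Position variable is easiest to follow on a small example. In the sketch below (with made-up percentages), each label is placed at the midpoint of its slice: the cumulative sum marks where a slice ends, and subtracting half of the slice's own size moves the label back to its center.

```r
# fictitious percentages, already sorted in the order the slices are drawn
percent <- c(10, 20, 30, 40)

# end point of each slice along the circular axis
slice_end <- cumsum(percent)

# label position = midpoint of each slice
position <- slice_end - 0.5 * percent
position
```

For these values the label positions are 5, 20, 45, and 80, i.e. the centers of the intervals 0-10, 10-30, 30-60, and 60-100.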

Now that we have specified the position, we can include it in the pie chart.

# create pie chart
pdat %>%
  ggplot(aes("", Percent, fill = Satisfaction)) + 
  geom_bar(stat="identity", width=1, color = "white") +
  coord_polar("y", start=0) +
  scale_fill_manual(values = clrs5) +
  theme_void() +
  geom_text(aes(y = Position, label = Percent), color = "white", size=6)

We will now create separate pie charts for each course. In a first step, we create a data set that does not only contain the Satisfaction levels and their frequency but also the course.

# create grouped pie data
gldat <- ldat %>%
  dplyr::group_by(Course, Satisfaction) %>%
  dplyr::summarise(Frequency = n()) %>%
  dplyr::mutate(Percent = round(Frequency/sum(Frequency)*100, 1),
                Satisfaction = factor(Satisfaction, 
                                      levels = 1:5,
                                      labels = c("very dissatisfied",
                                                 "dissatisfied",
                                                 "neutral",
                                                 "satisfied",
                                                 "very satisfied"))) %>%
  dplyr::arrange(desc(Satisfaction)) %>%
  dplyr::mutate(Position = cumsum(Percent)- 0.5*Percent)

Let’s briefly inspect the new data set.

Now that we have created the data, we can plot separate pie charts for each course.

# create pie chart
gldat %>%
  ggplot(aes("", Percent, fill = Satisfaction)) + 
  facet_wrap(~Course) +
  geom_bar(stat="identity", width=1, color = "white") +
  coord_polar("y", start=0) +
  scale_fill_manual(values = clrs5) +
  theme_void() +
  geom_text(aes(y = Position, label = Percent), color = "white", size=4)

Bar plots

Like pie charts, bar plots display frequency information across categorical variable levels.

# bar plot
bdat %>%
  ggplot(aes(Satisfaction, Percent, fill = Satisfaction)) +
  # specify type of plot
  geom_bar(stat="identity") +
  # apply black & white theme
  theme_bw() +
  # add and define text
  geom_text(aes(y = Percent-5, label = Percent), color = "white", size=3) +
  # add colors
  scale_fill_manual(values = clrs5) +
  # suppress legend
  theme(legend.position="none")

Compared with the pie chart, it is much easier to grasp the relative size and order of the percentage values, which shows that pie charts are unfit to show relationships between elements in a graph and, as a general rule of thumb, should be avoided.

Bar plots can be grouped, which adds another layer of information that is particularly useful when dealing with frequency counts across multiple categorical variables. But before we can create grouped bar plots, we need to create an appropriate data set.

# create bar plot data
gldat <- ldat %>%
  dplyr::group_by(Course, Satisfaction) %>%
  dplyr::summarise(Frequency = n()) %>%
  dplyr::mutate(Percent = round(Frequency/sum(Frequency)*100, 1)) %>%
  dplyr::mutate(Satisfaction = factor(Satisfaction, 
                                      levels = 1:5,
                                      labels = c("very dissatisfied",
                                                 "dissatisfied",
                                                 "neutral",
                                                 "satisfied",
                                                 "very satisfied")))

Let’s briefly inspect the data set.

We have now added Course as an additional categorical variable and will include Course as the “fill” argument in our bar plot. To group the bars, we use the command “position=position_dodge()”.

# bar plot
gldat %>%
  ggplot(aes(Satisfaction, Frequency, fill = Course)) + 
  geom_bar(stat="identity", position = position_dodge()) +
  # define colors
  scale_fill_manual(values = clrs3) + 
  # add text
  geom_text(aes(label=Frequency), vjust=1.6, color="white", 
            # define text position and size
            position = position_dodge(0.9),  size=3.5) + 
  theme_bw()

Bar plots are particularly useful when visualizing data obtained through Likert items. As this is a very common issue that empirical researchers face, there are two basic ways to display Likert items using bar plots: grouped bar plots and more elaborate scaled bar plots.

Although we have seen above how to create grouped bar plots, we will repeat it here with the language course example used above when we used cumulative density line graphs to visualize how to display Likert data.

In a first step, we recreate the data set which we have used above. The data set consists of a Likert-scaled variable (Satisfaction) which represents ratings of students from three courses about how satisfied they were with their language-learning course. The response to the Likert item is numeric so that “strongly disagree/very dissatisfied” would get the lowest and “strongly agree/very satisfied” the highest numeric value.

Again, we can also plot separate bar graphs for each class by specifying “facets”.

# create grouped bar plot
gldat %>%
  ggplot(aes(Satisfaction, Frequency, 
             fill = Satisfaction, 
             color = Satisfaction)) +
  facet_grid(~Course) +
  geom_bar(stat="identity", position=position_dodge()) +
  geom_line() +
  # define colors
  scale_fill_manual(values=clrs5) +
  scale_color_manual(values=clrs5) +
  # add text and define color
  geom_text(aes(label=Frequency), vjust=1.6, color="white", 
            # define text position and size
            position = position_dodge(0.9),  size=3.5) +     
  theme_bw()

Another and very interesting way to display such data is by using the Likert package. In a first step, we need to activate the package, clean the data, and extract a subset for the data visualization example.

One aspect that is different from previous visualizations is that, when using the Likert package, we need to transform the data into a “likert” object (which is, however, very easy and is done by using the “likert()” function as shown below).

sdat  <- base::readRDS(url("https://slcladal.github.io/data/sdd.rda", "rb"))

As you can see, we need to clean and adapt the column names. To do this, we will

  • add an identifier which shows which question we are dealing with (e.g. Q 1: question text)
  • replace the dots between words with spaces
  • add a question mark at the end of questions
  • remove superfluous white spaces
# clean column names
colnames(sdat)[3:ncol(sdat)] <- paste0("Q ", stringr::str_pad(1:10, 2, "left", "0"), ": ", colnames(sdat)[3:ncol(sdat)]) %>%
  stringr::str_replace_all("\\.", " ") %>%
  stringr::str_squish() %>%
  stringr::str_replace_all("$", "?")
# inspect column names
colnames(sdat)
##  [1] "Group"                                                                   
##  [2] "Respondent"                                                              
##  [3] "Q 01: How did you like the course?"                                      
##  [4] "Q 02: How did you like the teacher?"                                     
##  [5] "Q 03: Was the content intersting?"                                       
##  [6] "Q 04: Was the content adequate for the course?"                          
##  [7] "Q 05: Were there enough discussions?"                                    
##  [8] "Q 06: Was the use of online choose appropriate?"                      
##  [9] "Q 07: Was the teacher appropriately prepared?"                           
## [10] "Q 08: Was the workload of the course appropriate?"                       
## [11] "Q 09: Was the course content enganging?"                                 
## [12] "Q 10: Was there enough interactive exerceises included in the sessions?"
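
The individual cleaning steps can be followed on a toy example. The sketch below mirrors the steps listed above in base R, applied to two hypothetical raw column names (`raw` is made up for illustration; the tutorial itself uses the stringr functions shown earlier).

```r
# hypothetical raw column names with dots instead of spaces
raw <- c("How.did.you.like.the.course.", "How.did.you.like.the.teacher.")

# prefix each name with a zero-padded question identifier
labelled <- paste0("Q ", sprintf("%02d", seq_along(raw)), ": ", raw)

# replace dots with spaces, collapse repeated spaces, trim the ends
spaced <- trimws(gsub("\\s+", " ", gsub(".", " ", labelled, fixed = TRUE)))

# append the question mark
cleaned <- paste0(spaced, "?")
cleaned
```

The trailing dot first becomes a trailing space, which is why the white space has to be trimmed before the question mark is appended.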

Now that we have nice column names, we will replace the numeric values (1 to 5) with labels ranging from disagree to agree and convert our data into a data frame.

lbs <- c("disagree", "somewhat disagree", "neither agree nor disagree",  "somewhat agree", "agree")
survey <- sdat %>%
  dplyr::mutate_if(is.character, factor) %>%
  dplyr::mutate_if(is.numeric, factor, levels = 1:5, labels = lbs) %>%
  tidyr::drop_na() %>%
  as.data.frame()

Now, we can use the plot and the likert function to visualize the survey data.

plot(likert(survey[,3:12]), ordered = F, wrap= 60)
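
As a rough illustration of the kind of summary such a plot conveys, the sketch below computes, for each item, the share of responses below, at, and above the scale midpoint. This is not the likert package's own code: the data and the helper `summarise_item()` are hypothetical and only serve to make the idea concrete.

```r
# fictitious answers to two items on a 1-5 scale
answers <- data.frame(item_a = c(1, 2, 4, 5, 5, 3, 4, 4),
                      item_b = c(2, 2, 1, 3, 3, 4, 2, 1))

# percentage of responses below, at, and above the midpoint of the scale
summarise_item <- function(x, midpoint = 3) {
  c(low     = mean(x < midpoint),
    neutral = mean(x == midpoint),
    high    = mean(x > midpoint)) * 100
}

# one row per item, percentages in columns
item_summary <- t(sapply(answers, summarise_item))
item_summary
```

Each row sums to 100, so items can be compared directly even if they were answered by different numbers of respondents.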

To save this plot, you can use the save_plot function from the cowplot package as shown below.

survey_p1 <- plot(likert(survey[,3:12]), ordered = F, wrap= 60)
# save plot
cowplot::save_plot(here("images", "stu_p1.png"), # where to save the plot
                   survey_p1,        # object to plot
                   base_asp = 1.5,  # ratio of space for questions vs space for plot
                   base_height = 8) # size! higher for smaller font size

An additional and very helpful feature is that the likert package enables grouping the data as shown below. We display columns 3 to 8 and use column 1 for grouping.

# creating plot
plot(likert(survey[,3:8], grouping = survey[,1]))

5 Useful statistics

This section introduces some statistical measures or tests that are useful when dealing with survey data. We will begin with measures of reliability (Cronbach’s \(\alpha\)), then move on to methods for merging variables (factor analysis and principal component analysis), and finally to ordinal regression, which tests which variables correlate with a certain outcome.

Evaluating the reliability of questions

Cronbach’s Alpha

Oftentimes several questions in one questionnaire aim to tap into the same cognitive concept or attitude or whatever we are interested in. The answers to these related questions should be internally consistent, i.e. the responses should correlate strongly and positively.

Cronbach’s \(\alpha\) (Cronbach 1951) is a measure of internal consistency or reliability that provides information on how strongly the responses to a set of questions correlate. The formula for Cronbach’s \(\alpha\) is shown below (N: number of items, \(\bar c\): average inter-item co-variance among items, \(\bar v\): average variance).

\(\alpha = \frac{N*\bar c}{\bar v + (N-1)\bar c}\)

If the values for Cronbach’s \(\alpha\) are low (below .7), then this indicates that the questions are not internally consistent (and do not tap into the same concept) or that the questions are not uni-dimensional (as they should be).
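
To make the formula above concrete, the following sketch computes Cronbach’s \(\alpha\) by hand from the variance-covariance matrix of three items. The responses in `items` are made up for illustration and are unrelated to the data analyzed in this tutorial.

```r
# fictitious responses of eight participants to three related items
items <- data.frame(q1 = c(1, 2, 3, 4, 5, 4, 2, 3),
                    q2 = c(2, 2, 3, 5, 5, 4, 1, 3),
                    q3 = c(1, 3, 3, 4, 4, 5, 2, 2))

covmat <- cov(items)                      # item variance-covariance matrix
N      <- ncol(items)                     # number of items
v_bar  <- mean(diag(covmat))              # average item variance
c_bar  <- mean(covmat[lower.tri(covmat)]) # average inter-item covariance

# Cronbach's alpha following the formula above
alpha_manual <- (N * c_bar) / (v_bar + (N - 1) * c_bar)
alpha_manual
```

Because it is based on covariances rather than correlations, this value should correspond to what the psych package reports as raw_alpha for the same data.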

While Cronbach’s \(\alpha\) is the most frequently used measure of reliability (probably because it is conceptually simple and can be computed very easily), it underestimates the reliability of a test and overestimates the first factor saturation. This can be a problem if the data is lumpy. Thus, various other measures of reliability have been proposed. Also, Cronbach’s \(\alpha\) assumes that scale items are repeated measurements, an assumption that is often violated.

An alternative reliability measure that takes the amount of variance per item into account and thus performs better when dealing with lumpy data (although it is still affected by lumpiness) is Guttman’s Lambda 6 (G6) (Guttman 1945). In contrast to Cronbach’s \(\alpha\), G6 is mostly used to evaluate the reliability of individual test items though. This means that it provides information about how well individual questions reflect the concept that they aim to tap into.

Probably the best measures of reliability are \(\omega\) (omega) measures. Hierarchical \(\omega\) provides more appropriate estimates of the general factor saturation while total \(\omega\) is a better estimate for the reliability of the total test compared to both Cronbach’s \(\alpha\) and G6 (Revelle and Zinbarg 2009).

Calculating Cronbach’s alpha in R

We will now calculate Cronbach’s \(\alpha\) in R. In a first step, we activate the “psych” package and load as well as inspect the data.

# load data
surveydata <- base::readRDS(url("https://slcladal.github.io/data/sud.rda", "rb"))

The inspection of the data shows that the responses of the participants represent the rows and that the questions represent columns. The column names show that we have 15 questions and that the first five questions aim to test how outgoing respondents are. To check if the first five questions reliably test “outgoingness” (or “extraversion”), we calculate Cronbach’s alpha for these five questions.

Thus, we use the “alpha()” function and provide the questions that tap into the concept we want to assess. In addition to Cronbach’s \(\alpha\), the “alpha()” function also reports Guttman’s lambda_6 which is an alternative measure for reliability. This is an advantage because Cronbach’s \(\alpha\) underestimates the reliability of a test and overestimates the first factor saturation.

# calculate cronbach's alpha
Cronbach <- psych::alpha(surveydata[c("Q01_Outgoing",   
                   "Q02_Outgoing",
                   "Q03_Outgoing",
                   "Q04_Outgoing",
                   "Q05_Outgoing")], check.keys=F)
# inspect results
Cronbach
## Reliability analysis   
## Call: psych::alpha(x = surveydata[c("Q01_Outgoing", "Q02_Outgoing", 
##     "Q03_Outgoing", "Q04_Outgoing", "Q05_Outgoing")], check.keys = F)
##   raw_alpha std.alpha G6(smc) average_r S/N    ase mean  sd median_r
##       0.98      0.98    0.97      0.89  42 0.0083  3.1 1.5      0.9
##     95% confidence boundaries 
##          lower alpha upper
## Feldt     0.96  0.98  0.99
## Duhachek  0.96  0.98  0.99
##  Reliability if an item is dropped:
##              raw_alpha std.alpha G6(smc) average_r S/N alpha se   var.r med.r
## Q01_Outgoing      0.97      0.97    0.97      0.89  33   0.0108 0.00099  0.89
## Q02_Outgoing      0.97      0.97    0.96      0.89  31   0.0116 0.00054  0.89
## Q03_Outgoing      0.97      0.97    0.97      0.90  35   0.0104 0.00095  0.90
## Q04_Outgoing      0.97      0.97    0.96      0.89  31   0.0115 0.00086  0.89
## Q05_Outgoing      0.98      0.98    0.97      0.91  41   0.0088 0.00034  0.91
##  Item statistics 
##               n raw.r std.r r.cor r.drop mean  sd
## Q01_Outgoing 20  0.96  0.96  0.95   0.94  3.1 1.5
## Q02_Outgoing 20  0.97  0.97  0.96   0.95  3.2 1.6
## Q03_Outgoing 20  0.95  0.95  0.94   0.93  3.1 1.5
## Q04_Outgoing 20  0.97  0.97  0.96   0.95  3.0 1.6
## Q05_Outgoing 20  0.94  0.94  0.91   0.90  3.2 1.6
## Non missing response frequency for each item
##                 1   2    3    4    5 miss
## Q01_Outgoing 0.20 0.2 0.10 0.25 0.25    0
## Q02_Outgoing 0.15 0.3 0.05 0.15 0.35    0
## Q03_Outgoing 0.15 0.3 0.05 0.25 0.25    0
## Q04_Outgoing 0.25 0.2 0.05 0.30 0.20    0
## Q05_Outgoing 0.20 0.2 0.10 0.20 0.30    0

The output of the “alpha()” function is rather extensive and we will only interpret selected output here.

The value under alpha is Cronbach’s \(\alpha\) and it should be above 0.7. The values to its left and right are the lower and upper bound of its confidence interval. The values in the column with the header “G6” indicate how well each question represents the concept it aims to reflect. Low values indicate that the question does not reflect the underlying concept while high values (.7 and higher) indicate that the question captures that concept well (or to an acceptable degree).
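
To make the raw alpha value less of a black box, the following sketch computes Cronbach’s \(\alpha\) by hand using the standard formula \(\alpha = \frac{k}{k-1}\left(1 - \frac{\sum_i s_i^2}{s_{total}^2}\right)\), where \(k\) is the number of items, \(s_i^2\) the variance of item \(i\), and \(s_{total}^2\) the variance of the sum score. The data are simulated and the helper `cronbach_alpha()` is a hypothetical name introduced only for illustration; it is not part of the psych package.

```r
# Cronbach's alpha from first principles (base R only).
# NOTE: 'items' is simulated toy data, not the tutorial's surveydata.
set.seed(42)
trait <- rnorm(50)                       # hypothetical latent trait
items <- sapply(1:5, function(i) trait + rnorm(50, sd = 0.5))

cronbach_alpha <- function(x) {
  k <- ncol(x)                           # number of items
  item_var <- apply(x, 2, var)           # variance of each item
  total_var <- var(rowSums(x))           # variance of the sum score
  (k / (k - 1)) * (1 - sum(item_var) / total_var)
}

alpha_hat <- cronbach_alpha(items)
alpha_hat
```

Because the five simulated items all share the same latent trait, the resulting value should be high; with less related items the ratio of summed item variances to total variance grows and \(\alpha\) drops.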


The omega (\(\omega\)) coefficient is also a reliability measure of internal consistency. \(\omega\) represents an estimate of the general factor saturation of a test that was proposed by McDonald. Zinbarg et al. (2005) compare McDonald’s \(\omega\) to Cronbach’s \(\alpha\) and Revelle’s \(\beta\). They conclude that omega is the best estimate (Zinbarg et al. 2006).

A very convenient way to calculate McDonald’s \(\omega\) is to use the scaleStructure() function from the ufs package (which also provides Cronbach’s \(\alpha\) and the Greatest Lower Bound (GLB) estimate, which is also a very good and innovative measure of reliability) (see Peters 2014).

# activate package
library(ufs)
# extract reliability measures
reliability <- ufs::scaleStructure(surveydata[c("Q01_Outgoing", "Q02_Outgoing", "Q03_Outgoing", 
    "Q04_Outgoing", "Q05_Outgoing")])
# inspect results
print(reliability)
## Information about this analysis:
##                  Dataframe: surveydata[c("Q01_Outgoing", "Q02_Outgoing", "Q03_Outgoing", 
##                      Items: all
##               Observations: 20
##      Positive correlations: 10 out of 10 (100%)
## Estimates assuming interval level:
## Information about this analysis:
##                  Dataframe:     "Q04_Outgoing", "Q05_Outgoing")]
##                      Items: all
##               Observations: 20
##      Positive correlations: 10 out of 10 (100%)
## Estimates assuming interval level:
##              Omega (total): 0.98
##       Omega (hierarchical): 0.95
##    Revelle's omega (total): 0.98
## Greatest Lower Bound (GLB): 0.99
##              Coefficient H: 0.98
##          Coefficient alpha: 0.98
## (Estimates assuming ordinal rank not computed, as the polychoric correlation matrix has missing values.)
## Note: the normal point estimate and confidence interval for omega are based on the procedure suggested by Dunn, Baguley & Brunsden (2013) using the MBESS function ci.reliability, whereas the psych package point estimate was suggested in Revelle & Zinbarg (2008). See the help ('?scaleStructure') for more information.

Factor analysis

When dealing with many variables, it is commonly the case that several variables are related and represent a common, underlying factor. To find such underlying factors, we can use a factor analysis.

Factor analysis is a method that allows us to discover commonalities or structure in the data. This is particularly useful when dealing with many variables. Factors can be considered hidden latent variables or driving forces that affect or underlie several variables at once.

This becomes particularly apparent when considering socio-demographic variables, as behaviors do not depend on single variables, e.g., economic status, but on the interaction of several additional variables such as education level, marital status, number of children, etc. All of these variables can be combined into a single factor (or hidden latent variable).
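
The idea of hidden factors that drive several observed variables at once can be illustrated with a small simulation. The sketch below is not part of the tutorial’s data: the latent variables `f1` and `f2` and the item names `A1` to `B3` are made up, and the example only assumes the base R function `factanal()`.

```r
# Toy factor analysis (base R 'stats' only).
# NOTE: the data are simulated; f1, f2 and the item names are made up.
set.seed(1)
n  <- 200
f1 <- rnorm(n)                           # hypothetical latent factor 1
f2 <- rnorm(n)                           # hypothetical latent factor 2
toy <- data.frame(
  A1 = f1 + rnorm(n, sd = 0.5),
  A2 = f1 + rnorm(n, sd = 0.5),
  A3 = f1 + rnorm(n, sd = 0.5),
  B1 = f2 + rnorm(n, sd = 0.5),
  B2 = f2 + rnorm(n, sd = 0.5),
  B3 = f2 + rnorm(n, sd = 0.5)
)
# extract two factors and rotate them to a simple structure
fa <- factanal(toy, factors = 2, rotation = "varimax")
print(fa$loadings, cutoff = 0.3)
```

In this simulated setting, the A items should load on one factor and the B items on the other, which mirrors how related survey questions are expected to group together.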

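# remove the respondent identifier: keep only the question columns (Q01 to Q15)
surveydata <- surveydata %>% 
  dplyr::select(dplyr::starts_with("Q"))
# fit a factor analysis with three factors and varimax rotation
factoranalysis <- factanal(surveydata, 3, rotation="varimax")
print(factoranalysis, digits=2, cutoff=.2, sort=TRUE)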
## Call:
## factanal(x = surveydata, factors = 3, rotation = "varimax")
## Uniquenesses:
##     Q01_Outgoing     Q02_Outgoing     Q03_Outgoing     Q04_Outgoing 
##             0.09             0.06             0.12             0.07 
##     Q05_Outgoing Q06_Intelligence Q07_Intelligence Q08_Intelligence 
##             0.14             0.10             0.13             0.10 
## Q09_Intelligence Q10_Intelligence     Q11_Attitude     Q12_Attitude 
##             0.28             0.41             0.08             0.14 
##     Q13_Attitude     Q14_Attitude     Q15_Attitude 
##             0.04             0.09             0.06 
## Loadings:
##                  Factor1 Factor2 Factor3
## Q06_Intelligence -0.82    0.25    0.41  
## Q07_Intelligence -0.80            0.47  
## Q08_Intelligence -0.85            0.42  
## Q09_Intelligence -0.79            0.29  
## Q11_Attitude      0.96                  
## Q12_Attitude      0.92                  
## Q13_Attitude      0.97                  
## Q14_Attitude      0.95                  
## Q15_Attitude      0.96                  
## Q01_Outgoing              0.94          
## Q02_Outgoing              0.96          
## Q03_Outgoing              0.93          
## Q04_Outgoing              0.96          
## Q05_Outgoing              0.92          
## Q10_Intelligence -0.22   -0.46    0.57  
##                Factor1 Factor2 Factor3
## SS loadings       7.29    4.78    1.02
## Proportion Var    0.49    0.32    0.07
## Cumulative Var    0.49    0.80    0.87
## Test of the hypothesis that 3 factors are sufficient.
## The chi square statistic is 62.79 on 63 degrees of freedom.
## The p-value is 0.484

The results of a factor analysis can be visualized so that questions which reflect the same underlying factor are grouped together.

# plot factor 1 by factor 2
load <- factoranalysis$loadings[,1:2]
# set up plot
plot(load, type="n", xlim = c(-1.5, 1.5)) 
# add variable names
text(load,
     # define labels
     labels=names(surveydata),
     # define font size 
     # (smaller than default = values smaller than 1, e.g. 0.7)
     cex=.7)

The plot shows that the questions form groups, which indicates that the questions do a rather good job at reflecting the concepts that they aim to tap into. The only problematic question is question 10 (Q10), which aimed to tap into the intelligence of respondents but does not appear to correlate strongly with the other questions that aim to extract information about the respondents’ intelligence. In such cases, it makes sense to remove a question (in this case Q10) from the survey as it does not appear to reflect what we wanted it to.

Principal component analysis

Principal component analysis is used when several questions or variables reflect a common factor and they should be combined into a single variable, e.g. during the statistical analysis of the data. Thus, principal component analysis can be used to collapse different variables (or questions) into one.

Imagine you have measured the length of sentences in different ways (in words, syllables, characters, time it takes to pronounce, etc.). You could combine all these different measures of length by applying a PCA to these measures and using the first principal component as a single proxy for all these different measures.
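
The sentence-length scenario can be sketched as follows. All numbers are simulated and the column names (`words`, `syllables`, `characters`, `seconds`) are hypothetical; the example only assumes the base R function `princomp()`.

```r
# Toy PCA: several correlated measures of sentence length.
# NOTE: all numbers are simulated; the column names are made up.
set.seed(7)
words <- rpois(100, lambda = 15) + 1     # hypothetical length in words
sentlen <- data.frame(
  words      = words,
  syllables  = words * 1.5 + rnorm(100, sd = 2),
  characters = words * 5   + rnorm(100, sd = 6),
  seconds    = words * 0.4 + rnorm(100, sd = 0.6)
)
pca <- princomp(sentlen, cor = TRUE)     # PCA on the correlation matrix
prop_var <- pca$sdev^2 / sum(pca$sdev^2) # variance share per component
round(prop_var, 3)
length_proxy <- pca$scores[, 1]          # first component as single proxy
```

Because the four simulated measures are strongly correlated, the first component should capture most of the variance, so `length_proxy` could stand in for all four columns in a later analysis. Note that the sign of a principal component is arbitrary, so the proxy may correlate negatively with the raw measures.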

# entering raw data and extracting PCs from the correlation matrix
PrincipalComponents <- princomp(surveydata[c("Q01_Outgoing", "Q02_Outgoing",
                   "Q03_Outgoing", "Q04_Outgoing",
                   "Q05_Outgoing")], cor=TRUE)
summary(PrincipalComponents) # print variance accounted for
## Importance of components:
##                        Comp.1  Comp.2  Comp.3  Comp.4  Comp.5
## Standard deviation     2.1399 0.41221 0.33748 0.29870 0.21818
## Proportion of Variance 0.9159 0.03398 0.02278 0.01784 0.00952
## Cumulative Proportion  0.9159 0.94986 0.97264 0.99048 1.00000

The output shows that the first component (Comp.1) explains 91.59 percent of the variance. This means that we only lose 8.41 percent of the variance if we use this component as a proxy for “outgoingness”, i.e. if we use the collapsed component rather than the five individual items.
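
The proportions of variance follow directly from the standard deviations of the components: each squared standard deviation is the variance of that component, and because the PCA is based on the correlation matrix, these variances sum to the number of items (here five). A short sketch of this arithmetic, using the rounded values printed in the output above:

```r
# Standard deviations copied from the summary() output above
sdev <- c(2.1399, 0.41221, 0.33748, 0.29870, 0.21818)
eigenvalues <- sdev^2                    # variance of each component
sum(eigenvalues)                         # close to 5, the number of items
prop <- eigenvalues / sum(eigenvalues)   # proportion of variance
round(prop, 4)
round(cumsum(prop), 4)                   # cumulative proportion
```

The first value is roughly 0.916; small deviations from the printed output in the last digit are due to rounding of the standard deviations.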

loadings(PrincipalComponents) # pc loadings
## Loadings:
##              Comp.1 Comp.2 Comp.3 Comp.4 Comp.5
## Q01_Outgoing  0.448  0.324         0.831       
## Q02_Outgoing  0.453  0.242 -0.408 -0.360  0.663
## Q03_Outgoing  0.446  0.405  0.626 -0.405 -0.286
## Q04_Outgoing  0.452 -0.191 -0.568 -0.114 -0.650
## Q05_Outgoing  0.437 -0.798  0.342         0.230
##                Comp.1 Comp.2 Comp.3 Comp.4 Comp.5
## SS loadings       1.0    1.0    1.0    1.0    1.0
## Proportion Var    0.2    0.2    0.2    0.2    0.2
## Cumulative Var    0.2    0.4    0.6    0.8    1.0

We now check if the five questions that are intended to tap into “outgoingness” represent one (and not more) underlying factors. To check this, we create a scree plot.

plot(PrincipalComponents,type="lines") # scree plot

The scree plot shown above indicates that we only need a single component to explain the variance as there is a steep decline from the first to the second component. This confirms that the questions that tap into “outgoingness” represent one (and not more) underlying factors.

PrincipalComponents$scores # the principal components
##        Comp.1   Comp.2   Comp.3    Comp.4   Comp.5
##  [1,]  1.8382 -0.36615 -0.05472 -0.185983  0.45152
##  [2,]  1.8663  0.49141  0.43588  0.293521 -0.29327
##  [3,]  2.1436 -0.43110 -0.14566  0.529200 -0.37631
##  [4,]  2.4440  0.12865  0.39433  0.093406  0.28596
##  [5,]  2.1362 -0.49210 -0.42957 -0.260901  0.02262
##  [6,]  2.4574  0.52188 -0.20319 -0.014602 -0.29283
##  [7,]  2.1362 -0.49210 -0.42957 -0.260901  0.02262
##  [8,]  1.8506 -0.24517  0.63889 -0.230286 -0.17380
##  [9,]  1.8538  0.37043 -0.25773  0.337824  0.33205
## [10,]  1.8589  0.43041  0.15198 -0.496581  0.10566
## [11,] -2.2853  0.62953  0.07366  0.047246  0.18177
## [12,] -0.8112  0.23229 -0.69790  0.254192 -0.06639
## [13,] -1.1176 -0.31735  0.16386  0.595405  0.08306
## [14,] -2.2853  0.62953  0.07366  0.047246  0.18177
## [15,] -2.2876  0.28617 -0.32086 -0.584569 -0.27755
## [16,] -2.8995 -0.54086  0.11152  0.034152  0.06787
## [17,] -1.7026 -0.01559 -0.07850  0.005418 -0.09725
## [18,] -3.1841 -0.02169 -0.11116  0.001061 -0.08202
## [19,] -1.1125 -0.25737  0.57357 -0.238999 -0.14334
## [20,] -2.8995 -0.54086  0.11152  0.034152  0.06787

We could now replace the five items which tap into “outgoingness” with the single first component shown in the table above.

Ordinal Regression

Ordinal regression is very similar to multiple linear regression but takes an ordinal dependent variable (Agresti 2010). For this reason, ordinal regression is one of the key methods for analyzing Likert data.

To see how an ordinal regression is implemented in R, we load and inspect the “ordinaldata” data set. The data set consists of 400 observations of students that were either educated at this school (Internal = 1) or not (Internal = 0). Some of the students have been abroad (Exchange = 1) while others have not (Exchange = 0). In addition, the data set contains the students’ final score on a test (FinalScore) and the dependent variable, which is the recommendation of a committee for an additional, very prestigious program. The recommendation has three levels (“very likely”, “somewhat likely”, and “unlikely”) and reflects the committee’s assessment of whether the student is likely to succeed in the program.

# load data
ordata <- base::readRDS(url("https://slcladal.github.io/data/oda.rda", "rb"))
# inspect data
str(ordata)
## 'data.frame':    400 obs. of  4 variables:
##  $ Recommend : chr  "very likely" "somewhat likely" "unlikely" "somewhat likely" ...
##  $ Internal  : int  0 1 1 0 0 0 0 0 0 1 ...
##  $ Exchange  : int  0 0 1 0 0 1 0 0 0 0 ...
##  $ FinalScore: num  3.26 3.21 3.94 2.81 2.53 ...

In a first step, we need to re-level the ordinal variable so that it represents an ordinal factor (i.e. a progression from “unlikely” over “somewhat likely” to “very likely”). We will also factorize Internal and Exchange to make it easier to interpret the output later on.
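
Why the re-leveling matters can be seen in a tiny toy example: by default, R sorts factor levels alphabetically, which does not match the intended progression. The vector `rec` is made up for illustration, and `ordered = TRUE` is an optional addition that the tutorial code itself does not use (there, the order of the levels alone defines the scale).

```r
# Toy example: the order of 'levels' defines the ordinal scale.
rec <- c("very likely", "unlikely", "somewhat likely", "unlikely")
# default: levels are sorted alphabetically, which is not the scale we want
levels(factor(rec))
# explicit levels give the intended progression
rec_ord <- factor(rec,
                  levels = c("unlikely", "somewhat likely", "very likely"),
                  ordered = TRUE)
levels(rec_ord)
as.integer(rec_ord)                      # ranks: 3 1 2 1
rec_ord < "very likely"                  # comparisons follow the level order
```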

# relevel data
ordata <- ordata %>%
dplyr::mutate(Recommend = factor(Recommend, 
                           levels=c("unlikely", "somewhat likely", "very likely"),
                           labels=c("unlikely",  "somewhat likely",  "very likely"))) %>%
  dplyr::mutate(Exchange = ifelse(Exchange == 1, "Exchange", "NoExchange")) %>%
  dplyr::mutate(Internal = ifelse(Internal == 1, "Internal", "External"))

Now that the dependent variable is re-leveled, we check the distribution of the variable levels by tabulating the data. To get a better understanding of the data, we create frequency tables across variables rather than viewing the variables in isolation.

## three way cross tabs (xtabs) and flatten the table
ftable(xtabs(~ Exchange + Recommend + Internal, data = ordata))
##                            Internal External Internal
## Exchange   Recommend                                 
## Exchange   unlikely                       25        6
##            somewhat likely                12        4
##            very likely                     7        3
## NoExchange unlikely                      175       14
##            somewhat likely                98       26
##            very likely                    20       10

We also check the mean and standard deviation of the final score, as the final score is a numeric variable and cannot be tabulated (unless we convert it to a factor).

summary(ordata$FinalScore); sd(ordata$FinalScore)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##    1.90    2.72    2.99    3.00    3.27    4.00
## [1] 0.3979

The lowest score is 1.9 and the highest score is 4.0, with a median of about 3. Finally, we investigate the distributions graphically.

# visualize data
ordata %>%
  ggplot(aes(x = Recommend, y = FinalScore)) +
  geom_boxplot(size = .75) +
  geom_jitter(alpha = .5) +
  facet_grid(Exchange ~ Internal, margins = TRUE) +
  theme_bw() +
  theme(axis.text.x = element_text(angle = 45, hjust = 1, vjust = 1))

We see that we have only few students that have taken part in an exchange program and there are also only few internal students overall. With respect to recommendations, only few students are considered to be very likely to succeed in the program. We can now start with the modelling by using the “polr” function. To make things easier for us, we will only consider the main effects here, as this tutorial only aims to show how to implement an ordinal regression but not how it should be done in a proper study. In a proper study, the model fitting and diagnostic procedures would of course have to be performed accurately.

## fit ordered logit model (polr from the MASS package) and store results 'm'
m <- polr(Recommend ~ Internal + Exchange + FinalScore, data = ordata, Hess=TRUE)
## view a summary of the model
summary(m)
## Call:
## polr(formula = Recommend ~ Internal + Exchange + FinalScore, 
##     data = ordata, Hess = TRUE)
## Coefficients:
##                     Value Std. Error t value
## InternalInternal   1.0477      0.266   3.942
## ExchangeNoExchange 0.0587      0.298   0.197
## FinalScore         0.6157      0.261   2.363
## Intercepts:
##                             Value Std. Error t value
## unlikely|somewhat likely    2.262 0.882      2.564  
## somewhat likely|very likely 4.357 0.904      4.818  
## Residual Deviance: 717.02 
## AIC: 727.02

The results show that having studied here at this school increases the chances of receiving a positive recommendation but that having been on an exchange has a negative but insignificant effect on the recommendation. The final score also correlates positively with a positive recommendation but not as much as having studied here.
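
To see how the coefficients and intercepts translate into probabilities, it helps to know that “polr” parameterizes the model as \(\mathrm{logit}\, P(Y \le j) = \zeta_j - \eta\), where \(\zeta_j\) are the intercepts and \(\eta\) is the linear predictor. The following sketch applies this by hand to the values from the summary above; the student profile `x` is a made-up example.

```r
# Coefficients and intercepts copied from the model summary above
beta <- c(InternalInternal = 1.0477, ExchangeNoExchange = 0.0587,
          FinalScore = 0.6157)
zeta <- c("unlikely|somewhat likely" = 2.262,
          "somewhat likely|very likely" = 4.357)

# hypothetical student: internal, no exchange, final score of 3
x <- c(InternalInternal = 1, ExchangeNoExchange = 1, FinalScore = 3)
eta <- sum(beta * x)                     # linear predictor

# cumulative probabilities P(Y <= j) = plogis(zeta_j - eta)
cum <- plogis(zeta - eta)
probs <- c(unlikely = cum[[1]],
           somewhat_likely = cum[[2]] - cum[[1]],
           very_likely = 1 - cum[[2]])
round(probs, 3)
```

The three probabilities sum to one by construction. A larger linear predictor shifts probability mass from “unlikely” towards “very likely”, which is why positive coefficients indicate a higher chance of a positive recommendation.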

## store table
(ctable <- coef(summary(m)))
##                               Value Std. Error t value
## InternalInternal            1.04766     0.2658   3.942
## ExchangeNoExchange          0.05868     0.2979   0.197
## FinalScore                  0.61574     0.2606   2.363
## unlikely|somewhat likely    2.26200     0.8822   2.564
## somewhat likely|very likely 4.35744     0.9045   4.818

As the regression report does not provide p-values, we have to calculate them separately (after having calculated them, we add them to the coefficient table).

## calculate and store p values
p <- pnorm(abs(ctable[, "t value"]), lower.tail = FALSE) * 2
## combined table
(ctable <- cbind(ctable, "p value" = p))
##                               Value Std. Error t value     p value
## InternalInternal            1.04766     0.2658   3.942 0.000080902
## ExchangeNoExchange          0.05868     0.2979   0.197 0.843819939
## FinalScore                  0.61574     0.2606   2.363 0.018151727
## unlikely|somewhat likely    2.26200     0.8822   2.564 0.010343823
## somewhat likely|very likely 4.35744     0.9045   4.818 0.000001452

As predicted, Exchange does not have a significant effect but FinalScore and Internal both correlate significantly with the likelihood of receiving a positive recommendation.

# extract profiled confidence intervals
ci <- confint(m)
# calculate odds ratios and combine them with profiled CIs
exp(cbind(OR = coef(m), ci))
##                       OR 2.5 % 97.5 %
## InternalInternal   2.851 1.696  4.817
## ExchangeNoExchange 1.060 0.595  1.920
## FinalScore         1.851 1.114  3.098

The odds ratios show that the odds of receiving a more positive recommendation are 2.85 times as high (i.e. 185 percent higher) for internal students compared to non-internal students, and that a 1-point increase in the test score multiplies these odds by 1.85 (an increase of 85 percent). The effect of an exchange is slightly negative but, as we have seen above, not significant.
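
An odds ratio is simply the exponentiated coefficient, and subtracting 1 and multiplying by 100 gives the percent change in the odds. A brief sketch of this arithmetic, using the coefficients reported in the coefficient table above:

```r
# Coefficients (log-odds) copied from the coefficient table above
b <- c(InternalInternal = 1.04766, ExchangeNoExchange = 0.05868,
       FinalScore = 0.61574)
or <- exp(b)                             # odds ratios
pct_change <- (or - 1) * 100             # percent change in the odds
round(cbind(OR = or, pct_change), 1)
```

Note that these are changes in the odds, not in the probability itself; an odds ratio of 2.85 does not mean that the probability is 2.85 times as high.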

In a final step, we will visualize the results of the ordinal regression model. To do that, we need to reformat the data and add the predictions.

# extract predicted probabilities for each recommendation level
predictions <- predict(m, newdata = ordata, type = "probs")
# add predictions to the data
newordata <- cbind(ordata, predictions)
# rename columns (remove spaces from the level names)
colnames(newordata)[6:7] <- c("somewhat_likely", "very_likely")
# reformat data from wide to long
newordata <- newordata %>%
  dplyr::select(-Recommend) %>%
  tidyr::gather(Recommendation, Probability, unlikely:very_likely)  %>%
  dplyr::mutate(Recommendation = factor(Recommendation, 
                                        levels = c("unlikely",
                                                   "somewhat_likely",
                                                   "very_likely")))
# inspect the first 10 rows
newordata %>%
  as.data.frame() %>%
  head(10) %>%
  flextable::flextable() %>%
  flextable::set_table_properties(width = .5, layout = "autofit") %>%
  flextable::theme_zebra() %>%
  flextable::fontsize(size = 12) %>%
  flextable::fontsize(size = 12, part = "header") %>%
  flextable::align_text_col(align = "center") %>%
  flextable::set_caption(caption = "First 10 rows of the newordata.")

We can now visualize the predictions of the model.

# plot predicted probabilities by final score
newordata %>%
  ggplot(aes(x = FinalScore, y = Probability,
             color = Recommendation, 
             group = Recommendation)) + 
  facet_grid(Exchange~Internal) +
  geom_smooth() +  
  # define colors
  scale_fill_manual(values = clrs3) +
  scale_color_manual(values = clrs3)

For more information about regression modeling, model fitting, and model diagnostics, please have a look at the tutorial on fixed-effects regressions.

Citation & Session Info

Schweinberger, Martin. 2020. Questionnaires and Surveys: Analyses with R. Brisbane: The University of Queensland. url: https://slcladal.github.io/surveys.html (Version 2020.12.11).

@manual{schweinberger2020survey,
  author = {Schweinberger, Martin},
  title = {Questionnaires and Surveys: Analyses with R},
  note = {https://slcladal.github.io/survey.html},
  year = {2020},
  organization = {The University of Queensland, Australia. School of Languages and Cultures},
  address = {Brisbane},
  edition = {2020/12/11}
}
## R version 4.2.1 (2022-06-23)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Ubuntu 22.04.1 LTS
## Matrix products: default
## BLAS:   /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.10.0
## LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.10.0
## locale:
##  [1] LC_CTYPE=en_AU.UTF-8       LC_NUMERIC=C              
##  [3] LC_TIME=en_AU.UTF-8        LC_COLLATE=en_AU.UTF-8    
##  [5] LC_MONETARY=en_AU.UTF-8    LC_MESSAGES=en_AU.UTF-8   
##  [7] LC_PAPER=en_AU.UTF-8       LC_NAME=C                 
##  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
## attached base packages:
## [1] stats     graphics  grDevices datasets  utils     methods   base     
## other attached packages:
##  [1] ufs_0.5.2         devtools_2.4.4    usethis_2.1.6     flextable_0.7.3  
##  [5] here_1.0.1        viridis_0.6.2     viridisLite_0.4.0 psych_2.2.5      
##  [9] MASS_7.3-58.1     likert_1.3.5      xtable_1.8-4      forcats_0.5.1    
## [13] stringr_1.4.0     dplyr_1.0.9       purrr_0.3.4       readr_2.1.2      
## [17] tidyr_1.2.0       tibble_3.1.7      ggplot2_3.3.6     tidyverse_1.3.2  
## [21] lattice_0.20-45   knitr_1.39       
## loaded via a namespace (and not attached):
##  [1] googledrive_2.0.0    colorspace_2.0-3     ellipsis_0.3.2      
##  [4] rprojroot_2.0.3      base64enc_0.1-3      fs_1.5.2            
##  [7] farver_2.1.1         remotes_2.4.2        fansi_1.0.3         
## [10] lubridate_1.8.0      xml2_1.3.3           splines_4.2.1       
## [13] mnormt_2.1.0         cachem_1.0.6         pkgload_1.3.0       
## [16] jsonlite_1.8.0       broom_1.0.0          dbplyr_2.2.1        
## [19] shiny_1.7.2          compiler_4.2.1       httr_1.4.3          
## [22] backports_1.4.1      Matrix_1.4-1         assertthat_0.2.1    
## [25] fastmap_1.1.0        gargle_1.2.0         cli_3.3.0           
## [28] later_1.3.0          htmltools_0.5.2      prettyunits_1.1.1   
## [31] tools_4.2.1          gtable_0.3.0         glue_1.6.2          
## [34] reshape2_1.4.4       Rcpp_1.0.8.3         cellranger_1.1.0    
## [37] jquerylib_0.1.4      vctrs_0.4.1          nlme_3.1-158        
## [40] xfun_0.31            ps_1.7.1             rvest_1.0.2         
## [43] mime_0.12            miniUI_0.1.1.1       lifecycle_1.0.1     
## [46] renv_0.15.4          googlesheets4_1.0.0  klippy_0.0.0.9500   
## [49] scales_1.2.0         hms_1.1.1            promises_1.2.0.1    
## [52] parallel_4.2.1       yaml_2.3.5           memoise_2.0.1       
## [55] gridExtra_2.3        pander_0.6.5         gdtools_0.2.4       
## [58] sass_0.4.1           stringi_1.7.8        highr_0.9           
## [61] pkgbuild_1.3.1       zip_2.2.0            rlang_1.0.4         
## [64] pkgconfig_2.0.3      systemfonts_1.0.4    evaluate_0.15       
## [67] htmlwidgets_1.5.4    labeling_0.4.2       tidyselect_1.1.2    
## [70] processx_3.7.0       plyr_1.8.7           magrittr_2.0.3      
## [73] R6_2.5.1             generics_0.1.3       profvis_0.3.7       
## [76] DBI_1.1.3            mgcv_1.8-40          pillar_1.7.0        
## [79] haven_2.5.0          withr_2.5.0          modelr_0.1.8        
## [82] crayon_1.5.1         uuid_1.1-0           utf8_1.2.2          
## [85] tzdb_0.3.0           rmarkdown_2.14       officer_0.4.3       
## [88] urlchecker_1.0.1     grid_4.2.1           readxl_1.4.0        
## [91] data.table_1.14.2    callr_3.7.0          reprex_2.0.1        
## [94] digest_0.6.29        GPArotation_2022.4-1 httpuv_1.6.5        
## [97] munsell_0.5.0        bslib_0.3.1          sessioninfo_1.2.2



Agresti, Alan. 2010. Analysis of Ordinal Categorical Data. Vol. 656. John Wiley & Sons.

Brown, James Dean. 2001. Using Surveys in Language Programs. Cambridge: Cambridge University Press.

Cronbach, Lee J. 1951. “Coefficient Alpha and the Internal Structure of Tests.” Psychometrika 16: 297–334.

Guttman, Louis. 1945. “A Basis for Analyzing Test-Retest Reliability.” Psychometrika 10 (4): 255–82.

Peters, Gjalt-Jorn. 2014. “The Alpha and the Omega of Scale Reliability and Validity: Why and How to Abandon Cronbach’s Alpha and the Route Towards More Comprehensive Assessment of Scale Quality.” The European Health Psychologist 16 (2): 54–67.

Revelle, W., and Richard E. Zinbarg. 2009. “Coefficients Alpha, Beta, Omega and the Glb: Comments on Sijtsma.” Psychometrika 74 (1): 145–54.

Zinbarg, Richard E, William Revelle, Iftah Yovel, and Wen Li. 2005. “Cronbach’s \(\alpha\), Revelle’s \(\beta\), and McDonald’s \(\omega_H\): Their Relations with Each Other and Two Alternative Conceptualizations of Reliability.” Psychometrika 70 (1): 123–33.

Zinbarg, R, I Yovel, W Revelle, and R McDonald. 2006. “Estimating Generalizability to a Universe of Indicators That All Have One Attribute in Common: A Comparison of Estimators for Omega.” Applied Psychological Measurement 30 (2): 121–44.