Calculating compound factors

The bulk of the Sociolinguistic Questionnaire (SQ) consists of ordinal (Likert-scale) questions on language use and on linguistic and cultural attitudes. Since the questionnaire contains a large number of these variables, it is desirable to reduce them to a smaller set of factors by calculating, for each individual, the average value over a set of variables. For example, there are many questions on how often each person uses their first language, and it is desirable to condense the responses into as few variables as possible. However, it is often not easy to see which variables should be grouped together to form such factors.
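The averaging step itself is simple and can be sketched in a few lines of Python. This is only an illustration of the idea, not the SPSS procedure described further down: the variable names and values are invented, and skipping unanswered items is an assumption of this sketch. It also assumes that all items are coded in the same direction.

```python
from statistics import mean


def compound_score(responses, items):
    """Average one participant's answers over the given items.

    Items without an answer (None or absent) are skipped; if no item
    was answered, the compound score is None.
    """
    answered = [responses[item] for item in items if responses.get(item) is not None]
    return mean(answered) if answered else None


# Hypothetical participant; variable names and values are invented for illustration.
participant = {"l1_partner": 4, "l1_children": 5, "l1_friends": None, "l1_tv": 2}

# Hypothetical grouping of items into one factor.
bilingual_items = ["l1_partner", "l1_children", "l1_friends"]

print(compound_score(participant, bilingual_items))
```

Whether a participant with many unanswered items should receive a compound score at all is a decision you need to make for your own data.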

In the textbook (p. 214f.) three possibilities are discussed: basing the classification on theoretical considerations (as is done in e.g. Schmid 2007), taking a data-mining approach by means of e.g. a factor analysis, or replicating an approach that has been shown to be valid in a previous investigation.

Such an investigation was conducted by Schmid & Dusseldorp (2010). Based on a principal component analysis, we established several groups of factors with high internal validity; that is, we showed that participants tended to respond in the same direction on the items within each group. The factors fell into the following categories:

  • bilingual mode L1 use (with partner, children and friends)
  • intermediate mode L1 use (language for religious purposes, L1 use in language clubs)
  • monolingual mode language use (for professional purposes, with speakers in country of origin)
  • exposure to target-like L1 (non-interactive L1 use through TV or reading, communication with speakers in the home country or visits to that country)
  • perceived importance of intergenerational maintenance
  • linguistic and cultural affiliation (L1 use in inner speech, such as when dreaming or thinking, linguistic and cultural preferences)

These compound variables have since been applied in a number of other investigations and have been found to be valid in other settings. In other words, if you classify the data that you elicit by means of a sociolinguistic background questionnaire into these groups and calculate averages across the factors, you will probably achieve a valid set of predictors. You can test this by checking the internal reliability of each scale, by calculating what is known as Cronbach's alpha.
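To make clear what Cronbach's alpha measures, here is a minimal Python sketch of the standard formula: alpha = k/(k-1) × (1 − sum of the item variances / variance of the participants' total scores), where k is the number of items. The data are invented, and the sketch assumes complete responses with all items coded in the same direction. It is not the SPSS procedure itself.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of participants' item scores.

    rows: one list per participant, each with the same number of items
    and no missing values.
    """
    k = len(rows[0])
    if k < 2 or len(rows) < 2:
        raise ValueError("need at least two items and two participants")
    # Sample variance of each item across participants.
    item_variances = [variance([row[i] for row in rows]) for i in range(k)]
    # Sample variance of each participant's summed score.
    total_variance = variance([sum(row) for row in rows])
    if total_variance == 0:
        raise ValueError("total scores do not vary; alpha is undefined")
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Invented responses: four participants, three items on a 1-5 scale.
sample = [
    [4, 5, 4],
    [2, 2, 3],
    [5, 4, 5],
    [1, 2, 1],
]

print(round(cronbach_alpha(sample), 3))
```

The more consistently participants respond across the items of a scale, the closer alpha comes to 1.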

If you use the matrix for data entry and the instructions for coding supplied in the section ‘Coding the sociolinguistic data’, you can calculate these factors and check their internal reliability by means of the following steps:

A syntax file is a file that contains the commands for a number of SPSS operations. You can open it in SPSS via File > Open > Syntax. You can then select the text of an individual command and click the ‘Run’ button (the green triangle in the icon bar). This carries out the command in the same way as if you had entered it through the menus. When you perform any operation in SPSS, you can add the syntax for that task to a syntax file by clicking the ‘Paste’ button.

  • Open the data file containing the data from your Sociolinguistic Questionnaire, and open the syntax file Compoundfactors.sps.
  • The syntax file Compoundfactors.sps contains a number of commands, each of which begins with ‘Compute’ and ends with ‘Execute’. After each of these commands, and preceding the last four, you will find comments, which are preceded by three asterisks and are ignored by SPSS. These explain what each command does: which of the variables from the dataset will be averaged and what the new variable will be called.
  • Once you have calculated all the factors, you should perform a Reliability Analysis to see whether the items you included in each calculation really do all correlate with each other. The syntax for these analyses, together with an explanation, is also included in the Compoundfactors.sps file.
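One way to see whether an individual item fits its scale is to correlate it with the sum of the remaining items (often called the corrected item-total correlation). The Python sketch below illustrates this idea on invented data. It assumes complete responses, and it is not a reproduction of the contents of Compoundfactors.sps.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equally long lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    covariation = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined when one variable is constant")
    return covariation / (spread_x * spread_y)


def item_rest_correlations(rows):
    """Correlate each item with the sum of all other items.

    rows: one list of item scores per participant, no missing values.
    """
    k = len(rows[0])
    correlations = []
    for i in range(k):
        item = [row[i] for row in rows]
        rest = [sum(row) - row[i] for row in rows]
        correlations.append(pearson(item, rest))
    return correlations


# Invented responses: the fourth item runs against the other three.
sample_items = [
    [1, 1, 1, 3],
    [2, 2, 2, 2],
    [3, 3, 3, 1],
]

for index, r in enumerate(item_rest_correlations(sample_items), start=1):
    print(f"item {index}: r = {r:.2f}")
```

An item with a low or negative value here may have been coded in the opposite direction, or may not belong to the factor at all.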