Coding data elicited with the Sociolinguistic Questionnaire

The sociolinguistic and personal background data collected by means of a version of the Sociolinguistic Questionnaire (SQ) you can find on this site, covering factors such as age at migration, length of residence, amount of L1 use in daily life, and attitudes towards the L1/L2 and the associated cultures, need to be treated and coded with a great amount of thought and care. The questionnaire contains a large number of variables, far too many to include individually in any analysis. It is therefore necessary to explore to what extent the answers to certain questions may fall into clusters, and to calculate compound, average factors. You may, for example, calculate the average amount of use a person makes of their L1 in informal situations and in the bilingual mode, that is, with family and friends who are themselves bilinguals, so that code-switching is possible and the L2 does not have to be inhibited.

The SQ contains a number of questions pertaining to such situations, such as how often the L1 is spoken to the partner, how often the partner speaks the L1 to the participant, how often the participant uses the L1 to address his/her children, how often the children use the L1 to the participant, etc. Before we can begin to investigate which items from the questionnaire may fall together into such clusters, however, we have to consider three questions:

  • what types of data (nominal, ordinal, interval; see Chapter 16 of the book) are the different variables?
  • are all factors coded on the same scale (do they have the same maximum and minimum)?
  • are missing values (see p. 216 of the book) coded correctly?

I have therefore created an SPSS sheet for data entry. You can save this onto your local drive and use it to enter your data. If you click on the tab ‘Variable View’ at the bottom, you can see which of the questions from the SQ/AMTB each variable pertains to, how to code the different answers, and what type of data they are. This information can also be found in the Coding Book.

You can see that for each of the ordinal/Likert Scale variables, the maximum is 1 and the minimum is 0, where 1 is the value that would theoretically be assumed to protect the speaker against attrition (high use, positive attitude etc.), and 0 is the one that would be assumed to render him/her vulnerable (low use, negative attitude).
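If your raw answers were recorded on a numbered scale rather than directly between 0 and 1, they can be rescaled before entry. The following Python sketch is only an illustration: it assumes a five-point item numbered 1 to 5, and the function name `rescale_likert` and the `reverse` option are hypothetical, not part of the SPSS sheet or the Coding Book. Use `reverse=True` for items where the lowest raw answer is the one assumed to protect against attrition.

```python
def rescale_likert(raw, low=1, high=5, reverse=False):
    """Map a raw Likert answer in [low, high] onto the 0-1 range.

    Assumption for illustration: answers are whole numbers from low to high.
    With reverse=True the scale is flipped, so the lowest raw answer becomes 1.
    """
    if not low <= raw <= high:
        raise ValueError(f"answer {raw} is outside the scale {low}-{high}")
    score = (raw - low) / (high - low)
    return 1 - score if reverse else score


# A five-point item: 1 -> 0.0, 2 -> 0.25, 3 -> 0.5, 4 -> 0.75, 5 -> 1.0
print([rescale_likert(answer) for answer in range(1, 6)])

# The same item with the direction flipped: a raw answer of 2 becomes 0.75
print(rescale_likert(2, reverse=True))
```

Rescaling every item this way gives all ordinal variables the same minimum and maximum, so they can later be averaged together.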

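All missing values are coded as 999. If a speaker has not answered a particular question, e.g. because they do not have children or were never married, please do not enter 0 but 999. A 0 would be read as a real answer (low use, negative attitude) and would pull down any average calculated from that variable.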
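To see why the missing-value code matters once you start averaging items, consider the minimal Python sketch below. It is an illustration under stated assumptions, not the procedure used in SPSS: the function name `compound_mean` and the four example items are hypothetical, and the values are assumed to be coded between 0 and 1 already.

```python
MISSING = 999  # missing-value code used in the data entry sheet


def compound_mean(answers, missing=MISSING):
    """Average the answered items only; return None if every item is missing."""
    valid = [a for a in answers if a != missing]
    if not valid:
        return None
    return sum(valid) / len(valid)


# Hypothetical items for a speaker without children:
# L1 to partner, partner's L1 to speaker, L1 to children, children's L1 to speaker
speaker = [0.75, 0.5, 999, 999]
print(compound_mean(speaker))  # 0.625, based on the two answered items

# If the unanswered items had been entered as 0, the average would be distorted
print(compound_mean([0.75, 0.5, 0, 0]))  # 0.3125
```

The speaker's average is calculated from the two questions they actually answered, whereas entering 0 for the unanswered ones would halve it.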

In the data view of SPSS, you can see that I included four random attriters and four random controls from my own investigation, and assigned each the name of a famous tennis player. These are there to illustrate how to enter the data; please delete them before you begin entering your own.

Once you have coded your data, you can use the tools and tips under Calculating compound factors.