Jeromy Anglim's Blog: Psychology and Statistics


Thursday, May 20, 2010

Fitting Nonlinear Regression Models to Multiple Participants Using SPSS

This post briefly discusses how to run a nonlinear regression in SPSS. Specifically, it covers the scenario where you have a set of k observations for each of n participants, and where your aim is to fit a nonlinear function to each participant's data and save the parameter estimates for subsequent analysis. This is a relatively common task in psychology: multiple participants are measured at several levels of a numeric repeated measures variable, and you want to see how a dependent variable is related to that variable, separately for each participant. For example, you might be modelling performance as a function of practice or accuracy as a function of stimulus intensity.

A note on R

Personally, I would use R for this task. This is partly because I like R. But also the task plays to R's strengths. First, R makes it easy to run a set of models. Second, R makes it easy to extract information and do further processing. For an example of nonlinear regression using R, see this tutorial on APSnet. For further information about fitting a set of nonlinear functions, see the nlsList function in the nlme package. However, this post flows from a consulting session where the researchers were familiar with SPSS.

Overview of SPSS Nonlinear Regression Procedure

The following outlines the procedure.
  1. Set up the data frame in long format
  2. Split File by ID
  3. Choose a function to relate the DV to the IV
  4. Estimate the function using Analyze - Regression - Nonlinear
  5. Extract information and Interpret the results

1. Set up data frame

Arrange the data into long format (see UCLA for more information on converting between long and wide format). Each row is one observation for one participant. In a standard case, the data frame would have three variables:
  • id: the variable that uniquely identifies a participant
  • x: the predictor variable
  • y: the response variable
Assuming k observations on each of n participants, you would have (k * n) rows in your data frame. Of course, k could vary between participants, which in general is not a problem.
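For readers who want to see the long-format layout concretely, the following is a minimal sketch in Python rather than SPSS. The variable names (wide, x_levels, long_rows) and all data values are made up for illustration; None marks a missing observation, so k differs between the two participants.

```python
# Hypothetical wide-format data: one entry per participant,
# one y value per level of x (None = not observed).
x_levels = [10, 20, 30]
wide = {
    1: [0.42, 0.61, 0.77],
    2: [0.35, 0.58, None],  # participant 2 has only two observations
}

# Long format: one row per observation, with id, x and y columns.
long_rows = [
    {"id": pid, "x": x, "y": y}
    for pid, ys in wide.items()
    for x, y in zip(x_levels, ys)
    if y is not None
]

for row in long_rows:
    print(row)
```

The resulting list has one row per observed (participant, x) pair, which is the same shape the SPSS data file needs.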

2. Apply Split File

If you want to fit a model separately for each participant, it is helpful to activate Split File in SPSS (Data - Split File). For details on how to do this, see UCLA's tutorial. Split the file by the participant ID variable; the file must be sorted by that variable, which the Split File dialog can do for you. Any analysis run subsequently is then carried out separately for each participant, with the output arranged so that each participant's results appear in their own rows of the table.
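As a rough analogy outside SPSS, the sort-then-split logic can be sketched in Python. This is only an illustration of the idea, not what SPSS does internally; the rows and the names by_participant and n_obs are hypothetical. Like Split File, itertools.groupby only groups adjacent rows, so the data are sorted by id first.

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical long-format rows, deliberately not sorted by id.
rows = [
    {"id": 2, "x": 10, "y": 0.35},
    {"id": 1, "x": 10, "y": 0.42},
    {"id": 1, "x": 20, "y": 0.61},
    {"id": 2, "x": 20, "y": 0.58},
]

# Sort by id first: groupby, like Split File, expects grouped rows to be adjacent.
rows.sort(key=itemgetter("id"))

# One list of rows per participant, ready for a per-participant analysis.
by_participant = {
    pid: list(group) for pid, group in groupby(rows, key=itemgetter("id"))
}
n_obs = {pid: len(group) for pid, group in by_participant.items()}
print(n_obs)
```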

3. Choose a Function

It is generally assumed that theory and domain-specific knowledge will guide selection of an appropriate function. However, inspecting the raw data also has a role, both to check the appropriateness of the chosen function and to explore modifications. For example, I've often fit three-parameter power and exponential functions to learning curves relating practice to task completion time. In a psychophysics example, researchers wanted to fit a Gaussian function. For further discussion of nonlinear functions, see:
  • Chapter 3 of Benjamin Bolker's Ecological Models and Data in R, available online.
  • Bates and Watts (1988) Nonlinear regression analysis and its applications
  • Ritz and Streibig (2008) Nonlinear regression with R
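To make the candidate functions concrete, here is a small Python sketch. The parameterizations of the power and exponential functions (asymptote a, gain b, rate c) are one common choice and are an assumption here, since the post does not spell them out; the Gaussian follows the form used in the SPSS syntax in the next section, with its peak fixed at 1.

```python
import math

def power3(x, a, b, c):
    """Three-parameter power function: a + b * x^(-c). Assumes x > 0."""
    return a + b * x ** (-c)

def exponential3(x, a, b, c):
    """Three-parameter exponential function: a + b * exp(-c * x)."""
    return a + b * math.exp(-c * x)

def gaussian(x, mu, sigma):
    """Gaussian with peak height 1, centred on mu with spread sigma."""
    return math.exp(-((x - mu) ** 2) / (2 * sigma ** 2))

# Illustrative values only: predicted completion time on trial 1,
# and the Gaussian evaluated at its centre.
first_trial_power = power3(1, a=2.0, b=10.0, c=0.5)
start_exponential = exponential3(0, a=2.0, b=10.0, c=0.5)
peak = gaussian(0, mu=0, sigma=50)
print(first_trial_power, start_exponential, peak)
```

With these forms, a is the level that performance approaches with extended practice, and a + b is the predicted value at the start (x = 1 for the power function, x = 0 for the exponential).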

4. Fit Nonlinear Model

SPSS has the nonlinear regression tool (Analyze - Regression - Nonlinear). Below is an example of syntax.
* Nonlinear regression: fit a Gaussian to prob as a function of soa.
MODEL PROGRAM  mu=0 sigma=50.
COMPUTE  PRED_=exp(-((soa-mu) ** 2)/(2*sigma**2)).
NLR prob
  /PRED PRED_
  /SAVE PRED RESID
  /CRITERIA SSCONVERGENCE 1E-8 PCON 1E-8.
  • soa was the name of the independent variable.
  • prob was the name of the dependent variable.
  • mu and sigma were names given to the parameters of the function.
  • exp refers to the exponential function e^x.
  • ** is the notation for an exponent.
  • Parameters need starting values, which in this case were 0 for mu and 50 for sigma.
The program allows you to save predicted values and residuals. Saving residuals is useful for examining model assumptions. Saving predicted values is useful for checking how closely the predicted values track the observed values.
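To show what the fitting step is doing, the following Python sketch fits the same Gaussian to each of two simulated participants, using the same starting values (mu = 0, sigma = 50). It is a hand-rolled damped Gauss-Newton loop written for illustration, not SPSS's own algorithm, and the data are noise-free values generated from made-up "true" parameters. All names (fit_gaussian, soa_levels, true_params) are hypothetical.

```python
import math

def gaussian(x, mu, sigma):
    # Same form as the COMPUTE line above: peak height fixed at 1.
    return math.exp(-((x - mu) ** 2) / (2 * sigma ** 2))

def sse(xs, ys, mu, sigma):
    # Sum of squared residuals for the current parameter values.
    return sum((y - gaussian(x, mu, sigma)) ** 2 for x, y in zip(xs, ys))

def fit_gaussian(xs, ys, mu=0.0, sigma=50.0, max_iter=200):
    """Least-squares fit of mu and sigma by damped Gauss-Newton steps."""
    lam = 1e-3  # damping factor: raised after a bad step, lowered after a good one
    best = sse(xs, ys, mu, sigma)
    for _ in range(max_iter):
        a11 = a12 = a22 = g1 = g2 = 0.0
        for x, y in zip(xs, ys):
            f = gaussian(x, mu, sigma)
            d_mu = f * (x - mu) / sigma ** 2          # derivative w.r.t. mu
            d_sigma = f * (x - mu) ** 2 / sigma ** 3  # derivative w.r.t. sigma
            r = y - f
            a11 += d_mu * d_mu
            a12 += d_mu * d_sigma
            a22 += d_sigma * d_sigma
            g1 += d_mu * r
            g2 += d_sigma * r
        # Solve the damped 2x2 normal equations for the parameter step.
        b11, b22 = a11 * (1 + lam), a22 * (1 + lam)
        det = b11 * b22 - a12 * a12
        if det <= 0:
            break
        new_mu = mu + (b22 * g1 - a12 * g2) / det
        new_sigma = sigma + (b11 * g2 - a12 * g1) / det
        trial = sse(xs, ys, new_mu, new_sigma) if new_sigma > 0 else math.inf
        if trial < best:  # accept the step only if the fit improves
            mu, sigma, best = new_mu, new_sigma, trial
            lam = max(lam / 10, 1e-10)
        else:
            lam *= 10
            if lam > 1e10:  # no further improvement possible
                break
    return mu, sigma, best

# Simulated, noise-free data for two hypothetical participants.
soa_levels = list(range(-100, 101, 20))
true_params = {1: (20.0, 40.0), 2: (-10.0, 60.0)}
data = {
    pid: [gaussian(x, m, s) for x in soa_levels]
    for pid, (m, s) in true_params.items()
}

# One fit per participant, as Split File would arrange in SPSS.
estimates = {pid: fit_gaussian(soa_levels, probs) for pid, probs in data.items()}
for pid, (mu_hat, sigma_hat, _) in estimates.items():
    print(pid, round(mu_hat, 3), round(sigma_hat, 3))
```

With real, noisy data the estimates would not reproduce the generating values exactly, and poor starting values can stop the loop from converging, which is why sensible starting values matter in SPSS as well.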

For further information on running nonlinear regression in SPSS see one of the following links:

5. Extract Information and Interpret the Results

Extract Parameter Estimates
Assuming Split File is on, you will get a table of parameter estimates, with one set for each participant. Double clicking on the table activates Pivot Trays (for more info on Pivot Trays, see UNSW's tutorial). This can be used to arrange the table into a format that makes exporting the data easier. In most psychological applications, these estimates are used in subsequent analyses. For example, you might want to report the mean and standard deviation of parameter estimates over participants or you might want to see whether parameter estimates are related to some other variable.
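As an illustration of the kind of follow-up analysis described here, the sketch below summarises exported estimates in Python. The four sigma and mu values are invented for the example and do not come from any real data set.

```python
import statistics

# Hypothetical exported estimates: one value per participant.
sigma_estimates = [38.0, 45.0, 52.0, 41.0]
mu_estimates = [4.0, -2.0, 10.0, 0.0]

# Mean and (sample) standard deviation over participants.
sigma_mean = statistics.mean(sigma_estimates)
sigma_sd = statistics.stdev(sigma_estimates)
mu_mean = statistics.mean(mu_estimates)
mu_sd = statistics.stdev(mu_estimates)

print(f"sigma: M = {sigma_mean:.2f}, SD = {sigma_sd:.2f}")
print(f"mu:    M = {mu_mean:.2f}, SD = {mu_sd:.2f}")
```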
Extracting R-squared
You can get R-squared for each participant by running a regression predicting the dependent variable from the saved predicted values (Analyze - Regression - Linear). If the Split File is still in effect you will get an R-squared for each individual. Note that the Adjusted R-squared and the significance levels of the R-squared are not necessarily accurate.
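The reason this works is that, with a single predictor, the R-squared from a linear regression equals the squared Pearson correlation between the dependent variable and the predictor, here the saved predicted values. The Python sketch below checks that equivalence on made-up observed and predicted values for one participant; the function names are hypothetical.

```python
import math

def mean(values):
    return sum(values) / len(values)

def pearson_r(a, b):
    """Pearson correlation between two equal-length lists."""
    ma, mb = mean(a), mean(b)
    sab = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    saa = sum((x - ma) ** 2 for x in a)
    sbb = sum((y - mb) ** 2 for y in b)
    return sab / math.sqrt(saa * sbb)

def regression_r_squared(y, pred):
    """R-squared from an ordinary least-squares regression of y on pred."""
    my, mp = mean(y), mean(pred)
    slope = sum((p - mp) * (v - my) for p, v in zip(pred, y)) / sum(
        (p - mp) ** 2 for p in pred
    )
    intercept = my - slope * mp
    ss_res = sum((v - (intercept + slope * p)) ** 2 for p, v in zip(pred, y))
    ss_tot = sum((v - my) ** 2 for v in y)
    return 1 - ss_res / ss_tot

# Hypothetical observed values and saved predicted values for one participant.
observed = [0.20, 0.50, 0.90, 0.60, 0.10]
predicted = [0.25, 0.55, 0.85, 0.50, 0.15]

r_squared_from_r = pearson_r(observed, predicted) ** 2
r_squared_from_regression = regression_r_squared(observed, predicted)
print(round(r_squared_from_r, 4), round(r_squared_from_regression, 4))
```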
Examining Residuals
A histogram of the residuals can be used to assess whether the residuals are normally distributed. A scatterplot with the independent variable on the x-axis and the residual on the y-axis is useful for assessing heteroscedasticity. See a good book on nonlinear regression for further discussion of assumption testing. E.g.,
  • Bates and Watts (1988) Nonlinear regression analysis and its applications
  • Ritz and Streibig (2008) Nonlinear regression with R
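If the saved residuals are exported, a quick text histogram gives a first look at their distribution before producing proper plots. The sketch below is in Python with invented residual values and an arbitrary bin width of 0.05; it is a rough screening aid, not a substitute for the histogram and scatterplot described above.

```python
import math
from collections import Counter

# Hypothetical saved residuals for one participant.
residuals = [-0.08, -0.03, -0.01, 0.00, 0.02, 0.02, 0.04, 0.07]
bin_width = 0.05

# Assign each residual to a bin index: bin b covers [b * width, (b + 1) * width).
bins = Counter(math.floor(r / bin_width) for r in residuals)

# Print one row of '#' marks per bin, from most negative to most positive.
for b in sorted(bins):
    lower = b * bin_width
    print(f"{lower:+.2f} to {lower + bin_width:+.2f}: {'#' * bins[b]}")
```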