How can I analyse the British Election Study data on my computer if I don’t have an SPSS or Stata license?

The British Election Study Team

14/05/2014

Our data releases are currently in SPSS and Stata format. We will also be releasing an online analysis platform that will allow easy access to the data.

If you prefer to work with the data directly but don’t have a copy of SPSS or Stata, the best option is to use R. R is a powerful open-source statistical language that is widely used in academia and industry.

You can download R here and the RStudio software (which has a great interface) here.

Once you have both of these installed, you can open RStudio and start using the software.

There are a lot of R resources available online. One place you can start is with this seminar introduction. We will also be posting more tutorials for doing various analyses on the British Election Study website.

If you already know a bit of R, you can run the following code to read the Stata file into R:

install.packages("foreign") # you only need to run this line the first time

library(foreign)

setwd("c:/bes data") # replace with the directory where you have your data
bes <- read.dta("BES2015_W1_v1.0.dta", convert.factors = NA)

At this point the data have been read in. You can look at the first six rows by running:

head(bes)

Finally, don’t forget to use the survey weights when running analyses in R (or in any package, including SPSS and Stata).
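
As a rough illustration of what using the weights means in practice, here is a minimal sketch in base R. It runs on a small made-up data frame rather than the BES file, and the column names (vote, age, weight) are hypothetical: check the documentation for your data release for the actual name of the weight variable.

```r
# Toy data standing in for a few respondents.
# All names and values here are hypothetical, not taken from the BES file.
toy <- data.frame(
  vote   = c("Labour", "Conservative", "Labour", "Other"),
  age    = c(30, 50, 40, 60),
  weight = c(0.5, 1.5, 1.0, 1.0)
)

# Unweighted versus weighted mean of a numeric variable.
unweighted_mean_age <- mean(toy$age)
weighted_mean_age   <- weighted.mean(toy$age, w = toy$weight)

# Weighted vote shares: sum the weights within each category,
# then divide by the total weight.
weighted_counts <- xtabs(weight ~ vote, data = toy)
weighted_shares <- prop.table(weighted_counts)

print(unweighted_mean_age)
print(weighted_mean_age)
print(weighted_shares)
```

This only covers weighted point estimates. For standard errors and model fitting that respect the weights, a dedicated package is usually a better choice; the survey package is one commonly used option.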

UPDATE: the convert.factors = NA option in the read.dta command causes R to read in some scales incorrectly (dropping all values apart from the endpoints). Changing the option to convert.factors = FALSE reads these scales in correctly, but drops the value labels from every other variable. Depending on the variables you are using, you may therefore need to read in the dataset twice, once with convert.factors = NA and once with convert.factors = FALSE, and then combine the variables you need from each read. We’re working on a fix that will avoid this problem in the next data release.
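
One way to combine the two reads is sketched below. To keep the example runnable without the data file, it uses two small made-up data frames in place of the two read.dta results; the variable names (id, vote, trust) and the scale_vars list are hypothetical, and you would need to decide for yourself which variables in the real file are affected.

```r
# Toy stand-ins for the two reads of the same file (hypothetical variables).
# With the real data these would come from something like:
#   bes_labelled <- read.dta("BES2015_W1_v1.0.dta", convert.factors = NA)
#   bes_numeric  <- read.dta("BES2015_W1_v1.0.dta", convert.factors = FALSE)
bes_labelled <- data.frame(
  id    = 1:4,
  vote  = factor(c("Labour", "Conservative", "Labour", "Other")),
  # Only the endpoints carry labels, so the middle values were lost.
  trust = factor(c("No trust", NA, NA, "A great deal of trust"))
)
bes_numeric <- data.frame(
  id    = 1:4,
  vote  = c(2L, 1L, 2L, 3L),
  trust = c(1L, 3L, 5L, 7L)
)

# Variables that should be taken from the numeric read (an assumption:
# list the scales that were read in incorrectly in your own analysis).
scale_vars <- c("trust")

# Start from the labelled read, then overwrite the affected scales.
# This relies on both reads having the same rows in the same order,
# which holds when they come from the same file.
bes <- bes_labelled
bes[scale_vars] <- bes_numeric[scale_vars]

str(bes)
```

The result keeps the value labels for variables such as vote while storing the affected scales as plain numbers.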

Image credit: Nic McPhee / Flickr, under a Creative Commons license