If only the WLS people had Digitized their Codebooks

After once again uploading the Wisconsin Longitudinal Study datasets and related documentation, I looked it over, pleased that it had been updated in the years since I last dealt with it in 2002. But as then, I found the usual stumbling block. They had no digitized codebook to offer for the main study, only for a few minor ones. This is such a problem, and one reason why we will have to do a lot of work by hand, something we cannot continue to do if we want advanced social technology. I am trying to remember all the other datasets I downloaded and did some work on over the years. I am sure I do remember one with a somewhat useful digitized codebook. What was that? Well, I’ll look for it. But if only the WLS had made it easy … Of course codebooks would be nice, but actual questionnaires would be better. I seem to remember some election study which had them. It’s just on the tip of my tongue. Probably archived on one of my CDs. I hope. Anyway, maybe one of these days the right people will realize the importance of digitizing everything, so that using the data can be fully automated. — dpw


3 Responses to If only the WLS people had Digitized their Codebooks

  1. What is it about the complete WLS codebooks that is not “digitized”?

    I do not understand the complaint that [The WLS] “had no digitized codebook to offer for the main study, only for a few minor ones.” They are available on-line in PDF and in HTML. Look at http://www.ssc.wisc.edu/wlsresearch/documentation/waves/, and follow the links on that page. Downloading the data file into SAS, SPSS, or Stata will provide a codebook as well.

    Additional queries about the WLS data and documentation may be addressed to wls@ssc.wisc.edu.

    WLS project staff are also interested to learn more about what you are trying to do with the data and documentation.

    Principal Investigator
    Wisconsin Longitudinal Study

    • dpw says:

      Professor Hauser, sir,

      My apologies for using the wrong word in my post about using the
      WLS data. I do have a problem with using your codebook data, but
      that was a wrong way to put it. What I meant was that the PDF
      and HTML files are not in an easily machine readable format, and there
      was not much I could do with them but print them out. In the post
      you quoted from I expressed my frustration too bluntly and carelessly,
      ignoring the existence of large statistical packages that would
      give me access to the codebook data. I couldn’t afford to spend
      a lot of money on a package whose only purpose would be
      to translate the codebook data into the kind of simple digital
      format I could use directly.

      I did indeed mention Stata in the immediately previous post. I would
      much rather you (or whoever mentioned my post to you) had read
      that previous post, which began:

      “In case anyone else wants to play about with the same data
      I’ll be using to prototype with, the site to go to is … home of the
      famous Wisconsin Longitudinal Study, the WLS. Their data is
      wonderful stuff, believe me. I don’t like the way social surveys are
      done, but I think this is the best of a poor lot, at least. Very good
      for our purposes. I have downloaded both the Comma Separated Value,
      CSV, format data and the Stata data, hoping that the R documentation
      is correct in saying that R can import Stata data files. I don’t actually
      want to use R, though it is a great package, not to mention free,
      but from R it is easy enough to export it in a format that I can read and
      manipulate using Python. I am more concerned about the variable
      information in the Stata files than the actual user response data,
      which I could easily read using the CSV files.”

      I wrote that the data I wanted was not available in digitized form
      simply because the codebooks in PDF and HTML might as well be paper,
      as there is very little I could do with them but print them. I should have
      spelled that out.

      As for the data in SAS, SPSS and Stata, that is another matter.
      As the earlier post quoted above shows, I said that I hoped I could extract
      variable information (by which I meant codebook data) from the Stata files,
      if indeed R would import them, read them, then export the codebook
      information into a format I could read with an ordinary programming
      language like Python. Even that now seems futile, because the R
      I used to use will not run on the only machine I have now, a 64-bit one.

      My real problem is trying to start a project which has no budget for
      big statistical packages like SAS, SPSS or Stata. This would
      have been so simple to do if you had made codebook data available
      in something as simple as the CSV you provide for the tabular
      data itself. As it stands, I find myself sitting here with a
      powerful computer and a good programming language which
      runs on it, but no way of accessing your codebook data.

      That was the source of the rather severe frustration which
      led me to say the wrong thing about your codebooks.

      Thank you for responding so quickly to my comments, and for
      suggesting that I contact WLS project staff. I promise to
      correct all of my misrepresentations in today’s blog post.

      — dpw
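
To make concrete what a codebook "in something as simple as the CSV" might look like, here is a minimal sketch in Python using only the standard library. The column layout (`variable`, `label`, `code`, `meaning`), the variable names, and the sample rows are all hypothetical, invented for illustration; they are not taken from the WLS documentation or any file the WLS distributes.

```python
import csv
import io

# Hypothetical codebook layout: one row per (variable, code) pair.
# Variable names, labels, and codes are invented for illustration only.
SAMPLE_CODEBOOK = """\
variable,label,code,meaning
var001,Example yes/no item,1,Yes
var001,Example yes/no item,2,No
var002,Example rating item,1,Low
var002,Example rating item,2,Medium
var002,Example rating item,3,High
"""


def load_codebook(text_stream):
    """Return {variable: {"label": str, "codes": {code: meaning}}}."""
    codebook = {}
    for row in csv.DictReader(text_stream):
        # Create the entry for a variable the first time it is seen.
        entry = codebook.setdefault(
            row["variable"], {"label": row["label"], "codes": {}}
        )
        entry["codes"][row["code"]] = row["meaning"]
    return codebook


def decode(codebook, variable, raw_value):
    """Translate a raw data value into its listed meaning.

    Values with no codebook entry are returned unchanged.
    """
    return codebook[variable]["codes"].get(raw_value, raw_value)


codebook = load_codebook(io.StringIO(SAMPLE_CODEBOOK))
print(codebook["var001"]["label"])
print(decode(codebook, "var002", "3"))
```

With a real file, `io.StringIO(SAMPLE_CODEBOOK)` would be replaced by an opened file object. Codes are kept as strings because that is how `csv` reads them, which also matches how values arrive when the tabular data is read from CSV.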

  2. All questionnaires and flowcharts (for CATI instruments) are also available on-line at the WLS website, and the on-line codebooks cross-reference variables to their sources in the instruments and to additional documentation about variable construction.

