Abstract
In this talk I present a set of computing strategies that allow you to analyze data efficiently so that others can reproduce your results. First, I consider the importance of using robust and legible script files. Second, I explain the critical importance of retaining the unaltered script files and datasets used to obtain your results, a process that involves posting files. Third, I consider the advantages of a dual workflow in which the data flow creates variables and datasets, while the analysis flow analyzes datasets but does not construct variables or save datasets. Finally, I suggest ways of naming scripts and datasets that organize your files and simplify replication.
Speaker Bio
Scott Long is Distinguished Professor Emeritus and Chancellor’s Professor Emeritus of Sociology and Statistics at Indiana University. His methodological work has developed new methods for the interpretation of statistical results, unified literatures from diverse fields, and developed methods to facilitate reproducible research. He is the author of texts on confirmatory factor analysis and structural equation modeling, regression models for categorical outcomes, and methods for reproducible results. He taught statistical methods at Indiana University and for the ICPSR Summer Program. His substantive research examined gender differences in scientific careers, health among mid-life women, stigma and mental illness, and human sexuality. This work was supported by the National Institutes of Health and the National Science Foundation. Dr. Long was awarded the American Sociological Association’s Paul F. Lazarsfeld Memorial Award for Distinguished Contributions in the Field of Sociological Methodology and the Leamer-Rosenthal Prize for Open Social Science. He is an elected member of the Sociological Research Association and an elected fellow of the American Statistical Association.