The Read Me File
The Read Me file is a user's guide to the documentation stored in your Project/ folder.
When a reader is interested in exploring your documentation or reproducing the results of your project, the Read Me file is the first thing they will look at.
Contents of the Read Me File
The Read Me file consists of three sections:
Section 1: Software and platform
In this section, you should indicate:
- The type(s) of software you used for the project, including version numbers.
- The names of any add-on packages that need to be installed with the software.
- The platform (e.g., Windows, Mac, or Linux) you used.
Section 1 should also explain any other hardware or software requirements that must be met for your scripts to be executed.
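As an illustration, the details for Section 1 can be collected by a short script rather than typed by hand. The sketch below assumes the project's scripts are written in Python; the function name describe_environment and the package name "pandas" are placeholders, not part of this guide. Other software offers comparable commands for reporting its version and installed packages.

```python
import platform
import sys
from importlib import metadata


def describe_environment(packages):
    """Return lines describing the platform, Python version, and package versions."""
    lines = [
        f"Platform: {platform.system()} {platform.release()}",
        f"Python: {sys.version.split()[0]}",
    ]
    for name in packages:
        try:
            version = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The package is not installed in the current environment.
            version = "not installed"
        lines.append(f"Package: {name} {version}")
    return lines


if __name__ == "__main__":
    # "pandas" is a placeholder; list the add-on packages your scripts import.
    print("\n".join(describe_environment(["pandas"])))
```

The printed lines can be pasted into Section 1 and then supplemented by hand with any hardware requirements.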
Section 2: A map of your documentation
In this section, you should provide an outline or tree illustrating the hierarchy of folders and subfolders contained in your Project Folder, and listing the files stored in each folder or subfolder.
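One way to produce such a tree is to generate it from the folder itself, so that the map cannot drift out of step with the files. The following is a minimal sketch in Python, assuming it is run from the directory that contains Project/; the function name folder_map is hypothetical.

```python
from pathlib import Path


def folder_map(root, indent=""):
    """Return lines listing the folders and files under root, one per line."""
    lines = []
    # Sort so that folders come first, then files, each group alphabetically.
    entries = sorted(Path(root).iterdir(), key=lambda p: (p.is_file(), p.name.lower()))
    for entry in entries:
        if entry.is_dir():
            lines.append(f"{indent}{entry.name}/")
            # Indent the contents of each subfolder one level deeper.
            lines.extend(folder_map(entry, indent + "    "))
        else:
            lines.append(f"{indent}{entry.name}")
    return lines


if __name__ == "__main__":
    # Only print the map if a Project/ folder exists in the current directory.
    if Path("Project").is_dir():
        print("Project/")
        print("\n".join(folder_map("Project", indent="    ")))
```

The output is plain text that can be copied into Section 2 of the Read Me file.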
If you are using restricted data
As explained on the page for Input Data Files, in some cases you may not be allowed to include certain data files in your documentation for reasons of intellectual property, confidentiality, or privacy. If this is the case, you must remove the restricted data files from your documentation before making it public.
Nonetheless, in the map of your documentation in the Read Me file, when you list the Input Data files contained in the Input Data folder, you should still include any files you had to remove because of sharing restrictions. But for each such file you removed, add a note indicating that (i) the file is not actually stored in the public documentation for your project, and (ii) instructions on applying for access to the file can be found in your Data Sources Guide.
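The listing described above, in which removed files still appear but carry a note, could be assembled as in the following Python sketch. The function name list_input_data, the wording of the note, and the file names in the usage example are all illustrative assumptions.

```python
RESTRICTED_NOTE = (
    "[not included in the public documentation; "
    "see the Data Sources Guide for how to apply for access]"
)


def list_input_data(present, removed):
    """Return map entries for the Input Data folder, flagging removed files."""
    entries = []
    # Merge both lists so removed files still appear in the map.
    for name in sorted(set(present) | set(removed)):
        if name in removed:
            entries.append(f"{name}  {RESTRICTED_NOTE}")
        else:
            entries.append(name)
    return entries
```

For example, list_input_data(["Survey.csv"], ["Census.csv"]) returns two entries, with Census.csv listed first and followed by the note.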
Section 3: Instructions for reproducing your results
In this section, you should give explicit step-by-step instructions for using the documentation in your Project/ folder to reproduce the Results of your study. These instructions should be written in straightforward plain English, but they must be detailed and precise enough to make it possible for an interested user to reproduce your Results without undue difficulty.
Typically, these instructions should guide the user through the following steps:
- Checking that the reader's computer meets the software and hardware requirements described in section 1 of the Read Me file.
- Copying the Project Folder onto the reader's computer, and verifying that the folder and file structure, as illustrated in section 2 of the Read Me file, is intact.
  - NOTE: If you had to remove any of the Input Data files because of restrictions pertaining to intellectual property, confidentiality, or privacy, your instructions should tell the reader to (i) consult your Data Sources Guide for information about how to apply for access to the restricted data, and (ii) once access is granted, put copies of the restricted data files in the Input Data folder.
- Executing the scripts included in the documentation, in a specified order. (Your instructions need to specify what that order is.)
  - For each script that must be executed, the instructions should:
    - Briefly describe what the script accomplishes.
    - Indicate any files that are accessed by the script.
    - Indicate any files that are generated and saved by the script.
  - The instructions should also indicate that, as an alternative to running the scripts one at a time, the user can simply run the Master Script, which executes all the other scripts in the specified order.
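To make the idea of a Master Script concrete, here is a minimal sketch in Python that runs a list of scripts in a fixed order and stops at the first failure. It assumes the project's scripts are themselves Python files; the script names, the Scripts/ subfolder, and the function name run_in_order are hypothetical, and a project using other software would write its Master Script in that software's own language.

```python
import subprocess
import sys
from pathlib import Path

# Hypothetical script names, listed in the order they must be executed.
SCRIPT_ORDER = ["01_Import.py", "02_Clean.py", "03_Analysis.py"]


def run_in_order(scripts_dir, script_names):
    """Run each script with the current Python interpreter, stopping on the first failure."""
    completed = []
    for name in script_names:
        path = Path(scripts_dir) / name
        if not path.is_file():
            raise FileNotFoundError(f"Missing script: {path}")
        # check=True raises CalledProcessError if the script exits with an error.
        subprocess.run([sys.executable, str(path)], check=True)
        completed.append(name)
    return completed


if __name__ == "__main__":
    # Assumed location of the scripts; adjust to match your own folder map.
    scripts_dir = Path("Project") / "Scripts"
    if scripts_dir.is_dir():
        run_in_order(scripts_dir, SCRIPT_ORDER)
```

Keeping the order in a single list means the Read Me instructions and the Master Script can be checked against each other line by line.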
Writing the Read Me File
You may use any word processing or typesetting software you like (e.g., Microsoft Word, Google Docs, or LaTeX) to write the Read Me file.
When you have finished writing the Read Me file, save it in .pdf format, in the top level of the Project/ folder.
Naming the Read Me File
Give your Read Me file the name ReadMe.pdf.
Conventions for naming files
Naming files
We suggest you adopt the following conventions when choosing names for files you create:
- Do not include spaces in file names. This is actually more of a requirement than a convention: if there are spaces in the file names you write in scripts, you will probably run into errors when you run the scripts.
- As with folder names, PascalCase is often a convenient way to avoid using spaces in file names. For example, we name the Read Me file ReadMe.pdf instead of "Read Me.pdf".
- You can also replace a space in a file name with an underscore (_). As just one example, you might use names such as USA_Economic.csv, USA_Demographic.csv, UK_Economic.csv, and UK_Demographic.csv.
- When you refer to a file in a written document, write the name of the file in italics. (Of course, when you type the name of the file in a script, you won't use italics.)
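The two space-free naming styles described above can be expressed as small helper functions. This is only a sketch in Python; the function names underscore_name and pascal_case_name are hypothetical.

```python
def underscore_name(name):
    """Replace each run of whitespace in a file name with a single underscore."""
    return "_".join(name.split())


def pascal_case_name(name):
    """Remove spaces from a file name, capitalizing the first letter of each word."""
    words = name.split()
    # Only the first character of each word is changed; the rest is kept as typed.
    return "".join(word[:1].upper() + word[1:] for word in words)


if __name__ == "__main__":
    print(pascal_case_name("Read Me.pdf"))        # ReadMe.pdf
    print(underscore_name("USA Economic.csv"))    # USA_Economic.csv
```

Either style works; what matters is that the names written in your scripts contain no spaces.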
File name extensions
The extension on a file name is usually determined by the format in which the file is saved: for example, .docx for a Microsoft Word document; .xlsx for an Excel workbook; .txt for plain text; .csv for text with comma-separated values; .jpg (or .jpeg) for certain kinds of graphics files; and .pdf for (well, what else?) PDF documents.
Many types of statistical software have their own formats for storing data, scripts, graphics, and other kinds of files, with file name extensions that are specific to each type of software.