Data Processing and Archiving
Principles for data processing
The ESS Data Archive's overall guiding principle is to produce harmonised and standardised data files that balance two aims:
- The data should be as user-friendly as possible. This means, for example, maximising the accuracy of the data, establishing their consistency and obtaining the best data possible. In addition, the data should be as comparable as possible across countries and over time.
- The data files should reflect the original reliability and quality of the data. This means that data editing, both at the archive and in each country, should be carried out with great caution.
An important principle is that data processing should be done in close collaboration with the National Coordinators and their teams. During the data processing stage, each National Team has full access to the catalogue (a “virtual workspace” containing all data and all programmes) where their data are processed. All decisions about the editing of data are based on advice from the National Coordinators, who have first-hand knowledge of their national data. The National Coordinators must also approve the final drafts of the data files.
The ESS Data Protocol is the key specification document available to the National Teams. It helps to achieve cross-national uniformity in data delivery by specifying how data are to be coded and how data files and other electronic deliverables are to be produced and delivered. It contains sections on the procedures for collaboration between the National Teams and the ESS Data Archive, electronic deliverables, principles of variable definitions, standards and classifications, etc. Its largest part is a detailed coding plan, which defines the variable names, the answer categories, whether numeric or alphanumeric codes are used, and detailed routing instructions consistent with those given in the source questionnaire. The routing instructions are supplemented by flowcharts.
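To illustrate how a coding plan with routing instructions can be turned into an automatic check, the following is a minimal Python sketch. It is not taken from the ESS Data Protocol or from the archive's programmes: the variable names (`paid_work`, `work_hours`), the valid code sets and the "not applicable" code 666 are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class VariableSpec:
    """One entry of a (hypothetical) coding plan."""
    name: str                      # variable name in the data file
    valid_codes: set               # answer categories allowed when the question is asked
    # Routing rule: returns True if the respondent should have been asked.
    # None means the question is asked of everyone.
    asked_if: Optional[Callable[[dict], bool]] = None
    not_applicable: int = 66       # assumed code for respondents routed past the question


# Hypothetical two-variable coding plan; not the real ESS variables or codes.
CODING_PLAN = [
    VariableSpec("paid_work", {1, 2}),
    VariableSpec(
        "work_hours",
        set(range(0, 169)),
        asked_if=lambda r: r.get("paid_work") == 1,
        not_applicable=666,
    ),
]


def check_record(record: dict, plan: list) -> list:
    """Return a list of human-readable problems found in one respondent record."""
    problems = []
    for spec in plan:
        value = record.get(spec.name)
        applicable = spec.asked_if is None or spec.asked_if(record)
        if applicable:
            # The question was asked: the value must be a valid answer category.
            if value not in spec.valid_codes:
                problems.append(f"{spec.name}: code {value!r} not in coding plan")
        elif value != spec.not_applicable:
            # The question was routed past: only the 'not applicable' code is allowed.
            problems.append(
                f"{spec.name}: routing violated, expected "
                f"{spec.not_applicable}, got {value!r}"
            )
    return problems
```

Calling `check_record({"paid_work": 2, "work_hours": 40}, CODING_PLAN)` would flag `work_hours` as a routing violation, since in this sketch only respondents in paid work are routed to that question. Problems are collected into a list rather than raised as errors so that findings can be reported back for a decision instead of being corrected automatically, in keeping with the cautious approach to editing described above.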
Stages in data processing
In total, 16 data programmes are applied between the point at which the National Coordinators deposit the files on the ESS Archive Intranet and the point at which the final draft files are ready for review and approval. Some of the programmes perform automatic checks of the data files, while others produce output to be checked manually. All programmes, files and outputs are available to the National Coordinators, ensuring full transparency during the processing.
Each programme represents a distinct step in the processing. However, two main stages can be identified, each concluding with a data processing report to the National Coordinators and their feedback on it. The first stage consists of what we describe as data ingest and data checks. The second stage covers data edits, edit controls and the final data approval. For a more detailed account of the two stages of data processing, we recommend the ESS Data Archive Team's in-depth article, available from the sidebar as further reading.