The data generated by one or more automated collection systems must be examined to determine which parts are actually of interest (e.g., generated by the intended target population), as opposed to random or irrelevant “noise” created by the automated systems in the course of operations. Preparing data also involves correcting incomplete, incorrect, or improperly formatted data that might interfere with analyses.
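As a rough illustration of these two tasks, the sketch below separates records of interest from system noise and then repairs or flags badly formatted values. The field names (`actor`, `timestamp`, `score`), the noise labels, and the sample records are all hypothetical; real collection systems will differ.

```python
from datetime import datetime

# Hypothetical raw records; field names and values are illustrative only.
RAW_EVENTS = [
    {"actor": "student_17", "timestamp": "2014-03-02 10:15:00", "score": "0.8"},
    {"actor": "system", "timestamp": "2014-03-02 10:15:01", "score": ""},
    {"actor": "student_17", "timestamp": "03/02/2014 10:16", "score": "1.0"},
    {"actor": "student_22", "timestamp": "2014-03-02 10:17:30", "score": "n/a"},
]

# Assumed labels for records produced by the collection system itself.
NOISE_ACTORS = {"system", "heartbeat"}

# Timestamp layouts we assume may appear in the raw data.
TIMESTAMP_FORMATS = ("%Y-%m-%d %H:%M:%S", "%m/%d/%Y %H:%M")


def parse_timestamp(text):
    """Return a datetime for any known layout, or None if none match."""
    for fmt in TIMESTAMP_FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt)
        except ValueError:
            continue
    return None


def parse_score(text):
    """Return the score as a float, or None if it is missing or malformed."""
    try:
        return float(text)
    except ValueError:
        return None


def prepare(events):
    """Drop system noise, then normalize the fields of the remaining records."""
    cleaned = []
    for event in events:
        if event["actor"] in NOISE_ACTORS:
            continue  # not generated by the target population
        cleaned.append({
            "actor": event["actor"],
            "timestamp": parse_timestamp(event["timestamp"]),
            "score": parse_score(event["score"]),
        })
    return cleaned
```

Values that cannot be repaired are set to `None` rather than dropped, so a later analysis step can decide how to treat missing data.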
In many cases, preparing data for formal analysis is currently a time-consuming manual process. There is a growing number of stand-alone data cleaning and preparation tools. Often such tools deal with one or more data problems and apply a repair strategy. The remaining task for those of us interested in automating the data processing stage will be to take stock of the various approaches represented in these tools and then develop a comprehensive, integrated data preparation process that can be automated to deliver useful files to educational researchers.
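One way to picture such an integrated process is as a chain of small repair steps, each handling a single data problem. The sketch below is only an assumption about how that might look, not a description of any existing tool; the step functions and the `student_id` and `grade` fields are hypothetical.

```python
from typing import Callable, Dict, List, Optional

Record = Dict[str, str]
# A step returns the repaired record, or None to discard it.
Step = Callable[[Record], Optional[Record]]


def strip_whitespace(record: Record) -> Optional[Record]:
    """Repair stray leading and trailing spaces in every field."""
    return {key: value.strip() for key, value in record.items()}


def drop_if_missing_id(record: Record) -> Optional[Record]:
    """Discard records that cannot be tied to a student (hypothetical rule)."""
    return record if record.get("student_id") else None


def normalize_grade(record: Record) -> Optional[Record]:
    """Repair inconsistent capitalization of letter grades."""
    fixed = dict(record)
    fixed["grade"] = fixed.get("grade", "").upper()
    return fixed


def run_pipeline(records: List[Record], steps: List[Step]) -> List[Record]:
    """Apply each step in order; stop early for a record once a step drops it."""
    prepared = []
    for record in records:
        current: Optional[Record] = record
        for step in steps:
            current = step(current)
            if current is None:
                break
        if current is not None:
            prepared.append(current)
    return prepared


if __name__ == "__main__":
    raw = [
        {"student_id": " s1 ", "grade": " b "},
        {"student_id": "  ", "grade": "a"},
    ]
    steps = [strip_whitespace, drop_if_missing_id, normalize_grade]
    print(run_pipeline(raw, steps))
```

Keeping each repair strategy as its own function means steps borrowed from different tools can be reordered or replaced without touching the rest of the process.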