Our paper, “Experience: Learner Analytics Data Quality for an eTextbook System,” has been accepted for publication in the ACM Journal of Data and Information Quality (JDIQ).


We present lessons learned related to data collection and analysis from five years of experience with the eTextbook system OpenDSA.
The use of such cyberlearning systems is expanding rapidly in both formal and informal educational settings.
While the precise issues in any such project depend on its data collection technology and goals, certain types of data collection problems will be common.
We begin by describing the nature of the data transmitted between the student’s client machine and the database server, and our initial database schema for storing interaction log data.
We describe many problems that we encountered, categorized as syntactic-level data collection issues, issues with relating events to users, or issues with tracking users over time.
Relating events to users and tracking the time spent on tasks are both prerequisites to converting syntactic-level interaction streams into the semantic-level behavior needed for higher-order analysis of the data.
Finally, we describe changes made to our database schema that helped to resolve many of the issues that we had encountered.
These changes help to advance our ultimate goal of encouraging a change from ineffective learning behavior by students to more productive behavior.