Data
Downloading the Data
To download the data, run the following commands from the root of this project:
mkdir data/
curl --location --request GET 'http://virulent.cs.umd.edu:3000/sessions' > data/sessions.json
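For environments without curl, the same download can be sketched in Python using only the standard library. The function name `download_sessions` and its default arguments are hypothetical and are not part of this project; only the URL and the destination path come from the commands above.

```python
import shutil
import urllib.request
from pathlib import Path

# URL and destination are taken from the curl command above.
SESSIONS_URL = "http://virulent.cs.umd.edu:3000/sessions"


def download_sessions(url: str = SESSIONS_URL, dest: str = "data/sessions.json") -> Path:
    """Fetch `url` and write the response body to `dest` (hypothetical helper)."""
    dest_path = Path(dest)
    # Create the target directory if it is missing, like `mkdir -p data/`.
    dest_path.parent.mkdir(parents=True, exist_ok=True)
    # urlopen follows HTTP redirects by default, similar to curl's --location.
    with urllib.request.urlopen(url) as response, open(dest_path, "wb") as out_file:
        shutil.copyfileobj(response, out_file)
    return dest_path
```

Calling `download_sessions()` with no arguments from the project root should leave the file at `data/sessions.json`, assuming the server is reachable.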
Understanding Stored Data
We record VSCode event data in JSON files. The meaning of each field is not trivial to understand, so we describe the fields below. We record two types of events: mouse movements/highlights and keyboard events. The following comments are taken from the VSCode API: https://code.visualstudio.com/api/references/vscode-api
Mouse movements and highlights
{
    // The position of the cursor.
    "active": {
        "line": 0,
        "character": 1
    },
    // The position at which the selection starts (equal to either start or end position)
    "anchor": {
        "line": 0,
        "character": 1
    },
    // The end position of the selection.
    "end": {
        "line": 0,
        "character": 1
    },
    // A selection is reversed if its anchor is the end position.
    "isReversed": true,
    // The start position of the selection.
    "start": {
        "line": 0,
        "character": 1
    }
}
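The relationship between these fields can be made concrete with a small sketch. It relies only on the field meanings given in the comments above; the helper names (`to_position`, `describe_selection`) and the keys of the returned summary are hypothetical, and the sample event is made up for illustration.

```python
from typing import Dict, Tuple

Position = Tuple[int, int]  # (line, character)


def to_position(raw: Dict[str, int]) -> Position:
    """Turn {"line": ..., "character": ...} into a comparable tuple."""
    return (raw["line"], raw["character"])


def describe_selection(event: dict) -> dict:
    """Summarize one mouse/highlight event (hypothetical helper)."""
    start = to_position(event["start"])
    end = to_position(event["end"])
    anchor = to_position(event["anchor"])
    active = to_position(event["active"])
    return {
        # start == end means nothing is highlighted: the cursor only moved.
        "is_empty": start == end,
        "lines_spanned": end[0] - start[0] + 1,
        # The anchor always sits on one boundary of the selection.
        "anchor_on_boundary": anchor in (start, end),
        # Anchor at the end and cursor at the start: the user selected backwards.
        "selected_backwards": start != end and anchor == end and active == start,
    }


# Made-up event: text highlighted from line 2, column 4 back to line 0, column 1.
sample = {
    "active": {"line": 0, "character": 1},
    "anchor": {"line": 2, "character": 4},
    "start": {"line": 0, "character": 1},
    "end": {"line": 2, "character": 4},
    "isReversed": True,
}
summary = describe_selection(sample)
```

For this sample the summary reports a non-empty selection spanning three lines that was made backwards, which agrees with `isReversed` being true.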
Keyboard Events
{
    "startLine": 0,
    "startChar": 1,
    "endLine": 0,
    "endChar": 1,
    "textChange": "H",
    "testsPassed": [],
    "time": 1666896719657
}
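The keyboard event fields are not documented field by field, so the following is only a sketch under two loudly stated assumptions: that `startLine`/`startChar`/`endLine`/`endChar` give the range of text that was replaced and `textChange` the text written into it (as in VSCode's text document change events), and that `time` is a Unix timestamp in milliseconds. The helper names `classify_change` and `event_time` are hypothetical.

```python
from datetime import datetime, timezone


def classify_change(event: dict) -> str:
    """Label a keyboard event as insert, replace, or delete (assumed semantics)."""
    start = (event["startLine"], event["startChar"])
    end = (event["endLine"], event["endChar"])
    range_is_empty = start == end
    has_text = event["textChange"] != ""
    if range_is_empty and has_text:
        return "insert"   # text typed at the cursor, nothing overwritten
    if not range_is_empty and has_text:
        return "replace"  # a selected range was overwritten with new text
    if not range_is_empty:
        return "delete"   # a range was removed and nothing was written
    return "noop"         # empty range and empty text


def event_time(event: dict) -> datetime:
    """Interpret `time` as milliseconds since the Unix epoch (assumption)."""
    return datetime.fromtimestamp(event["time"] / 1000, tz=timezone.utc)


# The example event shown above.
example = {
    "startLine": 0,
    "startChar": 1,
    "endLine": 0,
    "endChar": 1,
    "textChange": "H",
    "testsPassed": [],
    "time": 1666896719657,
}
kind = classify_change(example)  # "insert": empty range, one character typed
when = event_time(example)
```

Under these assumptions the example event is an insertion of the character "H" at line 0, character 1.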
Loading data
- proof_data_analysis.utils.get_num_tests_passed(tests_passed: Series) -> Series
Convert a series of lists of passed tests to a series of counts.
Each resulting data point is the number of tests passed.
e.g. [[1,2], [3,4], [1,2,3]] -> [2, 2, 3]
- proof_data_analysis.utils.load_df(path_to_json: str = 'example.json') -> DataFrame
Load the JSON file containing the keylogged events and convert it to a pandas DataFrame.
There are three event types: insert, replace, and delete.
- Parameters
path_to_json – path to the json file with the keylogged events
- Returns
a pandas dataframe with a row for each keylogged event
- proof_data_analysis.utils.times_to_seconds(time: Series) -> Series
Convert a series of timestamps to seconds.
Each resulting data point is the number of seconds elapsed since the first timestamp.
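The behavior described for the two Series helpers can be illustrated with plain pandas. This is a minimal sketch, not the implementation in `proof_data_analysis.utils`: the function names `count_tests_passed` and `seconds_since_first` are hypothetical stand-ins, and the division by 1000 assumes that the recorded `time` values are in milliseconds.

```python
import pandas as pd


def count_tests_passed(tests_passed: pd.Series) -> pd.Series:
    """Map each list of passed tests to its length (stand-in sketch)."""
    return tests_passed.apply(len)


def seconds_since_first(time_ms: pd.Series) -> pd.Series:
    """Seconds elapsed since the first timestamp, assuming millisecond input."""
    return (time_ms - time_ms.iloc[0]) / 1000.0


# The example from the description above: three events with 2, 2, and 3 tests passed.
counts = count_tests_passed(pd.Series([[1, 2], [3, 4], [1, 2, 3]]))

# Made-up timestamps 1.0 s and 2.5 s after the first event.
elapsed = seconds_since_first(
    pd.Series([1666896719657, 1666896720657, 1666896722157])
)
```

Here `counts` holds 2, 2, 3 and `elapsed` holds 0.0, 1.0, 2.5, matching the descriptions of the two functions.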