Activity Usage Analysis
From Paraguay Educa
Background
Every School Server is used by the XOs as a repository where backups (snapshots) of the Journal are kept. Sugar's Journal, in turn, is designed so that each entry carries metadata with the following fields:
| Field | Description |
| activity | |
| activity_id | |
| checksum | |
| icon-color | |
| keep | |
| mime_type | |
| mtime | |
| preview | |
| share-scope | |
| timestamp | |
| title | |
| title_set_by_user | |
| uid | |
Motivation
Paraguay Educa has built the first Digital City in Paraguay, in the sense that every kid has access to their own learning environment: each one owns an XO loaded with the Sugar Learning Platform. A number of questions naturally arise:
| Question | Data used | Data needed |
| What is the popularity of various activities? | activity | |
| What is the average time spent on each activity? | activity, mtime/timestamp | duration |
| Is over-the-network collaboration being used? How often? With what activities? | share-scope, activity | |
| How can we measure the level of work dedicated to each activity? Did the kid take the time to name the activity? Did she annotate it, for example by writing a description of her work? Did she send it across the network to a friend? Did she publish it on the Web? | title_set_by_user | keystroke count, mouse movement count |
| What activities are used most in class? Outside of class? | activity, mtime/timestamp (adjusted out of UTC), class_turn* | |
| Are there any differences between the activities boys use and the activities girls use? | activity, browser history, gender* | |
| What games are most popular? What educational games are most popular? What is the difference in use between the two? | activity, browser history, game categorization** | ways of determining what's used in wine or gnome |
| How (and how quickly) do new activities spread through the population? (Who are the "opinion leaders" at each school? How do activities go between schools? Are there differences between urban and rural schools, or between large and small schools?) | activity (first instance), mtime/timestamp, school*, grade*, class*, turn*, urban*, school_size* | source of activity installer (pen drive, download, etc.)? |
| Are there differences in activity use between grades? What activities are most popular with younger children vs. older children? | activity, grade* | |
* scraped from Inventario data (http://wiki.paraguayeduca.org/index.php/Inventario_manual/en)
** to be manually determined
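As an illustration of how one row of the table maps onto the metadata, here is a minimal sketch for the collaboration question. It assumes each Journal entry has already been loaded into a Python dict keyed by field name, and it assumes that any share-scope value other than 'private' means the entry was shared. The function name and the sample entries are hypothetical.

```python
from collections import defaultdict

def share_rate_by_activity(entries):
    """Return {activity: (shared, total)} for entries that record share-scope."""
    tally = defaultdict(lambda: [0, 0])
    for meta in entries:
        activity = meta.get("activity")
        scope = meta.get("share-scope")
        if not activity or scope is None:
            continue  # many entries lack one field or the other
        tally[activity][1] += 1
        if scope != "private":  # assumption: non-private means shared
            tally[activity][0] += 1
    return {name: tuple(counts) for name, counts in tally.items()}

# Hypothetical sample entries, for illustration only.
sample = [
    {"activity": "org.laptop.Arithmetic", "share-scope": "private"},
    {"activity": "org.laptop.Arithmetic", "share-scope": "invite"},
    {"activity": "org.laptop.TamTamEdit", "share-scope": "private"},
    {"activity": "org.laptop.TamTamEdit"},  # no share-scope recorded
]
rates = share_rate_by_activity(sample)
for name, (shared, total) in sorted(rates.items()):
    print(f"{name}: {shared}/{total} shared")
```

Entries without a share-scope field are skipped rather than counted as private, so the denominator only covers entries where sharing status is actually known.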
Current Progress and TODOs
- figure out what mtime, timestamp are (both == unix mtime)
- fix gender determination in inventario code (recategorized names)
- look for other directories where metadata is stored (see below)
- determine which activities aren't registering their names (see below)
- integrate journal metadata and inventario metadata (Morgan)
- generate CSV files from integrated metadata (Morgan)
- write code to generate graphs from integrated metadata?
- generate a description of metadata fields we would like to add to Journal
- integrate new metadata fields into Journal code
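The finding that mtime and timestamp both hold the Unix modification time can be checked against a sample pair from the metadata dump on this page (timestamp '1281376919', mtime '2010-08-09T18:01:59.831266'). The sketch below assumes the mtime string is expressed in UTC, and uses a fixed UTC-4 offset as a stand-in for local time in Paraguay; that offset ignores daylight saving time and is an assumption, as are the helper names.

```python
from datetime import datetime, timedelta, timezone

def timestamp_to_utc(timestamp):
    """Interpret the 'timestamp' field as seconds since the Unix epoch."""
    return datetime.fromtimestamp(int(timestamp), tz=timezone.utc)

def mtime_to_utc(mtime):
    """Parse the ISO-formatted 'mtime' field, assuming it is in UTC."""
    return datetime.fromisoformat(mtime).replace(tzinfo=timezone.utc)

ts = timestamp_to_utc("1281376919")
mt = mtime_to_utc("2010-08-09T18:01:59.831266")

# timestamp has one-second resolution, so drop the microseconds of mtime.
same_second = ts == mt.replace(microsecond=0)
print(ts.isoformat(), same_second)

# Assumption: fixed UTC-4 offset, daylight saving time ignored.
LOCAL = timezone(timedelta(hours=-4))
local = ts.astimezone(LOCAL)
print(local.isoformat())
```

For questions about in-class versus out-of-class use, the local time is the value to compare against the class turn, so a real analysis would need a proper time zone database rather than a fixed offset.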
Debugging the Metadata
It turns out that a number of activities don't register their names, and some users don't have their metadata in one "store/" folder. Metadata seems to also be spread across many directories (which look like they have randomly-generated names). For example, here is a sample of the variety of metadata fields found for three users, each at a different school.
| Tags | Values | Unique | Name | Examples |
| 275 | 275 | 196 | timestamp | ['1281376919', '1284498581', '1275512215', '1286306972', '1284498910'] |
| 275 | 275 | 200 | mtime | ['2010-08-09T18:01:59.831266', '2010-09-14T21:09:41.677128', '2010-06-02T20:56:55.152037', '2010-10-05T19:29:32.029964', '2010-09-14T21:15:10.068966'] |
| 275 | 272 | 138 | title | ['Archivo w_croja.exe desde http://www.indicedejuegos.com/superjuegos/w_croja.exe', 'Binomio de oro - llevame en tus sue\xc3\xb1os.mp3', 'Archivo WISIN_Y_YANDEL_-_TE_SIENTO__VI.avi desde http___dc245.4shared.com_download_QQEBtPMp_WISIN_Y_YANDEL_-_TE_SIENTO__VI.avi_tsid=20100914-150305-b07484aa.divx', 'Damas gratis los due\xc3\xb1os del pabe.mp3', 'lloro .3gpp'] |
| 275 | 200 | 20 | mime_type | ['application/x-ms-dos-executable', 'video/x-msvideo', 'application/x-visualmatch', 'video/3gpp', 'application/octet-stream'] |
| 275 | 113 | 80 | activity_id | ['1807e97cdfd208ec9cc94b80fb3f60e49ecaf851', '8666d267b86091f09c5857ed0f10e3728dd03872', 'b2f7879cc8cba50d78d426b976b35a8a906bbf5d', '25b1d2b0693efd170f958ffbce826511a36999cf', '7ba798c189b2ca3ceaf4164d957b9e31a211e053'] |
| 275 | 103 | 28 | activity | ['org.worldwideworkshop.olpc.JigsawPuzzle', 'org.sugarlabs.VisualMatchActivity', 'org.laptop.Arithmetic', 'org.laptop.Oficina', 'org.laptop.TamTamEdit'] |
| 262 | 262 | 9 | icon-color | [u'#00A0FF,#005FE4', u'#FF2B34,#005FE4', u'#00EA11,#00B20D', '#ff2b34,#005fe4', '#F8E800,#FF2B34'] |
| 238 | 238 | 163 | uid | ['f627b8e9-3f8a-4d47-8d92-f23be2aa07c4', 'fafa78fb-87a0-4f55-bafc-24c71f5a3608', 'fc4a469a-b552-4f6b-904a-2289172e2cdd', 'fe9e491f-d1bd-4685-a80b-b1f4e51f574b', 'ffe81904-8524-4e27-bdce-f92334fbcf2b'] |
| 223 | 128 | 2 | title_set_by_user | [u'1', u'0'] |
| 203 | 187 | 2 | keep | [u'0', '1'] |
| 168 | 168 | 96 | checksum | ['e680616deaca8e6e1d2771d41a34a1b2', 'a9d6c92e3044c38a0fb6a3c19eeee562', '23b509a925f2511d39d2c2b04b6d203b', '86990a4f74fb312befac07d57726ed1c', 'e20d2f73111529503850c50cabbab1e3'] |
| 141 | 65 | 5 | preview | [OMITTED FOR LENGTH] |
| 103 | 3 | 1 | buddies | ['{"ef45fb8dd0b5dcbc30c9bf826071d839a999b896": ["xo", "#00B20D,#00A0FF"]}'] |
| 98 | 98 | 2 | share-scope | [u'private', 'invite'] |
| 91 | 52 | 25 | description | ['/media/EC4C-9AD9/Don Omar-angelito.mp3', '/media/EC4C-9AD9/Binomio de oro - llevame en tus sue\xc3\xb1os.mp3', '/media/RENE/Archivo WISIN_Y_YANDEL_-_TE_SIENTO__VI.avi desde http___dc245.4shared.com_download_QQEBtPMp_WISIN_Y_YANDEL_-_TE_SIENTO__VI.avi_tsid=20100914-150305-b07484aa.divx', '/media/EC4C-9AD9/Damas gratis los due\xc3\xb1os del pabe.mp3', '/media/RENE/lloro .3gpp'] |
| 81 | 81 | 8 | progress | [u'1', u'51', u'64', u'11', u'39'] |
| 77 | 77 | 1 | mountpoint | ['/'] |
| 43 | 0 | 0 | tags | [] |
| 17 | 16 | 10 | fulltext | [OMITTED FOR LENGTH] |
| 14 | 14 | 3 | ctime | [u'1980-01-01T00:00:00', u'2009-09-07T22:07:59', '2010-10-06T16:23:13'] |
| 13 | 13 | 3 | filename | [u'Mis archiv0s/MOTORHEAD1.jpg', u'Attachments/amigos especiales (remix).mp3', u'avusadora.flv'] |
| 5 | 5 | 2 | bundle_id | ['org.winehq.Wine', 'org.laptop.sugar.Jukebox'] |
| 4 | 4 | 1 | vid | [1.0] |
| 3 | 3 | 1 | buddies_id | ['["ef45fb8dd0b5dcbc30c9bf826071d839a999b896"]'] |
| 2 | 2 | 1 | total_time | ['0'] |
| 2 | 2 | 1 | sun | ['sol'] |
| 2 | 2 | 1 | robot_time | ['60'] |
| 2 | 2 | 1 | robot_matches | ['0'] |
| 2 | 2 | 1 | play_level | ['0'] |
| 2 | 2 | 1 | numberO | ['1'] |
| 2 | 2 | 1 | numberC | ['3'] |
| 2 | 2 | 1 | mouse | ['rat\xc3\xb3n'] |
| 2 | 2 | 1 | moon | ['luna'] |
| 2 | 2 | 1 | matches | ['0'] |
| 2 | 2 | 1 | low_score_expert | ['-1'] |
| 2 | 2 | 1 | low_score_beginner | ['-1'] |
| 2 | 2 | 1 | editing_word_list | ['0'] |
| 2 | 2 | 1 | earth | ['tierra'] |
| 2 | 2 | 1 | dog | ['perro'] |
| 2 | 2 | 1 | deck_index | ['12'] |
| 2 | 2 | 1 | cheese | ['queso'] |
| 2 | 2 | 1 | cat | ['gato'] |
| 2 | 2 | 1 | cardtype | ['number'] |
| 2 | 2 | 1 | bread | ['pan'] |
| 2 | 2 | 1 | apple | ['manzana'] |
| 1 | 1 | 1 | value | ['0 0 1 1 2 6 2 0 0 0 0 0 0 0 0'] |
| 1 | 1 | 1 | top | ['2'] |
| 1 | 1 | 1 | tamtam_subactivity | ['mini'] |
| 1 | 1 | 1 | suggested_filename | [u'avusadora.flv'] |
| 1 | 1 | 1 | rods | ['15'] |
| 1 | 1 | 1 | python code | ['1fe99c77-590a-47dc-80df-edd9c6b7412c'] |
| 1 | 1 | 1 | factor | ['5'] |
| 1 | 1 | 1 | bottom | ['5'] |
| 1 | 1 | 1 | base | ['10'] |
| 1 | 1 | 1 | abacus | ['soroban'] |
Based on this, we functionally have the following fields available for journal analysis:
| Name | Description | Notes |
| uid | unique identifier | NOT PRESENT IN ALL |
| mtime/timestamp | TIME last modified | could function as a unique identifier |
| activity, bundle_id, title (if not title_set_by_user), description, mime_type | NAME of activity, program, or media | |
| title_set_by_user | was the title set by the user? (0/1) | |
| keep | was this saved to journal? (0/1) | |
| buddies/share-scope | was this activity shared over the network? ([hash], private/invite) | |
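Because no single field names every entry, an analysis needs a fallback rule. The sketch below is one possible rule, assuming each entry is a dict of field name to value. The fallback order and the choice to treat a missing title_set_by_user as "not set by the user" are assumptions, and entry_label is a hypothetical helper.

```python
def entry_label(meta):
    """Return (source_field, value) naming the activity, program, or media."""
    for key in ("activity", "bundle_id"):
        if meta.get(key):
            return key, meta[key]
    # A title is only trusted as a name when the user did not type it.
    if meta.get("title") and str(meta.get("title_set_by_user", "0")) != "1":
        return "title", meta["title"]
    for key in ("description", "mime_type"):
        if meta.get(key):
            return key, meta[key]
    return None, None

entries = [
    {"activity": "org.laptop.Oficina", "mime_type": "application/octet-stream"},
    {"bundle_id": "org.winehq.Wine"},
    {"title": "lloro .3gpp", "title_set_by_user": "0", "mime_type": "video/3gpp"},
    # Hypothetical user-set title: skipped in favour of mime_type.
    {"title": "My homework", "title_set_by_user": "1", "mime_type": "video/3gpp"},
    {},
]
labels = [entry_label(meta) for meta in entries]
for source, value in labels:
    print(source, value)
```

Returning the source field alongside the value makes it possible to report later how many entries were named by each fallback, which is useful for spotting activities that do not register their names.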
Crunching Journals
The first question we'll try to answer is which activities are the most popular. To figure that out:
1) Log into the schoolserver (as root or do sudo -i).
2) Download the following Python script:
wget "http://git.paraguayeduca.org/gitweb/users/rgs/xs-scripts.git/blob_plain/HEAD:/get-journal-stats.py?js=1" -O get-journal-stats.py
3) Make it executable:
chmod +x get-journal-stats.py
4) Run the script:
./get-journal-stats.py 0 > activities.txt
Note: this process takes quite a few minutes.
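The internals and output format of get-journal-stats.py are not reproduced here. As a rough illustration of the counting step alone, the following sketch tallies the activity field over entries that have already been loaded into dicts. The function name and the sample data are hypothetical, and this is not the script's actual code.

```python
from collections import Counter

def activity_popularity(entries, top=None):
    """Count entries per activity, most frequent first."""
    counts = Counter(
        meta["activity"] for meta in entries if meta.get("activity")
    )
    return counts.most_common(top)

# Hypothetical sample: activity names taken from the metadata dump above.
sample = (
    [{"activity": "org.laptop.Arithmetic"}] * 3
    + [{"activity": "org.laptop.TamTamEdit"}] * 2
    + [{"activity": "org.laptop.Oficina"}]
    + [{"mime_type": "video/3gpp"}]  # no activity registered
)
ranking = activity_popularity(sample, top=2)
for name, count in ranking:
    print(name, count)
```

Entries with no activity field are ignored here; given how many entries lack it, a real report should probably state how many were left out.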
Generating nice pie charts
With the output generated by the previous script, you can use this Ruby script to generate pie charts:
1) If you are going to run the image generation process on another machine, you need to get a copy of activities.txt.
2) Download the script to generate pie charts:
wget "http://git.paraguayeduca.org/gitweb/users/rgs/xs-scripts.git/blob_plain/HEAD:/gen_bar_graph.rb?js=1" -O gen_bar_graph.rb
3) Make it executable:
chmod +x gen_bar_graph.rb
4) Feed the data to the script:
./gen_bar_graph.rb activities.txt pie_chart_school_40.png 10 "Escuela 40 - Tte. Fariña"
where the third parameter (10) is the number of top activities you'd like to include.
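To make the "top activities" parameter concrete, here is a sketch of how per-activity counts could be reduced to pie slices. Whether gen_bar_graph.rb groups the remaining activities into a single slice is not stated on this page, so the "Other" slice is an assumption, and both the pie_slices helper and the counts are hypothetical.

```python
def pie_slices(counts, top):
    """Return [(label, percent)] for the top activities plus an 'Other' slice."""
    total = sum(counts.values())
    if total == 0:
        return []
    # Sort by count descending, then by name so ties are deterministic.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    slices = [(name, 100.0 * n / total) for name, n in ranked[:top]]
    rest = sum(n for _, n in ranked[top:])
    if rest:
        slices.append(("Other", 100.0 * rest / total))  # assumed grouping
    return slices

# Hypothetical counts, for illustration only.
counts = {
    "org.laptop.Arithmetic": 50,
    "org.laptop.TamTamEdit": 30,
    "org.laptop.Oficina": 15,
    "org.sugarlabs.VisualMatchActivity": 5,
}
slices = pie_slices(counts, top=2)
for label, percent in slices:
    print(f"{label}: {percent:.1f}%")
```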
Determining Gender and Other Metadata
For many of the questions above, we would need more data than the Journal metadata provides. However, we can correlate Journal metadata with data from Inventario.
Morgan has previously written a script that parses Inventario data and creates a data structure including name, gender, school, grade, turn (morning/afternoon), and serial number, among other fields. She will prepare a file to run that will add this metadata to the Journal metadata output.
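The integration step can be pictured as a join on the laptop serial number followed by a CSV export. The sketch below is not Morgan's script. It assumes each Journal entry has been tagged with the serial number of the laptop whose backup it came from, and that the Inventario data is available as a dict keyed by serial number. The serial numbers, field values, and function names are hypothetical.

```python
import csv
import io

FIELDS = ["serial", "gender", "school", "grade", "turn", "activity", "timestamp"]

def integrate(journal_entries, inventario):
    """Attach Inventario fields to each Journal entry with a known serial."""
    rows = []
    for meta in journal_entries:
        person = inventario.get(meta.get("serial"))
        if person is None:
            continue  # laptop not found in Inventario
        row = {
            "serial": meta["serial"],
            "activity": meta.get("activity", ""),
            "timestamp": meta.get("timestamp", ""),
        }
        for key in ("gender", "school", "grade", "turn"):
            row[key] = person.get(key, "")
        rows.append(row)
    return rows

def to_csv(rows):
    """Serialize the integrated rows as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()

# Hypothetical data, for illustration only.
inventario = {
    "SN0001": {"gender": "F", "school": "Escuela 40", "grade": "3", "turn": "morning"},
}
journal = [
    {"serial": "SN0001", "activity": "org.laptop.Arithmetic", "timestamp": "1281376919"},
    {"serial": "SN9999", "activity": "org.laptop.Oficina", "timestamp": "1284498581"},
]
csv_text = to_csv(integrate(journal, inventario))
print(csv_text)
```

Entries whose serial number is missing from Inventario are dropped in this sketch; keeping them with empty demographic columns would be an equally reasonable choice.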
Future challenges
Many of the most interesting questions can't be answered at this time without resorting to far-fetched (and questionable) heuristics. The challenge ahead of us is to work with Journal hackers and the Sugar community as a whole to build a framework that will allow us to learn more about how kids learn and spend their time using Sugar.
