Activity Usage Analysis

Background

Every School Server acts as a repository where the XOs keep backups (snapshots) of their Journals. Sugar's Journal, in turn, is designed so that each entry carries metadata with the following fields:

Field Description
activity
activity_id
checksum
icon-color
keep
mime_type
mtime
preview
share-scope
timestamp
title
title_set_by_user
uid

Motivation

Paraguay Educa has built the first Digital City in Paraguay, in the sense that each kid has access to their own learning environment: each one owns an XO loaded with the Sugar Learning Platform. A number of questions naturally arise:

Question | Data used | Data needed
What is the popularity of various activities? | activity | -
What is the average time spent on each activity? | activity, mtime/timestamp | duration
Is over-the-network collaboration being used? How often? With what activities? | share-scope, activity | -
How can we measure the level of work dedicated to each activity? Did the kid take the time to name the activity? Did she annotate it, for example by writing a description of her work? Did she send it across the network to a friend? Did she publish it on the Web? | title_set_by_user | keystroke count, mouse movement count
What activities are used most in class? Outside of class? | activity, mtime/timestamp (adjusted out of UTC), class_turn* | -
Are there any differences between the activities boys use and the activities girls use? | activity, browser history, gender* | -
What games are most popular? What educational games are most popular? What is the difference in use between the two? | activity, browser history, game categorization** | ways of determining what's used in wine or gnome
How (and how quickly) do new activities spread through the population? (Who are the "opinion leaders" at each school? How do activities go between schools? Are there differences between urban and rural schools, or between large and small schools?) | activity (first instance), mtime/timestamp, school*, grade*, class*, turn*, urban*, school_size* | source of activity installer (pen drive, download, etc.)?
Are there differences in activity use between grades? What activities are most popular with younger children vs. older children? | activity, grade* | -
* scraped from Inventario data (http://wiki.paraguayeduca.org/index.php/Inventario_manual/en)
** to be manually determined
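
As a rough illustration of how the first and third questions could be approached from metadata alone, the sketch below counts entries per activity and computes the fraction of them whose share-scope is not private. It assumes each Journal entry has already been loaded into a Python dict keyed by the field names listed above; the function names and the sample entries are hypothetical, and treating a missing share-scope as private is an assumption.

```python
from collections import Counter


def activity_popularity(entries):
    """Count Journal entries per activity, ignoring entries with no activity."""
    return Counter(e["activity"] for e in entries if e.get("activity"))


def shared_fraction(entries):
    """Per activity, the fraction of entries whose share-scope is not 'private'.

    Assumption: a missing share-scope is treated as private.
    """
    totals = Counter()
    shared = Counter()
    for e in entries:
        name = e.get("activity")
        if not name:
            continue
        totals[name] += 1
        if e.get("share-scope", "private") != "private":
            shared[name] += 1
    return {name: shared[name] / totals[name] for name in totals}


# Hypothetical sample entries, for illustration only.
sample = [
    {"activity": "org.laptop.Arithmetic", "share-scope": "private"},
    {"activity": "org.laptop.Arithmetic", "share-scope": "invite"},
    {"activity": "org.laptop.TamTamEdit"},
    {"mime_type": "video/3gpp"},  # no activity registered
]
```

Calling activity_popularity(sample).most_common(10) would give the ten most frequent activities in the sample.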

Current Progress and TODOs

  • figure out what mtime, timestamp are (both == Unix mtime)
  • fix gender determination in inventario code (recategorized names)
  • look for other directories where metadata is stored (see below)
  • determine which activities aren't registering their names (see below)
  • integrate journal metadata and inventario metadata (Morgan)
  • generate CSV files from integrated metadata (Morgan)
  • write code to generate graphs from integrated metadata?
  • generate a description of metadata fields we would like to add to Journal
  • integrate new metadata fields into Journal code
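
The first item can be checked with a few lines of Python. The sketch below is one way to confirm that an epoch-style timestamp and an ISO-style mtime describe the same instant; it assumes mtime is a naive ISO 8601 string expressed in UTC, and the function name is hypothetical. The value pairs are the first two timestamp and mtime examples from the metadata sample under Debugging the Metadata.

```python
from datetime import datetime, timezone


def same_instant(timestamp, mtime):
    """Return True if an epoch `timestamp` and an ISO `mtime` agree to the second.

    Assumption: `mtime` is a naive ISO 8601 string expressed in UTC.
    """
    from_epoch = datetime.fromtimestamp(int(timestamp), tz=timezone.utc)
    from_iso = datetime.fromisoformat(mtime).replace(
        microsecond=0, tzinfo=timezone.utc
    )
    return from_epoch == from_iso


# Value pairs taken from the metadata sample on this page.
pairs = [
    ("1281376919", "2010-08-09T18:01:59.831266"),
    ("1284498581", "2010-09-14T21:09:41.677128"),
]
results = [same_instant(ts, mt) for ts, mt in pairs]
```

Both pairs compare equal, which is consistent with the note that the two fields carry the same modification time in different encodings.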

Debugging the Metadata

It turns out that a number of activities don't register their names, and some users don't have all of their metadata in a single "store/" folder. Metadata also seems to be spread across many directories (which look like they have randomly generated names). For example, here is a sample of the variety of metadata fields found for three users, each at a different school.

Tags Values Unique Name Examples
275 275 196 timestamp ['1281376919', '1284498581', '1275512215', '1286306972', '1284498910']
275 275 200 mtime ['2010-08-09T18:01:59.831266', '2010-09-14T21:09:41.677128', '2010-06-02T20:56:55.152037', '2010-10-05T19:29:32.029964', '2010-09-14T21:15:10.068966']
275 272 138 title ['Archivo w_croja.exe desde http://www.indicedejuegos.com/superjuegos/w_croja.exe', 'Binomio de oro - llevame en tus sue\xc3\xb1os.mp3', 'Archivo WISIN_Y_YANDEL_-_TE_SIENTO__VI.avi desde http___dc245.4shared.com_download_QQEBtPMp_WISIN_Y_YANDEL_-_TE_SIENTO__VI.avi_tsid=20100914-150305-b07484aa.divx', 'Damas gratis los due\xc3\xb1os del pabe.mp3', 'lloro .3gpp']
275 200 20 mime_type ['application/x-ms-dos-executable', 'video/x-msvideo', 'application/x-visualmatch', 'video/3gpp', 'application/octet-stream']
275 113 80 activity_id ['1807e97cdfd208ec9cc94b80fb3f60e49ecaf851', '8666d267b86091f09c5857ed0f10e3728dd03872', 'b2f7879cc8cba50d78d426b976b35a8a906bbf5d', '25b1d2b0693efd170f958ffbce826511a36999cf', '7ba798c189b2ca3ceaf4164d957b9e31a211e053']
275 103 28 activity ['org.worldwideworkshop.olpc.JigsawPuzzle', 'org.sugarlabs.VisualMatchActivity', 'org.laptop.Arithmetic', 'org.laptop.Oficina', 'org.laptop.TamTamEdit']
262 262 9 icon-color [u'#00A0FF,#005FE4', u'#FF2B34,#005FE4', u'#00EA11,#00B20D', '#ff2b34,#005fe4', '#F8E800,#FF2B34']
238 238 163 uid ['f627b8e9-3f8a-4d47-8d92-f23be2aa07c4', 'fafa78fb-87a0-4f55-bafc-24c71f5a3608', 'fc4a469a-b552-4f6b-904a-2289172e2cdd', 'fe9e491f-d1bd-4685-a80b-b1f4e51f574b', 'ffe81904-8524-4e27-bdce-f92334fbcf2b']
223 128 2 title_set_by_user [u'1', u'0']
203 187 2 keep [u'0', '1']
168 168 96 checksum ['e680616deaca8e6e1d2771d41a34a1b2', 'a9d6c92e3044c38a0fb6a3c19eeee562', '23b509a925f2511d39d2c2b04b6d203b', '86990a4f74fb312befac07d57726ed1c', 'e20d2f73111529503850c50cabbab1e3']
141 65 5 preview [OMITTED FOR LENGTH]
103 3 1 buddies ['{"ef45fb8dd0b5dcbc30c9bf826071d839a999b896": ["xo", "#00B20D,#00A0FF"]}']
98 98 2 share-scope [u'private', 'invite']
91 52 25 description ['/media/EC4C-9AD9/Don Omar-angelito.mp3', '/media/EC4C-9AD9/Binomio de oro - llevame en tus sue\xc3\xb1os.mp3', '/media/RENE/Archivo WISIN_Y_YANDEL_-_TE_SIENTO__VI.avi desde http___dc245.4shared.com_download_QQEBtPMp_WISIN_Y_YANDEL_-_TE_SIENTO__VI.avi_tsid=20100914-150305-b07484aa.divx', '/media/EC4C-9AD9/Damas gratis los due\xc3\xb1os del pabe.mp3', '/media/RENE/lloro .3gpp']
81 81 8 progress [u'1', u'51', u'64', u'11', u'39']
77 77 1 mountpoint ['/']
43 0 0 tags []
17 16 10 fulltext [OMITTED FOR LENGTH]
14 14 3 ctime [u'1980-01-01T00:00:00', u'2009-09-07T22:07:59', '2010-10-06T16:23:13']
13 13 3 filename [u'Mis archiv0s/MOTORHEAD1.jpg', u'Attachments/amigos especiales (remix).mp3', u'avusadora.flv']
5 5 2 bundle_id ['org.winehq.Wine', 'org.laptop.sugar.Jukebox']
4 4 1 vid [1.0]
3 3 1 buddies_id ['["ef45fb8dd0b5dcbc30c9bf826071d839a999b896"]']
2 2 1 total_time ['0']
2 2 1 sun ['sol']
2 2 1 robot_time ['60']
2 2 1 robot_matches ['0']
2 2 1 play_level ['0']
2 2 1 numberO ['1']
2 2 1 numberC ['3']
2 2 1 mouse ['rat\xc3\xb3n']
2 2 1 moon ['luna']
2 2 1 matches ['0']
2 2 1 low_score_expert ['-1']
2 2 1 low_score_beginner ['-1']
2 2 1 editing_word_list ['0']
2 2 1 earth ['tierra']
2 2 1 dog ['perro']
2 2 1 deck_index ['12']
2 2 1 cheese ['queso']
2 2 1 cat ['gato']
2 2 1 cardtype ['number']
2 2 1 bread ['pan']
2 2 1 apple ['manzana']
1 1 1 value ['0 0 1 1 2 6 2 0 0 0 0 0 0 0 0']
1 1 1 top ['2']
1 1 1 tamtam_subactivity ['mini']
1 1 1 suggested_filename [u'avusadora.flv']
1 1 1 rods ['15']
1 1 1 python code ['1fe99c77-590a-47dc-80df-edd9c6b7412c']
1 1 1 factor ['5']
1 1 1 bottom ['5']
1 1 1 base ['10']
1 1 1 abacus ['soroban']
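
A summary of this shape can be produced with a short function. The sketch below assumes the three numeric columns mean: entries that carry the field at all (Tags), entries where the field is non-empty (Values), and distinct non-empty values (Unique). That reading is an assumption based on the numbers above, and the function name and input format (a list of dicts) are hypothetical.

```python
def summarize_fields(entries):
    """Return {field: (tags, values, unique)} for a list of metadata dicts.

    tags   = entries that contain the field at all
    values = entries where the field is non-empty
    unique = number of distinct non-empty values
    """
    stats = {}
    for entry in entries:
        for field, value in entry.items():
            row = stats.setdefault(field, {"tags": 0, "values": 0, "unique": set()})
            row["tags"] += 1
            if value not in ("", None):
                row["values"] += 1
                # Compare values by their string form so mixed types can be counted.
                row["unique"].add(str(value))
    return {
        field: (row["tags"], row["values"], len(row["unique"]))
        for field, row in stats.items()
    }
```

Sorting the result by the first element of each tuple, in descending order, gives the same ordering as the table.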


Based on this, we functionally have the following fields available for journal analysis:

Name | Description | Notes
uid | unique identifier | NOT PRESENT IN ALL
mtime/timestamp | TIME last modified | could function as a unique identifier
activity, bundle_id, title (if not title_set_by_user), description, mime_type | NAME of activity, program, or media | -
title_set_by_user | was the title set by the user? | (0/1)
keep | was this saved to journal? | (0/1)
buddies/share-scope | was this activity shared over the network? | ([hash], private/invite)
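
To make the fallbacks in this table concrete, here is a minimal sketch of how an analysis script might derive an identifier and a name for each entry. The function names and the order in which the name fields are tried are assumptions, not something the Journal itself defines.

```python
def entry_key(meta):
    """Identifier for an entry: uid when present, otherwise the modification time."""
    return meta.get("uid") or meta.get("timestamp") or meta.get("mtime")


def entry_name(meta):
    """Best-effort name of the activity, program, or media behind an entry.

    Assumption: fields are tried in this order, and a title is only used
    when it was not set by the user (i.e. it was generated automatically).
    """
    for field in ("activity", "bundle_id"):
        if meta.get(field):
            return meta[field]
    if meta.get("title") and meta.get("title_set_by_user", "0") != "1":
        return meta["title"]
    for field in ("description", "mime_type"):
        if meta.get(field):
            return meta[field]
    return None
```

An entry with no activity, an automatic title of "lloro .3gpp" and a mime_type of "video/3gpp" would be named by its title; if the user had set the title, the mime_type would be used instead.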

Crunching Journals

The first question we'll try to answer is which activity is the most popular. To figure that out:

1) Log into the school server as root (or run sudo -i).

2) Download the following Python script:

 wget "http://git.paraguayeduca.org/gitweb/users/rgs/xs-scripts.git/blob_plain/HEAD:/get-journal-stats.py?js=1" -O get-journal-stats.py

3) Make it executable:

 chmod +x get-journal-stats.py

4) Run the script:

 ./get-journal-stats.py 0 > activities.txt

Note: this process takes quite a few minutes.

Generating nice pie charts

The output of the previous script can be fed to a Ruby script that generates pie charts:

1) If you are going to generate the images on another machine, copy activities.txt to it first.

2) Download the script to generate pie charts:

 wget "http://git.paraguayeduca.org/gitweb/users/rgs/xs-scripts.git/blob_plain/HEAD:/gen_bar_graph.rb?js=1" -O gen_bar_graph.rb

3) Make it executable:

 chmod +x gen_bar_graph.rb

4) Feed the data to the script:

 ./gen_bar_graph.rb activities.txt pie_chart_school_40.png 10 "Escuela 40 - Tte. Fariña"

where the third parameter (10) is the number of top activities to include.
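
For readers who want to see what selecting the top activities amounts to, the following Python sketch computes the slices of such a chart from a mapping of activity name to count. It is not a description of gen_bar_graph.rb, and since the format of activities.txt is not documented here it takes a plain dict; the function name and the choice to express percentages relative to all entries are assumptions.

```python
def top_slices(counts, top_n):
    """Return [(name, percent)] for the top_n activities by count.

    Assumption: percentages are relative to all entries, not just the top_n.
    """
    total = sum(counts.values())
    if total == 0:
        return []
    ranked = sorted(counts.items(), key=lambda kv: kv[1], reverse=True)
    return [(name, 100.0 * n / total) for name, n in ranked[:top_n]]
```

With top_n set to 10 this matches the intent of the third parameter in the command above.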

Determining Gender and Other Metadata

For many of the questions above, we would need more data than the Journal metadata provides. However, we can correlate Journal metadata with data from Inventario.

Morgan has previously written a script that parses Inventario data and creates a data structure including name, gender, school, grade, turn (morning/afternoon), and serial number, among other fields. She will prepare a script that adds this metadata to the Journal metadata output.
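
The correlation itself is essentially a join on the laptop's serial number. The sketch below shows one way to do it, assuming each Journal entry has been tagged with a "serial" key identifying the XO it came from and that Inventario records are dicts with the same key. Both the key name and the function name are hypothetical; this is not Morgan's script.

```python
def attach_inventory(entries, inventory):
    """Add Inventario fields to each Journal entry, matched on serial number.

    Entries whose serial is not found in `inventory` are copied unchanged.
    The input lists are not modified.
    """
    by_serial = {rec["serial"]: rec for rec in inventory}
    merged = []
    for entry in entries:
        extra = by_serial.get(entry.get("serial"), {})
        combined = dict(entry)
        for key, value in extra.items():
            # Journal fields win if both sources define the same key.
            combined.setdefault(key, value)
        merged.append(combined)
    return merged
```

Once merged, each entry can be grouped by gender, grade, or turn in the same way it is grouped by activity.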

Future challenges

Many of the most interesting questions can't be answered at this time without resorting to crude (and questionable) heuristics. The challenge ahead of us is to work with Journal hackers and the Sugar Community as a whole to work out a framework that will allow us to learn more about how kids learn and spend their time using Sugar.