Processing ported to JavaScript
Processing is an easy-to-use programming language designed to simplify the creation of data visualizations. It streamlines the syntax of programs that draw graphics and use animation. The language was designed to be easy enough for designers to use, abstracting away most of the complication of writing the same functionality in Java. Each Processing application is ultimately converted into Java and can be deployed either as an applet on a website or as a standalone Java program.
John Resig has now ported the Processing language to JavaScript. Using his library, you can write Processing code that is dynamically converted into JavaScript and run directly in the browser, without having to load a Java applet. Resig has a long list of examples using his Processing JavaScript library.
Giving Twitter another chance
I've decided to give Twitter another try. For those unfamiliar with the service, Twitter is a website where you can post (or send) a short message to people who are interested in hearing it. To see your messages, people have to subscribe to your posts by following you on Twitter. These messages can be viewed on the Twitter website, in special Twitter clients, or even received on your phone by SMS.
Originally the service was intended to let your friends know what you're up to. It was similar to setting your status in Facebook, or your personal message in MSN Messenger. Now, apart from its original use, Twitter is used to send all sorts of messages, for example links to interesting sites, or questions to the people who are following you. It's a sort of cross between instant messaging, chatting and mini-blogging.
I tried Twitter a couple of months ago, way before it got all the hype. I used it mainly for posting my status to friends, but since not many of my friends were using it yet it was pretty useless. Now I'm giving it another go, hoping to make some new friends. If you want to follow me on Twitter to see what I'm up to, this is my profile. I still have to figure out how to find interesting people to follow though.
Now with Twitter I have to decide what should get blogged and what should get twitted :) Well, posts more than a sentence long will have to go on the blog, but I might start posting links with less general appeal (like computing links) on Twitter rather than on the blog.
How to get the miss hooked on gadgets?
Get her one of these iRobot Roombas...
Baseball Visualizations
The new baseball season has just started and, like every year, the race is on to win the World Series. Baseball is probably the richest sport when it comes to statistical data and analysis, yet searching the custom Google data visualization search engine and the infovis image search database yielded very few results. These are some of the more interesting baseball visualizations I found.
Salary vs Performance - Ben Fry, one of the authors of the Processing programming language, uses his freely available tool to visualize which baseball teams are spending their money well and how each team's position changes over the course of the season. The last applet uploaded looks at the teams and their salaries in 2007.
Baseball Visualization Tool - This is a commercial tool that uses a pie chart to guide the manager on whether to pull the pitcher. The fuller the pie chart, the stronger the case for changing the pitcher.
Baseball race - This visualization tracks each team's progress over the course of a season. The dataset used for this application starts in 1901 and continues to the present day. The data is freely available from Retrosheet, a baseball scores database.
Bivariate Baseball Score Plots - These plots present summary information on MLB teams' game scores, with each game shown as a point on a two-dimensional grid.
Chernoff Faces baseball managers - A visualization coming fresh off the press that uses Chernoff faces to display baseball manager stats. The features of the face like face height, width, nose size, mouth curvature, etc. change according to the values of the attributes they are representing.
Mitchell Report Visualization - In December 2007 a 409-page report was published detailing the use of steroids in Major League Baseball. A social network of connections between players and trainers mentioned in the Mitchell Report was created using Social Action, a tool developed by the HCI Lab of the University of Maryland.
Bill James - A video interview and a newspaper interview with the most popular baseball statistician, who also coined the term used to describe baseball analysis: Sabermetrics.
Using faces to display data
Dr. Steve C Wang used a data visualization technique called Chernoff faces to display some characteristics of baseball managers in 2007. The technique was developed by Herman Chernoff in 1973, and the idea behind it is to display different data attributes as facial features such as the curvature of the mouth, the length of the nose and the direction of the eyebrows. In Dr. Wang’s graphic, the number of lineups used by the manager is shown by the length of the face and the width of the eyes and ears; the number of pinch-hitters is shown by the width of the hair and the width of the face.
Using this technique one can display many different attributes of a data set in a single face, then let the user compare the different faces to analyse the data. In fact Chernoff claims that up to 18 data elements can be displayed using this method, allowing the user to visually cluster the data.
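To make the mapping concrete, here is a rough sketch in Python of how data attributes could be turned into facial feature sizes. It only computes the feature values and does not draw anything. The manager names and numbers are made up for illustration, and the two-attribute mapping is a simplification of the one described above, not Dr. Wang's actual method.

```python
def scale(value, lo, hi):
    """Min-max scale a value into [0, 1]; a constant attribute maps to 0.5."""
    return 0.5 if hi == lo else (value - lo) / (hi - lo)


def chernoff_parameters(rows, feature_for):
    """Map each row's attributes to facial feature sizes between 0 and 1."""
    # Find the smallest and largest value of every attribute across all rows.
    ranges = {
        attr: (min(r[attr] for r in rows.values()),
               max(r[attr] for r in rows.values()))
        for attr in feature_for
    }
    # Each face gets one scaled value per facial feature.
    return {
        name: {feature_for[attr]: scale(r[attr], *ranges[attr])
               for attr in feature_for}
        for name, r in rows.items()
    }


# Hypothetical data, NOT taken from the NY Times graphic.
managers = {
    "Manager A": {"lineups": 80, "pinch_hitters": 100},
    "Manager B": {"lineups": 120, "pinch_hitters": 250},
    "Manager C": {"lineups": 100, "pinch_hitters": 175},
}
# Assumed mapping: one attribute per facial feature.
feature_for = {"lineups": "face_length", "pinch_hitters": "face_width"}

faces = chernoff_parameters(managers, feature_for)
for name, face in faces.items():
    print(name, face)
```

A drawing routine would then stretch each face according to these values, so the manager with the most lineups ends up with the longest face.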

How effective are Chernoff faces in conveying information? Maybe the faces do not convey information at first glance, and they need a lot of referring back to the face legend, but I think they make an interesting and fun way of displaying information. The mere fact that this technique made the pages of the NY Times is proof enough of this. I’m sure that if the same data had been displayed with bar graphs and pie charts it wouldn’t have made any headlines. Most user studies in visualisation measure the efficiency of a technique (speed and accuracy of answers). Techniques like Chernoff faces may not be suited to answering questions fast, but they are catchy and media friendly.
More on Chernoff Faces
Critique by Robert Kosara
An Experimental Analysis of the Effectiveness of Features in Chernoff Faces
Chernoff Faces in Psychology
CAPTCHAs used to digitise books
CAPTCHAs are those annoying pictures with characters on them that you have to type to prove to a website that you're human. They're meant to prevent automated programs from mimicking humans and spamming. While they undoubtedly do the internet a lot of good, they're also a bloody waste of time. The inventor of the CAPTCHA, Luis von Ahn, realised this and decided to put this human effort to good use: converting old books into electronic form.
Listen to this interview with Luis von Ahn at the IT Conversations Network
How to watch better quality videos on YouTube
YouTube is now making some high resolution videos available. If you find a video you'd like to watch, you can add the text "&fmt=6" (without the quotes) to the end of the webpage address to get a better picture. The higher quality version may take a bit longer to start than the normal version, because the file is bigger and takes more time to download.
For example you can take a look at these two videos and compare the difference:-
http://www.youtube.com/watch?v=QAE2-FQHkok
http://www.youtube.com/watch?v=QAE2-FQHkok&fmt=6
For the more technically oriented, there's a Greasemonkey script available to automatically show the high-res video if available.
Also, if you add "&fmt=18" instead of "&fmt=6" you'll get the video in mp4 format instead of the .flv format usually used by YouTube.
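If you'd rather not edit addresses by hand, the same trick can be sketched in a few lines of Python. The function name with_format is my own invention, and the sketch only rewrites the address; whether YouTube honours the fmt value is entirely up to YouTube.

```python
from urllib.parse import parse_qsl, urlencode, urlparse, urlunparse


def with_format(url, fmt):
    """Return the address with its fmt query parameter set to the given value."""
    parts = urlparse(url)
    # Keep existing parameters (such as v=...) but drop any old fmt value.
    query = [(key, value)
             for key, value in parse_qsl(parts.query, keep_blank_values=True)
             if key != "fmt"]
    query.append(("fmt", str(fmt)))
    return urlunparse(parts._replace(query=urlencode(query)))


print(with_format("http://www.youtube.com/watch?v=QAE2-FQHkok", 6))
print(with_format("http://www.youtube.com/watch?v=QAE2-FQHkok&fmt=6", 18))
```

Using the standard library's URL functions rather than plain string concatenation means an address that already carries an fmt value gets it replaced instead of duplicated.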
Via Hackzine
Data Mining and Info Vis for this week
Nat Torkington from O'Reilly Radar published an interesting weekly roundup of Data Mining and Visualization posts. The most interesting items mentioned are: catching a poker cheat with data mining, an SNA toolkit for R, and a link to a machine learning blog called Machine Learning (Theory).
The strength of weak ties (in summary)
jill/txt has a brief summary of the seminal paper "The Strength of Weak Ties" by Mark Granovetter. To quote part of the text:-
"It’s really unlikely that A knows C and A knows B but B and C don’t know each other, at any rate if A is pretty good friends with both B and C: B and C will probably know each other too.
If A needs a job, she’ll ask B and C. They probably won’t have any new information, because A already shares most of the information that B and C have. There’s a far greater chance A will get new information — for instance about a job that might suit A — from her weak ties, that is from aquaintances and people that she doesn’t see very often. The greater social distance between A and D means that D knows more things that A doesn’t already know.
Weak ties also important because they work as bridges between social groups. People who are bridges between two groups may appear to be socially isolated but actually have weak ties with two or more groups which gives them very early access to new information."
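The bridge idea from the summary can be tried out on a toy network. The Python sketch below uses a made-up friendship graph and a simplified test: a tie counts as a bridge candidate when the two people it connects have no friend in common. The names and the helper functions are my own assumptions, not something from the paper or from jill/txt.

```python
def neighbours(edges):
    """Build a lookup of person -> set of people they know."""
    adjacency = {}
    for a, b in edges:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    return adjacency


def bridge_candidates(edges):
    """Return the ties whose two endpoints share no common friend."""
    adjacency = neighbours(edges)
    return [(a, b) for a, b in edges if not (adjacency[a] & adjacency[b])]


# Hypothetical network: A, B and C all know each other, as do D, E and F.
# A and D are only acquaintances.
edges = [
    ("A", "B"), ("A", "C"), ("B", "C"),
    ("D", "E"), ("D", "F"), ("E", "F"),
    ("A", "D"),
]

print(bridge_candidates(edges))  # [('A', 'D')]
```

The only tie that survives the test is the acquaintance link between A and D, which is exactly the kind of connection through which A would hear about things B and C don't already know.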
175 Visualization Resources
The title says it all really. Meryl.net published a very long list of visualization examples, blogs and influential vis people. Worth a look.

At no. 28 there's the Felton Annual Report, which I was planning to blog about some time in the future. It's a personal annual report presented in a very creative way.
Info Graphic Humour
Why do people use Facebook (and other social networking sites)
The article on Facebook that appeared in the Guardian has generated some interesting discussion on the INSNA mailing list this week. The most interesting point was made by danah boyd about the different types of people who use Facebook. I've been thinking about this question myself and she summed it up very nicely. This is the explanation she used:-
- teens because they're not allowed out of the house to hang out with their friends and if they are, their friends aren't or they have to go to highly regulated and supervised settings
- college students because they know that they're supposed to be in class/doing homework/sleeping, but they're procrastinating because talking to friends is much more fun and a little bit of low-level talking through FB can be justified far better than meeting up with someone for a coffee
- white collar workers because they're bored at work and want to hang out with their friends when they should be doing a variety of other things
- nightshift/hourly service workers because their friends work different hours
- parents at home because they can't really go and hang out with their friends because babysitting costs too bloody much
- highly mobile adults and military folks because their friends are far away, probably in a different timezone and getting together in person can only take place sporadically
In a separate related post on the topic she explained two organizing principles of online socializing practices:-
What we've found in our research is that there are two organizing principles of online socializing practices: interest-driven and friend-driven. People who are interest-driven (lovingly called "geeks") seek out people who share their passions, regardless of location, and thrive on access to the technologies that connect them more broadly to others of their stripe. As much as we'd love for this to be everyone, it's not... Most people are not primarily interest-driven in their social practices, although many have a portion of their social practices that fit into this category. The majority of people and the majority of practices are friend-driven. This means that interests are derived through friends, not the other way around. This is why most people go online to connect to people that they already know to reinforce relationships that they already have. At best, this cohort will leverage the technology to meet a friend of a friend (just like at a good dinner party).
The largest exception is quite obvious: sex. By and large, when people leverage the technology for sex, they don't want to engage with people that they already know. The second notable exception is more intriguing: health issues. Interestingly, even the most friend-driven people seem to switch to interest-driven practices when it comes to needing support for an illness or help in gaining information around said illness. It should be noted that these are not common amongst teens and interest-driven practices are almost exclusively the domain of geeks and other socially marginalized and ostracized teens.
I took the liberty of quoting whole parts of the e-mail as it will be available in the INSNA archives anyway.
Is Facebook a threat to Google? What do you think?
Facebook's Mark Zuckerberg was on the CBS program 60 Minutes recently. The footage showed some clips of the Facebook offices, which look very much like a dorm, as Zuckerberg himself says.
One of the commentators in the feature said that Facebook might become a threat to Google. I think this idea is a bit far-fetched. Unlike Facebook, Google solves a very distinct and important need on the internet: the need to find information. While Facebook's role as an aid to social interaction is currently very popular, this human need can be fulfilled without Facebook (through instant messaging, email etc.), and indeed without the internet at all.
On a different note but still on the topic of Facebook, the Guardian had an article on the politics of the people behind the site. Some of them are also part of the PayPal Mafia.
Excel file action settings (inside or outside the browser) in Vista
In Vista, Microsoft removed the option to change the action settings for file types. In previous versions you could access this functionality from Folder Options -> File Types -> Advanced. While you can still change file associations, the Advanced option to control file opening behavior was removed. In order to set the “Browse in same window” option on or off for Office files you have to edit the registry. The option is controlled via the BrowserFlags key in the registry under HKEY_LOCAL_MACHINE -> SOFTWARE -> Classes.
For example, in the case of Excel, the keys that need editing depend on the version of Excel you are using:-
[HKEY_LOCAL_MACHINE\SOFTWARE\Classes\Excel.Sheet.8]
"BrowserFlags"=dword:00000008
[HKEY_LOCAL_MACHINE\SOFTWARE\Classes\Excel.Sheet.12]
"BrowserFlags"=dword:00000008
Excel.Sheet.8 refers to Excel 97-2003 and Excel.Sheet.12 refers to Excel 2007. Lower numbers like Excel.5 refer to previous versions of Excel.
If you want Excel files to open outside the browser, BrowserFlags has to be set to hex 8 as shown above. If you want Excel to work inside the browser, omit BrowserFlags altogether.
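If you have several machines to change, it can be handy to generate the .reg file rather than type it. The Python sketch below only builds the text of such a file and does not touch the registry itself. The function name and the open_outside_browser switch are my own inventions, and the "BrowserFlags"=- line relies on the standard .reg syntax for deleting a value. Treat it as a starting point and check the result against the Microsoft notes before importing it.

```python
def browser_flags_reg(prog_ids, open_outside_browser=True):
    """Build the text of a .reg file that sets or removes BrowserFlags."""
    lines = ["Windows Registry Editor Version 5.00", ""]
    for prog_id in prog_ids:
        lines.append(f"[HKEY_LOCAL_MACHINE\\SOFTWARE\\Classes\\{prog_id}]")
        if open_outside_browser:
            # Hex 8: open the document in its own application window.
            lines.append('"BrowserFlags"=dword:00000008')
        else:
            # In .reg syntax, "Name"=- deletes the named value.
            lines.append('"BrowserFlags"=-')
        lines.append("")
    return "\n".join(lines)


reg_text = browser_flags_reg(["Excel.Sheet.8", "Excel.Sheet.12"])
print(reg_text)
```

Saving the printed text to a file with a .reg extension and double-clicking it should import the settings, provided you have administrator rights.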
The respective Microsoft support notes related to this issue are:-
Note 162059
Note 927009
Note 254918