Parachuting presidential candidates, the analysis of rumors, Monopoly, Queen and Mozart. Your latest helping of The Week In Data is here, and expects delays in the arrival of train data.
Paule d'Atha is the name of Owni's data journalism team: Julien Goetz, Sylvain Lapoix and Nicolas Patte. On Twitter: @pdatha.
Let’s get straight into it this week with a contest that has generated plenty of buzz since it launched less than a month ago – #Googleviz. We mentioned one of the first web-apps entered, ReTwhit2012, in last week’s edition, which suggested that the level of competition was going to be serious. Turns out it was.
The winning entry was Mediarena, designed and developed by Nils Grünwald, Stéphane Raux, Alexis Jacomy and Ronan Quidu. Everything is there at first glance: the angle is clear – how the mainstream online media cover the French presidential campaign – while the interaction is more than intuitive. With a few clicks, the user can play around with the data and scroll through the list of headlines. Beyond the simplicity and readability, Mediarena gives the user access to a huge amount of data that provides context and depth to their chosen angle.
Also competing were the designers of Partie 2 Campagne (“A Day in the Countryside”, a nice tribute to Raymond Depardon’s film 1974, une partie de campagne). While the design of their web-app is a little less slick, the layout of their data is really interesting and innovative. The user enters through a tag cloud that gives an overview of the main topics covered by the media and politicians (with a little graphic comparing them when you hover over each word).
Once the user has made their choice, the second screen immerses us in the “delta” (HTML5 + canvas version), showing the chosen topic and its tributary subjects. Clicking on each term gives access to the data: trend analysis, a display of sources (political and media) and even a list of YouTube reference videos.
The team from Haploid chose a slightly more fun angle, at least in terms of visualization, with Who will be parachuted into the Élysée?. The candidates are physically parachuted towards the presidential palace, whose center of gravity inexorably attracts everyone. The basic data for each candidate is their official Twitter account linked to their political party’s account. They use a correlation between the number of followers and the number of retweets to determine, in real time, who is closest to the Holy Grail.
Around them float planets representing topics identified daily on Twitter. The interaction is quite intuitive: if the user hovers over a candidate, the link between them and the topic-planets is displayed, while clicking on each element displays the corresponding Twitter feed.
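Haploid haven't published their exact formula, so purely as a hypothetical sketch, a “proximity to the Élysée” score might blend follower and retweet counts something like this (the weighting and the candidate figures below are invented for illustration):

```python
def proximity_score(followers: int, retweets: int, weight: float = 0.5) -> float:
    """Blend follower count and retweet count into a single score.

    `weight` sets the relative importance of followers vs. retweets.
    This formula is a guess for illustration, not Haploid's actual method.
    """
    return weight * followers + (1 - weight) * retweets

# Invented example figures, not real campaign data.
candidates = {
    "candidate_a": proximity_score(200_000, 1_500),
    "candidate_b": proximity_score(150_000, 4_000),
}

# The candidate with the highest score would be drawn closest to the palace.
closest = max(candidates, key=candidates.get)
```

Re-running a computation like this at regular intervals is what would let the visualization shift candidates toward or away from the palace in (near) real time.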
Before we leave hexagonal data entirely, let’s break out the sacred #opendata tag to catch up with those folk who want us to take the train more often. data.sncf.com: eleven letters and two full stops, enough to make the csv-junkies amongst us salivate. Except, once the homepage loaded, we were left speechless: not a single dataset to load into a spreadsheet. In its place is a call for debate: “Open Data, Open Debate.” As for the “win-win” model advocated in the short blurb, we’ll pass.
Dear transporters of humanity, please be aware that the Open Data debate has been around for a while now, and the best way to move it forward would have been to give us your data. The wait goes on for some innovation from the data hoarders.
Our friends at the Guardian have been playing around with some facts and figures for our entertainment and edification again. Obviously no strangers to the concept of the long tail, they have released a very nice interactive visualization about an event that happened nearly four months ago: the London riots.
Alastair Dant and his colleagues decided to analyze how rumors evolved on Twitter during these events. Rumors like: “The rioters have released the animals in London Zoo” and “The rioters are making their own sandwiches in McDonald's.” You choose one of seven rumors – five false, one unfounded and one accurate – and watch it evolve. The replay is intelligently constructed: tweets are visually identified as supporting the rumor, opposing it, questioning it or simply commenting on it, and the key moments in the rumor's spread are highlighted. And as the Guardian love to share, they’ve even given us a “making of” for this data visualization, which offers an insight into the importance of teamwork between journalists, developers, designers and academics.
To Spain next: another movement, Occupy Wall Street, and another visualization, this one designed by Numeroteca. While their report sorely lacks interactive ways to explore the data, the principle of the project still deserves our attention. The goal is to compare the treatment of a subject on the front pages of the major U.S. newspapers with the number of tweets per day on the same subject. The visualization lets us not only compare the two media presences on a graph but also see where the subject physically sat on each front page. The same type of study was conducted to compare the treatment of the Arab Spring on the front pages of the main Spanish newspapers and on Twitter.
Let’s go across the Atlantic to crowdsource the future of computing. This is what the New York Times proposes through an efficient vertical timeline that divides the subject into four main topics: Computation, Artificial Intelligence, Transportation & Lifestyle, and Communication. Nothing comes from nothing, and these are the foundations for the whole history of computing in its broadest sense, from Napier’s bones in 1617 up to the historic year of 2011 when Watson, the supercomputer designed by IBM, beat two Jeopardy! champions.
What will tomorrow bring? From 2012, there is a large blackboard where the public’s predictions are displayed. You can no longer post your predictions, but you can still interact in two ways. Either by moving the events displayed on the timeline post-2011, or by voting for the proposals that seem most interesting or realistic. The top rated will be gradually included in the predictive part of the timeline.
Winter is coming and with it the stereotype of long evenings at the fireplace. Some of you might find yourselves dusting off your Monopoly board for a game or two, so let us give you a tip. A California developer named Ben Jones had some fun creating a model out of the statistics from 60,000 random games of Monopoly. His Dominate Family Game Night presents a bunch of different game strategies based on the most popular trends. Sssh, you didn’t hear it from us.
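We don't know Jones's exact methodology, but the core idea is easy to reproduce: simulate many turns around the 40-square board and count where the token lands most often. A minimal toy version, ignoring jail, Chance cards and doubles (so the numbers will not match Jones's model):

```python
import random
from collections import Counter

def simulate_landings(turns: int, seed: int = 42) -> Counter:
    """Roll two dice repeatedly around a 40-square board and count
    landings per square. A deliberately simplified toy model: it skips
    jail rules, cards and doubles, which skew real Monopoly landings."""
    random.seed(seed)  # fixed seed so the simulation is repeatable
    counts = Counter()
    position = 0
    for _ in range(turns):
        position = (position + random.randint(1, 6) + random.randint(1, 6)) % 40
        counts[position] += 1
    return counts

landings = simulate_landings(60_000)
most_visited = landings.most_common(3)  # the three busiest squares
```

In the full game it is precisely the skipped rules (Go To Jail, the cards) that make some squares, like the orange set, land-heavier than others — which is where strategy tips come from.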
Finally, while we won’t be launching our own competition just yet, we are issuing a call. The Internet is filled with so much WTF data just waiting for some crazy small teams to visualize it all. Such as, for example, the best-selling 45s/singles of all time in France (thank you @Pirhoo), or the complete correspondence between Wolfgang Amadeus Mozart and his family: nearly 1,400 letters sorted by date, location, sender, recipient and works mentioned. Serious scraping.
We end this week with music (you’ll note the reverse chronology from the last edition of The Week In Data) and a dataviz for the ears. With Bohemian Rhapsicord, created at Music Hack Day Boston, Jennie and Paul Lamere have beautifully deconstructed Queen’s legendary song Bohemian Rhapsody. Divided into its various constituent sequences, the song can then be replayed any way the user likes: either apply one of the filters on offer (duration, volume, reverse, similarity), or assign a key on your keyboard to each segment and rebuild the musical puzzle yourself. Only downside: the web-app only works in Chrome.
Find previous editions of The Week In Data!