Supermarkets have always kept track of how people shop, but in the last few years the extent to which retailers collect data has rocketed. Tesco owns a majority stake in Dunnhumby Ltd, which carries out data mining and analysis for a large group of companies including Coca-Cola, BT, Mars, Vodafone and other leading brands. Dunnhumby operates the Tesco Clubcard scheme: using data collected from the scheme, Tesco can predict when people will shop, how they'll pay for their items and even how many calories they will consume. Dunnhumby recently reported a 32 per cent rise in operating profits to £53.4 million, and has grown from 300 employees at the start of 2007 to nearly 1,250 this year. The data collected by Dunnhumby has changed the way we shop.
Dating site OkCupid.com runs a regular series of blog posts analysing data from the site's 3.5 million active users. By collecting user profiles and site messages, it's possible to calculate everything from the perfect profile picture (apparently, one taken on a high-end camera, in the mid-afternoon, without a flash) to the right language to use when replying to messages ("your" beats "ur", and "hot" is a turn-off, whereas "fascinating" is a turn-on). The data has also shown that, on average, users add two inches to their height and over-report their salary by 20 per cent. The large datasets collected by dating sites have also attracted academics, so look out for more data-based date advice in the future.
Business deliveries
In the last few years, advanced mapping tools have allowed businesses to use data to increase the efficiency of their deliveries. Where companies used to plan deliveries on paper and in small teams, they now use advanced mapping software, routing data and live traffic information. MapMechanics produces several of these tools, with its clients delivering everything from the Yellow Pages to milk. Those clients now do their deliveries using complex and live-updated views into data that used to be limited to the shift manager's desk.
The way people find shops could also be about to change, thanks to a new view into an existing set of data. Since Google introduced Street View three years ago, rival services have appeared from the likes of Microsoft and Mapquest. Microsoft's Bing Streetside (its version of Street View) has been remixed by a group of researchers, turning it into Street Slide. Street Slide takes the data from Streetside and turns it into a strip of businesses with clickable logos and building numbers. It's a different and intuitive view into the average shopping street, and could change window browsing forever.
The LA Times has been running a story based on new data that has shaken up how schools are assessed in the city. Until now, parents have used overall school test results to assess the quality of teaching at local schools, a measure that reflects the circumstances of the parents more than it does the quality of the teaching. The LA Times obtained data on test scores from 600,000 students between 2002 and 2009, allowing it to calculate "value added" scores, a measure of the progress students have made between different stages of education. This analysis shows that some schools have improved the academic achievement of their students at a greater rate than other, more respected schools. Although questions have been raised about the methodology behind the analysis, it's still a great example of how new and previously unseen data can add a different perspective on a subject that affects everyone's life.
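To make the "value added" idea concrete, here is a minimal sketch in Python. It assumes value added is simply the average per-student gain between two test stages, which is a simplification for illustration and not the LA Times's actual methodology. The function name, school names and scores are all hypothetical.

```python
from statistics import mean


def value_added(scores_before, scores_after):
    """Average per-student gain between two test stages (simplified assumption)."""
    gains = [after - before for before, after in zip(scores_before, scores_after)]
    return mean(gains)


# Hypothetical scores, invented purely for illustration.
school_a = {"before": [80, 85, 90], "after": [82, 86, 91]}  # high raw scores, small gains
school_b = {"before": [50, 55, 60], "after": [60, 66, 69]}  # lower raw scores, large gains

va_a = value_added(school_a["before"], school_a["after"])
va_b = value_added(school_b["before"], school_b["after"])

# Raw results favour school A, while the progress measure favours school B.
raw_a = mean(school_a["after"])
raw_b = mean(school_b["after"])
```

In this invented example the school with the lower overall results shows the larger gains, which is the kind of reversal the analysis brought to light.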
Government spending data is being released on a much greater scale, with the release of COINS spending data to be supplemented by itemised spending above £500 from local government. Several bodies have appeared that aim to provide a clear picture of how the Government spends money, including Where Does My Money Go?, OpenlyLocal and Armchair Auditor. Although they're operating on a relatively small scale at the moment, they've achieved a lot in a short time. It's not a stretch of the imagination to see WDMMG? achieving its ultimate goal of tracing where everybody's tax money, down to the nearest penny, has gone. The London Datastore and Data.gov.uk are campaigning for and highlighting open data releases from the Government, and the Government itself is planning a raft of data releases. With more data becoming available about how our Government operates, it'll inevitably be pressured to change.
Location data has powered the creation of several major new social networks, including Foursquare, Gowalla and Google Latitude. With the addition of Facebook Places, location data could be part of the lives of an additional 400 million users. University College London's Centre for Advanced Spatial Analysis has churned out some amazing views of location data, the latest from doctoral research student Anil Bawa-Cavia. He's recently published maps which take snapshots of data from Foursquare and turn them into a view of "social London". The data shows that Shoreditch, London Fields and Covent Garden are among the most popular locations. That may say more about Foursquare's users than it does about the "average Londoner", but Facebook Places will change all that. Bawa-Cavia thinks the data provided by Foursquare "can help us understand how the social lives of cities relate to their spatial structure. By analysing geo-social datasets we can hope to understand the basis for a more sociable, more usable city".
The Wikileaks War Diary is the most comprehensive set of data about a war ever released. Putting aside criticism of the data (and the motivations behind its release), the information contained within the reports on civilian deaths, increased attacks on coalition troops by the Taliban and revelations of Pakistan-Taliban links has demonstrated the futility of the war more comprehensively than a decade of war reporting.
No story about the use of data changing people's lives could omit a mention of Google. Unlike the other examples mentioned here, Google works with data on the petabyte scale, where traditional ways of organising data fall apart. Google relies on mathematical models and the input of more (and more, and more) data to increase its revenue and success. As Wired's Chris Anderson wrote two years ago, "Google conquered the advertising world with nothing more than applied mathematics. It didn't pretend to know anything about the culture and conventions of advertising — it just assumed that better data, with better analytical tools, would win the day. And Google was right".
Linked data and the future
The examples of data mentioned in this article are innovative, exciting and life-changing, but the best is yet to come. The majority of the information that we use in our daily lives is "dumb", or unconnected. The next step is "linked data": datasets that can talk to each other. In the UK, Tim Berners-Lee and the team behind Data.gov.uk are aiming to create a linked database of Government information. By providing all the data the Government produces in a linked format, they will allow individuals to pull in different sets of data and produce new and innovative ways of understanding how our Government and the world work.
FluidDB, a start-up company run by Terry Jones with backing from Tim O'Reilly, Esther Dyson and others, is tackling this field from a different angle. FluidDB wants to create a "writeable world", where physical objects have virtual identities that can be updated and called upon by any individual with access to the internet. That could mean tweets and status updates about everything from a brand of toothpaste to the Eiffel Tower could contribute to a collective database. The possibilities for collaboration are endless.
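As a rough illustration of what "pulling in different sets of data" can mean, the Python sketch below joins two separately published datasets on an identifier they share. Real linked data typically relies on web identifiers (URIs) and formats such as RDF; the plain dictionaries, council identifiers, figures and the `link` function here are hypothetical stand-ins invented for the example.

```python
# Two hypothetical datasets, published separately but keyed by the same council identifier.
spending = {"council-001": 1_200_000, "council-002": 800_000}
population = {"council-001": 300_000, "council-002": 100_000}


def link(left, right):
    """Join two datasets on the identifiers they have in common."""
    return {key: (left[key], right[key]) for key in left.keys() & right.keys()}


linked = link(spending, population)

# A new view that neither dataset offers on its own: spending per resident.
spend_per_head = {key: spend / people for key, (spend, people) in linked.items()}
```

The shared identifier is what makes the combination possible: without it, the two datasets stay "dumb" in the sense described above.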