
Data Loss, Data Gain

A couple of things came to light today, which all seem tied together by the common thread of private data.


Firstly, I noticed ma.gnolia.com was down. Aside from a frustrating domain name, they had a reasonably successful social bookmarking service. Sadly, due to lack of backup (!), they’ve lost the majority of the bookmarks/favorites that they stored on behalf of their users…

Bang, useful personal data gone.

Secondly, I tuned into “More or Less”, a great statistics-focussed radio show on the BBC, on a recommendation from my Dad. Aside from a really great interview with the author of “Sustainable Energy: Without the Hot Air”, which I’ll write about another time, the presenter mentioned Daytum. Set up by Nicholas Felton, the guy who each year publishes his meticulously collated and designed personal stats at feltron.com, the site enables you to have your own “Personal Dashboard”.


Thirdly, I spotted an ad which had a “YouData” logo on it. Smelling a 2.0 startup, I checked out the site – and yes, it’s a (US-based) service that lets you sell your attention – the old “pay me to advertise at me” model, but brought up to date.

So how do these strands tie together? Well, they are all about people realising that their own data is:

  1. Valuable and useful to them
  2. Valuable and useful to others
  3. Therefore, has a monetary value

Problem is, the bookmarks lost at Magnolia were worth more to their owners, by some margin, than what someone like YouData would pay for that data. And so that’s the opportunity – finding a way to bridge the gap between how much I value my data and time, and how much others (typically advertisers) value it. The answer may be that in most cases, that gap can’t be bridged.

Realtime – Sprint’s Widget Fest


I think realtime reporting IS the future, and dashboards that show live, pushed information are going to be ever more ubiquitous. Hardly any exist right now, but Sprint, as part of its “now” marketing campaign, has put together a great live dashboard over at http://now.sprint.com/widget/

In addition to the more common widgets, from world population to “top words being used online”, there are a bunch of more unusual ones, such as “911 calls being made”, “sticky notes being produced” and “transplants today”. Some of the more amusing ones are:

- A “push now” button, which (predictably) does nothing, but reports that 66,713 other people have clicked it
- “You, now”, which takes your webcam feed to show you, now
- A “habitable planets” counter

While you’re browsing all of that, a female voiceover provides more realtime data, such as “The earth will travel 18 miles between right now… and now”
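As a rough sanity check on that voiceover line, here is a minimal sketch in Python. It assumes the claim refers to the earth's motion around the sun, and it approximates the orbit as a circle with a radius of one astronomical unit; the function names are my own, purely for illustration.

```python
import math

# Assumptions (not from the Sprint site): a circular orbit of radius 1 AU,
# and a year of 365.25 days.
AU_KM = 149_597_870.7
SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60
KM_PER_MILE = 1.609344


def orbital_speed_miles_per_second():
    """Average orbital speed of the earth, in miles per second."""
    circumference_km = 2 * math.pi * AU_KM
    return circumference_km / SECONDS_PER_YEAR / KM_PER_MILE


def seconds_to_travel(miles):
    """How long the earth takes to cover the given distance along its orbit."""
    return miles / orbital_speed_miles_per_second()


if __name__ == "__main__":
    print(f"Orbital speed: {orbital_speed_miles_per_second():.1f} miles per second")
    print(f"Time to travel 18 miles: {seconds_to_travel(18):.2f} seconds")
```

Under those assumptions the speed works out at between 18 and 19 miles per second, so the pause between "right now" and "now" needs to be roughly one second for the claim to hold.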

Genius, and here’s hoping more useful versions come along soon to gadgets near me.

Privacy and StreetMaps, Again!

I’ve been interviewed twice now (on local radio, nothing too mind-blowing) about Google Street Maps and Privacy.

On one level, it’s the same knee-jerk reaction that happened when the service launched Stateside. A lot of stuff about “what if I’m captured coming out of X-place, or holding hands with Y”. Well, here’s the news: it won’t usually be Google StreetMaps that catches us out in those moments…

On another level, stories of people stopping or barricading the Google StreetMaps car have made people think there might be something more to this – and when Google move to countries where privacy is a bigger issue, what will happen then?

My take on this: privacy IS being eroded, on a daily basis, around the world.   That’s just a fact.  Google can blur as many faces as it wants, but I’m being tracked by cameras, URL tracking software, mobile/cellphone masts – and guess what, Google: my car, my branded van (if I had one), my house are all still personally identifiable.

Two things make this loss of privacy okay:

  • The benefits of the technology that comes with it (including StreetMaps) outweigh the risks by a seriously large factor
  • There is SO MUCH DATA, that no-one and nothing can really do anything that worrying or invasive with it.  There’s too much of it being gathered, and most of it is never looked at.  At least for now, and in countries that don’t have some sort of evil regime in power…

It may be true that Google is doing this to make money, but essentially they’re just putting online what we can walk to on our own two legs and see for ourselves. So let’s calm down, enjoy the benefits, and only go out at night with a hoodie pulled over our faces.

Connecting things that aren’t connected

Humans tend to make connections between things, even when those connections don’t exist.  Our brains are constantly trying to rule-build and organise, and often get it wrong.

Today for a while, when a plane passed overhead (they do often where I am), the bulb on my desk lamp dimmed. I, of course, assumed the two events were related.  The fact is, planes passed over every couple of minutes, and the light only dimmed every half hour, and I’ve just now found it’s because I was kicking the cable under the table without knowing it.  They’re unconnected…

That’s what psychologists call an illusory correlation – perceiving a connection between two things that the data doesn’t actually support. (It’s also a tongue-twister.)
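To see why the plane-and-lamp coincidence is so convincing, here is a small simulation in Python. It is only a sketch under assumptions of my own: each minute is treated independently, a plane is overhead in about half of them ("every couple of minutes"), and the lamp dims in about one minute in thirty ("every half hour"). The function names and the phi coefficient as the measure of association are my choices for illustration.

```python
import math
import random


def simulate(minutes=100_000, p_plane=0.5, p_dim=1 / 30, seed=0):
    """Count minutes by (plane overhead?, lamp dimmed?) for two independent events."""
    rng = random.Random(seed)
    both = plane_only = dim_only = neither = 0
    for _ in range(minutes):
        plane = rng.random() < p_plane
        dim = rng.random() < p_dim
        if plane and dim:
            both += 1
        elif plane:
            plane_only += 1
        elif dim:
            dim_only += 1
        else:
            neither += 1
    return both, plane_only, dim_only, neither


def phi(both, plane_only, dim_only, neither):
    """Phi coefficient of a 2x2 table: 0 means no association, +/-1 a perfect one."""
    numerator = both * neither - plane_only * dim_only
    denominator = math.sqrt(
        (both + plane_only)
        * (dim_only + neither)
        * (both + dim_only)
        * (plane_only + neither)
    )
    return numerator / denominator if denominator else 0.0


if __name__ == "__main__":
    both, plane_only, dim_only, neither = simulate()
    share = both / (both + dim_only)
    print(f"Share of dimmings with a plane overhead: {share:.0%}")
    print(f"Phi coefficient: {phi(both, plane_only, dim_only, neither):+.3f}")
```

Roughly half of the dimmings coincide with a plane, which is plenty to feel like a pattern, yet the phi coefficient stays close to zero because the dimmings are just as common when no plane is overhead.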

Sod’s law (Murphy’s Law) is an example – we tend to connect negative events, and ignore positive (or neutral) ones. How often have you been driving along, only to be confronted at the top of a hill and round a bend with a truck that’s halfway across the road? “Always happens at the top of a hill and round a bend, typical!” you’ll think. Obviously, 99% of the time it doesn’t, but we’ll remember the times it does.

So why is this important? Well, it usually isn’t, because we muddle along anyway. It can get odd when unexplained events (lights in the sky) are connected with unconfirmed causes (UFOs from outer space). Or when people reason that “there’s no smoke without fire”, a belief that has probably convicted a fair number of innocent people.

My interest is because at my company, Cognitive Match (of which Favy is now a part), we’re focussed on ways of making REAL connections in observed data. And equally, I guess, uncovering the “illusory” ones…

Exploring the news, visually

I’ve said on here before that not enough visualisation is being used, so it’s great to find new ideas – even if they’re not totally useable/useful. DoodleBuzz is one such example; you could call it the Zen of news exploration. Even after 5 minutes of doodling, I have relatively little idea of what’s going on. But I do feel a lot calmer.