<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Alex Kelleher's Blog &#187; Statistics</title>
	<atom:link href="http://blog.alexkelleher.com/category/statistics/feed/" rel="self" type="application/rss+xml" />
	<link>http://blog.alexkelleher.com</link>
	<description>Psychology, data, future gazing, digital marketing and the internet.</description>
	<lastBuildDate>Sat, 15 May 2010 22:23:33 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0</generator>
		<item>
		<title>Connecting things that aren&#8217;t connected</title>
		<link>http://blog.alexkelleher.com/2009/03/01/connecting-things-that-arent-connected/</link>
		<comments>http://blog.alexkelleher.com/2009/03/01/connecting-things-that-arent-connected/#comments</comments>
		<pubDate>Sun, 01 Mar 2009 22:31:25 +0000</pubDate>
		<dc:creator>admin</dc:creator>
				<category><![CDATA[AI]]></category>
		<category><![CDATA[Psychology]]></category>
		<category><![CDATA[Statistics]]></category>
		<category><![CDATA[data mining]]></category>

		<guid isPermaLink="false">http://blog.alexkelleher.com/?p=228</guid>
		<description><![CDATA[Humans tend to make connections between things, even when those connections don&#8217;t exist.  Our brains are constantly trying to rule-build and organise, and often get it wrong. For a while today, whenever a plane passed overhead (as they often do where I am), the bulb on my desk lamp dimmed. I, of course, assumed the two events [...]]]></description>
			<content:encoded><![CDATA[<p>Humans tend to make connections between things, even when those connections don&#8217;t exist.  Our brains are constantly trying to rule-build and organise, and often get it wrong.</p>
<p>For a while today, whenever a plane passed overhead (as they often do where I am), the bulb on my desk lamp dimmed. I, of course, assumed the two events were related.  In fact, planes passed over every couple of minutes while the light only dimmed every half hour, and I&#8217;ve just now found the real cause: I was kicking the cable under the table without knowing it.  They&#8217;re unconnected&#8230;</p>
<p>That&#8217;s what psychologists call an <strong>illusory correlation</strong> &#8211; the false connection of two things, based on data. (It&#8217;s also a tongue-twister.)</p>
<p>Sod&#8217;s law (Murphy&#8217;s Law) is an example &#8211; we tend to connect negative events, and ignore positive (or neutral) ones.  How often have you been driving along, only to be confronted at the top of a hill and round a bend with a truck that&#8217;s halfway across the road?  &#8220;Always happens at the top of a hill and round a bend, typical!&#8221; you&#8217;ll think.  Obviously, 99% of the time it doesn&#8217;t, but we&#8217;ll remember the times it does.</p>
<p>So why is this important?  Well, it usually isn&#8217;t, because we muddle along anyway.  It <em>can</em> get odd when unexplained events (lights in the sky) are connected with unconfirmed causes (UFOs from outer space).  Or when people reason that &#8220;there&#8217;s no smoke without fire&#8221;, an assumption which has probably convicted a fair number of innocent people.</p>
<p>I&#8217;m interested because at my company, <a href="http://www.cognitivematch.com" target="_blank">Cognitive Match</a> (of which Favy is now a part), we&#8217;re focussed on ways of making REAL connections in observed data.  And equally, I guess, on uncovering the &#8220;illusory&#8221; ones&#8230;</p>
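<p>The desk-lamp story can be put into rough numbers. The sketch below is only an illustration: the two-minute and half-hour intervals come from the story, while the 30 seconds a plane stays overhead and both function names are assumptions made up for the example.</p>

```python
# Rough numbers for the desk-lamp story.
PLANE_INTERVAL_S = 120   # a plane "every couple of minutes" (from the post)
DIM_INTERVAL_S = 1800    # the lamp dimmed about every half hour (from the post)
PLANE_OVERHEAD_S = 30    # ASSUMPTION: how long each plane is noticeable


def chance_coincidence_rate(plane_interval_s, plane_overhead_s):
    """Fraction of unrelated, randomly timed events that land while a plane is overhead."""
    return plane_overhead_s / plane_interval_s


def planes_per_dimming(dim_interval_s, plane_interval_s):
    """How many planes pass for every single dimming of the lamp."""
    return dim_interval_s / plane_interval_s


rate = chance_coincidence_rate(PLANE_INTERVAL_S, PLANE_OVERHEAD_S)
planes = planes_per_dimming(DIM_INTERVAL_S, PLANE_INTERVAL_S)

print(f"Dimmings that coincide with a plane by pure chance: {rate:.0%}")
print(f"Planes passing per dimming: {planes:.0f}")
```

<p>Under those assumptions, a quarter of the dimmings would line up with a plane by chance alone, while fifteen planes pass for each dimming. Remembering the hits and forgetting the misses is enough to produce the illusion.</p>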
]]></content:encoded>
			<wfw:commentRss>http://blog.alexkelleher.com/2009/03/01/connecting-things-that-arent-connected/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>China has more internet users than any other country</title>
		<link>http://blog.alexkelleher.com/2008/09/26/china-has-more-internet-users-than-any-other-country/</link>
		<comments>http://blog.alexkelleher.com/2008/09/26/china-has-more-internet-users-than-any-other-country/#comments</comments>
		<pubDate>Fri, 26 Sep 2008 08:55:05 +0000</pubDate>
		<dc:creator>admin</dc:creator>
				<category><![CDATA[Statistics]]></category>
		<category><![CDATA[Web]]></category>

		<guid isPermaLink="false">http://blog.alexkelleher.com/?p=187</guid>
		<description><![CDATA[253 million at the latest count, according to the government agency China Internet Network Information Centre.  Impressively, 214 million of those are broadband users &#8211; but the biggest growth is in mobile phone access.  Apparently, the &#8220;Great Firewall of China&#8221; is still blocking or rendering unusable large numbers of sites, so any strategy looking to address [...]]]></description>
			<content:encoded><![CDATA[<p>253 million at the latest count, according to the government agency <a href="http://www.cnnic.net.cn" target="_blank">China Internet Network Information Centre</a>.  Impressively, 214 million of those are broadband users &#8211; but the biggest growth is in mobile phone access.  Apparently, the &#8220;<a href="http://www.developingtelecoms.com/content/view/1404/26/" target="_blank">Great Firewall of China</a>&#8221; is still blocking or rendering unusable large numbers of sites, so any strategy looking to address China needs to have local hosting!</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.alexkelleher.com/2008/09/26/china-has-more-internet-users-than-any-other-country/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>13 hours every minute&#8230;</title>
		<link>http://blog.alexkelleher.com/2008/09/16/13-hours-every-minute/</link>
		<comments>http://blog.alexkelleher.com/2008/09/16/13-hours-every-minute/#comments</comments>
		<pubDate>Tue, 16 Sep 2008 16:53:38 +0000</pubDate>
		<dc:creator>admin</dc:creator>
				<category><![CDATA[Statistics]]></category>
		<category><![CDATA[Video]]></category>

		<guid isPermaLink="false">http://blog.alexkelleher.com/?p=173</guid>
		<description><![CDATA[&#8230; is how much video is uploaded to Google&#8217;s YouTube, according to their official blog post today.  And that&#8217;s an &#8220;exponentially growing&#8221; statistic, they believe.  That&#8217;s 18,720 hours a day, or 780 days of video every day.  Still with me?  It&#8217;s a lot.  Of course, there will be a long tail of this stuff which is never seen [...]]]></description>
			<content:encoded><![CDATA[<p>&#8230; is how much video is uploaded to Google&#8217;s YouTube, according to their <a href="http://googleblog.blogspot.com/2008/09/future-of-online-video.html">official blog</a> post today.  And that&#8217;s an &#8220;exponentially growing&#8221; statistic, they believe.  That&#8217;s 18,720 hours a day, or 780 days of video every day.  Still with me?  It&#8217;s a lot.</p>
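<p>For anyone who wants to check the arithmetic, here is a minimal Python sketch. The 13 hours per minute figure is the one quoted above; the function name is made up for illustration.</p>

```python
# Upload rate quoted in the post: 13 hours of video per minute.
HOURS_UPLOADED_PER_MINUTE = 13
MINUTES_PER_DAY = 60 * 24


def upload_volume_per_day(hours_per_minute):
    """Return (hours, days) of video uploaded during one calendar day."""
    hours = hours_per_minute * MINUTES_PER_DAY
    days = hours / 24
    return hours, days


hours, days = upload_volume_per_day(HOURS_UPLOADED_PER_MINUTE)
print(f"{hours} hours a day, or {days:.0f} days of video every day")
```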
<p>Of course, there will be a long tail of this stuff which is never seen by more than the person who created it, and at the top end there will be a small percentage that are viewed a lot.  Sure enough, <a href="http://www.youtube.com/results?search_query=%2A&amp;search_sort=video_view_count">a wildcard type search</a> (searching for &#8220;*&#8221;, if that&#8217;s valid) turns up the top video with <strong>101 million</strong> views (a music video, like a lot in the top results of that list)&#8230;  Wikipedia&#8217;s got some notes on the <a href="http://en.wikipedia.org/wiki/Long-tail_traffic#The_heavy-tail_distribution">&#8220;heavy tail&#8221; distribution,</a> which I&#8217;m guessing is what this is.</p>
<p>The important takeaway is that very quickly (by which I mean already) there&#8217;s too much content on YouTube for one person to make sense of.  So ways of pre-selecting, filtering and locating stuff of interest &#8211; as in every area online now &#8211; are needed, beyond just search&#8230;</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.alexkelleher.com/2008/09/16/13-hours-every-minute/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Reality Mining</title>
		<link>http://blog.alexkelleher.com/2008/07/31/reality-mining/</link>
		<comments>http://blog.alexkelleher.com/2008/07/31/reality-mining/#comments</comments>
		<pubDate>Thu, 31 Jul 2008 16:52:14 +0000</pubDate>
		<dc:creator>admin</dc:creator>
				<category><![CDATA[Profiling]]></category>
		<category><![CDATA[Statistics]]></category>
		<category><![CDATA[data mining]]></category>
		<category><![CDATA[geotargeting]]></category>
		<category><![CDATA[reality mining]]></category>

		<guid isPermaLink="false">http://blog.alexkelleher.com/?p=100</guid>
		<description><![CDATA[What does your cell phone know about you?  Well, a fair bit according to researchers at MIT.  For instance, they claim they can divine, among other things: - how happy and productive you are - your social status - your social group Fundamentally this is just an extension of any form of data mining &#8211; [...]]]></description>
			<content:encoded><![CDATA[<p>What does your cell phone know about you?  Well, a fair bit <a href="http://reality.media.mit.edu/">according to researchers at MIT</a>.  For instance, they claim they can divine, among other things:</p>
<p>- how happy and productive you are<br />
- your social status<br />
- your social group</p>
<p>Fundamentally this is just an extension of any form of data mining &#8211; take a large amount of data, and try to make some determinations from it.  The examples based on social group and status can be fairly easily explained &#8211; by where you spend your time (the types of shop, street, district), and by the other mobile phones that yours tends to hang out with.  The link to happiness and productivity was a correlation they discovered when they combined location and call data with questionnaires.</p>
<p>The same group are doing some interesting work with other areas that use mobile data &#8211; such as &#8220;social serendipity&#8221; &#8211; trying to match users that happen to be in similar locations, and that have similar profiles or interests.  People have tried to release products into that space for as long as I can remember, but no-one&#8217;s yet cracked it, so it will be interesting to see if this research helps.</p>
<p>A lot of reality mining to date has been to do with mobiles (like <a href="http://www.readwriteweb.com/archives/yahoo_reality_mining.php">BlueTooth MyBlogLog</a>), but obviously anything that can sense us and feed data about us will add to this: cars, PCs, toasters&#8230; The more, to my mind, the merrier.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.alexkelleher.com/2008/07/31/reality-mining/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>1 trillion unique URLs</title>
		<link>http://blog.alexkelleher.com/2008/07/26/1-trillion-unique-urls/</link>
		<comments>http://blog.alexkelleher.com/2008/07/26/1-trillion-unique-urls/#comments</comments>
		<pubDate>Sat, 26 Jul 2008 15:08:36 +0000</pubDate>
		<dc:creator>admin</dc:creator>
				<category><![CDATA[Statistics]]></category>
		<category><![CDATA[google]]></category>

		<guid isPermaLink="false">http://blog.alexkelleher.com/?p=85</guid>
		<description><![CDATA[Google search engineers hit the new milestone of 1 trillion unique URLs, a number which is growing at &#8220;several billion per day&#8221;.  Even ignoring duplicates, and assuming a lot of pages get shelved as unimportant (endless calendar day pages, empty or forgotten pages, etc.), that&#8217;s a lot of content.  That&#8217;s 166 URLs for each person on the [...]]]></description>
			<content:encoded><![CDATA[<p>Google search engineers <a href="http://googleblog.blogspot.com/2008/07/we-knew-web-was-big.html">hit the new milestone</a> of 1 trillion unique URLs, a number which is growing at &#8220;several billion per day&#8221;.  Even ignoring duplicates, and assuming a lot of pages get shelved as unimportant (endless calendar day pages, empty or forgotten pages, etc.), that&#8217;s a lot of content.  That&#8217;s 166 URLs for each person on the planet, and 10 for each star in the galaxy (assuming 100 billion stars).  So, quite a lot.</p>
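<p>Those per-person and per-star figures can be reproduced with a short Python sketch. The 1 trillion URLs and 100 billion stars are the numbers used above; the world population of 6 billion is an assumption (it is the round figure that gives 166), and the function name is made up for illustration.</p>

```python
UNIQUE_URLS = 1_000_000_000_000      # 1 trillion, from Google's announcement
STARS_IN_GALAXY = 100_000_000_000    # 100 billion, the post's own assumption
WORLD_POPULATION = 6_000_000_000     # ASSUMPTION: roughly 6 billion people


def urls_per(count):
    """Whole number of unique URLs available per item in a population of `count`."""
    return UNIQUE_URLS // count


print(f"{urls_per(WORLD_POPULATION)} URLs for each person on the planet")
print(f"{urls_per(STARS_IN_GALAXY)} URLs for each star in the galaxy")
```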
<p>While we&#8217;re pushing out big numbers, here are some more to goggle at&#8230; They&#8217;re not sourced (some of them are estimated, some might be wildly out &#8211; but all were spotted on fairly reputable sites).</p>
<ul>
<li>1.4 billion internet users</li>
<li>50 billion videos viewed online in February</li>
<li>3.3 billion searches on Baidu per month</li>
<li>500 million videos on YouTube</li>
<li>4.1 billion photos on Facebook</li>
<li>2 billion images on Flickr</li>
<li>533 million results for “insurance” search</li>
<li>10 million articles on Wikipedia</li>
<li>3 billion songs sold on iTunes</li>
<li>100 million MySpace members</li>
<li>10 million songs scrobbled on last.fm a day</li>
</ul>
<p>Pretty overwhelming, huh.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.alexkelleher.com/2008/07/26/1-trillion-unique-urls/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
