The Rise of Social Data Mining (as a Business Model)

Companies are discovering how to monetize social network data.  This is driving Big Open Science.  Is that a good thing?

A social network for sharing illness data, patientslikeme.com, has demonstrated that it can tap the information in its user network to predict the outcome of clinical drug trials.  The service, which is populated by a large number of ALS sufferers, determined that lithium use had no effect on the late-stage decline in ALS patients.  Why is this significant? Because it took 18 months before a formal study was able to confirm exactly the same thing.

While clearly not yet a replacement for the clinical trial process, the findings do reinforce the concept of Big Open Science - the use of large data sets to conduct a rougher, more rapid form of science.

The financial model is clever and solid:
We take the information patients share about their experience with the disease, and sell it in a de-identified, aggregated and individual format to our partners (i.e., companies that are developing or selling products to patients). These products may include drugs, devices, equipment, insurance, and medical services.  We do not rent, sell or share personally identifiable information for marketing purposes or without explicit consent.  Because we believe in transparency, we tell our members exactly what we do and do not do with their data. 
So long as the data remains totally secure, it sure reads like a win-win to me.  I can see many quantified health start-ups adopting or moving towards this model.

But, aside from the health data, it's not really all that new.

Focus groups and stock markets have been around for hundreds of years.  More recently, Hollywood Stock Exchange (HSX), a movie performance predictor site oft cited by collective intelligence researchers,  has clearly demonstrated its ability to forecast box office revenues via crowdsourcing.  And, of course, don't forget the banks, credit card companies and info aggregators that can already "predict with 95% certainty that you will get a divorce, two years before it happens, based on your purchases", as Google's Marissa Mayer famously pointed out on Charlie Rose.

In the coming years we can expect this sort of model to proliferate.  Trends like cheaper data storage, smaller sensing devices, widening bandwidth, exponentially faster computing and emergent social behavior suggest that more companies will be able to mine more valuable data from more willing participants and sell it to more interested parties.  Barring an all-out privacy backlash, it's a relatively safe bet that the broader market will create the conditions necessary for more social data mining startups and operations like these.

The new opportunities are seemingly endless:
  • health & medicine sites - like patientslikeme or curetogether (shout out to Alexandra Carmichael)
  • location video - streamed and stored on YouTube
  • driving information - gathered through your car
  • smartphone apps - more complex data capture, reality mining
  • genome - companies like 23andme
  • etcetera!
At the same time, consider how certain large companies can leverage Big Open Science (they're already doing it for market research, but could easily broaden these efforts):
  • Search: Google, Bing, Yahoo Search
  • Social: Facebook, MySpace, Twitter
  • Gaming: Sony, Xbox, Nintendo, Apple
  • Smartphones: Apple, Microsoft, HTC
Privacy concerns aside (for the moment), there's such an abundance of untapped informational value that it's easy to envision a world in which total productivity grows by leaps and bounds - as the square of the acceleration in the technology, data and communications space - a sentiment echoed by Wired writer and Quantified Self blogger Gary Wolf at a recent Stanford MediaX seminar. (When I asked him whether he believed that Quantification was directly related to Kurzweil's Law of Accelerating Returns, he thought about it for a moment, then said "yes".)

Now, will these quantification-driven economic gains trickle down to the average person?  Yes, I do think they will.  First, in the form of accelerated science.  Second, it seems likely that the increasingly abundant services and social networks in the space will be forced by the market to return more and more value to the users, or prosumers, who are contributing this data - an effect that I have playfully nicknamed The Mandate of Kevin.

But will these gains come online fast enough to offset the disruptive forces of globalization, production automation, a large-scale privacy backlash or the resulting social turmoil?  That's hard to say, because we humans have never experienced such convergence before.  However, it is becoming more and more clear that traditional economics will drive Big Open Science and that this behavior is a thread interwoven with other accelerators.  Hopefully that will turn out to be a good thing.
