Akkam’s Razor

Think outside of the box? OK. There is no box.

AOL Intentional Data Dump

August 7th, 2006

I added my own little bit of noise to the interweb on Metafilter regarding AOL's data dump of 21,000,000 queries from 650,000 customers over three months. Most are righteously outraged at AOL's stupidity, but also salivating at the unfettered access and insight it provides into web behavior. Although AOL pulled the data after the natives got restless, it was too late - the data sets are now everywhere.

There's another side to the story: the ones who say that we should not be discussing it any further, and that AOL's pulling of the data closes the book on it. From tech-recipes:

Everybody should just take a deep breath. Obviously, they released this as an attempt to help those doing research in search engine technology. The original research page states their purpose: "The goal of this collection is to provide a real query log based on users. It could be used for personalization, query reformulation or other type of search research." Although stupid, their intent was in no way malicious. AOL was notified and quickly took the download link down. Problem solved? Nope… because here comes the idiots to keep the pain going.

Just in the digg comments alone, people are not only posting torrent and rapidshare links to the information but are also pulling out private information to prove to people that the download is evil.

Everybody understands that this release of private information is a bad idea. Do we really need a thousand mirrors of this information across the web?

Is helping to faciliate the release of evil information, not just as bad as the original release of the information?

I say BULLSHIT.  The answer, davak, is no.  If anything, it speaks directly to several of the topics that are steering events on this blue globe. It is a discussion that is essential, and one we are obliged to have.

Contemplate every query being available to government and institutions of dubious reliability, which then make life-and-death decisions (as well as decisions about prosecution or military action) based on their findings. How do you know if someone was looking for child porn, or looking for advocacy groups to refer a molested child to? How do you know whether someone was looking for ways to kill their wife, or was working on a screenplay, or was just drunk and bored?

Check out the search history for user 17556639; the most recent search is at the bottom of the list. Does this look like the search history of a user wanting to do something bad? From the Paradigm Shift:

17556639 how to kill your wife
17556639 how to kill your wife
17556639 wife killer
17556639 how to kill a wife
17556639 poop
17556639 dead people
17556639 pictures of dead people
17556639 killed people
17556639 dead pictures
17556639 dead pictures
17556639 dead pictures
17556639 murder photo
17556639 steak and cheese
17556639 photo of death
17556639 photo of death
17556639 death
17556639 dead people photos
17556639 photo of dead people
17556639 www.murderdpeople.com
17556639 decapatated photos
17556639 decapatated photos
17556639 car crashes3
17556639 car crashes3
17556639 car crash photo

It's kind of like the WMD and Iraq thing, isn't it?  Neoconservatives wanted to believe that Iraq was a threat to the region and the world, so pipes became tools for enriching uranium.  Trucks which make fertilizer became toxic chemical labs.  In the absence of facts, our ideologies fill in the blanks.  Well, what if you are the subject of their scrutiny?  How many false positives would you expect to encounter?

Next is the whistleblower aspect. If we are to believe the AOL press release, it's all water under the bridge, time to move on. Again, BULLSHIT. Folk need to understand how the internet works, and what kind of breadcrumbs we leave out there. They need to understand that leaving this information in the hands of incompetents, or outsourcing it to third world nations, is a fabulously bad idea.

What are the consequences of AOL's boneheaded move? Nothing. Did you look at the stock market today? Did the market consider Time Warner's liability over this? No. Why? Because John Q. Public doesn't understand the ramifications. That's the cost of doing nothing. If there are no checks and balances on corporate behavior, then corporations will continue doing whatever they can get away with.

So, having this data out there…what's the risk? You can be damn sure that if you look at some queries, you're going to find some stalking, some porn, some illegal or unethical behavior, some infidelity. And there is a minuscule chance that someone may have searched for their name, DOB, and SSN, but that's highly unlikely.

In any case, this is a conversation worth having, and playing devil's advocate just makes you look like a dumbass. 

Totally unrelated note - OMG, there's gold in here. Yes, there are screwed up people on the internet. But look at some of these people:

#1880446: Straight guy, looking for gay sex and NY broadway shows.

#74104: Nascar hubby and sissy dad.

#72467: Has an unhealthy obsession with Villanova's Allan Ray. 
