Return of the Facebook Snatchers

First and foremost: if you want to cut to the chase, just download the torrent. If you want the full story, please read on...

Background

Way back when I worked at Symantec, my friend Nick wrote a blog post that caused a little bit of trouble for us: Attack of the Facebook Snatchers. I was blog editor at the time, and I went through the usual sign-off process and, eventually, published it. Facebook was none too happy, but we fought for it and, in the end, we got to leave the post up in its original form.

Why do I bring this up? Well, last week @FSLabsAdvisor wrote an interesting Tweet: it turns out that, by heading to https://www.facebook.com/directory, you can get a list of every searchable user on all of Facebook!

My first idea was simple: spider the lists, generate first-initial-last-name (and similar) lists, then hand them over to @Ithilgore to use in Nmap's awesome new bruteforce tool he's working on, Ncrack.

But as I thought more about it, and talked to other people, I realized that this is a scary privacy issue. I can find the name of pretty much every person on Facebook. Facebook helpfully informs you that "[a]nyone can opt out of appearing here by changing their Search privacy settings" -- but that doesn't help much anymore considering I already have them all (and you will too, when you download the torrent). Suckers!

Once I have the name and URL of a user, I can view, by default, their picture, friends, information about them, and some other details. If the user has set their privacy higher, at the very least I can view their name and picture. So, if any searchable user has friends that are non-searchable, those friends just opted into being searched, like it or not! Oops :)

The lists

Which brings me to the next topic: the list! I wrote a quick Ruby script (which has since become a more involved Nmap script that I haven't used for harvesting yet) to download the full directory. I should warn you that it isn't exactly the most user-friendly interface -- I wrote it primarily for myself, and I'm only linking to it for reference. I don't really suggest you try to recreate my spidering; it's a waste of several hundred gigs of bandwidth.

The results were spectacular. 171 million names (100 million unique). My original plan was to use this list to generate a list of the top usernames (based on first initial last name):

 129369 jsmith
  79365 ssmith
  77713 skhan
  75561 msmith
  74575 skumar
  72467 csmith
  71791 asmith
  67786 jjohnson
  66693 dsmith
  66431 akhan

Or first name last initial:

 100225 johns
  97676 johnm
  97310 michaelm
  93386 michaels
  88978 davids
  85481 michaelb
  84824 davidm
  82677 davidb
  81500 johnb
  77800 michaelc

Or even the top usernames based on first name dot last name (sorry, I can't link this one due to bandwidth concerns; but it's included in the torrent):

  17204 john.smith
   7440 david.smith
   7200 michael.smith
   6784 chris.smith
   6371 mike.smith
   6149 arun.kumar
   5980 james.smith
   5939 amit.kumar
   5926 imran.khan
   5861 jason.smith

Or even the most common first or last names:

 977014 michael
 963693 john
 924816 david
 819879 chris
 640957 mike
 602088 james
 584438 mark
 515686 jason
 503658 robert
 484403 jessica

 913465 smith
 571819 johnson
 512312 jones
 503266 williams
 471390 brown
 386764 lee
 360010 khan
 355639 singh
 343220 kumar
 324972 miller
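
For reference, counts like the ones above can be produced from a raw list of names with very little code. The following is a minimal Python sketch, not the script from the torrent (that one was written in Ruby). The function names, the format labels, and the assumption that the first whitespace-separated word is the first name and the final word is the last name are all mine, for illustration only.

```python
from collections import Counter


def username_candidates(full_name):
    """Return candidate usernames for one 'First [Middle] Last' name.

    Assumption: the first whitespace-separated token is the first name
    and the final token is the last name. Single-word names are skipped.
    """
    parts = full_name.lower().split()
    if len(parts) < 2:
        return {}
    first, last = parts[0], parts[-1]
    return {
        "first_initial_last": first[0] + last,   # e.g. jsmith
        "first_last_initial": first + last[0],   # e.g. johns
        "first_dot_last": first + "." + last,    # e.g. john.smith
        "first_name": first,
        "last_name": last,
    }


def count_usernames(names):
    """Tally every format across an iterable of full names."""
    counts = {}
    for name in names:
        for fmt, value in username_candidates(name).items():
            counts.setdefault(fmt, Counter())[value] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical sample; the real input would be the names file.
    sample = ["John Smith", "Jane Smith", "John Sanders", "Madonna"]
    for fmt, counter in count_usernames(sample).items():
        print(fmt)
        for value, n in counter.most_common(10):
            print(f"{n:8d} {value}")
```

Keeping one Counter per format means each top 10 list is a single most_common(10) call, and the same pass over the names feeds all of them.
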

So, those are the top 10 lists. But I'll bet you want everything!

The Torrent

But it occurred to me that this is public information that Facebook puts out, I'm assuming for search engines or whatever, and that it wouldn't be right for me to keep it private. Why waste Facebook's bandwidth and make everybody scrape it, right?

So, I present you with: a torrent! If you haven't downloaded it, download it now! And seed it for as long as you can.

This torrent contains:

  • The URL of every searchable Facebook user's profile
  • The name of every searchable Facebook user, both unique and by count (perfect for post-processing, datamining, etc.)
  • Processed lists, including first names with count, last names with count, potential usernames with count, etc.
  • The programs I used to generate everything

So, there you have it: lots of awesome data from Facebook. Now, I just have to find one more problem with Facebook so I can write "Revenge of the Facebook Snatchers" and complete the trilogy. Any suggestions? >:-)

Limitations

So far, I have only indexed the searchable users, not their friends. Getting their friends will be significantly more data to process, and I don't have those capabilities right now. I'd like to tackle that in the future, though, so if anybody has any bandwidth they'd like to donate, all I need is an ssh account and Nmap installed.

An additional limitation is that these are only users whose names start with a character from the Latin character set. I plan to add non-Latin names in future releases.
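
How the spider decided what counts as Latin isn't described here, so the following is only one way such a check could look. It is a small Python sketch that classifies a name by the Unicode name of its first character; the function name and the choice to treat accented letters as Latin are my assumptions, not something taken from the original scripts.

```python
import unicodedata


def starts_with_latin(name):
    """Return True if the first non-space character is a Latin letter.

    Assumption: "Latin" is judged by the character's Unicode name, so
    accented letters such as 'É' count as Latin, while Cyrillic, CJK,
    digits, and punctuation do not.
    """
    stripped = name.strip()
    if not stripped:
        return False
    # unicodedata.name() returns e.g. "LATIN CAPITAL LETTER J";
    # the "" default avoids a ValueError for unnamed characters.
    return unicodedata.name(stripped[0], "").startswith("LATIN")


if __name__ == "__main__":
    for candidate in ["John Smith", "\u00c9mile Zola", "\u0418\u0432\u0430\u043d"]:
        print(candidate, starts_with_latin(candidate))
```
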

142 thoughts on “Return of the Facebook Snatchers”

  1. Alpha

    I downloaded fbdata, Now What???????
    How to read that data????

  2. mike

    I think many people would be interested to find if they do appear on this list... a search feature to this list would be pretty nice to use against this list w/o having to waste a ton of bandwidth to download the entire list. Just an idea, not like it's very important, but would be kind of cool to have if it's not all too hard to put up.

  3. Tony

    Yeah, how can I read this data?

    Please Help me/us

  4. ChiPoLy

    Very interesting.

    I've tried to access the info page, but there are a LOT of privacy-activated profiles.

    Is there any way to access the info page without being logged in or adding the user as a friend?

  5. JonAlex

    Ok. Maybe I'm a little slow.

    Say you have 171 million email addresses that were gathered from a computer database (Facebook's).

    Can you market to this list? I thought there were CANSPAM restrictions against gathering emails in this manner?

    Assuming you can use these email addresses. This is absolutely huge!

    But that's the main issue. It is public. But the manner of gathering worries me. Maybe it's a lack of understanding on my part.

    Does anyone know whether gathering a list of public emails from a public database and then marketing to that list is CANSPAM compliant?

    Thanks!

  6. Suhaib

    Is there a way to get the email addresses of these users? If someone can make a bot to extract emails from these profiles it will be great..

  7. chandan

    now can you collect the data again and see the difference ?

  8. FB Fucker

    I found a way to do this... i wont tell... all data... including hide data.

  9. chujcidodupy

    @FB Fucker, yeah right.. faggot

  10. notworking

    cool

  11. bdev

    scraping the family database gives you the same info as going after the people db; however, its main listing page also includes the profile image for the user (i.e. one less page you'd have to crawl in order to retrieve their pic). - http://www.facebook.com/family/

  12. hamari library

    Yeah, how can I read this data?

  13. bestsecurity

    if you can, hack my facebook!!!
    change my password!!!
    facebook.com/profile.php?id=100001480926565

    prove it!!!

  14. daemon

    i want to modify it to get users from a particular country … please help or else give me a little hint

  15. Jy Johnson

    Hey Sean, how many IDs are there in total in the URL file? Also, are there more IDs of males or females?

    Thanks a lot man... U Rock!!!

  16. FacialX

    You people are dreaming. You will be identified and tied to your accounts shortly.
    Thieves, burglars, offenders, scammers will be identified to the cameras.
    You post BILLIONS of photos and show your faces everywhere. This is your number. Live right, for your true colors will be exposed.

  17. laughing so hard

    this whole post is so hilarious!

    people refuse to understand the importance of making vulnerabilities PUBLIC - an age-old discussion that should have been settled ages ago - as when vulns are public, there's a fire under the proverbial bottom of whoever's responsible (ie. FB) to fix it, and before it's public it's only accessible to the bad-guy. really people: imagine you have two bad-guys sitting in a room. if one of them started to make public his findings, would that be a good thing or a bad thing for the bad guys? OBVIOUSLY, it's a bad thing, as the hole will be patched sooner. any pentester will sabotage his evil twin by making public his findings! (as for making public instead of just telling the company in question in secret - well, i believe experience speaks for itself...ie. they never listen.)

    big score for education also! thanks a lot, great read ron & al.
    keep up the excellent work!

  18. researcher

    I am trying to see if I can get public users' Facebook statuses and all the replies to those statuses that have been published. So far I haven't had luck with the Facebook API. Do you have any suggestions?

  19. brittany

    Respect big time to you I would love just to have the spider you used for a web search you have a link thank you I shared some links to my fb wall fb is up to some thing ?? what do you think og the files sharing Google is pulling
    on the torrent sites cheers

  20. vimax

    Is this a tool or software to get the email and password of someone's Facebook account?

  21. stona

    Love your work Ron
    Have you done any more updates on this project?
    Security with fb is a problem. I see now some apps ask for your permission to access your account even when you're not using it.
    I found comments I have made on facebook showing up on other sites, and it's because friends waived their rights to apps, and as a friend can read my posts, so can the app

  22. 45798798

    Thanks for your work.
    A great database for finding information "offline".
    So I can search for information without being directly tracked by those criminals (Facebook).

    Greetz

  23. chota bheem cartoon games

    Respect big time to you I would love just to have the spider you used for a web search you have a link thank you I shared some links to my fb wall fb is up to some thing ?? what do you think og the files sharing Google is pulling
    on the torrent sites cheers

  24. diny

    Dear,

    I have a question

    how can I firstname and lastname
    link in the torrent?
