'Part 1: Hacking your Facebook with Python'

My wife and I share a Facebook wall, mostly because we joined a long time ago and don't want to go through the pain of adding all our friends to two different accounts. As a result, our wall is a bizarre mix of kids ("ZAKY'S WALKING!!!!!") and coding ("OMG CHECK THIS HACK OUT"). We decided to end the year by hacking our wall with Python to see if we could find some cool trends or results.

We'll be releasing the source code on GitHub in a couple of days. We used facebook-sdk and pattern (a natural language processing library, much like NLTK but lighter) to analyze our 2011 wall posts. We didn't have time to parse ALL the interactions we had with people on Facebook - but our wall posts alone were 800 KB of JSON data!
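Until the real code is up, here is a minimal sketch of how a saved feed could be narrowed down to one year. It assumes the JSON has a top-level "data" list and that each post carries an ISO 8601 "created_time" string; the sample feed and the function name posts_from_year are made up for illustration.

```python
import json

def posts_from_year(feed, year):
    """Return the posts in a feed dict whose created_time falls in `year`.

    Assumes each post has an ISO 8601 "created_time" such as
    "2011-06-04T18:30:00+0000" (an assumption about the saved JSON).
    """
    prefix = f"{year}-"
    return [
        post for post in feed.get("data", [])
        if post.get("created_time", "").startswith(prefix)
    ]

# Hypothetical sample standing in for the saved wall JSON.
sample_feed = json.loads("""
{"data": [
  {"id": "1", "message": "Perfect day for playing ball :)",
   "created_time": "2011-06-04T18:30:00+0000"},
  {"id": "2", "message": "An older post",
   "created_time": "2010-12-31T23:59:00+0000"}
]}
""")

wall_2011 = posts_from_year(sample_feed, 2011)
print(len(wall_2011), "post(s) from 2011")
```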

Here are some basic results (more to follow!):

Top 10 Commenters

As expected, the most active commenters on our wall were family members. Family Member #1 (name obscured for privacy) clearly has a lot of time on her hands (nudge nudge you know who you are! :D)

  1. Family Member #1: 16

  2. Family Member #2: 12

  3. Family Member #3: 6

  4. Family Member #4: 4

  5. Family Member #5: 4

  6. Neighbor: 4

  7. Family Member #6: 3

  8. Family Member #7: 3

  9. Family Member #8: 3

  10. Friend: 3
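A tally like the one above could be produced roughly as follows. This is only a sketch: it assumes each post stores its comments under "comments" -> "data" and each comment names its author under "from" -> "name", and the sample posts are invented.

```python
from collections import Counter

def top_commenters(posts, n=10):
    """Count comments per author across all posts and return the top n."""
    counts = Counter(
        comment["from"]["name"]
        for post in posts
        # Posts without comments simply contribute nothing.
        for comment in post.get("comments", {}).get("data", [])
    )
    return counts.most_common(n)

# Hypothetical posts; the real structure may differ.
sample_posts = [
    {"message": "First day of pre-school",
     "comments": {"data": [
         {"from": {"name": "Family Member #1"}, "message": "So cute!"},
         {"from": {"name": "Neighbor"}, "message": "Congrats!"},
     ]}},
    {"message": "Another cake",
     "comments": {"data": [
         {"from": {"name": "Family Member #1"}, "message": "Yum"},
     ]}},
    {"message": "No comments on this one"},
]

for name, count in top_commenters(sample_posts):
    print(f"{name}: {count}")
```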

Hmm ... why are they commenting on our wall so much?

Top 10 Most Liked Wall Posts

As it turns out, most of our wall posts were about kids ... which is probably why so many family members commented on them :) Here are our wall posts along with the number of likes:

  1. Misha's first pre-school picture: 19

  2. Parents in Portland!: 14

  3. People, Valentine's Day can lead to this :) Happy Valentine's Day!: 13

  4. Get off Facebook, dad!!!: 12

  5. has a cold, but is still determined to celebrate our 7th wedding anniversary today!: 10

  6. Microsoft: 10

  7. Another b'day, another cake yum!: 10

  8. love... I guess?: 10

  9. Taught Misha to say "Chill, bro" when Zaky cries. +1.: 9

  10. "With REM breaking up and the new Facebook design, this has to be one of the worst days for white people in the last few years." : 9
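Ranking posts by likes comes down to a sort. The sketch below assumes each post exposes its like total as "likes" -> "count", which is an assumption about the JSON layout; posts with no likes are treated as zero. The like counts in the sample are placeholders.

```python
def most_liked(posts, n=10):
    """Return (message, like_count) pairs for the n most liked posts."""
    def like_count(post):
        # A missing "likes" entry is treated as zero likes.
        return post.get("likes", {}).get("count", 0)

    # sorted() is stable, so posts with equal likes keep their original order.
    ranked = sorted(posts, key=like_count, reverse=True)
    return [(post.get("message", ""), like_count(post)) for post in ranked[:n]]

# Hypothetical posts with placeholder counts.
sample_posts = [
    {"message": "Post A", "likes": {"count": 3}},
    {"message": "Post B"},
    {"message": "Post C", "likes": {"count": 7}},
    {"message": "Post D", "likes": {"count": 3}},
]

for message, likes in most_liked(sample_posts, n=3):
    print(f"{message}: {likes}")
```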

Top 10 Most Common Proper Nouns in Our Wall Posts

Unsurprisingly, our oldest son's name is the most common proper noun in the list :)

  1. Misha: 17

  2. Bilal: 6

  3. Oman: 5

  4. Seattle: 5

  5. Pakistan: 5

  6. Portland: 4

  7. Olga: 4

  8. Redmond: 3

  9. AM: 3

  10. Ackbar: 3
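Counting proper nouns takes two steps: tag every word with its part of speech, then count the words whose tag marks a proper noun. The sketch below covers only the second step and assumes the text has already been tagged (by pattern or any other tagger) into (word, tag) pairs using Penn Treebank tags, where NNP and NNPS mark proper nouns. The tagged sample is hand-written, not tagger output.

```python
from collections import Counter

def count_by_tag(tagged_tokens, tag_prefix, n=10):
    """Count words whose part-of-speech tag starts with tag_prefix."""
    counts = Counter(
        word for word, tag in tagged_tokens if tag.startswith(tag_prefix)
    )
    return counts.most_common(n)

# Hand-tagged sample (an assumption, not real tagger output).
tagged = [
    ("Misha", "NNP"), ("met", "VBD"), ("Santa", "NNP"), ("in", "IN"),
    ("Seattle", "NNP"), ("Misha", "NNP"), ("loves", "VBZ"), ("cake", "NN"),
]

# The "NNP" prefix matches both NNP (singular) and NNPS (plural).
for word, count in count_by_tag(tagged, "NNP"):
    print(f"{word}: {count}")
```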

Top 10 Most Common Adjectives in Our Wall Posts

Adjectives can carry a lot of meaning. People tend to have a positive bias on Facebook - that is, they come across as more cheerful than they are in real life. This seems to be borne out in our wall posts - several of the top 10 most common adjectives are positive (cool, good, great, nice):

  1. first: 5

  2. cool: 5

  3. good: 5

  4. great: 4

  5. little: 3

  6. Nice: 3

  7. old: 3

  8. Fascinating: 3

  9. pre-school: 3

  10. interior: 2
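The capitalized entries in the list (Nice, Fascinating) hint that a count like this can be case-sensitive, so "Nice" and "nice" would be tallied separately. One possible refinement, sketched here under the same assumption of pre-tagged (word, tag) pairs with Penn Treebank adjective tags (JJ, JJR, JJS), is to lowercase words before counting. The sample tokens are invented.

```python
from collections import Counter

def top_adjectives(tagged_tokens, n=10):
    """Count adjectives case-insensitively and return the top n."""
    counts = Counter(
        word.lower()  # fold case so "Nice" and "nice" share one bucket
        for word, tag in tagged_tokens
        if tag.startswith("JJ")  # JJ, JJR and JJS are adjective tags
    )
    return counts.most_common(n)

# Hand-tagged sample (an assumption, not real tagger output).
tagged = [
    ("Nice", "JJ"), ("day", "NN"), ("nice", "JJ"),
    ("cake", "NN"), ("cool", "JJ"), ("better", "JJR"),
]

for word, count in top_adjectives(tagged):
    print(f"{word}: {count}")
```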

Sentiment Analysis

Sentiment analysis is the branch of machine learning that tries to classify text as "positive" or "negative" (or somewhere in between). The underlying models are statistical, and are often trained on corpora of thousands of documents. But buyer beware: they are often only as good as their training data. Pattern's sentiment analysis module is trained on product reviews and movie reviews, so we can't expect it to do well on Facebook wall posts.

But we tried anyway. The results are interesting. We could not find a SINGLE 'negative' wall post in the entire year. This is probably because Olga and I have a strong positivity bias on Facebook, but also because the sentiment analyzer is not particularly suited to Facebook wall posts. Oh well.

Here are some of the most positive wall posts:

  1. Misha met Santa and his requests were #1 to say hi to his mom and #2 to show him his reindeer (he wants to figure out how they really fly..)  Merry Christmas to all!

  2. Perfect day for playing ball :)

  3. Very impressed by the March WP7 update ... can't wait for Mango to come out!

  4. Possibly one of the best quotes of all time: "Who the hell is interrupting my Kung Fu?!"

  5. Awesome! http://code.google.com/apis/predict/

  6. Software, when it works, is a beautiful thing

  7. Happy Birthday, Pakistan!
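Ranking posts from most to least positive, and checking for negative ones, can be sketched as below. The code takes any scoring function that maps text to a polarity between -1 and +1. The toy_polarity function here is a hand-made stand-in with a six-word lexicon, NOT pattern's trained model, and the zero threshold for "negative" is also an assumption.

```python
def toy_polarity(text):
    """Stand-in scorer (NOT pattern's model): average a tiny word lexicon."""
    lexicon = {
        "awesome": 1.0, "perfect": 1.0, "happy": 0.8,
        "beautiful": 0.8, "worst": -1.0, "cold": -0.3,
    }
    words = [w.strip(".,!?:;()\"'").lower() for w in text.split()]
    scores = [lexicon[w] for w in words if w in lexicon]
    # Text with no known words is treated as neutral.
    return sum(scores) / len(scores) if scores else 0.0

def rank_by_sentiment(posts, polarity=toy_polarity, threshold=0.0):
    """Return (scored posts, most positive first) and the negative posts."""
    scored = sorted(
        ((polarity(post), post) for post in posts),
        key=lambda pair: pair[0],
        reverse=True,
    )
    negative = [post for score, post in scored if score < threshold]
    return scored, negative

posts = [
    "Happy Birthday, Pakistan!",
    "Perfect day for playing ball :)",
    "Software, when it works, is a beautiful thing",
]
scored, negative = rank_by_sentiment(posts)
for score, post in scored:
    print(f"{score:+.1f}  {post}")
print("negative posts:", negative)
```

Swapping in a real analyzer only means passing a different polarity function; the ranking and threshold logic stay the same.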