Facebook, Hadoop and Hive

Cloud Computing, Software Architecture, Software Development June 16th, 2009

Facebook has the second largest installation of Hadoop (a software platform that lets one easily write and run distributed applications that process vast amounts of data), Yahoo being the first. It is also the creator of Hive, a data warehouse infrastructure built on top of Hadoop.

The following two posts shed more light on why Facebook chose the Hadoop/Hive path, how they’re doing it and the challenges they’re facing:

Facebook, Hadoop, and Hive on DBMS2 by Curt Monash discusses Facebook’s architecture and motivation.

Facebook decided in 2007 to move what was then a 15 terabyte big-DBMS-vendor data warehouse to Hadoop — augmented by Hive — rather than to an MPP data warehouse DBMS…

The daily pipeline took more than 24 hours to process. Although aware that its big-DBMS-vendor warehouse could probably be tuned much better, Facebook didn’t see that as a path to growing its warehouse more than 100-fold.

Hive – A Petabyte Scale Data Warehouse using Hadoop by Ashish Thusoo from the Data Infrastructure team at Facebook discusses Facebook’s Hive implementation in detail.

… using Hadoop was not easy for end users, specially for the ones who were not familiar with map/reduce. End users had to write map/reduce programs for simple tasks like getting raw counts or averages. Hadoop lacked the expressibility of popular query languages like SQL and as a result users ended up spending hours (if not days) to write programs for typical analysis. It was very clear to us that in order to really empower the company to analyze this data more productively, we had to improve the query capabilities of Hadoop. Bringing this data closer to users is what inspired us to build Hive. Our vision was to bring the familiar concepts of tables, columns, partitions and a subset of SQL to the unstructured world of Hadoop, while still maintaining the extensibility and flexibility that Hadoop enjoyed.
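To make the gap described in the quote concrete, here is a minimal sketch in plain Python. It is not Hadoop or Hive code: the `RECORDS` sample data, the `page_views` table and all function names are made up for illustration, and SQLite merely stands in for a SQL engine. It computes a per-key count and average twice, once in hand-written map/reduce style and once as a single SQL query.

```python
import sqlite3
from collections import defaultdict

# Hypothetical sample data: (page, load time in ms) records.
RECORDS = [("home", 120), ("profile", 200), ("home", 80),
           ("profile", 300), ("home", 100)]


def map_phase(records):
    """Emit one (key, (count, total)) pair per input record."""
    for page, load_ms in records:
        yield page, (1, load_ms)


def shuffle(pairs):
    """Group emitted values by key, as a framework does between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Fold each key's values into a (count, average) result."""
    result = {}
    for key, values in groups.items():
        count = sum(c for c, _ in values)
        total = sum(t for _, t in values)
        result[key] = (count, total / count)
    return result


def mapreduce_stats(records):
    """Counts and averages the map/reduce way: three hand-written steps."""
    return reduce_phase(shuffle(map_phase(records)))


def sql_stats(records):
    """The same question as one declarative query (SQLite as a stand-in)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE page_views (page TEXT, load_ms REAL)")
    conn.executemany("INSERT INTO page_views VALUES (?, ?)", records)
    rows = conn.execute(
        "SELECT page, COUNT(*), AVG(load_ms) FROM page_views GROUP BY page"
    ).fetchall()
    conn.close()
    return {page: (count, avg) for page, count, avg in rows}


if __name__ == "__main__":
    print(mapreduce_stats(RECORDS))
    print(sql_stats(RECORDS))
```

Both functions return the same per-page counts and averages; the difference is that the SQL version is a single line an analyst can write without thinking about map, shuffle and reduce steps, which is roughly the productivity argument the quote makes.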


Facebook Opens Itself to the Web

Software Industry January 27th, 2008

Facebook is releasing a JavaScript library that allows Facebook applications to be embedded in any website – not just inside Facebook.
This small step brings Facebook closer to becoming a true web operating system – a platform for any application that wants to use the user’s context information (social information, friends etc.).

As Nick O’Neill from AllFacebook puts it:

…Facebook just released their JavaScript client library than enables developers to extend their applications to their own websites. Rather than building your applications strictly within Facebook you can now extend the full functionality of the platform to your own website and leverage Facebook as the tool for managing members and their relationships. Somehow nobody has seemed to take note of this significant step.

Want to build your own social gaming platform that resides on your own website but leverages the power of users’ Facebook relationships? Now you can! There had previously been applications that could leverage the Facebook API prior to the launch of the platform but there are some significant differences now versus before. The first significant difference is the broader access to Facebook’s core features that the platform provides.

By extending to the web, Facebook will encourage the growth of its developer community without losing its main asset – users’ information and relationships (as opposed to OpenSocial, which will allow users to take their data anywhere).

Unlike all the Web-OS companies trying to sell us “windows inside your browser”, Facebook seems to get it – an operating system based on social information and relationships can be much more valuable than one that simply operates our personal computers.
No wonder they’re comparing Mark Zuckerberg to Bill Gates…


Facebook Fan Page – Add Community Features to Your Blog

Software Industry January 17th, 2008

I’ve been playing around with creating a Facebook Fan Page for my blog – http://www.facebook.com/pages/DeveloperZencom/10649455588

It actually seems like a very good way to form a community around a blog (or a site, or a product, or a company, or a public figure…).
Besides aggregating my blog’s posts and my Google Reader shared items into applications on the page, I can also notify readers of events and keep a discussion going using the Discussion Board or The Wall application.

This service is very similar to what MyBlogLog offers, only far more customizable and feature-rich, given Facebook’s applications platform and its wide development community.

Other reasons to consider when opening a Fan Page for your blog:

  1. Fan Pages are public. Search engines can index them, which means there are more ways for readers to stumble upon your blog on the net.
  2. Because Fan Pages are public, linking to your blog from one earns you some nice Facebook.com link credit.
  3. Make your blog viral – when someone joins a fan page, the event is published in their news feed for all their friends to read.

So, I urge you all to join my blog’s community page on Facebook and tell me what you think… See you there! :-)


Facebook Apps Spam

Software Industry October 26th, 2007


I just refuse to add all these idiotic Facebook apps…
