frank's blog

80legs, Webcrawling For The Masses

Posted in ideas, Media, web by aldorf on January 12, 2010

Web crawling means browsing and indexing online content in an automated fashion. Its most prominent use is building the databases behind search engines like Google, but it’s important for anyone who needs to find content on the web, such as a movie studio hunting for pirated footage or an ad network checking where its ads are being placed. For now, the main options are to build your own web crawler, usually in your own data center, or to use online services like Amazon Elastic MapReduce, says 80legs chief executive Shion Deysarkar.

With 80legs, you just make your choices from several menus, telling the service where you want it to crawl and what you want it to look for, and it returns a data file with your results.
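To make the idea concrete, here is a minimal sketch of what "where to crawl" and "what to look for" amount to in code. It is not 80legs code and does not use any 80legs API; the `crawl` function, the `LinkParser` class and the in-memory `PAGES` dictionary are all made up for illustration, and the fake pages stand in for real HTTP requests so the example runs offline.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkParser(HTMLParser):
    """Collects the href targets of <a> tags and the page's text."""

    def __init__(self):
        super().__init__()
        self.links = []
        self.text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

    def handle_data(self, data):
        self.text.append(data)


def crawl(seed, fetch, keyword, max_pages=100):
    """Breadth-first crawl from seed; return URLs whose text mentions keyword."""
    queue = deque([seed])
    seen = {seed}          # URLs already queued, so no page is visited twice
    hits = []
    fetched = 0
    while queue and fetched < max_pages:
        url = queue.popleft()
        html = fetch(url)  # fetch returns the page's HTML, or None if missing
        if html is None:
            continue
        fetched += 1
        parser = LinkParser()
        parser.feed(html)
        if keyword.lower() in " ".join(parser.text).lower():
            hits.append(url)
        for href in parser.links:
            link = urljoin(url, href)  # resolve relative links against the page
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return hits


# A tiny in-memory "web" (hypothetical pages) replaces real network access.
PAGES = {
    "http://example.com/": '<a href="/a">A</a> <a href="/b">B</a> home',
    "http://example.com/a": '<a href="/">back</a> a pirated clip',
    "http://example.com/b": '<a href="/a">A</a> nothing here',
}

results = crawl("http://example.com/", PAGES.get, "pirated")
print(results)
```

A real crawler would swap `PAGES.get` for a function that downloads pages, and would also need politeness rules (robots.txt, rate limits) that this sketch leaves out. Running that across many machines is the part a hosted service takes off your hands.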

80legs can crawl 2 billion pages a day!

You can track your brand in every nook of the web. You can launch new data-crunching projects that would never have been funded otherwise. The service is like having a “mini-Google” at your disposal.

80legs is also opening an application store, where developers sell apps that further refine the web crawling results. For example, Deysarkar says, developers could sell apps that perform sentiment analysis, look for video fingerprints, or analyze sentence structure.

Similar webcrawling tools are Crimson Hexagon and Radian6.
