About Popdex

Birth of an index

Popdex came about because an inflated technological ego thought that a current-events news and link spider could be built in a few days. A month later, after many Mountain Dew-filled nights and weekends, an index was born.

Since that added nothing "new" to what was already out there, the next step was to incorporate the popularity of linking sites into the rankings. If websites A and B are extremely popular and both link to site C, then site C is given more weight in the rankings than a site linked to by sites with fewer inbound links.

A score is computed hourly and the rankings are updated. The highest possible score is 100 (like a percentage). I call this technology PopScore.
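For illustration only, here is a minimal Python sketch of that kind of popularity weighting. It is not the actual PopScore code: the function name `pop_scores`, the "1 plus inbound link count" weight for each linking site, and the scaling against the top scorer are all assumptions made for the example.

```python
from collections import defaultdict

def pop_scores(links):
    """Score sites from (source, target) link pairs on a 0-100 scale.

    Hypothetical weighting, not Popdex's real formula: every linking site
    contributes 1 plus its own number of distinct inbound linkers.
    """
    # Collect the distinct sites linking to each target (self-links ignored).
    inbound = defaultdict(set)
    for source, target in links:
        if source != target:
            inbound[target].add(source)

    # Raw score: sum of the linkers' weights, so popular linkers count more.
    raw = {
        site: sum(1 + len(inbound.get(src, ())) for src in sources)
        for site, sources in inbound.items()
    }

    # Scale so the best-ranked site gets exactly 100.
    top = max(raw.values(), default=0)
    if top == 0:
        return {}
    return {site: round(100 * score / top, 1) for site, score in raw.items()}

example_links = [
    ("x", "A"), ("y", "A"), ("z", "A"),  # A is popular
    ("p", "B"), ("q", "B"),              # B is fairly popular
    ("A", "C"), ("B", "C"),              # C is linked by popular sites
    ("s", "D"), ("t", "D"),              # D is linked by unknown sites
]
scores = pop_scores(example_links)
```

In this toy data, C and D each have two inbound links, but C ends up ranked above D because its linkers (A and B) are themselves well linked.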

-Shanti

The technology

The site uses the LAMP architecture (Linux, Apache, MySQL, Perl and PHP). 100% pure Open Source, baby! The crawler is written in Perl, and some of the interfaces to MySQL are done in PHP, with XML as the message format.

The architecture

I wanted to make a distributed, scalable architecture. The primary client crawler that grabs URLs and checks for updates is written in Perl. All clients, however, reach the database through one centralized interface, a PHP web endpoint. This script creates sessions and distributes URLs to the clients so that they crawl in an orderly fashion.
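To make the idea concrete, here is a rough Python sketch of a central dispatcher that opens sessions and hands out URLs so that no two clients crawl the same one at once. The real thing is a PHP endpoint talking XML to Perl clients; this in-process version skips HTTP and XML entirely, and the `Dispatcher` class, its method names, the batch size and the re-queue behaviour are all hypothetical.

```python
import uuid
from collections import deque

class Dispatcher:
    """Hypothetical stand-in for the central URL-distribution script."""

    def __init__(self, urls):
        self.pending = deque(urls)  # URLs waiting to be crawled, in order
        self.sessions = {}          # session id -> URLs checked out to it

    def open_session(self):
        # Each crawler client gets its own opaque session id.
        session_id = uuid.uuid4().hex
        self.sessions[session_id] = set()
        return session_id

    def next_batch(self, session_id, size=2):
        # Raises KeyError for unknown sessions.
        checked_out = self.sessions[session_id]
        batch = []
        while self.pending and len(batch) < size:
            url = self.pending.popleft()
            checked_out.add(url)
            batch.append(url)
        return batch

    def report_done(self, session_id, url):
        # Assumption: a finished URL goes to the back of the queue so it
        # gets checked for updates again later.
        self.sessions[session_id].discard(url)
        self.pending.append(url)

dispatcher = Dispatcher([
    "http://example.com/a",
    "http://example.com/b",
    "http://example.com/c",
])
client_one = dispatcher.open_session()
client_two = dispatcher.open_session()
first = dispatcher.next_batch(client_one)
second = dispatcher.next_batch(client_two)
```

Because every client pulls its work from the same queue, adding another crawler machine only means opening another session.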

The client crawler has been tested with Perl on Linux and Windows. This design lets me distribute the load arbitrarily among any number of machines, so long as they have a Perl interpreter installed. It should also save on bandwidth costs once I get a real web host!

I feel the need, the need for speed

PHP pages that serve the main content are compiled into static HTML every hour (for now), so the site should be wicked fast. Popular searches are cached too, and the results are refreshed often. Since cached query results are also served as static HTML, they should be extremely fast. That's about it…
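As a rough picture of the caching idea, here is a small Python sketch that keeps rendered HTML around and rebuilds it once it is an hour old. It is a swapped-in technique, not the site's code: it rebuilds lazily on request and holds pages in memory, whereas the site compiles static HTML files on a schedule. `StaticCache`, `render_results` and the one-hour default are assumptions for the example.

```python
import time

class StaticCache:
    """Hypothetical cache: serve stored HTML, rebuild after ttl_seconds."""

    def __init__(self, render, ttl_seconds=3600, clock=time.time):
        self.render = render    # function that builds the HTML for a key
        self.ttl = ttl_seconds  # how long a rendered page stays fresh
        self.clock = clock      # injectable so the cache can be tested
        self.store = {}         # key -> (built_at, html)

    def get(self, key):
        now = self.clock()
        entry = self.store.get(key)
        # Rebuild only when the page is missing or has expired.
        if entry is None or now - entry[0] >= self.ttl:
            entry = (now, self.render(key))
            self.store[key] = entry
        return entry[1]

render_calls = []

def render_results(query):
    # Stand-in for the expensive database-backed page build.
    render_calls.append(query)
    return "<html><body>Results for " + query + "</body></html>"

fake_now = [0]
cache = StaticCache(render_results, clock=lambda: fake_now[0])
page = cache.get("weblogs")
page_again = cache.get("weblogs")  # served from the cache, no rebuild
```

The expensive render runs at most once per hour per page, no matter how many visitors ask for it.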

What fool made this?

I'm Shanti Braford, the creator of this site. You can check out my personal weblog here. I have been programming since my freshman year of college (4+ years ago) in many languages, including Perl, Java, C/C++, PHP and ASP. The foundation for this project came from my experience building an MP3 search engine back in the heyday of the Internet boom.

I must make a disclaimer that this site is just a side hobby. I am happily employed and cannot devote more attention to this site than to my employer (Gotta pay the bills, right?). I graduated with a BS in Computer Science from Washington University in St. Louis.

Contact info

Feedback? Flames? Suggestions? Insider stock tips? Send them to me at:

shanti (at) popdex.com

© 2003 Popdex.