Popdex came about because an inflated technological ego thought that a current-events news & link spider could be built in a few days. A month later, after many Mountain Dew-filled nights and weekends, an index was born.
That alone added nothing "new" to what was already out there, so the next step was to factor the popularity of linking sites into the rankings. If websites A and B are extremely popular and both link to site C, then site C is given more weight in the rankings than a site whose links come from sites with fewer inbound links.
A score is computed hourly and the rankings are updated. The highest possible score is 100 (like a percentage). I call this technology PopScore.
-Shanti
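To make the PopScore idea above concrete, here is a rough Python sketch. It is not the actual Popdex code (the real crawler is Perl and the real formula is not published here). The function name `pop_scores`, the "1 + inbound links of the linker" weighting, and the scaling against the top site are all assumptions chosen for illustration.

```python
from collections import defaultdict


def pop_scores(links):
    """Score sites from (source, target) link pairs on a 0-100 scale.

    Hypothetical weighting: each linking site contributes 1 plus its own
    number of inbound links, so links from popular sites count for more.
    """
    inbound = defaultdict(set)  # target site -> set of sites linking to it
    sites = set()
    for source, target in links:
        sites.update((source, target))
        if source != target:  # ignore self-links
            inbound[target].add(source)

    # Popularity of a site = how many distinct sites link to it.
    popularity = {site: len(inbound[site]) for site in sites}

    # Raw weight = sum over linkers of (1 + linker popularity).
    raw = {
        site: sum(1 + popularity[src] for src in inbound[site])
        for site in sites
    }

    top = max(raw.values(), default=0)
    if top == 0:
        return {site: 0.0 for site in sites}
    # Scale so the best-linked site scores 100, like a percentage.
    return {site: round(100 * raw[site] / top, 1) for site in sites}
```

With this weighting, a site linked to by two popular sites outranks a site linked to by two obscure ones, even though both have the same number of inbound links.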
The site uses the LAMP architecture (Linux, Apache, MySQL, Perl and PHP). 100% pure Open Source, baby! The crawler is written in Perl, and its interfaces to MySQL go through PHP, with XML as the messaging format.
I wanted to make a distributed, scalable architecture. The primary client crawler, which grabs URLs and checks them for updates, is written in Perl. All clients reach the database through one centralized interface: a PHP web endpoint. That script handles creating sessions and distributing URLs to the clients for them to crawl in an orderly fashion.
The client crawler has been tested with Perl on Linux and Windows. This allows me to distribute the load arbitrarily among any number of machines, so long as they have a Perl interpreter installed. It should also save on bandwidth costs once I get a real web host!
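The session-and-dispatch idea can be sketched in a few lines of Python. This is only an in-memory illustration, not the real PHP endpoint: the class name `UrlDispatcher`, its methods, and the batch size are all made up here, and the HTTP and XML layers are left out entirely.

```python
import uuid
from collections import deque


class UrlDispatcher:
    """Hands out URLs to crawler clients so no two clients get the same one."""

    def __init__(self, urls, batch_size=2):
        self.pending = deque(urls)   # URLs not yet assigned to any client
        self.batch_size = batch_size
        self.in_progress = {}        # session id -> URLs that client holds

    def open_session(self):
        """Register a new client and return its session id."""
        session_id = uuid.uuid4().hex
        self.in_progress[session_id] = []
        return session_id

    def next_batch(self, session_id):
        """Give the client its next batch of URLs, in queue order."""
        if session_id not in self.in_progress:
            raise KeyError("unknown session: " + session_id)
        batch = []
        while self.pending and len(batch) < self.batch_size:
            batch.append(self.pending.popleft())
        self.in_progress[session_id].extend(batch)
        return batch

    def report_done(self, session_id, url):
        """Mark a URL as crawled by the client that was holding it."""
        self.in_progress[session_id].remove(url)
```

A client would call `open_session()` once, then loop on `next_batch()` until it comes back empty, calling `report_done()` after each crawl.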
The PHP pages that serve the main content are compiled into static HTML every hour (for now), so the site should be wicked fast. Popular searches are cached and refreshed often, and the cached results are served as static HTML too. That's about it…
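For the curious, the caching trick looks roughly like this in Python. It is a sketch under assumptions, not the site's PHP: `TtlCache`, the `render` callback, and the one-hour lifetime in the usage note are illustrative names and values, and entries live in memory here rather than in static HTML files on disk.

```python
import time


class TtlCache:
    """Keeps rendered results and rebuilds them once they are too old."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock    # injectable so the cache can be tested
        self.entries = {}     # key -> (time stored, rendered value)

    def get(self, key, render):
        """Return the cached value for key, calling render(key) if stale."""
        now = self.clock()
        entry = self.entries.get(key)
        if entry is not None and now - entry[0] < self.ttl:
            return entry[1]   # still fresh: skip the expensive render
        value = render(key)
        self.entries[key] = (now, value)
        return value
```

Something like `TtlCache(3600).get("top stories", render_results)` would then run the expensive query at most once an hour per search, where `render_results` stands in for whatever builds the page.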
Shanti Braford is the creator of this site. You can check out my personal weblog here. I have been programming since my freshman year of college (4+ years ago) in many languages, including Perl, Java, C/C++, PHP and ASP. The foundation for this project came from my experience building an MP3 search engine back in the heyday of the Internet boom.
I must make a disclaimer that this site is just a side hobby. I am happily employed and cannot devote more attention to this site than to my employer (Gotta pay the bills, right?). I graduated with a BS in Computer Science from Washington University in St. Louis.
Feedback? Flames? Suggestions? Insider stock tips? Send them to me at:
shanti (at) popdex.com