A Beast of a Project


Over the last several weeks I've been bogged down with a trip to the SES conference, supporting a legacy customer through a nonexistent problem, and building some stuff for another customer who apparently just won't pay me for the work I've already done. But each and every day I've been spending some time building an incredibly thorough utility for sitemapping.

This particular project has turned into a beast for me because what I thought would be a very simple set of tools for parsing through directory trees and spidering a few pages is not nearly as simple as I originally expected. The upside is that when I do complete this project, it will be something to be proud of.

There are a good number of sitemapping utilities out there - ones that use PHP, Perl, Ruby on Rails, or any of a number of other platforms. Some of them seek out files from the filesystem. Some actually spider your site. Some produce a Google sitemap, some a Yahoo sitemap, some an HTML sitemap. I even found one quite useful utility that output a PDF sitemap, which came in very handy for me last year.

There are also utilities that are plugins for various page generation systems - be they content management systems, bulletin boards, or something else.

But all of the systems that I've seen have come up short in one way or another. I wouldn't go so far as to call many of them incomplete or error-prone, but a single system that contains all of the features that I desire doesn't seem to exist.

So I set out to build what I like to call the most complete sitemapping system around. Unfortunately for me, that has meant a couple of different software redesigns because of various performance or feature concerns that I've had. So far, things have gone extremely well - the performance that I've achieved has been surprising, and I think that is because some of the other systems I have tried were built with ease of development as the main concern rather than performance.

In the process, I've learned a lot about Perl - a language which I've been using for a while, but mainly for small-scale things that, like a lot of casually built Perl programs, were not very clean.

I expect to release at least an alpha version within the next 7 days. I keep saying to myself "tomorrow!", but I know that once I wrap up this last revision of the filesystem searching utility, I'm going to need to revamp the spidering utility, and then build at least a shell of a database searching utility. After that, I'll probably need to visit a few AJAX framework sites to see if there is anything out there yet that I'm comfortable using as a framework. Finally, I really am going to need to review some of the open source licenses to see how I want to release the code.

As a side note, the development effort has also encouraged me to build a few other utility programs that will be released soon as well. More on those in my next entry.


