phpTechnorati v0.9 has been released. Enjoy!
May, 2003:
phpTechnorati
I have completed about 80% of phpTechnorati, a Technorati library using the Technorati API for PHP. Expect the rest in the morning… or maybe tonight. The only part that doesn’t work is the outbound method.
MovableType to Blosxom conversion
I decided to convert my entire weblog to Blosxom. The constant problems I have with MovableType finally beat my brain into realizing that a better solution must be out there. And, since Inklog isn’t exactly ready yet, I figured a few hours of converting now will save me lots of headaches in the future.
Unfortunately, the one thing I LOVE about MovableType is nowhere to be found in Blosxom: templates. Sure, you can alter the look of your Blosxom-based website, but its template engine is nothing more than the most amateur use of search-and-replace. I only realized this after the conversion was finished and I started altering the Blosxom templates.
Everything is ready to go, but I haven’t made it live yet. I’m not sure if the loss of this functionality is worth everything that I gain from using Blosxom. If Blosxom supported multiple categories per entry, I think I’d be willing to overlook everything else. But add that to the template issue and I’m uncertain if it is a step forward or a step back.
Nonetheless, I figure there are other people out there who might want to convert from MovableType to Blosxom as well, so I decided to outline the instructions here.
First, two new Individual Archive templates need to be added to MovableType. Label the first one “Blosxom Individual Template” and fill it with the following:
<MTEntryTitle>
creation_timestamp: <MTEntryDate format="%Y%m%d%H%M%S">
categories: <MTEntryCategories glue=","><MTCategoryLabel></MTEntryCategories>

<MTEntryBody>
<MTEntryIfExtended>
<!-- more -->
<MTEntryMore>
</MTEntryIfExtended>
Make another template called “Blosxom Writeback Template” and fill it with this:
<MTComments>
name: <MTCommentAuthor>
url: <MTCommentUrl>
title:
comment:::[<MTCommentBody>]:::
excerpt:
blog_name:
-----
</MTComments>
<MTPings>
name:
url: <MTPingURL>
title: <MTPingTitle>
comment:
excerpt:::[<MTPingExcerpt>]:::
blog_name: <MTPingBlogName>
-----
</MTPings>
Go into “Blog Config” and add two new Individual Archive templates. For the “Archive File Template” field, prefix the field with /blos/ for the “Blosxom Individual Template” and with /blos_wb/ for the “Blosxom Writeback Template”. You can make the rest of the URL look however you’d like. I decided that, in Blosxom, I wanted my entries divided by primary category. For this reason, I decided to make mine look like this: blos/<MTEntryCategory dirify="1">/<MTEntryID>_<MTEntryTitle dirify="1">.txt. Again, you can do whatever you’d like, as long as it ends in .txt, starts with something unique to your MovableType blog (like /blos/), and is certain to produce unique names for each entry. Additionally, the names of the directories leading to the entry can’t start with a number, or Blosxom will freak out. Make the “Archive File Template” the same for the “Blosxom Writeback Template”; just change the prefix.
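To make the naming rules concrete, here is a rough Python sketch of how a path of that shape could be assembled and checked. It is only an illustration: the dirify function below is a simplified stand-in for MovableType's own dirify filter (lowercase, punctuation dropped, spaces turned into underscores), and archive_path is a made-up helper name, not part of MovableType or Blosxom.

```python
import re


def dirify(text):
    """Simplified stand-in for MovableType's dirify filter (an assumption)."""
    text = text.lower()
    text = re.sub(r"[^\w\s]", "", text)       # drop punctuation
    return re.sub(r"\s+", "_", text.strip())  # whitespace runs become underscores


def archive_path(prefix, category, entry_id, title):
    """Build prefix/<category>/<id>_<title>.txt and enforce the directory rule."""
    directory = dirify(category)
    # Directory names leading to the entry must not start with a number.
    if directory[:1].isdigit():
        raise ValueError("directory name must not start with a number: " + directory)
    # The entry ID keeps file names unique even when two titles match.
    return "{}/{}/{}_{}.txt".format(prefix, directory, entry_id, dirify(title))
```

For example, archive_path("blos", "Tech Stuff", 123, "Hello, World!") gives blos/tech_stuff/123_hello_world.txt, while a category such as "2003 Notes" is rejected.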
Rebuild your entries. All of them. If, like mine, your web server chokes when you tell it to do this, and you happen to have shell access to your server, try running the following script instead of rebuilding over the web:
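#!/usr/bin/perl
# Rebuild every archive of one MovableType blog from the command line.
my ($MT_DIR, $BLOG_ID);
BEGIN {
    #############################
    # Settings:
    #
    $MT_DIR  = '/home/revjim/www/mt';
    $BLOG_ID = 2;
    #
    #############################
}
use strict;
# The settings live in a BEGIN block so $MT_DIR is already set
# when these "use lib" lines run at compile time.
use lib "$MT_DIR/lib";
use lib "$MT_DIR/extlib";
use Data::Dumper;
use MT;
use MT::Entry;
use MT::Blog;
use MT::Placement;
use MT::Category;
use MT::ConfigMgr;
my $MT = MT->new( Config => "$MT_DIR/mt.cfg" ) or die MT->errstr;
# Rebuild all archives for the blog; stop with MT's error message on failure.
$MT->rebuild( BlogID => $BLOG_ID ) or die $MT->errstr;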
Take a nap, read a good book, or get some exercise while you wait. It could be a while.
When it’s all over, you should have two new directories in the same place that your regular MovableType archives go. These will be moved into the Blosxom data directory when it’s ready.
Install Blosxom according to its instructions. If you can’t figure this out, then you’ll have a hard time doing any of the following steps.
Also install the writeback plugin, the meta plugin, and the entries_index_tagged plugin. Installation is simply a matter of sticking the file in the right directory and editing a few configuration options at the top of the file.
Some changes have to be made to these plugins. First, the entries_index_tagged plugin wants the timestamps to be in Unix timestamp format. Not only can MovableType NOT produce this, but, in the future, if you decide to use this feature in your new entries, coming up with a Unix timestamp can be a pain in the ass. Therefore, we’ll modify the plugin so that it expects timestamps in YYYYMMDDHHIISS format instead. Just after line 104 (line 103 reads last if $thisLine =~ /^\s*$/;) add the following:
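if ($mtime > 0) {
    # Split YYYYMMDDHHIISS into its parts and convert to a Unix timestamp.
    # timelocal() comes from Time::Local; add "use Time::Local;" near the top of the plugin if it isn't already there.
    my ($year, $mon, $day, $hour, $min, $sec) = ($mtime =~ m/^([0-9]{4})([0-9]{2})([0-9]{2})([0-9]{2})([0-9]{2})([0-9]{2})/);
    $mtime = timelocal($sec, $min, $hour, $day, $mon - 1, $year - 1900);
}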
Additionally, the $timestamp_tag configuration variable should be set to "creation_timestamp:".
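If you want to sanity-check a converted timestamp outside of Perl, the same conversion can be sketched in a few lines of Python. This is only an illustration of the idea, not part of the plugin; stamp_to_unix is a made-up name, and, like timelocal, it reads the digits as local time on the machine it runs on.

```python
import time


def stamp_to_unix(stamp):
    """Convert a 14-digit YYYYMMDDHHMMSS string to a Unix timestamp (local time)."""
    parsed = time.strptime(stamp, "%Y%m%d%H%M%S")
    return int(time.mktime(parsed))
```

Calling stamp_to_unix on the creation_timestamp value of an entry should give the same number that the modified plugin stores for it, provided both run in the same time zone.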
Now edit the meta plugin. Set the $meta_prefix configuration option to nothing ('').
Now edit the writeback plugin. When the writeback plugin gets a new writeback (comment or trackback), it converts all of the line endings from -newline- -carriage return- to a simple -carriage return-. This allows the multiline data of the comment to be held on one line. Unfortunately, MovableType doesn’t support doing this, so we have to modify writeback to support multiline data fields. Go to line 164 (it reads if ( $fh->open("$writeback_dir$path/$filename.$file_extension") ) {). Delete the line after it, and continue deleting up to, but not including, the line that starts with my $writeback = &$blosxom::template. In between those two remaining lines, add the following:
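# $ml is 1 while we are inside a multiline :::[ ... ]::: field;
# $curitem holds the name of that field.
my $ml = 0;
my $curitem = '';
foreach my $line (<$fh>) {
  if ($line =~ /^(.+?):::\[(.*)\]:::$/) {
    # Whole field on one line: name:::[value]:::
    $param{$1} = $2;
  } elsif ($line =~ /^(.+?):::\[(.*)$/) {
    # Field opens here and continues on the following lines.
    # Keep the line break so the first and second lines don't run together.
    $ml = 1;
    $curitem = $1;
    $param{$curitem} = "$2\n";
  } elsif ($line =~ /^(.*)\]:::$/) {
    # Closing line of a multiline field
    $ml = 0;
    $param{$curitem} .= $1;
    $curitem = '';
  } elsif ($ml == 1) {
    # Middle line of a multiline field; keep it whole, newline included
    $param{$curitem} .= $line;
  } elsif ($line =~ /^(.+?):(.*)$/) {
    # Ordinary single-line field: name:value
    $param{$1} = $2;
  } elsif ( $line =~ /^-----$/ ) {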
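To see what the modified parser is meant to do with a converted file, here is a small Python sketch of the same multiline-field idea. It is an illustration under assumptions, not a port of the plugin: parse_writebacks is a made-up name, it checks for an open multiline field before anything else (a slightly different order than the Perl above), and it trims the whitespace after the colon in ordinary fields.

```python
import re


def parse_writebacks(text):
    """Parse writeback records; each record ends with a '-----' line."""
    records, current = [], {}
    open_field = None  # name of the multiline field being read, if any
    for line in text.splitlines():
        if open_field is not None:
            # Inside name:::[ ... ]::: ; keep appending until the closing marker.
            if line.endswith("]:::"):
                current[open_field] += "\n" + line[:-4]
                open_field = None
            else:
                current[open_field] += "\n" + line
            continue
        opener = re.match(r"^(.+?):::\[(.*)$", line)
        if opener:
            name, value = opener.groups()
            if value.endswith("]:::"):
                current[name] = value[:-4]   # whole field on one line
            else:
                current[name] = value        # field continues on later lines
                open_field = name
        elif line == "-----":
            records.append(current)          # end of one comment or trackback
            current = {}
        else:
            plain = re.match(r"^(.+?):(.*)$", line)
            if plain:
                current[plain.group(1)] = plain.group(2).strip()
    return records
```

Feeding it the contents of one converted writeback file should give back one dictionary per comment or trackback, with a multiline comment body kept intact.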
Now, move all of the MovableType archives in the /blos/ directory into the data directory for Blosxom. Additionally, move all the /blos_wb/ files into the writeback directory for Blosxom.
That’s it. You’re done. If you’d like to ensure that your old links are preserved, a small web script might be in order to redirect users from one place to another. Try to work it out on your own. I’ll post an example tomorrow.
Let me know if you found this helpful or if any corrections or clarifications are needed.
Beware the CICADA
In case you should forget the dangers of this horrible creature, I resubmit to you my warning from last year: Beware the Cicada.
Happy Mother’s Day
To all of the Mothers, Mothers of Mothers, Daughters of Mothers and Mothers to be. And to my wife, who will someday cup the hearts of our children in her own hands:
You are the bearers of life, the caretakers of the future, and the nurturers of tomorrow’s Fathers and Mothers. May you find the hearts of those that you love closer to your own as they realize all that you do, all that you’ve done, and all that you will be asked to do in the future.
URI issues: solved
I think it was Mart’s comment that really pushed this idea into my head. Inklog will be a CMS first, and a weblog second. The goal here is to make everything accessible, and easily accessible, via the same methods. It isn’t to turn everything into a weblog entry. Therefore, ID-numbered URLs are out. Every item in the CMS will have a name that is unique to its path. If a document is moved and a redirect is desired for old links, a redirect must be placed into the system. An entire category/path can be redirected if needed. If a weblog entry is posted, it should be titled as such (i.e. /weblog/20030509_Inklog). The module that handles weblog entries CAN provide predated and ID-numbered titles if that is desired by the user. However, the ID number will not be part of the Node system and will be generated independently.
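As a rough illustration of the lookup described above, here is a minimal Python sketch. It is not Inklog code: the resolve function, the set of node paths and the redirect table are all assumptions made for the example. It shows a path resolving directly when a node exists there, and otherwise falling back to a redirect placed on the document itself or on any category/path above it.

```python
def resolve(path, nodes, redirects):
    """Return ("ok", path), ("redirect", new_path) or ("missing", path)."""
    if path in nodes:
        return ("ok", path)
    # Try the longest redirected prefix first so the most specific rule wins.
    for old_prefix in sorted(redirects, key=len, reverse=True):
        if path == old_prefix or path.startswith(old_prefix + "/"):
            return ("redirect", redirects[old_prefix] + path[len(old_prefix):])
    return ("missing", path)
```

With nodes = {"/weblog/20030509_Inklog", "/docs/Manual"} and redirects = {"/docs/old": "/docs"}, a request for /docs/old/Manual is redirected to /docs/Manual, while /weblog/20030509_Inklog resolves directly.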
Thank you to those of you who offered your comments and suggestions.
UpdateChecker: RSS based update notification
Have you ever wished you could have an RSS feed for sites that don’t have RSS feeds? Well, I can’t give you that. However, I can give you an RSS feed that tells you when any particular page is updated. All you have to do is use my new script “Update Checker” (which REALLY needs a NEW name).
This script merely uses md5() on the contents of the page to determine if it’s changed. This means that, if anything on the page changes, you’ll be notified. If the time is displayed (in something other than JavaScript), if there are comments on the page (that aren’t provided by a JavaScript include), if there is a constantly updating list of weblogs.com pings (again, not provided by JavaScript), or if any other information on the page updates even when the page author has not added any new content, it will be counted as an update. This isn’t exactly desirable. However, without inventing a scraper for each site (and updating it whenever the author changes their layout), there aren’t many other ways.
However, if the page you want to monitor works within these limitations, then this tool may be for you.
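The hashing idea is simple enough to sketch in Python. This is only an illustration of the technique, not the script's source; page_fingerprint and has_changed are made-up names, and the page body is assumed to have been fetched already.

```python
import hashlib


def page_fingerprint(body):
    """Return the MD5 hex digest of the raw page bytes."""
    return hashlib.md5(body).hexdigest()


def has_changed(body, previous_fingerprint):
    """True when the page no longer matches the fingerprint stored last time."""
    return page_fingerprint(body) != previous_fingerprint
```

Because the whole page is hashed, even a clock that ticks over in the HTML produces a new fingerprint, which is exactly the limitation described above.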
It supports conditional GET on both sides of the communication. If the site you’re trying to check uses Conditional GET, it will recognize that and therefore save bandwidth on both ends. If your RSS reader supports conditional GET, it will recognize that and save even more bandwidth.
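For anyone unfamiliar with conditional GET, here is a hedged Python sketch of both sides of the exchange. It is not taken from the script: the function names are made up, and it assumes the validators are the standard ETag and Last-Modified values, sent back as If-None-Match and If-Modified-Since.

```python
def conditional_request_headers(cached_etag=None, cached_last_modified=None):
    """Headers to send when re-checking a page we already have a copy of."""
    headers = {}
    if cached_etag:
        headers["If-None-Match"] = cached_etag
    if cached_last_modified:
        headers["If-Modified-Since"] = cached_last_modified
    return headers


def feed_response_status(request_headers, current_etag):
    """Status for the feed itself: 304 if the reader's copy is still current."""
    if request_headers.get("If-None-Match") == current_etag:
        return 304  # Not Modified: no body needs to be sent
    return 200
```

The first function covers the request to the monitored site; the second covers the reply to an RSS reader that sends its own validators.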
The script will also cache the page being checked in order to lessen the bandwidth blow. If the site being checked supports conditional GET, it will be cached for 5 minutes. If it does not, it will be cached for 30 minutes. If the site being checked is broken for some reason (500 error, 404 error, etc), another attempt to retrieve it won’t be made for 12 hours. These times may be altered to provide the best performance and the most flexibility.
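Those caching rules boil down to a small decision, sketched here in Python. The constant and function names are made up for the example, and treating every status of 400 or above as "broken" is an assumption based on the 404 and 500 examples given.

```python
CACHE_SECONDS_CONDITIONAL = 5 * 60      # site supports conditional GET
CACHE_SECONDS_PLAIN = 30 * 60           # site does not
RETRY_SECONDS_BROKEN = 12 * 60 * 60     # site returned an error


def seconds_until_next_fetch(status_code, supports_conditional_get):
    """How long to keep the cached copy before contacting the site again."""
    if status_code >= 400:
        return RETRY_SECONDS_BROKEN
    if supports_conditional_get:
        return CACHE_SECONDS_CONDITIONAL
    return CACHE_SECONDS_PLAIN
```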
The script also supports HTTP Redirect (both 301 and 302). In the event of a permanent redirect, the feed itself will notify you, the reader, that the URL has moved. Additionally, if the site being monitored is broken for some reason, the feed will also note that and let you, the reader, know when it will try to retrieve it again. The script also does its best to fix broken URLs (missing end slash, no http:// provided, etc).
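The URL clean-up can be sketched in Python as well. This is an assumption about what "fixing" means, not the script's actual logic: normalize_url is a made-up name, and it only handles the two cases mentioned, a missing http:// and a bare host name with no trailing slash.

```python
from urllib.parse import urlsplit, urlunsplit


def normalize_url(raw):
    """Add a missing scheme and a missing root slash to a user-typed URL."""
    url = raw.strip()
    if "://" not in url:
        url = "http://" + url        # no scheme given, assume http
    parts = urlsplit(url)
    path = parts.path or "/"         # bare host name: point at the root
    return urlunsplit((parts.scheme, parts.netloc, path, parts.query, parts.fragment))
```

So a typed example.com becomes http://example.com/, while a complete URL passes through unchanged.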
I’ve put up a VERY UGLY page that will allow you to enter the URL and the NAME of the site you’d like to check. It will provide you with the URL to use as the RSS feed. I’ll make the page look nicer later on.
So give it a shot and let me know if you like it. If it tells you a page has updated when it hasn’t, let me know so I can figure out why. And if you can think of a better name, by all means, let me know. Additionally, let me know if you can think of any improvements. Later today I’ll release the source so you can see how it works.
Remember, this isn’t a perfect solution. It’s merely a way of getting around the limitations of other people’s sites.
Water Volleyball
Water Volleyball is fun. We’re thinking about playing tonight. They have a great pool at my apartment complex basically designed for Water Volleyball. Comment here or email me if you’d like to come. We’re thinking we should start around 6ish. If enough people want to come, we’ll plan something to eat as well.
Site URIs
There is a common practice amongst weblog software to display date information in the URI for a particular item. However, if we expand weblog software to manage an entire site’s worth of content, dates should not appear in most URIs. If I publish documentation for a particular version of software that I am working on, the date the document was published isn’t going to be very important. However, the name of the software package and its version number might be.
By looking at the URI of this particular entry, would you prefer to see /2003/05/08/SiteURIs or /tech/comp/software/web/SiteURIs as the URI? The only time a date is important in a URI is when the information at that URI is tied to a specific date. For instance: /news/2003/05/08/TodaysEvents.
A URI should never change. Therefore, if I’m stuck with a date in the URI for a particular piece of content, then, regardless of how often I update it, that date will be there forever. That just doesn’t make sense. Dates should be optional and available to be used on none, some, most, or all of the content.
While any particular piece of content may fall into multiple categories, it should have one, and only one, permanent location. This helps to keep everyone linking to the same place and keeps users from wondering if they’ve been to a particular link found on another site. This means that, while this entry may appear at /tech/comp/software/web/2003/05/08/ and at /personal/crazythoughts/2003/05/08/ its permanent, official location should be in one and only one place, determined by the author.
Occasionally, a particular category may need to be divided or moved entirely due to reorganization of content or too many items being categorized in the same fashion. Because of this, the system should be able to locate any item and redirect the user to the proper location. Since, after 50,000 pieces of content have been created, generating unique names for each piece of information might become difficult, the use of ID numbers is encouraged. If I should decide that web should be its own category under tech (resulting in /tech/web/SiteURIs for this entry) the system should detect that this item has been relocated and redirect the user to the proper location.
However, in doing that, we now, once again, have to deal with the fact that the official URI has changed. Even though we are nice enough to provide the user with a redirect, I’d rather I didn’t have to. So, that leads me in the direction of using /SiteURIs as the URI for this particular entry. It’s easy to type, it states what the user will find at that location, and it doesn’t suffer the problems that can come with reorganizing a site. Unfortunately, this isn’t the first time, nor will it be the last, that I write an article about “Site URIs”. One solution would be to preface the title with an ID number: /29945_SiteURIs, but that adds junk to the URL that might not need to be there. If we’re going to add junk, we might as well make it useful junk. Perhaps the date of the entry: /2003/05/08/SiteURIs. However, now the URL is longer than before, and the date, in this case, isn’t really important. Plus, imagine making a “Contact” page or a “Resume” page and having it at a URL like /2001/06/04/Resume. That just doesn’t make sense. Especially since it’s 2003 now, and I’ve updated my Resume 37 times since then.
I’m inclined to say that the BEST solution is to use category style URIs (/tech/web/SiteURIs) and redirect if there is a newer, permanent location. In order to facilitate this in a simple, easy fashion, it might be wise to include an ID number in the URL: /tech/web/29945_SiteURIs. It’s a little bit of unneeded information that will make locating a relocated item much easier. Additionally, if for some reason a particular item becomes retitled, the ID number is still accurate.
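To show how the ID number would help, here is a small Python sketch of the lookup. It is only an illustration under assumptions: the locate function and the table mapping IDs to official paths are made up for the example, and the ID is assumed to be the run of digits before the first underscore in the last path segment.

```python
import re


def locate(requested_path, canonical_paths):
    """canonical_paths maps an item ID to that item's current official path."""
    match = re.search(r"/(\d+)_[^/]*$", requested_path)
    if not match:
        return ("missing", requested_path)
    current = canonical_paths.get(int(match.group(1)))
    if current is None:
        return ("missing", requested_path)
    if current == requested_path:
        return ("ok", current)
    # The item was moved or retitled: send the visitor to its official home.
    return ("redirect", current)
```

With canonical_paths = {29945: "/tech/web/29945_SiteURIs"}, a request for the old /tech/comp/software/web/29945_SiteURIs is redirected to the new location, and so is a request that uses an outdated title after the same ID.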
I’d like input from readers, potential users, developers, and designers. Please comment here, post in your own weblogs, or email me directly.
a thought-filled mind
My mind runs constantly. Not because I want it to, or because I’ve conditioned myself to be this way, but because it just does. Sometimes, mid-sentence, my mind will jump to some odd tangent leaving my mouth wide open, its next syllable lost somewhere in my synapses. I’ve grown accustomed to it. I’ve learned to expect it. I’ve learned to love those interrupting moments of silence in which everything immediately becomes clear and perfect.
When I am working towards a mental goal, these moments happen even more frequently. And just when I think I’ve managed to consider every aspect of a particular problem, another new idea will hit me, almost instantly. My brain will churn immediately, searching through the existing ideas and finding the best place to attach this newest concept to them.
On more than one occasion, I’ve found myself in a room full of people, all meeting to achieve a common goal. One by one, various people will address the group relaying their ideas and desires. They will talk about things I’ve never heard of. They will discuss processes I never knew existed. All the while I sit in silence, asking questions only to clarify concepts that I feel won’t be elaborated on. After they’ve all said their piece, they begin discussing. Arguments arise. Some people won’t budge. Others just don’t care. And others are just ignored because they weren’t understood. As they argue, my mind shuts them off and spins. And then, in an instant, all the pieces snap together and I stand up. I address the group speaking almost faster than my mouth can produce the words and outline a solution that takes care of everyone. For those things that can’t be accommodated, I offer other solutions that are just as appealing. When I finish, no one says a word. I wonder if people just don’t understand what I’ve said. So I ask, “Are there any questions?” For a few seconds, it’s silent, and then someone speaks up: “When can you have it built?”
And then the problems arise. My mind doesn’t stop thinking there. Instead, it continues to analyze other possibilities. It continues to see other avenues of improvement and streamlining. It continues to rearrange the parts producing even better and better solutions. And it never stops long enough to actually let me do it.