If you program in PHP (or any language, really), and are developing something that you intend to release to the public, please, please, please spend the little bit of extra time it takes to develop your application properly. This means using function or classes to help break down and simplify the logic in your application, using meaningful variable names, and a directory structure that makes sense. When you do, you encourage other developers to improve your product, you encourage the use of your product in ways that you never intended, and you obtain a following from developers and users alike.
Because finding/writing a PHP based weblog application that works the way I think it should work is high on my currently priority list, I'm going to pick on Cafelog. I don't have anything against its users or its developers aside from the use of poor coding practices. If you are a user of this product, please don't tell me about how cool it is, or about how well it works. If you read a site powered by this engine, please don't tell me about how easy you find it to use. If you are the developer of this product, please feel free to email me.
cafelog/b2 supports multiple authors, archives, templates, trackback, pingback, xmlrpc, categories, an RSS feed, and posting via email. It's open source, written in PHP, and uses MySQL as a backend. It used fried pages (not baked) and a has a growing community. It's probably a GREAT weblogging application. Unfortunately, I'll never know because I refuse to use it.
There are some things about it that I don't like, but they are merely a matter of taste (in most cases). These things are not the point of this article. However, I will enumerate them so that there is a better understanding of why I would want to modify the codebase to begin with. I should also point out that these features might actually exist in cafelog already, and I was just so unwilling to read every line of code written in order to see that.
It doesn't cache calls to the database. This means that every time the page is requested, regardless of whether new content was published one second ago or one decade ago, my webserver takes a hit, and the database takes a hit. I'd like to see it cache the requests it makes to the database, and then not make them again until it is informed that the data has changed.
It doesn't support Last-Modified and Etag headers. This means that if a user pounds on the "reload" button on their browser, it will have to pull all of the data from my server every time resulting in greater bandwidth usage. I'd like to see it use HTTP 1.1 features and send a 304 - Not Modified status to the client browser instead of the entire page when the content hasn't changed.
I don't care for the template system. I would prefer to use another template system allowing me more flexibility. By using my own template system, I can use the same layout files with this application as I use on the rest of my site allowing me to change the entire look of my site by editing a single layout template.
I don't like the URLs it generates. I prefer search-engine friendly URLs that are void of question marks and ampersands. This allows the search-engines to index all of my pages, provides cleaner looking URLs, and makes cutting and pasting URLs much easier.
I don't expect the author to share my vision of features. It's his project, and he can have it work anyway that he likes. Because I am a PHP programmer, I am able to modify and/or reuse his code to suit my needs and my design ideas. While I would like to see these features in this product, I understand that he may want things his way for a reason. And I have no problem with that. What I do have a problem with, however, is my inability to use his code (without extensive reading and rewriting) to implement my own features.
Let's dig into this a little bit deeper. From what little bit of code I did read I manage to deduce that most (all?) of the scripts begin with an include of blog.header.php. Opening that file makes me want to scream. This file is responsible for looking at all of the arguments that have been passed to the script and deciding exactly how the SQL query should look to generate the requested page. It then performs the query, and stuffs the results into $results which I can only imagine is used globally (AAAAAAAHHHHH!) in the remainder of the code, which is considered by Cafelog to be a "template". And, since this file is included just about everywhere, there is actually provision at the end to issue an SQL statement of "SELECT 1=1" in the event that an SQL query is not needed. This page is where the most changes are needed, and therefore it is where we will focus.
It would be fairly trivial to wrap this code in a function that took one "struct" as an argument and returned an SQL query. And while this makes things much easier on a developer trying to use this code, there is an even better way. It should be broken up into several pieces. A class with many functions would be best, however, I'd settle for many functions outside of a class. In fact, though this would not be the best way to go, I would even take one big function taking many arguments, or even one function taking one giant struct as its parameter. These functions (or function) should return the actual MySQL result rows, however, even if it merely returned the SQL query string it could be wrapped a bit easier.
After rewriting this code, we would have a class (perhaps called Cafelog) with many functions. This can be done in many ways. One of which would be to make functions that would return different types of queries. This would result in functions like getRecentPosts($limit,$offset) and getPostsFromCategory($category). Another method would be functions that imposed various restrictions on the returned data with a final execute() function to perform the work. This might involve functions like $cl->limit($number,$offset), $cl->category($category), $cl->daterange($startts,$endts). In the end, however, one would find flexible, easy to use code, ready for modification, improvement, or alternate uses.
Building clean, reusable code is important. It means that you'll implement new features quicker, require less work, see fewer bugs, and attract more developers. Writing dirty code is the easy way. Given enough examples to cut and paste from, my mom could manage dirty code. (Well maybe that's giving her too much credit). Consider this the next time you pull out your favorite text editor to write some code.











