On Database Abstraction Layers
July 9th, 2004With all of this talk of the benefit (or lack there of) of database abstraction layers, I feel the need to inject a bit of clarity into the discussion. It seems that many of the people who have so much to say on this topic are also the people who neglect to see the entire picture.
First of all, a clarification. "Jeremy":http://jeremy.zawodny.com/blog/ recently wrote regarding his belief that "Database Abstraction Layers must Die":http://jeremy.zawodny.com/blog/archives/002194.html. He mentions that the primary purpose for using an abstraction layer is to allow an application to switch database engines with little to no code changes. He accurately points out that, in many cases, altering the database calls in your code is the easiest of the tasks involved with a database engine switch if you use proper programming methods.
bq. I use a revolutionary new programming technique. Instead of littering my code with those calls, I put my core data access layer into a library–a separate piece of reusable code that I can include in various parts of my application and… reuse!
In this respect, he's correct. If you're writing a custom application for use in-house, you're not generally going to change database engines on a whim. And, if you've got all of your database calls wrapped neatly in a few classes serving as your application model, you don't have to look very far to make those alterations.
John Lim, of "PHP Everywhere":http://phplens.com/phpeverywhere/, "argues":http://phplens.com/phpeverywhere/?q=node/view/63 that Jeremy's thoughts on Database Abstraction are only relevant when you're dealing with large, Yahoo-grade applications. I don't entirely agree with this statement. Even medium-sized sites and applications can benefit from the additional speed brought forth by *not* using a database abstraction layer. And, as Jeremy said, the probability of even a medium-sized site changing database engines is low and, even if it were to happen, changing the code would be the easiest of the tasks involved when proper coding technique is uesd.
John goes on to to show that he has completely misunderstood Jeremy's "alternative" by saying:
bq. The irony is that in the end Jeremy advocates a standard database library for managing error-handling and other execution details. BUT IT'S OK BECAUSE IT DOESN'T HAVE THE WORD ABSTRACTION IN IT. Err, Jeremy, your fly is open.
John… Jeremy is not advocating a *standard database library* to manage error-handling and other execution details. He's advocating an application *model* filled with methods that pertain to the application at hand. And those methods just happen to use a database engine to perfrom their tasks. This is quite different from a database abstraction layer. His example doesn't have the word abstraction in it because it isn't a database abstraction layer at all. It's an application model and any good programmer interested in authoring maintainable code will use an application model, regardless of whether they use a database abstraction layer or not.
So, John, before you begin pointing out the fact that someone else might have his zipper down, you ought to look at your own crotch and make sure your dick isn't hanging out.
(P.S. I know John had the best of intentions with his writing, and his site is a valuable resource to the PHP community)
With all that being said, I don't believe that Jeremy has taken all things into consideration when he claims that "Database Abstraction Layers Must Die!". In an in-house application, he's right on. However, when you are developing an application that you intend to sell to other parties or to release to the public for free, you have to take into consideration that, while you're unlikely to switch database engines on a whim, the other people using your code might also use a different database engine than you. If you wish to cater to those people who have different storage requirements than you do, you really only have three options.
1) You can write various versions of your application model supporting different backends and allow the user to choose which model works best for them. Every time you make an API change, you have to make that same change in all of the mutations of that model. This means more work for you.
2) You can write your application model in two pieces, a base class, and an extention of that class, allowing other users to extend that class as well to support their own backend storage needs. This means that API changes might break their custom extentions if you're not careful and also means that you're not providing an "out of the box" solution. Of course, over time, you can work out several extensions of your base class to distribute with the application. In this case, code is more mantainable, and less likely to break or need an update for small API changes. However, you've got a lot more code to write and must suddenly become an "expert" on engines you may have never used before and may not even have access to.
3) You can use a database abstraction layer and leave the task of supporting new backends to the authors of that layer.
Option 3 is most likely the "slowest" of the three in regard to execution time. However, a few extra CPU cycles, in many cases, is well worth the flexibility this provides to the users of your software.
So, to summarize: database abstraction layers do, in most cases, slow down code execution. Often times they also alter the way certain features within a storage engine work, therefore, reducing the value of that storage engine considerably. And, since all database engines are not alike, many of the extended features of various engines are unaccounted for in these abstraction layers, leaving the programmer to mix abstraction layer calls, and direct calls, which is rarely a good idea, and almost always confusing. However, execution speed and backend features are not always the top priority. When flexibility and ease of maintenance is your goal, a database abstraction layer can be a very useful tool.


















