KWPro.net

Next PHP Framework
By: conark
Published On: 2-6-2011

A friend of mine asked another guy doing a thesis about elements that belong to a PHP framework. Having been working with frameworks for ages, I ended up writing a very lengthy response, which I felt was worthy of sharing:

For myself, I'm using a hodge-podge of Kohana and Zend: from Kohana I'm employing helpers and controllers, while from Zend I'm utilizing validation, models and services (for the most part). Part of my decision to go this route was that my last company evaluated a few frameworks and was going after speed. My manager felt that most model frameworks performed less than optimally, so we ended up using something in house. However, Kohana had a nice feel to it overall and is a fork of the more popular CodeIgniter framework (which was actually my first choice). One guy started to implement Sprig for the model side and I tried it out as well.

However, for myself, I didn't care for the model part of Kohana and preferred Zend. I felt that Sprig was too immature and required "too much coding" whereas Zend had a nice fluidity to it, particularly with regard to relational mapping and not really needing to code up the field names and types.

That all said, to address your question more specifically, I'll list various points:

* I feel that a framework should share familiar principles that other frameworks provide, unless a radical paradigm shift is suggested (which in turn improves all the fundamental attributes of a good application). For instance, with the web, you have the typical MVC (model-view-controller) paradigm. Most veteran web programmers these days are probably familiar with this concept as it exists in numerous frameworks. In this manner, the turnaround time for adapting to or learning a new framework will be shortened drastically.

* A framework should be loosely coupled and highly cohesive. This again simply reiterates the previous principle, but more specifically emphasizes an idea all good software engineering should follow.  In this case, I feel that this means people have the freedom to pick and choose parts of the framework for their needs. Take for instance the Zend Framework. I heavily dislike their implementation of controllers, authentication and authorization. However, I am not required to use those parts in my project alongside Kohana. Spring is another excellent example which demonstrates this principle.  Also, we have to examine cases that are not web specific. What about scripts or (in the case of Java) applets/Swing applications? How about (perhaps in the distant future) mobile platforms?

* The framework should minimize its assumptions about its environment. In some ways this is saying that environmental dependencies are a bad thing. For instance, what happens if the PHP version were suddenly upgraded from 5.2 to 5.3 or 6.0? Or does the framework depend upon other libraries in order to be considered usable, especially compiled libraries that a shared hosting system perhaps does not provide?

* Another critical thing is to attempt to answer the question: what problem am I attempting to solve? Considering that there are a great number of similar frameworks in existence, one must consider whether the new framework attempts to replicate these efforts or if it attempts to ease the burdens that these frameworks create. If all the new framework does is conform to what the author feels works for him/her, then does it really help others? We have to examine the ecology of software, something much neglected since software engineers tend to be finicky about such matters and prefer reinventing the wheel as opposed to solving a different problem and/or improving upon the existing wheel.

* To continue along the lines of the previous point, we must look further at trying to answer that question. In my experience, new frameworks tend to pop up as a result of one or two reasons: 1) the current framework inadequately solves a problem that I have; 2) the current framework is slow (this could mean in terms of response time or engineering time). Take Hibernate vs EJBs as an example. Back in the day, EJBs, up to around version 3.0, were a horrible mess to code. Sun (and other companies like WebLogic), with their so-called prophetic marketing gurus, attempted to convince businesses that their solution was a one-stop shop for all enterprise needs. Sadly, many developers hopped onto the whole J2EE solution (mostly to boost their resumes to get better paying positions in the unruly period of unrest during the post-bubble crisis), but employed it for projects that probably just needed a few simple albeit ugly JSP pages. The whole disastrous implementation of J2EE caused Sun (and others) to come up with unfathomably overarchitected solutions that were NOT agile (imo points of perpetual job stability through numerous redundant layers) to the point where, in all honesty, EJBs were a supreme joke.

Then someone came up with a nice little solution called Hibernate. Hibernate simplified the horrible mess of the four required files of an EJB (at least from 2.0) into something called a POJO and an obvious looking configuration file. While Hibernate, until the advent of annotations, still burdened developers with "configuration hell," it still eased a lot of Java developers' pains as compared to the even more nightmarish EJBs. In fact, the solution of POJOs via Hibernate was done so well that EJB 3.0 adopted a similar position for their entity beans.

* As you can tell, I'm very much into "ease of coding." Which brings me to my next principle: make it easy! Too many frameworks suffer, imo, because you're attempting to learn things that have evolved from a very specific problem.  I believe Ruby on Rails is one such framework.  Regardless, you'll know if your framework is "easy" just by seeing how a "hello world" tutorial can be done. Most people simply want to show a page, store data and retrieve data for display. The fewer steps that a person has to go through in order to do these three operations, the more likely your framework will succeed.

* A huge part of "ease of coding" is the API itself.  As developers, we like to throw around the term "API" all over the place. But here, I really want to emphasize the intent. My previous manager wanted to use Kohana as well as develop some more pieces for general use inside our company. He constantly talked about the API. But in his mind API meant, "How does the code feel?" When you're used to doing certain things in a specific way, you end up developing habits, both good and bad. The bad habits though are the ones that end up screwing up others working with you. But we never really consider them as being bad nor habits because they're unconscious actions we perform on a daily basis.

With this in mind, it's truly difficult at times to answer the question, "How does the code feel?" But one way to address this idea is to examine our bad habits: the things we do constantly that undermine our own efficiency as programmers and that of our teammates.  We must look for recurring patterns that we can make more efficient. Let me give you an example.

Zend's model layer has two parts typically. One is when you extend the Zend_Db_Table_Abstract class. The other is when you are returned a Zend_Db_Table_Row_Abstract row (or rowset) when you fetch data. Previously, I had been extending the Zend_Db_Table_Abstract class (actually, I created another base class to handle more conditions than this class) and would instantiate it via something like:

$model = new Model_User();

Truthfully, this is horribly inefficient. First, this class does not contain sessioned data because the sessioned data is actually placed into the Zend_Db_Table_Row instance. So in essence, it's a good candidate for a singleton.  Second, what happens if the application begins instantiating unknown quantities of these classes? Since there is no sessioned data, each instance is actually a waste of memory. In that sense, we can create a factory class to manage the various instances of these types of classes.

So my call eventually evolved into something like:

$model = Model_Factory::build('User');

This looks and feels better already. One neat thing that I can do is shrink certain operations in my code into one line, avoiding having to create a new variable every time. For instance:

$row = Model_Factory::build('User')->findByUser($user->id);

If I had stuck to the original methodology, my code would look a little more sloppy like:

$model = new Model_User();
$row = $model->findByUser($user->id);

Then I thought to myself, well, I know that many of my controllers will be employing the Model_Factory object. It's still a fairly long line to type. Can I simplify this a little more? In my base controller class, I have a method that looks like this:

protected function model($model)
{
  return Model_Factory::build($model);
}

Now, in my controllers, I can shrink my code a little more to something like:

$row = $this->model('User')->findByUser($user->id);

Is this the best way of doing things in this situation? I don't know. But again it *FEELS* right.  There are more improvements I can make, but thinking along the lines of an API to make my life easier by providing a more fluid interface is the right way to code.
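
For the curious, here is a minimal sketch of what a factory like that might look like. The post only shows the call, so the body of Model_Factory below is an assumption, and Model_User is a hypothetical stand-in for a class that would really extend Zend_Db_Table_Abstract. The idea is simply to keep one instance per model name and hand it back on every call:

```php
<?php
// Hypothetical stand-in; a real table class would extend Zend_Db_Table_Abstract.
class Model_User
{
    public function findByUser($id)
    {
        return array('id' => $id); // placeholder for a fetched row
    }
}

// Assumed implementation: one cached instance per model name.
class Model_Factory
{
    private static $instances = array();

    public static function build($name)
    {
        if (!isset(self::$instances[$name])) {
            $class = 'Model_' . $name;
            if (!class_exists($class)) {
                throw new InvalidArgumentException("Unknown model: $name");
            }
            self::$instances[$name] = new $class();
        }
        return self::$instances[$name];
    }
}

// Two calls hand back the same object, so no extra instances pile up.
$first  = Model_Factory::build('User');
$second = Model_Factory::build('User');
var_dump($first === $second);
```

The trade-off is that the table objects become shared state, which is fine only as long as they really hold no per-request data.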

And again, when we talk about APIs, we're talking about the Interface aspect of the acronym.  Interfaces in software engineering generally equate to contracts. In other words, the binding agreements that developers set forth between each other.  I think many software engineers end up putting a lot of emphasis on implementation and forget about the interface aspect. That leads to horrible things like EJBs. But implementations are things that only the developer writing the behind-the-scenes code should worry about, not the people using that code.

So what needs to occur is emphasizing how the interface feels.  Is it too verbose? Are there a lot of steps to perform one action? Does something do more than what is expected?

Let me provide an example. One major reason why I avoided Zend this time around for controllers (although I've used them in the past) is that it feels like a lot of gluing needed to be done to get them to work the way I wanted. I used them previously and felt that they were too heavy and at times complex for what I needed (this could have changed). In my case, I wanted to use the authorization and authentication aspects of Zend. And truthfully, the authorization aspects in particular, were probably too loosely coupled to be of any practical use. To me, I had to do too many actions to get to the place where I needed; and more importantly, a lot of other people had solved this problem already so I felt that their solution was worthless.

* Another critical aspect is being forced to do something a certain way and whether or not that methodology makes sense. Some frameworks like Symfony and CakePHP force you to use their conventions (for myself, I haven't looked at either framework in at least 4 years, so things again may have changed). Take for instance, schemas. CakePHP had a very specific naming convention for schemas. So you were practically screwed on legacy projects where someone wanted to make a switch. Symfony might be similar in this regard. But one thing that prevented me from using CakePHP permanently was the idea of one model per controller/action deal (at least, I think that's how they did it; I never truly wrapped my head around their framework, so this is just my recollection from my experience noodling around with CakePHP). It seemed as though CakePHP wanted you to employ only one model for a page, which then mapped to a controller/action. For CRUD type of applications, this methodology works fine. However, what happens when you get complex applications that require numerous tables and relations working with each other? At the time, I found myself trying to find an answer and it seemed as though others were too. Most of the solutions then seemed more like hacks rather than elements envisioned by the authors of the framework. As a result, it's really not a solution but a hack around a problem.

Of course, a bigger question comes out of this whole situation: is it correct to have one model per controller per page? Martin Fowler has a very specific view on the subject, but I tend to think that the area is VERY controversial with no one truly coming up with a single good/standard solution. Part of it, imo, is because the idea is poorly explained and the supporting examples are few or oversimplified. I think part of the resulting effect is situations where people end up writing hacks for interpretations of this specific view, which has led to numerous problems. In the case of CakePHP, at least for me, it simply did not FEEL right.

Anyway, these have been various guiding principles in attempting to answer your basic question. But let me delve into more specific areas that, in my experience, address your question even more pointedly.

* Basic components: Model-View-Controller.  Stick with these concepts from the start. Everyone should now be able to recognize them.

* Front controller/Routers: These seem to be the hand-off spot that a lot of newer frameworks employ. It seems that there is little programmatic value outside of serving as an interface for someone to see and sitting between your .htaccess file, which contains the rewrite rules, and your controllers.

* Action Controller: These tend to be the spots where the specific pages/URLs are mapped to. The "page"/URL portion tends to be mapped to a specific member function. While I like how most give people an incredible amount of freedom in naming their member functions, I think there should be more conventions that help enforce naming between the routes and actions. For instance, in Kohana, you have a routes file and each line represents a particular route that maps the URL to a controller and particular action. As a result, you end up having a monstrous configuration file. A neater solution would be to get rid of this configuration file, except for exceptions, and leave it up to the base controller to determine how URLs map to a controller and action.
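
To make that last idea concrete, here is a rough sketch of convention-based routing with a small exceptions table. The function name resolve_route, the default controller "welcome" and the shape of the returned array are all assumptions made for illustration; only the Controller_ and action_ prefixes follow the Kohana naming style used elsewhere in this post:

```php
<?php
// Hypothetical helper: map a URL path to a controller/action by convention,
// consulting an explicit table only for the exceptions.
function resolve_route($path, array $exceptions = array())
{
    $path = trim($path, '/');

    // Explicit overrides win over the convention.
    if (isset($exceptions[$path])) {
        return $exceptions[$path];
    }

    $segments   = $path === '' ? array() : explode('/', $path);
    $controller = isset($segments[0]) ? $segments[0] : 'welcome';
    $action     = isset($segments[1]) ? $segments[1] : 'index';

    return array(
        'controller' => 'Controller_' . ucfirst(strtolower($controller)),
        'action'     => 'action_' . strtolower($action),
        'params'     => array_slice($segments, 2),
    );
}

// Only the odd cases need to be written down.
$exceptions = array(
    'login' => array(
        'controller' => 'Controller_Auth',
        'action'     => 'action_login',
        'params'     => array(),
    ),
);

print_r(resolve_route('/user/edit/42', $exceptions));
print_r(resolve_route('/login', $exceptions));
```

A real version would also need to validate the segments before turning them into class and method names.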

* Model: What ends up being an object that somehow maps itself to a table in the database.  Usually, you have standard CRUD operations with some transaction management and a large number of SQL-like fetch operations that are able to translate into actual SQL via some adapter class. On top of the fetch methods, you might find some object-relational mapping capabilities. With regard to Zend, you still end up lacking a large number of more intelligent ORM type of capabilities.  For instance, in the case of Hibernate, your mapping methods will do some intelligent SQL such as using outer joins occasionally while caching the result set into a graph.  I believe Zend does not do this and simply fetches another group of rows rather dumbly per request. Also, it seems that cascading operations feel more limited compared to Hibernate.

The other major problem that drives me crazy about Zend on occasion is that you lose all but very basic operations when you retrieve a Zend_Db_Table_Row_Abstract object. For instance, many of my Zend model classes have additional operations (like fetching). After retrieving a rowset and iterating through each row, I no longer can do something like call a custom method. This occasionally can become confusing where I might want to do something like findByUser on a row.

* Views: This is a real tough one. I used to be a big proponent of Smarty. Then after working at my previous job, I ended up disliking Smarty. Why? Too fat. Smarty is a very fat templating system. On a highly trafficked site, the compilation of templates can be quite costly. Also, in the case of PHP, wasn't the original design something akin to a templating system? We took a hard look at that idea and it made me realize that employing Smarty wasn't so great after all. So now, I'm employing flat PHP files to avoid unnecessary layers of abstraction. Then I just split pages into smaller pieces when necessary and reassemble the pieces via a View::factory operation that comes with Kohana. If I need a particular function that can be reused across classes, I end up building helper classes or use the ones provided by Kohana.

* Some business layer: I've felt that one of the biggest problems in most MVC systems is the lack of a centralized area to store reusable sets of logic. Some people argue that this logic should occur in the model layer (to avoid Fowler's anemic domain model dilemma).  Others feel that the controller should have that logic.  While this works for simple web applications, as we all know, web applications these days aren't that simple. What happens when you have a web service? Or batch jobs (i.e. scripts)? Where do operations that have no database interaction go, such as email, file system management, network service calls, etc.?

I know some frameworks make use of the generic "helper" section.  But "helper" imo is a poorly termed idea. It's too generic and doesn't specifically describe what *OUGHT* to go in there. For instance, you can have form helpers and perhaps an email helper.  But now, my API is getting cluttered and people no longer might be able to instinctively know where to look for a specific problem.

So that's where this idea of "business logic" comes in.  Truthfully, I tend to hate Martin Fowler's interpretation of domain objects and anemic objects. I get his point and I've tried it. But what ends up happening is just more of a mess of code. I view model objects as simply dumb POJOs with perhaps the only link being between their relations to other model objects. However, the management of these interactions as well as the containment of additional rules are what I truly believe a "business layer" ought to be.

Back in the days of J2EE, you had entity and session beans. Entity beans, for the most part, were the model/domain layer. Session beans would manage the interactions between entity beans and occasionally serve to have additional functionality (for instance, calculating the rate of tax, which probably didn't require any interaction with entity beans at all). Unfortunately, in the J2EE world, Session beans needed things like JNDI in order to utilize them from a controller class (say Struts). So people added more layers like a business delegate class (which simply forwarded requests to the session beans; in other words, a horribly useless class) and the service layer which simply looked up the session class. IMO that little JNDI bit probably helped kill EJBs off more than anything because you now required more layers and configurations just to simply say, "Save Order."

Despite that, I think the base concept of the relationship between entity and session beans still holds valid from an architecture point of view. I like the idea of having a centralized location that is secure and all the other good stuff from where I can retrieve or update data. More than that, I like the idea that I can go to this centralized location to perform various actions and receive responses.

In the new world of J2EE, what has emerged partly are two things: services and business rules engines. Services, from what I've seen, are kinda like what the Zend Model is to PHP. Some also call Services "object repositories".  Either way, they functionally serve to do your main CRUD operations and then some.

Business rules engines are something I've played a bit with.  Drools is a big one that comes from JBoss/Redhat.  There are a few others like Blaze and one other I can't think of at the moment. But the idea is that these engines are capable of handling responses based on user input and large sets of conditions. A good example of where such an engine might be used is determining the credit card type appropriate for an applicant. You have a lot of criteria that need to be processed (age, job, gender, location, current credit rating, criminal background, etc.). Based on that path, you end up with an answer. In these situations, business analysts can easily set all these conditions and the results into a centralized repository while the developers simply connect the engine to an interface of some kind.

However, neither of these really addresses the real architectural need for a business layer from an API point of view. What really needs to be accomplished is centralizing where this logic should exclusively exist. I don't think any framework at the moment adequately answers this problem, but it's a true issue that, imo, leaves developers scrambling through directories in attempting to discern each framework's attempt at solving this problem.

At my previous company, we talked about the idea of a "command object" type of class.  I think in the ASP world, the command object is more related to the database layer. Instead, what we started to discuss was the idea of having one action per class. You could start off with simple CRUD operations. But what happens, say, when someone registers on a new site? Typically, registration processes are quite involved. You have data validation, captchas, potentially multiple tables beyond the prototypical user table, emails, confirmations, flags that need to be set, etc. What happens when you want to extend this process beyond a single form, such as mobile registration for different platforms that require different backend responses (like XML-RPC vs JSON)? If we put all this logic into just one controller action, then we're totally screwed because now we are forced to copy and paste most of it and modify the input to match the expected interface. If we put this into a model class, that means we potentially are violating the infrastructure layer by adding more ideas beyond simply managing data. If we split the logic, we're just as screwed because we'll be scrambling around assembling pieces as new methods of registration potentially pop up (like authorization APIs that provide application level keys for a given identity).

Hence, why we need such an API. Thus far, my own implementation has been quite simple and probably does more than "one operation per action." It does work okay for scenarios where I'm certain to reutilize logic. But it honestly serves only for organizational purposes and does not alleviate even more common stresses that probably could be done through a proper analysis of common operations.

Part of what I'm thinking about in this layer ideally is similar to UNIX commands and piping. Each time you pipe something, you continuously transform the output until it's in a format you need. So how could I get from say $_POST with various user data to a registered user, complete with a sent email, a flag telling me I need to have the user be sent to a wizard-like help menu and possibly even storing some data into their OAuth tables (like Facebook, Twitter or Google connect)? Is there a way to intelligently chain various actions based on some input, loop what you need (like if you're mass processing data), throw up exceptions non-disruptively and elegantly, manage transactions on an as-needed basis, etc.? AOP suggested only one manner of handling this. We need more. And we need plenty of base classes that can capture 80% of the most common cases before people need to turn to a customized solution. Now THAT would be cool.
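
As a very rough sketch of the piping idea (and nothing more than that), each step below takes a context array and returns a transformed one, and the runner stops at the first step that records an error. The names run_pipeline, the step names and the context keys are all hypothetical, and the "database" and "email" steps only pretend to do their work:

```php
<?php
// Hypothetical runner: feed the output of each step into the next one,
// stopping as soon as a step reports errors.
function run_pipeline(array $steps, array $context)
{
    foreach ($steps as $name => $step) {
        $context = call_user_func($step, $context);
        if (!empty($context['errors'])) {
            $context['failed_at'] = $name;
            break;
        }
    }
    return $context;
}

$steps = array(
    'validate' => function (array $ctx) {
        $email = isset($ctx['input']['email']) ? $ctx['input']['email'] : '';
        if (!filter_var($email, FILTER_VALIDATE_EMAIL)) {
            $ctx['errors'][] = 'Invalid email';
        }
        return $ctx;
    },
    'create_user' => function (array $ctx) {
        // A real step would INSERT a row; this one fakes the result.
        $ctx['user'] = array('id' => 1, 'email' => $ctx['input']['email']);
        return $ctx;
    },
    'queue_welcome_email' => function (array $ctx) {
        $ctx['outbox'][] = 'welcome:' . $ctx['user']['email'];
        return $ctx;
    },
    'flag_wizard' => function (array $ctx) {
        $ctx['show_wizard'] = true;
        return $ctx;
    },
);

// In a controller the input would come from $_POST.
$result = run_pipeline($steps, array(
    'input'  => array('email' => 'ann@example.com'),
    'errors' => array(),
));
print_r($result);
```

The same list of steps could then be reused by a web form, a mobile endpoint or a batch script, with only the input and output formatting differing. Transactions and looping are left out entirely here.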

* Security aspects: typical things are like buffer overflow checks, validation (possibly on the models), captchas (which really are just a helper), data obfuscation (i.e. preventing people from attempting to hack into a system by obscuring field names after a page is rendered), data cleansing (ensuring that all input coming in goes through thorough checks and eliminates XSS attacks, SQL injection and general bad input that simple validation cannot catch), authorization and authentication (I hate rebuilding login/registration screens each time). Authorization on granular levels such as on controllers, actions, even page level elements or sections. An example might be showing an admin link on a page for people belonging to the admin groups.  Hierarchical authorization checks as well (meaning access granted to say a community manager would automatically be granted to a super user/admin based on either a calculation or priority for that group).
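
The hierarchical check in particular is easy to sketch if you go the "priority per group" route. The role names, the numbers and the is_allowed function below are made up for illustration:

```php
<?php
// Hypothetical priorities: a higher number inherits everything below it.
$role_priority = array(
    'guest'             => 0,
    'member'            => 10,
    'community_manager' => 50,
    'admin'             => 100,
);

// Access granted to $required is automatically granted to any role
// with an equal or higher priority. Unknown roles are always denied.
function is_allowed($role, $required, array $priorities)
{
    if (!isset($priorities[$role], $priorities[$required])) {
        return false;
    }
    return $priorities[$role] >= $priorities[$required];
}

// In a view, this would decide whether to show the admin link.
if (is_allowed('admin', 'community_manager', $role_priority)) {
    echo "show moderation link\n";
}
```

A single number per role only works for strictly linear hierarchies; anything with overlapping groups would need a proper ACL.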

* Support for output formats based on extensions in the URL. Ruby on Rails does something like this, which is pretty cool. So you could end a url with .json, .rss, .xml, etc. and Rails will understand how to render that.
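
Something along these lines could be done in PHP as well. This is only a sketch under my own assumptions: split_format and render are hypothetical names, and only HTML and JSON are handled:

```php
<?php
// Hypothetical helper: peel a known extension off the URL path and
// return array(path without extension, format).
function split_format($path, array $supported = array('html', 'json'), $default = 'html')
{
    $ext = strtolower(pathinfo($path, PATHINFO_EXTENSION));
    if ($ext !== '' && in_array($ext, $supported, true)) {
        return array(substr($path, 0, -(strlen($ext) + 1)), $ext);
    }
    return array($path, $default);
}

// Hypothetical renderer: the same data, serialized per requested format.
function render(array $data, $format)
{
    if ($format === 'json') {
        return json_encode($data);
    }
    return '<pre>' . htmlspecialchars(print_r($data, true)) . '</pre>';
}

list($route, $format) = split_format('/users/42.json');
echo $route, ' as ', $format, "\n";
echo render(array('id' => 42), $format), "\n";
```

The controller action stays the same; only the last step changes with the extension.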

* Model object automatic population from forms. I know Ruby on Rails again does something similar, which is great considering that object population tends to be an excruciatingly tedious and oftentimes unnecessary task.  Rails takes it a step further for complex forms where it uses the dot notation for cases where more than one model needs to be filled out. So for instance, if you declaratively have something like:

public function action_save($user)

based on the controller inspecting the $user variable, it would have some idea as to how to fill in the user variable.
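
Here is a small sketch of the population step itself. It leans on the fact that PHP already turns form fields named like user[name] into nested arrays in $_POST. The User class, the populate function and the whitelist argument are my own assumptions; the whitelist is there so that a form cannot set fields it was never meant to touch:

```php
<?php
// Hypothetical model object.
class User
{
    public $name;
    public $email;
    public $is_admin = false;
}

// Copy only whitelisted fields from the submitted data onto the object.
function populate($object, array $input, array $allowed)
{
    foreach ($allowed as $field) {
        if (array_key_exists($field, $input) && property_exists($object, $field)) {
            $object->$field = $input[$field];
        }
    }
    return $object;
}

// Stand-in for $_POST after submitting fields user[name], user[email], user[is_admin].
$post = array(
    'user' => array('name' => 'Ann', 'email' => 'ann@example.com', 'is_admin' => '1'),
);

$user = populate(new User(), $post['user'], array('name', 'email'));
var_dump($user);
```

A framework could call something like this before invoking action_save, using the parameter name to pick the right slice of the request.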

* Various services.  I'm talking about things like Amazon Web Services, etc. If you look at Zend, they have quite a few popular web services integrated into their system.  It's one primary reason why I ended up continuing to utilize Zend as one of my main frameworks.

* Integration with popular Javascript frameworks. I believe a few provide something like this, including Rails and maybe Cake, Zend, etc. I've done little with this stuff myself, but I like the idea of providing a better API correspondence between Javascript and the server side layer.

* Queuing service.  This is something that I think will be great for integrating with external services like Twitter.  Here's why.  Back when I worked on Livestrong, one of our chief modules was a Twitter module based on the keywords provided from a page. Then one day Michael Jackson died, taking down Twitter and inevitably our site.  It took some of the guys all night into the wee hours of the morning to determine that the problem was actually from Twitter, not us. But the dependency of calling an external site caused pages to pause or time out as a result of Curl calls that eventually led to Apache hanging.  Eventually, what we did was create a new module that would provide a two-layer caching mechanism after retrieving external content called from Curl. The first layer was using memcache (naturally). The second layer was storing the data in a flat database table. After a timeout occurred, we would simply fetch content from the database, thus ensuring that any dead connections wouldn't result in the site dying. I don't have anything particular for my own stuff, but I love the idea and hope to see it become more prevalent.
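
This is not the Livestrong code, just a sketch of the two-layer idea as I described it. To keep it runnable on its own, two ArrayObject instances stand in for memcache and the database table, and the function name fetch_with_fallback and its parameters are hypothetical:

```php
<?php
// $fetch is the slow external call (e.g. a Curl request); it is expected
// to throw on timeout. $fast stands in for memcache, $durable for a table.
function fetch_with_fallback($key, $fetch, ArrayObject $fast, ArrayObject $durable, $ttl = 60, $now = null)
{
    $now = $now === null ? time() : $now;

    // Layer 1: a fresh entry in the fast cache wins.
    if (isset($fast[$key]) && $fast[$key]['expires'] > $now) {
        return $fast[$key]['value'];
    }

    try {
        // Cache miss or expired: go to the external service.
        $value = call_user_func($fetch);
        $fast[$key]    = array('value' => $value, 'expires' => $now + $ttl);
        $durable[$key] = $value; // Layer 2: last known good copy.
        return $value;
    } catch (Exception $e) {
        // The service is down: serve the last known good copy, if any.
        return isset($durable[$key]) ? $durable[$key] : null;
    }
}

$memcache = new ArrayObject();
$table    = new ArrayObject();

$tweets = fetch_with_fallback('tweets:cycling', function () {
    return array('first tweet', 'second tweet'); // pretend Curl call
}, $memcache, $table);
print_r($tweets);
```

The important part is that the external call also needs a short timeout of its own, otherwise the page still hangs before the fallback ever kicks in.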

* File-level transaction management.  Right now, when people think of transactions, they think mostly in terms of databases that can support them. Well, what happens if the database transaction succeeds but a file operation fails? Shouldn't there be a better mechanism to manage the two in cohesion? If this was done in the AOP world, we could potentially decorate the file operation to take part in the database transaction, thus seeing a full rollback. The same could be said of anything where state is concerned.

* Batch script support. One thing that truly bothers me is the absolute lack of batch scripting support. The only one I've seen has been Spring, but it's not out of the box. Unfortunately, most batch script support is fairly dumb: create ugly script, add to cron and hope that it runs. When I worked in finance, we employed Autosys, which was a horribly expensive solution that probably could be handled with a couple of Apache servers, any backend and an infinite while loop. However, to its credit, Autosys provided a solution that I haven't seen matched in my experiences with most companies thus far. Simply stated, Autosys is an intelligent version of Cron, where it monitors jobs on various hosts. Jobs can have dependencies between each other and Autosys allows one to manage those dependencies as well as outcomes as a result of the status of a job. Putting something under cron and waiting to see if anything happens is simply an administrative nightmare, both for developers and sysadmins.  I would love seeing something that provides some form of feedback on top of a methodology to code such jobs (which probably will not differ too much from the MVC pattern).

* Unit testing. I think something that can be accomplished in a new framework that I haven't seen done yet is enforcing unit testing by making developers code up the unit tests first. This would be excellent for getting more people into the habit of TDD (test driven development) rather than "code and let us see."

* Integration testing. The question becomes how much more can we automate in this framework before handing it off to QA and usability testing? Here, it would be cool to see things like whole databases automatically created, upgraded, etc. After a system is deployed into an integration environment, scripts would then automatically run to ensure the infrastructure remains intact after the upgrade.

* Intelligent deployment mechanisms. How many times do we see companies with in-house deployment management systems that simply do something like tar/gzip up an application from a Subversion export, then scp/rsync the file over SSH to the various servers sitting behind a firewall and have them untar/gunzip it one by one? Wouldn't it be nice to specify the various parameters in some configuration file only accessible by production support and run a single script that handles all of this?

* Integration with common IDEs. I would love it for such a framework to have a plugin for Eclipse, especially to help manage more common tasks.

* Scaffolding. This is a given, but it should be an option.

* The framework probably should have three levels: core, shared and specific. Kohana does this using a cascading system. I really like the idea in that core should NEVER be touched (except for debugging purposes), thus allowing people to upgrade seamlessly. Shared assumes that components are reused across applications in house. For instance, perhaps your validation logic or certain base classes should go in this area. The specific level remains exclusive to the application. So for instance, your beauty site might have an object called BeautyTips while your health site has a Food object.
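
The lookup behind such a cascade can be quite small. This sketch is not Kohana's actual implementation; find_file and the directory names are assumptions. It searches the application-specific directory first, then shared, then core, and returns the first match:

```php
<?php
// Hypothetical lookup: return the first existing file among the search
// paths, or null when no level provides it.
function find_file($relative, array $search_paths)
{
    foreach ($search_paths as $base) {
        $candidate = rtrim($base, '/') . '/' . ltrim($relative, '/');
        if (is_file($candidate)) {
            return $candidate;
        }
    }
    return null;
}

// Most specific level first, core last (example paths only).
$search_paths = array(
    '/var/www/beautysite/application',
    '/var/www/shared',
    '/var/www/framework/core',
);

$file = find_file('classes/validate.php', $search_paths);
if ($file !== null) {
    require_once $file;
}
```

Because the application directory is consulted first, an app can override a core class without ever editing core, which is what keeps upgrades painless.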

* Some kind of installation script. I like some of what Rails, Symfony and Cake do through the command line generation of application code. My belief is that 80% of the code out there is truly boilerplate code and probably just requires some script to intelligently generate the basic crap.

* Upgrade scripts. After someone checks in code, other people who check out that code should run a script that determines whether certain assets like new database tables, added columns, new data, or other scripts need to be run/installed. I think Cake uses the notion of a deltas directory and employs the naming convention of numbering to determine which script to run next. It would be nice to see a script also automatically upgrade models when columns are added/removed or relations are made.
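
The numbering convention is simple enough to sketch. The file naming pattern (a number, an underscore, a description, then .sql) and the pending_deltas function are my own assumptions, not any framework's actual scheme:

```php
<?php
// Hypothetical helper: given the delta files on disk and the number of the
// last delta already applied, return the ones still to run, in order.
function pending_deltas(array $files, $applied)
{
    $pending = array();
    foreach ($files as $file) {
        // Only files named like "003_add_email.sql" are considered.
        if (preg_match('/^(\d+)_.+\.sql$/', basename($file), $m) && (int) $m[1] > $applied) {
            $pending[(int) $m[1]] = $file;
        }
    }
    ksort($pending); // numeric order, regardless of directory listing order
    return array_values($pending);
}

$files = array(
    'deltas/010_add_oauth_tables.sql',
    'deltas/002_add_users.sql',
    'deltas/003_add_email.sql',
    'deltas/README',
);

// Suppose the local database is already at delta 2.
print_r(pending_deltas($files, 2));
```

The last applied number would typically live in a small table in the database itself, so each checkout can work out what it is missing.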

* Scalability: I think part of the criticism for frameworks like Rails, Cake, etc. is that they're horribly slow and were designed for small websites. Later on I think they were tweaked to provide some upgrades to address these situations. However, the real issue is that I think many frameworks aren't designed or truly tested with insane traffic in mind. For instance, the model aspects for Zend probably will end up dying if people dumbly rely on operations like findParentRow as they iterate through a large set of rows, since these don't seem to use things like the notion of an object graph, caching, etc. Or how does a framework react to things like load balancing? I've seen systems in chaotic states as a result of poor session management that cannot be tested in a non-load balanced environment. Or how does an application address database clustering and slave/master style replication?

* Caching: I've become a monstrous fan of memcache. But sometimes it's not an option initially. And how does one integrate it? How can one smoothly use caching without violating principles like the separation of infrastructure concerns from application logic concerns? Also, in terms of where we can apply it, I see four main areas: external service calls that retrieve data, database data, page level data and session data.

* SEO management: Mostly, this is in terms of SEO friendly URL management, which tends to be handled via routing mechanisms and helper classes. But this is something to think about in the Google SEO world.

* Theme management: I think something lacking in frameworks is some preset themes. Take something like Joomla, where various essential CMS features add a higher level on top of a basic framework.

Anyway, I think that's plenty of ideas for now :)
