It’s Oh So Quiet

It’s so quiet in this blog, because it is closed. Forever. No more posts will ever be published, you can’t send any comments, and you can safely unsubscribe from its RSS feed. More than 8,000 unique visitors a month now have time to procrastinate elsewhere.

I have finally admitted that I have better things to do in my free time than contributing to the symfony project. So I can say goodbye to all the following, without regret:

  • Writing a book,
  • Publishing 54 blog posts here and a certain amount in the symfony project blog,
  • Reading 715 comments here and countless emails in the symfony mailing-lists,
  • Following the symfony timeline every day, and reviewing the code contributed to the framework core,
  • Developing, testing, and documenting more than 20 plugins,
  • Giving a few conferences and trainings,
  • And helping newcomers find their way in the symfony ecosystem via IRC, chat, and email.

You can’t imagine how much time all that takes. Well, that’s how much free time I get by leaving symfony completely.

Since summer 2005, my involvement in symfony has been more and more thorough, more and more visceral, and more and more painful. I did my best to push symfony in the direction that I considered to be the right one, but I failed. Symfony used to be simple, well documented, and powerful; today it’s just powerful. Long forgotten are the days when symfony’s motto was “Professional Tools for Lazy Folks”. I see no future in a project where release dates are never met, where new features are released undocumented, where discussions either never start or die without a generally accepted decision, where the community is tolerated only for its praise, and most of all, where the average user is despised.

If you use any of the symfony plugins that I developed and maintained, and if you want to contribute back to their code, you should contact Kris Wallsmith, the new symfony community manager. He’s responsible for these plugins now, and will give developer access at his own discretion. As for me, I may use my commit access for modifications regarding the projects I work on, without further notice.

  • DbFinderPlugin
  • sfAssetsLibraryPlugin
  • sfControlPanelPlugin
  • sfFeed2Plugin
  • sfMediaLibraryPlugin
  • sfModerationPlugin
  • sfPagerNavigationPlugin
  • sfPropelActAsSortableBehaviorPlugin
  • sfPropelAlternativeSchemaPlugin
  • sfPropelSpamTagBehaviorPlugin
  • sfSimpleBlogPlugin
  • sfSimpleCMSPlugin
  • sfSimpleForumPlugin
  • sfSpyPlugin
  • sfStatsPlugin
  • sfUFOPlugin
  • sfUJSPlugin
  • sfWebBrowserPlugin

Redoing the web was an ambitious task. Who knows, I might still manage to do it in the future.

The Good, The Bad and the Ugly

There is a lot to learn from Fabien Potencier, the creator of the symfony framework. He often comments on people's work on other frameworks, but almost never when someone works on his own framework. So his recent reaction to my DDD experiment is rare enough to be thoroughly analyzed, and distilled. Let's look for the very substance of his latest post.

The Good

It's been more than ten months since my original Forms post, about which Fabien basically told me that I knew nothing about programming (which is true) but nothing else. So I'm very glad that he finally decided to give more feedback about the ideas I suggest. My previous post asked for developers' thoughts about a reworked Chapter 10, and who could better do this than him?

Not only does Fabien talk about my DDD experiment, he actually implemented some features I suggested. Thanks to commits made to the symfony 1.2 branch early this week, building a form object using the sfForm class alone is easier. The additions are the ability to define default values for each widget, the ability to iterate on a form object, and the ability to directly set a widget or a validator from the form object. I proposed all those changes to make the documentation easier to read (and write), and it seems that they are also of interest to the developers.

Fabien points out some mistakes I made, like the 'multiple' validator, which is a bad idea. Since there are two kinds of multiple validators (on the uncleaned value and on the cleaned value), the 'multiple' keyword is not the best choice. I still think that 'pre' and 'post' validators could get a better name, but that's a detail.

Also, the modified version of my proposed Chapter 10, published as an attachment to Fabien's post, keeps about 90% of the original text unchanged. It seems that he disagrees mostly on some technical details, and didn't touch the order in which things were introduced, nor the length of the Chapter.

The Bad

But the idea of defining form widgets and validators based on an associative array got no credit in his eyes. It is probably a matter of preference; I introduced this array syntax for two reasons:

  • To avoid introducing too many classes too early in the documentation
  • To make the YAML form definition syntax completely natural

To my eyes, this array syntax has always been a layer on top of the existing syntax; that means that the ability to define custom widgets and validators is still there, and the ability to pass an object instead of an associative array is still there as well. That's how I tried to remove the coding style preference problem: whatever you like more, symfony supports it. You get both simplicity of use and power.
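To make the "layer on top" idea concrete, here is a minimal sketch of how such a layer could accept a type name, an associative array, or a ready-made object. It is not the symfony implementation: SketchWidget and sketchNormalizeWidget() are hypothetical names invented for this illustration, and the real sfWidgetForm classes have richer constructors.

```php
<?php
// Stand-in widget class, for illustration only (not a symfony class).
class SketchWidget
{
  public $type;
  public $options;

  public function __construct($type, array $options = array())
  {
    $this->type = $type;
    $this->options = $options;
  }
}

// Hypothetical helper: turns any of the three supported notations into an object.
function sketchNormalizeWidget($definition)
{
  if ($definition instanceof SketchWidget)
  {
    return $definition; // object syntax: passed through untouched
  }
  if (is_string($definition))
  {
    return new SketchWidget($definition); // shorthand: 'textarea'
  }
  if (is_array($definition) && isset($definition['type']))
  {
    $type = $definition['type'];
    unset($definition['type']);

    return new SketchWidget($type, $definition); // remaining keys become options
  }

  throw new InvalidArgumentException('Unsupported widget definition');
}

// With a single input array, array_map() keeps the field names as keys.
$widgets = array_map('sketchNormalizeWidget', array(
  'name'    => 'text',
  'subject' => array('type' => 'select', 'choices' => array('Subject A', 'Subject B')),
  'message' => new SketchWidget('textarea'),
));
```

Because objects pass through unchanged, code written against the object syntax would keep working, which is the backward compatibility argument made above.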

"[The suggested API] is not shorter, it is not easier to understand, and it is more difficult to explain." Let me respectfully disagree. It is shorter, easier to understand, and easier to explain. Compare what a single chapter explains and the confusion introduced by an unfinished and lengthy book. Or better, compare:

// in PHP
<?php
class ContactForm extends sfForm
{
  protected static $subjects = array('Subject A', 'Subject B', 'Subject C');
 
  public function configure()
  {
    $this->widgetSchema->setNameFormat('contact[%s]');
    $this->widgetSchema->setIdFormat('my_form_%s');
    $this->setWidgets(array(
      'name'    => new sfWidgetFormInput(),
      'email'   => new sfWidgetFormInput(),
      'subject' => new sfWidgetFormSelect(array('choices' => self::$subjects)),
      'message' => new sfWidgetFormTextarea(),
      'file'    => new sfWidgetFormInputFile(),
    ));
 
    $this->setValidators(array(
      'name'    => new sfValidatorString(array('required' => false)),
      'email'   => new sfValidatorEmail(),
      'subject' => new sfValidatorChoice(array('choices' => array_keys(self::$subjects))),
      'message' => new sfValidatorString(array('min_length' => 4), array('min_length' => 'Your message is too short')),
      'file'    => new sfValidatorFile(),
    ));
    $this->setDefault('email', 'me@example.com');
  }
}
?>


And:

# in YAML
subjects:     &subjects [Subject A, Subject B, Subject C]
name_format:  contact[%s]
id_format:    my_form_%s
widgets:
  name:       text
  email:      { type: text, default: me@example.com }
  subject:    { type: select, choices: *subjects }
  message:    textarea
  file:       file
validators:
  name:       { type: string, required: false }
  email:      email
  subject:    { type: choice, choices: *subjects }
  message:    { type: string, min_length: 4, errors: { min_length: Your message is too short } }
  file:       file

According to Fabien, using associative arrays to make the YAML syntax easier to explain is of no use, since YAML is bad and shall be dropped altogether. "Learn from our mistakes" means that symfony should never have used YAML in the first place, despite the fact that it attracted numerous users and that it's only a simplicity layer. That means that the symfony 1.2 admin generator will not be controlled from a YAML file at all - defining form widgets in YAML is an indispensable brick for a YAML syntax for an administration interface. So be prepared to use XML or plain PHP for your database schemas, configuration files, generated modules, etc.

"Learn from our mistakes" also means not using strings to define HTML attributes anymore. I suggested keeping the ability of symfony 1.0 to output clean XHTML attributes from a string looking like 'id=contact_subject class=bar'. But that is something else you should forget about. Apparently, this brings no benefit over array('id' => 'contact_subject', 'class' => 'bar'). Once again, I can understand it's a matter of preference, but I'm convinced that this kind of syntactic sugar is what attracted many users to symfony in the first place.
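For readers who never used the string notation, the following sketch shows roughly what this syntactic sugar involves: parse the string into an array, then render the array as XHTML attributes. The functions sketchParseAttributes() and sketchRenderAttributes() are hypothetical names written for this post, not the symfony 1.0 helpers, and the parser only handles unquoted values and double-quoted values.

```php
<?php
// Hypothetical helper: turns 'id=foo class=bar' into array('id' => 'foo', 'class' => 'bar').
// Arrays are returned untouched, so both notations can coexist.
function sketchParseAttributes($attributes)
{
  if (is_array($attributes))
  {
    return $attributes;
  }

  $parsed = array();
  // name=value or name="value with spaces"
  preg_match_all('/([\w:-]+)\s*=\s*(?:"([^"]*)"|(\S+))/', (string) $attributes, $matches, PREG_SET_ORDER);
  foreach ($matches as $match)
  {
    // Group 3 holds an unquoted value, group 2 a double-quoted one.
    $parsed[$match[1]] = isset($match[3]) ? $match[3] : $match[2];
  }

  return $parsed;
}

// Hypothetical helper: renders an attribute array with escaped values.
function sketchRenderAttributes(array $attributes)
{
  $html = '';
  foreach ($attributes as $name => $value)
  {
    $html .= sprintf(' %s="%s"', $name, htmlspecialchars($value, ENT_QUOTES, 'UTF-8'));
  }

  return $html;
}

$fromString = sketchParseAttributes('id=contact_subject class=bar');
$fromArray  = sketchParseAttributes(array('id' => 'contact_subject', 'class' => 'bar'));
```

Both calls produce the same array, so supporting the string form costs one small parsing step and takes nothing away from the array form.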

In the documentation I wrote, I introduced a way to bind a form object to the request object ($form->bind($request)). I explained later in the chapter why it's better to do otherwise, but at least it shows that a form can be used without necessarily using the array syntax for widget names. Fabien explains why this is wrong (as I did) and fails to see the interest of introducing setNameFormat() in the documentation only when it becomes necessary. However, the current symfony book uses this technique several times (in Chapter 2, for instance), because it is easier to know why a practice is wrong once you've seen the advantages of the good practice in comparison. His revised version of Chapter 10 doesn't solve the problem, either.

But honestly, I don't care much about all these points. If it was just for Fabien's remarks on the technical side of things, I'd be more than willing to continue working on a modified Chapter 10 to make it worthy of The Guide.

The Ugly

But Fabien is really going to a nasty place with his post.

Does he bring a response to my request for comment? No, he replies to a commenter of my post, who challenged him to give his opinion. He never actually addresses me directly - I'm a persona non grata.

Does he agree on the DDD experiment? Of course not, DDD is the "Biggest problem" according to him (not sure why). How could documentation and teaching influence a developer's design decisions? The forms API is "good enough" as is, and its lack of documentation is not a problem to be solved by modifying the code. If someone (but not me) wants to write something to "help us improve the current documentation", he can still try again.

Is he grateful that I did in a week a piece of work (Chapter 10) that he couldn't do in a year? Not at all. Not a word of thanks, he just feels insulted that someone dared to question parts of his work. What a childish reaction to someone who offers to help widen symfony adoption.

Does he react in the interest of the community, proposing to use his own version of the Chapter 10 for the current guide? Not even that. He prefers no documentation at all rather than an equivalent to the symfony 1.0 documentation updated for symfony 1.1.

Does he care about the newcomers to symfony, those who don't know the 1.0 API by heart, don't follow the Trac timeline every day, don't read the framework code, and don't accept an UPGRADE text file for a documentation? No. Not a word about them in his post. Only the developers who already know good practices of web development can start using symfony. You can no longer learn these practices by learning symfony.

Does he write that he's been implementing some of my ideas? No, that would be giving me too much credit. The goal, here, is to show that I am a bad developer, and nothing else. Well, I'm not even a developer, so why all the hate?

Is he trying to be constructive? No. He writes, in bad faith: "[Francois'] API is so unintuitive that we must explain a lot of things to describe the way it works". God, I thought that I managed to explain in 1 hour what he needs a day to teach in an expensive training, and it's my API that is unintuitive?

Conclusion

All in all, Fabien reacts with pride rather than reason. He does as much as he can to discredit my work, while my purpose has always been to help increase symfony adoption. He probably dreams that, with a single blog post, I'd leave the symfony community completely, because he finally demonstrated that none of my work is worthy in his eyes.

Too bad, Fabien, you have taken the wrong path. I'll be a pain in your ass for a long time. Count me in to constantly remind people that symfony is a one-man work, and that this is a very high risk for enterprise projects, given the man.

Chapter 10 - Forms

Dealing with the display of form inputs, the validation of a form submission, and all the particular cases of forms is one of the most complex tasks in web development. Luckily, symfony provides a simple interface to a very powerful form sub-framework, and helps you to design and handle forms of any level of complexity in just a few lines of code.

NOTICE: This document is the first draft of a methodology experiment explained earlier in this blog. It documents the sfForm framework found in symfony 1.1, but with some changes in the API and usage. As such, it describes a library that is not yet written (like that) and cannot be used to learn the usage of the current sfForm implementation. It is quite long, so you might prefer to download the Markdown version and read it offline. Being a first draft, this document is a call for comments, both about its structure and its content. And if you are interested in implementing the differences between what this document describes and what is currently implemented in the symfony framework, please contact me.


Sorting By Custom Column in the Symfony Admin Generator

Did you ever wish you could sort by a partial column in the admin generator? Using DbFinder and a few lines of code, it is now possible.

The symfony admin generator allows you to select which properties of a model you want to display. You can include foreign key fields, or even a partial field to display pretty much everything you want in the list view. The following example uses this ability to display the name of blog authors, based on the fact that the Blog model has a many-to-one relationship to the User model:

# in mymodule/config/generator.yml
generator:
  class:          sfPropelAdminGenerator
  param:
    model_class:  Blog
    theme:        default

    list:
      display:    [=title, user, category, _nb_posts, created_at]
      fields:
        user:     { name: Author }

This generator configuration includes a partial field that counts the number of blog posts for each blog:

// in mymodule/templates/_nb_posts.php
<?php echo $blog->countBlogPosts() ?>


The problem is that only the "true" fields (that is, the ones that correspond to a column in the main table) are sortable. The result is that, with the preceding example, only the title column is sortable.

With symfony alone, there is no way to make the other columns sortable except overriding the whole _list_th_tabular.php partial in your module, overriding the addSortCriteria() method in the action, and losing the ability to add or remove columns in the future.

Enter DbFinderPlugin. You probably know from this blog that DbFinder offers a very powerful and yet simple way to replace Propel Criteria queries. What you might not know is that the DbFinder plugin bundles a full admin generator theme. It has the exact same features and syntax as the standard symfony admin generator, but it is entirely written with DbFinder queries. And to make this generator theme very usable, it includes the batch_actions extension from symfony 1.1 (that's what displays the checkboxes on the left side of the list, so that you can perform an action on several records at a time), and the ability to sort by any type of column.

To use the DbFinder admin generator, no need to switch your entire project to DbFinder. Just install the plugin, edit the generator.yml of one of your generated modules, and change the class property from sfPropelAdminGenerator (or sfDoctrineAdminGenerator, if you use Doctrine) to DbFinderAdminGenerator. Refresh the page in your browser, and you should normally see no change. That's good news: despite the fact that all the generator code has been rewritten to work with DbFinder instead of Propel, it is completely backwards compatible.

And once a generated module uses DbFinder, you gain access to the new sort_method option for custom fields:

# in mymodule/config/generator.yml
generator:
  class:          DbFinderAdminGenerator
  param:
    model_class:  Blog
    theme:        default

    list:
      display:    [=title, user, category, _nb_posts, created_at]
      fields:
        user:     { name: Author, sort_method: orderByUsername }
        category: { sort_method: orderByCategory }
        nb_posts: { sort_method: orderByNbPosts }

Refresh the list view, and voila, the column headers are now clickable.

Don't click the new links yet: you've defined three methods for custom ordering, and you still have to write them. To do so, you need to create a BlogFinder, which is a finder class specific to the Blog model class. So create a lib/model/BlogFinder.php class with the following content:

// in lib/model/BlogFinder.php
class BlogFinder extends DbFinder
{
  protected $class = 'Blog';
 
  public function orderByUsername($order = 'asc')
  {
    return $this->orderBy('User.Name', $order);
  }

  public function orderByCategory($order = 'asc')
  {
    return $this->orderBy('Category.Name', $order);
  }
 
  public function orderByNbPosts($order = 'asc')
  {
    return $this->
      leftJoin('BlogPost')->
      groupBy('Blog.Id')->
      withColumn('COUNT(BlogPost.Id)', 'nbPosts')->
      orderBy('nbPosts', $order);
  }
}


The finder is smart enough to guess the relationship between the Blog and the User model, as well as the relationship with the Category model, because the YAML schema defines foreign keys between the related tables.

Clear the cache (to allow the autoloading to find the new finder class), refresh your list, and enjoy fully sortable columns.

To finish, here is a small trick to drastically improve your backend performance. Every time the _nb_posts partial is called (and that's once per row in the list), symfony issues a COUNT query. That means that the current configuration will run n+1 queries, n being the number of results per page (typically 20). That's pretty bad for performance. What if you could hydrate an additional column in the main query and use this column in the _nb_posts partial? With DbFinder, that's very easy. Just add a finder_methods setting to your list configuration, as follows:

# in mymodule/config/generator.yml
list:
  display:        [=title, user, category, _nb_posts, created_at]
  fields: 
    user:         { name: Author, sort_method: orderByUsername }
    category:     { sort_method: orderByCategory }
    nb_posts:     { sort_method: orderByNbPosts }
  finder_methods: [withNbPosts]

Symfony executes all the methods defined in the finder_methods setting before displaying the list. It allows you to define a default ordering, to filter out some records, or, like here, to add a custom column to the main query.

Now it's time to create this BlogFinder::withNbPosts() method. Since it contains part of the code of orderByNbPosts(), and since the finder generator executes sort methods at the end of the action, you can reduce the orderByNbPosts() code accordingly:

// in lib/model/BlogFinder.php
public function withNbPosts()
{
  return $this->
    leftJoin('BlogPost')->
    groupBy('Blog.Id')->
    withColumn('COUNT(BlogPost.Id)', 'nbPosts');
}

public function orderByNbPosts($order = 'asc')
{
  return $this->orderBy('nbPosts', $order);
}


Now the main list query includes the call for the calculated nbPosts column, and you can change the _nb_posts partial to use it:

// in mymodule/templates/_nb_posts.php
<?php echo $blog->getColumn('nbPosts') ?>


Refresh the list view: Ta-da, the result is the same, but using a single query instead of n+1.
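If you want to see the difference outside of symfony, here is a small standalone sketch that reproduces both patterns with PDO and an in-memory SQLite database. It does not use DbFinder or Propel at all; the table and column names (blog, blog_post, blog_id) are assumptions made for the illustration, and the SQL is only a guess at the kind of query the finder builds.

```php
<?php
// Illustration only: hypothetical tables in an in-memory SQLite database.
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$db->exec('CREATE TABLE blog (id INTEGER PRIMARY KEY, title TEXT)');
$db->exec('CREATE TABLE blog_post (id INTEGER PRIMARY KEY, blog_id INTEGER)');
$db->exec("INSERT INTO blog (id, title) VALUES (1, 'First'), (2, 'Second'), (3, 'Empty')");
$db->exec('INSERT INTO blog_post (blog_id) VALUES (1), (1), (2)');

// n+1 pattern: one query for the list, then one COUNT query per row.
$queries = 1;
$countsPerRow = array();
foreach ($db->query('SELECT id FROM blog ORDER BY id') as $blog)
{
  $stmt = $db->prepare('SELECT COUNT(*) FROM blog_post WHERE blog_id = ?');
  $stmt->execute(array($blog['id']));
  $countsPerRow[$blog['id']] = (int) $stmt->fetchColumn();
  $queries++;
}

// Single-query pattern: the count is computed in the main query.
// The LEFT JOIN keeps blogs without posts, with a count of 0.
$sql = 'SELECT blog.id, COUNT(blog_post.id) AS nbPosts
        FROM blog LEFT JOIN blog_post ON blog_post.blog_id = blog.id
        GROUP BY blog.id ORDER BY blog.id';
$grouped = array();
foreach ($db->query($sql) as $row)
{
  $grouped[$row['id']] = (int) $row['nbPosts'];
}
```

With three blogs, the first pattern runs four queries and the second runs one, and both end up with the same counts per blog.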

So the DbFinder generator offers the same features as the current symfony 1.1 generator, and more. Don't wait until you upgrade your project to symfony 1.2 to enhance your generated modules. Read the DbFinder admin generator documentation, and download the plugin right away.

Document-Driven Development in Practice: Rethinking sfForms

If you've watched or read my presentation on Documentation-Driven Development, you may wonder how to put that new methodology into action. A practical example is often better than a long explanation, so let's see how to apply it to the new Forms sub-framework introduced by symfony 1.1.

Not DDD

In order to use the new sfForm library, you must either read a book (not yet completely written) or dive into the source code and guess how to use it. To my mind, this is pretty much the contrary of what leads to a large adoption.

The Form framework was designed with power in mind, and reaches this goal very well: you can use it to create forms of any level of complexity, including forms embedding other forms, forms with a variable number of fields, forms split into several steps ("wizards"), etc. It is very much object oriented, so everything can be reused or overridden.

But unfortunately, in order to create a simple form, you need to learn a lot more and write a lot more code than what you used to do in symfony 1.0. The current Forms documentation describes the API and justifies its implementation. It goes very much into the details of each part of the sub-framework, and quite early in the learning process. The result - for me, at least - is that the reader feels overwhelmed by the huge number of classes, features and options, and dismisses the whole sub-framework for being too complex.

"Let's use that new Form stuff for complex forms and keep the current form helpers and YAML validation for everyday forms", I hear. That's a pity, because once you understand how the new Forms sub-framework works and accept its verbosity, there is no good reason to stick with the old system.

An Ideal sfForm Documentation

I think that a piece of documentation is missing. This piece is probably an introduction to the Form sub-framework.

In symfony 1.0, a single chapter of the book was enough to master forms for most use cases. Even if the new form sub-framework is more powerful than the 1.0 one, it should not be more complicated to learn and use in similar cases. So the sfForms introduction should be short, requiring at most one hour to read it.

After reading this documentation, an average developer should be able to use sfForms in 80% of the cases. That includes at least all the features described in the original Forms chapter of the symfony book:

  • Displaying a form
  • Available form helpers
  • Displaying a model-based form
  • Dealing with Foreign keys
  • Handling a form submission
  • Validating a form
  • Available validators
  • Repopulating a form
  • Complex use cases

The target audience would be people knowing some concepts about symfony, but not yet everything. In fact, they should know what the Chapters 1 to 9 of the symfony guide cover, not more. So some advanced concepts should probably be skipped, or explained only after the fundamental usage is clear.

This introduction should not require additional lookup in the Forms book. That means that it should be self-sufficient. It probably also means not including the justifications of the Forms implementation that you can find in the current Forms book. The reasons why the API was designed the way it is should become obvious at the end of the introduction. Expert customization and rare use cases should probably also be left aside.

The symfony 1.0 documentation introduces concepts and features in a certain order, with a precise purpose: not loading too much information into the reader's mind at a time. In a similar fashion, the forms introduction should be a linear piece of documentation, not a set of articles that you can read in any order with hyperlinks everywhere to break the reading flow.

The forms framework is powerful, but the current form book somehow translates that into length, and verbosity. On the contrary, I think the reader should feel exalted: the documentation should put him in a rush to start using the new forms. So the forms introduction should "tell a story", and gently lead the reader to a point where he feels he can grab the steering wheel and drive the car by himself.

API enhancements

The problem is that explaining the current API takes much longer than a single piece of documentation. That's because of the many options available, because of the many objects to learn, and because even the simplest things (like a list of form controls) look complicated (sfWidgetFormSchema).

There is not much choice to overcome this problem. In order to write a short and readable guide to the forms sub-framework, its API must be adapted. That's right, the API must be changed so that the documentation can be made shorter, and more usable. This is one of the principles of the Documentation-Driven Development methodology.

These API enhancements should be completely backward compatible, so that any existing application using the current sfForms implementation can continue to work seamlessly with the modified implementation. In a way, that qualifies the API enhancements as a simplicity layer on top of the existing code. As a side note, the current Forms book still remains indispensable for advanced usage.

Note that the API enhancements don't need to be implemented before the new documentation is published. The implementation comes second, after the documentation. That's another of the DDD principles: explain first, make it work afterwards. After all, project managers write requirements for web applications before they exist, all the time.

Do As I Do, Not As I Say

Some people are getting sick of reading me criticizing parts of the symfony framework. Well, I'm not criticizing: I'm actively improving.

Rethinking sfForms is a good example of Documentation-Driven Development. To illustrate this methodology, I'm going to rewrite Chapter 10 of the symfony book for symfony 1.1. That's right, the current Chapter 10, which describes the "old way" of doing forms, can be rewritten in a similar fashion and serve for symfony 1.1.

But since the current API requires too much explanation to be used, I'm going to introduce the necessary API changes to the sfForms library. I'll create and manage forms in a way slightly different from what the current API allows, to make it simpler to use - and to explain.

When the new Chapter 10 is published here in this very blog, this piece of documentation will be of no use since the features it describes won't be implemented yet. But I know that writing documentation is not enough to convince people (yet), so I will implement the API changes as a second step to the exercise. As I'm not a very good developer, any help will be welcome during that phase (contact me if you want to give me a hand after the documentation is published).

If everything goes well, the implementation of the API changes will be released as a symfony plugin - maybe called sfSimpleForms. I hope it can lead more developers to adopt the greatest open-source Forms framework around.

Designing a CMS Architecture

When faced with the alternative between an off-the-shelf CMS or a custom development, many companies pick solutions like ezPublish or Drupal. In addition to being free, these CMS seem to fulfill all possible requirements. But while choosing an open-source solution is a great idea, going for a full-featured CMS may prove more expensive than designing and developing your own Custom Management System.

Hidden Costs

What does it cost to integrate and deploy a website based on an open-source CMS? At first sight, not much. As for every CMS, you have to design your own templates and fill your website with initial data. But there are additional costs that pop up as soon as you need a little more than just plain content management.

Think about adding a blog or a forum to a website managed by a CMS. There are modules or plugins for that, but they never provide the same flexibility as plain blogging engines such as Wordpress, or plain forum engines like phpBB. So even if the basic requirement is fulfilled by a module, you will always need - always - to adapt its code.

And this is where it gets ugly. The code base of open source CMS engines and their plugins is nowhere near as good as what you can see in RAD frameworks these days. Most of them are based on a very old architecture (PHP4, no object orientation, no proper error handling, direct access to the database, etc.). That means that changing something will be very painful, and very expensive. You will encounter numerous bugs, you will change the blogging plugin three times because none of the ones you tested is capable of doing what you need, and you will upgrade your CMS to the latest version to benefit from this single bug fix that should save your life, only to find that you need to change all your existing configuration...

This is as bad as it sounds. Start changing one single line of code in an application built on top of Drupal or ezPublish, to name only the two major ones, and you are in trouble. The moment you need something that is not natively supported, you enter the Dark Zone of CMS hell. You are going to spend a lot of money on development. You will never see the end of the tunnel. That is, until someone says, a few years from now, "Do we need all that crap? Let's build something that fits our needs and that actually works".

Making Your Own CMS

Given the number of available open-source CMS solutions, building one on your own sounds like a stupid idea. But if your website is 50% content management and 50% something else, you probably need to start with a web application framework like symfony or Django, rather than a CMS. These frameworks provide plugins that do part of the Content Management job already, so creating a CMS today is like assembling Lego bricks to build something that exactly fits your needs.

Take symfony, for instance. It provides native support, or support through plugins, for:

Symfony doesn't yet provide an Access Control List or a Workflow plugin, but you can already put all of the above together and have a pretty powerful CMS engine.

A tailor-made CMS will always have less code and show better performance than any of the existing full-featured solutions. Also, you will be able to tweak it completely, since all the components are decoupled, and built with extensibility in mind.

Your custom CMS will cost you more during the first year, but if you expect your website(s) to live longer than that, then the benefit will become obvious after a year and a half. Plugging the CMS features into other parts of the website, adding features unrelated to content management, scaling to a larger audience, replacing the database engine or the caching backend, all that will be painless.

That is, if you design your custom CMS carefully, and with the future in mind.

Environments

When you add features to an application, you need a testing environment - a place where you can check that the additions work and don't kill the rest of the application. That means that developers have a version of the website on their desktop computer, where they change stuff. Then, they upload the application to a test server, check that everything is OK, and only then can they deploy the application to the production server. This is a very common practice, often backed up by source version control and continuous integration tools.

But what happens when a new feature is not made of code, but of data? In ezPublish, for instance, in order to define a new type of content (they call it a "Class"), you have to use the backend web interface and fill in a few forms. The properties of the new type of content are stored in the database. In order to deploy this new type of content from the testing environment to the production environment, the developers need to transfer data from one database to another - without wiping off unrelated information on the production database, such as user comments, statistics, etc.

Deploying new features in this context means executing some SQL code on each server. This is much more dangerous than just pushing a new version of the codebase, especially when the data model is made of many tables glued together in complex joins. That's why, in many websites based on ezPublish, developers add features directly on the production environment, or repeat the configuration using the backend interface on every environment. This is either a high risk or a large waste of time.

Data, or Code?

This environment drawback tends to be a major influence on the choice of features a CMS should provide. For almost every CMS feature, you should wonder: Can the user do that through the backend interface, or do we need a programmer to add a new element? In other words, is the feature made of data, or code?

Off-the-shelf CMS engines will almost always answer 'Data'. My personal opinion is that it is wrong in many cases. Content types are just one example, but think about workflows or page layouts for instance. They define a complex logic that always translates to code, and giving the user the ability to change them via a backend interface means storing code in the database and evaluating it at runtime. Then you can't use opcode cache engines like APC to increase your website performance. And deploying that to production is a nightmare.

Some companies think that most of the CMS features should be accessible via a backend interface in order to be able to enhance the application without additional development. But this is an illusion. For one, the configuration of content classes in ezPublish is so complex that it does indeed require a PHP developer, and an expensive one, since experience with ezPublish is one of the most sought-after skills in the IT market (at least in France). More features mean more development, and there is no CMS out there that replaces the power of a programming language with a web interface.

So that leads to one good rule of thumb: Design your features so that they can be made of code rather than data. That applies to elements that can be modified by a graphical user interface, or programmatically:

  • Content classes
  • "Widgets" or "Components" for pages
  • Page layouts or "templates"
  • Content validation workflow
  • Tasks
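
As a minimal sketch of what "code rather than data" can look like for a content class, the following PHP snippet declares a content type as a plain class kept under version control. The names RecipeType, getFields() and validate() are hypothetical and do not come from symfony or from any existing CMS; the only point is that the definition ships with the codebase instead of living in database rows.

```php
<?php
// Hypothetical content class defined in code, not in the database.
// Deploying it to another environment only requires pushing the codebase.
class RecipeType
{
  // Field name => whether the field is required
  public function getFields()
  {
    return array(
      'title'            => true,
      'ingredients'      => true,
      'preparation_time' => false,
    );
  }

  // Return the names of the required fields missing from a content array
  public function validate(array $content)
  {
    $missing = array();
    foreach ($this->getFields() as $name => $required)
    {
      if ($required && (!isset($content[$name]) || $content[$name] === ''))
      {
        $missing[] = $name;
      }
    }

    return $missing;
  }
}

$type = new RecipeType();
$missing = $type->validate(array('title' => 'Ratatouille'));
echo implode(', ', $missing), "\n";
```

Adding a field then becomes a one-line change that goes through the same source control, test server, and deployment steps as any other code.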

Fundamental questions

The complexity of a CMS engine depends greatly on the answer you give to a few fundamental questions:

  • Can contents exist independently of a page?
  • Can contents exist at more than one place in the website?
  • Are there several views for a single piece of content?
  • Can contents have different versions simultaneously?
  • Can contents be modified in the backend and remain unchanged in the frontend?
  • Can users compose a page with "widgets" or "components" in a WYSIWYG interface?
  • Can predefined zones in a template contain more than one "widget" or "component"?
  • Can section pages have different templates?
  • Can section pages have different versions simultaneously?
  • Can users program the publishing of a section page, or of contents, in advance?
  • Can the CMS remember previous URLs for a content that changed title?

If the answer to the first question is no, then the concepts of "page" and "content" coincide. You probably don't need to develop anything, since your CMS will be quite simple.

If you answer yes to all these questions, then the CMS might take three times longer to develop than it otherwise would.

That's why the idea of a tailor-made CMS is not that stupid. No existing CMS will be able to answer these questions in every possible way. But designing your own relational schema based on the answer to these questions makes sense, economically speaking. Don't make it complex if you don't need to, or, to put it otherwise, Keep It Simple, Stupid.

Bootstrapping the reflection

Now that you're trying to imagine what you actually need for your own CMS, here is a glimpse of the kind of technical challenge you will face all the time.

The question revolves around the concept of content types. In a CMS, you mostly deal with "articles". This type of content has a title, an author, a summary, a body, and a few other attributes. But you probably also need to deal with some other content types, like movies, slide shows, quiz games, polls, or recipes. These content types are defined by properties distinct from those of an article. Some of them can fit in a single structure, others require several structures related to each other. For instance, quiz games require a structure for the quiz itself, one for the questions, one for the answers to each question, and one for the quiz results.

The question is: Do you store the data for all these content types in a single table, or do you create a table for each content type? The most "normalized" choice is probably to create one data structure for each. You could have an "article" table, a "recipe" table, and even a "quiz" table with foreign keys to a "quiz_question" and a "quiz_result" table. That would allow you to make queries on some specific attributes of a specific content type. You could build a custom search engine for your recipes and look for ingredients, foreign cuisine and preparation time.

But then, if each content type has its own table(s), what do you do when you have to list all the contents of a section, or worse (that happens in the backend) all the contents of the website? Does that mean that, in order to display a list of contents, you must query several tables and aggregate the results together? This solution simply doesn't scale, and a CMS built like that will become slower and slower as you add new content types.

So that probably means that you should store a reference to each content in a separate table, with a copy of the data that is generic to all content types (like title, publication date, section, etc.). Pages displaying a list of contents would use this aggregate table, while pages displaying content details would use the specific tables.

And that means that you must find a way to synchronize the specific tables and the generic tables whenever data changes in content. That's not a big deal, but it gives you an idea of the kind of complexity you will encounter in a large scale CMS.
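
To make the synchronization concrete, here is a minimal sketch that uses in-memory arrays as stand-ins for the database tables. The ContentStore class and its save() and listSection() methods are hypothetical names chosen for illustration; a real implementation would presumably issue two writes inside one database transaction, or rely on ORM hooks.

```php
<?php
// Hypothetical in-memory stand-ins for database tables.
class ContentStore
{
  public $specific = array(); // one "table" per content type, keyed by type, then id
  public $generic  = array(); // aggregate "table" read by listing pages
  private $nextId  = 1;

  // Save the type-specific data and refresh the generic copy in the same operation
  public function save($type, array $data, $id = null)
  {
    if ($id === null)
    {
      $id = $this->nextId++;
    }
    $this->specific[$type][$id] = $data;
    $this->generic[$id] = array(
      'type'    => $type,
      'title'   => $data['title'],
      'section' => $data['section'],
    );

    return $id;
  }

  // Listing pages only read the aggregate table, whatever the content type
  public function listSection($section)
  {
    $titles = array();
    foreach ($this->generic as $row)
    {
      if ($row['section'] === $section)
      {
        $titles[] = $row['title'];
      }
    }

    return $titles;
  }
}

$store = new ContentStore();
$articleId = $store->save('article', array('title' => 'Hello', 'section' => 'news', 'body' => '...'));
$store->save('recipe', array('title' => 'Ratatouille', 'section' => 'food', 'ingredients' => 'eggplant'));
// Updating the article also updates its generic copy
$store->save('article', array('title' => 'Hello again', 'section' => 'news', 'body' => '...'), $articleId);
$newsTitles = $store->listSection('news');
```

Routing every write through a single save() method is one way to keep the two copies from drifting apart; the listing code never needs to know how many content types exist.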

A Challenging Exercise

Designing a CMS is difficult and fun, and you'll probably do it more than once. Every CMS is different, because every content management need is different, and mostly because every customer wants more than just plain content management.

If you are a developer, whenever you meet a client who asks you for a Drupal integration, try to sell your knowledge of CMS architectures rather than a few hours of developer time. Raise the important questions, and talk about the possible problems of using off-the-shelf solutions. If you have ever used one of those before, you will have plenty of issues to talk about. Then, try to convince your customer to trust you with a custom development. Make it small at the beginning, so that the customer can start using it right away and refine their requirements incrementally.

This will be a very satisfying experience, and the client will thank you later for leading him on the right path. And this will give you a lot to talk about for the next CMS you build...

Developing for Developers: my SymfonyCamp08 Presentation

Did you attend this year's Symfony Camp? It was a great event, the unique occasion to meet the core team of the symfony framework. I had the opportunity to give a talk there, and you can now watch it online:

Developing for Developers

Don't hesitate to comment on this presentation on SlideShare.

Thanks to all the great people I met there who gave me feedback on my work, encouragement, or advice. Thanks to Dutch Open Projects for the organization - and for inviting me. It was a great pleasure to talk about symfony, its past, present, and future, with so many enthusiastic people.

Update: It seems that my slideshow has been featured on the SlideShare homepage by the SlideShare editorial team.

Validating a YAML file against a schema in PHP

As of today, there is no simple way to validate the syntax of a YAML file in PHP. But with two simple tricks, it takes only a few dozen lines of code to build a robust validator capable of checking the syntax of any YAML file against a given schema.

The problem

YAML is much easier to write and read than XML, but YAML has no schema validation capabilities. With DTD and XSD, you can check that an XML file is correctly formatted before actually using it, and it helps debugging a great deal. Modern web application frameworks like symfony encourage the use of YAML for configuration files, but the lack of a validation tool sometimes makes YAML a poor choice in a professional environment.

Such a validation tool exists in Ruby; it's called kwalify. But unless you want to spend a huge amount of time translating the 6,000+ lines of code of the library from Ruby into PHP, or to run Ruby code inside your PHP application, you're basically stuck.

First Idea

Did you just read that XML allows validation by way of XSD? Well, why not use this mechanism to validate a YAML file? After all, PHP has a great XML manipulation library, installed by default, and capable of validating any XML file against a DTD or an XSD. Actually, this mechanism is already in use in symfony, since the Propel schema.yml is transformed into an XML counterpart that has an XSD.

It is trivial to transform a YAML file into a PHP associative array. Symfony 1.1 provides a class that does exactly that, and it's called sfYaml. With a little bit of recursion and a few lines of PHP code, it is also quite easy to transform an associative array into a simple XML file.

Let's use the view.yml configuration file in symfony for example. In a typical module, it looks like the following:

# view.yml
default:
  http_metas:
    content-type:  text/html

  metas:
    title:         My symfony project
    robots:        index, follow
    description:   This is my first symfony project
    keywords:      symfony
    language:      en

  stylesheets:     [main.css, top.css]

  javascripts:     [jquery-1.2.6.js, main.js]

  has_layout:      on
  layout:          layout

indexSuccess:
  metas:
    title:        Welcome to my site

Now what does it take to transform this YAML into a simple XML equivalent? Not much. A bit of googling shows that someone already worked on transforming an associative array into XML, and as it is not a good idea to reinvent the wheel, let's reuse this work.

// Transform YAML into XML
include 'sfYaml.class.php';
$yamlString = file_get_contents('view.yml');
$yamlArray = sfYaml::load($yamlString);
$xmlString = ArrayToXml($yamlArray);

function ArrayToXml($data, $rootNodeName = 'root', $xml = null)
{
  if ($xml === null)
  {
    $xml = simplexml_load_string("<?xml version='1.0' encoding='utf-8'?><$rootNodeName />");
  }

  // loop through the data passed in
  foreach ($data as $key => $value)
  {
    // no numeric keys in our XML, please!
    if (is_numeric($key))
    {
      // make a string key...
      $key = "unknownNode_". (string) $key;
    }

    // strip anything that is not a letter (this also drops the numeric suffix added above)
    $key = preg_replace('/[^a-z]/i', '', $key);

    // if the value is an array, call this function recursively
    if (is_array($value))
    {
      $node = $xml->addChild($key);
      // recursive call
      ArrayToXml($value, $rootNodeName, $node);
    }
    else
    {
      // add a single node; escape the value first, since addChild() does not escape ampersands
      $xml->addChild($key, htmlspecialchars((string) $value));
    }
  }

  // pass back as a string, or as a SimpleXML object if you prefer
  return $xml->asXML();
}


Second idea

The result of the simple YAML to XML transformation looks like this:

<?xml version="1.0" encoding="utf-8"?>
<!-- view.yml.xml -->
<root>
  <default>
    <httpmetas>
      <contenttype>text/html</contenttype>
    </httpmetas>
    <metas>
      <title>My symfony project</title>
      <robots>index, follow</robots>
      <description>This is my first symfony project</description>
      <keywords>symfony</keywords>
      <language>en</language>
    </metas>
    <stylesheets>
      <unknownNode>main.css</unknownNode>
      <unknownNode>top.css</unknownNode>
    </stylesheets>
    <javascripts>
      <unknownNode>jquery-1.2.6.js</unknownNode>
      <unknownNode>main.js</unknownNode>
    </javascripts>
    <haslayout>1</haslayout>
    <layout>layout</layout>
  </default>
  <indexSuccess>
    <metas>
      <title>Welcome to my site</title>
    </metas>
  </indexSuccess>
</root>


The trouble here is that the <default> and <indexSuccess> tags are not real tags. That means, they do not define a class of content but a value. Same for the <unknownNode> nodes. To make sense, a real equivalent to the view.yml in XML should look like this:

<?xml version="1.0"?>
<!-- view.yml.xml, semantically correct -->
<templates>
  <template name="default">
    <httpmetas>
      <contenttype>text/html</contenttype>
    </httpmetas>
    <metas>
      <title>My symfony project</title>
      <robots>index, follow</robots>
      <description>This is my first symfony project</description>
      <keywords>symfony</keywords>
      <language>en</language>
    </metas>
    <stylesheets>
      <stylesheet>main.css</stylesheet>
      <stylesheet>top.css</stylesheet>
    </stylesheets>
    <javascripts>
      <javascript>jquery-1.2.6.js</javascript>
      <javascript>main.js</javascript>
    </javascripts>
    <haslayout>1</haslayout>
    <layout>layout</layout>
  </template>
  <template name="indexSuccess">
    <metas>
      <title>Welcome to my site</title>
    </metas>
  </template>
</templates>


The difference is that main entries are <template> tags with a name attribute, and that children of the <javascripts> element are simple <javascript> elements. This second XML file is semantically correct, because it follows a simple grammar - and can be validated.

But how can you turn the first XML file into the second? The right tool for this job is called XSLT, or Extensible Stylesheet Language Transformations. An XSLT file is a set of transformation rules described in XML. Applying these rules to an XML file transforms it into another XML file. That's exactly what you need here.

The XSLT file to turn the first view.yml.xml into the second one is quite simple:

<?xml version='1.0'?>
<!-- view.yml.xsl -->
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:template match="/root">
    <templates>
      <xsl:for-each select="child::*">
        <template>
          <xsl:attribute name="name">
            <xsl:value-of select="local-name()" />
          </xsl:attribute>
          <xsl:apply-templates select="child::*"/>
        </template>
      </xsl:for-each>
    </templates>
  </xsl:template>
  <xsl:template match="//stylesheets/unknownNode">
    <stylesheet>
      <xsl:value-of select="text()" />
    </stylesheet>
  </xsl:template>
  <xsl:template match="//javascripts/unknownNode">
    <javascript>
      <xsl:value-of select="text()" />
    </javascript>
  </xsl:template>
  <xsl:template match="*">
    <xsl:copy>
       <xsl:apply-templates/>
     </xsl:copy>
  </xsl:template>
</xsl:stylesheet>


Basically, this XSL stylesheet copies most of the original tags (<xsl:copy>), but does special operations for elements that should be attributes (like <default>), or that should be renamed. This stylesheet defines a "semantical correction" for the automatically created XML translation of the YAML file, and is the first step of the validation. Of course, you need to define one XSLT file for each type of YAML file you want to validate.

How to apply this XSLT to the XML version of the YAML file in PHP? Using the powerful capabilities of PHP in XML, it is extremely simple:

// Transform the XML using XSLT
// Load the simple XML transformation into a DOMDocument object
$xmlDoc = new DOMDocument();
$xmlDoc->loadXML($xmlString);
// Load the XSL stylesheet into another DOMDocument object
$xslDoc = new DOMDocument();
$xslDoc->load('view.yml.xsl');
// Proceed with transformation using an XSLTProcessor object
$xsltp = new XSLTProcessor();
$xsltp->importStylesheet($xslDoc);
if (!$xmlTransformed = $xsltp->transformToDoc($xmlDoc))
{
  throw new Exception('XSL transformation failed.');
}


Validating

Validating the semantically correct XML file is quite basic: write an XML Schema, or XSD, describing the syntax expected in a view.yml.xml. You could do it with a DTD instead of an XSD, but XSD is more powerful. Here is a simple schema defining a grammar to validate view.yml.xml files:

<?xml version="1.0"?>
<!-- view.yml.xsd -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" elementFormDefault="qualified">
  <xs:element name="templates">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="template" maxOccurs="unbounded">
          <xs:complexType mixed="true">
            <xs:all>
              <xs:element name="httpmetas" minOccurs="0">
                <xs:complexType>
                  <xs:all>
                    <xs:element name="contenttype" type="xs:string"/>
                  </xs:all>
                </xs:complexType>
              </xs:element>
              <xs:element name="metas" minOccurs="0">
                <xs:complexType>
                  <xs:all>
                    <xs:element name="title" type="xs:string" minOccurs="0"/>
                    <xs:element name="robots" type="xs:string" minOccurs="0"/>
                    <xs:element name="description" type="xs:string" minOccurs="0"/>
                    <xs:element name="keywords" type="xs:string" minOccurs="0"/>
                    <xs:element name="language" type="xs:string" minOccurs="0"/>
                  </xs:all>
                </xs:complexType>
              </xs:element>
              <xs:element name="stylesheets" minOccurs="0">
                <xs:complexType>
                  <xs:sequence>
                    <xs:element name="stylesheet" type="xs:string" maxOccurs="unbounded"/>
                  </xs:sequence>
                </xs:complexType>
              </xs:element>
              <xs:element name="javascripts" minOccurs="0">
                <xs:complexType>
                  <xs:sequence>
                    <xs:element name="javascript" type="xs:string" maxOccurs="unbounded"/>
                  </xs:sequence>
                </xs:complexType>
              </xs:element>
              <xs:element name="haslayout" type="xs:integer" minOccurs="0"/>
              <xs:element name="layout" type="xs:string" minOccurs="0"/>
            </xs:all>
            <xs:attribute name="name" type="xs:string" use="required"/>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>


Note: I know this doesn't cover all cases; it is mostly a proof of concept.

Now, you need to check the XML file against that schema. Once again, the powerful XML manipulation library of PHP makes it a piece of cake:

// validate the new XML against an XSD
// $xmlTransformed is the semantically correct XML translation of the YAML file defined earlier
if($xmlTransformed->schemaValidate('view.yml.xsd'))
{
  return true;
}
else
{
  // display errors
}


Dealing with libxml errors

By default, DOMDocument::schemaValidate() will only return true if the XML file is valid, and false otherwise. But a good validation utility needs to be more verbose than that, and display errors when a file doesn't validate. In order to do that, you need to manually fetch the libxml errors when the validation fails, as explained in the PHP Manual.

libxml_use_internal_errors(true);
// validate the new XML against an XSD
// $xmlTransformed is the semantically correct XML translation of the YAML file defined earlier
if ($xmlTransformed->schemaValidate('view.yml.xsd'))
{
  return true;
}
else
{
  // collect the libxml errors into a single message
  $errors = libxml_get_errors();
  $message = "\n";
  foreach ($errors as $error)
  {
    $message .= trim($error->message) . ' (';
    switch ($error->level)
    {
      case LIBXML_ERR_WARNING:
        $message .= "Warning $error->code";
        break;
      case LIBXML_ERR_ERROR:
        $message .= "Error $error->code";
        break;
      case LIBXML_ERR_FATAL:
        $message .= "Fatal Error $error->code";
        break;
    }
    if ($error->file)
    {
      $message .= " in $error->file";
    }
    $message .= " on line $error->line)\n";
  }
  libxml_clear_errors();

  throw new Exception($message);
}


That's all. Now, all it takes to validate any view.yml file is the XSLT stylesheet and the XSD grammar. If a view.yml ever contains an incorrect setting, say:

default:
  foo: bar

Then an exception will be raised with a meaningful error message:

Element 'foo': This element is not expected. (Error 1871 on line 2)
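
To try the error handling in isolation, without any YAML or XSLT involved, the following sketch validates a tiny XML document against a miniature inline schema and collects the libxml messages. The collectSchemaErrors() function and the one-element schema are hypothetical and only meant for experimenting; the snippet relies on DOMDocument::schemaValidateSource(), which takes the schema as a string rather than a file path. The exact wording of the messages depends on the installed libxml version.

```php
<?php
// Collect libxml validation errors as readable strings (DOM extension only).
function collectSchemaErrors(DOMDocument $doc, $xsdSource)
{
  // Keep errors in the libxml buffer instead of emitting PHP warnings
  $previous = libxml_use_internal_errors(true);
  libxml_clear_errors();
  $messages = array();
  if (!$doc->schemaValidateSource($xsdSource))
  {
    foreach (libxml_get_errors() as $error)
    {
      $messages[] = sprintf('%s (code %d, line %d)', trim($error->message), $error->code, $error->line);
    }
  }
  libxml_clear_errors();
  // Restore the previous error handling mode
  libxml_use_internal_errors($previous);

  return $messages;
}

// Hypothetical miniature schema: a <template> may only contain a <layout> element
$xsd = <<<'XSD'
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="template">
    <xs:complexType>
      <xs:all>
        <xs:element name="layout" type="xs:string" minOccurs="0"/>
      </xs:all>
    </xs:complexType>
  </xs:element>
</xs:schema>
XSD;

$doc = new DOMDocument();
$doc->loadXML('<template><foo>bar</foo></template>');
$errors = collectSchemaErrors($doc, $xsd);
print_r($errors);
```

Restoring the previous libxml error mode at the end keeps the helper from changing the behavior of unrelated XML code in the same request.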

Wrapping it up

The idea can be easily transposed to any YAML file. A YAML validator should:

  1. Turn a YAML file into a PHP associative array using sfYaml
  2. Turn this array into an XML structure, in a brute and blind way
  3. Turn the XML structure into a second XML structure using a set of XSLT rules to make the structure semantically correct
  4. Validate the second XML structure using an XML Schema
  5. If errors appear, return them wrapped up in an exception

To validate, say, the generator.yml in symfony, all it takes is a generator.yml.xsl and a generator.yml.xsd to define the expected grammar in this file.
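
The steps above can be chained in a single function, as in the following sketch. It is not the attached validator: arrayToDom() and validateConfigArray() are hypothetical names, the array-to-XML step is written with DOM instead of SimpleXML, and the stylesheet and schema are passed as strings so that the example does not depend on files. It assumes that the xsl extension is installed and that step 1 has already produced the $data array (for instance with sfYaml::load()).

```php
<?php
// Step 2: blind array-to-XML conversion (numeric keys become <unknownNode>)
function arrayToDom(array $data, DOMNode $parent, DOMDocument $doc)
{
  foreach ($data as $key => $value)
  {
    // Same key cleanup as in the article: keep letters only
    $name = is_int($key) ? 'unknownNode' : preg_replace('/[^a-z]/i', '', $key);
    $node = $parent->appendChild($doc->createElement($name));
    if (is_array($value))
    {
      arrayToDom($value, $node, $doc);
    }
    else
    {
      // Text nodes are escaped by DOM when the document is serialized
      $node->appendChild($doc->createTextNode((string) $value));
    }
  }
}

// Steps 2 to 5; returns true or throws an exception listing the libxml messages
function validateConfigArray(array $data, $xslSource, $xsdSource)
{
  $raw = new DOMDocument('1.0', 'utf-8');
  arrayToDom($data, $raw->appendChild($raw->createElement('root')), $raw);

  // Step 3: semantic correction with XSLT
  $xsl = new DOMDocument();
  $xsl->loadXML($xslSource);
  $processor = new XSLTProcessor();
  $processor->importStylesheet($xsl);
  $transformed = $processor->transformToDoc($raw);
  if (!$transformed)
  {
    throw new Exception('XSL transformation failed.');
  }

  // Step 4: validation against the schema
  $previous = libxml_use_internal_errors(true);
  libxml_clear_errors();
  $valid = $transformed->schemaValidateSource($xsdSource);
  $messages = array();
  foreach (libxml_get_errors() as $error)
  {
    $messages[] = trim($error->message);
  }
  libxml_clear_errors();
  libxml_use_internal_errors($previous);

  // Step 5: report errors wrapped up in an exception
  if (!$valid)
  {
    throw new Exception(implode("\n", $messages));
  }

  return true;
}
```

A call would look like validateConfigArray($yamlArray, file_get_contents('view.yml.xsl'), file_get_contents('view.yml.xsd')), wrapped in a try/catch block to display the message.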

Ironic, isn't it?

You could say that the idea behind YAML is to avoid writing XML files. So using XML, XSD and XSLT in order to validate a YAML file may look a bit counter-intuitive, if not ironic.

But when you put it all together, the code necessary to validate any YAML file (not including, of course, the XSLT and XSD grammars, which depend on the file you validate) takes only a few dozen lines. Besides, PHP is very good at handling XML, so it's better to use it for its strong points, instead of trying to mimic another language and end up writing thousands of lines of code. Actually, the 'K.I.S.S.' principle that encourages the use of YAML for configuration files should also apply here: XML manipulation is the simplest way to validate a YAML file, so it's the right tool for the job.

Last but not least, revolutions sometimes look backwards - think about the Renaissance. So using XML to validate YAML is probably not as dumb as it sounds.

The full YAML validator code is attached below, together with the example YAML file for your testing pleasure. Once again, I'm not a developer, so the code is just there to prove that the idea works. It could probably be much improved.

Source code + example YAML file and validator schemas

Including the YAML validation system in a web application framework that uses YAML is a must. Validation should only be done in the development environment, of course, and only when the YAML files change. Symfony uses a configuration cache system with a set of configuration handlers that would make validation very easy and efficient. Not to mention other frameworks in PHP, or in other languages, which could also take advantage of a similar approach.

Oh, and there is one more thing: The semantically correct XML file and its XSD syntax define a perfect XML equivalent to YAML files in symfony. If you want to use XML instead of YAML, and write your own configuration handlers, you should probably follow this kind of syntax.

Everybody Goes to Symfony Camp

And that includes me. I will be giving a presentation there, two weeks from now, called "Developing for Developers - Usability Applied to Programming", and illustrated by my recent work on DbFinderPlugin.

If you'd like to meet the best PHP5 developers in the world, or me, you should definitely go to the Symfony Camp on September 12th and 13th. The conference is in The Netherlands, not far from Amsterdam - that means not far from anybody in Europe. I heard there are some tickets left for the conference, but it won't last long. The price is not free, but with the great people talking there, the tasty barbecue, the unique atmosphere and a huge lawn to put your tent in, it's a bargain. Besides, they may fill the swimming pool this year.

We'll have plenty of time to speak about symfony, plugins, documentation, the future and everything else. It's a unique opportunity to meet in person all those who lead the symfony community. Also, there are one or two seats left for the training session, so if you want to become operational in symfony quickly, dive in.

One last word for those who expect drama: There Won't Be Blood.

sfPropelFinder becomes DbFinder - Announcing 1.0 release

The sfPropelFinder plugin, which I've told you about a lot lately, has recently been renamed to DbFinder. This emphasizes the fact that the plugin is not Propel-specific anymore, and that you can use it with Doctrine without any change in the API.

Also, I have released a version 0.9 of the plugin today, which marks the 100% coverage of the API with both the Propel and the Doctrine adapters. That's right, now any piece of code using DbFinder will work seamlessly, whatever the ORM you use in symfony.

Take the following code, for instance:

// Look in the Article model
// For objects where the author object related to the article has $nickname for nickname
// Hydrated with related translation in the current culture and category
// And put the result into a pager implementing sfPager for easy display in a web page
$pager = DbFinder::from('Article')->
  where('Author.Nickname', $nickname)->
  with('I18n', 'Category')->
  paginate($currentPage = 1, $maxResultsPerPage = 10);


Getting the same result with either Propel or Doctrine takes considerably more code.

To be honest, the Doctrine coverage is only 99%, since there is still an issue with sfDoctrineFinder::withColumn() when dealing with a calculated column - and this is something that requires Doctrine 1.0 to be fixed. The current Doctrine adapter is based on sfDoctrinePlugin and Doctrine 0.11. But as soon as Doctrine 1.0 is released, withColumn() will be updated to work exactly the same as with Propel.

This release can be considered as a 1.0 beta 1 - meaning I'll probably not add more features before releasing a stable version. I'll work on performance and edge cases if bugs are reported, so you are encouraged to download the plugin, test it, and give me as much feedback as you can.
