Archive for the 'web' Category

Designing a CMS Architecture

When faced with the choice between an off-the-shelf CMS and custom development, many companies pick solutions like ezPublish or Drupal. In addition to being free, these CMS seem to fulfill all possible requirements. But while choosing an open-source solution is a great idea, going for a full-featured CMS may prove more expensive than designing and developing your own Content Management System.

Hidden Costs

What does it cost to integrate and deploy a website based on an open-source CMS? At first sight, not much. As for every CMS, you have to design your own templates and fill your website with initial data. But there are additional costs that pop up as soon as you need a little more than just plain content management.

Think about adding a blog or a forum to a website managed by a CMS. There are modules or plugins for that, but they never provide the same flexibility as plain blogging engines such as WordPress, or plain forum engines like phpBB. So even if the basic requirement is fulfilled by a module, you will always need - always - to adapt its code.

And this is where it gets ugly. The code base of open-source CMS engines and their plugins is nowhere near as good as what you can see in RAD frameworks these days. Most of them are based on a very old architecture (PHP4, no object orientation, no proper error handling, direct access to the database, etc.). That means that changing something will be very painful, and very expensive. You will encounter numerous bugs. You will change the blogging plugin three times because none of the ones you tested is capable of doing what you need. You will upgrade your CMS to the latest version to benefit from the single bug fix that should save your life, only to find that you need to change all your existing configuration…

This is as bad as it sounds. Start changing one single line of code in an application built on top of Drupal or ezPublish, to name only the two major ones, and you are in trouble. The moment you need something that is not natively supported, you enter the Dark Zone of CMS hell. You are going to spend a lot of money on development. You will never see the end of the tunnel. That is, until someone says, a few years from now, “Do we need all that crap? Let’s build something that fits our needs and that actually works”.

Making Your Own CMS

Given the number of available open-source CMS solutions, building one on your own sounds like a stupid idea. But if your website is 50% content management and 50% something else, you probably need to start with a web application framework like symfony or Django, rather than a CMS. These frameworks provide plugins that do part of the Content Management job already, so creating a CMS today is like assembling Lego bricks to build something that exactly fits your needs.

Take symfony, for instance. It provides native support, or support through plugins, for:

Symfony doesn’t yet provide an Access Control List or a Workflow plugin, but you can already put all of the above together and have a pretty powerful CMS engine.

A tailor-made CMS will always have less code and show better performance than any of the existing full-featured solutions. Also, you will be able to tweak it completely, since all the components are decoupled, and built with extensibility in mind.

Your custom CMS will cost you more during the first year, but if you expect your website(s) to live longer than that, then the benefit will become obvious after a year and a half. Plugging the CMS features into other parts of the website, adding features unrelated to content management, scaling to a larger audience, replacing the database engine or the caching backend, all that will be painless.

That is, if you design your custom CMS carefully, and with the future in mind.

Environments

When you add features to an application, you need a testing environment - a place where you can check that the additions work and don’t kill the rest of the application. That means that developers have a version of the website on their desktop computer, where they change stuff. Then, they upload the application to a test server, check that everything is OK, and only then can they deploy the application to the production server. This is a very common practice, often backed up by source version control and continuous integration tools.

But what happens when a new feature is not made of code, but of data? In ezPublish, for instance, in order to define a new type of content (they call it a “Class”), you have to use the backend web interface and fill in a few forms. The properties of the new type of content are stored in the database. In order to deploy this new type of content from the testing environment to the production environment, the developers need to transfer data from one database to another - without wiping off unrelated information on the production database, such as user comments, statistics, etc.

Deploying new features in this context means executing some SQL code on each server. This is much more dangerous than just pushing a new version of the codebase, especially when the data model is made of many tables glued together in complex joins. That’s why, in many websites based on ezPublish, developers add features directly on the production environment, or repeat the configuration using the backend interface on every environment. This is either a high risk or a large waste of time.

Data, or Code?

This environment drawback tends to be a major influence over the choice of features a CMS should provide. For almost every CMS feature, you should wonder: Can the user do that through the backend interface, or do we need a programmer to add a new element? In other terms, is the feature made of data, or code?

Off-the-shelf CMS engines will almost always answer ‘Data’. My personal opinion is that it is wrong in many cases. Content types are just one example, but think about workflows or page layouts for instance. They define a complex logic that always translates to code, and giving the user the ability to change them via a backend interface means storing code in the database and evaluating it at runtime. Then you can’t use op-code cache engines like APC to increase your website performance. And deploying that to production is a nightmare.

Some companies think that most of the CMS features should be accessible via a backend interface in order to be able to enhance the application without additional developments. But this is an illusion. For one, the configuration of content classes in ezPublish is so complex that it does indeed require a PHP developer, and an expensive one, since experience with ezPublish is one of the most demanded skills in the IT market (at least in France). More features mean more development, and there is no CMS out there that replaces the power of a programming language with a web interface.

So that leads to one good rule of thumb: Design your features so that they can be made of code rather than data. That applies to elements that can be modified by a graphical user interface, or programmatically:

  • Content classes
  • “Widgets” or “Components” for pages
  • Page layouts or “templates”
  • Content validation workflow
  • Tasks
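
As a minimal sketch of what a content class "made of code" could look like, here is a hypothetical RecipeContent class. The class name, field names and validation rule are assumptions made for illustration; none of them come from symfony or ezPublish. The point is that the definition lives in a PHP file, so it goes through version control and deployment like any other code:

```php
<?php
// Hypothetical content class defined in code rather than in database rows.
// Deploying it means pushing a file, not running SQL on each server.
class RecipeContent
{
  // Field name => expected PHP type; changing this list is a code change.
  public static function getFields()
  {
    return array(
      'title'            => 'string',
      'ingredients'      => 'array',
      'preparation_time' => 'integer',
    );
  }

  // Returns the names of the fields that are missing or have the wrong type.
  public static function validate(array $data)
  {
    $errors = array();
    foreach (self::getFields() as $name => $type) {
      if (!array_key_exists($name, $data) || gettype($data[$name]) !== $type) {
        $errors[] = $name;
      }
    }
    return $errors;
  }
}
```

Adding a field to this content class is then a one-line change that can be tested locally and pushed to production with the rest of the code base.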

Fundamental questions

The complexity of a CMS engine depends greatly on the answer you give to a few fundamental questions:

  • Can contents exist independently of a page?
  • Can contents exist at more than one place in the website?
  • Are there several views for a single piece of content?
  • Can contents have different versions simultaneously?
  • Can contents be modified in the backend and remain unchanged in the frontend?
  • Can users compose a page with “widgets” or “components” in a WYSIWYG interface?
  • Can predefined zones in a template contain more than one “widget” or “component”?
  • Can section pages have different templates?
  • Can section pages have different versions simultaneously?
  • Can users program the publishing of a section page, or of contents, in advance?
  • Can the CMS remember previous URLs for a content that changed title?

If the answer to the first question is no, then the concepts of “page” and “content” coincide. You probably don’t need to develop anything, since your CMS will be quite simple.

If you answer yes to all these questions, then the CMS might take three times longer to develop than it would otherwise.

That’s why the idea of a tailor-made CMS is not that stupid. No existing CMS will be able to answer these questions in every possible way. But designing your own relational schema based on the answer to these questions makes sense, economically speaking. Don’t make it complex if you don’t need to, or, to put it otherwise, Keep It Simple, Stupid.

Bootstrapping the reflection

Now that you’re trying to imagine what you actually need for your own CMS, here is a glimpse of the kind of technical challenge you will face all the time.

The question revolves around the concept of content types. In a CMS, you mostly deal with “articles”. This type of content has a title, an author, a summary, a body, and a few other attributes. But you probably also need to deal with some other content types, like movies, slide shows, quiz games, polls, or recipes. These content types are defined by properties distinct from those of an article. Some of them can fit in a single structure, others require several structures related to each other. For instance, quiz games require a structure for the quiz itself, one for the questions, one for the answers to each question, and one for the quiz results.

The question is: Do you store the data for all these content types in a single table, or do you create a table for each content type? The most “normalized” choice is probably to create one data structure for each. You could have an “article” table, a “recipe” table, and even a “quiz” table with foreign keys to a “quiz_question” and a “quiz_result” table. That would allow you to make queries on some specific attributes of a specific content type. You could build a custom search engine for your recipes and look for ingredients, foreign cuisine and preparation time.

But then, if each content type has its own table(s), what do you do when you have to list all the contents of a section, or worse (that happens in the backend) all the contents of the website? Does that mean that, in order to display a list of contents, you must query several tables and aggregate the results together? This solution simply doesn’t scale, and a CMS built like that will become slower and slower as you add new content types.

So that probably means that you should store a reference to each content in a separate table, with a copy of the data that is generic to all content types (like title, publication date, section, etc.). Pages displaying a list of contents would use this aggregate table, while pages displaying content details would use the specific tables.

And that means that you must find a way to synchronize the specific tables and the generic tables whenever data changes in content. That’s not a big deal, but it gives you an idea of the kind of complexity you will encounter in a large scale CMS.
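
Here is a rough sketch of that synchronization, using in-memory arrays as stand-ins for database tables. The ContentStore class and its method names are hypothetical; a real implementation would more likely hook into the save method of the ORM objects. It only illustrates the idea: every write to a specific table also refreshes the generic row used for listings.

```php
<?php
// In-memory stand-ins for database tables (assumption: real code would use an ORM).
class ContentStore
{
  public $tables = array(); // specific tables, keyed by content type
  public $index  = array(); // generic table used for listings

  // Saves the full record in its specific table and copies the
  // generic fields into the index in the same operation.
  public function save($type, $id, array $record)
  {
    $this->tables[$type][$id] = $record;
    $this->index[$type.':'.$id] = array(
      'type'         => $type,
      'id'           => $id,
      'title'        => $record['title'],
      'published_at' => $record['published_at'],
    );
  }

  // Lists all contents, newest first, without touching the specific tables.
  // Dates are assumed to be ISO strings, so string comparison sorts them.
  public function listLatest($limit)
  {
    $rows = array_values($this->index);
    usort($rows, function ($a, $b) {
      return strcmp($b['published_at'], $a['published_at']);
    });
    return array_slice($rows, 0, $limit);
  }
}
```

With this layout, adding a new content type does not add a query to the listing pages: they keep reading from the single generic table.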

A Challenging Exercise

Designing a CMS is difficult and fun, and you’ll probably do it more than once. Every CMS is different, because every content management need is different, and mostly because every customer wants more than just plain content management.

If you are a developer, whenever you meet a client that asks you for a Drupal integration, try to sell your knowledge of CMS architectures rather than a few hours of developer time. Raise the important questions, talk about the possible problems of using off-the-shelf solutions. If you ever used one of those before, you will have plenty of issues to talk about. Then, try to convince your customer to trust you with a custom development. Make it small at the beginning, so that the customer can start using it right away and refine their requirements incrementally.

This will be a very satisfying experience, and the client will thank you later for leading them down the right path. And this will give you a lot to talk about for the next CMS you build…

Comparing Propel, Doctrine and sfPropelFinder

When it comes to ORMs, it's all a matter of preference. Is it, really? This post compares side-by-side the code required to perform some simple operations with three OO database requesting APIs. The purpose is to demonstrate that productivity, and not only style, can vary a lot depending on the ORM you choose.

There are not many robust Object Relational Mapping layers in PHP5. I'll consider two of them:

  • Propel is an ORM that "allows you to access your database using a set of objects, providing a simple API for storing and retrieving data. Propel allows you, the web application developer, to work with databases in the same way you work with other classes and objects in PHP."

  • Doctrine is an ORM that "sits on top of a powerful PHP DBAL (database abstraction layer). One of its key features is the ability to optionally write database queries in an OO (object oriented) SQL-dialect called DQL inspired by Hibernates HQL. This provides developers with a powerful alternative to SQL that maintains a maximum of flexibility without requiring needless code duplication."

I will also consider an additional component to Propel named sfPropelFinder. It "provides an easy API for finding Propel objects - that is, easier than the Peer methods and the Criteria stuff". sfPropelFinder is a symfony plugin, but it can be used with Propel alone.

For the examples, I'll use the classic Article/Comment model.

Disclaimer: Since I am the author of sfPropelFinder, you may think that I chose examples that make it look better. To avoid this bias, I wrote a lot of examples, including some where this plugin does not perform very well. Still, if the sfPropelFinder comparison with the two other ORMs is not objective, the comparison between Propel and Doctrine is quite so.

Scope

This comparison will only focus on the API - I deliberately leave the performance benchmarks to whoever wants to do them. But I think the rough performance comparison probably looks like:

Slowest    sfPropelFinder + Propel 1.2
|          Propel 1.2
|          Doctrine 0.11
|          sfPropelFinder + Propel 1.3
Fastest    Propel 1.3

As for the features, it is hard to give an objective comparison without going into too much detail. If you wonder if a particular ORM does something that another can't do, post a comment about it and I'll try to give you an honest answer.

Bear in mind that sfPropelFinder is very young, that Doctrine is quite young, and that Propel has a longer history and is the most stable and mature of all three.

Retrieving an article by its primary key

// Propel
$article = ArticlePeer::retrieveByPk(123);
// Doctrine
$article = Doctrine::getTable('Article')->find(123);
// sfPropelFinder
$article = sfPropelFinder::from('Article')->findPk(123);


Retrieving the comments related to an article

// Propel
$comments = $article->getComments();
// Doctrine
$comments = $article->Comments;
// sfPropelFinder
$comments = $article->getComments(); // no change - use Propel


Retrieving an article from its title

// Propel
$c = new Criteria();
$c->add(ArticlePeer::TITLE, 'FooBar');
$article = ArticlePeer::doSelectOne($c);

// Doctrine
$article = Doctrine_Query::create()->
  from('Article a')->
  where('a.title = ?', array('FooBar'))->
  fetchOne();
// Doctrine (faster)
$article = Doctrine::getTable('Article')->
  findOneByTitle('FooBar');

// sfPropelFinder
$article = sfPropelFinder::from('Article')->
  where('Title', 'FooBar')->
  findOne();
// sfPropelFinder (faster)
$article = sfPropelFinder::from('Article')->
  findOneByTitle('FooBar');


Retrieving the latest 5 articles

// Propel
$c = new Criteria();
$c->addDescendingOrderByColumn(ArticlePeer::PUBLISHED_AT);
$c->setLimit(5);
$articles = ArticlePeer::doSelect($c);

// Doctrine
$articles = Doctrine_Query::create()->
  from('Article a')->
  orderby('a.published_at DESC')->
  limit(5)->
  execute();

// sfPropelFinder
$articles = sfPropelFinder::from('Article')->
  orderBy('PublishedAt', 'desc')->
  find(5);


Retrieving the last 5 comments related to an article

// Propel
$c = new Criteria();
$c->addDescendingOrderByColumn(CommentPeer::PUBLISHED_AT);
$c->setLimit(5);
$comments = $article->getComments($c);

// Doctrine
$comments = Doctrine_Query::create()->
  from('Comment c')->
  where('c.article_id = ?', array($article->getId()))->
  orderby('c.published_at DESC')->
  limit(5)->
  execute();

// sfPropelFinder
$comments = sfPropelFinder::from('Comment')->
  relatedTo($article)->
  orderBy('PublishedAt', 'desc')->
  find(5);


Retrieving the last comment related to an article

// Propel
$c = new Criteria();
$c->addDescendingOrderByColumn(CommentPeer::PUBLISHED_AT);
$c->add(CommentPeer::ARTICLE_ID, $article->getId());
$comment = CommentPeer::doSelectOne($c);

// Doctrine
$comment = Doctrine_Query::create()->
  from('Comment c')->
  where('c.article_id = ?', array($article->getId()))->
  orderby('c.published_at DESC')->
  fetchOne();

// sfPropelFinder
$comment = sfPropelFinder::from('Comment')->
  relatedTo($article)->
  findLast();


Retrieving articles based on a word appearing in the title or the summary

// Propel
$c = new Criteria();
$cton1 = $c->getNewCriterion(ArticlePeer::TITLE, '%FooBar%', Criteria::LIKE);
$cton2 = $c->getNewCriterion(ArticlePeer::SUMMARY, '%FooBar%', Criteria::LIKE);
$cton1->addOr($cton2);
$c->add($cton1);
$articles = ArticlePeer::doSelect($c);

// Doctrine
$articles = Doctrine_Query::create()->
  from('Article a')->
  where('a.title like ? OR a.summary like ?', array('%FooBar%', '%FooBar%'))->
  execute();

// sfPropelFinder
$articles = sfPropelFinder::from('Article')->
  where('Title', 'like', '%FooBar%')->
  _or('Summary', 'like', '%FooBar%')->
  find();


Retrieving articles based on a complex AND/OR clause

// Articles having title or summary like %FooBar% and published between $begin and $end

// Propel
$c = new Criteria();
$cton1 = $c->getNewCriterion(ArticlePeer::TITLE, '%FooBar%', Criteria::LIKE);
$cton2 = $c->getNewCriterion(ArticlePeer::SUMMARY, '%FooBar%', Criteria::LIKE);
$cton1->addOr($cton2);
$c->add($cton1);
$c->add(ArticlePeer::PUBLISHED_AT, $begin, Criteria::GREATER_THAN);
$c->addAnd(ArticlePeer::PUBLISHED_AT, $end, Criteria::LESS_THAN);
$articles = ArticlePeer::doSelect($c);

// Doctrine
$articles = Doctrine_Query::create()->
  from('Article a')->
  where('(a.title like ? OR a.summary like ?) and (a.published_at > ? and a.published_at < ?)', array('%FooBar%', '%FooBar%', $begin, $end))->
  execute();

// sfPropelFinder
$articles = sfPropelFinder::from('Article')->
    where('Title', 'like', '%FooBar%', 'cond1')->
    where('Summary', 'like', '%FooBar%', 'cond2')->
   combine(array('cond1', 'cond2'), 'or', 'cond3')->
    where('PublishedAt', '>', $begin, 'cond4')->
    where('PublishedAt', '<', $end, 'cond5')->
   combine(array('cond4', 'cond5'), 'and', 'cond6')->
  combine(array('cond3', 'cond6'), 'and')->
  find();


Retrieving articles authored by someone

// Propel
$c = new Criteria();
$c->addJoin(ArticlePeer::AUTHOR_ID, AuthorPeer::ID);
$c->add(AuthorPeer::NAME, 'John Doe');
$articles = ArticlePeer::doSelect($c);

// Doctrine
$articles = Doctrine_Query::create()->
  from('Article a')->
  leftJoin('a.Author b')->
  where('b.name = ?', array('John Doe'))->
  execute();

// sfPropelFinder
$articles = sfPropelFinder::from('Article')->
  where('Author.Name', 'John Doe')-> // Guesses the join from the schema
  find();


Retrieving articles authored by people of a certain group

// Propel
$c = new Criteria();
$c->addJoin(ArticlePeer::AUTHOR_ID, AuthorPeer::ID);
$c->addJoin(AuthorPeer::GROUP_ID, GroupPeer::ID);
$c->add(GroupPeer::NAME, 'The Foos');
$articles = ArticlePeer::doSelect($c);

// Doctrine
$articles = Doctrine_Query::create()->
  from('Article a')->
  leftJoin('a.Author b')->
  leftJoin('b.Group c')->
  where('c.name = ?', array('The Foos'))->
  execute();

// sfPropelFinder
$articles = sfPropelFinder::from('Article')->
  join('Author')->
  where('Group.Name', 'The Foos')-> // Guesses the Group join from the schema
  find();


Retrieving all articles and hydrating their category object in the same query

// Propel
$c = new Criteria();
$articles = ArticlePeer::doSelectJoinCategory($c);

// Doctrine
$articles = Doctrine_Query::create()->
  from('Article a')->
  leftJoin('a.Category c')->
  execute();

// sfPropelFinder
$articles = sfPropelFinder::from('Article')->
  with('Category')->
  find();


Retrieving an article and its category by the article primary key

// Propel
$c = new Criteria();
$c->add(ArticlePeer::ID, 123);
$c->setLimit(1);
$articles = ArticlePeer::doSelectJoinCategory($c);
$article = isset($articles[0]) ? $articles[0] : null;

// Doctrine
$article = Doctrine_Query::create()->
  from('Article a')->
  leftJoin('a.Category c')->
  where('a.id = ?', array(123))->
  fetchOne();

// sfPropelFinder
$article = sfPropelFinder::from('Article')->
  with('Category')->
  findPk(123);


Retrieving articles and hydrating their author object and the author group

// Propel
// Impossible to do it simply - need for a custom hydration method (approx 40 LOC)

// Doctrine
$article = Doctrine_Query::create()->
  from('Article a')->
  leftJoin('a.Author b')->
  leftJoin('b.Group c')->
  where('a.id = ?', array(123))->
  fetchOne();

// sfPropelFinder
$article = sfPropelFinder::from('Article')->
  with('Author', 'Group')->
  findPk(123);


Conclusion

That's a lot of queries. And I didn't mention many-to-many relations, addition of columns, behaviors, update/delete queries, count queries, or pagers. But overall, my conclusion after writing these examples is:

  • Propel is the most verbose ORM of all three
  • sfPropelFinder is the most magic of all three
  • sfPropelFinder and Doctrine are the fastest to write, depending on the cases
  • Some limits of Propel are very frustrating (limited doSelectJoinXXX(), Criterions, custom hydration)
  • Propel and sfPropelFinder will never beat DQL for complex queries

Finally, if you are wondering which ORM to choose for your next symfony project, make sure that you weigh productivity in the balance.

Is PicLens Malware?

I recently installed the PicLens Firefox extension. It is an incredibly useful way to browse image collections, the interface is both very responsive and well thought out, and the integration into existing websites is unobtrusive enough to convince me.

Then, as I was monitoring requests on one application I develop on my local server, I noticed that each time I requested a page, two requests were received by the web server (in addition to requests for web assets such as JavaScript, CSS and image files). After investigation, I realized that the PicLens extension detected a <link> tag in the page content, and automatically fetched the RSS feed linked by that tag. It does so every time it detects an application/rss+xml link.

I made the test with pages including more than one RSS feed (try php.net for instance) and noticed the same behavior, only at a larger scale. So PicLens does basically what Google Web Accelerator does: it prefetches web resources (in this case: RSS feeds) to accelerate the navigation experience.

I emailed the PicLens support about the issue, and here is their response:

Hi Francois,

Thank you so much for your kind words and for using PicLens! We really appreciate you taking the time to send us your thoughts.

I'm sorry to hear you are worried about PicLens's prefetching behavior. We prefetch all tags that have a content type of "application/rss+xml" because we use that to match up mediarss feeds with items on the page. It's not a bug at all, nor have we heard of it causing any problems for anyone. Is there a specific reason you feel that it jeopardizes websites?

Hope to hear back from you soon.

All the best, Meg & The PicLens Team

I can think of many reasons why link prefetching is bad, among which wrong statistics, additional bandwidth and server load. But maybe I'm being too extremist on that one. What do you think? Can prefetching be considered as an acceptable practice nowadays? Or is the PicLens extension something that should not be installed?

Talking ’bout “Ma Generation”

Together with all the people working at my company, we've been working on a French website called "Ma Génération". It's a generalist news/service portal, exclusively in French - and built entirely in PHP with symfony.

Please, pay us a visit at http://www.mageneration.com.

In addition to a large custom-made CMS, the site features a forum, user blogs, user galleries, a dating service and a Netvibes-like personal homepage. Under the hood, a lot of work has been done since last August, when the development started. Some of the plugins I have been contributing lately are directly related to this website.

I want to express my thanks to all the people who've been working on the project, and my encouragements for the work left to be done... This is just the beginning!

A small symfony for a fast response

Sometimes, the price of a request when dealing with a symfony application can be overwhelming. But instead of getting back to spaghetti PHP, maybe you can get a handful of symfony features for a fraction of its initialization time.

The features without the cost

I met this case when designing a feature-rich Content Management System that made heavy use of the cache - and of the Super Cache. Basically, symfony was able to compute very complex pages and serve them as static pages, which means very fast.

The Super Cache was a very efficient performance enhancer, for a small cost. Imagine that you set the lifetime of the super cache to 10 seconds; when a server is under a heavy load, a given page is only calculated once every 10 seconds, even if requested 500 times in between. With these figures, activating the Super Cache roughly multiplies your site's responsiveness by 500.
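
The arithmetic above can be checked with a small time-based cache. This is only a sketch of the lifetime logic, not the actual Super Cache plugin (which serves static files without going through the framework at all); the TtlPageCache class and its computations counter are hypothetical names introduced for the illustration.

```php
<?php
// Minimal time-based page cache (NOT the actual Super Cache plugin).
class TtlPageCache
{
  private $lifetime;           // lifetime of an entry, in seconds
  private $entries = array();  // url => array('time' => ..., 'html' => ...)
  public $computations = 0;    // how many times a page was really computed

  public function __construct($lifetime)
  {
    $this->lifetime = $lifetime;
  }

  // Returns the cached page if it is still fresh, otherwise calls $compute.
  // $now is passed explicitly so that the behavior is easy to verify.
  public function get($url, $compute, $now)
  {
    if (isset($this->entries[$url]) && $now - $this->entries[$url]['time'] < $this->lifetime) {
      return $this->entries[$url]['html'];
    }
    $this->computations++;
    $html = call_user_func($compute);
    $this->entries[$url] = array('time' => $now, 'html' => $html);
    return $html;
  }
}
```

With a lifetime of 10 seconds, 500 calls to get() spread over less than 10 seconds trigger a single computation; the next call after the lifetime has elapsed triggers the second one.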

Design with speed in mind

But "with great power comes great responsibility", or so they say. To be able to use page caching with layout, and therefore the Super Cache plugin, the application had to be designed very carefully.

The main thing to avoid when you want to use page cache with layout is any part of the page that depends on the session - think about a header saying "Hello, John Doe" when you are connected, even if the rest of the page is completely session-independent. Unfortunately, that's always what the client wants on every page. So I had to find a solution to make the page cacheable without losing the basic user customization.

Ajax to the rescue

As explained in a previous post, you can always defer the customization of the page to the next request, by serving a session-independent page that calls an Ajax action in the background to retrieve the data necessary to change the user name in the header.

But symfony is not really a good fit for the second request. The cost of initializing a symfony request is not negligible, and accounts for 90% of the response time in very small requests - such as getting a username from the database based on a key.

Bitter swift symfony

That's when the idea of a "small symfony" comes. Wouldn't it be great if you could get access to the model layer, the configuration, the autoloading, the user object, the helpers, and keep a MVC separation, without initializing the whole framework?

That would indeed give a boost to any application designed according to the principle described in the quoted article. But wait a minute, we already know of "lightweight actions". They are called "components" in symfony. The only problem is that they cannot be called from the outside.

Inside out

Can't they? I'm not so sure. Imagine a script lying under your web root folder, a "lightweight front controller", with the following code:

<?php
// Lightweight front controller: loads the application configuration only
define('SF_ROOT_DIR',    realpath(dirname(__FILE__).'/..'));
define('SF_APP',         'frontend');
define('SF_ENVIRONMENT', 'prod');
define('SF_DEBUG',       false);

require_once(SF_ROOT_DIR.DIRECTORY_SEPARATOR.'apps'.DIRECTORY_SEPARATOR.SF_APP.DIRECTORY_SEPARATOR.'config'.DIRECTORY_SEPARATOR.'config.php');

// Extract the component name; the remaining parameters are passed to the component
$module = $_GET['module'];
unset($_GET['module']);
$action = $_GET['action'];
unset($_GET['action']);

// include_component() is defined by the Partial helper
sfLoader::loadHelpers('Partial');
include_component($module, $action, $_GET);


It looks very much like a regular symfony front controller script, except that it lacks the final sfController::dispatch() line. And indeed, it does not initialize the filter chain, handle validation, or apply output escaping. It just initializes the smallest part of symfony required to execute a "lightweight action".

I saved this script under web/component.php. Now, my Ajax calls can be made to the following URI:

http://mysite/component.php?module=foo&action=bar&key1=value1

The server will then return the result of the execution of an

include_component('foo', 'bar', array(
  'key1' => 'value1'
));


Does it work?

Now I can execute a component from the client side. The component architecture offers native View/Controller separation, and the configuration initialization brings autoloading, database access, and more. It does work perfectly, but is it fast? Speed tests show that not launching the filter chain saves about 40% to 50% of the cost of a symfony initialization. This means that you can multiply the number of requests that your server can handle by two - for very simple requests.

Be aware that this trick can only be used in some very particular cases, and only for very light requests. It may jeopardize security, and will often prove to be very limited. But for a page split between a session-independent part and a small session-dependent action, it does the trick.
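
One way to limit the security risk, sketched here under assumptions of my own, is to only accept components that appear in an explicit whitelist before calling include_component(). The resolveComponent() function and the 'module/action' whitelist format are hypothetical; they are not part of symfony.

```php
<?php
// Hypothetical guard for the lightweight front controller: only components
// listed in $allowed (as 'module/action' strings) may be called from outside.
function resolveComponent(array $query, array $allowed)
{
  $module = (isset($query['module']) && is_string($query['module'])) ? $query['module'] : '';
  $action = (isset($query['action']) && is_string($query['action'])) ? $query['action'] : '';

  if (!in_array($module.'/'.$action, $allowed, true)) {
    return null; // the caller should answer with a 404
  }

  // The remaining parameters are the ones passed to the component
  unset($query['module'], $query['action']);

  return array($module, $action, $query);
}
```

In the front controller, the result of resolveComponent($_GET, array('foo/bar')) would then replace the direct reads from $_GET, and a null result would end the request before any component is executed.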

Before we leave

The great thing about this article is that the trick it describes is not its best part. For the original need of including session-dependent data into a generated CMS page, Ajax is not the only solution. JavaScript alone can do the whole job on the client side, so your server never needs to embed user data in the page. OK, it needs to do it once, after which the client keeps the data in a cookie, and a script executed at page load inserts this data into the page - on the client side. And now every CMS page can be cached with its layout.

Bring serendipity by vicinity

One thing that is really cool about living in the real world is serendipity: the ability to discover something that you were not looking for in the first place. How many times have you gone looking for a book in a bookstore and finally left the place with a book you had never heard of before? (you buy books, right?)

A blind man

In the world wide web, this doesn't exist. Apart from a few attempts to bring unexpected content in e-shops - aimed at having you spend more by cross-selling similar products - the web is just like the world as seen by a blind man. You go to where you want to, you sometimes get lost, but you rarely discover things you were not looking for.

Helpful neighbor

That's why sites like Digg, StumbleUpon, del.icio.us and the like were born lately. To help you find something you are not looking for. But isn't it ironic? You have to regularly visit a place to find things that you are not interested in in the first place. It's just as if the blind man paid a neighbor to talk about the latest events in the neighborhood: Mr Smith just moved, a certain Miss Doe took his place, she seems well educated and nice.

That's a very partial vision of things, and the blind man may never hear about this porn shop that just opened a few blocks away, and where he would love to go every once in a while, all that because his neighbor wouldn't want to be seen there (by a blind man?).

Web vicinity

Back to the web. What if, every time you go to a site, you had to pass in front of some others? Not random sites, but a constant sequence of sites, depending on the place you're at and the place your target is? When I say place, I think IP address. I'd love to see a widget which detects existing websites around the routers and proxies leading you to your target, and show each one of them briefly (one or two seconds) during the course of your request.

Of course, you would quite often have to see websites that you may not want to see, but it's just like a mean neighbor that you have to greet when you cross his path, even if you know he beats his wife and watches TV all day long. And you know what? Life isn't better when you only see nice things (that's the big mistake of the Disney corp.). On the contrary, the more you see, the more you get to compare things with, and your vision of things becomes better established. In a word: the path to enlightenment.

It's not that fun, but it's good

Once again, it's not about related sites, webrings or some other site deciding for you (according to user ratings or editors' reviews). It's not about ad-paid fake randomness, and it's not about giving you surprises. It is about recreating an environment, having time to discover something through repetition, and finding where you are in the giant map of the world wide web.

In the long run, if you really can't stand your neighborhood, you can always move.

When Ajax can speed up your site

Many people think that adding Ajax interactions to a web application can cripple a website's performance. Of course, if you add remote periodical executers everywhere, or if you make three Ajax requests to update three parts of a page, the web server will just hate you (servers have feelings, you know). But there can be cases where Ajax can take some burden off the server, where it can be an architecture choice rather than a pure UI choice.

Does it sound familiar?

For instance, take the now classic "add a comment" Ajax form. The user enters data in a form, submits it, and the result is sent back to the client through XMLHttpRequest. There is an immediate benefit for the server here: it only has to send back the updated part of the page (in that case, the new comment) rather than the entire page. That represents a notable saving in bandwidth and CPU.

Super caching

Another example is an "almost static page", which means that the page contents depend on the user session only for some limited parts. Think of a news website where the only session-dependent part is the name of the connected user displayed on the upper part of the window. If this element wasn't present, the page would be a perfect candidate for super caching.

Super caching means storing a copy of the HTML response somewhere under the web root of the server, so that the next time the page is requested, the server sends the HTML response without even using PHP. This is very fast, and it can even be done by a lightweight, specialized server like lighttpd. Symfony has a super caching solution in the form of a plugin called sfSuperCachePlugin.

Ajax comes to the rescue

But, because of the session-dependent element, the page I am talking about cannot benefit from super caching. Can't it, really? What if the session-dependent element was removed, and added to the static page afterwards, by the web browser? That's where Ajax comes in. It is a great replacement for iframes, because the Ajax response can be any JavaScript code, used to do some complex DOM modification, and that is more powerful than just replacing an element's innerHTML.

Concretely, that's how you would design your pages to take advantage of the super cache:

  • The page is designed without session dependent element

  • The first time the page is requested, it is stored in the super cache

  • The page contains a static call to another action in Ajax.

For instance, if you use the jQuery Javascript framework, the end of the page can show something like:

<script>
$().ready(function () {
  $.getScript('/path/to/javascript/action');
});
</script>


The /path/to/javascript/action action gets the user's name from the session and database, and sends it back to the browser as a piece of JavaScript modifying the DOM of the static page to include the user's name.
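The response of such an action could look like the following sketch. It is only an illustration: the `user-box` element id, the `greetingFor` function and the literal user name (which the server would write from the session) are assumptions, not part of symfony or of any plugin.

```javascript
// Hypothetical response body of /path/to/javascript/action.
// The server would generate the userName literal from the session;
// the value and the "user-box" element id are assumptions for this sketch.
var userName = "John Doe";

// Build the greeting separately so that the text can be checked without a DOM.
function greetingFor(name) {
  return name ? 'Logged in as ' + name : 'Not logged in';
}

// Browser only: add the greeting to the placeholder of the cached page.
if (typeof document !== 'undefined') {
  var box = document.getElementById('user-box');
  if (box) {
    box.appendChild(document.createTextNode(greetingFor(userName)));
  }
}
```

Since $.getScript() evaluates the response as soon as it arrives, the cached page needs nothing more than the empty placeholder element.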

But wait a minute. Modifying a page after it is loaded with JavaScript, isn't that just what unobtrusive behaviours do? That's true, the sfUJSPlugin is designed exactly with this process in mind. Build the static, session-independent, accessible version first, and add the dynamic, session-dependent, highly interactive sugar in JavaScript afterwards. Or, to put it differently, design fast pages first, add the performance penalty afterwards. There is no longer any limit to the number of pages you can put in cache - even the most session-dependent pages can benefit from super cache.

Pros and cons

The performance advantage is not huge in symfony, because the real cost of a request is the framework initialization. Whether you send one page with cached fragments or two pages with only one using symfony, you will always have to initialize symfony once. But the solution described here will at least save you the time needed to deserialize a complex response from the cache, and it brings a cleaner logic to your design.

One drawback of these techniques is that the load taken off the server ends up being transferred to the client. The web browser has more to do, and the full response to a request takes more exchanges with the server to display - in short, the answer is somewhat slower for the end user. Besides, developers tend to forget accessibility when they code Ajax interactions, so the pages have to be thought out carefully.

Conclusion

To conclude, Ajax can make your website faster because it allows you to use super caching in pages that normally couldn't benefit from it. Symfony already has all the tools to put this idea into practice (namely sfSuperCachePlugin and sfUJSPlugin), so you should never have to buy a new server again.

Let the old ones die and attend their funeral

The web is overburdened with old sites, visited by nobody and victims of the pride of their creators, who don't want to let them go. "It costs nothing", they say, "and someone may want to read my opinion on carrot soup someday".

What about the old ones in real life

In real life, the old ones are visited regularly by the members of their family, so that they don't get forgotten. It's like a child's duty to pay a visit every once in a while to grandparents, old uncles and sick elderly aunt Tatiana. Until they die, and then you go to the funeral, gather with the nearest and dearest, cry a little, drink a lot, and start something else.

The same applies to the web

To avoid overpopulation, the web should follow the example given by family traditions. Any website forgotten for more than, say, a year, should be declared sick, and its creators/members/users/trackbacks should be told about the situation. They could all meet in the website's backyard on Sundays, to talk about the good old times when the website was still active, and about the Superbowl. That way, the website's visit figures, although low, would stay at an acceptable level.

Then the web doctor would visit the old website and check its health. He would advise against useless attempts at rejuvenation, give a few coins to the host so that the website doesn't get kicked out, check that it still looks acceptable on modern browsers... Web doctor could be a nice profession, and if it's like in real life, it would pay well.

Web funerals

Anyway. After numerous years of brave resistance of the patient, the web doctor could declare it dead. He would call the relatives and ask them to organize the ceremony. A chat would be organized in a #funeral IRC channel, people would overdress and exchange memories of the website, only to say that it was better in the old times, nowadays everything gets corrupted. The creator would choose a picture of the website, and put it on a virtual grave at graveyard.com. Then the host would be informed to wipe out all data, the search indexes would be told to do the same, and the website would only live in our memories, for the best.

How would that change the web?

I see numerous advantages to the old websites dying.

  • First of all, it leaves room for the youth. Yeah, altavista.com was a kick-ass site, but it's time to start using a real search engine.

  • Also, it forces us to keep the knowledge of the past, and is a good way to avoid repeating the same mistakes. The numerous web 2.0 sites coming out every day have a strange aftertaste of the 00s Internet bubble (unreasonable cash burn rate, ridiculous business potential).

  • New sites respect the older ones, and don't try to show off too much, even if they do better.

  • The growth rate of the web... well, this will probably not get better.

  • Your name can't be found related to some old story anymore. You know, when you wrote in a webmaster forum that JavaScript is crap.

  • Has-been website creators move on to something else (at last).

  • Information gathered from websites isn't outdated in fifty percent of the cases.

  • Hard disk manufacturers don't get rich so fast.

And most of all, my searches in Google would return relevant results.

Unobtrusive JavaScript made possible

Unobtrusive Javascript explained

The usual way of writing JavaScript in templates results in obtrusive code. For instance, take the classic link_to_function() helper, called in symfony like this:

<?php use_helper('Javascript') ?>
<?php echo link_to_function('see the image', "showPopup('image.jpg')") ?>


The code appearing in the template is:

<a href="#" onclick="showPopup('image.jpg'); return false;">see the image</a>


While this code effectively provides the correct interaction, it contains an inline event handler (onclick), which can create problems with screen readers. Additionally, the CSS revolution taught us to separate content from presentation, and we soon understood the benefits this could bring (reusability of the presentation layer, better maintainability, etc). Following the same concept would naturally lead to separating behavior from content, and this means putting away JavaScript code. Accessibility and layer separation recently led to a new way of using JavaScript called unobtrusive scripting. The term comes from Stuart Langridge, who started the movement in 2002, rapidly followed by JavaScript gurus like Peter-Paul Koch.

The problem with unobtrusive JavaScript is that it takes much longer to design and write. You must first deliver JavaScript-free XHTML content that works without any further addition, then execute JavaScript code on it to modify some elements, add event handlers, etc. This should sound familiar to those used to styling through CSS, but look at how long it takes:

<head>
  <script type="text/javascript" src="my_behaviors.js"></script>
</head>
<body>
  <a href="image.jpg" id="foobar">see the image</a>


// in my_behaviors.js
function initializePage()
{
  var x=document.getElementById('foobar');
  x.onclick = function () { showPopup('image.jpg'); return false; };
}
window.onload = initializePage;


The link works even if JavaScript is disabled. But in order to modify it afterwards, we must add an id attribute to it, then register its onclick event handler in JavaScript, and finally make sure that this JavaScript runs only once the DOM is ready, hence the execution when the onload event fires. This is the correct way to achieve the same effect as the first listing in an unobtrusive way.

If you ever tried this, you probably thought just like me: to hell with unobtrusiveness, there is no way I'll spend the rest of my life writing ten lines instead of one just for the sake of accessibility and layer separation!

Fortunately, there are two tools that could make you change your mind.

The first is a JavaScript framework. The best known (because it's the official JS framework of the best-known web application framework) is Prototype, but you could just as well consider jQuery or others. Doing unobtrusive JavaScript (OK, let's call it UJS until the end of this article) with a JavaScript framework is a lot easier than with plain JavaScript. See how the previous listing could be reduced by using jQuery:

<head>
  <script type="text/javascript" src="jquery.js"></script>
  <script type="text/javascript" src="my_behaviors.js"></script>
</head>
<body>
  <a href="image.jpg" id="foobar">see the image</a>


// in my_behaviors.js
$().ready(function(){
  $('#foobar').click(function(){ showPopup('image.jpg'); return false; });
});


The jQuery dollar function can use a CSS3 selector to find a DOM element, and the click method makes adding a handler much easier. Also, the ready method fires as soon as the document is ready, and you can attach more than one function to it.

But that's not enough. This is still way longer to write than the obtrusive version, which was using a single call to a PHP helper. But why not use a helper to add UJS code? This could be very easily done. The syntax could look like this:

<?php use_helper('UJS') ?>
<a href="image.jpg" id="foobar">see the image</a>
<?php UJS_add_behaviour('#foobar', 'click', "showPopup('image.jpg')") ?>


The UJS helper could very well manage storing the UJS code in the session and writing it in a separate file that would be called by the first template, transforming this listing into the previous one. In fact, I've written a symfony plugin called sfUJSPlugin that does exactly that.
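To give an idea of the logic behind such a helper, here is a rough sketch of the "collect, then write out" step. It is written in JavaScript only for the sake of illustration: the real plugin is written in PHP, and every name below (`createBehaviourRegistry`, `add`, `render`) is hypothetical and does not reflect the plugin's actual code.

```javascript
// Illustration only: the real sfUJSPlugin is written in PHP and its internals
// are not shown here. All names below are hypothetical.
function createBehaviourRegistry() {
  var behaviours = [];
  return {
    // Called from the template, once per behaviour
    add: function (selector, event, code) {
      behaviours.push({ selector: selector, event: event, code: code });
    },
    // Called at the end of the request to produce the content of the separate script
    render: function () {
      var lines = behaviours.map(function (b) {
        return '  $(' + JSON.stringify(b.selector) + ').on(' +
          JSON.stringify(b.event) + ', function () { ' + b.code + ' });';
      });
      return '$(document).ready(function () {\n' + lines.join('\n') + '\n});';
    }
  };
}

var registry = createBehaviourRegistry();
registry.add('#foobar', 'click', "showPopup('image.jpg'); return false;");
console.log(registry.render());
```

The generated text is a jQuery listing equivalent to the hand-written one (the sketch uses jQuery's on() method, available since jQuery 1.7, in place of click()). The template only declares what should happen, and the behaviours end up in a script separated from the content.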

It goes even further: if you use the regular symfony helpers, every mention of an event handler property gets automatically transformed into UJS code. So the same effect as before can be achieved with just:

<?php use_helper('UJS') ?>
<?php echo link_to('see the image', 'image.jpg', "onclick=showPopup('image.jpg')") ?>


This is not longer than the first example, and that's unobtrusive. At last, we have the right tools to make the web both accessible and usable.

Rename Cookies

Websites use cookies. It's a nice name for something that can be really nasty. It's also a good idea to use some kitchen metaphors in a world full of office metaphors.

But using the word "cookies" in the web context leads to some strange stuff like:

Do you want to clear your session cookies?

or even:

Accepting cookies can endanger your privacy

Today is our opportunity to change the name of cookies. We just have to figure out the best way to name them.

If our primary concern is to keep a sound relation to the current word, I propose

Cuckoo

or

Kickin

We would understand better why "accepting cuckoos can endanger your privacy" and "beware of session kickins".

If our primary concern is to keep the kitchen metaphor, then I propose:

Soufflé

or

Cheese nan

It doesn't make more sense than cookies as far as web browsers are concerned, but at least it emphasizes the fact that cookies are hot, hard to bake and delicate (for the first word) or exotic, indispensable and round (for the second one).

But being a revolutionary is all about pushing the limits - at least that's what Guy Kawasaki says. So while we are at it, why not allow ourselves to rename cookies without any constraint related to the current name?

That's a great idea. And it leads to great names, too. Judge for yourself:

morning-after pill (because when you look for it, it is often too late)

smut (because of all the dirt that's in it)

disclosure (so that you don't forget about it)

eskimo (because it is very resistant)

janitor (because it remembers you)

Now, what are your suggestions for the best new name for web cookies, in an ideal world?