Build your own feed aggregator with symfony
With the help of the sfFeed2 and sfWebBrowser plugins, symfony makes creating a feed aggregator a breeze. Let's see what it takes to build the core of a Google Reader-like application.
Fetching feeds
First of all, you have to fetch feeds from the Internet. It is strongly recommended to fetch them asynchronously, i.e. not at the moment a user requests the page showing the aggregated feeds. There are two obvious reasons why you wouldn't want a synchronous process:
Remote servers providing the feeds would receive one request for every request made to your server. That's a nasty trick to play on other service providers, and it can skew the remote server's statistics.
If you have to fetch a dozen URLs per request, then the response time might exceed the server timeout.
So you have to fetch feeds, store them somewhere (on the filesystem or in a database), and keep them for later. I choose to store them on disk, which gives me an occasion to use the sfFileCache class. Here is the code that I write in a batch script:
<?php

define('SF_ROOT_DIR', realpath(dirname(__FILE__).'/..'));
define('SF_APP', 'frontend');
define('SF_ENVIRONMENT', 'dev');
define('SF_DEBUG', true);
require_once(SF_ROOT_DIR.DIRECTORY_SEPARATOR.'apps'.DIRECTORY_SEPARATOR.SF_APP.DIRECTORY_SEPARATOR.'config'.DIRECTORY_SEPARATOR.'config.php');

// Put the URLs of the feeds you want to fetch in an array
$urls = array(
  'http://api.flickr.com/services/feeds/photos_public.gne?format=rss',
  'http://del.icio.us/rss/popular',
  'http://feeds.feedburner.com/TechCrunch',
  'http://www.symfony-project.com/weblog/rss'
);

// Fetch the feeds, skipping any feed that cannot be retrieved
$feeds = array();
foreach ($urls as $url)
{
  try
  {
    $feeds[] = sfFeedPeer::createFromWeb($url);
    echo "fetched feed ".$url."\n";
  }
  catch (Exception $e)
  {
    // Print only the message; casting the exception to a string would also dump the stack trace
    echo "error fetching feed ".$url.": ".$e->getMessage()."\n";
  }
}

// Aggregate the feeds, keeping at most 10 items
$aggregated_feeds = sfFeedPeer::aggregate($feeds, array('limit' => 10));

// Cache the results
$f = new sfFileCache(sfConfig::get('sf_data_dir').'/feed');
$f->set('feeds', '', serialize($aggregated_feeds));
The interesting part of the batch is the use of the sfFeed2 plugin classes, made simple by the sfFeedPeer utility methods:
sfFeedPeer::createFromWeb() takes a URL as parameter, makes a request to this URL, decodes the response and populates an sfFeed object accordingly. It relies on the sfWebBrowser plugin for the HTTP request. It can recognize feeds of various formats (Atom1, RSS0.92, RSS1, RSS2).
sfFeedPeer::aggregate() takes an array of sfFeed objects and returns a single feed, in which all feed items are aggregated and ordered chronologically. The second parameter is an array of options, which I use here to limit the number of items present in the resulting feed.
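To make the aggregation step more concrete, here is a minimal sketch in plain PHP of what "merge, order chronologically, limit" amounts to. It is not the sfFeed2 implementation: the aggregateItems() function and the array-based items (with 'title' and 'pubDate' keys) are hypothetical stand-ins, and the newest-first ordering is an assumption.

```php
<?php
// Hypothetical stand-in for feed items: plain arrays with a 'title' and a
// 'pubDate' (Unix timestamp). This is NOT how sfFeed2 stores items.

/**
 * Merges the items of several feeds, sorts them newest first,
 * and keeps at most $limit items.
 */
function aggregateItems(array $feeds, $limit)
{
  // Flatten all the feeds into a single list of items
  $items = array();
  foreach ($feeds as $feedItems)
  {
    foreach ($feedItems as $item)
    {
      $items[] = $item;
    }
  }

  // Newest first is an assumption; the plugin's exact ordering may differ
  usort($items, function ($a, $b) {
    if ($a['pubDate'] == $b['pubDate'])
    {
      return 0;
    }
    return $a['pubDate'] < $b['pubDate'] ? 1 : -1;
  });

  return array_slice($items, 0, $limit);
}

$feedA = array(
  array('title' => 'A1', 'pubDate' => 100),
  array('title' => 'A2', 'pubDate' => 300),
);
$feedB = array(
  array('title' => 'B1', 'pubDate' => 200),
);

$latest = aggregateItems(array($feedA, $feedB), 2);
foreach ($latest as $item)
{
  echo $item['title']."\n";
}
```

With a limit of 2, only the two most recent items survive the merge, whichever feed they come from.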
Then I serialize the sfFeed object containing the aggregated items and store it on disk (under the data/ directory, to make it environment-independent) using the sfFileCache class.
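The serialize-and-store idea can also be illustrated outside of symfony. The sketch below uses only built-in PHP functions; storeSnapshot() is a hypothetical helper, the array stands in for the aggregated feed, and writing to a temporary file before renaming it is my own precaution, not something sfFileCache is known to do.

```php
<?php
// Hypothetical helper, not part of symfony: stores any serializable value in a file.
function storeSnapshot($path, $value)
{
  // Write to a temporary file first, then rename it, so that a reader
  // is unlikely to ever see a half-written file.
  $tmp = $path.'.tmp';
  if (file_put_contents($tmp, serialize($value)) === false)
  {
    throw new RuntimeException('Unable to write '.$tmp);
  }
  if (!rename($tmp, $path))
  {
    throw new RuntimeException('Unable to move '.$tmp.' to '.$path);
  }
}

// Stand-in data for the aggregated feed
$aggregated = array(
  array('title' => 'First post', 'link' => 'http://example.com/1'),
  array('title' => 'Second post', 'link' => 'http://example.com/2'),
);

$path = sys_get_temp_dir().DIRECTORY_SEPARATOR.'feeds_snapshot.cache';
storeSnapshot($path, $aggregated);

// Reading the file back gives a copy of the original structure
$restored = unserialize(file_get_contents($path));
var_dump($restored === $aggregated);

unlink($path);
```

Since the batch writes the file while the web application may be reading it, replacing the file in one step keeps the window for inconsistent reads small.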
I execute the batch once to test it and to generate the first version of the data/feed/feeds.cache file; as it needs to run periodically, I also add the following command to my crontab:
30 1 * * * cd /path/to/my/project && php batch/fetch_feeds.php
Displaying a feed
That's it for the first part. Now, what happens when a user requests the page showing the aggregated feeds? If this action is called feed/show, it can look like this:
public function executeShow()
{
  // Read the aggregated feed written by the batch script
  $f = new sfFileCache(sfConfig::get('sf_data_dir').'/feed');
  $this->feed = unserialize($f->get('feeds', '', true));
}
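The action above assumes the batch has already run at least once. As a hedged illustration of how a missing or unreadable file could be handled, here is a plain PHP sketch; loadSnapshot() is a hypothetical helper that is not part of symfony or sfFeed2, and it reads a raw file rather than going through sfFileCache.

```php
<?php
// Hypothetical helper, not part of symfony or sfFeed2.
// Returns the unserialized content of $path, or null if it cannot be read.
function loadSnapshot($path)
{
  if (!is_readable($path))
  {
    return null;
  }

  // unserialize() returns false on failure (and also for a stored false,
  // which this sketch deliberately treats as "no data")
  $data = @unserialize(file_get_contents($path));

  return $data === false ? null : $data;
}

// A path that does not exist, to simulate a batch that never ran
$missing = sys_get_temp_dir().DIRECTORY_SEPARATOR.'no_such_feed_'.uniqid().'.cache';

$items = loadSnapshot($missing);
if ($items === null)
{
  // Fall back to an empty list so that the template has nothing to loop over
  $items = array();
}

echo count($items)." item(s) to display\n";
```

In a real action, the same check would let the page render an empty list instead of failing when no feed has been fetched yet.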
The last thing I'll do is to display the details of each item, in feed/templates/showSuccess.php:
<?php foreach ($feed->getItems() as $item): ?>
  <div class="post">
    <h2><?php echo link_to(truncate_text(strip_tags($item->getTitle()), 40), $item->getLink()) ?></h2>
    Posted on <?php echo format_date($item->getPubDate(), "EEEE d MMMM 'at' h:ma ") ?>
    by <?php echo link_to($item->getFeed()->getTitle(), $item->getFeed()->getLink()) ?>
    <div class="summary"><?php echo truncate_text($item->getDescription(), 300) ?></div>
  </div>
<?php endforeach; ?>
That's where I'm glad that the sfFeed and sfFeedItem classes provided by the sfFeed2 plugin have the same accessors whatever the format of the feed (Atom, RSS, etc.). It makes displaying the details of a feed item very simple.
If you want to see the result, check the "outside" columns of the symfony community page.