Feed update mechanism

Sylvain Hellegouarch Tue, 16 May 2006 04:30:50 -0700

Hi everyone,

These days it seems that when a UA asks a server whether a feed has
changed, the server responds either with an HTTP 304 Not Modified status
code or by returning the complete updated feed.


This looks like a problem to me when only one or a couple of entries have
been updated within the feed, as it wastes a lot of bandwidth, especially
when the feed contains all the entries produced on the server.

Below are ideas I had about that issue. Nothing formal.

PROPOSAL 1
==========

1. A UA requests the feed for the first time and the server returns the
full current feed. The UA stores it in its cache.
2. The UA requests the feed again and provides the If-Modified-Since header.
   a. If nothing has changed, the server returns a 304 Not Modified.
   b. Otherwise the server checks which entries have been updated or
published since that date and returns a feed containing only those. The UA
updates its copy of the feed with the incoming data.
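
The UA-side update in step 2b could be sketched roughly as follows. This
is only an illustration: the function name merge_entries and the
representation of entries as dicts with 'id' and 'updated' keys are my own
assumptions, and it assumes timestamps are UTC strings in one fixed format
(like 2003-12-13T18:30:02Z) so that they compare correctly as plain
strings.

```python
def merge_entries(cached, incoming):
    """Merge entries from a partial feed into the cached copy.

    Both arguments are lists of dicts with 'id' and 'updated' keys.
    This in-memory shape is an assumption made for the sketch.
    """
    by_id = {entry["id"]: entry for entry in cached}
    for entry in incoming:
        current = by_id.get(entry["id"])
        # Add new entries; replace an existing one only if the incoming
        # version is at least as recent as the cached version.
        if current is None or entry["updated"] >= current["updated"]:
            by_id[entry["id"]] = entry
    # Return newest first, the order feeds conventionally use.
    return sorted(by_id.values(), key=lambda e: e["updated"], reverse=True)


cached = [
    {"id": "a", "updated": "2006-05-01T00:00:00Z"},
    {"id": "b", "updated": "2006-05-02T00:00:00Z"},
]
incoming = [
    {"id": "a", "updated": "2006-05-03T00:00:00Z"},  # updated entry
    {"id": "c", "updated": "2006-05-04T00:00:00Z"},  # new entry
]
merged = merge_entries(cached, incoming)
```

A client that does not perform a merge of this kind would simply overwrite
its cached copy with the partial feed.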

The big issue with this algorithm is that it adds semantics to the HTTP
caching system by expecting the UA to update its copy of the feed instead
of simply replacing it. Although specific aggregators could do that, it is
more than likely that some clients would just replace their copy with the
partial feed, drop the older entries and confuse the end user.

PROPOSAL 2
==========

1. A UA requests the feed for the first time and the server returns the
full current feed. The UA stores it in its cache.
2. The UA requests the feed again but provides a time range like this:
http://host/feed?after=2003-12-13T18:30:02Z&before=2003-12-13T19:30:02Z
   a. If nothing has changed, the server returns a 304 Not Modified.
   b. If entries have been published or updated in the time range, the
server returns a feed containing only those.

Of course, one might pass only 'after' or only 'before'.
The problem with this proposal is that it adds parameters to the URI which
the server might not handle. There are a few ways to deal with that:
   * The UA could first use the HTTP OPTIONS method to check whether the
server supports the said URI.
   * The client sends the request and the server returns 400 Bad Request
if it cannot handle it.
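
To make the client side of this proposal concrete, here is a rough sketch
of building the range URI and reacting to the status codes mentioned above.
The helper names range_url and next_action and the action labels are
hypothetical. Falling back to a full fetch on a 400 is one possible reading
of the second option, not something settled.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def range_url(feed_url, after=None, before=None):
    """Append the proposed 'after'/'before' parameters to a feed URI."""
    parts = urlsplit(feed_url)
    # Preserve any query parameters the feed URI already carries.
    query = parse_qsl(parts.query, keep_blank_values=True)
    if after is not None:
        query.append(("after", after))
    if before is not None:
        query.append(("before", before))
    # safe=":" keeps the timestamps readable instead of escaping colons.
    return urlunsplit(parts._replace(query=urlencode(query, safe=":")))


def next_action(status):
    """Map an HTTP status code to what the UA does next (assumed policy)."""
    if status == 200:
        return "merge"         # partial feed: merge into the cached copy
    if status == 304:
        return "keep"          # nothing changed in the requested range
    if status == 400:
        return "refetch-full"  # server does not understand the parameters
    return "error"


url = range_url(
    "http://host/feed",
    after="2003-12-13T18:30:02Z",
    before="2003-12-13T19:30:02Z",
)
# url == "http://host/feed?after=2003-12-13T18:30:02Z&before=2003-12-13T19:30:02Z"
```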


To support these proposals we could add a simple extension to the feed,
such as <atom:feed update-method="full|incremental|chunk">

When receiving the feed for the first time, the client would know which of
the following it can request:
     * full => fetch the complete feed each time it has been modified.
This is the default behavior as it exists today.
     * incremental => the server says it can serve entries published or
updated after a given date (equivalent to PROPOSAL 1, or to using only
the 'after' parameter in PROPOSAL 2)
     * chunk => the server says it can serve a feed of entries published
or updated between two dates (as explained in PROPOSAL 2)
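
For illustration, a client might read that attribute along these lines.
This is a sketch only: the update-method attribute is the extension
suggested in this mail and not part of Atom, and treating a missing or
unknown value as "full" is my own assumption.

```python
import xml.etree.ElementTree as ET

ATOM_FEED = "{http://www.w3.org/2005/Atom}feed"
KNOWN_METHODS = {"full", "incremental", "chunk"}


def update_method(feed_xml):
    """Return the update method advertised by a feed document."""
    root = ET.fromstring(feed_xml)
    if root.tag != ATOM_FEED:
        raise ValueError("not an Atom feed")
    # Missing or unrecognised values fall back to today's behavior.
    value = root.get("update-method", "full")
    return value if value in KNOWN_METHODS else "full"


doc = '<feed xmlns="http://www.w3.org/2005/Atom" update-method="chunk"/>'
method = update_method(doc)
```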

These are just some thoughts, and there might already be ways of dealing
with this. If not, I'd be happy to hear your feedback.

- Sylvain
http://www.defuze.org
