Sunday, 20 October 2002

Congestion control for RSS

Dave Winer, Joel Spolsky and Phil Ringnalda are discussing how RSS aggregators that check for updates by polling can add up to what is effectively a Distributed Denial of Service attack.

As others have said, adopting HTTP's "If-Modified-Since" conditional fetch can help here, by only doing a full-page fetch when the RSS has changed. In addition, adopting RFC 3229's way of only sending changes (delta encoding) will help reduce the bandwidth of the RSS fetches (I mentioned this back in January when it first came out).
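To make the conditional fetch concrete, here is a minimal sketch in Python using the standard library's urllib. The function names, the example URL and the 30 second timeout are my own assumptions for illustration; the only part that comes from HTTP itself is the If-Modified-Since header and the 304 "Not Modified" reply.

```python
import urllib.error
import urllib.request


def build_conditional_request(url, last_modified=None):
    """Build a GET request that asks for the feed only if it has changed.

    last_modified is the Last-Modified value remembered from the previous
    successful fetch (hypothetical bookkeeping kept by the aggregator).
    """
    req = urllib.request.Request(url)
    if last_modified:
        req.add_header("If-Modified-Since", last_modified)
    return req


def fetch_if_changed(url, last_modified=None):
    """Return (body, last_modified); body is None when nothing changed."""
    req = build_conditional_request(url, last_modified)
    try:
        with urllib.request.urlopen(req, timeout=30) as resp:
            return resp.read(), resp.headers.get("Last-Modified")
    except urllib.error.HTTPError as err:
        if err.code == 304:  # Not Modified: no body was sent
            return None, last_modified
        raise
```

The aggregator would store the returned timestamp next to each feed and pass it back in on the next poll. A 304 still costs a connection, which is the part this does not fix.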

However, this doesn't reduce the number of HTTP connection setups and teardowns. To do that, the aggregators need to get smarter. They can do this by estimating an update frequency for each feed - something modelled on TCP's congestion control (exponential back-off, with 'no change' treated as congestion) would probably suit well.
If the aggregator polls the feed and finds no changes, it doubles the polling interval. If it polls and finds changes, it shortens the polling interval by the number of changes found multiplied by the user's base polling interval. The lower bound on the interval comes from the maximum polling frequency set by the user (once an hour is common). You could set an upper bound too, or leave it open and let the interval itself establish which blogs are moribund.
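A rough sketch of that rule in Python, to show the arithmetic. The function and parameter names are hypothetical, and I am assuming the amount subtracted per change is the user's minimum interval (one hour here); the one week upper bound in the usage lines is likewise just an example value.

```python
def next_interval(current, changes, min_interval=3600, max_interval=None):
    """Return the next polling interval in seconds for one feed.

    current      -- interval used for the poll that just finished
    changes      -- number of new or changed items that poll found
    min_interval -- shortest interval the user allows (assumed: one hour)
    max_interval -- optional cap; None lets idle feeds back off indefinitely
    """
    if changes == 0:
        # 'No change' is treated like congestion: back off exponentially.
        proposed = current * 2
    else:
        # Shorten the interval by one minimum interval per change found.
        proposed = current - changes * min_interval

    if max_interval is not None:
        proposed = min(proposed, max_interval)
    return max(proposed, min_interval)


# A quiet feed doubles from one hour to two.
quiet = next_interval(3600, changes=0)
# A feed polled every four hours that shows two new items drops to two hours.
busy = next_interval(4 * 3600, changes=2)
# With a one week cap, a long-idle feed stops backing off at the cap.
capped = next_interval(400000, changes=0, max_interval=7 * 24 * 3600)
```

With no cap, a feed that never changes just keeps doubling, so a very large interval is itself the signal that the blog is moribund.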
