Tuesday, 6 July 2004

Citation and deep linking

Dorothea points out some problems around piecemeal citation:
[...] there’s no automated way to add callouts to one individual paragraph without adding callouts to all of them.

A more subtle explication of the problem: I could, if I chose, add individual id attributes to paragraphs on CavLec I thought especially worthy of notice. But who’s to say that my idea of noteworthy paragraphs meshes with any other blogger’s? Nobody, that’s who. (Not least because it’s an open question whether any paragraphs on CavLec are noteworthy.) The only way to ensure that anyone who wants to link to noteworthy paragraphs can do so is to assume that all paragraphs are potentially noteworthy.

Worse, even if I do add id attributes, there’s no way for a would-be linker to get at them for linking purposes except by inspecting my HTML code. Green hash marks may be crufty, but they address a genuine issue, one we might call “identifier invisibility.”

The way around this is to do what I just did: copy in the piece you are citing and link to the whole. It's a little cumbersome, but it has the benefit of resilience, since the original might vanish or be re-edited. A way to take this technique further is to use QuickTopic Document Review, as I did for AKMA's speech, for example. This adds paragraph citation links, enables inline comments, and archives a copy of the cited source elsewhere, so the citation link still works if the original changes or vanishes.
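To give a feel for what automated paragraph links involve, here is a rough Python sketch that gives every paragraph in a page an id and a visible link to itself. It is not how QuickTopic does it; the function name, the "p" id prefix, the "#" marker and the "para-link" class are all my own illustrative assumptions, and a regular expression is only good enough for simple, well-behaved HTML.

```python
import re

# Matches an opening <p> tag, with or without attributes.
# It does not match <pre>, <param> and similar tags.
P_OPEN = re.compile(r"<p(\s[^>]*)?>", re.IGNORECASE)

# Finds an existing id attribute inside a tag's attribute string.
ID_ATTR = re.compile(r"\sid\s*=\s*[\"']([^\"']+)[\"']", re.IGNORECASE)


def add_paragraph_anchors(html, prefix="p", marker="#"):
    """Give every <p> an id and a visible link to itself.

    Paragraphs that already carry an id keep it. The prefix and marker
    are illustrative choices, not part of any standard.
    """
    counter = 0

    def replace(match):
        nonlocal counter
        attrs = match.group(1) or ""
        existing = ID_ATTR.search(attrs)
        if existing:
            anchor = existing.group(1)
            tag = match.group(0)
        else:
            counter += 1
            anchor = f"{prefix}{counter}"
            tag = f'<p id="{anchor}"{attrs}>'
        # The visible marker is the answer to "identifier invisibility":
        # a reader can copy the link without inspecting the HTML.
        return f'{tag}<a class="para-link" href="#{anchor}">{marker}</a> '

    return P_OPEN.sub(replace, html)
```

Because every paragraph gets a link, nobody has to guess in advance which ones are noteworthy. One caveat of numbering by position: the ids shift if a paragraph is later inserted above, which is another reason to keep a copy of what you cite.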

This is the same issue Jon Udell discussed last month for MP3s.
If you want to cite an MP3 in a stable way, you can do it by copying a fragment and saving it locally, and linking back to the original source file. We don't try to dynamically insert chunks of text from other people's servers into the middle of our prose; why do it for media?
What is missing here is the rich media equivalent of QuickTopic Document Review: something that mirrors the media and adds annotation. Building a tool to do this would be a fine project for the Internet Archive.
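As a rough sketch of what such a citation might record, here is a small Python example that ties a locally saved excerpt to its original and renders a link to both. Everything in it is an assumption of mine for illustration: the ClipCitation name, its fields, and the HTML it produces. It does not cut the audio; it only describes a clip you have already saved.

```python
from dataclasses import dataclass
from html import escape


@dataclass(frozen=True)
class ClipCitation:
    source_url: str     # where the full original lives
    local_copy: str     # path or URL of the excerpt saved on my own server
    start_seconds: int  # where the excerpt begins in the original
    end_seconds: int    # where the excerpt ends
    note: str           # my annotation


def _clock(seconds):
    """Format a number of seconds as m:ss."""
    minutes, secs = divmod(seconds, 60)
    return f"{minutes}:{secs:02d}"


def render(citation):
    """Return an HTML paragraph linking the local excerpt and the original."""
    span = f"{_clock(citation.start_seconds)}-{_clock(citation.end_seconds)}"
    return (
        f"<p>{escape(citation.note)} "
        f'(<a href="{escape(citation.local_copy)}">excerpt, {span}</a>, '
        f'from <a href="{escape(citation.source_url)}">the original</a>)</p>'
    )
```

The point of keeping both links is the same as for text: the local excerpt survives if the original moves or is re-edited, and the link back gives the source its due.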
