Thursday 6 December 2012

The Antifragility of the Web

We’re used to taking the web for granted. We expect it to be there as a substrate, with its addresses, declarative documents, universally available programming language and the links between pages.

Ah, the links. There’s the rub. How many times have you followed a link and got a 404 or a different page than you were expecting? Links rot. As Tim Berners-Lee says, eventually every domain becomes a porn site.

So we want to do better. We want to build a non-web web. A special place for ourselves and our friends that is self-contained, and where all the pages and links are in the same database, and they can’t rot.

Instead of these messy links with protocols and domains in them, we just use @names or +names and #topics and tags. It’s easier for people to do, it's self-consistent, and it grows explosively. Biz dev gets excited about the reciprocal deals we can do with other content owners.

If you’ve read Nassim Taleb’s Antifragile, you know what comes next. By shielding people from the complexities of the web, by removing the fragility of links, we’re actually making things worse. We’re creating a fragility debt. Suddenly, something changes - money runs out, a pivot is declared, an acquihire happens, and the pent-up fragility is resolved in a Black Swan moment.

The special place disappears entirely. Or, if we’re lucky, the Archive Team lights the cat signal and emergency archivists preserve it in formaldehyde somewhere else, the clock stopped, the links severed.

Meanwhile, out there on the web, people can still connect and discuss and say what went wrong, and do better next time. The web itself is antifragile. It interprets our business models as damage and routes around them. If we’ve learned, we’ll respect this next time we make something.

Thursday 24 May 2012

Keep ALL the versions

Back in the 1980s, storage was expensive and slow. You had a copy of your document in memory and you would be asked every time you wanted to save it out to disk, because you didn't want to fill the disk up. That paradigm is so out of date now it's embarrassing to try to explain to my sons what the little floppy disk icon is in Microsoft Word. "What's that?" "It's a floppy disk." "Oh yeah, I think I saw one in the garage once."

The world view of having to load and save is being gradually eroded. Apple has changed the operating system - Mountain Lion no longer has Save and Save As, but instead a model of going back through edit histories. Google Docs originally didn’t have a floppy disk icon, but put it back because people were looking for it in user testing. Now it has been removed.

We've had source code control as programmers for a long time. But the GitHub world takes that further: the first thing you do is clone a project into your own repository, then start forking it. You can eventually merge stuff back later, but there is the assumption that things are happening in parallel. As James Governor put it:

Open Source used to count download numbers as a measure of developer success.

Today, we increasingly use forks as the metric of traction.

Wikipedia has put this into the public consciousness by having publicly visible edit histories, so you can go back and forwards in time over the history of the article. The paradigm of "storage is not a problem, we should keep every version of everything ever" is moving through culture to be a default assumption.
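Here is a minimal sketch of what "keep every version" means as a data structure, in Python. The class and method names are my own invention for illustration, not how Wikipedia or any other product actually stores revisions: saving appends to a history instead of overwriting, and even an undo is recorded as a new version.

```python
class VersionedDocument:
    """Append-only history: saving adds a version, nothing is overwritten.

    Hypothetical illustration only; names are not taken from any real product.
    """

    def __init__(self, text: str = ""):
        self._versions = [text]

    @property
    def current(self) -> str:
        """The most recently saved version."""
        return self._versions[-1]

    def save(self, text: str) -> int:
        """Record a new version and return its index in the history."""
        self._versions.append(text)
        return len(self._versions) - 1

    def at(self, index: int) -> str:
        """Read any earlier version by index."""
        return self._versions[index]

    def revert_to(self, index: int) -> int:
        """Undo by appending a copy of an old version, so the edits
        being undone stay in the history too."""
        return self.save(self._versions[index])


if __name__ == "__main__":
    doc = VersionedDocument("first draft")
    good = doc.save("second draft")
    doc.save("a change I regret")
    doc.revert_to(good)
    print(doc.current)
```

The cost is that storage only ever grows, which is exactly the trade that stopped being a problem once disks got cheap.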

This will be something we want on mobile too. The issue of “which have I got on the phone and which have I got in the cloud?” is what makes it tricky. I think there will be a battle between Apple and Google about how you present that to the user in a coherent way - Google Drive and iCloud are taking different paths here, with Dropbox actually working between both. Google Drive not storing copies of Google Documents locally is a mistake I expect them to fix.

Everything is moving in this direction, even low level system design. The growth of functional programming is all about not having contention over a single copy of things in memory but having paths through data that are modifying things in their own version of the world.

If you think about the difference between the way JavaScript handles stuff and the way Java does, Java still has a ‘data structure being passed around’ world view. JavaScript has closures that are passed around; they contain the entire state of the current machine at the time of that event, and are held while you go off and do something else and come back. One of the reasons that node.js feels so nice when you're writing web programs is that it has the feeling you’re used to on the client side: you do something and then call something, passing it a callback. You end up writing the server side stuff in the same way, and it's just naturally parallelizable. It's easy to spin up more machines for it because you've written stuff with the presumption that each instance of it is wholly independent.

Not everything can be written that way. The core of node is in C because it has to deal with the raw machine contention and deal with this routing, but the spread of the functional world view is a natural fit for web apps.

The other potential that node.js makes manifest is convergence of client and server code. If you're running and writing the same code in the same language, with the same libraries running on both the client and the server you can decide to migrate bits back and forth much more easily.

You don’t have to spend so much time deciding which is which, or worrying about the boundaries of the world and the different shapes of the data structures. So if you're creating a JSON object on the server and passing it to the client, you can decide whether you do that or not, at which point you do it, and at which point you render it and at which point you don't. That sort of fluidity is going to become more important over time. Pat Patterson showed how this can work for mobile apps too.

The programmers' world view has changed on this and it is permeating out to the world as programmers make those ideas available to the public in usable form. The invention of ‘Undo’ was a huge part of what made the Mac great - enabling users to experiment safely. Being able to retrospectively undo mistakes later, or learn from others’ public variations on a theme, is going mainstream too.

This post is based on discussions I had on This Week In Google ep 143, with Gina Trapani, Jeff Jarvis and Tom Merritt, which were transcribed by Michael Shook from this video of the show.

Updated: At the same time, JP Rangaswami wrote Warning: Contains Warnings, which gives more context about how 'Undo' helps protect innovation.

Sunday 1 April 2012

Draw Something CEO, grace and high school mathematics

Dan Porter, CEO of OMGPop, has had a good week. His game, Draw Something (it is an asynchronous Pictionary for cellphones, like Words With Friends is an asynchronous Scrabble) has taken off like mad, and Zynga bought his company for over $200 million. However, one employee didn't go along to Zynga, and Dan's been whining on twitter:

This has drawn some reactions from others, eg Notch, CEO of Minecraft:
and Dick Costolo, CEO of Twitter:
and the lovely Tom Coates:

Now just before this crass public display of arrogance, he said something just as telling:

The thing is, Draw Something has a maths problem. The so-called Birthday Paradox is kicking in. This is named for the unexpected result that if you have 23 people in a room, there's a slightly better than 50:50 chance that two of them share a birthday.
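If you want to check the 23-person figure yourself, here is a small Python sketch. It assumes 365 equally likely birthdays and ignores leap years; the function name is just one I made up for this post.

```python
def shared_birthday_probability(people: int, days: int = 365) -> float:
    """Chance that at least two of `people` share a birthday.

    Assumes every day is equally likely and ignores leap years.
    """
    if people > days:
        return 1.0  # pigeonhole: with more people than days a repeat is certain
    all_distinct = 1.0
    for k in range(people):
        # person k+1 has to avoid the k birthdays that are already taken
        all_distinct *= (days - k) / days
    return 1.0 - all_distinct


if __name__ == "__main__":
    for n in (10, 22, 23, 50):
        print(n, round(shared_birthday_probability(n), 3))
```

The trick is to work out the chance that everyone is different and subtract it from one. That chance shrinks quickly because the number of pairs grows much faster than the number of people, and 23 is the first group size where a match becomes more likely than not.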

There's a similar effect with games. If you keep randomly picking a word from a list, you'll see repeats quickly. Classic board games understand this - this is why Balderdash, Pictionary, Trivial Pursuit etc insist you use a discard pile after shuffling and picking a question, so you only pick a new card from those you haven't seen. Draw Something isn't doing this, so we're all seeing words repeat, which is discouraging play. This is all over twitter too:
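Here is a rough Python sketch of the difference. I don't know how big Draw Something's word list is or how it actually picks words, so the 1,000-word list and all the names below are assumptions for illustration. The first function picks at random with replacement, which is what the game appears to be doing; the class behaves like a board game's discard pile.

```python
import random


def picks_until_repeat(word_count: int, rng: random.Random) -> int:
    """Pick uniformly at random WITH replacement and return how many
    picks had been made when the first repeated word turned up."""
    seen = set()
    picks = 0
    while True:
        word = rng.randrange(word_count)
        picks += 1
        if word in seen:
            return picks
        seen.add(word)


class DiscardPileDeck:
    """Deal every word once before any word can come round again."""

    def __init__(self, words, rng: random.Random):
        self._words = list(words)
        if not self._words:
            raise ValueError("need at least one word")
        self._rng = rng
        self._pile = []

    def draw(self):
        if not self._pile:
            # deck exhausted: shuffle the full set again and start over
            self._pile = self._words[:]
            self._rng.shuffle(self._pile)
        return self._pile.pop()


if __name__ == "__main__":
    rng = random.Random()
    trials = [picks_until_repeat(1000, rng) for _ in range(2000)]
    print("average picks before a repeat:", sum(trials) / len(trials))
```

With a list of 1,000 words, picking with replacement gives a first repeat after roughly 40 picks on average, which is the Birthday Paradox again. The discard-pile version guarantees 1,000 draws before the deck is reshuffled and a word can come round again.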

One way or another, I think Draw Something has peaked.

Wednesday 21 March 2012

When you're the merchandise, not the customer

Jonathan Zittrain posted today that he is not the source of the quote widely attributed to him:

I participated in the Berkman Center’s fascinating HyperPublic symposium in the summer of 2011. When moderating a panel I invoked the aphorism that “When something online is free, you’re not the customer, you’re the product.” It’s a way of encapsulating the idea that online free services usually make money by extracting lots of data from users — and then selling that data, or using it for targeted availability of those users for advertising, to advertisers. In that sense, the advertisers are the clients, and the users enjoying free content are what’s being sold. (Of course, sometimes that happens even when the user pays.)

I didn’t coin the phrase, and since it was featured (and attributed to me!) in wordsmith.org’s wildly popular “word a day” as a thought for the day accompanying the word “enceinte” — I sought to nail down its provenance.

The first use of the quote that we can find is as a comment within the famed MetaFilter community in August 2010. The user’s name is blue_beetle, who might be someone named Andrew Lewis. It’s entirely possible I saw it there, as MeFi is one of my five favorite sites on the Web.

I was pretty sure this idea dates back further, so I went digging. First I found Josh Klein's 2009 blogpost, which cites Philip Broughton's 2008 book Ahead of the Curve: Two Years at Harvard Business School:

My favorite moment comes in an anecdote about an MBA candidate who, not getting his way, complains to an administrator, “I’m the customer! Why are you treating me so badly?”

To which the administrator responds, “you’re not the customer. You’re the product.”

But the sense is not quite the same there - an MBA is not a free web service after all. Going back a little further, this 2006 discussion at Joel on Software Is the Magical Fairy-tale For Google Engineers about to End? (nicely prefiguring James Whittaker's Why I left Google) includes this contribution from Drew K:

Like clam pointed out, Google's customers are the advertisers. "Skooter" is a user. Just like with ad-supported broadcast TV, you're not the customer, you're the product.

The idea is pretty well-expressed there, but I think we can go back further. In 2004, Coding Forums discussed the then-new Gmail, and liorian commented:

From a Google perspective, you're not the customer. The ad service buyer is the customer. You're the commodity. By making you a more attractive commodity, i.e. by making sure to only serve you an ad if you are in the target population for it, they are making the ads pay better for their customers, and they can reap a large part of the difference to their competitors, the other ad services.

This isn't a new idea then, as the analogy to television makes clear. The earliest, most thorough exegesis of this idea I have found is Claire Wolfe's 1999 article Little Brother Is Watching You: The Menace of Corporate America, which opens with:

Perhaps because you're not the customer any more. You're simply a "resource" to be managed for profit. The customer is someone else now — and usually someone without your best interests at heart.

And has a continuing refrain of “Who is the Customer? Not you”, ending with:

Who is the customer? Not you, whose life is reduced to someone else's salable, searchable, investigatable data. The customer is everyone who wishes to own a piece of your life.

The underlying warning is definitely worth thinking about — Maciej Ceglowski eloquently made the case for why you should pay Software Artisans on a recent TummelVision — but the deeper changes to what it means to be a customer matter too. There are other things we take part in without paying or being sold, because we find shared value in them, and the net enables those too.

Saturday 28 January 2012

QR Codes: bad idea or terrible idea?

People have a problem finding your URL. You post a QR Code. Now they have 2 problems. Or more:


  1. They see a chunk of robot barf on your poster, and have to realise it isn't a crossword puzzle, but a QR code.
  2. They need to take a digital photograph of it with their phone. If they have a laptop, even with a camera, this requires physical contortions.
  3. They need an application on their phone that can make sense of a QR code.
  4. They need a lot of patience as they fiddle with it.
  5. They need a working network connection to resolve it.

Conversely, with a URL they could type it in, take a photograph of it and type it in later, or if they have the right app, it will recognise the URL text from the image and make it clickable.

That is the irony of this. QR Codes ignore years of research and culture on how to communicate meaning in symbolic form designed to be captured by image processing tools behind a lens. We have this technology. It is called writing.

Written language has a set of symbols that are relatively unambiguous, that are formed of curves rather than hard edges, making them resilient to noise, and that have been market-tested for millennia. QR Codes don't just ignore this, they ignore the relative success of one dimensional barcodes. Notice something about a barcode? It has the number printed on it as well, so you can type it in if the scan fails. QR Codes don't do this, so it's far too easy to put the wrong one in, or fail to replace a mockup. Which is why so many QR codes link to Justin's site instead.

The only place you should use QR codes is if you have a dedicated reader for them, like a classic barcode scanner, and a workflow that is designed for this that actually saves time. If you do empirical research on using QR codes for the public, you'll likely see 80% worse performance than text, like this museum did. By all means try the experiment and report your results. Put up a QR code and a printed URL and see which gets the most usage.

Or listen to others:

a majority of our respondents knew more or less what they were for, very few (n=2, or around 7%) were successfully able to use QR codes to resolve a URL, even when coached by a knowledgeable researcher.[..] A strong theme that emerged — which we certainly found entirely unsurprising, but which ought to give genuine pause to the cleverer sort of marketers — is that, even where respondents displayed sufficient awareness and understanding of QR codes to make use of them, virtually no one expressed any interest in actually doing so.

As Alexis Madrigal puts it:

Is it really faster and better to use a QR code that will direct you to part of a marketing campaign rather than getting a broader sweep of information by simply using the browser that you already use all the time on your phone? In the instant cost-benefit analysis I do every time I see a QR code, it has yet to make sense for me to fire up the decoder app I have installed on my phone.

Monday 23 January 2012

Google Plus admits they want fake names

Today, after 7 months, Bradley Horowitz announced that Google Plus will accept some pseudonyms. Kinda. If you can prove you're already famous. And can convince their robot it looks like a name. However, Google Engineer Yonatan Zunger spills the beans in a comment on that thread:

First of all, you might ask why we have a names policy at all. (i.e., why we don’t simply go with the JWZ proposal) One thing which we have discovered, while putting some miles on the system, is that it is indeed important to have a name-based service rather than a handle-based service. This isn’t a matter of functionality so much as of community: You get a different kind of community when people are known as Mary Smith than when they are known as captaincrunch42, and for a social product in particular we decided that the first kind of community is the one we want to build. In order to do that, we want to establish a general norm that the names you put in to the system should be names, not handles.

So one thing that our name checking flow tries to catch is handles, which should normally be nicknames, shown in addition to a name. The other important thing it’s trying to catch is people who are creating individual accounts, rather than +Pages, for non-human entities such as businesses or organizations. The behavior of +Pages is deliberately restricted in the system, and we don’t want people to be creating fake human accounts to circumvent that. The name check turns out to be a very powerful tool to catch these.

Our name check is therefore looking, not for things that don’t look like “your” name, but for things which don’t look like names, period. In fact, we do not give a damn whether the name posted is “your” name or not: we will not challenge you on this basis, nor is there any mechanism for other users to cause you to be challenged for this.

There are two main cases where the name check screws up. One is false positives: people (such as you) who have unusual names which get flagged because they looked like handles. Being able to appeal via things such as drivers’ licenses is useful for this case, since it’s a simple “oh, we got this wrong.” The other case is people such as +trench coat, who are so well-known under this handle that it would be bizarre not to let them onto the system under this name. For this case, we allow appeals based on being well-known under the name: thus the ability to prove the “established pseudonym.” We’ve deliberately set the threshold for that latter case fairly high for now, but we intend to continue to tune it; the objective is that the frequency of such names should basically be the same as their frequency in meatspace.

So to answer your questions one-by-one:

(2) “Meaningful following” only applies to cases of established pseudonyms which do not look like names. The definition of “meaningful” is deliberately vague so that we can tune it, so that it behaves in a natural fashion.

(3) That’s correct; drivers’ licenses are for false positives, not pseudonyms.

(4) Unusual names will indeed hit friction, because of false positives. We’re trying to minimize that, but it’s going to take some trial and error.

(5) Google+ can absolutely be your first identity online. No matter what your language, no matter where you come from. The “established pseudonym” logic should apply to a very small subset of people. If some groups are seeing a higher false positive rate than others, that’s a bug, not a feature, and we have the data available to spot this situation and remedy it.
(posted in full, in case of subsequent retraction, and because G+ doesn't have permalinks for comments)

Yonatan admits what Bradley obscures: that this is an Identity Theatre issue. They don't want your name. They don't care if you have a forename in one language and a surname in another. Let me quote this exactly:

Our name check is therefore looking, not for things that don’t look like “your” name, but for things which don’t look like names, period. In fact, we do not give a damn whether the name posted is “your” name or not: we will not challenge you on this basis, nor is there any mechanism for other users to cause you to be challenged for this.

This is what I suspected when I wrote Google Plus must stop this Identity Theatre:

Google+ is letting an algorithm decide what is a name and what isn't. You will be forced into its Procrustean idea of what names are, or be harassed for it. You have to pass as normal, like call centre workers forced to learn to sound American.

You can create disposable accounts with fake names, as long as they look plausible to Yonatan's bot.


This algorithm has allowed people called 'panel heater', 'The Phoenix Rising', 'tous les mais du monde' and Mehr Decent, a bot with a well-known actress's photo posting links to a single website, to follow me (and that's just in the most recent 30 I checked).

So Google continues to encourage fakers and discourage those who need a pseudonym for good reasons.

Could Apple make premium devices in the USA?

After This American Life's disturbing episode on Apple's Chinese factories, the NYT wrote a defence of Apple, which said it was just too expensive to build their products in the USA:

Not long ago, Apple boasted that its products were made in America. Today, few are. Almost all of the 70 million iPhones, 30 million iPads and 59 million other products Apple sold last year were manufactured overseas.

Why can’t that work come home? Mr. Obama asked.

Mr. Jobs’s reply was unambiguous. “Those jobs aren’t coming back,” he said.

For computers, phones and tablets, it's hard to make a real premium product, as the economies of scale work so well - Tim Cook's Apple has closed in on PC prices by a focus on costs and suppliers, and by building fewer models and relying on Chinese flexibility to ramp them up.

The Gold iPad 2 had a huge premium price, but also weighed more than 3 times as much as a normal iPad.

Instead, what if Apple made premium USA iPads, MacBooks and iPhones? They could have a distinctive look, so people knew they were US made, focus on the higher-end models, and charge a premium markup for the warm glow of supporting US jobs.

How much more would it cost? Hard to say, according to the NYT:

It is hard to estimate how much more it would cost to build iPhones in the United States. However, various academics and manufacturing analysts estimate that because labor is such a small part of technology manufacturing, paying American wages would add up to $65 to each iPhone’s expense. Since Apple’s profits are often hundreds of dollars per phone, building domestically, in theory, would still give the company a healthy reward.
[...]
Another critical advantage for Apple was that China provided engineers at a scale the United States could not match. Apple’s executives had estimated that about 8,700 industrial engineers were needed to oversee and guide the 200,000 assembly-line workers eventually involved in manufacturing iPhones. The company’s analysts had forecast it would take as long as nine months to find that many qualified engineers in the United States.

In China, it took 15 days.
[...]
A few years after Mr. Saragoza started his job, his bosses explained how the California plant stacked up against overseas factories: the cost, excluding the materials, of building a $1,500 computer in Elk Grove was $22 a machine. In Singapore, it was $6. In Taiwan, $4.85. Wages weren’t the major reason for the disparities. Rather it was costs like inventory and how long it took workers to finish a task.

Compared to the huge price disparities for other goods, these seem modest; for example, Timoni found a nice carry-on bag recently:


So here's my proposition for Tim Cook:
Reopen the Elk Grove Apple factory to sell top-line Apple products, designed for those who want 'designer' luxury goods, and are willing to pay more for exclusivity. Make the 'made in USA' a key argument for a premium price. That way you need fewer staff than in China, and paying them well just adds to the cachet of the devices. You could cover them in Jasper Johns Flag, visibly number them as a limited edition, or come up with something more creative. As a way of extending the product line to a new, higher price point, while quieting those who wish Apple did more in the US, it seems an obvious move.

Tuesday 17 January 2012

Translation from sanctimonious bluster to English of Chris Dodd's statement on the internet blackout protests

WASHINGTON —The following is a statement by Senator Chris Dodd, Chairman and CEO of the Motion Picture Association of America, Inc. (MPAA) on the so-called “Blackout Day” protesting anti-piracy legislation:

Senator and CEO - let's lead with the revolving door promises to politicians

“Only days after the White House and chief sponsors of the legislation responded to the major concern expressed by opponents and then called for all parties to work cooperatively together,

Why are my former colleagues listening to their constituents about legislation? Don't they stay bought?

some technology business interests are resorting to stunts that punish their users or turn them into their corporate pawns, rather than coming to the table to find solutions to a problem that all now seem to agree is very real and damaging.

Maybe if we keep saying copyright infringement is a real problem without evidence, they'll believe it.

It is an irresponsible response and a disservice to people who rely on them for information and use their services.

How dare they edit their sites unless we force them to under penalty of perjury and felony convictions?

It is also an abuse of power given the freedoms these companies enjoy in the marketplace today.

Tomorrow was supposed to be different, that's why we bought this legislation.

It’s a dangerous and troubling development when the platforms that serve as gateways to information intentionally skew the facts to incite their users in order to further their corporate interests.

Being the gateways and skewing the facts is our job, dammit.

A so-called “blackout” is yet another gimmick, albeit a dangerous one, designed to punish elected and administration officials who are working diligently to protect American jobs from foreign criminals.

I am high as a kite

It is our hope that the White House and the Congress will call on those who intend to stage this “blackout” to stop the hyperbole and PR stunts and engage in meaningful efforts to combat piracy.”

What have the Romans done for us? Apart from instantaneous global communications, digital audio and video editing, the DVD, Blu-ray, Digital projection, movie playback devices in everyone's pockets and handbags...

How to fight this nonsense

(with apologies to John Gruber and Mark Pilgrim)