So, the Google Web Accelerator. Like I said on del.icio.us, proxy servers can be a huge pain for web developers, and putting Google's name on one is inviting a shitstorm. Putting the web development stuff off for a moment, let's talk about why Google is providing this.
First off, it’s helpful. There’s no market for tools that fuck users over, and the Google culture seems to be built around helping users. OK, but what’s in it for Google aside from being helpful? The real benefit isn’t becoming more privacy-invasive, although I’m sure their advertising division will use it to sell you herpes medication.
Tristan Louis explains half of the benefit: Google will use the data to find information it hasn’t found yet. You should really read that if you haven’t already; Tristan has covered every inch of that argument. Summarized for people who read it and forgot (you’re not too lazy to click that link, are you? Of course not): Google will use the URLs to find pages that its spiders missed or haven’t gotten to yet, so that there are more results with newer information.
The other half is that PageRank is in trouble. When it first debuted, it brought order to the web. But it’s based on voting, and voting can be abused. Spammers are waging war on PageRank and Google has to respond manually. Look at how Syndic8’s link farming got shut down: Andy Baio had to make a blog post before they lost their PageRank. That is not a scalable solution and Google is a company that scales up with computers, not employees.
The way PageRank works is that it counts each link as a vote for a page, and votes from pages with higher PageRanks count for more. This is how Google organizes the web: it infers which pages are most popular from the number of links pointing to them.
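The voting idea can be sketched in a few lines of Python. This is an illustration of the published PageRank algorithm, not Google's actual implementation; the example "web" and the damping factor are just the conventional textbook choices.

```python
# A minimal sketch of PageRank: each link is a vote, and votes from
# pages that are themselves highly ranked count for more.

def pagerank(links, damping=0.85, iterations=50):
    """links maps each page to the list of pages it links to."""
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        new_rank = {p: (1 - damping) / n for p in pages}
        for page, outlinks in links.items():
            if not outlinks:
                # A page with no links spreads its rank evenly to everyone.
                for p in pages:
                    new_rank[p] += damping * rank[page] / n
            else:
                # Each outgoing link passes along an equal share of rank.
                for target in outlinks:
                    new_rank[target] += damping * rank[page] / len(outlinks)
        rank = new_rank
    return rank

# A hypothetical three-page web: both other pages link to "home".
web = {
    "home": ["about", "blog"],
    "about": ["home"],
    "blog": ["home"],
}
ranks = pagerank(web)
```

Because "home" collects votes from both other pages, it ends up with the highest rank, which is exactly the inference Google makes from link counts.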
What the Web Accelerator and the Toolbar do when they report what page you’re on is give Google traffic information almost as good as the web server’s logs, and sometimes better. This lets Google know directly how popular a page is; it doesn’t have to infer popularity from incoming links. It can then use that information to devalue sites that link farm, as well as to promote sites that are highly visited but not highly linked.
Google is watching you browse and using the information to organize the web better. I’m not going to tell you how to feel about that.
Now, for pages showing up on other people’s connections: that’s another part of HTTP. You need to look at Section 14.9 of RFC 2616, which explains the Cache-Control header. Personally, I hate dealing with caching proxy servers because they usually suck at following the standard, but if you build your app to the standard then you can blame the proxy server. You’ll still probably have to come up with a workaround, but at least you have the moral high ground. In this case, the standard is to send the header
Cache-Control: private on every response to a user that is signed in. Google should respect this header; if it doesn’t, then you shouldn’t feel so bad about Google not hiring you.
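Here's a sketch of what that looks like in practice, as a bare WSGI app in Python. The session check is a stand-in for whatever sign-in logic your application actually uses; only the header line is the point.

```python
# Send "Cache-Control: private" on any response meant for a signed-in
# user, so a shared cache (like Google's proxy) won't serve one
# person's personalized page to somebody else.

def app(environ, start_response):
    # Hypothetical session check: a real app would validate a session cookie.
    is_signed_in = "session_id" in environ.get("HTTP_COOKIE", "")

    headers = [("Content-Type", "text/html")]
    if is_signed_in:
        # Per RFC 2616 Section 14.9, "private" tells shared caches
        # not to store this response.
        headers.append(("Cache-Control", "private"))

    start_response("200 OK", headers)
    body = b"your personalized page" if is_signed_in else b"public page"
    return [body]
```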
Then there’s the controversy over the proxy breaking web applications. Google Web Accelerator doesn’t break web applications. If following a link deletes an item in your database, the application was already broken. It was built by a developer who doesn’t understand the difference between GET and POST. That would be like someone who doesn’t know the difference between RAM and a hard disk building desktop applications.
While web developers should read Section 9 of RFC 2616 to understand the difference between safe and idempotent methods, I’ll summarize: if a form or a link changes anything on the server, it should be called with POST; if not, GET.
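A sketch of that rule, again as a minimal WSGI handler: the destructive action only happens on POST, so a prefetching proxy or spider issuing GETs can follow every link without deleting anything. The in-memory `items` dict is a made-up stand-in for your app's real data layer.

```python
# Destructive actions happen only on POST; GET stays safe.

items = {1: "first post", 2: "second post"}

def delete_item(item_id):
    items.pop(item_id, None)

def delete_handler(environ, start_response):
    if environ["REQUEST_METHOD"] != "POST":
        # A prefetcher or crawler issuing GETs lands here and changes nothing.
        start_response("405 Method Not Allowed", [("Allow", "POST")])
        return [b"Use POST to delete items."]
    # Hypothetical parameter format: ?id=<number>
    item_id = int(environ["QUERY_STRING"].split("=")[1])
    delete_item(item_id)
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"Deleted."]
```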
I know a bunch of designers want text links that make changes because submit buttons are ugly. Unless the W3C adds a way to create POST links in XHTML (and I haven’t seen an argument against adding one), there are at least two ways to POST from a text link. Neither is really good, but both are better than the alternative of having an application break because a browser behaves correctly. The first is to have the link trigger a JavaScript function that submits a hidden form via POST.
The second way is what Instiki decided to do. Instiki used to have plain GET links that would cause a page to revert to an earlier version. When a search engine would index the site, it would follow those links and cause the page to roll back. The solution was to point the plain text link at a page containing a form that would perform the rollback. And can you guess the method attribute of that form element? It was POST. And now you know the rest of the story.
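The Instiki-style fix looks something like this: the text link does a harmless GET to a confirmation page, and the rollback only happens when the form on that page is submitted via POST. The URL, page names, and field names here are made up for illustration; this isn't Instiki's actual code.

```python
# The GET-able confirmation page: following the link changes nothing,
# because the real action is behind a POST form.

def confirmation_page(page_name, version):
    return f"""<p>Really roll back "{page_name}" to version {version}?</p>
<form method="post" action="/rollback">
  <input type="hidden" name="page" value="{page_name}">
  <input type="hidden" name="version" value="{version}">
  <input type="submit" value="Roll back">
</form>"""

html = confirmation_page("HomePage", 3)
```

A spider or prefetching proxy can GET this page all day; nothing rolls back until a human clicks the submit button.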