The Consequences of Hiybbprqag'ing

Like so many who watch developments in the tech world, I have been absorbed over the last few days in the ‘Bing Sting’ and precisely what it means. The debate is quite passionate. Following the initial story from SearchEngineLand by Danny Sullivan that documented Google’s identification of how Bing’s search rankings were related to its own, Microsoft responded but outrage continued (David Pogue called it sleazy and Stephen Colbert got nervous about his previous Bing sponsorship). The debate has not lacked for commentary on precisely what it was that Bing did and how Google trapped them. The initial Danny Sullivan piece is informative and comprehensive, as are discussions from Ben Edelman and Shane Greenstein. Indeed, the latter two nicely illustrate what the debate is about. Greenstein argues that this is not the way innovative firms should act in rivalry with others, while Edelman argues that this is the norm for the search engine industry and that Google is not clean in this regard. But in all cases, commentators have struggled with language and this has degenerated into a squabble akin to what you might find in a school yard.

The issues are real but the language has not caught up. In my own attempts to sort this out, I have found it useful to abandon terms that are unhelpful. Let’s start with “imitation,” “copying” and the stronger variants “plagiarism” and “cheating.” Had Bing wanted to do this and directly map Google’s search results onto its own, it could have done it. It could have set up programs to enter terms in Google, skimmed off the results and then used them directly. And I think we can all agree that that is wrong. Why? Two reasons. First, if Google has invested to produce those results and others can just hang off them and copy them, Google may not earn the return on its efforts that it should. Second, if Bing were doing this and representing itself as a different kind of search, then that representation would be misleading. Thus, imitation reduces Google’s reward for innovation while adding no value in terms of diversity.

Outright imitation of this type should be prohibited but what do we call some more innocuous types? Just look at how the look and feel of the iPhone has been adopted by some mobile software developers, just as graphical interfaces were widely adopted in an earlier time once consumers had embraced them. This certainly reduces Apple’s reward for its innovations but the hit on diversity is murkier because, while some features are common, competitors have tried to differentiate themselves. So this is not imitation but something more common: leveraging without compensation. How you feel about it depends on just how much reward you think pioneers should receive.

Bing falls into this latter category. And I don’t mean with regard to this latest issue but with regard to the entire endeavour. Google pioneered information organisation algorithms that could index web content, rank it and relate it to search queries in a very timely manner. That is the business that Microsoft entered many years back and has most recently named Bing. No one, I believe, is claiming that entry was illegitimate and that Google should be able to keep that business model to itself. But let’s be clear, Bing is Google’s closest competitor for a reason: it is leveraging off that baseline, game-changing idea.

Of course, that is precisely why the issue of how search algorithms take information from the web is a sensitive one. Google’s initial idea was to look at what people were actually doing – in this case, creating hyperlinks – and use that to rank results. It didn’t compensate people for that activity but eventually it did provide an ‘opt out’ tool to allow content providers to block Google. Interestingly, it is precisely because so many content providers did not want Google blocked – that is, they wanted to appear high in Google’s results – that we see the whole issue of search engine spam today. Just looking at what content providers are doing does not necessarily get you great search results because they can be gaming things.

Which brings us back to the current debate. Bing appears to gather some part of the information it needs to rank searches not from what content providers are doing per se but from how consumers are moving between sites. While there are means of preventing this and perhaps nudges to do so, when consumers use Internet Explorer, Microsoft gathers information on their behavior and uses it for search results. This way of gathering information is something that should allow Bing to distinguish its own rankings from Google’s but, as we just saw, there is a problem.

The problem is that Google is very important and so when you are tracking user behavior, a big chunk of it is behavior while Googling. Along with everything else, Microsoft was picking up that behavior and integrating it into its search results. Google suspected this and constructed experiments to see if that was the case. It picked nonsense terms and used its own people to send information to Microsoft and then, as expected, Microsoft used that information for Bing rankings; exactly as advertised. So if there were terms for which the only information came from user behavior, and that behavior involved Google search click-throughs, those click-throughs would become the Bing result. (By the way, that suggests that perhaps it isn’t in Bing’s interest to use Googling in this way but that is another matter.)
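To make that mechanism concrete, here is a minimal toy sketch in Python. It is emphatically not Bing's actual ranking system, whose details are not public; the `ToyRanker` class, the `click_weight` parameter and the `.example` URLs are all hypothetical, invented purely for illustration. It only shows the general logic: when a query has no signal other than observed clicks, those clicks decide the result, and when other signals exist, a stray click barely matters.

```python
from collections import Counter, defaultdict


class ToyRanker:
    """Hypothetical toy model; NOT Bing's (or anyone's) real ranking pipeline."""

    def __init__(self):
        # query -> {url: score from crawling/content analysis}
        self.content_scores = defaultdict(dict)
        # query -> Counter of urls users were observed clicking through to
        self.clicks = defaultdict(Counter)

    def add_content_signal(self, query, url, score):
        self.content_scores[query][url] = score

    def record_click(self, query, url):
        self.clicks[query][url] += 1

    def rank(self, query, click_weight=0.1):
        # Combine both signals; click_weight is an arbitrary illustrative value.
        scores = dict(self.content_scores.get(query, {}))
        for url, count in self.clicks.get(query, {}).items():
            scores[url] = scores.get(url, 0.0) + click_weight * count
        # Highest combined score first.
        return sorted(scores, key=scores.get, reverse=True)


ranker = ToyRanker()

# A nonsense term: the only information is an observed click-through.
ranker.record_click("hiybbprqag", "planted-page.example")
print(ranker.rank("hiybbprqag"))

# An ordinary term: content signals dominate a single stray click.
ranker.add_content_signal("tech news", "big-site.example", 5.0)
ranker.add_content_signal("tech news", "small-site.example", 1.0)
ranker.record_click("tech news", "small-site.example")
print(ranker.rank("tech news"))
```

In this toy, the nonsense query returns the planted page simply because nothing else competes with it, while the ordinary query still ranks the page with the stronger content signal first.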

What is interesting, however, is that Google’s attempt to catch Bing out did not really work that well. SearchEngineLand reported that Google operated 100 sting experiments and only 7 to 9 ended up being incorporated in Bing’s rankings. In other words, Bing largely ignored the nonsense words. So from the perspective of the larger claim, that Bing was engaged in a process that led to results widely similar to Google’s, this experiment does not, according to any measure of statistical significance, support that claim.
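For a rough sense of how much uncertainty sits around that 7-to-9-in-100 figure, one can compute a confidence interval for the hit rate. The sketch below uses the standard Wilson score interval at roughly 95% confidence; the choice of that interval, and the function name `wilson_interval`, are my own assumptions for illustration and not anything reported by Google or SearchEngineLand. It bounds only how often a planted term showed up, and says nothing by itself about ordinary queries.

```python
import math


def wilson_interval(successes, trials, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 is about 95%)."""
    p = successes / trials
    denom = 1 + z**2 / trials
    center = (p + z**2 / (2 * trials)) / denom
    half_width = (
        z * math.sqrt(p * (1 - p) / trials + z**2 / (4 * trials**2)) / denom
    )
    return center - half_width, center + half_width


# The reported range: 7 to 9 planted results out of 100 sting experiments.
for hits in (7, 9):
    low, high = wilson_interval(hits, 100)
    print(f"{hits}/100 -> roughly {low:.1%} to {high:.1%}")
```

Even at the upper end of the interval, the planted terms surfaced in only a modest minority of the experiments.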

What about the subtler claim that Bing’s behavior concerns something more specific, i.e., Google’s spell checking algorithms? This is what Google uses to check whether you have typed something correctly. It is so good that my kids (and, ahem, I) use it to learn how to spell words without looking them up in a dictionary or gaining a proper education. Google noticed that Bing was correcting typos automatically and guiding people to the right results for unusual terms. So search for torsoraphy instead of what you meant, tarsorrhaphy, and both Google and Bing will correct your error and rank the Wikipedia reference first (which, by the way, gives you just a definition and no real information). Of course, that is the only commonality. Unless you are feeling lucky, the sites return different results, although in each case for the corrected spelling. Google tells you about it explicitly, Bing does not.

This is going to happen if you use user behavior to rank results. If one company – Google – guides consumers using spell correction then the other will end up using that too. The issue is whether this is a deliberate ploy by Microsoft or a consequence of how they gather information and rank searches. I don’t know either way but I do know that for searches that are returning a Wikipedia reference first, this is hardly conclusive. When it comes down to it, it is Wikipedia that now organises much of the web information and all search engines are leveraging off that. Google has not quite made its case with this example. [Indeed, enter ‘torsoraphy’ into Wikipedia’s search and you go straight to ‘tarsorrhaphy’! Who corrected that spelling and when?]

Imitation does not describe what is going on here. Instead, it is a situation where user information is being used in product design. If one firm is dominant in an activity, most users are its users and so, if the name of the game is handling information overload, users will be handling it using that firm’s product. It is a consequence of that activity. Microsoft could choose to ignore that but doesn’t. I think we need a new word for it. My suggestion is to use “hiybbprqag” – the most pronounceable of Google’s sting terms – to describe it. Bing is hiybbprqag’ing Google. Is that a market problem? Is it in a company’s interest to do it? Let the debate begin.

