Languages and the Web

Answer me quick: “What is the language of the web?”

Ok, how many of you answered with the name of a programming language? How many shouted out PHP, Ruby, Python or something similar? This is not that kind of post – I am not trying to start a religious war amongst the supporters of our beloved languages (not in this post at least…)

What I mean is this – what is the real language of the web: is it English? French? German? Or the ever more important Chinese? This may seem to be a simple question, but let’s look at the implications of language on the web.

I’m writing this post after attending Campus Party ‘09, and I ended up reflecting a lot on this issue after watching Tim Berners-Lee’s presentation on the semantic web. This presentation had semi-simultaneous translation, and I must say I was not pleased with it. I have been working with the internet for over 9 years now, and over the years I have noticed the way in which languages such as Portuguese, Chinese and others started to invade a previously almost-pure-English environment.

Someone very wise once told me that knowing English means you have access to a great deal more content, and that you have that access before someone who does not know English has it. That is the difference English makes on your resume (no wonder he is now my boss). And that is actually our current reality: most of the content on the Internet is first generated in English, and then makes its way to other languages through translations made by local bloggers and such. This is not always the case, of course. I also post in Portuguese, and I must say I do the opposite, translating from Portuguese into English. Nevertheless, I have seen blog posts in English attract far more attention. During my first year of blogging, one of my posts, written in Portuguese and then translated to English, proved this point. While the original post got lots of attention and comments, the English post rapidly made it to the first page of Digg and made me suffer from the “Digg-effect”. My blog has never since come close to that peak number of visits. So, hypothesis proved: English does go a long way.

This, of course, is not just because of the number of English readers out there, but also because of the number of tools available to people generating content in English, tools such as Digg and so many others. The rule seems to apply not only to user-generated content but also to applications, since an application has a much larger chance of gaining traction if it is in English. Of course, this opens up a new door, the “localized version” door. If an application does not localize itself for certain countries, the natural evolution of the web and the vacuum left by that application might generate local sites, developed by local people with the local culture in mind. Take a look at BlogBlogs, based on Technorati but built for a Brazilian audience.

Globalization, or whatever you want to call it, is changing this picture. More and more references are popping up in different languages, along with new bloggers and new sites. This is turning the web into a truly multi-language environment: content is now being generated in various languages and then making its way to English speakers, no longer exclusively the other way around.

This is positive, but it also weakens the unified language pattern, and it has a second side effect, a very negative one in my opinion, which inspired this reflection. New internet enthusiasts and content creators are actually feeling as though learning English is not important anymore. “Hey, I have that in Portuguese” or “I can just google-translate it” are phrases heard more and more often these days, and this is bad. People begin to get locked up in a little box; an expanding box, true, but a box anyway. Poorly translated material and the lack of “knowing better” precipitate this chain reaction. And this is ultimately reflected at technology events like Campus Party in São Paulo.

Tim’s session was an embarrassment, in my opinion. In order to accommodate the segment of the crowd that did not speak English, the session was presented with a translator present. Had it been done with simultaneous translation, it might not have been as bad, but it was a ping-pong style translation. This gave Tim some problems: his line of thought was interrupted by the translator, who could not let him go on for too many phrases before translating them. And finally, she was not a technical translator, so she made quite a few translation mistakes and lost some technical terms altogether, such as the very complex “HTTP”.

This is the point where globalization really annoys me. These high-level events and sessions, directed at high-level developers and internet professionals, should not need translation from English, since it is such a widespread and globally accepted language, especially in the world of technology. Dropping the translation would act like a filter and solve other problems of these sessions, raising the bar on the quality of attendees and avoiding some of the questions that were asked. For example, Tim (the creator of the web) was asked how we could make the transition from the web to the web 2.0 and 3.0… The only thing not added to the question was “where can I download the patch?”

So my final suggestion to you is this: spread out, make yourself available to more content, learn English and, if you have the chance, learn at least one of the other big 5 languages besides your mother tongue. The content is out there, go after it.

[first published on the SWAT Blog]

  • Flavia

    Hello Rafael,
    I would like to know whether you would be interested, or know a professional who would be interested, in carrying out a project for our company. Please contact me by e-mail.
    Thank you very much,
    Flavia