
Thursday, 23 October 2014

Translation on the fly (or on the sly?)

From: Tim Macer's research technology blog

World Wide Lexicon Toolbar is a new plug-in for the Firefox web browser that promises to take webpages in any unfamiliar language and, as you browse, simply present the pages in English (or, for non-English speakers, the language of their choice). My preparations for the trip I am about to make to Korea have focussed my mind on the frustrations of being unable to read webpages. But I was also curious to see how useful this would be to Web 2.0 researchers who are analysing social media content and the like.

It is a very smart add-on: if you browse a page, and it isn’t in a language you understand, the page will be machine-translated and presented to you. If a human translation has been made, it will show this instead. It surpasses the Google option to machine-translate pages in a couple of other ways, too: more languages are covered and the translated version is presented in the format and style of the original page. There is even an option to double up the text so you can see the original and the translation. Of course, the translated text may still disrupt the layout, but it gives you a much better idea of the context of the text, which aids understanding considerably.

Human or machine translations

The software is currently in beta, and can be installed free of charge from the Mozilla Firefox add-ons page. Reports from early adopters are that it is extremely useful, provided that you are willing to put up with the limitations of machine translations. The human translations it shows are those that have been entered by volunteer contributors to the World Wide Lexicon community. It’s a fantastic idea and is another example of the wisdom of the crowd at work on the Web. Yet the reality for any social Web researcher is that the blogs and community forums you are likely to visit will not have attracted the attention of a community-minded translator, and you will still need to endure the inadequacies of the machine translation.

Machine translations are not bad with well-constructed texts that have been written in a stylistically neutral way, but the more colloquial and idiomatic the text is, the more bizarre and worthless the translation becomes. I don’t have the means to try this out, but I suspect this tool may be more useful when doing Web-based desk research into more authoritative sources than the general Web 2.0 free-for-all. For that, we need machine translations to get smarter.

A catch?

Why on the sly? You need to register and log in to use the service, and the server must, by definition, be aware of all of the pages you visit, so you are giving the plug-in owner a complete trail of all your browsing activity. This is not made clear when you sign up. If it bothers you, you could use Firefox only when you wish to translate something, and another browser for what you wish to keep private.

Tim is at the First International Workshop on the Internet Survey this week, organised by Kostat, the Korean National Statistics service, and will be posting highlights from the event.

  • If you have tried out this plug-in, or have any other comments concerning machine translations, please log in and leave a comment.

Readers' comments (1)

  • Tim,

    I am the lead developer of the Worldwide Lexicon and wanted to clarify the privacy policy for the add-on (everything is stated up front in the privacy policy page on the mozilla.org install page, by the way).

    We do not log user identifiable information, except when you submit or edit a translation (so we can credit you, and assign scores to you as an author).

    The addon does contact WWL and other translation servers to determine which language a page is published in. We do this using a combination of the domain name (e.g. lemonde.fr = French) and a randomly chosen text from the current page. We only keep aggregate statistics so we can build a map of which sites people are interested in other languages. The latest version, due out in a day or so, enables you to disable the tool very easily, and also automatically disables it in private browsing mode.

    Lastly, when we publish the production version, we will include full source code, and will invite developers to audit the code to compare what it does against what we say it does. We take privacy seriously, and don't want to collect information beyond what we need for the tool to do its job.

    Brian McConnell
