Introduction to OmegaT and BiText2TMX: free and open-source translation tools

17 comments

in Technology for translators


Two quick screencasts with my take on OmegaT, a translation environment, and BiText2TMX, an alignment tool. Both are cross-platform, multilingual, free and open source.

OmegaT (4 min 57 sec)

Note: a newer, slicker-looking version has been released since this screencast was made. This new version automates some of the steps I cover here.

I run through the interface, the concordancing feature and the project file structure, and briefly demonstrate how I use OpenOffice.org to translate a Microsoft Office document.

Notes

Versions available in Albanian, Arabic, Basque, Belarusian, Brazilian Portuguese, Catalan, Chinese (Simplified), Chinese (Traditional), Czech, Danish, Dutch, English, Esperanto, French, German, Greek, Hungarian, Italian, Japanese, Polish, Portuguese, Russian, Slovak, Slovene, Spanish, Turkish and Ukrainian (and counting).

BiText2TMX (4 min 18 sec)

This is the software I used to create the translation memory in the screencast above.

Notes

Versions available in Catalan, English, French and Spanish.

These screencasts first appeared on Andrew Bell’s Watercooler: Tips, Tricks and Networking for Translators – thanks for the inspiration, Andrew.


{ 17 comments }

1 Rachel McRoberts December 9, 2009 at 5:39 pm

An interesting update in the newest version of OmegaT is support for Microsoft XML file formats. That should make it a little more accessible to Microsoft Office users. (Although I spent most of my academic career using only OpenOffice, and I love it!)

Thanks for the little intro to both of these. I have been meaning to explore OmegaT, and this has given me the nudge to actually do it. BiText2TMX also looks like a very handy tool that I hadn’t seen before. Thank you!

2 céline December 9, 2009 at 6:00 pm

Oh great! I’ve been meaning to look into OmegaT too since my disappointment about MemoQ not having a Mac version. Thanks for this Sarah.

3 Jean-Christophe Helary December 11, 2009 at 7:15 pm

Don’t forget that Sarah’s demo was made on OmegaT 1.6. That version was superseded by OmegaT 1.7 about 3 years ago. The current stable version was released a few weeks ago. It includes, among other things, a Google Translate interface and many more filters than 3 years ago. Take a look at the “changes.txt” file to see how things have evolved!

Jean-Christophe

4 Luke December 22, 2009 at 7:06 pm

I second what JC said – it is much improved since the version on display here. The inline spellcheck is one of the handy features I’ve enjoyed using. It’s such a lightweight and quick app. And the autosaves are a godsend.

Céline, it should cover the majority of your needs. I’d like to hear what you thought, if you’ve tried it already.

5 céline December 22, 2009 at 7:15 pm

Hi Luke, I haven’t had time to try OmegaT yet, but January looks fairly quiet at the moment so I’ll do it then – I’ll let you know :)

6 Luke December 22, 2009 at 7:26 pm

btw, fyi etc. I did have an (evangelising) tag in my comment, but it got stripped out. You get the picture. Thanks for the blog post, Sarah.

7 Luke December 22, 2009 at 7:28 pm

Ah, I’m making a mess. Sorry, Sarah. I was just trying to say I had an (evangelising) (/evangelising) joke html tag in my first comment. And it got stripped. Feel free to delete these last two comments :)

8 mj January 4, 2010 at 4:23 pm

I have a question: I’ve noticed that some places require you to have certain machine translation tools in order to get work on a project, but they’re really expensive. Do we really need to get that stuff to get more translation work? Or are these freebies enough?

9 Jean-Christophe Helary January 4, 2010 at 5:24 pm

@mj

OmegaT and Anaphraseus are not toy software. The places that “require” such and such translation tool (_not_ “machine” translation tool) mostly require the translator to be able to support the proprietary formats used by said software. There are other ways to work with such formats. That can include conversion processes to work with the “freebies” or use of cheaper software (Swordfish comes to mind).

Jean-Christophe Helary

10 sergio blum January 10, 2010 at 12:00 pm

I have downloaded BiText2TMX. I installed the Java Runtime Environment.

But then, I have NO clue whatsoever on how to effectively run the program.

Thanks for any (please) help.

Sergio

11 Raymond Martin January 28, 2010 at 6:12 pm

Try OmegaT+ also. Recent versions of bitext2tmx include work done by the OmegaT+ project.

OmegaT+ is an improved version of OmegaT, more robust and better programmed. Missing some extra features, but those are in progress.

OmegaT+ http://omegatplus.sourceforge.net

12 Yves March 1, 2010 at 11:40 am

BiText2TMX is a very simple and great tool. It only has one major drawback that greatly distorts the alignment.
It takes every dot (.) as the end of a sentence.
If it could only take the \n (end of line) as marker, results would be close to perfect as long as the two texts are parallel.

13 Raymond Martin March 1, 2010 at 2:29 pm

The use or non-use of a period as a segment marker for alignment is a matter of preference. If bitext2tmx (B2T) segmented on \n then you would not get full sentences as alignments. This is not desirable if you want sentence-segmented TMX as the output. You would have arbitrarily segmented TMX containing segments that rely on no logic in particular (not paragraph, sentence, phrase, clause, term). On the other hand, B2T presently cannot segment automatically in any way other than by a few built-in rules (which do not consider the many exceptions in different languages).
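
To make the trade-off in the two comments above concrete, here is a minimal Python sketch of both approaches. It is only an illustration, not bitext2tmx's actual code: the function names and sample texts are made up, and the period rule here is deliberately naive.

```python
# Illustrative only: these helpers are hypothetical and are NOT taken
# from bitext2tmx. They just contrast the two segmentation rules.

def split_on_periods(text):
    """Naive rule: every period ends a segment."""
    flat = " ".join(text.split())  # join lines, collapse whitespace
    segments = []
    current = ""
    for ch in flat:
        current += ch
        if ch == ".":
            segments.append(current.strip())
            current = ""
    if current.strip():
        segments.append(current.strip())
    return segments


def split_on_newlines(text):
    """Alternative rule: every non-empty line is a segment."""
    return [line.strip() for line in text.splitlines() if line.strip()]


# Abbreviations break the period rule.
abbreviations = "Dr. Smith arrived at 3 p.m. today.\nHe left early."
print(split_on_periods(abbreviations))   # 5 fragments
print(split_on_newlines(abbreviations))  # 2 lines, one sentence each

# Hard-wrapped text breaks the newline rule.
wrapped = "This sentence is wrapped\nacross two lines. Short one."
print(split_on_periods(wrapped))   # 2 full sentences
print(split_on_newlines(wrapped))  # 2 lines that are not sentences
```

Neither rule wins on both samples: the period rule cuts "Dr." and "p.m." into fragments, while the newline rule only gives sentence segments when each line already holds exactly one sentence in both texts.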

14 Raymond Martin March 1, 2010 at 2:31 pm

Note: it is “bitext2tmx” not “BiText2TMX”, “BiTeXt2TmX”, or some other silly thing. For short we call it B2T. We being those of us who actually did some development on B2T, namely me.

15 Sam Gordon March 10, 2010 at 1:13 am

Many thanks for this Sarah – I am currently learning the ropes with these CAT tools whilst doing an online Masters in Translation, and it is immensely reassuring to be hearing a human voice relaying such clear and concise guidelines. I am in a Mac-user minority and my course tutor doesn’t have much experience with Macs, so I have been a bit slow getting my teeth into things like B2T and OmegaT.

Also, apparently there is a new Wordfast programme called Wordfast Anywhere which is compatible with new Macs – may be of interest to some of your readers who don’t already know about it: http://67.221.227.93/anywhere/ . Apparently it allows you to upload your own glossaries and TMs.

Thanks again, Sam

16 Raymond Martin March 10, 2010 at 2:01 am

You might want to try OmegaT+ also.

I have made some effort to integrate it better into Mac OS X than OmegaT. Still much more to do, but better than nothing. Also note that recent versions of OmegaT do not work properly on Snow Leopard, OmegaT+ does (AFAIK and from user reports)! OmegaT has left Mac users hanging for over 7 months in this regard.

As for Wordfast Anywhere, you will note that any of these on-line sort of TM tools can take your data and use it for their own purposes (getting translations for free) without asking proper permission of copyright holders, etc. Be careful in using these, since these tools just sort of offer you something but often do not specify exactly what the legal and privacy aspects are. One of their main points in offering this free stuff (from a usually commercial vendor) is to get lots of data from users for their own use. Watch out.

17 Sarah Dillon April 1, 2010 at 10:33 am

Some interesting points in the comments here, thanks for weighing in.

My take on it is this:
I have no vested interests in any of the products mentioned here. I choose to use them simply because I like them. I like them because they haven’t caused me any hassles and because they do what I want them to do with the minimum of fuss. I know there will always be bigger, better and shinier products and/or versions out there, but for now, the fact that they simply do their job is good enough for me – and, I suspect, for the average practising translator. Which is why I mentioned them here, on my blog.

I’m also grateful to the community of developers for all the work they’ve done, and indeed continue to do, on free and open source tools. This is just one more reason why I’m happy to highlight an application when I find something I like.

Glad to hear you found it useful Sam – good luck with your masters. Thanks to everyone who emailed me too.

