In the TMX module in the censhare Admin Client, you can configure the import and export of segments for the translation memory. Activate and set up the server actions for this task and learn how to carry out imports and exports.


Context

  • The manual import and export can be executed in the censhare Client and the censhare Admin Client.

  • The configuration of the manual and the automatic server actions is done in the censhare Admin Client.    

Prerequisites

  • Permission to access the Configuration folder in the censhare Admin Client.

Introduction

If you work in "Translate with memory", you can import segments from other systems and use them in censhare. The import uses a TMX file (Translation Memory eXchange), a standardized XML format. You can run the import manually or automate it. Segments from the censhare translation memory can also be exported in the same format and then used in other translation memory systems.

First, you need to activate the desired server actions. In the censhare Admin Client, open the module Configuration/Modules/TMX. In the TMX module, you can configure the following server actions:

  • Import TMX files

  • Import (automatically) TMX files from hot folders

  • Export TMX files

The TMX format

The abbreviation TMX (Translation Memory eXchange) stands for an XML-based format for exchanging data between translation programs. It is an open standard supported by the majority of such programs. For more information, see the GALA website. censhare uses TMX version 1.4 and the UTF-8 character encoding.

Example of a TMX file:

<?xml version="1.0" encoding="UTF-8"?>
<tmx version="1.4">
   <header creationtool="censhare Server" 
                     creationtoolversion="2017.3.0a2" 
                     datatype="XML" 
                     segtype="phrase" 
                     adminlang="en-us" 
                     srclang="de" 
                     o-tmf="unknown">
   </header>
   <body>
      <tu>
         <tuv xml:lang="de">
            <seg>Hallo Welt!</seg>
         </tuv>
         <tuv xml:lang="en">
            <seg>Hello World!</seg>
         </tuv>
      </tu>
   </body>
</tmx>


In the XML declaration of your TMX files, make sure that the encoding is "UTF-8". censhare only processes files with this encoding. The header element must contain the following attributes:

Attribute

Description

adminlang

This attribute indicates the standard language for administrative elements. The language can deviate from the source language.

creationtool

This attribute identifies the program with which the TMX file was created. The value is not restricted. Use the name specified by the manufacturer.

creationtoolversion

This attribute identifies the version of the program with which the TMX file was created. The value is not restricted. Use the version specified by the manufacturer.

datatype

This attribute specifies how the data in the segments is formatted. The translation memory then knows whether it is dealing with HTML, RTF, or unformatted text, for example. The standard value is "unknown". For censhare Translation with memory, use "XML".

o-tmf

This attribute indicates the original format of the translation program from which the TMX file was created. You can use the standard value "unknown" here.    

segtype

This attribute specifies which type of segmentation is being used. The permissible values here are "block", "paragraph", "sentence" or "phrase". The latter is the default setting in censhare.    

srclang

In this attribute, you indicate the source language in the TMX file from which you are translating. This value is irrelevant for Translation with memory because the source and target languages in censhare are taken from the Variant with update flag asset relation. However, you still need to enter a value here in order to get a valid TMX file.
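The attribute list above can be checked before an import. The following Python sketch is not part of censhare. It only assumes that the header must carry the seven attributes listed in the table, and the function name and sample file are made up for illustration.

```python
import xml.etree.ElementTree as ET
from typing import List

# Attributes the header must contain, taken from the table above.
REQUIRED_HEADER_ATTRIBUTES = (
    "adminlang", "creationtool", "creationtoolversion",
    "datatype", "o-tmf", "segtype", "srclang",
)

# Hypothetical sample file: the header lacks the "o-tmf" attribute.
HEADER_SAMPLE = """<?xml version="1.0" encoding="UTF-8"?>
<tmx version="1.4">
  <header creationtool="example tool" creationtoolversion="1.0" datatype="XML"
          segtype="phrase" adminlang="en-us" srclang="de"/>
  <body/>
</tmx>"""


def missing_header_attributes(tmx_text: str) -> List[str]:
    """Return the required header attributes that are absent from the TMX text."""
    root = ET.fromstring(tmx_text.encode("utf-8"))
    header = root.find("header")
    if header is None:
        # Without a header element, every required attribute is missing.
        return list(REQUIRED_HEADER_ATTRIBUTES)
    return [name for name in REQUIRED_HEADER_ATTRIBUTES if name not in header.attrib]


print(missing_header_attributes(HEADER_SAMPLE))
```

An empty list means that all attributes from the table are present. The sketch does not validate the attribute values.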

The XML element "<tu>" (Translation Unit) groups the segments that belong together. It contains one "<tuv>" XML element (Translation Unit Variant) for every language. Each of these elements needs to contain an "xml:lang" attribute. The segment's text is enclosed in a "<seg>" element. The translation memory interprets all characters inside this element. If the segments contain inline elements, then you need to change the tag rules accordingly in order to ensure that they are processed properly.
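To illustrate the "<tu>", "<tuv>" and "<seg>" structure, the following Python sketch reads a TMX document and collects the segment text per language. It uses only the Python standard library and says nothing about how censhare itself parses the file. The function name and the shortened sample are illustrative assumptions.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes the xml:lang attribute under the XML namespace URI.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

TMX_SAMPLE = """<?xml version="1.0" encoding="UTF-8"?>
<tmx version="1.4">
  <header creationtool="example tool" creationtoolversion="1.0" datatype="XML"
          segtype="phrase" adminlang="en-us" srclang="de" o-tmf="unknown"/>
  <body>
    <tu>
      <tuv xml:lang="de"><seg>Hallo Welt!</seg></tuv>
      <tuv xml:lang="en"><seg>Hello World!</seg></tuv>
    </tu>
  </body>
</tmx>"""


def read_translation_units(tmx_text):
    """Return one dict per <tu>, mapping each language code to its segment text."""
    root = ET.fromstring(tmx_text.encode("utf-8"))
    units = []
    for tu in root.iter("tu"):
        unit = {}
        for tuv in tu.findall("tuv"):
            seg = tuv.find("seg")
            if seg is not None:
                # itertext() also collects text nested inside inline elements.
                unit[tuv.get(XML_LANG)] = "".join(seg.itertext())
        units.append(unit)
    return units


print(read_translation_units(TMX_SAMPLE))
```

Note that joining the text with itertext() drops inline elements. The sketch therefore shows the structure only and is not a substitute for the tag rules mentioned above.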

Import or export failure

When you import a TMX file, the XML in the file must be well-formed. If it is not, the import fails with an error. For example, here is a part of a TMX file:

<seg>Alice was beginning to get very <br> tired.</seg>


This part is not well-formed because the "<br>" tag is never closed. As an empty element, it must be written as "<br/>". Use an XML editor to find such tags in the TMX file.
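If no XML editor is at hand, a short script can locate the first problem as well. The following Python sketch uses the standard library parser and reports the line and column at which parsing fails. The function name is hypothetical, and the check is independent of the censhare import.

```python
import xml.etree.ElementTree as ET


def find_xml_error(xml_text):
    """Return None if the text is well-formed XML, else (line, column, message)."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as error:
        # ParseError.position holds the line and column where the parser stopped.
        line, column = error.position
        return line, column, str(error)
    return None


broken = "<seg>Alice was beginning to get very <br> tired.</seg>"
fixed = "<seg>Alice was beginning to get very <br/> tired.</seg>"

print(find_xml_error(broken))
print(find_xml_error(fixed))
```

The parser stops at the first error, so a file with several malformed tags needs to be checked again after each fix.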

Conversely, the segments stored in the translation memory must also be well-formed XML. If one or more segments are malformed, the TMX export fails. For example, here is a segment entry:

<?xml version="1.0" encoding="UTF-8"?><root><it>She</it> asked: This 
  <it>Buch?</root>

In this segment, the second "<it>" element is never closed. To find malformed segments in the translation memory, censhare provides the Translation with memory integrity server action. You then have to edit the respective segments in the database.

Import TMX files    

Configure server action

To activate the import function for segments, go to the directory "Configuration/Modules/TMX" in the censhare Admin Client and double-click Import TMX file. Then configure the server action.

Click OK to save the configuration. The changed configuration is shown in the TMX directory. You can also add more system-specific configurations, for example, for another server or for another role. Then update the server so that the configuration takes effect.

Icon

Action 

Update server configuration    

If you work with a master server configuration, configure the TMX import on the master server and then synchronize the remote server so that the changed configuration takes effect there as well:

Icon

Action

Synchronize remote server

Carry out imports

To carry out an import, open the Server actions menu in the censhare Admin Client or the censhare Client and select the option Import TMX file.

Icon

Action

Server actions

In the dialog, select the file you wish to import and confirm your selection. The Import TMX file dialog opens. Specify the required settings here:

Field

Required 

Definition 

Setup

Domain  

No

Enter the name of the domain into which you want to import the TMX file. The translation memory can save segments in different domain trees (sister domains) and keep them separate from one another. Segments in a domain tree are only accessible to users who are logged in to the associated domains. It doesn't matter whether a segment is in the user domain itself or in a parent or child domain.

2nd domain

No

Enter a value here for a second domain for the import. For more information see the description of the Domain field.    

Use formatting    

No

Activate this field in order to accept the inline formatting in segments. We recommend always having this field activated. Otherwise, segments may be imported incompletely or in multiple parts because censhare interprets a segment only up until the first inline tag.    

Language mapping

Segment

n/a

Shows the language code for a language that has been found in the import file. The language code cannot be changed.   

Mapping

Yes

Select a language defined in censhare for the segment language code on the left side. This language is then mapped to every segment with this language code in the import file.

Note: You can select any language that is available in censhare. It does not matter whether this is a language with a standard code or a custom definition.

Include mono-lingual segments

No

Select this box to also import mono-lingual segments from the TMX import file into the translation memory. A segment is mono-lingual if its translation unit ("<tu>") in the TMX file contains no segments in other languages.

Otherwise, censhare only imports translation units that contain at least two segments.
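The effect of the Include mono-lingual segments option can be pictured with a small Python sketch. It applies the rule described above, where a translation unit needs at least two segments unless the option is selected, to a sample file. The function name and the sample are assumptions for illustration, and the sketch does not reproduce the censhare import itself.

```python
import xml.etree.ElementTree as ET

XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Sample body: the first unit has two languages, the second unit only one.
MONO_SAMPLE = """<tmx version="1.4">
  <body>
    <tu>
      <tuv xml:lang="de"><seg>Hallo Welt!</seg></tuv>
      <tuv xml:lang="en"><seg>Hello World!</seg></tuv>
    </tu>
    <tu>
      <tuv xml:lang="de"><seg>Guten Morgen!</seg></tuv>
    </tu>
  </body>
</tmx>"""


def select_units_for_import(tmx_text, include_monolingual=False):
    """Return the language lists of the translation units that would be imported."""
    root = ET.fromstring(tmx_text)
    selected = []
    for tu in root.iter("tu"):
        languages = [
            tuv.get(XML_LANG)
            for tuv in tu.findall("tuv")
            if tuv.find("seg") is not None
        ]
        is_pair = len(languages) >= 2
        is_mono = len(languages) == 1
        if is_pair or (include_monolingual and is_mono):
            selected.append(languages)
    return selected


print(select_units_for_import(MONO_SAMPLE))
print(select_units_for_import(MONO_SAMPLE, include_monolingual=True))
```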

To confirm the import settings, click OK. censhare shows you in an information dialog how many datasets have been added. To close the dialog, click OK. The imported segments are now available in the translation memory.

If the XML of the TMX file is not well-formed, you receive an error message and the import fails. The error message is also written to the log.

Import TMX files from hot folders automatically

censhare can automatically process TMX files. To do that, you need to set up the configuration in the server action Import (automatically) TMX files from a hot folder. This server action checks the entry directory at the defined interval. There are four hot folders:

  • Entry directory: Place files to import here.

  • Working directory: During the import process itself censhare moves the file into this directory.

  • Output directory: When the import of the file is finished, censhare moves it into the Output directory.

  • Errata directory: If there is an error, the file is moved into the Errata directory.

These four directories give you the option of following the status of an automatic import. For the configuration of the different directories, see the File system settings of the server action.
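The way a file travels through the four directories can be sketched in Python as follows. This is a simulation of the workflow described above and not the censhare server action. The directory names "in", "work", "out" and "error", the function names, and the use of an XML parse as a stand-in for the actual import are all assumptions.

```python
import shutil
import tempfile
import xml.etree.ElementTree as ET
from pathlib import Path


def process_hot_folder(base):
    """Move every file from the entry directory through the working directory
    to the output or errata directory. Return the final directory per file."""
    entry, working, output, errata = (
        base / name for name in ("in", "work", "out", "error")
    )
    for directory in (entry, working, output, errata):
        directory.mkdir(parents=True, exist_ok=True)

    result = {}
    for source in sorted(entry.glob("*")):
        in_progress = working / source.name
        shutil.move(str(source), str(in_progress))
        try:
            ET.parse(str(in_progress))  # stand-in for the actual import
            target = output
        except ET.ParseError:
            target = errata
        shutil.move(str(in_progress), str(target / source.name))
        result[source.name] = target.name
    return result


def run_demo():
    """Create a temporary hot folder with one valid and one broken file."""
    with tempfile.TemporaryDirectory() as tmp:
        entry = Path(tmp) / "in"
        entry.mkdir()
        (entry / "good.tmx").write_text('<tmx version="1.4"><body/></tmx>', encoding="utf-8")
        (entry / "bad.tmx").write_text("<tmx><body></tmx>", encoding="utf-8")
        return process_hot_folder(Path(tmp))


print(run_demo())
```

A file that remains in the working directory for a long time would, in this model, indicate an import that is still running or was interrupted.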

Note: To be sure that the import is executed correctly, check that the TMX files to be imported use the correct encoding and that the language mapping is configured correctly.

Field 

Required    

Description

Note

No

This field is for documentation purposes only.

General settings    

Server names 

No

In a cluster system or a master remote configuration, you can limit the import function to one server or set up different configurations for different servers. By default, the function is activated for all servers.    

Activated

Yes

This field has to be activated in order to import segments.    

Run only on master server

No

This field is not relevant for TMX imports.    

Version

No

This field is for documentation purposes only.    

Interval

Yes

Enter the time interval (in seconds) in which censhare checks whether a new TMX file is available in the entry directory.

File system settings    

Entry directory

Yes

Set the directories for the individual steps of the automatic import. Files to be imported need to be saved to the entry directory. To process them, censhare moves them to the working directory. If the import is successful, censhare moves the file to the output directory. If an error occurs, censhare moves the file to the errata directory. Select Temp dir for the File system field. Enter the path relative to the selected File system. It always starts with the "file:" prefix, for example, "file:in/" for Input dir. Only change the default setting if you are sure that the directory exists and that the import will not be affected.

Working directory

Yes

Output directory

Yes

Errata directory

Yes

File-setup  

Use sample files

No

Here you can enter a regular expression. censhare imports only files that correspond to this expression.

The default setting is that only files with the extension ".xml" or ".tmx" will be imported.

Ignore sample files 

No

Here, you enter a regular expression for file types that censhare should ignore for imports.

The default setting is that system files are ignored.

New file configuration

No

Here you can enter additional regular expressions for files that censhare should import or ignore.

Translation memory settings   

Language mapping

Yes

In the language mapping, you map the languages from the source file to the system languages saved in censhare. The mapping is required because TMX files typically use two-letter ISO 639-1 language codes.

censhare, on the other hand, can also save languages as language-country codes, for example, "de-DE". For each language, create a Language mapping line and specify a source and a target language.

Source language

Yes

Here, you enter the language code used in the source file for every language. Typically, this is an ISO 639-1 code, for example, "de" or "en", or a locale code, for example, "en-US" for American English.

Target language

Yes

For every language in the source file, select the corresponding system language from the list.

New language assignment 

No

Click to add another line.

Use formatting

No

Activate this field in order to accept the inline formatting in segments. We recommend always having this field activated. Otherwise, segments may be imported incompletely or in multiple parts because censhare interprets a segment only up until the first inline tag.

Domain

No

Enter the name of the domain into which you want to import the TMX file. The translation memory can save segments in different domain trees (sister domains) and keep them separate from one another. Segments in a domain tree are only accessible to users whose main domain is in this domain tree. It doesn't matter whether a segment is in the user domain or in a parent or child domain.

2nd domain

No

Enter a value here for a second domain for the import. For more information, see the description of the Domain field.    
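The two file patterns from the File-setup group can be illustrated with a short Python sketch. The regular expressions below are assumptions that mirror the described defaults, which accept files ending in ".xml" or ".tmx" and ignore system files. The expressions actually preconfigured in censhare are not given here and may differ, and treating files with a leading dot as system files is a guess.

```python
import re

# Hypothetical patterns that mirror the described default behavior.
USE_PATTERN = re.compile(r".*\.(xml|tmx)", re.IGNORECASE)
IGNORE_PATTERN = re.compile(r"\..*")  # assumption: hidden files count as system files


def should_import(file_name):
    """Return True if the file name matches the use pattern and not the ignore pattern."""
    if IGNORE_PATTERN.fullmatch(file_name):
        return False
    return USE_PATTERN.fullmatch(file_name) is not None


for name in ("memory.tmx", "memory.xml", "notes.txt", ".DS_Store"):
    print(name, should_import(name))
```

In this sketch, the ignore pattern takes precedence over the use pattern. Whether censhare evaluates the two fields in the same order is not stated in this article.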

If the XML of the TMX file is not well-formed, the import fails and an error message is written to the log. censhare then moves the failed TMX file into the errata directory that you defined for the automatic server action.
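The language mapping from the Translation memory settings works like a lookup table from the codes in the source file to the system languages. The following Python sketch shows that idea with made-up values. The target codes "de-DE" and "en-US" and the handling of unmapped languages are assumptions, because this article does not describe what censhare does with languages that have no mapping line.

```python
# Hypothetical mapping: source language code -> system language in censhare.
LANGUAGE_MAPPING = {"de": "de-DE", "en": "en-US"}


def map_languages(unit, mapping):
    """Re-key a translation unit by system language.

    Return the mapped unit and the list of source codes without a mapping."""
    mapped = {}
    unmapped = []
    for code, text in unit.items():
        if code in mapping:
            mapped[mapping[code]] = text
        else:
            unmapped.append(code)
    return mapped, unmapped


unit = {"de": "Hallo Welt!", "en": "Hello World!", "fr": "Bonjour le monde !"}
print(map_languages(unit, LANGUAGE_MAPPING))
```

Collecting the unmapped codes makes it easy to spot a missing mapping line before files are placed in the entry directory.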

Export TMX files

Configure server action

You can export segments from the censhare Translation with memory in a TMX file in order to edit them in an external system or in order to import them into another translation program, for example. To activate the export function for segments, go to the directory Configuration/Modules/TMX in the censhare Admin Client and double-click Export TMX file. Then configure the server action.

Save the configuration by clicking OK. The changed configuration is now shown in the TMX directory. You can also add more system-specific configurations, for example, for another server or for another role. Then update the server so that the configuration takes effect.

Icon

Action

Update server configuration

If you are working in a censhare cluster or a master remote server configuration, you then need to synchronize the remote server in order to activate the changed configuration on that server, too.

Icon

Action

Synchronize remote server    

Execute export

To carry out an export, open the Server actions menu in the censhare Admin Client or the censhare Client and select the option Export TMX file.

Icon

Action

Server actions

This action opens the Export TMX file dialog. Here you can specify the desired settings for the export:

Field

Required

Definition

Setup   

Domain

No

Enter the name of the domain from which you want to export the TMX file. Note that censhare generates an empty file if the selected domain does not contain any segments.

2nd domain

No

Enter a second domain here for the export. For more information, see the description of the Domain field.

Language mapping  

Segment

Yes

Select a language for segments to export.

Mapping

Yes

Enter an ISO 639-1 language code. This can be the lowercase two-letter format "xx", for example, "de" or "en". Alternatively, extend the code to "xx-XX" to add country-specific information, for example, "de-DE" or "en-US".

+ button

No

Click to add another line. Create a mapping entry for every language that you want to export.

Include mono-lingual segments

No

If you select this box, censhare exports all segments in the translation memory that match one of the languages defined above. A mono-lingual segment is exported if, within a set of segments, only one segment has a language that matches one of the languages entered in the mapping. For example, a segment set contains three segments for "fr", "de" and "en", and the dialog has mapping entries for "es" and "fr". If you select this box, only the "fr" segment is exported, so no segment pair is exported. If you do not select this box, only segment pairs are exported, so nothing from this segment set is exported.
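The example with the three segments can be replayed in a few lines of Python. The sketch models the rule as described in this table, with mapping entries for Spanish and French, and is not taken from the censhare implementation. The function name and the segment texts are made up.

```python
def select_for_export(segment_set, export_languages, include_monolingual=False):
    """Return the segments of one segment set that would be exported.

    segment_set maps language codes to segment text. export_languages is the
    set of languages entered in the mapping."""
    matching = {
        language: text
        for language, text in segment_set.items()
        if language in export_languages
    }
    if len(matching) >= 2:
        return matching  # at least one segment pair
    if include_monolingual and len(matching) == 1:
        return matching  # a single, mono-lingual segment
    return {}


segment_set = {"fr": "Bonjour le monde !", "de": "Hallo Welt!", "en": "Hello World!"}

print(select_for_export(segment_set, {"es", "fr"}, include_monolingual=True))
print(select_for_export(segment_set, {"es", "fr"}, include_monolingual=False))
print(select_for_export(segment_set, {"de", "fr"}, include_monolingual=False))
```

The third call adds a case that is not part of the example above. With mapping entries for "de" and "fr", the set contains a segment pair and is exported regardless of the option.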

If the translation memory contains malformed segments, the export fails and you receive an error message. censhare also writes this message into the server log. 

Note: The TMX export in censhare relies on a specific XML (TMX) structure and a specific structure for variables (placeholders for date/time or numbers). If you import the exported TMX file into an external system that uses a different structure, the import might not work.

Result

You can configure the server actions for importing and exporting segments with TMX. You know how to import or export TMX files and can handle errors.