<?xml version="1.0" standalone="yes"?>
<Paper uid="W04-1401">
  <Title>Corporate Language Resources in Multilingual Content Creation, Maintenance and Leverage</Title>
  <Section position="3" start_page="4" end_page="7" type="metho">
    <SectionTitle>
2 From theory to practice
</SectionTitle>
    <Paragraph position="0"/>
    <Section position="1" start_page="4" end_page="4" type="sub_section">
      <SectionTitle>
2.1 Language resources and Knowledge
</SectionTitle>
      <Paragraph position="0"> The breadth and depth of knowledge required today to produce a good-quality technical or specialised corporate translation relies upon a panoply of language resources (LR) in machine-readable form, which are either created within the corporation or purchased from external parties (sister organisations, domain-specific specialist groups and societies, applied software and solution companies, etc.). In this panoply of corporate LR4Trans, one may find domain-specific terminology, source and target language dictionaries of corporation-dependent word meanings, source and target language structures and rules, a corporate language stylesheet, an appendix of phrases and expressions denoting cultural differences within a (multinational) corporation or arising when attempting global expansion, and prescriptive and descriptive notes about the corporation &quot;culture&quot;, among others.</Paragraph>
      <Paragraph position="1"> All these resources contain precious corporate knowledge that should be taken into consideration and made accessible to all corporate members and partners accordingly. Tagging or flagging the knowledge in those language resources will be extremely useful for constantly optimising not only the resources themselves but the whole multilingual content production process. Tags or flags, normally called content properties, content attributes or metadata, are aimed at retrieving a content unit when necessary and preventing loss of content. It is precisely thanks to these attributes, often visualised to the user by means of colours or other agreed conventions, that a content management system can manage content even if it moves across multiple languages or sites.</Paragraph>
      <Paragraph position="2"> [...] content repository into an external translation process, and then returns in one or more new languages - a further challenge, especially if there is not yet an effective content management system in place.</Paragraph>
      <Paragraph position="3"> Capturing that knowledge will thus be helpful when developing scalable and adaptive applications for managing corporate multilingual content.</Paragraph>
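      <Paragraph position="4"> As an illustration of how content attributes let a system retrieve a content unit and follow it across languages, the following is a minimal Python sketch. The class names, the attribute keys (domain, status) and the in-memory store are assumptions made for this example only; the paper does not prescribe a schema or an implementation.

```python
from dataclasses import dataclass, field


@dataclass
class ContentUnit:
    """A piece of content plus the metadata used to find it again."""
    unit_id: str
    language: str
    text: str
    # Hypothetical attribute names; no particular schema is implied.
    metadata: dict = field(default_factory=dict)


class ContentRepository:
    """Toy in-memory store keyed by (unit_id, language)."""

    def __init__(self):
        self._units = {}

    def add(self, unit):
        self._units[(unit.unit_id, unit.language)] = unit

    def find(self, **attributes):
        """Return units whose metadata matches every given attribute."""
        return [
            unit for unit in self._units.values()
            if all(unit.metadata.get(k) == v for k, v in attributes.items())
        ]

    def languages_of(self, unit_id):
        """List the languages in which a unit currently exists."""
        return sorted(lang for (uid, lang) in self._units if uid == unit_id)


repo = ContentRepository()
repo.add(ContentUnit("warranty-01", "en", "Two-year warranty.",
                     {"domain": "legal", "status": "approved"}))
repo.add(ContentUnit("warranty-01", "de", "Zwei Jahre Garantie.",
                     {"domain": "legal", "status": "in_review"}))

# Retrieve by attribute, and check which language versions exist.
approved = repo.find(domain="legal", status="approved")
```

      Because every language version shares the same unit identifier, a missing or stale translation can be noticed by comparing the result of languages_of with the list of languages the corporation expects.</Paragraph>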
    </Section>
    <Section position="2" start_page="4" end_page="7" type="sub_section">
      <SectionTitle>
2.2 From LR4Trans to knowledge repositories and content management systems
</SectionTitle>
      <Paragraph position="0"> A corporate knowledge-geared multilingual content strategy is open to a varying degree of automation, not only in linguistic processing but also in content transaction operations, depending on the type of documentation, business conditioning factors, etc. It usually combines tightly integrated translation technologies (and maybe other kinds of human language technologies) with human specialist intervention, i.e. unique language work processes, which have to be driven by highly skilled linguists.</Paragraph>
      <Paragraph position="1"> This form of knowledge-based translation work aims to bridge the gap between low-cost, poor-output machine-only translation and costly, high-quality human-only translation. Although this could be seen as a type of machine-aided human translation (MAHT), we would like to emphasise the issue of knowledge, corporate knowledge in particular, which ought to be captured in the translation system's knowledge base. This corporate knowledge base, characterised by being configurable and updatable, will detect and classify the knowledge present in the language resources into general knowledge, domain-specific knowledge, and knowledge specific to each individual customer or department within the organisation.</Paragraph>
      <Paragraph position="2"> The knowledge base will nonetheless act as a single repository with the following possible functions: automated identification of terms that are candidates for once-only translation; spotting of translations for terms in previously translated, aligned texts; semi-automated or automated creation of domain and/or customer-specific terminology, dictionaries and glossaries; creation and regular update of domain and/or customer-specific language rules; implementation of domain and/or customer-specific translation memories; dynamic and integrative machine translation, making use of customised dictionaries (lexicons) and language rules; a translation and editing application, ideally increasing ease of use by showing colour-coded aligned bi-texts (bilingual) or multi-texts (multilingual) with a context expansion feature and highlighted terminology; and, most importantly, automated and user-dependent feedback of new knowledge into the knowledge base.</Paragraph>
      <Paragraph position="3"> Transaction costs can outweigh translation costs, especially when the creation and maintenance of multilingual content is required for e-learning or e-customer support.</Paragraph>
      <Paragraph position="4"> Ideally tailor-made and customisable, that is, conceived for the corporation or the client they work in or for.</Paragraph>
      <Paragraph position="5"> These functions will be linked to one another and called according to the stage of multilingual content creation we are in. A function or component may be called more than once within the multilingual content production lifecycle.</Paragraph>
    </Section>
  </Section>
  <Section position="4" start_page="7" end_page="8" type="metho">
    <SectionTitle>
3 The envisaged scenario: workflows, content management systems, and agents
</SectionTitle>
    <Paragraph position="0"> Having the corporate knowledge base linked to various LR4Trans, as presented in section 2, suggests a procedural and very agile multilingual content workflow. Let us examine it in greater detail, starting with the simplest of workflows (figure 1):</Paragraph>
    <Paragraph position="1"> [...] an unstructured document collection, are handed over to the translator, who translates without having the chance to get feedback from the authoring department. There is neither an obvious use of language resources in machine-readable form nor a corporate knowledge detection and exploitation strategy in operation. This poor production process will have negative consequences for the quality of the product translation (e.g. lacking consistency, frequent content losses, etc.) and for costs, particularly in the long run.</Paragraph>
    <Paragraph position="2"> In order to streamline the procurement and management of corporate multilingual content we propose the following workflow (see figure 2 in the appendix). Its main assets would be an overall corporate knowledge base linked to various LR4Trans, as appropriate, and maintained by all agents intervening in the workflow, plus a content management system, or CMS, that would reflect the business roles controlling the workflow, data production and update flow, user roles and access privileges, costing rules, etc. In contrast to figure 1, the following features can be found in the workflow presented in figure 2:
      * Cyclic nature of content, from monolingual to multilingual, and back to enhance and expand the first;
      * Corporate content is traceable and its state and structure can be followed up at all times;
      * Authors are aware of what happens at the other end, and so are capable of &quot;writing for translation&quot;, that is, editing or approving content that will later be received by an audience or market of another language and culture. In other words, the packaging of the content is taken care of from the beginning;
      * Translators are connected with the authoring department: the concepts of content negotiation and feedback are essential here. Translators, being intercultural mediators, have a strong say in issues of international content relevancy.</Paragraph>
  </Section>
  <Section position="5" start_page="8" end_page="8" type="metho">
    <SectionTitle>
TRANSLATION
</SectionTitle>
    <Paragraph position="0"> CMS are meant to work seamlessly in the background, automatically identifying changes in the content (e.g. keeping track of the content production or processing stage, keeping a log of agent participation, etc.) by means of a built-in feedback loop mechanism. Moreover, a multilingual CMS comes into action when, as a kind of document gatekeeper and donor, it passes the content on from one agent to another, notifying him or her of any vital new piece of information: &quot;a new translation has been received&quot;, &quot;glossary validated by expert XY and saved today at 18:27 hours&quot;, &quot;not possible to close up project before client acceptance test&quot;.</Paragraph>
    <Paragraph position="1"> By this we mean not only the multiskilled corporate linguist (who could be a translator, terminologist, editor, domain validator, cross-cultural consultant...), but also all those agents that construct and share the knowledge of a corporation, namely decision makers (i.e. management force), marketeers, legal specialists, and so on.</Paragraph>
    <Paragraph position="2"> CMS are usually dependent on the corporate knowledge base. Together, they define the workflow and interact with the various users by means of secure interfaces, usually very similar to a web portal, for internal and very often external use too (mainly for workers at different sites and for clients).</Paragraph>
    <Paragraph position="3"> Concerning language work, it is extremely important that both online and offline editing and review of content are allowed. In other words, the corporate knowledge base has to be centralised (online use) and yet distributed at times (offline use). It is the system that will manage the synchronisation of content and of knowledge base alterations and updates across all the different user types.</Paragraph>
    <Paragraph position="4"> The CMS thus relies heavily upon automated mechanisms (e.g. automatic updating of the translation memory once the project translations have gone through the review process) but needs skilled human intervention to improve its efficiency over time.</Paragraph>
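    <Paragraph position="5"> The automated mechanism mentioned above, updating the translation memory once project translations have passed review, can be sketched in Python as follows. This is only an illustrative sketch: the Segment and TranslationMemory classes, the reviewed flag and the exact-match lookup are assumptions for the example, not a description of any particular CMS or translation memory product.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    source: str
    target: str
    reviewed: bool = False  # set by a human reviewer, not by the system


class TranslationMemory:
    """Toy translation memory mapping source segments to approved targets."""

    def __init__(self):
        self.entries = {}

    def update_from_project(self, segments):
        """Store reviewed segments only; return those still awaiting review."""
        pending = []
        for seg in segments:
            if seg.reviewed:
                self.entries[seg.source] = seg.target
            else:
                pending.append(seg)
        return pending

    def lookup(self, source):
        """Exact-match lookup; returns None when the segment is unknown."""
        return self.entries.get(source)


tm = TranslationMemory()
project = [
    Segment("Press the red button.", "Appuyez sur le bouton rouge.", reviewed=True),
    Segment("Close the cover.", "Fermez le couvercle."),
]
# Only the reviewed segment enters the memory; the other goes back to a reviewer.
needs_reviewer = tm.update_from_project(project)
```

    Keeping unreviewed segments out of the memory reflects the point made in the text: the update itself is automatic, but its quality depends on skilled human intervention beforehand.</Paragraph>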
  </Section>
</Paper>