Transforming Workflows with AI

Introduction

In this edition of the AI in Loc newsletter, I am thrilled to present an interview with fellow Belgian Jourik Ciesielski: language technology expert, localization engineer, industry researcher, and recent guest on a well-attended LocDiscussion I organized.

Jourik currently serves as the Chief Technology Officer (CTO) at Yamagata Europe and is the Co-Founder of C-Jay International. With a Master's degree in Translation and a Postgraduate Certificate in Specialized Translation from KU Leuven, Jourik brings a wealth of knowledge and experience to the table. In addition to his native Dutch, he speaks fluent English, French, and Spanish, along with some basic Portuguese.

Our conversation delves into the best practices for integrating AI into Translation Management Systems (TMS), offering practical insights on how to upgrade your localization toolbox. From practical LLM integration to the role of AI in quality assurance and terminology management, Jourik provides valuable guidance for both small LSPs and freelance translators. Join us as we explore the cutting-edge techniques and future trends that are shaping the localization industry.

Practical LLM Integration in Localization

Stefan: What are the key steps for integrating LLMs into existing localization workflows?

Jourik: The first step is to understand how the technology works. LLMs (Large Language Models) require context and instructions to function effectively. This involves prompt optimization, such as providing examples and using few-shot prompting. LLMs work primarily with data rather than documents, and advanced techniques like Retrieval-Augmented Generation (RAG) can enhance context and information retrieval. Once you grasp these concepts, the potential use cases are vast.
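The prompt-optimization idea Jourik mentions can be made concrete with a minimal sketch of few-shot prompting for translation. The example pairs, language names, and prompt wording below are illustrative assumptions, not taken from the interview; a real pipeline would send the resulting prompt to an LLM API.

```python
# Build a few-shot translation prompt: the example pairs give the model
# context (tone, terminology, register) before it sees the new segment.
def build_few_shot_prompt(source_lang, target_lang, examples, segment):
    lines = [f"Translate from {source_lang} to {target_lang}."]
    for src, tgt in examples:
        lines.append(f"{source_lang}: {src}")
        lines.append(f"{target_lang}: {tgt}")
    # The new segment goes last, with the target label left open for the model.
    lines.append(f"{source_lang}: {segment}")
    lines.append(f"{target_lang}:")
    return "\n".join(lines)

# Hypothetical translation-memory pairs used as few-shot examples
examples = [
    ("Klik op Opslaan.", "Click Save."),
    ("Het bestand is verwijderd.", "The file has been deleted."),
]
prompt = build_few_shot_prompt("Dutch", "English", examples, "Sluit het venster.")
```

In a RAG setup, the `examples` list would be retrieved dynamically, for instance the closest fuzzy matches from a translation memory, rather than hard-coded.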

From a practical standpoint, serious LLM integration will either require investment or hands-on development of use cases. My preference is the latter, especially to support smaller organizations.

Stefan: Can you share some real-world examples where LLM integration significantly improved efficiency or quality?

Jourik: LLMs have the potential to revolutionize localization processes, from automated translation to quality assurance (QA) and workflow management. However, commercial LLM integrations often focus on low-hanging fruit. Some companies, like Bureau Works, Custom.MT, and Intento, are making innovative strides.

Success stories usually involve specific, high-impact tasks such as term extraction, targeted QA, and cleaning translation memories (TM) and term bases (TB). Additionally, linguists seem to appreciate generative AI most for creative assistance, contrasting with the traditionally negative perception of machine translation (MT).

Stefan: What are the common challenges faced during LLM integration, and how can they be overcome?

Jourik: Integrating LLMs presents multiple challenges, not all of which are technical. On the technical side, the concept of prompting is new, and LLMs are less task-specific than MT. Process-wise, the localization industry is still grappling with fully adopting previous disruptive technologies like MT, dealing with issues such as quality measurement, segmentation in TMS, and pricing models. Now, we have to integrate a new technology amidst ongoing uncertainty and a lack of standardization.

Stefan: How can LSPs and translators measure the success of their LLM integrations?

Jourik: Human-in-the-loop processes are crucial here. For purely linguistic applications, traditional quality measurement methods like Linguistic Quality Assurance (LQA) will likely remain standard, relying on human judgment to assess quality, efficiency, and ROI. It's challenging because we haven't fully figured out how to measure quality and ROI for MT. If LLMs are used for quality evaluation, we face the dilemma of "AI evaluating AI."

Stefan: What are the future trends you foresee in the use of LLMs within the localization industry?

Jourik: I foresee LLMs completely replacing traditional rule-based algorithms for QA and LQA. We will also see more multimodal use cases, such as audio transcriptions, text-to-speech, and screenshot testing, leveraging LLMs' capabilities.

Operational AI Use for Translators and Small LSPs

Stefan: What AI tools and technologies are most beneficial for freelance translators and small LSPs?

Jourik: For freelance linguists and small Language Service Providers (LSPs), my primary advice is to focus on machine translation (MT). It's important to remember that MT is a foundational form of AI—Google developed the transformer model with translation in mind. To build out MT as a service, leverage robust technologies like DeepL with glossaries or adaptive models like ModernMT. Establish solid processes, workflows, and commercial frameworks to support these technologies. This will lower the barrier for integrating LLMs from both technological and operational perspectives.

Additionally, my rule of thumb is to first use AI to make your own work easier. If you can increase your productivity by leveraging AI, it becomes much simpler to build a service around it.

Stefan: How can small LSPs leverage AI to stay competitive with larger players in the industry?

Jourik: To stay competitive, small LSPs should adopt a consultant role rather than focusing solely on leveraging AI directly. Clients are eager for AI solutions, and LSPs need to guide them through the process by highlighting the advantages, acknowledging the weaknesses, discussing quality implications, and setting realistic expectations. Writing the perfect prompts and becoming a trusted resource for clients is crucial.

The technology is already accessible, as evidenced by the widespread use of ChatGPT. The key is to provide added value and give clients a reason to choose your services over commercial tools. Knowledge is power; being more knowledgeable about new AI technologies than your clients and competitors is the first step. Clients will appreciate embarking on this journey with you.

Stefan: What are the cost implications of implementing AI solutions for smaller organizations?

Jourik: Fortunately, the costs associated with MT models and LLMs are relatively low, averaging about $20 per million characters per month. GPT-3.5 Turbo, for instance, is quite affordable. The bigger challenge is selling AI as a service rather than purchasing the solutions. Clients are primarily interested in AI for cost-cutting, so smaller organizations must find ways to monetize these cheaper services while investing in the necessary technology.

AI in Quality Assurance (QA) Processes

Stefan: How can AI be integrated into the QA process to improve translation quality?

Jourik: The key to integrating AI into QA processes is understanding how the models work and optimizing prompts accordingly. For QA, you start by defining the criteria for what constitutes a good versus a bad translation, the corresponding error categories, and the severity of each error. You can then instruct the model to deliver its verdict in a specific format, whether it's an explanation in natural language, a pass/fail indication, or a quality score. By using chain-of-thought prompting—explaining to the model step by step how to calculate a score and providing examples—the model can apply this methodology effectively.
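The approach Jourik describes, defining error categories and severities, prescribing an output format, and walking the model through the scoring steps, can be sketched as a prompt builder plus a verdict parser. The categories, penalty weights, and JSON schema below are illustrative assumptions; the mock reply stands in for an actual LLM response.

```python
import json

ERROR_CATEGORIES = ["accuracy", "fluency", "terminology", "style"]  # illustrative
SEVERITY_PENALTIES = {"minor": 1, "major": 5, "critical": 10}       # illustrative

def build_qa_prompt(source, target):
    # Chain-of-thought style: walk the model through the scoring steps,
    # then demand a machine-readable JSON verdict.
    return (
        "You are a translation QA reviewer.\n"
        f"Error categories: {', '.join(ERROR_CATEGORIES)}.\n"
        "Step 1: list each error with its category and severity "
        "(minor=1, major=5, critical=10 penalty points).\n"
        "Step 2: score = 100 minus the sum of penalty points.\n"
        "Step 3: reply with JSON only: "
        '{"errors": [...], "score": <int>, "verdict": "pass" or "fail"}.\n'
        f"Source: {source}\nTranslation: {target}"
    )

def parse_verdict(model_reply):
    # A structured output format makes the verdict trivially machine-readable.
    data = json.loads(model_reply)
    return data["verdict"], data["score"]

# Parsing a mock model reply (no API call in this sketch):
reply = '{"errors": [], "score": 100, "verdict": "pass"}'
verdict, score = parse_verdict(reply)
```

Providing one or two worked examples of the scoring inside the prompt (few-shot) usually makes the model apply the methodology more consistently.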

Stefan: What are the benefits of using AI for QA compared to traditional methods?

Jourik: Traditional QA methods are often cumbersome and time-consuming, generating numerous false positives that must be painstakingly sifted through to find even minor mistakes. LLMs can revolutionize QA in localization by significantly speeding up the process and reducing the pain points associated with traditional methodologies. However, human oversight is still necessary to finalize the output, ensuring quality and reliability.

Extracting and Evaluating Terminology with AI

Stefan: How can AI be used to extract and evaluate terminology from large datasets?

Jourik: The process is quite similar to QA. You need to specify the domain of the terminology to be extracted and define criteria for what constitutes good and bad terms. For example, terms should not be too short or too long, avoid verbs, include adjectives only in compound terms, and exclude obvious terms. You also specify the desired output format. By prompting the model with these criteria, it can extract terms accordingly. Similarly, the model can be prompted to evaluate existing terminology based on the same criteria.
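The criteria Jourik lists would normally go into the LLM prompt itself, but they can also serve as a rule-of-thumb post-filter on the model's candidate terms. The thresholds and the "obvious terms" list below are hypothetical, chosen only to illustrate the shape of such a filter.

```python
COMMON_WORDS = {"the", "file", "click", "use"}  # hypothetical "obvious" terms

def is_good_term(term, min_len=3, max_len=40, max_words=4):
    """Rule-of-thumb filter mirroring the criteria above: not too short,
    not too long, and not composed entirely of obvious words. A real
    pipeline would add POS checks (e.g. rejecting verbs)."""
    words = term.lower().split()
    if not (min_len <= len(term) <= max_len):
        return False
    if len(words) > max_words:
        return False
    if all(w in COMMON_WORDS for w in words):  # exclude obvious terms
        return False
    return True

candidates = ["torque wrench", "the", "click", "hydraulic actuator"]
kept = [t for t in candidates if is_good_term(t)]
```

Running the same criteria twice, once as prompt instructions and once as a deterministic post-check, catches cases where the model ignores its instructions.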

Stefan: What are the advantages of using AI for terminology management in localization projects?

Jourik: Terminology extraction is a classic example of a cumbersome task that becomes much more efficient with LLMs. Instead of a human scanning large documents to extract terms entry by entry, LLMs can handle this task. Humans can then focus on the final validation, ensuring the accuracy and relevance of the extracted terms.

Stefan: Can you share any success stories of AI-driven terminology extraction in the industry?

Jourik: While this use case doesn't generate much excitement in case studies, webinars, or conferences, it is being utilized effectively. For instance, some LSPs deploy LLMs to check their translation memories (TMs) against provided terminology and automatically update entries with terminology errors. This automation significantly improves efficiency and accuracy in managing terminology.
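The TM-checking workflow Jourik mentions can be sketched deterministically before any LLM gets involved: scan each TM entry against the termbase and flag entries where the source contains a term but the target misses the mandated translation. The TM entries and termbase below are hypothetical; in the workflow he describes, an LLM would then propose the corrected target segment.

```python
def find_tm_term_errors(tm_entries, termbase):
    """Flag TM entries whose source contains a termbase source term
    but whose target lacks the mandated target term."""
    flagged = []
    for idx, (src, tgt) in enumerate(tm_entries):
        for s_term, t_term in termbase.items():
            if s_term.lower() in src.lower() and t_term.lower() not in tgt.lower():
                flagged.append((idx, s_term, t_term))
    return flagged

tm = [
    ("Draai de moersleutel.", "Turn the spanner."),  # hypothetical entries
    ("Gebruik de moersleutel.", "Use the wrench."),
]
errors = find_tm_term_errors(tm, {"moersleutel": "wrench"})  # flags entry 0
```

A simple substring match like this over-flags inflected forms; production checks typically add lemmatization or fuzzy matching before handing flagged entries to the LLM for correction.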

Augmenting Machine Translation with Style Guides and Glossaries

Stefan: How can style guides and glossaries be integrated into machine translation workflows using AI?

Jourik: This is an exciting use case that greatly interests translation buyers. The primary advantage of using LLMs is their flexibility in adaptation techniques. Training traditional neural MT models can be cumbersome; it often involves using bilingual corpora, which is time-consuming, expensive, and difficult to control in terms of output. Alternatively, clumsy MT glossaries can be used, but these are far from ideal.

LLMs, on the other hand, can be simply prompted to follow specific instructions or apply terminology from style guides and glossaries. This approach is much more efficient and adaptable. The entire Translation Management System (TMS) market is focusing on augmented MT, with specialized TMS solutions like Crowdin, Lokalise, and Transifex leading the way. Additionally, memoQ is reinventing adaptive MT on GPT-3.5 Turbo, and Bureau Works offers a refreshing approach to contextualized translations.
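The "simply prompted" adaptation Jourik contrasts with NMT training can be sketched as inlining the glossary and style rules directly into the translation prompt. The glossary entry, style rules, and segment below are hypothetical, and a real system would retrieve only the glossary entries relevant to each segment to keep prompts short.

```python
def build_augmented_mt_prompt(segment, glossary, style_rules):
    """Inline glossary entries and style-guide rules into the prompt so
    the model applies them at translation time, with no model training."""
    terms = "\n".join(f"- {s} => {t}" for s, t in glossary.items())
    rules = "\n".join(f"- {r}" for r in style_rules)
    return (
        "Translate the segment from Dutch to English.\n"
        f"Use this terminology:\n{terms}\n"
        f"Follow these style rules:\n{rules}\n"
        f"Segment: {segment}"
    )

prompt = build_augmented_mt_prompt(
    "Druk op de noodstopknop.",
    {"noodstopknop": "emergency stop button"},        # hypothetical glossary
    ["Use imperative mood.", "Avoid contractions."],  # hypothetical style rules
)
```

This is the flexibility advantage in miniature: swapping a client's glossary or style guide means changing an input, not retraining a model.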
