Translation quality falls apart when teams use different words for the same thing. If you’re localizing product UI, docs, and marketing across languages, a shared terminology foundation is the quickest way to cut errors and speed up releases. This guide shows how to build a multilingual glossary and termbase that works in real life: what to plan, how to model your data (including TBX), how to extract and approve terms, how to integrate with CAT/TMS, and how to keep everything healthy over time.
Glossary vs Termbase: What’s the Difference?
A glossary is the human‑friendly list of approved terms and definitions. A termbase is the structured database behind it that stores concepts, language variants, metadata, and workflow states.
- Glossary: Readable, searchable view for writers, translators, support. Often a filtered output of “Approved” terms.
- Termbase: Concept‑centric store with IDs, definitions, preferred/forbidden variants, context, domains, locale attributes, status, and audit trail.
Key principle: model by concept, not by string. One concept (e.g., “Sign in”) can have multiple terms per language and usage notes per platform/audience.
Why a Multilingual Glossary Pays Off
- Consistency: Fewer contradictory translations and fewer “which word should we use?” threads.
- Speed: Translators and writers move faster when terminology is clear and in‑tool.
- Quality: Fewer support tickets caused by confusing labels or mismatched wording.
- Compliance: Regulated industries need exact, approved phrases.
- Scalability: New vendors and languages onboard with less coaching.
- Measurability: You can track issues and prove ROI (fewer terminology QA fails, faster approvals).
Plan Before You Build: Scope, Languages, and Brand Rules
Decide what you will cover first and who owns what. Align this in one page before you open a spreadsheet.
- People: Who uses the glossary (translators, UX writers, engineers, support)? Who approves?
- Domains: Start with UI + help center (highest visibility), then marketing/legal.
- Locales: Pilot with 3–5 key languages; expand after governance holds.
- Brand names: Set rules per market on translate vs transliterate vs keep original. For a practical framework, see Translate vs Transliterate Brand Names: Best Practices.
Model Your Termbase: Fields, IDs, and TBX Mapping
Start small but future‑proof. The fields below cover most real‑world use cases; add more only when a concrete need shows up.
Minimum viable fields
| Field | Purpose | Example |
|---|---|---|
| Concept ID | Stable identifier | CON‑000142 |
| Source Term | Canonical term in source language | Sign in |
| Definition | Unambiguous meaning | Authenticate to access an account |
| Part of Speech | Grammar and UI guidance | Verb (button label) |
| Domain | Module/subject area | Authentication |
| Context Sentence | Real usage | Tap “Sign in” to continue |
| Usage Note | Style/case rules | Title case in buttons |
| Forbidden Terms | What not to use (and why) | Login (as verb) |
| Locale Fields | Approved translation + attributes | fr‑FR: Se connecter (verb) |
| Status | Lifecycle | Proposed / Approved / Deprecated |
| Source/Authority | Where definition came from | Spec v3.1 |
| Last Updated | Auditability | 2026‑01‑15 |
Helpful extras
- Synonyms (admitted/avoid) with priority flags.
- Morphology (gender, pluralization) where languages require it.
- Relations (broader/narrower/related concepts).
- Platform/audience qualifiers (web, iOS, Android; consumer vs admin).
- Risk tags (legal/medical/payments) to route to SMEs.
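As a rough illustration, the fields above can be sketched as a concept‑centric data model in Python. The class and attribute names (`Concept`, `Term`, `role`) are made up for this example and are not a standard schema; the point is only that terms hang off a concept, never the other way round.

```python
from dataclasses import dataclass, field
from enum import Enum


class Status(Enum):
    PROPOSED = "proposed"
    APPROVED = "approved"
    DEPRECATED = "deprecated"


@dataclass
class Term:
    text: str
    locale: str                # BCP 47 tag, e.g. "fr-FR"
    role: str = "preferred"    # "preferred", "admitted", or "forbidden"
    part_of_speech: str = ""
    status: Status = Status.PROPOSED


@dataclass
class Concept:
    concept_id: str            # stable ID, never reused
    definition: str
    domain: str
    terms: list[Term] = field(default_factory=list)

    def preferred(self, locale: str) -> list[str]:
        """Approved, preferred terms for one locale (the glossary view)."""
        return [
            t.text
            for t in self.terms
            if t.locale == locale
            and t.role == "preferred"
            and t.status is Status.APPROVED
        ]


sign_in = Concept(
    concept_id="CON-000142",
    definition="Authenticate to access an account",
    domain="Authentication",
    terms=[
        Term("Sign in", "en", part_of_speech="verb", status=Status.APPROVED),
        Term("Login", "en", role="forbidden", part_of_speech="verb",
             status=Status.APPROVED),
        Term("Se connecter", "fr-FR", part_of_speech="verb",
             status=Status.APPROVED),
        Term("Anmelden", "de-DE", part_of_speech="verb"),  # still proposed
    ],
)

print(sign_in.preferred("fr-FR"))
```

Because the German term is still in the proposed state, `sign_in.preferred("de-DE")` returns an empty list, which is how a filtered glossary would hide unapproved entries.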
TBX mapping at a glance
| Your field | TBX element |
|---|---|
| Concept ID | <termEntry id="CON-000142"> |
| Language section | <langSet xml:lang="fr-FR"> |
| Term (preferred) | <tig><term>Se connecter</term></tig> |
| Definition | <descrip type="definition">…</descrip> |
| Forbidden term | <termNote type="administrativeStatus">deprecatedTerm-admn-sts</termNote> |
| Status | <admin type="status">approved</admin> |
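The mapping in this table can be sketched in Python with the standard library's `xml.etree.ElementTree`. This is a minimal sketch, not a complete TBX writer: the helper name `build_term_entry` and its argument shape are assumptions for illustration, and it emits only the handful of elements listed above (no TBX header or surrounding document structure).

```python
import xml.etree.ElementTree as ET

# ElementTree serializes attributes in this namespace with the "xml:" prefix.
XML_NS = "http://www.w3.org/XML/1998/namespace"


def build_term_entry(concept_id, definition, terms):
    """Build one <termEntry>.

    `terms` maps a locale tag to a list of (term, part_of_speech, status).
    """
    entry = ET.Element("termEntry", {"id": concept_id})
    ET.SubElement(entry, "descrip", {"type": "definition"}).text = definition
    for locale, variants in terms.items():
        lang_set = ET.SubElement(entry, "langSet", {f"{{{XML_NS}}}lang": locale})
        for text, pos, status in variants:
            tig = ET.SubElement(lang_set, "tig")
            ET.SubElement(tig, "term").text = text
            ET.SubElement(tig, "termNote", {"type": "partOfSpeech"}).text = pos
            ET.SubElement(tig, "admin", {"type": "status"}).text = status
    return entry


entry = build_term_entry(
    "CON-000142",
    "Authenticate to access an account",
    {
        "en": [("Sign in", "verb", "approved")],
        "fr-FR": [("Se connecter", "verb", "approved")],
    },
)
ET.indent(entry)  # pretty-printing helper, available from Python 3.9
print(ET.tostring(entry, encoding="unicode"))
```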
<termEntry id="CON-000142">
  <descrip type="definition">Authenticate to access an account</descrip>
  <langSet xml:lang="en">
    <tig><term>Sign in</term></tig>
  </langSet>
  <langSet xml:lang="fr-FR">
    <tig>
      <term>Se connecter</term>
      <termNote type="partOfSpeech">verb</termNote>
      <admin type="status">approved</admin>
    </tig>
  </langSet>
</termEntry>
Choose the Right Tools
- Spreadsheet (MVP): Fast start; add validation and filters. Plan to export CSV/TBX later.
- Terminology module (in CAT/TMS): In‑editor term suggestions, QA checks, workflows, TBX I/O.
- Standalone terminology tools: Rich metadata and API integration with product systems.
Select based on: TBX support, approval workflows, QA rules (forbidden/casing), API access, and search UX. Pilot in one domain and 3–5 locales before scaling.
Step‑by‑Step Workflow (extraction → publication)
1) Harvest candidate terms
- Pull from UI strings, docs, release notes, support tickets, analytics queries, sales decks.
- Run monolingual extractors to spot frequent noun phrases; align bilingual corpora to find stable pairs.
- Ask PMs/support for “must‑keep” terms and pain points.
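For the monolingual extraction step, a simple frequency count over word n‑grams is often enough to produce a first candidate list. The sketch below is deliberately naive and English‑only: the stopword list, thresholds, and sample strings are assumptions for illustration, and a real extractor would add part‑of‑speech filtering and sentence boundaries.

```python
import re
from collections import Counter

# Small illustrative stopword list. Prepositions such as "in" and "on" are
# left out on purpose so phrasal verbs like "sign in" survive the filter.
STOPWORDS = {"a", "an", "and", "the", "to", "of", "for",
             "your", "you", "with", "is", "or"}


def candidate_terms(strings, max_len=3, min_count=2):
    """Count 2..max_len word n-grams that do not start or end with a stopword."""
    counts = Counter()
    for s in strings:
        tokens = re.findall(r"[a-z0-9]+(?:-[a-z0-9]+)*", s.lower())
        for n in range(2, max_len + 1):
            for i in range(len(tokens) - n + 1):
                gram = tokens[i:i + n]
                if gram[0] in STOPWORDS or gram[-1] in STOPWORDS:
                    continue
                counts[" ".join(gram)] += 1
    return [(term, c) for term, c in counts.most_common() if c >= min_count]


ui_strings = [
    "Sign in to your account",
    "Sign in with two-factor authentication",
    "Turn on two-factor authentication",
    "Start your free trial",
    "Your free trial ends today",
]

for term, count in candidate_terms(ui_strings):
    print(count, term)
```

On these five strings the candidates that occur at least twice are "sign in", "two-factor authentication", and "free trial". Everything the script returns is still only a candidate; curation by concept happens in the next step.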
2) Curate and normalize (by concept)
- Group duplicates under one concept; write a single‑sense definition.
- Set preferred term, list admitted/forbidden variants and the “why”.
- Capture part of speech, domain, platform/audience qualifiers.
3) Approve with a lightweight workflow
- Roles: Terminologist (curates), SME (meaning), Language leads (locale approvals).
- States: Proposed → In Review → Approved → Deprecated (with reasons).
- Keep an audit trail (who/when/why) for regulated domains.
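The states and audit trail above can be enforced with a very small state machine. This is a sketch under assumptions: the `TermRecord` class, the state identifiers, and the rule that deprecation requires a reason are illustrative choices, not features of any particular terminology tool.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

# Allowed transitions: Proposed -> In Review -> Approved -> Deprecated.
ALLOWED = {
    "proposed": {"in_review"},
    "in_review": {"approved"},
    "approved": {"deprecated"},
    "deprecated": set(),
}


@dataclass
class TermRecord:
    term: str
    state: str = "proposed"
    audit: list = field(default_factory=list)

    def transition(self, new_state, actor, reason=""):
        """Move to new_state if allowed and record who/when/why."""
        if new_state not in ALLOWED[self.state]:
            raise ValueError(f"{self.state} -> {new_state} is not allowed")
        if new_state == "deprecated" and not reason:
            raise ValueError("deprecation needs a reason")
        self.audit.append({
            "from": self.state,
            "to": new_state,
            "who": actor,
            "when": datetime.now(timezone.utc).isoformat(),
            "why": reason,
        })
        self.state = new_state


record = TermRecord("Se connecter")
record.transition("in_review", actor="terminologist", reason="ready for fr-FR lead")
record.transition("approved", actor="fr-FR language lead", reason="matches UI usage")
print(record.state, len(record.audit))
```

Skipping a state (for example, going from proposed straight to approved) raises a `ValueError`, so every approved term has at least two audit entries behind it.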
4) Localize per language
- Provide context sentences and screenshots.
- Capture locale attributes (gender, formality, platform quirks).
- Check against trusted sources (e.g., IATE) when helpful.
5) Enrich metadata
- Link related concepts (“Sign in” ↔ “Sign out” ↔ “Register”).
- Add pronunciation/transliteration where non‑Latin scripts help support teams.
- Flag regulated terms for mandatory SME review.
6) QA the terminology
- Run automatic checks for forbidden terms, casing, duplicates, and locale completeness.
- Preview in context (staging UI/docs) to catch truncation and RTL/LTR issues.
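Two of the automatic checks, locale completeness and duplicate terms, are easy to script once the termbase is exported. The sketch below assumes a simplified export shaped as concept ID → locale → approved term; the required locale set and the second concept (`CON-000143`) are hypothetical test data.

```python
from collections import defaultdict

# Hypothetical set of locales every approved concept must cover.
REQUIRED_LOCALES = {"en", "fr-FR", "es-ES"}


def missing_locales(termbase):
    """Concept ID -> sorted required locales that have no term yet."""
    gaps = {}
    for concept_id, locales in termbase.items():
        missing = REQUIRED_LOCALES - set(locales)
        if missing:
            gaps[concept_id] = sorted(missing)
    return gaps


def conflicting_terms(termbase):
    """(locale, casefolded term) -> concept IDs when one term serves several concepts."""
    seen = defaultdict(set)
    for concept_id, locales in termbase.items():
        for locale, term in locales.items():
            seen[(locale, term.casefold())].add(concept_id)
    return {key: sorted(ids) for key, ids in seen.items() if len(ids) > 1}


termbase = {
    "CON-000142": {"en": "Sign in", "fr-FR": "Se connecter",
                   "es-ES": "Iniciar sesión"},
    "CON-000143": {"en": "Register", "fr-FR": "Se connecter"},  # made-up entry
}

print(missing_locales(termbase))
print(conflicting_terms(termbase))
```

Here the second concept is flagged twice: it has no es‑ES term, and its fr‑FR term collides with the one already approved for "Sign in".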
7) Publish and integrate
- Expose a read‑only glossary (searchable, filterable) for non‑linguists.
- Enable in‑tool suggestions and QA in CAT/TMS; enforce forbidden terms.
- Sync via TBX or API to keep systems aligned.
8) Train the team
- Share a one‑pager: “How to use the glossary” with examples.
- Offer a simple form to propose terms or flag issues.
9) Maintain and measure
- Review high‑impact domains quarterly; deprecate stale entries.
- Track KPIs: terminology QA fails, lookup rate, time‑to‑approve, and support tickets mentioning terms.
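As one example of a KPI, time‑to‑approve can be computed from the proposal and approval dates in the audit trail. The record layout, dates, and IDs below are invented for illustration; any export with those two dates would work the same way.

```python
from datetime import date
from statistics import median


def median_days_to_approve(records):
    """Median days from proposal to approval, ignoring unapproved terms."""
    durations = [
        (r["approved"] - r["proposed"]).days
        for r in records
        if r.get("approved") is not None
    ]
    return median(durations) if durations else None


records = [
    {"id": "CON-000142", "proposed": date(2026, 1, 5), "approved": date(2026, 1, 9)},
    {"id": "CON-000143", "proposed": date(2026, 1, 5), "approved": date(2026, 1, 15)},
    {"id": "CON-000144", "proposed": date(2026, 1, 12), "approved": None},
]

print(median_days_to_approve(records))
```

The median is used rather than the mean so that one term stuck in legal review does not hide an otherwise healthy approval flow.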
Example Entries and Templates
Concept: Sign in (Authentication)
- ID: CON‑000142
- Definition: Action that authenticates a user to access an account.
- Source: Sign in (preferred); Login (noun only). Avoid “Logon”.
- Context: Button label on login screen.
- Locales:
  - es‑ES: Iniciar sesión (verb)
  - fr‑FR: Se connecter
  - de‑DE: Anmelden
  - ar‑SA: تسجيل الدخول (RTL)
  - zh‑CN: 登录
- Status: Approved
Concept: Two‑factor authentication (2FA)
- Definition: Security process requiring two independent verification factors.
- Note: Use “2FA” in UI where space is tight; expand on first mention in help.
- Locales: es‑ES: Autenticación de dos factores; fr‑FR: Authentification à deux facteurs; ja‑JP: 2要素認証
- Status: Approved
Concept: Free trial (Subscription)
- Definition: Time‑limited access at no charge; billing starts when the trial ends unless the user cancels.
- Locales: ar‑SA: تجربة مجانية (digits policy: Arabic‑Indic on UI), fr‑FR: Essai gratuit, pt‑BR: Avaliação gratuita
- Usage: Add legal disclaimer link in UI when required.
Automation and AI‑Assisted Term Extraction
- Monolingual extraction: Identify frequent domain phrases; filter out stopwords and boilerplate.
- Bilingual alignment: Align legacy translations to surface stable term pairs and inconsistencies.
- LLM support: Draft definitions or disambiguation notes; keep human approvers in the loop.
- Linting: Add CI checks in source repos to block forbidden terms before localization starts.
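The linting idea can be sketched as a small Python function that a CI job runs over source strings or docs. The rule patterns and messages are hypothetical and regex‑based, so they only approximate usage rules such as "Login is a noun only"; treat the hits as prompts for a human, not verdicts.

```python
import re

# Hypothetical rules: regex pattern -> advice shown to the author.
FORBIDDEN = {
    r"\blog ?on\b": 'use "Sign in"',
    r"\blogin to\b": 'use "Sign in to" ("Login" is a noun only)',
}


def lint_text(text):
    """Return (line number, matched text, advice) for every forbidden hit."""
    findings = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for pattern, advice in FORBIDDEN.items():
            for match in re.finditer(pattern, line, flags=re.IGNORECASE):
                findings.append((line_no, match.group(0), advice))
    return findings


sample = "Tap Logon to continue\nLogin to your account\nSign in to continue"
for line_no, hit, advice in lint_text(sample):
    print(f"line {line_no}: forbidden term {hit!r}: {advice}")
```

In a pipeline, the job would read each changed file, call `lint_text` on its contents, and exit with a non‑zero status when the list of findings is not empty.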
Governance, Workflows, and QA
- RACI: Who proposes, reviews, approves, audits.
- SLA: e.g., 5 business days for high‑impact terms; 10 for low‑impact.
- Lifecycle: Proposal → Review → Approval → Publication → Periodic review → Deprecation.
- QA gates: Conflicts, casing, duplicates, forbidden hits, locale coverage.
- KPI dashboard: Approval backlog, time‑to‑approve, QA fail trend, regulated term queue.
Troubleshooting: Symptoms and Fixes
| Symptom | Likely cause | Fix |
|---|---|---|
| Translators keep using different words | No concept‑level modeling; glossary hard to find | Group by concept; expose a read‑only glossary; enable in‑tool suggestions |
| UI truncates or looks wrong in RTL | No context check; digits/punctuation not specified | Add context screenshots; add digits and punctuation attributes; preview in staging |
| Legal blocks last‑minute | Regulated terms not flagged early | Tag “payments/privacy/medical” terms; auto‑route to SMEs; set SLA |
| TBX import loses statuses | Custom fields not mapped; spec mismatch | Document extensions; validate TBX; run round‑trip tests before switching |
| Search shows wrong script/variant | No hreflang or schema; mixed usage online | Add alternateName in Organization schema; use hreflang; publish a clarification page |
Common Pitfalls (and how to avoid them)
- Modeling strings, not concepts: consolidate meaning first, then terms.
- Over‑translating brand names: document per‑market rules and stick to them.
- Ignoring morphology: capture gender/plural rules where needed.
- Siloed systems: use TBX and APIs so tools can talk to each other.
- No owner: assign a terminologist and language leads; publish SLAs.
- No context: add examples/screenshots for critical UI terms.
- Skipping i18n basics: ensure Unicode, RTL support, and proper segmentation in product.
Export/Import with TBX
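TBX (TermBase eXchange) is the open standard (ISO 30042) for moving terminology between tools. Use it for vendor sharing and platform migrations. The element names used in this guide (<termEntry>, <langSet>, <tig>) follow TBX v2 and TBX‑Basic; TBX v3 (ISO 30042:2019) renames them to <conceptEntry>, <langSec>, and <termSec>, so confirm which version your tools read and write.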
- Map concept IDs to <termEntry>, languages to <langSet>, and terms to <tig>/<term> with admin/descriptive data.
- Keep any custom fields in a documented extension; confirm round‑trip symmetry.
- Validate against the spec before large imports to avoid silent data loss.
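A round‑trip test can be as simple as flattening the exported and re‑imported files into comparable rows and diffing them. The sketch below assumes the simplified, namespace‑free element layout used in this guide and a status stored in <admin type="status">; real TBX files may declare a default namespace and use other data categories, which would need adjusted lookups. It complements schema validation rather than replacing it.

```python
import xml.etree.ElementTree as ET

XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def term_rows(tbx_xml):
    """Flatten TBX into a set of (concept ID, locale, term, status) tuples."""
    root = ET.fromstring(tbx_xml)
    rows = set()
    for entry in root.iter("termEntry"):
        for lang_set in entry.iter("langSet"):
            for tig in lang_set.iter("tig"):
                status = tig.findtext("admin[@type='status']", default="")
                rows.add((entry.get("id"), lang_set.get(XML_LANG),
                          tig.findtext("term"), status))
    return rows


def round_trip_diff(exported_xml, reimported_xml):
    """Rows that disappeared or appeared between export and re-import."""
    before, after = term_rows(exported_xml), term_rows(reimported_xml)
    return {"lost": before - after, "added": after - before}


exported = """<termEntry id="CON-000142">
  <langSet xml:lang="fr-FR">
    <tig><term>Se connecter</term><admin type="status">approved</admin></tig>
  </langSet>
</termEntry>"""

# Simulates a tool that silently dropped the status on import.
reimported = """<termEntry id="CON-000142">
  <langSet xml:lang="fr-FR">
    <tig><term>Se connecter</term></tig>
  </langSet>
</termEntry>"""

print(round_trip_diff(exported, reimported))
```

An empty "lost" and "added" set on a representative sample is a reasonable bar to clear before migrating the full termbase.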
Integrate with CAT/TMS and Product Content
- CAT/TMS: Real‑time term lookups; automatic warnings for forbidden terms; enforce casing.
- Docs/CMS: Glossary widget in help center and knowledge base for consistent linking.
- Design systems: Sync preferred UI terms to component libraries (e.g., via API) so product copy matches the glossary.
- Dev portals: Expose approved API terminology and parameter names for consistency.
SEO and UX Benefits
- Use glossary terms in titles, headings, and snippets where natural to improve discoverability.
- Align paid/organic keywords to approved terms per locale to reduce mixed nomenclature.
- Create FAQ or landing pages for high‑value terms; link them across docs and product guides.
Build Checklist
- Define scope, locales, roles, and brand rules.
- Draft data model and TBX mapping; set stable IDs.
- Extract candidate terms (UI/docs/tickets); gather SME input.
- Normalize by concept; write definitions and usage notes.
- Localize with attributes; set status; add context and relations.
- QA: conflicts, forbidden terms, casing, locale completeness; preview in context.
- Publish a read‑only glossary; integrate with CAT/TMS and CMS; set CI linting.
- Train teams; track KPIs; review quarterly; deprecate stale entries.
FAQ
What’s the fastest way to start?
Use a spreadsheet with the minimum fields and data validation. Pilot on one product area and 3–5 locales. When stable, export to TBX and move to a terminology tool.
Glossary vs termbase—do I need both?
Yes. The termbase is your source of truth; the glossary is its readable, filtered output. Many tools generate the glossary automatically from the termbase.
Who approves terms?
A terminologist or language lead, with SMEs for domain accuracy. Publish SLAs and keep an audit trail.
How often should we review?
Quarterly for high‑impact domains (auth, payments); twice a year for stable areas. Also review after major feature or branding changes.
How is a termbase different from translation memory (TM)?
TM stores previous sentence‑level translations. A termbase stores concept‑level terms and rules. They complement each other: termbase guides wording; TM speeds repeated segments.
References and Useful Resources
- TBX (TermBase eXchange), ISO 30042: terminology exchange standard
- W3C Internationalization: guidance on scripts, bidi, and global UX
- IATE: EU terminology database for reference
Conclusion and Takeaways
- Model terminology by concept, not string. Assign clear owners and SLAs.
- Start small (spreadsheet + validation), then standardize with TBX and integrate with CAT/TMS.
- Document Arabic/RTL specifics (digits, punctuation, bidi) where relevant to avoid UI rework.
- Automate checks (forbidden terms, casing) and preview in context before release.
- Measure impact (QA fails, time‑to‑approve, support issues) and keep the glossary visible and easy to use.
With a disciplined termbase and a simple workflow, you’ll ship multilingual content faster, reduce rework, and give every team—from engineering to support—the same reliable words to work with.

Aarav Sharma, Founder & Editor, WA Translator. I publish hands‑on, privacy‑first guides on WhatsApp translation, iOS Shortcuts, and AI translators. All workflows are tested on real devices (EN↔AR) with screenshots and downloadable Shortcuts.
