Even if your organization has a single official language (usually English), it’s common to use multiple languages across the organization. The Term Store can store each term in multiple language variations, but can these be used in Search as well?
If yes, how?
If not, what options (workarounds) do we have?
I have created a test site with three subsites, each in a different language: English (EN), German (DE), and Hungarian (HU).
Every subsite contains a document library, with dummy documents in the corresponding language.
The columns “Animal” / “Tier” / “Állat” all point to the same term set, “Animals”, in the Term Store.
The default language in the Term Set is English. German and Hungarian translations are added to every term:
This setup ensures that Cat / Katze / Macska are technically the same term, with the same term ID.
On the English site, only the English terms are available for tagging (Dog, Cat, Bird, …).
On the German site, only the German terms are available for tagging (Hund, Katze, Vogel, …).
On the Hungarian site, only the Hungarian terms are available for tagging (kutya, macska, madár, …).
Once a document is tagged, the corresponding label is stored with it.
The term ID is shared, but each document carries one of three term labels (English, German, or Hungarian), and the search engine cannot normalize them:
If the user searches for “dog” – only the English documents will be returned.
If the user searches for “Hund” – only the German documents will be returned.
If the user searches for “kutya” – only the Hungarian documents will be returned.
This happens even though the term ID is the same for all three.
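To make the behavior concrete, here is a minimal Python sketch of what seems to happen. It is a toy model, not SharePoint code: the Term and Document classes, the tag and search functions, and the file names are all made up for illustration, and I am assuming that search matches only the label text stored with each document.

```python
from dataclasses import dataclass


@dataclass
class Term:
    term_id: str
    labels: dict  # language code -> default label in that language


@dataclass
class Document:
    name: str
    term_id: str
    stored_label: str  # label in the language of the site where it was tagged


def tag(name, term, site_language):
    # Assumption: tagging stores the label of the site's language with the document.
    return Document(name, term.term_id, term.labels[site_language])


def search(documents, query):
    # Assumption: search only sees the stored label text, not the other translations.
    q = query.casefold()
    return [d.name for d in documents if q in d.stored_label.casefold()]


dog = Term("term-dog", {"EN": "Dog", "DE": "Hund", "HU": "kutya"})
docs = [
    tag("report-en.docx", dog, "EN"),
    tag("bericht-de.docx", dog, "DE"),
    tag("jelentes-hu.docx", dog, "HU"),
]

hits_dog = search(docs, "dog")          # only the English document
hits_hund = search(docs, "Hund")        # only the German document
hits_kutya = search(docs, "kutya")      # only the Hungarian document
term_ids = {d.term_id for d in docs}    # one shared term ID
```

All three documents share one term ID in this model, yet each query reaches only the document whose stored label matches.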
Synonyms are useful when tagging a document. For example, “cica” is a synonym for “macska” (cat) in Hungarian.
If the user types “cica” when tagging a document, “macska” comes up as the suggestion:
The document will be tagged with “macska”, and the synonym (“cica”) is lost immediately.
This means that synonyms cannot be used as a multilingual workaround either.
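The synonym behavior can be sketched the same way. Again, this is a toy Python model under my own assumptions, not a SharePoint API: resolve_tag is a hypothetical helper, and I am assuming the tagging box matches what the user types against the start of the default label or of any synonym.

```python
def resolve_tag(typed, default_label, synonyms):
    """Return the label that gets stored, or None if nothing matches."""
    typed = typed.casefold()
    candidates = [default_label] + list(synonyms)
    # Assumption: type-ahead matches on the beginning of a label or synonym.
    if any(c.casefold().startswith(typed) for c in candidates):
        # Whatever matched, only the default label is stored with the document.
        return default_label
    return None


stored = resolve_tag("cica", "macska", ["cica"])
synonym_searchable = "cica" in stored.casefold()
```

Here stored ends up as “macska”, so a later search for “cica” has nothing to match against.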
One Possible Workaround
In bilingual environments, one possible workaround is to add both the English and the German words/expressions to the term. (As the number of languages grows, this obviously stops being feasible.)
That is, store every term in “English term / German term” format, for example “Dog / Hund”.
This makes both the English and German terms searchable on every document, regardless of its language settings.
BUT: Synonyms might still be important when tagging!
Let’s say the terms are stored in “English term / German term” form. Let’s use the example “Dog / Hund”.
If the user wants to use the English term, and starts typing “Dog” – the corresponding term (“Dog / Hund”) will be selectable.
However, if the user starts typing the German word “Hund” – no term will be found:
Adding “Hund / Dog” as a German translation solves the tagging issue:
The challenge, again, is that this is a different term label: this document will have “Hund / Dog” as its tag, while the English documents will have “Dog / Hund”. Same term ID, two different labels!
For now, my suggestion is to use multilingual term labels, e.g. “English Term / German Term”, and add the German term as a synonym as well. Do NOT add German translations for now.
- When tagging the documents, users can find both the English and the German terms.
- Every document will be tagged consistently (with the original term label).
- Search will find every word or expression that is stored in the term label (English term / German term).
- The terms can be used as consistent filters (refiners) if needed.
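The points above can be sketched in a few lines of Python. This is a toy model and not SharePoint code: resolve_tag and search are hypothetical helpers, the file names are invented, and I am assuming that type-ahead matches on the start of a label or synonym and that search matches any text inside the stored label.

```python
def resolve_tag(typed, default_label, synonyms):
    """Return the label that gets stored, or None if nothing matches."""
    typed = typed.casefold()
    candidates = [default_label] + list(synonyms)
    # Assumption: type-ahead matches on the beginning of a label or synonym.
    if any(c.casefold().startswith(typed) for c in candidates):
        return default_label  # the synonym itself is never stored
    return None


def search(labels_by_doc, query):
    # Assumption: search matches any text contained in the stored label.
    q = query.casefold()
    return sorted(name for name, label in labels_by_doc.items()
                  if q in label.casefold())


LABEL = "Dog / Hund"   # combined English / German term label
SYNONYMS = ["Hund"]    # German word added as a synonym, not as a translation

# Without the synonym, typing the German word finds nothing.
without_synonym = resolve_tag("Hund", LABEL, [])

# With the synonym, both entry points store the same label.
tagged = {
    "report-en.docx": resolve_tag("Dog", LABEL, SYNONYMS),
    "bericht-de.docx": resolve_tag("Hund", LABEL, SYNONYMS),
}

hits_dog = search(tagged, "dog")
hits_hund = search(tagged, "hund")
distinct_labels = set(tagged.values())
```

In this model both documents carry the single label “Dog / Hund”, so either word finds both of them and the label can serve as one refiner value.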