How to pluralize in Web apps ⨉ Melnyk

You’ve seen a lot of “You have 1 item(s)” or “5 day(s) ago” in Web interfaces. It’s kinda expected: pluralization is tricky and often is not worth the effort. However, if you are after perfection (and if you have time for it), here’s a framework for taming pluralization in your Web app.

Terms and Conditions

The Term is a subject (word) to pluralize. Let’s not limit it to nouns; verbs have plural forms as well:

“does” ↔ “do”;
“робить” ↔ “роблять”.

Pluralization implies a number, which is either cardinal (think count; e.g., “one person has five apples”) or ordinal (think position in a sequence; e.g., “3^rd row in the first class”).

The plural form of a term is usually generated by following some rules. For instance, English uses one plural form for cardinals: adding the suffix “-s” or “-es” in most cases does the trick for nouns:
“one apple” ↔ “2/50/many apples”.
There are always exceptions to the rules: “person” ↔ “people”.

Different rules apply to the pluralization of cardinals and ordinals. E.g., English cardinal pluralization uses the one aforementioned form, whereas ordinal numbers have several forms:
“1^st/2^nd/3^rd/N^th”.

Speaking of multiple plural forms (or categories), they depend on the number used. The plural tags define which category corresponds to a given number. E.g., for English ordinals:

Numbers	Plural tag / Category	Example
1, 21, 101, …	`"one"`	“…first” or N^st
2, 22, 552, …	`"two"`	“…second” or N^nd
3, 23, 333, …	`"few"`	“…third” or N^rd
4, 11, 20, 1000, …	`"other"`	in general, N^th

Obviously, the language or locale matters. It’s usually coded as ll-CC, where ll stands for language and CC stands for country and is optional. Actually, the lang/country codes might be longer than 2 letters; the locale might be extended with script, region, variant and symbol sequence tags but that’s next level.

A short version, like "en" or "uk", is enough for most cases.

On rare occasions you extend the language code with the country variant if, for example, your app’s vocabulary differs between the Brazilian and European variants of Portuguese. Then "pt-BR" ↔ "pt-PT" makes sense. Again, check carefully whether you really need it: the app often uses a more or less standard vocabulary whose grammar rules stay the same across countries.
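To see which parts a locale tag actually carries, the built-in `Intl.Locale` can split it. This is only a small sketch and is not part of the pluralization framework; the `describeLocale` helper is a hypothetical name introduced for illustration.

```javascript
// Hypothetical helper: splits a locale tag into its parts using the built-in Intl.Locale.
function describeLocale(tag) {
  const { language, script, region } = new Intl.Locale(tag)
  return { language, script, region }
}

describeLocale("uk")          // language "uk", no script, no region
describeLocale("pt-BR")       // language "pt", region "BR"
describeLocale("zh-Hant-TW")  // language "zh", script "Hant", region "TW"
```

If the app only ever needs the `language` part, the short code is probably all you have to store.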

So, how to pluralize?

The general algorithm looks like this:

GIVEN a Term, a Number and the Locale.
DEFINE the PluralTag by the Locale and Number.
IF the Term appears in ExceptionsDictionary ⇒ use it.
OTHERWISE apply Rule(Term).
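
The steps above can be sketched in a few lines of JavaScript. Everything here is hypothetical naming (`pluralizeTerm`, `selectTag`, `exceptions`, `rule`), and the tag selection is injected as a naive stand-in that only covers English cardinals.

```javascript
// A sketch of the algorithm above. All names here are hypothetical.
// `selectTag` is injected on purpose: how to compute the tag is a separate question.
function pluralizeTerm({ term, number, locale, selectTag, exceptions, rule }) {
  // DEFINE the PluralTag by the Locale and Number.
  const tag = selectTag(locale, number)
  // IF the Term appears in ExceptionsDictionary, use it.
  const form = exceptions.get(term)?.[tag]
  if (form !== undefined) return form
  // OTHERWISE apply Rule(Term).
  return rule(tag, term)
}

// Naive stand-ins, good enough for English cardinals only.
const selectTag = (locale, n) => (n === 1 ? "one" : "other")
const exceptions = new Map([["person", { other: "people" }]])
const rule = (tag, term) => (tag === "one" ? term : term + "s")

const base = { locale: "en", selectTag, exceptions, rule }
pluralizeTerm({ ...base, term: "person", number: 5 })  // "people"
pluralizeTerm({ ...base, term: "item", number: 5 })    // "items"
pluralizeTerm({ ...base, term: "person", number: 1 })  // "person"
```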

The question is, how to define the PluralTag?

The `Intl.PluralRules`

The Intl namespace provides a lot of neat helpers for i18n. One of them is the Intl.PluralRules constructor. Let’s take a deeper look.

The constructor

new Intl.PluralRules(
  locale: string,
  options?: { type?: "cardinal" | "ordinal" }
)

It returns the locale-specific instance that handles the plural tag task.
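
A quick sketch of why the `type` option matters, using two of the instance's methods. It assumes nothing beyond the standard `Intl.PluralRules` API.

```javascript
// Omitting the options gives cardinal rules; ordinal rules have to be requested.
const cardinal = new Intl.PluralRules("en")
const ordinal = new Intl.PluralRules("en", { type: "ordinal" })

cardinal.resolvedOptions().type  // "cardinal"
ordinal.resolvedOptions().type   // "ordinal"

// The same number can land in different categories depending on the type.
cardinal.select(2)  // "other"  (as in "2 apples")
ordinal.select(2)   // "two"    (as in "2nd")
```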

The instance

The PluralRules instance provides several methods. Two are the most useful for our purposes:

.resolvedOptions() shows the rule set for the given locale.

In depth:

> const en = new Intl.PluralRules("en", {type: "ordinal"})
  PluralRules [Intl.PluralRules] {}
> en.resolvedOptions()
  {
    locale: 'en',
    type: 'ordinal',
    minimumIntegerDigits: 1,
    minimumFractionDigits: 0,
    maximumFractionDigits: 3,
    pluralCategories: [ 'one', 'two', 'few', 'other' ],
    roundingIncrement: 1,
    roundingMode: 'halfExpand',
    roundingPriority: 'auto',
    trailingZeroDisplay: 'auto'
  }
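
The `pluralCategories` field is the part that matters for pluralization: it tells how many forms a term may need in that locale. A small sketch for comparing locales follows; `categoriesOf` is a hypothetical helper, and the result is sorted because the order of the array can differ between engines. It assumes the runtime ships locale data for Ukrainian and Japanese, as recent Node.js versions and browsers do.

```javascript
// Hypothetical helper: lists the plural categories a locale uses, sorted for stable output.
function categoriesOf(locale, type = "cardinal") {
  const { pluralCategories } = new Intl.PluralRules(locale, { type }).resolvedOptions()
  return [...pluralCategories].sort()
}

categoriesOf("en")             // ["one", "other"]
categoriesOf("en", "ordinal")  // ["few", "one", "other", "two"]
categoriesOf("uk")             // ["few", "many", "one", "other"]
categoriesOf("ja")             // ["other"]
```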

.select(n: number): string returns the plural tag for a given number.
The returned value is always one of .resolvedOptions().pluralCategories. For instance:
```
const en = new Intl.PluralRules("en", { type: "ordinal" })
en.select(21)     // "one"     ∵ '21st' like '1st'
en.select(22)     // "two"     ∵ '22nd' like '2nd'
en.select(25)     // "other"   ∵ '25th', the default way
```
which corresponds to the traditional rules for building English ordinals: “N^st/nd/rd” with the default “N^th” (check the table above).
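
Mapping each tag to a suffix is enough to format English ordinals. This is a minimal sketch, assuming positive integers; `formatOrdinal` and the `suffixes` table are hypothetical names.

```javascript
const enOrdinalRules = new Intl.PluralRules("en", { type: "ordinal" })

// One suffix per plural tag reported for English ordinals.
const suffixes = { one: "st", two: "nd", few: "rd", other: "th" }

// Hypothetical helper: formats a positive integer as an English ordinal.
function formatOrdinal(n) {
  return `${n}${suffixes[enOrdinalRules.select(n)]}`
}

formatOrdinal(1)    // "1st"
formatOrdinal(22)   // "22nd"
formatOrdinal(3)    // "3rd"
formatOrdinal(11)   // "11th"  (11, 12 and 13 are "other")
formatOrdinal(101)  // "101st"
```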

Stitching it together

I created a tiny framework for pluralization; it’s available in the NPM registry.

First, preparation. Install:

npm i --save @rom98m/pluralize

…and init:

import { Plural } from "@rom98m/pluralize"
const en = new Plural("en")

Now let’s add a default pluralization rule for English cardinal terms:

en.registerRule((cat, term) => {
  if (cat === "one") return term
  if (/(s|x|z|sh|ch)$/.test(term)) return term + "es"
  // Only consonant + "y" becomes "-ies"; "day" falls through to "days".
  if (/[^aeiou]y$/.test(term)) return term.replace(/y$/, "i") + "es"
  return term + "s"
})

Let’s add some exception terms:

en
  .registerException("child", { other: "children" })
  .registerException("person", { other: "people" })

Now we are ready to pluralize:

en.pluralize(1, "box")       // "box"
en.pluralize(5, "box")       // "boxes"
en.pluralize(5, "city")      // "cities"
en.pluralize(5, "item")      // "items"
en.pluralize(1, "person")    // "person"
en.pluralize(5, "person")    // "people"

⚠️ Caveats

Yes, it requires that amount of preparation.
Luckily, you don’t use too many exception terms in the app; it makes sense to register only those which are used.
When cardinal/ordinal pluralization is needed, 2 separate instances should be created (as the rules are different):
```
const enCardinal = new Plural("en")
const enOrdinal = new Plural("en", { type: "ordinal" })
```
Obviously, each language/locale should instantiate its own new Plural("...").
…and should define its own rule and register exceptions 🤷‍♂️
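
To illustrate why every locale needs its own setup: Ukrainian cardinals use four plural tags where English uses two. The sketch below uses plain `Intl.PluralRules` rather than the framework, spells out the forms of one term by hand, and assumes the runtime ships Ukrainian locale data; `countItems` is a hypothetical helper.

```javascript
// Assumes the runtime ships Ukrainian locale data (recent Node.js and browsers do).
const ukCardinal = new Intl.PluralRules("uk")

// All four cardinal forms of "товар" ("item"); spelled out by hand, no rule involved.
const tovar = { one: "товар", few: "товари", many: "товарів", other: "товару" }

// Hypothetical helper: picks the form that matches the number.
function countItems(n) {
  return `${n} ${tovar[ukCardinal.select(n)]}`
}

countItems(1)    // "1 товар"
countItems(3)    // "3 товари"
countItems(5)    // "5 товарів"
countItems(11)   // "11 товарів"
countItems(21)   // "21 товар"
countItems(1.5)  // "1.5 товару"  (fractions fall into "other")
```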

Can an LLM help? In theory, yes. However, we’re talking about dynamic pluralization, so sending a new prompt every time the user adds something to the shopping cart does not sound like a good idea.

Is it worth it?

As you can see, there’s no quick-and-easy solution. Even the Plural framework requires a lot of scaffolding to work well: the rule should be well defined and tested; the exceptions should be added.
🙅‍♂️ So I’d recommend skipping pluralization until the bigger issues are resolved.

On the bright side, there’s usually not that much to pluralize dynamically: a couple of terms like “item” ↔ “items”. The rest of the text is usually ~~static~~ less dependent on the numbers.
🤔 So if you have already started with traditional i18n, you can make it a bit better.
