I am considering working on the PANOSE font matching part of Fontmatrix. I enjoy playing with Fontmatrix, but its idea of how PANOSE’s individual facets[1] are named, or how they work, seems a bit wonky to me. For instance, it only understands the names of the Latin Text facets, and uses them even for Latin Decorative or Pictorial fonts.

The first step (apart from trying to persuade my one-year-old son to go to sleep long enough for me to even turn on the computer) is to find out whether improved matching or re-classifying facilities would do any good at all, and for that I need to look at font classifications in the wild.

Turning to my Font Corpus database, I’ve extracted the following bare facts about PANOSE usage, and I’m quite buoyed up by the results.

From the 35420 fonts in the Corpus, I first get rid of fonts that have complete rubbish in the PANOSE field, which means discarding fonts with:

  1. Family Kind of “Any”(0), which means no attempt at all was made at classification. (13863 fonts).

  2. Facet values out of range. This is generally Weight, which for some reason, perhaps tool error, tends to have the value 114 or 226. (409 fonts).

  3. Family Kind > 5. Family Kind values up here, which would be used for non-Latin fonts, aren’t formalised in any document I can find. (12 fonts).

  4. Weight of “Any”(0). Weight is the only facet that is present for all values of Family Kind, so it really ought to be set to something, even if “No Fit”(1) is the only appropriate value. (1540 fonts).
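
The four rules above can be sketched as a small filter. This is only an illustration, not the query actually run against the Font Corpus. It assumes a PANOSE number is held as ten integers with Family Kind first and Weight third; the name `reject_reason` is made up, and only Weight is range-checked, against an assumed upper bound of 11 (the highest Weight value defined for Latin Text).

```python
from typing import Optional, Sequence

FAMILY_KIND = 0        # position of the Family Kind facet
WEIGHT = 2             # position of the Weight facet
MAX_FAMILY_KIND = 5    # values above this are not formalised (rule 3)
MAX_WEIGHT = 11        # assumption: highest Weight defined for Latin Text


def reject_reason(panose: Sequence[int]) -> Optional[str]:
    """Return why a PANOSE number is discarded, or None if it is kept.

    The rules are applied in the order they are listed in the post.
    """
    if len(panose) != 10:
        raise ValueError("a PANOSE number has exactly ten facets")
    if panose[FAMILY_KIND] == 0:
        return "Family Kind is Any"
    if panose[WEIGHT] > MAX_WEIGHT:
        # Simplification: only Weight is checked for range here.
        return "facet out of range"
    if panose[FAMILY_KIND] > MAX_FAMILY_KIND:
        return "Family Kind not formalised"
    if panose[WEIGHT] == 0:
        return "Weight is Any"
    return None
```

A fully classified Latin Text font such as `[2, 11, 6, 4, 2, 2, 2, 2, 2, 4]` passes all four checks, while an all-zero PANOSE field is dropped by the first one.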

Having cleared out the rubbish, we are left with 19596 fonts, 55% of the total collection. Of these, just over 90% are Latin Text fonts.

For the Latin Text fonts, a more complete classification means that more of the individual facets are set to something other than “Any”(0). Even “No Fit”(1) gives us some information about the limitations of the classification system.

So, how many facets are set to non-zero in our remaining fonts?

No. of non-zero facets    No. of fonts
                     3             349
                     4            1718
                     5             135
                     6            1391
                     7            6712
                     8             226
                     9              66
                    10            8689
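
A tally like the one above could be produced along these lines. It is a minimal sketch, assuming each PANOSE number is available as a sequence of ten integers. `nonzero_facets` and `tally` are illustrative names, and the sample values are made up for the example rather than taken from the Corpus.

```python
from collections import Counter
from typing import Iterable, Sequence


def nonzero_facets(panose: Sequence[int]) -> int:
    """Count the facets set to anything other than "Any" (0)."""
    return sum(1 for value in panose if value != 0)


def tally(fonts: Iterable[Sequence[int]]) -> Counter:
    """Map each non-zero facet count to the number of fonts that have it."""
    return Counter(nonzero_facets(panose) for panose in fonts)


# Made-up sample data, purely for illustration.
sample = [
    [2, 11, 6, 4, 2, 2, 2, 2, 2, 4],   # all ten facets set
    [2, 0, 5, 3, 0, 0, 0, 0, 0, 0],    # three facets set
    [2, 2, 6, 3, 0, 0, 0, 0, 0, 0],    # four facets set
    [2, 11, 8, 4, 2, 2, 2, 2, 2, 4],   # all ten facets set
]

for count, fonts in sorted(tally(sample).items()):
    print(count, fonts)
```

“No Fit”(1) counts as set here, in line with the point that even “No Fit” tells us something.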

Some of these facets are derived from measured values, and some of them are picked by judgement, which may explain the somewhat uneven coverage of classifications. (And who’s going to classify nine of the facets without doing one extra to complete the job?)

I think the end result is that there are enough fonts with decent classification in the wild to make this something worth working on. Go to sleep, baby boy!

  1. I call the individual numbers of a PANOSE number “facets” to keep me from over-using the word “value”.