Embedding and subsetting external fonts inside SVG images

SVG editors like Inkscape rely on fonts from the local system. Unlike PDF files, SVG images containing text aren't usually self-contained. This is desirable for editing. For publication, it is better not to rely on system fonts, because viewers on systems that do not have them installed will automatically choose a replacement that may have very different metrics.

Surprisingly, this embedding problem also happens on the web with external fonts. Without going into too much detail, SVG images embedded inline inside HTML pages inherit the styles (including fonts) from the surrounding context. This is great if you want your page to interact with the SVG elements, but there are good reasons to treat your SVGs as external images sourced from a regular <img> element: caching, scaling, lazy loading, and security.

Sourcing SVG images like this isolates them from the surrounding styles. Unfortunately, they can no longer load external resources either, so you really have no other choice than to make them fully self-contained. I had some trouble finding out how to make SVG images produced by Excalidraw or Inkscape self-contained. There are proprietary solutions to this problem, but the challenge is to do it with free and open-source tools and a little bit of code.

Let's start with a minimal example, an SVG image exported from Excalidraw:

<svg version="1.1" xmlns="http://www.w3.org/2000/svg">
  <!-- svg-source:excalidraw -->
  <defs>
    <style>
      @font-face {
        font-family: "Virgil";
        src: url("https://excalidraw.com/Virgil.woff2");
      }
    </style>
  </defs>
  <text font-family="Virgil, Segoe UI Emoji" font-size="20px">
    Hello, World!
  </text>
</svg>

If you source this image from an <img> element, it will be rendered with whatever your web browser's default font family is (and because this website doesn't rely on webfonts, it will look the same as the surrounding text, even though the Virgil font looks like handwritten text):

External SVG with hello world text in the default browser font — (That is supposed to look like handwritten text.)

If you open this image with Right click > Open Image in New Tab, you will see that the external font is loaded properly. One way to get the same result from an <img> element is to embed the characters as outlines. This is what Matplotlib does if you set text.usetex, rendering the text through $\LaTeX$ and embedding the resulting shapes. Inkscape also makes that possible to some extent, but I encountered a number of issues:

The CLI is not suitable for headless shell scripting as it performs actions through the GUI.
I had a couple of crashes and hangs on seemingly basic SVG documents.
You still have some manual fiddling left to install the matching font in TTF or OTF format on your system and update its name in the SVG image if it doesn't match the @font-face definition.

The general technique of transforming characters to outlines also applies to PDF files, commonly employed by publishers who do not have the right to redistribute copyrighted fonts. Unfortunately, this technique has the following drawbacks:

The resulting text looks weird as it approximates the rendering features encoded in the original font.
You cannot edit the text or change the font properties after embedding.
Without some form of character deduplication, it leads to larger files.

Fortunately, there is a much simpler way to embed fonts (or other assets) inside an SVG image: as Base64-encoded resources. The following snippet shows what embedding Virgil looks like:

<svg version="1.1" xmlns="http://www.w3.org/2000/svg">
  <!-- svg-source:excalidraw -->
  <defs>
    <style>
      @font-face {
        font-family: "Virgil";
        src: url(data:application/font;base64,d09GMk9UVE8AAA...);
      }
    </style>
  </defs>
  <text font-family="Virgil, Segoe UI Emoji" font-size="20px">
    Hello, World!
  </text>
</svg>
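For illustration, here is a minimal Python sketch of that substitution, using only the standard library. The helper names font_data_uri and embed_font are hypothetical, the font/woff2 MIME type is an assumption (the snippet above uses application/font), and the eight placeholder bytes stand in for a real font file.

```python
import base64
import re


def font_data_uri(font_bytes: bytes, mime: str = "font/woff2") -> str:
    """Encode raw font bytes as a Base64 data URI."""
    payload = base64.b64encode(font_bytes).decode("ascii")
    return f"data:{mime};base64,{payload}"


def embed_font(svg_text: str, font_url: str, font_bytes: bytes) -> str:
    """Replace every url(...) pointing at font_url with an inline data URI."""
    quote = r"""["']?"""  # the URL may be quoted or not
    pattern = re.compile(
        r"url\(\s*" + quote + re.escape(font_url) + quote + r"\s*\)"
    )
    uri = font_data_uri(font_bytes)
    # A function replacement avoids backslash processing in the URI.
    return pattern.sub(lambda match: f"url({uri})", svg_text)


# Placeholder bytes (a WOFF2 signature followed by "OTTO"), not a usable font.
# In practice you would read the downloaded font file from disk instead.
FAKE_FONT = b"wOF2OTTO"

css = (
    '@font-face { font-family: "Virgil"; '
    'src: url("https://excalidraw.com/Virgil.woff2"); }'
)
print(embed_font(css, "https://excalidraw.com/Virgil.woff2", FAKE_FONT))
```

With these placeholder bytes, the encoded payload starts with the same d09GMk9UVE8 prefix as the snippet above, because that prefix is simply the Base64 form of the file signature.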

This particular font weighs about 60 kB in WOFF2 format, but the Base64 encoding expands it to about 80 kB. Additionally, if a webpage sources multiple SVG images, every one of them will carry an entire copy of the font, as they are all self-contained.
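The expansion follows from Base64 emitting four output characters for every three input bytes. A quick check in Python, where base64_size is a hypothetical helper and 60,000 bytes is a round stand-in for the real file size:

```python
import base64


def base64_size(n_bytes: int) -> int:
    """Length of the padded Base64 encoding of n_bytes bytes of input."""
    return 4 * ((n_bytes + 2) // 3)


raw_size = 60_000  # round stand-in for the font size quoted above
encoded_size = base64_size(raw_size)

# Cross-check the formula against the standard library encoder.
assert len(base64.b64encode(bytes(raw_size))) == encoded_size

print(encoded_size)  # 80000
```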

Through a font subsetting process, you can drastically reduce the size of these images by stripping unused characters (and optionally applying further optimizations to remove some of the rendering information). These techniques are nothing new: this is what most PDF files rely on for portability, low file size, and proper font rendering.

Because it would be quite difficult to do this by hand for an SVG image, your humble servant created svgoptim. This script goes through the following steps:

Extract the set of characters from the SVG text nodes, grouped by font family, using the excellent lxml Python bindings.
Load each font file with the fontTools Python bindings and apply the font subsetting routine.
Update all @font-face definitions with the Base64 encoding of the previously generated font subsets.
Optimize the resulting SVG image with the great svgo library.
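The first two steps can be sketched roughly as follows. This is not the svgoptim source: it swaps lxml for the standard library's xml.etree.ElementTree, only looks at the font-family attribute set directly on text elements (no inherited or CSS-declared families), and the function and file names are hypothetical. The subsetting part relies on the fontTools.subset module.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

SVG_NS = "{http://www.w3.org/2000/svg}"


def chars_by_font_family(svg_text):
    """Map each font family named on a <text> element to the characters it may render."""
    root = ET.fromstring(svg_text)
    used = defaultdict(set)
    for node in root.iter(SVG_NS + "text"):
        # Collapse indentation and newlines so that only visible characters
        # and single spaces remain.
        content = " ".join("".join(node.itertext()).split())
        # Fallback families get the same characters, since we cannot know
        # here which font ends up rendering which glyph.
        for family in node.get("font-family", "").split(","):
            name = family.strip().strip("\"'")
            if name:
                used[name].update(content)
    return dict(used)


def subset_font(font_path, chars, out_path):
    """Write a copy of font_path that only keeps the glyphs needed for chars."""
    # Imported lazily so the extraction step runs without fontTools installed.
    from fontTools import subset

    options = subset.Options()
    font = subset.load_font(font_path, options)
    subsetter = subset.Subsetter(options=options)
    subsetter.populate(text="".join(sorted(chars)))
    subsetter.subset(font)
    subset.save_font(font, out_path, options)


SVG = """<svg version="1.1" xmlns="http://www.w3.org/2000/svg">
  <text font-family="Virgil, Segoe UI Emoji" font-size="20px">
    Hello, World!
  </text>
</svg>"""

usage = chars_by_font_family(SVG)
print(sorted(usage["Virgil"]))

# Hypothetical file names; reading or writing WOFF2 also needs the brotli module.
# subset_font("Virgil.woff2", usage["Virgil"], "Virgil-subset.woff2")
```

For the "Hello, World!" example, only ten distinct characters survive, which is why the subset ends up so much smaller than the full font.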

Here are some statistics about the copy of an XKCD comic that you can find at the end of this article:

Original PNG from XKCD: 12.3 kB.
SVG copy with Excalidraw: 10.3 kB (+ 61.2 kB for the external font in WOFF2 format).
After replacing the font with XKCD Script in Inkscape: 17.3 kB (+ 52.8 kB for the external font in TTF format).
After font embedding, subsetting, and optimization with svgoptim: 15.3 kB.

Not too bad, but we can do much better:

After Brotli compression: 7.6 kB.

Finally, that brings us to the last reason you should favor SVG over other image formats:

Ranking of information trustworthiness based on file extension — SVG copy of *File Extensions* from XKCD (CC BY-NC 2.5)