29 August, 2023
From Code to Screen (Part 2)
Explore the fascinating world of digital typography, from early computer fonts to modern vector-based systems. Learn how text is rendered on screens and the techniques used to make it crisp and readable.
Computer Font
Every day, you see typed text all around you: text on screens, road signs, license plates, restaurant signs... Have you ever wondered how computers can display so many different fonts and styles?
In this second article of the "From Code to Screen" series, we'll explore something we use daily but rarely notice: digital fonts (computer fonts).
Once Upon a Time
In the early days, the first computers didn't actually have screens. They sent the characters they wanted to display to a teletypewriter, a typewriter-like machine that printed them on paper (which is why the function to display text on screen is still called "print"). Here's an image of such a machine, restored to display a Linux (Debian) command line:
It's easy to see that the font on this type of typewriter consists of "stamps" made of hardened steel, and there was virtually no way to change the font (except buying a new printer).
Bitmap Font
When computers began to switch to using screens instead of printers, early fonts were stored directly in the terminal's ROM, which only the manufacturer could change. This type of font is called a bitmap font because each character is stored as a small image (bitmap):
It's easy to see that this method makes font processing very simple and requires little storage space. However, the biggest drawback is that this type of font cannot be enlarged cleanly: just as a pixel image becomes "pixelated" when enlarged, so does a bitmap font. To be scalable, fonts need to be stored in vector format.
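To make the idea concrete, here is a minimal Python sketch of a bitmap glyph. The 8x8 shape of the letter "A" and the names `GLYPH_A` and `render` are made up for illustration and are not taken from any real terminal ROM. The whole character fits in 8 bytes, and "enlarging" it can only repeat the existing pixels, which is exactly why it looks blocky.

```python
# A hypothetical 8x8 bitmap for the letter "A": one integer per row,
# one bit per pixel (the highest bit is the leftmost pixel).
GLYPH_A = [
    0b00011000,
    0b00100100,
    0b01000010,
    0b01000010,
    0b01111110,
    0b01000010,
    0b01000010,
    0b00000000,
]


def render(glyph, width=8, scale=1):
    """Return the glyph as text lines, repeating every pixel `scale` times."""
    lines = []
    for row in glyph:
        # Read the bits from left (highest bit) to right (lowest bit).
        pixels = "".join(
            ("#" if row & (1 << (width - 1 - x)) else ".") * scale
            for x in range(width)
        )
        # Repeat the whole row vertically as well.
        lines.extend([pixels] * scale)
    return lines


if __name__ == "__main__":
    print("\n".join(render(GLYPH_A)))           # original size
    print()
    print("\n".join(render(GLYPH_A, scale=2)))  # enlarged: same pixels, just bigger blocks
```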
Vector Font / Outline Font
As the name suggests, this type of font stores each character as a vector outline. One of the earliest widely adopted outline formats is TrueType, developed by Apple in the late 1980s. The TrueType format is still widely used even in 2023.
The TrueType format (and later OpenType) is built on Bézier curves from mathematics. Regarding the math itself, I'd like to "disappear" because I'm quite poor at math, but basically, a Bézier curve is described by just a few control points, so it is cheap to evaluate and convert to pixels (a light algorithm that runs fast). It's also not difficult for non-programmers to adjust (if you've ever used the Pen Tool in Adobe Illustrator, that's Bézier!).
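For the curious, the math is smaller than it sounds. The sketch below evaluates a quadratic Bézier curve (the kind TrueType outlines use), which is defined by a start point, one control point, and an end point: B(t) = (1-t)²·P0 + 2(1-t)t·P1 + t²·P2. The function names and the sample points are my own, chosen only for illustration.

```python
def quadratic_bezier(p0, p1, p2, t):
    """Point on a quadratic Bézier curve at parameter t, where 0 <= t <= 1."""
    u = 1.0 - t
    x = u * u * p0[0] + 2 * u * t * p1[0] + t * t * p2[0]
    y = u * u * p0[1] + 2 * u * t * p1[1] + t * t * p2[1]
    return (x, y)


def flatten(p0, p1, p2, steps=8):
    """Approximate the curve with steps + 1 points joined by straight segments."""
    return [quadratic_bezier(p0, p1, p2, i / steps) for i in range(steps + 1)]


if __name__ == "__main__":
    # Hypothetical arch: start at (0, 0), pulled toward (1, 2), end at (2, 0).
    for point in flatten((0, 0), (1, 2), (2, 0)):
        print(point)
```

Splitting a curve into short straight segments like this is one simple way to hand it over to the next step, turning the outline into pixels.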
The process of converting from vector to pixel is called "rasterization". It works by calculating which pixels fall inside the outline described by the curves, and those pixels are "turned on". In the following example, the navy blue part will be the pixels turned on after the rasterization process:
Applying this method, we can now change the font size as desired. This is also the method used in early versions of Mac and Windows:
However, when looking at the corners of the text, we can easily notice that they're quite jagged (aliased). This is because at this point pixels only have "on" and "off" states, with nothing in between such as 20% or 50%. This limitation has two causes: (1) early computer screens couldn't display many colors, and (2) processing speed was limited.
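Here is a rough sketch of this on/off rasterization in Python. It assumes the outline has already been flattened into a polygon, and it uses one common rule: a pixel is turned on when its center lies inside the outline (tested with the even-odd rule). The names `inside` and `rasterize` and the triangle are my own for illustration, and real font rasterizers are considerably more sophisticated.

```python
def inside(polygon, x, y):
    """Even-odd test: count the edges crossed by a ray going right from (x, y)."""
    crossings = 0
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            # x position where the edge meets that horizontal line
            cross_x = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if cross_x > x:
                crossings += 1
    return crossings % 2 == 1


def rasterize(polygon, width, height):
    """Turn a pixel on ('#') when its center (x + 0.5, y + 0.5) is inside the outline."""
    return [
        "".join("#" if inside(polygon, x + 0.5, y + 0.5) else "." for x in range(width))
        for y in range(height)
    ]


if __name__ == "__main__":
    triangle = [(1, 7), (4, 1), (7, 7)]  # hypothetical outline on an 8x8 grid
    print("\n".join(rasterize(triangle, 8, 8)))
```

Running it on the triangle shows the staircase effect on the slanted edges: every pixel is either fully on or fully off.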
To overcome this, we can consider how much of each pixel the shape covers (or how far the curve is from the center of the pixel), then adjust the pixel's opacity accordingly. This method is called "font smoothing" (anti-aliasing). The following image (left) is an example of how this algorithm works, while the image on the right is the same text but with font smoothing enabled:
Note that font smoothing algorithms have since "evolved" and become much more refined. For example, ClearType, introduced in Windows XP, takes advantage of subpixels (the individual red, green, and blue elements within a pixel) to make text appear smoother on LCD screens. In the following image, (a) is without font smoothing, while (b) uses ClearType:
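One simple way to get those in-between opacity values is supersampling: test several points inside each pixel and use the fraction that lands inside the shape as the opacity. Note that this is a coverage estimate, not the distance-based approach described above, and it is only a sketch. The disc shape, the `SHADES` ramp, and all function names are assumptions for illustration.

```python
def coverage(px, py, inside, samples=4):
    """Fraction of a samples x samples grid of points in pixel (px, py) inside the shape."""
    hits = 0
    for sy in range(samples):
        for sx in range(samples):
            x = px + (sx + 0.5) / samples
            y = py + (sy + 0.5) / samples
            if inside(x, y):
                hits += 1
    return hits / (samples * samples)


def in_disc(x, y, cx=4.0, cy=4.0, r=3.0):
    """Hypothetical shape: a filled disc standing in for a glyph outline."""
    return (x - cx) ** 2 + (y - cy) ** 2 <= r * r


SHADES = " .:*#"  # character ramp from 0% to 100% opacity


def render_smooth(width=8, height=8):
    """Draw the disc, picking a shade for each pixel from its coverage."""
    rows = []
    for py in range(height):
        row = ""
        for px in range(width):
            c = coverage(px, py, in_disc)
            row += SHADES[round(c * (len(SHADES) - 1))]
        rows.append(row)
    return rows


if __name__ == "__main__":
    print("\n".join(render_smooth()))
```

Pixels fully inside the disc get the darkest shade, pixels fully outside stay blank, and the pixels along the edge get something in between, which is what softens the jagged look.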
In addition to font smoothing, font hinting is often applied to make text appear clearer. This method tries to snap the outline to the pixel grid, so that strokes land on whole pixels and the text looks less "blurry". Font hinting is particularly necessary for writing systems with complex characters. The following image shows Chinese characters (quite complex), with font hinting off (left) and on (right). Notice how the horizontal strokes look thinner and clearer:
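The snapping idea can be shown with a single vertical stroke. This is a deliberately simplified sketch: real hinting (for example the instructions embedded in TrueType fonts) is far more elaborate, and the names `column_coverage` and `snap_stem` as well as the stroke coordinates are hypothetical.

```python
def column_coverage(left, right, columns):
    """How much of each 1-pixel-wide column the stroke from left to right covers (0 to 1)."""
    return [max(0.0, min(right, x + 1) - max(left, x)) for x in range(columns)]


def snap_stem(left, right):
    """Move the stroke onto the pixel grid: round its left edge and its width (at least 1 px)."""
    width = max(1, round(right - left))
    snapped_left = round(left)
    return snapped_left, snapped_left + width


if __name__ == "__main__":
    left, right = 2.3, 3.6  # hypothetical stroke edges, in pixel units
    print(column_coverage(left, right, 6))              # spread over two gray columns
    print(column_coverage(*snap_stem(left, right), 6))  # one fully covered column
```

Without snapping, the stroke is smeared across two partially covered columns. After snapping, it fills exactly one column, which is why hinted text looks sharper at small sizes.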
The Operating System Side
Although rasterizing a single character is quite fast, rasterizing every character every time it is displayed would add up and make the computer noticeably slow. To solve this problem, each glyph is rasterized as rarely as possible, and software reuses the cached pixel version afterwards. This technique is called a glyph cache.
The remaining task is automatic line wrapping. Notice how the texts you read on screen automatically wrap at spaces, rather than c-utting w-ords l-ike th-is. This is called line wrapping (or word wrapping). In its simplest form, it works by summing the width of each character to be displayed, and when the total exceeds the width of the display frame, it finds the last space and turns it into a "line break", then repeats.
All these methods and algorithms are quite complex, but in reality, they've already been implemented in the operating system or in libraries, such as Pango (libpango) on Linux. Programmers simply write code to specify what text to display, which font, what size, etc.
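A glyph cache can be as simple as a dictionary keyed by character, font, and size. The sketch below is only an illustration of the idea: `GlyphCache` and `fake_rasterize` are hypothetical names, and the "bitmap" is a placeholder string rather than real pixel data.

```python
class GlyphCache:
    """Remember rasterized glyphs so each (char, font, size) is rasterized only once."""

    def __init__(self, rasterize):
        self._rasterize = rasterize  # the expensive function: (char, font, size) -> bitmap
        self._cache = {}
        self.misses = 0              # how many times we really had to rasterize

    def get(self, char, font, size):
        key = (char, font, size)
        if key not in self._cache:
            self.misses += 1
            self._cache[key] = self._rasterize(char, font, size)
        return self._cache[key]


def fake_rasterize(char, font, size):
    """Stand-in for a real rasterizer; returns a placeholder 'bitmap'."""
    return f"<{char}|{font}|{size}px>"


if __name__ == "__main__":
    cache = GlyphCache(fake_rasterize)
    for ch in "hello hello":
        cache.get(ch, "DemoFont", 12)
    print(cache.misses)  # number of distinct glyphs actually rasterized
```

Drawing "hello hello" needs 11 glyphs on screen, but only the distinct characters are rasterized; the rest come straight from the cache.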
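The wrapping rule described above can be sketched as a greedy loop over words. This is a minimal version under simple assumptions: `char_width` is a hypothetical function returning the width of one character (1 unit each by default, as in a monospace font), and a word wider than the frame is left unbroken on its own line. Real layout engines handle many more cases.

```python
def wrap(text, max_width, char_width=lambda ch: 1):
    """Greedy word wrap: start a new line when the next word would not fit."""
    lines = []
    line = ""
    line_width = 0
    for word in text.split():
        word_width = sum(char_width(ch) for ch in word)
        # A space is only needed if the line already contains a word.
        space_width = char_width(" ") if line else 0
        if line and line_width + space_width + word_width > max_width:
            # The word does not fit: the last space becomes a line break.
            lines.append(line)
            line, line_width = word, word_width
        else:
            line += (" " if line else "") + word
            line_width += space_width + word_width
    if line:
        lines.append(line)
    return lines


if __name__ == "__main__":
    for row in wrap("the quick brown fox jumps", 10):
        print(row)
```

With a proportional font, the same loop works as long as `char_width` returns each character's real width instead of 1.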
Conclusion
So we've "dissected" another "obvious" truth that we observe every day. I hope you found something useful from this article. Thank you very much for your time.
In the next article, I'll talk about the graphical shell. See you soon!
References
- https://en.wikipedia.org/wiki/TrueType
- https://en.wikipedia.org/wiki/Font%5Fhinting
- https://learn.microsoft.com/en-us/windows/win32/direct3d11/d3d10-graphics-programming-guide-rasterizer-stage-rules
- https://learn.microsoft.com/en-us/dotnet/desktop/wpf/advanced/cleartype-overview?view=netframeworkdesktop-4.8
- https://pango.gnome.org/