Matt Brubeck

08 Aug 2014

Let's build a browser engine!

Part 1: Getting started

I’m building a toy HTML rendering engine, and I think you should too. This is the first in a series of articles:

Part 1: Getting started

Part 2: HTML

Part 3: CSS

Part 4: Style

Part 5: Boxes

Part 6: Block layout

Part 7: Painting 101

The full series will describe the code I’ve written, and show how you can make your own. But first, let me explain why.

You’re building a what?

Let’s talk terminology. A browser engine is the portion of a web browser that works “under the hood” to fetch a web page from the internet, and translate its contents into forms you can read, watch, hear, etc. Blink, Gecko, WebKit, and Trident are browser engines. In contrast, the browser’s own UI—tabs, toolbar, menu and such—is called the chrome. Firefox and SeaMonkey are two browsers with different chrome but the same Gecko engine.

A browser engine includes many sub-components: an HTTP client, an HTML parser, a CSS parser, a JavaScript engine (itself composed of parsers, interpreters, and compilers), and much more. Those components involved in parsing web formats like HTML and CSS and translating them into what you see on-screen are sometimes called the layout engine or rendering engine.

Why a “toy” rendering engine?

A full-featured browser engine is hugely complex. Blink, Gecko, WebKit—these are millions of lines of code each. Even younger, simpler rendering engines like Servo and WeasyPrint are each tens of thousands of lines. Not the easiest thing for a newcomer to comprehend!

Speaking of hugely complex software: If you take a class on compilers or operating systems, at some point you will probably create or modify a “toy” compiler or kernel. This is a simple model designed for learning; it may never be run by anyone besides the person who wrote it. But making a toy system is a useful tool for learning how the real thing works. Even if you never build a real-world compiler or kernel, understanding how they work can help you make better use of them when writing your own programs.

So, if you want to become a browser developer, or just to understand what happens inside a browser engine, why not build a toy one? Like a toy compiler that implements a subset of a “real” programming language, a toy rendering engine could implement a small subset of HTML and CSS. It won’t replace the engine in your everyday browser, but should nonetheless illustrate the basic steps needed for rendering a simple HTML document.

Try this at home.

I hope I’ve convinced you to give it a try. This series will be easiest to follow if you already have some solid programming experience and know some high-level HTML and CSS concepts. However, if you’re just getting started with this stuff, or run into things you don’t understand, feel free to ask questions and I’ll try to make it clearer.

Before you start, a few remarks on some choices you can make:

On Programming Languages

You can build a toy layout engine in any programming language. Really! Go ahead and use a language you know and love. Or use this as an excuse to learn a new language if that sounds like fun.

If you want to start contributing to major browser engines like Gecko or WebKit, you might want to work in C++ because it’s the main language used in those engines, and using it will make it easier to compare your code to theirs.

My own toy project, robinson, is written in Rust. I’m part of the Servo team at Mozilla, so I’ve become very fond of Rust programming. Plus, one of my goals with this project is to understand more of Servo’s implementation. Robinson sometimes uses simplified versions of Servo’s data structures and code.

On Libraries and Shortcuts

In a learning exercise like this, you have to decide whether it’s “cheating” to use someone else’s code instead of writing your own from scratch. My advice is to write your own code for the parts that you really want to understand, but don’t be shy about using libraries for everything else. Learning how to use a particular library can be a worthwhile exercise in itself.

I’m writing robinson not just for myself, but also to serve as example code for these articles and exercises. For this and other reasons, I want it to be as tiny and self-contained as possible. So far I’ve used no external code except for the Rust standard library. (This also side-steps the minor hassle of getting multiple dependencies to build with the same version of Rust while the language is still in development.) This rule isn’t set in stone, though. For example, I may decide later to use a graphics library rather than write my own low-level drawing code.

Another way to avoid writing code is to just leave things out. For example, robinson has no networking code yet; it can only read local files. In a toy program, it’s fine to just skip things if you feel like it. I’ll point out potential shortcuts like this as I go along, so you can bypass steps that don’t interest you and jump straight to the good stuff. You can always fill in the gaps later if you change your mind.

First Step: The DOM

Are you ready to write some code? We’ll start with something small: data structures for the DOM. Let’s look at robinson’s dom module.

The DOM is a tree of nodes. A node has zero or more children. (It also has various other attributes and methods, but we can ignore most of those for now.)

struct Node {
    // data common to all nodes:
    children: Vec<Node>,

    // data specific to each node type:
    node_type: NodeType,
}

There are several node types, but for now we will ignore most of them and say that a node is either an Element or a Text node. In a language with inheritance these would be subtypes of Node. In Rust they can be an enum (Rust’s keyword for a “tagged union” or “sum type”):

enum NodeType {
    Text(String),
    Element(ElementData),
}

An element includes a tag name and any number of attributes, which can be stored as a map from names to values. Robinson doesn’t support namespaces, so it just stores tag and attribute names as simple strings.

struct ElementData {
    tag_name: String,
    attrs: AttrMap,
}

type AttrMap = HashMap<String, String>;

Finally, some constructor functions to make it easy to create new nodes:

fn text(data: String) -> Node {
    Node { children: Vec::new(), node_type: NodeType::Text(data) }
}

pub fn elem(tag_name: String, attrs: AttrMap, children: Vec<Node>) -> Node {
    Node {
        children,
        node_type: NodeType::Element(ElementData { tag_name, attrs })
    }
}

And that’s it! A full-blown DOM implementation would include a lot more data and dozens of methods, but this is all we need to get started.

Exercises

These are just a few suggested ways to follow along at home. Do the exercises that interest you and skip any that don’t.

Start a new program in the language of your choice, and write code to represent a tree of DOM text nodes and elements.
Install the latest version of Rust, then download and build robinson. Open up dom.rs and extend NodeType to include additional types like comment nodes.
Write code to pretty-print a tree of DOM nodes.

In the next article, we’ll add a parser that turns HTML source code into a tree of these DOM nodes.

References

For much more detailed information about browser engine internals, see Tali Garsiel’s wonderful How Browsers Work and its links to further resources.

For example code, here’s a short list of “small” open source web rendering engines. Most of them are many times bigger than robinson, but still way smaller than Gecko or WebKit. WebWhirr, at 2000 lines of code, is the only other one I would call a “toy” engine.

CSSBox (Java)
Cocktail (Haxe)
gngr (Java)
litehtml (C++)
LURE (Lua)
NetSurf (C)
Servo (Rust)
Simple San Simon (Haskell)
WeasyPrint (Python)
WebWhirr (C++)

You may find these useful for inspiration or reference. If you know of any other similar projects—or if you start your own—please let me know!