<h1>Rust: A unique perspective</h1>
<p>Matt Brubeck, 2019-02-07</p>
<p><a href="https://www.rust-lang.org/">The Rust programming language</a> is designed to ensure memory safety,
using a mix of compile-time and run-time checks to stop programs from
accessing invalid pointers or sharing memory across threads without proper
synchronization.</p>
<p>The way Rust does this is usually introduced in terms of <strong>mutable</strong> and
<strong>immutable</strong> borrowing and lifetimes. This makes sense, because these are
mechanisms that Rust programmers must use directly. They describe <em>what</em> the
Rust compiler checks when it compiles a program.</p>
<p>However, there is another way to explain Rust. This alternate story focuses
on <strong>unique</strong> versus <strong>shared</strong> access to memory. I believe this
version is useful for understanding <em>why</em> various checks exist and <em>how</em> they
provide memory safety.</p>
<p>Most experienced Rust programmers are already familiar with this concept.
Five years ago, Niko Matsakis even proposed <a href="http://smallcultfollowing.com/babysteps/blog/2014/05/13/focusing-on-ownership/">changing the <code>mut</code> keyword to
<code>uniq</code></a> to emphasize it. My goal is to make these important
ideas more accessible to beginning and intermediate Rust programmers.</p>
<p>This is a very quick introduction that skips over many details to focus on
high-level concepts. It should complement the official Rust documentation, not
supplant it.</p>
<h2 id="unique-access">Unique access</h2>
<p>The first key observation is: <strong>If a variable has unique access to a value,
then it is safe to mutate it.</strong></p>
<p>By <em>safe</em>, I mean <em>memory-safe</em>: free from invalid pointer accesses, data races,
or other causes of <a href="https://doc.rust-lang.org/nomicon/what-unsafe-does.html">undefined behavior</a>. And by <em>unique access</em>, I mean that
while this variable is alive, there are no other variables that can be used to
read or write any part of the same value.</p>
<p>Unique access makes memory safety very simple: If there are no other
pointers to the value, then you don’t need to worry about invalidating them.
Similarly, if variables on other threads can’t access the value, you needn’t
worry about synchronization.</p>
<h3 id="unique-ownership">Unique ownership</h3>
<p>One form of unique access is <strong>ownership</strong>. When you initialize a variable with
a value, it becomes the sole <em>owner</em> of that value. Because the value has
just one owner, the owner can safely mutate the value, destroy it, or
transfer it to a new owner.</p>
<p>Depending on the type of the value, assigning a value to a new variable
will either <strong>move</strong> it or <strong>copy</strong> it. Either way, unique ownership is
preserved. For a <em>move</em> type, the old owner becomes inaccessible after the
move, so we still have one value owned by one variable:</p>
<figure class="highlight"><pre><code class="language-rust" data-lang="rust"><span class="k">let</span> <span class="n">x</span> <span class="o">=</span> <span class="nd">vec!</span><span class="p">[</span><span class="mi">1</span><span class="p">,</span> <span class="mi">2</span><span class="p">,</span> <span class="mi">3</span><span class="p">];</span>
<span class="k">let</span> <span class="n">y</span> <span class="o">=</span> <span class="n">x</span><span class="p">;</span> <span class="c1">// move ownership from x to y</span>
<span class="c1">// can’t access x after moving its value to y</span></code></pre></figure>
<p>For a <em>copy</em> type, the value is duplicated, so we end up with two values owned
by two variables:</p>
<figure class="highlight"><pre><code class="language-rust" data-lang="rust"><span class="k">let</span> <span class="n">x</span> <span class="o">=</span> <span class="mi">1</span><span class="p">;</span>
<span class="k">let</span> <span class="n">y</span> <span class="o">=</span> <span class="n">x</span><span class="p">;</span> <span class="c1">// copy the value of x into y</span></code></pre></figure>
<p>In this case, each variable ends up with a separate, independent value.
Mutating one will not affect the other.</p>
<p>One value might be owned by another value, rather than directly by a variable.
For example, a struct owns its fields, a <code>Vec<T></code> owns the <code>T</code> items inside
it, and a <code>Box<T></code> owns the <code>T</code> that it points to.</p>
<h3 id="unique-borrowing">Unique borrowing</h3>
<p>If you have unique access to a value of type <code>T</code>, you can borrow a <strong>unique
reference</strong> to that value. A unique reference to a <code>T</code> has type <code>&mut T</code>.</p>
<p>Because it’s safe to mutate when you have a unique reference, unique
references are also called “mutable references.”</p>
<p>The Rust compiler enforces this uniqueness at compile time. In any region of
code where the unique reference may be used, no other reference to any part of
the same value may exist, and even the owner of that value may not move or
destroy it. Violating this rule triggers a compiler error.</p>
<p>A reference only <strong>borrows</strong> the value, and must return it to its owner.
This means that the reference can be used to mutate the value, but not to move
or destroy it (unless it overwrites it with a new value, for example using
<a href="https://doc.rust-lang.org/std/mem/fn.replace.html"><code>replace</code></a>). Just like in real life, you need to give back what you’ve
borrowed.</p>
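<p>As a minimal sketch of that exception (the helper name <code>take_items</code> is made up for illustration): a unique reference cannot move the value out and leave nothing behind, but it can swap in a replacement and keep the old value.</p>

```rust
use std::mem;

// Hypothetical helper: takes the current contents out through a unique
// reference, leaving an empty vector behind for the owner.
fn take_items(items: &mut Vec<i32>) -> Vec<i32> {
    mem::replace(items, Vec::new())
}

fn main() {
    let mut owner = vec![1, 2, 3];
    let taken = take_items(&mut owner);
    assert_eq!(taken, vec![1, 2, 3]);
    // The owner still holds a valid (empty) value after the borrow ends.
    assert!(owner.is_empty());
}
```

<p>The borrow is "returned" in the sense that the owner always ends up with a valid value of the same type.</p>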
<p>Borrowing a value is like locking it. Just like a mutex lock in a
multi-threaded program, it’s usually best to hold a borrowed reference for as
little time as possible. Storing a unique reference in a long-lived data
structure will prevent any other use of the value for as long as that
structure exists.</p>
<h3 id="unique-references-cant-be-copied">Unique references can’t be copied</h3>
<p>An <code>&mut T</code> cannot be copied or cloned, because this would result in
two “unique” references to the same value. It can only be moved:</p>
<figure class="highlight"><pre><code class="language-rust" data-lang="rust"><span class="k">let</span> <span class="k">mut</span> <span class="n">a</span> <span class="o">=</span> <span class="mi">1</span><span class="p">;</span>
<span class="k">let</span> <span class="n">x</span> <span class="o">=</span> <span class="o">&</span><span class="k">mut</span> <span class="n">a</span><span class="p">;</span>
<span class="k">let</span> <span class="n">y</span> <span class="o">=</span> <span class="n">x</span><span class="p">;</span> <span class="c1">// move the reference from x into y</span>
<span class="c1">// x is no longer accessible here</span></code></pre></figure>
<p>However, you can temporarily “re-borrow” from a unique reference. This gives
a new unique reference to the same value, but the original reference can no
longer be accessed until the new one goes out of scope (in older versions of
Rust) or is no longer used (with non-lexical lifetimes, as in current versions
of Rust):</p>
<figure class="highlight"><pre><code class="language-rust" data-lang="rust"><span class="k">let</span> <span class="k">mut</span> <span class="n">a</span> <span class="o">=</span> <span class="mi">1</span><span class="p">;</span>
<span class="k">let</span> <span class="n">x</span> <span class="o">=</span> <span class="o">&</span><span class="k">mut</span> <span class="n">a</span><span class="p">;</span>
<span class="p">{</span>
<span class="k">let</span> <span class="n">y</span> <span class="o">=</span> <span class="o">&</span><span class="k">mut</span> <span class="o">*</span><span class="n">x</span><span class="p">;</span>
<span class="c1">// x is "re-borrowed" and cannot be used while y is alive</span>
<span class="o">*</span><span class="n">y</span> <span class="o">=</span> <span class="mi">4</span><span class="p">;</span> <span class="c1">// y has unique access and can mutate `a`</span>
<span class="p">}</span>
<span class="c1">// x becomes accessible again after y is dead</span>
<span class="o">*</span><span class="n">x</span> <span class="o">+=</span> <span class="mi">1</span><span class="p">;</span> <span class="c1">// now x has unique access again and can mutate the value</span>
<span class="nd">assert_eq!</span><span class="p">(</span><span class="o">*</span><span class="n">x</span><span class="p">,</span> <span class="mi">5</span><span class="p">);</span></code></pre></figure>
<p>Re-borrowing happens implicitly when you call a function that takes a unique
reference. This greatly simplifies code that passes unique references around,
but can confuse programmers who are just learning about these restrictions.</p>
<figure class="highlight"><pre><code class="language-rust" data-lang="rust"><span class="k">fn</span> <span class="nf">f</span><span class="p">(</span><span class="n">n</span><span class="p">:</span> <span class="o">&</span><span class="k">mut</span> <span class="nb">i32</span><span class="p">)</span> <span class="p">{</span> <span class="o">*</span><span class="n">n</span> <span class="o">=</span> <span class="mi">2</span> <span class="p">}</span>
<span class="k">let</span> <span class="k">mut</span> <span class="n">a</span> <span class="o">=</span> <span class="mi">1</span><span class="p">;</span>
<span class="k">let</span> <span class="n">x</span> <span class="o">=</span> <span class="o">&</span><span class="k">mut</span> <span class="n">a</span><span class="p">;</span>
<span class="nf">f</span><span class="p">(</span><span class="n">x</span><span class="p">);</span> <span class="c1">// x is re-borrowed rather than moved</span>
<span class="nd">dbg!</span><span class="p">(</span><span class="o">*</span><span class="n">x</span><span class="p">);</span> <span class="c1">// x is accessible again here</span></code></pre></figure>
<h2 id="shared-access">Shared access</h2>
<p>A value is <strong>shared</strong> if there are multiple variables that are alive at the
same time that can be used to access it.</p>
<p>While a value is shared, we have to be a lot more careful about mutating it.
Writing to the value through one variable could invalidate pointers held by
other variables, or cause a data race with readers or writers on other
threads.</p>
<p>Rust ensures that <strong>you can read from a value only while no variables can
write to it</strong>, and <strong>you can write to a value only while no other variables
can read or write to it.</strong> In other words, you can have a unique writer, <em>or</em>
multiple readers, but not both at once. Some Rust types enforce this at
compile time and others at run time, but the principle is always the same.</p>
<h3 id="shared-ownership">Shared ownership</h3>
<p>One way to share a value of type <code>T</code> is to create an <code>Rc<T></code>, or
“reference-counted pointer to T”. This allocates space on the heap for a <code>T</code>,
plus some extra space for reference counting (tracking the number of pointers
to the value). Then you can call <code>Rc::clone</code> to increment the reference count
and receive another <code>Rc<T></code> that points to the same value:</p>
<figure class="highlight"><pre><code class="language-rust" data-lang="rust"><span class="k">let</span> <span class="n">x</span> <span class="o">=</span> <span class="nn">Rc</span><span class="p">::</span><span class="nf">new</span><span class="p">(</span><span class="mi">1</span><span class="p">);</span>
<span class="k">let</span> <span class="n">y</span> <span class="o">=</span> <span class="n">x</span><span class="nf">.clone</span><span class="p">();</span>
<span class="c1">// x and y hold two different Rc that point to the same memory</span></code></pre></figure>
<p>Because the <code>T</code> lives on the heap and <code>x</code> and <code>y</code> just hold pointers to it, it
can outlive any particular pointer. It will be destroyed only when the last
of the pointers is dropped. This is called <strong>shared ownership</strong>.</p>
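<p>To make the reference count visible, here is a small sketch using <code>Rc::strong_count</code> from the standard library. The function name <code>count_owners</code> is only for illustration.</p>

```rust
use std::rc::Rc;

// Returns the reference count observed at three points:
// after creating x, after cloning it into y, and after dropping x.
fn count_owners() -> (usize, usize, usize) {
    let x = Rc::new(String::from("shared"));
    let one = Rc::strong_count(&x);
    let y = Rc::clone(&x);
    let two = Rc::strong_count(&x);
    drop(x); // the String stays alive because y still owns it
    let after_drop = Rc::strong_count(&y);
    (one, two, after_drop)
}

fn main() {
    assert_eq!(count_owners(), (1, 2, 1));
}
```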
<h3 id="shared-borrowing">Shared borrowing</h3>
<p>A <strong>shared reference to T</strong>, or <code>&T</code>, is another “borrowed” type which can’t
outlive its referent. This is also called an “immutable reference.”</p>
<p>The compiler ensures that a shared reference can’t be
created while a unique reference exists to any part of the same value, and
vice-versa. And (just like unique references) the owner isn’t allowed to
drop/move/mutate the value while any shared references are alive.</p>
<p>If you have unique access to a value, you can produce many shared references
or one unique reference to it. However, if you only have shared access to a
value, you can’t produce a unique reference (at least, not without some
additional checks, which I’ll discuss soon). One consequence of this is that
you can convert an <code>&mut T</code> to an <code>&T</code>, but not vice-versa.</p>
<p>Because multiple shared references are allowed, an <code>&T</code> can be copied/cloned
(unlike <code>&mut T</code>).</p>
<figure class="highlight"><pre><code class="language-rust" data-lang="rust"><span class="k">let</span> <span class="k">mut</span> <span class="n">a</span> <span class="o">=</span> <span class="mi">1</span><span class="p">;</span>
<span class="k">let</span> <span class="n">x</span> <span class="o">=</span> <span class="o">&</span><span class="n">a</span><span class="p">;</span>
<span class="k">let</span> <span class="n">y</span> <span class="o">=</span> <span class="n">x</span><span class="p">;</span> <span class="c1">// copy the reference</span>
<span class="c1">// both x and y are accessible here</span></code></pre></figure>
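<p>The one-way conversion from unique to shared can be sketched like this (the function names are hypothetical). Going the other direction, from <code>&T</code> to <code>&mut T</code>, would be rejected by the compiler.</p>

```rust
fn read(n: &i32) -> i32 {
    *n
}

fn bump_then_read(n: &mut i32) -> i32 {
    *n += 1; // mutate through the unique reference
    let shared: &i32 = &*n; // re-borrow it as a shared reference
    let copy = shared; // shared references can be copied freely
    read(shared) + read(copy)
}

fn main() {
    let mut a = 1;
    assert_eq!(bump_then_read(&mut a), 4);
    assert_eq!(a, 2);
}
```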
<h2 id="thread-safety">Thread safety</h2>
<p>Astute readers might notice that merely cloning an <code>Rc<T></code> mutates a value in
memory, since it modifies the reference count. This could cause a data race
if another clone of the <code>Rc</code> were accessed at the same time on a different
thread! The compiler solves this in typical Rust fashion: By refusing to
compile any program that passes an <code>Rc</code> to a different thread.</p>
<p>Rust has two built-in traits that it uses to mark types that can be accessed
safely by other threads:</p>
<ul>
<li>
<p><strong><code>T: Send</code></strong> means it’s safe to access a value of <code>T</code> on any thread,
if that thread has exclusive access to that value. A value of this type
can be moved to another thread by unique ownership, or borrowed on another
thread by unique reference (<code>&mut T</code>). A more descriptive name for this
trait might be <strong><code>UniqueThreadSafe</code></strong>.</p>
</li>
<li>
<p><strong><code>T: Sync</code></strong> means it’s safe for many threads to access a <code>T</code>
simultaneously, with each thread having shared access.
Values of such types can be accessed on other threads via shared ownership
or shared references (<code>&T</code>). A more descriptive name would be
<strong><code>SharedThreadSafe</code></strong>.</p>
</li>
</ul>
<p><code>Rc<T></code> implements neither of these traits, so an <code>Rc<T></code> cannot be moved or
borrowed into a variable on a different thread. It is forever trapped on the
thread where it was born.</p>
<p>The standard library also offers an <code>Arc<T></code> type, which is exactly like
<code>Rc<T></code> except that it implements <code>Send</code> and <code>Sync</code> (when <code>T</code> itself is both), and uses atomic operations to
synchronize access to its reference counts. This can make <code>Arc<T></code> a little
more expensive at run time, but it allows multiple threads to share a value
safely.</p>
<p>These traits are not mutually exclusive. Many types are both <code>Send</code> and
<code>Sync</code>, meaning that it’s safe to give unique access to one other thread (for
example, moving the value itself or sending an <code>&mut T</code> reference) <em>or</em> shared
access to many threads (for example, sending multiple <code>Arc<T></code> or <code>&T</code>).</p>
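<p>As a rough sketch of shared access across threads, the hypothetical function below gives each worker thread its own <code>Arc</code> pointing at the same vector. Replacing <code>Arc</code> with <code>Rc</code> here should make the program fail to compile, because <code>Rc</code> is not <code>Send</code>.</p>

```rust
use std::sync::Arc;
use std::thread;

// Each worker reads the same shared vector and returns its sum.
fn sum_on_threads(data: Vec<i32>, workers: usize) -> i32 {
    let shared = Arc::new(data);
    let handles: Vec<_> = (0..workers)
        .map(|_| {
            let local = Arc::clone(&shared); // bump the atomic reference count
            thread::spawn(move || local.iter().sum::<i32>())
        })
        .collect();
    handles.into_iter().map(|h| h.join().unwrap()).sum()
}

fn main() {
    // Four workers each compute 1 + 2 + 3.
    assert_eq!(sum_on_threads(vec![1, 2, 3], 4), 24);
}
```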
<h2 id="shared-mutability">Shared mutability</h2>
<p>So far, we’ve seen that sharing is safe when values are not mutated, and
mutation is safe when values are not shared. But what if we want to share
<em>and</em> mutate a value? The Rust standard library provides several different
mechanisms for <strong>shared mutability</strong>.</p>
<p>The official documentation also calls this “interior mutability” because it
lets you mutate a value that is “inside” of an immutable value. This
terminology can be confusing: What does it mean for the exterior to be
“immutable” if its interior is mutable? I prefer “shared mutability” which
puts the spotlight on a different question: How can you safely mutate a value
while it is shared?</p>
<h3 id="what-could-go-wrong">What could go wrong?</h3>
<p>What’s the big deal about shared mutation? Let’s start by listing some of the
ways it could go wrong:</p>
<p>First, mutating a value can cause <strong>pointer invalidation</strong>. For example,
pushing to a vector might cause it to reallocate its buffer. If there are
other variables that contained addresses of items in the buffer, they would
now point to deallocated memory. Or, mutating an enum might overwrite a
value of one type with a value of a different type. A pointer to the old
value will now be pointing at memory occupied by the wrong type. Either of
these cases would trigger undefined behavior.</p>
<p>Second, it could violate <strong>aliasing assumptions</strong>. For example, the optimizing
compiler assumes by default that the referent of an <code>&T</code> will not
change while the reference exists. It might re-order code based on this
assumption, leading to undefined behavior when the assumption is violated.</p>
<p>Third, if one thread mutates a value at the same time that another thread is
accessing it, this causes a <strong>data race</strong> unless both threads use
<a href="https://doc.rust-lang.org/std/sync/">synchronization</a> primitives to prevent their operations from overlapping.
Data races can cause arbitrary undefined behavior (in part because data races
can also violate assumptions made by the optimizer during code generation).</p>
<h3 id="unsafecell">UnsafeCell</h3>
<p>To fix the problem of aliasing assumptions, we need <a href="https://doc.rust-lang.org/std/cell/struct.UnsafeCell.html"><code>UnsafeCell<T></code></a>. The
compiler knows about this type and treats it specially: It tells the optimizer
that the value inside an <code>UnsafeCell</code> is not subject to the usual restrictions
on aliasing.</p>
<p>Safe Rust code doesn’t use <code>UnsafeCell</code> directly. Instead, it’s used by
libraries (including the standard library) that provide APIs for <em>safe</em> shared
mutability. All of the shared mutable types discussed in the following
sections use <code>UnsafeCell</code> internally.</p>
<p><code>UnsafeCell</code> alone solves only one of the three problems listed above (the
compiler’s aliasing assumptions). To provide safe shared mutation, we will
also need to solve the other two problems: pointer invalidation and data
races.</p>
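<p>To show how a library might build on <code>UnsafeCell</code>, here is a simplified, hypothetical <code>MyCell</code> type. It is a sketch in the spirit of the standard <code>Cell</code>, not the standard library's actual implementation. It sidesteps pointer invalidation by never handing out references to its contents, and it sidesteps data races because <code>UnsafeCell</code> is not <code>Sync</code>, so neither is <code>MyCell</code>.</p>

```rust
use std::cell::UnsafeCell;

// Hypothetical, simplified cell: values go in and out by copy only.
struct MyCell<T> {
    value: UnsafeCell<T>,
}

impl<T: Copy> MyCell<T> {
    fn new(value: T) -> Self {
        MyCell { value: UnsafeCell::new(value) }
    }

    fn get(&self) -> T {
        // Sound under this sketch's assumptions: no references to the
        // contents ever escape, and the type cannot be shared across threads.
        unsafe { *self.value.get() }
    }

    fn set(&self, value: T) {
        unsafe { *self.value.get() = value; }
    }
}

fn main() {
    let cell = MyCell::new(1);
    let a = &cell;
    let b = &cell;
    a.set(5); // mutation through a shared reference
    assert_eq!(b.get(), 5);
}
```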
<h3 id="multi-threaded-shared-mutability">Multi-threaded shared mutability</h3>
<p>Rust programs can safely mutate a value that’s shared across threads, as long
as the basic rules of unique and shared access are enforced: Only one thread
at a time may have unique access to a value, and only this thread can mutate
it. When no thread has unique access, then many threads may have shared
access, but the value can’t be mutated while they do.</p>
<p>Rust has two main types that allow thread-safe shared mutation:</p>
<ul>
<li>
<p><strong><code>Mutex<T></code></strong> allows one thread at a time to “lock” a mutex and get unique
access to its contents. If a second thread tries to lock the mutex, it will
block until the first thread unlocks it. Since <code>Mutex</code> provides access to
only one thread at a time, it can be used to share any type that implements
the <code>Send</code> (“unique thread-safe”) trait.</p>
</li>
<li>
<p><strong><code>RwLock<T></code></strong> is similar but has two different types of lock: A “write”
lock that provides unique access, and a “read” lock that provides shared
access. It will allow many threads to hold read locks at the same time, but
only one thread can hold a write lock. If one thread tries to write while
other threads are reading (or vice-versa), it will block until the other
threads release their locks. Since <code>RwLock</code> provides both unique and shared
access, its contents must implement both <code>Send</code> (“unique thread-safe”) and
<code>Sync</code> (“shared thread-safe”).</p>
</li>
</ul>
<p>These types prevent pointer invalidation by using run-time checks to enforce
the rules of unique and shared borrowing. They prevent data races by using
synchronization primitives provided by the platform’s native threading system.</p>
<p>In addition, various <strong><a href="https://doc.rust-lang.org/std/sync/atomic/">atomic types</a></strong> allow safe shared mutation of
individual primitive values. These prevent data races by using compiler
intrinsics that provide synchronized operations, and they prevent pointer
invalidation by refusing to give out references to their contents; you can
only read from them or write to them by value.</p>
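<p>For example, a shared counter could be sketched with <code>AtomicUsize</code>. The function name and the choice of <code>Ordering::Relaxed</code> are assumptions made for this illustration; the threads here are spawned with <code>std::thread::scope</code> so they can borrow the counter directly.</p>

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::thread;

fn count_hits(threads: usize, hits_per_thread: usize) -> usize {
    let counter = AtomicUsize::new(0);
    thread::scope(|s| {
        for _ in 0..threads {
            s.spawn(|| {
                for _ in 0..hits_per_thread {
                    // Read-modify-write by value; no reference to the
                    // inner integer is ever handed out.
                    counter.fetch_add(1, Ordering::Relaxed);
                }
            });
        }
    });
    // All scoped threads have finished by this point.
    counter.load(Ordering::Relaxed)
}

fn main() {
    assert_eq!(count_hits(4, 1000), 4000);
}
```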
<p>All these types are only useful when shared by multiple threads, so they are
often used in combination with <code>Arc</code>. Because <code>Arc</code> lets multiple threads
share ownership of a value, it works with threads that might outlive the
function that spawns them (and therefore can’t borrow references from it).
However, <a href="https://doc.rust-lang.org/std/thread/fn.scope.html">scoped threads</a> are guaranteed to terminate before their spawning
function, so they can use shared borrowing (<code>&Mutex<T></code>) instead of
shared ownership (<code>Arc<Mutex<T>></code>).</p>
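<p>The two approaches can be compared side by side. This is only a sketch with made-up function names: the first version shares ownership of the mutex through <code>Arc</code>, and the second borrows it from scoped threads.</p>

```rust
use std::sync::{Arc, Mutex};
use std::thread;

// Shared ownership: spawned threads may outlive this function's locals,
// so each thread gets its own Arc.
fn push_with_arc(n: usize) -> Vec<usize> {
    let list = Arc::new(Mutex::new(Vec::<usize>::new()));
    let handles: Vec<_> = (0..n)
        .map(|i| {
            let list = Arc::clone(&list);
            thread::spawn(move || {
                list.lock().unwrap().push(i);
            })
        })
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
    let mut result = list.lock().unwrap().clone();
    result.sort(); // thread scheduling order is not deterministic
    result
}

// Shared borrowing: scoped threads finish before `scope` returns,
// so they can borrow the Mutex by shared reference.
fn push_with_scope(n: usize) -> Vec<usize> {
    let list = Mutex::new(Vec::<usize>::new());
    thread::scope(|s| {
        for i in 0..n {
            let list = &list;
            s.spawn(move || {
                list.lock().unwrap().push(i);
            });
        }
    });
    let mut result = list.into_inner().unwrap();
    result.sort();
    result
}

fn main() {
    assert_eq!(push_with_arc(4), vec![0, 1, 2, 3]);
    assert_eq!(push_with_scope(4), vec![0, 1, 2, 3]);
}
```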
<h3 id="single-threaded-shared-mutability">Single-threaded shared mutability</h3>
<p>The standard library also has two types that allow safe shared mutation
within a single thread. These types don’t implement the <code>Sync</code> trait, so the
compiler won’t let you share them across multiple threads. This neatly avoids
data races, and also means that these types don’t need atomic operations
(which are potentially expensive).</p>
<ul>
<li>
<p><strong><code>Cell<T></code></strong> solves the problem of pointer invalidation by forbidding
pointers to its contents. Like the atomic types mentioned above, you can only
read from it or write to it by value. Changing the data “inside” of the
<code>Cell<T></code> is okay, because there are no shared pointers to that data – only to
the <code>Cell<T></code> itself, whose type and address do not change when you mutate its
interior. (Now we see why “interior mutability” is also a useful concept.)</p>
</li>
<li>
<p>Many Rust types are useless without references, so Cell is often too
restrictive. <strong><code>RefCell<T></code></strong> allows you to borrow either unique or shared
references to its contents, but it keeps count of how many borrowers are alive
at a time. Like <code>RwLock</code>, it allows one unique reference or many shared
references, but not both at once. It enforces this rule using run-time
checks. (But since it’s used within a single thread, it can’t block the
thread while waiting for other borrowers to finish. Instead, it panics
if a program violates its borrowing rules.)</p>
</li>
</ul>
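<p>A short sketch of both types follows (function names are illustrative). It uses <code>try_borrow_mut</code>, which reports a conflicting borrow as an error, where <code>borrow_mut</code> would panic.</p>

```rust
use std::cell::{Cell, RefCell};

// Cell: two shared references, both able to mutate by value.
fn cell_demo() -> i32 {
    let c = Cell::new(1);
    let a = &c;
    let b = &c;
    a.set(a.get() + 1);
    b.set(b.get() * 10);
    c.get()
}

// RefCell: borrowing rules are checked at run time.
fn refcell_demo() -> (bool, bool) {
    let r = RefCell::new(vec![1, 2, 3]);
    let reader = r.borrow(); // a shared borrow is now active
    let write_while_reading = r.try_borrow_mut().is_ok();
    drop(reader); // the shared borrow ends here
    let write_after = r.try_borrow_mut().is_ok();
    (write_while_reading, write_after)
}

fn main() {
    assert_eq!(cell_demo(), 20);
    assert_eq!(refcell_demo(), (false, true));
}
```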
<p>These types are often used in combination with <code>Rc<T></code>, so that a value shared
by multiple owners can still be mutated safely. They may also be used for
mutating values behind shared references. The <a href="https://doc.rust-lang.org/std/cell/"><code>std::cell</code></a> docs have some
examples.</p>
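<p>A minimal sketch of the combination, assuming a hypothetical log shared by two owners within one thread:</p>

```rust
use std::cell::RefCell;
use std::rc::Rc;

fn shared_log() -> Vec<String> {
    let log = Rc::new(RefCell::new(Vec::<String>::new()));
    let writer_a = Rc::clone(&log);
    let writer_b = Rc::clone(&log);

    // Each owner takes a short-lived unique borrow to append an entry.
    writer_a.borrow_mut().push(String::from("from a"));
    writer_b.borrow_mut().push(String::from("from b"));

    // A shared borrow is enough to read the result.
    let snapshot = log.borrow().clone();
    snapshot
}

fn main() {
    assert_eq!(shared_log(), vec!["from a", "from b"]);
}
```

<p>Each <code>borrow_mut</code> guard is dropped at the end of its statement, so the two writers never hold unique access at the same time.</p>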
<h2 id="summary">Summary</h2>
<p>To summarize some key ideas:</p>
<ul>
<li>Rust has two types of references: unique and shared.</li>
<li>Unique mutable access is easy.</li>
<li>Shared immutable access is easy.</li>
<li>Shared mutable access is hard.</li>
<li>This is true for both single-threaded and multi-threaded programs.</li>
</ul>
<p>We also saw a couple of ways to classify Rust types. Here’s a table showing
some of the most common types according to this classification scheme:</p>
<table class="data">
<tr>
<td></td>
<th>Unique</th>
<th>Shared</th></tr>
<tr>
<th>Borrowed</th>
<td><code>&mut T</code></td>
<td><code>&T</code></td>
</tr>
<tr>
<th>Owned</th>
<td><code>T</code>, <code>Box<T></code></td>
<td><code>Rc<T></code>, <code>Arc<T></code></td>
</tr>
</table>
<p>I hope that thinking of these types in terms of uniqueness and sharing will
help you understand how and why they work, as it helped me.</p>
<h2 id="want-to-know-more">Want to know more?</h2>
<p>As I said at the start, this is just a quick introduction and glosses over
many details. The exact rules about unique and shared access in Rust are
still being worked out. The <a href="https://doc.rust-lang.org/nomicon/aliasing.html">Aliasing</a> chapter of the Rustonomicon explains
more, and Ralf Jung’s <a href="https://www.ralfj.de/blog/2018/11/16/stacked-borrows-implementation.html">Stacked Borrows</a> model is the start of a more complete
and formal definition of the rules.</p>
<p>If you want to know more about how shared mutability can lead to
memory-unsafety, read <a href="https://manishearth.github.io/blog/2015/05/17/the-problem-with-shared-mutability/">The Problem With Single-threaded Shared Mutability</a> by
Manish Goregaokar.</p>
<p>The Swift language has an approach to memory safety that is similar in some
ways, though its exact mechanisms are different. You might be interested in
its recently-introduced <a href="https://swift.org/blog/swift-5-exclusivity/">Exclusivity Enforcement</a> feature, and the <a href="https://github.com/apple/swift/blob/fa952d398611e9a2b97531e2ac3efb6c36e9ba98/docs/OwnershipManifesto.md">Ownership
Manifesto</a> that originally described its design and rationale.</p>
<h1>Let's build a browser engine! Part 7: Painting 101</h1>
<p>Matt Brubeck, 2014-11-05</p>
<p>I’m returning at last to my series on building a simple HTML rendering engine:</p>
<blockquote>
<ul>
<li>
<a href="/mbrubeck/2014/08/08/toy-layout-engine-1.html">Part 1: Getting started</a>
</li>
<li>
<a href="/mbrubeck/2014/08/11/toy-layout-engine-2.html">Part 2: HTML</a>
</li>
<li>
<a href="/mbrubeck/2014/08/13/toy-layout-engine-3-css.html">Part 3: CSS</a>
</li>
<li>
<a href="/mbrubeck/2014/08/23/toy-layout-engine-4-style.html">Part 4: Style</a>
</li>
<li>
<a href="/mbrubeck/2014/09/08/toy-layout-engine-5-boxes.html">Part 5: Boxes</a>
</li>
<li>
<a href="/mbrubeck/2014/09/17/toy-layout-engine-6-block.html">Part 6: Block layout</a>
</li>
<li>
<b>Part 7: Painting 101</b>
</li>
</ul>
</blockquote>
<p>In this article, I will add very basic <a href="https://github.com/mbrubeck/robinson/blob/master/src/painting.rs">painting code</a>. This
code takes the tree of boxes from the layout module and turns them into an
array of pixels. This process is also known as “rasterization.”</p>
<p><img src="/mbrubeck/images/2014/pipeline.svg" style="width: 720px" /></p>
<p>Browsers usually implement rasterization with the help of graphics APIs and
libraries like Skia, Cairo, Direct2D, and so on. These APIs provide functions
for painting polygons, lines, curves, gradients, and text. For now, I’m going
to write my own rasterizer that can only paint one thing: rectangles.</p>
<p>Eventually I want to implement text rendering. At that point, I may
throw away this toy painting code and switch to a “real” 2D graphics library.
But for now, rectangles are sufficient to turn the output of my block layout
algorithm into pictures.</p>
<h2 id="catching-up">Catching Up</h2>
<p>Since my last post, I’ve made some small changes to the code from previous
articles. These include some minor refactoring, and some updates to keep the
code compatible with the latest Rust nightly builds. None of these changes
are vital to understanding the code, but if you’re curious, check the <a href="https://github.com/mbrubeck/robinson/commits/master">commit
history</a>.</p>
<h2 id="building-the-display-list">Building the Display List</h2>
<p>Before painting, we will walk through the layout tree and build a <a href="https://en.wikipedia.org/wiki/Display_list">display
list</a>. This is a list of graphics operations like “draw a circle” or
“draw a string of text.” Or in our case, just “draw a rectangle.”</p>
<p>Why put commands into a display list, rather than execute them immediately?
The display list is useful for several reasons. You can search it for items
that will be completely covered up by later operations, and remove them to
eliminate wasted painting. You can modify and re-use the display list in
cases where you know only certain items have changed. And you can use the
same display list to generate different types of output: for example, pixels
for displaying on a screen, or vector graphics for sending to a printer.</p>
<p>Robinson’s display list is a vector of DisplayCommands. For now there is only
one type of DisplayCommand, a solid-color rectangle:</p>
<figure class="highlight"><pre><code class="language-rust" data-lang="rust"><span class="k">type</span> <span class="n">DisplayList</span> <span class="o">=</span> <span class="nb">Vec</span><span class="o"><</span><span class="n">DisplayCommand</span><span class="o">></span><span class="p">;</span>
<span class="k">enum</span> <span class="n">DisplayCommand</span> <span class="p">{</span>
<span class="nf">SolidColor</span><span class="p">(</span><span class="n">Color</span><span class="p">,</span> <span class="n">Rect</span><span class="p">),</span>
<span class="c1">// insert more commands here</span>
<span class="p">}</span></code></pre></figure>
<p>To build the display list, we walk through the layout tree and generate a
series of commands for each box. First we draw the box’s background, then we
draw its borders and content on top of the background.</p>
<figure class="highlight"><pre><code class="language-rust" data-lang="rust"><span class="k">fn</span> <span class="nf">build_display_list</span><span class="p">(</span><span class="n">layout_root</span><span class="p">:</span> <span class="o">&</span><span class="n">LayoutBox</span><span class="p">)</span> <span class="k">-></span> <span class="n">DisplayList</span> <span class="p">{</span>
<span class="k">let</span> <span class="k">mut</span> <span class="n">list</span> <span class="o">=</span> <span class="nn">Vec</span><span class="p">::</span><span class="nf">new</span><span class="p">();</span>
<span class="nf">render_layout_box</span><span class="p">(</span><span class="o">&</span><span class="k">mut</span> <span class="n">list</span><span class="p">,</span> <span class="n">layout_root</span><span class="p">);</span>
<span class="k">return</span> <span class="n">list</span><span class="p">;</span>
<span class="p">}</span>
<span class="k">fn</span> <span class="nf">render_layout_box</span><span class="p">(</span><span class="n">list</span><span class="p">:</span> <span class="o">&</span><span class="k">mut</span> <span class="n">DisplayList</span><span class="p">,</span> <span class="n">layout_box</span><span class="p">:</span> <span class="o">&</span><span class="n">LayoutBox</span><span class="p">)</span> <span class="p">{</span>
<span class="nf">render_background</span><span class="p">(</span><span class="n">list</span><span class="p">,</span> <span class="n">layout_box</span><span class="p">);</span>
<span class="nf">render_borders</span><span class="p">(</span><span class="n">list</span><span class="p">,</span> <span class="n">layout_box</span><span class="p">);</span>
<span class="c1">// TODO: render text</span>
<span class="k">for</span> <span class="n">child</span> <span class="k">in</span> <span class="o">&</span><span class="n">layout_box</span><span class="py">.children</span> <span class="p">{</span>
<span class="nf">render_layout_box</span><span class="p">(</span><span class="n">list</span><span class="p">,</span> <span class="n">child</span><span class="p">);</span>
<span class="p">}</span>
<span class="p">}</span></code></pre></figure>
<p>By default, HTML elements are stacked in the order they appear: If two
elements overlap, the later one is drawn on top of the earlier one. This is
reflected in our display list, which will draw the elements in the same order
they appear in the DOM tree. If this code supported the <a href="http://www.w3.org/TR/CSS2/visuren.html#z-index">z-index</a>
property, then individual elements would be able to override this stacking
order, and we’d need to sort the display list accordingly.</p>
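<p>As a rough illustration of that sorting step, here is a minimal sketch. It assumes a hypothetical <code>StackedItem</code> type that pairs each display item with a <code>z_index</code>; neither the type nor the field exists in robinson, and real CSS stacking contexts are considerably more involved. A stable sort keeps items with equal z-index in document order.</p>

```rust
// Hypothetical type for illustration only; robinson's DisplayCommand
// carries no z-index information.
#[derive(Debug, Clone, PartialEq)]
struct StackedItem {
    z_index: i32,        // assumed: the element's z-index (0 if unset)
    label: &'static str, // stand-in for the actual display command
}

// Sort items back-to-front. `sort_by_key` is a stable sort, so items with
// the same z-index keep their original (document) order.
fn sort_by_stacking_order(list: &mut [StackedItem]) {
    list.sort_by_key(|item| item.z_index);
}
```

<p>Painting the sorted list front to back in index order would then draw higher z-index items on top of lower ones.</p>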
<p>The background is easy. It’s just a solid rectangle. If no background color
is specified, then the background is transparent and we don’t need to generate
a display command.</p>
<figure class="highlight"><pre><code class="language-rust" data-lang="rust"><span class="k">fn</span> <span class="nf">render_background</span><span class="p">(</span><span class="n">list</span><span class="p">:</span> <span class="o">&</span><span class="k">mut</span> <span class="n">DisplayList</span><span class="p">,</span> <span class="n">layout_box</span><span class="p">:</span> <span class="o">&</span><span class="n">LayoutBox</span><span class="p">)</span> <span class="p">{</span>
<span class="nf">get_color</span><span class="p">(</span><span class="n">layout_box</span><span class="p">,</span> <span class="s">"background"</span><span class="p">)</span><span class="nf">.map</span><span class="p">(|</span><span class="n">color</span><span class="p">|</span>
<span class="n">list</span><span class="nf">.push</span><span class="p">(</span><span class="nn">DisplayCommand</span><span class="p">::</span><span class="nf">SolidColor</span><span class="p">(</span><span class="n">color</span><span class="p">,</span> <span class="n">layout_box</span><span class="py">.dimensions</span><span class="nf">.border_box</span><span class="p">())));</span>
<span class="p">}</span>
<span class="c1">// Return the specified color for CSS property `name`, or None if no color was specified.</span>
<span class="k">fn</span> <span class="nf">get_color</span><span class="p">(</span><span class="n">layout_box</span><span class="p">:</span> <span class="o">&</span><span class="n">LayoutBox</span><span class="p">,</span> <span class="n">name</span><span class="p">:</span> <span class="o">&</span><span class="nb">str</span><span class="p">)</span> <span class="k">-></span> <span class="nb">Option</span><span class="o"><</span><span class="n">Color</span><span class="o">></span> <span class="p">{</span>
<span class="k">match</span> <span class="n">layout_box</span><span class="py">.box_type</span> <span class="p">{</span>
<span class="nf">BlockNode</span><span class="p">(</span><span class="n">style</span><span class="p">)</span> <span class="p">|</span> <span class="nf">InlineNode</span><span class="p">(</span><span class="n">style</span><span class="p">)</span> <span class="k">=></span> <span class="k">match</span> <span class="n">style</span><span class="nf">.value</span><span class="p">(</span><span class="n">name</span><span class="p">)</span> <span class="p">{</span>
<span class="nf">Some</span><span class="p">(</span><span class="nn">Value</span><span class="p">::</span><span class="nf">ColorValue</span><span class="p">(</span><span class="n">color</span><span class="p">))</span> <span class="k">=></span> <span class="nf">Some</span><span class="p">(</span><span class="n">color</span><span class="p">),</span>
<span class="n">_</span> <span class="k">=></span> <span class="nb">None</span>
<span class="p">},</span>
<span class="n">AnonymousBlock</span> <span class="k">=></span> <span class="nb">None</span>
<span class="p">}</span>
<span class="p">}</span></code></pre></figure>
<p>The borders are similar, but instead of a single rectangle we draw
four—one for each edge of the box.</p>
<figure class="highlight"><pre><code class="language-rust" data-lang="rust"><span class="k">fn</span> <span class="nf">render_borders</span><span class="p">(</span><span class="n">list</span><span class="p">:</span> <span class="o">&</span><span class="k">mut</span> <span class="n">DisplayList</span><span class="p">,</span> <span class="n">layout_box</span><span class="p">:</span> <span class="o">&</span><span class="n">LayoutBox</span><span class="p">)</span> <span class="p">{</span>
<span class="k">let</span> <span class="n">color</span> <span class="o">=</span> <span class="k">match</span> <span class="nf">get_color</span><span class="p">(</span><span class="n">layout_box</span><span class="p">,</span> <span class="s">"border-color"</span><span class="p">)</span> <span class="p">{</span>
<span class="nf">Some</span><span class="p">(</span><span class="n">color</span><span class="p">)</span> <span class="k">=></span> <span class="n">color</span><span class="p">,</span>
<span class="n">_</span> <span class="k">=></span> <span class="k">return</span> <span class="c1">// bail out if no border-color is specified</span>
<span class="p">};</span>
<span class="k">let</span> <span class="n">d</span> <span class="o">=</span> <span class="o">&</span><span class="n">layout_box</span><span class="py">.dimensions</span><span class="p">;</span>
<span class="k">let</span> <span class="n">border_box</span> <span class="o">=</span> <span class="n">d</span><span class="nf">.border_box</span><span class="p">();</span>
<span class="c1">// Left border</span>
<span class="n">list</span><span class="nf">.push</span><span class="p">(</span><span class="nn">DisplayCommand</span><span class="p">::</span><span class="nf">SolidColor</span><span class="p">(</span><span class="n">color</span><span class="p">,</span> <span class="n">Rect</span> <span class="p">{</span>
<span class="n">x</span><span class="p">:</span> <span class="n">border_box</span><span class="py">.x</span><span class="p">,</span>
<span class="n">y</span><span class="p">:</span> <span class="n">border_box</span><span class="py">.y</span><span class="p">,</span>
<span class="n">width</span><span class="p">:</span> <span class="n">d</span><span class="py">.border.left</span><span class="p">,</span>
<span class="n">height</span><span class="p">:</span> <span class="n">border_box</span><span class="py">.height</span><span class="p">,</span>
<span class="p">}));</span>
<span class="c1">// Right border</span>
<span class="n">list</span><span class="nf">.push</span><span class="p">(</span><span class="nn">DisplayCommand</span><span class="p">::</span><span class="nf">SolidColor</span><span class="p">(</span><span class="n">color</span><span class="p">,</span> <span class="n">Rect</span> <span class="p">{</span>
<span class="n">x</span><span class="p">:</span> <span class="n">border_box</span><span class="py">.x</span> <span class="o">+</span> <span class="n">border_box</span><span class="py">.width</span> <span class="o">-</span> <span class="n">d</span><span class="py">.border.right</span><span class="p">,</span>
<span class="n">y</span><span class="p">:</span> <span class="n">border_box</span><span class="py">.y</span><span class="p">,</span>
<span class="n">width</span><span class="p">:</span> <span class="n">d</span><span class="py">.border.right</span><span class="p">,</span>
<span class="n">height</span><span class="p">:</span> <span class="n">border_box</span><span class="py">.height</span><span class="p">,</span>
<span class="p">}));</span>
<span class="c1">// Top border</span>
<span class="n">list</span><span class="nf">.push</span><span class="p">(</span><span class="nn">DisplayCommand</span><span class="p">::</span><span class="nf">SolidColor</span><span class="p">(</span><span class="n">color</span><span class="p">,</span> <span class="n">Rect</span> <span class="p">{</span>
<span class="n">x</span><span class="p">:</span> <span class="n">border_box</span><span class="py">.x</span><span class="p">,</span>
<span class="n">y</span><span class="p">:</span> <span class="n">border_box</span><span class="py">.y</span><span class="p">,</span>
<span class="n">width</span><span class="p">:</span> <span class="n">border_box</span><span class="py">.width</span><span class="p">,</span>
<span class="n">height</span><span class="p">:</span> <span class="n">d</span><span class="py">.border.top</span><span class="p">,</span>
<span class="p">}));</span>
<span class="c1">// Bottom border</span>
<span class="n">list</span><span class="nf">.push</span><span class="p">(</span><span class="nn">DisplayCommand</span><span class="p">::</span><span class="nf">SolidColor</span><span class="p">(</span><span class="n">color</span><span class="p">,</span> <span class="n">Rect</span> <span class="p">{</span>
<span class="n">x</span><span class="p">:</span> <span class="n">border_box</span><span class="py">.x</span><span class="p">,</span>
<span class="n">y</span><span class="p">:</span> <span class="n">border_box</span><span class="py">.y</span> <span class="o">+</span> <span class="n">border_box</span><span class="py">.height</span> <span class="o">-</span> <span class="n">d</span><span class="py">.border.bottom</span><span class="p">,</span>
<span class="n">width</span><span class="p">:</span> <span class="n">border_box</span><span class="py">.width</span><span class="p">,</span>
<span class="n">height</span><span class="p">:</span> <span class="n">d</span><span class="py">.border.bottom</span><span class="p">,</span>
<span class="p">}));</span>
<span class="p">}</span></code></pre></figure>
<p>Next, the rendering function will draw each of the box’s children, until the
entire layout tree has been translated into display commands.</p>
<h2 id="rasterization">Rasterization</h2>
<p>Now that we’ve built the display list, we need to turn it into pixels by
executing each DisplayCommand. We’ll store the pixels in a Canvas:</p>
<figure class="highlight"><pre><code class="language-rust" data-lang="rust"><span class="k">struct</span> <span class="n">Canvas</span> <span class="p">{</span>
<span class="n">pixels</span><span class="p">:</span> <span class="nb">Vec</span><span class="o"><</span><span class="n">Color</span><span class="o">></span><span class="p">,</span>
<span class="n">width</span><span class="p">:</span> <span class="nb">usize</span><span class="p">,</span>
<span class="n">height</span><span class="p">:</span> <span class="nb">usize</span><span class="p">,</span>
<span class="p">}</span>
<span class="k">impl</span> <span class="n">Canvas</span> <span class="p">{</span>
<span class="c1">// Create a blank canvas</span>
<span class="k">fn</span> <span class="nf">new</span><span class="p">(</span><span class="n">width</span><span class="p">:</span> <span class="nb">usize</span><span class="p">,</span> <span class="n">height</span><span class="p">:</span> <span class="nb">usize</span><span class="p">)</span> <span class="k">-></span> <span class="n">Canvas</span> <span class="p">{</span>
<span class="k">let</span> <span class="n">white</span> <span class="o">=</span> <span class="n">Color</span> <span class="p">{</span> <span class="n">r</span><span class="p">:</span> <span class="mi">255</span><span class="p">,</span> <span class="n">g</span><span class="p">:</span> <span class="mi">255</span><span class="p">,</span> <span class="n">b</span><span class="p">:</span> <span class="mi">255</span><span class="p">,</span> <span class="n">a</span><span class="p">:</span> <span class="mi">255</span> <span class="p">};</span>
<span class="k">return</span> <span class="n">Canvas</span> <span class="p">{</span>
<span class="n">pixels</span><span class="p">:</span> <span class="nf">repeat</span><span class="p">(</span><span class="n">white</span><span class="p">)</span><span class="nf">.take</span><span class="p">(</span><span class="n">width</span> <span class="o">*</span> <span class="n">height</span><span class="p">)</span><span class="nf">.collect</span><span class="p">(),</span>
<span class="n">width</span><span class="p">:</span> <span class="n">width</span><span class="p">,</span>
<span class="n">height</span><span class="p">:</span> <span class="n">height</span><span class="p">,</span>
<span class="p">}</span>
<span class="p">}</span>
<span class="c1">// ...</span>
<span class="p">}</span></code></pre></figure>
<p>To paint a rectangle on the canvas, we just loop through its rows and columns,
using a <a href="https://github.com/mbrubeck/robinson/blob/619a03bea918a0c756655fae02a004e6b4a3974c/src/painting.rs#L133-L135">helper method</a> to make sure we don’t go outside the bounds of
our canvas.</p>
<figure class="highlight"><pre><code class="language-rust" data-lang="rust"><span class="k">fn</span> <span class="nf">paint_item</span><span class="p">(</span><span class="o">&</span><span class="k">mut</span> <span class="k">self</span><span class="p">,</span> <span class="n">item</span><span class="p">:</span> <span class="o">&</span><span class="n">DisplayCommand</span><span class="p">)</span> <span class="p">{</span>
<span class="k">match</span> <span class="n">item</span> <span class="p">{</span>
<span class="o">&</span><span class="nn">DisplayCommand</span><span class="p">::</span><span class="nf">SolidColor</span><span class="p">(</span><span class="n">color</span><span class="p">,</span> <span class="n">rect</span><span class="p">)</span> <span class="k">=></span> <span class="p">{</span>
<span class="c1">// Clip the rectangle to the canvas boundaries.</span>
<span class="k">let</span> <span class="n">x0</span> <span class="o">=</span> <span class="n">rect</span><span class="py">.x</span><span class="nf">.clamp</span><span class="p">(</span><span class="mf">0.0</span><span class="p">,</span> <span class="k">self</span><span class="py">.width</span> <span class="k">as</span> <span class="nb">f32</span><span class="p">)</span> <span class="k">as</span> <span class="nb">usize</span><span class="p">;</span>
<span class="k">let</span> <span class="n">y0</span> <span class="o">=</span> <span class="n">rect</span><span class="py">.y</span><span class="nf">.clamp</span><span class="p">(</span><span class="mf">0.0</span><span class="p">,</span> <span class="k">self</span><span class="py">.height</span> <span class="k">as</span> <span class="nb">f32</span><span class="p">)</span> <span class="k">as</span> <span class="nb">usize</span><span class="p">;</span>
<span class="k">let</span> <span class="n">x1</span> <span class="o">=</span> <span class="p">(</span><span class="n">rect</span><span class="py">.x</span> <span class="o">+</span> <span class="n">rect</span><span class="py">.width</span><span class="p">)</span><span class="nf">.clamp</span><span class="p">(</span><span class="mf">0.0</span><span class="p">,</span> <span class="k">self</span><span class="py">.width</span> <span class="k">as</span> <span class="nb">f32</span><span class="p">)</span> <span class="k">as</span> <span class="nb">usize</span><span class="p">;</span>
<span class="k">let</span> <span class="n">y1</span> <span class="o">=</span> <span class="p">(</span><span class="n">rect</span><span class="py">.y</span> <span class="o">+</span> <span class="n">rect</span><span class="py">.height</span><span class="p">)</span><span class="nf">.clamp</span><span class="p">(</span><span class="mf">0.0</span><span class="p">,</span> <span class="k">self</span><span class="py">.height</span> <span class="k">as</span> <span class="nb">f32</span><span class="p">)</span> <span class="k">as</span> <span class="nb">usize</span><span class="p">;</span>
<span class="k">for</span> <span class="n">y</span> <span class="k">in</span> <span class="n">y0</span> <span class="o">..</span> <span class="n">y1</span> <span class="p">{</span>
<span class="k">for</span> <span class="n">x</span> <span class="k">in</span> <span class="n">x0</span> <span class="o">..</span> <span class="n">x1</span> <span class="p">{</span>
<span class="c1">// TODO: alpha compositing with existing pixel</span>
<span class="k">self</span><span class="py">.pixels</span><span class="p">[</span><span class="n">x</span> <span class="o">+</span> <span class="n">y</span> <span class="o">*</span> <span class="k">self</span><span class="py">.width</span><span class="p">]</span> <span class="o">=</span> <span class="n">color</span><span class="p">;</span>
<span class="p">}</span>
<span class="p">}</span>
<span class="p">}</span>
<span class="p">}</span>
<span class="p">}</span></code></pre></figure>
<p>Note that this code only works with opaque colors. If we added transparency
(by reading the <code>opacity</code> property, or adding support for <code>rgba()</code> values
in the CSS parser) then it would need to <a href="https://en.wikipedia.org/wiki/Alpha_compositing">blend</a> each new pixel with
whatever it’s drawn on top of.</p>
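<p>A minimal sketch of what that blending could look like, assuming 8-bit channels and a destination pixel that is always fully opaque (true for a canvas that starts out white). The <code>Color</code> struct is redeclared here so the example stands alone, and <code>blend_channel</code> and <code>blend_over</code> are hypothetical helpers, not part of robinson.</p>

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
struct Color {
    r: u8,
    g: u8,
    b: u8,
    a: u8,
}

// Blend one 8-bit channel: `src` weighted by alpha, `dst` by (255 - alpha).
fn blend_channel(src: u8, dst: u8, alpha: u8) -> u8 {
    let a = alpha as u32;
    // Add 127 before dividing so the result is rounded to nearest.
    ((src as u32 * a + dst as u32 * (255 - a) + 127) / 255) as u8
}

// "Source over" compositing of `src` on top of an opaque `dst` pixel.
fn blend_over(src: Color, dst: Color) -> Color {
    Color {
        r: blend_channel(src.r, dst.r, src.a),
        g: blend_channel(src.g, dst.g, src.a),
        b: blend_channel(src.b, dst.b, src.a),
        a: 255, // assumes the destination is opaque, so the result is too
    }
}
```

<p>With something like this in place, the inner loop of <code>paint_item</code> would store the blended value of the new color and the existing pixel instead of overwriting the pixel.</p>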
<p>Now we can put everything together in the <code>paint</code> function, which builds a
display list and then rasterizes it to a canvas:</p>
<figure class="highlight"><pre><code class="language-rust" data-lang="rust"><span class="c1">// Paint a tree of LayoutBoxes to an array of pixels.</span>
<span class="k">fn</span> <span class="nf">paint</span><span class="p">(</span><span class="n">layout_root</span><span class="p">:</span> <span class="o">&</span><span class="n">LayoutBox</span><span class="p">,</span> <span class="n">bounds</span><span class="p">:</span> <span class="n">Rect</span><span class="p">)</span> <span class="k">-></span> <span class="n">Canvas</span> <span class="p">{</span>
<span class="k">let</span> <span class="n">display_list</span> <span class="o">=</span> <span class="nf">build_display_list</span><span class="p">(</span><span class="n">layout_root</span><span class="p">);</span>
<span class="k">let</span> <span class="k">mut</span> <span class="n">canvas</span> <span class="o">=</span> <span class="nn">Canvas</span><span class="p">::</span><span class="nf">new</span><span class="p">(</span><span class="n">bounds</span><span class="py">.width</span> <span class="k">as</span> <span class="nb">usize</span><span class="p">,</span> <span class="n">bounds</span><span class="py">.height</span> <span class="k">as</span> <span class="nb">usize</span><span class="p">);</span>
<span class="k">for</span> <span class="n">item</span> <span class="k">in</span> <span class="n">display_list</span> <span class="p">{</span>
<span class="n">canvas</span><span class="nf">.paint_item</span><span class="p">(</span><span class="o">&</span><span class="n">item</span><span class="p">);</span>
<span class="p">}</span>
<span class="k">return</span> <span class="n">canvas</span><span class="p">;</span>
<span class="p">}</span></code></pre></figure>
<p>Lastly, we can write <a href="https://github.com/mbrubeck/robinson/blob/8feb394e9c87663e35a4e8e5040d6e964ffc2396/src/main.rs#L60-L65">a few lines of code</a> using the <a href="https://github.com/PistonDevelopers/image/">Rust Image</a>
library to save the array of pixels as a PNG file.</p>
<h2 id="pretty-pictures">Pretty Pictures</h2>
<p>At last, we’ve reached the end of our rendering pipeline. In under 1000 lines
of code, robinson can now parse this HTML file:</p>
<figure class="highlight"><pre><code class="language-html" data-lang="html"><span class="nt"><div</span> <span class="na">class=</span><span class="s">"a"</span><span class="nt">></span>
<span class="nt"><div</span> <span class="na">class=</span><span class="s">"b"</span><span class="nt">></span>
<span class="nt"><div</span> <span class="na">class=</span><span class="s">"c"</span><span class="nt">></span>
<span class="nt"><div</span> <span class="na">class=</span><span class="s">"d"</span><span class="nt">></span>
<span class="nt"><div</span> <span class="na">class=</span><span class="s">"e"</span><span class="nt">></span>
<span class="nt"><div</span> <span class="na">class=</span><span class="s">"f"</span><span class="nt">></span>
<span class="nt"><div</span> <span class="na">class=</span><span class="s">"g"</span><span class="nt">></span>
<span class="nt"></div></span>
<span class="nt"></div></span>
<span class="nt"></div></span>
<span class="nt"></div></span>
<span class="nt"></div></span>
<span class="nt"></div></span>
<span class="nt"></div></span></code></pre></figure>
<p>…and this CSS file:</p>
<figure class="highlight"><pre><code class="language-css" data-lang="css"><span class="o">*</span> <span class="p">{</span> <span class="nl">display</span><span class="p">:</span> <span class="nb">block</span><span class="p">;</span> <span class="nl">padding</span><span class="p">:</span> <span class="m">12px</span><span class="p">;</span> <span class="p">}</span>
<span class="nc">.a</span> <span class="p">{</span> <span class="nl">background</span><span class="p">:</span> <span class="m">#ff0000</span><span class="p">;</span> <span class="p">}</span>
<span class="nc">.b</span> <span class="p">{</span> <span class="nl">background</span><span class="p">:</span> <span class="m">#ffa500</span><span class="p">;</span> <span class="p">}</span>
<span class="nc">.c</span> <span class="p">{</span> <span class="nl">background</span><span class="p">:</span> <span class="m">#ffff00</span><span class="p">;</span> <span class="p">}</span>
<span class="nc">.d</span> <span class="p">{</span> <span class="nl">background</span><span class="p">:</span> <span class="m">#008000</span><span class="p">;</span> <span class="p">}</span>
<span class="nc">.e</span> <span class="p">{</span> <span class="nl">background</span><span class="p">:</span> <span class="m">#0000ff</span><span class="p">;</span> <span class="p">}</span>
<span class="nc">.f</span> <span class="p">{</span> <span class="nl">background</span><span class="p">:</span> <span class="m">#4b0082</span><span class="p">;</span> <span class="p">}</span>
<span class="nc">.g</span> <span class="p">{</span> <span class="nl">background</span><span class="p">:</span> <span class="m">#800080</span><span class="p">;</span> <span class="p">}</span></code></pre></figure>
<p>…to produce this:</p>
<p><img src="/mbrubeck/images/2014/rainbow.png" /></p>
<p>Yay!</p>
<h2 id="exercises">Exercises</h2>
<p>If you’re playing along at home, here are some things you might want to try:</p>
<ol>
<li>
<p>Write an alternate painting function that takes a display list and produces
vector output (for example, an SVG file) instead of a raster image.</p>
</li>
<li>
<p>Add support for opacity and alpha blending.</p>
</li>
<li>
<p>Write a function to optimize the display list by culling items that are
completely outside of the canvas bounds.</p>
</li>
<li>
<p>If you’re familiar with OpenGL, write a hardware-accelerated painting
function that uses GL shaders to draw the rectangles.</p>
</li>
</ol>
<h2 id="to-be-continued">To Be Continued…</h2>
<p>Now that we’ve got basic functionality for each stage in our rendering
pipeline, it’s time to go back and fill in some of the missing
features—in particular, inline layout and text rendering. Future
articles may also add more stages, like networking and scripting.</p>
<p>I’m going to give a short “Let’s build a browser engine!” talk at this
month’s <a href="http://www.meetup.com/Rust-Bay-Area/events/203495172/">Bay Area Rust Meetup</a>. The meetup is at 7pm tomorrow
(Thursday, November 6) at Mozilla’s San Francisco office, and it will also
feature talks on Servo by my fellow Servo developers. Video of the talks will
be streamed live on <a href="https://air.mozilla.org/bay-area-rust-meetup-november-2014/">Air Mozilla</a>, and recordings will be published
there later.</p>
<h1>Let’s build a browser engine! Part 6: Block layout</h1>
<p>Matt Brubeck, 2014-09-17</p>
<p>Welcome back to my series on building a toy HTML rendering engine:</p>
<blockquote>
<ul>
<li>
<a href="/mbrubeck/2014/08/08/toy-layout-engine-1.html">Part 1: Getting started</a>
</li>
<li>
<a href="/mbrubeck/2014/08/11/toy-layout-engine-2.html">Part 2: HTML</a>
</li>
<li>
<a href="/mbrubeck/2014/08/13/toy-layout-engine-3-css.html">Part 3: CSS</a>
</li>
<li>
<a href="/mbrubeck/2014/08/23/toy-layout-engine-4-style.html">Part 4: Style</a>
</li>
<li>
<a href="/mbrubeck/2014/09/08/toy-layout-engine-5-boxes.html">Part 5: Boxes</a>
</li>
<li>
<b>Part 6: Block layout</b>
</li>
<li>
<a href="/mbrubeck/2014/11/05/toy-layout-engine-7-painting.html">Part 7: Painting 101</a>
</li>
</ul>
</blockquote>
<p>This article will continue the layout module that we started in Part 5. This
time, we’ll add the ability to lay out block boxes. These are boxes that are
stacked vertically, such as headings and paragraphs.</p>
<p>To keep things simple, this code implements only <a href="http://www.w3.org/TR/CSS2/visuren.html#positioning-scheme">normal flow</a>:
no floats, no absolute positioning, and no fixed positioning.</p>
<h2 id="traversing-the-layout-tree">Traversing the Layout Tree</h2>
<p>The entry point to this code is the <code>layout</code> function, which takes a
LayoutBox and calculates its dimensions. We’ll break this function into three
cases, and implement only one of them for now:</p>
<figure class="highlight"><pre><code class="language-rust" data-lang="rust"><span class="k">impl</span> <span class="n">LayoutBox</span> <span class="p">{</span>
<span class="c1">// Lay out a box and its descendants.</span>
<span class="k">fn</span> <span class="nf">layout</span><span class="p">(</span><span class="o">&</span><span class="k">mut</span> <span class="k">self</span><span class="p">,</span> <span class="n">containing_block</span><span class="p">:</span> <span class="n">Dimensions</span><span class="p">)</span> <span class="p">{</span>
<span class="k">match</span> <span class="k">self</span><span class="py">.box_type</span> <span class="p">{</span>
<span class="nf">BlockNode</span><span class="p">(</span><span class="n">_</span><span class="p">)</span> <span class="k">=></span> <span class="k">self</span><span class="nf">.layout_block</span><span class="p">(</span><span class="n">containing_block</span><span class="p">),</span>
<span class="nf">InlineNode</span><span class="p">(</span><span class="n">_</span><span class="p">)</span> <span class="k">=></span> <span class="p">{}</span> <span class="c1">// TODO</span>
<span class="n">AnonymousBlock</span> <span class="k">=></span> <span class="p">{}</span> <span class="c1">// TODO</span>
<span class="p">}</span>
<span class="p">}</span>
<span class="c1">// ...</span>
<span class="p">}</span></code></pre></figure>
<p>A block’s layout depends on the dimensions of its <em>containing block</em>. For
block boxes in normal flow, this is just the box’s parent. For the root
element, it’s the size of the browser window (or <em>“viewport”</em>).</p>
<p>You may remember from the previous article that a block’s width depends on its
parent, while its height depends on its children. This means that our code
needs to traverse the tree <em>top-down</em> while calculating widths, so it can lay
out the children after their parent’s width is known, and traverse <em>bottom-up</em>
to calculate heights, so that a parent’s height is calculated after its
children’s.</p>
<figure class="highlight"><pre><code class="language-rust" data-lang="rust"><span class="k">fn</span> <span class="nf">layout_block</span><span class="p">(</span><span class="o">&</span><span class="k">mut</span> <span class="k">self</span><span class="p">,</span> <span class="n">containing_block</span><span class="p">:</span> <span class="n">Dimensions</span><span class="p">)</span> <span class="p">{</span>
<span class="c1">// Child width can depend on parent width, so we need to calculate</span>
<span class="c1">// this box's width before laying out its children.</span>
<span class="k">self</span><span class="nf">.calculate_block_width</span><span class="p">(</span><span class="n">containing_block</span><span class="p">);</span>
<span class="c1">// Determine where the box is located within its container.</span>
<span class="k">self</span><span class="nf">.calculate_block_position</span><span class="p">(</span><span class="n">containing_block</span><span class="p">);</span>
<span class="c1">// Recursively lay out the children of this box.</span>
<span class="k">self</span><span class="nf">.layout_block_children</span><span class="p">();</span>
<span class="c1">// Parent height can depend on child height, so `calculate_block_height`</span>
<span class="c1">// must be called *after* the children are laid out.</span>
<span class="k">self</span><span class="nf">.calculate_block_height</span><span class="p">();</span>
<span class="p">}</span></code></pre></figure>
<p>This function performs a single traversal of the layout tree, doing width
calculations on the way down and height calculations on the way back up. A
real layout engine might perform several tree traversals, some top-down and
some bottom-up.</p>
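<p>To see the down-then-up pattern in isolation, here is a toy sketch that is independent of robinson. The <code>ToyBox</code> type and its rules (a uniform padding on every side, a fixed content height for leaves) are made up for the example and are not the CSS algorithm described in this series.</p>

```rust
// A toy box. Widths and heights stored here are *content* sizes.
struct ToyBox {
    padding: f32,     // same padding on all four sides (assumed)
    leaf_height: f32, // content height used when there are no children
    width: f32,       // computed on the way down
    height: f32,      // computed on the way back up
    children: Vec<ToyBox>,
}

impl ToyBox {
    fn new(padding: f32, leaf_height: f32, children: Vec<ToyBox>) -> ToyBox {
        ToyBox { padding, leaf_height, width: 0.0, height: 0.0, children }
    }

    // Content height plus top and bottom padding.
    fn outer_height(&self) -> f32 {
        self.height + 2.0 * self.padding
    }

    fn layout(&mut self, containing_width: f32) {
        // Top-down: our width depends only on the container's width.
        self.width = containing_width - 2.0 * self.padding;

        // Recurse, passing the freshly computed width to each child.
        // Children stack vertically, so their outer heights add up.
        let mut children_height = 0.0;
        for child in &mut self.children {
            child.layout(self.width);
            children_height += child.outer_height();
        }

        // Bottom-up: our height is known only after the children are done.
        self.height = if self.children.is_empty() {
            self.leaf_height
        } else {
            children_height
        };
    }
}
```

<p>Calling <code>layout</code> once on the root fills in every width before descending and every height while returning, which is the same shape as <code>layout_block</code> above.</p>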
<h2 id="calculating-the-width">Calculating the Width</h2>
<p>The width calculation is the first step in the block layout function, and also
the most complicated. I’ll walk through it step by step. To start, we need
the values of the CSS <code>width</code> property and all the left and right edge sizes:</p>
<figure class="highlight"><pre><code class="language-rust" data-lang="rust"><span class="k">fn</span> <span class="nf">calculate_block_width</span><span class="p">(</span><span class="o">&</span><span class="k">mut</span> <span class="k">self</span><span class="p">,</span> <span class="n">containing_block</span><span class="p">:</span> <span class="n">Dimensions</span><span class="p">)</span> <span class="p">{</span>
<span class="k">let</span> <span class="n">style</span> <span class="o">=</span> <span class="k">self</span><span class="nf">.get_style_node</span><span class="p">();</span>
<span class="c1">// `width` has initial value `auto`.</span>
<span class="k">let</span> <span class="n">auto</span> <span class="o">=</span> <span class="nf">Keyword</span><span class="p">(</span><span class="s">"auto"</span><span class="nf">.to_string</span><span class="p">());</span>
<span class="k">let</span> <span class="k">mut</span> <span class="n">width</span> <span class="o">=</span> <span class="n">style</span><span class="nf">.value</span><span class="p">(</span><span class="s">"width"</span><span class="p">)</span><span class="nf">.unwrap_or</span><span class="p">(</span><span class="n">auto</span><span class="nf">.clone</span><span class="p">());</span>
<span class="c1">// margin, border, and padding have initial value 0.</span>
<span class="k">let</span> <span class="n">zero</span> <span class="o">=</span> <span class="nf">Length</span><span class="p">(</span><span class="mf">0.0</span><span class="p">,</span> <span class="n">Px</span><span class="p">);</span>
<span class="k">let</span> <span class="k">mut</span> <span class="n">margin_left</span> <span class="o">=</span> <span class="n">style</span><span class="nf">.lookup</span><span class="p">(</span><span class="s">"margin-left"</span><span class="p">,</span> <span class="s">"margin"</span><span class="p">,</span> <span class="o">&</span><span class="n">zero</span><span class="p">);</span>
<span class="k">let</span> <span class="k">mut</span> <span class="n">margin_right</span> <span class="o">=</span> <span class="n">style</span><span class="nf">.lookup</span><span class="p">(</span><span class="s">"margin-right"</span><span class="p">,</span> <span class="s">"margin"</span><span class="p">,</span> <span class="o">&</span><span class="n">zero</span><span class="p">);</span>
<span class="k">let</span> <span class="n">border_left</span> <span class="o">=</span> <span class="n">style</span><span class="nf">.lookup</span><span class="p">(</span><span class="s">"border-left-width"</span><span class="p">,</span> <span class="s">"border-width"</span><span class="p">,</span> <span class="o">&</span><span class="n">zero</span><span class="p">);</span>
<span class="k">let</span> <span class="n">border_right</span> <span class="o">=</span> <span class="n">style</span><span class="nf">.lookup</span><span class="p">(</span><span class="s">"border-right-width"</span><span class="p">,</span> <span class="s">"border-width"</span><span class="p">,</span> <span class="o">&</span><span class="n">zero</span><span class="p">);</span>
<span class="k">let</span> <span class="n">padding_left</span> <span class="o">=</span> <span class="n">style</span><span class="nf">.lookup</span><span class="p">(</span><span class="s">"padding-left"</span><span class="p">,</span> <span class="s">"padding"</span><span class="p">,</span> <span class="o">&</span><span class="n">zero</span><span class="p">);</span>
<span class="k">let</span> <span class="n">padding_right</span> <span class="o">=</span> <span class="n">style</span><span class="nf">.lookup</span><span class="p">(</span><span class="s">"padding-right"</span><span class="p">,</span> <span class="s">"padding"</span><span class="p">,</span> <span class="o">&</span><span class="n">zero</span><span class="p">);</span>
<span class="c1">// ...</span>
<span class="p">}</span></code></pre></figure>
<p>This uses a helper function called <a href="https://github.com/mbrubeck/robinson/blob/275ea716d50565b10ce91c0054fbf527281180bb/src/style.rs#L33-L38"><code>lookup</code></a>, which just tries a
series of values in sequence. If the first property isn’t set, it tries the
second one. If that’s not set either, it returns the given default value.
This provides an incomplete (but simple) implementation of <a href="http://www.w3.org/TR/CSS2/about.html#shorthand">shorthand
properties</a> and initial values.</p>
<blockquote>
<p><strong>Note</strong>: This is similar to the following code in, say, JavaScript or Ruby:</p>
</blockquote>
<figure class="highlight"><pre><code class="language-javascript" data-lang="javascript"><span class="nx">margin_left</span> <span class="o">=</span> <span class="nx">style</span><span class="p">[</span><span class="dl">"</span><span class="s2">margin-left</span><span class="dl">"</span><span class="p">]</span> <span class="o">||</span> <span class="nx">style</span><span class="p">[</span><span class="dl">"</span><span class="s2">margin</span><span class="dl">"</span><span class="p">]</span> <span class="o">||</span> <span class="nx">zero</span><span class="p">;</span></code></pre></figure>
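<p>For comparison, here is a minimal Rust sketch of the same fallback chain. The <code>Value</code> and <code>StyledNode</code> types below are simplified stand-ins assumed for illustration; the real definitions in robinson have more variants and fields, and the linked <code>lookup</code> may be written differently.</p>

```rust
use std::collections::HashMap;

// Simplified stand-ins for the article's types (assumptions for illustration).
#[derive(Clone, Debug, PartialEq)]
enum Value {
    Keyword(String),
    Length(f32), // pixels only, for brevity
}

struct StyledNode {
    specified_values: HashMap<String, Value>,
}

impl StyledNode {
    // Return the specified value of a property, if it was set.
    fn value(&self, name: &str) -> Option<Value> {
        self.specified_values.get(name).cloned()
    }

    // Try `name` first, then `fallback_name`, and finally fall back to `default`.
    fn lookup(&self, name: &str, fallback_name: &str, default: &Value) -> Value {
        self.value(name)
            .or_else(|| self.value(fallback_name))
            .unwrap_or_else(|| default.clone())
    }
}
```

<p>With only <code>margin</code> set on a node, <code>lookup("margin-left", "margin", &zero)</code> would return the <code>margin</code> value, and a lookup for an unset padding property would return <code>zero</code>.</p>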
<p>Since a child can’t change its parent’s width, it needs to make sure its own
width fits the parent’s. The CSS spec expresses this as a set of
<a href="http://www.w3.org/TR/CSS2/visudet.html#blockwidth">constraints</a> and an algorithm for solving them. The
following code implements that algorithm.</p>
<p>First we add up the margin, padding, border, and content widths. The
<a href="https://github.com/mbrubeck/robinson/blob/275ea716d50565b10ce91c0054fbf527281180bb/src/css.rs#L75-L81"><code>to_px</code></a> helper method converts lengths to their numerical values. If
a property is set to <code>'auto'</code>, it returns 0 so it doesn’t affect the sum.</p>
<figure class="highlight"><pre><code class="language-rust" data-lang="rust"><span class="k">let</span> <span class="n">total</span><span class="p">:</span> <span class="nb">f32</span> <span class="o">=</span> <span class="p">[</span><span class="o">&</span><span class="n">margin_left</span><span class="p">,</span> <span class="o">&</span><span class="n">margin_right</span><span class="p">,</span> <span class="o">&</span><span class="n">border_left</span><span class="p">,</span> <span class="o">&</span><span class="n">border_right</span><span class="p">,</span>
<span class="o">&</span><span class="n">padding_left</span><span class="p">,</span> <span class="o">&</span><span class="n">padding_right</span><span class="p">,</span> <span class="o">&</span><span class="n">width</span><span class="p">]</span><span class="nf">.iter</span><span class="p">()</span><span class="nf">.map</span><span class="p">(|</span><span class="n">v</span><span class="p">|</span> <span class="n">v</span><span class="nf">.to_px</span><span class="p">())</span><span class="nf">.sum</span><span class="p">();</span></code></pre></figure>
<p>This is the minimum horizontal space needed for the box. If this isn’t equal
to the container width, we’ll need to adjust something to make it equal.</p>
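<p>To make the role of <code>to_px</code> concrete, here is a small self-contained sketch. The two-variant <code>Value</code> enum and the <code>total_width</code> helper are assumptions made for this example, not the definitions from robinson; the point is only that <code>auto</code> contributes 0 to the sum.</p>

```rust
// A simplified value type, assumed for illustration.
#[derive(Clone, Debug, PartialEq)]
enum Value {
    Keyword(String),
    Length(f32), // pixels only, for brevity
}

impl Value {
    // Lengths convert to their numeric size; anything else (such as `auto`) counts as 0.
    fn to_px(&self) -> f32 {
        match self {
            Value::Length(px) => *px,
            _ => 0.0,
        }
    }
}

// Sum the horizontal dimensions of a box. `auto` values add nothing to the total.
fn total_width(values: &[&Value]) -> f32 {
    values.iter().map(|v| v.to_px()).sum()
}
```

<p>For an <code>auto</code> left margin, a 10px right margin, and a 100px width, <code>total_width</code> returns 110.</p>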
<p>If the width or margins are set to <code>'auto'</code>, they can expand or contract to
fit the available space. Following the spec, we first check if the box is too
big. If so, we set any expandable margins to zero.</p>
<figure class="highlight"><pre><code class="language-rust" data-lang="rust"><span class="c1">// If width is not auto and the total is wider than the container, treat auto margins as 0.</span>
<span class="k">if</span> <span class="n">width</span> <span class="o">!=</span> <span class="n">auto</span> <span class="o">&&</span> <span class="n">total</span> <span class="o">></span> <span class="n">containing_block</span><span class="py">.content.width</span> <span class="p">{</span>
<span class="k">if</span> <span class="n">margin_left</span> <span class="o">==</span> <span class="n">auto</span> <span class="p">{</span>
<span class="n">margin_left</span> <span class="o">=</span> <span class="nf">Length</span><span class="p">(</span><span class="mf">0.0</span><span class="p">,</span> <span class="n">Px</span><span class="p">);</span>
<span class="p">}</span>
<span class="k">if</span> <span class="n">margin_right</span> <span class="o">==</span> <span class="n">auto</span> <span class="p">{</span>
<span class="n">margin_right</span> <span class="o">=</span> <span class="nf">Length</span><span class="p">(</span><span class="mf">0.0</span><span class="p">,</span> <span class="n">Px</span><span class="p">);</span>
<span class="p">}</span>
<span class="p">}</span></code></pre></figure>
<p>If the box is too large for its container, it <em>overflows</em> the container. If
it’s too small, it will <em>underflow</em>, leaving extra space. We’ll calculate the
underflow—the amount of extra space left in the container. (If this
number is negative, it is actually an overflow.)</p>
<figure class="highlight"><pre><code class="language-rust" data-lang="rust"><span class="k">let</span> <span class="n">underflow</span> <span class="o">=</span> <span class="n">containing_block</span><span class="py">.content.width</span> <span class="o">-</span> <span class="n">total</span><span class="p">;</span></code></pre></figure>
<p>We now follow the spec’s <a href="http://www.w3.org/TR/CSS2/visudet.html#blockwidth">algorithm</a> for eliminating any
overflow or underflow by adjusting the expandable dimensions. If there are no
<code>'auto'</code> dimensions, we adjust the right margin. (Yes, this means the
margin may be <a href="http://www.smashingmagazine.com/2009/07/27/the-definitive-guide-to-using-negative-margins/">negative</a> in the case of an overflow!)</p>
<figure class="highlight"><pre><code class="language-rust" data-lang="rust"><span class="k">match</span> <span class="p">(</span><span class="n">width</span> <span class="o">==</span> <span class="n">auto</span><span class="p">,</span> <span class="n">margin_left</span> <span class="o">==</span> <span class="n">auto</span><span class="p">,</span> <span class="n">margin_right</span> <span class="o">==</span> <span class="n">auto</span><span class="p">)</span> <span class="p">{</span>
<span class="c1">// If the values are overconstrained, calculate margin_right.</span>
<span class="p">(</span><span class="k">false</span><span class="p">,</span> <span class="k">false</span><span class="p">,</span> <span class="k">false</span><span class="p">)</span> <span class="k">=></span> <span class="p">{</span>
<span class="n">margin_right</span> <span class="o">=</span> <span class="nf">Length</span><span class="p">(</span><span class="n">margin_right</span><span class="nf">.to_px</span><span class="p">()</span> <span class="o">+</span> <span class="n">underflow</span><span class="p">,</span> <span class="n">Px</span><span class="p">);</span>
<span class="p">}</span>
<span class="c1">// If exactly one size is auto, its used value follows from the equality.</span>
<span class="p">(</span><span class="k">false</span><span class="p">,</span> <span class="k">false</span><span class="p">,</span> <span class="k">true</span><span class="p">)</span> <span class="k">=></span> <span class="p">{</span> <span class="n">margin_right</span> <span class="o">=</span> <span class="nf">Length</span><span class="p">(</span><span class="n">underflow</span><span class="p">,</span> <span class="n">Px</span><span class="p">);</span> <span class="p">}</span>
<span class="p">(</span><span class="k">false</span><span class="p">,</span> <span class="k">true</span><span class="p">,</span> <span class="k">false</span><span class="p">)</span> <span class="k">=></span> <span class="p">{</span> <span class="n">margin_left</span> <span class="o">=</span> <span class="nf">Length</span><span class="p">(</span><span class="n">underflow</span><span class="p">,</span> <span class="n">Px</span><span class="p">);</span> <span class="p">}</span>
<span class="c1">// If width is set to auto, any other auto values become 0.</span>
<span class="p">(</span><span class="k">true</span><span class="p">,</span> <span class="n">_</span><span class="p">,</span> <span class="n">_</span><span class="p">)</span> <span class="k">=></span> <span class="p">{</span>
<span class="k">if</span> <span class="n">margin_left</span> <span class="o">==</span> <span class="n">auto</span> <span class="p">{</span> <span class="n">margin_left</span> <span class="o">=</span> <span class="nf">Length</span><span class="p">(</span><span class="mf">0.0</span><span class="p">,</span> <span class="n">Px</span><span class="p">);</span> <span class="p">}</span>
<span class="k">if</span> <span class="n">margin_right</span> <span class="o">==</span> <span class="n">auto</span> <span class="p">{</span> <span class="n">margin_right</span> <span class="o">=</span> <span class="nf">Length</span><span class="p">(</span><span class="mf">0.0</span><span class="p">,</span> <span class="n">Px</span><span class="p">);</span> <span class="p">}</span>
<span class="k">if</span> <span class="n">underflow</span> <span class="o">>=</span> <span class="mf">0.0</span> <span class="p">{</span>
<span class="c1">// Expand width to fill the underflow.</span>
<span class="n">width</span> <span class="o">=</span> <span class="nf">Length</span><span class="p">(</span><span class="n">underflow</span><span class="p">,</span> <span class="n">Px</span><span class="p">);</span>
<span class="p">}</span> <span class="k">else</span> <span class="p">{</span>
<span class="c1">// Width can't be negative. Adjust the right margin instead.</span>
<span class="n">width</span> <span class="o">=</span> <span class="nf">Length</span><span class="p">(</span><span class="mf">0.0</span><span class="p">,</span> <span class="n">Px</span><span class="p">);</span>
<span class="n">margin_right</span> <span class="o">=</span> <span class="nf">Length</span><span class="p">(</span><span class="n">margin_right</span><span class="nf">.to_px</span><span class="p">()</span> <span class="o">+</span> <span class="n">underflow</span><span class="p">,</span> <span class="n">Px</span><span class="p">);</span>
<span class="p">}</span>
<span class="p">}</span>
<span class="c1">// If margin-left and margin-right are both auto, their used values are equal.</span>
<span class="p">(</span><span class="k">false</span><span class="p">,</span> <span class="k">true</span><span class="p">,</span> <span class="k">true</span><span class="p">)</span> <span class="k">=></span> <span class="p">{</span>
<span class="n">margin_left</span> <span class="o">=</span> <span class="nf">Length</span><span class="p">(</span><span class="n">underflow</span> <span class="o">/</span> <span class="mf">2.0</span><span class="p">,</span> <span class="n">Px</span><span class="p">);</span>
<span class="n">margin_right</span> <span class="o">=</span> <span class="nf">Length</span><span class="p">(</span><span class="n">underflow</span> <span class="o">/</span> <span class="mf">2.0</span><span class="p">,</span> <span class="n">Px</span><span class="p">);</span>
<span class="p">}</span>
<span class="p">}</span></code></pre></figure>
<p>At this point, the constraints are met and any <code>'auto'</code> values have been
converted to lengths. The results are the <a href="http://www.w3.org/TR/CSS2/cascade.html#used-value">used values</a> for
the horizontal box dimensions, which we will store in the layout tree. You
can see the final code in <a href="https://github.com/mbrubeck/robinson/blob/619a03bea918a0c756655fae02a004e6b4a3974c/src/layout.rs#L132-L217">layout.rs</a>.</p>
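<p>The whole width calculation can also be condensed into one pure function, which makes it easy to try out individual cases. This is only a sketch under simplifying assumptions: <code>None</code> stands for <code>'auto'</code>, all lengths are in pixels, and the names <code>UsedWidths</code>, <code>solve_widths</code>, and <code>fixed</code> (the combined border and padding widths) are hypothetical and do not appear in robinson.</p>

```rust
// Used values for the adjustable horizontal dimensions, in px.
#[derive(Debug, PartialEq)]
struct UsedWidths {
    width: f32,
    margin_left: f32,
    margin_right: f32,
}

// `None` stands for `auto`. `fixed` is the sum of the horizontal border and padding widths.
fn solve_widths(
    container: f32,
    width: Option<f32>,
    mut margin_left: Option<f32>,
    mut margin_right: Option<f32>,
    fixed: f32,
) -> UsedWidths {
    // `auto` counts as 0 when adding up the total.
    let px = |v: Option<f32>| v.unwrap_or(0.0);
    let total = px(width) + px(margin_left) + px(margin_right) + fixed;

    // If width is not auto and the box is too wide, treat auto margins as 0.
    if width.is_some() && total > container {
        margin_left = margin_left.or(Some(0.0));
        margin_right = margin_right.or(Some(0.0));
    }

    // Extra space left in the container (negative means overflow).
    let underflow = container - total;

    match (width, margin_left, margin_right) {
        // Overconstrained: the right margin absorbs the difference.
        (Some(w), Some(l), Some(r)) => UsedWidths { width: w, margin_left: l, margin_right: r + underflow },
        // Exactly one auto margin: it takes up the underflow.
        (Some(w), Some(l), None) => UsedWidths { width: w, margin_left: l, margin_right: underflow },
        (Some(w), None, Some(r)) => UsedWidths { width: w, margin_left: underflow, margin_right: r },
        // Both margins auto: split the underflow evenly, centering the box.
        (Some(w), None, None) => UsedWidths { width: w, margin_left: underflow / 2.0, margin_right: underflow / 2.0 },
        // Width is auto: auto margins become 0 and the width fills the space.
        (None, l, r) => {
            let (l, r) = (px(l), px(r));
            if underflow >= 0.0 {
                UsedWidths { width: underflow, margin_left: l, margin_right: r }
            } else {
                // Width can't be negative, so the right margin goes negative instead.
                UsedWidths { width: 0.0, margin_left: l, margin_right: r + underflow }
            }
        }
    }
}
```

<p>For example, a 100px-wide box with two <code>auto</code> margins in a 200px container ends up with 50px on each side, which is the familiar way of centering a block with <code>margin: auto</code>.</p>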
<h2 id="positioning">Positioning</h2>
<p>The next step is simpler. This function looks up the remaining
margin/padding/border styles, and uses these along with the containing block
dimensions to determine this block’s position on the page.</p>
<figure class="highlight"><pre><code class="language-rust" data-lang="rust"><span class="k">fn</span> <span class="nf">calculate_block_position</span><span class="p">(</span><span class="o">&</span><span class="k">mut</span> <span class="k">self</span><span class="p">,</span> <span class="n">containing_block</span><span class="p">:</span> <span class="n">Dimensions</span><span class="p">)</span> <span class="p">{</span>
<span class="k">let</span> <span class="n">style</span> <span class="o">=</span> <span class="k">self</span><span class="nf">.get_style_node</span><span class="p">();</span>
<span class="k">let</span> <span class="n">d</span> <span class="o">=</span> <span class="o">&</span><span class="k">mut</span> <span class="k">self</span><span class="py">.dimensions</span><span class="p">;</span>
<span class="c1">// margin, border, and padding have initial value 0.</span>
<span class="k">let</span> <span class="n">zero</span> <span class="o">=</span> <span class="nf">Length</span><span class="p">(</span><span class="mf">0.0</span><span class="p">,</span> <span class="n">Px</span><span class="p">);</span>
<span class="c1">// If margin-top or margin-bottom is `auto`, the used value is zero.</span>
<span class="n">d</span><span class="py">.margin.top</span> <span class="o">=</span> <span class="n">style</span><span class="nf">.lookup</span><span class="p">(</span><span class="s">"margin-top"</span><span class="p">,</span> <span class="s">"margin"</span><span class="p">,</span> <span class="o">&</span><span class="n">zero</span><span class="p">)</span><span class="nf">.to_px</span><span class="p">();</span>
<span class="n">d</span><span class="py">.margin.bottom</span> <span class="o">=</span> <span class="n">style</span><span class="nf">.lookup</span><span class="p">(</span><span class="s">"margin-bottom"</span><span class="p">,</span> <span class="s">"margin"</span><span class="p">,</span> <span class="o">&</span><span class="n">zero</span><span class="p">)</span><span class="nf">.to_px</span><span class="p">();</span>
<span class="n">d</span><span class="py">.border.top</span> <span class="o">=</span> <span class="n">style</span><span class="nf">.lookup</span><span class="p">(</span><span class="s">"border-top-width"</span><span class="p">,</span> <span class="s">"border-width"</span><span class="p">,</span> <span class="o">&</span><span class="n">zero</span><span class="p">)</span><span class="nf">.to_px</span><span class="p">();</span>
<span class="n">d</span><span class="py">.border.bottom</span> <span class="o">=</span> <span class="n">style</span><span class="nf">.lookup</span><span class="p">(</span><span class="s">"border-bottom-width"</span><span class="p">,</span> <span class="s">"border-width"</span><span class="p">,</span> <span class="o">&</span><span class="n">zero</span><span class="p">)</span><span class="nf">.to_px</span><span class="p">();</span>
<span class="n">d</span><span class="py">.padding.top</span> <span class="o">=</span> <span class="n">style</span><span class="nf">.lookup</span><span class="p">(</span><span class="s">"padding-top"</span><span class="p">,</span> <span class="s">"padding"</span><span class="p">,</span> <span class="o">&</span><span class="n">zero</span><span class="p">)</span><span class="nf">.to_px</span><span class="p">();</span>
<span class="n">d</span><span class="py">.padding.bottom</span> <span class="o">=</span> <span class="n">style</span><span class="nf">.lookup</span><span class="p">(</span><span class="s">"padding-bottom"</span><span class="p">,</span> <span class="s">"padding"</span><span class="p">,</span> <span class="o">&</span><span class="n">zero</span><span class="p">)</span><span class="nf">.to_px</span><span class="p">();</span>
<span class="n">d</span><span class="py">.content.x</span> <span class="o">=</span> <span class="n">containing_block</span><span class="py">.content.x</span> <span class="o">+</span>
<span class="n">d</span><span class="py">.margin.left</span> <span class="o">+</span> <span class="n">d</span><span class="py">.border.left</span> <span class="o">+</span> <span class="n">d</span><span class="py">.padding.left</span><span class="p">;</span>
<span class="c1">// Position the box below all the previous boxes in the container.</span>
<span class="n">d</span><span class="py">.content.y</span> <span class="o">=</span> <span class="n">containing_block</span><span class="py">.content.height</span> <span class="o">+</span> <span class="n">containing_block</span><span class="py">.content.y</span> <span class="o">+</span>
<span class="n">d</span><span class="py">.margin.top</span> <span class="o">+</span> <span class="n">d</span><span class="py">.border.top</span> <span class="o">+</span> <span class="n">d</span><span class="py">.padding.top</span><span class="p">;</span>
<span class="p">}</span></code></pre></figure>
<p>Take a close look at that last statement, which sets the <code>y</code> position. This
is what gives block layout its distinctive vertical stacking behavior. For
this to work, we’ll need to make sure the parent’s <code>content.height</code> is updated
after laying out each child.</p>
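<p>The stacking behavior is easier to see in isolation. The sketch below is not part of robinson; <code>stack_positions</code> is a hypothetical helper that keeps a running content height, the same way the parent's <code>content.height</code> grows as each child is laid out, and reports where each child's margin box would start.</p>

```rust
// Given the y position of the container's content area and the margin-box
// height of each child, return the y position where each child's margin box starts.
fn stack_positions(container_y: f32, child_heights: &[f32]) -> Vec<f32> {
    // The container's content height so far; it grows after each child.
    let mut content_height = 0.0_f32;
    let mut positions = Vec::with_capacity(child_heights.len());
    for &height in child_heights {
        // Each child starts just below everything laid out before it.
        positions.push(container_y + content_height);
        content_height += height;
    }
    positions
}
```

<p>With a container whose content area starts at y = 10 and children of heights 20, 30, and 5, the children start at y = 10, 30, and 60.</p>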
<h2 id="children">Children</h2>
<p>Here’s the code that recursively lays out the box’s contents. As it loops
through the child boxes, it keeps track of the total content height. This is
used by the positioning code (above) to find the vertical position of the next
child.</p>
<figure class="highlight"><pre><code class="language-rust" data-lang="rust"><span class="k">fn</span> <span class="nf">layout_block_children</span><span class="p">(</span><span class="o">&</span><span class="k">mut</span> <span class="k">self</span><span class="p">)</span> <span class="p">{</span>
<span class="k">let</span> <span class="n">d</span> <span class="o">=</span> <span class="o">&</span><span class="k">mut</span> <span class="k">self</span><span class="py">.dimensions</span><span class="p">;</span>
<span class="k">for</span> <span class="n">child</span> <span class="k">in</span> <span class="o">&</span><span class="k">mut</span> <span class="k">self</span><span class="py">.children</span> <span class="p">{</span>
<span class="n">child</span><span class="nf">.layout</span><span class="p">(</span><span class="o">*</span><span class="n">d</span><span class="p">);</span>
<span class="c1">// Track the height so each child is laid out below the previous content.</span>
<span class="n">d</span><span class="py">.content.height</span> <span class="o">=</span> <span class="n">d</span><span class="py">.content.height</span> <span class="o">+</span> <span class="n">child</span><span class="py">.dimensions</span><span class="nf">.margin_box</span><span class="p">()</span><span class="py">.height</span><span class="p">;</span>
<span class="p">}</span>
<span class="p">}</span></code></pre></figure>
<p>The total vertical space taken up by each child is the height of its <em>margin
box</em>, which we calculate like so:</p>
<figure class="highlight"><pre><code class="language-rust" data-lang="rust"><span class="k">impl</span> <span class="n">Dimensions</span> <span class="p">{</span>
<span class="c1">// The area covered by the content area plus its padding.</span>
<span class="k">fn</span> <span class="nf">padding_box</span><span class="p">(</span><span class="k">self</span><span class="p">)</span> <span class="k">-></span> <span class="n">Rect</span> <span class="p">{</span>
<span class="k">self</span><span class="py">.content</span><span class="nf">.expanded_by</span><span class="p">(</span><span class="k">self</span><span class="py">.padding</span><span class="p">)</span>
<span class="p">}</span>
<span class="c1">// The area covered by the content area plus padding and borders.</span>
<span class="k">fn</span> <span class="nf">border_box</span><span class="p">(</span><span class="k">self</span><span class="p">)</span> <span class="k">-></span> <span class="n">Rect</span> <span class="p">{</span>
<span class="k">self</span><span class="nf">.padding_box</span><span class="p">()</span><span class="nf">.expanded_by</span><span class="p">(</span><span class="k">self</span><span class="py">.border</span><span class="p">)</span>
<span class="p">}</span>
<span class="c1">// The area covered by the content area plus padding, borders, and margin.</span>
<span class="k">fn</span> <span class="nf">margin_box</span><span class="p">(</span><span class="k">self</span><span class="p">)</span> <span class="k">-></span> <span class="n">Rect</span> <span class="p">{</span>
<span class="k">self</span><span class="nf">.border_box</span><span class="p">()</span><span class="nf">.expanded_by</span><span class="p">(</span><span class="k">self</span><span class="py">.margin</span><span class="p">)</span>
<span class="p">}</span>
<span class="p">}</span>
<span class="k">impl</span> <span class="n">Rect</span> <span class="p">{</span>
<span class="k">fn</span> <span class="nf">expanded_by</span><span class="p">(</span><span class="k">self</span><span class="p">,</span> <span class="n">edge</span><span class="p">:</span> <span class="n">EdgeSizes</span><span class="p">)</span> <span class="k">-></span> <span class="n">Rect</span> <span class="p">{</span>
<span class="n">Rect</span> <span class="p">{</span>
<span class="n">x</span><span class="p">:</span> <span class="k">self</span><span class="py">.x</span> <span class="o">-</span> <span class="n">edge</span><span class="py">.left</span><span class="p">,</span>
<span class="n">y</span><span class="p">:</span> <span class="k">self</span><span class="py">.y</span> <span class="o">-</span> <span class="n">edge</span><span class="py">.top</span><span class="p">,</span>
<span class="n">width</span><span class="p">:</span> <span class="k">self</span><span class="py">.width</span> <span class="o">+</span> <span class="n">edge</span><span class="py">.left</span> <span class="o">+</span> <span class="n">edge</span><span class="py">.right</span><span class="p">,</span>
<span class="n">height</span><span class="p">:</span> <span class="k">self</span><span class="py">.height</span> <span class="o">+</span> <span class="n">edge</span><span class="py">.top</span> <span class="o">+</span> <span class="n">edge</span><span class="py">.bottom</span><span class="p">,</span>
<span class="p">}</span>
<span class="p">}</span>
<span class="p">}</span></code></pre></figure>
<p>For simplicity, this does not implement <a href="http://www.w3.org/TR/CSS2/box.html#collapsing-margins">margin collapsing</a>.
A real layout engine would allow the bottom margin of one box to overlap the
top margin of the next box, rather than placing each margin box completely
below the previous one.</p>
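<p>As a starting point for adding it, here is a sketch of the rule for two adjoining sibling margins, following my reading of the CSS 2.1 spec: the largest positive margin wins, and the most negative margin is then deducted from it. The function name <code>collapsed_margin</code> is hypothetical, and the sketch ignores the other cases the spec covers, such as margins collapsing between a parent and its first child.</p>

```rust
// Combined vertical gap between two adjacent block boxes, given the bottom
// margin of the first box and the top margin of the second (both in px).
fn collapsed_margin(bottom: f32, top: f32) -> f32 {
    // Largest positive margin, or 0 if neither is positive.
    let max_positive = bottom.max(top).max(0.0);
    // Most negative margin, or 0 if neither is negative.
    let min_negative = bottom.min(top).min(0.0);
    max_positive + min_negative
}
```

<p>Without collapsing, margins of 20px and 10px leave a 30px gap; with this rule the gap is 20px.</p>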
<h2 id="the-height-property">The ‘height’ Property</h2>
<p>By default, the box’s height is equal to the height of its contents. But if
the <code>'height'</code> property is set to an explicit length, we’ll use that instead:</p>
<figure class="highlight"><pre><code class="language-rust" data-lang="rust"><span class="k">fn</span> <span class="nf">calculate_block_height</span><span class="p">(</span><span class="o">&</span><span class="k">mut</span> <span class="k">self</span><span class="p">)</span> <span class="p">{</span>
<span class="c1">// If the height is set to an explicit length, use that exact length.</span>
<span class="c1">// Otherwise, just keep the value set by `layout_block_children`.</span>
<span class="k">if</span> <span class="k">let</span> <span class="nf">Some</span><span class="p">(</span><span class="nf">Length</span><span class="p">(</span><span class="n">h</span><span class="p">,</span> <span class="n">Px</span><span class="p">))</span> <span class="o">=</span> <span class="k">self</span><span class="nf">.get_style_node</span><span class="p">()</span><span class="nf">.value</span><span class="p">(</span><span class="s">"height"</span><span class="p">)</span> <span class="p">{</span>
<span class="k">self</span><span class="py">.dimensions.content.height</span> <span class="o">=</span> <span class="n">h</span><span class="p">;</span>
<span class="p">}</span>
<span class="p">}</span></code></pre></figure>
<p>And that concludes the block layout algorithm. You can now call <code>layout()</code> on
a styled HTML document, and it will spit out a bunch of rectangles with
widths, heights, margins, etc. Cool, right?</p>
<h2 id="exercises">Exercises</h2>
<p>Some extra ideas for the ambitious implementer:</p>
<ol>
<li>
<p>Collapsing vertical margins.</p>
</li>
<li>
<p><a href="http://www.w3.org/TR/CSS2/visuren.html#relative-positioning">Relative positioning</a>.</p>
</li>
<li>
<p>Parallelize the layout process, and measure the effect on performance.</p>
</li>
</ol>
<p>If you try the parallelization project, you may want to separate the width
calculation and the height calculation into two distinct passes. The top-down
traversal for width is easy to parallelize just by spawning a separate task
for each child. The height calculation is a little trickier, since you need
to go back and adjust the <code>y</code> position of each child after its siblings are
laid out.</p>
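<p>Here is one possible shape for the parallel width pass, written with scoped threads from the Rust standard library (<code>std::thread::scope</code>, available since Rust 1.63). The <code>Node</code> type and the fixed 10px inset are placeholders assumed for this sketch; a real version would run the width calculation from this article on each box.</p>

```rust
use std::thread;

// A stripped-down layout box with only a width and children (hypothetical type).
struct Node {
    width: f32,
    children: Vec<Node>,
}

// Top-down width pass: each box fills its container minus a fixed inset,
// then its children are laid out concurrently, one thread per child.
fn layout_width(node: &mut Node, container_width: f32) {
    const INSET: f32 = 10.0;
    node.width = (container_width - INSET).max(0.0);
    let width = node.width;
    // `scope` waits for every spawned thread before returning, so the
    // mutable borrows of the children cannot outlive this call.
    thread::scope(|scope| {
        for child in &mut node.children {
            scope.spawn(move || layout_width(child, width));
        }
    });
}
```

<p>Spawning an operating system thread for every box is far too expensive for real documents. A work-stealing thread pool would be a more realistic choice, but the structure of the traversal stays the same.</p>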
<h2 id="to-be-continued">To Be Continued…</h2>
<p>Thank you to everyone who’s followed along this far!</p>
<p>These articles are taking longer and longer to write, as I journey further
into unfamiliar areas of layout and rendering. There will be a longer hiatus
before the next part as I experiment with font and graphics code, but I’ll
resume the series as soon as I can.</p>
<p><em>Update: <a href="/mbrubeck/2014/11/05/toy-layout-engine-7-painting.html">Part 7</a> is now ready.</em></p>
Matt Brubeckmbrubeck@limpet.nethttps://limpet.net/mbrubeck/Let's build a browser engine! Part 5: Boxes2014-09-08T16:16:00-07:00https://limpet.net/mbrubeck//2014/09/08/toy-layout-engine-5-boxes<p>This is the latest in a series of articles about writing a simple HTML
rendering engine:</p>
<blockquote>
<ul>
<li>
<a href="/mbrubeck/2014/08/08/toy-layout-engine-1.html">Part 1: Getting started</a>
</li>
<li>
<a href="/mbrubeck/2014/08/11/toy-layout-engine-2.html">Part 2: HTML</a>
</li>
<li>
<a href="/mbrubeck/2014/08/13/toy-layout-engine-3-css.html">Part 3: CSS</a>
</li>
<li>
<a href="/mbrubeck/2014/08/23/toy-layout-engine-4-style.html">Part 4: Style</a>
</li>
<li>
<b>Part 5: Boxes</b>
</li>
<li>
<a href="/mbrubeck/2014/09/17/toy-layout-engine-6-block.html">Part 6: Block layout</a>
</li>
<li>
<a href="/mbrubeck/2014/11/05/toy-layout-engine-7-painting.html">Part 7: Painting 101</a>
</li>
</ul>
</blockquote>
<p>This article will begin the <a href="https://github.com/mbrubeck/robinson/blob/master/src/layout.rs">layout</a> module, which takes the
style tree and translates it into a bunch of rectangles in a two-dimensional
space. This is a big module, so I’m going to split it into several articles.
Also, some of the code I share in this article may need to change as I write
the code for the later parts.</p>
<p>The layout module’s input is the style tree from <a href="/mbrubeck/2014/08/23/toy-layout-engine-4-style.html">Part 4</a>, and its
output is yet another tree, the <em>layout tree</em>. This takes us one step further
in our mini rendering pipeline:</p>
<p><img src="/mbrubeck/images/2014/pipeline.svg" style="width: 720px" /></p>
<p>I’ll start by talking about the basic HTML/CSS layout model. If you’ve ever
learned to develop web pages you might be familiar with this already—but
it may look a bit different from the implementer’s point of view.</p>
<h2 id="the-box-model">The Box Model</h2>
<p>Layout is all about <em>boxes</em>. A box is a rectangular section of a web page.
It has a width, a height, and a position on the page. This rectangle is
called the <em>content area</em> because it’s where the box’s content is drawn. The
content may be text, image, video, or other boxes.</p>
<p>A box may also have <em>padding</em>, <em>borders</em>, and <em>margins</em> surrounding its
content area. The CSS spec has a <a href="http://www.w3.org/TR/CSS2/box.html#box-dimensions">diagram</a> showing how all these
layers fit together.</p>
<p>Robinson stores a box’s content area and surrounding areas in the following
structure. [<strong>Rust note:</strong> <code>f32</code> is a 32-bit floating point type.]</p>
<figure class="highlight"><pre><code class="language-rust" data-lang="rust"><span class="c1">// CSS box model. All sizes are in px.</span>
<span class="k">struct</span> <span class="n">Dimensions</span> <span class="p">{</span>
<span class="c1">// Position of the content area relative to the document origin:</span>
<span class="n">content</span><span class="p">:</span> <span class="n">Rect</span><span class="p">,</span>
<span class="c1">// Surrounding edges:</span>
<span class="n">padding</span><span class="p">:</span> <span class="n">EdgeSizes</span><span class="p">,</span>
<span class="n">border</span><span class="p">:</span> <span class="n">EdgeSizes</span><span class="p">,</span>
<span class="n">margin</span><span class="p">:</span> <span class="n">EdgeSizes</span><span class="p">,</span>
<span class="p">}</span>
<span class="k">struct</span> <span class="n">Rect</span> <span class="p">{</span>
<span class="n">x</span><span class="p">:</span> <span class="nb">f32</span><span class="p">,</span>
<span class="n">y</span><span class="p">:</span> <span class="nb">f32</span><span class="p">,</span>
<span class="n">width</span><span class="p">:</span> <span class="nb">f32</span><span class="p">,</span>
<span class="n">height</span><span class="p">:</span> <span class="nb">f32</span><span class="p">,</span>
<span class="p">}</span>
<span class="k">struct</span> <span class="n">EdgeSizes</span> <span class="p">{</span>
<span class="n">left</span><span class="p">:</span> <span class="nb">f32</span><span class="p">,</span>
<span class="n">right</span><span class="p">:</span> <span class="nb">f32</span><span class="p">,</span>
<span class="n">top</span><span class="p">:</span> <span class="nb">f32</span><span class="p">,</span>
<span class="n">bottom</span><span class="p">:</span> <span class="nb">f32</span><span class="p">,</span>
<span class="p">}</span></code></pre></figure>
<h2 id="block-and-inline-layout">Block and Inline Layout</h2>
<p class="hidden-desc"><strong>Note:</strong> This section contains diagrams that won't
make sense if you are reading them without the associated visual styles.
If you are reading this in a feed reader, try opening the
<a href="/mbrubeck//2014/09/08/toy-layout-engine-5-boxes.html">original page</a> in a regular browser
tab. I also included text descriptions for those of you using screen readers
or other assistive technologies.</p>
<p>The CSS <a href="https://developer.mozilla.org/en-US/docs/Web/CSS/display"><code>display</code></a> property determines which type of box an
element generates. CSS defines several box types, each with its own layout
rules. I’m only going to talk about two of them: <em>block</em> and <em>inline</em>.</p>
<p>I’ll use this bit of pseudo-HTML to illustrate the difference:</p>
<figure class="highlight"><pre><code class="language-html" data-lang="html"><span class="nt"><container></span>
<span class="nt"><a></a></span>
<span class="nt"><b></b></span>
<span class="nt"><c></c></span>
<span class="nt"><d></d></span>
<span class="nt"></container></span></code></pre></figure>
<p><em>Block boxes</em> are placed vertically within their container, from top to bottom.</p>
<figure class="highlight"><pre><code class="language-css" data-lang="css"><span class="nt">a</span><span class="o">,</span> <span class="nt">b</span><span class="o">,</span> <span class="nt">c</span><span class="o">,</span> <span class="nt">d</span> <span class="p">{</span> <span class="nl">display</span><span class="p">:</span> <span class="nb">block</span><span class="p">;</span> <span class="p">}</span></code></pre></figure>
<p class="hidden-desc"><strong>Description:</strong> The diagram below shows
four rectangles in a vertical stack.</p>
<div id="example1" class="example outer">
<div class="example">a</div>
<div class="example">b</div>
<div class="example">c</div>
<div class="example">d</div>
</div>
<p><em>Inline boxes</em> are placed horizontally within their container, from left to
right. If they reach the right edge of the container, they will wrap around
and continue on a new line below.</p>
<figure class="highlight"><pre><code class="language-css" data-lang="css"><span class="nt">a</span><span class="o">,</span> <span class="nt">b</span><span class="o">,</span> <span class="nt">c</span><span class="o">,</span> <span class="nt">d</span> <span class="p">{</span> <span class="nl">display</span><span class="p">:</span> <span class="nb">inline</span><span class="p">;</span> <span class="p">}</span></code></pre></figure>
<p class="hidden-desc"><strong>Description:</strong> The diagram below shows
boxes <code>a</code>, <code>b</code>, and <code>c</code> in a horizontal line from left to right, and box <code>d</code>
in the next line.</p>
<div id="example2" class="example outer container">
<div class="example inline">a</div>
<div class="example inline">b</div>
<div class="example inline">c</div>
<div class="example inline">d</div>
</div>
<p>Each box must contain <em>only</em> block children, or <em>only</em> inline children. When
a DOM element contains a mix of block and inline children, the layout engine
inserts <a href="http://www.w3.org/TR/CSS2/visuren.html#anonymous-block-level">anonymous boxes</a> to separate the two types. (These boxes
are “anonymous” because they aren’t associated with nodes in the DOM tree.)</p>
<p>In this example, the inline boxes <code>b</code> and <code>c</code> are surrounded by an <em>anonymous
block box</em>, shown in pink:</p>
<figure class="highlight"><pre><code class="language-css" data-lang="css"><span class="nt">a</span> <span class="p">{</span> <span class="nl">display</span><span class="p">:</span> <span class="nb">block</span><span class="p">;</span> <span class="p">}</span>
<span class="nt">b</span><span class="o">,</span> <span class="nt">c</span> <span class="p">{</span> <span class="nl">display</span><span class="p">:</span> <span class="nb">inline</span><span class="p">;</span> <span class="p">}</span>
<span class="nt">d</span> <span class="p">{</span> <span class="nl">display</span><span class="p">:</span> <span class="nb">block</span><span class="p">;</span> <span class="p">}</span></code></pre></figure>
<p class="hidden-desc"><strong>Description:</strong> The diagram below shows
three boxes in a vertical stack. The first is labeled <code>a</code>; the second
contains two boxes in a horizontal row labeled <code>b</code> and <code>c</code>; the third box in
the stack is labeled <code>d</code>.</p>
<div id="example3" class="example outer">
<div class="example">a</div>
<div class="example container anon">
<div class="example inline">b</div>
<div class="example inline">c</div>
</div>
<div class="example">d</div>
</div>
<p>Note that content grows <em>vertically</em> by default. That is, adding children to
a container generally makes it taller, not wider. Another way to say this is
that, by default, the width of a block or line depends on its container’s
width, while the height of a container depends on its children’s heights.</p>
<p>This gets more complicated if you override the default values for properties
like <code>width</code> and <code>height</code>, and <em>way</em> more complicated if you want to support
features like <a href="http://dev.w3.org/csswg/css-writing-modes/">vertical writing</a>.</p>
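<p>The default sizing rule described above can be sketched in a few lines. This
is only an illustration under simplifying assumptions: the <code>SimpleBox</code> type,
its <code>intrinsic_height</code> field, and the <code>layout</code> method are hypothetical names
made up for this example, and margins, padding, and borders are ignored.</p>

```rust
// Hypothetical, simplified box: only the two dimensions discussed above.
#[derive(Debug, Default)]
struct SimpleBox {
    width: f32,
    height: f32,
    // Height of the box's own content when it has no children. This is an
    // assumption of the sketch, standing in for text or images.
    intrinsic_height: f32,
    children: Vec<SimpleBox>,
}

impl SimpleBox {
    fn leaf(intrinsic_height: f32) -> SimpleBox {
        SimpleBox { intrinsic_height, ..Default::default() }
    }

    fn parent(children: Vec<SimpleBox>) -> SimpleBox {
        SimpleBox { children, ..Default::default() }
    }

    // Width flows down from the container; height flows up from the children.
    fn layout(&mut self, container_width: f32) {
        self.width = container_width;
        let width = self.width;
        let mut height = self.intrinsic_height;
        for child in &mut self.children {
            child.layout(width);
            height += child.height;
        }
        self.height = height;
    }
}
```

<p>Laying out a parent with two leaf children of heights 10 and 20 in a
300-pixel container gives every box a width of 300, and the parent a height
of 30.</p>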
<h2 id="the-layout-tree">The Layout Tree</h2>
<p>The layout tree is a collection of boxes. A box has dimensions, and it may
contain child boxes.</p>
<figure class="highlight"><pre><code class="language-rust" data-lang="rust"><span class="k">struct</span> <span class="n">LayoutBox</span><span class="o"><</span><span class="nv">'a</span><span class="o">></span> <span class="p">{</span>
<span class="n">dimensions</span><span class="p">:</span> <span class="n">Dimensions</span><span class="p">,</span>
<span class="n">box_type</span><span class="p">:</span> <span class="n">BoxType</span><span class="o"><</span><span class="nv">'a</span><span class="o">></span><span class="p">,</span>
<span class="n">children</span><span class="p">:</span> <span class="nb">Vec</span><span class="o"><</span><span class="n">LayoutBox</span><span class="o"><</span><span class="nv">'a</span><span class="o">>></span><span class="p">,</span>
<span class="p">}</span></code></pre></figure>
<p>A box can be a block node, an inline node, or an anonymous block box. (This
will need to change when I implement text layout, because line wrapping can
cause a single inline node to split into multiple boxes. But it will do for
now.)</p>
<figure class="highlight"><pre><code class="language-rust" data-lang="rust"><span class="k">enum</span> <span class="n">BoxType</span><span class="o"><</span><span class="nv">'a</span><span class="o">></span> <span class="p">{</span>
<span class="nf">BlockNode</span><span class="p">(</span><span class="o">&</span><span class="nv">'a</span> <span class="n">StyledNode</span><span class="o"><</span><span class="nv">'a</span><span class="o">></span><span class="p">),</span>
<span class="nf">InlineNode</span><span class="p">(</span><span class="o">&</span><span class="nv">'a</span> <span class="n">StyledNode</span><span class="o"><</span><span class="nv">'a</span><span class="o">></span><span class="p">),</span>
<span class="n">AnonymousBlock</span><span class="p">,</span>
<span class="p">}</span></code></pre></figure>
<p>To build the layout tree, we need to look at the <code>display</code> property for each
DOM node. I added some code to the <code>style</code> module to get the <code>display</code> value
for a node. If there’s no specified value it returns the initial value,
<code>'inline'</code>.</p>
<figure class="highlight"><pre><code class="language-rust" data-lang="rust"><span class="k">enum</span> <span class="n">Display</span> <span class="p">{</span>
<span class="n">Inline</span><span class="p">,</span>
<span class="n">Block</span><span class="p">,</span>
<span class="nb">None</span><span class="p">,</span>
<span class="p">}</span>
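<span class="k">impl</span><span class="o"><</span><span class="nv">'a</span><span class="o">></span> <span class="n">StyledNode</span><span class="o"><</span><span class="nv">'a</span><span class="o">></span> <span class="p">{</span>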
<span class="c1">// Return the specified value of a property if it exists, otherwise `None`.</span>
<span class="k">fn</span> <span class="nf">value</span><span class="p">(</span><span class="o">&</span><span class="k">self</span><span class="p">,</span> <span class="n">name</span><span class="p">:</span> <span class="o">&</span><span class="nb">str</span><span class="p">)</span> <span class="k">-></span> <span class="nb">Option</span><span class="o"><</span><span class="n">Value</span><span class="o">></span> <span class="p">{</span>
<span class="k">self</span><span class="py">.specified_values</span><span class="nf">.get</span><span class="p">(</span><span class="n">name</span><span class="p">)</span><span class="nf">.map</span><span class="p">(|</span><span class="n">v</span><span class="p">|</span> <span class="n">v</span><span class="nf">.clone</span><span class="p">())</span>
<span class="p">}</span>
<span class="c1">// The value of the `display` property (defaults to inline).</span>
<span class="k">fn</span> <span class="nf">display</span><span class="p">(</span><span class="o">&</span><span class="k">self</span><span class="p">)</span> <span class="k">-></span> <span class="n">Display</span> <span class="p">{</span>
<span class="k">match</span> <span class="k">self</span><span class="nf">.value</span><span class="p">(</span><span class="s">"display"</span><span class="p">)</span> <span class="p">{</span>
<span class="nf">Some</span><span class="p">(</span><span class="nf">Keyword</span><span class="p">(</span><span class="n">s</span><span class="p">))</span> <span class="k">=></span> <span class="k">match</span> <span class="o">&*</span><span class="n">s</span> <span class="p">{</span>
<span class="s">"block"</span> <span class="k">=></span> <span class="nn">Display</span><span class="p">::</span><span class="n">Block</span><span class="p">,</span>
<span class="s">"none"</span> <span class="k">=></span> <span class="nn">Display</span><span class="p">::</span><span class="nb">None</span><span class="p">,</span>
<span class="n">_</span> <span class="k">=></span> <span class="nn">Display</span><span class="p">::</span><span class="n">Inline</span>
<span class="p">},</span>
<span class="n">_</span> <span class="k">=></span> <span class="nn">Display</span><span class="p">::</span><span class="n">Inline</span>
<span class="p">}</span>
<span class="p">}</span>
<span class="p">}</span></code></pre></figure>
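<p>To try the fallback behavior on its own, here is a minimal self-contained
sketch. It assumes a cut-down <code>Value</code> type with only two variants, and the free
function <code>display_of</code> is a hypothetical stand-in for the <code>display</code> method
that takes the property map directly.</p>

```rust
use std::collections::HashMap;

// Simplified stand-in for the CSS `Value` type (an assumption of this sketch).
#[derive(Clone, Debug, PartialEq)]
enum Value {
    Keyword(String),
    Length(f32),
}

#[derive(Debug, PartialEq)]
enum Display {
    Inline,
    Block,
    None,
}

// Hypothetical free function mirroring the `display` method above.
fn display_of(specified_values: &HashMap<String, Value>) -> Display {
    match specified_values.get("display") {
        Some(Value::Keyword(s)) => match s.as_str() {
            "block" => Display::Block,
            "none" => Display::None,
            // Unknown keywords are treated as inline.
            _ => Display::Inline,
        },
        // Missing or non-keyword value: fall back to the initial value, inline.
        _ => Display::Inline,
    }
}
```

<p>An empty map, an unrecognized keyword, and a non-keyword value all yield
<code>Display::Inline</code>.</p>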
<p>Now we can walk through the style tree, build a <code>LayoutBox</code> for each node, and
then insert boxes for the node’s children. If a node’s <code>display</code> property is
set to <code>'none'</code> then it is not included in the layout tree.</p>
<figure class="highlight"><pre><code class="language-rust" data-lang="rust"><span class="c1">// Build the tree of LayoutBoxes, but don't perform any layout calculations yet.</span>
<span class="k">fn</span> <span class="n">build_layout_tree</span><span class="o"><</span><span class="nv">'a</span><span class="o">></span><span class="p">(</span><span class="n">style_node</span><span class="p">:</span> <span class="o">&</span><span class="nv">'a</span> <span class="n">StyledNode</span><span class="o"><</span><span class="nv">'a</span><span class="o">></span><span class="p">)</span> <span class="k">-></span> <span class="n">LayoutBox</span><span class="o"><</span><span class="nv">'a</span><span class="o">></span> <span class="p">{</span>
<span class="c1">// Create the root box.</span>
<span class="k">let</span> <span class="k">mut</span> <span class="n">root</span> <span class="o">=</span> <span class="nn">LayoutBox</span><span class="p">::</span><span class="nf">new</span><span class="p">(</span><span class="k">match</span> <span class="n">style_node</span><span class="nf">.display</span><span class="p">()</span> <span class="p">{</span>
<span class="nn">Display</span><span class="p">::</span><span class="n">Block</span> <span class="k">=></span> <span class="nf">BlockNode</span><span class="p">(</span><span class="n">style_node</span><span class="p">),</span>
<span class="nn">Display</span><span class="p">::</span><span class="n">Inline</span> <span class="k">=></span> <span class="nf">InlineNode</span><span class="p">(</span><span class="n">style_node</span><span class="p">),</span>
<span class="nn">Display</span><span class="p">::</span><span class="nb">None</span> <span class="k">=></span> <span class="nd">panic!</span><span class="p">(</span><span class="s">"Root node has display: none."</span><span class="p">)</span>
<span class="p">});</span>
<span class="c1">// Create the descendant boxes.</span>
<span class="k">for</span> <span class="n">child</span> <span class="k">in</span> <span class="o">&</span><span class="n">style_node</span><span class="py">.children</span> <span class="p">{</span>
<span class="k">match</span> <span class="n">child</span><span class="nf">.display</span><span class="p">()</span> <span class="p">{</span>
<span class="nn">Display</span><span class="p">::</span><span class="n">Block</span> <span class="k">=></span> <span class="n">root</span><span class="py">.children</span><span class="nf">.push</span><span class="p">(</span><span class="nf">build_layout_tree</span><span class="p">(</span><span class="n">child</span><span class="p">)),</span>
<span class="nn">Display</span><span class="p">::</span><span class="n">Inline</span> <span class="k">=></span> <span class="n">root</span><span class="nf">.get_inline_container</span><span class="p">()</span><span class="py">.children</span><span class="nf">.push</span><span class="p">(</span><span class="nf">build_layout_tree</span><span class="p">(</span><span class="n">child</span><span class="p">)),</span>
<span class="nn">Display</span><span class="p">::</span><span class="nb">None</span> <span class="k">=></span> <span class="p">{}</span> <span class="c1">// Skip nodes with `display: none;`</span>
<span class="p">}</span>
<span class="p">}</span>
<span class="k">return</span> <span class="n">root</span><span class="p">;</span>
<span class="p">}</span>
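<span class="k">impl</span><span class="o"><</span><span class="nv">'a</span><span class="o">></span> <span class="n">LayoutBox</span><span class="o"><</span><span class="nv">'a</span><span class="o">></span> <span class="p">{</span>
<span class="c1">// Constructor function</span>
<span class="k">fn</span> <span class="nf">new</span><span class="p">(</span><span class="n">box_type</span><span class="p">:</span> <span class="n">BoxType</span><span class="o"><</span><span class="nv">'a</span><span class="o">></span><span class="p">)</span> <span class="k">-></span> <span class="n">LayoutBox</span><span class="o"><</span><span class="nv">'a</span><span class="o">></span> <span class="p">{</span>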
<span class="n">LayoutBox</span> <span class="p">{</span>
<span class="n">box_type</span><span class="p">:</span> <span class="n">box_type</span><span class="p">,</span>
<span class="n">dimensions</span><span class="p">:</span> <span class="nn">Default</span><span class="p">::</span><span class="nf">default</span><span class="p">(),</span> <span class="c1">// initially set all fields to 0.0</span>
<span class="n">children</span><span class="p">:</span> <span class="nn">Vec</span><span class="p">::</span><span class="nf">new</span><span class="p">(),</span>
<span class="p">}</span>
<span class="p">}</span>
<span class="c1">// ...</span>
<span class="p">}</span></code></pre></figure>
<p>If a block node contains an inline child, create an anonymous block box to
contain it. If there are several inline children in a row, put them all in
the same anonymous container.</p>
<figure class="highlight"><pre><code class="language-rust" data-lang="rust"><span class="c1">// Where a new inline child should go.</span>
<span class="k">fn</span> <span class="nf">get_inline_container</span><span class="p">(</span><span class="o">&</span><span class="k">mut</span> <span class="k">self</span><span class="p">)</span> <span class="k">-></span> <span class="o">&</span><span class="k">mut</span> <span class="n">LayoutBox</span><span class="o"><</span><span class="nv">'a</span><span class="o">></span> <span class="p">{</span>
<span class="k">match</span> <span class="k">self</span><span class="py">.box_type</span> <span class="p">{</span>
<span class="nf">InlineNode</span><span class="p">(</span><span class="n">_</span><span class="p">)</span> <span class="p">|</span> <span class="n">AnonymousBlock</span> <span class="k">=></span> <span class="k">self</span><span class="p">,</span>
<span class="nf">BlockNode</span><span class="p">(</span><span class="n">_</span><span class="p">)</span> <span class="k">=></span> <span class="p">{</span>
<span class="c1">// If we've just generated an anonymous block box, keep using it.</span>
<span class="c1">// Otherwise, create a new one.</span>
<span class="k">match</span> <span class="k">self</span><span class="py">.children</span><span class="nf">.last</span><span class="p">()</span> <span class="p">{</span>
<span class="nf">Some</span><span class="p">(</span><span class="o">&</span><span class="n">LayoutBox</span> <span class="p">{</span> <span class="n">box_type</span><span class="p">:</span> <span class="n">AnonymousBlock</span><span class="p">,</span><span class="o">..</span><span class="p">})</span> <span class="k">=></span> <span class="p">{}</span>
<span class="n">_</span> <span class="k">=></span> <span class="k">self</span><span class="py">.children</span><span class="nf">.push</span><span class="p">(</span><span class="nn">LayoutBox</span><span class="p">::</span><span class="nf">new</span><span class="p">(</span><span class="n">AnonymousBlock</span><span class="p">))</span>
<span class="p">}</span>
<span class="k">self</span><span class="py">.children</span><span class="nf">.last_mut</span><span class="p">()</span><span class="nf">.unwrap</span><span class="p">()</span>
<span class="p">}</span>
<span class="p">}</span>
<span class="p">}</span></code></pre></figure>
<p>This is intentionally simplified in a number of ways from the standard CSS
<a href="http://www.w3.org/TR/CSS2/visuren.html#box-gen">box generation</a> algorithm. For example, it doesn’t handle the
case where an inline box contains a block-level child. Also, it generates
an unnecessary anonymous box if a block-level node has only inline children.</p>
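<p>The anonymous-box rule can be exercised in isolation with a stripped-down
sketch. To keep it self-contained, the boxes here carry a string label instead
of a reference to a styled node, and the <code>add_child</code> and <code>example_tree</code>
helpers are hypothetical names invented for this example. The
<code>get_inline_container</code> logic follows the same steps as the version above.</p>

```rust
// Simplified stand-ins: boxes carry a label instead of a `&StyledNode`.
#[derive(Debug, PartialEq)]
enum BoxType {
    Block(&'static str),
    Inline(&'static str),
    AnonymousBlock,
}

#[derive(Debug)]
struct LayoutBox {
    box_type: BoxType,
    children: Vec<LayoutBox>,
}

impl LayoutBox {
    fn new(box_type: BoxType) -> LayoutBox {
        LayoutBox { box_type, children: Vec::new() }
    }

    // Where a new inline child should go.
    fn get_inline_container(&mut self) -> &mut LayoutBox {
        match self.box_type {
            BoxType::Inline(_) | BoxType::AnonymousBlock => self,
            BoxType::Block(_) => {
                // Reuse a trailing anonymous block; otherwise create one.
                let reuse = matches!(
                    self.children.last(),
                    Some(LayoutBox { box_type: BoxType::AnonymousBlock, .. })
                );
                if !reuse {
                    self.children.push(LayoutBox::new(BoxType::AnonymousBlock));
                }
                self.children.last_mut().unwrap()
            }
        }
    }

    // Hypothetical helper: route a child the way `build_layout_tree` does.
    fn add_child(&mut self, child: LayoutBox) {
        match child.box_type {
            BoxType::Block(_) | BoxType::AnonymousBlock => self.children.push(child),
            BoxType::Inline(_) => self.get_inline_container().children.push(child),
        }
    }
}

// Build the a/b/c/d example from earlier: block, inline, inline, block.
fn example_tree() -> LayoutBox {
    let mut root = LayoutBox::new(BoxType::Block("container"));
    root.add_child(LayoutBox::new(BoxType::Block("a")));
    root.add_child(LayoutBox::new(BoxType::Inline("b")));
    root.add_child(LayoutBox::new(BoxType::Inline("c")));
    root.add_child(LayoutBox::new(BoxType::Block("d")));
    root
}
```

<p>The resulting root has three children: <code>a</code>, an anonymous block holding
<code>b</code> and <code>c</code>, and <code>d</code>. This matches the pink-box diagram in the previous
section.</p>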
<h2 id="to-be-continued">To Be Continued…</h2>
<p>Whew, that took longer than I expected. I think I’ll stop here for now, but
don’t worry: <a href="/mbrubeck/2014/09/17/toy-layout-engine-6-block.html">Part 6</a> is coming soon, and will cover block-level layout.</p>
<p>Once block layout is finished, we could jump ahead to the next stage of the
pipeline: painting! I think I might do that, because then we can finally see
the rendering engine’s output as pretty pictures instead of just numbers.</p>
<p>However, the pictures will just be a bunch of colored rectangles, unless we
finish the layout module by implementing inline layout and text layout. If I
don’t implement those before moving on to painting, I hope to come back to
them afterward.</p>
Matt Brubeck mbrubeck@limpet.net https://limpet.net/mbrubeck/
Let's build a browser engine! Part 4: Style
2014-08-23T15:45:00-07:00 https://limpet.net/mbrubeck//2014/08/23/toy-layout-engine-4-style
<p>Welcome back to my series on building your own toy browser engine. If you’re
just tuning in, you can find the previous episodes here:</p>
<blockquote>
<ul>
<li>
<a href="/mbrubeck/2014/08/08/toy-layout-engine-1.html">Part 1: Getting started</a>
</li>
<li>
<a href="/mbrubeck/2014/08/11/toy-layout-engine-2.html">Part 2: HTML</a>
</li>
<li>
<a href="/mbrubeck/2014/08/13/toy-layout-engine-3-css.html">Part 3: CSS</a>
</li>
<li>
<b>Part 4: Style</b>
</li>
<li>
<a href="/mbrubeck/2014/09/08/toy-layout-engine-5-boxes.html">Part 5: Boxes</a>
</li>
<li>
<a href="/mbrubeck/2014/09/17/toy-layout-engine-6-block.html">Part 6: Block layout</a>
</li>
<li>
<a href="/mbrubeck/2014/11/05/toy-layout-engine-7-painting.html">Part 7: Painting 101</a>
</li>
</ul>
</blockquote>
<p>This article will cover what the CSS standard calls <a href="http://www.w3.org/TR/CSS2/cascade.html">assigning property
values</a>, or what I call the <a href="https://github.com/mbrubeck/robinson/blob/275ea716d50565b10ce91c0054fbf527281180bb/src/style.rs">style</a> module.
This module takes DOM nodes and CSS rules as input, and matches them up to
determine the value of each CSS property for any given node.</p>
<p>This part doesn’t contain a lot of code, since I didn’t implement the really
complicated parts. However, I think what’s left is still quite interesting,
and I’ll also explain how some of the missing pieces can be implemented.</p>
<h2 id="the-style-tree">The Style Tree</h2>
<p>The output of robinson’s style module is something I call the <em>style tree.</em>
Each node in this tree includes a pointer to a DOM node, plus its CSS property
values:</p>
<figure class="highlight"><pre><code class="language-rust" data-lang="rust"><span class="c1">// Map from CSS property names to values.</span>
<span class="k">type</span> <span class="n">PropertyMap</span> <span class="o">=</span> <span class="n">HashMap</span><span class="o"><</span><span class="nb">String</span><span class="p">,</span> <span class="n">Value</span><span class="o">></span><span class="p">;</span>
<span class="c1">// A node with associated style data.</span>
<span class="k">struct</span> <span class="n">StyledNode</span><span class="o"><</span><span class="nv">'a</span><span class="o">></span> <span class="p">{</span>
<span class="n">node</span><span class="p">:</span> <span class="o">&</span><span class="nv">'a</span> <span class="n">Node</span><span class="p">,</span> <span class="c1">// pointer to a DOM node</span>
<span class="n">specified_values</span><span class="p">:</span> <span class="n">PropertyMap</span><span class="p">,</span>
<span class="n">children</span><span class="p">:</span> <span class="nb">Vec</span><span class="o"><</span><span class="n">StyledNode</span><span class="o"><</span><span class="nv">'a</span><span class="o">>></span><span class="p">,</span>
<span class="p">}</span></code></pre></figure>
<blockquote>
<p><strong>What’s with all the <code>'a</code> stuff?</strong> Those are <a href="http://doc.rust-lang.org/book/ownership.html">lifetimes</a>,
part of how Rust guarantees that pointers are memory-safe without requiring
garbage collection. If you’re not working in Rust you can ignore them; they
aren’t critical to the code’s meaning.</p>
</blockquote>
<p>We could add new fields to the <code>dom::Node</code> struct instead of creating a new
tree, but I wanted to keep style code out of the earlier “lessons.” This also
gives me an opportunity to talk about the parallel trees that inhabit most
rendering engines.</p>
<p>A browser engine module often takes one tree as input, and produces a
different but related tree as output. For example, Gecko’s <a href="https://wiki.mozilla.org/Gecko:Key_Gecko_Structures_And_Invariants">layout
code</a> takes a DOM tree and produces a <em>frame tree</em>, which is
then used to build a <em>view tree</em>. Blink and WebKit transform the DOM tree
into a <a href="http://dev.chromium.org/developers/design-documents/gpu-accelerated-compositing-in-chrome"><em>render tree</em></a>. Later stages in all these engines produce
still more trees, including <em>layer trees</em> and <em>widget trees</em>.</p>
<p>The pipeline for our toy browser engine will look something like this, after we
complete a few more stages:</p>
<p><img src="/mbrubeck/images/2014/pipeline.svg" style="width: 720px" /></p>
<p>In my implementation, each node in the DOM tree has exactly one node in the
style tree. But in a more complicated pipeline stage, several input nodes
could collapse into a single output node. Or an input node might expand into
several output nodes, or be skipped completely. For example, the style tree
could exclude elements whose <a href="https://developer.mozilla.org/en-US/docs/Web/CSS/display"><code>display</code></a> property is set to
<code>'none'</code>. (Instead I’ll remove these in the layout stage, because my code
turned out a bit simpler that way.)</p>
<h2 id="selector-matching">Selector Matching</h2>
<p>The first step in building the style tree is <a href="http://www.w3.org/TR/CSS2/selector.html#pattern-matching">selector matching</a>.
This will be very easy, since my <a href="/mbrubeck/2014/08/13/toy-layout-engine-3-css.html">CSS parser</a> supports only simple
selectors. You can tell whether a simple selector matches an element just by
looking at the element itself. Matching selectors joined by combinators would
require traversing the DOM tree to look at the element’s siblings, parents, etc.</p>
<figure class="highlight"><pre><code class="language-rust" data-lang="rust"><span class="k">fn</span> <span class="nf">matches</span><span class="p">(</span><span class="n">elem</span><span class="p">:</span> <span class="o">&</span><span class="n">ElementData</span><span class="p">,</span> <span class="n">selector</span><span class="p">:</span> <span class="o">&</span><span class="n">Selector</span><span class="p">)</span> <span class="k">-></span> <span class="nb">bool</span> <span class="p">{</span>
<span class="k">match</span> <span class="o">*</span><span class="n">selector</span> <span class="p">{</span>
<span class="nf">Simple</span><span class="p">(</span><span class="k">ref</span> <span class="n">simple_selector</span><span class="p">)</span> <span class="k">=></span> <span class="nf">matches_simple_selector</span><span class="p">(</span><span class="n">elem</span><span class="p">,</span> <span class="n">simple_selector</span><span class="p">)</span>
<span class="p">}</span>
<span class="p">}</span></code></pre></figure>
<p>To help, we’ll add some convenient ID and class accessors to our <a href="https://github.com/mbrubeck/robinson/blob/master/src/dom.rs">DOM element
type</a>. The <code>class</code> attribute can contain multiple class names
separated by spaces, which we return as a hash set.</p>
<figure class="highlight"><pre><code class="language-rust" data-lang="rust"><span class="k">impl</span> <span class="n">ElementData</span> <span class="p">{</span>
<span class="k">pub</span> <span class="k">fn</span> <span class="nf">id</span><span class="p">(</span><span class="o">&</span><span class="k">self</span><span class="p">)</span> <span class="k">-></span> <span class="nb">Option</span><span class="o"><&</span><span class="nb">String</span><span class="o">></span> <span class="p">{</span>
<span class="k">self</span><span class="py">.attributes</span><span class="nf">.get</span><span class="p">(</span><span class="s">"id"</span><span class="p">)</span>
<span class="p">}</span>
<span class="k">pub</span> <span class="k">fn</span> <span class="nf">classes</span><span class="p">(</span><span class="o">&</span><span class="k">self</span><span class="p">)</span> <span class="k">-></span> <span class="n">HashSet</span><span class="o"><&</span><span class="nb">str</span><span class="o">></span> <span class="p">{</span>
<span class="k">match</span> <span class="k">self</span><span class="py">.attributes</span><span class="nf">.get</span><span class="p">(</span><span class="s">"class"</span><span class="p">)</span> <span class="p">{</span>
<span class="nf">Some</span><span class="p">(</span><span class="n">classlist</span><span class="p">)</span> <span class="k">=></span> <span class="n">classlist</span><span class="nf">.split</span><span class="p">(</span><span class="sc">' '</span><span class="p">)</span><span class="nf">.collect</span><span class="p">(),</span>
<span class="nb">None</span> <span class="k">=></span> <span class="nn">HashSet</span><span class="p">::</span><span class="nf">new</span><span class="p">()</span>
<span class="p">}</span>
<span class="p">}</span>
<span class="p">}</span></code></pre></figure>
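<p>Here is a small self-contained sketch of these accessors for experimenting
outside the full project. It assumes a reduced <code>ElementData</code> with only an
attribute map, and the <code>element</code> constructor is a hypothetical helper added
for this example.</p>

```rust
use std::collections::{HashMap, HashSet};

// Reduced stand-in for the DOM element type (an assumption of this sketch).
struct ElementData {
    attributes: HashMap<String, String>,
}

impl ElementData {
    fn id(&self) -> Option<&String> {
        self.attributes.get("id")
    }

    // Split the `class` attribute on spaces and collect the names into a set.
    fn classes(&self) -> HashSet<&str> {
        match self.attributes.get("class") {
            Some(classlist) => classlist.split(' ').collect(),
            None => HashSet::new(),
        }
    }
}

// Hypothetical helper that builds an element from (name, value) pairs.
fn element(attrs: &[(&str, &str)]) -> ElementData {
    ElementData {
        attributes: attrs
            .iter()
            .map(|(k, v)| (k.to_string(), v.to_string()))
            .collect(),
    }
}
```

<p>Because the set borrows string slices from the attribute value, no class
names are copied. An element without a <code>class</code> attribute yields an empty
set.</p>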
<p>To test whether a simple selector matches an element, just look at each
selector component, and return <code>false</code> if the element doesn’t have a matching
class, ID, or tag name.</p>
<figure class="highlight"><pre><code class="language-rust" data-lang="rust"><span class="k">fn</span> <span class="nf">matches_simple_selector</span><span class="p">(</span><span class="n">elem</span><span class="p">:</span> <span class="o">&</span><span class="n">ElementData</span><span class="p">,</span> <span class="n">selector</span><span class="p">:</span> <span class="o">&</span><span class="n">SimpleSelector</span><span class="p">)</span> <span class="k">-></span> <span class="nb">bool</span> <span class="p">{</span>
<span class="c1">// Check type selector</span>
<span class="k">if</span> <span class="n">selector</span><span class="py">.tag_name</span><span class="nf">.iter</span><span class="p">()</span><span class="nf">.any</span><span class="p">(|</span><span class="n">name</span><span class="p">|</span> <span class="n">elem</span><span class="py">.tag_name</span> <span class="o">!=</span> <span class="o">*</span><span class="n">name</span><span class="p">)</span> <span class="p">{</span>
<span class="k">return</span> <span class="k">false</span><span class="p">;</span>
<span class="p">}</span>
<span class="c1">// Check ID selector</span>
<span class="k">if</span> <span class="n">selector</span><span class="py">.id</span><span class="nf">.iter</span><span class="p">()</span><span class="nf">.any</span><span class="p">(|</span><span class="n">id</span><span class="p">|</span> <span class="n">elem</span><span class="nf">.id</span><span class="p">()</span> <span class="o">!=</span> <span class="nf">Some</span><span class="p">(</span><span class="n">id</span><span class="p">))</span> <span class="p">{</span>
<span class="k">return</span> <span class="k">false</span><span class="p">;</span>
<span class="p">}</span>
<span class="c1">// Check class selectors</span>
<span class="k">let</span> <span class="n">elem_classes</span> <span class="o">=</span> <span class="n">elem</span><span class="nf">.classes</span><span class="p">();</span>
<span class="k">if</span> <span class="n">selector</span><span class="py">.class</span><span class="nf">.iter</span><span class="p">()</span><span class="nf">.any</span><span class="p">(|</span><span class="n">class</span><span class="p">|</span> <span class="o">!</span><span class="n">elem_classes</span><span class="nf">.contains</span><span class="p">(</span><span class="o">&**</span><span class="n">class</span><span class="p">))</span> <span class="p">{</span>
<span class="k">return</span> <span class="k">false</span><span class="p">;</span>
<span class="p">}</span>
<span class="c1">// We didn't find any non-matching selector components.</span>
<span class="k">return</span> <span class="k">true</span><span class="p">;</span>
<span class="p">}</span></code></pre></figure>
<blockquote>
<p><strong>Rust note:</strong> This function uses the <a href="http://doc.rust-lang.org/core/iter/trait.Iterator.html#method.any"><code>any</code></a> method, which
returns <code>true</code> if an iterator contains an element that passes the provided
test. This is the same as the <a href="https://docs.python.org/3/library/functions.html#any"><code>any</code></a> function in Python (<a href="http://hackage.haskell.org/package/base-4.7.0.1/docs/Prelude.html#v:any">or
Haskell</a>), or the <a href="https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Array/some"><code>some</code></a> method in JavaScript.</p>
</blockquote>
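<p>The same check can also be phrased positively: every component that the
selector specifies must match. The sketch below is an alternative formulation,
not the code robinson uses. It relies on reduced <code>ElementData</code> and
<code>SimpleSelector</code> types that are assumptions made so the example compiles on
its own.</p>

```rust
use std::collections::{HashMap, HashSet};

// Reduced stand-ins for the DOM and CSS types (assumptions of this sketch).
struct ElementData {
    tag_name: String,
    attributes: HashMap<String, String>,
}

impl ElementData {
    fn id(&self) -> Option<&String> {
        self.attributes.get("id")
    }

    fn classes(&self) -> HashSet<&str> {
        match self.attributes.get("class") {
            Some(classlist) => classlist.split(' ').collect(),
            None => HashSet::new(),
        }
    }
}

struct SimpleSelector {
    tag_name: Option<String>,
    id: Option<String>,
    class: Vec<String>,
}

fn matches_simple_selector(elem: &ElementData, selector: &SimpleSelector) -> bool {
    // A missing component (`None`) places no constraint on the element.
    let tag_ok = selector
        .tag_name
        .as_ref()
        .map_or(true, |name| elem.tag_name == *name);
    let id_ok = selector.id.as_ref().map_or(true, |id| elem.id() == Some(id));
    // Every class named by the selector must be present on the element.
    let elem_classes = elem.classes();
    let classes_ok = selector
        .class
        .iter()
        .all(|class| elem_classes.contains(class.as_str()));
    tag_ok && id_ok && classes_ok
}
```

<p>A selector with no tag name, no ID, and no classes matches every element,
just as in the <code>any</code>-based version.</p>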
<h2 id="building-the-style-tree">Building the Style Tree</h2>
<p>Next we need to traverse the DOM tree. For each element in the tree, we will
search the stylesheet for matching rules.</p>
<p>When comparing two rules that match the same element, we need to use the
highest-specificity selector from each match. Because our CSS parser stores
the selectors from most- to least-specific, we can stop as soon as we find a
matching one, and return its specificity along with a pointer to the rule.</p>
<figure class="highlight"><pre><code class="language-rust" data-lang="rust"><span class="k">type</span> <span class="n">MatchedRule</span><span class="o"><</span><span class="nv">'a</span><span class="o">></span> <span class="o">=</span> <span class="p">(</span><span class="n">Specificity</span><span class="p">,</span> <span class="o">&</span><span class="nv">'a</span> <span class="n">Rule</span><span class="p">);</span>
<span class="c1">// If `rule` matches `elem`, return a `MatchedRule`. Otherwise return `None`.</span>
<span class="k">fn</span> <span class="n">match_rule</span><span class="o"><</span><span class="nv">'a</span><span class="o">></span><span class="p">(</span><span class="n">elem</span><span class="p">:</span> <span class="o">&</span><span class="n">ElementData</span><span class="p">,</span> <span class="n">rule</span><span class="p">:</span> <span class="o">&</span><span class="nv">'a</span> <span class="n">Rule</span><span class="p">)</span> <span class="k">-></span> <span class="nb">Option</span><span class="o"><</span><span class="n">MatchedRule</span><span class="o"><</span><span class="nv">'a</span><span class="o">>></span> <span class="p">{</span>
<span class="c1">// Find the first (highest-specificity) matching selector.</span>
<span class="n">rule</span><span class="py">.selectors</span><span class="nf">.iter</span><span class="p">()</span>
<span class="nf">.find</span><span class="p">(|</span><span class="n">selector</span><span class="p">|</span> <span class="nf">matches</span><span class="p">(</span><span class="n">elem</span><span class="p">,</span> <span class="o">*</span><span class="n">selector</span><span class="p">))</span>
<span class="nf">.map</span><span class="p">(|</span><span class="n">selector</span><span class="p">|</span> <span class="p">(</span><span class="n">selector</span><span class="nf">.specificity</span><span class="p">(),</span> <span class="n">rule</span><span class="p">))</span>
<span class="p">}</span></code></pre></figure>
<p>To find all the rules that match an element we call <code>filter_map</code>, which
does a linear scan through the style sheet, checking every rule and throwing
out ones that don’t match. A real browser engine would speed this up by
storing the rules in multiple hash tables based on tag name, id, class, etc.</p>
<figure class="highlight"><pre><code class="language-rust" data-lang="rust"><span class="c1">// Find all CSS rules that match the given element.</span>
<span class="k">fn</span> <span class="n">matching_rules</span><span class="o"><</span><span class="nv">'a</span><span class="o">></span><span class="p">(</span><span class="n">elem</span><span class="p">:</span> <span class="o">&</span><span class="n">ElementData</span><span class="p">,</span> <span class="n">stylesheet</span><span class="p">:</span> <span class="o">&</span><span class="nv">'a</span> <span class="n">Stylesheet</span><span class="p">)</span> <span class="k">-></span> <span class="nb">Vec</span><span class="o"><</span><span class="n">MatchedRule</span><span class="o"><</span><span class="nv">'a</span><span class="o">>></span> <span class="p">{</span>
<span class="n">stylesheet</span><span class="py">.rules</span><span class="nf">.iter</span><span class="p">()</span><span class="nf">.filter_map</span><span class="p">(|</span><span class="n">rule</span><span class="p">|</span> <span class="nf">match_rule</span><span class="p">(</span><span class="n">elem</span><span class="p">,</span> <span class="n">rule</span><span class="p">))</span><span class="nf">.collect</span><span class="p">()</span>
<span class="p">}</span></code></pre></figure>
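<p>As a rough illustration of the hash-table idea, here is a minimal sketch that buckets rules by tag name. The <code>Rule</code> struct is a simplified stand-in with a hypothetical <code>label</code> field, and <code>RuleIndex</code> is not part of robinson. A real index would also need to record source order and handle rules with several selectors.</p>

```rust
use std::collections::HashMap;

// Simplified stand-in for robinson's `Rule`: only what the index needs.
struct Rule {
    tag_name: Option<String>, // `None` means the selector has no tag name
    label: &'static str,      // hypothetical field, used only to identify rules
}

// Hypothetical index: rules bucketed by tag name, plus rules with no tag name.
struct RuleIndex<'a> {
    by_tag: HashMap<&'a str, Vec<&'a Rule>>,
    untagged: Vec<&'a Rule>,
}

impl<'a> RuleIndex<'a> {
    fn new(rules: &'a [Rule]) -> RuleIndex<'a> {
        let mut index = RuleIndex { by_tag: HashMap::new(), untagged: Vec::new() };
        for rule in rules {
            match rule.tag_name.as_deref() {
                Some(tag) => index.by_tag.entry(tag).or_default().push(rule),
                None => index.untagged.push(rule),
            }
        }
        index
    }

    // Return only the rules that could possibly match an element with this tag.
    fn candidates(&self, tag: &str) -> Vec<&'a Rule> {
        let mut out: Vec<&'a Rule> = self.by_tag.get(tag).cloned().unwrap_or_default();
        out.extend(self.untagged.iter().copied());
        out
    }
}
```

<p>Each candidate would still go through <code>match_rule</code>; the index only reduces how many rules need to be checked per element.</p>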
<p>Once we have the matching rules, we can find the <em>specified values</em> for the
element. We insert each rule’s property values into a HashMap. We sort the
matches by specificity, so the more-specific rules are processed after the
less-specific ones, and can overwrite their values in the HashMap.</p>
<figure class="highlight"><pre><code class="language-rust" data-lang="rust"><span class="c1">// Apply styles to a single element, returning the specified values.</span>
<span class="k">fn</span> <span class="nf">specified_values</span><span class="p">(</span><span class="n">elem</span><span class="p">:</span> <span class="o">&</span><span class="n">ElementData</span><span class="p">,</span> <span class="n">stylesheet</span><span class="p">:</span> <span class="o">&</span><span class="n">Stylesheet</span><span class="p">)</span> <span class="k">-></span> <span class="n">PropertyMap</span> <span class="p">{</span>
<span class="k">let</span> <span class="k">mut</span> <span class="n">values</span> <span class="o">=</span> <span class="nn">HashMap</span><span class="p">::</span><span class="nf">new</span><span class="p">();</span>
<span class="k">let</span> <span class="k">mut</span> <span class="n">rules</span> <span class="o">=</span> <span class="nf">matching_rules</span><span class="p">(</span><span class="n">elem</span><span class="p">,</span> <span class="n">stylesheet</span><span class="p">);</span>
<span class="c1">// Go through the rules from lowest to highest specificity.</span>
<span class="n">rules</span><span class="nf">.sort_by</span><span class="p">(|</span><span class="o">&</span><span class="p">(</span><span class="n">a</span><span class="p">,</span> <span class="n">_</span><span class="p">),</span> <span class="o">&</span><span class="p">(</span><span class="n">b</span><span class="p">,</span> <span class="n">_</span><span class="p">)|</span> <span class="n">a</span><span class="nf">.cmp</span><span class="p">(</span><span class="o">&</span><span class="n">b</span><span class="p">));</span>
<span class="k">for</span> <span class="p">(</span><span class="n">_</span><span class="p">,</span> <span class="n">rule</span><span class="p">)</span> <span class="k">in</span> <span class="n">rules</span> <span class="p">{</span>
<span class="k">for</span> <span class="n">declaration</span> <span class="k">in</span> <span class="o">&</span><span class="n">rule</span><span class="py">.declarations</span> <span class="p">{</span>
<span class="n">values</span><span class="nf">.insert</span><span class="p">(</span><span class="n">declaration</span><span class="py">.name</span><span class="nf">.clone</span><span class="p">(),</span> <span class="n">declaration</span><span class="py">.value</span><span class="nf">.clone</span><span class="p">());</span>
<span class="p">}</span>
<span class="p">}</span>
<span class="k">return</span> <span class="n">values</span><span class="p">;</span>
<span class="p">}</span></code></pre></figure>
<p>Now we have everything we need to walk through the DOM tree and build the
style tree. Note that selector matching works only on elements, so the
specified values for a text node are just an empty map.</p>
<figure class="highlight"><pre><code class="language-rust" data-lang="rust"><span class="c1">// Apply a stylesheet to an entire DOM tree, returning a StyledNode tree.</span>
<span class="k">pub</span> <span class="k">fn</span> <span class="n">style_tree</span><span class="o"><</span><span class="nv">'a</span><span class="o">></span><span class="p">(</span><span class="n">root</span><span class="p">:</span> <span class="o">&</span><span class="nv">'a</span> <span class="n">Node</span><span class="p">,</span> <span class="n">stylesheet</span><span class="p">:</span> <span class="o">&</span><span class="nv">'a</span> <span class="n">Stylesheet</span><span class="p">)</span> <span class="k">-></span> <span class="n">StyledNode</span><span class="o"><</span><span class="nv">'a</span><span class="o">></span> <span class="p">{</span>
<span class="n">StyledNode</span> <span class="p">{</span>
<span class="n">node</span><span class="p">:</span> <span class="n">root</span><span class="p">,</span>
<span class="n">specified_values</span><span class="p">:</span> <span class="k">match</span> <span class="n">root</span><span class="py">.node_type</span> <span class="p">{</span>
<span class="nf">Element</span><span class="p">(</span><span class="k">ref</span> <span class="n">elem</span><span class="p">)</span> <span class="k">=></span> <span class="nf">specified_values</span><span class="p">(</span><span class="n">elem</span><span class="p">,</span> <span class="n">stylesheet</span><span class="p">),</span>
<span class="nf">Text</span><span class="p">(</span><span class="n">_</span><span class="p">)</span> <span class="k">=></span> <span class="nn">HashMap</span><span class="p">::</span><span class="nf">new</span><span class="p">()</span>
<span class="p">},</span>
<span class="n">children</span><span class="p">:</span> <span class="n">root</span><span class="py">.children</span><span class="nf">.iter</span><span class="p">()</span><span class="nf">.map</span><span class="p">(|</span><span class="n">child</span><span class="p">|</span> <span class="nf">style_tree</span><span class="p">(</span><span class="n">child</span><span class="p">,</span> <span class="n">stylesheet</span><span class="p">))</span><span class="nf">.collect</span><span class="p">(),</span>
<span class="p">}</span>
<span class="p">}</span></code></pre></figure>
<p>That’s all of robinson’s code for building the style tree. Next I’ll talk
about some glaring omissions.</p>
<h2 id="the-cascade">The Cascade</h2>
<p>Style sheets provided by the author of a web page are called <em>author style
sheets</em>. In addition to these, browsers also provide <a href="http://www.w3.org/TR/CSS2/sample.html">default
styles</a> via <em>user agent style sheets</em>. And they may allow users
to add custom styles through <em>user style sheets</em> (like Gecko’s
<a href="http://www-archive.mozilla.org/unix/customizing.html#usercss">userContent.css</a>).</p>
<p>The <a href="http://www.w3.org/TR/CSS2/cascade.html#cascade">cascade</a> defines which of these three “origins” takes
precedence over another. There are six levels to the cascade: one for each
origin’s “normal” declarations, plus one for each origin’s <code>!important</code>
declarations.</p>
<p>Robinson’s style code does not implement the cascade; it takes only a single
style sheet. The lack of a default style sheet means that HTML elements will
not have any of the default styles you might expect. For example, the
<code><head></code> element’s contents will not be hidden unless you explicitly add this
rule to your style sheet:</p>
<figure class="highlight"><pre><code class="language-css" data-lang="css"><span class="nt">head</span> <span class="p">{</span> <span class="nl">display</span><span class="p">:</span> <span class="nb">none</span><span class="p">;</span> <span class="p">}</span></code></pre></figure>
<p>Implementing the cascade should be fairly easy: Just track the origin of each
rule, and sort declarations by origin and importance in addition to
specificity. A simplified, two-level cascade should be enough to support the
most common cases: normal user agent styles and normal author styles.</p>
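<p>One possible shape for this, as a sketch rather than anything robinson implements: compute a sort key from origin, importance, and specificity. The <code>Origin</code> enum and both functions are hypothetical names, and the ordering of the six levels follows my reading of the CSS cascade (important declarations reverse the order of origins).</p>

```rust
// Hypothetical type for illustration; robinson does not track origins.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Origin {
    UserAgent,
    User,
    Author,
}

type Specificity = (usize, usize, usize);

// Map an (origin, importance) pair to one of six cascade levels.
// Higher levels take precedence over lower ones.
fn cascade_level(origin: Origin, important: bool) -> u8 {
    match (origin, important) {
        (Origin::UserAgent, false) => 0,
        (Origin::User, false) => 1,
        (Origin::Author, false) => 2,
        (Origin::Author, true) => 3,
        (Origin::User, true) => 4,
        (Origin::UserAgent, true) => 5,
    }
}

// Sort key: cascade level first, then specificity. Tuples compare
// lexicographically, so specificity only breaks ties within a level.
fn cascade_key(origin: Origin, important: bool, specificity: Specificity) -> (u8, Specificity) {
    (cascade_level(origin, important), specificity)
}
```

<p>Sorting declarations by this key with a stable sort keeps source order as the final tie-breaker, so inserting them into the map in sorted order lets later declarations overwrite earlier ones.</p>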
<h2 id="computed-values">Computed Values</h2>
<p>In addition to the “specified values” mentioned above, CSS defines <a href="http://www.w3.org/TR/CSS2/cascade.html#value-stages"><em>initial</em>,
<em>computed</em>, <em>used</em>, and <em>actual</em> values</a>.</p>
<p><em>Initial values</em> are defaults for properties that aren’t specified in the
cascade. <em>Computed values</em> are based on specified values, but may have some
property-specific normalization rules applied.</p>
<p>Implementing these correctly requires separate code for each property, based
on its definition in the CSS specs. This work is necessary for a real-world
browser engine, but I’m hoping to avoid it in this toy project. In later
stages, code that uses these values will (sort of) simulate initial values by
using a default when the specified value is missing.</p>
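<p>A minimal sketch of that fallback, assuming a cut-down <code>Value</code> enum and a hypothetical helper named <code>value_or_initial</code> (neither is robinson's actual code):</p>

```rust
use std::collections::HashMap;

// Cut-down stand-in for the `Value` enum from the CSS parser.
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Keyword(String),
    Length(f32),
}

type PropertyMap = HashMap<String, Value>;

// Hypothetical helper: return the specified value for `name`, or the
// caller-supplied initial value when the property was never specified.
fn value_or_initial(values: &PropertyMap, name: &str, initial: &Value) -> Value {
    values.get(name).cloned().unwrap_or_else(|| initial.clone())
}
```

<p>The caller decides what the initial value is for each property, which keeps the per-property knowledge out of the style module.</p>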
<p><em>Used values</em> and <em>actual values</em> are calculated during and after layout, which
I’ll cover in future articles.</p>
<h2 id="inheritance">Inheritance</h2>
<p>If text nodes can’t match selectors, how do they get colors and fonts and
other styles? The answer is <a href="http://www.w3.org/TR/CSS2/cascade.html#inheritance">inheritance</a>.</p>
<p>When a property is inherited, any node without a cascaded value will receive
its parent’s value for that property. Some properties, like <code>'color'</code>, are
inherited by default; others only if the cascade specifies the special
value <code>'inherit'</code>.</p>
<p>My code does not support inheritance. To implement it, you could pass the
parent’s style data into the <code>specified_values</code> function, and use a hard-coded
lookup table to decide which properties should be inherited.</p>
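<p>Here is one way that could look, as a sketch only. Values are simplified to strings, <code>INHERITED_BY_DEFAULT</code> is a deliberately short hypothetical table, and <code>with_inheritance</code> is an invented name for the merging step:</p>

```rust
use std::collections::HashMap;

// Values are simplified to strings here; robinson uses a `Value` enum.
type PropertyMap = HashMap<String, String>;

// Hypothetical, deliberately incomplete table of properties inherited by default.
const INHERITED_BY_DEFAULT: &[&str] = &["color", "font-family", "font-size"];

// Combine the parent's values with the element's own specified values.
fn with_inheritance(parent: &PropertyMap, own: &PropertyMap) -> PropertyMap {
    let mut result = PropertyMap::new();
    // Start from the parent's values for properties that inherit by default.
    for &name in INHERITED_BY_DEFAULT {
        if let Some(value) = parent.get(name) {
            result.insert(name.to_string(), value.clone());
        }
    }
    // The element's own values win, except that `inherit` copies from the parent.
    for (name, value) in own {
        if value == "inherit" {
            if let Some(parent_value) = parent.get(name) {
                result.insert(name.clone(), parent_value.clone());
            }
        } else {
            result.insert(name.clone(), value.clone());
        }
    }
    result
}
```

<p>With something like this in place, a text node could receive its parent's map and an empty map of its own, and end up with the parent's <code>color</code> and font properties.</p>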
<h2 id="style-attributes">Style Attributes</h2>
<p>Any HTML element can include a <code>style</code> attribute containing a list of CSS
declarations. There are no selectors, because these declarations
automatically apply only to the element itself.</p>
<figure class="highlight"><pre><code class="language-html" data-lang="html"><span class="nt"><span</span> <span class="na">style=</span><span class="s">"color: red; background: yellow;"</span><span class="nt">></span></code></pre></figure>
<p>If you want to support the <code>style</code> attribute, make the <code>specified_values</code> function
check for the attribute. If the attribute is present, pass it to
<code>parse_declarations</code> from the <a href="/mbrubeck/2014/08/13/toy-layout-engine-3-css.html">CSS parser</a>. Apply the resulting
declarations <em>after</em> the normal author declarations, since the attribute is
more specific than any CSS selector.</p>
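<p>A sketch of that last step, under some assumptions: property values are simplified to strings, the attribute map is a plain <code>HashMap</code>, and the declaration parser is passed in as a function so the example does not depend on the real parser's signature. The name <code>apply_style_attribute</code> is hypothetical.</p>

```rust
use std::collections::HashMap;

// Simplified stand-ins: both maps use plain strings in this sketch.
type PropertyMap = HashMap<String, String>;
type AttrMap = HashMap<String, String>;

// Apply declarations from the `style` attribute, if present, on top of the
// values already collected from the stylesheet. `parse_declarations` is
// supplied by the caller and stands in for the CSS parser.
fn apply_style_attribute<F>(values: &mut PropertyMap, attributes: &AttrMap, parse_declarations: F)
where
    F: Fn(&str) -> Vec<(String, String)>,
{
    if let Some(style) = attributes.get("style") {
        for (name, value) in parse_declarations(style.as_str()) {
            // Inserted last, so these overwrite any stylesheet values.
            values.insert(name, value);
        }
    }
}
```

<p>Calling this at the end of <code>specified_values</code>, after the loop over matched rules, would give the attribute the final say.</p>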
<h2 id="exercises">Exercises</h2>
<p>In addition to writing your own selector matching and value assignment code,
for further exercise you can implement one or more of the missing pieces
discussed above, in your own project or a fork of robinson:</p>
<ol>
<li>Cascading</li>
<li>Initial and/or computed values</li>
<li>Inheritance</li>
<li>The <code>style</code> attribute</li>
</ol>
<p>Also, if you extended the CSS parser from Part 3 to include compound
selectors, you can now implement matching for those compound selectors.</p>
<h2 id="to-be-continued">To Be Continued…</h2>
<p><a href="/mbrubeck/2014/09/08/toy-layout-engine-5-boxes.html">Part 5</a> will introduce the layout module. I haven’t
finished the code for this yet, so there will be another delay before I can
start writing the article. I plan to split layout into at least two articles
(one for block layout and one for inline layout, probably).</p>
<p>In the meantime, I’d love to see anything you’ve created based on these
articles or exercises. If your code is online somewhere, feel free to add a
link to the comments below! So far I have seen Martin Tomasi’s <a href="http://www.wambo.at:8080/GyrosOfWar/browserino/tree/master">Java
implementation</a> and Pohl Longsine’s <a href="http://www.screaming.org/blog/categories/crow/">Swift version</a>.</p>
Matt Brubeck mbrubeck@limpet.net https://limpet.net/mbrubeck/
Let's build a browser engine! Part 3: CSS
2014-08-13T12:30:00-07:00
https://limpet.net/mbrubeck//2014/08/13/toy-layout-engine-3-css
<p>This is the third in a series of articles on building a toy browser rendering
engine. Want to build your own? Start at the beginning to learn more:</p>
<blockquote>
<ul>
<li>
<a href="/mbrubeck/2014/08/08/toy-layout-engine-1.html">Part 1: Getting started</a>
</li>
<li>
<a href="/mbrubeck/2014/08/11/toy-layout-engine-2.html">Part 2: HTML</a>
</li>
<li>
<b>Part 3: CSS</b>
</li>
<li>
<a href="/mbrubeck/2014/08/23/toy-layout-engine-4-style.html">Part 4: Style</a>
</li>
<li>
<a href="/mbrubeck/2014/09/08/toy-layout-engine-5-boxes.html">Part 5: Boxes</a>
</li>
<li>
<a href="/mbrubeck/2014/09/17/toy-layout-engine-6-block.html">Part 6: Block layout</a>
</li>
<li>
<a href="/mbrubeck/2014/11/05/toy-layout-engine-7-painting.html">Part 7: Painting 101</a>
</li>
</ul>
</blockquote>
<p>This article introduces code for reading <a href="http://www.w3.org/TR/CSS2/">Cascading Style Sheets (CSS)</a>.
As usual, I won’t try to cover everything in the spec. Instead, I tried to
implement just enough to illustrate some concepts and produce input for later
stages in the rendering pipeline.</p>
<h2 id="anatomy-of-a-stylesheet">Anatomy of a Stylesheet</h2>
<p>Here’s an example of CSS source code:</p>
<figure class="highlight"><pre><code class="language-css" data-lang="css"><span class="nt">h1</span><span class="o">,</span> <span class="nt">h2</span><span class="o">,</span> <span class="nt">h3</span> <span class="p">{</span> <span class="nl">margin</span><span class="p">:</span> <span class="nb">auto</span><span class="p">;</span> <span class="nl">color</span><span class="p">:</span> <span class="m">#cc0000</span><span class="p">;</span> <span class="p">}</span>
<span class="nt">div</span><span class="nc">.note</span> <span class="p">{</span> <span class="nl">margin-bottom</span><span class="p">:</span> <span class="m">20px</span><span class="p">;</span> <span class="nl">padding</span><span class="p">:</span> <span class="m">10px</span><span class="p">;</span> <span class="p">}</span>
<span class="nf">#answer</span> <span class="p">{</span> <span class="nl">display</span><span class="p">:</span> <span class="nb">none</span><span class="p">;</span> <span class="p">}</span></code></pre></figure>
<p>Next I’ll walk through the <a href="https://github.com/mbrubeck/robinson/blob/master/src/css.rs">css module</a> from my toy browser engine,
<a href="https://github.com/mbrubeck/robinson">robinson</a>. The code is written in <a href="http://www.rust-lang.org/">Rust</a>, though the
concepts should translate pretty easily into other programming languages.
Reading the previous articles first might help you understand some of the code
below.</p>
<p>A CSS <strong>stylesheet</strong> is a series of rules. (In the example stylesheet above,
each line contains one rule.)</p>
<figure class="highlight"><pre><code class="language-rust" data-lang="rust"><span class="k">struct</span> <span class="n">Stylesheet</span> <span class="p">{</span>
<span class="n">rules</span><span class="p">:</span> <span class="nb">Vec</span><span class="o"><</span><span class="n">Rule</span><span class="o">></span><span class="p">,</span>
<span class="p">}</span></code></pre></figure>
<p>A <strong>rule</strong> includes one or more selectors separated by commas, followed by a
series of declarations enclosed in braces.</p>
<figure class="highlight"><pre><code class="language-rust" data-lang="rust"><span class="k">struct</span> <span class="n">Rule</span> <span class="p">{</span>
<span class="n">selectors</span><span class="p">:</span> <span class="nb">Vec</span><span class="o"><</span><span class="n">Selector</span><span class="o">></span><span class="p">,</span>
<span class="n">declarations</span><span class="p">:</span> <span class="nb">Vec</span><span class="o"><</span><span class="n">Declaration</span><span class="o">></span><span class="p">,</span>
<span class="p">}</span></code></pre></figure>
<p>A <strong>selector</strong> can be a <a href="http://www.w3.org/TR/CSS2/selector.html#selector-syntax">simple selector</a>, or it can be a chain of
selectors joined by <em>combinators</em>. Robinson supports only simple selectors
for now.</p>
<blockquote>
<p><strong>Note:</strong> Confusingly, the newer <a href="http://www.w3.org/TR/css3-selectors/">Selectors Level 3</a> standard uses
the same terms to mean slightly different things. In this article I’ll
mostly refer to CSS2.1. Although outdated, it’s a useful starting point
because it’s smaller and more self-contained (compared to CSS3, which is
split into myriad specs that depend on each other and CSS2.1).</p>
</blockquote>
<p>In robinson, a <strong>simple selector</strong> can include a tag name, an ID prefixed by
<code>'#'</code>, any number of class names prefixed by <code>'.'</code>, or some combination of the
above. If the tag name is empty or <code>'*'</code> then it is a “universal selector” that
can match any tag.</p>
<p>There are many other types of selector (especially in CSS3), but this will do
for now.</p>
<figure class="highlight"><pre><code class="language-rust" data-lang="rust"><span class="k">enum</span> <span class="n">Selector</span> <span class="p">{</span>
<span class="nf">Simple</span><span class="p">(</span><span class="n">SimpleSelector</span><span class="p">),</span>
<span class="p">}</span>
<span class="k">struct</span> <span class="n">SimpleSelector</span> <span class="p">{</span>
<span class="n">tag_name</span><span class="p">:</span> <span class="nb">Option</span><span class="o"><</span><span class="nb">String</span><span class="o">></span><span class="p">,</span>
<span class="n">id</span><span class="p">:</span> <span class="nb">Option</span><span class="o"><</span><span class="nb">String</span><span class="o">></span><span class="p">,</span>
<span class="n">class</span><span class="p">:</span> <span class="nb">Vec</span><span class="o"><</span><span class="nb">String</span><span class="o">></span><span class="p">,</span>
<span class="p">}</span></code></pre></figure>
<p>A <strong>declaration</strong> is just a name/value pair, separated by a colon and ending
with a semicolon. For example, <code>"margin: auto;"</code> is a declaration.</p>
<figure class="highlight"><pre><code class="language-rust" data-lang="rust"><span class="k">struct</span> <span class="n">Declaration</span> <span class="p">{</span>
<span class="n">name</span><span class="p">:</span> <span class="nb">String</span><span class="p">,</span>
<span class="n">value</span><span class="p">:</span> <span class="n">Value</span><span class="p">,</span>
<span class="p">}</span></code></pre></figure>
<p>My toy engine supports only a handful of CSS’s many <strong>value</strong> types.</p>
<figure class="highlight"><pre><code class="language-rust" data-lang="rust"><span class="k">enum</span> <span class="n">Value</span> <span class="p">{</span>
<span class="nf">Keyword</span><span class="p">(</span><span class="nb">String</span><span class="p">),</span>
<span class="nf">Length</span><span class="p">(</span><span class="nb">f32</span><span class="p">,</span> <span class="n">Unit</span><span class="p">),</span>
<span class="nf">ColorValue</span><span class="p">(</span><span class="n">Color</span><span class="p">),</span>
<span class="c1">// insert more values here</span>
<span class="p">}</span>
<span class="k">enum</span> <span class="n">Unit</span> <span class="p">{</span>
<span class="n">Px</span><span class="p">,</span>
<span class="c1">// insert more units here</span>
<span class="p">}</span>
<span class="k">struct</span> <span class="n">Color</span> <span class="p">{</span>
<span class="n">r</span><span class="p">:</span> <span class="nb">u8</span><span class="p">,</span>
<span class="n">g</span><span class="p">:</span> <span class="nb">u8</span><span class="p">,</span>
<span class="n">b</span><span class="p">:</span> <span class="nb">u8</span><span class="p">,</span>
<span class="n">a</span><span class="p">:</span> <span class="nb">u8</span><span class="p">,</span>
<span class="p">}</span></code></pre></figure>
<blockquote>
<p><strong>Rust note:</strong> <code>u8</code> is an 8-bit unsigned integer, and <code>f32</code> is a 32-bit float.</p>
</blockquote>
<p>All other CSS syntax is unsupported, including @-rules, comments, and any
selectors/values/units not mentioned above.</p>
<h2 id="parsing">Parsing</h2>
<p>CSS has a straightforward <a href="http://www.w3.org/TR/CSS2/grammar.html">grammar</a>, making it easier to parse correctly
than its quirky cousin HTML. When a standards-compliant CSS parser encounters
a <a href="http://www.w3.org/TR/CSS2/syndata.html#parsing-errors">parse error</a>, it discards the unrecognized part of the stylesheet
but still processes the remaining portions. This is useful because it allows
stylesheets to include new syntax but still produce well-defined output in
older browsers.</p>
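<p>To make the recovery idea concrete, here is a heavily simplified sketch that works on the text between the braces of a rule. It drops any declaration it cannot understand and resumes after the next semicolon. The function names are hypothetical, values are kept as raw strings, and the real error-handling rules also account for quotes and nested brackets, which this ignores.</p>

```rust
// Simplified check for a property name: letters, digits, `-` and `_` only.
fn is_identifier(s: &str) -> bool {
    !s.is_empty() && s.chars().all(|c| c.is_ascii_alphanumeric() || c == '-' || c == '_')
}

// Parse `name: value` pairs separated by semicolons, discarding malformed
// declarations instead of failing on them.
fn parse_declarations_lenient(body: &str) -> Vec<(String, String)> {
    let mut declarations = Vec::new();
    for chunk in body.split(';') {
        let chunk = chunk.trim();
        if chunk.is_empty() {
            continue;
        }
        match chunk.split_once(':') {
            Some((name, value)) if is_identifier(name.trim()) && !value.trim().is_empty() => {
                declarations.push((name.trim().to_string(), value.trim().to_string()));
            }
            // Parse error: drop this declaration and carry on with the next one.
            _ => {}
        }
    }
    declarations
}
```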
<p>Robinson uses a very simplistic (and totally <em>not</em> standards-compliant)
parser, built the same way as the HTML parser from <a href="/mbrubeck/2014/08/11/toy-layout-engine-2.html">Part 2</a>.
Rather than go through the whole thing line-by-line again, I’ll just paste in
a few snippets. For example, here is the code for parsing a single selector:</p>
<figure class="highlight"><pre><code class="language-rust" data-lang="rust"><span class="c1">// Parse one simple selector, e.g.: `type#id.class1.class2.class3`</span>
<span class="k">fn</span> <span class="nf">parse_simple_selector</span><span class="p">(</span><span class="o">&</span><span class="k">mut</span> <span class="k">self</span><span class="p">)</span> <span class="k">-></span> <span class="n">SimpleSelector</span> <span class="p">{</span>
<span class="k">let</span> <span class="k">mut</span> <span class="n">selector</span> <span class="o">=</span> <span class="n">SimpleSelector</span> <span class="p">{</span> <span class="n">tag_name</span><span class="p">:</span> <span class="nb">None</span><span class="p">,</span> <span class="n">id</span><span class="p">:</span> <span class="nb">None</span><span class="p">,</span> <span class="n">class</span><span class="p">:</span> <span class="nn">Vec</span><span class="p">::</span><span class="nf">new</span><span class="p">()</span> <span class="p">};</span>
<span class="k">while</span> <span class="o">!</span><span class="k">self</span><span class="nf">.eof</span><span class="p">()</span> <span class="p">{</span>
<span class="k">match</span> <span class="k">self</span><span class="nf">.next_char</span><span class="p">()</span> <span class="p">{</span>
<span class="sc">'#'</span> <span class="k">=></span> <span class="p">{</span>
<span class="k">self</span><span class="nf">.consume_char</span><span class="p">();</span>
<span class="n">selector</span><span class="py">.id</span> <span class="o">=</span> <span class="nf">Some</span><span class="p">(</span><span class="k">self</span><span class="nf">.parse_identifier</span><span class="p">());</span>
<span class="p">}</span>
<span class="sc">'.'</span> <span class="k">=></span> <span class="p">{</span>
<span class="k">self</span><span class="nf">.consume_char</span><span class="p">();</span>
<span class="n">selector</span><span class="py">.class</span><span class="nf">.push</span><span class="p">(</span><span class="k">self</span><span class="nf">.parse_identifier</span><span class="p">());</span>
<span class="p">}</span>
<span class="sc">'*'</span> <span class="k">=></span> <span class="p">{</span>
<span class="c1">// universal selector</span>
<span class="k">self</span><span class="nf">.consume_char</span><span class="p">();</span>
<span class="p">}</span>
<span class="n">c</span> <span class="k">if</span> <span class="nf">valid_identifier_char</span><span class="p">(</span><span class="n">c</span><span class="p">)</span> <span class="k">=></span> <span class="p">{</span>
<span class="n">selector</span><span class="py">.tag_name</span> <span class="o">=</span> <span class="nf">Some</span><span class="p">(</span><span class="k">self</span><span class="nf">.parse_identifier</span><span class="p">());</span>
<span class="p">}</span>
<span class="n">_</span> <span class="k">=></span> <span class="k">break</span>
<span class="p">}</span>
<span class="p">}</span>
<span class="k">return</span> <span class="n">selector</span><span class="p">;</span>
<span class="p">}</span></code></pre></figure>
<p>Note the lack of error checking. Some malformed input like <code>###</code> or <code>*foo*</code>
will parse successfully and produce weird results. A real CSS parser would
discard these invalid selectors.</p>
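<p>For comparison, a stricter check might look like the sketch below. It accepts an optional tag name or <code>*</code> followed by any number of <code>#id</code> and <code>.class</code> parts, and rejects everything else. Both function names are hypothetical, and the identifier rule is much simpler than the one in the CSS grammar.</p>

```rust
use std::iter::Peekable;
use std::str::Chars;

// Simplified identifier rule; the CSS grammar allows more than this.
fn is_ident_char(c: char) -> bool {
    c.is_ascii_alphanumeric() || c == '-' || c == '_'
}

// Consume one identifier, returning false if no identifier characters were found.
fn take_identifier(chars: &mut Peekable<Chars<'_>>) -> bool {
    let mut found = false;
    while let Some(&c) = chars.peek() {
        if !is_ident_char(c) {
            break;
        }
        chars.next();
        found = true;
    }
    found
}

// Accept `tag`, `*`, `#id`, `.class` and combinations such as `tag#id.class`.
fn is_valid_simple_selector(input: &str) -> bool {
    let mut chars = input.chars().peekable();
    let mut components = 0;
    // Optional leading type selector: `*` or a tag name.
    if chars.peek() == Some(&'*') {
        chars.next();
        components += 1;
    } else if take_identifier(&mut chars) {
        components += 1;
    }
    // Zero or more `#id` or `.class` parts, each needing an identifier.
    while let Some(&c) = chars.peek() {
        if c != '#' && c != '.' {
            return false;
        }
        chars.next();
        if !take_identifier(&mut chars) {
            return false;
        }
        components += 1;
    }
    components > 0
}
```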
<h2 id="specificity">Specificity</h2>
<p><a href="http://www.w3.org/TR/selectors/#specificity">Specificity</a> is one of the ways a rendering engine decides
which style overrides the other in a conflict. If a stylesheet contains two
rules that match an element, the rule with the matching selector of higher
specificity can override values from the one with lower specificity.</p>
<p>The specificity of a selector is based on its components. An ID selector is
more specific than a class selector, which is more specific than a tag
selector. Within each of these “levels,” more selectors beats fewer.</p>
<figure class="highlight"><pre><code class="language-rust" data-lang="rust"><span class="k">pub</span> <span class="k">type</span> <span class="n">Specificity</span> <span class="o">=</span> <span class="p">(</span><span class="nb">usize</span><span class="p">,</span> <span class="nb">usize</span><span class="p">,</span> <span class="nb">usize</span><span class="p">);</span>
<span class="k">impl</span> <span class="n">Selector</span> <span class="p">{</span>
<span class="k">pub</span> <span class="k">fn</span> <span class="nf">specificity</span><span class="p">(</span><span class="o">&</span><span class="k">self</span><span class="p">)</span> <span class="k">-></span> <span class="n">Specificity</span> <span class="p">{</span>
<span class="c1">// http://www.w3.org/TR/selectors/#specificity</span>
<span class="k">let</span> <span class="nn">Selector</span><span class="p">::</span><span class="nf">Simple</span><span class="p">(</span><span class="k">ref</span> <span class="n">simple</span><span class="p">)</span> <span class="o">=</span> <span class="o">*</span><span class="k">self</span><span class="p">;</span>
<span class="k">let</span> <span class="n">a</span> <span class="o">=</span> <span class="n">simple</span><span class="py">.id</span><span class="nf">.iter</span><span class="p">()</span><span class="nf">.count</span><span class="p">();</span>
<span class="k">let</span> <span class="n">b</span> <span class="o">=</span> <span class="n">simple</span><span class="py">.class</span><span class="nf">.len</span><span class="p">();</span>
<span class="k">let</span> <span class="n">c</span> <span class="o">=</span> <span class="n">simple</span><span class="py">.tag_name</span><span class="nf">.iter</span><span class="p">()</span><span class="nf">.count</span><span class="p">();</span>
<span class="p">(</span><span class="n">a</span><span class="p">,</span> <span class="n">b</span><span class="p">,</span> <span class="n">c</span><span class="p">)</span>
<span class="p">}</span>
<span class="p">}</span></code></pre></figure>
<p>(If we supported chained selectors, we could calculate the specificity of a
chain just by adding up the specificities of its parts.)</p>
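<p>As a small sketch of that addition (the function <code>chain_specificity</code> is hypothetical, since robinson has no chained selectors), the parts can be summed component by component:</p>

```rust
type Specificity = (usize, usize, usize);

// Add up the specificities of the simple selectors in a chain.
fn chain_specificity(parts: &[Specificity]) -> Specificity {
    parts
        .iter()
        .fold((0, 0, 0), |(a, b, c), &(x, y, z)| (a + x, b + y, c + z))
}
```

<p>Because Rust compares tuples lexicographically, the summed result still orders correctly: a single ID selector <code>(1, 0, 0)</code> outranks a chain like <code>div.note p</code>, whose parts <code>(0, 1, 1)</code> and <code>(0, 0, 1)</code> add up to <code>(0, 1, 2)</code>.</p>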
<p>The selectors for each rule are stored in a sorted vector, most-specific
first. This will be important in matching, which I’ll cover in the next
article.</p>
<figure class="highlight"><pre><code class="language-rust" data-lang="rust"><span class="c1">// Parse a rule set: `<selectors> { <declarations> }`.</span>
<span class="k">fn</span> <span class="nf">parse_rule</span><span class="p">(</span><span class="o">&</span><span class="k">mut</span> <span class="k">self</span><span class="p">)</span> <span class="k">-></span> <span class="n">Rule</span> <span class="p">{</span>
<span class="n">Rule</span> <span class="p">{</span>
<span class="n">selectors</span><span class="p">:</span> <span class="k">self</span><span class="nf">.parse_selectors</span><span class="p">(),</span>
<span class="n">declarations</span><span class="p">:</span> <span class="k">self</span><span class="nf">.parse_declarations</span><span class="p">()</span>
<span class="p">}</span>
<span class="p">}</span>
<span class="c1">// Parse a comma-separated list of selectors.</span>
<span class="k">fn</span> <span class="nf">parse_selectors</span><span class="p">(</span><span class="o">&</span><span class="k">mut</span> <span class="k">self</span><span class="p">)</span> <span class="k">-></span> <span class="nb">Vec</span><span class="o"><</span><span class="n">Selector</span><span class="o">></span> <span class="p">{</span>
<span class="k">let</span> <span class="k">mut</span> <span class="n">selectors</span> <span class="o">=</span> <span class="nn">Vec</span><span class="p">::</span><span class="nf">new</span><span class="p">();</span>
<span class="k">loop</span> <span class="p">{</span>
<span class="n">selectors</span><span class="nf">.push</span><span class="p">(</span><span class="nn">Selector</span><span class="p">::</span><span class="nf">Simple</span><span class="p">(</span><span class="k">self</span><span class="nf">.parse_simple_selector</span><span class="p">()));</span>
<span class="k">self</span><span class="nf">.consume_whitespace</span><span class="p">();</span>
<span class="k">match</span> <span class="k">self</span><span class="nf">.next_char</span><span class="p">()</span> <span class="p">{</span>
<span class="sc">','</span> <span class="k">=></span> <span class="p">{</span> <span class="k">self</span><span class="nf">.consume_char</span><span class="p">();</span> <span class="k">self</span><span class="nf">.consume_whitespace</span><span class="p">();</span> <span class="p">}</span>
<span class="sc">'{'</span> <span class="k">=></span> <span class="k">break</span><span class="p">,</span> <span class="c1">// start of declarations</span>
<span class="n">c</span> <span class="k">=></span> <span class="nd">panic!</span><span class="p">(</span><span class="s">"Unexpected character {} in selector list"</span><span class="p">,</span> <span class="n">c</span><span class="p">)</span>
<span class="p">}</span>
<span class="p">}</span>
<span class="c1">// Return selectors with highest specificity first, for use in matching.</span>
<span class="n">selectors</span><span class="nf">.sort_by</span><span class="p">(|</span><span class="n">a</span><span class="p">,</span><span class="n">b</span><span class="p">|</span> <span class="n">b</span><span class="nf">.specificity</span><span class="p">()</span><span class="nf">.cmp</span><span class="p">(</span><span class="o">&</span><span class="n">a</span><span class="nf">.specificity</span><span class="p">()));</span>
<span class="k">return</span> <span class="n">selectors</span><span class="p">;</span>
<span class="p">}</span></code></pre></figure>
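<p>The sort at the end compares <code>b</code> to <code>a</code> rather than <code>a</code> to <code>b</code>, which is what puts the highest specificity first. Here is a minimal sketch of that idea on its own, assuming specificity is a tuple of three counts that compares lexicographically. The <code>sort_most_specific_first</code> helper is hypothetical and exists only for illustration:</p>

```rust
// Assumed representation: (id count, class count, tag count).
// Rust compares tuples lexicographically, left to right.
type Specificity = (usize, usize, usize);

// Hypothetical helper: order specificities from highest to lowest.
fn sort_most_specific_first(mut specs: Vec<Specificity>) -> Vec<Specificity> {
    // Comparing b to a (instead of a to b) yields descending order.
    specs.sort_by(|a, b| b.cmp(a));
    specs
}
```

<p>Because the comparison is lexicographic, a single ID outranks any number of classes, which matches how CSS specificity is meant to work.</p>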
<p>The rest of the CSS parser is fairly straightforward. You can read the whole
thing <a href="https://github.com/mbrubeck/robinson/blob/master/src/css.rs">on GitHub</a>. And if you didn’t already do it for Part 2,
this would be a great time to try out a parser generator. My hand-rolled
parser gets the job done for simple example files, but it has a lot of hacky
bits and will fail badly if you violate its assumptions. Someday I might
replace it with one built on <a href="https://github.com/kevinmehall/rust-peg/">rust-peg</a> or similar.</p>
<h2 id="exercises">Exercises</h2>
<p>As before, you should decide which of these exercises you want to do, and
skip the rest:</p>
<ol>
<li>
<p>Implement your own simplified CSS parser and specificity calculation.</p>
</li>
<li>
<p>Extend robinson’s CSS parser to support more values, or one or more
selector combinators.</p>
</li>
<li>
<p>Extend the CSS parser to discard any declaration that contains a parse
error, and follow the <a href="http://www.w3.org/TR/CSS2/syndata.html#parsing-errors">error handling rules</a> to resume parsing
after the end of the declaration.</p>
</li>
<li>
<p>Make the HTML parser pass the contents of any <code><style></code> nodes to the CSS
parser, and return a Document object that includes a list of Stylesheets in
addition to the DOM tree.</p>
</li>
</ol>
<h2 id="shortcuts">Shortcuts</h2>
<p>Just like in Part 2, you can skip parsing by hard-coding CSS data structures
directly into your program, or by writing them in an alternate format like
JSON that you already have a parser for.</p>
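<p>As a rough sketch of the first shortcut, the function below builds a one-rule stylesheet by hand. The type definitions are repeated here in simplified form (colors are left out) so that the example stands alone; treat their exact shape as an assumption and adjust it to your own code. <code>example_stylesheet</code> is a hypothetical name:</p>

```rust
#[derive(Debug, PartialEq)]
struct Stylesheet {
    rules: Vec<Rule>,
}

#[derive(Debug, PartialEq)]
struct Rule {
    selectors: Vec<Selector>,
    declarations: Vec<Declaration>,
}

#[derive(Debug, PartialEq)]
enum Selector {
    Simple(SimpleSelector),
}

#[derive(Debug, PartialEq)]
struct SimpleSelector {
    tag_name: Option<String>,
    id: Option<String>,
    class: Vec<String>,
}

#[derive(Debug, PartialEq)]
struct Declaration {
    name: String,
    value: Value,
}

#[derive(Debug, PartialEq)]
enum Value {
    Keyword(String),
    Length(f32, Unit),
}

#[derive(Debug, PartialEq)]
enum Unit {
    Px,
}

// Hard-coded equivalent of: div.note { margin: auto; width: 20px; }
fn example_stylesheet() -> Stylesheet {
    Stylesheet {
        rules: vec![Rule {
            selectors: vec![Selector::Simple(SimpleSelector {
                tag_name: Some("div".to_string()),
                id: None,
                class: vec!["note".to_string()],
            })],
            declarations: vec![
                Declaration {
                    name: "margin".to_string(),
                    value: Value::Keyword("auto".to_string()),
                },
                Declaration {
                    name: "width".to_string(),
                    value: Value::Length(20.0, Unit::Px),
                },
            ],
        }],
    }
}
```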
<h2 id="to-be-continued">To Be Continued…</h2>
<p>The <a href="/mbrubeck/2014/08/23/toy-layout-engine-4-style.html">next article</a> will introduce the <code>style</code> module. This is where
everything starts to come together, with selector matching to apply CSS styles
to DOM nodes.</p>
<p>The pace of this series might slow down soon, since I’ll be busy later this
month and I haven’t even written the code for some of the upcoming articles.
I’ll keep them coming as fast as I can!</p>
<h1>Let's build a browser engine! Part 2: HTML</h1>
<p>Matt Brubeck, 2014-08-11</p>
<p>This is the second in a series of articles on building a toy browser
rendering engine:</p>
<blockquote>
<ul>
<li>
<a href="/mbrubeck/2014/08/08/toy-layout-engine-1.html">Part 1: Getting started</a>
</li>
<li>
<b>Part 2: HTML</b>
</li>
<li>
<a href="/mbrubeck/2014/08/13/toy-layout-engine-3-css.html">Part 3: CSS</a>
</li>
<li>
<a href="/mbrubeck/2014/08/23/toy-layout-engine-4-style.html">Part 4: Style</a>
</li>
<li>
<a href="/mbrubeck/2014/09/08/toy-layout-engine-5-boxes.html">Part 5: Boxes</a>
</li>
<li>
<a href="/mbrubeck/2014/09/17/toy-layout-engine-6-block.html">Part 6: Block layout</a>
</li>
<li>
<a href="/mbrubeck/2014/11/05/toy-layout-engine-7-painting.html">Part 7: Painting 101</a>
</li>
</ul>
</blockquote>
<p>This article is about parsing <a href="http://www.whatwg.org/specs/web-apps/current-work/multipage/introduction.html#a-quick-introduction-to-html">HTML source code</a> to produce a tree of
DOM nodes. Parsing is a fascinating topic, but I don’t have the time or
expertise to give it the introduction it deserves. You can get a detailed
introduction to parsing from any good <a href="https://www.coursera.org/course/compilers">course</a> or <a href="http://www.amazon.com/Compilers-Principles-Techniques-Tools-Edition/dp/0321486811">book</a> on compilers.
Or get a hands-on start by going through the documentation for a <a href="https://en.wikipedia.org/wiki/Comparison_of_parser_generators">parser
generator</a> that works with your chosen programming language.</p>
<p>HTML has its own unique <a href="http://www.whatwg.org/specs/web-apps/current-work/multipage/syntax.html#parsing">parsing algorithm</a>. Unlike parsers for most
programming languages and file formats, the HTML parsing algorithm does not
reject invalid input. Instead it includes specific error-handling
instructions, so web browsers can agree on how to display every web page, even
ones that don’t conform to the syntax rules. Web browsers have to do this to
be usable: Since non-conforming HTML has been supported since the early days
of the web, it is now used in a huge portion of existing web pages.</p>
<h2 id="a-simple-html-dialect">A Simple HTML Dialect</h2>
<p>I didn’t even try to implement the standard HTML parsing algorithm. Instead
I wrote a basic parser for a tiny subset of HTML syntax. My parser can handle
simple pages like this:</p>
<figure class="highlight"><pre><code class="language-html" data-lang="html"><span class="nt"><html></span>
<span class="nt"><body></span>
<span class="nt"><h1></span>Title<span class="nt"></h1></span>
<span class="nt"><div</span> <span class="na">id=</span><span class="s">"main"</span> <span class="na">class=</span><span class="s">"test"</span><span class="nt">></span>
<span class="nt"><p></span>Hello <span class="nt"><em></span>world<span class="nt"></em></span>!<span class="nt"></p></span>
<span class="nt"></div></span>
<span class="nt"></body></span>
<span class="nt"></html></span></code></pre></figure>
<p>The following syntax is allowed:</p>
<ul>
<li>Balanced tags: <code><p>...</p></code></li>
<li>Attributes with quoted values: <code>id="main"</code></li>
<li>Text nodes: <code><em>world</em></code></li>
</ul>
<p>Everything else is unsupported, including:</p>
<ul>
<li>Comments</li>
<li>Doctype declarations</li>
<li>Escaped characters (like <code>&amp;</code>) and CDATA sections</li>
<li>Self-closing tags: <code><br/></code> or <code><br></code> with no closing tag</li>
<li>Error handling (e.g. unbalanced or improperly nested tags)</li>
<li>Namespaces and other XHTML syntax: <code><html:body></code></li>
<li>Character encoding detection</li>
</ul>
<p>At each stage of this project I’m writing more or less the minimum code
needed to support the later stages. But if you want to learn more about
parsing theory and tools, you can be much more ambitious in your own project!</p>
<h2 id="example-code">Example Code</h2>
<p>Next, let’s walk through my toy HTML parser, keeping in mind that this is just
one way to do it (and probably not the best way). Its structure is based
loosely on the <a href="https://github.com/servo/rust-cssparser/blob/032e7aed7acc31350fadbbc3eb5a9bbf6f4edb2e/src/tokenizer.rs">tokenizer</a> module from Servo’s <a href="https://github.com/servo/rust-cssparser">cssparser</a> library. It
has no real error handling; in most cases, it just aborts when faced with
unexpected syntax. The code is in <a href="http://www.rust-lang.org/">Rust</a>, but I hope it’s fairly
readable to anyone who’s used similar-looking languages like Java, C++, or C#.
It makes use of the DOM data structures from <a href="/mbrubeck/2014/08/08/toy-layout-engine-1.html">part 1</a>.</p>
<p>The parser stores its input string and a current position within the string.
The position is the index of the next character we haven’t processed yet.</p>
<figure class="highlight"><pre><code class="language-rust" data-lang="rust"><span class="k">struct</span> <span class="n">Parser</span> <span class="p">{</span>
<span class="n">pos</span><span class="p">:</span> <span class="nb">usize</span><span class="p">,</span> <span class="c1">// "usize" is an unsigned integer, similar to "size_t" in C</span>
<span class="n">input</span><span class="p">:</span> <span class="nb">String</span><span class="p">,</span>
<span class="p">}</span></code></pre></figure>
<p>We can use this to implement some simple methods for peeking at the next
characters in the input:</p>
<figure class="highlight"><pre><code class="language-rust" data-lang="rust"><span class="k">impl</span> <span class="n">Parser</span> <span class="p">{</span>
<span class="c1">// Read the current character without consuming it.</span>
<span class="k">fn</span> <span class="nf">next_char</span><span class="p">(</span><span class="o">&</span><span class="k">self</span><span class="p">)</span> <span class="k">-></span> <span class="nb">char</span> <span class="p">{</span>
<span class="k">self</span><span class="py">.input</span><span class="p">[</span><span class="k">self</span><span class="py">.pos</span><span class="o">..</span><span class="p">]</span><span class="nf">.chars</span><span class="p">()</span><span class="nf">.next</span><span class="p">()</span><span class="nf">.unwrap</span><span class="p">()</span>
<span class="p">}</span>
<span class="c1">// Do the next characters start with the given string?</span>
<span class="k">fn</span> <span class="nf">starts_with</span><span class="p">(</span><span class="o">&</span><span class="k">self</span><span class="p">,</span> <span class="n">s</span><span class="p">:</span> <span class="o">&</span><span class="nb">str</span><span class="p">)</span> <span class="k">-></span> <span class="nb">bool</span> <span class="p">{</span>
<span class="k">self</span><span class="py">.input</span><span class="p">[</span><span class="k">self</span><span class="py">.pos</span> <span class="o">..</span><span class="p">]</span><span class="nf">.starts_with</span><span class="p">(</span><span class="n">s</span><span class="p">)</span>
<span class="p">}</span>
<span class="c1">// Return true if all input is consumed.</span>
<span class="k">fn</span> <span class="nf">eof</span><span class="p">(</span><span class="o">&</span><span class="k">self</span><span class="p">)</span> <span class="k">-></span> <span class="nb">bool</span> <span class="p">{</span>
<span class="k">self</span><span class="py">.pos</span> <span class="o">>=</span> <span class="k">self</span><span class="py">.input</span><span class="nf">.len</span><span class="p">()</span>
<span class="p">}</span>
<span class="c1">// ...</span>
<span class="p">}</span></code></pre></figure>
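<p>To see how these methods behave at different positions, here is a small self-contained sketch. It repeats the struct and the three methods, and adds a hypothetical <code>peek</code> helper that is only there for the demonstration:</p>

```rust
struct Parser {
    pos: usize,
    input: String,
}

impl Parser {
    fn next_char(&self) -> char {
        self.input[self.pos..].chars().next().unwrap()
    }
    fn starts_with(&self, s: &str) -> bool {
        self.input[self.pos..].starts_with(s)
    }
    fn eof(&self) -> bool {
        self.pos >= self.input.len()
    }
}

// Hypothetical helper: what does a parser see at byte offset `pos`?
// Returns the next character (if any) and whether a closing tag starts here.
fn peek(input: &str, pos: usize) -> (Option<char>, bool) {
    let parser = Parser { pos, input: input.to_string() };
    // next_char panics at end of input (the unwrap fails), so check eof first.
    let next = if parser.eof() { None } else { Some(parser.next_char()) };
    (next, parser.starts_with("</"))
}
```

<p>None of these methods change <code>pos</code>, which is why they can take <code>&amp;self</code> instead of <code>&amp;mut self</code>.</p>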
<p>Rust strings are stored as <a href="https://en.wikipedia.org/wiki/UTF-8">UTF-8</a> byte arrays. To go to the next
character, we can’t just advance by one byte. Instead we use <code>char_indices</code>,
which correctly handles multi-byte characters. (If our string used fixed-width
characters, we could just increment <code>pos</code> by 1.)</p>
<figure class="highlight"><pre><code class="language-rust" data-lang="rust"><span class="c1">// Return the current character, and advance self.pos to the next character.</span>
<span class="k">fn</span> <span class="nf">consume_char</span><span class="p">(</span><span class="o">&</span><span class="k">mut</span> <span class="k">self</span><span class="p">)</span> <span class="k">-></span> <span class="nb">char</span> <span class="p">{</span>
<span class="k">let</span> <span class="k">mut</span> <span class="n">iter</span> <span class="o">=</span> <span class="k">self</span><span class="py">.input</span><span class="p">[</span><span class="k">self</span><span class="py">.pos</span><span class="o">..</span><span class="p">]</span><span class="nf">.char_indices</span><span class="p">();</span>
<span class="k">let</span> <span class="p">(</span><span class="n">_</span><span class="p">,</span> <span class="n">cur_char</span><span class="p">)</span> <span class="o">=</span> <span class="n">iter</span><span class="nf">.next</span><span class="p">()</span><span class="nf">.unwrap</span><span class="p">();</span>
<span class="k">let</span> <span class="p">(</span><span class="n">next_pos</span><span class="p">,</span> <span class="n">_</span><span class="p">)</span> <span class="o">=</span> <span class="n">iter</span><span class="nf">.next</span><span class="p">()</span><span class="nf">.unwrap_or</span><span class="p">((</span><span class="n">cur_char</span><span class="nf">.len_utf8</span><span class="p">(),</span> <span class="sc">' '</span><span class="p">));</span>
<span class="k">self</span><span class="py">.pos</span> <span class="o">+=</span> <span class="n">next_pos</span><span class="p">;</span>
<span class="k">return</span> <span class="n">cur_char</span><span class="p">;</span>
<span class="p">}</span></code></pre></figure>
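<p>Another way to find the next position, shown here as a variation rather than as the code above, is to add the byte length of the current character using <code>char::len_utf8</code>. A minimal sketch with a hypothetical free function <code>advance</code>:</p>

```rust
// Return the character that starts at byte offset `pos`, together with the
// byte offset of the character after it.
// Assumes `pos` is on a character boundary and `pos < input.len()`.
fn advance(input: &str, pos: usize) -> (char, usize) {
    let c = input[pos..].chars().next().unwrap();
    // len_utf8 is the number of bytes this character occupies in UTF-8 (1 to 4).
    (c, pos + c.len_utf8())
}
```

<p>With this approach the last character of the input needs no special case, because the step size comes from the character itself rather than from the index of the one after it.</p>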
<p>Often we will want to consume a string of consecutive characters. The
<code>consume_while</code> method consumes characters that meet a given condition, and
returns them as a string. This method’s argument is a function that takes a
<code>char</code> and returns a <code>bool</code>.</p>
<figure class="highlight"><pre><code class="language-rust" data-lang="rust"><span class="c1">// Consume characters until `test` returns false.</span>
<span class="k">fn</span> <span class="n">consume_while</span><span class="o"><</span><span class="n">F</span><span class="o">></span><span class="p">(</span><span class="o">&</span><span class="k">mut</span> <span class="k">self</span><span class="p">,</span> <span class="n">test</span><span class="p">:</span> <span class="n">F</span><span class="p">)</span> <span class="k">-></span> <span class="nb">String</span>
<span class="k">where</span> <span class="n">F</span><span class="p">:</span> <span class="nf">Fn</span><span class="p">(</span><span class="nb">char</span><span class="p">)</span> <span class="k">-></span> <span class="nb">bool</span> <span class="p">{</span>
<span class="k">let</span> <span class="k">mut</span> <span class="n">result</span> <span class="o">=</span> <span class="nn">String</span><span class="p">::</span><span class="nf">new</span><span class="p">();</span>
<span class="k">while</span> <span class="o">!</span><span class="k">self</span><span class="nf">.eof</span><span class="p">()</span> <span class="o">&&</span> <span class="nf">test</span><span class="p">(</span><span class="k">self</span><span class="nf">.next_char</span><span class="p">())</span> <span class="p">{</span>
<span class="n">result</span><span class="nf">.push</span><span class="p">(</span><span class="k">self</span><span class="nf">.consume_char</span><span class="p">());</span>
<span class="p">}</span>
<span class="k">return</span> <span class="n">result</span><span class="p">;</span>
<span class="p">}</span></code></pre></figure>
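<p>Here is a small self-contained sketch of <code>consume_while</code> in use. It repeats the supporting methods in compact form (with a simplified <code>consume_char</code>), and the <code>leading_digits</code> function is a hypothetical example, not part of the parser:</p>

```rust
struct Parser {
    pos: usize,
    input: String,
}

impl Parser {
    fn eof(&self) -> bool {
        self.pos >= self.input.len()
    }
    fn next_char(&self) -> char {
        self.input[self.pos..].chars().next().unwrap()
    }
    fn consume_char(&mut self) -> char {
        let c = self.next_char();
        self.pos += c.len_utf8(); // advance by the character's byte length
        c
    }
    fn consume_while<F>(&mut self, test: F) -> String
    where
        F: Fn(char) -> bool,
    {
        let mut result = String::new();
        while !self.eof() && test(self.next_char()) {
            result.push(self.consume_char());
        }
        result
    }
}

// Hypothetical demo: consume the digits at the start of `input` and report
// both the consumed text and the position where the parser stopped.
fn leading_digits(input: &str) -> (String, usize) {
    let mut parser = Parser { pos: 0, input: input.to_string() };
    let digits = parser.consume_while(|c| c.is_ascii_digit());
    (digits, parser.pos)
}
```

<p>If the very first character fails the test, the result is an empty string and the position does not move.</p>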
<p>We can use this to ignore a sequence of space characters, or to consume a
string of alphanumeric characters:</p>
<figure class="highlight"><pre><code class="language-rust" data-lang="rust"><span class="c1">// Consume and discard zero or more whitespace characters.</span>
<span class="k">fn</span> <span class="nf">consume_whitespace</span><span class="p">(</span><span class="o">&</span><span class="k">mut</span> <span class="k">self</span><span class="p">)</span> <span class="p">{</span>
<span class="k">self</span><span class="nf">.consume_while</span><span class="p">(</span><span class="nb">char</span><span class="p">::</span><span class="n">is_whitespace</span><span class="p">);</span>
<span class="p">}</span>
<span class="c1">// Parse a tag or attribute name.</span>
<span class="k">fn</span> <span class="nf">parse_tag_name</span><span class="p">(</span><span class="o">&</span><span class="k">mut</span> <span class="k">self</span><span class="p">)</span> <span class="k">-></span> <span class="nb">String</span> <span class="p">{</span>
<span class="k">self</span><span class="nf">.consume_while</span><span class="p">(|</span><span class="n">c</span><span class="p">|</span> <span class="k">match</span> <span class="n">c</span> <span class="p">{</span>
<span class="sc">'a'</span><span class="o">..=</span><span class="sc">'z'</span> <span class="p">|</span> <span class="sc">'A'</span><span class="o">..=</span><span class="sc">'Z'</span> <span class="p">|</span> <span class="sc">'0'</span><span class="o">..=</span><span class="sc">'9'</span> <span class="k">=></span> <span class="k">true</span><span class="p">,</span>
<span class="n">_</span> <span class="k">=></span> <span class="k">false</span>
<span class="p">})</span>
<span class="p">}</span></code></pre></figure>
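<p>The name test accepts only ASCII letters and digits. The sketch below pulls that test out into a standalone function, written with inclusive range patterns, and applies it to a string slice. Both <code>is_name_char</code> and <code>name_prefix</code> are hypothetical helpers used for illustration:</p>

```rust
// The character test for tag and attribute names, as a standalone function.
// For ASCII input this agrees with char::is_ascii_alphanumeric.
fn is_name_char(c: char) -> bool {
    matches!(c, 'a'..='z' | 'A'..='Z' | '0'..='9')
}

// Return the longest prefix of `input` that consists of name characters.
fn name_prefix(input: &str) -> &str {
    let end = input
        .find(|c: char| !is_name_char(c))
        .unwrap_or(input.len());
    &input[..end]
}
```

<p>One consequence of this narrow test is that a name such as <code>data-id</code> stops at the hyphen, so this toy parser cannot read hyphenated attribute names.</p>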
<p>Now we’re ready to start parsing HTML. To parse a single node, we look at its
first character to see if it is an element or a text node. In our simplified
version of HTML, a text node can contain any character except <code><</code>.</p>
<figure class="highlight"><pre><code class="language-rust" data-lang="rust"><span class="c1">// Parse a single node.</span>
<span class="k">fn</span> <span class="nf">parse_node</span><span class="p">(</span><span class="o">&</span><span class="k">mut</span> <span class="k">self</span><span class="p">)</span> <span class="k">-></span> <span class="nn">dom</span><span class="p">::</span><span class="n">Node</span> <span class="p">{</span>
<span class="k">match</span> <span class="k">self</span><span class="nf">.next_char</span><span class="p">()</span> <span class="p">{</span>
<span class="sc">'<'</span> <span class="k">=></span> <span class="k">self</span><span class="nf">.parse_element</span><span class="p">(),</span>
<span class="n">_</span> <span class="k">=></span> <span class="k">self</span><span class="nf">.parse_text</span><span class="p">()</span>
<span class="p">}</span>
<span class="p">}</span>
<span class="c1">// Parse a text node.</span>
<span class="k">fn</span> <span class="nf">parse_text</span><span class="p">(</span><span class="o">&</span><span class="k">mut</span> <span class="k">self</span><span class="p">)</span> <span class="k">-></span> <span class="nn">dom</span><span class="p">::</span><span class="n">Node</span> <span class="p">{</span>
<span class="nn">dom</span><span class="p">::</span><span class="nf">text</span><span class="p">(</span><span class="k">self</span><span class="nf">.consume_while</span><span class="p">(|</span><span class="n">c</span><span class="p">|</span> <span class="n">c</span> <span class="o">!=</span> <span class="sc">'<'</span><span class="p">))</span>
<span class="p">}</span></code></pre></figure>
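<p>The rule for text nodes can be sketched on a plain string slice: everything up to the next opening angle bracket is text, and the rest is left for the element parser. The <code>take_text</code> function is a hypothetical helper, not part of the parser:</p>

```rust
// Split `input` into the text node at its start and the remaining input.
// The text ends at the first '<', or at the end of the input if there is none.
fn take_text(input: &str) -> (&str, &str) {
    let end = input.find('<').unwrap_or(input.len());
    input.split_at(end)
}
```

<p>Note that trailing whitespace inside the text is preserved, so the text before an inline element keeps the space that separates the words.</p>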
<p>An element is more complicated. It includes opening and closing tags, and
between them any number of child nodes:</p>
<figure class="highlight"><pre><code class="language-rust" data-lang="rust"><span class="c1">// Parse a single element, including its open tag, contents, and closing tag.</span>
<span class="k">fn</span> <span class="nf">parse_element</span><span class="p">(</span><span class="o">&</span><span class="k">mut</span> <span class="k">self</span><span class="p">)</span> <span class="k">-></span> <span class="nn">dom</span><span class="p">::</span><span class="n">Node</span> <span class="p">{</span>
<span class="c1">// Opening tag.</span>
<span class="nd">assert!</span><span class="p">(</span><span class="k">self</span><span class="nf">.consume_char</span><span class="p">()</span> <span class="o">==</span> <span class="sc">'<'</span><span class="p">);</span>
<span class="k">let</span> <span class="n">tag_name</span> <span class="o">=</span> <span class="k">self</span><span class="nf">.parse_tag_name</span><span class="p">();</span>
<span class="k">let</span> <span class="n">attrs</span> <span class="o">=</span> <span class="k">self</span><span class="nf">.parse_attributes</span><span class="p">();</span>
<span class="nd">assert!</span><span class="p">(</span><span class="k">self</span><span class="nf">.consume_char</span><span class="p">()</span> <span class="o">==</span> <span class="sc">'>'</span><span class="p">);</span>
<span class="c1">// Contents.</span>
<span class="k">let</span> <span class="n">children</span> <span class="o">=</span> <span class="k">self</span><span class="nf">.parse_nodes</span><span class="p">();</span>
<span class="c1">// Closing tag.</span>
<span class="nd">assert!</span><span class="p">(</span><span class="k">self</span><span class="nf">.consume_char</span><span class="p">()</span> <span class="o">==</span> <span class="sc">'<'</span><span class="p">);</span>
<span class="nd">assert!</span><span class="p">(</span><span class="k">self</span><span class="nf">.consume_char</span><span class="p">()</span> <span class="o">==</span> <span class="sc">'/'</span><span class="p">);</span>
<span class="nd">assert!</span><span class="p">(</span><span class="k">self</span><span class="nf">.parse_tag_name</span><span class="p">()</span> <span class="o">==</span> <span class="n">tag_name</span><span class="p">);</span>
<span class="nd">assert!</span><span class="p">(</span><span class="k">self</span><span class="nf">.consume_char</span><span class="p">()</span> <span class="o">==</span> <span class="sc">'>'</span><span class="p">);</span>
<span class="k">return</span> <span class="nn">dom</span><span class="p">::</span><span class="nf">elem</span><span class="p">(</span><span class="n">tag_name</span><span class="p">,</span> <span class="n">attrs</span><span class="p">,</span> <span class="n">children</span><span class="p">);</span>
<span class="p">}</span></code></pre></figure>
<p>Parsing attributes is pretty easy in our simplified syntax. Until we reach
the end of the opening tag (<code>></code>) we repeatedly look for a name followed by <code>=</code>
and then a string enclosed in quotes.</p>
<figure class="highlight"><pre><code class="language-rust" data-lang="rust"><span class="c1">// Parse a single name="value" pair.</span>
<span class="k">fn</span> <span class="nf">parse_attr</span><span class="p">(</span><span class="o">&</span><span class="k">mut</span> <span class="k">self</span><span class="p">)</span> <span class="k">-></span> <span class="p">(</span><span class="nb">String</span><span class="p">,</span> <span class="nb">String</span><span class="p">)</span> <span class="p">{</span>
<span class="k">let</span> <span class="n">name</span> <span class="o">=</span> <span class="k">self</span><span class="nf">.parse_tag_name</span><span class="p">();</span>
<span class="nd">assert!</span><span class="p">(</span><span class="k">self</span><span class="nf">.consume_char</span><span class="p">()</span> <span class="o">==</span> <span class="sc">'='</span><span class="p">);</span>
<span class="k">let</span> <span class="n">value</span> <span class="o">=</span> <span class="k">self</span><span class="nf">.parse_attr_value</span><span class="p">();</span>
<span class="k">return</span> <span class="p">(</span><span class="n">name</span><span class="p">,</span> <span class="n">value</span><span class="p">);</span>
<span class="p">}</span>
<span class="c1">// Parse a quoted value.</span>
<span class="k">fn</span> <span class="nf">parse_attr_value</span><span class="p">(</span><span class="o">&</span><span class="k">mut</span> <span class="k">self</span><span class="p">)</span> <span class="k">-></span> <span class="nb">String</span> <span class="p">{</span>
<span class="k">let</span> <span class="n">open_quote</span> <span class="o">=</span> <span class="k">self</span><span class="nf">.consume_char</span><span class="p">();</span>
<span class="nd">assert!</span><span class="p">(</span><span class="n">open_quote</span> <span class="o">==</span> <span class="sc">'"'</span> <span class="p">||</span> <span class="n">open_quote</span> <span class="o">==</span> <span class="sc">'\''</span><span class="p">);</span>
<span class="k">let</span> <span class="n">value</span> <span class="o">=</span> <span class="k">self</span><span class="nf">.consume_while</span><span class="p">(|</span><span class="n">c</span><span class="p">|</span> <span class="n">c</span> <span class="o">!=</span> <span class="n">open_quote</span><span class="p">);</span>
<span class="nd">assert!</span><span class="p">(</span><span class="k">self</span><span class="nf">.consume_char</span><span class="p">()</span> <span class="o">==</span> <span class="n">open_quote</span><span class="p">);</span>
<span class="k">return</span> <span class="n">value</span><span class="p">;</span>
<span class="p">}</span>
<span class="c1">// Parse a list of name="value" pairs, separated by whitespace.</span>
<span class="k">fn</span> <span class="nf">parse_attributes</span><span class="p">(</span><span class="o">&</span><span class="k">mut</span> <span class="k">self</span><span class="p">)</span> <span class="k">-></span> <span class="nn">dom</span><span class="p">::</span><span class="n">AttrMap</span> <span class="p">{</span>
<span class="k">let</span> <span class="k">mut</span> <span class="n">attributes</span> <span class="o">=</span> <span class="nn">HashMap</span><span class="p">::</span><span class="nf">new</span><span class="p">();</span>
<span class="k">loop</span> <span class="p">{</span>
<span class="k">self</span><span class="nf">.consume_whitespace</span><span class="p">();</span>
<span class="k">if</span> <span class="k">self</span><span class="nf">.next_char</span><span class="p">()</span> <span class="o">==</span> <span class="sc">'>'</span> <span class="p">{</span>
<span class="k">break</span><span class="p">;</span>
<span class="p">}</span>
<span class="k">let</span> <span class="p">(</span><span class="n">name</span><span class="p">,</span> <span class="n">value</span><span class="p">)</span> <span class="o">=</span> <span class="k">self</span><span class="nf">.parse_attr</span><span class="p">();</span>
<span class="n">attributes</span><span class="nf">.insert</span><span class="p">(</span><span class="n">name</span><span class="p">,</span> <span class="n">value</span><span class="p">);</span>
<span class="p">}</span>
<span class="k">return</span> <span class="n">attributes</span><span class="p">;</span>
<span class="p">}</span></code></pre></figure>
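<p>Because the attributes are collected with <code>insert</code>, a repeated attribute name overwrites the earlier value. The HTML standard instead keeps the first occurrence and ignores later duplicates. The sketch below contrasts the two behaviors on a list of pairs; both function names are hypothetical:</p>

```rust
use std::collections::HashMap;

// Mimics the insert loop in parse_attributes: a later duplicate replaces
// the earlier value.
fn collect_last_wins(pairs: &[(&str, &str)]) -> HashMap<String, String> {
    let mut attributes = HashMap::new();
    for (name, value) in pairs {
        attributes.insert(name.to_string(), value.to_string());
    }
    attributes
}

// Variant that keeps the first value for each name and ignores later ones.
fn collect_first_wins(pairs: &[(&str, &str)]) -> HashMap<String, String> {
    let mut attributes = HashMap::new();
    for (name, value) in pairs {
        attributes
            .entry(name.to_string())
            .or_insert_with(|| value.to_string());
    }
    attributes
}
```

<p>For a toy engine either choice is workable; the difference only shows up on malformed input.</p>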
<p>To parse the child nodes, we recursively call <code>parse_node</code> in a loop until we
reach the closing tag. This function returns a <code>Vec</code>, which is Rust’s name
for a growable array.</p>
<figure class="highlight"><pre><code class="language-rust" data-lang="rust"><span class="c1">// Parse a sequence of sibling nodes.</span>
<span class="k">fn</span> <span class="nf">parse_nodes</span><span class="p">(</span><span class="o">&</span><span class="k">mut</span> <span class="k">self</span><span class="p">)</span> <span class="k">-></span> <span class="nb">Vec</span><span class="o"><</span><span class="nn">dom</span><span class="p">::</span><span class="n">Node</span><span class="o">></span> <span class="p">{</span>
<span class="k">let</span> <span class="k">mut</span> <span class="n">nodes</span> <span class="o">=</span> <span class="nn">Vec</span><span class="p">::</span><span class="nf">new</span><span class="p">();</span>
<span class="k">loop</span> <span class="p">{</span>
<span class="k">self</span><span class="nf">.consume_whitespace</span><span class="p">();</span>
<span class="k">if</span> <span class="k">self</span><span class="nf">.eof</span><span class="p">()</span> <span class="p">||</span> <span class="k">self</span><span class="nf">.starts_with</span><span class="p">(</span><span class="s">"</"</span><span class="p">)</span> <span class="p">{</span>
<span class="k">break</span><span class="p">;</span>
<span class="p">}</span>
<span class="n">nodes</span><span class="nf">.push</span><span class="p">(</span><span class="k">self</span><span class="nf">.parse_node</span><span class="p">());</span>
<span class="p">}</span>
<span class="k">return</span> <span class="n">nodes</span><span class="p">;</span>
<span class="p">}</span></code></pre></figure>
<p>Finally, we can put this all together to parse an entire HTML document into a
DOM tree. This function will create a root node for the document if it
doesn’t include one explicitly; this is similar to what a real HTML parser
does.</p>
<figure class="highlight"><pre><code class="language-rust" data-lang="rust"><span class="c1">// Parse an HTML document and return the root element.</span>
<span class="k">pub</span> <span class="k">fn</span> <span class="nf">parse</span><span class="p">(</span><span class="n">source</span><span class="p">:</span> <span class="nb">String</span><span class="p">)</span> <span class="k">-></span> <span class="nn">dom</span><span class="p">::</span><span class="n">Node</span> <span class="p">{</span>
<span class="k">let</span> <span class="k">mut</span> <span class="n">nodes</span> <span class="o">=</span> <span class="n">Parser</span> <span class="p">{</span> <span class="n">pos</span><span class="p">:</span> <span class="mi">0</span><span class="p">,</span> <span class="n">input</span><span class="p">:</span> <span class="n">source</span> <span class="p">}</span><span class="nf">.parse_nodes</span><span class="p">();</span>
<span class="c1">// If the document contains a root element, just return it. Otherwise, create one.</span>
<span class="k">if</span> <span class="n">nodes</span><span class="nf">.len</span><span class="p">()</span> <span class="o">==</span> <span class="mi">1</span> <span class="p">{</span>
<span class="n">nodes</span><span class="nf">.swap_remove</span><span class="p">(</span><span class="mi">0</span><span class="p">)</span>
<span class="p">}</span> <span class="k">else</span> <span class="p">{</span>
<span class="nn">dom</span><span class="p">::</span><span class="nf">elem</span><span class="p">(</span><span class="s">"html"</span><span class="nf">.to_string</span><span class="p">(),</span> <span class="nn">HashMap</span><span class="p">::</span><span class="nf">new</span><span class="p">(),</span> <span class="n">nodes</span><span class="p">)</span>
<span class="p">}</span>
<span class="p">}</span></code></pre></figure>
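<p>To try the whole pipeline without the rest of the project, here is a condensed, self-contained sketch of the same recursive-descent structure. It is not the robinson source: the <code>Node</code> enum is a simplified stand-in for the Part 1 DOM types, attribute parsing is folded into <code>parse_element</code>, and <code>expect</code> and <code>parse_name</code> are hypothetical helper names:</p>

```rust
use std::collections::HashMap;

// Simplified stand-in for the DOM types from Part 1 (an assumption of this sketch).
#[derive(Debug, PartialEq)]
enum Node {
    Text(String),
    Element { tag: String, attrs: HashMap<String, String>, children: Vec<Node> },
}

struct Parser {
    pos: usize,
    input: String,
}

impl Parser {
    fn eof(&self) -> bool {
        self.pos >= self.input.len()
    }
    fn next_char(&self) -> char {
        self.input[self.pos..].chars().next().unwrap()
    }
    fn consume_char(&mut self) -> char {
        let c = self.next_char();
        self.pos += c.len_utf8();
        c
    }
    // Consume one character and panic unless it is the expected one.
    fn expect(&mut self, c: char) {
        assert_eq!(self.consume_char(), c);
    }
    fn consume_while(&mut self, test: impl Fn(char) -> bool) -> String {
        let mut result = String::new();
        while !self.eof() && test(self.next_char()) {
            result.push(self.consume_char());
        }
        result
    }
    fn parse_name(&mut self) -> String {
        self.consume_while(|c| c.is_ascii_alphanumeric())
    }
    // Sibling nodes, up to the end of input or the parent's closing tag.
    fn parse_nodes(&mut self) -> Vec<Node> {
        let mut nodes = Vec::new();
        loop {
            self.consume_while(char::is_whitespace);
            if self.eof() || self.input[self.pos..].starts_with("</") {
                break;
            }
            nodes.push(self.parse_node());
        }
        nodes
    }
    fn parse_node(&mut self) -> Node {
        if self.next_char() == '<' {
            self.parse_element()
        } else {
            Node::Text(self.consume_while(|c| c != '<'))
        }
    }
    fn parse_element(&mut self) -> Node {
        self.expect('<');
        let tag = self.parse_name();
        let mut attrs = HashMap::new();
        loop {
            self.consume_while(char::is_whitespace);
            if self.next_char() == '>' {
                break;
            }
            let name = self.parse_name();
            self.expect('=');
            let quote = self.consume_char();
            assert!(quote == '"' || quote == '\'');
            let value = self.consume_while(|c| c != quote);
            self.expect(quote);
            attrs.insert(name, value);
        }
        self.expect('>');
        let children = self.parse_nodes();
        self.expect('<');
        self.expect('/');
        assert_eq!(self.parse_name(), tag);
        self.expect('>');
        Node::Element { tag, attrs, children }
    }
}

// Parse a document; wrap zero or several top-level nodes in a synthetic html element.
fn parse(source: &str) -> Node {
    let mut nodes = Parser { pos: 0, input: source.to_string() }.parse_nodes();
    if nodes.len() == 1 {
        nodes.swap_remove(0)
    } else {
        Node::Element { tag: "html".to_string(), attrs: HashMap::new(), children: nodes }
    }
}
```

<p>Like the original, this sketch panics on any input outside the supported subset, for example an unclosed tag or an unquoted attribute value.</p>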
<p>That’s it! This is the entire code for the <a href="https://github.com/mbrubeck/robinson/blob/master/src/html.rs">robinson HTML parser</a>. The whole
thing weighs in at just over 100 lines of code (not counting blank lines and
comments). If you use a good library or parser generator, you can probably
build a similar toy parser in even less space.</p>
<h2 id="exercises">Exercises</h2>
<p>Here are a few alternate ways to try this out yourself. As before, you can
<strong>choose one or more</strong> of them and ignore the others.</p>
<ol>
<li>
<p>Build a parser (either “by hand” or with a library or parser generator)
that takes a subset of HTML as input and produces a tree of DOM nodes.</p>
</li>
<li>
<p>Modify robinson’s HTML parser to add some missing features, like comments.
Or replace it with a better parser, perhaps built with a library or
generator.</p>
</li>
<li>
<p>Create an invalid HTML file that causes your parser (or mine) to fail.
Modify the parser to recover from the error and produce a DOM tree for your
test file.</p>
</li>
</ol>
<h2 id="shortcuts">Shortcuts</h2>
<p>If you want to skip parsing completely, you can build a DOM tree
programmatically instead, by adding some code like this to your program (in
pseudo-code; adjust it to match the DOM code you wrote in Part 1):</p>
<figure class="highlight"><pre><code class="language-rust" data-lang="rust"><span class="c1">// <html><body>Hello, world!</body></html></span>
<span class="k">let</span> <span class="k">mut</span> <span class="n">root</span> <span class="o">=</span> <span class="nf">element</span><span class="p">(</span><span class="s">"html"</span><span class="p">);</span>
<span class="k">let</span> <span class="k">mut</span> <span class="n">body</span> <span class="o">=</span> <span class="nf">element</span><span class="p">(</span><span class="s">"body"</span><span class="p">);</span>
<span class="c1">// Fill in `body` first: pushing it into `root` moves it.</span>
<span class="n">body</span><span class="py">.children</span><span class="nf">.push</span><span class="p">(</span><span class="nf">text</span><span class="p">(</span><span class="s">"Hello, world!"</span><span class="p">));</span>
<span class="n">root</span><span class="py">.children</span><span class="nf">.push</span><span class="p">(</span><span class="n">body</span><span class="p">);</span></code></pre></figure>
<p>Or you can find an existing HTML parser and incorporate it into your program.</p>
<p>The <a href="/mbrubeck/2014/08/13/toy-layout-engine-3-css.html">next article</a> in this series will cover CSS data structures
and parsing.</p>
Matt Brubeckmbrubeck@limpet.nethttps://limpet.net/mbrubeck/Let's build a browser engine! Part 1: Getting started2014-08-08T09:40:00-07:00https://limpet.net/mbrubeck//2014/08/08/toy-layout-engine-1<p>I’m building a toy HTML rendering engine, and I think you should too. This is
the first in a series of articles:</p>
<blockquote>
<ul>
<li>
<b>Part 1: Getting started</b>
</li>
<li>
<a href="/mbrubeck/2014/08/11/toy-layout-engine-2.html">Part 2: HTML</a>
</li>
<li>
<a href="/mbrubeck/2014/08/13/toy-layout-engine-3-css.html">Part 3: CSS</a>
</li>
<li>
<a href="/mbrubeck/2014/08/23/toy-layout-engine-4-style.html">Part 4: Style</a>
</li>
<li>
<a href="/mbrubeck/2014/09/08/toy-layout-engine-5-boxes.html">Part 5: Boxes</a>
</li>
<li>
<a href="/mbrubeck/2014/09/17/toy-layout-engine-6-block.html">Part 6: Block layout</a>
</li>
<li>
<a href="/mbrubeck/2014/11/05/toy-layout-engine-7-painting.html">Part 7: Painting 101</a>
</li>
</ul>
</blockquote>
<p>The full series will describe the code I’ve written, and show how you can make
your own. But first, let me explain why.</p>
<h2 id="youre-building-a-what">You’re building a what?</h2>
<p>Let’s talk terminology. A <strong>browser engine</strong> is the portion of a web browser
that works “under the hood” to fetch a web page from the internet, and
translate its contents into forms you can read, watch, hear, etc. Blink,
Gecko, WebKit, and Trident are browser engines. In contrast, the
browser’s own UI—tabs, toolbar, menu and such—is called the
<strong>chrome</strong>. Firefox and SeaMonkey are two browsers with different <em>chrome</em> but
the same Gecko <em>engine</em>.</p>
<p>A browser engine includes many sub-components: an HTTP client, an HTML parser,
a CSS parser, a JavaScript engine (itself composed of parsers, interpreters,
and compilers), and much more. Those components involved in parsing web
formats like HTML and CSS and translating them into what you see on-screen are
sometimes called the <strong>layout engine</strong> or <strong>rendering engine</strong>.</p>
<h2 id="why-a-toy-rendering-engine">Why a “toy” rendering engine?</h2>
<p>A full-featured browser engine is hugely complex. Blink, Gecko,
WebKit—these are millions of lines of code each. Even younger,
simpler rendering engines like <a href="https://github.com/servo/servo/">Servo</a> and <a href="http://weasyprint.org/">WeasyPrint</a> are each
tens of thousands of lines. Not the easiest thing for a newcomer to
comprehend!</p>
<p>Speaking of hugely complex software: If you take a class on compilers or
operating systems, at some point you will probably create or modify a “toy”
compiler or kernel. This is a simple model designed for learning; it may
never be run by anyone besides the person who wrote it. But making a toy
system is a useful tool for learning how the real thing works. Even if you
never build a real-world compiler or kernel, understanding how they work can
help you make better use of them when writing your own programs.</p>
<p>So, if you want to become a browser developer, or just to understand what
happens inside a browser engine, why not build a toy one? Like a toy compiler
that implements a subset of a “real” programming language, a toy rendering
engine could implement a small subset of HTML and CSS. It won’t replace the
engine in your everyday browser, but should nonetheless illustrate the basic
steps needed for rendering a simple HTML document.</p>
<h2 id="try-this-at-home">Try this at home.</h2>
<p>I hope I’ve convinced you to give it a try. This series will be easiest to
follow if you already have some solid programming experience and know some
high-level HTML and CSS concepts. However, if you’re just getting started
with this stuff, or run into things you don’t understand, feel free to ask
questions and I’ll try to make it clearer.</p>
<p>Before you start, a few remarks on some choices you can make:</p>
<h2 id="lang">On Programming Languages</h2>
<p>You can build a toy layout engine in any programming language. Really! Go
ahead and use a language you know and love. Or use this as an excuse to learn
a new language if that sounds like fun.</p>
<p>If you want to start contributing to major browser engines like Gecko or
WebKit, you might want to work in C++ because it’s the main language used in
those engines, and using it will make it easier to compare your code to
theirs.</p>
<p>My own toy project, <a href="https://github.com/mbrubeck/robinson">robinson</a>, is written in <a href="http://www.rust-lang.org/">Rust</a>. I’m part
of the Servo team at Mozilla, so I’ve become very fond of Rust programming.
Plus, one of my goals with this project is to understand more of Servo’s
implementation. Robinson sometimes uses simplified versions of Servo’s data
structures and code.</p>
<h2 id="on-libraries-and-shortcuts">On Libraries and Shortcuts</h2>
<p>In a learning exercise like this, you have to decide whether it’s “cheating”
to use someone else’s code instead of writing your own from scratch. My
advice is to write your own code for the parts that you really want to
understand, but don’t be shy about using libraries for everything else.
Learning how to use a particular library can be a worthwhile exercise in
itself.</p>
<p>I’m writing robinson not just for myself, but also to serve as example code for
these articles and exercises. For this and other reasons, I want it to be as
tiny and self-contained as possible. So far I’ve used no external code except
for the Rust standard library. (This also side-steps the minor hassle of
getting multiple dependencies to build with the same version of Rust while the
language is still in development.) This rule isn’t set in stone, though.
For example, I may decide later to use a graphics library rather than write my
own low-level drawing code.</p>
<p>Another way to avoid writing code is to just leave things out. For example,
robinson has no networking code yet; it can only read local files. In a toy
program, it’s fine to just skip things if you feel like it. I’ll point out
potential shortcuts like this as I go along, so you can bypass steps that
don’t interest you and jump straight to the good stuff. You can always fill
in the gaps later if you change your mind.</p>
<h2 id="first-step-the-dom">First Step: The DOM</h2>
<p>Are you ready to write some code? We’ll start with something small: data
structures for the <a href="http://dom.spec.whatwg.org/" title="Document Object Model">DOM</a>. Let’s look at robinson’s <a href="https://github.com/mbrubeck/robinson/blob/master/src/dom.rs">dom module</a>.</p>
<p>The DOM is a tree of nodes. A node has zero or more children. (It also has
various other attributes and methods, but we can ignore most of those for now.)</p>
<figure class="highlight"><pre><code class="language-rust" data-lang="rust"><span class="k">struct</span> <span class="n">Node</span> <span class="p">{</span>
<span class="c1">// data common to all nodes:</span>
<span class="n">children</span><span class="p">:</span> <span class="nb">Vec</span><span class="o"><</span><span class="n">Node</span><span class="o">></span><span class="p">,</span>
<span class="c1">// data specific to each node type:</span>
<span class="n">node_type</span><span class="p">:</span> <span class="n">NodeType</span><span class="p">,</span>
<span class="p">}</span></code></pre></figure>
<p>There are several <a href="http://dom.spec.whatwg.org/#dom-node-nodetype">node types</a>, but for now we will ignore most of them
and say that a node is either an Element or a Text node. In a language with
inheritance these would be subtypes of <code>Node</code>. In Rust they can be an enum
(Rust’s keyword for a “tagged union” or “sum type”):</p>
<figure class="highlight"><pre><code class="language-rust" data-lang="rust"><span class="k">enum</span> <span class="n">NodeType</span> <span class="p">{</span>
<span class="nf">Text</span><span class="p">(</span><span class="nb">String</span><span class="p">),</span>
<span class="nf">Element</span><span class="p">(</span><span class="n">ElementData</span><span class="p">),</span>
<span class="p">}</span></code></pre></figure>
<p>An element includes a tag name and any number of attributes, which can be
stored as a map from names to values. Robinson doesn’t support namespaces, so
it just stores tag and attribute names as simple strings.</p>
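<figure class="highlight"><pre><code class="language-rust" data-lang="rust"><span class="k">use</span> <span class="nn">std</span><span class="p">::</span><span class="nn">collections</span><span class="p">::</span><span class="n">HashMap</span><span class="p">;</span>

<span class="k">struct</span> <span class="n">ElementData</span> <span class="p">{</span>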
<span class="n">tag_name</span><span class="p">:</span> <span class="nb">String</span><span class="p">,</span>
<span class="n">attributes</span><span class="p">:</span> <span class="n">AttrMap</span><span class="p">,</span>
<span class="p">}</span>
<span class="k">type</span> <span class="n">AttrMap</span> <span class="o">=</span> <span class="n">HashMap</span><span class="o"><</span><span class="nb">String</span><span class="p">,</span> <span class="nb">String</span><span class="o">></span><span class="p">;</span></code></pre></figure>
<p>Finally, some constructor functions to make it easy to create new nodes:</p>
<figure class="highlight"><pre><code class="language-rust" data-lang="rust"><span class="k">fn</span> <span class="nf">text</span><span class="p">(</span><span class="n">data</span><span class="p">:</span> <span class="nb">String</span><span class="p">)</span> <span class="k">-></span> <span class="n">Node</span> <span class="p">{</span>
<span class="n">Node</span> <span class="p">{</span> <span class="n">children</span><span class="p">:</span> <span class="nn">Vec</span><span class="p">::</span><span class="nf">new</span><span class="p">(),</span> <span class="n">node_type</span><span class="p">:</span> <span class="nn">NodeType</span><span class="p">::</span><span class="nf">Text</span><span class="p">(</span><span class="n">data</span><span class="p">)</span> <span class="p">}</span>
<span class="p">}</span>
<span class="k">fn</span> <span class="nf">elem</span><span class="p">(</span><span class="n">name</span><span class="p">:</span> <span class="nb">String</span><span class="p">,</span> <span class="n">attrs</span><span class="p">:</span> <span class="n">AttrMap</span><span class="p">,</span> <span class="n">children</span><span class="p">:</span> <span class="nb">Vec</span><span class="o"><</span><span class="n">Node</span><span class="o">></span><span class="p">)</span> <span class="k">-></span> <span class="n">Node</span> <span class="p">{</span>
<span class="n">Node</span> <span class="p">{</span>
<span class="n">children</span><span class="p">:</span> <span class="n">children</span><span class="p">,</span>
<span class="n">node_type</span><span class="p">:</span> <span class="nn">NodeType</span><span class="p">::</span><span class="nf">Element</span><span class="p">(</span><span class="n">ElementData</span> <span class="p">{</span>
<span class="n">tag_name</span><span class="p">:</span> <span class="n">name</span><span class="p">,</span>
<span class="n">attributes</span><span class="p">:</span> <span class="n">attrs</span><span class="p">,</span>
<span class="p">})</span>
<span class="p">}</span>
<span class="p">}</span></code></pre></figure>
<p>And that’s it! A full-blown DOM implementation would include a lot more data
and dozens of methods, but this is all we need to get started.</p>
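<p>To see how these pieces fit together, here is a minimal, self-contained sketch that repeats the definitions above and uses the two constructors to build a small tree. The helpers <code>sample_tree</code>, <code>count_nodes</code>, and <code>text_content</code> are hypothetical names added for illustration; they are not part of robinson.</p>

```rust
use std::collections::HashMap;

type AttrMap = HashMap<String, String>;

struct Node {
    children: Vec<Node>,
    node_type: NodeType,
}

enum NodeType {
    Text(String),
    Element(ElementData),
}

struct ElementData {
    tag_name: String,
    attributes: AttrMap,
}

fn text(data: String) -> Node {
    Node { children: Vec::new(), node_type: NodeType::Text(data) }
}

fn elem(name: String, attrs: AttrMap, children: Vec<Node>) -> Node {
    Node {
        children,
        node_type: NodeType::Element(ElementData { tag_name: name, attributes: attrs }),
    }
}

// Hypothetical helper: builds <html><body>Hello, world!</body></html>
// from the inside out, so each child exists before its parent is created.
fn sample_tree() -> Node {
    let body = elem(
        "body".to_string(),
        AttrMap::new(),
        vec![text("Hello, world!".to_string())],
    );
    elem("html".to_string(), AttrMap::new(), vec![body])
}

// Hypothetical helper: counts this node plus all of its descendants.
fn count_nodes(node: &Node) -> usize {
    1 + node.children.iter().map(count_nodes).sum::<usize>()
}

// Hypothetical helper: concatenates all text nodes in document order.
fn text_content(node: &Node) -> String {
    let mut out = String::new();
    if let NodeType::Text(data) = &node.node_type {
        out.push_str(data);
    }
    for child in &node.children {
        out.push_str(&text_content(child));
    }
    out
}
```

<p>Because each node owns its children in a <code>Vec</code>, a recursive function over <code>&amp;Node</code> is enough to walk the whole tree. For the sample tree, <code>count_nodes</code> returns 3 (two elements and one text node).</p>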
<h2 id="exercises">Exercises</h2>
<p>These are just a few suggested ways to follow along at home. <strong>Do the
exercises that interest you</strong> and skip any that don’t.</p>
<ol>
<li>
<p>Start a new program in the language of your choice, and write code to
represent a tree of DOM text nodes and elements.</p>
</li>
<li>
<p>Install the latest version of <a href="http://www.rust-lang.org/">Rust</a>, then download and build
<a href="https://github.com/mbrubeck/robinson">robinson</a>. Open up <code>dom.rs</code> and extend <code>NodeType</code> to include
additional types like comment nodes.</p>
</li>
<li>
<p>Write code to pretty-print a tree of DOM nodes.</p>
</li>
</ol>
<p>In the <a href="/mbrubeck/2014/08/11/toy-layout-engine-2.html">next article</a>, we’ll add a parser that turns HTML source
code into a tree of these DOM nodes.</p>
<h2 id="references">References</h2>
<p>For much more detailed information about browser engine internals, see Tali
Garsiel’s wonderful <a href="http://www.html5rocks.com/en/tutorials/internals/howbrowserswork/">How Browsers Work</a> and its links to further
resources.</p>
<p>For example code, here’s a short list of “small” open source web rendering
engines. Most of them are many times bigger than robinson, but still way
smaller than Gecko or WebKit. WebWhirr, at 2000 lines of code, is the only
other one I would call a “toy” engine.</p>
<ul>
<li><a href="https://github.com/philborlin/CSSBox">CSSBox</a> (Java)</li>
<li><a href="https://github.com/silexlabs/Cocktail">Cocktail</a> (Haxe)</li>
<li><a href="https://gngr.info/">gngr</a> (Java)</li>
<li><a href="https://github.com/tordex/litehtml">litehtml</a> (C++)</li>
<li><a href="https://github.com/admin36/LURE">LURE</a> (Lua)</li>
<li><a href="http://www.netsurf-browser.org/">NetSurf</a> (C)</li>
<li><a href="https://github.com/servo/servo/">Servo</a> (Rust)</li>
<li><a href="http://hsbrowser.wordpress.com/3s-functional-web-browser/">Simple San Simon</a> (Haskell)</li>
<li><a href="https://github.com/Kozea/WeasyPrint">WeasyPrint</a> (Python)</li>
<li><a href="https://github.com/reesmichael1/WebWhirr">WebWhirr</a> (C++)</li>
</ul>
<p>You may find these useful for inspiration or reference. If you know of any
other similar projects—or if you start your own—please let me
know!</p>
Matt Brubeckmbrubeck@limpet.nethttps://limpet.net/mbrubeck/Better automated detection of Firefox performance regressions2013-11-10T11:53:00-08:00https://limpet.net/mbrubeck//2013/11/10/improving-regression-detection<p>Last spring I spent some of my spare time improving the <a href="http://hg.mozilla.org/graphs/file/tip/server/analysis/">automated script</a>
that detects regressions in <a href="https://wiki.mozilla.org/Buildbot/Talos">Talos</a> and other Firefox performance data.
I’m finally writing up some of that work in case it’s useful or interesting to
anyone else.</p>
<p>Talos is a system for running performance benchmarks; we use it to run a suite
of benchmarks every time a change is pushed to the Firefox source repository.
The Talos test harness reports these results to the <a href="http://graphs.mozilla.org/graph.html">graph server</a> which
stores them and can plot the recorded data to show how it changes over time.</p>
<p>Like most performance measurements, Talos benchmarks can be noisy. We need to
use statistics to separate signal from noise. To determine whether a change
to the source code caused a change in the benchmark results, an automated
script takes multiple datapoints from before and after each push. It computes
the average and standard deviation of the “before” datapoints and the “after”
datapoints, and uses a <a href="https://en.wikipedia.org/wiki/Student%27s_t-test">Student’s t-test</a> to estimate the likelihood that
the datasets are significantly different. If the t-test score exceeds a certain
threshold, the script sends email to the author(s) of the responsible patches
and to the <a href="http://www.mozilla.org/about/forums/#dev-tree-management">dev-tree-management</a> mailing list.</p>
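<p>As a rough sketch of that comparison, the following computes a t score from the “before” and “after” windows. It assumes the unequal-variance (Welch) form of the statistic, which may differ from what the real script does; the function names and the threshold parameter are hypothetical.</p>

```rust
/// Arithmetic mean of a window of datapoints.
fn mean(xs: &[f64]) -> f64 {
    xs.iter().sum::<f64>() / xs.len() as f64
}

/// Sample variance (n - 1 in the denominator); needs at least two datapoints.
fn sample_variance(xs: &[f64]) -> f64 {
    let m = mean(xs);
    xs.iter().map(|x| (x - m).powi(2)).sum::<f64>() / (xs.len() as f64 - 1.0)
}

/// t score for the difference between the "before" and "after" windows.
/// Assumption: Welch's unequal-variance form of the statistic.
fn t_score(before: &[f64], after: &[f64]) -> f64 {
    let standard_error = (sample_variance(before) / before.len() as f64
        + sample_variance(after) / after.len() as f64)
        .sqrt();
    // With zero variance in both windows and different means, this is
    // a division by zero and evaluates to +infinity.
    (mean(after) - mean(before)).abs() / standard_error
}

/// Hypothetical alert rule: flag the push when the score exceeds `threshold`.
/// An infinite score compares greater than any finite threshold.
fn is_regression(before: &[f64], after: &[f64], threshold: f64) -> bool {
    t_score(before, after) > threshold
}
```

<p>Note that two perfectly flat windows with different averages produce an infinite score. In this sketch that still counts as exceeding the threshold, since the plain comparison is true for infinity.</p>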
<p>By nature, these statistical estimates can never be 100% certain. However, we
need to make sure that they are correct as often as possible. False negatives
mean that real regressions go undetected. But false positives will generate
extra work, and may cause developers to ignore future regression alerts. I
started inspecting graph server data and regression alerts by hand, recording
and investigating any false negatives or false positives I found, and filed
<a href="https://bugzilla.mozilla.org/showdependencytree.cgi?id=627860&hide_resolved=0">bugs</a> to fix the causes of those errors.</p>
<p>Some of these were straightforward implementation bugs, like one where an
<a href="https://bugzilla.mozilla.org/show_bug.cgi?id=858735">infinite t-test score</a> (near certain likelihood of regression) was treated
as a zero score (no regression at all). Others involved <a href="https://bugzilla.mozilla.org/show_bug.cgi?id=858877">tuning</a> the
number of datapoints and the threshold for sending alerts. I also added some
unit tests and regression tests, including some past datasets with known
regressions. Now we can ensure that future revisions of the script still
identify the correct regression points in these datasets.</p>
<p>Some fixes required more involved changes to the analysis. For example,
if one code change actually caused a regression, the pushes right before or
after that change will also appear somewhat likely to be responsible for the
regression (because they will also have large differences in average score
between their “before” and “after” windows). If multiple pushes in a row had
t-test scores over the threshold, the script used to send an alert for the
first of those pushes, even if it was not the most likely culprit. Now the
script blames the push with the <a href="https://bugzilla.mozilla.org/show_bug.cgi?id=858756">highest t-test score</a>, which is almost
always the guilty party. This change had the biggest impact in reducing
incorrect regression alerts.</p>
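<p>The idea of blaming the highest-scoring push in each run can be sketched as follows. This is an illustration under assumptions, not the script’s actual code: <code>pushes_to_blame</code> is a hypothetical function that takes one t-test score per push, in push order.</p>

```rust
/// Returns, for each consecutive run of scores above `threshold`, the index
/// of the push with the highest score in that run.
fn pushes_to_blame(scores: &[f64], threshold: f64) -> Vec<usize> {
    let mut blamed = Vec::new();
    let mut best: Option<usize> = None; // best candidate in the current run

    for (i, &score) in scores.iter().enumerate() {
        if score > threshold {
            // Inside a run: keep whichever push has the higher score.
            match best {
                Some(b) if scores[b] >= score => {}
                _ => best = Some(i),
            }
        } else if let Some(b) = best.take() {
            // The run just ended: report its highest-scoring push.
            blamed.push(b);
        }
    }

    // A run that reaches the end of the data still needs to be reported.
    if let Some(b) = best {
        blamed.push(b);
    }
    blamed
}
```

<p>For scores <code>[1.0, 8.0, 12.0, 9.0, 2.0]</code> and a threshold of <code>7.0</code>, the old behavior would have blamed index 1 (the first push over the threshold); this version blames index 2.</p>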
<p>After those changes, there was still one common cause of false alarms that I
found. The regression analyzer compares the 12 datapoints before each push to
the 12 datapoints after it. But these 12-point moving averages could change
suddenly not just at the point where a regression happened, but also at an
unrelated point that happens to be 12 pushes earlier or later. This caused
<a href="https://bugzilla.mozilla.org/show_bug.cgi?id=879903">spooky action at a distance</a> where a regression in one push would cause a
false alarm in a completely different push. To fix this, we now compute
weighted averages with “triangular” weighting functions that give more weight
to the point being analyzed, and fall off gradually with increasing distance
from that point. This smooths out changes at the opposite ends of the moving
windows.</p>
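<p>Here is a minimal sketch of a triangular weighted average. The exact weights the real script uses are not given here, so this assumes the simplest choice: weights that fall linearly from <code>n</code> for the datapoint nearest the push down to 1 for the farthest. The name <code>triangular_mean</code> is hypothetical.</p>

```rust
/// Weighted mean of a window whose values are ordered from nearest to the
/// push being analyzed (index 0) to farthest from it (last index).
/// Assumption: weights fall linearly from n down to 1.
fn triangular_mean(nearest_first: &[f64]) -> f64 {
    let n = nearest_first.len();
    let mut weighted_sum = 0.0;
    let mut total_weight = 0.0;
    for (i, &value) in nearest_first.iter().enumerate() {
        let weight = (n - i) as f64;
        weighted_sum += weight * value;
        total_weight += weight;
    }
    weighted_sum / total_weight
}
```

<p>With the window <code>[10.0, 10.0, 10.0, 20.0]</code>, the outlier at the far end pulls a plain average up to 12.5 but the triangular average only to 11.0, so a sudden change entering or leaving the far edge of the window has a smaller effect.</p>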
<p>There are still occasional errors in regression detection, but as far as I can
tell most of them are caused by genuinely misleading random noise or
<a href="http://bugzil.la/:Talos+[bimodal">bimodal data</a>. If you see any problems with regression emails, please
<a href="https://bugzilla.mozilla.org/enter_bug.cgi?product=Release+Engineering&component=Tools">file a bug</a> (and CC :mbrubeck) and we’ll take a look at it. And if you’d
like to help reduce useless alert emails, maybe you’d like to help fix
<a href="https://bugzilla.mozilla.org/show_bug.cgi?id=914756">bug 914756</a>…</p>
Matt Brubeckmbrubeck@limpet.nethttps://limpet.net/mbrubeck/A good time to try Firefox for Metro2013-11-10T09:20:00-08:00https://limpet.net/mbrubeck//2013/11/10/try-metro-firefox<p>“Firefox for Metro” is our project to build a new Firefox user interface
designed for touch-screen devices running Windows 8. (“Metro” was Microsoft’s
code name for the new touch-friendly user interface mode in Windows 8.) I’m
part of the small team working on this project.</p>
<p>For the past year we’ve been fairly quiet, partly because the browser has been
under heavy construction and not really suitable for regular use. It started
as a fork of the old <a href="https://support.mozilla.org/en-US/kb/how-do-i-install-windows-8-metro-style-firefox">Fennec</a> (mobile Firefox) UI, plus a new port of
Gecko’s widget layer to Microsoft’s <a href="http://en.wikipedia.org/wiki/Windows_Runtime">WinRT</a> API. We spent part of that
time ripping out and rebuilding old Fennec features to make them work on
Windows 8, and finding and fixing bugs in the new widget code. More recently
we’ve been focused on reworking the touch input layer. With a ton of help
from the graphics team, we replaced Fennec’s old multi-process JavaScript
touch support with a new <a href="https://wiki.mozilla.org/Platform/GFX/OffMainThreadCompositing">off-main-thread compositing</a> backend for the
Windows Direct3D API, and added WinRT support to the <a href="https://wiki.mozilla.org/Platform/GFX/APZ">async pan/zoom
module</a> that implements touch scrolling and zooming on Firefox OS.</p>
<p>All this work is still underway, but in the past week we finally reached a
tipping point where I’m able to use Firefox for Metro for most of my everyday
browsing. There are still bugs, and we are still actively working on
performance and completing the UI work, but I’m now finding very few cases
where I need to switch to another browser because of a problem with Firefox
for Metro. If you are using Windows 8 (especially on a touch-screen PC) and
are the type of brave person who uses Firefox nightly builds, this would be a
great time to <a href="https://support.mozilla.org/en-US/kb/how-do-i-install-windows-8-metro-style-firefox">try Metro-style Firefox</a> and let us know what you think!</p>
<p>Looking to the future, here are some of our remaining development priorities
for the first release of Firefox for Metro:</p>
<ul>
<li>
<p>Improve the installation and first-run experience, to help users figure out
how to use the new UI and switch between “Metro” and desktop modes. (Our UX
designer has user testing planned to help identify issues here and
throughout the product.)</p>
</li>
<li>
<p>Fix any performance and rendering issues with scrolling and zooming,
and add support for double-tap to zoom in on a specific page element.</p>
</li>
<li>
<p>Make the Metro and desktop interfaces <a href="http://www.brianbondy.com/blog/id/155/shared-profiles-for-metro-firefox-and-desktop-firefox">share a profile</a>, so
they can seamlessly use the same bookmarks and other data without connecting
to a Firefox Sync account.</p>
</li>
</ul>
<p>And here are some things that I hope we can spend more time on once that work
has shipped:</p>
<ul>
<li>
<p>Improve the experience on pages with plugins, which currently require
the user to switch to the desktop Firefox interface (<a href="https://bugzilla.mozilla.org/show_bug.cgi?id=936907">bug 936907</a>).</p>
</li>
<li>
<p>Implement a “Reader Mode,” like Firefox for Android. (A pair of students have
started working on this <a href="https://github.com/Yoric/Mozilla-Student-Projects/issues/57">project</a>, and their work should also be useful
for adding Reader Mode to Firefox for desktop.)</p>
</li>
<li>
<p>Add more features, and more ways to customize and tweak the Metro UI.</p>
</li>
</ul>
<p>If you want to contribute to any of this work, please check out our <a href="https://wiki.mozilla.org/Firefox/Windows_8_Integration">developer
documentation</a> and come chat with us in #windev on irc.mozilla.org or on
our <a href="https://mail.mozilla.org/listinfo/metro">project mailing list</a>!</p>
Matt Brubeckmbrubeck@limpet.nethttps://limpet.net/mbrubeck/Congratulations on IE10: from Mozilla with cake2012-10-26T14:38:00-07:00https://limpet.net/mbrubeck//2012/10/26/mozilla-ie10-cake<p class="figure" style="float:right; margin: 1em 0 1em 1em;">
<a href="http://www.flickr.com/photos/albill/5551419956/">
<img src="http://farm6.staticflickr.com/5024/5551419956_fb565579c5_n.jpg" alt="“Congratulations on shipping! Love, the IE team”" />
</a>
</p>
<p>Back when Firefox 2 was released (six years ago this week!), the Internet
Explorer team started a friendly tradition of <a href="http://fredericiana.com/2006/10/24/from-redmond-with-love/">sending Mozilla a cake</a> as
congratulations. This continued for <a href="http://www.openbuddha.com/2008/06/17/ie-sends-mozilla-a-new-cake-for-firefox-3/">Firefox 3</a> and <a href="http://www.openbuddha.com/2011/03/22/another-version-of-firefox-another-cake/">Firefox 4</a>.
After Firefox switched from major releases once or twice a year to incremental
updates every six weeks, they sent us <a href="http://img.ly/5k48">a cupcake</a> for the next few updates
instead. <tt>:)</tt></p>
<p>I thought it would be fun to revive the tradition by ordering a cake for the
IE team for the IE10 release today. Here it is right after I picked it up
from <a href="http://www.custombakedcakes.com/">Baked Custom Cakes</a>, with a Firefox logo in painted fondant:</p>
<p class="figure">
<img src="/mbrubeck/images/2012/ie10-mozilla-firefox-cake.jpg" alt="" />
</p>
<p class="figure">
<a href="http://www.flickr.com/photos/mbrubeck/8125978974/in/photostream/lightbox/">
<img src="http://farm9.staticflickr.com/8183/8125978974_11c00ceb8c_z.jpg" alt="“Congratulations on IE10! Love, Mozilla”" />
</a>
</p>
<p>Fellow Mozilla developer <a href="http://blog.monotonous.org/">Eitan Isaacson</a> drove with my wife Sarah and me
to Microsoft Building 50 in Redmond, where program manager Jacob Rossi
helped us deliver the cake to a group of IE team members:</p>
<p class="figure">
<a href="http://www.flickr.com/photos/mostlypictures/8125824299/in/set-72157631859662686/">
<img src="http://farm9.staticflickr.com/8196/8125824299_8cd3b13184_z.jpg" alt="Photo: IE team members and two Mozilla developers gather around the
IE10 cake in Microsoft Building 50" />
</a>
</p>
<p class="caption">That's me on the left and Eitan on the right in Firefox
hoodies.</p>
<p>The IE team posted their <a href="https://twitter.com/IE/status/261920937123917826">thanks</a> through their official Twitter account.
(As you can see from their picture, the bottom border of the cake was slightly
<a href="https://twitter.com/carlesgp/status/261928471398346752">restyled</a> in transit.) Just 30 minutes later, Michael Bolan tweeted that
the cake was <a href="https://twitter.com/therpham/status/261921654148591616">gone</a>. I hear the sugary Firefox logo was eaten soon
after.</p>
<p>So congratulations to the Internet Explorer team on your latest release, and
we hope you enjoyed the cake!</p>
Matt Brubeckmbrubeck@limpet.nethttps://limpet.net/mbrubeck/Metro Firefox without Windows 82012-09-19T17:42:00-07:00https://limpet.net/mbrubeck//2012/09/19/metro-firefox-without-windows<p class="figure"><img src="/mbrubeck/images/2012/metrodesktop.png" alt="" /></p>
<p>A few weeks ago I started working on the <a href="http://www.brianbondy.com/blog/id/151/">Firefox “Metro UI”</a> project, for
Windows 8’s Metro (or <a href="http://www.theverge.com/2012/8/10/3232921/microsoft-modern-ui-style-metro-style-replacement">Modern</a>) touch-screen environment. While we’re
still working on getting our first preview builds ready for Windows 8 users to
try out, you can already check out the current source code from the <a href="http://hg.mozilla.org/projects/elm/">elm</a>
branch and build it yourself if you want to get involved and help us fix some
<a href="https://wiki.mozilla.org/Firefox/Windows_8_Integration">bugs</a>.</p>
<p>What you might not know is that you can run “Metro” Firefox even if you don’t
have Windows 8. It’s been possible for a while to build and run on older
versions of Windows using the <a href="https://wiki.mozilla.org/Firefox/Windows_8_Integration#Desktop_Launch">-metrodesktop</a> flag. Today I landed a
<a href="https://bugzilla.mozilla.org/show_bug.cgi?id=792509">patch</a> to make this work on other platforms too. To build the latest elm
source code on Linux or Mac OS X, follow these instructions:</p>
<ol>
<li>
<p>Clone the elm repo:
<code>hg clone http://hg.mozilla.org/projects/elm/</code>
<br />(If you have already cloned mozilla-central or some other repo that
shares with it, there’s a <a href="http://jlebar.com/2011/5/20/Faster_and_smaller_clones_of_branches.html">faster way</a> to do this.)</p>
</li>
<li>
<p>Create a <a href="https://developer.mozilla.org/en/Configuring_Build_Options">.mozconfig</a> file with <code>ac_add_options --enable-metro</code></p>
</li>
<li>
<p><a href="https://developer.mozilla.org/en-US/docs/Simple_Firefox_build">Build Firefox</a> as you normally would.</p>
</li>
<li>
<p>From your <a href="https://developer.mozilla.org/en-US/docs/Configuring_Build_Options#Building_with_an_Objdir">objdir</a>, run <code>dist/bin/firefox -metrodesktop</code> (Linux)
<br />or <code>dist/Nightly.app/Contents/MacOS/firefox -metrodesktop</code> (Mac)</p>
</li>
<li>
<p>You can visit <code>about:config</code> and enable <code>metro.debug.treatmouseastouch</code>
(then restart the browser) to simulate touch interaction with the mouse.
Right-click to simulate the Windows 8 edge-swipe gesture, which displays
the toolbars.</p>
</li>
</ol>
<p>This is still experimental and mostly untested. Elm might accidentally break
on non-Windows platforms from time to time (because of course we are doing all
our main development and testing on Windows). While it’s not a perfect
replacement for running in the real Windows 8 environment, I hope this is a
useful option for adventurous Firefox contributors who want to experiment with
the Metro code but don’t have convenient access to Windows 8.</p>
Matt Brubeckmbrubeck@limpet.nethttps://limpet.net/mbrubeck/Android app permissions and Firefox Beta2011-10-11T10:00:00-07:00https://limpet.net/mbrubeck//2011/10/11/firefox-android-permissions<p>As a <a href="http://www.mozilla.org/about/mission.html">non-profit organization</a>, Mozilla has a strong commitment to personal
<a href="http://www.mozilla.org/en-US/legal/privacy/firefox.html">privacy</a> and <a href="http://www.mozilla.org/about/manifesto.html">empowerment</a>. But after we released the last update to
<a href="http://blog.mozilla.com/futurereleases/2011/09/30/firefoxbeta8/">Firefox Beta</a> for Android, many people started asking us why Firefox
needed access to their phone numbers.</p>
<p>Firefox does <em>not</em> access users’ phone numbers, but it was clear that we
needed to address this concern. Where did these questions come from? Here’s
the first thing users saw when installing or updating Firefox Beta in the
Android Market:</p>
<p class="figure">
<img src="/mbrubeck/images/2011/firefox-beta-market-permissions.png" alt="“Firefox Beta permissions: Storage, Phone calls, Network
communication, Your location”" />
</p>
<p>The “Phone Calls” permission was added in the last update to Firefox Beta (but
has since been removed, as I’ll explain below). When users installed
that update and tapped on “Phone calls” for more information, they saw this:</p>
<p class="figure">
<img src="/mbrubeck/images/2011/firefox-beta-market-read-phone-state.png" alt="“Read phone state and identity: Allows the application to access
the phone features of the device. An application with this permission can
determine the phone number and serial number of this phone, whether a call
is active, the number that call is connected to and the like.”" />
</p>
<p>Why did Firefox Beta ask for this permission? Firefox did not ever access
phone numbers, serial numbers, or phone calls. But it did have <a href="https://bugzilla.mozilla.org/show_bug.cgi?id=667980">code</a> to
detect the type of network connection: 2G, 3G, 4G, Wi-Fi, and so on. Firefox
or add-ons could use this code to change settings automatically based on
network type, for example to use less data on mobile networks.</p>
<p>Unfortunately, this required permission to <code>READ_PHONE_STATE</code>,
which also grants access to very sensitive data. We knew this would worry
some users, so we immediately started working on <a href="https://bugzilla.mozilla.org/show_bug.cgi?id=672352">explaining</a> how and why
Firefox uses various permissions. We now have this information on our
<a href="https://support.mozilla.com/en-US/kb/how-firefox-android-use-permissions-it-requests">support site</a> and will link to it from our Android Market page.</p>
<p>But the reaction to the new permission in Firefox Beta was so strong that we
decided to <a href="https://bugzilla.mozilla.org/show_bug.cgi?id=691054">remove that permission completely</a>, along with the code that
used it. Now when you go to the Android Market to <a href="https://market.android.com/details?id=org.mozilla.firefox_beta">install Firefox Beta</a>,
it will no longer ask to read “phone state and identity.”</p>
<h2>Thoughts on Android Permissions</h2>
<p>Permissions on Android and similar platforms are not perfect, but they do give
users some useful tools to protect themselves. When an app requests only
minimal permissions, users know it can do only limited damage if it is buggy
or malicious. Recent versions of Android also have well-written explanations
of each permission to help users make decisions.</p>
<p>But when an app requests lots of permissions, users have a tough choice. They
can grant the permissions, or not use the app at all. This is especially bad
for permissions like <code>READ_PHONE_STATE</code> that are needed for some reasonable
features but also provide access to sensitive data. Eventually, most people
probably get used to granting whatever permissions are requested, especially
for apps like Facebook and Netflix that provide unique access to popular
services.</p>
<p>Making permissions finer-grained might help (for example, separating “Read
phone number” from “Read connection type”), but would also mean longer lists
of permissions. That could make users even less likely to read and understand
them. Explanations from developers can also help, but only if users trust
them to tell the truth. Allowing users to grant or deny individual
permissions (perhaps only at the time the app needs them) might help too, or
it might just train users to always grant permissions so that apps will stop
nagging them.</p>
<p>Aside from these overall design issues, there are also bugs in the
<a href="http://code.google.com/p/android/issues/detail?id=6940">developer documentation</a>, and a bug that causes old permissions to <a href="https://bugzilla.mozilla.org/show_bug.cgi?id=691054#c28">stick
around</a> even after updating to a new version that doesn’t need them. These
little bugs make it harder for developers to do the right thing. Some
researchers at UC Berkeley have analyzed the Android source code to produce
<a href="http://android-permissions.org/">tools and documentation</a> that fill in some of the gaps for developers.</p>
<p>The good news is that some users <em>are</em> paying attention, and those users make
things better for everyone by pressuring developers (like us!) to remove
invasive permissions. If you’re one of the Firefox fans who wrote to us
about the new permissions in Firefox Beta, thank you! We appreciate it.</p>
Matt Brubeckmbrubeck@limpet.nethttps://limpet.net/mbrubeck/Mobile web developers: Your users hate it when you do this2010-11-19T07:15:00-08:00https://limpet.net/mbrubeck//2010/11/19/mobile-browser-detection<p><a href="http://www.mozilla.com/mobile/">Mobile Firefox</a> beta releases include a “Feedback” add-on (like the one in
Firefox 4 beta for desktop), which lets users tell us what they think about
the new browser. Based on a sample of <a href="http://input.mozilla.com/en-US/search/?product=mobile">feedback</a> from mobile beta
testers, the most common complaints are about:</p>
<ol>
<li>Speed</li>
<li>Fitting text to the screen when zoomed in</li>
<li>Mobile vs. desktop versions of web sites</li>
</ol>
<p>The first two are straightforward, though not necessarily easy. We’re always
working on performance, and we have experimental text reflow code (currently
available in the <a href="https://addons.mozilla.org/en-US/mobile/addon/157099/">Easy Reading add-on</a>). But the last item is more
complicated…</p>
<h2 id="browser-detection-pitfalls">Browser detection pitfalls</h2>
<p>Web sites can read the User-Agent header sent by your browser to see what
browser and OS you are using. Some sites use that information to decide
whether to send a “full” version of a web page, or a version formatted for
mobile devices.</p>
<p>This can go wrong in several ways. If your browser or device is new, or
wasn’t tested when a site was developed, that site has no way of knowing
whether it is “mobile.” Users may also change their User-Agent to <a href="http://www.mobilecrunch.com/2010/05/24/how-to-watch-hulu-on-android-2-2/">work
around content restrictions</a> or <a href="http://daringfireball.net/2010/11/masquerading_as_mobile_safari">access different media formats</a>. And
some sites make incorrect assumptions, like that all browsers with “Android”
in their User-Agent string are based on WebKit.</p>
<p>Even when the browser is known, readers and publishers might not
agree about whether the mobile or desktop version is better. Based on our
feedback, some users want to switch from full sites to mobile sites while
others want just the opposite. And some devices, like large touch-screen
tablets, combine aspects of handheld and desktop computers.</p>
<h2 id="solutions">Solutions</h2>
<p>These <a href="http://input.mozilla.com/en-US/search/?product=mobile&sentiment=sad&q=mobile+view">complaints</a> show that many people are under the mistaken
impression that the browser, rather than the web site, decides whether to
display mobile-formatted pages. Even the New York Times’ David Pogue gets this
wrong in his <a href="http://www.nytimes.com/2010/11/11/technology/personaltech/11pogue.html">Galaxy Tab review</a>:</p>
<blockquote>
<p><em>When you visit sites like nytimes.com, CNBC.com and Amazon.com, the Galaxy’s
browser shows the stripped-down, mobile versions of those sites. According
to Samsung, there’s no way to turn that feature off and no way to visit the
full-size sites. You can delete the little “m.” in the Web address until
you’re blue in the browser, but the Galaxy always puts it right back.</em></p>
</blockquote>
<p>Web developers: your readers are <em>begging</em> us to display your content in their
preferred format. We want to help them, but we can’t do it alone.</p>
<p>(I wrote an add-on called <a href="https://addons.mozilla.org/en-US/mobile/addon/162014/">Phony</a> that lets mobile Firefox impersonate the
User-Agent strings of other browsers. While this improves the experience on
some sites, it breaks it on others. Masquerading as another browser can
lead sites to serve non-standard markup that does not work in Firefox. Trying to
solve this in the browser creates as many problems as it solves.)</p>
<p>Because browser detection is never perfect, web sites should <strong>let readers
choose</strong> between mobile and full content. They can try to guess the right
version by default, but please let users opt in or out.</p>
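<p>As a rough sketch of that rule (not code from any particular site), the
decision could look like the following. The function name and the
<code>user_choice</code> value, which might come from a cookie, are assumptions
made for illustration. The only point is that an explicit choice overrides the
guess.</p>

```python
from typing import Optional


def choose_site_version(guessed_mobile: bool, user_choice: Optional[str] = None) -> str:
    """Return "mobile" or "full" for this request.

    user_choice is a hypothetical stored preference (for example, read from
    a cookie): "mobile", "full", or None if the reader never chose.
    """
    if user_choice in ("mobile", "full"):
        # An explicit choice always beats the User-Agent guess.
        return user_choice
    # No valid choice recorded: fall back to the site's best guess.
    return "mobile" if guessed_mobile else "full"
```

<p>With this shape, a wrong guess costs the reader one tap instead of locking
them into the wrong version.</p>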
<h2 id="suggestions-for-web-developers">Suggestions for web developers</h2>
<p>Here are some first steps typical mobile web sites can take to make their
readers happier:</p>
<ul>
<li>
<p>When appropriate, serve the same content to all browsers. You can use
stylesheets and scripts to customize your layout for different display
sizes, as in this <a href="http://hicksdesign.co.uk/journal/finally-a-fluid-hicksdesign">beautiful site by Jon Hicks</a>.</p>
</li>
<li>
<p>There <em>are</em> valid reasons to use User-Agent sniffing. But if you must use
it, test in as many browsers and devices as possible and learn the correct
way to detect various browsers. For example, you can <a href="https://developer.mozilla.org/en-US/docs/Gecko_user_agent_string_reference">detect Gecko-based
browsers</a> by looking for <code>Gecko</code> and <code>rv:</code>, and you can detect mobile
Firefox by looking for <code>Mobile</code>. <em>[An earlier version of this post
recommended looking for <code>Fennec/</code> but this is no longer correct.]</em></p>
</li>
<li>
<p>If a “mobile” user requests a page that isn’t available on your mobile site,
don’t just redirect them to an unrelated mobile landing page.</p>
</li>
<li>
<p>Let users switch from your mobile site to your full site <em>and vice-versa</em>.
You can remember users’ previous choices for convenience, but let them
change their minds.</p>
</li>
</ul>
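<p>For sites that do sniff the User-Agent, here is a minimal sketch of the
Gecko check described in the list above. The function names are mine, and the
sample strings in the comments only illustrate the general shape of these
headers; real sites should test against the strings they actually receive.</p>

```python
def is_gecko(user_agent: str) -> bool:
    """True if the string contains both "Gecko" and "rv:".

    "Gecko" alone is not enough: WebKit browsers send "(KHTML, like Gecko)".
    """
    return "Gecko" in user_agent and "rv:" in user_agent


def is_mobile_firefox(user_agent: str) -> bool:
    """True for a Gecko browser that also identifies itself as "Mobile"."""
    return is_gecko(user_agent) and "Mobile" in user_agent


# Illustrative strings only:
#   "Mozilla/5.0 (Android; Mobile; rv:14.0) Gecko/14.0 Firefox/14.0"
#   "Mozilla/5.0 (Linux; U; Android 2.2) AppleWebKit/533.1 (KHTML, like Gecko) Mobile Safari/533.1"
```

<p>Even with a check like this, the result should only set the default
version, not remove the reader’s ability to switch.</p>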
<h2 id="further-reading">Further reading</h2>
<p>For much more comprehensive development advice, see <a href="http://yiibu.com/">Yiibu</a>’s thoughtful
and practical approach to building sites that work across many different
browsers and mobile devices.</p>
<p>One concern with the “same markup” approach is that it leads to heavyweight
pages. Peter-Paul Koch explains how you can avoid sending unused images or
markup to mobile devices by <a href="http://www.quirksmode.org/blog/archives/2010/08/combining_media.html">combining CSS media queries and JavaScript</a>
to implement progressive enhancement.</p>
<p>Coming from a different perspective, Andrea Trasatti (former developer of the
device-detection library WURFL) talks about <a href="http://blog.trasatti.it/2010/10/sorting-user-agent-strings-out.html">problems in mobile User-Agent
strings</a> and how they could be more useful for device detection.</p>
Matt Brubeckmbrubeck@limpet.nethttps://limpet.net/mbrubeck/What's different about Firefox for Android2010-10-04T16:00:00-07:00https://limpet.net/mbrubeck//2010/10/04/why-fennec-is-different<p>The Mozilla Mobile team has been working for several months on
<a href="http://www.mozilla.com/mobile/">Firefox 4 for Android and Maemo</a> (also known as “Fennec”). Here are some
thoughts about the challenges we’ve discovered, how we’ve decided to solve
them, and why Firefox is different from other mobile browsers.</p>
<h2 id="why">Why</h2>
<p>People often ask us why Android needs another web browser. These are a few
things Firefox does that other Android browsers don’t:</p>
<ul>
<li>
<p><a href="http://www.mozilla.com/firefox/sync/">Syncs bookmarks, tabs, history, passwords, and form data</a>
to and from your phone. Firefox Sync and the Firefox Awesomebar help you
enter URLs and passwords with less typing, and move seamlessly between your
desktop and your mobile phone.</p>
</li>
<li>
<p>Lets anyone write <a href="https://addons.mozilla.org/mobile/">add-ons</a> that can customize any part of the
user interface. (Dolphin HD is another Android browser with some great
add-ons, but its add-ons are provided by the browser vendor.)</p>
</li>
<li>
<p>Uses the Jaegermonkey JIT, which is <a href="http://arewefastyet.com/">getting faster</a> all the
time. It runs JavaScript much faster than the Android 2.1 browser, and is
even faster than the Android 2.2 browser on WebKit’s own SunSpider
benchmark.</p>
</li>
<li>
<p>Supports web technologies like SVG, ECMAScript 5, WebM, and HTTP Strict
Transport Security. Firefox for Android currently scores 217 points plus 9
bonus points on <a href="http://html5test.com/">html5test.com</a>. (Warning: Those
tests can be deceptive; use them as a starting point for comparison only.)</p>
</li>
</ul>
<p>There are several other browsers for Android, but all of them use the built-in
WebKit rendering engine (except Opera Mini, which uses a proxy server for
rendering). The same is true for iOS, which uses WebKit too – as do the
latest versions of BlackBerry, Symbian, and Palm webOS. <em>[Update, November
2010: Opera Mobile for Android is now in beta and uses its own rendering
engine too.]</em></p>
<p>Part of the point of Firefox is to provide alternate capabilities, rather than
reuse the built-in ones. Firefox for Android uses the same Gecko engine as
Firefox 4 for desktop. That’s how it can support features that Android’s
WebKit doesn’t, like SVG and ES5. (Of course, WebKit supports some features
Gecko doesn’t, which is why it’s great to have a choice.)</p>
<h2 id="speed-and-responsiveness">Speed and responsiveness</h2>
<p>Early Firefox for Android builds were very slow compared to the stock browser.
Performance is critical in a mobile browser, and our work in this area is
starting to pay off. The new beta version is much speedier, and we have plans
to make it even faster.</p>
<p>To make sure Firefox’s interface stays responsive even when pages are
rendering, we split the browser into two processes: one for the user
interface, and one to render and run scripts in web content. This was part of
the <a href="http://starkravingfinkle.org/blog/2010/06/fennec-2-0-whats-coming/">Electrolysis</a> project.</p>
<p>To make scrolling and zooming fast, Mozilla’s graphics team has implemented
a new “Layers” architecture to allow hardware acceleration and other
optimizations. Beta 1 is the first mobile Firefox release to take advantage
of this work. Because of Electrolysis, we need to share layers between the UI
process and the content process. These cross-process layers allow Firefox to
scroll smoothly in response to user input, even if the content process is
still busy rendering the page.</p>
<p>These are just the first steps toward making Firefox fast on mobile devices.
Upcoming releases will feature OpenGL hardware-accelerated compositing, for
improved scrolling of complex pages. They will also load web pages faster,
thanks to <a href="http://mozakai.blogspot.com/2010/09/visualizing-ipc-messages-in-fennec.html">optimizations of our inter-process communication</a>.</p>
<h2 id="installation-size-problems-and-solutions">Installation size: Problems and solutions</h2>
<p>Not all mobile platforms allow browser apps to include low-level components
like JIT compilers. Fortunately there are still platforms like Android,
webOS, and Maemo that let apps bundle any libraries they want. But while
Android <em>allows</em> us to distribute our own rendering engine and JavaScript
compiler, it isn’t really built with apps like Firefox in mind.</p>
<p>Unlike browsers that use the stock WebKit library, Fennec must ship its own
rendering engine. Many Android phones were built with just 64 MB to 512 MB of
storage for apps. Users who think nothing of a 12 MB download to install
Firefox or Chrome on a laptop may think twice before installing it on these
phones! Storage space is much larger on newer phones, but this is still an
issue for many users.</p>
<p>Even worse, a quirk of the Android NDK means these native libraries are saved
twice – both compressed inside the APK and extracted to a folder for
loading. For apps like Firefox that are mostly native code, this more than
doubles the installation size. Other NDK apps like Google Earth pay the same
double storage penalty.</p>
<p>To solve this problem, Mozilla’s Michael Wu wrote a <a href="https://bugzilla.mozilla.org/show_bug.cgi?id=588607">custom dynamic
linker</a> that loads libraries from the APK without installing them to a
folder. This cut the installed size by more than half, but increased startup
time slightly. For newer phones with 1 GB or more of internal storage, we
might want to let Firefox take more space but start faster. On phones with
less storage, we can use the custom linker to save space.</p>
<p>Firefox 4 beta 1 needed about 40 MB of storage on Android. With the custom
linker, Firefox 4 beta 2 takes a fraction of the space, and in Android 2.2 you
can move almost all of it to SD.</p>
<h2 id="hardware-compatibility">Hardware compatibility</h2>
<p>Stock libraries have another advantage: They can be optimized for specific
hardware. In contrast, apps usually come in a single flavor for all devices.
Firefox can use ARMv7 features like Thumb-2 and NEON to run as fast as
possible on high-end Android phones – but with these optimizations it
can’t run at all on low-end hardware. To run optimally on all current
hardware, we’d need different builds for different devices. For now we’re
focusing on the <a href="https://wiki.mozilla.org/Mobile/Platforms/Android#System_Requirements">current high-end phones</a>, which will likely be next
year’s mainstream hardware.</p>
<p>Even in this smaller set of devices, we’ve run into problems. Most recently,
we discovered that Firefox’s JIT code crashes unpredictably on Samsung Galaxy
S phones. This seems to be a bug in the Android 2.1 kernel on these devices.
Other developers are seeing similar problems, including crashes in the
MonoDroid JIT and in Android’s own Dalvik JIT. For now, Firefox’s JIT
features are disabled when running on Galaxy S hardware. This makes
JavaScript slower, but a lot more stable. The problem is fixed in <a href="http://www.engadget.com/2010/10/02/samsung-captivate-gets-unofficial-froyo-build-with-flash-10-1/">leaked
Android 2.2 images</a>, so we expect to re-enable the JITs before long.</p>
<h2 id="competition-and-choice">Competition and choice</h2>
<p>Firefox is built by Mozilla, a non-profit organization with a <a href="http://www.mozilla.org/about/mission">mission</a> to
promote openness, innovation, and opportunity on the web. We want our work on
the mobile web to benefit everyone, not only Firefox users – just as
Firefox on the desktop helped create a new era of innovation and standards for
all browsers.</p>
<p>WebKit is an excellent project. But a growing number of mobile sites work
<em>only</em> on WebKit. This is dangerously similar to the web ten years ago, when
Internet Explorer had an overwhelming market share and many sites used
IE-specific markup. That made it hard for other browsers to compete, which
killed the incentive for the dominant browser to keep improving.</p>
<p>Upcoming platforms like MeeGo and Windows Phone will give WebKit some new
mobile competition – but many users still can’t choose new browser
technology without buying new hardware (and often new service contracts). We
think you should have a meaningful choice of browsers on your current phone,
just like you do on your computer. User choice will encourage all browsers to
innovate and learn from each other, so they all improve faster.</p>
<h2 id="try-it-out">Try it out</h2>
<p>To check if your phone is compatible with Firefox 4 beta, go to our
<a href="http://www.mozilla.com/mobile/download/">download page</a>. This is a test release, and we aren’t finished fixing and
optimizing it – but we are working hard. Let us know what you think!</p>
<p>We have a lot more work ahead of us. Our next releases will include even more
exciting changes like the <a href="http://www.flickr.com/photos/madhava_work/sets/72157624962763028/detail/">new Android skin</a>, reduced installation size,
and more speed improvements.</p>
Matt Brubeckmbrubeck@limpet.nethttps://limpet.net/mbrubeck/Changes for add-ons in Fennec 2.0 alpha 12010-09-02T16:11:00-07:00https://limpet.net/mbrubeck//2010/09/02/fennec-alpha-addon-changes<blockquote>
<p><strong>Update (October 2010):</strong> We’ve changed the version number of the next
mobile Firefox release from 2.0 to 4.0. The alpha release was numbered
2.0a1, but the first beta release will be numbered 4.0b1.</p>
</blockquote>
<p>Last week we released a new alpha version of <a href="http://www.mozilla.com/mobile/">Firefox for Android and Maemo</a>
(a.k.a. Fennec). This release brings some major changes and new
features for add-on authors. Our <a href="https://wiki.mozilla.org/Mobile/Fennec/Extensions">Fennec add-on documentation</a> now has
the details you need to start updating your Fennec add-ons or creating new
ones.</p>
<h2 id="whats-new-for-add-ons">What’s new for add-ons?</h2>
<p>One very big change in Fennec 2.0 is Electrolysis, the project to move
content and chrome into separate processes. Any add-on code that interacts
with web content through the DOM must now be in a separate script that runs in
the content process. For details, see the <a href="https://wiki.mozilla.org/Mobile/Fennec/Extensions/Electrolysis">Electrolysis guide for add-on authors</a>.</p>
<p>Fennec 2.0a1 also features new APIs for extending the context menu and
site menu. See the <a href="https://wiki.mozilla.org/Mobile/Fennec/Extensions/UserInterface">User Interface Guide</a> for links to documentation and
example code.</p>
<p>The upcoming beta releases will include even more changes. Add-ons that use
Fennec’s panning and zooming features will probably need significant changes
for the new graphics code in Fennec 2.0b1. We will also include APIs for
add-ons to customize <a href="https://wiki.mozilla.org/Mobile/Projects/Sharing">sharing</a> and <a href="https://wiki.mozilla.org/Mobile/Planning/2.0">other new features</a>. If you are
working on an add-on that is affected by these changes, please <a href="https://wiki.mozilla.org/Mobile#Get_Involved">let us
know</a>.</p>
<h2 id="get-started">Get started</h2>
<p>To start updating or creating your Fennec add-on, download
<a href="http://www.mozilla.com/mobile/">Firefox for Android and Nokia N900</a> or <a href="http://www.mozilla.com/en-US/mobile/platforms/">get the emulator</a> for
Mac/Windows/Linux. When you’re ready, update your addons.mozilla.org listing
and set the maxVersion to 2.0a1. Or you can start getting ready for beta by
setting your maxVersion to 2.0b1pre and keeping up-to-date with our pre-beta
<a href="http://ftp.mozilla.org/pub/mozilla.org/mobile/nightly/latest-mobile-trunk/">nightly builds</a>.</p>
Matt Brubeckmbrubeck@limpet.nethttps://limpet.net/mbrubeck/Fennec 2.0 update: The road to alpha2010-08-12T16:23:00-07:00https://limpet.net/mbrubeck//2010/08/12/fennec-2-alpha-status<blockquote>
<p><strong>Update (October 2010):</strong> We’ve changed the version number of the next
mobile Firefox release from 2.0 to 4.0. I’ve updated this article to use
the new version.</p>
</blockquote>
<p>The <a href="http://planet.firefox.com/mobile/">Mozilla Mobile team</a> has been quiet lately. We’re making a lot of
under-the-hood changes for the next version of <a href="https://wiki.mozilla.org/Mobile/Fennec">Fennec</a> (Firefox for
mobile), and have been focused on getting basic functionality working again
after some major platform changes.</p>
<p>Now things are starting to stabilize, and we are gearing up for an alpha
release in just a few weeks. There are still noticeable bugs in our current
builds, but it is possible to use them now for testing, add-on development,
and regular web browsing (if you don’t mind occasional crashes).</p>
<h2 id="under-the-hood">Under the hood</h2>
<p>The biggest back-end change in Fennec 4.0 is Electrolysis (a.k.a. “e10s”).
By moving content rendering and JavaScript into a separate process, e10s
allows the Fennec UI to stay responsive while pages are loading. This
required us to rewrite large parts of the Fennec UI and platform code, a
process that is finally approaching completion.</p>
<p>After the alpha release, the next big platform changes will be related to
<a href="https://wiki.mozilla.org/Gecko:Layers">Layers</a>. Fennec currently handles panning and zooming by dividing pages
into “tiles” and rendering them on HTML canvas elements. This works, but
it is complicated and not as fast as we’d like. The new layers system will
let us replace Fennec’s custom tile management with hardware-accelerated
rendering and compositing built into the Firefox 4.0 platform.</p>
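<p>To give a feel for what tile management involves, here is a simplified
sketch of one step: working out which tiles overlap the visible part of the
page. This is not Fennec’s actual code. The function name and the 256-pixel
tile size are assumptions for illustration, and coordinates are non-negative
integers in page pixels.</p>

```python
from typing import List, Tuple


def visible_tiles(view_x: int, view_y: int, view_w: int, view_h: int,
                  tile_size: int = 256) -> List[Tuple[int, int]]:
    """Return (column, row) indexes of the tiles that intersect the view."""
    first_col = view_x // tile_size
    first_row = view_y // tile_size
    # Subtract 1 so a view ending exactly on a tile edge does not
    # pull in the next tile.
    last_col = (view_x + view_w - 1) // tile_size
    last_row = (view_y + view_h - 1) // tile_size
    return [(col, row)
            for row in range(first_row, last_row + 1)
            for col in range(first_col, last_col + 1)]
```

<p>Every pan or zoom changes this set, and each newly visible tile has to be
rendered before it can be shown, which is part of why this approach is
complicated.</p>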
<h2 id="features">Features</h2>
<p>We have a bunch of <a href="https://wiki.mozilla.org/Mobile/Planning/2.0">new features</a> planned for the Fennec UI. A few of
these have already started to land, so you can try them in nightly
builds or the upcoming alpha:</p>
<ul>
<li>
<p><a href="http://www.mozilla.com/en-US/firefox/sync/">Firefox Sync</a> is now built
in – sync tabs, bookmarks, and history from your computer to your
phone, no add-on required!</p>
<p style="text-align: center"><img alt="" src="http://limpet.net/mbrubeck/2010/08/12/fennec-2-weave-firefox-sync.png" />
</p>
</li>
<li>
<p>The new [Find In Page][9] command is available
through the site menu (or by pressing Control+F on a hardware keyboard).</p>
<p style="text-align: center"><img alt="" src="http://limpet.net/mbrubeck/2010/08/12/fennec-2-find-in-page.png" />
</p>
</li>
<li>
<p>You can now [share links][7] through Twitter, Facebook, Google Reader, or
email. (The final version of this feature will also let you send links
using native Android or Maemo apps.)
</p>
</li>
<li>
<p>Fennec alpha can [use your phone's address book][8] to make it easy to enter
phone numbers and email addresses into web forms. (This works on Maemo now;
support for Android will be added later.)
</p>
</li>
<li>
<p>We're adding [multi-touch gestures][15]. Pinch zoom has landed for
alpha; later releases will also include multi-touch swipe gestures
to go to the top or bottom of the current page, or navigate between pages.
</p>
</li>
</ul>
<p>The design of these features is not yet final, so their look and feel may
change significantly before the final release.</p>
<h2 id="android">Android</h2>
<p>Fennec 1.0 and 1.1 were available only for Nokia’s Maemo operating system.
Fennec 4.0 will run on the Google Android platform, as well as Maemo and its
successor MeeGo.</p>
<p>Fennec for Android is brand new, but it is progressing fast. Most of the
blocking bugs for alpha 1 have been fixed in the last few days, and the
very latest nightly builds are usable for regular browsing, though still rough
in places.</p>
<p>Some of our most visible Android bugs were related to keyboard and input
method support. Jim Chen’s <a href="https://bugzilla.mozilla.org/show_bug.cgi?id=576065">IME rewrite</a> fixed a lot of these bugs,
including a crash on startup with the popular Swype keyboard. There are still
a few keyboard bugs left to fix before alpha 1.</p>
<p>Other Android changes, like alert-bar notifications and a new visual
theme, will appear in our beta releases this fall.</p>
<h2 id="add-ons">Add-ons</h2>
<p>For Fennec add-ons, the biggest change coming is Electrolysis. Any add-on
code that interacts with web content through the DOM must now be in a separate
script that runs in the content process. Mark Finkle has written a very
useful <a href="https://wiki.mozilla.org/Mobile/Fennec/Extensions/Electrolysis">Electrolysis guide for add-on authors</a>.</p>
<p>In Fennec 1.1 we added the <a href="http://madhava.com/egotism/archive/005043.html">site menu</a> and <a href="http://starkravingfinkle.org/blog/2010/04/fennec-1-1-context-menus/">context menu</a>. Fennec 4
will have improved APIs for add-on authors to add new items to either of those
menus. Documentation of the new APIs is coming soon.</p>
<h2 id="nightly-builds">Nightly builds</h2>
<p>Alpha 1 will go into code freeze as soon as the remaining <a href="https://bugzilla.mozilla.org/buglist.cgi?quicksearch=blocking-fennec%3A2.0a1%2B">a1 blocker
bugs</a> are fixed. If you want to start testing or developing for Fennec 4
even sooner, you can download a nightly build today. Just remember this is
still pre-alpha software, and you should expect bugs.</p>
<ul>
<li>
<p>Maemo users can use the [trunk repo][17] to stay up to date with the latest
nightly build.
</p>
</li>
<li>
<p>For Android users, the [Fennec for Android wiki page][3] has pointers to the
latest nightly build, plus a list of known bugs and compatible hardware. (A
number of Android blogs have linked recently to random TryServer builds and
out-of-date blog posts. Please go to the wiki page instead to get the
latest information.)
</p>
</li>
<li>
<p>If you don't have a compatible Maemo or Android device, you can always
download [nightly Fennec builds for Mac/Windows/Linux][1] and try out Fennec
on your desktop or laptop.
</p>
</li>
</ul>
<p>If you have questions, feedback, or bug reports, file them under Fennec in
Bugzilla, or come to the #mobile channel on irc.mozilla.org to chat with us!</p>
Matt Brubeckmbrubeck@limpet.nethttps://limpet.net/mbrubeck/Implementing the viewport meta tag in Mozilla Fennec2010-05-11T15:36:00-07:00https://limpet.net/mbrubeck//2010/05/11/fennec-meta-viewport<p>The upcoming release of <a href="http://www.mozilla.com/mobile/">Mobile Firefox (Fennec)</a> 1.1 features improved
support for the <code>&lt;meta name="viewport"&gt;</code> tag. Previous versions of Fennec
supported the <em>width</em>, <em>height</em>, and <em>initial-scale</em> viewport properties, but
had <a href="http://starkravingfinkle.org/blog/2010/01/perils-of-the-viewport-meta-tag/">problems</a> with some sites designed for iPhone and Android browsers.
We now support the same properties Safari does, and we changed Fennec to render
mobile sites more consistently on screens of different sizes and resolutions.</p>
<p class="caption">touch.facebook.com before:</p>
<p class="figure"><img src="/mbrubeck/images/2010/05-11-fennec-meta-viewport-2.png" /></p>
<p class="caption">touch.facebook.com after:</p>
<p class="figure"><img src="/mbrubeck/images/2010/05-11-fennec-meta-viewport-1.png" /></p>
<p>You can see these changes for yourself in the latest <a href="http://ftp.mozilla.org/pub/mozilla.org/mobile/nightly/latest-mobile-1.9.2/">Fennec 1.1</a> or <a href="http://ftp.mozilla.org/pub/mozilla.org/mobile/nightly/latest-mobile-trunk/">trunk</a> nightly builds.</p>
<h2 id="background">Background</h2>
<p>Mobile browsers like Fennec render pages in a virtual “window” (the viewport),
usually wider than the screen, so they don’t need to mangle existing layouts
by squeezing them into a tiny window. Users can pan and zoom to display
different areas of the viewport.</p>
<p>Mobile Safari introduced the “viewport meta tag” to let web developers control
the viewport’s size and scale. Many other mobile browsers now support this
tag, although it is not part of any web standard. Apple’s <a href="http://developer.apple.com/safari/library/documentation/AppleApplications/Reference/SafariWebContent/UsingtheViewport/UsingtheViewport.html#//apple_ref/doc/uid/TP40006509-SW29">documentation</a>
does a great job explaining how it works for web developers, but it leaves out
some information that would be useful to browser vendors. For example, it
says the content attribute is a comma-separated list, but existing browsers
and web pages use a mix of commas, semicolons, and spaces as separators.</p>
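<p>To illustrate the separator problem, here is a minimal parsing sketch that
accepts commas, semicolons, and whitespace alike. It is not Fennec’s parser.
The function name is hypothetical, and for simplicity it assumes there is no
whitespace around the <code>=</code> sign.</p>

```python
import re
from typing import Dict


def parse_viewport_content(content: str) -> Dict[str, str]:
    """Parse a viewport "content" attribute into a property dictionary."""
    properties: Dict[str, str] = {}
    # Treat any run of commas, semicolons, or whitespace as one separator.
    for item in re.split(r"[,;\s]+", content.strip()):
        if "=" not in item:
            continue  # Skip empty or malformed pieces.
        name, _, value = item.partition("=")
        properties[name.lower()] = value
    return properties
```

<p>With this approach, <code>width=320, initial-scale=1</code> and
<code>width=320;initial-scale=1</code> produce the same result.</p>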
<h2 id="a-pixel-is-not-a-pixel">A pixel is not a pixel</h2>
<p>The iPhone and many popular Android phones have 3- to 4-inch screens with
320×480 pixels (~160 dpi). Fennec’s target devices have the same physical
size but 480×800 pixels (~240 dpi). Because of this, the last version of
Fennec displayed many pages about one third smaller (in physical units) than
iPhone or Android. This caused usability and readability problems on many
touch-optimized web sites. Peter-Paul Koch wrote about this problem in
<a href="http://www.quirksmode.org/blog/archives/2010/04/a_pixel_is_not.html">A pixel is not a pixel is not a pixel</a>.</p>
<p>Fennec 1.1 for Maemo will use 1.5 hardware pixels for each CSS “pixel,”
following the lead of the Android browser. This means a site with
“initial-scale=1” will render at the same physical size in Fennec for Maemo,
Mobile Safari for iPhone, and the Android Browser on both <a href="http://developer.android.com/guide/practices/screens_support.html#range">HDPI and MDPI</a>
phones. It’s also consistent with the <a href="http://www.w3.org/TR/CSS2/syndata.html#length-units">CSS 2.1 specification</a>, which says:</p>
<blockquote>
<p>If the pixel density of the output device is very different from that of a
typical computer display, the user agent should rescale pixel values. It is
recommended that the pixel unit refer to the whole number of device pixels
that best approximates the reference pixel. It is recommended that the
reference pixel be the visual angle of one pixel on a device with a pixel
density of 96dpi and a distance from the reader of an arm’s length.</p>
</blockquote>
<p>This change only affects web pages that explicitly set the viewport size or
scale. The pixel ratio of 1.5 applies only if the viewport scale is set to 1.
The size of a “pixel” on any page changes with the zoom level, and the default
zoom level for most pages in Fennec has not changed.</p>
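<p>The arithmetic is simple enough to sketch. The helper below is only an
illustration (its name and parameters are mine), but it shows how the 1.5
ratio and the viewport scale combine.</p>

```python
def css_to_device_pixels(css_px: float, viewport_scale: float = 1.0,
                         pixel_ratio: float = 1.5) -> float:
    """Convert a length in CSS pixels to hardware pixels.

    pixel_ratio is the number of hardware pixels per CSS pixel when the
    viewport scale is 1 (1.5 in Fennec 1.1 for Maemo).
    """
    return css_px * viewport_scale * pixel_ratio
```

<p>For example, a page that is 320 CSS pixels wide at scale 1 covers
320 × 1.5 = 480 hardware pixels, which is exactly the portrait width of a
480×800 screen.</p>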
<p>On 240-dpi screens, pages with <em>initial-scale=1</em> will effectively be zoomed to
150% by both Fennec and Android WebKit. Their text will be smooth and crisp,
but their bitmap images will probably not take advantage of the full screen
resolution. To get sharper images on these screens, mobile web developers can
create images at 150% of their final size (or 200%, to support the rumored
320-dpi iPhone) and then scale them down using HTML/CSS.</p>
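<p>As a rough illustration of the arithmetic (a sketch, not Fennec code; the function names are made up for this example), converting CSS pixels to device pixels and sizing a source bitmap might look like this:</p>

```javascript
// Assumed constant: the default CSS-to-device pixel ratio described above.
const PIXEL_RATIO = 1.5;

// Convert a length in CSS pixels to hardware (device) pixels.
function cssToDevicePixels(cssPx, ratio = PIXEL_RATIO) {
  return cssPx * ratio;
}

// Size of the source bitmap needed so that an image displayed at
// cssWidth x cssHeight maps one bitmap pixel to one device pixel.
function sourceImageSize(cssWidth, cssHeight, ratio = PIXEL_RATIO) {
  return {
    width: Math.round(cssWidth * ratio),
    height: Math.round(cssHeight * ratio),
  };
}

console.log(cssToDevicePixels(320));      // 480
console.log(sourceImageSize(100, 60));    // width 150, height 90
console.log(sourceImageSize(100, 60, 2)); // width 200, height 120
```

<p>The larger bitmap is then displayed at its CSS size (100 by 60 here) through the usual <code>width</code> and <code>height</code> attributes or CSS properties.</p>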
<p>WebKit on Android supports an additional undocumented
<a href="http://darkforge.blogspot.com/2010/05/customize-android-browser-scaling-with.html">target-densityDpi</a> property, to let web developers override the
CSS-to-device pixel ratio. Fennec doesn’t support this property now, but if
we see a compelling need for it (or if it becomes part of a documented
standard) then we might implement it too.</p>
<p>Right now Fennec uses the same default ratio of 1.5 on all devices. (This
is a hidden preference that can be changed in about:config or by an add-on.)
Later we’ll need to change this – as well as many other parts of
Fennec’s user interface – to choose the correct size automatically,
depending on the screen density.</p>
<h2 id="viewport-width-and-screen-width">Viewport width and screen width</h2>
<p>Many sites set their viewport to <em>width=320, initial-scale=1</em> to fit precisely
onto the iPhone display in portrait mode. As mentioned above, this caused
<a href="http://starkravingfinkle.org/blog/2010/01/perils-of-the-viewport-meta-tag/">problems</a> when Fennec 1.0 rendered these sites, especially in landscape
mode. To fix this, Fennec 1.1 will expand the viewport width if necessary to
fill the screen at the requested scale. This matches the behavior of Android
and Mobile Safari, and is especially useful on large-screen devices like the
iPad. (Allen Pike’s <a href="http://www.antipode.ca/2010/choosing-a-viewport-for-ipad-sites/">Choosing a viewport for iPad sites</a> has a good
explanation for web developers.)</p>
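<p>One way to picture the width expansion is the sketch below. It is an assumption about the general idea, not Fennec's actual implementation, and <code>effectiveViewportWidth</code> is a hypothetical helper:</p>

```javascript
// Hypothetical helper: the viewport width (in CSS pixels) a browser might use
// when a page requests `requestedWidth` at zoom level `scale`.
function effectiveViewportWidth(requestedWidth, scale, screenWidthDevicePx, pixelRatio = 1.5) {
  // Width of the screen measured in CSS pixels at the requested scale.
  const screenWidthCssPx = screenWidthDevicePx / (pixelRatio * scale);
  // Never lay the page out narrower than the screen.
  return Math.max(requestedWidth, screenWidthCssPx);
}

// A screen 480 device pixels wide: 480 / 1.5 = 320, so the page fits exactly.
console.log(effectiveViewportWidth(320, 1, 480)); // 320
// A screen 800 device pixels wide: the viewport grows to about 533 CSS pixels.
console.log(effectiveViewportWidth(320, 1, 800));
```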
<p>We also added support for <em>minimum-scale</em>, <em>maximum-scale</em>, and
<em>user-scalable</em>, with defaults and limits similar to <a href="http://developer.apple.com/safari/library/documentation/AppleApplications/Reference/SafariHTMLRef/Articles/MetaTags.html">Safari’s</a>. These
properties affect the initial scale and width as well as limiting zooming
after the page is loaded.</p>
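<p>To make the interaction between these properties concrete, here is a minimal sketch that parses a viewport string and clamps the initial scale. The fallback limits are placeholders chosen for illustration, not the real defaults used by Fennec or Safari, and <code>user-scalable</code> is not handled:</p>

```javascript
// Parse the content attribute of a viewport meta tag into an object.
function parseViewport(content) {
  const props = {};
  for (const part of content.split(/[,;]/)) {
    const [key, value] = part.split("=").map((s) => s.trim());
    if (key && value !== undefined) props[key.toLowerCase()] = value;
  }
  return props;
}

// Clamp the requested initial scale into [minimum-scale, maximum-scale].
// The fallback limits are illustrative placeholders only.
function initialScale(props, fallback = { min: 0.25, max: 4 }) {
  const min = parseFloat(props["minimum-scale"]) || fallback.min;
  const max = parseFloat(props["maximum-scale"]) || fallback.max;
  const requested = parseFloat(props["initial-scale"]) || 1;
  return Math.min(Math.max(requested, min), max);
}

const props = parseViewport("width=320, initial-scale=3, maximum-scale=2");
console.log(props.width);         // "320"
console.log(initialScale(props)); // 2, because maximum-scale caps the request
```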
<h2 id="standards">Standards</h2>
<p><code><meta name="viewport"></code> is a good example of browsers innovating exactly how
<a href="http://sachin.posterous.com/the-web-sucks">Sachin Agarwal thinks they should</a>. It was implemented by a single
browser, used by web developers, and copied by other browsers without waiting
for any standards organization. It has clearly improved on earlier solutions
like <a href="http://learnthemobileweb.com/2009/07/mobile-meta-tags/">MobileOptimized and HandheldFriendly</a>.</p>
<p>Now that viewport metadata has proved to be a useful extension to HTML, I
think it is worth standardizing. According to the HTML5 spec, new names for
the meta element should first be registered on the <a href="http://wiki.whatwg.org/wiki/MetaExtensions">WHATWG wiki</a> and then
be ratified through the W3C standards process. If anyone at Mozilla or
elsewhere is working on a standard specification for viewport metadata,
please let me know. <em>[Update: this is now being standardized as part of the
<a href="http://www.w3.org/TR/css-device-adapt/">CSS Device Adaptation</a> spec.]</em></p>
Matt Brubeckmbrubeck@limpet.nethttps://limpet.net/mbrubeck/Fennec on Android: user feedback and next steps2010-04-30T00:00:00-07:00https://limpet.net/mbrubeck//2010/04/30/fennec-android<p>Last month I joined Mozilla as a UI engineer on the <a href="http://www.mozilla.com/mobile/">Fennec (Mobile
Firefox)</a> project. Firefox is already available for Nokia’s Maemo platform, and
now a group of Mozilla programmers is <a href="https://wiki.mozilla.org/Mobile/Platforms/Android">porting it to Android</a>. This Tuesday
they asked for feedback on an early <a href="http://blog.vlad1.com/2010/04/27/fennec-on-android-ground-zero/">preview build</a>.</p>
<p class="figure"><img src="/mbrubeck/images/2010/04-30-fennec-n1.png" /></p>
<p>Until now, the only people working on Firefox for Android were back-end
(platform) developers. This week was the first time most other people
– including me – got to try it out. We front-end developers and
designers are now <em>starting</em> to adapt the user interface to Android.
For now it uses the look and feel of Firefox for Maemo.</p>
<p>Because we are an open source project, we like to share our work
even at this early stage of development. While I wasn’t directly
involved in the Android development effort, I spent some of my spare time this
week talking to users via Twitter and our <a href="http://groups.google.com/group/fennec-android-pre-alpha">Android feedback group</a>. Here’s
what I heard, in rough order of importance to users, plus some information on
our future plans.<sup id="fennec-android-fnr1"><a href="#fennec-android-fn1">1</a></sup></p>
<ul>
<li><strong>Zoom and multi-touch</strong>: Pinch zoom gestures are coming! We are reviewing a
patch for <a href="https://bugzilla.mozilla.org/show_bug.cgi?id=437957">animated multi-touch (pinch) zooming on Qt-based devices</a>, and
testing similar code on Android. (Maemo devices have no
multi-touch, so we use their volume buttons to zoom. That code hasn’t been
ported to Android, so only double-tap zoom was working in the preview
build.)
<p>
We also had some requests to fit text to the screen when zoomed in, like the
Android browser. Today Brad Lassey and Ben Stover released the [Easy
Reading][11] add-on that does exactly that. We might make this a built-in
option in Fennec once it is fast and reliable enough.
</p>
</li>
<li>
<p><strong>Menu and Back buttons:</strong> The preview build did not handle Android’s standard
hardware buttons, but code is now checked in to support the <a href="https://bugzilla.mozilla.org/show_bug.cgi?id=559453">back button</a>
and the <a href="http://hg.mozilla.org/users/vladimir_mozilla.com/mozilla-droid/rev/1af28380fc88">menu and search buttons</a>. We’ll continue to refine the way
Fennec uses these buttons.</p>
</li>
<li><strong>Size:</strong> A ten megabyte download (over 30 MB installed) is not
huge for a desktop browser, but it’s hefty for a mobile app –
especially on Android, where apps are saved to limited onboard memory.
<p>
Shrinking Fennec is possible, but not trivial. Some of the library and
toolkit code in our build is probably unused and could be removed. And we
could try minifying our JavaScript source, like many websites do.
Michael Wu hopes current efforts like [Omnijar][17] and [Thumb-2][18]
([bug 563751][20]) will cut the installed size approximately in half.
</p>
<p>
Users also reported that our Fennec build did not work with the feature in
some custom Android ROMs to move apps to the SD card. Mozilla's Android
devs are working on a fix for this. It will be nice someday when app
storage on Android is as plentiful as it is on other mobile platforms.
</p>
</li>
<li><strong>Hardware compatibility:</strong> There are a lot of different Android phones out
there. Some of them won’t run Fennec because they still have Android 1.5 or
1.6. We hope this will be fixed by the hardware vendors soon, since we
currently rely on some Android 2.0 APIs. Other devices failed for different
reasons, possibly related to insufficient RAM or incompatible OpenGL APIs.
We will need to optimize Firefox’s memory footprint on Android, and test on
a wider selection of devices, perhaps with help from Firefox users.
<p>
Here's a list of [supported hardware][21].
</p>
</li>
<li>
<p><strong>Keyboard problems</strong>: There were many problems with the software keyboard
working intermittently or not at all, especially in landscape orientation.
There were also problems with Shift and Alt keys on some hardware
keyboards. I haven’t heard any news about these bugs, but we know we
need to fix them quickly.</p>
</li>
<li><strong>Speed</strong>: Strangely, we had some users calling Fennec <a href="http://twitter.com/gwalter/statuses/13033288945">slooooooow</a> and
others calling it <a href="http://twitter.com/TonyWainwrightV/statuses/13033251510">fast as hell</a> (and those tweets were sent just one
minute apart)!
<p>
Once a page is loaded, Fennec is pretty speedy. It's faster than the
Android browser in some areas, and slower in others. But it's definitely
choppy while a page is still loading or complex scripts are running. To fix
this, our next major release of Fennec will include [Electrolysis][9]. This
gives Firefox a multi-process architecture much like Google Chrome, and
ensures that the browser always stays responsive.
</p>
<p>
Electrolysis requires many changes to our code, so it may be a couple of months
before it appears in usable Fennec builds. In the meantime, Mozilla is
working on many other performance improvements. This work will also speed
up Firefox for desktop computers – I've been using the FF4 nightly
builds, and they are already much snappier than the Firefox 3.5 I was using
before.
</p>
<p>
We've also checked in some simple changes to improve perceived speed,
like [better feedback when pages start loading][16].
</p>
</li>
<li>
<p><strong>Crashing bugs:</strong> Users were generally forgiving of crashes and other
obvious bugs, which are to be expected at this stage of development. We will of
course fix any such bugs as fast as possible.</p>
</li>
<li><strong>Add-ons:</strong> We’re just starting <a href="http://starkravingfinkle.org/blog/2010/04/firefox-1-1-beta-1-for-maemo/">Fennec 1.1 beta testing</a>, and most
of our add-ons are not yet updated for version 1.1. Unfortunately, this
meant that many add-ons were not available to our first Android previewers.
This should be fixed over the next few weeks.
<p>
Add-ons are easily Firefox's biggest advantage over other mobile
browsers. For the first time I can easily customize my phone's
browser exactly how I want. I've already written two Fennec add-ons, [Read Later][12]
and [Show Image Title][13]. And there are [many great add-ons][14] from
other developers to choose from.
</p>
</li>
<li>
<p><strong>User interface:</strong> Feedback on our UI was generally positive.
Most users said panning to reveal the toolbars felt natural and easy. I
think <a href="http://madhava.com/egotism/">Madhava</a> and <a href="http://blog.seanmartell.com/">Sean</a>
have done a great job with the design. This will get even better as we take
advantage of Android features like the hardware buttons, integration with
other activities, voice input, and the notification bar.</p>
</li>
<li>
<p><strong>Flash</strong>: The Flash plugin is not yet included in our Android builds, but
it will be supported eventually. Firefox for Maemo already works with Flash,
although enabling it does cause performance problems on some sites. (We are
working on fixing that with major changes to our graphics code.)</p>
</li>
<li><strong>Web site compatibility:</strong> Fennec renders almost all pages the same as
desktop Firefox. Users did report problems entering data on some pages, and
found that most sites do not have mobile versions targeted at Firefox. One
of my personal goals is to make Fennec compatible with more mobile sites,
and to give web developers the support they need to make sites work great in
Fennec. I’ll write much more about this in future articles.</li>
</ul>
<p>We don’t have a regular schedule yet for releasing new builds on Android.
Once we get the code merged and automated build servers configured, we’ll
publish nightly builds of Firefox for Android alongside our <a href="http://ftp.mozilla.org/pub/mozilla.org/mobile/nightly/latest-mobile-trunk/">Maemo and desktop
nightlies</a>. Later this year we will have alpha and beta versions, and
with luck a stable release. Until then, you can follow
<a href="http://twitter.com/MozMobile">@MozMobile</a> or <a href="http://blog.vlad1.com/">Vlad</a>
(<a href="http://twitter.com/vvuk">@vvuk</a>) to hear about any new previews.</p>
<ol class="footnotes">
<li id="fennec-android-fn1">
Please remember I am still new to the project, and cannot speak for the whole team. This is a personal blog, not a Firefox roadmap!
<a href="#fennec-android-fnr1" title="Return to article.">↩</a>
</li>
</ol>
Matt Brubeckmbrubeck@limpet.nethttps://limpet.net/mbrubeck/Headless Web Workers: Does the web need background apps?2010-04-22T00:00:00-07:00https://limpet.net/mbrubeck//2010/04/22/headless-web-workers<p>At my last job, I created several web applications designed to replace
built-in apps on mobile phones. While modern browsers and HTML5 made this
incredibly easy in many ways, we still ended up writing native (i.e. non-web)
code for most of our applications. There were a few different areas where the
browser alone didn’t meet our needs, but one that I found surprisingly common
was background processing.</p>
<p>Consider the following mobile applications:</p>
<ul>
<li>Calendar or clock with alarms.</li>
<li>E-book reader that syncs content from a server.</li>
<li>IM or email client that notifies the user of new messages.</li>
<li>Shopping list that pops up whenever you are near the store.</li>
</ul>
<p>Ideally, each of these apps will perform some actions even when the user does
not have it open. (Background processing is not strictly necessary for the
e-reader, but it would be useful to ensure the library is up-to-date even when
opened in a place with no network connection.)</p>
<p>You can’t do this with a web app. <a href="http://www.whatwg.org/specs/web-workers/current-work/">Web Workers</a> don’t solve the problem,
because they run only while the web page is open. What we need are <em>headless
web workers</em>.<sup id="web-workers-fnr1"><a href="#web-workers-fn1">1</a></sup></p>
<blockquote>
<p><strong>Update (2010-04-26):</strong> Gordon Anderson points out in comments that members
of Google’s Chromium/Chrome OS projects have made a very similar proposal they
call <a href="http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2009-March/018722.html">Persistent SharedWorkers</a>.</p>
</blockquote>
<h2 id="the-api">The API</h2>
<p>Headless workers could use almost the same API as Web Workers. Instead of
responding to messages from a web page, they would listen to events from the
host system (browser or OS). These events might include time intervals,
power-on/resume, changes in network connection, geographic locations, or
“push” notifications from a remote server.</p>
<p>The event-driven architecture of JavaScript in the browser allows the host
system a high level of discretion over resource consumption. There’s no
special code needed to suspend processes and later restore their state,
because JavaScript workers are naturally inactive between events. The host
can provide limits on CPU or memory usage per event, with a separate message
to notify processes whose handlers were aborted. And it can limit the
number of concurrent processes by choosing when to dispatch events to
listeners. Some listeners could even be disabled completely at times (like if
the device is busy or the battery is low), and notified later of the events
they missed.</p>
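<p>The dispatch model described above can be sketched in a few lines. Everything here is hypothetical: <code>WorkerHost</code> and its methods are invented for illustration, no such API exists, and CPU or memory limits per event are left out:</p>

```javascript
// Hypothetical host (browser or OS) that dispatches system events to
// headless workers and decides when their handlers actually run.
class WorkerHost {
  constructor() {
    this.listeners = new Map(); // event type -> array of handlers
    this.missed = [];           // events queued while dispatch is paused
    this.paused = false;
  }

  // A worker registers interest in a host event such as "timer" or "network".
  addEventListener(type, handler) {
    if (!this.listeners.has(type)) this.listeners.set(type, []);
    this.listeners.get(type).push(handler);
  }

  // Run the handlers now, or remember the event if listeners are disabled.
  dispatch(event) {
    if (this.paused) {
      this.missed.push(event);
      return;
    }
    for (const handler of this.listeners.get(event.type) || []) {
      handler(event);
    }
  }

  // For example, when the device is busy or the battery is low.
  pause() {
    this.paused = true;
  }

  // Deliver everything the workers missed, in the original order.
  resume() {
    this.paused = false;
    const queued = this.missed;
    this.missed = [];
    queued.forEach((event) => this.dispatch(event));
  }
}

const host = new WorkerHost();
const log = [];
host.addEventListener("timer", (e) => log.push("alarm check at " + e.time));

host.dispatch({ type: "timer", time: "09:00" }); // delivered immediately
host.pause();
host.dispatch({ type: "timer", time: "09:05" }); // queued, not delivered
host.resume();                                   // delivered now
console.log(log);
```

<p>Because a worker holds no running state between events, pausing it costs the host nothing more than a queue of pending events.</p>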
<p>This is almost a return to the old days of cooperative multitasking. Mobile
computing is definitely driving everyone towards higher-level process control
in the OS, and different assumptions for applications. It’s not surprising
that my whole proposal resembles Android and iPhone 4.0 multitasking in
several ways, since I’ve been doing development on Android for the last 18
months and encountering many of the same issues.</p>
<h2 id="the-ui">The UI</h2>
<p>Headless workers do need some way to interact with the user. They could
display standard system notifications (via Growl on the Mac, libnotify on
Ubuntu, the status bar in Android, etc.) using <a href="http://dev.w3.org/2006/webapi/WebNotifications/publish/">W3C Web Notifications</a>,
which already have an <a href="http://0xfe.blogspot.com/2010/04/desktop-notifications-with-webkit.html">experimental implementation in Chrome</a>.</p>
<p>Users also need to know which sites have background tasks installed.
Headless workers could be represented by icons in a standard location (perhaps
a toolbar in desktop browsers, or the home screen on a mobile device). The
icons could display ambient status; clicking one would reveal a menu with
options to configure or remove it.</p>
<h2 id="questions">Questions</h2>
<p>This proposal might be hard to standardize, especially where it’s tied to
specific OS capabilities. For now I’m just curious: would it be useful?
You can write a native app or a browser extension to solve this problem today.
But would it be worthwhile to have a standard, cross-platform way to do it?
Has anyone else run into problems that this approach could solve?</p>
<ol class="footnotes">
<li id="web-workers-fn1">
Because all web standards should have names that sound like Harry Potter creatures.
<a href="#web-workers-fnr1" title="Return to article.">↩</a>
</li>
</ol>
Matt Brubeckmbrubeck@limpet.nethttps://limpet.net/mbrubeck/Reading List: a Mobile Firefox extension2010-04-18T00:00:00-07:00https://limpet.net/mbrubeck//2010/04/18/read-later-fennec<blockquote>
<p><strong>Update (October 2010):</strong> I changed the name of the add-on from “Read
Later” to “Reading List” to avoid confusion with the popular service and
Firefox extension called Read It Later.</p>
</blockquote>
<p>Hello, <a href="http://planet.mozilla.org/">Planet Mozilla</a>! I’m <a href="http://limpet.net/mbrubeck/">Matt Brubeck</a>, the newest member of the
Mobile Firefox (Fennec) front-end team. I’m working remotely from Seattle,
but you can find me in <a href="irc://irc.mozilla.org/mobile">#mobile</a> during the North American day, or follow
me on <a href="http://www.google.com/profiles/mbrubeck">Buzz/Twitter/etc.</a></p>
<p><a href="http://www.mozilla.com/mobile/">Fennec</a> is a new browser built on the Mozilla platform and sharing much
of Firefox’s code and features, but with a UI designed from the ground up for
touchscreen devices. It’s shipping now for Nokia Maemo, and builds should be
available very soon (weeks rather than months, I hope) for Android 2.x.</p>
<p>To help myself learn Fennec and XUL, I wrote a simple extension called
<a href="http://bitbucket.org/mbrubeck/readlater/wiki/Home">Reading List</a>. Like Marco Arment’s <a href="http://www.instapaper.com/">Instapaper</a> service, it stores a list
of web pages so you can return to them later. Unlike Instapaper, my extension
does not save pages to a remote server. Instead, it uses your mobile device’s
storage, so you can view saved pages offline. I use code from Arc90’s
<a href="http://lab.arc90.com/experiments/readability/">Readability</a> bookmarklet to extract the main content from the page, save
it, and present it in a simple mobile-friendly layout.</p>
<p>One thing the extension can’t do (which Instapaper and other services can) is
synchronize saved pages between computers. This would be a great feature for
a Mobile Firefox add-on, but writing my own sync service is more work than I
want to put into this little side project. A future version may use
<a href="https://mozillalabs.com/weave/">Weave</a> to sync saved pages, if the size of the data is not a problem.</p>
<p>If you are using a recent Fennec 1.1 build, <a href="https://addons.mozilla.org/en-US/mobile/addon/144983/">try out Reading List</a> and
let me know what you think. And if you’re a developer, you can look at the
<a href="http://bitbucket.org/mbrubeck/readlater/src">source code</a> to see how a simple Fennec extension works.</p>
Matt Brubeckmbrubeck@limpet.nethttps://limpet.net/mbrubeck/Finding domain names with Node.js2010-01-13T00:00:00-08:00https://limpet.net/mbrubeck//2010/01/13/si-unit-domains-node-js<p>I’m working on some ideas for finance or news software that deliberately
updates <em>infrequently</em>, so it doesn’t reward me for reloading it
constantly. I came up with the name “microhertz” to describe the idea. (1
microhertz ≈ once every eleven and a half days.)</p>
<p>As usual when I think of a project name, I did some DNS searches.
Unfortunately “microhertz.com” is not available (but “microhertz.org” is).
Then I went off on a tangent and got curious about which other SI units are
available as domain names.</p>
<p>This was the perfect opportunity to try <a href="http://nodejs.org/">node.js</a> so I could use its
asynchronous DNS library to run dozens of lookups in parallel. I grabbed a
list of <a href="http://physics.nist.gov/cuu/Units/">units and prefixes</a> from NIST and wrote the following script:</p>
<figure class="highlight"><pre><code class="language-js" data-lang="js"><span class="kd">var</span> <span class="nx">dns</span> <span class="o">=</span> <span class="nf">require</span><span class="p">(</span><span class="dl">"</span><span class="s2">dns</span><span class="dl">"</span><span class="p">),</span> <span class="nx">sys</span> <span class="o">=</span> <span class="nf">require</span><span class="p">(</span><span class="dl">'</span><span class="s1">sys</span><span class="dl">'</span><span class="p">);</span>
<span class="kd">var</span> <span class="nx">prefixes</span> <span class="o">=</span> <span class="p">[</span><span class="dl">"</span><span class="s2">yotta</span><span class="dl">"</span><span class="p">,</span> <span class="dl">"</span><span class="s2">zetta</span><span class="dl">"</span><span class="p">,</span> <span class="dl">"</span><span class="s2">exa</span><span class="dl">"</span><span class="p">,</span> <span class="dl">"</span><span class="s2">peta</span><span class="dl">"</span><span class="p">,</span> <span class="dl">"</span><span class="s2">tera</span><span class="dl">"</span><span class="p">,</span> <span class="dl">"</span><span class="s2">giga</span><span class="dl">"</span><span class="p">,</span> <span class="dl">"</span><span class="s2">mega</span><span class="dl">"</span><span class="p">,</span>
<span class="dl">"</span><span class="s2">kilo</span><span class="dl">"</span><span class="p">,</span> <span class="dl">"</span><span class="s2">hecto</span><span class="dl">"</span><span class="p">,</span> <span class="dl">"</span><span class="s2">deka</span><span class="dl">"</span><span class="p">,</span> <span class="dl">"</span><span class="s2">deci</span><span class="dl">"</span><span class="p">,</span> <span class="dl">"</span><span class="s2">centi</span><span class="dl">"</span><span class="p">,</span> <span class="dl">"</span><span class="s2">milli</span><span class="dl">"</span><span class="p">,</span> <span class="dl">"</span><span class="s2">micro</span><span class="dl">"</span><span class="p">,</span> <span class="dl">"</span><span class="s2">nano</span><span class="dl">"</span><span class="p">,</span>
<span class="dl">"</span><span class="s2">pico</span><span class="dl">"</span><span class="p">,</span> <span class="dl">"</span><span class="s2">femto</span><span class="dl">"</span><span class="p">,</span> <span class="dl">"</span><span class="s2">atto</span><span class="dl">"</span><span class="p">,</span> <span class="dl">"</span><span class="s2">zepto</span><span class="dl">"</span><span class="p">,</span> <span class="dl">"</span><span class="s2">yocto</span><span class="dl">"</span><span class="p">];</span>
<span class="kd">var</span> <span class="nx">units</span> <span class="o">=</span> <span class="p">[</span><span class="dl">"</span><span class="s2">meter</span><span class="dl">"</span><span class="p">,</span> <span class="dl">"</span><span class="s2">gram</span><span class="dl">"</span><span class="p">,</span> <span class="dl">"</span><span class="s2">second</span><span class="dl">"</span><span class="p">,</span> <span class="dl">"</span><span class="s2">ampere</span><span class="dl">"</span><span class="p">,</span> <span class="dl">"</span><span class="s2">kelvin</span><span class="dl">"</span><span class="p">,</span> <span class="dl">"</span><span class="s2">mole</span><span class="dl">"</span><span class="p">,</span>
<span class="dl">"</span><span class="s2">candela</span><span class="dl">"</span><span class="p">,</span> <span class="dl">"</span><span class="s2">radian</span><span class="dl">"</span><span class="p">,</span> <span class="dl">"</span><span class="s2">steradian</span><span class="dl">"</span><span class="p">,</span> <span class="dl">"</span><span class="s2">hertz</span><span class="dl">"</span><span class="p">,</span> <span class="dl">"</span><span class="s2">newton</span><span class="dl">"</span><span class="p">,</span> <span class="dl">"</span><span class="s2">pascal</span><span class="dl">"</span><span class="p">,</span> <span class="dl">"</span><span class="s2">joule</span><span class="dl">"</span><span class="p">,</span>
<span class="dl">"</span><span class="s2">watt</span><span class="dl">"</span><span class="p">,</span> <span class="dl">"</span><span class="s2">coulomb</span><span class="dl">"</span><span class="p">,</span> <span class="dl">"</span><span class="s2">volt</span><span class="dl">"</span><span class="p">,</span> <span class="dl">"</span><span class="s2">farad</span><span class="dl">"</span><span class="p">,</span> <span class="dl">"</span><span class="s2">ohm</span><span class="dl">"</span><span class="p">,</span> <span class="dl">"</span><span class="s2">siemens</span><span class="dl">"</span><span class="p">,</span> <span class="dl">"</span><span class="s2">weber</span><span class="dl">"</span><span class="p">,</span> <span class="dl">"</span><span class="s2">henry</span><span class="dl">"</span><span class="p">,</span>
<span class="dl">"</span><span class="s2">lumen</span><span class="dl">"</span><span class="p">,</span> <span class="dl">"</span><span class="s2">lux</span><span class="dl">"</span><span class="p">,</span> <span class="dl">"</span><span class="s2">becquerel</span><span class="dl">"</span><span class="p">,</span> <span class="dl">"</span><span class="s2">gray</span><span class="dl">"</span><span class="p">,</span> <span class="dl">"</span><span class="s2">sievert</span><span class="dl">"</span><span class="p">,</span> <span class="dl">"</span><span class="s2">katal</span><span class="dl">"</span><span class="p">];</span>
<span class="k">for </span><span class="p">(</span><span class="kd">var</span> <span class="nx">i</span><span class="o">=</span><span class="mi">0</span><span class="p">;</span> <span class="nx">i</span><span class="o"><</span><span class="nx">prefixes</span><span class="p">.</span><span class="nx">length</span><span class="p">;</span> <span class="nx">i</span><span class="o">++</span><span class="p">)</span> <span class="p">{</span>
<span class="k">for </span><span class="p">(</span><span class="kd">var</span> <span class="nx">j</span><span class="o">=</span><span class="mi">0</span><span class="p">;</span> <span class="nx">j</span><span class="o"><</span><span class="nx">units</span><span class="p">.</span><span class="nx">length</span><span class="p">;</span> <span class="nx">j</span><span class="o">++</span><span class="p">)</span> <span class="p">{</span>
<span class="nf">checkAvailable</span><span class="p">(</span><span class="nx">prefixes</span><span class="p">[</span><span class="nx">i</span><span class="p">]</span> <span class="o">+</span> <span class="nx">units</span><span class="p">[</span><span class="nx">j</span><span class="p">]</span> <span class="o">+</span> <span class="dl">"</span><span class="s2">.com</span><span class="dl">"</span><span class="p">,</span> <span class="nx">sys</span><span class="p">.</span><span class="nx">puts</span><span class="p">);</span>
<span class="p">}</span>
<span class="p">}</span>
<span class="kd">function</span> <span class="nf">checkAvailable</span><span class="p">(</span><span class="nx">name</span><span class="p">,</span> <span class="nx">callback</span><span class="p">)</span> <span class="p">{</span>
<span class="nx">dns</span><span class="p">.</span><span class="nf">resolve4</span><span class="p">(</span><span class="nx">name</span><span class="p">).</span><span class="nf">addErrback</span><span class="p">(</span><span class="kd">function</span><span class="p">(</span><span class="nx">e</span><span class="p">)</span> <span class="p">{</span>
<span class="k">if </span><span class="p">(</span><span class="nx">e</span><span class="p">.</span><span class="nx">errno</span> <span class="o">==</span> <span class="nx">dns</span><span class="p">.</span><span class="nx">NXDOMAIN</span><span class="p">)</span> <span class="nf">callback</span><span class="p">(</span><span class="nx">name</span><span class="p">);</span>
<span class="p">})</span>
<span class="p">}</span></code></pre></figure>
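<p>The script above relies on the promise-style <code>addErrback</code> API of early Node releases, which current Node.js no longer provides. A rough modern equivalent, assuming the standard <code>dns.promises</code> API and its <code>ENOTFOUND</code> error code, might look like the sketch below. The helper names and the injectable <code>resolve4</code> parameter are my own additions for illustration:</p>

```javascript
const dns = require("dns");

// Build every prefix + unit + ".com" combination.
function candidateNames(prefixes, units) {
  const names = [];
  for (const prefix of prefixes) {
    for (const unit of units) names.push(prefix + unit + ".com");
  }
  return names;
}

// Resolve all names in parallel and return those that have no DNS entry.
// `resolve4` is injectable so the function can be tried without a network.
async function findAvailable(names, resolve4 = (name) => dns.promises.resolve4(name)) {
  const results = await Promise.all(
    names.map((name) =>
      resolve4(name).then(
        () => null,                                        // resolves: name is taken
        (err) => (err.code === "ENOTFOUND" ? name : null)  // no such domain
      )
    )
  );
  return results.filter((name) => name !== null);
}

// Usage (performs real DNS lookups):
// findAvailable(candidateNames(["micro", "pico"], ["hertz"])).then(console.log);
```

<p>As with the original script, a missing DNS record is only a hint: a name can be registered without having an address record, so candidates still need to be checked with a registrar.</p>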
<p>Out of 540 possible .com names, I found 376 that are available (and 10 more
that produced temporary DNS errors, which I haven’t investigated). Here are a
few interesting ones, with some commentary:</p>
<ul>
<li>exasecond.com – <i>32 billion years</i></li>
<li>petasecond.com – <i>32 million years</i></li>
<li>petawatt.com – <i>can be produced for femtoseconds by powerful lasers</i></li>
<li>terapascal.com</li>
<li>gigakelvin.com – <i>possible temperature of picosecond flashes in sonoluminescence</i></li>
<li>giganewton.com – <i>225 million pounds force</i></li>
<li>gigafarad.com</li>
<li>kilosecond.com – <i>16 minutes 40 seconds</i></li>
<li>kilokelvin.com – <i>1340 degrees Fahrenheit</i></li>
<li>centiohm.com</li>
<li>millifarad.com</li>
<li>microkelvin.com</li>
<li>picohertz.com – <i>once every 31,689 years</i></li>
<li>picojoule.com</li>
<li>femtogram.com – <i>mass of a single virus</i></li>
<li>yoctogram.com – <i>a hydrogen atom weighs about 1.67 yoctograms</i></li>
<li>zeptomole.com – <i>602 molecules</i></li>
</ul>
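<p>A few of the figures in this list are easy to check with a short script. This is only a sanity check of the arithmetic; the helper names are arbitrary:</p>

```javascript
// Format a duration in seconds as minutes and seconds.
function minutesAndSeconds(totalSeconds) {
  return Math.floor(totalSeconds / 60) + " min " + (totalSeconds % 60) + " s";
}

// Convert kelvins to degrees Fahrenheit.
function kelvinToFahrenheit(kelvin) {
  return (kelvin - 273.15) * 9 / 5 + 32;
}

const SECONDS_PER_DAY = 86400;
const AVOGADRO = 6.02214076e23; // particles per mole

console.log(minutesAndSeconds(1000));              // kilosecond: "16 min 40 s"
console.log(Math.round(kelvinToFahrenheit(1000))); // kilokelvin: 1340
console.log((1e6 / SECONDS_PER_DAY).toFixed(2));   // days per cycle at 1 microhertz: "11.57"
console.log(Math.round(1e-21 * AVOGADRO));         // molecules in a zeptomole: 602
```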
<p>To get the complete list, just copy the script above to a file, and run it
like this: <code>node listnames.js</code></p>
<p>Along the way I discovered that the API documentation for Node’s <code>dns</code> module
was out-of-date. This is <a href="https://github.com/mbrubeck/node/commit/bd4f56c8239aca12b6f7c2016bda51507ba7aec7">fixed</a> in my GitHub fork, and I’ve sent a pull
request to the author Ryan Dahl.</p>
Matt Brubeckmbrubeck@limpet.nethttps://limpet.net/mbrubeck/Weekend hack: Outline grep2010-01-12T00:00:00-08:00https://limpet.net/mbrubeck//2010/01/12/outline-grep<p>I keep almost all of my notes and to-do lists in plain text files, so I can
edit and search them with Vim, grep, and other standard Unix tools. I often
indent lines in these files to create a simple outline structure, and use the
<code>autoindent</code> and <code>foldmethod=indent</code> options to make Vim into a simple
outliner.</p>
<p>To get useful output when searching through these outline-structured files, I
wrote a simple grep replacement. Given a text file with a Python-style
indentation structure, <code>ogrep</code> searches the file for a regular expression. It
prints matching lines, with their “parent” lines as context. For example, if
input.txt looks like this:</p>
<pre><code>2009-01-01
  New Year's Day!
    No work today.
    Visit with family.
2009-01-02
  Grocery store and library.
2009-01-03
  Stay home.
2009-01-04
  Back to work.
    Remember to set an alarm.
</code></pre>
<p>then <code>ogrep work input.txt</code> will produce the following output:</p>
<pre><code>2009-01-01
  New Year's Day!
    No work today.
2009-01-04
  Back to work.
</code></pre>
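<p>The real ogrep is written in Haskell, but the idea fits in a short JavaScript sketch. This is my own reconstruction of the technique, not a port of the actual code, and it assumes the sample file is indented with two spaces per level:</p>

```javascript
// Number of leading whitespace characters on a line.
function indentOf(line) {
  return line.length - line.trimStart().length;
}

// Return the lines matching `pattern`, each preceded by its chain of
// less-indented "parent" lines. Every parent is emitted at most once.
function outlineGrep(pattern, text) {
  const ancestors = []; // stack describing the path to the current line
  const output = [];
  for (const line of text.split("\n")) {
    if (line.trim() === "") continue; // ignore blank lines
    const indent = indentOf(line);
    // Drop stack entries that are not parents of this line.
    while (ancestors.length && ancestors[ancestors.length - 1].indent >= indent) {
      ancestors.pop();
    }
    ancestors.push({ indent, line, printed: false });
    if (pattern.test(line)) {
      // Emit the whole path, skipping parents that were already emitted.
      for (const entry of ancestors) {
        if (!entry.printed) {
          output.push(entry.line);
          entry.printed = true;
        }
      }
    }
  }
  return output;
}

const input = [
  "2009-01-01",
  "  New Year's Day!",
  "    No work today.",
  "    Visit with family.",
  "2009-01-02",
  "  Grocery store and library.",
  "2009-01-03",
  "  Stay home.",
  "2009-01-04",
  "  Back to work.",
  "    Remember to set an alarm.",
].join("\n");

console.log(outlineGrep(/work/, input).join("\n"));
```

<p>Keeping a stack of the current ancestors means the file is read once, and the <code>printed</code> flag stops a parent from being repeated when several of its children match.</p>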
<p>You can download ogrep from the <a href="https://github.com/mbrubeck/outline-grep">outline-grep repository</a> on GitHub, or
just read the <a href="https://github.com/mbrubeck/outline-grep/blob/master/OutlineGrep.lhs">literate Haskell file</a>. The code is almost trivial (40
lines of code, plus imports and comments); I’m publishing it just in case
anyone else has a use for it, and because some of my friends were curious
about how I’m using Haskell. I’ve now written a few “real-world” Haskell
programs (<a href="http://limpet.net/mbrubeck/2009/10/30/compleat.html">compleat</a> was the first). I’m finding Haskell very well suited
to such programs, though this particular one would be equally easy in a
language like Perl, Python, or Ruby.</p>
<p>This is a one-off tool to fill a gap in my workflow; there are no
configuration options or useful error messages. It would be fairly easy to
extend it, though. For example, an option to include children (as well as
parents) of matching lines might be handy. I recently realized that ogrep
often works for searching through source code too, which may generate some
more unexpected use cases.</p>
Matt Brubeckmbrubeck@limpet.nethttps://limpet.net/mbrubeck/Android 2.0 uses V8 JavaScript engine2009-11-06T00:00:00-08:00https://limpet.net/mbrubeck//2009/11/06/android-v8<p>Google has not yet released most of the Android 2.0 “eclair” source code, but they did
publish source for a very small number of components, including a <a href="http://android.git.kernel.org/?p=platform/external/webkit.git;a=tree;h=refs/heads/android-2.0_r1_snapshot;hb=android-2.0_r1_snapshot">WebKit
snapshot</a>. I was excited to see that the snapshot includes Google’s V8
virtual machine. (Previous Android releases used Safari’s
JavaScriptCore/“SquirrelFish Extreme” VM.) But without the rest of the source
tree, there was no way to build and run this on a real Android phone. The
SDK includes a binary image that runs only in the qemu-based emulator.</p>
<p>Today I got to try out a Motorola Droid. Here’s how its browser compares to
Android 1.6 on my HTC Dream (Android Dev Phone / T-Mobile G1) in the <a href="http://v8.googlecode.com/svn/data/benchmarks/v5/run.html">V8
Benchmark Suite</a>:</p>
<table class="data">
<tr><th>Test</th> <th class="num">Dream (1.6)</th> <th class="num">Droid (2.0)</th> <th class="num">Change</th></tr>
<tr><td>Richards</td> <td class="num">13.5</td> <td class="num">15.6</td> <td class="num">+16%</td></tr>
<tr><td>DeltaBlue</td> <td class="num">5.23</td> <td class="num">12.9</td> <td class="num">+147%</td></tr>
<tr><td>Crypto</td> <td class="num">13.2</td> <td class="num">10.9</td> <td class="num neg">-17%</td></tr>
<tr><td>RayTrace</td> <td class="num">10.9</td> <td class="num">80.1</td> <td class="num">+635%</td></tr>
<tr><td>EarleyBoyer</td> <td class="num">23.5</td> <td class="num">74.7</td> <td class="num">+218%</td></tr>
<tr><td>RegExp</td> <td class="num note">did not complete</td> <td class="num">16.5</td> <td class="num">–</td></tr>
<tr><td>Splay </td> <td class="num note">did not complete</td> <td class="num note">did not complete</td> <td class="num">–</td></tr>
</table>
<p>Richards improves only slightly and Crypto actually gets slower, while other tests
(DeltaBlue, RayTrace, EarleyBoyer) are dramatically faster. Just for
comparison, let’s run the same benchmark on Safari 4 (JavaScriptCore) and a
Chromium 4 nightly build (V8) on a Mac Pro:</p>
<table class="data">
<tr><th>Test</th> <th class="num">Safari 4</th> <th class="num">Chromium 4</th> <th class="num">Change</th></tr>
<tr><td>Richards</td> <td class="num">4103</td> <td class="num">4640</td> <td class="num">+13%</td></tr>
<tr><td>DeltaBlue</td> <td class="num">3171</td> <td class="num">4418</td> <td class="num">+39%</td></tr>
<tr><td>Crypto</td> <td class="num">3331</td> <td class="num">3643</td> <td class="num">+9%</td></tr>
<tr><td>RayTrace</td> <td class="num">3509</td> <td class="num">6662</td> <td class="num">+90%</td></tr>
<tr><td>EarleyBoyer</td> <td class="num">4737</td> <td class="num">7643</td> <td class="num">+61%</td></tr>
<tr><td>RegExp</td> <td class="num">1268</td> <td class="num">1187</td> <td class="num neg">-6%</td></tr>
<tr><td>Splay </td> <td class="num">1198</td> <td class="num">7290</td> <td class="num">+509%</td></tr>
</table>
<p>The precise ratios are different, but among the tests that completed on both
phones, the ones that showed the most improvement from Android 1.6 to 2.0 also
show the most improvement from Safari to Chromium. Based on this plus the source
code snapshot, I’m pretty sure that Android 2.0 is indeed using V8.</p>
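<p>The comparison can be checked with a few lines of Python. This is only a sketch: the names <code>percent_change</code>, <code>ANDROID</code>, <code>DESKTOP</code>, and <code>ranked</code> are hypothetical, and the scores are copied from the two tables above, limited to the five tests that completed on both phones.</p>
<pre><code>def percent_change(old: float, new: float) -> int:
    """Relative change from old to new, rounded to a whole percent."""
    return round((new / old - 1) * 100)


# (before, after) scores: Dream vs. Droid, then Safari 4 vs. Chromium 4.
ANDROID = {
    "Richards": (13.5, 15.6),
    "DeltaBlue": (5.23, 12.9),
    "Crypto": (13.2, 10.9),
    "RayTrace": (10.9, 80.1),
    "EarleyBoyer": (23.5, 74.7),
}
DESKTOP = {
    "Richards": (4103, 4640),
    "DeltaBlue": (3171, 4418),
    "Crypto": (3331, 3643),
    "RayTrace": (3509, 6662),
    "EarleyBoyer": (4737, 7643),
}


def ranked(scores: dict[str, tuple[float, float]]) -> list[str]:
    """Test names ordered from largest to smallest improvement."""
    return sorted(scores, key=lambda name: percent_change(*scores[name]), reverse=True)


print(ranked(ANDROID))
print(ranked(DESKTOP))
</code></pre>
<p>Both calls print the same order (RayTrace, EarleyBoyer, DeltaBlue, Richards, Crypto), which is the pattern described above.</p>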
<p>This is exciting news. It makes Droid the first shipping product I know of that
uses V8 on an ARM processor, although V8 has included an ARM JIT compiler for
some time now. <em>[Correction: Palm Pre was first; see the comments below.]</em>
For mobile web developers like me, it means we’re one step closer to having
desktop-quality rich web applications on low-power handheld devices.</p>
<p>Android still lags behind the iPhone in at least one important way for web
developers: CSS animation. The iPhone (and Safari on the desktop) provides
hardware acceleration for CSS transforms, like this <a href="http://webkit.org/blog/324/css-animation-2/">falling leaves demo</a>.
On Android, CSS animation is done in software, making it <a href="http://vimeo.com/3697944">much, much
slower</a>. (Even outside the browser, Android’s Skia 2D graphics API lacks
hardware acceleration. OpenGL is the only way for Android developers to
take advantage of the GPU.) Accelerated animation would really make it
possible to write interactive web pages that match the smoothness and
responsiveness of native apps.</p>
<p><strong>Final thought:</strong> Although the Motorola Droid is still roughly 70 to 340 times slower than
Chromium on a Mac Pro in the tests above, it’s already faster at some benchmarks than IE8 or
Firefox 2 on desktop hardware from just a few years ago.</p>
<p><strong>Update (2010-02-09):</strong> Just for comparison, here are numbers for the
Google/HTC Nexus One with Android 2.1. The Nexus One scores roughly 1.7 to 4.7 times
as high as the Droid on the V8 benchmark suite. It even renders the falling
leaves animation at a decent framerate (but still not as smoothly as the
GPU-accelerated iPhone).</p>
<table class="data">
<tr><th>Test</th> <th class="num">Droid (2.0)</th> <th class="num">Nexus One (2.1)</th> <th class="num">Change</th></tr>
<tr><td>Richards</td> <td class="num">15.6</td> <td class="num">52.1</td> <td class="num">+234%</td></tr>
<tr><td>DeltaBlue</td> <td class="num">12.9</td> <td class="num">60.2</td> <td class="num">+367%</td></tr>
<tr><td>Crypto</td> <td class="num">10.9</td> <td class="num">31.7</td> <td class="num">+191%</td></tr>
<tr><td>RayTrace</td> <td class="num">80.1</td> <td class="num">170</td> <td class="num">+112%</td></tr>
<tr><td>EarleyBoyer</td> <td class="num">74.7</td> <td class="num">126</td> <td class="num">+69%</td></tr>
<tr><td>RegExp</td> <td class="num">16.5</td> <td class="num">27.5</td> <td class="num">+67%</td></tr>
<tr><td>Splay </td> <td class="num note">did not complete</td> <td class="num note">did not complete</td> <td class="num">–</td></tr>
</table>
Matt Brubeckmbrubeck@limpet.nethttps://limpet.net/mbrubeck/Compleat: Bash completion for human beings2009-10-30T00:00:00-07:00https://limpet.net/mbrubeck//2009/10/30/compleat<p><em>Compleat</em> is an easy, declarative way to add smart tab completion for
any shell command. It’s written in Haskell but requires no programming knowledge.
See the <a href="https://github.com/mbrubeck/compleat">GitHub repository</a> for a quick description, or read on for a
complete explanation.</p>
<h2 id="background">Background</h2>
<p>I’m one of those programmers who loves to <a href="http://www.yosefk.com/blog/teeth-marks-at-the-rear-end.html">carefully tailor my development
environment</a>. I do nearly all of my work at the shell or in a text editor,
and I’ve spent a dozen years learning and customizing them to work more
quickly and easily.</p>
<p>Most experienced shell users know about programmable completion, which provides
smart tab-completion for supported programs like ssh and git. You can
also add your own completions for programs that aren’t supported. So I read
the fine manual and started writing completions. You can see the <a href="https://github.com/mbrubeck/android-completion/blob/master/android">script I
made</a> for three commands from the Google Android SDK. It’s 200 lines of
Bash code, and fairly straightforward if you happen to be familiar with the
Bash completion API. But as I cranked out more and more <code>case</code> statements, I
felt there must be a better way…</p>
<h2 id="the-idea">The Idea</h2>
<p>It’s not hard to describe the usage of a typical command-line program.
There’s even a semi-standard format for it, used in man pages and generated by
libraries like <a href="http://autogen.sourceforge.net/autoopts.html">AutoOpt</a>. For example, here’s the usage for <code>android</code>, one
of the SDK commands supported by my script:</p>
<pre><code>android [--silent | --verbose]
  ( list [avd|target]
  | create avd ( --target &lt;target&gt; | --name &lt;name&gt; | --skin &lt;name&gt;
               | --path &lt;file&gt; | --sdcard &lt;file&gt; | --force ) ...
  | move avd (--name &lt;avd&gt; | --rename &lt;new&gt; | --path &lt;file&gt;) ...
  | (delete|update) avd --name &lt;avd&gt;
  | create project ( (--package|--name|--activity|--path) &lt;val&gt;
                   | --target &lt;target&gt; ) ...
  | update project ((--name|--path) &lt;val&gt; | --target &lt;target&gt;) ...
  | update adb )
</code></pre>
<p>My idea: What if you could teach the shell to complete a program’s arguments
just by writing a usage description like this one?</p>
<h2 id="the-solution">The Solution</h2>
<p>With <a href="https://github.com/mbrubeck/compleat">Compleat</a>, you can add completion for any command just by writing a
usage description and saving it in a configuration folder. The ten-line
description of the <code>android</code> command above generates the same results as my
76-line Bash function, and it’s <em>so</em> much easier to write and understand!</p>
<p>The syntax should be familiar to long-time Unix users. Optional arguments are
enclosed in square brackets; alternate choices are separated by vertical
pipes. An ellipsis following an item means it may be repeated, and
parentheses group several items into one. Words in angle brackets are
parameters for the user to fill in.</p>
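<p>One way to picture how such a description can drive completion is to turn each piece of syntax into a small matching function. The Python sketch below is not Compleat's implementation (which is written in Haskell). All of the names in it (<code>literal</code>, <code>param</code>, <code>seq</code>, <code>choice</code>, <code>optional</code>, <code>repeated</code>, <code>completions</code>) are hypothetical, it only looks at complete words, and it assumes an ellipsis means "one or more times". Each function takes the words typed so far and returns the possible leftovers plus the fixed words that could come next.</p>
<pre><code>def literal(word):
    """Match one fixed word; suggest it when the typed input has run out."""
    def parse(words):
        if not words:
            return [], {word}
        return ([words[1:]], set()) if words[0] == word else ([], set())
    return parse


def param(words):
    """Match any single word (an angle-bracket parameter); suggest nothing."""
    return ([words[1:]] if words else []), set()


def seq(*parts):
    """Match each part in order."""
    def parse(words):
        states, suggestions = [words], set()
        for part in parts:
            next_states = []
            for state in states:
                rests, found = part(state)
                next_states.extend(rests)
                suggestions |= found
            states = next_states
        return states, suggestions
    return parse


def choice(*alternatives):
    """Match any one alternative (vertical pipes)."""
    def parse(words):
        rests, suggestions = [], set()
        for alternative in alternatives:
            more, found = alternative(words)
            rests.extend(more)
            suggestions |= found
        return rests, suggestions
    return parse


def optional(part):
    """Match the part or nothing at all (square brackets)."""
    return choice(seq(), part)


def repeated(part):
    """Match the part one or more times (a trailing ellipsis)."""
    def parse(words):
        rests, suggestions, states = [], set(), [words]
        while states:
            next_states = []
            for state in states:
                more, found = part(state)
                suggestions |= found
                # Keep only results that consumed input, so the loop ends.
                next_states.extend(r for r in more if len(r) != len(state))
            rests.extend(next_states)
            states = next_states
        return rests, suggestions
    return parse


def completions(pattern, typed):
    """Fixed words that could follow the complete words typed so far."""
    _, suggestions = pattern(tuple(typed))
    return sorted(suggestions)


# A cut-down version of the android usage shown above.
ANDROID = seq(
    optional(choice(literal("--silent"), literal("--verbose"))),
    choice(
        seq(literal("list"), optional(choice(literal("avd"), literal("target")))),
        seq(literal("move"), literal("avd"), repeated(choice(
            seq(literal("--name"), param),
            seq(literal("--rename"), param),
            seq(literal("--path"), param),
        ))),
        seq(choice(literal("delete"), literal("update")),
            literal("avd"), literal("--name"), param),
        seq(literal("update"), literal("adb")),
    ),
)

print(completions(ANDROID, ["update"]))
</code></pre>
<p>Here the call prints <code>['adb', 'avd']</code>, because both the "update avd" and "update adb" branches are still possible after the word "update".</p>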
<p>Let’s look at some more features of the usage format. For programs with
complicated arguments, it can be useful to break them down further. You can
place alternate usages on their own lines separated by semicolons, like this:</p>
<pre><code>android &lt;opts&gt; list [avd|target];
android &lt;opts&gt; move avd (--name &lt;avd&gt;|--rename &lt;new&gt;|--path &lt;file&gt;)...;
android &lt;opts&gt; (delete|update) avd --name &lt;avd&gt;;
</code></pre>
<p>…and so on. Rather than repeat the common options on every line, I used a
parameter named “opts”. I can define that parameter to be a sub-pattern,
which will be used wherever <code>&lt;opts&gt;</code> appears:</p>
<pre><code>opts = [ --silent | --verbose ];
</code></pre>
<p>For parameters whose values are not fixed but can be computed by another
program, we use a <code>!</code> symbol followed by a shell command to generate
completions. For example, we can run shell commands to suggest names of
Android Virtual Devices or target types whenever <code>&lt;avd&gt;</code> or <code>&lt;target&gt;</code> appears
in a pattern:</p>
<pre><code>avd = ! android list avd | grep 'Name:' | cut -f2 -d: ;
target = ! android list target | grep '^id:'| cut -f2 -d' ' ;
</code></pre>
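<p>Conceptually, a definition like this just runs the command and offers its output as candidates. The Python sketch below illustrates that idea and is not how Compleat itself does it: the function name <code>shell_candidates</code> is hypothetical, and it assumes one candidate per output line, filtered by whatever prefix the user has typed so far.</p>
<pre><code>import subprocess


def shell_candidates(command: str, prefix: str = "") -> list[str]:
    """Run a shell command and return its output lines that start with prefix."""
    result = subprocess.run(command, shell=True, capture_output=True, text=True)
    candidates = [line.strip() for line in result.stdout.splitlines()]
    return [c for c in candidates if c and c.startswith(prefix)]


# A stand-in command; a real definition would run something like the
# "android list avd" pipeline shown above.
print(shell_candidates("echo Droid; echo Dream; echo Nexus", "D"))
</code></pre>
<p>With the stand-in command, typing "D" narrows the three candidates down to Droid and Dream.</p>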
<p>Any parameter without a definition will use the shell’s built-in completion
rules, which suggest matching filenames by default.</p>
<p><a href="https://github.com/mbrubeck/compleat"><strong>The source code is on GitHub.</strong></a> I’ve been using it for just a week and
I’m now writing new usage files for myself almost every day. The README file
has more details about the usage syntax, and instructions for installing the
software. Give it a try, and please send in any usage files that you want to
share! (Questions, bug reports, or patches are also welcome.)</p>
<h2 id="future-work">Future Work</h2>
<p>For the next release of Compleat, I would like to make installation easier by
providing better packaging and pre-compiled binaries; support <code>zsh</code> and other
non-bash shells; and write better documentation.</p>
<p>In the long term, I’m thinking about replacing the usage file interpreter with
a compiler. The compiler would translate the usage file into shell code, or
perhaps another language like C or Haskell. This would potentially improve
performance (although speed isn’t an issue right now on my development box),
and make it easy for usage files to include logic written in the target
language. Another idea for the future: What if option-parsing libraries like
AutoOpt or the Ruby/Perl/Python equivalents generated completion scripts for
every program you wrote?</p>
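<p>To make the compiler idea concrete, here is the simplest possible case, sketched in Python: emitting one line of Bash for a command whose first argument comes from a fixed word list. The function name <code>compile_to_bash</code> is hypothetical and this is not a design for Compleat. It relies only on Bash's built-in <code>complete -W</code>, which completes a command from a list of words, so it cannot express nested or repeated patterns.</p>
<pre><code>import shlex


def compile_to_bash(command: str, words: list[str]) -> str:
    """Emit one line of Bash that completes `command` from a fixed word list."""
    wordlist = shlex.quote(" ".join(words))
    return f"complete -W {wordlist} {shlex.quote(command)}"


print(compile_to_bash("android", ["list", "create", "move", "delete", "update"]))
</code></pre>
<p>This prints <code>complete -W 'list create move delete update' android</code>, which could be sourced from a Bash startup file. A real compiler would need to generate a completion function that tracks the position within the usage pattern.</p>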
<h2 id="final-thoughts">Final Thoughts</h2>
<p>I realized recently that some things I do are so specialized that my parents
and non-programmer friends will probably never get them. For example,
Compleat is a program to generate programs to help you… run programs?
Sigh. Well, maybe <em>someone</em> out there will appreciate it.</p>
<p>Compleat was my weekends/evenings/bus-rides project for the last few weeks (as
you can see in the <a href="https://github.com/mbrubeck/compleat/graphs/punch_card">GitHub punch card</a>), and my most fun side project in
quite a while. It’s the first “real” program I’ve written in Haskell, though
I’ve been experimenting with the language for a while. Now that I’m
comfortable with it, I find that Haskell’s particular combination of features
works just right to enable quick exploratory programming, while giving a high
level of confidence in the behavior of the resulting program. Compleat 1.0 is
just 160 lines of Haskell, excluding comments and imports. Every module was
completely rewritten at least once as I compared different approaches. (This
is much less daunting when the code in question is only a couple dozen lines.)
I don’t think this particular program would have been quite as easy to
write—at least for me—in any of the other platforms I know
(including Ruby, Python, Scheme, and C).</p>
<p>I had the idea for Compleat more than a year ago, but at the time I did not
know how to implement it easily. I quickly realized that what I wanted to
write was a specialized parser generator, and a domain-specific language to go
with it. Unfortunately I never took a compiler-design class in school, and
had forgotten most of what I learned in my programming languages course. So I
began studying parsing algorithms and language implementation, with Compleat
as my ultimate goal.</p>
<p>My good friend Josh and his <a href="http://www.gazelle-parser.org/">Gazelle parser generator</a> helped inspire me
and point me toward other existing work. Compleat actually contains three
parsers. The usage file parser and the input line tokenizer are built on the
excellent <a href="http://legacy.cs.uu.nl/daan/parsec.html">Parsec</a> library. The usage file is then translated into a
parser that’s built with my own simple set of parser combinators, which were
inspired both by Parsec and by the original <a href="http://www.cs.nott.ac.uk/~gmh/bib.html#monparsing">Monadic Parser Combinators</a>
paper by Graham Hutton and Erik Meijer. The simple evaluator for the usage
DSL applies what I learned from Jonathan Tang’s <a href="http://jonathan.tang.name/files/scheme_in_48/tutorial/overview.html">Write Yourself a Scheme in
48 Hours</a>. And of course <a href="http://book.realworldhaskell.org/">Real World Haskell</a> was an essential
resource for both the nuts and bolts and the design philosophy of Haskell.</p>
<p>So besides producing a tool that will be useful to me and hopefully others, I
also filled in a gap in my education, learned some great new languages and
tools, and kindled an interest in several new (to me) research areas. It has
also renewed my belief in the importance of “academic” knowledge to real
engineering problems. I’ve already come across at least one problem in my
day job that I was able to solve faster by implementing a simple parser than I
would have a year ago by fumbling with regexes. And I’ll be even happier if
this inspires some friends or strangers to take a closer look at Haskell,
Parsec, or any problem they’ve thought about and didn’t know enough to solve.
Yet.</p>
Matt Brubeckmbrubeck@limpet.nethttps://limpet.net/mbrubeck/