*This module will introduce you to several useful data structures built on the door, then discuss how the compiler handles types and the sample.*

## Key Data Structures and Molds

++maps are a versatile way to store and access data, but they are far from the only useful pattern. `++map`

s were documented in the previous module.

`tree`

We use tree to make a binary tree data structure in Hoon, e.g., `(tree @)`

for a binary tree of atoms.

There are two kinds of `tree`

in Hoon:

The null tree

`~`

.A non-null tree which is a cell with three parts.

- The node value.
- The left child of the node.
- The right child of the node.

Each child is itself a tree. The node value has the face

`n`

, the left child has the face`l`

, and the right child has the face`r`

. The following diagram provides an illustration of a`(tree @)`

(without the faces):

12/ \8 14/ \ / \4 ~ ~ 16/ \ / \~ ~ ~ ~

Hoon supports trees of any type that can be constructed in Hoon, e.g.: `(tree @)`

, `(tree ^)`

, `(tree [@ ?])`

, etc. Let's construct the tree in the diagram above in the dojo, casting it accordingly:

> `(tree @)`[12 [8 [4 ~ ~] ~] [14 ~ [16 ~ ~]]]{4 8 12 14 16}

Notice that we don't have to insert the faces manually; by casting the noun above to a `(tree @)`

Hoon inserts the faces for us. Let's put this noun in the dojo subject with the face `b`

and pull out the tree at the left child of the `12`

node:

> =b `(tree @)`[12 [8 [4 ~ ~] ~] [14 ~ [16 ~ ~]]]> b{4 8 12 14 16}> l.b-find.l.bfind-fork

This didn't work because we haven't first proved to Hoon that `b`

is a non-null tree. A null tree has no `l`

in it, after all. Let's try again, using `?~`

wutsig to prove that `b`

isn't null. We can also look at `r`

and `n`

:

> ?~(b ~ l.b){4 8}> ?~(b ~ r.b){14 16}> ?~(b ~ n.b)12

#### Find and Replace

Here's a program that finds and replaces certain atoms in a `(tree @)`

.

|= [nedl=@ hay=(tree @) new=@]^- (tree @)?~ hay ~:+ ?: =(n.hay nedl)newn.hay$(hay l.hay)$(hay r.hay)

`nedl`

is the atom to be replaced, `hay`

is the tree, and `new`

is the new atom with which to replace `nedl`

. Save this as `findreplacetree.hoon`

and run in the dojo:

> b{4 8 12 14 16}> +findreplacetree [8 b 94]{4 94 12 14 16}> +findreplacetree [14 b 94]{4 8 12 94 16}

`set`

A set is rather like a list except that each entry can only be represented once. As with a map, a `set`

is typically associated with a particular type, such as `(set @ud)`

for a collection of decimal values. (`set`

s also don't have an order, so they're basically a bag of unique values.)

`set`

operations are provided by ++in. Most names are similar to `map`

/++by operations when applicable.

++silt produces a `set`

from a `list`

:

=primes (silt ~[2 3 5 7 11 13])

++put:in adds a value to a `set`

(and null-ops when the value is already present):

=primes (~(put in primes) 17)=primes (~(put in primes) 13)

++del:in removes a value from a `set`

:

=primes (~(put in primes) 18)=primes (~(del in primes) 18)

++has:in checks for existence:

> (~(has in primes) 15)%.n> (~(has in primes) 17)%.y

++tap:in yields a `list`

of the values:

> ~(tap in primes)~[3 2 7 5 11 13 17]> (sort ~(tap in primes) lth)~[2 3 5 7 11 13 17]

++run:in applies a function across all values:

> (~(run in primes) dec){10 6 12 1 2 16 4}

#### Example: Cartesian Product

Here's a program that takes two sets of atoms and returns the Cartesian product↗ of those sets. A Cartesian product of two sets `a`

and `b`

is a set of all the cells whose head is a member of `a`

and whose tail is a member of `b`

.

|= [a=(set @) b=(set @)]=/ c=(list @) ~(tap in a)=/ d=(list @) ~(tap in b)=| acc=(set [@ @])|- ^- (set [@ @])?~ c acc%= $c t.cacc |- ?~ d acc%= $d t.dacc (~(put in acc) [i.c i.d])====

Save this as `cartesian.hoon`

in your urbit's pier and run in the dojo:

> =c `(set @)`(sy ~[1 2 3])> c{1 2 3}> =d `(set @)`(sy ~[4 5 6])> d{5 6 4}> +cartesian [c d]{[2 6] [1 6] [3 6] [1 4] [1 5] [2 4] [3 5] [3 4] [2 5]}

`unit`

Redux (and `vase`

)

We encountered the unit briefly as a tool for distinguishing null results from actual zeroes: using a `unit`

allows you to specify something that may not be there. For this reason, `unit`

s are commonly used for operations that sometimes fail, such as search functions, database lookups, remote data requests, etc.

You can build a `unit`

using the tic special notation or ++some:

> `%mars[~ %mars]> (some %mars)[~ u=%mars]

While ++got:by is one way to get a value back without wrapping it in a `unit`

, it's better practice to use the `unit`

logic gates to manipulate gates to work correctly with `unit`

s.

For example, use ++need to unwrap a `unit`

, or crash if the `unit`

is `~`

null.

> =colors (malt `(list (pair @tas @ux))`~[[%red 0xed.0a3f] [%yellow 0xfb.e870] [%green 0x1.a638] [%blue 0x66ff]])> (~(get by colors) %yellow)[~ q=0xfb.e870]> (need (~(get by colors) %yellow))0xfb.e870> (~(get by colors) %teal)~> (need (~(get by colors) %teal))dojo: hoon expression failed

Rather than unwrap a unit, one can modify gates to work with `unit`

s directly even if they're not natively set up that way. For instance, one cannot decrement a `unit`

because ++dec doesn't accept a `unit`

. ++bind can bind a non-`unit`

function—another gate-building gate!.

> (bind ((unit @ud) [~ 2]) dec)[~ 1]> (bind (~(get by colors) %orange) red)[~ 0xff]

(There are several others tools listed on that page which may be potentially useful to you.)

A vase is a pair of type and value, such as that returned by `!>`

zapgar. A `vase`

is useful when transmitting data in a way that may lose its type information.

### Containers of Containers

maps and sets are frequently used in the standard library and in the extended ecosystem. There are other common patterns which recur often enough that they have their own names:

++jar is a mold for a

`map`

of`list`

s.`++jar`

uses the ++ja core. (Mnemonic: jars hold solid ordered things, like a list.)++jug is a mold for a

`map`

of`set`

s.`++jug`

uses the ++ju core. (Mnemonic: jugs hold liquids, evoking the unordered nature of a set.)++mip is a mold for a map of maps.

`++mip`

lives in the`%landscape`

desk in`/lib/mip.hoon`

. Affordances are still few but a short example follows:> =mip -build-file /=landscape=/lib/mip/hoon> =my-map-warm (malt `(list (pair @tas @ux))`~[[%red 0xed.0a3f] [%yellow 0xfb.e870]])> =my-map-cool (malt `(list (pair @tas @ux))`~[[%green 0x1.a638] [%blue 0x66ff]])> =my-mip *(mip:mip @tas (map @tas @ux))> =my-mip (~(put bi:mip my-mip) %cool %blue 0x66ff)> =my-mip (~(put bi:mip my-mip) %cool %green 0x1.a638)> =my-mip (~(put bi:mip my-mip) %warm %red 0xed.0a3f)> =my-mip (~(put bi:mip my-mip) %warm %yellow 0xfb.e870)> my-mip[ n=[p=%warm q=[n=[p=%yellow q=0xfb.e870] l=[n=[p=%red q=0xed.0a3f] l=~ r=~] r=~]]l=[n=[p=%cool q=[n=[p=%green q=0x1.a638] l=[n=[p=%blue q=0x66ff] l=~ r=~] r=~]] l=~ r=~]r=~]> (~(got bi:mip my-mip) %cool %green)0x1.a638> ~(tap bi:mip my-mip)~[[x=%warm y=%yellow v=0xfb.e870][x=%warm y=%red v=0xed.0a3f][x=%cool y=%green v=0x1.a638][x=%cool y=%blue v=0x66ff]]`mip`

s are unjetted and quite slow but serve as a proof of concept.++mop ordered maps are discussed in the App School guides.

## Molds and Samples

### Modifying Gate Behavior

Sometimes you need to modify parts of a core (like a gate) on-the-fly to get the desired behavior. For instance, if you are using ++roll to calculate the multiplicative product of the elements of a list, this “just works”:

> (roll `(list @ud)`~[10 12 14 16 18] mul)483.840

In contrast, if you do the same thing to a list of numbers with a fractional part (`@rs`

floating-point values), the naïve operation will fail:

> (roll `(list @rs)`~[.10 .12 .14 .16 .18] mul:rs).0

Why is this? Let's peek inside the gates and see. Since we know a core is a cell of `[battery payload]`

, let's take a look at the payload:

> +:mul[[a=1 b=1] <33.uof 1.pnw %138>]> +:mul:rs[[a=.0 b=.0] <21.ezj [r=?(%d %n %u %z) <51.njr 139.oyl 33.uof 1.pnw %138>]>]

The correct behavior for ++mul:rs is really to multiply starting from one, not zero, so that ++roll doesn't wipe out the entire product.

### Custom Samples

In an earlier exercise we created a door with sample `[a=@ud b=@ud c=@ud]`

. If we investigated, we would find that the initial value of each is `0`

, the bunt value of `@ud`

.

> +6:poly[a=0 b=0 c=0]

What if we wish to define a door with a chosen sample value directly? We can make use of the `$_`

buccab rune, whose irregular form is simply `_`

. To create the door `poly`

with the sample set to have certain values in the Dojo, we would write

> =poly |_ [a=_5 b=_4 c=_3]++ quad|= x=@ud(add (add (mul a (mul x x)) (mul b x)) c)--> (quad:poly 2)31

For our earlier example with ++roll, if we wanted to set the default sample to have a different value than the bunt of the type, we could use `_`

cab:

> =mmul |=([a=_.1 b=_.1] (mul:rs a b))> (roll `(list @rs)`~[.10 .12 .14 .16 .18] mmul).483840

### Named Tuples

A named tuple is a structured collection of values with faces. The `$:`

buccol rune forms a named tuple. We use these implicitly in an irregular form when we specify the sample of a gate, as `|=([a=@ b=@] (add a b))`

expands to a `$:`

buccol expression for `[a=@ b=@]`

. Otherwise, we only need these if we are building a special type like a vector (e.g. with two components like an *x* and a *y*).

### Structure Mode

Most Hoon expressions evaluate normally (that's what “normal” means), what we'll call *noun mode* (or *normal mode*). However, sample definitions and `+$`

lusbuc mold specification arms evaluate in what is called *structure mode*. (You may occasionally see this the older term “spec mode”.) Structure mode expressions use a similar syntax to regular Hoon expressions but create structure definitions instead.

For instance, in eval mode if you use the irregular form `p=1`

this is an irregular form of the `^=`

kettis rune. This is one way to define a variable using a `=+`

tislus; these are equivalent statements:

> =+(hello=1 hello)1> =+(^=(hello 1) hello)1

(Normally we have preferred `=/`

tisfas in the Hoon School docs, but that is just for consistency.)

In a sample definition, such as in a gate, the statement is evaluated in structure mode; these are equivalent statements:

|=(hello=@ hello)|=($=(hello @) hello)

There are several other subtle cases where normal mode and structure mode diverge, but most of the time structure mode is invisible to you. The `$`

buc runes are typically invoked in structure mode.