*This module will introduce you to several useful data structures built on the door, then discuss how the compiler handles types and the sample.*

## Key Data Structures and Molds

++maps are a versatile way to store and access data, but they are far from the only useful pattern. `++map`

s were documented in the previous module.

`tree`

We use tree to make a binary tree data structure in Hoon, e.g., `(tree @)`

for a binary tree of atoms.

There are two kinds of `tree`

in Hoon:

The null tree

`~`

.A non-null tree which is a cell with three parts.

- The node value.
- The left child of the node.
- The right child of the node.

Each child is itself a tree. The node value has the face

`n`

, the left child has the face`l`

, and the right child has the face`r`

. The following diagram provides an illustration of a`(tree @)`

(without the faces):

12/ \8 14/ \ / \4 ~ ~ 16/ \ / \~ ~ ~ ~

Hoon supports trees of any type that can be constructed in Hoon, e.g.: `(tree @)`

, `(tree ^)`

, `(tree [@ ?])`

, etc. Let's construct the tree in the diagram above in the dojo, casting it accordingly:

> `(tree @)`[12 [8 [4 ~ ~] ~] [14 ~ [16 ~ ~]]]{4 8 12 14 16}

Notice that we don't have to insert the faces manually; by casting the noun above to a `(tree @)`

Hoon inserts the faces for us. Let's put this noun in the dojo subject with the face `b`

and pull out the tree at the left child of the `12`

node:

> =b `(tree @)`[12 [8 [4 ~ ~] ~] [14 ~ [16 ~ ~]]]> b{4 8 12 14 16}> l.b-find.l.bfind-fork-d

This didn't work because we haven't first proved to Hoon that `b`

is a non-null tree. A null tree has no `l`

in it, after all. Let's try again, using `?~`

wutsig to prove that `b`

isn't null. We can also look at `r`

and `n`

:

> ?~(b ~ l.b){4 8}> ?~(b ~ r.b){14 16}> ?~(b ~ n.b)12

#### Find and Replace

Here's a program that finds and replaces certain atoms in a `(tree @)`

.

|= [nedl=@ hay=(tree @) new=@]^- (tree @)?~ hay ~:+ ?: =(n.hay nedl)newn.hay$(hay l.hay)$(hay r.hay)

`nedl`

is the atom to be replaced, `hay`

is the tree, and `new`

is the new atom with which to replace `nedl`

. Save this as `findreplacetree.hoon`

and run in the dojo:

> b{4 8 12 14 16}> +findreplacetree [8 b 94]{4 94 12 14 16}> +findreplacetree [14 b 94]{4 8 12 94 16}

`set`

A set is rather like a list except that each entry can only be represented once. As with a map, a `set`

is typically associated with a particular type, such as `(set @ud)`

for a collection of decimal values. (`set`

s also don't have an order, so they're basically a bag of unique values.)

`set`

operations are provided by ++in. Most names are similar to `map`

/++by operations when applicable.

++silt produces a `set`

from a `list`

:

=primes (silt ~[2 3 5 7 11 13])

++put:in adds a value to a `set`

(and null-ops when the value is already present):

=primes (~(put in primes) 17)=primes (~(put in primes) 13)

++del:in removes a value from a `set`

:

=primes (~(put in primes) 18)=primes (~(del in primes) 18)

++has:in checks for existence:

> (~(has in primes) 15)%.n> (~(has in primes) 17)%.y

++tap:in yields a `list`

of the values:

> ~(tap in primes)~[3 2 7 5 11 13 17]> (sort ~(tap in primes) lth)~[2 3 5 7 11 13 17]

++run:in applies a function across all values:

> (~(run in primes) dec){10 6 12 1 2 16 4}

#### Example: Cartesian Product

Here's a program that takes two sets of atoms and returns the Cartesian product↗ of those sets. A Cartesian product of two sets `a`

and `b`

is a set of all the cells whose head is a member of `a`

and whose tail is a member of `b`

.

|= [a=(set @) b=(set @)]=/ c=(list @) ~(tap in a)=/ d=(list @) ~(tap in b)=| acc=(set [@ @])|- ^- (set [@ @])?~ c acc%= $c t.cacc |- ?~ d acc%= $d t.dacc (~(put in acc) [i.c i.d])====

Save this as `cartesian.hoon`

in your urbit's pier and run in the dojo:

> =c `(set @)`(sy ~[1 2 3])> c{1 2 3}> =d `(set @)`(sy ~[4 5 6])> d{5 6 4}> +cartesian [c d]{[2 6] [1 6] [3 6] [1 4] [1 5] [2 4] [3 5] [3 4] [2 5]}

`unit`

Redux (and `vase`

)

We encountered the unit briefly as a tool for distinguishing null results from actual zeroes: using a `unit`

allows you to specify something that may not be there. For this reason, `unit`

s are commonly used for operations that sometimes fail, such as search functions, database lookups, remote data requests, etc.

You can build a `unit`

using the tic special notation or ++some:

> `%mars[~ %mars]> (some %mars)[~ u=%mars]

While ++got:by is one way to get a value back without wrapping it in a `unit`

, it's better practice to use the `unit`

logic gates to manipulate gates to work correctly with `unit`

s.

For example, use ++need to unwrap a `unit`

, or crash if the `unit`

is `~`

null.

> =colors (malt `(list (pair @tas @ux))`~[[%red 0xed.0a3f] [%yellow 0xfb.e870] [%green 0x1.a638] [%blue 0x66ff]])> (~(get by colors) %yellow)[~ q=0xfb.e870]> (need (~(get by colors) %yellow))0xfb.e870> (~(get by colors) %teal)~> (need (~(get by colors) %teal))dojo: hoon expression failed

Rather than unwrap a unit, one can modify gates to work with `unit`

s directly even if they're not natively set up that way. For instance, one cannot decrement a `unit`

because ++dec doesn't accept a `unit`

. ++bind can bind a non-`unit`

function—another gate-building gate!.

> (bind ((unit @ud) [~ 2]) dec)[~ 1]> (bind (~(get by colors) %orange) red)[~ 0xff]

(There are several others tools listed on that page which may be potentially useful to you.)

A vase is a pair of type and value, such as that returned by `!>`

zapgar. A `vase`

is useful when transmitting data in a way that may lose its type information.

### Containers of Containers

maps and sets are frequently used in the standard library and in the extended ecosystem. There are a some other common patterns which recur often enough that they have their own names:

++jar is a mold for a

`map`

of`list`

s.`++jar`

uses the ++ja core. (Mnemonic: jars hold solid ordered things, like a list.)++jug is a mold for a

`map`

of`set`

s.`++jug`

uses the ++ju core. (Mnemonic: jugs hold liquids, evoking the unordered nature of a set.)++mip is a mold for a map of maps.

`++mip`

lives in the`%landscape`

desk in`/lib/mip.hoon`

. Affordances are still few but a short example follows:> =mip -build-file /=landscape=/lib/mip/hoon> =my-map-warm (malt `(list (pair @tas @ux))`~[[%red 0xed.0a3f] [%yellow 0xfb.e870]])> =my-map-cool (malt `(list (pair @tas @ux))`~[[%green 0x1.a638] [%blue 0x66ff]])> =my-mip *(mip:mip @tas (map @tas @ux))> =my-mip (~(put bi:mip my-mip) %cool %blue 0x66ff)> =my-mip (~(put bi:mip my-mip) %cool %green 0x1.a638)> =my-mip (~(put bi:mip my-mip) %warm %red 0xed.0a3f)> =my-mip (~(put bi:mip my-mip) %warm %yellow 0xfb.e870)> my-mip[ n=[p=%warm q=[n=[p=%yellow q=0xfb.e870] l=[n=[p=%red q=0xed.0a3f] l=~ r=~] r=~]]l=[n=[p=%cool q=[n=[p=%green q=0x1.a638] l=[n=[p=%blue q=0x66ff] l=~ r=~] r=~]] l=~ r=~]r=~]> (~(got bi:mip my-mip) %cool %green)0x1.a638> ~(tap bi:mip my-mip)~[[x=%warm y=%yellow v=0xfb.e870][x=%warm y=%red v=0xed.0a3f][x=%cool y=%green v=0x1.a638][x=%cool y=%blue v=0x66ff]]`mip`

s are unjetted and quite slow but serve as a proof of concept.++mop ordered maps are discussed in the App School guides.

## Molds and Samples

### Modifying Gate Behavior

Sometimes you need to modify parts of a core (like a gate) on-the-fly to get the desired behavior. For instance, if you are using ++roll to calculate the multiplicative product of the elements of a list, this “just works”:

> (roll `(list @ud)`~[10 12 14 16 18] mul)483.840

In contrast, if you do the same thing to a list of numbers with a fractional part (`@rs`

floating-point values), the naïve operation will fail:

> (roll `(list @rs)`~[.10 .12 .14 .16 .18] mul:rs).0

Why is this? Let's peek inside the gates and see. Since we know a core is a cell of `[battery payload]`

, let's take a look at the payload:

> +:mul[[a=1 b=1] <46.hgz 1.pnw %140>]> +:mul:rs[[a=.0 b=.0] <21.hqd [r=?(%d %n %u %z) <51.qbt 123.zao 46.hgz 1.pnw %140>]>]

The correct behavior for ++mul:rs is really to multiply starting from one, not zero, so that ++roll doesn't wipe out the entire product.

### Custom Samples

In an earlier exercise we created a door with sample `[a=@ud b=@ud c=@ud]`

. If we investigated, we would find that the initial value of each is `0`

, the bunt value of `@ud`

.

> +6:poly[a=0 b=0 c=0]

What if we wish to define a door with a chosen sample value directly? We can make use of the `$_`

buccab rune, whose irregular form is simply `_`

. To create the door `poly`

with the sample set to have certain values in the Dojo, we would write

> =poly |_ [a=_5 b=_4 c=_3]++ quad|= x=@ud(add (add (mul a (mul x x)) (mul b x)) c)--> (quad:poly 2)31

For our earlier example with ++roll, if we wanted to set the default sample to have a different value than the bunt of the type, we could use `_`

cab:

> =mmul |=([a=_.1 b=_.1] (mul:rs a b))> (roll `(list @rs)`~[.10 .12 .14 .16 .18] mmul).483840

### Named Tuples

A named tuple is a structured collection of values with faces. The `$:`

buccol rune forms a named tuple. We use these implicitly in an irregular form when we specify the sample of a gate, as `|=([a=@ b=@] (add a b))`

expands to a `$:`

buccol expression for `[a=@ b=@]`

. Otherwise, we only need these if we are building a special type like a vector (e.g. with two components like an *x* and a *y*).

### Structure Mode

Most Hoon expressions evaluate normally (that's what “normal” means), what we'll call *noun mode* (or *normal mode*). However, sample definitions and `+$`

lusbuc mold specification arms evaluate in what is called *structure mode*. (You may occasionally see this the older term “spec mode”.) Structure mode expressions use a similar syntax to regular Hoon expressions but create structure definitions instead.

For instance, in eval mode if you use the irregular form `p=1`

this is an irregular form of the `^=`

kettis rune. This is one way to define a variable using a `=+`

tislus; these are equivalent statements:

> =+(hello=1 hello)1> =+(^=(hello 1) hello)1

(Normally we have preferred `=/`

tisfas in the Hoon School docs, but that is just for consistency.)

In a sample definition, such as in a gate, the statement is evaluated in structure mode; these are equivalent statements:

|=(hello=@ hello)|=($=(hello @) hello)

There are several other subtle cases where normal mode and structure mode diverge, but most of the time structure mode is invisible to you. The `$`

buc runes are typically invoked in structure mode.