Urbit Docs
  • What is Urbit?
  • Get on Urbit
  • Build on Urbit
    • Contents
    • Environment Setup
    • Hoon School
      • 1. Hoon Syntax
      • 2. Azimuth (Urbit ID)
      • 3. Gates (Functions)
      • 4. Molds (Types)
      • 5. Cores
      • 6. Trees and Addressing
      • 7. Libraries
      • 8. Testing Code
      • 9. Text Processing I
      • 10. Cores and Doors
      • 11. Data Structures
      • 12. Type Checking
      • 13. Conditional Logic
      • 14. Subject-Oriented Programming
      • 15. Text Processing II
      • 16. Functional Programming
      • 17. Text Processing III
      • 18. Generic and Variant Cores
      • 19. Mathematics
    • App School I
      • 1. Arvo
      • 2. The Agent Core
      • 3. Imports and Aliases
      • 4. Lifecycle
      • 5. Cards
      • 6. Pokes
      • 7. Structures and Marks
      • 8. Subscriptions
      • 9. Vanes
      • 10. Scries
      • 11. Failure
      • 12. Next Steps
      • Appendix: Types
    • App School II (Full-Stack)
      • 1. Types
      • 2. Agent
      • 3. JSON
      • 4. Marks
      • 5. Eyre
      • 6. React app setup
      • 7. React app logic
      • 8. Desk and glob
      • 9. Summary
    • Core Academy
      • 1. Evaluating Nock
      • 2. Building Hoon
      • 3. The Core Stack
      • 4. Arvo I: The Main Sequence
      • 5. Arvo II: The Boot Sequence
      • 6. Vere I: u3 and the Serf
      • 7. Vere II: The Loom
      • 8. Vanes I: Behn, Dill, Kahn, Lick
      • 9. Vanes II: Ames
      • 10. Vanes III: Eyre, Iris
      • 11. Vanes IV: Clay
      • 12. Vanes V: Gall and Userspace
      • 13. Vanes VI: Khan, Lick
      • 14. Vanes VII: Jael, Azimuth
    • Runtime
      • U3
      • Conn.c Guide
      • How to Write a Jet
      • API Overview by Prefix
      • C in Urbit
      • Cryptography
      • Land of Nouns
    • Tools
      • Useful Links
      • JS Libraries
        • HTTP API
      • Docs App
        • File Format
        • Index File
        • Suggested Structure
    • Userspace
      • Command-Line App Tutorial
      • Remote Scry
      • Unit Tests
      • Software Distribution
        • Software Distribution Guide
        • Docket File
        • Glob
      • Examples
        • Building a CLI App
        • Debugging Wrapper
        • Host a Website
        • Serving a JS Game
        • Ship Monitoring
        • Styled Text
  • Urbit ID
    • What is Urbit ID?
    • Azimuth Data Flow
    • Life and Rift
    • Urbit HD Wallet
    • Advanced Azimuth Tools
    • Custom Roller Tutorial
    • Azimuth.eth Reference
    • Ecliptic.eth Reference
    • Layer 2
      • L2 Actions
      • L2 Rollers
      • L2 Roller HTTP RPC-API
      • L2 Transaction Format
  • Urbit OS
    • What is Urbit OS?
    • Base
      • Hood
      • Threads
        • Basics Tutorial
          • Bind
          • Fundamentals
          • Input
          • Output
          • Summary
        • HTTP API Guide
        • Spider API Reference
        • Strandio Reference
        • Examples
          • Child Thread
          • Fetch JSON
          • Gall
            • Poke Thread
            • Start Thread
            • Stop Thread
            • Take Facts
            • Take Result
          • Main-loop
          • Poke Agent
          • Scry
          • Take Fact
    • Kernel
      • Arvo
        • Cryptography
        • Move Trace
        • Scries
        • Subscriptions
      • Ames
        • Ames API Reference
        • Ames Cryptography
        • Ames Data Types
        • Ames Scry Reference
      • Behn
        • Behn API Reference
        • Behn Examples
        • Behn Scry Reference
      • Clay
        • Clay API Reference
        • Clay Architecture
        • Clay Data Types
        • Clay Examples
        • Clay Scry Reference
        • Filesystem Hierarchy
        • Marks
          • Mark Examples
          • Using Marks
          • Writing Marks
        • Using Clay
      • Dill
        • Dill API Reference
        • Dill Data Types
        • Dill Scry Reference
      • Eyre
        • EAuth
        • Eyre Data Types
        • Eyre External API
        • Eyre Internal API
        • Eyre Scry Reference
        • Low-Level Eyre Guide
        • Noun channels
      • Gall
        • Gall API Reference
        • Gall Data Types
        • Gall Scry Reference
      • Iris
        • Iris API Reference
        • Iris Data Types
        • Iris Example
      • Jael
        • Jael API Reference
        • Jael Data Types
        • Jael Examples
        • Jael Scry Reference
      • Khan
        • Khan API Reference
        • Khan Data Types
        • Khan Example
      • Lick
        • Lick API Reference
        • Lick Guide
        • Lick Examples
        • Lick Scry Reference
  • Hoon
    • Why Hoon?
    • Advanced Types
    • Arvo
    • Auras
    • Basic Types
    • Cheat Sheet
    • Cryptography
    • Examples
      • ABC Blocks
      • Competitive Programming
      • Emirp
      • Gleichniszahlenreihe
      • Islands
      • Luhn Number
      • Minimum Path Sum
      • Phone Letters
      • Restore IP
      • Rhonda Numbers
      • Roman Numerals
      • Solitaire Cipher
      • Water Towers
    • Generators
    • Hoon Errors
    • Hoon Style Guide
    • Implementing an Aura
    • Irregular forms
    • JSON
    • Limbs and wings
      • Limbs
      • Wings
    • Mips (Maps of Maps)
    • Parsing Text
    • Runes
      • | bar · Cores
      • $ buc · Structures
      • % cen · Calls
      • : col · Cells
      • . dot · Nock
      • / fas · Imports
      • ^ ket · Casts
      • + lus · Arms
      • ; mic · Make
      • ~ sig · Hints
      • = tis · Subject
      • ? wut · Conditionals
      • ! zap · Wild
      • Constants (Atoms and Strings)
      • --, == · Terminators
    • Sail (HTML)
    • Serialization
    • Sets
    • Standard Library
      • 1a: Basic Arithmetic
      • 1b: Tree Addressing
      • 1c: Molds and Mold-Builders
      • 2a: Unit Logic
      • 2b: List Logic
      • 2c: Bit Arithmetic
      • 2d: Bit Logic
      • 2e: Insecure Hashing
      • 2f: Noun Ordering
      • 2g: Unsigned Powers
      • 2h: Set Logic
      • 2i: Map Logic
      • 2j: Jar and Jug Logic
      • 2k: Queue Logic
      • 2l: Container from Container
      • 2m: Container from Noun
      • 2n: Functional Hacks
      • 2o: Normalizing Containers
      • 2p: Serialization
      • 2q: Molds and Mold-Builders
      • 3a: Modular and Signed Ints
      • 3b: Floating Point
      • 3c: Urbit Time
      • 3d: SHA Hash Family
      • 3e: AES encryption (Removed)
      • 3f: Scrambling
      • 3g: Molds and Mold-Builders
      • 4a: Exotic Bases
      • 4b: Text Processing
      • 4c: Tank Printer
      • 4d: Parsing (Tracing)
      • 4e: Parsing (Combinators)
      • 4f: Parsing (Rule-Builders)
      • 4g: Parsing (Outside Caller)
      • 4h: Parsing (ASCII Glyphs)
      • 4i: Parsing (Useful Idioms)
      • 4j: Parsing (Bases and Base Digits)
      • 4k: Atom Printing
      • 4l: Atom Parsing
      • 4m: Formatting Functions
      • 4n: Virtualization
      • 4o: Molds
      • 5a: Compiler Utilities
      • 5b: Macro Expansion
      • 5c: Compiler Backend & Prettyprinter
      • 5d: Parser
      • 5e: Molds and mold builders
      • 5f: Profiling support
    • Strings
    • The Engine Pattern
    • Udon (Markdown-esque)
    • Vases
    • Zuse
      • 2d(1-5): To JSON, Wains
      • 2d(6): From JSON
      • 2d(7): From JSON (unit)
      • 2e(2-3): Print & Parse JSON
      • 2m: Ordered Maps
  • Nock
    • What is Nock?
    • Decrement
    • Definition
    • Fast Hints and Jets
    • Implementations
    • Specification
  • User Manual
    • Contents
    • Running Urbit
      • Cloud Hosting
      • Home Servers
      • Runtime Reference
      • Self-hosting S3 Storage with MinIO
    • Urbit ID
      • Bridge Troubleshooting
      • Creating an Invite Pool
      • Get an Urbit ID
      • Guide to Factory Resets
      • HD Wallet (Master Ticket)
      • Layer 2 for planets
      • Layer 2 for stars
      • Proxies
      • Using Bridge
    • Urbit OS
      • Basics
      • Configuring S3 Storage
      • Dojo Tools
      • Filesystem
      • Shell
      • Ship Troubleshooting
      • Star and Galaxy Operations
      • Updates
Powered by GitBook

GitHub

  • Urbit ID
  • Urbit OS
  • Runtime

Resources

  • YouTube
  • Whitepaper
  • Awesome Urbit

Contact

  • X
  • Email
  • Gather
On this page
  • Key Data Structures and Molds
  • +tree
  • +set
  • +unit Redux (and $vase)
  • Containers of Containers
  • Molds and Samples
  • Modifying Gate Behavior
  • Custom Samples
  • Named Tuples
  • Structure Mode
Edit on GitHub
  1. Build on Urbit
  2. Hoon School

11. Data Structures

This module will introduce you to several useful data structures built on the door, then discuss how the compiler handles types and the sample.

Key Data Structures and Molds

+maps are a versatile way to store and access data, but they are far from the only useful pattern. +maps were documented in the previous module.

+tree

We use tree to make a binary tree data structure in Hoon, e.g., (tree @) for a binary tree of atoms.

There are two kinds of +tree in Hoon:

  1. The null tree ~.

  2. A non-null tree which is a cell with three parts.

    1. The node value.

    2. The left child of the node.

    3. The right child of the node.

Each child is itself a tree. The node value has the face n, the left child has the face l, and the right child has the face r. The following diagram provides an illustration of a (tree @) (without the faces):

          12
        /    \
      8       14
    /   \    /   \
   4     ~  ~     16
 /  \            /  \
~    ~          ~    ~

Hoon supports trees of any type that can be constructed in Hoon, e.g.: (tree @), (tree ^), (tree [@ ?]), etc. Let's construct the tree in the diagram above in the dojo, casting it accordingly:

> `(tree @)`[12 [8 [4 ~ ~] ~] [14 ~ [16 ~ ~]]]
{4 8 12 14 16}

Notice that we don't have to insert the faces manually; by casting the noun above to a (tree @) Hoon inserts the faces for us. Let's put this noun in the Dojo subject with the face "b" and pull out the tree at the left child of the 12 node:

> =b `(tree @)`[12 [8 [4 ~ ~] ~] [14 ~ [16 ~ ~]]]

> b
{4 8 12 14 16}

> l.b
-find.l.b
find-fork

This didn't work because we haven't first proved to Hoon that .b is a non-null tree. A null tree has no .l in it, after all. Let's try again, using ?~ wutsig to prove that .b isn't null. We can also look at .r and .n:

> ?~(b ~ l.b)
{4 8}

> ?~(b ~ r.b)
{14 16}

> ?~(b ~ n.b)
12

Find and Replace

Here's a program that finds and replaces certain atoms in a (tree @).

|=  [nedl=@ hay=(tree @) new=@]
^-  (tree @)
?~  hay  ~
:+  ?:  =(n.hay nedl)
      new
    n.hay
  $(hay l.hay)
$(hay r.hay)

The .nedl is the atom to be replaced, .hay is the tree, and .new is the new atom with which to replace .nedl. Save this as findreplacetree.hoon and run in the dojo:

> b
{4 8 12 14 16}

> +findreplacetree [8 b 94]
{4 94 12 14 16}

> +findreplacetree [14 b 94]
{4 8 12 94 16}

+set

A set is rather like a list except that each entry can only be represented once. As with a map, a +set is typically associated with a particular type, such as (set @ud) for a collection of decimal values. (+sets also don't have an order, so they're basically a bag of unique values.)

+set operations are provided by +in. Most names are similar to +map/+by operations when applicable.

+silt produces a +set from a +list:

=primes (silt ~[2 3 5 7 11 13])

+put:in adds a value to a +set (and null-ops when the value is already present):

=primes (~(put in primes) 17)
=primes (~(put in primes) 13)

+del:in removes a value from a +set:

=primes (~(put in primes) 18)
=primes (~(del in primes) 18)

+has:in checks for existence:

> (~(has in primes) 15)
%.n

> (~(has in primes) 17)
%.y

+tap:in yields a +list of the values:

> ~(tap in primes)
~[3 2 7 5 11 13 17]

> (sort ~(tap in primes) lth)
~[2 3 5 7 11 13 17]

+run:in applies a function across all values:

> (~(run in primes) dec)
{10 6 12 1 2 16 4}

Example: Cartesian Product

Here's a program that takes two sets of atoms and returns the Cartesian product of those sets. A Cartesian product of two sets .a and .b is a set of all the cells whose head is a member of .a and whose tail is a member of .b.

|=  [a=(set @) b=(set @)]
=/  c=(list @)  ~(tap in a)
=/  d=(list @)  ~(tap in b)
=|  acc=(set [@ @])
|-  ^-  (set [@ @])
?~  c  acc
%=  $
  c  t.c
  acc  |-  ?~  d  acc
       %=  $
         d  t.d
         acc  (~(put in acc) [i.c i.d])
       ==
==

Save this as cartesian.hoon in your urbit's pier and run in the dojo:

> =c `(set @)`(sy ~[1 2 3])

> c
{1 2 3}

> =d `(set @)`(sy ~[4 5 6])

> d
{5 6 4}

> +cartesian [c d]
{[2 6] [1 6] [3 6] [1 4] [1 5] [2 4] [3 5] [3 4] [2 5]}

+unit Redux (and $vase)

We encountered the unit briefly as a tool for distinguishing null results from actual zeroes: using a +unit allows you to specify something that may not be there. For this reason, +units are commonly used for operations that sometimes fail, such as search functions, database lookups, remote data requests, etc.

You can build a +unit using the tic special notation or +some:

> `%mars
[~ %mars]

> (some %mars)
[~ u=%mars]

While +got:by is one way to get a value back without wrapping it in a +unit, it's better practice to use the +unit logic gates to manipulate gates to work correctly with +units.

For example, use +need to unwrap a +unit, or crash if the +unit is ~ null.

> =colors (malt `(list (pair @tas @ux))`~[[%red 0xed.0a3f] [%yellow 0xfb.e870] [%green 0x1.a638] [%blue 0x66ff]])

> (~(get by colors) %yellow)
[~ q=0xfb.e870]

> (need (~(get by colors) %yellow))
0xfb.e870

> (~(get by colors) %teal)
~

> (need (~(get by colors) %teal))
dojo: hoon expression failed

Rather than unwrap a unit, one can modify gates to work with +units directly even if they're not natively set up that way. For instance, one cannot decrement a +unit because +dec doesn't accept a +unit. +bind can bind a non-+unit function, another gate-building gate!.

> (bind ((unit @ud) [~ 2]) dec)  
[~ 1]

> (bind (~(get by colors) %orange) red)  
[~ 0xff]

(There are several others tools listed on that page which may be potentially useful to you.)

A vase is a pair of type and value, such as that returned by !> zapgar. A $vase is useful when transmitting data in a way that may lose its type information.

Containers of Containers

maps and sets are frequently used in the standard library and in the extended ecosystem. There are other common patterns which recur often enough that they have their own names:

+jar is a mold for a +map of +lists. +jar uses the +ja core. (Mnemonic: jars hold solid ordered things, like a list.)

+jug is a mold for a +map of +sets. +jug uses the +ju core. (Mnemonic: jugs hold liquids, evoking the unordered nature of a set.)

+mip is a mold for a map of maps. +mip lives in the %landscape desk in /lib/mip.hoon. Affordances are still few but a short example follows:

> =mip -build-file /=landscape=/lib/mip/hoon

> =my-map-warm (malt `(list (pair @tas @ux))`~[[%red 0xed.0a3f] [%yellow 0xfb.e870]])

> =my-map-cool (malt `(list (pair @tas @ux))`~[[%green 0x1.a638] [%blue 0x66ff]])

> =my-mip *(mip:mip @tas (map @tas @ux))

> =my-mip (~(put bi:mip my-mip) %cool %blue 0x66ff)

> =my-mip (~(put bi:mip my-mip) %cool %green 0x1.a638)

> =my-mip (~(put bi:mip my-mip) %warm %red 0xed.0a3f)

> =my-mip (~(put bi:mip my-mip) %warm %yellow 0xfb.e870)

> my-mip
[ n=[p=%warm q=[n=[p=%yellow q=0xfb.e870] l=[n=[p=%red q=0xed.0a3f] l=~ r=~] r=~]]
  l=[n=[p=%cool q=[n=[p=%green q=0x1.a638] l=[n=[p=%blue q=0x66ff] l=~ r=~] r=~]] l=~ r=~]
  r=~
]

> (~(got bi:mip my-mip) %cool %green)
0x1.a638

> ~(tap bi:mip my-mip) ~[ [x=%warm y=%yellow v=0xfb.e870] [x=%warm
y=%red v=0xed.0a3f] [x=%cool y=%green v=0x1.a638] [x=%cool y=%blue
v=0x66ff] ]

Note that +mips are unjetted and quite slow, but serve as a proof of concept.

+mop ordered maps are discussed in the App School guides.

Molds and Samples

Modifying Gate Behavior

Sometimes you need to modify parts of a core (like a gate) on-the-fly to get the desired behavior. For instance, if you are using +roll to calculate the multiplicative product of the elements of a list, this “just works”:

> (roll `(list @ud)`~[10 12 14 16 18] mul)  
483.840

In contrast, if you do the same thing to a list of numbers with a fractional part (@rs floating-point values), the naïve operation will fail:

> (roll `(list @rs)`~[.10 .12 .14 .16 .18] mul:rs)  
.0

Why is this? Let's peek inside the gates and see. Since we know a core is a cell of [battery payload], let's take a look at the payload:

> +:mul
[[a=1 b=1] <33.uof 1.pnw %138>]

> +:mul:rs
[[a=.0 b=.0] <21.ezj [r=?(%d %n %u %z) <51.njr 139.oyl 33.uof 1.pnw %138>]>]

The correct behavior for +mul:rs is really to multiply starting from one, not zero, so that +roll doesn't wipe out the entire product.

Custom Samples

In an earlier exercise we created a door with sample [a=@ud b=@ud c=@ud]. If we investigated, we would find that the initial value of each is 0, the bunt value of @ud.

> +6:poly
[a=0 b=0 c=0]

What if we wish to define a door with a chosen sample value directly? We can make use of the $_ buccab rune, whose irregular form is simply _. To create the door "poly" with the sample set to have certain values in the Dojo, we would write

> =poly |_  [a=_5 b=_4 c=_3]
++  quad
  |=  x=@ud
  (add (add (mul a (mul x x)) (mul b x)) c)
--

> (quad:poly 2)  
31

For our earlier example with +roll, if we wanted to set the default sample to have a different value than the bunt of the type, we could use _ cab:

> =mmul |=([a=_.1 b=_.1] (mul:rs a b))

> (roll `(list @rs)`~[.10 .12 .14 .16 .18] mmul)
.483840

Named Tuples

A named tuple is a structured collection of values with faces. The $: buccol rune forms a named tuple. We use these implicitly in an irregular form when we specify the sample of a gate, as |=([a=@ b=@] (add a b)) expands to a $: buccol expression for [a=@ b=@]. Otherwise, we only need these if we are building a special type like a vector (e.g. with two components like an x and a y).

Structure Mode

Most Hoon expressions evaluate normally (that's what “normal” means), what we'll call "noun mode" (or "normal mode"). However, sample definitions and +$ lusbuc mold specification arms evaluate in what is called "structure mode". (You may occasionally see this the older term “spec mode”.) Structure mode expressions use a similar syntax to regular Hoon expressions but create structure definitions instead.

For instance, in eval mode if you use the irregular form p=1 this is an irregular form of the ^= kettis rune. This is one way to define a variable using a =+ tislus; these are equivalent statements:

> =+(hello=1 hello)
1

> =+(^=(hello 1) hello)
1

(Normally we have preferred =/ tisfas in the Hoon School docs, but that is just for consistency.)

In a sample definition, such as in a gate, the statement is evaluated in structure mode; these are equivalent statements:

|=(hello=@ hello)

|=($=(hello @) hello)

There are several other subtle cases where normal mode and structure mode diverge, but most of the time structure mode is invisible to you. The $ buc runes are typically invoked in structure mode.

Previous10. Cores and DoorsNext12. Type Checking

Last updated 1 day ago