15. Text Processing II

This module will elaborate on text representation in Hoon, including formatted text and %ask generators. It may be considered optional and skipped if you are speedrunning Hoon School.

Text Conversions

We frequently need to convert from text to data, and between different text-based representations. Let's examine some specific arms:

  • How do we convert text into all lower-case?

    • +cass

  • How do we turn a $cord into a tape?

    • +trip

  • How can we make a list of a null-terminated tuple?

    • +le:nl

  • How can we evaluate Nock expressions?

    • +mink

(If you see a |* bartar rune in the code, it's similar to a |= bartis, but produces what's called a "wet gate".)
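
As a quick illustration of the first two arms, here is a short Dojo session. +crip does not appear in the list above; it is included here only as the inverse of +trip (tape to $cord), and the sample strings are arbitrary.

> (cass "Hello Mars")
"hello mars"
> (trip 'Hello Mars')
"Hello Mars"
> (crip "Hello Mars")
'Hello Mars'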

The +html core of the standard library contains some additional important tools for working with web-based data, such as MIME types and JSON strings.

To convert a @ux hexadecimal value to a $cord:

> (en:base16:mimes:html [3 0x12.3456])  
'123456'

To convert a $cord to a @ux hexadecimal value:

> `@ux`q.+>:(de:base16:mimes:html '123456')
0x12.3456

There are tools for working with Bitcoin wallet base-58 values, JSON strings, XML strings, and more. For example, +en-urlt:html URL-encodes a tape:

> (en-urlt:html "https://hello.me")
"https%3A%2F%2Fhello.me"

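The reverse direction can be sketched with +de-urlt:html, the decoding counterpart in the same core. It is not mentioned in the text above; as far as we know it produces a unit of a tape rather than a bare tape, since decoding can fail on malformed input, so no output is shown here and you should check the result in your own Dojo.

> (de-urlt:html "https%3A%2F%2Fhello.me")
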
Formatted Text

Hoon produces messages at the Dojo (or otherwise) using an internal formatted text system, called $tanks. A $tank is a formatted print tree. Error messages and the like are built of $tanks. $tanks are defined in hoon.hoon:

::  $tank: formatted print tree
::
::    just a cord, or
::    %leaf: just a tape
::    %palm: backstep list
::           flat-mid, open, flat-open, flat-close
::    %rose: flat list
::           flat-mid, open, close
::
+$  tank
  $~  leaf/~
  $@  cord
  $%  [%leaf p=tape]
      [%palm p=(qual tape tape tape tape) q=(list tank)]
      [%rose p=(trel tape tape tape) q=(list tank)]
  ==
+$  tang  (list tank)  ::  bottom-first error

The +ram:re arm is used to convert these to actual formatted output as a tape, e.g.

> ~(ram re leaf+"foo")
"foo"
> ~(ram re [%palm ["|" "(" "!" ")"] leaf+"foo" leaf+"bar" leaf+"baz" ~])
"(!foo|bar|baz)"
> ~(ram re [%rose [" " "[" "]"] leaf+"foo" leaf+"bar" leaf+"baz" ~])
"[foo bar baz]"

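Because a $tank is a tree, tanks can be nested inside one another. The following is a minimal sketch that follows the flat rendering rules of the examples above; the separator and bracket tapes are arbitrary choices made for illustration.

> ~(ram re [%rose [", " "[" "]"] leaf+"a" [%palm ["|" "(" "!" ")"] leaf+"b" leaf+"c" ~] ~])
"[a, (!b|c)]"
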
Many generators build sophisticated output using $tanks and the short-format cell builder +, e.g. in /gen/azimuth-block/hoon:

[leaf+(scow %ud block)]~

which is equivalent to

~[[%leaf (scow %ud block)]]
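
You can check this equivalence in the Dojo. In this sketch the literal 42 stands in for block, which is only defined inside the generator itself.

> (scow %ud 42)
"42"
> =([leaf+(scow %ud 42)]~ ~[[%leaf (scow %ud 42)]])
%.y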

$tanks are the primary output mechanism for more advanced generators. Even if you don't end up writing them much, you will encounter them as you delve into the Urbit codebase.

Tutorial: Deep Dive into ls.hoon

The +ls generator shows the contents at a particular path in Clay:

> +cat /===/gen/ls/hoon
/~nec/base/~2022.6.22..17.25.54..1034/gen/ls/hoon
::  LiSt directory subnodes
::
::::  /hoon/ls/gen
  ::
/?    310
/+    show-dir
::
::::
  ::
~&  %
:-  %say
|=  [^ [arg=path ~] vane=?(%g %c)]
=+  lon=.^(arch (cat 3 vane %y) arg)
tang+[?~(dir.lon leaf+"~" (show-dir vane arg dir.lon))]~

Let's go line by line:

/?    310
/+    show-dir

The first line, /? faswut, is reserved for future functionality which will allow the version number of the kernel to be pinned. It currently does nothing, but you will see it in many Urbit-shipped files.

Then the show-dir library is imported.

~&  %

A separator % is printed.

:-  %say

A %say generator is a cell with a metadata tag %say as the head and the gate as the tail.

|=  [^ [arg=path ~] vane=?(%g %c)]

This generator requires a path argument in its sample and optionally accepts a vane tag (%g Gall or %c Clay). Most of the time, +ls is used with Clay, so %c as the last entry in the type union serves as the bunt value.

=+  lon=.^(arch (cat 3 vane %y) arg)

We saw .^ dotket for the first time in the previous module, where we learned that it performs a "peek" or scry into the state of an Arvo vane. Most of the time this functionality is used to ask %c Clay or %g Gall for information about a path, desk, agent, etc. In this case, (cat 3 %c %y) is a fancy way of collocating the two @tas terms into %cy, a Clay file or directory lookup. The type of this lookup is $arch, and the location of the file or directory is given by .arg from the sample.
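
To see the term construction on its own, try it in the Dojo. +cat produces a bare atom, so the result prints as a number until it is cast back to @tas. The last line is the same kind of scry the generator performs, written out by hand; its output depends on your ship and desk, so none is shown.

> (cat 3 %c %y)
31.075
> `@tas`(cat 3 %c %y)
%cy
> .^(arch %cy /===/gen/ls/hoon)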

tang+[?~(dir.lon leaf+"~" (show-dir vane arg dir.lon))]~

The result of the lookup on the previous line is adapted into a formatted text block with a head of %tang, with different results depending on whether the directory listing dir.lon is ~ null or not.

Tutorial: Deep Dive into cat.hoon

How does +cat work? Let's look at the structure of /gen/cat/hoon:

/gen/cat.hoon
::  ConCATenate file listings
::
::::  /hoon/cat/gen
  ::
/?    310
/+    pretty-file, show-dir
::
::::
  ::
:-  %say
|=  [^ [arg=(list path)] vane=?(%g %c)]
=-  tang+(flop `tang`(zing -))
%+  turn  arg
|=  pax=path
^-  tang
=+  ark=.^(arch (cat 3 vane %y) pax)
?^  fil.ark
  ?:  =(%sched -:(flop pax))
    [>.^((map @da cord) (cat 3 vane %x) pax)<]~
  [leaf+(spud pax) (pretty-file .^(noun (cat 3 vane %x) pax))]
?-     dir.ark                                          ::  handle ambiguity
    ~
  [rose+[" " `~]^~[leaf+"~" (smyt pax)]]~
::
    [[@t ~] ~ ~]
  $(pax (welp pax /[p.n.dir.ark]))
::
    *
  =-  [palm+[": " ``~]^-]~
  :~  rose+[" " `~]^~[leaf+"*" (smyt pax)]
      `tank`(show-dir vane pax dir.ark)
  ==
==

What is the top-level structure of the generator? (A cell of %say and the gate, what Dojo recognizes as a %say generator.)

Some points of interest include:

  • /? faswut pins the expected Arvo kelvin version; right now it doesn't do anything.

  • .^ dotket loads a value from Arvo (called a "scry").

  • +smyt pretty-prints a path.

  • =- tishep combines a faced noun with the subject, inverted relative to =+ tislus/=/ tisfas.
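
Two of these points can be tried directly in the Dojo. The first pair of expressions is a small sketch of how =- tishep reverses the argument order of =+ tislus (the face a and the numbers are arbitrary). The second pair renders a path as a tape, once with +spud as used in the generator and once by flattening the $tank from +smyt with +ram:re.

> =+  a=5  (add a 2)
7
> =-  (add a 2)  a=5
7
> (spud /gen/cat/hoon)
"/gen/cat/hoon"
> ~(ram re (smyt /gen/cat/hoon))
"/gen/cat/hoon"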

You can see how much of the generator is concerned with formatting the content of the file into a formatted text $tank by prepending %rose tags and so forth.

Work line-by-line through the file and clarify parts that are muddy to you at first glance.

Producing Error Messages

Formal error messages in Urbit are built of tanks. “A $tang is a list of $tanks, and a $tank is a structure for printing data. There are three types of $tank: $leaf, $palm, and $rose. A $leaf is for printing a single noun, a $rose is for printing rows of data, and a $palm is for printing backstep-indented lists.”

One way to include an error message in your code is the ~_ sigcab rune, described as a “user-formatted tracing printf”, or the ~| sigbar rune, a “tracing printf”. What this means is that these print to the stack trace if something fails, so you can use either rune to contribute to the error description:

|=  a=@ud
~_  leaf+"This code failed"
!!

When you compose your own library functions, consider including error messages for likely failure points.
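
As a sketch of what that can look like, here is a hypothetical naked generator, saved as /gen/safe-div.hoon (the file name and messages are made up for illustration). It uses ~| sigbar to label the whole computation and ~_ sigcab to attach a formatted $tank to the one branch that crashes.

::  /gen/safe-div.hoon (hypothetical example)
|=  [a=@ud b=@ud]
::  label any crash below this point in the stack trace
~|  'safe-div: division failed'
?:  =(0 b)
  ::  add a formatted message for the zero-divisor case, then crash
  ~_  leaf+"cannot divide {<a>} by zero"
  !!
(div a b)

With a nonzero divisor the hints are never printed:

> +safe-div [10 2]
5

With a divisor of 0 the generator crashes, and both messages should appear in the stack trace.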

%ask Generators

Previously, we introduced the concept of a %say generator to produce a more versatile form of standalone single computation than a simple naked generator (gate) allowed. Another elaboration, the %ask generator, takes things further.

We use an %ask generator when we want to create an interactive program that prompts for inputs as it runs, rather than expecting arguments to be passed in at the time of initiation.

This section will briefly walk through an %ask generator to give you a taste of how they work. The CLI app guide walks through the libraries necessary for working with %ask generators in greater detail. We also recommend reading ~wicdev-wisryt's “Input and Output in Hoon” for an extended consideration of relevant input/output issues.

Tutorial: %ask Generator

The code below is an %ask generator that checks if the user inputs "blue" when prompted per a classic Monty Python scene. Save it as /gen/axe.hoon in your %base desk.

/-  sole
/+  generators
=,  [sole generators]
:-  %ask
|=  *
^-  (sole-result (cask tang))
%+  print    leaf+"What is your favorite color?"
%+  prompt   [%& %prompt "color: "]
|=  t=tape
%+  produce  %tang
?:  =(t "blue")
  :~  leaf+"Oh. Thank you very much."
      leaf+"Right. Off you go then."
  ==
:~  leaf+"Aaaaagh!"
    leaf+"Into the Gorge of Eternal Peril with you!"
==

Run the generator from the Dojo:

> +axe

What is your favorite color?
: color:

Something new has happened. Instead of simply returning something, your Dojo's prompt changed from ~your-urbit:dojo> to ~your-urbit:dojo: color:, and now expects additional input. Let's give it an answer:

: color: red
Into the Gorge of Eternal Peril with you!
Aaaaagh!

Let's go over what exactly is happening in this code.

/-  sole
/+  generators
=,  [sole generators]

Here we bring in some of the types we are going to need from /sur/sole and gates we will use from /lib/generators. We use some special runes for this.

  • /- fashep is a Ford rune used to import types from /sur.

  • /+ faslus is a Ford rune used to import libraries from /lib.

  • =, tiscom is a rune that allows us to expose a namespace. We do this to avoid having to write sole-result:sole instead of sole-result or print:generators instead of +print.

:-  %ask
|=  *

This code might be familiar. Just as with their %say cousins, %ask generators need to produce a cell, the head of which specifies what kind of generator we are running.

With |= *, we create a gate and ignore the standard arguments we are given, because we're not using them.

^-  (sole-result (cask tang))

%ask generators need to have the second half of the cell be a gate that produces a sole-result, one that in this case contains a +cask of $tang. We use the ^- kethep rune to constrain the generator's output to such a sole-result.

A +cask is a pair of a mark name and a noun. We previously described a $mark as a kind of complicated mold; here we add that a $mark can be thought of as an Arvo-level MIME type for data.

A $tang is a list of $tanks, the formatted print trees described above (a bare cord, or a %leaf, %palm, or %rose).

%+  print    leaf+"What is your favorite color?"
%+  prompt   [%& %prompt "color: "]
|=  t=tape
%+  produce  %tang

Because we imported generators, we can access its contained gates, three of which we use in axe.hoon: +print, +prompt, and +produce.

+print is used for printing a $tank to the console.

In our example, %+ cenlus is used to call the gate +print with two arguments. The first argument is a $tank to print; leaf+"What is your favorite color?" is syntactic sugar for [%leaf "What is your favorite color?"] that just makes it easier to write. The second argument is the output of the call to +prompt.

+prompt is used to construct a prompt for the user to provide input.

The first argument is a tuple. The second argument is a gate that returns the output of a call to +produce. Most %ask generators will want to use the +prompt gate.

The first element of the +prompt tuple/sample is a flag that indicates whether what the user typed should be echoed out to them or hidden. %& will produce echoed output and %| will hide the output (for use in passwords or other secret text).

The second element of the +prompt sample is intended to be information for use in creating autocomplete options for the prompt. This functionality is not yet implemented.

The third element of the +prompt sample is the tape that we would like to use to prompt the user. In the case of our example, we use "color: ".

+produce is used to construct the output of the generator.

In our example, we produce a $tang.

|=  t=tape

Our gate here takes a $tape that was produced by +prompt. If we needed another type of data we could use +parse to obtain it.

The rest of this generator should be intelligible to those with Hoon knowledge at this point.

One quirk that you should be aware of, though, is that $tang prints in reverse order from how it is created. The reason for this is that $tang was originally created to display stack trace information, which should be produced in reverse order. This leads to an annoyance: we either have to specify our messages backwards or construct them in the order we want and then +flop the +list.
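
+flop simply reverses a list:

> (flop ~[1 2 3])
~[3 2 1]

As a sketch of the second option, the failure branch of axe.hoon could be written in reading order and flopped before it is produced. This fragment is an illustrative variation on the generator above, not a change shipped anywhere; given the reversal just described, the two lines should then print in the opposite order from the session shown earlier.

%+  produce  %tang
%-  flop
:~  leaf+"Into the Gorge of Eternal Peril with you!"
    leaf+"Aaaaagh!"
==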
