3. The Core Stack
This lesson goes over the nested core structures of /sys/hoon.hoon, /sys/zuse.hoon, and /sys/lull.hoon. We explain how subject search and limb resolution work.
Subject-Oriented Programming
Subject Search and Limb Resolution
A face is a label for an axis in a tree. The main use for a face is to label a slot (axis) within a noun of a corresponding type. Without faces, you would have to refer to all data by numeric axis. Faces are a Hoon convention, and Nock knows nothing about labels or faces. These values are stripped out of the Nock result. In fact, it's possible but cumbersome to construct Hoon programs without labels.
From a $type perspective, a face results from ^= kettis (foo=bar) modifying the enclosed expression to be wrapped in [%face %foo original-type].
> =/ a b=c=5
c.b.a
5
> =/ a b=c=5
([%face type] -:!>(a))
> =/ a b=c=5
([%face *] -:!>(a))
[1.701.011.814 98 1.701.011.814 99 1.836.020.833 25.717 0]
Read as cords, that raw noun is [%face %b %face %c %atom %ud ~].
An arm name is not the same thing as a face. +add is the name of an arm in the standard subject. When that arm is fired, the result is the +add gate, which is then +slammed by swapping out the sample with an argument and firing the $ arm.
Wings are expressions that compile to an axis. A $wing is a (list limb), or basically a path to a value in the subject. We can compose wings:
Relatively, using lark syntax (+>).
Absolutely, using numeric syntax (+6, &6, |6).
By name, using faces and arm names.
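As a minimal sketch of the three addressing styles, the Dojo session below pins an illustrative noun under the face t (a name chosen here, not from the lesson) and retrieves parts of it by lark, by axis, and by tuple position. The & and | forms count tuple elements rather than raw axes: &2 is axis 6 and |2 is axis 7.

```hoon
> =t [1 [2 3] 4]
> -:t
1
> +:t
[[2 3] 4]
> +<:t
[2 3]
> +>:t
4
> +6:t
[2 3]
> &2:t
[2 3]
> |2:t
4
```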
Wings are parsed by +rope (+ven for lark syntax; +lus, +pam, and +bar for numeric syntax). . dot wing resolution is conducted by the +ax:musk door in the Hoon compiler (invoked by +open:ap). This resolves a wing against a sample-supplied subject. Wings resolve depth-first (in other words, from the outermost “closest” match towards the inner cores).
^ ket skips a match: each leading ^ on a name skips one more matching face or arm. In the compiler, this is stored on the limb as a number of skips.
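A short Dojo sketch of skipping, using a deliberately shadowed face a (an illustrative name): the bare name finds the closest binding, and ^a skips it to reach the earlier one.

```hoon
> =/(a 1 =/(a 2 [a ^a]))
[2 1]
```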
. dot is Hoon-native syntax (not rune sugar) for a wing resolution search path. : col is a shorthand for => tisgar, and generally results in a longer Hoon AST than the . dot expression would. (The Nock formula may well come out the same.)
Since the subject of a core is the core itself, ..add resolves to the core containing +add (which is Layer 1) and thus exposes mutual visibility between all arms in the core.
We have made some noise in the past about arms and legs. With everything under your belt at this point, you are equipped to really understand the difference:
A leg is a noun accessible in the current subject using a Nock Zero call. Thus a value like =/ pi .3.1415926 would be a leg.
An arm is a noun which requires a Nock Nine call. Thus ++ pi .3.1415926 would be an arm even though it is an atom simpliciter.
When the compiler dereferences a limb, it either finds an arm (in the battery of a core) or a leg (anywhere else). An arm must be computed against the whole core (Nock Nine), while a leg is simply retrieved (Nock Zero).
Arms are only pulled by name. If you retrieve them by axis or lark syntax then they are treated as raw nouns. The name of an arm is not a face.
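The leg/arm distinction can be seen directly in the Nock that != zaptis produces. This is a sketch against two small illustrative subjects: a cell with faces a and b, and a trap whose only arm is $.

```hoon
> =>([a=1 b=2] !=(b))
[0 3]
> =>(|.(.3.14) !=($))
[9 2 0 1]
```

The first formula fetches axis 3 of the subject (a leg). The second pulls the arm at axis 2 of the core found at axis 1 (an arm), even though the arm simply produces an atom.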
Structure Mode
Most Hoon is written in value mode, meaning that sugar syntax like [] resolves to a : col family rune. However, $spec values are written in structure mode.
The Hoon parser can be switched from one to the other using a leading , com.
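As a small sketch of the mode switch: in value mode [@ @] is a cell of two molds, whereas the leading , com makes the parser read it as a single structure, which can then be called as a mold.

```hoon
> (,[@ @] [1 2])
[1 2]
```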
The root type of a structure mode quantity is a $spec:
A spec produces a mold, thus a %core with a $ arm, rather than e.g. a %cell.
Cores
In the AST, a %core consists of a lot of information about the behavior of various components.
The core variance is repeated because of the dry/wet gate distinction. Core variance starts to make more sense once you've popped the cover off of cores this way.
“Suppose this core was actually compiled using the modified payload instead of the one it was originally built with? Would the Nock formula we generated for the original template actually work for the modified payload?”
What we're saying, in other words, is that if you produced Nock using a wet gate via more than one input, would that Nock end up the same? If so, then for a wet gate it's valid. Wetness is handled at three points in the compiler:
+hemp dispatches the Nock formula generation slightly differently, turning off vet (sample nesting) in +mint:ut.
+mint:ut when it builds |@ barpat wet doors and |* bartar wet gates.
+dext:crop:ar when %core types are handled, enforcing the condition that for %wet gates =(q.r.q.sut q.r.q.ref), that the formula results are the same.
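To make the dry/wet difference concrete, here is a minimal Dojo sketch with two illustrative gates (the names dry and wet are chosen here). Both have the same body and generate interchangeable Nock, but the wet gate is type-checked against the actual argument, so the aura of 'a' survives.

```hoon
> =dry |=([a=* b=*] [b a])
> (dry 1 'a')
[97 1]
> =wet |*([a=* b=*] [b a])
> (wet 1 'a')
['a' 1]
```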
Variance matters when comparing structural nesting. For instance, the main Gall agent type should permit checking the type of the door since it will be used as examples for building actual agent cores, but should not be reliant on things like the sample. Thus in /sys/lull, +agent is marked as %iron using ^| ketbar.
In /sys/lull, several shared representations like vane interfaces and +http are marked as %lead using ^? ketwut. Bivariance here permits any kind of nesting, useful for examples for types.
In +deem:nest:ut we can see how the %read/%rite permissions are directly set.
Likewise in +peel:ut:
Those permission sets are the ones actually used in core behavior checks. In kernelspace, you are not strictly limited by the core type system—but you will have to manually construct handlers for other wetness/metallic behaviors and extend things to get the behavior you are aiming for.
Aside: Constructing Gates
Gates are special $-armed instances of doors. It's interesting to see how that particular sausage is made in +mint:ut:
Arvo-Supplied Values
Arvo values such as our, eny, and now are simply supplied at axes in the subject (rather than being scries). (This is why they must be explicitly provided for in generators.) Compare the following Nock results on a fakeship ~zod.
In the latter, our refers to a slot in the subject which needs to be looked up (at 12) and replaced into the final evaluated noun.
our is at 12
now is at 26
eny is at 27
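A sketch of such a comparison in Dojo on a fakeship ~zod follows. The first formula is a Nock One constant; the second uses the axis for our reported above, which is an assumption about the Dojo subject layout and may differ between versions.

```hoon
> !=(~zod)
[1 0]
> !=(our)
[0 12]
```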
Dynamic Dispatch
Static dispatch (or early binding) happens when I know at compile time which function body will be executed when I call a method. In contrast, dynamic dispatch (or run-time dispatch or virtual method call or late binding) happens when I defer that decision to run time.
The conventional behavior of Urbit's Hoon language is to statically dispatch against limb labels known at compile time. It's somewhat difficult to get around this in userspace; for instance, to retrieve a list of faces in a core and selectively run against those that exist. Why? As we showed a moment ago, a face or an arm name is a compile-time construct that stands in for an axis in the subject.
With a subject and the slap/slop algebra, we can effect dynamic (runtime) dispatch for an interactive interface via slam. For instance, Dojo does this for every input. (Cf. ll. 530–539 in /app/dojo.hoon.)
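A minimal sketch of this kind of runtime dispatch in Dojo: the arm to run is named by a cord held in an illustrative face name, resolved against the current subject with +slap, and called with +slam. Nothing about which gate runs is fixed when the second line is compiled.

```hoon
> =name 'mul'
> !<(@ (slam (slap !>(.) (ream name)) !>([2 3])))
6
> =name 'add'
> !<(@ (slam (slap !>(.) (ream name)) !>([2 3])))
5
```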
-test Thread
The /ted/test thread invokes arms beginning with test in the context of the subject provided in the core. How does it do this?
Kelvin Versioning
The innermost core of Hoon is the root marker for the language version. Not every part of a system should be subject to kelvin versioning: userspace generally will not be, and even in the %base desk many portions will not be subject to kelvin versioning.
Deep, onion-like layering is essential. A thin layer has no room to grow. A good example of this principle is the difference between Urbit and Lisp machines. Both Nock and Lisp are very simple axiomatic definitions of computing. But practical Lisp systems expand by extending the model, whereas Urbit layers over a frozen axiom system. — ~sorreg-namtyv, ~ravmel-ropdyl, “Towards a Frozen Operating System”
The parts of the system subject to kelvin versioning are:
Nock, %4 (liquid helium, 4.15 K).
Hoon, %137 (about liquid krypton, 115.8 K). (%140 in December 2020).
Arvo, %236 (about liquid mercury, 234.4 K). (%240 in December 2020).
Lull, %322 (about gaseous water, 373.1 K). (%330 in December 2020).
Zuse, %410, which in a sense represents the most important kelvin for userspace developers since it's what they peg releases against. (%420 in December 2020.)
What parts are subject to kelvin versioning? Essentially, the things we see as platform: as you can see, the language, the event handler, and parts of the standard library.
What results in a kelvin change? Not every release, even a change in a system file, motivates a kelvin decrement. The rule of thumb is that something which changes the specification of the platform burns a kelvin. In practice, although there are many kelvins yet to burn, it is more straightforward to bundle breaking changes together. This is both frugal of platform changes and generous to userspace developers.
Formally, /? faswut is used to pin a version number; in practice, it is not enforced at the compiler level.
Telescoping Kelvins
The rules of telescoping are simple:
If tool B sits on platform A, either both A and B must be at absolute zero, or B must be warmer than A.
Whenever the temperature of A (the platform) declines, the temperature of B (the tool) must also decline.
B must state the version of A it was developed against. A, when loading B, must state its own current version, and the warmest version of itself with which it's backward-compatible. — ~sorreg-namtyv, ~ravmel-ropdyl, “Towards a Frozen Operating System”
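The first rule can be written down as a small predicate. This is only an illustrative sketch: the gate name fits is hypothetical and is not part of the system, with a as the platform kelvin and b as the tool kelvin.

```hoon
> =fits |=([a=@ud b=@ud] ?|(&(=(0 a) =(0 b)) (gth b a)))
> (fits 4 139)
%.y
> (fits 139 4)
%.n
> (fits 0 0)
%.y
```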
Thus if you introduced a tool into kernelspace which relies on Nock alone, you could version it at anything above 4. If it relies on Hoon, then it should be above 139. And preferably a fair bit above—fat onion rings are tastier than paper-thin ones.
The Structure of Kernelspace
The kernel is constructed of nested cores from the innermost /sys/hoon definitions out to /sys/zuse. All of userspace runs outside of these cores.
Although Arvo (ca03) is the operational core of Urbit, we actually require a boot process (see the boot lesson) building on the definition of Hoon itself. Thus we begin with hoon.hoon, zuse.hoon, and lull.hoon today before proceeding into Arvo proper.
Core 0
The first core consists of the Hoon version tag, currently %139. Since there are no documentation references to this core, we call it 0, the “version stub”.
This resolves down to ++ hoon-version %139 in a circuitous way.
The ~% sigcen tag starts a jet registration tree. Unlike other jet registrations we have seen and will see later, this one is the root jet registration, meaning it has no parent and exports no named formulas, although it contains all of hoon.hoon.
Since we refer to a “parent core” and imply a “child core”, we need to clarify something a bit counterintuitive about Urbit's subject-oriented nature. We say that the child core contains the parent core, through its context; and we refer to the parent core as the “inner” core, the child being “outer”. In fact, the parent/inner core is a leg in the child/outer core.
Compare the expansion of ~/ sigfas: ~%(p +7 ~ q).
Thus, all things considered, Core 0 layer-0 is the innermost core of all of Urbit. It appears at the rightmost side when the prettyprinter shows a core:
Core 1
The next core contains arithmetic. Since each core can only access limbs present in its payload (+3), and in particular its context (+7), each core builds outwards on its predecessors, in this case on a foundation of straightforward integer arithmetic.
The next block is for binary tree calculations:
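As a brief sketch of what these axis tools do: +peg composes axes, +cap reports whether an axis lies in the head (%2) or tail (%3), and +mas gives the axis relative to that head or tail.

```hoon
> (peg 3 2)
6
> (cap 6)
%3
> (mas 6)
2
```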
Then we have some standard definitions of types and values for mold building and handling types like units.
Several of these, like trel and qual, are hardly used even in hoon.hoon but standardize named faces.
We particularly draw your attention to pole, which is a faceless list. This has more recently shown up in contexts where it is helpful to replace supplied faces with your own, as in +on-peek ?+ wutlus statements.
each allows you to discriminate between values on type using a flag. (This is useful when returning structures out of a parser for instance, like (each manx marl) where manx is a structure and marl is a list.)
Core 2
Many practical tools live in layer-2, including functional tools, maps, sets, list operators, and string and formatted text operators.
Unit Logic
First up, the unit tools. While units are often just stripped off in userspace, there is a full-featured algebra handling units. (I'm of the opinion that these are probably underutilized because it can be hard to reason correctly with units.)
In particular, check out the definitions of +biff, +bond, +flit, and +lift, which apply wet gates and deferred traps.
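A few of the simpler unit tools in Dojo, as a sketch of the algebra (the inline gate is illustrative): +bind maps a gate over a unit, +fall supplies a default, and +need unwraps or crashes.

```hoon
> (bind `5 |=(a=@ +(a)))
[~ u=6]
> (fall ~ 7)
7
> (need `5)
5
```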
List Logic
You are likely familiar with all of these except +lure, which is a list builder that's unused in the system.
Note that +sort (quicksort) turns off the stack trace because feedback from such a crash is liable to be a mess.
Optional exercise: Implement another sorting algorithm besides quicksort in Hoon applicable to lists.
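For reference when comparing your own implementation, +sort takes a list and a comparison gate; a minimal Dojo call looks like this.

```hoon
> (sort `(list @)`~[3 1 2] lth)
~[1 2 3]
```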
Bitwise Logic
These provide bitwise operators for atoms.
+fe is barely used but seems like it could be used to organize some of the logic.
Optional exercise: Produce +rip and +sew.
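A short Dojo sketch of the bitwise tools at bloq size 3 (bytes), using the cord 'abc' as an illustrative atom: +rip disassembles, +rep reassembles, +met measures, and +cat concatenates.

```hoon
> (rip 3 'abc')
~[97 98 99]
> `@t`(rep 3 ~[97 98 99])
'abc'
> (met 3 'abc')
3
> `@t`(cat 3 'ab' 'cd')
'abcd'
```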
Insecure Hashing
These provide simple hashing and ordering algorithms.
The Murmur3 algorithm is a non-cryptographic hash function. +muk implements the 32-bit version. In pseudocode from Wikipedia:
Unsigned Powers
Container Logic
Jars and jugs seem oddly specific, and are only invoked in a couple of special contexts in the base distribution.
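As a sketch of how these containers are driven, the session below builds an illustrative map m and jug j (names chosen here) and queries them through the +by and +ju doors.

```hoon
> =m `(map @tas @ud)`(my ~[[%a 1] [%b 2]])
> (~(get by m) %a)
[~ 1]
> (~(has by m) %c)
%.n
> =j (~(put ju *(jug @tas @ud)) %a 1)
> (~(has ju j) %a 1)
%.y
```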
We don't see +ly used often. What is the difference in these constructed lists?
+ly uses the crash type for an empty list.
Serialization
+jam and +cue are critically important for noun communication operations.
Here is an annotated version of +jam. The basic idea is to produce a serial noun (in order of head/tail):
One bit marks cell or atom.
Next entry marks bit length of value.
Then the actual value.
(+cue reads the length prefix in unary: reading from the low bit, it counts 0 bits until the first 1, which gives the bit width of the length field.)
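A few small values make the encoding visible. In this Dojo sketch, read the binary form from the low bit: for 1, a 0 tag bit for an atom, then the unary-prefixed length, then the value. The cell case shows that +cue inverts +jam.

```hoon
> (jam 0)
2
> (jam 1)
12
> `@ub`(jam 1)
0b1100
> (jam [1 1])
817
> (cue 817)
[1 1]
```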
Functional programming combinators:
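A minimal sketch of two of these combinators in Dojo: +cury fixes the first argument of a two-argument gate, and +curr fixes the second.

```hoon
> ((cury add 1) 2)
3
> ((curr sub 1) 5)
4
> (turn `(list @)`~[1 2 3] (cury add 1))
~[2 3 4]
```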
Various Type Definitions
Floating-point structs
Paths
Strings
Core 3
Core 4
If you encounter a biblical name ($abel, $onan, etc.) then you're in the prettyprinter.
Core 5
Parsing and Compiler
Sail and XML Parsing
Compiler
Core 6
Hoon is the root of the whole system—you cannot parse and build Arvo or anything else without these definitions. As part of ca01, you examined how $hoon types are built and how the AST is implemented for a basic rune.
Outside of the language-necessary components, the %lull core provides kernel-wide structures (essentially, a header file) and the %zuse core provides a kernel-appropriate standard library. %zuse organizes its cores into what it terms “engines”.
%lull
Models
+on provides the services for +mop ordered maps.
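As a hedged sketch of typical usage, assuming the +on and +mop interfaces found in recent kernels: build the operations core once for a key mold, value mold, and ordering gate, then use its arms on the tree. The faces orm and m are illustrative.

```hoon
> =orm ((on @ud @t) lth)
> =m (gas:orm *((mop @ud @t) lth) ~[[1 'a'] [2 'b']])
> (get:orm m 1)
[~ 'a']
```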
Various types, notably structured text support.
Common structures: Ethereum, Azimuth, HTTP.
Networking (Ames)
After this point, %lull defines types and interfaces for interacting with vanes. We will skip lightly over these, but come back to them in the appropriate lessons.
Timekeeping (Behn)
Versioning (Clay)
Console (Dill)
HTTP Server (Eyre)
Extensions (Gall)
HTTP Client (Iris)
Security (Jael)
Threads (Khan)
IPC (Lick)
Computation
Various definitions for cards, strands, and moves.
%zuse
Cryptography
One of the most important components of %zuse is the crypto library. This supplies modular arithmetic (+fu) and several specific algorithms. (In general, signed arithmetic in Urbit uses different names (like +sum instead of +add) to prevent accidental confusion.) Some significant portions of this include:
Notes on the above:
Modular arithmetic is used in public key systems.
Curve25519 is elliptic-curve cryptography.
Advanced encryption standard (AES) is a cipher for data encryption. There are a lot of modes available in +aes.
KECCAK is a cryptographic family underlying SHA-3.
A Schnorr signature is a digital signature scheme.
BLAKE is a cryptographic hash function like KECCAK.
Argon2 is a key derivation function.
RIPEMD cryptographic hash functions are used in Bitcoin.
Hash-based message authentication codes (HMAC) are used in shared-secret key exchange based on the SHA-2 cryptographic hash algorithms, of which several are made available. (Variants of these also live in the password-based key derivation function (PBKDF) arm.)
Units
After cryptography, there are a number of library utility functions.
I don't see units used a lot outside of standard functions, but there are some convenience operators for them:
Formatting Text and JSON Reparsing
+enjs:format supports noun-to-JSON conversions.
+dejs:format are the reparsers (see +de:json:html for the parser). Notably this is where many noun-to-text converters live.
+dejs-soft offers non-crashing versions of the +dejs arms (thus, returning units).
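A small Dojo sketch of the reparsers acting on already-parsed $json values: the +dejs arms crash on a mismatch, while the +dejs-soft equivalents return a unit.

```hoon
> (ni:dejs:format n+'42')
42
> (so:dejs:format s+'hi')
'hi'
> (ni:dejs-soft:format s+'hi')
~
```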
Diffs
Diff tools, using the Hunt-McIlroy algorithm:
Web Text (HTML &c.)
More JSON, this time the parser:
+json:html tools are parsers (see +dejs:format for the reparsers).
Since JSON does not have a single canonical form as text, these parsers provide an opportunity to see how to parse something structurally when whitespace doesn't matter.
Wires
Identity
Retrieve your mathematical sponsor, convert a number to a rank, etc.
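A sketch of these in Dojo, assuming the +title layout in which the mathematical +sein sits in an inner core and is reached with a ^ skip past the scrying version:

```hoon
> (clan:title ~zod)
%czar
> (clan:title ~marzod)
%king
> (clan:title ~sampel-palnet)
%duke
> (^sein:title ~marzod)
~zod
```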
Millisecond Timing
Some time-related tools (currently used for timing in Eyre):
Userlib
More userspace stuff. (At this point, %zuse is a grab bag of things that people have added over the years, and it's not clear who needs what or if it's even in contemporary use.)
+chrono:userlib provides tools to print and parse basic UTC time statements.
(Overheard memo to self: work this into whatever /lib/chronos becomes.)
+space:userlib is used by the Hood tools like |mv.
The Compilation Subject
Ford uses %zuse (thus the full standard library) as the compilation subject for a hoon file. Typically a userspace file will produce a core (or, in the case of some generators, a head tag and a core). That core will contain the standard library in its context because | bar runes (the only runes that produce cores) return cores containing the original subject in their payloads.
Ford also allows you to modify the compilation subject by imports. (This is why you have to import files at the top of a hoon file, and why you do it in a particular order.) Each import is prepended to the compilation subject, so in general your compilation subject will look like [lib1 lib2 sur1 sur2 zuse].
You can see this process in +run-prelude:ford in /sys/vane/clay. (/ fas Ford runes are actually parsed in +parse-pile using +pile-rule. This is also where /? faswut is ignored.) In ca10 we'll take a deep dive through Clay.
/lib/tiny
The whole standard library is included in every piece of userspace Hoon, unless you go out of your way to remove it from the subject. This is only rarely a good idea, but you *can* build a small working Hoon against a minimalist subject. For instance, this is done for the naïve rollup smart contract code and Sword (née Ares) development using /lib/tiny.
Exercises
Implement a custom aura, @uo (octal/byte encoding). At one level, simply implementing an aura requires no overhead. However, the aura must have a unique parsed format for input, and should have a corresponding output. (The rules around this are laxer for more complex nouns like sets and trees.) One format which would be compatible with the restrictions on atom syntax as well as not shadow any current atom types is 0o1234.5670 (89abcdef are not valid characters in octal). You can model heavily on @ux to implement this aura. (A tutorial is available for a degree–minute–second implementation which you can use as a guide.)
+sloe is intended to receive a $type and return a list of the named arms in that type. Modeling on +sloe, produce a gate +beau which retrieves each face in the sample of a supplied gate argument and produces a list of them.
Hint: