13. Vanes VI: Khan, Lick

Khan and Lick are both interprocess communications vanes with slightly different philosophies. We will also discuss conn.c. In brief:

%khan is a high-level thread interface, useful to both Arvo and external clients.
%lick is a low-level noun interface for domain sockets, treating Arvo as a server and earth software as a client.
conn.c provides full administrative control over Arvo and Vere.

Khan & Threads

Khan allows threads to be triggered from outside of Urbit. To start off with, what exactly is a thread?

A thread is a monadic function that takes arguments and produces a result. It may perform input and output while running, so it is not a pure function. A thread's strength is that it can easily perform complex IO operations. It uses what's often called the IO monad (plus the exception monad) to provide a natural framework for IO. A thread's weakness is that it's impermanent and may fail unexpectedly. In most of its intermediate states, it expects only a small number of events (usually one), so if it receives anything it didn't expect, it fails. When code is upgraded, it's impossible to upgrade a running thread, so it fails.

Threads can be run using Gall's %spider agent or Khan.

`%spider`: Threads Before Khan

Arvo is an event handler for OS-level moves for vanes. Gall is an event handler for agent-level moves. Spider is an agent for transient thread-level operations.

Thread Definition

Ultimately, a +$thread is a gate which accepts a vase and returns the form of a strand that produces a vase. In other words, the +$thread doesn't (just) produce a result, it produces a strand that takes input and produces output from which a result can be extracted. This allows threads to chain friable computations together until a %thread-done is produced.

+$  thread  $-(vase _*form:(strand ,vase))

++form is the mold of the strand. It weaves together the notions of input and output thus:

++  form  (strand-form-raw a)
++  strand-form-raw
  |*  a=mold
  $-(strand-input (strand-output-raw a))
+$  strand-input  [=bowl in=(unit input)]
++  strand-output-raw
  |*  a=mold
  $~  [~ %done *a]
  $:  cards=(list card)
      $=  next
      $%  [%wait ~]
          [%skip ~]
          [%cont self=(strand-form-raw a)]
          [%fail err=(pair term tang)]
          [%done value=a]
      ==
  ==

++strand is more complicated. It's a wet gate producing a core from a mold: an “asynchronous transaction mold”, which is basically a union of four different monads:

Reader for input.
Writer for cards.
Continuation for callbacks.
Exception.

This gate also produces a number of critical handlers, such as:

++form is the main type of a strand computation.
++pure is an identity computation, useful for binding.
++bind is a combination of two computations.
++eval maintains the monadic nature of the computation.

A simple thread (like /ted/code) is just a wrapper for some check like a scry:

/-  spider
/+  strandio
=,  strand=strand:spider
^-  thread:spider
|=  arg=vase
=/  m  (strand ,vase)
^-  form:m
;<  =bowl:spider  bind:m  get-bowl:strandio
;<  code=@p  bind:m  (scry:strandio @p /j/code/(scot %p our.bowl))
(pure:m !>(code))

Reread this thread with fresh eyes, paying attention to the types involved.

;< micgal serves to permit a sequence of computations in which each one depends on the output of the previous one.

;<  mold  bind  expr1  expr2

which desugars to:

%+  (bind mold)
  expr1
|=  mold
expr2

;< can be used to glue a pipeline together to run an asynchronous function or event. This can be helpful when deferring parts of a computation based on external data.
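The sequencing that ;< provides can be modeled outside Hoon. The following is a toy Python sketch, not the strand implementation: a computation is modeled as a zero-argument function returning ("done", value) or ("fail", error), and the names pure, bind, fail, pipeline, and broken are invented for this illustration.

```python
from typing import Any, Callable, Tuple

# Toy model: a computation is a thunk returning ("done", value) or ("fail", err).
Result = Tuple[str, Any]
Computation = Callable[[], Result]


def pure(value: Any) -> Computation:
    """Wrap a plain value as a computation that immediately succeeds."""
    return lambda: ("done", value)


def fail(err: str) -> Computation:
    """A computation that immediately fails."""
    return lambda: ("fail", err)


def bind(first: Computation, then: Callable[[Any], Computation]) -> Computation:
    """Run `first`; on success pass its value to `then`, on failure stop."""
    def combined() -> Result:
        tag, payload = first()
        if tag == "fail":
            return (tag, payload)   # later steps never run
        return then(payload)()      # the next step sees the previous value
    return combined


# Each lambda plays the role of the gate that ;< builds around the next
# expression: its argument is the name bound to the previous result.
pipeline = bind(pure(20), lambda a: bind(pure(a + 1), lambda b: pure(b * 2)))
broken = bind(fail("no-input"), lambda a: pure(a + 1))
```

Calling pipeline() threads 20 through both steps, while broken() stops at the first failure without running the later gate. A real strand additionally carries input, cards, and continuations, which this sketch leaves out.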

The main-loop pattern offers a way to supply a list of functions to try a value against, and seems like an interesting way of handling an arbitrary number of %facts.

Threads can trigger daughter threads. ++handle-start-thread does this by modifying Spider's thread yarn, but to follow that we need to look at Spider itself.

`/app/spider`

/app/spider tracks threads at the highest level using a “spider core” and a state manager.

+$  card         card:agent:gall
+$  thread       $-(vase shed:khan)
+$  tid          @tatid
+$  input        [=tid =cage]
+$  yarn         (list tid)
+$  thread-form  _*eval-form:eval:(strand ,vase)
+$  trying       ?(%build %none)
::
+$  state
  $:  starting=(map yarn [=trying =vase])
      running=(axal thread-form)
      tid=(map tid yarn)
      serving=(map tid [(unit @ta) =mark =desk])
      scrying=(jug tid [=wire =ship =path])
  ==

Fundamentally, Spider's state records threads that are starting up (keyed by yarn, a list of thread IDs, together with their startup state), threads currently in progress, and bookkeeping for HTTP requests and remote scries.

starting is the collection of threads pending successful execution.
running contains currently-running threads identified by path.
tid maps each thread ID to its yarn, which is how child threads are tracked.
serving has to do with the HTTP API for threads, allowing you to use the Urbit ship like a function-as-a-service server.
scrying is a map of sets of remote scries.

Threads use a set of internal mark conventions (notably %thread-done and %thread-fail).

++  strand-output-raw
  |*  a=mold
  $~  [~ %done *a]
  $:  cards=(list card)
      $=  next
      $%  [%wait ~]
          [%skip ~]
          [%cont self=(strand-form-raw a)]
          [%fail err=(pair term tang)]
          [%done value=a]
      ==
  ==

cards is the set of cards to dispatch immediately.
%wait means to not move on but to stay awaiting a callback.
%skip drops the input because it should be handled elsewhere.
%cont means to continue the computation from a new callback.
%fail aborts a computation and doesn't send effects; Spider reports it to subscribers with the %thread-fail mark.
%done finishes a computation and sends effects; Spider reports the result with the %thread-done mark.
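To make the role of each next tag concrete, here is a toy driver loop in Python. It is a simplified model under stated assumptions, not Spider's or /lib/strand's actual logic: a form is modeled as a function from an optional input to (cards, next), cards are plain strings, and run, ping, and await_pong are hypothetical names.

```python
from typing import Any, Callable, List, Optional, Tuple

# Toy model of a strand form: input -> (cards, next), with `next` tagged
# "wait", "skip", "cont", "fail", or "done" like the mold above.
Next = Tuple[Any, ...]
Form = Callable[[Optional[Any]], Tuple[List[str], Next]]


def run(form: Form, inputs: List[Any]) -> Tuple[str, Any, List[str]]:
    """Step `form` over `inputs`; return (status, payload, cards sent)."""
    sent: List[str] = []
    pending: List[Optional[Any]] = [None] + list(inputs)  # first step has no input
    for inp in pending:
        while True:
            cards, nxt = form(inp)
            tag = nxt[0]
            if tag == "fail":
                return ("fail", nxt[1], sent)  # this step's cards are dropped
            sent.extend(cards)
            if tag == "done":
                return ("done", nxt[1], sent)
            if tag == "cont":
                form, inp = nxt[1], None       # carry on with the new form
                continue
            break                              # wait or skip: keep the same form
    return ("waiting", None, sent)


def ping(inp: Optional[Any]) -> Tuple[List[str], Next]:
    """Emit one card, then continue as a form that waits for the reply."""
    return (["ping"], ("cont", await_pong))


def await_pong(inp: Optional[Any]) -> Tuple[List[str], Next]:
    if inp is None:
        return ([], ("wait",))      # nothing has arrived yet
    if inp != "pong":
        return ([], ("skip",))      # not ours; ignore it and keep waiting
    return (["log"], ("done", "ok"))
```

Feeding run(ping, ["noise", "pong"]) walks through cont, wait, skip, and done in that order; with only ["noise"] the model is left waiting, which mirrors a thread that never receives the event it expects.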

The %spider-helper core has all the logic to handle HTTP, start and conclude threads, build code, handle input, etc. For instance:

++  thread-done
  |=  [=yarn =vase silent=?]
  ^-  (quip card ^state)
  ::  %-  (slog leaf+"strand {<yarn>} finished" (sell vase) ~)
  =/  =tid  (yarn-to-tid yarn)
  =/  done-cards=(list card)
    :~  [%give %fact ~[/thread-result/[tid]] %thread-done vase]
        [%give %kick ~[/thread-result/[tid]] ~]
    ==
  =^  http-cards  state
    (thread-http-response tid vase)
  =^  scry-card  state  (cancel-scry tid silent)
  =^  cards      state  (thread-clean yarn)
  [:(weld done-cards cards http-cards scry-card) state]

Note that this helper core is not a proper ++abet core.

Spider supports a few auxiliary scries to monitor thread state, such as the set of currently running threads:

.^((list path) %gx /=spider=/tree/noun)

You can only subscribe to Spider for thread results.

Look at /lib/strand. What surprises you?
“Spider API”

A New Interface

Khan is the "control plane" and thread-runner vane. Its main purpose is to allow external applications to run threads via a Unix socket and receive the result.

Khan was conceived as a way to control Urbit ships from the exterior using threads. The concept evolved a fair bit from proposal to implementation. In practice, Khan is essentially an interface wrapper for Spider-based threads, which produces a somewhat strange (but not unprecedented) situation in which a vane relies on a piece of userspace infrastructure to function correctly.

Khan can be internally invoked (using a cage) or externally invoked (using a page).

`/sys/lull` Definition

::                                                      ::::
::::                    ++khan                            ::  (1i) threads
  ::                                                    ::::
++  khan  ^?
  |%
  +$  gift                                              ::  out result <-$
    $%  [%arow p=(avow cage)]                           ::  in-arvo result
        [%avow p=(avow page)]                           ::  external result
    ==                                                  ::
  +$  task                                              ::  in request ->$
    $~  [%vega ~]                                       ::
    $%  $>(%born vane-task)                             ::  new unix process
        [%done ~]                                       ::  socket closed
        [%fard p=(fyrd cage)]                           ::  in-arvo thread
        [%fyrd p=(fyrd cast)]                           ::  external thread
        [%lard =bear =shed]                             ::  inline thread
        $>(%trim vane-task)                             ::  trim state
        $>(%vega vane-task)                             ::  report upgrade
    ==                                                  ::
  ::                                                    ::
  ++  avow  |$  [a]  (each a goof)                      ::  $fyrd result
  +$  bear  $@(desk beak)                               ::  partial $beak
  +$  cast  (pair mark page)                            ::  output mark + input
  ++  fyrd  |$  [a]  [=bear name=term args=a]           ::  thread run request
  ::                                                    ::
  +$  shed  _*form:(strand:rand ,vase)                  ::  compute vase
  --  ::khan

“Khan”

While %khan hasn't been thoroughly documented yet (we expect some minor API changes, such as the more recent addition of inline thread invocation), there are examples of its use in ~midsum-salrux's Tendiebot price bot and Faux Urbit–Discord bridge.

The basic conceit of Khan is that it instruments three ways to run a thread:

%fard runs a thread from within Arvo directly.
%fyrd runs a thread from outside Arvo (over a connexion with the runtime).
%lard runs an inline thread (rather than from /ted).

A %fard has the form:

:*  %pass
    /path-name        ::  path
    %arvo  %k  %fard  ::  Arvo vane and %khan mode
    %namespace        ::  desk (or beak)
    %thread-name      ::  /ted/thread-name.hoon
    %noun             ::  mark (always %noun for now)
    !>  :*            ::  thread arguments:
      bowl            ::    bowl (entropy etc.)
      other-info      ::    other arguments for thread
    ==
==

A %lard has the form:

=strandio -build-file %/lib/strandio/hoon
=sh |=  message=@t
    =/  m  (strand:rand ,vase)
    ;<  ~  bind:m  (poke:strandio [our %hood] %helm-hi !>('hi'))
    ;<  ~  bind:m  (poke:strandio [our %hood] %helm-hi !>(message))
    (pure:m !>('product'))
|pass [%k %lard %base (sh 'the message')]

Since /sys/vane/khan is a vane, you receive its gifts in ++on-arvo.

[%arow p=(avow cage)] is received in userspace. Note that it is a cage, or a pair of mark and vase.
[%avow p=(avow page)] can only be received by an external process. It is a page, or a pair of mark and (unvased) data.

Compare Spider and Khan:

:_  this
:~  [%pass /thread/[ta-now] %agent [our.bowl %spider] %watch /thread-result/[tid]]
    [%pass /thread/[ta-now] %agent [our.bowl %spider] %poke %spider-start !>([~ `tid byk.bowl %foo !>(~)])]
==
::
:_  this
:~  [%pass /thread/[ta-now] %arvo %k %fard q.byk.bowl %foo %noun !>([bowl ~])]
==

As a vane, /sys/vane/khan is almost as simple as a vane can be: it simply ++calls tasks and ++takes gifts from Spider to dispatch back to its caller.

Khan currently supports no scries.

Speculatively, I believe that producing an improved CLI predicated on thread execution is feasible today on Urbit. Imagine a context which can dispatch moves either batched or singly, and queue return cards for processing.

In fact, although the vane evolved from its initial conception, Khan was originally proposed under the theory that pre-written threads would be the easiest way to bundle, distribute, and manage scripts for hosting and maintenance.

Lick

Although it also deals with interprocess communication, Lick was designed for a very different scenario than Khan: to allow external processes, in particular hardware drivers, to intercommunicate with Urbit. (This breached the Earth/Mars divide.) Thus /sys/vane/lick focuses on instrumenting a low-level noun interface over domain sockets.

Lick manages IPC ports, and the communication between Urbit applications and POSIX applications via these ports. Other vanes and applications ask Lick to open an IPC port, notify it when something is connected or disconnected, and transfer data between itself and the Unix application.

Lick works by opening a Unix socket for a particular process, which allows serialized IPC. Messages are jammed nouns, so the receiving process needs to know how to communicate in nouns.

The IPC ports Lick creates are Unix domain sockets (AF_UNIX address family) of the SOCK_STREAM type.

The connexions are made via filepaths in .urb/dev of the pier.

The format is:

V.BBBB.JJJJ.JJJJ...

V version
B jam size in bytes (little endian)
J jammed noun (little endian)

The process on the host OS must therefore strip the first 5 bytes, ++cue the jamfile, check the mark and (most likely) convert the noun into a native data structure.
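The host-side steps just described can be sketched in Python. This is a minimal illustration under assumptions, not a reference client: it assumes the payload is a [mark noun] cell, represents atoms as ints and cells as 2-tuples, and uses hypothetical helper names (unframe, cue, cord, read_message). The recursive decoder would need reworking for very deep nouns.

```python
import struct
from typing import Dict, Tuple, Union

Noun = Union[int, Tuple["Noun", "Noun"]]


def unframe(frame: bytes) -> Tuple[int, bytes]:
    """Split off the 5-byte header; return (version, jam bytes)."""
    version, size = struct.unpack_from("<BI", frame, 0)
    payload = frame[5:5 + size]
    if len(payload) != size:
        raise ValueError("truncated frame")
    return version, payload


def cue(jammed: int) -> Noun:
    """Decode a jammed atom into a noun (atoms as ints, cells as 2-tuples)."""
    total = jammed.bit_length()
    refs: Dict[int, Noun] = {}

    def bit(pos: int) -> int:
        if pos >= total:
            raise ValueError("ran off the end of the jam")
        return (jammed >> pos) & 1

    def rub(pos: int) -> Tuple[int, int]:
        """Read a length-prefixed atom at bit `pos`; return (bits used, value)."""
        c = 0
        while bit(pos + c) == 0:
            c += 1
        if c == 0:
            return 1, 0
        low = (jammed >> (pos + c + 1)) & ((1 << (c - 1)) - 1)
        length = (1 << (c - 1)) | low
        value = (jammed >> (pos + 2 * c)) & ((1 << length) - 1)
        return 2 * c + length, value

    def go(pos: int) -> Tuple[int, Noun]:
        if bit(pos) == 0:                    # first tag bit 0: atom
            used, value = rub(pos + 1)
            refs[pos] = value
            return used + 1, value
        if bit(pos + 1) == 0:                # tag bits 1, 0: cell
            h_used, head = go(pos + 2)
            t_used, tail = go(pos + 2 + h_used)
            refs[pos] = (head, tail)
            return 2 + h_used + t_used, (head, tail)
        used, ref = rub(pos + 2)             # tag bits 1, 1: back-reference
        return 2 + used, refs[ref]

    return go(0)[1]


def cord(atom: int) -> str:
    """Render an atom as text (cords store their first byte lowest)."""
    return atom.to_bytes((atom.bit_length() + 7) // 8, "little").decode("utf-8")


def read_message(frame: bytes) -> Tuple[str, Noun]:
    """Strip the header, cue the payload, and split mark from data."""
    _version, payload = unframe(frame)
    mark, data = cue(int.from_bytes(payload, "little"))  # assumed [mark noun]
    return cord(mark), data
```

A client would call read_message on each complete frame read from the socket in .urb/dev and then dispatch on the returned mark before converting the data noun into a native structure.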

`/sys/lull` Definition

::                                              ::::
::::                    ++lick                    ::  (1j) IPC
  ::                                            ::::
++  lick  ^?
  |%
  +$  gift                                      ::  out result <-$
    $%  [%spin =name]                           ::  open an IPC port
        [%shut =name]                           ::  close an IPC port
        [%spit =name =mark =noun]               ::  spit a noun to the IPC port
        [%soak =name =mark =noun]               ::  soak a noun from the IPC port
    ==
  +$  task                                      ::  in request ->$
    $~  [%vega ~]                               ::
    $%  $>(%born vane-task)                     ::  new unix process
        $>(%trim vane-task)                     ::  trim state
        $>(%vega vane-task)                     ::  report upgrade
        [%spin =name]                           ::  open an IPC port
        [%shut =name]                           ::  close an IPC port
        [%spit =name =mark =noun]               ::  spit a noun to the IPC port
        [%soak =name =mark =noun]               ::  soak a noun from the IPC port
    ==
  ::
  +$  name  path
  --  ::lick

To evaluate what /sys/vane/lick is doing, we need to look at Unix's IPC model briefly. IPC (“interprocess communication”) describes any way that two processes in an operating system's shared context have to communicate with each other. Lick focuses on Unix domain sockets, which are just communication endpoints. For instance, a valid use of %lick would use cards that look like this:

++  init  [[%pass / %arvo %l %spin /control]~ this]
::
++  on-arvo
  |=  [=wire =sign-arvo]
  ?+  sign-arvo  (on-arvo:def wire sign-arvo)
      [%lick %soak *]
      ?+  mark.sign-arvo  [~ this]
      ::
        %connect
      ~&  >  "connect"
      :_  this  [%pass /spit %arvo %l %spit /control %init area.state]~
  ==  ==
::
++  send-state
  |=  =state
  ^-  card:agent:gall
  [%pass /spit %arvo %l %spit /control %state [slick:state face.state food.state live.state]]

~mopfel-winrux, %slick

The vane definition of /sys/vane/lick is even simpler than /sys/vane/khan: it has no ++abet core and primarily communicates to the unix-duct in its state. The owner is a duct to handle the return %soak.

Lick takes several scries:

Gall needs to wrap %soak and %spit to route properly. See e.g. ++ap-generic-take. This lets multiple agents share sockets with the same name, and each agent can have its own folder.

`vere/io/lick.c`

The runtime counterpart of /sys/vane/lick is contained in vere/io/lick.c, aside from its callback registration. As with other parts of the runtime event loop and callback system, the primary connexion is made using libuv, in this case via an instance of a uv_pipe_t descriptor.

_lick_ef_spit()
- _lick_send_noun()
- u3_newt_send()
_lick_sock_cb(), callback for connection from Earth.
_lick_moor_poke(), result of %soak from external process.

`conn.c`

conn.c is a driver in Vere. It is a part of the "King" (a.k.a. "Urth") process. It exposes a Unix domain socket at /path/to/pier/.urb/conn.sock for sending/receiving data from external processes. The point of conn.c is to provide administrative control over Arvo and Vere: read ephemeral or persistent state, enqueue events, and send arbitrary commands (pack, meld, mass, &c).

(conn.c is only loosely related to /sys/vane/khan. Its main connection is special-casing some inputs for Khan.)

conn.c accepts a newt-encoded ++jammed noun of the shape [request-id command arguments], where:

request-id is a client-supplied atomic identifier with type @. It exists entirely for the benefit of the client, allowing responses to be matched to requests. (The poor Earthling's wire.)
command is one of:
- %peek, namespace scry request into Arvo.
- %peel, emulated namespace scry request into Vere.
- %ovum, injection of a raw kernel move.
- %fyrd, direct shortcut to Khan command.
- %urth, subcommand to runtime to %pack or %meld.
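As a rough sketch of what a client has to produce, the following Python builds the bytes for one request. Several things here are assumptions rather than facts from conn.c: that newt framing uses the same version byte plus 4-byte little-endian size layout shown for Lick, that the version byte is 0, and that a %pack request has the shape [request-id %urth %pack]. The helpers term, jam, and frame are hypothetical names, and this jam omits back-references, which yields a larger but still decodable encoding.

```python
import struct
from typing import List, Tuple, Union

Noun = Union[int, Tuple["Noun", "Noun"]]


def term(text: str) -> int:
    """Encode text as an atom with its first byte lowest (e.g. %pack)."""
    return int.from_bytes(text.encode("utf-8"), "little")


def jam(noun: Noun) -> int:
    """Jam a noun without back-references (valid, though not minimal)."""
    bits: List[int] = []                       # least-significant bit first

    def mat(atom: int) -> None:
        if atom == 0:
            bits.append(1)
            return
        length = atom.bit_length()
        prefix = length.bit_length()
        bits.extend([0] * prefix + [1])        # unary size of the length field
        bits.extend((length >> i) & 1 for i in range(prefix - 1))
        bits.extend((atom >> i) & 1 for i in range(length))

    def go(n: Noun) -> None:
        if isinstance(n, tuple):
            bits.extend([1, 0])                # cell tag
            go(n[0])
            go(n[1])
        else:
            bits.append(0)                     # atom tag
            mat(n)

    go(noun)
    return sum(b << i for i, b in enumerate(bits))


def frame(noun: Noun) -> bytes:
    """Prefix the jammed noun with an assumed version byte and its size."""
    jammed = jam(noun)
    payload = jammed.to_bytes((jammed.bit_length() + 7) // 8, "little")
    return struct.pack("<BI", 0, len(payload)) + payload


# Assumed request shape: [request-id %urth %pack], with request-id 1.
request = frame((1, (term("urth"), term("pack"))))
```

Writing request to the conn.sock socket and reading back a framed reply is left out here; the request-id chosen by the client is what lets it match that reply to this request.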

See particularly:

_conn_moor_poke() for the main message dispatcher.
_conn_peek_cb() for the peek handler.
- _conn_send_noun()
_conn_read_peel() for the %peel handler.
“conn.c Usage Guide”
Click

Exercises

Run these valid commands on a fakeship from the outside (following examples in the conn.c usage guide).
- Pack, meld, OTA, install, code, vats
