4b: Text Processing

+at

Basic printing.

Container core for basic $atom printing functions.

Note that these are largely deprecated, using outdated syntax, character mappings and encodings, and should generally not be used.

Accepts

.a is a @.

Source

++  at
  |_  a=@

+r:at

Print term, number or hex.

If .a is composed of the letters a through z and -, print as @tas syntax. If .a is an $atom of two bytes or less, print as an integer with comma-separated thousands. Otherwise, print .a as @ux syntax.

Accepts

.a is a @, and is the sample of +at.

Produces

A $tape.

Source

++  r
  ?:  ?&  (gte (met 3 a) 2)
          |-
          ?:  =(0 a)
            &
          =+  vis=(end 3 a)
          ?&  ?|(=('-' vis) ?&((gte vis 'a') (lte vis 'z')))
              $(a (rsh 3 a))
          ==
      ==
    rtam
  ?:  (lte (met 3 a) 2)
    rud
  rux

Examples

> ~(r at 'foo-bar')
"%foo-bar"
> ~(r at 0xbeef)
"48,879"
> ~(r at 0xdead.beef)
"0xdead-beef"

+rf:at

Print loobean.

If .a is a loobean, print as |/&, otherwise crash.

Accepts

.a is a @, and is the sample of +at.

Produces

A $tape.

Source

++  rf    `tape`[?-(a %& '&', %| '|', * !!) ~]

Examples

> ~(rf at %.y)
"&"
> ~(rf at %.n)
"|"
> ~(rf at 'foo')
dojo: hoon expression failed

+rn:at

Print null.

If .a is null, print ~, otherwise crash.

Accepts

.a is a @, and is the sample of +at.

Produces

A $tape.

Source

++  rn    `tape`[?>(=(0 a) '~') ~]

Examples

> ~(rn at ~)
"~"
> ~(rn at 'foo')
dojo: hoon expression failed

+rt:at

Print cord, including escape characters.

Print .a with $cord syntax, including escape characters.

Accepts

.a is a @, and the sample of +at.

Produces

A $tape.

Source

++  rt    `tape`['\'' (weld (mesc (trip a)) `$tape`['\'' ~])]

Examples

> ~(rt at 'foo')
"'foo'"
> ~(rt at 'fo\\o')
"'fo\\\\o'"
> ~(rt at 'fo\'o')
"'fo\\0x27/o'"

+rta:at

Same as +rt:at.

Source

++  rta   rt

+rtam:at

Print cord with @tas syntax.

Treat .a as $cord, print it with % prefix.

Accepts

.a is a @, and is the sample of +at.

Produces

A $tape.

Source

++  rtam  `tape`['%' (trip a)]

Examples

> ~(rtam at 'foo')
"%foo"
> ~(rtam at '12')
"%12"
> ~(rtam at '!!!')
"%!!!"

+rub:at

Print binary.

Print .a as @ub syntax, except with - separators rather than . separators.

Accepts

.a is a @, and is the sample of +at.

Produces

A $tape.

Source

++  rub   `tape`['0' 'b' (rum 2 ~ |=(b=@ (add '0' b)))]

Examples

> ~(rub at 0xbeef)
"0b1011-1110-1110-1111"

+rud:at

Print $atom as integer.

Print $atom .a as an integer, with commas separating thousands.

Accepts

.a is a @.

Produces

A $tape.

Source

++  rud   (rum 10 ~ |=(b=@ (add '0' b)))

Examples

> ~(rud at 0xbeef)
"48,879"

+rum:at

Print base-n.

Print .a as base .b with suffix .c and using gate .d to convert blocks to characters. Each set of four characters will be separated by -, except for base 10 which will be printed with commas separating thousands.

Accepts

  • .a is a @, and is the sample of +at.

  • .b is a @, denoting the the base.

  • .c is a $tape, and will be appended to the end of the result.

  • .d is a $gate that takes @ and produces @.

Produces

A $tape.

Source

++  rum
  |=  [b=@ c=tape d=$-(@ @)]
  ^-  tape
  ?:  =(0 a)
    [(d 0) c]
  =+  e=0
  |-  ^-  tape
  ?:  =(0 a)
    c
  =+  f=&(!=(0 e) =(0 (mod e ?:(=(10 b) 3 4))))
  %=  $
    a  (div a b)
    c  [(d (mod a b)) ?:(f [?:(=(10 b) ',' '-') c] c)]
    e  +(e)
  ==

Examples

> (~(rum at 0xdead.beef) 10 ~ |=(b=@ (add '0' b)))
"3,735,928,559"
> (~(rum at 0xdead.beef) 10 " m/s" |=(b=@ (add '0' b)))
"3,735,928,559 m/s"
> (~(rum at 0xdead.beef) 2 ~ |=(b=@ (add '0' b)))
"1101-1110-1010-1101-1011-1110-1110-1111"

+rup:at

Print @p (outdated).

Print .a as a @p, but with outdated syntax and number<->syllable mapping. This function almost certainly should not be used.

Accepts

.a is a @.

Produces

A $tape.

Source

++  rup
  =+  b=(met 3 a)
  ^-  tape
  :-  '-'
  |-  ^-  tape
  ?:  (gth (met 5 a) 1)
    %+  weld
      $(a (rsh 5 a), b (sub b 4))
    `tape`['-' '-' $(a (end 5 a), b 4)]
  ?:  =(0 b)
    ['~' ~]
  ?:  (lte b 1)
    (trip (tos:po a))
  |-  ^-  tape
  ?:  =(2 b)
    =+  c=(rsh 3 a)
    =+  d=(end 3 a)
    (weld (trip (tod:po c)) (trip (tos:po (mix c d))))
  =+  c=(rsh [3 2] a)
  =+  d=(end [3 2] a)
  (weld ^$(a c, b (met 3 c)) `tape`['-' $(a (mix c d), b 2)])

Examples

> ~(rup at 0xdead.beef)
"-rylsal-sellac"
> ~(rup at ~sampel-palnet)
"-sellod-lebdiv"
> ~(rup at ~zod)
"-~"

+ruv:at

Print base-64.

Print .a as outdated base-64 syntax, with 0v rather than 0w prefix, different character mappings and separators.

Accepts

.a is a @, and is the sample of +at.

Produces

A $tape.

Source

++  ruv
  ^-  tape
  :+  '0'
    'v'
  %^    rum
      64
    ~
  |=  b=@
  ?:  =(63 b)
    '+'
  ?:  =(62 b)
    '-'
  ?:((lth b 26) (add 65 b) ?:((lth b 52) (add 71 b) (sub b 4)))

Examples

> ~(ruv at 0xdead.beef)
"0vDe-rb7v"

Note this is how it should be rendered:

> `@uw`0xdead.beef
0w3.uHrXL

+rux:at

Print hexadecimal.

Print .a as @ux except with - separators rather than .s.

Accepts

.a is a @, and is the sample of +at.

Produces

A $tape.

Source

++  rux  `tape`['0' 'x' (rum 16 ~ |=(b=@ (add b ?:((lth b 10) 48 87))))]

Examples

> ~(rux at 0xdead.beef)
"0xdead-beef"

+cass

To lowercase.

Turn all occurences of uppercase letters in any $tape into lowercase letters. Returns a $tape.

Accepts

.vib is a $tape.

Produces

A $tape.

Source

++  cass
  |=  vib=tape
  ^-  tape
  (turn vib |=(a=@ ?.(&((gte a 'A') (lte a 'Z')) a (add 32 a))))

Examples

> (cass "JOHN DOE")
"john doe"
> (cass "abc ABC 123 !@#")
"abc abc 123 !@#"
> (cass "AaBbCcDdEeFfGgHhIiJjKkLlMmNnOoPpQqRrSsQqRrVvWwXxYyZz")
"aabbccddeeffgghhiijjkkllmmnnooppqqrrssqqrrvvwwxxyyzz"

+crip

Tape to cord.

Produce a $cord from a $tape.

Accepts

.a is a $tape.

Produces

A $cord.

Source

++  crip  |=(a=tape `@t`(rap 3 a))

Examples

> (crip "john doe")
'john doe'
> (crip "abc 123 !@#")
'abc 123 !@#'
> `@ud`(crip "abc")
6.513.249

+cuss

To uppercase.

Turn all occurences of lowercase letters in any $tape into uppercase letters. Returns a $tape.

Accepts

.vib is a $tape.

Produces

A $tape.

Source

++  cuss
  |=  vib=tape
  ^-  tape
  (turn vib |=(a=@ ?.(&((gte a 'a') (lte a 'z')) a (sub a 32))))

Examples

> (cuss "john doe")
"JOHN DOE"
> (cuss "abc ABC 123 !@#")
"ABC ABC 123 !@#"
> (cuss "AaBbCcDdEeFfGgHhIiJjKkLlMmNnOoPpQqRrSsQqRrVvWwXxYyZz")
"AABBCCDDEEFFGGHHIIJJKKLLMMNNOOPPQQRRSSQQRRVVWWXXYYZZ"

+mesc

Escape special chars.

Escape special characters.

Accepts

.vib is a $tape.

Produces

A $tape.

Source

++  mesc
  |=  vib=tape
  ^-  tape
  ?~  vib
    ~
  ?:  =('\\' i.vib)
    ['\\' '\\' $(vib t.vib)]
  ?:  ?|((gth i.vib 126) (lth i.vib 32) =(`@`39 i.vib))
    ['\\' (welp ~(rux at i.vib) '/' $(vib t.vib))]
  [i.vib $(vib t.vib)]

Examples

> (mesc "ham lus")
"ham lus"
> (mesc "bas\\hur")
"bas\\\\hur"
> (mesc "as'saß")
"as\0x27/sa\0xc3/\0x9f/"

+runt

Prepend n times.

Add .a repetitions of character .b to the head of $tape .c.

Accepts

.a and .b are $atoms.

.c is a $tape.

Produces

A $tape.

Source

++  runt
  |=  [[a=@ b=@] c=tape]
  ^-  tape
  ?:  =(0 a)
    c
  [b $(a (dec a))]

Examples

> (runt [2 '/'] "ham")
"//ham"
> (runt [10 'a'] "")
"aaaaaaaaaa"

+sand

Soft-cast by aura.

Check characters in .b are valid for aura .a. Produce a +unit of .a if valid and a null +unit if not. Takes .a and produces a gate that takes .b.

Accepts

.a is a @ta, and specifies the aura like %tas or %ta.

.b is an $atom.

Produces

A (unit @).

Source

++  sand
  |=  a=@ta
  (flit (sane a))

Examples

> `(unit @ta)`((sand %ta) 'sym-som')
[~ ~.sym-som]
> `(unit @ta)`((sand %ta) 'err!')
~

+sane

Check aura validity.

Check validity of .a by aura .b. Takes .b and produces a gate that takes .a.

Accepts

.a is a @ta, and specifies the aura like %tas or %ta.

.b is an $atom.

Produces

A $flag.

Source

++  sane
  |=  a=@ta
  |=  b=@  ^-  ?
  ?.  =(%t (end 3 a))
    ::  XX more and better sanity
    ::
    &
  =+  [inx=0 len=(met 3 b)]
  ?:  =(%tas a)
    |-  ^-  ?
    ?:  =(inx len)  &
    =+  cur=(cut 3 [inx 1] b)
    ?&  ?|  &((gte cur 'a') (lte cur 'z'))
            &(=('-' cur) !=(0 inx) !=(len inx))
            &(&((gte cur '0') (lte cur '9')) !=(0 inx))
        ==
        $(inx +(inx))
    ==
  ?:  =(%ta a)
    |-  ^-  ?
    ?:  =(inx len)  &
    =+  cur=(cut 3 [inx 1] b)
    ?&  ?|  &((gte cur 'a') (lte cur 'z'))
            &((gte cur '0') (lte cur '9'))
            |(=('-' cur) =('~' cur) =('_' cur) =('.' cur))
        ==
        $(inx +(inx))
    ==
  |-  ^-  ?
  ?:  =(0 b)  &
  =+  cur=(end 3 b)
  ?:  &((lth cur 32) !=(10 cur))  |
  =+  len=(teff cur)
  ?&  |(=(1 len) =+(i=1 |-(|(=(i len) &((gte (cut 3 [i 1] b) 128) $(i +(i)))))))
      $(b (rsh [3 len] b))
  ==

Examples

> ((sane %tas) %mol)
%.y
> ((sane %tas) 'lam')
%.y
> ((sane %tas) 'more ace')
%.n

+taft

UTF-8 to UTF-32.

Convert $cord .a to a UTF-32 @c.

Accepts

.a is a @t.

Produces

A @c.

Source

++  taft
  |=  a=@t
  ^-  @c
  %+  rap  5
  |-  ^-  (list @c)
  =+  b=(teff a)
  ?:  =(0 b)  ~
  =+  ^=  c
      %+  can  0
      %+  turn
        ^-  (list [p=@ q=@])
        ?+  b  !!
          %1  [[0 7] ~]
          %2  [[8 6] [0 5] ~]
          %3  [[16 6] [8 6] [0 4] ~]
          %4  [[24 6] [16 6] [8 6] [0 3] ~]
        ==
      |=([p=@ q=@] [q (cut 0 [p q] a)])
  ?>  =((tuft c) (end [3 b] a))
  [c $(a (rsh [3 b] a))]

Examples

> (taft 'foobar')
~-foobar
> `@ux`'foobar'
0x7261.626f.6f66
> `@ux`~-foobar
0x72.0000.0061.0000.0062.0000.006f.0000.006f.0000.0066

+teff

UTF-8 Length.

Produces the number of utf8 bytes of .a, a single @t character. If .a contains more than one character, it will produce the byte-length of the first one.

Accepts

.a is a @t.

Produces

An $atom.

Source

++  teff
  |=  a=@t  ^-  @
  =+  b=(end 3 a)
  ?:  =(0 b)
    ?>(=(`@`0 a) 0)
  ?>  |((gte b 32) =(10 b))
  ?:((lte b 127) 1 ?:((lte b 223) 2 ?:((lte b 239) 3 4)))

Examples

> (teff 'a')
1
> (teff 'ß')
2
> (teff 'aß')
1

+trim

Tape split.

Split first .a characters off $tape .b.

Accepts

.a is an $atom.

.b is a $tape.

Produces

A cell of $tapes, .p and .q.

Source

++  trim
  |=  [a=@ b=tape]
  ^-  [p=tape q=tape]
  ?~  b
    [~ ~]
  ?:  =(0 a)
    [~ b]
  =+  c=$(a (dec a), b t.b)
  [[i.b p.c] q.c]

Examples

> (trim 5 "lasok termun")
[p="lasok" q=" termun"]
> (trim 5 "zam")
[p="zam" q=""]

+trip

Cord to tape.

Produce a $tape from $cord .a.

Accepts

.a is an $atom.

Produces

A $tape.

Source

++  trip
  ~/  %trip
  |=  a=@  ^-  tape
  ?:  =(0 (met 3 a))
    ~
  [^-(@ta (end 3 a)) $(a (rsh 3 a))]

Examples

> (trip 'john doe')
"john doe"
> (trip 'abc 123 !@#')
"abc 123 !@#"
> (trip 'abc')
"abc"

+tuba

UTF-8 tape to UTF-32 tape.

Convert $tape to a +list of codepoints (@c).

Accepts

.a is a $tape.

Produces

A (list @c).

Source

++  tuba
  |=  a=tape
  ^-  (list @c)
  (rip 5 (taft (rap 3 a)))

Examples

> (tuba "я тут")
~[~-~44f. ~-. ~-~442. ~-~443. ~-~442.]
> (tuba "chars")
~[~-c ~-h ~-a ~-r ~-s]

+tufa

UTF-32 to UTF-8 tape.

Wrap a +list of UTF-32 codepoints into a UTF-8 $tape.

Accepts

.a is a (list @c).

Produces

A $tape.

Source

++  tufa
  |=  a=(list @c)
  ^-  tape
  ?~  a  ""
  (weld (rip 3 (tuft i.a)) $(a t.a))

Examples

> (tufa ~[~-~44f. ~-. ~-~442. ~-~443. ~-~442.])
"я тут"
> (tufa ((list @c) ~[%a %b 0xb1 %c]))
"ab±c"

+tuft

UTF-32 to UTF-8.

Convert @c to $cord.

Accepts

.a is a @c.

Produces

A $cord.

Source

++  tuft
  |=  a=@c
  ^-  @t
  %+  rap  3
  |-  ^-  (list @)
  ?:  =(`@`0 a)
    ~
  =+  b=(end 5 a)
  =+  c=$(a (rsh 5 a))
  ?:  (lte b 0x7f)
    [b c]
  ?:  (lte b 0x7ff)
    :*  (mix 0b1100.0000 (cut 0 [6 5] b))
        (mix 0b1000.0000 (end [0 6] b))
        c
    ==
  ?:  (lte b 0xffff)
    :*  (mix 0b1110.0000 (cut 0 [12 4] b))
        (mix 0b1000.0000 (cut 0 [6 6] b))
        (mix 0b1000.0000 (end [0 6] b))
        c
    ==
  :*  (mix 0b1111.0000 (cut 0 [18 3] b))
      (mix 0b1000.0000 (cut 0 [12 6] b))
      (mix 0b1000.0000 (cut 0 [6 6] b))
      (mix 0b1000.0000 (end [0 6] b))
      c
  ==

Examples

> (tuft ~-foobar)
'foobar'
> `@ux`~-foobar
0x72.0000.0061.0000.0062.0000.006f.0000.006f.0000.0066
> `@ux`'foobar'
0x7261.626f.6f66

+wack

Knot escape.

Escape $knot ~ as ~~ and _ as ~-. Used for printing.

Accepts

.a is a $knot.

Produces

A $knot.

Source

++  wack
  |=  a=@ta
  ^-  @ta
  =+  b=(rip 3 a)
  %+  rap  3
  |-  ^-  tape
  ?~  b
    ~
  ?:  =('~' i.b)  ['~' '~' $(b t.b)]
  ?:  =('_' i.b)  ['~' '-' $(b t.b)]
  [i.b $(b t.b)]

Examples

> (wack '~20_sam~')
~.~~20~-sam~~
> `@t`(wack '~20_sam~')
'~~20~-sam~~'

+wick

Knot unescape.

Unescape $knot ~~ as ~ and ~- as _. Produces a +unit, which is null if the $knot contains unescaped ~ characters.

Accepts

.a is a an $atom.

Produces

A (unit @ta).

Source

++  wick
  |=  a=@
  ^-  (unit @ta)
  =+  b=(rip 3 a)
  =-  ?^(b ~ (some (rap 3 (flop c))))
  =|  c=tape
  |-  ^-  [b=tape c=tape]
  ?~  b  [~ c]
  ?.  =('~' i.b)
    $(b t.b, c [i.b c])
  ?~  t.b  [b ~]
  ?-  i.t.b
    %'~'  $(b t.t.b, c ['~' c])
    %'-'  $(b t.t.b, c ['_' c])
    @     [b ~]
  ==

Examples

> (wick ~.~~20~-sam~~)
[~ ~.~20_sam~]
> (wick ~.~20~-sam~~)
~

+woad

Unescape cord.

Unescape $knot-encoded $cord.

Accepts

.a is a @ta.

Produces

A $cord.

Source

++  woad
  |=  a=@ta
  ^-  @t
  %+  rap  3
  |-  ^-  (list @)
  ?:  =(`@`0 a)
    ~
  =+  b=(end 3 a)
  =+  c=(rsh 3 a)
  ?:  =('.' b)
    [' ' $(a c)]
  ?.  =('~' b)
    [b $(a c)]
  =>  .(b (end 3 c), c (rsh 3 c))
  ?+  b  =-  (weld (rip 3 (tuft p.d)) $(a q.d))
         ^=  d
         =+  d=0
         |-  ^-  [p=@ q=@]
         ?:  =('.' b)
           [d c]
         ?<  =(0 c)
         %=    $
            b  (end 3 c)
            c  (rsh 3 c)
            d  %+  add  (mul 16 d)
               %+  sub  b
               ?:  &((gte b '0') (lte b '9'))  48
               ?>(&((gte b 'a') (lte b 'z')) 87)
         ==
    %'.'  ['.' $(a c)]
    %'~'  ['~' $(a c)]
  ==

Examples

> (woad ~.foo.bar~21.)
'foo bar!'

+wood

Escape cord.

Escape $cord .a so it can be encoded in a $knot.

Accepts

.a is a $cord.

Produces

A $knot.

Source

++  wood
  |=  a=@t
  ^-  @ta
  %+  rap  3
  |-  ^-  (list @)
  ?:  =(`@`0 a)
    ~
  =+  b=(teff a)
  =+  c=(taft (end [3 b] a))
  =+  d=$(a (rsh [3 b] a))
  ?:  ?|  &((gte c 'a') (lte c 'z'))
          &((gte c '0') (lte c '9'))
          =(`@`'-' c)
      ==
    [c d]
  ?+  c
    :-  '~'
    =+  e=(met 2 c)
    |-  ^-  tape
    ?:  =(0 e)
      ['.' d]
    =.  e  (dec e)
    =+  f=(rsh [2 e] c)
    [(add ?:((lte f 9) 48 87) f) $(c (end [2 e] c))]
  ::
    %' '  ['.' d]
    %'.'  ['~' '.' d]
    %'~'  ['~' '~' d]
  ==

Examples

> (wood 'foo bar!')
~.foo.bar~21.

Last updated