4b: Text Processing

+at

Basic printing

Container core for basic atom printing functions.

Note that these are largely deprecated, using outdated syntax, character mappings and encodings, and should generally not be used.

Accepts

.a is a @.

Source

++  at
  |_  a=@

+r:at

Print term, number or hex

If .a is composed of the letters a through z and -, print as @tas syntax. If .a is an atom of two bytes or less, print as an integer with comma-separated thousands. Otherwise, print .a as @ux syntax.

Accepts

.a is a @, and is the sample of +at.

Produces

A $tape.

Source

++  r
  ?:  ?&  (gte (met 3 a) 2)
          |-
          ?:  =(0 a)
            &
          =+  vis=(end 3 a)
          ?&  ?|(=('-' vis) ?&((gte vis 'a') (lte vis 'z')))
              $(a (rsh 3 a))
          ==
      ==
    rtam
  ?:  (lte (met 3 a) 2)
    rud
  rux

Examples

> ~(r at 'foo-bar')
"%foo-bar"
> ~(r at 0xbeef)
"48,879"
> ~(r at 0xdead.beef)
"0xdead-beef"

+rf:at

Print loobean

If a is a loobean, print as |/&, otherwise crash.

Accepts

a is a @, and is the sample of +at.

Produces

A $tape.

Source

++  rf    `tape`[?-(a %& '&', %| '|', * !!) ~]

Examples

> ~(rf at %.y)
"&"
> ~(rf at %.n)
"|"
> ~(rf at 'foo')
dojo: hoon expression failed

+rn:at

Print null

If a is null, print ~, otherwise crash.

Accepts

a is a @, and is the sample of +at.

Produces

A $tape.

Source

++  rn    `tape`[?>(=(0 a) '~') ~]

Examples

> ~(rn at ~)
"~"
> ~(rn at 'foo')
dojo: hoon expression failed

+rt:at

Print cord, including escape characters

Print a with $cord syntax, including escape characters.

Accepts

a is a @, and the sample of +at.

Produces

A $tape.

Source

++  rt    `tape`['\'' (weld (mesc (trip a)) `$tape`['\'' ~])]

Examples

> ~(rt at 'foo')
"'foo'"
> ~(rt at 'fo\\o')
"'fo\\\\o'"
> ~(rt at 'fo\'o')
"'fo\\0x27/o'"

+rta:at

Same as +rt:at

Source

++  rta   rt

+rtam:at

Print cord with @tas syntax

Treat a as $cord, print it with % prefix.

Accepts

a is a @, and is the sample of +at.

Produces

A $tape.

Source

++  rtam  `tape`['%' (trip a)]

Examples

> ~(rtam at 'foo')
"%foo"
> ~(rtam at '12')
"%12"
> ~(rtam at '!!!')
"%!!!"

+rub:at

Print binary

Print .a as @ub syntax, except with - separators rather than . separators.

Accepts

.a is a @, and is the sample of +at.

Produces

A $tape.

Source

++  rub   `tape`['0' 'b' (rum 2 ~ |=(b=@ (add '0' b)))]

Examples

> ~(rub at 0xbeef)
"0b1011-1110-1110-1111"

+rud:at

Print atom as integer

Print atom .a as an integer, with commas separating thousands.

Accepts

.a is a @.

Produces

A $tape.

Source

++  rud   (rum 10 ~ |=(b=@ (add '0' b)))

Examples

> ~(rud at 0xbeef)
"48,879"

+rum:at

Print base-n

Print .a as base .b with suffix .c and using gate .d to convert blocks to characters. Each set of four characters will be separated by -, except for base 10 which will be printed with commas separating thousands.

Accepts

  • .a is a @, and is the sample of +at

  • .b is a @, denoting the the base.

  • .c is a $tape, and will be appended to the end of the result.

  • .d is a $gate that takes @ and produces @.

Produces

A $tape.

Source

++  rum
  |=  [b=@ c=tape d=$-(@ @)]
  ^-  tape
  ?:  =(0 a)
    [(d 0) c]
  =+  e=0
  |-  ^-  tape
  ?:  =(0 a)
    c
  =+  f=&(!=(0 e) =(0 (mod e ?:(=(10 b) 3 4))))
  %=  $
    a  (div a b)
    c  [(d (mod a b)) ?:(f [?:(=(10 b) ',' '-') c] c)]
    e  +(e)
  ==

Examples

> (~(rum at 0xdead.beef) 10 ~ |=(b=@ (add '0' b)))
"3,735,928,559"
> (~(rum at 0xdead.beef) 10 " m/s" |=(b=@ (add '0' b)))
"3,735,928,559 m/s"
> (~(rum at 0xdead.beef) 2 ~ |=(b=@ (add '0' b)))
"1101-1110-1010-1101-1011-1110-1110-1111"

+rup:at

Print @p (outdated)

Print .a as an @p, but with outdated syntax and number<->syllable mapping. This function almost certainly should not be used.

Accepts

.a is a @.

Produces

A $tape.

Source

++  rup
  =+  b=(met 3 a)
  ^-  tape
  :-  '-'
  |-  ^-  tape
  ?:  (gth (met 5 a) 1)
    %+  weld
      $(a (rsh 5 a), b (sub b 4))
    `tape`['-' '-' $(a (end 5 a), b 4)]
  ?:  =(0 b)
    ['~' ~]
  ?:  (lte b 1)
    (trip (tos:po a))
  |-  ^-  tape
  ?:  =(2 b)
    =+  c=(rsh 3 a)
    =+  d=(end 3 a)
    (weld (trip (tod:po c)) (trip (tos:po (mix c d))))
  =+  c=(rsh [3 2] a)
  =+  d=(end [3 2] a)
  (weld ^$(a c, b (met 3 c)) `tape`['-' $(a (mix c d), b 2)])

Examples

> ~(rup at 0xdead.beef)
"-rylsal-sellac"
> ~(rup at ~sampel-palnet)
"-sellod-lebdiv"
> ~(rup at ~zod)
"-~"

+ruv:at

Print base-64

Print .a as outdated base-64 syntax, with 0v rather than 0w prefix, different character mappings and separators.

Accepts

.a is a @, and is the sample of +at.

Produces

A $tape.

Source

++  ruv
  ^-  tape
  :+  '0'
    'v'
  %^    rum
      64
    ~
  |=  b=@
  ?:  =(63 b)
    '+'
  ?:  =(62 b)
    '-'
  ?:((lth b 26) (add 65 b) ?:((lth b 52) (add 71 b) (sub b 4)))

Examples

> ~(ruv at 0xdead.beef)
"0vDe-rb7v"

Note this is how it should be rendered:

> `@uw`0xdead.beef
0w3.uHrXL

+rux:at

Print hexadecimal

Print .a as @ux except with - separators rather than .s.

Accepts

.a is a @, and is the sample of +at.

Produces

A $tape.

Source

++  rux  `tape`['0' 'x' (rum 16 ~ |=(b=@ (add b ?:((lth b 10) 48 87))))]

Examples

> ~(rux at 0xdead.beef)
"0xdead-beef"

+cass

To lowercase

Turn all occurences of uppercase letters in any $tape into lowercase letters. Returns a $tape.

Accepts

vib is a $tape.

Produces

A $tape.

Source

++  cass
  |=  vib=tape
  ^-  tape
  (turn vib |=(a=@ ?.(&((gte a 'A') (lte a 'Z')) a (add 32 a))))

Examples

> (cass "JOHN DOE")
"john doe"

> (cass "abc ABC 123 !@#")
"abc abc 123 !@#"

> (cass "AaBbCcDdEeFfGgHhIiJjKkLlMmNnOoPpQqRrSsQqRrVvWwXxYyZz")
"aabbccddeeffgghhiijjkkllmmnnooppqqrrssqqrrvvwwxxyyzz"

+crip

Tape to cord

Produce a $cord from a $tape.

Accepts

.a is a $tape.

Produces

A $cord.

Source

++  crip  |=(a=tape `@t`(rap 3 a))

Examples

> (crip "john doe")
'john doe'

> (crip "abc 123 !@#")
'abc 123 !@#'

> `@ud`(crip "abc")
6.513.249

+cuss

To uppercase

Turn all occurences of lowercase letters in any $tape into uppercase letters. Returns a $tape.

Accepts

vib is a $tape.

Produces

A $tape.

Source

++  cuss
  |=  vib=tape
  ^-  tape
  (turn vib |=(a=@ ?.(&((gte a 'a') (lte a 'z')) a (sub a 32))))

Examples

> (cuss "john doe")
"JOHN DOE"

> (cuss "abc ABC 123 !@#")
"ABC ABC 123 !@#"

> (cuss "AaBbCcDdEeFfGgHhIiJjKkLlMmNnOoPpQqRrSsQqRrVvWwXxYyZz")
"AABBCCDDEEFFGGHHIIJJKKLLMMNNOOPPQQRRSSQQRRVVWWXXYYZZ"

+mesc

Escape special chars

Escape special characters.

Accepts

vib is a $tape.

Produces

A $tape.

Source

++  mesc
  |=  vib=tape
  ^-  tape
  ?~  vib
    ~
  ?:  =('\\' i.vib)
    ['\\' '\\' $(vib t.vib)]
  ?:  ?|((gth i.vib 126) (lth i.vib 32) =(`@`39 i.vib))
    ['\\' (welp ~(rux at i.vib) '/' $(vib t.vib))]
  [i.vib $(vib t.vib)]

Examples

> (mesc "ham lus")
"ham lus"
> (mesc "bas\\hur")
"bas\\\\hur"
> (mesc "as'saß")
"as\0x27/sa\0xc3/\0x9f/"

+runt

Prepend n times

Add .a repetitions of character .b to the head of $tape .c.

Accepts

[a b] are atoms.

.c is a $tape.

Produces

A $tape.

Source

++  runt
  |=  [[a=@ b=@] c=tape]
  ^-  tape
  ?:  =(0 a)
    c
  [b $(a (dec a))]

Examples

> (runt [2 '/'] "ham")
"//ham"

> (runt [10 'a'] "")
"aaaaaaaaaa"

+sand

Soft-cast by aura

Check characters in b are valid for aura .a. Produce a +unit of .a if valid and a null +unit if not. Takes .a and produces a gate that takes b.

Accepts

.a is a @ta, and specifies the aura like %tas or %ta.

b is an atom.

Produces

A (unit @).

Source

++  sand
  |=  a=@ta
  (flit (sane a))

Examples

> `(unit @ta)`((sand %ta) 'sym-som')
[~ ~.sym-som]
> `(unit @ta)`((sand %ta) 'err!')
~

+sane

Check aura validity

Check validity of .a by aura .b. Takes .b and produces a gate that takes .a.

Accepts

.a is a @ta, and specifies the aura like %tas or %ta.

.b is an atom.

Produces

A ?.

Source

++  sane
  |=  a=@ta
  |=  b=@  ^-  ?
  ?.  =(%t (end 3 a))
    ::  XX more and better sanity
    ::
    &
  =+  [inx=0 len=(met 3 b)]
  ?:  =(%tas a)
    |-  ^-  ?
    ?:  =(inx len)  &
    =+  cur=(cut 3 [inx 1] b)
    ?&  ?|  &((gte cur 'a') (lte cur 'z'))
            &(=('-' cur) !=(0 inx) !=(len inx))
            &(&((gte cur '0') (lte cur '9')) !=(0 inx))
        ==
        $(inx +(inx))
    ==
  ?:  =(%ta a)
    |-  ^-  ?
    ?:  =(inx len)  &
    =+  cur=(cut 3 [inx 1] b)
    ?&  ?|  &((gte cur 'a') (lte cur 'z'))
            &((gte cur '0') (lte cur '9'))
            |(=('-' cur) =('~' cur) =('_' cur) =('.' cur))
        ==
        $(inx +(inx))
    ==
  |-  ^-  ?
  ?:  =(0 b)  &
  =+  cur=(end 3 b)
  ?:  &((lth cur 32) !=(10 cur))  |
  =+  len=(teff cur)
  ?&  |(=(1 len) =+(i=1 |-(|(=(i len) &((gte (cut 3 [i 1] b) 128) $(i +(i)))))))
      $(b (rsh [3 len] b))
  ==

Examples

> ((sane %tas) %mol)
%.y

> ((sane %tas) 'lam')
%.y

> ((sane %tas) 'more ace')
%.n

+taft

UTF8 to UTF32

Convert $cord .a to a UTF32 @c.

Accepts

.a is a @t.

Produces

A @c.

Source

++  taft
  |=  a=@t
  ^-  @c
  %+  rap  5
  |-  ^-  (list @c)
  =+  b=(teff a)
  ?:  =(0 b)  ~
  =+  ^=  c
      %+  can  0
      %+  turn
        ^-  (list [p=@ q=@])
        ?+  b  !!
          %1  [[0 7] ~]
          %2  [[8 6] [0 5] ~]
          %3  [[16 6] [8 6] [0 4] ~]
          %4  [[24 6] [16 6] [8 6] [0 3] ~]
        ==
      |=([p=@ q=@] [q (cut 0 [p q] a)])
  ?>  =((tuft c) (end [3 b] a))
  [c $(a (rsh [3 b] a))]

Examples

> (taft 'foobar')
~-foobar

> `@ux`'foobar'
0x7261.626f.6f66

> `@ux`~-foobar
0x72.0000.0061.0000.0062.0000.006f.0000.006f.0000.0066

+teff

UTF8 Length

Produces the number of utf8 bytes of .a, a single @t character. If .a contains more than one character, it will produce the byte-length of the first one.

Accepts

.a is a @t.

Produces

An atom.

Source

++  teff
  |=  a=@t  ^-  @
  =+  b=(end 3 a)
  ?:  =(0 b)
    ?>(=(`@`0 a) 0)
  ?>  |((gte b 32) =(10 b))
  ?:((lte b 127) 1 ?:((lte b 223) 2 ?:((lte b 239) 3 4)))

Examples

> (teff 'a')
1

> (teff 'ß')
2

> (teff 'aß')
1

+trim

Tape split

Split first .a characters off $tape .b.

Accepts

.a is an atom.

.b is a $tape.

Produces

A cell of $tapes, .p and .q.

Source

++  trim
  |=  [a=@ b=tape]
  ^-  [p=tape q=tape]
  ?~  b
    [~ ~]
  ?:  =(0 a)
    [~ b]
  =+  c=$(a (dec a), b t.b)
  [[i.b p.c] q.c]

Examples

> (trim 5 "lasok termun")
[p="lasok" q=" termun"]

> (trim 5 "zam")
[p="zam" q=""]

+trip

Cord to tape

Produce a $tape from $cord .a.

Accepts

.a is an atom.

Produces

A $tape.

Source

++  trip
  ~/  %trip
  |=  a=@  ^-  tape
  ?:  =(0 (met 3 a))
    ~
  [^-(@ta (end 3 a)) $(a (rsh 3 a))]

Examples

> (trip 'john doe')
"john doe"

> (trip 'abc 123 !@#')
"abc 123 !@#"

> (trip 'abc')
"abc"

+tuba

UTF8 tape to UTF32 tape

Convert $tape to a +list of codepoints (@c).

Accepts

.a is a $tape.

Produces

A (list @c).

Source

++  tuba
  |=  a=tape
  ^-  (list @c)
  (rip 5 (taft (rap 3 a)))

Examples

> (tuba "я тут")
~[~-~44f. ~-. ~-~442. ~-~443. ~-~442.]

> (tuba "chars")
~[~-c ~-h ~-a ~-r ~-s]

+tufa

UTF32 to UTF8 tape

Wrap a +list of utf32 codepoints into a utf8 $tape.

Accepts

.a is a (list @c).

Produces

A $tape.

Source

++  tufa
  |=  a=(list @c)
  ^-  tape
  ?~  a  ""
  (weld (rip 3 (tuft i.a)) $(a t.a))

Examples

> (tufa ~[~-~44f. ~-. ~-~442. ~-~443. ~-~442.])
"я тут"

> (tufa ((list @c) ~[%a %b 0xb1 %c]))
"ab±c"

+tuft

UTF32 to UTF8

Convert @c to $cord.

Accepts

.a is a @c.

Produces

A $cord.

Source

++  tuft
  |=  a=@c
  ^-  @t
  %+  rap  3
  |-  ^-  (list @)
  ?:  =(`@`0 a)
    ~
  =+  b=(end 5 a)
  =+  c=$(a (rsh 5 a))
  ?:  (lte b 0x7f)
    [b c]
  ?:  (lte b 0x7ff)
    :*  (mix 0b1100.0000 (cut 0 [6 5] b))
        (mix 0b1000.0000 (end [0 6] b))
        c
    ==
  ?:  (lte b 0xffff)
    :*  (mix 0b1110.0000 (cut 0 [12 4] b))
        (mix 0b1000.0000 (cut 0 [6 6] b))
        (mix 0b1000.0000 (end [0 6] b))
        c
    ==
  :*  (mix 0b1111.0000 (cut 0 [18 3] b))
      (mix 0b1000.0000 (cut 0 [12 6] b))
      (mix 0b1000.0000 (cut 0 [6 6] b))
      (mix 0b1000.0000 (end [0 6] b))
      c
  ==

Examples

> (tuft ~-foobar)
'foobar'

> `@ux`~-foobar
0x72.0000.0061.0000.0062.0000.006f.0000.006f.0000.0066

> `@ux`'foobar'
0x7261.626f.6f66

+wack

Knot escape

Escape $knot ~ as ~~ and _ as ~-. Used for printing.

Accepts

.a is a $knot.

Produces

A $knot.

Source

++  wack
  |=  a=@ta
  ^-  @ta
  =+  b=(rip 3 a)
  %+  rap  3
  |-  ^-  tape
  ?~  b
    ~
  ?:  =('~' i.b)  ['~' '~' $(b t.b)]
  ?:  =('_' i.b)  ['~' '-' $(b t.b)]
  [i.b $(b t.b)]

Examples

> (wack '~20_sam~')
~.~~20~-sam~~

> `@t`(wack '~20_sam~')
'~~20~-sam~~'

+wick

Knot unescape

Unescape $knot ~~ as ~ and ~- as _. Produces a +unit, which is null if the $knot contains unescaped ~ characters.

Accepts

.a is a an atom.

Produces

A (unit @ta).

Source

++  wick
  |=  a=@
  ^-  (unit @ta)
  =+  b=(rip 3 a)
  =-  ?^(b ~ (some (rap 3 (flop c))))
  =|  c=tape
  |-  ^-  [b=tape c=tape]
  ?~  b  [~ c]
  ?.  =('~' i.b)
    $(b t.b, c [i.b c])
  ?~  t.b  [b ~]
  ?-  i.t.b
    %'~'  $(b t.t.b, c ['~' c])
    %'-'  $(b t.t.b, c ['_' c])
    @     [b ~]
  ==

Examples

> (wick ~.~~20~-sam~~)
[~ ~.~20_sam~]

> (wick ~.~20~-sam~~)
~

+woad

Unescape cord

Unescape $knot-encoded $cord.

Accepts

.a is a @ta.

Produces

A $cord.

Source

++  woad
  |=  a=@ta
  ^-  @t
  %+  rap  3
  |-  ^-  (list @)
  ?:  =(`@`0 a)
    ~
  =+  b=(end 3 a)
  =+  c=(rsh 3 a)
  ?:  =('.' b)
    [' ' $(a c)]
  ?.  =('~' b)
    [b $(a c)]
  =>  .(b (end 3 c), c (rsh 3 c))
  ?+  b  =-  (weld (rip 3 (tuft p.d)) $(a q.d))
         ^=  d
         =+  d=0
         |-  ^-  [p=@ q=@]
         ?:  =('.' b)
           [d c]
         ?<  =(0 c)
         %=    $
            b  (end 3 c)
            c  (rsh 3 c)
            d  %+  add  (mul 16 d)
               %+  sub  b
               ?:  &((gte b '0') (lte b '9'))  48
               ?>(&((gte b 'a') (lte b 'z')) 87)
         ==
    %'.'  ['.' $(a c)]
    %'~'  ['~' $(a c)]
  ==

Examples

> (woad ~.foo.bar~21.)
'foo bar!'

+wood

Escape cord

Escape $cord .a so it can be encoded in a $knot.

Accepts

.a is a $cord.

Produces

A $knot.

Source

++  wood
  |=  a=@t
  ^-  @ta
  %+  rap  3
  |-  ^-  (list @)
  ?:  =(`@`0 a)
    ~
  =+  b=(teff a)
  =+  c=(taft (end [3 b] a))
  =+  d=$(a (rsh [3 b] a))
  ?:  ?|  &((gte c 'a') (lte c 'z'))
          &((gte c '0') (lte c '9'))
          =(`@`'-' c)
      ==
    [c d]
  ?+  c
    :-  '~'
    =+  e=(met 2 c)
    |-  ^-  tape
    ?:  =(0 e)
      ['.' d]
    =.  e  (dec e)
    =+  f=(rsh [2 e] c)
    [(add ?:((lte f 9) 48 87) f) $(c (end [2 e] c))]
  ::
    %' '  ['.' d]
    %'.'  ['~' '.' d]
    %'~'  ['~' '~' d]
  ==

Examples

> (wood 'foo bar!')
~.foo.bar~21.

Last updated