4i: Parsing (Useful Idioms)

+alf

Alphabetic characters

Parse alphabetic characters, both upper and lowercase.

Source

++  alf  ;~(pose low hig)

Examples

> (scan "a" alf)
'a'

> (scan "A" alf)
'A'

> `tape`(scan "AaBbCc" (star alf))
"AaBbCc"

+aln

Alphanumeric characters

Parse alphanumeric characters - both alphabetic characters and numbers.

Source

++  aln  ;~(pose low hig nud)

Examples

> (scan "0" aln)
'0'

> `tape`(scan "alf42" (star aln))
"alf42"

+alp

Alphanumeric and -

Parse alphanumeric strings and hep, "-".

Source

++  alp  ;~(pose low hig nud hep)

Examples

> (scan "7" alp)
'7'

> (scan "s" alp)
's'

> `tape`(scan "123abc-" (star alp))
"123abc-"

+bet

Axis syntax -, +

Parse the hep and lus axis syntax.

Source

++  bet  ;~(pose (cold 2 hep) (cold 3 lus))

Examples

> (scan "-" bet)
2

> (scan "+" bet)
3

+bin

Binary to atom

Parse a tape of binary (0s and 1s) and produce its atomic representation.

Source

++  bin  (bass 2 (most gon but))

Examples

> (scan "0000" bin)
0

> (scan "0001" bin)
1

> (scan "0010" bin)
2

> (scan "100000001111" bin)
2.063

+but

Binary digit

Parse a single binary digit.

Source

++  but  (cook |=(a=@ (sub a '0')) (shim '0' '1'))

Examples

> (scan "0" but)
0

> (scan "1" but)
1

> (scan "01" but)
! {1 2}
! 'syntax-error'
! exit

> (scan "01" (star but))
~[0 1]

+cit

Octal digit

Parse a single octal digit.

Source

++  cit  (cook |=(a=@ (sub a '0')) (shim '0' '7'))

Examples

> (scan "1" cit)
1
> (scan "7" cit)
7
> (scan "8" cit)
! {1 1}
! 'syntax-error'
! exit
> (scan "60" (star cit))
~[6 0]

+dem

Decimal to atom

Parse a decimal number to an atom.

Source

++  dem  (bass 10 (most gon dit))

Examples

> (scan "7" dem)
7

> (scan "42" dem)
42

> (scan "150000000" dem)
150.000.000

> (scan "12456" dem)
12.456

+dit

Decimal digit

Parse a single decimal digit.

Source

++  dit  (cook |=(a=@ (sub a '0')) (shim '0' '9'))

Examples

> (scan "7" dit)
7

> (scan "42" (star dit))
~[4 2]

> (scan "26000" (star dit))
~[2 6 0 0 0]

+dog

. optional gap

Dot followed by an optional gap, used with numbers.

Source

++  dog  ;~(plug dot gay)

Examples

> (scan "." dog)
['.' ~]

> (scan "a.        " ;~(pfix alf dog))
['.' ~]

+dof

- optional gap

Hep followed by an optional gap, used with @p & @q syntax.

Source

++  dof  ;~(plug hep gay)

Examples

> (scan "-" dof)
['-' ~]

> (scan "-     " dof)
['-' ~]

+doh

@p separator

Phonetic base phrase separator

Source

++  doh  ;~(plug ;~(plug hep hep) gay)

Examples

> (scan "--" doh)
[['-' '-'] ~]

> (scan "--     " doh)
[['-' '-'] ~]

+dun

-- to ~

Parse phep, --, to null, ~.

Source

++  dun  (cold ~ ;~(plug hep hep))

Examples

> (scan "--" dun)
~

+duz

== to ~

Parse stet, ==, to null ~.

Source

++  duz  (cold ~ ;~(plug tis tis))

Examples

> (scan "==" duz)
~

+gah

Newline or ' '

Whitespace component, either newline or space.

Source

++  gah  (mask [`@`10 ' ' ~])

Examples

> `tape`(scan " \0a \0a" (star gah))
" \0a \0a"

+gap

Plural whitespace

Separates tall runes

Source

++  gap  (cold ~ ;~(plug gaq (star ;~(pose vul gah))))

Examples

> `tape`(scan " \0a \0a" gap)
""

> (scan "\0a   \0a XYZ" ;~(pfix gap (jest 'XYZ')))
'XYZ'

+gaq

End of line

Two spaces, a newline, or comment.

Source

++  gaq  ;~  pose
             (just `@`10)
             ;~(plug gah ;~(pose gah vul))
             vul
         ==

Examples

> (scan "123\0a" ;~(sfix dem gaq))
123

> (scan "123 :: foo\0a" ;~(sfix dem gaq))
123

+gaw

Classic whitespace

Terran whitespace.

Source

++  gaw  (cold ~ (star ;~(pose vul gah)))

Examples

> (scan "  \0a  :: foo  \0a" gaw)
~

> (scan "  " gaw)
~

> (scan "\0a" gaw)
~

+gay

Optional gap

Optional gap.

Source

++  gay  ;~(pose gap (easy ~))

Examples

> (scan "  " gay)
~

> (scan "     " gay)
~

> (scan "\0a" gay)
~

> (scan "" gay)
~

+gon

Long numbers

Parse long numbers - Numbers which wrap around the shell with the line

Source

++  gon  ;~(pose ;~(plug bas gay fas) (easy ~))

Examples

> 'abc\
  /def'
'abcdef'

> (scan "\\\0a/" gon)
['\\' ~ '/']

+gul

Axis syntax < or >

Parse the axis gal and gar axis syntax.

Source

++  gul  ;~(pose (cold 2 gal) (cold 3 gar))

Examples

> (scan "<" gul)
2

> (scan ">" gul)
3

+hex

Hex to atom

Parse any hexadecimal number to an atom.

Source

++  hex  (bass 16 (most gon hit))

Examples

> (scan "a" hex)
10

> (scan "A" hex)
10

> (scan "2A" hex)
42

> (scan "1ee7" hex)
7.911

> (scan "1EE7" hex)
7.911

> (scan "1EE7F7" hex)
2.025.463

> `@ux`(scan "1EE7F7" hex)
0x1e.e7f7

+hig

Uppercase

Parse a single uppercase letter.

Source

++  hig  (shim 'A' 'Z')

Examples

> (scan "G" hig)
'G'

> (scan "ABCDEFGHIJKLMNOPQRSTUVWXYZ" (star hig))
"ABCDEFGHIJKLMNOPQRSTUVWXYZ"

+hit

Hex digits

Parse a single hexadecimal digit.

Source

++  hit  ;~  pose
           dit
           (cook |=(a=char (sub a 87)) (shim 'a' 'f'))
           (cook |=(a=char (sub a 55)) (shim 'A' 'F'))
         ==

Examples

> (scan "a" hit)
10

> (scan "A" hit)
10

> (scan "2A" (star hit))
~[2 10]

+iny

Indentation block

Apply +rule to indented block starting at current column number, omitting the leading whitespace.

Accepts

sef is a +rule

Produces

A +rule.

Source

++  iny
  |*  sef=rule
  |=  nail  ^+  (sef)
  =+  [har tap]=[p q]:+<
  =+  lev=(fil 3 (dec q.har) ' ')
  =+  eol=(just `@t`10)
  =+  =-  roq=((star ;~(pose prn ;~(sfix eol (jest lev)) -)) har tap)
      ;~(simu ;~(plug eol eol) eol)
  ?~  q.roq  roq
  =+  vex=(sef har(q 1) p.u.q.roq)
  =+  fur=p.vex(q (add (dec q.har) q.p.vex))
  ?~  q.vex  vex(p fur)
  =-  vex(p fur, u.q -)
  :+  &3.vex
    &4.vex(q.p (add (dec q.har) q.p.&4.vex))
  =+  res=|4.vex
  |-  ?~  res  |4.roq
  ?.  =(10 -.res)  [-.res $(res +.res)]
  (welp [`@t`10 (trip lev)] $(res +.res))

Examples

> `tape`(scan "   foo\0a   bar" ;~(pfix ace ace ace (iny (star ;~(pose prn (just '\0a'))))))
"foo\0abar"

Discussion

Note the amount of indentation whitespace to be stripped from the beginning of each line is determined by the value of .q (the column) in the $hair when +iny is first called. This means something like the +pfix expression in the example above is necessary to set the level of indentation. Additionally, the +rule given to +iny must consume the whole line including the line ending.


+low

Lowercase

Parse a single lowercase letter.

Source

++  low  (shim 'a' 'z')

Examples

> (scan "g" low)
'g'

+mes

Hexbyte

Parse a hexbyte.

Source

++  mes  %+  cook
           |=({a/@ b/@} (add (mul 16 a) b))
         ;~(plug hit hit)

Examples

> (scan "2A" mes)
42

> (scan "42" mes)
66

+nix

Letters and underscore

Parse Letters and _.

Source

++  nix  (boss 256 (star ;~(pose aln cab)))

Examples

> `@t`(scan "as_me" nix)
'as_me'

+nud

Numeric

Parse a numeric character - A number.

Source

++  nud  (shim '0' '9')

Examples

> (scan "0" nud)
'0'

> (scan "7" nud)
'7'

+prn

Printable character

Parse any printable character.

Source

++  prn  ;~(less (just `@`127) (shim 32 256))

Examples

> (scan "h" prn)
'h'

> (scan "!" prn)
'!'

> (scan "\01" prn)
! {1 1}
! exit

+qat

Chars in blockcord

Parse a single character contained in a multi-line cord block.

Source

++  qat  ;~  pose
             prn
             ;~(less ;~(plug (just `@`10) soz) (just `@`10))
         ==

Examples

> ^-  tape
  %+  scan
    "'''\0aabc\0adef\0aghi\0a'''"
  %+  ifix
    :-  ;~(plug soz (just `@`10))
    ;~(plug (just `@`10) soz)
  (star qat)
"abc\0adef\0aghi"

+qit

Chars in cord

Parse an individual character to its cord atom representation. Escaped characters are converted to the value they represent.

Source

++  qit  ;~  pose                                       ::  chars in a cord
             ;~(less bas soq prn)
             ;~(pfix bas ;~(pose bas soq mes))          ::  escape chars
         ==

Examples

> (scan "%" qit)
'%'

> `tape`(scan "cord" (star qit))
"cord"

> `tape`(scan "\\0a" (star qit))
"\0a"

+qut

Cord

Parse single-soq cord with \{gap}/ anywhere in the middle, or triple-single quote (aka triple-soq) cord, between which must be in an indented block.

Source

++  qut  ;~  simu  soq
           ;~  pose
             ;~  less  soz
               (ifix [soq soq] (boss 256 (more gon qit)))
             ==
             =+  hed=;~(pose ;~(plug (plus ace) vul) (just '\0a'))
             %-  iny  %+  ifix
               :-  ;~(plug soz hed)
               ;~(plug (just '\0a') soz)
             (boss 256 (star qat))
           ==
         ==

Examples

> `@t`(scan "'cord'" qut)
'cord'

> `@t`(scan "'''\0aabc\0adef\0a'''" qut)
'abc\0adef'

+soz

Delimiting '''

Parse a triple-single quote, used for multiline strings.

Source

++  soz  ;~(plug soq soq soq)

Examples

> (scan "'''" soz)
['\'' '\'' '\'']

+sym

Term

A term: a lowercase letter, followed by letters, numbers, or -.

Source

++  sym
  %+  cook
    |=(a=tape (rap 3 ^-((list @) a)))
  ;~(plug low (star ;~(pose nud low hep)))

Examples

> `term`(scan "sam-2" sym)
%sam-2

+mixed-case-symbol

Mixed-case term

The same as +sym but allowing uppercase letters.

Source

++  mixed-case-symbol
  %+  cook
    |=(a=tape (rap 3 ^-((list @) a)))
  ;~(plug alf (star alp))

Examples

> `term`(scan "sAm-2" mixed-case-symbol)
%sAm-2

+ven

+>- axis syntax

Axis syntax parser

Source

++  ven  ;~  (comp |=([a=@ b=@] (peg a b)))
           bet
           =+  hom=`?`|
           |=  tub=nail
           ^-  (like @)
           =+  vex=?:(hom (bet tub) (gul tub))
           ?~  q.vex
             [p.tub [~ 1 tub]]
           =+  wag=$(p.tub p.vex, hom !hom, tub q.u.q.vex)
           ?>  ?=(^ q.wag)
           [p.wag [~ (peg p.u.q.vex p.u.q.wag) q.u.q.wag]]
         ==

Examples

> (scan "->+" ven)
11

> (scan "->+<-" ven)
44

+vit

Base64 digit

Parse a standard base64 digit.

Source

++  vit
  ;~  pose
    (cook |=(a=@ (sub a 65)) (shim 'A' 'Z'))
    (cook |=(a=@ (sub a 71)) (shim 'a' 'z'))
    (cook |=(a=@ (add a 4)) (shim '0' '9'))
    (cold 62 (just '-'))
    (cold 63 (just '+'))
  ==

Examples

> (scan "C" vit)
2

> (scan "c" vit)
28

> (scan "2" vit)
54

> (scan "-" vit)
62

+vul

Comments to null

Parse comments and produce a null. Note that a comment must be ended with a newline character.

Source

++  vul  %+  cold   ~
         ;~  plug  col  col
           (star prn)
           (just `@`10)
         ==

Examples

> (scan "::this is a comment \0a" vul)
~

> (scan "::this is a comment " vul)
! {1 21}
! exit

Last updated