4i: Parsing (Useful Idioms)
+alf
+alfAlphabetic characters.
Parse alphabetic characters, both upper and lowercase.
Source
++ alf ;~(pose low hig)Examples
> (scan "a" alf)
'a'> (scan "A" alf)
'A'> `tape`(scan "AaBbCc" (star alf))
"AaBbCc"+aln
+alnAlphanumeric characters.
Parse alphanumeric characters - both alphabetic characters and numbers.
Source
++ aln ;~(pose low hig nud)Examples
> (scan "0" aln)
'0'> `tape`(scan "alf42" (star aln))
"alf42"+alp
+alpAlphanumeric and -.
Parse alphanumeric strings and hep, -.
Source
++ alp ;~(pose low hig nud hep)Examples
> (scan "7" alp)
'7'> (scan "s" alp)
's'> `tape`(scan "123abc-" (star alp))
"123abc-"+bet
+betAxis syntax -, +.
Parse the hep and lus axis syntax.
Source
++ bet ;~(pose (cold 2 hep) (cold 3 lus))Examples
> (scan "-" bet)
2> (scan "+" bet)
3+bin
+binBinary to $atom.
Parse a tape of binary (0s and 1s) and produce its atomic representation.
Source
++ bin (bass 2 (most gon but))Examples
> (scan "0000" bin)
0> (scan "0001" bin)
1> (scan "0010" bin)
2> (scan "100000001111" bin)
2.063+but
+butBinary digit.
Parse a single binary digit.
Source
++ but (cook |=(a=@ (sub a '0')) (shim '0' '1'))Examples
> (scan "0" but)
0> (scan "1" but)
1> (scan "01" but)
! {1 2}
! 'syntax-error'
! exit> (scan "01" (star but))
~[0 1]+cit
+citOctal digit.
Parse a single octal digit.
Source
++ cit (cook |=(a=@ (sub a '0')) (shim '0' '7'))Examples
> (scan "1" cit)
1> (scan "7" cit)
7> (scan "8" cit)
! {1 1}
! 'syntax-error'
! exit> (scan "60" (star cit))
~[6 0]+dem
+demDecimal to $atom.
Parse a decimal number to an $atom.
Source
++ dem (bass 10 (most gon dit))Examples
> (scan "7" dem)
7> (scan "42" dem)
42> (scan "150000000" dem)
150.000.000> (scan "12456" dem)
12.456+dit
+ditDecimal digit.
Parse a single decimal digit.
Source
++ dit (cook |=(a=@ (sub a '0')) (shim '0' '9'))Examples
> (scan "7" dit)
7> (scan "42" (star dit))
~[4 2]> (scan "26000" (star dit))
~[2 6 0 0 0]+dog
+dog. optional gap.
Dot followed by an optional gap, used with numbers.
Source
++ dog ;~(plug dot gay)Examples
> (scan "." dog)
['.' ~]> (scan "a. " ;~(pfix alf dog))
['.' ~]+dof
+dof- optional gap.
Hep followed by an optional gap, used with @p & @q syntax.
Source
++ dof ;~(plug hep gay)Examples
> (scan "-" dof)
['-' ~]> (scan "- " dof)
['-' ~]+doh
+doh@p separator.
Phonetic base phrase separator
Source
++ doh ;~(plug ;~(plug hep hep) gay)Examples
> (scan "--" doh)
[['-' '-'] ~]> (scan "-- " doh)
[['-' '-'] ~]+dun
+dun-- to ~.
Parse phep, --, to null, ~.
Source
++ dun (cold ~ ;~(plug hep hep))Examples
> (scan "--" dun)
~+duz
+duz== to ~.
Parse stet, ==, to null ~.
Source
++ duz (cold ~ ;~(plug tis tis))Examples
> (scan "==" duz)
~+gah
+gahNewline or ' '.
Whitespace component, either newline or space.
Source
++ gah (mask [`@`10 ' ' ~])Examples
> `tape`(scan " \0a \0a" (star gah))
" \0a \0a"+gap
+gapPlural whitespace.
Separates tall runes.
Source
++ gap (cold ~ ;~(plug gaq (star ;~(pose vul gah))))Examples
> `tape`(scan " \0a \0a" gap)
""> (scan "\0a \0a XYZ" ;~(pfix gap (jest 'XYZ')))
'XYZ'+gaq
+gaqEnd of line.
Two spaces, a newline, or comment.
Source
++ gaq ;~ pose
(just `@`10)
;~(plug gah ;~(pose gah vul))
vul
==Examples
> (scan "123\0a" ;~(sfix dem gaq))
123> (scan "123 :: foo\0a" ;~(sfix dem gaq))
123+gaw
+gawNormal whitespace.
Source
++ gaw (cold ~ (star ;~(pose vul gah)))Examples
> (scan " \0a :: foo \0a" gaw)
~> (scan " " gaw)
~> (scan "\0a" gaw)
~+gay
+gayOptional gap.
Source
++ gay ;~(pose gap (easy ~))Examples
> (scan " " gay)
~> (scan " " gay)
~> (scan "\0a" gay)
~> (scan "" gay)
~+gon
+gonLong numbers.
Parse long numbers - Numbers which wrap around the shell with the line
Source
++ gon ;~(pose ;~(plug bas gay fas) (easy ~))Examples
> 'abc\
/def'
'abcdef'> (scan "\\\0a/" gon)
['\\' ~ '/']+gul
+gulAxis syntax < or >.
Parse the axis gal and gar axis syntax.
Source
++ gul ;~(pose (cold 2 gal) (cold 3 gar))Examples
> (scan "<" gul)
2> (scan ">" gul)
3+hex
+hexHex to $atom.
Parse any hexadecimal number to an $atom.
Source
++ hex (bass 16 (most gon hit))Examples
> (scan "a" hex)
10> (scan "A" hex)
10> (scan "2A" hex)
42> (scan "1ee7" hex)
7.911> (scan "1EE7" hex)
7.911> (scan "1EE7F7" hex)
2.025.463> `@ux`(scan "1EE7F7" hex)
0x1e.e7f7+hig
+higUppercase.
Parse a single uppercase letter.
Source
++ hig (shim 'A' 'Z')Examples
> (scan "G" hig)
'G'> (scan "ABCDEFGHIJKLMNOPQRSTUVWXYZ" (star hig))
"ABCDEFGHIJKLMNOPQRSTUVWXYZ"+hit
+hitHex digits.
Parse a single hexadecimal digit.
Source
++ hit ;~ pose
dit
(cook |=(a=char (sub a 87)) (shim 'a' 'f'))
(cook |=(a=char (sub a 55)) (shim 'A' 'F'))
==Examples
> (scan "a" hit)
10> (scan "A" hit)
10> (scan "2A" (star hit))
~[2 10]+iny
+inyIndentation block.
Apply $rule to indented block starting at current column number, omitting the leading whitespace.
Accepts
sef is a $rule
Produces
A $rule.
Source
++ iny
|* sef=rule
|= nail ^+ (sef)
=+ [har tap]=[p q]:+<
=+ lev=(fil 3 (dec q.har) ' ')
=+ eol=(just `@t`10)
=+ =- roq=((star ;~(pose prn ;~(sfix eol (jest lev)) -)) har tap)
;~(simu ;~(plug eol eol) eol)
?~ q.roq roq
=+ vex=(sef har(q 1) p.u.q.roq)
=+ fur=p.vex(q (add (dec q.har) q.p.vex))
?~ q.vex vex(p fur)
=- vex(p fur, u.q -)
:+ &3.vex
&4.vex(q.p (add (dec q.har) q.p.&4.vex))
=+ res=|4.vex
|- ?~ res |4.roq
?. =(10 -.res) [-.res $(res +.res)]
(welp [`@t`10 (trip lev)] $(res +.res))Examples
> `tape`(scan " foo\0a bar" ;~(pfix ace ace ace (iny (star ;~(pose prn (just '\0a'))))))
"foo\0abar"Discussion
Note the amount of indentation whitespace to be stripped from the beginning of each line is determined by the value of .q (the column) in the $hair when +iny is first called. This means something like the +pfix expression in the example above is necessary to set the level of indentation. Additionally, the $rule given to +iny must consume the whole line including the line ending.
+low
+lowLowercase.
Parse a single lowercase letter.
Source
++ low (shim 'a' 'z')Examples
> (scan "g" low)
'g'+mes
+mesHexbyte.
Parse a hexbyte.
Source
++ mes %+ cook
|=({a/@ b/@} (add (mul 16 a) b))
;~(plug hit hit)Examples
> (scan "2A" mes)
42> (scan "42" mes)
66+nix
+nixLetters and underscore.
Parse Letters and _.
Source
++ nix (boss 256 (star ;~(pose aln cab)))Examples
> `@t`(scan "as_me" nix)
'as_me'+nud
+nudNumeric.
Parse a numeric character - A number.
Source
++ nud (shim '0' '9')Examples
> (scan "0" nud)
'0'> (scan "7" nud)
'7'+prn
+prnPrintable character.
Parse any printable character.
Source
++ prn ;~(less (just `@`127) (shim 32 256))Examples
> (scan "h" prn)
'h'> (scan "!" prn)
'!'> (scan "\01" prn)
! {1 1}
! exit+qat
+qatChars in blockcord.
Parse a single character contained in a multi-line cord block.
Source
++ qat ;~ pose
prn
;~(less ;~(plug (just `@`10) soz) (just `@`10))
==Examples
> ^- tape
%+ scan
"'''\0aabc\0adef\0aghi\0a'''"
%+ ifix
:- ;~(plug soz (just `@`10))
;~(plug (just `@`10) soz)
(star qat)
"abc\0adef\0aghi"+qit
+qitChars in cord.
Parse an individual character to its cord $atom representation. Escaped characters are converted to the value they represent.
Source
++ qit ;~ pose :: chars in a cord
;~(less bas soq prn)
;~(pfix bas ;~(pose bas soq mes)) :: escape chars
==Examples
> (scan "%" qit)
'%'> `tape`(scan "cord" (star qit))
"cord"> `tape`(scan "\\0a" (star qit))
"\0a"+qut
+qutCord.
Parse single-soq cord with \{gap}/ anywhere in the middle, or triple-single quote (aka triple-soq) cord, between which must be in an indented block.
Source
++ qut ;~ simu soq
;~ pose
;~ less soz
(ifix [soq soq] (boss 256 (more gon qit)))
==
=+ hed=;~(pose ;~(plug (plus ace) vul) (just '\0a'))
%- iny %+ ifix
:- ;~(plug soz hed)
;~(plug (just '\0a') soz)
(boss 256 (star qat))
==
==Examples
> `@t`(scan "'cord'" qut)
'cord'> `@t`(scan "'''\0aabc\0adef\0a'''" qut)
'abc\0adef'+soz
+sozDelimiting '''.
Parse a triple-single quote, used for multiline strings.
Source
++ soz ;~(plug soq soq soq)Examples
> (scan "'''" soz)
['\'' '\'' '\'']+sym
+symTerm.
A term: a lowercase letter, followed by letters, numbers, or -.
Source
++ sym
%+ cook
|=(a=tape (rap 3 ^-((list @) a)))
;~(plug low (star ;~(pose nud low hep)))Examples
> `term`(scan "sam-2" sym)
%sam-2+mixed-case-symbol
+mixed-case-symbolMixed-case term.
The same as +sym but allowing uppercase letters.
Source
++ mixed-case-symbol
%+ cook
|=(a=tape (rap 3 ^-((list @) a)))
;~(plug alf (star alp))Examples
> `term`(scan "sAm-2" mixed-case-symbol)
%sAm-2+ven
+ven+>- axis syntax.
Axis syntax parser
Source
++ ven ;~ (comp |=([a=@ b=@] (peg a b)))
bet
=+ hom=`?`|
|= tub=nail
^- (like @)
=+ vex=?:(hom (bet tub) (gul tub))
?~ q.vex
[p.tub [~ 1 tub]]
=+ wag=$(p.tub p.vex, hom !hom, tub q.u.q.vex)
?> ?=(^ q.wag)
[p.wag [~ (peg p.u.q.vex p.u.q.wag) q.u.q.wag]]
==Examples
> (scan "->+" ven)
11> (scan "->+<-" ven)
44+vit
+vitBase64 digit.
Parse a standard base64 digit.
Source
++ vit
;~ pose
(cook |=(a=@ (sub a 65)) (shim 'A' 'Z'))
(cook |=(a=@ (sub a 71)) (shim 'a' 'z'))
(cook |=(a=@ (add a 4)) (shim '0' '9'))
(cold 62 (just '-'))
(cold 63 (just '+'))
==Examples
> (scan "C" vit)
2> (scan "c" vit)
28> (scan "2" vit)
54> (scan "-" vit)
62+vul
+vulComments to null.
Parse comments and produce a null. Note that a comment must be ended with a newline character.
Source
++ vul %+ cold ~
;~ plug col col
(star prn)
(just `@`10)
==Examples
> (scan "::this is a comment \0a" vul)
~> (scan "::this is a comment " vul)
! {1 21}
! exitLast updated