4i: Parsing (Useful Idioms)
+alf
+alf
Alphabetic characters.
Parse alphabetic characters, both upper and lowercase.
Source
++ alf ;~(pose low hig)
Examples
> (scan "a" alf)
'a'
> (scan "A" alf)
'A'
> `tape`(scan "AaBbCc" (star alf))
"AaBbCc"
+aln
+aln
Alphanumeric characters.
Parse alphanumeric characters - both alphabetic characters and numbers.
Source
++ aln ;~(pose low hig nud)
Examples
> (scan "0" aln)
'0'
> `tape`(scan "alf42" (star aln))
"alf42"
+alp
+alp
Alphanumeric and -
.
Parse alphanumeric strings and hep, -
.
Source
++ alp ;~(pose low hig nud hep)
Examples
> (scan "7" alp)
'7'
> (scan "s" alp)
's'
> `tape`(scan "123abc-" (star alp))
"123abc-"
+bet
+bet
Axis syntax -
, +
.
Parse the hep and lus axis syntax.
Source
++ bet ;~(pose (cold 2 hep) (cold 3 lus))
Examples
> (scan "-" bet)
2
> (scan "+" bet)
3
+bin
+bin
Binary to $atom
.
Parse a tape of binary (0
s and 1
s) and produce its atomic representation.
Source
++ bin (bass 2 (most gon but))
Examples
> (scan "0000" bin)
0
> (scan "0001" bin)
1
> (scan "0010" bin)
2
> (scan "100000001111" bin)
2.063
+but
+but
Binary digit.
Parse a single binary digit.
Source
++ but (cook |=(a=@ (sub a '0')) (shim '0' '1'))
Examples
> (scan "0" but)
0
> (scan "1" but)
1
> (scan "01" but)
! {1 2}
! 'syntax-error'
! exit
> (scan "01" (star but))
~[0 1]
+cit
+cit
Octal digit.
Parse a single octal digit.
Source
++ cit (cook |=(a=@ (sub a '0')) (shim '0' '7'))
Examples
> (scan "1" cit)
1
> (scan "7" cit)
7
> (scan "8" cit)
! {1 1}
! 'syntax-error'
! exit
> (scan "60" (star cit))
~[6 0]
+dem
+dem
Decimal to $atom
.
Parse a decimal number to an $atom
.
Source
++ dem (bass 10 (most gon dit))
Examples
> (scan "7" dem)
7
> (scan "42" dem)
42
> (scan "150000000" dem)
150.000.000
> (scan "12456" dem)
12.456
+dit
+dit
Decimal digit.
Parse a single decimal digit.
Source
++ dit (cook |=(a=@ (sub a '0')) (shim '0' '9'))
Examples
> (scan "7" dit)
7
> (scan "42" (star dit))
~[4 2]
> (scan "26000" (star dit))
~[2 6 0 0 0]
+dog
+dog
.
optional gap.
Dot followed by an optional gap, used with numbers.
Source
++ dog ;~(plug dot gay)
Examples
> (scan "." dog)
['.' ~]
> (scan "a. " ;~(pfix alf dog))
['.' ~]
+dof
+dof
-
optional gap.
Hep followed by an optional gap, used with @p
& @q
syntax.
Source
++ dof ;~(plug hep gay)
Examples
> (scan "-" dof)
['-' ~]
> (scan "- " dof)
['-' ~]
+doh
+doh
@p
separator.
Phonetic base phrase separator
Source
++ doh ;~(plug ;~(plug hep hep) gay)
Examples
> (scan "--" doh)
[['-' '-'] ~]
> (scan "-- " doh)
[['-' '-'] ~]
+dun
+dun
--
to ~
.
Parse phep, --
, to null, ~
.
Source
++ dun (cold ~ ;~(plug hep hep))
Examples
> (scan "--" dun)
~
+duz
+duz
==
to ~
.
Parse stet, ==
, to null ~
.
Source
++ duz (cold ~ ;~(plug tis tis))
Examples
> (scan "==" duz)
~
+gah
+gah
Newline or ' '.
Whitespace component, either newline or space.
Source
++ gah (mask [`@`10 ' ' ~])
Examples
> `tape`(scan " \0a \0a" (star gah))
" \0a \0a"
+gap
+gap
Plural whitespace.
Separates tall runes.
Source
++ gap (cold ~ ;~(plug gaq (star ;~(pose vul gah))))
Examples
> `tape`(scan " \0a \0a" gap)
""
> (scan "\0a \0a XYZ" ;~(pfix gap (jest 'XYZ')))
'XYZ'
+gaq
+gaq
End of line.
Two spaces, a newline, or comment.
Source
++ gaq ;~ pose
(just `@`10)
;~(plug gah ;~(pose gah vul))
vul
==
Examples
> (scan "123\0a" ;~(sfix dem gaq))
123
> (scan "123 :: foo\0a" ;~(sfix dem gaq))
123
+gaw
+gaw
Normal whitespace.
Source
++ gaw (cold ~ (star ;~(pose vul gah)))
Examples
> (scan " \0a :: foo \0a" gaw)
~
> (scan " " gaw)
~
> (scan "\0a" gaw)
~
+gay
+gay
Optional gap.
Source
++ gay ;~(pose gap (easy ~))
Examples
> (scan " " gay)
~
> (scan " " gay)
~
> (scan "\0a" gay)
~
> (scan "" gay)
~
+gon
+gon
Long numbers.
Parse long numbers - Numbers which wrap around the shell with the line
Source
++ gon ;~(pose ;~(plug bas gay fas) (easy ~))
Examples
> 'abc\
/def'
'abcdef'
> (scan "\\\0a/" gon)
['\\' ~ '/']
+gul
+gul
Axis syntax <
or >
.
Parse the axis gal and gar axis syntax.
Source
++ gul ;~(pose (cold 2 gal) (cold 3 gar))
Examples
> (scan "<" gul)
2
> (scan ">" gul)
3
+hex
+hex
Hex to $atom
.
Parse any hexadecimal number to an $atom
.
Source
++ hex (bass 16 (most gon hit))
Examples
> (scan "a" hex)
10
> (scan "A" hex)
10
> (scan "2A" hex)
42
> (scan "1ee7" hex)
7.911
> (scan "1EE7" hex)
7.911
> (scan "1EE7F7" hex)
2.025.463
> `@ux`(scan "1EE7F7" hex)
0x1e.e7f7
+hig
+hig
Uppercase.
Parse a single uppercase letter.
Source
++ hig (shim 'A' 'Z')
Examples
> (scan "G" hig)
'G'
> (scan "ABCDEFGHIJKLMNOPQRSTUVWXYZ" (star hig))
"ABCDEFGHIJKLMNOPQRSTUVWXYZ"
+hit
+hit
Hex digits.
Parse a single hexadecimal digit.
Source
++ hit ;~ pose
dit
(cook |=(a=char (sub a 87)) (shim 'a' 'f'))
(cook |=(a=char (sub a 55)) (shim 'A' 'F'))
==
Examples
> (scan "a" hit)
10
> (scan "A" hit)
10
> (scan "2A" (star hit))
~[2 10]
+iny
+iny
Indentation block.
Apply $rule
to indented block starting at current column number, omitting the leading whitespace.
Accepts
sef
is a $rule
Produces
A $rule
.
Source
++ iny
|* sef=rule
|= nail ^+ (sef)
=+ [har tap]=[p q]:+<
=+ lev=(fil 3 (dec q.har) ' ')
=+ eol=(just `@t`10)
=+ =- roq=((star ;~(pose prn ;~(sfix eol (jest lev)) -)) har tap)
;~(simu ;~(plug eol eol) eol)
?~ q.roq roq
=+ vex=(sef har(q 1) p.u.q.roq)
=+ fur=p.vex(q (add (dec q.har) q.p.vex))
?~ q.vex vex(p fur)
=- vex(p fur, u.q -)
:+ &3.vex
&4.vex(q.p (add (dec q.har) q.p.&4.vex))
=+ res=|4.vex
|- ?~ res |4.roq
?. =(10 -.res) [-.res $(res +.res)]
(welp [`@t`10 (trip lev)] $(res +.res))
Examples
> `tape`(scan " foo\0a bar" ;~(pfix ace ace ace (iny (star ;~(pose prn (just '\0a'))))))
"foo\0abar"
Discussion
Note the amount of indentation whitespace to be stripped from the beginning of each line is determined by the value of .q
(the column) in the $hair
when +iny
is first called. This means something like the +pfix
expression in the example above is necessary to set the level of indentation. Additionally, the $rule
given to +iny
must consume the whole line including the line ending.
+low
+low
Lowercase.
Parse a single lowercase letter.
Source
++ low (shim 'a' 'z')
Examples
> (scan "g" low)
'g'
+mes
+mes
Hexbyte.
Parse a hexbyte.
Source
++ mes %+ cook
|=({a/@ b/@} (add (mul 16 a) b))
;~(plug hit hit)
Examples
> (scan "2A" mes)
42
> (scan "42" mes)
66
+nix
+nix
Letters and underscore.
Parse Letters and _
.
Source
++ nix (boss 256 (star ;~(pose aln cab)))
Examples
> `@t`(scan "as_me" nix)
'as_me'
+nud
+nud
Numeric.
Parse a numeric character - A number.
Source
++ nud (shim '0' '9')
Examples
> (scan "0" nud)
'0'
> (scan "7" nud)
'7'
+prn
+prn
Printable character.
Parse any printable character.
Source
++ prn ;~(less (just `@`127) (shim 32 256))
Examples
> (scan "h" prn)
'h'
> (scan "!" prn)
'!'
> (scan "\01" prn)
! {1 1}
! exit
+qat
+qat
Chars in blockcord.
Parse a single character contained in a multi-line cord block.
Source
++ qat ;~ pose
prn
;~(less ;~(plug (just `@`10) soz) (just `@`10))
==
Examples
> ^- tape
%+ scan
"'''\0aabc\0adef\0aghi\0a'''"
%+ ifix
:- ;~(plug soz (just `@`10))
;~(plug (just `@`10) soz)
(star qat)
"abc\0adef\0aghi"
+qit
+qit
Chars in cord.
Parse an individual character to its cord $atom
representation. Escaped characters are converted to the value they represent.
Source
++ qit ;~ pose :: chars in a cord
;~(less bas soq prn)
;~(pfix bas ;~(pose bas soq mes)) :: escape chars
==
Examples
> (scan "%" qit)
'%'
> `tape`(scan "cord" (star qit))
"cord"
> `tape`(scan "\\0a" (star qit))
"\0a"
+qut
+qut
Cord.
Parse single-soq cord with \{gap}/
anywhere in the middle, or triple-single quote (aka triple-soq) cord, between which must be in an indented block.
Source
++ qut ;~ simu soq
;~ pose
;~ less soz
(ifix [soq soq] (boss 256 (more gon qit)))
==
=+ hed=;~(pose ;~(plug (plus ace) vul) (just '\0a'))
%- iny %+ ifix
:- ;~(plug soz hed)
;~(plug (just '\0a') soz)
(boss 256 (star qat))
==
==
Examples
> `@t`(scan "'cord'" qut)
'cord'
> `@t`(scan "'''\0aabc\0adef\0a'''" qut)
'abc\0adef'
+soz
+soz
Delimiting '''
.
Parse a triple-single quote, used for multiline strings.
Source
++ soz ;~(plug soq soq soq)
Examples
> (scan "'''" soz)
['\'' '\'' '\'']
+sym
+sym
Term.
A term: a lowercase letter, followed by letters, numbers, or -
.
Source
++ sym
%+ cook
|=(a=tape (rap 3 ^-((list @) a)))
;~(plug low (star ;~(pose nud low hep)))
Examples
> `term`(scan "sam-2" sym)
%sam-2
+mixed-case-symbol
+mixed-case-symbol
Mixed-case term.
The same as +sym
but allowing uppercase letters.
Source
++ mixed-case-symbol
%+ cook
|=(a=tape (rap 3 ^-((list @) a)))
;~(plug alf (star alp))
Examples
> `term`(scan "sAm-2" mixed-case-symbol)
%sAm-2
+ven
+ven
+>-
axis syntax.
Axis syntax parser
Source
++ ven ;~ (comp |=([a=@ b=@] (peg a b)))
bet
=+ hom=`?`|
|= tub=nail
^- (like @)
=+ vex=?:(hom (bet tub) (gul tub))
?~ q.vex
[p.tub [~ 1 tub]]
=+ wag=$(p.tub p.vex, hom !hom, tub q.u.q.vex)
?> ?=(^ q.wag)
[p.wag [~ (peg p.u.q.vex p.u.q.wag) q.u.q.wag]]
==
Examples
> (scan "->+" ven)
11
> (scan "->+<-" ven)
44
+vit
+vit
Base64 digit.
Parse a standard base64 digit.
Source
++ vit
;~ pose
(cook |=(a=@ (sub a 65)) (shim 'A' 'Z'))
(cook |=(a=@ (sub a 71)) (shim 'a' 'z'))
(cook |=(a=@ (add a 4)) (shim '0' '9'))
(cold 62 (just '-'))
(cold 63 (just '+'))
==
Examples
> (scan "C" vit)
2
> (scan "c" vit)
28
> (scan "2" vit)
54
> (scan "-" vit)
62
+vul
+vul
Comments to null.
Parse comments and produce a null. Note that a comment must be ended with a newline character.
Source
++ vul %+ cold ~
;~ plug col col
(star prn)
(just `@`10)
==
Examples
> (scan "::this is a comment \0a" vul)
~
> (scan "::this is a comment " vul)
! {1 21}
! exit
Last updated