4i: Parsing (Useful Idioms)
+alf
+alf
Alphabetic characters
Parse alphabetic characters, both upper and lowercase.
Source
++ alf ;~(pose low hig)
Examples
> (scan "a" alf)
'a'
> (scan "A" alf)
'A'
> `tape`(scan "AaBbCc" (star alf))
"AaBbCc"
+aln
+aln
Alphanumeric characters
Parse alphanumeric characters - both alphabetic characters and numbers.
Source
++ aln ;~(pose low hig nud)
Examples
> (scan "0" aln)
'0'
> `tape`(scan "alf42" (star aln))
"alf42"
+alp
+alp
Alphanumeric and -
Parse alphanumeric strings and hep, "-".
Source
++ alp ;~(pose low hig nud hep)
Examples
> (scan "7" alp)
'7'
> (scan "s" alp)
's'
> `tape`(scan "123abc-" (star alp))
"123abc-"
+bet
+bet
Axis syntax -
, +
Parse the hep and lus axis syntax.
Source
++ bet ;~(pose (cold 2 hep) (cold 3 lus))
Examples
> (scan "-" bet)
2
> (scan "+" bet)
3
+bin
+bin
Binary to atom
Parse a tape of binary (0s and 1s) and produce its atomic representation.
Source
++ bin (bass 2 (most gon but))
Examples
> (scan "0000" bin)
0
> (scan "0001" bin)
1
> (scan "0010" bin)
2
> (scan "100000001111" bin)
2.063
+but
+but
Binary digit
Parse a single binary digit.
Source
++ but (cook |=(a=@ (sub a '0')) (shim '0' '1'))
Examples
> (scan "0" but)
0
> (scan "1" but)
1
> (scan "01" but)
! {1 2}
! 'syntax-error'
! exit
> (scan "01" (star but))
~[0 1]
+cit
+cit
Octal digit
Parse a single octal digit.
Source
++ cit (cook |=(a=@ (sub a '0')) (shim '0' '7'))
Examples
> (scan "1" cit)
1
> (scan "7" cit)
7
> (scan "8" cit)
! {1 1}
! 'syntax-error'
! exit
> (scan "60" (star cit))
~[6 0]
+dem
+dem
Decimal to atom
Parse a decimal number to an atom.
Source
++ dem (bass 10 (most gon dit))
Examples
> (scan "7" dem)
7
> (scan "42" dem)
42
> (scan "150000000" dem)
150.000.000
> (scan "12456" dem)
12.456
+dit
+dit
Decimal digit
Parse a single decimal digit.
Source
++ dit (cook |=(a=@ (sub a '0')) (shim '0' '9'))
Examples
> (scan "7" dit)
7
> (scan "42" (star dit))
~[4 2]
> (scan "26000" (star dit))
~[2 6 0 0 0]
+dog
+dog
.
optional gap
Dot followed by an optional gap, used with numbers.
Source
++ dog ;~(plug dot gay)
Examples
> (scan "." dog)
['.' ~]
> (scan "a. " ;~(pfix alf dog))
['.' ~]
+dof
+dof
-
optional gap
Hep followed by an optional gap, used with @p
& @q
syntax.
Source
++ dof ;~(plug hep gay)
Examples
> (scan "-" dof)
['-' ~]
> (scan "- " dof)
['-' ~]
+doh
+doh
@p
separator
Phonetic base phrase separator
Source
++ doh ;~(plug ;~(plug hep hep) gay)
Examples
> (scan "--" doh)
[['-' '-'] ~]
> (scan "-- " doh)
[['-' '-'] ~]
+dun
+dun
--
to ~
Parse phep, --
, to null, ~
.
Source
++ dun (cold ~ ;~(plug hep hep))
Examples
> (scan "--" dun)
~
+duz
+duz
==
to ~
Parse stet, ==
, to null ~
.
Source
++ duz (cold ~ ;~(plug tis tis))
Examples
> (scan "==" duz)
~
+gah
+gah
Newline or ' '
Whitespace component, either newline or space.
Source
++ gah (mask [`@`10 ' ' ~])
Examples
> `tape`(scan " \0a \0a" (star gah))
" \0a \0a"
+gap
+gap
Plural whitespace
Separates tall runes
Source
++ gap (cold ~ ;~(plug gaq (star ;~(pose vul gah))))
Examples
> `tape`(scan " \0a \0a" gap)
""
> (scan "\0a \0a XYZ" ;~(pfix gap (jest 'XYZ')))
'XYZ'
+gaq
+gaq
End of line
Two spaces, a newline, or comment.
Source
++ gaq ;~ pose
(just `@`10)
;~(plug gah ;~(pose gah vul))
vul
==
Examples
> (scan "123\0a" ;~(sfix dem gaq))
123
> (scan "123 :: foo\0a" ;~(sfix dem gaq))
123
+gaw
+gaw
Classic whitespace
Terran whitespace.
Source
++ gaw (cold ~ (star ;~(pose vul gah)))
Examples
> (scan " \0a :: foo \0a" gaw)
~
> (scan " " gaw)
~
> (scan "\0a" gaw)
~
+gay
+gay
Optional gap
Optional gap.
Source
++ gay ;~(pose gap (easy ~))
Examples
> (scan " " gay)
~
> (scan " " gay)
~
> (scan "\0a" gay)
~
> (scan "" gay)
~
+gon
+gon
Long numbers
Parse long numbers - Numbers which wrap around the shell with the line
Source
++ gon ;~(pose ;~(plug bas gay fas) (easy ~))
Examples
> 'abc\
/def'
'abcdef'
> (scan "\\\0a/" gon)
['\\' ~ '/']
+gul
+gul
Axis syntax <
or >
Parse the axis gal and gar axis syntax.
Source
++ gul ;~(pose (cold 2 gal) (cold 3 gar))
Examples
> (scan "<" gul)
2
> (scan ">" gul)
3
+hex
+hex
Hex to atom
Parse any hexadecimal number to an atom.
Source
++ hex (bass 16 (most gon hit))
Examples
> (scan "a" hex)
10
> (scan "A" hex)
10
> (scan "2A" hex)
42
> (scan "1ee7" hex)
7.911
> (scan "1EE7" hex)
7.911
> (scan "1EE7F7" hex)
2.025.463
> `@ux`(scan "1EE7F7" hex)
0x1e.e7f7
+hig
+hig
Uppercase
Parse a single uppercase letter.
Source
++ hig (shim 'A' 'Z')
Examples
> (scan "G" hig)
'G'
> (scan "ABCDEFGHIJKLMNOPQRSTUVWXYZ" (star hig))
"ABCDEFGHIJKLMNOPQRSTUVWXYZ"
+hit
+hit
Hex digits
Parse a single hexadecimal digit.
Source
++ hit ;~ pose
dit
(cook |=(a=char (sub a 87)) (shim 'a' 'f'))
(cook |=(a=char (sub a 55)) (shim 'A' 'F'))
==
Examples
> (scan "a" hit)
10
> (scan "A" hit)
10
> (scan "2A" (star hit))
~[2 10]
+iny
+iny
Indentation block
Apply +rule
to indented block starting at current column number, omitting the leading whitespace.
Accepts
sef
is a +rule
Produces
A +rule
.
Source
++ iny
|* sef=rule
|= nail ^+ (sef)
=+ [har tap]=[p q]:+<
=+ lev=(fil 3 (dec q.har) ' ')
=+ eol=(just `@t`10)
=+ =- roq=((star ;~(pose prn ;~(sfix eol (jest lev)) -)) har tap)
;~(simu ;~(plug eol eol) eol)
?~ q.roq roq
=+ vex=(sef har(q 1) p.u.q.roq)
=+ fur=p.vex(q (add (dec q.har) q.p.vex))
?~ q.vex vex(p fur)
=- vex(p fur, u.q -)
:+ &3.vex
&4.vex(q.p (add (dec q.har) q.p.&4.vex))
=+ res=|4.vex
|- ?~ res |4.roq
?. =(10 -.res) [-.res $(res +.res)]
(welp [`@t`10 (trip lev)] $(res +.res))
Examples
> `tape`(scan " foo\0a bar" ;~(pfix ace ace ace (iny (star ;~(pose prn (just '\0a'))))))
"foo\0abar"
Discussion
Note the amount of indentation whitespace to be stripped from the beginning of each line is determined by the value of .q
(the column) in the $hair
when +iny
is first called. This means something like the +pfix
expression in the example above is necessary to set the level of indentation. Additionally, the +rule
given to +iny
must consume the whole line including the line ending.
+low
+low
Lowercase
Parse a single lowercase letter.
Source
++ low (shim 'a' 'z')
Examples
> (scan "g" low)
'g'
+mes
+mes
Hexbyte
Parse a hexbyte.
Source
++ mes %+ cook
|=({a/@ b/@} (add (mul 16 a) b))
;~(plug hit hit)
Examples
> (scan "2A" mes)
42
> (scan "42" mes)
66
+nix
+nix
Letters and underscore
Parse Letters and _
.
Source
++ nix (boss 256 (star ;~(pose aln cab)))
Examples
> `@t`(scan "as_me" nix)
'as_me'
+nud
+nud
Numeric
Parse a numeric character - A number.
Source
++ nud (shim '0' '9')
Examples
> (scan "0" nud)
'0'
> (scan "7" nud)
'7'
+prn
+prn
Printable character
Parse any printable character.
Source
++ prn ;~(less (just `@`127) (shim 32 256))
Examples
> (scan "h" prn)
'h'
> (scan "!" prn)
'!'
> (scan "\01" prn)
! {1 1}
! exit
+qat
+qat
Chars in blockcord
Parse a single character contained in a multi-line cord block.
Source
++ qat ;~ pose
prn
;~(less ;~(plug (just `@`10) soz) (just `@`10))
==
Examples
> ^- tape
%+ scan
"'''\0aabc\0adef\0aghi\0a'''"
%+ ifix
:- ;~(plug soz (just `@`10))
;~(plug (just `@`10) soz)
(star qat)
"abc\0adef\0aghi"
+qit
+qit
Chars in cord
Parse an individual character to its cord atom representation. Escaped characters are converted to the value they represent.
Source
++ qit ;~ pose :: chars in a cord
;~(less bas soq prn)
;~(pfix bas ;~(pose bas soq mes)) :: escape chars
==
Examples
> (scan "%" qit)
'%'
> `tape`(scan "cord" (star qit))
"cord"
> `tape`(scan "\\0a" (star qit))
"\0a"
+qut
+qut
Cord
Parse single-soq cord with \{gap}/
anywhere in the middle, or triple-single quote (aka triple-soq) cord, between which must be in an indented block.
Source
++ qut ;~ simu soq
;~ pose
;~ less soz
(ifix [soq soq] (boss 256 (more gon qit)))
==
=+ hed=;~(pose ;~(plug (plus ace) vul) (just '\0a'))
%- iny %+ ifix
:- ;~(plug soz hed)
;~(plug (just '\0a') soz)
(boss 256 (star qat))
==
==
Examples
> `@t`(scan "'cord'" qut)
'cord'
> `@t`(scan "'''\0aabc\0adef\0a'''" qut)
'abc\0adef'
+soz
+soz
Delimiting '''
Parse a triple-single quote, used for multiline strings.
Source
++ soz ;~(plug soq soq soq)
Examples
> (scan "'''" soz)
['\'' '\'' '\'']
+sym
+sym
Term
A term: a lowercase letter, followed by letters, numbers, or -
.
Source
++ sym
%+ cook
|=(a=tape (rap 3 ^-((list @) a)))
;~(plug low (star ;~(pose nud low hep)))
Examples
> `term`(scan "sam-2" sym)
%sam-2
+mixed-case-symbol
+mixed-case-symbol
Mixed-case term
The same as +sym
but allowing uppercase letters.
Source
++ mixed-case-symbol
%+ cook
|=(a=tape (rap 3 ^-((list @) a)))
;~(plug alf (star alp))
Examples
> `term`(scan "sAm-2" mixed-case-symbol)
%sAm-2
+ven
+ven
+>-
axis syntax
Axis syntax parser
Source
++ ven ;~ (comp |=([a=@ b=@] (peg a b)))
bet
=+ hom=`?`|
|= tub=nail
^- (like @)
=+ vex=?:(hom (bet tub) (gul tub))
?~ q.vex
[p.tub [~ 1 tub]]
=+ wag=$(p.tub p.vex, hom !hom, tub q.u.q.vex)
?> ?=(^ q.wag)
[p.wag [~ (peg p.u.q.vex p.u.q.wag) q.u.q.wag]]
==
Examples
> (scan "->+" ven)
11
> (scan "->+<-" ven)
44
+vit
+vit
Base64 digit
Parse a standard base64 digit.
Source
++ vit
;~ pose
(cook |=(a=@ (sub a 65)) (shim 'A' 'Z'))
(cook |=(a=@ (sub a 71)) (shim 'a' 'z'))
(cook |=(a=@ (add a 4)) (shim '0' '9'))
(cold 62 (just '-'))
(cold 63 (just '+'))
==
Examples
> (scan "C" vit)
2
> (scan "c" vit)
28
> (scan "2" vit)
54
> (scan "-" vit)
62
+vul
+vul
Comments to null
Parse comments and produce a null. Note that a comment must be ended with a newline character.
Source
++ vul %+ cold ~
;~ plug col col
(star prn)
(just `@`10)
==
Examples
> (scan "::this is a comment \0a" vul)
~
> (scan "::this is a comment " vul)
! {1 21}
! exit
Last updated