Urbit Docs
  • What is Urbit?
  • Get on Urbit
  • Build on Urbit
    • Contents
    • Environment Setup
    • Hoon School
      • 1. Hoon Syntax
      • 2. Azimuth (Urbit ID)
      • 3. Gates (Functions)
      • 4. Molds (Types)
      • 5. Cores
      • 6. Trees and Addressing
      • 7. Libraries
      • 8. Testing Code
      • 9. Text Processing I
      • 10. Cores and Doors
      • 11. Data Structures
      • 12. Type Checking
      • 13. Conditional Logic
      • 14. Subject-Oriented Programming
      • 15. Text Processing II
      • 16. Functional Programming
      • 17. Text Processing III
      • 18. Generic and Variant Cores
      • 19. Mathematics
    • App School I
      • 1. Arvo
      • 2. The Agent Core
      • 3. Imports and Aliases
      • 4. Lifecycle
      • 5. Cards
      • 6. Pokes
      • 7. Structures and Marks
      • 8. Subscriptions
      • 9. Vanes
      • 10. Scries
      • 11. Failure
      • 12. Next Steps
      • Appendix: Types
    • App School II (Full-Stack)
      • 1. Types
      • 2. Agent
      • 3. JSON
      • 4. Marks
      • 5. Eyre
      • 6. React app setup
      • 7. React app logic
      • 8. Desk and glob
      • 9. Summary
    • Core Academy
      • 1. Evaluating Nock
      • 2. Building Hoon
      • 3. The Core Stack
      • 4. Arvo I: The Main Sequence
      • 5. Arvo II: The Boot Sequence
      • 6. Vere I: u3 and the Serf
      • 7. Vere II: The Loom
      • 8. Vanes I: Behn, Dill, Kahn, Lick
      • 9. Vanes II: Ames
      • 10. Vanes III: Eyre, Iris
      • 11. Vanes IV: Clay
      • 12. Vanes V: Gall and Userspace
      • 13. Vanes VI: Khan, Lick
      • 14. Vanes VII: Jael, Azimuth
    • Runtime
      • U3
      • Conn.c Guide
      • How to Write a Jet
      • API Overview by Prefix
      • C in Urbit
      • Cryptography
      • Land of Nouns
    • Tools
      • Useful Links
      • JS Libraries
        • HTTP API
      • Docs App
        • File Format
        • Index File
        • Suggested Structure
    • Userspace
      • Command-Line App Tutorial
      • Remote Scry
      • Unit Tests
      • Software Distribution
        • Software Distribution Guide
        • Docket File
        • Glob
      • Examples
        • Building a CLI App
        • Debugging Wrapper
        • Host a Website
        • Serving a JS Game
        • Ship Monitoring
        • Styled Text
  • Urbit ID
    • What is Urbit ID?
    • Azimuth Data Flow
    • Life and Rift
    • Urbit HD Wallet
    • Advanced Azimuth Tools
    • Custom Roller Tutorial
    • Azimuth.eth Reference
    • Ecliptic.eth Reference
    • Layer 2
      • L2 Actions
      • L2 Rollers
      • L2 Roller HTTP RPC-API
      • L2 Transaction Format
  • Urbit OS
    • What is Urbit OS?
    • Base
      • Hood
      • Threads
        • Basics Tutorial
          • Bind
          • Fundamentals
          • Input
          • Output
          • Summary
        • HTTP API Guide
        • Spider API Reference
        • Strandio Reference
        • Examples
          • Child Thread
          • Fetch JSON
          • Gall
            • Poke Thread
            • Start Thread
            • Stop Thread
            • Take Facts
            • Take Result
          • Main-loop
          • Poke Agent
          • Scry
          • Take Fact
    • Kernel
      • Arvo
        • Cryptography
        • Move Trace
        • Scries
        • Subscriptions
      • Ames
        • Ames API Reference
        • Ames Cryptography
        • Ames Data Types
        • Ames Scry Reference
      • Behn
        • Behn API Reference
        • Behn Examples
        • Behn Scry Reference
      • Clay
        • Clay API Reference
        • Clay Architecture
        • Clay Data Types
        • Clay Examples
        • Clay Scry Reference
        • Filesystem Hierarchy
        • Marks
          • Mark Examples
          • Using Marks
          • Writing Marks
        • Using Clay
      • Dill
        • Dill API Reference
        • Dill Data Types
        • Dill Scry Reference
      • Eyre
        • EAuth
        • Eyre Data Types
        • Eyre External API
        • Eyre Internal API
        • Eyre Scry Reference
        • Low-Level Eyre Guide
        • Noun channels
      • Gall
        • Gall API Reference
        • Gall Data Types
        • Gall Scry Reference
      • Iris
        • Iris API Reference
        • Iris Data Types
        • Iris Example
      • Jael
        • Jael API Reference
        • Jael Data Types
        • Jael Examples
        • Jael Scry Reference
      • Khan
        • Khan API Reference
        • Khan Data Types
        • Khan Example
      • Lick
        • Lick API Reference
        • Lick Guide
        • Lick Examples
        • Lick Scry Reference
  • Hoon
    • Why Hoon?
    • Advanced Types
    • Arvo
    • Auras
    • Basic Types
    • Cheat Sheet
    • Cryptography
    • Examples
      • ABC Blocks
      • Competitive Programming
      • Emirp
      • Gleichniszahlenreihe
      • Islands
      • Luhn Number
      • Minimum Path Sum
      • Phone Letters
      • Restore IP
      • Rhonda Numbers
      • Roman Numerals
      • Solitaire Cipher
      • Water Towers
    • Generators
    • Hoon Errors
    • Hoon Style Guide
    • Implementing an Aura
    • Irregular forms
    • JSON
    • Limbs and wings
      • Limbs
      • Wings
    • Mips (Maps of Maps)
    • Parsing Text
    • Runes
      • | bar · Cores
      • $ buc · Structures
      • % cen · Calls
      • : col · Cells
      • . dot · Nock
      • / fas · Imports
      • ^ ket · Casts
      • + lus · Arms
      • ; mic · Make
      • ~ sig · Hints
      • = tis · Subject
      • ? wut · Conditionals
      • ! zap · Wild
      • Constants (Atoms and Strings)
      • --, == · Terminators
    • Sail (HTML)
    • Serialization
    • Sets
    • Standard Library
      • 1a: Basic Arithmetic
      • 1b: Tree Addressing
      • 1c: Molds and Mold-Builders
      • 2a: Unit Logic
      • 2b: List Logic
      • 2c: Bit Arithmetic
      • 2d: Bit Logic
      • 2e: Insecure Hashing
      • 2f: Noun Ordering
      • 2g: Unsigned Powers
      • 2h: Set Logic
      • 2i: Map Logic
      • 2j: Jar and Jug Logic
      • 2k: Queue Logic
      • 2l: Container from Container
      • 2m: Container from Noun
      • 2n: Functional Hacks
      • 2o: Normalizing Containers
      • 2p: Serialization
      • 2q: Molds and Mold-Builders
      • 3a: Modular and Signed Ints
      • 3b: Floating Point
      • 3c: Urbit Time
      • 3d: SHA Hash Family
      • 3e: AES encryption (Removed)
      • 3f: Scrambling
      • 3g: Molds and Mold-Builders
      • 4a: Exotic Bases
      • 4b: Text Processing
      • 4c: Tank Printer
      • 4d: Parsing (Tracing)
      • 4e: Parsing (Combinators)
      • 4f: Parsing (Rule-Builders)
      • 4g: Parsing (Outside Caller)
      • 4h: Parsing (ASCII Glyphs)
      • 4i: Parsing (Useful Idioms)
      • 4j: Parsing (Bases and Base Digits)
      • 4k: Atom Printing
      • 4l: Atom Parsing
      • 4m: Formatting Functions
      • 4n: Virtualization
      • 4o: Molds
      • 5a: Compiler Utilities
      • 5b: Macro Expansion
      • 5c: Compiler Backend & Prettyprinter
      • 5d: Parser
      • 5e: Molds and mold builders
      • 5f: Profiling support
    • Strings
    • The Engine Pattern
    • Udon (Markdown-esque)
    • Vases
    • Zuse
      • 2d(1-5): To JSON, Wains
      • 2d(6): From JSON
      • 2d(7): From JSON (unit)
      • 2e(2-3): Print & Parse JSON
      • 2m: Ordered Maps
  • Nock
    • What is Nock?
    • Decrement
    • Definition
    • Fast Hints and Jets
    • Implementations
    • Specification
  • User Manual
    • Contents
    • Running Urbit
      • Cloud Hosting
      • Home Servers
      • Runtime Reference
      • Self-hosting S3 Storage with MinIO
    • Urbit ID
      • Bridge Troubleshooting
      • Creating an Invite Pool
      • Get an Urbit ID
      • Guide to Factory Resets
      • HD Wallet (Master Ticket)
      • Layer 2 for planets
      • Layer 2 for stars
      • Proxies
      • Using Bridge
    • Urbit OS
      • Basics
      • Configuring S3 Storage
      • Dojo Tools
      • Filesystem
      • Shell
      • Ship Troubleshooting
      • Star and Galaxy Operations
      • Updates
Powered by GitBook

GitHub

  • Urbit ID
  • Urbit OS
  • Runtime

Resources

  • YouTube
  • Whitepaper
  • Awesome Urbit

Contact

  • X
  • Email
  • Gather
On this page
  • +alf
  • +aln
  • +alp
  • +bet
  • +bin
  • +but
  • +cit
  • +dem
  • +dit
  • +dog
  • +dof
  • +doh
  • +dun
  • +duz
  • +gah
  • +gap
  • +gaq
  • +gaw
  • +gay
  • +gon
  • +gul
  • +hex
  • +hig
  • +hit
  • +iny
  • +low
  • +mes
  • +nix
  • +nud
  • +prn
  • +qat
  • +qit
  • +qut
  • +soz
  • +sym
  • +mixed-case-symbol
  • +ven
  • +vit
  • +vul
Edit on GitHub
  1. Hoon
  2. Standard Library

4i: Parsing (Useful Idioms)

+alf

Alphabetic characters

Parse alphabetic characters, both upper and lowercase.

Source

++  alf  ;~(pose low hig)

Examples

> (scan "a" alf)
'a'

> (scan "A" alf)
'A'

> `tape`(scan "AaBbCc" (star alf))
"AaBbCc"

+aln

Alphanumeric characters

Parse alphanumeric characters - both alphabetic characters and numbers.

Source

++  aln  ;~(pose low hig nud)

Examples

> (scan "0" aln)
'0'

> `tape`(scan "alf42" (star aln))
"alf42"

+alp

Alphanumeric and -

Parse alphanumeric strings and hep, "-".

Source

++  alp  ;~(pose low hig nud hep)

Examples

> (scan "7" alp)
'7'

> (scan "s" alp)
's'

> `tape`(scan "123abc-" (star alp))
"123abc-"

+bet

Axis syntax -, +

Parse the hep and lus axis syntax.

Source

++  bet  ;~(pose (cold 2 hep) (cold 3 lus))

Examples

> (scan "-" bet)
2

> (scan "+" bet)
3

+bin

Binary to atom

Parse a tape of binary (0s and 1s) and produce its atomic representation.

Source

++  bin  (bass 2 (most gon but))

Examples

> (scan "0000" bin)
0

> (scan "0001" bin)
1

> (scan "0010" bin)
2

> (scan "100000001111" bin)
2.063

+but

Binary digit

Parse a single binary digit.

Source

++  but  (cook |=(a=@ (sub a '0')) (shim '0' '1'))

Examples

> (scan "0" but)
0

> (scan "1" but)
1

> (scan "01" but)
! {1 2}
! 'syntax-error'
! exit

> (scan "01" (star but))
~[0 1]

+cit

Octal digit

Parse a single octal digit.

Source

++  cit  (cook |=(a=@ (sub a '0')) (shim '0' '7'))

Examples

> (scan "1" cit)
1
> (scan "7" cit)
7
> (scan "8" cit)
! {1 1}
! 'syntax-error'
! exit
> (scan "60" (star cit))
~[6 0]

+dem

Decimal to atom

Parse a decimal number to an atom.

Source

++  dem  (bass 10 (most gon dit))

Examples

> (scan "7" dem)
7

> (scan "42" dem)
42

> (scan "150000000" dem)
150.000.000

> (scan "12456" dem)
12.456

+dit

Decimal digit

Parse a single decimal digit.

Source

++  dit  (cook |=(a=@ (sub a '0')) (shim '0' '9'))

Examples

> (scan "7" dit)
7

> (scan "42" (star dit))
~[4 2]

> (scan "26000" (star dit))
~[2 6 0 0 0]

+dog

. optional gap

Dot followed by an optional gap, used with numbers.

Source

++  dog  ;~(plug dot gay)

Examples

> (scan "." dog)
['.' ~]

> (scan "a.        " ;~(pfix alf dog))
['.' ~]

+dof

- optional gap

Hep followed by an optional gap, used with @p & @q syntax.

Source

++  dof  ;~(plug hep gay)

Examples

> (scan "-" dof)
['-' ~]

> (scan "-     " dof)
['-' ~]

+doh

@p separator

Phonetic base phrase separator

Source

++  doh  ;~(plug ;~(plug hep hep) gay)

Examples

> (scan "--" doh)
[['-' '-'] ~]

> (scan "--     " doh)
[['-' '-'] ~]

+dun

-- to ~

Parse phep, --, to null, ~.

Source

++  dun  (cold ~ ;~(plug hep hep))

Examples

> (scan "--" dun)
~

+duz

== to ~

Parse stet, ==, to null ~.

Source

++  duz  (cold ~ ;~(plug tis tis))

Examples

> (scan "==" duz)
~

+gah

Newline or ' '

Whitespace component, either newline or space.

Source

++  gah  (mask [`@`10 ' ' ~])

Examples

> `tape`(scan " \0a \0a" (star gah))
" \0a \0a"

+gap

Plural whitespace

Separates tall runes

Source

++  gap  (cold ~ ;~(plug gaq (star ;~(pose vul gah))))

Examples

> `tape`(scan " \0a \0a" gap)
""

> (scan "\0a   \0a XYZ" ;~(pfix gap (jest 'XYZ')))
'XYZ'

+gaq

End of line

Two spaces, a newline, or comment.

Source

++  gaq  ;~  pose
             (just `@`10)
             ;~(plug gah ;~(pose gah vul))
             vul
         ==

Examples

> (scan "123\0a" ;~(sfix dem gaq))
123

> (scan "123 :: foo\0a" ;~(sfix dem gaq))
123

+gaw

Classic whitespace

Terran whitespace.

Source

++  gaw  (cold ~ (star ;~(pose vul gah)))

Examples

> (scan "  \0a  :: foo  \0a" gaw)
~

> (scan "  " gaw)
~

> (scan "\0a" gaw)
~

+gay

Optional gap

Optional gap.

Source

++  gay  ;~(pose gap (easy ~))

Examples

> (scan "  " gay)
~

> (scan "     " gay)
~

> (scan "\0a" gay)
~

> (scan "" gay)
~

+gon

Long numbers

Parse long numbers - Numbers which wrap around the shell with the line

Source

++  gon  ;~(pose ;~(plug bas gay fas) (easy ~))

Examples

> 'abc\
  /def'
'abcdef'

> (scan "\\\0a/" gon)
['\\' ~ '/']

+gul

Axis syntax < or >

Parse the axis gal and gar axis syntax.

Source

++  gul  ;~(pose (cold 2 gal) (cold 3 gar))

Examples

> (scan "<" gul)
2

> (scan ">" gul)
3

+hex

Hex to atom

Parse any hexadecimal number to an atom.

Source

++  hex  (bass 16 (most gon hit))

Examples

> (scan "a" hex)
10

> (scan "A" hex)
10

> (scan "2A" hex)
42

> (scan "1ee7" hex)
7.911

> (scan "1EE7" hex)
7.911

> (scan "1EE7F7" hex)
2.025.463

> `@ux`(scan "1EE7F7" hex)
0x1e.e7f7

+hig

Uppercase

Parse a single uppercase letter.

Source

++  hig  (shim 'A' 'Z')

Examples

> (scan "G" hig)
'G'

> (scan "ABCDEFGHIJKLMNOPQRSTUVWXYZ" (star hig))
"ABCDEFGHIJKLMNOPQRSTUVWXYZ"

+hit

Hex digits

Parse a single hexadecimal digit.

Source

++  hit  ;~  pose
           dit
           (cook |=(a=char (sub a 87)) (shim 'a' 'f'))
           (cook |=(a=char (sub a 55)) (shim 'A' 'F'))
         ==

Examples

> (scan "a" hit)
10

> (scan "A" hit)
10

> (scan "2A" (star hit))
~[2 10]

+iny

Indentation block

Apply rule to indented block starting at current column number, omitting the leading whitespace.

Accepts

sef is a rule

Produces

A rule.

Source

++  iny
  |*  sef=rule
  |=  nail  ^+  (sef)
  =+  [har tap]=[p q]:+<
  =+  lev=(fil 3 (dec q.har) ' ')
  =+  eol=(just `@t`10)
  =+  =-  roq=((star ;~(pose prn ;~(sfix eol (jest lev)) -)) har tap)
      ;~(simu ;~(plug eol eol) eol)
  ?~  q.roq  roq
  =+  vex=(sef har(q 1) p.u.q.roq)
  =+  fur=p.vex(q (add (dec q.har) q.p.vex))
  ?~  q.vex  vex(p fur)
  =-  vex(p fur, u.q -)
  :+  &3.vex
    &4.vex(q.p (add (dec q.har) q.p.&4.vex))
  =+  res=|4.vex
  |-  ?~  res  |4.roq
  ?.  =(10 -.res)  [-.res $(res +.res)]
  (welp [`@t`10 (trip lev)] $(res +.res))

Examples

> `tape`(scan "   foo\0a   bar" ;~(pfix ace ace ace (iny (star ;~(pose prn (just '\0a'))))))
"foo\0abar"

Discussion

Note the amount of indentation whitespace to be stripped from the beginning of each line is determined by the value of q (the column) in the hair when ++iny is first called. This means something like the pfix expression in the example above is necessary to set the level of indentation. Additionally, the rule given to ++iny must consume the whole line including the line ending.


+low

Lowercase

Parse a single lowercase letter.

Source

++  low  (shim 'a' 'z')

Examples

> (scan "g" low)
'g'

+mes

Hexbyte

Parse a hexbyte.

Source

++  mes  %+  cook
           |=({a/@ b/@} (add (mul 16 a) b))
         ;~(plug hit hit)

Examples

> (scan "2A" mes)
42

> (scan "42" mes)
66

+nix

Letters and underscore

Parse Letters and _.

Source

++  nix  (boss 256 (star ;~(pose aln cab)))

Examples

> `@t`(scan "as_me" nix)
'as_me'

+nud

Numeric

Parse a numeric character - A number.

Source

++  nud  (shim '0' '9')

Examples

> (scan "0" nud)
'0'

> (scan "7" nud)
'7'

+prn

Printable character

Parse any printable character.

Source

++  prn  ;~(less (just `@`127) (shim 32 256))

Examples

> (scan "h" prn)
'h'

> (scan "!" prn)
'!'

> (scan "\01" prn)
! {1 1}
! exit

+qat

Chars in blockcord

Parse a single character contained in a multi-line cord block.

Source

++  qat  ;~  pose
             prn
             ;~(less ;~(plug (just `@`10) soz) (just `@`10))
         ==

Examples

> ^-  tape
  %+  scan
    "'''\0aabc\0adef\0aghi\0a'''"
  %+  ifix
    :-  ;~(plug soz (just `@`10))
    ;~(plug (just `@`10) soz)
  (star qat)
"abc\0adef\0aghi"

+qit

Chars in cord

Parse an individual character to its cord atom representation. Escaped characters are converted to the value they represent.

Source

++  qit  ;~  pose                                       ::  chars in a cord
             ;~(less bas soq prn)
             ;~(pfix bas ;~(pose bas soq mes))          ::  escape chars
         ==

Examples

> (scan "%" qit)
'%'

> `tape`(scan "cord" (star qit))
"cord"

> `tape`(scan "\\0a" (star qit))
"\0a"

+qut

Cord

Parse single-soq cord with \{gap}/ anywhere in the middle, or triple-single quote (aka triple-soq) cord, between which must be in an indented block.

Source

++  qut  ;~  simu  soq
           ;~  pose
             ;~  less  soz
               (ifix [soq soq] (boss 256 (more gon qit)))
             ==
             =+  hed=;~(pose ;~(plug (plus ace) vul) (just '\0a'))
             %-  iny  %+  ifix
               :-  ;~(plug soz hed)
               ;~(plug (just '\0a') soz)
             (boss 256 (star qat))
           ==
         ==

Examples

> `@t`(scan "'cord'" qut)
'cord'

> `@t`(scan "'''\0aabc\0adef\0a'''" qut)
'abc\0adef'

+soz

Delimiting '''

Parse a triple-single quote, used for multiline strings.

Source

++  soz  ;~(plug soq soq soq)

Examples

> (scan "'''" soz)
['\'' '\'' '\'']

+sym

Term

A term: a lowercase letter, followed by letters, numbers, or -.

Source

++  sym
  %+  cook
    |=(a=tape (rap 3 ^-((list @) a)))
  ;~(plug low (star ;~(pose nud low hep)))

Examples

> `term`(scan "sam-2" sym)
%sam-2

+mixed-case-symbol

Mixed-case term

The same as ++sym but allowing uppercase letters.

Source

++  mixed-case-symbol
  %+  cook
    |=(a=tape (rap 3 ^-((list @) a)))
  ;~(plug alf (star alp))

Examples

> `term`(scan "sAm-2" mixed-case-symbol)
%sAm-2

+ven

+>- axis syntax

Axis syntax parser

Source

++  ven  ;~  (comp |=([a=@ b=@] (peg a b)))
           bet
           =+  hom=`?`|
           |=  tub=nail
           ^-  (like @)
           =+  vex=?:(hom (bet tub) (gul tub))
           ?~  q.vex
             [p.tub [~ 1 tub]]
           =+  wag=$(p.tub p.vex, hom !hom, tub q.u.q.vex)
           ?>  ?=(^ q.wag)
           [p.wag [~ (peg p.u.q.vex p.u.q.wag) q.u.q.wag]]
         ==

Examples

> (scan "->+" ven)
11

> (scan "->+<-" ven)
44

+vit

Base64 digit

Parse a standard base64 digit.

Source

++  vit
  ;~  pose
    (cook |=(a=@ (sub a 65)) (shim 'A' 'Z'))
    (cook |=(a=@ (sub a 71)) (shim 'a' 'z'))
    (cook |=(a=@ (add a 4)) (shim '0' '9'))
    (cold 62 (just '-'))
    (cold 63 (just '+'))
  ==

Examples

> (scan "C" vit)
2

> (scan "c" vit)
28

> (scan "2" vit)
54

> (scan "-" vit)
62

+vul

Comments to null

Parse comments and produce a null. Note that a comment must be ended with a newline character.

Source

++  vul  %+  cold   ~
         ;~  plug  col  col
           (star prn)
           (just `@`10)
         ==

Examples

> (scan "::this is a comment \0a" vul)
~

> (scan "::this is a comment " vul)
! {1 21}
! exit

Previous4h: Parsing (ASCII Glyphs)Next4j: Parsing (Bases and Base Digits)

Last updated 1 day ago