json Language Reference


Regular Grammar

 JavaScript Object Notation


 NOTES



 - see http://www.json.org



 - by definition, the character set is UTF-8. We have restricted the
   regular grammar accordingly; see the UTF-8(7) manual page. In particular,
   we restricted the productions so that only the shortest possible form
   is accepted, as required by Unicode 3.1. This makes the parser report
   errors when parsing a Latin-1 source, for instance.



 - Weirdness 3: JSON requires UTF-8, but provides only two bytes in "\u"
   literals. As a consequence, we made a corresponding restriction to the
   "UTF8" production, which accepts only one- to three-byte sequences.
   See RFC 4627 for UTC/UTF tricks beyond our abilities (and necessities).



 - Weirdness 1: Number syntax forces a normalized integer part (no leading
   zeros), but not a normalized exponent.



 - Weirdness 2: Escaped characters may include "/". It is unclear why and
   what for.



 - Extension: Comments are not allowed in JSON, but JavaScript-style comments
   have been added here for convenience.
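
Because the comment extension is not part of standard JSON, input written for this grammar has to lose its comments before a stock JSON parser will accept it. The following is a minimal sketch in Python, assuming the comment syntax of the Com token in the regular grammar. The names TOKEN and strip_comments and the sample input are hypothetical and not part of the grammar.

```python
import json
import re

# Either a complete string literal (kept) or a comment (dropped).
TOKEN = re.compile(
    r'"(?:[^"\\]|\\.)*"'                                  # string literal
    r'|//[^\x00-\x1f]*'                                    # "//" { Printable }
    r'|/\*(?:(?!\*/)[^\x00-\x08\x0b\x0c\x0e-\x1f])*\*/'   # "/*" ... "*/"
)


def strip_comments(text: str) -> str:
    """Replace every comment with one space; leave string literals intact."""
    def keep_strings(match):
        lexeme = match.group(0)
        return lexeme if lexeme.startswith('"') else " "
    return TOKEN.sub(keep_strings, text)


# Hypothetical input: the "comment" inside the string must survive.
source = '''{
  // line comment
  "url": "http://example.org/*not a comment*/", /* block
  comment */ "n": 1
}'''
value = json.loads(strip_comments(source))
```

Matching string literals in the same pass is a design choice of this sketch: it keeps "//" and "/*" inside strings from being mistaken for comment openers.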


 UTF8 meditation table



       0x00000000 - 0x0000007F:
           0xxxxxxx
 (first)          0

       0x00000080 - 0x000007FF:
           110xxxxx 10xxxxxx
 (first)         10   000000
           c2       80

       0x00000800 - 0x0000FFFF:
           1110xxxx 10xxxxxx 10xxxxxx
 (first)              100000   000000
           e0       a0

       0x00010000 - 0x001FFFFF:
           11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
 (first)               10000   000000   000000
           f0       90

       0x00200000 - 0x03FFFFFF:
           111110xx 10xxxxxx 10xxxxxx 10xxxxxx 10xxxxxx
 (first)                1000   000000   000000   000000
           f8       88

       0x04000000 - 0x7FFFFFFF:
           1111110x 10xxxxxx 10xxxxxx 10xxxxxx 10xxxxxx 10xxxxxx
 (first)                 100   000000   000000   000000   000000
           fc       84

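
The "(first)" rows of the table can be checked mechanically. The following is a small sketch in Python that encodes a code point by the bit layout of the first four rows and prints the smallest encoding of each range. The function name utf8_encode is hypothetical, and the five- and six-byte rows are deliberately left out.

```python
def utf8_encode(cp: int) -> bytes:
    """Encode one code point using the bit layout of the first four rows.

    Surrogate code points are not filtered in this sketch.
    """
    if cp < 0x80:
        return bytes([cp])
    if cp < 0x800:
        return bytes([0xC0 | (cp >> 6), 0x80 | (cp & 0x3F)])
    if cp < 0x10000:
        return bytes([0xE0 | (cp >> 12),
                      0x80 | ((cp >> 6) & 0x3F),
                      0x80 | (cp & 0x3F)])
    if cp < 0x200000:
        return bytes([0xF0 | (cp >> 18),
                      0x80 | ((cp >> 12) & 0x3F),
                      0x80 | ((cp >> 6) & 0x3F),
                      0x80 | (cp & 0x3F)])
    raise ValueError("five- and six-byte forms are not handled here")


# Smallest code point of each range, i.e. the "(first)" rows.
first = [utf8_encode(cp).hex(" ") for cp in (0x00, 0x80, 0x800, 0x10000)]
print(first)
```

The leading bytes of each result (c2 80, e0 a0, f0 90) are the lower bounds that the shortest-form productions enforce.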
let  Usuff  ::  "\80" .. "\bf"

let  UTF81  ::  "\00" .. "\7f"

let  UTF82  ::  "\c2" .. "\df" Usuff


 
     
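let  UTF83  ::  "\e0" "\a0" .. "\bf" Usuff | "\e1" .. "\ef" Usuff Usuff
note that only the lead byte e0 requires a second byte of a0 or above; e1 .. ef accept any Usuff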


let UTF84 = "\f0".."\f7" "\90".."\bf" Usuff Usuff
let UTF85 = "\f8".."\fb" "\88".."\bf" Usuff Usuff Usuff
let UTF86 = "\fc".."\fd" "\84".."\bf" Usuff Usuff Usuff Usuff

 
     
let  UTF8  ::  UTF81 | UTF82 | UTF83


FIXME see note above | UTF84 | UTF85 | UTF86
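
To see what the restricted UTF8 production accepts, it can be transcribed into a byte-level regular expression. The following is a sketch in Python, not the generated scanner. It assumes that only the lead byte e0 constrains its successor to a0..bf, while e1..ef may be followed by any continuation byte. The name is_restricted_utf8 is hypothetical.

```python
import re

# Byte-level transcription of the one- to three-byte productions.
# Assumption: after e0 the second byte must be a0..bf (shortest form);
# after e1..ef any continuation byte is allowed.
# Surrogate code points (ed a0..bf xx) are not excluded here.
USUFF = rb"[\x80-\xbf]"
UTF81 = rb"[\x00-\x7f]"
UTF82 = rb"[\xc2-\xdf]" + USUFF
UTF83 = rb"(?:\xe0[\xa0-\xbf]|[\xe1-\xef]" + USUFF + rb")" + USUFF
UTF8 = re.compile(rb"(?:" + UTF81 + rb"|" + UTF82 + rb"|" + UTF83 + rb")*")


def is_restricted_utf8(data: bytes) -> bool:
    """True if data consists only of shortest-form 1..3 byte sequences."""
    return UTF8.fullmatch(data) is not None


print(is_restricted_utf8("\u00e9".encode("utf-8")))    # two-byte sequence
print(is_restricted_utf8("\u00e9".encode("latin-1")))  # Latin-1 source
print(is_restricted_utf8(b"\xc0\x80"))                 # overlong encoding
```

A Latin-1 "e acute" is the single byte e9, which this pattern reads as a three-byte lead with no continuation bytes, so the input is rejected, as the notes intend.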

 
     
let  Digit19  ::  "1" .. "9"

let  Digit09  ::  Digit19 | "0"

let  Hex  ::  Digit09 | "a" .. "f" | "A" .. "F"

let  Int  ::  "0" | Digit19 { Digit09 }

let  Control  ::  "\00" .. "\1f"

let  Printable  ::  UTF8 - Control

let  Charset  ::  Printable | Spc

let  Char  ::  Printable - '\\\"' | "\\" ( '\\\"/bfnrt' | "u" Hex Hex Hex Hex )
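
The Char production reads more easily as a regular expression over decoded text. The following is a sketch in Python, under the assumption that the input has already been decoded, so the UTF8 byte productions are not repeated; as a result it also lets through characters outside the three-byte range, which the grammar itself does not. The names CHAR, STRING and is_string_token are hypothetical.

```python
import re

# Char: any printable character except '"' and '\', or a backslash escape.
CHAR = r'(?:[^\x00-\x1f"\\]|\\(?:["\\/bfnrt]|u[0-9a-fA-F]{4}))'

# A string token is a double-quoted run of Char.
STRING = re.compile('"' + CHAR + '*"')


def is_string_token(text: str) -> bool:
    """True if text is exactly one string token."""
    return STRING.fullmatch(text) is not None


print(is_string_token('"a\\/b"'))      # escaped slash (Weirdness 2)
print(is_string_token('"\\x41"'))      # \x is not a JSON escape
print(is_string_token('"tab\there"'))  # raw control character
```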


 
     
tok  String  ::  "\"" { Char } "\""


 
     
tok  Number  ::  [ "-" ] Int [ "." Digit09 + ] [ 'Ee' [ '+-' ] Digit09 + ]
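
The Number token translates directly into a regular expression, which makes the asymmetry between integer part and exponent (Weirdness 1 in the notes) easy to try out. The following is a sketch in Python; the names NUMBER and is_number_token are hypothetical.

```python
import re

# [ "-" ] Int [ "." Digit09 + ] [ 'Ee' [ '+-' ] Digit09 + ]
NUMBER = re.compile(r"-?(?:0|[1-9][0-9]*)(?:\.[0-9]+)?(?:[Ee][+-]?[0-9]+)?")


def is_number_token(text: str) -> bool:
    """True if text is exactly one Number token."""
    return NUMBER.fullmatch(text) is not None


print(is_number_token("1e007"))  # leading zeros in the exponent
print(is_number_token("01"))     # leading zero in the integer part
```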


 
     
tok  Sym  ::  '[]{}:,' | "true" | "false" | "null"

ign  Spc  ::  ' \n\r\t'
note that form-feed is not allowed

 
     
 TODO allow more Control (but not 0) in Charset below?
com  Com  ::  "//" { Printable } | "/*" ( { Charset } - ( { Charset } "*/" { Charset } ) ) "*/"


 



Context-free Grammar


 TODO extension: allow Value on LHS of Member.pair?

 TODO make a C data type to represent JSON or revisit [Network/XmlRpc.C]
      see [http://json-rpc.org], too, for this purpose.
      see [http://groups.google.com/group/json-rpc/web/json-rpc-2-0], too.

 note: quoted items are the Sym tokens; a blank alternative denotes the empty production
start  Src  ::  Value

let  Value  ::  "{" _List1 "}"
   |  "null"
   |  String
   |  Number
   |  Boolean
   |  "[" _List2 "]"

let  _List1  ::
   |  Member _List1_0

let  _List1_0  ::
   |  "," Member _List1_0

let  _List2  ::
   |  Value _List2_0

let  _List2_0  ::  "," Value _List2_0
   |

let  Boolean  ::  "true"
   |  "false"

let  Member  ::  String ":" Value
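
The productions above map one-to-one onto a small recursive-descent parser. The following is a sketch in Python, not the parser that Styx generates. All names (tokenize, Parser, parse) are hypothetical, comments (Com) are not handled, and the standard json module is borrowed only to decode individual String and Number lexemes.

```python
import json
import re

STRING = r'"(?:[^\x00-\x1f"\\]|\\(?:["\\/bfnrt]|u[0-9a-fA-F]{4}))*"'
NUMBER = r'-?(?:0|[1-9][0-9]*)(?:\.[0-9]+)?(?:[Ee][+-]?[0-9]+)?'
SYM = r'[\[\]{}:,]|true|false|null'
TOKEN = re.compile(r'[ \n\r\t]*(?:(?P<string>' + STRING + r')|(?P<number>'
                   + NUMBER + r')|(?P<sym>' + SYM + r'))')


def tokenize(text):
    """Split text into (kind, lexeme) pairs; Spc is skipped."""
    pos, tokens = 0, []
    while (m := TOKEN.match(text, pos)) is not None:
        tokens.append((m.lastgroup, m.group(m.lastgroup)))
        pos = m.end()
    if text[pos:].strip(" \n\r\t"):
        raise ValueError(f"unexpected input at offset {pos}")
    return tokens


class Parser:
    def __init__(self, tokens):
        self.tokens, self.pos = tokens, 0

    def take(self):
        if self.pos >= len(self.tokens):
            raise ValueError("unexpected end of input")
        token = self.tokens[self.pos]
        self.pos += 1
        return token

    def peek_is(self, lexeme):
        return (self.pos < len(self.tokens)
                and self.tokens[self.pos][1] == lexeme)

    def expect(self, lexeme):
        if self.take()[1] != lexeme:
            raise ValueError(f"expected {lexeme!r}")

    def value(self):                                  # Value
        kind, lexeme = self.take()
        if kind in ("string", "number"):
            return json.loads(lexeme)                 # decode the lexeme
        if lexeme == "{":
            return self.sequence("}", self.member, dict)
        if lexeme == "[":
            return self.sequence("]", self.value, list)
        if lexeme in ("true", "false"):               # Boolean
            return lexeme == "true"
        if lexeme == "null":
            return None
        raise ValueError(f"unexpected {lexeme!r}")

    def member(self):                                 # Member
        kind, lexeme = self.take()
        if kind != "string":
            raise ValueError("member name must be a String")
        self.expect(":")
        return json.loads(lexeme), self.value()

    def sequence(self, closer, item, build):          # _List1 / _List2
        items = []
        if not self.peek_is(closer):                  # non-empty alternative
            items.append(item())
            while self.peek_is(","):                  # _List1_0 / _List2_0
                self.take()
                items.append(item())
        self.expect(closer)
        return build(items)


def parse(text):
    parser = Parser(tokenize(text))                   # Src
    result = parser.value()
    if parser.pos != len(parser.tokens):
        raise ValueError("trailing tokens after Value")
    return result
```

Since Src is simply Value, parse('"x"') accepts a bare scalar at the top level, and a trailing comma as in "[1,]" is rejected because _List2_0 requires a Value after each ",".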


 
