Writing a JSON Parser
May 18, 2020
Notes on Writing a simple JSON Parser: https://notes.eatonphil.com/writing-a-simple-json-parser.html
- JSON is pretty easy to parse
Parsing is often broken up into two stages
- lexical analysis
- syntactic analysis
Lexical Analysis breaks source input into the simplest decomposable elements (tokens)
Syntactic Analysis (often just called parsing) receives the list of tokens and tries to find patterns in them.
Lexical Analysis
- input string is broken into tokens
- comments and whitespace are often discarded
- A simple lexical analyzer might iterate over all the characters in an input string non-recursively
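The lexing steps above might be sketched roughly like this. This is a minimal sketch, not the linked article's code: the function name `lex`, the `(kind, value)` token shape, and the constant names are all assumptions made for illustration, and string escapes such as `\"` are deliberately not handled.

```python
JSON_WHITESPACE = " \t\n\r"
JSON_PUNCTUATION = "{}[],:"
JSON_LITERALS = {"true": True, "false": False, "null": None}
NUMBER_CHARS = "0123456789+-.eE"


def lex(text):
    """Turn a JSON string into a flat list of (kind, value) tokens.

    Hypothetical token shape: kind is "punct", "string", "number" or "literal".
    """
    tokens = []
    i = 0
    while i < len(text):  # a plain loop over characters, no recursion
        ch = text[i]
        if ch in JSON_WHITESPACE:
            i += 1  # whitespace carries no meaning, so discard it
        elif ch in JSON_PUNCTUATION:
            tokens.append(("punct", ch))
            i += 1
        elif ch == '"':
            # Simplification: take everything up to the next quote (no escapes).
            end = text.find('"', i + 1)
            if end == -1:
                raise ValueError(f"unterminated string at index {i}")
            tokens.append(("string", text[i + 1:end]))
            i = end + 1
        elif ch in "-0123456789":
            start = i
            while i < len(text) and text[i] in NUMBER_CHARS:
                i += 1
            raw = text[start:i]
            is_float = any(c in raw for c in ".eE")
            # int()/float() raise ValueError on malformed numbers like "1-2".
            tokens.append(("number", float(raw) if is_float else int(raw)))
        else:
            for word, value in JSON_LITERALS.items():
                if text.startswith(word, i):
                    tokens.append(("literal", value))
                    i += len(word)
                    break
            else:
                raise ValueError(f"unexpected character {ch!r} at index {i}")
    return tokens


print(lex('[1, "a", true]'))
# [('punct', '['), ('number', 1), ('punct', ','), ('string', 'a'),
#  ('punct', ','), ('literal', True), ('punct', ']')]
```

Tagging each token with a kind is one way to keep the string `"{"` distinct from the punctuation `{`; a lexer that emits bare Python values would need some other way to tell them apart.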
Syntactic Analysis
- iterate over a one-dimensional list of tokens and match groups of tokens to pieces of the language according to the definition of the language
Call lex to get the tokens, then call the parser on those tokens
- A key difference between this lexer and parser is their output: the lexer returns a flat (1D) array of tokens, while the parser returns nested data structures
A JSON parser - iterate over the tokens received after a call to lex and try to match the tokens to objects, lists, or plain values
Parsers are often defined recursively and return a recursive, tree-like object. Since JSON is a data serialization format and not a language, the parser should produce objects in Python rather than a syntax tree.
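A recursive parser along these lines could look roughly like the sketch below. It assumes the tokens arrive as `(kind, value)` tuples with kinds `"punct"`, `"string"`, `"number"` and `"literal"`; that shape, and the names `parse`, `parse_array`, `parse_object` and `expect`, are hypothetical choices for this example rather than anything fixed by the notes.

```python
def parse(tokens, pos=0):
    """Parse one JSON value starting at tokens[pos].

    Returns (value, next_pos) so the caller can continue after the value.
    """
    if pos >= len(tokens):
        raise ValueError("unexpected end of input")
    kind, value = tokens[pos]
    if kind == "punct" and value == "[":
        return parse_array(tokens, pos + 1)
    if kind == "punct" and value == "{":
        return parse_object(tokens, pos + 1)
    if kind == "punct":
        raise ValueError(f"unexpected token {value!r} at token {pos}")
    return value, pos + 1  # plain value: string, number, true/false/null


def expect(tokens, pos, char):
    """Check that tokens[pos] is the punctuation `char`; return the next position."""
    if pos >= len(tokens) or tokens[pos] != ("punct", char):
        raise ValueError(f"expected {char!r} at token {pos}")
    return pos + 1


def parse_array(tokens, pos):
    """Parse array items; `pos` points just past the opening '['."""
    items = []
    if pos < len(tokens) and tokens[pos] == ("punct", "]"):
        return items, pos + 1  # empty array
    while True:
        item, pos = parse(tokens, pos)  # recursion: items may be nested
        items.append(item)
        if pos < len(tokens) and tokens[pos] == ("punct", ","):
            pos += 1
        else:
            return items, expect(tokens, pos, "]")


def parse_object(tokens, pos):
    """Parse key/value pairs; `pos` points just past the opening '{'."""
    result = {}
    if pos < len(tokens) and tokens[pos] == ("punct", "}"):
        return result, pos + 1  # empty object
    while True:
        if pos >= len(tokens) or tokens[pos][0] != "string":
            raise ValueError(f"expected a string key at token {pos}")
        key = tokens[pos][1]
        pos = expect(tokens, pos + 1, ":")
        result[key], pos = parse(tokens, pos)  # recursion: values may be nested
        if pos < len(tokens) and tokens[pos] == ("punct", ","):
            pos += 1
        else:
            return result, expect(tokens, pos, "}")


# Hand-written tokens for the input {"a": [1, null]}
tokens = [
    ("punct", "{"), ("string", "a"), ("punct", ":"),
    ("punct", "["), ("number", 1), ("punct", ","), ("literal", None),
    ("punct", "]"), ("punct", "}"),
]
value, end = parse(tokens)
print(value)  # {'a': [1, None]}
```

The result is an ordinary Python dict and list rather than a syntax tree, and the returned position makes it easy to check that every token was consumed (`end == len(tokens)`).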