Syntax (computer science)
- The content on this page originated on Wikipedia.
- See also: Syntax (disambiguation)
In computer science, especially in the subfield of programming languages, the syntax of a computer language is the set of rules that a sequence of characters in a source code file must follow to be considered a conforming program in that language. Among other things, these rules determine which words are reserved and in what order tokens may appear.
The rules specify how character sequences are grouped into tokens (the lexical grammar), which sequences of these tokens are permissible, and some of the meaning attributed to those permissible sequences; additional meaning is assigned by the semantics of the language.
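The grouping of characters into tokens can be sketched in a few lines of Python. This is a minimal illustration, not a real language's lexer: the token names (`NUMBER`, `IDENT`, `OP`, `SKIP`) and the tiny arithmetic language they describe are assumptions made up for this example.

```python
import re

# Hypothetical lexical grammar for a tiny arithmetic language.
# Each entry pairs a token name with the regular expression that matches it.
TOKEN_SPEC = [
    ("NUMBER", r"\d+"),
    ("IDENT", r"[A-Za-z_]\w*"),
    ("OP", r"[+\-*/()=]"),
    ("SKIP", r"\s+"),  # whitespace separates tokens but is not itself a token
]
TOKEN_RE = re.compile(
    "|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC)
)


def tokenize(source):
    """Split source text into (kind, text) pairs, or raise SyntaxError."""
    tokens = []
    pos = 0
    while pos < len(source):
        match = TOKEN_RE.match(source, pos)
        if match is None:
            raise SyntaxError(
                f"unexpected character {source[pos]!r} at offset {pos}"
            )
        if match.lastgroup != "SKIP":
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


print(tokenize("x = 3 + 42"))
```

Characters that match no rule of the lexical grammar, such as `$` in this sketch, are rejected before any question of token order arises.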
The syntactic analysis of source code usually entails transforming the linear sequence of tokens into a hierarchical syntax tree (an abstract syntax tree is one convenient form of syntax tree). This process is called parsing, as is the corresponding analysis in linguistics. Tools have been written that automatically generate parsers from a specification of a language grammar written in a notation based on Backus-Naur form, e.g., Yacc (Yet Another Compiler-Compiler).
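The transformation from a flat token sequence to a tree can be illustrated with a small hand-written recursive-descent parser. This is a sketch under stated assumptions: the grammar below (`expr`, `term`, `factor`) is invented for the example, the tree is represented as nested tuples, and the one-line tokenizer silently ignores characters it does not recognize. Parser generators such as Yacc produce table-driven parsers that work differently.

```python
import re

# Assumed grammar for this sketch:
#   expr   := term   (('+' | '-') term)*
#   term   := factor (('*' | '/') factor)*
#   factor := NUMBER | '(' expr ')'


def parse(source):
    """Parse an arithmetic expression into a nested-tuple syntax tree."""
    # Simplified tokenizer: numbers, operators and parentheses only.
    tokens = re.findall(r"\d+|[-+*/()]", source)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        tok = peek()
        pos += 1
        return tok

    def expr():
        node = term()
        while peek() in ("+", "-"):
            op = advance()
            node = (op, node, term())
        return node

    def term():
        node = factor()
        while peek() in ("*", "/"):
            op = advance()
            node = (op, node, factor())
        return node

    def factor():
        tok = advance()
        if tok is not None and tok.isdigit():
            return int(tok)
        if tok == "(":
            node = expr()
            if advance() != ")":
                raise SyntaxError("expected ')'")
            return node
        raise SyntaxError(f"unexpected token {tok!r}")

    tree = expr()
    if peek() is not None:
        raise SyntaxError(f"unexpected token {peek()!r}")
    return tree


print(parse("1 + 2 * 3"))    # ('+', 1, ('*', 2, 3))
print(parse("(1 + 2) * 3"))  # ('*', ('+', 1, 2), 3)
```

The parentheses in the second input do not appear in the resulting tree; their grouping is already expressed by the tree's shape, which is what makes the tree "abstract".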
The syntax of computer languages is often described by a Type-2 grammar (i.e., a context-free grammar) in the Chomsky hierarchy. As such, the permissible orderings of tokens are tightly constrained. Syntactic analysis is usually performed by a program known as a parser, often automatically generated, which typically builds an abstract syntax tree.
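How tightly token order is constrained can be seen by handing a few token sequences to an existing parser. The sketch below uses Python's standard `ast` module purely as a convenient example of a parser that builds an abstract syntax tree; the helper name `is_syntactically_valid` is hypothetical and not part of any library.

```python
import ast


def is_syntactically_valid(source):
    """Return True if Python's own parser accepts the source text."""
    try:
        ast.parse(source)
    except SyntaxError:
        return False
    return True


# The same tokens in a different order, or with unbalanced nesting,
# no longer form a conforming program.
print(is_syntactically_valid("x = 1 + 2"))  # True
print(is_syntactically_valid("x = 1 +"))    # False
print(is_syntactically_valid("x = = 1"))    # False
print(is_syntactically_valid("((1))"))      # True
print(is_syntactically_valid("((1)"))       # False
```

The last two inputs show arbitrarily nested, balanced parentheses, a construct that a context-free grammar can describe but a regular (Type-3) grammar cannot.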