From 9f7a55d2b28169cc54795eb0b93eba9ce8608dca Mon Sep 17 00:00:00 2001
From: chriseth
Date: Mon, 18 Jul 2016 17:18:27 +0200
Subject: Source mapping documentation.

---
 docs/miscellaneous.rst | 53 ++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 53 insertions(+)

diff --git a/docs/miscellaneous.rst b/docs/miscellaneous.rst
index 825be2ce..9b067fb1 100644
--- a/docs/miscellaneous.rst
+++ b/docs/miscellaneous.rst
@@ -95,6 +95,59 @@ is simplified to code which can also be compiled from
 
 even though the instructions contained a jump in the beginning.
 
+.. index:: source mappings
+
+***************
+Source Mappings
+***************
+
+As part of the AST output, the compiler provides the range of the source
+code that is represented by the respective node in the AST. This can be
+used for various purposes ranging from static analysis tools that report
+errors based on the AST to debugging tools that highlight local variables
+and their uses.
+
+Furthermore, the compiler can also generate a mapping from the bytecode
+to the range in the source code that generated the instruction. This is again
+important for static analysis tools that operate at the bytecode level and
+for displaying the current position in the source code inside a debugger
+or for breakpoint handling.
+
+Both kinds of source mappings use integer identifiers to refer to source files.
+These are regular array indices into a list of source files usually called
+``"sourceList"``, which is part of the combined-json output and the output of
+the json / npm compiler.
+
+The source mappings inside the AST use the following
+notation:
+
+``s:l:f``
+
+Where ``s`` is the byte offset to the start of the range in the source file,
+``l`` is the length of the source range in bytes and ``f`` is the source
+index mentioned above.
+
+The encoding in the source mapping for the bytecode is more complicated:
+It is a list of ``s:l:f:j`` separated by ``;``. Each of these
+elements corresponds to an instruction, i.e. you cannot use the byte offset
+but have to use the instruction offset (push instructions are longer than a single byte).
+The fields ``s``, ``l`` and ``f`` are as above and ``j`` can be either
+``i``, ``o`` or ``-``, signifying whether a jump instruction goes into a
+function, returns from a function or is a regular jump as part of e.g. a loop.
+
+In order to compress these source mappings, especially for bytecode, the
+following rules are used:
+
+ - If a field is empty, the value of the preceding element is used.
+ - If a ``:`` is missing, all following fields are considered empty.
+
+This means the following source mappings represent the same information:
+
+``1:2:1;1:9:1;2:1:2;2:1:2;2:1:2``
+
+``1:2:1;:9;2:1:2;;``
+
+
 .. index:: ! commandline compiler, compiler;commandline, ! solc, ! linker
 
 .. _commandline-compiler:
--
cgit
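
The two compression rules in the documentation added by this patch can be sketched in a few lines of Python. This is a minimal illustration and not part of the patch or of the compiler: the names ``SourceRange`` and ``decompress_source_map`` are hypothetical, and the defaults used when the very first element leaves a field empty are an assumption, since the text does not specify them.

```python
from typing import List, NamedTuple


class SourceRange(NamedTuple):
    """One decoded ``s:l:f:j`` element (hypothetical helper type)."""
    start: int   # s: byte offset of the range start in the source file
    length: int  # l: length of the source range in bytes
    file: int    # f: index into the list of source files
    jump: str    # j: "i" (into a function), "o" (out of one) or "-" (regular)


def decompress_source_map(mapping: str) -> List[SourceRange]:
    """Expand a ';'-separated bytecode source mapping.

    The result holds one entry per instruction, in instruction order.
    """
    # Assumption: these defaults are only used if the first element
    # itself leaves fields empty; the documentation does not define them.
    previous = ["0", "0", "0", "-"]
    result = []
    for element in mapping.split(";"):
        fields = element.split(":")
        # Rule 2: if a ':' is missing, all following fields are empty.
        fields += [""] * (4 - len(fields))
        # Rule 1: an empty field takes the value of the preceding element.
        current = [new if new != "" else old
                   for new, old in zip(fields, previous)]
        result.append(SourceRange(int(current[0]), int(current[1]),
                                  int(current[2]), current[3]))
        previous = current
    return result
```

Under these rules, ``decompress_source_map("1:2:1;:9;2:1:2;;")`` and ``decompress_source_map("1:2:1;1:9:1;2:1:2;2:1:2;2:1:2")`` return the same five entries. The index of an entry in the returned list is the instruction index, not a byte offset into the bytecode.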