Bytecode Documentation

Instructions

Each instruction consists of a mnemonic and zero or more arguments. The numeric value that identifies an instruction has the same width for every instruction, but arguments can have different lengths, and each instruction can take a different number of arguments.

In this document, the layout for data is specified by a number of fields separated by commas. When represented in raw bits, these fields are next to each other with no padding in between. For example:

    length:u64, data:[i8]

represents an unsigned 64-bit integer field called "length" followed by a field called "data", which is an array of signed 8-bit integers. The field immediately before an array gives the array's length.
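As a rough illustration of this layout, the sketch below packs and unpacks a `length:u64, data:[i8]` value in Python. This document does not state a byte order or a text encoding, so little-endian and UTF-8 are assumptions here, and the helper names `pack_str` and `unpack_str` are hypothetical.

```python
import struct

def pack_str(text):
    """Pack text as length:u64 followed by data:[i8], with no padding.

    ASSUMPTION: little-endian byte order and UTF-8 text; neither is
    specified by this document.
    """
    data = text.encode("utf-8")
    return struct.pack("<Q", len(data)) + data

def unpack_str(buf, offset=0):
    """Read one packed string starting at offset.

    Returns (text, next_offset) so the caller can keep reading fields.
    """
    (length,) = struct.unpack_from("<Q", buf, offset)
    start = offset + 8  # the u64 length field is 8 bytes wide
    return buf[start:start + length].decode("utf-8"), start + length
```

Because the fields are adjacent with no padding, a two-byte string occupies exactly ten bytes under these assumptions.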

The argument types are as follows:

| Name | Description |
|---|---|
| i8, i16, i32, i64 | 8-, 16-, 32-, or 64-bit signed integer |
| u8, u16, u32, u64 | 8-, 16-, 32-, or 64-bit unsigned integer |
| double | Equivalent to the C type "double" |
| reg | Register (format: type:u8, which:u32) |
| str | String (format: length:u64, data:[i8]) |

Registers

Most instructions take registers instead of direct arguments. A register is identified by a type and a number. For example, "lexenv2" is the third lexenv register (counting from 0), and "arg0" is the first argument register.

| Mnemonic | ID | Description |
|---|---|---|
| val | 0 | General value registers (clobbered by calls) |
| saved | 1 | Callee-saved registers (preserved across calls, unlike val registers) |
| arg | 2 | Function argument registers (clobbered by calls) |
| ret | 3 | Function return value registers (clobbered by calls) |
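To make the register format concrete, here is a minimal sketch that turns a name such as "arg0" into the `type:u8, which:u32` pair and packs it. The helper names `parse_reg` and `pack_reg` are hypothetical, and little-endian byte order is an assumption, since this document does not specify one. Register types with no ID listed in the table above (such as "lexenv") are rejected rather than guessed.

```python
import re
import struct

# Register type IDs taken from the register table.
REG_TYPES = {"val": 0, "saved": 1, "arg": 2, "ret": 3}

def parse_reg(name):
    """Split a register name such as "arg0" into (type_id, which)."""
    match = re.fullmatch(r"([a-z]+)(\d+)", name)
    if match is None or match.group(1) not in REG_TYPES:
        raise ValueError("unknown register: " + name)
    return REG_TYPES[match.group(1)], int(match.group(2))

def pack_reg(name):
    """Pack a register as type:u8, which:u32 with no padding.

    ASSUMPTION: little-endian byte order.
    """
    type_id, which = parse_reg(name)
    return struct.pack("<BI", type_id, which)
```

Under these assumptions every register argument takes five bytes: one for the type and four for the number.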

Instruction List

  • NIL dest:reg Load the literal nil into DEST

  • T dest:reg Load the literal t into DEST

  • STRING dest:reg, value:str Load the string VALUE (LENGTH bytes of DATA) into DEST.

  • INT dest:reg, value:i64 Convert VALUE into an int object and store it into DEST

  • FLOAT dest:reg, value:double Convert VALUE into a float object and store it into DEST

  • CONS dest:reg, car:reg, cdr:reg Create a cons object with CAR and CDR and store it into DEST.

  • LIST dest:reg, count:u64 Create a list from the first COUNT "arg" registers and store it into DEST.

  • VECTOR dest:reg, count:u64 Create a vector from the first COUNT "arg" registers and store it into DEST.

  • INTERN_LIT reg:reg, name:str
    INTERN_DYN reg:reg, name:reg
    Convert NAME (a string literal, or a register containing a string) into a symbol and store the symbol into REG.

  • SYMBOL_NAME dest:reg, sym:reg Store the name of SYM into DEST.

  • MOV dest:reg, src:reg Copy the value in the register SRC into the register DEST.

  • FUNCALL reg:reg, argc:u64 Call the function in REG, which must hold either a function object or a symbol that has a function value. The return values are placed into the "ret" registers. ARGC is the number of "arg" registers that have been set for this function call.

  • RETVAL_COUNT count:u8 Declare that the first COUNT return values have been set (any that have not been written during this lexenv are set to nil). Without this, the highest "ret" register written to is the number to use. Without this, assume one return value.

  • GET_RETVAL_COUNT dest:reg Place the count last set with RETVAL_COUNT into DEST.

  • ENTER_LEXENV
    ENTER_INHERITED_LEXENV
    LEAVE_LEXENV
    ENTER_LEXENV and ENTER_INHERITED_LEXENV push a new lexical environment; LEAVE_LEXENV restores the previous one. An inherited lexenv does not save or restore the "saved" registers.

  • ENTER_BLOCK sym:reg, count:u64
    LEAVE_BLOCK sym:reg
    ENTER_BLOCK enters a new named block identified by the symbol in SYM. The block is COUNT instructions long. LEAVE_BLOCK leaves the block identified by SYM.

  • SET_VALUE sym:reg, value:reg Set the value as a variable of SYM to VALUE.

  • SET_FUNCTION sym:reg, value:reg Set the value as a function of SYM to VALUE (value must be an actual function, not a symbol).

  • GET_VALUE dest:reg, sym:reg Store the value as a variable of the symbol SYM into DEST.

  • GET_FUNCTION dest:reg, sym:reg Store the value as a function of the symbol SYM into DEST.

  • NEWFUNCTION_LIT dest:reg, count:u64
    NEWFUNCTION_DYN dest:reg, src:reg
    Create a new function object and store it into DEST. In the first case, the next COUNT instructions are taken to be the function body and are skipped. In the second case, SRC should be a list or vector containing the bytecode for the function.

  • PUT sym:reg, key:reg, value:reg Associate KEY with VALUE in the plist of SYM.

  • GET dest:reg, sym:reg, key:reg Store the value associated with KEY in the plist of SYM into DEST.

  • AND dest:reg, count:u64, values:[reg]
    OR dest:reg, count:u64, values:[reg]
    XOR dest:reg, count:u64, values:[reg]
    NOT dest:reg, value:reg
    Perform a logical operation across VALUES and store the overall result into DEST. For example, the XOR instruction exclusive-ors each of VALUES with the next value. NOT is special in that it takes exactly one value and stores its logical negation into DEST.

  • CJMP cond:reg, offset:i64 If the value in COND is truthy (not nil), skip the next OFFSET instructions. If OFFSET is negative, go back abs(OFFSET) instructions instead. CJMP itself is NOT counted as an instruction when counting offsets. Therefore, an OFFSET of -2 means execution resumes at the instruction above the instruction above this CJMP.

  • CAR dest:reg, cons:reg
    CDR dest:reg, cons:reg
    Store the car or cdr of CONS into DEST.

  • SETCAR cons:reg, value:reg
    SETCDR cons:reg, value:reg
    Store VALUE into the car or cdr of CONS.

  • GETELT_LIT dest:reg, seq:reg, index:u64
    GETELT_DYN dest:reg, seq:reg, index:reg
    Store the value at INDEX in SEQ (a list or vector) into DEST.

  • SETELT_LIT seq:reg, index:u64, value:reg
    SETELT_DYN seq:reg, index:reg, value:reg
    Store VALUE at position INDEX of SEQ (a list or vector).

  • EQ_TWO dest:reg, val1:reg, val2:reg
    EQ_N dest:reg, count:u64
    EQ_TWO compares VAL1 and VAL2. If they are the same object (or symbols with the same name), store T into DEST; otherwise, store NIL. EQ_N does the same for the first COUNT "arg" registers: if they are all the same object (or symbols with the same name), store T into DEST; otherwise, store NIL.

  • NUM_GT dest:reg, val1:reg, val2:reg
    NUM_GE dest:reg, val1:reg, val2:reg
    NUM_EQ dest:reg, val1:reg, val2:reg
    NUM_LE dest:reg, val1:reg, val2:reg
    NUM_LT dest:reg, val1:reg, val2:reg
    Compare the values of VAL1 and VAL2, which must be numbers. If the comparison holds, store T into DEST; otherwise, store NIL.
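The offset rule for CJMP is easy to get wrong by one, so here is a small sketch of how the next instruction index could be computed from the description above. The function name `cjmp_target` and the idea of indexing instructions from 0 are illustrative assumptions, not part of the bytecode format.

```python
def cjmp_target(index, offset, cond_is_truthy):
    """Return the index of the instruction executed after a CJMP.

    index is the position of the CJMP itself in the instruction list.
    The CJMP is never counted when measuring the offset.
    """
    if not cond_is_truthy:
        return index + 1           # no jump: fall through
    if offset >= 0:
        return index + 1 + offset  # skip the next OFFSET instructions
    return index + offset          # go back abs(OFFSET) instructions
```

For a CJMP at index 5, an OFFSET of 2 continues at index 8, and an OFFSET of -2 continues at index 3, which is the instruction above the instruction above the CJMP.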

Mnemonic Conversion Table

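| Mnemonic | Number |
|---|---|
| STRING_LIT | 0 |
| STRING_DYN | 1 |
| INT | 2 |
| FLOAT | 3 |
| CONS | 4 |
| LIST_LIT | 5 |
| LIST_DYN | 6 |
| VECTOR_LIT | 7 |
| VECTOR_DYN | 8 |
| INTERN_LIT | 9 |
| INTERN_DYN | 10 |
| SYMBOL_NAME | 11 |
| MOV | 12 |
| FUNCALL | 13 |
| RETVAL_COUNT | 14 |
| ENTER_LEXENV | 15 |
| LEAVE_LEXENV | 16 |
| ENTER_BLOCK | 17 |
| LEAVE_BLOCK | 18 |
| SET_VALUE | 19 |
| SET_FUNCTION | 20 |
| GET_VALUE | 21 |
| GET_FUNCTION | 22 |
| NEWFUNCTION_LIT | 23 |
| NEWFUNCTION_DYN | 24 |
| PUT | 25 |
| GET | 26 |
| AND | 27 |
| OR | 28 |
| XOR | 29 |
| NOT | 30 |
| CJMP | 31 |
| CAR | 32 |
| CDR | 33 |
| SETCAR | 34 |
| SETCDR | 35 |
| GETELT_LIT | 36 |
| GETELT_DYN | 37 |
| SETELT_LIT | 38 |
| SETELT_DYN | 39 |
| EQ_TWO | 40 |
| EQ_N | 41 |
| NUM_GT | 42 |
| NUM_GE | 43 |
| NUM_EQ | 44 |
| NUM_LE | 45 |
| NUM_LT | 46 |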
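Putting the table together with the argument formats, the following sketch encodes a few register-only instructions as bytes. It rests on two assumptions that this document does not confirm: the instruction number is stored as a single u8, and all fields are little-endian. The names `OPCODES`, `pack_reg`, and `assemble` are hypothetical.

```python
import struct

# A subset of the conversion table, limited to register-only instructions.
OPCODES = {"CONS": 4, "MOV": 12, "CAR": 32, "CDR": 33}
REG_TYPES = {"val": 0, "saved": 1, "arg": 2, "ret": 3}

def pack_reg(type_name, which):
    """Pack a register as type:u8, which:u32 (little-endian assumed)."""
    return struct.pack("<BI", REG_TYPES[type_name], which)

def assemble(mnemonic, *regs):
    """Encode an instruction whose arguments are all registers.

    Each register is given as a (type_name, which) pair.
    """
    out = struct.pack("<B", OPCODES[mnemonic])  # ASSUMPTION: u8 opcode
    for type_name, which in regs:
        out += pack_reg(type_name, which)
    return out

# MOV arg0, ret0
encoded = assemble("MOV", ("arg", 0), ("ret", 0))
```

Under these assumptions, `MOV arg0, ret0` comes out as 11 bytes: one for the instruction number and five for each register.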

Examples

This:

(format t "Hello World~%")

Compiles into:

T arg0
STRING arg1, "Hello World~%"
INTERN_LIT val0, "format"
FUNCALL val0, 2

This:

(defun foo (bar &key baz &rest qux)
  "FOO each of BAR and BAZ as well as each of QUX."
  (let (some-val (genval bar))
    (foo-internal some-val :baz baz qux)))
    
(foo 10 :baz 20)

Compiles into:

NEWFUNCTION_LIT val0, 13 ;; not counting this instruction
ENTER_LEXENV
INTERN_LIT saved0, "foo"
ENTER_BLOCK saved0, 9 ;; not counting this instruction
INTERN_LIT val0, "genval"
;; NOTE bar already in arg0
FUNCALL val0, 1 ;; NOTE val0 clobbered here
INTERN_LIT val0, "foo-internal"
MOV arg0, ret0
MOV arg3, arg2
MOV arg2, arg1
INTERN_LIT arg1, ":baz"
FUNCALL val0, 4
LEAVE_BLOCK saved0
LEAVE_LEXENV

INTERN_LIT val1, "foo"
SET_FUNCTION val1, val0
INT arg0, 10
INTERN_LIT arg1, ":baz"
INT arg2, 20
FUNCALL val1, 3
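The COUNT arguments in a listing like the one above are easy to miscount by hand. As a rough check, the sketch below counts instructions in a listing while ignoring blank lines and `;;` comments. The helper name `count_instructions` is hypothetical, and it assumes that `;;` never appears inside a string literal. The body of `foo` is reproduced with ARGC values on each FUNCALL, as the FUNCALL description requires.

```python
def count_instructions(listing):
    """Count instructions in an assembly listing.

    Blank lines and lines holding only a ";;" comment are not counted.
    ASSUMPTION: ";;" never occurs inside a string literal.
    """
    count = 0
    for line in listing.splitlines():
        code = line.split(";;", 1)[0].strip()
        if code:
            count += 1
    return count

# Everything that follows the NEWFUNCTION_LIT in the example.
FOO_BODY = """
ENTER_LEXENV
INTERN_LIT saved0, "foo"
ENTER_BLOCK saved0, 9
INTERN_LIT val0, "genval"
;; NOTE bar already in arg0
FUNCALL val0, 1
INTERN_LIT val0, "foo-internal"
MOV arg0, ret0
MOV arg3, arg2
MOV arg2, arg1
INTERN_LIT arg1, ":baz"
FUNCALL val0, 4
LEAVE_BLOCK saved0
LEAVE_LEXENV
"""

# The part after ENTER_BLOCK, up to and including LEAVE_BLOCK.
after_enter = FOO_BODY.split("ENTER_BLOCK saved0, 9\n", 1)[1]
BLOCK_BODY = after_enter.split("LEAVE_LEXENV", 1)[0]

function_length = count_instructions(FOO_BODY)
block_length = count_instructions(BLOCK_BODY)
```

Here `function_length` matches the 13 given to the function-creating instruction, and `block_length` matches the 9 given to ENTER_BLOCK, which suggests that the block count includes the closing LEAVE_BLOCK.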