# Bytecode Documentation
## Instructions
Each instruction consists of a mnemonic and some number of arguments. The
numeric value of every instruction has the same length, but arguments can be
different lengths, and the number of arguments differs from instruction to
instruction.
In this document, the layout for data is specified by a number of fields
separated by commas. When represented in raw bits, these fields are next to each
other with no padding in between. For example:
```text
length:u64, data:[i8]
```
represents an unsigned 64 bit integer field called "length" and a field called
"data" which is an array of signed 8 bit integers. The field immediately before
an array determines the array's length.
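As a minimal sketch of this layout, the following Python packs and unpacks a `length:u64, data:[i8]` value. This document does not state a byte order or a text encoding, so little-endian and UTF-8 are assumed here purely for illustration, and the names `pack_str` and `unpack_str` are hypothetical.

```python
import struct


def pack_str(text: str) -> bytes:
    """Pack text as length:u64 followed by that many data bytes, no padding."""
    data = text.encode("utf-8")  # encoding is an assumption
    # "<Q" is a little-endian unsigned 64 bit integer (byte order is an assumption).
    return struct.pack("<Q", len(data)) + data


def unpack_str(buf: bytes, offset: int = 0) -> tuple[str, int]:
    """Return the decoded string and the offset just past its last byte."""
    (length,) = struct.unpack_from("<Q", buf, offset)
    start = offset + 8
    end = start + length
    if end > len(buf):
        raise ValueError("buffer too short for declared length")
    return buf[start:end].decode("utf-8"), end
```

Returning the end offset lets a reader decode the next field directly, since fields sit next to each other with no padding.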
The argument types are as follows:
| Name | Description |
|:------------------|:--------------------------------------|
| i8, i16, i32, i64 | 8, 16, 32, or 64 bit signed integer |
| u8, u16, u32, u64 | 8, 16, 32, or 64 bit unsigned integer |
| double | Equivalent to the C type "double" |
| reg | register (format: type:u8,which:u32) |
| str | string (format: length:u64,data:[i8]) |
### Registers
Most instructions take registers instead of direct arguments. A register is
identified by a type and a number. For example, "saved2" is the third (counting
from 0) saved register and "arg0" is the first argument register.
| Mnemonic | ID | Description |
|:---------|:--:|:--------------------------------------------------------------|
| val | 0 | General value registers (clobbered by calls) |
| saved | 1 | Callee-saved registers (val registers are clobbered by calls) |
| arg | 2 | Function argument registers (clobbered by calls) |
| ret | 3 | Function return value registers (clobbered by calls) |
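The register naming and the `reg` format (`type:u8,which:u32`) can be sketched as follows. The type IDs come from the table above; the little-endian byte order and the helper names `parse_reg` and `pack_reg` are assumptions made for illustration only.

```python
import re
import struct

# Register type IDs, taken from the register table.
REG_TYPES = {"val": 0, "saved": 1, "arg": 2, "ret": 3}


def parse_reg(name: str) -> tuple[int, int]:
    """Split a name such as "arg0" into (type ID, register number)."""
    match = re.fullmatch(r"([a-z]+)(\d+)", name)
    if match is None or match.group(1) not in REG_TYPES:
        raise ValueError(f"unknown register: {name!r}")
    return REG_TYPES[match.group(1)], int(match.group(2))


def pack_reg(name: str) -> bytes:
    """Encode a register as type:u8 followed by which:u32, with no padding."""
    reg_type, which = parse_reg(name)
    # "<BI" is one unsigned byte then one unsigned 32 bit integer (5 bytes total).
    return struct.pack("<BI", reg_type, which)
```

Under these assumptions, `pack_reg("arg0")` yields five bytes: the type ID 2 followed by a 32 bit zero.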
### Instruction List
- NIL dest:reg
Load the literal nil into DEST
- T dest:reg
Load the literal t into DEST
- STRING dest:reg, value:str
Load the string of LENGTH bytes from DATA (the fields of VALUE) into DEST.
- INT dest:reg, value:i64
Convert VALUE into an int object and store it into DEST
- FLOAT dest:reg, value:double
Convert VALUE into a float object and store it into DEST
- CONS dest:reg, car:reg, cdr:reg
Create a cons object with CAR and CDR and store it into DEST.
- LIST dest:reg, count:u64
Create a list from the first COUNT "arg" registers and store it into DEST.
- VECTOR dest:reg, count:u64
Create a vector from the first COUNT "arg" registers and store it into DEST.
- INTERN\_LIT reg:reg, name:str
INTERN\_DYN reg:reg, name:reg
These instructions convert the string literal or register containing a string
NAME into a symbol and store the symbol into REG.
- SYMBOL\_NAME dest:reg, sym:reg
Store the name of SYM into DEST.
- MOV dest:reg, src:reg
Copy the value in the register SRC into the register DEST.
- FUNCALL reg:reg, argc:u64
Call the function in REG. This should either be a function object or a symbol
which has a value as a function. The return values are placed into the "ret"
registers. ARGC is the number of "arg" registers that have been set for this
function call.
- RETVAL\_COUNT count:u8
Declare that the first COUNT return values have been set (any that have not
been touched during this lexenv will be set to nil). Without this, the
highest "ret" register written to is the number to use. Without this, assume
one return value.
- GET\_RETVAL\_COUNT dest:reg
Place the count last set with RETVAL\_COUNT into DEST.
- ENTER\_LEXENV
ENTER\_INHERITED\_LEXENV
LEAVE\_LEXENV
Push or restore the current lexical environment. An inherited lexenv does not
save or restore the "saved" registers.
- ENTER\_BLOCK sym:reg, count:u64
LEAVE\_BLOCK sym:reg
Enter a new named block which is identified by the symbol in SYM. The block is
COUNT instructions long. LEAVE\_BLOCK leaves the block identified by SYM.
- SET\_VALUE sym:reg, value:reg
Set the value as a variable of SYM to VALUE.
- SET\_FUNCTION sym:reg, value:reg
Set the value as a function of SYM to VALUE (value must be an actual function,
not a symbol).
- GET\_VALUE dest:reg, sym:reg
Store the value as a variable of the symbol SYM into DEST.
- GET\_FUNCTION dest:reg, sym:reg
Store the value as a function of the symbol SYM into DEST.
- NEWFUNCTION\_LIT dest:reg, count:u64
NEWFUNCTION\_DYN dest:reg, src:reg
Create a new function object and store it into DEST. In the first case the
next COUNT instructions are considered to be the function and are skipped. In
the second case SRC should be a list or vector containing the bytecode for the
function.
- PUT sym:reg, key:reg, value:reg
Associate KEY with VALUE in the plist of SYM.
- GET dest:reg, sym:reg, key:reg
Store the value associated with KEY in the plist of SYM into DEST.
- AND dest:reg, count:u64, values:[reg]
OR dest:reg, count:u64, values:[reg]
XOR dest:reg, count:u64, values:[reg]
NOT dest:reg, value:reg
Perform a logical operation on each of VALUES. For example, the XOR
instruction will exclusively or each of VALUES with the next value, and store
the overall result into DEST. NOT is special as it can only take one value,
of which it will take the logical negation and store it into DEST.
- CJMP cond:reg, offset:i64
If the value in COND is truthy (not nil), skip the next OFFSET
instructions. If OFFSET is negative, instead go back abs(OFFSET)
instructions. CJMP is NOT counted as an instruction for the purposes of
counting offsets. Therefore an OFFSET of -2 means resume execution at the
instruction above the instruction above this CJMP.
- CAR dest:reg, cons:reg
CDR dest:reg, cons:reg
Store the car or cdr of CONS into DEST.
- SETCAR cons:reg, value:reg
SETCDR cons:reg, value:reg
Store VALUE into the car or cdr of CONS.
- GETELT\_LIT dest:reg, seq:reg, index:u64
GETELT\_DYN dest:reg, seq:reg, index:reg
Store the value at INDEX in SEQ (a list or vector) into DEST.
- SETELT\_LIT seq:reg, index:u64, value:reg
SETELT\_DYN seq:reg, index:reg, value:reg
Store VALUE into the index numbered INDEX of SEQ (a list or vector).
- EQ\_TWO dest:reg, val1:reg, val2:reg
EQ\_N dest:reg, count:u64
Compare VAL1 and VAL2. If they are the same object (or symbols with the same
name), store T into DEST; otherwise, store NIL. In the case of EQ\_N, if the
first COUNT "arg" registers are the same object (or symbols with the same
name), store T into DEST; otherwise, store NIL.
- NUM\_GT dest:reg, val1:reg, val2:reg
NUM\_GE dest:reg, val1:reg, val2:reg
NUM\_EQ dest:reg, val1:reg, val2:reg
NUM\_LE dest:reg, val1:reg, val2:reg
NUM\_LT dest:reg, val1:reg, val2:reg
Compare the values of VAL1 and VAL2, which must be numbers. If the comparison
holds, store T into DEST, otherwise store NIL.
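The CJMP offset rule in the list above is easy to get wrong by one, so here is a small sketch of one reading of it. It assumes instructions are addressed by index, that `pc` is the index of the CJMP itself, and that a falsy COND simply falls through to the next instruction; the function name `cjmp_target` is hypothetical.

```python
def cjmp_target(pc: int, offset: int, cond_truthy: bool = True) -> int:
    """Return the index of the instruction to execute after a CJMP at index pc."""
    if not cond_truthy:
        # Assumption: a falsy COND continues with the next instruction.
        return pc + 1
    if offset >= 0:
        # Skip the next `offset` instructions; the CJMP itself is not counted.
        return pc + 1 + offset
    # Go back abs(offset) instructions, counted from the CJMP.
    return pc + offset
```

With a CJMP at index 5, an OFFSET of 2 continues at index 8, and an OFFSET of -2 continues at index 3, which is the instruction above the instruction above the CJMP.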
## Mnemonic Conversion Table
| Mnemonic | Number |
|:-----------------|:------:|
| STRING\_LIT | 0 |
| STRING\_DYN | 1 |
| INT | 2 |
| FLOAT | 3 |
| CONS | 4 |
| LIST\_LIT | 5 |
| LIST\_DYN | 6 |
| VECTOR\_LIT | 7 |
| VECTOR\_DYN | 8 |
| INTERN\_LIT | 9 |
| INTERN\_DYN | 10 |
| SYMBOL\_NAME | 11 |
| MOV | 12 |
| FUNCALL | 13 |
| RETVAL\_COUNT | 14 |
| ENTER\_LEXENV | 15 |
| LEAVE\_LEXENV | 16 |
| ENTER\_BLOCK | 17 |
| LEAVE\_BLOCK | 18 |
| SET\_VALUE | 19 |
| SET\_FUNCTION | 20 |
| GET\_VALUE | 21 |
| GET\_FUNCTION | 22 |
| NEWFUNCTION\_LIT | 23 |
| NEWFUNCTION\_DYN | 24 |
| PUT | 25 |
| GET | 26 |
| AND | 27 |
| OR | 28 |
| XOR | 29 |
| NOT | 30 |
| CJMP | 31 |
| CAR | 32 |
| CDR | 33 |
| SETCAR | 34 |
| SETCDR | 35 |
| GETELT\_LIT | 36 |
| GETELT\_DYN | 37 |
| SETELT\_LIT | 38 |
| SETELT\_DYN | 39 |
| EQ\_TWO | 40 |
| EQ\_N | 41 |
| NUM\_GT | 42 |
| NUM\_GE | 43 |
| NUM\_EQ | 44 |
| NUM\_LE | 45 |
| NUM\_LT | 46 |
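To show how the numbers in this table might combine with the argument layouts, here is a rough sketch that encodes `INT` and `MOV`. This document does not give the width of the instruction number or the byte order, so a single unsigned byte and little-endian are assumed here; the helper names are hypothetical.

```python
import struct

# A subset of the mnemonic conversion table.
OPCODES = {"INT": 2, "MOV": 12}
# Register type IDs from the register table.
REG_TYPES = {"val": 0, "saved": 1, "arg": 2, "ret": 3}


def pack_reg(reg_type: str, which: int) -> bytes:
    """Encode a register as type:u8, which:u32."""
    return struct.pack("<BI", REG_TYPES[reg_type], which)


def encode_int(dest: tuple[str, int], value: int) -> bytes:
    """Encode `INT dest:reg, value:i64` (opcode width of one byte is assumed)."""
    return struct.pack("<B", OPCODES["INT"]) + pack_reg(*dest) + struct.pack("<q", value)


def encode_mov(dest: tuple[str, int], src: tuple[str, int]) -> bytes:
    """Encode `MOV dest:reg, src:reg` (opcode width of one byte is assumed)."""
    return struct.pack("<B", OPCODES["MOV"]) + pack_reg(*dest) + pack_reg(*src)
```

Under these assumptions `INT arg0, 10` takes 14 bytes (1 + 5 + 8) and `MOV arg0, ret0` takes 11 bytes (1 + 5 + 5), which illustrates why instructions differ in total length even though the instruction number is fixed-width.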
## Examples
This:
```lisp
(format t "Hello World~%")
```
Compiles into:
```text
INTERN_LIT val0, "Hello World~%"
```
This:
```lisp
(defun foo (bar &key baz &rest qux)
"FOO each of BAR and BAZ as well as each of QUX."
(let (some-val (genval bar))
(foo-internal some-val :baz baz qux)))
(foo 10 :baz 20)
```
Compiles into:
```text
NEWFUNCTION_LIT val0, 13 ;; not counting this instruction
ENTER_LEXENV
INTERN_LIT saved0, "foo"
ENTER_BLOCK saved0, 9 ;; not counting this instruction
INTERN_LIT val0, "genval"
;; NOTE bar already in arg0
FUNCALL val0, 1 ;; NOTE val0 clobbered here
INTERN_LIT val0, "foo-internal"
MOV arg0, ret0
MOV arg3, arg2
MOV arg2, arg1
INTERN_LIT arg1, ":baz"
FUNCALL val0, 4
LEAVE_BLOCK saved0
LEAVE_LEXENV
INTERN_LIT val1, "foo"
SET_FUNCTION val1, val0
INT arg0, 10
INTERN_LIT arg1, ":baz"
INT arg2, 20
FUNCALL val1, 3
```
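The two instruction counts in the listing above (13 for the function body, 9 for the block) can be checked with a short script. This is only a sketch of how the counts appear to be derived from the example: the function count covers every instruction after the function-creating instruction through LEAVE_LEXENV, and the block count appears to cover the instructions after ENTER_BLOCK up to and including LEAVE_BLOCK. Comment-only lines are not instructions.

```python
# Mnemonics of the function body from the listing above, in order.
body = [
    "ENTER_LEXENV",
    "INTERN_LIT",
    "ENTER_BLOCK",
    "INTERN_LIT",
    "FUNCALL",
    "INTERN_LIT",
    "MOV",
    "MOV",
    "MOV",
    "INTERN_LIT",
    "FUNCALL",
    "LEAVE_BLOCK",
    "LEAVE_LEXENV",
]

# Function count: every instruction in the body.
function_count = len(body)

# Block count: instructions after ENTER_BLOCK, up to and including LEAVE_BLOCK.
block_count = body.index("LEAVE_BLOCK") - body.index("ENTER_BLOCK")

print(function_count, block_count)  # prints "13 9"
```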