# Bytecode Documentation

## Instructions
Each instruction consists of a mnemonic and some number of arguments. The
numeric value for each instruction is the same length, but the arguments can be
different lengths, and each instruction can have a different number of
arguments.

In this document, the layout for data is specified by a number of fields
separated by commas. When represented in raw bits, these fields are next to each
other with no padding in between. For example:
```text
length:u64, data:[i8]
```
represents an unsigned 64 bit integer field called "length" and a field called
"data" which is an array of signed 8 bit integers. The value right before an
array is the value which determines its length.

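As a rough illustration of the layout notation above, the following Python sketch decodes a `length:u64, data:[i8]` field from a byte buffer. The function name `decode_str` is hypothetical, and the little-endian byte order is an assumption, since this document does not specify one.

```python
import struct


def decode_str(buf: bytes, offset: int = 0) -> tuple[bytes, int]:
    """Decode a `length:u64, data:[i8]` field; return (data, next_offset)."""
    # ASSUMPTION: little-endian byte order (not specified by this document).
    (length,) = struct.unpack_from("<Q", buf, offset)
    start = offset + 8  # the data follows the length with no padding
    end = start + length
    if end > len(buf):
        raise ValueError("buffer too short for the declared length")
    return buf[start:end], end


encoded = struct.pack("<Q", 5) + b"hello"
data, next_offset = decode_str(encoded)
print(data, next_offset)  # b'hello' 13
```

Returning the next offset lets a caller decode the following field directly, which matches the "no padding" rule.
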
The argument types are as follows:
| Name              | Description                           |
|:------------------|:--------------------------------------|
| i8, i16, i32, i64 | 8, 16, 32, or 64 bit signed integer   |
| u8, u16, u32, u64 | 8, 16, 32, or 64 bit unsigned integer |
| double            | Equivalent to the C type "double"     |
| reg               | register (format: type:u8,which:u32)  |
| str               | string (format: length:u64,data:[i8]) |

### Registers
Most instructions take registers instead of immediate values as arguments. A
register is identified by a type and a number. For example, "lexenv2" is the
third lexenv register (numbering starts at 0) and "arg0" is the first argument
register.

| Mnemonic | ID | Description                                                           |
|:---------|:--:|:----------------------------------------------------------------------|
| val      | 0  | General value registers (clobbered by calls)                          |
| saved    | 1  | Callee-saved registers (preserved across calls, unlike val registers) |
| arg      | 2  | Function argument registers (clobbered by calls)                      |
| ret      | 3  | Function return value registers (clobbered by calls)                  |
| lexenv   | 4  | Lexical environment registers                                         |
| block    | 5  | Block symbol registers                                                |

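The register naming scheme and the `reg` format (`type:u8,which:u32`) can be sketched as follows. The helper names `parse_register` and `encode_register` are hypothetical, and little-endian byte order is again an assumption rather than something this document states.

```python
import re
import struct

# Register type IDs, copied from the table above.
REGISTER_TYPES = {"val": 0, "saved": 1, "arg": 2, "ret": 3, "lexenv": 4, "block": 5}


def parse_register(name: str) -> tuple[int, int]:
    """Split a name such as "lexenv2" into (type ID, register number)."""
    match = re.fullmatch(r"([a-z]+)(\d+)", name)
    if match is None or match.group(1) not in REGISTER_TYPES:
        raise ValueError(f"not a register name: {name!r}")
    return REGISTER_TYPES[match.group(1)], int(match.group(2))


def encode_register(name: str) -> bytes:
    """Encode a register as type:u8,which:u32 (5 bytes, no padding)."""
    type_id, which = parse_register(name)
    # ASSUMPTION: little-endian byte order (not specified by this document).
    return struct.pack("<BI", type_id, which)


print(parse_register("lexenv2"))      # (4, 2)
print(encode_register("arg0").hex())  # 0200000000
```
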
### Instruction List

- NIL dest:reg
  Load the literal nil into DEST.

- T dest:reg
  Load the literal t into DEST.

- STRING dest:reg, value:str
  Load the string VALUE (LENGTH bytes of DATA) into DEST.

- INT dest:reg, value:i64
  Convert VALUE into an int object and store it into DEST.

- FLOAT dest:reg, value:double
  Convert VALUE into a float object and store it into DEST.

- CONS dest:reg, car:reg, cdr:reg
  Create a cons object with CAR and CDR and store it into DEST.

- LIST dest:reg, count:u64
  Create a list from the first COUNT "arg" registers and store it into DEST.

- VECTOR dest:reg, count:u64
  Create a vector from the first COUNT "arg" registers and store it into DEST.

- INTERN\_LIT reg:reg, name:str
  INTERN\_DYN reg:reg, name:reg
  Convert NAME, which is a string literal (INTERN\_LIT) or a register containing
  a string (INTERN\_DYN), into a symbol and store the symbol into REG.

- SYMBOL\_NAME dest:reg, sym:reg
  Store the name of SYM into DEST.

- MOV dest:reg, src:reg
  Copy the value in the register SRC into the register DEST.

- FUNCALL reg:reg, argc:u64
  Call the function in REG. This should either be a function object or a symbol
  which has a value as a function. The return values are placed into the "ret"
  registers. ARGC is the number of "arg" registers that have been set for this
  function call.

- RETVAL\_COUNT count:u8
  Declare that the first COUNT return values have been set (if they have not
  been touched during this lexenv, they will be set to nil). Without this, the
  highest "ret" register written to is the number to use. Without this, assume
  one return value.

- ENTER\_LEXENV
  LEAVE\_LEXENV
  ENTER\_LEXENV shifts all the "lexenv" registers up by 1, then creates a new
  lexenv and stores it into "lexenv0". LEAVE\_LEXENV does the opposite,
  restoring the last pushed lexenv.

- ENTER\_BLOCK sym:reg, count:u64
  LEAVE\_BLOCK sym:reg
  Enter a new named block which is identified by the symbol in SYM. The block is
  COUNT instructions long. LEAVE\_BLOCK leaves the block identified by
  SYM. Adding a new block pushes SYM onto the "block" registers, much like
  ENTER\_LEXENV (which see).

- SET\_VALUE sym:reg, value:reg
  Set the value as a variable of SYM to VALUE.

- SET\_FUNCTION sym:reg, value:reg
  Set the value as a function of SYM to VALUE (VALUE must be an actual function,
  not a symbol).

- GET\_VALUE dest:reg, sym:reg
  Store the value as a variable of the symbol SYM into DEST.

- GET\_FUNCTION dest:reg, sym:reg
  Store the value as a function of the symbol SYM into DEST.

- NEWFUNCTION\_LIT dest:reg, count:u64
  NEWFUNCTION\_DYN dest:reg, src:reg
  Create a new function object and store it into DEST. In the first case the
  next COUNT instructions are considered to be the function and are skipped. In
  the second case SRC should be a list or vector containing the bytecode for the
  function.

- PUT sym:reg, key:reg, value:reg
  Associate KEY with VALUE in the plist of SYM.

- GET dest:reg, sym:reg, key:reg
  Store the value associated with KEY in the plist of SYM into DEST.

- AND dest:reg, count:u64, values:[reg]
  OR dest:reg, count:u64, values:[reg]
  XOR dest:reg, count:u64, values:[reg]
  NOT dest:reg, value:reg
  Perform a logical operation on each of VALUES. For example, the XOR
  instruction will exclusively or each of VALUES with the next value, and store
  the overall result into DEST. NOT is special in that it takes only one value;
  it stores the logical negation of VALUE into DEST.

- CJMP cond:reg, offset:i64
  If the value in COND is truthy (not nil), skip the next OFFSET
  instructions. If OFFSET is negative, instead go back abs(OFFSET)
  instructions. CJMP is NOT counted as an instruction for the purposes of
  counting offsets. Therefore, an OFFSET of -2 means execution resumes at the
  instruction two above this CJMP.

- CAR dest:reg, cons:reg
  CDR dest:reg, cons:reg
  Store the car or cdr of CONS into DEST.

- SETCAR cons:reg, value:reg
  SETCDR cons:reg, value:reg
  Store VALUE into the car or cdr of CONS.

- GETELT\_LIT dest:reg, seq:reg, index:u64
  GETELT\_DYN dest:reg, seq:reg, index:reg
  Store the value at INDEX in SEQ (a list or vector) into DEST.

- SETELT\_LIT seq:reg, index:u64, value:reg
  SETELT\_DYN seq:reg, index:reg, value:reg
  Store VALUE into the element at INDEX of SEQ (a list or vector).

- EQ\_TWO dest:reg, val1:reg, val2:reg
  EQ\_N dest:reg, count:u64
  Compare VAL1 and VAL2. If they are the same object (or symbols with the same
  name), store T into DEST; otherwise, store NIL. In the case of EQ\_N, if the
  first COUNT "arg" registers all hold the same object (or symbols with the same
  name), store T into DEST; otherwise, store NIL.

- NUM\_GT dest:reg, val1:reg, val2:reg
  NUM\_GE dest:reg, val1:reg, val2:reg
  NUM\_EQ dest:reg, val1:reg, val2:reg
  NUM\_LE dest:reg, val1:reg, val2:reg
  NUM\_LT dest:reg, val1:reg, val2:reg
  Compare the numeric values of VAL1 and VAL2, which must both be numbers. If
  the comparison holds, store T into DEST; otherwise, store NIL.

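The offset rule for CJMP in the list above is asymmetric because the CJMP itself is not counted. A minimal sketch of that rule follows, assuming instructions are addressed by their index in a decoded list; the function name `cjmp_target` and the index-based model are assumptions for illustration only.

```python
def cjmp_target(cjmp_index: int, offset: int, cond_truthy: bool) -> int:
    """Return the index of the instruction executed after a CJMP."""
    if not cond_truthy:
        return cjmp_index + 1           # condition is nil: fall through
    if offset >= 0:
        return cjmp_index + 1 + offset  # skip the next OFFSET instructions
    return cjmp_index + offset          # go back abs(OFFSET) instructions


print(cjmp_target(10, 3, True))   # 14: instructions 11, 12 and 13 are skipped
print(cjmp_target(10, -2, True))  # 8: two instructions above the CJMP
print(cjmp_target(10, 3, False))  # 11: no jump is taken
```
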
## Mnemonic Conversion Table

| Mnemonic         | Number |
|:-----------------|:------:|
| STRING\_LIT      |   0    |
| STRING\_DYN      |   1    |
| INT              |   2    |
| FLOAT            |   3    |
| CONS             |   4    |
| LIST\_LIT        |   5    |
| LIST\_DYN        |   6    |
| VECTOR\_LIT      |   7    |
| VECTOR\_DYN      |   8    |
| INTERN\_LIT      |   9    |
| INTERN\_DYN      |   10   |
| SYMBOL\_NAME     |   11   |
| MOV              |   12   |
| FUNCALL          |   13   |
| RETVAL\_COUNT    |   14   |
| ENTER\_LEXENV    |   15   |
| LEAVE\_LEXENV    |   16   |
| ENTER\_BLOCK     |   17   |
| LEAVE\_BLOCK     |   18   |
| SET\_VALUE       |   19   |
| SET\_FUNCTION    |   20   |
| GET\_VALUE       |   21   |
| GET\_FUNCTION    |   22   |
| NEWFUNCTION\_LIT |   23   |
| NEWFUNCTION\_DYN |   24   |
| PUT              |   25   |
| GET              |   26   |
| AND              |   27   |
| OR               |   28   |
| XOR              |   29   |
| NOT              |   30   |
| CJMP             |   31   |
| CAR              |   32   |
| CDR              |   33   |
| SETCAR           |   34   |
| SETCDR           |   35   |
| GETELT\_LIT      |   36   |
| GETELT\_DYN      |   37   |
| SETELT\_LIT      |   38   |
| SETELT\_DYN      |   39   |
| EQ\_TWO          |   40   |
| EQ\_N            |   41   |
| NUM\_GT          |   42   |
| NUM\_GE          |   43   |
| NUM\_EQ          |   44   |
| NUM\_LE          |   45   |
| NUM\_LT          |   46   |

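To show how a number from the table combines with the argument layouts, here is a sketch that encodes `INT val0, 10`. It rests on assumptions this document does not confirm: the opcode is stored as a single byte (the text only says all opcodes have the same length), fields are little-endian, and `encode_int` is a hypothetical helper.

```python
import struct

OPCODE_INT = 2  # number for INT in the conversion table


def encode_int(dest_type: int, dest_which: int, value: int) -> bytes:
    """Encode `INT dest:reg, value:i64` as raw bytes."""
    # ASSUMPTIONS: one-byte opcode, little-endian fields, no padding.
    # Layout: opcode:u8, dest (type:u8, which:u32), value:i64
    return struct.pack("<BBIq", OPCODE_INT, dest_type, dest_which, value)


encoded = encode_int(0, 0, 10)  # INT val0, 10 ("val" has type ID 0)
print(len(encoded))   # 14
print(encoded.hex())  # 0200000000000a00000000000000
```
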
## Examples
This:
```lisp
(format t "Hello World~%")
```
Compiles into:
```text
INTERN_LIT val0, "Hello World~%"
```

This:
```lisp
(defun foo (bar &key baz &rest qux)
  "FOO each of BAR and BAZ as well as each of QUX."
  (let (some-val (genval bar))
    (foo-internal some-val :baz baz qux)))

(foo 10 :baz 20)
```
Compiles into:
```text
NEWFUNCTION_LIT val0, 13 ;; not counting this instruction
ENTER_LEXENV
INTERN_LIT saved0, "foo"
ENTER_BLOCK saved0, 9 ;; not counting this instruction
INTERN_LIT val0, "genval"
;; NOTE bar already in arg0
FUNCALL val0, 1 ;; NOTE val0 clobbered here
INTERN_LIT val0, "foo-internal"
MOV arg0, ret0
MOV arg3, arg2
MOV arg2, arg1
INTERN_LIT arg1, ":baz"
FUNCALL val0, 4
LEAVE_BLOCK saved0
LEAVE_LEXENV

INTERN_LIT val1, "foo"
SET_FUNCTION val1, val0
INT arg0, 10
INTERN_LIT arg1, ":baz"
INT arg2, 20
FUNCALL val1, 3
```