## Overview

MelVM is the low-level, stack-based virtual machine that powers Themelio covenants. By only allowing looping for a fixed number of iterations, MelVM makes the worst-case cost of each covenant computable, trading off Turing completeness. Unlike Bitcoin scripts, however, MelVM covenants can compute all primitive recursive functions, allowing expressing almost all “interesting” constructs.

## Memory model

MelVM uses a Harvard architecture where the code itself is not part of the memory space. Memory is divided into two regions: the **stack** and the **heap**. Except when data is transferred between the stack and the heap, MelVM instructions directly operate only on stack data, pushing and popping items from the top of the stack.

A big difference from most low-level VMs is that the basic unit of data isn’t a word of binary data, but a recursive data structure:

```
pub enum Value {
Int(U256),
Bytes(im::Vector<u8>),
Vector(im::Vector<Value>),
}
```

That is, there are two **base types**:

- 256-bit unsigned integers
- Arbitrary-length bytestrings

As well as a **product type**: an immutable vector.

The stack is just a FIFO stack of `Value`

s, while the heap is a mapping from 16-bit **addresses** to `Value`

s.

## Execution model

MelVM takes as input:

```
/// Heap address where the transaction trying to spend the coin encumbered by this covenant (spender) is put
pub const ADDR_SPENDER_TX: u16 = 0;
/// Heap address where the spender's hash is put.
pub const ADDR_SPENDER_TXHASH: u16 = 1;
/// Heap address where the *parent* (the transaction that created the coin now getting spent)'s hash is put
pub const ADDR_PARENT_TXHASH: u16 = 2;
/// Heap address where the index, at the parent, of the coin being spent is put. For example, if we are spending the third output of some transaction, `Heap[ADDR_PARENT_INDEX] = 2`.
pub const ADDR_PARENT_INDEX: u16 = 3;
/// Heap address where the hash of the running covenant is put.
pub const ADDR_SELF_HASH: u16 = 4;
/// Heap address where the face value of the coin being spent is put.
pub const ADDR_PARENT_VALUE: u16 = 5;
/// Heap address where the denomination of the coin being spent is put.
pub const ADDR_PARENT_DENOM: u16 = 6;
/// Heap address where the additional data of the coin being spent is put.
pub const ADDR_PARENT_ADDITIONAL_DATA: u16 = 7;
/// Heap address where the height of the parent is put.
pub const ADDR_PARENT_HEIGHT: u16 = 7;
/// Heap address where the "spender index" is put. For example, if this coin is spent as the first input of the spender, then `Heap[ADDR_SPENDER_INDEX] = 0`.
pub const ADDR_SPENDER_INDEX: u16 = 8;
/// Heap address where the header of the last block is put. If the covenant is being evaluated for a transaction in block N, this is the header of block N-1.
pub const ADDR_LAST_HEADER: u16 = 9;
```

## List of opcodes

In the “meaning” field, all expressions are evaluated left to right. For example, `pop() - pop()`

means pop a value `x`

, then another value `y`

, and then compute `x-y`

.

### Arithmetic

Overflow always wraps. Whether or not the previous instruction overflowed can be queried.

Opcode | Encoding | Meaning |
---|---|---|

ADD | 0x10 | `push(pop() + pop())` |

SUB | 0x11 | `push(pop() - pop())` |

MUL | 0x12 | `push(pop() * pop())` |

DIV | 0x13 | `push(pop() / pop())` |

REM | 0x14 | `push(pop() % pop())` |

### Logic

All operators operate on integers and are bitwise.

Opcode | Encoding | Meaning |
---|---|---|

AND | 0x20 | `push(pop() & pop())` |

OR | 0x21 | `push(pop() | pop())` |

XOR | 0x22 | `push(pop() ^ pop())` |

NOT | 0x23 | `push(~pop())` |

EQL | 0x24 | `push(pop() == pop())` |

LT | 0x25 | `push(pop() < pop())` |

GT | 0x26 | `push(pop() > pop())` |

### Cryptography

Operators take in a bytestring and return a bytestring.

Opcode | Encoding | Meaning |
---|---|---|

HASH(n) | 0x30 + u16be | `push(blake3(pop()[..n]))` |

SIGEOK(n) | 0x32 + u16be | `push(ed25519_verify(msg = pop()[..n], pk = pop(), sig = pop()))` |

### Heap access

Opcode | Encoding | Meaning |
---|---|---|

LOAD | 0x40 | `push(heap[pop()])` |

STORE | 0x41 | `heap[pop()] = pop()` |

LOADIMM(n) | 0x42 + u16be | `push(heap[n])` |

STOREIMM(n) | 0x43 + u16be | `heap[n] = pop()` |

### Vectors

Despite their appearance, these operations, as well as those for bytestrings, are all quasi-constant-time because vectors and bytestrings can be represented as RRB trees or similar.

Opcode | Encoding | Meaning |
---|---|---|

VREF | 0x50 | `push(pop()[pop()])` |

VAPPEND | 0x51 | `push(pop() ++ pop())` |

VEMPTY | 0x52 | `push(empty vector)` |

VLENGTH | 0x53 | `push(pop().length)` |

VSLICE | 0x54 | `push(pop[pop()..pop()])` |

VSET | 0x55 | `push(pop[pop() => pop()])` |

VPUSH | 0x56 | `push(pop().push(pop()))` |

VCONS | 0x57 | `push(cons(pop(), pop()))` |

### Bytestrings

Opcode | Encoding | Meaning |
---|---|---|

BREF | 0x70 | `push(pop()[pop()])` |

BAPPEND | 0x71 | `push(pop() ++ pop())` |

BEMPTY | 0x72 | `push(empty vector)` |

BLENGTH | 0x73 | `push(pop().length)` |

BSLICE | 0x74 | `push(pop[pop()..pop()])` |

BSET | 0x75 | `push(pop[pop() => pop()])` |

BPUSH | 0x76 | `push(pop().push(pop()))` |

BCONS | 0x77 | `push(cons(pop(), pop()))` |

### Control flow

Opcode | Encoding | Meaning |
---|---|---|

JMP(n) | 0xa0 | Jump n instructions forward |

BEZ(n) | 0xa1 | If `pop()` is zero, jump n |

BNZ(n) | 0xa2 | If `pop()` isn’t zero, jump n |

LOOP(n, body) | 0xb0 + u16be + u16be | Loop `body` `n` times |

### Type conversions

Opcode | Encoding | Meaning |
---|---|---|

ITOB | 0xc0 | Pops an integer and converts to bytes |

BTOI | 0xc1 | Pops bytes and converts first 32B to integer |

TYPEQ | 0xc2 | Pops a value and returns what type it is: 0 if integer, 1 if bytes, 2 if vector. |

### Literals

Opcode | Encoding | Meaning |
---|---|---|

PUSHB(bytes) | 0xf0 + u8 length + bytes | Push the bytes to the stack |

PUSHI(int) | 0xf1 + u256be | Push the integer to the stack |