LuauToolkit

A collection of various Luau-related libraries.

[!WARNING] LuauToolkit is deeply unstable and is far from production ready.

Structure

Modules

The library is split into the following:

| Module | Description | Links |
| --- | --- | --- |
| bytecode | [WIP] Module for working with Luau bytecode. | [docs] [src] [examples] |

Binaries

Some modules also have command line applications, accessible under the bin/ directory.

Examples

A series of examples can be found under the examples/ directory.

You can run an example with:

lune run examples/<path_to_example> [args]

# eg:
lune run examples/bytecode/strip chunk.luauc stripped.luauc

Documentation

Documentation can be found here.

Installation

To install luau_toolkit, copy the lib/ directory to wherever you'd like to install the library, and rename it to luau_toolkit.

Optionally, create an @luau_toolkit alias in your .luaurc file to make the library accessible with require("@luau_toolkit/<module>"):

{
    //...
    "aliases": {
        "luau_toolkit": "./path/to/library"
    }
    //...
}

Modules

Parts

Bytecode Module

Introduction

The bytecode module provides functionality for generating and manipulating Luau bytecode.

Testing

LuauToolkit comes with a test suite.

lune run tests          # Run all tests
lune run tests bytecode # Run all bytecode tests
lune run tests --werror # Run all tests and treat warnings as errors

Dependencies and Tools

| Name | Use | Licence |
| --- | --- | --- |
| Dekkonot/int64-luau | [Lib] Handling 32bit+ integers. | MIT |
| plainenglishh/json.luau | [Lib] Pure-Luau JSON parsing. | MIT |
| plainenglishh/byteparse | [Lib] Parsing binary formats, e.g. bytecode. | MIT |
| lune-org/lune | [Tool] Script/test running. | MPL-2.0 |
| rust-lang/mdBook | [Tool] Documentation site generation. | MPL-2.0 |
| plainenglishh/mdbook.luau | [Dev] Custom documentation preprocessor. | MIT |

Luau Internals

This section contains information about the Luau language internals.

Parts

Bytecode Format

1. Introduction

This chapter attempts to describe the Luau bytecode format up to version 6.

2. Definitions

2.1. Terms

| Term | Meaning |
| --- | --- |
| proto | The bytecode representation ('prototype') of a function. |
| closure | An instantiated function, regardless of upvalue use. |
| upvalue | A value used by a function that belongs to an outside scope. |
| struct | A data structure encoding a series of field values without any padding. |
| enum | A set of possible values, each represented by an integer. |

2.2. Type Notation

| Name | Description |
| --- | --- |
| boolean | A true or false value. Encoded as a u8 where zero is false and anything else is true. |
| u_ | An unsigned integer _ bits wide. E.g. u8, u32. |
| i_ | A signed integer _ bits wide. E.g. i8, i32. |
| f_ | An IEEE 754 floating-point number _ bits wide. E.g. f32, f64. |
| varint | A protobuf-style variable-width integer. |
| string | A length-prefixed ASCII string. Length is encoded as a varint. |
| buffer | A length-prefixed blob of binary data. Length is encoded as a varint. |
| {_} | A length-prefixed array of _. Length is encoded as a varint. E.g. {string}. |
| _? | An optional value _, prefixed by a boolean indicating whether it is present. E.g. string?. |
| _! | A value that uses a special encoding that can't be represented by the notation above. See surrounding text for details. |

Any types not defined above will be defined in Section 3.
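
The primitive encodings above are enough to sketch a small reader. The following Python is a minimal illustration and not LuauToolkit's API: the `Reader` class and its method names are hypothetical, and the varint is assumed to be the unsigned base-128 form used by protobuf (7 payload bits per byte, least significant group first).

```python
class Reader:
    """Cursor over a bytes object (hypothetical helper, for illustration only)."""

    def __init__(self, data: bytes) -> None:
        self.data = data
        self.pos = 0

    def u8(self) -> int:
        value = self.data[self.pos]
        self.pos += 1
        return value

    def boolean(self) -> bool:
        # Zero is false, anything else is true.
        return self.u8() != 0

    def varint(self) -> int:
        # A set high bit means another byte follows.
        result = 0
        shift = 0
        while True:
            byte = self.u8()
            result |= (byte & 0x7F) << shift
            if (byte & 0x80) == 0:
                return result
            shift += 7

    def string(self) -> str:
        # string: varint length, then that many ASCII bytes.
        length = self.varint()
        raw = self.data[self.pos:self.pos + length]
        self.pos += length
        return raw.decode("ascii")

    def array(self, read_item):
        # {_}: varint length, then that many items.
        count = self.varint()
        return [read_item() for _ in range(count)]

    def optional(self, read_item):
        # _?: boolean presence flag, then the value if present.
        return read_item() if self.boolean() else None
```

For example, `Reader(data).array(reader.string)` style calls compose the primitives: a `{string}` is read by passing the bound `string` method to `array`.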

3. Structure

The encapsulating data structure, called the chunk, contains the metadata, strings, function definitions, constants, instructions and type/debug information required to execute the compiled program.

3.1. Chunk

The top level Chunk encodes the entire bytecode chunk, and is a struct with the following fields:

| Field Name | Type | Meaning |
| --- | --- | --- |
| luau_version | u8 | The bytecode version of the chunk. Tells the decoder how the rest of the chunk is encoded. A value of 0 indicates a compilation error. |
| types_version | u8 | The bytecode type version of the chunk. Tells the decoder what type information is encoded. |
| strings | {string} | The string table. Encodes strings used by the chunk. |
| userdata_types | UserdataTypes! | Only present when types_version == 3. Tagged userdata type name table. |
| protos | {Proto} | The proto table. Encodes function definitions. |
| main_proto | ProtoId | Indicates which function is the main function. |
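
As a rough sketch of how a decoder might start on a chunk, the Python below reads the two version bytes and the string table, following the field order in the table as written. The function names are hypothetical, and it assumes `types_version` is always present; bytecode versions older than the ones targeted here may not carry it.

```python
def read_varint(data: bytes, pos: int) -> tuple[int, int]:
    """Return (value, next_pos) for a base-128 varint starting at pos."""
    result = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if (byte & 0x80) == 0:
            return result, pos
        shift += 7


def read_chunk_header(data: bytes) -> dict:
    """Read luau_version, types_version and the string table (illustrative only)."""
    luau_version = data[0]
    if luau_version == 0:
        # Version 0 marks a chunk produced by a failed compilation.
        raise ValueError("chunk reports a compilation error")
    types_version = data[1]

    pos = 2
    count, pos = read_varint(data, pos)
    strings = []
    for _ in range(count):
        length, pos = read_varint(data, pos)
        strings.append(data[pos:pos + length].decode("ascii"))
        pos += length

    # 'offset' is where the next field (userdata_types or protos) begins.
    return {
        "luau_version": luau_version,
        "types_version": types_version,
        "strings": strings,
        "offset": pos,
    }
```

Returning the offset lets the caller continue with the next field without the sketch having to know how protos are decoded.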

3.2. UserdataTypes

UserdataTypes encodes tagged userdata type names as a null-terminated array of type name definitions.

Each definition consists of two values. The first is a u8 holding the userdata tag it defines, incremented by 1, so that a value of 0 can terminate the array. The second is a StringRef containing the type name.
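
The loop below is a minimal Python sketch of that encoding, assuming the string table has already been decoded into a list. `read_userdata_types` and `read_varint` are hypothetical helper names, not part of LuauToolkit.

```python
from typing import Optional


def read_varint(data: bytes, pos: int) -> tuple[int, int]:
    """Return (value, next_pos) for a base-128 varint starting at pos."""
    result = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if (byte & 0x80) == 0:
            return result, pos
        shift += 7


def read_userdata_types(
    data: bytes, pos: int, strings: list[str]
) -> tuple[dict[int, Optional[str]], int]:
    """Map userdata tag -> type name until the 0 terminator is reached."""
    names: dict[int, Optional[str]] = {}
    while True:
        tag_plus_one = data[pos]
        pos += 1
        if tag_plus_one == 0:
            break  # null terminator
        string_ref, pos = read_varint(data, pos)
        # StringRefs are one-indexed; 0 means 'no string'.
        name = strings[string_ref - 1] if string_ref != 0 else None
        names[tag_plus_one - 1] = name
    return names, pos
```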

3.3. Proto

A Proto encodes a function, and is a struct with the following fields:

| Field Name | Type | Meaning |
| --- | --- | --- |
| max_stack_size | u8 | How many registers, and therefore how large a stack, the function uses. |
| num_params | u8 | How many parameters the function accepts. Parameters occupy registers R0 to R{num_params - 1}. |
| num_upvalues | u8 | How many upvalues the function uses. |
| is_vararg | boolean | Whether the function accepts variadic arguments. |
| flags | ProtoFlags | Bitfield containing function-related flags. |
| types | ProtoTypeInfo | Type information used to guide native code generation. |
| instructions | buffer | The instructions buffer. |
| constants | {Constant} | Constants used by the function. |
| child_protos | {ProtoId} | Child functions. |
| line_defined | varint | The line the function was defined on. |
| debug_name | StringRef | The debug name of the function. |
| line_info | ProtoLineInfo? | Instruction line information. |
| debug_info | ProtoDebugInfo? | Debug information for locals and upvalues. |

3.4. ProtoId

A ProtoId is a varint that points to a Proto in the chunk proto table.

3.5. ProtoFlags

A ProtoFlags is a bitfield containing the following fields:

| Bit Position | Flag | Meaning |
| --- | --- | --- |
| 0 | native_module | When used on the main proto, indicates the chunk was compiled with --!native. |
| 1 | native_cold | Indicates the proto isn't profitable to compile using native code generation. |
| 2 | native_function | When used on the main proto, indicates at least one function within the module uses the native attribute. (**assumed based on observed behaviour:** when used on non-main protos, indicates the attribute was used on that proto.) |
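
A bitfield like this maps naturally onto a flag enum. The Python sketch below uses the bit positions from the table; the `ProtoFlags` class and `flag_names` helper are illustrative names only.

```python
from enum import IntFlag


class ProtoFlags(IntFlag):
    """Bit positions taken from the table above."""
    NATIVE_MODULE = 1 << 0
    NATIVE_COLD = 1 << 1
    NATIVE_FUNCTION = 1 << 2


def flag_names(raw: int) -> list[str]:
    """List the names of the flags set in a raw flags value."""
    return [flag.name.lower() for flag in ProtoFlags if raw & flag]
```

Bits outside the three known positions are simply ignored by `flag_names`, which keeps the sketch tolerant of flags added in later versions.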

3.6. StringRef

A StringRef is a varint that points to a string in the chunk string table. StringRefs are one-indexed to allow 0 to act as a 'null' variant. The actual index of a string would be string_ref - 1.
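
In code, resolving a StringRef is a one-line lookup with a guard for the null case. This Python sketch assumes the string table is held as a list; `resolve_string_ref` is a hypothetical name.

```python
from typing import Optional


def resolve_string_ref(strings: list[str], string_ref: int) -> Optional[str]:
    """Return the referenced string, or None for the 'null' reference 0."""
    if string_ref == 0:
        return None
    # One-indexed reference -> zero-indexed list position.
    return strings[string_ref - 1]
```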

3.7. ProtoTypeInfo

A ProtoTypeInfo encodes type information for a Proto, and begins with a header struct with the following fields:

| Field Name | Type | Meaning |
| --- | --- | --- |
| function_length | varint | How long the function type string is. |
| upvalue_count | varint | How many upvalue types there are. |
| local_count | varint | How many local types there are. |

After the header, an array of ASCII characters function_length bytes long contains the function's type.

After the function type, an array upvalue_count items long contains the upvalue types, encoded as a BytecodeType.

After the upvalue types, an array local_count items long contains the local types, encoded as:

| Field Name | Type | Meaning |
| --- | --- | --- |
| type | BytecodeType | The type of the local. |
| register | u8 | The register the local uses. |
| start_pc | varint | The start of the local's existence, in instructions. |
| length | varint | How many instructions the local exists for. |
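
Read in sequence, the layout described in this section can be sketched as follows. This Python follows the description here literally (header, function type bytes, upvalue types, then local types) and may not match every bytecode version; the function and key names are hypothetical.

```python
def read_varint(data: bytes, pos: int) -> tuple[int, int]:
    """Return (value, next_pos) for a base-128 varint starting at pos."""
    result = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if (byte & 0x80) == 0:
            return result, pos
        shift += 7


def read_proto_type_info(data: bytes, pos: int = 0) -> tuple[dict, int]:
    # Header: three varints.
    function_length, pos = read_varint(data, pos)
    upvalue_count, pos = read_varint(data, pos)
    local_count, pos = read_varint(data, pos)

    # Function type: function_length raw bytes.
    function_type = data[pos:pos + function_length]
    pos += function_length

    # Upvalue types: one BytecodeType byte each.
    upvalue_types = list(data[pos:pos + upvalue_count])
    pos += upvalue_count

    # Local types: type byte, register byte, then two varints.
    local_types = []
    for _ in range(local_count):
        type_byte = data[pos]
        register = data[pos + 1]
        pos += 2
        start_pc, pos = read_varint(data, pos)
        length, pos = read_varint(data, pos)
        local_types.append({
            "type": type_byte,
            "register": register,
            "start_pc": start_pc,
            "length": length,
        })

    info = {
        "function_type": function_type,
        "upvalue_types": upvalue_types,
        "local_types": local_types,
    }
    return info, pos
```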

3.8. BytecodeType

A BytecodeType is a byte containing type information.

The low 7 bits of the BytecodeType are an enum of possible types. The high bit (bit 7) indicates whether the type is optional.

The possible types are:

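| Type | Meaning |
| --- | --- |
| 0 | nil |
| 1 | boolean |
| 2 | number |
| 3 | string |
| 4 | table |
| 5 | function |
| 6 | thread |
| 7 | userdata |
| 8 | vector |
| 9 | buffer |
| 15 | any |
| 64..95 (32 entries) | Tagged userdata types (See UserdataTypes) |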
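
Splitting a BytecodeType byte into its two parts is a pair of mask operations. The Python sketch below assumes the optional flag is the most significant bit (0x80) and that the 32 tagged userdata entries start at 64; `decode_bytecode_type` is a hypothetical name.

```python
TYPE_NAMES = {
    0: "nil", 1: "boolean", 2: "number", 3: "string", 4: "table",
    5: "function", 6: "thread", 7: "userdata", 8: "vector", 9: "buffer",
    15: "any",
}
TAGGED_USERDATA_BASE = 64
TAGGED_USERDATA_COUNT = 32


def decode_bytecode_type(byte: int) -> tuple[str, bool]:
    """Return (type name, is_optional) for a BytecodeType byte."""
    optional = bool(byte & 0x80)  # assumed: high bit marks optional
    kind = byte & 0x7F            # low 7 bits select the type
    if TAGGED_USERDATA_BASE <= kind < TAGGED_USERDATA_BASE + TAGGED_USERDATA_COUNT:
        # Index into the chunk's tagged userdata type table.
        return f"userdata#{kind - TAGGED_USERDATA_BASE}", optional
    return TYPE_NAMES.get(kind, f"unknown({kind})"), optional
```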

3.9. Constant

A Constant encodes a constant value.

The first byte of a constant encodes the constant type as an enum of:

| Type | Meaning |
| --- | --- |
| 0 | nil |
| 1 | boolean |
| 2 | number |
| 3 | string |
| 4 | import |
| 5 | table |
| 6 | closure |
| 7 | vector |

The rest of the Constant depends on the constant type:

| Type | Encoding |
| --- | --- |
| nil | Nothing |
| boolean | boolean |
| number | f64 |
| vector | Four f32s |
| string | StringRef |
| import | ImportId |
| table | TableShape |
| closure | ProtoId |

3.10. TableShape

A TableShape encodes the shape of a constant table as a length-prefixed array of key-value pairs, where each element points to another constant table entry.

3.11. ImportId

An ImportId is a u32 that encodes an import path (e.g. string.format or _G.hello.world), as a series of constant IDs pointing to constant strings.

ImportIds are encoded as:

| Bit Position | Type | Meaning |
| --- | --- | --- |
| 0 | u10 | The third path component. |
| 10 | u10 | The second path component. |
| 20 | u10 | The first path component. |
| 30 | u2 | The number of path components. |
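
Unpacking an ImportId is a matter of shifts and masks. The Python sketch below follows the bit layout used by the reference Luau VM, with the component count in bits 30-31 and the first path component in bits 20-29; treat that ordering as an assumption and check it against the bytecode version you target. `decode_import_id` is a hypothetical name.

```python
def decode_import_id(import_id: int) -> list[int]:
    """Return the constant indices of the path components, in path order."""
    count = (import_id >> 30) & 0x3
    components = [
        (import_id >> 20) & 0x3FF,  # first component (assumed position)
        (import_id >> 10) & 0x3FF,  # second component
        import_id & 0x3FF,          # third component
    ]
    # Only the first 'count' components are meaningful.
    return components[:count]
```

For a path such as string.format, the result would hold two indices, each pointing at a constant string in the proto's constant table.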

3.12. ProtoLineInfo

A ProtoLineInfo encodes the line information of a Proto's instructions.

This is implementation defined.

3.13. ProtoDebugInfo

A ProtoDebugInfo encodes debugging information for a Proto, and contains the following fields:

| Field Name | Type | Meaning |
| --- | --- | --- |
| locals | {Local} | Debug information for locals. |
| upvalues | {Upvalue} | Debug information for upvalues. |

where Local is:

| Field Name | Type | Meaning |
| --- | --- | --- |
| name | StringRef | The name of the local. |
| start_pc | varint | The instruction at which the local first appears. |
| end_pc | varint | The instruction at which the local stops being used. |
| register | u8 | The register the local uses. |

and Upvalue is:

| Field Name | Type | Meaning |
| --- | --- | --- |
| name | StringRef | The name of the upvalue. |
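
Since both fields are length-prefixed arrays, decoding a ProtoDebugInfo amounts to two counted loops. The Python below is a sketch under the field layouts given in this section; names are left as raw StringRefs, and the function and key names are hypothetical.

```python
def read_varint(data: bytes, pos: int) -> tuple[int, int]:
    """Return (value, next_pos) for a base-128 varint starting at pos."""
    result = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if (byte & 0x80) == 0:
            return result, pos
        shift += 7


def read_proto_debug_info(data: bytes, pos: int = 0) -> tuple[dict, int]:
    # locals: {Local}
    local_count, pos = read_varint(data, pos)
    local_entries = []
    for _ in range(local_count):
        name, pos = read_varint(data, pos)      # StringRef
        start_pc, pos = read_varint(data, pos)
        end_pc, pos = read_varint(data, pos)
        register = data[pos]                    # u8
        pos += 1
        local_entries.append({
            "name": name,
            "start_pc": start_pc,
            "end_pc": end_pc,
            "register": register,
        })

    # upvalues: {Upvalue}, each a single StringRef
    upvalue_count, pos = read_varint(data, pos)
    upvalue_names = []
    for _ in range(upvalue_count):
        name, pos = read_varint(data, pos)
        upvalue_names.append(name)

    return {"locals": local_entries, "upvalues": upvalue_names}, pos
```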

4. Instructions Structure

An instruction buffer encodes instructions as a series of 32-bit words.

4.1. Instruction Word Structure

The first byte of an instruction contains the opcode.

The rest of the instruction contains its operands, which can be encoded in one of the following modes:

| Mode | Encoding |
| --- | --- |
| abc | Three u8s. |
| ad | One u8 followed by an i16. |
| e | One i24. |

Some instructions have additional operands encoded in a follow up 'auxiliary' instruction.

Other instructions, like BREAK and NOP, have no operands at all. Internally, LuauToolkit treats operand-less instructions as abc for simplicity's sake.
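
The three operand modes are different views of the same 24 bits. The Python sketch below decodes a word every way at once and leaves it to the caller to pick the view that matches the opcode. It assumes words are stored little-endian, so the opcode is the low byte of the word; `iter_words` and `decode_word` are hypothetical names.

```python
import struct


def iter_words(buffer: bytes):
    """Yield each 32-bit instruction word (assumed little-endian)."""
    for (word,) in struct.iter_unpack("<I", buffer):
        yield word


def decode_word(word: int) -> dict:
    """Split a word into its opcode and all three operand views."""
    opcode = word & 0xFF
    a = (word >> 8) & 0xFF
    b = (word >> 16) & 0xFF
    c = (word >> 24) & 0xFF

    # ad mode: the upper 16 bits as a signed integer.
    d = word >> 16
    if d >= 0x8000:
        d -= 0x10000

    # e mode: the upper 24 bits as a signed integer.
    e = word >> 8
    if e >= 0x800000:
        e -= 0x1000000

    return {"opcode": opcode, "abc": (a, b, c), "ad": (a, d), "e": e}
```

Auxiliary words are not handled here; a real decoder would need to know which opcodes consume the following word and skip it accordingly.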

4.2. List of Instructions