asMSX

AsMSX, originally developed by Pitpan. More info: https://www.msx.org/wiki/AsMSX.

asMSX - MSX cross assembler

1. Introduction

1.1. Description

asMSX is an assembler for the Z80 processor with a number of extensions that make programming for MSX easy. It is a cross-assembler that runs on Windows or Linux PCs. Assembly on a modern PC is nearly instant compared with native MSX assemblers. There is no limit on source file size. Source can be edited in any text editor.

1.2. Features

1.3. Project goal

The goal of the asMSX project was to create a Z80 cross-assembler that is flexible, easy to use, reliable, and designed from the ground up for developing MSX ROM and MegaROM programs. This goal is fully achieved in the current version.

asMSX is a small console program that works very fast. It uses standard libraries present on all PCs.

1.4. Syntax

asMSX implements Z80 assembly language slightly differently from standard Zilog syntax. The major difference is that square brackets [ ], not parentheses ( ), are used for indirect addressing. This design decision leaves parentheses free for complex mathematical expressions, where they set the order of evaluation.

asMSX supports the ZILOG directive to retain source-level compatibility with existing code. It switches indirect addressing to parentheses, and grouping in mathematical expressions to square brackets.
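The two syntaxes can be compared side by side. The following is a minimal sketch based only on the rules described above; the address 0C000h and the expressions are arbitrary illustrations, not taken from a real program.

```asm
        ; Default asMSX syntax: brackets dereference, parentheses group
        ld a,[0c000h]           ; A = byte stored at address C000h
        ld hl,(2+3)*4           ; HL = 20

        .zilog                  ; switch syntax from this point on

        ; Zilog syntax: parentheses dereference, brackets group
        ld a,(0c000h)           ; A = byte stored at address C000h
        ld hl,[2+3]*4           ; HL = 20
```

Both halves are intended to produce the same machine code; only the notation differs.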

1.5. Use

asMSX is a command-line program, so the easiest way to use it is from the command prompt. To assemble a file, use the following command:

asmsx [Optional params] filename.asm

If the extension is not provided, asMSX assumes “.asm” by default. Source code files are expected to be plain 8-bit ASCII text files. asMSX supports text files created on both Windows and Linux (CR/LF or just LF line endings). On Windows you can also assemble a source file by dragging it onto the asMSX desktop icon. This method is not recommended, since long file names may not be handled well. Please also avoid dots and other special characters in file names, apart from the dot that separates the file extension.

asMSX produces messages during assembly.

If no errors are encountered, the following files will be generated:

filename.sym: text file listing all the symbols defined in the program with their decimal values. This file is compatible with the BlueMSX debugger and makes debugging easier. asMSX won’t generate a SYM file if no constant or variable symbols were defined in the program.

filename.txt: text file with the messages generated by PRINT directives in the program code. If no PRINT directives are used, the file won’t be generated.

Depending on use of output type directives (rom/.rom, .megarom, basic, cas, wav) in your filename.asm, asMSX will generate one or more of the following binary files:

1.5.1 Commandline parameters

In asMSX there are the following allowed parameters:

2. Assembly language

2.1. Assembly syntax

asMSX accepts the following line formats in source code:

[label:]    [directive[parameters]]    [;comments]
[label:]    [opcode[parameters]]       [;comments]

Each block in brackets is optional. There is no limit on line length. White space (tabs, spaces and empty lines) is ignored.

Example
loop:
        ld a,[0ffffh]   ; reads the master record

points:
        db 20,30,40,50  ; points table

        org 08000h      ; code offset on page 2
        nop

2.2. Labels

A label is a symbolic name for a memory address. A label name must start with a letter or an underscore _; the remaining characters can be letters or digits. A label definition always ends with a colon :. Label names are case sensitive: LABEL: and Label: are treated as two separate labels.

Example
LABEL:
Label:
_alfanumeric:
loop001:

The colon defines a label and sets its value to the memory address of the next opcode or data value. To get the address of a label, use its name without the colon. You can define local labels by prefixing the label name with two “at” symbols @@ or with a single dot. A local label name is only valid up to the next global label. This lets you reuse label names for trivial tasks like loops, so you don’t have to invent a bunch of different names. It simplifies programming and improves code readability. The following is valid code:

Function1:
        ...
@@loop: 
        ...
Function2:
        ...
@@loop:
        ...

OR

Function1:
        ...
.loop:	
        ...
Function2:
        ...
.loop:
        ...
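As a fuller illustration of defining and referencing local labels, here is a minimal sketch with two routines that each use their own .loop. The routine names, Buffer and Flags are hypothetical, and the sketch assumes the code is assembled for RAM so that the reserved bytes are writable.

```asm
FillBuffer:
        ld hl,Buffer            ; label name without colon = its address
        ld b,16                 ; 16 bytes to clear
        xor a                   ; A = 0
.loop:
        ld [hl],a
        inc hl
        djnz .loop              ; jumps to FillBuffer's .loop
        ret

ClearFlags:                     ; new global label: local names start over
        ld hl,Flags
        ld b,4
        xor a
.loop:
        ld [hl],a
        inc hl
        djnz .loop              ; jumps to ClearFlags' .loop
        ret

Buffer: ds 16
Flags:  ds 4
```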

2.3. Numeric expression

A numeric expression is a number or a result of operation on numbers.

2.3.1. Numeric formats

There are several popular numeric systems (radices, plural form of radix). Here are a few examples of syntax for such systems:

DECIMAL INTEGER: radix 10 numbers are usually expressed as a group of one or more decimal digits. The only restriction is that you must explicitly express zeroes. This is the numeric system that people use in everyday life.

Example
0 10 25 1 255 2048

DECIMAL FLOATING POINT: a decimal number with dot separating integer from fraction. Syntax requires dot to be present for the constant to be recognized as a floating point value.

Example
3.14 1.25 0.0 32768.0 -0.50 15.2

OCTAL: radix 8 numbers can be specified using two conventions. The first is similar to C, C++ and Java: the number starts with 0 and continues with octal digits 0..7. The second is native to assemblers: octal digits followed by the letter O, in lower or upper case. The second form is included for compatibility but is not recommended, because the upper case letter O is easy to confuse with the digit zero 0.

Example
01 077 010 1o 77o 10o

HEXADECIMAL: radix 16 numbers, very popular in assembly programming. They can be specified using three different conventions.

C, C++ and Java convention is a number that starts with 0x prefix and continues with one or more hexadecimal digits: 0..9, a..f or A..F.

Second convention is borrowed from Pascal: a group of hexadecimal digits prefixed with $.

The third convention is native to assemblers: a group of hexadecimal digits, followed by letter h or H. This convention requires first digit to always be numeric. If a hexadecimal number starts with letter, you should prefix the number with ‘0’.

Example
0x8a 0xff 0x10
$8a $ff $10
8ah 0ffh 10h

BINARY: radix 2 numbers are specified as a group of binary digits 0 and 1, followed by letter b or B.

Example
1000000b 11110000b 0010101b 1001b
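To see the notations next to each other, the following sketch writes the same value, 255, in every integer format described in this section.

```asm
        db 255                  ; decimal
        db 0xff, $ff, 0ffh      ; hexadecimal, three conventions
        db 0377, 377o           ; octal, two conventions
        db 11111111b            ; binary
```

Each of the seven bytes emitted is FFh.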

2.3.2. Operators

Numeric expressions use operators on numbers in supported numeric systems. Common notation is used for typical arithmetic operators:

Less common operators borrow from C/C++:

The precedence order is the same as in C/C++. Parentheses can be used to explicitly specify the order of evaluation in arithmetic expressions.

Example
((2*8)/(1+3))<<2
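Expressions like this can be used wherever an instruction expects a number. The sketch below uses only operators that appear in this document; Columns and Row are hypothetical variables and 1800h is an arbitrary base address.

```asm
Columns=32
Row=5
        ld a,((2*8)/(1+3))<<2       ; (16/4)<<2 = 4<<2 = 16
        ld hl,1800h+(Row*Columns)   ; 1800h + 160 = 18A0h
```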

The same rules apply to all numbers, including non-decimal ones. Additionally, the following functions are supported:

PI is predefined as a double precision floating point constant. It can be used in numeric expressions.

Example
sin(pi*45.0/180.0)

It is useful to support non-integer values in Z80 assembler programs by providing simple floating to fixed point conversion mechanism. This is done using two conversion functions:

$ is a special read-only variable. During assembly, $ is replaced with the memory address of the next opcode or data; during execution this corresponds to the value of the PC register.
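A common use of $ is to measure a block of data. This is a sketch only: it assumes $ can appear in a variable assignment like any other term, and Message and MessageLen are hypothetical names.

```asm
Message:
        dt "HELLO"
MessageLen=$-Message            ; address after the text minus its start = 5

PrintSetup:
        ld hl,Message           ; HL = address of the text
        ld b,MessageLen         ; B = number of characters
```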

2.4. Data definition

You can include data in your assembler program using four different directives:

        db data,[data...]
        defb data,[data...]
        dt "text"
        deft "text"

These instructions define data as 8-bit byte values. dt was included for compatibility with other Z80 assemblers. All four directives are equivalent.

        dw data,[data...]
        defw data,[data...]

This will define 16-bit word data.

        ds X
        defs X

This will reserve X bytes in memory starting with current memory address.
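The directives above can be combined freely under labels. A minimal sketch, with hypothetical label names and arbitrary values:

```asm
Palette:
        db 0,1,2,3              ; four 8-bit values
Title:
        dt "asMSX"              ; text stored as bytes
Vectors:
        dw 4000h,8000h          ; two 16-bit words
Buffer:
        ds 32                   ; 32 bytes reserved from the current address
```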

Special predefined directives are available for reserving memory for variables of conventional sizes, such as byte and word:

2.5. Directives

Directives are predefined instructions that help control the code and enable additional asMSX features.

.ZILOG This directive will switch the use of square brackets and parentheses from the point it is defined on. Parentheses will be used for “memory content at” indirection and brackets to group operations in complex arithmetic expressions.

.ORG X This directive is used to force current memory address offset. All subsequent code will be assembled from that offset on.

.PAGE X This directive is similar to .ORG, but the offset is given in 16KB pages instead of bytes. .PAGE 0 is equivalent to .ORG 0000h, .PAGE 1 to .ORG 4000h, .PAGE 2 to .ORG 8000h and .PAGE 3 to .ORG 0C000h.

.EQU This directive is used to define constants. Naming rules are mostly the same as for labels. There are no local constants.

Variable = expression asMSX can use integer variables. Variables must be initialized by assigning them an integer value. It is possible to perform arithmetic operations with variables.

Example
Value1=0
Value1=Value1+1
ld a,Value1
Value1=Value1*2
ld b,Value1

.BIOS Predefines the call addresses of common BIOS routines, including those specified in the MSX, MSX2, MSX2+ and Turbo-R standards. The usual names are used, in upper case.

.ROM Indicates that a ROM header should be generated. It is important to use .PAGE directive first to define start address. .START directive can be used to indicate the start address for the program.

.MegaROM [mapper] Generates header for specified MegaROM mapper. This directive will also set start address to sub-page 0 of selected mapper, so using ORG, PAGE or SUBPAGE directives is not necessary. Supported mapper types are:

.BASIC Generates the header for a loadable binary MSX-BASIC file. It is important to use .ORG directive to indicate the intended start address for the program. Use .START directive if execution entry point is not at the start of program. The default extension of the output file is BIN.

.MSXDOS Produces a COM executable for MSX-DOS. No need for .ORG directive, because COM files are always loaded at 0100h.

.START X Indicates the starting execution address for ROM, MegaROM and BIN files, if it is not at the beginning of the file.
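Put together, the directives described so far are enough for a small ROM skeleton. The following is a sketch under assumptions: Init is a hypothetical label, and CHPUT is the standard MSX BIOS name for the character output routine, assumed here to be among the names that .BIOS defines.

```asm
        .bios                   ; predefine BIOS call addresses
        .page 2                 ; assemble from 8000h
        .rom                    ; generate the ROM header
        .start Init             ; execution entry point

Init:
        ld a,41h                ; ASCII code of 'A'
        call CHPUT              ; print the character in A
.hang:
        jr .hang                ; loop forever
```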

.SEARCH For ROMs and MegaROMs that start on page 1 (4000h), it automatically finds and sets the slot and subslot on page 2 (8000h). It is equivalent to the following code:

        call 0138h ;RSLREG
        rrca
        rrca
        and 03h
        
        ; Secondary Slot
        ld c,a
        ld hl,0FCC1h
        add a,l
        ld l,a
        ld a,[hl]
        and 80h
        or c
        ld c,a
        inc l
        inc l
        inc l
        inc l
        ld a,[hl]
        
        ; Define slot ID
        and 0ch
        or c
        ld h,80h
        
        ; Enable
        call 0024h ;ENASLT

.SUBPAGE n AT X This macro is used to define the different sub-pages of a MegaROM. In the MegaROM code generation model of asMSX, all code and data are placed in sub-pages, which correspond to the logical blocks that the mapper operates on. You must provide the sub-page number and the execution entry point.

.SELECT n AT x / .SELECT register AT x Like .SUBPAGE, this macro selects sub-page n with the execution entry point at x. The specific code generated for this directive depends on the selected MegaROM mapper type. It doesn’t change the value of any register or affect the interrupt mode or status flags.

.PHASE X / .DEPHASE These two directives enable virtual memory addressing. Instructions are assembled to be stored at one memory address but executed from another. This is useful for code that is kept in a ROM image, copied to RAM and then executed. The effect is that label values are calculated for the supplied address. .DEPHASE reverts the assembler to normal behaviour; ORG, PAGE or SUBPAGE have the same effect.
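The copy-to-RAM case can be sketched as follows. All label names are hypothetical and 0C000h is an arbitrary RAM address; the sketch assumes the label defined before .PHASE keeps its ROM address while the labels inside the block take addresses starting at 0C000h.

```asm
        ld hl,RamCodeSrc                ; where the block is stored in ROM
        ld de,0C000h                    ; where it will run in RAM
        ld bc,RamCodeEnd-RamCodeStart   ; block length in bytes
        ldir                            ; copy ROM -> RAM
        jp RamCodeStart                 ; jump to the RAM copy

RamCodeSrc:
        .phase 0C000h
RamCodeStart:
        xor a
        ld [Counter],a                  ; Counter resolves to a RAM address
        ret
Counter:
        db 0
RamCodeEnd:
        .dephase
```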

.CALLBIOS X Calls a BIOS routine from an MSX-DOS program. It is equivalent to the following code:

        LD IY,[EXPTBL-1]
        LD IX,ROUTINE
        CALL CALSLT

.CALLDOS X Calls an MSX-DOS function. It is equivalent to the following code:

        LD C,CODE
        CALL 0005h
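A minimal MSX-DOS program using this directive might look like the sketch below. It relies on one fact not stated in this document: MSX-DOS function 02h is console output and takes the character in register E.

```asm
        .msxdos                 ; COM file, loaded at 0100h

        ld e,41h                ; ASCII code of 'A'
        .calldos 02h            ; console output: prints the character in E
        ret                     ; return to MSX-DOS
```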

.SIZE X Sets the output file size in kilobytes.

.INCLUDE "file" Includes source code from an external file.

.INCBIN "file" [SKIP X] [SIZE Y] Injects the contents of a binary file into the program. The optional SKIP and SIZE parameters let you include only Y bytes, starting at offset X in the file.
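For instance, SKIP can drop a file header and SIZE can cap the amount of data taken. In this sketch the file name tiles.bin is hypothetical, and the file is assumed to be a BSAVE-style binary whose first 7 bytes are a header.

```asm
Tiles:
        .incbin "tiles.bin" SKIP 7 SIZE 2048   ; skip 7 bytes, then take 2048
```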

.RANDOM(n) Generates a random integer number from the range of 0 to n-1. Provides better entropy than the Z80 R register.

.DEBUG "text" Adds text to assembled program that is visible during debugging, but does not affect the execution. BlueMSX debugger supports this extended functionality.

.BREAK [X] / .BREAKPOINT [X] Defines a breakpoint for the BlueMSX debugger. It doesn’t affect execution, but it should be removed from the final build. If the address X is not given, the breakpoint is set at the position where the directive appears.

REPT n / ENDR This macro repeats a block a given number of times. Nesting allows generation of complex tables. There is one restriction: the number of repetitions must be an integer literal; it can’t be calculated from a numeric expression.

Example
    REPT 16
      OUTI
    ENDR
  
    Y=0
    REPT 10
      X=0             ; restart the column counter on every row
      REPT 10
        DB X*Y        ; one entry of a 10x10 multiplication table
        X=X+1
      ENDR
      Y=Y+1
    ENDR

.PRINTTEXT "text" / .PRINTSTRING "text" Prints text to the output text file.

.PRINT expression / .PRINTDEC Prints a numeric expression in decimal format.

.PRINTHEX expression Prints a numeric expression in hexadecimal format.

.PRINTFIX expression Prints a fixed-point numeric value.
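The print directives are handy for reporting sizes and addresses at assembly time. A minimal sketch with hypothetical labels CodeStart and CodeEnd; the messages end up in filename.txt.

```asm
CodeStart:
        nop
        nop
        ret
CodeEnd:

        .printtext "Code size in bytes:"
        .print CodeEnd-CodeStart        ; decimal
        .printtext "Code ends at:"
        .printhex CodeEnd               ; hexadecimal
```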

.CAS ["text"] / .CASSETTE ["text"] Generates a tape (CAS) file with the given tape file name. This only works if the output file type is a loadable BASIC program, a ROM in page 2 or a Z80 binary blob without header. The supplied name is used to load the program from tape; its maximum length is 6 characters. If not explicitly specified, it is set to the name of the output file.

.WAV ["text"] Like the previous command, but instead of CAS file, it generates a WAV audio file that can be loaded directly on a real MSX through tape-in port.

.FILENAME ["text"] Using this directive will set the name of the output file. If not explicitly specified, source file name will be used, plus appropriate extension.

.SINCLAIR (currently broken) This directive sets the output file type to TAP format with the appropriate header. It is intended for loading on ZX Spectrum emulators, or on real hardware if you have a working playback application.

2.6. Comments

It is highly recommended to use comments throughout the assembler source code. asMSX supports comments in a variety of formats:

; Comment

Single line comment. This is the standard for assembler listings. Entire line up to carriage return is ignored.

// Comment

Same as above. Two consecutive slashes is a convention from C++ and C since C99.

/* Comments */
{ Comments }

C/C++ and Pascal style multi line comments. All text between the delimiters is skipped during assembly.

2.7. Conditional assembly

asMSX includes preliminary support for conditional assembly. The format of a conditional assembly block is as follows:

IF condition
  Instruction
ELSE
  Instruction
ENDIF

The condition can be of any type, consistent with C/C++ rules. A condition is considered true if evaluation result is non-zero.

If condition is true, asMSX will assemble code that follows IF. If the condition is false, code after ELSE will be assembled.

The ELSE block is optional.

ENDIF is mandatory; it closes the conditional block.

The current IF nesting limit is 15. It may become unlimited in a future rewrite.

Example
IF (computer==MSX)
  call MSX1_Init
ELSE
  IF (computer==MSX2)
    call MSX2_Init
  ELSE
    IF (computer==MSX2plus)
      call MSX2plus_Init
    ELSE
      call TurboR_Init
    ENDIF
  ENDIF
ENDIF

In addition, all code, directives and macros are processed according to the conditions. This enables structures like the following:

CARTRIDGE=1
BINARY=2
format=CARTRIDGE
IF (format==CARTRIDGE)
  .page 2
  .ROM
ELSE
  .org 8800h
  .Basic
ENDIF
.START Init

There is a limitation on conditional instructions: IF condition, ELSE and ENDIF must each appear on their own line. They can’t be mixed with any other instructions or macros. The following code will fail with the current asMSX:

IF (variable) nop ELSE inc a ENDIF

It should be written as follows:

IF (variable)
  nop
ELSE
  inc a
ENDIF

IFDEF can be used to branch code assembly depending on whether a symbol is defined.

Example
IFDEF tests
  .WAV "Test"
ENDIF

This snippet generates a WAV file only if the label or variable tests was previously defined in the source code. IFDEF only recognizes a symbol if it is defined before the IFDEF.
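Because the symbol must already exist when IFDEF is reached, a build switch is typically defined at the top of the source. A minimal sketch with a hypothetical variable DEBUG_BUILD:

```asm
DEBUG_BUILD=1                   ; comment this line out for a release build

IFDEF DEBUG_BUILD
  .printtext "Debug build"
  .break                        ; breakpoint for the BlueMSX debugger
ENDIF
```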