Merge pull request #22 from Ho-Ro/master
z80 output format (bin file with header telling file offset) for the asm
sarnau authored Sep 15, 2024
2 parents 04df214 + ff0d7d3 commit 64688d4
Showing 7 changed files with 107 additions and 102 deletions.
7 changes: 5 additions & 2 deletions Makefile
@@ -1,6 +1,9 @@
 CC=gcc
 CFLAGS=-I. -Wall
-DEPS = z80_assembler.h
+DEPS = z80_assembler.h kk_ihex_read.h kk_ihex_write.h Makefile
+
+%.o: %.cpp $(DEPS)
+	$(CC) -c -o $@ $< $(CFLAGS)

 %.o: %.cp $(DEPS)
 	$(CC) -c -o $@ $< $(CFLAGS)
@@ -13,7 +16,7 @@ all: z80assembler z80disassembler
 z80assembler: z80_assembler.o z80_tokenize.o z80_compile.o z80_calc.o kk_ihex_write.o
 	$(CC) -o $@ $^ $(CFLAGS)

-z80disassembler: z80_disassembler.o file.o kk_ihex_read.o
+z80disassembler: z80_disassembler.o kk_ihex_read.o
 	$(CC) -o $@ $^ $(CFLAGS)

 clean:
56 changes: 28 additions & 28 deletions README.md
@@ -7,20 +7,20 @@ Z80 Disassembler

 I created this small disassembler for a Z80 cpu at one afternoon. It is a commandline tool. The size of the ROM and entry points have to be coded directly in the sourcecode.

-Every ANSI C++ compiler should compile this program. It only uses some ANSI C functions (look into ''main()'') for loading a file called "EPROM".
+Every ANSI C++ compiler should compile this program. It only uses some ANSI C functions (look into `main()`).

 The program has two parts:

-- Analyze the code. The disassembler tries to analyze what part of the binary data is program code and what part is data. It start with all hardware vectors of the Z80 (''RST'' opcodes, NMI) and parses all jumps via a recursive analyze via ''ParseOpcode()''. Every opcode is marked in an array (''OpcodesFlags''). There are some exceptions, the parser can't recognize:
+- Analyze the code. The disassembler tries to analyze what part of the binary data is program code and what part is data. It start with all hardware vectors of the Z80 (`RST` opcodes, `NMI`) and parses all jumps via a recursive analyze via `ParseOpcode()`. Every opcode is marked in an array (`OpcodesFlags`). There are some exceptions, the parser can't recognize:
 - self modifying code. A ROM shouldn't contain such code.
-- calculated branches with ''JP (IY)'', ''JP (IX)'' or ''JP (HL)''. The parser can't recognize them, either.
-- Jumptables. These are quite common in a ROM. Only solution: disassemble the program and look into the code. If you found a jumptable - like on my Futura aquarium computer - insert some more calls of ''ParseOpcodes()''.
+- calculated branches with `JP (IY)`, `JP (IX)` or `JP (HL)`. The parser can't recognize them, either.
+- Jumptables. These are quite common in a ROM. Only solution: disassemble the program and look into the code. If you found a jumptable - like on my Futura aquarium computer - insert some more calls of `ParseOpcodes()`.
 - Unused code. Code that is never called by anybody, could not be found. Make sure that the code is not called via a jump table!
-- Disassembly of the code. With the help of the OpcodesFlags table the disassembler now creates the output. This subroutine is quite long. It disassembles one opcode at a specific address in ROM into a buffer. It is coded directly from a list of Z80 opcodes, so the handling of ''IX'' and ''IY'' could be optimized quite a lot.
+- Disassembly of the code. With the help of the OpcodesFlags table the disassembler now creates the output. This subroutine is quite long. It disassembles one opcode at a specific address in ROM into a buffer. It is coded directly from a list of Z80 opcodes, so the handling of `IX` and `IY` could be optimized quite a lot.

-The subroutine ''OpcodeLen()'' returns the size of one opcode in bytes. It is called while parsing and while disassembling.
+The subroutine `OpcodeLen()` returns the size of one opcode in bytes. It is called while parsing and while disassembling.

-The disassembler recognizes no hidden opcodes (the assembler does!). I didn't had a table for them while writing the disassembler and they were not needed anyway.
+The disassembler recognizes some hidden opcodes.

 If a routine wanted an "address" to the Z80 code, it is in fact an **offset** to the array of code. **No** pointers! Longs are not necessary for a Z80, because the standard Z80 only supports 64k.

@@ -31,31 +31,31 @@ Z80 Assembler

 I created the assembler for the Z80 a few days later to compile the changes code from the disassembler into an EPROM image and build a new firmware for my aquarium computer. I needed almost two days for the assembler, this means: commandline only... If you want to change the filename of the sourcefile, you have to change main().

-This small assembler has some nice gadgets: it is a quite fast tokenizing single-pass assembler with backpatching. It knows all official Z80 opcodes and some undocumented opcodes (mainly with ''IX'' and ''IY''). The Z80 syntax is documented in the Zilog documentation.
+This small assembler has some nice gadgets: it is a quite fast tokenizing single-pass assembler with backpatching. It knows all official Z80 opcodes and some undocumented opcodes (mainly with `IX` and `IY`). The Z80 syntax is documented in the Zilog documentation.

-The assembler allows mathematical expressions in operands: ''+'', ''-'', ''*'', ''/'', ''%'' (modulo), ''&'' (and), ''|'' (or), ''!'' (not), ''^'' (xor), ''<<'' (shift left) and ''>>'' (shift right). Brackets are also available. The expression parser is located in [[Z80 Calc.c]]. Number can be postpended by a ''D'', ''H'' or ''B'' for decimal, hexadecimal and binary numbers.
+The assembler allows mathematical expressions in operands: `+`, `-`, `*`, `/`, `%` (modulo), `&` (and), `|` (or), `!` (not), `^` (xor), `<<` (shift left) and `>>` (shift right). Brackets are also available. The expression parser is located in `z80_calc.c`. Number can be postpended by a `D`, `H` or `B` for decimal, hexadecimal and binary numbers.

-The assembler also knows the most commend pseudo opcodes (look into the sourcefile 'Z80 Tokenize.c'):
+The assembler also knows the most commend pseudo opcodes (look into the sourcefile 'z80_tokenize.cp'):

-* '';'' This line is a comment.
-* ''IF'' Start the conditional expression. If false, the following sourcecode will be skipped (until ''ELSE'' or ''ENDIF'').
-* ''ENDIF'' End of the condition expression.
-* ''ELSE'' Include the following code, when the expression on IF was false.
-* ''END'' End of the sourcecode. The assembler stops here. Optional.
-* ''ORG'' Set the PC in the 64k address space. E.g. to generate code for address $2000.
-* ''PRINT'' Print the following text on the console. Great for testing the assembler.
-* ''EQU'' or ''='' Set a variable.
-* ''DEFB'' Put a byte at the current address
-* ''DEFW'' But a word at the current address (little endian!)
-* ''DEFM'' But several bytes in the memory, starting at the current address. Seperated with a "," or a string.
-* ''DEFS'' Set the current address n bytes ahead. Defines space for global variables that have no given value.
+* `;` This line is a comment.
+* `IF` Start the conditional expression. If false, the following sourcecode will be skipped (until `ELSE` or `ENDIF`).
+* `ENDIF` End of the condition expression.
+* `ELSE` Include the following code, when the expression on IF was false.
+* `END` End of the sourcecode. The assembler stops here. Optional.
+* `ORG` Set the PC in the 64k address space. E.g. to generate code for address $2000.
+* `PRINT` Print the following text on the console. Great for testing the assembler.
+* `EQU` or `=` Set a variable.
+* `DEFB` Put a byte at the current address
+* `DEFW` But a word at the current address (little endian!)
+* `DEFM` But several bytes in the memory, starting at the current address. Seperated with a "," or a string.
+* `DEFS` Set the current address n bytes ahead. Defines space for global variables that have no given value.

 The Sourcecode
 --------------

-* [Z80 Assembler.cp](z80_assembler.cp)
-* [Z80 Assembler.h](z80_assembler.h)
-* [Z80 Calc.cp](z80_calc.cp)
-* [Z80 Compile.cp](z80_compile.cp)
-* [Z80 Disassembler.cp](z80_disassembler.cp)
-* [Z80 Tokenize.cp](z80_tokenize.cp)
+* [z80_assembler.cp](z80_assembler.cp)
+* [z80_assembler.h](z80_assembler.h)
+* [z80_calc.cp](z80_calc.cp)
+* [z80_compile.cp](z80_compile.cp)
+* [z80_disassembler.cp](z80_disassembler.cp)
+* [z80_tokenize.cp](z80_tokenize.cp)
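
Reviewer sketch (not part of the commit): the README hunk above says a number can carry a trailing `D`, `H` or `B` for decimal, hexadecimal or binary. A minimal way to read such literals could look like the following. The function name `parse_number` is hypothetical, and the rules for edge cases are assumptions rather than the behaviour of `z80_calc.cp`: a literal without suffix is taken as decimal, the last character always wins as suffix (so `1BH` is hexadecimal and `1B` is binary), and overflow is not checked.

```cpp
#include <cctype>
#include <cstdint>
#include <string>

// Hypothetical helper: parse "255D", "0FFH", "1010B" or a plain decimal "42".
// Returns false if the text is empty or contains a digit that is invalid for its base.
static bool parse_number( const std::string &text, uint32_t &value ) {
    std::string digits = text;
    uint32_t base = 10;                               // assumption: no suffix means decimal
    if ( !digits.empty() ) {
        switch ( std::toupper( static_cast<unsigned char>( digits.back() ) ) ) {
        case 'H': base = 16; digits.pop_back(); break;
        case 'B': base = 2;  digits.pop_back(); break;
        case 'D': base = 10; digits.pop_back(); break;
        default: break;                               // last character is an ordinary digit
        }
    }
    if ( digits.empty() )
        return false;                                 // nothing left besides the suffix
    uint32_t result = 0;
    for ( char ch : digits ) {
        const int c = std::toupper( static_cast<unsigned char>( ch ) );
        uint32_t digit;
        if ( c >= '0' && c <= '9' )
            digit = c - '0';
        else if ( c >= 'A' && c <= 'F' )
            digit = c - 'A' + 10;
        else
            return false;                             // not a digit in any supported base
        if ( digit >= base )
            return false;                             // e.g. '2' in a binary literal
        result = result * base + digit;
    }
    value = result;
    return true;
}
```

With these assumptions `255D`, `0FFH` and `11111111B` all yield 255, while `12B` is rejected because `2` is not a binary digit.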
Binary file removed Z80.code
Binary file not shown.
50 changes: 0 additions & 50 deletions file.c

This file was deleted.

19 changes: 0 additions & 19 deletions file.h

This file was deleted.

35 changes: 33 additions & 2 deletions z80_assembler.cp
@@ -22,6 +22,7 @@ bool listing = false;

 static FILE *infile;
 static FILE *outbin;
+static FILE *outz80;
 static FILE *outhex;

 int verboseMode = 0;
@@ -62,7 +63,7 @@ void usage( const char *fullpath ) {


 static void listOneLine( uint32_t firstPC, uint32_t lastPC, const char *oneLine );
-
+static void write_header( FILE *stream, uint32_t address );

 /***
  * …
@@ -207,14 +208,23 @@ int main( int argc, char **argv )

     // create out file name(s) from in file name
     size_t fnamelen = strlen( outputfilename );
+    // bin or com (=bin file that starts at PC=0x100) file
     strncpy( outputfilename + fnamelen - 3, com ? "com" : "bin", sizeof(outputfilename) -fnamelen - 3 );
     MSG( 1, "Creating output file %s\n", outputfilename );
     outbin = fopen( outputfilename, "wb" );
     if ( !outbin ) {
         fprintf( stderr, "Error: Can't open output file \"%s\".\n", outputfilename );
         return 1;
     }
-
+    // z80 file is a bin file with a header telling the file offset
+    strncpy( outputfilename + fnamelen - 3, "z80", sizeof(outputfilename) -fnamelen - 3 );
+    MSG( 1, "Creating output file %s\n", outputfilename );
+    outz80 = fopen( outputfilename, "wb" );
+    if ( !outz80 ) {
+        fprintf( stderr, "Error: Can't open output file \"%s\".\n", outputfilename );
+        return 1;
+    }
+    // intel hex file
     strncpy( outputfilename + fnamelen - 3, "hex", sizeof(outputfilename) -fnamelen - 3 );
     MSG( 1, "Creating output file %s\n", outputfilename );
     outhex = fopen( outputfilename, "wb" );
@@ -233,6 +243,10 @@ int main( int argc, char **argv )
         fwrite( RAM + offset, sizeof( uint8_t ), maxPC + 1 - offset, outbin );
         fclose( outbin );
     }
+    if ( outz80 ) {
+        write_header( outz80, minPC );
+        fwrite( RAM + minPC, sizeof( uint8_t ), maxPC + 1 - minPC, outz80 );
+    }
     if ( outhex ) {
         // write the data as intel hex
         struct ihex_state ihex;
@@ -313,6 +327,23 @@ static void listOneLine( uint32_t firstPC, uint32_t lastPC, const char *oneLine
 }


+// the z80 format is used by the z80-asm
+// http://wwwhomes.uni-bielefeld.de/achim/z80-asm.html
+// *.z80 files are bin files with a header telling the bin offset
+// struct z80_header {
+//     const char *MAGIC = Z80MAGIC;
+//     uint16_t offset;
+// }
+static void write_header( FILE *stream, uint32_t address ) {
+    const char *Z80MAGIC = "Z80ASM\032\n";
+    unsigned char c[ 2 ];
+    c[ 0 ] = address & 255;
+    c[ 1 ] = address >> 8;
+    fwrite( Z80MAGIC, 1, strlen( Z80MAGIC ), stream );
+    fwrite( c, 1, 2, stream );
+}
+
+
 void ihex_flush_buffer( struct ihex_state *ihex, char *buffer, char *eptr ) {
     (void)ihex;
     *eptr = '\0';
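
Reviewer sketch (not part of the commit): the header that `write_header()` emits is 10 bytes long, the 8 magic bytes `Z80ASM\032\n` followed by the load address as a little-endian 16-bit value. The helper below builds the same bytes in memory so the layout can be checked without a file. The name `make_z80_header` is hypothetical, and rejecting addresses above `0xFFFF` with an assertion is a choice of this sketch; the committed function just stores the low 16 bits.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical helper: return the .z80 header for a given load address as a byte vector.
static std::vector<uint8_t> make_z80_header( uint32_t address ) {
    static const char Z80MAGIC[] = "Z80ASM\032\n";    // same magic string as in write_header()
    assert( address <= 0xFFFF );                      // sketch-only check: the Z80 addresses 64k
    std::vector<uint8_t> header( Z80MAGIC, Z80MAGIC + std::strlen( Z80MAGIC ) );
    header.push_back( static_cast<uint8_t>( address & 255 ) );          // low byte first
    header.push_back( static_cast<uint8_t>( ( address >> 8 ) & 255 ) ); // then the high byte
    return header;
}
```

For a program assembled at `ORG 8000H`, the last two header bytes would be `0x00` and `0x80`, and the binary payload follows directly after them.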
42 changes: 41 additions & 1 deletion z80_disassembler.cp
@@ -27,7 +27,6 @@ This program is freeware. It is not allowed to be used as a base for a commercia
 #include <cstring>
 #include <cstdint>
 #include <cstdarg>
-#include "file.h"
 #include "kk_ihex_read.h"


@@ -960,6 +959,47 @@ int main( int argc, char *argv[] ) {
 }


+// the z80 format is used by the z80-asm
+// http://wwwhomes.uni-bielefeld.de/achim/z80-asm.html
+// *.z80 files are bin files with a header telling the bin offset
+// struct z80_header {
+//     const char *MAGIC = Z80MAGIC;
+//     uint16_t offset;
+// }
+// reads header of a file and tests if it's Z80 ASM file, reads address
+// return value: 0=OK, 1=this is not a z80 asm file, 2,3=seek malfunction
+int read_header( FILE *stream, uint32_t *address, uint32_t *len ) {
+    const char *Z80MAGIC = "Z80ASM\032\n";
+    char tmp[ 9 ];
+    unsigned char c[ 2 ];
+    unsigned a, b;
+    int ret = 0;
+
+    b = strlen( Z80MAGIC );
+    tmp[ b ] = 0;
+    a = 0;
+    if ( ( a = fread( tmp, 1, b, stream ) ) != b )
+        ret = 1;
+    else if ( strcmp( tmp, Z80MAGIC ) )
+        ret = 1;
+    else if ( fread( c, 1, 2, stream ) != 2 )
+        ret = 1;
+    else {
+        *address = ( c[ 1 ] << 8 ) | c[ 0 ];
+        a = b + 2;
+    }
+    if ( fseek( stream, 0, SEEK_END ) )
+        ret = 2;
+    else if ( ( b = ftell( stream ) ) < a )
+        ret = 2;
+    else
+        *len = b - a;
+    if ( fseek( stream, a, SEEK_SET ) )
+        ret = 3;
+    return ret;
+}
+
+
 static bool load_bin( char *path, uint32_t offset ) {
     // int address;
     uint32_t size;
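
Reviewer sketch (not part of the commit): the check that `read_header()` performs on a stream can also be expressed over a buffer that already holds the whole file. The struct `Z80Image` and the function `parse_z80_image` are hypothetical names. Unlike the committed function, which still reports a length and repositions the stream when the magic is missing, this sketch simply marks the image as invalid; that difference is a simplification, not a claim about how the disassembler should behave.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical result type for the sketch.
struct Z80Image {
    bool     valid;    // true if the magic matched and both offset bytes were present
    uint16_t address;  // load address taken from the header
    size_t   length;   // number of payload bytes after the 10-byte header
};

// Hypothetical helper: inspect a complete .z80 file held in memory.
static Z80Image parse_z80_image( const std::vector<uint8_t> &file ) {
    static const char Z80MAGIC[] = "Z80ASM\032\n";    // same magic string as in read_header()
    const size_t magiclen = std::strlen( Z80MAGIC );
    Z80Image img = { false, 0, 0 };
    if ( file.size() < magiclen + 2 )
        return img;                                   // too short to hold magic plus offset
    if ( std::memcmp( file.data(), Z80MAGIC, magiclen ) != 0 )
        return img;                                   // not a z80-asm file
    img.valid = true;
    img.address = static_cast<uint16_t>( file[ magiclen ] | ( file[ magiclen + 1 ] << 8 ) );
    img.length = file.size() - ( magiclen + 2 );
    return img;
}
```

A file consisting of the magic, the bytes `0x00 0x80` and a single `RET` (`0xC9`) would come back as valid with address `0x8000` and a payload length of 1.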
