Merge pull request #22 from Ho-Ro/master

z80 output format (bin file with header telling file offset) for the asm
sarnau · Sep 15, 2024 · 64688d4
2 parents 04df214 + ff0d7d3
commit 64688d4

Showing 7 changed files with 107 additions and 102 deletions.
diff --git a/Makefile b/Makefile
@@ -1,6 +1,9 @@
 CC=gcc
 CFLAGS=-I. -Wall
-DEPS = z80_assembler.h
+DEPS = z80_assembler.h kk_ihex_read.h kk_ihex_write.h Makefile
+
+%.o: %.cpp $(DEPS)
+	$(CC) -c -o $@ $< $(CFLAGS)
 
 %.o: %.cp $(DEPS)
 	$(CC) -c -o $@ $< $(CFLAGS)
@@ -13,7 +16,7 @@ all: z80assembler z80disassembler
 z80assembler: z80_assembler.o z80_tokenize.o z80_compile.o z80_calc.o kk_ihex_write.o
 	$(CC) -o $@ $^ $(CFLAGS)
 
-z80disassembler: z80_disassembler.o file.o kk_ihex_read.o
+z80disassembler: z80_disassembler.o kk_ihex_read.o
 	$(CC) -o $@ $^ $(CFLAGS)
 
 clean:

diff --git a/README.md b/README.md
@@ -7,20 +7,20 @@ Z80 Disassembler
 
 I created this small disassembler for a Z80 cpu at one afternoon. It is a commandline tool. The size of the ROM and entry points have to be coded directly in the sourcecode.
 
-Every ANSI C++ compiler should compile this program. It only uses some ANSI C functions (look into ''main()'') for loading a file called "EPROM".
+Every ANSI C++ compiler should compile this program. It only uses some ANSI C functions (look into `main()`).
 
 The program has two parts:
 
-  - Analyze the code. The disassembler tries to analyze what part of the binary data is program code and what part is data. It start with all hardware vectors of the Z80 (''RST'' opcodes, NMI) and parses all jumps via a recursive analyze via ''ParseOpcode()''. Every opcode is marked in an array (''OpcodesFlags''). There are some exceptions, the parser can't recognize:
+  - Analyze the code. The disassembler tries to analyze what part of the binary data is program code and what part is data. It start with all hardware vectors of the Z80 (`RST` opcodes, `NMI`) and parses all jumps via a recursive analyze via `ParseOpcode()`. Every opcode is marked in an array (`OpcodesFlags`). There are some exceptions, the parser can't recognize:
     - self modifying code. A ROM shouldn't contain such code.
-    - calculated branches with ''JP (IY)'', ''JP (IX)'' or ''JP (HL)''. The parser can't recognize them, either.
-    - Jumptables. These are quite common in a ROM. Only solution: disassemble the program and look into the code. If you found a jumptable - like on my Futura aquarium computer - insert some more calls of ''ParseOpcodes()''.
+    - calculated branches with `JP (IY)`, `JP (IX)` or `JP (HL)`. The parser can't recognize them, either.
+    - Jumptables. These are quite common in a ROM. Only solution: disassemble the program and look into the code. If you found a jumptable - like on my Futura aquarium computer - insert some more calls of `ParseOpcodes()`.
     - Unused code. Code that is never called by anybody, could not be found. Make sure that the code is not called via a jump table!
-  - Disassembly of the code. With the help of the OpcodesFlags table the disassembler now creates the output. This subroutine is quite long. It disassembles one opcode at a specific address in ROM into a buffer. It is coded directly from a list of Z80 opcodes, so the handling of ''IX'' and ''IY'' could be optimized quite a lot.
+  - Disassembly of the code. With the help of the OpcodesFlags table the disassembler now creates the output. This subroutine is quite long. It disassembles one opcode at a specific address in ROM into a buffer. It is coded directly from a list of Z80 opcodes, so the handling of `IX` and `IY` could be optimized quite a lot.
 
-The subroutine ''OpcodeLen()'' returns the size of one opcode in bytes. It is called while parsing and while disassembling.
+The subroutine `OpcodeLen()` returns the size of one opcode in bytes. It is called while parsing and while disassembling.
 
-The disassembler recognizes no hidden opcodes (the assembler does!). I didn't had a table for them while writing the disassembler and they were not needed anyway.
+The disassembler recognizes some hidden opcodes.
 
 If a routine wanted an "address" to the Z80 code, it is in fact an **offset** to the array of code. **No** pointers! Longs are not necessary for a Z80, because the standard Z80 only supports 64k.
 
@@ -31,31 +31,31 @@ Z80 Assembler
 
 I created the assembler for the Z80 a few days later to compile the changes code from the disassembler into an EPROM image and build a new firmware for my aquarium computer. I needed almost two days for the assembler, this means: commandline only... If you want to change the filename of the sourcefile, you have to change main().
 
-This small assembler has some nice gadgets: it is a quite fast tokenizing single-pass assembler with backpatching. It knows all official Z80 opcodes and some undocumented opcodes (mainly with ''IX'' and ''IY''). The Z80 syntax is documented in the Zilog documentation.
+This small assembler has some nice gadgets: it is a quite fast tokenizing single-pass assembler with backpatching. It knows all official Z80 opcodes and some undocumented opcodes (mainly with `IX` and `IY`). The Z80 syntax is documented in the Zilog documentation.
 
-The assembler allows mathematical expressions in operands: ''+'', ''-'', ''*'', ''/'', ''%'' (modulo), ''&'' (and), ''|'' (or), ''!'' (not), ''^'' (xor), ''<<'' (shift left) and ''>>'' (shift right). Brackets are also available. The expression parser is located in [[Z80 Calc.c]]. Number can be postpended by a ''D'', ''H'' or ''B'' for decimal, hexadecimal and binary numbers.
+The assembler allows mathematical expressions in operands: `+`, `-`, `*`, `/`, `%` (modulo), `&` (and), `|` (or), `!` (not), `^` (xor), `<<` (shift left) and `>>` (shift right). Brackets are also available. The expression parser is located in `z80_calc.c`. Number can be postpended by a `D`, `H` or `B` for decimal, hexadecimal and binary numbers.
 
-The assembler also knows the most commend pseudo opcodes (look into the sourcefile 'Z80 Tokenize.c'):
+The assembler also knows the most commend pseudo opcodes (look into the sourcefile 'z80_tokenize.cp'):
 
-  * '';'' This line is a comment.
-  * ''IF'' Start the conditional expression. If false, the following sourcecode will be skipped (until ''ELSE'' or ''ENDIF'').
-  * ''ENDIF'' End of the condition expression.
-  * ''ELSE'' Include the following code, when the expression on IF was false.
-  * ''END'' End of the sourcecode. The assembler stops here. Optional.
-  * ''ORG'' Set the PC in the 64k address space. E.g. to generate code for address $2000.
-  * ''PRINT'' Print the following text on the console. Great for testing the assembler.
-  * ''EQU'' or ''='' Set a variable.
-  * ''DEFB'' Put a byte at the current address
-  * ''DEFW'' But a word at the current address (little endian!)
-  * ''DEFM'' But several bytes in the memory, starting at the current address. Seperated with a "," or a string.
-  * ''DEFS'' Set the current address n bytes ahead. Defines space for global variables that have no given value.
+  * `;` This line is a comment.
+  * `IF` Start the conditional expression. If false, the following sourcecode will be skipped (until `ELSE` or `ENDIF`).
+  * `ENDIF` End of the condition expression.
+  * `ELSE` Include the following code, when the expression on IF was false.
+  * `END` End of the sourcecode. The assembler stops here. Optional.
+  * `ORG` Set the PC in the 64k address space. E.g. to generate code for address $2000.
+  * `PRINT` Print the following text on the console. Great for testing the assembler.
+  * `EQU` or `=` Set a variable.
+  * `DEFB` Put a byte at the current address
+  * `DEFW` But a word at the current address (little endian!)
+  * `DEFM` But several bytes in the memory, starting at the current address. Seperated with a "," or a string.
+  * `DEFS` Set the current address n bytes ahead. Defines space for global variables that have no given value.
 
 The Sourcecode
 --------------
 
-  * [Z80 Assembler.cp](z80_assembler.cp)
-  * [Z80 Assembler.h](z80_assembler.h)
-  * [Z80 Calc.cp](z80_calc.cp)
-  * [Z80 Compile.cp](z80_compile.cp)
-  * [Z80 Disassembler.cp](z80_disassembler.cp)
-  * [Z80 Tokenize.cp](z80_tokenize.cp)
+  * [z80_assembler.cp](z80_assembler.cp)
+  * [z80_assembler.h](z80_assembler.h)
+  * [z80_calc.cp](z80_calc.cp)
+  * [z80_compile.cp](z80_compile.cp)
+  * [z80_disassembler.cp](z80_disassembler.cp)
+  * [z80_tokenize.cp](z80_tokenize.cp)
diff --git a/Z80.code b/Z80.code
diff --git a/file.c b/file.c
diff --git a/file.h b/file.h
diff --git a/z80_assembler.cp b/z80_assembler.cp
@@ -22,6 +22,7 @@ bool        listing = false;
 
 static FILE *infile;
 static FILE *outbin;
+static FILE *outz80;
 static FILE *outhex;
 
 int verboseMode = 0;
@@ -62,7 +63,7 @@ void usage( const char *fullpath ) {
 
 
 static void listOneLine( uint32_t firstPC, uint32_t lastPC, const char *oneLine );
-
+static void write_header( FILE *stream, uint32_t address );
 
 /***
  *  …
@@ -207,14 +208,23 @@ int        main( int argc, char **argv )
 
         // create out file name(s) from in file name
         size_t fnamelen = strlen( outputfilename );
+        // bin or com (=bin file that starts at PC=0x100) file
         strncpy( outputfilename + fnamelen - 3, com ? "com" : "bin", sizeof(outputfilename) -fnamelen - 3 );
         MSG( 1, "Creating output file %s\n", outputfilename );
         outbin = fopen( outputfilename, "wb" );
         if ( !outbin ) {
             fprintf( stderr, "Error: Can't open output file \"%s\".\n", outputfilename );
             return 1;
         }
-
+        // z80 file is a bin file with a header telling the file offset
+        strncpy( outputfilename + fnamelen - 3, "z80", sizeof(outputfilename) -fnamelen - 3 );
+        MSG( 1, "Creating output file %s\n", outputfilename );
+        outz80 = fopen( outputfilename, "wb" );
+        if ( !outz80 ) {
+            fprintf( stderr, "Error: Can't open output file \"%s\".\n", outputfilename );
+            return 1;
+        }
+        // intel hex file
         strncpy( outputfilename + fnamelen - 3, "hex", sizeof(outputfilename) -fnamelen - 3 );
         MSG( 1, "Creating output file %s\n", outputfilename );
         outhex = fopen( outputfilename, "wb" );
@@ -233,6 +243,10 @@ int        main( int argc, char **argv )
             fwrite( RAM + offset, sizeof( uint8_t ), maxPC + 1 - offset, outbin );
         fclose( outbin );
     }
+    if ( outz80 ) {
+        write_header( outz80, minPC );
+        fwrite( RAM + minPC, sizeof( uint8_t ), maxPC + 1 - minPC, outz80 );
+    }
     if ( outhex ) {
         // write the data as intel hex
         struct ihex_state ihex;
@@ -313,6 +327,23 @@ static void listOneLine( uint32_t firstPC, uint32_t lastPC, const char *oneLine
 }
 
 
+// the z80 format is used by the z80-asm
+// http://wwwhomes.uni-bielefeld.de/achim/z80-asm.html
+// *.z80 files are bin files with a header telling the bin offset
+// struct z80_header {
+//     const char  *MAGIC = Z80MAGIC;
+//     uint16_t    offset;
+// }
+static void write_header( FILE *stream, uint32_t address ) {
+    const char *Z80MAGIC = "Z80ASM\032\n";
+    unsigned char c[ 2 ];
+    c[ 0 ] = address & 255;
+    c[ 1 ] = address >> 8;
+    fwrite( Z80MAGIC, 1, strlen( Z80MAGIC ), stream );
+    fwrite( c, 1, 2, stream );
+}
+
+
 void ihex_flush_buffer( struct ihex_state *ihex, char *buffer, char *eptr ) {
     (void)ihex;
     *eptr = '\0';
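
Note on the `.z80` layout added above: going by `write_header()`, the file is the 8-byte magic `"Z80ASM\032\n"`, then the load address as two bytes in little-endian order, then the raw bytes from `minPC` to `maxPC`. The sketch below shows that layout as an in-memory round trip. It is only an illustration under those assumptions, not code from this commit: `pack_z80` and `unpack_z80` are hypothetical helper names, and the real code works on `FILE *` streams instead of vectors.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Magic string as it appears in the diff: "Z80ASM", 0x1A (octal \032), '\n'.
static const char Z80MAGIC[] = "Z80ASM\032\n";
static const std::size_t Z80MAGIC_LEN = sizeof( Z80MAGIC ) - 1;  // 8 bytes, no NUL

// Hypothetical helper: build magic + little-endian address + payload.
std::vector<uint8_t> pack_z80( uint16_t address, const std::vector<uint8_t> &code ) {
    std::vector<uint8_t> out( Z80MAGIC, Z80MAGIC + Z80MAGIC_LEN );
    out.push_back( address & 0xFF );  // low byte first
    out.push_back( address >> 8 );    // then high byte
    out.insert( out.end(), code.begin(), code.end() );
    return out;
}

// Hypothetical helper: returns false if the image is too short or the
// magic does not match; otherwise fills in the address and the payload.
bool unpack_z80( const std::vector<uint8_t> &image, uint16_t *address,
                 std::vector<uint8_t> *code ) {
    if ( image.size() < Z80MAGIC_LEN + 2 )
        return false;
    if ( std::memcmp( image.data(), Z80MAGIC, Z80MAGIC_LEN ) != 0 )
        return false;
    *address = static_cast<uint16_t>( image[ Z80MAGIC_LEN ] |
                                      ( image[ Z80MAGIC_LEN + 1 ] << 8 ) );
    code->assign( image.begin() + Z80MAGIC_LEN + 2, image.end() );
    return true;
}
```

For example, packing the three bytes `3E 01 C9` at address `0x8000` gives a 13-byte image whose bytes 8 and 9 are `00 80`. Comparing the magic with `memcmp` over a fixed length avoids relying on a terminating NUL in the read buffer, which the `strcmp` approach in `read_header()` has to set up by hand.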

diff --git a/z80_disassembler.cp b/z80_disassembler.cp
@@ -27,7 +27,6 @@ This program is freeware. It is not allowed to be used as a base for a commercia
 #include <cstring>
 #include <cstdint>
 #include <cstdarg>
-#include "file.h"
 #include "kk_ihex_read.h"
 
 
@@ -960,6 +959,47 @@ int main( int argc, char *argv[] ) {
 }
 
 
+// the z80 format is used by the z80-asm
+// http://wwwhomes.uni-bielefeld.de/achim/z80-asm.html
+// *.z80 files are bin files with a header telling the bin offset
+// struct z80_header {
+//     const char  *MAGIC = Z80MAGIC;
+//     uint16_t    offset;
+// }
+// reads header of a file and tests if it's Z80 ASM file, reads address
+// return value: 0=OK, 1=this is not a z80 asm file, 2,3=seek malfunction
+int read_header( FILE *stream, uint32_t *address, uint32_t *len ) {
+    const char *Z80MAGIC = "Z80ASM\032\n";
+    char tmp[ 9 ];
+    unsigned char c[ 2 ];
+    unsigned a, b;
+    int ret = 0;
+
+    b = strlen( Z80MAGIC );
+    tmp[ b ] = 0;
+    a = 0;
+    if ( ( a = fread( tmp, 1, b, stream ) ) != b )
+        ret = 1;
+    else if ( strcmp( tmp, Z80MAGIC ) )
+        ret = 1;
+    else if ( fread( c, 1, 2, stream ) != 2 )
+        ret = 1;
+    else {
+        *address = ( c[ 1 ] << 8 ) | c[ 0 ];
+        a = b + 2;
+    }
+    if ( fseek( stream, 0, SEEK_END ) )
+        ret = 2;
+    else if ( ( b = ftell( stream ) ) < a )
+        ret = 2;
+    else
+        *len = b - a;
+    if ( fseek( stream, a, SEEK_SET ) )
+        ret = 3;
+    return ret;
+}
+
+
 static bool load_bin( char *path, uint32_t offset ) {
     // int address;
     uint32_t size;