Disassembling code

Disassembling is one of the most important abilities of a machine language monitor. In the VICE monitor, it is done with the disass command. Its help text is as follows:

** Monitor 000 001
(C:$e5d4) help disass
Syntax: disass [<address> [<address>]]
Abbreviation: d

Disassemble instructions.  If two addresses are specified, they are
used as a start and end address.  If only one is specified, it is
treated as the start address and a default number of instructions are
disassembled.  If no addresses are specified, a default number of
instructions are disassembled from the dot address.

(C:$e5d4)

You see that the syntax is very similar to that of the mem command: You can give no, one or two addresses as parameters. If no address is given, the disassembly will start at the place where the last command (disass, mem, ...) ended. If you give two addresses, you will see the disassembly starting at the first address and ending at the second address. Note that both addresses will be shown, similar to how mem works.

Additionally, note that the help text of the disass command writes the syntax differently than the help text of the mem command does, although the interpretation is exactly the same: [<address_opt_range>] and [<address> [<address>]] are two different notations for exactly the same thing!

If you only give one address, the disassembly will start at that address, and the monitor will try to find a reasonable number of lines to output.

Now, let's try this command:

(C:$e5d4) disass fce2 fcff
.C:fce2   A2 FF      LDX #$FF
.C:fce4   78         SEI
.C:fce5   9A         TXS
.C:fce6   D8         CLD
.C:fce7   20 02 FD   JSR $FD02
.C:fcea   D0 03      BNE $FCEF
.C:fcec   6C 00 80   JMP ($8000)
.C:fcef   8E 16 D0   STX $D016
.C:fcf2   20 A3 FD   JSR $FDA3
.C:fcf5   20 50 FD   JSR $FD50
.C:fcf8   20 15 FD   JSR $FD15
.C:fcfb   20 5B FF   JSR $FF5B
.C:fcfe   58         CLI
.C:fcff   6C 00 A0   JMP ($A000)
(C:$fd02)

The above routine is the RESET routine of the C64. That is, after you switch on the C64 (or press a RESET button, if there is one), the above code is executed. I don't want to analyze the ROM of the C64 here (this might be a good topic for another series), so I'll focus on how to interpret the output.

The first part of each line (e.g., .C:fce2) tells you that you are looking at address $fce2 in the memory of the computer (C:). The next part consists of the one, two or three bytes which form the command in memory. At $FCE2, you see the bytes $A2 and $FF. Here, $A2 is the so-called opcode of the command, and $FF is its operand. The last part is the interpretation of these bytes as an assembler command. In this case, the command is LDX #$FF.
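To make the three parts of a disassembly line more concrete, here is a minimal sketch in Python of how such a line can be produced from raw bytes. It is not how VICE implements disass; the names OPCODES and disassemble are made up for this illustration, and the opcode table is deliberately incomplete: it only contains the opcodes that occur in the RESET routine shown above.

```python
# Hypothetical, incomplete table: opcode -> (mnemonic, operand format, length in bytes).
# Only the opcodes that occur in the RESET routine above are listed.
OPCODES = {
    0xA2: ("LDX", "#${:02X}", 2),   # immediate
    0x78: ("SEI", "", 1),           # implied
    0x9A: ("TXS", "", 1),           # implied
    0xD8: ("CLD", "", 1),           # implied
    0x58: ("CLI", "", 1),           # implied
    0x20: ("JSR", "${:04X}", 3),    # absolute
    0x8E: ("STX", "${:04X}", 3),    # absolute
    0x6C: ("JMP", "(${:04X})", 3),  # indirect
    0xD0: ("BNE", "rel", 2),        # relative branch
}

def disassemble(code, start):
    """Return monitor-style lines for the bytes in code, located at address start."""
    lines = []
    pc = start
    i = 0
    while i < len(code):
        mnemonic, fmt, length = OPCODES[code[i]]
        raw = code[i:i + length]
        if fmt == "rel":
            # The operand is a signed offset, counted from the next instruction.
            offset = raw[1] - 256 if raw[1] >= 0x80 else raw[1]
            operand = "${:04X}".format((pc + 2 + offset) & 0xFFFF)
        elif length == 2:
            operand = fmt.format(raw[1])
        elif length == 3:
            # 16-bit operands are stored low byte first.
            operand = fmt.format(raw[1] | (raw[2] << 8))
        else:
            operand = ""
        hexbytes = " ".join("{:02X}".format(b) for b in raw)
        line = ".C:{:04x}   {:<11}{} {}".format(pc, hexbytes, mnemonic, operand)
        lines.append(line.rstrip())
        pc += length
        i += length
    return lines

# The first bytes of the RESET routine, copied from the listing above.
reset_code = bytes.fromhex("A2FF789AD82002FDD0036C00808E16D0")
for line in disassemble(reset_code, 0xFCE2):
    print(line)
```

Note how the sketch handles BNE: the byte $03 in memory is not an address but an offset, which the disassembler converts to the branch target $FCEF for you.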

You see: The disass command is similar to the mem command from the last part of the series. However, in addition to what mem does, disass also interprets the code. So, you don't need to remember each and every opcode ("What does $A9 stand for?"); the computer performs this cumbersome work for you.

Let's continue with the disassembly:

(C:$fd02) d
.C:fd02   A2 05      LDX #$05
.C:fd04   BD 0F FD   LDA $FD0F,X
.C:fd07   DD 03 80   CMP $8003,X
.C:fd0a   D0 03      BNE $FD0F
.C:fd0c   CA         DEX
.C:fd0d   D0 F5      BNE $FD04
.C:fd0f   60         RTS
.C:fd10   C3 C2      DCP ($C2,X)
.C:fd12   CD 38 30   CMP $3038
.C:fd15   A2 30      LDX #$30
.C:fd17   A0 FD      LDY #$FD
(C:$fd19)

You see the next routine in memory, starting at $FD02. As you can see at $FCE7 (in the previous disassembly), this is a subroutine which is used by the RESET routine. More precisely, this routine looks for a specific signature in memory at $8004-$8008. If that signature is there, the C64 assumes you have plugged in a cartridge, and executes it instead of the normal BASIC. Again, this is beyond the scope of this tutorial. If you are interested in this, search for a discussion of how the C64 behaves when a cartridge is inserted.
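If you find the loop at $FD02 hard to follow, the following Python sketch mirrors it step by step. The function name cartridge_present and the memory argument (any indexable object covering the 64 KB address space) are assumptions made for this illustration; the addresses are taken from the disassembly above.

```python
SIGNATURE_BASE = 0xFD0F  # base address used by LDA $FD0F,X
CARTRIDGE_BASE = 0x8003  # base address used by CMP $8003,X

def cartridge_present(memory):
    """Mirror the routine at $FD02: compare five bytes, X counting from 5 down to 1."""
    x = 5                                   # LDX #$05
    while True:
        a = memory[SIGNATURE_BASE + x]      # LDA $FD0F,X
        if a != memory[CARTRIDGE_BASE + x]: # CMP $8003,X
            return False                    # BNE $FD0F: mismatch, return with Z clear
        x -= 1                              # DEX
        if x == 0:
            return True                     # BNE $FD04 not taken: all bytes matched

# Build a 64 KB memory image which only contains the table from the ROM.
memory = bytearray(0x10000)
memory[0xFD10:0xFD15] = bytes.fromhex("c3c2cd3830")
print(cartridge_present(memory))            # no signature at $8004-$8008

# Now simulate a cartridge by placing the same bytes at $8004-$8008.
memory[0x8004:0x8009] = bytes.fromhex("c3c2cd3830")
print(cartridge_present(memory))
```

Because X runs from 5 down to 1 and never reaches 0 inside the loop body, the bases $FD0F and $8003 are each one byte below the first byte that is actually compared. The return value corresponds to the Z flag which the RESET routine tests with BNE $FCEF right after the JSR.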

An interesting command can be found at $FD10: DCP. What is this? Even if you are familiar with the 6502, you might not have heard of such a command yet. DCP is not an "official" command; rather, it is a so-called undocumented opcode. That is, these commands do not officially exist; nevertheless, they do something, sometimes even something useful.

Some people have used these undocumented opcodes in their programs for the Commodore 64. Some have used them to establish some copy-protection, others have used them to make their programs shorter or faster. Thus, in order to emulate the C64 correctly, VICE also has to implement all of the undocumented opcodes. Since VICE already knows them, its monitor also shows them in a disassembly. Makes sense, doesn't it?

Well, in this case, this is not the whole story. Commodore never used undocumented opcodes in the KERNAL or the BASIC ROMs. Thus, whenever you see such an undocumented opcode in the ROM disassembly, you can be sure that you are not looking at program code, but at some table. In this case, the table is referenced by the LDA $FD0F,X at $FD04. Let's have a look at that table:

(C:$fd19) m fd10 fd14
>C:fd10  c3 c2 cd 38  30                                      ...80
(C:$fd15)

You see that there are three bytes $C3, $C2, $CD, followed by two bytes $38, $30. The last two bytes are ASCII for "80". The first three bytes are the PETSCII representation of the string "CBM", even though VICE does not show them as such. Thus, this table just contains the string "CBM80". This is the signature the routine at $FD02 searches for at $8004-$8008 when trying to determine whether there is a cartridge or not.
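You can check this reading of the table with a few lines of Python. The sketch below relies on a simplifying assumption which holds for these five bytes but is not a general PETSCII conversion: clearing the highest bit of each byte yields the ASCII code of the character. The function name decode_signature is made up for this illustration.

```python
def decode_signature(raw):
    """Clear bit 7 of every byte and read the result as ASCII (simplified, not full PETSCII)."""
    return bytes(b & 0x7F for b in raw).decode("ascii")

# The five bytes shown by "m fd10 fd14" above.
table = bytes.fromhex("c3c2cd3830")
print(decode_signature(table))
```

The bytes $38 and $30 already have bit 7 cleared, which is why the mem command could show them as "80" without further help.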