Reverse engineering an arcade game with Ghidra

A guide to using Ghidra to reverse engineer arcade games

About Ghidra #

Released April 2019 as Open Source by the NSA
Java, can run anywhere
Not on par with IDA Pro, but supports many CPU architectures
Including m68k and Z80
Very extensible, using Java or Jython

Prerequisites #

Ghidra
MAME
A basic understanding of low-level languages

ROMs #

Assemble program ROMs into one single binary file:

Hi/Lo bytes #

If your ROMs are 8 bits wide and your CPU architecture is 16-bit, you need to interleave the odd ROMs with the even ones. If your ROMs are the same width as the host CPU, you can concatenate them, but you may need to byte-swap them, depending on how the ROMs are hooked up to the CPU.
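As a rough illustration of the two cases above, here is a minimal Python sketch. The function names `interleave` and `byte_swap16` are made up for this example, and it assumes the "even" ROM supplies the bytes at even addresses. Which physical chip that is depends on the board, so check the driver before trusting the order.

```python
def interleave(even: bytes, odd: bytes) -> bytes:
    """Merge two 8-bit ROM dumps into one 16-bit-wide image.

    Assumption: `even` holds the bytes for even addresses and `odd`
    the bytes for odd addresses. Swap the arguments if your board
    is wired the other way round.
    """
    if len(even) != len(odd):
        raise ValueError("ROM halves must be the same size")
    out = bytearray(len(even) * 2)
    out[0::2] = even  # bytes at addresses 0, 2, 4, ...
    out[1::2] = odd   # bytes at addresses 1, 3, 5, ...
    return bytes(out)


def byte_swap16(data: bytes) -> bytes:
    """Swap the two bytes of every 16-bit word."""
    if len(data) % 2:
        raise ValueError("length must be a multiple of 2")
    out = bytearray(len(data))
    out[0::2] = data[1::2]
    out[1::2] = data[0::2]
    return bytes(out)


if __name__ == "__main__":
    # Tiny stand-ins for real ROM dumps.
    image = interleave(b"\x01\x03", b"\x02\x04")
    print(image.hex())               # 01020304
    print(byte_swap16(image).hex())  # 02010403
```

For real dumps you would read each file with `open(path, "rb").read()`, interleave each pair, and append the results in address order.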

Show cps1.c driver code example.

Link to program to assemble these: [https://gist.github.com/sf2platinum/19adb572afe948c3e51f24727dc44a38]

CPS2 Encryption #

Makes things very tricky:

only the program opcodes are encrypted, not the data
no straightforward way to tell one from the other
try to find a 'clean' dump of your game

Process #

Build a memory map #

Look at the cps1.c driver entry for your game to figure out how the ROMs are mapped, and where the I/O addresses of the custom chips are

Code ROMs #

main CPU (we'll concentrate on this one first)
sound CPU

Set up a new Ghidra project #

Import the main code ROM #

Import the assembled ROM

Set the type to MC68000

Open 'Options', and give it a block name, otherwise it will default to 'ram'

Explain that code doesn't always get decoded correctly straight away. We're going to have to help the disassembler understand it by giving it context.

Explain the 68k vector table, and how the CPU uses it to find the address to jump to when it starts up
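To make the vector table concrete: on the 68000 the first two big-endian 32-bit entries at address 0 are the initial stack pointer and the initial program counter. The sketch below reads them from a ROM image outside of Ghidra. The helper name `read_reset_vectors` and the sample bytes are invented for illustration; in particular the stack pointer value is a placeholder.

```python
import struct


def read_reset_vectors(rom: bytes):
    """Return (initial_ssp, initial_pc) from the start of a 68000 ROM image.

    Vector 0 (offset 0x0) is the initial supervisor stack pointer and
    vector 1 (offset 0x4) is the initial program counter. Both are
    32-bit big-endian values.
    """
    initial_ssp, initial_pc = struct.unpack_from(">II", rom, 0)
    return initial_ssp, initial_pc


if __name__ == "__main__":
    # Hypothetical first 8 bytes of a ROM: a placeholder stack pointer,
    # followed by a reset PC of 0x40e.
    sample = bytes.fromhex("00ff0000" "0000040e")
    ssp, pc = read_reset_vectors(sample)
    print(hex(ssp), hex(pc))
```

If the ROMs were interleaved in the wrong order or need byte-swapping, these two values will usually look implausible, which makes this a quick sanity check on the assembled binary.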

Demo setting a location to a pointer with 'p' / Data -> Pointer, convert several addresses into pointers to reveal the Vector Table

Some games set their stack pointer up here; on this game it's done manually later. The address 0x40e is loaded into the Program Counter. The CPU jumps there and begins executing instructions.

Demo manually disassembling a bunch of instructions with 'd' / Disassemble

Explain labels and x-refs, and how some of these will be red herrings due to the disassembler not having enough context.

Assign a label for the first instruction: 'initial_pc'

Explain the red references:

the disassembler doesn't know what these mean yet
they're red because they represent addresses that aren't mapped yet
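The idea behind the red references can be sketched as a lookup against a list of memory blocks: an address that falls inside no block is unmapped. The block names and address ranges below are placeholders for illustration only, not the real layout of any board. Take the real ranges from the MAME driver for your game.

```python
from typing import List, NamedTuple, Optional


class Block(NamedTuple):
    name: str
    start: int
    end: int  # inclusive


# Placeholder ranges for illustration; not taken from a real driver.
MEMORY_MAP: List[Block] = [
    Block("rom", 0x000000, 0x0FFFFF),
    Block("io", 0x800000, 0x8001FF),
    Block("ram", 0xFF0000, 0xFFFFFF),
]


def find_block(addr: int, blocks: List[Block] = MEMORY_MAP) -> Optional[Block]:
    """Return the block containing addr, or None if it is unmapped."""
    for block in blocks:
        if block.start <= addr <= block.end:
            return block
    return None


if __name__ == "__main__":
    for addr in (0x00040E, 0x800018, 0x900000):
        block = find_block(addr)
        print(hex(addr), block.name if block else "unmapped")
```

Once a block covering an address has been added to the memory map, references to that address resolve instead of showing up red.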

Demo the memory map dialog

Demo clearing erroneously decoded code with 'c' / Clear Code Bytes
