Reverse Engineering the International Superstar Soccer Password System Part 1
2025-01-31Does looking at this image make you feel frustrated?
This is a screenshot of the game "International SuperStar Soccer Deluxe" for the Super Nintendo.
Released in 1995, it was the successor of the popular "International SuperStar Soccer" for the same console. It was a very popular game at the time, and even today there are still people releasing modded versions of this game.
But, besides the big number of teams to choose and all the game modes, it was also known for its huge password system. The password entry screen allowed to password up to 60 characters, some special characters, numbers but no vowels (I guess they were afraid of accidental bad words).
It was very easy to make a mistake when typing it (very annoying), and especially writing down the wrong password (more than just very annoying).
I, particularly, did not have a snes console but I've seen the password screen in other's people houses and I always wondered what was the logic behind it. I knew somehow the game state was encoded in the password, but I could not never find any information about it.
Recently I saw a meme about the game password system and decided to look into it. I thought obviously someones must have figured it out by now (after so many years).
To my surprise I could not find any information about it except for someone on reddit who started the same quest and disappeared.
I am on vacation and I have some time to spare, so I decided to look into it myself.
Let's start from the beginning. (or TLRD: here is the result so far)
Finding the right tools
I have no prior knowledge about snes rom hacking. What I did know was that you can use the emulator to scan the memory and look for values based on how they changed. My theory was that the password was just the game state with a checksum so just looking at the memory would give me a good idea of how it worked, without having to look much into the actual instructions.
It turns out that there was no way I would figure out how it worked only from looking at the values on the memory.
With the lucky of the very stupid, I wasn't able to find the value on the memory with a popular emulator, which prompted me to find some emulator that could also debug the instructions being executed. One great emulator that does this is bsnes-plus, a fork of bsnes with debbugging tools.
Bsnes plus has no binaries for Linux, but luckly compiling it was a breeze. I did have issues running it, but turns out it did not run out of the box on ubuntu using wayland. Using xcb solved the issue.
export QT_QPA_PLATFORM=xcb
So, back to understanding the password system.
The way you create cheats in an emulator is by scanning the whole memory and looking for values that increased, decreased or stayed the same.
For example, let's say you have a game where you have some health represented by hearts. You are not sure what value represent this health in memory.
What you can do is scan the whole memory, then get hit and scan again for values that are lesser than the previous one. You can then scan for values that stay the same as the previous one. Lose health and search again, or even better, gain health and scan for values that increased.
This way you can narrow down the memory address that represents the health. You can then you can change this value and see how it affects the game.
This is what the cheat finder looks on bsnes-plus (other emulators have it too):
Looking for the right place in memory
With the same strategy we use too look for cheats, we can look for the place in memory where the password is stored.
What we need to do then is open the password screen, scan the whole memory, then change the first character of the password and scan again for values that changed.
Then we scan again a few times for values that did not change. This will eliminate animations, like the rolling balls character placeholders or the rotating cursor.
We can also changed the first character to something else to look for values that changed.
A few scans later I ended up with this:
This is very promising. Lets look at them.
First, let`s type a sequential password.
This is 7fe18a:
The higlighted part is where the memory changes when we change the password. There are many interesting things here.
Each entry is 16 bits long. All entries are paired with a 0x20 value. There is a strange gap between the entries.
And finally, if we try to find out which character corresponds to each value we end up with these interesting discontinuities.
B - 71 20 C - 72 20 D - 73 20 ????????? Space for E F - 75 20 G - 76 20 H - 77 20 ????????? Space for I J - 79 20 ...
From this, we can conclude those values are probably pointing to a tile sheet with the char table.
Let's look at the next candidate 7ee2d0.
This is what we see:
Perfect. We can assume that if there is any type of calculation going on, it would be easier to use consecutive numbers starting from 0. We can hopefullly assume that this is the right place.
Now that we have the memory address, we can use the bsnes plus debugger to see how this is being used. Bsnes allows us to set a breakpoint on any memory address and choose to stop on read, write or execute (as far as I know, only bsnes-plus have this feature).
We then add this breakpoint and try a new password. Immediately we stop at the breakpoint at 0x86cb3b. Success!
Now, this is only one instruction inside a whole subroutine. We need to look at the whole thing to understand how it works. Here is what we have:
Well, now would be a good time to look at the snes cpu and the 65C816 assembly. This is a very, very shallow look at what we are dealing with here. For our purposes, we only need to know stuff as we go.
The unavoidable concepts here are:
- Register A, the accumulator register. Used for any calculation.
- X and Y registers, the index register. Used to dynamically point to different memory adresses.
- Data Bank: Imagine this as an offset to be appended to a memory address. You can set it inside the DB register.
- The flags. Stored in the Processor Status Register (P), each one has one function. We will care only About M, N and C.
- Flag M Memory/Accumulator Width Flag - Switches between 8 bit and 16 bit mode.
- Flag N Indicates whether the result of the last operation was negative
- Flag C Indicates a carry or borrow from the last operation
Bsnes plus also have tracing capabilities, so we can follow the logic while also looking at the register values and flags.
This is a trace:
86cb27 jsr $cb13 [86cb13] A:0000 X:0004 Y:0080 S:0192 D:0000 DB:81 .....IZ. V:246 H:191 F:22 86cb27 is the address of the instruction jsr is the instruction, in this case jump to subroutine $cb13 is the address of the subroutine [86cb13] is the actual address cb13 resolves to. Then we have the registers and flags. Register A is the accumulator, X and Y are the index registers, S is the stack pointer, D is the direct page register, DB is the data bank register. The flags are the processor status flags. If they are on we see the flag character, if they are off we see a dot. Then we have the V sync and H sync counters and finally the frame counter.
Understanding what is being done with the password
From the code we can observe that:
- A subroutine is called at 86cb27 jsr $cb13, let's call it gameFunctionA. jsr is Jump to subroutine and basically means call a function inside the Data Bank.
- A subroutine is called at 86cb33 jsl $808b4c, let's call it globalFunctionA. jsl is Long Jump to subroutine. Unlike jsr, it can call subroutines anywhere.
- There is a bcc instruction pointing to an address inside this same subroutine. 86cb61 bcc $cb26. This is a loop.
We can go through each instruction and see what is happening.
This is gameFunctionA:
86cb13 phb A:0000 X:0004 Y:0080 S:0190 D:0000 DB:81 .....IZ. PHB - Push Data Bank Register - this is used to save the current data bank register value on the stack. 86cb14 lda #$0000 A:0000 X:0004 Y:0080 S:018f D:0000 DB:81 .....IZ. LDA - Load Accumulator - this is used to load a value into the accumulator. #$0000 is the immediate value 0. In other words, this is setting A to 0. 86cb17 sta $1700 [811700] A:0000 X:0004 Y:0080 S:018f D:0000 DB:81 .....IZ. STA - Store Accumulator - this is used to store the value in the accumulator into a memory address. In this case, it is storing 0 into 811700. 86cb1a lda #$0039 A:0000 X:0004 Y:0080 S:018f D:0000 DB:81 .....IZ. Puts 39 into A 86cb1d ldx #$1701 A:0039 X:0004 Y:0080 S:018f D:0000 DB:81 .....I.. Puts 1701 into X 86cb20 txy A:0039 X:1701 Y:0080 S:018f D:0000 DB:81 .....I.. TXY - Transfer X to Y 86cb21 iny A:0039 X:1701 Y:1701 S:018f D:0000 DB:81 .....I.. INY - Increment Y, see on the next trace line it went from 1701 to 1702. 86cb22 mvn $00, $00 A:0039 X:1701 Y:1702 S:018f D:0000 DB:81 .....I.. MVN - Block Move Negative - this is used to copy a block of memory from one location to another. The source is the source bank byte concatenated with the contents of the X register. The destination is the destination bank byte concatenated with the contents of the Y register. The size is the value in the A register. A is 39, so it is copying 39 bytes from 1701 t0 1702. ... 86cb22 mvn $00, $00 A:0001 X:1739 Y:173a S:018f D:0000 DB:00 .....I.. 86cb22 mvn $00, $00 A:0000 X:173a Y:173b S:018f D:0000 DB:00 .....I.. 86cb25 plb A:ffff X:173b Y:173c S:018f D:0000 DB:00 .....I.. PLB - Pull Data Bank Register - this is used to restore the data bank register value from the stack. 86cb26 rts A:ffff X:173b Y:173c S:0190 D:0000 DB:81 N....I.. RTS - Return from Subroutine - this is used to return from a subroutine. So, this is the end of gameFunctionA. It is copying 0 to a range from 1700 till 1700+39. It is a clean up function :) Let's call it clearMemory. Also, $1700 is obviously important here. Back to the main loop 86cb2a stz $00 [000000] A:ffff X:173b Y:173c S:0192 D:0000 DB:81 N....I.. 86cb2c stz $02 [000002] A:ffff X:173b Y:173c S:0192 D:0000 DB:81 N....I.. STZ - Store Zero - this is used to store the value 0 into a memory address. Or, in other words, initializing some stack variables with 0. Here we have a loop starting 86cb2e lda $02 [000002] A:ffff X:173b Y:173c S:0192 D:0000 DB:81 N....I.. Copies the value at the stack $02 into A 86cb30 ldy #$0008 A:0000 X:173b Y:173c S:0192 D:0000 DB:81 .....IZ. Puts 8 into Y (8 is constant) 86cb33 jsl $808b4c [808b4c] A:0000 X:173b Y:0008 S:0192 D:0000 DB:81 .....I.. Jumps to the subroutine at 808b4c The values on the registers A and Y are important to this subroutine (the why is coming) 808b4c sta $4204 [814204] A:0000 X:173b Y:0008 S:018f D:0000 DB:81 .....I.. Copies the value in A into $4204 This one requires a little bit of research. One special feature exclusive to the 65c816 snes cpu is native division. The 65C816 processor in the SNES does not have a dedicated division instruction. Instead, division is performed by a hardware division unit. - The CPU writes the dividend, a 16-bit value, to memory addresses $4204 (lower byte) and $4205 (higher byte) - The CPU writes the divisor, an 8-bit value, to memory address $4206. Writing to this address also initiates the division The quotient, which is the result of the division, is then read from memory addresses $4214 (lower byte) and $4215 (higher byte) The remainder can be read from memory addresses $4216 (lower byte) and $4217 (higher byte) So, A:0000 is the dividend. 808b4f sep #$20 A:0000 X:173b Y:0008 S:018f D:0000 DB:81 .....I.. This one will appear a lot. This is setting the bit on the P register corresponding to the M flag. When the M flag is set, the accumulator and memory operations are 8-bit 808b51 tya A:0000 X:173b Y:0008 S:018f D:0000 DB:81 ..M..I.. Copies Y to A 808b52 sta $4206 [814206] A:0008 X:173b Y:0008 S:018f D:0000 DB:81 ..M..I.. Now A is the divisor (with the value that was previously on Y) 808b55 lda #$03 A:0008 X:173b Y:0008 S:018f D:0000 DB:81 ..M..I.. Put 3 on a 808b57 dec A:0003 X:173b Y:0008 S:018f D:0000 DB:81 ..M..I.. Decrements A 808b58 bne $8b57 [808b57] A:0002 X:173b Y:0008 S:018f D:0000 DB:81 ..M..I.. Recpeats decrement while A is not zero 808b57 dec A:0002 X:173b Y:0008 S:018f D:0000 DB:81 ..M..I.. 808b58 bne $8b57 [808b57] A:0001 X:173b Y:0008 S:018f D:0000 DB:81 ..M..I.. 808b57 dec A:0001 X:173b Y:0008 S:018f D:0000 DB:81 ..M..I.. 808b58 bne $8b57 [808b57] A:0000 X:173b Y:0008 S:018f D:0000 DB:81 ..M..IZ. So a loop that spends three cpu cycles. I think this is due the division taking three cycles. 808b5a rep #$20 A:0000 X:173b Y:0008 S:018f D:0000 DB:81 ..M..IZ. Unset M flag (16 bit mode) 808b5c lda $4214 [814214] A:0000 X:173b Y:0008 S:018f D:0000 DB:81 .....IZ. Loads the result of the division into A 808b5f ldy $4216 [814216] A:0000 X:173b Y:0008 S:018f D:0000 DB:81 .....IZ. Loads the remainder into Y 808b62 rtl A:0000 X:173b Y:0000 S:018f D:0000 DB:81 .....IZ. Return from subroutine, ok, then globalFunctionA is actually just a division. Back to the main loop 86cb37 sta $04 [000004] A:0000 X:173b Y:0000 S:0192 D:0000 DB:81 .....IZ. Stores the result of the division into 4 on the stack 86cb39 ldx $00 [000000] A:0000 X:173b Y:0000 S:0192 D:0000 DB:81 .....IZ. Loads the value at 0 on the stack into X 86cb3b lda $7ee2d0,x [7ee2d0] A:0000 X:0000 Y:0000 S:0192 D:0000 DB:81 .....IZ. Loads the value at 7ee2d0+x into A, remember 7ee2d0? It is the memory address we found for the password. This is basically equivalent to A=password[x] 86cb3f and #$00ff A:3a0b X:0000 Y:0000 S:0192 D:0000 DB:81 .....I.. We are only interested in the lower byte, so we mask the upper byte 86cb42 cmp #$0040 A:000b X:0000 Y:0000 S:0192 D:0000 DB:81 .....I.. This one seems arbitrary. It is comparing A with 40. But when you look at the password in memory it makes sense. The password always ends with 0xff, so this is checking if we reached the end of the password. 86cb45 bcs $cb63 [86cb63] A:000b X:0000 Y:0000 S:0192 D:0000 DB:81 N....I.. Branch if carry flag is set set. The instruction CMP sets the carry flag is set if the value in the accumulator is greater than or equal to the operand and is cleared if the value in the accumulator is less than the operand 86cb47 ror A:000b X:0000 Y:0000 S:0192 D:0000 DB:81 N....I.. bitwise rotate right Accumulator register 86cb48 rol A:0005 X:0000 Y:0000 S:0192 D:0000 DB:81 .....I.C bitwise rotate left Accumulator register 86cb49 dey A:000b X:0000 Y:0000 S:0192 D:0000 DB:81 .....I.. decrement Y, Y started as the division remainder 86cb4a bpl $cb48 [86cb48] A:000b X:0000 Y:ffff S:0192 D:0000 DB:81 N....I.. Branch if plus. If the negative flag is not set, it branches. This is a small loop here. 86cb4c ldy $04 [000004] A:000b X:0000 Y:ffff S:0192 D:0000 DB:81 N....I.. Load the value at 4 on the stack into Y, this is the result of the division 86cb4e ora $1700,y [811700] A:000b X:0000 Y:0000 S:0192 D:0000 DB:81 .....IZ. Bitwise OR memory at 1700+y with accumulator into accumulator. Rememberm 1700 is the memory we cleared at the start. 86cb51 sta $1700,y [811700] A:000b X:0000 Y:0000 S:0192 D:0000 DB:81 .....I.. Store accumulator into memory at 1700+y 86cb54 inc $00 [000000] A:000b X:0000 Y:0000 S:0192 D:0000 DB:81 .....I.. Increment the value at 0 on the stack, remember this is the password index 86cb56 lda $02 [000002] A:000b X:0000 Y:0000 S:0192 D:0000 DB:81 .....I.. Load the value at 2 on the stack into A, remember this was initialized with 0 86cb58 clc A:0000 X:0000 Y:0000 S:0192 D:0000 DB:81 .....IZ. Clear carry flag 86cb59 adc #$0006 A:0000 X:0000 Y:0000 S:0192 D:0000 DB:81 .....IZ. Add 6 to A (6 is constant) 86cb5c sta $02 [000002] A:0006 X:0000 Y:0000 S:0192 D:0000 DB:81 .....I.. Store A into 2, this value will increase by 6 on each iteration 86cb5e cmp #$0168 A:0006 X:0000 Y:0000 S:0192 D:0000 DB:81 .....I.. Compare A with 168 86cb61 bcc $cb2e [86cb2e] A:0006 X:0000 Y:0000 S:0192 D:0000 DB:81 N....I.. Branch if carry flag is clear, meaning A is less than 168, remember $cb2e is the start of the loop
What this is doing is a little bit disapointing. I was expecting the logic to be more straightforward, just the values and a checksum. What we have here is a decoding logic. The values are decoded from the password and stored on $1700.
When we think about it, it makes sense. If not encoded, it would be very easy to identify patterns on the password.
The encode logic here is very interesting. It is based on a counter being incremented by 6 on each iteration and divided by 8. Then, both the result and the remainder are used. The result is the index on the decoded result to be written, and the remainder is used as the bitshift.
There are a few gotchas here. First one is that the bit rotation depends on the carry flag. The value is not shifted, but rotated.
For example, given 10000000 00000000 left shift 2 (rol instruction), the 1 would go to the carrier flag and back, giving 00000000 00000001.
The second gotcha is the M flag. Although the index is increaded by 1 (8 bits), the bit OR operation and the rotation are applied to 16 bits.
This means that even though a character from the password is only 8 bits, it can have a 16 bit value after rotated. It also means the values can overlap when doing the OR operation.
Or the values can be applied twice to the same memory index. It is easier to understand it if we look at the table with the values from the six counter divded by 8:
SixCounter | Math.floor(sixCounter/8) | (sixCounter % 8) |
0 | 0 | 0 |
6 | 0 | 6 |
12 | 1 | 4 |
18 | 2 | 2 |
24 | 3 | 0 |
30 | 3 | 6 |
36 | 4 | 4 |
42 | 5 | 2 |
48 | 6 | 0 |
54 | 6 | 6 |
60 | 7 | 4 |
66 | 8 | 2 |
On the first pass, memory position is $1700+0 (0/8 = 0) and the value is shifted 0 times (remainder).
On the second pass, memory position is still $1700+0 (6/8 = 0 ) and the value is shifted 6 times (remainder). So the value is applied with OR to the same memory position. If the shifted value happens to be more than 8 bits, it will overlap with the next value (Little endian).
For example, Let's say you have a password
CDF
or translating to values:
1 2 3
Let's do the decoding:
On pass 1: - The character value is 1 - The shift will be the remainder from 0/8 which is 0 - The memory position is the quotient from 0/8 which is 0 The value 1 represented in 16 bits little endian 00000001 00000000 - 0x01 0x00 Shifted left 0 times 00000001 00000000 - 0x01 0x00 An or is done with the value already in address 0: 00000001 00000000 - 0x01 0x00 00000000 00000000 - 0x00 0x00 ____________________ 00000001 00000000 - 0x01 0x00 Final value written on position 0 and 1: 0x01 Current memory 0x01 Counter increases by 6: 6 On pass 2: - The character value is 2 - The shift will be the remainder from 6/8 which is 6 - The memory position is the quotient from 6/8 which is 0 The value 2 represented in 16 bits little endian 00000010 00000000 - 0x02 0x00 Shifted left 6 times 10000000 00000000 - 0x80 0x00 An or is done with the value already in address 0: 10000000 00000000 - 0x80 0x00 00000001 00000000 - 0x01 0x00 ____________________ 10000001 00000000 - 0x81 0x00 Final value written on position 0 and 1: 0x81 Current memory 0x81 Counter increases by 6: 12 On pass 3: - The character value is 3 - The shift will be the remainder from 12/8 which is 4 - The memory position is the quotient from 12/8 which is 1 The value 3 represented in 16 bits little endian 00000011 00000000 - 0x03 0x00 Shifted left 4 times 00110000 00000000 - 0x30 0x00 An or is done with the value already in address 1: 00110000 00000000 - 0x30 0x00 00000000 00000000 - 0x00 0x00 ____________________ 00110000 00000000 - 0x30 0x00 Final value written on position 1 and 2: 0x30 Current memory 0x81 0x30 Counter increases by 6: 18 On pass 4: - The character value is 255 Execution breaks from the loop Current memory 0x81 0x30
Decoding CDF gets us 0x81 0x30
In conclusion for this part of the logic, we have a password as input and the decoded values as an output at $1700.
Moving on we need to figure out what is done with those values at $1700.
Finding out valid passwords
If that is all there is, how come some passwords work and some don't? What we can do to figure this out is to add a read breakpoint at $811700.
Previously for the decoding logic I did the explanation operation by operation because I though that was an interesting and unexpected logic.
There is nothing mindblowing happenning here, so I will just highlight the important bits and omit much of the trial and error and debugging that went on to figure this out.
One thing it is not visible here is that A contains the password lenght at the start of the subroutine. This is an important point later on.
Take a look at these instructions on the subroutine that reads the decoded password:
... while working on 8 bits mode 83f696 lda $1702 83f699 eor $1700 83f69c sta $1648 83f69f lda $1703 83f6a2 eor $1700 83f6a5 sta $1649 ... while working on 16 bits mode 83f6aa lda $1648; meaning it loads $1648 $1649 Little endian on A 83f6ad bit #$0020 83f6b0 beq $f6b5; beq is branch if equal, so we jump if bit 5 is not set and skip the next jmp 83f6b2 jmp $f7a6 ; jump here if bit 5 is set
The bit instruction is a bitwise AND operation. It is used to test if one or more bits are set in a value.
So, we have some bit flags on the third and fourth byte positions of the decoded password. They also do an eor on the first position.
This will check for the bit and if it is not there, skip a jump insruction.
#$0020 is bit 5. If we set it on the decoded password, it jumps to $f7a6. There we have some extra checks:
There are more branching depending on flags. Those are the bit instructions for bit #$0080 and bit #$0100.
But, most important there is a comparing with the value at $02 for #$0032. This only happens if bit #$0020 is set and not the other bits #$0080 and #$0100.
The value at $02 is the password length. So, the password length must be 0x32, which is 50 in decimal.
In summary, every bit instruction is checking a flag on the first two byte of the decoded password, and every cmp is checking the char password length.
Following the logic, for all bit flags, in the end we figure out there are some main flags that represents types of passwords. Each flag combination requires a different password length. There are just a 9 flag combination possible, which would be the passwords for different game modes:
- #$0020 - Size 50
- #$0020 and #$0080 - Size 11
- #$0020 and #$0100 - Size 8
- #$0004 - Size 15
- #$0004 and #$0008 - Size 39
- #$0004 and #$0200 - Size 12
- #$0002 - Size 20
- #$0400 - Size 11
- #$1000 - Size 27
And only the first flag is evaluated, so if you have #$0020 and #$0004, it ends up ont the logic for #$0020.
We are now very close to be able to produce valid passwords.
We know there are only 9 types of password with fixed password lenghts. Everything else is invalid.
But even following the flags and size rules, the password is still not valid. Can you guess why?
Finnally, a checksum
Yes! There is a checksum!
Going back to the logic for the #$0020 flag, we have this:
83f7d0 lda #$0025 83f7d3 jsr $f8c7
This is the checksum logic inside $f8c7.
It is a little bit messy. It translates to this pseudocode:
A = 0x25 <- fixed value (37 decimal) A-- A-- var4 = A // A is 0x23, it is the range of the values to be considered for the checksum Start 8 bit mode var2 = 0 // This will be hold the sum of the values X = 2 do { A = decodedPass[X] C = 0 //Clear carry flag A += var2 var2 = A A = decodedPass[X] A = A ^ decodedPass[0] // why is this necessary? decodedPass[X] = A X++ // Index goes up var4-- // Counter passed as parameter goes down } while var4 != 0 A = var2 // we get the sum C = 0 //Clear carry flag A += decodedPass[0] // why? if(A !== decodedPass[1]) { fail checksum }
To me it is the purpose of the value at decodedPass[0] ($1700) is not clear yet. It is probably related to how the password is created and maybe a clever hack to easily adjust a password checksum.
But, despite that, what is actually done here is straightforward.
decodedPass[1] ($1701) holds the actual value to be compared with the sum of the values.
The range of bytes used for calculating the check is different for each flag combination. So what is left is to debug the logic for every one of the 9 flag combinations and get the hardcoded ranges.
And with this last piece of the puzzle, we can now finally generate valid passwords for International SuperStar Soccer Deluxe.
You can check the generator working at github pages and inspect the commented source.
Right now al it does is check the size against tyhe flags and calculate the checksum. You can use it to write any valid password, either by chenging the password or the hex values.
You can write cool valid password such as this one:
Working out what the values mean could in theory be done by experimenting new values (if you are not willing to dive into the 65C816 assembly).
This concludes the first part of the password generator. The next step is to figure out what the values in the decoded password actually do. The plan would be to just follow the reads and debug code.
But for that, I will need more vacation time :)