Acmlm's Board - I3 Archive - ROM Hacking - Figuring out SOE text
User Post
Zer0wned

Koopa


 





Since: 12-09-05
From: Torrance, ca

Last post: 6440 days
Last view: 6440 days
Posted on 08-13-06 03:44 AM
Edit: I'd recommend skipping ahead.
(SOE = Secret of Evermore)
Here's what I've gathered so far regarding how SOE handles text:
Generally, it stores full words to be reused repeatedly (change 'axe' to 'cat', and every occurrence [with exceptions, mentioned ahead] will be "cat" instead of "axe"); variants with capital letters are stored separately.

On rarer occasions, full sentences/descriptions are stored. For example, there's an easter egg where if you pester a chicken too much, you get smitten by the gods. All the accompanying text is in one place and consecutive (so "TAUNT THE GOATS IF YOU WANT, BUT LEAVE THE CHICKENS ALONE" as opposed to it just retrieving the stored TAUNT, ALONE, GOATS, etc.).

So the memory address of "Podunk" is 11FFDF; I did a search on DF FF 11 in the ROM, and came up with three matches, one of which was preceded by A4 (so 'A4 DF FF 11')
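
For reference, the byte search described above can be sketched in Python. This is only an illustration: `find_pointer_refs` and `demo_rom` are made-up names and data, not anything from the game, and a match only shows that the three bytes occur at that offset, not that they are actually used as a pointer.

```python
def find_pointer_refs(rom: bytes, address: int) -> list[int]:
    """Return every file offset where `address` appears as a 24-bit little-endian value."""
    needle = address.to_bytes(3, "little")  # 0x11FFDF becomes DF FF 11
    hits = []
    pos = rom.find(needle)
    while pos != -1:
        hits.append(pos)
        pos = rom.find(needle, pos + 1)
    return hits


# Made-up stand-in for a ROM image; a real search would read the ROM file instead.
demo_rom = bytes.fromhex("00 A4 DF FF 11 9D 12 3C 00 DF FF 11")
hits = find_pointer_refs(demo_rom, 0x11FFDF)

for offset in hits:
    # Show the byte in front of each match, e.g. the A4 mentioned in the post.
    previous = f"{demo_rom[offset - 1]:02X}" if offset > 0 else "--"
    print(f"match at 0x{offset:06X}, preceded by {previous}")
```

Printing the preceding byte makes it easy to spot matches like 'A4 DF FF 11', but whether that byte means anything still has to be checked in a debugger.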


$8F/A4DF FF 9D 12 3C SBC $3C129D,x[$3C:129D] A:00F9 X:0000 Y:0006 P:eNvMXdizc
^-- that came up when I did a trace around the time the word "Podunk" appears. Now, I'm pretty sure I screwed something up in the tracing (I just started re-learning this), but I'll be damned if this isn't related to what I'm looking for.

The reason I'm pretty sure I screwed something up is... SBC is a subtract instruction, right? And A4 = LDY dp, "loads directly to accumulator Y" (I'm not 100% sure what that does exactly), which from the sound of it is a little more related to what's going on.

7AE22 is the address where I found the A4 DF FF 11 bytes. I'm using Geiger's Snes9x debugger for tracing and XVI32 for hex editing/searching, if that helps.

So based on what I've provided, is there anything I can do differently to produce better results? My goal here is to figure out how SOE handles text.

(If there's any way I can improve my posting quality in the future, go ahead and let me know what I'm doing right/wrong; I don't want to be a nuisance.)

Edit: So I tried changing all three of the pieces of code that contained DF FF 11 to supposedly point to a different word, but with no luck. I can change the word itself though.


(edited by Zer0wned on 08-13-06 03:07 AM)
(edited by Zer0wned on 08-17-06 10:04 PM)
MathOnNapkins

1100

In SPC700 HELL


 





Since: 11-18-05

Last post: 6282 days
Last view: 6282 days
Posted on 08-13-06 04:52 AM
I don't understand what you mean by memory address $11FFDF. I think SOE is Hi-Rom, so this isn't even a valid CPU address. And if it were Lo-Rom it would be an address that resided in ROM, not in RAM... so what exactly are you talking about?

Do you mean that ... erm. Let me put it this way, $11FFDF sounds like an address inside the ROM image, not a memory address. Please clarify on that point.

Next, $A4 is an opcode that takes a one byte operand. So it would consume $DF, then the next opcode is $FF which is SBC $nnnnnn, X (where each n is a hex digit). So the opcode $FF consumes the next three bytes. So that would mean you'd have:

LDY $DF (read up on the direct page register to understand this)
SBC $????11, X (The question marks would be filled in by whatever followed $11 in reverse order)

^ This code makes little sense. Believe me, when you stare at SNES code in hex long enough, you learn to recognize what makes sense and what would be unusual. So I doubt this is code, or at least, if it is code, this is not a proper reading of it. I mean, why would you load Y with a value and then not immediately use it? More context would be needed to determine whether it was a correct analysis. Or a full disassembly/trace.
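
The byte-by-byte reading above can be sketched as a tiny decoder in Python. It knows only the two opcodes discussed here ($A4 and $FF); a real 65816 disassembler needs the full opcode table and must track the register-width flags. The two bytes after `11` in the sample (`22 33`) are placeholders, since the thread does not say what follows in the ROM.

```python
# Only the two opcodes discussed above: (mnemonic, addressing mode, operand bytes).
OPCODES = {
    0xA4: ("LDY", "dp", 1),       # LDY direct page
    0xFF: ("SBC", "long,X", 3),   # SBC absolute long, indexed by X
}


def decode(data: bytes) -> list[str]:
    """Decode a byte string using the tiny opcode table above."""
    out = []
    i = 0
    while i < len(data):
        entry = OPCODES.get(data[i])
        if entry is None:
            out.append(f"??? ${data[i]:02X}")
            i += 1
            continue
        name, mode, size = entry
        operand = data[i + 1 : i + 1 + size]
        if len(operand) < size:
            out.append(f"{name} <truncated>")
            break
        value = int.from_bytes(operand, "little")  # operands are little-endian
        if mode == "dp":
            out.append(f"{name} ${value:02X}")
        else:
            out.append(f"{name} ${value:06X},X")
        i += 1 + size
    return out


# "22 33" are placeholder bytes standing in for whatever follows $11 in the ROM.
listing = decode(bytes.fromhex("A4 DF FF 11 22 33"))
print("\n".join(listing))
```

The `11` ends up as the low byte of the SBC operand, which is why the reading gives `$????11` rather than a reference to `$11FFDF`.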

I don't have access to a usable romhacking computer right now so I really can't help you. But if you provide more information I could put you on the right path.
Zer0wned

Posted on 08-13-06 05:33 AM
$11FFDF is an address inside the actual ROM image, and you are correct, it is HiROM (I don't know most of the jargon yet, sorry about the confusion).

Unfortunately I can't continue tonight, but if you could remind me how to correctly obtain a full disassembly/trace (which is easy using Geiger's Snes9x, if I remember correctly) I will gladly do it tomorrow before I leave.

Attached is a trace (that has probably been done incorrectly) from exiting the enter-name screen to the appearance of "Podunk, U.S.A." (which isn't a graphic, because I've changed "Podunk" to something else before).

I should note this is my return from an 8-month hiatus from ROM hacking, and I wasn't all too knowledgeable then either, so please bear with me (at least now I have more programming experience, but geez, ASM is still hard to pick up).

Attachments

SOE.txt (15010b) - views: 33
MathOnNapkins

Posted on 08-13-06 05:54 AM
Breakpoints will often give you more info than looking through lengthy traces. I think of traces as a last resort, only used in certain situations. Set a read breakpoint on that particular ROM location. And by that, I mean you have to convert that address ($11FFDF) to a CPU address.

Now... from the trace you posted it seems that the code is executing starting at bank $80.... and it doesn't look like Hi-Rom code... it looks like Lo-Rom code executing in the FastRom area (banks $80 and above). The reason I say this is that all the code in the trace executes in the upper half of each bank, which is characteristic of Lo-Rom. Hi-Rom games also typically only execute in banks $C0 and above unless they are larger than 32 megabits (4 megabytes). *FastRom allows the processor to run at 3.58 MHz rather than 2.68 MHz. Anyways, you'll want to use a program like Lunar Address to convert that into a usable CPU address. Don't know if your ROM has a header, but my best guess is that it will convert to $A3FFDF (no header) or $A3FDDF (header). In time you will see how I got there.
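
The two Lo-Rom guesses above can be reproduced with a few lines of Python. This is a sketch of the usual LoROM mapping (32 KB of ROM per bank, mapped into $8000-$FFFF), assuming a 512-byte copier header when one is present; the function name and parameters are made up for illustration and this is not how Lunar Address itself is implemented.

```python
HEADER_SIZE = 0x200  # assumed size of the optional copier header


def file_offset_to_lorom(offset: int, has_header: bool = False, fast: bool = True) -> int:
    """Convert a ROM file offset to a LoROM CPU address (sketch)."""
    if has_header:
        offset -= HEADER_SIZE
    bank = offset >> 15                 # LoROM holds 32 KB of ROM per bank
    addr = 0x8000 | (offset & 0x7FFF)   # mapped into the upper half of the bank
    if fast:
        bank |= 0x80                    # FastRom mirror, banks $80 and above
    return (bank << 16) | addr


print(f"${file_offset_to_lorom(0x11FFDF):06X}")                   # no header
print(f"${file_offset_to_lorom(0x11FFDF, has_header=True):06X}")  # with header
```

The only difference between the two results is the $200 bytes of header subtracted before the bank split.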
Zer0wned

Posted on 08-13-06 12:09 PM
Now this HiROM/LoROM thing is getting confusing... When I use "auto detect type" in Lunar Address (which is how I obtained the information last night), it says that it's HiROM, and the bootup screen says the same (screenshot provided). But according to what you say, it's LoROM. I guess I'll have to take your word on this, because it is executing exclusively in the 8x/xxxx and 90/xxxx areas. I'll try messing around with breakpoints and see how that goes...

Edit: Now this is interesting... I used Lunar Address to convert the 11FFDF (Podunk) address and looked it up in the Snes9x hex viewer. Neither LoROM translation (with or without header) worked, but when I did it as HiROM/no header ($D1:FDDF), there it was: Podunk. Any reason you can see for this?

Attachments




(edited by Zer0wned on 08-13-06 11:24 AM)
Gideon Zhi

Keese








Since: 12-05-05
From: ...behind you! Boo!

Last post: 6284 days
Last view: 6282 days
Posted on 08-13-06 01:18 PM
Originally posted by Zer0wned
(SOE = Secret of Evermore)
Here's what I've gathered so far regarding how SOE handles text:
Generally, it stores full words to be reused repeatedly (change 'axe' to 'cat', and every occurrence [with exceptions, mentioned ahead] will be "cat" instead of "axe"); variants with capital letters are stored separately.

On rarer occasions, full sentences/descriptions are stored. For example, there's an easter egg where if you pester a chicken too much, you get smitten by the gods. All the accompanying text is in one place and consecutive (so "TAUNT THE GOATS IF YOU WANT, BUT LEAVE THE CHICKENS ALONE" as opposed to it just retrieving the stored TAUNT, ALONE, GOATS, etc.).


This sounds like fairly standard dictionary compression. Your pointers are likely to be stored in a table somewhere.
d4s

Shyguy








Since: 12-01-05

Last post: 6404 days
Last view: 6302 days
Posted on 08-13-06 03:48 PM
Originally posted by Zer0wned
Now this HiROM/LoROM thing is getting confusing... When I use "auto detect type" in Lunar Address (which is how I obtained the information last night), it says that it's HiROM, and the bootup screen says the same (screenshot provided). But according to what you say, it's LoROM.



most of the trace you posted is part of the sound handshaking routine.
the game probably plays a sound effect there or changes the music.
besides, your trace is just a fraction of the code that actually executed; perhaps the routine you're searching for isn't even in it.


the game is hirom. that doesn't mean it can't access data in the range that lorom games use.
i think mon wasn't saying the game is lorom, but rather that it looks like it's accessing data lorom style.
the reasons why games do this can vary; my guess would be that they are using a sound engine that was meant to be used on a lorom cart in the first place.
the lower rom banks ($00 and the fastrom mirror $80 onwards) have ram, registers and rom in them, while the high ones ($c0 onwards) contain nothing but rom. the routine would most likely have had to be rewritten to execute in the $c0 banks. it's much more convenient to just execute it in the $80 banks and put the code and data in the upper 32k of a hirom bank.

like mon said, the address $11FFDF you're searching for is an address inside the rom image in your hex editor.
that's not how the snes sees the rom, though.
using lunar address is fine, but if you don't know how the snes' memory map works, you probably won't be able to put it to good use.

you know that the rom is hirom, which means every bank is 64kb.
$11 is your bank and ffdf is an offset inside this bank.

like mentioned before, hirom games usually access data in the upper hirom mirror, banks $c0 onwards.
add $c0 to $11 and you have $d1, resulting in the snes address $d1ffdf.
if you don't get a breakpoint match with that address, remember that
the game often uses the bank $80 area instead, so set a breakpoint at $91:ffdf as well. most likely, that'll tell you where the game accesses that data.
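
the hirom arithmetic above can be sketched in python as follows. the function name is made up, the 512-byte header size is an assumption, and the sketch only covers rom files up to 4 megabytes (banks $00-$3f of the image). the bank $80 mirror only exists for the upper 32k of each bank, so the second value is None for offsets below $8000.

```python
from typing import Optional, Tuple

HEADER_SIZE = 0x200  # assumed size of the optional copier header


def file_offset_to_hirom(offset: int, has_header: bool = False) -> Tuple[int, Optional[int]]:
    """Return (address in banks $C0+, mirror in banks $80+ or None) for a HiROM file offset."""
    if has_header:
        offset -= HEADER_SIZE
    bank, addr = divmod(offset, 0x10000)   # every hirom bank is 64 KB
    if bank >= 0x40:
        raise ValueError("sketch only handles ROM images up to 4 MB")
    high = ((0xC0 + bank) << 16) | addr
    # Banks $80-$BF only expose ROM in their upper half ($8000-$FFFF).
    mirror = ((0x80 + bank) << 16) | addr if addr >= 0x8000 else None
    return high, mirror


high, mirror = file_offset_to_hirom(0x11FFDF)
print(f"${high:06X} ${mirror:06X}")
```

both addresses refer to the same rom byte, so it is worth setting a read breakpoint on each.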

before continuing, try to understand how the snes' memory map works.
there are various docs around that cover this.


(edited by d4s on 08-13-06 02:51 PM)
(edited by d4s on 08-13-06 02:51 PM)
(edited by d4s on 08-13-06 02:52 PM)
(edited by d4s on 08-13-06 02:53 PM)
Zer0wned

Posted on 08-16-06 01:31 PM
Alright, so I got some help on IRC, and I think I have some actually useful information this time. What I did was go to "Podunk" in the actual ROM image and corrupt it so the game would crash when it tried to load it, make note of where in the game that happened, then restart and do a trace beginning before that point (with the corrupted data still there).

192 megs of 5 meg CPU trace logs later O_o... (I only really needed the last one anyway, but geez) I came up with the code seen in SOE2.txt a little bit before the crash occurred. Notice all the EAs getting put into the accumulator? That's what I used to corrupt the data and cause a crash.

After I obtained that information, I set a breakpoint at an arbitrary point before the relevant code (at least I think it's before the relevant code). I then noticed that a ways into the CPU log, 50, 6F, 64, 75, 6E, and 6B (ASCII for "Podunk") were loaded into the accumulator in that order.
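
As a quick check of the ASCII reading above, the six values from the log decode as follows in Python (a throwaway snippet; the variable names are made up):

```python
# The six accumulator values listed above, in the order they appeared.
trace_values = [0x50, 0x6F, 0x64, 0x75, 0x6E, 0x6B]
word = bytes(trace_values).decode("ascii")
print(word)

# The reverse direction gives the hex pattern to search for in a trace or hex editor.
pattern = "Podunk".encode("ascii").hex(" ").upper()
print(pattern)
```

The same two lines work for any other word, which helps when scanning a trace for text being copied one character at a time.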

I can provide more of the trace from further back for both the uncorrupted and corrupted versions, and from further ahead for the uncorrupted one, if you need it.

(The attached text contains both versions of the code.)

Attachments

SOE2.txt (19387b) - views: 23
Gideon Zhi

Posted on 08-16-06 05:31 PM
'fraid that doesn't help much. The LDA you've got there is a load from -RAM- not from ROM; it's likely a buffer containing the unpacked string. What you need to do is track backwards to figure out when it gets read from ROM and stored into RAM.

If you're using Geiger's debugger, try opening up the hex editor and navigating to the address (7Fwhatever); you'll probably see the entire text string there.
Zer0wned

Posted on 08-17-06 10:59 PM
[deleted because I didn't know what I was talking about]
Significant update:
Alright, so I FOUND the text table. I mentioned a pattern I saw near where all the text was stored (00xx, 00xx, [etc.] 01xx, 01xx [and so on, up to 06xx, with xx increasing semi-randomly each time]); someone said it reminded them of an... index string, I think? Anyway, these lines of ASM:

$8C/CE1B BF 6C F4 91 LDA $91F46C,x[$91:F636] A:01CA X:01CA Y: D81D P:envmxdizc
$8C/CE1F 18 CLC A:060A X:01CA Y: D81D P:envmxdizc
$8C/CE20 69 D5 F7 ADC #$F7D5 A:060A X:01CA Y: D81D P:envmxdizc
$8C/CE23 85 26 STA $26 [$00:0026] A:FDDF X:01CA Y: D81D P:eNvmxdizc
*skip some lines..
$8C/CE02 A7 26 LDA [$26] [$91:FDDF] A:0000 X:01CA Y: D81D P:envMxdiZc
*More code; the "EA EA EA EA EA EA" I put in starts showing up in the accumulator. Using breakpoints and frequent checks of the RAM via the hex editor in Geiger's Snes9x, I saw "EA EA EA EA EA EA" start getting loaded into the RAM at $7F:D81E.

were in one of the more complete trace logs. Anyway, the "$91:F636" ($11F636 in the ROM) was right in the aforementioned suspected index string. So as an experiment, I changed the "0A 06" to "11 06", which was the value right next to it. And small-scale success: "Sheath, U.S.A." (Sheath is the word right next to Podunk in the ROM). Nothing bugged out, and I even tried making it point to "Welcome", which is 7 characters instead of 6; it was also printed to the screen with no errors to follow.

So what I kinda understand is that there's an LDA [$22] that contains the table value, which gets F7D5 added to it. The result is the xxxx portion of [$91:xxxx], which is the location of the target word + $80:0000. The word then gets loaded into RAM and used shortly after.
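
The arithmetic in the trace above can be checked with a short Python sketch. The constants come straight from the trace lines ($91F46C, #$F7D5, and the $91 bank seen in LDA [$26]); the function names are made up, and treating $91 as a fixed bank for every word is an assumption based on this one example.

```python
TABLE_BASE = 0x91F46C   # operand of LDA $91F46C,x in the trace
WORD_BASE = 0xF7D5      # constant added by ADC #$F7D5
WORD_BANK = 0x91        # bank seen in LDA [$26] [$91:FDDF]; assumed fixed


def table_entry_address(x: int) -> int:
    """CPU address of the table entry read when the X register holds `x`."""
    return TABLE_BASE + x


def word_address(entry: int) -> int:
    """CPU address of the word for a 16-bit table entry."""
    return (WORD_BANK << 16) | ((entry + WORD_BASE) & 0xFFFF)  # 16-bit add wraps


def hirom_file_offset(cpu_address: int) -> int:
    """Offset in a headerless HiROM image for a ROM address in banks $80-$FF."""
    return cpu_address & 0x3FFFFF


print(f"${table_entry_address(0x01CA):06X}")                      # table entry read in the trace
print(f"${word_address(0x060A):06X}")                             # original entry "0A 06"
print(f"${word_address(0x0611):06X}")                             # edited entry "11 06"
print(f"0x{hirom_file_offset(table_entry_address(0x01CA)):06X}")  # where to edit in the ROM file
```

Changing the entry from "0A 06" to "11 06" moves the target seven bytes further into the ROM, which fits the next word sitting right after "Podunk".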

So I know where all the words are, I think I found the entire index table (or at least the main one), and I found where in the code the word gets loaded to the RAM and where in the RAM it gets loaded to. Now what I'm looking for is what's going on with the X register in LDA $91F46C,X before this code. It gets changed around a lot, and I keep getting lost, so if there isn't some standard way this is done, I'll gladly post some trace code with a few notes so you can CTRL+F your way through what I've been able to recognize. This will help me find where in the ROM the pointers are held, how they are put together to form sentences, etc.


(edited by Zer0wned on 08-17-06 10:03 PM)
MathOnNapkins

Posted on 08-19-06 04:47 AM
Glad to see you've made some progress, keep us updated.
Zer0wned

Posted on 08-23-06 01:25 PM
Alright, a little more progress, but I'm getting stuck more often now... I created a note page for general text information and a word list with the associated pointers, and I've interpreted a few more of the values I've managed to come across.

Since it's the easiest word to work with, I've attempted to map out as many aspects of "Podunk" as I could.

So now I'm looking for how to find the original value that gets logically shifted to become the pointer offset...

Attached: a partial trace with notes (with liberal tracing before and after; the key area can be found by searching for "****", leaving out the quotes), all my notes on the word "Podunk", and my general text notes.

Oh yeah, and SOE is HiROM and uses FastROM; those two are for sure now. The notes/traces are all based on a headerless ROM.

Attachments

SOE2.txt (17156b) - views: 15
General text info.txt (365b) - views: 16
Podunk info.txt (556b) - views: 19
SOE wordlist.txt (5012b) - views: 26
Alchemic



 





Since: 11-17-05

Last post: 6396 days
Last view: 6282 days
Posted on 08-25-06 12:12 AM
I found this archive (copy attached for posterity) on this page; would it happen to be what you are looking for?

(Other features of the system from using those files: [0C XX] is a textual delay for XX units, and [4X] causes the next X bytes to be parsed differently - I think as just plain ASCII.)
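
A rough Python sketch of how those two control codes might be tokenized, going only by the description above. The reading of [4X] as "bytes $40-$4F, where the low nibble is a count" is an assumption, every other byte is passed through as unknown, and the sample bytes are made up rather than taken from the game.

```python
def tokenize(script: bytes) -> list:
    """Split a script into (kind, value) tokens using only the two codes described above."""
    tokens = []
    i = 0
    while i < len(script):
        b = script[i]
        if b == 0x0C and i + 1 < len(script):
            tokens.append(("delay", script[i + 1]))   # [0C XX]: delay for XX units
            i += 2
        elif 0x40 <= b <= 0x4F:
            count = b & 0x0F                          # assumed: low nibble = byte count
            run = script[i + 1 : i + 1 + count]
            tokens.append(("literal", run.decode("ascii", errors="replace")))
            i += 1 + len(run)
        else:
            tokens.append(("other", b))               # anything not covered here
            i += 1
    return tokens


# Made-up sample: a 6-byte literal, a delay of $10 units, then an unknown byte.
demo = bytes([0x46]) + b"Podunk" + bytes([0x0C, 0x10, 0x99])
tokens = tokenize(demo)
print(tokens)
```

Anything tagged "other" would need the files in the archive to interpret properly.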

Attachments

soe.zip (9329b) - views: 15
Zer0wned

Posted on 08-25-06 01:21 AM
AND THEN OUT OF FREAKING NOWHERE, AN INFORMATION GOLD MINE APPEARS.

Sorry for the spammy reply, but holy hell, that was the last thing I was expecting today! I'd taken a break starting today because I didn't want to burn myself out and get too frustrated too early. I looked for similar things elsewhere, but I've always had problems finding stuff on Zophar's Domain for some reason, and Google was definitely not much help...

And considering this is apparently the last time you've posted in like a month, I feel blessed =D.