HyperHacker Posts: 5002/5072 |
Yes, you should see things like this all the time (using Z80 as an example because I know it):
DEC A
JR NZ, somewhere
LD A,(HL)
AND A ;a Load instruction on Z80 doesn't update the flags
JR Z, somewhere
ARM even has the ability to choose whether an instruction will update the flags. |
Zeld Posts: 50/53 |
On that note, would it be possible to simplify code by taking out unneeded comparisons in cases where previous operations have altered the flags accordingly? I never tried it in thumb, but I've seen it used in ARM a lot (what with just about every operation in ARM being able to have conditions, it's no wonder). |
Xenesis Posts: 178/200 |
The processor has NCZV flags, so it does both carry and negative, I would assume. But thanks, I seem to have gotten the compare operation the other way around. |
HyperHacker Posts: 4987/5072 |
A Compare operation simply performs a subtraction and doesn't store the result. So when you do, for example, cmp r0, 0x10, the CPU does r0 - 0x10 and updates the flags accordingly. If r0 == 0x10, then the result will be zero, so the Z(ero) flag will be set. If r0 < 0x10, the result will be negative, so the N(egative) flag will be set. (ARM/Thumb has a C(arry) flag too, but on a subtraction it is set when there is no borrow, so it ends up clear in this case.) |
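To make the flag behaviour concrete, here is a rough Python model of what a 32-bit `cmp` does to N, Z, C and V. The function names (`cmp_flags`, `beq_taken`, `bne_taken`) are made up for this sketch and it only models the flags, not a real CPU.

```python
def cmp_flags(rn, operand):
    """Model the NZCV flags after `cmp rn, operand` on 32-bit values."""
    mask = 0xFFFFFFFF
    rn &= mask
    operand &= mask
    result = (rn - operand) & mask          # the subtraction; result is discarded by cmp
    n = result >> 31                        # N: bit 31 of the result
    z = int(result == 0)                    # Z: result was zero
    c = int(rn >= operand)                  # C: set when there was NO borrow (unsigned >=)
    # V: signed overflow, i.e. operands differ in sign and result's sign differs from rn's
    v = ((rn ^ operand) & (rn ^ result)) >> 31
    return {"N": n, "Z": z, "C": c, "V": v}


def beq_taken(flags):
    """beq branches when Z is set (operands were equal)."""
    return flags["Z"] == 1


def bne_taken(flags):
    """bne branches when Z is clear (operands differed)."""
    return flags["Z"] == 0
```

So after `cmp r0, 0x1` with r0 = 1, Z is set, `beq` is taken and `bne` falls through.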
Xenesis Posts: 177/200 |
Okay, I've got my head around jumps and stuff now, so I've got a new question.
Comparisons. I want something to branch if I compare it. E.g., if r0 = 0, add 0x10 to the PC or something. However, looking at the Conditional Branches section of the GBATEK document, all I can see is that you can't do direct register comparisons, yes? You have to compare the CPSR condition flags, right? Now, for my code I want it to skip them if r0 = 1. Hence: cmp r0, 0x1. Then if r0 = 1, the Z would be set to 0, right? Then I go bne (0xjump). Is that correct, or am I going about this the wrong way? |
RadioShadow Posts: 22/24 |
Yay! Xen fixed it! 'plays' |
Xenesis Posts: 172/200 |
Yeah, it is pretty broken.
While I'm normally a balance freak when it comes to AW2, this was something more to cut my teeth on before I go onto bigger and greater things.
Edit: I've discovered that the patch doesn't work in Campaign or War Room. It appears that those memory addresses are used for other things. Oh well, I'll fix it up when I have time. Still works flawlessly in versus.
Edit 2: Fixed Version. Huzzah. That'll teach me to be sloppy with choosing my memory addresses. |
Zeld Posts: 49/53 |
You know, if you debug the process of a mov pc, it will take you to the address&fffffffe, but r15 will be equal to address&fffffffe plus 2 or 4, depending on the processor mode. It actually DOES increment because of pipelining >.>
What's the relationship you decided on between the day number and Adder's firepower? If it's directly proportional it's gonna be broken, no? |
Xenesis Posts: 169/200 |
I don't know. I just know it works. And yes, I was wrong, it is actually a decrement, I forgot that I just put an mov r0, r0 there. XD Anyhow, I managed to complete my first assembly hack, using two jumps from code in Advance Wars 2. One is run once when starting a game and I use it to initialise some values, the other is updated every time the 'current day' is increased. It needs a clean AW2 US rom (See releases for a cleaner if necessary), but here it is. It gives Adder the ability to power up over battle. He gets a firepower bonus based on what the current day is. |
Zeld Posts: 48/53 |
Wait, wouldn't that be a decrement? Or does it "increment" because of pipelining? :\
You can set a register to a range of 255 (not 256, because 0 would be useless) values with the mov function (and no, you can't shift while moving, because that is indeed an ARM only feature. Also, I haven't seen even ARM mode do that with an immediate value, but I don't see why it couldn't). That gives lots of nice ROM addresses to jump to...factor in the concept of different shift amounts, and that expands it even more.
Yeah, I guess you might not have to use an LR loop. It's just, I don't want to have my assembly hacks spaced all over the expanded area of a ROM, because I'm working on expanding Fire Emblem 7, and there's lots of tables that will be going there. I'll have to place them accordingly with my assembly hacks in between with this new method, but I like it.
For Fire Emblem 8, I already used a
push {lr}
bl $[Table, where LR is popped and compared to possible LRs to determine which subroutine to execute]
pop {lr} - as stated above
ldr first possible LR
cmp register with loaded possible LR, current LR
bne [Next compare]
bl $[Hack]
Then, at the hack routine, I do my business and load some numbers into the 8 base registers that I use to repair LR before popping 0-7. From there, I usually load one last word into a register that won't be used by the next routine and mov pc, that register.
The returning issue is still there, I guess. You could still find registers that will be overwritten anyway and load a return address into it...or you could be a sneaky bastard and write your code in ARM, and load directly into PC (that works, doesn't it?)
Oh, I have another question. I just got done writing my IEEE 32 bit floating point processing functions that handle addition, reciprocation, and multiplication (which allows me to do all 4 basic math operations, since subtraction is a variant of addition and division is multiplying by the reciprocal).
I had to write my multiplication function in ARM to make use of "umull" (thanks to whoever pointed out the umull operation in the programming thread I made >.>). After I assembled it and saw how big it was (400 bytes XP) I thought it would be slow, but it seems to actually calculate faster than the thumb scripts for addition and reciprocation. Is ARM mode just faster or something? If anyone cares, my reciprocator and my addition programs have 1:1 results with an actual java-based floating point calculator, but my multiplication function tends to give outputs that are just 0x1 off from the expected output. I debugged it to see why and it turns out the error occurs during the umull instruction. I can't do anything about that :\ |
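On the "0x1 off" multiplication results: umull itself returns the exact 64-bit product, so one possible explanation is that the low bits of the mantissa product are being truncated instead of rounded to nearest. That is only a guess without seeing the routine. The Python sketch below multiplies two float32 bit patterns both ways for comparison; `mul_f32_bits` and the helper names are invented for this sketch, and it handles normal, finite inputs only (no zero, denormal, infinity or NaN).

```python
import struct


def f32_to_bits(x):
    """Round a Python float to float32 and return its bit pattern."""
    return struct.unpack("<I", struct.pack("<f", x))[0]


def bits_to_f32(bits):
    """Interpret a 32-bit pattern as a float32 value."""
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def mul_f32_bits(a, b, round_nearest=True):
    """Multiply two normal float32 bit patterns using integer arithmetic."""
    sign = (a ^ b) & 0x80000000
    exp = ((a >> 23) & 0xFF) + ((b >> 23) & 0xFF) - 127
    ma = (a & 0x7FFFFF) | 0x800000          # restore the implicit leading 1
    mb = (b & 0x7FFFFF) | 0x800000
    prod = ma * mb                          # exact 48-bit product (what umull provides)
    shift = 23
    if prod >> 47:                          # product in [2, 4): renormalise
        shift = 24
        exp += 1
    mant = prod >> shift                    # truncated 24-bit mantissa
    rem = prod & ((1 << shift) - 1)         # bits dropped by the shift
    half = 1 << (shift - 1)
    if round_nearest and (rem > half or (rem == half and mant & 1)):
        mant += 1                           # round to nearest, ties to even
        if mant >> 24:                      # rounding carried out of the mantissa
            mant >>= 1
            exp += 1
    return sign | (exp << 23) | (mant & 0x7FFFFF)
```

With `round_nearest=False` the result can come out one unit lower in the last place than a reference calculator, which would match the symptom described.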
Xenesis Posts: 168/200 |
Zeld, that worked like a treat. Now to actually go about writing my new routine.
Here's the jump code:
B401 push {r0}
2087 mov r0, #0x87
0500 lsl r0, r0, #0x14
1c40 add r0, r0, #0x1
4687 mov pc, r0
That jumps to address 08700001, the pc increments by 1 and it continues in THUMB mode. |
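As a cross-check on the hex above, here is a small Python sketch that builds the five THUMB opcodes from their bit fields and works out where the jump lands. The helper names are made up for this sketch, and it only covers the instruction forms used in this stub.

```python
def push(rlist, lr=False):
    """PUSH {rlist[, lr]}: 1011 010R rrrrrrrr."""
    return 0xB400 | (int(lr) << 8) | (rlist & 0xFF)


def mov_imm(rd, imm8):
    """MOV Rd, #imm8: 00100 ddd iiiiiiii."""
    return 0x2000 | (rd << 8) | (imm8 & 0xFF)


def lsl_imm(rd, rs, shift):
    """LSL Rd, Rs, #shift: 00000 sssss rrr ddd."""
    return ((shift & 0x1F) << 6) | (rs << 3) | rd


def add_imm3(rd, rs, imm3):
    """ADD Rd, Rs, #imm3: 0001110 iii rrr ddd."""
    return 0x1C00 | ((imm3 & 7) << 6) | (rs << 3) | rd


def mov_hi(rd, rs):
    """MOV Rd, Rs with high registers allowed: 01000110 H1 H2 rrr ddd."""
    return 0x4600 | ((rd >> 3) << 7) | ((rs >> 3) << 6) | ((rs & 7) << 3) | (rd & 7)


def jump_stub(imm8, shift):
    """Return (opcodes, value loaded into r0, address execution resumes at)."""
    opcodes = [
        push(0b00000001),        # push {r0}
        mov_imm(0, imm8),        # mov r0, #imm8
        lsl_imm(0, 0, shift),    # lsl r0, r0, #shift
        add_imm3(0, 0, 1),       # add r0, r0, #1
        mov_hi(15, 0),           # mov pc, r0
    ]
    r0 = ((imm8 << shift) + 1) & 0xFFFFFFFF
    return opcodes, r0, r0 & ~1  # bit 0 is ignored when the value lands in pc
```

`jump_stub(0x87, 0x14)` gives the same five halfwords as the listing, with r0 = 0x08700001 and execution continuing at 0x08700000.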
MathOnNapkins Posts: 1083/1106 |
^ You can do that with ARM, but I'm pretty sure you can't do that with THUMB, and the poster is having problems with THUMB jumps in particular. |
HyperHacker Posts: 4952/5072 |
That's clever. Doesn't the mov instruction allow shifting though? So instead of:
mov r0, #0x9
lsl r0, r0, #0x18
you could just do:
mov r0, #0x9 LSL #0x18
or however that's written. Maybe I'm just tired but I seem to recall ARM supporting that. |
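For reference, in ARM mode (not THUMB) a data-processing immediate is an 8-bit value rotated right by an even amount, so a constant like 0x09000000 can go into a single mov. The Python sketch below checks whether a 32-bit constant fits that form; `arm_immediate_encoding` is a name invented for this sketch.

```python
def arm_immediate_encoding(value):
    """Return (imm8, rot) such that imm8 rotated right by 2*rot equals value, or None."""
    value &= 0xFFFFFFFF
    for rot in range(16):
        n = 2 * rot
        # Undo the rotate-right by rotating the value left by n bits.
        imm8 = ((value << n) | (value >> (32 - n))) & 0xFFFFFFFF
        if imm8 <= 0xFF:
            return imm8, rot
    return None
```

0x09000000 and 0x08700000 both fit, while 0x08700001 (with the THUMB bit added) does not, which is one reason the add is a separate instruction.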
Xenesis Posts: 167/200 |
Zeld, that's so simple it's insanely stupid. I approve. And it wouldn't be too hard to get a range of addresses to jump to anyhow. Just add an offset to the jump address after you lsl it.
*goes to fiddle* |
Zeld Posts: 47/53 |
A bit late for this, but I just thought of something neat:
Hector of Chad: Wait
Hector of Chad: I've got it
Hector of Chad: push {r0}
Hector of Chad: mov r0, #0x[somebyte]
Hector of Chad: lsl r0, r0, #0x[some shift amount]
Hector of Chad: mov pc, r0
Hector of Chad: You'd only be able to go to a small handful of addresses
Hector of Chad: but if you used it to jump to the end of the ROM
Hector of Chad: and made a loop that determines where you branched from based on LR
Hector of Chad: You could insert assembly hacks without losing registers and with minimum necessary space in the original routine to branch from
So, like,
push {r0}
mov r0, #0x9
lsl r0, r0, #0x18
mov pc, r0
That takes you to the end of the ROM, preserves all registers, and only requires 4 instructions out of the original routine, which should be easy enough to find, right? The LR loop check is inefficient, but the preservation is nice. I haven't had any problems with any games I've changed that were caused by having too long of a custom sub routine. Any ideas of improvement?
Edit: Arg, I forgot about the thumb bit. Well, that would cost an add r0, #0x1. Only one instruction...hopefully not going to be a problem. Not like you can't just go there in ARM mode and switch back to thumb, to handle situations where you can't get in 5 replacement opcodes but can do 4. |
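To put a number on the "small handful of addresses", here is a Python sketch that enumerates every value a `mov r0, #imm8` / `lsl r0, r0, #shift` pair can produce inside the cartridge ROM window. The window bounds (0x08000000 up to, but not including, 0x0A000000) are an assumption made for this sketch, as is the function name.

```python
ROM_START = 0x08000000   # assumed start of the ROM window
ROM_END = 0x0A000000     # assumed end (exclusive), i.e. a 32 MB window


def reachable_targets():
    """Map each reachable ROM address to one (imm8, shift) pair that produces it."""
    targets = {}
    for imm8 in range(1, 256):          # 0 is useless as a jump target
        for shift in range(32):
            addr = (imm8 << shift) & 0xFFFFFFFF   # lsl drops bits shifted past bit 31
            if ROM_START <= addr < ROM_END:
                targets.setdefault(addr, (imm8, shift))
    return targets
```

Under those assumptions the pair reaches 32 addresses, one per megabyte boundary, including 0x09000000 (0x9 shifted by 0x18) and 0x08700000 (0x87 shifted by 0x14). Adding a small offset after the shift widens that range.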
Spikeman Posts: 6/8 |
Since you can't pop lr, you should do something like this:
push {lr}
ldr r0,[jump]
mov lr,pc
bx r0
b return
jump @dcd 0x8F00000|1 ; ORing 1 keeps it in THUMB, this is somewhere at the end of the ROM
return
pop {r0} ; lr can't be popped directly in THUMB, so restore it through r0
mov lr,r0 |
Zeld Posts: 46/53 |
The alternate is actually the standard...you can't pop LR in thumb mode.
You could use a load SP relative as an alternate...and decrement it yourself to emulate a full pop (it is decrement, right? I don't pay much attention to that). |
Dwedit Posts: 113/116 |
Usually you push and pop LR. Alternatively, push lr and pop pc. |
Xenesis Posts: 158/200 |
Thanks Spikeman, that helps me a lot.
Although, if I do need to preserve the lr, how would I do that? Just copy the current lr contents to another register, and then move it back into the lr when I've finished my routine? |
Spikeman Posts: 5/8 |
This is generally how I do it:
ldr r0,[jump]
mov lr,pc
bx r0
b return
jump @dcd 0x8F00000|1 ; ORing 1 keeps it in THUMB, this is somewhere at the end of the ROM
return
It's a little different if you need to preserve lr, or don't have a free register, and even harder if you don't have enough space (for example, the routine you want to jump from only has like two instructions and code jumps into it at different places). |
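The reason `mov lr,pc` works as a return address here is that reading pc in THUMB gives the address of the current instruction plus 4. The Python sketch below lays the four instructions out at a hypothetical base address (0x08001000 is made up) and checks where lr ends up pointing.

```python
THUMB_INSN_SIZE = 2        # the THUMB instructions in this stub are 2 bytes each
THUMB_PC_READ_OFFSET = 4   # reading pc in THUMB yields current instruction address + 4


def layout(base, mnemonics):
    """Assign consecutive halfword addresses to a list of THUMB instructions."""
    return {name: base + i * THUMB_INSN_SIZE for i, name in enumerate(mnemonics)}


def lr_after_mov_lr_pc(addrs):
    """Value placed in lr by `mov lr,pc`, given the instruction addresses."""
    return addrs["mov lr,pc"] + THUMB_PC_READ_OFFSET


stub = ["ldr r0,[jump]", "mov lr,pc", "bx r0", "b return"]
addrs = layout(0x08001000, stub)   # hypothetical location of the hook
lr_value = lr_after_mov_lr_pc(addrs)
```

lr ends up holding the address of `b return`, so the hack comes back to the instruction right after `bx r0`. Note that the value has bit 0 clear, so a hack that returns with `bx lr` would come back in ARM state, whereas `mov pc, lr` stays in THUMB.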