Acmlm's Board - I3 Archive -- Programming: Division in Assembly

~~(Link to AcmlmWiki)~~ Offline: thank ||bass

Register \| Login
Views: 13,040,846	Main \| Memberlist \| Active users \| Calendar \| Chat \| Online users Ranks \| FAQ \| ACS \| Stats \| Color Chart \| Search \| Photo album	05-03-24 02:22 PM

0 users currently in Programming.

Acmlm's Board - I3 Archive - Programming - Division in Assembly

New poll |

Add to favorites | Next newer thread | Next older thread

User

Post

Zeld

Red Paragoomba

Since: 11-05-06

Last post: 6287 days
Last view: 6284 days

Posted on 01-20-07 05:04 PM

Link | Quote

This is bugging the crap out of me. I can think of ways to do multiplication in assembly, but I have no clue how to program division. I haven't seen a division function for the Gameboy advance ARM7 processor, and I'm worried that when I get around to making assembly hacks that require division I won't be able to obtain enough accuracy.

So far I can get .875 from dividing 9 by 10. That's sort of accurate, but I imagine it's terrible accuracy in comparison to a more standard method of division.

How do processor creators implement their division functions? Keep in mind this only has to handle integer division. Surely there's some semi-universal method for programming integer division on a processor using loops and shifts, etc...

I can't find anything with Google, but then again I'm probably not typing the right thing into the search engine.

Edit: I sort of thought of a way to do a small floating point implementation and acquired "0.884765625" for dividing 9 by 10. Perhaps if I extended the length after the "decimal" point from 2 digits to 4 I could obtain something more accurate...I still can't help but think that I am doing this the hard way, though not necessarily the wrong way.

I haven't actually translated any of the stuff I'm jotting down into an assembly script. I'm writing math operations on paper that can be executed easily in assembly, but I haven't actually written any assembly. So, I can't give a source to show what I'm doing, which means I can't really get any constructive criticism. :\

Edit: I saw a problem with my pseudo decimal point and multiplied by 256/100 to convert it properly. Except 256/100 contains a grotesque division in itself, so I just decided to use a lsl 1 + lsr 1, which is the same as multiplying by 2.5. Was that bad of me? I get 0.8994140625 now. That should be close enough, right?

Edit: I decided to write out my idea in assembly for reviewing. I would include a text file attachment, but for some reason there's no apparent way to add attachments to posts in the editing window.

R00=00000009 R04=00000000
R01=0000000a R05=00000000
R02=00000001 R06=00000000
R03=00000000 R07=00000000

R00 is nominator
R01 is original denominator
R02 is new denominator
R03 is "float" for R00

* lsl r2, r2, #0x1
add r4, r4, #0x1
cmp r2, r1
blt $*
lsl r0, r4
add r5, r0, #0x0
** cmp r1, r5
bgt $***
sub r5, r5, r1
add r6, r6, #0x1
b $**
*** mov r7, #0x64
mul r5, r7
**** sub r5, r5, r1
add r3, r3, #0x1
cmp r1, r5
ble $****
asr r7, r3, #0x1
lsl r3, r3, #0x1
add r3, r3, r7
lsl r2, r2, #0x8
add r4, r4, #0x8
lsl r6, r6, #0x8
add r6, r3, r6

R00=00000090 R04=0000000c
R01=0000000a R05=00000000
R02=00001000 R06=00000e64
R03=00000064 R07=00000014

At this point, asr r6, r4 equals approximately 9/10, but it will be truncated to 0. As you can see by the last bit of code (for those of you who understand assembly and can make enough sense of ARM7 even if it's not one of the assembly languages you're familiar with) this script is incomplete. It's my attempt at stepping in the right direction. I'm sure there's much more efficient methods of division, but this is the one I came up with. If I wanted to use the number 9/10 to get 90% of something, I could use this program. Unfortunately, mul r3, r6; asr r3, r4 would get me 89. It should give me 90, because 9/10 of 100 is 90. But no. I would get 89. Damned shame...

Edit: I tested the results of trying 7 divided by 9 with this program and it seems I have approximately a 1% error (it was .99%). That may be terrible by today's standards, but I like it.

(edited by Zeld on 01-20-07 11:44 AM)
(edited by Zeld on 01-20-07 12:10 PM)
(edited by Zeld on 01-20-07 01:03 PM)
(edited by Zeld on 01-20-07 01:18 PM)

Dwedit

Rope
フクトオンフォニクス

Since: 11-17-05
From: Chicago!

Last post: 6285 days
Last view: 6284 days

Posted on 01-20-07 08:33 PM

Link | Quote

Usually for division by a constant, you use "fixed point" division, where you multiply by the reciprocal, then shift right.

ARM7 supports multiplying two 32-bit numbers into a 64-bit number. An easy way to divide:

So for example, to divide 21 by 3:
Start with 0xFFFFFFFF
divide by 3 (0x55555555)
Add 1 (0x55555556)
multiply by 21 (0x7 0000000E)
Your quotient is the high register of the 64-bit output. If you need to do a lot of divisions by 3, you can just keep multiplying by 0x55555556.
You can use less bits if you want. If you use 4 hex digits (0x5556) instead of 8, your result will be accurate for numbers up to 32766* , and you shift right by 16 (divide by 0x10000) instead of taking the higher register.

*(Just for dividing by 3, the maximum accurate number varies with what you divide by)

This stuff only applies to dividing by constants. And the operation here is Integer Division. You can still use the inaccurate lower bits as part of a fractional component, but they are inaccurate.

(edited by Dwedit on 01-20-07 02:34 PM)

Zeld

Red Paragoomba

Since: 11-05-06

Last post: 6287 days
Last view: 6284 days

Posted on 01-20-07 09:56 PM

Link | Quote

I don't understand what to do with the 21 times 0x55555556, nor do I see where your mention of multiplying two 32 bit numbers together takes place...

Also, 21/3 isn't as big a deal as 9/10...9/10 is less than 1.

I had a look at the IEEE floating point system and I understood most of it. However, I didn't understand how 1.01 in binary is equal to 1.25 in decimal. This could be owed to the fact that the concept of having a decimal number like 1.01 in binary escapes me. If there's a way of writing 1.01 in binary already, why even make a floating point system? It looks like a circular reference to me...using a float to create a system for creating floats. Yeah, and 1=2.

Dwedit

Rope
フクトオンフォニクス

Since: 11-17-05
From: Chicago!

Last post: 6285 days
Last view: 6284 days

Posted on 01-20-07 11:00 PM

Link | Quote

My post was describing how to do integer division without a divide instruction, and does not apply to just situations where you want fractional parts to a number.

in that post, I multipled 21 by 0x55555556, and got 0x70000000E. Discard last 8 digits, and you're left with 7, which is the quotient.

Dividing by 3 in ARM asm:
mov r0,#21
ldr r1,=0x55555556
umull r2,r3,r0,r1
r3 contains the result

Multiplying by 9/10 in ARM asm:
mov r0,#200
ldr r1,=0xE6666667
umull r2,r3,r0,r1
r3 contains the result

(edit: I just reread your first post, and it looks like you just wanted integer division, so the content below does not apply, but it's still interesting anyway. And integer diving 9/10 will always give you zero, so you might really want to know about fixed point fractional numbers)

If you want fractional numbers, the tradition is to use so-called "fixed point" numbers. There really is nothing special about them, you just keep them stored as 32-bit numbers, then shift a bunch of bits off every time you want to display the number.

You pick how many bits you want for the integer part, then how many bits you want for the fractional part. For example, you could pick 8 bits for the integer, then 24 bits for the fraction. The number would be in hex as 0xXXYYYYYY, where XX is the integer part, and YYYYYY is the fractional part. 1/2 would be 0x00800000, 1/3 would be 0x00555555, 9/10 would be 0x00E66666, etc.

If you want to get the integer part, you shift it 24 bits right. Possibly after adding 0x00800000 to round it to the nearest whole number.

To add two fixed point numbers, just add them. To multiply by an integer, just multiply it. To multiply two fixed point numbers, multiply, then shift right 48 bits. (48 because we picked 24 bits for the fractional part). To add an integer to a fixed point number, shift the integer left 24 bits before adding it.

This example fixed point number format would have a range of -128.999999 to 127.999999. With 24 bits of fractional part, it gets about 7 digits correct after the decimal point.

In short, fixed point numbers are basically integers. You just have to remember that they are scaled by a certain number of bits.

(edited by Dwedit on 01-20-07 05:02 PM)
(edited by Dwedit on 01-20-07 05:09 PM)
(edited by Dwedit on 01-20-07 05:17 PM)

Black Lord +

Flurry

Since: 11-17-05
From: Where indians still roam...

Last post: 6285 days
Last view: 6286 days

Posted on 01-20-07 11:23 PM

Link | Quote

Originally posted by Zeld

I had a look at the IEEE floating point system and I understood most of it. However, I didn't understand how 1.01 in binary is equal to 1.25 in decimal. This could be owed to the fact that the concept of having a decimal number like 1.01 in binary escapes me. If there's a way of writing 1.01 in binary already, why even make a floating point system? It looks like a circular reference to me...using a float to create a system for creating floats. Yeah, and 1=2.

1.01

Can be translated to:

1x2^1 + 0x2^-1 + 1x2^-2

1 + 0 + 1/4

1 1/4

1.25

Makes perfect sense to me.

Zeld

Red Paragoomba

Since: 11-05-06

Last post: 6287 days
Last view: 6284 days

Posted on 01-21-07 12:17 AM

Link | Quote

Oh, right. Yes, that does make perfect sense. I just plain didn't see it as powers of 2 when there was a scary decimal point distracting me.

Except that first 1 be multiplied by 2^0. Anyhow, thanks for clearing that up.

Edit: oops, I missed Dwedit's other post.

The fixed point system is actually what I was going for. 9/10 may be truncated to 0, but the rounding method would give the number meaning. Not to mention I could still use it for multiplying by an integer, which is what I planned to do, after which shifting off the bits representing fractions wouldn't leave the result as a meaningless 0. Multiplying by 100 and then shifting would actually yield a percentage.

Also, I understand the 64 bit stuff now. I didn't at first because when I was calculating the result in microsoft calculator, I forgot to change the product's size to 64 bits before pressing equal (I wanted to see it myself and was following through your instructions). After realizing that mistake I now see how the high register contains the result, but I'm a bit miffed that the opcode you speak of requires a branch change to ARM. I haven't done anything in ARM yet (never needed to) so it will take longer for me to program in it. Of course, I can just use less bits for the fraction in order to do the same thing for thumb, yes? Like, mul r0, r1 where r0 is a number like 3c and r1 is 10/9 (which is 11C71C7 in a 24 bit system, right?) I get 0x42(6 digits of crap), shift it 24 bits to the right, and wind up with 111% of 60: 66. I'm going to need this for a skill some folks at FESS want me to put into a GBA Fire Emblem game (the original skill is in the Gamecube Fire Emblem).

(edited by Zeld on 01-20-07 07:39 PM)
(edited by Zeld on 01-20-07 07:48 PM)

Black Lord +

Flurry

Since: 11-17-05
From: Where indians still roam...

Last post: 6285 days
Last view: 6286 days

Posted on 01-21-07 01:58 AM

Link | Quote

Meh... Just hurried my post together... I shall fix that error.

Zeld

Red Paragoomba

Since: 11-05-06

Last post: 6287 days
Last view: 6284 days

Posted on 01-21-07 02:13 AM

Link | Quote

What do I do if I want to divide by a variable? I was thinking I could make a table of the fractions 1/X where X has a domain of 1 through 256, and then the variable would be used to call the fraction. This seems a tad impractical, and doesn't solve the issue of having a floating point that is being divided by a varying floating point. "It's a gameboy. Why do you need floating point math?" I don't, but I want to know this stuff. I'd like to get into N64 hacking soon, too. Seems like there aren't enough people into N64 hacking that spend more time doing than they do dreaming.

Dwedit

Rope
フクトオンフォニクス

Since: 11-17-05
From: Chicago!

Last post: 6285 days
Last view: 6284 days

Posted on 01-21-07 06:00 AM

Link | Quote

Don't forget the GBA's bios functions.
SWI 6
Signed Division, r0/r1.

It's slightly slower than some available division routines, but it's easy to call, and you don't need to provide your own division routine. You can do division and multiplication directly to fixed point numbers, just watch out for that sign bit. Since fixed point numbers save fraction info, you don't necessarily need to multiply before you divide like with integers.

HyperHacker

Star Mario
Finally being paid to code in VB! If only I still enjoyed that. <_<
Wii #7182 6487 4198 1828

Since: 11-18-05
From: Canada, w00t!
My computer's specs, if anyone gives a damn.
STOP TRUNCATING THIS >8^(

Last post: 6284 days
Last view: 6284 days

Posted on 01-21-07 06:13 AM

Link | Quote

FYI, the N64 has an actual floating point processor, so you hardly need to know anything about how FP actually works. Hell, unless you're rewriting the game's physics I've found there's almost no need to tinker with FP math on the N64 at all. Some games may use it often but the ones I've worked with generally stick to integer math most of the time. I imagine it's a bit faster.

Zeld

Red Paragoomba

Since: 11-05-06

Last post: 6287 days
Last view: 6284 days

Posted on 01-21-07 04:28 PM

Link | Quote

Interdpth told me about SWI 6 and 7 last night. I felt kind of ridiculous, having read through GBATEK so much and not even seeing those SWI's.

Interdpth suggested I read through the BIOS to understand how the BIOS itself handles division, but I don't know where to look in the BIOS. How does "SWI $6" translate to a BIOS execution address?

Edit: Never mind. I took a random ROM and screwed it up to get stuck in an infinite loop of mov r8, r8 and branch back to the mov r8, r8 (in thumb mode). Then I changed it a bit to include the SWI and just spammed the Next button in the disassembler until it took me to the function. Good stuff. Well, it's good...if you ignore how inefficient it is. I can only imagine how many cycles would be wasted when dividing by large numbers...

Edit: I rewrote my crummy assembly script. I imagine this is far more optimized than what the BIOS uses:

Here, r0 is the dividend and r1 is the divisor.

** cmp r0, r1
blt $*
sub r0, r0, r1
add r2, #0x1
b $***
* add r4, r0, #0x0 ; The rest is optional and generates a float by use of the remainder
mov r0, #0x0
sub r0, #0x1*** cmp r0, r1
blt $****
sub r0, r0, r1
add r3, #0x1
b $***
**** add r3, #0x1 ; There, now it rounds up
mul r3, r4
r2r3=Quotient.Fraction

Edit: I did some crappy math, and it looks to me like this function would take 90 seconds to calculate a division by 3. I'm worried about this...my only thought of a solution was to convert 3 to 2.999999999 to speed up the subtraction loop process.

Edit: No, that's a lame solution to. I thought of a better way (dividing a number, such as 3, into each digit separately, then shifting the remainder to emulate an actual carriage like they teach in grade school v_v).

Edit: Did I say digit? I should say "bit", as it will be much more efficient that way.

(edited by Zeld on 01-21-07 01:52 PM)
(edited by Zeld on 01-21-07 03:14 PM)
(edited by Zeld on 01-21-07 04:20 PM)
(edited by Zeld on 01-21-07 09:28 PM)
(edited by Zeld on 01-22-07 12:25 PM)
(edited by Zeld on 01-22-07 12:39 PM)

Zeld

Red Paragoomba

Since: 11-05-06

Last post: 6287 days
Last view: 6284 days

Posted on 01-25-07 05:52 AM

Link | Quote

I apologize for double posting, but if I simply edit my post, no one will think I've appended my progress as of late.

I completely ditched the whole "8 bits for the integer, 24 bits for the float" crap. It was too limited.

I decided to rewrite my entire division algorithm from scratch to make it calculate the reciprocal of any IEEE 32 bit float. After much testing, I have finally gotten it to produce consistently accurate results when compared to an online float calculator on an IEEE website.

So far I have tested the following numbers:

2
3
4
5
329402312

That last one was completely random. I wanted to make sure my calculation was precise regardless of the complexity of the input, and according to my program, it works perfectly. All 5 of these inputs provided outputs that match the reciprocals of these numbers in 32 bit IEEE float format, as provided by the IEEE calculator I was using for reference. It can handle negative numbers easily, but I haven't tested my program's ability to reciprocate fractions. I imagine it can handle those with equal success.

The best part is, I had no source to view from an actual floating point processor, so I pretty much had to figure most of it out myself. Things like adding 0x800000 to the mantissa before dividing it into 1*2^46, to account for the implied 1.

Unfortunately, I just realized my program can only compute normalized numbers. I don't think I want to add denormalized support, because I can just feel how ridiculously painful that would be in relation to such little pay off.

Then again, I see a relationship between the denormalized number I tested, the incorrect output I received, and the correct output I should have received. Perhaps I CAN fix it...except, I'm tired and cranky right now.

Edit: I decided to treat denormalized numbers as "0", as a form of rounding and simplification. My program seems to only have one issue left (I need to figure out how to make it return "0" when a denormalized number is generated). I may not have to implement this if I remind myself not to use the program for reciprocating overly large numbers, although I'm curious as to how I could even input such numbers...

Edit: I thought this would be fun to share:
Input: 0x83759847 - What I typed into my Gameboy Advance
Output: 0xfb856c48 - What my Gameboy Advance typed into my eyes
Expected output: 0xfb856c49 - What it should have typed instead
Difference: Decimal 158456325000000000000000000000 - Big number?
% error: Decimal 0.000011436401767426803224837107833072 - And yet, a tiny percent error XD

(edited by Zeld on 01-25-07 12:46 PM)
(edited by Zeld on 01-25-07 01:08 PM)

Add to favorites | Next newer thread | Next older thread

Acmlm's Board - I3 Archive - Programming - Division in Assembly

ABII

Acmlmboard 1.92.999, 9/17/2006
©2000-2006 Acmlm, Emuz, Blades, Xkeeper

Page rendered in 0.049 seconds; used 422.86 kB (max 526.09 kB)