Number Systems
Positional Number System
In a positional number system, we have
radix (or base)
radix point (.)
numerals made of digits (0 to r-1)
each digit has its weight
In general, the generic number system with radix r is
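One common way to write such a number and its value is sketched below. The notation is an assumption, not taken from the source: $n$ integer digits and $m$ fractional digits, each labelled $a_i$.

```latex
A = (a_{n-1} a_{n-2} \dots a_1 a_0 \,.\, a_{-1} a_{-2} \dots a_{-m})_r ,
\qquad
\text{value}(A) = \sum_{i=-m}^{n-1} a_i \, r^{i} ,
\qquad
0 \le a_i \le r-1 .
```

Each digit $a_i$ contributes its own value times the weight $r^i$ of its position.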

Radix Conversion
Binary, Octal, Hex to Decimal
This direction is straightforward. To convert from any radix r to decimal, we just calculate the weighted sum of all digits.
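As a minimal sketch of the weighted sum in Python (the names `to_decimal` and `_digit` are hypothetical, and radices up to 16 are assumed):

```python
from fractions import Fraction

DIGITS = "0123456789ABCDEF"  # assumed digit alphabet, radix up to 16


def _digit(ch: str, r: int) -> int:
    """Return the value of one digit character, checking it is valid in radix r."""
    d = DIGITS.index(ch)  # raises ValueError for unknown characters
    if d >= r:
        raise ValueError(f"digit {ch!r} is not valid in radix {r}")
    return d


def to_decimal(numeral: str, r: int) -> Fraction:
    """Return the exact value of a radix-r numeral (optionally with a '.')."""
    int_part, _, frac_part = numeral.upper().partition(".")
    value = Fraction(0)
    # Integer digits carry weights r^0, r^1, ... counted from the radix point.
    for i, ch in enumerate(reversed(int_part)):
        value += _digit(ch, r) * Fraction(r) ** i
    # Fractional digits carry weights r^-1, r^-2, ...
    for i, ch in enumerate(frac_part, start=1):
        value += _digit(ch, r) * Fraction(r) ** (-i)
    return value
```

For instance, `to_decimal("101.101", 2)` gives `Fraction(45, 8)`, that is 5.625. `Fraction` is used only to keep the fractional weights exact.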
Decimal to Binary, Octal, Hex
This conversion is split into two parts:
Integer part
Fractional part
Integer Part
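We repeatedly divide by r and record the remainder at each step; the remainders, read from last to first, are the digits of the result. Below is an example of converting a decimal integer to its binary equivalent: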
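The repeated-division procedure can be sketched in Python as follows (the function name `int_to_radix` is hypothetical, and radices up to 16 are assumed):

```python
def int_to_radix(n: int, r: int) -> str:
    """Convert a non-negative decimal integer to a radix-r numeral."""
    digits = "0123456789ABCDEF"
    if n == 0:
        return "0"
    out = []
    while n > 0:
        n, rem = divmod(n, r)    # divide by r, keep the remainder
        out.append(digits[rem])  # remainders come out least significant first
    return "".join(reversed(out))
```

For example, `int_to_radix(13, 2)` produces the remainders 1, 0, 1, 1 in that order, which read backwards give `"1101"`.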

Fractional Part
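Similarly, we repeatedly multiply by r and record the integer part at each step; the integer parts, read from first to last, are the fractional digits of the result. Below is an example of converting a decimal fraction to its binary equivalent: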
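A minimal Python sketch of the repeated-multiplication procedure is given below. The name `frac_to_radix` and the `max_digits` cut-off are assumptions. The cut-off is needed because some fractions, such as decimal 0.1 in binary, never terminate.

```python
from fractions import Fraction


def frac_to_radix(x: Fraction, r: int, max_digits: int = 12) -> str:
    """Convert a fraction 0 <= x < 1 to a radix-r numeral, truncated to max_digits."""
    digits = "0123456789ABCDEF"
    out = []
    while x != 0 and len(out) < max_digits:
        x *= r
        d = int(x)             # the integer part is the next digit
        out.append(digits[d])
        x -= d                 # keep only the fractional part
    return "0." + ("".join(out) if out else "0")
```

For example, `frac_to_radix(Fraction(5, 8), 2)` yields the integer parts 1, 0, 1 and returns `"0.101"`.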

Conversion among Hex, Octal and Binary
Hex <-> Binary
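Hex -> Binary: each hex digit expands to 4 bits
Binary -> Hex: group the bits in fours, starting at the radix point and padding with zeros where needed; each group becomes 1 hex digit
For example, converting a fractional hex number to binary: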
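Both directions can be sketched in Python as below. The function names `hex_to_bin` and `bin_to_hex` are hypothetical, and the input is assumed to be an unsigned numeral with at most one radix point.

```python
HEX = "0123456789ABCDEF"


def hex_to_bin(h: str) -> str:
    """Expand each hex digit to 4 bits, leaving the radix point in place."""
    return "".join(
        "." if ch == "." else format(HEX.index(ch), "04b") for ch in h.upper()
    )


def bin_to_hex(b: str) -> str:
    """Group bits in fours outward from the radix point and map each group."""
    int_part, _, frac_part = b.partition(".")
    int_part = "0" * (-len(int_part) % 4) + int_part     # pad on the left
    frac_part = frac_part + "0" * (-len(frac_part) % 4)  # pad on the right

    def to_hex(s: str) -> str:
        return "".join(HEX[int(s[i:i + 4], 2)] for i in range(0, len(s), 4))

    return to_hex(int_part) + ("." + to_hex(frac_part) if frac_part else "")
```

For example, `hex_to_bin("0.A8")` returns `"0000.10101000"`, and `bin_to_hex("0.101")` pads the fraction to `1010` and returns `"0.A"`.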

Octal <-> Binary
This is similar to Hex <-> Binary, except that 1 octal digit = 3 binary bits, so the grouping is done in groups of 3 bits.
Hex <-> Octal
Use binary as an intermediate step
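A rough Python sketch of the two-step route, expanding to binary and then regrouping, is shown below. The helper names `expand`, `regroup`, `hex_to_octal` and `octal_to_hex` are hypothetical.

```python
DIGITS = "0123456789ABCDEF"


def expand(numeral: str, k: int) -> str:
    """Expand each digit to k bits (k = 4 for hex, k = 3 for octal)."""
    return "".join(
        "." if ch == "." else format(DIGITS.index(ch), f"0{k}b")
        for ch in numeral.upper()
    )


def regroup(bits: str, k: int) -> str:
    """Group bits in k's outward from the radix point; one digit per group."""
    int_part, _, frac_part = bits.partition(".")
    int_part = "0" * (-len(int_part) % k) + int_part     # pad on the left
    frac_part = frac_part + "0" * (-len(frac_part) % k)  # pad on the right

    def conv(s: str) -> str:
        return "".join(DIGITS[int(s[i:i + k], 2)] for i in range(0, len(s), k))

    whole = conv(int_part).lstrip("0") or "0"  # drop redundant leading zeros
    frac = conv(frac_part).rstrip("0")         # drop redundant trailing zeros
    return whole + ("." + frac if frac else "")


def hex_to_octal(h: str) -> str:
    return regroup(expand(h, 4), 3)


def octal_to_hex(o: str) -> str:
    return regroup(expand(o, 3), 4)
```

For example, `hex_to_octal("A.8")` goes through the binary form `1010.1000` and returns `"12.4"`; both numerals denote decimal 10.5.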
Binary Arithmetic
Addition
Binary addition works column by column, just like the decimal addition we learned in primary school, except that a carry is produced whenever a column sum reaches 2.
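The column-by-column rule can be sketched in Python on bit strings (the function name `add_binary` is hypothetical; inputs are assumed to be unsigned):

```python
def add_binary(a: str, b: str) -> str:
    """Add two unsigned binary strings column by column, right to left."""
    width = max(len(a), len(b))
    a, b = a.zfill(width), b.zfill(width)
    carry = 0
    out = []
    for x, y in zip(reversed(a), reversed(b)):
        total = int(x) + int(y) + carry
        out.append(str(total % 2))  # sum bit of this column
        carry = total // 2          # carry into the next column
    if carry:
        out.append("1")             # final carry becomes a new leading bit
    return "".join(reversed(out))
```

For example, `add_binary("1011", "0110")` returns `"10001"` (11 + 6 = 17).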

Multiplication
Multiplication follows the same long-multiplication procedure as in the decimal system, but it is simpler: each partial product is either 0 or a shifted copy of the multiplicand. In the binary system, multiplication is done by
shift
then add
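A minimal shift-and-add sketch in Python, assuming non-negative integers (the name `multiply_shift_add` is hypothetical):

```python
def multiply_shift_add(a: int, b: int) -> int:
    """Multiply two non-negative integers using only shifts and additions."""
    product = 0
    shift = 0
    while b:
        if b & 1:                  # current multiplier bit is 1
            product += a << shift  # add the shifted multiplicand
        b >>= 1                    # move to the next multiplier bit
        shift += 1
    return product
```

For example, `multiply_shift_add(0b1011, 0b101)` adds `1011` and `1011 << 2` and returns 55.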

Subtraction
Subtraction in binary works exactly like the decimal subtraction we learned in primary school: subtract column by column and borrow from the next column when needed.
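The borrow rule can be sketched in Python on bit strings. The name `sub_binary` is hypothetical, and the sketch assumes unsigned operands with the first at least as large as the second.

```python
def sub_binary(a: str, b: str) -> str:
    """Compute a - b for unsigned binary strings with a >= b."""
    width = max(len(a), len(b))
    a, b = a.zfill(width), b.zfill(width)
    borrow = 0
    out = []
    for x, y in zip(reversed(a), reversed(b)):
        diff = int(x) - int(y) - borrow
        borrow = 1 if diff < 0 else 0    # borrow from the next column
        out.append(str(diff + 2 * borrow))
    if borrow:
        raise ValueError("a must be greater than or equal to b")
    return "".join(reversed(out)).lstrip("0") or "0"
```

For example, `sub_binary("1101", "0110")` returns `"111"` (13 - 6 = 7).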

Division
Binary division follows the same long-division procedure as decimal division. For example,
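Long division can be sketched in Python as follows. The name `divide_binary` is hypothetical, operands are assumed to be unsigned, and Python 3.9 or later is assumed for the `tuple[str, str]` annotation.

```python
def divide_binary(dividend: str, divisor: str) -> tuple[str, str]:
    """Long division on unsigned binary strings; returns (quotient, remainder)."""
    d = int(divisor, 2)
    if d == 0:
        raise ZeroDivisionError("division by zero")
    remainder = 0
    quotient = []
    for bit in dividend:
        remainder = (remainder << 1) | int(bit)  # bring down the next bit
        if remainder >= d:
            remainder -= d                       # divisor fits: quotient bit 1
            quotient.append("1")
        else:
            quotient.append("0")                 # divisor does not fit: bit 0
    return "".join(quotient).lstrip("0") or "0", format(remainder, "b")
```

For example, `divide_binary("1101", "10")` returns `("110", "1")`, that is 13 / 2 = 6 remainder 1.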

Recap
We may notice that binary arithmetic is built on two kinds of operation: 1) add and 2) subtract. However, subtraction can be performed by adding a negative number. This means that our digital system may need only adders to perform all binary arithmetic operations.
Challenge: this requires a representation of negative binary numbers that lets subtraction be carried out inexpensively.
reuse the unsigned adder for signed operations
To solve this challenge, we introduce the signed binary number representations used in digital systems.
Signed Binary Number Representation
There are three ways to represent a signed number:
Sign-magnitude
1's complement
2's complement
Sign-Magnitude
In the sign-magnitude convention, we have
MSB is the sign (0 is positive, 1 is negative)
Subsequent bits are the magnitude
For example, below is a 3-bit sign-magnitude example
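A 3-bit table of this kind can be generated with a short Python sketch (the names `sign_magnitude` and `decode_sign_magnitude` are hypothetical):

```python
def sign_magnitude(value: int, n: int) -> str:
    """Encode value as an n-bit sign-magnitude string."""
    mag_bits = n - 1
    if abs(value) >= 1 << mag_bits:
        raise ValueError("magnitude does not fit in n - 1 bits")
    sign = "1" if value < 0 else "0"  # the MSB is the sign
    return sign + format(abs(value), f"0{mag_bits}b")


def decode_sign_magnitude(bits: str) -> int:
    """Decode an n-bit sign-magnitude string back to an integer."""
    magnitude = int(bits[1:], 2)
    return -magnitude if bits[0] == "1" else magnitude


# Print every 3-bit pattern with its value.
for pattern in (format(i, "03b") for i in range(8)):
    print(pattern, decode_sign_magnitude(pattern))
```

Both `000` and `100` decode to 0, so zero has two encodings in this representation.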

1's Complement
The 1's complement in the binary system is a special case of the diminished radix complement, which is defined as follows.
A "new" operation: the diminished radix complement of an n-digit integer A with radix r is defined as $(r^n - 1) - A$.
Thus, in our binary system (radix r = 2), the 1's complement is defined as $(2^n - 1) - A$.
In practice, there is a simpler way to find the 1's complement of a binary number:
flip (invert) all bits
For example, with n = 4, the 1's complement of 0101 is 1111 - 0101 = 1010, which is 0101 with every bit flipped.
In summary, to get the 1's complement representation of a signed number, we start from its magnitude A:
If the number is non-negative, just take the magnitude (n bits)
If the number is negative, take the 1's complement of the magnitude (n bits). In binary, just flip all bits.
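The two rules can be sketched in Python as below. The names `flip_bits` and `ones_complement_repr` are hypothetical. The sketch also checks that flipping the bits agrees with the subtraction from 2^n - 1.

```python
def flip_bits(bits: str) -> str:
    """Invert every bit of a binary string."""
    return "".join("1" if b == "0" else "0" for b in bits)


def ones_complement_repr(value: int, n: int) -> str:
    """Return the n-bit 1's complement representation of value."""
    magnitude = abs(value)
    if magnitude > (1 << (n - 1)) - 1:
        raise ValueError("value does not fit in n bits")
    mag_bits = format(magnitude, f"0{n}b")
    if value >= 0:
        return mag_bits                # non-negative: the magnitude itself
    by_definition = format(((1 << n) - 1) - magnitude, f"0{n}b")
    assert by_definition == flip_bits(mag_bits)  # both routes agree
    return by_definition
```

For example, `ones_complement_repr(-3, 3)` returns `"100"`. Note that `flip_bits("000")` gives `"111"`, a second pattern for zero.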
2's Complement
To solve the "0" problem above (sign-magnitude and 1's complement each have two encodings of zero, +0 and -0), we introduce the 2's complement representation, which is obtained by adding 1 to the 1's complement.
In short, to find the 2's complement representation of a number, we again start from its magnitude A:
If the number is non-negative, just take the magnitude (n bits)
If the number is negative, take the 2's complement of the magnitude (n bits). In binary, just flip all bits and then add 1.
For example, the following table is a 3-bit 2's complement representation example
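A table of this kind can be generated with a short Python sketch that follows the flip-and-add-1 rule (the name `twos_complement_repr` is hypothetical):

```python
def twos_complement_repr(value: int, n: int) -> str:
    """Return the n-bit 2's complement representation of value."""
    if not -(1 << (n - 1)) <= value < (1 << (n - 1)):
        raise ValueError("value does not fit in n bits")
    if value >= 0:
        return format(value, f"0{n}b")     # non-negative: the magnitude itself
    mask = (1 << n) - 1
    flipped = mask - abs(value)            # flip all bits of the magnitude
    return format((flipped + 1) & mask, f"0{n}b")  # then add 1


# Print the 3-bit table from -4 to 3.
for v in range(-4, 4):
    print(f"{v:>2}  {twos_complement_repr(v, 3)}")
```

For 3 bits this gives 100, 101, 110, 111 for -4 to -1 and 000, 001, 010, 011 for 0 to 3, with a single pattern for zero.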

Now we have solved the problem of two representations for the number 0: in 2's complement, zero has exactly one encoding.
Notes
In Binary Arithmetic,
subtraction is performed by taking the two's complement of the second number and then adding. Before adding, make sure both numbers have the same number of bits (use sign extension if necessary).
addition is performed by adding the two numbers and discarding any final carry bit. (In subtraction, the final carry bit is discarded as well.)
Sign extension in 2's complement representation is achieved by copying the MSB into the new high-order bits.
2's complement is widely used in today's digital systems.
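These notes can be sketched in Python on 2's complement bit strings. The helper names `sign_extend`, `negate`, `add_discard_carry` and `subtract` are hypothetical, and overflow is deliberately not checked here.

```python
def sign_extend(bits: str, n: int) -> str:
    """Widen a 2's complement string to n bits by copying the MSB."""
    return bits[0] * (n - len(bits)) + bits


def negate(bits: str) -> str:
    """2's complement: flip all bits, add 1, keep the same width."""
    n = len(bits)
    flipped = "".join("1" if b == "0" else "0" for b in bits)
    return format((int(flipped, 2) + 1) % (1 << n), f"0{n}b")


def add_discard_carry(a: str, b: str) -> str:
    """Add two 2's complement strings and drop any carry out of the MSB."""
    n = max(len(a), len(b))
    a, b = sign_extend(a, n), sign_extend(b, n)
    return format((int(a, 2) + int(b, 2)) % (1 << n), f"0{n}b")


def subtract(a: str, b: str) -> str:
    """Compute a - b as a + (2's complement of b)."""
    n = max(len(a), len(b))
    return add_discard_carry(sign_extend(a, n), negate(sign_extend(b, n)))
```

For example, `subtract("0101", "0011")` computes 0101 + 1101 = 1 0010, drops the carry, and returns `"0010"` (5 - 3 = 2).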
More about carry and overflow
This part is largely based on an excellent explanation of the overflow flag and the carry flag from idallen.com.
First and foremost, do not confuse the "carry" flag with the "overflow" flag in integer arithmetic. Each flag can occur on its own, or both can occur together. The CPU's ALU neither knows nor cares whether you are doing signed or unsigned arithmetic; it just does the binary math and always sets both flags accordingly. It is up to you, the programmer, to know which flag to check after the math is done.
Carry Flag
The rules for turning on the carry flag in binary/integer math are two:
The carry flag is set if the addition of two numbers causes a carry out of the most significant (leftmost) bits added, e.g., 1111 + 0001 = 0000 (carry flag is turned on).
The carry (borrow) flag is also set if the subtraction of two numbers requires a borrow into the most significant (leftmost) bits subtracted, e.g., 0000 - 0001 = 1111 (carry flag is turned on).
Otherwise, the carry flag is turned off (zero). For example, 0111 + 0001 = 1000 and 1000 - 0001 = 0111 both leave the carry flag off.
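The two carry rules can be sketched in Python for fixed-width unsigned bit strings. The names `add_with_carry` and `sub_with_borrow` are hypothetical, and both operands are assumed to have the same width.

```python
def add_with_carry(a: str, b: str) -> tuple[str, int]:
    """Return (n-bit sum, carry flag) for two n-bit strings."""
    n = len(a)
    total = int(a, 2) + int(b, 2)
    carry = total >> n               # 1 if the sum needs an extra bit
    return format(total % (1 << n), f"0{n}b"), carry


def sub_with_borrow(a: str, b: str) -> tuple[str, int]:
    """Return (n-bit difference, carry/borrow flag) for a - b."""
    n = len(a)
    x, y = int(a, 2), int(b, 2)
    borrow = int(x < y)              # 1 if a borrow into the MSB is needed
    return format((x - y) % (1 << n), f"0{n}b"), borrow
```

`add_with_carry("1111", "0001")` returns `("0000", 1)`, while `add_with_carry("0111", "0001")` returns `("1000", 0)`.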
Overflow Flag
The rules for turning on the overflow flag in binary/integer math are two:
If the sum of two numbers with the sign bits off yields a result number with the sign bit on, the "overflow" flag is turned on, e.g., 0100 + 0100 = 1000 (overflow flag is turned on).
If the sum of two numbers with the sign bits on yields a result number with the sign bit off, the "overflow" flag is turned on, e.g., 1000 + 1000 = 0000 (overflow flag is turned on).
Otherwise, the overflow flag is turned off. For example, 0100 + 0001 = 0101 and 0110 + 1001 = 1111 both leave the overflow flag off.
Note that you only need to look at the sign bits (leftmost) of the three numbers to decide if the overflow flag is turned on or off.
If you are doing two's complement (signed) arithmetic, overflow flag on means the answer is wrong — you added two positive numbers and got a negative, or you added two negative numbers and got a positive.
If you are doing unsigned arithmetic, the overflow flag means nothing and should be ignored.
Note that a negative and positive added together cannot be wrong, because the sum is between the addends. Since both of the addends fit within the allowable range of numbers, and their sum is between them, it must fit as well. Mixed-sign addition never turns on the overflow flag.
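The sign-bit rule can be sketched in Python for fixed-width addition. The name `add_with_overflow` is hypothetical, and both operands are assumed to have the same width.

```python
def add_with_overflow(a: str, b: str) -> tuple[str, bool]:
    """Return (n-bit sum, overflow flag) for two n-bit strings."""
    n = len(a)
    result = format((int(a, 2) + int(b, 2)) % (1 << n), f"0{n}b")
    # Overflow: both operands share a sign bit and the result's sign differs.
    overflow = a[0] == b[0] and result[0] != a[0]
    return result, overflow
```

`add_with_overflow("0100", "0100")` returns `("1000", True)`, while the mixed-sign sum `add_with_overflow("0110", "1001")` returns `("1111", False)`. Only the three sign bits are inspected, as noted above.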