On older CPUs, such as the ICL 1900 series of the 1960s and 1970s, multiplication is done by the shift-and-add algorithm, which takes O(word length) time. For example, on a 1904S processor an addition takes 1.55µs while a multiplication takes 10.04µs.

This means it is often worth replacing multiplication by a constant with a sequence of cheaper addition, subtraction and shift operations.

The problem then becomes how to calculate the sequence of operations needed to accomplish a given multiplication. One trivial method is to use the same shift-and-add algorithm as the multiplication operator, but that does not always find the shortest sequence of instructions. For example, multiplication by 60 can be implemented as:

	n * 60 = n * 64 - n * 4
Two shifts and a subtract, whereas the shift-and-add algorithm would produce
	n * 60 = n * 32 + n * 16 + n * 8 + n * 4
Four shifts and three adds.

The problem can be solved by recursively computing:

	let log = floor(log2(n))
If n = 2^log we have finished, since a single shift suffices; otherwise recursively solve both
	n' = n - 2^log          (so n * x = (x << log) + n' * x)
	n'' = 2^(log + 1) - n   (so n * x = (x << (log + 1)) - n'' * x)
and take whichever yields the shorter sequence of operations.

A C implementation of this algorithm can be found here: best.c.

How much use is this? 8% of all multiplications by constants less than 1024 can be replaced by sequences of no more than one add and two shifts; 36% by sequences of no more than two adds and three shifts.

The inspiration for this came from Robert Eisele's post Optimising Integer Multiplication.

All questions to john@AtlanTech.com