Sid M. answered 09/24/21
Senior Software Engineer, with BS, and almost MS, in Computer Science
Let's first review and expand on the task. We have a monochrome (i.e. black-and-white) bit-mapped image that occupies 150K bytes, stored as 150K bytes / 4 bytes per word = 37.5K words. Each word needs to be read, 1s-complemented, and written back. So, we must execute the following instructions 37500 times:
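loop: ldw $1,0($2)     # load the current image word into $1
      not $1,$1        # 1s-complement (invert) every bit
      stw $1,0($2)     # write the inverted word back in place
      adi $2,$2,4      # step to the next word (4 bytes per word, byte addressing assumed)
      blt $2,$3,loop   # repeat until $2 reaches the end address in $3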
(I'm assuming a MIPS-like ISA: $1 is our temporary, $2 is the address of the current word in the image, and $3 is the address of the first word past the end of the image. The code to set everything up is unimportant.)
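To make the loop concrete, here is a rough Python model of the same read, complement, write-back sequence. It is only a sketch: the function name `invert_image`, the 32-bit mask, and the little-endian byte order are my own assumptions for illustration, and the byte order does not affect the result since every bit is flipped either way.

```python
WORD_BYTES = 4  # assumed word size, matching the 4 bytes-per-word figure above


def invert_image(image: bytearray) -> int:
    """Complement every word of `image` in place; return the number of words processed."""
    if len(image) % WORD_BYTES != 0:
        raise ValueError("image size must be a whole number of words")
    addr = 0           # plays the role of $2 (address of the current word)
    end = len(image)   # plays the role of $3 (first address past the image)
    words = 0
    while addr < end:
        word = int.from_bytes(image[addr:addr + WORD_BYTES], "little")       # load
        word = ~word & 0xFFFFFFFF                                            # 1s complement, kept to 32 bits
        image[addr:addr + WORD_BYTES] = word.to_bytes(WORD_BYTES, "little")  # store
        addr += WORD_BYTES                                                   # step to the next word
        words += 1
    return words


image = bytearray(150_000)   # an all-zero 150K-byte image
count = invert_image(image)
print(count)                 # 37500
```

After the call, every byte of `image` is 0xFF, and the returned count matches the 37500 loop iterations worked out above.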
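Now, for the performance expectations, assume that each load and each store can be completed without requiring additional cycles. This means that the estimate is best-case and, therefore, unrealistic. Each instruction passes through 5 stages, and each stage requires 1 cycle. If we further assume the pipeline stays full, with no stalls or branch penalties, then one instruction completes per cycle, and the 5-instruction loop body takes 5 cycles per word. At 3.5GHz, then, we can complement 1 word every 5 cycles / 3.5GHz = 1.43ns.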
On a single-core processor, we can invert the whole image in 37.5e3 * 1.43ns = 5.36e-5s, or 53.6us.
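The arithmetic is easy to double-check with a few lines of Python. This is just a sketch of the best-case estimate; the function name `invert_time_seconds` and its parameter names are made up for illustration, and the defaults are the figures used above (4 bytes per word, 5 cycles per word, 3.5GHz).

```python
def invert_time_seconds(image_bytes, bytes_per_word=4, cycles_per_word=5, clock_hz=3.5e9):
    """Best-case time to invert the image, ignoring any memory or branch delays."""
    words = image_bytes / bytes_per_word   # 150K bytes -> 37.5K words
    cycles = words * cycles_per_word       # 5 cycles for each word
    return cycles / clock_hz               # cycles -> seconds


t = invert_time_seconds(150_000)
print(f"{t * 1e9 / 37_500:.2f} ns per word")   # 1.43 ns per word
print(f"{t * 1e6:.1f} us total")               # 53.6 us total
```

Changing `cycles_per_word` or `clock_hz` shows how sensitive the estimate is to the best-case assumptions.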
On a dual-core processor, assuming that each core operates on half of the image and that there are no conflicts between the 2 executing cores (thereby allowing them to work fully in parallel), the image can be inverted in half the time, or 26.8us.
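Here is one way to sketch that split in Python. The helper names `partition` and `parallel_time_seconds` are hypothetical, and the model assumes ideal parallelism (no contention for memory, no synchronization cost), so the elapsed time is simply that of the core with the largest share.

```python
def partition(total_words, cores):
    """Split [0, total_words) into `cores` contiguous ranges of near-equal size."""
    base, extra = divmod(total_words, cores)
    ranges, start = [], 0
    for core in range(cores):
        size = base + (1 if core < extra else 0)   # spread any remainder over the first cores
        ranges.append((start, start + size))
        start += size
    return ranges


def parallel_time_seconds(total_words, cores, cycles_per_word=5, clock_hz=3.5e9):
    """Best-case elapsed time when every core works on its own range concurrently."""
    largest = max(end - start for start, end in partition(total_words, cores))
    return largest * cycles_per_word / clock_hz


print(partition(37_500, 2))                                  # [(0, 18750), (18750, 37500)]
print(f"{parallel_time_seconds(37_500, 2) * 1e6:.1f} us")    # 26.8 us
```

Each core would run the same loop as before, just with its own start and end addresses.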