On 4/15/10 8:29 AM, Igor Stasenko wrote:
On 15 April 2010 05:54, Andreas Raab <andreas.raab@gmx.de> wrote:
On 4/14/2010 1:08 PM, Juan Vuletich wrote:
Profiling is indeed your friend. There is some serious inefficiency there. Quickly hacking this (warning: will only work for 1bpp):
[... snip ...]
gives over 30x speed increase (from 10 seconds down to 310 mSec) on my system. This is not a solution, just some food for thought.
Heh, heh. Very good. But now I'm gonna get serious ...
<pokerface on>
I see your 30x improvement and raise you another ... 6x for a total of 200x speedup (from 10 secs to 50 msecs). There! Take that! :-)
(but if Igor pulls out some asm I may have to fold :-)
Yeah.. one could use MMX/SSE/SIMD instructions, which would bring anything you can write in C to its knees, not to mention Smalltalk :)
And now things come full circle. I recall reading an ancient article about someone who used bitblt as a hardware accelerated matrix multiply (or somesuch) back in the day.
Lawson