On Tue, Dec 24, 2013 at 1:35 PM, Nicolas Cellier <nicolas.cellier.aka.nice@gmail.com> wrote:

Sure, I agree but this would require platform specific code...
As I developped in the other thread, an alternative is to use native libraries where possible - that is when they perform an equivalent job (It would be Quartz or something like that for me).
Such platform specific support will be harder to obtain, because it requires knowledge of VM plugin+external library or low level assembler...
Plus the pain of diving into historical VM architecture (specifically in BitBlt, it's not that easy to get the full picture when you did not participate to the development - like me).
I note that there is plenty of work required in this area, sort of technical debt. For example my MacOSX VMs do not support the little endian image formats yes (those with negative depth) which could be surprising if we didn't knew historical roots...

^Display supportedDisplayDepths

So, in a word, such platform specific support can certainly provide great rewards, but with significative investments.
On the other hand, these little tricks are like harvesting low hanging fruits.
They are made of pure C^H slang, thus valid on every platform which does not yet provide specific accelerated support.
I'm lazy and just musing, so the fastest ROI is appealing.

Remember my little text composition/display benchs?

    | text canvas m1 |
    canvas := FormCanvas extent: 569@399 depth: 32.
text := Compiler evaluate: (FileStream fileNamed: 'text.st') contentsOfEntireFile.
    m1 := TextMorph new.
    m1 text: text textStyle: TextStyle default.
    m1 wrapFlag: true.
    m1 extent: 569@9999999.
   MessageTally spyOn: [ Time millisecondsToRun: [100 timesRepeat: [
        m1 drawOn: canvas]]]
Interpreter VM 4.10.10 Before bit hacks: 1425 1411 1403

Interpreter VM After bit hacks: 1152 1173

15 to 20% less, it's not really impressive, but the hurdle is low, just two bit-hacks applied to rules 20 24 30 & 31

On the contrary. That kind of speed-up in a mature tuned numerical algorithm is impressive. Thanks!

2013/12/24 tim Rowledge <tim@rowledge.org>

If you (or anyone else, of course) are interested in really speeding up bitblt, it would likely be worth looking at the ARM specific speedups Ben Avison did for the PI. (Look in platforms/Cross/plugins/BitBltPlugin) and seeing if similar tricks could be done with the assorted media instructions in current i7 etc cpus.

tim
--
tim Rowledge; tim@rowledge.org; http://www.rowledge.org/tim
Experience is something you don't get until just after you need it.

--
best,

Eliot