yes, but then i will ask you to compare results with JIT optimized for direct pointers.. :)
We have:
  - accessing ivar: no extra cost
  - method lookup: one extra indirection
  - sends with MonomorphicInlineCache: no extra cost if implemented on a per-instance basis (checking against selfID).
hmm.. that doesn't make the inline cache effective. usually many different objects pass through a single call site, but they all have the same class; this is where a monomorphic IC shines. if you change the cache to work on a per-instance basis, i think it will be less effective because of too many misses.
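To make the objection concrete, here is a rough sketch in Python (not VM code) that counts hits and misses at one call site when the cache is keyed by class versus by instance. The names `CallSite`, `simulate`, and the class name "Point" are made up for illustration, and the workload (1000 distinct receivers of one class) is an assumption chosen to match the case described above.

```python
from dataclasses import dataclass


@dataclass
class CallSite:
    """A monomorphic inline cache: remembers exactly one key."""
    cached_key: object = None
    hits: int = 0
    misses: int = 0

    def send(self, key):
        if key == self.cached_key:
            self.hits += 1
        else:
            self.misses += 1
            self.cached_key = key  # relink the site to the new key


def simulate(receivers, key_of):
    """Run every receiver through one call site, keyed by key_of(receiver)."""
    site = CallSite()
    for receiver in receivers:
        site.send(key_of(receiver))
    return site


# Assumed workload: 1000 distinct instances, all of the hypothetical class "Point".
receivers = [(self_id, "Point") for self_id in range(1000)]

per_class = simulate(receivers, key_of=lambda r: r[1])     # key is the class
per_instance = simulate(receivers, key_of=lambda r: r[0])  # key is selfID

print("per-class:   ", per_class.hits, "hits,", per_class.misses, "misses")
print("per-instance:", per_instance.hits, "hits,", per_instance.misses, "misses")
```

With this workload the class-keyed cache misses only on the first send, while the instance-keyed cache misses on every send. A workload where the same receiver hits the site repeatedly would look much better for the per-instance variant.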
but you can have the two of them. In the JITed prologue you may have something like:
cmpClass:
    mov [objectTable + selfID], self
    cmp [self - 4], nativizedClass
    jz endOfPrologue
    // patching code must be added here
    jmp lookupAndJIT
    cmp selfID, nativizedSelfID    <- entry point
    jnz cmpClass
    mov nativizedSelf, self
endOfPrologue:
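The control flow of that prologue can be modelled roughly as follows. This is only a Python sketch under assumptions, not the actual JIT output: `Obj`, `NativizedMethod`, `table_loads`, and the string class names are hypothetical stand-ins, and a list plays the role of the object table.

```python
from dataclasses import dataclass


@dataclass
class Obj:
    cls: str  # class name standing in for the class pointer at [self - 4]


@dataclass
class NativizedMethod:
    nativized_self_id: int
    nativized_self: Obj
    nativized_class: str
    table_loads: int = 0  # counts object-table accesses (the extra memory access)

    def prologue(self, self_id, object_table):
        # Entry point: compare the ID only; no object-table access on a hit.
        if self_id == self.nativized_self_id:
            return self.nativized_self, "instance hit"
        # cmpClass: resolve the ID through the object table, then check the class.
        self.table_loads += 1
        obj = object_table[self_id]
        if obj.cls == self.nativized_class:
            return obj, "class hit"
        # Neither check passed: real code would jump to lookupAndJIT here.
        return obj, "miss"


object_table = [Obj("Point"), Obj("Point"), Obj("Rectangle")]
method = NativizedMethod(0, object_table[0], "Point")

results = [method.prologue(i, object_table)[1] for i in (0, 1, 2, 0)]
print(results, "table loads:", method.table_loads)
```

In this model the object-table load is paid only when the selfID comparison fails, which is where the extra cost relative to a pure per-class check shows up.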
you add (mainly) an extra memory access, if the branch predictor helps