On Thu, 2022-06-30 at 18:03 -0400, Boris Shingarov wrote:
> If you are interested, we can have a microhackathon where I can show you *my* way of doing it (which means, ULD). Or maybe Jan wants to show you his (which means, libgdbs/vdb). You were in the Pacific timezone, correct?
> Jan: How good/bad is the ArchC model for rv64 right now? I remember you were fixing some defects around the register name conventions being all wrong -- are those better now, or is more work needed before ULD could even pick it up meaningfully?
The ArchC model for RISC-V is good enough. At the moment it does not support the A, F and D extensions, but that's "just work". Actually, I gave it a stab this morning and got F and D working in half an hour or so (modulo bugs).
The RISC-V model is good - in the last VM hackathon I used the RISC-V model to emit and run code on both QEMU and Unleashed.
There's little work left on assembly/disassembly support for flw/fld/fsw/fsd (but that's the asm syntax; one needs to deal with the fact that one register is an FPR and the other is a GPR) - but that's a detail I can fix easily.
As for ULD, I'd think it will be fine, but I don't quite remember where and how we handle the target description annex. We may need to work around CSRs; they're not (yet) modelled in (my) RISC-V ArchC model.
> Would you be interested to also participate in a GDB microhackathon with Ken?
Sure, I'm always up for some debugging fun!
Best, Jan
On 6/30/22 15:23, ken.dickey@whidbey.com wrote:
Date: Wed, 29 Jun 2022 09:17:29 -0700
From: Eliot Miranda <eliot.miranda@gmail.com>
..
The pattern is straightforward. Look at the various implementations of ffiCalloutTo:SpecOnStack:in:. There are two issues: passing arguments and receiving results.
Yes. I have followed that. Thanks.
The generated RiscV64FFIPlugin.c is virtually identical to ARM64FFIPlugin.c as
ThreadedARM64FFIPlugin subclass: #ThreadedRiscV64FFIPlugin with almost no overrides,
aside from #nonRegisterStructReturnIsViaImplicitFirstArgument --> true.
However, the post return assignment to floatRet.d which I see in the ARM64 code appears to be absent from the RiscV64 code.
Being very rusty on C and GDB, I am moby confused by this.. -KenD
vvvvvv======vvvvvv From SqueakFFIPlugin: [Aarch64/Arm64]
2470 (floatRet.d = dispatchFunctionPointerwithwithwithwithwithwithwithwith(((struct dprr (*)(sqIntptr_t, sqIntptr_t, sqIntptr_t, sqIntptr_t, sqIntptr_t, sqIntptr_t, sqIntptr_t, sqIntptr_t)) (((void *) address))), ((calloutState->integerRegisters))[0], ((calloutState->integerRegisters))[1], ((calloutState->integerRegisters))[2], ((calloutState->integerRegisters))[3], ((calloutState->integerRegisters))[4], ((calloutState->integerRegisters))[5], ((calloutState->integerRegisters))[6], ((calloutState->integerRegisters))[7]));
   0x0000007ff6b3b6cc <+2004>: ldr x8, [x29, #928]
   0x0000007ff6b3b6d0 <+2008>: ldr x0, [x29, #800]
   0x0000007ff6b3b6d4 <+2012>: ldr x9, [x0, #232]
   0x0000007ff6b3b6d8 <+2016>: ldr x0, [x29, #800]
   0x0000007ff6b3b6dc <+2020>: ldr x1, [x0, #240]
   0x0000007ff6b3b6e0 <+2024>: ldr x0, [x29, #800]
   0x0000007ff6b3b6e4 <+2028>: ldr x2, [x0, #248]
   0x0000007ff6b3b6e8 <+2032>: ldr x0, [x29, #800]
   0x0000007ff6b3b6ec <+2036>: ldr x3, [x0, #256]
   0x0000007ff6b3b6f0 <+2040>: ldr x0, [x29, #800]
   0x0000007ff6b3b6f4 <+2044>: ldr x4, [x0, #264]
   0x0000007ff6b3b6f8 <+2048>: ldr x0, [x29, #800]
   0x0000007ff6b3b6fc <+2052>: ldr x5, [x0, #272]
   0x0000007ff6b3b700 <+2056>: ldr x0, [x29, #800]
   0x0000007ff6b3b704 <+2060>: ldr x6, [x0, #280]
   0x0000007ff6b3b708 <+2064>: ldr x0, [x29, #800]
   0x0000007ff6b3b70c <+2068>: ldr x7, [x0, #288]
   0x0000007ff6b3b710 <+2072>: add x0, x29, #0x3b8
   0x0000007ff6b3b714 <+2076>: sub x19, x0, #0x200
   0x0000007ff6b3b718 <+2080>: mov x0, x9
   0x0000007ff6b3b71c <+2084>: blr x8
## after return, struct floatRet.d is assigned
   0x0000007ff6b3b720 <+2088>: fmov d5, d0
   0x0000007ff6b3b724 <+2092>: fmov d4, d1
   0x0000007ff6b3b728 <+2096>: fmov d1, d2
   0x0000007ff6b3b72c <+2100>: fmov d0, d3
   0x0000007ff6b3b730 <+2104>: str d5, [x19]
   0x0000007ff6b3b734 <+2108>: str d4, [x19, #8]
   0x0000007ff6b3b738 <+2112>: str d1, [x19, #16]
   0x0000007ff6b3b73c <+2116>: str d0, [x19, #24]
2471 if (isCalleePopsConvention((calloutState->callFlags))) {
^^^^^^======^^^^^^

vvvvvv======vvvvvv [RISCV64]
2476 (floatRet.d = dispatchFunctionPointerwithwithwithwithwithwithwithwith(((struct dprr (*)(sqIntptr_t, sqIntptr_t, sqIntptr_t, sqIntptr_t, sqIntptr_t, sqIntptr_t, sqIntptr_t, sqIntptr_t)) (((void *) address))), ((calloutState->integerRegisters))[0], ((calloutState->integerRegisters))[1], ((calloutState->integerRegisters))[2], ((calloutState->integerRegisters))[3], ((calloutState->integerRegisters))[4], ((calloutState->integerRegisters))[5], ((calloutState->integerRegisters))[6], ((calloutState->integerRegisters))[7]));
   0x0000003ff703f95a <+3456>: ld t3,-200(s0)
   0x0000003ff703f95e <+3460>: ld a7,-208(s0)
   0x0000003ff703f962 <+3464>: ld a6,-216(s0)
   0x0000003ff703f966 <+3468>: ld a5,-224(s0)
   0x0000003ff703f96a <+3472>: ld a4,-232(s0)
   0x0000003ff703f96e <+3476>: ld a3,-240(s0)
   0x0000003ff703f972 <+3480>: ld a2,-248(s0)
   0x0000003ff703f976 <+3484>: ld a1,-256(s0)
   0x0000003ff703f97a <+3488>: addi s4,s0,-520
   0x0000003ff703f97e <+3492>: mv a0,s4
   0x0000003ff703f980 <+3494>: sd t3,0(sp)
   0x0000003ff703f982 <+3496>: jalr s5
## NOTA BENE: After return: NO assignment to floatRet.d !?!
2477 if (isCalleePopsConvention((calloutState->callFlags))) {
2478 setsp((calloutState->argVector));
2479 }
2480 ownVM(myThreadIndex);
=> 0x0000003ff703f984 <+3498>: ld a5,408(s10)
   0x0000003ff703f988 <+3502>: mv a0,s1
   0x0000003ff703f98a <+3504>: jalr a5
2481 if (atomicType == FFITypeDoubleFloat) {
   0x0000003ff703f98c <+3506>: li a5,13
   0x0000003ff703f98e <+3508>: beq s2,a5,0x3ff703fd4a <ffiCallArgArrayOrNilNumArgs+4464>
2482 result = floatObjectOf(((((floatRet.d)).doubles))[0]);
^^^^^^======^^^^^^
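One possible reading of the two listings, offered as a sketch and not as a confirmed diagnosis from anyone on the thread: the two ABIs return a 32-byte struct of doubles differently. Under AAPCS64 a homogeneous floating-point aggregate of up to four members comes back in d0..d3, so the caller has to store those registers into floatRet.d after the call. Under the RISC-V LP64D convention any aggregate larger than 16 bytes is returned through a caller-allocated buffer whose address travels as an implicit first argument. That would match `addi s4,s0,-520` / `mv a0,s4` before the `jalr`, the eighth integer argument spilling to the stack (`sd t3,0(sp)`), and the absence of any store after the call. The Python below is a simplified model of just that rule. The layout of `struct dprr` as four doubles is an assumption inferred from the four `str d…` stores and the `.doubles` field, and the function names are hypothetical.

```python
# Simplified model of "is this struct returned in memory?" for two ABIs.
# Assumption: only plain scalar members, natural alignment (align == size).
SIZES = {"float": 4, "double": 8, "int32": 4, "int64": 8}


def struct_size(members):
    """Size in bytes of a C struct with the given scalar members, with padding."""
    offset, max_align = 0, 1
    for m in members:
        align = SIZES[m]
        offset = (offset + align - 1) // align * align  # pad up to alignment
        offset += SIZES[m]
        max_align = max(max_align, align)
    return (offset + max_align - 1) // max_align * max_align  # tail padding


def is_hfa(members):
    """AAPCS64 homogeneous floating-point aggregate: 1-4 members, one FP type."""
    return (1 <= len(members) <= 4
            and len(set(members)) == 1
            and members[0] in ("float", "double"))


def returned_in_memory(arch, members):
    """True if the callee writes the result through a caller-supplied pointer."""
    if arch == "aarch64":
        # HFAs come back in v0..v3; other aggregates over 16 bytes use memory.
        return not is_hfa(members) and struct_size(members) > 16
    if arch == "riscv64":
        # LP64D: anything wider than 2*XLEN bits (16 bytes) uses memory.
        return struct_size(members) > 16
    raise ValueError(f"unknown arch: {arch}")


# Assumed layout of `struct dprr`: four doubles (inferred, not confirmed).
DPRR = ("double",) * 4

if __name__ == "__main__":
    for arch in ("aarch64", "riscv64"):
        print(arch, "returns dprr in memory:", returned_in_memory(arch, DPRR))
```

If this reading holds, the generated C is the same on both targets and the compiler supplies the hidden buffer on RISC-V, so the missing post-return stores would be expected and not a miscompile.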
> I don't quite remember where and how we handle the target description annex
It's in RemoteGdbXFER>>regTransfersFrom:, which takes a PPXmlElement (the annex received from the stub) and returns an Array of RSPOneRegisterTransfers sorted by the GDB's idea of "regNum" (which isn't necessarily contiguous, for example the Motorola e500v2 has no regNum 70). Near the end of prepareSession, this array is passed to AcProcessorDescription>>regsInGPacket:.
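As a rough illustration of the step just described (annex XML in, register transfers sorted by regNum out), here is a minimal Python sketch. It is not the Smalltalk code in RemoteGdbXFER: `RegTransfer` and `reg_transfers_from` are hypothetical stand-ins, and the sample annex is invented. The one behaviour taken from GDB's target description format is that a `<reg>` without a `regnum` attribute gets the previous register's number plus one, starting from zero.

```python
import xml.etree.ElementTree as ET
from typing import NamedTuple


class RegTransfer(NamedTuple):
    """Hypothetical stand-in for one register's slot in the transfer list."""
    regnum: int
    name: str
    bitsize: int


def reg_transfers_from(annex_xml):
    """Parse a target description annex and return transfers sorted by regnum."""
    root = ET.fromstring(annex_xml)
    transfers = []
    next_regnum = 0
    for reg in root.iter("reg"):  # document order, across all <feature>s
        regnum = int(reg.get("regnum", next_regnum))
        transfers.append(
            RegTransfer(regnum, reg.get("name"), int(reg.get("bitsize"))))
        next_regnum = regnum + 1
    return sorted(transfers, key=lambda t: t.regnum)


# Invented sample: regnums 2 and 4 are deliberately missing, and the
# features are not listed in regnum order.
SAMPLE_ANNEX = """
<target version="1.0">
  <feature name="org.example.core">
    <reg name="r0" bitsize="32" regnum="0"/>
    <reg name="r1" bitsize="32"/>
    <reg name="pc" bitsize="32" regnum="5"/>
  </feature>
  <feature name="org.example.fpu">
    <reg name="f0" bitsize="64" regnum="3"/>
  </feature>
</target>
"""

if __name__ == "__main__":
    for t in reg_transfers_from(SAMPLE_ANNEX):
        print(t.regnum, t.name, t.bitsize)
```

Sorting by regnum instead of assuming a dense 0..N range is what keeps a gap such as the missing e500v2 regNum 70 harmless.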
Then during the actual debugging, de/serialization is done in RemoteGDB>>decodeGPacket:/sendRegistersToRSP.
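For readers unfamiliar with the wire format, the following Python sketch shows the generic shape of `g`-packet de/serialization in the GDB remote serial protocol: each register is sent as hex digits in target byte order, concatenated in packet order, with `x` digits marking an unavailable value. It is not a transcription of decodeGPacket: or sendRegistersToRSP. The function names and the two-register layout are hypothetical, and the layout is assumed to be already in packet order.

```python
def decode_g_packet(payload, layout, byteorder="little"):
    """Split a 'g' reply into {name: int or None}.

    layout is a list of (name, bitsize) pairs, already in packet order.
    A register whose hex digits are 'x' is reported as None (unavailable).
    """
    regs = {}
    pos = 0
    for name, bitsize in layout:
        ndigits = bitsize // 4  # two hex digits per byte
        chunk = payload[pos:pos + ndigits]
        if len(chunk) < ndigits:
            break  # the stub sent fewer registers than the layout lists
        if "x" in chunk:
            regs[name] = None
        else:
            regs[name] = int.from_bytes(bytes.fromhex(chunk), byteorder)
        pos += ndigits
    return regs


def encode_g_packet(regs, layout, byteorder="little"):
    """Inverse of decode_g_packet for fully known register values."""
    return "".join(
        regs[name].to_bytes(bitsize // 8, byteorder).hex()
        for name, bitsize in layout)


# Hypothetical two-register layout, for illustration only.
LAYOUT = [("a0", 64), ("pc", 32)]

if __name__ == "__main__":
    reply = "0100000000000000" + "78563412"
    print(decode_g_packet(reply, LAYOUT))
```

Keeping the layout as data, separate from the codec, mirrors the split described above, where the register list is computed once from the annex and then reused for every packet.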
> We may need to work around CSRs, they're not (yet) modelled in (my) RISC-V ArchC model
That shouldn't matter, because right now there is no connection between GDB's names and ArchC's names for registers. Yes, it's broken; yes, we should fix it; but that's what's on GitHub right now.
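To make the missing connection concrete, here is one way a name bridge for the RISC-V integer registers could look, sketched in Python. The thread does not say which spelling the stub or the ArchC model uses, so this simply assumes one side may say `x10` and the other `a0`. The x0..x31 to ABI-name table is the standard RISC-V assignment; the function names are hypothetical.

```python
# Standard RISC-V integer register ABI names, indexed by x-register number.
ABI_NAMES = (
    ["zero", "ra", "sp", "gp", "tp", "t0", "t1", "t2", "s0", "s1"]
    + [f"a{i}" for i in range(8)]       # x10..x17
    + [f"s{i}" for i in range(2, 12)]   # x18..x27
    + [f"t{i}" for i in range(3, 7)]    # x28..x31
)


def to_architectural(name):
    """Map an ABI name (or an xN name) to its xN spelling."""
    if name == "fp":  # fp is an alias of s0
        name = "s0"
    if name in ABI_NAMES:
        return f"x{ABI_NAMES.index(name)}"
    if name.startswith("x") and name[1:].isdigit() and int(name[1:]) < 32:
        return name
    raise KeyError(name)


def to_abi(name):
    """Map an xN name (or an ABI name) to its ABI spelling."""
    return ABI_NAMES[int(to_architectural(name)[1:])]


if __name__ == "__main__":
    for reg in ("a0", "s0", "t3", "x21"):
        print(reg, to_architectural(reg), to_abi(reg))
```

CSRs and floating-point registers are left out on purpose, since the thread notes they are not modelled yet.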
> I do have some time tomorrow (Saturday; 2 July) in Pacific Time Zone
Hmmm... how about some day during the week? This weekend is Canada Day and this always means guests etc.
> So building and debugging takes patience
Wait... you build stuff on the board? I don't exactly remember how I built the interpreter OpenSmalltalkVM for the Neuquén demo in November 2019, but I *think* cross-compiling OSTVM *should* work. Or maybe it was my own hacked version of Cog (the one that doesn't quite slang yet)... damn, my brain is failing, I don't remember. But I have the SD card I was running the Neuquén demo off, untouched, so we can go look.
> work left on assembly/disassembly support for flw/fld/fsw/fsd
I am *CERTAIN* one can debug the JIT without an FPU, *if* one disables initialization of Morphic -- because that's what I was showing at the Second Hackathon (~500,000 bytecodes before the Reader REPL comes up, on the e500v2 which doesn't have IEEE FP).
On 2022-07-01 16:25, Boris Shingarov wrote:
>> I do have some time tomorrow (Saturday; 2 July) in Pacific Time Zone
> Hmmm... how about some day during the week? This weekend is Canada Day and this always means guests etc.
Yes. I am booked 4-7th, perhaps next Thursday and/or Friday? [7th-8th]
>> So building and debugging takes patience
> Wait... you build stuff on the board?
Yes. Bought a ~$25 Lichee RV Dock from AliExpress.
Details at:
https://github.com/KenDickey/opensmalltalk-vm-rv64/blob/Cog/building/linux64...
RV64GC running Debian Linux on an Allwinner D1 chip. 1 core. Running the Cuis base image tests on a Raspberry Pi takes a couple of minutes, but close to 2 _hours_ on the D1.
But hey, runs both Squeak and Cuis on OpenSmalltalk stack VM in either X11 or FrameBuffer on real RiscV64 hardware.
FYI, -KenD
vm-dev@lists.squeakfoundation.org