to the X11 server if Linux epoll event handling is being used, because forgetXDisplay() is typically called after a fork(). With epoll event handling the shared fd reference affects both parent and child badly. You can view, comment on, or merge this pull request online at:
https://github.com/OpenSmalltalk/opensmalltalk-vm/pull/550
-- Commit Summary --
* In forgetXDisplay() do not call aioDisable() for the socket connection
-- File Changes --
M platforms/unix/vm-display-X11/sqUnixX11.c (2)
-- Patch Links --
https://github.com/OpenSmalltalk/opensmalltalk-vm/pull/550.patch https://github.com/OpenSmalltalk/opensmalltalk-vm/pull/550.diff
IMO, not a good fix. The fact the the epoll structure is shared between parent and childs (forked) process is a problem not only for the X11 socket connection, but for ALL the polled fd. We shall close the epoll fd and re-create it in the forked vm. Probably, we should use a close-on-exec flag for epoll `epoll_create1(EPOLL_CLOEXEC )`, so as to protect from accidental sharing, unless someone foresee a good scenario for sharing epoll fd (I fail to see one).
Isn't there a way to duplicate fd's on fork, so that the child can close them at will without harm? (haven't done fork/exec in a while, tho)
Isn't there a way to duplicate fd's on fork, so that the child can close them at will without harm? (haven't done fork/exec in a while, tho)
No, from what I understand, the file-descriptors are duped anyway (they are in the process space), but the file descriptions are shared (in the kernel space). The duped fd point to shared objects... See https://copyconstruct.medium.com/the-method-to-epolls-madness-d9d2d6378642
Regarding `EPOLL_CLOEXEC`, does it really help when we do not actually `exec`?
Ouch, you're right, close-on-exec does close on exec as the name tells... Why the hell did I think that it would work at fork time? Just because it would fit our needs? Sort of wishful thinking...
I made a couple of commits to the Cog branch that provide error messages on stderr rather than segfault the VM. This shows clearly that after a fork, the X11 event notification for the child process is being delivered to both the child VM and the parent VM, and also that the kernel is supplying empty event data to the parent when this happens.
I suspect that for epoll, the rule should be never unregister the fd if you are also going to close the fd. This would be different from non-epoll event event handling, so possibly there would be a need for e.g. aioDisableAndClose(fd) in addition to the existing aioDisable(fd). Or maybe is it sufficient to just reopen the aiofd after a fork as suggested above.
Closed #550.
Closing the PR because the solution is not sufficient. I will open a different PR to close and reopen the epoll fd based on the suggestions above and in issue #548.
vm-dev@lists.squeakfoundation.org