Hi Levente,

On Jan 27, 2019, at 5:40 PM, Levente Uzonyi <leves@caesar.elte.hu> wrote:

On Sun, 27 Jan 2019, Chris Muller wrote:

Hi,

Yes, the SqueakMap server image is one part of the dynamic, but I think another is a bug in the trunk image. I think the reason Tim is not seeing 45 seconds before error is because the timeout setting of the high-up client is not being passed all the way down to the lowest-level layers -- e.g., from HTTPSocket --> WebClient --> SocketStream --> Socket. By the time it gets down to Socket which does the actual work, it's operating on its own 30 second timeout.

I would expect subsecond response times. 30 seconds is just unacceptably long.

Well, it depends on if, for example, you're in the middle of Antarctica with a slow internet connection or in an office with a fast connection. A 30 second timeout is just the maximum amount of time the client will wait for the entire process before presenting a debugger, that's all it can do.

We can be sure that Tim should get subsecond response times instead of timeouts after 30 seconds.

Right, but timeout settings are a necessary tool sometimes, my point was that we should fix client code in trunk to make timeouts work properly.

Incidentally, 99% of SqueakMap requests ARE subsecond -- just go to map.squeak.org and click around and see. For the remaining 1% that aren't, the issue is known and we're working on a new server to fix that.

Great!
That was my point: the image needs to be fixed.

It is a fixed amount of time, I *think* still between 30 and 45 seconds, that it takes the SqueakMap server to save its model after an and so if in the meantime it can simply be made to wait 45s instead of 30s, then current SqueakMap will only be that occasional delay at worst, instead of the annoying debugger we currently get.

I don't see why that would make a difference: the user would get a debugger anyway, but only 15 seconds later.

You would save seconds, not milliseconds by not downloading files again.

IIUC, you're saying we would save one hop in the "download" -- instead of client <--> alan <--> andreas, it would just be client <--> alan. Is that right?

No. If the client doesn't have the mcz in the package cache but nginx has it in its cache, then we save the transfer of data between alan and andreas.

Are alan and andreas co-located?

They are cloud servers in the same data center.

The file doesn't have to be read from the disk either.

I assume you mean "read from disk" on alan? What about after it's cached so many mcz's in RAM that it's paging out to swap file? To me, wasting precious RAM (of any server) to cache old MCZ file contents that no one will ever download (because they become old very quickly) feels wasteful. Dragster cars are wasteful too, but yes, they are "faster"... on a dragstrip. :) I guess there'd have to be some kind of application-specific smart management of the cache...

Nginx's proxy_cache can handle that all automatically. Also, we don't need a large cache.
A small, memory-only cache would do it.

Levente, what about the trunk directory listing, can it cache that?

Sure.

That is the _#1 thing_ source.squeak.org is accessing and sending back over, and over, and over again -- every time that MC progress box that says, "Updating [repository name]".

Right, unless you update an older image.

If the client does have the mcz, then we save the complete file transfer.

I don't know what the speed between alan <---> andreas is, but I doubt it's much slower than client <---> alan in most cases, so the savings would seem to be minimal..?

The image wouldn't have to open a file, read its content from the disk and send that through a socket.

By "the image" I assume you mean the SqueakSource server image. But opening the file takes very little time. Original web-sites were .html files, remember how fast those were? Plus, filesystems "cache" file contents into their own internal caches anyway...

Each file uses one external semaphore, each socket uses three. If you use a default image, there can be no more than 256 external semaphores which is ridiculous for a server, and it'll just grind to a halt when some load arrives. Every time the external semaphore table is full, a GC is triggered to try to clear it up via the finalization process.

Reading a file into memory is slow, writing it to a socket is slow. (Compared to nginx which uses sendfile to let the kernel handle that). And Squeak can only use a single process to handle everything.
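
To put rough numbers on the external semaphore limit mentioned above, here is a back-of-the-envelope sketch in Python. The per-file and per-socket counts are the ones quoted in this thread; everything else is an assumption made for illustration: that serving one download holds exactly one socket and one open file at the same time, that nothing else in the image uses external semaphores, and the function name itself. The real ceiling is likely lower.

```python
# Figures quoted in the thread.
FILE_SEMAPHORES = 1    # external semaphores per open file
SOCKET_SEMAPHORES = 3  # external semaphores per open socket


def max_concurrent_downloads(table_size, reserved=0):
    """Upper bound on simultaneous file downloads for a given table size.

    Assumption (not from the thread): each in-flight download holds one
    client socket and one open file, and `reserved` entries are taken by
    unrelated users of the external semaphore table.
    """
    per_download = SOCKET_SEMAPHORES + FILE_SEMAPHORES
    return max(0, table_size - reserved) // per_download


print(max_concurrent_downloads(256))   # 64
print(max_concurrent_downloads(4096))  # 1024
```

Under these assumptions a default image tops out at a few dozen concurrent downloads, which is consistent with the concern that it stalls as soon as some load arrives.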
That’s configurable. Alas, because writing lock-free table growth is not easy, the external semaphore table doesn’t grow automatically. But the VM does allow its size to be specified in a value cached in the image header and read at startup (IIRC). So we could easily have a 4K entry external semaphore table.

Yes, it still has to return back through alan but I assume alan does not wait for a "full download" received from andreas before it's already piping back to the Squeak client. If true, then it seems like it only amounts to saving one hop, which would hardly be noticeable over what we have now.

The goal of caching is not about saving a hop, but to avoid handling files in Squeak.

Nginx does that thing magnitudes faster than Squeak.

The UX would not be magnitudes faster though, right?

Directly by letting nginx serve the file, no, but the server image would be less likely to get stalled (return 5xx responses). But the caching scheme I described in this thread would make the UX a lot quicker too, because data would not have to be transferred when you already have it.

That would also let us save bandwidth by not downloading files already sitting in the client's package cache.

How so? Isn't the package-cache checked before hitting the server at all? It certainly should be.

No, it's not. Currently that's not possible, because different files can have the same name. And currently we have no way to tell them apart.

No. No two MCZ's may have the same name, certainly not within the same repository, because MCRepository cannot support that. So maybe

Not at the same time, but it's possible, and it just happened recently with Chronology-ul.21.

It is perfectly possible that a client has a version in its package cache with the same name as a different version on the server.

But we don't want to restrict what's possible in our software design because of that. That situation is already a headache anyway.
Same name theoretically can come only from the same person (if we ensure unique initials) and so this is avoidable / fixable by resaving one of them as a different name...

It wasn't me who created the duplicate. If your suggestion had been in place, some images out there, including mine, would have been broken by the update process.

we need project subdirectories under package-cache to properly simulate each cached Repository. I had no idea we were neutering 90% of the benefits of our package-cache because of this too, and just sitting here, I can't help but wonder whether this is why MCProxy doesn't work properly either!

The primary purpose of a cache is to *check it first* to speed up access to something, right? What you say about package-cache sounds

I don't know. It wasn't me who designed it. :)

I meant ANY "cache".

https://en.wikipedia.org/wiki/Cache_(computing)

It still depends on the purpose of the cache. It's possible that package-cache is just a misnomer or it was just a plan to use it as a cache which hasn't happened yet.

For Monticello, package-cache's other use-case is when an authentication issue occurs when trying to save to an HTTP repository. At that point the Version object with the new ancestry was already constructed in memory, so rather than worry about trying to "undo" all that, it was simpler and better to save it to a package-cache, persist it safely so the client can simply move forward from there (get access to the HTTP and copy it or whatever).

The package-cache is also handy as a default repository and as offline storage.

Levente

- Chris

really bad we should fix that, not surrender to it.

Yes, that should be fixed, but it needs changes on the server side. What I always had in mind was to extend the repository listing with hashes/uuids so that the client could figure out if it needs to download a specific version. But care must be taken not to break the code for non-ss repositories (e.g. simple directory listings).

Levente

- Chris
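
For illustration only, the idea of extending the repository listing with hashes could look roughly like the Python sketch below. Nothing here exists in Monticello or SqueakSource today: the listing format (a mapping from file name to SHA-256 hex digest), the choice of SHA-256, and the function names are all hypothetical. The point is only that a name match alone is not trusted, so a case like Chronology-ul.21, where two different files share one name, results in a fresh download instead of a stale cache hit.

```python
import hashlib
from pathlib import Path


def sha256_of(path):
    """Hex SHA-256 digest of a file, read in chunks to keep memory use flat."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def needs_download(name, server_digest, cache_dir):
    """True unless the package cache holds a file with this name AND this digest."""
    local = Path(cache_dir) / name
    if not local.is_file():
        return True  # not cached at all
    return sha256_of(local) != server_digest  # same name, different content


def plan_downloads(listing, cache_dir):
    """Names from a hypothetical {name: sha256_hex} listing that must be fetched."""
    return [name for name, digest in listing.items()
            if needs_download(name, digest, cache_dir)]
```

A client would call plan_downloads with the extended listing and its package-cache directory, and fetch only the returned names. A repository that serves a plain directory listing has no digests to offer, so a client would have to fall back to its current behaviour there, which is the compatibility concern raised above.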