- Don't proxify WorkingCopy ancestry for the release because we still have a bug.
Chris,
we don't want to proxify not just because it's buggy, but because proxying MC ancestry is a Bad Idea. There are much less brittle ways to achieve this. Our dev tools need to be rock-solid. These proxies are unpredictable and therefore have no place in a stable release.
- Bert -
On 28.01.2014, at 18:26, Bert Freudenberg bert@freudenbergs.de wrote:
- Don't proxify WorkingCopy ancestry for the release because we still have a bug.
Chris,
we don't want to proxify not just because it's buggy, but because proxying MC ancestry is a Bad Idea. There are much less brittle ways to achieve this. Our dev tools need to be rock-solid. These proxies are unpredictable and therefore have no place in a stable release.
+1
Best -Tobias
Bert, perhaps Erich Gamma would be interested in debating the validity of the Proxy pattern with you; I'm not. The only thing I can do is direct you to works on universally accepted design patterns [1] and the scores of systems that use Proxies reliably, every day (including Magma).
Further, I already stated I'm not beholden to solving the problem with the Proxy pattern, yet you continue to hammer your adjectives on it. Why won't you say something about the problem it's targeting and/or offer up one of your "much less brittle ways to achieve this..."?
[1] -- (see Chapter 4) http://www.amazon.com/Design-Patterns-Elements-Reusable-Object-Oriented/dp/0...
or
http://www.amazon.com/The-Design-Patterns-Smalltalk-Companion/dp/0201184621/...
I am not going to argue your straw man. I am talking about MC ancestry specifically, not proxies in general, as you are well aware. You keep saying "my proxies are great if we just fix this last bug here". I disagree.
- Bert -
You weren't clear about that. Your first sentence said you "don't want to proxify", then you said, "proxying MC ancestry is a Bad Idea" without saying why, and finally you said, "these proxies are brittle," possibly suggesting that another kind of proxy besides THESE proxies would fit better..? I still don't know which of these three interpretations you mean, I guess either the 2nd or 3rd since you said not the 1st..
You've reminded me of a client years ago using VW who made me strip the use of a Semaphore out of their code. A simple patch and the code they had worked beautifully (and was tested for months), but because they had prior bad experiences using them incorrectly, they were unable to disassociate their fear of a possible app lock-up from Semaphores themselves. They blamed the Semaphore and chose to switch to synchronous (blocking) calls for everything at the last minute instead of staying with what had already been tested.
If / when you get into the details of the problem (which, I hope you do), and begin to wrestle with the issues of compatibility, performance, transparency, and enabling a variable-sized lookback history, while still preserving ALL history in case we need it -- at that point you might gain an appreciation for how well the Proxy solution aligns itself to the problem and associated issues. Sure, if it's a ticking time-bomb of nitroglycerin, we can't use it, but I never understood why Proxies are good for so many other similar cases but not this one. As with my former client, I sense there may be an emotional component at play. That's totally my fail because if I had squashed these bugs earlier we probably wouldn't be having this conversation..
On 28.01.2014, at 23:38, Chris Muller asqueaker@gmail.com wrote:
You weren't clear about that.
Really now, we're down to grammar?
Your first sentence said you "don't want to proxify",
Which was citing your commit message.
then you said, "proxying MC ancestry is a Bad Idea"
Wherein I specify exactly which proxies I am talking about, not the general idea of proxies.
without saying why,
The whole conversation is about the problems your proxies generate, and it goes back for months, as you remember, so I don't feel the need to restate it.
and finally you said, "these proxies are brittle," possibly suggesting that another kind of proxy besides THESE proxies would fit better..?
No, I wrote "proxying MC ancestry is a Bad Idea", period.
I still don't know which of these three interpretations you mean, I guess either the 2nd or 3rd since you said not the 1st..
I meant precisely what I wrote. No need to guess. Proxying MC ancestry is a Bad Idea.
[... derisive schooling elided ...]
If / when you get into the details of the problem (which, I hope you do) and begin to wrestle with the issues of compatibility, performance, transparency, and enabling a variable-sized lookback history, while still preserving ALL history in case we need it -- at that point you might gain an appreciation for how well the Proxy solution aligns itself to the problem and associated issues. Sure, if it's a ticking time-bomb of nitroglycerin, we can't use it, but I never understood why Proxies are good for so many other similar cases but not this one.
Because MC is a dev tool, not an end-user application. Having a proxy materialization kick in while you're debugging stuff is highly detrimental. They are inherently unpredictable. Just looking at them causes a cross-atlantic network fetch.
The basic problem is that loading the full ancestry happens as a side-effect of some unrelated operation. You're adding more and more patches trying to avoid materializing the proxies. But you can never be sure you found the last one.
The proper way to go about this is to make Monticello aware that the full ancestry might not be available. Then at a few select places where it really needs the full ancestry, insert calls to load that ancestry.
This would ensure that MC again is fully predictable, which you cannot guarantee with the proxy approach.
- Bert -
[... more vitriol disguised as pity ...]
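The contrast drawn in the message above, between ancestry that loads as a hidden side effect and ancestry that loads only where the tool asks for it, can be sketched in a few lines of Python. This is a toy model, not Monticello code: `AncestryProxy`, `WorkingCopy`, `ensure_full_ancestry` and the `Pkg-ab.N` version names are all hypothetical, and the "fetch" is just a list append standing in for a slow network round trip.

```python
class AncestryProxy:
    """Hypothetical stand-in that loads the full ancestry on first forwarded access."""

    def __init__(self, fetch):
        self._fetch = fetch
        self._target = None

    def __getattr__(self, name):
        # Python calls this only for attributes the proxy itself lacks,
        # so any such lookup, however incidental, triggers the fetch.
        if self._target is None:
            self._target = self._fetch()
        return getattr(self._target, name)


class WorkingCopy:
    """Hypothetical working copy that knows its full ancestry may be absent."""

    def __init__(self, recent, fetch):
        self.recent = list(recent)   # the part of the ancestry kept locally
        self._fetch = fetch
        self.full = None             # None means "not loaded yet"

    def ensure_full_ancestry(self):
        # The only place a fetch can happen; callers opt in explicitly.
        if self.full is None:
            self.full = self._fetch()
        return self.full


def make_fetcher(log):
    """Build a fake loader that records each call in `log`."""
    def fetch():
        log.append("fetch")          # stands in for the slow remote load
        return ["Pkg-ab.1", "Pkg-ab.2", "Pkg-ab.3"]
    return fetch


proxy_log, explicit_log = [], []
proxy = AncestryProxy(make_fetcher(proxy_log))
working_copy = WorkingCopy(["Pkg-ab.3"], make_fetcher(explicit_log))

# An unrelated tool pokes at every object it can find.
for obj in (proxy, working_copy):
    hasattr(obj, "some_unrelated_attribute")

fetches_after_scan = (len(proxy_log), len(explicit_log))

# A merge, say, really needs the whole history and asks for it.
full_history = working_copy.ensure_full_ancestry()
```

In this sketch the scan alone makes the proxy fetch, while the explicit variant fetches only when `ensure_full_ancestry` is called. Whether the real proxies behave this way in every case is exactly what the thread is arguing about; the sketch only illustrates the shape of the two designs.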
Good morning Bert,
If / when you get into the details of the problem (which, I hope you do) and begin to wrestle with the issues of compatibility, performance, transparency, and enabling a variable-sized lookback history, while still preserving ALL history in case we need it -- at that point you might gain an appreciation for how well the Proxy solution aligns itself to the problem and associated issues. Sure, if it's a ticking time-bomb of nitroglycerin, we can't use it, but I never understood why Proxies are good for so many other similar cases but not this one.
Because MC is a dev tool, not an end-user application. Having a proxy materialization kick in while you're debugging stuff is highly detrimental. They are inherently unpredictable. Just looking at them causes a cross-atlantic network fetch.
I feel like I'm reading the same message, which is that you don't like them anywhere, not just in MC. Some large database end-user apps are just as important as MC. I doubt you're saying it's okay for such an app to endure "highly detrimental", "inherently unpredictable" Proxies. Maybe you mean for a video game it would be okay..
The basic problem is that loading the full ancestry happens as a side-effect of some unrelated operation.
It happens when a message is sent to the Proxy, so it can't be THAT unrelated.
You're adding more and more patches trying to avoid materializing the proxies.
I haven't touched it in months.
But you can never be sure you found the last one.
You can never be sure you've found the last bug, of any type, in any system. But yes, Proxies are slightly more challenging to debug, and invariably require some "patching". But, in my experience, once they've been debugged, they pay dividends for a long time without having to be touched. That last part is a pattern consistent with any implementation.
The proper way to go about this is to make Monticello aware that the full ancestry might not be available. Then at a few select places where it really needs the full ancestry, insert calls to load that ancestry.
Fine, my goal is to have the problem solved, not to use Proxies or convince you to like them. Let's tackle it in 4.6.
This would ensure that MC again is fully predictable, which you cannot guarantee with the proxy approach.
I feel that oversells it a bit, but I understand your feelings.
[... more vitriol disguised as pity ...]
I often feel like a junior programmer when you lecture me. The Cog VM might crash or an Environments bug might cause Associations to be shared across Dictionaries and you'll say nothing or something helpful. But I feel like you reserve your most dramatic and derisive adjectives for MY work.
Hopefully we'll get each other figured out someday.. :)
On 29.01.2014, at 16:59, Chris Muller asqueaker@gmail.com wrote:
The basic problem is that loading the full ancestry happens as a side-effect of some unrelated operation.
It happens when a message is sent to the Proxy, so it can't be THAT unrelated.
Smalltalk allObjectsDo: …
It is related, but only mediately.
Fine, my goal is to have the problem solved, not to use Proxies or convince you to like them. Let's tackle it in 4.6.
I’m in :) And I’m even more evil: let’s ditch Monticello in its current form. It has virtues (let’s keep them) but also problem areas (like the use of DataStream, fully-zips, and important but meaningless UUIDs clashing with naming conventions). Let’s take the best and leave the rest.[1]
Best -Tobias
[1]: Do not read this as “Monticello is the root of all evil”; it is not. But I have had enough problems with it to favour a leap.
On Wed, Jan 29, 2014 at 10:59 AM, Chris Muller asqueaker@gmail.com wrote:
It happens when a message is sent to the Proxy, so it can't be THAT unrelated.
The problem I ran into was triggered by cleaning out references to obsolete behaviours. This is a low-level piece of the core system. It had nothing to do with Monticello, and took several days of debugging, with help from several people on the list to figure out what was going on.
The thing is, #become: is an exotic tool that should be kept near the bottom of the tool chest, on the same shelf as thisContext and dynamically-created classes. Yes, it's very powerful, but with that power comes responsibility. If we use that tool in a ubiquitous part of the system like Monticello, we're asking every Squeak developer to take on this responsibility, even though they didn't pick up the tool themselves.
Here's another way to look at it: for the benefit they provide, proxies come with a cost, in complexity and unpredictability. You don't notice it much because you're paying that cost anyway, but most of us don't use Magma--or anything similar--in our day to day work. For us, the incremental cost of this change to Monticello is significant.
Colin
On Wed, Jan 29, 2014 at 2:18 AM, Bert Freudenberg bert@freudenbergs.de wrote:
The proper way to go about this is to make Monticello aware that the full ancestry might not be available. Then at a few select places where it really needs the full ancestry, insert calls to load that ancestry.
IMO, there's another thing worth doing, and that is sorting and uniqueifying the history. I see duplicate entries in the ancestry which causes it to bloat (I suspect this happens on e.g. merge, but I'm not sure). I have seen my manual attempts at uniqueifying ancestry shrink significantly the size of mcz files.
On Wed, Jan 29, 2014 at 1:51 PM, Eliot Miranda eliot.miranda@gmail.com wrote:
IMO, there's another thing worth doing, and that is sorting and uniqueifying the history. I see duplicate entries in the ancestry which causes it to bloat (I suspect this happens on e.g. merge, but I'm not sure). I have seen my manual attempts at uniqueifying ancestry shrink significantly the size of mcz files.
We could maybe use Chris' unique registry idea to have canonical instances of VersionInfo. That would save memory in the image. We could also change the mcz format to allow references between nodes in the ancestry tree so that there's no duplicate information there. That would save space inside mcz files.
But the tree structure contains important information, and collapsing the tree into a linear history would prevent MC from doing merges properly.
Colin
Why in the world would we have the same VersionInfo twice in the same ancestry tree?
Are there any examples in any of our trunk packages?
On 30.01.2014, at 17:37, Chris Muller asqueaker@gmail.com wrote:
Why in the world would we have the same VersionInfo twice in the same ancestry tree?
Simple: 'adopt as ancestor'. There.
That's for adding an ancestor, right? That's a very rare use case; I have never used it even once.
And I still fail to understand how it results in a duplicate -- if it is already part of the ancestry, why adopt it again?
On 30.01.2014, at 17:43, Chris Muller asqueaker@gmail.com wrote:
That's for adding an ancestor, right? That's a very rare use case; I have never used it even once.
And I still fail to understand how it results in a duplicate -- if it is already part of the ancestry, why adopt it again?
So I can do the monticello equivalent of a git squash:
The base version is Knorz-cmm.24.mcz. I develop, say, versions 25, 26, 27 and 28 in my own repo, and then I want to publish. So I adopt your version as an ancestor again and commit Knorz-topa.29.mcz, which has Knorz-topa.28.mcz and Knorz-cmm.24.mcz as ancestors. Then you can view diffs more easily.
see also here: https://stackoverflow.com/questions/20966089/monticello-workflow-for-simulta...
Also, I had to do that in SqueakSource to cross-merge features from different SqueakSource-forks more easily.
Best -Tobias
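The squash workflow described above can be written down as a small ancestry table to show where the repeated entry comes from. This is a Python sketch, not Monticello code; only Knorz-cmm.24, Knorz-topa.28 and Knorz-topa.29 are named in the message, so the names given to versions 25 to 27 are assumptions, as is the `walk` helper.

```python
# Each version maps to its direct ancestors. Versions 25-27 are assumed
# to follow the same naming as Knorz-topa.28.
ancestry = {
    "Knorz-cmm.24": [],
    "Knorz-topa.25": ["Knorz-cmm.24"],
    "Knorz-topa.26": ["Knorz-topa.25"],
    "Knorz-topa.27": ["Knorz-topa.26"],
    "Knorz-topa.28": ["Knorz-topa.27"],
    # The published version adopts the base again as a second ancestor.
    "Knorz-topa.29": ["Knorz-topa.28", "Knorz-cmm.24"],
}


def walk(name):
    """Yield each ancestor of `name` once per path that reaches it."""
    for parent in ancestry[name]:
        yield parent
        yield from walk(parent)


visited = list(walk("Knorz-topa.29"))
times_base_seen = visited.count("Knorz-cmm.24")
unique_ancestors = set(visited)
```

Walking the tree from Knorz-topa.29 reaches Knorz-cmm.24 along two paths: once through the chain 28, 27, 26, 25, and once directly. A naive walk therefore lists it twice, while the set of distinct ancestors contains it once.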
On Thu, Jan 30, 2014 at 11:37 AM, Chris Muller asqueaker@gmail.com wrote:
Why in the world would we have the same VersionInfo twice in the same ancestry tree?
It happens every time we merge.
Are there any examples in any of our trunk packages?
Of course.
Colin
In a random trunk image I get
MCVersionInfo instanceCount ==> 9135
MCVersionInfo allInstances asSet size ==> 9037
So uniquing these wouldn't save much overall.
- Bert -
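The arithmetic behind "wouldn't save much" is short enough to spell out. The Python lines below only restate the two counts reported above; the variable names are made up, and the result assumes that `asSet` treats two MCVersionInfo instances describing the same version as equal.

```python
instance_count = 9135   # MCVersionInfo instanceCount
unique_count = 9037     # MCVersionInfo allInstances asSet size

# Instances that a canonicalising registry could in principle eliminate.
duplicates = instance_count - unique_count
duplicate_share = duplicates / instance_count

print(f"{duplicates} duplicates, {duplicate_share:.2%} of all instances")
```

That is 98 redundant instances, or roughly one percent of the total in that image.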
On Thu, Jan 30, 2014 at 1:09 PM, Bert Freudenberg bert@freudenbergs.de wrote:
In a random trunk image I get
MCVersionInfo instanceCount ==> 9135
MCVersionInfo allInstances asSet size ==> 9037
So uniquing these wouldn't save much overall.
In a random mcz file from my cache (Kernel-721.cwp):
     15 package
1482380 snapshot.bin
1407480 snapshot/source.st
 243499 version
These are byte counts, so the ancestry data (in "version") takes up about 8% of the total. If we really want to make these files smaller, we'd do better to get rid of the redundancy between snapshot.bin and snapshot/source.st.
Colin
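The "about 8%" figure can be checked directly from the four byte counts quoted above. The Python sketch below does only that; the dictionary and variable names are made up for illustration.

```python
# Byte counts of the members of the mcz file, as listed in the message.
sizes = {
    "package": 15,
    "snapshot.bin": 1482380,
    "snapshot/source.st": 1407480,
    "version": 243499,
}

total = sum(sizes.values())
shares = {name: size / total for name, size in sizes.items()}

for name, share in shares.items():
    print(f"{name:20s} {share:6.1%}")
```

The "version" member comes to a little under 8% of the total, while snapshot.bin and snapshot/source.st together account for more than 90%, which is why removing the redundancy between those two would matter far more for file size.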
On Thu, Jan 30, 2014 at 10:30 AM, Colin Putney colin@wiresong.com wrote:
In a random mcz file from my cache (Kernel-721.cwp):
     15 package
1482380 snapshot.bin
1407480 snapshot/source.st
 243499 version
These are byte counts, so the ancestry data (in "version") takes up about 8% of the total. If we really want to make these files smaller, we'd do better to get rid of the redundancy between snapshot.bin and snapshot/source.st.
I thought that the issue was not file size but in-image footprint. What do others think?
I thought that the issue was not file size but in-image footprint. What do others think?
MC has three aspects that grow without limit, causing increasing degradation and scaling issues: 1) the in-image ancestry, 2) the allFilenames cache size (this is the number one thing our trunk server spends time doing), and 3) the repetition of code across the .mcz files of a package, which wastes quite a lot of space. Morphic is probably the largest: its 582 versions in trunk consume 857M today. The total size of the packages in /trunk is currently 3.1G.
On Thu, Jan 30, 2014 at 9:25 AM, Colin Putney colin@wiresong.com wrote:
We could also change the mcz format to allow references between nodes in the ancestry tree so that there's no duplicate information there. That would save space inside mcz files.
I should have checked on this before posting. :-)
It turns out we already do this, so optimizing the ancestry data in an mcz file would require a pretty radical change.
Colin
squeak-dev@lists.squeakfoundation.org