I've been trying to sort out mantis-1650 and siblings; oh boy what fun.
Basically at some point the shape of CharacterScanner was changed so that primitive 103 could no longer work; nowadays we waste time in starting up the primitive and never doing anything but failing it. The fallback code is pretty ugly too, though I have a few small improvements for it. We can't reasonably 'fix' the primitive since it is required to support older images, in particular Scratch on the Pi. Anyone doing something that compromises *that* will get a quiet visit from The Boys. We *could* add a new primitive, of course. It's also possible that for a lot of modern machines running Cog it might not be worth it - but not all machines are cogged nor fast.
Part of the complication is that we have rather a lot of font-related classes these days and not all of them are even subclasses of AbstractFont. So far as I can tell the major change was due to an attempt to handle fonts that can have character pair kerning, which looks like only FreeTypeFont. All the others are wasting time both through losing the primitive support and pointlessly finding out that pair-kerning does nothing new. Oh and FT2Face seems to be off on its own for some reason I haven't discerned as yet. Obvious question - who is most up to speed with what the hell Fonts are up to these days? I have some Questioning Instruments warming up for you…
MultiCharacterScanner brings in a whole new level of insanity, not least because it uses identical code including the pointless call of the primitive - and the two senders of these two methods are also (effectively) identical. And do, just for grins, take a look at the only reference to MultiCharacterScanner - FreeTypeFontProvider class>initialize. Oh my. And let's consider references and uses to other classes in that hierarchy - in NewParagraph>characterBlock* MultiCharacterBlockScanner is used for WideStrings but in Paragraph>composeAll it is used for both byte strings and wider strings. And then there is GrafPort>displayScannerForMulti:….
How have we got into such a mixed and messy state? Did some experiment get partially worked on and forgotten? Surely nobody has deliberately made it so convoluted?
tim -- tim Rowledge; tim@rowledge.org; http://www.rowledge.org/tim Strange OpCodes: CLOUT: Call Long-distance On Unused Telephone
On 2013-09-02, at 07:24, tim Rowledge tim@rowledge.org wrote:
How have we got into such a mixed and messy state? Did some experiment get partially worked on and forgotten?
That is pretty much exactly what happened - back in 2005 for the 3.8 release the m17n work done by our Japanese subcommunity was merged in. That work had existed for a couple of years so it was designed to be rather independent of Squeak's old system, rather than being a proper update of it. That is why there is much duplication of machinery. It got to a state that it pretty much worked fine, but no cleanup or simplification was ever attempted.
FT2 is a distinct effort of utilizing the FreeType plugin that was never properly integrated (it's still not in trunk). HostFont had a similar aim but the plugin was only ever implemented on Windows. Scratch's UnicodePlugin is completely separate. And rudimentary Pango support is part of RomePlugin (for odd reasons) but has not been used in Squeak trunk (it is used to render paragraphs of non-Latin scripts in Etoys on Linux).
I agree, it's a mess. Glad you're taking it on :)
- Bert -
I think there is no point in making internationalization unloadable. There is a better image for this purpose and it's named Cuis.
So +1 for merging the single and multi Character*
Also note that Pharo has greater plans about text composition refactorings...
Nicolas
Yeah but on this particular list, the diaspora isn't as interesting as what we're going to do with the bits we've got. I'm a huge fan of and advocate of Cuis, in part because the Cuis crowd still dreams of becoming the basis of a future Squeak. The Pharo crowd though, with Elvis, has left the building.
It's probably best if I stop talking right bloody now.
On 02-09-2013, at 5:25 AM, Casey Ransberger casey.obrien.r@gmail.com wrote:
Yeah but on this particular list, the diaspora isn't as interesting as what we're going to do with the bits we've got. I'm a huge fan of and advocate of Cuis, in part because the Cuis crowd still dreams of becoming the basis of a future Squeak. The Pharo crowd though, with Elvis, has left the building.
I pretty much agree.
On Mon, Sep 2, 2013 at 5:03 AM, Nicolas Cellier nicolas.cellier.aka.nice@gmail.com wrote: I think there is no point in making internationalization unloadable.
I agree, I think. I certainly haven't been looking at this as part of trying to get rid of i18n stuff and the only sense in which I think it would be desirable to make it unloadable is in the very wide aim of making *everything* unloadable. Which isn't actually something I think is sensible, being more of a "make it buildable from a recipe" kinda guy.
There is a better image for this purpose and it's named Cuis.
Cuis is excellent and if I had near-infinite time available I'd work on it.
So +1 for merging the single and multi Character*
I'm not so much trying to merge as understand and clean up so it works as neatly as possible. If merging gets us there, good.
Also note that Pharo has greater plans about text composition refactorings…
I tried out a Pharo image, didn't much enjoy it and don't have time to put any greater effort than that into it. They've gone their own way, do their own thing and maybe they will come up with something wonderful and successful.
What I really could do with finding out is who - if anyone - wrote, maintains, cares about, understands, whatever, the current collection of classes involved. I'd much rather find out what was intended from the horse's mouth than have to infer it all from complicated code with little commenting and no background docs.
tim -- tim Rowledge; tim@rowledge.org; http://www.rowledge.org/tim Useful random insult:- On permanent leave of absence from his senses.
Who, if anyone, is maintaining the FreeType package? Who, if anyone, is using it? It has some rather old methods that nastily over-ride more recent methods in the trunk image. That implies it is moribund to me.
tim -- tim Rowledge; tim@rowledge.org; http://www.rowledge.org/tim A sad tale that brings a lump to the eye and a tear to the throat.
On 02-09-2013, at 12:45 PM, tim Rowledge tim@rowledge.org wrote:
Who, if anyone, is maintaining the FreeType package? Who, if anyone, is using it? It has some rather old methods that nastily over-ride more recent methods in the trunk image. That implies it is moribund to me.
Well it certainly sounds like nobody cares about FreeType. I guess that means nobody will mind as I rewrite some of the low-level font/scanner code and almost certainly break FreeType.
tim -- tim Rowledge; tim@rowledge.org; http://www.rowledge.org/tim Useful random insult:- Not much to show for four billion years of evolution.
On Tue, Sep 03, 2013 at 06:21:08PM -0700, tim Rowledge wrote:
On 02-09-2013, at 12:45 PM, tim Rowledge tim@rowledge.org wrote:
Who, if anyone, is maintaining the FreeType package? Who, if anyone, is using it? It has some rather old methods that nastily over-ride more recent methods in the trunk image. That implies it is moribund to me.
Well it certainly sounds like nobody cares about FreeType. I guess that means nobody will mind as I rewrite some of the low-level font/scanner code and almost certainly break FreeType.
I don't know if anyone is actively using FreeType, although I think that it is (or was?) being used in Pharo. We do try to keep the FT2Plugin healthy in the VM.
If you can clean up the font code, I'd say you should consider yourself empowered.
You might want to check if Juan has addressed any of this in Cuis. You'll never go too far wrong by adopting his work.
Dave
I'm not sure what the deal here is. It's native OS fonts, I think, and also something that hasn't come up to my recollection since Juan gave us a much-nicer-than-we-had (anti-aliased) font from Cuis.
I'd like to see us depend less on host facilities now that we have the Cogs, so I'm biased, but I'd say: go ahead and break it if no one is raising an objection.
Is this related to the Scratch work, out of curiosity?
Pharo uses Freetype, see a mail from Mariano Martinez Peck, 20th Nov last year marianopeck@gmail.com
world menu -> system -> settings -> type "fonts" and hit enter -> check on "Use Free type..." (wait that it loads fonts). and that's all. Then if you go and select a font, you should see all of them
I think it is not a big effort to download a Pharo image and check out what they have been doing in this area.
Assuming Tim breaks the FreeType plugin, is it still possible to display Host System True Type fonts?
--Hannes
On Wed, Sep 04, 2013 at 03:59:16AM +0000, H. Hirzel wrote:
Pharo uses Freetype, see a mail from Mariano Martinez Peck, 20th Nov last year marianopeck@gmail.com
world menu -> system -> settings -> type "fonts" and hit enter -> check on "Use Free type..." (wait that it loads fonts). and that's all. Then if you go and select a font, you should see all of them
I think it is not a big effort to download a Pharo image and check out what they have been doing in this area.
You are right, this would be a good reference and the Pharo implementation looks good.
Assuming Tim breaks the FreeType plugin, is it still possible to display Host System True Type fonts?
The plugin and the VM should not be affected by anything that Tim is doing.
When you build a Squeak VM (either Cog or interpreter VM), you are using the FT2Plugin code from the squeaksource.com/FreetypePlugin repository as well as some support code from the squeaksource.com/FreeTypePlus repository. These provide the source code for the FT2Plugin in the VM, so any changes to font handling in the Squeak image should have no impact on the VM plugin.
On 03-09-2013, at 8:59 PM, "H. Hirzel" hannes.hirzel@gmail.com wrote:
Pharo uses Freetype, see a mail from Mariano Martinez Peck, 20th Nov last year marianopeck@gmail.com
world menu -> system -> settings -> type "fonts" and hit enter -> check on "Use Free type..." (wait that it loads fonts). and that's all. Then if you go and select a font, you should see all of them
I think it is not a big effort to download a Pharo image and check out what they have been doing in this area.
I have a Pharo image and have run it; I don't expect to use it much because I find it doesn't please me; of course, I can use it briefly to investigate what other people have done.
Assuming Tim breaks the FreeType plugin, is it still possible to display Host System True Type fonts?
Well, firstly I'm not intending to break the plugin - it would be difficult since I don't have a password for that repository! I *might* conclude that the best way to make things better is to 'break' it in the sense of changing image code enough that the plugin would need changing to keep working - and if that happens and nobody chooses to make similar changes and support them in Pharo/Cuis, then I guess I'd have to fork the plugin code.
TrueType fonts should work exactly as well or as badly as they did. If I find problems I can understand I'll probably manage to fix them, or at least report them. Since TTFs seem pretty much converted to StrikeFonts with slightly different glyph format I don't see much problem. Do TTF files have pair-kerning maps? They're not used in Squeak if so; that might be something worth trying to improve at some point.
Who knows what StrikeFontSets are intended for? I don't see a lot of evidence of usage. I can sort of see evidence that they are (or were?) a way to build a set of glyphs that could cope with a more-than-256-char language from the behaviour of the glyphInfoInt: method, but beyond that… no real idea. No blasted comments! What happened to the old policy of "no comment, no inclusion" ? Similarly TTCFontSets…
To confuse things a bit more, there is a HostFont class that is rather oddly a subclass of StrikeFont and yet seems to load some platform format files (hard to say what since NO DAMN COMMENTS) which include potential kerning data - and then so far as I can see ignores all that and just makes a plain strike font. Oh - and nobody but win32 has the relevant FontPlugin anyway. Obvious question here is whether this is work to be kept and expanded or dropped in favour of FreeType or some other portable system. The code hasn't been updated in 9 years according to SVN, so I suspect it is dead.
There's also Cairo/Pango/whatever (I really should write a WhateverPlugin someday) that probably need considering but since I know damn-all about them and have very little spare time, somebody will have to explain them to me in words of a short and simple nature.
Gosh, aren't fonts fun?
tim -- tim Rowledge; tim@rowledge.org; http://www.rowledge.org/tim Useful random insult:- Has a one-way ticket on the Disoriented Express.
After simplifying the scanning code a bit I'm looking into why we have the seemingly insane situation of two parallel hierarchies of CharacterScanner. So far it looks like there are no really substantive differences between CharacterScanner and MultiCharacterScanner and their subclasses. This seems like a mistake somewhere; certainly it could be mine, missing something important.
What is the intent of MultiXXXXX ? What is CombinedChar for? Are they, honestly, still needed? Or should the older versions be removed instead? Who wrote the new classes and is that person still maintaining them? Is he/she still around here?
tim -- tim Rowledge; tim@rowledge.org; http://www.rowledge.org/tim Useful Latin Phrases:- Utinam coniurati te in foro interficiant! = May conspirators assassinate you in the mall!
On 9/5/13, tim Rowledge tim@rowledge.org wrote:
After simplifying the scanning code a bit I'm looking into why we have the seemingly insane situation of two parallel hierarchies of CharacterScanner. So far it looks like there are no really substantive differences between CharacterScanner and MultiCharacterScanner and their subclasses. This seems like a mistake somewhere; certainly it could be mine, missing something important.
What is the intent of MultiXXXXX ? What is CombinedChar for?
They are part of the m17n package done by Yoshiki Ohshima and introduced into Squeak 3.8
http://wiki.squeak.org/squeak/756
also check out the mailing list archive searching for m17n
Are they, honestly, still needed? Or should the older versions be removed instead?
Maybe. I assume there was a reason for not removing the older classes.
Who wrote the new classes and is that person still maintaining them? Is he/she still around here?
tim
On Wed, Sep 4, 2013 at 8:24 PM, tim Rowledge tim@rowledge.org wrote:
After simplifying the scanning code a bit I'm looking into why we have the seemingly insane situation of two parallel hierarchies of CharacterScanner. So far it looks like there are no really substantive differences between CharacterScanner and MultiCharacterScanner and their subclasses. This seems like a mistake somewhere; certainly it could be mine, missing something important.
It's all my fault and incompetence. I am sorry.
What is the intent of MultiXXXXX ? What is CombinedChar for? Are they, honestly, still needed? Or should the older versions be removed instead? Who wrote the new classes and is that person still maintaining them? Is he/she still around here?
This kind of stuff touches the part of Squeak that *has to* work. Once the "MultiCharacterScanner" worked and people were confident, it was in theory possible to ditch the old implementation; but I did not think back then that it (replacing fundamental code with a "work-in-progress" version) was acceptable to the community. IF there was enough man-power, there would have been more variation of such scanners implemented for different writing systems; keeping the original version that works for byte strings would have been useful under that light.
CombinedChar creates a precomposed character from a decomposed sequence of Unicode code points when possible. For a certain keyboard, it was needed.
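For readers who do not have the Squeak image to hand, the "precomposed from decomposed, when possible" behaviour can be sketched outside Smalltalk. This is only an illustration using Python's standard unicodedata module and Unicode NFC normalization; it is not CombinedChar's actual algorithm, and the helper name compose is made up for the example.

```python
import unicodedata


def compose(sequence):
    """Return the NFC form of a string: precomposed characters where they exist."""
    return unicodedata.normalize("NFC", sequence)


# 'e' followed by COMBINING ACUTE ACCENT has a precomposed form (U+00E9).
decomposed = "e\u0301"
composed = compose(decomposed)

# 'q' followed by COMBINING ACUTE ACCENT has no precomposed form,
# so the sequence stays as two code points: the "when possible" case.
uncomposable = compose("q\u0301")

print(len(decomposed), len(composed), len(uncomposable))
```

The second case is why a scanner cannot assume that composition always collapses a sequence to a single character.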
On 05-09-2013, at 4:59 PM, Yoshiki Ohshima Yoshiki.Ohshima@acm.org wrote:
On Wed, Sep 4, 2013 at 8:24 PM, tim Rowledge tim@rowledge.org wrote:
After simplifying the scanning code a bit I'm looking into why we have the seemingly insane situation of two parallel hierarchies of CharacterScanner. So far it looks like there are no really substantive differences between CharacterScanner and MultiCharacterScanner and their subclasses. This seems like a mistake somewhere; certainly it could be mine, missing something important.
It's all my fault and incompetence. I am sorry.
Well, it might be your 'fault' but I rather doubt it was incompetence…
What is the intent of MultiXXXXX ? What is CombinedChar for? Are they, honestly, still needed? Or should the older versions be removed instead? Who wrote the new classes and is that person still maintaining them? Is he/she still around here?
This kind of stuff touches the part of Squeak that *has to* work. Once the "MultiCharacterScanner" worked and people were confident, it was in theory possible to ditch the old implementation; but I did not think back then that it (replacing fundamental code with a "work-in-progress" version) was acceptable to the community. IF there was enough man-power, there would have been more variation of such scanners implemented for different writing systems; keeping the original version that works for byte strings would have been useful under that light.
So if I understand you correctly, there *should* be no particular differences in what the two types of scanner do? You made a parallel set in order to insulate your work from the tools that you needed to keep working in order to keep making the i18n stuff?
I've worked through several of the scanners without finding any major differences, but not yet all of them. It certainly looks to me that there is nothing to stop us having only one set. I suspect there may be some bug fixes in the more recently created classes, though I did notice at least a couple of places where the method in the old scanner class was actually newer than its equivalent in the new scanner. Do you recall any serious changes made to support multi-byte strings?
CombinedChar creates a precomposed character from a decomposed sequence of Unicode code points when possible. For a certain keyboard, it was needed.
Ah, yes, now I see. Should CombinedChars ever exist outside that very narrow area of reading the keyboard and then copying out the results to the paragraphs? I didn't see any use beyond that but it can be hard to trace everything.
If it's actually possible to simplify and get rid of a duplication of classes it would be nice to clean up!
Right now I'm thinking about refactoring to allow the class of the string and the font to be used instead of explicit tests for widestring and font-does-kerning etc. It seems to me that modern font systems are much more 'active' than we used to think of StrikeFonts being and maybe it is time fonts did their own scanning. That way it could be via simple methods, a prim or even a call out to a library. I'm aiming to make sure that the simple cases work really fast on slow machines (can we say Raspberry Pi?) and the complex cases at least work decently.
tim -- tim Rowledge; tim@rowledge.org; http://www.rowledge.org/tim Strange OpCodes: RDR: Rotate Disk Right
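A minimal sketch of the "let the font do its own scanning" idea, written in Python rather than Smalltalk purely for illustration. Every name here (PlainFont, KerningFont, scan_width, measure) is hypothetical and none of it is existing Squeak code; the point is only that dispatching on the font's class removes the explicit "does this font kern?" test, so a font without kerning never pays for a pair lookup.

```python
class PlainFont:
    """Hypothetical font with per-character widths and no pair kerning."""

    def __init__(self, widths, default_width=8):
        self.widths = widths
        self.default_width = default_width

    def width_of(self, ch):
        return self.widths.get(ch, self.default_width)

    def scan_width(self, text):
        # Simple case: sum the widths, no kerning lookups at all.
        return sum(self.width_of(ch) for ch in text)


class KerningFont(PlainFont):
    """Hypothetical font that also carries a pair-kerning table."""

    def __init__(self, widths, kern_pairs, default_width=8):
        super().__init__(widths, default_width)
        self.kern_pairs = kern_pairs

    def scan_width(self, text):
        total = super().scan_width(text)
        # Only this class walks adjacent pairs and applies adjustments.
        for left, right in zip(text, text[1:]):
            total += self.kern_pairs.get((left, right), 0)
        return total


def measure(text, font):
    # The caller never asks what kind of font it has; the font decides.
    return font.scan_width(text)


plain = PlainFont({"A": 10, "V": 10})
kerned = KerningFont({"A": 10, "V": 10}, {("A", "V"): -2, ("V", "A"): -2})
print(measure("AVA", plain), measure("AVA", kerned))
```

In the same spirit, a font class could answer the measurement from a primitive or an external library call without the caller changing.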
On Thu, Sep 5, 2013 at 5:21 PM, tim Rowledge tim@rowledge.org wrote:
On 05-09-2013, at 4:59 PM, Yoshiki Ohshima Yoshiki.Ohshima@acm.org wrote:
What is the intent of MultiXXXXX ? What is CombinedChar for? Are they, honestly, still needed? Or should the older versions be removed instead? Who wrote the new classes and is that person still maintaining them? Is he/she still around here?
This kind of stuff touches the part of Squeak that *has to* work. Once the "MultiCharacterScanner" worked and people were confident, it was in theory possible to ditch the old implementation; but I did not think back then that it (replacing fundamental code with a "work-in-progress" version) was acceptable to the community. IF there was enough man-power, there would have been more variation of such scanners implemented for different writing systems; keeping the original version that works for byte strings would have been useful under that light.
So if I understand you correctly, there *should* be no particular differences in what the two types of scanner do? You made a parallel set in order to insulate your work from the tools that you needed to keep working in order to keep making the i18n stuff?
Not quite. The analogy for WideString and String was like LargeInteger and SmallInteger, and CharacterScanner was like a different implementation of #+. MultiCharacterScanner handles WideStrings, especially when characters with different leading chars are involved. So the functionality is different.
I've worked through several of the scanners without finding any major differences, but not yet all of them. It certainly looks to me that there is nothing to stop us having only one set. I suspect there may be some bug fixes in the more recently created classes, though I did notice at least a couple of places where the method in the old scanner class was actually newer than its equivalent in the new scanner. Do you recall any serious changes made to support multi-byte strings?
The serious change was for handling leading char, and also the different line breaking rules for different languages.
CombinedChar creates a precomposed character from a decomposed sequence of Unicode code points when possible. For a certain keyboard, it was needed.
Ah, yes, now I see. Should CombinedChars ever exist outside that very narrow area of reading the keyboard and then copying out the results to the paragraphs? I didn't see any use beyond that but it can be hard to trace everything.
Whenever you want to find out whether a sequence is composable, it is potentially useful.
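To make the "leading char" point concrete, here is a rough sketch in Python. It assumes that a wide character value packs a leading-char tag into the bits above a 22-bit code field; that layout is my recollection of the m17n design and should be treated as an assumption, not a statement of the actual image code. The tag numbers in the usage lines and all function names are made up for the example.

```python
from itertools import groupby

# Assumption: the character code lives in the low 22 bits and the
# leading char (language/encoding tag) in the bits above them.
LEADING_CHAR_SHIFT = 22
CODE_MASK = (1 << LEADING_CHAR_SHIFT) - 1


def make_char(leading_char, code):
    """Pack a leading char tag and a character code into one integer value."""
    return (leading_char << LEADING_CHAR_SHIFT) | (code & CODE_MASK)


def leading_char(value):
    return value >> LEADING_CHAR_SHIFT


def char_code(value):
    return value & CODE_MASK


def runs_by_leading_char(values):
    """Split a sequence of character values into runs sharing one leading char."""
    return [
        (lead, [char_code(v) for v in group])
        for lead, group in groupby(values, key=leading_char)
    ]


# Hypothetical tags: 0 for Latin text, 5 for a Japanese encoding.
text = [make_char(0, 0x41), make_char(0, 0x42), make_char(5, 0x3042), make_char(0, 0x43)]
print(runs_by_leading_char(text))
```

A scanner working this way has to stop at each run boundary, since the font and line-breaking rules may change there; a byte-string scanner never sees such a boundary.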
Hi Yoshiki, I also note that there is a presentation and presentationLine in MultiCharacterScanner, could you say a word about these inst. vars.?
Nicolas
At Fri, 6 Sep 2013 23:02:57 +0200, Nicolas Cellier wrote:
Hi Yoshiki, I also note that there is a presentation and presentationLine in MultiCharacterScanner, could you say a word about these inst. vars.?
That (IIRC) was also something to do with the mapping from the logical sequence of code points (that is what a Unicode string is) to the list of "characters" that can be used to fetch the glyphs. IOW, "presentation" is something created by looking at combinations in the logical sequence.
Again, we did not go too far; I think we supported simple accented characters but not much more.
-- Yoshiki
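The mapping from a logical sequence of code points to a list of displayable "characters" can be illustrated roughly as follows. This is a Python sketch using the standard unicodedata module, not the MultiCharacterScanner code; the function name presentation_clusters is hypothetical, and it handles only the simple accented-character case mentioned above.

```python
import unicodedata


def presentation_clusters(text):
    """Group each base character with the combining marks that follow it."""
    clusters = []
    for ch in text:
        if clusters and unicodedata.combining(ch) != 0:
            # A combining mark attaches to the cluster before it.
            clusters[-1] += ch
        else:
            clusters.append(ch)
    return clusters


logical = "e\u0301a"  # three code points: 'e', COMBINING ACUTE ACCENT, 'a'
clusters = presentation_clusters(logical)
print(len(logical), len(clusters))
```

Each cluster is then the unit used to fetch a glyph, so three logical code points become two presentation entries here.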
On 06-09-2013, at 3:26 PM, Yoshiki Ohshima Yoshiki.Ohshima@acm.org wrote:
At Fri, 6 Sep 2013 23:02:57 +0200, Nicolas Cellier wrote:
Hi Yoshiki, I also note that there is a presentation and presentationLine in MultiCharacterScanner, could you say a word about these inst. vars.?
That (IIRC) was also something to do with the mapping from the logical sequence of code points (that is what a Unicode string is) to the list of "characters" that can be used to fetch the glyphs. IOW, "presentation" is something created by looking at combinations in the logical sequence.
Again, we did not go too far; I think we supported simple accented characters but not much more.
So far as I could tell from looking at senders and implementors, there wasn't really any use made of 'presentation' and not much of 'presentationLine'; certainly little enough that my *guess* would be they could go away without changing anything.
tim -- tim Rowledge; tim@rowledge.org; http://www.rowledge.org/tim Useful random insult:- Suffers from permanent rapture of the deep. (Nitrogen narcosis.)
I should add that I'm keeping notes in mantis 1650 - http://bugs.squeak.org/view.php?id=1650 - for anyone interested in offering improvements or explanations or even promises of treats upon success.
tim -- tim Rowledge; tim@rowledge.org; http://www.rowledge.org/tim 42.7 percent of all statistics are made up on the spot.
In the Etoys image, #addCharToPresentation: is called from #scanMultiCharactersCombiningFrom:...., and that is dispatched from Unicode class>>scanSelector. But I guess the mechanism was removed from the trunk at some point in the past.
-- Yoshiki
Yep, I did change it in the following commit; unfortunately I omitted to say exactly what the problem was...
Name: Multilingual-nice.116
Author: nice
Time: 27 March 2010, 11:22:00.573 pm
UUID: 6339699b-51ec-fb41-a1e0-c8246b621919
Ancestors: Multilingual-ul.115
Don't let Unicode use #scanMultiCharactersCombiningFrom:to:in:rightX:stopConditions:kern: until problems are fixed. Anyway, combining diacritical was experimental and not really operational.
On 4 September 2013 19:41, tim Rowledge tim@rowledge.org wrote:
On 03-09-2013, at 8:59 PM, "H. Hirzel" hannes.hirzel@gmail.com wrote:
Pharo uses Freetype, see a mail from Mariano Martinez Peck, 20th Nov last year
world menu -> system -> settings -> type "fonts" and hit enter -> check on "Use Free type..." (wait that it loads fonts). and that's all. Then if you go and select a font, you should see all of them
I think it is not a big effort to download a Pharo image and check out what they have been doing in this area.
I have a Pharo image and have run it; I don't expect to use it much because I find it doesn't please me; of course, I can use it briefly to investigate what other people have done.
Assuming Tim breaks the FreeType plugin, is it still possible to display Host System True Type fonts?
Well, firstly I'm not intending to break the plugin - it would be difficult since I don't have a password for that repository!
I would be happy to fix that (by adding you as a developer), but SqueakSource is living its last days and does not persist project changes (including added developers), which means that every time the SqS image reboots you will need to ask to be added again. Today all the relevant Pharo VM sources are on GitHub (yes, including all the .st sources in FileTree form); the FreeType plugin is one of them, and we're going to support it there. Needless to say, everyone is invited to contribute there. https://github.com/pharo-project/pharo-vm
I *might* conclude that the best way to make things better is to 'break' it in the sense of changing image code enough that the plugin would need changing to keep working - and if that happens and nobody chooses to make similar changes and support them in Pharo/Cuis, then I guess I'd have to fork the plugin code.
TrueType fonts should work exactly as well or as badly as they did. If I find problems I can understand I'll probably manage to fix them, or at least report them. Since TTFs seem pretty much converted to StrikeFonts with slightly different glyph format I don't see much problem. Do TTF files have pair-kerning maps? They're not used in Squeak if so; that might be something worth trying to improve at some point.
Who knows what StrikeFontSets are intended for? I don't see a lot of evidence of usage. I can sort of see evidence that they are (or were?) a way to build a set of glyphs that could cope with a more-than-256-char language from the behaviour of the glyphInfoInt: method, but beyond that… no real idea. No blasted comments! What happened to the old policy of "no comment, no inclusion" ? Similarly TTCFontSets…
If I understood correctly, a font set is like the fonts of one family (let's say Arial) in different variants: bold/italic/bold+italic, held in its fontArray ivar. Then there are places which use a 'font index' (see TextFontChange) to point to a concrete font in that set, which is utterly ugly, I would say. And yes, there are no comments, since "smalltalk code is self-explanatory" :)
To confuse things a bit more, there is a HostFont class that is rather oddly a subclass of StrikeFont and yet seems to load some platform format files (hard to say what since NO DAMN COMMENTS) which include potential kerning data - and then so far as I can see ignores all that and just makes a plain strike font. Oh - and nobody but win32 has the relevant FontPlugin anyway. Obvious question here is whether this is work to be kept and expanded or dropped in favour of FreeType or some other portable system. The code hasn't been updated in 9 years according to SVN, so I suspect it is dead.
There's also Cairo/Pango/whatever (I really should write a WhateverPlugin someday) that probably need considering but since I know damn-all about them and have very little spare time, somebody will have to explain them to me in words of a short and simple nature.
Gosh, aren't fonts fun?
You bet! :)
tim
tim Rowledge; tim@rowledge.org; http://www.rowledge.org/tim Useful random insult:- Has a one-way ticket on the Disoriented Express.
On Tue, 3 Sep 2013, Casey Ransberger wrote:
I'm not sure what the deal here is. It's native OS fonts, I think, and also something that hasn't come up to my recollection since Juan gave us a much-nicer-than-we-had (anti-aliased) font from Cuis.
I didn't really follow recent changes to Cuis, but AFAIK it only supports byte characters, so those fonts only contain some latin characters (Latin-9 maybe).
Levente
I'd like to see us depend less on host facilities now that we have the Cogs, so I'm biased, but I'd say: go ahead and break it if no one is raising an objection.
Is this related to the Scratch work, out of curiosity?
On Sep 3, 2013, at 6:21 PM, tim Rowledge tim@rowledge.org wrote:
On 02-09-2013, at 12:45 PM, tim Rowledge tim@rowledge.org wrote:
Who, if anyone, is maintaining the FreeType package? Who, if anyone, is using it? It has some rather old methods that nastily over-ride more recent methods in the trunk image. That implies it is moribund to me.
Well it certainly sounds like nobody cares about FreeType. I guess that means nobody will mind as I rewrite some of the low-level font/scanner code and almost certainly break FreeType.
tim
tim Rowledge; tim@rowledge.org; http://www.rowledge.org/tim Useful random insult:- Not much to show for four billion years of evolution.
Below
On Sep 4, 2013, at 6:35 AM, Levente Uzonyi leves@elte.hu wrote:
On Tue, 3 Sep 2013, Casey Ransberger wrote:
I'm not sure what the deal here is. It's native OS fonts, I think, and also something that hasn't come up to my recollection since Juan gave us a much-nicer-than-we-had (anti-aliased) font from Cuis.
I didn't really follow recent changes to Cuis, but AFAIK it only supports byte characters, so those fonts only contain some latin characters (Latin-9 maybe).
Levente
I believe this is correct, though someone (Ken Dickey maybe?) has been working on a simplified Unicode implementation for Cuis.
CC Cuis list.
Ken is working on a Unicode add-on for Cuis. [1] He has a working prototype at this stage.
Cuis 4.2 [2] runs on 8 bit characters (ISO Latin 9 - ISO 8859-15) but it can read and write UTF8 files. For any Unicode chars not in ISO Latin 9 (ISO 8859-15), embed an NCR. See http://en.wikipedia.org/wiki/Numeric_character_reference
[1] https://github.com/KenDickey/Cuis-Smalltalk-Unicode
[2] https://github.com/Cuis-Smalltalk/Cuis-Smalltalk-Dev
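To make the NCR suggestion concrete, here is a small sketch of the idea: characters that exist in ISO 8859-15 are stored as single bytes, and anything else is written as a decimal numeric character reference. This is not Cuis code; it assumes Python's built-in `iso8859-15` codec with the `xmlcharrefreplace` error handler, and the function names are invented for the example.

```python
import html


def to_latin9_with_ncr(text):
    """Encode as ISO 8859-15; unencodable characters become &#NNNN; references."""
    return text.encode("iso8859-15", errors="xmlcharrefreplace")


def from_latin9_with_ncr(data):
    """Decode ISO 8859-15 bytes and expand any numeric character references."""
    return html.unescape(data.decode("iso8859-15"))


# e-acute and the euro sign are in Latin 9; HIRAGANA LETTER A (U+3042) is not
original = "caf\u00e9 \u20ac5 \u3042"
encoded = to_latin9_with_ncr(original)
restored = from_latin9_with_ncr(encoded)
```

One caveat of this naive round trip: `html.unescape` also expands references that were already present as literal text, so a real implementation would need an escaping rule for a literal `&#`.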
P.S. https://github.com/KenDickey/Cuis-Smalltalk-Unicode is separate from Core-Cuis in terms of Unicode support. If you load the package you have two character classes.
On 4 September 2013 03:21, tim Rowledge tim@rowledge.org wrote:
On 02-09-2013, at 12:45 PM, tim Rowledge tim@rowledge.org wrote:
Who, if anyone, is maintaining the FreeType package? Who, if anyone, is using it? It has some rather old methods that nastily over-ride more recent methods in the trunk image. That implies it is moribund to me.
Well it certainly sounds like nobody cares about FreeType. I guess that means nobody will mind as I rewrite some of the low-level font/scanner code and almost certainly break FreeType.
(sorry for being late on topic)
In Pharo, we certainly do. But the FreeType code also needs a decent review and cleanup for sure. The FreeType package is an integral part of the Pharo base image, and is maintained there. As for its plugin, I even managed to fix a bug in it recently, though in a primitive which nobody uses; mainly because I had plans to use it, but I haven't had time to experiment yet. (I still think, maybe naively, that FT rendering speed can be improved.)
Concerning MultiCharacterMeetPainfulDeathWhenStaringAtIt ... it is just impossible to do anything with it, especially considering that it is used for rendering text: if you break or change it, you won't be able to fix it (because the image uses the very same code you just broke to render all text). At least in Pharo, we decided not to even try to do something about it (and as far as I know, nobody has tried to do anything for years); instead we decided to write things from scratch and, when that is ready, throw all this out without a bit of regret.
On 12-09-2013, at 3:30 PM, Igor Stasenko siguctua@gmail.com wrote:
In Pharo, we certainly do. But FreeType code also needs a decent review and cleanup for sure.
Most code does; little code ever gets one.
The FreeType package is an integral part of the Pharo base image, and is maintained there. As for its plugin, I even managed to fix a bug in it recently, though in a primitive which nobody uses; mainly because I had plans to use it, but I haven't had time to experiment yet. (I still think, maybe naively, that FT rendering speed can be improved.)
Probably the smart thing is to use whatever is the best, most fastidiously maintained library with the widest platform spread. Go for something that makes good use of GPUs. If FreeType is that, it's worth the pain. Probably. Just make sure it runs on ARM based machines from the get-go or it will become irrelevant within a few years. Intel is being eaten alive.
Concerning MultiCharacterMeetPainfulDeathWhenStaringAtIt ... it is just impossible to do anything with it, especially considering that it is used for rendering text: if you break or change it, you won't be able to fix it (because the image uses the very same code you just broke to render all text). At least in Pharo, we decided not to even try to do something about it (and as far as I know, nobody has tried to do anything for years); instead we decided to write things from scratch and, when that is ready, throw all this out without a bit of regret.
Oh you wimps! Where is the fun in that? If it doesn't involve quantum transforms via irrational phase-space dimensions while reciting Vogon Poetry in Klingon then it isn't difficult enough.
tim -- tim Rowledge; tim@rowledge.org; http://www.rowledge.org/tim Strange OpCodes: BW: Branch on Whim
On 13 September 2013 00:43, tim Rowledge tim@rowledge.org wrote:
Most code does; little code ever gets one.
Amen.
Probably the smart thing is to use whatever is the best, most fastidiously maintained library with the widest platform spread. Go for something that makes good use of GPUs. If FreeType is that, it's worth the pain. Probably. Just make sure it runs on ARM based machines from the get-go or it will become irrelevant within a few years. Intel is being eaten alive.
AFAIK, Esteban managed to build the FT2 library on iOS and can use it in iOS VMs, so from that perspective I think it is a safe choice. As for GPUs & stuff: even if you don't use the FT2 library for rendering, but just for reading the font data and extracting all the necessary info (like char/glyph mappings, kerning, and glyph metrics + outlines), that functionality alone is a good enough reason to keep using it.
Oh you wimps! Where is the fun in that? If it doesn't involve quantum transforms via irrational phase-space dimensions while reciting Vogon Poetry in Klingon then it isn't difficult enough.
Well, if there were any hope that it could be shaped into something more or less ugly (instead of horrifying), I would be all for it. The main issue is that the code makes a lot of assumptions about fonts and glyph layout in many places, and things are heavily optimized to render the first 256 characters with the old primitive (which doesn't work, btw). FreeType fonts can't use that primitive, only raster fonts can. There are other assumptions about how fonts are rendered and by what, and a lot of these concerns are mixed in a single place, so I do not see a solution there. We decided to go a more radical way in Pharo: we are building a new text model (a replacement for the Text class) and a new layout and rendering engine for it. (Oh, and if you think that is less difficult than dealing with the old code, I'm afraid I must upset you ;). The text domain is inherently complex.
On Fri, 13 Sep 2013, Igor Stasenko wrote:
Concerning MultiCharacterMeetPainfulDeathWhenStaringAtIt ... it is just impossible to do anything with it, especially considering that it is used for rendering text: if you break or change it, you won't be able to fix it (because the image uses the very same code you just broke to render all text).
Why can't you just create a copy and modify that? That way you won't break your image.
Levente
On 13 September 2013 00:44, Levente Uzonyi leves@elte.hu wrote:
Why can't you just create a copy and modify that? That way you won't break your image.
That, I think, is exactly the reason why we have both CharacterScanner and MultiCharacterScanner in our images today. So, rephrasing: "it has been tried once, and it doesn't seem to help" :)
On Fri, 13 Sep 2013, Igor Stasenko wrote:
That, I think, is exactly the reason why we have both CharacterScanner and MultiCharacterScanner in our images today. So, rephrasing: "it has been tried once, and it doesn't seem to help" :)
You're totally wrong about that one. :)
Levente
OK, after a couple of weeks rewriting how Scratch draws tiles I'm back to looking at dear old scanners and fonts.
After side-by-side comparing the old scanners with the new 'Multi' scanners, my conclusion is that there is *very* little difference and we really ought to be able to go back to a single set of classes. Which, I claim, would be nice, since we've already visibly suffered from the obvious side-effect of having two trees as bug fixes get added to only one part.
So far as I could tell the only substantive difference relates to the use of the presentation/presentationLine ivars which seems to be not very important (ref Yoshiki's message 8 sept) and even seems to be mostly inactive (ref nice's message regarding Multilingual-nice.116 same date). It would be really nice to get a solid decision on whether it is still wanted, or should be removed, or if it needs some fixes that can be provided.
I'm puzzled by quite a few things I've discovered.
a) There are {language}environment classes and encoding classes. There is #isBreakableAt:in: implemented in both but seemingly unused in the encoding classes because it is just plain broken there. Should it be removed from the encoders? In the language environment classes it is implemented to return true for space and cr by default, but space, cr & lf in Latin1 and Latin2. Is that as expected?
b) MultiCharacterScanner>setConditionArray: cuts out the handling of #space - any ideas why?
c) as previously mentioned MultiCharacterScanner>addCharToPresentation: is currently unused, apparently because of issues with Unicode & #scanMultiCharactersCombiningFrom:to:in:rightX:stopConditions:kern: - do we have any decent hope of a fix?
d) MultiCanvasCharacterScanner>setFont uses baselineY differently to its CanvasCharacterScanner equivalent; why?
e) TextComposer>composeLinesFrom:to:delta:into:…. differs minimally from MultiTextComposer>multiComposeLinesFrom:to:delta:into:…. and is the only sender of #canComputeDefaultLineHeight. What is the intent? Is this just a bug fix added in one place and not the other?
f) one of the oddest - DisplayScanner>displayLine:offset:leftInRun: passes the displayString:.. to the font (which I see as good) but MultiDisplayScanner uses the bitlblt instead which seems quite wrong.
That'll do for now. I hope we can get some information together to allow this to be improved since it should really simplify an important area of code that gets a lot of exercise and needs to be as fast as possible. I know most of you are running 128 core 75GHz machines with 42Tb of ram and so on, so it hardly matters, but there are over a million Pi's trying to run Scratch that need help. And there will almost certainly be *many* millions more trying to use Scratch and Squeak over the next year or so, scattered across parts of the world where a Pi is more computer than anyone could have imagined a year ago - Pi is probably going to be *the* platform in sub-Saharan Africa and much of Asia. Let's at least try to make it good, eh? This is what Smalltalk has always claimed to be about.
tim -- tim Rowledge; tim@rowledge.org; http://www.rowledge.org/tim Strange OpCodes: AGO: Allow Games Only
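For readers trying to picture question (a) above, here is a tiny sketch of the behaviour as described in the message: the default test treats space and cr as break points, while the Latin1/Latin2 environments also accept lf. It is only a model of that description, not the Squeak #isBreakableAt:in: code; the table names and the function signature are invented, and indices are 0-based here where Smalltalk's are 1-based.

```python
SPACE, CR, LF = " ", "\r", "\n"

# Hypothetical tables mirroring the behaviour described in the message
DEFAULT_BREAKS = frozenset({SPACE, CR})
LATIN_BREAKS = frozenset({SPACE, CR, LF})


def is_breakable_at(index, text, breaks=DEFAULT_BREAKS):
    """Answer whether a line may be broken at text[index] (0-based)."""
    return text[index] in breaks


# The same text gives different answers depending on which table is used
sample = "one\ntwo three"
default_points = [i for i in range(len(sample)) if is_breakable_at(i, sample)]
latin_points = [i for i in range(len(sample)) if is_breakable_at(i, sample, LATIN_BREAKS)]
```

With the default table only the space is a break point; with the Latin table the lf is one as well, which is the discrepancy the question asks about.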
I really like the effort you put into cleaning up the mess.
I have briefly looked at this code in the past and it is quite convoluted; it is hard to understand what settings get set and where. I tried to hunt down an issue in the Etoys image where a bug can cause certain characters to change color when the image gets into a shaky state (which happens quite often when I test ideas).
Karl
I'm going to be very brave and commit my changes so far; nothing has broken on my system since making the changes. Obviously there is some chance it will be necessary to rewind but I consider it low.
tim -- tim Rowledge; tim@rowledge.org; http://www.rowledge.org/tim A)bort, R)etry or S)elf-destruct?
On 23-09-2013, at 12:40 PM, tim Rowledge tim@rowledge.org wrote:
I'm going to be very brave and commit my changes so far; nothing has broken on my system since making the changes. Obviously there is some chance it will be necessary to rewind but I consider it low.
Both packages (Graphics-tpr.226 & Multilingual-tpr.170) committed ok.
REALLY IMPORTANT NOTE - the FreeType package needs two changes that I cannot make, lacking permissions to hit that file.
1) AbstractFont>widthAndKernedWidthOfLeft: leftCharacter right: rightCharacterOrNil into: aTwoElementArray
    "Set the first element of aTwoElementArray to the width of leftCharacter and the second element to the width of left character when kerned with rightCharacterOrNil. Answer aTwoElementArray"
    "Actually, nearly all users of this actually want just the widthOf the leftCharacter, so we will default to that for speed. See other implementations for more complex cases - and note that this may be a temporary fix until scanners are improved"
    | w |
    w := self widthOf: leftCharacter.
    aTwoElementArray at: 1 put: w.
    aTwoElementArray at: 2 put: w.
    ^aTwoElementArray
    " The old code, and what fonts which have pair-kerning would use -
    w := self widthOf: leftCharacter.
    rightCharacterOrNil isNil
        ifTrue: [aTwoElementArray at: 1 put: w; at: 2 put: w]
        ifFalse: [k := self kerningLeft: leftCharacter right: rightCharacterOrNil.
            aTwoElementArray at: 1 put: w; at: 2 put: w+k].
    ^aTwoElementArray "
2) FreeTypeFont (or whichever is the right name) >isPairKerningCapable
    "a hopefully temporary test method; better factoring of scan/measure/display should remove the need for it. Only FreeType fonts would currently add this to return true"
    ^true
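A minimal workspace sketch of how a caller might exercise these two methods, assuming an image where the simplified AbstractFont default above is installed. The choice of TextStyle defaultFont and the characters $A and $V are only for illustration, and the respondsTo: guard is there because it is an assumption that every font class understands #isPairKerningCapable.

```smalltalk
"Workspace sketch, not part of any package. Assumes the simplified
 AbstractFont>>widthAndKernedWidthOfLeft:right:into: shown above is loaded."
| font widths kerns |
font := TextStyle defaultFont.
widths := Array new: 2.
font widthAndKernedWidthOfLeft: $A right: $V into: widths.

"With the simplified default, both slots should hold the plain width of $A."
Transcript showln: (widths at: 1) printString , ' ' , (widths at: 2) printString.
Transcript showln: (font widthOf: $A) printString.

"Guarded test: only fonts that add #isPairKerningCapable answering true
 (FreeType fonts, per the note above) would take the pair-kerning path."
kerns := (font respondsTo: #isPairKerningCapable)
    and: [font isPairKerningCapable].
Transcript showln: kerns printString.
```

For a font without pair kerning, the two printed widths would be expected to match the plain widthOf: value, which is the whole point of short-circuiting the default.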
tim -- tim Rowledge; tim@rowledge.org; http://www.rowledge.org/tim I'm so skeptical that I'm not sure I'm really a skeptic
I've now finished looking at users of the assorted scanner classes and senders of related messages and boy, what a mix.
Odd bits that I could really do with answers to -
RemoteCanvas>paragraph:bounds:color: - should it be using MultiCanvasCharacterScanner ? The only other reference to CanvasCharacterScanner is from Canvas>paragraph2:bound:color: which is unsent, so I propose to remove it.
Paragraph>composeAll refers to MultiTextComposer - why? seems like a mixup with NewParagraph to me. Similarly, Paragraph>displayLines:affectedRectangle: refers to MultiDisplayScanner when I'd anticipate DisplayScanner
GrafPort>displayScannerForMulti:foreground:background:ignoreColorChanges: is only sent by FormCanvas>paragraph3:bounds:colour: (refers to DisplayScanner), which is only sent by TextMorph>drawOnTest:, which is unsent, so I propose to remove the whole lot.
The choice between CharacterBlockScanner & MultiCBS in NewParagraph>characterBlock… is based on #isWideString. The choice between DisplayScanner and MultiDS in GrafPort>displayScannerFor…. is based on (para isMemberOf: MultiNewParagraph) or: [para text string isByteString] and I can't currently see why the difference.
In NewParagraph>composeAll, the choice between a plain compose and multi-compose message is based not on whether the string is byte or wide, but whether it *could* be byte even if actually wide. That seems like it might dangerously skip over an encoding issue?
Answers on these could be quite useful in trying to work out the intention of some of the code, which is important because of the *lack of comments*.
One of the suggestions when I started on this stuff was that the multi-stuff was associated with NewParagraph and was for Morphic text handling whereas the old scanners and Paragraph were kept for MVC. It really isn't looking much like that at the moment.
tim -- tim Rowledge; tim@rowledge.org; http://www.rowledge.org/tim Compatible: Gracefully accepts erroneous data from any source.
I have done one part of this job independently: I removed all senders of getPresentation and getPresentationLines. This covers FormCanvas>paragraph3:bounds:colour: and TextMorph>drawOnTest: and a bunch of others (see attached change set). This then enables removing presentation, presentationLines and numOfComposition from MultiCharacterScanner, and gets the two hierarchies a bit closer... As I said in another mail, rendering CombinedChar and transforming WideString to precomposed or not are two orthogonal matters; there is no reason to fuse the two notions, so my feeling is that we must redo this part. And the font should be responsible for displaying the CombinedChar...
I wanted to publish in inbox, unfortunately I messed up Morphic-nice.683.mcz in trunk (two versions with exactly same name...) I'll try to restore the original Morphic-nice.683.mcz tomorrow morning, and I think that it is important because the original is the one pointed by the mcm update (they retain the UUID). So please don't publish a Morphic in trunk until then...
On 23-09-2013, at 5:29 PM, Nicolas Cellier nicolas.cellier.aka.nice@gmail.com wrote:
I have done one part of this job independently, removed all senders of getPresentation getPresentationLines This covers FormCanvas>paragraph3:bounds:colour: and TextMorph>drawOnTest: and a bunch of other (see attached change set) This then enables removing presentation presentationLines and numOfComposition from MultiCharacterScanner and get the two hierarchies a bit closer... As i said in another mail, rendering CombinedChar and transforming WideString to precomposed or not are two orthogonal matters, there is no reason to fuse the two notions, so my feeling is that we must redo this part. And the font should be responsible of displaying the CombinedChar…
OK, we need to be careful to not stomp on each other's feet here.
wrt your changeset -
GrafPort>displayScannerForMulti… can't see any change but it should be completely deleted anyway
MultiNewParagraph stuff should all go away
I'll remove some of the pointless nasties and submit the package(s), you make sure to update before starting work tomorrow ;-)
tim -- tim Rowledge; tim@rowledge.org; http://www.rowledge.org/tim Time to start the War on Errorism before stupidity finally gets us.
Another odd bit to consider - MultiNewParagraph is, so far as I can tell, dead, defunct, unrequired and should be removed. It only adds presentationText/Lines and no usage is made of them.
Make your bid soon to preserve this unloved class, or say sayonara.
tim -- tim Rowledge; tim@rowledge.org; http://www.rowledge.org/tim No one is listening until you make a mistake
Yes, after removal of presentation*, it is mostly un-needed. But beware, I have MultiNewParagraph allInstances size -> 85, because of TextMorph>>paragraphClass. So something special might be required in the update process to hook opened windows...
On 23-09-2013, at 5:38 PM, Nicolas Cellier nicolas.cellier.aka.nice@gmail.com wrote:
Yes, after removal of presentation*, it is mostly un-needed But beware, I have MultiNewParagraph allInstances size -> 85, because of TextMorph>>paragraphClass. So something special might be required in update process to hook opened windows…
Ah yes, the joy of live programming...
tim -- tim Rowledge; tim@rowledge.org; http://www.rowledge.org/tim If it was easy, the hardware people would take care of it.
Good news is that getting rid of all references to MultiNewParagraph doesn't seem to break anything and all the tests that used to refer to it still work when referring to plain old NewParagraph instead.
tim -- tim Rowledge; tim@rowledge.org; http://www.rowledge.org/tim Strange OpCodes: IO: Illogical Or
Another one for tim:
MultiTextComposer multiComposeLinesFrom: etc... differs from TextComposer by two lines:
1) it uses a MultiCompositionScanner instead of a CompositionScanner
2) it does not have the workaround (scanner canComputeDefaultLineHeight); the fix is implemented only in CompositionScanner (introduced in 2010 by cmm)
Name: Morphic-cmm.440 Fix for composing Text's with TextAnchors as first character.
On 23-09-2013, at 6:12 PM, tim Rowledge tim@rowledge.org wrote:
Good news is that getting rid of all references to MultiNewParagraph doesn't seem to break anything and all the tests that used to refer to it still work when referring to plain old NewParagraph instead.
More good news; a nice simple and obvious MultiNewParagraph allInstancesDo:[:mnp| mnp become: (mnp as: NewParagraph)] followed by a gc is enough to get rid of all instances without so far as I can tell causing any issues. So the obvious question is how to make a package with that and the class deletion; never done any sort of preamble script for MC stuff before so all advice welcomed.
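Laid out as a workspace script, the conversion described above might look like the sketch below. It assumes MultiNewParagraph is still present in the image; the final instance count is only a sanity check and is not part of the original snippet.

```smalltalk
"Workspace sketch; assumes MultiNewParagraph still exists in the image."
MultiNewParagraph allInstancesDo: [:mnp |
    "as: builds a NewParagraph with similar contents; become: then swaps
     the two objects so every existing reference sees the NewParagraph."
    mnp become: (mnp as: NewParagraph)].

"Reclaim the now-unreferenced old instances."
Smalltalk garbageCollect.

"Sanity check: should answer 0 if every instance was converted."
MultiNewParagraph allInstances size
```

Only once that count is zero would it seem safe to delete the class itself.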
tim -- tim Rowledge; tim@rowledge.org; http://www.rowledge.org/tim Oxymorons: Living dead
Could you add a MultiNewParagraph class>>initialize that does your snippet (change all instances to NewParagraph)? Include that MCZ in a new configuration map with that version (to ensure it loads for everyone else). Then commit a new version of the package without MultiNewParagraph.
This works, right? you'd think I'd want to be updating Trunk at some point, too, wouldn't you?
-cbc
Done, give it a try if not faint hearted ;)
On 24-09-2013, at 1:32 PM, Chris Cunningham cunningham.cb@gmail.com wrote:
Could you add a MultiNewParagraph class>>initialize that does your snippet (change all instances to NewParagraph)? Include that MCZ in a new configuration map with that version (to ensure it loads for everyone else). Then commit a new version of the package without MultiNewParagraph.
This works, right? you'd think I'd want to be updating Trunk at some point, too, wouldn't you?
I don't actually know if that is the right thing to do; does loading package-chthulhu.5 when you currently have .2 result in .3 then .4 and finally .5 being loaded? If it only loads .5 then the intermediate and temporary existence of the method isn't noticed.
tim -- tim Rowledge; tim@rowledge.org; http://www.rowledge.org/tim As far as we know, our computer has never had an undetected error.
The packages listed in the next update-*.mcm are required intermediates.
If:
  chthulhu.tpr.3 is specified in update-tpr.248.mcm
  chthulhu.tpr.7 is specified in update-tpr.249.mcm
  chthulhu.tpr.10 is highest version in Trunk
then these 3 packages will be loaded/merged in that order, and intermediate packages will be ignored... I've specified Morphic-nice.688 in update-tpr.247.mcm, so the initialize method will be invoked for sure...
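The rule in that example can be modelled as a toy workspace sketch. This is not Monticello's updater code; the variable names are made up for illustration and the numbers are just the version suffixes from the example.

```smalltalk
"Toy model with made-up data; NOT the actual Monticello update logic."
| pinnedByUpdateMaps highestInTrunk loadSequence |
pinnedByUpdateMaps := #(3 7).   "versions named by successive update maps"
highestInTrunk := 10.           "newest version sitting in the repository"

"Pinned versions load in map order; the newest is merged last."
loadSequence := OrderedCollection withAll: pinnedByUpdateMaps.
(loadSequence isEmpty or: [loadSequence last < highestInTrunk])
    ifTrue: [loadSequence add: highestInTrunk].

"Versions 4, 5, 6, 8 and 9 never appear, so they are skipped."
loadSequence
```

Evaluating the last line should answer a collection holding 3, 7 and 10, which is why a class-side initialize has to live in a version that some update map names explicitly.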
On 24-09-2013, at 3:18 PM, Nicolas Cellier nicolas.cellier.aka.nice@gmail.com wrote:
The packages listed in the next update-*.mcm are required intermediates.
If:
  chthulhu.tpr.3 is specified in update-tpr.248.mcm
  chthulhu.tpr.7 is specified in update-tpr.249.mcm
  chthulhu.tpr.10 is highest version in Trunk
then these 3 packages will be loaded/merged in that order, and intermediate packages will be ignored... I've specified Morphic-nice.688 in update-tpr.247.mcm, so the initialize method will be invoked for sure…
OK, so that is some other part of MC about which I am blissfully ignorant. As long as it all integrates in with the updating menu item in the top-of-screen dock then we're all ok.
tim -- tim Rowledge; tim@rowledge.org; http://www.rowledge.org/tim Useful Latin Phrases:- Noli me vocare, ego te vocabo = Don't call me, I'll call you.
The next step - I have changes that remove the need for MultiTextComposer and clean up a variety of usages. BUT the order of making the changes is (as you have to expect when changing the tools that change the tools) crucially important. I'd really appreciate it if one of you who already knows the MC magic to ensure correct ordering could leap to my assistance for this; it would be nice to avoid the fun we had the other day.
The load order needs to be
  Multilingual-tpr.174 on top of Multilingual-nice.173
  Graphics-tpr.229 on top of Graphics-tpr.228
  only then can Graphics-tpr.230 be added (method ordering seems an issue)
  finally Morphic-tpr.689 makes use of the changes.
Gradually, bit by bit, little curlicues of ugliness are being abraded away….
tim -- tim Rowledge; tim@rowledge.org; http://www.rowledge.org/tim Useful Latin Phrases:- Vescere bracis meis. = Eat my shorts.
Ah, the tool for you is MC Configuration Browser. Here is how to use it:
1) open MC browser
2) open repository browser on trunk
3) select (pseudo) package named update and latest update version (update-nice.247.mcm)
4) click Browse button
At this stage you have a MC Configuration Browser armed for the massacre of text composition.
5) Arrange to load in your image all the packages that you want to figure in the map (Multilingual-tpr.174 and Graphics-tpr.229)
6) Select the Multilingual package in the configuration browser list (Multilingual-nice.171)
7) pop up menu and 'update this dependency from image'
8) redo with Graphics
9) click Store button and accept update-tpr.248, publish in trunk
At this stage, you gained a MC configuration publisher skill.
Now if Graphics-tpr.230 is really an intermediate requirement, you could make a second update-tpr.249. Note that you can also mess with the package load order as a last resort (the topmost in the list is loaded/merged first). Since currently Graphics is loaded before Multilingual, I don't think that the second update is necessary...
If you do that, you'll get the MC configuration hacker skill, and it's not recommended to get two MC skills in a single evening.
On Thu, Sep 26, 2013 at 12:32:08AM +0200, Nicolas Cellier wrote:
If you do that, you'll get the MC configuration hacker skill, and it's not recommended to get two MC skills in a single evening.
Ha! :)
The accumulation of excessive MC skills is a rare but serious condition. A reported cure is to release some of the excess mental pressure in the form of a new class comment for some undocumented MC class.
Dave
On 25-09-2013, at 5:00 PM, "David T. Lewis" lewis@mail.msen.com wrote:
The accumulation of excessive MC skills is a rare but serious condition. A reported cure is to release some of the excess mental pressure in the form of a new class comment for some undocumented MC class.
What? There are classes without proper comments? Unpossible!
tim -- tim Rowledge; tim@rowledge.org; http://www.rowledge.org/tim Oxymorons: Resident alien
On 25-09-2013, at 3:32 PM, Nicolas Cellier nicolas.cellier.aka.nice@gmail.com wrote:
Ah, the tool for you is MC Configuration Browser. Here is how to use:
Right, that appears to have worked from my end, so hopefully I followed the recipe correctly and it works for the rest of you out there!
Thanks for the clear explanation. Now, let's see if I can collapse CompositionScanner & MultiCompositionScanner.
tim -- tim Rowledge; tim@rowledge.org; http://www.rowledge.org/tim Strange OpCodes: CCC: Crash if Carry Clear
I note that there seems to be no attempt in #scanJapaneseCharactersFrom… to handle kerning. Is this because Japanese character glyphs don't get kerned, or is it a bug that should be addressed at some point?
tim -- tim Rowledge; tim@rowledge.org; http://www.rowledge.org/tim Useful random insult:- One flower short of an arrangement.
On 25-09-2013, at 6:02 PM, tim Rowledge tim@rowledge.org wrote:
I note that there seems to be no attempt in #scanJapaneseCharactersFrom… to handle kerning. Is this because Japanese character glyphs don't get kerned, or is it a bug that should be addressed at some point?
And similarly it seems that only in scanJapaneseCharacters… do we actually need to send isBreakableAt:in: & registerBreakableIndex. That would speed things up a bit if I'm correct.
tim -- tim Rowledge; tim@rowledge.org; http://www.rowledge.org/tim Strange OpCodes: RLBMI: Ruin Logic Board Multiple Indexed
Right now it appears that DisplayScanner is no longer needed. Another one bites the dust...
tim -- tim Rowledge; tim@rowledge.org; http://www.rowledge.org/tim Useful random insult:- Living proof that nature does not abhor a vacuum.
I suggest that we move the last bits of difference from MultiCharacter* -> CharacterScanner, properly classify what is *Multilingual related, and then remove the Multi* classes. Sounds right?
On 28-09-2013, at 12:34 AM, Nicolas Cellier nicolas.cellier.aka.nice@gmail.com wrote:
I suggest that we move the last bits of difference from MultiCharacter* -> CharacterScanner, properly classify what is *Multilingual related, and then remove the Multi* classes. Sounds right?
In general, yes. The smart thing is to end up with CharacterScanner instead of MultiCharacterScanner, just because the name is simpler.
We can probably fudge things a little to make the Multi-* classes acceptable in all normal-running, swap code to not refer to the non-Multi classes, then they can be cleaned up without fear of breaking anything live, then swap everything back to non-Multi, then delete all Multi-*.
Then maybe we can do the same to Paragraph & NewParagraph and any other confused classes.
tim -- tim Rowledge; tim@rowledge.org; http://www.rowledge.org/tim Never write software that patronizes the user.
Now that Nicolas & I have pretty much finished this stage of cleaning up the scanners etc, we have at least achieved the major aim I had in mind; getting back to a single class tree for scanning text. So far as I can tell everything is working ok and I haven't managed to cause any errors.
The magic keys (cmd - & cmd +) that are supposed to kern the selection do not work, but then they don't in a vanilla 4.5-12461 image from before this work was done. So we didn't break it…
The next thing to do is try to simplify the choices between byte and wide strings, fonts and encodings and language environments. I hate seeing #isKindOf: or #isMemberOf: type tests in running code (you can excuse it in prototypes, for a few minutes at least) and #isByteString is not much better. We have classes and inheritance for a reason; nobody should be writing C code in Smalltalk.
Trying to list the factors involved in working out how to scan a text (and *please* correct whatever I get wrong):-
the String -
  byteString; so far as I can see ByteStrings are single-byte characters (duh) with an assumed encoding. That appears to be 'mac roman' which is almost but not quite latin1 or iso-something-or-other.
  wideString; 32 bit characters where the top (ish) 8 bits are used as a leading character (not to be confused with leading in the typographic sense of affecting line spacing - isn't English wonderfully clear…) that defines an EncodedCharSet (or LanguageEnvironment, sigh) which provides for a specific scanning message to use. To complicate life further, a later character in a WideString can change the encoding to use, which may well change the font, oh frabjous day.
the Font - we have several classes of fonts, not all in the base image right now. I think I'd divide them into two phyla at the moment;
  a) StrikeFonts and other simple bitmap glyphs. This would include StrikeFont itself, HostFont and TTCFont (since it generates bitmaps that are simply bitblt'd to use)
  b) ComplicatedPluginFonts where an interface to a more complex and sophisticated renderer is used to leverage a library such as TrueType, Cairo, Pango, Weyland or whatever. These may well need to completely usurp the actual scanner to do the work.
There's another font aspect that is important too, but for now at least it is tied to a & b above - whether pair-kerning is supported. I'm sure we could make a variant of StrikeFonts that does it if we wanted but let's keep things tolerably intelligible for now, eh?
I'm going to take a quick swing at changing the scanning to delegate to 1) the string, which will then delegate to 2) the font, which for all the classes in the image right now will then delegate back to 3) the scanner, but having already worked out which form of scanning is required.
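The string -> font -> scanner delegation described above is a double (here triple) dispatch. A rough sketch of its shape follows; every selector in it is hypothetical and chosen only for illustration, it assumes all fonts involved inherit from AbstractFont, and it is not the code that was actually written. Each method would be entered into the named class via a browser.

```smalltalk
"All selectors below are hypothetical; this only sketches the dispatch shape."

"CharacterScanner >> scanText:font:   (entry point: ask the string first)"
scanText: aString font: aFont
    ^aString dispatchScanTo: self font: aFont

"ByteString >> dispatchScanTo:font:   (the string knows it is byte-sized)"
dispatchScanTo: aScanner font: aFont
    ^aFont scanByteString: self with: aScanner

"WideString >> dispatchScanTo:font:   (the string knows it is wide)"
dispatchScanTo: aScanner font: aFont
    ^aFont scanWideString: self with: aScanner

"AbstractFont >> scanByteString:with:   (default: hand back to the scanner)"
scanByteString: aString with: aScanner
    ^aScanner basicScanByteString: aString

"AbstractFont >> scanWideString:with:   (default: hand back to the scanner)"
scanWideString: aString with: aScanner
    ^aScanner basicScanWideString: aString
```

A font that needs to usurp the scanner entirely would override the two font-side methods instead of bouncing back, and no #isKindOf: or #isByteString test would appear anywhere in the chain.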
OK; I'm going in! Cover me!
tim -- tim Rowledge; tim@rowledge.org; http://www.rowledge.org/tim Strange OpCodes: CSF: Charge to NSF
I've completed some changes to clean up the dispatching of scanning so that we use multi-dispatching instead of nasty tests.
It's running quite happily in my development image, seems to handle plain ascii stuff and widestrings with all them furrin' accents ok. I'm *not* going to just drop it into the trunk right now though since it has plenty of scope for totally messing up things and really ought to be tested a little first.
The explanation and code is at http://bugs.squeak.org/view.php?id=7789 Two small changesets - one with all the new code and one with a single scary flip-over to use it. I suggest browsing the new code first….
tim -- tim Rowledge; tim@rowledge.org; http://www.rowledge.org/tim Strange OpCodes: BDC: Break Down and Cry
Usually, it's because open-source is a ghetto.
On Sun, Sep 1, 2013 at 10:24 PM, tim Rowledge tim@rowledge.org wrote:
I've been trying to sort out mantis-1650 and siblings; oh boy what fun.
Basically at some point the shape of CharacterScanner was changed so that primitive 103 could no longer work; nowadays we waste time in starting up the primitive and never doing anything but failing it. The fallback code is pretty ugly too, though I have a few small improvements for it. We can't reasonably 'fix' the primitive since it is required to support older images, in particular Scratch on the Pi. Anyone doing something that compromises *that* will get a quiet visit from The Boys. We *could* add a new primitive, of course. It's also possible that for a lot of modern machines running Cog it might not be worth it - but not all machines are cogged nor fast.
Part of the complication is that we have rather lot of font related classes these days and not all of them are even subclasses of AbstractFont. So far as I can tell the major change was due to an attempt to handle fonts that can have character pair kerning, which looks like only FreeTypeFont. All the others are wasting time both through losing the primitive support and pointlessly finding out that pair-kerning does nothing new. Oh and FT2Face seems to be off on its own for some reason I haven't discerned as yet. Obvious question - who is most up to speed with what the hell Fonts are up to these days? I have some Questioning Instruments warming up for you…
MultiCharacterScanner brings in a whole new level of insanity, not least because it uses identical code including the pointless call of the primitive - and the two senders of these two methods are also (effectively) identical. And do, just for grins, take a look at the only reference to MultiCharacterScanner - FreeTypeFontProvider class>initialize. Oh my. And let's consider references and uses to other classes in that hierarchy - in NewParagraph>characterBlock* MultiCharacterBlockScanner is used for WideStrings but in Paragraph>composeAll it is used for both byte strings and wider strings. And then there is GrafPort>displayScannerForMulti:….
How have we got into such a mixed and messy state? Did some experiment get partially worked on and forgotten? Surely nobody has deliberately made it so convoluted?
tim
tim Rowledge; tim@rowledge.org; http://www.rowledge.org/tim Strange OpCodes: CLOUT: Call Long-distance On Unused Telephone
squeak-dev@lists.squeakfoundation.org