Mailing List lswitcher-dev@2rosenthals.com Archived Message #208

From: "Alfredo Fernández Díaz" <lswitcher-dev@2rosenthals.com>
Subject: Re: [lswitcher-dev] lSwitcher-2-93-0-RC_6.wpi
Date: Sun, 15 Aug 2021 22:10:16 +0100
To: lSwitcher Developers Mailing List <lswitcher-dev@2rosenthals.com>

Hi,

On 2021/08/15 20:36, Lewis wrote:
Hi...

On 08/15/21 08:57 am, Alfredo Fernández Díaz wrote:
<snip>
Lewis, did you notice I reported this was a problem that showed up /on a
Russian system/, and nowhere else? -- Ulrich's name was always properly
processed and rendered on my main system (main CP always 850).


You got me there.

:)

I was only testing in English. My first go-round told me that XWP was not installed,
 as it couldn't match Ulrich's name in the db.

Now this is unexpected, so maybe it's also interesting. Was English the system language? What were the codepage settings? Which lswitcher package were you trying to install, exactly? How was Ulrich's surname spelled in the warning message? And in WarpIN's database mode? Can you reproduce that?

Once I converted the script to UTF-8, all was right with the world.

"Works for me" uh? :)

<snip>
Now, let's convert the WIS to UTF (and change its CODEPAGE attribute
accordingly), and fire up WarpIN on that again: see lsw@ru_CP850.png, look
at my name again.

That is a UTF conversion problem, which may or may not be related to the one
I reported initially, but we definitely brought it up converting the WIS to
CP 1208 aka UTF8.


The WarpIN source says that we handle extracted files (EXTRACTFROMPCK) like so:


<snip>

So, we convert the CP850 Readme to UTF-8. So far, so good. However, when we
then need to convert to the display codepage (CP866, in this case), we run
into a slight problem (note that Readme.UTF8 is the original readme which I
converted via iconv):

[j:\] iconv -f UTF-8 -t 866 Readme.UTF8 > Readme.866
iconv.exe: Readme.UTF8:108:12: cannot convert

Line 108, char 12 is "á" in your name.

Correct.

Hmmm... I'm not sure what to do here.

Why do anything, except maybe archive it for future reference? -- it is just a classic limitation of IBM's old codepages, and there is little to gain from addressing it for its own sake.
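
For the archive, the limitation can be illustrated with a minimal Python sketch. This is not WarpIN code: it assumes Python's "cp850" and "cp866" codecs match the OS/2 codepages of the same numbers, and the helper name unconvertible() is made up for the example.

```python
# Sketch only: Python's "cp850"/"cp866" codecs stand in for the OS/2 codepages.

def unconvertible(text, codec):
    """Return the characters of `text` that `codec` cannot represent."""
    bad = []
    for ch in text:
        try:
            ch.encode(codec)
        except UnicodeEncodeError:
            bad.append(ch)
    return bad

# The surname as stored in a CP850 file: 0xA0 is "á", 0xA1 is "í".
surname = b"Fern\xa0ndez D\xa1az".decode("cp850")

# Characters that have no slot in CP866 (the Russian display codepage).
print(unconvertible(surname, "cp866"))

# A lossy conversion substitutes "?" for them instead of failing.
print(surname.encode("cp866", errors="replace"))
```

The lossy variant yields "?" in place of the two accented letters, which is consistent with what the dialog shows no matter which font is dropped on it.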

There is a WarpIN preference for display codepage, which defaults to process
codepage. However, on a Russian system, it would seem highly illogical to
change this merely to read a few characters which can't be rendered in 866.

If anything, it would make sense to change WarpIN display codepage to 1208, but at the moment I don't even know whether that's possible.

Also, this is not a font thing. I have dropped myriad fonts onto the dialog,
all with the same result: "?" for the characters in your name.

Sure it is not a font thing.

I fall back on my contention that this is not a WarpIN bug.

I differ. This is minor compared to what I found, though.

WarpIN accepts the
content of an external file as the same codepage as specified for the WIS, and
then converts to UTF-8, and finally to the display codepage. It's a conundrum,
I grant you. I just haven't figured an adequate workaround as yet.

However, you solved the secondary mystery of why my name was being mangled in the readme as displayed by WarpIN when Unicode was used for the WIS -- the file is encoded using CP850, but the code assumes it is 1208. Naturally, 0xA0 and 0xA1 (CP850 codepoints for á and í) are not valid UTF8 byte sequences, so bad stuff happens when trying to display that. Maybe some UTF validation could/should be incorporated into the code, but I don't think that's a priority.
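
Just to make the validation idea concrete, here is a rough Python sketch of what such a check might look like. It is only an illustration under assumptions: decode_external() and its parameters are hypothetical, the CP850 fallback is an arbitrary choice for this example, and nothing here reflects how WarpIN actually reads extracted files.

```python
# Hypothetical helper, not WarpIN's API: validate the declared encoding
# before trusting it, and fall back to an assumed legacy codepage.

def decode_external(data, declared="utf-8", fallback="cp850"):
    """Decode `data` as `declared`; if the bytes are invalid, retry as `fallback`.

    Returns a (text, codec_used) tuple so the caller can log a mismatch.
    """
    try:
        return data.decode(declared), declared
    except UnicodeDecodeError:
        # Lone 0xA0/0xA1 bytes are UTF-8 continuation bytes with no lead
        # byte, so a CP850 file fails strict UTF-8 decoding here.
        return data.decode(fallback), fallback

# Readme bytes written in CP850 while the WIS declares UTF-8 (CP 1208).
readme_bytes = b"Fern\xa0ndez D\xa1az"
text, used = decode_external(readme_bytes)
print(text, used)
```

Strict decoding catches the mismatch because the CP850 bytes are not well-formed UTF-8; which codepage to fall back to would still have to be decided (or declared) somewhere, so this only detects the problem rather than solving it in general.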

I'll summarize all of this and report to Paul "tonight".

