Mailing List lswitcher-dev@2rosenthals.com Archived Message #211

From: "Alfredo Fernández Díaz" <lswitcher-dev@2rosenthals.com>
Subject: Re: [lswitcher-dev] lSwitcher-2-93-0-RC_6.wpi
Date: Mon, 16 Aug 2021 01:07:46 +0100
To: lSwitcher Developers Mailing List <lswitcher-dev@2rosenthals.com>

Hi Gregg,

On 2021/08/15 21:52, Gregg Young wrote:
Hi Lewis

This is a WarpIN bug. REQUIRES="Ulrich Möller\XWorkplace\Kernel\1\0\1" isn't meant
to be user-readable; it is an internal check.

It's both since WarpIN has a 'database' mode that can show its contents to users (that should match the REQUIRES). However, I don't think users not being able to read a few characters here and there is a problem. The database not being built consistently is.


I think you will also see problems on a Russian system with PACKAGEID="Ulrich Möller\XWorkplace\Kernel\1\0\1"
installed post 1.0.24. If this is done with codepage 1208 or no codepage, the database will contain
"Ulrich M?ller\XWorkplace\Kernel\1\0\1". If you have a REQUIRES="Ulrich Möller\XWorkplace\Kernel\1\0\1"
in codepage 1208 it will probably work, but if you have a WIS with this in codepage 850 it will fail, since
the ö will be present.
The ö isn't present in codepage 866. Only ASCII characters (0-127) are (probably) guaranteed to match between codepages.

Yes, 0-127 are English + minimal typesetting characters common to all codepages.
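
A quick way to check both points (0-127 being common, and ö having no slot in 866) is a small Python sketch. It relies on Python's built-in cp850 and cp866 codecs, which I am assuming match the OS/2 codepages for these characters; it has nothing to do with WarpIN's own code.

```python
# Bytes 0-127 should decode to the same characters under each codepage.
ascii_bytes = bytes(range(128))
decoded = {cp: ascii_bytes.decode(cp) for cp in ("cp850", "cp866", "utf-8")}
ascii_is_common = len(set(decoded.values())) == 1

name = "Ulrich Möller"

# In CP850 the ö is a single byte, 0x94.
as_cp850 = name.encode("cp850")

# CP866 has no ö at all, so a lossy conversion substitutes '?'.
as_cp866 = name.encode("cp866", errors="replace")

print(ascii_is_common)    # True
print(hex(as_cp850[8]))   # 0x94
print(as_cp866)           # b'Ulrich M?ller'
```

The '?' substitution is what a lossy conversion to 866 would store, matching the "Ulrich M?ller" case described above.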

Regarding database contents...

My main system database contains "55 6C 72 69 63 68 20 4D C3 B6 6C 6C 65 72" which is valid UTF-8 for "Ulrich Möller", which means all code worked correctly to build it.

On the Russian system, the database contains "55 6C 72 69 63 68 20 4D D0 A4 6C 6C 65 72", which is well-formed UTF-8 but not for "Ulrich Möller": D0 A4 is U+0424, Cyrillic capital Ф. (Byte 0x94 is ö in CP850 and Ф in CP866, BTW.)

So apparently the problem lies in translation to Unicode on non-CP850 systems (because the same WPIs are processed on all systems but we have different results), and this badly translated stuff being stored afterwards in the database.

What is needed is for WarpIN to convert these "internal use" strings to codepage 850, use them, and then
convert the rest to codepage 866.

Not really. A REQUIRES string must be converted to UTF8 if CODEPAGE != 1208, and then it can be compared directly to whatever is in the database, because that is supposed to be UTF8. The problem is, it is mangled on the Russian system.
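
As a minimal sketch of that comparison, in Python rather than WarpIN's actual code: the function names, the codepage-to-codec table and the set standing in for the database are all hypothetical, chosen only for illustration.

```python
# Hypothetical mapping from OS/2 codepage numbers to Python codec names.
CODECS = {850: "cp850", 866: "cp866", 1208: "utf-8"}


def requires_to_utf8(raw, codepage):
    """Return the REQUIRES bytes as UTF-8; codepage 1208 input is already UTF-8."""
    if codepage == 1208:
        return raw
    return raw.decode(CODECS[codepage]).encode("utf-8")


def requirement_met(raw, codepage, database_ids):
    """Compare the converted string byte-for-byte against the UTF-8 database IDs."""
    return requires_to_utf8(raw, codepage) in database_ids


package = "Ulrich Möller\\XWorkplace\\Kernel\\1\\0\\1"
database = {package.encode("utf-8")}   # stand-in for a correctly built database

print(requirement_met(package.encode("cp850"), 850, database))   # True
print(requirement_met(package.encode("utf-8"), 1208, database))  # True
# Same CP850 bytes, wrongly declared as 866: the lookup fails.
print(requirement_met(package.encode("cp850"), 866, database))   # False
```

The comparison only works if the codepage passed in is the one the WIS was really written in, and if the database entry was itself built correctly.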

The other problem is with wises with no codepage (most if not all of which are codepage 850).
These fail for REQUIRES="Ulrich Möller\XWorkplace\Kernel\1\0\1" on Russian systems because they are now
read out as codepage 866 (process default).

Are we sure of this? (Reminder: docs say when no CODEPAGE= is found in the WIS, 850 is assumed.)

This case requires that the "internal use" strings be read first in
codepage 850 and used before the codepage 866 (default) read. This can also be fixed by assuming they are
codepage 850, not the process codepage.

OK, some questions about this. What version of iconv are you using? I have found several.
What are the exact steps to build the WIS? I assume you need to reconvert any time you edit the file unless
you use a UTF-8 enabled editor (are there any?).

IIRC iconv has always had problems. I have put together a little REXX script that uses Alex Taylor's ULS to convert text files between codepages; find it attached. He also has a UTF-8-capable, Qt-based editor somewhere, but I can't recall the name right now.
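
For anyone without REXX at hand, the same kind of conversion can be sketched in Python. This is not the attached RxTxtCnv.cmd and does not use ULS; the function names are hypothetical and the codec names are Python's.

```python
from pathlib import Path


def convert_bytes(data, from_codec, to_codec):
    """Re-encode raw text bytes from one codepage to another.

    Raises UnicodeDecodeError if the input is not valid in from_codec, and
    UnicodeEncodeError if a character has no slot in to_codec, so nothing
    is silently replaced with '?'.
    """
    return data.decode(from_codec).encode(to_codec)


def convert_text_file(src, dst, from_codec, to_codec):
    """Convert a whole file; bytes are used throughout so line endings are untouched."""
    Path(dst).write_bytes(convert_bytes(Path(src).read_bytes(), from_codec, to_codec))


# A CP850 line as it might appear in a WIS, converted to UTF-8 (codepage 1208).
line_850 = 'REQUIRES="Ulrich Möller\\XWorkplace\\Kernel\\1\\0\\1"'.encode("cp850")
line_utf8 = convert_bytes(line_850, "cp850", "utf-8")
```

A file edited in a non-UTF-8 editor would need to go through such a conversion again after every edit.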

Thank you,
AFD.

Attachment: RxTxtCnv.cmd (2505 bytes)
