From: "Lewis" <lswitcher-dev@2rosenthals.com>
Subject: Re: [lswitcher-dev] lSwitcher-2-93-0-RC_6.wpi
Date: Sun, 15 Aug 2021 15:36:16 -0400
To: lSwitcher Developers Mailing List <lswitcher-dev@2rosenthals.com>

Hi...
On 08/15/21 08:57 am, Alfredo Fernández Díaz wrote:
> "Morning,"
>
> On 2021/08/15 04:04, Lewis wrote:
>> Hi...
>>
>> Changing the codepage isn't enough. The content needs to be converted to UTF-8
>> (it was still CP850).
>
> As I tried to explain (albeit maybe too briefly, sorry) this still breaks (more?) things...
> I am perfectly aware that the WIS contains non-English characters, so specifying CODEPAGE is not enough -- what you state there must be the one in use as well, so if the original used CP850 characters, a proper conversion is in order, sure.
> Still, WarpIN is not handling this correctly...
>
> <snip>
>
>> This gave me a UTF-8 script, which properly renders Ulrich's name and which
>> then matches what's in the WarpIN db (no error report of missing XWP).
>
> Lewis, did you notice I reported this was a problem that showed up /on a Russian system/, and nowhere else? -- Ulrich's name was always properly processed and rendered on my main system (main CP always 850).

You got me there. I was only testing in English. My first go-round told me that XWP was not installed, as it couldn't match Ulrich's name in the db. Once I converted the script to UTF-8, all was right with the world.
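
To make the "relabeling is not converting" point concrete, here is a minimal Python sketch. It is not WarpIN code, and the sample string "Möller" is only an assumed example of a name containing "ö"; it also assumes Python's built-in cp850 codec matches the OS/2 codepage for that character.

```python
# Hypothetical sample: a name containing "ö", stored as CP850 bytes,
# the way the original script was encoded.
cp850_bytes = "Möller".encode("cp850")

# Relabeling only: the bytes are unchanged but are now read as UTF-8.
# The CP850 byte for "ö" is not valid UTF-8, so it becomes U+FFFD.
relabeled = cp850_bytes.decode("utf-8", errors="replace")

# Proper conversion: decode as CP850 first, then re-encode as UTF-8.
utf8_bytes = cp850_bytes.decode("cp850").encode("utf-8")
converted = utf8_bytes.decode("utf-8")

print(repr(relabeled))
print(repr(converted))
```

Only the converted text compares equal to the original name, which would be consistent with the database match failing until the script itself was converted.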
> I am attaching two screenshots to illustrate that something (which may or may not be new, and/or related to the problem with not finding XWP in the database) breaks when you convert the WIS:
> lsw@ru_CP850.png shows how the readme (CP 850) is rendered on this Russian system under CP 866 when the wis is CP850-encoded: see the "?" on my name? That is possibly a rendering-only, cosmetic problem.
> Now, let's convert the WIS to UTF (and change its CODEPAGE attribute accordingly), and fire up WarpIN on that again: see lsw@ru_CP850.png, look at my name again.
> That is a UTF conversion problem, which may or may not be related to the one I reported initially, but we definitely brought it up converting the WIS to CP 1208 aka UTF8.

The WarpIN source says that we handle extracted files (EXTRACTFROMPCK) like so:

    if (!G_pCurrentPageInfo->_ulExtractFromPck)
        str2Insert.assignUtf8(pLocals->_pCodecGui,
                              G_pCurrentPageInfo->_ustrReadmeSrc);
    else
    {
        // use _strReadmeSrc as a file name:
        // V1.0.11 (2006-08-31) [pr]: was using Unicode filename for Readme @@fixes 812
        ULONG cpSrc = Engine._pCurrentArchive->_pScript->_ulCodepage;
        BSUniCodec codecSrc(cpSrc);
        BSString strReadmeSrc(&codecSrc, G_pCurrentPageInfo->_ustrReadmeSrc);
        BSString strTempFileName;
        APIRET arc;
        if (!(arc = Engine.ExtractTempFile(G_pCurrentPageInfo->_ulExtractFromPck,
                                           strReadmeSrc.c_str(), // V1.0.11 (2006-08-11)
                                           &strTempFileName)))
        {
            // successfully extracted:
            PSZ pszContent = NULL;
            if (!(arc = doshLoadTextFile(strTempFileName.c_str(),
                                         &pszContent,
                                         NULL)))
            {
                // check what codepage the script was created in...
                // we assume that the "readme" file was written in
                // the same codepage. If the codepage is different
                // from our current one, we'll need to convert:
                if (cpSrc == pLocals->_pCodecGui->QueryCodepage())
                    // easy
                    str2Insert = pszContent;
                else
                {
                    // alright, different:
                    // convert file contents to Unicode
                    ustring ustr(&codecSrc, pszContent);
                    // convert Unicode to display codepage
                    str2Insert.assignUtf8(pLocals->_pCodecGui, ustr);
                }
                free(pszContent);
            }
            else
                str2Insert._printf(nlsGetString(WPSI_ERRORREADINGPCKFILE),
                                   arc,
                                   strReadmeSrc.c_str(), // V1.0.11 (2006-08-31) [pr]
                                   G_pCurrentPageInfo->_ulExtractFromPck);
        }
        else
            str2Insert._printf(nlsGetString(WPSI_ERROREXTRACTINGPCKFILE),
                               arc,
                               strReadmeSrc.c_str(), // V1.0.11 (2006-08-31) [pr]
                               G_pCurrentPageInfo->_ulExtractFromPck);

So, we convert the CP850 Readme to UTF-8. So far, so good. However, when we then need to convert to the display codepage (CP866, in this case), we run into a slight problem (note that Readme.UTF8 is the original readme which I converted via iconv):
[j:\] iconv -f UTF-8 -t 866 Readme.UTF8 > Readme.866
iconv.exe: Readme.UTF8:108:12: cannot convert
Line 108, char 12 is "á" in your name. Hmmm... I'm not sure what to do here. There is a WarpIN preference for display codepage, which defaults to process codepage. However, on a Russian system, it would seem highly illogical to change this merely to read a few characters which can't be rendered in 866.
Also, this is not a font thing. I have dropped myriad fonts onto the dialog, all with the same result: "?" for the characters in your name.
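
For what it's worth, the same failure can be reproduced outside WarpIN. The following is a minimal Python sketch of the two-step conversion (source codepage to Unicode, then Unicode to display codepage). It is not WarpIN code; the helper can_encode is made up for illustration, and it assumes Python's built-in cp850 and cp866 codecs agree with the OS/2 codepages for these characters.

```python
def can_encode(ch, codepage):
    """Return True if the single character ch exists in the given codepage."""
    try:
        ch.encode(codepage)
        return True
    except UnicodeEncodeError:
        return False

# The readme content as stored in the package (CP850 bytes).
readme_cp850 = "Alfredo Fernández Díaz".encode("cp850")

# Step 1: source codepage -> Unicode. Nothing is lost here.
text = readme_cp850.decode("cp850")

# Step 2: Unicode -> display codepage (CP866 on the Russian system).
# Characters with no CP866 equivalent cannot survive this step.
unmappable = [ch for ch in text if not can_encode(ch, "cp866")]

# With a replacement fallback, each unmappable character becomes "?".
displayed = text.encode("cp866", errors="replace").decode("cp866")

print(unmappable)
print(displayed)
```

The loss happens in the second step no matter how the first one is done, because CP866 has no slot for the accented Latin letters, so no font can bring them back.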
I fall back on my contention that this is not a WarpIN bug. WarpIN assumes the content of an external file is in the same codepage as the one specified for the WIS, converts it to Unicode, and finally converts that to the display codepage. It's a conundrum, I grant you. I just haven't figured out an adequate workaround yet.
--
Lewis