TGet() - UTF8 encoding fails [Solved]

Re: TGet() - UTF8 encoding fails [Unsolved]

Post by frose » Fri Oct 13, 2023 7:32 am

ok, I see.

Nevertheless, the encoding should not be changed; IMHO this is a bug!

Since the Upper() function doesn't work 'properly' for UTF8, I have to use my own U82Upper() function for that!

But what about the VARCHAR clause? Does the same apply there?
Windows 11 Pro 22H2 22621.1848
Microsoft (R) Windows (R) Resource Compiler Version 10.0.10011.16384
Harbour 3.2.0dev (r2008190002)
FWH 23.10 x86

Re: TGet() - UTF8 encoding fails [Unsolved]

Post by nageswaragunupudi » Sat Oct 14, 2023 5:27 pm

frose wrote: Since the Upper() function doesn't work 'properly' for UTF8, ...


Yes, Harbour's Upper() function does not work with UTF8 encoded umlauts.
Even with ANSI encoded umlauts, Harbour's Upper()/Lower() functions work only if the codepage is set to German.
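
To make the first point concrete, here is a minimal sketch in plain C. It is not Harbour's actual Upper() code, and the name ascii_upper_bytes is made up for illustration. It only shows why a routine that upper-cases one byte at a time and knows just a-z cannot touch a UTF8 encoded umlaut: "ü" is stored as the two bytes C3 BC, and neither of them is a letter.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Upper-case ASCII letters one byte at a time and return how many bytes
   were changed. Hypothetical stand-in for a byte-oriented, ASCII-only
   upper-casing routine; NOT Harbour's real Upper(). */
size_t ascii_upper_bytes(char *s)
{
    size_t changed = 0;

    assert(s != NULL);
    for (; *s; ++s) {
        if (*s >= 'a' && *s <= 'z') {
            *s = (char)(*s - 'a' + 'A');
            ++changed;
        }
        /* Bytes >= 0x80 (both halves of a UTF8 umlaut) never match a-z,
           so they are left exactly as they were. */
    }
    return changed;
}
```

For the UTF8 bytes of "für" (66 C3 BC 72) the result is 46 C3 BC 52: "f" and "r" are converted, the umlaut stays lower case.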

But a Unicode Get control does not have to depend on Harbour's Upper() function for converting to Upper case when picture clause "@!" is used. Windows OS has its own built-in Upper/Lower case functionality. This functionality is used by a Unicode Get by setting the style to ES_UPPERCASE, so this upper case conversion is automatically done by Windows.
This explains how "üäö" is converted to "ÜÄÖ" inside the Get.

In the version to be released we are providing two new functions, WinUpper() and WinLower().
These functions are wrappers around the Windows API functions CharUpper() and CharLower().
They work with both ANSI and UTF8 encoded text, and the result keeps the encoding of the parameter:
if the parameter is an ANSI encoded umlaut, the result is an ANSI encoded umlaut, and
if the parameter is a UTF8 encoded umlaut, the result is a UTF8 encoded umlaut.
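
The "same encoding out as in" rule can be sketched without any Windows call. The C below is only an illustration under stated assumptions: looks_like_utf8() is a simplified, hypothetical stand-in for FWH's isutf8(), upper_keep_encoding() is a made-up name, and the case mapping covers only a-z and the Latin-1 letters (à to þ), whereas the real functions leave the mapping to Windows.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Simplified check: 1 if buf holds at least one multi-byte UTF-8 sequence
   and no malformed ones. Hypothetical stand-in for FWH's isutf8(); it does
   not reject every overlong form or surrogate. */
int looks_like_utf8(const unsigned char *buf, size_t len)
{
    size_t i = 0;
    int multibyte = 0;

    assert(buf != NULL);
    while (i < len) {
        unsigned char c = buf[i];
        size_t extra;

        if (c < 0x80)                    extra = 0;
        else if (c >= 0xC2 && c <= 0xDF) extra = 1;
        else if (c >= 0xE0 && c <= 0xEF) extra = 2;
        else if (c >= 0xF0 && c <= 0xF4) extra = 3;
        else return 0;                   /* stray continuation or invalid lead */

        if (extra > len - i - 1) return 0;            /* truncated sequence */
        for (size_t k = 1; k <= extra; ++k)
            if ((buf[i + k] & 0xC0) != 0x80) return 0;
        if (extra) multibyte = 1;
        i += extra + 1;
    }
    return multibyte;
}

/* Latin-1 lower-case letter: U+00E0..U+00FE except U+00F7 (division sign). */
static int is_latin1_lower(unsigned cp)
{
    return cp >= 0xE0 && cp <= 0xFE && cp != 0xF7;
}

/* Upper-case in place; the result has the same encoding as the input. */
void upper_keep_encoding(char *s)
{
    unsigned char *p = (unsigned char *)s;
    size_t len, i;
    int utf8;

    assert(s != NULL);
    len = strlen(s);
    utf8 = looks_like_utf8(p, len);

    for (i = 0; i < len; ++i) {
        if (p[i] >= 'a' && p[i] <= 'z') {
            p[i] = (unsigned char)(p[i] - 0x20);
        } else if (utf8 && p[i] == 0xC3 && i + 1 < len) {
            /* Two-byte sequence C3 xx: decode the code point. */
            unsigned cp = ((p[i] & 0x1Fu) << 6) | (p[i + 1] & 0x3Fu);
            if (is_latin1_lower(cp))
                p[i + 1] = (unsigned char)(p[i + 1] - 0x20);
            ++i;                         /* skip the continuation byte */
        } else if (!utf8 && is_latin1_lower(p[i])) {
            p[i] = (unsigned char)(p[i] - 0x20);   /* single ANSI byte */
        }
    }
}
```

With this sketch the UTF8 bytes C3 BC C3 A4 C3 B6 ("üäö") become C3 9C C3 84 C3 96, and the ANSI bytes FC E4 F6 become DC C4 D6. Greek, Cyrillic and other scripts are outside its small table, which is one reason to hand the conversion to the operating system instead.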

Here is a preview of one of these functions.
Code:
#include "fivewin.ch"

#xtranslate enc(<c>) => If(isutf8(<c>),"UTF8", "ANSI" )

function Main()

   local cAnsiLower  := "üäö"
   local cUtf8Lower  := AnsiToUtf8( cAnsiLower )
   local cUtf8Upper, cAnsiUpper

   cUtf8Upper  := winUpper( cUtf8Lower )
   cAnsiUpper  := winUpper( cAnsiLower )

   ? cUtf8Upper, STRTOHEX( cUtf8Upper, " " ), enc( cUtf8Upper )
      // --> "ÜÄÖ", "C3 9C C3 84 C3 96", "UTF8"
   ? cAnsiUpper, STRTOHEX( cAnsiUpper, " " ), enc( cAnsiUpper )
      // --> "ÜÄÖ", "DC C4 D6", "ANSI"

return nil

#pragma BEGINDUMP

#include <windows.h>
#include <hbapi.h>
#include <fwh.h>

LPSTR UTF16toUTF8( LPWSTR utf16 );

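HB_FUNC( WINUPPER )
{
   LPWSTR pStr;
   LPCSTR pRet;

   if( HB_ISCHAR( 1 ) )
   {
      pStr = fw_parWide( 1 );       // parameter 1 as a wide-character buffer
      CharUpperW( pStr );           // Windows upper-cases the buffer in place
      if( isutf8( hb_parc( 1 ), hb_parclen( 1 ) ) )
      {
         // UTF8 parameter: convert the result back to UTF8 before returning
         pRet = UTF16toUTF8( pStr );
         hb_retc( pRet );
         hb_xfree( ( void * ) pRet );
      }
      else
      {
         // non-UTF8 parameter: return the wide buffer through fw_retWide()
         fw_retWide( pStr );
      }
      hb_xfree( ( void * ) pStr );
   }
   else
   {
      hb_retc( "" );                // not a string: return an empty string
   }
}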

#pragma ENDDUMP


This works without setting any codepage, whether FW_SetUnicode() is set to .F. or .T.
Regards

G. N. Rao.
Hyderabad, India

Re: TGet() - UTF8 encoding fails [Unsolved]

Post by frose » Sun Oct 15, 2023 10:35 am

Tested with:
Code:
local cAnsiLower  := " Καλημέρα - Приве́ - ดีตอนเช้า"

I think that's really good :D

But of course I can't use it with TGet() and the picture clause "@!" because then the encoding changes :(

Re: TGet() - UTF8 encoding fails [Unsolved]

Post by nageswaragunupudi » Sun Oct 15, 2023 10:43 am

frose wrote: because then the encoding changes

We intend to address all issues with your help and feedback.

Re: TGet() - UTF8 encoding fails [Solved]

Post by frose » Sat Nov 04, 2023 9:36 am

Dear Mr. Nageswara Rao,
now encoding is OK :D
Thanks
Frank

Re: TGet() - UTF8 encoding fails [Solved]

Post by nageswaragunupudi » Sat Nov 04, 2023 9:48 am

Thank you.
This was possible because of your feedback.