UTF-8, 2-Byte characters, Lower() and Upper()

Postby frose » Sat Jun 24, 2023 6:41 am

The functions Lower() and Upper() don't work as expected for 2-byte UTF-8 characters.

Code:
#include "FiveWin.ch"

function Main()

   local oDlg
   local oEdit
   local cVar1 := "lowerüöäßUPPER"
   local cVar2 := "UPPERÄÜÖßlower"

   REQUEST HB_CODEPAGE_UTF8
   HB_CDPSELECT( "UTF8" )
   FW_SetUnicode( .T. )
   
   DEFINE DIALOG oDlg SIZE 600, 600 PIXEL TRUEPIXEL
   
   @  40, 20 EDIT oEdit VAR cVar1 SIZE 200,20 PIXEL OF oDlg
   
   @  60, 20 EDIT oEdit VAR cVar2 SIZE 200,20 PIXEL OF oDlg

   @  80, 20 BUTTON "CHECK" SIZE 100,40 PIXEL OF oDlg ACTION MsgInfo( ;
      Lower( "Lower( |" + cVar1 + "|" + CRLF + "|" + cVar2 + "| )" ) + CRLF + CRLF + ;
      Upper( "Upper( |" + cVar1 + "|" + CRLF + "|" + cVar2 + "| )" );
      )

   ACTIVATE DIALOG oDlg CENTERED
RETURN NIL
 

Windows 11 Pro 22H2 22621.1848
Microsoft (R) Windows (R) Resource Compiler Version 10.0.10011.16384
Harbour 3.2.0dev (r2008190002)
FWH 23.10 x86
frose
 
Posts: 392
Joined: Tue Mar 10, 2009 11:54 am
Location: Germany, Rietberg

Re: UTF-8, 2-Byte characters, Lower() and Upper()

Postby karinha » Sat Jun 24, 2023 2:06 pm

Code:

// C:\FWH...\SAMPLES\FROSEUT8.PRG

#include "FiveWin.ch"

REQUEST HB_LANG_PT
REQUEST HB_CODEPAGE_PT850

// REQUEST HB_CODEPAGE_PTISO
// REQUEST HB_CODEPAGE_UTF8EX

FUNCTION Main()

   LOCAL oDlg
   LOCAL oEdit
   LOCAL cVar1 := "lowerüöäßUPPER"
   LOCAL cVar2 := "UPPERÄÜÖßlower"

   HB_LANGSELECT( 'PT' )     // Default language is now Portuguese
   HB_SETCODEPAGE( "PT850" )

   /*
   HB_CDPSELECT( "PTISO" )

   hb_cdpSelect( "UTF8EX" )
   */


   HB_CDPSELECT( "UTF8" )

   FW_SetUnicode( .T. )
   
   DEFINE DIALOG oDlg SIZE 600, 600 PIXEL TRUEPIXEL
   
   @  40, 20 EDIT oEdit VAR cVar1 SIZE 200,20 PIXEL OF oDlg
   
   @  60, 20 EDIT oEdit VAR cVar2 SIZE 200,20 PIXEL OF oDlg

   /*
   @  80, 20 BUTTON "CHECK" SIZE 100,40 PIXEL OF oDlg ACTION MsgInfo( ;
      Lower( "Lower( |" + cVar1 + "|" + CRLF + "|" + cVar2 + "| )" ) + CRLF + CRLF + ;
      Upper( "Upper( |" + cVar1 + "|" + CRLF + "|" + cVar2 + "| )" );
      )
   */


   @  90, 20 BUTTON "CHECK" SIZE 100,40 PIXEL OF oDlg ;
      ACTION( VIEW_UTF8( cVar1, cVar2 ) )

   ACTIVATE DIALOG oDlg CENTERED

RETURN NIL

FUNCTION VIEW_UTF8( ccVar1, ccVar2 )

/*
MsgInfo( ;
      Lower( "Lower( |" + cVar1 + "|" + CRLF + "|" + cVar2 + "| )" ) + CRLF + CRLF + ;
      Upper( "Upper( |" + cVar1 + "|" + CRLF + "|" + cVar2 + "| )" );
      )*/


   ? OemToAnsi( LOWER( "Lower( |" + ccVar1 + "|" + CRLF + "|" + ccVar2 + "| )" ) )

   ? OemToAnsi( UPPER( "Upper( |" + ccVar1 + "|" + CRLF + "|" + ccVar2 + "| )" ) )

   // ? hb_strtoutf8( LOWER( ccVar1 ) )


RETURN NIL
 
João Santos - São Paulo - Brasil - Phone: +55(11)95150-7341
karinha
 
Posts: 7831
Joined: Tue Dec 20, 2005 7:36 pm
Location: São Paulo - Brasil

Re: UTF-8, 2-Byte characters, Lower() and Upper()

Postby nageswaragunupudi » Sat Jun 24, 2023 2:07 pm

By default, Lower() and Upper() work with English characters only.

We need to set the codepage of the desired language.
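
A minimal sketch of that idea for German text, assuming the input string is UTF-8 and that the Harbour build includes the DEWIN (German, Windows-1252) codepage. UpperDE() is a hypothetical helper name, and converting to the single-byte codepage and back is only one possible way to apply a language codepage to UTF-8 data:

```harbour
REQUEST HB_CODEPAGE_DEWIN

// UpperDE() is a hypothetical helper, not part of Harbour or FWH
FUNCTION UpperDE( cUtf8 )

   LOCAL cOld := hb_cdpSelect( "DEWIN" )   // Upper() follows the selected codepage
   LOCAL cRes

   // UTF-8 -> DEWIN, change case, DEWIN -> UTF-8
   cRes := hb_StrToUTF8( Upper( hb_UTF8ToStr( cUtf8 ) ) )

   hb_cdpSelect( cOld )                    // restore the caller's codepage

   RETURN cRes
```

Characters that have no equivalent in the chosen codepage are unlikely to survive the round trip, so this only suits text known to be in one language.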
Regards

G. N. Rao.
Hyderabad, India
nageswaragunupudi
 
Posts: 10646
Joined: Sun Nov 19, 2006 5:22 am
Location: India

Re: UTF-8, 2-Byte characters, Lower() and Upper()

Postby frose » Sun Jun 25, 2023 7:58 am

karinha wrote:
Code:
...
 

karinha, thank you very much, that helps to clarify things.
nageswaragunupudi wrote:
By default Lower() and Upper() work with English characters only.
We need to set the codepage of the desired language

OK, understood.

So, if I am in a multi-language environment, e.g.:
    - a dialog/browse that uses more than one language with diacritical marks
    - or a case-insensitive search where the source language of the search string is unknown
then functions like U8Lower() and U8Upper() are essential!
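
A rough sketch of what such helpers might look like, assuming Harbour's UTF8EX codepage is linked in and that its case mapping covers the characters involved. The names U8Upper() and U8Lower() come from the sentence above; the bodies, and the U8Contains() search helper, are hypothetical and not any library's implementation:

```harbour
REQUEST HB_CODEPAGE_UTF8EX

// Upper-case a UTF-8 string without depending on one language codepage
FUNCTION U8Upper( cUtf8 )

   LOCAL cOld := hb_cdpSelect( "UTF8EX" )  // switch codepage for this call only
   LOCAL cRes := Upper( cUtf8 )

   hb_cdpSelect( cOld )                    // restore the caller's codepage

   RETURN cRes

// Lower-case counterpart, same pattern
FUNCTION U8Lower( cUtf8 )

   LOCAL cOld := hb_cdpSelect( "UTF8EX" )
   LOCAL cRes := Lower( cUtf8 )

   hb_cdpSelect( cOld )

   RETURN cRes

// Case-insensitive "contains" test built on U8Lower()
FUNCTION U8Contains( cSearch, cText )
   RETURN At( U8Lower( cSearch ), U8Lower( cText ) ) > 0
```

Switching the codepage only inside the helper keeps the rest of the application on whatever codepage it already uses.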