FW_SetUnicode( .T. ) 2-Byte characters


Postby frose » Wed Jun 21, 2023 6:33 am

Hi,

Does anyone know why the 2-byte characters "üöä", "ÄÖÜ", "ßéÉÊ" are not interpreted correctly when they are typed in during editing?

Code:
FUNCTION Main()

   LOCAL aArray

   HB_CDPSELECT( "UTF8" )

   FW_SetUnicode( .T. )

   aArray := { "üöä", "ÄÖÜ", "ßéÉÊ"}

   XBrowse( aArray, "Unicode 2-Byte Test - FW_SetUnicode( .T. ) - aArray",,,,, !.F., .T.,,, .F., .T. )

RETURN NIL
 

The given characters are displayed correctly:

Image

But if you enter or edit the same characters, they are not interpreted correctly:

Image

What is going wrong?
Windows 11 Pro 22H2 22621.1848
Microsoft (R) Windows (R) Resource Compiler Version 10.0.10011.16384
Harbour 3.2.0dev (r2008190002)
FWH 23.10 x86
User avatar
frose
 
Posts: 392
Joined: Tue Mar 10, 2009 11:54 am
Location: Germany, Rietberg


Postby frose » Wed Jun 21, 2023 1:14 pm

This is the result of the example from https://forums.fivetechsupport.com/viewtopic.php?f=3&t=43246
with FW_SetUnicode( .T. ):

Code:

   local oDlg, oGet, oEdit
   local cVar1 := ""
   local cVar2 := ""

   HB_CDPSELECT( "UTF8" )
   FW_SetUnicode( .T. )

   DEFINE DIALOG oDlg SIZE 300,300 PIXEL TRUEPIXEL

   @  20,20 GET oGet VAR cVar1 SIZE 200,20 PIXEL OF oDlg VARCHAR 20

   @  60,20 EDIT oEdit VAR cVar2 SIZE 200,20 PIXEL OF oDlg

   @ 100,20 BUTTON "CHECK" SIZE 100,40 PIXEL OF oDlg ACTION MsgInfo( "|" + cVar1 + "|" + CRLF + "|" + cVar2 + "|" )

   ACTIVATE DIALOG oDlg CENTERED
 


Image

No words. What is going on?

Postby nageswaragunupudi » Wed Jun 21, 2023 3:54 pm

So cVar2 (EDIT control) is correct, but cVar1 (GET control) is not?

I need to do more tests at my end.

Did you try keeping FW_SetUnicode( .F. ) (the default) and setting the following instead?
Code:
   HB_LangSelect("DE")
   HB_SetCodePage("DEWIN")
Regards

G. N. Rao.
Hyderabad, India
User avatar
nageswaragunupudi
 
Posts: 10620
Joined: Sun Nov 19, 2006 5:22 am
Location: India


Postby frose » Wed Jun 21, 2023 6:07 pm

nageswaragunupudi wrote: cVar2 using EDIT control is correct but cVar1 using GET control is not correct?

Yes!

I did some more tests with ü (UTF-8 bytes 0xC3 0xBC) and some other 2-byte characters:
- The first 2-byte character is OK.
- All following 2-byte characters are not OK.
- Pasting one or more 2-byte characters from the clipboard works!
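
For readers following along, here is a minimal console sketch (plain Harbour, no FiveWin) of what "2-byte character" means in this context. It builds the word "Müller" from explicit bytes, so the result does not depend on the encoding of the source file, and compares the byte count with the character count. The sample word is only an illustration, and nothing here is taken from the FWH sources.

```harbour
REQUEST HB_CODEPAGE_UTF8

PROCEDURE Main()

   LOCAL cText

   hb_cdpSelect( "UTF8" )

   // "ü" is U+00FC, which UTF-8 encodes as the two bytes 0xC3 0xBC.
   // Building the string byte by byte keeps this sample independent
   // of how the .prg file itself is saved.
   cText := "M" + hb_BChar( 0xC3 ) + hb_BChar( 0xBC ) + "ller"

   ? "bytes     :", hb_BLen( cText )            // 7
   ? "characters:", hb_utf8Len( cText )         // 6
   ? "hex       :", hb_StrToHex( cText, " " )   // 4D C3 BC 6C 6C 65 72

   RETURN
```

Any code that mixes these two counts (for example a cursor position counted in characters with a buffer offset counted in bytes) goes wrong as soon as the text contains one such character.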

nageswaragunupudi wrote: Did you try keeping FW_SetUnicode( .f. ) // default
and try setting
Code:
  HB_LangSelect("DE")
   HB_SetCodePage("DEWIN")

I have used DE850 until now, but I want to use UTF-8 in the future.

Postby nageswaragunupudi » Thu Jun 22, 2023 1:31 am

frose wrote: Used DE850 until now,

Is this working perfectly?
Can you please let me see your settings?

Postby frose » Thu Jun 22, 2023 5:23 am

nageswaragunupudi wrote: Is this working perfectly?

Yes.

nageswaragunupudi wrote: Can you please let me see your settings?

Windows 11 Pro 22H2 22621.1848
Harbour 3.2.0dev (r2008190002)
Microsoft (R) Windows (R) Resource Compiler Version 10.0.10011.16384
FWH 23.04 x86

Postby nageswaragunupudi » Thu Jun 22, 2023 12:47 pm

I do not have a German keyboard, but I am using a virtual touch keyboard downloaded from Google.
I noticed the same issues.
We are going to look into the issue and solve it.
This may take some time.
I suggest you postpone moving to Unicode for a few days, until we make this work perfectly.

Meanwhile, you can help me with two things:
1. Let me know whether the multiline Get
@ r,c, GET ctext MEMO/TEXT ..
works perfectly with German text when FW_SetUnicode() is .T.

2. Can you paste all the problematic German characters here?

Postby frose » Fri Jun 23, 2023 6:56 am

nageswaragunupudi wrote: I do not have German keyboard, but I am using virtual touch keyboard downloaded from Google.
I noticed the same issues.

I use the 'Comfort On-Screen Keyboard Pro' for this purpose.

nageswaragunupudi wrote: We are going to look into and solve the issue.

Good to know.

nageswaragunupudi wrote: This may take some time.
I suggest you to postpone moving to Unicode for a few days, till we make this work perfectly.

No problem, there are enough other problems (challenges) left when switching from xHarbour.com/DE850 to Harbour/UTF-8.

nageswaragunupudi wrote: Meanwhile you can help me by
1. Let me know if the multiline Get
@ r,c, GET ctext MEMO/TEXT ..
is working perfectly with German lang when FW_SetUnicode() is .T. ?

OK, I will do that.

nageswaragunupudi wrote: 2. Can you paste all problem German characters here?

There are only three German umlauts:
    üÜ
    öÖ
    äÄ
plus the ß, whose uppercase form is 'SS'.
For more information see https://www.berlitz.com/blog/german-umlaut-meaning-letters
In UTF-8 there are a lot of other 2-byte characters, used in French, Spanish, Danish, Croatian, etc.
All 2-byte characters I have tested are affected!

One other aspect that (perhaps) fits the topic:
If you switch your Windows machine to 'Beta: Use Unicode UTF-8 for worldwide language support', 2-byte characters in TGet() are handled differently: the first character appears as � (https://www.compart.com/en/unicode/U+FFFD), and all following characters are OK!
After a while I turned the switch 'Beta: Use Unicode UTF-8 for worldwide language support' off again, because it has some side effects on other applications. And in my Harbour app it is better to see the misinterpreted characters, e.g. ü, instead of the �.
In this context this site was/is very helpful for me: https://www.i18nqa.com/debug/utf8-debug.html
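
The kind of misinterpretation that such debug tables list can be reproduced in a few lines. The following is only a sketch, assuming the DEWIN codepage stands in for the Windows ANSI codepage of a German system: it takes the two UTF-8 bytes of ü, pretends they are single-byte ANSI text, and converts that "text" to UTF-8. It does not show what TGet() does internally.

```harbour
REQUEST HB_CODEPAGE_DEWIN
REQUEST HB_CODEPAGE_UTF8

PROCEDURE Main()

   // "ü" (U+00FC) encoded as UTF-8: two bytes, 0xC3 0xBC.
   LOCAL cUtf8 := hb_BChar( 0xC3 ) + hb_BChar( 0xBC )
   LOCAL cWrong

   // Treat the two bytes as DEWIN (Windows-1252) text and convert it to
   // UTF-8. Each input byte is then handled as a character of its own.
   cWrong := hb_Translate( cUtf8, "DEWIN", "UTF8" )

   ? "original bytes     :", hb_StrToHex( cUtf8, " " )
   ? "misread, characters:", hb_utf8Len( cWrong )
   ? "misread, bytes     :", hb_StrToHex( cWrong, " " )

   RETURN
```

In Windows-1252, byte 0xC3 is Ã and byte 0xBC is ¼, so one ü turns into the two-character garbage "Ã¼" that is typical of UTF-8 text read as ANSI.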
Last edited by frose on Sun Jun 25, 2023 4:08 pm, edited 1 time in total.

Postby frose » Fri Jun 23, 2023 8:18 am

nageswaragunupudi wrote: Meanwhile you can help me by
1. Let me know if the multiline Get
@ r,c, GET ctext MEMO/TEXT ..
is working perfectly with German lang when FW_SetUnicode() is .T. ?

It works without any problems in TMultiGet().
Image
Code:
   local oDlg
   local oGet
   local oEdit
   local oMemo
   local cVar1 := ""
   local cVar2 := ""
   local cVar3 := ""
   local cVar4 := ""

   REQUEST HB_CODEPAGE_UTF8
   HB_CDPSELECT( "UTF8" )
   FW_SetUnicode( .T. )
   
   DEFINE DIALOG oDlg SIZE 600, 600 PIXEL TRUEPIXEL
   
   @  20, 20 GET oGet VAR cVar1 SIZE 200,20 PIXEL OF oDlg VARCHAR 20
   
   @  40,20 EDIT oEdit VAR cVar2 SIZE 200,20 PIXEL OF oDlg
   
   @  60, 20 GET oGet VAR cVar3 MULTILINE SIZE 200, 50 PIXEL OF oDlg

   @ 120, 20 GET oMemo VAR cVar4 MEMO OF oDlg PIXEL SIZE 400, 100
   
   @ 220, 20 BUTTON "CHECK" SIZE 100,40 PIXEL OF oDlg ;
      ACTION MsgInfo( "|" + cVar1 + "|" + CRLF + "|" + cVar2 + "|" + cVar3 + "|" + cVar4 + "|" )

   ACTIVATE DIALOG oDlg CENTERED
 

Postby frose » Fri Jun 23, 2023 9:51 am

Something else I noticed: during editing, the subsequent 2-byte characters are sometimes interpreted CORRECTLY.
So the error is probably related to the length calculation of the preceding characters, somewhere deep inside.
HTH
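
To illustrate why a length or position miscalculation would produce exactly this kind of damage, here is a small console sketch in plain Harbour. It is only an assumption about the class of bug, not an analysis of the actual TGet() code: it cuts the same UTF-8 string once by bytes and once by characters.

```harbour
PROCEDURE Main()

   // "Müller" in UTF-8, built from explicit bytes (ü = 0xC3 0xBC).
   LOCAL cText := "M" + hb_BChar( 0xC3 ) + hb_BChar( 0xBC ) + "ller"

   // Asking for "2 characters" with a byte-based function splits ü in half
   // and leaves a dangling lead byte behind.
   ? "hb_BLeft( cText, 2 )   :", hb_StrToHex( hb_BLeft( cText, 2 ), " " )      // 4D C3

   // The UTF-8 aware function keeps both bytes of ü together.
   ? "hb_utf8Left( cText, 2 ):", hb_StrToHex( hb_utf8Left( cText, 2 ), " " )   // 4D C3 BC

   RETURN
```

A buffer that ends in a lone 0xC3 is no longer valid UTF-8, so whatever is typed next can be combined with it or displayed as garbage.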

Postby Uwe.Diemer » Fri Jun 23, 2023 1:30 pm

Same problem here with Unicode.

I want to move from xHarbour to Harbour.

My GET field blocks: if I type "Müller", it stops at "Mü".

U. Diemer, using ADS Server 12.2
User avatar
Uwe.Diemer
 
Posts: 98
Joined: Mon Aug 09, 2010 11:00 am


Postby nageswaragunupudi » Fri Jun 23, 2023 4:01 pm

Uwe.Diemer wrote: same Prob here with unicode

I want move to Harbour from xHarbour

My getfield blocks if itype "Müller" t stops at "Mü"

U.diemer using ads Server 12.2


Is it working well with xHarbour but not with Harbour?

Postby frose » Sat Jun 24, 2023 6:31 am

Uwe.Diemer wrote: same Prob here with unicode

I want move to Harbour from xHarbour

My getfield blocks if itype "Müller" t stops at "Mü"

U.diemer using ads Server 12.2

I cannot confirm this behavior for Harbour.
Uwe, try the example from this thread.

Postby frose » Mon Jun 26, 2023 9:15 am

nageswaragunupudi wrote: Meanwhile you can help me by
1. Let me know if the multiline Get
@ r,c, GET ctext MEMO/TEXT ..
is working perfectly with German lang when FW_SetUnicode() is .T. ?


TEdit() does not work correctly yet. Something happens with <cVar1> and <cVar2>:
Code:
#include "FiveWin.ch"

FUNCTION Main()

   LOCAL oDlg
   LOCAL oEdit
   LOCAL cU82Lower
   LOCAL cU82Upper
   LOCAL cVar1
   LOCAL cVar2

   REQUEST HB_CODEPAGE_UTF8
   HB_CDPSELECT( "UTF8" )
   FW_SetUnicode( .T. )

   cU82Lower := "lowerüöäßUPPER"
   cU82Upper := "UPPERÄÜÖßlower"
   cVar1     := cU82Lower
   cVar2     := cU82Upper
   
   MsgInfo( ;
      "cVar1: " + cVar1 + CRLF + CRLF + ;
      "cVar2: " + cVar2, ;
      "Test Encoding";
      )
   DEFINE DIALOG oDlg SIZE 600, 600 PIXEL TRUEPIXEL
   
   @  40, 20 EDIT oEdit VAR cVar1 SIZE 200,20 PIXEL OF oDlg
   
   @  60, 20 EDIT oEdit VAR cVar2 SIZE 200,20 PIXEL OF oDlg

   @  80, 20 BUTTON "CHECK" SIZE 100,40 PIXEL OF oDlg ACTION MsgInfo( ;
      "cVar1: " + cVar1 + CRLF + CRLF + ;
      "cVar2: " + cVar2, ;
      "Test Encoding";
      )

   ACTIVATE DIALOG oDlg CENTERED

   MsgInfo( ;
      "cVar1: " + cVar1 + CRLF + CRLF + ;
      "cVar2: " + cVar2, ;
      "Test Encoding";
      )

   MsgInfo( ;
      "cU82Lower: " + cU82Lower + CRLF + CRLF + ;
      "cU82Upper: " + cU82Upper, ;
      "Test Encoding";
      )
     
RETURN NIL
 

Image

Postby karinha » Mon Jun 26, 2023 10:40 am

Code:

// C:\FWH..\SAMPLES\FROSE2UT.PRG

#include "FiveWin.ch"

REQUEST HB_LANG_PT
REQUEST HB_CODEPAGE_PT850

// REQUEST HB_CODEPAGE_UTF8 ???? Harbour? No xHarbour.

// REQUEST HB_CODEPAGE_PTISO
// REQUEST HB_CODEPAGE_UTF8EX

FUNCTION Main()

   LOCAL oDlg
   LOCAL oEdit
   LOCAL cU82Lower
   LOCAL cU82Upper
   LOCAL cVar1
   LOCAL cVar2

   HB_LANGSELECT( 'PT' )     // Default language is now Portuguese
   HB_SETCODEPAGE( "PT850" )

   HB_CDPSELECT( "UTF8" )

   /*
   HB_CDPSELECT( "PTISO" )

   hb_cdpSelect( "UTF8EX" )
   */


   FW_SetUnicode( .T. )

   // cU82Lower := OemToAnsi( LOWER( "lowerüöäßUPPER" ) )
   // cU82Upper := OemToAnsi( UPPER( "UPPERÄÜÖßlower" ) )

   // OR:

   cU82Lower := LOWER( "lowerüöäßUPPER" )
   cU82Upper := UPPER( "UPPERÄÜÖßlower" )

   cVar1     := cU82Lower
   cVar2     := cU82Upper

   DEFINE DIALOG oDlg SIZE 600, 600 PIXEL TRUEPIXEL
   
   @  40, 20 EDIT oEdit VAR cVar1 SIZE 200,20 PIXEL OF oDlg
   
   @  60, 20 EDIT oEdit VAR cVar2 SIZE 200,20 PIXEL OF oDlg

   @  80, 20 BUTTON "CHECK" SIZE 100,40 PIXEL OF oDlg ACTION MsgInfo( ;
      "cVar1: " + cVar1 + CRLF + CRLF + ;
      "cVar2: " + cVar2, ;
      "Test Encoding";
      )

   ACTIVATE DIALOG oDlg CENTERED

RETURN NIL
 


Regards, saludos.
João Santos - São Paulo - Brasil - Phone: +55(11)95150-7341
User avatar
karinha
 
Posts: 7794
Joined: Tue Dec 20, 2005 7:36 pm
Location: São Paulo - Brasil
