
Re: AscW() function

PostPosted: Thu Jun 13, 2019 9:45 am
by Enrico Maria Giordano
Sorry, I'm not familiar with Unicode. Just give me a sample showing the problem and I'll see what I can do.

EMG

Re: AscW() function

PostPosted: Thu Jun 13, 2019 9:48 am
by nageswaragunupudi
From the beginning, we have been going in the wrong direction. I am sorry for this deviation.
This is not about conversion to and from UTF8 but about WideChar (16-bit character encoding).
Windows internally works with WideChar, i.e., 16-bit (little-endian) characters.

In 8bit notation

"AB" is "4142" in hex

In 16bit notation

"AB" is "41004200" in hex

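The two notations above can be reproduced outside Harbour. This is a minimal C sketch, assuming plain ASCII input, where each byte widens to a 16-bit little-endian unit by appending a zero byte. The helper names `hex_8bit` and `hex_16bit_le` are illustrative only and not part of any library:

```c
#include <stdio.h>
#include <string.h>

/* Write the bytes of an 8-bit string as uppercase hex into out.
   out must hold at least 2 * strlen( s ) + 1 chars. */
static void hex_8bit( const char * s, char * out )
{
   size_t i, len = strlen( s );

   for( i = 0; i < len; i++ )
      sprintf( out + 2 * i, "%02X", ( unsigned char ) s[ i ] );
   out[ 2 * len ] = '\0';
}

/* Write the same string as 16-bit little-endian units in hex:
   low byte first, then the high byte (always 00 for ASCII input).
   out must hold at least 4 * strlen( s ) + 1 chars. */
static void hex_16bit_le( const char * s, char * out )
{
   size_t i, len = strlen( s );

   for( i = 0; i < len; i++ )
      sprintf( out + 4 * i, "%02X00", ( unsigned char ) s[ i ] );
   out[ 4 * len ] = '\0';
}
```

Calling `hex_8bit( "AB", buf )` fills `buf` with "4142", and `hex_16bit_le( "AB", buf )` fills it with "41004200".
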
Re: AscW() function

PostPosted: Thu Jun 13, 2019 9:59 am
by Enrico Maria Giordano
So, what's the point?

EMG

Re: AscW() function

PostPosted: Thu Jun 13, 2019 11:40 am
by nageswaragunupudi
BIN2W( cWideString ) should give the value equivalent to AscW()
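
As a rough illustration of why this works: BIN2W() interprets the first two bytes of a string as an unsigned 16-bit little-endian integer, which for a WideChar string is the code unit of its first character. A sketch in C follows; `bin2w_sketch` is a hypothetical stand-in, not the Harbour implementation, and treating missing bytes as zero is an assumption of this example:

```c
#include <stddef.h>

/* Read up to two bytes as an unsigned 16-bit little-endian value.
   Missing bytes are treated as zero (an assumption of this sketch). */
static unsigned int bin2w_sketch( const unsigned char * p, size_t len )
{
   unsigned int lo = len > 0 ? p[ 0 ] : 0;   /* low byte comes first */
   unsigned int hi = len > 1 ? p[ 1 ] : 0;   /* high byte follows    */

   return lo | ( hi << 8 );
}
```

For "AB" in 16-bit notation (bytes 41 00 42 00) this returns 65, the code of "A". For the bytes 05 0C it returns 3077 (0x0C05).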

Re: AscW() function

PostPosted: Thu Jun 13, 2019 11:53 am
by nageswaragunupudi
This small program demonstrates that BIN2W() works like AscW()
Code:
#include "fivewin.ch"

function Main()

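   local cUtf8, cWide, nAsc

   cUtf8  := "అ"                      // UTF-8 encoded Telugu letter
   ? cUtf8
   cWide  := UTF8TOUTF16( cUtf8 )     // convert to WideChar (16-bit)
   nAsc   := BIN2W( cWide )           // first 16-bit unit as a number
   ? nAsc
   ? "Proof", HB_UTF8CHR( nAsc )      // back to UTF-8 for display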

return nil
 

Re: AscW() function

PostPosted: Thu Jun 13, 2019 11:54 am
by Enrico Maria Giordano
So, what about HB_UTF8ASC()? Do we need it?

EMG

Re: AscW() function

PostPosted: Thu Jun 13, 2019 12:22 pm
by nageswaragunupudi
Enrico Maria Giordano wrote: So, what about HB_UTF8ASC()? Do we need it?

EMG


We need it.

This may not be what Mr. Natter is looking for, but we need it in xHarbour.
Harbour has both HB_UTF8CHR() and HB_UTF8ASC().
xHarbour has only HB_UTF8CHR(), not HB_UTF8ASC().
Adding this function to xHarbour would be useful.

Re: AscW() function

PostPosted: Thu Jun 13, 2019 8:59 pm
by Enrico Maria Giordano
Done. Please try xHarbour build 10253.

EMG

Re: AscW() function

PostPosted: Thu Jun 13, 2019 9:01 pm
by nageswaragunupudi
Thank you.

FWH is now using HB_UTF8CHR(). There is no problem with either Harbour or xHarbour.
So far there has been no need to use HB_UTF8ASC().
But if and when the need arises, what shall we do? There will be many users of older versions of xHarbour. It looks like we would need to force them to upgrade xHarbour.

But what about xharbour.com users? They would get unresolved externals errors.

Re: AscW() function

PostPosted: Thu Jun 13, 2019 9:22 pm
by Enrico Maria Giordano
They can use this code:

Code:
#pragma BEGINDUMP


#include "error.ch"
#include "hbapierr.h"


static BOOL utf8tou16nextchar( UCHAR ucChar, int * n, USHORT * uc )
{
   if( *n > 0 )
   {
      if( ( ucChar & 0xc0 ) != 0x80 )
         return FALSE;
      *uc = ( *uc << 6 ) | ( ucChar & 0x3f );
      ( *n )--;
      return TRUE;
   }

   *n    = 0;
   *uc   = ucChar;
   if( ucChar >= 0xc0 )
   {
      if( ucChar < 0xe0 )
      {
         *uc   &= 0x1f;
         *n    = 1;
      }
      else if( ucChar < 0xf0 )
      {
         *uc   &= 0x0f;
         *n    = 2;
      }
      else if( ucChar < 0xf8 )
      {
         *uc   &= 0x07;
         *n    = 3;
      }
      else if( ucChar < 0xfc )
      {
         *uc   &= 0x03;
         *n    = 4;
      }
      else if( ucChar < 0xfe )
      {
         *uc   &= 0x01;
         *n    = 5;
      }
   }
   return TRUE;
}


HB_FUNC( HB_UTF8ASC )
{
   const char * pszString = hb_parc( 1 );

   if( pszString )
   {
      HB_SIZE nLen = hb_parclen( 1 );
      USHORT wc = 0;
      int n = 0;

      while( nLen )
      {
         if( ! utf8tou16nextchar( ( unsigned char ) *pszString, &n, &wc ) )
            break;

         if( n == 0 )
            break;

         pszString++;

         nLen--;
      }

      hb_retnint( wc );
   }
   else
      hb_errRT_BASE_SubstR( EG_ARG, 3012, NULL, HB_ERR_FUNCNAME, HB_ERR_ARGS_BASEPARAMS );
}

#pragma ENDDUMP


EMG
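
To try the decoding logic without the Harbour C API, a standalone sketch along the same lines follows. It returns the first code point of a UTF-8 byte sequence. `utf8_first_cp` is a hypothetical name, and unlike the USHORT-based code above it uses a wider accumulator and accepts only lead bytes for sequences of up to 4 bytes; both are assumptions of this example, not what the posted function does:

```c
#include <stddef.h>

/* Decode the first code point of a UTF-8 byte sequence.
   Returns 0 for an empty input. On a truncated or malformed
   sequence, returns whatever bits were accumulated so far. */
static unsigned long utf8_first_cp( const unsigned char * s, size_t len )
{
   unsigned long cp;
   int n;       /* continuation bytes still expected */
   size_t i;

   if( len == 0 )
      return 0;

   cp = s[ 0 ];
   if( cp < 0x80 )
      n = 0;                          /* single-byte (ASCII)   */
   else if( ( cp & 0xE0 ) == 0xC0 )
   {
      cp &= 0x1F;                     /* 2-byte sequence       */
      n = 1;
   }
   else if( ( cp & 0xF0 ) == 0xE0 )
   {
      cp &= 0x0F;                     /* 3-byte sequence       */
      n = 2;
   }
   else if( ( cp & 0xF8 ) == 0xF0 )
   {
      cp &= 0x07;                     /* 4-byte sequence       */
      n = 3;
   }
   else
      n = 0;                          /* invalid lead: as-is   */

   for( i = 1; n > 0 && i < len; i++, n-- )
   {
      if( ( s[ i ] & 0xC0 ) != 0x80 ) /* not a continuation    */
         break;
      cp = ( cp << 6 ) | ( s[ i ] & 0x3F );
   }
   return cp;
}
```

For the three bytes E0 B0 85 (the UTF-8 form of the Telugu letter అ used in the earlier sample) it returns 3077 (0x0C05). For the single byte 41 it returns 65.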