
Recognizing a Word-document

Posted: Fri Mar 11, 2022 9:21 am
by driessen
Hello,

Is there a way to find out which format a document is in before I open it in Word from my FWH application?
Is it an RTF document, a DOC, or a DOCX?
I want to determine this without taking the filename extension into consideration.
Thank you very much.

Re: Recognizing a Word-document

Posted: Fri Mar 11, 2022 3:17 pm
by James Bott
Hmm, why not use the extension? That seems to be the simplest way.

Re: Recognizing a Word-document

Posted: Fri Mar 11, 2022 7:35 pm
by Otto
Michel, what about reading the file with MemoRead()?

An RTF file (even one saved with a .doc extension) starts with {\rtf1, whereas a DOCX is a zip file.

Best regards,
Otto
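
A minimal sketch of this idea, assuming the whole file can be read with MemoRead() and that checking the leading bytes is enough. WordFileKind() is a hypothetical helper name, not an FWH function. Note that the OLE and ZIP signatures identify only the container, so an old XLS or a plain ZIP file would be reported as DOC or DOCX respectively.

```harbour
// Hypothetical helper (not part of FWH): guess the container format
// of a file from its leading bytes, ignoring the extension.
function WordFileKind( cFile )

   local cHead := Left( MemoRead( cFile ), 8 )   // "" if the file cannot be read
   local cOle  := Chr( 0xD0 ) + Chr( 0xCF ) + Chr( 0x11 ) + Chr( 0xE0 ) + ;
                  Chr( 0xA1 ) + Chr( 0xB1 ) + Chr( 0x1A ) + Chr( 0xE1 )
   local cZip  := "PK" + Chr( 3 ) + Chr( 4 )

   do case
   case Left( cHead, 5 ) == "{\rtf"
      return "RTF"
   case cHead == cOle
      return "DOC"      // OLE compound file; old XLS and PPT files share this header
   case Left( cHead, 4 ) == cZip
      return "DOCX"     // ZIP container; XLSX, PPTX and plain ZIP files share it
   endcase

return "UNKNOWN"
```

It could be called as WordFileKind( cFile ) before handing the file to Word. For large files, reading only the first 8 bytes with FOpen() / FRead() would avoid loading the whole document into memory.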

Re: Recognizing a Word-document

Posted: Sun Mar 13, 2022 10:40 pm
by nageswaragunupudi
The existing FWH function MemoryBufferType() has been extended for the next release.

Code:

function MemoryBufferType( cBuf )

   local cType, n, a
   local cPunct   := Chr( 9 ) + Chr( 10 ) + Chr( 12 ) + Chr( 13 ) + Chr( 26 ) + Chr( 141 )
   local lExact   := Set( _SET_EXACT, .f. )
   local lBinary  := .f.
   local aTypes   := { ;
         { "IMG.BMP",   "BM",                                  .t. }, ;
         { "IMG.PNG",   Chr( 0x89 ) + "PNG",                   .f. }, ;
         { "IMG.ICO",   Chr( 0 ) + Chr( 0 ) + Chr( 1 ),        .f. }, ;
         { "IMG.JPG",   Chr( 255 ) + Chr( 216 ) + Chr( 255 ),  .f. }, ;
         { "IMG.GIF",   "GIF8",                                .t. }, ;
         { "IMG.TIF",   Chr(73) + Chr(73) + Chr(42),           .t. }, ;
         { "IMG.TIF",   Chr(77) + Chr(77) + Chr(42),           .t. }, ;
         { "IMG.EMF",   Chr( 1 ) + Chr( 0 ) + Chr( 0 ) + Chr( 0 ),   .f. }, ;
         { "IMG.WMF",   HEXTOSTR( "D7CDC69A00" ),              .f. }, ;
         { "DOC.XML",   "<?xml ",                              .f. }, ;
         { "DOC.RTF",   "{\rtf",                               .f. }, ;
         { "DOC.GTF",   "GTF" + Chr( 5 ),                      .f. }, ;
         { "DOC.PDF",   "%PDF-",                               .f. }, ;
         { "DOC.DOC",   HEXTOSTR( "D0CF11E0A1B11AE1" ),        .f. }, ;
         { "DOC.DCX",   HEXTOSTR( "504B030414" ),              .f. }  }

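   // With SET EXACT off, "=" is true when cBuf begins with the signature a[ 2 ]
   for each a in aTypes
      if cBuf = a[ 2 ]
         if a[ 3 ]
            // Printable signature: accept only if the bytes after it look binary
            if IsBinaryData( SubStr( cBuf, 3, 15 ) )
               cType    := a[ 1 ]
            endif
         else
            cType    := a[ 1 ]
         endif
         EXIT
      endif
   next
   Set( _SET_EXACT, lExact )   // restore the caller's SET EXACT setting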
   if ! Empty( cType )
      return cType
   endif

   lBinary     := IsBinaryData( cBuf )
   if lBinary
      if FreeImageIsLoaded() .and. ( n := IfNil( FITypeFromMemory( cBuf ), -1 ) ) >= 0
         return "IMG." + cValToChar( n )
      endif
      cType := "BIN.HEX"
   else
      if IsUTF8( cBuf )
         return "TXT.UTF8"
      else
         cType    := "TXT.ANSI"
      endif
   endif

return cType
 


This function can recognize whether the given buffer is an image type or a document type (RTF, GTF, PDF, DOC, DCX for DOCX, etc.).
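
A short usage sketch, assuming an FWH build that already contains the extended MemoryBufferType() shown above. The file path is hypothetical; the type codes are the ones listed in aTypes.

```harbour
#include "fivewin.ch"

function Main()

   local cFile := "c:\temp\incoming.tmp"   // hypothetical path
   local cType := MemoryBufferType( MemoRead( cFile ) )

   do case
   case cType == "DOC.RTF"
      MsgInfo( "RTF document" )
   case cType == "DOC.DOC"
      MsgInfo( "DOC (OLE compound file)" )
   case cType == "DOC.DCX"
      MsgInfo( "DOCX (ZIP container)" )
   otherwise
      MsgInfo( "Not a Word format: " + cType )
   endcase

return nil
```

Because the "DOC.DOC" and "DOC.DCX" signatures are the generic OLE and ZIP headers, other Office files (XLS, XLSX, PPTX) would presumably produce the same codes, so a further check may be needed if such files can appear.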