WORD 2007 docx: search/replace is working

Postby Otto » Sun Nov 09, 2008 11:08 pm

I ran some tests with the new Word Open XML format and have
search and replace working.

This is what I did:
1. Rename the file from .docx to .zip
2. Unpack the archive
3. Search document.xml for the placeholders and replace them
4. Zip the files again and rename the archive back to .docx

Walkthrough: Word 2007 XML Format
http://msdn.microsoft.com/en-us/library/ms771890.aspx

Regards,
Otto

Code:
#include "FiveWin.ch"
#INCLUDE "directry.ch"
//----------------------------------------------------------------------------//
function Main()
   local DCOM
   local cTxtFile
   local cRCFile
   local cPath := "c:\soft"    // temporary directory used to expand the docx file
   local aFiles

   DELETE FILE ('archive.zip')
   DELETE FILE ('temp.zip')
   
   // clear the directory (there is surely a better way)
   do while .t.
      aFiles   := directory( cPath + '\*.*', 'D' )
      if len(aFiles) > 2
         dir_recurs( cPath )
      else
         exit
      endif
   enddo


   MOVEFILE( "demo.docx" , "temp.zip" )
   SYSREFRESH()

   // extracts all files from the archive temp.zip into the cPath folder (c:\soft)
   // 7Z.exe - thanks to Richard - maybe someone can help with the built-in function
   DCOM  := '7Z.exe x '  + "temp.zip" + " -o" + cPath + " *.* -r"
   WAITRUN(DCOM,0)
   SYSREFRESH()

   cRCFile  := cPath + "\word\document.xml"
   cTxtFile := MemoRead( cRCFile )
   SYSREFRESH()

   cTxtFile := STRTRAN(cTxtFile, "[ANREDE]",           "Herr")
   cTxtFile := STRTRAN(cTxtFile, "[TITELVORNAMENAME]", "Dr. Mustermann Hans")
   cTxtFile := STRTRAN(cTxtFile, "[BRIEFANRED]",       "Lieber Hans")
   cTxtFile := STRTRAN(cTxtFile, "[STRASSE]",          "Bahnhofstrasse")
   cTxtFile := STRTRAN(cTxtFile, "[ORT]",              "Musterort" )

   // xHarbour's memowrit adds a chr(26) - therefore I linked the Harbour function here
   memowrit(cRCFile, cTxtFile,.f.  )
   SYSREFRESH()

   // adds all files and subfolders from folder c:\soft to the archive archive.zip
   DCOM  := '7z a -tzip archive.zip  -ir!c:\soft\*.*'

   WAITRUN(DCOM,0)
   SYSREFRESH()

   MOVEFILE("archive.zip", "demo.docx" )

   SYSREFRESH()

   msginfo("Ende")
return nil

//----------------------------------------------------------------------------//

static function dir_recurs( cPath )
   local x
   local aFiles    := directory( cPath + '\*.*', 'D' )
   local nFilCount := len( aFiles )

   for x := 1 to nFilCount

      // skip the "." and ".." pseudo entries
      if aFiles[ x, F_NAME ] == '.' .or. aFiles[ x, F_NAME ] == '..'
         loop
      endif

      if 'D' $ aFiles[ x, F_ATTR ]
         // empty and remove the subfolder first
         dir_recurs( cPath + '\' + aFiles[ x, F_NAME ] )
      else
         ferase( cPath + '\' + aFiles[ x, F_NAME ] )
      endif
   next

   // a folder can only be removed once it is empty
   dirremove( cPath )

return NIL
//----------------------------------------------------------------------------//

#pragma BEGINDUMP

#include <hbapi.h>
#include <hbapiitm.h>
#include "hbapifs.h"

HB_FUNC( MEMOWRIT )
{
   PHB_ITEM pFileName = hb_param( 1, HB_IT_STRING );
   PHB_ITEM pString = hb_param( 2, HB_IT_STRING );
   BOOL bWriteEof = TRUE; /* write Eof !, by default is .T. */
   BOOL bRetVal = FALSE;

   if( hb_parinfo(0) == 3 && ISLOG( 3 ) )
      bWriteEof = hb_parl( 3 );

   if( pFileName && pString )
   {
      FHANDLE fhnd = hb_fsCreate( ( BYTE * ) hb_itemGetCPtr( pFileName ), FC_NORMAL );

      if( fhnd != FS_ERROR )
      {
         ULONG ulSize = hb_itemGetCLen( pString );

         bRetVal = ( hb_fsWriteLarge( fhnd, ( BYTE * ) hb_itemGetCPtr( pString ), ulSize ) == ulSize );

         /* NOTE: CA-Clipper will add the EOF even if the write failed. [vszakats] */
         /* NOTE: CA-Clipper will not return .F. when the EOF could not be written. [vszakats] */
#if ! defined(OS_UNIX_COMPATIBLE)
         {
            if( bWriteEof ) /* if true, then write EOF */
            {
               BYTE byEOF = HB_CHAR_EOF;

               hb_fsWrite( fhnd, &byEOF, sizeof( BYTE ) );
            }
         }
#endif

         hb_fsClose( fhnd );
      }
   }

   hb_retl( bRetVal );
}

#pragma ENDDUMP

//----------------------------------------------------------------------------//

Postby MauroArevalo » Mon Nov 10, 2008 3:21 pm

Otto:

This is a very interesting topic; I will run some tests of my own.

Thank you
Edgar Mauricio Arévalo Mogollón.
Bogotá DC. Colombia
FWH FTDN, xHarbour 1.2.1, Pelles C, Fivedit, Visual Studio Code, Borland 7.30, Mysql, Dbfs
http://www.hymplus.com http://www.hymlyma.com
Trying to get back into programming....

Postby driessen » Mon Nov 10, 2008 3:55 pm

Otto,

You are sharing a very interesting development with us.

I have only one problem.

As I told you before I use FCREATE, FREAD and FWRITE to obtain the same result. So far so good.

But I use delimiters like "[<" and ">]". In XML they seem to be written to the file somewhat differently, so I have to find out how to handle this.

But keep us informed of your further developments.
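
A note on those delimiters: "<" and ">" are reserved characters in XML, so inside document.xml they are normally stored as the entities &lt; and &gt; (and "&" as &amp;). Below is a minimal sketch, assuming the placeholder is simply searched for in its escaped form. XmlEscape() is a hypothetical helper name, not an existing FiveWin or Harbour function.

```harbour
// Sketch only: XmlEscape() is a hypothetical helper, not part of FiveWin or Harbour.
function Main()

   // the placeholder as typed in Word, and as it would appear inside document.xml
   ? XmlEscape( "[<ORT>]" )      // prints [&lt;ORT&gt;]

return nil

function XmlEscape( cText )

   // "&" must be handled first, otherwise the "&" of "&lt;" would be escaped again
   cText := StrTran( cText, "&", "&amp;" )
   cText := StrTran( cText, "<", "&lt;" )
   cText := StrTran( cText, ">", "&gt;" )

return cText
```

The same helper could be applied to the replacement data, so that a value containing "&" or "<" does not leave document.xml malformed, for example StrTran( cTxtFile, XmlEscape( "[<ORT>]" ), XmlEscape( "Musterort" ) ).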

Thanks.
Regards,

Michel D.
Genk (Belgium)
_____________________________________________________________________________________________
I use : FiveWin for (x)Harbour v. 24.07 - Harbour 3.2.0 (February 2024) - xHarbour Builder (January 2020) - Bcc773

Postby Otto » Mon Nov 10, 2008 4:56 pm

Hello Michel,

If I remember correctly, you said that you use RTF.

Here I use the new Open XML format from Word 2007.
This is the new standard format: docx.

I uploaded my VB6 drag & drop program for testing purposes.
http://www.atzwanger-software.com/fw/word2007.zip
I hope we can do this program in FWH.

You can try to drag and drop a placeholder from the program into Word
and then have a look at the document.xml file inside the docx file.
You will see that the brackets remain.

Example:
><w:t xml:space="preserve"> [STRASSE] </w:t></w:r></w:p><w:p w:rsidR="00313937" w:rsidRDefault="0034734A"><w:r w:rsidRPr="0034734A"><w:t>[ORT]<

Regards,
Otto



Postby Rochinha » Wed Nov 12, 2008 9:07 pm

Otto

An enhancement for your code:

Change the main function:
Code:
Function Main( cDOCFile, cVFields, cVData )

   aVFields := StringToArray( cVFields, ";" )
   aVData   := StringToArray( cVData, ";" )
   ...
   MOVEFILE( cDOCFile, "temp.zip" )
   ...
   HB_UNZIPFILE( "temp.zip",,.f.,,"c:\soft")
   ...
   Hb_ZIPFILE( "temp.zip", cDOCFile, 8 )



Replace
Code:
   cTxtFile := STRTRAN(cTxtFile, "[ANREDE]",           "Herr")
   cTxtFile := STRTRAN(cTxtFile, "[TITELVORNAMENAME]", "Dr. Mustermann Hans")
   cTxtFile := STRTRAN(cTxtFile, "[BRIEFANRED]",       "Lieber Hans")
   cTxtFile := STRTRAN(cTxtFile, "[STRASSE]",          "Bahnhofstrasse")
   cTxtFile := STRTRAN(cTxtFile, "[ORT]",              "Musterort" )


With
Code:
   for i = 1 to len(aVFields)
       cTxtFile := STRTRAN(cTxtFile, aVFields[i], aVData[i])
   next


Code:
function StringToArray( cString, cSeparator )
   LOCAL nPos
   LOCAL aString := {}
   DEFAULT cSeparator := ";"
   // append a trailing separator so the last element is picked up as well
   cString := ALLTRIM( cString ) + cSeparator
   DO WHILE .T.
      nPos := AT( cSeparator, cString )
      IF nPos = 0
         EXIT
      ENDIF
      AADD( aString, SUBSTR( cString, 1, nPos-1 ) )
      // skip the whole separator, which may be longer than one character
      cString := SUBSTR( cString, nPos + LEN( cSeparator ) )
   ENDDO
RETURN ( aString )

Postby Otto » Wed Nov 12, 2008 10:07 pm

Hello Rochinha,

thank you very much for your enhancement.

Good news: programmatic mail merge with Word 2007 from FWH is ready.

Call the function like:

MailMerge( cDOCFile, cVFields, cVData )

where cDOCFile is your Word docx file, cVFields is a ";"-separated list of the placeholders you use in the Word file, and cVData is the corresponding ";"-separated data.

The document.xml files are very small, which makes replacing the placeholders this way very fast.
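
As a rough sketch of the replacement step such a function could perform (the unzip and zip handling is left out), assuming both lists are separated by ";" as in Rochinha's StringToArray() and that hb_ATokens() is available in your (x)Harbour build. ReplaceFields() is a hypothetical name used only for illustration, not the actual MailMerge() code.

```harbour
// Sketch only: ReplaceFields() is a hypothetical name; unzip/zip handling is left out.
function Main()
   local cXml := "<w:t>[ANREDE] [TITELVORNAMENAME]</w:t>"

   ? ReplaceFields( cXml, "[ANREDE];[TITELVORNAMENAME]", "Herr;Dr. Mustermann Hans" )
   // prints <w:t>Herr Dr. Mustermann Hans</w:t>

return nil

function ReplaceFields( cXml, cVFields, cVData )
   local aFields := hb_ATokens( cVFields, ";" )
   local aData   := hb_ATokens( cVData, ";" )
   local n

   // stop at the shorter list so a missing value cannot cause an array bound error
   for n := 1 to Min( Len( aFields ), Len( aData ) )
      cXml := StrTran( cXml, aFields[ n ], aData[ n ] )
   next

return cXml
```

One limitation of passing the data as a single delimited string is that a value containing ";" itself would be split in two; passing arrays instead would avoid that.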

BTW, did you also have a look at my VB6 program? Do you think this could be done with FWH: dragging the contents of your list control to Word?

Next step: we have to take care of the non-standard characters (in German, the umlauts) which we pass via cVData to the XML file.
I have to find out how to substitute these characters: for example ß will become //ß.
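
One possible way to handle this, sketched under the assumption that the data arrives as single-byte Latin-1/Windows-1252 text: XML allows any character to be written as a numeric character reference, so ß (code 223) can be passed as &#223; whatever encoding the file declares. XmlCharRefs() is a hypothetical helper name, not an existing FiveWin or Harbour function.

```harbour
// Sketch only: XmlCharRefs() is a hypothetical helper.
// Assumes single-byte Latin-1/Windows-1252 input, where codes 160..255
// are identical to the Unicode code points.
function Main()

   // Chr( 223 ) is "ß" in Latin-1/Windows-1252
   ? XmlCharRefs( "Stra" + Chr( 223 ) + "e" )     // prints Stra&#223;e

return nil

function XmlCharRefs( cText )
   local cOut := ""
   local n, nCode

   for n := 1 to Len( cText )
      nCode := Asc( SubStr( cText, n, 1 ) )
      if nCode >= 160
         // write the character as a numeric character reference
         cOut += "&#" + LTrim( Str( nCode ) ) + ";"
      else
         cOut += SubStr( cText, n, 1 )
      endif
   next

return cOut
```

Bytes 128 to 159 are left untouched here because in Windows-1252 they do not map one-to-one to Unicode code points (the euro sign, for example), so they would need a lookup table.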


Thanks again and best regards,
Otto


I don't understand why zip/unzip from xHarbour is not working for me.
But at the moment this is not important, because I can use 7Z.exe as a workaround.

Postby Rochinha » Wed Nov 12, 2008 11:44 pm

Otto,

Does your .docx have a DOCTYPE declaration like this?
Code:
<?xml version="1.0"?>
<!DOCTYPE xbel PUBLIC
       "+//IDN python.org//DTD XML Bookmark Exchange
        Language 1.0//EN//XML"
       "http://www.python.org/topics/xml/dtds/xbel-1.0.dtd">


If so, you need to replace Language 1.0//EN//XML" with Language 1.0//DE//XML".

The German charset is iso-8859-1 or x-IA5-German.

Check and try.

Postby Otto » Thu Nov 13, 2008 1:26 am

Rochinha,
thank you. There is no such declaration.
An original docx is in my zip file:

http://www.atzwanger-software.com/fw/word2007.zip
Regards,
Otto