EXTRACT PLAIN TEXT FROM HTML FILE

Post by MarcoBoschi » Fri May 10, 2024 2:35 pm

Hi,
I am looking for freeware, if any exists, that lets me extract plain text from an HTML file. Other tips are welcome too.

Many Thanks

Marco
MarcoBoschi
 
Posts: 1065
Joined: Thu Nov 17, 2005 11:08 am
Location: Padova - Italy

Re: EXTRACT PLAIN TEXT FROM HTML FILE

Post by karinha » Fri May 10, 2024 3:12 pm

João Santos - São Paulo - Brasil - Phone: +55(11)95150-7341
karinha
 
Posts: 7824
Joined: Tue Dec 20, 2005 7:36 pm
Location: São Paulo - Brasil

Re: EXTRACT PLAIN TEXT FROM HTML FILE

Post by MarcoBoschi » Fri May 10, 2024 3:49 pm

8)

Re: EXTRACT PLAIN TEXT FROM HTML FILE

Post by karinha » Fri May 10, 2024 3:54 pm

Code:

// C:\FWH\SAMPLES\HTML2TXT.PRG

#include "FiveWin.ch"

MEMVAR cINNText

FUNCTION Main()

   LOCAL cFile := ".\GMAP.HTML"

   IF FILE( "Boschi.txt" )

      FERASE( "Boschi.txt" )

   ENDIF

   // Run the conversion while the wait message is displayed
   MsgRun( "WAIT... Converting HTML to TEXT. ", ;
           "Please, Wait                     ", ;
           { || CONVERT_HTML2TXT( cFile ) } )

   MemoEdit( MemoRead( "Boschi.txt" ) )

RETURN NIL

FUNCTION CONVERT_HTML2TXT( cFile )

   LOCAL oExplorer := TOLEAuto():New( "InternetExplorer.Application" )

   PRIV cINNText

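   // Load the file in Internet Explorer through OLE automation
   oExplorer:Navigate2( cFile )

   // Wait until the document is fully loaded (ReadyState 4 = complete)
   DO WHILE oExplorer:ReadyState <> 4

      hb_idleSleep( 1 )

   ENDDO

   // Body:InnerText holds the rendered text without the markup
   cINNText := oExplorer:Document:Body:InnerText

   MemoWrit( "Boschi.txt", cINNText )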

   // MemoEdit( MemoRead( "Boschi.txt" ) )

   oExplorer:Quit()

RETURN NIL

// FIN / END
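
If Internet Explorer is not available, the tags can also be removed in plain Harbour. The following is only a minimal sketch, not part of FWH: StripTags() is a hypothetical helper name, and it assumes simple HTML. It does not decode entities such as &amp;amp;, it does not skip the contents of script or style blocks, and a ">" inside an attribute value will confuse it. File names are the same as in the sample above.

```harbour
// Sketch only: StripTags() is a hypothetical helper, not a FWH or Harbour function.
// Assumes simple HTML; entities and <script>/<style> contents are left untouched.

FUNCTION Main()

   LOCAL cHtml := MemoRead( "GMAP.HTML" )

   MemoWrit( "Boschi.txt", StripTags( cHtml ) )

RETURN NIL

FUNCTION StripTags( cHtml )

   LOCAL cText  := ""
   LOCAL lInTag := .F.
   LOCAL nPos, cChar

   // Walk the string one character at a time and drop everything
   // between "<" and the next ">"
   FOR nPos := 1 TO Len( cHtml )

      cChar := SubStr( cHtml, nPos, 1 )

      DO CASE
      CASE cChar == "<"
         lInTag := .T.

      CASE cChar == ">"
         IF lInTag
            lInTag := .F.
            cText += " "      // a space keeps adjacent words apart
         ELSE
            cText += cChar    // a ">" outside any tag is ordinary text
         ENDIF

      CASE ! lInTag
         cText += cChar

      ENDCASE

   NEXT

RETURN cText
```

The Internet Explorer version above returns the text as the browser renders it, so it handles entities and scripts. This sketch only removes the markup characters, but it needs no OLE object.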
 


Regards, saludos.

