Can FW also read an entire online webshop?

Postby Marc Venken » Tue Mar 09, 2021 8:42 am

Today I was totally surprised when a colleague of mine presented me with an Excel file containing the content of my online webshop!

All categories and all product info are inside the Excel file. So he was able to read my webshop with an Excel macro. How great is this?

Can this also be done with FW? Shop: www(dot)maveco-webshop(dot)be

Note: I just did some googling and it seems to be called website scraping... interesting.
Marc Venken
Using: FWH 23.04 with Harbour
Marc Venken
Posts: 1425
Joined: Tue Jun 14, 2016 7:51 am
Location: Belgium

Re: Can FW also read an entire online webshop?

Postby nageswaragunupudi » Tue Mar 09, 2021 4:16 pm

Yes, interesting.
By the way, can you provide us with the link to your webshop?
Regards

G. N. Rao.
Hyderabad, India
nageswaragunupudi
Posts: 10620
Joined: Sun Nov 19, 2006 5:22 am
Location: India

Re: Can FW also read an entire online webshop?

Postby Marc Venken » Tue Mar 09, 2021 4:31 pm

https://www.maveco-webshop.be

Maybe the code by Mr. Rao and Uwe for retrieving the topics from this forum could be the basis for this new setup?

Re: Can FW also read an entire online webshop?

Postby Marc Venken » Sat Mar 13, 2021 8:38 pm

Here is the basic code from Mr. Rao for reading the forum.

This code is used inside a loop to get all forum topics:

"viewtopic.php?f=3&t=" + cValToChar( nTopic )

How do we find the part of a webshop URL that is needed for the loop in order to get all the product data?

https://www.maveco-webshop.be/ (extra code is needed here) + a counter like nTopic

Any idea to get me started?

Code:

#include "fivewin.ch"

REQUEST DBFCDX

static nLastTopic := 33420  // range of topics
static nFirstopic := 33400

//----------------------------------------------------------------------------//

function Main()

   SET DATE BRITISH
   SET CENTURY ON
   SET DELETED ON
   RDDSETDEFAULT( "DBFCDX" )

   DBCREATE( "SAMPLES.DBF", { ;
      { "TOPICNO",   'N',  6, 0 }, ;
      { "TOPIC",     'C', 60, 0 }, ;
      { "AUTHOR",    'C', 40, 0 }, ;
      { "DATE",      'D',  8, 0 }, ;
      { "CODE",      'M', 10, 0 }  }, ;
      "DBFCDX", .T., "DB" )
   FW_CdxCreate()
   CLOSE DB

   USE SAMPLES EXCLUSIVE VIA "DBFCDX"

   ForumSamples( nLastTopic, nFirstopic )

   BrowseSamples()

return nil

//----------------------------------------------------------------------------//

function BrowseSamples()

   local oDlg, oFont, oBold, oMono, oGet, oBrw

   SET ORDER TO TAG TOPICNO
   GO TOP

   DEFINE FONT oFont NAME "Segoe UI" SIZE 0,-14
   DEFINE FONT oBold NAME "TAHOMA" SIZE 0,-18 BOLD
   DEFINE FONT oMono NAME "Lucida Console" SIZE 0,-12

   DEFINE DIALOG oDlg SIZE 900,700 PIXEL TRUEPIXEL FONT oFont ;
      TITLE "SAMPLES IN FWH FORUMS"

   @  90,20 XBROWSE oBrw SIZE 400,-20 PIXEL OF oDlg ;
      DATASOURCE "SAMPLES" ;
      COLUMNS "TOPICNO", "DATE", "AUTHOR" ;
      AUTOSORT ;
      LINES NOBORDER

   WITH OBJECT oBrw
      :nMarqueeStyle := MARQSTYLE_HIGHLROWRC
      :bChange       := { || oDlg:Update() }
      :lIncrFilter   := .t.
      :bSeek         := { |c| ( oBrw:cAlias )->( BrwFilter( c ) ) }
      :CreateFromCode()
   END

   @ 20, 20 SAY TRIM( SAMPLES->TOPIC ) SIZE 860,30 PIXEL OF oDlg CENTER ;
         FONT oBold UPDATE

   @ 60, 20 SAY "Filter containing all words any where" SIZE 300,20 PIXEL OF oDlg

   @ 60,340 SAY oBrw:oSeek PROMPT oBrw:cSeek SIZE 540,20 PIXEL OF oDlg ;
      COLOR CLR_HRED,CLR_YELLOW

   @  90,420 SAY "CODE" SIZE 460,30 PIXEL OF oDlg CENTER ;
      COLOR CLR_BLACK, nRGB( 231, 242, 255 )

   @ 120,420 GET oGet VAR SAMPLES->CODE SIZE 460,540 PIXEL OF oDlg ;
      MEMO READONLY FONT oMono UPDATE

   oDlg:bPainted := { || oDlg:Box( 59,339,81,881 ) }

   ACTIVATE DIALOG oDlg CENTERED
   RELEASE FONT oFont, oMono, oBold

return nil

//----------------------------------------------------------------------------//

function BrwFilter( c )

   local lFound   := .t.
   local aTokens
   local cSaveFilter := DBFILTER()
   local nSaveRec    := RECNO()
   local cFilter     := {}

   if Empty( c )
      return .t.
   endif

   c  := UPPER( c )
   aTokens  := HB_ATokens( c )

   for each c in aTokens
      AAdd( cFilter, "'" + c + "' $ UPPER( DBRECORDINFO( 9 ) )" )
   next

   cFilter  := FW_ArrayAsList( cFilter, " .AND. " )

   SET FILTER TO &cFilter
   GO TOP
   lFound   := ( OrdKeyCount() > 0 )

return lFound

//----------------------------------------------------------------------------//

function ForumSamples( nTopic, nLast )

   local cTopic, cUrl, cPageURL, cUser, cText, cCode, nPage, nPages, n, cLeft, dDate

   DEFAULT nTopic   := 33507, nLast := nTopic - 50

   for nTopic := nTopic to nLast step -1

      nPage    := 1
      cUrl  := TopicNoToURL( nTopic )
      do while .t.
         cPageURL := cUrl + If( nPage > 1, "&start=" + LTrim( Str( ( nPage - 1 ) * 15 ) ), "" )
         MsgRun( cPageURL, "READING FORUM PAGE", { || ;
            cText := WebPageContents( cPageUrl, .t. ) ;
            } )

         if nPage == 1
            nPages   := PageCount( cText )
            cTopic   := textbetween( ctext, "<h2>", "</h2>", 1 )
            cTopic   := textbetween( cTopic, ">", "</a>", 1 )
         endif

         n     := 1
         do while !Empty( cCode := TextBetween( cText, "<code>", "</code>", n, @cLeft ) )
            cUser := GetUserName( cLeft, @dDate )
            if Empty( dDate )
               dDate := CTOD( "" )
            endif
            cCode          := ExtractPrgCode( cCode )
            //
            DBAPPEND()
            FIELD->TOPICNO := nTopic
            FIELD->TOPIC   := cTopic
            FIELD->AUTHOR  := cUser
            FIELD->DATE    := dDate
            FIELD->CODE    := cCode
            n++
         enddo
         nPage++
         if nPage > nPages
            EXIT
         endif
      enddo
   next nTopic

return nil

//----------------------------------------------------------------------------//

function TopicNoToURL( nTopic )
return   "http://forums.fivetechsupport.com/viewtopic.php?f=3&t=" + cValToChar( nTopic )


//----------------------------------------------------------------------------//

function TextBetween( cText, cStartTag, cCloseTag, nPos, cLeft, cRight )

   local cRet  := ""

   if !( cStartTag $ cText )
      cLeft    := cText
      cRight   := ""
      return ""
   endif

   cRight   := AfterAtNum( cStartTag, cText,  nPos )
   cRet     := BeforAtNum( cCloseTag, cRight, 1    )

   if PCount() > 4
      cLeft    := BeforAtNum( cStartTag, cText,  nPos )
      cRight   := AfterAtNum( cCloseTag, cRight, 1    )
   endif

return cRet

//----------------------------------------------------------------------------//

function ExtractPrgCode( cCode )

   local nFrom, nUpto, cLeft, cRight, cToken
   local nFor
   local aSubs := { ;
      { '<br />',CRLF }, ;
      { '&nbsp;'," " }, ;
      { 'ÿ'," " }, ;
      { '&quot;','"' } }

   for nFor := 1 to Len( aSubs )
      cCode    := StrTran( cCode, aSubs[ nFor, 1 ], aSubs[ nFor, 2 ] )
   next

   do while !Empty( cToken := TextBetween( cCode, "<", ">", 1, @cLeft, @cRight ) )
      cCode    := cLeft + cRight
   enddo

   aSubs := { ;
      { '&gt;', ">"  }, ;
      { '&lt;', "<"  } }

   for nFor := 1 to Len( aSubs )
      cCode    := StrTran( cCode, aSubs[ nFor, 1 ], aSubs[ nFor, 2 ] )
   next
   do while !Empty( cToken := TextBetween( cCode, "&#", ";", 1, @cLeft, @cRight ) )
      cToken   := Chr( Val( cToken ) )
      cCode    := cLeft + cToken + cRight

   enddo

return cCode

//----------------------------------------------------------------------------//

function PageCount( cText )

   local nAt
   local nPages   := 1

   if ( nAt := AT( "Page <strong>", cText ) ) > 0
      cText    := SubStr( cText, nAt + 14, 50 )
      nPages   := Val( AfterAtNum( "<strong>", cText, 1 ) )
   endif

return nPages

//----------------------------------------------------------------------------//

function GetUserName( cText, dDate )

   local c1    := "/memberlist.php?mode=viewprofile&amp;u=" //2342">cnavarro</a></strong> &raquo; Tue Jan 17
   local c2    := ["username]
   local nAt   := RAT( c1, cText )
   local n2    := RAT( c2, cText )
   local cUser := ""
   local cDate

   nAt      := Max( nAt, n2 )
   if nAt > 0
      cText    := SubStr( cText, nAt, 200 )
      cUser    := TextBetween( cText, ">", "<", 1 )
      cDate    := AllTrim( TextBetween( cText, "&raquo;", "</p>" ) )
      cDate    := Upper( AfterAtNum( " ", cDate, 1 ) )
      dDate    := uCharToVal( cDate, 'D' )
   endif

return cUser

//----------------------------------------------------------------------------//


Re: Can FW also read an entire online webshop?

Postby Marc Venken » Sat Mar 13, 2021 9:02 pm

This seems to be the kind of URL that would be needed in order to read the webshop, but it returns no data.

https://www.maveco-webshop.be/bedrijfsk ... ek/?page=2

If I paste this URL into Chrome, it shows data, but in the program I get nothing.

Re: Can FW also read an entire online webshop?

Postby Marc Venken » Sat Mar 13, 2021 11:04 pm

I did some more testing, but I get no data in cText.

Is it possible that the function WebPageContents() is not able to retrieve the data when it is an https (notice the s) site? With http it seems to return data...

Code:
//cData := "https://www.maveco-webshop.be/"
cData := "http://forums.fivetechsupport.com/"
//cData := "http://www.kaboutersopglabbeek.be/"

// show the URL being read while the page is fetched
MsgRun( cData, "READING FORUM PAGE", { || ;
   cText := WebPageContents( cData, .t. ) ;
   } )

MsgInfo( cText )

Re: Can FW also read an entire online webshop?

Postby nageswaragunupudi » Sun Mar 14, 2021 4:44 am

The function also reads the contents of "https" pages.

Code:
FW_MEMOEDIT( WebPageContents( "https://www.maveco-webshop.be/", .t. ) )

This page has 11 tables.

Code:
   cText := WebPageContents( "https://www.maveco-webshop.be/", .t. )
   ? OCCURS( "</table>", cText )

Result: 11

Next, read all 11 tables one by one and parse each table to find what we want.
Regards

G. N. Rao.
Hyderabad, India
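
As a rough illustration of the "read the tables one by one" step described above, here is a minimal sketch that uses only core Harbour functions. ExtractTables() is a hypothetical helper name, not an FWH function, and the HTML in Main() is made-up sample data. It assumes the tables are not nested: a nested table would be cut off at the first closing tag.

```harbour
// Hypothetical helper (not part of FWH): returns an array holding the text
// of every <table ...> ... </table> block found in cHtml.
// Assumption: tables are not nested.
function ExtractTables( cHtml )

   local aTables := {}
   local cLower  := Lower( cHtml )   // search copy, so <TABLE> is found too
   local nFrom   := 1
   local nOpen, nClose

   do while ( nOpen := hb_At( "<table", cLower, nFrom ) ) > 0
      nClose := hb_At( "</table>", cLower, nOpen )
      if nClose == 0
         exit                         // opening tag without a close: stop
      endif
      // copy from the original text, closing tag included
      AAdd( aTables, SubStr( cHtml, nOpen, nClose + Len( "</table>" ) - nOpen ) )
      nFrom := nClose + Len( "</table>" )
   enddo

return aTables

procedure Main()

   // made-up sample page; in FWH the text could come from WebPageContents()
   local cHtml := "<html><table><tr><td>A</td></tr></table>" + ;
                  "<p>x</p><TABLE class='p'><tr><td>B</td></tr></TABLE></html>"
   local cTable

   for each cTable in ExtractTables( cHtml )
      ? cTable
   next

return
```

Each array element can then be parsed further, for example with the TextBetween() function from the forum sample, to pick out rows and cells.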

Re: Can FW also read an entire online webshop?

Postby Marc Venken » Mon Mar 15, 2021 7:56 pm

How could I do the following?

Assume I have a starting description field like

"Safety shoes S3 with composit upper and kevlar sole type 1254"

Then I have a huge dbf with several fields like 'description', 'memo' and 'reference'...

How can I put into an array or dbf ALL the records that contain at least one of the words from the starting description field?

So all records containing "Safety" or "shoes" or "composit" or "kevlar", ... should go into one array/browse so that I can link one of them to the starting description.

In this function I will also pass an array of words that should not be searched for (S3, and, in, ...): words that are too common in many descriptions will be skipped.

Any idea how to start?

Re: Can FW also read an entire online webshop?

Postby Marc Venken » Thu Mar 18, 2021 9:53 am

I'm trying to analyse the function below, and I see that this is the string that is built up when typing words in XBrowse:

cFilter will look like this --->> 'DASSY' $ UPPER( DBRECORDINFO( 9 ) ) .AND. 'BARI' $ UPPER( DBRECORDINFO( 9 ) )

What is DBRECORDINFO( 9 ) doing? Looking into all fields?

What I would finally like to do is:

cLookup = "Dassy, S3, 1523, Shoe, Safety"

cLookupfield = "Description"

and all records containing any of the words in cLookup should be added to the resulting XBrowse filter (maybe looking only in cLookupfield).

Code:
function BrwFilter( c )

   local lFound   := .t.
   local aTokens
   local cSaveFilter := DBFILTER()
   local nSaveRec    := RECNO()
   local cFilter     := {}

   if Empty( c )
      return .t.
   endif

   c  := UPPER( c )
   aTokens  := HB_ATokens( c )

   for each c in aTokens
      AAdd( cFilter, "'" + c + "' $ UPPER( DBRECORDINFO( 9 ) )" )
   next

   cFilter  := FW_ArrayAsList( cFilter, " .AND. " )

   SET FILTER TO &cFilter
   GO TOP
   lFound   := ( OrdKeyCount() > 0 )

return lFound

Re: Can FW also read an entire online webshop?

Postby nageswaragunupudi » Thu Mar 18, 2021 11:52 am

Marc Venken wrote:
cFilter will look like this --->> 'DASSY' $ UPPER( DBRECORDINFO( 9 ) ) .AND. 'BARI' $ UPPER( DBRECORDINFO( 9 ) )

What is DBRECORDINFO( 9 ) doing? Looking into all fields?

DBRECORDINFO( 9 ) returns the full record as a string.
You are heading in the right direction.

You will need a case-insensitive comparison.
Use FW_AtX().
You can search for an array of words in a string.
Regards

G. N. Rao.
Hyderabad, India
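
For the "any of these words" lookup asked about earlier in the thread, the following is a minimal sketch built only on core Harbour functions; it does not use FW_AtX(), whose exact parameters are not shown in this thread. AnyWordIn() and the skip list are hypothetical names introduced for illustration. Note that it tests substrings, so "SHOE" also matches "SHOES".

```harbour
// Hypothetical helper: .T. if cText contains at least one word from cWords
// (case-insensitive), ignoring any word listed in aSkip.
function AnyWordIn( cWords, cText, aSkip )

   local cWord
   local lFound := .f.

   hb_default( @aSkip, {} )
   cText := Upper( cText )

   // commas become blanks so "Dassy, S3" and "Dassy S3" behave the same
   for each cWord in hb_ATokens( Upper( StrTran( cWords, ",", " " ) ), " " )
      cWord := AllTrim( cWord )
      if !Empty( cWord ) .and. AScan( aSkip, { |c| Upper( c ) == cWord } ) == 0
         if cWord $ cText
            lFound := .t.
            exit
         endif
      endif
   next

return lFound

procedure Main()

   local aSkip := { "S3", "and", "with" }   // words too common to be useful

   ? AnyWordIn( "Dassy, S3, 1523, Shoe, Safety", ;
                "Safety shoes S3 with composite upper", aSkip )
   ? AnyWordIn( "Dassy, S3", "Rain jacket S3", aSkip )

return
```

On an open dbf, such a helper could presumably be plugged into a filter codeblock, for example DbSetFilter( { || AnyWordIn( cLookup, FIELD->DESCRIPTION, aSkip ) } ), where DESCRIPTION is an assumed field name. Replacing the field with DBRECORDINFO( 9 ) would search the whole record instead.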

Re: Can FW also read an entire online webshop?

Postby hmpaquito » Thu Mar 18, 2021 6:33 pm

Alternatively
Code:
hb_wildMatchI( cPattern, cString )


http://kresin.ru/en/hrbfaq_3.html#Doc13_9
hmpaquito
 
Posts: 1482
Joined: Thu Oct 30, 2008 2:37 pm
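
A small sketch of how hb_WildMatchI() might be applied to the description matching discussed in this thread. The description string is the sample from the earlier post and the patterns are made up for illustration; "*" stands for any run of characters and "?" for a single character, and the comparison ignores case.

```harbour
procedure Main()

   local cDesc := "Safety shoes S3 with composite upper and kevlar sole type 1254"

   // wrapping a word in "*" turns the match into a "contains" test
   ? hb_WildMatchI( "*KEVLAR*", cDesc )

   // a pattern can also pin down the start and the end of the text
   ? hb_WildMatchI( "safety*1254", cDesc )

   // a word that does not occur in the description
   ? hb_WildMatchI( "*leather*", cDesc )

return
```

Compared with the `$` operator, the wildcard form avoids the explicit Upper() calls, at the cost of building a pattern string for each word.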

Re: Can FW also read an entire online webshop?

Postby Marc Venken » Thu Mar 18, 2021 7:59 pm

hmpaquito wrote:
Alternatively
Code:
hb_wildMatchI( cPattern, cString )

http://kresin.ru/en/hrbfaq_3.html#Doc13_9

Interesting site... Can all hb functions be used in FW?

Re: Can FW also read an entire online webshop?

Postby Marc Venken » Fri Mar 19, 2021 3:31 pm

Is there also an HTML class? I found this:

https://github.com/harbour/core/blob/ma ... s/html.prg

Is there a more complete version?

Not for writing, but for reading an HTML page and extracting everything from it, for example:

<title>test</title>
etc...

I can do it with the sample code here, but who knows... maybe a better class exists to work with or expand.

Re: Can FW also read an entire online webshop?

Postby Marc Venken » Wed Mar 24, 2021 8:43 am

Today I found this code in JavaScript.

It should list the tags in an HTML page.

In FW I can read the tags when I know their names, and take the text between the corresponding opening and closing tags.

Do we have a corresponding function like getElementsByTagName?

Code:

// Collect each distinct tag name that occurs in the page body
const tags = [];
for (const tag of document.body.getElementsByTagName('*')) {
  if (!tags.includes(tag.tagName))
    tags.push(tag.tagName);
}
console.log(tags);

https://stackoverflow.com/questions/602 ... -html-file

Re: Can FW also read an entire online webshop?

Postby nageswaragunupudi » Wed Mar 24, 2021 1:03 pm

Code:
function GetTag( cTag, cHtml, nPos )

   local cText := BeforAtNum( "</" + cTag + ">", cHtml, IfNil( nPos, 1 ) )
   local nAt

   if !Empty( cText )
      nAt   := Max( RAT( "<" + cTag + " ", cText ), RAT( "<" + cTag + ">", cText ) )
      cText := AfterAtNum( ">", SubStr( cText, nAt + Len( cTag ) + 1 ), 1 )
   endif

return cText
Regards

G. N. Rao.
Hyderabad, India
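
To complement GetTag(), which needs the tag name up front, here is a minimal sketch that mirrors the JavaScript snippet quoted above and lists the distinct tag names found in a page. TagNames() is a hypothetical helper written with core Harbour functions only, and the HTML in Main() is made-up sample data. It is a plain text scan, not a real HTML parser, so a "<" followed by a letter inside script code or text would be reported as a tag.

```harbour
// Hypothetical helper: returns the distinct tag names found in cHtml,
// in upper case, in order of first appearance.
function TagNames( cHtml )

   local aTags := {}
   local nFrom := 1
   local nAt, nEnd, cName, cChar

   do while ( nAt := hb_At( "<", cHtml, nFrom ) ) > 0
      nEnd  := nAt + 1
      cName := ""
      // the tag name is the run of letters and digits right after "<";
      // closing tags ("</p>") and comments ("<!--") yield an empty name
      do while nEnd <= Len( cHtml ) .and. ;
            ( IsAlpha( cChar := SubStr( cHtml, nEnd, 1 ) ) .or. IsDigit( cChar ) )
         cName += Upper( cChar )
         nEnd++
      enddo
      // keep names that start with a letter and were not seen before
      if !Empty( cName ) .and. IsAlpha( cName ) .and. ;
            AScan( aTags, { |c| c == cName } ) == 0
         AAdd( aTags, cName )
      endif
      nFrom := nEnd
   enddo

return aTags

procedure Main()

   local cHtml := "<html><body><h1>Shop</h1><p>One</p><p>Two</p></body></html>"
   local cTag

   for each cTag in TagNames( cHtml )
      ? cTag
   next

return
```

Each name returned could then be passed to GetTag() together with an occurrence number to pull out the text of that tag.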