A converter from text-file ( with delimiters ) to DBF ?

Converter

Postby ukoenig » Tue Aug 19, 2008 9:07 am

Thank you very much, everybody, for your help.

With this information, I can solve the problem and help
the customer today.

Best Regards
Uwe :lol:
Since 1995 ( the first release of FW 1.9 )
I work with FW.
If you have any questions about special functions, maybe I can help.
ukoenig
 
Posts: 4043
Joined: Wed Dec 19, 2007 6:40 pm
Location: Germany

Converter

Postby ukoenig » Tue Aug 19, 2008 10:37 am

Hello,

Once again, thank you very much for the help.
With your solutions, everything is solved and works perfectly now.
My customer is happy.

Regards
Uwe :lol:

Customer response

Postby ukoenig » Tue Aug 19, 2008 11:50 am

Hello,
I got a response from the customer:

This month he had to convert 181,000 text lines (12.3 MB).
With the old functions, the conversion took
about 8 minutes.

With the new functions (on an Intel Pentium 4),
based on the very nice sample from nageswaragunupudi,
it is very fast: only 1.4 seconds.

Great !!!

Regards
Uwe

Postby Silvio » Tue Aug 19, 2008 12:31 pm

good nas!!
Best Regards, Saludos

Falconi Silvio
Silvio
 
Posts: 3107
Joined: Fri Oct 07, 2005 6:28 pm
Location: Teramo,Italy

Postby xProgrammer » Tue Aug 19, 2008 12:45 pm

Hi Otto

> "PS: O.T. I very often use this code to check whether a variable is empty:
if len(ALLTRIM(cText)) = 0
Is there a built-in function for this?"

The following should work:

Code:
IF Empty( cText )


Empty() also covers other data types.
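A small sketch of what "other data types" means in practice. Empty() is the standard Clipper/Harbour function; the variable name and the sample values are only for illustration.

```harbour
function Main()

   local cText := "   "             // blanks only

   // The manual test from the question ...
   ? Len( AllTrim( cText ) ) == 0   // .T.
   // ... and the built-in equivalent
   ? Empty( cText )                 // .T.

   // Empty() also accepts non-character values
   ? Empty( 0 )                     // .T.  numeric zero
   ? Empty( CToD( "" ) )            // .T.  blank date
   ? Empty( .F. )                   // .T.  logical false
   ? Empty( {} )                    // .T.  zero-length array
   ? Empty( NIL )                   // .T.
   ? Empty( " x " )                 // .F.  contains a non-blank character

return nil
```

Because one call handles strings, numbers, dates, logicals, arrays and NIL, it is also a convenient test for optional parameters that were not passed.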

Regards

xProgrammer
xProgrammer
 
Posts: 464
Joined: Tue May 16, 2006 7:47 am
Location: Australia

Postby Antonio Linares » Tue Aug 19, 2008 4:37 pm

Uwe,

If you avoid using AEval(), I would say you can get an even better time :-)

But if the customer is happy with it, then leave it as it is :-)
regards, saludos

Antonio Linares
www.fivetechsoft.com
Antonio Linares
Site Admin
 
Posts: 42122
Joined: Thu Oct 06, 2005 5:47 pm
Location: Spain

Postby nageswaragunupudi » Wed Aug 20, 2008 1:12 am

Mr Uwe

>
This month he had to convert 181,000 text lines (12.3 MB).
With the old functions, the conversion took
about 8 minutes.

With the new functions (on an Intel Pentium 4),
based on the very nice sample from nageswaragunupudi,
it is very fast: only 1.4 seconds.
>

Glad the sample was useful to you. But I am surprised at the speed you mentioned. As I already said when posting the sample, it was not optimised for speed; I meant it to demonstrate that what you wanted could be done. According to my tests here, 40,000 rows took more than 2 seconds. 180,000 rows should have taken more time, unless you removed the field width checking loop. Anyway, glad that the customer is happy with the speed.

The program has 3 main steps.
1. Parsing the data
2. Checking for maximum field widths
3. Writing DBF.

Step 1 takes the least time. It can be optimized further, but the benefit would not be perceptible and the optimization exercise would only be of academic interest.
Steps 2 and 3 are the real time killers.
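To make the first two steps concrete, here is a minimal sketch. It is not the sample referred to above, which does not appear on this page. The names ParseLines() and MaxWidths(), the ";" delimiter and the use of Harbour's hb_ATokens() are assumptions made for illustration.

```harbour
#define CRLF ( Chr( 13 ) + Chr( 10 ) )

// Step 1: split the text into lines and each line into fields.
// Empty lines are skipped.
function ParseLines( cText, cDelim )

   local aRows := {}, cLine

   // Drop the CRs, then split on the remaining LFs
   for each cLine in hb_ATokens( StrTran( cText, Chr( 13 ), "" ), Chr( 10 ) )
      if ! Empty( cLine )
         AAdd( aRows, hb_ATokens( cLine, cDelim ) )
      endif
   next

return aRows

// Step 2: find the longest value in every column.
function MaxWidths( aRows )

   local aWidths := {}, aRow, n

   for each aRow in aRows
      for n := 1 to Len( aRow )
         if n > Len( aWidths )
            AAdd( aWidths, 0 )
         endif
         aWidths[ n ] := Max( aWidths[ n ], Len( aRow[ n ] ) )
      next
   next

return aWidths

function Main()

   local aRows := ParseLines( "A;123" + CRLF + CRLF + "Berlin;7" + CRLF, ";" )
   local aWidths := MaxWidths( aRows )

   ? Len( aRows )                   // 2 data rows, the empty line is skipped
   ? aWidths[ 1 ], aWidths[ 2 ]     // 6 and 3

return nil
```

The width scan touches every field of every row once, so its cost grows with the file size. Knowing the widths in advance removes that pass completely.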

If we know in advance a safe maximum length for each field, then I would avoid steps 1 and 2 entirely. I would copy straight from buffer to buffer in a C routine, building an image of the DBF header and the DBF records, and write it in raw mode using FCreate() and FWrite(). That would really be the fastest way.
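The raw-write idea can be sketched at Harbour level as well, without the C routine. The following is only an illustration under stated assumptions: character fields only, widths known in advance, every row carrying one value per field. The function name WriteRawDbf() and the file name calls.dbf are made up here. The header layout follows the plain dBASE III format (32-byte file header, one 32-byte descriptor per field, a 0x0D terminator).

```harbour
#include "fileio.ch"

// aStruct: { { cName, nWidth }, ... }   widths 1 to 254, type "C" only
// aRows:   { { cValue1, cValue2, ... }, ... }
function WriteRawDbf( cFile, aStruct, aRows )

   local nHandle, aFld, aRow, n, cRec, cHeader
   local dToday  := Date()
   local nHdrLen := 32 + Len( aStruct ) * 32 + 1
   local nRecLen := 1        // first byte of a record is the deleted flag

   for each aFld in aStruct
      nRecLen += aFld[ 2 ]
   next

   // File header: version, last update (YY MM DD), record count (32 bit),
   // header size and record size (16 bit each), 20 reserved bytes
   cHeader := Chr( 3 ) + ;
              Chr( Year( dToday ) - 1900 ) + ;
              Chr( Month( dToday ) ) + ;
              Chr( Day( dToday ) ) + ;
              L2Bin( Len( aRows ) ) + ;
              I2Bin( nHdrLen ) + ;
              I2Bin( nRecLen ) + ;
              Replicate( Chr( 0 ), 20 )

   // Field descriptor: name (11 bytes, zero padded), type,
   // 4 reserved bytes, width, decimals, 14 reserved bytes
   for each aFld in aStruct
      cHeader += PadR( Left( Upper( aFld[ 1 ] ), 10 ), 11, Chr( 0 ) ) + ;
                 "C" + Replicate( Chr( 0 ), 4 ) + ;
                 Chr( aFld[ 2 ] ) + Chr( 0 ) + ;
                 Replicate( Chr( 0 ), 14 )
   next
   cHeader += Chr( 13 )      // header terminator

   nHandle := FCreate( cFile )
   if nHandle == F_ERROR
      return .F.
   endif

   FWrite( nHandle, cHeader )
   for each aRow in aRows
      cRec := " "            // blank flag = record not deleted
      for n := 1 to Len( aStruct )
         cRec += PadR( aRow[ n ], aStruct[ n ][ 2 ] )
      next
      FWrite( nHandle, cRec )
   next
   FWrite( nHandle, Chr( 26 ) )   // end-of-file marker
   FClose( nHandle )

return .T.

function Main()

   if WriteRawDbf( "calls.dbf", ;
                   { { "NUMBER", 20 }, { "SECONDS", 6 } }, ;
                   { { "0049123", "35" }, { "0049456", "120" } } )
      USE calls NEW
      ? RecCount(), FieldName( 1 ), FieldGet( 1 )
      USE
   endif

return nil
```

Writing one record per FWrite() call keeps the sketch short. Collecting several thousand records in one string before each FWrite() would reduce the number of file system calls, which is closer to the buffer-to-buffer approach described above.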

I may post the alternative purely for academic interest.

Mr Antonio

>>
If you avoid the use of the AEval(), I would say that you can get a better time
>>

Is that so? I have always been under the impression that AEval() would be a bit faster than a for..next loop, where the index variable is maintained at Harbour level. Is a for..next loop really faster than AEval()? Thanks in advance for clarifying this.
Regards

G. N. Rao.
Hyderabad, India
nageswaragunupudi
 
Posts: 10653
Joined: Sun Nov 19, 2006 5:22 am
Location: India

Time-stop for converting

Postby ukoenig » Wed Aug 20, 2008 1:50 am

Hello Mr. nageswaragunupudi,

I got this information from a phone call with my customer.
For my own tests, I could only use 1000 text lines (a part of the file).
The complete converted text file contains all phone calls of one month.

I will ask him again tomorrow to measure the running time exactly.
I will also duplicate the test file a few times to get many lines.
I will run the test on my two different computers and measure the time there.

The result (the exact time, measured with two variables for start and end):

My first computer:
Intel Pentium D 820 Dual Core, 2.8 GHz / 800 MHz FSB

For 100,000 text lines, the exact running time was 4.2 seconds.

The original file has one empty line between every two text lines.
That means the running time covers 50,000 converted lines and 50,000 empty lines.

I am not sure whether my customer's processor is much faster.
My time seems to be more realistic,
but even 4.2 seconds is a very good time.

Greetings from Germany
Uwe :lol:
Last edited by ukoenig on Wed Aug 20, 2008 3:00 am, edited 2 times in total.

Postby hua » Wed Aug 20, 2008 1:55 am

Antonio Linares wrote: No, just one char.

But you could use StrTran() to replace CRLFs with ";" or similar


Thanks for the idea Antonio. I've been using the following to achieve the same purpose. Maybe it's not as optimized as StrCharCount() because of the do..while loop.

Code:
// Count the number of occurrences of CRLF in cStr
// (each line is terminated with a CRLF).
// CRLF is defined in FiveWin.ch; AtNum() is a function from ct.lib.
function CountCrLf( cStr )
   local nNum := 1, nRet := 0
   // AtNum() returns 0 when there is no nNum-th occurrence
   do while AtNum( CRLF, cStr, nNum++ ) != 0
      nRet++
   enddo
return nRet
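An alternative sketch without the loop: remove every CRLF with StrTran() and compare the string lengths. StrTran() and Len() are standard functions; the name CountCrLf2() is made up here, and the approach has not been timed against AtNum() or StrCharCount().

```harbour
#define CRLF ( Chr( 13 ) + Chr( 10 ) )

// Every CRLF that is removed shortens the string by Len( CRLF ) characters.
function CountCrLf2( cStr )
return ( Len( cStr ) - Len( StrTran( cStr, CRLF, "" ) ) ) / Len( CRLF )

function Main()

   ? CountCrLf2( "one" + CRLF + "two" + CRLF + "three" )   // 2

return nil
```

The price is one temporary copy of the string, which may matter for very large files.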
hua
 
Posts: 1072
Joined: Fri Oct 28, 2005 2:27 am

Postby Antonio Linares » Wed Aug 20, 2008 6:25 am

Hua,

StrCharCount() is implemented in C, which means it is compiled to machine code. No "virtual machine" intervention at all.
regards, saludos

Antonio Linares
www.fivetechsoft.com

Postby Antonio Linares » Wed Aug 20, 2008 6:26 am

Dear Rao,

A simple test:
Code:
#include "FiveWin.ch"

function Main()

   local a := Array( 1000000 ), n
   local nStep1 := GetTickCount(), nStep2, nStep3

   AEval( a, { || Date() } )
   nStep2 = GetTickCount()

   for n = 1 to 1000000
      Date()
   next         

   nStep3 = GetTickCount()

   MsgInfo( "AEval() ..." + Str( nStep2 - nStep1 ) + CRLF + ;
            "for next ... " + Str( nStep3 - nStep2 ) )

return nil

The technical explanation is that, to evaluate a codeblock, a new "virtual machine" frame has to be built (the stack grows and shrinks again). A for next loop just keeps using the same virtual machine frame. The call to Date() also forces a new virtual machine frame in both cases, but that cost is common to both.
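For readers without FiveWin at hand, the same comparison can be sketched as a console program. This variant is an assumption-laden rewrite, not the test above: it uses the standard Seconds() function instead of GetTickCount(), works on real array elements, and adds a FOR EACH loop as a third candidate. No timings are claimed for it.

```harbour
function Main()

   local a := Array( 1000000 ), n, x
   local nStart

   AFill( a, "abc" )

   // Codeblock evaluated once per element
   nStart := Seconds()
   AEval( a, { | c | Len( c ) } )
   ? "AEval()  ", Seconds() - nStart

   // Same work inside the current frame, indexed access
   nStart := Seconds()
   for n := 1 to Len( a )
      Len( a[ n ] )
   next
   ? "for next ", Seconds() - nStart

   // FOR EACH, no index variable to maintain
   nStart := Seconds()
   for each x in a
      Len( x )
   next
   ? "for each ", Seconds() - nStart

return nil
```

Seconds() is coarser than a tick counter, so the loop count may need to be raised on fast machines to get stable numbers.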
Last edited by Antonio Linares on Wed Aug 20, 2008 6:41 am, edited 1 time in total.
regards, saludos

Antonio Linares
www.fivetechsoft.com

Postby Antonio Linares » Wed Aug 20, 2008 6:30 am

BTW, let's compare Harbour and xHarbour speed using the above example (times in milliseconds, as returned by GetTickCount()):

1. Harbour:

AEval ... 421
for next ... 265

2. xHarbour:

AEval ... 749
for next ... 483

So Uwe, your application built using Harbour will be even faster :-)
regards, saludos

Antonio Linares
www.fivetechsoft.com

Postby James Bott » Wed Aug 20, 2008 6:58 am

Antonio,

Wow, this is disappointing--two functions tested and both much slower with xHarbour. I would have guessed that they were both almost the same speed.

Any ideas why this is?

It also makes you wonder whether two out of two was just a coincidence, or whether xHarbour is much slower overall.

James
James Bott
 
Posts: 4840
Joined: Fri Nov 18, 2005 4:52 pm
Location: San Diego, California, USA

Postby nageswaragunupudi » Wed Aug 20, 2008 6:58 am

Mr Antonio

>
The technical explanation is that to evaluate a codeblock a new "virtual machine" frame has to be built (increase and decrease of the stack). Using a for next just keeps using the same virtual machine frame. The calling to Date() from both, also forces a new virtual machine frame, but it is common for both ways.
>

Thanks for the clarification. I shall keep this in mind when coding in future. Even though I am aware of this, I wonder why it does not come to mind when I am writing my code.

>
your application built using Harbour will be even faster
>

I am really surprised to see the benchmarks. To be honest, let me confess that I have been under the "impression" that xHarbour is faster than Harbour, mainly going by the claims on the xHarbour website. I never checked it myself. I wouldn't be surprised if many of my professional colleagues share a similar opinion, rightly or wrongly. While I am not interested in initiating a debate on this issue, I realise that we had better run our own benchmarks and then decide for ourselves.
Regards

G. N. Rao.
Hyderabad, India

Postby Antonio Linares » Wed Aug 20, 2008 7:09 am

According to Przemek (current Harbour tech leader), Harbour is much faster than xHarbour, mainly because it is better implemented internally.

Fewer people working on it and better internal organization (no rush to implement features) resulted in faster virtual machine execution and smaller EXEs.

Curiously enough, though we are technical people, most of you consider xHarbour as a faster compiler. Just "hype" caused by the "marketing" and "flock follow" (aka "lamb") effect... :-)
regards, saludos

Antonio Linares
www.fivetechsoft.com
