I need some help figuring out some data manipulation.
I have a database with 90000 entries.
I have a .csv file with 22000 entries.
In the .csv file there are 3 entries per line (cOld, cNew, cName)
I need to go through the .csv file and match cOld to my database filed MyDB->ID and cName in MyDB->Name
If cOld == MyDB->ID and cName == MyDB->Name I need to update MyDB->ID with cNew.
I tried doing a DO WHILE ! EOF() / ENDDO and in that I would search each line of the .csv file for a match.
This takes at least 12 hours to run. There must be an easier/faster way.
Any ideas?
Stuck - Database update via .csv file
- Jeff Barnes
- Posts: 933
- Joined: Sun Oct 09, 2005 1:05 pm
- Location: Ontario, Canada
- Contact:
Stuck - Database update via .csv file
Thanks,
Jeff Barnes
(FWH 16.11, xHarbour 1.2.3, Bcc730)
Jeff Barnes
(FWH 16.11, xHarbour 1.2.3, Bcc730)
Re: Stuck - Database update via .csv file
Are you unable to transform .CSV into .TXT? Wouldn't it be easier to import the .TXT into the .DBF?
Regards, saludos.
Regards, saludos.
João Santos - São Paulo - Brasil - Phone: +55(11)95150-7341
- Jeff Barnes
- Posts: 933
- Joined: Sun Oct 09, 2005 1:05 pm
- Location: Ontario, Canada
- Contact:
Re: Stuck - Database update via .csv file
The .csv file is a text file.
I can't just import the data, I need to update the data in the database based on the info in the .csv file
I can't just import the data, I need to update the data in the database based on the info in the .csv file
Thanks,
Jeff Barnes
(FWH 16.11, xHarbour 1.2.3, Bcc730)
Jeff Barnes
(FWH 16.11, xHarbour 1.2.3, Bcc730)
Re: Stuck - Database update via .csv file
Jeff, can't you use APPEND FROM?
Best regards,
Otto
Best regards,
Otto
********************************************************************
mod harbour - Vamos a la conquista de la Web
modharbour.org
https://www.facebook.com/groups/modharbour.club
********************************************************************
mod harbour - Vamos a la conquista de la Web
modharbour.org
https://www.facebook.com/groups/modharbour.club
********************************************************************
- Jeff Barnes
- Posts: 933
- Joined: Sun Oct 09, 2005 1:05 pm
- Location: Ontario, Canada
- Contact:
Re: Stuck - Database update via .csv file
Hi Otto
I can't just do an append.
I am updating a database with new ID numbers.
In the csv file, there is the old ID, new ID and name (22000 of them).
The csv file will only show 1 entry for each "old ID" however, there can be multiple entries in my database for the same "old ID".
I need to update ALL records that match "old ID and name" and change the old ID to the new ID.
I hope that made sense
I can't just do an append.
I am updating a database with new ID numbers.
In the csv file, there is the old ID, new ID and name (22000 of them).
The csv file will only show 1 entry for each "old ID" however, there can be multiple entries in my database for the same "old ID".
I need to update ALL records that match "old ID and name" and change the old ID to the new ID.
I hope that made sense
Thanks,
Jeff Barnes
(FWH 16.11, xHarbour 1.2.3, Bcc730)
Jeff Barnes
(FWH 16.11, xHarbour 1.2.3, Bcc730)
Re: Stuck - Database update via .csv file
Jeff, you could first create from the csv a dbf.
Best regards,
Otto
Best regards,
Otto
********************************************************************
mod harbour - Vamos a la conquista de la Web
modharbour.org
https://www.facebook.com/groups/modharbour.club
********************************************************************
mod harbour - Vamos a la conquista de la Web
modharbour.org
https://www.facebook.com/groups/modharbour.club
********************************************************************
- Marc Venken
- Posts: 1481
- Joined: Tue Jun 14, 2016 7:51 am
- Location: Belgium
Re: Stuck - Database update via .csv file
Have a look here : function readCSV
https://forums.fivetechsupport.com/view ... 69#p276253
For each token you can do the check with the DBF (indexed) and update if needed.
Seek the ID in the dbf. If found, start a second loop as long as the ID = OLDID and change the data. Exit the second loop and process the CSV for next values
Can it help ?
https://forums.fivetechsupport.com/view ... 69#p276253
For each token you can do the check with the DBF (indexed) and update if needed.
Seek the ID in the dbf. If found, start a second loop as long as the ID = OLDID and change the data. Exit the second loop and process the CSV for next values
Can it help ?
Marc Venken
Using: FWH 23.08 with Harbour
Using: FWH 23.08 with Harbour
- Jeff Barnes
- Posts: 933
- Joined: Sun Oct 09, 2005 1:05 pm
- Location: Ontario, Canada
- Contact:
Re: Stuck - Database update via .csv file
Hi Marc,
Thank you.
This is what I needed: "Seek the ID in the dbf. If found, start a second loop as long as the ID = OLDID and change the data. Exit the second loop and process the CSV for next values"
It was the second loop that I was missing. This should run much much faster
Thank you.
This is what I needed: "Seek the ID in the dbf. If found, start a second loop as long as the ID = OLDID and change the data. Exit the second loop and process the CSV for next values"
It was the second loop that I was missing. This should run much much faster
Thanks,
Jeff Barnes
(FWH 16.11, xHarbour 1.2.3, Bcc730)
Jeff Barnes
(FWH 16.11, xHarbour 1.2.3, Bcc730)
- Antonio Linares
- Site Admin
- Posts: 42270
- Joined: Thu Oct 06, 2005 5:47 pm
- Location: Spain
- Contact:
Re: Stuck - Database update via .csv file
Dear Jeff,
From chatgpt:
1. Load the CSV into a Memory Array
Read the .csv file into a memory array or hash table. This allows quick lookups by cOld and cName.
2. Iterate Over the Database Efficiently
Use a single loop through your database and check for matches in the array or hash table instead of iterating over the .csv file repeatedly.
3. Batch Updates
When a match is found, update MyDB->ID with cNew directly.
Key Improvements in This Approach
1. Hash Table Lookup
Instead of searching the .csv line by line, the hash table enables O(1) lookups for each database record, massively improving performance.
2. Single Pass Through Database
Only one loop is needed for the database, avoiding the costly nested loop.
3. Minimal Disk I/O
The .csv is read into memory entirely at the start, reducing repetitive file reads.
From chatgpt:
1. Load the CSV into a Memory Array
Read the .csv file into a memory array or hash table. This allows quick lookups by cOld and cName.
2. Iterate Over the Database Efficiently
Use a single loop through your database and check for matches in the array or hash table instead of iterating over the .csv file repeatedly.
3. Batch Updates
When a match is found, update MyDB->ID with cNew directly.
Code: Select all | Expand
FUNCTION UpdateDatabase()
LOCAL aCSV := {}, cLine, nFileHandle, hTable := {}, cCompositeKey
// Open the database
USE MyDB EXCLUSIVE NEW
// Load CSV into memory
nFileHandle := FOpen("update_file.csv")
IF nFileHandle == -1
? "Failed to open update_file.csv"
RETURN
ENDIF
WHILE ! FEOF(nFileHandle)
cLine := AllTrim(FReadLine(nFileHandle))
IF Empty(cLine)
LOOP
ENDIF
// Parse CSV line
LOCAL aFields := HB_ATokens(cLine, ",")
AAdd(aCSV, { aFields[1], aFields[2], aFields[3] }) // cOld, cNew, cName
// Create a hash table for quick lookup
cCompositeKey := aFields[1] + "_" + aFields[3]
hTable[cCompositeKey] := aFields[2] // Map composite key to cNew
ENDDO
FClose(nFileHandle)
// Traverse the database
GO TOP
DO WHILE ! EOF()
cCompositeKey := MyDB->ID + "_" + MyDB->Name
IF HB_HHasKey(hTable, cCompositeKey)
MyDB->ID := hTable[cCompositeKey]
// Optional: Mark the record as updated if needed
// REPLACE MyDB->Updated WITH .T.
ENDIF
SKIP
ENDDO
// Clean up and close
CLOSE DATABASES
RETURN
1. Hash Table Lookup
Instead of searching the .csv line by line, the hash table enables O(1) lookups for each database record, massively improving performance.
2. Single Pass Through Database
Only one loop is needed for the database, avoiding the costly nested loop.
3. Minimal Disk I/O
The .csv is read into memory entirely at the start, reducing repetitive file reads.