Moving memo fields out into individual files can indeed be a practical and flexible alternative in modern systems, and it might suit your DMS better.
That is exactly what I’m working on: a project that builds an index file from all the words in the files of a directory. Each line of the index file follows the pattern Word[TAB]Filename. My system already works very well, but I wanted to share my approach and ask whether anyone else is working on something similar. I’d love to exchange ideas and learn from your experiences!
My Solution Overview
One-time Creation of the Index File
I scan the directory and extract all words from the files.
For each word, I create an entry in the format Word[TAB]Filename and save everything in a main index file.
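To make the one-time build concrete, here is a minimal Python sketch of the two steps above. It is not my production code: the function names, the `\w+` word pattern, the lowercasing, the restriction to `*.txt` files, and writing each word only once per file are all assumptions made for illustration.

```python
import re
from pathlib import Path

# Assumption: a "word" is a run of letters, digits or underscores.
WORD_RE = re.compile(r"\w+")


def extract_words(text):
    """Return the distinct lowercased words in text, sorted."""
    return sorted({w.lower() for w in WORD_RE.findall(text)})


def build_index(directory, index_path):
    """Scan every .txt file in directory and write Word<TAB>Filename lines.

    Returns the number of entries written.
    """
    index_file = Path(index_path).resolve()
    count = 0
    with open(index_path, "w", encoding="utf-8") as index:
        for path in sorted(Path(directory).glob("*.txt")):
            # Skip the index itself in case it lives in the scanned directory.
            if path.resolve() == index_file:
                continue
            text = path.read_text(encoding="utf-8", errors="replace")
            for word in extract_words(text):
                index.write(f"{word}\t{path.name}\n")
                count += 1
    return count
```

Calling `build_index("c:/memos", "c:/memos/index.tsv")` (both paths hypothetical) would produce one line per distinct word per file.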
Ongoing Operation
During operation, the index file must be updated whenever files are deleted, edited, or added. My approach is as follows:
A. Deleting a File
I search the index file for all entries containing the filename.
These entries are removed from the index file.
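A rough Python sketch of the delete step, under the assumption that the index is a UTF-8 text file with one Word[TAB]Filename entry per line. The function name is hypothetical. Note one deliberate difference from "containing the filename": the sketch compares the filename column exactly, so deleting a.txt does not also remove entries for aa.txt.

```python
import os
import tempfile


def remove_file_entries(index_path, filename):
    """Drop every index line whose filename column equals filename.

    Returns the number of removed entries.
    """
    removed = 0
    index_dir = os.path.dirname(os.path.abspath(index_path))
    # Write the surviving lines to a temporary file in the same directory,
    # then swap it in, so a crash cannot leave a half-written index.
    fd, tmp_path = tempfile.mkstemp(dir=index_dir, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as out, \
                open(index_path, encoding="utf-8") as index:
            for line in index:
                # Split "word<TAB>filename" at the first tab.
                _word, _tab, name = line.rstrip("\n").partition("\t")
                if name == filename:
                    removed += 1
                else:
                    out.write(line)
        os.replace(tmp_path, index_path)
    except Exception:
        os.unlink(tmp_path)
        raise
    return removed
```

This rewrites the whole index on every delete, which is simple but linear in the size of the index file.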
B. Editing a File
First, I delete all entries in the index file that contain the filename.
Then I create a temporary index file with new entries from the edited file.
Finally, I append the temporary index file to the main index file.
C. Adding a New File
I create a temporary index file with entries from the new file.
This temporary index file is appended to the main index file.
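Steps B and C share the "temporary index, then append" part, which could look roughly like this in Python. The function names and the word pattern are assumptions for illustration; for an edited file, the old entries still have to be removed from the main index first, as described in step B.

```python
import re
import shutil
from pathlib import Path

# Assumption: same word definition as the one used when the index was built.
WORD_RE = re.compile(r"\w+")


def write_temp_index(source_path, temp_index_path):
    """Write Word<TAB>Filename lines for one file into a temporary index.

    Returns the number of entries written.
    """
    source_path = Path(source_path)
    text = source_path.read_text(encoding="utf-8", errors="replace")
    words = sorted({w.lower() for w in WORD_RE.findall(text)})
    with open(temp_index_path, "w", encoding="utf-8") as temp:
        for word in words:
            temp.write(f"{word}\t{source_path.name}\n")
    return len(words)


def append_temp_index(temp_index_path, index_path):
    """Append the temporary index to the main index, then delete it."""
    # Copy the bytes unchanged; both files are opened in binary mode.
    with open(temp_index_path, "rb") as temp, open(index_path, "ab") as index:
        shutil.copyfileobj(temp, index)
    Path(temp_index_path).unlink()
```

The append assumes the main index ends with a line break; otherwise the first new entry would be glued onto the last existing line.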
Why This Approach?
Simplicity: The approach is easy to understand and implement.
Consistency: Because all entries for a file are deleted and recreated on every edit, the index file always stays up to date.
Flexibility: Temporary index files allow changes to be processed in isolation before being merged into the main index file.
PS: Here’s a test
Just for info: the search was performed across 276,748 TXT files.
I split the forums DBF into individual memo files.
Search completed. Elapsed time: 0.9403415 seconds
Writing hash content to 'c:\search_results\results.txt'...
Results have been written to 'c:\search_results\results.txt'.
Writing matches with coverage to 'c:\search_results\matches_with_coverage.txt'...
Matches with coverage have been written to 'c:\search_results\matches_with_coverage.txt'.