Searching in the content of documents
Posted: Wed Mar 16, 2022 10:03 am
by driessen
Hello,
I have a folder with thousands of documents : Word-documents (doc, docx and rtf) and PDF-documents.
I need to do a search in the content of all these documents to see if certain words can be found.
The result should be a list of the documents that contain the word I searched for.
This process needs to run from within my application.
Any suggestions?
Thank you very much in advance.
Re: Searching in the content of documents
Posted: Wed Mar 16, 2022 11:11 am
by Otto
Michel,
maybe you can use findstr?
Write a bat file with memowrit(), run it with winexec(), and read the result with memoread().
findstr /P "xbrowse" C:\FWH\samples\*.* >test.log
Best regards,
Otto
https://stackoverflow.com/questions/884 ... str-comman
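The findstr idea above can be sketched in Python roughly as follows. This is only an illustration, not Otto's actual code: the function names are made up for the example, and it swaps the `>test.log` redirect for reading the output directly. It also adds `/M` (print only the names of matching files), `/I` (ignore case) and `/C:` (literal search string) to the `/P` switch from the post. Quoting of search terms that contain spaces has not been checked here.

```python
import subprocess


def build_findstr_command(term, file_pattern, recurse=False):
    """Build the findstr argument list (hypothetical helper for this sketch)."""
    # /M: print only file names, /I: ignore case, /P: skip non-printable files
    command = ["findstr", "/M", "/I", "/P"]
    if recurse:
        command.append("/S")  # also search subdirectories
    # /C: treats the term as a literal string instead of a regular expression
    command += ["/C:" + term, file_pattern]
    return command


def files_containing(term, file_pattern, recurse=False):
    """Run findstr (Windows only) and return the matching file names."""
    result = subprocess.run(
        build_findstr_command(term, file_pattern, recurse),
        capture_output=True,
        text=True,
    )
    # findstr reports 0 when a match was found and 1 when nothing matched;
    # anything else is treated as an error in this sketch.
    if result.returncode not in (0, 1):
        raise RuntimeError(result.stderr.strip())
    return [line for line in result.stdout.splitlines() if line.strip()]
```

On a Windows machine, `files_containing("xbrowse", r"C:\FWH\samples\*.*")` would then return the list of matching files instead of writing them to test.log.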
Re: Searching in the content of documents
Posted: Wed Mar 16, 2022 11:35 am
by csincuir
Or you can also use FileSeek:
https://www.fileseek.ca/
It's fast and easy to use.
Best regards
Carlos
Re: Searching in the content of documents
Posted: Wed Mar 16, 2022 12:34 pm
by Otto
Carlos,
I remember that I ran tests with FileSeek, but you need the paid version to get a CSV export of the results.
Best regards,
Otto
viewtopic.php?f=3&t=33244&p=196025&hilit=fileseek&sid=b0f3b637d2d0ef8daf74d1ff56516df8#p196025
Re: Searching in the content of documents
Posted: Wed Mar 16, 2022 9:54 pm
by Otto
Hello Michel,
findstr does not search inside DOCX files.
For DOCX I use UNZIP and then search in the extracted XML files.
In a test with this approach, 116 DOCX files were searched and only one contained the search term.
Best regards,
Otto
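A minimal Python sketch of the unzip-and-search-XML approach, assuming the body text lives in `word/document.xml` inside the DOCX archive (the standard location). The function names are invented for this example and it is not the code used in the test above. It joins the text runs of each paragraph before searching, because Word may split a single word across several XML elements.

```python
import zipfile
import xml.etree.ElementTree as ET
from pathlib import Path

# Namespace of the WordprocessingML elements in word/document.xml
W_NS = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def docx_text(path):
    """Return the body text of a DOCX file, one line per paragraph."""
    with zipfile.ZipFile(path) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))
    paragraphs = []
    for para in root.iter(W_NS + "p"):
        # Join all text runs so a word split across runs is still found
        paragraphs.append("".join(t.text or "" for t in para.iter(W_NS + "t")))
    return "\n".join(paragraphs)


def docx_files_containing(folder, term):
    """Return the DOCX files under folder whose text contains term (case-insensitive)."""
    term = term.lower()
    hits = []
    for path in sorted(Path(folder).rglob("*.docx")):
        try:
            if term in docx_text(path).lower():
                hits.append(path)
        except (zipfile.BadZipFile, KeyError, ET.ParseError):
            # Not a valid archive, no document.xml, or broken XML: skip it
            continue
    return hits
```

Headers, footers and footnotes are stored in other XML parts of the archive and are not covered by this sketch.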
Re: Searching in the content of documents
Posted: Wed Mar 16, 2022 10:11 pm
by driessen
Hello Otto,
Thank you very much for your efforts trying to help me.
How does your suggestion hold up when one needs to search a few hundred thousand documents?
Does the system still do its job then?
I'll have to test it but I will only be able to test in the second half of next week since I'm going on holiday for one week.
But I'll start my test asap.
Thanks once again.
Re: Searching in the content of documents
Posted: Thu Mar 17, 2022 8:05 pm
by Jimmy
hi,
I have not tested it yet, but there "seems" to be a "simple" way using ADO.
Look on GitHub for "Windows-classic-samples-main.zip" (I have no link yet):
Windows-classic-samples-main.zip\Windows-classic-samples-main\Samples\Win7Samples\winui\WindowsSearch\WSFromScript\QueryEverything.vbs
---
page_type: sample
languages:
- vbscript
products:
- windows-api-win32
name: WSFromScript sample
urlFragment: wsfromscript-sample
description: Demonstrates how to query Windows Search from a Microsoft Visual Basic script using Microsoft ActiveX Data Objects (ADO).
extendedZipContent:
- path: LICENSE
target: LICENSE
---
# WSFromScript sample
The WSFromScript code sample demonstrates how to query Windows Search from a Microsoft Visual Basic script using Microsoft ActiveX Data Objects (ADO).
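For comparison, here is a rough Python sketch of the same idea: querying the Windows Search index through ADO. It is untested here and is not taken from the sample. It assumes the third-party pywin32 package for COM access, uses invented function names, and only finds files in locations that Windows Search has already indexed. The provider string and the SystemIndex query syntax are the ones commonly documented for Windows Search.

```python
# Connection string for the Windows Search OLE DB provider
CONNECTION_STRING = (
    "Provider=Search.CollatorDSO;Extended Properties='Application=Windows';"
)


def build_query(folder, term):
    """Build a Windows Search SQL query for files under folder containing term."""
    scope = "file:" + folder.replace("\\", "/")
    safe_scope = scope.replace("'", "''")  # escape single quotes for SQL
    safe_term = term.replace("'", "''").replace('"', "")
    return (
        "SELECT System.ItemPathDisplay FROM SystemIndex "
        f"WHERE SCOPE='{safe_scope}' AND CONTAINS('\"{safe_term}\"')"
    )


def search_index(folder, term):
    """Run the query through ADO (Windows only, assumes pywin32 is installed)."""
    import win32com.client  # imported here so build_query works without pywin32

    connection = win32com.client.Dispatch("ADODB.Connection")
    connection.Open(CONNECTION_STRING)
    recordset = win32com.client.Dispatch("ADODB.Recordset")
    recordset.Open(build_query(folder, term), connection)
    paths = []
    while not recordset.EOF:
        paths.append(recordset.Fields.Item("System.ItemPathDisplay").Value)
        recordset.MoveNext()
    recordset.Close()
    connection.Close()
    return paths
```

Because the index is built in advance, this route may scale better to a very large number of documents than scanning every file on each search, but that would need to be measured.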