I have indexed a mailing list archive. My next goal is to update that index nightly by indexing the entire month's archive and then merging the result into the main database. At present there are about 4 years of data. I'm seeking comments on my approach.
FYI, this is the main index:

    -rw-r--r--  1 dan  dan  70930432 Nov  9 16:58 adsl.docdb
    -rw-r--r--  1 dan  dan   1939456 Nov  9 16:58 adsl.docs.index
    -rw-r--r--  1 dan  dan  80090252 Nov  9 16:58 adsl.wordlist
    -rw-r--r--  1 dan  dan  66713600 Nov  9 16:58 adsl.words.db

My first step is to create the merge database:

    -rw-r--r--  1 dan  dan  39936 Nov  9 16:52 adsl-merge.docdb
    -rw-r--r--  1 dan  dan   2048 Nov  9 16:50 adsl-merge.docs.index
    -rw-r--r--  1 dan  dan  33705 Nov  9 16:50 adsl-merge.wordlist
    -rw-r--r--  1 dan  dan  54272 Nov  9 16:50 adsl-merge.words.db

Here is the command I use to do the merge of the above two databases:

    htmerge -s -a -c adsl.conf -m adsl.merge.conf

But in order to do that, I need to first do the following:

    cp adsl-merge.docdb.work       adsl-merge.docdb
    cp adsl-merge.docs.index.work  adsl-merge.docs.index
    cp adsl-merge.wordlist.work    adsl-merge.wordlist
    cp adsl-merge.words.db.work    adsl-merge.words.db

    cp adsl.docdb       adsl.docdb.work
    cp adsl.docs.index  adsl.docs.index.work
    cp adsl.wordlist    adsl.wordlist.work
    cp adsl.words.db    adsl.words.db.work

After the merge, this moves the new search data into production:

    mv adsl.docdb.work       adsl.docdb
    mv adsl.docs.index.work  adsl.docs.index
    mv adsl.wordlist.work    adsl.wordlist
    mv adsl.words.db.work    adsl.words.db

It all seems to work. Any comments?

Thanks.
--
Dan Langille : http://www.langille.org/

_______________________________________________
htdig-general mailing list
FAQ: http://htdig.sourceforge.net/FAQ.html
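The copy, merge, and move steps described in the post could be collected into one script. What follows is only a sketch under stated assumptions, not a tested production job: it assumes all database files sit in the current directory, that the htmerge invocation is exactly the one quoted in the post, and that the function names (stage_files, run_merge, promote_files, nightly_merge) and the HTMERGE variable are hypothetical helpers introduced here for illustration. The one behavioural difference from running the commands by hand is that each step is chained with &&, so a failed copy or a failed merge leaves the production files untouched.

```shell
#!/bin/sh
# Sketch of the nightly merge from the post. Assumes the database files
# are in the current directory and that adsl.conf / adsl.merge.conf are
# set up as in the post. All function names here are illustrative.

SUFFIXES="docdb docs.index wordlist words.db"

# Hypothetical override hook: lets the merge command be replaced,
# for example by `true` for a dry run.
HTMERGE="${HTMERGE:-htmerge}"

# Step 1: put the monthly merge database and a working copy of the
# main database in place (the eight cp commands from the post).
stage_files() {
    for s in $SUFFIXES; do
        cp "adsl-merge.$s.work" "adsl-merge.$s" || return 1
        cp "adsl.$s" "adsl.$s.work" || return 1
    done
}

# Step 2: run the merge with the same flags as in the post.
run_merge() {
    $HTMERGE -s -a -c adsl.conf -m adsl.merge.conf
}

# Step 3: move the merged working copy into production.
promote_files() {
    for s in $SUFFIXES; do
        mv "adsl.$s.work" "adsl.$s" || return 1
    done
}

# Run all three steps, stopping at the first failure so that a broken
# merge is never promoted.
nightly_merge() {
    stage_files && run_merge && promote_files
}
```

A cron job could source this file from the database directory and call nightly_merge, checking its exit status to decide whether to send an alert. Note that the four mv commands are not atomic as a group, so a search that runs in the middle of the promotion step may briefly see a mix of old and new files.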

