As far as I know, there isn't, and there's a good reason for that. Htdig
has to crawl the whole website to see whether anything has changed. It has to
view/download every page and then compare it with the data it already has.
Creating new databases does not take much more effort than searching the
database, determining whether a page has already been indexed, and then
either going on to the next page or indexing this one.

So I don't think you would gain much time by updating the database
instead of creating a new one. The company I work for has implemented
websites using htdig a number of times now. In each and every case,
indexing the website took a large amount of time. The only way to speed
up the process is to speed up the server ;)
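
For what it's worth, the last step of the -a approach (moving the finished
work files over the live databases) could be sketched roughly like this in
Python. This is only an illustration under assumptions: I'm assuming the
work files sit next to the live ones and carry a ".work" suffix (for
example db.docdb.work). The function name swap_work_files and the directory
layout are made up, so check them against your own setup.

```python
import os
from pathlib import Path


def swap_work_files(db_dir, suffix=".work"):
    """Rename every '<name>.work' file in db_dir to '<name>'.

    The ".work" suffix is an assumption about how the work files are named.
    Returns the sorted list of live file names that were replaced.
    """
    swapped = []
    for work_file in sorted(Path(db_dir).glob("*" + suffix)):
        # Skip directories and a file that is named exactly ".work".
        if not work_file.is_file() or work_file.name == suffix:
            continue
        live_file = work_file.with_name(work_file.name[: -len(suffix)])
        # os.replace overwrites the destination in a single step, so there
        # is no moment at which the live file is missing.
        os.replace(work_file, live_file)
        swapped.append(live_file.name)
    return swapped
```

You would call it only after the indexing run has finished successfully,
e.g. swap_work_files("/path/to/htdig/db"), where the path is a placeholder.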

Marco


-----Original Message-----
From: Raschin Ghanad-Tavakoli [mailto:[EMAIL PROTECTED] 
Sent: Friday, 17 December 2004 9:36
To: Marco Houtman
Subject: Re: [htdig] conflicts due to deleted files from indexed
directories?

hello, 

thx for the advice...

ad 1: yes, that's what I thought...

ad 2:
What I actually meant was: do I have to reindex everything
from scratch just to delete some entries from the database?
I don't believe I have to, but I'm not sure...

Of course this wouldn't be so bad if I could use the -a option to ensure that
the search engine stays online, but it still takes much, much longer (2 or 3
times as long = 2-3 days!!!) than just updating the database (without the
-i or -a option)...

Besides: is there any way to use the -a option without reindexing
from scratch?


thx in advance
raschin


"Marco Houtman" <[EMAIL PROTECTED]> wrote:

> Raschin,
> 
> 1. If you delete a file after it has been indexed, htsearch will still
> retrieve it from the htdig databases, but the browser won't find it
> anymore, resulting in a 404 error (File Not Found).
> 
> 2. There are two ways (to my knowledge) to create the databases. First,
> starting htdig in the normal way: it will delete the existing databases and
> then create new ones. Htsearch will therefore not retrieve anything useful
> until the whole indexing run is completed. The other way is running htdig
> with, I believe, the -a option. This will force htdig to create temporary
> work files. These will contain the new data, and as soon as htdig has
> finished spidering your website you can delete the old databases and then
> move the new databases into their place. This can all be done with a script.
> 
> Hope this helps!
> 
> Marco
> 
> -----Original Message-----
> From: [EMAIL PROTECTED]
> [mailto:[EMAIL PROTECTED] On Behalf Of Raschin
> Ghanad-Tavakoli
> Sent: Wednesday, 15 December 2004 13:52
> To: [EMAIL PROTECTED]
> Subject: [htdig] conflicts due to deleted files from indexed directories?
> 
> I have a general question... are there any conflicts if I delete files
> after (or while) indexing the whole directory structure?
> 
> 2 questions:
> 
> 1. Where will htsearch's results link to if the files don't exist
> anymore?
> 
> 2. Does htdig delete those entries from the database at the next
> indexing run to fix this?
> 
> thx in advance
> raschin
> 
> 
> 
> 
> -------------------------------------------------------
> SF email is sponsored by - The IT Product Guide
> Read honest & candid reviews on hundreds of IT Products from real users.
> Discover which products truly live up to the hype. Start reading now. 
> http://productguide.itmanagersjournal.com/
> _______________________________________________
> ht://Dig general mailing list: <[EMAIL PROTECTED]>
> ht://Dig FAQ: http://htdig.sourceforge.net/FAQ.html
> List information (subscribe/unsubscribe, etc.)
> https://lists.sourceforge.net/lists/listinfo/htdig-general