According to Warren Jones:
> I'm not at all sure that the patch to URL.cc is the best solution,
> but something like it is essential for our site, and I suspect
> others are in the same situation. Here are the details:
>
> o We must index only valid_extensions, since we have no
> control over what individual users put in their web
> directories, and some are ...uhm... indiscriminate.
>
> o If a user puts a binary executable in his web directory,
> our server announces that it's type "text/html".
> I don't have control over this either.
This is a bit odd. I believe most servers use text/plain as the default
type for files with no suffix or an unknown suffix. Even so, htdig would
index text/plain files, so binary files with no file name suffix would
still pose a problem.
> o Using valid_extensions also allows URL's with no extension
> (after my patch to Retriever.cc). This is as it should be,
> since many URL's with no extension are subdirectories,
> which we need to index. But other URL's with no extension
> are binary executables or heaven knows what.
>
> o Users can't be relied on to use a trailing slash in links
> that point to a directory, e.g. <A HREF="subdirectory/">.
>
> In short, I see no way to tell whether a URL with no extension
> is 1) a subdirectory, which we want to index or 2) binary garbage,
> which we want to ignore, except to do what I've done in URL.cc:
> add a trailing slash to the URL and try to retrieve it.
>
> Still, I agree with Gilles in being a little uncomfortable with
> this solution. I'd be happy if someone could suggest something
> that's more elegant.
The problem is that this change totally breaks things for cases where
it's valid to have text files with no suffix. E.g., one may want to index
a directory of HTML documentation files which also contains text/plain
files like COPYING, ChangeLog, README, etc. If your change is necessary
for your system, then perhaps it could be made selectable by a new config
attribute, but making it the default or the only behaviour would cause a
lot of users a lot of grief.
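
To make the idea of a selectable behaviour concrete, here is a rough
sketch, not a patch against the real URL.cc. The names hasNoExtension,
normalizePath and the treatAsDirectory flag are all hypothetical; the flag
stands in for a config attribute that does not exist yet. The sketch also
assumes the path has already had any query string stripped.

```cpp
#include <string>

// Hypothetical helper, not part of htdig: returns true when the last
// component of `path` is non-empty and contains no '.', i.e. it has no
// file name suffix.
bool hasNoExtension(const std::string &path) {
    std::string::size_type slash = path.rfind('/');
    std::string last =
        (slash == std::string::npos) ? path : path.substr(slash + 1);
    return !last.empty() && last.find('.') == std::string::npos;
}

// Hypothetical normalization step. `treatAsDirectory` stands in for a new
// config attribute (assumed, not an existing one). It defaults to false, so
// suffix-less files such as README or COPYING are left alone unless a site
// explicitly asks for the trailing-slash behaviour.
std::string normalizePath(const std::string &path,
                          bool treatAsDirectory = false) {
    if (treatAsDirectory && hasNoExtension(path))
        return path + "/";
    return path;
}
```

With the flag off, "/docs/README" is left as is and gets fetched as a
document; with it on, the same path becomes "/docs/README/", which is the
behaviour Warren's site needs. Paths that already end in a slash or that
carry a suffix are untouched either way.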
--
Gilles R. Detillieux E-mail: <[EMAIL PROTECTED]>
Spinal Cord Research Centre WWW: http://www.scrc.umanitoba.ca/~grdetil
Dept. Physiology, U. of Manitoba Phone: (204)789-3766
Winnipeg, MB R3E 3J7 (Canada) Fax: (204)789-3930