Torsten Neuer (tneuer@inwise.de)
Wed, 17 Mar 1999 10:02:22 +0100
On Wed, 17 Mar 1999, Geoff Hutchison wrote:
>>start_url: http://www.suse.com/Mailinglists/suse-informix
>>
>>doesn't see this subdir. The index file there, however, contains
>>all the links...
>
>My best suggestion is to run htdig and add '-vvv' to your command line.
>This will generate a pile of data, but it should give you each HTTP header
>as well as some explanation of why it rejects links.
>
>I ran it through the URL test code I've been using as I add support for
>multiple services. It parsed it OK, so any problem is occurring earlier in
>parsing--for example the HTML parser may decide those href tags aren't any
>good. The debugging output will tell us more.
>
>-Geoff
>
I may be wrong, but AFAIK the colon is a special character in a URL.
It is normally used to put a username and password into FTP URLs
(e.g. "ftp://user:password@host/directory").
If used in another context, colons must be encoded.
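
To illustrate what I mean, here is a minimal sketch in Python using the
standard urllib.parse module. It is not ht://Dig code, and the path segment
"msg:0001.html" is a made-up example, not something taken from the site in
question. It only shows how the colon separates user and password in the
FTP form, and what a percent-encoded colon looks like.

```python
from urllib.parse import urlsplit, quote

# The FTP form from above: the colon separates username and password.
ftp = urlsplit("ftp://user:password@host/directory")
print(ftp.username, ftp.password, ftp.hostname, ftp.path)

# Hypothetical path segment containing a colon (assumption, for illustration).
segment = "msg:0001.html"

# safe="" asks quote() to percent-encode every reserved character,
# so the colon becomes %3A.
encoded = quote(segment, safe="")
print(encoded)
```

Whether a given parser really requires the encoded form in a path is the
open question here; the sketch only shows the two spellings side by side.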
Regards,
Torsten
-- 
InWise - Wirtschaftlich-Wissenschaftlicher Internet Service GmbH
Waldhofstraße 14        Tel: +49-4101-403605
D-25474 Ellerbek        Fax: +49-4101-403606
E-Mail: info@inwise.de  Internet: http://www.inwise.de
------------------------------------
To unsubscribe from the htdig3-dev mailing list, send a message to
htdig3-dev@htdig.org containing the single word "unsubscribe" in
the SUBJECT of the message.