Subject: Re: [htdig3-dev] Re: robots.txt bug (was [ANNOUNCE] ht://Dig 3.2.0b1)
From: Geoff Hutchison (ghutchis@wso.williams.edu)
Date: Mon Feb 07 2000 - 16:20:28 PST
At 11:15 AM +0200 2/7/00, Valdas Andrulis wrote:
>GH> First off, have you set case_sensitive to anything in your config file?
>
>No.
Good. That rules out case_sensitive as a source of problems with the regex.
>GH> Then let us know what pattern it sets in the debug output--I don't
>GH> really want the whole thing but I want to see if it's setting the
>GH> pattern OK.
>
>Trying to retrieve robots.txt file
>Parsing robots.txt file using myname = htdig
>Found 'user-agent' line: htdig
>Found 'disallow' line: /cat/
>Found 'user-agent' line: htdig
>Found 'disallow' line: /foobar/
>Pattern: /foobar/
This is bad. The last line should be:
Pattern: /cat/|/foobar/
In light of a recent bug report (about a new 'allow' keyword in
robots.txt) the code probably needs to be rewritten. Nevertheless,
here's the key code:
    if (*rest)
    {
        if (pattern.length())
            pattern << '|' << rest;
        else
            pattern = rest;
    }
The only thing I can think of here is that "pattern = rest;" is not
performing the copying that it should...
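
For comparison, here is a minimal sketch of the accumulation that the
snippet above is meant to perform. It uses std::string rather than
ht://Dig's own String class, and the function name buildDisallowPattern
and its vector argument are assumptions made for illustration; none of
this is taken from the htdig source.

```cpp
#include <string>
#include <vector>

// Hypothetical helper, not part of ht://Dig: joins the Disallow paths
// into a single alternation pattern, mirroring the if/else above.
std::string buildDisallowPattern(const std::vector<std::string>& paths)
{
    std::string pattern;
    for (const std::string& rest : paths)
    {
        if (rest.empty())
            continue;              // same guard as "if (*rest)"
        if (!pattern.empty())
            pattern += '|';        // separate alternatives with '|'
        pattern += rest;           // std::string appends by copying
    }
    return pattern;
}
```

Given the two Disallow paths from the debug output, /cat/ and /foobar/,
this returns /cat/|/foobar/. If the real code produces only /foobar/,
then either the first value is lost on assignment or the pattern is
being cleared between the two 'user-agent' blocks.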
Thoughts?
-Geoff
This archive was generated by hypermail 2b28 : Mon Feb 07 2000 - 16:25:04 PST