On Thu, Dec 14, 2023 at 8:52 AM Dave Wreski <dwre...@guardiandigital.com.invalid> wrote:
>> Hi,
>>
>> I have a FAQ, but need some additional info I haven't been able to find.
>> I'm trying to process links Google has indicated are 404s that never really
>> ever existed on our site.
>>
>> I have an htaccess file I'm Including with my main apache config that
>> only contains RewriteConds. This file is processed before any of the other
>> htaccess files that contain standard RewriteRules. This is what I'm using
>> to strip off any trailing slashes in URLs:
>>
>> RewriteCond %{REQUEST_FILENAME} !-d
>> RewriteRule ^(.*)/$ /$1 [R=301,L]
>>
>> I just want to confirm that this means none of the RewriteRules that
>> follow should contain a trailing slash or they will not match, correct?
>>
>> Some of my existing RewriteRules that were created before I realized I
>> should be stripping off the trailing slash actually contain a trailing
>> slash.
>>
>> Perhaps I should instead be using '/?' instead of just '/' at the end of
>> URLs?
>>
>> Thanks,
>> Dave
>>
>
> If the following rules look for a trailing slash and you remove it prior,
> in theory it won't match. However, remember that .htaccess files will be
> parsed over and over until it stops matching, so you are likely to create a
> rewrite loop.
>
> Oh, good info. I didn't realize that.
>
> What is the rationale for removing trailing slashes here?
>
> Because apparently Google considers it duplicated content when it sees one
> version with a slash and one version without. Here's a few articles that
> discuss the issues.
>
> https://authenticdigital.nz/blog/trailing-slashes-and-seo/
> https://ahrefs.com/blog/trailing-slash/
> https://stackoverflow.com/questions/5948659/when-should-i-use-a-trailing-slash-in-my-url
>
> Also, I learned my RewriteCond above to strip off the trailing slash
> doesn't work with URLs involving query strings.
>
> RewriteCond %{REQUEST_FILENAME} !-d
> RewriteRule ^(.*)/$ /$1 [R=301,L]
>
> I believe it also has the potential to add a duplicate slash in the
> beginning if $1 already has a slash in it, but using just $1 alone doesn't
> fix the problem with losing query strings. Even ahrefs uses the above
> example in their blog post without considering query strings or the
> potential for creating duplicate slashes.
>
> Ideas greatly appreciated.
>
> Thanks,
> Dave
>

I would stop using .htaccess files, first, and merge all rewrite rules in
the relevant vhost / Directory block. Then, I would use the rewrite log to
see what is really happening. Using multiple .htaccess files is a recipe to
lose all your hair.
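
As a rough sketch of what the merged setup could look like, assuming Apache 2.4 and rules placed directly in the vhost: the ServerName, DocumentRoot, and the /old-page rule below are placeholders I made up for illustration, not anything from your config, so adjust to taste.

```apache
<VirtualHost *:80>
    # Placeholder values; substitute your own.
    ServerName www.example.com
    DocumentRoot "/var/www/html"

    # Apache 2.4 replaces the old RewriteLog with per-module log levels.
    # Trace output goes to the error log; turn it back down once you are
    # done debugging, since higher trace levels slow the server.
    LogLevel warn rewrite:trace3

    RewriteEngine On

    # Strip one trailing slash unless the request maps to a real directory.
    # In vhost context REQUEST_FILENAME is not yet mapped to the filesystem,
    # so the path is built from DOCUMENT_ROOT and REQUEST_URI instead.
    # The pattern here sees the full URL-path including its leading "/",
    # so $1 already starts with "/" and the substitution adds none.
    RewriteCond %{DOCUMENT_ROOT}%{REQUEST_URI} !-d
    RewriteRule ^(.+)/$ $1 [R=301,L]

    # Hypothetical later rule: "/?" accepts the path with or without a
    # trailing slash, so it matches whether or not the rule above ran.
    RewriteRule ^/old-page/?$ /new-page [R=301,L]
</VirtualHost>
```

Two things worth checking against the trace output. First, the pattern is matched against a different string depending on context: in .htaccess (per-directory) context the leading slash is stripped before matching, while in vhost context it is kept, which is one plausible source of the doubled slash if the same rule is Included at server level. Second, per the mod_rewrite documentation the query string is passed through unchanged when the substitution contains no '?', so if query strings are getting lost, the trace should show which rule is actually dropping them.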