May 14, 2014
 
Article Apache

Sometimes the webmaster of a website decides that the structure of the site's content should be changed. As a result, the URLs used to access some pages may change. But other pages may already link to those pages through the old URLs, either as external links from other sites or as internal links. Besides, search engines such as Google might already have indexed the content of the page, also using the old URL.

If the URL is modified without taking this into account, all those links would become broken, and the ranking of the linked pages would most likely be affected as well.

To prevent this undesirable effect, a RewriteRule directive can be put in the config file of the Apache web server, to redirect the old URLs to the new ones, as explained in this post.

1. 301 and 302 redirects

If the intention of the webmaster is to make a lasting change to the URL, a permanent redirect should be configured, making the web server return a status code HTTP 301.

However, if the change to the URL is expected to be reverted later, it is advisable to set up a temporary redirect, making the web server return a status code HTTP 302.

The redirects are implemented by adding a “RewriteRule” directive to the web server configuration, or else to the directory-level .htaccess configuration file. The syntax is the same in both cases, although in .htaccess the pattern is matched against the path relative to that directory, without the leading slash.

The following example configures a permanent redirect that sends all incoming requests for the domain www.example.com to www.example.net:
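A minimal sketch of such a rule set, assuming mod_rewrite is loaded and that the target site is served over plain http (neither detail is stated above):

```apache
# Enable the rewrite engine (required before any RewriteRule takes effect)
RewriteEngine On

# Apply the rule only when the requested host is www.example.com
RewriteCond %{HTTP_HOST} ^www\.example\.com$ [NC]

# Send every path to the same path on www.example.net with a 301;
# the optional leading slash lets the pattern match in either context
RewriteRule ^/?(.*)$ http://www.example.net/$1 [R=301,L]
```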

A temporary redirect would be configured in the same way, just replacing “R=301” with “R=302”.

The RewriteCond directive that precedes the RewriteRule limits the redirect to URLs whose original domain is www.example.com. This directive is required only if the web server has been configured to serve several virtual hosts.

The “L” (Last) flag, in turn, stops the evaluation of any rewrite rules that follow the redirect.

2. 404 and 410 redirects

In other cases, while changing the structure of the site, the webmaster might find that some pages should simply be dropped, without being redirected to others.

The default behaviour of the web server when a missing page is requested is to return a “404 Not Found” status code to the client. But in these cases it is preferable to return a “410 Gone” status code, to make search engines aware that the page has been intentionally removed.

For instance, the following directive can be used to tell search engines as well as browsers that all pages under the directory “/2012/images” have been deleted:
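A minimal sketch of such a directive, assuming mod_rewrite is loaded and the rewrite engine is not already enabled elsewhere in the configuration:

```apache
RewriteEngine On

# "-" means no substitution: the URL is left as is and only the flags apply.
# G returns "410 Gone"; NC makes the pattern match case-insensitively.
RewriteRule ^/?2012/images - [G,NC]
```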

The “G” (“Gone”) flag makes the web server return a 410 status code.

The “NC” (No Case) flag, also used in the example, is optional. It makes the regular expression on the left side of the directive case-insensitive. In this case, requests for URLs starting with “/2012/IMAGES” would also receive a 410 response code.

3. 403 redirect

Finally, in some cases we would rather let browsers and search engines know that the page exists, but that access to it is forbidden (for instance, to allow only registered users to access the site administration pages).

This is done by returning a “403 Forbidden” status code.

The syntax is similar to that used for the 410 response, just replacing the flag “G” with “F”. For instance, to forbid access to any file with a “.exe” extension:
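A minimal sketch of such a directive, again assuming mod_rewrite is loaded:

```apache
RewriteEngine On

# F returns "403 Forbidden" for any request whose path ends in ".exe";
# "-" leaves the URL unchanged
RewriteRule \.exe$ - [F]
```

Adding the “NC” flag, as in the previous section, would also cover upper-case variants such as “.EXE”.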

 Posted by at 2:34 pm
