May 15, 2014
 
Article PHP

Sometimes the structure of a web site's content needs to be modified, which changes the URLs used to access some of its pages.

But other web sites might already have created links to our site, using the old URLs. Besides, some users might have bookmarked the old URLs in their browsers. Finally, search engines such as Google or Bing might already have indexed the content of our site using the old URLs.

To prevent all those valuable references from becoming broken links, we can program the site to redirect users, as well as search engines, to the new URLs when requests for the old URLs are received.

If the URLs have changed in a way that can be easily expressed as a regular expression, the best way to set up those redirects is by adding RewriteRule directives to the configuration of the (Apache) web server of our site.

But sometimes it may be more convenient to update the CGI scripts that dynamically generate the content of the site. This post explains how to do this in a PHP script.

1. Temporary and permanent redirects

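A temporary redirect is implemented in PHP with a call to the built-in “header” function, passing as its argument a “Location” header that contains the destination URL, in the form header('Location: <new URL>'). The call must be made before the script sends any output.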
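As a minimal sketch of such a call, the function below wraps header() for a temporary redirect. The names location_header() and redirect_temporarily() and the example.com URL are assumptions made for illustration, not part of any existing code.

```php
<?php
// Builds the header line that header() expects for a redirect.
// Hypothetical helper, introduced only for this example.
function location_header(string $url): string
{
    return 'Location: ' . $url;
}

// Sends a temporary redirect. With a Location header and no explicit
// status code, PHP replies with status 302.
function redirect_temporarily(string $url): void
{
    header(location_header($url));
}

// Typical use at the top of a script, before any output is sent:
// redirect_temporarily('https://www.example.com/new-page.php');
// exit;
```

Calling exit right after the redirect stops the rest of the old script from running, which is usually what we want once the browser has been told to go elsewhere.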

When header() is called with a “Location” header and no explicit status code, PHP sends a 302 status code (“Found”, formerly “Moved Temporarily”) in the response.

But if the change in the URL is intended to be permanent, search engines should be made aware of this fact so that they update their indexes, replacing the old URL with the new one.

To do this, the desired 301 “Moved Permanently” response code must be passed as the third argument in the call to header(). The second argument can be left at its default value, true, which makes the new header replace any earlier header of the same name.
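A permanent redirect could be sketched as follows. The MOVED_PAGES table, its paths, and the function names are hypothetical; a real site would fill the table with its own old and new URLs.

```php
<?php
// Hypothetical table mapping old paths to their new locations.
const MOVED_PAGES = [
    '/old-article.php' => '/articles/new-article',
    '/contact.html'    => '/about/contact',
];

// Returns the new path for a moved page, or null if the path has not moved.
function new_location(string $oldPath): ?string
{
    return MOVED_PAGES[$oldPath] ?? null;
}

// Sends a permanent redirect: the second argument (true) replaces any
// earlier Location header, the third sets the status code to 301.
function redirect_permanently(string $url): void
{
    header('Location: ' . $url, true, 301);
}

// Typical use at the top of a script, before any output is sent:
// $path = (string) parse_url($_SERVER['REQUEST_URI'], PHP_URL_PATH);
// $target = new_location($path);
// if ($target !== null) { redirect_permanently($target); exit; }
```

Keeping the mapping in one table is only one possible design; it makes it easy to see at a glance which old URLs are still being honoured.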

2. Status codes 403, 404 and 410

In the new structure of the content of the site, it may happen that some pages have become obsolete and have been removed. When a request arrives for a non-existent page, the default behaviour of a web server is to return a 404 “Not Found” response code. In PHP, we can also make the server return a 404 code with a call to header(), passing 404 as the third argument.

But it is better to return a 410 “Gone” status code. This tells search engines that the page has been intentionally removed from the site, and that they should consequently remove it from their indexes. Note that, unlike with a 301 or 302 response, a browser will not load the URL passed in the first argument to header(); even so, it is common practice to pass the home URL in the call.
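A sketch of a 410 response, under the assumption that the removed pages are kept in a simple list. REMOVED_PAGES, its entries, the function names, and the home URL are all hypothetical.

```php
<?php
// Hypothetical list of pages that were removed on purpose.
const REMOVED_PAGES = ['/promo-2012.php', '/old-forum.php'];

// Tells whether the requested path is one of the removed pages.
function is_removed(string $path): bool
{
    return in_array($path, REMOVED_PAGES, true);
}

// Sends a 410 "Gone" status. The Location value is passed by convention
// only: browsers do not follow it for a 410 response.
function send_gone(string $homeUrl): void
{
    header('Location: ' . $homeUrl, true, 410);
}

// Typical use at the top of a script, before any output is sent:
// $path = (string) parse_url($_SERVER['REQUEST_URI'], PHP_URL_PATH);
// if (is_removed($path)) { send_gone('https://www.example.com/'); exit; }
```

Since the browser stays on the requested URL, the script may also want to print a short message explaining that the page no longer exists before it exits.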

Finally, it might happen that the requested content actually exists, but under some circumstances (e.g., the user is not logged in) we want to prevent access to it. In this case, it is best to send a 403 “Forbidden” status code, which is set in the same way as the 404 and 410 codes.
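The 403 case might look like the sketch below. It assumes that the site's login code stores a user_id entry in the session, which is purely an assumption for this example, and it uses the built-in http_response_code() function (available since PHP 5.4) as an alternative to the third argument of header().

```php
<?php
// Hypothetical check: assumes the login code stores 'user_id' in the session.
function is_logged_in(array $session): bool
{
    return isset($session['user_id']);
}

// Sends a 403 "Forbidden" status without any Location header.
function send_forbidden(): void
{
    http_response_code(403);
}

// Typical use at the top of a protected script, before any output is sent:
// session_start();
// if (!is_logged_in($_SESSION)) { send_forbidden(); exit; }
```

Passing the session array as a parameter, rather than reading $_SESSION inside the function, keeps the check easy to try out in isolation.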

 Posted by at 5:34 pm
