Manage the Epsilon website locally

This article provides a step-by-step guide for obtaining a local copy of the Epsilon website.

Using lighttpd instead of XAMPP

You can also use lighttpd with PHP instead of XAMPP by following this tutorial and pointing server.document-root to your htdocs directory, checked out as above. Alternatively, on most Debian-based GNU/Linux distributions, installing the lighttpd and php5-cgi packages and adapting this minimal configuration should be enough:

server.document-root = "/path/to/htdocs"
server.port          = 8080
index-file.names    += ( "index.html", "index.htm", "index.php" )

mimetype.assign = ( ".html" => "text/html", ".htm" => "text/html",
                    ".css" => "text/css",   ".txt" => "text/plain",
                    ".jpg" => "image/jpeg", ".png" => "image/png" )

server.modules += ( "mod_cgi" )
cgi.assign      = ( ".php" => "/path/to/php-cgi" )

You can then launch the lighttpd server and leave it running in the foreground with:

lighttpd -D -f lighttpd-epsilon.conf
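
Since a typo in the configuration only shows up when the server fails to start, it can help to check the file first. The following is a minimal sketch, not part of the original instructions: the function name start_epsilon_site is hypothetical, and it assumes lighttpd is on the PATH. It relies on lighttpd's -t option, which parses the configuration and exits without starting the server.

```shell
# Sketch: validate the configuration, then start lighttpd in the foreground.
# The function name and the default file name are assumptions of this example.
start_epsilon_site() {
    conf="${1:-lighttpd-epsilon.conf}"

    # Fail early with a clear message if the configuration file is missing.
    if [ ! -f "$conf" ]; then
        echo "configuration file not found: $conf" >&2
        return 1
    fi

    # -t checks the configuration syntax and exits without serving requests.
    lighttpd -t -f "$conf" || return 1

    # -D keeps lighttpd in the foreground, as in the command above.
    lighttpd -D -f "$conf"
}
```

After loading the function in a shell, calling start_epsilon_site lighttpd-epsilon.conf should behave like the plain command, except that a missing or invalid file is reported before the server starts.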

If you use lighttpd on Windows, use Windows-style paths (c:\...) instead of Cygwin-style paths (/cygdrive/c/...); otherwise, PHP will not work correctly.


Finding broken links

wget and grep can be used to find broken links in the Epsilon website. First, we will traverse the website using wget with this command:

wget -e robots=off --spider -r --no-parent -o wget_errors.txt http://localhost:8080/epsilon/

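We have used these options:

-e robots=off: ignore robots.txt, so that no page is skipped.
--spider: check that each page exists instead of saving it.
-r: follow links recursively.
--no-parent: never ascend above the starting directory.
-o wget_errors.txt: write the log to wget_errors.txt instead of the terminal.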

Once it's done, we can simply search for the word "404" in the log, with:

grep -B2 -w 404 wget_errors.txt

We will get a list of all the URLs which reported 404 (Not Found) HTTP error codes.
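
If only the URLs themselves are wanted, without the surrounding log lines, the log can be filtered with awk. This is a rough sketch under stated assumptions: list_404_urls is a hypothetical helper, and it assumes wget's default English log layout, where each request starts with a line of the form "--<timestamp>--  <URL>" and the status appears on a later "awaiting response..." line.

```shell
# Sketch: print every URL that answered 404, one per line, without duplicates.
# list_404_urls is a hypothetical helper; the log layout is an assumption.
list_404_urls() {
    awk '
        # Request lines start with "--"; the URL is their last field.
        /^--/ { url = $NF }
        # When the status line reports 404, print the remembered URL.
        /awaiting response\.\.\. 404/ { print url }
    ' "$1" | sort -u
}
```

Running list_404_urls wget_errors.txt would then print the broken URLs only. If wget runs under a different locale, the status line text may differ and the pattern would need adjusting.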

To find broken links in the Epsilon blog, it is better to use -l 3 (3 levels of recursion for the spider) rather than --no-parent, as we might want to check if external links are broken as well. We use three levels so we can go to the month in the "Archives", then to the full article, and finally to the external link itself.
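
The blog variant of the command might look like the sketch below. Several things here are assumptions rather than part of the original instructions: the spider_blog function and its arguments are hypothetical, the blog address is not given in this article and must be supplied by the caller, and the -H (--span-hosts) option is added because wget does not follow links to other hosts during recursion unless it is told to. Drop -H if only links within the blog matter.

```shell
# Sketch: spider the blog three levels deep and log the results.
# spider_blog, its arguments and the -H option are assumptions of this example.
spider_blog() {
    blog_url="${1:-}"
    log_file="${2:-wget_blog_errors.txt}"

    # The blog address is not given here, so it must be passed explicitly.
    if [ -z "$blog_url" ]; then
        echo "usage: spider_blog BLOG_URL [LOG_FILE]" >&2
        return 2
    fi

    # -l 3 replaces --no-parent; -H lets the recursion reach other hosts.
    wget -e robots=off --spider -r -l 3 -H -o "$log_file" "$blog_url"
}
```

The resulting log can be searched for "404" in the same way as wget_errors.txt.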