Drupal site forensics

What tools do we have for forensic research on a Drupal site? This question popped into my head after seeing a request on the consulting mailing list.

My naive approach is:

wget -r -l 1 example.com

This gives us a copy of the site one link level deep, which we can grep for the modules used on those pages. Running grep -r -h "modules" . | sort -u then gives a nice idea of the modules used.

But this beauty gives us just the full file names:

grep -h -r -o "[[:alnum:]/]*modules[[:alnum:]./]*" * | sort -u
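The full paths are still a bit noisy if all we want is the list of modules. A minimal sketch, assuming each path has the form .../modules/NAME/file (the helper name module_names is my own invention, not an existing tool):

```shell
# module_names (hypothetical helper): reads one asset path per line on
# stdin and prints the unique directory names that directly follow the
# last "modules/" in each path. Lines without such a directory are skipped.
module_names() {
  sed -n 's#.*modules/\([^/]*\)/.*#\1#p' | sort -u
}
```

Piping the output of the grep above into module_names should leave one module name per line. Nested layouts with more than one "modules/" segment in the path will report the innermost directory, so treat the result as a hint rather than proof.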

Getting the CVS $Id lines from all the named stylesheets and other downloaded files:

$> grep \$Id *
book.css:/* $Id: book.css,v 1.2.2.1 2007/01/29 18:54:29 dries Exp $ */
comment.css:/* $Id: comment.css,v 1.1.2.2 2007/07/24 18:38:58 drumm Exp $ */
content.css:/* $Id: content.css,v 1.2.2.8 2007/08/09 19:08:16 yched Exp $ */
defaults.css:/* $Id: defaults.css,v 1.2 2006/08/25 09:01:12 drumm Exp $ */
devel.js:// $Id: devel.js,v 1.1.2.2 2007/10/21 01:06:35 weitzman Exp $
forum.css:/* $Id: forum.css,v 1.2 2006/11/14 06:30:10 drumm Exp $ */
node.css:/* $Id: node.css,v 1.2.2.1 2007/07/24 18:38:58 drumm Exp $ */
panels.css:/* $Id: panels.css,v 1.1.2.5 2008/01/10 17:51:56 merlinofchaos Exp $ */
poll.css:/* $Id: poll.css,v 1.2 2006/10/02 16:16:06 dries Exp $ */
robots.txt:# $Id: robots.txt,v 1.7.2.2 2008/02/25 02:18:25 drumm Exp $
system.css:/* $Id: system.css,v 1.21 2006/12/21 16:13:06 dries Exp $ */
tagadelic.css:/*$Id: tagadelic.css,v 1.2 2006/11/16 16:57:56 ber Exp $*/
twocol.css:/* $Id: twocol.css,v 1.4.6.4 2007/12/11 21:02:35 merlinofchaos Exp $ */
user.css:/* $Id: user.css,v 1.4 2006/12/30 07:45:31 dries Exp $ */
video.css:/* $Id: video.css,v 1.1.4.1 2008/02/20 13:16:35 fax8 Exp $   /*
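The listing above is easier to compare against CVS once it is reduced to the fields that matter. A rough sketch, assuming every line follows the "$Id: file,v revision date time author state $" layout seen above (parse_ids is a made-up name):

```shell
# parse_ids (hypothetical helper): reads the output of  grep '\$Id' *
# on stdin and prints "file revision date author" for every line that
# matches the CVS $Id layout. Lines that do not match are dropped.
parse_ids() {
  sed -n 's#.*\$Id: \([^,]*\),v \([0-9.]*\) \([0-9/]*\) [0-9:]* \([^ ]*\) .*#\1 \2 \3 \4#p'
}
```

For example, grep '\$Id' * | parse_ids | sort -k3 should order the files by commit date, which makes the most recently changed file, and so a lower bound on the release date, easy to spot.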

So is this site safe? And what version is it using?

Then we need to get the CVS tags from, e.g., http://cvs.drupal.org/viewvc.py/drupal/drupal/robots.txt?revision=1.7.2.... to see that this revision is a member of DRUPAL-5.8, 5.9 or 5.10.

With this we get the HTML that contains the CVS tags:

wget --header="Accept: text/html" "http://cvs.drupal.org/viewvc.py/drupal/drupal/robots.txt?revision=1.7.2....

To get the revisions out of the downloaded files we need something like this:

grep -h "\$Id.*,v" * | grep -o "[0-9][0-9]*\.[0-9\.]*"
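Once we have the bare revision numbers, their shape already tells us something. In CVS a trunk revision has two components (1.21), while a revision committed on a branch has four or more (1.7.2.2). A small sketch of that heuristic; rev_kind is a hypothetical helper and only looks at the number of dots:

```shell
# rev_kind (hypothetical helper): classify a CVS revision number by the
# number of dot-separated components. Purely a heuristic on the string.
rev_kind() {
  case "$1" in
    *.*.*.*) echo branch ;;   # e.g. 1.7.2.2: committed on a branch
    *.*.*)   echo unknown ;;  # three components is a branch number, not a file revision
    *.*)     echo trunk ;;    # e.g. 1.21: committed on the trunk
    *)       echo unknown ;;
  esac
}
```

A file reported as "branch" was changed after the branch point, so it is probably the more interesting one to look up when narrowing down the release.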
