Creating a sanitized Drupal database dump

People often want to create a backup copy of their site database and give it to someone else to create an environment similar to the live environment for testing or development. However, doing so exposes all of your site data being leaked if that backup copy is ever placed on a CD that gets lost or a harddrive which is not destroyed at the end of its life or a laptop which is stolen.

I blogged about one solution to this problem earlier with using Virtualbox and an encrypted disk image on Mac for easier and more secure development. However, another simple solution exists: munge the data.

Sanitizing a Drupal database export

Read the rest of Creating a sanitized Drupal database backup


This article is now part of the Knowledge Base of Drupal security articles on Drupal Scout.


This page is kept so the comments posted here are available since they provide additional help and insights.

Comments

Private key

Don't forget to remove the site-specific site key:

DELETE FROM variable WHERE name = 'drupal_private_key';

And for D7, potentially also the cron_key and possibly any other "key":

DELETE FROM variable WHERE name LIKE '%key%';

Demo module

btw, I also thought of extending http://drupal.org/project/demo module with an optional sub-module to ease this process, potentially also providing a hook for modules to integrate with.

Encryption Module

Great article. This sort of things is one of the main use-cases that I started the Encryption module. http://drupal.org/project/encrypt

What about profile data?

What about profile data in profile_fields and profile_values.

Clearing private/hidden profile fields

Good call.

The below query deletes profile data that is in a "private" or "hidden" profile field. Sometimes this data is necessary for theming so people may need to make fake users that have these values in order to test the site properly.


delete from profile_values where fid in (select fid from profile_fields where visibility in (1, 4));

Passwords should be munged

Since (in D6) passwords are not salted, they need to be munged if you're distributing the database. Otherwise it will be trivial to mount a brute-force attack on the username/password combinations in the database. Once the password is discovered for a username, it can be used against the real site.


UPDATE users SET pass=name;

or some such will do the job. If you want to disallow access, use a more complicated password replacement scheme.

Better update query

Sorry, left out the md5():

UPDATE users SET pass=md5(name);

This would allow login using the username as password to any account. If you want to secure accounts, use something more random and then set a password on specific accounts you need access to.

projects table does not exist

UPDATE projects SET mail = CONCAT("empty", '@localhost');
The projects table does not exist in the current version of the project module. Maybe it comes from an older schema of project.module?

two small bugs

Hi,

I noticed two smallish bugs that might confuse users new to SQL. The first line of the SQL is statement that updates all passwords using concatenation of a keyword (MILDSECRET) and username (name).

However, it is missing a comma at the end of the statement between 'MILDSECRET' and name, and should be written like this:

UPDATE users SET mail = CONCAT(name, '@localhost'), init = CONCAT(name, '@localhost'), pass = MD5(CONCAT('MILDSECRET', name));

The other problem is also minor, accesslog is not a table.... it should be access is; and the correction is:

TRUNCATE access;

thanks, now incorporated

Thanks, Mike. Those fixes are now incorporated into the original post.

I think the access vs. accesslog thing depends on which version of Drupal you use. 5.x and before for a while were accesslog while 6.x and above are access. I left both in for good measure.