Задать вопрос

Тел: +7 965 3737 888





db_dump.py - for dumpping and loading data from database

<p>This tool is used for dump and restore database of Django. And it can also support some simple situations for Model changes, so it can also be used in importing data after the migration of Model.</p>
<p>It includes: dump and restore.</p>
<p>Command Line:</p>
python db_dump.py [-svdh] [--settings] dump [applist]

<p>If applist is ignored&#65292;then it means that all app will be dumped. applist can be one or more app name.</p>
<p>Description of options:</p>
<ul><li>-s Output will displayed in console, default is writing into file</li>
<li>-v Display execution infomation, default is does not display</li>
<li>-d Directory of output, default is datadir in current directory. If the path is not existed, it'll be created automatically.</li>
<li>-h Display help information.</li>
<li>--settings settings model, default is settings.py in current directory. </li>
</ul><p>It can only support Python format for now. It'll create a standard python source file, for example:</p>
dump = {'table': 'tablename', 'records': [[...]], 'fields': [...]}

<p>table' is table name in database, records is all records of the table, it's a list of list, that is each record is a list. fields` is the fields name of the table.
Load(Restore) &#182;</p>
<p>Command Line:</p>
python db_dump.py [-svdrh] [--settings] load [applist]

<p>You can refer to the above description for same option. Others is:</p>
<ul><li>-r Does not empty the table as loading the data, default is empty the table first then load the data </li>
</ul><p>Using this tool, you can not only restore the database, but also can deal with the simple changes of database. It can select the suitable field from the backup data file according to the changed Model automatically, and it can also deal with the default value define in Model, such as default parameter and auto_now and auto_now_add parameter for Date-like field. And you can even edit the backup data file manually, and add a default key for specify the default value for some fields, the basic format is:</p>
'default':{'fieldname':('type', 'value')}

<p>default is a dict, the key will be the field name of the table, the value will be a two element tuple, and the first element of this tuple is type field, the second element is its value. Below is a description of type field:</p>
<p>type    value   description</p>
<p>'value'     real value  using the value field directly
'reference' referred field name     the value of this filed will use the value of referred field. It'll be used when the field name is changed
'date'  'now'|'yyyy-mm-dd'  It's a date date type, if the value field is 'now', then the value be current time. Otherwise, it'll be a string, it's format is 'yyyy-mm-dd'
'datetime'  'now'|'yyyy-mm-dd hh:mm:ss'     The same as above
'time'  'now'|'hh:mm:ss'    The same as above</p>
<p>The strategy of selection of default value of a field is: first, create a default value dict according the Model, then update it according the default key of backup data file. So you can see if there is a same definition of a field in both Model and backup data file, it'll use the one in backup data file.</p>
<p>According the process of default value, this tool will suport these changes, such as: change of field name, add or remove field name, etc. So you can use this tool to finish some simple update work of database.</p>
<p>But I don't give it too much test, and my situation is in sqlite3. So download and test are welcome, and I hope you can give me some improve advices.</p>
<h1>project site</h1>

Вопрос полезен? Да0/Нет0

Ответы (29):

Ответ полезен? Да0/Нет0

Still not working. The problem seems only occur when dumping data from Postgresql. I think line 441 is outdated:


if settings.DATABASE_ENGINE in ('postgresql_psycopg2', 'postgresql'):

fix: if settings.DATABASES['default']['ENGINE'] == 'django.db.backends.postgresql_psycopg2':

But still dumping data from Postgresql won't work.

Please help!

Ответ полезен? Да0/Нет0

Hi and thanks a lot for the useful script!! Unfortunately, I think the db_dump does not work properly with the latest release of Django 1.2.

Could you maybe have a look at this?

Sincerely, Simon

Ответ полезен? Да0/Нет0

I think the fix to work with inherited models properly is to just change line 292 above from:

for f in model._meta.fields:


for f in model._meta.local_fields:

This works in my simple test, and appears to make sense, but I don't know enough about Django internals to be sure this is safe.

Ответ полезен? Да0/Нет0

This appears not to work if you use inherited models. E.g. the approved Django way of doing user-profiles is now to derive an object from auth.User.

But if you do this, the code (v2.2) tries to insert all the fields into the child class, and not put anything into the parent class.

Anyone else come acrorss this?

Ответ полезен? Да0/Нет0

ManyToManyFields with a "through" arg cause an IntegrityError on load, because db_dump tries to dump or load the table twice: once for the model with the ManyToManyField, once for the through Model. The following patch fixes it for me.


Ответ полезен? Да0/Нет0

Looking quite tasty, I'll try this. But, what is the difference between 'db_dump' and already existing 'dumpdata' and 'loaddata' ?

Ответ полезен? Да0/Нет0

First, it's not limited in data dump, you can use it to migrate data from one database to another type database.

Second, you can use it to resolve model structure change.

Ответ полезен? Да0/Нет0

Why wouldn't you just use mysqldump?

Ответ полезен? Да0/Нет0

thanks thn, and I updated the source code. But there is somewhat different from your patch.

Ответ полезен? Да0/Нет0

limodou - thanks for the script! Here's a patch for the change lenducha suggested... the current method using count() doesn't work reliably because if a record in the middle of a sequence has been erased, a simple count will be off...

--- 14.py       2007-09-24 11:08:32.000000000 -0700
+++ db_dump.py  2007-09-24 11:10:36.000000000 -0700
@@ -423,11 +423,11 @@

# postgresql: reset sequence
 if settings.DATABASE_ENGINE in ('postgresql_psycopg2', 'postgresql'):
-        cursor.execute('SELECT count(*) + 1 FROM %s;' % quote_name(model._meta.db_table))
+        cursor.execute('SELECT max(id) + 1 FROM %s;' % quote_name(model._meta.db_table))
         nb = cursor.fetchall()[0][0]
-        seq = quote_name(model._meta.db_table + '_id_seq')
-        cursor.execute('ALTER SEQUENCE %s RESTART WITH %d;' % (seq, nb))
+        if nb:
+            seq = quote_name(model._meta.db_table + '_id_seq')
+            cursor.execute('ALTER SEQUENCE %s RESTART WITH %d;' % (seq, nb))

 def get_usage():
     usage = """

Ответ полезен? Да0/Нет0

Thanks, I'v changed it.

Ответ полезен? Да0/Нет0

DatebaseOperations() insertion also required on lines 124, 126, 235, 236.

Ответ полезен? Да0/Нет0

If you are following the Django svn version, lines 332 and 367 of db_dump.py will need amending twice in each line replacing:




Ответ полезен? Да0/Нет0

I'm familiar with postgres, so could write a patch for that? I'll merge it.

Ответ полезен? Да0/Нет0


I have problem loading data to postgres. Data are imported succesfully, but no pkey_seq is set. It will cause "duplicate primary key" exception when adding new item.

I think the problem is following: When pk is specified in insert, sequence is not incremented. The solution could be to setval('seq', MAX(pk)+1) after inserts (only if database postgres ?).

What do you think?

Ответ полезен? Да0/Нет0

Thanks, I'v changed it now.

Ответ полезен? Да0/Нет0

Regarding django changeset 5172, I think that this should fix the problem:

def get_model_many2many_stru(model):
    from django.contrib.contenttypes import generic

    opts = model._meta
    for f in opts.many_to_many:
        fields = []
        if isinstance(f.rel, generic.GenericRelation):
            yield f.m2m_db_table(), fields

Ответ полезен? Да0/Нет0

I simply use rows = cursor.fetchmany(100) to get the result, and directly dump the result by repr(). So maybe the boolean type processing is different between postgres and mysql or sqlite3.

Ответ полезен? Да0/Нет0

I edit file datadir/auth_user.py by hand and set fields corresponding to is_staff, is_active and is_superuser to True instead of 1. Then load to postgres with no errors.

Dumping from postgres gives True values in dump file.

Ответ полезен? Да0/Нет0

I didn't test it in postgres, and how to deal with boolean field in postgres?

Ответ полезен? Да0/Нет0

It seems that boolean fields dumped incorrectly. When dumping from mysql and loading into postgres database I saw an error:

Error sql: INSERT INTO "auth_user" ("id","username","first_name","last_name","email","password","is_staff","is_active","is_superuser","last_login","date_joined") VALUES (%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s); [1L, 'bogus', '', '', 'bogdanov@fr.ru', 'sha1$9fc62$50cf1c3ff7898b778dcb854647b393a62ec597a7', 1, 1, 1, '2007-04-19 13:48:02', '2007-03-23 18:29:20']
Traceback (most recent call last):
File "/home/bogus/projects/db_dump.py", line 228, in load_model
  cursor.execute(e_sql, sql_values)
File "/usr/lib/python2.4/site-packages/django/db/backends/util.py", line 12, in execute
  return self.cursor.execute(sql, params)
ProgrammingError: column "is_staff" is of type boolean but expression is of type integer
LINE 1: ...name","first_name","last_name","email","password","is_staff"...

Ответ полезен? Да0/Нет0

thanks. I'v seen it. And the design is different.

Ответ полезен? Да0/Нет0

  1. because I using python repr() to dump the python data type, so I think there is no problem for ''"&' string etc. And you can treat the output data file a real python source file. And you can even import it for future usage. I don't know how to add test case in django now, but I used it frequently and I think it's stable enough now. Buf I should said, I'm often working in sqlite, so maybe thers are some problems on others database.

Ответ полезен? Да0/Нет0

Hi limodou, thanks for replying.

  1. Beaut, I'll try that. I've seen something similar in code so I'll try it. I was using a 'real path'. Hence the problem.

  2. I dumped the data as you suggest to stdout and the "'" & '"' are both covered. I wonder if you can mix a single quote and a double quote into the string? I think I'll check this. One thing you can do with django is write test cases so I might add some tests to check these boundary cases. Let you know when I'm done.

Ответ полезен? Да0/Нет0

  1. for --settings, it should be a module format but not a real path. So you can set PYTHONPATH to your project first, then you can just dump without --settings parameter.

  2. because I using repr(r) to dump the data, so it'll convert to python string representation, and for "'" it'll convert to "\'", so I think it no error for db_dump.py. And you can try to dump the data with -s parameter, it'll output the dump result to screen. Or directly dump the data to directory, and open the data file to see if it's correct.

Ответ полезен? Да0/Нет0

this is one seriously cool script.

I was going to write a simple db dump & import tool for my dj projects extracting just some tables given the db but this takes the idea one step forward. Two questions I did have:

1) I tried "python db_dump.py --settings=/path/to/settings/ dump ". I could only get the script to work directly in the settings dir. Do you have any idea why?

2) I have lots of text with single quotes in it. In the stored version. How do you handle this case when storing the data in a list?

Hello there! It's certainly a hot today.

Would the result be stored as ...

[' Hello there! It's certainly a hot today.']

this will (should) fail for me.

Ответ полезен? Да0/Нет0

Man what a useful script!

Thanks for this limodou!

Ответ полезен? Да0/Нет0

You can add -v option to see the tedious output, and if you want to test the output data, you can add -s option. And after dump or load command, you can specify the apps, just like: user books, if there are more than one app, you should seperate them with blank. I'm very glad that you like it.

Ответ полезен? Да0/Нет0

I used this script to move my project from sqlite to MySQL.

My project was using sqlite. I ran 'python db_dump.py dump'; it didn't tell me much but I could see it created a new folder full of my data. I then modified settings.py to have details for my fresh MySQL database/user, ran './manage.py syncdb', then re-ran db_dump.py in 'load' mode. It perfectly copied all of my data into the new MySQL database and I'm now running at a much zippier speed.

Until now I wasn't aware of a simple way to move database platforms with Django. Good work, limodou!