Converting a MySQL version 3.23 database to version 5.0
You’d think there might be a one-step approach to this, as it would be a common need, but because we’re going from a database version that didn’t know about character encodings at all, its contents could be in any old encoding. These steps assume that the database was used to store Latin1 (ISO 8859-1) characters before. In the examples below, we’re going to convert a database called “libris2″ (because that’s a real one I had to convert – these steps are tried and tested!)
1. Dump the old database
The standard way of dumping a database from MySQL is
mysqldump --opt libris2 > libris2.sql
That dumps all the tables from a database and all the data in a form designed for efficient insertion into a new database, with very long INSERT statements. However, I want to be able to check the conversions that I’m about to do, so I’ll dump the database slightly differently:
mysqldump --add-drop-table --add-locks --all
--quick --lock-tables libris2 > libris2.sql
Here, I’ve used all the options that “–opt” would give me, with the exception of “–extended-insert”. Now, I have the complete database in a text file, libris2.sql.
Note: Perform this dump on the machine running the old MySQL server, so that you get an old (matching) version of mysqldump. If mysqldump doesn’t match the server version, you may not get any output at all. I have been in the situation where my database was automatically migrated from an old server to a newer version, and convincing mysqldump to ignore character encoding details in the dump on newer versions is a pain. Migrate your data while you still have the old server running.
2. Convert the character encodings
If I’m sure of the encoding of data in the database, I can do this:
iconv -f latin1 -t utf8 < libris2.sql > libris-utf8.sql
Another way of doing this is to edit the file in vim, which will automatically convert the character encoding of a file into one appropriate for the shell its running in. For example, if I do
vim libris2.sql
vim will open the file and display a status line saying:
"libris2.sql" [converted] 234L, 733981C
showing that the encoding of the file wasn’t already UTF-8. If I type the command “:set fileencoding” (or “:set fenc”), vim will report
fileencoding=latin1
If I wanted to, I could get vim to reencode the file for me:
:set fenc=utf8 :w libris2-utf8.sql :q
Although using vim takes a bit longer because it’s interactive, vim will at least make a good guess at the original encoding, in case I couldn’t remember.
At this point, I like to see what conversions have been made, which is why I didn’t want super-long lines in the database dump:
diff libris2.sql libris2-utf8 | more
As my shell uses UTF-8, some characters in the original file will appear as junk, but they should appear correctly in the output file. If I’ve made any mistakes in guessing the original encoding when I used iconv, I can have another go.
3. Update SQL definitions
For some reason, mysqldump may produce table definitions that are invalid in later versions of MySQL if you are using auto incrementing key fields. If mysqldump produces a table definition that looks like this:
CREATE TABLE DIRECTORS ( DIRECTORS.id int(11) DEFAULT '0' NOT NULL auto_increment, PRIMARY KEY (id) );
then the default clause will need to be filtered before importing into MySQL 5.0. I do this for all table definitions in one go with a simple sed script:
sed -e "/auto_increment/ s/DEFAULT '0'//" database.sql > database-filt.sql
4. Import the database dump
Now we have a database dump with the correct encoding, so it only remains to import it to MySQL 5.0 correctly. If we were re-importing the dump to the same version of MySQL, that would be as simple as:
mysql libris2 < libris2.sql
but we need to make sure that the default encoding of the new database is correct, and we need to set the encoding used for our connection to the database. To do that, I’m going to create a small file containing some extra commands that I’ll place in front of the commands in the database dump. Create a file called libris2-header containing these lines:
set names utf8; drop database if exists libris2; create database libris2 character set utf8; use libris2;
The beauty of putting these lines in a separate file, rather than adding them to the top of the database dump, is that I can re-use them if I need to dump the database from the old server again.
Now, I use my little header file and the converted database dump to create a new database on my MySQL 5.0 server:
cat libris2-header libris2-utf8.sql | mysql
Note that I haven’t specified the database name as an option to mysql: that’s because it doesn’t exist yet.
Thanks for an excellent summary ! Saved me a whole heap of time.
I have a project I am working on for a new client. They have an 3.23 db they had developed in ’03 that they never used. The development company that created the system is out of business and my client wants to use the old system as the basis for our new project. Unfortunately we do not have a 3.23 server we can access to do the conversion. Is this something you’d be interested in doing for us? Please email me asap to discuss. Thanks.
Followed your instructions – very helpful – thank you!
Two differences for me:
1: Didn’t need the sed script to update my lines with auto_increment.
2: Did need to strip off the first block of comments at the top of my SQL – think it was choking on a long line of dashes. Removed that block and it loaded fine.
Here is my write up of me steps (with credit to your post, of course!)
http://craiccomputing.blogspot.com/2009/06/migrating-mysql-323-database-to-50.html
Thanks again.
Hi flo, thanks for the tutorial. I am trying to import an old WordPress blog (mysql 3.23) into a newer version (mysql 5.0).
I agree with Rob above. I didn’t need the sed command, but I had to remove the first block of comments.
In my case, the dump was already in utf8, so I didn’t need to convert the encoding either.
I get the following error using these steps:
ERROR 1064 (42000) at line 361999: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near ‘order int(11) NOT NULL default ’0′,
PRIMARY KEY (id)
) TYPE=MyISAM’ at line 6
I even rand the sed command.
Any suggestions?