Topic: How to make convert_to_utf8 way faster
I was just working on my phpbb2 to punbb1.3 script and even with my tiny database (only about 38k posts) it would fall over while trying to convert everything. In short, it'd hit the 32MB memory limit pretty quickly. Now I could just raise the limit, but in some places you don't have that control, and at what point do you stop raising it?
So ... I think these changes work but can someone please confirm against some proper non-UTF8 content?
Change these lines:
$str = preg_replace_callback('/&#([0-9]+);/', create_function('$s', 'return dcr2utf8($s[1]);'), $str);
$str = preg_replace_callback('/&#x([a-f0-9]+);/i', create_function('$s', 'return dcr2utf8(hexdec($s[1]));'), $str);
to:
$str = preg_replace_callback('/&#([0-9]+);/', 'callback1', $str);
$str = preg_replace_callback('/&#x([a-f0-9]+);/i', 'callback2', $str);
and add these two functions:
function callback1($matches) {
    return dcr2utf8($matches[1]);
}

function callback2($matches) {
    return dcr2utf8(hexdec($matches[1]));
}
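If anyone wants to sanity-check the callback approach without touching a real database, here is a minimal sketch. It assumes dcr2utf8() takes a decimal code point and returns the UTF-8 bytes for it. The dcr2utf8_standin() function below is a hypothetical placeholder I wrote for testing, not the converter's actual function, and the callback and wrapper names are made up so they don't clash with the real ones.

```php
<?php
// Hypothetical stand-in for dcr2utf8(): decimal code point -> UTF-8 bytes.
// Assumption: the real function behaves like this for valid code points.
function dcr2utf8_standin($cp) {
    $cp = (int) $cp;
    if ($cp < 0x80) {
        return chr($cp);
    }
    if ($cp < 0x800) {
        return chr(0xC0 | ($cp >> 6)) . chr(0x80 | ($cp & 0x3F));
    }
    if ($cp < 0x10000) {
        return chr(0xE0 | ($cp >> 12))
             . chr(0x80 | (($cp >> 6) & 0x3F))
             . chr(0x80 | ($cp & 0x3F));
    }
    return chr(0xF0 | ($cp >> 18))
         . chr(0x80 | (($cp >> 12) & 0x3F))
         . chr(0x80 | (($cp >> 6) & 0x3F))
         . chr(0x80 | ($cp & 0x3F));
}

// Same shape as callback1/callback2, but pointing at the stand-in.
function test_callback1($matches) {
    return dcr2utf8_standin($matches[1]);
}

function test_callback2($matches) {
    return dcr2utf8_standin(hexdec($matches[1]));
}

// The two replaced lines, wrapped in a hypothetical helper for testing.
function convert_entities_sketch($str) {
    $str = preg_replace_callback('/&#([0-9]+);/', 'test_callback1', $str);
    $str = preg_replace_callback('/&#x([a-f0-9]+);/i', 'test_callback2', $str);
    return $str;
}
```

Feeding it something like 'caf&#233; &#x20AC;' should give back "café €" as UTF-8 bytes. To test against real content, swap the stand-in for the actual dcr2utf8() and compare the output with what the old create_function() version produced.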
And now I can convert my whole phpbb2 database in one go (no stupid page refreshing) in about 30 seconds without going over 1MB of memory.
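Each call to create_function() compiles a brand-new function from a string, and those functions stay in memory until the script ends. The converter was doing that 152,000 times for the posts table alone. Defining a proper named function once means nothing new gets created per call, which makes things much better.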
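On PHP 5.3 or later, a closure built once and reused is another way to get the same effect. Also worth knowing: create_function() was deprecated in PHP 7.2 and removed in PHP 8.0, so the original lines won't run at all on current PHP. Here is a rough sketch of the closure version. The $dcr2utf8_standin closure is a hypothetical stand-in for dcr2utf8() (it just leans on html_entity_decode()), convert_row() is a made-up helper, and the memory_get_usage() calls are only there so you can watch whether memory grows on your own setup.

```php
<?php
// Hypothetical stand-in for dcr2utf8(); the real converter function is not shown here.
$dcr2utf8_standin = function ($cp) {
    return html_entity_decode('&#' . (int) $cp . ';', ENT_QUOTES, 'UTF-8');
};

// Closures are built ONCE, outside the loop, then reused for every row.
$decimal = function ($m) use ($dcr2utf8_standin) {
    return $dcr2utf8_standin($m[1]);
};
$hex = function ($m) use ($dcr2utf8_standin) {
    return $dcr2utf8_standin(hexdec($m[1]));
};

// Hypothetical helper: runs both replacements on one string.
function convert_row($str, $decimal, $hex) {
    $str = preg_replace_callback('/&#([0-9]+);/', $decimal, $str);
    return preg_replace_callback('/&#x([a-f0-9]+);/i', $hex, $str);
}

$before = memory_get_usage();
$last = '';
for ($i = 0; $i < 10000; $i++) {   // pretend each iteration is one post
    $last = convert_row('caf&#233; &#x20AC;', $decimal, $hex);
}
$grown = memory_get_usage() - $before;   // bytes gained across the loop
```

The point is only where the function gets created: once before the loop instead of once per call. Named functions as in the fix above work on older PHP too, so which one to use mostly depends on what version you're stuck with.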