Topic: Indexing is not Unicode-safe
Hello, a wonderful engine and wonderful people here :-)
I installed PunBB on my site and made a few small modifications to it (I needed to integrate it into our own login system, but thanks to PostgreSQL this was just a matter of creating a view and a couple of triggers). My question follows.
I am storing my database in UTF-8. I used a Russian localization, changed the locale definitions in it to ru_RU.UTF-8, and then converted the localization files themselves to UTF-8. The board was working OK, but when I created a new message (or a new topic) with more than one word in it, the engine gave me a "Cannot create search index" error (or something like it), pointing to line 127 of search_idx.php.
For now I have worked around the problem by reinstalling the Russian language packs and hacking the DBDriver of PunBB to set the client encoding of the database connection accordingly. However, there is "one more thing": all of the output from the site goes into a UTF-8 encoded XSL template. If I leave the forum engine in Win-1251, I am in for all kinds of problems when submitting forms (even if I convert the forum output afterwards with ob_get_contents()). I cannot use Win-1251 for the site itself (most XML processors do not support it, and the main part of the site is already UTF-8 anyway).
So the question is: how can I modify the regex at line 127 so that it is multibyte-safe? Or is it perhaps possible to overload the regex functions with their mbstring equivalents so that they become multibyte-safe automatically?
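To illustrate the kind of change being asked about, here is a minimal sketch of UTF-8-aware word splitting with PCRE. It is not PunBB's code: the function name split_words_utf8() is hypothetical, the actual pattern at line 127 of search_idx.php is not shown in this post, and the sketch assumes a PHP build whose PCRE has Unicode support (needed for the /u modifier and \p{...} classes) and that type declarations are available.

```php
<?php
// Hypothetical helper, not part of PunBB: split a UTF-8 message into
// candidate index words without cutting multibyte characters apart.
function split_words_utf8(string $text): array
{
    // The /u modifier makes PCRE treat both pattern and subject as UTF-8.
    // \p{L} matches any letter and \p{N} any digit; every other run of
    // characters is treated as a word separator.
    $words = preg_split('/[^\p{L}\p{N}]+/u', $text, -1, PREG_SPLIT_NO_EMPTY);

    // preg_split() returns false on failure, for example when the subject
    // is not valid UTF-8; fall back to an empty word list in that case.
    return $words === false ? [] : $words;
}

print_r(split_words_utf8('Привет, мир! PunBB 1.1.5'));
```

Note that the mbstring function overloading setting only covered the string, mail and ereg families, not the preg_* functions, so a PCRE pattern would most likely need the /u modifier added by hand rather than being fixed by overloading.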
Thanks in advance, and keep up the good work. PunBB certainly thrilled me :-)
I am running v. 1.1.5.