I think the problem comes from traffic limiting in other server software.

Why don't you look into your Apache (or other server) logs?

I have uploaded PunBB-1.3.4-Japaneze.zip with the following points fixed:

・"Subscription" ⇒ "Update notifications"
・"Post" ⇒ "Comment"
・Wording on the deletion and report-abuse admin pages
・"Online" and "Offline" were reversed :P so that is fixed

I have also added the following Japanese documentation:
How to build the search index with the Yahoo! Web API
How to integrate DokuWiki's user management into PunBB (on the official DokuWiki site)

Splitting a Japanese sentence into correct words is very difficult, so I used a web API provided by Yahoo! Japan.

http://developer.yahoo.co.jp/webapi/jlp … parse.html (in Japanese)
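
For the curious, here is a minimal sketch of the idea (not the exact code from the wiki page; the MAService V1 endpoint and response format are from the docs as I remember them, and YOUR_APP_ID is a placeholder, so verify them):

<?php
// Split a Japanese sentence into words with Yahoo! Japan's
// MAService morphological-analysis API. Needs allow_url_fopen.
function split_japanese_words($sentence)
{
    $url = 'http://jlp.yahooapis.jp/MAService/V1/parse'
         . '?appid=YOUR_APP_ID&results=ma'
         . '&sentence=' . urlencode($sentence);

    $response = file_get_contents($url);
    if ($response === false)
        return array();

    // Each analyzed word comes back in a <surface> element.
    preg_match_all('#<surface>(.+?)</surface>#u', $response, $matches);
    return $matches[1];
}
?>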

I've just written a wiki page to solve the problem of searching in Japanese.

http://punbb.informer.com/wiki/searchin … r_japanese


P.S. Please add a 'ja' option to 'translate this page' on the wiki!!

The Japanese language pack for PunBB 1.3.4 is out.
Please check it and try it out.

To those of you who use Japanese: please check it.

http://punbb.informer.com/wiki/punbb13/language_packs

Erratum: strlen() does not take an encoding argument, so the check should be:

$mbAdd = (mb_strlen($cur_word,'UTF-8')==strlen($cur_word)) ?
  '' : '%';
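
A quick demonstration of why the check works, assuming UTF-8 input: an ASCII word has as many bytes as characters, while a CJK word has more.

<?php
// Byte length vs. character length of a search word (UTF-8):
foreach (array('car', '数据库') as $cur_word)
    echo $cur_word.': chars='.mb_strlen($cur_word, 'UTF-8')
        .', bytes='.strlen($cur_word)."\n";
// car: chars=3, bytes=3    -> equal, so $mbAdd = ''
// 数据库: chars=3, bytes=9 -> differ, so $mbAdd = '%'
?>
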
Anatoly wrote:
vankon wrote:

Most languages use a single byte per character, but Chinese uses two bytes per character (word); in UTF-8, most Chinese words are 2 bytes, and others are 3 or 4 bytes.

This is a problem!

This should be fine. Russian (my native language) is 2 bytes per character in UTF-8 too.

vankon wrote:

str_replace('*', '%', $cur_word)

This replaces * with % in the SQL query. You should search for *数据库* if you want to search for *database*, and for 数据库 if you want just database; these may give two different results. Using %word% every time is not good, as I would find carpet, scare, and careful instead of just the car I was really looking for.
Am I missing something?

Yeah, this is not a multibyte problem. It's a matter of whether a sentence is split by spaces or not. Neither Chinese nor Japanese text puts whitespace between words, so explode(' ', $keywords) at line 76 does nothing...
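
For example, a hypothetical two-liner to show the point:

<?php
print_r(explode(' ', 'search in japanese'));
// -> Array ( [0] => search [1] => in [2] => japanese )
print_r(explode(' ', '日本語で検索'));
// -> Array ( [0] => 日本語で検索 )  -- no spaces, so no split
?>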

I added some code to prepend and append '%' only when the search word $cur_word is multibyte. This avoids the unnecessary burden of a '%' wildcard around every word, and it also fixes the car/carpet/careful problem for non-multibyte strings.


At line 99: if $cur_word is a multibyte string, set $mbAdd to '%'.

$mbAdd = 
  (mb_strlen($cur_word,'UTF-8')==strlen($cur_word,'UTF-8')) ?
  '' : '%';

At line 108 (109): add $mbAdd as a prefix and suffix to $cur_word.

'WHERE' => 'w.word LIKE \''.$mbAdd.$forum_db->escape(
  str_replace('*', '%', $cur_word)).$mbAdd.'\''
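
Put together, here is a standalone demo of the resulting patterns (using the corrected strlen() check from the erratum; $forum_db->escape() omitted):

<?php
foreach (array('car', '数据库') as $cur_word)
{
    $mbAdd = (mb_strlen($cur_word, 'UTF-8') == strlen($cur_word)) ?
        '' : '%';
    echo 'w.word LIKE \''.$mbAdd
        .str_replace('*', '%', $cur_word).$mbAdd.'\''."\n";
}
// w.word LIKE 'car'       -- exact match keeps car/carpet apart
// w.word LIKE '%数据库%'  -- substring match for unsplit CJK text
?>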

Could somebody who uses Chinese or Japanese try this and test it?