I think the problem comes from traffic limiting in other server software.

Why don't you look into your Apache (or other server) logs?

I have uploaded PunBB-1.3.4-Japaneze.zip with the following points fixed:

・"Subscription" ⇒ "Update notifications"
・"Post" ⇒ "Comment"
・Wording on the deletion and report-abuse admin pages
・"Online" and "Offline" were reversed :P so that is fixed

I have also added the following Japanese documentation:
How to build the search index with the Yahoo! Web API
How to integrate DokuWiki's user management into PunBB (on the official DokuWiki site)

Splitting a Japanese sentence into correct words is very difficult, so I used a web API provided by Yahoo! Japan.

http://developer.yahoo.co.jp/webapi/jlp … parse.html (in Japanese)
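
For the curious, here is a minimal sketch of the idea (not the exact code from the wiki page; the MAService V1 endpoint and response format are from the docs as I remember them, and YOUR_APP_ID is a placeholder, so verify them):

<?php
// Split a Japanese sentence into words with Yahoo! Japan's
// MAService morphological-analysis API. Needs allow_url_fopen.
function split_japanese_words($sentence)
{
    $url = 'http://jlp.yahooapis.jp/MAService/V1/parse'
         . '?appid=YOUR_APP_ID&results=ma'
         . '&sentence=' . urlencode($sentence);

    $response = file_get_contents($url);
    if ($response === false)
        return array();

    // Each analyzed word comes back in a <surface> element.
    preg_match_all('#<surface>(.+?)</surface>#u', $response, $matches);
    return $matches[1];
}
?>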

I've just written a wiki page to solve the problem of searching in Japanese.

http://punbb.informer.com/wiki/searchin … r_japanese


P.S. Please add a 'ja' option to 'translate this page' on the wiki!!

The Japanese language pack for PunBB 1.3.4 is out.
Please check it and try it out.

To those of you who use Japanese: please check it.

http://punbb.informer.com/wiki/punbb13/language_packs

Erratum: strlen() does not take an encoding argument, so the check should be:

$mbAdd = (mb_strlen($cur_word,'UTF-8')==strlen($cur_word)) ?
  '' : '%';
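
A quick demonstration of why the check works, assuming UTF-8 input: an ASCII word has as many bytes as characters, while a CJK word has more.

<?php
// Byte length vs. character length of a search word (UTF-8):
foreach (array('car', '数据库') as $cur_word)
    echo $cur_word.': chars='.mb_strlen($cur_word, 'UTF-8')
        .', bytes='.strlen($cur_word)."\n";
// car: chars=3, bytes=3    -> equal, so $mbAdd = ''
// 数据库: chars=3, bytes=9 -> differ, so $mbAdd = '%'
?>
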
Anatoly wrote:
vankon wrote:

Most languages use a single byte per character, but Chinese uses two bytes per character (word); in UTF-8, most Chinese words are 2 bytes, and others are 3 or 4 bytes.

This is a problem!

This should be fine. Russian (my native language) is 2 bytes per character in UTF-8 too.

vankon wrote:

str_replace('*', '%', $cur_word)

This replaces * with % in the SQL query. You should search for *数据库* if you want to search for *database*, and for 数据库 if you want just database; these may give two different results. Using %word% every time is not good, as I would find carpet, scare, and careful instead of just the car I was really looking for.
Am I missing something?

Yeah, this is not a multibyte problem. It's a matter of whether a sentence is split by spaces or not. Neither Chinese nor Japanese text puts whitespace between words, so explode(' ', $keywords) at line 76 does nothing...
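
For example, a hypothetical two-liner to show the point:

<?php
print_r(explode(' ', 'search in japanese'));
// -> Array ( [0] => search [1] => in [2] => japanese )
print_r(explode(' ', '日本語で検索'));
// -> Array ( [0] => 日本語で検索 )  -- no spaces, so no split
?>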

I added some code to prepend and append '%' only when the search word $cur_word is multibyte. This avoids the unnecessary burden of a '%' wildcard around every word, and it also fixes the car/carpet/careful problem for non-multibyte strings.


At line 99: if $cur_word is a multibyte string, set $mbAdd to '%'.

$mbAdd = 
  (mb_strlen($cur_word,'UTF-8')==strlen($cur_word,'UTF-8')) ?
  '' : '%';

At line 108 (109): add $mbAdd as a prefix and suffix to $cur_word.

'WHERE' => 'w.word LIKE \''.$mbAdd.$forum_db->escape(
  str_replace('*', '%', $cur_word)).$mbAdd.'\''
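
Put together, here is a standalone demo of the resulting patterns (using the corrected strlen() check from the erratum; $forum_db->escape() omitted):

<?php
foreach (array('car', '数据库') as $cur_word)
{
    $mbAdd = (mb_strlen($cur_word, 'UTF-8') == strlen($cur_word)) ?
        '' : '%';
    echo 'w.word LIKE \''.$mbAdd
        .str_replace('*', '%', $cur_word).$mbAdd.'\''."\n";
}
// w.word LIKE 'car'       -- exact match keeps car/carpet apart
// w.word LIKE '%数据库%'  -- substring match for unsplit CJK text
?>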

Could somebody who uses Chinese or Japanese try this and test it?