I searched the web today. Searching Chinese text is a hard problem, because segmenting Chinese into words is not easy: unlike languages written in the Latin alphabet, Chinese does not separate words with spaces. So it is normal for PunBB and other forums to give poor results for Chinese searches. The Chinese search in most current forums is just a simple %keyword% match on the keyword. PunBB's word extraction does not work for Chinese: it can only extract a large number of whole sentences, and the search results are not accurate.

If anyone knows both Chinese and English, please help translate my Chinese into English. Machine translation always gives poor results.
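
To make the segmentation problem concrete, here is a minimal sketch in PHP. It is not PunBB's code: the function names split_on_whitespace and cjk_bigrams are made up for illustration, and the file is assumed to be saved as UTF-8. It shows that splitting on whitespace leaves a Chinese phrase in one piece, and that overlapping two-character chunks (bigrams), one common workaround, produce smaller index terms.

```php
<?php
// Hypothetical helpers, not part of PunBB. File assumed to be UTF-8.

// Split text the way a space-delimited language is usually split.
function split_on_whitespace(string $text): array
{
    return preg_split('/\s+/u', trim($text), -1, PREG_SPLIT_NO_EMPTY);
}

// Break a run of Chinese text into overlapping two-character chunks.
function cjk_bigrams(string $text): array
{
    // With the /u modifier an empty pattern splits between characters,
    // not between bytes.
    $chars = preg_split('//u', $text, -1, PREG_SPLIT_NO_EMPTY);
    $bigrams = [];
    for ($i = 0; $i + 1 < count($chars); $i++) {
        $bigrams[] = $chars[$i] . $chars[$i + 1];
    }
    return $bigrams;
}

$english = 'search the database';
$chinese = '搜索数据库'; // "search the database"

print_r(split_on_whitespace($english)); // three separate words
print_r(split_on_whitespace($chinese)); // one element: the whole phrase
print_r(cjk_bigrams($chinese));         // 搜索, 索数, 数据, 据库
```

Bigrams are only one possible approach; a dictionary-based segmenter would give cleaner terms at the cost of needing a word list.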

BTW: I think it would be even better if BBCode were customizable,
like phpBB's BBCode feature.

Anatoly wrote:
vankon wrote:

Paragraph alignment is very important for non-English languages.

Could you explain in detail, please?

The BBCode below:

[align=left]some text[/align]
[align=center][img]imgurl[/img][/align]
[align=right][url]url[/url][/align]
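
A minimal sketch of how such an alignment tag could be turned into HTML. This is not PunBB's parser: render_align is a hypothetical function, and the input is assumed to be HTML-escaped already.

```php
<?php
// Hypothetical converter, not PunBB's parser: turns [align=...] into a <div>.
// $text is assumed to have been HTML-escaped before this step.
function render_align(string $text): string
{
    // Only the three whitelisted values are accepted, so nothing else from
    // the post can end up inside the style attribute.
    return preg_replace(
        '#\[align=(left|center|right)\](.*?)\[/align\]#is',
        '<div style="text-align: $1">$2</div>',
        $text
    );
}

echo render_align('[align=center]some text[/align]'), "\n";
// <div style="text-align: center">some text</div>
```

Unknown values such as [align=top] are left untouched rather than rendered.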

...

Anatoly wrote:

That is not in the spirit of PunBB. E.g. I see no use for these features in this forum. And with extensions you will always be able to add these features with one click.

I understand your point.

4

(8 replies, posted in PunBB 1.2 show off)

maybe a little. smile

5

(16 replies, posted in Discussions)

Usually, in China, open source does not by itself mean commercial or non-commercial. The software is free for non-commercial users, and commercial users must pay.

The GPL allows you to use the software freely, but support is not free. So if you hit a problem that nobody supports, you have to solve it yourself.

smile

And as a standard feature, rather than as an extension.

I suggest enhancing BBCode with features such as paragraph alignment (left, center, right), Flash, MP3 (music), video (MMS / RTMP / AVI / RM / RMVB), and other functions.

Paragraph alignment is very important for non-English languages.

Multimedia support would enhance PunBB, and because the media is loaded from external URLs, it would not take up any of the forum's resources.

This statement is too hard for me to write, so it was translated by Google. I hope you can understand it.

Thank you for answering my questions.

In China, people usually search with a plain keyword and do not use *, and many people use a lot of smilies.
This is a difference in habits.

continue...

When I search with the keyword '*数据库*', it shows 10 records, but my SQL statement returns 12 records.
I don't know why the search results differ.

I don't know how PunBB searches for the keyword.

My SQL statement is:

/* start */
/* tables are prefixed with f_ */
SELECT COUNT(id) FROM f_topics WHERE id IN (
    SELECT id FROM f_topics WHERE subject LIKE '%数据库%'
    UNION
    SELECT topic_id AS id FROM f_posts WHERE message LIKE '%数据库%'
)
/* end */

What does your SQL statement look like?

I look forward to your reply. (This sentence was translated by Google; I understand only a little English.)

The most accurate search statement is the following, but its performance is not good!

SELECT * FROM f_topics WHERE id IN (
    SELECT id FROM f_topics WHERE subject LIKE '%keywords%'
    UNION
    SELECT topic_id AS id FROM f_posts WHERE message LIKE '%keywords%'
)
ORDER BY id;

Another thing: smilies do not work unless there is a blank before the smiley code.

quote

smile:):):)

code

:):):):)

quote

smile smile smile smile

code

:) :) :) :)
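
The behaviour above is what one would see if the smiley replacement only fires after whitespace. The sketch below is a hypothetical illustration of that rule, not PunBB's actual smiley code; smilies_strict, smilies_relaxed and the [smile] placeholder (standing in for the image) are made-up names.

```php
<?php
// Hypothetical illustration, not PunBB's actual smiley code.

// Strict: ":)" must be preceded by whitespace or the start of the text.
function smilies_strict(string $text): string
{
    return preg_replace('/(?<=\s|^):\)/', '[smile]', $text);
}

// Relaxed: every ":)" is converted, wherever it appears.
function smilies_relaxed(string $text): string
{
    return str_replace(':)', '[smile]', $text);
}

echo smilies_strict(':):):):)'), "\n";    // [smile]:):):)
echo smilies_strict(':) :) :) :)'), "\n"; // [smile] [smile] [smile] [smile]
echo smilies_relaxed(':):):):)'), "\n";   // [smile][smile][smile][smile]
```

The relaxed version matches the habit described here, but it would also convert a ":)" that is just punctuation, for example at the end of "(see the note:)", which may be why a forum would prefer the strict rule.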

I don't know how the search engine works. smile:):)

PunBB version 1.3 final

The search engine works very well for English, but not for Chinese.
It only returns results for English, and none for Chinese!

I debugged the code and found this:

include/search_functions.php
@108, the 'WHERE' clause

$query = array(
    'SELECT' => 'm.post_id',
    'FROM'   => 'search_words AS w',
    'JOINS'  => array(
        array(
            'INNER JOIN' => 'search_matches AS m',
            'ON'         => 'm.word_id=w.id'
        )
    ),
    'WHERE'  => 'w.word LIKE \''.$forum_db->escape(str_replace('*', '%', $cur_word)).'\''
);

The resulting query looks like this:

SELECT m.post_id FROM search_words AS w INNER JOIN search_matches AS m ON m.word_id=w.id WHERE w.word LIKE '数据库'

My debug code is:

echo 'search_functions.php@116@query=SELECT '.$query['SELECT'].' FROM '.$query['FROM'].' INNER JOIN '.'search_matches AS m'. ' ON '.'m.word_id=w.id'.' WHERE '.$query['WHERE'];

The URL is http://cilinux.cn/forum2/search.php?act … C%E7%B4%A2

The keyword is '数据库', which means 'database' in English.

The final SQL statement is missing the '%' around the keyword.

Maybe this bug affects all non-English languages.

The code

'WHERE'        => 'w.word LIKE \''.$forum_db->escape(str_replace('*', '%', $cur_word)).'\''

should be changed to

'WHERE'        => 'w.word LIKE \'%'.$forum_db->escape(str_replace('*', '%', $cur_word)).'%\''

Then the search works for the keyword '数据库',

but other keywords still return no results.
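
To show why the extra '%' matters, here is a minimal sketch that emulates SQL LIKE in plain PHP. Everything in it is hypothetical: like_match, build_pattern and the sample word list are not PunBB code, only '%' is emulated (not '_' or case folding), and the file is assumed to be UTF-8.

```php
<?php
// Hypothetical sketch: emulate SQL LIKE in PHP to compare the two patterns.

// Translate a LIKE pattern ("%" = any run of characters) into a regex test.
function like_match(string $pattern, string $value): bool
{
    $parts = array_map(
        function (string $part): string { return preg_quote($part, '/'); },
        explode('%', $pattern)
    );
    return preg_match('/^' . implode('.*', $parts) . '$/us', $value) === 1;
}

// Build the pattern the way the WHERE clause does: "*" becomes "%".
function build_pattern(string $word, bool $wrap): string
{
    $pattern = str_replace('*', '%', $word);
    return $wrap ? '%' . $pattern . '%' : $pattern;
}

// Without word segmentation, an indexed "word" can be a whole phrase.
$indexed = ['database', '数据库', '关系数据库的设计'];

foreach ([false, true] as $wrap) {
    $pattern = build_pattern('数据库', $wrap);
    $hits = array_filter($indexed, function (string $w) use ($pattern): bool {
        return like_match($pattern, $w);
    });
    echo $pattern, ' => ', implode(', ', $hits), "\n";
}
// 数据库 => 数据库
// %数据库% => 数据库, 关系数据库的设计
```

Without the wildcards only an index entry that is exactly the keyword can match, which would be rare if whole phrases were stored as single "words". A leading '%' also tends to prevent the database from using an index, so the wrapped form is likely to be slower.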

Most Western languages use a single byte per character, but Chinese characters are multi-byte.
In UTF-8, most Chinese characters take 3 bytes, and some rarer ones take 4 bytes.

This is a problem!

And so on.
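
The byte counts can be checked directly. A minimal sketch, assuming the file is saved as UTF-8 and PHP 7 or later (for the "\u{...}" escape); char_count is a hypothetical helper.

```php
<?php
// Assumes this file is saved as UTF-8. char_count() is a hypothetical helper.

// Count characters (code points) rather than bytes.
function char_count(string $text): int
{
    // With /u the dot matches one whole UTF-8 character.
    return preg_match_all('/./us', $text);
}

$word = '数据库'; // "database"

echo strlen('abc'), "\n";        // 3 bytes for 3 ASCII characters
echo strlen($word), "\n";        // 9 bytes: 3 bytes per character
echo char_count($word), "\n";    // 3 characters
echo strlen("\u{20000}"), "\n";  // 4 bytes: a rarer character (U+20000)
```

Any code that cuts or compares strings byte by byte can therefore split a Chinese character in the middle.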

I think supporting [flash=width,height]flash_url[/flash] is a good idea.
This BBCode would allow Flash video to be embedded, like http://www.phpbbchina.com/forum/viewtop … amp;t=1003 or http://linux.chinaunix.net/bbs/viewthre … a=page%3D1

An example definition:

[flash={width},{height}]{url}[/flash]

<object classid="clsid:D27CDB6E-AE6D-11CF-96B8-444553540000" codebase="http://active.macromedia.com/flash2/cabs/swflash.cab#version=5,0,0,0" width="{width}" height="{height}">
  <param name="movie" value="{url}" />
  <param name="play" value="true" />
  <param name="loop" value="false" />
  <param name="quality" value="high" />
  <param name="allowScriptAccess" value="never" />
  <param name="allowNetworking" value="internal" />
  <embed src="{url}" type="application/x-shockwave-flash" pluginspage="http://www.macromedia.com/shockwave/download/index.cgi?P1_Prod_Version=ShockwaveFlash" width="{width}" height="{height}" play="true" loop="false" quality="high" allowscriptaccess="never" allownetworking="internal"></embed>
</object>
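
A minimal sketch of how such an admin-defined tag could be expanded. It is neither PunBB nor phpBB code: render_flash is a hypothetical function, the template is shortened, and example.com is a placeholder URL.

```php
<?php
// Hypothetical sketch of an admin-defined BBCode; not PunBB or phpBB code.

// Expand [flash=width,height]url[/flash] using the given HTML template.
function render_flash(string $text, string $template): string
{
    return preg_replace_callback(
        // Width and height must be digits; the URL must be http(s) and may
        // not contain quotes, brackets, or whitespace.
        '#\[flash=(\d{1,4}),(\d{1,4})\](https?://[^\s\[\]"<>]+)\[/flash\]#i',
        function (array $m) use ($template): string {
            return strtr($template, [
                '{width}'  => $m[1],
                '{height}' => $m[2],
                '{url}'    => htmlspecialchars($m[3], ENT_QUOTES, 'UTF-8'),
            ]);
        },
        $text
    );
}

// Shortened template; the full <object>/<embed> markup would go here.
$template = '<embed src="{url}" type="application/x-shockwave-flash" '
          . 'width="{width}" height="{height}" allowscriptaccess="never">';

echo render_flash('[flash=400,300]http://example.com/movie.swf[/flash]', $template), "\n";
// <embed src="http://example.com/movie.swf" type="application/x-shockwave-flash" width="400" height="300" allowscriptaccess="never">
```

Validating each placeholder before substitution is the important part: a template that accepts arbitrary text for {url} or {width} would let a post inject its own HTML.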

This would let administrators add new BBCode themselves.

PunBB is much faster than phpBB. Thank you!

Please excuse my poor English.