<?xml version="1.0" encoding="utf-8"?>
<rss version="2.0">
	<channel>
		<title><![CDATA[PunBB Forums - The search engine in Chinese (or non-English)]]></title>
		<link>http://punbb.informer.com/forums/topic/20308/the-search-enginee-in-chinese-or-non-english/</link>
		<description><![CDATA[The most recent posts in The search engine in Chinese (or non-English).]]></description>
		<lastBuildDate>Fri, 26 Jun 2009 07:25:22 +0000</lastBuildDate>
		<generator>PunBB</generator>
		<item>
			<title><![CDATA[Re: The search engine in Chinese (or non-English)]]></title>
			<link>http://punbb.informer.com/forums/post/128687/#p128687</link>
			<description><![CDATA[<div class="quotebox"><cite>vankon wrote:</cite><blockquote><p>PunBB version 1.3 final.</p><p>The search engine works very well for English but not for Chinese.<br />It returns results only for English keywords and none for Chinese!</p><p>I debugged the code and found the following:</p><p>include/search_functions.php<br />@108, the &#039;WHERE&#039; clause<br /></p><div class="quotebox"><blockquote><p>$query = array(<br />&nbsp; &nbsp; &#039;SELECT&#039;&nbsp; &nbsp; =&gt; &#039;m.post_id&#039;,<br />&nbsp; &nbsp; &#039;FROM&#039;&nbsp; &nbsp; &nbsp; &nbsp; =&gt; &#039;search_words AS w&#039;,<br />&nbsp; &nbsp; &#039;JOINS&#039;&nbsp; &nbsp; &nbsp; &nbsp; =&gt; array(<br />&nbsp; &nbsp; &nbsp; &nbsp; array(<br />&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &#039;INNER JOIN&#039;&nbsp; &nbsp; =&gt; &#039;search_matches AS m&#039;,<br />&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &#039;ON&#039;&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; =&gt; &#039;m.word_id=w.id&#039;<br />&nbsp; &nbsp; &nbsp; &nbsp; )<br />&nbsp; &nbsp; ),<br />&nbsp; &nbsp; &#039;WHERE&#039;&nbsp; &nbsp; &nbsp; &nbsp; =&gt; &#039;w.word LIKE \&#039;&#039;.$forum_db-&gt;escape(str_replace(&#039;*&#039;, &#039;%&#039;, $cur_word)).&#039;\&#039;&#039;<br />);</p></blockquote></div><p>The resulting $query looks like this:</p><p>SELECT m.post_id FROM search_words AS w INNER JOIN search_matches AS m ON m.word_id=w.id WHERE w.word LIKE &#039;数据库&#039;</p><p>My debug code is:</p><div class="quotebox"><blockquote><p>echo &#039;search_functions.php@116@query=SELECT &#039;.$query[&#039;SELECT&#039;].&#039; FROM &#039;.$query[&#039;FROM&#039;].&#039; INNER JOIN &#039;.&#039;search_matches AS m&#039;. &#039; ON &#039;.&#039;m.word_id=w.id&#039;.&#039; WHERE &#039;.$query[&#039;WHERE&#039;];</p></blockquote></div><p>The URL is <a href="http://cilinux.cn/forum2/search.php?action=search&amp;keywords=%E6%95%B0%E6%8D%AE%E5%BA%93&amp;author=&amp;sort_by=0&amp;sort_dir=DESC&amp;show_as=topics&amp;search=%E6%90%9C%E7%B4%A2">http://cilinux.cn/forum2/search.php?act &hellip; C%E7%B4%A2</a></p><p>The keyword is &#039;数据库&#039;, which means &#039;database&#039; in English.</p><p>The final SQL statement is missing &#039;%&#039; around the keyword.</p><p>This bug may affect all non-English languages.</p><p>I changed the code</p><div class="quotebox"><blockquote><p>&#039;WHERE&#039;&nbsp; &nbsp; &nbsp; &nbsp; =&gt; &#039;w.word LIKE \&#039;&#039;.$forum_db-&gt;escape(str_replace(&#039;*&#039;, &#039;%&#039;, $cur_word)).&#039;\&#039;&#039;</p></blockquote></div><p>to</p><div class="quotebox"><blockquote><p>&#039;WHERE&#039;&nbsp; &nbsp; &nbsp; &nbsp; =&gt; &#039;w.word LIKE \&#039;%&#039;.$forum_db-&gt;escape(str_replace(&#039;*&#039;, &#039;%&#039;, $cur_word)).&#039;%\&#039;&#039;</p></blockquote></div><p>After that, the keyword &#039;数据库&#039; works.</p><p>But other keywords still return no results.</p><p>Most languages use a single byte per character, but Chinese uses two bytes per character (word);<br />in UTF-8, most Chinese characters take 2 bytes, and others take 3 or 4 bytes.</p><p>This is a problem!</p><p>And so on.</p></blockquote></div><p>This is great... thanks for sharing.</p>]]></description>
			<author><![CDATA[dummy@example.com (qzstudio)]]></author>
			<pubDate>Fri, 26 Jun 2009 07:25:22 +0000</pubDate>
			<guid>http://punbb.informer.com/forums/post/128687/#p128687</guid>
		</item>
		<item>
			<title><![CDATA[Re: The search engine in Chinese (or non-English)]]></title>
			<link>http://punbb.informer.com/forums/post/119808/#p119808</link>
			<description><![CDATA[<p>I searched online today. Searching Chinese text is a hard problem: segmenting Chinese into words is not easy because, unlike Latin-script languages, Chinese does not separate words with spaces. So it is normal for PunBB and other forums to give poor results for Chinese searches. Current forum software handles Chinese search with nothing more than a simple %keyword% match, while PunBB's word extraction does not work for Chinese: it can only extract large numbers of whole sentences, and the search results are inaccurate.</p><p>Would someone who knows both Chinese and English please translate the original Chinese of this post into English? Machine translation always gives poor results.</p>]]></description>
			<author><![CDATA[dummy@example.com (vankon)]]></author>
			<pubDate>Sat, 22 Nov 2008 10:25:00 +0000</pubDate>
			<guid>http://punbb.informer.com/forums/post/119808/#p119808</guid>
		</item>
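		<!--
The word-extraction problem described in the post above can be illustrated with a small Python sketch. It assumes a deliberately simplified splitter that breaks text on whitespace only; the helper name split_on_whitespace is hypothetical and this is not PunBB's actual indexing code.

```python
import re


def split_on_whitespace(text):
    """Naive word extraction: split on runs of whitespace (hypothetical helper)."""
    return [word for word in re.split(r"\s+", text.strip()) if word]


english = "the database search works"
chinese = "数据库搜索没有结果"  # roughly: "database search returns no results"

english_words = split_on_whitespace(english)
chinese_words = split_on_whitespace(chinese)

# English yields one entry per word; the Chinese sentence stays a single entry,
# so an exact match on the keyword 数据库 cannot find it in this index.
print(english_words)
print(chinese_words)
```

Under this assumption the Chinese text is indexed as one long "word", which is consistent with the poster's remark that only whole sentences get extracted.
		-->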
		<item>
			<title><![CDATA[Re: The search engine in Chinese (or non-English)]]></title>
			<link>http://punbb.informer.com/forums/post/119612/#p119612</link>
			<description><![CDATA[<p>Thank you for answering my questions.</p><p>In China, people usually search with a plain keyword and do not use *, and many people use lots of smilies.<br />This is a difference in habits.</p><p>To continue:</p><p>When I search for the keyword &#039;*数据库*&#039;, it shows 10 records, but my own SQL statement returns 12 records.<br />I don&#039;t know why the search results differ.</p><p>I don&#039;t know how PunBB searches for the keyword.</p><p>My SQL statement is:</p><p>/*start*/<br />/*prefix with f_ */<br />select count(id) from f_topics where id in<br />(select id from f_topics where subject like &#039;%数据库%&#039;<br />union<br />select topic_id id from f_posts where message like &#039;%数据库%&#039;)<br />/*end*/</p><p>What does your SQL statement look like?</p><p>I look forward to your reply. (&lt;-- This sentence is a translation provided by Google; I understand little English.)</p>]]></description>
			<author><![CDATA[dummy@example.com (vankon)]]></author>
			<pubDate>Wed, 19 Nov 2008 10:24:47 +0000</pubDate>
			<guid>http://punbb.informer.com/forums/post/119612/#p119612</guid>
		</item>
		<item>
			<title><![CDATA[Re: The search engine in Chinese (or non-English)]]></title>
			<link>http://punbb.informer.com/forums/post/119598/#p119598</link>
			<description><![CDATA[<div class="quotebox"><cite>vankon wrote:</cite><blockquote><p>Most languages use a single byte per character, but Chinese uses two bytes per character (word);<br />in UTF-8, most Chinese characters take 2 bytes, and others take 3 or 4 bytes.</p><p>This is a problem!</p></blockquote></div><p>This should be fine. Russian (my native language) is 2 bytes per character in UTF-8 too.<br /></p><div class="quotebox"><cite>vankon wrote:</cite><blockquote><p>str_replace(&#039;*&#039;, &#039;%&#039;, $cur_word)</p></blockquote></div><p>This replaces * with % in the SQL query. You should search for <strong>*数据库*</strong> if you want <strong>*database*</strong>, and for <strong>数据库</strong> if you want just <strong>database</strong>; these may give two different results. Using <strong>%word%</strong> every time is not good, because I would find <strong>carpet</strong>, <strong>scare</strong>, and <strong>careful</strong> instead of just the <strong>car</strong> I was really looking for.<br />Am I missing something?</p><div class="quotebox"><cite>vankon wrote:</cite><blockquote><p>Smilies do not work without a space before the smiley code.</p></blockquote></div><p>We suppose this is a feature, not a bug :-)</p>]]></description>
			<author><![CDATA[dummy@example.com (Anatoly)]]></author>
			<pubDate>Wed, 19 Nov 2008 07:09:56 +0000</pubDate>
			<guid>http://punbb.informer.com/forums/post/119598/#p119598</guid>
		</item>
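		<!--
The difference between searching for car and *car* described in the reply above can be sketched in Python with an in-memory SQLite table. This is only an illustration: the search_words table name follows the query quoted in the thread, the helper names are made up, and SQLite's LIKE is not guaranteed to behave exactly like the database behind a given PunBB installation.

```python
import sqlite3


def to_like_pattern(keyword):
    """Translate the user-facing * wildcard into SQL's % (mirrors the str_replace call)."""
    return keyword.replace("*", "%")


def search_words(conn, keyword):
    """Return indexed words that match the keyword under LIKE, sorted alphabetically."""
    rows = conn.execute(
        "SELECT word FROM search_words WHERE word LIKE ? ORDER BY word",
        (to_like_pattern(keyword),),
    )
    return [row[0] for row in rows]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE search_words (id INTEGER PRIMARY KEY, word TEXT)")
conn.executemany(
    "INSERT INTO search_words (word) VALUES (?)",
    [("car",), ("carpet",), ("scare",), ("careful",)],
)

exact = search_words(conn, "car")    # no wildcard: LIKE acts as a whole-word match
wide = search_words(conn, "*car*")   # wildcards on both sides: substring match

print(exact)
print(wide)
```

Without wildcards only the word car itself is returned, while *car* also pulls in carpet, scare, and careful, which is the trade-off the reply points out.
		-->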
		<item>
			<title><![CDATA[Re: The search engine in Chinese (or non-English)]]></title>
			<link>http://punbb.informer.com/forums/post/119585/#p119585</link>
			<description><![CDATA[<p>The most accurate search statement is the following, but its performance is poor!<br /></p><div class="codebox"><pre><code>select * from f_topics where id in
(select id from f_topics where subject like &#039;%keywords%&#039;
union 
select topic_id id from f_posts where message like &#039;%keywords%&#039;)
order by id;</code></pre></div>]]></description>
			<author><![CDATA[dummy@example.com (vankon)]]></author>
			<pubDate>Wed, 19 Nov 2008 05:58:52 +0000</pubDate>
			<guid>http://punbb.informer.com/forums/post/119585/#p119585</guid>
		</item>
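		<!--
The UNION query from the post above can be tried out with a small Python sketch against an in-memory SQLite database. The f_topics and f_posts table names and their columns come from the post; the sample rows and the find_topic_ids helper are invented for illustration, and the keyword is not escaped for % or _ characters.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE f_topics (id INTEGER PRIMARY KEY, subject TEXT);
    CREATE TABLE f_posts (id INTEGER PRIMARY KEY, topic_id INTEGER, message TEXT);
    """
)
# Made-up sample data: topic 1 matches by subject, topic 2 matches by post text.
conn.executemany(
    "INSERT INTO f_topics (id, subject) VALUES (?, ?)",
    [(1, "数据库备份"), (2, "Install notes"), (3, "Theme question")],
)
conn.executemany(
    "INSERT INTO f_posts (id, topic_id, message) VALUES (?, ?, ?)",
    [(1, 1, "first post"), (2, 2, "需要先配置数据库"), (3, 3, "no match here")],
)


def find_topic_ids(conn, keyword):
    """Topics whose subject or any post message contains the keyword as a substring."""
    pattern = "%" + keyword + "%"
    rows = conn.execute(
        """
        SELECT id FROM f_topics WHERE id IN (
            SELECT id FROM f_topics WHERE subject LIKE ?
            UNION
            SELECT topic_id FROM f_posts WHERE message LIKE ?
        )
        ORDER BY id
        """,
        (pattern, pattern),
    )
    return [row[0] for row in rows]


matches = find_topic_ids(conn, "数据库")
print(matches)
```

A leading % generally prevents the database from using an ordinary index on the column, so both subqueries tend to scan every row; that is a likely reason for the poor performance mentioned in the post.
		-->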
		<item>
			<title><![CDATA[Re: The search engine in Chinese (or non-English)]]></title>
			<link>http://punbb.informer.com/forums/post/119581/#p119581</link>
			<description><![CDATA[<p>Another thing: smilies do not work without a space before the smiley code.</p><p>Quote<br /></p><div class="quotebox"><blockquote><p><img src="http://punbb.informer.com/forums/img/smilies/smile.png" width="15" height="15" alt="smile" />:):):)</p></blockquote></div><p>Code<br /></p><div class="codebox"><pre><code>:):):):)</code></pre></div><p>Quote<br /></p><div class="quotebox"><blockquote><p><img src="http://punbb.informer.com/forums/img/smilies/smile.png" width="15" height="15" alt="smile" /> <img src="http://punbb.informer.com/forums/img/smilies/smile.png" width="15" height="15" alt="smile" /> <img src="http://punbb.informer.com/forums/img/smilies/smile.png" width="15" height="15" alt="smile" /> <img src="http://punbb.informer.com/forums/img/smilies/smile.png" width="15" height="15" alt="smile" /></p></blockquote></div><p>Code<br /></p><div class="codebox"><pre><code>:) :) :) :)</code></pre></div>]]></description>
			<author><![CDATA[dummy@example.com (vankon)]]></author>
			<pubDate>Wed, 19 Nov 2008 05:06:06 +0000</pubDate>
			<guid>http://punbb.informer.com/forums/post/119581/#p119581</guid>
		</item>
		<item>
			<title><![CDATA[Re: The search engine in Chinese (or non-English)]]></title>
			<link>http://punbb.informer.com/forums/post/119578/#p119578</link>
			<description><![CDATA[<p>I don&#039;t know how the search engine works. <img src="http://punbb.informer.com/forums/img/smilies/smile.png" width="15" height="15" alt="smile" />:):)</p>]]></description>
			<author><![CDATA[dummy@example.com (vankon)]]></author>
			<pubDate>Wed, 19 Nov 2008 04:46:47 +0000</pubDate>
			<guid>http://punbb.informer.com/forums/post/119578/#p119578</guid>
		</item>
		<item>
			<title><![CDATA[The search engine in Chinese (or non-English)]]></title>
			<link>http://punbb.informer.com/forums/post/119577/#p119577</link>
			<description><![CDATA[<p>PunBB version 1.3 final.</p><p>The search engine works very well for English but not for Chinese.<br />It returns results only for English keywords and none for Chinese!</p><p>I debugged the code and found the following:</p><p>include/search_functions.php<br />@108, the &#039;WHERE&#039; clause<br /></p><div class="quotebox"><blockquote><p>$query = array(<br />&nbsp; &nbsp; &#039;SELECT&#039;&nbsp; &nbsp; =&gt; &#039;m.post_id&#039;,<br />&nbsp; &nbsp; &#039;FROM&#039;&nbsp; &nbsp; &nbsp; &nbsp; =&gt; &#039;search_words AS w&#039;,<br />&nbsp; &nbsp; &#039;JOINS&#039;&nbsp; &nbsp; &nbsp; &nbsp; =&gt; array(<br />&nbsp; &nbsp; &nbsp; &nbsp; array(<br />&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &#039;INNER JOIN&#039;&nbsp; &nbsp; =&gt; &#039;search_matches AS m&#039;,<br />&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &#039;ON&#039;&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; =&gt; &#039;m.word_id=w.id&#039;<br />&nbsp; &nbsp; &nbsp; &nbsp; )<br />&nbsp; &nbsp; ),<br />&nbsp; &nbsp; &#039;WHERE&#039;&nbsp; &nbsp; &nbsp; &nbsp; =&gt; &#039;w.word LIKE \&#039;&#039;.$forum_db-&gt;escape(str_replace(&#039;*&#039;, &#039;%&#039;, $cur_word)).&#039;\&#039;&#039;<br />);</p></blockquote></div><p>The resulting $query looks like this:</p><p>SELECT m.post_id FROM search_words AS w INNER JOIN search_matches AS m ON m.word_id=w.id WHERE w.word LIKE &#039;数据库&#039;</p><p>My debug code is:</p><div class="quotebox"><blockquote><p>echo &#039;search_functions.php@116@query=SELECT &#039;.$query[&#039;SELECT&#039;].&#039; FROM &#039;.$query[&#039;FROM&#039;].&#039; INNER JOIN &#039;.&#039;search_matches AS m&#039;. &#039; ON &#039;.&#039;m.word_id=w.id&#039;.&#039; WHERE &#039;.$query[&#039;WHERE&#039;];</p></blockquote></div><p>The URL is <a href="http://cilinux.cn/forum2/search.php?action=search&amp;keywords=%E6%95%B0%E6%8D%AE%E5%BA%93&amp;author=&amp;sort_by=0&amp;sort_dir=DESC&amp;show_as=topics&amp;search=%E6%90%9C%E7%B4%A2">http://cilinux.cn/forum2/search.php?act &hellip; C%E7%B4%A2</a></p><p>The keyword is &#039;数据库&#039;, which means &#039;database&#039; in English.</p><p>The final SQL statement is missing &#039;%&#039; around the keyword.</p><p>This bug may affect all non-English languages.</p><p>I changed the code</p><div class="quotebox"><blockquote><p>&#039;WHERE&#039;&nbsp; &nbsp; &nbsp; &nbsp; =&gt; &#039;w.word LIKE \&#039;&#039;.$forum_db-&gt;escape(str_replace(&#039;*&#039;, &#039;%&#039;, $cur_word)).&#039;\&#039;&#039;</p></blockquote></div><p>to</p><div class="quotebox"><blockquote><p>&#039;WHERE&#039;&nbsp; &nbsp; &nbsp; &nbsp; =&gt; &#039;w.word LIKE \&#039;%&#039;.$forum_db-&gt;escape(str_replace(&#039;*&#039;, &#039;%&#039;, $cur_word)).&#039;%\&#039;&#039;</p></blockquote></div><p>After that, the keyword &#039;数据库&#039; works.</p><p>But other keywords still return no results.</p><p>Most languages use a single byte per character, but Chinese uses two bytes per character (word);<br />in UTF-8, most Chinese characters take 2 bytes, and others take 3 or 4 bytes.</p><p>This is a problem!</p><p>And so on.</p>]]></description>
			<author><![CDATA[dummy@example.com (vankon)]]></author>
			<pubDate>Wed, 19 Nov 2008 04:23:24 +0000</pubDate>
			<guid>http://punbb.informer.com/forums/post/119577/#p119577</guid>
		</item>
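		<!--
The byte sizes mentioned in the post above can be checked directly. A minimal Python sketch follows; the helper name utf8_byte_length is made up for illustration. It suggests that each character of 数据库 takes 3 bytes in UTF-8 rather than 2, which matches the nine percent-encoded bytes in the keywords parameter of the URL quoted in the post.

```python
from urllib.parse import quote


def utf8_byte_length(text):
    """Number of bytes the text occupies once encoded as UTF-8."""
    return len(text.encode("utf-8"))


# One ASCII letter, one Cyrillic letter, and the three characters of the keyword.
lengths = {ch: utf8_byte_length(ch) for ch in ["a", "д", "数", "据", "库"]}
keyword_bytes = utf8_byte_length("数据库")
encoded_keyword = quote("数据库")  # percent-encoding of the UTF-8 bytes

print(lengths)
print(keyword_bytes)
print(encoded_keyword)
```

Character width alone should not stop a LIKE comparison from matching, as long as the text is stored and compared in the same encoding on both sides.
		-->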
	</channel>
</rss>
