1

Topic: search words containing swedish characters not working?

I never seem to get any matches when searching for words with åäö in them.

/IoR_Kongo @swec

Re: search words containing swedish characters not working?

In these forums or somewhere else? If it's in these forums, I know why. It's because the locale isn't set to Swedish on the server.

"Programming is like sex: one mistake and you have to support it for the rest of your life."

3

Re: search words containing swedish characters not working?

I noticed it on my own forum first and thought it was a bug in the old version I'm running. So I tried here, but with the same result.

So it doesn't have anything to do with PunBB? How can I fix it?

/IoR_Kongo @swec

Re: search words containing swedish characters not working?

It has to do with PunBB. Are you using the Swedish language pack? If not, you should. If you are and it still doesn't work, there's a problem with your server setup. For some reason, the call to setlocale() doesn't work. That usually means the Swedish locale isn't installed.

"Programming is like sex: one mistake and you have to support it for the rest of your life."

5

Re: search words containing swedish characters not working?

Yes, I'm using the Swedish language pack. I guess I can't do anything about it then, because it's not my server. :(

/IoR_Kongo @swec

6

Re: search words containing swedish characters not working?

Contact the server administrator. He might be able to help you out.

Do, or do not.

7 (edited by Henke 2004-07-10 23:29)

Re: search words containing swedish characters not working?

yes, I will

EDIT: I dug around a little myself and found that

setlocale( LC_CTYPE, 'sv_SE' );

didn't return anything but

setlocale( LC_CTYPE, 'sv_SE.ISO_8859-1' );

returns the correct locale, but search isn't working anyway.

EDIT2: Now I see you've already thought of that in the new langpack, Rickard.
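The same locale often goes by different names on different systems, so it can help to try several and see which one the C library accepts. Below is a minimal sketch of that idea. The helper name try_locales() and the candidate list are assumptions for illustration and are not taken from PunBB or the language pack.

```php
<?php
// Hypothetical helper (not part of PunBB): try several names for the same
// locale and return the first one the system accepts, or false if none work.
function try_locales(array $candidates)
{
    foreach ($candidates as $name) {
        // setlocale() returns the new locale name on success, false on failure.
        $result = setlocale(LC_CTYPE, $name);
        if ($result !== false) {
            return $result;
        }
    }
    return false;
}

// Candidate spellings are guesses; which ones exist depends on the server.
$accepted = try_locales(array('sv_SE', 'sv_SE.ISO8859-1', 'sv_SE.ISO_8859-1'));
echo $accepted === false ? "no Swedish locale accepted\n" : "accepted: $accepted\n";
```

A non-false return value only says that the call was accepted at that moment. It is still worth reading the active locale back later in the request to confirm it stuck.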

/IoR_Kongo @swec

Re: search words containing swedish characters not working?

The new langpack?

"Programming is like sex: one mistake and you have to support it for the rest of your life."

9

Re: search words containing swedish characters not working?

I meant the latest swe lang pack.

/IoR_Kongo @swec

10

Re: search words containing swedish characters not working?

This is a bug. I can't use search with Chinese words.

Re: search words containing swedish characters not working?

machen: That doesn't tell me much. The search feature has a hack built in to allow searches in two-byte character sets such as Chinese. It should work. It has worked before :)

"Programming is like sex: one mistake and you have to support it for the rest of your life."

Re: search words containing swedish characters not working?

I wonder if it works when lang_multibyte is set to true...

Re: search words containing swedish characters not working?

Yeah, lang_multibyte must be true for searches to work in multibyte languages such as Chinese.

"Programming is like sex: one mistake and you have to support it for the rest of your life."

14

Re: search words containing swedish characters not working?

No more ideas regarding my problem? As I said, setlocale() seems to be working, but åäö searches still don't work.

/IoR_Kongo @swec

Re: search words containing swedish characters not working?

How do you know setlocale() works? Try putting the following piece of code after the inclusion of common.php in e.g. index.php:

dump(setlocale(LC_CTYPE, NULL));

Then run the script. It should output the currently active locale which in your case should be Swedish.
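A minimal sketch of the same check without PunBB's dump() helper follows. It assumes the string "0" as the probe value, which the PHP manual documents as returning the current setting without changing it. Depending on the PHP version, NULL may be treated like an empty string and reset the locale from the environment, so "0" is probably the safer probe. The function name current_ctype_locale() is made up for this example.

```php
<?php
// Hypothetical helper: report the active LC_CTYPE locale without changing it.
function current_ctype_locale()
{
    // Passing the string "0" asks setlocale() to return the current setting only.
    return setlocale(LC_CTYPE, '0');
}

// Set a known locale first so the probe has something definite to report.
setlocale(LC_CTYPE, 'C');
echo current_ctype_locale(), "\n";
```

In a real check, the setlocale(LC_CTYPE, 'C') line would be left out and the probe placed after the inclusion of common.php, so it reports whatever the forum code has set.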

"Programming is like sex: one mistake and you have to support it for the rest of your life."

16

Re: search words containing swedish characters not working?

I assumed it worked since it returned the locale, and not false, when setting the locale.

The output from dump() is C.
Yes, C.

/IoR_Kongo @swec

Re: search words containing swedish characters not working?

C is the default locale (POSIX). What it means is that setlocale() failed to set the Swedish locale. Why that happens I don't know, but most likely it means you don't have it installed.

"Programming is like sex: one mistake and you have to support it for the rest of your life."

Re: search words containing swedish characters not working?

I might be wrong, but how about setting lang_multibyte to true? My idea is that any language other than English behaves much like a multibyte one. When you view the source of a Swedish website, the special characters may show up as "?", and "?" is in the noise match list and gets replaced with ' '. So you won't lose anything by trying it.

Re: search words containing swedish characters not working?

jacobswell: That's not a good idea. Getting the locale installed shouldn't be a problem. lang_multibyte is a hack that ignores the search index and does a "raw" search in the posts table. It's many times slower than a regular search. There really isn't much I can do though, seeing as PHP lacks native Unicode support.

"Programming is like sex: one mistake and you have to support it for the rest of your life."

20 (edited by jacobswell 2004-07-16 13:30)

Re: search words containing swedish characters not working?

I mean (pardon me if I'm wrong): when we post an article, is it saved into MySQL with the Swedish characters intact, or as something like "Tack f? ditt bes? p?Yahoo! Sverige"? If the latter, the character "?" is replaced with ' ' (a space) by the noise match filter in search.php, and then we can't get proper results. I think we can check it like this:

1. Find this in search.php:

                    // Filter out non-alphabetical chars
                    $noise_match = array('^', '$', '&', '(', ')', '<', '>', '`', '\'', '"', '|', ',', '@', '_', '?', '%', '~', '.', '[', ']', '{', '}', ':', '\\', '/', '=', '#', '\'', ';', '!', '¤');
                    $noise_replace = array(' ', ' ', ' ', ' ', ' ', ' ', ' ', '',  '',   ' ', ' ', ' ', ' ', '',  ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', '' ,  ' ', ' ', ' ', ' ',  ' ', ' ', ' ');
                    $keywords = str_replace($noise_match, $noise_replace, $keywords);

2. Add this code after it:

exit($keywords);

3. Finally, we can check how $keywords has changed. I think every "?" character will have been replaced with a space if the latter case is right.

Here is an example: Richard's signature is shown in notepad.exe as "Nice catch blanco ni?, but too bad your ass got saaaaaaaaaaaaaacked!".

Re: search words containing swedish characters not working?

jacobswell: That's not his problem. The Swedish characters are saved correctly into his database. The problem is that in searchidx.php, PunBB divides the message into words, and in that process it uses preg_replace(), which in turn relies on the locale being correctly set. If it isn't, any words that contain the Swedish characters å, ä and ö will be ignored and thus won't end up in the search index.
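To illustrate the locale dependence described above, here is a small sketch. It is not PunBB's searchidx.php code. The function split_words_latin1() and the sample word are assumptions made up for the example, and the text is assumed to be ISO-8859-1 encoded.

```php
<?php
// Illustration only: this is not PunBB's searchidx.php code.
// Split an ISO-8859-1 encoded string into words on runs of non-word bytes.
function split_words_latin1($text)
{
    // Without the /u modifier, \W is decided byte by byte using the
    // character tables of the active LC_CTYPE locale.
    return preg_split('/\W+/', $text, -1, PREG_SPLIT_NO_EMPTY);
}

// "smörgås" in ISO-8859-1: ö is byte 0xF6 and å is byte 0xE5.
$word = "sm\xF6rg\xE5s";

// Under the C locale the two accented bytes do not count as word characters,
// so the word falls apart into ASCII fragments.
setlocale(LC_CTYPE, 'C');
print_r(split_words_latin1($word));
```

With a working Swedish ISO-8859-1 locale active, the same call would be expected to keep the word in one piece, which is why the indexer needs setlocale() to succeed.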

"Programming is like sex: one mistake and you have to support it for the rest of your life."

Re: search words containing swedish characters not working?

I see, thanks for your kind explanation.

23

Re: search words containing swedish characters not working?

I do have the Swedish locale installed. I'm running FreeBSD and the locales are in /usr/share/locale.

/IoR_Kongo @swec

Re: search words containing swedish characters not working?

Are you sure the actual locales are installed? I have a bunch of directories in my /usr/share/locale, but most of them are empty or more or less empty.

"Programming is like sex: one mistake and you have to support it for the rest of your life."

25

Re: search words containing swedish characters not working?

Yes.

[henke@flinta:~] $ ls /usr/share/locale/sv_SE.ISO8859-1
LC_COLLATE      LC_MESSAGES     LC_NUMERIC
LC_CTYPE        LC_MONETARY     LC_TIME

/IoR_Kongo @swec