1 (edited by liquidat0r 2008-05-20 20:49)

Topic: Bug with using a special Alt-code character

I made a topic in Test Forum just now so that you can see what I mean.

If you enter Alt+0173 into a topic title, post message, or presumably anywhere in PunBB, you can bypass the validation of 'required fields'.

First Google result to help explain: http://encyclopediadramatica.com/Alt-0173

---

I'm not sure if there's any way to protect against that, but I thought I'd post it anyhow.
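
To make the bypass concrete, here is a minimal sketch of why a typical "required field" check can be fooled. PunBB itself is written in PHP; this is only a Python illustration, and `naive_is_filled` is a hypothetical stand-in for whatever check the forum actually runs, not PunBB's code. The point is that trimming whitespace does not remove U+00AD, so a value made of soft hyphens still counts as non-empty.

```python
# Alt+0173 produces code point 173, which is U+00AD (soft hyphen).
SOFT_HYPHEN = chr(173)


def naive_is_filled(value: str) -> bool:
    """Hypothetical required-field check: non-empty after trimming whitespace."""
    return len(value.strip()) > 0


if __name__ == "__main__":
    # A whitespace-only title is rejected as expected.
    print(naive_is_filled("   "))
    # A title made only of soft hyphens is accepted, although it looks empty.
    print(naive_is_filled(SOFT_HYPHEN * 3))
```

The soft hyphen is a format character rather than whitespace, which is why the trim leaves it alone.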

Re: Bug with using a special Alt-code character

liquidat0r wrote:

I made a topic in Test Forum just now so that you can see what I mean.

If you enter Alt+0173 into a topic title, post message, or presumably anywhere in PunBB, you can bypass the validation of 'required fields'.

First Google result to help explain: http://encyclopediadramatica.com/Alt-0173

---

I'm not sure if there's any way to protect against that, but I thought I'd post it anyhow.

Thanks!

What are your ideas on a possible solution?
A character blacklist, or conversely a whitelist filter containing only allowed characters?
For critical fields at least.

BTW: MS Notepad and Word show it as "-". MS IE shows it in the URL, but not in form fields. Firefox doesn't show it at all.

Carpe diem

3 (edited by MattF 2008-05-21 12:12)

Re: Bug with using a special Alt-code character

Anatoly wrote:

What are your ideas on a possible solution?
A character blacklist, or conversely a whitelist filter containing only allowed characters?
For critical fields at least.

You are a dev, are you not? No disrespect intended, but you should be suggesting your proposed solutions, not asking for methods. :) If those characters serve no useful purpose whatsoever, they shouldn't be allowed, full stop. Plus, a useless character is a useless character in ANY field.

Re: Bug with using a special Alt-code character

MattF wrote:
Anatoly wrote:

What are your ideas on a possible solution?
A character blacklist, or conversely a whitelist filter containing only allowed characters?
For critical fields at least.

You are a dev, are you not? No disrespect intended, but you should be suggesting your proposed solutions, not asking for methods. :) If those characters serve no useful purpose whatsoever, they shouldn't be allowed, full stop. Plus, a useless character is a useless character in ANY field.

Thanks for your opinion.
I suppose it is hard to say which characters are always useful and which are always useless.

U+00AD (Alt+0173) is the soft hyphen, written in HTML as the entity &shy;.
And here is the issue described.
We shouldn't just cut it off everywhere.

PS: I always ask my colleagues for their opinion, especially when the solution seems simple to me.

Carpe diem

5

Re: Bug with using a special Alt-code character

Anatoly wrote:

PS: I always ask my colleagues for their opinion, especially when the solution seems simple to me.

Did you ask us about changing the domain name?

Re: Bug with using a special Alt-code character

hcgtv wrote:
Anatoly wrote:

PS: I always ask my colleagues for their opinion, especially when the solution seems simple to me.

Did you ask us about changing the domain name?

1. The post you quoted is my personal position, but I cannot answer your question, as "changing the domain name" was not my decision (I do not own PunBB). Nevertheless, I suppose the PunBB owners have already heard your question (I can pass it on to them if you'd like).
2. It is off-topic, so let us stop here. If you want to continue the discussion, please start another topic or mail me at anatoly@punbb.org.

Carpe diem

Re: Bug with using a special Alt-code character

Anatoly wrote:

What are your ideas on a possible solution?

As far as I know, it's the only character that can have this effect.

I'm no coder, but I presume all you would have to do is put some script somewhere to replace &shy; with or .

8

Re: Bug with using a special Alt-code character

Anatoly wrote:

If you want to continue the discussion, please start another topic or mail me at anatoly@punbb.org.

http://punbb.informer.com/forums/viewtopic.php?id=19202

Re: Bug with using a special Alt-code character

liquidat0r wrote:

I'm no coder, but I presume all you would have to do is put some script somewhere to replace &shy; with or .

Agreed.
But only in the cases where we really need it. It seems OK to allow them in message text.
Another way is to check whether the field value contains only &shy; characters, or just trim them off the edges.
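
A rough sketch of the "trim them off the edges" idea, in Python for illustration only (PunBB is PHP, and `trim_edges` and `is_effectively_empty` are hypothetical names, not existing PunBB functions). It removes whitespace and soft hyphens from both ends of a value and leaves any soft hyphens inside the text untouched:

```python
import string

SOFT_HYPHEN = "\u00ad"

# Characters removed from the edges: ASCII whitespace plus the soft hyphen.
EDGE_CHARS = string.whitespace + SOFT_HYPHEN


def trim_edges(value: str) -> str:
    """Strip whitespace and soft hyphens from both ends of the value only."""
    return value.strip(EDGE_CHARS)


def is_effectively_empty(value: str) -> bool:
    """True when nothing is left once the edges have been trimmed."""
    return trim_edges(value) == ""


if __name__ == "__main__":
    print(is_effectively_empty(SOFT_HYPHEN + " " + SOFT_HYPHEN))
    print(repr(trim_edges(SOFT_HYPHEN + "co" + SOFT_HYPHEN + "operate ")))
```

Because only the edges are trimmed, a soft hyphen used as a hyphenation hint inside a word in a message survives, while a field made of nothing else is treated as empty.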

Carpe diem

10 (edited by MattF 2008-05-21 22:26)

Re: Bug with using a special Alt-code character

Anatoly wrote:

But only in the cases where we really need it. It seems OK to allow them in message text.
Another way is to check whether the field value contains only &shy; characters, or just trim them off the edges.

If it does have any use whatsoever (which I seriously doubt, and even one of the links you pointed to says this):

Web authoring, SHY (written e.g. using the entity &shy;) could be used as an occasional hyphenation hint in special cases, with the risk that it may be displayed as a normal hyphen in any context by some (rare) browsers. Moreover, some browsers (e.g., Firefox) simply ignore SHY.

then the only place I can see it being of any use whatsoever is within a set of code tags. Nowhere else.

11

Re: Bug with using a special Alt-code character

There is a similar issue with some other Unicode characters; there's a bit more info about it here if it helps.

Re: Bug with using a special Alt-code character

According to this document, the ISO Latin 1 usage of the soft hyphen is:

A graphic character that is imaged by a graphic symbol identical with, or similar to, that representing hyphen, for use when a line break has been established within a word.

So:

The soft hyphen is pretty irrelevant in HTML documents, since normally one should not divide a word into two lines in HTML.

So in printed documents it is used to restore the integrity of a word that has been divided across lines, and its use is discouraged in HTML. In which case, it makes sense to strip it altogether, in post messages as well as other places.

Re: Bug with using a special Alt-code character

Reines wrote:

There is a similar issue with some other Unicode characters; there's a bit more info about it here if it helps.

Agreed.
Seems like we'd better strip them all before any input checks (so that we get an "empty field" error).
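
As a sketch of that approach (Python for illustration, not PunBB's actual PHP code): strip the invisible characters first, then run the usual empty check. Which characters count as "them all" is an assumption here; this version drops everything in the Unicode "Cf" (format) category, which covers the soft hyphen and zero-width characters. `strip_invisible` and `validate_required` are hypothetical names.

```python
import unicodedata


def strip_invisible(value: str) -> str:
    """Remove Unicode format characters (category Cf), e.g. U+00AD and U+200B."""
    return "".join(ch for ch in value if unicodedata.category(ch) != "Cf")


def validate_required(value: str) -> str:
    """Clean the value first, then apply the ordinary required-field check."""
    cleaned = strip_invisible(value).strip()
    if not cleaned:
        # Assumed error handling; a real forum would show its own message.
        raise ValueError("empty field")
    return cleaned


if __name__ == "__main__":
    print(validate_required("  Test\u00ad topic  "))
    try:
        validate_required("\u00ad\u200b ")
    except ValueError as err:
        print("rejected:", err)
```

One caveat with this assumption: the Cf category also contains joiner characters that some scripts rely on, so a narrower list may suit message text better than titles.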

Carpe diem

14

Re: Bug with using a special Alt-code character

Anatoly wrote:

Seems like we'd better strip them all before any input checks (so that we get an "empty field" error).

That's a far better solution. :)