Topic: Redundant data in sql tables - the example of PunBB

hi,

i'm using punbb for my site so i use it as a php/sql example once in a while.

i've noticed that a lot of data stored in the punbb tables are redundant. for example, the 'post' table stores both the 'poster' name and the 'poster_id'.

is it because it is more efficient to not do a select with a join?
are there other reasons?
is it that much more efficient?
it seems like the code and the db would be "cleaner" if it wasn't the case...

i'm not criticizing how punbb is designed, i'm just trying to understand better how to design my own data structure which is of course used differently than punbb.

what should i take into consideration in order to decide whether to duplicate some data in the various tables?

Re: Redundant data in sql tables - the example of PunBB

The redundant data is there to primarily improve performance, yes. However, in some cases it is there for more practical reasons. For example, poster in the posts table. It is there to facilitate guest posts. If the posts table didn't contain a field to store the username of anonymous posters, a table would have to be added to hold guest poster's usernames and this would require another join in viewtopic.php, the script in which performance is most important. When I first started working on PunBB, I decided this was the most efficient approach and didn't really attempt to make it "correct" so to speak. If I were to reimplement the complete database structure today, it is possible I would make some changes, but doing that now is too much work and it risks breaking all kinds of third party addons and integrations with other pieces of software.

"Programming is like sex: one mistake and you have to support it for the rest of your life."