Re: Punmap - html sitemap with fancy urls UPDATED - final release

As a side note, I should mention that I laughed like crazy a few months ago when webmasters out there found out that deleting the Google sitemap from their sites would give them an immediate gain in the number of indexed pages.

I hadn't heard that mentioned anywhere: where did you hear it? :)

27 (edited by pedrotuga 2007-06-19 11:11)

Re: Punmap - html sitemap with fancy urls UPDATED - final release

Smartys wrote:

As a side note, I should mention that I laughed like crazy a few months ago when webmasters out there found out that deleting the Google sitemap from their sites would give them an immediate gain in the number of indexed pages.

I hadn't heard that mentioned anywhere: where did you hear it? :)

It was not mentioned in the press, but there was a lot of talk about it in discussions like this one. Check out some webmaster and SEO communities out there and search for 'google sitemap'. I remember it caused a big fuss on the digitalpoint forums, which I check occasionally, but it was talked about all over the web.
I don't have the habit of reading blogs, but I am sure people talked about it there too. If you go through these blog articles you can quickly find some Google sitemap downsides exposed.

Re: Punmap - html sitemap with fancy urls UPDATED - final release

pedrotuga wrote:
Smartys wrote:

As a side note, I should mention that I laughed like crazy a few months ago when webmasters out there found out that deleting the Google sitemap from their sites would give them an immediate gain in the number of indexed pages.

I hadn't heard that mentioned anywhere: where did you hear it? :)

It was not mentioned in the press, but there was a lot of talk about it in discussions like this one. Check out some webmaster and SEO communities out there and search for 'google sitemap'. I remember it caused a big fuss on the digitalpoint forums, which I check occasionally, but it was talked about all over the web.
I don't have the habit of reading blogs, but I am sure people talked about it there too. If you go through these blog articles you can quickly find some Google sitemap downsides exposed.

These are the only things I've found, and they are a bit different from what you're saying:
http://www.apogee-search.com/Blog/?p=450
http://www.seomoz.org/blog/expert-advic … ont-submit
Is that what you meant? Because it isn't that the sitemap affects the number of pages indexed negatively, it's that the sitemap affects it POSITIVELY and thus you can't tell that a page lacks "the necessary components for inclusion, be they architectural, link strength, content-related, etc."

I use Sitemaps (actually, I use a sitemap for every PunBB-Hosting forum and a sitemap index in the root to tie them together) and that's not an issue for me at all: if I were writing my site completely from scratch it might be, but for a forum all you can do is hope that Google thinks the topics themselves are enough content.
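For anyone unfamiliar with the "sitemap index" idea mentioned above, here is a minimal Python sketch of what such a file looks like when generated. The forum URLs are made up for illustration, and the helper name build_sitemap_index is hypothetical; only the sitemapindex/sitemap/loc structure and the namespace come from the sitemaps.org protocol. This is not how PunBB-Hosting actually builds its index.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap_index(sitemap_urls):
    """Return a sitemap index document (as a string) listing each sitemap URL."""
    root = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for url in sitemap_urls:
        entry = ET.SubElement(root, "sitemap")
        ET.SubElement(entry, "loc").text = url
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical per-forum sitemaps tied together by one index in the root.
    print(build_sitemap_index([
        "http://forum-a.example.com/sitemap.xml",
        "http://forum-b.example.com/sitemap.xml",
    ]))
```

The index itself carries no page URLs; it only points at the individual sitemaps, so each forum can regenerate its own file independently.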

Re: Punmap - html sitemap with fancy urls UPDATED - final release

Yes, I am talking about the situation discussed in your second link, but not only that.
That is actually the big point, though. If you submit a sitemap and go from zero to a thousand indexed pages, you will probably be very happy, but did you remember to check how much traffic comes from Google? You will probably get 10 visitors a month if you are lucky.
That's the Google illusion: how many times do you click on the number 2 to check the second results page on Google? Personally, I do that in less than 1% of my searches. So what's the point of having thousands of indexed pages if they don't show up on the first page? I would prefer to have one single page that is always at the top of Google.
I know it's a big temptation to have all your pages indexed that quickly, but the truth is that you might be taking the place of valuable spidering through external links, which would really improve your position on Google.

Of course, I am talking about sites with a reasonable amount of relevant content. If you have a small site that will get a couple of dozen visits a day, then of course it's better to inform Google about your content as quickly as possible, as you will have to wait an eternity otherwise.

Besides this, there is still another problem with Google sitemaps: they are not reliable. If you check your pages on Google on a daily basis, you will notice that you can go from 10000 pages one day to 100 pages the day after, for no reason. You may get your 10000 pages back the week after, or even the day after, but you will lose a lot of visitors who turned their interest to some other similar site.

Re: Punmap - html sitemap with fancy urls UPDATED - final release

That is actually the big point, though. If you submit a sitemap and go from zero to a thousand indexed pages, you will probably be very happy, but did you remember to check how much traffic comes from Google? You will probably get 10 visitors a month if you are lucky.
That's the Google illusion: how many times do you click on the number 2 to check the second results page on Google? Personally, I do that in less than 1% of my searches. So what's the point of having thousands of indexed pages if they don't show up on the first page? I would prefer to have one single page that is always at the top of Google.
I know it's a big temptation to have all your pages indexed that quickly, but the truth is that you might be taking the place of valuable spidering through external links, which would really improve your position on Google.

I haven't seen any indication that sitemap indexing is "taking the place" of anything. It's no different than regular indexing except that it will potentially index more pages. In the end, it's up to the administrator to use all the tools provided to him/her when optimizing for Google.

Besides this, there is still another problem with Google sitemaps: they are not reliable. If you check your pages on Google on a daily basis, you will notice that you can go from 10000 pages one day to 100 pages the day after, for no reason. You may get your 10000 pages back the week after, or even the day after, but you will lose a lot of visitors who turned their interest to some other similar site.

I haven't seen that with my sitemap and I haven't seen anybody mention anything like that with sitemaps. In fact, wouldn't it make more sense for that to happen when you don't have a sitemap, since changes in links could hide a page from Google?

Re: Punmap - html sitemap with fancy urls UPDATED - final release

This is fantastic. I'm using it for WAP rather than as an archive.

Any way you could include things like search.php, userlist.php, etc.?

32 (edited by pedrotuga 2007-06-20 10:59)

Re: Punmap - html sitemap with fancy urls UPDATED - final release

Smartys, the link you pointed to a couple of posts ago explains it in a fairly straightforward way.
I wasn't trying to make a point out of this; my intention was rather to share this information with PunBB users.
I don't want to stop anybody from using sitemaps, I am just sharing my experience with them. None of us knows how Google handles its data. My experience with Google sitemaps was bad: it basically made my traffic go down, and I think this is very likely to happen to other people.
On the other hand, if you are satisfied with sitemaps and they actually brought you traffic, of course you have no reason to drop them.

Liquidator,
Sorry, I don't have time to put effort into that at the moment. But yep, that would be cool. There's already a minimalistic view, and those pages would be the next step towards a full-featured (from the user's perspective) minimal-look PunBB.

I was reminded that PunBB outputs valid XHTML, so I wonder if it wouldn't be better if someone released a flat style instead. I can't think of anything that makes that impossible. Maybe we had better wait for the new markup in version 1.3.

Re: Punmap - html sitemap with fancy urls UPDATED - final release

For those concerned about duplicated content, here is what you have to put in your robots.txt:

User-agent: *
Disallow: /viewforum.php
Disallow: /viewtopic.php
Disallow: /profile.php
Disallow: /userlist.php

Don't forget to include the path to your forum in each rule.
I will give this a try for one month to see if it really matters in any way.
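To see what those rules would block, here is a minimal sketch using Python's standard urllib.robotparser. The example.com URLs, the /forum/ path and the helper name is_allowed are assumptions for illustration, and Python's parser does simple prefix matching, which may not match every detail of how Googlebot itself reads the file.

```python
from urllib.robotparser import RobotFileParser

# The rules from the post above, unchanged.
ROBOTS_TXT = """\
User-agent: *
Disallow: /viewforum.php
Disallow: /viewtopic.php
Disallow: /profile.php
Disallow: /userlist.php
"""


def is_allowed(url, robots_txt=ROBOTS_TXT, agent="Googlebot"):
    """Return True if the given robots.txt text lets `agent` fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    # Hypothetical URLs: a dynamic one, a rewritten one, and a dynamic one
    # on a forum installed under /forum/ instead of the site root.
    for url in (
        "http://example.com/viewtopic.php?id=12",
        "http://example.com/my-topic--12.html",
        "http://example.com/forum/viewtopic.php?id=12",
    ):
        print(url, "allowed" if is_allowed(url) else "blocked")
```

With these rules the dynamic URL in the site root is blocked and the rewritten one is allowed. The dynamic URL under /forum/ is still allowed, which is why the path has to be included when the forum is not installed in the root.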

Re: Punmap - html sitemap with fancy urls UPDATED - final release

But pedrotuga, will putting those pages into robots.txt interfere with the operation of AdSense (if you have it on your pages)? The Googlebot needs to know what those pages are about in order to deliver appropriate context-sensitive ads via AdSense. You might end up just getting public service messages displayed, which don't pay :)

Of course if you don't run AdSense, it won't be an issue.

Still, it will be an interesting experiment.

Re: Punmap - html sitemap with fancy urls UPDATED - final release

I am afraid I don't understand you.

Some people brought up the issue of duplicated content. By placing this in your robots.txt you will prevent the dynamic URLs from being indexed and keep your rewritten ones.
The content of the page will be exactly the same. I don't see how this messes up AdSense or the Googlebot.

Re: Punmap - html sitemap with fancy urls UPDATED - final release

Maybe I am confused.

What I am suggesting is that the pages that display the AdSense ads have to be the ones that are read by the Googlebot.
E.g. if you have a bunch of AdSense ads on /viewforum.php, then that page needs to be directly accessible to the Googlebot.

If robots.txt blocks the Googlebot from accessing the 'normal' pages that the AdSense ads are displayed on, won't the AdSense die?

I mean the pages with the ads on them will still remotely load the jscript off google.com, naturally, but without the pages that host the AdSense being spidered, google won't know what ads would be correct to show on those pages.

The rewritten html sitemap pages may be successfully spidered, but you are then kind of saying to the googlebot: hey buddy, spider these (rewritten) pages over here, and then use the results of that to figure out what ads should be displayed on those (dynamic) pages in that pile over there, which you aren't allowed to see....

Re: Punmap - html sitemap with fancy urls UPDATED - final release

Good mod, but I've made some modifications to the URL rewriting and the links are no longer displayed correctly (error 404).

What do I have to modify in punmap.php to get the following?

1) The forum URL displayed like this:

http://www.url_to_forum/categorie-forum_id-forum_name.html instead of
http://www.url_to_forum/forum_name-forum_id.html

2) The viewtopic URL displayed like this:

http://www.url_to_forum/discussion-topic_id-topic_name.html instead of
http://www.url_to_forum/topic_name--topic_id.html

3) If there is an extended character in a forum or topic name (like é, è...), the letter is not displayed at all, which also gives a bad link.

Any help would be appreciated.

Regards

Re: Punmap - html sitemap with fancy urls UPDATED - final release

sirena, do you have to manually register with Google every URL you put ads on? If not, you should not worry about this. I don't know if you understood how this works, but the pages are exactly the same; only the URL changes. The difference is this: your Google visitors will land on your same old pages, but they will reach them through a different URL, that's all.

I actually use this mod in another way: instead of pointing the rewritten URLs to my viewtopic.php, I point them to viewprintable.php, another mod out there. This way, users who arrive from Google land on the printable page of the topic. If you do this, you can put ads on those pages and keep your original viewtopic pages free of ads, so your regular users don't see ads but your Google visitors still do. It depends on what you want to achieve.

glucarelli, extended characters cannot be causing any trouble, because they are simply stripped out. Including them in the URLs is not recommended at all and can cause you a lot of trouble. If you still want to keep those letters, I would suggest you replace 'é' with 'e', 'ç' with 'c', etc. You can make this replacement in the function clean_url() in my mod.

You can change the rewriting to the format you described, but you will have to go through the mod and change it yourself. You need to adapt the regex both in the function clean_url() and in your .htaccess, and you need to tweak the link output on lines 88, 107 and 141.
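As a rough illustration of the replacement idea ('é' to 'e', 'ç' to 'c'), here is a minimal sketch in Python. The mod itself is PHP, so this is not its actual clean_url() code; the function body, the helper names forum_url and topic_url, and the sample titles are all assumptions. Only the two URL layouts are taken from glucarelli's post.

```python
import re
import unicodedata


def clean_url(title):
    """Turn a forum or topic title into an ASCII-only URL fragment."""
    # Split accented letters into base letter + accent mark (é -> e + ´),
    # then drop everything that is not plain ASCII, which removes the marks.
    decomposed = unicodedata.normalize("NFKD", title)
    ascii_only = decomposed.encode("ascii", "ignore").decode("ascii")
    # Collapse every run of other characters into a single hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", ascii_only.lower())
    return slug.strip("-")


def forum_url(base, forum_id, forum_name):
    # Layout requested in the post: categorie-forum_id-forum_name.html
    return f"{base}/categorie-{forum_id}-{clean_url(forum_name)}.html"


def topic_url(base, topic_id, topic_name):
    # Layout requested in the post: discussion-topic_id-topic_name.html
    return f"{base}/discussion-{topic_id}-{clean_url(topic_name)}.html"


if __name__ == "__main__":
    base = "http://www.url_to_forum"
    print(forum_url(base, 3, "Café français"))
    print(topic_url(base, 42, "Problème d'accès"))
```

Whatever layout the links use, the rewrite rules in .htaccess have to parse the same layout back into an id, so both sides need to be changed together.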