Google Webmaster Tools

Problems, problems, problems…

Have you ever been mithered by nasty web bots which insist on trawling your web content, even though it’s password protected?

Yes, us too. This was happening on our implementation of Liferay.

We knew that the answer lay in the ‘robots.txt’ file on the server; we just couldn’t get it to stop bots accessing the required directories. This issue has been bothering us in our use of Liferay for a while now, so you might be glad to hear that I’ve solved it. This is how!

To make customising the robots.txt file easier for your setup, I suggest that you create an account on Google Webmaster Tools and register the webpages that are affected.

How to create a robots.txt file:
1. On the Webmaster Tools Home page, click the site you want.
2. Under Site configuration, click Crawler access.
3. Click the Generate robots.txt tab.

The lines that you want in your robots.txt file to stop bots from trying to trawl Liferay on a Glassfish implementation are:

User-agent: *
Disallow: /glassfish/domains/domain1/applications/j2ee-modules/liferay-portal
Disallow: /web/guest/home
Disallow: /web/guest
Allow: /
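
If you want to check what these rules do before uploading them, here is a minimal sketch using Python's standard `urllib.robotparser` module. It is only an illustration: the helper names `build_parser` and `is_allowed`, the bot name `ExampleBot`, and the `/about` path are made up for the example, and real crawlers may interpret rules slightly differently from Python's parser.

```python
from urllib.robotparser import RobotFileParser

# The rules from this post, copied verbatim.
ROBOTS_TXT = """\
User-agent: *
Disallow: /glassfish/domains/domain1/applications/j2ee-modules/liferay-portal
Disallow: /web/guest/home
Disallow: /web/guest
Allow: /
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text directly, without fetching anything over HTTP."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def is_allowed(parser: RobotFileParser, path: str, user_agent: str = "ExampleBot") -> bool:
    """Return True if the given (hypothetical) bot may fetch the URL path."""
    return parser.can_fetch(user_agent, path)


if __name__ == "__main__":
    parser = build_parser(ROBOTS_TXT)
    # "/about" is an invented path, used only to show a URL that stays crawlable.
    for path in ("/web/guest/home", "/web/guest", "/about"):
        status = "allowed" if is_allowed(parser, path) else "blocked"
        print(f"{path}: {status}")
```

Rules are matched by URL path prefix, so `Disallow: /web/guest` already covers `/web/guest/home` and anything else whose path starts with `/web/guest`.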

This might not be exactly what you’re after, but hopefully it will point you in the right direction.

