Creating a sitemap with Django
The Sitemaps protocol allows a webmaster to inform search engines about URLs on a website that are available for crawling. A sitemap is an XML file that lists a site’s URLs. It allows webmasters to include additional information about each URL: when it was last updated, how often it is likely to be changed, and how important it is in relation to other URLs on the site. This allows search engines to crawl the site more intelligently.
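To make the format concrete, here is a minimal sketch that builds such a file with Python's standard library instead of Django. The URL, date and values are made-up placeholders, and the helper name build_sitemap is hypothetical. The element names and the namespace come from the Sitemaps protocol [2].

```python
import xml.etree.ElementTree as ET

# Namespace defined by the Sitemaps protocol
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(entries):
    """Return sitemap XML as a string for a list of entry dicts."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for entry in entries:
        url = ET.SubElement(urlset, "url")
        # <loc> is required by the protocol; the other tags are optional hints
        for tag in ("loc", "lastmod", "changefreq", "priority"):
            if tag in entry:
                ET.SubElement(url, tag).text = str(entry[tag])
    return ET.tostring(urlset, encoding="unicode")


# Placeholder data, not taken from a real site
example_entries = [
    {
        "loc": "http://example.com/post/1/",
        "lastmod": "2009-01-01",
        "changefreq": "monthly",
        "priority": 0.5,
    },
]
sitemap_xml = build_sitemap(example_entries)
print(sitemap_xml)
```

Django generates this structure for you; the sketch only shows what ends up in the file.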
This tutorial shows how to generate a sitemap.xml with “django.contrib.sitemaps” in Django, using a simple blog application as the example.
You can download the complete code here [3].
Overview:
First create a new project called “my_project”.
django-admin.py startproject my_project
Change to the newly created folder “my_project” and create an application called “blog”.
cd my_project/; django-admin.py startapp blog
In the settings file (settings.py) you have to choose your database. This example uses sqlite3 with a database file called “database.db”.
DATABASE_ENGINE = 'sqlite3'
DATABASE_NAME = 'database.db'
We also need to add “django.contrib.sitemaps”, “django.contrib.sites” and of course our application “blog” to the “INSTALLED_APPS” tuple.
INSTALLED_APPS = (
    'django.contrib.sites',
    'django.contrib.sitemaps',
    'my_project.blog',
)
The model for the blog entries is called Post.
in my_project/blog/models.py
from django.db import models

class Post(models.Model):
    title = models.CharField(max_length=255)
    text = models.TextField()

    def get_absolute_url(self):
        return "/post/%s/" % (self.id,)
The model “Post” consists of a title field and a larger text field for the content. It also defines the “get_absolute_url” method, which tells Django how to calculate the URL for an object. The resulting URLs look like this: /post/1/
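The URL calculation can be tried without Django. The following sketch uses a hypothetical stand-in class, FakePost, that only carries an id and reuses the same string formatting as the model:

```python
class FakePost:
    """Stand-in for the Post model so the snippet runs without Django."""

    def __init__(self, id):
        self.id = id

    def get_absolute_url(self):
        # Same string formatting as in the Post model
        return "/post/%s/" % (self.id,)


urls = [FakePost(i).get_absolute_url() for i in (1, 2)]
print(urls)
```

In the real model the id is the primary key that Django assigns when the object is saved.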
Now we sync the model into the database.
./manage.py syncdb
To create data you can use the Django/Python shell.
./manage.py shell
First load the Post model, then create two new objects.
>>> from my_project.blog.models import Post
>>> Post(title="first post", text="first text").save()
>>> Post(title="second post", text="second text").save()
The file sitemap.py defines a sitemap class that collects the links for all Post objects. Django calls the method “get_absolute_url” on each object to get its URL.
In the sitemap class you can also state how often a page is likely to change (changefreq), its priority relative to other URLs on the site, and a few other values. Take a look at the documentation for more [1].
in my_project/sitemap.py
from django.contrib.sitemaps import Sitemap
from my_project.blog.models import Post

class PostSitemap(Sitemap):
    changefreq = 'monthly'
    priority = 0.5

    def items(self):
        return Post.objects.all()
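Roughly speaking, the framework walks over items() and combines each object's URL with the class attributes. The sketch below is a simplified illustration of that idea, not Django's actual implementation. The classes FakePost and SimpleSitemap, the method get_urls as written here, and the domain example.com are all assumptions made for the example.

```python
class FakePost:
    """Stand-in for the Post model so the snippet runs without Django."""

    def __init__(self, id):
        self.id = id

    def get_absolute_url(self):
        return "/post/%s/" % (self.id,)


class SimpleSitemap:
    """Simplified illustration of what a sitemap class provides."""

    changefreq = "monthly"
    priority = 0.5

    def __init__(self, objects):
        self._objects = objects

    def items(self):
        return self._objects

    def get_urls(self, domain):
        # One entry per object: full URL plus the class-level hints
        return [
            {
                "location": "http://%s%s" % (domain, obj.get_absolute_url()),
                "changefreq": self.changefreq,
                "priority": self.priority,
            }
            for obj in self.items()
        ]


entries = SimpleSitemap([FakePost(1), FakePost(2)]).get_urls("example.com")
for entry in entries:
    print(entry)
```

In a real project the domain comes from the “django.contrib.sites” application, which is why it has to be listed in INSTALLED_APPS.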
Now we define the URL pattern under which the generated sitemap.xml is served.
in my_project/urls.py
from django.conf.urls.defaults import *
from django.contrib.sitemaps import views as sitemap_views
from my_project.sitemap import PostSitemap

sitemaps = {
    'post': PostSitemap,
}

urlpatterns = patterns('',
    (r'^sitemap\.xml$', sitemap_views.sitemap, {'sitemaps': sitemaps}),
)
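A note on the regular expression: an unescaped dot matches any character, so `^sitemap.xml$` and `^sitemap\.xml$` do not behave the same. The following sketch compares the two with Python's re module. The test paths are made up for illustration.

```python
import re

unescaped = re.compile(r'^sitemap.xml$')   # "." matches any single character
escaped = re.compile(r'^sitemap\.xml$')    # "\." matches a literal dot only

# Made-up request paths to compare both patterns
paths = ["sitemap.xml", "sitemap-xml", "sitemapxxml"]

unescaped_matches = [p for p in paths if unescaped.match(p)]
escaped_matches = [p for p in paths if escaped.match(p)]

print(unescaped_matches)
print(escaped_matches)
```

The unescaped pattern accepts all three paths, while the escaped one accepts only sitemap.xml.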
Now start the development server and point the browser to http://127.0.0.1:8000/sitemap.xml.
./manage.py runserver
The generated XML file should be displayed in your browser.
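If you prefer to check the result in code, the following sketch parses a sitemap and lists the URLs it contains. The sample string is an assumption about roughly what the output for the two posts looks like, and the domain depends on your Sites configuration. The function name extract_locations is hypothetical. In practice you would pass in the body of the response from the development server.

```python
import xml.etree.ElementTree as ET

# Namespace prefix mapping for the Sitemaps protocol
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

# Assumed sample output, for illustration only
sample = (
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
    "<url><loc>http://example.com/post/1/</loc>"
    "<changefreq>monthly</changefreq><priority>0.5</priority></url>"
    "<url><loc>http://example.com/post/2/</loc>"
    "<changefreq>monthly</changefreq><priority>0.5</priority></url>"
    "</urlset>"
)


def extract_locations(xml_text):
    """Return the text of every <loc> element in a sitemap."""
    root = ET.fromstring(xml_text)
    return [loc.text for loc in root.findall("sm:url/sm:loc", NS)]


locations = extract_locations(sample)
print(locations)
```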
Links:
[1] http://www.djangoproject.com/documentation/sitemaps/
[2] http://www.sitemaps.org
[3] http://media.b23.at/download/django_sitemap.tar.gz
1 comment so far
Why, following both your excellently written tutorial and the example on the Django docs page, do I keep getting NoReverseMatch errors?
I can’t stand the NoReverseMatch error. It’s so damned hard to understand.