Site Update Notification System

The Site Update Notification System provides an architecture that lets web sites tell search engines what has changed on their pages, whether in content or in presentation. The distinction matters because it lets a search engine decide whether its index needs updating or whether refreshing the cached representation of a page is enough.

A notification from an updated web site enables a search engine to re-index the affected pages on a web server only when a change has been made, which makes the search engine's crawler and indexing software more efficient. When selecting web sites to visit, the crawler can skip those known not to have changed since its last visit and so make better use of the available bandwidth and processing capacity. In addition, the bandwidth demand on client web servers is reduced, and the search engine index stays up to date with dynamically changing content.
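The decision described above can be sketched as a small planning step on the search engine side. This is only an illustration: the function name plan_crawl, the record fields (site, url, change) and the action labels are assumptions made for the example, not part of the system's published interface.

```python
def plan_crawl(known_sites, notifications):
    """Map each notified URL to an action; sites without notifications are skipped.

    known_sites   -- set of site names the crawler already tracks (hypothetical)
    notifications -- iterable of dicts with 'site', 'url' and 'change' keys,
                     where 'change' is 'content' or 'presentation' (hypothetical)
    """
    actions = {}
    for note in notifications:
        if note["site"] not in known_sites:
            continue  # ignore sites the crawler does not track
        # A content change needs a full re-index; a presentation-only
        # change needs only a refresh of the cached page.
        action = "reindex" if note["change"] == "content" else "refresh_cache"
        # Never downgrade a URL that is already scheduled for re-indexing.
        if actions.get(note["url"]) != "reindex":
            actions[note["url"]] = action
    return actions
```

Any site that produces no notification simply does not appear in the result, which is how the crawler avoids visiting sites known not to have changed.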

Description

A client web server hosts one or more Site Update Notification Agents (SUNAs) that monitor the web site's content for changes. When a change is detected, the SUNA collects information about it and identifies which externally visible web site pages are affected. The SUNA can distinguish changes that affect only content from those that include presentation changes, such as updated graphic images or style sheet definitions. This information is transmitted to a Site Update Notification Service (SUNS) hosted on an internet server. Connectivity can be established using any communication protocol over TCP/IP, for example HTTP/HTTPS, FTP, any other industry standard, or a proprietary protocol developed for the purpose. The data transfer uses an authentication mechanism to prevent unauthorised submission of site update information, which could lead to abuse of the service. The change information submitted by authenticated SUNAs across a plurality of client web servers is collated and stored securely.
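A minimal sketch of the agent side follows, under stated assumptions. The description above does not say how a SUNA detects changes or how submissions are authenticated; this example assumes SHA-256 fingerprints kept separately for content and presentation, and an HMAC signature over a JSON payload with a shared secret. All function names and payload fields are hypothetical.

```python
import hashlib
import hmac
import json


def fingerprint(data: bytes) -> str:
    """Hash page content or presentation assets so snapshots can be compared."""
    return hashlib.sha256(data).hexdigest()


def detect_changes(previous, current):
    """Compare two snapshots mapping path -> (content_hash, presentation_hash).

    New pages are reported as content changes (an assumption of this sketch).
    """
    changes = []
    for path, (content, presentation) in sorted(current.items()):
        old = previous.get(path)
        if old is None or old[0] != content:
            changes.append({"path": path, "change": "content"})
        elif old[1] != presentation:
            changes.append({"path": path, "change": "presentation"})
    return changes


def build_notification(site, changes, secret: bytes):
    """Serialise the changes and sign them (HMAC is an assumed mechanism)."""
    body = json.dumps({"site": site, "changes": changes}, sort_keys=True)
    signature = hmac.new(secret, body.encode("utf-8"), hashlib.sha256).hexdigest()
    return {"body": body, "signature": signature}


def verify_notification(message, secret: bytes) -> bool:
    """Service-side check that rejects unsigned or tampered submissions."""
    expected = hmac.new(
        secret, message["body"].encode("utf-8"), hashlib.sha256
    ).hexdigest()
    return hmac.compare_digest(expected, message["signature"])
```

The signed message could then be carried over whichever transport the deployment uses; the transport itself is left out of the sketch because the system does not mandate one.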

Subscription Service

The Site Update Subscription Service (SUSS) provides site update information to subscribers. The service delivers, on demand, change information for one or more specific sites, or for all sites, within a given time period. The information is returned to an authenticated subscriber as an XML document. Typical subscribers to the SUSS are those that currently crawl internet web sites for content or that benefit from knowing when content has changed. The largest of these are web search engines; others include page monitors, bookmark managers, offline browsers and web site mirroring tools.
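The on-demand query can be illustrated with a short sketch that filters stored change records by site and time period and serialises the result as XML. The XML schema is not given in the description above, so the element and attribute names (siteUpdates, update, url, changedAt) and the record fields are assumptions made for this example only.

```python
import xml.etree.ElementTree as ET


def changes_to_xml(records, since, until, sites=None):
    """Return an XML string of changes in [since, until).

    records -- iterable of dicts with 'site', 'url', 'change' and
               'changed_at' (a datetime) keys; all hypothetical
    sites   -- optional set of site names; None means all sites
    """
    root = ET.Element("siteUpdates")
    for rec in records:
        if sites is not None and rec["site"] not in sites:
            continue  # subscriber asked for specific sites only
        if not (since <= rec["changed_at"] < until):
            continue  # outside the requested time period
        update = ET.SubElement(root, "update", site=rec["site"], type=rec["change"])
        ET.SubElement(update, "url").text = rec["url"]
        ET.SubElement(update, "changedAt").text = rec["changed_at"].isoformat()
    return ET.tostring(root, encoding="unicode")
```

Passing sites=None corresponds to a request for all sites, while a set of names restricts the document to the subscriber's chosen sites. Authentication of the subscriber is assumed to happen before this step.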

An alternative subscription service is available to consumers as an RSS feed. This XML document carries a higher-level change notification alert, providing an end-user page monitoring service built on a standard technology.


Read more about the Site Update Notification Agents.

Site Update Notification Architecture