Customising XSite
XSite under the hood
XSite can be customised using the following main components:
- SitemapLoader: defaults to XStream-based loader
- PageExtractor: defaults to SiteMesh-based page extractor
- Skin: defaults to Freemarker-based skin
- LinkValidators: a list of link validators
- FileSystem: defaults to Commons-IO filesystem
All default components can be overridden in the composition resource. The
binary distribution ships one as conf/xsite.xml; a different resource can be
specified via the '-f' (file resource) or '-R' (classpath resource) option.
SiteMesh components
By default, content extraction is done by SiteMesh. SiteMesh not only extracts the content but can also modify it on the fly using tag rules and text filters. XSite already uses some of these internally; others are available, either from SiteMesh directly or from XSite. Implementations of both types (tag rules and text filters) can be registered in the composition resource.
Text Filter
Text filters are applied to the extracted text value. The following TextFilter implementations are available:
- RegexReplacementTextFilter
- Text replacement with regular expressions
- MailToLinkTextFilter
- Syntax to inject a mailto: link with parameters into the text of a page.
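The idea behind a regex-based text filter can be sketched in plain Java as follows. This is only an illustration using java.util.regex: the class name RegexReplacementSketch, its filter method, and the example.org URL are hypothetical and are not the actual SiteMesh or XSite API.

```java
import java.util.regex.Pattern;

// Hypothetical stand-in for a regex-based text filter.
// It does not implement the real SiteMesh TextFilter interface.
final class RegexReplacementSketch {
    private final Pattern pattern;
    private final String replacement;

    RegexReplacementSketch(String regex, String replacement) {
        this.pattern = Pattern.compile(regex);
        this.replacement = replacement;
    }

    // Replaces every match in the extracted text; "$1" in the
    // replacement refers to the first capturing group.
    String filter(String text) {
        return pattern.matcher(text).replaceAll(replacement);
    }

    public static void main(String[] args) {
        // Assumed example: turn issue keys into links (URL is made up).
        RegexReplacementSketch filter = new RegexReplacementSketch(
                "JIRA:([A-Z]+-[0-9]+)",
                "<a href=\"https://issues.example.org/browse/$1\">$1</a>");
        System.out.println(filter.filter("Fixed in JIRA:XSITE-12."));
    }
}
```

In XSite itself, the corresponding filter would be registered in the composition resource rather than instantiated by hand.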
Tag Rules
A TagRule processes HTML tags. Note that it is not possible to process the same tag twice with different rules. The following TagRule implementations are available and useful in the context of XSite:
- FramesetRule
- The rule will simply add the property "frameset" with value "true" to the properties of the current Page model.
- HtmlAttributesRule
- The rule will add all attributes of the HTML tag as properties to the current Page model.
- MetaTagRule
- The rule will add all attributes of the meta tags in the HTML header as properties to the current Page model.
- ParameterExtractingRule
- The rule will remove any tag named "parameter" and add its value as a property, named after the tag's "name" attribute, to the current Page model.
- TagReplaceRule
- Replace the current tag with a different one.
- AddClassAttributeToFirstHeaderRule
- Add a class attribute to the first header tag.
- H1ToTitleRule
- Extract the content of the first H1 tag and provide it as the title in the page model.
- ImgAttributesRule
- Extracts attributes for an image from the src attribute.
- DropDivOfClassSectionRule
- Special rule for Doxia/Maven generated HTML pages that nest any header tag (H1-6) into div elements with class "section". The rule will remove these div tags and therefore unnest the inner elements again.
- TopLevelBlockExtractingRule
- Special rule for the body of the HTML page that extracts all top-level HTML block elements as individual paragraphs. The rule will add the number of paragraphs as the property "paragraphs" and the individual paragraphs as properties "paragraph.<i>", with i starting at 0. For convenience, the Page model also provides the paragraphs directly as a list.
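The property naming used by TopLevelBlockExtractingRule can be illustrated with a small, self-contained sketch. It only models the "paragraphs" and "paragraph.<i>" keys described above with a plain map; the class ParagraphPropertiesSketch and its methods are hypothetical and do not reflect how XSite or SiteMesh store page properties internally.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of the paragraph properties; not XSite's Page class.
final class ParagraphPropertiesSketch {

    // Stores the block count under "paragraphs" and each block
    // under "paragraph.<i>", with i starting at 0.
    static Map<String, String> toProperties(List<String> blocks) {
        Map<String, String> properties = new LinkedHashMap<>();
        properties.put("paragraphs", String.valueOf(blocks.size()));
        for (int i = 0; i < blocks.size(); i++) {
            properties.put("paragraph." + i, blocks.get(i));
        }
        return properties;
    }

    // Rebuilds the ordered paragraph list from the properties.
    static List<String> toParagraphs(Map<String, String> properties) {
        int count = Integer.parseInt(properties.getOrDefault("paragraphs", "0"));
        List<String> paragraphs = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            paragraphs.add(properties.get("paragraph." + i));
        }
        return paragraphs;
    }

    public static void main(String[] args) {
        List<String> blocks = Arrays.asList("<h1>Title</h1>", "<p>First</p>", "<p>Second</p>");
        Map<String, String> properties = toProperties(blocks);
        System.out.println(properties.get("paragraphs"));
        System.out.println(properties.get("paragraph.0"));
        System.out.println(toParagraphs(properties));
    }
}
```

A skin template would normally iterate over the list offered by the Page model instead of reading the indexed properties one by one.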