PJ on Development
Thoughts and babbling about software development. Visual Basic, .NET platform, web, etc.
Title: BBCode Parser
Description: A very flexible, extendable and configurable parser for BBCode.
Author: Paulo Santos
eMail: pjondevelopment@gmail.com
Environment: VB.NET 2008 (Visual Studio 2008)
Keywords: BBCode Parser Tokenizer
BBCode Parser
- Download Source Code - 46,53 KB
- Schema for XML Configuration File - 3,16 KB
DISCLAIMER
The Software is provided "AS IS", without warranty of any kind, express or implied, including but not limited to the warranties of merchantability, fitness for a particular purpose and non-infringement. In no event shall the authors or copyright holders be liable for any claim, damages or other liability, whether in an action of contract, tort or otherwise, arising from, out of or in connection with the software or the use or other dealings in the software.
Introduction
I've spent the last few weeks improving the website of the company I work for. One of my tasks was to let end users edit rich-text content while giving management enough peace of mind that users wouldn't mess with the site's design and configuration.
So I proposed using BBCode, which is common on Internet forums (phpBB and Snitz, among others) and has a very low learning curve, if any, for our user base.
The problem is that the few BBCode parsers I found online weren't flexible and extensible enough to meet our needs. So, like any developer, I rolled up my sleeves and coded one myself.
Parsing the BBCode
The first thing I took into account when creating this parser was that it needed to be flexible: it should accept as many tags as wanted and format the text in any way possible.
The heart of the parser is a tokenizer I created a while back for another project that never took off. It was robust enough for this project, so I brought it along.
With the tokenizer in place, the rest of the code pretty much wrote itself, and the desired objective was accomplished.
Within the syntax of BBCode, a tag starts with an open square bracket '[' and ends with a close square bracket ']'. Anything between these two markers is considered a tag.
BBCode Tags
There are basically three types of tags:
- Simple Tags: [tag]
- Value Tags: [name=value]
- Parametrized Tags: [tag param=value]
Each tag may or may not require a closing tag. A closing tag has the same name as the opening tag, preceded by a forward slash '/'.
The name of a tag can contain any character except the equals sign (=) and the square brackets ([ and ]).
Yes, I know that some parsers out there allow square brackets within a tag, but this is a limitation I can live with.
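To make the three tag shapes concrete, here is a minimal, self-contained sketch that classifies a single bracketed token. It is only an illustration and not the parser's actual tokenizer: the TagKind enum, the TagClassifier module, the [img width=100] sample tag, and the rule "a space before the first equals sign means a parametrized tag" are all assumptions made for this example.

```vbnet
Imports System

' Hypothetical categories, named after the tag types described above.
Public Enum TagKind
    NotATag
    Simple        ' [tag]
    Value         ' [name=value]
    Parametrized  ' [tag param=value]
    Closing       ' [/tag]
End Enum

Public Module TagClassifier
    ' Classifies one token such as "[b]" or "[url=...]".
    ' Illustration only; the real parser relies on its own tokenizer.
    Public Function Classify(ByVal token As String) As TagKind
        If token Is Nothing OrElse token.Length < 3 Then Return TagKind.NotATag
        If token(0) <> "["c OrElse token(token.Length - 1) <> "]"c Then Return TagKind.NotATag

        Dim body As String = token.Substring(1, token.Length - 2).Trim()
        If body.Length = 0 Then Return TagKind.NotATag

        ' Square brackets are not allowed inside a tag.
        If body.IndexOfAny(New Char() {"["c, "]"c}) >= 0 Then Return TagKind.NotATag

        If body(0) = "/"c Then
            Return If(body.Length > 1, TagKind.Closing, TagKind.NotATag)
        End If

        Dim firstEquals As Integer = body.IndexOf("="c)
        If firstEquals < 0 Then Return TagKind.Simple

        ' Assumption: a space before the first "=" separates the tag name
        ' from its parameters; otherwise the tag is a value tag.
        Dim firstSpace As Integer = body.IndexOf(" "c)
        If firstSpace >= 0 AndAlso firstSpace < firstEquals Then Return TagKind.Parametrized

        Return TagKind.Value
    End Function

    Public Sub Main()
        Console.WriteLine(Classify("[b]").ToString())              ' Simple
        Console.WriteLine(Classify("[url=http://pjondevelopment.50webs.com]").ToString()) ' Value
        Console.WriteLine(Classify("[img width=100]").ToString())  ' Parametrized
        Console.WriteLine(Classify("[/b]").ToString())             ' Closing
        Console.WriteLine(Classify("plain text").ToString())       ' NotATag
    End Sub
End Module
```

Note that a value such as [quote=John Smith] still counts as a value tag under this rule, because the space comes after the first equals sign.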
Configuring Tags
While there are some basic tags common to most BBCode parsers, there is no standard, although there have been a few attempts to create one, for instance BBCode.org.
With this in mind, this parser is very configurable and flexible. It can be configured either programmatically or through an XML file.
To configure the parser programmatically, add tag definitions to the BBCodeParser.Dictionary collection, as shown in the example below.
Dim parser As New BBCodeParser()
parser.Dictionary.Add("b", "<b>{value}</b>", True)
parser.Dictionary.Add("u", "<u>{value}</u>", True)
parser.Dictionary.Add("i", "<i>{value}</i>", True)
The last parameter indicates that the tag requires a closing tag. With the configuration above the text
[b][u][i]Sample Text[/i][/u][/b]
would be formatted as bold, underlined, italic text:
<b><u><i>Sample Text</i></u></b>
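For completeness, the configuration can be followed all the way to the formatted output with the Parse and Format methods listed in the class reference further down. This is a usage sketch, not a tested program: it assumes the classes from the downloadable source are compiled into (or referenced by) the project, and that an Imports line is added if they live in a namespace. The module name BasicUsage is made up for the example.

```vbnet
Imports System

' Assumption: BBCodeParser and BBCodeDocument come from the downloadable
' source and are visible here without further qualification.
Module BasicUsage
    Sub Main()
        Dim parser As New BBCodeParser()
        parser.Dictionary.Add("b", "<b>{value}</b>", True)
        parser.Dictionary.Add("u", "<u>{value}</u>", True)
        parser.Dictionary.Add("i", "<i>{value}</i>", True)

        ' Parse returns a BBCodeDocument; Format() renders it as HTML by default.
        Dim document As BBCodeDocument = _
            parser.Parse("[b][u][i]Sample Text[/i][/u][/b]")

        Console.WriteLine(document.Format())
    End Sub
End Module
```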
Tag Parameters
Any tag can have any number of parameters. Each parameter is a name=value pair. The value can be enclosed in single (') or double (") quotes.
These parameters can then be used in placeholders within the replacement text format.
There are two special parameters:
- default: Only used for value tags; it holds the text after the first equals sign in the tag.
- value: Used for any tag that requires a closing tag; it represents the formatted text inside the tag.
For example, with the configuration below:
parser.Dictionary.Add _
    ("url", _
     "<a href='{default|value}'>{value|default}</a>", _
     True)
the text [url=http://pjondevelopment.50webs.com]here[/url] would be rendered as a link pointing to this site.
Note that in the replacement format text, placeholders are delimited by '{' and '}'. If more than one parameter can be used in a placeholder, the names are separated by pipes '|'. When formatting, the first non-empty parameter is used.
So, in the example above, we have two placeholders: {default|value} and {value|default}. The first placeholder uses the value of the default parameter or, if that is empty, the formatted text between [url=...] and [/url]. The second does the opposite: it uses the formatted text first and falls back to the default parameter.
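The "first non-empty parameter wins" rule can be sketched in a few lines. The code below is a standalone illustration, not the parser's own implementation: the PlaceholderSketch module and its Fill function are hypothetical names, and replacing a placeholder with an empty string when none of its parameters has a value is an assumption.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Text
Imports System.Text.RegularExpressions

Public Module PlaceholderSketch
    ' Replaces each {a|b|c} in the template with the first non-empty
    ' parameter among a, b and c. If none has a value, the placeholder
    ' is replaced by an empty string (an assumption of this sketch).
    Public Function Fill(ByVal template As String, _
                         ByVal parameters As IDictionary(Of String, String)) As String
        Dim result As New StringBuilder()
        Dim position As Integer = 0

        For Each m As Match In Regex.Matches(template, "\{([^{}]*)\}")
            ' Copy the literal text that precedes this placeholder.
            result.Append(template, position, m.Index - position)

            For Each name As String In m.Groups(1).Value.Split("|"c)
                Dim value As String = Nothing
                If parameters.TryGetValue(name.Trim(), value) _
                   AndAlso Not String.IsNullOrEmpty(value) Then
                    result.Append(value)
                    Exit For
                End If
            Next

            position = m.Index + m.Length
        Next

        ' Copy whatever literal text follows the last placeholder.
        result.Append(template, position, template.Length - position)
        Return result.ToString()
    End Function

    Public Sub Main()
        Dim template As String = "<a href='{default|value}'>{value|default}</a>"

        ' [url=http://pjondevelopment.50webs.com]here[/url]
        Dim both As New Dictionary(Of String, String)()
        both("default") = "http://pjondevelopment.50webs.com"
        both("value") = "here"
        Console.WriteLine(Fill(template, both))
        ' <a href='http://pjondevelopment.50webs.com'>here</a>

        ' [url]http://pjondevelopment.50webs.com[/url]
        Dim valueOnly As New Dictionary(Of String, String)()
        valueOnly("value") = "http://pjondevelopment.50webs.com"
        Console.WriteLine(Fill(template, valueOnly))
        ' <a href='http://pjondevelopment.50webs.com'>http://pjondevelopment.50webs.com</a>
    End Sub
End Module
```

The second call shows why both placeholders list two names: when the tag has no default parameter, the inner text serves as both the address and the link text.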
Extending Tags
Besides tag definitions, the parser can be extended even further by creating a class derived from BBCodeElement and adding it to the BBCodeParser.ElementTypes collection. With custom types the possibilities are endless: the class might read its content from a file, a database, or anywhere else, based or not on the tag parameters.
Just note that the derived type must provide a public default constructor.
parser.ElementTypes.Add _
    ("userImage", _
     GetType(SubClassOfBBCodeElement))
With this configuration, every time the text [userImage] is found, it will be replaced by whatever text the class SubClassOfBBCodeElement provides through the method Format(ITextFormatter).
Saving and Loading Configurations
Calling the SaveConfiguration method saves the configuration of all elements to an XML file, which can later be loaded with the LoadConfiguration method.
The BBCode XML Configuration File has the following structure:
<?xml version="1.0" encoding="utf-8"?>
<Configuration>
  <ElementTypes>
    <element name="tagName"
             type="SubClassOfBBCodeElement, AssemblyQualifiedName" />
  </ElementTypes>
  <Dictionary>
    <tag name="tagName" requireClosingTag="True|False">
      <!-- XHTML Replacement Text -->
      Use {parameter} as a placeholder for any variable portion of the text.
    </tag>
    <tag name="otherTagName" requireClosingTag="True|False" escaped="True">
      Use escaped="True" if the text is not valid XML, escaping all the necessary XML characters.
    </tag>
  </Dictionary>
</Configuration>
A tag can appear in either or both sections. If a tag is in both sections of the configuration file, it will be created using the type specified in the <ElementTypes> section, with its BBCodeElement.ReplacementFormat property set to the text from the <Dictionary> section.
The order of the sections is not important; however, every tag must be described so the BBCode parser can format it properly. If a tag is not found in either section, it will be rendered as plain text, ignoring the fact that it is a tag.
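A round trip through a configuration file might look like the sketch below. It uses only the SaveConfiguration(fileName) and LoadConfiguration(fileName) overloads listed in the class reference, and it assumes the classes from the downloadable source are available to the project. The file name bbcode.config.xml and the module name are made up for the example.

```vbnet
Imports System

' Assumption: BBCodeParser comes from the downloadable source and is
' visible here without further qualification.
Module ConfigurationRoundTrip
    Sub Main()
        Dim parser As New BBCodeParser()
        parser.Dictionary.Add("b", "<b>{value}</b>", True)

        ' Hypothetical file name, chosen for this example.
        parser.SaveConfiguration("bbcode.config.xml")

        ' A fresh parser picks up the same tag definitions from the file.
        Dim restored As New BBCodeParser()
        restored.LoadConfiguration("bbcode.config.xml")

        Console.WriteLine(restored.Parse("[b]bold[/b]").Format())
    End Sub
End Module
```

Keeping the tag definitions in a file like this means new tags can be added without recompiling the site.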
BBCodeParser Class
Public Properties
- Dictionary: Read Only. Gets the dictionary of known elements of the BBCodeParser class.
  ReadOnly Property Dictionary() _
- ElementTypes: Read Only. Gets the dictionary of known custom elements of the BBCodeParser class.
  ReadOnly Property ElementTypes() _
Public Methods
- LoadConfiguration: Loads the configuration (Dictionary and ElementTypes) from the specified file, Stream or TextReader.
  Sub LoadConfiguration(ByVal fileName As String)
  Sub LoadConfiguration(ByVal stream As IO.Stream)
  Sub LoadConfiguration(ByVal reader As IO.TextReader)
- SaveConfiguration: Saves the configuration (Dictionary and ElementTypes) to the specified file, Stream or TextWriter.
  Sub SaveConfiguration(ByVal fileName As String)
- Parse: Parses the BBCode, returning a BBCodeDocument.
  Function Parse(ByVal text As String) As BBCodeDocument
BBCodeDocument Class
Public Properties
- Nodes: Read Only. Gets the collection of BBCodeNode parsed by the BBCodeParser.
  ReadOnly Property Nodes() As BBCodeNodeCollection
- Text: Gets or sets the BBCode of the BBCodeDocument.
  Property Text() As String
Public Methods
- Format: Formats the BBCode into HTML by default, or any other format using an implementation of the ITextFormatter interface.
  Function Format() As String
  Function Format(ByVal formatter As ITextFormatter) _
BBCodeNode Class
Public Properties
- InnerBBCode: Gets or sets the BBCode for the node.
  Property InnerBBCode() As String
- InnerText: Gets or sets the text of the BBCodeNode.
  Property InnerText() As String
- OuterBBCode: Read Only. Gets the BBCode for the complete node.
  ReadOnly Property OuterBBCode() As String
- Parent: Read Only. Gets the node above the current BBCodeNode.
  ReadOnly Property Parent() As BBCodeNode
Public Methods
- Format: Formats the BBCode using an implementation of the ITextFormatter interface.
  Function Format(ByVal formatter As ITextFormatter) _
BBCodeElement Class
Inherits from BBCodeNode
Public Properties
- Attributes: Read Only. Gets a collection of attributes of the BBCodeElement.
  ReadOnly Property Attributes() _
- InnerBBCode: Inherited from BBCodeNode.
- InnerText: Inherited from BBCodeNode.
- Name: Read Only. Gets the name of the tag.
  ReadOnly Property Name() As String
- Nodes: Read Only. Gets a collection of BBCodeNode children of the current BBCodeElement.
  ReadOnly Property Nodes() _
- OuterBBCode: Inherited from BBCodeNode.
- Parent: Inherited from BBCodeNode.
- ReplacementFormat: Read Only. Gets the string that describes how the element should be formatted by the ITextFormatter.
  ReadOnly Property ReplacementFormat() As String
- RequireClosingTag: Read Only. Gets a value indicating if the tag requires a closing tag.
  ReadOnly Property RequireClosingTag() As Boolean
Public Methods
- Format: Inherited from BBCodeNode.