MarkdownExtra
Markdown Extra Parser Class
- Full name: \Michelf\MarkdownExtra
- Parent class: \Michelf\Markdown
Properties
fn_id_prefix
Prefix for footnote ids.
fn_link_title
Optional title attribute for footnote links.
fn_link_class
Optional class attribute for footnote links and backlinks.
fn_backlink_class
fn_backlink_html
Content to be displayed within footnote backlinks. The default is '↩'; the U+FE0E on the end is a Unicode variant selector used to prevent iOS from displaying the arrow character as an emoji.
Optionally use '^^' and '%%' to refer to the footnote number and reference number respectively. See parseFootnotePlaceholders().
fn_backlink_title
Optional title and aria-label attributes for footnote backlinks for added accessibility (to ensure backlink uniqueness).
Use '^^' and '%%' to refer to the footnote number and reference number respectively. See parseFootnotePlaceholders().
fn_backlink_label
table_align_class_tmpl
Class name for table cell alignment, where %% is replaced by left, center, or right. For instance, 'go-%%' becomes 'go-left', 'go-right', or 'go-center'. If empty, the align attribute is used instead of a class name.
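The template substitution described above can be sketched as follows. This is a minimal illustration, not the library's code: `alignAttrSketch` is a hypothetical standalone function, and the real parser may additionally escape the resulting value.

```php
<?php
// Hypothetical helper, not part of the library: shows how a template such as
// 'go-%%' could be expanded for a given alignment name.
function alignAttrSketch(string $tmpl, string $alignname): string
{
    if ($tmpl === '') {
        // Empty template: fall back to the plain align attribute.
        return ' align="' . $alignname . '"';
    }
    // Replace every %% placeholder with the alignment name.
    $classname = str_replace('%%', $alignname, $tmpl);
    return ' class="' . $classname . '"';
}

echo alignAttrSketch('go-%%', 'center'), "\n";
echo alignAttrSketch('', 'right'), "\n";
```

With a non-empty template the cell gets a class attribute; with an empty template it gets the align attribute instead.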
code_class_prefix
Optional class prefix for fenced code block.
code_attr_on_pre
Class attribute for code blocks goes on the `code` tag; setting this to true will put attributes on the `pre` tag instead.
predef_abbr
Predefined abbreviations.
hashtag_protection
Only convert atx-style headers if there's a space between the header and #
omit_footnotes
Determines whether footnotes should be appended to the end of the document.
If true, footnote html can be retrieved from $this->footnotes_assembled.
footnotes_assembled
After parsing, the HTML for the list of footnotes appears here.
This is available only if $omit_footnotes == true.
Note: when placing the content of footnotes_assembled on the page, consider adding the attribute `role="doc-endnotes"` to the `div` or `section` that will enclose the list of footnotes so they are reachable to accessibility tools the same way they would be with the default HTML output.
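One way to follow the note above is sketched here. `wrapEndnotesSketch` is a hypothetical helper, and the `footnotes` class name on the wrapper is an assumption. The guarded usage block relies only on the omit_footnotes and footnotes_assembled properties and the transform method documented on this page, and runs only if the library can be autoloaded.

```php
<?php
// Hypothetical helper (not part of the library): wraps already-assembled
// footnote HTML in an element that assistive tools can locate.
function wrapEndnotesSketch(string $footnotesHtml): string
{
    if (trim($footnotesHtml) === '') {
        return ''; // nothing to wrap when there are no footnotes
    }
    return "<section class=\"footnotes\" role=\"doc-endnotes\">\n"
        . $footnotesHtml
        . "\n</section>";
}

// Usage with the real parser, guarded so the sketch still runs without it.
// Assumes an autoloader for Michelf\MarkdownExtra has been registered.
if (class_exists('Michelf\MarkdownExtra')) {
    $parser = new \Michelf\MarkdownExtra();
    $parser->omit_footnotes = true;
    $body = $parser->transform("Text with a note.[^1]\n\n[^1]: The note.");
    // Cast to string in case no footnote HTML was assembled.
    echo $body, wrapEndnotesSketch((string) $parser->footnotes_assembled);
}
```

Keeping the footnotes separate from the body makes it possible to place them elsewhere in a page template, for example after a comments section.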
footnotes
Extra variables used during extra transformations.
footnotes_ordered
footnotes_ref_count
footnotes_numbers
abbr_desciptions
abbr_word_re
footnote_counter
Give the current footnote number.
ref_attr
Ref attribute for links
id_class_attr_catch_re
Expression to use to catch attributes (includes the braces)
id_class_attr_nocatch_re
Expression to use when parsing in a context when no capture is desired
block_tags_re
Tags that are always treated as block tags
context_block_tags_re
Tags treated as block tags only if the opening tag is alone on its line
contain_span_tags_re
Tags where markdown="1" defaults to span mode
clean_tags_re
Tags which must not have their contents modified, no matter where they appear
auto_close_tags_re
Tags that do not need to be closed.
em_relist
Redefining emphasis markers so that emphasis by underscore does not work in the middle of a word.
strong_relist
Define the strong operators with their regex matches
em_strong_relist
Define the emphasis + strong operators with their regex matches
Methods
__construct
Constructor function. Initialize the parser object.
setup
Setting up Extra-specific variables.
teardown
Clearing Extra-specific variables.
doExtraAttributes
Parse attributes caught by the $this->id_class_attr_catch_re expression and return the HTML-formatted list of attributes.
protected doExtraAttributes(string $tag_name, string $attr, mixed $defaultIdValue = null, array $classes = array()): string
Currently supported attributes are .class and #id.
In addition, this method also supports supplying a default Id value, which will be used to populate the id attribute in case it was not overridden.
Parameters:
Parameter | Type | Description |
---|---|---|
$tag_name | string | |
$attr | string | |
$defaultIdValue | mixed | |
$classes | array | |
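The handling of .class and #id tokens, together with the default id, can be sketched as below. This is a simplified standalone illustration under stated assumptions: `extraAttributesSketch` is a hypothetical function, its regular expression is chosen for this example, and the tag name parameter is omitted because the sketch does not need it.

```php
<?php
// Hypothetical sketch, not the library's implementation: turns an attribute
// block such as '{#intro .lead}' into an HTML attribute string.
function extraAttributesSketch(string $attr, ?string $defaultId = null, array $classes = []): string
{
    // Collect every ".name" (class) and "#name" (id) token.
    preg_match_all('/([.#])([-_:a-zA-Z0-9]+)/', $attr, $matches, PREG_SET_ORDER);

    $id = null;
    foreach ($matches as $m) {
        if ($m[1] === '.') {
            $classes[] = $m[2];
        } elseif ($id === null) {
            $id = $m[2]; // only the first id is kept
        }
    }
    if ($id === null) {
        $id = $defaultId; // fall back to the supplied default
    }

    $out = '';
    if ($id !== null && $id !== '') {
        $out .= ' id="' . htmlspecialchars($id, ENT_QUOTES) . '"';
    }
    if ($classes) {
        $out .= ' class="' . htmlspecialchars(implode(' ', $classes), ENT_QUOTES) . '"';
    }
    return $out;
}

echo extraAttributesSketch('{#intro .lead .wide}'), "\n";
echo extraAttributesSketch('{.note}', 'fallback-id'), "\n";
```

The second call shows the default id being used because the attribute block does not supply one.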
stripLinkDefinitions
Strips link definitions from text, stores the URLs and titles in hash references.
Parameters:
Parameter | Type | Description |
---|---|---|
$text | string | |
_stripLinkDefinitions_callback
Strip link definition callback
Parameters:
Parameter | Type | Description |
---|---|---|
$matches | array | |
hashHTMLBlocks
Hashify HTML Blocks and "clean tags".
We only want to do this for block-level HTML tags, such as headers, lists, and tables. That's because we still want to wrap `<p>`s around "paragraphs" that are wrapped in non-block-level tags, such as anchors, phrase emphasis, and spans. The list of tags we're looking for is hard-coded.
This works by calling _hashHTMLBlocks_inMarkdown, which then calls _hashHTMLBlocks_inHTML when it encounters block tags. When the markdown="1" attribute is found within a tag, _hashHTMLBlocks_inHTML calls back _hashHTMLBlocks_inMarkdown to handle the Markdown syntax within the tag. The two functions call each other recursively.
Parameters:
Parameter | Type | Description |
---|---|---|
$text | string | |
_hashHTMLBlocks_inMarkdown
Parse markdown text, calling _hashHTMLBlocks_inHTML for block tags.
protected _hashHTMLBlocks_inMarkdown(string $text, int $indent, string $enclosing_tag_re = '', bool $span = false): array
- $indent is the number of spaces to be ignored when checking for code blocks. This is important because if we don't take the indent into account, something like this (which looks right) won't work as expected:

      <div>
          <div markdown="1">
              Hello World.  <-- Is this a Markdown code block or text?
          </div>  <-- Is this a Markdown code block or a real tag?
      </div>

  If you don't like this, just don't indent the tag on which you apply the markdown="1" attribute.
- If $enclosing_tag_re is not empty, stops at the first unmatched closing tag with that name. Nested tags supported.
- If $span is true, text inside must be treated as span, so any double newline will be replaced by a single newline so that it does not create paragraphs.

Returns an array of that form: ( processed text , remaining text )
Parameters:
Parameter | Type | Description |
---|---|---|
$text | string | |
$indent | int | |
$enclosing_tag_re | string | |
$span | bool | |
_hashHTMLBlocks_inHTML
Parse HTML, calling _hashHTMLBlocks_inMarkdown for block tags.
- Calls $hash_method to convert any blocks.
- Stops when the first opening tag closes.
- $md_attr indicates whether the use of the `markdown="1"` attribute is allowed. (It is not inside clean tags.)
Returns an array of that form: ( processed text , remaining text )
Parameters:
Parameter | Type | Description |
---|---|---|
$text | string | |
$hash_method | string | |
$md_attr | bool | Handle `markdown="1"` attribute |
hashClean
Called whenever a tag must be hashed. When a function inserts a "clean" tag in $text, it passes through this function and is automatically escaped, blocking invalid nested overlap.
Parameters:
Parameter | Type | Description |
---|---|---|
$text | string | |
doAnchors
Turn Markdown link shortcuts into XHTML `<a>` tags.
Parameters:
Parameter | Type | Description |
---|---|---|
$text | string | |
_doAnchors_reference_callback
Callback for reference anchors
Parameters:
Parameter | Type | Description |
---|---|---|
$matches | array | |
_doAnchors_inline_callback
Callback for inline anchors
Parameters:
Parameter | Type | Description |
---|---|---|
$matches | array | |
doImages
Turn Markdown image shortcuts into `<img>` tags.
Parameters:
Parameter | Type | Description |
---|---|---|
$text | string | |
_doImages_reference_callback
Callback for referenced images
Parameters:
Parameter | Type | Description |
---|---|---|
$matches | array | |
_doImages_inline_callback
Callback for inline images
Parameters:
Parameter | Type | Description |
---|---|---|
$matches | array | |
doHeaders
Process markdown headers. Redefined to add ID and class attribute support.
Parameters:
Parameter | Type | Description |
---|---|---|
$text | string | |
_doHeaders_callback_setext
Callback for setext headers
Parameters:
Parameter | Type | Description |
---|---|---|
$matches | array | |
_doHeaders_callback_atx
Callback for atx headers
Parameters:
Parameter | Type | Description |
---|---|---|
$matches | array | |
doTables
Form HTML tables.
Parameters:
Parameter | Type | Description |
---|---|---|
$text | string | |
_doTable_leadingPipe_callback
Callback for removing the leading pipe for each row
Parameters:
Parameter | Type | Description |
---|---|---|
$matches | array | |
_doTable_makeAlignAttr
Make the align attribute in a table
Parameters:
Parameter | Type | Description |
---|---|---|
$alignname | string | |
_doTable_callback
Callback for processing tables
Parameters:
Parameter | Type | Description |
---|---|---|
$matches | array | |
doDefLists
Form HTML definition lists.
Parameters:
Parameter | Type | Description |
---|---|---|
$text | string | |
_doDefLists_callback
Callback for processing definition lists
Parameters:
Parameter | Type | Description |
---|---|---|
$matches | array | |
processDefListItems
Process the contents of a single definition list, splitting it into individual term and definition list items.
Parameters:
Parameter | Type | Description |
---|---|---|
$list_str | string | |
_processDefListItems_callback_dt
Callback for `<dt>` elements in definition lists
Parameters:
Parameter | Type | Description |
---|---|---|
$matches | array | |
_processDefListItems_callback_dd
Callback for `<dd>` elements in definition lists
Parameters:
Parameter | Type | Description |
---|---|---|
$matches | array | |
doFencedCodeBlocks
Adds the fenced code block syntax to regular Markdown.
Parameters:
Parameter | Type | Description |
---|---|---|
$text | string | |
_doFencedCodeBlocks_callback
Callback to process fenced code blocks
Parameters:
Parameter | Type | Description |
---|---|---|
$matches | array | |
_doFencedCodeBlocks_newlines
Replace new lines in fenced code blocks
Parameters:
Parameter | Type | Description |
---|---|---|
$matches | array | |
formParagraphs
Parse text into paragraphs
Parameters:
Parameter Type Description $text
Parameter | Type | Description |
---|---|---|
$text | string | String to process in paragraphs |
$wrap_in_p | bool | Whether paragraphs should be wrapped in `<p>` tags |

Return Value:
HTML output
stripFootnotes
Footnotes - Strips footnote definitions from text and stores their content in hash references.
Parameters:
Parameter | Type | Description |
---|---|---|
$text | string | |
_stripFootnotes_callback
Callback for stripping footnotes
Parameters:
Parameter | Type | Description |
---|---|---|
$matches | array | |
doFootnotes
Replace footnote references in $text [^id] with a special text-token which will be replaced by the actual footnote marker in appendFootnotes.
Parameters:
Parameter | Type | Description |
---|---|---|
$text | string | |
appendFootnotes
Append footnote list to text
Parameters:
Parameter | Type | Description |
---|---|---|
$text | string | |
_doFootnotes
Generates the HTML for footnotes. Called by appendFootnotes, even if footnotes are not being appended.
_appendFootnotes_callback
Callback for appending footnotes
Parameters:
Parameter | Type | Description |
---|---|---|
$matches | array | |
parseFootnotePlaceholders
Build footnote label by evaluating any placeholders.
protected parseFootnotePlaceholders(string $label, int $footnote_number, int $reference_number): string
- ^^ footnote number
- %% footnote reference number (Nth reference to footnote number)
Parameters:
Parameter | Type | Description |
---|---|---|
$label | string | |
$footnote_number | int | |
$reference_number | int | |
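The placeholder substitution described above can be sketched as a standalone function. `footnoteLabelSketch` is a hypothetical name used for illustration; the documented method is protected and cannot be called from outside the class.

```php
<?php
// Hypothetical standalone sketch of the placeholder substitution:
// '^^' stands for the footnote number, '%%' for the reference number.
function footnoteLabelSketch(string $label, int $footnoteNumber, int $referenceNumber): string
{
    return str_replace(
        ['^^', '%%'],
        [(string) $footnoteNumber, (string) $referenceNumber],
        $label
    );
}

// Second reference to footnote 3.
echo footnoteLabelSketch('Back to reference %% of footnote ^^', 3, 2), "\n";
```

A label built this way is distinct for every backlink, which is what fn_backlink_title and fn_backlink_label rely on for accessibility.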
stripAbbreviations
Abbreviations - strips abbreviations from text, stores titles in hash references.
Parameters:
Parameter | Type | Description |
---|---|---|
$text | string | |
_stripAbbreviations_callback
Callback for stripping abbreviations
Parameters:
Parameter | Type | Description |
---|---|---|
$matches | array | |
doAbbreviations
Find defined abbreviations in text and wrap them in `<abbr>` elements.
Parameters:
Parameter | Type | Description |
---|---|---|
$text | string | |
_doAbbreviations_callback
Callback for processing abbreviations
Parameters:
Parameter | Type | Description |
---|---|---|
$matches | array | |
Inherited methods
defaultTransform
Simple function interface - Initialize the parser and return the result of its transform method. This will work fine for derived classes too.
- This method is static.
Parameters:
Parameter | Type | Description |
---|---|---|
$text | string | |
__construct
Constructor function. Initialize appropriate member variables.
setup
Called before the transformation process starts to set up parser states.
teardown
Called after the transformation process to clear any variable which may be taking up memory unnecessarily.
transform
Main function. Performs some preprocessing on the input text and passes it through the document gamut.
Parameters:
Parameter | Type | Description |
---|---|---|
$text | string | |
stripLinkDefinitions
Strips link definitions from text, stores the URLs and titles in hash references
Parameters:
Parameter | Type | Description |
---|---|---|
$text | string | |
_stripLinkDefinitions_callback
The callback to strip link definitions
Parameters:
Parameter | Type | Description |
---|---|---|
$matches | array | |
hashHTMLBlocks
Hashify HTML blocks
Parameters:
Parameter | Type | Description |
---|---|---|
$text | string | |
_hashHTMLBlocks_callback
The callback for hashing HTML blocks
Parameters:
Parameter | Type | Description |
---|---|---|
$matches | string | |
hashPart
Called whenever a tag must be hashed when a function inserts an atomic element in the text stream. Passing $text through this function gives a unique text-token which will be reverted back when calling unhash.
The $boundary argument specifies what character should be used to surround the token. By convention, "B" is used for block elements that need not be wrapped into paragraph tags at the end, ":" is used for elements that are word separators and "X" is used in the general case.
Parameters:
Parameter | Type | Description |
---|---|---|
$text | string | |
$boundary | string | |
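The token round trip between hashPart and unhash can be sketched as below. `TokenStoreSketch` is a hypothetical class written for this illustration, and the exact token layout (boundary character, a control character, a counter, boundary character) is an assumption of the sketch rather than a documented guarantee.

```php
<?php
// Hypothetical sketch of the hash/unhash idea: an HTML fragment is swapped
// for an opaque token so later passes cannot alter it, then restored.
final class TokenStoreSketch
{
    /** @var array<string, string> token => original text */
    private array $hashes = [];
    private int $counter = 0;

    public function hashPart(string $text, string $boundary = 'X'): string
    {
        // Restore any tokens nested inside the text before storing it.
        $text = $this->unhash($text);
        $key = $boundary . "\x1A" . (++$this->counter) . $boundary;
        $this->hashes[$key] = $text;
        return $key;
    }

    public function unhash(string $text): string
    {
        $result = preg_replace_callback(
            '/(.)\x1A[0-9]+\1/',
            function (array $m): string {
                // Unknown tokens are left untouched.
                return $this->hashes[$m[0]] ?? $m[0];
            },
            $text
        );
        return $result ?? $text;
    }
}

$store = new TokenStoreSketch();
$token = $store->hashPart('<br />');            // general case: "X" boundary
$line  = 'first line' . $token . 'second line';
echo $store->unhash($line), "\n";
```

Because the token contains no Markdown punctuation, span-level passes that run between hashPart and unhash leave it alone.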
hashBlock
Shortcut function for hashPart with block-level boundaries.
Parameters:
Parameter | Type | Description |
---|---|---|
$text | string | |
runBlockGamut
Run block gamut transformations.
We need to escape raw HTML in Markdown source before doing anything else. This needs to be done for each block, and not only at the beginning in the Markdown function, since hashed blocks can be part of list items and could have been indented. Indented blocks would have been seen as a code block in a previous pass of hashHTMLBlocks.
Parameters:
Parameter | Type | Description |
---|---|---|
$text | string | |
runBasicBlockGamut
Run block gamut transformations, without hashing HTML blocks. This is useful when HTML blocks are known to be already hashed, like in the first whole-document pass.
Parameters:
Parameter | Type | Description |
---|---|---|
$text | string | |
doHorizontalRules
Convert horizontal rules
Parameters:
Parameter | Type | Description |
---|---|---|
$text | string | |
runSpanGamut
Run span gamut transformations
Parameters:
Parameter | Type | Description |
---|---|---|
$text | string | |
doHardBreaks
Do hard breaks
Parameters:
Parameter | Type | Description |
---|---|---|
$text | string | |
_doHardBreaks_callback
Trigger part hashing for the hard break (callback method)
Parameters:
Parameter | Type | Description |
---|---|---|
$matches | array | |
doAnchors
Turn Markdown link shortcuts into XHTML `<a>` tags.
Parameters:
Parameter | Type | Description |
---|---|---|
$text | string | |
_doAnchors_reference_callback
Callback method to parse referenced anchors
Parameters:
Parameter | Type | Description |
---|---|---|
$matches | array | |
_doAnchors_inline_callback
Callback method to parse inline anchors
Parameters:
Parameter | Type | Description |
---|---|---|
$matches | array | |
doImages
Turn Markdown image shortcuts into `<img>` tags.
Parameters:
Parameter | Type | Description |
---|---|---|
$text | string | |
_doImages_reference_callback
Callback to parse referenced image tags
Parameters:
Parameter | Type | Description |
---|---|---|
$matches | array | |
_doImages_inline_callback
Callback to parse inline image tags
Parameters:
Parameter | Type | Description |
---|---|---|
$matches | array | |
doHeaders
Parse Markdown heading elements to HTML
Parameters:
Parameter | Type | Description |
---|---|---|
$text | string | |
_doHeaders_callback_setext
Setext header parsing callback
Parameters:
Parameter | Type | Description |
---|---|---|
$matches | array | |
_doHeaders_callback_atx
ATX header parsing callback
Parameters:
Parameter | Type | Description |
---|---|---|
$matches | array | |
_generateIdFromHeaderValue
If a header_id_func property is set, we can use it to automatically generate an id attribute.
This method returns a string in the form id="foo", or an empty string otherwise.
Parameters:
Parameter | Type | Description |
---|---|---|
$headerValue | string | |
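The behavior described for this method can be sketched with a standalone function. `generateIdSketch` and the `$slugify` callback are hypothetical names used for illustration; in the library the callback would come from the header_id_func property mentioned above.

```php
<?php
// Hypothetical sketch: build an id attribute from a header's text using a
// caller-supplied callback, or return an empty string when there is none.
function generateIdSketch(?callable $headerIdFunc, string $headerValue): string
{
    if ($headerIdFunc === null) {
        return '';
    }
    $idValue = (string) $headerIdFunc($headerValue);
    if ($idValue === '') {
        return '';
    }
    return 'id="' . htmlspecialchars($idValue, ENT_QUOTES) . '"';
}

// Example callback (an assumption of this sketch): lowercase the text and
// join runs of non-alphanumeric characters with a single hyphen.
$slugify = function (string $s): string {
    return strtolower(trim(preg_replace('/[^A-Za-z0-9]+/', '-', $s), '-'));
};

echo generateIdSketch($slugify, 'Hello, World!'), "\n";
echo generateIdSketch(null, 'Hello, World!'), "\n"; // prints an empty line
```

Returning an empty string when no callback is set lets the caller concatenate the result unconditionally.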
doLists
Form HTML ordered (numbered) and unordered (bulleted) lists.
Parameters:
Parameter | Type | Description |
---|---|---|
$text | string | |
_doLists_callback
List parsing callback
Parameters:
Parameter | Type | Description |
---|---|---|
$matches | array | |
processListItems
Process the contents of a single ordered or unordered list, splitting it into individual list items.
Parameters:
Parameter | Type | Description |
---|---|---|
$list_str | string | |
$marker_any_re | string | |
_processListItems_callback
List item parsing callback
Parameters:
Parameter | Type | Description |
---|---|---|
$matches | array | |
doCodeBlocks
Process Markdown `<pre><code>` blocks.
Parameters:
Parameter | Type | Description |
---|---|---|
$text | string | |
_doCodeBlocks_callback
Code block parsing callback
Parameters:
Parameter | Type | Description |
---|---|---|
$matches | array | |
makeCodeSpan
Create a code span markup for $code. Called from handleSpanToken.
Parameters:
Parameter | Type | Description |
---|---|---|
$code | string | |
prepareItalicsAndBold
Prepare regular expressions for searching emphasis tokens in any context.
doItalicsAndBold
Convert Markdown italics (emphasis) and bold (strong) to HTML
Parameters:
Parameter | Type | Description |
---|---|---|
$text | string | |
doBlockQuotes
Parse Markdown blockquotes to HTML
Parameters:
Parameter | Type | Description |
---|---|---|
$text | string | |
_doBlockQuotes_callback
Blockquote parsing callback
Parameters:
Parameter | Type | Description |
---|---|---|
$matches | array | |
_doBlockQuotes_callback2
Blockquote parsing callback
Parameters:
Parameter | Type | Description |
---|---|---|
$matches | array | |
formParagraphs
Parse paragraphs
Parameters:
Parameter | Type | Description |
---|---|---|
$text | string | String to process in paragraphs |
$wrap_in_p | bool | Whether paragraphs should be wrapped in `<p>` tags |
encodeAttribute
Encode text for a double-quoted HTML attribute. This function is not suitable for attributes enclosed in single quotes.
Parameters:
Parameter | Type | Description |
---|---|---|
$text | string | |
encodeURLAttribute
Encode text for a double-quoted HTML attribute containing a URL, applying the URL filter if set. Also generates the textual representation for the URL (removing mailto: or tel:) storing it in $text.
This function is not suitable for attributes enclosed in single quotes.
Parameters:
Parameter | Type | Description |
---|---|---|
$url | string | |
$text | string | Passed by reference |

Return Value:
URL
encodeAmpsAndAngles
Smart processing for ampersands and angle brackets that need to be encoded. Valid character entities are left alone unless the no-entities mode is set.
Parameters:
Parameter | Type | Description |
---|---|---|
$text | string | |
doAutoLinks
Parse Markdown automatic links to anchor HTML tags
Parameters:
Parameter | Type | Description |
---|---|---|
$text | string | |
_doAutoLinks_url_callback
Parse URL callback
Parameters:
Parameter | Type | Description |
---|---|---|
$matches | array | |
_doAutoLinks_email_callback
Parse email address callback
Parameters:
Parameter | Type | Description |
---|---|---|
$matches | array | |
encodeEntityObfuscatedAttribute
Input: some text to obfuscate, e.g. "mailto:foo@example.com"
protected encodeEntityObfuscatedAttribute(string $text, string& $tail = null, int $head_length): string
Output: the same text but with most characters encoded as either a decimal or hex entity, in the hopes of foiling most address harvesting spam bots. E.g.:
`&#109;&#x61;&#105;&#x6c;&#116;&#x6f;&#58;&#x66;o&#111;&#x40;&#101;&#x78;&#97;&#x6d;&#112;&#x6c;&#101;&#46;&#x63;&#111;&#x6d;`
Note: the additional output $tail is assigned the same value as the output, minus the number of characters specified by $head_length.
Based on a filter by Matthew Wickline, posted to BBEdit-Talk. With some optimizations by Milian Wolff. Forced encoding of HTML attribute special characters by Allan Odgaard.
Parameters:
Parameter | Type | Description |
---|---|---|
$text | string | |
$tail | string | Passed by reference |
$head_length | int | |
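The encoding idea can be sketched as below. This is a simplified illustration under stated assumptions: `obfuscateSketch` is a hypothetical function, the probabilities are chosen for the example, and the $tail and $head_length behavior is not modeled.

```php
<?php
// Hypothetical sketch: encode most ASCII characters as decimal or hex
// entities, chosen at random, and leave a few letters and digits readable.
function obfuscateSketch(string $text): string
{
    if ($text === '') {
        return '';
    }
    $out = '';
    foreach (str_split($text) as $char) {
        $ord = ord($char);
        if ($ord >= 128) {
            $out .= $char; // bytes of multibyte characters pass through
            continue;
        }
        $r = random_int(0, 99);
        if ($r >= 90 && ctype_alnum($char)) {
            $out .= $char;                         // roughly 10%: left as-is
        } elseif ($r < 45) {
            $out .= '&#x' . dechex($ord) . ';';    // hex entity
        } else {
            $out .= '&#' . $ord . ';';             // decimal entity
        }
    }
    return $out;
}

$encoded = obfuscateSketch('mailto:foo@example.com');
echo $encoded, "\n";
// A browser decodes the entities, so the link still works for readers.
echo html_entity_decode($encoded, ENT_QUOTES | ENT_HTML5, 'UTF-8'), "\n";
```

The output differs on every run because the entity form is picked at random, but decoding it always yields the original text.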
parseSpan
Take the string $str and parse it into tokens, hashing embedded HTML, escaped characters and handling code spans.
Parameters:
Parameter | Type | Description |
---|---|---|
$str | string | |
handleSpanToken
Handle $token provided by parseSpan by determining its nature and returning the corresponding value that should replace it.
Parameters:
Parameter | Type | Description |
---|---|---|
$token | string | |
$str | string | Passed by reference |
outdent
Remove one level of line-leading tabs or spaces
Parameters:
Parameter | Type | Description |
---|---|---|
$text | string | |
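Removing one level of indentation can be sketched with a single multiline regular expression. `outdentSketch` is a hypothetical standalone function, and the tab width of 4 is an assumption of the sketch.

```php
<?php
// Hypothetical sketch: strip one leading tab, or up to $tabWidth leading
// spaces, from the start of every line.
function outdentSketch(string $text, int $tabWidth = 4): string
{
    $result = preg_replace('/^(\t|[ ]{1,' . $tabWidth . '})/m', '', $text);
    return $result ?? $text;
}

// The third line keeps two of its six spaces: only one level is removed.
echo outdentSketch("    first\n\tsecond\n      third"), "\n";
```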
detab
Replace tabs with the appropriate amount of spaces.
For each line we separate the line into blocks delimited by tab characters. Then we reconstruct every line by adding the appropriate number of spaces between blocks.
Parameters:
Parameter | Type | Description |
---|---|---|
$text | string | |
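The per-line reconstruction described above can be sketched as follows. `detabLineSketch` is a hypothetical function that handles a single line, and the tab width of 4 is an assumption.

```php
<?php
// Hypothetical sketch: expand tabs in one line so that each block starts at
// the next multiple of $tabWidth columns.
function detabLineSketch(string $line, int $tabWidth = 4): string
{
    $blocks = explode("\t", $line);
    $out = array_shift($blocks); // text before the first tab
    foreach ($blocks as $block) {
        // Count characters rather than bytes when mbstring is available.
        $len = function_exists('mb_strlen') ? mb_strlen($out, 'UTF-8') : strlen($out);
        // Spaces needed to reach the next tab stop.
        $amount = $tabWidth - $len % $tabWidth;
        $out .= str_repeat(' ', $amount) . $block;
    }
    return $out;
}

echo detabLineSketch("ab\tc\td"), "|\n";
```

For a multi-line string, the same function can be applied to each line after splitting on newlines.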
_detab_callback
Replace tabs callback
Parameters:
Parameter | Type | Description |
---|---|---|
$matches | string | |
_initDetab
Check for the availability of the function in the utf8_strlen property (initially mb_strlen). If the function is not available, create a function that will loosely count the number of UTF-8 characters with a regular expression.
unhash
Swap back in all the tags hashed by _HashHTMLBlocks.
Parameters:
Parameter | Type | Description |
---|---|---|
$text | string | |
_unhash_callback
Unhashing callback
Parameters:
Parameter | Type | Description |
---|---|---|
$matches | array | |
Automatically generated on 2025-03-18