MarkdownExtra

Markdown Extra Parser Class

Properties

fn_id_prefix

Prefix for footnote ids.

public string $fn_id_prefix

fn_link_title

Optional title attribute for footnote links.

public string $fn_link_title

fn_link_class

Optional class attribute for footnote links.

public string $fn_link_class

fn_backlink_class

Optional class attribute for footnote backlinks.

public string $fn_backlink_class

fn_backlink_html

Content to be displayed within footnote backlinks. The default is '↩' followed by U+FE0E, a Unicode variation selector that prevents iOS from displaying the arrow character as an emoji.

public string $fn_backlink_html

Optionally use '^^' and '%%' to refer to the footnote number and reference number respectively. See parseFootnotePlaceholders.

fn_backlink_title

Optional title attribute for footnote backlinks, for added accessibility (to ensure backlink uniqueness).

public string $fn_backlink_title

Use '^^' and '%%' to refer to the footnote number and reference number respectively. See parseFootnotePlaceholders.

fn_backlink_label

Optional aria-label attribute for footnote backlinks, for added accessibility (to ensure backlink uniqueness). Supports the same '^^' and '%%' placeholders as fn_backlink_title.

public string $fn_backlink_label

table_align_class_tmpl

Class name for table cell alignment, where %% is replaced by left, center, or right. For instance, 'go-%%' becomes 'go-left', 'go-right', or 'go-center'. If empty, the align attribute is used instead of a class name.

public string $table_align_class_tmpl
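
The substitution described above can be illustrated with a short sketch. It is written in Python rather than PHP and is not the library's code; the function name align_attr is hypothetical.

```python
def align_attr(template: str, align: str) -> str:
    """Return the HTML attribute used for a table cell's alignment.

    Illustrative sketch only; `align_attr` is a hypothetical name.
    """
    if align not in ("left", "center", "right"):
        raise ValueError(f"unsupported alignment: {align!r}")
    if not template:
        # Empty template: fall back to the plain align attribute.
        return f' align="{align}"'
    # Non-empty template: substitute the alignment for the %% placeholder.
    class_name = template.replace("%%", align)
    return f' class="{class_name}"'
```

With the template 'go-%%' and the alignment 'left', the sketch yields a class attribute of go-left; with an empty template it yields an align attribute instead.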

code_class_prefix

Optional class prefix for fenced code block.

public string $code_class_prefix

code_attr_on_pre

Class attribute for code blocks goes on the code tag; setting this to true will put attributes on the pre tag instead.

public bool $code_attr_on_pre

predef_abbr

Predefined abbreviations.

public array $predef_abbr

hashtag_protection

Only convert atx-style headers if there is a space between the # and the header text.

public bool $hashtag_protection
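
The effect of this flag can be sketched as a change in how much whitespace is required after the leading # characters. The Python below is an assumption-laden illustration, not the library's pattern: the function name and the exact regular expression are hypothetical.

```python
import re


def is_atx_header(line: str, hashtag_protection: bool) -> bool:
    """Decide whether a line would count as an atx-style header.

    Hypothetical sketch: with protection on, at least one space must
    follow the # run; with protection off, the space is optional.
    """
    gap = r"[ ]+" if hashtag_protection else r"[ ]*"
    return re.match(r"^#{1,6}" + gap + r"\S", line) is not None
```

Under this sketch, a line such as "#hashtag" is treated as a header only when protection is off, while "# Title" is a header in both modes.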

omit_footnotes

Determines whether footnotes should be appended to the end of the document.

public bool $omit_footnotes

If true, the footnote HTML can be retrieved from $this->footnotes_assembled.


footnotes_assembled

After parsing, the HTML for the list of footnotes appears here.

public ?string $footnotes_assembled

This is available only if $omit_footnotes == true.

Note: when placing the content of footnotes_assembled on the page, consider adding the attribute role="doc-endnotes" to the div or section that will enclose the list of footnotes, so that they are reachable by accessibility tools the same way they would be with the default HTML output.


footnotes

Extra variables used during extra transformations.

protected array $footnotes

footnotes_ordered

protected array $footnotes_ordered

footnotes_ref_count

protected array $footnotes_ref_count

footnotes_numbers

protected array $footnotes_numbers

abbr_desciptions

protected array $abbr_desciptions

abbr_word_re

protected string $abbr_word_re

footnote_counter

Gives the current footnote number.

protected int $footnote_counter

ref_attr

Ref attribute for links

protected array $ref_attr

id_class_attr_catch_re

Expression to use to catch attributes (includes the braces)

protected string $id_class_attr_catch_re

id_class_attr_nocatch_re

Expression to use when parsing in a context when no capture is desired

protected string $id_class_attr_nocatch_re

block_tags_re

Tags that are always treated as block tags

protected string $block_tags_re

context_block_tags_re

Tags treated as block tags only if the opening tag is alone on its line

protected string $context_block_tags_re

contain_span_tags_re

Tags where markdown="1" defaults to span mode.

protected string $contain_span_tags_re

clean_tags_re

Tags which must not have their contents modified, no matter where they appear

protected string $clean_tags_re

auto_close_tags_re

Tags that do not need to be closed.

protected string $auto_close_tags_re

em_relist

Redefining emphasis markers so that emphasis by underscore does not work in the middle of a word.

protected array $em_relist

strong_relist

Define the strong operators with their regex matches

protected array $strong_relist

em_strong_relist

Define the emphasis + strong operators with their regex matches

protected array $em_strong_relist

Methods

__construct

Constructor function. Initialize the parser object.

public __construct(): void

setup

Setting up Extra-specific variables.

protected setup(): void

teardown

Clearing Extra-specific variables.

protected teardown(): void

doExtraAttributes

Parse attributes caught by the $this->id_class_attr_catch_re expression and return the HTML-formatted list of attributes.

protected doExtraAttributes(string $tag_name, string $attr, mixed $defaultIdValue = null, array $classes = array()): string

Currently supported attributes are .class and #id.

In addition, this method also supports supplying a default Id value, which will be used to populate the id attribute in case it was not overridden.

Parameters:

Parameter Type Description
$tag_name string
$attr string
$defaultIdValue mixed
$classes array
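
To make the .class and #id handling concrete, here is a minimal Python sketch of the behavior described above. It is not the library's PHP implementation: the function name, the tokenizing regular expression, and the rule that only the first #id counts are assumptions made for illustration.

```python
import re
from html import escape


def extra_attributes(attr, default_id=None, classes=()):
    """Turn an attribute string such as '{#intro .lead}' into HTML attributes.

    Hypothetical sketch: only .class and #id tokens are recognized.
    """
    class_list = list(classes)
    element_id = None
    for token in re.findall(r"[#.][-_:a-zA-Z0-9]+", attr):
        if token[0] == ".":
            class_list.append(token[1:])
        elif element_id is None:
            # Assumption: only the first #id is taken into account.
            element_id = token[1:]
    if not element_id:
        # No explicit id: fall back to the default id, if one was supplied.
        element_id = default_id
    out = ""
    if element_id:
        out += ' id="' + escape(str(element_id), quote=True) + '"'
    if class_list:
        out += ' class="' + " ".join(class_list) + '"'
    return out
```

The default id is used only when the attribute string carries no #id of its own, which mirrors the "in case it was not overridden" wording above.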

stripLinkDefinitions

Strips link definitions from text, stores the URLs and titles in hash references.

protected stripLinkDefinitions(string $text): string

Parameters:

Parameter Type Description
$text string

_stripLinkDefinitions_callback

Strip link definition callback

protected _stripLinkDefinitions_callback(array $matches): string

Parameters:

Parameter Type Description
$matches array

hashHTMLBlocks

Hashify HTML Blocks and "clean tags".

protected hashHTMLBlocks(string $text): string

We only want to do this for block-level HTML tags, such as headers, lists, and tables. That's because we still want to wrap <p>s around "paragraphs" that are wrapped in non-block-level tags, such as anchors, phrase emphasis, and spans. The list of tags we're looking for is hard-coded.

This works by calling _hashHTMLBlocks_inMarkdown, which then calls _hashHTMLBlocks_inHTML when it encounters block tags. When the markdown="1" attribute is found within a tag, _hashHTMLBlocks_inHTML calls back _hashHTMLBlocks_inMarkdown to handle the Markdown syntax within the tag. The two functions call each other recursively.

Parameters:

Parameter Type Description
$text string

_hashHTMLBlocks_inMarkdown

Parse Markdown text, calling _hashHTMLBlocks_inHTML for block tags.

protected _hashHTMLBlocks_inMarkdown(string $text, int $indent, string $enclosing_tag_re = '', bool $span = false): array
  • $indent is the number of spaces to be ignored when checking for code blocks. This is important because if we don't take the indent into account, something like this (which looks right) won't work as expected:

    <div>
        <div markdown="1">
            Hello World.  <-- Is this a Markdown code block or text?
        </div>  <-- Is this a Markdown code block or a real tag?
    </div>

    If you don't like this, just don't indent the tag on which you apply the markdown="1" attribute.

  • If $enclosing_tag_re is not empty, stops at the first unmatched closing tag with that name. Nested tags supported.

  • If $span is true, the text inside must be treated as span content: any double newline is replaced by a single newline so that it does not create paragraphs.

Returns an array of the form (processed text, remaining text).

Parameters:

Parameter Type Description
$text string
$indent int
$enclosing_tag_re string
$span bool

_hashHTMLBlocks_inHTML

Parse HTML, calling _hashHTMLBlocks_inMarkdown for block tags.

protected _hashHTMLBlocks_inHTML(string $text, string $hash_method, bool $md_attr): array
  • Calls $hash_method to convert any blocks.
  • Stops when the first opening tag closes.
  • $md_attr indicates whether use of the markdown="1" attribute is allowed (it is not allowed inside clean tags).

Returns an array of the form (processed text, remaining text).

Parameters:

Parameter Type Description
$text string
$hash_method string
$md_attr bool Handle markdown="1" attribute

hashClean

Called whenever a tag must be hashed. When a function inserts a "clean" tag in $text, the tag passes through this function and is automatically escaped, blocking invalid nested overlap.

protected hashClean(string $text): string

Parameters:

Parameter Type Description
$text string

doAnchors

Turn Markdown link shortcuts into XHTML <a> tags.

protected doAnchors(string $text): string

Parameters:

Parameter Type Description
$text string

_doAnchors_reference_callback

Callback for reference anchors

protected _doAnchors_reference_callback(array $matches): string

Parameters:

Parameter Type Description
$matches array

_doAnchors_inline_callback

Callback for inline anchors

protected _doAnchors_inline_callback(array $matches): string

Parameters:

Parameter Type Description
$matches array

doImages

Turn Markdown image shortcuts into <img> tags.

protected doImages(string $text): string

Parameters:

Parameter Type Description
$text string

_doImages_reference_callback

Callback for referenced images

protected _doImages_reference_callback(array $matches): string

Parameters:

Parameter Type Description
$matches array

_doImages_inline_callback

Callback for inline images

protected _doImages_inline_callback(array $matches): string

Parameters:

Parameter Type Description
$matches array

doHeaders

Process markdown headers. Redefined to add ID and class attribute support.

protected doHeaders(string $text): string

Parameters:

Parameter Type Description
$text string

_doHeaders_callback_setext

Callback for setext headers

protected _doHeaders_callback_setext(array $matches): string

Parameters:

Parameter Type Description
$matches array

_doHeaders_callback_atx

Callback for atx headers

protected _doHeaders_callback_atx(array $matches): string

Parameters:

Parameter Type Description
$matches array

doTables

Form HTML tables.

protected doTables(string $text): string

Parameters:

Parameter Type Description
$text string

_doTable_leadingPipe_callback

Callback for removing the leading pipe for each row

protected _doTable_leadingPipe_callback(array $matches): string

Parameters:

Parameter Type Description
$matches array

_doTable_makeAlignAttr

Make the align attribute in a table

protected _doTable_makeAlignAttr(string $alignname): string

Parameters:

Parameter Type Description
$alignname string

_doTable_callback

Callback for processing tables

protected _doTable_callback(array $matches): string

Parameters:

Parameter Type Description
$matches array

doDefLists

Form HTML definition lists.

protected doDefLists(string $text): string

Parameters:

Parameter Type Description
$text string

_doDefLists_callback

Callback for processing definition lists

protected _doDefLists_callback(array $matches): string

Parameters:

Parameter Type Description
$matches array

processDefListItems

Process the contents of a single definition list, splitting it into individual term and definition list items.

protected processDefListItems(string $list_str): string

Parameters:

Parameter Type Description
$list_str string

_processDefListItems_callback_dt

Callback for <dt> elements in definition lists

protected _processDefListItems_callback_dt(array $matches): string

Parameters:

Parameter Type Description
$matches array

_processDefListItems_callback_dd

Callback for <dd> elements in definition lists

protected _processDefListItems_callback_dd(array $matches): string

Parameters:

Parameter Type Description
$matches array

doFencedCodeBlocks

Adding the fenced code block syntax to regular Markdown:

protected doFencedCodeBlocks(string $text): string

    ~~~
    Code block
    ~~~

Parameters:

Parameter Type Description
$text string

_doFencedCodeBlocks_callback

Callback to process fenced code blocks

protected _doFencedCodeBlocks_callback(array $matches): string

Parameters:

Parameter Type Description
$matches array

_doFencedCodeBlocks_newlines

Replace new lines in fenced code blocks

protected _doFencedCodeBlocks_newlines(array $matches): string

Parameters:

Parameter Type Description
$matches array

formParagraphs

Parse text into paragraphs

protected formParagraphs(string $text, bool $wrap_in_p = true): string

Parameters:

Parameter Type Description
$text string String to process in paragraphs
$wrap_in_p bool Whether paragraphs should be wrapped in <p> tags

Return Value:

HTML output


stripFootnotes

Strips footnote definitions from text and stores their content in hash references.

protected stripFootnotes(string $text): string

Parameters:

Parameter Type Description
$text string

_stripFootnotes_callback

Callback for stripping footnotes

protected _stripFootnotes_callback(array $matches): string

Parameters:

Parameter Type Description
$matches array

doFootnotes

Replace footnote references in $text [^id] with a special text-token which will be replaced by the actual footnote marker in appendFootnotes.

protected doFootnotes(string $text): string

Parameters:

Parameter Type Description
$text string

appendFootnotes

Append footnote list to text

protected appendFootnotes(string $text): string

Parameters:

Parameter Type Description
$text string

_doFootnotes

Generates the HTML for footnotes. Called by appendFootnotes, even if footnotes are not being appended.

protected _doFootnotes(): void

_appendFootnotes_callback

Callback for appending footnotes

protected _appendFootnotes_callback(array $matches): string

Parameters:

Parameter Type Description
$matches array

parseFootnotePlaceholders

Build footnote label by evaluating any placeholders.

protected parseFootnotePlaceholders(string $label, int $footnote_number, int $reference_number): string
  • ^^ footnote number
  • %% footnote reference number (Nth reference to footnote number)

Parameters:

Parameter Type Description
$label string
$footnote_number int
$reference_number int
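
The two placeholders listed above amount to a plain string substitution. The following Python sketch illustrates that idea under the assumption that each placeholder is simply replaced by the corresponding number; it is not the library's PHP code, and the sample label text is made up.

```python
def parse_footnote_placeholders(label: str, footnote_number: int, reference_number: int) -> str:
    """Replace ^^ with the footnote number and %% with the reference number."""
    return (
        label.replace("^^", str(footnote_number))
             .replace("%%", str(reference_number))
    )
```

For example, a label of "Back to reference %% of footnote ^^" for the second reference to footnote 3 would become "Back to reference 2 of footnote 3", which gives each backlink a unique title or aria-label.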

stripAbbreviations

Strips abbreviation definitions from text and stores their titles in hash references.

protected stripAbbreviations(string $text): string

Parameters:

Parameter Type Description
$text string

_stripAbbreviations_callback

Callback for stripping abbreviations

protected _stripAbbreviations_callback(array $matches): string

Parameters:

Parameter Type Description
$matches array

doAbbreviations

Find defined abbreviations in text and wrap them in <abbr> elements.

protected doAbbreviations(string $text): string

Parameters:

Parameter Type Description
$text string

_doAbbreviations_callback

Callback for processing abbreviations

protected _doAbbreviations_callback(array $matches): string

Parameters:

Parameter Type Description
$matches array

Inherited methods

defaultTransform

Simple function interface - Initialize the parser and return the result of its transform method. This will work fine for derived classes too.

public static defaultTransform(string $text): string
  • This method is static.

Parameters:

Parameter Type Description
$text string

__construct

Constructor function. Initialize appropriate member variables.

public __construct(): void

setup

Called before the transformation process starts to set up parser states.

protected setup(): void

teardown

Called after the transformation process to clear any variable that may be taking up memory unnecessarily.

protected teardown(): void

transform

Main function. Performs some preprocessing on the input text and passes it through the document gamut.

public transform(string $text): string

Parameters:

Parameter Type Description
$text string

stripLinkDefinitions

Strips link definitions from text, stores the URLs and titles in hash references

protected stripLinkDefinitions(string $text): string

Parameters:

Parameter Type Description
$text string

_stripLinkDefinitions_callback

The callback to strip link definitions

protected _stripLinkDefinitions_callback(array $matches): string

Parameters:

Parameter Type Description
$matches array

hashHTMLBlocks

Hashify HTML blocks

protected hashHTMLBlocks(string $text): string

Parameters:

Parameter Type Description
$text string

_hashHTMLBlocks_callback

The callback for hashing HTML blocks

protected _hashHTMLBlocks_callback(string $matches): string

Parameters:

Parameter Type Description
$matches string

hashPart

Called whenever a tag must be hashed when a function inserts an atomic element in the text stream. Passing $text through this function gives a unique text-token which will be reverted back when calling unhash.

protected hashPart(string $text, string $boundary = 'X'): string

The $boundary argument specifies what character should be used to surround the token. By convention, "B" is used for block elements that need not be wrapped in paragraph tags at the end, ":" is used for elements that are word separators, and "X" is used in the general case.

Parameters:

Parameter Type Description
$text string
$boundary string
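
The token mechanism described above can be sketched as a small store that swaps text for a unique placeholder and restores it later. This Python sketch is illustrative only: the class name, the counter-based token format, and the use of the \x1a control character are assumptions, not a statement of how the PHP library builds its tokens.

```python
import re


class TokenStore:
    """Hypothetical sketch of hashing text into tokens and restoring it."""

    def __init__(self):
        self._hashes = {}
        self._counter = 0

    def hash_part(self, text: str, boundary: str = "X") -> str:
        # Restore any tokens already inside the text so that a single
        # unhash pass at the end is enough.
        text = self.unhash(text)
        self._counter += 1
        # Assumed token format: boundary, control character, counter, boundary.
        key = f"{boundary}\x1a{self._counter}{boundary}"
        self._hashes[key] = text
        return key

    def unhash(self, text: str) -> str:
        # Replace every known token with the text it stands for.
        return re.sub(
            r"(.)\x1a[0-9]+\1",
            lambda m: self._hashes.get(m.group(0), m.group(0)),
            text,
        )
```

Because the token contains no Markdown syntax, later transformations leave it alone, and the boundary character tells them whether the token stands for a block ("B"), a word separator (":"), or something else ("X").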

hashBlock

Shortcut function for hashPart with block-level boundaries.

protected hashBlock(string $text): string

Parameters:

Parameter Type Description
$text string

runBlockGamut

Run block gamut transformations.

protected runBlockGamut(string $text): string

We need to escape raw HTML in Markdown source before doing anything else. This needs to be done for each block, and not only at the beginning in the Markdown function, since hashed blocks can be part of list items and could have been indented. Indented blocks would have been seen as a code block in a previous pass of hashHTMLBlocks.

Parameters:

Parameter Type Description
$text string

runBasicBlockGamut

Run block gamut transformations, without hashing HTML blocks. This is useful when HTML blocks are known to be already hashed, like in the first whole-document pass.

protected runBasicBlockGamut(string $text): string

Parameters:

Parameter Type Description
$text string

doHorizontalRules

Convert horizontal rules

protected doHorizontalRules(string $text): string

Parameters:

Parameter Type Description
$text string

runSpanGamut

Run span gamut transformations

protected runSpanGamut(string $text): string

Parameters:

Parameter Type Description
$text string

doHardBreaks

Do hard breaks

protected doHardBreaks(string $text): string

Parameters:

Parameter Type Description
$text string

_doHardBreaks_callback

Trigger part hashing for the hard break (callback method)

protected _doHardBreaks_callback(array $matches): string

Parameters:

Parameter Type Description
$matches array

doAnchors

Turn Markdown link shortcuts into XHTML <a> tags.

protected doAnchors(string $text): string

Parameters:

Parameter Type Description
$text string

_doAnchors_reference_callback

Callback method to parse referenced anchors

protected _doAnchors_reference_callback(array $matches): string

Parameters:

Parameter Type Description
$matches array

_doAnchors_inline_callback

Callback method to parse inline anchors

protected _doAnchors_inline_callback(array $matches): string

Parameters:

Parameter Type Description
$matches array

doImages

Turn Markdown image shortcuts into <img> tags.

protected doImages(string $text): string

Parameters:

Parameter Type Description
$text string

_doImages_reference_callback

Callback to parse reference image tags

protected _doImages_reference_callback(array $matches): string

Parameters:

Parameter Type Description
$matches array

_doImages_inline_callback

Callback to parse inline image tags

protected _doImages_inline_callback(array $matches): string

Parameters:

Parameter Type Description
$matches array

doHeaders

Parse Markdown heading elements to HTML

protected doHeaders(string $text): string

Parameters:

Parameter Type Description
$text string

_doHeaders_callback_setext

Setext header parsing callback

protected _doHeaders_callback_setext(array $matches): string

Parameters:

Parameter Type Description
$matches array

_doHeaders_callback_atx

ATX header parsing callback

protected _doHeaders_callback_atx(array $matches): string

Parameters:

Parameter Type Description
$matches array

_generateIdFromHeaderValue

If a header_id_func property is set, we can use it to automatically generate an id attribute.

protected _generateIdFromHeaderValue(string $headerValue): string

This method returns a string in the form id="foo", or an empty string otherwise.

Parameters:

Parameter Type Description
$headerValue string
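
As a rough illustration of the behavior described above, the Python sketch below calls a user-supplied function to derive an id and formats the result as an attribute. It is not the library's PHP code: slugify is a hypothetical id generator, and generate_id_attr is a hypothetical stand-in for this method.

```python
import re
from html import escape
from typing import Callable, Optional


def slugify(header: str) -> str:
    """Hypothetical id generator: lowercase words joined by hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", header.lower()).strip("-")


def generate_id_attr(header: str, id_func: Optional[Callable[[str], str]]) -> str:
    """Return id="..." when an id function is set and yields a value."""
    if not callable(id_func):
        # No id function configured: no attribute is generated.
        return ""
    value = id_func(header)
    if not value:
        return ""
    return 'id="' + escape(value, quote=True) + '"'
```

The empty-string cases cover both a missing id function and a function that produces nothing usable for a given header.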

doLists

Form HTML ordered (numbered) and unordered (bulleted) lists.

protected doLists(string $text): string

Parameters:

Parameter Type Description
$text string

_doLists_callback

List parsing callback

protected _doLists_callback(array $matches): string

Parameters:

Parameter Type Description
$matches array

processListItems

Process the contents of a single ordered or unordered list, splitting it into individual list items.

protected processListItems(string $list_str, string $marker_any_re): string

Parameters:

Parameter Type Description
$list_str string
$marker_any_re string

_processListItems_callback

List item parsing callback

protected _processListItems_callback(array $matches): string

Parameters:

Parameter Type Description
$matches array

doCodeBlocks

Process Markdown <pre><code> blocks.

protected doCodeBlocks(string $text): string

Parameters:

Parameter Type Description
$text string

_doCodeBlocks_callback

Code block parsing callback

protected _doCodeBlocks_callback(array $matches): string

Parameters:

Parameter Type Description
$matches array

makeCodeSpan

Create a code span markup for $code. Called from handleSpanToken.

protected makeCodeSpan(string $code): string

Parameters:

Parameter Type Description
$code string

prepareItalicsAndBold

Prepare regular expressions for searching emphasis tokens in any context.

protected prepareItalicsAndBold(): void

doItalicsAndBold

Convert Markdown italics (emphasis) and bold (strong) to HTML

protected doItalicsAndBold(string $text): string

Parameters:

Parameter Type Description
$text string

doBlockQuotes

Parse Markdown blockquotes to HTML

protected doBlockQuotes(string $text): string

Parameters:

Parameter Type Description
$text string

_doBlockQuotes_callback

Blockquote parsing callback

protected _doBlockQuotes_callback(array $matches): string

Parameters:

Parameter Type Description
$matches array

_doBlockQuotes_callback2

Blockquote parsing callback

protected _doBlockQuotes_callback2(array $matches): string

Parameters:

Parameter Type Description
$matches array

formParagraphs

Parse paragraphs

protected formParagraphs(string $text, bool $wrap_in_p = true): string

Parameters:

Parameter Type Description
$text string String to process in paragraphs
$wrap_in_p bool Whether paragraphs should be wrapped in <p> tags

encodeAttribute

Encode text for a double-quoted HTML attribute. This function is not suitable for attributes enclosed in single quotes.

protected encodeAttribute(string $text): string

Parameters:

Parameter Type Description
$text string

encodeURLAttribute

Encode text for a double-quoted HTML attribute containing a URL, applying the URL filter if set. Also generates the textual representation for the URL (removing mailto: or tel:), storing it in $text.

protected encodeURLAttribute(string $url, string& $text = null): string

This function is not suitable for attributes enclosed in single quotes.

Parameters:

Parameter Type Description
$url string
$text string Passed by reference

Return Value:

URL


encodeAmpsAndAngles

Smart processing for ampersands and angle brackets that need to be encoded. Valid character entities are left alone unless the no-entities mode is set.

protected encodeAmpsAndAngles(string $text): string

Parameters:

Parameter Type Description
$text string
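
The "valid entities are left alone" rule can be sketched with a negative lookahead. The Python below is an illustration under stated assumptions, not the library's PHP code: the regular expression is a guess at what counts as a valid entity, the no-entities mode is not modeled, and only the opening angle bracket is encoded.

```python
import re

# Assumption: an ampersand starts a valid entity when it is followed by a
# name or a numeric reference and a terminating semicolon.
_BARE_AMP = re.compile(r"&(?!#?[xX]?(?:[0-9a-fA-F]+|\w+);)")


def encode_amps_and_angles(text: str) -> str:
    """Encode bare ampersands and '<' while leaving valid entities intact."""
    text = _BARE_AMP.sub("&amp;", text)
    # Only '<' is encoded in this sketch; '>' is left as written.
    return text.replace("<", "&lt;")
```

In this sketch "AT&T" gains an encoded ampersand, whereas an existing entity such as "&copy;" passes through unchanged.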

doAutoLinks

Parse Markdown automatic links to anchor HTML tags

protected doAutoLinks(string $text): string

Parameters:

Parameter Type Description
$text string

_doAutoLinks_url_callback

Parse URL callback

protected _doAutoLinks_url_callback(array $matches): string

Parameters:

Parameter Type Description
$matches array

_doAutoLinks_email_callback

Parse email address callback

protected _doAutoLinks_email_callback(array $matches): string

Parameters:

Parameter Type Description
$matches array

encodeEntityObfuscatedAttribute

Input: some text to obfuscate, e.g. "mailto:foo@example.com"

protected encodeEntityObfuscatedAttribute(string $text, string& $tail = null, int $head_length): string

Output: the same text but with most characters encoded as either a decimal or hex entity, in the hopes of foiling most address harvesting spam bots. E.g.:

   &#109;&#x61;&#105;&#x6c;&#116;&#x6f;&#58;&#x66;o&#111;
   &#x40;&#101;&#x78;&#97;&#x6d;&#112;&#x6c;&#101;&#46;&#x63;&#111;
   &#x6d;

Note: the additional output $tail is assigned the same value as the output, minus the number of characters specified by $head_length.

Based on a filter by Matthew Wickline, posted to BBEdit-Talk. With some optimizations by Milian Wolff. Forced encoding of HTML attribute special characters by Allan Odgaard.

Parameters:

Parameter Type Description
$text string
$tail string Passed by reference
$head_length int
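
The obfuscation idea (most characters become a decimal or hex entity, chosen unpredictably) can be sketched as follows. This Python is an illustration, not the library's PHP code: the function name, the seeded random generator, and the proportions between raw, hex, and decimal output are all assumptions, and the $tail output is not modeled.

```python
import html
import random


def obfuscate(text: str, seed: int = 0) -> str:
    """Encode most characters of text as decimal or hex HTML entities."""
    rng = random.Random(seed)  # seeded so the output is repeatable
    pieces = []
    for ch in text:
        roll = rng.randrange(100)
        if roll > 90 and ch not in '@:"&<>':
            # Occasionally leave a harmless character as-is.
            pieces.append(ch)
        elif roll < 45:
            pieces.append(f"&#x{ord(ch):x};")  # hex entity
        else:
            pieces.append(f"&#{ord(ch)};")     # decimal entity
    return "".join(pieces)
```

A browser decodes the entities back to the original text, so html.unescape(obfuscate("mailto:foo@example.com")) gives back the input, while the raw "@" never appears in the encoded string.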

parseSpan

Take the string $str and parse it into tokens, hashing embedded HTML, escaped characters and handling code spans.

protected parseSpan(string $str): string

Parameters:

Parameter Type Description
$str string

handleSpanToken

Handle $token provided by parseSpan by determining its nature and returning the corresponding value that should replace it.

protected handleSpanToken(string $token, string& $str): string

Parameters:

Parameter Type Description
$token string
$str string Passed by reference

outdent

Remove one level of line-leading tabs or spaces

protected outdent(string $text): string

Parameters:

Parameter Type Description
$text string
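
Removing one level of indentation can be sketched with a multi-line regular expression. The Python below is illustrative rather than the library's PHP code; the function name and the tab width of 4 are assumptions.

```python
import re


def outdent(text: str, tab_width: int = 4) -> str:
    """Strip one leading tab, or up to tab_width leading spaces, per line."""
    pattern = r"^(\t|[ ]{1,%d})" % tab_width
    return re.sub(pattern, "", text, flags=re.MULTILINE)
```

Lines indented deeper than one level keep the remainder of their indentation, so nested content stays nested after the call.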

detab

Replace tabs with the appropriate amount of spaces.

protected detab(string $text): string

For each line we separate the line into blocks delimited by tab characters. Then we reconstruct every line by adding the appropriate number of spaces between blocks.

Parameters:

Parameter Type Description
$text string
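
The block-by-block reconstruction described above can be sketched as follows. The Python is illustrative, not the library's PHP code: the function names and the tab width of 4 are assumptions, and characters are counted with len rather than with a configurable UTF-8 length function.

```python
def detab_line(line: str, tab_width: int = 4) -> str:
    """Expand tabs in one line so each block starts at the next tab stop."""
    blocks = line.split("\t")
    out = blocks[0]
    for block in blocks[1:]:
        # Pad from the current length up to the next multiple of tab_width.
        pad = tab_width - len(out) % tab_width
        out += " " * pad + block
    return out


def detab(text: str, tab_width: int = 4) -> str:
    """Apply detab_line to every line of text."""
    return "\n".join(detab_line(line, tab_width) for line in text.split("\n"))
```

Padding is computed from the length of the line built so far, which is why a tab after one character adds three spaces while a tab after four characters adds four.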

_detab_callback

Replace tabs callback

protected _detab_callback(string $matches): string

Parameters:

Parameter Type Description
$matches string

_initDetab

Check for the availability of the function in the utf8_strlen property (initially mb_strlen). If the function is not available, create a function that will loosely count the number of UTF-8 characters with a regular expression.

protected _initDetab(): void

unhash

Swap back in all the tags hashed by hashHTMLBlocks.

protected unhash(string $text): string

Parameters:

Parameter Type Description
$text string

_unhash_callback

Unhashing callback

protected _unhash_callback(array $matches): string

Parameters:

Parameter Type Description
$matches array


Automatically generated on 2025-03-18