HTMLPurifier_Lexer_PH5P

Experimental HTML5-based parser using Jeroen van der Meer's PH5P library.

Occupies space in the HTML5 pseudo-namespace, which may cause conflicts.

Methods

Lexes an HTML string into tokens.

public tokenizeHTML(string $html, \HTMLPurifier_Config $config, \HTMLPurifier_Context $context): \HTMLPurifier_Token[]

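Parameters:

Parameter	Type	Description
`$html`	string
`$config`	\HTMLPurifier_Config
`$context`	\HTMLPurifier_Context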

Retrieves or sets the default Lexer as a Prototype Factory.

public static create(\HTMLPurifier_Config $config): \HTMLPurifier_Lexer

By default HTMLPurifier_Lexer_DOMLex will be returned. There are a few exceptions involving special features that only DirectLex implements.
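As a rough usage sketch, the factory can be asked for the PH5P lexer through the `Core.LexerImpl` configuration directive, and the returned lexer can then tokenize a fragment. The autoloader path and the sample HTML are assumptions for illustration; the example presumes HTML Purifier is installed and PHP's DOM extension is available. Because this lexer is experimental, the token list it yields may vary between versions.

```php
<?php
// Assumption: HTML Purifier was installed with Composer, so its classes
// are reachable through this autoloader. Adjust the path for your setup.
require_once 'vendor/autoload.php';

$config = HTMLPurifier_Config::createDefault();
// Ask the factory for the experimental PH5P lexer instead of the default.
$config->set('Core.LexerImpl', 'PH5P');

// create() reads the configuration and returns a lexer instance.
$lexer = HTMLPurifier_Lexer::create($config);

// tokenizeHTML() needs a context object in addition to the configuration.
$context = new HTMLPurifier_Context();
$tokens = $lexer->tokenizeHTML('<p>Hello <b>world</b></p>', $config, $context);

// Print the class of each token to see the shape of the result.
foreach ($tokens as $token) {
    echo get_class($token), "\n";
}
```

Leaving `Core.LexerImpl` unset lets `create()` pick the lexer itself, as described above.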

Parameters:

Parameter	Type	Description
`$config`	\HTMLPurifier_Config

Throws:

public __construct(): mixed

public parseText(mixed $string, mixed $config): mixed

Parameters:

Parameter	Type	Description
`$string`	mixed
`$config`	mixed

public parseAttr(mixed $string, mixed $config): mixed

Parameters:

Parameter	Type	Description
`$string`	mixed
`$config`	mixed

Parses special entities into the proper characters.

public parseData(string $string, mixed $is_attr, mixed $config): string

This method translates escaped versions of the special characters in the string back into the characters they represent.

Parameters:

Parameter	Type	Description
`$string`	string	String character data to be parsed.
`$is_attr`	mixed
`$config`	mixed

Return Value:

Parsed character data.
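The idea of turning special entities back into characters can be sketched with a plain lookup table. This is a simplified, hypothetical stand-in and not the library's implementation: the function name `decode_special_entities_sketch` and the exact entity table are assumptions, and the `$is_attr` and `$config` parameters are left out.

```php
<?php
// Hypothetical standalone sketch; not HTML Purifier's own code.
function decode_special_entities_sketch(string $string): string
{
    // Assumed table of special entities and the characters they stand for.
    $table = [
        '&quot;' => '"',
        '&amp;'  => '&',
        '&lt;'   => '<',
        '&gt;'   => '>',
        '&#39;'  => "'",
    ];
    // strtr() with an array makes a single pass and does not rescan text it
    // has already replaced, so "&amp;lt;" becomes "&lt;" rather than "<".
    return strtr($string, $table);
}

echo decode_special_entities_sketch('&lt;b&gt; Tom &amp; Jerry'), "\n";
```

A single-pass replacement is chosen here so that doubly escaped input is only unescaped one level.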

Translates CDATA sections into regular sections (through escaping).

protected static escapeCDATA(string $string): string

Parameters:

Parameter	Type	Description
`$string`	string	HTML string to process.

Return Value:

HTML with CDATA sections escaped.
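One way to picture this step is a regular-expression pass that replaces each CDATA section with its HTML-escaped contents. The sketch below is an illustration under assumptions, not the library's code: the function name `escape_cdata_sketch` is hypothetical, and the pattern and escaping flags are choices made for the example.

```php
<?php
// Hypothetical standalone sketch of escaping CDATA sections.
function escape_cdata_sketch(string $html): string
{
    $result = preg_replace_callback(
        // Match <![CDATA[ ... ]]> lazily; the s flag lets "." span newlines.
        '/<!\[CDATA\[(.+?)\]\]>/s',
        static function (array $matches): string {
            // Escape the section body so it reads as ordinary character data.
            return htmlspecialchars($matches[1], ENT_COMPAT, 'UTF-8');
        },
        $html
    );
    // preg_replace_callback() returns null on a regex error; fall back to
    // the untouched input in that case.
    return $result ?? $html;
}

echo escape_cdata_sketch('x<![CDATA[<b>a & b</b>]]>y'), "\n";
```

After this pass the markup inside the section no longer parses as tags, so a later lexing stage sees it as plain text.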

Special CDATA case that is especially convoluted for