Class XmlEscapeCharacterConverter


  • public final class XmlEscapeCharacterConverter
    extends java.lang.Object
    This converter handles the references (escape characters) found in the text or markup of an XML document. Those references are defined in the ISO-8859-1 reference.

    The conversion supports both numeric character references (&#nnnn; where nnnn is the code point in decimal form, or &#xhhhh; where hhhh is the code point in hexadecimal form) and character entity references (&name; where name is the case-sensitive name of the entity).

    Version:
    2.5
    Author:
    Pascal Filion
    Since:
    2.5
    • Method Summary

      Modifier and Type Method Description
      static java.lang.String escape​(java.lang.String value, int[] positions)
      Converts any reserved XML characters found in the given string into their corresponding character entity references (escape characters).
      static java.lang.String getCharacter​(java.lang.String reference)
      Returns the Unicode character for the given reference (which is either a numeric character reference or a character entity reference).
      static java.lang.String getEscapeCharacter​(char character)
      Returns the escaped character for the given reserved character.
      static boolean isReserved​(char character)
      Determines if the given character is one of the XML/HTML reserved characters.
      static void reposition​(java.lang.CharSequence query, int[] positions)
      Re-adjusts the given positions, which are based on the non-escaped version of the given query, so that they point at the same locations within query, which contains references (escape characters).
      static java.lang.String unescape​(java.lang.String value, int[] position)
      Converts the references (escape characters) the given string may have into their corresponding Unicode characters.
      • Methods inherited from class java.lang.Object

        clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
    • Field Detail

      • AMPERSAND_ENTITY_NAME

        public static final java.lang.String AMPERSAND_ENTITY_NAME
        The entity name for ampersand: &.
        See Also:
        Constant Field Values
      • APOSTROPHE_ENTITY_NAME

        public static final java.lang.String APOSTROPHE_ENTITY_NAME
        The entity name for apostrophe: '.
        See Also:
        Constant Field Values
      • GREATER_THAN_ENTITY_NAME

        public static final java.lang.String GREATER_THAN_ENTITY_NAME
        The entity name for greater-than symbol: >.
        See Also:
        Constant Field Values
      • LESS_THAN_ENTITY_NAME

        public static final java.lang.String LESS_THAN_ENTITY_NAME
        The entity name for less-than symbol: <.
        See Also:
        Constant Field Values
      • QUOTATION_MARK_NAME

        public static final java.lang.String QUOTATION_MARK_NAME
        The entity name for quotation mark: ".
        See Also:
        Constant Field Values
    • Method Detail

      • escape

        public static java.lang.String escape​(java.lang.String value,
                                              int[] positions)
        Converts any reserved XML characters found in the given string into their corresponding character entity references (escape characters).
        Parameters:
        value - A string that may contain characters that need to be escaped
        positions - This array of length one or two can be used to adjust the position of the cursor or a text range within the string during the conversion of the reserved characters
        Returns:
        The given string with any reserved characters converted into the escape characters
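
        The position adjustment described above can be illustrated with a minimal sketch. EscapeSketch and entityFor are hypothetical names, the set of reserved characters is assumed to be the five characters named by the entity name constants, and this is not the actual source of the class.

```java
// Hypothetical stand-in, not the actual XmlEscapeCharacterConverter source.
final class EscapeSketch {

    // Maps a reserved character to its entity reference, or returns null.
    // Assumption: only the five predefined XML entities are reserved.
    static String entityFor(char character) {
        switch (character) {
            case '&':  return "&amp;";
            case '\'': return "&apos;";
            case '>':  return "&gt;";
            case '<':  return "&lt;";
            case '"':  return "&quot;";
            default:   return null;
        }
    }

    // Escapes reserved characters. The optional positions array holds offsets
    // into the original string and is updated to point into the escaped one.
    static String escape(String value, int[] positions) {
        StringBuilder escaped = new StringBuilder(value.length());
        int[] shifts = (positions == null) ? new int[0] : new int[positions.length];

        for (int index = 0; index < value.length(); index++) {
            char character = value.charAt(index);
            String entity = entityFor(character);

            if (entity == null) {
                escaped.append(character);
                continue;
            }
            escaped.append(entity);

            // Every position located after this character moves to the right
            // by the number of characters the replacement added.
            for (int p = 0; p < shifts.length; p++) {
                if (positions[p] > index) {
                    shifts[p] += entity.length() - 1;
                }
            }
        }
        for (int p = 0; p < shifts.length; p++) {
            positions[p] += shifts[p];
        }
        return escaped.toString();
    }

    public static void main(String[] args) {
        int[] cursor = { 5 };
        System.out.println(escape("a < b & c", cursor));
        System.out.println(cursor[0]);
    }
}
```

        In this sketch a cursor placed right after "b" in the original text still sits right after "b" once "<" has grown into "&lt;".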
      • getCharacter

        public static java.lang.String getCharacter​(java.lang.String reference)
        Returns the Unicode character for the given reference (which is either a numeric character reference or a character entity reference).
        Parameters:
        reference - The numeric character reference or character entity reference, stripped of its leading ampersand and trailing semicolon
        Returns:
        The Unicode character mapped to the given reference or null if the reference is invalid or unknown
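
        As an illustration of this contract, the sketch below resolves a reference that has already been stripped of its delimiters. ReferenceSketch and resolve are hypothetical names, the named entity table is a small assumed subset, and the strictness rules (lowercase x only, no sign) are choices made for the example rather than documented behavior of the class.

```java
// Hypothetical stand-in, not the actual XmlEscapeCharacterConverter source.
final class ReferenceSketch {

    // Resolves a reference that has already lost its '&' and ';' delimiters,
    // for example "#169", "#xA9" or "copy". Returns null when it cannot be resolved.
    static String resolve(String reference) {
        if (reference == null || reference.isEmpty()) {
            return null;
        }
        if (reference.charAt(0) == '#') {
            // XML uses a lowercase 'x' to mark a hexadecimal code point.
            boolean hex = reference.startsWith("#x");
            int radix = hex ? 16 : 10;
            String digits = reference.substring(hex ? 2 : 1);

            // Reject an empty digit string or a leading sign.
            if (digits.isEmpty() || Character.digit(digits.charAt(0), radix) < 0) {
                return null;
            }
            try {
                int codePoint = Integer.parseInt(digits, radix);
                return new String(Character.toChars(codePoint));
            } catch (IllegalArgumentException e) {
                // Covers malformed digits and values outside the Unicode range.
                return null;
            }
        }
        // Entity names are case-sensitive; this table is an assumed subset.
        switch (reference) {
            case "amp":  return "&";
            case "apos": return "'";
            case "gt":   return ">";
            case "lt":   return "<";
            case "quot": return "\"";
            case "copy": return "\u00A9";
            default:     return null;
        }
    }
}
```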
      • getEscapeCharacter

        public static java.lang.String getEscapeCharacter​(char character)
        Returns the escaped character for the given reserved character.
        Parameters:
        character - The reserved character for which to retrieve the escape character (the reference built from its entity name)
        Returns:
        The escape character built from the entity name of the given character if it is a reserved character; null otherwise
      • isReserved

        public static boolean isReserved​(char character)
        Determines if the given character is one of the XML/HTML reserved characters.
        Parameters:
        character - The character to check
        Returns:
        true if the given character is one of the reserved characters; false otherwise
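
        The relationship between isReserved and getEscapeCharacter can be sketched as a single lookup shared by both checks. ReservedSketch and entityName are hypothetical names, and the constant values shown are assumed to be the standard XML entity names; the actual values are listed under Constant Field Values.

```java
// Hypothetical stand-in; the constant values are assumed, not taken from the class.
final class ReservedSketch {

    static final String AMPERSAND_ENTITY_NAME    = "amp";
    static final String APOSTROPHE_ENTITY_NAME   = "apos";
    static final String GREATER_THAN_ENTITY_NAME = "gt";
    static final String LESS_THAN_ENTITY_NAME    = "lt";
    static final String QUOTATION_MARK_NAME      = "quot";

    // Returns the entity name of a reserved character, or null otherwise.
    static String entityName(char character) {
        switch (character) {
            case '&':  return AMPERSAND_ENTITY_NAME;
            case '\'': return APOSTROPHE_ENTITY_NAME;
            case '>':  return GREATER_THAN_ENTITY_NAME;
            case '<':  return LESS_THAN_ENTITY_NAME;
            case '"':  return QUOTATION_MARK_NAME;
            default:   return null;
        }
    }

    // A character is reserved exactly when it has an entity name.
    static boolean isReserved(char character) {
        return entityName(character) != null;
    }

    // Wraps the entity name in '&' and ';', or returns null when not reserved.
    static String getEscapeCharacter(char character) {
        String name = entityName(character);
        return (name == null) ? null : "&" + name + ";";
    }
}
```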
      • reposition

        public static void reposition​(java.lang.CharSequence query,
                                      int[] positions)
        Re-adjusts the given positions, which are based on the non-escaped version of the given query, so that they point at the same locations within query, which contains references (escape characters).

        The escape characters are either the character entity references or the numeric character references used in an XML document.

        Important: The given query should contain exactly the same amount of whitespace as the query used to calculate the given positions.

        Parameters:
        query - The query that may contain escape characters
        positions - The positions within the non-escaped version of the given query: either a single position or two positions that are used as a text range. After execution, the array contains the adjusted positions, which are shifted by the difference in length between the escaped and non-escaped versions of the query
        Since:
        2.5
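
        One way to picture the adjustment is to walk the escaped query while counting how many non-escaped characters have been consumed. The sketch below makes several assumptions: RepositionSketch and referenceLength are hypothetical names, every reference is taken to stand for exactly one character, and the two versions of the query are assumed to differ only by their references. It is not the actual source of the class.

```java
// Hypothetical stand-in, not the actual XmlEscapeCharacterConverter source.
final class RepositionSketch {

    // Moves each position from the non-escaped text to the escaped query.
    static void reposition(CharSequence query, int[] positions) {
        int[] adjusted = positions.clone();
        int escaped = 0;    // index within query
        int unescaped = 0;  // matching index within the non-escaped text

        while (true) {
            for (int p = 0; p < positions.length; p++) {
                if (positions[p] == unescaped) {
                    adjusted[p] = escaped;
                }
            }
            if (escaped >= query.length()) {
                break;
            }
            // A whole reference counts as a single non-escaped character.
            escaped += referenceLength(query, escaped);
            unescaped++;
        }
        System.arraycopy(adjusted, 0, positions, 0, positions.length);
    }

    // Returns the length of the reference starting at index, or 1 when the
    // character at index does not start a well-formed reference.
    static int referenceLength(CharSequence text, int index) {
        if (text.charAt(index) != '&') {
            return 1;
        }
        for (int end = index + 1; end < text.length(); end++) {
            char character = text.charAt(end);
            if (character == ';') {
                return (end > index + 1) ? end - index + 1 : 1;
            }
            if (!Character.isLetterOrDigit(character) && character != '#') {
                return 1;
            }
        }
        return 1; // no terminating semicolon
    }
}
```

        With the query "a &lt; b", a range of 3 to 5 computed on "a < b" becomes 6 to 8 in this sketch. Positions beyond the end of the non-escaped text are left unchanged.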
      • unescape

        public static java.lang.String unescape​(java.lang.String value,
                                                int[] position)
        Converts the references (escape characters) the given string may have into their corresponding Unicode characters.
        • Character entity reference: &copy; for ©
        • Numeric character reference (decimal value): &#169; for ©
        • Numeric character reference (hexadecimal value): &#xA9; for ©
        Parameters:
        value - A string that may contain escape characters
        position - This array of length one can be used to adjust the position of the cursor within the string during the conversion of the escape characters
        Returns:
        The given string with any escape characters converted into the actual Unicode characters
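
        The three reference forms listed above, together with the cursor adjustment, can be sketched as follows. UnescapeSketch and resolve are hypothetical names, the named entity table is a small assumed subset, and snapping a cursor that falls inside a reference to the end of its replacement is a choice made for the example, not documented behavior.

```java
// Hypothetical stand-in, not the actual XmlEscapeCharacterConverter source.
final class UnescapeSketch {

    // Resolves the text found between '&' and ';', or returns null.
    static String resolve(String reference) {
        if (reference.startsWith("#")) {
            boolean hex = reference.startsWith("#x");
            try {
                int codePoint = Integer.parseInt(reference.substring(hex ? 2 : 1), hex ? 16 : 10);
                return new String(Character.toChars(codePoint));
            } catch (IllegalArgumentException e) {
                return null; // malformed digits or invalid code point
            }
        }
        switch (reference) {  // assumed subset of the supported entity names
            case "amp":  return "&";
            case "apos": return "'";
            case "gt":   return ">";
            case "lt":   return "<";
            case "quot": return "\"";
            case "copy": return "\u00A9";
            default:     return null;
        }
    }

    // Replaces references with their characters. The optional position array
    // of length one holds a cursor offset that is moved along with the text.
    static String unescape(String value, int[] position) {
        StringBuilder out = new StringBuilder(value.length());
        int cursor = (position == null) ? -1 : position[0];
        int newCursor = cursor;
        int index = 0;

        while (index < value.length()) {
            if (cursor == index) {
                newCursor = out.length();
            }
            char character = value.charAt(index);
            int end = (character == '&') ? value.indexOf(';', index) : -1;
            String replacement = (end > index + 1) ? resolve(value.substring(index + 1, end)) : null;

            if (replacement == null) {
                out.append(character); // not a known reference, keep as-is
                index++;
            } else {
                out.append(replacement);
                // A cursor inside the reference snaps to the end of the replacement.
                if (cursor > index && cursor <= end) {
                    newCursor = out.length();
                }
                index = end + 1;
            }
        }
        if (cursor >= value.length()) {
            newCursor = out.length();
        }
        if (position != null) {
            position[0] = newCursor;
        }
        return out.toString();
    }
}
```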