module Asciidoctor::Pdf::Sanitizer

Constants

BuiltInEntityCharOrTagRx
BuiltInEntityCharRx
BuiltInEntityChars
NumericCharRefRx
SegmentPcdataRx
XmlSanitizeRx

Public Instance Methods

sanitize(string)

Strip leading and trailing whitespace, remove XML tags, squeeze runs of spaces into a single space, and resolve numeric character references and built-in entities in the specified string.

FIXME move to a module so we can mix it in elsewhere

FIXME add option to control escaping entities, or a filter mechanism in general

# File lib/asciidoctor-pdf/sanitizer.rb, line 20
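def sanitize string
  string.strip                                         # trim leading and trailing whitespace
      .gsub(XmlSanitizeRx, '')                         # remove XML tags
      .tr_s(' ', ' ')                                  # squeeze runs of spaces into one space
      .gsub(NumericCharRefRx) { [$1.to_i].pack('U*') } # numeric reference -> UTF-8 character
      .gsub(BuiltInEntityCharRx, BuiltInEntityChars)   # built-in entity -> literal character
end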
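The constant definitions are not shown on this page, so the following is only a minimal sketch of how sanitize might behave. The module name SanitizerSketch and every regular expression and hash value in it are hypothetical stand-ins (for example, it assumes numeric references are decimal and that the built-in entities are &lt;, &gt; and &amp;); the real definitions may differ.

```ruby
# Hypothetical stand-ins for the module's constants; the actual definitions
# are not listed on this page and may differ.
module SanitizerSketch
  BuiltInEntityChars = { '&lt;' => '<', '&gt;' => '>', '&amp;' => '&' }.freeze
  BuiltInEntityCharRx = /&(?:lt|gt|amp);/
  NumericCharRefRx = /&#(\d{2,6});/ # assumption: decimal references only
  XmlSanitizeRx = /<[^>]+>/

  # Same body as the sanitize method documented above.
  def sanitize string
    string.strip
        .gsub(XmlSanitizeRx, '')
        .tr_s(' ', ' ')
        .gsub(NumericCharRefRx) { [$1.to_i].pack('U*') }
        .gsub(BuiltInEntityCharRx, BuiltInEntityChars)
  end
end

# Mix the sketch into a throwaway object to call it.
sketch = Object.new.extend(SanitizerSketch)
puts sketch.sanitize("  <b>Hello</b>   &amp; &#169; World ")
puts sketch.sanitize('&lt;b&gt;')
```

Under these assumptions, tags are removed before entities are resolved, so an escaped tag such as &lt;b&gt; is decoded to literal text rather than stripped.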
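
upcase_pcdata(string)

Upcase the character data in the specified string. If the string contains an XML tag or a built-in entity, only the segments that SegmentPcdataRx captures as character data are upcased and the other segments (presumably tags and entities) are left unchanged; otherwise the whole string is upcased.
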
# File lib/asciidoctor-pdf/sanitizer.rb, line 28
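def upcase_pcdata string
  if BuiltInEntityCharOrTagRx =~ string
    # upcase only the character data ($2); keep any other segment ($1) as is
    string.gsub(SegmentPcdataRx) { $2 ? $2.upcase : $1 }
  else
    # no tag or built-in entity found, so the whole string can be upcased
    string.upcase
  end
end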
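
As a rough illustration of why upcase_pcdata does not simply call upcase, the sketch below uses hypothetical definitions for BuiltInEntityCharOrTagRx and SegmentPcdataRx (the real ones are not shown on this page) inside a made-up module named UpcaseSketch. It assumes the first capture group matches a tag or named entity and the second matches a run of plain text.

```ruby
# Hypothetical stand-ins for the module's constants; the actual definitions
# are not listed on this page and may differ.
module UpcaseSketch
  BuiltInEntityCharOrTagRx = /&(?:lt|gt|amp);|</
  # Group 1: a named entity or an XML tag. Group 2: a run of plain text.
  SegmentPcdataRx = /(?:(&[a-z]+;|<[^>]+>)|([^&<]+))/

  # Same body as the upcase_pcdata method documented above.
  def upcase_pcdata string
    if BuiltInEntityCharOrTagRx =~ string
      string.gsub(SegmentPcdataRx) { $2 ? $2.upcase : $1 }
    else
      string.upcase
    end
  end
end

# Mix the sketch into a throwaway object to call it.
sketch = Object.new.extend(UpcaseSketch)
puts sketch.upcase_pcdata('plain text')
puts sketch.upcase_pcdata('<em>big</em> &amp; small')
puts '<em>big</em> &amp; small'.upcase # for comparison: mangles tag and entity
```

With these assumed patterns, the tag and entity survive in lowercase while the surrounding text is upcased, whereas a plain upcase would also turn &amp; into &AMP;.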