module Loofah
Strings and IO Objects as Input
Loofah.document and Loofah.fragment accept any IO object in addition to a string. That IO object could be a file, a socket, a StringIO, or anything else that responds to read and close, which makes it particularly easy to sanitize large numbers of documents.
Constants
- VERSION
  The version of Loofah you are using
Public Class Methods
Shortcut for Loofah::HTML::Document.parse. This method accepts the same parameters as Nokogiri::HTML::Document.parse.
# File lib/loofah.rb, line 35
def document(*args, &block)
  remove_comments_before_html_element Loofah::HTML::Document.parse(*args, &block)
end
Shortcut for Loofah::HTML::DocumentFragment.parse. This method accepts the same parameters as Nokogiri::HTML::DocumentFragment.parse.
# File lib/loofah.rb, line 41
def fragment(*args, &block)
  Loofah::HTML::DocumentFragment.parse(*args, &block)
end
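A helper to remove extraneous whitespace from text-ified HTML. Any run of three or more newlines, with optional whitespace between them, is collapsed to a single blank line.
# File lib/loofah.rb, line 78
def remove_extraneous_whitespace(string)
  string.gsub(/\n\s*\n\s*\n/, "\n\n")
end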
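To make the effect of that substitution concrete, here is a small sketch that applies the same regular expression to a sample string. The method name squeeze_blank_lines and the sample text are hypothetical; the method is a standalone copy of the gsub call so the snippet runs without the gem.

```ruby
# Hypothetical standalone copy of the substitution shown above.
def squeeze_blank_lines(string)
  string.gsub(/\n\s*\n\s*\n/, "\n\n")
end

# Made-up input: four bare newlines, then newlines separated by a space and a tab.
text = "Title\n\n\n\nFirst paragraph\n \n\t\nSecond paragraph\n\nThird"

squeezed = squeeze_blank_lines(text)
puts squeezed
```

Runs of three or more newlines collapse to exactly one blank line, while the existing single blank line before "Third" is left alone because the pattern needs at least three newlines to match.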
Shortcut for Loofah.document(string_or_io).scrub!(method)
# File lib/loofah.rb, line 51
def scrub_document(string_or_io, method)
  Loofah.document(string_or_io).scrub!(method)
end
Shortcut for Loofah.fragment(string_or_io).scrub!(method)
# File lib/loofah.rb, line 46
def scrub_fragment(string_or_io, method)
  Loofah.fragment(string_or_io).scrub!(method)
end
Shortcut for Loofah.xml_document(string_or_io).scrub!(method)
# File lib/loofah.rb, line 73
def scrub_xml_document(string_or_io, method)
  Loofah.xml_document(string_or_io).scrub!(method)
end
Shortcut for Loofah.xml_fragment(string_or_io).scrub!(method)
# File lib/loofah.rb, line 68
def scrub_xml_fragment(string_or_io, method)
  Loofah.xml_fragment(string_or_io).scrub!(method)
end
Shortcut for Loofah::XML::Document.parse. This method accepts the same parameters as Nokogiri::XML::Document.parse.
# File lib/loofah.rb, line 57
def xml_document(*args, &block)
  Loofah::XML::Document.parse(*args, &block)
end
Shortcut for Loofah::XML::DocumentFragment.parse. This method accepts the same parameters as Nokogiri::XML::DocumentFragment.parse.
# File lib/loofah.rb, line 63
def xml_fragment(*args, &block)
  Loofah::XML::DocumentFragment.parse(*args, &block)
end
Private Class Methods
Removes comments that exist outside of the HTML element.
These comments are allowed by the HTML spec:
https://www.w3.org/TR/html401/struct/global.html#h-7.1
but they are not scrubbed by Loofah, because these nodes don't meet the contract that scrubbers expect of a node (e.g., that it can be replaced, and that sibling and child nodes can be created).
# File lib/loofah.rb, line 93
def remove_comments_before_html_element(doc)
  doc.children.each do |child|
    child.unlink if child.comment?
  end
  doc
end