class Loofah::Scrubber
A Scrubber wraps up a block (or method) that is run on an HTML node (element):
    # change all <span> tags to <div> tags
    span2div = Loofah::Scrubber.new do |node|
      node.name = "div" if node.name == "span"
    end
Alternatively, this scrubber could have been implemented as:
    class Span2Div < Loofah::Scrubber
      def scrub(node)
        node.name = "div" if node.name == "span"
      end
    end
    span2div = Span2Div.new
This can then be run on a document:
    Loofah.fragment("<span>foo</span><p>bar</p>").scrub!(span2div).to_s
    # => "<div>foo</div><p>bar</p>"
Scrubbers can be run on a document in either a top-down traversal (the default) or bottom-up. Top-down scrubbers can optionally return Scrubber::STOP to terminate the traversal of a subtree.
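The difference between the two traversal orders, and the effect of STOP, can be sketched without Loofah at all. The following is a minimal stand-in, not the library's own class: `Node` and `TinyScrubber` are hypothetical names invented for this illustration, and the traversal methods only mirror the logic shown in the source listings on this page.

```ruby
# Stand-in for a document node: just a name and a list of children.
# (Hypothetical; real scrubbers receive Nokogiri nodes.)
Node = Struct.new(:name, :children)

# Hypothetical model of the traversal logic; not Loofah::Scrubber itself.
class TinyScrubber
  STOP = :stop

  def initialize(direction = :top_down, &block)
    @direction = direction
    @block = block
  end

  def traverse(node)
    @direction == :bottom_up ? bottom_up(node) : top_down(node)
  end

  private

  # Visit the node first; skip its children if the block returns STOP.
  def top_down(node)
    return if @block.call(node) == STOP
    node.children.each { |child| top_down(child) }
  end

  # Visit the children first; the block's return value is ignored.
  def bottom_up(node)
    node.children.each { |child| bottom_up(child) }
    @block.call(node)
  end
end

tree = Node.new("div", [
  Node.new("pre", [Node.new("span", [])]),
  Node.new("p", [])
])

top_down_order = []
TinyScrubber.new(:top_down) { |n| top_down_order << n.name; nil }.traverse(tree)

bottom_up_order = []
TinyScrubber.new(:bottom_up) { |n| bottom_up_order << n.name; nil }.traverse(tree)

# Returning STOP at <pre> prunes its subtree, so <span> is never visited.
pruned_order = []
TinyScrubber.new(:top_down) do |n|
  pruned_order << n.name
  n.name == "pre" ? TinyScrubber::STOP : nil
end.traverse(tree)

p top_down_order   # ["div", "pre", "span", "p"]
p bottom_up_order  # ["span", "pre", "p", "div"]
p pruned_order     # ["div", "pre", "p"]
```

Because a bottom-up traversal has already visited the children by the time the parent is reached, STOP has nothing left to prune there, which is why it only applies to top-down scrubbers.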
Constants
Attributes
When a scrubber is initialized, the optional block is saved as :block. Note that, if no block is passed, then the scrub method is assumed to have been implemented.
When a scrubber is initialized, the :direction may be specified as :top_down (the default) or :bottom_up.
Public Class Methods
Options may include

    :direction => :top_down (the default)

or

    :direction => :bottom_up
For top_down traversals, if the block returns Loofah::Scrubber::STOP, then the traversal will be terminated for the current node's subtree.
Alternatively, a Scrubber may inherit from Loofah::Scrubber and implement scrub, which is slightly faster than using a block.
    # File lib/loofah/scrubber.rb, line 64
    def initialize(options = {}, &block)
      direction = options[:direction] || :top_down
      unless [:top_down, :bottom_up].include?(direction)
        raise ArgumentError, "direction #{direction} must be one of :top_down or :bottom_up"
      end
      @direction, @block = direction, block
    end
Public Instance Methods
When new is not passed a block, the class may implement scrub, which will be called for each document node.
    # File lib/loofah/scrubber.rb, line 85
    def scrub(node)
      raise ScrubberNotFound, "No scrub method has been defined on #{self.class.to_s}"
    end
Calling traverse will cause the document to be traversed by either the block passed to the initializer or the scrub method, in the direction specified when the scrubber was created.
    # File lib/loofah/scrubber.rb, line 77
    def traverse(node)
      direction == :bottom_up ? traverse_conditionally_bottom_up(node) : traverse_conditionally_top_down(node)
    end
Private Instance Methods
    # File lib/loofah/scrubber.rb, line 91
    def html5lib_sanitize(node)
      case node.type
      when Nokogiri::XML::Node::ELEMENT_NODE
        if HTML5::Scrub.allowed_element? node.name
          HTML5::Scrub.scrub_attributes node
          return Scrubber::CONTINUE
        end
      when Nokogiri::XML::Node::TEXT_NODE, Nokogiri::XML::Node::CDATA_SECTION_NODE
        return Scrubber::CONTINUE
      end
      Scrubber::STOP
    end

    # File lib/loofah/scrubber.rb, line 113
    def traverse_conditionally_bottom_up(node)
      node.children.each {|j| traverse_conditionally_bottom_up(j)}
      if block
        block.call(node)
      else
        scrub(node)
      end
    end

    # File lib/loofah/scrubber.rb, line 104
    def traverse_conditionally_top_down(node)
      if block
        return if block.call(node) == STOP
      else
        return if scrub(node) == STOP
      end
      node.children.each {|j| traverse_conditionally_top_down(j)}
    end