class Loofah::Scrubbers::Whitewash

scrub!(:whitewash)

:whitewash removes all comments, styling and attributes in addition to doing markup-fixer-uppery and pruning unsafe tags. I like to call this “whitewashing”, since it's like putting a new layer of paint on top of the HTML input to make it look nice.

messy_markup = "ohai! <div id='foo' class='bar' style='margin: 10px'>div with attributes</div>"
Loofah.fragment(messy_markup).scrub!(:whitewash)
=> "ohai! <div>div with attributes</div>"

One use case for this scrubber is to clean up HTML that was cut-and-pasted from Microsoft Word into a WYSIWYG editor or a rich text editor. Microsoft's software is famous for injecting all kinds of cruft into its HTML output. Who needs that crap? Certainly not me.

Public Class Methods

new()
# File lib/loofah/scrubbers.rb, line 160
def initialize
  @direction = :top_down
end

Public Instance Methods

scrub(node)
# File lib/loofah/scrubbers.rb, line 164
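def scrub(node)
  case node.type
  when Nokogiri::XML::Node::ELEMENT_NODE
    if HTML5::Scrub.allowed_element? node.name
      # Allowed element: strip every attribute (id, class, style, ...).
      node.attributes.each { |attr| node.remove_attribute(attr.first) }
      # Keep it and descend into its children, unless namespaces are in scope.
      return CONTINUE if node.namespaces.empty?
    end
  when Nokogiri::XML::Node::TEXT_NODE, Nokogiri::XML::Node::CDATA_SECTION_NODE
    # Text and CDATA content is always kept.
    return CONTINUE
  end
  # Everything else (comments, disallowed or namespaced elements) is removed
  # together with its subtree, so there is nothing left to traverse.
  node.remove
  STOP
end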
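
The control flow of scrub(node) can be easier to follow on a toy tree. The sketch below is a simplified, standard-library-only model and not Loofah's implementation: ToyNode, ALLOWED, whitewash, traverse, render and the element/text/comment helpers are all hypothetical names introduced for illustration, and the three-tag allow-list stands in for the real HTML5 allow-list check. It assumes that a top-down traversal scrubs a parent before its children and descends only when the scrubber returns CONTINUE.

```ruby
# Toy model only: every name here is hypothetical, not Loofah or Nokogiri API.
ToyNode = Struct.new(:kind, :name, :text, :attrs, :children, :removed,
                     keyword_init: true) do
  def remove
    self.removed = true
  end
end

CONTINUE = :continue
STOP     = :stop
ALLOWED  = %w[div p b].freeze # stand-in for the HTML5 allow-list check

def element(name, attrs = {}, children = [])
  ToyNode.new(kind: :element, name: name, attrs: attrs,
              children: children, removed: false)
end

def text(str)
  ToyNode.new(kind: :text, text: str, attrs: {}, children: [], removed: false)
end

def comment(str)
  ToyNode.new(kind: :comment, text: str, attrs: {}, children: [], removed: false)
end

# Same decision shape as scrub(node): keep allowed elements (minus their
# attributes) and text, remove everything else.
def whitewash(node)
  case node.kind
  when :element
    if ALLOWED.include?(node.name)
      node.attrs.clear
      return CONTINUE
    end
  when :text
    return CONTINUE
  end
  node.remove
  STOP
end

# Top-down traversal: scrub the parent first, descend only on CONTINUE.
def traverse(node)
  return if whitewash(node) == STOP
  node.children.each { |child| traverse(child) }
end

# Serialize whatever survived the scrub.
def render(node)
  return "" if node.removed
  return node.text if node.kind == :text
  return "" unless node.kind == :element
  attrs = node.attrs.map { |k, v| " #{k}='#{v}'" }.join
  inner = node.children.map { |c| render(c) }.join
  "<#{node.name}#{attrs}>#{inner}</#{node.name}>"
end

root = element("div", { "id" => "foo", "style" => "margin: 10px" }, [
  text("div with attributes"),
  comment("a comment"),
  element("script", {}, [text("alert(1)")])
])

traverse(root)
puts render(root) # prints: <div>div with attributes</div>
```

In this model the text inside the script element is never visited: once its parent returns STOP, the whole subtree is gone, which is why a top-down direction suits a scrubber that removes nodes outright.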