module Loofah::TextBehavior

Overrides text in HTML::Document and HTML::DocumentFragment, and mixes in to_text.

Public Instance Methods

inner_text(options={})
Alias for: text
text(options={})

Returns a plain-text version of the markup contained by the document, with HTML entities encoded.

This method is significantly faster than to_text, but it does not insert whitespace around block elements, so the text of adjacent blocks runs together.

Loofah.document("<h1>Title</h1><div>Content</div>").text
# => "TitleContent"

By default, the returned text will have HTML entities escaped. If you want unescaped entities, and you understand that the result is unsafe to render in a browser, pass :encode_special_chars => false:

frag = Loofah.fragment("&lt;script&gt;alert('EVIL');&lt;/script&gt;")
# ok for browser:
frag.text                                 # => "&lt;script&gt;alert('EVIL');&lt;/script&gt;"
# decidedly not ok for browser:
frag.text(:encode_special_chars => false) # => "<script>alert('EVIL');</script>"
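The option handling can be illustrated without Loofah installed. The following is only a rough sketch: plain_text is a hypothetical helper (not part of Loofah), and ERB::Util.html_escape from the standard library stands in for Loofah's own entity encoding, which may differ in detail (for example in how quotes are treated).

```ruby
require "erb"

# Hypothetical helper, NOT part of Loofah: it mirrors the shape of the
# :encode_special_chars check, using ERB::Util.html_escape as a stand-in
# for Loofah's own encoding step (an assumption made for illustration).
def plain_text(raw, options = {})
  if options[:encode_special_chars] == false
    raw # possibly dangerous if rendered in a browser
  else
    ERB::Util.html_escape(raw)
  end
end

raw = "<script>alert(1);</script>"

safe   = plain_text(raw)                                  # entities stay encoded
unsafe = plain_text(raw, :encode_special_chars => false)  # raw markup comes back

puts safe
puts unsafe
```

Only an explicit false disables encoding; omitting the option, or passing nil, leaves the safe default in place.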
# File lib/loofah/instance_methods.rb, line 94
def text(options={})
  result = serialize_root.children.inner_text rescue ""
  if options[:encode_special_chars] == false
    result # possibly dangerous if rendered in a browser
  else
    encode_special_chars result
  end
end
Also aliased as: inner_text, to_str
to_str(options={})
Alias for: text
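One practical consequence of a to_str alias is that Ruby treats the object as implicitly convertible to a String. The sketch below shows that general Ruby behavior with a hypothetical stand-in class (FakeFragment is made up for illustration and is not Loofah's implementation).

```ruby
# Hypothetical stand-in class, NOT Loofah's implementation: it only
# reproduces the alias layout (text, inner_text, to_str).
class FakeFragment
  def initialize(content)
    @content = content
  end

  def text(options = {})
    @content
  end

  alias_method :inner_text, :text
  alias_method :to_str, :text
end

frag = FakeFragment.new("Title")

# String#+ and Array#join call to_str implicitly on non-String operands.
heading = "Heading: " + frag
joined  = [frag, "Content"].join(" ")

puts heading
puts joined
```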
to_text(options={})

Returns a plain-text version of the markup contained by the fragment, with HTML entities encoded.

This method is slower than text, but is clever about whitespace around block elements.

Loofah.document("<h1>Title</h1><div>Content</div>").to_text
# => "\nTitle\n\nContent\n"
# File lib/loofah/instance_methods.rb, line 115
def to_text(options={})
  Loofah.remove_extraneous_whitespace self.dup.scrub!(:newline_block_elements).text(options)
end
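The difference between text and to_text can be approximated with the standard library alone. This is a toy sketch under loud assumptions: BLOCK_TAGS, naive_text and naive_to_text are hypothetical names, the regex-based tag handling is a simplification that is not robust HTML parsing, and the whitespace collapsing rule is a guess at what "extraneous whitespace" means rather than Loofah's actual logic.

```ruby
# Toy approximation only; none of these names exist in Loofah.
# Assumed short list of block-level tags, for illustration.
BLOCK_TAGS = %w[h1 h2 h3 p div li].freeze
BLOCK_TAG_PATTERN = %r{</?(?:#{BLOCK_TAGS.join("|")})\b[^>]*>}i

# Like text: drop every tag, adding no whitespace.
def naive_text(html)
  html.gsub(/<[^>]+>/, "")
end

# Like to_text: put a newline where each block tag was, drop the
# remaining tags, then collapse runs of three or more newlines.
def naive_to_text(html)
  with_newlines = html.gsub(BLOCK_TAG_PATTERN, "\n")
  naive_text(with_newlines).gsub(/\n{3,}/, "\n\n")
end

html = "<h1>Title</h1><div>Content</div>"

flat   = naive_text(html)
spaced = naive_to_text(html)

p flat
p spaced
```

The extra pass over the markup before extracting text is one plausible reason for to_text being the slower of the two methods.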