module Loofah::TextBehavior
Overrides text in HTML::Document and HTML::DocumentFragment, and mixes in to_text.
Public Instance Methods
text(options={})
Returns a plain-text version of the markup contained by the document, with HTML entities encoded.
This method is significantly faster than to_text, but isn't clever about whitespace around block elements.
Loofah.document("<h1>Title</h1><div>Content</div>").text # => "TitleContent"
By default, the returned text will have HTML entities escaped. If you want unescaped entities, and you understand that the result is unsafe to render in a browser, then you can pass an argument as shown:
frag = Loofah.fragment("&lt;script&gt;alert('EVIL');&lt;/script&gt;")
# ok for browser:
frag.text # => "&lt;script&gt;alert('EVIL');&lt;/script&gt;"
# decidedly not ok for browser:
frag.text(:encode_special_chars => false) # => "<script>alert('EVIL');</script>"
# File lib/loofah/instance_methods.rb, line 94
def text(options={})
  result = serialize_root.children.inner_text rescue ""
  if options[:encode_special_chars] == false
    result # possibly dangerous if rendered in a browser
  else
    encode_special_chars result
  end
end
Also aliased as: inner_text, to_str
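The effect of the :encode_special_chars option can be sketched without Loofah installed. The helpers escape_markup and plain_text below are hypothetical stand-ins written for illustration only; the set of characters they escape is an assumption and may differ from what Loofah's own encode_special_chars produces.

```ruby
# Hypothetical helper (not Loofah's implementation): escape the characters
# that would let plain text be interpreted as markup by a browser.
def escape_markup(text)
  escapes = { "&" => "&amp;", "<" => "&lt;", ">" => "&gt;" }
  text.gsub(/[&<>]/, escapes)
end

# Hypothetical stand-in mirroring the branch structure of text(options):
# escaping is on by default and is skipped only when the caller passes
# :encode_special_chars => false.
def plain_text(raw, options = {})
  if options[:encode_special_chars] == false
    raw # possibly dangerous if rendered in a browser
  else
    escape_markup(raw)
  end
end

raw = "<script>alert('EVIL');</script>"

puts plain_text(raw)
# prints: &lt;script&gt;alert('EVIL');&lt;/script&gt;

puts plain_text(raw, :encode_special_chars => false)
# prints: <script>alert('EVIL');</script>
```

Note that escaping is skipped only for an explicit false; omitting the option or passing nil keeps the safe default.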
to_text(options={})
Returns a plain-text version of the markup contained by the fragment, with HTML entities encoded.
This method is slower than text, but is clever about whitespace around block elements.
Loofah.document("<h1>Title</h1><div>Content</div>").to_text # => "\nTitle\n\nContent\n"
# File lib/loofah/instance_methods.rb, line 115
def to_text(options={})
  Loofah.remove_extraneous_whitespace self.dup.scrub!(:newline_block_elements).text(options)
end
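The difference in whitespace handling between text and to_text can be illustrated with a toy sketch that does not use Loofah at all. The helpers naive_text and naive_to_text are hypothetical, rely on regular expressions rather than a parsed document, and recognise only a small assumed set of block elements (h1 to h6, div, p). They are meant to show the idea of putting newlines around block elements and then trimming the excess, not how Loofah implements it.

```ruby
# Hypothetical helper: strip every tag and keep the text as-is, so adjacent
# block elements run together.
def naive_text(html)
  html.gsub(/<[^>]+>/, "")
end

# Hypothetical helper: replace each opening or closing block tag with a
# newline, strip any remaining tags, then collapse runs of three or more
# newlines down to two.
def naive_to_text(html)
  block_tag = /<\/?(?:h[1-6]|div|p)\b[^>]*>/i
  with_breaks = html.gsub(block_tag, "\n")
  naive_text(with_breaks).gsub(/\n{3,}/, "\n\n")
end

html = "<h1>Title</h1><div>Content</div>"

p naive_text(html)    # prints "TitleContent"
p naive_to_text(html) # prints "\nTitle\n\nContent\n"
```

Inline elements such as span or em are left alone by the block pattern, so their text stays on the same line as the surrounding content.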