Friday, Nov 27, 2009

XSS Weakness in strip_tags and some notes on parsing HTML/XML

There is another cross-site scripting (XSS) weakness in the Rails method strip_tags(). The problem was found in HTML::Tokenizer, which has bugs when parsing non-printable ASCII characters.

According to the original post, this has been fixed in Rails 2.3.5, and there is a patch for the 2.2 branch. Earlier versions are unsupported. If you make use of this method, upgrade to a fixed version.

The workaround is this:

Users using strip_tags can pass the resulting output to the regular escaping functionality:

  <%= h(strip_tags(...)) %>
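Outside of a view, the same idea can be sketched in plain Ruby. This is only an illustration: naive_strip_tags is a hypothetical stand-in for a regex-based stripper that misses some input, not the Rails implementation, and the sample input is made up. ERB::Util.html_escape is the standard-library method that the h helper aliases.

```ruby
require "erb"

# Hypothetical stand-in for a regex-based tag stripper (NOT the Rails code).
# It only recognises tags whose name starts right after "<", so a tag with a
# non-printable character in that position slips through untouched.
def naive_strip_tags(html)
  html.gsub(/<[a-z\/][^>]*>/i, "")
end

# The workaround: escape whatever the stripper returns before output.
def strip_and_escape(html)
  ERB::Util.html_escape(naive_strip_tags(html))
end

# "\x01" is a non-printable ASCII character placed right after "<".
puts strip_and_escape("<b>safe</b> <\x01img src=x>")
```

Whatever markup survives the stripper ends up as &lt; and &gt; in the output, so the browser renders it as text instead of interpreting it.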

However, this is not how it should be: strip_tags() should work correctly on its own. The workaround does work, but strip_tags() is based on HTML::Tokenizer, which takes a very naive approach to parsing HTML: it relies on regular expressions to analyze the markup. For serious or enterprise applications, you should not rely on an error-prone parser library. Some alternatives:

  • REXML is a little better, but not very fast for large amounts of data. It has some bugs and is not 100% standards-compliant. For larger amounts of data, it may even be useful to use its pull parser, REXML::Parsers::PullParser. Some people have successfully parsed HTML with it.
  • And there is libxml, which is a real parser, now with Ruby bindings. We haven't used it with (X)HTML, though. It has a pull parser too, and it's quite like the REXML pull parser. LibXML is an extensive C library which might not be available on exotic Linux derivatives or Windows. Nokogiri is also based on LibXML.
  • Update: If you're using JRuby, you can use tried and tested Java XHTML/XML parsers, for example Apache Xerces or the pull parser Woodstox, which supports "almost well-formed" documents (like legacy (X)HTML content).
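As a rough illustration of the pull-parser approach mentioned in the list, here is a minimal sketch that collects only the text nodes of a well-formed XHTML fragment with REXML::Parsers::PullParser. The method name text_only and the sample input are made up for this example, and the sketch assumes the rexml library is installed (it ships with Ruby, as a bundled gem in recent versions).

```ruby
require "rexml/parsers/pullparser"

# Collect the text content of a well-formed (X)HTML fragment and drop all
# element events. text_only is a hypothetical helper name for this sketch.
def text_only(xhtml)
  parser = REXML::Parsers::PullParser.new(xhtml)
  out = +""
  while parser.has_next?
    event = parser.pull
    # For text events, event[1] holds the text with entities decoded.
    out << event[1] if event.text?
  end
  out
end

puts text_only("<p>Hello <b>world</b> &amp; all</p>")
```

Because entities are decoded, the result is plain text and still has to be escaped (for example with h) before it is written into a page. REXML is an XML parser, so markup that is not well-formed will generally raise REXML::ParseException rather than be repaired; that strictness is the trade-off compared with a forgiving tokenizer.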


