Wednesday, August 06th, 2008 | Author:

So I was writing this tool to create a bunch of SQL statements from a data dump. Simple enough, right. And as always when you generate SQL statements, you have to make sure that the data doesn’t interfere with the SQL syntax by escaping the single quotes (and generally any binary data, but I didn’t have that). Any database gem/module/library has that built-in, of course, but I didn’t want to use that. So I said [Note: this doesn't work. Read on for the solution.]

def quote (str)
  str.gsub('\\','\\\\').gsub('\'','\\\'')
end

I read this as “replace all backslashes with double backslashes, and then replace all single quotes with a backslash and a single quote”. I added a simple test for it (yay TestUnit!):

def setup
  @m = Migrate.new
end

def test_quote
  assert_equal("I\\'m home", @m.quote("I'm home"))
end

But imagine my surprise when I got

  1) Failure:
test_quote(TestMigrate) [migration/test_migrate.rb:29]:
< "I\\'m home"> expected but was
< "Im homem home">.

Ooookay. What’s wrong here? Have I misunderstood the rules for escaping the escape sequence? It’s supposed to be easier with single quotes, but maybe I got it wrong. So I tried with double quotes:

def quote (str)
  str.gsub("\\","\\\\").gsub("'","\\'")
end

Surely this would work? Nope, it gives the exact same error. Time to look up gsub in the manual:

str.gsub(pattern, replacement) => new_str
str.gsub(pattern) {|match| block } => new_str

[…] If a string is used as the replacement, special variables from the match (such as $& and $1) cannot be substituted into it, as substitution into the string occurs before the pattern match starts. However, the sequences \1, \2, and so on [my emphasis] may be used to interpolate successive groups in the match.

“And so on”? Oh, so obviously \' (escaped \\' in the string literal) is the replacement string equivalent of $', which means everything afther the match (as all regexp hackers know). So I need to escape the backslash for regexp engine too:

def quote (str)
  str.gsub("\\","\\\\").gsub("'","\\\\'")
end

OK, the tests pass. But the code looks wrong. Four backslashes can’t work for both cases, can they? Let’s add a test case:

def test_quote
  assert_equal("I\\'m home", @m.quote("I'm home"))
  assert_equal("S\\\\N", @m.quote("S\\N"))
end

Nope, that fails. So we need this:

def quote (str)
  str.gsub("\\","\\\\\\\\").gsub("'","\\\\'")
end

Eight backslashes. Yes, the test passes, but is it worth it? Is it understandable? I don’t want comments to explain my code. Comments are good to provide a raison d’être for something, but not to explain its looks. Let’s switch to the other form of gsub:

def quote (str)
  str.gsub(/\\|'/) { |c| "\\#{c}" }
end

“If you see a backslash or a single quote, replace it with a backslash and whatever you saw.” That’s what I wanted to say anyway.

Good. But I wrote this in Markdown, so now I have to generate the HTML and the go through it and make sure that I restore whatever backslashes Markdown ate. (It turns out it didn’t eat any. TextMate has a Markdown Preview function that ate a lot of backslashes, but when I said “Convert to HTML” it didn’t eat any at all. Go figure.)

Category: Ruby
You can follow any responses to this entry through the RSS 2.0 feed. Both comments and pings are currently closed.

9 Responses

  1. 1
    Dave 

    Hello,

    thanks very much for the post – I was driving myself mad with this trying to work out why my efforts to escape single quotes weren’t working. Now it works nicely. :-)

    As an aside, it turns out that in Rails (which is where I’m coding this) there’s an “escape_javascript” function which does it for you, although I might leave the gsub line in instead to remind me how annoying it was to get it working…

    Cheers
    Dave

  2. 2
    ras 

    awwwwwwwww

    Scanned this 2 hours ago and thought: nice, a solution in the beginning of the post.

    Now I see it’s actually the problem. :s

  3. 3
    Jonathan 

    Argh!!! Thank you!!

  4. 4
    Elvi B 

    amazing! Thanks so much for the explanation. This was a crazy one to track down.

  5. 5
    DJ 

    Cool. Thanks!

  6. Eight backlashes? haha that is crazy. im using the same as dave, escape_javascript. it has been pretty good for me. best of luck to you!

  7. 7
    Diego 

    I’m soooo glad I found this before I tried anything myself! Lol. thanks google! and Vrensk of course!

  8. 8
    Fhou 

    Thank you so much!

  9. Appreciate it for this tremendous post, I am glad I found this website on yahoo.