Archive for the Category » Ruby «

Sunday, March 28th, 2010

[A note for casual readers: These notes are not meant to be objective representations of what different speakers said at Scottish Ruby Conference. They are my interpretations in my context.]

[Update 2010-03-29, morning: added more links. 2010-03-29, evening: re-phrased note on gender distribution and separated it from note on sexism, since I don't think of them as related and don't want others to think so. 2010-04-01: links to other summaries.]

Concurrency and real time are great to have and quite attainable if we step outside the comfort zone that Rails gives us. Thanks to Jim Weirich for the reminders and to Makoto Inoue and Martyn Loughran for valuable tips and tools: js-model, dragban, and the pusher demo. Presentation here, and some background. And I really want to do some Erlang.

What we do is an art that is based on science, and while the artistry has received a lot of attention lately, we can benefit from revisiting the science. Jim Weirich told us about Structure and Interpretation of Computer Programs. It’s available online and there is a mailing list to discuss the content and/or the exercises. I think Jim would have been able to sell 50 copies of the book on the spot if he had brought any.

Lifecycle management should be done with proper tools, not timestamps with funny names (approved_at, published_at, etc.). And lifecycle is more than state machines; it’s also workflow and permissions. David Bock showed off Stonepath and hinted at Stonewall. He uses acts_as_state_machine but says that it should probably be really easy to use state_machine instead. (NB: A new version of state_machine was released two weeks ago with a lot of important fixes!)

Ever since I read about ticgit in Scott Chacon’s Git Internals I have wanted to do something with git besides source code control but I always thought I’d need to learn 100 git plumbing commands. Scott demonstrated how to do very useful things with just four or five commands, so that’s something I’ll put in my toolbox.

Gwyn Morfey introduced a very useful image in his talk “Write Bad Code”. We may need to get into technical debt in order to avoid the area of death, just like we may need to get into financial debt in real life to avoid starving or bankruptcy. He also gave a good number of rules for classifying the situations where it is applicable and how to act in those situations.

Lots of people retweeted Tim Bray’s sentiment that “if your webapp doesn’t work on a mobile device nowadays then it doesn’t work”. I definitely think that is true in most cases, but I’m worried that it will be wielded like the Sword Of Truth in the future. It’s not black or white, and while most webapps should be written to work on a mobile device, I think there are loads of valid exceptions.

I read about CRC Cards (Class, Responsibility, Collaboration) many years ago and dismissed the idea as a tool for people who needed something to hold in their hands while they learned about object oriented design and analysis. Sam Wessel ran a BoF workshop with Kevin Rutherford where we got to work through a simple analysis exercise. It was a real eye opener and the most valuable session of the conference. I learned two things:

  1. it is useful to have a class for the whole of the system and not just the parts, and
  2. I should learn about CRC. (I shall try to avoid the temptation to think that I can actually use it on my own just because I’ve sat at the feet of masters.)

I’m not a big friend of mocking and thus went to a BoF where Brian Swan and Kevin Rutherford debated mocks. Interesting debate, but according to the show of hands at the end, I was the only person who changed their position, however slightly. Apparently there is now a nicer syntax for setting expectations so that you can do the stubbing in the setup and the expectation checks in the actual it-clauses. And Kevin says that (contrary to Brian’s experience, and mine) he feels that he can refactor quite freely without breaking loads of mocks. I need to learn more about mocks or modelling, or both.

WebMock by Bartosz Blimke does the same thing as FakeWeb, but better. It has support for regexp matching of URLs, checks for POST data, and nice assertions. Worth checking out.

Redcar by Daniel Lucraft aims to be “a cross-platform programmer’s editor written in Ruby”. I couldn’t install it, probably because I have installed gems and rubies in too many ways on my laptop. Will need to clean up and try again.

Things I should check out: 12 hours to rate a rails application, Story mapper (big picture planning for Pivotal Tracker), Distributed Architectures with Rack (mentioned in Tim Bray’s blog).

Non-coding observations

  • When going to a country where it is hideously expensive to use my phone for data, I should bring an old phone that can run my normal SIM card and buy a pay-as-you-go card for my iPhone at the destination. Doubly so since there is really no reason to believe that anyone can ever create a wireless network that can support 300 developers at the same time, all the time.
  • RubyConf India had 28 female attendees out of 400 total; Scottish Ruby Conference had slightly fewer, I think. I wonder what we can do as a community to raise that percentage. I wonder what we can do as a society. I wonder what I can do.
  • I also noted that a few presenters thought it appropriate to portray Ruby developers as geeky manboys and women as some kind of more or less attainable prize or decoration. That is so not cool.
  • Quality of presentations varies from marvellous and eyeopening to YOU HAVE ROBBED ME OF 45 MINUTES OF MY LIFE AND MADE ME LOSE FAITH IN HUMANITY. Unless the presenters are well known, I should always find someone who can vouch for the presenter beforehand, or at least talk to the presenter to find out what I can expect. In the choice between an interesting presentation and an interesting presenter, prefer the latter.
  • If I don’t manage to do the above, I should make sure to find a seat where I can sneak out of the room without looking rude.
  • Considering that the lack of equal opportunities for men and women is a far greater problem than my having 45 minutes ripped from my day, I’m forced to make the observation that I have grown slightly numb to male chauvinism. I don’t like that.

Links to others’ summaries:

Monday, September 08th, 2008

I need to pick up an XML file from a server every 30 minutes and process it. I’ve done similar things before, and using Hpricot it is a pleasure:

#! /usr/bin/env ruby
require 'rubygems'
require 'hpricot'
require 'open-uri'

doc = Hpricot(open("http://example.com/the_file.xml"))
(doc / :person).each { |person| ... }

Couldn’t be simpler. This time, there is a snag: the file is sensitive, so the connection is encrypted using HTTPS. For this article, let’s say we’re talking about CERT’s list of new vulnerabilities, which can be found at https://www.cert.org/blogs/vuls/rss.xml. open-uri supports HTTPS, so it shouldn’t be a problem, but it is:

doc = Hpricot(open("https://www.cert.org/blogs/vuls/rss.xml"))
# =>
/usr/lib/ruby/1.8/net/http.rb:590:in `connect': certificate verify failed (OpenSSL::SSL::SSLError)

OpenSSL, which open-uri uses behind the scenes, fails to verify CERT’s certificate and halts execution.

Solution 1: skip verification

Let’s assume that I don’t care much about the verification; all I want is the data, and it just so happens that it is only available through HTTPS. open-uri doesn’t let me turn off verification so I have to dig deeper.

open-uri is just a clever wrapper around Ruby’s comprehensive, but insufficiently documented, networking library that handles a variety of protocols, including HTTPS. To fetch a web page over a secure connection, you can use something like this sample client (from net/https.rb):

#! /usr/bin/env ruby
require 'net/https'
require 'uri'

uri = URI.parse(ARGV[0] || 'https://localhost/')
http = Net::HTTP.new(uri.host, uri.port)
http.use_ssl = true if uri.scheme == "https"  # enable SSL/TLS
http.start {
  http.request_get(uri.path) {|res|
    print res.body
  }
}

There are three things to note in the sample client:

  1. You should require net/https, not net/http.
  2. You create the client with Net::HTTP.new, not Net::HTTPS.new. (There is no HTTPS class despite the fact that you require 'net/https'.)
  3. You need to set use_ssl = true explicitly. The URI library is clever enough to set its port attribute to 443 when it parses a URI that starts with https, but Net::HTTP isn’t quite as clever.
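The third point can be checked without opening a connection. This is a small sketch, not part of the sample client: build_client is a hypothetical helper name, and it requires net/http because current Rubies load SSL support from there (on the Ruby 1.8 used in this article you would still require net/https as described above).

```ruby
require 'net/http'  # on Ruby 1.8, require 'net/https' instead (see point 1)
require 'uri'

# Hypothetical helper: configures a client but does not connect.
def build_client(url)
  uri = URI.parse(url)
  http = Net::HTTP.new(uri.host, uri.port)
  # Net::HTTP does not infer SSL from the port, so derive it from the scheme.
  http.use_ssl = (uri.scheme == 'https')
  http
end

uri = URI.parse('https://www.cert.org/blogs/vuls/rss.xml')
puts uri.port                                    # URI fills in the HTTPS default: 443
puts Net::HTTP.new(uri.host, uri.port).use_ssl?  # false: SSL is still off
puts build_client(uri.to_s).use_ssl?             # true once we set it ourselves
```

Nothing here touches the network; the connection is only opened when start is called.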

If you put the above code in webclient.rb and run it, you’ll see this:

$ ruby webclient.rb https://www.cert.org/blogs/vuls/rss.xml
warning: peer certificate won't be verified in this SSL session
<?xml version="1.0" encoding="utf-8"?>
<rss version="2.0">
    <channel>
        <title>Vulnerability Analysis Blog</title>
[...]

Yes, it will fetch and print the RSS XML, but it will also warn you that it doesn’t verify the host’s certificate. Let’s turn off the warning by telling Net::HTTP that we don’t expect it to perform any verification:

uri = URI.parse(ARGV[0] || 'https://localhost/')
http = Net::HTTP.new(uri.host, uri.port)
if uri.scheme == "https"  # enable SSL/TLS
  http.use_ssl = true
  http.verify_mode = OpenSSL::SSL::VERIFY_NONE
end
http.start { ... }

Run this, and you get the same result without the warning.

Solution 2: add verification

Solution 1 is not enough for my current needs. I want encryption, but I also want to know that I’m talking to the right server. To turn on verification, I change VERIFY_NONE to VERIFY_PEER and run again. Now I’m back to square one with OpenSSL::SSL::SSLError: certificate verify failed. Uh-huh. So what’s wrong with that one? It works in my browser without problems.

I’m not going to go into how HTTPS and certificate validation work. Suffice it to say that my browser is more trusting than OpenSSL. And it’s not blind trust either; the browser knows more Certificate Authorities. So how do I add them to Ruby and OpenSSL? I looked around and found a solution to a similar problem, Connecting to POP3 servers over SSL with Ruby. Adapting that to my HTTPS problem, it becomes a two-step solution:

  1. Download the CA Root Certificates bundle from haxx.se, the creators of curl. Store the file in the same directory as webclient.rb and make sure that it’s called cacert.pem. (But please see the discussion below on Too much trust.)
  2. Make webclient.rb use this file instead of whatever is bundled with OpenSSL.

Now we can tell Net::HTTP to use this CA file:

uri = URI.parse(ARGV[0] || 'https://localhost/')
http = Net::HTTP.new(uri.host, uri.port)
if uri.scheme == "https"  # enable SSL/TLS
  http.use_ssl = true
  http.verify_mode = OpenSSL::SSL::VERIFY_PEER
  http.ca_file = File.join(File.dirname(__FILE__), "cacert.pem")
end
http.start { ... }

Look, it works! It gives the expected output, and it is verifying… something. But what? Time to look under the hood again. It turns out that with these settings, OpenSSL checks that the server certificate is signed by a known CA and has not expired, which is good, but not everything I’m looking for. I also want it to check that the certificate belongs to the server that I’m talking to. To see an example, go to https://google.com/. In Firefox 3, you should get an iconic policeman telling you it’s a Page Load Error. The certificate belongs to www.google.com, not google.com. But our script is not quite as discerning:

$ ruby webclient.rb https://google.com/
hostname was not match with the server certificate
<HTML><HEAD><meta http-equiv="content-type" content="text/html;charset=utf-8">
<TITLE>302 Moved</TITLE></HEAD><BODY>
<H1>302 Moved</H1>
The document has moved
<A HREF="http://www.google.com">here</A>.
</BODY></HTML>

Note the warning on the first line of output. Apparently Net::HTTP checks to see if the certificate belongs to the host, but it’s not a fatal error. To change this, we need to enable the “post-connection check”. So here is the final version of the script:

#! /usr/bin/env ruby
require 'net/https'
require 'uri'

uri = URI.parse(ARGV[0] || 'https://localhost/')
http = Net::HTTP.new(uri.host, uri.port)
if uri.scheme == "https"  # enable SSL/TLS
  http.use_ssl = true
  http.enable_post_connection_check = true
  http.verify_mode = OpenSSL::SSL::VERIFY_PEER
  http.ca_file = File.join(File.dirname(__FILE__), "cacert.pem")
end
http.start {
  http.request_get(uri.path) {|res|
    print res.body
  }
}

Now it will fail for https://google.com/ but succeed for https://www.google.com/. Done!
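To see the hostname check in isolation, here is a sketch that needs no network. It assumes a current Ruby, where the openssl library exposes OpenSSL::SSL.verify_certificate_identity; I have not checked that this is exactly what the post-connection check calls on 1.8, so treat it as an illustration of the same kind of test. The self_signed_cert helper, the example.com names, and the one-hour lifetime are all made up for the demonstration.

```ruby
require 'openssl'

# Hypothetical helper: builds a throwaway self-signed certificate whose
# subject has the given common name. Only useful for experiments.
def self_signed_cert(common_name)
  key = OpenSSL::PKey::RSA.new(2048)
  cert = OpenSSL::X509::Certificate.new
  cert.version = 2                      # X.509 v3
  cert.serial = 1
  cert.subject = OpenSSL::X509::Name.parse("/CN=#{common_name}")
  cert.issuer = cert.subject            # self-signed: issuer equals subject
  cert.public_key = key.public_key
  cert.not_before = Time.now
  cert.not_after = Time.now + 3600
  cert.sign(key, OpenSSL::Digest.new('SHA256'))
  cert
end

cert = self_signed_cert('www.example.com')
# The name on the certificate matches only the exact host it was issued for.
puts OpenSSL::SSL.verify_certificate_identity(cert, 'www.example.com')
puts OpenSSL::SSL.verify_certificate_identity(cert, 'example.com')
```

This only compares names. Whether the certificate is signed by a trusted CA is the separate check that VERIFY_PEER and ca_file take care of.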

Too much trust

OK, I should admit that downloading a file from someplace called haxx.se doesn’t seem like the best way to raise security. If you really want to know who you will be trusting, you should download Root Certificates from each of the CAs that you trust. That’s way too much work for the application I’m working on right now, but it might be a requirement for you. If you don’t want to go mad, though, try doing it the same way the haxx people did. They wrote a little tool to extract the Root Certificates from the source files of Mozilla, and they even have a tool for extracting them from your binary installation. Check out their documentation for a full description and links to the tools (source code).

[Update: John in comment 16 has written up instructions on how to get the certificates file using https. Turtles all the way down.]

Wednesday, August 06th, 2008

So I was writing this tool to create a bunch of SQL statements from a data dump. Simple enough, right? And as always when you generate SQL statements, you have to make sure that the data doesn’t interfere with the SQL syntax by escaping the single quotes (and generally any binary data, but I didn’t have that). Any database gem/module/library has that built-in, of course, but I didn’t want to use that. So I said [Note: this doesn't work. Read on for the solution.]

def quote (str)
  str.gsub('\\','\\\\').gsub('\'','\\\'')
end

I read this as “replace all backslashes with double backslashes, and then replace all single quotes with a backslash and a single quote”. I added a simple test for it (yay TestUnit!):

def setup
  @m = Migrate.new
end

def test_quote
  assert_equal("I\\'m home", @m.quote("I'm home"))
end

But imagine my surprise when I got

  1) Failure:
test_quote(TestMigrate) [migration/test_migrate.rb:29]:
< "I\\'m home"> expected but was
< "Im homem home">.

Ooookay. What’s wrong here? Have I misunderstood the rules for escaping the escape sequence? It’s supposed to be easier with single quotes, but maybe I got it wrong. So I tried with double quotes:

def quote (str)
  str.gsub("\\","\\\\").gsub("'","\\'")
end

Surely this would work? Nope, it gives the exact same error. Time to look up gsub in the manual:

str.gsub(pattern, replacement) => new_str
str.gsub(pattern) {|match| block } => new_str

[…] If a string is used as the replacement, special variables from the match (such as $& and $1) cannot be substituted into it, as substitution into the string occurs before the pattern match starts. However, the sequences \1, \2, and so on [my emphasis] may be used to interpolate successive groups in the match.

“And so on”? Oh, so obviously \' (written \\' in the string literal) is the replacement string equivalent of $', which means everything after the match (as all regexp hackers know). So I need to escape the backslash for the regexp engine too:

def quote (str)
  str.gsub("\\","\\\\").gsub("'","\\\\'")
end

OK, the tests pass. But the code looks wrong. Four backslashes can’t work for both cases, can they? Let’s add a test case:

def test_quote
  assert_equal("I\\'m home", @m.quote("I'm home"))
  assert_equal("S\\\\N", @m.quote("S\\N"))
end

Nope, that fails. So we need this:

def quote (str)
  str.gsub("\\","\\\\\\\\").gsub("'","\\\\'")
end

Eight backslashes. Yes, the test passes, but is it worth it? Is it understandable? I don’t want comments to explain my code. Comments are good to provide a raison d’être for something, but not to explain its looks. Let’s switch to the other form of gsub:

def quote (str)
  str.gsub(/\\|'/) { |c| "\\#{c}" }
end

“If you see a backslash or a single quote, replace it with a backslash and whatever you saw.” That’s what I wanted to say anyway.
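For reference, here is a short sketch of the replacement-string sequences that caused the trouble, next to the block form. The "a-b" input is just an illustration I picked so that the text before and after the match is easy to tell apart.

```ruby
# In a replacement *string*, gsub expands backslash sequences itself:
puts "a-b".gsub("-", "\\&\\&")  # \& is the match itself      -> a--b
puts "a-b".gsub("-", "\\`")     # \` is the text before it    -> aab
puts "a-b".gsub("-", "\\'")     # \' is the text after it     -> abb

# The block form inserts the block's return value verbatim, so only the
# string literal's own escaping applies.
def quote(str)
  str.gsub(/\\|'/) { |c| "\\#{c}" }
end

puts quote("I'm home")  # I\'m home
puts quote("S\\N")      # S\\N
```

The third line is the one that produced “Im homem home” in the failing test: the quote was replaced by everything after it.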

Good. But I wrote this in Markdown, so now I have to generate the HTML and then go through it and make sure that I restore whatever backslashes Markdown ate. (It turns out it didn’t eat any. TextMate has a Markdown Preview function that ate a lot of backslashes, but when I said “Convert to HTML” it didn’t eat any at all. Go figure.)
