Review of Ruby, Overview of Ruby on Rails

LUG Programming Course, 3rd March 2008
The last two lessons presented a whirlwind guide to Ruby, demonstrating a simple command line program, and a simple web application. That was a lot of material to digest, so in this lesson we'll make a short review of the important things we learned, and explain iterators and code blocks in a little more detail, since they caused some problems for some of the students.
As we still have much ground to cover, half of this week's lesson is also dedicated to an overview of the Ruby on Rails framework, which we'll be using in the next two lessons to construct a more complicated web application.
The lesson went reasonably well. I overran by about 15 minutes. I'll be overrunning a lot more in the next two lessons, as there's much more to go through. It seems I haven't lost anyone permanently from the course – they all diligently show up, and some even smile occasionally.

Ruby Review

Ruby is a message passing object oriented language. Because it is message passing, objects can only communicate between themselves using methods. Everything in Ruby is an object.
Within a class, there are instance @variables, and class @@variables. If we want the rest of the (Ruby) world to access them, however, we have to write accessor methods, or use the attr_accessor helper methods to create them.
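As a reminder of how this looks in practice, here is a minimal sketch. The Radio class and its names are invented purely for illustration, they are not part of the lesson's source files.

```ruby
# Hypothetical Radio class, used only for illustration.
class Radio
  @@count = 0                 # class variable, shared by all Radio objects

  attr_accessor :volume       # generates the volume and volume= methods

  def initialize(volume)
    @volume = volume          # instance variable, belongs to this object
    @@count += 1
  end

  # Hand-written class method to read the class variable
  def self.count
    @@count
  end
end

radio = Radio.new(5)
radio.volume = 7              # calls the generated volume= method
puts radio.volume
puts Radio.count
```

Without the attr_accessor line, radio.volume would fail with a NoMethodError, since @volume itself is not visible from outside the object.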
In Ruby, only nil and false are false, everything else is true. The opposite of if is unless. Both can be used as statement modifiers, for example: @volume = 5 unless @mute.
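A quick sketch to check this rule for ourselves. The truthy? helper and the local variables are made up for this example.

```ruby
# Only nil and false are treated as false; 0, "" and [] count as true.
def truthy?(value)
  value ? true : false
end

[nil, false, 0, "", []].each do |value|
  puts "#{value.inspect} is #{truthy?(value)}"
end

mute = false
volume = 0
volume = 5 unless mute        # statement modifier: runs because mute is false
volume = 11 if volume > 10    # skipped, since 5 is not greater than 10
puts volume
```

Programmers coming from C or JavaScript should note in particular that 0 and the empty string are both true in Ruby.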
Ruby allows you to write more concise code by making the statement terminator (;) and parentheses (()) in methods optional in certain conditions. This is used frequently in Ruby code.
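For instance, the two calls in the following sketch are equivalent. The add method is a hypothetical helper, written just for this comparison.

```ruby
# Hypothetical helper, used only for illustration.
def add(a, b)
  a + b
end

with_parens    = add(1, 2)    # parentheses around the arguments
without_parens = add 1, 2     # same call, parentheses omitted

# Semicolons are only needed to put several statements on one line
x = 1; y = 2
puts with_parens, without_parens, x + y
```

When a call without parentheses becomes hard to read, for example when it is nested inside another call, it is usually clearer to put the parentheses back.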
Ruby has fewer keywords than other languages, preferring method calls in most circumstances. Classes are defined by class ... end code blocks, and methods are defined by def ... end code blocks.
Iterators and Code Blocks - One More Time
Probably the best way to explain iterators and code blocks is to use a couple of examples. First we'll use some 'experimenting' code to dissect Ruby code blocks and how they get used, then we'll look at a more pragmatic example, creating our own iterators.
Here is the code_blocks.rb code, with explanations:
01 #!/usr/local/bin/ruby -w
02
03 def always_twice
04   puts "In method 'always_twice' (line 4)"
05   yield
06   yield
07 end
08
09 puts "Start (line 9)"
10 always_twice do
11   puts "In 'always_twice' code block (line 11)"
12 end
13 puts "End (line 13)"
14
We have a method definition, always_twice on lines 3 to 7, followed by a method call with associated do ... end code block, on lines 10 to 12, all sprinkled with puts statements so that we can see what is happening. Let's look at the output:
C:\Courses\LUGPC8\Review>code_blocks.rb
Start (line 9)
In method 'always_twice' (line 4)
In 'always_twice' code block (line 11)
In 'always_twice' code block (line 11)
End (line 13)
We use do ... end code blocks here so that each puts is on a separate line – this is equivalent to using curly braces ({}). As you probably expected, we see Start..., In method..., and End..., but we also see In ... code block... twice. So each time we use yield in the method (lines 5 and 6) the code block is executed. Although yield looks like a method call, it is actually a statement. Its meaning is “hand over control to the method's associated code block”.
Now we'll continue this little code blocks exercise, as follows:
15 def never
16   puts "In method 'never' (line 16)"
17 end
18
19 puts "Start (line 19)"
20 never do
21   puts "In 'never' code block (line 21)"
22 end
23 puts "End (line 23)"
24
Again, let's look at the output, then we'll work out what happens:
Start (line 19)
In method 'never' (line 16)
End (line 23)
In this second example, even though we have defined an associated code block for the never method call (lines 20 to 22), it is never executed. This is because the never method contains no yield statement (lines 15 to 17). In other words, the associated code block is completely under the control of the method. The method can choose to execute the code block any number of times - including zero.
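To underline that the method decides how often the code block runs, here is a small sketch. The repeat method is invented for this example; it simply yields as many times as it is asked to.

```ruby
# Hypothetical method: runs its block 'count' times, which may be zero.
def repeat(count)
  index = 0
  while index < count
    yield index               # pass the current index to the code block
    index += 1
  end
end

calls = []
repeat(3) { |i| calls << i }
repeat(0) { |i| calls << "never reached #{i}" }
puts calls.inspect
```

The second call has a perfectly valid code block, but since repeat never reaches its yield statement, nothing is added to calls.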
One last exercise, this time we'll communicate some simple information to the code block:
25 def maybe(&block)
26   puts "In method 'maybe' block? #{block ? 'true' : 'false'} (line 26)"
27   yield "#{block.inspect}" if block
28 end
29
30 puts "Start (line 30)"
31 maybe do |info|
32   puts "In 'maybe' code block: #{info} (line 37)"
33 end
34 puts "End (line 34)"
35
36 puts "Start (line 36)"
37 maybe
38 puts "End (line 38)"
Here's the output:
Start (line 30)
In method 'maybe' block? true (line 26)
In 'maybe' code block: #<Proc:0x02ae593c@C:/Courses/LUGPC8/Review/code_blocks.rb:
 31> (line 37)
End (line 34)
Start (line 36)
In method 'maybe' block? false (line 26)
End (line 38)
This time we've written a method, maybe, on lines 25 to 28, which declares the code block as a parameter. Although the parameter name, block, can be any name we choose, it must be prefixed with an ampersand (&), and it must appear as the last parameter in the parameter list. Unlike our previous examples, the yield statement passes a string to the associated code block – if there is one.
In the first test, lines 30 to 34, we call maybe with an associated code block, and that associated code block receives the string that maybe provides as a local variable info (line 31). In the output, you can see that Ruby converted the code block into a Proc object.
In the second test, we call maybe with no associated code block. Since block is nil inside the method, yield is not executed. Attempting to execute yield when there is no associated code block makes the interpreter raise a LocalJumpError.
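The following sketch shows the error, and a second way to guard against it using Ruby's block_given? method, which doesn't require the block to be declared as a parameter. The method names careless and careful are made up for this example.

```ruby
def careless
  yield                       # raises LocalJumpError if no block was given
end

def careful
  return "no block" unless block_given?
  yield
end

begin
  careless
rescue LocalJumpError => e
  puts "Error: #{e.message}"
end

puts careful
puts careful { "block result" }
```

Testing block_given? is the usual choice when the method only needs to yield; declaring a &block parameter is more useful when the block has to be stored or passed on to another method.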
Popular Authors and Iterators
Now that we've seen what can be achieved with methods, associated code blocks, and the yield statement, we can create our own iterator. I'll use the Project Gutenberg “Top 100” page as data for this example. This lists the top 100 books and authors, by number of downloads over the last few days. Actually, I'll just use the top 10, to keep things simple.
Here is the authors.rb code, with explanations:
#!/usr/local/bin/ruby -w

# Top 10 Authors (from http://www.gutenberg.org/browse/scores/top)
# Deliberately not in alphabetical order
authors = {
  "Shakespeare, William" => 31758,
  "Austen, Jane" => 39508,
  "Verne, Jules" => 24615,
  "Dickens, Charles" => 37563,
  "Tzu, Sun" => 22447,
  "Thomson, J. Arthur" => 36825,
  "Doyle, Arthur Conan, Sir" => 28398,
  "Baum, L. Frank (Lyman Frank)" => 23663,
  "Twain, Mark" => 48995,
  "Miles, Alexander" => 21775
}

puts "Hash natural ordering"
authors.each { |name, count| puts " #{name} (#{count})" }
Here we define the list, authors, deliberately in no particular order, using a Hash object, followed by the standard Hash.each iterator. The results look something like:
C:\Courses\LUGPC8\Review>authors.rb
Hash natural ordering
 Dickens, Charles (37563)
 Miles, Alexander (21775)
 Twain, Mark (48995)
 Austen, Jane (39508)
 Verne, Jules (24615)
 Shakespeare, William (31758)
 Tzu, Sun (22447)
 Baum, L. Frank (Lyman Frank) (23663)
 Doyle, Arthur Conan, Sir (28398)
 Thomson, J. Arthur (36825)
On my computer, the Hash's natural order corresponds neither to the order defined in the code, nor to a list sorted by author. (This is the behaviour of Ruby 1.8; from Ruby 1.9 onwards a Hash remembers the order in which its entries were inserted.) What we actually want is an iterator which gives us the list sorted by author name. So let's create the iterator:
# Singleton method to get authors by ordered name
def authors.each_by_author(&block)
  return unless block
  self.keys.sort.each { |name| yield name, self[name] }
end

puts "\nHash ordered by author"
authors.each_by_author { |name, count| puts " #{name} (#{count})" }
Just as in JavaScript, Ruby allows us to add a method to a single object – all we have to do is prefix the method name with the name of the variable that refers to the object, authors, in the method definition.
We sort the Hash.keys array (the author names), then use the Array.each iterator to yield each author name together with its popularity (the Hash key's value).
The result is:
Hash ordered by author
 Austen, Jane (39508)
 Baum, L. Frank (Lyman Frank) (23663)
 Dickens, Charles (37563)
 Doyle, Arthur Conan, Sir (28398)
 Miles, Alexander (21775)
 Shakespeare, William (31758)
 Thomson, J. Arthur (36825)
 Twain, Mark (48995)
 Tzu, Sun (22447)
 Verne, Jules (24615)
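The same technique works for any other ordering. As a sketch, here is a second singleton iterator, each_by_count, which yields the authors by descending download count. The method name is my own invention, and the Hash is cut down to three entries to keep the example short.

```ruby
# A three-entry subset of the authors Hash from the lesson
authors = {
  "Shakespeare, William" => 31758,
  "Austen, Jane" => 39508,
  "Twain, Mark" => 48995
}

# Hypothetical singleton method: yields authors by descending download count
def authors.each_by_count
  return unless block_given?
  sort_by { |_name, count| -count }.each { |name, count| yield name, count }
end

ordered = []
authors.each_by_count { |name, _count| ordered << name }
puts ordered
```

Here sort_by turns the Hash into an array of name and count pairs, sorted on the negated count, so the most downloaded author comes first.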
Modules - One Solution to Two Problems
Up to now we've used a few simple names for our classes. Eventually, as our programs become more complex, there is a risk that we want to use the same name for two completely different classes. Unfortunately, Ruby doesn't allow this. If you think about the Ruby libraries, you can probably imagine that two different authors could easily write the same Result or Element class, for example.
A second problem is that Ruby only allows single class inheritance. There will be times when you'd really like to inherit functionality from more than one class, but Ruby won't let you. You're stuck with choosing the most useful super class, and rewriting the functionality of the other classes. This leads to duplication of code, which is not a good idea at all. Doing things once, in one place, is the whole point of writing object oriented code in the first place.
Ruby's solution to both problems is the module. You define a module inside a module ... end code block. A module can be used as a namespace for one or more classes. This basically means that we can put class names into a container which also has a name. So we can differentiate two classes with the same name, by using two differently named modules.
When more than one class requires the same method or methods, and it is difficult to inherit one class from the other, we can define the methods in a module, and then include it in those classes. In Ruby terms, this is known as a mixin – we mix in the methods to the classes.
The following highly contrived code example, modules.rb, shows examples of these techniques:
01 #!/usr/local/bin/ruby -w
02
03 module Mixin
04   Version = "3.0.11"
05   def Mixin.version
06     class_version(Mixin)
07   end
08   def Mixin.class_version(obj)
09     "#{obj} Version #{obj::Version}"
10   end
11   def time
12     @time
13   end
14 end
15
The first module, Mixin, defines a constant Version, two module methods Mixin.version and Mixin.class_version, and an instance method time. Well, I did warn you it was contrived.
16 module Search
17   class Result
18     include Mixin
19     Version = "1.1.2"
20     def initialize
21       @time = 0.000212
22     end
23     def self.version
24       "#{Mixin.class_version(self)}, #{Mixin.version}"
25     end
26   end
27 end
28
The second module, Search, acts as a namespace for the class Result. In addition, the Result class includes the Mixin module on line 18, and sets its own Version constant on line 19.
29 module Racing
30   class Result
31     include Mixin
32     def initialize(time)
33       @time = time
34     end
35     def self.version
36       "#{Mixin.class_version(self)}, Beta 3"
37     end
38   end
39 end
40
The third module, Racing, acts as a namespace for another class named Result. It also includes the Mixin module. Different from the second module, it does not define its own Version constant.
Within either module, Search or Racing, there is only one Result class. Although there are only a couple of methods for these classes, you can probably imagine that they will have very different behaviours. The first will probably have a file_path and word_count, the second might have a team, racing_car and driver.
Within each Result class, however, we want a version class method, and they both need to have a time instance method – Search::Result has the time in seconds (with microsecond precision) to perform the search, Racing::Result has the driver's average lap time in seconds (with millisecond precision). Because their behaviour is so different, we wouldn't want to simply inherit one from the other, but we can include the Mixin module in both. By doing that, all the instance methods defined in Mixin become instance methods of the class.
41 puts "Search::Result::Version #{Search::Result::Version}"
42 puts "Racing::Result::Version #{Racing::Result::Version}"
43 puts Search::Result.version
44 puts Racing::Result.version
45 s_r = Search::Result.new
46 r_r = Racing::Result.new(104.123)
47 puts "s_r #{s_r.inspect}"
48 puts "r_r #{r_r.inspect}"
49 puts "s_r.time #{s_r.time}"
50 puts "r_r.time #{r_r.time}"
51 puts "Mixin::Version #{Mixin::Version}"
52 puts "Mixin.time #{Mixin.time}"
To wrap up, we write a series of puts to see the results of our mixins. These print out the two Result classes' Version constants and the version class method values, create two instances of the two Result classes, inspect them, get their time method values, and finally output the Mixin module's Version constant and time method value.
The resulting output looks like:
01 C:\Courses\LUGPC8\Review>modules.rb
02 Search::Result::Version 1.1.2
03 Racing::Result::Version 3.0.11
04 Search::Result Version 1.1.2, Mixin Version 3.0.11
05 Racing::Result Version 3.0.11, Beta 3
06 s_r #<Search::Result:0x2ae4870 @time=0.000212>
07 r_r #<Racing::Result:0x2ae4028 @time=104.123>
08 s_r.time 0.000212
09 r_r.time 104.123
10 Mixin::Version 3.0.11
11 C:/Courses/LUGPC8/Review/modules.rb:52: undefined method `time' for Mixin:Module (NoMethodError)
Lines 41 to 44 give the output on lines 2 to 5. You might be surprised that Search::Result appears to change the Mixin::Version constant value. In fact Ruby does not change the value: Search::Result defines its own Version constant, which is found first when the constant is looked up through the class, while Mixin::Version itself is untouched (see line 10 of the output). When you include a mixin, the constants and methods are not copied into the class; the module is added to the list of places that the class searches – see the Module documentation for methods that list the class_variables, constants, included_modules, and instance_methods of a class.
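We can watch this lookup at work in a cut-down sketch. The class names WithOwnVersion and WithoutOwnVersion are invented for this example, standing in for Search::Result and Racing::Result.

```ruby
# Cut-down, hypothetical versions of the lesson's modules
module Mixin
  Version = "3.0.11"
end

class WithOwnVersion
  include Mixin
  Version = "1.1.2"           # defined in the class, so it is found first
end

class WithoutOwnVersion
  include Mixin               # no Version here, so Mixin::Version is found
end

puts WithOwnVersion::Version
puts WithoutOwnVersion::Version
puts Mixin::Version           # unchanged by either class
puts WithOwnVersion.ancestors.first(2).inspect
puts WithOwnVersion.included_modules.include?(Mixin)
```

The ancestors list shows the search order: the class itself comes first, followed by the modules it includes.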
Lines 47 to 50 give the output on lines 6 to 9. Both classes use the mixin time method but using their own @time instance variable values.
Finally, lines 51 and 52 give the output on lines 10 and 11. Although we can print the Mixin::Version constant value, and we could call the module method Mixin.version if we wanted to, we can't do the same with the time method: it is an instance method, so it is only available on objects of classes that include Mixin, not on the module itself.
Ruby uses mixins to define the Object class. Some of the methods come from Object itself, and some from the Kernel module, which Object includes. This way, we don't end up with a huge amount of code in the Object class, and the module name also helps differentiate the behaviour of the methods themselves.
Two other notable mixin modules are Comparable, and Enumerable. For any given class, once the comparison method (<=>) has been defined, the Comparable mixin gives us six other methods (<, <=, ==, >=, >, and between?) for free. Similarly with Enumerable, once we have defined an each method, this mixin can provide a dozen other iterator methods.
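Here is a minimal sketch of both mixins in action. The Lap and Race classes are hypothetical, loosely following the racing theme of the modules.rb example.

```ruby
# Hypothetical classes, used only for illustration.
class Lap
  include Comparable
  attr_reader :seconds

  def initialize(seconds)
    @seconds = seconds
  end

  # The one method that Comparable needs
  def <=>(other)
    seconds <=> other.seconds
  end
end

class Race
  include Enumerable

  def initialize(*laps)
    @laps = laps
  end

  # The one method that Enumerable needs
  def each
    @laps.each { |lap| yield lap }
  end
end

race = Race.new(Lap.new(104.1), Lap.new(102.7), Lap.new(103.5))
puts(Lap.new(102.7) < Lap.new(104.1))               # < comes from Comparable
puts(Lap.new(103.0).between?(race.min, race.max))   # min, max from Enumerable
puts race.sort.map { |lap| lap.seconds }.inspect
```

Note that the two mixins cooperate: Enumerable's min, max and sort work here because the Lap objects they compare define the <=> method.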
Unit Testing
Ruby provides a unit testing library which makes testing our code trivial. We didn't have time last week, so I've added a unit_test.rb file which tests our SearchEngine from the 'Ruby and Ajax' lesson, together with the search.rb and word_count.rb files, and two simple text files which we'll test against.
01 #!/usr/local/bin/ruby -w
02
03 require 'search'
04 require 'test/unit'
05
06 class TestSearch < Test::Unit::TestCase
07   def setup
08     @engine ||= SearchEngine.new('./text', '.', 'http://www.jhl.it')
09   end
We create a class whose name begins with Test, and which inherits from Test::Unit::TestCase in the unit test library. Before the test methods are run, we can prepare the data we'll be testing in the setup method. Where this requires using resources, we can release those resources after each test method has run by defining a teardown method, though this wasn't needed in our example.
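Note that setup is called before each test method, and teardown after each test method. We don't need to parse the files for each test, so I use the short circuit on line 8 so that @engine is created only if it is still nil. (Bear in mind that Test::Unit creates a new TestCase object for each test method, so an instance variable starts out as nil in every test; to parse the files genuinely once, the engine would have to be kept somewhere that outlives the test object, such as a class variable.)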
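The short circuit assignment (||=) is worth a closer look on its own. In this sketch, the Cache class and its build_engine method are invented to count how often the "expensive" work really runs.

```ruby
# Hypothetical class: counts how often the "expensive" work really runs
class Cache
  attr_reader :builds

  def initialize
    @builds = 0
    @engine = nil
  end

  def engine
    @engine ||= build_engine  # build_engine only runs while @engine is nil
  end

  private

  def build_engine
    @builds += 1
    "engine"
  end
end

cache = Cache.new
3.times { cache.engine }
puts cache.builds

other = Cache.new             # a second object has its own @engine
other.engine
puts other.builds
```

The value is remembered per object: three calls on the same Cache build the engine once, but every new Cache object starts again from nil.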
See the documentation for Test::Unit::TestCase, and Test::Unit::Assertions for more information.
10   def test_files
11     assert_equal(2, @engine.files.length)
12     @engine.files.each do |file|
13       assert_equal("http://www.jhl.it/text", file.url[0...22])
14       case file.file
15         when './text/one_word.txt'
16           assert_equal('frabjous', file.title)
17           assert_equal(1, file.count)
18           assert_equal(1, file.unique_count)
19           assert_equal(1, file.word_count('frabjous'))
20           assert_equal(0, file.word_count('inexistant'))
21         when './text/four_words.txt'
22           assert_equal('O frabjous day! Callooh! Callay!', file.title)
23           assert_equal(10, file.count)
24           assert_equal(4, file.unique_count)
25           assert_equal(0, file.word_count('O'))
26           assert_equal(4, file.word_count('frabjous'))
27           assert_equal(3, file.word_count('day'))
28           assert_equal(2, file.word_count('Callooh'))
29           assert_equal(1, file.word_count('Callay'))
30         else
31           flunk "Unexpected #{file.file}"
32       end
33     end
34   end
The test_files method checks that both files have been parsed correctly. It also checks that no additional files have been added, as this will probably invalidate the results of other tests.
35   def test_results
36     results = @engine.search('inexistent')
37     assert_equal([], results)
38     results = @engine.search('day')
39     assert_equal(1, results.length)
40     result = results[0]
41     assert_equal('http://www.jhl.it/text/four_words.txt', result.url)
42     assert_equal('O frabjous day! Callooh! Callay!', result.title)
43     assert_equal(3, result.count)
44     results = @engine.search('frabjous')
45     assert_equal(2, results.length)
46     result = results[0]
47     assert_equal('http://www.jhl.it/text/four_words.txt', result.url)
48     assert_equal('O frabjous day! Callooh! Callay!', result.title)
49     assert_equal(4, result.count)
50     result = results[1]
51     assert_equal('http://www.jhl.it/text/one_word.txt', result.url)
52     assert_equal('frabjous', result.title)
53     assert_equal(1, result.count)
54   end
55 end
The second test runs the SearchEngine through its paces. We check that the results array is valid for the case of zero, one or two results (lines 36, 38, and 44).
The output of running this little program is as follows:
C:\Courses\LUGPC8\Review>unit_test.rb
Loaded suite C:/Courses/LUGPC8/Review/unit_test
Started
..
Finished in 0.015 seconds.

2 tests, 28 assertions, 0 failures, 0 errors

Overview of Ruby on Rails

Ruby on Rails is a framework for developing web applications in the Ruby language. The code that we develop is usually very intimately tied to the framework. This makes frameworks quite different from libraries, where code can usually be organised to remove such close dependencies, allowing a program to switch from one library to another.
Choosing the framework for a web application is therefore a major and permanent decision. It will be difficult if not impossible to switch frameworks once a significant part of the application has been developed. In the Ruby world, however, the choice is relatively simple, because Ruby on Rails is by far the dominant web application framework. It has achieved this dominance by reducing the complexity of web applications using a technique now known as convention over configuration.
What's in the Box?
Rails comprises six Ruby gems, which we'll look at briefly, and a set of code generators and other scripts.
  • Rails – the rails gem provides the core “glue” code which binds together the other five gems. As Ruby on Rails programmers we won't have many dealings with this gem; it does its work and keeps out of the way.
  • ActiveSupport – this gem provides a host of classes and methods, all designed to make our job easier. It adds methods to Ruby classes, provides helpers for enumerations, and adds Unicode support to strings.
  • ActiveRecord – provides an object relational mapper between the database tables and records, and the data objects in our web application. This gem provides near transparent methods to extract, modify and permanently save the data that we'll be using.
  • ActiveResource – a complete web services kit, strongly leaning towards RESTful implementations. We won't have time to look at this particular gem.
  • ActionPack – comprises ActionController and ActionView. The former provides a framework to analyse and elaborate a user's request, the latter provides templating engines, and helper classes for displaying the response to the user. We'll look at these in more detail a little later on.
  • ActionMailer – provides a suite of classes and methods for email services. Again, we won't have time to use this gem in our web application.
The Rails gem also performs additional tasks via a series of scripts, both to maintain the conventions it advocates, and to provide scaffolding code on which we can base our own work. The first and fundamental script is rails, which creates the basic application.
The application consists of the following folders:
  • app/ - the application code, in particular the controllers/, helpers/, models/ and views/.
  • config/ - configuration files, and database connections.
  • db/ - the database and migration files.
  • doc/ - generated application documentation.
  • lib/ - shared (library) code.
  • log/ - server logs.
  • public/ - public static files.
  • script/ - utility and code generation scripts.
  • test/ - unit, functional and integration tests, fixtures and mock classes.
  • tmp/ - temporary files created during the execution of the web application, such as cached pages, sockets and sessions.
  • vendor/ - the rails code and any additional plugin code.
That may seem like a lot of folders, but each has its own distinct purpose. During the next two lessons, we'll be doing work in the app/, db/, and public/ folders, and make brief visits to the config/ folder. The log/ and tmp/ folders are used by the web server itself, though the log files in particular can be a useful debugging tool. Our application won't make use of the doc/, lib/, and vendor/ folders, as such. Due to a lack of time, we will also be ignoring the test/ folder. Rails provides a sophisticated testing structure however, which I urge you to take a look at, from unit testing, right through to integration testing.
Models, Views and Controllers
The model-view-controller pattern is used by Rails, and most other web application frameworks, to distribute the work that the code does into separate, distinct areas of concern. The model takes care of the data, the view handles displaying information, and the controller coordinates the whole thing.
In Rails, the model consists of one or more classes that define the data that we'll be using in our application. These classes live in the app/models and db/migrate folders. They make extensive use of the ActiveRecord classes and methods to perform CRUD (create, read, update and delete) operations on the data, via the database tables.
The view consists of one or more (usually many more) classes that define the data presentation. These classes live in the app/views folder. They are for the most part templates, though a great deal of functionality is provided by the ActionView classes.
The controller handles a specific aspect of the application. The complete web application usually consists of more than one controller. These classes live in the app/controllers folder. Again, a great deal of functionality is provided to our controllers via the ActionController classes.
So how do these three main components work together? Let's look at an example, and 'talk' through the code. Let's suppose that a user sends a request for the URL /users/show/1. Here is a simplified scheme of what happens:
  1. Rails receives the request, and uses config/routes.rb to decide which controller is to be used. It also decides the controller method to call, and any further information that needs to be given to that method. In this case, it maps /users to the UsersController class, found in app/controllers/users_controller.rb; it maps /show to that controller's show method; and it maps /1 to the parameter identifier, id.
  2. At this point Rails will create a UsersController instance, and call the show method, after setting the id to 1. The controller then 'asks' the data model, User, to read the user data whose identifier is 1. Assuming that the user data exists, the data model will return a User object which the controller can then pass along to the template, which by convention will be in app/views/users/show.html.erb. Of course, if there is no user data with an identifier of 1, the model will give back nil, and the controller can decide to show some other template, probably containing an error message.
  3. The template show.html.erb then prepares the response, using HTML, containing some or all of the information found in the User object.
  4. The controller's show method terminates, and Rails takes the prepared response, which it sends back to the browser.
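The four steps above can be imitated in a few lines of plain Ruby. This is only a toy sketch to show the division of labour: none of these classes are the real Rails classes, the dispatch method and the RECORDS constant are invented, and the user name is made-up sample data.

```ruby
# Toy sketch of the request flow; none of this is the real Rails API.
class User
  RECORDS = { 1 => "Alice" }  # hypothetical stand-in for a database table

  attr_reader :id, :name

  def initialize(id, name)
    @id = id
    @name = name
  end

  # Returns a User, or nil when no record has this id
  def self.find_by_id(id)
    RECORDS.key?(id) ? new(id, RECORDS[id]) : nil
  end
end

class UsersController
  def show(id)
    user = User.find_by_id(id)                  # step 2: ask the model
    return "<p>No such user</p>" if user.nil?   # the alternative "template"
    "<h1>User #{user.id}: #{user.name}</h1>"    # step 3: the view's HTML
  end
end

# Step 1: split "/users/show/1" into controller, method and id
def dispatch(path)
  controller_name, action, id = path.split("/").reject { |part| part.empty? }
  controller = Object.const_get("#{controller_name.capitalize}Controller").new
  controller.send(action, id.to_i)              # step 4: hand back the response
end

puts dispatch("/users/show/1")
puts dispatch("/users/show/2")
```

Even in this toy version the naming convention does the routing work: the URL segment users is enough to find the UsersController class, with no configuration file involved.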
Each of the three components, model, view and controller, works in collaboration with the others, but without knowing the intimate details of what exactly the others do. This is what is meant by distinct, separate areas of concern.
The model and view don't even (need to) know that the controller exists. The model doesn't (need to) know that the view exists.
The controller knows the model and view, but doesn't know how they go about their business. It only needs to know what to ask for, in the case of the model, and what to pass along, in the case of the view.
Finally, the view only knows about the model, and how to extract information from that model. It doesn't know how the model objects were created, or by whom.

Source Files

All the source files for this lesson can be found in the LUGPC8.zip archive, distributed under the GNU Lesser General Public License.

What's Next?

Next we'll take our knowledge of Ruby a step further by building our very own Ruby on Rails web application.
Lesson 9 will take place on Monday, 10th March 2008.