Ruby

From WhyNotWiki

This page is mostly about how to use Ruby -- tips and tricks and such -- which is probably what most people are interested in most of the time. For general information about Ruby, including advocacy and analysis, see Ruby / About. This article is also my primary repository for snippets and examples at the moment, although I'd like to move them out...



Learning Ruby

http://poignantguide.net/ruby/

http://wiki.rubygarden.org/Ruby/page/show/RubyIdioms RubyIdioms

Reference

http://www.ruby-doc.org/

http://www.ruby-doc.org/core/
http://www.ruby-doc.org/stdlib/

http://www.rubycentral.com/book/ (Programming Ruby: The Pragmatic Programmer's Guide, First Edition) (buy the Second Edition instead!)

http://www.rubycentral.com/book/lib_standard.html

Cheat sheet / Quick reference

http://www.zenspider.com/Languages/Ruby/QuickRef.html (very good)

Miscellaneous / Examples

Data types: False values

The only false values are nil and false. In particular, 0 and '' are true! (http://woss.name/2006/05/07/notes-from-a-rails-course/)
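A quick way to convince yourself of this is to run a handful of values through a conditional. This is only a sketch; the variable names are made up for illustration.

```ruby
# Values that are "falsy" in many other languages (0, '', [], {}) are all true in Ruby.
candidates = [nil, false, 0, '', [], {}, 0.0]

candidates.each do |value|
  puts "#{value.inspect} is #{value ? 'true' : 'false'}"
end

# Keep only the values that fail a truth test.
falsy_values = candidates.reject { |value| value }
p falsy_values
```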


Equivalent to Python's if __name__ == "__main__": main ()?

  if $0 == __FILE__
    main
  end


Examples from Rubyisms in Rails

(Source: Rubyisms in Rails, by Jacob Harris. p. 28-29)

In Java:

Iterator iterator = birdlist.iterator();
while (iterator.hasNext()) {
  Bird bird = (Bird) iterator.next();
  if (bird.endangered()) {
    call_scientists();
    break;
  }
}

In Ruby:

Detection:
call_scientists if birdlist.any? {|bird| bird.endangered? }

Selection:
common_birds = birds.select {|bird, count| count > 5 }
rare_birds = birds.reject {|bird, count| count > 5 }

Accumulation (summing up values):
total_bird_sightings = birds.inject(0) {|total, bird| total + bird.count }

Searching:
winner = contestants.find {|x| x.has_answer? }

Reordering:
titles = jukebox.sort_by {|x| x.title }

Partitioning:
quick, dead = people.partition {|p| p.can_outrun? :bear }
:-) 

Recombination:
blind_dates = restaurants.zip(men, women)
:-)

dates = [:OliveGarden, :TacoBell].zip([:David, :Fred], [:Sara, :Jane])
#     => [[:OliveGarden, :David, :Sara], [:TacoBell, :Fred, :Jane]]

(/Source)
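The snippets above assume collections that are never shown. Here is a small self-contained sketch of a few of them; the Bird struct and the sample data are invented for illustration and are not from the book.

```ruby
# Hypothetical data: a Bird has a name and a number of sightings.
Bird = Struct.new(:name, :count)
birds = [Bird.new('robin', 12), Bird.new('kakapo', 1), Bird.new('sparrow', 30)]

# Accumulation: start from 0 so the block always adds numbers to a number.
total_bird_sightings = birds.inject(0) { |total, bird| total + bird.count }

# Partitioning: one pass, two result arrays.
common_birds, rare_birds = birds.partition { |bird| bird.count > 5 }

# Searching: the first element for which the block is true.
first_rare = birds.find { |bird| bird.count <= 5 }

# Reordering: sort by a computed key.
names_by_count = birds.sort_by { |bird| bird.count }.map { |bird| bird.name }

puts total_bird_sightings
puts names_by_count.join(', ')
```

Without the initial value, inject would use the first Bird itself as the starting total, which is why the sketch passes 0.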

[Caveats (category)] ENV is not a Hash, like you'd expect, but an Object

It doesn't even have a proper class -- just a singleton class. Its class is "Object".

irb -> ENV.class
    => Object

irb -> puts ENV.class.ancestors
Object
PP::ObjectMixin
Kernel

It does behave like a Hash, however. And it is Enumerable.

{"TERM"=>"linux",
 "SHELL"=>"/bin/bash",
 "LOGNAME"=>"tylerrick",
 "VISUAL"=>"/usr/bin/vim",
 "LINES"=>"55",
 "COLUMNS"=>"155",
 ...}

irb -> ENV.respond_to? :[]
    => true
irb -> ENV.respond_to? :to_a
    => true
irb -> ENV.respond_to? :to_hash
    => true

irb -> puts (class << ENV; self; end).ancestors
Enumerable
Object
PP::ObjectMixin
Kernel
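If you need a real Hash, you can ask ENV for a copy. The following sketch pokes at ENV using only core methods; the environment variable name is made up and is assumed not to be set.

```ruby
env_class         = ENV.class                # Object, not Hash
env_is_hash       = ENV.is_a?(Hash)          # it is not a Hash...
env_is_enumerable = ENV.is_a?(Enumerable)    # ...but it does mix in Enumerable
snapshot          = ENV.to_hash              # a plain Hash copy of the environment

# fetch with a default works like Hash#fetch (hypothetical variable name).
fallback = ENV.fetch('WHYNOT_SKETCH_UNSET_VARIABLE', 'default value')

puts env_class
puts snapshot.class
puts fallback
```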

I tried to check if there were any other constants like ENV that were just "Objects", but I didn't find any...

irb -> puts Object.constants.sort.map {|it| "#{it}: #{Object.const_get(it).class}" }
...

Types/type conversions/type checking

irb -> nil.to_i
    => 0

"Coercion between types will use to_str, to_int, etc for attempting to coerce an object to a string/integer internally. to_s and to_i produce human-readable representations." (http://woss.name/2006/05/07/notes-from-a-rails-course/)

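To make the distinction in that quote concrete, here is a sketch with a made-up class (LoosePath) that defines both to_s and to_str. String#+ uses the implicit conversion (to_str), while interpolation uses to_s. It also contrasts the lenient to_i with the strict Integer().

```ruby
# Hypothetical class that answers both conversion protocols differently.
class LoosePath
  def initialize(path)
    @path = path
  end

  # Human-readable representation, used by interpolation and puts.
  def to_s
    "LoosePath(#{@path})"
  end

  # Implicit conversion: "treat me as if I were a String".
  def to_str
    @path
  end
end

joined = 'dir: ' + LoosePath.new('/tmp')   # String#+ calls to_str
shown  = "#{LoosePath.new('/tmp')}"        # interpolation calls to_s

# to_i never complains; Integer() does.
lenient = ['12abc'.to_i, 'abc'.to_i, nil.to_i]
strict_error = begin
  Integer('abc')
rescue ArgumentError => e
  e.class
end

puts joined
puts shown
p lenient
p strict_error
```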

Meta-programming: eval

eval doesn't show the right line numbers!

You have to pass __FILE__ (and __LINE__) to eval or it will print something useless like (eval):4 in any backtraces, etc.

Greg Houston describes this problem well at http://ghouston.blogspot.com/2006/06/ruby-meta-programming-and-stack-traces.html

His solution/demonstration of his solution:

class Monkey
  module_eval(<<-EOS, __FILE__, __LINE__)
        def see
          puts 'Hello World from Monkey.see'
          puts caller(0)
        end
      EOS
end
    
Monkey.new.see
Hello World from Monkey.see
Monkey-see.rb:5:in `see'
Monkey-see.rb:11
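The same effect can be seen without a file on disk. In this sketch the file name 'virtual_monkey.rb' and the starting line 10 are arbitrary values chosen for illustration; whatever you pass is what shows up in the call stack.

```ruby
# With an explicit file name and starting line, frames point at that "file".
class LabelledMonkey
  module_eval("def see\n  caller(0).first\nend\n", 'virtual_monkey.rb', 10)
end

# Without them, Ruby can only report a generic "(eval ...)" location.
class UnlabelledMonkey
  module_eval("def see\n  caller(0).first\nend\n")
end

labelled   = LabelledMonkey.new.see
unlabelled = UnlabelledMonkey.new.see

puts labelled
puts unlabelled
```

The call to caller sits on the second line of the evaluated string, so the labelled frame reports line 11 of the pretend file.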

In the evaled code, __FILE__ will be whatever you pass in to eval, and the working directory (pwd) will be that of the caller

This may be self evident to some people, but it caught me by surprise when I was using Reap's RubyCommentTester. You see, RubyCommentTester extracts your test from a comment in your source file and then evals it.

I was running rubytest like this:

/path/to/project > rubytest lib/string/string.rb

which caused RubyCommentTester (/usr/lib/ruby/gems/1.8/gems/reap-6.0.2/lib/reap/bin/rubytest.rb) to eval it like this:

    eval test_code, TOPLEVEL_BINDING, File.basename(filepath), offset

My test was at the bottom of string.rb and looked something like this:

=begin test
require File.dirname(__FILE__) + "/../../test/test_helper"
class StringTest < Test::Unit::TestCase
  ...
end
=end

The problem occurs on this line, where I try to require a file by its full path:

require File.dirname(__FILE__) + "/../../test/test_helper"

Instead, it ends up trying to require "./../../test/test_helper". __FILE__ was not the full path to string.rb ("/path/to/project/lib/string/string.rb") like I expected it to be. Instead, it was File.basename("/path/to/project/lib/string/string.rb"), which is just "string.rb". So now File.dirname(__FILE__) is trying to extract just the path part of a filename and it gives me "." instead of "/full/path/to/" like I was expecting!

What's more, the working directory (which I printed out with FileUtils.pwd) is not the path of the file actually containing the code (/path/to/project/lib/string/string.rb). Instead, it's whatever the working directory was when I ran rubytest from the command line (/path/to/project).

So when it does the relative require to "./../../test/test_helper", it is relative to "/path/to/project/" rather than "/path/to/project/lib/string/" like I expected. Long story short: it doesn't work.

Since I don't have access to the full path of the file, I can't reliably do an absolute require. And since I don't know ahead of time what pwd will be, I can't reliably do a relative require either.

Solutions: I've thought of several possible solutions, none of which I like.

Option 1: Don't do any requires in my test code. (In this particular case, it's not critical that I require anything. I just resent that I'm not able to determine the path of the current file in case I did need to...)

Option 2: Try several variations of require, one with "../", one with "../../", etc.

Option 3: Change the source of RubyCommentTester (/usr/lib/ruby/gems/1.8/gems/reap-6.0.2/lib/reap/bin/rubytest.rb) from:

eval test_code, TOPLEVEL_BINDING, File.basename(filepath), offset

to:

eval test_code, TOPLEVEL_BINDING, filepath, offset

Option 4: Use caller() or $!.backtrace and extract the path information from the callstack. Doesn't work: the only information that those return is ["string.rb:19"].

I ended up going with option 1.


Depending on other Ruby files and libraries (require/load/gem)



Requiring files

The require statement is executed at runtime (there is no compile-time!) and uses the global variable $: to decide where to look for the file. You can modify or inspect $: if you discover that it can't find the file you're trying to include.

Caveat: require will include the same file again if you spell the path differently

Workaround: Use File.expand_path

require File.expand_path(File.dirname(__FILE__) + "/../config/environment")
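Here is a sketch of the workaround in action. It writes a throwaway file into a temporary directory (the file name and the global counter are invented for the demonstration), then requires it through two different spellings of the same path. Because both spellings are expanded first, the file body runs only once.

```ruby
require 'tmpdir'

$sketch_load_count = 0   # incremented by the throwaway file each time it is loaded
second_require = nil

Dir.mktmpdir do |dir|
  File.write(File.join(dir, 'counted_lib.rb'), "$sketch_load_count += 1\n")

  plain      = File.join(dir, 'counted_lib.rb')
  roundabout = File.join(dir, 'sub', '..', 'counted_lib.rb')   # same file, spelled differently

  # expand_path collapses the "sub/.." detour, so both requires see one identical path.
  require(File.expand_path(plain))
  second_require = require(File.expand_path(roundabout))
end

puts $sketch_load_count
p second_require
```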

Caveat: the starting file is not counted as already required

So if you did something like this, you could end up stuck in an infinite loop:

#starting_file.rb
require "a.rb"

#a.rb
require "starting_file.rb"

Then run -- don't require, just run it from the command line -- starting_file.rb
> ruby starting_file.rb

Granted, you probably shouldn't have circular dependencies like that, but if you do (by accident or design), I think it should be smarter than to get stuck in an infinite loop and overflow the stack! I would consider this a bug.

require_local

I'm sure glad the folks at http://facets.rubyforge.org/ wrote this little extension, because I get sick of always doing requires like this:

require File.dirname(__FILE__) + "/something_in_the_same_directory"

How verbose and ugly. Here's the better way that you've been looking for:

require "rubygems"
require 'facets/core/kernel/require_local'
require_local "something_in_the_same_directory"

http://extensions.rubyforge.org/ has a similar method: require_relative

Why using require_local is more reliable than relying on the $LOAD_PATH and the "." directory

Example:

Even something as innocent-looking as this line at the top of a test can spell trouble:

require 'test/test_helper'

It causes a problem when you try to execute the test script from a directory other than the one the developer anticipated.

active_record_test.rb:2:in `require': no such file to load -- test/test_helper (LoadError)

That kind of brittleness, I hope you will agree, ought to be avoided.

Case in point: http://activescaffold.googlecode.com/svn/tags/current/test/extensions/active_record_test.rb

p $LOAD_PATH  # Added by Tyler
require 'test/test_helper'

If we run it from the root of the plugin directory, the test works just fine!

~/code/plugins/activescaffold > ruby test/extensions/active_record_test.rb
["/usr/lib/ruby/site_ruby/1.8", "/usr/lib/ruby/site_ruby/1.8/i386-linux", "/usr/lib/ruby/site_ruby", "/usr/lib/site_ruby/1.8", "/usr/lib/site_ruby/1.8/i386-linux", "/usr/lib/site_ruby", "/usr/lib/ruby/1.8", "/usr/lib/ruby/1.8/i386-linux", "."]
Loaded suite test/extensions/active_record_test
...

You can see that it had no trouble finding the file 'test/test_helper', because the $LOAD_PATH contains the path ".", which in my case was ~/code/plugins/activescaffold.

But I for one am used to running tests directly from the directory that contains the test. What happens if I try to run the test from within the test/extensions directory?

~/code/plugins/activescaffold/test/extensions > ruby active_record_test.rb
["/usr/lib/ruby/site_ruby/1.8", "/usr/lib/ruby/site_ruby/1.8/i386-linux", "/usr/lib/ruby/site_ruby", "/usr/lib/site_ruby/1.8", "/usr/lib/site_ruby/1.8/i386-linux", "/usr/lib/site_ruby", "/usr/lib/ruby/1.8", "/usr/lib/ruby/1.8/i386-linux", "."]
active_record_test.rb:2:in `require': no such file to load -- test/test_helper (LoadError)
...

Uh oh. You will notice that even though the value of the $LOAD_PATH variable is identical, it wasn't able to find 'test/test_helper' due to the fact that "." no longer refers to ~/code/plugins/activescaffold but now refers to ~/code/plugins/activescaffold/test/extensions! So even though it tried looking for './test/test_helper', that ended up causing it to search for ~/code/plugins/activescaffold/test/extensions/test/test_helper, which is obviously not a valid path.

The moral of the story is that we shouldn't make assumptions about:

  • what's been added to $LOAD_PATH by this point (not demonstrated, but still true)
  • what the current working directory (= "." = Dir.getwd) is

Instead of doing this:

require 'test/test_helper'

consider doing this:

require 'facets/core/kernel/require_local'
require_local '../test_helper'

Or, if for some reason you really want to use $LOAD_PATH, then you should rely on an entry in $LOAD_PATH whose meaning is fixed, not one which changes depending on which directory you are in at the moment:

$LOAD_PATH << File.join(File.dirname(__FILE__), '..', '..')
require 'test/test_helper'

(This will add a path such as "/home/tyler/code/plugins/activescaffold/test/extensions/../.." onto the $LOAD_PATH.)

Then you can be 100% confident that the require 'test/test_helper' will succeed no matter which directory you run the script from.
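The idea of anchoring on the file's own location rather than on "." can be sketched as a tiny helper. The name anchored_path is made up; in a real script you would pass __FILE__ as the first argument, but the example uses a literal path so the result is easy to see.

```ruby
# Resolve a path relative to the directory of the given file,
# independent of the current working directory.
def anchored_path(current_file, relative)
  File.expand_path(relative, File.dirname(current_file))
end

test_file   = '/home/tyler/code/plugins/activescaffold/test/extensions/active_record_test.rb'
helper      = anchored_path(test_file, '../test_helper')
plugin_root = anchored_path(test_file, '../..')

puts helper
puts plugin_root
```

Unlike the File.join version, the expanded form contains no leftover ".." segments, which also keeps require from treating two spellings of one file as different files.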

Load path/include path

$: and $LOAD_PATH are equivalent

Add a path to it like this: $LOAD_PATH << 'my/path'

How to require a file if it exists, but otherwise silently continue

path = File.expand_path(File.dirname(__FILE__) + "/../../../../vendor/plugins/shared/lib/test_helper.rb")
require path if File.exist?(path)

Or could we do that more cleanly with a begin/rescue block ?
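One possible answer, as a sketch: wrap the require in a method that rescues LoadError. The method name require_if_present and the bogus library name are invented for illustration.

```ruby
# Try to require something; report whether it could be loaded instead of raising.
def require_if_present(name)
  require name
  true
rescue LoadError
  false
end

missing = require_if_present('whynot_sketch_library_that_does_not_exist')
present = require_if_present('fileutils')

p missing
p present
```

One trade-off to be aware of: this also swallows a LoadError raised from inside the required file (for example, when that file requires something that is missing), which the File.exist? check does not.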

How to require all the files in a certain directory

The quick, no-dependency-way:

Dir[File.dirname(__FILE__) + '/tasks/*.rake'].each {|f| load f}

The slightly prettier way:

(using qualitysmith_extensions)

require_all File.dirname(__FILE__) + '/tasks/*.rake'

Caveat: require only works with files ending in .rb!

I knew it worked to omit the ".rb" from the require command...

Either of these work:

require 'my_file'  # finds my_file.rb
require 'my_file.rb'  # finds my_file.rb

What I didn't know was that you can't use require to load files that don't end in .rb (or in a native-extension suffix such as .so)! For example, if you have a file bin/my_command (not my_command.rb), then this will fail:

require 'my_command'

This, however, works (but inherits any problems that load comes with, such as possibly doubly loading a file):

load 'my_command'
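Both halves of this caveat can be reproduced with a throwaway file. In this sketch the extensionless script is written to a temporary directory and bumps a made-up global counter each time it runs.

```ruby
require 'tmpdir'

$sketch_command_runs = 0
require_error = nil

Dir.mktmpdir do |dir|
  script = File.join(dir, 'my_command')              # note: no .rb extension
  File.write(script, "$sketch_command_runs += 1\n")

  begin
    require script                                   # looks for my_command.rb etc., not my_command
  rescue LoadError => e
    require_error = e.class
  end

  load script                                        # load takes the name literally...
  load script                                        # ...and happily runs the file again
end

p require_error
puts $sketch_command_runs
```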

[Gems (category)]

The file gem_b/the_name.rb, if it exists, may be loaded instead of the the_name/the_name.rb file (from the the_name gem)

In other words, if you simply do a require 'gem_name' in your code and expect it to load that file (gem_name.rb) from the gem_name gem, you may be in for a surprise if there is another gem out there that happens to have a file by the same name and happens to be in a more preferred position in the load path.

Here's a real-life example that happened to me, when I was trying to do require 'ruby2ruby'...

Initial symptoms:

> sudo gem install ruby2ruby
> irb
irb -> require 'ruby2ruby'
    => true

irb -> def a(&block); block.to_ruby; end
    => nil

irb -> a { foo() }
NoMethodError: undefined method `to_ruby' for #<Proc:0xb7c80bd8@(irb):3>
        from (irb):2:in `a'
        from (irb):3

What?? The documentation for ruby2ruby suggests that the method `to_ruby' is defined for objects of class Proc!

Further investigation:

> gemwhich ruby2ruby
/usr/lib/ruby/gems/1.8/gems/ZenHacks-1.0.1/lib/ruby2ruby.rb

Oh.

After putting appropriate debug output code in both of these files:

  • /usr/lib/ruby/gems/1.8/gems/ZenHacks-1.0.1/lib/ruby2ruby.rb
  • /usr/lib/ruby/gems/1.8/gems/ruby2ruby-1.1.6/lib/ruby2ruby.rb

> irb
irb -> require 'ruby2ruby'
Loaded ZenHacks gem version
    => true

There are 2 solutions to this problem:

The least-destructive / easiest way to get around this is to simply activate the gem you want to include the file from before you do the require...

> irb
irb -> gem 'ruby2ruby'
    => true

irb -> require 'ruby2ruby'
Loaded ruby2ruby gem version
    => true

The other solution is to simply move the file you don't want to be loaded out of the way (or simply remove it).

> sudo mv /usr/lib/ruby/gems/1.8/gems/ZenHacks-1.0.1/lib/ruby2ruby.rb /usr/lib/ruby/gems/1.8/gems/ZenHacks-1.0.1/lib/ruby2ruby.rb2

> gemwhich ruby2ruby
/usr/lib/ruby/gems/1.8/gems/ruby2ruby-1.1.6/lib/ruby2ruby.rb

> irb
irb -> require 'ruby2ruby'
Loaded ruby2ruby gem version
    => true

This method:

  • requires you to have write permissions to whatever directory the gems are stored in.
  • will cause any code that uses ZenHacks and tries to use the version of ruby2ruby.rb found there, to fail (unless the ruby2ruby gem's ruby2ruby.rb is backwards compatible...).

That's what I ended up doing, because I think the version from the ruby2ruby gem is the official, most up-to-date version of that code; and more importantly, it's the only version that works the way I expect it to.

When I follow an example I find on the web that says you can just require 'ruby2ruby' and then do such-and-such, I want it to hold true for me too! The ZenHacks gem, as cool as it may be, has no business interfering with that behavior (the behavior that would have existed for require 'ruby2ruby' had the ZenHacks gem not been installed).

Now I can do this and it behaves as expected:

> irb
irb -> require 'ruby2ruby'
    => true

irb -> def a(&block); block.to_ruby; end
    => nil

irb -> a { foo() }
    => "proc {\n  foo()\n}"
 


Variables and constants


Can you reference variables that haven't been initialized?

Yes, if they are global variables or instance variables.

No, if they are local variables or class variables.

def demonstrate
  puts $a
  puts @a
  puts a    # error!
  puts @@a  # error!
end
demonstrate

nil
nil
using_undefined_variables.rb:6:in `demonstrate': undefined local variable or method `a' for main:Object (NameError)
        from using_undefined_variables.rb:9
irb -> a
NameError: undefined local variable or method `a' for main:Object
        from (irb):1

irb -> @@a
NameError: uninitialized class variable @@a in Object
        from (irb):7

irb -> @a
    => nil

irb -> $a
    => nil

How do you check if a variable is defined?

Not like this:

irb -> defined?(:a)
    => "expression"

because that only checks the symbol literal :a, which is always a valid expression; it says nothing about a variable named a.

defined? is a special language keyword rather than an ordinary method. The difference is that you can pass undefined local and class variables to defined? and you won't get an error!

defined? returns nil if the variable (or whatever?) is not defined and returns what kind of "thing" it is if it is defined...
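The same checks can be bundled into a script. This is a sketch with a made-up class (DefinedProbe) so that the class variable has somewhere to live; the strings are the ones current Ruby versions return.

```ruby
class DefinedProbe
  # Nothing has been assigned yet, so every check should be nil.
  def before_assignment
    [defined?(probe_local), defined?(@probe), defined?(@@probe), defined?($defined_probe_global)]
  end

  # After assignment, defined? describes what kind of thing each name is.
  def after_assignment
    probe_local = 'a local variable'
    @probe = 'an instance variable'
    @@probe = 'a class variable'
    $defined_probe_global = 'a global variable'
    [defined?(probe_local), defined?(@probe), defined?(@@probe), defined?($defined_probe_global)]
  end
end

probe  = DefinedProbe.new
before = probe.before_assignment
after  = probe.after_assignment

p before
p after
```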

irb -> defined?(a)
    => nil
irb -> defined?(@a)
    => nil
irb -> defined?(@@a)
    => nil
irb -> defined?($a)
    => nil


irb -> a = 'a local variable'
    => "a local variable"
irb -> @a = 'an instance variable'
    => "an instance variable"
irb -> @@a = 'a class  variable'
    => "a class  variable"
irb -> $a = 'a global variable'
    => "a global variable"

irb -> defined?(a)
    => "local-variable"
irb -> defined?(@a)
    => "instance-variable"
irb -> defined?(@@a)
    => "class variable"
irb -> defined?($a)
    => "global-variable"


How do you check if a constant is defined?

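At first I thought you might use const_defined? ... but then I saw that you have to ask that of a specific module. What if you just define a constant at the global scope (not in a module)? Then which module would you ask whether that constant is defined? (Top-level constants end up on Object, as the scope experiments further down this page suggest, so Object would be the one to ask.) I couldn't get that to work at all by asking Module.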

Instead, use the language-built-in (I couldn't find it in the RDoc for Kernel/Module/etc.) check defined? . It will be nil if it's not defined, non-nil if it is defined!

Foo = true
puts Module.const_defined?(:Foo)
# => false
puts defined? Foo
# => constant
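Since constants defined at the top level are stored on Object, asking Object rather than Module is one way to make const_defined? usable here. A small sketch, with constant names made up for the demonstration:

```ruby
SketchTopLevel = true

on_object   = Object.const_defined?(:SketchTopLevel)      # ask the module that owns it
via_keyword = defined?(SketchTopLevel)                    # the keyword needs no receiver

missing_on_object   = Object.const_defined?(:SketchNeverDefined)
missing_via_keyword = defined?(SketchNeverDefined)

p on_object
p via_keyword
p missing_on_object
p missing_via_keyword
```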

What's the difference between a constant and a variable?

Besides that constants begin with a capital letter and variables do not...?

Does it let you change constants? (Yes)

Yes, but it will raise a warning:

irb -> PI
NameError: uninitialized constant PI
        from (irb):1
irb -> PI=3
    => 3

irb -> PI=4
(irb):3: warning: already initialized constant PI
    => 4

irb -> PI
    => 4

However, you may not change a constant from a method apparently:

irb -> def foo; PI=4; end
SyntaxError: compile error
(irb):1: dynamic constant assignment
def foo; PI=4; end
            ^
        from (irb):1

However, this is Ruby, and like most restrictions in Ruby, you can get around this one:

irb -> def foo; self::class::const_set(:PI, 4); end
    => nil

irb -> foo
    => 4

irb -> PI
    => 4

Different scope: To whom do the constants actually belong?

They always belong to a class/module, it appears; you can't have local constants in the same way that you can have local variables.

irb -> MODEL_DIR="/path/to/models"
    => "/path/to/models"

irb -> Kernel::MODEL_DIR = "something else entirely"
    => "something else entirely"

irb -> MODEL_DIR
    => "/path/to/models"

irb -> Object::MODEL_DIR
    => "/path/to/models"

irb -> class Foo
irb(main):010:1> MODEL_DIR="Foo's MODEL_DIR"
irb(main):011:1> end
    => "Foo's MODEL_DIR"

irb -> MODEL_DIR
    => "/path/to/models"

irb -> Object::const_get(:MODEL_DIR)
    => "/path/to/models"

irb -> Foo::MODEL_DIR
    => "Foo's MODEL_DIR"

I would have to say that it looks like unless Ruby has some reason to think you want it elsewhere (f.e., you give an explicit scope, Foo::, or you're inside of a class), it will think you want the constant to be defined for the class object "Object".

Checking if a constant is defined

This could be useful, for example, if you want to conditionally mix in a module or something only if a class is available.

include CoolFeature if CoolFeatureModule.const_defined?(:CoolClass)

(not a good example, I know)

Caveat: Did you mean to have 2 variables reference the same object or two copies of the object?

I easily forget that variables and objects are different. A variable just points to an object. So you can have two variables referring to the same object ... even without meaning to! Really! Believe me!

Demonstration:

irb -> original = new = "hi there"; new.capitalize!; original + " => " + new
    => "Hi there => Hi there"

It may not be obvious unless you know Ruby, but original = new = "hi there" causes both original and new (two variables) to both point to the same object in memory. So changing the one variable will affect the other. See, they have the same object_id:

irb -> original == new
    => true
irb -> original.object_id == new.object_id
    => true
irb -> "#{original.object_id} == #{new.object_id}"
    => "-604147648 == -604147648"

The proper way to create independently modifiable copies of an object is with the dup (duplication) method, like so:

irb -> original = "hi there"; new = original.dup; new.capitalize!; original + " => " + new
    => "hi there => Hi there"

It's a little more verbose, but I don't know of a more concise way of doing it!

Observe how the objects have different object IDs:

irb -> original == new
    => false
irb -> original.object_id == new.object_id
    => false
irb -> "#{original.object_id} != #{new.object_id}"
    => "-604271038 != -604271048"

That's because they really are two separate objects!
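The two cases can be put side by side in one script. This sketch uses equal? (object identity) instead of comparing object_id by hand; the variable names are illustrative.

```ruby
original = 'hi there'.dup   # start from a mutable string
shared   = original         # second variable, same object
copy     = original.dup     # second object with the same contents

shared.capitalize!          # mutates the one object that original and shared both point to

puts original               # changed, because shared is the same object
puts copy                   # untouched, because it is a separate object

same_object     = original.equal?(shared)
separate_object = !original.equal?(copy)
```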

Variable scope/accessibility

Sigils by scope:

  • local: v
  • instance: @v
  • class: @@v
  • global: $v

Normal access (from within the instance/class):

  • instance variable (read/write): @v, @v =
  • instance accessor (getter/setter): object.v, object.v =
  • class variable: @@v
  • class accessor (getter/setter): Klass.v, Klass.v =

Cheating (when scope rules would prevent normal access):

  • instance variable (read/write): object.instance_variable_get(:@v), object.instance_variable_set(:@v, new_value)
  • instance accessor (getter/setter): object.send(:v), object.send(:v=, new_value)
  • class variable (read/write): klass.class_variable_get(:@@v), klass.class_variable_set(:@@v, new_value)
  • class accessor (getter/setter): ? klass.send(:v), ?? klass.send(:v=, new_value)

cattr_accessor|attr_accessor

mattr_accessor (for modules)

Instance (member) variables

From within a method, it's easy enough to access an instance variable. You just refer to it as @foo or whatever it's called.

Outside of the class, it's a little bit harder:

# Doesn't work:
irb -> a.@foo
SyntaxError: compile error
(irb):9: syntax error, unexpected tIVAR
        from (irb):9

# But you can access it this way:
irb -> a.instance_variable_get(:@foo)
    => []

Reminder: You should never access an instance variable like that (I only use it for debugging). Instead, create accessor methods for it, using attr_writer, attr_reader, or attr_accessor.

Class variables

People.send(:class_variable_set, :@@lifespan, 0.minute)

mod.class_variables will only report which class variables have been set (initialized)

irb -> class A; def self.hi; @@a = 1 end; end
    => nil

# You might expect it to list @@a, but it doesn't really treat it as a class_variable until the moment it is initialized
irb -> A.class_variables
    => []

irb -> A.hi
    => 1

irb -> A.class_variables
    => ["@@a"]

For reasons such as that, it's usually a good idea to initialize class variables in the class scope, even if you will later change them from within a method...


irb -> class B; @@b = nil; def self.hi; @@b = 1 end; end
    => nil

irb -> B.class_variables
    => ["@@b"]


 


Methods / message passing



Article Metadata:

Flag those sections that include non-core-ruby stuff (for example, requires another gem) with "Requires #{what_it_requires}"

intersection of [Category:Ruby:Messages] and [Category:Examples]: flag with heading prefix "[Example]".

Further classification:

  • concepts

Example of: Passing hash as if we had named parameters

Source: http://docs.rubyrake.org/read/chapter/4#pa

task :name => [:prereq1, :prereq2]

NOTE: Although this syntax looks a little funky, it is legal Ruby. We are constructing a hash where the key is :name and the value for that key is the list of prerequisites. It is equivalent to the following …

hash = Hash.new
hash[:name] = [:prereq1, :prereq2]
task(hash)
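To see the receiving side of this idiom, here is a sketch of a method that accepts such a hash. describe_task is a made-up stand-in, not Rake's actual task method.

```ruby
# Hypothetical receiver: takes a one-entry hash of name => prerequisites.
def describe_task(spec)
  name, prereqs = spec.first            # first [key, value] pair of the hash
  "#{name} depends on #{prereqs.join(', ')}"
end

# The braces may be omitted when the hash is the last argument.
short_form = describe_task :name => [:prereq1, :prereq2]
long_form  = describe_task({ :name => [:prereq1, :prereq2] })

puts short_form
```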

"returning": Tell it what you're returning in case you forget and accidentally return the result of a method call!

Part of ActiveSupport, but [Should be core Ruby (category)]

It also feels a little more concise and elegant and Rubyish...

(modified quote) (source: http://errtheblog.com/post/29)

def change_state(object_id, new_state)
  returning find(object_id) do |object|
    object.state = new_state
    object.save
  end
end

instead of:

def change_state(object_id, new_state)
  object = find(object_id)
  object.state = new_state
  object.save
  object
end

(/modified quote)

Because otherwise you might forget that Ruby automatically returns the result of the last expression in the method and accidentally return the result of object.save or something:

def change_state(object_id, new_state)
  object = find(object_id)
  object.state = new_state
  object.save
end

Another example:

def cd_rails_root
  returning(FileUtils.getwd) { FileUtils.cd(RAILS_ROOT) }
end
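If ActiveSupport is not around, core Ruby's Object#tap (available since Ruby 1.9) can be used in much the same way: it yields the receiver to the block and then returns the receiver, whatever the block returned. A sketch, with a made-up SketchRecord struct standing in for a model:

```ruby
# Hypothetical stand-in for a model; save returns true, not the record.
SketchRecord = Struct.new(:state, :saved) do
  def save
    self.saved = true
  end
end

def change_state(record, new_state)
  record.tap do |r|
    r.state = new_state
    r.save            # its return value is ignored; tap returns the record
  end
end

result = change_state(SketchRecord.new(:draft, false), :published)
p result
```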

Silliness: alias and alias_method

I prefer to use alias_method rather than alias.

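alias_method :orig_meth, :meth   # keep the original reachable under a new name
def meth; end                    # override it...
# general form: alias_method :new_meth, :old_meth

The order of arguments is the same for alias (new name first, then the existing name). The differences are that alias is a keyword that takes bare names with no comma between them, while alias_method is an ordinary method of Module that takes symbols or strings.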
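A runnable sketch of the "keep the original, then override" pattern, using a made-up Greeter class and showing both spellings:

```ruby
class Greeter
  def greet
    'hello'
  end

  # Method form: symbols, comma, new name first.
  alias_method :plain_greet, :greet

  # Override greet, delegating to the saved original.
  def greet
    plain_greet.upcase + '!'
  end

  # Keyword form: bare names, no comma, new name first.
  alias shout greet
end

greeter = Greeter.new
puts greeter.plain_greet
puts greeter.greet
puts greeter.shout
```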

[Caveat (category)]: Object#send invokes private methods

It totally bypasses any privacy restrictions that the author of that class may have put in. In other words, it lets you call methods that have been marked as "private".

It is arguably a [good thing (category)] that Ruby provides a way to call private methods -- for those times when you really need to -- but in general this should be considered a [bad practice (category)].

Sometimes, it can even lead to some unexpected behavior. For instance, when a private method by that name exists but a public method does not. Example:

This example is taken from a real-world problem we ran across with Rails. We (Lance/Jim) discovered some really strange behavior where it threw an error if we tried to invoke a method via send (which the plugin in question (ActiveScaffold) was probably doing, since it was probably just looping through a list of column names) but it worked fine if we invoked the method directly!

record.send(:task)
=> ArgumentError (0 for 1)

but:

record.task
=> "test"

It is very unintuitive that these two ways of calling a method should behave differently!

The reason the send(:task) was raising an ArgumentError in our particular case was because we were somehow requiring "rake" somewhere in our code (something we probably shouldn't have been doing), and Rake was mixing in some Rake-specific private methods into the Object class (something it probably shouldn't have been doing), among them task. This private method was defined by Rake to expect 1 argument.

We were only passing 0 arguments, however, because we expected task to be a normal old attribute reader method for our ActiveRecord model, and obviously you shouldn't have to pass any arguments just to ask a model what the current value of one of its attributes is.

However, due I believe to some performance optimizations done by the ActiveRecord developers, those methods are not even created until they are needed. In other words, the first time you call an attribute reader, the message is caught by method_missing, which then dynamically creates the reader methods for you.

So in our case, we just got unlucky with our timing. If we had already accessed one of the model's attributes in the "normal" (non-send) way prior to this point in the code, then the public method would have already existed, the private task method would not have been called, and this code would have worked as designed.

(Should also be filed under / transcluded by: "Caveat: ActiveRecord attribute accessor methods aren't created until first accessed", "Ruby / Problems with creating methods at run-time")

Wouldn't it be nice, then, if there were a "send" method that would only send to public methods?

Fortunately, the Facets core library provides just that, in the form of object_send. It will invoke the public method provided as its first argument, or (if there isn't a matching method by that name) method_missing, or (if there is no method_missing) raise an error (not sure if I like that behavior -- shouldn't it just send method_missing, in case we want a parent class to handle it?).

Currently (1.8.51) implemented as:

  # File lib/facets/core/kernel/object_send.rb, line 9
  def object_send(name,*args,&blk)
    #instance_eval "self.#{name}(*args)"
    if respond_to?(name)
      send(name,*args,&blk)
    elsif respond_to?(:method_missing)
      method_missing(name,*args,&blk)
    else
      raise NoMethodError
    end
  end

Question: Why is that in Kernel and not in Object?

Really, Object#send should only send to public methods (by default)! It should only send to private methods if you specifically request it to consider private methods as candidates (perhaps via an :ignore_privacy => true option).

We shouldn't have to use a custom-written method to get the public-only behavior.
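For what it's worth, Ruby 1.9 and later ship Object#public_send in core, which behaves along these lines: it refuses to call private methods. A sketch with a made-up Vault class:

```ruby
class Vault
  def label
    'vault'
  end

  private

  def combination
    '12-34-56'
  end
end

vault = Vault.new

via_send        = vault.send(:combination)    # send ignores privacy
via_public_send = begin
  vault.public_send(:combination)             # public_send respects it
rescue NoMethodError => e
  e.class
end
public_ok = vault.public_send(:label)

p via_send
p via_public_send
p public_ok
```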

 


Syntax


Rescue statement modifier within method call

irb -> p(Dir.entries(d).grep(/erb/) rescue)
SyntaxError: compile error
(irb):7: syntax error, unexpected kRESCUE_MOD, expecting ')'
p(Dir.entries(d).grep(/erb/) rescue)
                                   ^
        from (irb):7

To get around this, you'll need to throw on one more set of parentheses:

irb -> p((Dir.entries(d).grep(/erb/) rescue []))
[]
    => nil
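The same trick works inside a method body. In this sketch the helper name erb_entries and the directory name are made up; the directory is assumed not to exist, so Dir.entries raises and the rescue modifier supplies the fallback.

```ruby
# Return the directory entries matching /erb/, or [] if the directory can't be read.
def erb_entries(dir)
  (Dir.entries(dir).grep(/erb/) rescue [])
end

# The extra parentheses let the rescue modifier sit inside a method call's argument list.
p((erb_entries('/whynot/sketch/no/such/directory')))

fallback = erb_entries('/whynot/sketch/no/such/directory')
```

Keep in mind that a bare rescue modifier catches any StandardError, not just "directory not found".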

Multi-line comments

=begin
Your
comment
here
=end

Useful for commenting out a block of code.

You can also tell it what type of comment. The following special types are ones that I know of so far:

  • =begin rdoc -- begins an RDoc block
  • =begin test -- begins an inline test that will be read by Reap's test extraction tool


branching structures (if, case, ...) can be used inline in expressions even!

irb -> 'Is it true? ' + if false then 'yes' else 'no' end
    => "Is it true? no"

statement/expression "modifiers" (comes after expression)

the "if" expression modifier

This can apply to a single statement:

irb -> puts 'statement' if false
    => nil

or to a block (of any size):

irb -> begin; puts 'hi'; puts 'bye'; end if false
    => nil

irb -> begin; puts 'hi'; puts 'bye'; end if true
hi
bye
    => nil

You can also have the if clause apply to a section of code by enclosing in parentheses:

irb -> (puts 'First statement'; puts 'Second statement') if false
    => nil

irb -> puts 'First statement'; puts 'Second statement' if false
First statement
    => nil

Order of evaluation of statement modifiers

Example 0:

These don't work:

irb -> puts message if message = 'hi'
NameError: undefined local variable or method `message' for main:Object
        from (irb):1

FileUtils.rm(file) if File.exist?(file = '/path/to/file')

Example 1: This works:

        diff_output = Subversion.diff(external.container_dir)
        puts diff_output unless diff_output.blank?

but you can't (unfortunately) condense that into one statement. In other words, this doesn't work as desired:

        puts (diff_output = Subversion.diff(external.container_dir)) unless diff_output.blank?

See:

irb -> require 'rubygems'; require 'active_support'
irb -> puts (diff_output = "some long diff output") unless diff_output.blank?
    => nil

That's because when the Ruby parser hits "diff_output =", even though it doesn't evaluate it at that point, it still initializes diff_output to nil. Next, diff_output.blank? is evaluated and finds diff_output to still be nil, so it returns true. The modifier therefore amounts to unless true, and the statement it is protecting (which actually performs the diff_output = "some long diff output" assignment) does not get evaluated. Kind of unintuitive and disappointing at times, but not too bad. We just have to write this as two lines instead of one in this case.

Example 2: validating arguments

Suppose you have a multiple-choice "option" named how and you want to validate that the caller gave it one of the valid options. If it's not one of the valid options, you want to print out a nice error message that lists what the valid options are. The setup code is:

    options = args.last.is_a?(Hash) ? args.last : {}
    how = options.delete(:how) || :capture

Now how would we validate the how option?

    raise ArgumentError.new(":how option must be one of #{valid_options.inspect}") unless (valid_options = [:capture, :exec, :popen]).include? how

That actually works, because the unless clause (which gets executed first) sets up the array for the "then" clause...

irb -> how = :vapidly
    => :vapidly
irb -> raise ArgumentError.new(":how option must be one of #{valid_options.inspect}") unless (valid_options = [:capture, :exec, :popen]).include? how
ArgumentError: :how option must be one of [:capture, :exec, :popen]
        from (irb):6
        from :0

irb -> how = :capture
    => :capture
irb -> raise ArgumentError.new(":how option must be one of #{valid_options.inspect}") unless (valid_options = [:capture, :exec, :popen]).include? how
    => nil

We could have also written it like this:

    unless (valid_options = [:capture, :exec, :popen]).include? how
      raise ArgumentError.new(":how option must be one of #{valid_options.inspect}")
    end
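If the inline assignment feels too clever, one alternative is to name the valid choices once in a constant. This is only a sketch: VALID_HOW and extract_how are names invented here, not part of the original code.

```ruby
# Hypothetical sketch: the list of valid choices lives in one constant.
VALID_HOW = [:capture, :exec, :popen].freeze

def extract_how(*args)
  options = args.last.is_a?(Hash) ? args.last : {}
  how = options.delete(:how) || :capture
  unless VALID_HOW.include?(how)
    raise ArgumentError, ":how option must be one of #{VALID_HOW.inspect}"
  end
  how
end

p extract_how                 # :capture
p extract_how(:how => :exec)  # :exec
```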

Example 3

Suppose you want to print the results of some method call, but only if the results are not ""...

It feels like you should be able to do it with one line, like this...

puts output = obj.some_method() unless output == ""

but that doesn't work.

We're forced to break it into 2 lines.

output = obj.some_method()
puts output unless output == ""

In this case, if all you're worried about is getting blank lines when the method returns a blank line, you could use print instead.

print output = obj.some_method() unless output == ""

But that doesn't solve the more general problem, which is that an if/unless statement modifier (that comes after the statement) can never check and respond to an input that exists only after running the statement to which the modifier applies. This is because the if/unless statement modifier is evaluated before the statement it is guarding.

It looks to me like the only way to do what I'm trying to do with a single statement is to use the "regular" old unless/then/else statement, with the assignment inside its condition and the puts inside its body.

irb -> unless (a = "") == "" then puts a else 'doing nothing' end
    => "doing nothing"

irb -> unless (a = "not an empty string") == "" then puts a else 'doing nothing' end
not an empty string
    => nil

So, applying that to my more specific example:

unless (output = obj.some_method()) == "" then puts output else end

Not as concise or readable as I'd hoped, but it does get the job done in one line.
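Another one-statement form that appears to work puts the assignment on the left of an or, so the variable is already assigned by the time it is read. This is a sketch only: print_unless_empty, its lambda argument, and the out parameter are all made up here so the behavior can be checked.

```ruby
require 'stringio'

# Hypothetical helper. The assignment sits to the LEFT of the test, so by the
# time `output` is read it is already a known, assigned local variable.
def print_unless_empty(source, out = $stdout)
  (output = source.call).empty? or out.puts output
end

buffer = StringIO.new
print_unless_empty(-> { '' }, buffer)             # writes nothing
print_unless_empty(-> { 'some output' }, buffer)  # writes the string
print buffer.string                               # some output
```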

"rescue" statement modifier / how to execute multiple statements as if they were one

Be sure you realize that only the statement immediately following the rescue belongs to the rescue clause (will only be executed if an exception is raised). Any statements after that will be executed unconditionally.

To demonstrate:

irb -> "Something that doesn't raise an exception." rescue p 'something'; p 'else'
"else"
    => nil

If you need to execute multiple statements as if they were one, just throw a begin/end around them! This would be the correct way to do what the above example was trying to do:

irb -> "Something that doesn't raise an exception." rescue begin; p 'something'; p 'else'; end
    => "Something that doesn't raise an exception."

irb -> raise "An exception"                         rescue begin; p 'something'; p 'else'; end
"something"
"else"
    => nil
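The rescue modifier is also handy for supplying a fallback value. A minimal sketch with two invented helper names; the second one shows that a begin/end group on the right-hand side evaluates to its last statement.

```ruby
# Hypothetical helper: the rescue modifier supplies a fallback value.
def to_int_or_zero(text)
  Integer(text) rescue 0
end

# begin/end groups several fallback statements into one expression;
# its value is the value of the last statement inside it.
def to_int_logged(text, log)
  Integer(text) rescue begin; log << "bad input: #{text.inspect}"; -1; end
end

p to_int_or_zero('42')   # 42
p to_int_or_zero('abc')  # 0
```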

||= (conditional initialization operator)

I love this operator. It lets you initialize a variable to a default only if the variable isn't already set.

My only complaint about it is that it doesn't work for booleans. (It also sometimes doesn't work how I'd like for hashes, arrays, as I'll explain in a bit.)


Ruby 101 for .NET developers: The strange ||= operator (http://www.softiesonrails.com/2007/2/6/ruby-101-for-net-developers-the-strange-OR-operator). Retrieved on 2007-03-29 14:44.

Sometimes in Ruby you'll come across some code that looks like this:

    stuff ||= [ ]

or maybe

    stuff ||= { }

What's going on here? First, if you're new to Ruby, you may not realize how the || operator really works, so let's digress for a minute and talk about that first. The || operator is a short-circuited logical OR operator. If I say

    a = "hello"
    b = a || "goodbye"

What is the value of b? It will be "hello". Since a evaluated to something that's not nil and not false, it stopped right there (short circuited itself) and assigned b a reference to a. But suppose a was nil instead:

    a = nil
    b = a || "goodbye"

Now, b will become "goodbye". Since a was nil, the OR operator continued to evaluate the next expression, which was "goodbye". So b becomes "goodbye". If you're with me so far, then you're already rounding third base. Now, you probably already know that this:

    a = a + 5

can be shortened to this:

    a += 5

right? In Ruby, this same idea gets applied to the OR operator. Instead of writing this:

    a = a || "baseball"

This means, if a already has a value, then keep it; but if it's nil, then assign it to "baseball". But horrors, what a lot of typing! A good Ruby programmer would do this instead and save two whole keystrokes:

    a ||= "baseball"

In other words, a will become "baseball" if it was nil (or false) before, otherwise it will just keep its original value.


[caveats (category)] Doesn't work for booleans

irb -> def foo(a) a ||= true; a end
    => nil

irb -> foo(false)
    => true

Sets a to true (the default value) even though a non-default value (false) was passed in! The consequences of not understanding this important point could be tragic! It's the difference between true and false.

Workaround:

irb -> def foo(a) a ||= true if a.nil?; a end
    => nil

irb -> foo(false)
    => false

More verbose, but it gives the (IMHO) desired behavior.
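Since the a.nil? guard already decides whether to assign, a plain = does the same job as ||= there. A minimal sketch contrasting the two; both method names are invented for illustration.

```ruby
# Hypothetical method: only nil means "not given"; false is a real value.
def verbose_setting(flag = nil)
  flag = true if flag.nil?
  flag
end

# For contrast: ||= clobbers an explicit false.
def verbose_with_or_equals(flag = nil)
  flag ||= true
  flag
end

p verbose_setting(false)         # false
p verbose_with_or_equals(false)  # true
```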

Better solution? Well, we can't just make up a new operator in Ruby, unfortunately, but we might be able to solve it with a new method...

irb -> class Object; def default(default) self = default if self.nil? end; end
SyntaxError: compile error
(irb):33: Can't change the value of self
class Object; def default(default) self = default if self.nil? end; end

Ohp! Never mind. I guess Ruby doesn't like that. I bet it woulda worked though...

irb -> class Object; def default(default) puts "Setting to default value of #{default}" if self.nil? end; end

irb -> foo(false)
    => nil

irb -> foo()
Setting to default value of true
    => nil

What if you want a variable holding [] to be treated like a nil (uninitialized) one, so that it gets set to a default if it is either nil or []?

irb -> a = []
    => []

irb -> (!a.empty?)
    => false

irb -> (!a.empty? || a)
    => []

irb -> !a.empty? || (a = ['default', 'array'])
    => ["default", "array"]

irb -> a
    => ["default", "array"]


irb -> a = []
    => []

irb -> a = ['default', 'array'] if a.empty?
    => ["default", "array"]

That works!

But it wouldn't work so well if a were nil.

irb -> a = nil
    => nil

irb -> !a.empty? || (a = ['default', 'array'])
NoMethodError: undefined method `empty?' for nil:NilClass
        from (irb):14

Unless we did this:

irb -> class NilClass; def empty?; true; end; end
    => nil

irb -> !a.empty? || (a = ['default', 'array'])
    => ["default", "array"]

or this:

irb -> a && !a.empty? || (a = ['default', 'array'])
    => ["default", "array"]

or this:

irb -> a = nil
irb -> a = ['default', 'array'] unless a && !a.empty?
    => ["default", "array"]

or this:

irb -> a = nil
irb -> a = ['default', 'array'] if a.nil? || (a && a.empty?)
    => ["default", "array"]

irb -> a = []
irb -> a = ['default', 'array'] if a.nil? || (a && a.empty?)
    => ["default", "array"]

How about the case where it is not nil or []?

irb -> a = ['existing', 'array']
    => ["existing", "array"]

irb -> a = ['default', 'array'] if a.nil? || (a && a.empty?)
    => nil

irb -> a
    => ["existing", "array"]

Cool, that works. Except that the assignment line returns nil, which may not be what we want. Maybe we'd rather the assignment line returned a ... ?


It'd be kind of neat if we could do this Python-esque (?) syntax...

irb -> (a = ['default', 'array'] if a.empty? else a)
SyntaxError: compile error
(irb):10: syntax error, unexpected kELSE, expecting ')'
(a = ['default', 'array'] if a.empty? else a)
                                          ^

... but, that's not valid in Ruby.

 

Once again assuming this...

irb -> class NilClass; def empty?; true; end; end
    => nil

Let's see if we can get the assignment line to return a.

irb -> a = []
    => []

irb -> ((a = ['default', 'array'] if a.empty?) || a).each {|e| puts e}
default
array
    => ["default", "array"]

Good, it used the default!

irb -> a = ['existing', 'array']
    => ["existing", "array"]

irb -> ((a = ['default', 'array'] if a.empty?) || a).each {|e| puts e}
existing
array
    => ["existing", "array"]

Good, it used the existing value of a (rather than overriding with a default)!

When would you want to do that? I don't know.

Here's the situation that originally motivated me to solve that problem: I wanted to use the value of ARGV (which I expected to be a list of directories), unless ARGV was empty ([]), in which case I just wanted by default to use the current directory (['.']). I didn't have any control of ARGV; it's always an array ([] if there are no args), so I couldn't just use ||=, because that only works if it is nil.

But then I realized that ARGV is a constant, so I shouldn't be reassigning it with something like ||= anyway (Ruby warns when an already-initialized constant is reassigned). This is what I ended up doing:

      ((['.'] if ARGV.empty?) || ARGV).each do |dir|
        ...
      end

and then changed to:

      (if ARGV.empty? then ['.'] else ARGV end).each do |dir|
        ...
      end
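The "nil or empty means use the default" rule can also be wrapped in a small helper, which returns the chosen value instead of reassigning anything. A sketch only: default_if_blank is a made-up name, not a built-in.

```ruby
# Hypothetical helper: treat nil and empty collections alike.
def default_if_blank(value, default)
  (value.nil? || value.empty?) ? default : value
end

# ARGV is left untouched; we just pick what to iterate over.
dirs = default_if_blank(ARGV, ['.'])
dirs.each { |dir| puts dir }
```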

[caveats (category)] Can't do (var ||= "") += "string"

Why would I want to do that? So that I don't have to write it as two lines, of course! Compactness.

The problem, you see, is that you can't call + on nil variables (a default behavior of NilClass that I prefer to change)

irb -> string += 'some thing'
NoMethodError: undefined method `+' for nil:NilClass
        from (irb):1

I could always do this:

string ||= ""
string += "string"

But that's sort of a pain. I sort of wish I could do this:

irb -> (string ||= "") += 'some thing'
SyntaxError: compile error
(irb):2: syntax error, unexpected tOP_ASGN, expecting $end
(string ||= "") += 'some thing'
                  ^
        from (irb):2

The reason it doesn't like that, of course, is that the l-value of an assignment (which += is) must be a variable, not an expression. I sort of think it should be able to use the variable mentioned in the expression (string), but oh well.

This works, however:

irb -> string += 'this' if string ||= ''
    => "this"

irb -> string += ' works' if string ||= ''
    => "this works"
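If modifying the string in place is acceptable, << can stand in for += here, because the left side of << may be any expression. A minimal sketch; append_twice is a made-up name, and String.new is used so the starting string is guaranteed to be mutable.

```ruby
# Hypothetical helper: appends to an accumulator that may not exist yet.
# Unlike +=, << modifies the existing String object instead of building a new one.
def append_twice(string = nil)
  (string ||= String.new) << 'this'
  (string ||= String.new) << ' works'
  string
end

p append_twice  # "this works"
```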

[Parallel assignment (category)]

See also: Pickaxe p. 340 (the rules), Pickaxe p. 90 (examples)

Why would you want to use it?

  • it's concise!
  • you can use it to swap variables (a, b = b, a)

lvalues, ... = *rvalue

"The rvalue is replaced with the elements of the array, with each element forming its own rvalue."

irb -> a, b, c = *[1, 2, 3];  "a=#{a}, b=#{b}, c=#{c}"
    => "a=1, b=2, c=3"

lvalues, ... = array

irb -> array = [1, 2, 3]
irb -> a, b, c = array;  "a=#{a}, b=#{b}, c=#{c}"
    => "a=1, b=2, c=3"

If there are more lvalues than rvalues...

... the excess will be assigned the value nil.

irb -> a, b, c = 1;  "a=#{a.inspect}, b=#{b.inspect}, c=#{c.inspect}"
    => "a=1, b=nil, c=nil"

If the last lvalue is prefixed with a *: a, ..., *catch_all = ...

(without the *)

irb -> a, b, catch_all = 1, 2, 3, 4, 5; "a=#{a}, b=#{b}, catch_all=#{catch_all.inspect}"
    => "a=1, b=2, catch_all=3"

but (with the *):

irb -> a, b, *catch_all = 1, 2, 3, 4, 5; "a=#{a}, b=#{b}, catch_all=#{catch_all.inspect}"
    => "a=1, b=2, catch_all=[3, 4, 5]"

It also works when there is only one lvalue and it is prefixed with a *.

irb -> *vars = 1, 2, 3; vars
    => [1, 2, 3]

More examples...

irb -> array = [3, 4, 5]

irb -> a, b, not_really_a_catch_all = 1, 2, array; "a=#{a}, b=#{b}, not_really_a_catch_all=#{not_really_a_catch_all.inspect}"
    => "a=1, b=2, not_really_a_catch_all=[3, 4, 5]"
# Simply assigns the value of array to not_really_a_catch_all

irb -> a, b, *catch_all = 1, 2, *array; "a=#{a}, b=#{b}, catch_all=#{catch_all.inspect}"
    => "a=1, b=2, catch_all=[3, 4, 5]"
# Has the same end result, but uses a catch-all and a splat (*array).

irb -> a, b, *catch_all = 1, 2, array; "a=#{a}, b=#{b}, catch_all=#{catch_all.inspect}"
    => "a=1, b=2, catch_all=[[3, 4, 5]]"
# Probably not what you want.

irb -> a, b, catch_all = 1, 2, *array; "a=#{a}, b=#{b}, catch_all=#{catch_all.inspect}"
    => "a=1, b=2, catch_all=3"
# The splat was too big to catch! The excess from the splat (4, 5) just ended up being ignored.

They are performed in parallel (effectively), so swapping is possible

irb -> a, b = 1, 2
    => [1, 2]

irb -> a, b = b, a;  "a=#{a}, b=#{b}"
    => "a=2, b=1"

But they are also performed in order, so if an rvalue has side-effects, it may affect the next rvalue

irb -> a, b, c  =  (x = 0), (x += 1), (x += 1)
    => [0, 1, 2]

irb -> x
    => 2

The return value of the multiple assignment is an array

irb -> return_val = (a, b, c = 1, 2, 3)
    => [1, 2, 3]
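Two of the rules above, wrapped in methods so they can be tried outside irb. This is just a sketch; head_and_tail and swap are names invented for illustration.

```ruby
# Hypothetical helper: a catch-all lvalue soaks up whatever is left over.
def head_and_tail(list)
  head, *tail = list
  [head, tail]
end

# Hypothetical helper: the rvalues are evaluated before any assignment happens.
def swap(a, b)
  a, b = b, a
  [a, b]
end

p head_and_tail([1, 2, 3])  # [1, [2, 3]]
p head_and_tail([])         # [nil, []]
p swap(1, 2)                # [2, 1]
```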

[Example of * operator with an Array (category)]

Either of these works:

irb -> a, b, c = *['something']*3
    => ["something", "something", "something"]

irb -> "a=#{a}, b=#{b}, c=#{c}"
    => "a=something, b=something, c=something"
irb -> a, b, c = ['something']*3
    => ["something", "something", "something"]

irb -> "a=#{a}, b=#{b}, c=#{c}"
    => "a=something, b=something, c=something"

[values, not references (category)]

This should be self-evident to the astute Rubyist, but I've forgotten it at least once so I'll just give this reminder for my own sake then.


irb -> a, b, c = 1, 2, 3
    => [1, 2, 3]

irb -> vars = [a, b, c]
    => [1, 2, 3]
# vars now contains a reference to a, b, and c, right? Wrong.

irb -> *vars = 4, 5, 6;  "a=#{a}, b=#{b}, c=#{c}"
    => "a=1, b=2, c=3"

irb -> vars
    => [4, 5, 6]

a, b, and c are not changed at all by this. Only vars (the actual lvalue) is changed.

Are variables references, pointers, or what?? / Do things get passed/returned by value or by reference?

This concept/question is actually very important for every serious Ruby developer to understand. I know it's sure gotten me very confused before.

The short version is [may not be 100% technically accurate, but conceptually it works to describe what's going on]:

  • variables are really just pointers to objects
  • most operations you do to a variable (var.operate!) actually end up changing the object to which it points
  • variable assignment (var = ) is different: it points the variable to a different object; it does not modify the original object!

You can tell what object a variable "points to" by inspecting its object_id. Each distinct object has a unique object_id. It is possible for more than one variable to point to the same object (and thus have the same object_id).

If we were to compare Ruby with C++, Ruby's variables are a lot like by-reference in C++: you can't just change what they point to with as much freedom as you can with C++ pointers, but -- much like by-reference argument passing in C++ -- the object you pass in to a method can be modified by the method. Likewise, if a method returns an object, that object can be modified by the caller. Don't you forget that.
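The three bullet points above, condensed into one sketch. The method name aliasing_demo is made up, and equal? is used as a shorthand for comparing object_ids.

```ruby
def aliasing_demo
  original = String.new('ruby')
  alias_var = original                  # a second name for the same object
  alias_var << '!'                      # modifies the shared object
  shared = original.equal?(alias_var)   # true: both names point at one object

  alias_var = 'python'                  # assignment re-points alias_var only
  [original, alias_var, shared, original.equal?(alias_var)]
end

p aliasing_demo  # ["ruby!", "python", true, false]
```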

{ Some examples using strings

These examples illustrate that method calls on a variable do not change the "object_id" of that variable. In other words, after calling these methods, the variable still points to the same object.

irb -> a = "a"; a.object_id
    => -604332868

irb -> a.replace("b"); a.object_id
    => -604332868

irb -> a.concat "b"; a.object_id
    => -604332868

irb -> a.upcase!; a.object_id
    => -604332868

See how the object_id of a was -604332868 in all cases?

Now here are some examples to illustrate that using = on a variable to do an assignment will usually cause the variable to point to a new/different object.

I said usually, because sometimes an assignment will not change the target of the variable, like when you assign a variable to itself, or pass the variable to a method that returns the object.

irb -> a = "a"; a.object_id
    => -604365948

irb -> a = a; a.object_id
    => -604365948

irb -> def return_arg(arg) arg end; a = return_arg(a); a.object_id
    => -604365948

But usually the expression on the right-hand side of an assignment produces a new object, so the variable ends up pointing to that new object. Observe:

irb -> a = "a"; a.object_id
    => -604212148
# That's what we start with.

irb -> a = "b"; a.object_id
    => -604251578
# That's really just a shorter way of saying:
irb -> a = String.new("b"); a.object_id
    => -604265298
# Each string literal actually constructs a new object, even if you *spell* the string exactly the same. Symbols do not have this behavior, which is partly what makes them so cool...

irb -> print "a".object_id, " ", "a".object_id
-604361848 -604361868    => nil    # (different objects)
irb -> print :"a".object_id, " ", :"a".object_id
164578 164578    => nil            # (same object...every time!)

irb -> a = "a" + ""; a.object_id
    => -604235568
# Even though the resulting string *looks* the same ("a") -- the string is still "spelled" the same before and after -- the + operator actually (unconditionally) results in a new object being constructed.

irb -> a += "b"; a.object_id
    => -604280198
# You might *think* that this would just add "b" to the end of the existing object (to which a points). It doesn't though; it creates a new object. Use a.concat("b") if you want it to *modify* a's current object rather than creating a new object.
# a += "b" is just short for this (and we've already observed that String#+ *always* creates a new String object):
irb -> a = a + "b"; a.object_id
    => -604318858

irb -> def return_arg(arg) arg + "" end; a = return_arg(a); a.object_id
    => -604228528

Did you see how every one of those operations showed a to have a different object_id? That's because each one of them actually created a new String object and associated the variable a with that new String object.

The docs [1] confirm what we have already observed about the + operator:

str + other_str => new_str Concatenation—Returns a new String containing other_str concatenated to str. "Hello from " + self.to_s #=> "Hello from main"

} Some examples using strings

Are variables passed/returned by value or reference?

You've already seen one example where they are both passed and returned by reference.

irb -> a = "a"; a.object_id
    => -604250678

irb -> def return_arg(arg) arg end; a = return_arg(a); a.object_id
    => -604250678

(the exact same object that is passed ends up being returned)

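So I guess that answers your question, doesn't it. The rest of my examples will just further illustrate that, but the concept is really easy to remember: what gets passed and returned is always a reference to the object, never a copy of the object [if I understand correctly]. (The reference itself is copied, though, so reassigning a parameter inside a method does not affect the caller's variable.)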

So even though the following returns a different object than is passed in, the object that the caller gets back is the same object that existed when we called return from within the method.

irb -> def return_arg(arg) return arg + "" end; a = return_arg(a); a.object_id
    => -604347018

This can be illustrated even more clearly by putting a puts inside of the method...

irb -> def return_arg(arg)
         new_str = arg + ""; 
         puts new_str.object_id;
         new_str
       end;
       a = return_arg(a); a.object_id
-604280118
    => -604280118
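A sketch of what this means for the caller: a method can change the object it was handed, but it cannot make the caller's variable point somewhere else. The method names rebind and mutate are hypothetical.

```ruby
# Hypothetical methods: one re-points its parameter, the other modifies the object.
def rebind(arg)
  arg = arg + ' (new object)'   # local re-pointing; the caller's variable is untouched
  arg
end

def mutate(arg)
  arg << ' (mutated)'           # changes the object the caller also holds
  arg
end

text = String.new('hello')
rebind(text)
puts text    # hello
mutate(text)
puts text    # hello (mutated)
```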

[Caveats (category)] Attribute reader methods can actually be used to modify the object's instance variables

You may think your object's private data is safe just because you've provided an attr_reader and not an attr_writer. This is not necessarily the case, however, so you need to be careful!

In particular, an attr_reader method will return the attribute (instance variable) by reference, so the caller can modify the instance variable all she likes. This may disturb some folks, because it seems like the reader should be read-only.

If this were C++, we'd even declare the reader as const. But you don't have quite that degree of protection of your object's internal data as you do in C++, unfortunately. In fact, there's a way to get around practically every encapsulation "convention" provided by Ruby; they're really just "conventions", not rules that are enforced by Ruby (use obj.send(:private_method_A), for instance, to call private_method_A as if it were a public method). [To do: move to its own section].

class A
  attr_reader :options
  def initialize
    @options = []
  end
end

a = A.new
puts a.options.object_id  # to show that the a.options() method returns the variable it is reading *by reference*...
a.options() << 'flum1'    # and with this "reference" to A's private @options data, we can do whatever we'd like!
a.options() << 'flum2'
p a.options
-604226398
["flum1", "flum2"]

Illustration of how to create a "reference" to an object (so you have 2 variables pointing to the same object)...

my_ref = a.options
p my_ref.object_id        # to show that my_ref *points* to the same object as a.options
my_ref.replace ['zroo']   # we operate on my_ref, but end up changing the object that a.options returns (since they point to the same object)
p a.options
-604226398
["zroo"]

Also note (yet again) that this trick only works if the reader returns the very object held in the instance variable, rather than a new object produced by an "expression"... If you were to implement your attribute reader like this...

  def options
    @options | []
  end

...instead of like this...

  def options
    @options
  end

...then you would effectively make it read-only! a.options().replace, a.options() << element would modify some object, certainly, but it could never be used to modify @options. (Take that, you people-who-try-to-directly-access-my-object's-instance-variables-rather-than-going-through-the-interface-I've-provided!)

Also note how in the following example...

class A
  def initialize
    @options = []
  end
  def options
    @options | []
  end
end

a = A.new
puts a.options.object_id
puts a.options.object_id
puts a.options.object_id
a.options() << 'flum2'
p a.options

...each call to a.options returns a different object. (Has that fact sunk into your mind yet??)

-604404818
-604404858
-604404898
[]
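A more conventional way to get the same protection is to hand out a copy with dup. A minimal sketch; the class name Settings and the add_option method are invented for illustration. Note that dup is shallow, so the elements themselves are still shared with the caller.

```ruby
class Settings
  def initialize
    @options = []
  end

  # Hands out a copy, so callers cannot reach @options through the reader.
  # (The copy is shallow: the element objects are still shared.)
  def options
    @options.dup
  end

  # The intended way to change the list.
  def add_option(name)
    @options << name
    self
  end
end

settings = Settings.new
settings.options << 'flum1'   # modifies a throwaway copy
settings.add_option('flum2')
p settings.options            # ["flum2"]
```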

How to make a new variable "reference" an existing object ([Examples (category)] of the difference between = and replace())

Simplest possible example:

irb -> a = "a"; a.object_id
    => -604430038

irb -> my_ref = a; my_ref.object_id
    => -604430038

irb -> my_ref.replace("c")
    => "c"

irb -> a
    => "c"

Example that shows that arrays maintain "references" (rather than just the values) to their member objects too...

irb -> a = "a"; b = "b"; a.object_id
    => -604430038

irb -> array = [a, b]
    => ["a", "b"]

irb -> array[0].object_id
    => -604430038

irb -> my_ref = array[0]
    => "a"

irb -> my_ref = array[0]; my_ref.object_id
    => -604430038

irb -> my_ref = "c"; my_ref.object_id
    => -604549458

irb -> my_ref = array[0]; my_ref.object_id
    => -604430038

irb -> my_ref.replace("c"); my_ref.object_id
    => -604430038

irb -> a
    => "c"

([Examples (category)] of the difference between = and replace())

irb -> def foo(input)
         puts $a.object_id
         puts input[0].object_id
         $a = 'b'
         puts $a.object_id
         puts input[0].object_id
         puts input[0]
       end
    => nil

irb -> $a, $b = 'a', 'b'
    => ["a", "b"]

irb -> foo([$a, $b])
-604472168  # (the original object (containing 'a'))
-604472168  # (the original object (containing 'a'))
-604486558  # $a was changed to point to the object containing 'b'
-604472168  # but input[0] still points to the original object (containing 'a')
a           #     input[0] still points to the original object (containing 'a')
    => nil

irb -> $a
    => "b"  # $a was changed to point to the object containing 'b'

[Problems (category)] Why I wish Ruby had proper pointers (even if every object did have a replace() method, that still doesn't let you change types) ([Examples (category)] of the difference between = and replace())

Sometimes you want to have your method accept an array of objects and then modify those objects and have your changes visible outside of the method (as opposed to just modifying local variables).

The following example using Strings illustrates how this is possible with classes that implement the replace() method.

require 'stringio'
def change_vars!(vars)
  puts "Before:"
  puts vars.map { |v| v.inspect }.join(", ")

  vars.each do |v|
    v.replace 'new'
  end

  puts "After:"
  puts vars.map { |v| v.inspect }.join(", ")
end

$a, $b = ['old']*2
change_vars! [$a, $b]
Before:
"old", "old"
After:
"new", "new"

Note, however, that replace() can't be used to turn the object into an instance of a different class.

require 'stringio'
def change_vars!(vars)
  puts "Before:"
  puts vars.map { |v| v.inspect }.join(", ")

  vars.each do |v|
    v.replace StringIO.new
  end

  puts "After:"
  puts vars.map { |v| v.inspect }.join(", ")
end

$a, $b = ['old']*2
change_vars! [$a, $b]
Before:
"old", "old"
...:in `replace': can't convert StringIO into String (TypeError)
        from temp.rb:11:in `change_vars!'
        from temp.rb:10:in `each'
        from temp.rb:10:in `change_vars!'
        from temp.rb:19

Those were contrived examples. Here's what I really want to be able to do:

  • pass an array of objects to capture_output; these objects will be of type IO
  • in capture_output, I want to be able to use an iterator and change every object in the array that was passed in to be a different object (a new object, of type StringIO)
require 'stringio'
def capture_output(vars = [$stdout], &block)
  puts "Before:"
  puts vars.map { |v| v.inspect }.join(", ")

  vars.each do |v|
    v = StringIO.new
  end

  puts "After:"
  puts vars.map { |v| v.inspect }.join(", ")
end

capture_output [$stdout, $stderr]

This proves to be impossible to do the way I had imagined doing it for the following reasons:

  • Even though v is a "reference" to $stdout, when I do v = StringIO.new, it breaks the reference and causes v (a local variable, by the way) to be a "reference" to the newly constructed StringIO object. $stdout is unchanged.
  • I would just use IO#replace, but that (1) doesn't exist, and (2) even if it did exist, all it could do would be to replace itself with another object of the same type, not with a different type (unless we did some Ruby/DL magic, but let's not go there...).

This output confirms that $stdout escapes our efforts to change it:

Before:
#<IO:0xb7f18030>, #<IO:0xb7f1801c>
After:
#<IO:0xb7f18030>, #<IO:0xb7f1801c>

Any other bright ideas?

Kernel has a global_variables method, but it just returns an array containing the names of all the global variables; it doesn't help you to modify those variables (like $GLOBALS does in PHP). I was hoping for a global_variable_set method like there is a class_variable_set and instance_variable_set.

You can't use eval() as an lvalue, or you might be able to do something like eval(global_name) = new_value.

If you are able to enumerate ahead of time all possible variables that might be passed in and write a special case for each of those variables, then the following solution "works"...

require 'stringio'
def capture_output(vars = [$stdout], &block)
  puts "Before:"
  puts vars.map { |v| v.inspect }.join(", ")

  vars.each do |v|
    case v
      when $stdout
        $stdout = StringIO.new
      when $stderr
        $stderr = StringIO.new
    end
  end

  puts "After:"
  puts vars.map { |v| v.inspect }.join(", ")
end

capture_output [$stdout, $stderr]

(this has the desired effect)

That's a huge "if", however. I want a solution that works for any variables that may be passed in -- not just input that matches one of the pre-defined allowed variables. I want a generic solution that works for any variables, and any number of variables.

Even if that requisite condition is satisfied, I'm certainly not satisfied with that solution -- all that duplication is uuuugly.

The following "solution" also "works":

require 'stringio'
def capture_output(vars = [$stdout], &block)
  puts "Before:"
  puts vars.map { |v| eval(v).inspect }.join(", ")

  vars.each do |v|
    eval(v + " = StringIO.new")
  end

  puts "After:"
  puts vars.map { |v| eval(v).inspect }.join(", ")
end

capture_output ["$stdout", "$stderr"]

But it is just as ugly. evaling strings containing variable names just to set a variable? That is stooping to the utter depths of programmer desperation. It's too kludgey. I'd just rather not use eval to get the job done, if you know what I mean.

But that's the best I've come up with. Please inform me of a better solution!

Proposed to Ruby language: Either of the following would be satisfactory to me:

  • Add a standard Pointer class to the language
  • Add a built-in Kernel#global_variable_set method.
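In the meantime, one way to fake a pointer without eval is to pass in a pair of closures that read and write the variable. This is only a sketch under that assumption: the Ref struct and this version of capture_output are made up here, and the caller still has to write one getter/setter pair per variable.

```ruby
require 'stringio'

# Hypothetical "reference": two closures that read and write some variable.
Ref = Struct.new(:getter, :setter) do
  def value;       getter.call;      end
  def value=(obj); setter.call(obj); end
end

# Points every ref at a fresh StringIO while the block runs, then restores
# the original objects. Returns the captured strings, one per ref.
def capture_output(refs)
  originals = refs.map(&:value)
  buffers = refs.map { StringIO.new }
  refs.zip(buffers) { |ref, buf| ref.value = buf }
  yield
  buffers.map(&:string)
ensure
  refs.zip(originals) { |ref, orig| ref.value = orig } if originals
end

stdout_ref = Ref.new(-> { $stdout }, ->(io) { $stdout = io })
stderr_ref = Ref.new(-> { $stderr }, ->(io) { $stderr = io })

out, err = capture_output([stdout_ref, stderr_ref]) do
  puts 'to stdout'
  $stderr.puts 'to stderr'
end
p out  # "to stdout\n"
p err  # "to stderr\n"
```

The closures do the job that eval did in the previous attempt, but the variable names are ordinary code rather than strings.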

Can I pass arguments by reference so that the method can return more than one return value?

No, not usually. But it shouldn't matter because Ruby provides a better way to return more than one return value.

def analyze_string(input, size)
  size = input.length
  puts size.object_id
  input
end

size = nil
puts size.object_id
puts analyze_string("Some string", size)
puts "And its size was: #{size}"
4
23
Some string
And its size was:

I must have told you a thousand times by now: var = does not change the object pointed to by var; instead, it re-points var to a different object.

"Fine", you say, "we'll just use replace".

"Okay, go for it!" I say, smiling because I know it won't work.

def analyze_string(input, size)
  size.replace input.length
  puts size.object_id
  input
end

size = Fixnum.new
puts size.object_id
puts analyze_string("Some string", size)
puts "And its size was: #{size}"
undefined method `new' for Fixnum:Class (NoMethodError)

How about this?:

...
size = 0
puts size.object_id
puts analyze_string("Some string", size)
puts "And its size was: #{size}"
undefined method `replace' for 0:Fixnum (NoMethodError)

"So we're out of luck then, right?"

"Well, not entirely. We just have to approach the problem from a different angle ... the Ruby way."

def analyze_string(input)
  return input[0..0], input.length
end

first_letter, size = analyze_string("Some string")
puts "The first letter was: #{first_letter}"
puts "And its size was: #{size}"
The first letter was: S
And its size was: 11

So there you have it: even though you can "pass arguments by reference" and change those objects from within your method some of the time (like when you pass in objects that respond to replace, such as String), it is probably not a good habit to get into, and it is certainly not the preferred way of returning multiple values.

Boolean expressions

[just for fun (category)] If some common English boolean phrases were translated into Ruby...

def die!; "You're dead." end
def skate; "I'm skating" end
xmp "skate or die!"
def skate; "I'm "; !"skating" end
xmp "skate or die!"


def ticket!; "You just got a ticket!" end
def click_it; "I clicked it" end
xmp "click_it or ticket!"
def click_it; not 'wearing a seat belt' end
xmp "click_it or ticket!"


class << it = Object.new
  def will_hurt; "Ouch! Hey, that hurt!" end
end
def litter; "I'm just a-litterin' away!" end
xmp "litter and it.will_hurt"
def litter; "Okay, I'll stop! I will"; not "litter any more." end
xmp "litter and it.will_hurt"
skate or die!
    ==>"I'm skating"
skate or die!
    ==>"You're dead."
click_it or ticket!
    ==>"I clicked it"
click_it or ticket!
    ==>"You just got a ticket!"
litter and it.will_hurt
    ==>"Ouch! Hey, that hurt!"
litter and it.will_hurt
    ==>false

Creating hashes without enclosing in {/}

A common Ruby idiom for methods is to make the last argument an "options" hash... This allows for some cleaner syntax and more flexibility in your calls...

  • you can specify as many options as you'd like, or none at all
  • you can specify them in any order
irb -> def m(arg1, arg2, options = {}); p [arg1, arg2, options]; end
    => nil

irb -> m 'arg1', 'arg2', :option1 => true, :option2 => :maybe
["arg1", "arg2", {:option1=>true, :option2=>:maybe}]
    => nil

It only works in certain cases, though: you can only omit the {/} when the hash is the last argument...

irb -> def m(arg1, arg2, options = {}, arg4 = nil); p [arg1, arg2, options, arg4]; end
    => nil

irb -> m 'arg1', 'arg2', :option1 => true, :option2 => :maybe, 'arg4'
SyntaxError: compile error
(irb):19: syntax error, unexpected '\n', expecting tASSOC
        from (irb):19
        from :0

If you need to pass an argument after your "options" hash, then you need to enclose the hash with {/}...

irb -> m 'arg1', 'arg2', {:option1 => true, :option2 => :maybe}, 'arg4'
["arg1", "arg2", {:option1=>true, :option2=>:maybe}, "arg4"]
    => nil

This more concise syntax without the {/} also works to a limited extent within array literals...


irb -> ['arg1', 'arg2'] + [:option_1 => 'hi']
    => ["arg1", "arg2", {:option_1=>"hi"}]

In Ruby 1.8 (used here), it seems to work only if the hash is the only element of the array literal, even when you put the hash as the last element, which seems kind of arbitrary if you ask me. (Ruby 1.9 and later accept a trailing hash after other elements.)

irb -> ['arg1', 'arg2'] + ['arg3', :option_1 => 'hi']
SyntaxError: compile error
(irb):25: syntax error, unexpected tASSOC, expecting ']'
['arg1', 'arg2'] + ['arg3', :option_1 => 'hi']
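
Coming back to the options-hash idiom from the start of this section, here is a small sketch of how the receiving method might combine the caller's options with defaults. The method name connect and the option names are invented for the example.

```ruby
# Hypothetical method: the option names and default values are made up.
def connect(host, port, options = {})
  defaults = { :timeout => 5, :retries => 0 }
  settings = defaults.merge(options)   # caller-supplied options win over defaults
  [host, port, settings]
end

no_options   = connect('localhost', 80)
some_options = connect('localhost', 80, :retries => 3)
p no_options
p some_options
```

Because merge only overrides the keys that were passed, callers can give any subset of options, in any order.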

Assigning a method call to a local variable of the same name

irb -> def foo
         'foo'
       end

Don't do this:

irb -> foo = foo
    => nil
irb -> foo
    => nil

When Ruby parses the foo on the right-hand side, it has already registered the local variable "foo" (because of the assignment on the left) and initialized it with a value of nil. (Recall that a bare name is treated as a local variable if the parser has already seen an assignment to it. It is only treated as a method call if there is no local variable by that name!) In other words, it is identical to doing this:

irb -> foo = nil
    => nil
irb -> foo = foo
    => nil

Instead, be sure to use the () symbols so that it knows that it is a method call.

irb -> foo = foo()
    => "foo"

irb -> foo
    => "foo"

Or, just assign it to a variable with a different name than your method:

irb -> snoo = foo
    => "foo"

No problem!
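
The same thing happens outside of irb. A minimal script version, assuming nothing beyond the foo method above (the wrapper method names shadowed and explicit_call are made up):

```ruby
def foo
  'foo'
end

def shadowed
  foo = foo      # the right-hand foo is the (nil) local, not the method
  foo
end

def explicit_call
  foo = foo()    # parentheses force a method call
  foo
end

p shadowed
p explicit_call
```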

Syntax: operator associativity / precedence / order of operations

Caveat: {/} have higher precedence than do/end !

From Pickaxe

Pickaxe, p. 168

def one(arg)
  if block_given?
    "block given to 'one' returns #{yield}"
  else
    arg
  end
end
def two
  if block_given?
    "block given to 'two' returns #{yield}"
  end
end
result1 = one two {
  "'three'" 
}
#result1 = one(two {
#  "'three'" 
#})
result2 = one two do
  "'three'"
end
#result2 = one(two) do
#  "'three'"
#end
puts "With {/}   : #{result1}" # => With {/}   : block given to 'two' returns 'three'
puts "With do/end: #{result2}" # => With do/end: block given to 'one' returns 'three'

Using {/} instead of do/end can cause bugs

The Rake User Guide (http://docs.rubyrake.org/read/chapter/4#page23) makes it clear that using {/} can lead to unexpected problems, because it may pass your block to the wrong method! :

Blocks may be specified with either a do/end pair, or with curly braces in Ruby. We strongly recommend using do/end to specify the actions for tasks and rules. Because the rakefile idiom tends to leave off parenthesis on the task/file/rule methods, unusual ambiguities can arise when using curly braces. For example, suppose that the method object_files returns a list of object files in a project. Now we use object_files as the prerequisites in a rule specified with actions in curly braces.

  # DON'T DO THIS!
  file "prog" => object_files {
    # Actions are expected here (but it doesn't work)!
  }

Because curly braces have a higher precedence than do/end, the block is associated with the object_files method rather than the file method.

This is the proper way to specify the task …

  # THIS IS FINE
  file "prog" => object_files do
    # Actions go here
  end

Using do/end instead of {/} can also cause bugs

Unfortunately, I've found that one is not necessarily safe just because one follows the rule of always using do/end for all multi-line blocks. Even then you can cause behavior that may not be what you expected unless you understand the difference pretty well...

I guess it's best simply to understand the difference in associativity (rather than blindly following a convention you don't understand) and use whichever one yields the associativity you are wanting.

I first ran into this problem with this simple-looking piece of code.

  puts output_streams.map do |output_stream|
    output_stream.inspect
  end.join ", "

To my surprise, it gave an error:

undefined method `join' for nil:NilClass (NoMethodError)

I had to change it to {/}, supposedly the wrong way to write multi-line blocks...

  puts output_streams.map { |output_stream|
    output_stream.inspect
  }.join ", "

A closer look...

irb -> puts [1, 2].map do |v| 
  v 
end.join ", "
1
2
NoMethodError: undefined method `join' for nil:NilClass
        from (irb):18
        from :0

What??

irb -> puts [1, 2].map do |v| v end
1
2
    => nil

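Ah... So the block never reaches map at all. I guess the default associativity of:

puts [1, 2].map do |v| v end.join ", "

is actually:

(puts([1, 2].map) do |v| v end).join ", "

(the do/end block is handed to puts, which ignores it; puts returns nil, so this evaluates to nil.join ", ") ...which is not what I wanted in this case!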

This is more along the lines of what I wanted...

irb -> puts( [1, 2].map do |v| v end.join(", ") )
1, 2
    => nil

Of course I could have always done it with {/} like this...

irb -> puts [1, 2].map { |v| v }.join(", ")
1, 2
    => nil

but I originally had it on multiple lines so I thought I was "supposed to" use do/end.

To summarize the difference in behavior:

meth1 objA.iterator_meth do block end.meth2
= (meth1(objA.iterator_meth) do block end).meth2

meth1 objA.iterator_meth { block }.meth2
= meth1( (objA.iterator_meth { block }).meth2 )

Conclusion: Using {/} for multi-line blocks is not necessarily the "wrong" way. It sure beats putting tons of extra parentheses to effect your desired associativity, IMHO! So {/} can sometimes be the preferred option, even for multi-line blocks!

More specifically, objA.iterator_meth do block end.meth1.meth2
is okay, but if you're passing the result of your block-taker to another method, then {/} might be safer/more appropriate:
meth1 objA.iterator_meth { block }.meth2
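
If you want to check for yourself which method ends up with the block, a small experiment like this one works. The methods outer and inner and the RECEIVED array are made up for the test; they just record who was handed a block.

```ruby
RECEIVED = []   # records which method was handed the block

def outer(arg = nil)
  RECEIVED << :outer if block_given?
  arg
end

def inner
  RECEIVED << :inner if block_given?
  :inner_result
end

outer inner { :ignored }        # braces: the block binds to inner
with_braces = RECEIVED.dup
RECEIVED.clear

outer inner do :ignored end     # do/end: the block binds to outer
with_do_end = RECEIVED.dup

p with_braces
p with_do_end
```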

Caveat: method call has higher precedence than range (..) operator

Don't accidentally do this:

'a'..'z'.each {|letter| print letter}
=> z

Do this instead:

('a'..'z').each {|letter| print letter}
=> abcdefghijklmnopqrstuvwxyz

Caveat: &&/|| have higher precedence than and/or !

Observe:

irb -> a = true and false

irb -> a
    => true       # <--- !!!

# Same as doing this:
irb -> (a = true) and false

This is not a bug. It's just something to be aware of. It's just the order of operator precedence: = comes before and. Period.

In the first example, the assignment a = true happens regardless of what follows 'and'. This is because = has a higher precedence than 'and', so the assignment happens first and then the 'and' is evaluated.

With && it is different.

irb -> a = true && false

irb -> a
    => false

true && false is evaluated before the assignment. Then the result of true && false (which is false) is assigned to a.

This is probably what you would want to do most of the time (rather than a = something and something_else).
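
The two irb sessions above, condensed into one script so the results can be compared directly:

```ruby
a = true and false   # parsed as (a = true) and false
b = true && false    # parsed as b = (true && false)

p a
p b
```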

boolean operators (and and or)

Unlike && and || (where && binds more tightly than ||, as in most languages), and and or have the same precedence as each other. So if you don't specify parentheses, Ruby will evaluate them from left to right. So this:

 > true or false and false
=> false

is really the same as this:

 > (true or false) and false
=> false

To override that default order of operations you must use parentheses:

 > true or (false and false)
=> true
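
For comparison, here is the same expression written with the word operators and with the symbol operators. This is just a sketch to show that the two families group differently; the variable names are arbitrary.

```ruby
word_form   = (true or false and false)   # equal precedence: (true or false) and false
symbol_form = (true || false && false)    # && binds tighter: true || (false && false)

p word_form
p symbol_form
```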
 


Operators

& operator for argument passing

def my_func(&block)
  class_eval block
end
my_func { puts 'hi' }

can't convert Proc into String

But...

def my_func(&block)
  class_eval &block
end
my_func { puts 'hi' }

works how you'd want it to!

The & operator tells Ruby to pass the argument ("block") on to the method being called (class_eval here) as a block, rather than as an ordinary argument (it uses to_proc to coerce it if it isn't already a Proc).

"What type of object was the block argument before then??" I hear some of you asking... Let's ask irb and find out:

irb -> def my_func(&block)
         puts block.inspect
       end
    => nil

irb -> my_func { puts 'hi' }
#<Proc:0xb7eec8f4@(irb):4>
    => nil

Answer? A Proc object!

... which we could then call if we wanted:

irb -> def my_func(&block)
         block.call
       end
    => nil

irb -> my_func { puts 'hi' }
hi
    => nil
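
Two more uses of &, as a sketch: forwarding a received block to another method, and turning a non-Proc object (here a Symbol) into a block through to_proc. The helper name each_twice is made up.

```ruby
# Hypothetical helper: receives a block as a Proc and forwards it with &.
def each_twice(items, &block)
  items.each(&block)
  items.each(&block)
end

seen = []
each_twice([1, 2]) { |n| seen << n }
p seen

# & also calls to_proc on objects that are not Procs, such as a Symbol.
shouted = %w[a b].map(&:upcase)
p shouted
```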

* ("splat") operator for argument passing

This is the way variable length parameter lists were meant to be!

It works intuitively, both for specifying argument lists (formal parameters) and for calling the method (with actual parameters)...

irb -> def what_did_I_pass?(*args) args end
    => nil

irb -> what_did_I_pass?
    => []

irb -> what_did_I_pass? {:a => 1}
SyntaxError: compile error
(irb):3: syntax error, unexpected tASSOC, expecting '}'
what_did_I_pass? {:a => 1}
                       ^
        from (irb):3
irb -> what_did_I_pass?({:a => 1})
    => [{:a=>1}]

irb -> what_did_I_pass? :a => 1
    => [{:a=>1}]

irb -> what_did_I_pass? [3, 1]
    => [[3, 1]]

irb -> what_did_I_pass? [3, 1], 'string'
    => [[3, 1], "string"]

irb -> args = 'arg1', 'arg2', 'arg3'
    => ["arg1", "arg2", "arg3"]

# Note the difference here:
irb -> what_did_I_pass? *args
    => ["arg1", "arg2", "arg3"]

irb -> what_did_I_pass? args
    => [["arg1", "arg2", "arg3"]]

[Parallel assignment (category)] and splat (*)

irb -> a = Array.new
    => []

irb -> *a = 1, 2, 3
    => [1, 2, 3]

# Or without first telling the variable what type it is. (Let it figure it out based on what we set it to!)
irb -> args = 'arg1', 'arg2', 'arg3'
    => ["arg1", "arg2", "arg3"]

irb -> args.class
    => Array

irb -> *args = 1, 2, 3
    => [1, 2, 3]

irb -> args
    => [1, 2, 3]
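
Putting the pieces together, a short sketch of splat on both sides of a call plus splat in parallel assignment. The method name total is made up for the example.

```ruby
# Hypothetical method: *numbers collects all arguments into an array.
def total(*numbers)
  numbers.inject(0) { |sum, n| sum + n }
end

values = [1, 2, 3]
sum = total(*values)     # * expands the array into three separate arguments

first, *rest = values    # * collects whatever is left over
p sum
p first
p rest
```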

Comparison operators

"objects being compared" means the receiver and the argument.

obj1 < obj2 actually calls the < method of obj1 (the receiver) with obj2 as its argument. So it is equivalent to writing obj1.<(obj2).
irb -> 3 < 4
    => true

irb -> 3.<(4)
    => true

Operator | Meaning | Examples | Negated form
== | True if values are equal (may convert types). | 3 == 3.0 is true. | !=
eql? (not strictly an operator) | True if objects being compared have both the same type and values. | 3.eql?(3.0) is false. |
equal? (not strictly an operator) | True if objects being compared have the same object ID. | |
=== | May mean different things, depending on the class (?). It is the comparison operator used in a case statement. | String === "a" is true |
<, <= | less than, less than or equal | | >=, >
=~ | Regular expression pattern match | "what" =~ /at/ is 2; /at/ =~ "what" is 2 | !~

(some information from p. 95, in Chapter 7. Expressions)

Caveat: "case something.class" doesn't work the way you might expect

In particular,

irb> case nil.class
   >   when NilClass then puts "NilClass"
   > end

does not output "NilClass".

This is how to do a comparison with classes (just drop the .class part):

irb> case nil
   >   when NilClass then puts "NilClass"
   > end
NilClass

As explained on http://blade.nagaokaut.ac.jp/cgi-bin/scat.rb/ruby/ruby-talk/175098 , case uses the === operator to compare (not ==), with the when value as the receiver, so even though it is true that nil.class == NilClass, it is not true that NilClass === nil.class. As they explain, invoking === on a Class object "checks to see if the argument is an instance of that class". And "NilClass is not an instance of NilClass, but of Class."!!
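
Since each when clause calls === on its own value, you can mix classes and regular expressions in the same case. A small sketch (the method name describe is made up):

```ruby
# Each when clause is tested as: when_value === value
def describe(value)
  case value
  when NilClass  then "nil"                  # NilClass === value
  when Integer   then "an integer"           # Integer === value
  when /\A\d+\z/ then "a string of digits"   # regexp === value
  else "something else"
  end
end

p describe(nil)
p describe(42)
p describe("123")
p describe(nil.class)   # a Class object is not an instance of NilClass
```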




Loops

for a in list
end

#same as:
list.each do |a|
end

for i in 1..10
end

1.upto(10) do |i|
end

(1..10).each do |i|
end

loop do
  break(return_value)
end

#break: terminates the immediately enclosing loop
#next: starts the next iteration
#redo: repeats the current iteration of the loop (without reevaluating the condition or calculating the next item)
#retry: rewinds the loop and starts over with the first iteration (Ruby 1.8 only; later versions allow retry only inside rescue)

#How to get out of a nested loop? break only exits the innermost loop. Try throw/catch.
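
A minimal sketch of the throw/catch approach for leaving a nested loop. The :found label and the search condition are arbitrary choices for the example.

```ruby
# catch returns the value given to throw, or the block's last value if nothing is thrown.
pair = catch(:found) do
  (1..3).each do |i|
    (1..3).each do |j|
      throw :found, [i, j] if i * j == 6   # jumps out of both loops at once
    end
  end
  nil   # reached only if no pair matched
end

p pair
```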


Ruby / Numbers


round_to( n )


Rounds to the nearest _n_th degree.

  4.555.round_to(1)     #=> 5.0
  4.555.round_to(0.1)   #=> 4.6
  4.555.round_to(0.01)  #=> 4.56
  4.555.round_to(0)     #=> 4.555
# File lib/facets/core/float/round_to.rb, line 13
  def round_to( n ) #n=1
    return self if n == 0
    (self * (1.0 / n)).round_off.to_f / (1.0 / n)
  end

Part of: [Facets (category)]

Example: Say you have a string input and it needs to be converted to a numeric type and rounded to 2 places, in order to represent a currency...

price = price_input.to_f.round_to(0.01)  # '4.555' => 4.56

sprintf

A shorthand notation: format % value

irb -> "%.20f" % (1.8 + 0.1)
    => "1.90000000000000013323"

Inaccuracy of floats

irb -> sprintf("%.20f", (0.2 - 0.05 - 0.15))
    => "0.00000000000000002776"
irb -> sprintf("%.20f", 1.9)
    => "1.89999999999999991118"

irb -> sprintf("%.20f", 1.8 + 0.1)
    => "1.90000000000000013323"

http://www.ruby-forum.com/topic/48754 Ruby Forum - Using Float For Currency. Retrieved on 2007-05-11 11:18.


Floating point numbers represent an extremely wide range of values - much wider than their integer counterparts. This is handled through an exponent and mantissa. For this ability, they trade off precision. Think about the case of adding a large floating point number to a small floating point number:

     irb> a = 1.0e30
       => 1.0e+030
     irb> b = 1.0e-30
       => 1.0e-030
     irb> a + b
       => 1.0e+030
My comment:
irb -> ("%.40f" % (a = 1.0e30)).rjust(80)
    => "        1000000000000000019884624838656.0000000000000000000000000000000000000000"

irb -> ("%.40f" % (b = 1.0e-30)).rjust(80)
    => "                                      0.0000000000000000000000000000010000000000"

irb -> ("%.40f" % (a - b)).rjust(80)
    => "        1000000000000000019884624838656.0000000000000000000000000000000000000000"

While this is an extreme example, it does demonstrate the loss of precision. Essentially, in floating point arithmetic we're trying to squeeze much more out of, say 32 or 48 or 64 bits.

Integer arithmetic, on the other hand, is exact. And therefore so is fixed point arithmetic; however, fixed point doesn't enjoy the wide representation range as floats.

The bottom line is you should never use floating point when it comes to money. Eventually you're going to miss pennies. Instead represent things in the smallest denomination, such as cents, and fix it up in presentation, or use a custom Money column type, or data type.
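
As a rough sketch of the "smallest denomination" advice: keep amounts as integer cents and only convert to dollars when displaying. The variable names and prices are made up.

```ruby
price_cents = 1999          # $19.99 stored as an Integer
quantity    = 3
total_cents = price_cents * quantity

# Convert to dollars and cents only for presentation.
formatted = format("$%d.%02d", total_cents / 100, total_cents % 100)
puts formatted

float_exact = (0.1 + 0.2 == 0.3)   # floats: the sum is not exactly 0.3
cents_exact = (10 + 20 == 30)      # integers: exact
p float_exact
p cents_exact
```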


[...] your program logic can also behave unexpectedly.

irb> 1.20 - 1.00 == 0.20
=> false


One would use BigDecimal instead of Fixnum or Bignum when one is going to do some operation that might produce a fraction. If you're certain your operations will always result in whole numbers, though, I don't see a point in dragging in BigDecimal.


BigDecimal

http://stdlib.rubyonrails.org/libdoc/bigdecimal/rdoc/index.html

Representing Numbers to Arbitrary Precision (http://ajax.stealthsettings.com/tutorial/representing-numbers-to-arbitrary-precision.html) (< 2007-05-01). Retrieved on 2007-05-11 11:18.


http://www.ruby-forum.com/topic/48754. Retrieved on 2007-05-11 11:18.


> amount = BigDecimal.new('9.756')
> rounded = (amount * 100).round / 100
> printf('%.02f', rounded)
> 
> Outputs '9.76'

There is at least one more point to make about rounding: this method has a bias when floats ending with a 5 are involved. In the extreme case, when all your three-decimal currency amounts end in a 5 (like $9.755, $9.765) then all are rounded upwards. Instead, half of them should be rounded up, the other half down. A good method to do that is to round to the nearest even digit:

  $9.755 -> $9.76
  $9.765 -> $9.76 (not $9.77)

The attached method round2 does this (although probably not very efficiently):

  8.5.round2         #-> 8
  9.765.round2(0.01) #-> 9.76
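
The attached round2 method isn't reproduced here. As a stand-in, BigDecimal from the standard library can do the same kind of round-to-even using its ROUND_HALF_EVEN mode. This is only a sketch of the idea, not the poster's code, and the helper name round_half_even is made up.

```ruby
require 'bigdecimal'

# Hypothetical helper: rounds a decimal string to the given number of places,
# sending exact halves to the nearest even digit.
def round_half_even(amount, places = 2)
  BigDecimal(amount).round(places, BigDecimal::ROUND_HALF_EVEN)
end

puts round_half_even("9.755").to_s("F")
puts round_half_even("9.765").to_s("F")
puts round_half_even("8.5", 0).to_s("F")
```

Passing the amounts as strings matters: a Float literal like 9.765 is already slightly off before any rounding happens.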


Problem

You’re doing high-precision arithmetic, and floating-point numbers are not precise enough.

Solution

A BigDecimal number can represent a real number to an arbitrary number of decimal places.

        require 'bigdecimal'

        BigDecimal("10").to_s                # => "0.1E2"
        BigDecimal("1000").to_s              # => "0.1E4"
        BigDecimal("1000").to_s("F")         # => "1000.0"

        BigDecimal("0.123456789").to_s       # => "0.123456789E0"

Compare how Float and BigDecimal store the same high-precision number:

        nm = "0.123456789012345678901234567890123456789"
        nm.to_f                         # => 0.123456789012346
        BigDecimal(nm).to_s
        # => "0.123456789012345678901234567890123456789E0"

Discussion

BigDecimal numbers store numbers in scientific notation format. A BigDecimal consists of a sign (positive or negative), an arbitrarily large decimal fraction, and an arbitrarily large exponent. This is similar to the way floating-point numbers are stored, but a double-precision floating-point implementation like Ruby’s cannot represent an exponent less than Float::MIN_EXP (-1021) or greater than Float::MAX_EXP (1024). Float objects also can’t represent numbers at a greater precision than Float::EPSILON, or about 2.2*10^-16.


http://pleac.sourceforge.net/pleac_ruby/numbers.html. Retrieved on 2007-05-11 11:18.


...

Making Numbers Even More Random

# from the randomr lib: 
# http://raa.ruby-lang.org/project/randomr/

require 'random/mersenne_twister'
mers = Random::MersenneTwister.new 123456789
puts mers.rand(0)    # 0.550321932544541
puts mers.rand(10)   # 2

Generating Biased Random Numbers

def gaussian_rand
    begin
        u1 = 2 * rand() - 1
        u2 = 2 * rand() - 1
        w = u1*u1 + u2*u2
    end while (w >= 1)
    w = Math.sqrt((-2*Math.log(w))/w)
    [ u2*w, u1*w ]
end

mean = 25
sdev = 2
salary = gaussian_rand[0] * sdev + mean
printf("You have been hired at \$%.2f\n", salary)

Taking Logarithms

log_e = Math.log(val)
log_10 = Math.log10(val)

def log_base(base, val)
    Math.log(val)/Math.log(base)
end

answer = log_base(10, 10_000)
puts "log10(10,000) = #{answer}"

Multiplying Matrices

require 'matrix.rb'

a = Matrix[[3, 2, 3], [5, 9, 8]]
b = Matrix[[4, 7], [9, 3], [8, 1]]
c = a * b

a.row_size
a.column_size

c.det
a.transpose

Using Complex Numbers

require 'complex.rb'
require 'rational.rb'

a = Complex(3, 5)              # 3 + 5i
b = Complex(2, -2)             # 2 - 2i
puts "c = #{a*b}"

c = a * b
d = 3 + 4*Complex::I

printf "sqrt(#{d}) = %s\n", Math.sqrt(d)

...

Comparing Floating-Point Numbers

Comparing Floating-Point Numbers (http://ajax.stealthsettings.com/tutorial/comparing-floating-point-numbers.html). Retrieved on 2007-05-11 11:18.


Floating-point math is very precise but, due to the underlying storage mechanism for Float objects, not very accurate. Many real numbers (such as 1.9) can’t be represented by the floating-point standard. Any attempt to represent such a number will end up using one of the nearby numbers that does have a floating-point representation. You don’t normally see the difference between 1.9 and 1.8 + 0.1, because Float#to_s rounds them both off to “1.9”. You can see the difference by using Kernel#printf to display the two expressions to many decimal places:

        printf("%.55f", 1.9)
        # 1.8999999999999999111821580299874767661094665527343750000
        printf("%.55f", 1.8 + 0.1)
        # 1.9000000000000001332267629550187848508358001708984375000

Both numbers straddle 1.9 from opposite ends, unable to accurately represent the number they should both equal. Note that the difference between the two numbers is precisely Float::EPSILON:

        Float::EPSILON                       # => 2.22044604925031e-16
        (1.8 + 0.1) - 1.9                    # => 2.22044604925031e-16

This EPSILON’s worth of inaccuracy is often too small to matter, but it does when you’re doing comparisons. 1.9+Float::EPSILON is not equal to 1.9-Float::EPSILON, even if (in this case) both are attempts to represent the same number. This is why most floating-point numbers are compared in relative terms.

The most efficient way to do a relative comparison is to see whether the two numbers differ by more than a specified error range, using code like this:

        class Float
          def absolute_approx(other, epsilon=Float::EPSILON)
            return (other-self).abs <= epsilon
          end
        end

        (1.8 + 0.1).absolute_approx(1.9)     # => true
        10e10.absolute_approx(10e10+1e-5)    # => false

The default value of epsilon works well for numbers close to 0, but for larger numbers the default value of epsilon will be too small. Any other value of epsilon you might specify will only work well within a certain range.

Thus, Float#approx, the recommended solution, compares both absolute and relative magnitude. As numbers get bigger, so does the allowable margin of error for two numbers to be considered “equal.” Its default relative_epsilon allows numbers between 2 and 3 to differ by twice the value of Float::EPSILON. Numbers between 3 and 4 can differ by three times the value of Float::EPSILON, and so on.

A very small value of relative_epsilon is good for mathematical operations, but if your data comes from a real-world source like a scientific instrument, you can increase it. For instance, a Ruby script may track changes in temperature read from a thermometer that’s only 99.9% accurate. In this case, relative_epsilon can be set to 0.001, and everything beyond that point discarded as noise.

        98.6.approx(98.66)                           # => false
        98.6.approx(98.66, 0.001)                    # => true
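
The code for Float#approx isn't included in the excerpt above. Going only by the description (an absolute check for numbers near 0, plus a relative check that scales with magnitude), a comparable helper might look like this. The name approx_equal? and its parameter order are my assumptions, not the original implementation.

```ruby
# Hypothetical helper modeled on the description above; not the original Float#approx.
def approx_equal?(a, b, relative_epsilon = Float::EPSILON, epsilon = Float::EPSILON)
  difference = (a - b).abs
  return true if difference <= epsilon        # absolute check, for numbers near 0
  largest = [a.abs, b.abs].max
  difference / largest <= relative_epsilon    # relative check, scales with magnitude
end

p approx_equal?(1.8 + 0.1, 1.9)
p approx_equal?(98.6, 98.66)
p approx_equal?(98.6, 98.66, 0.001)
```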
 


Ruby / Strings and symbols


Aliases: Ruby / Strings, Ruby / Symbols
See also: Ruby / Regular expressions

Strings and symbols: Strings

Searching (also has [Regexp (category)])

irb -> "abcde"["abc"]
    => "abc"

irb -> "abcde"[/.b./]
    => "abc"

irb -> "abcde".match /.b./
    => #<MatchData:0xb7eed6c4>

irb -> "abcde".match(/.b./)[0]
    => "abc"

irb -> "abcde"["z"]
    => nil
irb -> "<div><div>Contents</div></div>"[%r{<div>(.*)</div>}]
    => "<div><div>Contents</div></div>"

irb -> "<div><div>Contents</div></div>"[%r{<div>(.*)</div>}, 1]
    => "<div>Contents</div>"

Substrings (slice/[])

Unfortunately, in Ruby 1.8 [] gets the character code rather than a substring when you pass a single index rather than a range of indexes (Ruby 1.9 and later return a one-character string instead):

irb -> "abc"[0..2]
    => "abc"

irb -> "abc"[0..0]
    => "a"
irb -> "abc"[-1..-1]
    => "c"

But:

irb -> "abc"[-1]
    => 99
(not "c")

Another workaround (-1..-1 was the first workaround):

irb -> "abc"[-1].chr
    => "c"

Delimiters / Different ways to delimit a string literal

String interpolation

You can even nest #{} inside of #{}!

p "#{field} = #{ object.send("#{field}") } !"

%q{}, %Q{}, %q<>, etc.

Choose your own delimiter ({}, (), <>, [], ||, whatever)!

%Q{} allows string interpolation; %q{} does not.

Can be nested!

Can be useful for metaprogramming, or just building large strings...

['stdout', 'stderr'].each do |stream_name|
  eval(%Q{

    class Test_#{stream_name} < Test::Unit::TestCase
      def setup
        $#{stream_name} = StringIO.new
      end
      def test_simple_filter
        filter_#{stream_name}(lambda{|input| ''}) do
          noisy_command_#{stream_name}
        end
        assert_equal '', $#{stream_name}.string
      end
    end
  })
end
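
A smaller sketch of the three points above (interpolation with %Q, none with %q, and nesting of the delimiter), using a made-up variable called name:

```ruby
name = "world"

interpolated = %Q{Hello, #{name}!}       # %Q behaves like a double-quoted string
literal      = %q{Hello, #{name}!}       # %q behaves like a single-quoted string
nested       = %q{outer {inner} outer}   # balanced delimiters may be nested

puts interpolated
puts literal
puts nested
```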

Question: If you nest one string inside of the other, how do you control in which one the string interpolation happens?

Answer: By escaping the { characters, of course!

Interpolate now:

irb -> $a = 'test'
irb -> puts %Q{
     "    puts %Q{
     "      #{$a}
     "    }
     "  }

   puts %Q{
     test
   }

    => nil

Interpolate later:

irb -> a = %Q{
     "   puts %Q{
     "     #\{$a\}
     "   }
     " }
    => "\n  puts %Q{\n    \#{$a}\n  }\n"

irb -> eval(a)

    test

    => nil

Crazy powerful kung-fu heredoc syntax

To allow your terminating delimiter to be indented

irb ->         <<-WayOutHere
     " la dee da
     "        la dee da
     "                       WayOutHere
    => "la dee da\n       la dee da\n"

If you just use <<, then it will treat your indented delimiter as part of the string (it will not detect it as the delimiter unless all the way to the left -- no indenting).

irb -> <<WayOutHere
     "                 WayOutHere
     " WayOutHere
    => "                WayOutHere\n"

To disable string interpolation

(example from Phrogs on ruby-talk at 2007-01-17 08:55)

Do this:

b = <<'FOO'
b#{1+1}
FOO

instead of this:

a = <<FOO
a#{1+1}
FOO

Can start heredoc in the middle of an expression, finish the rest of your expression, and then continue with the string

Kind of strange, but cool!

Example (mine):

irb -> /^=begin[ \t\f]*#{b=''}.*?\n(.*?)\n=end/mi.match(<<End )[1]
     " =begin
     " require 'foo'
     " foo
     " =end
     " End
    => "require 'foo'\nfoo"

Example from http://ruby-doc.org/core/classes/ERB.html

   def build
     b = binding
     # create and run templates, filling member data variables
     ERB.new(<<-'END_PRODUCT'.gsub(/^\s+/, ""), 0, "", "@product").result b
       <%= PRODUCT[:name] %>
       <%= PRODUCT[:desc] %>
     END_PRODUCT
     ERB.new(<<-'END_PRICE'.gsub(/^\s+/, ""), 0, "", "@price").result b
       <%= PRODUCT[:name] %> -- <%= PRODUCT[:cost] %>
       <%= PRODUCT[:desc] %>
     END_PRICE
   end

Example:

          puts Subversion.help(subcommand).gsub(<<End, '')
Subversion is a tool for version control.
For additional information, see http://subversion.tigris.org/
End

... makes for nicer syntax than

          puts Subversion.help(subcommand).gsub(<<End
Subversion is a tool for version control.
For additional information, see http://subversion.tigris.org/
End
          , '')

In fact, that syntax isn't even valid!

 syntax error, unexpected ',', expecting ')' (SyntaxError)
          , '')
           ^

Nor is this:

          puts Subversion.help(subcommand).gsub(<<End
Subversion is a tool for version control.
For additional information, see http://subversion.tigris.org/
End, '')
 can't find string "End" anywhere before EOF (SyntaxError)

padding a string

   "hello".rjust(20, " ")           #=> "               hello"

Indenting / Changing tab/indent

Removing indent

[Facets (category)]

Let's say I want to remove the indent/leading-line-spaces from a multi-line string...

irb -> require 'facets/core/string/margin'
irb -> require 'facets/core/string/indent'

irb -> class String; def rchomp; self.gsub(/\A\n/, ''); end; end

irb -> input = %(
     "   line1
     "   line2
     " ).rchomp
    => "  line1\n  line2\n"

irb -> puts input.margin
ine1
ine2
# Not what I wanted!

irb -> puts input.indent(-2)  # Unindent by 2 spaces
line1
line2
# Good!

irb -> puts input.tab(0)      # Replace any existing leading-line-spaces with 0 spaces.
line1
line2
# Good!
irb -> input = %(
     "   line1
     "     line2
     " ).rchomp
    => "  line1\n    line2\n"

irb -> puts input.tab(0)      # Replace any existing leading-line-spaces with 0 spaces.
line1
line2
    => nil
# Not quite what I wanted! I wanted line 2 to be '  line2'.

irb -> puts input.indent(-2)  # Unindent by 2 spaces
line1
  line2
# Yes, like that!
irb -> puts input.indent(2)
    line1
      line2
    => nil

irb -> puts input.tab(4)
    line1
    line2
    => nil

Processing a string one character at a time

irb -> "tyler".scan(/./) {|l| p l }
"t"
"y"
"l"
"e"
"r"

Checksums

irb -> "tyler".sum
    => 560

irb -> a = []; "tyler".each_byte {|l| a << l }; a.inject {|sum, i| sum + i}
    => 560

How do I capitalize the first letter? (the equivalent of ucfirst in PHP)

irb -> "hi there".capitalize
    => "Hi there"

irb -> "hi there".upcase
    => "HI THERE"

# Destructive modification?
irb -> original = "hi there"; new = original.dup; new.capitalize; original + " => " + new
    => "hi there => hi there"
irb -> original = "hi there"; new = original.dup; new.capitalize!; original + " => " + new
    => "hi there => Hi there"

How do I capitalize the first letter of each word? (the equivalent of ucwords in PHP)

I want to be able to do this:

irb -> "hi there".ucwords
    => "Hi There"

String#capitalize_all [Ruby Facets (category)]

http://facets.rubyforge.org/src/doc/rdoc/core/classes/String.html#M000904

capitalize_all( pattern=$;, *limit ) Capitalize all words (or other patterned divisions) of a string.

  "this is a test".capitalize_all  #=> "This Is A Test"

Another implementation

If I had to implement it, I would first make a change_each_word(!) iterator, and then build capitalize_each_word(!) on top of that.

# TODO: move to quality_extensions

#require 'facets/string/partitions'  # Facets 2.0?
require 'facets/core/string/each_word'
require 'qualitysmith_extensions/enumerable/enum'   # Future version of Ruby?: obj.enum_for(method = :each, *args)

# irb -> s = 'anthony john doe'; s.change_each_word! {|a| a.capitalize}; s
#     => "Anthony John Doe"
class String
  def change_each_word(&block)
    self.dup.change_each_word!(&block)
  end
  def change_each_word!
    each_word do |value, range|
      self[range] = (yield value)
    end
  end
end
class String
  def capitalize_each_word!
    change_each_word! do |word|
      word.capitalize
    end
  end
  alias_method :ucwords!, :capitalize_each_word!
end

# irb -> s = 'anthony john doe'; s.map_each_word {|a| a.capitalize}
#     => ["Anthony", "John", "Doe"]
class String
  def map_each_word
    enum(:each_word).map do |value, range|
      yield value
    end
  end
end
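
If you don't want to pull in Facets, a plain gsub gets reasonably close to PHP's ucwords. This is a stand-alone sketch, written as an ordinary method rather than a String extension, and it only treats whitespace as a word separator.

```ruby
# Hypothetical stand-alone helper (not part of Facets or core Ruby).
def ucwords(text)
  # Upcase the first non-space character at the start of the string or after whitespace.
  text.gsub(/(\A|\s)(\S)/) { $1 + $2.upcase }
end

puts ucwords("hi there")
puts ucwords("anthony john doe")
```

Unlike capitalize, this leaves the rest of each word untouched, so "it's" stays "It's" and existing capitals are not lowercased.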

[Caveats (category)] [Built-in behavior is wrong (category)] s.downcase! returns nil rather than s!

You tell me: is this behavior intuitive?:

irb -> 'd'.downcase!
    => nil
irb -> ['d'].include? 'd'
    => true

irb -> ['d'].include? 'd'.downcase
    => true

# But!
irb -> ['d'].include? 'd'.downcase!
    => false

irb -> 'd'.downcase!
    => nil

I find that unintuitive.

What's more, it causes some obfuscation in order to "work around" this unwanted behavior.

Example:

irb -> response = ""
    => ""
irb -> response = $stdin.getc.chr while !['a', 'd', 'i', "\n"].include?(response.downcase!)
d
d
d
^DIRB::Abort: abort then interrupt!!
        from /usr/lib/ruby/1.8/irb.rb:81:in `irb_abort'
        from /usr/lib/ruby/1.8/irb.rb:243:in `signal_handle'
        from /usr/lib/ruby/1.8/irb.rb:66:in `start'
        from (irb):16:in `call'
        from (irb):16:in `getc'
        from (irb):16
        from :0

I had to Control-D out of the loop because the exit condition was never being satisfied. Specifically, my input, 'd', had downcase! called on it, and response.downcase! resulted in nil, which is not in the list of valid inputs, so it kept looking hoping that maybe next time I'd enter something "more valid".

A workaround (that obfuscates mildly):

irb -> response = ""
    => ""
irb -> response = $stdin.getc.chr while !['a', 'd', 'i', "\n"].include?(begin response.downcase!; response end)
d    => nil
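
A less obfuscated way out is to use the non-bang downcase in the test, since it always returns a string. A short sketch of the difference (variable names are arbitrary):

```ruby
unchanged = "d".dup
changed   = "D".dup

bang_on_unchanged = unchanged.downcase!   # nil: nothing needed changing
bang_on_changed   = changed.downcase!     # "d": the receiver was modified

valid = ['a', 'd', 'i', "\n"]
ok_lower = valid.include?("d".downcase)   # non-bang form always returns a String
ok_upper = valid.include?("D".downcase)

p bang_on_unchanged
p bang_on_changed
p ok_lower
p ok_upper
```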

How do I prefix all the elements in my array of strings with a prefix string?

irb -> elements = ['a', 'b', 'c']
    => ["a", "b", "c"]

This output just isn't cutting it...

irb -> puts elements
a
b
c

Let's say instead you want to display it as a simple tree.

irb -> puts '+';
       puts '\\- ' + elements
+
TypeError: can't convert Array into String
        from (irb):3:in `+'
        from (irb):3

That certainly doesn't work. I guess we want to use map. This works...

irb -> puts '+';
       puts elements.map{|e| '\\- ' + e}
+
\- a
\- b
\- c

But it would kind of be nice to have the prefix come before the array to which it is prefixed...wouldn't it?

Hmm... how about this?

irb -> puts (['\\- ']*elements.size).map {|prefix| $a ||= -1; prefix + elements[$a += 1]}
\- a
\- b
\- c

Wow, is that ugly! And unsafe.

[To do: Find better solution]
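One possible answer to the to-do above, offered as a sketch rather than the definitive solution: wrap the map in a small helper so that the prefix reads first at the call site. The helper name prefix_each is made up for this example.

```ruby
# Hypothetical helper: returns a new array with prefix prepended to each element.
def prefix_each(prefix, elements)
  elements.map { |element| prefix + element }
end

elements = ['a', 'b', 'c']
puts '+'
puts prefix_each('\\- ', elements)
```

This avoids the global counter, so it is safe to call more than once.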

Split

String#split is Unicode-safe: 'unicode string'.split(//) will split a string into its individual characters, even for multibyte characters. (http://woss.name/2006/05/07/notes-from-a-rails-course/)

Example of [Heredoc (category)], Example of [String#margin (category)]

[Ruby Facets (category)]

    assert_equal <<-End.margin, output.chomp
      |3 + x
      |=> 4
    End


If we'd just done this (without using margin):

    assert_equal <<-End, output.chomp
      3 + x
      => 4
    End

, then we would have gotten a failure:

<"      3 + x\n      => 4\n"> expected but was
<"3 + x\n=> 4">.

To make the strings be equal without using margin, we'd have had to left-align everything, all the way to the left margin:


    assert_equal <<-End, output
3 + x
=> 4
    End

Yuck. I think that's exactly the sort of thing that prompted the author of String#margin to write it...
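If you are on Ruby 2.3 or later (an assumption; the examples above predate it), the built-in "squiggly" heredoc <<~ strips the common leading indentation, which covers this case without Facets. A minimal sketch; the method name expected_output is hypothetical:

```ruby
# The heredoc body can be indented with the surrounding code;
# <<~ removes the common leading whitespace from every line.
def expected_output
  <<~End
    3 + x
    => 4
  End
end

p expected_output   # "3 + x\n=> 4\n"
```

Note that the trailing newline is kept, so compare against output rather than output.chomp, or chomp both sides.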

The ? "byte" [operator]

irb -> ?A
    => 65

irb -> ?\n
    => 10

irb -> ?\n.chr
    => "\n"

irb -> ?\t
    => 9

irb -> ?\r
    => 13

irb -> ?\  # That's a single space
    => 32
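
The integer results above are Ruby 1.8 behavior. From Ruby 1.9 on, ?A evaluates to the one-character string "A", and String#ord gives the code point instead. A short sketch for newer Rubies:

```ruby
p ?A          # "A" on Ruby 1.9 and later
p ?A.ord      # 65
p "\n".ord    # 10
p 65.chr      # "A"
```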

The use of \ within strings

Any time you are building a string of any significant length, you should be asking yourself this important question:

Do I want the \ characters in this string to be treated as escape characters or as literal '\' characters?

Note the difference between these 2 behaviors:

\ as escape character    \ as literal
special                  inert, "safe"

"Everyone" knows that in order to get your \n to be treated as a newline rather than a literal \n, you have to use double quotes ("\n") rather than single quotes ('\n'). But what happens if you use the intrepid \ escape character in front of other, less-suspecting characters, that normally don't appear following a \, like "d"...? Let's try it and see!

irb -> "\n"
    => "\n"   # This is a newline.

irb -> "\d"
    => "d"    # This, however, is just an 
              # ordinary, lowly 'd'!

irb -> puts "\n"

    => nil

irb -> puts "\d"
d
    => nil

irb -> '\n'
    => "\\n"  # A literal '\' character
              # followed by a literal 'n'
              # character.

irb -> '\d'
    => "\\d"

irb -> puts '\n'
\n
    => nil

irb -> puts '\d'
\d
    => nil

"\d"   #=> "d"
%(\d)  #=> "d"
%Q(\d) #=> "d"
'\d'   #=> "\\d"
%q(\d) #=> "\\d"
'\n'   #=> "\\n"
%q(\n) #=> "\\n"
"\n"   #=> "\n"
%(\n)  #=> "\n"
%Q(\n) #=> "\n"

In summary, %q(...) is the same as '...' and both %(...) and %Q(...) are the same as "..." (for these test cases anyway).

I think the %q(...) form is typically the best choice for large strings that you want to be "safe" ("take these characters literally").


[Caveats (category)]: Be careful to consider how \ characters are treated when building code to be evaled

Here is one example of when I've forgotten about this behavior and have been bitten by it...


[Debugging stories (category)]

I had built up a string containing some code to be evaluated later in the context of my model:

$common_validation_code = %(
  ...
  validates_format_of :zip, :with => /\d{5}(-\d{4})?/, :message => "should be in the form 12345 or 12345-1234"
  ...
)

class Model < ActiveRecord::Base
  ...
  eval($common_validation_code)
  ...
end

However, this code was not working the way I expected it to. I expected the input '12345' to be considered valid, but it was telling me that it was not!

I did a quick sanity check in irb to convince myself that the regexp was in fact valid:

irb -> !!( '12345' =~ /\d{5}(-\d{4})?/ )
    => true

irb -> !!( '12345-1234' =~ /\d{5}(-\d{4})?/ )
    => true

irb -> !!( '1234' =~ /\d{5}(-\d{4})?/ )
    => false

Yeah, that's what I thought! So needless to say, I was a little bit confused as to why it wasn't working in my model.

It wasn't until I tried outputting the contents of my $common_validation_code variable to the screen that I realized what the problem was:

puts $common_validation_code
  ...
  validates_format_of :zip, :with => /d{5}(-d{4})?/, :message => "should be in the form 12345 or 12345-1234"
  ...

Wait a second, my regexp is supposed to be /\d{5}(-\d{4})?/, not /d{5}(-d{4})?/.

Heh. So it would have been fine with accepting zip codes like "ddddd" as valid, but not zip codes that contained actual numerals!

irb -> !!( 'ddddd-dddd' =~ /#{"\d{5}(-\d{4})?"}/ )
    => true

irb -> !!( '12345-1234' =~ /#{"\d{5}(-\d{4})?"}/ )
    => false

Anyway, the fix was really, really simple -- just change a single character and it made all the difference in the world!

-  $common_validation_code = %(
+  $common_validation_code = %q(

[Examples of a single character making a big difference (category)]

Moral of the story: %q( , not %( !
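
The difference can be reproduced in isolation. A minimal sketch; the two method names are hypothetical and only exist to hold the two literals:

```ruby
# %( ) behaves like double quotes: the unknown escape \d loses its backslash.
def double_quoted_source
  %(/\d{5}(-\d{4})?/)
end

# %q( ) behaves like single quotes: the backslashes are kept.
def single_quoted_source
  %q(/\d{5}(-\d{4})?/)
end

puts double_quoted_source   # /d{5}(-d{4})?/
puts single_quoted_source   # /\d{5}(-\d{4})?/

# Only the %q version evals to the regexp that was intended.
p '12345' =~ eval(single_quoted_source)   # 0
p '12345' =~ eval(double_quoted_source)   # nil
```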

 


Which types of strings interpolate variables and which do not

It looks like these partitions are the same as for the \ character being literal or an escape character...

irb -> a='!'
    => "!"
irb -> '#{a}'
    => "\#{a}"

irb -> %q(#{a})
    => "\#{a}"
irb -> "#{a}"
    => "!"

irb -> %Q(#{a})
    => "!"

irb -> %(#{a})
    => "!"


Strings and symbols: Symbols

What are they?

They're different than strings. They're identifiers.

Bruce Tate (2007-03-13). Crossing borders: Extensions in Rails: The anatomy of an acts_as plug-in (http://www-128.ibm.com/developerworks/java/library/j-cb03137/index.html). Retrieved on 2007-03-14 16:44.

(A symbol is a user-defined name.)


Symbols can contain characters other than the normally allowed symbol characters

Usually, you just make symbols with letters and underscores, :like_this . But you can also do this:

irb -> :'complicated.symbol!@#$%^&*()'
    => :"complicated.symbol!@\#$%^&*()"

You can also do this: "whatever#{variable}".to_sym .

[Caveat (category)]: Symbol#to_s doesn't retain the initial : character

This can be very confusing, especially when you are evaling something, and you expect that symbols interpolated into a string will ... well, stay looking like symbols.


irb -> def foo; end

irb -> method(:foo)
    => #<Method: Object#foo>

irb -> puts "method(#{:foo})"
method(foo)

I would have expected to see:

method(:foo)

If we try to eval it, we get a less-than-helpful/[less-than-intuitive error (category)]:

irb -> eval "method(#{:foo})"
TypeError: (eval):1:in `method': nil is not a symbol
        from (irb):8
        from (eval):1
        from (irb):8
        from :0

(foo returns nil, which, of course, is not a symbol)

To work around this, it looks like Symbol#inspect does what we (sometimes) want Symbol#to_s to do, so we can use it instead:

irb -> puts "method(#{:foo.inspect})"
method(:foo)

irb -> eval "method(#{:foo.inspect})"
    => #<Method: Object#foo>
 


Ruby / Regular expressions



Links

http://www.rubycentral.com/book/tut_stdtypes.html Programming Ruby: The Pragmatic Programmer's Guide

http://www.regular-expressions.info/ruby.html Ruby Regexp Class - Regular Expressions in Ruby

[] vs. match/=~

If you just want the (entire) matching text returned, you can do this (simpler but not as powerful):

irb -> "abcdef"[/bcd/]
    => "bcd"
irb -> "abcdef"[/bcda/]
    => nil

If you need more power, such as access to multiple match groups, then you may need to use match/=~:

irb -> "abcdef".match /bcd/
    => #<MatchData:0xb7f99144>
irb -> "abcdef".match /b(c)d(.+)/ ; "#{$1}#{$2}"
    => "cef"
irb -> "abcdef".=~ /b(c)d(.+)/ ; "#{$1}#{$2}"
    => "cef"

How to remove a substring

irb -> "aaabaa".sub(/b/, '')
    => "aaaaa"

How to remove a substring multiple times

str.gsub

Returns a copy of str with all occurrences of pattern replaced [...].

Previously the only way I could think to do it:

Only removes the first occurrence:

irb -> input = "aaababaaba"
    => "aaababaaba"
irb -> input.sub!(regexp = /b+/, '')
    => "aaaabaaba"

Doesn't work at all!:

irb -> input = "aaababaaba"; regexp = nil
    => nil
irb -> input.sub!(regexp = /b+/, '') until input !~ regexp ; input
    => "aaababaaba"

irb -> input.sub!(regexp, '') until input !~ (regexp = /b+/) ; input
    => "aaaaaaa"

This unfortunately doesn't work, due to [the order in which Ruby parses variables (category)].

irb -> input = "aaababaaba"
    => "aaababaaba"

irb -> input.sub!(regexp, '') until input !~ (regexp = /b+/) ; input
NameError: undefined local variable or method `regexp' for main:Object
        from (irb):2
        from :0

You have to initialize regexp (even to nil works) before you can read from it. Observe that even though it comes after the sub! call in the source, the regexp = /b+/ assignment in the until condition is evaluated before the sub! call.

irb -> input = "aaababaaba"; regexp = nil
    => nil
irb -> input.sub!(regexp, '') until input !~ (regexp = /b+/) ; input
    => "aaaaaaa"
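
For comparison, the gsub approach quoted at the top of this section removes every occurrence in one call, with no loop and no pre-initialized variable. A minimal sketch:

```ruby
input = "aaababaaba"

p input.gsub(/b+/, '')   # "aaaaaaa" (returns a copy; input is unchanged)
p input                  # "aaababaaba"

input.gsub!(/b+/, '')    # in-place variant; returns nil if nothing was replaced
p input                  # "aaaaaaa"
```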

MatchData

I think it is better practice to use .match and MatchData objects rather than to use =~ and refer to funky global variables like $` and $2...


if (matches = "abcde".match(/.c./))
  puts matches.to_s
  puts matches[0]
  puts matches.pre_match
  puts matches.post_match
end
#outputs:
bcd
bcd
a
e


Can treat it like an array...

   m = /(.)(.)(\d+)(\d)/.match("THX1138.")
   m[0]       #=> "HX1138"
   m[1, 2]    #=> ["H", "X"]
   m[1..3]    #=> ["H", "X", "113"]
   m[-3, 2]   #=> ["X", "113"]

captures vs. to_a:

> match_data = 'foo.html'.match(/(.+)\.(\w+)/)
=> #<MatchData:0x7f1e62f2ba48>

> puts match_data.to_a
foo.html
foo
html

> puts match_data[0..-1]
foo.html
foo
html

> puts match_data[0..-1] == match_data.to_a
true

> puts match_data[0..-1] == match_data.captures
false

> puts match_data.captures
foo
html

But I don't want to create a temporary local variable -- especially not one with a long name like match_data!

You could use a shorter name, like matches or m or md.

Or you could bypass that temporary variable altogether, if all you need is, say, the captures...

> basename, extension = 'foo.html'.match(/(.+)\.(\w+)/).captures
=> ["foo", "html"]

Returns the portion of the original string before the current match. Equivalent to the special variable $`.
   m = /(.)(.)(\d+)(\d)/.match("THX1138.")
   m.pre_match   #=> "T"

Returns the portion of the original string after the current match. Equivalent to the special variable $'.
   m = /(.)(.)(\d+)(\d)/.match("THX1138: The Movie")
   m.post_match   #=> ": The Movie"


Returns the array of matches.
   m = /(.)(.)(\d+)(\d)/.match("THX1138.")
   m.to_a   #=> ["HX1138", "H", "X", "113", "8"]

Returns the entire matched string.
   m = /(.)(.)(\d+)(\d)/.match("THX1138.")
   m.to_s   #=> "HX1138"

Match groups

This example [2] uses match group number 2:

  /(.)(.)(.)/.match("abc")[2]   #=> "b"

This example shows how to extract the numerical prefix from a string:

irb -> "013_whatever".match(/[0-9]+/)[0]
    => "013"

irb -> "013_whatever".match(/([0-9]+)_([\w]+)/)[1]
    => "013"
irb -> "013_whatever".match(/([0-9]+)_([\w]+)/)[1..2]
    => ["013", "whatever"]

Character classes

Abbreviation  Short for       Meaning
\d            [0-9]           Digit character
\D            [^0-9]          Nondigit
\s            [ \t\r\n\f]     Whitespace character
\S            [^ \t\r\n\f]    Nonwhitespace character
\w            [A-Za-z0-9_]    Word character
\W            [^A-Za-z0-9_]   Nonword character

Anchors

http://www.rubycentral.com/book/tut_stdtypes.html

The patterns ^ and $ match the beginning and end of a line, respectively. The patterns \b and \B match word boundaries and nonword boundaries, respectively. Word characters are letters, numbers, and underscore.

Multi-line regular expressions

irb -> "line1\nline2"[/.*/]
    => "line1"

irb -> "line1\nline2"[/.*/m]
    => "line1\nline2"
irb -> "
     " <div>
     "   <div>
     "     Contents
     "   </div>
     " </div>
     " "[%r{<div>(.*)</div>}m, 1]
    => "\n  <div>\n    Contents\n  </div>\n"

Greed

Example

irb -> 'prefix1-prefix2-main_filename.rb' =~ /^(.*)-(.*)/; [$1, $2]
    => ["prefix1-prefix2", "main_filename.rb"]

irb -> 'prefix1-prefix2-main_filename.rb' =~ /^(.*?)-(.*)/; [$1, $2]
    => ["prefix1", "prefix2-main_filename.rb"]

Example

irb -> "<div>Contents of 1st div</div><div>Contents of 2nd div</div>"[%r{<div>(.*)</div>}m, 1]
    => "Contents of 1st div</div><div>Contents of 2nd div"

irb -> "<div>Contents of 1st div</div><div>Contents of 2nd div</div>"[%r{<div>(.*?)</div>}m, 1]
    => "Contents of 1st div"

Example: Removing an option from a command-line string

[Command-line options (category)] [Command-line arguments (category)]

Let's say you want to remove option1=? from a list of command-line options, and you don't know what the value of that option ('?') will be.

One's first attempt might look like this (greedy version):

irb -> 'command option1=1 option2=2 option3=3'.gsub(/option1=(.*) /, '')
    => "command option3=3"

but notice how it also removed option2 from the list of options in addition to option1! That's not what we wanted!

Non-greed to the rescue!

irb -> 'command option1=1 option2=2 option3=3'.gsub(/option1=(.*?) /, '')
    => "command option2=2 option3=3"

Now it only matches the minimum necessary before the first space it encounters and then it stops matching. So it matches 'option1=1 '. Perfect. That's exactly what we want.

Side note: This method doesn't work very well if your options' values may contain spaces...


If you wanted to remove option1 including its value (which may include spaces), then that technique probably won't work for you...

irb -> "command option1='1 + 1' option2=2 option3=3".gsub(/option1=(.*?) /, '')
    => "command + 1' option2=2 option3=3"

In that case, you're better off using a full-featured command-line parser. I've used one; I just can't remember the name.

Or, if you're able to use ARGV, that would work too, as that only takes spaces into account and properly turns the command line into a list of arguments. If you can use ARGV, then you would use a different approach than described: you would use reject to remove those elements from ARGV that don't suit your fancy, rather than gsub to remove the respective substring.

Example:

  p ARGV
  puts ARGV.join(' ')

  new_args = ARGV.reject {|it| it =~ /option1=/}
  p new_args
  puts new_args.join(' ')

> command option1='1 + 1' option2=2 option3=3

["command", "option1=1 + 1", "option2=2", "option3=3"]
command option1=1 + 1 option2=2 option3=3

["command", "option2=2", "option3=3"]
command option2=2 option3=3
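
If the command line arrives as a single string rather than through ARGV, the standard library's Shellwords module can split it using shell-like quoting rules, after which the same reject approach applies. A sketch, assuming POSIX-style quoting is what you want:

```ruby
require 'shellwords'

command_line = "command option1='1 + 1' option2=2 option3=3"

# Splits on unquoted whitespace and removes the quotes themselves.
args = Shellwords.split(command_line)
p args       # ["command", "option1=1 + 1", "option2=2", "option3=3"]

new_args = args.reject { |arg| arg =~ /\Aoption1=/ }
p new_args   # ["command", "option2=2", "option3=3"]
```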
 

scan

Example use: Say you copied and pasted a list of methods from an RDoc page and you want to convert that list--which is oddly spaced--from this:

the_oddly_spaced_list = "assert   assert_block   assert_equal   assert_in_delta   assert_instance_of   assert_kind_of   assert_match   assert_nil   assert_no_match   assert_not_equal   assert_not_nil   assert_not_same   assert_nothing_raised   assert_nothing_thrown   assert_operator   assert_raise   assert_raises   assert_respond_to   assert_same   assert_send   assert_throws"

to having one word per line.

the_oddly_spaced_list.scan(/\w+/) {|word| puts word}

Can interpolate strings into regular expressions

irb -> neat_regexp = Regexp.escape('*neat*')
    => "\\*neat\\*"      # a string

irb -> "What a *neat* idea!" =~ /a #{neat_regexp} idea/
    => 5

Converting between strings and regular expressions: Comparison

Method                     Input type  Output type  Escaped?  Notes
Regexp.escape(s)           String      String       yes
String#to_re(s) (Facets)   String      Regexp       yes
Regexp.union(s)            String      Regexp       yes
String#to_rx(s) (Facets)   String      Regexp       no
/#{s}/                     String      Regexp       no
r.to_s                     Regexp      String       N/A       May not be spelled the same if you convert back, but should be equivalent.

Converting strings to regular expressions: Regexp.escape, Regexp.union, and String#to_re

Might be useful if you have some user-supplied input as a string and you need it to be treated as a literal in your regexp -- you want to inoculate the string and remove any special powers that it might otherwise have if it were inserted straight into a regular expression.

irb -> neat_regexp = Regexp.escape('*neat*')
    => "\\*neat\\*"
irb -> "What a *neat* idea!" =~ /a #{neat_regexp} idea/
    => 5
irb -> "What a *neat* idea!" =~ /a #{Regexp.escape("*neat*")} idea/
    => 5
irb -> "What a *neat* idea!" =~ /a \*neat\* idea/
    => 5
irb -> "What a *neat* idea!" =~ /a *neat* idea/
    => nil

It's even useful if you have a string literal in your code (as opposed to from user input) that you want to treat as a regular expression without having to worry about the escaping rules!

This is a bit easier to type:

irb -> require 'facets/core/string/to_re'
irb -> 'Are you *sure*? *Really* sure?' =~ 'Are you *sure*?'.to_re
    => 0

than this:

irb -> 'Are you *sure*? *Really* sure?' =~ /Are you \*sure\*\?/
    => 0

, for example.

The difference between Regexp.escape and String#to_re is that Regexp.escape returns a string (which you'd then have to interpolate into a regular expression), whereas String#to_re skips that step and converts straight into a regular expression: more concise but not as flexible.

Notice:

irb -> 'Are you *sure*?'.to_re
    => /Are\ you\ \*sure\*\?/
irb -> Regexp.escape('Are you *sure*?')
    => "Are\\ you\\ \\*sure\\*\\?"

irb -> 'Are you *sure*? *Really* sure?' =~ Regexp.escape('Are you *sure*?')
TypeError: type mismatch: String given
        from (irb):5:in `=~'
irb -> 'Are you *sure*? *Really* sure?' =~ /#{Regexp.escape('Are you *sure*?')}.*\?$/
    => 0
irb -> 'Are you *sure*? *Really* sure?' =~ /#{'Are you *sure*?'.to_re.to_s}.*\?$/
    => 0

It looks like Regexp.union() actually does the same thing as String#to_re:

irb -> /#{Regexp.escape('*.*')}/
    => /\*\.\*/

irb -> Regexp.union('*.*')
    => /\*\.\*/

May not be spelled the same if you convert back, but should be equivalent

In general, converting a Regexp to a String produces output that is not optimized for prettiness. The string has to record all of the flags, even the default ones, to make sure no information is lost during the conversion.

irb -> %r{#{ /(?-mix:.*)/.to_s }}
    => /(?-mix:.*)/

irb -> %r{#{ /(?i-mx:.*)/.to_s }}
    => /(?i-mx:.*)/

but:

irb -> %r{ #{ /.*/.to_s } }
    => / (?-mix:.*) /


Bug in Regexp#to_s ?

Unfortunately, Regexp#to_s doesn't appear to work properly...

irb -> 'Are you *sure*? *Really* sure?' =~ /#{'Are you *sure*?'.to_re.to_s}/
    => 0

but...

irb -> 'Are you *sure*? *Really* sure?' =~ 'Are you *sure*?'.to_re.to_s.to_re
    => nil
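
This is arguably not a bug in Regexp#to_s. to_re escapes its receiver, so calling it on the output of to_s also escapes the (?-mix:...) wrapper, and the result only matches that literal text. The sketch below reproduces the effect with core Ruby only; Regexp.new stands in for an unescaped conversion and Regexp.escape for the escaping that to_re performs (that substitution is my assumption, not the Facets implementation):

```ruby
text = 'Are you *sure*? *Really* sure?'
re   = Regexp.new(Regexp.escape('Are you *sure*?'))

puts re.to_s   # the pattern wrapped in (?-mix:...)

# Unescaped round trip: the wrapper is read as regexp syntax again.
p text =~ Regexp.new(re.to_s)                  # 0

# Escaped round trip: the wrapper must now appear literally in the text.
p text =~ Regexp.new(Regexp.escape(re.to_s))   # nil
```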

^ and $ can be used in subexpressions

Example

Say we had this input:

input = ["processor", "processing", "process", "process_with_fluff", "process_without_fluff"]

and want this as output:

["process", "process_with_fluff", "process_without_fluff"]

How would we do it?

These don't work:

irb -> input.grep /^process/
    => ["processor", "processing", "process", "process_with_fluff", "process_without_fluff"]

irb -> input.grep /^process_/
    => ["process_with_fluff", "process_without_fluff"]

irb -> input.grep /^process$/
    => ["process"]

irb -> input.grep /^(process_|process)$/
    => ["process"]

Ah, but this does!:

irb -> input.grep /^(process_|process$)/
    => ["process", "process_with_fluff", "process_without_fluff"]

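# For this input the parentheses can be dropped, but the two patterns are not strictly equivalent: without them
# only the first alternative is anchored by ^, so the pattern below would also match a name like "preprocess".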
irb -> input.grep /^process_|process$/         
    => ["process", "process_with_fluff", "process_without_fluff"]

# But this is the best / most concise solution of them all...
irb -> input.grep /^process(_|$)/
    => ["process", "process_with_fluff", "process_without_fluff"]

We want all (method) names that either start with "process_" (a prefix) or are exactly "process".

([Application (category)]): How to match an exact filename, which may be part of a larger path. (example of "^ and $ can be used in subexpressions")

# This means the filename has to come directly after a '/' character (\/) or has to be the beginning of the path (^).
irb -> exact_filename_re = /(\/|^)filename\.rb$/
    => /(\/|^)filename\.rb$/

irb -> '/a/really/long/path/filename.rb' =~ exact_filename_re
    => 19

irb -> 'filename.rb' =~ exact_filename_re
    => 0

# But if it's part of a longer filename, it won't match, which is what we want.
irb -> 'a_longer_filename.rb' =~ exact_filename_re
    => nil

(This is useful in conjunction with an include/exclude pattern for FileList, since I've had problems with other matching methods, such as globbing and "plain strings".)

/^...$/ vs. /\A...\z/

Controller User Input Validation » Ruby on Rails Security Blog (http://www.rorsecurity.info/2007/05/29/controller-user-input-validation/) (2007-05-29). Retrieved on 2007-05-11 11:18.


# A file name may be alphanumerical and may contain .-+_
file = parseparam( params[:file], "", "str", nil, /^[\w\.\-\+]+$/)

The last example seems to validate for a valid file name, however it is prone to user agent injection, a file name with embedded JavaScript, such as file.txt\%0A<script>alert('hello')</script>, passes the filter. This is due to the widespread belief that ^ matches the beginning of a string and $ the end, as in other programming languages. In Ruby, however, these characters match the beginning and end of a line, so the above string passes the filter, as it contains a line break (%0A). The correct sequences for Ruby are \A and \z, so the expression from above should read /\A[\w\.\-\+]+\z/.

irb -> "line1
     " line2" =~
       /^line1$/
    => 0              # Matches

irb -> "line1
     " line2" =~
       /\Aline1\z/
    => nil            # Doesn't match
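
Applied to the file name filter quoted above, the difference looks like this. A minimal sketch; the constant names and the input string are made up for illustration:

```ruby
LINE_ANCHORED   = /^[\w.\-+]+$/     # ^ and $ match at line boundaries
STRING_ANCHORED = /\A[\w.\-+]+\z/   # \A and \z match only at string boundaries

malicious = "file.txt\n<script>alert('hello')</script>"

p malicious =~ LINE_ANCHORED      # 0   (the first line alone satisfies ^...$)
p malicious =~ STRING_ANCHORED    # nil (the whole string has to match)
p 'file.txt' =~ STRING_ANCHORED   # 0
```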



 


Collections (arrays, sets, hashes, ...)

Mathematical (set theory) operators mapped to Ruby operators

See w:Set

Mathematics  Name          Ruby
∪            union         | (or)
∩            intersection  & (and)

Arrays are ordered; Sets are not

irb> [1,2] == [2,1]
=> false

irb> require 'set'
=> true
irb> [1, 2].to_set == [2, 1].to_set
=> true

Arrays: how to access a subset of

irb> [10,20,30,40,50][0..1]
=> [10, 20]

irb> [10,20,30,40,50][1..2]
=> [20, 30]
# Get the first 3 elements from the array
irb> [10,20,30,40,50][0..2]
=> [10, 20, 30]
# It's okay if it has < 3 elements
irb> [10][0..2]
=> [10]

# Even more intuitive...
irb> [10,20,30,40,50].first(3)
=> [10, 20, 30]
irb> [10].first(3)
=> [10]

Get all elements starting with the nth element:

irb(main):005:0> [0,1,2,3,4][2..-1]
=> [2, 3, 4]

Comparison with Python: In Python, you could simply do [2:] to get all elements starting with element 2.
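
Ruby has close equivalents. Array#drop skips the first n elements, and Ruby 2.6 added endless ranges, so [2..] reads much like Python's [2:]. A short sketch:

```ruby
a = [0, 1, 2, 3, 4]

p a.drop(2)   # [2, 3, 4]
p a[2..]      # [2, 3, 4] (Ruby 2.6 and later)
p a[2..-1]    # [2, 3, 4]
```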

Everything but a certain element

You need Ruby Facets installed to use this method.

irb -> require 'rubygems'
irb -> require 'facet/array/delete_values_at'
    => true

irb -> a = ["good", "bad", "good", "bad"]
    => ["good", "bad", "good", "bad"]

irb -> a.delete_values_at(1, 3)
    => ["bad", "bad"]

irb -> a
    => ["good", "good"]
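
Without Facets, a non-destructive equivalent can be sketched with core methods. Note that this differs from delete_values_at: it returns the kept elements and leaves the original array untouched. The helper name without_indexes is hypothetical.

```ruby
# Hypothetical helper: returns a copy of array without the elements at the given indexes.
def without_indexes(array, *indexes)
  array.reject.with_index { |_value, index| indexes.include?(index) }
end

a = ["good", "bad", "good", "bad"]
p without_indexes(a, 1, 3)   # ["good", "good"]
p a                          # ["good", "bad", "good", "bad"]
```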

Or use *= syntax. But I think that's only available through an extension to Hash (Hash doesn't have a * method by default).

Example from http://ratchets.rubyforge.org/manual.html

        keys = keys.to_h
        keys *= {
          :message => info.message
        }
        keys *= (info[:simple] || {})

Hash: How do I add two hashes together?

irb -> {"a"=>"A", "b"=>"?"}.merge( {"b"=>"B!"} )
    => {"a"=>"A", "b"=>"B!"}

Arrays: How do I add to an array?

irb -> a = [1,2,3]
    => [1, 2, 3]

irb -> a += [4,5]
    => [1, 2, 3, 4, 5]

irb -> a = [1,2,3]
    => [1, 2, 3]

irb -> a.concat [4,5]
    => [1, 2, 3, 4, 5]

#But:
irb -> a = [1,2,3]
    => [1, 2, 3]

irb -> a << [4,5]
    => [1, 2, 3, [4, 5]]

<</push vs. concat

First, observe that << and push do the same thing when given a single argument (push can also take several arguments at once). << makes for nicer-reading syntax, though: as an operator it can be chained without parentheses, whereas chained push calls need them...

irb -> [1, 2] << [2, 3]
    => [1, 2, [2, 3]]
irb -> [1, 2].push [2, 3]
    => [1, 2, [2, 3]]

irb -> [1, 2] << [2, 3] << [2, 3]
    => [1, 2, [2, 3], [2, 3]]
irb -> ([1, 2].push [2, 3]).push [2, 3]
    => [1, 2, [2, 3], [2, 3]]

irb -> [1, 2].push [2, 3].push [2, 3]
(irb):25: warning: parenthesize argument(s) for future version
    => [1, 2, [2, 3, [2, 3]]]

They are different from += and concat in at least a couple of ways

  • <</push can easily be chained together; += cannot (because the lvalue of an assignment can only be a variable, not an expression)
irb -> a = [1, 2]; a += [2, 3] += [2, 3]
SyntaxError: compile error
(irb):29: syntax error, unexpected tOP_ASGN, expecting $end
a = [1, 2]; a += [2, 3] += [2, 3]
  • If passed an array as its argument, +/|/concat adds the elements of that array, whereas <</push adds the array itself
irb -> a = [1, 2]; a += [2, 3]; a += [2, 3]
    => [1, 2, 2, 3, 2, 3]
irb -> a = [1, 2]; a << [2, 3] << [2, 3]
    => [1, 2, [2, 3], [2, 3]]

(make a table to summarize)

Here's an example of how you need to be careful and intentional about whether you want to add the array as an element or the elements of the array as elements...

Let's say the user can supply an array of revisions which will be used and then you will call a command-line program with those revisions as arguments to the --revisions option... Then you don't want any nested arrays; it needs to be a flat array. So you can either use << and then flatten it or just use concat (can't use += due to the lvalue not being a variable)...

irb -> revisions = ['14', '47', '90']
    => ["14", "47", "90"]

irb -> args = []; args << '--revisions' << revisions
    => ["--revisions", ["14", "47", "90"]]

irb -> args = []; (args << '--revisions').concat revisions
    => ["--revisions", "14", "47", "90"]

irb -> args = []; args << '--revisions' << revisions; args.flatten!
    => ["--revisions", "14", "47", "90"]

Now for a slightly more advanced example where those techniques won't work... Suppose you actually want to have a nested array: you want the argument following '--revisions' to remain an array. But you want to add some more elements after that and you want to just add those as 'elements' (not as a single element that is the array). If we just indiscriminately used flatten, then we would lose all of the nesting, even though we wanted to retain some of it...

irb -> user_supplied_args = ['--verbose', '--style', 'pretty']
    => ["--verbose", "--style", "pretty"]

irb -> args = []; args << '--revisions' << revisions << user_supplied_args
    => ["--revisions", ["14", "47", "90"], ["--verbose", "--style", "pretty"]]

Not what I want... I want ["--revisions", ["14", "47", "90"], "--verbose", "--style", "pretty"].

irb -> args = []; args << '--revisions' << revisions << user_supplied_args; args.flatten!
    => ["--revisions", "14", "47", "90", "--verbose", "--style", "pretty"]

Still not what I want. It flattened the inner array that I didn't want to flatten.

Solution: << for when I want to add an array as an array; concat when I just want to add the elements of an array.

irb -> args = []; (args << '--revisions' << revisions).concat user_supplied_args
    => ["--revisions", ["14", "47", "90"], "--verbose", "--style", "pretty"]

How does it handle duplicates?

The | (set union) operator will remove duplicates; + and concat will not (but you could always call uniq! to remove duplicates).

Use + if you want all elements to be retained (even duplicates)

irb -> [1] + [1, 2]
    => [1, 1, 2]

but

irb -> [1] | [1, 2]
    => [1, 2]

So you'd probably want to use +, for example, if you had an array of arguments that you were adding user-supplied arguments to. You wouldn't want the user's arguments to be ignored just because they were the same as something already in the array...

irb -> def my_exec(*args)
         args = ['cat'] + args
         p args   # exec *args
       end

irb -> my_exec 'cat'
["cat", "cat"]
    => nil

Use | if you don't want duplicates

irb -> a = [1, 2, 3]
    => [1, 2, 3]

irb -> a += [3, 4, 5]
    => [1, 2, 3, 3, 4, 5]
irb -> a = [1, 2, 3]
    => [1, 2, 3]

irb -> a |= [3, 4, 5]
    => [1, 2, 3, 4, 5]

+= and concat are the same?

irb -> a = [1, 2]
    => [1, 2]

irb -> a.concat [2, 3]
    => [1, 2, 2, 3]

irb -> a = [1, 2]
    => [1, 2]

irb -> a += [2, 3]
    => [1, 2, 2, 3]
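
The results look the same, but there is one difference worth knowing: concat modifies the receiver in place, while += builds a new array and rebinds the variable. Any other reference to the original array only sees the change made by concat. A sketch:

```ruby
a = [1, 2]
other_ref_to_a = a
a.concat [2, 3]
p other_ref_to_a            # [1, 2, 2, 3]  (same object, so it sees the change)
p a.equal?(other_ref_to_a)  # true

b = [1, 2]
other_ref_to_b = b
b += [2, 3]
p b                         # [1, 2, 2, 3]
p other_ref_to_b            # [1, 2]        (still the old object)
p b.equal?(other_ref_to_b)  # false
```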

Uniq

irb -> ([1, 2, 3] + [3, 4, 5]).uniq
    => [1, 2, 3, 4, 5]

irb -> [1, 2, 3].concat([3, 4, 5]).uniq
    => [1, 2, 3, 4, 5]

An example

We have to include the || [] here because Hash#delete returns nil if the requested key is not found.

(It would be nice if NilClass were smart enough to behave as [] when you do array operations (like |) on it. But how would it know that it's being passed as an argument to Array#| ? I guess Array#| could/might call arg.to_array ... that might work.)

  def require_all(start_dir, options = {})
    exceptions = [/^all\.rb$/] | (options.delete(:except) || [])
    FileList[start_dir + "/**/*.rb"].each do |filename|
      unless exceptions.any? {|e| File.basename(filename) =~ e}
        require filename
      end
    end
  end
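
An alternative to the || [] guard is Kernel#Array, which turns nil into [], wraps a single non-array value, and leaves arrays as they are. A sketch; the helper name exceptions_for is hypothetical, and it reads the option with [] rather than delete to keep the example side-effect free:

```ruby
# Hypothetical helper: always excludes all.rb, plus whatever the caller passed in :except.
def exceptions_for(options)
  [/^all\.rb$/] | Array(options[:except])
end

p exceptions_for({})                              # [/^all\.rb$/]
p exceptions_for(:except => /ignore/)             # [/^all\.rb$/, /ignore/]
p exceptions_for(:except => [/ignore/, /tmp/])    # [/^all\.rb$/, /ignore/, /tmp/]
```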

Arrays: Miscellaneous

irb -> (a=[1, 2, 3]).zip(a)
    => [[1, 1], [2, 2], [3, 3]]

Hashes

Neat! You can even use nil as a hash key!

irb -> {nil => 1, 2 => 3}
    => {nil=>1, 2=>3}

irb -> {nil => 1, 2 => 3}[nil]
    => 1

any? [also filed under: Iterators]

irb -> [/^\./, /bad/].any? {|regexp| ".hidden_file" =~ regexp}
    => true

irb -> [/^\./, /bad/].any? {|regexp| "good" =~ regexp}
    => false

Great example:

  # Example:
  #   require_all File.dirname(__FILE__), :except => [/ignore/]
  def require_all(start_dir, options = {})
    exceptions = options.delete(:except) || [/^all\.rb$/]
    FileList[start_dir + "/**/*.rb"].each do |filename|
      unless exceptions.any? {|e| File.basename(filename) =~ e}
        require filename
      end
    end
  end

It reads almost like English: Unless the file being considered matches any of the patterns that we want to exclude, require it.

[Example] An array of strings is more flexible than just a single string

Consider this example, which adds to the options array of the RDocTask object ...

  def self.rdoc_task(&block)
    module_eval do
      #-------------------------------------------------------------------------------------------
      desc 'Generate RDoc'
      Rake::RDocTask.new(:rdoc) do |rdoc|
        rdoc.rdoc_dir = 'doc'
        rdoc.options << '--line-numbers' << '--inline-source' << '--all' << '--diagram'
        rdoc.options << '--extension' << 'rake=rb'
        puts rdoc.options.inspect
        rdoc.template = './doc_include/template/qualitysmith.rb'
        rdoc.rdoc_files.include(
          'README',
          'lib/**/*.rb'
        )
        yield rdoc
      end
    end
  end

This results in rdoc.options having:

["--line-numbers", "--inline-source", "--all", "--diagram", "--extension", "rake=rb"]

What's neat is that the block that we yield to can actually remove options fairly easily...

rdoc.options.delete("--diagram")

Well I thought it was neat. Wouldn't work so great for removing the "--extension", "rake=rb" option, though, which had to be 2 separate items in the array in order for it to work properly.
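
A two-element option like that can still be removed by locating the flag and slicing out both it and its value. A sketch, assuming the value always directly follows the flag; the helper name remove_option_with_value is hypothetical:

```ruby
# Hypothetical helper: removes flag and the element right after it, in place.
# Returns the options array; does nothing if the flag is absent.
def remove_option_with_value(options, flag)
  index = options.index(flag)
  options.slice!(index, 2) if index
  options
end

options = ["--line-numbers", "--inline-source", "--all", "--diagram", "--extension", "rake=rb"]
p remove_option_with_value(options, "--extension")
# ["--line-numbers", "--inline-source", "--all", "--diagram"]
```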

Processes / forking external processes / etc.

Ruby / Process management


How do I execute an external program?

method | stdout? | stderr? | stdin? | real-time? | comments | process id? | exit code?
exec("command") | no | no | no | yes | Simple, non-interactive invocation; replaces the current process with the new process (in other words, any Ruby code after the exec will not be executed!) | NA | NA
system("command") | no | no | no | yes | Simple, non-interactive invocation; waits until execution is done; outputs both stdout and stderr as normal | NA | $?.exitstatus
result = `command` | yes | no, unless you do 2>&1 | no | buffered: output is returned only when the command has finished/exited | Same as above, only it captures the output of that process. | NA | $?.exitstatus
pipe = IO.popen("command", "r") | yes | no, unless you do 2>&1 | no | yes: can even read a char or a line at a time, if you want | Interactive control of other process (write to its stdin, and then read from its stdout) | pipe.pid | $?.exitstatus
pipe = IO.popen("command", "w+") | yes | no, unless you do 2>&1 | yes | "" | "" | "" | ""
Open3.popen3("command") | yes | yes | yes | yes | Very similar to IO.popen. | |
exec("command") if fork.nil? | NA | NA | NA | yes | Starts a child process running concurrently (in the "background"). | |

What happens to standard error (stderr)?

For those commands that only capture stdout (`...`, IO.popen), you should probably choose one of two options:

  • Discard stderr (result = `command 2>/dev/null`)
  • Redirect stderr to stdout (result = `command 2>&1`)

The default behavior (if you don't do either of those) is for the standard error to just appear on screen as it normally would if you had run the program independently (not from a script). This can be pretty odd/confusing, though, if you were attempting to silently capture the output of some command but you still got some output from it. It's not always clear where the output is coming from (your script or the command you're trying to capture). That's why I prefer to either discard it or capture it. But you have to remember to do one or the other, or it may just appear on screen instead!
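The two options can be sketched side by side. The helper names here are made up for illustration, and the sketch assumes a POSIX shell is available (the command is wrapped in a subshell so the redirection applies to all of it):

```ruby
# Hypothetical helper: capture stdout only, throwing stderr away.
def capture_discarding_stderr(command)
  `(#{command}) 2>/dev/null`
end

# Hypothetical helper: capture stdout and stderr together in one string.
def capture_with_stderr(command)
  `(#{command}) 2>&1`
end

noisy = "echo out; echo err 1>&2"   # writes one line to each stream
p capture_discarding_stderr(noisy)  # "out\n"
p capture_with_stderr(noisy)        # "out\nerr\n"
```

Either way, nothing from the command should leak onto the screen.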

popen

How do I use popen?

IO.popen("other_program", "w+") do |pipe|
  pipe.puts "here, have some input"
  pipe.close_write  # If other_program process doesn't flush its output, you probably need to use this to send an end-of-file, which tells other_program to give us its output. If you don't do this, the program may hang/block, because other_program is waiting for more input.
  output = pipe.read
end

# You can also use the return value from your block. (exit code stored in $? as usual)
output = IO.popen("other_program", "w+") do |pipe|
  pipe.puts "here, have some input"
  pipe.close_write
  pipe.read
end

http://ruby-doc.org/core/classes/IO.html#M002294

Runs the specified command string as a subprocess; the subprocess’s standard input and output will be connected to the returned IO object. If cmd_string starts with a ``-’’, then a new instance of Ruby is started as the subprocess. The default mode for the new file object is ``r’’, but mode may be set to any of the modes listed in the description for class IO. If a block is given, Ruby will run the command as a child connected to Ruby with a pipe. Ruby’s end of the pipe will be passed as a parameter to the block. In this case IO::popen returns the value of the block. If a block is given with a cmd_string of ``-’’, the block will be run in two separate processes: once in the parent, and once in a child. The parent process will be passed the pipe object as a parameter to the block, the child version of the block will be passed nil, and the child’s standard in and standard out will be connected to the parent through the pipe. Not available on all platforms.

(Example) Getting child process id

irb -> pipe = IO.popen('uname')
    => #<IO:0xb7dd19b0>

irb -> pipe.readlines
    => ["Linux\n"]

irb -> "Parent is #{Process.pid}"
    => "Parent is 27577"

irb -> "popen's child process is #{pipe.pid}"
    => "popen's child process is 31914"

irb -> pipe.close
    => nil

irb -> "popen's child process is #{pipe.pid}"
IOError: closed stream

What if I want to use popen to capture stderr too?

If you just want it combined with stdout, you can simply redirect the stderr stream into the stdout stream (2>&1).

If you want to capture them separately, check out Open3.

http://ruby-doc.org/core/classes/Open3.html#M005449

  Open3.popen3('nroff -man') { |stdin, stdout, stderr| ... }
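A slightly fuller sketch of capturing the two streams separately with Open3.popen3. The helper name run_capturing is hypothetical, and the sketch assumes the command produces only a small amount of output (reading one stream to the end while the other fills up could block with large outputs):

```ruby
require 'open3'

# Hypothetical helper: returns [stdout_text, stderr_text] for a command.
def run_capturing(command)
  Open3.popen3(command) do |stdin, stdout, stderr|
    stdin.close                    # we have nothing to send to the command
    [stdout.read, stderr.read]     # read each stream until end-of-file
  end
end

out, err = run_capturing("echo to_stdout; echo to_stderr 1>&2")
p out   # "to_stdout\n"
p err   # "to_stderr\n"
```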

How long will it wait?

If you call read (etc.), it will wait until there is more output or the command finished (EOF)

Be careful, because your process will block if there is no more output!

Bad idea:

sleeper = IO.popen("sleep 10000"); puts sleeper.read

You will be waiting for a long time. sleep never produces any output, and (if you read your docs carefully) read "Reads at most length bytes from the I/O stream, or to the end of file if length is omitted or is nil." The end of file doesn't arrive until sleep exits... so read blocks and waits for it.

Equally bad idea, because sleep will never produce any output, not even 1 character of output.

sleeper = IO.popen("sleep 10000"); puts sleeper.read(1)

Better idea:

irb -> sleeper = IO.popen("sleep 1"); p sleeper.read
""
    => nil

Even though the sleep doesn't return any output, at least read will stop blocking when the subcommand terminates.

In the following example, read(1) causes the process to block until the subprocess flushes its output. It ends up taking a full 5 seconds before we get any input.

irb -> countdown = IO.popen("ruby -e '5.downto(0) {|i| puts i; sleep 1}'"); countdown.read(1)
5
4
3
2
1
0
    => nil

By contrast, this returns immediately, because $stdout.sync = true tells Ruby to flush the output immediately every time you try to output something:

irb -> countdown = IO.popen("ruby -e '$stdout.sync = true; 5.downto(0) {|i| puts i; sleep 1}'"); countdown.read(1)
    => "5"

In the case of a bi-directional pipe ('w+') (stdout+stdin)... be sure you call close_write!

Good idea:

irb -> IO.popen("grep food", "w+") { |pipe| pipe.puts "have some food"; pipe.close_write; output = pipe.read }
    => "have some food\n"

Bad idea:

irb -> IO.popen("grep food", "w+") { |pipe| pipe.puts "have some food"; output = pipe.read }
[Will block, waiting for input. Have to Ctrl-C to interrupt.]
IRB::Abort: abort then interrupt!!

When you close a popened pipe and that process is still running, your process will wait for it...

sleeper = IO.popen("sleep 10000") # The child process goes to sleep
sleeper.close
# The parent process will now wait for the sleep command to finish...

Sometimes you want to have more (interactive) control over the subprocesses that you start up.

You could always use the timeout library and Process.kill the child process if it takes too long.

You might also consider using fork, or using IO.select (with a timeout) to check if any more output is available...
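Here is a rough sketch of the IO.select idea: wait a limited time for the child to say something, and kill it if it stays silent. The helper name first_output_or_nil and the 4096-byte chunk size are my own choices for illustration:

```ruby
# Hypothetical helper: returns the first chunk of output from command, or nil
# if nothing arrives within `seconds` (in which case the child is killed).
def first_output_or_nil(command, seconds)
  pipe = IO.popen(command)
  if IO.select([pipe], nil, nil, seconds)
    begin
      pipe.readpartial(4096)          # whatever is available right now
    rescue EOFError
      ""                              # the child exited without printing anything
    end
  else
    Process.kill("TERM", pipe.pid)    # give up on the child
    nil
  end
ensure
  pipe.close if pipe                  # waits for the child, so no zombie is left
end

p first_output_or_nil("echo hi", 5)       # "hi\n"
p first_output_or_nil("sleep 10", 0.2)    # nil
```

Note that in the first branch the final pipe.close still waits for the child to finish on its own; only the silent case kills it.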

How do you close your pipe and kill that process without waiting for it?

irb -> pipe = IO.popen('sleep 1000')
    => #<IO:0xb7d4cd78>

irb -> pipe.pid
    => 28120

irb -> Process.kill 'TERM', pipe.pid

irb -> pipe.close

How do I read up until a certain input is reached and then exit the popen?

Simple way

def read_until(pipe, stop_at, verbose = true)
  lines = []
  while line = pipe.gets
    break if line =~ stop_at
    puts line if verbose
    lines << line
  end
  lines
end

IO.select way

I haven't figured out yet when you would need to use this method, but this is the method used by Jamis Buck's GDB wrapper (http://weblog.jamisbuck.org/assets/2006/9/25/gdb.rb), so it must have some use, right?

def read_until(pipe, stop_at, verbose = true)
  lines = []
  line = ""
  while result = IO.select([pipe])  #, nil, nil, 10)
    next if result.empty?

    c = pipe.read(1)
    break if c.nil?

    line << c
    break if line =~ stop_at

    # Start a new line?
    if line[-1] == ?\n
      puts line if verbose
      lines << line
      line = ""
    end
  end
  lines
end

The online RDocs for IO.select and Kernel.select both say to "See Kernel#select."

So instead, I will rely on my trusty Pickaxe book, p. 528-529 [3]:

IO.select(read_array [, write_array [, error_array [, timeout]]]) => array or nil

Performs a low-level select call, which waits for data to become available from input/output devices. The first three parameters are arrays of IO objects or nil. The last is a timeout in seconds, which should be an Integer or a Float. The call waits for data to become available for any of the IO objects in read_array, for buffers to have cleared sufficiently to enable writing to any of the devices in write_array, or for an error to occur on the devices in error_array. If one or more of these conditions are met, the call returns a three-element array containing arrays of the IO objects that were ready. Otherwise, if there is no change in status for timeout seconds, the call returns nil. If all parameters are nil, the current thread sleeps forever.

select( [$stdin], nil, nil, 1.5 )       »       [[#<IO:0x401ba090>], [], []]





Forking processes

How do I fork a child process?

Depends on whether you want to run Ruby code (same program) or an external program in your child process...

# Run an external program (command) in the child process
exec(command) if fork.nil?
# ...
Process.wait

-or-

# Run some Ruby code (same program) in the child process
fork do
  puts "In child process. pid is #$$, parent pid is #{Process.ppid}"
  exit 99
end
child_pid = Process.wait
puts "Child (pid #{child_pid}) terminated with status #{$?.exitstatus}"

http://corelib.rubyonrails.org/classes/Kernel.html#M002088

The parent process should use Process.wait to collect the termination statuses of its children or use Process.detach to register disinterest in their status; otherwise, the operating system may accumulate zombie processes.
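Both choices from that paragraph can be sketched briefly. The helper name exit_status_of_child is hypothetical, and the sketch uses exit! in the child so that any at_exit handlers inherited from the parent are skipped (fork is not available on every platform):

```ruby
# Hypothetical helper: fork a child that exits with `code`, wait for it,
# and return the exit status the parent collected.
def exit_status_of_child(code)
  pid = fork { exit!(code) }
  Process.wait(pid)          # collects the status, so no zombie remains
  $?.exitstatus
end

p exit_status_of_child(7)    # 7

# Not interested in the status? Detach instead of waiting.
pid = fork { exit!(0) }
Process.detach(pid)          # a background thread reaps the child for us
```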

Using IO.popen('-')

irb -> "Parent process is #{Process.pid}"
    => "Parent process is 27577"

irb -> IO.popen('-') {|pipe| $stderr.puts "#{Process.pid} is here, pipe is #{pipe}"}
32579 is here, pipe is
27577 is here, pipe is #<IO:0xb7d700ac>

Getting process id

http://blade.nagaokaut.ac.jp/cgi-bin/scat.rb/ruby/ruby-talk/9105


> |How can I get the PID of a program run in a ruby script. I've 
> |looked at open, popen, system, and %x//, and various Process 
> |methods, but haven't figured it out.
 
> IO#pid for open, popen.  No way provided for system and `` (yet).

http://www.hostingforum.ca/262611-getting-pid-external-command.html

def my_system(*cmd)
  pid = fork do
    exec(*cmd)
    exit! 127
  end
  yield pid if block_given?
  Process.waitpid(pid)
  $?
end

irb -> my_system("sleep 3") { |pid| 
  puts "other process id is #{pid}"
  Process.kill "TERM", pid 
}
other process id is 22586
    => #<Process::Status: pid=22586,signaled(SIGTERM=15)>




[Caveats (category)] Ruby's `...`, system, and exec require executable scripts (non-binaries) to have shebang line!

Bash doesn't care! As long as the executable bit is set, it will be happy to run it for you!

> echo -e "echo 'bash_hi'" > bash_hi_no_shebang; chmod 0700 bash_hi_no_shebang
> cat bash_hi_no_shebang
echo 'bash_hi'
> /home/tyler/bash_hi_no_shebang
bash_hi

Ruby, however, is a bit more picky about what it will run for you. You will get a mysterious command not found if it doesn't like your script.

irb -> `/home/tyler/bash_hi_no_shebang`
(irb):37: command not found: /home/tyler/bash_hi_no_shebang
    => ""

! Even though the command (the file) does exist!

But this does work from ruby:

[tyler: ~]> echo -e '#!'"/bin/bash\necho 'bash_hi'" > bash_hi; chmod 0700 bash_hi
[tyler: ~]> /home/tyler/bash_hi
bash_hi

irb -> `/home/tyler/bash_hi`
    => "bash_hi\n"

The shebang line makes all the difference!

Which techniques work to start/fork a process that needs to interact with the user (as in, a text editor)?

Let's face it, some command-line programs are interactive and need full access to the [terminal]. They are "[full-screen]" and may do fancy [ANSI] tricks. Examples: editors, top.

For these cases, not all methods of starting a process work equally well. system is the technique that works the best.

These work:

  • system('vim') (seems like the safest bet)
  • exec('editor') if fork.nil? (with varying degrees of success)
  • exec('vim') (but you can't execute code after that)

These do not work:

  • `vim`

Details:

irb -> system('vim temp'); p File.readlines('temp')
[it started up vim, where I wrote "hi there", saved, and exited]
["hi there\n"]
    => nil

irb -> `vim`
Vim: Warning: Output is not to a terminal
[Then it just hung there, not giving any feedback at all when I typed. Ctrl-C, Ctrl-D, etc. were all ineffectual. The only thing I could do at this point was Ctrl-Z.]
> pkill -9 irb
[11]+  Killed                  irb

Warning: Don't run exec('vim') if fork.nil? from within irb. That had the weirdest behavior of them all. I really do not understand what it did to the terminal, but it really screwed it up visually, so it was hard to tell what you were doing or even what program you were doing it in. I was lucky to be able to exit out of everything alive (and by alive I mean successfully).

irb -> exec('editor') if fork.nil?
[a bunch of guessing, trying various exit commands, and eventually escaping]

When I finally got out of irb, a reset command was necessary to fix my terminal.

It seemed to work fine though if just executed with straight ruby (not irb):

> cat temp.rb
exec('vim temp') if fork.nil?
Process.wait
p File.readlines('temp')
> ruby temp.rb
[fired up vim]
"temp" 1L, 9C written
["hi there\n"]

Timing out an operation

http://pleac.sourceforge.net/pleac_ruby/processmanagementetc.html

# implemented thanks to http://blade.nagaokaut.ac.jp/cgi-bin/scat.rb/ruby/ruby-talk/1760
require 'timeout'

begin
    Timeout.timeout(5) {
        waitsec = rand(10)
        puts "Let's see if a sleep of #{waitsec} seconds is longer than 5 seconds..."
        system("sleep #{waitsec}")
    }
    puts "Timeout didn't occur"
rescue Timeout::Error    
    puts "Timed out!"
end

Signals

Sending a Signal

http://pleac.sourceforge.net/pleac_ruby/processmanagementetc.html

Process.kill(9, pid)                    # send pid a signal 9
Process.kill(-1, Process.getpgrp())     # send whole job a signal 1
Process.kill("USR1", $$)                # send myself a SIGUSR1
Process.kill("HUP", pid1, pid2, pid3)   # send a SIGHUP to each of the listed pids
#-----------------------------
begin
    Process.kill(0, minion)
    puts "#{minion} is alive!"
rescue Errno::EPERM                     # changed uid
    puts "#{minion} has escaped my control!";
rescue Errno::ESRCH
    puts "#{minion} is deceased.";      # or zombied
rescue
    puts "Odd; I couldn't check the status of #{minion} : #{$!}"
end

Installing a Signal Handler (trap)

http://pleac.sourceforge.net/pleac_ruby/processmanagementetc.html

Kernel.trap("QUIT", got_sig_quit)       # got_sig_quit = Proc.new { puts "Quit\n" }
trap("PIPE", "got_sig_pipe")            # def got_sig_pipe ...
trap("INT") { ouch += 1 }               # increment ouch for every SIGINT
#-----------------------------
trap("INT", "IGNORE")                   # ignore the signal INT
#-----------------------------
trap("STOP", "DEFAULT")                 # restore default STOP signal handling

http://svn.tylerrick.com/public/ruby/examples/trap-chaining_two_signal_handlers.rb

# See ~/code/gemables/qualitysmith_extensions/lib/qualitysmith_extensions/kernel/trap_chain.rb for the final product

def trap_chain(*args, &block)
  previous_interrupt_handler = trap("INT") {}
  trap("INT") do
    block.call
    previous_interrupt_handler.call unless previous_interrupt_handler == "DEFAULT"
  end
end
trap_chain("INT") { puts "Handler 1"; }
trap_chain("INT") { puts "Handler 2"; puts "Exiting..."; exit }
trap_chain("INT") { puts "Handler 3" }
trap_chain("INT") { puts "Handler 4" }

Process.kill "INT", 0

loop do
  # Press Ctrl-C to interrupt this loop
end

# Outputs:
# Handler 4
# Handler 3
# Handler 2
# Exiting...

Temporarily Overriding a Signal Handler

http://pleac.sourceforge.net/pleac_ruby/processmanagementetc.html

# the signal handler
def ding
    trap("INT", "ding")
    puts "\aEnter your name!"
end

# prompt for name, overriding SIGINT
def get_name
    save = trap("INT", "ding")

    puts "Kindly Stranger, please enter your name: "
    name = gets().chomp()
    trap("INT", save)
    name
end

Links

http://pleac.sourceforge.net/pleac_ruby/processmanagementetc.html

 


Input/output

Ruby / Input/output edit

How to write to a file

irb -> require 'tmpdir'
    => true

irb -> Dir.tmpdir
    => "/tmp"

irb -> File.open(my_filename = "#{Dir.tmpdir}/my_file", "w") { |file| file.puts "woo!" }
    => nil

irb -> File.open(my_filename, "r") { |file| puts file.gets }
woo!
    => nil

irb -> File.delete(my_filename)
    => 1

How to find out the location of the "temp" directory

irb -> require 'tmpdir'
    => true

irb -> Dir.tmpdir
    => "/tmp"

How to read in a file

The quickest and easiest way is probably with

  • File.read -- gets you a single long string
  • File.readlines (or IO.readlines -- don't ask me why we have both) -- gets you an array of lines

irb -> File.read('foo')
    => "line1\nline2\n"

irb -> File.readlines('foo')
    => ["line1\n", "line2\n"]

irb -> require 'rubygems'
irb -> require 'extensions/symbol'

irb -> f = File.open('foo', 'w') { |f| f.puts 'line1'; f.puts 'line2' }
    => nil

irb -> File.readlines('foo')
    => ["line1\n", "line2\n"]
irb -> File.readlines('foo').map(&:chomp)
    => ["line1", "line2"]

You can also open a file and then read from it, if you want a bit more control...

irb -> File.open('foo', 'w') { |f| f.puts 'line1'; f.puts 'line2' }
    => nil

irb -> File.open('foo', 'r') { |f| f.gets }
    => "line1\n"

Writing filters / file processors/transformers in Ruby

Using ARGF

Here's a "simple" example:

http://eigenclass.org/hiki.rb?eigenclass.org+repainted+1

I used a quick Ruby script to rewrite the CSS stylesheets (making most colors 20% darker), and you're getting a new logo with a 50% probability on each pageview. Here's the (trivial) code:

#!/usr/bin/env ruby

FACTOR = 0.8

def transform_color(r, g, b)
  case r + g + b
  when 255..600
    [r, g, b].map{|x| (FACTOR * x).to_i}
  else
    [r, g, b]
  end
end

css = ARGF.read
css.gsub!(/#[0-9A-Fa-f]{6}/) do |color|
  "#" + 
    transform_color(*color[1..6].scan(/../).map{|x| Integer("0x#{x}") }).map{|x| "%02X" % x}.join("") 
end

css.gsub!(/#[0-9A-Fa-f]{3}(?=[^0-9A-Fa-f])/) do |color|
  r, g, b = color[1..3].scan(/./).map{|x| Integer("0x#{x}#{x}") }
  "#" + 
    transform_color(r, g, b).map{|x| "%02X" % x}.join("") 
end

puts css

Using File.open_as_string

Facets

http://facets.rubyforge.org/src/doc/rdoc/core/classes/File.html#M000692

  # Reverse contents of "message"
  File.open_as_string("message") { |str| str.reverse! }
require 'facets/core/file/self/open_as_string'

irb -> FileUtils.touch 'foo'
    => ["foo"]

irb -> File.open_as_string('foo') { |s| puts s }

    => nil

irb -> File.open_as_string('foo') { |s| s = "Initial contents\n" }; puts File.read('foo')

    => nil

irb -> File.open_as_string('foo') { |s| s.replace "Initial contents\n" }; puts File.read('foo')
Initial contents
    => nil

irb -> File.open_as_string('foo') { |s| s += "Another line\n" }; puts File.read('foo')
Initial contents
    => nil

irb -> File.open_as_string('foo') { |s| s << "Another line\n" }; puts File.read('foo')
Initial contents
Another line
    => nil


FileUtils.touch 'foo'
File.open_as_string('foo') { |s| s.replace "Initial contents\n" } # Don't do s = "..."!
File.open_as_string('foo') { |s| s << "Another line\n" } # Don't do s += "..."!
File.open_as_string('foo') { |s| s.gsub! /old/, 'new' } # Must use in-place version (gsub!), not gsub

matches.rb

Practically a one-liner!

$stdin.each_line do |filename|
  puts filename if IO.read(filename.chomp) =~ /#{Regexp.escape(ARGV[0])}/
end

ARGF

Pickaxe, p. 335 or English library RDoc

An object that provides access to the concatenation of the contents of all the files given as command-line arguments, or $stdin (in the case where there are no arguments). $< supports methods similar to a File object: inmode, close, closed?, each, each_byte, each_line, eof, eof?, file, filename, fileno, getc, gets, lineno, lineno=, path, pos, pos=, read, readchar, readline, readlines, rewind, seek, skip, tell, to_a, to_i, to_io, to_s, along with the methods in Enumerable. The method file returns a File object for the file currently being read. This may change as $< reads through the files on the command line. Read only.

How do I check if the output stream has been redirected to a pipe?

I want to output in color if it has not been redirected to a pipe, but not output in color if output has been redirected.

In other words, these should generate two different outputs:

my_script.rb | cat -
my_script.rb

I thought maybe this would be the case when the output has been redirected, but it appears not to be the case: ($> != STDOUT)

What else can I check??
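One thing that may be worth checking is IO#tty? (also available as isatty), which reports whether a stream is connected to a terminal. A minimal sketch; the colorize helper and the choice of green are just for illustration:

```ruby
# Hypothetical helper: wrap text in ANSI green only when io is a terminal.
def colorize(text, io = $stdout)
  io.tty? ? "\e[32m#{text}\e[0m" : text
end

puts colorize("hello")
```

With this, my_script.rb run directly should print in color, while my_script.rb | cat - should print plain text, because $stdout is then a pipe rather than a terminal.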


 


Files and directories

Ruby / Files and directories edit

http://www.ruby-doc.org/core/classes/Dir.html

See also: Ruby / Input/output

How to get the current working directory

Dir.getwd or FileUtils.pwd

How to change the working directory (cd)

 > require 'rubygems'
=> true
 > require 'rake'
=> true
 > Rake::FileList
=> Rake::FileList
 > FileUtils.pwd
=> "/var/www/whatever_app/db"
 > FileUtils.cd '..'; FileUtils.pwd
=> "/var/www/whatever_app"
 > FileUtils.cd 'config'; FileUtils.pwd
=> "/var/www/whatever_app/config"
 > FileList['*']
=> ["environments", "environment.rb", "routes.rb", "boot.rb", "database.yml", "lighttpd.conf", "deploy.rb", "database.mydb.yml"]

[listing files (category)][file globbing (category)] How to get a list of all files in the current directory using Dir

If by "current directory" you mean the directory that this script is in (File.dirname(__FILE__)):

Dir.new(File.dirname(__FILE__)).each do |file|
  unless ['.', '..'].include? file
    puts file
  end
end

If instead, you mean "current directory" should be what Dir.getwd would return, then this is what you want:

Dir.new('.').each do |file|
  unless ['.', '..'].include? file
    puts file
  end
end

But often we don't want to print out the files; we just want them returned as an array...

For that, we have Dir.entries. Unfortunately, it also includes the ., .. entries that we almost never want...

Dir.new('.').entries.reject {|f| [".", ".."].include? f}

We can also do the same thing only using an Enumerator (handy, for instance, if Dir hadn't provided an entries method)...

require 'qualitysmith_extensions/enumerable/enum'

Dir.new('.').enum(:each).to_a

Dir.new('.').enum(:each).reject {|f| [".", ".."].include? f}

If you want to exclude all entries beginning with a . (such as .svn and other directories that are supposed to be hidden)...

Dir.new('.').entries.reject {|f| f =~ /^\./}

[Complaint] Path building is very non-object-oriented

If you're lazy, you can just do it like this:

require File.dirname(__FILE__) + "/lib/svn.rb"

but that's not platform independent. So you're supposed to do it like this:

require File.join(File.dirname(__FILE__), "../lib/svn.rb")

This looks a lot more like procedural programming though: You call File.join and pass it two strings. It looks really ugly to me.

Why not instead have two path component objects and tell them to add each other?

I'm imagining something like this, we can just use the + operator:

require File.dirname(__FILE__).to_path + "../lib/svn.rb".to_path
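The standard library's Pathname class comes fairly close to this: Pathname#+ joins path components as objects. A minimal sketch, where path_under is a hypothetical helper name:

```ruby
require 'pathname'

# Hypothetical helper: joins path components using Pathname#+.
def path_under(base, *parts)
  parts.inject(Pathname.new(base)) { |path, part| path + part }
end

puts path_under('/usr', 'lib', 'svn.rb')   # /usr/lib/svn.rb

# For the require case above, something like this should work:
#   require path_under(File.dirname(__FILE__), 'lib', 'svn.rb').to_s
```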

[cleaning paths (category)] Pathname and File.expand_path

irb -> require 'pathname'

irb -> Pathname.new('/usr/lib/../bin').to_s
    => "/usr/lib/../bin"

irb -> Pathname.new('/usr/lib/../bin').cleanpath.to_s
    => "/usr/bin"

irb -> File.expand_path('/usr/lib/../bin')
    => "/usr/bin"

Files and directories: [listing files (category)]/[file globbing (category)]/[file traversal (category)]/[iteration (category)]

Dir.glob

How does it compare with Dir.multiglob and FileList?
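For reference, a minimal sketch of Dir.glob itself. Like FileList, it understands ** for "any number of subdirectories". The helper name ruby_files_under and the temporary files are invented for the example:

```ruby
require 'tmpdir'
require 'fileutils'

# Hypothetical helper: all Ruby files below dir, as sorted relative paths.
def ruby_files_under(dir)
  Dir.chdir(dir) { Dir.glob('**/*.rb').sort }
end

Dir.mktmpdir do |dir|
  FileUtils.mkdir_p(File.join(dir, 'lib'))
  FileUtils.touch(File.join(dir, 'a.rb'))
  FileUtils.touch(File.join(dir, 'lib', 'b.rb'))
  FileUtils.touch(File.join(dir, 'notes.txt'))
  p ruby_files_under(dir)   # ["a.rb", "lib/b.rb"]
end
```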

Dir.multiglob [Ruby Facets (category)]

  Dir.multiglob( '*.rb', '*.py' )
  Dir.multiglob( '*', :recurse => true )
  Dir.multiglob('**/*')

How does it compare with FileList?

How to test if something is a directory

File::Stat.new(path).directory?

or

FileTest.directory?(path)

Iterating through contents of a directory

  Dir.new(base_dir = "./some_directory/").each do |name|
    path = "#{base_dir}#{name}"
    if name !~ /^\./
      # Do something with #{path}
    end
  end

Non-directories only:

  Dir.new(base_dir = "./some_directory/").each do |name|
    path = "#{base_dir}#{name}"
    if name !~ /^\./ and !FileTest.directory?(path)
      # Do something with #{path}
    end
  end

Isn't there a better/conciser way?

Hmm, not that I've found so far...
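One candidate that might be more concise is Dir.glob, since the * pattern already skips entries beginning with a dot by default. A sketch under that assumption; visible_files_in is a hypothetical helper name:

```ruby
# Hypothetical helper: non-hidden, non-directory entries of base_dir.
def visible_files_in(base_dir)
  Dir.glob(File.join(base_dir, '*')).select { |path| File.file?(path) }.sort
end

visible_files_in('.').each do |path|
  # Do something with path
  puts path
end
```

This assumes base_dir itself contains no glob metacharacters such as * or [.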


Directory tree traversal ("recursion")

I wrote a class DirectoryRecurser.rb that traverses a directory for you and calls a callback block that you supply once for each entry.

But http://www.ruby-doc.org/core/classes/Find.html looks like a cleaner way to do the same thing.

List all files (paths) within a directory (recursively)

http://svn.tylerrick.com/public/ruby/examples/listing_files_with_Dir_and_Find.rb

require 'find'
require 'fileutils'
Find.find(dir) do |path|
  if FileTest.directory?(path)
    if File.basename(path)[0] == ?. and File.basename(path) != '.'
      Find.prune
    else
      next
    end
  else
    puts path
  end
end

Or, if you'd rather get an array of paths back, use Find.select instead of Find.find...

gem 'qualitysmith_extensions'
require 'qualitysmith_extensions/find/select'
files = Find.select(dir) do |path|
  ...
  true
end


Example: Recursively remove .svn directories

Recursively remove .svn directories (ruby script) (http://textsnippets.com/posts/show/735). Retrieved on 2007-02-08 10:03.

require 'find'
require 'fileutils'
Find.find('./') do |path|
  if File.basename(path) == '.svn'
    FileUtils.remove_dir(path, true)
    Find.prune
  end
end

FileList

http://facets.rubyforge.org/src/doc/rdoc/more/classes/FileList.html (Ported from the version that's part of Rake.)

require 'rubygems'
require 'facets/more/filelist'

The Pickaxe book, p. 230, points out that FileList automatically ignores commonly unused files (like the CVS directory).

 > require 'rubygems'
=> true
 > require 'rake'
=> true
 > Rake::FileList
=> Rake::FileList
 > FileUtils.pwd
=> "/var/www/whatever_app/db"
 > FileUtils.cd '..'; FileUtils.pwd
=> "/var/www/whatever_app"
 > FileUtils.cd 'config'; FileUtils.pwd
=> "/var/www/whatever_app/config"
 > FileList['*']
=> ["environments", "environment.rb", "routes.rb", "boot.rb", "database.yml", "lighttpd.conf", "deploy.rb", "database.mydb.yml"]

file globbing with FileList

This is much like file globbing on the command line. That is, you can use * to get a list of all files in the current directory. I think FileList is even more powerful though: you can use ** to indicate that you don't care how many subdirectories deep it has to traverse.

FileList["**/*.rb"].each do |filename|
  puts filename
end

FileList: include and exclude

FileList['{lib,test,examples}/**/*.rb', '[A-Z]*'].exclude('TODO').to_a

a find command with FileList

http://onestepback.org/index.cgi/Tech/Rake/FindInCode.red

 #!/usr/bin/env ruby
 require 'rake'
 FileList["**/*.rb"].egrep(Regexp.new(ARGV.first))


 


Dates and times

Ruby / Dates and times edit

Dates, Times, and Datetimes, oh my!

How do you create a new Date/Time from (year, month, day) values?

class | constructor | example result
Time (with local timezone) | Time.local(year, month, day, ...) / Time.mktime(...) | Sat Jul 01 00:00:00 -0700 2006
Time (with UTC timezone) | Time.utc(year, month, day) | Sat Jul 01 00:00:00 UTC 2006
Date (no timezone information) | Date.new(year, month, day) | "2006-07-01"
DateTime (no timezone information?) | DateTime.civil(year, month, day) | "2006-07-01T00:00:00Z"

Confusingly, the Rdoc (at http://www.ruby-doc.org/core/classes/Date.html#M000480) shows the documentation for new0 where it should show the documentation for new! That is, you can't even find the documentation for new() in the standard Rdoc!!

Find it here instead: http://www.rubycentral.com/pickaxe/lib_standard.html#Date.new


How do I turn a string into a Date?

(how do I "parse" a string that contains a date)

irb -> require 'date'

irb -> Date.strptime('2007-04-04')
    => #<Date: 4908389/2,0,2299161>

irb -> Date.parse('2007-04-04')
    => #<Date: 4908389/2,0,2299161>

irb -> Time.parse('2007-04-04')
    => Wed Apr 04 00:00:00 -0700 2007

irb -> DateTime.parse('2007-04-04')
    => #<DateTime: 4908389/2,0,2299161>
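When the string is not in year-month-day order, Date.strptime also accepts an explicit format. A small sketch, assuming US-style month/day/year input; parse_us_date is a made-up helper name:

```ruby
require 'date'

# Hypothetical helper: parse "MM/DD/YYYY" strings into Date objects.
def parse_us_date(string)
  Date.strptime(string, '%m/%d/%Y')
end

puts parse_us_date('04/04/2007')   # 2007-04-04
```

A string that doesn't match the format raises an ArgumentError (or a subclass of it), so bad input shouldn't slip through silently.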


What's the difference between Date and Time and DateTime?

To do: make a table that shows which methods are available in each

/usr/lib/ruby/1.8/date.rb

DateTime is a subclass of Date which makes these methods (among others) public:

  • hour()
  • min()
  • sec()

Caveat: you may need to require certain date classes for certain Date/Time features

> ruby -rtime -e 'p $LOADED_FEATURES.grep(/date/); puts Time.parse("2007-01-01")'
["date/format.rb", "parsedate.rb"]
Mon Jan 01 00:00:00 -0800 2007

> ruby -rdate -e 'p $LOADED_FEATURES.grep(/date/); puts Date.parse("2007-01-01")'
["date/format.rb", "date.rb"]
2007-01-01

So "parsedate.rb" is needed in order to parse for Time, but not for Date. Odd.

Caveat: irb gives you an incomplete Date class by default

irb -> Date
    => Date

irb -> $LOADED_FEATURES.grep /date/
    => ["date/format.rb", "parsedate.rb"]

irb ->  Date::civil(2003, 4, 8)
NoMethodError: undefined method `civil' for Date:Class
        from (irb):4
irb -> require 'date'
    => true

irb -> $LOADED_FEATURES.grep /date/
    => ["date/format.rb", "parsedate.rb", "date.rb"]

irb ->  Date::civil(2003, 4, 8)
    => #<Date: 4905475/2,0,2299161>

Plain old ruby, however, doesn't give you any Date class at all by default...

> ruby -e 'Date'
-e:1: uninitialized constant Date (NameError)

My solution to this confusion was to add this line to my .irbrc:

require 'date'      # Since irb only gives you ["date/format.rb", "parsedate.rb"] by default!

How to increment by day or month

irb -> (Date.new(2006, 12, 14) + 1).to_s
    => "2006-12-15"

irb -> (Date.new(2006, 12, 14) >> 1).to_s
    => "2007-01-14"

irb -> require 'date'

irb -> date =  Date.strptime('2008-03-16')
irb -> puts date
2008-03-16

irb -> date = date + 7
irb -> puts date
2008-03-23

irb -> date = date >> 1
irb -> puts date
2008-04-23

What's the difference between Time and DateTime??

...

Having Date, Time, and DateTime classes but no easy way to convert between them

Ruby on Rails's ActiveSupport provides the much-needed Time.to_date() conversion...but that should be core Ruby!!
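In later Ruby versions (1.9 and newer, as far as I know), requiring 'date' adds these conversions to the standard classes, so ActiveSupport is no longer needed for them. A quick sketch:

```ruby
require 'date'

time = Time.local(2006, 7, 1, 13, 30)
date = Date.new(2006, 7, 1)

p time.to_date == date                       # true (the time of day is dropped)
p date.to_time == Time.local(2006, 7, 1)     # true (midnight, local time)
p time.to_datetime.class                     # DateTime
```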

How do you instantiate a Date object with the current date/time? What's the equivalent of Time.now in the Date class?

Date.today

How do I get an ISO-8601 date?

> require 'date'
> Date.today.to_s
    => "2006-08-21"

> DateTime.now.to_s
    => "2006-08-21T13:08:34-0700"

> DateTime.now.to_s[0..18]
    => "2006-08-21T13:14:15"

# For Time objects:
> Time.now.strftime("%Y-%m-%d %H:%M:%S")
    => "2008-07-17 02:25:53"
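Another option, assuming a Ruby new enough to have them (1.9 or later, I believe), is the iso8601 methods: Time#iso8601 comes from requiring 'time' and Date#iso8601 from requiring 'date'. Unlike the strftime format above, these put a "T" between the date and the time:

```ruby
require 'time'
require 'date'

puts Time.utc(2006, 8, 21, 13, 8, 34).iso8601   # 2006-08-21T13:08:34Z
puts Date.new(2006, 8, 21).iso8601              # 2006-08-21
```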

How do I create a Time object from a Unix timestamp?

>> Time.at(1161200857)
=> Wed Oct 18 12:47:37 PDT 2006

Date: Iterators for days

# Print the days from 2006-11-01 through 2006-11-05
irb -> require 'date'

irb -> ((a=Date.new(2006, 11, 1)) .. a+4).each {|d| puts d.to_s}
2006-11-01
2006-11-02
2006-11-03
2006-11-04
2006-11-05

More examples

irb -> now = Time.now; "#{now.strftime("%a")}, #{now.strftime("%b")} #{now.day}, #{now.year}"
    => "Wed, Nov 1, 2006"

  def date_options(today)
    [
      "Not sure",
      (today + 1 .. today + 13).select { |date|
        Date::DAYNAMES[date.wday] != "Sunday"
      }.collect { |date|
        "#{date.strftime("%a, %b")} #{date.day}"
      }.collect { |option_string|
        [
          "#{option_string} - morning",
          "#{option_string} - afternoon"
        ]
      },
      "Other"
    ].flatten
  end

produces

["Not sure",
 "Thu, Nov 2 - morning",
 "Thu, Nov 2 - afternoon",
 "Fri, Nov 3 - morning",
 "Fri, Nov 3 - afternoon",
 "Sat, Nov 4 - morning",
 "Sat, Nov 4 - afternoon",
 "Mon, Nov 6 - morning",
 "Mon, Nov 6 - afternoon",
 "Tue, Nov 7 - morning",
 "Tue, Nov 7 - afternoon",
 "Wed, Nov 8 - morning",
 "Wed, Nov 8 - afternoon",
 "Thu, Nov 9 - morning",
 "Thu, Nov 9 - afternoon",
 "Fri, Nov 10 - morning",
 "Fri, Nov 10 - afternoon",
 "Sat, Nov 11 - morning",
 "Sat, Nov 11 - afternoon",
 "Mon, Nov 13 - morning",
 "Mon, Nov 13 - afternoon",
 "Tue, Nov 14 - morning",
 "Tue, Nov 14 - afternoon",
 "Other"]

[Libraries (category)]

See also Rails_plugins_and_libraries_/_Lower-level#.5BRuby-level.5D:_Dates_and_times

DateUtils

Documentation: RDoc


Project/Development: http://rubyforge.org/projects/dateutils/


Description: DateUtils provide some handy classes to deal with Date e.g. Week, Month, Year




Readiness: 4 - Beta


Chronic

Categories/Tags: [Natural language parsers (category)]
Homepage: http://chronic.rubyforge.org/
Documentation: http://chronic.rubyforge.org/
Source code: sudo gem install chronic



Description: Chronic is a natural language date/time parser written in pure Ruby.





http://chronic.rubyforge.org/

  Chronic.parse('tomorrow')
    #=> Mon Aug 28 12:00:00 PDT 2006

  Chronic.parse('monday', :context => :past)
    #=> Mon Aug 21 12:00:00 PDT 2006

  Chronic.parse('this tuesday 5:00')
    #=> Tue Aug 29 17:00:00 PDT 2006

  Chronic.parse('this tuesday 5:00', :ambiguous_time_range => :none)
    #=> Tue Aug 29 05:00:00 PDT 2006


 


Data structures

Ordered hashes: Dictionary

http://facets.rubyforge.org/src/doc/rdoc/more/classes/Dictionary.html

The alternative, using just primitives: array of hashes.

Comparison:

# Order is *not* preserved! (Ruby 1.8 behavior; since Ruby 1.9, Hash keeps insertion order.)
irb -> h = {:a => 1, :b => 2, :c => 3}
    => {:c=>3, :a=>1, :b=>2}
irb -> h.values
    => [3, 1, 2]

# Order is preserved. Intuitive.
irb -> d = Dictionary[:a, 1, :b, 2, :c, 3]
    => {:a=>1, :b=>2, :c=>3}
irb -> d.values
    => [1, 2, 3]
# Still acts like Hash
irb -> d[:a]
    => 1

# Order is preserved. But accessing elements is awkward.
irb -> a_h = [{:a => 1}, {:b => 2}, {:c => 3}]
    => [{:a=>1}, {:b=>2}, {:c=>3}]

irb -> a_h.find {|h| h[:a]}[:a]
    => 1
irb -> a_h.find {|h| h[:b]}[:b]
    => 2

Is there a more concise / hash-like way to initialize a Dictionary? Sure!

Not like this (what you pass to new will be used as a default value, just like with Hash)...

irb -> d = Dictionary.new({:a => 1, :b => 2, :c => 3})
    => {}

irb -> d[:a]
    => {:c=>3, :a=>1, :b=>2}

This should work but doesn't...

irb -> d = Dictionary[{:a => 1, :b => 2, :c => 3}]
NoMethodError: undefined method `order' for {:c=>3, :a=>1, :b=>2}:Hash
        from /usr/lib/ruby/gems/1.8/gems/facets-1.8.51/lib/facets/more/dictionary.rb:301:in `replace'
        from /usr/lib/ruby/gems/1.8/gems/facets-1.8.51/lib/facets/more/dictionary.rb:84:in `[]'
        from (irb):15

Ranges

Intuitive conversion to arrays (expands the description of the range into the actual list of member elements):

irb -> (1..10).to_a
    => [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

Discontiguous ranges

Check out how intuitive this is...


irb -> [3, 6..9, 12]
    => [3, 6..9, 12]

irb -> [3, 6..9, 12][1].end
    => 9

Actually iterating through it though is another story...

Expanding it...

irb -> [3, 6..9, 12].to_a
    => [3, 6..9, 12]
# Not what I was hoping for, but that's fine. This should actually be its own method, like expand or something...

irb -> [3, 6..9, 12].expand
NoMethodError: undefined method `expand' for [3, 6..9, 12]:Array
        from (irb):7
        from :0

irb -> require 'rubygems'; require 'extensions/symbol'
    => true

irb -> [3, 6..9, 12].map(&:to_a).flatten
/usr/lib/ruby/gems/1.8/gems/extensions-0.6.0/lib/extensions/symbol.rb:24: warning: default `to_a' will be obsolete
    => [3, 6, 7, 8, 9, 12]
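
The map(&:to_a) call above leans on the default Object#to_a, which is exactly what that warning flags as obsolete. A minimal sketch of the same expansion without it, assuming a Ruby new enough to have Enumerable#flat_map (1.9.2 or later). The expand name is just the one wished for above, written here as a hypothetical top-level helper:

```ruby
# Hypothetical helper: expand a mixed list of integers and ranges.
def expand(items)
  # Kernel#Array turns a Range into its members and wraps a lone value.
  items.flat_map { |item| Array(item) }
end

p expand([3, 6..9, 12])   # [3, 6, 7, 8, 9, 12]
```

Kernel#Array is used instead of to_a so that plain integers, which have no to_a of their own, are simply wrapped.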

Exception handling

Ruby / Exception handling


Exception hierarchy

  • Exception
    • fatal
    • NoMemoryError
    • ScriptError
      • LoadError
      • NotImplementedError
      • SyntaxError
    • SignalException
      • Interrupt
    • StandardError
      • ArgumentError
      • IOError
        • EOFError
      • IndexError
      • LocalJumpError
      • NameError
        • NoMethodError
      • RangeError
        • FloatDomainError
      • RegexpError
      • RuntimeError
      • SecurityError
      • SystemCallError
        • Errno::__ (ENOENT, etc.) (system-dependent)
      • ThreadError
      • TypeError
      • ZeroDivisionError
    • SystemExit
    • SystemStackError

The rescue statement modifier

Intro

rescue doesn't always have to be used with a begin block. You can do rescues even more concisely.

irb -> raise 'an error' rescue "Hey, there was an error!"
    => "Hey, there was an error!"

Or, most concisely of all:

irb -> raise 'an error' rescue nil
    => nil

It really doesn't get more concise than that for exception handling. If you just want to silence exceptions and you don't want to actually do any intelligent handling of them then rescue nil is your ticket.

Details

When you use rescue as a statement modifier, it rescues StandardError and its subclasses. You can't specify which exception class(es) to rescue, as you can with the normal version of rescue (begin ... rescue Exception => exception ... end, for example).

You also can't do such things as "ensure" when you use this style of rescue.
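
A small sketch of what "StandardError and its subclasses" means in practice. The with_modifier helper is hypothetical; it only wraps a block in the modifier form so both cases can be tried side by side:

```ruby
# Hypothetical helper: run the block under a rescue modifier.
def with_modifier
  yield rescue :rescued
end

# ArgumentError is a StandardError, so the modifier catches it.
p with_modifier { raise ArgumentError, "bad input" }   # :rescued

# NotImplementedError descends from ScriptError, not StandardError,
# so it passes straight through the modifier.
escaped = begin
  with_modifier { raise NotImplementedError, "not yet" }
rescue NotImplementedError => e
  e.class
end
p escaped   # NotImplementedError
```

To catch anything outside StandardError you need the full begin ... rescue SomeClass ... end form.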

Caveat: The thing you pass to rescue is what it should do when there's a StandardError -- not the class of exceptions that it should rescue

That fact is a bit confusing, because it's different from the normal begin/rescue blocks.

irb -> begin
         raise "A runtime error"
       rescue RuntimeError
         "The thing that I want to happen when there's a RuntimeError"
       end
    => "The thing that I want to happen when there's a RuntimeError"

but:

irb -> raise "A runtime error" rescue "The thing that I want to happen when there's a RuntimeError"
    => "The thing that I want to happen when there's a RuntimeError"

If you forget this fact, you may be surprised when the error class itself is returned:

irb -> raise "A runtime error" rescue TypeError
    => TypeError

As soon as the error is rescued and you're no longer in the rescue block, $ERROR_INFO/$! resets to nil.

irb -> require 'English'
    => true

irb -> $ERROR_INFO
    => nil

irb -> raise "A runtime error" rescue puts $ERROR_INFO.inspect
#<RuntimeError: A runtime error>
    => nil

irb -> $ERROR_INFO
    => nil
irb -> $!
    => nil

Example: print errors without stopping

Use this trick when you want to see error messages that are raised but you want execution to continue:

irb -> NotARealClass.new() rescue $stderr.puts $!.inspect
#<NameError: uninitialized constant NotARealClass>
    => nil

or just:

irb -> NotARealClass.new() rescue $stderr.puts $!
uninitialized constant NotARealClass
    => nil

http://codeforpeople.com/lib/ruby/autorequire/autorequire-0.0.0/README

    # rescues error and prints its message- support method for samples/*
    #
    def print_error
      yield
    rescue Exception => e
      puts "#{ e } (#{ e.class })!"
    end

Like all things in Ruby, rescue has a return value

This can be useful, for example, when you need to assign a default value to a variable if an error occurs.

irb -> an_important_variable = SomethingThatWillRaiseAnError.do_calculation rescue "a sane default value"
    => "a sane default value"

You may be thinking (if you come from a background in other, less orthogonal languages), how can you do things like rescue puts $!.inspect? You can do that because puts, like all methods and blocks, has a return value.

irb -> a = puts "whatever"
whatever
    => nil

irb -> puts a
nil
    => nil

irb -> a = raise 'arg!' rescue puts 'whatever'
whatever
    => nil

irb -> puts a
nil
    => nil

Can have an ensure without a begin block

def something_that_might_raise_an_error
  raise "an error"
ensure
  puts "that we print this no matter what"
end

something_that_might_raise_an_error rescue puts $!
that we print this no matter what
an error
 


Questions ("How do I...?")

How do I check what platform the user is running on?

qualitysmith_extensions/kernel/windows_platform.rb

module Kernel
  def windows_platform?
    RUBY_PLATFORM =~ /mswin32/

    # What about mingw32 or cygwin32?
    #RUBY_PLATFORM =~ /(win|w)32$/

    # What about 64-bit Windows?
  end
end
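
One possible answer to the open questions in those comments, as a sketch rather than a tested cross-platform solution. It assumes the host OS string from rbconfig is a reasonable thing to match against, and the pattern itself is a guess at the common Windows builds. The method name windows_host? is made up for this example:

```ruby
require 'rbconfig'

# Sketch: true when the host OS string looks like a Windows build of Ruby.
# A bare /win/ would be wrong here because "darwin" also contains "win".
def windows_host?(host_os = RbConfig::CONFIG['host_os'])
  !!(host_os =~ /mswin|mingw|cygwin/i)
end

p windows_host?("mswin32")     # true
p windows_host?("mingw32")     # true
p windows_host?("darwin21")    # false
p windows_host?("linux-gnu")   # false
```

Taking the string as a parameter makes the check easy to try against values from platforms you don't have at hand.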

Can you use modules to namespace an (otherwise global) method?

So for example call MyNamespace::method from some method somewhere else?

Would that be an instance method of the module? Can modules have instance methods? If they did, wouldn't they be accessed like MyNamespace.method?

Hmm... looks like you can do what I'm trying to do, and they're called module methods:

http://ruby-doc.org/core/classes/Module.html :

The methods in a module may be instance methods or module methods. Instance methods appear as methods in a class when the module is included, module methods do not. Conversely, module methods may be called without creating an encapsulating object, while instance methods may not. (See Module#module_function)
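
A minimal sketch of a namespaced method using Module#module_function, as that passage suggests. MyNamespace is the name from the question; the shout method is invented for illustration:

```ruby
module MyNamespace
  module_function   # methods defined below become module methods

  def shout(text)
    "#{text.upcase}!"
  end
end

p MyNamespace.shout("hoo")    # "HOO!"
p MyNamespace::shout("hoo")   # "HOO!" (same call, :: syntax)
```

Writing def self.shout inside the module would work as well; module_function additionally leaves a private instance-method copy for anyone who includes the module.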

Which print command do I use?? puts, p, pp, print??

puts includes a newline; print doesn't.

p vs. puts

p calls inspect on its arguments; puts does not.

p D.new
#<D:0xb7fa6498 @var=5>
puts D.new
#<D:0xb7fa63d0>

http://ruby-doc.org/core/classes/Kernel.html#M005772

   S = Struct.new(:name, :state)
   s = S['dave', 'TX']
   p s

produces:

   #<S name="dave", state="TX">

pp

For dumping large or complex objects, I found obj.inspect to be practically unreadable! Fortunately, pp provides us with an alternative.

pp obj is the pretty alternative to puts obj.inspect

(Note: You will need to require 'pp')

pp_s

See Ruby libraries / Core.

How do I enable my class to be usable in Ranges?

See Ruby / How do I enable my class to be usable in Ranges?

What's the difference between inspect(), to_s(), pretty_print(), and to_yaml()?

([Rails (category)]: to_yaml() is called by ActionView::Helpers::DebugHelper::debug())

(is pretty_print what pp uses/calls?)

Do we really need all of them???

Seeing something like "#<Month:0xb7ac2bc8>" is almost never helpful, even to a programmer!
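
A sketch of how to_s and inspect divide the work, using a hypothetical stand-in for the Month class mentioned above (the real one is not shown here). puts and string interpolation go through to_s, while p and irb go through inspect; pp, for its part, is driven by pretty_print.

```ruby
# Hypothetical stand-in for Month; the fields are assumptions.
class Month
  def initialize(year, month)
    @year, @month = year, month
  end

  # Used by puts and by "#{...}" interpolation.
  def to_s
    format("%04d-%02d", @year, @month)
  end

  # Used by p and by irb when it echoes a result.
  def inspect
    "#<Month #{self}>"
  end
end

m = Month.new(2006, 11)
puts m   # 2006-11
p m      # #<Month 2006-11>
```

Defining both is usually enough to stop the "#<Month:0x...>" output from showing up in logs and error messages.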

Understanding YAML dumps

It looks like items in an array have a - in front of them, but items in a hash do not.

I guess the type of an attribute is determined implicitly for built-in types like strings, floats, and hashes. It is only mentioned explicitly for custom objects. For example:

- &id007 !ruby/object:Quote 
  errors: !ruby/object:ActiveRecord::Errors 
    base: *id007
    errors: {}

What does the *id007 mean? It is a YAML alias: it points back to the node marked with the anchor &id007. Here that is the containing Quote object, so it does indeed indicate recursion.
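
A sketch that makes anchors and aliases show up in a dump. It assumes the standard yaml library; the anchor names the emitter picks (id007 above) vary between versions, so no exact output is claimed:

```ruby
require 'yaml'

shared = ["x"]
doc = { "first" => shared, "second" => shared }   # the same object twice
puts YAML.dump(doc)   # the second mention is emitted as an alias (*...)

loop_list = []
loop_list << loop_list   # a list that contains itself
puts YAML.dump(loop_list)   # the anchor sits on the list, the alias inside it
```

The &name marks a node the first time it is emitted; every later *name means "that same object again", which is how a dump of a self-referencing structure can terminate.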

[Open problem, would like to solve] to_yaml dumps are too verbose

[Rails (category)]: When I have a huge data structure with 40 attributes and I output debug(output) in a view, I get way too much information to be useful.

I want to be able to tell it, "only show me these attributes: a, b, c".

The problem is, the attributes I'm interested in usually aren't at the root level of the object; usually they're contained in, say, the attributes instance variable of the object.

I've tried overriding Object#to_yaml, with something like this:

      class Object
        def to_yaml( opts = {} )
          YAML::quick_emit( object_id, opts ) do |out|
            out.map( taguri, to_yaml_style ) do |map|
              to_yaml_properties.each do |m|
                if m.to_s == 'attributes' || m.to_s == 'shop' || m.to_s == 'quote_price' || m.to_s == 'name' || m.to_s == 'id'
                  map.add( m[1..-1], instance_variable_get( m ) )
                end
              end
            end
          end
        end
      end

But it just output an empty object then:

- !ruby/object:Quote {}

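Clearly I'm not understanding how to_yaml works well enough to be modifying it. (One likely culprit: to_yaml_properties returns instance variable names with their leading @, such as "@attributes", which is why m[1..-1] is needed to strip it. That means none of the comparisons against 'attributes', 'shop', etc. can ever match, so nothing gets added to the map.)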

Plan of attack:

  • Create a simple but representative test object (an ActiveRecord model with 3 attributes?)
  • Figure out how to_yaml works, as well as inspect, pp_s, etc.
  • Figure out how I actually want my debug output to look; write a test for it
  • Either monkey patch to_yaml or make an alternative

[Rails][Open problem, would like to solve] raise object.inspect is uselessly verbose sometimes since it only shows it on one line

Probably could be fixed by patching whatever view shows local exceptions, so that it splits long messages into multiple lines.

[Open problem, would like to solve] Need a method Object#where_was_I_defined?(method) to help answer that question

Possible implementation: Look at Module#ancestors.each {|mod| mod.instance_methods(false)} and see which ancestor has a match.

But what if it's not a module (what if it's a class?)?
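
Two sketches of such a lookup, assuming Ruby 1.9 or later (Method#owner exists and instance_methods returns symbols). The helper names and the Owl/Hooting examples are made up. The question about classes versus modules goes away because ancestors lists both:

```ruby
module Hooting
  def hoot; "Hoo!"; end
end

class Owl
  include Hooting
  def fly; "flap"; end
end

# Sketch 1: ask the Method object directly.
def where_was_it_defined(obj, method_name)
  obj.method(method_name).owner
end

# Sketch 2: the ancestors walk proposed above.
# It does not see singleton methods or private methods.
def first_ancestor_defining(obj, method_name)
  obj.class.ancestors.find do |mod|
    mod.instance_methods(false).include?(method_name)
  end
end

owl = Owl.new
p where_was_it_defined(owl, :hoot)      # Hooting
p where_was_it_defined(owl, :fly)       # Owl
p first_ancestor_defining(owl, :hoot)   # Hooting
```

The first form is shorter and also handles private methods; the second is closer to the idea sketched in the text.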

What does Module#ancestors do? Why does it give me this output?

irb -> Module.ancestors
    => [Module, Object, PP::ObjectMixin, Kernel]

irb -> Object.ancestors
    => [Object, PP::ObjectMixin, Kernel]

irb -> Class.ancestors
    => [Class, Module, Object, PP::ObjectMixin, Kernel]

Supposedly it returns a list of modules included in mod. So does that mean somewhere the Object module was mixed in to the Module module?? I thought Object was a class, not a module.

Or is ancestors more intuitive, listing superclasses as well as mixed-in modules? I think that must be the case, because this is what Class#superclass tells me:

irb -> Class.superclass
    => Module

irb -> Module.superclass
    => Object

irb -> Object.superclass
    => nil

So (Module.ancestors.include? Object) because Module is a subclass of Object, not because Object is mixed into Module!

Can I add a method to a single object, similar to what Module#define_method provides for modules?

This apparently doesn't work:

irb -> o = Object.new
    => #<Object:0xb7f7c47c>

irb -> o.define_method(:act_like_an_owl!) { puts "Hoo! Hoo! Hoo!" }
NoMethodError: undefined method `define_method' for #<Object:0xb7f7c47c>
        from (irb):4
        from :0

and yet, I was able to do this in a Rails method:

    error_count = error_objects.each do |o|
      # o is ActiveRecord::Errors
      o.define_method(:boo) do
        'boo'
      end
    end

Why did that work?

---

Oh yeah, now I remember how you do it. It's really simple:

irb -> o = Object.new
    => #<Object:0xb7eedccc>

irb -> def o.act_like_an_owl!; puts "Hoo! Hoo! Hoo!"; end
    => nil

irb -> o.act_like_an_owl!
Hoo! Hoo! Hoo!
    => nil
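
For a block-based version that reads like the define_method attempt above, there is Object#define_singleton_method, assuming Ruby 1.9 or later. A minimal sketch:

```ruby
o = Object.new
o.define_singleton_method(:act_like_an_owl!) { "Hoo! Hoo! Hoo!" }

p o.act_like_an_owl!                          # "Hoo! Hoo! Hoo!"
p o.singleton_methods                         # [:act_like_an_owl!]
p Object.new.respond_to?(:act_like_an_owl!)   # false
```

Only o gains the method; other instances of the same class are untouched, just as with def o.act_like_an_owl!.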

When you reopen and "override" an existing method, is there any way to call the original method from your new method?

Because often I want to add to the functionality, not replace it entirely!

For example, maybe a method doesn't currently accept Month objects (only Date objects) and you'd like to add that functionality.

But maybe the existing method is 20 lines of code and you don't want to duplicate that in your new method.

You just want a wrapper that calls the original in most cases but does some kind of conversion in the special case.

Perhaps:

def method_i_am_overriding(input)   # "in" is a reserved word, so it can't be a parameter name
  case input
    when Month; super(input.to_date)   # super only reaches a superclass, not the method you just replaced
    else super(input)
  end
end

I just found the answer to this:

http://www.ruby-doc.org/core/classes/Module.html#M001250

alias_method(new_name, old_name)

Makes new_name a new copy of the method old_name. This can be used to retain access to methods that are overridden.

   module Mod
     alias_method :orig_exit, :exit
     def exit(code=0)
       puts "Exiting with code #{code}"
       orig_exit(code)
     end
   end
   include Mod
   exit(99)
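
Applied to the Month case from the question, a sketch might look like this. SimpleMonth and Report are hypothetical stand-ins, since the real Month class and the 20-line method are not shown:

```ruby
require 'date'

# Hypothetical Month-like value object.
SimpleMonth = Struct.new(:year, :month) do
  def to_date
    Date.new(year, month, 1)
  end
end

# Stand-in for the "existing" class whose method only understands Date.
class Report
  def heading(date)
    date.strftime("%b %Y")
  end
end

# Reopen it: keep the original under a new name, then wrap it.
class Report
  alias_method :heading_without_month, :heading

  def heading(input)
    input = input.to_date if input.is_a?(SimpleMonth)
    heading_without_month(input)
  end
end

p Report.new.heading(Date.new(2006, 8, 21))       # "Aug 2006"
p Report.new.heading(SimpleMonth.new(2006, 11))   # "Nov 2006"
```

The original body is never duplicated; the wrapper only converts the argument and hands off.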

How do I remove/override an entire class/module (as opposed to just some individual methods)?

Let's say you wanted to override the Dir class. First you'd remove the constant that references that class, and then you would create your new constant/class. Nothing about the old class would be used any more (unless the old class has already been instantiated before you redefine it, or unless it's a module that was mixed into another class/module, in which case the old methods remain available there).

Object.send( :remove_const, :Dir )
class Dir
  # whatever you want ... start fresh!
end
Dir.new # Will use the *new* definition of Dir, not the old one

Example: How MockFS does it

This is how MockFS does it, effectively allowing method calls/messages to be passed through to the original Dir class only if some condition is met (!MockFS.mock?). Otherwise (MockFS.mock?), it will [delegate] the method to its own (mock) version.

/usr/lib/ruby/gems/1.8/gems/mockfs-0.1.6/lib/mockfs/override.rb

MockFS::OriginalDir = Dir
Object.send( :remove_const, :Dir )
module Dir #:nodoc:
  def self.method_missing( symbol, *args, &action )
    dir = MockFS.mock? ? MockFS.dir : MockFS::OriginalDir
    dir.send( symbol, *args, &action )
  end
end

More generally

http://svn.tylerrick.com/public/ruby/examples/remove_class.rb

require 'rubygems'
require 'qualitysmith_extensions/module/remove_const'

class Klass
  def foo; 'foo'; end
end

Klass.remove_const!

class Klass
  def foo; 'something else'; end
end

o2 = Klass.new
puts o2.foo    # => something else

Warning: If you instantiate the class before you've redefined it, that instance will continue to use the old class definition

http://svn.tylerrick.com/public/ruby/examples/remove_class.rb

require 'rubygems'
require 'qualitysmith_extensions/module/remove_const'

class Klass
  def foo; 'foo'; end
end
o = Klass.new
puts o.foo     # => foo

Klass.remove_const!
puts o.foo     # => foo (still!)

class Klass
  def foo; 'something else'; end
end
puts o.foo     # => foo (still!!)

o2 = Klass.new
puts o2.foo    # => something else

This behavior is different than what happens if you simply reopen an existing module/class. In that case it affects even objects that were instantiated before your change (monkey patch) was made:

irb -> class Klass; def foo; 'foo'; end; end
    => nil

irb -> o = Klass.new
irb -> o.foo
    => "foo"

irb -> class Klass; def foo; 'something else'; end; end
    => nil
irb -> o.foo
    => "something else"

That's because the monkey patch / re-opening of the class actually changes the class that all previously instantiated objects of that class name share (and get their method implementations from). In the case where we undefined the constant and then created a new class using the same constant name, we have actually created two different classes. The second class object is completely unrelated to the first; it just (coincidentally) happens to share the same name (constant).
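
The same point as a self-contained sketch that needs no extension library, using plain Object.send(:remove_const, ...) and a hypothetical Gadget class:

```ruby
class Gadget
  def foo; 'foo'; end
end

old_class    = Gadget
old_instance = Gadget.new

Object.send(:remove_const, :Gadget)

class Gadget   # a brand-new class object under the old name
  def foo; 'something else'; end
end

p old_instance.foo                       # "foo"
p Gadget.new.foo                         # "something else"
p old_class.equal?(Gadget)               # false
p old_instance.class.equal?(old_class)   # true
p old_class.name                         # "Gadget"
```

Both class objects report the name "Gadget", which is what makes this situation confusing to debug.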

The new module/class, although it shares the same name, is actually a completely different module/class object

Or: "Does this mean that the original class object continues to exist forever?"

irb -> file_class = File
    => File
irb -> file_class.object_id
    => -604258948

irb -> Object.send( :remove_const, :File )
    => File
irb -> class File; end
    => nil

irb -> file_class.object_id
    => -604258948  # unchanged: still the original class object
irb -> File.object_id
    => -605239078  # *different*

irb -> file_class.object_id != File.object_id
    => true
# You can see that the two classes are *different objects*

There's no way to delete an object that I know of... Except to let it get garbage collected. So no, I suppose it wouldn't necessarily exist forever; there is a chance that it would be removed, but (I assume) only if there were no longer any references to it.

Warning: Methods from your original class continue to work even if the constant for that class/module is removed

If you "save" a method before you remove the class's constant, that method continues to work just fine. (File.dirname is a class method, by the way.)

irb -> d = File.method :dirname
    => #<Method: File.dirname>

irb -> d.call '/home/tyler'
    => "/home"

irb -> Object.send( :remove_const, :File )
    => File
irb -> File
NameError: uninitialized constant File
irb -> class File; # empty
       end
    => nil

irb -> d.call '/home/tyler'
    => "/home"

It also works for instance methods, though in that case you need to send 'method' to an instance of the class (logically). Also observe that the original class continues to exist (?) even after we remove the constant for it...

irb -> class Foo; def foo; p self.inspect; end; end

irb -> foo = Foo.new.method :foo
    => #<Method: Foo#foo>

irb -> foo.call
"#<Foo:0xb7eb5b10>"

irb -> Object.send( :remove_const, :Foo )

irb -> foo.call
"#<Foo:0xb7eb5b10>"


Caveat: Removing a class constant does not actually remove the class object itself

I found this out the hard way (as I seem to do with so many things in Ruby).

I had a test that tested a console app that used the Colored gem. In my test, I wanted it to not colorize anything.

In order to impotentize/inoculate the Colored module, I thought I would just remove the Colored constant, and then redeclare my own Colored module with methods that didn't do anything.

I started by just removing the Colored constant, expecting to start seeing some NoMethodError errors. To my amazement, I didn't.

I confirmed that the constant was in fact removed, even after the calls that were still doing the colorizing (I thought maybe some autoloading was causing the constant to be resuscitated). Indeed it was removed:

p Object.const_defined?(:Colored)

returned false.

Then how on earth did my strings have access to those methods (#red, #bold, etc.)??

Of course! I finally hit upon the reason:

/usr/lib/ruby/gems/1.8/gems/colored-1.0/lib/colored.rb was including the module:

String.send(:include, Colored)

So even after I removed the constant, the String class still had all those methods.

The question is were the methods copied in or are we somehow "referencing" the original methods in the Colored module? In other words, if I just redefine Colored#bold, etc. will that affect the String objects into which Colored was already mixed in??

Let's find out...

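It turns out the methods are not copied at all: include puts the Colored module object itself into String's ancestor chain, and that object lives on after its constant is removed. Assuming the constant had already been removed at this point (as described above), the module Colored below created a brand-new, unrelated module that String knows nothing about. What this means is that this had no effect (the Strings were still being colorized):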

module Colored
  def colorize(string, options = {})
    string
  end
end

But this did successfully inoculate/de-colorize my dear strings:

class String
  def colorize(string, options = {})
    string
  end
end

What this means on a more general level:

If you want to override/change the effects/behavior of a module used by a library, it's not always possible to do it simply by re-opening the module itself. Sometimes you have to figure out all the modules/classes that the given module was mixed into (and there could be many!). Fortunately, in this case, there was only one class into which the Colored module got mixed, so it was easy to reverse/undo the effects of Colored. (Even if it had been mixed in to Object, it would still be pretty easy to clean up/reverse/undo...)
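
A sketch of the three cases side by side, with made-up names (Shiny stands in for Colored, Label for String). It shows that include keeps a reference to the module object: re-opening the same module is seen by includers, a replacement module under the same name is not, and a method defined on the class itself wins over both.

```ruby
module Shiny
  def decorate; "*#{self}*"; end
end

class Label < String
  include Shiny
end

label = Label.new("hi")
p label.decorate   # "*hi*"

# (a) Re-opening the *same* module is visible to existing includers.
module Shiny
  def decorate; "[#{self}]"; end
end
p label.decorate   # "[hi]"

# (b) Removing the constant and defining a new module is not.
Object.send(:remove_const, :Shiny)
module Shiny
  def decorate; to_s; end
end
p label.decorate   # "[hi]" (the old module is still in Label.ancestors)

# (c) A method defined on the class itself takes precedence.
class Label
  def decorate; to_s; end
end
p label.decorate   # "hi"
```

Case (b) matches what happened with Colored once its constant had been removed, and case (c) is the fix that worked.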

(open) How do I access a class variable outside of the class where it's defined?

class MyClass
  @@my_variable = 'something'
end

# This doesn't work:
class SomeOtherClass
  def print_that_one_variable
    puts MyClass.my_variable
  end
end

# This doesn't work:
class SomeOtherClass
  def print_that_one_variable
    puts MyClass::my_variable
    puts MyClass.@@my_variable
  end
end

# This does work (wrapping it in an accessor class method)
class MyClass
  def self.my_variable
    @@my_variable
  end
end
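
Another option is reflection, sketched here on the assumption that Module#class_variable_get is public (it is in current Ruby; on 1.8 it was private, so it would need to go through send). The Settings class is hypothetical:

```ruby
class Settings
  @@my_variable = 'something'
end

p Settings.class_variable_get(:@@my_variable)    # "something"
p Settings.class_variables                       # [:@@my_variable]
p Settings.class_variable_defined?(:@@missing)   # false
```

The accessor method shown above is usually the friendlier interface; reflection is handy when you cannot or do not want to modify the class.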


What's the opposite of 'object.nil?'?

object, except in the case of false

  {}.nil?          => false
  true if {}       => true

and

  nil.nil?         => true
  true if nil      => nil

but

  true.nil?        => false
  false.nil?       => false
  true if true     => true
  true if false    => nil

Order of precedence question: and/if

redirect_to :main_url and return if @recipe.save

(redirect_to :main_url and return) if @recipe.save

or

redirect_to :main_url and (return if @recipe.save)
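
A quick way to settle it is to try the unparenthesized form with a false condition. In this sketch the steps_run helper and its step lambda are stand-ins for redirect_to and return:

```ruby
# Hypothetical helper: records which steps actually ran.
def steps_run(saved)
  calls = []
  step = lambda { |name| calls << name; true }

  # Same shape as: redirect_to :main_url and return if @recipe.save
  step.call(:redirect) and step.call(:return) if saved

  calls
end

p steps_run(false)   # []
p steps_run(true)    # [:redirect, :return]
```

Nothing runs when the condition is false, so the if modifier applies to the whole "a and b" expression, which is the first of the two groupings.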

Is ==nil the same as .nil? ?

I think so...

irb -> a = nil
    => nil

irb -> a == nil
    => true

irb -> a.nil?
    => true



Is there a more natural way to translate Python "if something in [option1, option2]" into Ruby?

This:

  if ["y","n"].include? answer
    ...
  end

doesn't read as well as if you had the needle first and the haystack second:

  if answer in ["y","n"]
    ...
  end
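
Two needle-first spellings, as a sketch. The one_of? method is a made-up core extension for this example (ActiveSupport ships a similar Object#in?, if you are in Rails anyway); case/when needs no extension at all:

```ruby
# Hypothetical core extension; the name one_of? is invented here.
class Object
  def one_of?(*candidates)
    candidates.include?(self)
  end
end

answer = "y"
p answer.one_of?("y", "n")    # true
p "maybe".one_of?("y", "n")   # false

# Without any extension, case/when also reads needle-first:
verdict = case answer
          when "y", "n" then :valid
          else :invalid
          end
p verdict   # :valid
```

Patching Object is a matter of taste; the case/when form has the advantage of touching nothing global.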


size, length, count, ... ??

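String#length (String#size is an alias)

Array#size (Array#length is an alias)

count is not recognized on arrays in Ruby 1.8.6; Enumerable#count arrived in 1.8.7 (you could alias the method if you wanted)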

How do I make a &block parameter optional?

If you say def my_method(a, b, &block), the block is still optional: block is simply nil when the caller doesn't pass one.

To allow a block to be specified but not require it, check for it before calling:

   def proc_from(&block)
     block.call(...) if block
   end

or probably

   def proc_from
     yield ... if block_given?
   end

(http://ruby-doc.org/core/classes/Proc.html#M000804)
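
Both styles as a runnable sketch, with hypothetical greet methods. Either way the caller may leave the block off:

```ruby
# Style 1: capture the block; it is nil when none was given.
def greet(name, &block)
  greeting = "Hello, #{name}"
  block ? block.call(greeting) : greeting
end

# Style 2: no block parameter, just yield behind block_given?.
def greet_with_yield(name)
  greeting = "Hello, #{name}"
  block_given? ? yield(greeting) : greeting
end

p greet("Tyler")                              # "Hello, Tyler"
p greet("Tyler") { |g| g.upcase }             # "HELLO, TYLER"
p greet_with_yield("Tyler")                   # "Hello, Tyler"
p greet_with_yield("Tyler") { |g| g + "!" }   # "Hello, Tyler!"
```

Capturing with &block is the one to pick when the block has to be stored or passed on to another method.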


What's class_eval for?

How is this:

ActiveRecord::Base.class_eval do
  include TextConversion::Acts::Blog
end

different from this:

class ActiveRecord::Base
  include TextConversion::Acts::Blog
end

?
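
A sketch of the differences that usually matter, with made-up names (BlogBehavior, Article): class_eval works on any expression that evaluates to a class, its block can see surrounding local variables, and it never creates a class by accident.

```ruby
module BlogBehavior
  def blog?; true; end
end

class Article; end

target = Article   # the class can come from a variable
suffix = "!"       # a local the block below can see

target.class_eval do
  include BlogBehavior
  define_method(:shout) { |text| text + suffix }
end

p Article.new.blog?         # true
p Article.new.shout("hi")   # "hi!"

# The class keyword would define a missing class; class_eval raises instead.
outcome = begin
  NoSuchClassAnywhere.class_eval { }
  :created
rescue NameError
  :name_error
end
p outcome   # :name_error
```

That last property is useful for patches: if the class you meant to extend has not been loaded, you find out immediately.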


How do I write a global method/function?

Typically that's a bad idea. You're usually better off making a class method of some class instead, to make the organization nice and avoid name collisions.

But you can...

For example, I wanted my backtrace method to be a super-global method.

The trick is to define the method in a module that gets included into the outermost namespace.

Kernel is one such module that is automatically included into the outermost namespace.

But you can include any module. It's as easy as:

module Globals
  def my_global
  end
end
include Globals


Send an e-mail (with or without attachments)

#!/usr/bin/env ruby
require 'rubygems'
require 'action_mailer'
#require 'mime/types' # Use this gem if you want to get the MIME type from a file

class ExampleMailer < ActionMailer::Base

  def simple_message(recipient)
    from 'you@example.com'
    recipients recipient
    subject 'A simple plain-text message'
    body 'This is the body.'
  end

  def with_attachment(recipient)
    from 'you@example.com'
    recipients recipient
    subject 'A message with an attachment'
    body 'Here is your attachment for today'
    mime_type = 'text/x-comma-separated-values'
    attachment(mime_type) do |attachment|
      file_name = 'some_file.csv'
      attachment.body = File.read(file_name)
      attachment.filename = file_name
      attachment.transfer_encoding = 'quoted-printable'   # If it is a text/* type
    end
  end
end

ExampleMailer.deliver_with_attachment('you@example.com')

See http://svn.tylerrick.com/public/ruby/bin/email_file.rb for a great example that puts it all together.

MIME types

Homepage: http://mime-types.rubyforge.org/
Documentation: http://mime-types.rubyforge.org/
Source code: gem install mime-types
Project/Development: http://rubyforge.org/projects/mime-types


Description: Types.




Readiness: 1.15 February 12, 2006


.type_for will try to determine an appropriate MIME type based on the filename (not on the content).

Strangely, it didn't recognize the '.rb' extension. (So I made 'text/plain' the default for unrecognized extensions.)

irb -> require 'mime/types'
    => true

irb -> MIME::Types.type_for('what.rb')
    => []

It fares a little better when you give it a .png filename:

irb -> MIME::Types.type_for('what.png')
    => [#<MIME::Type:0xb7a87098 @obsolete=nil, @url=["IANA", "[Randers-Pehrson]"], @content_type="image/png", @raw_sub_type="png", @media_type="image", @simplified="image/png", @encoding="base64", @raw_media_type="image", @extensions=["png"], @registered=true, @system=nil, @sub_type="png">]



Caveat: If you override-and-alias a method in a file and then load the file twice, you can get stuck in an infinite loop

Why would you load the same file twice? One time this might happen is if you have two different svn:externals to the same library. Even though it's really the same file, if there are two copies of the file, each with a different path, then Ruby will think they are different files and will load both of them.

Example:

You override ActionMailer::Base.deliver! and alias the old one to ActionMailer::Base.really_deliver!, add some of your own code, and then have it call ActionMailer::Base.really_deliver!.
First time the file is loaded: You override ActionMailer::Base.deliver! (call this new method deliver1) and alias the old one to ActionMailer::Base.really_deliver! (call this deliver0). deliver1 calls deliver0.
Second time the file is loaded: You override ActionMailer::Base.deliver! again (call this deliver3) and alias the "old" one (deliver1) to ActionMailer::Base.really_deliver! (call this deliver2).
Now when you call ActionMailer::Base.deliver!, you are actually calling deliver3, which calls ActionMailer::Base.really_deliver! (deliver2), which is a copy of deliver1, which calls ActionMailer::Base.really_deliver!, which is again deliver2, and now we're in a loop.

Solution: Only create the alias if it hasn't already been created:

alias_method :really_deliver!, :deliver! unless method_defined?(:really_deliver!)
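
A sketch that simulates the double load without ActionMailer. NoisyMailer and the apply_patch lambda are stand-ins for the real class and for the file being loaded twice:

```ruby
# Stand-in for the class being patched.
class NoisyMailer
  def deliver!
    "delivered"
  end
end

# Stand-in for the patch file; calling it twice simulates loading it twice.
apply_patch = lambda do
  NoisyMailer.class_eval do
    # The guard: only take the alias the first time around.
    alias_method :really_deliver!, :deliver! unless method_defined?(:really_deliver!)

    def deliver!
      "logged, then " + really_deliver!
    end
  end
end

apply_patch.call
apply_patch.call   # second "load" is harmless thanks to the guard

p NoisyMailer.new.deliver!   # "logged, then delivered"
```

Without the unless clause the second call would alias the wrapper to itself and the final call would recurse until the stack overflowed.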

Real root of the problem:

  • Require/load should be smart enough to not load two identical copies of the same file (is that even possible??)
  • [problems (category)] There should be a more elegant way to override methods and have access to the original method (an 'original_method' keyword comparable to the 'super' keyword??)

Another good example: Ruby / How to alias and override a class method

How do you survive without the ++ operator (i++)?

Oh it's not that hard, really. i+=1 takes, what, a whole character more to type than i++?

C++ has ++i (pre-increment) and i++ (post-increment). Ruby only has i+=1 (pre-increment).

Miss the post-increment variety? Sometimes I do...

Here's an example use of += : Instead of hard-coding all the index numbers (and then having to change them when the order of these e-mails changes) like this:

      assert_match %r{Subject A}, @emails[0].subject
      assert_match %r{Subject B}, @emails[1].subject
      assert_match %r{Subject C}, @emails[2].subject

you can do this:

      i=-1
      assert_match %r{Subject A}, @emails[i+=1].subject
      assert_match %r{Subject B}, @emails[i+=1].subject
      assert_match %r{Subject C}, @emails[i+=1].subject

If you need to change the order around, you can simply move the lines without changing them at all:

      i=-1
      assert_match %r{Subject C}, @emails[i+=1].subject
      assert_match %r{Subject B}, @emails[i+=1].subject
      assert_match %r{Subject A}, @emails[i+=1].subject

Notice how you have to start at -1 in order to make the first index operation look at 0, because the order of operations goes like this: increment i (i+=1, i changes from -1 to 0), then look up the entry in @emails at the index given by the result of i+=1 (which happens to be 0). It's a little weird, but oh well.

How do you check if a class constant has been defined yet?

puts const_defined?(:TheClass)

raises error

undefined method `const_defined?' for #<Object:0xb7f3c9f8> (NoMethodError)

const_defined? is a public instance method of Module.

It would be nice if there were like an outer Module object that I could refer to and ask questions about ["global"] constants like class constants.

Problem and solution: If you monkey patch a class before the original class is defined, it won't be effective (Rails)

I ran into this problem when I was trying to mock out a method called InterfaxFax#fax!. I put my mock monkey patch in test/test_helper.rb. At that point in time the interpreter hadn't yet loaded or auto-loaded vendor/plugins/shared/lib/interfax_fax.rb / InterfaxFax so it thought it was creating a new class named InterfaxFax rather than re-opening an existing class InterfaxFax.

This led to an error that was confusing until I realized what had happened:

wrong number of arguments (1 for 0) (ArgumentError)

This was being raised when I called InterfaxFax.new(some_document). In my original class, I provided an initialize method that accepted 1 argument. But now since I was loading the new class (supposed to be a monkey patch for the original class but ended up accidentally being a stand-alone class) and not getting the initialize method from the original class, it was using the default initialize method inherited from Object, which only takes 0 parameters. Hence the ArgumentError.

The problem in this case was our reliance on Rails's auto-loading. [To do: Write separate article on this problem] We don't explicitly list in our start-up file (config/environment.rb) and have a require statement for every file that we need to use. Instead, we just name our files according to Rails's conventions (ClassName => class_name.rb) and rely on Rails to auto-load the file containing the class definition at the time we first reference the class.

But if we never reference the class but then try to monkey patch the class, it won't be effective.

Solution: Always ensure that the class is loaded before you monkey patch it!

We can't just use something like const_defined? to check if it's been defined. We need it to be defined; it's not optional. We actually need to force Rails to auto-load the constant. The easiest way to do that is to just reference the constant on a line by itself. This is valid Ruby syntax and triggers Rails auto-load mechanism.

test/test_helper.rb:

InterfaxFax        # causes Rails loader to search for the file interfax_fax.rb in all of its load paths
# Monkey patch:
class InterfaxFax
  def fax!(*args)
    puts "monkey-patched version"
  end
end

vendor/plugins/shared/lib/interfax_fax.rb (it will find the file here and load it):
class InterfaxFax
  def initialize(data)
    #...
  end
  def fax!(fax_number)
    puts "original version"
    #...
  end
end

The other option is to explicitly load the file before you monkey-patch it (require "interfax_fax"), but it feels like I ought to be consistent and either explicitly load all files or consistently rely on Rails's auto-loading for all file loading.

Note: If we don't force auto-loading of InterfaxFax before the monkey patch, then the auto-load will never happen!

# Monkey patch:
class InterfaxFax
  def fax!(*args)
    puts "monkey-patched version"
  end
end
InterfaxFax        # does NOT cause Rails loader to search for the file interfax_fax.rb in all of its load paths. Does NOT cause vendor/plugins/shared/lib/interfax_fax.rb to be reloaded.

The reason for this is that Rails's auto-loading is triggered by a const_missing message -- that is, when it comes across a constant that it's never seen before. But in this case, it has seen the constant InterfaxFax before -- that class is now defined -- so no const_missing message will be sent, Rails will not try to auto-load any files for InterfaxFax, and the "real" class definition, in interfax_fax.rb, will not get loaded.

So when we try to call one of the methods that's only available in the original class (like new(with_an_argument)), it won't find it.
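
The difference can be reproduced without Rails. The following is a minimal sketch, not Rails's actual loader: ToyApp, its const_missing hook, and the Fax and Report classes are all hypothetical stand-ins for the auto-loader and for interfax_fax.rb.

```ruby
# Toy stand-in for an auto-loader (an assumption for illustration only;
# Rails's real loader searches its load paths for a matching file).
module ToyApp
  LOAD_LOG = []

  # Ruby sends const_missing only when a constant lookup fails.
  def self.const_missing(name)
    LOAD_LOG << name
    original = Class.new do
      def initialize(data)
        @data = data
      end

      def fax!(fax_number)
        "original version"
      end
    end
    const_set(name, original)   # "loads the file" and returns the class
  end
end

# Case 1: reference the constant first, then patch it.
ToyApp::Fax                     # const_missing fires, the original gets defined
class ToyApp::Fax               # reopens the existing class
  def fax!(*args)
    "monkey-patched version"
  end
end

# Case 2: patch first. This silently creates a brand-new class,
# so const_missing never fires and the original is never loaded.
class ToyApp::Report
  def fax!(*args)
    "monkey-patched version"
  end
end

patched_result = ToyApp::Fax.new("some document").fax!("555-0100")

case_two_error = begin
  ToyApp::Report.new("some document")
  nil
rescue ArgumentError => e
  e.class
end
```

In case 1 the patch wins and the one-argument initialize is still there. In case 2 the call to new fails with the same ArgumentError described above, and LOAD_LOG shows that only Fax was ever "loaded".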

One way you might think that you can get around this is by just doing an explicit require statement.

class InterfaxFax
  def fax!(*args)
    $sent_a_fax = true
    puts "overridden"
  end
end
require "interfax_fax"

You might (if you're silly) think that that will load all the original methods from interfax_fax.rb while retaining the monkey patch. This simply isn't the way it works: the monkey patch will be lost, overridden by the original method definition. The last method definition always wins.

This can be dangerous, especially when you depend on monkey patching to "mock out" dangerous behavior in the original class (for example, calls to external web services that cost money) and render it [innocuous].

There's probably a better, more reliable way to create mock objects. (What is it??)
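
One partial safeguard, sketched here under assumptions rather than offered as the standard answer, is to refuse to patch a method that does not exist yet. The helper name patch_existing_method and the PaidService class are hypothetical. Because the class is passed in as a constant, merely calling the helper references the constant, which would also trigger Rails's auto-loading.

```ruby
# Hypothetical helper: replace a method only if the class already defines it.
def patch_existing_method(klass, method_name, &replacement)
  defined_already = klass.method_defined?(method_name) ||
                    klass.private_method_defined?(method_name)
  unless defined_already
    raise ArgumentError,
          "#{klass}##{method_name} is not defined yet; load the original first"
  end
  klass.send(:define_method, method_name, &replacement)
end

# Stand-in for a class whose real method would cost money to call.
class PaidService
  def call!(number)
    "original version"
  end
end

patch_existing_method(PaidService, :call!) { |*args| "mocked version" }
mocked = PaidService.new.call!("555-0100")

# A misspelled (or not-yet-loaded) method is reported instead of silently ignored.
typo_error = begin
  patch_existing_method(PaidService, :cal!) { |*args| "mocked version" }
  nil
rescue ArgumentError => e
  e.message
end
```

This does not replace a proper mocking library, but it turns the silent failure described above into a loud one.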

---

The other possible way monkey patching can fail to work is independent of Rails and its auto-loading. It occurs when your monkey patch simply is defined before the original class. Here is an example (it can all go in the same file):

a_test.rb:

class InterfaxFax
  def fax!(*args)
    puts "monkey-patched version"
  end
end

class InterfaxFax
  def initialize(data)
    #...
  end
  def fax!(fax_number)
    puts "original version"
    #...
  end
end

InterfaxFax.new("some data").fax!("some fax number")
# prints "original version"

It uses the original version instead of the monkey-patched version because the "monkey-patched version" came first and actually is treated as the original class definition. So then when Ruby gets to the 2nd class definition in the file (the one that is supposed to be the "original"), it actually reopens the class, overrides the "monkey-patched version" with the "original version" and, since the "original version" is the last version it sees, that's the one that gets used.

Moral of the story: Monkey patches have to come after the code they are supposed to be patching. (Duh.)

What's the best way to make example code?

Answer:

  • For really simple examples:
    • Just do it in irb and copy and paste the output.
  • For all but the simplest examples:
    • write a stand-alone Ruby script. This way, people can not only read it, but they can also easily modify it and try it (run it).
    • Use xmp (example printer)


#!/usr/bin/env ruby
require 'irb/xmp'

class Single
  def Single.method_in_single() end
end

xmp <<end
Single.singleton_methods
end

When run, will output:

Single.singleton_methods
    ==>["method_in_single"]

If desired (so that users can see the output without running the file -- useful for static/RDoc web pages), decorate the code with the results of running it.

xmp <<end
Single.singleton_methods      # ==>["method_in_single"]
end

Be warned, however, that now your output will be cluttered by the comment:

Single.singleton_methods      # ==>["method_in_single"]
    ==>["method_in_single"]

So... you could move the comment outside of the xmp block:

xmp <<end
Single.singleton_methods
end
# ==>["method_in_single"]

But now your example file is looking ugly... You could throw your class definition and everything inside the xmp block if you'd like, to keep your source code less fragmented...

xmp <<End
  class Single
    def Single.method_in_single() end
  end
  Single.singleton_methods
  # ==>["method_in_single"]
End

But then your output is less readable!:

  class Single
    def Single.method_in_single() end
  end
    ==>nil
  Single.singleton_methods
    ==>["method_in_single"]
  # ==>["method_in_single"]
    ==>nil

That's not very nice, seeing it evaluate your comments. (Would be nice if xmp would strip out all comments before evaluating.)
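
A comment-skipping variant is easy to sketch. The helper below is hypothetical (it is not part of irb/xmp) and assumes one complete statement per line; it evaluates each non-comment line and formats the result roughly the way xmp does.

```ruby
# Hypothetical xmp-like helper: evaluates one statement per line,
# skipping blank lines and comment-only lines.
def show_examples(source, context = TOPLEVEL_BINDING)
  shown = []
  source.each_line do |line|
    code = line.strip
    next if code.empty? || code[0, 1] == "#"
    shown << "#{code}\n    ==>#{eval(code, context).inspect}"
  end
  shown
end

report = show_examples(<<EXAMPLES)
[3, 1, 2].sort
# ==>[1, 2, 3]
"abc".upcase
EXAMPLES

puts report
```

Multi-line statements such as class definitions would still have to live outside the string, so this only removes the comment clutter, not the fragmentation.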

Another option would be to call xmp for each statement individually, rather than passing it several statements as a large heredoc string:

xmp "Single.singleton_methods"      # ==>["method_in_single"]

That gives the output that we want... But again, it makes the code a little less readable/pretty. "How so?" you ask? Well, because it seems ugly to pass around Ruby statements as strings. Sure, that's what we were doing before with the heredoc, but heredoc makes it less obvious that it's a string:

xmp <<end
  object.find("whatever")
end

, although technically a string, doesn't appear as obviously like one as this would:

xmp "object.find(\"whatever\")"

There are trade-offs any way you do it. Let me know if you find a good solution/compromise.

What I'd like to be able to do, is to simply pass a block rather than a string:

xmp do
  object.find("whatever")
end

or, on a single line:

xmp { object.find("whatever") }

How to do that is the topic of another section...


Would it be possible to create a "but" operator?

It would just be an alias for and, but it would make some expressions read more naturally; it would let you emphasize the "but" quality of the condition that follows the but.

Even if one can't add to the set of built-in operators, I suppose one could simply make it a method instead:

class Object
  def but(rhs)
    self and rhs
  end
end

if (!a.empty?).but(a.size < 3)
  "It's not completely empty, that's true... but it's still too small!"
end
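
Two caveats apply to the method version, shown in the sketch below. First, a method call binds more tightly than !, so the negated condition needs its own parentheses. Second, unlike and, a method evaluates its argument before the call, so there is no short-circuiting. The sample arrays are made up for illustration.

```ruby
class Object
  def but(rhs)
    self && rhs
  end
end

small = [1]
large = [1, 2, 3, 4]

# Parenthesize the negation so that `but` is sent to its result.
small_ok = (!small.empty?).but(small.size < 3)   # true
large_ok = (!large.empty?).but(large.size < 3)   # false

# Without the parentheses, `!` applies to the whole method call:
misparsed = !(large.empty?.but(large.size < 3))  # true, not what we meant

# No short-circuit: the right-hand side runs even when the left is false.
calls = 0
false.but((calls += 1) > 0)
```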

Motivation:

I especially feel like using "but" when I'm adding a positive condition to an existing condition that has an unless in front of it. In order for the positive condition to not be a double negative, I have to rewrite the unless statement as an if statement. But then it feels like I should retain the negative emphasis that the unless supplied: hence the need for "but".

"unless it's one of the files we want to exclude..."

  unless [File.basename(__FILE__), ".", ".."].include? file
    puts file
  end

"if it matches the pattern of files we want to include, but it's not one of the files we want to exclude..." The word "but" works so much better here than "and" in English; the same can be said about the same statement written in Ruby:

  if file =~ /\.rb$/ and not [File.basename(__FILE__), ".", ".."].include? file
    puts file
  end

[Problems (category)] Can I create a method that accepts more than one block?

No, but you can get around that by using Procs.

Why would you want to pass more than one block to a method? There are lots of reasons! I've already come across several occasions where I've wanted to do this...

  • I want to let the user specify a block to do filtering on the output from some other arbitrary block
  • I want to create a custom assertion that lets you pass a block that does some extra checking on the exception I expect to be raised by the other arbitrary block that is passed.

As I alluded to, I've been able to [work around (category)] this problem by using Procs. Usually I can identify one block that stands out as the primary or "main" block. That one I leave as an actual block (the one that we yield to). Any other "blocks" I turn into Procs, which I create with lambda or Proc.new and pass in the normal argument list, just like any other (non-block) argument.

Example:

Although I can't do this:

    assert_exception(ArgumentError) do |exception|
      assert_equal "Unmatched single quote: '", exception.message
    end do
      SomeCommand.execute("foo -m 'stuff''")
    end

,

I can do this:

    assert_exception(ArgumentError, lambda { |exception|
      assert_equal "Unmatched single quote: '", exception.message
    }) do
      SomeCommand.execute("foo -m 'stuff''")
    end
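
For completeness, here is one way such an assert_exception could be written. This is a sketch under assumptions, not the actual implementation: the second parameter name (checker) and the failure message are invented, and a plain raise stands in for SomeCommand.execute.

```ruby
# The main block is yielded to; the secondary "block" arrives as a Proc argument.
def assert_exception(expected_class, checker = nil)
  yield
rescue expected_class => exception
  checker.call(exception) if checker   # run the extra checks on the exception
  exception
else
  raise "expected #{expected_class} but nothing was raised"
end

seen_message = nil
caught = assert_exception(ArgumentError, lambda { |exception|
  seen_message = exception.message
}) do
  raise ArgumentError, "Unmatched single quote: '"
end
```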

[Duplication (category)] / [Removing duplication (category)]

Example

Is there any (concise) way to remove this duplication?

dir, file = File.dirname(input), File.basename(input)

I came up with this:

dir, file = [:dirname, :basename].map{ |a| File.send(a, input) }

but it's arguable whether or not that's an improvement... It's more characters.

Slight variation (even more duplication that could be removed)...

dirname, basename = File.dirname(input), File.basename(input)

I tried this:

  [:dirname, :basename].map{ |a| eval "#{a} = File.send(a, input)" }

, but not only is it ugly (eval), it also doesn't work because it creates locals within the scope of the block and those variables aren't available for use outside of it, so I got this lovely error:

  undefined local variable or method `basename' for main:Object (NameError)
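
For this particular pair there is a built-in shortcut: File.split returns the directory part and the base name together, so no mapping or eval is needed. The sample path is made up.

```ruby
input = "/usr/lib/ruby/file.rb"

# File.split returns [dirname, basename] in one call.
dirname, basename = File.split(input)
```

That removes the duplication in this case, though it does not answer the general question of assigning several locals from a list of method names.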





Object-orientedness

Ruby / Classes, objects, modules, and methods

Object-orientedness: Classes and objects

Class hierarchy of fundamental classes

I think it's pretty important to get this much clear:

irb -> Class.superclass
    => Module

irb -> Module.superclass
    => Object

irb -> Object.superclass
    => nil

irb -> Class <= Module and Module <= Object
    => true

Class is subclass of Module

I have frequently been confused because I forgot that Class is a subclass of Module. So when I find some method in Module, I fail to realize that I can use that same method in Class as well!

So don't be alarmed, for example, that you can't find Class.class_variable_set in the rdoc. Instead, you'll find that there is Module.class_variable_set. You can still do this:

irb -> String.send(:class_variable_set, :@@what, 3)
    => 3

Okay, I'm confused. String.superclass is Object, not Class. So how does it inherit class_variable_set from Module?? (Or does it not?)
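
The way out of this confusion is that superclass and class answer different questions. String.superclass says what instances of String inherit from; String.class says what String itself is an instance of. String is an instance of Class, and Class inherits from Module, so Module's instance methods can be called on String. A small check (the Demo class is made up, to avoid touching String):

```ruby
class Demo; end

instances_inherit_from = Demo.superclass     # what Demo's instances inherit from
demo_is_a              = Demo.class          # what Demo itself is an instance of
from_module            = Demo.is_a?(Module)  # true, because Class < Module

# class_variable_set is an instance method of Module, so Demo responds to it.
Demo.send(:class_variable_set, :@@what, 3)
stored = Demo.send(:class_variable_get, :@@what)
```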

Class is the only known subclass of Module

If there is any other subclass of Module, please tell me!!

irb -> class Example end
    => nil
irb -> Example.class <= Module
    => true
irb -> Example <= Module
    => nil
irb -> Example.class
    => Class

irb -> "Another".class
    => String
irb -> "Another".class <= Module
    => nil

So you can check if an argument is either a Class or a Module by asking arg.class <= Module.

I think that's exactly what they were doing in the Pickaxe Ruby book, p. 389 in this example:

class Module
  ...
  def Module::doc(aClass)
    # If we're passed a class or module, convert to string
    aClass = aClass.name if aClass.class <= Module
    @@docs[aClass]
  end
end
puts Module::doc(Example)   # Example is a class, so aClass.class is Class, and Class <= Module
puts Module::doc("Another") # "Another" is not a class and is not a module, so it fails that test

(I wonder why they didn't just check aClass.is_a?(Class) or aClass.is_a?(Module)...wouldn't that have been easier to read?)

Reopening a class that has/had a superclass specified

Which of these is legal?:

class B < A
end
class B < C
end
# no!

class B < A
end
class B
end

class B
end
class B < A
end
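
Trying each case shows that only the middle one is legal. Reopening a class without naming a superclass is always fine (as is naming the same superclass again), but naming a different superclass raises a TypeError. That includes the third case, because a class defined without an explicit superclass already has Object as its superclass. A sketch with throwaway class names (D and the helper superclass_error are additions for illustration):

```ruby
class A; end
class C; end

class B < A; end
class B; end                    # legal: reopens B, superclass stays A

class D; end                    # D's superclass is Object

# eval is used only so the failing definitions can be rescued at run time.
def superclass_error(source)
  eval(source, TOPLEVEL_BINDING)
  nil
rescue TypeError => e
  e.message
end

different_superclass = superclass_error("class B < C; end")  # first case
late_superclass      = superclass_error("class D < A; end")  # third case
same_superclass      = superclass_error("class B < A; end")  # repeating A is fine
```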

Inheriting from a ... function (dynamically-determined class)???

/usr/lib/ruby/gems/1.8/gems/rails-1.1.6/lib/rails_generator/commands.rb

  class Base < DelegateClass(Rails::Generator::Base)
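
The expression after < can be any expression that evaluates to a class, and DelegateClass (from the standard delegate library) is a method that builds and returns one. The generated class forwards the wrapped class's methods to an object handed to initialize. A minimal sketch with a made-up LoudString class:

```ruby
require 'delegate'

# DelegateClass(String) returns a new class whose instances forward
# String's public methods to the wrapped object.
class LoudString < DelegateClass(String)
  def initialize(text)
    super(text)               # stores the object to delegate to
  end

  def shout
    upcase + "!"              # upcase is forwarded to the wrapped String
  end
end

loud    = LoudString.new("hello")
shouted = loud.shout
length  = loud.length
```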



Methods

How do you call a class method?

from an instance method of the same class

Here's one way I've found that works:

class Klass
  def my_instance_method
    self.class.my_class_method
  end
end

I like it only marginally better than:

class Klass
  def my_instance_method
    Klass.my_class_method
  end
end

from another class method of the same class

class Klass
  def self.my_other_class_method
    self.my_class_method
  end
end

Overriding methods

"Inside a subclass, calling super is sufficient by itself — you don’t have to specify the method name or arguments. It figures them out automagically from the current method and arguments. It can be overridden if necessary." (http://woss.name/2006/05/07/notes-from-a-rails-course/)

"Mixins (using include) behave as if they are inherited methods. So you can override them and call super and so on, as usual." (http://woss.name/2006/05/07/notes-from-a-rails-course/)

[Obscure useless tip] Can't set self= ? Try replace.

Can't do:

class Array
        def stupid
                self = [self[0]]
        end
end

Will give "Can't change the value of self"

Instead:

class Array
        def stupid
                self.replace [self[0]]
        end
end

replace doesn't necessarily exist in every class, but it exists in Array, Hash, String, ...


Singleton classes: Object-specific classes

See Pickaxe book, Chapter 24. Classes and Objects: How Classes and Objects Interact, p. 382-383

Ruby allows you to create a class tied to a particular object. In the following example, we create two String objects. We then associate an anonymous class with one of them, overriding one of the methods in the object's base class and adding a new method.

a = "hello"
b = a.dup

class <<a
  def to_s
    "The value is '#{self}'"
  end
  #...
end
a.to_s      #=> "The value is 'hello'"   [uses to_s from anonymous class that we just created for a]
b.to_s      #=> "hello"                  [uses to_s from String class]

This example uses the class <<obj notation, which basically says "build me a new class just for the object obj." We could also have written it as

#...
def a.to_s
  #...
end
#...

The effect is the same in both cases: a class is added to the object a. This gives us a strong hint about the Ruby implementation: a virtual class is created and inserted as a's direct class. a's original class, String, is made this virtual class's superclass. [...]

Virtual classes / meta classes / singleton classes

Technically, according to the Pickaxe book p. 380-382, each object has a meta/virtual/singleton class associated with it. I'm kind of confused about what difference (if any) exists between those 3 terms.

Pickaxe p. 382 (Matz):

"Every object in Ruby has its own attributes (methods, constants, and so on) that in other languages are held by classes. It's just like each object having its own class."
"To handle per-object attributes, Ruby provides a classlike something for each object that is sometimes called a singleton class."

"Singleton methods": Adding methods to objects as opposed to classes

When would you want to do that?

  • Whenever you only plan on calling the method on that particular object...
  • When you don't want to clutter up your base classes with your ["domain-specific"] methods...

[Real-world sightings (category)]

TkVariable example

Pickaxe p. ??:

require 'tk'
...
checked = TkVariable.new
def checked.status
  value == "1" ? "Yes" : "No"
end
status = TkLabel.new do
  text checked.status
  ...
end
...
TkCheckButton.new do
  variable checked
  ...
end
TkButton.new do
  text "Show status"
  command { status.text(checked.status) }
end

In this example, we add behavior to a single TkVariable object: we add a status method to the checked object. Why do we do that? Because we want the text of a label to be tied to the current result of that method. And we don't want the label to display the non-user-friendly values "1" or "0"; rather, we want to show something user friendly like "Yes" or "No".

We only use it for this one object, so there's no use in re-opening the TkVariable class and adding the method to all TkVariable objects. Nor do we really want to go through all the work of creating a special container class just to contain a TkVariable object and provide a status method.

Mocha::Expectation#with

http://mocha.rubyforge.org/classes/Mocha/Expectation.html#M000017

     # File lib/mocha/expectation.rb, line 169
     def with(*arguments, &parameter_block)
       @parameters, @parameter_block = arguments, parameter_block
       class << @parameters; def to_s; join(', '); end; end
       self
     end

Here a to_s method is added to a singleton class created just for the @parameters object.

Problem: you have a method that you want available to the String objects used by your class, but you don't want to litter the base String class with it.

(Very often in the context of: Adding a presentation layer to your data layer.)

Subclassing won't get you anywhere, since you can't control the type of objects returned by other methods (they'll return Strings, not "CustomStrings" (or whatever), so we'll just have to deal with it.)

Also, we don't want to go the procedural, non-OOP way of just calling a global/class-method...

number_to_currency(1234567890.50)     => $1,234,567,890.50

colorize_svn_diff(svn("diff file1"))

Yuck.

We want it to be object-oriented. So it should look closer to this (although not necessarily exactly), with a method being called on the object, rather than the object being passed to some method from another class:

(1234567890.50).to_currency

svn("diff file1").colorize

Rails has a bunch of "helper" methods like this that (I believe) cause your view code to be very un-object-oriented. (See, for instance, NumberHelper.)

Solution 1: Add these methods to the base class.

Add the methods directly to Float or to String.

Sure that would work...some of the time. But it would be very prone to name collisions. Other libraries that you're using might have had the same idea about adding methods with those names to Float or String. Even within your own application, you may want to have more than one method called "to_currency" or "colorize". You may want to display currencies one way in this report or quite another way over somewhere else.

Solution 2: Singleton methods.

Attempt 1

irb -> def svn(args); "output from `#{args}`"; end
    => nil

irb -> module Svn; module Diff; def colorize; self + " (colorized)"; end; end; end
    => nil

irb -> svn("diff file1").extend(Svn::Diff).colorize
    => "output from `diff file1` (colorized)"

Attempt 2: singleton_send: Create a singleton class with the method you need and then call it, with a single command

irb -> class Object; def singleton_send(mod, *args); self.extend(mod); self.send(*args); end; end
    => nil

irb -> svn("diff file1").singleton_send(Svn::Diff, :colorize)
    => "output from `diff file1` (colorized)"

Published at http://qualitysmithext.rubyforge.org/classes/Object.html#M000027




Duck typing

Pickaxe book p. 367: "In Ruby, the class is never (OK, almost never) the type. Instead, the type of an object is defined more by what the object can do. In Ruby, we call this duck typing. If an object walks like a duck and talks like a duck, then the interpreter is happy to treat it as if it were a duck."

[Example (category)] StringIO

Can use it interchangeably any place that expects a file IO object! It doesn't care! As long as it responds to all the same messages, the users of the object don't care.
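
A small sketch of the idea. write_report is a made-up method that only ever calls puts on whatever it is given, so a StringIO works just as well as a File or $stdout.

```ruby
require 'stringio'

# Only cares that its argument responds to puts.
def write_report(io)
  io.puts "first line"
  io.puts "second line"
end

buffer = StringIO.new
write_report(buffer)          # an in-memory stand-in for a file
captured = buffer.string

write_report($stdout)         # the very same method with a real IO object
```

This is also what makes StringIO handy in tests: output that would normally go to a file or the terminal can be captured and compared as a plain string.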








[Monkey patching (category)]

Bruce Tate (2007-03-13). Crossing borders: Extensions in Rails: The anatomy of an acts_as plug-in (http://www-128.ibm.com/developerworks/java/library/j-cb03137/index.html). Retrieved on 2007-03-14 16:44.

([reopening classes (category)])(Warning)

In Ruby, you can open up any class and redefine it quickly. This capability is one of Ruby's greatest strengths because of the increased flexibility. But the capability is a weakness too. Too much flexibility can lead to code that's difficult to understand and maintain, so be careful.


[Monkey patching (category)]: If you override attribute my_attr, will it also override my_attr= for you?

alias_method :date_placed, :creation_date

Will that also override :date_placed= as well?

Nope, you need to override that separately:

alias_method :date_placed=, :creation_date=

Is there/should there be another method that does both of these for you? Hmm...
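
Such a method is easy to sketch. alias_accessor below is a hypothetical helper name and Order is a made-up class; the helper simply issues both alias_method calls. (In a Rails application, ActiveSupport's alias_attribute may already cover this need.)

```ruby
class Module
  # Hypothetical helper: alias both the reader and the writer.
  def alias_accessor(new_name, old_name)
    alias_method new_name, old_name
    alias_method "#{new_name}=", "#{old_name}="
  end
end

class Order
  attr_accessor :creation_date
  alias_accessor :date_placed, :creation_date
end

order = Order.new
order.date_placed = "2007-03-14"   # goes through the aliased writer
```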






Object-orientedness: Modules and mixins

How do I define class methods in my module that should be mixed into the includer class?

How not to do it:

module ActsAsCat
  def self.acts_as_cat(a)
    ...
  end
end

class ActiveRecord::Base
  include ActsAsCat
end

That doesn't work because, although including ActsAsCat mixes all of its constants, variables, and instance methods into ActiveRecord::Base, acts_as_cat is defined as a class method, not a normal method. Class methods are not mixed in. (This is unfortunate and unintuitive in my opinion. I wonder why it can't be fixed by a future version of Ruby. Discuss)

This works:

module ActsAsCat
  def self.included(base)
    base.extend ClassMethods
  end

  module ClassMethods
    def acts_as_cat(a)
      ...
    end
  end
end

class ActiveRecord::Base
  include ActsAsCat
end

What does it do? base.extend mixes all the instance methods of ActsAsCat::ClassMethods (acts_as_cat in this case) into the class represented by base (in this case, ActiveRecord::Base). So they become class methods.
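
The same pattern can be tried without Rails. In this sketch Animal is a made-up stand-in for ActiveRecord::Base, and acts_as_cat just records its argument (cat_name and meow are also invented for the demonstration).

```ruby
module ActsAsCat
  def self.included(base)
    base.extend ClassMethods    # ClassMethods' instance methods become class methods of base
  end

  module ClassMethods
    def acts_as_cat(name)
      @cat_name = name
    end

    def cat_name
      @cat_name
    end
  end

  def meow                      # an ordinary instance method, mixed in by include
    "meow"
  end
end

class Animal
  include ActsAsCat
  acts_as_cat :tabby            # now callable at the class level
end
```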

Actually, an easier way to make class methods be mixed in when including is to use Facets' class_extension method.

append_features or included?

You can also define append_features instead of included. What's the difference? Well, the Ruby docs say ...

(http://ruby-doc.org/core/classes/Module.html#M000743) included(mod) is a "Callback invoked whenever the receiver is included in another module or class. This should be used in preference to Module.append_features if your code wants to perform some action when a module is included in another."
(http://ruby-doc.org/core/classes/Module.html#M000719) append_features(mod): "When this module is included in another, Ruby calls append_features in this module, passing it the receiving module in mod. Ruby’s default implementation is to add the constants, methods, and module variables of this module to mod if this module has not already been added to mod or one of its ancestors." This seems like a more fitting place to mix in ClassMethods.
I think we should heed the manual's advice and use "included" "in preference to Module.append_features". Why? (1) It's a nicer-sounding, more intuitive name. (2) The manual says to. (3) Because then we don't have to remember to call super like we would if we overrode append_features.
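
Reason (3) is easy to demonstrate. In this sketch (module and class names are made up), the module that overrides append_features without calling super never gets mixed in at all, while the one using included works as expected.

```ruby
module UsesIncluded
  def self.included(base)
    base.extend(ClassMethods)
  end

  module ClassMethods
    def tagged?; true; end
  end

  def hello; "hello"; end
end

module ForgotSuper
  def self.append_features(base)
    base.extend(ClassMethods)   # no call to super here
  end

  module ClassMethods
    def tagged?; true; end
  end

  def hello; "hello"; end
end

class Good;   include UsesIncluded; end
class Broken; include ForgotSuper;  end

good_has_hello   = Good.new.respond_to?(:hello)
broken_has_hello = Broken.new.respond_to?(:hello)
```

Both classes get the class method, but only Good gets the instance method: without super, the default append_features never adds ForgotSuper to Broken's ancestors.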

What does it mean to define a class/module within a class?

Also: [the difference between extend/include (category)] [extended/included (category)]

And why would you want to do such a thing?


class Command
  def initialize(subcommand)
    self.extend(self.class.const_get(:"Subcommand#{subcommand}"))
  end

  module SubcommandA
    def self.extended(base)
      puts "#{self.name} got extended into #{base.class.name}" end
    def self.included(base)
      puts "#{self.name} got included into #{base.class.name}" end
  end

  class SubcommandC
    def initialize
      puts "SubcommandC#initialize"
    end
  end
end

puts Module.constants.grep(/^Command/)
puts Command.constants.grep(/^Sub/).sort.map{|a| '\\ ' + a}
puts

puts "Including/extending at outermost level"
include Command::SubcommandA
self.extend Command::SubcommandA

puts
puts "Command.new(:A)"
Command.new(:A)

puts
puts "What about the *class* within a class?"
Command::SubcommandC.new

It looks like the class SubcommandC defined within the class Command behaves just as if it had been defined within a module named Command. In other words, our class behaved like a module.

That sort of makes sense that a class can be used like a module. It (Class) is, after all, a subclass of Module. (I'm not sure why I was so surprised to see modules defined within classes then!)

irb -> Class.superclass
    => Module

That's actually really nice that Ruby gives you that flexibility. It doesn't arbitrarily say "No, you can't put classes inside of classes! You can only put them in modules." It helps keep things clean because you can use your class for namespacing as well as for data storage (instance variables) and behavior (methods).

What is the outermost scope, actually? What is the receiver then? To what does an include apply?

If every method call has a receiver (an object), then what is the receiver if you just do a "bare word" (not qualified at all) method, like this?:

some_method

irb will tell you that it is operating on an instance of (and I don't mean subclass of) Object:

irb -> self
    => main

irb -> self.class
    => Object

So any method calls you make will be directed to that Object object (or Kernel, but that's another topic) unless you tell it otherwise.

You can tell irb to point to a different object, by the way, ...

irb -> irb "test"
Welcome!

irb#1 -> self
      => "test"

irb#1 -> self[0..0]
      => "t"

Okay, that's nice and all, but what does it mean for me? Well, it would be nice if we understood what happened when we did include at the outer scope.

class Command
  module SubcommandA
    def self.extended(base)
      puts "#{self.name} got extended into #{base.class.name}" end
    def self.included(base)
      puts "#{self.name} got included into #{base.class.name}" end
  end
end

puts "Including/extending at outermost level"
include Command::SubcommandA
self.extend Command::SubcommandA

yields:

Including/extending at outermost level
Command::SubcommandA got included into Class
Command::SubcommandA got extended into Object

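So the top-level include is applied to the Object class: base there is Object, and base.class.name prints "Class" because Object is itself an instance of Class. The extend, by contrast, applies to the main object, whose class is Object.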

(Yes, the Class constant does refer to an object, in case you were wondering...

irb -> Class.object_id
    => -604440958

)
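
Recording base itself rather than base.class makes the receivers visible. In this sketch, Probe is a made-up module with no methods of its own, and TOPLEVEL_BINDING is used so the calls run with main as self no matter where the snippet is pasted.

```ruby
module Probe
  EVENTS = []

  def self.included(base)
    EVENTS << [:included, base]
  end

  def self.extended(base)
    EVENTS << [:extended, base]
  end
end

main = TOPLEVEL_BINDING.eval("self")

# The same two calls as above, evaluated at the outermost level.
TOPLEVEL_BINDING.eval("include Probe")
TOPLEVEL_BINDING.eval("self.extend Probe")

included_into = Probe::EVENTS[0][1]   # the Object class
extended_obj  = Probe::EVENTS[1][1]   # the main object itself
```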

How does Kernel work then?

Question: Kernel is a module. So if the outer scope is Object, how come we're able to call the methods of Kernel (like "puts") as bare names?

It's almost as if Kernel is mixed into Object.

Indeed, that seems to be what this is telling us:

irb -> self.class.ancestors
    => [Object, PP::ObjectMixin, Kernel]

Yet its methods don't show up in Object...

irb -> self.class
    => Object
irb -> self.methods.grep /puts/
    => []
irb -> self.instance_methods.grep /puts/
NoMethodError: undefined method `instance_methods' for main:Object
        from (irb):13

irb -> Object.methods.grep /puts/
    => []
irb -> Object.instance_methods.grep /puts/
    => []

irb -> Kernel.methods(false).grep /puts/
    => ["puts"]
irb -> Kernel.instance_methods.grep /puts/
    => []

..., so was it really mixed in or does Ruby handle Kernel specially (internally)?

The RDoc for Kernel lists puts (and most other methods) under the heading "Public Instance methods". And yet...

If Kernel were actually mixed into Object, we should see behavior more like this...

irb -> module Whatever; def foo; end; end
    => nil
irb -> include Whatever
    => Object

irb -> self.methods.grep /foo/
    => ["foo"]

irb -> Object.methods.grep /foo/
    => ["foo"]

That is, we should see it listed in the Object.methods array. But it isn't.

So again I wonder, how does Kernel work?

Could Kernel's methods (puts at least) actually be class methods?

These are the only instance methods that Kernel claims to have:

irb -> Kernel.instance_methods(false).sort
    => ["==", "===", "=~", "__id__", "__send__", "class", "clone", "display", "dup", "eql?", "equal?", "extend", "freeze", "frozen?", "gem", "hash", "id", "inspect", "instance_eval", "instance_of?", "instance_variable_get", "instance_variable_set", "instance_variables", "is_a?", "kind_of?", "method", "methods", "nil?", "object_id", "pretty_inspect", "private_methods", "protected_methods", "public_methods", "require", "require_gem", "respond_to?", "send", "singleton_methods", "taint", "tainted?", "to_a", "to_s", "type", "untaint"]

So either the RDocs are lying by listing puts in the "Public Instance Methods" section, or I'm not understanding something.

Either way, I'm confused.

Here are some of the class methods provided by Kernel (I only included the interesting/familiar methods in this list, because it's a long list, but you can see that they're all class methods):

irb -> puts (Kernel.methods - Object.methods - Kernel.instance_methods).sort
irb -> require 'qualitysmith_extensions/module/class_methods'
irb -> puts Kernel.class_methods
`
at_exit
binding
block_given?
caller
catch
eval
exec
exit
fork
getc
gets
global_variables
lambda
load
p
pp
print
proc
putc
puts
raise
rand
require
sleep
system
throw

So puts is a class method! ... right?

So when we use these methods as bare names, somehow it knows to find them in Kernel.

Still don't understand how that works, unless Object inherits from Kernel or something, which I doubt they do...

irb -> Object.superclass
    => nil

I know that it's mixed into Object...

irb -> self.class.ancestors
    => [Object, PP::ObjectMixin, Kernel]

but I didn't think that mixing in a module caused the class methods to be added to the includer class. Am I wrong here?

set_trace_func doesn't tell us if the method called was a class method or instance method... Otherwise the following trace might have been more informative...

require 'rubygems'
require 'unroller'
Unroller::trace(:include_c_calls => true) { puts 'hi' }
#Identical output if we do this instead:
#Unroller::trace(:include_c_calls => true) { Kernel.puts 'hi' }
 | - (  c-call)     Kernel block_given? (/usr/lib/ruby/gems/1.8/gems/unroller-0.0.13/lib/unroller.rb:323)
 | - (c-return)     Kernel block_given? (/usr/lib/ruby/gems/1.8/gems/unroller-0.0.13/lib/unroller.rb:323)
 |    Unroller::trace(:include_c_calls => true) { puts 'hi' }           | t.rb:3
 | - (  c-call)     Kernel       puts (t.rb:3 )
 | - (  c-call)         IO      write (t.rb:3 )
hi | - (c-return)         IO      write (t.rb:3 )
 | - (  c-call)         IO      write (t.rb:3 )

 | - (c-return)         IO      write (t.rb:3 )
 | - (c-return)     Kernel       puts (t.rb:3 )


So if we "override" puts by defining a local method of that name, what actually happens? Why can't it find Kernel#puts ?

irb -> def puts; 'nothing'; end
    => nil

irb -> puts
    => "nothing"

And yet we can still access it as a module (class) method!...

irb -> Kernel.puts

    => nil

So it is a module method ... right? It's both? I'm confused.
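
It is both. Kernel's puts and friends are module functions: each exists once as a private instance method of Kernel (which every object picks up because Object includes Kernel) and once as a method on the Kernel module itself. That is why Kernel.puts works, and why methods and instance_methods, which list only public and protected methods, never show puts. Below is a check using the reflection methods that do list them, followed by a made-up Greeter module (with a made-up Host class) that gets the same behavior from module_function.

```ruby
private_names   = Kernel.private_instance_methods(false).map { |m| m.to_s }
singleton_names = Kernel.singleton_methods.map { |m| m.to_s }
public_names    = Object.new.methods.map { |m| m.to_s }

puts_is_private_instance_method = private_names.include?("puts")
puts_is_module_method           = singleton_names.include?("puts")
puts_is_public                  = public_names.include?("puts")

# module_function gives an ordinary module the same dual nature.
module Greeter
  def greet(name)
    "hello, #{name}"
  end
  module_function :greet
end

class Host
  include Greeter

  def welcome
    greet("guest")            # bare call works: greet is a private instance method
  end
end
```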

Actually, it only finds them in Kernel if they don't exist as local methods (in a more local scope). But maybe that's nothing unusual as far as the resolution of instance/class methods goes...

Local methods that match the same name as a Kernel method will be used instead of the Kernel method

So we can "override" methods sort of like this...

irb -> def exec; end
    => nil

irb -> exec
    => nil

We're not actually overriding the method when we do this, however: the Kernel method is still callable if we qualify the call with Kernel. We're just taking advantage of Ruby's [method call resolution order]. A method defined at the top level becomes a private method of Object, and Object comes before Kernel in the ancestors list, so our version is found first.

irb -> Kernel.exec('date')
Wed Feb 21 09:48:26 PST 2007

(To actually override a Kernel method, just re-open the Kernel module.)

Question: If we override `, however, how do we call the Kernel version???

irb -> def `(*args); p args; end
    => nil

irb -> `date`
["date"]
    => nil

irb -> Kernel.`hi`
(irb):19: warning: parenthesize argument(s) for future version
SyntaxError: compile error
(irb):19: unterminated string meets end of file
        from (irb):19
        from :0

irb -> Kernel.`(hi)
     " `
SyntaxError: compile error
(irb):24: unterminated string meets end of file
        from (irb):24
        from :0

Well, when lacking a more elegant solution, the send method usually gets the job done!

irb -> Kernel.send(:`, 'date')
    => "Wed Feb 21 10:07:11 PST 2007\n"

What does/can a mixin do?

http://www.rubycentral.com/book/classes.html

When a class includes a module, that module's instance methods become available as instance methods of the class. It's almost as if the module becomes a superclass of the class that uses it. Not surprisingly, that's about how it works. When you include a module, Ruby creates an anonymous proxy class that references that module, and inserts that proxy as the direct superclass of the class that did the including. The proxy class contains references to the instance variables and methods of the module.

(This means you can call super from a class and it will call the corresponding method of (one of) the modules that it has mixed in.)

Mixins: What's a mixin?

Bruce Tate (2007-03-13). Crossing borders: Extensions in Rails: The anatomy of an acts_as plug-in (http://www-128.ibm.com/developerworks/java/library/j-cb03137/index.html). Retrieved on 2007-03-14 16:44.

A module has method definitions, but no base inheritance hierarchy. Instead, you can attach modules to any existing Ruby class. If the concept is new to you, think of a module as an interface plus the implementation for that interface. The nice thing about a module is that you can attach its functionality to any existing Ruby class, and you can attach as many as you want. You can also leverage a class's existing capabilities. This technique is called mixing in. C++ uses multiple inheritance to provide a similar capability, but with ugly complications. The Java founders eliminated multiple inheritance to address those complications. With modules, you can get some of the benefits of multiple inheritance without the sticky complications. Languages such as Smalltalk and Python also support mix-in inheritance.

Mixins: who needs multiple inheritance?

Brown, Gregory (2007-01-11). inheritance concept in ruby (http://groups.google.com/group/comp.lang.ruby/browse_thread/thread/aac34ed1e5ac40d1/c7b7d59e35c8ff25?lnk=gst&q=inheritance). Retrieved on 2007-02-02 11:20.

The real complication with multiple inheritance is that you end up inheriting the ancestors of both parent classes. When you use a mixin, the additional methods are tacked on to a class in the hierarchy, rather than creating another set of ancestors. Since modules live outside of the tree of inheritance, they can be used to provide functionality without complicating the ancestry chain. See Comparable and Enumerable for excellent and practical uses of modules.

In which cases do the two scenarios (M.I. and mixins) cause different behaviors? Is it only related to member visibility? If yes, how exactly?

if:

A < B < C

and module D is mixed into C, there is still a single path back to A.

Had we used multiple inheritance (if it were possible), perhaps

F < E < D

so now, C has two distinct roots, A and F.

Now imagine circular dependencies and other complications. Scary! :)

So modules allow you to avoid the verbosity of interfaces without complicating the chain of ancestors.

Brown, Gregory (2007-01-12 12:13). inheritance concept in ruby (http://groups.google.com/group/comp.lang.ruby/browse_thread/thread/aac34ed1e5ac40d1/c7b7d59e35c8ff25?lnk=gst&q=inheritance). Retrieved on 2007-02-02 11:20.

That is, the exact same thing happens when you have a class include two modules and both implement the 'hello' method. There is no self-evident way of choosing, so you have to come up with an arbitrary choice like that of order of call to 'include'.

Yes, this is true.

However the issue is rarely with immediate dependencies, but dependency chains.

I.e., with mixins,

If I have A->B->C and C mixes in D, then E

the lookup is C,E,D,B,A

which logically is quite intuitive if you think about it.

If I have A->B->C

and  ?->D->C

and

?->E->C

Those two question marks represent two independent dependency chains that you cannot be sure what they implement without knowing the ancestors of each.

Since modules live outside of the hierarchy, you will only be likely to run into this problem with bad design. The point of mixins (and multiple inheritance) really, is to address orthogonal concerns. Name collision is a sign of either a code smell, or just a really complex system that is going to need a lot of thought to begin with.
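The lookup order quoted above (C, E, D, B, A) can be checked directly with Module#ancestors. This is only a sketch that reuses the letters from the quote as empty classes and modules.

```ruby
class A; end
class B < A; end

module D; end
module E; end

class C < B
  include D   # D is inserted right after C
  include E   # E is inserted right after C, ahead of D
end

lookup_order = C.ancestors.first(5)   # [C, E, D, B, A]
```

The module included last is searched first, which is the "arbitrary choice" of include order mentioned in the thread.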


Brown, Gregory (2007-01-19 20:20). Digging Deep: Mixing it up (or in) with Modules (http://www.oreillynet.com/ruby/blog/2007/01/digging_deep_mixing_it_up_or_i_1.html). Retrieved on 2007-02-02 11:20.

So this post will focus on a recent topic on RubyTalk, the merit of using Mixins as a suitable replacement for multiple inheritance.

Overview

If you haven’t worked with multiple inheritance before, or aren’t familiar with object oriented programming, this article isn’t going to be super helpful. However, as a quick review, the core concept of multiple inheritance is that an object can have more than one parent class. For example, perhaps you have a Student class, and you want to say Joe is a student, but he’s also a musician and a rubyist. Single inheritance would not let you have this kind of structure without awkward hackery, so multiple inheritance would make life easier.

Though I’m sure there are folks out there who can explain why Java’s interfaces help simplify things in some cases, this sort of thing is a notorious source of annoyance in Java. A lot of times single inheritance just doesn’t do the trick. So often when folks hear that Ruby is single inheritance, they get sort of nervous.

On the gripping hand, there are plenty of languages in which incorrect use of MI can really bite you. C++ and Perl would be common culprits [...].

Ruby I suppose is lucky enough to be able to happily steal good ideas and throw away the bad ones. With this in mind, I think the Mix-in idea really fits the language.

But isn’t a Module just a container / a crippled Class ?

A short definition of a Module would likely be ‘a ruby class that cannot be instantiated and does not live within the class hierarchy’. It wouldn’t be 100% complete, but it’s enough to work with.

This at first sounds inconsequential. But the semantic power that such a construct provides is rather vast. David Black mentioned that he likes the idea of having both a noun like construct as well as an adjective like construct available for modeling.

That’s a pretty good way to put it. This object is an Array. It is Enumerable.

If you look closely, you’ll see the distinction. The relationship is not really an ‘is a’ for the latter, but more describes some behavior the object has. It’s noun vs. adjective, when it all boils down.

...

I see the point, but I still don’t get how to use these things in my modeling

Basically, if you’re already familiar with modeling multiple inheritance in a sane way, the leap to mix-ins will be relatively small. However, if you’re new to the concept, it might take a little extra leap. Going back to David’s analogy, classes are our Noun component and modules our adjective.

Perhaps we have a Record construct and we want to be able to add tags to these records( a la folksonomy ). This is an ideal time to say an object with the ability to be tagged is Taggable, which is a nice adjective, and makes a nice module.
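A small sketch of how the Taggable idea might look. Only the names Record and Taggable come from the excerpt; the methods tag, tags, and tagged? are made up here for illustration.

```ruby
# The "adjective": behaviour that any class can mix in.
module Taggable
  def tags
    @tags ||= []
  end

  # Adds a tag once and returns self so calls can be chained.
  def tag(name)
    tags << name unless tags.include?(name)
    self
  end

  def tagged?(name)
    tags.include?(name)
  end
end

# The "noun": a Record is Taggable.
class Record
  include Taggable
end

record = Record.new
record.tag('ruby').tag('mixins').tag('ruby')
record_tags = record.tags   # ["ruby", "mixins"]
```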

Subclasses vs. mixins

Hodel, Eric (2006-08-23). Subclassing vs include (http://blog.segment7.net/articles/2006/08/23/subclassing-vs-include). Retrieved on 2007-02-02 11:51.

In memcached Basics for Rails Rob Sanheim asked:

[W]hy make the cached model a class to extend instead of a module? Whether or not a model is cached should be an implementation detail, and shouldn’t define the hierarchy for a class. I know I would rather not use the power of (single!) inheritance just to cache something, when a mix-in should be plenty powerful to do it.

The short answer is: Using a class is the correct way to overlay features on top of another class.

Here’s my long answer:

When you use a class you get super, and super is a beautiful thing. It automatically walks your class’ ancestors and calls the right method in the right order. A module doesn’t have this property so you can’t use it to overlay features on top of a class. The class’ implementation will always be called before the module’s implementation.

Now you’re going to say something about using alias to shuffle methods around. You could do that, but you’ll have to do this for each method you want to overlay (five in CachedModel) which involves lots of extra typing “do_the_thingy_without_the_stuff” that you could have had for free (and more-prettily) with “super”.

You also get another problem that may cause subtle bugs. When you use alias to overlay features the order of execution is dependent upon the order the files are required in! Intentionally writing code where the behavior may change with file load order gives me the heebie-jeebies.

Having to do all that work to get the benefits of a subclass tells me that a module isn’t powerful enough to do what a subclass can, so a module isn’t the right way to add a cache.

I avoid* using modules to overlay features of a class, but do use them to add orthogonal or complementary features. Typically when I write a module it ends up being used like Comparable, Enumerable or Singleton. When I need to do something invasive a subclass is better.

Finally, making the argument that adding caching to ActiveRecord::Base via a subclass shouldn’t define the inheritance argument is very subjective. I could justify using a subclass by saying that ActiveRecord is a data storage class, and CachedModel is just another data store. If you want caching, inherit from CachedModel. If you don’t want caching, inherit from ActiveRecord::Base.

Jim Weirich said 1 day later:

Hi Eric, I’m not sure I’m following your reasoning.

Eric: When you use a class you get super, and super is a beautiful thing.

Agree about the beauty thing. But modules participate in super chaining as well. [How?]

Eric: A module doesn't have this property so you can't use it to overlay features on top of a class. The class's implementation will always be called before the module's implementation.

The current class's methods will be called before both the module and the parent class. The module functions will be called before the parent class. Which seems to be the right thing to do. Unless you are talking about something more subtle where the module is mixed-in in more than one place in the inheritance hierarchy. Thanks.

Rob Sanheim said 1 day later:

Hi Eric, Thanks for the detailed response. I’m curious as to why a module wouldn’t work wrt the point Jim made. I’m aware of the amount of magic going on in the finder methods, so I agree 100% things could get very nasty if you had to alias and chain things…

Regarding the inheritance issue – it just seems to me that having a “Cacheable” module makes much more sense (assuming the implementation isn’t horrible). In a perfect world, I’d also prefer including ActiveRecord::Base so my business models could extend from anything, and possibly also mixin ActiveCVS, ActiveLDAP, etc. I see the model's main concern being the business logic and data, and how that data gets persisted is secondary.

That said, I’ll admit my real world code hasn’t yet seen the need for abstract business superclasses, so Ruby’s power might allow metaprogramming to take the place of deeper class hierarchies.

Eric Hodel said 2 days later:

Jim: Lately I’ve seen lots of use of modules for layering functionality on top of classes implemented so that you don’t have to add include Blah into every subclass, particularly from rails. (For example flash.rb, from an older revision so it's more clear.) I find this particularly messy. CachedModel needs to override both instance and class methods, and include only adds instance methods. To work around this Rails adds a ClassMethods module and automatically includes it as appropriate. I find a separate module for extending class methods isn’t nearly as readable or organized as having everything in one place. I also can’t remember when which callbacks get called when, but super is simple and easy to understand.

...

James Mead said 11 days later:

Perhaps the more pertinent question is – should ActiveRecord::Base be a class? It could be argued that both persistence and caching are orthogonal concerns relative to the business logic of your domain.
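As for the "[How?]" above: a module that a class includes sits between that class and its superclass in the lookup chain, so super passes through it. A minimal sketch, with made-up names (BaseStore, Caching, Model) that only loosely echo the discussion:

```ruby
class BaseStore
  def find(id)
    ['BaseStore']
  end
end

module Caching
  def find(id)
    ['Caching'] + super   # continues on to BaseStore#find
  end
end

class Model < BaseStore
  include Caching

  def find(id)
    ['Model'] + super     # continues on to Caching#find
  end
end

call_order = Model.new.find(1)   # ["Model", "Caching", "BaseStore"]
```

This matches Jim Weirich's description: the class's own method runs first, then the module's, then the parent class's. It also shows Eric Hodel's point that an included module cannot run ahead of a method defined in the including class itself.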

[extension/non-core] Passing parameters along with your include

/usr/lib/ruby/gems/1.8/gems/facets-1.8.51/lib/facets/more/paramix.rb (Should be documented on the Module class page, but it seems that dependency.rb's documentation "trumped" that)

  module Mixin
    def hello
      puts "Hello from #{Mixin(:name)}"
    end
  end

  class MyClass
    include Mixin, :name => 'Ruby'
  end

  m = MyClass.new
  m.hello   # prints "Hello from Ruby"
 





Procs/lambda

Example that uses variable number of arguments

This proc will add all of its arguments together and return the resulting sum:

irb -> Proc.new {|*args| args.inject{|sum, i| sum + i} }.call 1, 2, 3, 4
    => 10

How is Proc.new different from lambda?

The Rdoc for lambda tells us: "Equivalent to Proc.new, except the resulting Proc objects check the number of parameters passed when called."

Demo:

irb -> Proc.new {|a, b| a + b}.call 1, 2, 3
    => 3

irb -> lambda {|a, b| a + b}.call 1, 2, 3
ArgumentError: wrong number of arguments (3 for 2)
        from (irb):4
        from (irb):4:in `call'
        from (irb):4
        from :0
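The same difference as a script rather than an irb session, also showing what happens with too few arguments. This is a sketch for a current Ruby; Proc#lambda? is not available on 1.8.

```ruby
plain_proc  = Proc.new { |a, b| [a, b] }
strict_proc = lambda   { |a, b| [a, b] }

extra_dropped  = plain_proc.call(1, 2, 3)   # [1, 2]   (extra argument ignored)
missing_padded = plain_proc.call(1)         # [1, nil] (missing argument becomes nil)

# The lambda refuses the wrong number of arguments.
lambda_error = begin
  strict_proc.call(1, 2, 3)
rescue ArgumentError => e
  e.class
end

kinds = [plain_proc.lambda?, strict_proc.lambda?]   # [false, true]
```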

Can we get the source code from a Proc we've created? / Can Procs be serialized/marshalled?

Sort of. Not directly with any core Ruby classes/methods, though. And only simple Procs can be serialized ever (no closures, for example [am I sure??]).

Read Ruby Quiz - SerializableProc (#38) / [4] for how it's done. Pretty cool stuff.

http://groups.google.com/group/comp.lang.ruby/browse_thread/thread/bca52d261bd28703

Jeffrey Moss wrote:
> Wouldn't it be possible to write a C extension for serializable closures?

I think NodeWrap does this. See http://rubystuff.org/nodewrap/

If I had this input:

block_to_string {
  whatever.foo(    blah         )
}

could I get it to produce this output?: "whatever.foo(blah)".

Hey, I found what I was looking for! ruby2ruby!

> irb
irb -> require 'ruby2ruby'
    => true

irb -> def block_to_string(&block); block.to_ruby; end
    => nil

irb -> block_to_string {
         whatever.foo(    blah         )
       }
    => "proc {\n  whatever.foo(blah)\n}"


Blocks (Closures)

Crossing borders: Closures: More than a little sugar

Closures are anonymous functions with closed scope.

...

Look at the first and fourth lines of code in Listing 5. The first assigns a value to the tax variable. The fourth line uses that variable to compute the tax column of the price table. But this usage is in a closure, so this code actually executes in the context of the collect method! Now you have insight into the term closure. The scope between the name space of the environment that defines the code block and the function that uses it are essentially one scope: the scope is closed. This characteristic is essential. This closed scope is the communication that ties the closure to the calling function and the code that defines it.
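Listing 5 from the article is not reproduced on this page. The following is only a guess at its general shape: a local variable (tax) defined outside a block and used inside a block passed to collect. The numbers and variable names are invented, and prices are integer cents to keep the arithmetic exact.

```ruby
tax_percent = 8                      # defined in the surrounding scope
prices_in_cents = [500, 1000, 2000]

# The block runs inside collect, yet it can still read tax_percent:
# the block and the code that defines it share one scope.
price_table = prices_in_cents.collect do |price|
  [price, price * tax_percent / 100]
end
# price_table is [[500, 40], [1000, 80], [2000, 160]]
```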

Some of the common closure scenarios are:

  • Refactoring
  • Customization
  • Iterating across collections
  • Managing resources
  • Enforcing policy

Listing 13. Enforcing policy

def do_transaction
   begin
      setup_transaction
      yield
      commit_transaction
   rescue
      roll_back_transaction
   end
end
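To try Listing 13, the three transaction methods need to exist. In the sketch below they are stand-ins that just record their names in an array (TRANSACTION_LOG is made up for this demo); do_transaction itself is unchanged from the listing.

```ruby
TRANSACTION_LOG = []

# Stand-ins for real transaction handling; they only record that they ran.
def setup_transaction;     TRANSACTION_LOG << :setup;    end
def commit_transaction;    TRANSACTION_LOG << :commit;   end
def roll_back_transaction; TRANSACTION_LOG << :rollback; end

def do_transaction
  begin
    setup_transaction
    yield
    commit_transaction
  rescue
    roll_back_transaction
  end
end

do_transaction { TRANSACTION_LOG << :work }   # succeeds, so it commits
do_transaction { raise 'something failed' }   # fails, so it rolls back
# TRANSACTION_LOG is [:setup, :work, :commit, :setup, :rollback]
```

Note that the listing's bare rescue swallows the error after rolling back; the caller never sees the exception.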

Example: application shutdown

Here's an example from Pickaxe book, page 248. It's a simple web server:

#!/usr/bin/ruby
require 'webrick'
include WEBrick

server = HTTPServer.new(
  :Port         => 2000,
  :DocumentRoot => File.join(Dir.pwd, "/html")
)

trap("INT") { server.shutdown }

server.start

As you can probably easily tell from reading the code, the block { server.shutdown } is executed when the SIGINT signal is received, thus ensuring that the server will get shut down tidily on interrupts.

What I particularly like about blocks in this case is that we didn't need to make a global variable for the HTTPServer object just so we could access it from inside the trap block. Instead, we can just use the local variable server!

(see http://www.ruby-doc.org/core/classes/Kernel.html#M005748 for details about the trap method.)

Example: timeout

/usr/lib/ruby/gems/1.8/gems/rails-1.1.6/lib/rails_generator/commands.rb

          # Look up synonyms on WordNet.  Thanks to Florian Gross (flgr).
          def find_synonyms(word)
            require 'open-uri'
            require 'timeout'
            timeout(5) do
              open(SYNONYM_LOOKUP_URI % word) do |stream|
                data = stream.read.gsub("&nbsp;", " ").gsub("<BR>", "")
                data.scan(/^Sense \d+\n.+?\n\n/m)
              end
            end
          rescue Exception
            return nil
          end
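The same block-with-a-deadline pattern without the network access, as a sketch. The helper name with_deadline is made up, and it calls the module-qualified Timeout.timeout, which is the form newer Ruby versions expect instead of a bare timeout.

```ruby
require 'timeout'

# Runs the block, returning its value, or nil if it takes longer than `seconds`.
def with_deadline(seconds)
  Timeout.timeout(seconds) { yield }
rescue Timeout::Error
  nil
end

fast_result = with_deadline(1)   { :done }           # :done
slow_result = with_deadline(0.1) { sleep 1; :late }  # nil, the block was interrupted
```

Rescuing Timeout::Error specifically is narrower than the rescue Exception in the Rails snippet, which also hides every other kind of failure.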





Iterators/Enumerators


(What's the difference between the terms "iterator" and "enumerator", by the way? I think iterator is the preferred term most of the time.)

Things you can do with Enumerable objects

  • inject
  • map
  • ...
delete_if (Array)
delete_if (Set)
delete_if (Hash)

concise iterator/map/collect calls with Symbol.to_proc

A very concise way to use map to do a method call on all elements

(requires http://extensions.rubyforge.org/)

irb -> require 'rubygems'
irb -> require 'extensions/symbol'

irb -> :size.to_proc
    => #<Proc:0xb7bfab00@/usr/lib/ruby/gems/1.8/gems/extensions-0.6.0/lib/extensions/symbol.rb:24>

irb -> ["three", "different", "words"].map(&:size)
    => [5, 9, 5]

       # The old-school method that you'll never go back to now that you know the concise alternative:
irb -> ["three", "different", "words"].map{|a|a.size}
    => [5, 9, 5]

How does it work? PragDave has a great explanation. "It’s an incredibly elegant use of coercion and of closures." I agree!
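Roughly, the trick is that & calls to_proc on its operand, and the resulting Proc sends the symbol's name to whatever it is given. The sketch below writes that Proc by hand as a plain helper (symbol_to_proc is a made-up name) instead of patching Symbol; it is not the extensions library's actual source.

```ruby
# Builds a Proc that calls the method named by `sym` on its first argument,
# passing any remaining arguments along.
def symbol_to_proc(sym)
  Proc.new { |obj, *args| obj.send(sym, *args) }
end

sizes = ['three', 'different', 'words'].map(&symbol_to_proc(:size))   # [5, 9, 5]
total = [3, 1, 9, 6].inject(&symbol_to_proc(:+))                      # 19
```

With inject, the block receives two values (the running total and the next element), so the Proc ends up calling total.send(:+, element).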

Example uses

Accumulate/sum up some numbers:

irb -> require 'rubygems'
irb -> require 'extensions/symbol'

irb -> sum = [3, 1, 9, 6].inject(&:+)
    => 19

Capitalize each word:

irb -> %w{a couple words}.map(&:capitalize).join(' ')
    => "A Couple Words"

Strip the newline from each line of a file:

irb -> require 'rubygems'
irb -> require 'extensions/symbol'

irb -> f = File.open('foo', 'w') { |f| f.puts 'line1'; f.puts 'line2' }
    => nil

irb -> File.readlines('foo')
    => ["line1\n", "line2\n"]
irb -> File.readlines('foo').map(&:chomp)
    => ["line1", "line2"]

Enumerable#map_send(meth, *args) {|e.send(meth, *args)| ...} [Ruby Facets (category)]

http://facets.rubyforge.org/src/doc/rdoc/core/classes/Enumerable.html#M001330

# File lib/facets/core/enumerable/map_send.rb, line 3
  def map_send(meth, *args)
    if block_given?
      map{|e| yield(e.send(meth, *args))}
    else
      map{|e| e.send(meth, *args)}
    end
  end

Sometimes Symbol#to_proc isn't enough because it doesn't let you pass args. map_send is the answer to this!

irb -> require 'facets/core/enumerable/map_send.rb'

irb -> [1,2,3].map_send(:+, 3)
    => [4, 5, 6]


Enumerable#every / Array#every! [Ruby Facets (category)]

http://facets.rubyforge.org/src/doc/rdoc/core/classes/Enumerable.html#M001236

Returns an elementwise Functor. This allows you to map a method on to every element.

irb -> require 'facets/core/enumerable/every'

  [1,2,3].every + 3           #=> [4,5,6]

  ['a','b','c'].every.upcase  #=> ['A','B','C']

Arguably more readable than map.

irb -> require 'facets/core/enumerable/map_send.rb'

irb -> [1,2,3].map_send(:+, 3)
    => [4, 5, 6]

How do I collect an "each"-type iterator as a collection?

irb -> "tyler".each_byte {|l| p l }
116
121
108
101
114

How do I convert that into the array [116, 121, 108, 101, 114] instead?

Sure, I can do this:

irb -> a = []; "tyler".each_byte {|l| a << l }; a
    => [116, 121, 108, 101, 114]

But isn't there a more elegant way???

Yes! See next question...

How do I turn an "each" into a "map"?

Use an Enumerable::Enumerator object! Very handy.

The plain map method is equivalent to this:

Enumerable::Enumerator.new(object, :each).map{|value| value}

Example:

irb ->  require 'enumerator'

irb ->  Enumerable::Enumerator.new(['a', 'b', 'c'], :each).map{|value| value}
    => ["a", "b", "c"]

irb -> ['a', 'b', 'c'].map{|value| value}
    => ["a", "b", "c"]

Fine, you don't need an Enumerator to do that... But to "catch" the values that are yielded by other iterators and get them returned as an array, I really don't know any other good way to do it...

irb -> Enumerable::Enumerator.new(['a', 'b', 'c'], :each_with_index).map{|value, index| [index, value]}
    => [[0, "a"], [1, "b"], [2, "c"]]

irb -> Enumerable::Enumerator.new(['a', 'b', 'c'], :each_with_index).map{|value, index| "#{index+1}. #{value}"}
    => ["1. a", "2. b", "3. c"]
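Object#to_enum (alias enum_for) builds the same kind of enumerator from any object and iterator name. It is a core method on current Rubies, where the class is called Enumerator rather than Enumerable::Enumerator. A short sketch:

```ruby
# Collect what each_byte yields into an array.
bytes = 'tyler'.to_enum(:each_byte).map { |byte| byte }
# [116, 121, 108, 101, 114]

# Same idea with each_with_index.
numbered = ['a', 'b', 'c'].to_enum(:each_with_index).map do |value, index|
  "#{index + 1}. #{value}"
end
# ["1. a", "2. b", "3. c"]
```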

...

Kernel#enum

require 'qualitysmith_extensions/kernel/enum'

Why would you want an "each" to be a "map"?

When you want to chain several "map"-type iterators together, that's when.

irb -> "line 1\nline 2".each_line {|line| "* #{line}"}.map {|line| (' '*2) + line}
    => ["  line 1\n", "  line 2"]

We wanted the each_line to prefix the line with a bullet. But that's not what happened. each_line actually returns the receiver unchanged!

irb -> "line 1\nline 2".each_line {|line| "* #{line}"}
    => "line 1\nline 2"

That's fine when that iterator is the "end of the line", so to speak (the last iterator in a chain of iterators):

irb -> "line 1\nline 2".each_line {|line| puts "* #{line}"}
* line 1
* line 2
    => "line 1\nline 2"

... but not so cool when you want to chain something on after it.

Solution with Enumerators:


irb -> require 'enumerator'
    => true

irb -> Enumerable::Enumerator.new("line 1\nline 2", :each_line).map {|line| "* #{line}"}.map {|line| (' '*2) + line}
    => ["  * line 1\n", "  * line 2"]

But what if you want that enumerator in the middle of a chain of map-type iterators?

How to insert an enumerator in the middle of a chain of map-type iterators

We want to do something sort of like this (but obviously not exactly like this):

irb -> n = 3; (1..n).to_a.map {|i| "line #{i}"}.join("\n").Enumerable::Enumerator.new(our_string_weve_built_so_far, :each_line).map {|line| "* #{line}"}.map {|line| (' '*2) + line}
NoMethodError: undefined method `Enumerable' for ["line 1", "line 2", "line 3"]:Array
        from (irb):9
        from :0

We could probably solve this with a Functor. But, more directly (no extra classes) and generically, we could solve it like this:

irb -> class Object; def with_self; yield self end; end
    => nil

irb -> n = 3; (1..n).to_a.map {|i| "line #{i}"}.join("\n").with_self {|our_string_weve_built_so_far| Enumerable::Enumerator.new(our_string_weve_built_so_far, :each_line)}.map {|line| "* #{line}"}.map {|line| (' '*2) + line}
    => ["  * line 1\n", "  * line 2\n", "  * line 3"]

That's pretty ugly and verbose, though. Let's see if we can concisify it a bit. I see that Enumerable defines several "enum_" methods. I'm very curious why they didn't throw in a generic "enum" method that let you pass in the name of the iterator to be used.

irb -> [].class.ancestors
    => [Array, Enumerable, Object, PP::ObjectMixin, Kernel]

irb -> Enumerable.instance_methods.grep /^enum/
    => ["enum_slice", "enum_cons", "enum_with_index"]

No matter. We'll just write our own.

http://code.qualitysmith.com/gemables/our_extensions/lib/enumerable/enum.rb

irb -> require 'enumerator'
irb -> module Enumerable def enum(iterator) Enumerable::Enumerator.new(self, iterator) end; end
    => nil

And we can see that it works marvelously:

irb -> n = 3; (1..n).to_a.map {|i| "line #{i}"}.join("\n").enum(:each_line).map {|line| "* #{line}"}.map {|line| (' '*2) + line}
    => ["  * line 1\n", "  * line 2\n", "  * line 3"]
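For comparison, the built-in Object#to_enum can sit in the middle of the chain in the same way as the hand-written enum, with no extension needed. A sketch of the same pipeline:

```ruby
n = 3
bulleted = (1..n).map { |i| "line #{i}" }.
  join("\n").
  to_enum(:each_line).            # turn the string's each_line into something mappable
  map { |line| "* #{line}" }.
  map { |line| (' ' * 2) + line }
# ["  * line 1\n", "  * line 2\n", "  * line 3"]
```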

[good example of iterator] all?

tasks/install_tasks.rake [Not yet released (category)]

      if !status.empty? and status.all? { |line| line =~ /^\?/ or line =~ /^A.*\.$/ }
        SVN.add '*'
        SVN.remove_without_delete 'log/*.log'
      end

Translated to English: if Subversion returned at least one status line, and every line either begins with a ? or begins with an A and ends with a period, then...
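The !status.empty? guard matters because all? returns true for an empty collection. A sketch with invented status lines (the helper name only_new_or_added? is made up, and the SVN calls are left out):

```ruby
def only_new_or_added?(status)
  !status.empty? && status.all? { |line| line =~ /^\?/ or line =~ /^A.*\.$/ }
end

fresh_checkout = ['?      log/development.log', 'A      .']
with_changes   = ['?      log/development.log', 'M      app/models/user.rb']

fresh_result   = only_new_or_added?(fresh_checkout)   # true
changed_result = only_new_or_added?(with_changes)     # false, one line starts with M
empty_result   = only_new_or_added?([])               # false, thanks to the guard
vacuous_truth  = [].all? { |line| false }             # true: all? on an empty array
```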

[good example of iterator] collect + max

This block of code goes through each of the task objects, looks at the length of its name, and uses the highest one as the width for display purposes.

      width = displayable_tasks.collect { |t|
        t.name.length
      }.max

collect actually returns an array containing all the lengths, and then max gets the highest item in the array.
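A stand-alone sketch of the same collect + max idea, using invented task names and comments in place of Rake::Task objects:

```ruby
tasks = [
  ['db:migrate', 'Migrate the database'],
  ['test',       'Run the tests'],
  ['doc:app',    'Build the docs']
]

# collect gives [10, 4, 7]; max picks 10.
width = tasks.collect { |name, comment| name.length }.max

# Pad every name to that width so the comments line up.
lines = tasks.map do |name, comment|
  format("rake %-#{width}s  # %s", name, comment)
end
```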

From: /usr/lib/ruby/gems/1.8/gems/rake-0.7.1/lib/rake.rb :

    def display_tasks_and_comments
      displayable_tasks = Rake::Task.tasks.select { |t|
        t.comment && t.name =~ options.show_task_pattern
      }
      width = displayable_tasks.collect { |t|
        t.name.length
      }.max
      displayable_tasks.each do |t|
        printf "rake %-#{width}s  # %s\n", t.name, t.comment
      end
    end

each_line

irb -> "a\nb".each_line {|a| puts a }
a
b
    => "a\nb"
 



Debugging



General

Debugging Ruby can be difficult when the classes/modules you are working with may have been reopened any number of times anywhere in your code base. By the time other modules have been mixed in and methods overridden, the resulting class may be completely different from what you think it is. How do you even know what methods exist??

These might help a little:

  • Object#public_methods
  • Object#private_methods
  • Object#protected_methods
  • instance_variables
  • instance_methods
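A quick sketch of these reflection calls on a throwaway class (Gadget and its methods are invented). The symbol results shown are what a current Ruby returns; 1.8 returns strings instead. Method#owner is an extra call, not in the list above, that tells you which class or module a method was actually found in.

```ruby
class Gadget
  def initialize
    @name = 'widget'
  end

  def label
    @name
  end

  private

  def secret
  end
end

gadget = Gadget.new

own_public    = Gadget.instance_methods(false)           # [:label], only what Gadget defines
own_private   = Gadget.private_instance_methods(false)   # includes :initialize and :secret
ivars         = gadget.instance_variables                # [:@name]
responds      = gadget.public_methods.include?(:label)   # true
defined_where = gadget.method(:label).owner              # Gadget
```

Passing false to instance_methods leaves out inherited and mixed-in methods, which helps when a class has been reopened in several places.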

How to examine call stack?

  • Kernel#caller
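A minimal sketch of Kernel#caller (the method names inner and outer are made up). It returns the current call stack as an array of strings, nearest caller first; the exact string format differs between Ruby versions.

```ruby
def inner
  caller          # the stack as seen from inside `inner`
end

def outer
  inner
end

stack = outer
nearest_frame = stack.first   # something like "file.rb:6:in `outer'"
```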

Debugging: Why can't it load [filename]??

Say, for instance, it's not finding a file named 'foo' when you do require 'foo'.

  • The first thing to check, of course, is the spelling of the filename.
  • Next, check $LOAD_PATH... The file (relative path) must exist in one of the paths listed there, or it won't find it.
  • Unless you're using RubyGems or some other library that changes the default behavior of require/load...

Here's some debug code I used one time. Not sure it's generally useful though...

  p $LOAD_PATH

  $LOAD_PATH.each {|d| print "For #{d}: "; p((Dir.entries(d).grep(/foo/) rescue [])) }
  found = $LOAD_PATH.find {|d| (Dir.entries(d) rescue []).include?('foo.rb') }
  puts "Was found in: #{found.inspect}"   # prints nil if foo.rb is not in any load path

  require 'foo'


Problem: inspect is too verbose?

Here's one way to just have it print the methods you care about:

[:whatever, :some_method].each {|method| puts "#{method} = #{self.send(method)}"}

as an alternative to:

puts self.inspect

Also consider using pp object (after require 'pp') or puts object.to_yaml (after require 'yaml'), as they can be far more readable a lot of the time.

Problem: you can't always print to stdout to do debugging

Sometimes I look at someone else's code (like Rails) to try to figure out how it works. Sometimes just looking at the code doesn't tell me enough; I need to inspect the values of variables as they are running.

So.... I can either try to mess with the breakpointer/debugger... or I could just dump the values I want to a file and inspect it after I run the code.

You never know what logging facilities are available. Best to just rely on a primitive built-in file-writing mechanism, like the one that the File class provides.

Make sure your ~/temp directory is writable, and then add this wherever you want to dump stuff.

File.open("/home/tyler/temp/debugging.log", "a") { |file| file.puts an_interesting_variable }

or

    File.open("debugging.log", "w") { |file| file.puts '' }
    def out(out); File.open("debugging.log", "a") { |file| file.puts out }; end

watch cat debugging.log
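A self-contained variant of the same idea, as a sketch. The helper name debug_out and the location of the log file (the system temp directory) are choices made for this example, not anything the code above requires.

```ruby
require 'tmpdir'

# Hypothetical log location; any writable path will do.
DEBUG_LOG = File.join(Dir.tmpdir, "debugging-#{Process.pid}.log")

# Start with an empty file, then append one inspected value per call.
File.open(DEBUG_LOG, 'w') { |file| file.print '' }

def debug_out(value)
  File.open(DEBUG_LOG, 'a') { |file| file.puts value.inspect }
end

debug_out(:an_interesting_variable => 42)
debug_out([1, 2, 3])

logged_lines = File.readlines(DEBUG_LOG)   # two lines, the second is "[1, 2, 3]\n"
```

Writing value.inspect rather than the value itself keeps nil, empty strings, and nested structures distinguishable in the log.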


Why doesn't it show the entire backtrace?

[Questions (category)][Problems (category)]

Sometimes, when the backtrace is really long, Ruby tries to be helpful and omits part of the backtrace (replacing the omitted section with the number of lines that were omitted, as in "... 54 levels..."), sort of like this:

./script/../config/../../lib/rails_profiler.rb:39:in `log': You have a nil object when you didn't expect it! (NoMethodError)
The error occurred while evaluating nil.info
        from ./script/../config/../vendor/rails/activerecord/lib/active_record/connection_adapters/sqlite_adapter.rb:145:in `execute'
        from ./script/../config/../vendor/rails/activerecord/lib/active_record/connection_adapters/sqlite_adapter.rb:346:in `catch_schema_changes'
        from ./script/../config/../vendor/rails/activerecord/lib/active_record/connection_adapters/sqlite_adapter.rb:145:in `execute'
        from ./script/../config/../vendor/rails/activerecord/lib/active_record/connection_adapters/sqlite_adapter.rb:160:in `insert'
        from ./script/../config/../vendor/rails/activerecord/lib/active_record/base.rb:1811:in `create_without_callbacks'
        from ./script/../config/../vendor/rails/activerecord/lib/active_record/callbacks.rb:254:in `create_without_timestamps'
        from ./script/../config/../vendor/rails/activerecord/lib/active_record/timestamp.rb:39:in `create'
        from ./script/../config/../vendor/rails/activerecord/lib/active_record/base.rb:1789:in `create_or_update_without_callbacks'
         ... 54 levels...
        from ./script/../config/../vendor/rails/activesupport/lib/active_support/dependencies.rb:495:in `require'
        from ./script/../config/../vendor/rails/railties/lib/commands/server.rb:40
        from ./script/server:4:in `require'
        from ./script/server:4

I have to say, however, that despite the good intentions of whoever added this "feature", it is often not helpful. What if I want to see the entire backtrace?

Unfortunately, I don't think there's any (easy, built-in) way to enable the showing of the full backtrace.

Instead, you are forced to rescue the exception yourself and print out the full backtrace.

You have to go to one of the outer files (one of those listed at the bottom of the incomplete backtrace), and surround the offending line or lines with something like this:

begin
  # block within which the exception is raised
rescue Exception => exception
  puts exception.class.name + ": " + exception.message
  puts exception.backtrace.join( "\n" )
end

So it's doable, but a bit of a pain...

And you have to remember to remove that code when you are done...
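A runnable sketch of that rescue-and-print approach. The recursive method dive is made up just to produce a deep stack.

```ruby
# Recurses `depth` levels and then raises, giving a long backtrace.
def dive(depth)
  raise 'bottom reached' if depth.zero?
  dive(depth - 1)
end

begin
  dive(60)
rescue Exception => exception
  full_backtrace = exception.backtrace          # every frame, nothing elided
  puts exception.class.name + ': ' + exception.message
  puts full_backtrace.join("\n")
end
```

Exception#backtrace always holds the complete list; any shortening happens only when Ruby itself prints an uncaught exception.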

References:

Libraries

Unroller

Homepage: http://unroller.rubyforge.org/


Project/Development: http://rubyforge.org/projects/unroller


Description: A tool for generating human-readable "execution traces". Displays the source code on your screen in real-time as it is executed. Want to know what goes on behind the scenes when you call some 3rd-party method? Trace the method call and find out!



Authors: Tyler Rick


RuntimeProfile

Project/Development: http://rubyforge.org/projects/runtimeprofile/


Description: Diagnostic utility to track, during the runtime, which file and line classes, modules, and methods are added and changed.




Readiness: This Project Has Not Released Any Files, 2007-05-11 20:21


 




Metaprogramming

(Dr. Nic)

http://drnicwilliams.com/2007/06/09/smart-people-doing-smart-things-in-netherlands-rubyenrails-2007/. Retrieved on 2007-05-11 11:18.


DIY syntax

The ability to extend my programming language tickles me pink. Often you write a block of code and you just think “That should be prettier and simpler”. With Ruby meta-programming, blocks, method_missing, const_missing and optional parentheses you can craft nearly any syntactic sugar you like to replace lengthy, complicated code.

http://www.slideshare.net/drnic/rubyenrails2007-dr-nic-williams-diy-syntax/


loops + define_method

Save yourself a bunch of keystrokes as well as duplication by doing something like this:

      [:adapter, :host, :database, :username, :password].each do |method|
        define_method(method) { @database[method.to_s] }
      end

instead of this:

      def adapter
        @database["adapter"]
      end
      def host
        @database["host"]
      end
      def database
        @database["database"]
      end
      def username
        @database["username"]
      end
      def username
        @database["password"]
      end

(Oops, was that a typo in the last case? Must have been because of all that copying and pasting I did. You get my point!)
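For a version you can paste straight into irb, here is a sketch that puts the loop inside a class. The class name DatabaseConfig and the sample settings are assumptions made up for the example:

```ruby
# Hypothetical class wrapping a settings hash with string keys.
class DatabaseConfig
  def initialize(settings)
    @database = settings
  end

  # One reader method per key, generated in a loop.
  [:adapter, :host, :database, :username, :password].each do |method|
    define_method(method) { @database[method.to_s] }
  end
end

config = DatabaseConfig.new("adapter" => "mysql", "password" => "secret")
config.adapter   # => "mysql"
config.password  # => "secret"
config.host      # => nil (no such key in the sample settings)
```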

Concurrency: Multithreading

Clarification: This is about Ruby's ["green"] threads, which are different from the operating system's native threads...

What is shared between threads?

Pickaxe book, p. 136

  • "A thread shares all global, instance, and local variables that are in existence at the time the thread starts."
  • "Local variables created within a thread's block are truly local to that thread--each thread will have its own copy of these variables."

Don't forget to "join" your threads

Pickaxe book, p. 137

When a Ruby program terminates, all threads are killed, regardless of their status. However, you can wait for a particular thread to finish by calling that thread's Thread#join method. The calling thread [usually the main thread] will block until the given thread is finished.
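A small sketch of what that looks like in practice. The helper name squares_in_threads is made up for the example:

```ruby
# Hypothetical helper: squares each number in its own thread.
def squares_in_threads(numbers)
  threads = numbers.map do |n|
    Thread.new(n) { |i| sleep(0.01); i * i }
  end
  threads.each { |thread| thread.join }   # block until every thread is done
  threads.map { |thread| thread.value }   # the value of each thread's block
end

squares_in_threads([1, 2, 3])  # => [1, 4, 9]
```

Thread#value waits for the thread by itself, so the explicit join is only there to make the waiting visible. Without either call, the main thread could reach the end of the program and kill the workers before they finish.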

Synchronization

Using thread.rb's Mutex class (Mutex#synchronize)

http://corelib.rubyonrails.org/classes/Mutex.html

http://safari.oreilly.com/0596523696/rubyckbk-CHP-20-SECT-4#rubyckbk-CHP-20-SECT-4

This code gives every object a synchronize method. This simulates the behavior of Java, in which synchronize is a keyword that can be applied to any object:


        require 'thread'
        class Object
          def synchronize
            mutex.synchronize { yield self }
          end

          def mutex
            @mutex ||= Mutex.new
          end
        end
        list = []
        Thread.new { list.synchronize { |l| sleep(5); 3.times { l.push "Thread 1" } } }
        Thread.new { list.synchronize { |l| 3.times { l.push "Thread 2" } } }
        sleep(6)
        list
        # => ["Thread 1", "Thread 1", "Thread 1", "Thread 2", "Thread 2", "Thread 2"]

Object#synchronize only prevents two synchronized code blocks from running at the same time. Nothing prevents a wayward thread from modifying the object without calling synchronize first.
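To see the underlying Mutex#synchronize on its own, without reopening Object, here is a minimal sketch. The helper name count_with_mutex is made up; the point is that every thread has to go through the same lock for the protection to mean anything:

```ruby
require 'thread'   # needed on old Rubies; harmless on newer ones

# Hypothetical helper: several threads increment one shared counter.
def count_with_mutex(thread_count, increments)
  counter = 0
  lock = Mutex.new
  threads = (1..thread_count).map do
    Thread.new do
      increments.times do
        lock.synchronize { counter += 1 }   # only one thread at a time in here
      end
    end
  end
  threads.each { |thread| thread.join }
  counter
end

count_with_mutex(10, 100)  # => 1000
```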

...

It would be nice if you could do this synchronization implicitly, the way you can in Java: you just designate certain methods as "synchronized", and the interpreter won't start running those methods until it can obtain an exclusive lock on the corresponding object. The simplest way to do this is to use [aspect-oriented programming (category)]. The AspectR library described in Recipe 10.15 can be used for this.

The following code defines an Aspect that can wrap methods in synchronization code. It uses the Object#mutex method defined above, but it could easily be changed to define its own Mutex objects:

        require 'aspectr'
        require 'thread'

        class Synchronized < AspectR::Aspect
          def lock(method_sym, object, return_value, *args)
            object.mutex.lock
          end

          def unlock(method_sym, object, return_value, *args)
            object.mutex.unlock
          end
        end


Any AspectR aspect method needs to take three arguments: the symbol of the method being called, the object it's being called on, and (if the aspect method is being called after the original method) the return value of the method.

The rest of the arguments are the arguments to the original method. Since this aspect is very simple, the only argument we need is object, the object we're going to lock and unlock.

Let's use the Synchronized aspect to create an array where you can only call push, pop, or each once you get an exclusive lock.

        array = %w{do re mi fa so la ti}
        Synchronized.new.wrap(array, :lock, :unlock, :push, :pop, :each)

...

When the first thread calls each, the AspectR-generated code calls lock, and the first thread gets a lock on the array. The second thread starts and it wants to call pop, but pop has been modified to require an exclusive lock on the array. The second thread can't run until the first thread finishes its call to each, and the AspectR-generated code calls unlock.

        Thread.new { array.each { |x| puts x } }
        Thread.new do
          puts 'Destroying the array.'
          array.pop until array.empty?
          puts 'Destroyed!'
        end
        # do
        # re
        # mi
        # fa
        # so
        # la
        # ti
        # Destroying the array.
        # Destroyed!
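If you don't have AspectR handy, roughly the same effect can be sketched with plain Ruby by redefining the chosen methods on the one object. This is my own sketch, not the Cookbook's code: the helper name synchronize_methods is made up, and it assumes Ruby 1.9 or later (a define_method block that takes a &block argument).

```ruby
require 'thread'

# Hypothetical helper (not part of AspectR): redefines the named methods on
# this one object so that each call runs while holding a per-object Mutex.
def synchronize_methods(object, *method_names)
  lock = Mutex.new
  singleton = class << object; self; end
  method_names.each do |name|
    original = object.method(name)   # grab the original before replacing it
    singleton.send(:define_method, name) do |*args, &block|
      lock.synchronize { original.call(*args, &block) }
    end
  end
  object
end

array = synchronize_methods(%w{do re mi fa so la ti}, :push, :pop, :each)
array.push("do")
array.pop  # => "do"
```

Mutex#synchronize releases the lock even when the wrapped method raises, which a bare lock/unlock pair does not. One caveat: Mutex is not reentrant, so calling a wrapped method from inside the block given to another wrapped method on the same thread raises a ThreadError.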

Queue

http://corelib.rubyonrails.org/classes/Queue.html

This class provides a way to synchronize communication between threads. Example:

  require 'thread'

  queue = Queue.new

  producer = Thread.new do
    5.times do |i|
      sleep rand(i) # simulate expense
      queue << i
      puts "#{i} produced"
    end
  end

  consumer = Thread.new do
    5.times do |i|
      value = queue.pop
      sleep rand(i/2) # simulate expense
      puts "consumed #{value}"
    end
  end

  consumer.join

Monitor / MonitorMixin

http://stdlib.rubyonrails.org/libdoc/monitor/rdoc/index.html

Methods:

  • broadcast count_waiters new signal wait wait_until wait_while

Adds monitor functionality to an arbitrary object by mixing the module with include. For example:

   require 'monitor.rb'

   buf = []
   buf.extend(MonitorMixin)
   empty_cond = buf.new_cond

   # consumer
   Thread.start do
     loop do
       buf.synchronize do
         empty_cond.wait_while { buf.empty? }
         print buf.shift
       end
     end
   end

   # producer
   while line = ARGF.gets
     buf.synchronize do
       buf.push(line)
       empty_cond.signal
     end
   end

The consumer thread waits for the producer thread to push a line to buf while buf.empty?. The producer thread (the main thread) reads a line from ARGF, pushes it to buf, and then calls empty_cond.signal.

Condition variables (via MonitorMixin)

http://stdlib.rubyonrails.org/libdoc/monitor/rdoc/classes/MonitorMixin/ConditionVariable.html

Pickaxe book, p. 146 [5]:

require 'monitor'
playlist = []
playlist.extend(MonitorMixin)
plays_pending = playlist.new_cond

# Customer request thread
customer = Thread.new do
  loop do
    ...
    playlist.synchronize do
      playlist << req
      plays_pending.signal
    end
  end
end

# Player thread
player = Thread.new do
  loop do
    playlist.synchronize do
      break if ok_to_shutdown && playlist.empty?
      plays_pending.wait_while { playlist.empty? }
      song = playlist.shift
    end
    ...
  end
end
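The Pickaxe listing leaves parts out (the "..." lines), so it can't be run as shown. Here is a cut-down sketch of the same wait_while/signal pattern that does run; the helper name play_all and the fixed number of requests are my own simplifications, not the book's:

```ruby
require 'monitor'

# Hypothetical helper: a player thread "plays" every requested song.
def play_all(requests)
  playlist = []
  playlist.extend(MonitorMixin)
  plays_pending = playlist.new_cond
  played = []

  # Player thread: sleeps on the condition variable while the list is empty.
  player = Thread.new do
    requests.length.times do
      song = playlist.synchronize do
        plays_pending.wait_while { playlist.empty? }
        playlist.shift
      end
      played << song   # "play" the song outside the lock
    end
  end

  # Request side (here, the calling thread): add a song, then wake the player.
  requests.each do |req|
    playlist.synchronize do
      playlist << req
      plays_pending.signal
    end
  end

  player.join
  played
end

play_all(%w{song_a song_b song_c})  # => ["song_a", "song_b", "song_c"]
```

wait_while checks the condition before it waits, so a signal sent before the player gets around to waiting is not lost.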

Mutex_m

http://stdlib.rubyonrails.org/libdoc/mutex_m/rdoc/index.html

Tips/Neat tricks/Ruby-fu magic/Ruby dynamicness

In method_missing, use define_method ... to create a dynamic method, then it’ll already be there for next time. (http://woss.name/2006/05/07/notes-from-a-rails-course/)
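A sketch of that tip, assuming a made-up Settings class that wraps a hash with string keys (respond_to_missing? needs Ruby 1.9.2 or later):

```ruby
# Hypothetical class: exposes hash keys as reader methods on demand.
class Settings
  def initialize(values)
    @values = values
  end

  def method_missing(name, *args)
    key = name.to_s
    return super unless @values.has_key?(key)
    # Define a real method so the next call skips method_missing entirely.
    self.class.send(:define_method, name) { @values[key] }
    send(name)
  end

  def respond_to_missing?(name, include_private = false)
    @values.has_key?(name.to_s) || super
  end
end

settings = Settings.new("color" => "red")
Settings.method_defined?(:color)  # => false
settings.color                    # => "red" (handled by method_missing)
Settings.method_defined?(:color)  # => true  (now a real method)
```

Note that the method ends up on the class, so every Settings instance gets it, including instances whose hash lacks that key (they just return nil).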

How to turn a string into an array of characters

irb -> "letters".split(//)
    => ["l", "e", "t", "t", "e", "r", "s"]

How to turn a string into an array of lines

irb -> multiline_string = <<End
     " Line 1
     " Line 2
     " End
    => "Line 1\nLine 2\n"

irb -> multiline_string.split(//)
    => ["L", "i", "n", "e", " ", "1", "\n", "L", "i", "n", "e", " ", "2", "\n"]

irb -> multiline_string.split(/\n/)
    => ["Line 1", "Line 2"]

irb -> multiline_string.split("\n")
    => ["Line 1", "Line 2"]

Write less code: take advantage of the fact that nil is false and non-nil is true!!

Suppose you want to ensure that your method returns an Array, but there is a chance that your return value might be nil at that point. You'd need to explicitly translate nil into []. Which would you rather write?

events.nil? ? [] : events

or

events || []

?

Decorator Pattern with Ruby in 8 lines

http://www.lukeredpath.co.uk/2006/9/6/decorator-pattern-with-ruby-in-8-lines

I think Trevor's variation is even more impressive.

How do you dynamically create/"create" methods?

Two ways...

Either:

  1. define them dynamically or
  2. don't define them at all and use method_missing to pretend like they're defined when the message is actually received (even more dynamic!).

define them dynamically

You can use Module#define_method or

class_eval do
  def ...
end

or use strings:

class_eval <<-EOS
  def #{prefix}_thingy ...
EOS

The advantage of doing it this way is that the methods are actually defined and available for introspection (they show up when you do object.methods, etc.). And when an error occurs in the method, it will actually say it occurred in [name of method] rather than in method_missing.

One disadvantage, though, is that you often have to evaluate strings, which is sort of messy, especially when you need to represent strings inside of your strings inside of your strings... Can't conceive of such a situation? All right, I'll give you an example:

  @@pricing_formula_fields.each do |pricing_term|
    class_eval <<-EOS
      def effective_#{pricing_term}
        puts %Q[ formula.send("#{pricing_term}") + self.market.send("#{pricing_term}_adjustment") + self.send("#{pricing_term}_adjustment").to_f ]
      end
    EOS
  end

Somewhat contrived, but it's really not too uncommon to do stuff like that, where you need to dynamically call a method (using send, and passing a string to it) from within a string from within a string that will be evaluated.

It can get pretty confusing with all sorts of delimiters to try to match up. I wouldn't even dream of doing all that with a single string delimiter (say, "), because then you'd have to escape and double-escape things. It's cleaner to mix three delimiter styles (<<-EOS ... EOS, %Q[ ], and " ", for example), and it's nice that we have three to mix!

A little bit more complicated even:

  @@pricing_formula_fields.each do |pricing_term| # (for example, :labor_price)
    class_eval <<-EOS
      def effective_#{pricing_term}(prefix = "")
        self.formula.send("#{pricing_term}") + self.market.send("#{pricing_term}_adjustment") + self.send("\#{prefix}#{pricing_term}_adjustment").to_f
      end
    EOS
  end

Note that if I tried to put #{prefix} instead of \#{prefix}, it would have complained about undefined local variable prefix. The reason we put \# is so that it doesn't evaluate it when the class_eval is called. Instead we want the literal character # to be left there in the string inside of self.send() so that it will interpolate that variable later when effective_whatever is called: self.send("#{prefix}whatever_adjustment")!
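For comparison, the same kind of method can usually be generated with define_method and a block, which avoids the nested quoting and the \# escape because nothing is evaluated from a string. This is only a sketch: the PricingFormula and Market structs, the WorkOrder class, and the numbers are all made up so that the example runs on its own (a default argument in the block needs Ruby 1.9 or later).

```ruby
# Hypothetical stand-ins for the formula and market objects.
PricingFormula = Struct.new(:labor_price)
Market         = Struct.new(:labor_price_adjustment)

class WorkOrder
  attr_reader :formula, :market

  def initialize(formula, market)
    @formula, @market = formula, market
  end

  # Adjustments stored as strings, hence the to_f in the generated method.
  def labor_price_adjustment;         "1.5"; end
  def special_labor_price_adjustment; "2.5"; end

  [:labor_price].each do |pricing_term|
    define_method("effective_#{pricing_term}") do |prefix = ""|
      formula.send(pricing_term) +
        market.send("#{pricing_term}_adjustment") +
        send("#{prefix}#{pricing_term}_adjustment").to_f
    end
  end
end

order = WorkOrder.new(PricingFormula.new(10.0), Market.new(2.0))
order.effective_labor_price              # => 13.5
order.effective_labor_price("special_")  # => 14.5
```

Here pricing_term is captured by the block when the method is defined and prefix is an ordinary argument at call time, so the "which one gets interpolated when?" question goes away.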

use method_missing

Useful especially when you can't practically enumerate all possible variations of method names that you want to be able to use.

Ruby on Rails takes advantage of this very well, by having methods like find_by_name_and_phone_number_and_age (or any other permutation of fields) available to you.
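To give a feel for how such finders can work, here is a toy sketch. It is not Rails' implementation; the PersonFinder class and its array-of-hashes storage are invented for the example:

```ruby
# Hypothetical class: answers find_by_<field>_and_<field>... calls by
# parsing the method name inside method_missing.
class PersonFinder
  def initialize(people)
    @people = people   # an array of hashes with symbol keys
  end

  def method_missing(name, *args)
    match = /\Afind_by_(.+)\z/.match(name.to_s)
    return super unless match
    fields = match[1].split("_and_")
    @people.find do |person|
      fields.zip(args).all? { |field, value| person[field.to_sym] == value }
    end
  end

  def respond_to_missing?(name, include_private = false)
    name.to_s.start_with?("find_by_") || super
  end
end

finder = PersonFinder.new([
  { :name => "Ann", :age => 30 },
  { :name => "Bob", :age => 40 }
])
finder.find_by_name("Ann")             # => {:name=>"Ann", :age=>30}
finder.find_by_name_and_age("Bob", 40) # => {:name=>"Bob", :age=>40}
finder.find_by_age(99)                 # => nil
```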

Aspect-Oriented Programming in Ruby

http://facets.rubyforge.org/api/more/classes/Cut.html

  class X
    def x; "x"; end
  end

  cut :C < X do
    def x; '{' + super + '}'; end
  end

  X.new.x  #=> "{x}"

Neat!!

True, you could insert that functionality -- that "cut" -- by using built-in Ruby mechanisms (class reopening, alias_method, and method overriding) -- without the use of the Cut class:

  class X
    def x; "x"; end
  end

  class X
    alias_method :original_x, :x
    def x; '{' + original_x + '}'; end
  end

  X.new.x  #=> "{x}"

but that's a little more cumbersome and ugly and you have to think of a name for each 'original function' that you want to preserve.
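On Ruby 2.0 and later there is also a built-in way to get the super-based style without Facets: Module#prepend. A minimal sketch (the module name Braces is made up):

```ruby
class X
  def x; "x"; end
end

# Hypothetical wrapper module; super calls the original X#x.
module Braces
  def x; '{' + super + '}'; end
end

# send is used because prepend was a private method before Ruby 2.5.
X.send(:prepend, Braces)

X.new.x  #=> "{x}"
```

Like the cut, this needs no alias name for the original method.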

Example of dynamically choosing a class

Not only can you dynamically choose a method to call, but you can also dynamically choose which class to instantiate. Useful if you want to call one standard method that's capable of delegating to a more specialized class...

irb -> klass = (rand < 0.5 ? String : Array)
    => Array
irb -> klass.new
    => []

irb -> klass = (rand < 0.5 ? String : Array)
    => String
irb -> klass.new
    => ""

Shoot, you can even create classes on the fly. Here's how to create a new, anonymous class:

irb -> klass = Class.new(superclass = Object)
    => #<Class:0xb7f031e4>

irb -> klass.name
    => ""

irb -> klass.superclass
    => Object

Why you'd ever want to do that, I have no idea, but at least you can if you want to!
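One place where it does come in handy is a method that builds classes to order, for instance throwaway classes in tests. A sketch, with the helper name build_greeter_class made up for the example:

```ruby
# Hypothetical factory: returns a new anonymous class whose greet method
# returns the given greeting.
def build_greeter_class(greeting)
  Class.new do
    define_method(:greet) { greeting }
  end
end

hello_class = build_greeter_class("hello")
hello_class.new.greet  # => "hello"

# Anonymous classes can be subclassed like any other class.
loud_class = Class.new(hello_class) do
  def greet
    super.upcase
  end
end
loud_class.new.greet   # => "HELLO"
```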





Consuming Web services

http://www.brendonwilson.com/blog/2006/04/02/ruby-soap4r-wsdl-hell/ Ruby + SOAP4R + WSDL Hell at www.brendonwilson.com

http://blog.webgambit.com/articles/2006/04/30/calling-a-net-web-service-from-rails-for-dummies Calling a .NET Web service from Rails (for dummies)

(Modified from their example:)

require 'soap/wsdlDriver'
factory = SOAP::WSDLDriverFactory.new("http://ws.invesbot.com/stockquotes.asmx?WSDL")
soap = factory.create_rpc_driver
soapResponse = soap.GetQuote(:symbol => 'MSFT')

http://www.pranavbihari.com/articles/2005/12/02/testing-paypal-web-services-with-ruby-soap4r Testing PayPal Web Services With Ruby soap4r

http://webservices.sys-con.com/read/39831.htm Web Services Made Easy with Ruby @ SOA WEB SERVICES JOURNAL

http://www.vanruby.com/slides-for-soap4r-presentation-by-emil.html Slides for SOAP4R Presentation by Emil | VanRuby - Vancouver Ruby Association

http://dev.ctor.org/soap4r soap4r - Trac




Reflection/Introspection

Finding out what methods are available for an object

It is useful to remove the methods common to all objects; they just get in the way when you list them.

(some_object.public_methods - Object.public_methods).sort
or just
(some_object.methods - Object.methods).sort

If you want to search through the methods: some_object.methods.grep /keyword/

Accessing the call stack

Kernel#caller

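Kernel#caller returns the call stack as an array of strings ("file.rb:133:in `some_method'"), most recent frame first. So caller[0] is the line that called the current method, and caller(0)[0] is your current location. The last value, caller[-1], is the outermost frame (usually the script that started the program), not your current location.

I first tried caller[-1], and it did not yield accurate results for me, which the ordering above would explain. So far I've only tried it in Rails unit tests.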

A more reliable, but more complicated way:

    begin
      raise "Where am I?"
    rescue Exception => exception
      puts exception.backtrace
    end

Better yet, throw it in a method so that it can be easily reused:

module Kernel
  def backtrace
    begin
      raise "Where am I?"
    rescue Exception => exception
      full_backtrace = exception.backtrace
      return full_backtrace[1..-1]    # We don't want this call to backtrace showing up in the backtrace!
    end
  end
end

    puts backtrace
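For the simpler question of "where am I?" or "who called me?", Kernel#caller with an explicit start frame may be enough. A small sketch (the method names are made up; the exact text of each frame varies between Ruby versions):

```ruby
def where_am_i
  caller(0)[0]   # frame 0 is this method itself
end

def who_called_me
  caller[0]      # same as caller(1)[0]: the line that called this method
end

puts where_am_i     # something like "file.rb:2:in `where_am_i'"
puts who_called_me  # the file and line of this very call
```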

Method introspection

irb -> def foo(a, b, c) end
    => nil

irb -> method(:foo)
    => #<Method: Object#foo>

irb -> method(:foo).methods - Object.methods
    => ["arity", "call", "to_proc", "unbind", "[]"]

method 'arity'

irb -> def foo(a, b, c) end
    => nil

irb -> method(:foo)
    => #<Method: Object#foo>

irb -> method(:foo).arity
    => 3

Optional args...

irb -> class Object; def required_arity() self.arity < 0 ? -(self.arity+1) : self.arity end; end
irb -> class Object; def has_optional_args?() self.arity < 0 end; end

irb -> def unlimited_args(*args) end
irb -> def require_2_optional_3(a, b, c=3, d=4, e=5) end

irb -> method(:unlimited_args).arity
    => -1
irb -> method(:unlimited_args).required_arity
    => 0
irb -> method(:unlimited_args).has_optional_args?
    => true

irb -> method(:require_2_optional_3).arity
    => -3
irb -> method(:require_2_optional_3).required_arity
    => 2
irb -> method(:require_2_optional_3).has_optional_args?
    => true

Note that, as far as I'm aware, arity alone gives you no way to determine how many optional args a method has.
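On Ruby 1.9.2 and later, Method#parameters appears to fill that gap: it lists every parameter along with its kind. A sketch (the helper name optional_arg_count is made up):

```ruby
def require_2_optional_3(a, b, c = 3, d = 4, e = 5)
end

# Hypothetical helper: counts the optional (defaulted) parameters.
def optional_arg_count(a_method)
  a_method.parameters.count { |type, _name| type == :opt }
end

method(:require_2_optional_3).parameters
# => [[:req, :a], [:req, :b], [:opt, :c], [:opt, :d], [:opt, :e]]

optional_arg_count(method(:require_2_optional_3))  # => 3
```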



Ruby internals / Really low-level introspection

See also: Ruby execution unroller (set_trace_func)

Really low-level introspection

http://eigenclass.org/hiki.rb?class+hierarchy+introspection+evil.rb

Describes how to use evil.rb to dig into the innermost internals of Ruby objects.

Object#internal is defined by evil.rb to return a handle that allows us to inspect and manipulate the low-level fields associated to the original object. For regular objects, the interesting fields would be:

  • iv_tbl: a pointer to a hash table with the instance variables (st_tbl *, see st.c)
  • flags: contains information such as whether the object is frozen
  • klass: points to the class of the object

Time for some basic low-level introspection:

a = Object.new
a.internal.klass.to_i                              # => 3085110560
a.instance_variable_set(:@foo, 1)
a.internal.iv_tbl.to_i                             # => 135912160

If I get around to it evil.rb will also be available here: http://svn.tylerrick.com/public/ruby/evil.rb

evil.rb uses 'dl/struct', part of Ruby/DL.

Ruby/DL

The docs for Ruby/DL are not completely translated but available here: http://www.jbrowse.com/text/rdl_en.html .

The coolest thing about DL, I think, is that it lets you have access to pointers from Ruby!

http://www.jbrowse.com/text/rdl_en.html

"is an instance of DL::PtrData. The PtrData class is the class for handling pointers from Ruby, and defines methods for managing pointers."

evil.rb takes advantage of pointers to let you inspect and muck around with objects' internals.

Example:

class Object
  ...

  def internal_ptr(*args)
    raise(ArgumentError, "Can't get pointer to direct values.") \
      if direct_value?
    pos = self.object_id * 2
    DL::PtrData.new(pos, *args)
  end

  # Changes the class of an Object to a new one. This will
  # change the methods available on the Object.
  #
  #   foo_klass = Class.new {}
  #   obj = Object.new
  #   obj.class = foo_klass
  #   obj.class # => foo_klass
  def class=(new_class)
    raise(ArgumentError, "Can't change class of direct value.") \
      if direct_value?
    raise(ArgumentError, "Class has to be a Class.") \
      unless new_class.is_a? Class
    if self.class.to_internal_type and
       new_class.to_internal_type and
       self.class.to_internal_type != new_class.to_internal_type
      msg = "Internal type of class isn't compatible with " + 
            "internal type of object."
      raise(ArgumentError, msg)
    end
    if self.class.to_internal_type == RubyInternal::T_DATA
      msg = "Internal type of class isn't compatible with " + 
            "internal type of object. (Both are T_DATA, but " +
            "that doesn't imply that they're compatible.)"
      raise(ArgumentError, msg)
    end
    self.internal.klass = new_class.internal_ptr.to_i
    return self
  end

  # Will let an Object become another Object in the whole
  # application. This is like #replace on Strings, Arrays
  # and Hashes, but it will work with more Objects.
  #
  #   obj_a = Object.new
  #   obj_b = Object.new; obj_b.instance_eval { @b = true }
  #   obj_a.instance_eval { @b } # => nil
  #   obj_a.become(obj_b)
  #   obj_a.instance_eval { @b } # => true
  def become(other)
    cloned = other.clone
    self.swap(cloned)
    return cloned
  end

  ...
end

Crazy stuff!

Ruby internals: a self-study guide to the sources

http://eigenclass.org/hiki.rb?ruby+internals+guide

Question (introspection): How do I figure out where a certain method came from?

For instance, I come across a call to a method blank?(). How do I determine which class that was defined in (it could be defined in the class of the object on which it was called, or it could be defined in any of its superclasses) or which module it was defined in (if it was mixed in)?

It would also be helpful to know the file path and line number where it was defined, since there could be many versions of a particular class on your system.

I don't know the best way to do this.

But one trick that may be useful is to look at the ancestors of the class of the object on which you are calling this mystery method (ancestors is defined on classes and modules, not on ordinary objects):

my_object.class.ancestors
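Newer Rubies can answer this more directly: Method#owner (Ruby 1.8.7 and later) gives the class or module that defines the method, and Method#source_location (Ruby 1.9 and later) gives the file and line. A sketch with a made-up mixin and class standing in for the real blank?:

```ruby
# Hypothetical mixin and class, only here to have something to inspect.
module Blankness
  def blank?
    respond_to?(:empty?) ? empty? : false
  end
end

class Basket
  include Blankness
  def empty?; true; end
end

mystery = Basket.new.method(:blank?)
mystery.owner            # => Blankness
mystery.source_location  # => the file name and line number of "def blank?"
```

source_location may be nil for methods implemented in C rather than in Ruby.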




Comparison with other languages

Objective-C

Categories

http://en.wikipedia.org/wiki/Cocoa_%28API%29

When extensions are needed, Cocoa's use of Objective-C makes this a straightforward task. Objective-C includes the concept of "categories" which allows for modifications to an existing class "in-place". Functionality can be accomplished in a category without any changes to the original classes in the framework, or even access to its source. Under more common frameworks this same task would require the programmer to make a new subclass supporting the additional features, and then change all instances of the classes to this new class.

C++

http://www.pjhyett.com/posts/198-c-x C++0x

template<class T> using Vec = vector<T,My_alloc<T>>;
Vec<double> v = { 2.3, 1.2, 6.7, 4.5  };
sort(v);
for(auto p = v.begin(); p!=v.end(); ++p)
    cout << *p << endl;

It’s a complete joke that the improvements to C++ are supposed to make the language easier to use and learn. Look how much easier and intuitive it is to write this code in Ruby right now.

v = [ 2.3, 1.2, 6.7, 4.5 ]
v.sort!
v.each { |p| puts p }

C#

Ruby 101 for .NET Developers: .map and .collect (http://www.softiesonrails.com/2007/3/2/ruby-101-for-net-developers-map-and-collect). Retrieved on 2007-03-29 14:44.

// C# Code
public string[] GetNames(Customer[] customers)
{
    List<string> names = new List<string>();

    foreach (Customer cust in customers)
    {
        names.Add(cust.Name);
    }

    return names.ToArray();
}

Looks like normal C# code, right? I think this is how most of us would solve this problem in C#. In fact, this is a common pattern you'll see all over .NET code: create a container for the results, start a loop, pull out the results you need into the new container, then return the new container.

This is in fact such a common programming pattern that Ruby idiomizes it for us and makes it a lot easier. The collect method iterates over your collection for you, and for each element it finds in the collection, it yields to a block that you provide.

Here's the key: each return value from your block is added to a new collection automatically, and this new collection becomes the return value from the collect call:

def get_names(customers)
  customers.collect { |cust| cust.name }
end

Ruby 101 for .NET Developers: .map and .collect (http://www.softiesonrails.com/2007/3/2/ruby-101-for-net-developers-map-and-collect). Retrieved on 2007-03-29 14:44.

Now what's cool is when you start putting different Ruby pieces together. Suppose I only want the names of customers who have purchased more than 1000 widgets. In C#, I'd just put an if statement in there somewhere:

// C# Code
public string[] GetNames(Customer[] customers)
{
    List<string> names = new List<string>();

    foreach (Customer cust in customers)
    {
        if (cust.PurchasedQty > 1000)
        {
            names.Add(cust.Name);
        }
    }

    return names.ToArray();
}

In Ruby, I find it's much better to utilize another Enumerable method, select:

def get_names(customers)
  customers.select { |cust| cust.purchased_qty > 1000 }.collect { |cust| cust.name }
end

Ruby 101 for .NET developers: Multiple assignments made easy (http://www.softiesonrails.com/2007/2/13/ruby-101-for-net-developers-multiple-assignments-made-easy). Retrieved on 2007-03-29 14:44.

Here’s another ruby-ism that wasn’t obvious to me at first. In C#, I might have a class that took several parameters in the constructor, and I would need to save those values in some private member variables. [...] Also, [...] let’s say the name, city, and schedule should be read-only properties of my class.

// C# code 
class Team
{
    private Schedule schedule; // reference to a Schedule object
    private string name;
    private string city;

    public Team(string name, string city, Schedule schedule)
    {
        this.name = name;
        this.city = city;
        this.schedule = schedule;
    }

    public string Name
    {
        get { return name; }
    }

    public string City
    {
        get { return city; }
    }

    public Schedule Schedule
    {
        get { return schedule; }
    }
}

I’ve done this thousands of times, and it seemed easy enough to me that I never thought about how it could ever be any better. But actually I was just a mouse who’d been trained to run through a maze if I wanted some cheese. Look at the Ruby equivalent:

class Team

  attr_reader :name, :city, :schedule

  def initialize(name, city, schedule)
    @name, @city, @schedule = name, city, schedule
  end

end

Here I’ve taken advantage of Ruby’s multiple assignment feature to quickly initialize everything.


Blogs/articles/etc.

Ruby blogs

 


Articles

http://www.vanderburg.org/Speaking/Stuff/oscon05.pdf Metaprogramming Ruby

http://www.rubyinside.com/19-rails-tricks-most-rails-coders-dont-know-131.html

How do I create a CSV file in Rails and then return it as a file to the client?

CSV::Writer and send_data

http://blog.teksol.info/articles/2006/03/23/returning-csv-data-to-the-browser

DRb

"detach" application

http://raa.ruby-lang.org/project/detach/

require 'detach'
u = Universe.new.detach
s = u.size # Normally this call would block for a very long time; with detach you can instead ask whether the result is ready yet
s.ready? # false
sleep VERY_LONG_TIME
s.ready? # true
s.to_s # "about 10 billion light years in radius"

irb

How to make irb prettier

http://svn.tylerrick.com/shell/config/irbrc

#!/bin/env ruby

require 'pp'  # Pretty printer (mostly useful for printing out big objects or hashes, like IRB.conf...)
require 'irb/completion'


IRB.conf[:IRB_RC] = proc do |conf|
  leader = " " * conf.irb_name.length
  conf.prompt_i = "#{conf.irb_name} -> "

  # The prompt for a continuing statement
  conf.prompt_c = leader + '    '

  # The prompt for a continuing string
  conf.prompt_s = leader + '  " '

  conf.return_format = leader + " => %s\n\n"

  puts "Welcome!"
end


# "aliases"/commands
# (Sort of like defining an alias in bash...)
def ri(*names)
  #names.map {|name| name.to_s}
  system("ri " + names[0].to_s)
end

Documentation? See Pickaxe book.



To do

This page is getting out of control and should probably be broken up into pieces sometime soon.

  • Upgrade page to a Category
  • Make subcategories
    • move major sections into their own category pages (object-orientedness, reflection, strings, examples, etc.)
  • a page can be in more than one subcategory -- yay! further split category pages and add multiple inheritance
  • Decide what's so different about Questions and Cheat sheet/reference ... possibly consolidate
  • Import ~/dev/tyler/examples/ruby/ onto this page.
Retrieved from "http://whynotwiki.com/Ruby"