Ruby / Syntax
From WhyNotWiki
[edit] Rescue statement modifier within method call
irb -> p(Dir.entries(d).grep(/erb/) rescue)
SyntaxError: compile error
(irb):7: syntax error, unexpected kRESCUE_MOD, expecting ')'
p(Dir.entries(d).grep(/erb/) rescue)
^
from (irb):7
To get around this, you'll need to throw on one more set of parentheses:
irb -> p((Dir.entries(d).grep(/erb/) rescue []))
[]
=> nil
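The same workaround can be sketched outside of irb. This is only an illustration: safe_entries is a hypothetical helper name and "/no/such/dir" is a made-up path chosen so that Dir.entries raises.

```ruby
# Hypothetical helper: list directory entries matching a pattern,
# falling back to [] if the directory cannot be read.
def safe_entries(dir, pattern)
  # As a statement of its own, the rescue modifier needs no extra parentheses.
  Dir.entries(dir).grep(pattern) rescue []
end

# Inside an argument list, the rescue modifier needs its own parentheses.
p((Dir.entries("/no/such/dir").grep(/erb/) rescue []))
p(safe_entries("/no/such/dir", /erb/))
```

Keep in mind that the rescue modifier swallows every StandardError, not just a missing directory.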
[edit] Multi-line comments
=begin
Your comment here
=end
Useful for commenting out a block of code.
You can also tell it what type of comment. The following special types are ones that I know of so far:
- =begin rdoc -- begins an RDoc block
- =begin test -- begins an inline test that will be read by Reap's test extraction tool
[edit] branching structures (if, case, ...) can be used inline in expressions even!
irb -> 'Is it true? ' + if false then 'yes' else 'no' end
=> "Is it true? no"
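Here is a small sketch of the same idea with case as well as if. size_label is a hypothetical helper and the ranges are arbitrary; the only point is that both constructs return a value.

```ruby
# Hypothetical helper: `case` is an expression, so its value can be assigned.
def size_label(n)
  label = case n
          when 0 then "none"
          when 1..9 then "few"
          else "many"
          end
  "Size: " + label
end

# `if` is an expression too; the parentheses just make the precedence obvious.
answer = 'Is it true? ' + (if false then 'yes' else 'no' end)
puts answer
puts size_label(5)
```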
[edit] statement/expression "modifiers" (comes after expression)
[edit] the "if" expression modifier
This can apply to a single statement:
irb -> puts 'statement' if false
=> nil
or to a block (of any size):
irb -> begin; puts 'hi'; puts 'bye'; end if false
=> nil
irb -> begin; puts 'hi'; puts 'bye'; end if true
hi
bye
=> nil
You can also have the if clause apply to a section of code by enclosing in parentheses:
irb -> (puts 'First statement'; puts 'Second statement') if false
=> nil
irb -> puts 'First statement'; puts 'Second statement' if false
First statement
=> nil
[edit] Order of evaluation of statement modifiers
Example 0:
These don't work:
irb -> puts message if message = 'hi'
NameError: undefined local variable or method `message' for main:Object
from (irb):1
FileUtils.rm(file) if File.exist?(file = '/path/to/file')
Example 1: This works:
diff_output = Subversion.diff(external.container_dir)
puts diff_output unless diff_output.blank?
but you can't (unfortunately) condense that into one statement. In other words, this doesn't work as desired:
puts (diff_output = Subversion.diff(external.container_dir)) unless diff_output.blank?
See:
irb -> require 'rubygems'; require 'active_support'
irb -> puts (diff_output = "some long diff output") unless diff_output.blank?
=> nil
That's because when the Ruby parser hits "diff_output =", even though it doesn't evaluate the assignment at that point, it still defines diff_output as a local variable with the value nil. Next, diff_output.blank? is evaluated and finds diff_output to still be nil, and nil.blank? is true. So the modifier evaluates to unless true, and the statement it is guarding (the one that would actually perform diff_output = "some long diff output") never gets evaluated. Kind of unintuitive and disappointing at times, but not too bad. We just have to write this as two lines instead of one in this case.
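The effect can be reproduced without ActiveSupport. In this sketch nil? stands in for blank? (an assumption made so that it runs with no gems), and both method names are hypothetical.

```ruby
# One-statement version: the modifier is evaluated first, while diff_output
# is already a local variable (the parser has seen the assignment) but is nil.
def one_statement_version
  shown = []
  shown << (diff_output = "some long diff output") unless diff_output.nil?
  [shown, diff_output]
end

# Two-statement version: the assignment runs before the condition is checked.
def two_statement_version
  shown = []
  diff_output = "some long diff output"
  shown << diff_output unless diff_output.empty?
  shown
end

p one_statement_version
p two_statement_version
```

In the first method the guarded statement never runs, so nothing is collected and diff_output stays nil.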
Example 2: validating arguments
Suppose you have a multiple-choice "option" named how and you want to validate that the caller gave it one of the valid options. If it's not one of the valid options, you want to print out a nice error message that lists what the valid options are. The setup code is:
options = args.last.is_a?(Hash) ? args.last : {}
how = options.delete(:how) || :capture
Now how would we validate the how option?
raise ArgumentError.new(":how option must be one of #{valid_options.inspect}") unless (valid_options = [:capture, :exec, :popen]).include? how
That actually works, because the unless clause (which gets executed first) sets up the array for the "then" clause...
irb -> how = :vapidly
=> :vapidly
irb -> raise ArgumentError.new(":how option must be one of #{valid_options.inspect}") unless (valid_options = [:capture, :exec, :popen]).include? how
ArgumentError: :how option must be one of [:capture, :exec, :popen]
from (irb):6
from :0
irb -> how = :capture
=> :capture
irb -> raise ArgumentError.new(":how option must be one of #{valid_options.inspect}") unless (valid_options = [:capture, :exec, :popen]).include? how
=> nil
We could have also written it like this:
unless (valid_options = [:capture, :exec, :popen]).include? how
raise ArgumentError.new(":how option must be one of #{valid_options.inspect}")
end
Example 3
Suppose you want to print the results of some method call, but only if the results are not ""...
It feels like you should be able to do it with one line, like this...
puts output = obj.some_method() unless output == ""
but that doesn't work.
We're forced to break it into 2 lines.
output = obj.some_method()
puts output unless output == ""
In this case, if all you're worried about is getting blank lines when the method returns a blank line, you could use print instead.
print output = obj.some_method() unless output == ""
But that doesn't solve the more general problem, which is that an if/unless statement modifier (that comes after the statement) can never check and respond to a value that exists only after running the statement to which the modifier applies. This is because the if/unless statement modifier is evaluated before the statement it is guarding.
It looks to me like the only way to do what I'm trying to do with a single statement is to use the "regular" old if/then/else statement inline in my puts statement.
irb -> unless (a = "") == "" then puts a else 'doing nothing' end
=> "doing nothing"
irb -> unless (a = "not an empty string") == "" then puts a else 'doing nothing' end
not an empty string
=> nil
So, applying that to my more specific example:
unless (output = obj.some_method()) == "" then puts output else end
Not as concise or readable as I'd hoped, but it does get the job done in one line.
[edit] "rescue" statement modifier / how to execute multiple statements as if they were one
Be sure you realize that only the statement immediately following the rescue belongs to the rescue clause (will only be executed if an exception is raised). Any statements after that will be executed unconditionally.
To demonstrate:
irb -> "Something that doesn't raise an exception." rescue p 'something'; p 'else'
"else"
=> nil
If you need to execute multiple statements as if they were one, just throw a begin/end around them! This would be the correct way to do what the above example was trying to do:
irb -> "Something that doesn't raise an exception." rescue begin; p 'something'; p 'else'; end
=> "Something that doesn't raise an exception."
irb -> raise "An exception" rescue begin; p 'something'; p 'else'; end
"something"
"else"
=> nil
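Both behaviors can be put side by side in one method. This is a sketch only; rescue_scope_demo and the symbols it records are made up for the illustration.

```ruby
def rescue_scope_demo
  log = []
  # Only `log << :handler` belongs to the rescue modifier;
  # `log << :always` is a separate statement and runs unconditionally.
  "no exception here" rescue log << :handler; log << :always
  # begin/end turns both statements into the single rescue expression.
  raise "boom" rescue begin; log << :first; log << :second; end
  log
end

p rescue_scope_demo
```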
[edit] ||= (conditional initialization operator)
I love this operator. It lets you initialize a variable to a default only if the variable isn't already set.
My only complaint about it is that it doesn't work for booleans. (It also sometimes doesn't work how I'd like for hashes, arrays, as I'll explain in a bit.)
Ruby 101 for .NET developers: The strange ||= operator (http://www.softiesonrails.com/2007/2/6/ruby-101-for-net-developers-the-strange-OR-operator).
Sometimes in Ruby you'll come across some code that looks like this:
stuff ||= [ ]
or maybe
stuff ||= { }
What's going on here? First, if you're new to Ruby, you may not realize how the || operator really works, so let's digress for a minute and talk about that first. The || operator is a short-circuited logical OR operator. If I say
a = "hello"
b = a || "goodbye"
What is the value of b? It will be "hello". Since a evaluated to something that's not nil and not false, it stopped right there (short circuited itself) and assigned b a reference to a. But suppose a was nil instead:
a = nil
b = a || "goodbye"
Now, b will become "goodbye". Since a was nil, the OR operator continued to evaluate the next expression, which was "goodbye". So b becomes "goodbye".
If you're with me so far, then you're already rounding third base. Now, you probably already know that this:
a = a + 5
can be shortened to this:
a += 5
right? In Ruby, this same idea gets applied to the OR operator. Instead of writing this:
a = a || "baseball"
This means: if a already has a value, then keep it; but if it's nil, then assign it "baseball". But horrors, what a lot of typing! A good Ruby programmer would do this instead and save two whole keystrokes:
a ||= "baseball"
In other words, a will become "baseball" if it was nil (or false) before, otherwise it will just keep its original value.
[edit] [caveats (category)] Doesn't work for booleans
irb -> def foo(a) a ||= true; a end
=> nil
irb -> foo(false)
=> true
Sets a to true (the default value) even though a non-default value (false) was passed in! The consequences of not understanding this important point could be tragic! It's the difference between true and false.
Workaround:
irb -> def foo(a) a ||= true if a.nil?; a end
=> nil
irb -> foo(false)
=> false
More verbose, but it gives the (IMHO) desired behavior.
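A slightly tidier sketch of the same workaround uses a plain assignment guarded by nil?, with the ||= version alongside for comparison. Both method names are hypothetical.

```ruby
# Applies the default only when the argument is nil, so false survives.
def with_default(a = nil)
  a = true if a.nil?
  a
end

# For comparison: ||= replaces both nil and false.
def with_or_equals(a = nil)
  a ||= true
  a
end

p with_default(false)
p with_or_equals(false)
```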
Better solution? Well, we can't just make up a new operator in Ruby, unfortunately, but we might be able to solve it with a new method...
irb -> class Object; def default(default) self = default if self.nil? end; end
SyntaxError: compile error
(irb):33: Can't change the value of self
class Object; def default(default) self = default if self.nil? end; end
Ohp! Never mind. I guess Ruby doesn't like that. I bet it woulda worked though...
irb -> class Object; def default(default) puts "Setting to default value of #{default}" if self.nil? end; end
irb -> foo(false)
=> nil
irb -> foo()
Setting to default value of true
=> nil
[edit] What if you want a [] variable to be treated as nil (uninitialized) value and to set it to a default if it is either nil or []?
irb -> a = []
=> []
irb -> (!a.empty?)
=> false
irb -> (!a.empty? || a)
=> []
irb -> !a.empty? || (a = ['default', 'array'])
=> ["default", "array"]
irb -> a
=> ["default", "array"]
irb -> a = []
=> []
irb -> a = ['default', 'array'] if a.empty?
=> ["default", "array"]
That works!
But it wouldn't work so well if a were nil.
irb -> a = nil
=> nil
irb -> !a.empty? || (a = ['default', 'array'])
NoMethodError: undefined method `empty?' for nil:NilClass
from (irb):14
Unless we did this:
irb -> class NilClass; def empty?; true; end; end
=> nil
irb -> !a.empty? || (a = ['default', 'array'])
=> ["default", "array"]
or this:
irb -> a = nil
=> nil
irb -> a && !a.empty? || (a = ['default', 'array'])
=> ["default", "array"]
or this:
irb -> a = nil
=> nil
irb -> a = ['default', 'array'] unless a && !a.empty?
=> ["default", "array"]
or this:
irb -> a = nil
irb -> a = ['default', 'array'] if a.nil? || (a && a.empty?)
=> ["default", "array"]
irb -> a = []
irb -> a = ['default', 'array'] if a.nil? || (a && a.empty?)
=> ["default", "array"]
How about the case where it is not nil or []?
irb -> a = ['existing', 'array']
=> ["existing", "array"]
irb -> a = ['default', 'array'] if a.nil? || (a && a.empty?)
=> nil
irb -> a
=> ["existing", "array"]
Cool, that works. Except that the assignment line returns nil, which may not be what we want. Maybe we'd rather the assignment line returned a ... ?
It'd be kind of neat if we could do this Python-esque (?) syntax...
irb -> (a = ['default', 'array'] if a.empty? else a)
SyntaxError: compile error
(irb):10: syntax error, unexpected kELSE, expecting ')'
(a = ['default', 'array'] if a.empty? else a)
^
... but, that's not valid in Ruby.
Once again assuming this...
irb -> class NilClass; def empty?; true; end; end
=> nil
Let's see if we can get the assignment line to return a.
irb -> a = []
=> []
irb -> ((a = ['default', 'array'] if a.empty?) || a).each {|e| puts e}
default
array
=> ["default", "array"]
Good, it used the default!
irb -> a = ['existing', 'array']
=> ["existing", "array"]
irb -> ((a = ['default', 'array'] if a.empty?) || a).each {|e| puts e}
existing
array
=> ["existing", "array"]
Good, it used the existing value of a (rather than overriding with a default)!
When would you want to do that? I don't know.
Here's the situation that originally motivated me to solve that problem: I wanted to use the value of ARGV (which I expected to be a list of directories), unless ARGV was empty ([]), in which case I just wanted by default to use the current directory (['.']). I didn't have any control of ARGV; it's always an array ([] if there are no args), so I couldn't just use ||=, because that only works if it is nil.
But then I realized that ARGV was a constant and I couldn't change its values by doing something like ||= anyway. This is what I ended up doing:
((['.'] if ARGV.empty?) || ARGV).each do |dir|
...
end
and then changed to:
(if ARGV.empty? then ['.'] else ARGV end).each do |dir|
...
end
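If this comes up often, the check can be wrapped in a small helper that does not need NilClass to be reopened. A minimal sketch, assuming a hypothetical helper named or_default:

```ruby
# Returns default when value is nil or empty, otherwise value itself.
def or_default(value, default)
  (value.nil? || value.empty?) ? default : value
end

# ARGV is never reassigned; the loop simply iterates over the chosen array.
or_default(ARGV, ['.']).each do |dir|
  puts dir
end
```

The helper returns the chosen array, so the result can be chained or iterated directly.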
[edit] [caveats (category)] Can't do (var ||= "") += "string"
Why would I want to do that? So that I don't have to write it as two lines, of course! Compactness.
The problem, you see, is that you can't call + on nil variables (a default behavior of NilClass that I prefer to change)
irb -> string += 'some thing'
NoMethodError: undefined method `+' for nil:NilClass
from (irb):1
I could always do this:
string ||= ""
string += "string"
But that's sort of a pain. I sort of wish I could do this:
irb -> (string ||= "") += 'some thing'
SyntaxError: compile error
(irb):2: syntax error, unexpected tOP_ASGN, expecting $end
(string ||= "") += 'some thing'
^
from (irb):2
The reason it doesn't like that, of course, is that the l-value of an assignment (which += is) must be a variable, not an expression. I sort of think it should be able to use the variable mentioned in the expression (string), but oh well.
This works, however:
irb -> string += 'this' if string ||= ''
=> "this"
irb -> string += ' works' if string ||= ''
=> "this works"
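Another one-line option is to append with << instead of +=. Since << is a method call rather than an assignment, its receiver may be any expression, including (string ||= ...). This is a sketch under that observation; append_demo is a hypothetical name, and "".dup is used only to guarantee an unfrozen empty string.

```ruby
def append_demo
  string = nil
  # << is an ordinary method call, so a parenthesized expression is a valid receiver.
  (string ||= "".dup) << 'this'
  (string ||= "".dup) << ' works'
  string
end

p append_demo
```

Unlike +=, << modifies the existing String object in place instead of creating a new one.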
[edit] [Parallel assignment (category)]
See also: Pickaxe p. 340 (the rules), Pickaxe p. 90 (examples)
[edit] Why would you want to use it?
- it's concise!
- you can use it to swap variables (a, b = b, a)
[edit] lvalues, ... = *rvalue
"The rvalue is replaced with the elements of the array, with each element forming its own rvalue."
irb -> a, b, c = *[1, 2, 3]; "a=#{a}, b=#{b}, c=#{c}"
=> "a=1, b=2, c=3"
[edit] lvalues, ... = array
irb -> array = [1, 2, 3]
irb -> a, b, c = array; "a=#{a}, b=#{b}, c=#{c}"
=> "a=1, b=2, c=3"
[edit] If there are more lvalues than rvalues...
... the excess will be assigned the value nil.
irb -> a, b, c = 1; "a=#{a.inspect}, b=#{b.inspect}, c=#{c.inspect}"
=> "a=1, b=nil, c=nil"
[edit] If the last lvalue is prefixed with a *: a, ..., *catch_all = ...
(without the *)
irb -> a, b, catch_all = 1, 2, 3, 4, 5; "a=#{a}, b=#{b}, catch_all=#{catch_all.inspect}"
=> "a=1, b=2, catch_all=3"
but (with the *):
irb -> a, b, *catch_all = 1, 2, 3, 4, 5; "a=#{a}, b=#{b}, catch_all=#{catch_all.inspect}"
=> "a=1, b=2, catch_all=[3, 4, 5]"
It also works when there is only one lvalue and it is prefixed with a *.
irb -> *vars = 1, 2, 3; vars
=> [1, 2, 3]
More examples...
irb -> array = [3, 4, 5]
irb -> a, b, not_really_a_catch_all = 1, 2, array; "a=#{a}, b=#{b}, not_really_a_catch_all=#{not_really_a_catch_all.inspect}"
=> "a=1, b=2, not_really_a_catch_all=[3, 4, 5]"
# Simply assigns the value of array to not_really_a_catch_all
irb -> a, b, *catch_all = 1, 2, *array; "a=#{a}, b=#{b}, catch_all=#{catch_all.inspect}"
=> "a=1, b=2, catch_all=[3, 4, 5]"
# Has the same end result, but uses a catch-all and a splat (*array).
irb -> a, b, *catch_all = 1, 2, array; "a=#{a}, b=#{b}, catch_all=#{catch_all.inspect}"
=> "a=1, b=2, catch_all=[[3, 4, 5]]"
# Probably not what you want.
irb -> a, b, catch_all = 1, 2, *array; "a=#{a}, b=#{b}, catch_all=#{catch_all.inspect}"
=> "a=1, b=2, catch_all=3"
# The splat was too big to catch! The excess from the splat (4, 5) just ended up being ignored.
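The catch-all and the swap can also be sketched as small methods, which makes the results easy to check. head_and_rest and swap_pair are hypothetical names used only for this illustration.

```ruby
# Splits a list into its first element and a catch-all for everything else.
def head_and_rest(list)
  first, *rest = list
  [first, rest]
end

# Swaps two values with a single parallel assignment.
def swap_pair(a, b)
  a, b = b, a
  [a, b]
end

p head_and_rest([1, 2, 3])
p head_and_rest([])
p swap_pair(1, 2)
```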
[edit] They are performed in parallel (effectively), so swapping is possible
irb -> a, b = 1, 2
=> [1, 2]
irb -> a, b = b, a; "a=#{a}, b=#{b}"
=> "a=2, b=1"
[edit] But they are also performed in order, so if an rvalue has side-effects, it may affect the next rvalue
irb -> a, b, c = (x = 0), (x += 1), (x += 1)
=> [0, 1, 2]
irb -> x
=> 2
[edit] The return value of the multiple assignment is an array
irb -> return_val = (a, b, c = 1, 2, 3)
=> [1, 2, 3]
[edit] [Example of * operator with an Array (category)]
Either of these works:
irb -> a, b, c = *['something']*3
=> ["something", "something", "something"]
irb -> "a=#{a}, b=#{b}, c=#{c}"
=> "a=something, b=something, c=something"
irb -> a, b, c = ['something']*3
=> ["something", "something", "something"]
irb -> "a=#{a}, b=#{b}, c=#{c}"
=> "a=something, b=something, c=something"
[edit] [values, not references (category)]
This should be self-evident to the astute Rubyist, but I've forgotten it at least once so I'll just give this reminder for my own sake then.
irb -> a, b, c = 1, 2, 3
=> [1, 2, 3]
irb -> vars = [a, b, c]
=> [1, 2, 3]
# vars now contains a reference to a, b, and c, right? Wrong.
irb -> *vars = 4, 5, 6; "a=#{a}, b=#{b}, c=#{c}"
=> "a=1, b=2, c=3"
irb -> vars
=> [4, 5, 6]
a, b, and c are not changed at all by this. Only vars (the actual lvalue) is changed.
[edit] Are variables references, pointers, or what?? / Do things get passed/returned by value or by reference?
This concept/question is actually very important for every serious Ruby developer to understand. I know it's sure gotten me very confused before.
The short version is [may not be 100% technically accurate, but conceptually it works to describe what's going on]:
- variables are really just pointers to objects
- most operations you do to a variable (var.operate!) actually end up changing the object to which it points
- variable assignment (var =) is different: it points the variable to a different object; it does not modify the original object!
You can tell what object a variable "points to" by inspecting its object_id. Each distinct object has a unique object_id. It is possible for more than one variable to point to the same object (and thus have the same object_id).
If we were to compare Ruby with C++, Ruby's variables are a lot like by-reference in C++: you can't just change what they point to with as much freedom as you can with C++ pointers, but -- much like by-reference argument passing in C++ -- the variable you pass in to a method can be modified by the function. Likewise, if a method returns a variable, that variable can be modified by the caller. Don't you forget that.
[edit] { Some examples using strings
These examples illustrate that method calls on a variable do not change the "object_id" of that variable. In other words, after calling these methods, the variable still points to the same object.
irb -> a = "a"; a.object_id
=> -604332868
irb -> a.replace("b"); a.object_id
=> -604332868
irb -> a.concat "b"; a.object_id
=> -604332868
irb -> a.upcase!; a.object_id
=> -604332868
See how the object_id of a was -604332868 in all cases?
Now here are some examples to illustrate that using = on a variable to do an assignment will usually cause the variable to point to a new/different object.
I said usually, because sometimes an assignment will not change the target of the variable, like when you assign a variable to itself, or pass the variable to a method that returns the object.
irb -> a = "a"; a.object_id
=> -604365948
irb -> a = a; a.object_id
=> -604365948
irb -> def return_arg(arg) arg end; a = return_arg(a); a.object_id
=> -604365948
But usually an assignment results in a new object being created. Observe:
irb -> a = "a"; a.object_id
=> -604212148
# That's what we start with.
irb -> a = "b"; a.object_id
=> -604251578
# That's really just a shorter way of saying:
irb -> a = String.new("b"); a.object_id
=> -604265298
# Each string literal actually constructs a new object, even if you *spell* the string exactly the same. Symbols do not have this behavior, which is partly what makes them so cool...
irb -> print "a".object_id, " ", "a".object_id
-604361848 -604361868 => nil # (different objects)
irb -> print :"a".object_id, " ", :"a".object_id
164578 164578 => nil # (same object...every time!)
irb -> a = "a" + ""; a.object_id
=> -604235568
# Even though the resulting string *looks* the same ("a") -- the string is still "spelled" the same before and after -- the + operator actually (unconditionally) results in a new object being constructed.
irb -> a += "b"; a.object_id
=> -604280198
# You might *think* that this would just add "b" to the end of the existing object (to which a points). It doesn't though; it creates a new object. Use a.concat("b") if you want it to *modify* a's current object rather than creating a new object.
# a += "b" is just short for this (and we've already observed that String#+ *always* creates a new String object):
irb -> a = a + "b"; a.object_id
=> -604318858
irb -> def return_arg(arg) arg + "" end; a = return_arg(a); a.object_id
=> -604228528
Did you see how every one of those operations showed a to have a different object_id? That's because each one of them actually created a new String object and associated the variable a with that new String object.
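Because object_id values differ from run to run, the same observation can be sketched with equal?, which tests whether two references point to the very same object. identity_demo is a hypothetical name.

```ruby
def identity_demo
  a = "a".dup                        # an unfrozen String
  original = a                       # second reference to the same object
  a.concat("b")                      # mutates the object in place
  after_concat = a.equal?(original)
  a += "c"                           # builds a new String and re-points a
  after_plus = a.equal?(original)
  [after_concat, after_plus, original, a]
end

p identity_demo
```

After concat both names still refer to one object; after += the variable a refers to a new one, and original keeps the old contents.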
The docs [1] confirm what we have already observed about the + operator:
str + other_str => new_str
Concatenation: returns a new String containing other_str concatenated to str.
"Hello from " + self.to_s #=> "Hello from main"
[edit] } Some examples using strings
[edit] Are variables passed/returned by value or reference?
You've already seen one example where they are both passed and returned by reference.
irb -> a = "a"; a.object_id
=> -604250678
irb -> def return_arg(arg) arg end; a = return_arg(a); a.object_id
=> -604250678
(the exact same object that is passed ends up being returned)
So I guess that answers your question, doesn't it. The rest of my examples will just further illustrate that, but the concept is really easy to remember: yes, everything is by reference [if I understand correctly], in the sense that methods receive and return references to objects rather than copies of them.
So even though the following returns a different object than is passed in, the object that the caller gets back is the same object that existed when we called return from within the method.
irb -> def return_arg(arg) return arg + "" end; a = return_arg(a); a.object_id
=> -604347018
This can be illustrated even more clearly by putting a puts inside of the method...
irb -> def return_arg(arg)
new_str = arg + "";
puts new_str.object_id;
new_str
end;
a = return_arg(a); a.object_id
-604280118
=> -604280118
[edit] [Caveats (category)] Attribute reader methods can actually be used to modify the object's instance variables
You may think your object's private data is safe just because you've provided an attr_reader and not an attr_writer. This is not necessarily the case, however, so you need to be careful!
In particular, an attr_reader method will return the attribute (instance variable) by reference, so the caller can modify the instance variable all she likes. This may disturb some folks, because it seems like the reader should be read-only.
If this were C++, we'd even declare the reader as const. But you don't have quite that degree of protection of your object's internal data as you do in C++, unfortunately. In fact, there's a way to get around practically every encapsulation "convention" provided by Ruby; they're really just "conventions", not rules that are enforced by Ruby (use obj.send(:private_method_A), for instance, to call private_method_A as if it were a public method). [To do: move to its own section].
class A
attr_reader :options
def initialize
@options = []
end
end
a = A.new
puts a.options.object_id # to show that the a.options() method returns the variable it is reading *by reference*...
a.options() << 'flum1' # and with this "reference" to A's private @options data, we can do whatever we'd like!
a.options() << 'flum2'
p a.options
-604226398
["flum1", "flum2"]
Illustration of how to create a "reference" to an object (so you have 2 variables pointing to the same object)...
my_ref = a.options
p my_ref.object_id # to show that my_ref *points* to the same object as a
my_ref.replace ['zroo'] # we operate on my_ref, but end up changing the object that a points to (since they point to the same object)
p a.options
-604226398
["zroo"]
Also note (yet again) that this trick only works if the return value is an actual variable rather than an "expression"... If you were to implement your attribute reader like this...
def options
@options | []
end
...instead of like this...
def options
@options
end
...then you would effectively make it read-only! a.options().replace, a.options() << element would modify some object, certainly, but it could never be used to modify @options. (Take that, you people-who-try-to-directly-access-my-object's-instance-variables-rather-than-going-through-the-interface-I've-provided!)
Also note how in the following example...
class A
def initialize
@options = []
end
def options
@options | []
end
end
a = A.new
puts a.options.object_id
puts a.options.object_id
puts a.options.object_id
a.options() << 'flum2'
p a.options
...each call to a.options returns a different object. (Has that fact sunk into your mind yet??)
-604404818
-604404858
-604404898
[]
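A more explicit way to get the same protection is to return a copy with dup. This swaps dup in for the @options | [] trick (which also happens to remove duplicate elements). Settings and add_option are hypothetical names for this sketch.

```ruby
class Settings
  def initialize
    @options = []
  end

  # Returns a shallow copy, so callers cannot mutate @options through the reader.
  def options
    @options.dup
  end

  # The only sanctioned way to change @options.
  def add_option(name)
    @options << name
    self
  end
end

settings = Settings.new
settings.options << 'flum1'   # modifies a throwaway copy only
settings.add_option('real')
p settings.options
```

Note that dup is shallow: the elements themselves are still shared, so a caller could still mutate an individual String inside the array.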
[edit] How to make a new variable "reference" an existing object ([Examples (category)] of the difference between = and replace())
Simplest possible example:
irb -> a = "a"; a.object_id
=> -604430038
irb -> my_ref = a; my_ref.object_id
=> -604430038
irb -> my_ref.replace("c")
=> "c"
irb -> a
=> "c"
Example that shows that arrays maintain "references" (rather than just the values) to their member objects too...
irb -> a = "a"; b = "b"; a.object_id
=> -604430038
irb -> array = [a, b]
=> ["a", "b"]
irb -> array[0].object_id
=> -604430038
irb -> my_ref = array[0]
=> "a"
irb -> my_ref = array[0]; my_ref.object_id
=> -604430038
irb -> my_ref = "c"; my_ref.object_id
=> -604549458
irb -> my_ref = array[0]; my_ref.object_id
=> -604430038
irb -> my_ref.replace("c"); my_ref.object_id
=> -604430038
irb -> a
=> "c"
[edit] ([Examples (category)] of the difference between = and replace())
irb -> def foo(input)
puts $a.object_id
puts input[0].object_id
$a = 'b'
puts $a.object_id
puts input[0].object_id
puts input[0]
end
=> nil
irb -> $a, $b = 'a', 'b'
=> ["a", "b"]
irb -> foo([$a, $b])
-604472168 # (the original object (containing 'a'))
-604472168 # (the original object (containing 'a'))
-604486558 # $a was changed to point to the object containing 'b'
-604472168 # but input[0] still points to the original object (containing 'a')
a # input[0] still points to the original object (containing 'a')
=> nil
irb -> $a
=> "b" # $a was changed to point to the object containing 'b'
[edit] [Problems (category)] Why I wish Ruby had proper pointers (even if every object did have a replace() method, that still doesn't let you change types) ([Examples (category)] of the difference between = and replace())
Sometimes you want to have your method accept an array of objects and then modify those objects and have your changes visible outside of the method (as opposed to just modifying local variables).
The following example using Strings illustrates how this is possible with classes that implement the replace() method.
require 'stringio'
def change_vars!(vars)
puts "Before:"
puts vars.map { |v| v.inspect }.join(", ")
vars.each do |v|
v.replace 'new'
end
puts "After:"
puts vars.map { |v| v.inspect }.join(", ")
end
$a, $b = ['old']*2
change_vars! [$a, $b]
Before:
"old", "old"
After:
"new", "new"
Note, however, that replace() can't be used to change the type of the variable.
require 'stringio'
def change_vars!(vars)
puts "Before:"
puts vars.map { |v| v.inspect }.join(", ")
vars.each do |v|
v.replace StringIO.new
end
puts "After:"
puts vars.map { |v| v.inspect }.join(", ")
end
$a, $b = ['old']*2
change_vars! [$a, $b]
Before:
"old", "old"
...:in `replace': can't convert StringIO into String (TypeError)
from temp.rb:11:in `change_vars!'
from temp.rb:10:in `each'
from temp.rb:10:in `change_vars!'
from temp.rb:19
Those were contrived examples. Here's what I really want to be able to do:
- pass an array of objects to capture_output; these objects will be of type IO
- in capture_output, I want to be able to use an iterator and change every object in the array that was passed in to be a different object (a new object, of type StringIO)
require 'stringio'
def capture_output(vars = [$stdout], &block)
puts "Before:"
puts vars.map { |v| v.inspect }.join(", ")
vars.each do |v|
v = StringIO.new
end
puts "After:"
puts vars.map { |v| v.inspect }.join(", ")
end
capture_output [$stdout, $stderr]
This proves to be impossible to do the way I had imagined doing it for the following reasons:
- Even though v is a "reference" to $stdout, when I do v = StringIO.new, it breaks the reference and causes v (a local variable, by the way) to be a "reference" to the newly constructed StringIO object. $stdout is unchanged.
- I would just use IO#replace, but that (1) doesn't exist, and (2) even if it did exist, all it could do would be to replace itself with another object of the same type, not with a different type (unless we did some Ruby/DL magic, but let's not go there...).
This output confirms that $stdout escapes our efforts to change it:
Before:
#<IO:0xb7f18030>, #<IO:0xb7f1801c>
After:
#<IO:0xb7f18030>, #<IO:0xb7f1801c>
Any other bright ideas?
Kernel has a global_variables method, but it just returns an array containing the names of all the global variables; it doesn't help you to modify those variables (like $GLOBALS does in PHP). I was hoping for a global_variable_set method like there is a class_variable_set and instance_variable_set.
You can't use eval() as an lvalue, or you might be able to do something like eval(global_name) = new_value.
If you are able to enumerate ahead of time all possible variables that might be passed in and write a special case for each of those variables, then the following solution "works"...
require 'stringio'
def capture_output(vars = [$stdout], &block)
puts "Before:"
puts vars.map { |v| v.inspect }.join(", ")
vars.each do |v|
case v
when $stdout
$stdout = StringIO.new
when $stderr
$stderr = StringIO.new
end
end
puts "After:"
puts vars.map { |v| v.inspect }.join(", ")
end
capture_output [$stdout, $stderr]
(this has the desired effect)
That's a huge "if", however. I want a solution that works for any variables that may be passed in -- not just input that matches one of the pre-defined allowed variables. I want a generic solution that works for any variables, and any number of variables.
Even if that requisite condition is satisfied, I'm certainly not satisfied with that solution -- all that duplication is uuuugly.
The following "solution" also "works":
require 'stringio'
def capture_output(vars = [$stdout], &block)
puts "Before:"
puts vars.map { |v| eval(v).inspect }.join(", ")
vars.each do |v|
eval(v + " = StringIO.new")
end
puts "After:"
puts vars.map { |v| eval(v).inspect }.join(", ")
end
capture_output ["$stdout", "$stderr"]
But it is just as ugly. evaling strings containing variable names just to set a variable? That is stooping to the utter depths of programmer desperation. It's too kludgey. I'd just rather not use eval to get the job done, if you know what I mean.
But that's the best I've come up with. Please inform me of a better solution!
Proposed to Ruby language: Either of the following would be satisfactory to me:
- Add a standard Pointer class to the language
- Add a built-in Kernel#global_variable_set method.
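In the meantime, something pointer-like can be approximated with a pair of closures, one that reads the variable and one that assigns it. This is only a sketch: the Ref class, its value accessors, and this shape of capture_output are all hypothetical, and each global still has to be named once when its Ref is built.

```ruby
require 'stringio'

# Hypothetical stand-in for a "Pointer": wraps a getter and a setter closure.
class Ref
  def initialize(getter, setter)
    @getter = getter
    @setter = setter
  end

  def value
    @getter.call
  end

  def value=(new_value)
    @setter.call(new_value)
  end
end

# Points every Ref at a fresh StringIO while the block runs,
# then restores the originals and returns the captured strings.
def capture_output(refs)
  originals = refs.map { |ref| ref.value }
  buffers = refs.map { |ref| ref.value = StringIO.new }
  yield
  buffers.map { |buffer| buffer.string }
ensure
  refs.zip(originals) { |ref, original| ref.value = original } if originals
end

stdout_ref = Ref.new(lambda { $stdout }, lambda { |io| $stdout = io })
captured = capture_output([stdout_ref]) { puts "hello" }
p captured
```

This avoids both the case statement and evaling strings, at the cost of writing one getter/setter pair per variable.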
[edit] Can I pass argument by reference so that the method can return more than one return value?
No, not usually. But it shouldn't matter because Ruby provides a better way to return more than one return value.
def analyze_string(input, size)
size = input.length
puts size.object_id
input
end
size = nil
puts size.object_id
puts analyze_string("Some string", size)
puts "And its size was: #{size}"
4
23
Some string
And its size was:
I must have told you a thousand times by now: var = does not change the object pointed to by var; instead, it re-points var to a different object.
"Fine", you say, "we'll just use replace".
"Okay, go for it!" I say, smiling because I know it won't work.
def analyze_string(input, size)
size.replace input.length
puts size.object_id
input
end
size = Fixnum.new
puts size.object_id
puts analyze_string("Some string", size)
puts "And its size was: #{size}"
undefined method `new' for Fixnum:Class (NoMethodError)
How about this?:
...
size = 0
puts size.object_id
puts analyze_string("Some string", size)
puts "And its size was: #{size}"
undefined method `replace' for 0:Fixnum (NoMethodError)
"So we're out of luck then, right?"
"Well, not entirely. We just have to approach the problem from a different angle ... the Ruby way."
def analyze_string(input)
return input[0..0], input.length
end
first_letter, size = analyze_string("Some string")
puts "The first letter was: #{first_letter}"
puts "And its size was: #{size}"
The first letter was: S And its size was: 11
So there you have it: even though you can "pass arguments by reference" and change those objects from within your method some of the time (like when you pass in objects that respond to replace, such as String), it is probably not a good habit to get into, and it is certainly not the preferred way of returning multiple values.
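For the "some of the time" case, here is a minimal sketch of the difference between re-pointing a parameter and mutating the object it refers to. The method names reassign and mutate are made up for this illustration.

```ruby
# Assignment inside the method only re-points the method's own local.
def reassign(str)
  str = "changed"
  nil
end

# String#replace modifies the very object the caller is holding.
def mutate(str)
  str.replace("changed")
  nil
end

text = String.new("Some string")
reassign(text)
puts text # still "Some string"
mutate(text)
puts text # now "changed"
```

Integers have no replace (as the error above shows), which is one more reason to return several values instead.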
[edit] Boolean expressions
[edit] [just for fun (category)] If some common English boolean phrases were translated into Ruby...
def die!; "You're dead." end
def skate; "I'm skating" end
xmp "skate or die!"
def skate; "I'm "; !"skating" end
xmp "skate or die!"
def ticket!; "You just got a ticket!" end
def click_it; "I clicked it" end
xmp "click_it or ticket!"
def click_it; not 'wearing a seat belt' end
xmp "click_it or ticket!"
class << it = Object.new
def will_hurt; "Ouch! Hey, that hurt!" end
end
def litter; "I'm just a-litterin' away!" end
xmp "litter and it.will_hurt"
def litter; "Okay, I'll stop! I will"; not "litter any more." end
xmp "litter and it.will_hurt"
skate or die!
==>"I'm skating"
skate or die!
==>"You're dead."
click_it or ticket!
==>"I clicked it"
click_it or ticket!
==>"You just got a ticket!"
litter and it.will_hurt
==>"Ouch! Hey, that hurt!"
litter and it.will_hurt
==>false
[edit] Creating hashes without enclosing in {/}
A common Ruby idiom for methods is to make the last argument an "options" hash... This allows for some cleaner syntax and more flexibility in your calls...
- you can specify as many options as you'd like, or none at all
- you can specify them in any order
irb -> def m(arg1, arg2, options = {}); p [arg1, arg2, options]; end
=> nil
irb -> m 'arg1', 'arg2', :option1 => true, :option2 => :maybe
["arg1", "arg2", {:option1=>true, :option2=>:maybe}]
=> nil
It only works in certain cases, though: you can only omit the {/} when the hash is the last argument...
irb -> def m(arg1, arg2, options = {}, arg4 = nil); p [arg1, arg2, options, arg4]; end
=> nil
irb -> m 'arg1', 'arg2', :option1 => true, :option2 => :maybe, 'arg4'
SyntaxError: compile error
(irb):19: syntax error, unexpected '\n', expecting tASSOC
from (irb):19
from :0
If you need to pass an argument after your "options" hash, then you need to enclose the hash with {/}...
irb -> m 'arg1', 'arg2', {:option1 => true, :option2 => :maybe}, 'arg4'
["arg1", "arg2", {:option1=>true, :option2=>:maybe}, "arg4"]
=> nil
This more concise syntax without the {/} also works to a limited extent within arrays...
irb -> ['arg1', 'arg2'] + [:option_1 => 'hi']
=> ["arg1", "arg2", {:option_1=>"hi"}]
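In the Ruby 1.8 used here, it seems to work only if the hash is the only element of the array (it fails even when the hash is the last element), which seems kind of arbitrary if you ask me. Ruby 1.9 and later do accept a brace-less hash as the last element of an array literal.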
irb -> ['arg1', 'arg2'] + ['arg3', :option_1 => 'hi']
SyntaxError: compile error
(irb):25: syntax error, unexpected tASSOC, expecting ']'
['arg1', 'arg2'] + ['arg3', :option_1 => 'hi']
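The options-hash idiom described in this section is often paired with a hash of defaults that the caller's options are merged over. This is a sketch of that pairing, not something the examples above require; OPTION_DEFAULTS and its keys are invented for the illustration.

```ruby
# Hypothetical defaults, used only for this example.
OPTION_DEFAULTS = { :option1 => false, :option2 => :no }

def m(arg1, arg2, options = {})
  # Caller-supplied keys win; anything omitted falls back to the default.
  options = OPTION_DEFAULTS.merge(options)
  [arg1, arg2, options]
end

p m('arg1', 'arg2')                      # no options at all
p m('arg1', 'arg2', :option2 => :maybe)  # brace-less trailing hash
```

Because merge returns a new hash, neither the defaults nor the caller's hash is modified.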
[edit] Assigning a method call to a local variable of the same name
irb -> def foo
'foo'
end
Don't do this:
irb -> foo = foo
=> nil
irb -> foo
=> nil
When it sees the foo that you are assigning to foo, it will have already registered the local variable "foo" and initialized it with a value of nil. (Recall that bare names are treated as locals if the local has been initialized already. It is only treated as a method call if there is no local variable by that name!) In other words, it is identical to doing this:
irb -> foo = nil
=> nil
irb -> foo = foo
=> nil
Instead, be sure to use the () symbols so that Ruby knows it is a method call.
irb -> foo = foo()
=> "foo"
irb -> foo
=> "foo"
Or, just assign it to a variable with a different name than your method:
irb -> snoo = foo
=> "foo"
No problem!
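The same shadowing can be reproduced outside irb. The sketch below wraps the three variants in a small class (FooDemo is a made-up name) and also shows an explicit receiver, which is another way to force a method call.

```ruby
class FooDemo
  def foo
    'foo'
  end

  def shadowed
    foo = foo      # the local foo already exists (as nil) when the right side runs
    foo
  end

  def with_parens
    foo = foo()    # parentheses force a method call
    foo
  end

  def with_receiver
    foo = self.foo # an explicit receiver also forces a method call
    foo
  end
end

demo = FooDemo.new
p demo.shadowed
p demo.with_parens
p demo.with_receiver
```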
[edit] Syntax: operator associativity / precedence / order of operations
[edit] Caveat: {/} have higher precedence than do/end !
[edit] From Pickaxe
Pickaxe, p. 168
def one(arg)
if block_given?
"block given to 'one' returns #{yield}"
else
arg
end
end
def two
if block_given?
"block given to 'two' returns #{yield}"
end
end
result1 = one two {
"'three'"
}
#result1 = one(two {
# "'three'"
#})
result2 = one two do
"'three'"
end
#result2 = one(two) do
# "'three'"
#end
puts "With {/} : #{result1}" # => With {/} : block given to 'two' returns 'three'
puts "With do/end: #{result2}" # => With do/end: block given to 'one' returns 'three'
[edit] Using {/} instead of do/end can cause bugs
The Rake User Guide (http://docs.rubyrake.org/read/chapter/4#page23) makes it clear that using {/} can lead to unexpected problems, because it may pass your block to the wrong method:
Blocks may be specified with either a do/end pair, or with curly braces in Ruby. We strongly recommend using do/end to specify the actions for tasks and rules. Because the rakefile idiom tends to leave off parenthesis on the task/file/rule methods, unusual ambiguities can arise when using curly braces.
For example, suppose that the method object_files returns a list of object files in a project. Now we use object_files as the prerequisites in a rule specified with actions in curly braces.
# DON'T DO THIS!
file "prog" => object_files {
# Actions are expected here (but it doesn't work)!
}
Because curly braces have a higher precedence than do/end, the block is associated with the object_files method rather than the file method.
This is the proper way to specify the task …
# THIS IS FINE
file "prog" => object_files do
# Actions go here
end
[edit] Using do/end instead of {/} can also cause bugs
Unfortunately, I've found that one is not necessarily safe just because one follows the rule of always using do/end for all multi-line blocks. Even then you can cause behavior that may not be what you expected unless you understand the difference pretty well...
I guess it's best simply to understand the difference in associativity (rather than blindly following a convention you don't understand) and use whichever one yields the associativity you are wanting.
I first ran into this problem with this simple-looking piece of code.
puts output_streams.map do |output_stream|
output_stream.inspect
end.join ", "
To my surprise, it gave an error:
undefined method `join' for nil:NilClass (NoMethodError)
I had to change it to {/}, supposedly the wrong way to write multi-line blocks...
puts output_streams.map { |output_stream|
output_stream.inspect
}.join ", "
A closer look...
irb -> puts [1, 2].map do |v|
v
end.join ", "
1
2
NoMethodError: undefined method `join' for nil:NilClass
from (irb):18
from :0
What??
irb -> puts [1, 2].map do |v| v end
1
2
=> nil
Ah... So I guess the default associativity of:
puts [1, 2].map do |v| v end.join ", "
is actually:
(puts([1, 2].map) do |v| v end).join ", "
(the do/end block goes to puts rather than to map, so this evaluates to nil.join ", ") ...which is not what I wanted in this case!
This is more along the lines of what I wanted...
irb -> puts( [1, 2].map do |v| v end.join(", ") )
1, 2
=> nil
Of course I could have always done it with {/} like this...
irb -> puts [1, 2].map { |v| v }.join(", ")
1, 2
=> nil
but I originally had it on multiple lines so I thought I was "supposed to" use do/end.
To summarize the difference in behavior:
meth1 objA.iterator_meth do block end.meth2
= (meth1(objA.iterator_meth) do block end).meth2
meth1 objA.iterator_meth { block }.meth2
= meth1((objA.iterator_meth { block }).meth2)
Conclusion: Using {/} for multi-line blocks is not necessarily the "wrong" way. It sure beats putting tons of extra parentheses to effect your desired associativity, IMHO! So {/} can sometimes be the preferred option, even for multi-line blocks!
More specifically, objA.iterator_meth do block end.meth1.meth2
is okay, but if you're passing the result of your block-taker to another method, then {/} might be safer/more appropriate...
meth1 objA.iterator_meth { block }.meth2
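To check which method actually receives the block, a tiny probe method helps. The sketch below uses a made-up block_received method that just reports whether it was handed a block, and then shows that a do/end chain is safe once the outer call is an assignment rather than a parenthesis-free method call.

```ruby
# Hypothetical probe: reports whether this call itself was given a block.
def block_received(*_args)
  block_given?
end

# Braces bind to the nearest call (map), so the probe sees no block.
BRACES_RESULT = block_received [1, 2].map { |v| v }

# do/end binds to the outermost parenthesis-free call, so the probe gets it.
DO_END_RESULT = block_received [1, 2].map do |v| v end

p BRACES_RESULT
p DO_END_RESULT

# With a plain assignment on the left there is no outer method call,
# so do/end attaches to map and the chained join works as intended.
def join_inspected(items)
  joined = items.map do |item|
    item.inspect
  end.join(", ")
  joined
end

puts join_inspected([1, 2])
```

Assigning to a local first and then passing that local on is another way to avoid the extra parentheses while keeping do/end.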
[edit] Caveat: method call has higher precedence than range (..) operator
Don't accidentally do this:
'a'..'z'.each {|letter| print letter}
=> z
Do this instead:
('a'..'z').each {|letter| print letter}
=> abcdefghijklmnopqrstuvwxyz
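The same binding can be seen with integers, where the effect is easier to inspect than with print. This is a small sketch; the constant names are invented for the example.

```ruby
# Without inner parentheses the method call applies to the right endpoint only:
# 1..3.succ is read as 1..(3.succ).
RIGHT_ENDPOINT_ONLY = (1..3.succ)

# Parenthesizing the range makes the method apply to the range itself.
WHOLE_RANGE = (1..3).map { |n| n.succ }

p RIGHT_ENDPOINT_ONLY
p WHOLE_RANGE
```

Note that on Ruby 1.9 and later, String no longer has an each method, so the un-parenthesized 'a'..'z'.each form raises NoMethodError instead of printing z.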
[edit] Caveat: &&/|| have higher precedence than and/or !
Observe:
irb -> a = true and false
irb -> a
=> true # <--- !!!
# Same as doing this:
irb -> (a = true) and false
This is not a bug. It's just something to be aware of. It's just the order of operator precedence: = comes before and. Period.
In the first example, the assignment a = true happens regardless of what follows 'and', because the = operator has a higher precedence than 'and'. So the assignment happens first and then the 'and' is evaluated.
With && it is different.
irb -> a = true && false
irb -> a
=> false
true && false is evaluated before the assignment. Then the result of true && false (which is false) is assigned to a.
This is probably what you would want to do most of the time (rather than a = something and something_else).
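The two spellings can be compared side by side in a script. This is a minimal sketch with made-up method names; each method returns the value that ended up in a.

```ruby
def assign_with_and
  a = true and false  # parsed as (a = true) and false
  a
end

def assign_with_ampersands
  a = true && false   # parsed as a = (true && false)
  a
end

p assign_with_and
p assign_with_ampersands
```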
[edit] boolean operators (and and or)
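In Ruby, the word operators and and or have the same precedence as each other, so if you don't specify parentheses, they are evaluated from left to right. (This differs from && and ||, and from many other languages, where "and" binds more tightly than "or".) So this: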
> true or false and false
=> false
is really the same as this:
> (true or false) and false
=> false
To override that default order of operations you must use parentheses:
> true or (false and false)
=> true
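For comparison, here is the same expression written with both operator families. It is a sketch with invented constant names, showing that the word forms group left to right while && binds more tightly than ||.

```ruby
# and/or share one precedence level and group left to right:
# (true or false) and false
WORD_FORM = (true or false and false)

# && binds more tightly than ||:
# true || (false && false)
SYMBOL_FORM = (true || false && false)

p WORD_FORM
p SYMBOL_FORM
```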
