Ruby / Iterators/Enumerators
From WhyNotWiki
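(What's the difference between the terms "iterator" and "enumerator", by the way? As far as I can tell, in Ruby an "iterator" is a method that yields successive values to a block, like each or map, while an "enumerator" is an object that wraps such a method, like the Enumerable::Enumerator used further down. I think iterator is the preferred term most of the time.)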
Things you can do with Enumerable objects
- inject
- map
- ...
- delete_if (Array), delete_if (Set), delete_if (Hash)
Concise iterator/map/collect calls with Symbol#to_proc
A very concise way to use map to call a method on every element.
(On Ruby 1.8.6 and earlier this requires http://extensions.rubyforge.org/; Ruby 1.8.7 and later ship Symbol#to_proc in core.)
irb -> require 'rubygems'
irb -> require 'extensions/symbol'
irb -> :size.to_proc
=> #<Proc:0xb7bfab00@/usr/lib/ruby/gems/1.8/gems/extensions-0.6.0/lib/extensions/symbol.rb:24>
irb -> ["three", "different", "words"].map(&:size)
=> [5, 9, 5]
# The old-school method that you'll never go back to now that you know the concise alternative:
irb -> ["three", "different", "words"].map{|a|a.size}
=> [5, 9, 5]
How does it work? PragDave has a great explanation. "It’s an incredibly elegant use of coercion and of closures." I agree!
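A minimal sketch of the mechanism, assuming nothing beyond core Ruby: the & in map(&obj) calls obj.to_proc and passes the result as the block. MethodCaller is a hypothetical stand-in for Symbol, written only for illustration; it is not claimed to be how the extensions gem or Ruby itself implements Symbol#to_proc.

```ruby
# Hypothetical stand-in for Symbol, to show what `&` does with to_proc.
class MethodCaller
  def initialize(method_name)
    @method_name = method_name
  end

  # `&obj` in an argument list calls obj.to_proc (the coercion).
  def to_proc
    method_name = @method_name
    # The returned proc closes over method_name (the closure).
    proc { |receiver, *args| receiver.send(method_name, *args) }
  end
end

p ["three", "different", "words"].map(&MethodCaller.new(:size))   # prints [5, 9, 5]
p [3, 1, 9, 6].inject(&MethodCaller.new(:+))                      # prints 19
```

With map the proc receives one element as the receiver; with inject it receives the accumulator as the receiver and the next element as the argument, which is why the same trick works for &:+.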
Example uses
Accumulate/sum up some numbers:
irb -> require 'rubygems'
irb -> require 'extensions/symbol'
irb -> sum = [3, 1, 9, 6].inject(&:+)
=> 19
Capitalize each word:
irb -> %w{a couple words}.map(&:capitalize).join(' ')
=> "A Couple Words"
Strip the trailing newline from each line of a file:
irb -> require 'rubygems'
irb -> require 'extensions/symbol'
irb -> File.open('foo', 'w') { |f| f.puts 'line1'; f.puts 'line2' }
=> nil
irb -> File.readlines('foo')
=> ["line1\n", "line2\n"]
irb -> File.readlines('foo').map(&:chomp)
=> ["line1", "line2"]
Enumerable#map_send(meth, *args) {|e.send(meth, *args)| ...} [Ruby Facets (category)]
http://facets.rubyforge.org/src/doc/rdoc/core/classes/Enumerable.html#M001330
# File lib/facets/core/enumerable/map_send.rb, line 3
def map_send(meth, *args)
  if block_given?
    map{|e| yield(e.send(meth, *args))}
  else
    map{|e| e.send(meth, *args)}
  end
end
Sometimes Symbol#to_proc isn't enough because it doesn't let you pass args. map_send is the answer to this!
irb -> require 'facets/core/enumerable/map_send.rb'
irb -> [1,2,3].map_send(:+, 3)
=> [4, 5, 6]
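The block form is easy to miss, so here is a small sketch of it. The method body is copied from the Facets source quoted above so that the example runs without Facets installed; the sample calls are my own.

```ruby
# Copy of the Facets definition quoted above, so this runs without Facets.
module Enumerable
  def map_send(meth, *args)
    if block_given?
      # Send the message, then hand each result to the caller's block.
      map { |e| yield(e.send(meth, *args)) }
    else
      map { |e| e.send(meth, *args) }
    end
  end
end

p [1, 2, 3].map_send(:+, 3)                 # prints [4, 5, 6]
p [1, 2, 3].map_send(:+, 3) { |n| n * 2 }   # prints [8, 10, 12]
p %w{a b}.map_send(:center, 5, '*')         # prints ["**a**", "**b**"]
```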
Enumerable#every / Array#every! [Ruby Facets (category)]
http://facets.rubyforge.org/src/doc/rdoc/core/classes/Enumerable.html#M001236
Returns an elementwise Functor. This allows you to map a method on to every element.
irb -> require 'facets/core/enumerable/every'
[1,2,3].every + 3 #=> [4,5,6]
['a','b','c'].every.upcase #=> ['A','B','C']
Arguably more readable than map. For comparison, the same result with map_send:
irb -> require 'facets/core/enumerable/map_send.rb'
irb -> [1,2,3].map_send(:+, 3)
=> [4, 5, 6]
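To give a feel for what an "elementwise Functor" does, here is a rough sketch built on method_missing. ElementwiseProxy and the top-level every helper are hypothetical names of my own; the real Facets implementation may well differ.

```ruby
# Hypothetical proxy: forwards any message to every element of a collection.
# BasicObject is used so that almost no methods are defined on the proxy itself.
class ElementwiseProxy < BasicObject
  def initialize(collection)
    @collection = collection
  end

  # Any call on the proxy (including operators like +) lands here.
  def method_missing(name, *args, &block)
    @collection.map { |element| element.__send__(name, *args, &block) }
  end
end

# Hypothetical helper, used instead of patching Enumerable.
def every(collection)
  ElementwiseProxy.new(collection)
end

sums = every([1, 2, 3]) + 3
p sums                         # prints [4, 5, 6]
p every(%w{a b c}).upcase      # prints ["A", "B", "C"]
```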
How do I collect an "each"-type iterator as a collection?
irb -> "tyler".each_byte {|l| p l }
116
121
108
101
114
How do I convert that into the array [116, 121, 108, 101, 114] instead?
Sure, I can do this:
irb -> a = []; "tyler".each_byte {|l| a << l }; a
=> [116, 121, 108, 101, 114]
But isn't there a more elegant way???
Yes! See next question...
How do I turn an "each" into a "map"?
Use an Enumerable::Enumerator object! Very handy.
The plain map method is equivalent to this:
Enumerable::Enumerator.new(object, :each).map{|value| value}
Example:
irb -> require 'enumerator'
irb -> Enumerable::Enumerator.new(['a', 'b', 'c'], :each).map{|value| value}
=> ["a", "b", "c"]
irb -> ['a', 'b', 'c'].map{|value| value}
=> ["a", "b", "c"]
Fine, you don't need an Enumerator to do that... But to "catch" the values that are yielded by other iterators (each_with_index, each_byte, and so on) and get them returned as an array, I really don't know any other good way to do it...
irb -> Enumerable::Enumerator.new(['a', 'b', 'c'], :each_with_index).map{|value, index| [index, value]}
=> [[0, "a"], [1, "b"], [2, "c"]]
irb -> Enumerable::Enumerator.new(['a', 'b', 'c'], :each_with_index).map{|value, index| "#{index+1}. #{value}"}
=> ["1. a", "2. b", "3. c"]
...
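Another route, sketched here on the assumption that Object#enum_for (also spelled to_enum) is available: as far as I know it is added by require 'enumerator' on Ruby 1.8 and is built in from 1.9 on. It wraps any each-style method without naming the Enumerator class directly, which also answers the each_byte question from the previous section. bytes_of and numbered are hypothetical helper names.

```ruby
# Hypothetical helper: collect everything each_byte yields into an array.
def bytes_of(string)
  string.enum_for(:each_byte).to_a
end

# Hypothetical helper: number the items by mapping over each_with_index.
def numbered(items)
  items.enum_for(:each_with_index).map { |value, index| "#{index + 1}. #{value}" }
end

p bytes_of("tyler")           # prints [116, 121, 108, 101, 114]
p numbered(['a', 'b', 'c'])   # prints ["1. a", "2. b", "3. c"]
```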
Kernel#enum
require 'qualitysmith_extensions/kernel/enum'
Why would you want an "each" to be a "map"?
When you want to chain several "map"-type iterators together, that's when.
irb -> "line 1\nline 2".each_line {|line| "* #{line}"}.map {|line| (' '*2) + line}
=> ["  line 1\n", "  line 2"]
We wanted the each_line to prefix the line with a bullet. But that's not what happened. each_line actually returns the receiver unchanged!
irb -> "line 1\nline 2".each_line {|line| "* #{line}"}
=> "line 1\nline 2"
That's fine when that iterator is the "end of the line", so to speak (the last iterator in a chain of iterators):
irb -> "line 1\nline 2".each_line {|line| puts "* #{line}"}
* line 1
* line 2
=> "line 1\nline 2"
... but not so cool when you want to chain something on after it.
Solution with Enumerators:
irb -> require 'enumerator'
=> true
irb -> Enumerable::Enumerator.new("line 1\nline 2", :each_line).map {|line| "* #{line}"}.map {|line| (' '*2) + line}
=> ["  * line 1\n", "  * line 2"]
But what if you want that enumerator in the middle of a chain of map-type iterators?
How to insert an enumerator in the middle of a chain of map-type iterators
We want to do something sort of like this (but obviously not exactly like this):
irb -> n = 3; (1..n).to_a.map {|i| "line #{i}"}.join("\n").Enumerable::Enumerator.new(our_string_weve_built_so_far, :each_line).map {|line| "* #{line}"}.map {|line| (' '*2) + line}
NoMethodError: undefined method `Enumerable' for ["line 1", "line 2", "line 3"]:Array
from (irb):9
from :0
We could probably solve this with a Functor. But, more directly (no extra classes) and generically, we could solve it like this:
irb -> class Object; def with_self; yield self end; end
=> nil
irb -> n = 3; (1..n).to_a.map {|i| "line #{i}"}.join("\n").with_self {|our_string_weve_built_so_far| Enumerable::Enumerator.new(our_string_weve_built_so_far, :each_line)}.map {|line| "* #{line}"}.map {|line| (' '*2) + line}
=> ["  * line 1\n", "  * line 2\n", "  * line 3"]
That's pretty ugly and verbose, though. Let's see if we can concisify it a bit. I see that Enumerable defines several "enum_" methods. I'm very curious why they didn't throw in a generic "enum" method that let you pass in the name of the iterator to be used.
irb -> [].class.ancestors
=> [Array, Enumerable, Object, PP::ObjectMixin, Kernel]
irb -> Enumerable.instance_methods.grep /^enum/
=> ["enum_slice", "enum_cons", "enum_with_index"]
No matter. We'll just write our own.
http://code.qualitysmith.com/gemables/our_extensions/lib/enumerable/enum.rb
irb -> require 'enumerator'
irb -> module Enumerable; def enum(iterator); Enumerable::Enumerator.new(self, iterator); end; end
=> nil
And we can see that it works marvelously:
irb -> n = 3; (1..n).to_a.map {|i| "line #{i}"}.join("\n").enum(:each_line).map {|line| "* #{line}"}.map {|line| (' '*2) + line}
=> ["  * line 1\n", "  * line 2\n", "  * line 3"]
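For comparison, a sketch of the same chain with no monkey patching at all, assuming Object#enum_for is available (to my knowledge it comes with require 'enumerator' on Ruby 1.8 and is built in from 1.9 on). It seems to be the generic "pass in the name of the iterator" method wondered about above. Because it lives on Object and not on Enumerable, it also works on receivers that are not Enumerable, which matters on Ruby 1.9 and later, where String no longer mixes in Enumerable. bulleted_lines is a hypothetical helper name.

```ruby
# Hypothetical helper wrapping the whole chain so it can be reused.
def bulleted_lines(n)
  (1..n).map { |i| "line #{i}" }.join("\n").
    enum_for(:each_line).                  # wrap each_line so map can follow it
    map { |line| "* #{line}" }.
    map { |line| (' ' * 2) + line }
end

p bulleted_lines(3)   # prints ["  * line 1\n", "  * line 2\n", "  * line 3"]
```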
[good example of iterator] all?
tasks/install_tasks.rake [Not yet released (category)]
if !status.empty? and status.all? { |line| line =~ /^\?/ or line =~ /^A.*\.$/ }
  SVN.add '*'
  SVN.remove_without_delete 'log/*.log'
end
Translated to English: if Subversion returned at least one status line, and every line either begins with a ? or begins with an A and ends with a period, then...
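The condition can be tried on its own with a few made-up status lines. This is only a sketch: only_unversioned_or_added? is a hypothetical helper name and the sample lines are invented, not real Subversion output. It also shows why the empty? guard is there: all? returns true for an empty collection.

```ruby
# Hypothetical helper wrapping the condition from the rake task above.
def only_unversioned_or_added?(status)
  !status.empty? && status.all? { |line| line =~ /^\?/ || line =~ /^A.*\.$/ }
end

p only_unversioned_or_added?(["?      notes.txt", "A      ."])        # prints true
p only_unversioned_or_added?(["?      notes.txt", "M      app.rb"])   # prints false
p only_unversioned_or_added?([])                                      # prints false
p [].all? { |line| line =~ /^\?/ }                                    # prints true
```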
[good example of iterator] collect + max
This block of code goes through each of the task objects, looks at the length of its name, and uses the highest one as the width for display purposes.
width = displayable_tasks.collect { |t|
  t.name.length
}.max
collect actually returns an array containing all the lengths, and then max gets the highest item in the array.
From: /usr/lib/ruby/gems/1.8/gems/rake-0.7.1/lib/rake.rb :
def display_tasks_and_comments
  displayable_tasks = Rake::Task.tasks.select { |t|
    t.comment && t.name =~ options.show_task_pattern
  }
  width = displayable_tasks.collect { |t|
    t.name.length
  }.max
  displayable_tasks.each do |t|
    printf "rake %-#{width}s # %s\n", t.name, t.comment
  end
end
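The width calculation can be tried without Rake. In this sketch DemoTask is a hypothetical stand-in for Rake::Task with only the two fields used here, and task_lines is a hypothetical helper that returns the formatted lines where the original prints them.

```ruby
# Hypothetical stand-in for Rake::Task: just a name and a comment.
DemoTask = Struct.new(:name, :comment)

# Same collect + max idea as above: find the longest name, then pad to it.
def task_lines(tasks)
  width = tasks.collect { |t| t.name.length }.max
  tasks.map { |t| format("rake %-#{width}s # %s", t.name, t.comment) }
end

tasks = [DemoTask.new("db:migrate", "Run migrations"),
         DemoTask.new("test", "Run tests")]
puts task_lines(tasks)
```

The longest name here is "db:migrate" (10 characters), so "test" is left-justified and padded to 10 columns and the # markers line up.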
each_line
irb -> "a\nb".each_line {|a| puts a }
a
b
=> "a\nb"
