Ruby / Iterators/Enumerators

From WhyNotWiki

(What's the difference between the terms "iterator" and "enumerator", by the way? I think iterator is the preferred term most of the time.)

Things you can do with Enumerable objects

  • inject
  • map
  • ...
  • delete_if (Array)
  • delete_if (Set)
  • delete_if (Hash)
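
A quick sketch of the methods listed above, using made-up sample data. Note that inject and map come from Enumerable, while delete_if is defined separately by Array, Set and Hash and modifies the receiver in place.

```ruby
require 'set'

nums = [1, 2, 3, 4]

# inject folds the collection into a single value.
total = nums.inject(0) { |sum, n| sum + n }

# map builds a new array from the block's return values.
squares = nums.map { |n| n * n }

# delete_if removes, in place, every element for which the block is true.
array = [1, 2, 3, 4]
array.delete_if { |n| n % 2 == 0 }

set = Set.new([1, 2, 3, 4])
set.delete_if { |n| n % 2 == 0 }

hash = { :a => 1, :b => 2 }
hash.delete_if { |key, value| value > 1 }
```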

Concise iterator/map/collect calls with Symbol#to_proc

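A very concise way to use map to call a method on every element.

(On Ruby 1.8.6 and earlier this requires http://extensions.rubyforge.org/. Ruby 1.8.7 and later provide Symbol#to_proc in core, so the require lines can be dropped there.)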

irb -> require 'rubygems'
irb -> require 'extensions/symbol'

irb -> :size.to_proc
    => #<Proc:0xb7bfab00@/usr/lib/ruby/gems/1.8/gems/extensions-0.6.0/lib/extensions/symbol.rb:24>

irb -> ["three", "different", "words"].map(&:size)
    => [5, 9, 5]

       # The old-school method that you'll never go back to now that you know the concise alternative:
irb -> ["three", "different", "words"].map{|a|a.size}
    => [5, 9, 5]

How does it work? PragDave has a great explanation. "It’s an incredibly elegant use of coercion and of closures." I agree!
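
Roughly, map(&:size) works because & calls to_proc on its argument and passes the resulting Proc as the block. The sketch below imitates that with a hypothetical helper, symbol_to_proc, so that it doesn't touch Symbol itself. It illustrates the idea only and is not the extensions gem's actual source.

```ruby
# Hypothetical stand-in for Symbol#to_proc: builds a Proc that sends the
# named method to whatever object it is called with.
def symbol_to_proc(sym)
  proc { |obj, *args| obj.send(sym, *args) }
end

size_of = symbol_to_proc(:size)

# Calling the Proc directly is the same as calling the method.
direct = size_of.call("three")

# Passing it with & turns it into the block that map yields to.
sizes = ["three", "different", "words"].map(&size_of)
```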

Example uses

Accumulate/sum up some numbers:

irb -> require 'rubygems'
irb -> require 'extensions/symbol'

irb -> sum = [3, 1, 9, 6].inject(&:+)
    => 19

Capitalize each word:

irb -> %w{a couple words}.map(&:capitalize).join(' ')
    => "A Couple Words"

Strip the trailing newline from each line read from a file:

irb -> File.open('foo', 'w') { |f| f.puts 'line1'; f.puts 'line2' }
    => nil

irb -> File.readlines('foo')
    => ["line1\n", "line2\n"]
irb -> File.readlines('foo').map(&:chomp)
    => ["line1", "line2"]

Enumerable#map_send(meth, *args) {|e.send(meth, *args)| ...} [Ruby Facets (category)]

http://facets.rubyforge.org/src/doc/rdoc/core/classes/Enumerable.html#M001330

# File lib/facets/core/enumerable/map_send.rb, line 3
  def map_send(meth, *args)
    if block_given?
      map{|e| yield(e.send(meth, *args))}
    else
      map{|e| e.send(meth, *args)}
    end
  end

Sometimes Symbol#to_proc isn't enough because it doesn't let you pass args. map_send is the answer to this!

irb -> require 'facets/core/enumerable/map_send.rb'

irb -> [1,2,3].map_send(:+, 3)
    => [4, 5, 6]
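
To try this without installing Facets, the definition quoted above can be pasted into Enumerable directly. The sketch below does that and also exercises the block form, which passes each result through the block; the sample values are made up.

```ruby
module Enumerable
  # Same body as the Facets source quoted above, copied here so the
  # example runs on its own.
  def map_send(meth, *args)
    if block_given?
      map { |e| yield(e.send(meth, *args)) }
    else
      map { |e| e.send(meth, *args) }
    end
  end
end

# Without a block: send the method (plus arguments) to every element.
plain = [1, 2, 3].map_send(:+, 3)

# With a block: each result is additionally passed through the block.
doubled = [1, 2, 3].map_send(:+, 3) { |x| x * 2 }

# More than one argument works too.
padded = %w{a bb}.map_send(:ljust, 3, '.')
```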


Enumerable#every / Array#every! [Ruby Facets (category)]

http://facets.rubyforge.org/src/doc/rdoc/core/classes/Enumerable.html#M001236

Returns an elementwise Functor. This allows you to map a method onto every element.

irb -> require 'facets/core/enumerable/every'

  [1,2,3].every + 3           #=> [4,5,6]

  ['a','b','c'].every.upcase  #=> ['A','B','C']

Arguably more readable than map. Compare the map_send equivalent:

irb -> require 'facets/core/enumerable/map_send.rb'

irb -> [1,2,3].map_send(:+, 3)
    => [4, 5, 6]
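
The elementwise functor behind every can be pictured as a proxy object that forwards any method call to each element. The sketch below is a simplified illustration, assuming Ruby 1.9 or later (for BasicObject); ElementwiseProxy and elementwise are hypothetical names chosen to avoid clashing with Facets, and this is not the Facets implementation.

```ruby
# Hypothetical proxy: inherits from BasicObject so that almost every call
# (including operators such as +) falls through to method_missing.
class ElementwiseProxy < BasicObject
  def initialize(enum)
    @enum = enum
  end

  # Forward the call to each element and collect the results.
  def method_missing(meth, *args, &block)
    @enum.map { |e| e.__send__(meth, *args, &block) }
  end
end

module Enumerable
  # Hypothetical counterpart of Facets' every.
  def elementwise
    ElementwiseProxy.new(self)
  end
end

added   = [1, 2, 3].elementwise + 3
upcased = ['a', 'b', 'c'].elementwise.upcase
```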

How do I collect an "each"-type iterator as a collection?

irb -> "tyler".each_byte {|l| p l }
116
121
108
101
114

How do I convert that into the array [116, 121, 108, 101, 114] instead?

Sure, I can do this:

irb -> a = []; "tyler".each_byte {|l| a << l }; a
    => [116, 121, 108, 101, 114]

But isn't there a more elegant way???

Yes! See next question...

How do I turn an "each" into a "map"?

Use an Enumerable::Enumerator object! Very handy.

The plain map method is equivalent to this:

Enumerable::Enumerator.new(object, :each).map{|value| value}

Example:

irb ->  require 'enumerator'

irb ->  Enumerable::Enumerator.new(['a', 'b', 'c'], :each).map{|value| value}
    => ["a", "b", "c"]

irb -> ['a', 'b', 'c'].map{|value| value}
    => ["a", "b", "c"]

Fine, you don't need an Enumerator to do that... But to "catch" the values that are yielded by other iterators and get them returned as an array, I really don't know any other good way to do it...

irb -> Enumerable::Enumerator.new(['a', 'b', 'c'], :each_with_index).map{|value, index| [index, value]}
    => [[0, "a"], [1, "b"], [2, "c"]]

irb -> Enumerable::Enumerator.new(['a', 'b', 'c'], :each_with_index).map{|value, index| "#{index+1}. #{value}"}
    => ["1. a", "2. b", "3. c"]
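
On Ruby 1.9 and later the class is simply Enumerator, no require is needed, and most iterators return one when called without a block. The sketch below assumes such a Ruby version and uses the blockless form and to_enum instead of calling Enumerable::Enumerator.new directly.

```ruby
# Assumes Ruby 1.9 or later.

# each_byte without a block returns an Enumerator, so to_a collects the bytes.
bytes = "tyler".each_byte.to_a

# The same works for each_with_index, which can then be chained with map.
numbered = ['a', 'b', 'c'].each_with_index.map { |value, index| "#{index + 1}. #{value}" }

# to_enum names the iterator explicitly, much like the two-argument
# Enumerable::Enumerator.new(object, :iterator) form used above.
pairs = ['a', 'b', 'c'].to_enum(:each_with_index).map { |value, index| [index, value] }
```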

...

Kernel#enum

require 'qualitysmith_extensions/kernel/enum'

Why would you want an "each" to be a "map"?

When you want to chain several "map"-type iterators together, that's when.

irb -> "line 1\nline 2".each_line {|line| "* #{line}"}.map {|line| (' '*2) + line}
    => ["  line 1\n", "  line 2"]

We wanted each_line to prefix each line with a bullet. But that's not what happened: each_line discards the block's return value and returns the receiver unchanged!

irb -> "line 1\nline 2".each_line {|line| "* #{line}"}
    => "line 1\nline 2"
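
The difference comes down to what each kind of iterator does with the block's return value. Here is a minimal sketch with two hypothetical helpers, each_item and map_item, standing in for an each-style and a map-style iterator:

```ruby
# Hypothetical each-style iterator: yields every item, ignores what the
# block returns, and hands back the original collection.
def each_item(items)
  items.each { |item| yield item }
  items
end

# Hypothetical map-style iterator: collects what the block returns.
def map_item(items)
  results = []
  items.each { |item| results << yield(item) }
  results
end

lines = ["line 1", "line 2"]

from_each = each_item(lines) { |line| "* #{line}" }  # bullets are thrown away
from_map  = map_item(lines)  { |line| "* #{line}" }  # bullets are kept
```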

That's fine when that iterator is the "end of the line", so to speak (the last iterator in a chain of iterators):

irb -> "line 1\nline 2".each_line {|line| puts "* #{line}"}
* line 1
* line 2
    => "line 1\nline 2"

... but not so cool when you want to chain something on after it.

Solution with Enumerators:


irb -> require 'enumerator'
    => true

irb -> Enumerable::Enumerator.new("line 1\nline 2", :each_line).map {|line| "* #{line}"}.map {|line| (' '*2) + line}
    => ["  * line 1\n", "  * line 2"]

But what if you want that enumerator in the middle of a chain of map-type iterators?

How to insert an enumerator in the middle of a chain of map-type iterators

We want to do something sort of like this (but obviously not exactly like this):

irb -> n = 3; (1..n).to_a.map {|i| "line #{i}"}.join("\n").Enumerable::Enumerator.new(our_string_weve_built_so_far, :each_line).map {|line| "* #{line}"}.map {|line| (' '*2) + line}
NoMethodError: undefined method `Enumerable' for ["line 1", "line 2", "line 3"]:Array
        from (irb):9
        from :0

We could probably solve this with a Functor. But, more directly (no extra classes) and generically, we could solve it like this:

irb -> class Object; def with_self; yield self end; end
    => nil

irb -> n = 3; (1..n).to_a.map {|i| "line #{i}"}.join("\n").with_self {|our_string_weve_built_so_far| Enumerable::Enumerator.new(our_string_weve_built_so_far, :each_line)}.map {|line| "* #{line}"}.map {|line| (' '*2) + line}
    => ["  * line 1\n", "  * line 2\n", "  * line 3"]

That's pretty ugly and verbose, though. Let's see if we can concisify it a bit. I see that Enumerable defines several "enum_" methods. I'm very curious why they didn't throw in a generic "enum" method that let you pass in the name of the iterator to be used.

irb -> [].class.ancestors
    => [Array, Enumerable, Object, PP::ObjectMixin, Kernel]

irb -> Enumerable.instance_methods.grep(/^enum/)
    => ["enum_slice", "enum_cons", "enum_with_index"]

No matter. We'll just write our own.

http://code.qualitysmith.com/gemables/our_extensions/lib/enumerable/enum.rb

irb -> require 'enumerator'
irb -> module Enumerable; def enum(iterator); Enumerable::Enumerator.new(self, iterator); end; end
    => nil

And we can see that it works marvelously:

irb -> n = 3; (1..n).to_a.map {|i| "line #{i}"}.join("\n").enum(:each_line).map {|line| "* #{line}"}.map {|line| (' '*2) + line}
    => ["  * line 1\n", "  * line 2\n", "  * line 3"]
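
For what it's worth, Ruby does ship a generic method of this kind: to_enum, with the alias enum_for. As far as I can tell it is defined on Object rather than on Enumerable, which would explain why a grep of Enumerable's instance methods doesn't show it. A sketch of the same chain using it, assuming Ruby 1.9 or later (on 1.8 it should be available after require 'enumerator'):

```ruby
n = 3

# enum_for(:each_line) wraps the string in an Enumerator that iterates with
# each_line, so map can be chained onto it.
result = (1..n).map { |i| "line #{i}" }.
  join("\n").
  enum_for(:each_line).
  map { |line| "* #{line}" }.
  map { |line| (' ' * 2) + line }
```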

[good example of iterator] all?

tasks/install_tasks.rake [Not yet released (category)]

      if !status.empty? and status.all? { |line| line =~ /^\?/ or line =~ /^A.*\.$/ }
        SVN.add '*'
        SVN.remove_without_delete 'log/*.log'
      end

Translated to English: if Subversion returned at least one status line, and every line either begins with a ? or begins with an A and ends with a period, then...
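
A small sketch of the same check against made-up status lines (real svn status output may be formatted differently). It also shows why the empty? guard is there: all? on an empty array returns true.

```ruby
# Made-up sample data standing in for the output of `svn status`.
only_new      = ["?      log/development.log", "A      ."]
with_modified = ["?      notes.txt", "M      app/models/user.rb"]

# Same condition as in the rake task, wrapped in a lambda for reuse.
new_or_added = lambda do |status|
  !status.empty? && status.all? { |line| line =~ /^\?/ || line =~ /^A.*\.$/ }
end

all_new   = new_or_added.call(only_new)       # every line matches
has_other = new_or_added.call(with_modified)  # the "M" line matches neither pattern
empty     = new_or_added.call([])             # guarded, although [].all? is true
```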

[good example of iterator] collect + max

This block of code goes through each of the task objects, looks at the length of its name, and uses the highest one as the width for display purposes.

      width = displayable_tasks.collect { |t|
        t.name.length
      }.max

collect actually returns an array containing all the lengths, and then max gets the highest item in the array.

From: /usr/lib/ruby/gems/1.8/gems/rake-0.7.1/lib/rake.rb :

    def display_tasks_and_comments
      displayable_tasks = Rake::Task.tasks.select { |t|
        t.comment && t.name =~ options.show_task_pattern
      }
      width = displayable_tasks.collect { |t|
        t.name.length
      }.max
      displayable_tasks.each do |t|
        printf "rake %-#{width}s  # %s\n", t.name, t.comment
      end
    end
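
To see the width calculation in isolation, here is a sketch with a hypothetical DemoTask struct standing in for Rake's task objects; the task names and comments are invented.

```ruby
# Hypothetical stand-in for Rake::Task, with just the two fields used here.
DemoTask = Struct.new(:name, :comment)

tasks = [
  DemoTask.new("db:migrate", "Run migrations"),
  DemoTask.new("test", "Run tests")
]

# collect gathers the name lengths, max picks the longest one.
width = tasks.collect { |t| t.name.length }.max

# %-Ns left-justifies the name in a field N characters wide, so the
# comments line up in one column.
lines = tasks.collect { |t| format("rake %-#{width}s  # %s", t.name, t.comment) }
puts lines
```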

each_line

irb -> "a\nb".each_line {|a| puts a }
a
b
    => "a\nb"