04.20.08

Self-Modifying Code? Or Self-Creating Code?

Posted in Uncategorized at 12:49 pm by JohnB

[aside: Why write an email that will be read by just one or two people when you can instead write a blog post that will be seen by… uh… one or two people?]

I was recently discussing the problem of tracing program execution through code that doesn’t exist.  Sometimes, usually to understand and debug a program, you need to trace through it. If the code doesn’t exist, it’s hard to follow where it goes. Actually, the code exists, it’s just not in source control or otherwise easily searchable.  Let me explain.

In Ruby, whose source code is a lot like its executable format, it’s trivial to write code that writes code - so you, the programmer, don’t have to. The Rails web framework uses this a lot, for creating whatever form of ‘find’ strikes your fancy: find_by_firstname, find_by_firstname_and_gender, find_by_pet_species_and_viciousness, etc. - as long as your database table has the appropriately named columns then it will find what you’re looking for with no work on your part!  Actually, some work is required - you need to know that most any routine starting with ‘find’ is probably auto-generated by Rails and thus will be un-findable in the source tree.

A more complex example is the routing helpers - a shorthand way of saying “put a link here on this web page that will take the user to that web page over there.”  Until you realize that most any method ending in ‘path’ is a URL helper, you’ll be confused by code that references user_edit_path or formatted_pet_list_path or formatted_pet_species_list_path.  The last one says that you want to get the path to a specially-formatted list of the various pet species represented by the system (e.g. “http://myPetSite.com/pet_species/list.csv”). Fairly clear once you know what it does, but fairly opaque until you get to that point.  And it is, trust me on this, shorter, more maintainable and more clear than the alternative (after, of course, you learn to read it).

So this difficulty in finding the source to “formatted_pet_species_list_path” (or something like that) started a discussion that eventually got around to the idea that Ruby and Rails use code to write code.

“It’s the 11th commandment,” said one, “thou shalt not write self-modifying code!”

Yes and no. Call me irrationally exuberant or say that I drank some sugary flavored colored water - but I don’t think it’s as cut-and-dried as it used to be.  This commandment, when Moses brought it down from on high, was written for a compiled language, where the source was very very different from the executing code, and it actually did modify the code being executed.  This is bad bad bad - it’s hard to understand, it leads to intractable bugs, it’ll make you go blind, melt glaciers and other bad stuff.  No argument there.

But is this what Rails is doing with Ruby? I’d say it’s different.  Instead of modifying existing code it is merely creating code that did not exist before. It’s more akin to a pre-processor that could, just before compiling a program, generate all the permutations of finding database rows from columns X, Y and Z (and any number of others).  Looked at this way, it appears to be similar to a C++ template - generic code that a programmer has written to simplify the writing of code that does similar things in similar ways.  The main difference is that with Ruby there is no pre-processor - the routine gets created at the time it is first used.

The “template” in this case is a routine called method_missing that gets handed the name and arguments of any routine that doesn’t exist.  It looks for a function name matching the form ‘find_by_X‘ where X makes sense for the database table in question (of course, if it doesn’t match the correct format then it just passes the name and arguments onward for some other routine to either make sense of or to burp up an error).  Once the routine is created it is available to be called again, with none of the overhead incurred when it was first created. More importantly, it was created with no additional overhead on the part of the programmer - to write it, test it, debug it or modify it. This, in my opinion, is a huge advantage - but I’m a lazy developer who doesn’t want to write or debug any code that I don’t have to.
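To make the idea concrete, here is a minimal sketch of the pattern. It is not Rails’ actual implementation: the TinyTable class, its COLUMNS list and the array-of-hashes storage are all made up for illustration. The first call to an unknown find_by_X lands in method_missing, which defines a real method and then calls it; later calls go straight to the generated method.

```ruby
# Hypothetical stand-in for a database table (not part of Rails).
class TinyTable
  COLUMNS = %w[firstname gender pet_species].freeze  # assumed column names

  def initialize(rows)
    @rows = rows  # array of hashes keyed by column name
  end

  # Called by Ruby only when no method with this name exists yet.
  def method_missing(name, *args)
    columns = finder_columns(name)
    return super unless columns  # not a finder: fall through to the usual NoMethodError

    # Create the finder once so later calls skip method_missing entirely.
    self.class.send(:define_method, name) do |*values|
      @rows.find { |row| columns.zip(values).all? { |col, val| row[col] == val } }
    end
    send(name, *args)
  end

  def respond_to_missing?(name, include_private = false)
    !finder_columns(name).nil? || super
  end

  private

  # "find_by_a_and_b" => ["a", "b"], or nil if the name doesn't fit the table.
  def finder_columns(name)
    match = name.to_s.match(/\Afind_by_(\w+)\z/)
    return nil unless match
    columns = match[1].split("_and_")
    (columns - COLUMNS).empty? ? columns : nil
  end
end

people = TinyTable.new([
  { "firstname" => "Ann", "gender" => "f" },
  { "firstname" => "Bob", "gender" => "m" }
])
puts people.find_by_firstname_and_gender("Bob", "m").inspect
```

Note that the generated method still won’t show up in a search of the source tree, which is exactly the tracing problem described above.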

Kool-aid anyone?

03.20.08

It’s Official - My Old Product is Dead

Posted in Uncategorized at 4:22 pm by JohnB

Pay By Touch To Shut Down All Biometric Services Immediately

Biometric authentication transactions to cease at 11:59:59PM March 19, 2008

SAN FRANCISCO - (March 19, 2008) - Solidus Networks, Inc., dba Pay By Touch, regretfully announced today that it will no longer process biometric transactions on behalf of its merchant customers and consumer membership base, as of 11:59:59PM March 19, 2008.

On December 14, 2007, Solidus Networks filed for U.S. bankruptcy protection under Chapter 11. As part of the company’s restructuring, it was determined that the enterprise could no longer support the biometric authentication and payment system as it currently exists, based on lack of funding and current market conditions.

Other non-biometric Solidus Networks business units will continue on their current business paths.

Solidus Networks extends its sincere gratitude to the shoppers, merchants, vendors, investors, partners, and employees who have been supporting the company’s vision since its first biometric payment transaction in 2002.

02.27.08

unpack!

Posted in Uncategorized, ruby at 12:54 am by JohnB

 [Update: presentation from the 4/15/2008 Ruby Meetup is now available here.]

I like reading code. It’s like a novel and I want to read it cover-to-cover. Some, such as Why’s Camping framework, I struggle to comprehend. But most code that I read comes up slightly short. Like a novel with some misspellings, awkward phrasing or repeated analogies, I mentally mark it as “could be better”. And sometimes I really do sit down and write something better - maybe just for my own amusement but often for a useful purpose.

I recently had the experience of reading some code that parsed a variable-length binary data structure. This sort of thing comes up often when parsing a file format or communications protocol. Most of the code looks fairly similar because it does similar stuff: ignore one byte, read the next four as the length of the following junk, skip past those N bytes, read two important bytes, ignore two more, read another four-byte length and skip past the following N bytes - ad nauseam.

I’ve written it in C, and it looks something like this (ignoring error conditions like getting to the end of the buffer):

const UInt8 *ptr = (const UInt8 *)&data; // start at the beginning of our data
ptr++;                                   // skip junk we don't care about
UInt32 len;
memcpy(&len, ptr, sizeof(len));          // get the 4-byte length (memcpy avoids an unaligned read)
len = ntohl(len);                        // convert from network byte ordering
ptr += sizeof(UInt32);                   // skip past the length we just read
ptr += len;                              // skip past the data we don't care about
UInt16 cost;
memcpy(&cost, ptr, sizeof(cost));        // read our important two bytes
cost = ntohs(cost);                      // convert to the correct byte ordering

In Ruby, this tends to be shorter due to the handy String#unpack routine, which takes a concise format string to define how many bytes to read and what to do with them. “a3” reads 3 bytes as a string, “N” reads 4 bytes in network order, “n” reads 2 bytes in network order, etc. The code above could be rewritten in Ruby like this:

array = data.unpack("a1N")         # read the junk and the 4 length bytes
len = array[1]                     # only get the length value we care about
data = data[5..-1]                 # throw away the stuff we just read
array = data.unpack("a#{len}n")    # define the length to read on the fly
cost = array[1]                    # get our data in its correct ordering
data = data[(len+2)..-1]           # again, throw away what we just read
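To try those format codes on real bytes, here is a small self-contained sketch. The sample buffer is built with Array#pack, and its contents (one junk byte, a 3-byte payload and a cost of 500) are invented purely for illustration.

```ruby
# Hypothetical buffer: 1 junk byte, 4-byte length, payload, 2-byte cost.
data = ["x", 3, "abc", 500].pack("a1Na3n")

junk, len = data.unpack("a1N")           # "a1" = 1 byte as a string, "N" = 32-bit network order
data = data[5..-1]                       # drop the 5 bytes just read
payload, cost = data.unpack("a#{len}n")  # "n" = 16-bit network order
data = data[(len + 2)..-1]               # drop the payload and the cost

puts "len=#{len} payload=#{payload} cost=#{cost} remaining=#{data.bytesize}"
```

Using parallel assignment instead of array[1] already helps a little with readability, but the manual slicing after every read is still there.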

The Ruby version works fine, but it’s not much more readable than the C code. A first step would be to define a String#unpack! routine, where the ‘!’ clues us in that it modifies the object we’re working with. In this case, the modification is to eat (discard) the data we just read. This shortens the code to:

array = data.unpack!("a1N")        # read the junk and the 4 length bytes
len = array[1]                     # only get the length value we care about
array = data.unpack!("a#{len}n")   # define the length to read on the fly
cost = array[1]                    # get our data in its correct ordering

But again, this isn’t much more readable (in my opinion) than the C code. Additionally, it doesn’t help us understand the code much better in the case where our format string is “a3Nna5” and we need to remember which item in ‘array’ corresponds to the ‘n’ in the string (in this case, it is array[2]). After a test iteration or two, what I finally hit upon was to encapsulate the behavior we want in a separate Unpacker class that automatically eats the data it reads and stores the results in an internal Hash object, to map the name ‘len’ or ‘cost’ to the data. I also combined the format string and the resulting variable so we can clearly see the relationships. The result looks like this:

u = Unpacker.new(data)
u.u! "a1        => unused
      N         => len"
u.u! "a#{u.len} => unused
      n         => cost"

Now we can clearly see which values are ignored, which are given meaningful names, and how the format codes relate to the meaning of the data. Changing it to reflect a better understanding of the underlying data will be very easy. Note that the only reason it’s in two statements is to define a value for u.len before we use it - blocks of fixed-length data can be one statement.

The code to implement the Unpacker class is only about 30 lines of Ruby - including the string.unpack!() routine that can be reused separately.

class String
  # Destructive unpack: returns the unpacked values and removes the
  # bytes that were consumed from the front of the string.
  def unpack! format
    array = self.unpack(format + "a*")   # "a*" captures the unread remainder
    self.replace array.pop
    return array
  end
end

class Unpacker < Hash
  attr_reader :data

  def initialize string
    @data = string
    super()   # explicit parens: a bare super would hand string to Hash.new as the default value
  end

  # format string is expected to have whitespace between each
  # "unpackCode=>variableName" pairing (which can have whitespace
  # around the "=>").  u! was picked to be short so it would
  # look nice, and to connote a destructive "unpack!" operation.
  def u! format
    format.gsub(/\s*=>\s*/, '=>').strip.split(/\s+/).each do |segment|
      src, dst = segment.split(/=>/)
      self[dst] = @data.unpack!(src)[0]
    end
  end

  # Hash_with_Attrs - For the simplicity of using either u.len or u['len'],
  # makes a hash appear to have members for each hash entry. Many thanks
  # to Why_ for collecting this handy routine on his RedHanded blog.
  # Note of Caution: 'len' is fine but 'length' would not be since u.length
  # would give the number of entries in the hash, not the just-parsed value.
  def method_missing(meth, *args)
    meth = meth.id2name
    if meth =~ /=$/
      self[meth[0..-2]] = (args.length < 2 ? args[0] : args)
    else
      self[meth]
    end
  end
end

Update: An even cleaner and shorter way would be to implement a DSL as a module so the code above could look like this:

a 1,    :unused
N       :len
a :len, :unused
n       :cost

(and yes, this is valid Ruby code)
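Here is one way such a DSL might be sketched. Everything below (the UnpackDSL module, the Reader class and the unpack_with helper) is a hypothetical illustration rather than the code I actually wrote. Each format code becomes a method, and a Symbol in the count position means “use the value already parsed under that name”.

```ruby
# Hypothetical DSL: one method per unpack format code.
module UnpackDSL
  # a <count>, <name> - read <count> bytes as a string; a Symbol count
  # refers to a value parsed earlier (e.g. a :len, :unused).
  def a(count, name)
    count = @values.fetch(count) if count.is_a?(Symbol)
    take("a#{count}", name)
  end

  def N(name)  # 4 bytes, network order
    take("N", name)
  end

  def n(name)  # 2 bytes, network order
    take("n", name)
  end

  private

  # Unpack one field, keep the unread remainder, store the value by name.
  def take(format, name)
    value, rest = @data.unpack(format + "a*")
    @data = rest
    @values[name] = value
  end
end

class Reader
  include UnpackDSL
  attr_reader :values

  def initialize(data)
    @data = data.dup
    @values = {}
  end
end

# Runs the block with the DSL methods in scope and returns the parsed values.
def unpack_with(data, &block)
  reader = Reader.new(data)
  reader.instance_eval(&block)
  reader.values
end

sample = ["x", 3, "abc", 500].pack("a1Na3n")  # made-up test buffer
result = unpack_with(sample) do
  a 1,    :unused
  N       :len
  a :len, :unused
  n       :cost
end
puts result[:len], result[:cost]
```

The instance_eval call is what lets the block use bare a, N and n; the trade-off is that self inside the block is the Reader, so instance variables from the calling code are not visible there.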

01.29.08

Book Recommendation: The Rails Way

Posted in Uncategorized at 12:53 pm by JohnB

The Rails Way is a Ruby on Rails reference book that I bought on Josh Susser’s recommendation.  I’ve actually, to my family’s dismay, been reading the darn thing instead of just referring to it like one would a, well, reference book.  A lot of Rails’isms that I had a vague idea about I now understand with much more clarity.  It will definitely come in handy soon when I start my new job writing mostly RoR code!

01.25.08

Using Ubuntu

Posted in Uncategorized at 3:18 pm by JohnB

I’d heard that dual-booting Ubuntu Linux was easy, and it really is. I’m now running Ubuntu and it was as easy as various blog posts have said. The longest step in the process was defragmenting the drive before repartitioning with Ubuntu. There are a few issues remaining around using the data on the Windows partition from Linux, but on the whole I’m very happy with the switch.

[Update 1/29/2008: the network is inconsistent.  Upon a boot or un-hibernate it may be completely incapable of finding my router - but then later it is fine. I’ll continue trying to track it down… using the Windows OS!] 

01.14.08

Xkcd Titles

Posted in ruby at 1:26 pm by JohnB

I’ve just noticed the geekily hilarious xkcd comic and one of the funniest aspects is that each comic has a ‘title’ attribute (the text that pops up when you hover your mouse over the image) that is often as funny as the comic itself. However, the length of the title often causes it to be truncated in my browser (Firefox 2.x, which probably has an obscure show-entire-title setting). Rather than arduously do a ‘view source’ on each one (or figure out the Firefox setting), I have Ruby do it for me. And for you if you want:

# xkcd.rb
# extract all the titles from xkcd comics since they
# tend to be too long to fully show in the browser

# USAGE: ruby -r./xkcd.rb -e 'Xkcd.new.show_all'

require 'open-uri'
require 'hpricot'

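class Xkcd
  DOMAIN = 'http://xkcd.com'   # no trailing slash; show adds the separators

  # Print the title (hover text) of every image on the page for comic id.
  def show id = 343  # 343 is the NSA/RSA one
    begin
      # URI.open is the open-uri entry point on current Rubies
      @hp = Hpricot.parse( URI.open( "%s/%d/" % [DOMAIN,id.to_i] ) )
      (@hp / :img).each do |el|
        puts "%4d: %s" % [id.to_i, el[:title]] if el[:title]
      end
    rescue
      # deliberately ignore comics that fail to load (e.g. a missing number)
    end
  end

  def show_all
    0.upto(400) do |i|
      show i
    end
  end
end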

01.12.08

Jumping on the Bandwagon

Posted in musings at 3:06 pm by JohnB

I just have to wonder: who were the first two people to have “died in a blogging accident”?

12.17.07

Another High-Traffic Rails Site: catalogchoice.org

Posted in Uncategorized at 6:12 pm by JohnB

It’s getting a lot of traffic and seems pretty snappy:

http://www.catalogchoice.org/

So yes, Virginia, Ruby and Rails do scale.

11.28.07

The Perception of Scarcity in a Climate of Fear

Posted in musings at 1:56 pm by JohnB

I was playing Blokus today, where competition is driven by the scarcity of space on the game board, and realized that the perception of scarcity is often more prevalent than actual scarcity - and thus we needlessly hobble ourselves by limiting things that are abundant. Similarly, our fear that something might happen to us (crime, identity theft, terrorism, etc. - whatever monsters we see on the evening news) forces us to add locks and protections that mostly just result in making it hard for us to access our own belongings and data and websites.

The context for this discussion is a website (nameless, sorry) that I’m interested in working on. The startup site, yet another type of social network, holds the promise of allowing for some very interesting and powerful interactions - but unnecessarily limits its users as it guards scarce server resources and data security. Furthermore, and I’m going out on a limb here, I suspect that these mis-perceptions are one of the reasons this startup has had difficulty in raising much-needed funds. Some examples:

  • Users are automatically logged out after a few idle minutes, with no option of changing the time period before auto-logout (or choosing “Keep me logged in” for single-user computers). This seems a bit draconian given that there is nothing accessible on the site that couldn’t be gathered in other ways - no bank statements, social security number or mother’s maiden name.
  • A PDF document containing the public profile data for your social circle can be generated for off-line access, but only by a subset of the social circle and only for a short period of time. I think this is intended both for security and to guard scarce resources (such as server time and bandwidth). The former concern is misguided - anyone receiving the PDF can circumvent security by immediately sending it to bad people, which is unstoppable once you provide off-line access. The scarcity of server time or bandwidth can be overcome by delegating it to someone else, such as Amazon’s EC2 or S3 services.
  • New people can be invited to the social circle, but only by a small initial set of users - and those invitations expire relatively quickly. It’s unclear why this decision was made, but I suspect it was due to some perception of scarcity or security. All it appears to do is add yet another unnecessary barrier to entry.

In spite of these issues, and others, I’m still captivated by the underlying ideas that it represents and by what it could become in the future. Hopefully I can rapidly prototype my vision for an improved site and use it as a starting point to land a dream job.

Even More Rapid Development

Posted in Uncategorized at 1:41 pm by JohnB

The success of the Ruby on Rails web framework is somewhat based on its  ability to soothe the pain caused by the not-so-rapid development process of other, so-called “enterprise-ready” frameworks.  But Rails is not the only Ruby web framework, and not the fastest one for initial prototyping(*).  The faster (more rabid?) ones I’ve looked at:

  • Camping.  From the quirky mind of why the lucky stiff (no other name given), it inspires absurdly fast development (and absurdity!).
  • Sinatra.  Some people who have tried Camping have moved on to Sinatra - it has a clean syntax and a simple metaphor (Sinatra attends events) and is supported by a larger team.

It’s hard to imagine what faster development would look like - maybe a web interface for defining Camping or Sinatra event handlers?  Code the app directly from the browser!

(*) Footnote: Note that I use the word “prototype” because that is all I have done with them - I see no reason they couldn’t scale as well as Rails or any other web framework.
