2006-11-11

We all live in a Yellow Submarine

Anyone who has read my previous posts, here and on ruby-talk, knows I love to explore ideas. It doesn't matter if they are considered "good" or "bad". Most cannot be judged without first expounding on them anyway. In fact, half the time I have no idea where an idea might lead until I sit down and blog it.

Tonight's idea comes by way of frustration with project organization, specifically module namespaces. A minor issue for small projects; large projects on the other hand... Well, consider my current problem. I have a class called Project. Now related to Project are a number of modularized tool sets. Each tool set can be used independently of the Project class, but typically will be used via it. So where do I locate the tool sets? My first instinct is to go ahead and put them in the class itself.


module MyApp
  class Project
    module ToolSetA; end
    module ToolSetB; end
  end
end


But I find this less than optimal. While it may be a small and unlikely matter, a class is not a bona fide namespace --it is a class. And while it depends on these tool sets, the tool sets do not necessarily depend on it. As such we would never be able to include these tool sets in another namespace --such as the toplevel, if it struck my or some user's fancy. So we are left then to use some alternate organization.


class Project; end
module ProjectToolsets
  module ToolSetA; end
  module ToolSetB; end
end


or as a compromise


class Project; end
module Toolsets
  module ToolSetA; end
  module ToolSetB; end
end


The downside here, of course, is the long-winded names. But a better solution eludes me, other than one possibility: the use of all capitals for pure namespace modules.


class Project; end
module PROJECT
  module ToolSetA; end
  module ToolSetB; end
end


It's a bit strange in appearance but it works well. One quirk however is that this new rule begs for my project's toplevel namespace to be all caps too. Do I want to go there?


module MYAPP
  class Project; end
  module PROJECT
    module ToolSetA; end
    module ToolSetB; end
  end
end


I'm not sure it's the solution I'm after, but to its merit, it does draw a nice distinction between namespace modules and other modules and classes.
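One way to read the concern about using a class as a namespace: Ruby's include accepts only modules, so a class can never be mixed into another namespace to expose the constants nested inside it. The sketch below contrasts the two layouts. Workbench is a hypothetical host module added only for illustration, and only one tool set is shown.

```ruby
module MyApp
  # Layout 1: tool set nested inside the class.
  class Project
    module ToolSetA; end
  end

  # Layout 2: tool set nested inside a plain namespace module.
  module Toolsets
    module ToolSetA; end
  end
end

# Hypothetical host namespace that wants the tool sets as its own constants.
module Workbench
  include MyApp::Toolsets
end

# The tool set is now reachable through Workbench.
p Workbench::ToolSetA.equal?(MyApp::Toolsets::ToolSetA)

# Trying the same with the class raises TypeError, because include
# only accepts modules.
begin
  Module.new { include MyApp::Project }
rescue TypeError => e
  puts "cannot include a class: #{e.class}"
end
```

The tool sets in the first layout are still reachable by their full name, MyApp::Project::ToolSetA; what is lost is the ability to pull them into another namespace wholesale.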

In the course of this consideration I began to wonder about the distinction between Class and Module. The difference is almost non-existent in reality. If you peer into the Ruby source code you will find that interoperability between them is purposefully prevented. After all, Class is a subclass of Module. Yet they are made distinct for a good reason: they represent conceptually different ideas. A class represents a data archetype; a module represents a reusable component. In fact, one could easily argue that Module itself could use an additional distinction between Namespace and Mixin. Even so, I could not help but wonder if it might yet be possible to have a single Encapsulation, relegating the differences to the elements within them instead of the encapsulation types themselves. I imagined this:


class Something
  def x
  def y
  mod_def a
  mod_def x
  class SomethingElse


Instantiating via Something.new or subclassing would provide the instance methods x and y. Including, however, would provide the mod_def methods a and x instead, along with adding SomethingElse to the including namespace. More refined means of controlling namespace become possible. For instance include_constants could limit inclusion to constants only; vice versa with include_methods. Methods could be defined as both instance and module methods. And while we're at it, throw in a class_def as an alternative to def self.x.

It's interesting. I've often thought about the idea of eliminating the distinction between Class and Module. This is the first time it's occurred to me that it could be done while retaining the utility of that distinction by passing responsibility down to the methods themselves.
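A rough approximation of the idea is possible in plain Ruby, under some loud assumptions: ModDef and its mixin method are hypothetical names invented here, mod_def takes a block rather than being real syntax, and the includer has to write include Something.mixin instead of include Something. Nested constants such as SomethingElse are not carried over in this sketch.

```ruby
# Hypothetical helper: gives a class a companion module that collects
# the methods meant for including, kept apart from its instance methods.
module ModDef
  # The companion module, created on first use.
  def mixin
    @mixin ||= Module.new
  end

  # Define a method that is handed out on inclusion, not on instantiation.
  def mod_def(name, &body)
    mixin.send(:define_method, name, &body)
  end
end

class Something
  extend ModDef

  def x; :instance_x; end
  def y; :instance_y; end

  mod_def(:a) { :module_a }
  mod_def(:x) { :module_x }
end

class Host
  include Something.mixin
end

p Something.new.x                 # instance flavour of x
p Host.new.x                      # included flavour of x
p Host.new.a
p Something.new.respond_to?(:a)   # mod_def methods stay off instances
```

The two flavours of x live side by side in one class body, which is the essence of pushing the class/module distinction down to the methods.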

I suppose now the question is, what are the downsides to this? That'll require further consideration, but one clear point is that methods are less cleanly divided. You could have module methods scattered about your class definitions, weaving in and out of your instance methods. I suspect we would make an effort to nicely organize them however. Besides it means having fewer modules to name --and I'm all for anything that reduces the number of names I have to make up.

It would be interesting to see how far one could go in implementing this in pure Ruby. Some details of Ruby will hold back a perfect implementation, but the essence of it is certainly possible. For starters, here's a neat trick for doing without the distinction between class and module.


class Module
  def new
    mod = self
    Class.new{ include mod }.new
  end
end
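To see the trick in use, here is a small sketch. Greeter is a hypothetical module, and forwarding the arguments and block to the anonymous class is an addition of mine, not part of the snippet above. Classes are unaffected because Class defines its own new, which is found before the one added to Module.

```ruby
# The trick from the post, with arguments forwarded (an addition).
class Module
  def new(*args, &block)
    mod = self
    Class.new { include mod }.new(*args, &block)
  end
end

# Hypothetical module used only to exercise the trick.
module Greeter
  def greet
    "hello"
  end
end

greeter = Greeter.new          # instance of an anonymous class
puts greeter.greet
p greeter.is_a?(Greeter)

# Ordinary classes still go through Class#new.
p String.new("abc")
```

Each call to Greeter.new builds a fresh anonymous class, so two instances of the same module do not share a class. That is one of the details that keeps this from being a perfect implementation.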


Have fun! Unfortunately I'm not. I'm still stuck on the aforementioned namespace issue! Oh well. Back to the coding board...

2006-11-10

Separation of Church and State

Have you ever had a class so chock full of interrelated data and function members that you had trouble avoiding name clashes between the two? Of course it's a rare problem when you're in full control of the members, but when you're designing extensible classes, it becomes a major issue and you have to resort to some less-than-lovely workaround.

Let me give you a simple scenario.


class Package
  # Release date
  #
  attr :release

  # Release the package.
  #
  def release
    puts "Telling the world on #{@release}..."
    # ...
  end
end


The issue here is clear. On one hand, we want to use release as a noun to represent the date of release. On the other, we want to use it as a verb for releasing the package to the world. Of course, under completely isolated circumstances we could just change one of the names and deal. But when we are working on the basis of extensibility, where these and additional data or functional members may be added readily, say via a plug-in system, then a solution is not as simple.
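A quick check of what Ruby actually does with such a class may help. In this sketch the constructor is a hypothetical addition so there is a date to read, and the method returns its message instead of printing it. The later def quietly replaces the reader, so the date is no longer reachable from outside.

```ruby
class Package
  attr_reader :release   # the noun: release date

  # Hypothetical constructor, added so there is a date to read.
  def initialize(date)
    @release = date
  end

  # The verb: this definition replaces the reader generated above.
  def release
    "Telling the world on #{@release}..."
  end
end

pkg = Package.new("2006-11-11")
puts pkg.release   # the announcement, not the date
```

Ruby only mentions the redefinition when warnings are enabled, so in a plug-in setting the clash can easily go unnoticed.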

So what can we do? The bottom line is that in some way or another the two member types must be distinguished from one another.

One could transform one set of the members with a slightly different name via some uniform convention. For instance, all data members could start with "my_", so release as a date would be my_release. Ruby actually makes this a bit nicer in that we can append '?' or '!' to method names. A fair solution might then be:


def release?
  @release
end


or


def release!
  puts "Telling the world on #{release?}..."
  #...
end


It's not a perfect solution however, especially as a matter of convention. It goes against the grain. '?' typically indicates a true/false query. And '!' indicates in place or caution. Consider how others will "smell" your code when they see a question mark for every reference to a data member.

The other more traditional solution is to use delegation. In this case we make a separate class for either or both of the member types. For instance:


class Package

  class Data
    attr :release
  end

  attr :data

  def initialize
    @data = Data.new
  end

  def release
    puts "Telling the world on #{data.release}..."
    #...
  end

end


Albeit a bit longer, it works very well. Delegation is a powerful tool. One could even emulate the former solution via method_missing, trapping method calls that end in '?' and rerouting them to @data. Another advantage is that we can readily pass around the data independent of the function members. On the flip side however, we are relegated to this special data.member interface, and likewise any reverse access by the data members to the functional members, if ever needed, would require us to also pass a reference to the Package instance into the Data instance.
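The method_missing emulation mentioned above might be sketched roughly as follows. The constructor argument, the attr_accessor on Data, and the respond_to_missing? companion are assumptions added for the sake of a runnable example, and release returns its message rather than printing it.

```ruby
class Package
  class Data
    attr_accessor :release
  end

  attr_reader :data

  # Hypothetical constructor that seeds the release date.
  def initialize(release = nil)
    @data = Data.new
    @data.release = release
  end

  # The verb; it reads the date through the '?' route.
  def release
    "Telling the world on #{release?}..."
  end

  # Reroute calls ending in '?' to the matching reader on @data.
  def method_missing(name, *args, &block)
    key = name.to_s
    if key.end_with?("?") && @data.respond_to?(key.chomp("?"))
      @data.public_send(key.chomp("?"), *args, &block)
    else
      super
    end
  end

  # Keep respond_to? consistent with method_missing.
  def respond_to_missing?(name, include_private = false)
    key = name.to_s
    (key.end_with?("?") && @data.respond_to?(key.chomp("?"))) || super
  end
end

pkg = Package.new("2006-11-10")
puts pkg.release?   # the date, fetched from @data
puts pkg.release    # the announcement
```

Unknown names still fall through to super, so a typo such as pkg.relese? raises NoMethodError as usual.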

In considering all this of course, it becomes apparent that Ruby already has a means of distinguishing data members from functional members via instance variables. Clearly @release references the date. But Ruby does not give us the power to treat "instance members" publicly or programmatically. We can't, for instance, use project.@release to access the release date. Nor can we wrap data members in order to massage their data, say:


def @release
  super || Time.now
end
public :@release


I'm sure many readers will find such a notion simply god-awful. But I think careful consideration at least warrants the fair question: "Is a distinct separation between data and functional members useful?" The mere existence of instance variables indicates that the distinction is in fact useful. In contrast, data members could have been made indistinguishable from functional members, or local variable persistence could be used in their stead. So if the distinction is useful, why hide public access to data members behind functional members acting as mini-delegates?

To be a bit more pragmatic, how would a solution to our example pan out if data members were in fact accessible? Interestingly it could look exactly like the original example. Public access to the release date however would simply come via project.@release or preferably even project@release. And there would be no need for any name (mis)conventions or special-interface delegation.
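For comparison, Ruby does offer reflective access to instance variables through instance_variable_get and instance_variable_set, though it is far wordier than the project.@release imagined here and reads more like a back door than a public interface. A minimal sketch, assuming a hypothetical constructor that stores the date and a release method that returns its message:

```ruby
class Package
  # Hypothetical constructor, added so there is a date to read.
  def initialize(date = nil)
    @release = date
  end

  # The verb keeps the plain name; no reader is defined for the noun.
  def release
    "Telling the world on #{@release}..."
  end
end

project = Package.new("2006-11-10")

# Reflective stand-in for the imagined project.@release
puts project.instance_variable_get(:@release)

# The functional member is untouched by the data access.
puts project.release
```

This gets at the data without a name clash, but it offers no hook for wrapping the value the way the imagined def @release would.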

Of course let's be honest here. '@' itself is the Special Delegate of State to the Ruby "Church". Too bad he's only allowed to preach to the choir.