Archive for February 2008

An extensible FARG implementation, Part II

February 28, 2008

[Microsoft Word seems to be messing up my fonts. Apologies.]

The question I explore in this post is how much and what can be shared between CRCC projects. I will try to stay as concrete as I can. In this post, I have managed to talk about a single “simple” aspect.

One of the most obviously sharable aspects of FARGitecture is the codelet system. I am not referring to the types of codelets, but rather to the infrastructure for dealing with codelets. If a good, solid, and reusable framework had been available to me, it would indeed have saved time. My final implementation is only a couple of hundred lines of code, but the effort required and mistakes repeated make it quite costly. Needlessly costly.

What are the petty things needed of this infrastructure? At a minimum, we need a simple interface to add codelets and select codelets. What may become apparent only by implementing a system and grappling with its complexity are other features and tools, like the following (I have not counted these “optional” features in the 200 lines I mentioned):

  • We need the ability to remove stale codelets. When a Codelet is added to the Coderack, it may refer to some structure in the workspace. While the codelet is awaiting its turn to run, this workspace structure may be destroyed. At the very least, we need code to recognize stale codelets to prevent them from running.
  • Writing new codelets should be easy. Most codelets, when written in the programming language being used, share boilerplate code. Tools that allow us to write closer to the domain (and then compile our code down to the programming language) have proved very useful to me. They improve code readability, lower the barrier to experimenting with new codelets, and even improve error messages. Just to illustrate, here is the code for one codelet type:

    CodeletFamily flipReln( $reln ! ) does {
        RUN: {
            my $new_reln = $reln->FlippedVersion() or return;
            $reln->uninsert;
            $new_reln->insert;
        }
    }

    Mind you, I am not claiming this to be hard or undoable by appropriately setting up classes so that the “compiler” is not required. It is not difficult, but it takes careful thought and time. I am just pointing out that something needs to be done to ease codelet writing.

  • Seqsee has the notion of scripts. Some tasks are by their very nature somewhat serial. Like making tea. Or describing a solution. How would a FARGitecture do such tasks? Certainly not by using a single codelet. We do not need a “make tea from scratch” codelet, or a “write dissertation” codelet. Seqsee “solves” this (ha ha) by splitting the task into several codelets that call each other. Painful to write? Yes. Repetitive to code? Yes. Extend the programming language to reduce busy work? Yes. So here is Seqsee’s code for one task (which is a subtask of a bigger task, but we don’t have to care).

    CodeletFamily DescribeRelationCompound( $reln !, $ruleapp ! ) does scripted {
        STEP: {
            my $category = $reln->get_category();
            SCRIPT DescribeRelnCategory, { cat => $category };
        }
        STEP: {
            my $meto_mode = $reln->get_meto_mode();
            my $meto_reln = $reln->get_metonymy_reln();
            SCRIPT DescribeRelnMetoMode,
                {
                    meto_mode => $meto_mode,
                    meto_reln => $meto_reln,
                    ruleapp   => $ruleapp,
                };
        }
    }

    The SCRIPT calls launch other codelets, which may call more scripts, and return. All along, all the laws of codelets hold: they can be removed from the Coderack without being run, several “scripts” may run in parallel, and so forth. Again, being able to focus on the task at hand is a big win.

  • At any stage, several codelets are on the Coderack. Several avenues are being explored in parallel. In order to understand what’s going on, or even to demonstrate what’s happening, we can build tools. These are only related to the codelets, and can be completely domain independent. Here are some such visualizations from Seqsee.
    • Codelets launch other codelets. By logging what codelet added what other codelet when, we can obtain trees of codelet launching. Very useful for debugging. This image is from a tool that allows exploration of such trees, allows searching for specific types of codelets, and also lets us see how long codelets sit on the Coderack.

    • At any stage, there is pressure to do several things. Each codelet is a tiny bit of pressure. The log mentioned above also allows us to create graphs that display the change in pressure to do a particular thing over time.

      The image below shows the pressure to extend groups during a single run.

All tools mentioned here are domain independent. Just writing the system that manages the infrastructure takes little time, but writing tools takes longer. Deciding what tools to build takes longer still, and yet, in the long run, the tools save time. Without them, I may not have been able to write Seqsee.
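The basic interface discussed above (adding codelets, selecting one to run, and recognizing stale ones) might be sketched as follows. This is a minimal illustration in Python rather than Seqsee's Perl. The names Codelet, Coderack, is_fresh, and purge_stale are made up for the example, and urgency-weighted random selection is an assumption about how a codelet gets chosen, not a description of Seqsee's actual code.

```python
import random
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass(eq=False)  # compare codelets by identity, not by field values
class Codelet:
    family: str                 # e.g. "flipReln"
    urgency: float              # assumed positive; used as a selection weight
    run: Callable[[], None]     # the work to do if this codelet is chosen
    # Hypothetical staleness check: True while every workspace structure
    # this codelet refers to still exists.
    is_fresh: Callable[[], bool] = lambda: True


class Coderack:
    def __init__(self, rng: Optional[random.Random] = None) -> None:
        self._codelets: List[Codelet] = []
        self._rng = rng or random.Random()

    def __len__(self) -> int:
        return len(self._codelets)

    def add(self, codelet: Codelet) -> None:
        self._codelets.append(codelet)

    def purge_stale(self) -> int:
        """Drop codelets whose workspace structures are gone; return how many."""
        before = len(self._codelets)
        self._codelets = [c for c in self._codelets if c.is_fresh()]
        return before - len(self._codelets)

    def select(self) -> Optional[Codelet]:
        """Remove and return one fresh codelet, chosen with probability
        proportional to urgency, or None if nothing runnable remains."""
        self.purge_stale()
        if not self._codelets:
            return None
        weights = [c.urgency for c in self._codelets]
        chosen = self._rng.choices(self._codelets, weights=weights, k=1)[0]
        self._codelets.remove(chosen)
        return chosen


# Usage: a codelet goes stale while waiting, so it never runs.
workspace = {"reln-1", "group-7"}
ran: List[str] = []
rack = Coderack(random.Random(0))
rack.add(Codelet("flipReln", 10.0, lambda: ran.append("flip"),
                 lambda: "reln-1" in workspace))
rack.add(Codelet("extendGroup", 1.0, lambda: ran.append("extend"),
                 lambda: "group-7" in workspace))
workspace.discard("reln-1")   # the relation is destroyed before its codelet runs
codelet = rack.select()
if codelet is not None:
    codelet.run()
```

Checking freshness at selection time is only one possible design; a rack could equally purge stale codelets whenever a workspace structure is destroyed.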

Imagine writing a FARGitecture to be like programming a task. Well, don’t imagine, it is programming a task. Except that we design it from scratch: we build our own virtual machine, write our own tools. It is high time we had a shareable toolset, at the very least.

I will end by pointing out that the images I have added above were generated from a log file. This file could have been dumped by a COBOL program, if somebody wanted to write a FARGitecture in that wonderful language. We don’t need to agree on a programming language to share tools.
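To make the point concrete, here is a rough sketch of how such tools could work from a log alone. The log format below is invented for illustration (it is not Seqsee's real format), as are the function names, and "pressure" is taken to be the summed urgency of the codelets of one family currently sitting on the Coderack, which is an assumption.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Hypothetical log format, one event per line:
#   <time> ADD <id> <family> <urgency> <parent id, or "-" if none>
#   <time> RUN <id>
#   <time> EXPIRE <id>        (removed from the Coderack without running)
SAMPLE_LOG = """\
1 ADD c1 extendGroup 5 -
2 ADD c2 flipReln 3 -
3 RUN c1
3 ADD c3 extendGroup 8 c1
4 ADD c4 extendGroup 2 c1
5 EXPIRE c3
6 RUN c4
"""


def parse(log: str) -> List[List[str]]:
    """Split each non-empty log line into whitespace-separated fields."""
    return [line.split() for line in log.splitlines() if line.strip()]


def launch_tree(events: List[List[str]]) -> Dict[str, List[str]]:
    """Map each parent id to the ids of the codelets it added."""
    children: Dict[str, List[str]] = defaultdict(list)
    for ev in events:
        if ev[1] == "ADD":
            children[ev[5]].append(ev[2])
    return dict(children)


def pressure_over_time(events: List[List[str]],
                       family: str) -> List[Tuple[int, float]]:
    """Total urgency of `family` codelets on the rack after each time step."""
    on_rack: Dict[str, float] = {}      # codelet id -> urgency
    series: List[Tuple[int, float]] = []
    for ev in events:
        time, kind, cid = int(ev[0]), ev[1], ev[2]
        if kind == "ADD":
            if ev[3] == family:
                on_rack[cid] = float(ev[4])
        else:                           # RUN or EXPIRE: codelet leaves the rack
            on_rack.pop(cid, None)
        total = float(sum(on_rack.values()))
        if series and series[-1][0] == time:
            series[-1] = (time, total)  # keep one point per time step
        else:
            series.append((time, total))
    return series


events = parse(SAMPLE_LOG)
tree = launch_tree(events)
pressure = pressure_over_time(events, "extendGroup")
```

Nothing here depends on the language that wrote the log; any program that emits lines in an agreed format could feed the same tools.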


An extensible FARG implementation

February 28, 2008

I have been meaning to write this post for a while now. Over the last few years, the idea of a hypothetical library that would allow a new FARG project to be built in a few keystrokes has surfaced repeatedly, often with some urgency, but has not led anywhere thus far. The latest round of discussion was last week, and this post is an attempt to delineate what I believe needs to be done.

These days, I am working on my dissertation. I thought about what I want to say here, and what part of this I can use in the dissertation. I think I can use some parts of this post there.

The Current Problem: Starting from Scratch

Several FARG projects have been created thus far, and are listed in the following table. (FARG Historians and Lore Masters! Your help in filling and correcting this table will be much appreciated.) Programs that have not been completed are marked with an asterisk.

| Program          | Author                | Language    | Year  | Domain                    |
|------------------|-----------------------|-------------|-------|---------------------------|
| Jumbo            | Douglas Hofstadter    | Scheme?     | 1983? | Anagrams                  |
| Seek-Whence      | Marsha Meredith       | ?           | 1986  | Integer pattern sequences |
| Numbo            | Daniel Defays         | ?           | 1986  |                           |
| Copycat          | Melanie Mitchell      | Franz Lisp  | 1990  | Letter analogy puzzles    |
| Tabletop         | Bob French            | ?           | 1992  |                           |
| Letter Spirit    | Gary McGraw           | Scheme?     | 1995  | Grid fonts                |
| Metacat          | Jim Marshall          | Chez Scheme | 1999  | Letter analogy puzzles    |
| Letter Spirit II | John Rehling          | Scheme?     | 2001  | Grid fonts                |
| Phaeaco          | Harry Foundalis       | C++         | 2006  | Bongard Problems          |
| *Seqsee          | Abhijit Mahabal       | Perl        | 2008+ | Integer pattern sequences |
| *Musicat         | Eric Nichols          | C#          | 2008+ |                           |
| *?               | Francisco Lara-Dammer | Java        | 2008+ | Geometry                  |
| Capyblanca       | Alexander Linhares    | Delphi?     | 2007? | Chess                     |

 

Column 3 (“language”) is very diverse. For implementing a complex program, using a powerful language that the lone programmer knows inside out makes sense, especially if there is no code to reuse anyway, as has been the case so far. Other reasons that influence language choice (such as availability of people to hire who are skilled in that language) are simply non-existent in a PhD. Most PhD projects are prototypes, proofs of concept.

Each project has consequently started from scratch (except for the obvious exceptions of Metacat and Letter Spirit II, both of which extended prior projects). Each project takes years to complete, and it is natural to wonder how much effort, if any, could have been saved if it had been possible to reuse some of the code. Michael Roberts, Alexander Linhares, Harry Foundalis, Eric Nichols, and I have again been discussing these issues recently.

My next post will examine (my guess about) what kinds of things may be shared between projects. A subset of what can be shared is the core of FARG architectures, the crux without which the program ceases to be a FARG implementation. I will try to spell out what I think is in the core.

[Edited: In his comment, A. Joseph Hagar pointed out several more projects which were not done here at the CRCC. I have listed these in the table below.]

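| Program                | Author                                                    | Language | Year  | Domain              |
|------------------------|-----------------------------------------------------------|----------|-------|---------------------|
| Fluid Analogies Engine | Scott Bolland                                             | Java     | 2005  | Potentially General |
| *B-Cat                 | Payel Ghosh, Ralf Juengling, Lanfranco Muzi, Mick Thomure | ?        | 2007+ | Bongard Problems    |
| Starcat                | Joseph Lewis et al.                                       | ?        | 2004  | Potentially General |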

All the world in a song

February 28, 2008

From the excellent Strange Maps blog today: a musical score representing a map of the world.

The notion of “expectation”

February 28, 2008

A concept I was thinking of, back when first thinking through the Magnifi* engine, was that of “expectation.”  I’m still finding it useful in my current iteration, and I hope to keep exploring it.

The basic notion is this: In Copycat, where the goal of a run is fixed, the expectation is defined by the actual code.  It’s the fourth letter string in the problem (in abc : abd :: mrrjjj : ?, the ? stands for the expectation) along with the rules used to get there.

In Magnifi*, where the structure of the Workspace is determined by the domain definition (thus not determined by the code in the engine), the expectation is effectively a question mark like the one I used up above.  The expectation is the driving force of a run — it defines what the purpose of a run is.

OK?  So we can see an entire run as consisting of building conceptual structure that will fit into the slot of an expectation in a felicitous way, plus the requirement to “show your work”.  Now consider this: what if I define the specifications of a piece of software in this same way, and set the expectation as the deliverable?  The creative work necessary to derive the conceptual structure to fulfill that expectation should result in, say, a Perl script to the specification I first entered.

That’s where I hope this work is going.  It might take a little while to get there; I’ve been just sitting here thinking about starting for a decade, after all…  Discuss amongst yourselves.

Meeting notes: Feb. 23, 2008

February 23, 2008

Today six or seven of us met at CRCC for a nice discussion. I won’t try to review it all here, but I’ll list a few items of note:

  1. We talked about putting together a demo of lots of the FARG projects and having some sort of gathering (such as an “open house” later this year, say at the start of the fall semester) to celebrate CRCC’s 20th anniversary.
  2. We discussed this blog – everyone seems to agree that this multi-author WordPress system will work fine. If you’re a core FARGonaut who wants to write posts and for some reason hasn’t been added yet, let me know.
  3. I suggested that several of us could draft a paper comparing and contrasting the various FARG projects, as a way of putting new projects into context as well as getting a handle on what features comprise the “core” of the “FARGitecture”. There seemed to be support for this idea, and I’d certainly enjoy helping put together such a report with a couple others in our group.
  4. A large portion of the discussion was on this notion of a “core” – we’ve had plenty of this discussion on the mailing list recently so I won’t repeat it here. Ab suggested listing a bunch of simple domains as “use cases” to help define the common elements. The goal is to write what Matt called the FARG RAGF, or Really Awesome General Framework. Whether or not the RAGF is a good idea is open to plenty of debate, but the idea is, for better or worse, to write a reusable library to facilitate implementation of novel FARG models.
  5. We created a new private Google Group to provide archived FARG discussion as a complement to our local CRCC email list. I brainstormed a list of some features for the RAGF, and I’ll post it there (because it was suggested that such technical discussion may not belong in this blog).

Please let me know if you have any questions or suggestions about our new online FARG features. I’m looking forward to seeing what we start writing here! Just as a word of advice, please remember that this is a public blog and we should all keep things professional and representative of the general mission of CRCC. More speculative discussion (like debate about a hypothetical RAGF) should probably be left to the discussion group.

New typography term

February 23, 2008

Keming: the result of improper kerning.

A bit brief for a proper blog post, but there you go.

Welcome to the FARG Blog

February 22, 2008

Dear friends, this is a quick note to kick off our new FARG Blog. We’re setting it up so that a bunch of us can author our own posts, so we can each write about whatever FARG-related topics we’d like. We’re still planning to use our internal mailing list for crazier discussions that don’t need to be on the public blog, but we’ll use the blog to spread the word about current happenings, new projects underway, etc. We’ll see how it evolves as things get rolling.

For a list of many of the current and past FARG members, associates, etc., please see the list of people on the CRCC website.