Intriguing Ideas

Some intriguing ideas have been suggested, and even tried out, in this general area. My own efforts include leading a panel discussion ("Making a CASE for Open Source") and writing some columns (e.g., "Silicon Carny", formerly on http://www.sunworld.com). Here are some other ideas that I have found intriguing.

In "The Case for a New Business Model" (Communications of the ACM, August 2000), Philip G. Armour argues that software is (like DNA, brains, hardware, and books) a medium for storing knowledge. Unfortunately, even when the software "works", the domain-specific knowledge it embodies tends to be buried in the source code. I would contend that system metadata is a perfect example of this phenomenon.

The Trove project (http://www.tuxedo.org/~esr/trove) is Eric Raymond's proposal for "an open-source distributed archiving system for use at large software archive sites". The Trove schema (http://www.tuxedo.org/~esr/trove/schema) covers individuals (e.g., author, contact person, maintainer) and packages.

The XML-based Open Software Description Format (OSD; http://www.w3.org/TR/NOTE-OSD.html) is "a vocabulary used for describing software packages and their dependencies for heterogeneous clients". In addition to handling common packaging issues, it deals with the problem of employing "push" technology to update client machines.

The Software Carpentry project (http://software-carpentry.com) is an effort to encourage the development of generic software development tools, using the Python programming language. The project has been running competitions for proposals and has garnered several interesting submissions to date. It divides its problem space into several areas (e.g., Configure, Build, Test, and Track). Although integration of the tools is a project goal, the divided approach to development may make this difficult to achieve.

Assorted package management and system administration tools track the configuration (e.g., file characteristics and/or installed packages) of running systems. In general, however, these are not tied into any overall metadata (let alone documentation) system.

The approaches and goals of the Open Source Metadata Framework (OMF; http://www.ibiblio.org/osrt/omf) are very closely aligned with those of the Meta Project:

"The OMF aims to collect data about Open Source documentation, or metadata, that will be used to describe the documentation. The idea is that the OMF will act as a sophisticated card-catalog type of system for the numerous Open Source documentation projects that exist. The OMF offers a number of advantages over standard card catalog type systems, however. Chief among these is the fact that the OMF has been designed from the ground up to be completely open, standards based, and sharable. We will accomplish this by using pre-defined standards (XML and the Dublin Core description for metadata) and allowing all metadata generated to be accessed by anyone that wants it. Because the metadata itself is to be stored in XML files, anyone should be able to use it."

ScrollKeeper (http://scrollkeeper.sourceforge.net) is based on OMF and embodies many of the same ideas as Meta:

"ScrollKeeper is a cataloging system for documentation on open systems. It manages documentation metadata (as specified by the Open Source Metadata Framework (OMF)) and provides a simple API to allow help browsers to find, sort, and search the document catalog. It will also be able to communicate with catalog servers on the Net to search for documents which are not on the local system."

The key differences between these projects (OMF and ScrollKeeper) and the Meta Project lie in project scope. Whereas OMF and ScrollKeeper are limited to documentation (with a nod towards high-level "package metadata"), Meta also concerns itself with low-level details (e.g., programs, files, and directories). With luck, however, Meta will be able to incorporate their work, using it to solve many of the gnarly problems on the "documentation" front.

Apple Innovations

Apple (http://www.apple.com) has released a variety of Eclectic Systems (e.g., A/UX, Darwin, Mac OS X), introducing some interesting ideas along the way. Although some of these ideas are tuned to "Mac OS" needs and may seem foreign to "traditional" Unix practices, they are still worthy of examination.

A/UX systems contained an annotated list (/FILES), describing every file and directory on a "vanilla" system. In addition, a "flat file data base" contained the initial characteristics (e.g., checksum, size, permissions) and acceptable variations (e.g., "must not get smaller") for all files. A system utility was able to use this data base to perform emergency replacement of damaged system files.
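To make the "acceptable variations" idea concrete, here is a minimal sketch of how a baseline record might be checked against a live file. It is not the A/UX format or utility; the FileRecord fields, the rule names ("fixed", "may_change", "must_not_shrink"), and the use of SHA-256 are all assumptions chosen for illustration.

```python
import hashlib
import os
import stat
from dataclasses import dataclass


@dataclass
class FileRecord:
    """Hypothetical baseline record; field and rule names are illustrative."""
    path: str
    sha256: str          # checksum recorded at install time
    size: int            # size in bytes recorded at install time
    mode: int            # permission bits, e.g. 0o644
    rule: str = "fixed"  # "fixed", "may_change", or "must_not_shrink"


def checksum(path):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check(record):
    """Return a list of problems for one file (empty if it is acceptable)."""
    try:
        info = os.stat(record.path)
    except FileNotFoundError:
        return ["missing"]
    problems = []
    if stat.S_IMODE(info.st_mode) != record.mode:
        problems.append("permissions changed")
    if record.rule == "must_not_shrink" and info.st_size < record.size:
        problems.append("file got smaller")
    elif record.rule == "fixed":
        # Compare sizes first; only read the file when the sizes agree.
        if info.st_size != record.size:
            problems.append("size changed")
        elif checksum(record.path) != record.sha256:
            problems.append("checksum changed")
    return problems
```

A repair tool along these lines would presumably run check() over every record and replace only the files that come back with a non-empty list. A log file could carry the "must_not_shrink" rule, so that normal growth is not reported as damage.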

The Commando application, implemented in both A/UX and MPW (Macintosh Programmer's Workshop), is a GUI-based tool that allows users to investigate and experiment with command-line options. In the A/UX version, Commando's control files were "compiled" from each application's "man" pages.

Mac OS X, borrowing from NeXTSTEP and OpenStep, bundles applications or documents (including related files) as (generally opaque) directories. Thus, a user might see Foo as an icon for an application, but the developer knows that Foo is a directory containing a program, language resources, etc.

This idea, a variation of the Mac OS "Resource Fork", keeps related sets of files together, rather than spreading them all over the file tree (e.g., in bin, lib, and man directories). This has obvious benefits in installation and removal, but it also provides a convenient framework for handling complex issues such as internationalization and localization.
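As a rough illustration of why the bundle layout helps with localization, the following sketch looks up a resource inside a bundle directory, trying the user's preferred languages in order. The Contents/Resources/<language>.lproj layout is modeled loosely on the Mac OS X convention, but the function name, arguments, and fallback rule are my own assumptions, not Apple's API.

```python
from pathlib import Path


def find_resource(bundle, name, languages):
    """Return the first localized copy of `name`, else the unlocalized copy,
    else None. The directory layout here is assumed for illustration."""
    resources = Path(bundle) / "Contents" / "Resources"
    for language in languages:
        candidate = resources / f"{language}.lproj" / name
        if candidate.is_file():
            return candidate
    fallback = resources / name
    return fallback if fallback.is_file() else None
```

For example, find_resource("Foo.app", "Help.txt", ["de", "fr"]) would return the French copy if no German one exists. Because everything lives under one directory, adding a language means adding one subdirectory, and removing the application means removing one tree.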

Mac OS X also uses an interesting packaging scheme, with a rather complex format. A package contains a file archive, some metadata, pre/post install scripts, and an indexed database of file information (POSIX attributes, hard link information, Mach-O architectural information, etc.). The database helps drive installation and can also be used to find out which components need to be upgraded in the future and which components have been changed since install time.
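The two questions such a database can answer (what has changed since install, and what needs upgrading) reduce to simple comparisons. The sketch below assumes each database can be flattened into a mapping from path to checksum; that representation, and the plan_upgrade name, are hypothetical and say nothing about Apple's actual format.

```python
def plan_upgrade(installed, available, current):
    """Compare three path-to-checksum mappings (a hypothetical format).

    installed: recorded when the package was installed
    available: taken from the newer package
    current:   computed from the files now on disk
    """
    # Files whose on-disk state no longer matches the install-time record
    # (this includes files that have since been deleted).
    changed_locally = sorted(
        path for path, digest in installed.items()
        if current.get(path) != digest
    )
    # Files that are new in, or differ in, the newer package.
    needs_upgrade = sorted(
        path for path, digest in available.items()
        if installed.get(path) != digest
    )
    # Files in both lists deserve care: an upgrade would overwrite local changes.
    conflicts = sorted(set(changed_locally) & set(needs_upgrade))
    return changed_locally, needs_upgrade, conflicts
```

The third list is the interesting one for an administrator, since it names the files where a blind upgrade would discard local modifications.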

Apple's Mac OS uses a variety of metadata-based file management techniques (e.g., the Resource Fork, the Desktop Database). It appears that both Mac OS X and Eazel's Nautilus (http://www.eazel.com) will continue and expand on this work.

-- Main.RichMorin - 16 Jun 2003
Topic revision: r8 - 08 Jun 2003, WikiGuest