Existing Work

A great deal of documentation and package management infrastructure exists for Eclectic Systems, but it does not form a cohesive whole. Here are some representative examples, ranging from "build systems" to books.

Any significant Open Source package will have an automated "build" mechanism, generally supported by some form of make. The GNU configuration and build suite (e.g., autoconf, configure) has been adopted by many developers, standardizing (and greatly simplifying) the build process for most packages on most systems.

Build and Packaging Systems

All OS distributors, of course, have "build" systems for their base distributions. None of these systems provides much support for automated documentation, however (except maybe AntHill).

AntHill (http://www.urbancode.com/projects/anthill/default.jsp) is a tool that ensures a controlled build process and promotes the sharing of knowledge within an organization. AntHill performs a checkout from the source repository of the latest version of a project before every build and tags the repository with a unique build number after every build. It supports many repository adapters, including: CVS (Concurrent Versions System), Visual Source Safe, Perforce, PVCS, StarTeam, MKSIntegrity, and FileSystem. AntHill also automatically updates a project intranet site with artifacts from the latest build. AntHill is an extension to the Apache Ant project and is compatible with versions 1.3, 1.4, and 1.5 of Ant. AntHill is Open Source and is released under a Mozilla-like license.

Cons (http://www.dsmit.com/cons) is a Perl-based replacement for make(1). It claims to have improved dependency analysis, aiding the updating of interrelated packages. It also employs goodies like MD5 cryptographic signatures to "accurately determine whether a given file needs to be rebuilt".
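The signature idea can be illustrated with a minimal Python sketch. This is not Cons itself (which is written in Perl), and the function names and the JSON signature file are assumptions made for this example. It only shows why comparing content signatures can be more accurate than comparing timestamps, as make(1) does: a file that has been touched but not changed does not trigger a rebuild.

```python
import hashlib
import json
from pathlib import Path


def md5_signature(path):
    """Return the hex MD5 digest of a file's contents."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read in chunks so that large files are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def needs_rebuild(sources, signature_file):
    """Return True if any source differs from its recorded signature.

    The signature file is a hypothetical JSON map of path -> MD5 digest,
    written by record_signatures() after the last successful build.
    """
    sig_path = Path(signature_file)
    recorded = json.loads(sig_path.read_text()) if sig_path.exists() else {}
    return any(recorded.get(str(src)) != md5_signature(src) for src in sources)


def record_signatures(sources, signature_file):
    """Store the current signatures, typically after a successful build."""
    current = {str(src): md5_signature(src) for src in sources}
    Path(signature_file).write_text(json.dumps(current, indent=2))
```

A build driver would call needs_rebuild() before running the compiler and record_signatures() afterwards. A real tool would also need to track derived files and the build commands themselves, which this sketch leaves out.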

Debian (http://www.debian.org, http://www.debian.org/distrib/packages), FreeBSD (http://www.freebsd.org, http://www.freebsd.org/ports), and Red Hat (http://www.redhat.com) have each developed useful and popular package management systems; each of these systems supports several thousand packages. The systems differ in assorted respects, but they all support automated installation of packages, in either source or binary form.

Jam (http://perforce.com/jam/doc/jam.paper.html) is a make(1) replacement that is optimized for handling "huge programs", where thousands of files may be involved. Apple (http://www.apple.com) uses Jam in their (Mac OS X) "Project Builder" tool.

The QEF Software Process Automation System (http://www.qef.com) "was developed to handle large scale software engineering projects to be run on many platforms". QEF seems to be oriented toward "doing things the right way from the beginning", but some of the insights are applicable to existing systems. In any event, David Tilbrook's papers (e.g., "Large Scale Porting through Parameterization") cover a number of interesting issues.

Indexing Systems

Most online indexing systems are strongly optimized for interactive use. Provision for automated access is thus quite sketchy, but the situation is getting better. The Comprehensive Perl Archive Network (CPAN; http://cpan.perl.org), for instance, can provide XML responses to search queries.

FileWatcher (http://www.filewatcher.org), Freshmeat (http://www.freshmeat.net), and other indexing systems track thousands of Open Source packages (FileWatcher and Freshmeat each track tens of thousands of packages). These systems contain high-level descriptions, version information, etc. FileWatcher and Freshmeat both provide (non-XML) snapshots of their "backend" databases.

SourceForge (http://sourceforge.net) provides a range of support services, from indexing and CVS access through email and Web presence for Open Source packages. It does not currently attempt, however, to cover the entire range of available packages.

rpmfind (http://rpmfind.net) and rpm2html (http://rpmfind.net/linux/rpm2html) leverage the metadata in Red Hat Package Manager (RPM; http://www.rpm.org) archives, via the XML-based Resource Description Framework (RDF; http://www.w3.org/Metadata).

Documentation

Most popular packages include both user- and programmer-level documentation. Sometimes this is very well written and complete; other times it is not. Even if the documentation is excellent and available in electronic form, however, it may be difficult to browse and search.

Documentation may be encoded in any of a number of formats (e.g., ASCII, HTML, "man" pages, PDF, PostScript, TeX, troff). Tools are typically available for manipulating (e.g., browsing, converting, indexing, and/or searching) all common formats. Not every facility exists for each format, however.

Worse, no single tool handles all formats and no universal "exchange" format has been adopted. As a result, browsing and indexing are only available for limited, disconnected subsets of the available packages and documents.

In addition, users must learn (and remember) how to operate (and perhaps administer) a number of tools. It can thus involve considerable effort to "browse" (i.e., search, examine, print) documents in "unfamiliar" formats. At the very least, this stifles examination of new packages.

Finding out the formats (or even purposes) of the files in a subsystem can be difficult. Little existing documentation concerns itself with individual files, let alone with file relationships. Although the "man" pages of Eclectic Systems document many files, most of these are executables. Some control files are also written up, but few other files receive explicit attention.

Many magazine articles, papers, and books cover Open Source offerings. Even when they are in electronic form, however, their formats (e.g., HTML, PDF, PostScript) do not lend themselves to automated indexing. As a result, they are not well integrated into a documentation "system".

In Summary

A great deal of documentation and metadata exists for Eclectic Systems, covering acquisition, building, installation, modification, and use. Unfortunately, it is not "tied together" into a convenient, well-integrated whole. Assorted problems (e.g., name spaces, file formats, tool differences) complicate maintenance and use of both the packages and their metadata.

On a more global scale, package management and indexing systems do not share information in any formal way. Much of the metadata collected for each package by these systems is common, but no formal mechanisms exist for converting or exchanging this information.
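To make the overlap concrete, here is a rough Python sketch that maps two packaging formats onto one shared record. The field names Package, Version, and Description (Debian control files) and the variables PORTNAME, PORTVERSION, and COMMENT (FreeBSD port Makefiles) are real. The PackageRecord type and both parser functions are hypothetical, written only for this illustration, and the parsers handle just the simplest cases.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PackageRecord:
    """Hypothetical common record for metadata that both systems collect."""
    name: str
    version: str
    summary: str


def parse_debian_control(text):
    """Parse one 'Field: value' stanza; indented continuation lines are skipped."""
    fields = {}
    for line in text.splitlines():
        if line and not line[0].isspace() and ":" in line:
            key, value = line.split(":", 1)
            fields[key.strip()] = value.strip()
    return PackageRecord(
        fields["Package"], fields["Version"], fields.get("Description", "")
    )


def parse_freebsd_makefile(text):
    """Parse simple 'VAR= value' assignments; no variable expansion is done."""
    variables = {}
    for line in text.splitlines():
        if "=" in line and not line.startswith(("#", "\t")):
            key, value = line.split("=", 1)
            # Drop the modifier from operators such as '?=' and '+='.
            variables[key.strip().rstrip("?+:!")] = value.strip()
    return PackageRecord(
        variables["PORTNAME"],
        variables["PORTVERSION"],
        variables.get("COMMENT", ""),
    )


if __name__ == "__main__":
    deb = "Package: hello\nVersion: 2.10-2\nDescription: example package\n Longer text.\n"
    port = "PORTNAME=\thello\nPORTVERSION=\t2.10\nCOMMENT=\tUtility for saying hello\n"
    print(parse_debian_control(deb))
    print(parse_freebsd_makefile(port))
```

Both parsers yield the same three attributes, which is the point: the information is already common, and what is missing is an agreed-upon way to exchange it.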

For example, each FreeBSD "package maintainer" must track (and discover!) package changes, using informal mechanisms (e.g., release notes and email), supplemented by code examination and testing. The resulting knowledge is then buried in FreeBSD package configuration files.

The Unified BSD Package Collection initiative (http://www.openpackages.org) hopes to get all of the BSDs on the same page. While this would be a great improvement, and is a fine short-term goal, it would be far better to get all Eclectic Systems to share their packaging (and other) metadata.

-- Main.RichMorin - 16 Jun 2003
Topic revision: r9 - 08 Jun 2003, WikiGuest