The Hackerlab at regexps.com

Inventory Ids for Source


Caution: Steep Learning Curve: As in the previous chapter, the concepts and commands introduced here are likely to be unfamiliar to you, even if you have used other revision control systems. Once you "get it", though, this will seem quite natural. Best of all, this is the last tricky step before we can start storing project trees in an archive.

Looks Like Source vs Really is Source

In the previous chapter, we saw how to find out which files look like source according to the naming conventions:

        % tla inventory --names --source
        hw.c
        main.c

In this chapter, there's a new distinction: files which look like source according to their names vs. files which really are source.

When you save your project tree in an archive, arch will store the files that really are source and ignore the rest. We can ask which files really are source by dropping the --names option to inventory:

        % tla inventory --source
        [no output]

It's a little more interesting if we include arch's own "system files and directories" in the listing:

        % tla inventory --source --all --both
        {arch}
        {arch}/.arch-project-tree
        {arch}/=tagging-method
        {arch}/hello-world
        [....]

but the thing to note here is that hw.c and main.c aren't listed. Arch thinks they are source in name only. The next section gives a recipe to fix that, and the sections after that explain what's really going on.

The add Command

We can tell arch that our files really are source, and should really be archived with the project, using the tla add command:

        % tla add hw.c
        % tla add main.c

And now we get a better answer from:

        % tla inventory --source
        hw.c
        main.c

A related command is tla delete:

        % tla delete hw.c

That doesn't delete the file hw.c itself:

        % ls
        hw.c            hw.c.~1~        main.c          {arch}

but it does remove it from the official list of source:

        % tla inventory --source
        main.c

For the sake of the examples, we need to put hw.c back in the list:

        % tla add hw.c

        % tla inventory --source
        hw.c
        main.c

Let's take a deeper look at what's going on when you tla add files:

Two Names for Every File

In the arch world, every source file (and directory) in your project tree has two names: a file path and an inventory id.

The file path of a file is the relative path to the file from the root of the project tree. It describes where within a source tree a file is located.

The inventory id of a file is a (mostly) arbitrary string that is unique to the file within the tree. The inventory id remains constant even if a file is renamed. So while the file path says where a file is located, the inventory id says which file it is that's stored at that path.

The purpose of tla add is to assign an inventory id to a file.

In our example, we can examine the ids:

        % tla inventory --source --ids
        hw.c    x_very_long_string
        main.c  x_another_very_long_string
        ^^^^    ^^^^^^^^^^^^^^^^^^^^^^^^^^
         |                |
         |           inventory ids
     file paths

Ordinarily, when a file is moved, its file path changes, but its inventory id should remain the same. The tla move command helps with this. Suppose that we:

        % mv hw.c hello.c

we should follow that with:

        % tla move hw.c hello.c

after which:

        % tla inventory --source --ids
        hello.c   x_very_long_string
        main.c    x_another_very_long_string

Note that hello.c has the same inventory id that hw.c used to.
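The bookkeeping behind those listings can be pictured with a small model. The Python sketch below is only an illustration, not how arch is implemented: the Inventory class and its method names are hypothetical, and the ids are the placeholder strings used in this chapter.

```python
class Inventory:
    """Toy model of a project tree's inventory: file path -> inventory id."""

    def __init__(self):
        self.ids = {}  # maps file path to inventory id

    def add(self, path, inventory_id):
        # Analogous to `tla add`: give the file at `path` an id.
        if path in self.ids:
            raise ValueError(f"{path} already has an id")
        self.ids[path] = inventory_id

    def move(self, old_path, new_path):
        # Analogous to `tla move`: the path changes, the id travels with it.
        self.ids[new_path] = self.ids.pop(old_path)

    def delete(self, path):
        # Analogous to `tla delete`: forget the id; the file is untouched.
        del self.ids[path]


inv = Inventory()
inv.add("hw.c", "x_very_long_string")
inv.add("main.c", "x_another_very_long_string")
inv.move("hw.c", "hello.c")

for path, inventory_id in sorted(inv.ids.items()):
    print(path, inventory_id)
```

In this model, as in the listing above, the path is the only thing that changes on a move; the id stays attached to the same logical file.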

We'll come back to the topic of renames later so, for now, let's put things back where they started:

        % mv hello.c hw.c
        % tla move hello.c hw.c

Quick Aside -- Adding Directories

The tla add command applies to directories, too. If we were to create a new subdirectory in the tree, we should tla add it:

        % mkdir docs

        % tla inventory --names --source --both
        docs
        hw.c
        main.c

but

        % tla inventory --source --both
        hw.c
        main.c

unless

        % tla add docs

and then

        % tla inventory --source --both
        docs
        hw.c
        main.c

But again, for the sake of our example, we don't need docs. We can just:

        % rm -rf docs

There isn't a need to tla delete a directory that we physically remove.

How it Works -- tla add

What tla add does is fairly simple. Note that when we added hw.c and main.c, a new directory was created:

        % ls -a
        .               .arch-ids       hw.c.~1~        {arch}
        ..              hw.c            main.c

The .arch-ids directory is new:

        % ls .arch-ids
        hw.c.id         main.c.id

        % cat .arch-ids/hw.c.id
        very long string

The *.id files are where the raw data that determines a file's id is stored. The command tla delete removes those files. The command tla move renames them.

The id for a directory is stored slightly differently. For example, when we created a docs subdir and gave it an id with tla add, that created a file docs/.arch-ids/=id.
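The file layout described in this section can be imitated in a few lines of Python. This is a simplified sketch, not arch's implementation: the function names (id_file, sketch_add, sketch_delete, sketch_move) are hypothetical, and the contents written to each id file are just a random unique string, which is an assumption rather than the format tla really uses.

```python
import os
import tempfile
import uuid


def id_file(path):
    """Where this sketch keeps the id for `path`."""
    if os.path.isdir(path):
        # Directory: the id lives inside it, e.g. docs/.arch-ids/=id
        return os.path.join(path, ".arch-ids", "=id")
    parent, name = os.path.split(path)
    # Regular file: the id lives beside it, e.g. .arch-ids/hw.c.id
    return os.path.join(parent, ".arch-ids", name + ".id")


def sketch_add(path):
    target = id_file(path)
    os.makedirs(os.path.dirname(target), exist_ok=True)
    with open(target, "w") as f:
        f.write(uuid.uuid4().hex + "\n")  # assumption: any unique string


def sketch_delete(path):
    os.remove(id_file(path))  # only the id file goes; `path` itself stays


def sketch_move(old, new):
    # Files only; run after the file itself has already been renamed.
    os.makedirs(os.path.dirname(id_file(new)), exist_ok=True)
    os.rename(id_file(old), id_file(new))


def read_id(path):
    with open(id_file(path)) as f:
        return f.read()


root = tempfile.mkdtemp()
hw = os.path.join(root, "hw.c")
hello = os.path.join(root, "hello.c")
docs = os.path.join(root, "docs")

open(hw, "w").close()
sketch_add(hw)                # like `tla add hw.c`
original_id = read_id(hw)

os.rename(hw, hello)          # like `mv hw.c hello.c`
sketch_move(hw, hello)        # like `tla move hw.c hello.c`

os.mkdir(docs)
sketch_add(docs)              # like `tla add docs`

print(sorted(os.listdir(os.path.join(root, ".arch-ids"))))
print(read_id(hello) == original_id)
```

Because the id is just a small file sitting next to the source file, renaming that small file is all it takes for the id to follow a renamed source file.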

Keeping Things Neat and Tidy

The command:

        % tla tree-lint

is useful for keeping things neat and tidy.

tree-lint reports any ids for which the corresponding file does not exist, and any files that pass the naming conventions but have no explicit id.

It will also warn you about files that don't fit the naming conventions.
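The three checks just described can be sketched for a single directory as follows. This is a rough model only: sketch_lint and LOOKS_LIKE_SOURCE are hypothetical names, the regular expression is a stand-in for the real naming conventions covered in the previous chapter, and the real tree-lint examines the whole tree and knows about more categories of files than this does.

```python
import os
import re
import tempfile

# Assumption: a crude stand-in for the source naming conventions.
LOOKS_LIKE_SOURCE = re.compile(r"^[A-Za-z0-9][A-Za-z0-9_.+-]*$")


def sketch_lint(directory):
    """Report id/file mismatches for the regular files in one directory."""
    files = {
        name for name in os.listdir(directory)
        if os.path.isfile(os.path.join(directory, name))
    }
    ids_dir = os.path.join(directory, ".arch-ids")
    id_names = set()
    if os.path.isdir(ids_dir):
        # hw.c.id records the id for hw.c
        id_names = {n[:-3] for n in os.listdir(ids_dir) if n.endswith(".id")}
    return {
        "ids_without_files": sorted(id_names - files),
        "files_without_ids": sorted(
            n for n in files - id_names if LOOKS_LIKE_SOURCE.match(n)
        ),
        "unrecognized_names": sorted(
            n for n in files if not LOOKS_LIKE_SOURCE.match(n)
        ),
    }


root = tempfile.mkdtemp()
os.mkdir(os.path.join(root, ".arch-ids"))
for name in ("hw.c", "main.c", "odd name!"):
    open(os.path.join(root, name), "w").close()
for name in ("hw.c.id", "gone.c.id"):       # gone.c itself does not exist
    open(os.path.join(root, ".arch-ids", name), "w").close()

report = sketch_lint(root)
for problem, names in report.items():
    print(problem, names)
```

Here hw.c is clean, main.c looks like source but has no id, gone.c has an id but no file, and the last file does not fit the stand-in naming pattern.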

Inventory Ids -- There's More Than One Way to Do It

In this chapter, you've learned about the basic commands add, move, and delete.

Managing inventory ids with those commands was chosen as the default behavior because, superficially at least, they resemble similar commands in systems such as CVS, with which many users are already familiar.

There are other ways to manage inventory ids. Sometimes the other ways are more convenient. A later chapter discusses these other techniques.

Why is it Like This -- The Purpose of Inventory Ids

As you'll see in later chapters, arch is good at managing changes made to source trees and the files they contain, and good at telling you about the history of trees and files.

As an example, let's suppose that Alice and Bob are both working on the hello-world project. In her tree, Alice makes some changes to hw.c. In his tree, Bob renames hw.c to hello.c.

At some point it is necessary to "sync-up" Alice and Bob. Bob should wind up with the changes Alice has been making. Alice should wind up with the same file renaming that Bob has done.

arch provides many mechanisms for that syncing up -- it's one of the most important things that arch can do -- but nearly all of them boil down to computing and applying changesets.

Alice can ask arch to create a changeset describing the work she's done, and that changeset will describe the changes she made within hw.c. Bob can create a changeset and that changeset will describe the file renaming he did.

If Alice applies Bob's changeset to her tree, her copy of hw.c should be renamed hello.c. But a trickier case is this: What happens if Bob applies Alice's changeset to his tree?

Alice changed a file named ./hw.c, but in Bob's tree, those same changes should be made to a file named ./hello.c. Fortunately, both files have the same inventory id:


        file path               inventory id
        ---------               -------------

                 Alice's tree:
        ./hw.c                  x_very_long_string
                                                  \ 
                                                   - the same long string
                 Bob's tree:                      /
        ./hello.c               x_very_long_string


In Alice's changeset, the changes Alice made are described as being made to the file whose id is x_very_long_string.

Therefore, when applying that changeset to Bob's tree, arch knows to apply the changes to the file with that same id; it knows to apply the changes to his ./hello.c.

That example illustrates what inventory ids are for: they allow arch to describe the changes made to a tree in terms of the logical identity of files rather than their physical location. There are many more complicated examples of how inventory ids come into play, but now you've seen at least the basic point.
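The Alice and Bob example can be modeled in a few lines. The sketch below is a deliberate simplification and makes several assumptions: apply_changeset is a hypothetical function, a tree is a plain dictionary, and an "edit" replaces a file's whole contents, whereas real arch changesets describe patches to files. What it does show faithfully is the key idea that both renames and edits are looked up by inventory id rather than by path.

```python
def apply_changeset(tree, changeset):
    """Apply a toy changeset to a toy tree.

    tree:      {file path: (inventory id, contents)}
    changeset: {"renames": {inventory id: new path},
                "edits":   {inventory id: new contents}}
    """
    result = {}
    for path, (inv_id, contents) in tree.items():
        # Both lookups use the inventory id, never the file path.
        new_path = changeset["renames"].get(inv_id, path)
        new_contents = changeset["edits"].get(inv_id, contents)
        result[new_path] = (inv_id, new_contents)
    return result


HW_ID = "x_very_long_string"
base = {"./hw.c": (HW_ID, "original text")}

# Alice edits the file; Bob renames it.
alice_changeset = {"renames": {}, "edits": {HW_ID: "Alice's improved text"}}
bob_changeset = {"renames": {HW_ID: "./hello.c"}, "edits": {}}

alice_tree = apply_changeset(base, alice_changeset)
bob_tree = apply_changeset(base, bob_changeset)

# Syncing up in either direction gives the same result.
bob_synced = apply_changeset(bob_tree, alice_changeset)
alice_synced = apply_changeset(alice_tree, bob_changeset)
print(bob_synced)
print(alice_synced == bob_synced)
```

Had the changesets been keyed by path instead, Alice's edit to ./hw.c would have found nothing to modify in Bob's tree.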

Why is it Like This -- Why tla move Doesn't Move Files

Why doesn't tla delete delete the file being removed from the source category, or tla move rename it?

Those commands work as they do so that you can adjust the ids in a tree even if some other tool which knows nothing about arch has rearranged files. For example, if you use a "directory editor" to rename source files, tla move is available to catch up with the changes the directory editor made.

Sometimes, arch users request the addition of commands (tla mv, tla mkdir, tla rmdir, and tla rm) that would modify both ids and the corresponding source files. That's a great idea and it's not all that hard: so, if you're looking for something to do, that's a good real-world programming project on which to try out and learn arch. Let us know on the gnu-arch-users mailing list if you do this, so that we can consider merging your changes into the distribution.

arch Meets hello-world: A Tutorial Introduction to The arch Revision Control System