MarkEmptyDirs

From Git SCM Wiki
Jump to: navigation, search

OBSOLETE CONTENT

This wiki has been archived and the content is no longer updated. Please visit git-scm.com/doc for up-to-date documentation.

Introduction

GIT as well as other well-known version control systems cannot version directories. In other words, you cannot add empty directories! A "workaround" for this issue is to use placeholder files which are placed into empty directories. These placeholder files can then be committed into the repository and will make sure that, upon checkout, the directory tree is entirely reconstructed. This solution is also suggested in the GitFaq. Note however, that using that workaround might not be a good idea. Creating missing directories during a build process should also be considered as an option.

Sometimes a solution where the missing directories are created by some magic is not practicable and people will face the problem of managing such placeholder files. In particular, the problem with using placeholder files is that you need to create them, and delete them, if they are not necessary anymore (because there were added sub-directories or files). With big source trees managing these placeholder files can be cumbersome and error prone.

In the past, I had been confronted with such a situation several times. This is why I decided to write an open source tool which can manage the creation/deletion of such placeholder files automatically. It creates placeholder files in all empty directories. If later on new files or directories are put into such directories, the placeholder files are not necessary anymore and, thus, are removed automatically.

This tool is licensed under the GPLv3 and runs on Mono .NET platform (available for Linux, MacOS X and Windows) as well as on Microsoft .NET platform (available only for Windows). You can download it here: http://code.google.com/p/markemptydirs.

Use Cases

In the following sections some typical use cases are described and it is shown how the tool can help you getting your work easily done. For clarity, I use directory names with upper case letters and file names with lowercase letters in the following examples.

Create placeholder files

Within a directory `PROJECT` find all "leaf" directories that do not contain any files or sub-directories, and create a placeholder file `.emptydir` within each such "leaf" directory.

For example, let's assume the following source tree:

[PROJECT]
        |
        *--[DIR1]
        |       |
        |       *--[DIR1-1]
        |
        *--[DIR2]
        |       |
        |       *--file1
        |
        *--[DIR3]

A tree with the corresponding placeholder files will then look like this:

[PROJECT]
        |
        *--[DIR1]
        |       |
        |       *--[DIR1-1]
        |                 |
        |                 *--.emptydir
        |
        *--[DIR2]
        |       |
        |       *--file1
        |
        *--[DIR3]
                |
                *--.emptydir

To create the `.emptydir` files use the `MarkEmptyDirs` tool simply like this:

$> MarkEmptyDirs.exe PROJECT

This creates empty `.emptydir` files. If you want the placeholder files to contain special content there are two ways.

  • Provide the content directly on command line:
$> MarkEmptyDirs.exe --text "Do not delete this file." PROJECT
  • Provide a template file:
$> MarkEmptyDirs.exe --file template.txt PROJECT

In both cases you can use template variables in your placeholder content which are evaluated everytime a new placeholder file is to be created. The evaluated results then replace the variables. Currently, the following variables are available:

VARIABLE DESCRIPTION ARGUMENT DESCRIPTION
`§datetime[[<format-pattern>]]§` get UTC time `format-pattern`: C# DateTime format pattern string
`§dir[[<base|cur.abs|cur.rel>]]§` get the base directory or current directory (default is `base`) `base` : base directory -- `cur.abs` : absolute path of current directory -- `cur.rel` : relative path to current directory
`§env:<env-var-name>§` get the value from an environment variable `env-var-name` : the environment variable's name
`§guid§` get a new globally unique identifier
`§lf[[<count>]]§` get a line feed character `count` : integer describing how many linefeeds should be returned
`§placeholder[[<name|fullname>]]§` get the placeholder name (default is `name`) `name` : file name only -- `fullname` : full file path
path|vol>§` get platform specific directory, path, or volume separator `dir` : directory level separator -- `path` : path separator -- `vol` : volume separator
`§sp[[<count>]]§` get a space character `count` : integer describing how many spaces should be returned

So in order to create template files with a unique id and time information you could do:

$> MarkEmptyDirs.exe --text="CREATOR: §env:USER§§lf§ID: §guid§§lf§TIME: §datetime§§lf:2§Do not delete this file.§lf§" PROJECT

A generated placeholder file might then look like this:

CREATOR: jonnydee
ID: 36358168-b347-4758-adaa-63e198809c32
TIME: 28.07.2009 17:20:30

Do not delete this file.

Using placeholder files with unique content allows to detect renames of directories automatically.

Update placeholder files

Let's assume the `PROJECT` directory has undergone some changes. Directories and/or files have been added and/or deleted. This requires to synchronize the corresponding placeholder files. In particular, some now are not needed anymore and new ones may now be necessary.

For instance, have a look at the following tree:

[PROJECT]
        |
        *--[DIR1]
        |       |
        |       *--[DIR1-1]
        |                 |
        |                 *--.emptydir
        |                 |
        |                 *--newfile.txt
        |
        *--[DIR2]
        |
        *--[DIR3]
                |
                *--.emptydir

`DIR1-1` now contains a new file `newfile.txt` and a previously created `.emptydir` file. The former now acts as a placholder and, thus, the latter is not needed anymore. `DIR2` now is empty and requires a placeholder file.

Consequently, after updating the placeholder files the tree should look like this:

[PROJECT]
        |
        *--[DIR1]
        |       |
        |       *--[DIR1-1]
        |                 |
        |                 *--newfile.txt
        |
        *--[DIR2]
        |       |
        |       *--.emptydir
        |
        *--[DIR3]
                |
                *--.emptydir

Using the `MarkEmptyDirs` tool for synchronization one would simply do (again):

$> MarkEmptyDirs.exe PROJECT

Delete placeholder files

There are situations where one wants a clean tree which does not contain any placeholder files. So all `.emptydir` files should be removed.

This can simply be achieved by using the `--clean` option:

$> MarkEmptyDirs.exe --clean PROJECT

List placeholder files

In order to list all placeholder files just execute:

$> MarkEmptyDirs.exe --list PROJECT

Note that this command is simply a shorthand for:

$> MarkEmptyDirs.exe --short --dry-run --clean PROJECT

Execute actions upon creation or deletion of placeholder files

Sometimes you might want to hook into the placeholder creation or deletion process in order to execute certain actions. For example, if you want the placeholder files be added to a Mercurial repository automatically you could tell MarkEmptyDirs to execute a `hg add <placeholder-file>` everytime a new placeholder is created. Likewise, you can tell it to execute a `hg rm -fA <placeholder-file>` everytime a placeholder is to be deleted. The following command line will exactly do this:

$> MarkEmptyDirs.exe --create-hook="hg add §placeholder§" --delete-hook="hg rm -fA §placeholder§" PROJECT

Remove everything except placeholder files

Sometimes, directories previously marked empty using placeholder files should remain empty. For instance, you might want to checkin an empty `bin` directory without the compiled binary files. Let's assume the following directory tree:

[PROJECT]
        |
        *--[DIR1]
        |       |
        |       *--[DIR1-1]
        |                 |
        |                 *--newfile.txt
        |
        *--[DIR2]
        |       |
        |       *--.emptydir
        |       |
        |       *--tempfile1.dll
        |       |
        |       *--[DIR2-1]
        |                 |
        |                 *--tempfile2.obj
        |
        *--[DIR3]
                |
                *--.emptydir
                |
                *--tempfile3.so

In this example, `DIR2` contains not only a placeholder but also `tempfile1.dll` and a subdirectory `DIR2-1` which we want to get rid of. `DIR3` contains `tempfile3.so` which should also be removed. Actually, we want the tree to look like this:

[PROJECT]
        |
        *--[DIR1]
        |       |
        |       *--[DIR1-1]
        |                 |
        |                 *--newfile.txt
        |
        *--[DIR2]
        |       |
        |       *--.emptydir
        |
        *--[DIR3]
                |
                *--.emptydir

So in order to re-empty directories containg placeholder files you can use MarkEmptyDirs like this:

$> MarkEmptyDirs.exe --purge PROJECT

WARNING: Be careful with this command! If you are not sure what the `--purge` command will do execute a `--dry-run` in combination with `--verbose` option first:

$> MarkEmptyDirs.exe --dry-run --verbose --purge PROJECT

If your tree contains symbolic links and `--follow-symlinks` option is switched to `true` I would highly recommend to do this in advance!


Personal tools