git-annex-sim(1) General Commands Manual git-annex-sim(1)
NAME
git-annex-sim - simulate a network of repositories
SYNOPSIS
git annex sim start [my.sim]
git annex sim command
git annex sim show
git annex sim end
git annex sim run my.sim
DESCRIPTION
This command simulates the behavior of git-annex in a network of
repositories, determining which files would reach which repositories
according to the configuration of preferred content, numcopies, trust
level, etc.
The input to the simulation is a sim file, and/or sim commands that are
run after starting it. These are in the form "git annex sim command"
with the command in the same format used in the sim file (see sim
commands list below). For example, "git annex sim step 1" runs the
simulation one step.
The simulation keeps a log as it runs, which contains the entire
simulation input, as well as the actions performed in the simulation,
and the results of the simulation. Use "git-annex sim show" to display
the log. This allows re-running the same simulation later, as well as
analyzing the results of the simulation.
Use "git annex sim end" to finish the simulation, and clean up.
As a convenience, to run a sim from a file, and then stop it, use "git-
annex sim run". If there is a problem running the sim, it will be shown
before it is stopped.
Note that interrupting this command while it is running may leave the
simulation in an inconsistent state. And running multiple sim commands
at the same time can as well, although it is safe to run "git annex sim
visit" while running other sim commands.
THE SIM FILE
This text file is used to configure the simulation and also to report
on the results of the simulation. Each line takes the form of a command
followed by parameters to the command. Lines starting with "#" or "--"
are comments.
Here is an example sim file:
# add repositories to the simulation and connect them as remotes
init foo
init bar
connect foo <-> bar
# add a special remote
initremote baz
connect foo -> baz <- bar
# configure repositories
numcopies 2
group foo client
wanted foo standard
group bar archive
wanted bar standard
wanted baz include=*.mp3
# add annexed files in the working tree to the simulation, as if they
# were just added to repository foo
addtree foo include=*.mp3
addtree foo include=*.jpg
addtree foo include=bigfiles/
# add simulated annexed files
add bigfile 100gb bar
add hugefile 10tb foo
# run the simulation forward by ten steps
step 10
# remove foo's remote bar and see if a new file added to foo reaches
bar
disconnect foo -> bar
add foo.mp3 2mb foo
step 5
SIM COMMANDS
This is the full set of commands that can be used in the sim file as
well as passed to "git annex sim" while a simulation is running.
init name
Initialize a simulated repository, giving it a name that will be
used in the simulation.
initremote name
Initialize a simulated special remote.
use name here|remote|description|uuid
Use an existing repository in the simulation, with its existing
configuration (trust level, groups, preferred and required
content, maxsize, and the groupwanted configuration of its
groups).
The repository is given a name for the purposes of the
simulation. The repository to use can be specified by remote
name, uuid, etc. Example: "use myrepo here"
visit repo [command]
Runs the specified shell command inside the simulated
repository, and waits for it to exit.
When no shell command is specified, it runs an interactive
shell.
The command is run in a git repository whosegit-annex branch
contains the state of that simulated repository. This allows
running any git-annex commands, such as git-annex whereis to
examine the state of the simulation. You should avoid making any
changes to git-annex state.
connect repo [<-|->|<->] repo [...]
Add a connection between two or more repositories. The arrow
indicates which direction the connection runs, and it can be
bidirectional. For example, "connect foo -> bar" makes bar be a
remote of foo, while "connect foo <-> bar" makes each be the
remote of the other. A chain of connections can extend to many
repositories, eg "connect foo -> bar -> baz -> foo"
disconnect repo [<-|->|<->] repo [...]
Removes connections between repositories.
For example, "disconnect foo -> bar" makes foo no longer have
bar as a remote.
addtree repo expression
Adds annexed files from the git repository to the simulation
making them be present in the specified repository.
The expression is a preferred content expression (see git-
annex-preferred-content(1)) specifying which annexed files to
add. While it is possible to include all or a large number of
files this way, note that often it's more efficient to simulate
a small quantity of files that have the particular properties
you are interested in.
When run in a subdirectory of the repository, only files in that
subdirectory are considered for addition.
This can be used with the same files more than once, to make
multiple repositories in the simulation contain the same files.
add filename size repo [repo ...]
Create a simulated annexed file with the specified filename and
size, that is present in the specified repository, or
repositories.
The size can be specified using any usual units, eg "10mb" or
"3.3terabytes"
The filename cannot contain a space.
This stages a file in the index, so that regular git-annex
commands can be used to query the state of the simulated annexed
file. If there is already an annexed file by that name, it will
be overwritten with the new file.
Note that the simulation does not cover adding conflicting files
to different repositories. The files in the simulation are the
same across all simulated repositories.
addmulti N suffix minsize maxsize repo [repo ...]
Add multiple simulated annexed files, with random sizes in the
range between minsize and maxsize.
The files are named by combining the number, which starts at 1
and goes up to N, with the suffix.
For example:
addmulti 100 testfile.jpg 100kb 10mb foo
That adds files named "1testfile.jpg", 2testfile.jpg", etc.
Note that adding a large number of files to the simulation can
slow it down and make it use a lot of memory.
step N Run the simulation forward by this many steps.
On each step of the simulation, one file is either transferred
or dropped, according to the preferred content and other
configuration.
If there are no more files that can be either transferred or
dropped according to the current configuration, a message will
be displayed to indicate that the simulation has stabilized.
This also simulates git pull and git push being run in each
repository, as needed in order to find additional things to do.
stepstable N
Run the simulation forward by this many steps, at which point it
is expected to have stabilized.
If the simulation does not stabilize, the command will exit with
a nonzero exit state.
action repo getwanted remote
Simulate the repository getting files it wants from the remote.
action repo dropunwanted
Simulate the repository dropping files it does not want, when it
is able to verify enough copies exist on remotes.
action repo dropunwantedfrom remote
Simulate the repository dropping files from the remote that the
remote does not want, when it is able to verify enouh copies
exist.
action repo sendwanted remote
Simulate the repository sending files that the remote wants to
it.
action repo gitpush remote
Simulate the repository pushing the git-annex branch to the
remote.
action repo gitpull remote
Simulate the repository pulling the git-annex branch from the
remote.
action repo pull remote
Simulate the equivilant of git-annex-pull(1), by combining the
actions gitpull, getwanted, and dropunwanted.
action repo push remote
Simulate the equivilant of git-annex-push(1) by combining the
actions sendwanted, dropunwantedfrom, and gitpush.
action repo sync remote
Simulate the equivilant of git-annex-sync(1) by combining the
actions gitpull, getwanted, sendwanted, dropunwanted, and
gitpush.
action [...] while action [...]
Simulate running the two actions concurrently. While the
simulation only actually simulates one thing happening at a
time, when the actions each operate on multiple files, they will
be interleaved randomly.
Any number of actions can be combined this way.
For example:
action foo dropunwanted while action bar getwanted foo
In this example, bar may or may not get a file before foo drops
it.
seed N Sets the random seed to a given number. Using this should make
the results of the simulation deterministic. The output sim file
always has the random seed included in it, so it can be used to
replay the simulation.
present repo file
This indicates the expected state of the simulation at this
point. The repository should contain the content of the file. If
it does not, the discrepancy will be indicated on standard
error, and the git-annex sim command will eventually exit
nonzero.
This is added to the output sim file as the simulation runs.
notpresent repo file
This indicates the expected state of the simulation at this
point. The repository should not contain the content of the
file. If it does, the discrepancy will be indicated on standard
error, and the git-annex sim command will eventually exit
nonzero.
This is added to the output sim file as the simulation runs.
numcopies N
Sets the desired number of copies. This is equivilant to git-
annex-numcopies(1).
Note that other configuration that sets numcopies, such as
.gitattributes files, is not used by the simulation.
mincopies N
Sets the minimum number of copies. This is equivilant to git-
annex-mincopies(1).
trustlevel repo trusted|untrusted|semitrusted|dead
Sets the trust level of the repository. This is equivilant to
git-annex-trust(1), git-annex-untrust(1), etc.
wanted repo expression
Configure the preferred content of a repository. This is
equivilant to git-annex-wanted(1).
required repo expression
Configure the required content of a repository. This is
equivilant to git-annex-required(1).
groupwanted group expression
Configure the groupwanted expression. This is equivilant to git-
annex-groupwanted(1).
randomwanted repo term...
Configure the preferred content of a repository to a random
expression generated by combining a random selection of the
provided terms with "and", "or", and "not".
For example, "randomwanted foo exclude=*.x include=*.x
largerthan=100kb" might generate an expression of "exclude=*.x
or not largerthan=100kb and include=*.x" or it might generate an
expression of "include=*.x and exclude=*.x"
randomrequired repo term...
Configure the required content of a repository to a random
expression.
randomgroupwanted group term...
Configure the groupwanted to a random expression.
group repo group
Add a repository to a group. This is equivilant to git-
annex-group(1).
ungroup repo group
Remove a repository from a group. This is equivilant to git-
annex-ungroup(1).
metadata filename expression
Change the metadata of the simulated file. The expression is in
the same format as the --set option of the git-annex-metadata
command. For example: metadata foo year=2025
maxsize repo size
Configure the maximum size of a repository. This is equivilant
to git-annex-maxsize(1).
rebalance [on|off]
Setting "rebalance on" is the equivilant of passing the
--rebalance option to git-annex. Setting "rebalance off" undoes
that.
For example:
maxsize foo 1tb
rebalance on
step 100
rebalance off
clusternode name repo
Simulate a repository being a node of a cluster, which can be
referred to using the specified name.
Rather than a cluster gateway being simulated as a separate
entity, any connection to a cluster node with that name is
treated as accessing that repository via the same cluster
gateway.
Since a cluster gateway knows about all changes that are made to
nodes via it, every repository that has a connection to a
cluster node will immediately know about changes that are made
via that node, without needing a simulated git pull.
To simulate a repository being a node of more than one cluster,
or behind multiple gateways in the same cluster, use this
command to give it multiple names.
OPTIONS
The git-annex-common-options(1) can be used.
EXAMPLES
git-annex includes a collection of sim files, at
SEE ALSO
git-annex(1)
git-annex-test(1)
AUTHOR
Joey Hess
git-annex-sim(1)