Normally commands like git annex add always add files to the annex. And when using the v6 repository mode, even git add and git commit -a will add files to the annex.
Let's suppose you're developing a video game, written in C. You have source code, and some large game assets. You want to ensure the source code is stored in git -- that's what git's for! And you want to store the game assets in the git annex -- to avoid bloating your git repos with possibly enormous files, but still version control them.
The annex.largefiles configuration is useful for such mixed content repositories. It's checked by git annex add, by git add and git commit -a (in v6 repositories), by git annex import and the assistant. It's also used by git annex addurl and git annex importfeed when downloading files. When a file does not match annex.largefiles, these commands will add its content to git instead of to the annex.
This saves you the bother of keeping things straight when adding files.
examples
For example, let's make only files larger than 100 kb be added to the annex, and never *.c and *.h source code files.
Write this to the .gitattributes file:
* annex.largefiles=(largerthan=100kb)
*.c annex.largefiles=nothing
*.h annex.largefiles=nothing
Or, set the git configuration instead:
git config annex.largefiles 'largerthan=100kb and not (include=*.c or include=*.h)'
Both of these settings do the same thing. Setting it in the .gitattributes file makes any checkout of the repository share that configuration, so is often a good choice. Setting the annex.largefiles git configuration lets different checkouts behave differently. The git configuration overrides the .gitattributes configuration.
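Since either place can hold the setting, it can help to ask git what it sees in each. The following is a minimal sketch using only plain git commands in a scratch repository; the file names main.c and sprite.png are made up for illustration, and --show-origin assumes git 2.8 or newer.

```shell
#!/bin/sh
# Sketch: inspect both places where annex.largefiles can be set.
# main.c and sprite.png are hypothetical paths; they need not exist.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .

# Attribute-based setting, shared by every checkout of the repository.
cat > .gitattributes <<'EOF'
* annex.largefiles=(largerthan=100kb)
*.c annex.largefiles=nothing
*.h annex.largefiles=nothing
EOF

# Ask git which attribute value applies to a given path.
attr_c=$(git check-attr annex.largefiles -- main.c)
attr_png=$(git check-attr annex.largefiles -- sprite.png)
echo "$attr_c"
echo "$attr_png"

# Ask git whether a config value exists, and which file it comes from.
# If one is reported, it takes precedence over .gitattributes.
git config --show-origin --get annex.largefiles \
    || echo "annex.largefiles is not set in any git config file"
```

If the last command prints a file name, that file holds a setting that overrides whatever .gitattributes says.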
syntax
The value of annex.largefiles is similar to a preferred content expression. The following terms can be used in annex.largefiles:
include=glob / exclude=glob
Specify files to include or exclude.
The glob can contain * and ? to match arbitrary characters.
smallerthan=size / largerthan=size
Matches only files smaller than, or larger than the specified size.
The size can be specified with any commonly used units, for example, "0.5 gb" or "100 KiloBytes".
mimetype=glob
Looks up the MIME type of a file, and checks if the glob matches it.
For example, "mimetype=text/*" will match many varieties of text files, including "text/plain", but also "text/x-shellscript", "text/x-makefile", etc.
The MIME types are the same that are displayed by running file --mime-type
This is only available to use when git-annex was built with the MagicMime build flag.
anything
Matches any file.
nothing
Matches no files. (Same as "not anything")
not expression
Inverts what the expression matches.
and / or / ( expression )
These can be used to build up more complicated expressions.
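To show how the terms combine, here is a sketch that stores a longer expression in the git configuration of a scratch repository. The policy itself (a 1 mb threshold, video and audio MIME types, an exception for *.md files) is invented for illustration, and the MagicMime check only runs if git-annex happens to be installed.

```shell
#!/bin/sh
# Sketch: combine size, MIME type and glob terms in one expression.
# The thresholds and patterns are illustrative, not recommendations.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .

# Annex anything over 1 mb, plus all video and audio files regardless
# of size, but never Markdown notes. Parentheses make the grouping explicit.
expr='(largerthan=1mb or mimetype=video/* or mimetype=audio/*) and not include=*.md'
git config annex.largefiles "$expr"

# Read the value back to confirm what git-annex would be given.
stored=$(git config --get annex.largefiles)
echo "$stored"

# mimetype= needs the MagicMime build flag; look for it in the version
# output when git-annex is available.
if command -v git-annex >/dev/null 2>&1; then
    git annex version | grep -q MagicMime \
        || echo "this git-annex build lacks MagicMime" >&2
fi
```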
The way the .gitattributes example above works is, *.c and *.h files have the annex.largefiles attribute set to "nothing", and so those files are never treated as large files. All other files use the other value, which checks the file size.
Note that, since git attribute values cannot contain whitespace, it's useful to instead parenthesize the terms of the annex.largefiles attribute. This trick allows for more complicated expressions. For example, this is the same as the git config shown earlier, shoehorned into a git attribute:
* annex.largefiles=(largerthan=100kb)and(not((include=*.c)or(include=*.h)))
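The whitespace restriction can be observed with git check-attr alone. This sketch writes two variants of a .gitattributes line in a scratch repository and prints the value git reports for each; game.dat is a hypothetical path, and the expression is a shortened version of the one above.

```shell
#!/bin/sh
# Sketch: why the expression has to be parenthesized in .gitattributes.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .

# With spaces, git splits the line into several attributes, so only the
# first word ends up as the value of annex.largefiles.
echo '* annex.largefiles=largerthan=100kb and not include=*.c' > .gitattributes
split=$(git check-attr annex.largefiles -- game.dat)
echo "$split"

# Parenthesized and free of whitespace, the whole expression survives
# as a single attribute value.
echo '* annex.largefiles=(largerthan=100kb)and(not(include=*.c))' > .gitattributes
whole=$(git check-attr annex.largefiles -- game.dat)
echo "$whole"
```

Running git check-attr like this is a quick way to confirm that a long expression reached git-annex intact.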
temporarily override
If you've set up an annex.largefiles configuration but want to force a file to be stored in the annex, you can temporarily override the configuration like this:
git annex add -c annex.largefiles=anything smallfile
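The override lasts for that one command only. Git's own -c option has the same one-shot character, which the sketch below demonstrates in a scratch repository without needing git-annex; that git annex add -c behaves the same way is an assumption here, based on the example above.

```shell
#!/bin/sh
# Sketch: a -c override applies to a single invocation and is not saved.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .

# Persistent setting for this checkout.
git config annex.largefiles 'largerthan=100kb'

# Value seen by a command that is given a one-off override.
during=$(git -c annex.largefiles=anything config --get annex.largefiles)

# Value seen by the next command, run without the override.
after=$(git config --get annex.largefiles)

echo "during the one-off command: $during"
echo "afterwards: $after"
```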
I use version 5.20140412, and I've tried setting annex.largefiles in .gitattributes, but it doesn't work. Every file is added to .git/annex/objects, including the ones that are excluded in .gitattributes.
I've just started using git-annex so maybe I'm doing something wrong...
I have the same problem: annex.largefiles is ignored by "git add" when set in .gitattributes, although git check-attr does list it.
But it works when set with git config annex.largefiles.
git annex version 6.20160126
The first version to support largefiles in .gitattributes was 6.20160211, so both the above commenters just have too old a version.
Hello
it took me some time to figure out how to exclude directories matching a specific structure within the .gitattributes file:
Maybe it helps someone else. (In case this way is the intended way)
Hi guys!
sigh
Currently I am pulling my hair out, maybe somebody here can clear things up a bit. I tried to set up a brand new mixed content repo with git-annex, but it bluntly ignores my .gitattributes and annexes everything. When I set largefiles in the git config everything is fine and the restrictions are applied correctly, but in .gitattributes even a "* annex.largefiles=nothing" has no effect. All attributes show up correctly with git check-attr, I double checked. :-/ Same thing with a newly initialized minimal example repo.
I also tried both git-annex as distributed by openSUSE and the current stand-alone package (in case it's a distribution bug). No clues there either.
Output of git annex version:
git-annex version: 6.20170302-gb35a50cca
build flags: Assistant Webapp Pairing Testsuite S3(multipartupload)(storageclasses) WebDAV Inotify DBus DesktopNotify ConcurrentOutput TorrentParser MagicMime Feeds Quvi
key/value backends: SHA256E SHA256 SHA512E SHA512 SHA224E SHA224 SHA384E SHA384 SHA3_256E SHA3_256 SHA3_512E SHA3_512 SHA3_224E SHA3_224 SHA3_384E SHA3_384 SKEIN256E SKEIN256 SKEIN512E SKEIN512 SHA1E SHA1 MD5E MD5 WORM URL
remote types: git gcrypt p2p S3 bup directory rsync web bittorrent webdav tahoe glacier ddar hook external
System: OpenSUSE Tumbleweed Linux 4.9.3-1-default #1 SMP PREEMPT Thu Jan 12 11:32:53 UTC 2017 (2c7dfab) x86_64 x86_64 x86_64 GNU/Linux
Any ideas? After trying around for hours I am somewhat flabbergasted. Did I miss some config or build option to enable support for .gitattributes?
Kind regards
Jörn
@joern.mankiewicz, you need to file a bug report with enough information to reproduce your problem.
annex.largefiles in .gitattributes works fine:
Note that if annex.largefiles is set in git config (including global git config), it overrides the .gitattributes setting. So a reasonable guess would be that you set it in the git config.
Thanks, joey.
Your last comment put me on the right track. The problem was not in the repository, but in an old stale global .gitconfig in my homedir. I had only checked $XDG_CONFIG_HOME/git/config, where my global git config currently resides, and totally forgot about this old config. Stupid me!
was my savior here as it clearly indicated that there is indeed an (unintended) config setting and where to find the file. So I can strongly recommend that anybody experiencing strange behavior try this one-liner. It might have saved me hours of time.
Thanks for your help! :)
Cheers
Jörn
With v6, is there any way to retain the old usage of git add and git annex add to manually choose which files are kept under plain git and which are annexed? I'm aware of the -c annex.largefiles=foo parameter, but that's pretty cumbersome.
Hi, from a technical point of view, are there any drawbacks/limitations to adopting a workflow where everyone in the project uses "git annex add" and relies on the annex.largefiles settings, instead of having to use the separate commands? I would use repo v5, as repo v6 still seems to need work and I don't need its features. I would just like to avoid the human error of people mistakenly using regular git add for big files. I understand that repo v6 would handle that, but I don't like its default behavior of using unlocked mode when I add things with git add (it would properly annex the files, but in unlocked mode these files would occupy space in the working copy, and I don't want that). Thanks.