Communication between git-annex and a program implementing an external special remote uses this protocol.
starting the program
The external special remote program has a name like git-annex-remote-$bar. When git annex initremote foo type=external externaltype=$bar is run, git-annex finds the appropriate program in PATH.
The program is started by git-annex when it needs to access the special remote, and may be left running for a long period of time. This allows it to perform expensive setup tasks, etc. Note that git-annex may choose to start multiple instances of the program (eg, when multiple git-annex commands are run concurrently in a repository).
protocol overview
Communication is via stdin and stdout. Therefore, the external special remote must avoid doing any prompting, or outputting anything else (eg, progress displays) to stdout. (Such output can be sent to stderr instead.)
The protocol is line based. Messages are sent in either direction, from git-annex to the special remote, and from the special remote to git-annex.
In order to avoid confusing interactions, one or the other has control at any given time, and is responsible for sending requests, while the other only sends replies to the requests.
Each protocol line starts with a command, which is followed by the command's parameters (a fixed number per command), each separated by a single space. The last parameter may contain spaces. Parameters may be empty, but the separating spaces are still required in that case.
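The framing rule above can be sketched in Python as follows. The helper name parse_line, and the idea of passing in the expected parameter count, are assumptions made for illustration; they are not part of git-annex.

```python
def parse_line(line, nparams):
    """Split one protocol line into (command, [parameters]).

    nparams is the fixed number of parameters the command takes.
    Splitting at most nparams times keeps any spaces inside the
    last parameter intact.
    """
    parts = line.rstrip("\n").split(" ", nparams)
    command, params = parts[0], parts[1:]
    # Be lenient: pad with empty parameters if fewer than expected arrived.
    params += [""] * (nparams - len(params))
    return command, params
```

For example, parsing "TRANSFER STORE somekey my file.txt" with three parameters keeps "my file.txt" together as the File.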
example session
The special remote is responsible for sending the first message, indicating the version of the protocol it is using.
VERSION 1
Recent versions of git-annex respond with a message indicating protocol extensions that it supports. Older versions of git-annex do not send this message.
EXTENSIONS INFO
The special remote can respond to that with its own EXTENSIONS message, which could have its own protocol extension details, but none are currently used. (It's also fine to reply with UNSUPPORTED-REQUEST.)
EXTENSIONS
Next, git-annex will generally send a message telling the special remote to start up. (Or it might send an INITREMOTE or EXPORTSUPPORTED, so don't hardcode this order.)
PREPARE
The special remote can now ask git-annex for its configuration, as needed, and check that it's valid. git-annex responds with the configuration values:
GETCONFIG directory
VALUE /media/usbdrive/repo
GETCONFIG automount
VALUE true
Once the special remote is satisfied with its configuration and is ready to go, it tells git-annex that it's done with the PREPARE step:
PREPARE-SUCCESS
Now git-annex will make a request. Let's suppose it wants to store a key.
TRANSFER STORE somekey tmpfile
The special remote can then start reading the tmpfile and storing it. While it's doing that, the special remote can send messages back to git-annex to indicate what it's doing, or ask for other information. It will typically send progress messages, indicating how many bytes have been sent:
PROGRESS 10240
PROGRESS 20480
Once the key has been stored, the special remote tells git-annex the result:
TRANSFER-SUCCESS STORE somekey
Now git-annex will send its next request.
Once git-annex is done with the special remote, it will close its stdin. The special remote program can then exit.
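A skeleton of the session above might look like this in Python. It is only a sketch: the function name run is made up, the "directory" setting is borrowed from the example session, and every request other than EXTENSIONS and PREPARE is answered with UNSUPPORTED-REQUEST, which a real remote could not do for the required requests.

```python
import sys

def run(inp=sys.stdin, out=sys.stdout):
    """Minimal protocol loop; returns when git-annex closes our stdin."""
    def send(msg):
        out.write(msg + "\n")
        out.flush()  # git-annex is waiting on the other end of the pipe

    send("VERSION 1")  # the special remote always speaks first
    while True:
        line = inp.readline()
        if not line:
            break  # stdin was closed: git-annex is done with us
        command = line.rstrip("\n").partition(" ")[0]
        if command == "EXTENSIONS":
            send("EXTENSIONS")  # this sketch has no extensions of its own
        elif command == "PREPARE":
            # While handling a request we may ask git-annex questions.
            send("GETCONFIG directory")
            reply = inp.readline().rstrip("\n")
            directory = reply.partition(" ")[2]
            if directory:
                send("PREPARE-SUCCESS")
            else:
                send("PREPARE-FAILURE directory is not configured")
        else:
            # Placeholder for everything this skeleton does not handle.
            send("UNSUPPORTED-REQUEST")
```

Calling run() with no arguments talks to git-annex over the real stdin and stdout; passing file-like objects makes it easy to replay a recorded session.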
git-annex request messages
These are messages git-annex sends to the special remote program. None of these messages require an immediate reply. The special remote can send any messages it likes while handling the requests.
Once the special remote has finished performing the request, it should send one of the corresponding replies listed in the next section.
The following requests must all be supported by the special remote.
INITREMOTE
Requests the remote to initialize itself. This is where any one-time setup tasks can be done, for example creating an Amazon S3 bucket.
Note: This may be run repeatedly over time, as a remote is initialized in different repositories, or as the configuration of a remote is changed. (Both git annex initremote and git-annex enableremote run this.) So any one-time setup tasks should be done idempotently.
PREPARE
Tells the remote that it's time to prepare itself to be used.
Only EXTENSIONS and INITREMOTE or EXPORTSUPPORTED can come before this.
TRANSFER STORE|RETRIEVE Key File
Requests the transfer of a key. For STORE, the File is the file to upload; for RETRIEVE the File is where to store the download.
Note that the File should not influence the filename used on the remote.
Note that in some cases, the File may contain whitespace.
Note that it's important that, while a Key is being stored, CHECKPRESENT not indicate it's present until all the data has been transferred.
CHECKPRESENT Key
Requests the remote to check if a key is present in it.
REMOVE Key
Requests the remote to remove a key's contents.
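As an illustration of the required requests, here is a sketch that stores keys as files in a local directory. The function name handle_request and the root directory are hypothetical, and it assumes each key is usable as a file name, which a real remote would need to ensure by escaping. Storing under a temporary name and renaming at the end is one way to keep CHECKPRESENT from reporting a key that is only partly transferred.

```python
import os
import shutil

def handle_request(line, root):
    """Handle one required request against the directory `root`
    and return the reply line to send."""
    parts = line.split(" ", 3)  # the File is last and may contain spaces
    command = parts[0]
    if command == "TRANSFER":
        direction, key, path = parts[1], parts[2], parts[3]
        stored = os.path.join(root, key)
        try:
            if direction == "STORE":
                # Copy under a temporary name, then rename, so the key
                # only appears once all of its data has arrived.
                tmp = stored + ".tmp"
                shutil.copyfile(path, tmp)
                os.replace(tmp, stored)
            else:
                shutil.copyfile(stored, path)
        except OSError as e:
            return "TRANSFER-FAILURE %s %s %s" % (direction, key, e)
        return "TRANSFER-SUCCESS %s %s" % (direction, key)
    if command == "CHECKPRESENT":
        key = parts[1]
        if os.path.isfile(os.path.join(root, key)):
            return "CHECKPRESENT-SUCCESS " + key
        return "CHECKPRESENT-FAILURE " + key
    if command == "REMOVE":
        key = parts[1]
        try:
            os.remove(os.path.join(root, key))
        except FileNotFoundError:
            pass  # already absent still counts as removed
        except OSError as e:
            return "REMOVE-FAILURE %s %s" % (key, e)
        return "REMOVE-SUCCESS " + key
    return "UNSUPPORTED-REQUEST"
```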
The following requests can optionally be supported. If not handled, replying with UNSUPPORTED-REQUEST is acceptable.
EXTENSIONS List
Sent to indicate protocol extensions which git-annex is capable of using. The list is a space-delimited list of protocol extension keywords. The remote can reply to this with its own EXTENSIONS list.
GETCOST
Requests the remote to return a use cost. Higher costs are more expensive. (See Config/Cost.hs for some standard costs.)
GETAVAILABILITY
Requests the remote to send back an AVAILABILITY reply. If the remote replies with UNSUPPORTED-REQUEST, its availability is assumed to be global. So, only remotes that are only reachable locally need to worry about implementing this.
CLAIMURL Url
Asks the remote if it wishes to claim responsibility for downloading an url. If so, the remote should send back a CLAIMURL-SUCCESS reply. If not, it can send CLAIMURL-FAILURE.
CHECKURL Url
Asks the remote to check if the url's content can currently be downloaded (without downloading it). The remote replies with one of CHECKURL-FAILURE, CHECKURL-CONTENTS, or CHECKURL-MULTI.
WHEREIS Key
Asks the remote to provide additional information about ways to access the content of a key stored in it, such as eg, public urls. This will be displayed to the user by eg, git annex whereis. The remote replies with WHEREIS-SUCCESS or WHEREIS-FAILURE.
Note that users expect git annex whereis to run fast, without eg, network access.
This is not needed when SETURIPRESENT is used, since such uris are automatically displayed by git annex whereis.
GETINFO
Requests the remote to send some information describing its configuration, for display by git annex info.
Reply with a series of INFOFIELD each followed by INFOVALUE, and concluded with INFOEND.
EXPORTSUPPORTED
Used to check if a special remote supports exports. The remote responds with either EXPORTSUPPORTED-SUCCESS or EXPORTSUPPORTED-FAILURE. Note that this request may be made before or after PREPARE.
EXPORT Name
Comes immediately before each of the following export-related requests, specifying the name of the exported file. It will be in the form of a relative path, and may contain path separators, whitespace, and other special characters.
No response is made to this message.
TRANSFEREXPORT STORE|RETRIEVE Key File
Requests the transfer of a File on local disk to or from the previously provided Name on the special remote.
Note that it's important that, while a file is being stored, CHECKPRESENTEXPORT not indicate it's present until all the data has been transferred.
The remote responds with either TRANSFER-SUCCESS or TRANSFER-FAILURE, and a remote where exports do not make sense may always fail.
CHECKPRESENTEXPORT Key
Requests the remote to check if the previously provided Name is present in it.
The remote responds with CHECKPRESENT-SUCCESS, CHECKPRESENT-FAILURE, or CHECKPRESENT-UNKNOWN.
REMOVEEXPORT Key
Requests the remote to remove content stored by TRANSFEREXPORT with the previously provided Name.
The remote responds with either REMOVE-SUCCESS or REMOVE-FAILURE.
If the content was already not present in the remote, it should respond with REMOVE-SUCCESS.
REMOVEEXPORTDIRECTORY Directory
Requests the remote remove an exported directory.
If the remote does not use directories, or automatically cleans up empty directories, this does not need to be implemented.
The directory will be in the form of a relative path, and may contain path separators, whitespace, and other special characters.
Typically the directory will be empty, but it could possibly contain files or other directories, and it's ok to remove those.
The remote responds with either REMOVEEXPORTDIRECTORY-SUCCESS or REMOVEEXPORTDIRECTORY-FAILURE.
Should not fail if the directory was already removed.
RENAMEEXPORT Key NewName
Requests the remote rename a file stored on it from the previously provided Name to the NewName.
The remote responds with RENAMEEXPORT-SUCCESS or RENAMEEXPORT-FAILURE.
To support old external special remote programs that have not been updated to support exports, git-annex will need to handle an ERROR response when using any of the above.
More optional requests may be added, without changing the protocol version, so if an unknown request is seen, reply with UNSUPPORTED-REQUEST.
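Because EXPORT only announces a Name and gets no reply, a remote has to remember that Name until the export request that follows. A sketch of that bookkeeping, assuming a hypothetical in-memory set of exported file names in place of real storage, and handling only CHECKPRESENTEXPORT:

```python
class ExportState:
    """Tracks the Name from the most recent EXPORT message."""

    def __init__(self, files):
        self.files = files  # hypothetical: names of files on the remote
        self.name = None

    def handle(self, line):
        """Return the reply to send, or None when no reply is expected."""
        command, _, rest = line.partition(" ")
        if command == "EXPORT":
            self.name = rest  # may contain spaces and path separators
            return None       # EXPORT itself gets no response
        if command == "EXPORTSUPPORTED":
            return "EXPORTSUPPORTED-SUCCESS"
        if command == "CHECKPRESENTEXPORT":
            key = rest
            name, self.name = self.name, None  # each request gets a fresh EXPORT
            if name is None:
                return "CHECKPRESENT-UNKNOWN %s no EXPORT name was given" % key
            if name in self.files:
                return "CHECKPRESENT-SUCCESS " + key
            return "CHECKPRESENT-FAILURE " + key
        # Unknown or unimplemented requests, as recommended above.
        return "UNSUPPORTED-REQUEST"
```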
special remote replies
These should be sent only in response to the git-annex request messages. They do not have to be sent immediately after the request; the special remote can send its own messages (listed in the next section) while it's handling a request.
PREPARE-SUCCESS
Sent as a response to PREPARE once the special remote is ready for use.
PREPARE-FAILURE ErrorMsg
Sent as a response to PREPARE if the special remote cannot be used.
TRANSFER-SUCCESS STORE|RETRIEVE Key
Indicates the transfer completed successfully.
TRANSFER-FAILURE STORE|RETRIEVE Key ErrorMsg
Indicates the transfer failed.
CHECKPRESENT-SUCCESS Key
Indicates that a key has been positively verified to be present in the remote.
CHECKPRESENT-FAILURE Key
Indicates that a key has been positively verified to not be present in the remote.
CHECKPRESENT-UNKNOWN Key ErrorMsg
Indicates that it is not currently possible to verify if the key is present in the remote. (Perhaps the remote cannot be contacted.)
REMOVE-SUCCESS Key
Indicates the key has been removed from the remote. May be returned if the remote didn't have the key at the point removal was requested.
REMOVE-FAILURE Key ErrorMsg
Indicates that the key was unable to be removed from the remote.
EXTENSIONS List
Sent in response to an EXTENSIONS request, the List could be used to indicate protocol extensions that the special remote uses, but there are currently no such extensions, so the List is empty.
COST Int
Indicates the cost of the remote.
AVAILABILITY GLOBAL|LOCAL
Indicates if the remote is globally or only locally available. (Ie stored in the cloud vs on a local disk.)
INITREMOTE-SUCCESS
Indicates the INITREMOTE succeeded and the remote is ready to use.
INITREMOTE-FAILURE ErrorMsg
Indicates that INITREMOTE failed.
CLAIMURL-SUCCESS
Indicates that the CLAIMURL url will be handled by this remote.
CLAIMURL-FAILURE
Indicates that the CLAIMURL url will not be handled by this remote.
CHECKURL-CONTENTS Size|UNKNOWN Filename
Indicates that the requested url has been verified to exist.
The Size is the size in bytes, or use "UNKNOWN" if the size could not be determined.
The Filename can be empty (in which case a default is used), or can specify a filename that is suggested to be used for this url.
CHECKURL-MULTI Url1 Size1|UNKNOWN Filename1 Url2 Size2|UNKNOWN Filename2 ...
Indicates that the requested url has been verified to exist, and contains multiple files, which can each be accessed using their own url. Each triplet of url, size, and filename should be listed, one after the other. Note that since a list is returned, neither the Url nor the Filename can contain spaces.
CHECKURL-FAILURE
Indicates that the requested url could not be accessed.
WHEREIS-SUCCESS String
Indicates a location of a key. Typically an url, the string can be anything that it makes sense to display to the user about content stored in the special remote.
WHEREIS-FAILURE
Indicates that no location is known for a key.
INFOFIELD/INFOVALUE/INFOEND
Reply to a GETINFO request. This can be used to add info about anything, but things like an url to the remote, or details of the remote's configuration are typical. It should not include any sensitive information like passwords, since it will be displayed to the user's screen.
There can be zero or more INFOFIELD messages, each containing the name of a field, and each is immediately followed by an INFOVALUE message containing its value. The sequence is concluded by INFOEND. For example:
INFOFIELD repository location
INFOVALUE http://example.com/repo/
INFOFIELD datacenter
INFOVALUE Antarctica
INFOEND
EXPORTSUPPORTED-SUCCESS
Indicates that it makes sense to use this special remote as an export.
EXPORTSUPPORTED-FAILURE
Indicates that it does not make sense to use this special remote as an export.
RENAMEEXPORT-SUCCESS Key
Indicates that a RENAMEEXPORT was done successfully.
RENAMEEXPORT-FAILURE Key
Indicates that a RENAMEEXPORT failed for whatever reason.
REMOVEEXPORTDIRECTORY-SUCCESS
Indicates that a REMOVEEXPORTDIRECTORY was done successfully.
REMOVEEXPORTDIRECTORY-FAILURE
Indicates that a REMOVEEXPORTDIRECTORY failed for whatever reason.
UNSUPPORTED-REQUEST
Indicates that the special remote does not know how to handle a request.
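Multi-line replies such as the GETINFO sequence are easy to get subtly wrong, so here is a small sketch that builds one from (name, value) pairs. The function name info_messages is made up for this example.

```python
def info_messages(fields):
    """Return the reply lines for GETINFO from (name, value) pairs."""
    lines = []
    for name, value in fields:
        # Each field name is immediately followed by its value.
        lines.append("INFOFIELD " + name)
        lines.append("INFOVALUE " + value)
    lines.append("INFOEND")  # always sent, even when there are no fields
    return lines
```

Called with the pairs ("repository location", "http://example.com/repo/") and ("datacenter", "Antarctica"), it produces the five lines from the GETINFO example in this section.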
special remote messages
These messages may be sent by the special remote at any time that it's handling a request.
VERSION Int
Supported protocol version. Current version is 1. Must be sent first thing at startup, as until it sees this git-annex does not know how to talk with the special remote program!
(git-annex does not send a reply to this message, but may give up if it doesn't support the necessary protocol version.)
PROGRESS Int
Indicates the current progress of the transfer (in bytes). May be repeated any number of times during the transfer process, but it's wasteful to update the progress until at least another 1% of the file has been sent. This is highly recommended for STORE. (It is optional but good for RETRIEVE.)
(git-annex does not send a reply to this message.)
DIRHASH Key
Gets a two level hash associated with a Key. Something like "aB/Cd". This is always the same for any given Key, so can be used for eg, creating hash directory structures to store Keys in. This is the same directory hash that git-annex uses inside .git/annex/objects/
(git-annex replies with VALUE followed by the value.)
DIRHASH-LOWER Key
Gets a two level hash associated with a Key, using only lower-case. Something like "abc/def". This is always the same for any given Key, so can be used for eg, creating hash directory structures to store Keys in. This is the same directory hash that is used by eg, the directory special remote.
(git-annex replies with VALUE followed by the value.)
(First supported by git-annex version 6.20160511.)
SETCONFIG Setting Value
Sets one of the special remote's configuration settings.
Normally this is sent during INITREMOTE, which allows these settings to be stored in the git-annex branch, so will be available if the same special remote is used elsewhere. (If sent after INITREMOTE, the changed configuration will only be available while the remote is running.)
(git-annex does not send a reply to this message.)
GETCONFIG Setting
Gets one of the special remote's configuration settings, which can have been passed by the user when running git annex initremote, or can have been set by a previous SETCONFIG. Can be run at any time.
(git-annex replies with VALUE followed by the value. If the setting is not set, the value will be empty.)
SETCREDS Setting User Password
When some form of user and password is needed to access a special remote, this can be used to securely store them for later use. (Like SETCONFIG, this is normally sent only during INITREMOTE.)
The Setting indicates which value in a remote's configuration can be used to store the creds.
Note that creds are normally only stored in the remote's configuration when it's surely safe to do so, that is, when gpg encryption is used, in which case the creds will be encrypted using it. If creds are not stored in the configuration, they'll only be stored in a local file.
(embedcreds can be set to yes by the user or by SETCONFIG to force the creds to be stored in the remote's configuration.)
(git-annex does not send a reply to this message.)
GETCREDS Setting
Gets any creds that were previously stored in the remote's configuration or a file. (git-annex replies with "CREDS User Password". If no creds are found, User and Password are both empty.)
GETUUID
Queries for the UUID of the special remote being used.
(git-annex replies with VALUE followed by the UUID.)
GETGITDIR
Queries for the path to the git directory of the repository that is using the external special remote. (git-annex replies with VALUE followed by the path.)
SETWANTED PreferredContentExpression
Can be used to set the preferred content of a repository. Normally this is not configured by a special remote, but it may make sense in some situations to hint at the kind of content that should be stored in the special remote. Note that if an unparsable expression is set, git-annex will ignore it.
(git-annex does not send a reply to this message.)
GETWANTED
Gets the current preferred content setting of the repository. (git-annex replies with VALUE followed by the preferred content expression.)
SETSTATE Key Value
Can be used to store some form of state for a Key. The state stored can be anything this remote needs to store, in any format. It is stored in the git-annex branch. Note that this means that if multiple repositories are using the same special remote, and store different state, whichever one stored the state last will win. Also, it's best to avoid storing much state, since this will bloat the git-annex branch. Most remotes will not need to store any state.
(git-annex does not send a reply to this message.)
GETSTATE Key
Gets any state that has been stored for the key.
(git-annex replies with VALUE followed by the state.)
SETURLPRESENT Key Url
Records an URL where the Key can be downloaded from.
Note that this does not make git-annex think that the url is present on the web special remote.
Keep in mind that this stores the url in the git-annex branch. This can result in bloat to the branch if the url is large and/or does not delta pack well with other information (such as the names of keys) already stored in the branch.
(git-annex does not send a reply to this message.)
SETURLMISSING Key Url
Records that the key can no longer be downloaded from the specified URL.
(git-annex does not send a reply to this message.)
SETURIPRESENT Key Uri
Records an URI where the Key can be downloaded from.
For example, "ipfs:ADDRESS" is used for the ipfs special remote; its CLAIMURL handler checks for such URIs and claims them.
(git-annex does not send a reply to this message.)
SETURIMISSING Key Uri
Records that the key can no longer be downloaded from the specified URI.
(git-annex does not send a reply to this message.)
GETURLS Key Prefix
Gets the recorded urls where a Key can be downloaded from. Only urls that start with the Prefix will be returned. The Prefix may be empty to get all urls. (git-annex replies one or more times with VALUE for each url. The final VALUE has an empty value, indicating the end of the url list.)
DEBUG message
Tells git-annex to display the message if --debug is enabled.
(git-annex does not send a reply to this message.)
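Several of the messages above expect a reply from git-annex: a single VALUE, a CREDS line, or a run of VALUE lines ended by an empty one. The following sketch shows one way to read those replies. The helper names are assumptions, and inp and out stand for the streams connected to git-annex.

```python
def ask_value(inp, out, request):
    """Send a request that git-annex answers with one VALUE line."""
    out.write(request + "\n")
    out.flush()
    reply = inp.readline().rstrip("\n")
    command, _, value = reply.partition(" ")
    if command != "VALUE":
        raise RuntimeError("unexpected reply: " + reply)
    return value  # empty when the setting is not set

def get_creds(inp, out, setting):
    """GETCREDS is answered with 'CREDS User Password'."""
    out.write("GETCREDS " + setting + "\n")
    out.flush()
    reply = inp.readline().rstrip("\n")
    parts = reply.split(" ", 2)  # the password is last, may contain spaces
    if parts[0] != "CREDS":
        raise RuntimeError("unexpected reply: " + reply)
    parts += [""] * (3 - len(parts))
    return parts[1], parts[2]

def get_urls(inp, out, key, prefix=""):
    """GETURLS is answered with VALUE lines, ending with an empty VALUE."""
    out.write("GETURLS %s %s\n" % (key, prefix))
    out.flush()
    urls = []
    while True:
        reply = inp.readline().rstrip("\n")
        command, _, value = reply.partition(" ")
        if command != "VALUE":
            raise RuntimeError("unexpected reply: " + reply)
        if value == "":
            return urls
        urls.append(value)
```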
These messages are protocol extensions; it's only safe to send them to git-annex after it sent an EXTENSIONS that included the name of the message.
INFO message
Tells git-annex to display the message to the user.
When git-annex is in --json mode, the message will be emitted immediately in its own json object, with an "info" field.
(git-annex does not send a reply to this message.)
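A remote that wants to use INFO therefore has to remember what the EXTENSIONS request listed. A sketch of that check follows; falling back to DEBUG when INFO was not offered is just one possible choice, not something the protocol requires, and the function names are made up.

```python
def parse_extensions(line):
    """Return the set of extension keywords from an EXTENSIONS request."""
    command, _, rest = line.rstrip("\n").partition(" ")
    if command != "EXTENSIONS":
        return set()
    return set(rest.split())  # the List is space-delimited

def user_message(extensions, text):
    """Use INFO only when git-annex said it supports it."""
    if "INFO" in extensions:
        return "INFO " + text
    # Assumed fallback: older git-annex versions still accept DEBUG.
    return "DEBUG " + text
```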
general messages
These messages can be sent at any time by either git-annex or the special remote.
ERROR ErrorMsg
Generic error. Can be sent at any time if things get too messed up to continue. When possible, use a more specific reply from the list above.
The special remote program should exit after sending this, as git-annex will not talk to it any further. If the program receives an ERROR from git-annex, it can exit with its own ERROR.
long running network connections
Since an external special remote is started only when git-annex needs to access the remote, and then left running, it's ok to open a network connection in the PREPARE stage, and continue to use that network connection as requests are made.
If you're unable to open a network connection, or the connection closes, perhaps because the network is down, it's ok to fail to perform any requests. Or you can try to reconnect when a new request is made.
Note that the external special remote program may be left running for quite a long time, especially when the git-annex assistant is using it. The assistant will detect when the system connects to a network, and will start a new process the next time it needs to use a remote.
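One way to get the "try to reconnect when a new request is made" behaviour is to open the connection lazily and forget it after a failure. This is only a sketch: the class name LazyConnection is made up, and connect stands for whatever function opens a connection to your storage service.

```python
class LazyConnection:
    """Opens a connection on first use and reopens it after a failure."""

    def __init__(self, connect):
        self.connect = connect  # callable returning a connection object
        self.conn = None

    def get(self):
        if self.conn is None:
            self.conn = self.connect()
        return self.conn

    def run(self, request):
        """Call request(conn). On a connection error, drop the connection
        so that the next request tries to reconnect, then re-raise."""
        try:
            return request(self.get())
        except OSError:
            self.conn = None
            raise
```

The caller would catch the OSError and turn it into the matching failure reply, such as TRANSFER-FAILURE or CHECKPRESENT-UNKNOWN.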
readonly mode
Some storage services allow downloading the content of a file using a regular http connection, with no authentication. An external special remote for such a storage service can support a readonly mode of operation.
It works like this:
- When a key's content is stored on the remote, use SETURLPRESENT to tell git-annex the public url from which it can be downloaded.
- When a key's content is removed from the remote, use SETURLMISSING.
- Document that this external special remote can be used in readonly mode.
The user doesn't even need to install your external special remote program to use such a remote! All they need to do is run:
git annex enableremote $remotename readonly=true
The readonly=true parameter makes git-annex download content from the urls recorded earlier by SETURLPRESENT.
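On the special remote's side, supporting readonly mode comes down to sending one extra message before the final reply of a store or a remove. A sketch, assuming a hypothetical service where each key is published under a base url with the key percent-encoded:

```python
from urllib.parse import quote

def public_url(base, key):
    """Hypothetical url layout: the percent-encoded key under a base url."""
    return base.rstrip("/") + "/" + quote(key, safe="")

def messages_after_store(base, key):
    """Lines to send once a key's content has been stored."""
    return [
        "SETURLPRESENT %s %s" % (key, public_url(base, key)),
        "TRANSFER-SUCCESS STORE " + key,
    ]

def messages_after_remove(base, key):
    """Lines to send once a key's content has been removed."""
    return [
        "SETURLMISSING %s %s" % (key, public_url(base, key)),
        "REMOVE-SUCCESS " + key,
    ]
```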
TODO
- When storing encrypted files stream the file up/down the pipe, rather than using a temp file. Will probably involve less space and disk IO, and makes the progress display better, since the encryption can happen concurrently with the transfer. Also, no need to use PROGRESS in this scenario, since git-annex can see how much data it has sent/received from the remote. However, \n and probably \0 need to be escaped somehow in the file data, which adds complication.
- uuid discovery during INITREMOTE.
- Hook into webapp. Needs a way to provide some kind of prompt to the user in the webapp, etc.
It says: "Note that the File should not influence the filename used on the remote. The filename used should be derived from the Key."
Thus this interface might not be useful to implement "New special remote suggestion - clean directory?". The clean directory special remote would just do that: save $key content on the remote under the filename $file.
PREPARE-Failure ErrorMsg (matching INITREMOTE-FAILURE ErrorMsg)
Also, I'd like for the following to overwrite existing credentials/configs:
MYFOLDER="testfolder" MYLOGIN="login" MYPASSWORD="pword" MYURL="http://webdav/" git annex enableremote owncloud type=external externaltype=owncloud --debug
This would also be needed for refreshing oauth.
Added PREPARE-FAILURE
git-annex enableremote causes INITREMOTE to be called, so any credentials can be stored etc. (Note that, as with built-in special remotes, credentials are only stored in the git-annex branch when the remote is encrypted. Otherwise, they're stored locally in a .git/annex/creds/ file.)
Also, I'd recommend using environment variables for passing credentials to initremote/enableremote, because that avoids leaking them in ps. But it probably doesn't make sense to use environment variables for other settings; instead pass them as parameters of initremote/enableremote, which can be looked up using GETCONFIG. Only exception might be if the setting needs to vary between different machines.
Hook should be able to set default configuration for itself.
For instance, clean flickr hook will only upload some files (notably pictures). The user shouldn't have to manage that.
Other hooks have a maximum filesize (though I guess that doesn't matter once splitting works).
I suppose you're talking about preferred content settings.
I think that it makes sense for hooks to use existing git-annex plumbing etc when it's available. So a hook could just run git annex wanted to manage its preferred content.
The only problem is that a hook does not currently have a way to discover the UUID of the repository! So I've added a GETUUID to cover this and other use cases.
I think the hook running anything in shell, to interact with git annex, is a mistake.
I see a lot more potential pitfalls and mistakes (especially cross-platform).
It should be the existing git annex plumbing (preferred content) as you say. I just really think it should be configurable in the protocol, instead of having to run a shell command.
Since you have made this advanced protocol I really see it as a mistake to do anything between the hook and git-annex outside of the protocol; it makes much more sense to have all their interactions happen within the protocol.
IMO anyways.
Tobias made some good points:
So, added GETWANTED and SETWANTED. However, if I find myself recapitulating a lot of git-annex's command-line plumbing stuff in this protocol, I will need to revisit this decision and find a better way. Particularly, I narrowly escaped an intractable dependency loop in [[!commit 8e3032df2d5c6ddf07e43de4b3bb89cb578ae048]].
The ability to mark a remote as being a "cloud" remote. To silence the "Unable to download files from your other devices. Add a cloud repository" message in the webapp.
Maybe as simple as "SETCONFIG cloud true", if that is a viable implementation.
You might want to use chunked transfer, i.e. a series of "EXPECT 65536" followed by that many bytes of binary data and an EOF marker (EXPECT-END or EXPECT 0), instead of escaping three characters (newline, NUL, and the escape prefix) and the additional unnecessary tedious per-character processing that would require.
First things first: in the documentation, I think SETCONFIG Setting should be SETCONFIG Setting Value.
Now a few questions:
- SETCREDS and GETCREDS have both a username and a password? I'd like to use them to store an OAuth token, but because of this I also have to store a dummy value, which seems weird to me. Is it possible to just do SETCREDS oauth_token XXXYYY123456?
- PROGRESS: my remote is sending PROGRESS xxx every 64kb uploaded or downloaded, but no upload/download progress is displayed by git-annex. Is this normal? Should I do it myself, or will it be done by a future version of git-annex?
Joey, thanks a lot for all your work on git-annex :)
Schnouki, fixed SETCONFIG docs. Note that this is a wiki. :)
I agree it's a little weird for SETCREDS to have a username and a password. This is just exposing git-annex's existing credential storage which has a tuple of values rather than using, say, a multivalue map. If you only need one it's fine to put in a dummy value for the other one.
Re PROGRESS, it seems I hooked up the progress stuff, so it was visible in the webapp, but forgot to put up a progress display at the command line. Fixed in git.
I look forward to seeing your special remote implementation!
Joey, thanks for fixing that so quickly. Indeed it works in the webapp; I'll check the CLI version as soon as possible :)
I just released the new remote on https://github.com/Schnouki/git-annex-remote-hubic. It's for hubiC, a French personal cloud storage service made by OVH that has free 25GB accounts and only charges €10/month for 10TB (VAT included). It's really experimental (hacked it over the weekend), but it seems to work for me so far.
I wonder how should I utilize this new API (WHEREIS) in my case: it seems just to lead to duplication of whereis information in my case of a special remote to support extracting of content from archives. If I make it to reply with the same url (which is not "public" per se, i.e. can't be used by annex directly) I just get it duplicated:
if I "explain" it a bit, also somewhat duplicate:
But if I just reply with "WHEREIS-FAILURE" it becomes more sensible (no duplicates), but I feel that the documentation for this feature should then be adjusted to describe that it is only to complement information already known to annex, and not really to "provide any information about ways to access the content of a key stored in it". Or have I missed the point? ;)
@sjvdwalt, git-annex does not specify or expect any character encoding to be used for this protocol. A robust external special remote shouldn't assume any particular character encoding, either.
Lines will be terminated with '\n' (0xA), and words in lines are delimited by an ascii space (0x20). The keywords in the protocol are ascii too of course. Any values can contain an arbitrary sequence of bytes that may or may not be able to be decoded using the current character encoding.
IIRC, character encodings like UTF8 that encode a character to multiple bytes avoid ever using 0x00 to 0x7F when doing so. So, every ascii space and newline are unambiguously such, and it's safe to split on them even though no encoding is specified.
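In Python this suggests reading the protocol as bytes (for example from sys.stdin.buffer) and splitting before any decoding, since the bytes of a multi-byte UTF-8 sequence never include 0x20 or 0x0A. A sketch, with the helper name parse_bytes made up for illustration:

```python
def parse_bytes(line, nparams):
    """Split a raw protocol line into (command, [parameters as bytes]).

    Only the command keyword is decoded, since it is always ascii.
    Parameter values stay as bytes, because they may not be valid in
    any particular character encoding.
    """
    parts = line.rstrip(b"\n").split(b" ", nparams)
    return parts[0].decode("ascii"), parts[1:]
```

A File value obtained this way can be passed to open() directly, since Python accepts bytes paths.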
There's no point in implementing WHEREIS if it's going to reply with the same values that are passed to SETURIPRESENT.
Some special remotes may not need to use SETURIPRESENT to work at all, and yet storing data on the remote makes it available from some public url. This is the kind of situation where it makes sense to implement WHEREIS.
could you express your expert choice explicitly among those 3 choices how I should react to WHEREIS in my (archives) case
or really not implement it at all? (we are still at VERSION 1, so I thought that not implementing it might lead to some undesired side-effects)
As the documentation says, it's fine to not implement WHEREIS if you don't need it.
I have a question about local storage of credentials. I assumed that when creds were stored in the repo (because the remote is encrypted or because embedcreds=yes), they wouldn't be stored locally in .git/annex/creds. But it seems they are stored locally, in plaintext, regardless.
Is there a way to prevent this? Ideally, credentials should not be stored plaintext at all...but maybe there's a technical issue I'm not seeing.
@szrc, it's pretty expensive to pull encrypted creds out of the git repository and run gpg to decrypt them. Doing so also tends to result in a gpg password prompt.
Rather than do that every time git-annex needs the creds to access the remote, it maintains a local cache file, which has its permissions set so only the local user (and root, naturally) can read it.
Thanks for the response, Joey. It seems to me that many/most operations for which credentials are needed will require a gpg prompt anyway, but I can see why it might be too expensive in some cases.
Anyway, if you ever saw fit to add the option to disable or limit local caching, I would definitely use it -- and I'm guessing I'm not the only one who would prefer not to store credentials in plaintext.
Thanks for all of your great work on git-annex.
Is there a reason that DIRHASH in type=external uses a different format (mixed-case) from that used by type=directory (lower-case)? Or, could there be e.g. DIRHASH LOWER to select between the formats?
I'm writing an external remote for SMB filesystem access and I'd like its storage to be usable via type=directory in case of emergencies or other reasons, but with different dirhash layouts that's not going to work, I assume...
I don't think there's any particularly good reason why DIRHASH uses the mixed case format. However, it can't be changed without busting existing stuff.
So yeah, I've gone ahead and added a DIRHASH_LOWER.
Thanks. It'll probably be safer for my use case of storing data on Windows network shares than the mixed-case version.
(Speaking of "too late to change", DIRHASH-LOWER with a dash might be more consistent with the existing responses?)
It would be nice if
- progress could be provided in percentage rather than bytes, when that's all that's available
- git-annex could inform the special remote how many bytes a file to be downloaded is, if that information is already available
You can find out the size of a key by using
git-annex examinekey $key --format='${bytesize}\n'
(There's a --batch option to avoid needing to spin up repeated such processes.)
I suppose I could add a PROGRESSPERCENT, but any version of git-annex that didn't support it would fail with a protocol error if a special remote tried to use that. So, the special remote would need to check git-annex version to use it.
So, maybe better to get the size of the key yourself, and convert the percentage to bytes for PROGRESS.
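That conversion could be sketched like this. The function name is made up, and size is assumed to be the key's size in bytes, obtained separately. The sketch also skips updates that advance by less than 1% of the size, as the PROGRESS documentation recommends.

```python
def progress_messages(percentages, size):
    """Turn percent-complete reports into PROGRESS lines (byte counts),
    skipping updates that advance by less than 1% of the key's size."""
    lines = []
    last = 0
    step = max(size // 100, 1)  # 1% of the size, at least one byte
    for percent in percentages:
        sent = size * percent // 100
        # Always report completion, even if the last step was small.
        if sent - last >= step or (sent == size and sent != last):
            lines.append("PROGRESS %d" % sent)
            last = sent
    return lines
```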