--files-from parses input files by ignoring comments starting with # and ;
and stripping whitespace from start and end of strings.
The --files-from-raw flag was added that reads every line from the file ignoring
comment characters and not stripping whitespace while maintaining
backwards compatibility.
Fixes#3762
Before this change if there were two files with the same name and the
same ID in the same directory, dedupe would delete one of them but
since these are actually the same file (with the same ID) then both
files would be deleted leading to data loss.
This should never actually happen, however it did happen as part of a
bug introduced in rclone which was fixed by
dfc7215bf9 drive: fix duplicate items when using --drive-shared-with-me #4018
This change checks to see if any of the duplicates have the same ID
and if they do it refuses to delete them.
Before this change these tests attempted to measure transfers and
checks in lieu of having a rename statistic with a very complicated
heuristic.
The change switches over to using the rename statistic which should be
100% reliable.
Before this change the first pass of --delete-before would output
"There was nothing to transfer" and then proceed to transfer things.
This makes sure the message isn't printed in the delete phase.
See: https://forum.rclone.org/t/incorrect-debug-output/15267
This commit corrects the logic for --track-renames-strategy which
broke the integration tests.
It also improves the parsing of the argument and adds a test for that.
This commit adds the `--track-renames-strategy` flag which allows the
user to choose the strategy for tracking renames when using the
`--track-renames` flag.
This can be "hash" or "modtime" or both currently.
This, when used with `--track-renames-strategy modtime` enables
support for tracking renames in encrypted remotes.
Fixes#3696Fixes#2721
Before this change backends which introduce overhead (eg crypt) were
failing to upload the first file.
This change increases the threshold to 2k to allow the first file to
go through even with some overhead but the next file to definitely
fail.
Before this change we checked the transfer was out of range only
before the Read call. This means that we returned all the data to the
reader before declaring an error. This means that some backends wrote
the file even though an error was returned.
This fix checks the transfer after the Read as well, and chops the
excess characters off the read data if we are over the limit so that
we don't ever deliver all the data.
This fixes the tests introduced as part of 6f1766dd9e and #2672
on backends other than local.
Before this change the exit code for transfer limit exceeded was
incorrect. This was because the `resolveExitCode` function unwraps the
error thus reading the underlying error which is not the same as the
error it was comparing to (`ErrorMaxTransferLimitReached`).
This change fixes it by splitting the error definition in two so that
when the Fatal error is unwrapped we match against
`ErrorMaxTransferLimitReached` however when we return the error we
return `ErrorMaxTransferLimitReachedFatal`.
In bde0334bd8 "operations: fix setting the timestamp on Windows
for multithread copy" the test for multithread copy failed to take
into account the modify window of the remote under test.
Before this fix we attempted to set the modification time on the file
when it was open. This works fine on Linux but not on Windows. The
test was also incorrect testing the source file rather than the
destination file.
This closes the file before setting the modification time and fixes
the tests.
Fixes#3994
Before this change, the elapsed time shown with the --progress flag
would not print ".0s" so the elapsed time.
This change will make it so that the line width is kept a bit more
consistent by always printing to a fixed-point.
This does change the displayed value when the elapsed time
is less than 1s, in which it used to be that the value would be shown
in ms or smaller units.
Signed-off-by: Gary Kim <gary@garykim.dev>
Before this changed we unconditionally fetched the MimeType. On Some
backends like s3 and swift this takes an extra transaction which meant
that `lsf` on those backends was needlessly slow.
This adds an internal option so `lsf` can declare whether it wants
MimeTypes or not depending on whether the user asked for them and an
external flag `--no-mimetype` for `lsjson`.
See: https://forum.rclone.org/t/reliably-setup-incremental-updates/14006/8
This is a timing dependent test and to make it long enough so that it
would work with the remotes would make it too long for local tests.
The code paths are identical for local vs non-local so just run on
local.
This fixes the integration tests.
This gives you more control over how long rclone will run for, making
it easier to script backups, e.g. via cron. Once the `--max-duration`
time limit is reached, no new transfers will be initiated, but those
already in-flight will be allowed to complete.
Fixes#985
Before this change the expect/continue timeout was set to
--conntimeout which was 60s by default which is too long to wait.
This was noticed when using s3 with a proxy which apparently didn't
support expect / continue properly.
Set --expect-continue-timeout 0 to disable expect/continue.
Statistics of ransfers which were interrupted are not cleared before
retry iteration. These transfers completed with over 100 percentage.
This change clears transfer accounting before next retry iteration is
done in order to keep numbers in track.
Fixes#3861
Without the fix we can have a race, example:
```
Write at 0x00c000432039 by goroutine 187:
github.com/rclone/rclone/fs/accounting.(*StatsInfo).Error()
fs/accounting/stats.go:495 +0x3f1
github.com/rclone/rclone/fs/accounting.(*StatsInfo).Error-fm()
fs/accounting/stats.go:477 +0x55
github.com/rclone/rclone/fs/walk.listRwalk.func1()
fs/walk/walk.go:162 +0xd2
github.com/rclone/rclone/fs/walk.walk.func2()
fs/walk/walk.go:402 +0x30f
Previous read at 0x00c000432039 by goroutine 184:
github.com/rclone/rclone/fs/accounting.(*statsGroups).sum()
fs/accounting/stats_groups.go:351 +0xcae
github.com/rclone/rclone/fs/accounting.rcTransferredStats()
fs/accounting/stats_groups.go:132 +0x1f4
```
Fixes#3844
Before this change if an operation was retried on operations.Copy and
the operation was large enough to use an async buffer then an async
buffer was leaked on the retry. This leaked memory, a file handle and
a go routine.
After this change if Account.WithBuffer is called and there is already
a buffer, then a new one won't be allocated.
rclone library users might be intrested in changing default value to
other, or even disabling it. With current version it's impossible which
leads to races when number of uploaded objects exceeds default limit.
Fixes#3732
For few commands, RClone counts a error multiple times. This was fixed by
creating a new error type which keeps a flag to remember if the error has
already been counted or not. The CountError function now wraps the original
error eith the above new error type and returns it.
from rsync manual:
--compare-dest=DIR
This option instructs rsync to use DIR on the destination machine as an
additional hierarchy to compare destination files against doing transfers
(if the files are missing in the destination directory). If a file is found
in DIR that is identical to the sender's file, the file will NOT be
transferred to the destination directory. This is useful for creating
a sparse backup of just files that have changed from an earlier backup.
--copy-dest=DIR
This option behaves like --compare-dest, but rsync will also copy unchanged
files found in DIR to the destination directory using a local copy.
This is useful for doing transfers to a new destination while leaving
existing files intact, and then doing a flash-cutover when all files
have been successfully transferred.
On google fs (drive, google photos, and google cloud storage), if
headless is selected, do not open browser.
This also supplies a new option "auth-no-open-browser" for authorize
if the user does not want it.
This should fix#3323.
This was broken in e337cae0c5 when we deleted the transfers
immediately.
This is fixed by keeping a merged slice of time ranges of completed
transfers and adding those to the current transfers.
In a28239f005 we made --files-from obey --no-traverse. In the
process this caused --files-from without --no-traverse to do a
complete recursive scan unecessarily.
This was only noticeable in users of fs/march, so sync/copy/move/etc
not in ls/lsf/etc.
This fix makes sure that we use conventional directory listings in
fs/march unless `--files-from` and `--no-traverse` is set or
`--fast-list` is active.
Fixes#3619
It was reported that v1.49.4 which was accidentally compiled with
go1.13 instead of go1.12 produced errors like this:
Failed to get StartPageToken: Get https://www.googleapis.com/drive/v3/changes/startPageToken?XXX: stream error: stream ID 1789; INTERNAL_ERROR
IO error: open file failed: Get https://www.googleapis.com/drive/v3/files/XXX?alt=media: stream error: stream ID 1781; INTERNAL_ERROR
These are errors from the http2 library. It appears that go1.13 when
communicating with google drive defaults to http2 whereas with go1.12
it doesn't.
It is unclear what is causing these errors, but retrying them since
they don't happen very often seems like a valid strategy.
This was fixed in v1.49.5 by compiling with go1.12 - this fix is
designed to work with go1.13
See: https://forum.rclone.org/t/1-49-4-plex-internal-errors-on-google-drive/12108/
Before this fix, attempting to set a non top level environment
variable would fail with "Couldn't find flag".
This fixes it by passing in the flags that the env var is being set
from.
Fixes#3615
Before this change --update would transfer any file which was newer
than the destination regardless of whether it had changed or not.
This is needlessly wasteful of bandwidth.
After this change --update will only transfer files if they are newer
**and** they are different (checked with checksum and size).
'Couldn't find file - Need to Transfer' changed to 'Need to transfer -
File Not Found at Destination' because while reading the debug logs, it
confuses with failure in operation.
changes:
- chunker: remove GetTier and SetTier
- remove wdmrcompat metaformat
- remove fastopen strategy
- make hash_type option non-advanced
- adverise hash support when possible
- add metadata field "ver", run strict checks
- describe internal behavior in comments
- improve documentation
note:
wdmrcompat used to write file name in the metadata, so maximum metadata
size was 1K; removing it allows to cap size by 200 bytes now.
In 53a1a0e3ef we introduced a problem where if there was an
error on the file being transferred then the file was re-opened and
the old one wasn't closed.
This was partially fixed in bfbddab46b however this didn't
address the case of the old file being closed.
This is now fixed by
- marking the file as open again in UpdateReader
- moving the stopping the accounting machinery to a new method Done
Before this change you could only configure the local backend flags
which don't have the local prefix (eg `--copy-links`) with
`RCLONE_LOCAL_COPY_LINKS`.
This change makes `RCLONE_COPY_LINKS` valid too which is much more
logical for the users.
Fixes#3534
Before this change it was possible to make a remote with an invalid
name in the config file, either manually or with `rclone config
create` (but not with `rclone config`).
When this remote was used, because it was invalid, rclone would
presume this remote name was a local directory for a very suprising
user experience!
This change checks remote names more carefully and returns errors
- when the user tries to use an invalid remote name on the command line
- when an invalid remote name is used in `rclone config create/update/password`
- when the user tries to enter an invalid remote name in `rclone config`
This does not prevent the user entering a remote name with invalid
characters in the config manually, but such a remote will fail
immediately when it is used on the command line.
Before this change, using -P occasionally deadlocked on the Transfer
mutex when Transfer.Done() was called with a non nil error and the
StatsInfo mutex since they mutually call each other.
This was fixed by making sure that the Transfer mutex is always
released before calling any StatsInfo methods.
This improves on: 6f87267b34Fixes#3505
Before this change if -u/--update was in effect we compared the size
of the files to see if the transfer should go ahead. This was
comparing -1 with an actual size so the transfer always proceeded.
After this change we use the existing `sizeDiffers` function which
does the correct comparison with -1 for files of unknown length.
See: https://forum.rclone.org/t/sync-with-google-photos-to-local-drive-will-result-in-recoping/11605
Without this patch the resulting error is first converted to string and then recreated.
This makes it impossible to use the defined error types to figure out the cause of the error,
and may result in invalid HTTP status codes.
This patch adds a test TestExecuteJobErrorPropagation to validate that the errors are
properly propagated.
Before this change we didn't calculate or check hashes of transferred
files if --size-only mode was explicitly set.
This problem was introduced in 20da3e6352 which was released with v1.37
After this change hashes are checked for all transfers unless
--ignore-checksums is set.
Before this change we calculated the checkums when using
--ignore-checksum but ignored them at the end.
Now we don't calculate the checksums at all which is more efficient.
Before this change for a post copy Hash check we would run the hashes sequentially.
Now we run the hashes in parallel for a useful speedup.
Note that this refactors the hash check in Copy to use the standard
hash checking routine.
The error "tls: bad record MAC" is very likely to be caused by
hardware issues. It indicates that a packet got corrupted somewhere.
As a work around, this change treats it as retriable error which
allows the chunk to get retried and the transfer to continue.
See: https://forum.rclone.org/t/rc-rc-job-expire-interval-bug/11188
rclone was ignoring the --rc-job-expire-duration and --rc-job-interval
flags. This turned out to be an initialization order problem and was
fixed by moving those flags out of global config into rc config.
Before this change, using -P occasionally deadlocked on the transfer
mutex and the stats mutex since they call each other via the progress
printing.
This is fixed by shortening the locking windows and converting the
mutex to a RW mutex.
Prior to this fix, a request such as
rclone lsf -R --include "/dir/**" remote:
Would use ListR which is very inefficient as it lists the whole remote
for one directory.
This changes it to use recursive walking if the filters imply any
directory filtering. So `--include *.jpg` and `--exclude *.jpg` will
still use ListR wheras `--include "/dir/**` will not.
Before this fix rclone calculated all the hashes on transfer. This
was particularly slow for the local backend.
After the fix we just calculate one hash which is enough for data
integrity.
This was factored from fstest as we were including the testing
enviroment into the main binary because of it.
This was causing opening the browser to fail because of 8243ff8bc8.
This adds experimental support for web gui integration so that rclone can fetch and run a web based GUI using the --rc-web-ui and related flags.
It downloads and caches a webui zip file which it then unpacks and opens in the browser.
core/stats can return two different schemas in 'transferring' field.
One is object with fields the other is just plain string.
This is confusing, unnecessary and makes defining response schema
more difficult. It also returns `lastError` as value which can be
rendered differently depending on source of error.
This change standardizes 'transferring' filed to always return
object but with reduced fields if they are not available.
Former string item is converted to {name:remote_name} object.
'lastError' is forced to be a string as in some cases it can be encoded
as an object.
Introduce stats groups that will isolate accounting for logically
different transferring operations. That way multiple accounting
operations can be done in parallel without interfering with each other
stats.
Using groups is optional. There is dedicated global stats that will be
used by default if no group is specified. This is operating mode for CLI
usage which is just fire and forget operation.
For running rclone as rc http server each request will create it's own
group. Also there is an option to specify your own group.
This is done to make clear ownership over accounting object and prepare
for removing global stats object.
Stats elapsed time calculation has been altered to account for actual
transfer time instead of stats creation time.
This was started by Fionera, finished off by Laura with fixes and more
docs from Nick.
Co-authored-by: Fionera <fionera@fionera.de>
Co-authored-by: Nick Craig-Wood <nick@craig-wood.com>
We now have a backend (fichier) which doesn't support 0 length
files. Therefore all 0 length files in the tests have been replaced
with length 1.
In a future commit we will implement a test for 0 length files.
- Change rclone/fs interfaces to accept context.Context
- Update interface implementations to use context.Context
- Change top level usage to propagate context to lover level functions
Context propagation is needed for stopping transfers and passing other
request-scoped values.
Before this change when using "rclone config create" it wasn't
possible to add passwords in one go, it was necessary to call "rclone
config password" to add the passwords afterwards as "rclone config
create" didn't obscure passwords.
After this change "rclone config create" and "rclone config update"
will obscure passwords as necessary as will the corresponding API
calls config/create and config/update.
This makes "rclone config password" and its API config/password
obsolete, however they will be left for backwards compatibility.
Before this change serving bucket based objects
`[remote:bucket]/path/to/object` would fail with 404 not found.
This was because the leading `/` in `/path/to/object` was being passed
to NewObject.
When sorting fs.DirEntries we sort by DirEntry type and
when synchronizing files let the directories be before objects,
so when the destintation fs doesn't support duplicate names,
we will only lose duplicated object instead of whole directory.
The enables synchronisation to work with a file and a directory of the same name
which is reasonably common on bucket based remotes.
Before this change, when using rclone copy or move with --backup-dir
and the source was a single file, rclone would fail to use the backup
directory.
This change looks up the backup directory in the Fs cache and uses it
as appropriate.
This affects any commands which call operations.MoveFile or
operations.CopyFile which includes rclone move/moveto/copy/copyto
where the source is a single file.
Fixes#3219
Before this change we calculated the checksum which is potentially
time consuming and then ignored the result. After the change we don't
calculate the checksum if we are about to ignore it.
Tests have been randomly failing with messages like
listen tcp 127.0.0.1:51778: bind: address already in use
Rework all the test servers so they choose a random free port on
startup and use that for the tests to avoid.
In c5ac96e9e7 we made --files-from only read the objects specified and
don't scan directories.
This caused problems with Google drive (very very slow) and B2
(excessive API consumption) so it was decided to make the old
behaviour (traversing the directories) the default with --files-from
and use the existing --no-traverse flag (which has exactly the right
semantics) to enable the new non scanning behaviour.
See: https://forum.rclone.org/t/using-files-from-with-drive-hammers-the-api/8726Fixes#3102Fixes#3095
Before the fix we were only de-duping the ListR batches.
Afterwards we dedupe everything.
This will have the consequence that rclone uses more memory as it will
build a map of all the directory names, not just the names in a given
directory.
This dramatically increases the speed (7x in my tests) of the de-dupe
as google drive supports ListR directly and dedupe did not work with
`--fast-list`.
Fixes#2902
This will increase speed for backends which support ListR and will not
have the memory overhead of using --fast-list.
It also means that errors are queued until the end so as much of the
remote will be listed as possible before returning an error.
Commands affected are:
- lsf
- ls
- lsl
- lsjson
- lsd
- md5sum/sha1sum/hashsum
- size
- delete
- cat
- settier
It otherwise has the nearly the same interface as walk.Walk which it
will fall back to if it can't use ListR.
Using walk.ListR will speed up file system operations by default and
use much less memory and start immediately compared to if --fast-list
had been supplied.
Make the pacer package more flexible by extracting the pace calculation
functions into a separate interface. This also allows to move features
that require the fs package like logging and custom errors into the fs
package.
Also add a RetryAfterError sentinel error that can be used to signal a
desired retry time to the Calculator.
This brings it up to par with lsjson.
This commit also reworks the framework to use ListJSON internally
which removes duplicated code and makes testing easier.
This puts a shim on the reader opened by Copy so that if an error is
returned, the reader is re-opened at the correct seek point.
This should make downloading very large files more reliable.
This replaces the `sync.Pool` allocator with lib/pool. This
implements a pool of buffers of up to 64MB which can be re-used but is
flushed every 5 seconds.
If `--use-mmap` is set then rclone will use mmap for memory
allocations which is much better at returning memory to the OS.
* drive: don't run teamdrive config if auto confirm set
* onedrive: don't run extra config if auto confirm set
* make Confirm results customisable by config
Fixes#1010
This fixes several things wrong with the layout of the stats.
Transfers which haven't started are printed in the same format as
those which have so the stats with `--progress` don't show horrible
artifacts.
Checkers and transfers now get a ": checkers" and ": transfers" label
on the end of the stats line. Transfers will have the transfer stats
when the transfer has started instead of this.
There was a bug in the routine which shortened the file names (it
always produces strings 1 too long). This is now fixed with a test.
The formatting string was wrong with a fixed width of 45 - this is now
replaces with the value of `--stats-file-name-length`.
This also meant that there were unecessary leading spaces in the file
names. So the default `--stats-file-name-length` was raised to 45
from 40.
Cookies are handled by cookiejar in memory with fshttp module through
the entire session.
One useful scenario is, with HTTP storage system where index server
adds authentication cookie while redirecting to CDN for actual files.
Also, it can be helpful to reuse fshttp in other storage systems
requiring cookie.
This means that rclone will pick up tokens from concurrently running
rclones. This helps for Box which only allows each refresh token to
be used once.
Without this fix, rclone caches the refresh token at the start of the
run, then when the token expires the refresh token may have been used
already by a concurrently running rclone.
This also will retry the oauth up to 5 times at 1 second intervals.
See: https://forum.rclone.org/t/box-token-refresh-timing/8175
Before this change rclone would read the list of files from the
files-from parameter and check they existed one at a time. This could
take a very long time for lots of files.
After this change, rclone will check up to --checkers in parallel.
The --no-traverse flag was not implemented when the new sync routines
(using the march package) was implemented.
This re-implements --no-traverse in march by trying to find a match
for each object with NewObject rather than from a directory listing.
Before this change TestPurge would remove a container and subsequent
tests would fail because the container was still being deleted so
couldn't be created.
This was fixed by introducing an fstest.NewRunIndividual() test runner
for TestPurge which causes the test to be run on a new container.
If `--rc-user` or `--rc-pass` is set then the URL that is opened with
`--rc-files` will have the authorization in the URL in the
`http://user:pass@localhost/` style.
Before this change, Purge on the fallback path would try to delete
directories starting from the root rather than the dir passed in.
Rmdirs would also attempt to delete the root.
Before this change using --files-from would scan all the directories
that the files could possibly be in causing rclone to do more work
that was necessary.
After this change, rclone constructs an in memory tree using the
--fast-list mechanism but from all of the files in the --files-from
list and without scanning any directories.
Any objects that are not found in the --files-from list are ignored
silently.
This mechanism is used for sync/copy/move (march) and all of the
listing commands ls/lsf/md5sum/etc (walk).
Use the same function to join the root paths for the wrapping remotes
alias, cache and crypt.
The new function fspath.JoinRootPath is equivalent to path.Join, but if
the first non empty element starts with "//", this is preserved to allow
Windows network path to be used in these remotes.
Before this change remotes without server side Move (eg swift, s3,
gcs) would not be able to rename files.
After it means nearly all remotes will be able to rename files on
rclone mount with the notable exceptions of b2 and yandex.
This changes checks to see if the remote can do Move or Copy then
calls `operations.Move` to do the actual move. This will do a server
side Move or Copy but won't download and re-upload the file.
It also checks to see if the destination exists first which avoids
conflicts or duplicates.
Fixes#1965Fixes#2569
Before this change the moving average for the individual file stats
would start at 0 and only converge to the correct value over 15-30
seconds.
This change starts the weighting period as 1 and moves it up once per
sample which gets the average to a better value instantly.
Previous to this change package used for this
github.com/VividCortex/ewma took a 0 average to mean reset the
statistics. This happens quite often when transferring files though a
buffer.
Replace that implementation with a simple home grown one (with about
the same constant), without that feature.
This change allows remotes to be created on the fly without a config
file by using the remote type prefixed with a : as the remote name, Eg
:s3: to make an s3 remote.
This assumes the user is supplying the backend config via command line
flags or environment variables.
The race detector currently detects a race with len(chan) against
close(chan).
See: https://github.com/golang/go/issues/27070
Skip the tests which trip this bug under the race detector.
--max-backlog controls the queue length.
Add statistics for the check/upload/rename queues.
This means that checking can complete before the uploads which will
give rclone the ability to show exactly what is outstanding.