This patch provides the support of synchronous cache space recovery
to allow read threads to recover from ENOSPC errors when cache space
can be recovered from cache items that are not in use or safe to be
reset/emptied .
The patch complements the existing cache cleaning process in two ways.
Firstly, the existing cache cleaning process is time-driven that runs
periodically. The cache space can run out while the cache cleaner
thread is still waiting for its next scheduled run. The io threads
encountering ENOSPC return an internal error to the applications
in this case even when cache space can be recovered to avoid this
error. This patch addresses this problem by having the read threads
kick the cache cleaner thread in this condition to recover cache
space preventing unnecessary ENOSPC errors from being seen by the
applications.
Secondly, this patch enhances the cache cleaner to support cache
item reset. Currently the cache purge process removes cache
items that are not in use. This may not be sufficient when the
total size of the working set exceeds the cache directory's
capacity. Like in the current code, this patch starts the purge
process by removing cache files that are not in use. Cache items
whose access times are older than vfs-cache-max-age are removed first.
After that, other not-in-use items are removed in LRU order until
vfs-cache-max-size is reached. If the vfs-cache-max-size (the quota)
is still not reached at this time, this patch adds a cache reset
step to reset/empty cache files that are still in use but not
dirtied. This enables application processes to continue without
seeing an error even when the working set depletes the cache space
as long as there is not a large write working set hoarding the
entire cache space.
By design this patch does not add ENOSPC error recovery for write
IOs. Rclone does not empty a write cache item until the file data
is written back to the backend upon close. Allowing more cache
space to be consumed by dirty cache items when the cache space is
already running low would increase the risk of exhausting the cache
space in a way that the vfs mount becomes unreadable.
The deadlock was caused in transfermap.go by calling mu.RLock() in one
function then calling it again in a sub function. Normally this is
fine, however this leaves a window where mu.Lock() can be called. When
mu.Lock() is called it doesn't allow the second mu.RLock() and
deadlocks.
Thead 1 Thread 2
String():mu.RLock()
del():mu.Lock()
sortedSlice():mu.RLock() - DEADLOCK
Lesson learnt: don't try using locks recursively ever!
This patch fixes the problem by removing the second mu.RLock(). This
was done by factoring the code that was calling it into the
transfermap.go file so all the locking can be seen at once which was
ultimately the cause of the problem - the code which used the locks
was too far away from the rest of the code using the lock.
This problem was introduced in:
bfa5715017 fs/accounting: sort transfers by start time
Which hasn't been released in a stable version yet
- add a directory to the optional Purge interface
- fix up all the backends
- add an additional integration test to test for the feature
- use the new feature in operations.Purge
Many of the backends had been prepared in advance for this so the
change was trivial for them.
This is preparation for getting the Accounting to check the context,
buf first we need to get it in place. Since this is one of those
changes that makes lots of noise, this is in a seperate commit.
go1.15 introduced a stricter policy for what you can convert with
`string()` and now `go vet` warns if you try to do `string(int)`.
See: https://github.com/golang/go/issues/32479
If the parameter being passed is an object then it can be passed as a
JSON string rather than using the `--json` flag which simplifies the
command line.
rclone rc operations/list fs=/tmp remote=test opt='{"showHash": true}'
Rather than
rclone rc operations/list --json '{"fs": "/tmp", "remote": "test", "opt": {"showHash": true}}'
Before this change, if the cache was given a source `remote:file` it
stored `remote:` with the error `fs.ErrorIsFile` attached. This meant
that if it `remote:` was subsequently looked up it would return the
`fs.ErrorIsFile` error.
This broke `moveto remote:file remote:file2` as moveto would lookup
`remote:` from the second argument and erroneously get the
`fs.ErrorIsFile` error.
This likely broke other commands too.
This was broken in
4c9836035 fs/cache: Add Pin and Unpin and canonicalised lookup
Which was released in v1.52.0
The fix is to make a new cache entry for `remote:` with no error
attached in the case that the original call returned `fs.ErrorIsFile`.