[arvados] Important - Data integrity patch being prepared + mitigation steps
Tom Morris
tfmorris at veritasgenetics.com
Fri Apr 26 21:32:41 UTC 2019
We recently discovered an issue, introduced in the Arvados v1.3 release,
which can cause certain collections to be tallied incorrectly during block
garbage collection accounting. This can, in turn, make it look like blocks
are not being referenced and are candidates for deletion. This only affects
collections created as output from containers, not those created by the
Arvados Workbench or arv-put. It only affects installations where the
keep-balance service is running with the -commit-trash command line flag
(which is the default) and the keepstore(s) are running with EnableDelete
enabled (which is not the default).

We are testing a fix for this which will be included in an Arvados v1.3.2
hotfix release as soon as it's ready, but in the meantime, we strongly
recommend that customers turn off block garbage collection. The Arvados
DevOps team has already done this for all customers with support contracts,
so if you have one, no action is necessary on your part.

You can minimize impact by preserving data that has been incorrectly
trashed but not yet deleted. On each keepstore server node, update the
config file /etc/arvados/keepstore/keepstore.yml, setting the
TrashCheckInterval to a very long time. Then restart keepstore.

    TrashCheckInterval: 87600h

Also stop the keep-balance daemon on whatever machine it's running on.

    $ systemctl --now disable keep-balance

We will send a followup note shortly when we have a patched version of the
software available, as well as recommendations on data recovery.

Best regards,
Tom Morris
Director, Product Management
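
For sites with many keepstore nodes, the config change described above could be scripted. The following is a minimal sketch, not an official Arvados tool: the function name set_trash_check_interval and the .bak backup suffix are illustrative, and it assumes TrashCheckInterval is a top-level key in keepstore.yml. The value 87600h is roughly ten years.

```shell
#!/bin/sh
# Sketch only: set TrashCheckInterval in a keepstore config file.
# The function name and the ".bak" backup suffix are hypothetical choices,
# not part of Arvados. Assumes TrashCheckInterval is a top-level YAML key.
set_trash_check_interval() {
    conf="$1"       # e.g. /etc/arvados/keepstore/keepstore.yml
    interval="$2"   # e.g. 87600h

    # Keep a copy of the original file before changing anything.
    cp -p "$conf" "$conf.bak" || return 1

    if grep -q '^TrashCheckInterval:' "$conf"; then
        # Setting already present: rewrite that line, reading from the backup.
        sed "s/^TrashCheckInterval:.*/TrashCheckInterval: $interval/" \
            "$conf.bak" > "$conf"
    else
        # Setting absent: make sure the file ends with a newline, then append.
        if [ -n "$(tail -c1 "$conf")" ]; then
            echo >> "$conf"
        fi
        printf 'TrashCheckInterval: %s\n' "$interval" >> "$conf"
    fi
}

# Intended use, run as root (the "keepstore" unit name is an assumption):
#   set_trash_check_interval /etc/arvados/keepstore/keepstore.yml 87600h
#   systemctl restart keepstore
# And on the keep-balance host, as given in the message:
#   systemctl --now disable keep-balance
```

The function only edits the file; restarting keepstore and disabling keep-balance are left as explicit manual steps so that nothing is restarted before the edited config has been reviewed.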