[ARVADOS] updated: 9877c8914fa0bb17fcb9d6f2e30e067f8b135d79
git at public.curoverse.com
Mon Mar 16 12:42:31 EDT 2015
Summary of changes:
...106_fix_collection_portable_data_hash_with_hinted_manifest.rb | 9 +++++++--
1 file changed, 7 insertions(+), 2 deletions(-)
via 9877c8914fa0bb17fcb9d6f2e30e067f8b135d79 (commit)
via 4f3b7339bef1286a34745c7ecd97476f56af469c (commit)
from d16e54da7e751807685f576d089d69417c9094b0 (commit)
Those revisions listed above that are new to this repository have
not appeared on any other notification email; so we list those
revisions in full, below.
commit 9877c8914fa0bb17fcb9d6f2e30e067f8b135d79
Merge: d16e54d 4f3b733
Author: Brett Smith <brett at curoverse.com>
Date: Mon Mar 16 12:41:36 2015 -0400
Merge branch '5319-collection-pdh-fix-performance-wip'
Refs #5319.
commit 4f3b7339bef1286a34745c7ecd97476f56af469c
Author: Brett Smith <brett at curoverse.com>
Date: Mon Mar 16 10:09:57 2015 -0400
5319: Improve performance of Collection PDH fix migration.
* Use PostgreSQL's native regular expression search to limit the
number of records we pull through ActiveRecord.
* Use a smaller batch size to avoid pulling pathological batches of
records that cause swapping.
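
The reasoning behind the two bullets above can be sketched in plain Ruby,
without a database. This is only an illustration, not code from the
migration: the names MAX_MANIFEST_BYTES, worst_case_batch_gib, and hinted?
are hypothetical. The 64MiB maximum manifest size is taken from the comment
in the diff, and 1000 is the default batch size that ActiveRecord's
find_each uses when none is given.

```ruby
# Hypothetical helpers for illustration; none of these names exist in Arvados.

# Maximum manifest size assumed by the migration comment (64MiB).
MAX_MANIFEST_BYTES = 64 * 1024 * 1024

# Worst-case size, in GiB, of the manifest text held in memory by one batch,
# if every collection in the batch has a maximum-size manifest.
def worst_case_batch_gib(batch_size, manifest_bytes = MAX_MANIFEST_BYTES)
  (batch_size * manifest_bytes).to_f / (1024**3)
end

# Ruby-side equivalent of the PostgreSQL filter "manifest_text ~ '\+[A-Z]'":
# true when the manifest contains a "+" followed by an uppercase letter,
# which is how the migration recognizes a locator hint.
def hinted?(manifest_text)
  !!(manifest_text =~ /\+[A-Z]/)
end

puts worst_case_batch_gib(50)    # batch size chosen by this commit
puts worst_case_batch_gib(1000)  # find_each default batch size

# Sample manifests (made up for this sketch).
plain  = ". d41d8cd98f00b204e9800998ecf8427e+0 0:0:empty.txt\n"
hinted = ". d41d8cd98f00b204e9800998ecf8427e+0+Asignature@expiry 0:0:empty.txt\n"
puts hinted?(plain)
puts hinted?(hinted)
```

Running the filter inside PostgreSQL means that rows for which hinted? would
be false are never loaded into Ruby at all, and the smaller batch size
bounds how much manifest text is resident at once.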
diff --git a/services/api/db/migrate/20150303210106_fix_collection_portable_data_hash_with_hinted_manifest.rb b/services/api/db/migrate/20150303210106_fix_collection_portable_data_hash_with_hinted_manifest.rb
index 7f65450..d983e7b 100644
--- a/services/api/db/migrate/20150303210106_fix_collection_portable_data_hash_with_hinted_manifest.rb
+++ b/services/api/db/migrate/20150303210106_fix_collection_portable_data_hash_with_hinted_manifest.rb
@@ -55,8 +55,13 @@ class FixCollectionPortableDataHashWithHintedManifest < ActiveRecord::Migration
end
def each_bad_collection
- Collection.find_each do |coll|
- next unless (coll.manifest_text =~ /\+[A-Z]/)
+ # It's important to make sure that this line doesn't swap. The
+ # worst case scenario is that it finds a batch of collections that
+ # all have maximum size manifests (64MiB). With a batch size of
+ # 50, that's about 3GiB. Figure it will end up being 4GiB after
+ # other ActiveRecord overhead. That's a size we're comfortable with.
+ Collection.where("manifest_text ~ '\\+[A-Z]'").
+ find_each(batch_size: 50) do |coll|
stripped_manifest = coll.manifest_text.
gsub(/( [0-9a-f]{32}(\+\d+)?)(\+\S+)/, '\1')
stripped_pdh = sprintf("%s+%i",
-----------------------------------------------------------------------
hooks/post-receive