[ARVADOS] created: a829e7e14aa6380616337e0e7dda47c4f9f7022c

Git user git at public.curoverse.com
Mon May 2 16:26:09 EDT 2016


        at  a829e7e14aa6380616337e0e7dda47c4f9f7022c (commit)


commit a829e7e14aa6380616337e0e7dda47c4f9f7022c
Author: Peter Amstutz <peter.amstutz at curoverse.com>
Date:   Fri Apr 29 09:11:15 2016 -0400

    8998: Monkey patch URI.decode_www_form_component to validate efficiently.
    
    Rack uses the standard library method URI.decode_www_form_component to process
    parameters.  This method first validates the string with a regular expression,
    and then decodes it using another regular expression.  Ruby 2.1 and earlier
    have a bug in the validation: the regular expression that is used generates
    many backtracking points, which results in exponential memory growth when
    matching large strings.  The fix is to monkey-patch in the version of the
    method from Ruby 2.2, which checks that the string is not invalid instead
    of checking that it is valid.
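
The "reject what is invalid" check described above can be sketched in isolation. This is a minimal illustration under stated assumptions, not the patched method itself: the names INVALID_ESCAPE, valid_percent_encoding?, and decode_component are hypothetical, and the decoding step uses a block with pack instead of the standard library's lookup table so that the example stays self-contained.

```ruby
# Hypothetical names for illustration only; not part of the commit.

# Matches a '%' that is NOT followed by two hex digits, i.e. a malformed escape.
INVALID_ESCAPE = /%(?!\h\h)/

# True when the string contains no malformed %-escape.
def valid_percent_encoding?(str)
  (INVALID_ESCAPE =~ str).nil?
end

# Decode a form component: '+' becomes a space, '%XX' becomes the byte 0xXX.
def decode_component(str, enc = Encoding::UTF_8)
  raise ArgumentError, "invalid %-encoding" unless valid_percent_encoding?(str)
  decoded = str.b.gsub(/\+|%\h\h/) do |m|
    m == '+' ? ' ' : [m[1, 2]].pack('H2')
  end
  decoded.force_encoding(enc)
end

puts decode_component("a%20b+c")
puts valid_percent_encoding?("100%")
```

The pattern has no nested repetition: it only searches for a '%' that lacks two following hex digits, so there should be little for the regex engine to backtrack through even on very large inputs.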

diff --git a/services/api/config/initializers/fix_www_decode.rb b/services/api/config/initializers/fix_www_decode.rb
new file mode 100644
index 0000000..bc50c12
--- /dev/null
+++ b/services/api/config/initializers/fix_www_decode.rb
@@ -0,0 +1,16 @@
+module URI
+  if Gem::Version.new(RUBY_VERSION) < Gem::Version.new('2.2')
+    # Rack uses the standard library method URI.decode_www_form_component to
+    # process parameters.  This method first validates the string with a
+    # regular expression, and then decodes it using another regular expression.
+    # Ruby 2.1 and earlier have a bug in the validation: the regular
+    # expression that is used generates many backtracking points, which results
+    # in exponential memory growth when matching large strings.  The fix is to
+    # monkey-patch in the version of the method from Ruby 2.2, which checks
+    # that the string is not invalid instead of checking that it is valid.
+    def self.decode_www_form_component(str, enc=Encoding::UTF_8)
+      raise ArgumentError, "invalid %-encoding (#{str})" if /%(?!\h\h)/ =~ str
+      str.b.gsub(/\+|%\h\h/, TBLDECWWWCOMP_).force_encoding(enc)
+    end
+  end
+end
diff --git a/services/api/test/helpers/time_block.rb b/services/api/test/helpers/time_block.rb
index a3b03ff..5c75373 100644
--- a/services/api/test/helpers/time_block.rb
+++ b/services/api/test/helpers/time_block.rb
@@ -8,4 +8,16 @@ class ActiveSupport::TestCase
       $stderr.puts "#{t1 - t0}s #{label}"
     end
   end
+
+  def vmpeak(c)
+    File.foreach("/proc/self/status") do |line|
+      print "Begin #{c} #{line}" if line =~ /^VmHWM:/
+    end
+    n = yield
+    File.foreach("/proc/self/status") do |line|
+      print "End #{c} #{line}" if line =~ /^VmHWM:/
+    end
+    n
+  end
+
 end
diff --git a/services/api/test/integration/collections_performance_test.rb b/services/api/test/integration/collections_performance_test.rb
index 892060a..634ac11 100644
--- a/services/api/test/integration/collections_performance_test.rb
+++ b/services/api/test/integration/collections_performance_test.rb
@@ -37,4 +37,18 @@ class CollectionsApiPerformanceTest < ActionDispatch::IntegrationTest
       delete '/arvados/v1/collections/' + uuid, {}, auth(:active)
     end
   end
+
+  test "memory usage" do
+    hugemanifest = make_manifest(streams: 1,
+                                 files_per_stream: 2000,
+                                 blocks_per_file: 200,
+                                 bytes_per_block: 2**26,
+                                 api_token: api_token(:active))
+    json = time_block "JSON encode #{hugemanifest.length>>20}MiB manifest" do
+      Oj.dump({manifest_text: hugemanifest})
+    end
+    vmpeak "post" do
+      post '/arvados/v1/collections', {collection: json}, auth(:active)
+    end
+  end
 end

-----------------------------------------------------------------------


hooks/post-receive
-- 



