[arvados-dev] Manifest changes for API & Ruby SDK
Brett Smith
brett at curoverse.com
Sun Sep 7 10:52:40 EDT 2014
Hi everyone,
I recently merged code to resolve #3720, where Workbench would time out
before it could render large collections. For being “just” a performance
issue, we had to make a surprising number of changes to code organization to
get this fixed, so I wanted to outline those for everyone.
The crux of the issue was simple: Collections had a files attribute, an
array that lists each file in the Collection along with its location and
size. When the manifest contained many files, it took too long for the
API server to parse the entire manifest and then encode this “alternate view”
of the manifest to JSON.
Collections API changes
The fix is equally simple to explain: the API server no longer prepares or
provides the files attribute, for list or get requests. The related
data_size attribute is also gone. Clients are responsible for their own
manifest parsing. This is still a performance win, because it avoids a JSON
encode/decode cycle, and many consumers don’t want or need to parse the
entire manifest; they only care about the first few files.
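To give a rough idea of what client-side parsing involves, here is a minimal sketch that rebuilds a files-style list from manifest text. It assumes the usual manifest layout (one stream per line: a stream name, then block locators, then position:size:name file tokens, with spaces in names escaped as \040). The helper names manifest_files and unescape_token are made up for this illustration and are not part of any Arvados SDK.

```ruby
# Hypothetical helpers for illustration only; not part of the Arvados SDK.

# Manifest tokens escape special characters as a backslash plus three
# octal digits (for example, a space is written as \040).
def unescape_token(token)
  token.gsub(/\\([0-7]{3})/) { $1.to_i(8).chr }
end

# Returns an array of [stream_name, file_name, size] entries.
# Assumption: each file appears as a single position:size:name token.
# Files split across several tokens are listed once per token, not merged.
def manifest_files(manifest_text)
  files = []
  manifest_text.each_line do |line|
    tokens = line.split
    next if tokens.empty?
    stream_name = unescape_token(tokens.first)
    tokens.drop(1).each do |token|
      # Block locators contain no colons, so they never match this pattern.
      match = /\A(\d+):(\d+):(.+)\z/.match(token)
      next unless match
      files << [stream_name, unescape_token(match[3]), match[2].to_i]
    end
  end
  files
end
```

This version always walks the whole manifest, so it is only appropriate when a client really does need every file.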
Ruby SDK changes
The API server and Workbench both use this file information. To avoid
duplicating code between them, I added a Keep::Manifest class to our Ruby
SDK. This handles common parsing tasks, and both the API server and
Workbench use it now. The API server’s old Locator class became
Keep::Locator and moved here too.
It sounds a little funny to say that our API server depends on our SDK, but
it makes more sense when you remember it’s using the Keep client part of
the SDK.
Keep::Manifest yields results incrementally as they’re parsed out. This
way, if you only need information from the early part of the manifest, you
can get that without the expense of parsing the entire text.
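The incremental idea can be sketched roughly as follows. This is not the actual Keep::Manifest code, and its real interface may differ; SketchManifest, each_file, and lines_parsed are names invented for this example. The lines_parsed counter exists only to make the early exit visible, and name unescaping is left out for brevity.

```ruby
# Hypothetical sketch of incremental manifest parsing.
# This is NOT the real Keep::Manifest class; names here are assumptions.
class SketchManifest
  # Number of manifest lines examined so far (for demonstration only).
  attr_reader :lines_parsed

  def initialize(manifest_text)
    @text = manifest_text
    @lines_parsed = 0
  end

  # Yields [stream_name, file_name, size] for one file token at a time.
  # Without a block, returns an Enumerator so callers can stop early.
  def each_file
    return enum_for(:each_file) unless block_given?
    @text.each_line do |line|
      @lines_parsed += 1
      stream_name, *rest = line.split
      next if stream_name.nil?
      rest.each do |token|
        # File tokens look like position:size:name; locators do not match.
        match = /\A\d+:(\d+):(.+)\z/.match(token)
        yield [stream_name, match[2], match[1].to_i] if match
      end
    end
  end
end

sample = ". acbd18db4cc2f85cedef654fccc4a4d8+3 0:3:foo.txt\n" \
         "./sub 37b51d194a7513e45b56f6524f2d51f2+3 0:3:bar.txt\n"
manifest = SketchManifest.new(sample)
# Only as much of the manifest is scanned as needed to find one file.
first_file = manifest.each_file.first(1)
```

Because each_file yields as it goes, a caller that takes only the first few entries never pays for the remaining lines.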
Build changes
Jenkins now tests our Ruby SDK, builds the gem, and uses that gem when
testing other components like the API server. This means it takes a little
care to manage the servers’ version dependency on the arvados gem, just
like we’ve had with the API server and arvados-cli for a while. I’ve
updated our Ruby SDK wiki page
<https://arvados.org/projects/arvados/wiki/Hacking_Ruby_SDK> with more
information about how you can test your changes before pushing them, and
arrange a consistent build with Jenkins.
Let me know if you have questions about any of this.