[GET-dev] trait-o-matic's mysql queries seem unnecessarily slow

Kimberly Robasky krobasky at gmail.com
Thu Sep 16 15:11:52 EDT 2010


Databases are almost always slower than flat files when you're not
doing any writing.  I would also expect to see significant
performance gains using code that is specifically designed for our data
- in this case, pre-sorting is a big win!  If you want to make an even
faster app, try using C to read in a binary file instead of ASCII
data; however, then you lose the advantage of human-readable files.
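
To make the pre-sorting idea concrete, here is a rough sketch of walking
two sorted flat files in step instead of loading one into MySQL and
querying it.  This is not Trait-o-matic's actual code: the tab-separated
"key<TAB>value" layout, the names parse and merge_join, and the sample
rows are all made up for illustration, and it assumes each file is
sorted by key in plain string order with no duplicate keys.

```python
import io


def parse(line):
    # Hypothetical record layout: "key<TAB>value", one record per line.
    key, _, value = line.rstrip("\n").partition("\t")
    return key, value


def merge_join(left_lines, right_lines):
    """Yield (key, left_value, right_value) for keys found in both inputs.

    Assumes both inputs are sorted by key using the same ordering as
    Python string comparison, and that keys are unique within each input.
    """
    left = map(parse, left_lines)
    right = map(parse, right_lines)
    l = next(left, None)
    r = next(right, None)
    while l is not None and r is not None:
        if l[0] < r[0]:
            l = next(left, None)      # left key is behind, advance left
        elif l[0] > r[0]:
            r = next(right, None)     # right key is behind, advance right
        else:
            yield l[0], l[1], r[1]    # keys match, emit and advance both
            l = next(left, None)
            r = next(right, None)


# Made-up sample data standing in for two pre-sorted files.
genome = io.StringIO("rs1\tA\nrs3\tG\nrs7\tT\n")
reference = io.StringIO("rs2\ttrait-x\nrs3\ttrait-y\nrs7\ttrait-z\n")
matches = list(merge_join(genome, reference))
```

Each file is read once, front to back, and only one record from each is
held in memory at a time, so the cost is dominated by the sort and the
I/O rather than by per-row lookups.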

-Kim

On Thu, Sep 16, 2010 at 3:02 PM, Alexander Wait Zaranek
<awaitz at post.harvard.edu> wrote:
> On Thu, Sep 16, 2010 at 2:55 PM, Madeleine Price Ball <meprice at gmail.com> wrote:
>> I suspect all the other steps in Trait-o-matic processing can be improved in
>> the same way: simultaneously moving through pre-sorted files rather than
>> loading one into MySQL and then querying MySQL.
> I wrote code for another project that does exactly this.  In that
> project, UNIX sort / IO / compression was the bottleneck for
> data-processing.
>
> We might be able to do better than pre-sorting but, clearly, it's an
> improvement over the existing method.
>
> Sasha
>
> _______________________________________________
> GET-dev mailing list
> GET-dev at lists.freelogy.org
> http://lists.freelogy.org/mailman/listinfo/get-dev
>



