gabeiscoding | 6 months ago | on: Show HN: E-Paper Family 2 Day Calendar
gabeiscoding's comments
gabeiscoding | 3 years ago | on: Oxide and the Chamber of Mysteries [video]
But now I can look forward to their leap over to “social audio”. First Twitter spaces (where I would sometimes chime in live) and now the Discord-hosted On the Metal.
Most podcasts are sports commentary. These guys are full-contact in the game. I love it. Keep it up, Bryan!
gabeiscoding | 5 years ago | on: Parallel Seam Carving
It was ages ago, and has since been archived on Google Code :)
[0] https://code.google.com/archive/p/seam-carving-gui/ [1] https://github.com/esimov/caire
gabeiscoding | 6 years ago | on: Real-world dynamic programming: seam carving
The original paper was discussed on Slashdot, and back then I was inspired to build a little GUI around an open source implementation of the algorithm to exercise my Qt skills.
It lets you shrink or expand an image and "mask out" regions you don't want touched, etc.
Still available on Google Code archive:
gabeiscoding | 7 years ago | on: Advanced techniques to implement fast hash tables
I've been learning Rust recently. As a learning exercise, I compared the Robin Hood hashing of std::collections::HashMap in Rust to the klib/khash he mentions in this article, and then tried various hash functions to try to match his performance:
https://github.com/gaberudy/hash_test
No dice, his hash table is smaller and faster.
My next step is to try to implement his data structure and hashing functions directly in Rust and see if I can get it to near-C performance...
gabeiscoding | 8 years ago | on: Astronaut’s DNA No Longer Matches His Identical Twin’s After Year Spent in Space
He's a great speaker and a cool guy, and he tackles some of the most interesting (at least to hear about) science in genomics.
[1] https://www.dropbox.com/s/sfg6rdmgxjwdpil/Mason_NEB_talk_AGB... [2] https://twitter.com/mason_lab/status/964151387687972864
gabeiscoding | 8 years ago | on: Content-aware image resize library
Not as fancy as Photoshop, I'm sure, but it does have the ability to paint a mask of regions to keep / remove to aid the algorithm and get the desired result. Multi-threaded too!
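For anyone curious how a keep / remove mask can steer the algorithm, here is a minimal Python sketch. It is not the code from my GUI; it assumes the common trick of adding a large positive bias to "keep" pixels and a large negative bias to "remove" pixels before the usual dynamic-programming seam search. The names `KEEP`, `REMOVE`, `BIAS` and `find_vertical_seam` are made up for illustration.

```python
from typing import List

KEEP, REMOVE = 1, -1   # hypothetical mask labels; 0 means "no preference"
BIAS = 1_000_000.0     # assumed to dwarf any real pixel energy


def find_vertical_seam(energy: List[List[float]], mask: List[List[int]]) -> List[int]:
    """Return one column index per row for the cheapest top-to-bottom seam."""
    rows, cols = len(energy), len(energy[0])
    # Bias the energy: seams avoid KEEP pixels and are drawn to REMOVE pixels.
    cost = [[energy[r][c] + BIAS * mask[r][c] for c in range(cols)] for r in range(rows)]

    # Dynamic programming: total[r][c] is the cost of the cheapest seam ending at (r, c).
    total = [cost[0][:]]
    for r in range(1, rows):
        prev = total[-1]
        row = []
        for c in range(cols):
            lo, hi = max(c - 1, 0), min(c + 1, cols - 1)
            row.append(cost[r][c] + min(prev[lo:hi + 1]))
        total.append(row)

    # Backtrack from the cheapest cell in the last row, one row up at a time.
    c = min(range(cols), key=lambda j: total[-1][j])
    seam = [c]
    for r in range(rows - 1, 0, -1):
        lo, hi = max(c - 1, 0), min(c + 1, cols - 1)
        c = min(range(lo, hi + 1), key=lambda j: total[r - 1][j])
        seam.append(c)
    return seam[::-1]


# Toy example: flat energy everywhere, column 2 painted as REMOVE.
energy = [[1.0] * 4 for _ in range(3)]
mask = [[0, 0, REMOVE, 0] for _ in range(3)]
print(find_vertical_seam(energy, mask))
```

With flat energy the seam is decided entirely by the mask, which is the point: the painted regions override whatever the energy function would have picked on its own.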
gabeiscoding | 10 years ago | on: Google Genomics: store, process, explore and share genomic data
It looks to be solving the same problems as DNAnexus, Seven Bridges, BaseSpace, etc., as a way to wrap open source tools in more user-friendly ways.
But it's orchestrating the production of a smaller set of data that still needs the next step of human interpretation, report writing, family-aware algorithms and more complex annotations (the problem space Golden Helix is in).
In other words, it handles the automatable bits, which are not the hard part that I mentioned in my blog post.
gabeiscoding | 10 years ago | on: Google Genomics: store, process, explore and share genomic data
Ultimately, the "hard part" of genomics is not big data that requires Spanner and BigTable to get anything done. I actually wrote a blog post about this just this week:
http://blog.goldenhelix.com/grudy/genomic-data-is-big-data-b...
Both BAM and VCF files can be hosted on a plain HTTP file server and be meaningfully queried through their BAI/TBI indexes. Visualization tools like our GenomeBrowse or the Broad's IGV can already read S3-hosted genomic files directly and very efficiently (gzip-compressed blocks of binary data) without an API layer. So I see the translation of the exact same data into an API-only storage system, where I can't download the VCF and do quick, iterative analysis on it, as more of a downside than a plus.
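To make the indexed-access point concrete, here is a rough Python sketch of how an index entry can turn into a plain HTTP range request. It assumes the BGZF virtual-offset layout from the SAM/BAM specification (upper 48 bits: byte offset of the compressed block in the file; lower 16 bits: offset inside the decompressed block). The function names and the example offsets are made up for illustration; this is not code from GenomeBrowse or IGV.

```python
from typing import Dict, Tuple


def split_virtual_offset(voffset: int) -> Tuple[int, int]:
    """Split a BGZF virtual offset into (compressed block start, offset within block)."""
    return voffset >> 16, voffset & 0xFFFF


def range_header(start_voffset: int, end_voffset: int, max_block_size: int = 65536) -> Dict[str, str]:
    """Build an HTTP Range header covering every BGZF block an index chunk touches.

    A BGZF block is at most 64 KiB compressed, so reading max_block_size bytes
    from the start of the last block is enough to include it completely.
    """
    first_block, _ = split_virtual_offset(start_voffset)
    last_block, _ = split_virtual_offset(end_voffset)
    last_byte = last_block + max_block_size - 1
    return {"Range": f"bytes={first_block}-{last_byte}"}


# Hypothetical chunk boundaries, as they might come out of a BAI/TBI index entry.
chunk_start = (1_000_000 << 16) | 1234
chunk_end = (1_050_000 << 16) | 42
print(range_header(chunk_start, chunk_end))
```

Any file server that honors `Range` requests can answer this, which is why no API layer is needed to query a region.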
Disclaimer: I build variant interpretation software for NGS data at Golden Helix. Our customers are often small clinical labs whose data size and volume are not driving them to the cloud.
gabeiscoding | 10 years ago | on: When sequencing makes genotyping obsolete (soon)
Nanopores are nowhere near the throughput and accuracy of Illumina's sequencing by synthesis tech, and if there is a pathway to challenge Illumina's position, it will be extremely complex, iterative and _long_.
Meanwhile, Illumina is amassing a billion-dollar war chest and is adding its own complex and iterative improvements to its platform (two-color detection, longer and longer reads, higher cluster density), maintaining its market lead.
As much as the analogy to microprocessor manufacturing and Moore's law is alluring, the messy stuff of biology and single molecule chemical manipulation and sensor detection is unlikely to obediently follow the same innovation curve.
gabeiscoding | 11 years ago | on: Introducing Sense – A Next-Generation Platform for Data Science
Or do you want more utilities that work together and run locally (i.e., repeatable workflows in project bundles of scripts + data + saved notebook logs, etc.)?
gabeiscoding | 11 years ago | on: What’s Behind the Great Podcast Renaissance?
Sounds fantastic and something I would love to check out.
gabeiscoding | 13 years ago | on: Analyzing my DNA
From my last chat with Brian Naughton (their lead informatics guy) about this, it sounds like they are planning on doing more sequencing in the future. But it could be whole genome and it may be geared more towards research (you're selected based on your phenotype) than open to any customer.
gabeiscoding | 13 years ago | on: A farewell to bioinformatics (2012)
The trick is, academics often have excess manpower capacity in the form of grad students and post-docs. Even though personnel is usually one of the highest expenses on any given grant, they often don't look at ways to improve the efficiency of their research man-hours.
That's not a blanket rule, as we have definitely had success with the value proposition of research efficiency, but in general, a lot of things businesses adopt to improve project time (like Theory of Constraints project management, Mindset/Skillset/Toolset matching of personnel, etc.) are of no interest to academic researchers.
gabeiscoding | 13 years ago | on: A farewell to bioinformatics (2012)
GATK currently has no concept of a "stable" branch of their repo (Appistry is going to provide quarterly releases in the future, which is great).
The flag I am raising is that a "stable" release is needed before it gets integrated into a clinical pipeline. Because the Broad's reputation is so high, it is important to raise this flag, as otherwise researchers and even clinical bioinformaticians assume choosing the latest release of GATK for their black-box variant caller is as safe as an IT manager choosing IBM.
gabeiscoding | 13 years ago | on: A farewell to bioinformatics (2012)
On your second point: 23andMe had every incentive to pay attention to their output, but it is fair to say they bear responsibility for letting this slip through. But it's worth noting, in the context of the OP's rant, that 23andMe probably paid much more attention to their tools than most academics, who often treat alignment and variant calling as a black box that they trust works as advertised.
So what I actually argue in the post (and should have stated more clearly in my summary here) is that GATK is incentivised, as an academic research tool, to quickly advance its set of features, at the cost of bugs being introduced (and hopefully squashed) along the way.
This "dev" state of a tool is inappropriate for a clinical pipeline, and the GATK team's answer to that is a "stable" branch of GATK that will be supported by their commercial software partner. Good stuff.
Finally, I actually have no conflict of interest here, as Golden Helix does not sell commercial secondary analysis tools (like CLC Bio does). I wrote this from the perspective of someone who is a 23andMe consumer and who stays informed because I recommend upstream tools to our users (and, I might add, I would still recommend and use GATK for research use, with the caution to potentially forgo the latest release for a more stable one).
You know though, the conflict of interest dismissal is something I run into more than I would expect. I'm not sure if some commercial software vendor has acted in bad faith in our industry to deserve the cynicism or if it is inherited by default from the "academic" vs. "industry" ethos.
gabeiscoding | 13 years ago | on: A farewell to bioinformatics (2012)
"A Hitchhikers Guide to Next Generation Sequencing"
Part1: http://blog.goldenhelix.com/?p=423
gabeiscoding | 13 years ago | on: A farewell to bioinformatics (2012)
I wrote a post about why GATK, one of the most popular bioinformatics tools in Next Generation Sequencing, should not be put into a clinical pipeline:
http://blog.goldenhelix.com/?p=1534
In terms of your ideal software strategy, I can speak to that as well, as I am actually attempting to do almost exactly what you are suggesting. My team all hold master's degrees in CS & Stats, with a focus on kick-ass CG visualization and UX.
We released a free genome browser (visualization of NGS data and public annotations) that reflects this:
http://www.goldenhelix.com/GenomeBrowse/
But you're right, selling software in this field is a very weird thing. It's almost B2B, but academics are not businesses and their alternative is always to throw more Post-Doc man-power at the problem or slog it out with open source tools (which many do).
That said, we've been building our business (in Montana) over the last 10 years through the GWAS era selling statistical software and are looking optimistically into the era of sequencing having a huge impact on health care.
gabeiscoding | 14 years ago | on: Secure Shell chrome (killer) app
I have a side-by-side rasterized image showing the difference here:
https://github.com/gaberudy/epaper-calendar/blob/main/docs/f...
Linux insists on doing some FreeType font thickening, giving the output a random thick-line look. If anyone knows more tricks to disable this or influence the anti-aliased font rendering behavior, let me know!