geoffjentry's comments

geoffjentry | 2 years ago | on: Nextflow: Data-Driven Computational Pipelines

I feel like Groovy pushes towards the worst of both worlds between an internal DSL and an external DSL. It’s an internal DSL so you get the language, but oh man, Groovy sucks.

geoffjentry | 2 years ago | on: Nextflow: Data-Driven Computational Pipelines

I don’t know. I started this route and then quickly switched to only dipping into nf-core when they had actual prior art.

The interplay of nf and Groovy (how I wish they hadn’t used Groovy!) can be mind bending, but if you’re writing your own thing you have a different optimization model than nf-core, which is trying to be one size fits all.

geoffjentry | 2 years ago | on: Nextflow: Data-Driven Computational Pipelines

The big difference when comparing bioinformatics systems with non-bioinformatics ones is what the typical payload of a DAG node is and what optimizations that indicates. Most other domains don’t have DAG nodes that assume the payload is a crappy command line call that expects inputs/outputs to magically be in specific places on a POSIX file system.

You can do this on other systems but it’s nice to have the headache abstracted away for you.
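A minimal sketch of the kind of headache being abstracted away, assuming a toy runner rather than any real workflow engine: `run_node` and the file names are made up for illustration. It stages inputs where the command expects them, runs the command, and picks the outputs up from where the command left them.

```python
import shutil
import subprocess
import sys
import tempfile
from pathlib import Path


def run_node(command, inputs, expected_outputs):
    """Run one command-line 'DAG node' in a scratch directory (hypothetical helper).

    inputs: mapping of the file name the tool expects -> source Path.
    expected_outputs: file names the tool is expected to write.
    Returns a mapping of output name -> file contents as text.
    """
    with tempfile.TemporaryDirectory() as workdir:
        work = Path(workdir)
        # Stage each input under the exact name the tool looks for.
        for name, src in inputs.items():
            shutil.copy(src, work / name)
        # Run the tool with the scratch directory as its working directory.
        subprocess.run(command, cwd=work, check=True)
        # Collect outputs from the fixed places the tool wrote them to.
        return {name: (work / name).read_text() for name in expected_outputs}


# Stand-in for a command line tool: reads in.txt, writes out.txt in upper case.
tool = [
    sys.executable,
    "-c",
    "from pathlib import Path; "
    "Path('out.txt').write_text(Path('in.txt').read_text().upper())",
]

with tempfile.TemporaryDirectory() as data_dir:
    source = Path(data_dir) / "reads.txt"
    source.write_text("acgt")
    result = run_node(tool, {"in.txt": source}, ["out.txt"])

print(result)
```

Bioinformatics engines do this staging and collection for every node, which is why the node definition can stay close to the raw command line.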

The other major difference is assumption of lifecycle. In most biz domains you don’t have researchers iterating on these things the way you do in bioinf. The newer ML/DS systems do solve this problem better than, say, Airflow.

geoffjentry | 2 years ago | on: Nextflow: Data-Driven Computational Pipelines

While true, that’s a minor distinction when comparing the clusters of bioinformatics workflow systems vs workflow systems aimed at different domains.

geoffjentry | 2 years ago | on: Nextflow: Data-Driven Computational Pipelines

As GP referenced CWL: while NF had appeared first, in terms of the bioinformatics world Nextflow, CWL, Snakemake, and WDL all erupted close enough to each other to be equal-ish. The people were aware of each other but they were all so nascent that it wasn't clear if it was worth joining in or not. At the end of the day these all came from groups trying to scratch particular itches, and not everyone agreed on the right way to scratch.

However all of them were rejections of prior models as well as the workflow solutions prominent in the business space.

geoffjentry | 2 years ago | on: Snakemake – A framework for reproducible data analysis

> are arguably better for people who have a stronger software engineering basis

As someone who is a software developer in the bioinformatics space (as opposed to the other way around) and has spent over 10 years deep in the weeds of both the bioinformatics workflow engines as well as more standard ones like Airflow - I still would reach for a bioinfx engine for that domain.

But - what I find most exciting is a newer class of workflow tools coming out that appear to bridge the gap, e.g. Dagster. From observation it seems like a case of parallel evolution coming out of the ML/etc world where the research side of the house has similar needs. But either way, I could see this space pulling eyeballs away from the traditional bioinformatics workflow world.

geoffjentry | 3 years ago | on: Consider working on genomics

A really hard aspect to this is that there's a massive impedance mismatch between the research & production side of things. Working in the research side is pretty straightforward - although software development practices are going to be a lot looser & faster. Working in a production environment is straightforward, it's like any other software job. But - working at the confluence of those two states is incredibly difficult.

geoffjentry | 3 years ago | on: Consider working on genomics

Keep in mind that there are wetlabs with experiments being conducted in them. Lab techs will be coming and going at all hours.

geoffjentry | 3 years ago | on: Consider working on genomics

> Good luck trying to use a functional-first language, aside from maybe Scala

While they've moved away from it in the last few years, the Broad Institute had a huge investment in Scala. It's been in use there since at least 2010 and I believe longer. The primary software department was almost entirely Scala based for several years. That same department had pockets of Clojure as well.

geoffjentry | 3 years ago | on: Airflow's Problem

It's a mindset shift to a more declarative model. The idea has also popped up in other niche orchestrators.

This is an oversimplification, but IMO the easiest way of picturing it is this: instead of defining your graph as a forward moving thing w/ the orchestrator telling things they can be run, you shift to defining your graph nodes to know their dependencies, and they let the orchestrator know when they're runnable.
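A minimal sketch of that shift, assuming a toy in-memory orchestrator: `Node`, `is_runnable`, and `run_all` are made-up names for illustration, not any real orchestrator's API. Each node carries its own dependencies and reports whether it is runnable, and the orchestrator just keeps asking.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A graph node that knows its own dependencies (hypothetical sketch)."""
    name: str
    depends_on: set = field(default_factory=set)

    def is_runnable(self, done):
        # The node decides for itself: it is runnable once it has not run yet
        # and everything it depends on has finished.
        return self.name not in done and self.depends_on <= done


def run_all(nodes):
    """Toy orchestrator: repeatedly ask the nodes which of them are runnable."""
    done = set()
    order = []
    while len(done) < len(nodes):
        ready = [n for n in nodes if n.is_runnable(done)]
        if not ready:
            raise ValueError("cycle or missing dependency")
        for n in ready:
            order.append(n.name)  # stand-in for actually running the node
            done.add(n.name)
    return order


# Nodes are declared in no particular order; there are no forward edges.
nodes = [
    Node("report", {"clean", "stats"}),
    Node("stats", {"clean"}),
    Node("clean", {"raw"}),
    Node("raw"),
]
print(run_all(nodes))  # ['raw', 'clean', 'stats', 'report']
```

Nothing here wires "raw" forward to "clean"; the execution order falls out of what each node says it needs.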

geoffjentry | 3 years ago | on: VCs are scared when they should be greedy

Not only that, but it wasn't even the largest issue.

People point at pimentoloaf.com or whatever and laugh. But when those companies went under, they took away real dollars from "real" B2B companies. And then when those companies went under, "real" companies who depended on them went under. And so on.

geoffjentry | 3 years ago | on: Rent in NYC without paying broker fees

In Boston we have brokers but it's not this reason (EDIT: See end). The realtors are really only there as the front door to get the lease signed. The landlord does the rest of the landlording. The claim is that the value add the brokers provide is vetting potential tenants - credit checks, etc as well as the time investment of showing the unit.

The standard fee is 1 month's rent and this is almost always paid by the tenant.

The arguments here tend to cluster into two groups: 1) Laws should be passed that the landlord needs to pay the realtor's fee and not the tenant. 2) It doesn't matter if #1 happens as the landlord would just bake it into the rent anyways.

EDIT: I should say it's not just this reason as what you cite does happen. But it's just de rigueur here regardless of if you're renting the upstairs unit of a homeowner or going through a property management company.

geoffjentry | 3 years ago | on: We don't show typing status

I found it often the opposite with ytalk. Yes, some people would just plow through it and make the best of things. And one could already be thinking about a response in real time.

But other people would find the need to correct every typo. And it was painful as crap watching someone with a 5 WPM typing speed and an affinity for typos get through what they were trying to say. And eventually you move past thinking about a response in real time to screaming "JUST STOP TYPING ALREADY!!!!"

geoffjentry | 3 years ago | on: How Airbnb Built “Wall” to prevent data bugs

Is this available for others to use or internal only? I think the answer is the latter as a google search didn't turn anything up and I didn't see anything in the article. But if I'm wrong I'd love to kick the tires a bit.

geoffjentry | 3 years ago | on: Does communication matter in technical interviews? Here's the data

> Well, these results definitely validate that. Once you're topped out in coding ability, better communication only helps!

I think it's close to this, but not quite as it's a sliding scale and not a step function.

From a business perspective, there tend to be diminishing returns as one's coding skills improve. And that's the point where returns on communication skills tend to become more important. So more often than not the higher the level the more important communication skills become over coding skills. This is not saying a high level engineer can be all talk and no ability, but rather a great communicator/good coder might become more valuable than a good communicator/great coder.

Apologies for all the hedging in the above. I'm trying to take into account that there are always exceptions to the above. Some positions need that absolute wizard. It's the exception to the rule, but it exists.

geoffjentry | 3 years ago | on: Give me back my monolith (2019)

> and that OOP was supposed to be about messaging

I agree that these are Kay's thoughts and also agree with his take on what it should be. But I think the reality is more complicated than it simply being something that evolved away from his grand dream. It's more that there was a soup of ideas floating around during that time that came together as OOP and the combination that became dominant was something else. For instance Simula was already using inheritance prior to Kay's message passing proposal.

geoffjentry | 3 years ago | on: Give me back my monolith (2019)

> that suggests we got OOP wrong, very wrong.

It does, but that's because there are different flavors of OOP. Alan Kay's original take on OO was closer to the actor model than what grew into the mainstream spin on OOP with inheritance and the rest.

If you take 10 steps back and squint, microservices & the actor model start to look pretty similar.

geoffjentry | 4 years ago | on: Unit Testing is Overrated (2020)

I've become a fan of using mutation testing to drive unit test coverage. The goal of mutation testing is to ensure there is test coverage around areas with a higher likelihood of being brittle. Work smarter, not harder.
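A hand-rolled sketch of the idea, assuming a made-up `is_adult` function and two made-up test suites. Real mutation testing tools generate the mutants automatically; here one mutant is written by hand to show what "surviving" means.

```python
def is_adult(age):
    """Code under test (hypothetical example)."""
    return age >= 18


def is_adult_mutant(age):
    # Mutant: the ">=" boundary has been changed to ">".
    return age > 18


def weak_suite(fn):
    # Only checks values far away from the boundary.
    return fn(30) is True and fn(5) is False


def strong_suite(fn):
    # Adds the boundary case the weak suite is missing.
    return weak_suite(fn) and fn(18) is True


def survives(suite, mutant):
    """A mutant survives when the test suite still passes against it."""
    return suite(mutant)


# Both suites pass against the real code.
print(weak_suite(is_adult), strong_suite(is_adult))
# The weak suite lets the mutant through; the strong suite kills it.
print(survives(weak_suite, is_adult_mutant))    # True
print(survives(strong_suite, is_adult_mutant))  # False
```

The surviving mutant points at the spot worth one more test (the boundary at 18), instead of adding tests everywhere.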

geoffjentry | 4 years ago | on: Collaborate with kindness: Etiquette tips in Slack

Right. The complaint isn't people including a greeting ("hey", "hi", "hello"). It's only giving the greeting.

"Hi - remind me what time we're meeting" is fine. "Hi" is not :)

geoffjentry | 4 years ago | on: I no longer grade my students’ work, and I wish I had stopped sooner

Universities aren't trade schools. You go there to learn, not to earn a job.