ims's comments

ims | 3 years ago | on: Goodbye, data science

This guy is a straight-shooter with upper-management written all over him.

ims | 3 years ago | on: Robert's Rules of Order (1876)

Technical people love parliamentary procedure because it notionally resolves messy human deliberation into a linear call stack with system interrupts.

The important thing to understand is that the rules are mainly for exception handling and are borderline irrelevant on the golden path. Most of the time, committees don't even think about the rules because everyone understands motions, seconding, and voting. Groups often operate in de facto ‘suspension of the rules’ and just talk through issues semi-formally until it’s time to take a vote. That’s actually the optimal outcome in most settings.

The true test of the rules is when disagreements arise about the form of debate rather than subject matter. Sometimes there is a legitimate procedural question but often this comes up when the apparent minority decides to start maneuvering because they believe they are going to lose. In the real world, this tends to play out in one of two ways depending on context:

1. This is a highly professional body with a parliamentarian at the meeting (or at least somebody plausible like a general counsel) who can call the balls and strikes, or the chair is—at least in principle—considered competent to rule by enough people present. A ruling is made and the body moves on.

2. This is an amateur body (which includes most government bodies below the state/province level and the vast majority of private committees/panels/boards), in which case people will resolve the issue as humans usually do. Namely, either the meeting will fall apart and be unable to conduct business or the most influential or aggressive parties will win regardless of what the rules say.

"But the body can just resolve everything properly by reading the rules!" -- well, theoretically.

But think back to the last time you played one of those byzantine German board games for the first time. Now imagine that nobody at the table really cares about board games or is used to reading game rules. Further imagine that some parties are willing to defect from the spirit of the rules in order to raise esoteric legal and procedural objections, waste time, and filibuster outcomes they don’t want.

Real meetings have time limits, and while the U.S. Senate might stay up past midnight occasionally, regular people who have to wake up for work in the morning and who are giving up family time for a thankless volunteer position generally will not tolerate taking 5 hours to unwind the call stack in a hostile proceeding. So again, the loudest and most assertive parties tend to wear everyone else down. In that case the rules are at best useful for establishing in the record that procedure was not followed, which is only really useful if the issue can be escalated to the courts, appealed to a higher body, or revisited in a subsequent session.

Others in the thread have suggested simplified rulesets, and I’ll recommend Rosenberg’s Rules of Order, which was designed by an experienced judge specifically for smaller meetings. But the truth is that almost any set of rules will work for amateur bodies if parties operate in good faith, and almost no set of rules will work if they don’t.

ims | 4 years ago | on: Ask HN: Who is hiring? (November 2021)

DrivenData Labs | Data Scientist and Senior Data Scientist | Berkeley, CA / Boston, MA / Denver, CO | REMOTE | Full-time

We run online machine learning challenges with social/scientific impact, and we work directly with mission-driven organizations on all sorts of interesting data science consulting projects. Since 2014 we’ve worked with more than 50 organizations in areas like international development, health, education, research and conservation, and public services.

We pride ourselves on being a great place to work and to learn. We take the development of our team members very seriously and we value the priorities that we each have in our lives at work and outside of work. We help each other develop clean, well-organized, well-documented code in service of correct and reproducible data science.

Our team writes and speaks often about reproducible data science and data ethics -- you may recognize our Cookiecutter Data Science project (https://drivendata.github.io/cookiecutter-data-science/) or the Deon data ethics checklist (https://deon.drivendata.org/).

We're looking for more great people in Boston, the Bay Area, or any of the states we currently operate in. Feel free to reach out with any questions: [email protected]

Positions: https://drivendata.workable.com/

ims | 4 years ago | on: Ask HN: Literature for mathematical optimization?

Sounds like you're looking more for optimization theory, but if you want a gentle introduction to applications with approachable math and lots of examples, I highly recommend "Operations Research: Applications and Algorithms (4E)" by Wayne Winston. It's a solid undergrad level text covering basic linear optimization, mixed integer linear programs, and non-linear optimization.

ims | 5 years ago | on: macOS to FreeBSD migration a.k.a. why I left macOS

I think the user vs. product dichotomy is not right in this case. Microsoft really does make its money on products and support. You can see this on their public filings.

The relevant dichotomy is more like: people who run Windows aren't the buyers. One-off personal licenses for home PCs are more than a rounding error but are certainly not what made Microsoft what it is.

Governments and F500 companies buy Windows and Office for X00,000 machines for X0 years of support at a time. Enterprise procurement teams are the actual buyers whose opinions matter to product managers.

ims | 5 years ago | on: Ask HN: Who is hiring? (December 2020)

DrivenData Labs | Data Scientist and Software Engineer | Berkeley, CA / Boston, MA / Denver, CO | REMOTE currently, ONSITE likely | Full-time

We run online machine learning challenges with social impact, and we work directly with mission-driven organizations to drive change through data science and engineering. Since 2014 we’ve worked with more than 35 organizations across 50+ projects in areas like international development, health, education, research and conservation, and public services.

We pride ourselves on being a great place to work and to learn. We take the development of our team members very seriously and we value the priorities that we each have in our lives at work and outside of work. We like to tackle problems that matter as a team. We help each other develop clean, well-organized, well-documented code in service of correct and reproducible data science. We believe the work we do should positively impact people’s lives.

Our team writes and speaks often about reproducible data science and data ethics -- you may recognize our Cookiecutter Data Science project (https://drivendata.github.io/cookiecutter-data-science/) or the Deon data ethics checklist (https://deon.drivendata.org/).

Ultimately, we're a team of smart, passionate data scientists and engineers interested in doing good work for good reason. We're looking for more great people in Boston or the Bay Area. We're excited to hear from you!

Positions: https://drivendata.workable.com/

ims | 5 years ago | on: What's wrong with social science and how to fix it

Yes, the primary specification was a linear probability model for the likelihood of a binary dependent variable conditioned on two binary input variables. As far as I could tell, the fit was max likelihood without regularization and the paper's bombshell conclusion was based on the regression coefficients' p-values.
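For anyone unfamiliar with the term, here is a rough sketch of what a linear probability model on two binary inputs looks like. The data below is made up for illustration (it is not the study's data), the function name `fit_lpm` is hypothetical, and I'm assuming ordinary least squares as the fitting method, which coincides with maximum likelihood only under Gaussian errors.

```python
import numpy as np

def fit_lpm(x1, x2, y):
    """Fit y ~ b0 + b1*x1 + b2*x2 by ordinary least squares.

    With a binary y, the fitted values are read as probabilities,
    which is what makes this a "linear probability model".
    """
    X = np.column_stack([np.ones(len(y)), x1, x2])
    coef, *_ = np.linalg.lstsq(X, np.asarray(y, dtype=float), rcond=None)
    return coef

# Made-up data: 100 observations in each of the four (x1, x2) cells,
# with 10, 30, 40, and 60 positive outcomes respectively.
n = 100
cells = [(0, 0, 10), (0, 1, 30), (1, 0, 40), (1, 1, 60)]
x1 = np.concatenate([np.full(n, a) for a, _, _ in cells])
x2 = np.concatenate([np.full(n, b) for _, b, _ in cells])
y = np.concatenate([np.r_[np.ones(s), np.zeros(n - s)] for _, _, s in cells])

coef = fit_lpm(x1, x2, y)
print(coef)  # intercept, x1 effect, x2 effect: approximately [0.1, 0.3, 0.2]
```

Nothing in the fit keeps the predicted "probabilities" inside [0, 1], and the usual OLS p-values assume constant error variance, which a binary outcome violates by construction. That's part of why hanging a headline result on those p-values is shaky.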

The Stata thing was just one of many, many red flags.

ims | 5 years ago | on: What's wrong with social science and how to fix it

There were some stunning claims being made on Twitter last month based on a recently published study. Instantly skeptical, I dug into the methodology section and found this gem:

"It should be noted that the results cannot be estimated using a physician fixed effect due to a numeric overflow problem in Stata 15 which cannot be overcome without changing the assumptions of the logit model."

... The sad part was they didn't even choose a reasonable model in the first place.

ims | 5 years ago | on: CEO of Uber: Gig Workers Deserve Better

Lower compensation than private sector is not specific to the military. It is true of most government positions.

Typical offsetting factors that rational agents weigh include pensions, education subsidies and other benefits, job stability, and perceived upward mobility.

ims | 5 years ago | on: Best Data Science Books According to the Experts

> See the comment above: “I'm not even sure what to recommend for developing good software judgment and habits.“. It’s like a chess coach admonishing their subject to simply “think harder”. Not helpful.

Hey, it seems like you took this as gatekeeping or something. These skills can definitely be taught or self-learned, I've done it and seen it done many times.

My point was only that I don't know resources that can act as a shortcut (my actual word above), i.e. ways to skip over the longer path of gaining experience through long engagement with the topic. So maybe more like a chess coach saying they don't know any books that let a beginner jump ahead to being a more experienced player?

There are hundreds of past threads on HN about books to level up in software, so clearly some people have thoughts about this. I just don't know what to recommend a data scientist who needs these skills immediately.

ims | 5 years ago | on: Best Data Science Books According to the Experts

Seconding this comment. Based on experience in hiring data scientists and comparing notes with many others that hire data scientists, the most frequent gaps in knowledge are (1) statistics specifically and scientific computing in general and (2) disciplined software engineering.

People good at (1) and bad at (2) write "PhD code" that may or may not be right but you can't tell because it's too disorganized. People good at (2) but bad at (1) get fine-ish looking numbers out of their good looking code but you can't tell whether it's right because they may have ignored or misunderstood fundamental assumptions and correctness of the underlying methods.

There are also seemingly tens of thousands of people on the market who have little experience in either but have adapted projects from examples online into their GitHub portfolio and put all of the relevant terms into their resume anyway.

I think most aspiring data scientists would be better served going with more introductory texts and really understanding them. Maybe Blitzstein and Hwang's "Introduction to Probability" and then McElreath's "Statistical Rethinking" or Wasserman's "All of Statistics" for people who need more stats.

I'm not even sure what to recommend for developing good software judgment and habits. There doesn't seem to be a shortcut for that. Maybe "Fluent Python" or "Effective Python" for Python people? No idea for the R ecosystem.

ims | 5 years ago | on: Forgotten Best Sellers

I randomly picked de Hartog's "The Captain" off the shelf while wandering through the stacks at my local library last year. It was a wonderful read, and the experience was a valuable reminder that serendipity can still be found in an age of Amazon shopping lists and infinite scroll ebook readers.

Mechanically popping the next book off of my self-assigned queue rarely inspires the same sense of reading purely for curiosity or pleasure.

ims | 6 years ago | on: Data Science: Reality Doesn't Meet Expectations

The standard whatever.fit(X, y) isn't very appealing but there are much more bespoke models that require creative engagement with stats/CS knowledge, e.g. Bayesian hierarchical models or deep learning models that are more complicated than what can be copy/pasted from Medium.