bnprks's comments

bnprks | 1 month ago | on: Ask HN: Distributed SQL engine for ultra-wide tables

With genomics, your data is probably write ~once, almost entirely numeric, and is most likely used for single-client offline analysis. This differs a lot from what most SQL databases are optimizing for.

My best experience has been ignoring SQL and using (sparse) matrix formats for the genomic data itself, possibly combined with some small metadata tables that can fit easily in existing solutions (often even in memory). Sparse matrix formats like CSC/CSR can store numeric data at ~12 bytes per non-zero entry, so a single one of your servers should handle 10B data points in RAM and another 10x that comfortably on a local SSD. Maybe no need to pay the cost of going distributed?
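A back-of-envelope sketch of that storage math (assuming float64 values with int32 indices, which is roughly where the ~12 bytes per non-zero figure comes from; the column count is just illustrative):

```python
# Estimate CSC/CSR memory footprint: ~12 bytes per non-zero entry
# (8-byte float64 value + 4-byte int32 index), plus a small pointer
# array with one offset per column (or row).
def csc_bytes(nnz: int, ncols: int) -> int:
    values = 8 * nnz              # float64 values
    indices = 4 * nnz             # int32 row indices
    col_ptrs = 8 * (ncols + 1)    # int64 offsets into values/indices
    return values + indices + col_ptrs

# 10B non-zero entries across, say, 60k columns:
gb = csc_bytes(10_000_000_000, 60_000) / 1e9
print(round(gb))  # ~120 GB -- fits in RAM on a single large server
```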

Self plug: if you're in the single cell space, I wrote a paper on my project BPCells which has some storage format benchmarks up to a 60k column, 44M row RNA-seq matrix.

bnprks | 1 year ago | on: Everything I know about the fast inverse square root algorithm

Amusingly (to me at least), there's also an SSE instruction for non-reciprocal square roots, but it's so much slower than the reciprocal square root estimate that calculating sqrt(x) as x * 1/sqrt(x) is faster, assuming you can tolerate the somewhat reduced precision.
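The same trick can be sketched in scalar Python, with the classic bit-hack initial guess plus one Newton step standing in for the hardware rsqrt estimate (the magic constant and the accuracy here are illustrative, not what SSE actually delivers):

```python
import struct

def fast_rsqrt(x: float) -> float:
    # 32-bit magic-constant initial guess (the famous 0x5f3759df trick),
    # refined with one Newton-Raphson iteration.
    i = struct.unpack('<I', struct.pack('<f', x))[0]
    i = 0x5f3759df - (i >> 1)
    y = struct.unpack('<f', struct.pack('<I', i))[0]
    return y * (1.5 - 0.5 * x * y * y)  # one Newton step

def fast_sqrt(x: float) -> float:
    # sqrt(x) recovered as x * (1/sqrt(x)), same idea as using rsqrt + multiply
    return x * fast_rsqrt(x)

print(fast_sqrt(2.0))  # close to 1.41421..., within ~0.2%
```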

bnprks | 1 year ago | on: Energy-Efficient Llama 2 Inference on FPGAs via High Level Synthesis

Yeah, I think DRAM is almost certainly the future, just in terms of being able to afford the memory capacity to fit large models. Even Cerebras using a full wafer only gets up to 44 GB of SRAM on a chip (at a cost of over $2M).

An interesting twist is that this DRAM might not need to be a central pool where bandwidth must be shared globally -- e.g. the Tenstorrent strategy seems to be aiming for smaller chips that each have their own memory. Splitting up memory should yield very high aggregate bandwidth even with slower DRAM, which is great as long as they can figure out the cross-chip data flow to avoid networking bottlenecks.

bnprks | 1 year ago | on: Energy-Efficient Llama 2 Inference on FPGAs via High Level Synthesis

Seems like the claims of the abstract for speed and energy-efficiency relative to an RTX 3090 are when the GPU is using a batch size of 1. I wonder if someone with more experience can comment on how much throughput gain is possible on a GPU by increasing batch size without severely harming latency (and what the power consumption change might be).

And from a hardware cost perspective the AWS f1.2xlarge instances they used are $1.65/hr on-demand, vs say $1.29/hr for an A100 from Lambda Labs. A very interesting line of thinking to use FPGAs, but I'm not sure if this is really describing a viable competitor to GPUs even for inference-only scenarios.

bnprks | 1 year ago | on: Deaths at a California skydiving center, but the jumps go on

The article also gives reason to be skeptical of the quoted "10 fatalities out of an estimated 3.65 million jumps in 2023". If we count 28 known fatalities at this one facility from 1983 to 2021, we get around 0.75 fatalities per year.

In other words, about 14 facilities with death counts similar to the one in the article would account for the total US fatalities in a year. The USPA dropzone locator [1] lists 142 facilities, so taking everything at face value, this facility is ~10x worse than the average USPA member.
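The back-of-envelope math behind those numbers (counts and years as quoted above):

```python
# Rough check of the fatality-rate comparison
facility_deaths = 28
years = 2021 - 1983                  # ~38 years of records
per_year = facility_deaths / years   # ~0.74 deaths per year
us_total = 10                        # quoted total US fatalities for 2023
equivalent = us_total / per_year     # facilities at this rate to match US total
uspa_facilities = 142
print(round(equivalent), round(uspa_facilities / equivalent, 1))
# -> 14 facilities, ~10.5x the average
```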

> But I'd bet it's less than $200/jump worth of risk

In this case at least, it seems that this specific facility is higher risk than that. And with a lack of legally mandated reporting requirements, I'd say the onus is on a facility to prove safety once it's averaging a death every 1.3 years.

[1]: https://www.uspa.org/dzlocator?pagesize=16&Country=US

bnprks | 1 year ago | on: Oxide Cloud Computer. No Cables. No Assembly. Just Cloud

Are you sure you're comparing equivalent memory and storage specs? I needed to go into the customization menus in the Dell configurator to spec something equivalent, where prices started going up quite rapidly.

For example "3.2TB Enterprise NVMe Mixed Use AG Drive U.2 Gen4 with carrier" is $3,301.65 each, and you'd need 10 of those to match the Oxide storage spec -- already above the $30k total price you quoted. Similarly, "128GB LRDIMM, 3200MT/s, Quad Rank" was $3,384.79 each, and you'd need 8 of those to reach the 1TiB of memory per server Oxide provides.

With just the RAM and SSD cost quoted by Dell, I get to $60k per server (x16 = $960k), which isn't counting CPU, power, or networking.
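Spelling out that arithmetic with Dell's quoted per-part prices:

```python
ssd_cost = 10 * 3301.65   # 10x 3.2TB NVMe drives to match Oxide's storage
ram_cost = 8 * 3384.79    # 8x 128GB LRDIMMs for 1TiB per server
per_server = ssd_cost + ram_cost
print(round(per_server), round(per_server * 16))
# ~$60k per server, ~$960k for a 16-server rack (before CPU, power, networking)
```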

I agree these costs are way way way higher than what I'd expect for consumer RAM or SSD, but I think if Oxide is charging in line with Dell they should be asking at least $1MM for that hardware. (At least compared to Dell's list prices -- I don't purchase enterprise hardware either so I don't know how much discounting is typical)

Edit: the specific Dell server model I was working off of for configuration was called "PowerEdge R6515 Rack Server", since it was one of the few I found that allowed selecting the exact same AMD EPYC CPU model that Oxide uses [1]

[1]: https://www.dell.com/en-us/shop/dell-poweredge-servers/power...

bnprks | 2 years ago | on: Array Languages: R vs. APL (2023)

If the function chooses to overwrite the value of a variable binding, it doesn't matter how it is defined at the call site (so inner x wins in your example). In the tidyverse libraries, they often populate a lazy list variable (think python dictionary) that allows disambiguating in the case of name conflicts between the call site and programmatic bindings. But that's fully a library convention and not solved by the language.

bnprks | 2 years ago | on: Array Languages: R vs. APL (2023)

The good news is that most variables in R are immutable with copy-on-write semantics. Therefore, most of the time everything here will be side-effect-free and any weird editing of the variable bindings is confined to within the function. (The cases that would have side effects are very uncommonly used in my experience)

bnprks | 2 years ago | on: Array Languages: R vs. APL (2023)

One of the wildest R features I know of comes as a result of lazy argument evaluation combined with the ability to programmatically modify the set of variable bindings. This means that functions can define local variables that are usable by their arguments (i.e. `f(x+1)` can use a value of `x` that is provided from within `f` when evaluating `x+1`). This is used extensively in practice in the dplyr, ggplot, and other tidyverse libraries.
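R's lazy promises don't translate directly to most languages, but a crude Python analogue (passing the argument as a string and evaluating it against bindings the function itself supplies; the names here are made up for illustration) conveys the idea:

```python
# The function, not the caller, supplies the binding for `x` -- a rough
# stand-in for R's ability to evaluate a lazily-passed argument like
# f(x + 1) in an environment the function constructs.
def f(expr: str):
    bindings = {"x": 41}  # local variable made visible to the argument
    return eval(expr, {}, bindings)

print(f("x + 1"))  # 42, even though the caller never defined x
```

In R the caller writes `f(x + 1)` directly (no quoting needed), because the unevaluated expression is captured as a promise and can be forced in whatever environment `f` chooses.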

I think software engineers often get turned off by the weird idiosyncrasies of R, but there are surprisingly unique (arguably helpful) language features most people don't notice. Possibly because most of the learning material is data-science focused and so it doesn't emphasize the bonkers language features that R has.

bnprks | 2 years ago | on: 8 years later: A world Go champion's reflections on AlphaGo

To my knowledge AlphaGo models never became meaningfully available to the public, but 8 years later the KataGo project has open source, superhuman Go AI models freely available and under ongoing development [1]. The open source projects that developed in the wake of AlphaGo and AlphaZero are a huge success story in my mind.

I haven't played Go in a while, but I'm kind of excited to try going back to use the KataGo-based analysis/training tools that exist now.

[1]: https://github.com/lightvector/KataGo

bnprks | 2 years ago | on: What the Gardasil Testing May Have Missed (2017)

I'm sorry about negative experiences and/or regrets other commenters might have about their vaccinations. Measuring the risk/reward profile of vaccines seems far from simple, particularly in cases like this where the large benefits (no cancer) and risks (autoimmune problems) may both be quite rare for any individual. It is too bad if the study didn't fully capture possible risks in this case, and hopefully follow-up studies and monitoring can help better describe the risk profile.

It's worth noting the benefits of HPV vaccination do seem to be quite real, though. In the US, >20% of the female population has a high-risk HPV infection [1], and cervical cancer runs at ~12k new cases and ~4k deaths a year [2]. A follow-up study found women vaccinated before age 17 had about an 88% reduction in cervical cancer, versus around 53% for women vaccinated at 17-30 years of age [3] (presumably later-vaccinated women had a higher chance of already having an HPV infection, so the vaccine wouldn't be useful).

I think potentially saving >3.5k lives and >10k cervical cancer cases annually in the US is a pretty good return if we can get widespread HPV vaccination, though of course we should also work hard to study and minimize vaccine side-effects. I'm similarly hopeful of news about EBV as a cause of multiple sclerosis [4], which is another situation where preventing a widespread infection might prevent rare but serious illnesses.
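For what it's worth, those lives/cases estimates are just the ~88% reduction applied to the current US numbers:

```python
deaths_per_year = 4_000    # ~US cervical cancer deaths annually
cases_per_year = 12_000    # ~US new cases annually
reduction = 0.88           # observed for women vaccinated before age 17
print(round(deaths_per_year * reduction), round(cases_per_year * reduction))
# -> 3520 10560 : roughly >3.5k deaths and >10k cases potentially averted
```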

[1] https://www.cdc.gov/nchs/products/databriefs/db280.htm

[2] https://gis.cdc.gov/Cancer/USCS/#/Trends/1,2,73,1,3,value,23

[3] https://www.cancer.gov/news-events/cancer-currents-blog/2020...

[4] https://www.hsph.harvard.edu/news/press-releases/epstein-bar...

bnprks | 2 years ago | on: Learning From DNA: a grand challenge in biology

Strong second for wishing they had tried physically testing some model output. A "model that makes outputs AlphaFold thinks look like Cas" is very different from a "model that makes functional Cas variants".

For design tasks like in this paper, I think computational models have a big hill to climb in order to compete with physical high-throughput screening. Most of the time the goal is to get a small number of hits (<10) out of a pool of millions of candidates. At those levels, you need to work in the >99.9% precision regime to have any hope of finding significant hits after multiple-hypothesis correction. I don't think they showed anything near that accurate in the paper.

Maybe we'll get there eventually, but the high-throughput techniques in molecular biology are also getting better at the same time.

bnprks | 2 years ago | on: How do computers calculate sine?

Sadly, even SSE vs. AVX is enough to often give different results, as SSE doesn't support fused multiply-add instructions, which calculate a*b + c with a single correctly-rounded result. FMA-capable x86 CPUs have been standard since roughly 2013, but gcc/clang don't enable AVX/FMA by default for x86-64 targets. And even if they did, results are only guaranteed identical if implementations have chosen the exact same polynomial approximation method and no compiler optimizations alter the instruction sequence.

Unfortunately, floating point results will probably continue to differ across platforms for the foreseeable future.

bnprks | 2 years ago | on: Free data transfer out to internet when moving out of AWS

I hope this is just the start of egress fee changes in response to the European Data Act. Taking a look at other parts of the law's text [1], I think this change relates to Article 25, but Article 34 section 2 is what really stands out to me:

> Where a data processing service is being used in parallel with another data processing service, the providers of data processing services may impose data egress charges, but only for the purpose of passing on egress costs incurred, without exceeding such costs.

Hopefully this article doesn't end up with exploitable loopholes. Bringing AWS, GCP, and Azure egress costs down to market rates could be a major help in reducing cloud lock-in, since you could migrate workloads gradually without having to close your entire account.

[1]: https://eur-lex.europa.eu/legal-content/EN/TXT/HTML/?uri=OJ:...

bnprks | 2 years ago | on: French court issues damages award for violation of GPL

Yeah, the crux of the issue would definitely be whether use of an API is prohibited by default under copyright law for a country (i.e. does using a library make something a derivative work of the library). In the US, at least, the Google v Oracle case makes me think this is, at worst, fair use (in many contexts) and, at best, too functional to be covered by copyright in the first place.

Though I can certainly imagine that a multinational company might not be confident of the copyright status of API usage in all countries they operate in.

bnprks | 2 years ago | on: Mass Retraction of unethical Chinese Forensic Genetics Papers

I think it is known that the Chinese state is making genetic databases of these minority populations and is taking genetic samples without consent. It's also known that the Chinese state is committing human rights violations against these minorities. I have not personally read reporting about specific ways the genetic samples have been used in fact, though.

In a US lab, for instance, I would expect that a similar genetic study would have hard-copy signed consent forms from every participant in the study with 3-6 year retention requirements, and this could be audited by their institution's IRB if there were concerns. (Ultimately I think there is an accountability chain all the way to the US federal government, though I'm not familiar with how the institutional IRBs are monitored). I don't know what equivalent institutions might exist in China, and whether the journals got/requested any verification of consents from them.

Though for papers like these where co-authors have affiliations with police departments or academies, I'm not sure how trustworthy it would be even if the police did claim they had evidence of consent for the data in these papers. (Given that Chinese police are known to be collecting genetic samples without consent in some documented cases.)

bnprks | 2 years ago | on: French court issues damages award for violation of GPL

Looking closer at the GPL, it seems like most requirements only kick in once you "convey" GPL-covered code. If you make your users get the GPL component themselves from a 3rd party (e.g. PyPI or another package repository), then you might be okay. I'd be curious for input from others, but it seems like the following flow avoids GPL virality by never "conveying" the GPL-covered code to the end user:

1. You give your user a non-GPL python package with requirements.txt file (no bundled dependencies)

2. Your user pip-installs the dependencies (including some GPL-licensed ones)

3. Your user runs the application

As long as your country doesn't consider use of an API prohibited under the copyright of the implementing code, I think steps 1-3 would be fine (though not very practical for a product).

I'd be curious for others' input, though, as this has bugged me for a while in the R community, where several core libraries (like the Matrix package) are GPL-licensed, but many packages that depend on them claim to be licensed under MIT or some other license.

bnprks | 2 years ago | on: Mass Retraction of unethical Chinese Forensic Genetics Papers

I think this 2019 article from the NY Times gives a reasonable introduction [1]. In short, the concern is that China is developing genetic databases as part of its state surveillance and repression of the Uighur people in Xinjiang. So the idea might be, for example, that if the Chinese state can obtain the DNA of a dissident they can identify family members to threaten or harass.

The research in question is directly related to finding and cataloging genetic markers that could be used in such a surveillance database. And with no way to credibly verify that the genetic samples were given with full consent, it seems probable that the studies themselves were part of this project to create an Orwellian surveillance state for certain minorities in China. Needless to say, western journals would prefer to not be accomplices to these human rights abuses, hence the retractions.

[1]: https://www.nytimes.com/2019/02/21/business/china-xinjiang-u...
