syntonym2 | 4 months ago | on: Async and Finaliser Deadlocks
syntonym2 | 7 months ago | on: Fossjobs: A job board for Free and Open Source jobs
syntonym2 | 7 months ago | on: Show HN: Python file streaming 237MB/s on $8/M droplet in 507 lines of stdlib
A streaming workload is not inherently expensive. The main work is to get the bytes of the files to the network card as quickly as possible; almost no computation needs to be performed.
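To illustrate how little userspace work this can take (a sketch, not the post's code): the kernel's sendfile path moves file bytes to a socket without copying them through userspace, and Python exposes it via `socket.sendfile`, which falls back to a read/send loop where `os.sendfile` is unavailable.

```python
import socket
import tempfile

def stream_file(sock: socket.socket, path: str) -> None:
    """Hand the file to the kernel: socket.sendfile uses os.sendfile
    (zero-copy) where available and falls back to read/send otherwise."""
    with open(path, "rb") as f:
        sock.sendfile(f)

# Tiny demo over a local socket pair (hypothetical usage, not from the post).
a, b = socket.socketpair()
with tempfile.NamedTemporaryFile() as tf:
    tf.write(b"x" * 10_000)
    tf.flush()
    stream_file(a, tf.name)
a.shutdown(socket.SHUT_WR)
received = b""
while chunk := b.recv(65536):
    received += chunk
```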
> The cost of generating boundaries and constructing headers scales with request count and payload size.
The only computation necessary to generate a boundary is to ensure that the chosen boundary does not occur in the content, and the code does not appear to actually check this; it just generates a random UUID4. Boundaries and headers are per-file and could be cached, so they don't scale with request count or payload size.
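A minimal sketch of both points (these helper names are mine, not from the linked code): generate a UUID4 boundary, verify it does not occur in the content as the multipart spec requires, and cache the result per file so the cost is paid once rather than per request.

```python
import uuid

def make_boundary(content: bytes, max_tries: int = 10) -> str:
    """Pick a multipart boundary and verify it does not occur in the body.
    A UUID4 collision is astronomically unlikely, but the check is cheap."""
    for _ in range(max_tries):
        boundary = uuid.uuid4().hex
        if boundary.encode("ascii") not in content:
            return boundary
    raise RuntimeError("could not find a safe boundary")

# Boundaries/headers are per-file, so compute them once and cache:
_boundary_cache: dict[str, str] = {}

def boundary_for(path: str, content: bytes) -> str:
    if path not in _boundary_cache:
        _boundary_cache[path] = make_boundary(content)
    return _boundary_cache[path]
```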
syntonym2 | 2 years ago | on: Logitech Partners with iFixit for Self-Repairs
[1] https://zmk.dev/
syntonym2 | 4 years ago | on: A Python-based programming language for high-performance computational genomics
syntonym2 | 5 years ago | on: Show HN: Mongita is to MongoDB as SQLite is to SQL
syntonym2 | 5 years ago | on: Test for lists in Cython
I tried to run the benchmark on my own computer, but the setup documentation was not sufficient for me to get the Julia integration running. I haven't used Julia before, so it might just be something very simple.
Similarly, I haven't used Poetry much before, and the given documentation failed to install the necessary setuptools-rust for me. I could fix it on my own, but it doesn't make me feel confident about the outcome of the benchmark.
The Rust benchmark did not reproduce for me: "Rust (Pyo3) parallel after" shows a 1.56x speedup for me, but a 2.6x slowdown for the author. I also don't understand what the difference between "after" and "before" is; the code just calls the same function twice. It might be a JIT/cache thing, but it's unclear to me. One sentence explaining what before/after refers to would be very helpful.
Generally, all measurements are done only once. Measuring at least three times gives one a chance to detect outliers and makes basic statistics possible, e.g.: is the difference between the different Cython annotations even meaningful?
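A repeated measurement needn't be elaborate; the stdlib already supports it (an illustrative harness, not the post's setup):

```python
import statistics
import timeit

def bench(stmt: str, repeats: int = 5, number: int = 1000):
    """Run the statement `repeats` times so outliers are visible and
    simple statistics (best, mean, spread) can be reported."""
    times = timeit.repeat(stmt, repeat=repeats, number=number)
    return min(times), statistics.mean(times), statistics.stdev(times)

best, mean, spread = bench("sum(range(100))")
# A difference between two variants smaller than `spread` is probably noise.
```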
The "C Cython (pure-python mode)" variant is reported as faster than "C Cython (.pyx)". The Cython project itself says that using .pyx files should be faster, so something strange is going on.
"Cython is fast, but none of these methods are able to release the GIL." (A) This is not true. (B) The benchmark seems to be mostly about single-threaded performance, so why is releasing the GIL meaningful here?
"Rust is not that fast because it needs to copy data; using Pyo3 objects would probably lead to similar results as Cython, but with an added library." The Rust code already uses Pyo3, so no "added library" is necessary, as far as I understand.
I'd guess the performance differences stem more from conversions between different types than from anything else. Maybe Julia (and the Python-Julia bridge) is particularly smart about this, which makes it super easy to use, while Pyo3 (and Cython) need some more work to interface with Python. Even if that is true, I couldn't tell from the presented data.
With these caveats resolved I'd be interested in the benchmark, but as it stands I can't really conclude anything from it.
syntonym2 | 5 years ago | on: Terminal Multiplexers
syntonym2 | 5 years ago | on: Minimal safe Bash script template
[0] https://unix.stackexchange.com/a/24808/317276
syntonym2 | 5 years ago | on: Hypercore protocol: a distributed (P2P) append-only log
The hyper* world seems to be very fragmented right now. There is the dat project [0], which started in 2013 and shares files between computers p2p. In May 2020 the dat protocol was renamed to the hypercore protocol [1], and dat "will continue as the collective of teams and projects that have evolved from the original Dat CLI project." hypercore-protocol [2] links to multiple applications for file sharing (none of them the dat CLI tool).
Hyperdrive [3] "help[s] you share files quickly and safely, directly from your computer [...] -- all P2P". The GitHub repo [4] mentions the hyperdrive daemon as a batteries-included experience, but the hyperdrive daemon has a deprecation notice and tells you to use Hyperspace.
Hyperspace [5] "provides remote access to Hypercores and a Hyperswarm instance" and "exposes a simple RPC interface". The documentation is very technical and seems to be aimed at developers, not at people who want to use it as a tool.
Digging around in the GitHub organisation [6], or by stumbling upon the Patreon [7], one can find the hyp CLI tool, which is "A CLI for peer-to-peer file sharing (and more) using the Hypercore Protocol". Its first commit is 9 days old. On the author's Twitter one can also find hyperbeam [8], which is integrated into hyp.
Here on HN one can also find an announcement for "uplink" [9].
The tech looks pretty cool, but the vast number of different projects makes it difficult to grasp. Of all these tools, the dat tool seems to be the most advanced, but it is not actively maintained. It's not linked on any of the hyper* sites and doesn't seem to be the recommended way to use the hypercore protocol to share files p2p. While I would like to use the tech, I'm pretty lost on how to do that today.
[0] https://docs.datproject.org/
[1] https://blog.datproject.org/2020/05/15/dat-protocol-renamed-hypercore-protocol/
[2] https://hypercore-protocol.org/
[3] https://hypercore-protocol.org/#hyperdrive
[4] https://github.com/hypercore-protocol/hyperdrive
[5] https://github.com/hypercore-protocol/hyperspace
[6] https://github.com/hypercore-protocol/cli
[7] https://www.patreon.com/posts/hyp-command-line-44923749
[8] https://github.com/mafintosh/hyperbeam
[9] https://www.patreon.com/posts/paul-reveals-is-44665610
syntonym2 | 5 years ago | on: R adds native pipe and lambda syntax
syntonym2 | 5 years ago | on: Arch Conf 2020
syntonym2 | 5 years ago | on: YouTube bans Stefan Molyneux, David Duke, Richard Spencer for hate speech
[0] https://www.reddit.com/r/BlackPeopleTwitter/comments/b93w1j/...
[1] https://www.reddit.com/r/BlackPeopleTwitter/comments/gumxuy/...
syntonym2 | 5 years ago | on: A Statistical Analysis of Coughing Patterns on ‘Who Wants to Be a Millionaire?’
The second argument goes through each of the questions, but doesn't really show any evidence for cheating. Question 8 has no suspicious coughs (coughs close to the correct answer), and Ingram knows the answer quickly. Question 9 contains one suspicious cough, but only after the second mention of the correct answer. Again Ingram is quite sure about this answer and does not consider any other answer.
Question 10 is the first question (of that day) Ingram struggles with. There are 7 coughs recorded in 5 cough "clusters" (coughs very close to each other); 2 of the clusters could be suspicious. The first potentially suspicious cough cluster is pretty far away from the answers, while the second one is close to the correct answer. The text notes that these coughs didn't come from the phantom cougher but from Ingram's wife. On the other hand, there are lots of better places to insert a cough to cheat. This is one of the few questions where he changes his mind.
Question 11 has five coughs in total and three suspicious coughs, but coughs 2-5 are very close together (and very late). Ingram is focused on the correct answer from the beginning, and does not really consider any other answer.
Question 12 has one cough after an incorrect answer at the very beginning. There is one cough cluster later on after the correct answer, but the correct answer is said 5 times, whereas the only other answer is mentioned once and considered for only a few seconds. The other coughs are relatively far away from any answer.
Question 13 has a cough after the correct answer at the very beginning, but it is not called out as a significant cough in the linked YouTube video from WWTBAM. Only the correct answer is considered.
In question 14 Ingram struggles similarly to question 10. There are 12 coughs: 6 are heard shortly after a wrong answer (coughs 1, 2, 3, 4, 5 and 9), and 4 shortly after the right answer (coughs 6, 10, 11, 12). The text mentions a muttered "no", but any muttering loud enough to inform the contestant would surely be heard by the host and the crew. If they were cheating using coughs, why mutter so obviously?
Question 15 has 23 coughs; 7 are close to wrong answers and 10 are close to the right answer. Ingram speaks the correct answer 11 times and lists the incorrect answers 4 times. Most of the suspicious coughing happens at the very end, when he has basically made his decision. Most of the coughing close to the incorrect answers happens early. If coughing was used, why did he not lock in one of the incorrect answers? The text mentions Ingram muttering "I think I know..." and calls that suspicious, but that phrase seems a very natural choice here.
The questions, coughing and answering don't follow a pattern: sometimes there is a cough very early after the correct answer and Ingram picks that option; sometimes there is a cough very early after an incorrect answer and Ingram doesn't pick that option. Sometimes for unclear questions there are multiple coughs after the correct option and Ingram picks it; sometimes there are multiple coughs after an incorrect option and Ingram does not pick it. On some unclear questions there is no coughing at all, but Ingram still guesses correctly. On some clear questions there is coughing, although Ingram did not even consider any wrong answer.
Next the blog post compares the distributions of elapsed time since the last answer for correct and incorrect answers. It does not use a statistical test (e.g. a Kolmogorov-Smirnov test), but just counts coughs up to some threshold. The resolution is very rough, and the subsampling for that plot is only done once. While the simulations show that there are more coughs after a correct answer than expected, there are also more coughs after an incorrect answer than expected. The simulations disregard that for most incorrect answers there is not a lot of time until the next answer is spoken; indeed, the incorrect answers are often simply listed and immediately disregarded.
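For comparing two empirical distributions like these, the two-sample Kolmogorov-Smirnov statistic is just the largest vertical gap between the two empirical CDFs. A minimal pure-Python sketch (not the post's analysis; in practice one would use scipy.stats.ks_2samp, which also reports a p-value):

```python
import bisect

def ks_statistic(xs, ys):
    """Two-sample Kolmogorov-Smirnov statistic: the maximum distance
    between the empirical CDFs of the two samples."""
    xs, ys = sorted(xs), sorted(ys)
    gap = 0.0
    for v in sorted(set(xs) | set(ys)):
        cdf_x = bisect.bisect_right(xs, v) / len(xs)
        cdf_y = bisect.bisect_right(ys, v) / len(ys)
        gap = max(gap, abs(cdf_x - cdf_y))
    return gap
```

Identical samples give a statistic of 0; completely disjoint samples give 1.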
While the presented data might call for some scrutiny, it is far from justifying the damning tone.
syntonym2 | 6 years ago | on: How coffee became a modern necessity
syntonym2 | 6 years ago | on: Holes in Bayesian Statistics
> The second challenge that the uncertainty principle poses for Bayesian statistics is that [...] we routinely treat the act of measurement as a direct application of conditional probability.
Furthermore it states that this problem might also arise for other applications of Bayesian statistics:
> If classical probability theory needs to be generalized to apply to quantum mechanics, then it makes us wonder if it should be generalized for applications in political science, economics, psychometrics, astronomy, and so forth. It’s not clear if there are any practical uses to this idea in statistics, outside of quantum physics. For example, would it make sense to use “two-slit-type” models in psychometrics, to capture the idea that asking one question affects the response to others?