phissenschaft | 2 years ago | on: Emacs-copilot: Large language model code completion for Emacs
phissenschaft's comments
phissenschaft | 3 years ago | on: 916 Days of Emacs
phissenschaft | 3 years ago | on: Spack – scientific software package manager for supercomputers, Linux, and macOS
phissenschaft | 3 years ago | on: I tricked myself into working out 3 times per week
phissenschaft | 3 years ago | on: Launch HN: Sematic (YC S22) – Open-source framework to build ML pipelines faster
Here are some high-level questions:
- How does it handle failure of individual tasks in the pipeline?
- What if the underlying jobs (e.g. training, dataset extraction, or metrics evaluation) need to run outside the k8s cluster (e.g. on bare metal, Slurm, SageMaker, or even a separate k8s cluster)?
- How does caching work if multiple pipelines share some common components (e.g. dataset extraction)?
phissenschaft | 4 years ago | on: Ray: A Distributed Framework for Emerging AI Applications
phissenschaft | 4 years ago | on: Ray: A Distributed Framework for Emerging AI Applications
There are a few nice features I wish Ray would eventually get to.
On the user experience side, it would be nice to have task-level logs: it's often easier for users to reason at the task level, especially when the task is a facade that triggers other complicated library/subprocess calls.
For the scheduler, it would be nice to have more native support for sharded/bundled/partitioned tasks and https://cloud.google.com/blog/products/gcp/no-shard-left-beh...
phissenschaft | 4 years ago | on: Cue, an open-source data validation language
phissenschaft | 4 years ago | on: Using Argo to Train Predictive Models at FlightAware
phissenschaft | 4 years ago | on: PatchELF: Simple utility for modifying existing ELF executables and libraries
phissenschaft | 4 years ago | on: Apache Arrow 4.0
phissenschaft | 5 years ago | on: On Repl-Driven Programming
With this mode (https://github.com/nnicandro/emacs-jupyter), one can connect to a Jupyter kernel running locally or remotely (I would mostly prefer SSH port forwarding or kubectl port-forward to reach the remote Jupyter server). It makes it much easier to interact with a cloud environment (e.g. Spark).
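As a rough sketch of the two forwarding options mentioned above, the commands could be assembled like this. The host name, pod name, namespace, and port 8888 are placeholders I made up for illustration; only the `ssh -N -L` and `kubectl port-forward` invocations themselves are standard.

```python
import shlex


def ssh_forward_cmd(host, local_port=8888, remote_port=8888):
    """Build an ssh command forwarding local_port to remote_port on host.

    -N: do not run a remote command, just forward.
    -L: local port forwarding (local_port -> localhost:remote_port on host).
    """
    return ["ssh", "-N", "-L", f"{local_port}:localhost:{remote_port}", host]


def kubectl_forward_cmd(pod, local_port=8888, remote_port=8888, namespace="default"):
    """Build a kubectl command forwarding local_port to remote_port on a pod."""
    return [
        "kubectl", "port-forward",
        "-n", namespace,
        f"pod/{pod}",
        f"{local_port}:{remote_port}",
    ]


if __name__ == "__main__":
    # Hypothetical host and pod names, for illustration only.
    print(shlex.join(ssh_forward_cmd("user@example-host")))
    print(shlex.join(kubectl_forward_cmd("jupyter-0")))
```

Once either command is running, the remote Jupyter server should be reachable at localhost on the chosen local port, which is where the Emacs client would be pointed.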
phissenschaft | 5 years ago | on: Talking out loud to yourself is a technology for thinking
phissenschaft | 5 years ago | on: Interactive C++ for Data Science
phissenschaft | 5 years ago | on: Controversies and Challenges in fMRI (2018)
phissenschaft | 5 years ago | on: What Is Nix?
phissenschaft | 6 years ago | on: Configs suck? Try a real programming language
""" ... embrace the inevitability of programmatic configuration, and maintain a clean separation between computation and data. The language to represent the data should be a simple, data-only format such as JSON or YAML, and programmatic modification of this data should be done in a real programming language, where there are well-understood semantics, as well as good tooling ... """
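To make the quoted idea concrete, here is a minimal sketch, assuming a made-up service config: the data stays in plain JSON, and the per-environment variation lives in ordinary Python. The field names (`service`, `replicas`, `resources`) and the `render` helper are hypothetical, not taken from the article.

```python
import copy
import json

# Data-only base config: no logic, just values. (Hypothetical fields.)
BASE_CONFIG_JSON = """
{
  "service": "example",
  "replicas": 1,
  "resources": {"cpu": "500m", "memory": "256Mi"}
}
"""


def render(base, env):
    """Return a new config for env; the base config is left untouched."""
    cfg = copy.deepcopy(base)
    cfg["env"] = env
    if env == "prod":
        # Computation lives here, in a real language with real tooling.
        cfg["replicas"] = 3
        cfg["resources"]["memory"] = "1Gi"
    return cfg


if __name__ == "__main__":
    base = json.loads(BASE_CONFIG_JSON)
    for env in ("dev", "prod"):
        # The output is again plain JSON, so consumers never see any logic.
        print(json.dumps(render(base, env), indent=2, sort_keys=True))
```

The point of the split is that the rendered output is still data-only, so it can be diffed, validated, and reviewed without running any code.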