phissenschaft's comments

phissenschaft | 2 years ago | on: Emacs-copilot: Large language model code completion for Emacs

I use Emacs for most of my work related to coding and technical writing. I've been running phind-v2-codellama and openhermes using ollama and gptel, as well as GitHub's Copilot. I like how you can send an arbitrary region to an LLM and ask for things about it. Of course the UX is at an early stage, but just imagine if a foundation model could take in all the context (i.e. your orgmode files and open file buffers) and use tools like LSP.
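To make "send an arbitrary region to an LLM" concrete, here is a minimal sketch in Python of what such a request could look like against a local Ollama server. It is not how gptel works internally. The endpoint is Ollama's default local `/api/generate` route, while the model name `openhermes`, the prompt layout, and the helper names `build_request` and `ask` are assumptions made for illustration.

```python
import json
import urllib.request

# Ollama's default local endpoint; adjust if your server listens elsewhere.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(region, instruction, model="openhermes"):
    """Build a POST request asking `model` about a region of text.

    `region` stands in for the text of the selected editor region.
    The model name and prompt layout are illustrative assumptions.
    """
    prompt = f"{instruction}\n\n{region}"
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(region, instruction, model="openhermes"):
    """Send the request; needs a running Ollama server with the model pulled."""
    with urllib.request.urlopen(build_request(region, instruction, model)) as resp:
        return json.loads(resp.read())["response"]
```

With a server running, `ask("def f(): pass", "Explain this code")` would return the model's reply as a string.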

phissenschaft | 3 years ago | on: 916 Days of Emacs

I just stopped worrying and succumbed to https://github.com/emacs-evil/evil. Now I mostly just fiddle with orgmode configs to generate nice looking HTML and PDFs.

phissenschaft | 3 years ago | on: Spack – scientific software package manager for supercomputers, Linux, and macOS

Been using Spack for a while to manage my machine learning package dependencies. It allows me to quickly spin up projects with complex dependencies (my current environment has 329 packages built ...). It's pretty easy to use with containers. It allows me to evaluate and migrate to different PyTorch/CUDA versions easily.

phissenschaft | 3 years ago | on: I tricked myself into working out 3 times per week

Really happy for the author! I also find gamification and vanity help. Strava/Peloton are good places to keep scores and brag about what you've done.

phissenschaft | 3 years ago | on: Launch HN: Sematic (YC S22) – Open-source framework to build ML pipelines faster

Congratulations on the launch! Best wishes! Would absolutely love to dive into it soon.

Here are some high level questions:

- How does it handle failure of individual tasks in the pipeline?
- What if the underlying jobs (e.g. training or dataset extraction or metrics evaluation) need to run outside the k8s cluster (e.g. running bare-metal, slurm, sagemaker, or even a separate k8s cluster)?
- How does caching work if multiple pipelines can share some common components (e.g. dataset extraction)?

phissenschaft | 4 years ago | on: Ray: A Distributed Framework for Emerging AI Applications

Is there a plan to integrate more tightly with k8s, potentially in a multi-cluster/federated setting? It's a lot easier to get buy-in for Ray adoption from infra teams where k8s is the centralized compute substrate.

phissenschaft | 4 years ago | on: Ray: A Distributed Framework for Emerging AI Applications

Great work and kudos to the Ray team! It's definitely a fresh look with a lot of lessons learned from previous generations (e.g. spark).

There are a few nice features I wish Ray would eventually get to.

On the user experience side, it would be nice to have task-level logs: oftentimes it's easier for users to reason at the task level, especially when the task is a facade that triggers other complicated library/subprocess calls.
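As a rough illustration of what task-level logs could mean, here is a plain-Python sketch (not Ray's API) that stamps every log record with the name of the task that was running when it was emitted, including records coming from library code the task calls. The `task` decorator, `TaskNameFilter`, and the stand-in `library_call` are all hypothetical names.

```python
import contextvars
import functools
import io
import logging

# Name of the task currently executing ("-" when outside any task).
_current_task = contextvars.ContextVar("current_task", default="-")


class TaskNameFilter(logging.Filter):
    """Attach the active task name to each record passing through."""

    def filter(self, record):
        record.task = _current_task.get()
        return True


def task(func):
    """Hypothetical decorator: mark `func` as a task for logging purposes."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        token = _current_task.set(func.__name__)
        try:
            return func(*args, **kwargs)
        finally:
            _current_task.reset(token)

    return wrapper


def library_call():
    # Stands in for a third-party library that logs on its own logger.
    logging.getLogger("some.library").info("doing heavy work")


@task
def extract_dataset():
    # The task is a facade over the library call.
    library_call()


def demo():
    """Run one call inside a task and one outside; return the log lines."""
    stream = io.StringIO()
    handler = logging.StreamHandler(stream)
    handler.addFilter(TaskNameFilter())
    handler.setFormatter(
        logging.Formatter("[task=%(task)s] %(name)s: %(message)s")
    )
    root = logging.getLogger()
    old_level = root.level
    root.addHandler(handler)
    root.setLevel(logging.INFO)
    try:
        extract_dataset()
        library_call()
    finally:
        root.removeHandler(handler)
        root.setLevel(old_level)
    return stream.getvalue().splitlines()
```

The library's log line is attributed to `extract_dataset` when called from inside the task and to `-` otherwise, which is the kind of grouping that makes task-level reasoning easier.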

For the scheduler, if there's more native support for sharded/bundled/partitioned tasks and https://cloud.google.com/blog/products/gcp/no-shard-left-beh...

phissenschaft | 4 years ago | on: Cue, an open-source data validation language

Wonder if https://hydra.cc/ would be a better choice

phissenschaft | 4 years ago | on: Using Argo to Train Predictive Models at FlightAware

Really like the flexibility provided by Argo. It's the missing workflow concept from Kubernetes.

phissenschaft | 4 years ago | on: PatchELF: Simple utility for modifying existing ELF executables and libraries

Super useful for updating rpath!

phissenschaft | 4 years ago | on: Apache Arrow 4.0

Great to see Ballista in arrow https://github.com/apache/arrow/pull/9723

phissenschaft | 5 years ago | on: On Repl-Driven Programming

My concept of a "REPL" is mostly defined by emacs. You would have a buffer with a code file, with an active jupyter kernel with the correct dependencies loaded in it. Then one would send any active region with `C-c C-c` and get timely feedback.

With this mode https://github.com/nnicandro/emacs-jupyter one can connect to a jupyter kernel running locally or remotely (I would mostly prefer SSH port forwarding or kubectl port-forward to reach the remote jupyter server). It makes life so much easier when interacting with cloud environments (e.g. spark).

phissenschaft | 5 years ago | on: Talking out loud to yourself is a technology for thinking

Very helpful to have a conversation with oneself. I'd sometimes try different accents to create an illusion of someone else talking.

phissenschaft | 5 years ago | on: Interactive C++ for Data Science

I find that being able to directly evaluate code snippets by attaching to a jupyter kernel (running locally or in the cloud) is one of the most important development efficiency boosters.

phissenschaft | 5 years ago | on: Controversies and Challenges in fMRI (2018)

Using fMRI to study brain functions is akin to studying US road transportation with only a CO2 emission map.

phissenschaft | 5 years ago | on: What Is Nix?

Wondering if folks would prefer spack (https://spack.io/) given its heavy adoption in the HPC community.

phissenschaft | 6 years ago | on: Configs suck? Try a real programming language

https://queue.acm.org/detail.cfm?id=2898444

""" ... embrace the inevitability of programmatic configuration, and maintain a clean separation between computation and data. The language to represent the data should be a simple, data-only format such as JSON or YAML, and programmatic modification of this data should be done in a real programming language, where there are well-understood semantics, as well as good tooling ... """
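A minimal sketch of that separation, in Python: the data lives in plain JSON, and a small program derives per-environment variants from it and writes plain JSON back out. The config keys, the environment names, and the `derive` helper are made up for illustration.

```python
import copy
import json

# The data: a simple, data-only base config (illustrative keys).
BASE_CONFIG_JSON = """
{
  "service": "web",
  "replicas": 2,
  "env": {"LOG_LEVEL": "info"}
}
"""


def derive(base, **overrides):
    """Return a copy of `base` with overrides applied.

    Dict-valued overrides are merged one level deep; everything else
    replaces the existing value. The base config is never mutated.
    """
    cfg = copy.deepcopy(base)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(cfg.get(key), dict):
            cfg[key].update(value)
        else:
            cfg[key] = value
    return cfg


def render_all():
    """The computation: build each variant, then emit data-only JSON."""
    base = json.loads(BASE_CONFIG_JSON)
    variants = {
        "staging": derive(base, replicas=1, env={"LOG_LEVEL": "debug"}),
        "prod": derive(base, replicas=6),
    }
    return {
        name: json.dumps(cfg, indent=2, sort_keys=True)
        for name, cfg in variants.items()
    }
```

The rendered output stays inert data that any tool can read, while the logic that produces it gets ordinary language semantics, tests, and tooling.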