phissenschaft's comments

phissenschaft | 2 years ago | on: Emacs-copilot: Large language model code completion for Emacs

I use Emacs for most of my work related to coding and technical writing. I've been running phind-v2-codellama and openhermes using ollama and gptel, as well as GitHub Copilot. I like how you can send an arbitrary region to an LLM and ask questions about it. Of course the UX is at an early stage, but just imagine if a foundation model could take in all the context (i.e. your Org-mode files and open file buffers) and use tools like LSP.

phissenschaft | 3 years ago | on: Launch HN: Sematic (YC S22) – Open-source framework to build ML pipelines faster

Congratulations on the launch! Best wishes! Would absolutely love to dive into it soon.

Here are some high level questions:

- How does it handle failure of individual tasks in the pipeline?
- What if the underlying jobs (e.g. training, dataset extraction, or metrics evaluation) need to run outside the k8s cluster (e.g. on bare metal, Slurm, SageMaker, or even a separate k8s cluster)?
- How does caching work if multiple pipelines share some common components (e.g. dataset extraction)?

phissenschaft | 4 years ago | on: Ray: A Distributed Framework for Emerging AI Applications

Great work and kudos to the Ray team! It's definitely a fresh look with a lot of lessons learned from previous generations (e.g. Spark).

There are a few nice features I wish Ray would eventually get to.

On the user experience side, it would be nice to have task-level logs: oftentimes it's easier for users to reason at the task level, especially when the task is a facade that triggers other complicated library/subprocess calls.
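
To make "task-level logs" concrete, here is a minimal sketch using only Python's standard `logging` module. This is not Ray's API; `make_task_logger`, `run_task`, and the task names are made up for illustration. The idea is just that every record carries its task id and lands in a per-task sink.

```python
import io
import logging


def make_task_logger(task_id: str, stream) -> logging.LoggerAdapter:
    """Build a logger whose records are tagged with task_id (hypothetical helper)."""
    logger = logging.getLogger(f"tasks.{task_id}")
    logger.setLevel(logging.INFO)
    logger.propagate = False  # keep task output out of the root logger
    handler = logging.StreamHandler(stream)
    handler.setFormatter(
        logging.Formatter("[task=%(task_id)s] %(levelname)s %(message)s")
    )
    logger.addHandler(handler)
    # LoggerAdapter injects task_id into every record as an extra field.
    return logging.LoggerAdapter(logger, {"task_id": task_id})


def run_task(log: logging.LoggerAdapter) -> None:
    """Stand-in for a facade task that would call into libraries or subprocesses."""
    log.info("starting")
    log.info("finished")


# One in-memory buffer per task; a real system might use one file per task.
buffers = {}
for tid in ("extract", "train"):
    buffers[tid] = io.StringIO()
    run_task(make_task_logger(tid, buffers[tid]))

print(buffers["train"].getvalue())
```

With this layout, reading the log of one task never requires filtering out the output of the others.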

For the scheduler, if there's more native support for sharded/bundled/partitioned tasks and https://cloud.google.com/blog/products/gcp/no-shard-left-beh...

phissenschaft | 5 years ago | on: On Repl-Driven Programming

My concept of a "REPL" is mostly defined by Emacs. You would have a buffer with a code file, plus an active Jupyter kernel with the correct dependencies loaded in it. Then you would send any active region with `C-c C-c` and get timely feedback.

With this mode https://github.com/nnicandro/emacs-jupyter one can connect to a Jupyter kernel running locally or remotely (I would mostly prefer SSH port forwarding or kubectl port-forward to the remote Jupyter server). It makes life so much easier when interacting with cloud environments (e.g. Spark).

phissenschaft | 5 years ago | on: Interactive C++ for Data Science

I find that being able to directly evaluate code snippets by attaching to a Jupyter kernel (running locally or in the cloud) is one of the most important development efficiency boosters.

phissenschaft | 6 years ago | on: Configs suck? Try a real programming language

https://queue.acm.org/detail.cfm?id=2898444

""" ... embrace the inevitability of programmatic configuration, and maintain a clean separation between computation and data. The language to represent the data should be a simple, data-only format such as JSON or YAML, and programmatic modification of this data should be done in a real programming language, where there are well-understood semantics, as well as good tooling ... """

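A rough sketch of what that separation can look like, assuming a made-up service config (the keys, the `for_environment` helper, and the "prod" rules are all hypothetical, not from the article): the base config stays data-only JSON, and all the logic lives in ordinary Python that reads it and emits plain JSON again.

```python
import copy
import json

# Data only: no conditionals, loops, or templating inside the config itself.
BASE_CONFIG_JSON = """
{
  "service": "web",
  "replicas": 2,
  "env": {"LOG_LEVEL": "info"}
}
"""


def for_environment(base: dict, env_name: str) -> dict:
    """Derive a per-environment config from base without mutating it."""
    cfg = copy.deepcopy(base)
    if env_name == "prod":
        # Computation lives here, in a real language with real tooling.
        cfg["replicas"] = base["replicas"] * 3
        cfg["env"]["LOG_LEVEL"] = "warning"
    cfg["env"]["ENV_NAME"] = env_name
    return cfg


base = json.loads(BASE_CONFIG_JSON)
prod = for_environment(base, "prod")

# The output is again plain data that any consumer can parse.
rendered = json.dumps(prod, indent=2, sort_keys=True)
print(rendered)
```

The function can be unit tested, linted, and type-checked like any other code, while the thing that gets deployed is still a dumb JSON document.
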