imaurer's comments

imaurer | 1 year ago | on: Prelude – a tiny CLI tool building context prompts from your code

Have a bunch of Makefile targets (pbcopy-api, pbcopy-ui, pbcopy-curr) that use some mishmash of git ls-files, grep, and xargs tail -n +1 piped into pbcopy.

Kitchen sink command: pbcopy-all: git ls-files | xargs tail -n +1 | pbcopy

Works like a charm in Q2 2024.

I’m sure this will be a very solved problem by 2025.
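For anyone without pbcopy or make, here is a rough Python sketch of what the pbcopy-all pipeline produces. It is an approximation, not the actual Makefile: the function names are made up for illustration, and the `==> path <==` header only imitates what `tail -n +1` prints when it is handed several files.

```python
import subprocess
from pathlib import Path


def format_files(files):
    """Join (path, text) pairs under '==> path <==' headers, imitating the
    headers `tail -n +1` prints when it is given more than one file."""
    parts = []
    for path, text in files:
        parts.append(f"==> {path} <==\n{text}")
    return "\n".join(parts)


def tracked_files(repo="."):
    """Return the paths that `git ls-files` prints for the given repo."""
    out = subprocess.run(
        ["git", "ls-files"], cwd=repo, capture_output=True, text=True, check=True
    )
    return out.stdout.splitlines()


def build_prompt(repo="."):
    """Read every tracked text file and format it for pasting into a chat."""
    pairs = []
    for name in tracked_files(repo):
        try:
            pairs.append((name, (Path(repo) / name).read_text(encoding="utf-8")))
        except (UnicodeDecodeError, OSError):
            continue  # skip binary or unreadable files
    return format_files(pairs)
```

Printing `build_prompt()` and piping the result into whatever clipboard tool you have gets close to the same outcome.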

imaurer | 1 year ago | on: The one about the web developer job market

“Finding effective documentation, information, and training is likely to get harder, especially in specialised topics where LLMs are even less effective than normal.”

Who needs documentation with Claude and pbcopy?

imaurer | 2 years ago | on: Inversion: Fast, Reliable Structured LLMs

Currently, LLMs are not state of the art at Named Entity Recognition. They are slower, more expensive, and less accurate than a fine-tuned BERT model.

However, they are way easier to get started with using in-context learning. Soon they will be cheaper and probably fast enough too, so training your own model will be a waste of time for 95% of use cases (probably more, because LLMs will unlock use cases that wouldn’t break even with the old NLP approaches).

This is why I am tracking LLM structured outputs here:

https://github.com/imaurer/awesome-llm-json

I also created an autocorrecting Pydantic library that could be used for named entity linking:

https://github.com/genomoncology/FuzzTypes
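To give a feel for what "autocorrecting" means here, a minimal sketch using only the standard library's difflib. This is not the FuzzTypes API; the `autocorrect` function, the `CANONICAL` list, and the 0.6 cutoff are all assumptions made up for the example.

```python
from difflib import get_close_matches

# Hypothetical vocabulary of canonical entity names (illustration only).
CANONICAL = ["BRCA1", "BRCA2", "EGFR", "KRAS", "TP53"]


def autocorrect(value, choices=CANONICAL, cutoff=0.6):
    """Map a messy string to its closest canonical name.

    Raises ValueError when no choice is similar enough, so bad input
    fails loudly instead of being linked to the wrong entity.
    """
    lookup = {c.lower(): c for c in choices}  # case-insensitive matching
    matches = get_close_matches(
        value.strip().lower(), list(lookup), n=1, cutoff=cutoff
    )
    if not matches:
        raise ValueError(f"no match for {value!r}")
    return lookup[matches[0]]
```

With this sketch, `autocorrect("TP-53")` returns `"TP53"`, while a string that resembles nothing in the list raises. The same function could sit behind a Pydantic validator so that LLM output is snapped to known entities at parse time.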

imaurer | 2 years ago | on: What is a Vector Database? (2021)

I am bullish on pgvector because I am a “Postgres for everything” guy.

My current concerns are scaling and recall performance.

The author is looking at product quantization along with other ideas: https://github.com/pgvector/pgvector/issues/27

More details on product quantization: https://mccormickml.com/2017/10/13/product-quantizer-tutoria...
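For readers who have not met product quantization before, a toy pure-Python sketch of the idea: split each vector into m slices, learn a small k-means codebook per slice, and store only the centroid indices. It says nothing about how pgvector does or would implement it; the function names, the fixed seed, and the naive k-means are assumptions chosen to keep the example short.

```python
import random


def _dist2(a, b):
    """Squared Euclidean distance between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def _kmeans(points, k, iters=10, seed=0):
    """Naive k-means; centroids start as k distinct sampled points."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    for _ in range(iters):
        buckets = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: _dist2(p, centroids[j]))
            buckets[nearest].append(p)
        for j, bucket in enumerate(buckets):
            if bucket:  # keep the old centroid if a cluster is empty
                centroids[j] = [sum(col) / len(bucket) for col in zip(*bucket)]
    return centroids


def train(vectors, m, k):
    """Learn one k-centroid codebook for each of the m sub-vector slices."""
    width = len(vectors[0]) // m
    return [
        _kmeans([v[i * width:(i + 1) * width] for v in vectors], k)
        for i in range(m)
    ]


def encode(vector, codebooks):
    """Replace each slice with the index of its nearest centroid."""
    width = len(vector) // len(codebooks)
    return [
        min(
            range(len(cb)),
            key=lambda j: _dist2(vector[i * width:(i + 1) * width], cb[j]),
        )
        for i, cb in enumerate(codebooks)
    ]


def decode(codes, codebooks):
    """Rebuild an approximate vector by concatenating the chosen centroids."""
    out = []
    for code, cb in zip(codes, codebooks):
        out.extend(cb[code])
    return out
```

The storage win is that each vector shrinks to m small integers, at the cost of some reconstruction error, which is where the recall concern comes from.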

A nice repo that tracks the ANN relative performance of different indexes: https://mccormickml.com/2017/10/13/product-quantizer-tutoria...

Also a shoutout to Weaviate because they have great docs, are open source, and have a very informative YouTube channel.

https://weaviate.io/

imaurer | 2 years ago | on: Collection of LLM resources that can be used to build products you can “own”

If Google Docs were the only way most people wrote text, then I think your analogy would indeed be apt. In this case, nearly all people using Large Language Models are doing so through a web page (ChatGPT) or an API.

That's the inspiration behind the name, open for something better. Considered "Edge" as well, but was concerned that would seem IoT/mobile specific.
