qigtyofhp's comments

qigtyofhp | 1 year ago | on: The First Hedge Fund

The story was interesting, but the title is misleading: this wasn't the first hedge fund. Benjamin Graham started his first fund in the 1920s, and it would be called a hedge fund today. Graham's fund might not have been the first hedge fund either, but it came before Jones'.

qigtyofhp | 3 years ago | on: Understanding and coding the self-attention mechanism of large language models

Attention was first used as an add-on to recurrent neural networks. The Transformer started as a simple trick to drop the recurrence and rely on attention alone.

Read the introduction of the original Transformer paper, "Attention Is All You Need": https://arxiv.org/abs/1706.03762

This video explains the drawbacks of RNNs and how transformers address them: https://youtu.be/S27pHKBEp30?t=394

Andrej Karpathy explains attention here: https://youtu.be/kCc8FmEb1nY?t=3719

He explains how attention is seen as a communication network: https://youtu.be/kCc8FmEb1nY?t=4298
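
For anyone who prefers code to video, here is a minimal NumPy sketch of single-head scaled dot-product self-attention. It is an illustration under stated assumptions, not code from the paper or from either video: the function names, the random weights and the tiny dimensions are all made up for the example. Each row of `weights` shows how much one token "listens" to the others, which is one way to read the communication view.

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the row max for numerical stability before exponentiating.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(x, w_q, w_k, w_v, causal=False):
    """Single-head scaled dot-product self-attention (illustrative helper).

    x: (tokens, d_model); w_q, w_k, w_v: (d_model, d_head).
    """
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    # Every token scores every other token; scale by sqrt(d_head).
    scores = q @ k.T / np.sqrt(k.shape[-1])
    if causal:
        # Hide future positions so token i only attends to tokens 0..i.
        mask = np.triu(np.ones_like(scores, dtype=bool), k=1)
        scores = np.where(mask, -np.inf, scores)
    weights = softmax(scores, axis=-1)  # each row sums to 1
    return weights @ v, weights

# Hypothetical sizes and random weights, only to make the sketch runnable.
rng = np.random.default_rng(0)
tokens, d_model, d_head = 4, 8, 8
x = rng.normal(size=(tokens, d_model))
w_q, w_k, w_v = (rng.normal(size=(d_model, d_head)) for _ in range(3))

out, weights = self_attention(x, w_q, w_k, w_v, causal=True)
print(out.shape)
print(weights.round(2))
```

Unlike an RNN, nothing here loops over time steps: all token pairs are scored in one matrix product, which is why this parallelizes well on a GPU.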

qigtyofhp | 3 years ago | on: Storing OpenAI embeddings in Postgres with pgvector

The sentence-transformers library will load transformer models from Hugging Face (these models have a similar architecture to OpenAI's embedding models): https://www.sbert.net/docs/pretrained_models.html

Redis has approximate nearest-neighbor vector similarity search: https://redis-py.readthedocs.io/en/stable/examples/search_ve...

Generate the embeddings on a rented GPU, push them to Redis, then run a similarity search. Store the vectors in Redis as raw bytes using ndarray.tobytes().
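
A rough sketch of the serialization step, assuming float32 vectors of a fixed dimension. The helper names, the 384 dimension, the key name and the random vector are made up for illustration, and a plain dict stands in for the Redis connection so the snippet runs without a server. The important detail is that tobytes() drops the dtype and shape, so the reader has to know both to decode the blob.

```python
import numpy as np

DIM = 384  # hypothetical embedding size; use whatever your model outputs

def to_bytes(vec):
    # Force float32 so every stored blob has the same byte layout.
    return np.asarray(vec, dtype=np.float32).tobytes()

def from_bytes(blob, dim=DIM):
    # frombuffer needs the dtype that was used when writing.
    vec = np.frombuffer(blob, dtype=np.float32)
    if vec.shape[0] != dim:
        raise ValueError(f"expected {dim} floats, got {vec.shape[0]}")
    return vec

# Random stand-in for a real embedding.
rng = np.random.default_rng(0)
embedding = rng.normal(size=DIM).astype(np.float32)

store = {}  # stand-in for a Redis client; not a real connection
store["doc:1"] = to_bytes(embedding)

restored = from_bytes(store["doc:1"])
print(len(store["doc:1"]), np.array_equal(restored, embedding))
```

The array returned by frombuffer is read-only, so copy it first if you need to modify it.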

qigtyofhp | 3 years ago | on: Storing OpenAI embeddings in Postgres with pgvector

This is an interesting feature, but you don't have to use OpenAI's embeddings. You can generate your own with open-source SOTA transformer models, which would probably work about as well. You could generate a couple hundred thousand embeddings on a rented A100 for less than 2 dollars. The point of converting text or other objects (like images) into embeddings is to compare a large number of documents against a source document very quickly, so it's more useful to keep the embeddings in something like Redis. The pgvector data type would be good for an offline backup of the vectors.
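
To make the "compare many documents to a source document" step concrete, here is a brute-force cosine similarity sketch in NumPy. It is an exact search, not the approximate nearest-neighbor index a vector store would use, and the function name, sizes and random data are assumptions for the example.

```python
import numpy as np

def top_k_cosine(query, docs, k=3):
    """Return indices and scores of the k documents most similar to query."""
    # Normalize so a dot product equals cosine similarity.
    q = query / np.linalg.norm(query)
    d = docs / np.linalg.norm(docs, axis=1, keepdims=True)
    sims = d @ q                    # one matrix-vector product for all docs
    order = np.argsort(-sims)[:k]   # highest similarity first
    return order, sims[order]

# Random stand-ins for real document embeddings.
rng = np.random.default_rng(0)
docs = rng.normal(size=(1000, 64)).astype(np.float32)
# Build a query that is a slightly noisy copy of document 42.
query = docs[42] + 0.01 * rng.normal(size=64).astype(np.float32)

idx, scores = top_k_cosine(query, docs, k=3)
print(idx, scores.round(3))
```

For a few hundred thousand vectors this kind of exact scan is often fast enough on its own. An approximate index mostly starts to matter as the collection grows.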