szesiongteo's comments

szesiongteo | 2 years ago | on: Demo: Code generation and editing via streaming speech recognition

I created this Proof of Concept (PoC) and video demonstrate how prompt routing can be used to make ChatGPT dynamically select the targeted prompt, achieving better results. The concept involves categorizing the initial prompt using ChatGPT and then selecting a more specific prompt based on the tag to resend targeted prompt to ChatGPT. This PoC showcases the efficiency we can gain when using speech recognition and AI tools in development. I will release the code-assist tool on GitHub once I have more free time to clean up the code. If we have a set of well-tested prompts for different types of specific tasks, I believe this tooling approach can help us increase our productivity by skipping a few steps in our repetitious daily work tasks than just code generation. Editing code, issuing commands (with safety checks), and automated scope-down online search are the next steps to multiply work productivity.

Video Demo: Code generation and editing via streaming speech recognition https://www.youtube.com/watch?v=gJXAFffxtIs

szesiongteo | 2 years ago | on: Ask HN: ChatGPT mobile app voice chat custom instructions

It doesn't show up, I suppose OpenAI kept the custom instruction (SYSTEM role instructions) at the server-side. I was referring to how to obtain the custom instructions to make the response human-like when using ChatGPT mobile app in voice mode.

szesiongteo | 2 years ago | on: Google Cloud Spanner is now half the cost of Amazon DynamoDB

Well, the thing with Google is whenever their service does not get enough users to reach their target revenue then they can suddenly shutdown in the following year. This is the reason why I would always go with AWS or Azure and GCP the last choice.

szesiongteo | 2 years ago | on: Fine-tune your own Llama 2 to replace GPT-3.5/4

I think the cost calculation here does not reflect the actual scenario where most people face. In real world scenario, we don't get inputs queued up to millions and wait for the GPU to inference them continuously at 100% utilization. We need to ensure the user get their response in time, and assume that we get all the inputs spread out evenly within a month, we have to look at the cost of running GPU for a month vs using OpenAI API.