aaronsteers's comments

aaronsteers | 2 years ago | on: ELTP: Extending ELT for Modern AI and Analytics

Thanks for this feedback! I do agree there are some similarities, which I called out as common benefits of using "EL pairs" on both sides of the process.

Here are my thoughts though on the importance of the distinction...

The first place you land the data is almost always a place you control - either a data warehouse or a data lake that you have tuned for fast and flexible data processing. The second (publish) process pushes to a location you most likely can't control, and which is not prepared to receive raw/unshaped data.

This is important because the business logic in our transformations will almost always evolve over time. Running the transforms between EL and P (the second "EL") gives us reproducibility and the efficiency to innovate, in the location where we have the best performance profile for running them.
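
To make the ordering concrete, here is a minimal sketch, assuming an in-memory SQLite database stands in for the warehouse and a plain Python list stands in for the downstream system. The table names, function names, and sample rows are all hypothetical and only illustrate the shape of the idea:

```python
import sqlite3


def extract_load(conn, rows):
    # EL: land raw, unshaped records in a store we control.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS raw_orders "
        "(id INTEGER, amount_cents INTEGER, status TEXT)"
    )
    conn.executemany("INSERT INTO raw_orders VALUES (?, ?, ?)", rows)


def transform(conn):
    # T: business logic runs inside the warehouse and can be rebuilt
    # from the raw table whenever that logic changes.
    conn.execute("DROP TABLE IF EXISTS paid_totals")
    conn.execute(
        "CREATE TABLE paid_totals AS "
        "SELECT id, amount_cents / 100.0 AS amount "
        "FROM raw_orders WHERE status = 'paid'"
    )


def publish(conn, destination):
    # P (the second "EL"): push only the shaped result to a system
    # we do not control and that is not prepared for raw data.
    for row in conn.execute("SELECT id, amount FROM paid_totals ORDER BY id"):
        destination.append({"id": row[0], "amount": row[1]})


conn = sqlite3.connect(":memory:")
downstream = []  # hypothetical stand-in for the publish target
extract_load(conn, [(1, 1250, "paid"), (2, 900, "refunded"), (3, 400, "paid")])
transform(conn)
publish(conn, downstream)
print(downstream)
```

If the rule in `transform` changes later, only `transform` and `publish` need to run again; the raw rows are already landed.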

What do you think?

aaronsteers | 2 years ago | on: Falcon LLM – A 40B Model

Although not evil, adult content should be opt-in, and it should be possible to opt out of it at a platform level... hence, the need for censored models. Imagine a restaurant booking AI app, built on GPT, that accidentally doubled as a bomb-making tutor or an adult content generator. It's a lawsuit waiting to happen, if nothing else, and it's worth making these use cases harder (if not impossible) to implement in mainstream, commercially available products. Note that for many of these products, age and consent for adult material have not already been established.

So far, the open source ecosystem seems to be doing a good job of providing both censored and uncensored LLMs - and it seems there are valid use cases for both.

Think of this as similar to Falcon LLM being launched in both 40B and smaller 7B variants - the LLM often will need to match the use case, and the 7B model is a good example of making the model smaller (and worse) on purpose in order to reach certain trade-offs.
