haint_ | 3 years ago

From the provided example:

Q: How would I make for loop in python?

A: I can help you create an AI chat bot. It would talk to you like a human. (additional text that is not relevant to the prompt)

Is it just me, or does this not seem right?

zaptrem|3 years ago

This is only a 1.5B-parameter model, and an off-topic answer like that is in line with its size. GPT-3.5 is ~175B params.

cztomsik|3 years ago

Give it at least a few examples. ~1B networks are not good at zero-shot. Also, don't expect answers for things it was not trained on. the_pile is not a programming dataset.
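To make "give it a few examples" concrete, here is a minimal sketch of building a few-shot prompt in the same Q:/A: layout as the example quoted above. The helper name build_few_shot_prompt and the two example pairs are made up for illustration; nothing here is specific to RWKV, and the resulting string would be passed to whatever generation call you are using.

```python
def build_few_shot_prompt(examples, question):
    """Join worked (question, answer) pairs, then the new question with an open 'A:'."""
    parts = []
    for q, a in examples:
        parts.append(f"Q: {q}\n\nA: {a}")
    # Leave the final answer empty so the model continues from here.
    parts.append(f"Q: {question}\n\nA:")
    return "\n\n".join(parts)


# Hypothetical example pairs, chosen only to show the pattern.
EXAMPLES = [
    ("How would I print text in python?", 'Use print("hello").'),
    ("How would I make a list in python?", "Use square brackets: items = [1, 2, 3]."),
]

prompt = build_few_shot_prompt(EXAMPLES, "How would I make for loop in python?")
print(prompt)
```

The idea is that a small model is more likely to continue an established pattern than to follow a bare instruction, so the prompt ends on an open "A:" after two completed pairs.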

RWKV is important because it's fast, it can be trained in parallel and it gives very good results (compared to other networks trained on the same dataset).