hyperopt
|
1 year ago
|
on: The Lost Art of Logarithms
Charles Petzold wrote one of my favorite books - "Code: The Hidden Language of Computer Hardware and Software". Very excited to see how this turns out and thanks for giving some of this knowledge away for free!
hyperopt
|
2 years ago
|
on: Vicuna: An open-source chatbot impressing GPT-4 with 90% ChatGPT quality
The demo for this is great. It's the best non-corporate assistant I've used so far. I suspect most of the gains relative to the Alpaca model come from the fact that the ShareGPT data are full conversations, which let the assistant respond to earlier messages in a cohesive way. Alpaca's data, by contrast, consisted of single question-answer pairs, so that model seems to lose the context of earlier information. Also, the coding abilities of Vicuna are significantly improved relative to Alpaca, to the point that I began to suspect they might be calling out to OpenAI on the backend. Please release the model weights and fine-tuning data.
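To make the data-format difference concrete, here is a hypothetical sketch. The field names are illustrative only, not the exact Alpaca/ShareGPT schemas: a single-turn record versus a multi-turn conversation record that preserves earlier context.

```python
# Hypothetical illustration of the two fine-tuning data formats.
# Field names are assumptions, not the actual Alpaca/ShareGPT schemas.

# Alpaca-style: one isolated instruction/response pair per record.
alpaca_record = {
    "instruction": "What is the capital of France?",
    "output": "The capital of France is Paris.",
}

# ShareGPT-style: a full multi-turn conversation, so the model can learn
# to refer back to earlier turns ("its" below only resolves via context).
sharegpt_record = {
    "conversations": [
        {"from": "human", "value": "What is the capital of France?"},
        {"from": "gpt", "value": "The capital of France is Paris."},
        {"from": "human", "value": "What is its population?"},
        {"from": "gpt", "value": "Paris has roughly two million residents."},
    ]
}

def flatten(record):
    """Join a multi-turn conversation into a single training string."""
    return "\n".join(f"{t['from']}: {t['value']}" for t in record["conversations"])

print(flatten(sharegpt_record))
```

Training on the flattened multi-turn string teaches the model to resolve references like "its" against earlier turns, which the single-pair format never exercises.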
hyperopt
|
2 years ago
|
on: Gpt4all: A chatbot trained on ~800k GPT-3.5-Turbo Generations based on LLaMa
Not even close.
hyperopt
|
2 years ago
|
on: Gpt4all: A chatbot trained on ~800k GPT-3.5-Turbo Generations based on LLaMa
Does anyone know of any good test suites we can use to benchmark these local models? It would be really interesting to compare all the ones capable of running on consumer hardware so that users can easily choose the best ones to use. Currently, I'm a bit unsure how this compares to the Alpaca model released a few weeks ago.
hyperopt
|
3 years ago
|
on: WInd3x, the iPod Bootrom exploit 10 years too late
Fantastic! I was involved in the freemyipod project back when the 4G was still a current product. In fact, the freemyipod.org wiki ran on a server I kept downstairs in my parents' home. My favorite memory was when we all built robots to brute-force a return address (https://freemyipod.org/wiki/Nanotron_3000). Glad to see the project is still alive!
hyperopt
|
7 years ago
|
on: Learnability can be undecidable
hyperopt
|
7 years ago
|
on: Pampy: Pattern Matching for Python
Nice! Extending this further to support associative and commutative operations yields functionality similar to what Mathematica provides. The MatchPy project takes this approach, with the aim of integrating into SymPy so that it can use the Rubi integration ruleset.
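As a rough illustration of why commutative matching needs extra machinery (a toy sketch, not MatchPy's or Pampy's actual API): a matcher for a commutative operator has to try operand orderings rather than a single fixed order.

```python
from itertools import permutations

# Toy structural matcher. Expressions are nested tuples like ('+', 2, 3);
# lowercase strings such as 'x' act as pattern variables. For commutative
# operators, every ordering of the operands is tried. This is an
# illustration only, not MatchPy's real algorithm.

def match(pattern, expr, commutative=frozenset({"+", "*"})):
    """Return a dict of variable bindings if expr matches pattern, else None."""
    if isinstance(pattern, str) and pattern.islower():
        return {pattern: expr}
    if not isinstance(pattern, tuple):
        return {} if pattern == expr else None
    if (not isinstance(expr, tuple) or len(pattern) != len(expr)
            or pattern[0] != expr[0]):
        return None
    op, p_args, e_args = pattern[0], pattern[1:], expr[1:]
    orderings = permutations(e_args) if op in commutative else [e_args]
    for ordering in orderings:
        bindings = {}
        for p, e in zip(p_args, ordering):
            sub = match(p, e, commutative)
            # Fail on mismatch or on conflicting bindings for the same variable.
            if sub is None or any(bindings.get(k, v) != v for k, v in sub.items()):
                bindings = None
                break
            bindings.update(sub)
        if bindings is not None:
            return bindings
    return None

# ('+', 'x', 1) matches ('+', 1, 5) only because '+' is commutative:
print(match(("+", "x", 1), ("+", 1, 5)))
```

The cost is visible here too: trying all operand orderings is factorial in the worst case, which is exactly the kind of blow-up MatchPy's many-to-one matching is designed to tame.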
hyperopt
|
8 years ago
|
on: The Big Vitamin D Mistake
In CA, Quest Diagnostics requires a doctor's note for the tests I've taken. Other providers or other tests may be different.
hyperopt
|
9 years ago
|
on: Google supercharges machine learning tasks with TPU custom chip
That's what I'm thinking. I was anticipating the release of GPU instances, but now I suspect they will simply leapfrog GPU instances and go straight to this.
hyperopt
|
9 years ago
|
on: Google supercharges machine learning tasks with TPU custom chip
In March, at the GCP NEXT keynote [1], Jeff Dean demoed Cloud ML on GCP. He casually mentioned passing in the argument "replicas=20" to get "20 way parallelism in optimizing this particular model". GCE does not currently offer GPU instances, and I've never heard the term "replicas" in the GPU ML discourse, so these devices may enable a kind of parallelism we haven't seen before. Furthermore, his experiments apparently used the Criteo dataset, which is about 10 GB. Now, I haven't looked into the complexity of the model or how far they trained it, but right now that sounds really impressive to me.
1: https://youtu.be/HgWHeT_OwHc?t=2h13m6s
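For what that kind of replica-based parallelism might look like, here is a minimal sketch (my own interpretation in plain Python, not Google's implementation): each replica computes a gradient on its shard of the data, and the gradients are averaged before the shared weight is updated.

```python
# Sketch of data parallelism across replicas. Each replica works on its
# own shard of the batch; gradients are averaged to update shared weights.
# This is an illustrative toy model, not how Cloud ML actually works.

def gradient(w, shard):
    """Toy mean-squared-error gradient for a 1-D linear model y = w * x."""
    return sum(2 * (w * x - y) * x for x, y in shard) / len(shard)

def data_parallel_step(w, batch, replicas, lr):
    shards = [batch[i::replicas] for i in range(replicas)]
    # On real hardware each replica would run concurrently on its own device.
    grads = [gradient(w, shard) for shard in shards]
    return w - lr * sum(grads) / len(grads)

data = [(x, 3.0 * x) for x in range(1, 21)]  # generated with true w = 3
w = 0.0
for _ in range(200):
    w = data_parallel_step(w, data, replicas=4, lr=0.001)
print(round(w, 2))  # converges to roughly 3.0
```

With 20 replicas instead of 4, each device touches a twentieth of the batch per step, which is presumably the "20 way parallelism" being described.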
hyperopt
|
9 years ago
|
on: Google supercharges machine learning tasks with TPU custom chip
It doesn't just need to be a place to experiment cheaply. Many companies building software around ML techniques still rent time on EC2. Unless you are training models 24/7 and have your machines in a very cost-efficient location in terms of power and cooling, it's probably better to do your training in the cloud. I think very few use cases clear that bar.
hyperopt
|
9 years ago
|
on: Google supercharges machine learning tasks with TPU custom chip
I listen to Talking Machines as well. Great podcast. Another question is whether the gains are worth the cost of an ML-specific ASIC. GPUs have the entire, massive gaming industry driving their cost down. I suppose that as adoption of gradient-descent-based neural networks grows, an ASIC may become worth the cost in the same way GPUs already are. Then again, I have never implemented SGD on a GPU, so I don't know whether there are bottlenecks an ML-specific ASIC could remove. Can anyone shed some light on this?
hyperopt
|
9 years ago
|
on: Google supercharges machine learning tasks with TPU custom chip
The Cloud Machine Learning service is the one I'm most anticipating. Setting up arbitrary cloud machines for training models is a mess right now. If Google sets it up correctly, it could be a game changer for ML research for the rest of us, especially if they can undercut AWS's GPU instances on cost per unit of performance through specialized hardware. I don't think the coinciding releases and announcements of TensorFlow, Cloud ML, and now this are an accident. Something is brewing, and I think it's going to be big.