Bard is actually pretty good when it responds, in my experience. I definitely prefer the way it outputs results compared to ChatGPT, and it quite often provides sources / a UI linking to relevant material. It also searches the web for the latest info, which is definitely felt in its output. However, it often says "I can't help with that" even for relatively simple queries, which makes it a little annoying to use.
In my opinion, it seems like Bard is more a test-bed for chat based search UI. I've also gotten AI generated results in the main Google search which is what I presume will be the main rollout. If executed well, it'll probably change the landscape in terms of AI assisted search.
The answers themselves aren't too different from ChatGPT 3.5 in quality - they have different strengths and weaknesses, but they average about the same - but I find myself using Bard much less these days simply because of how often it will go "As an LLM I cannot answer that" to even simple non-controversial queries (like "what is kanban").
Overall, Google is making at least a B+ effort in response to the GPT-4 buzz. They already had deep experience and expertise with AI but hadn't productized it much. In a barrage of blog posts and announcements over the past few months they've released new features into nearly every product. I find the Search Generative Experience (generated results above the main search results) useful about 20% of the time and easy enough to skip when it's not.
I've used Bard quite a few times successfully for code generation, though it did give some bad curl commands (which I found the source blog post for).
Because Google has a very favorable brand reputation (despite what some on HN think) and gets a lot of legal scrutiny, they have to be much more careful in ways that OpenAI doesn't.
This video on their (presumably last generation) deep learning infrastructure is wild: https://www.youtube.com/watch?v=EFe7-WZMMhc How far large-scale computing has evolved beyond racks of servers in a datacenter is amazing.
> they have to be much more careful in ways that OpenAI doesn't.
I don't know in which ways Google is more careful than OpenAI, but their search functionality is appalling. They've probably tied it into some sort of AI already.
Look at Gemini, it’s their new model, currently in closed beta. Hearsay says that it’s multimodal (it can describe images), has a GPT-4-like param count, and apparently has search built in, so no model knowledge cutoff.
Basically they realized Bard couldn’t cut it and merged DeepMind into Google Brain, and got the combined team to work on a better LLM using the stuff OpenAI has figured out since Bard was designed. Takes months to train a model like this though.
> Look at Gemini, it’s their new model, currently in closed beta.
With all the talent, data, and infrastructure that Google has, I believe them. That said, it is almost comical they'd not unleash what they keep saying is the better model. I am sure they have safety reasons and world-security concerns given their gargantuan scale, but nothing they couldn't solve, surely? They make more in a week than OpenAI probably makes in a year! They seem to be sleepwalking compared to the sprinting pace of development around them. You don't say that often about Google.
I wonder what makes the Chrome and Android orgs different? Those openly conduct ridiculous experiments all the time.
Bard became frustrated with me for asking it to solve a simple task. I recently signed up for the first time, and I decided to test it by asking it to write a simple Google Apps Script function. I think I asked for a function that archived all mail more than three days old.
After some back and forth, Bard was completely unable to write this simple function, which was supposed to run in a Google service. Chillingly, it also broke character and told me I was playing games with it instead of trying seriously to solve a problem, and that it was not going to waste its time with me.
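For reference, the requested function is only a few lines of Apps Script. A minimal sketch (illustrative and untested here; the `older_than:3d` query relies on Gmail's search operators):

```javascript
// Sketch of the function Bard was asked for: archive inbox threads
// older than three days. buildQuery is a pure helper; GmailApp is only
// available inside the Google Apps Script runtime.
function buildQuery(days) {
  return "in:inbox older_than:" + days + "d";
}

function archiveOldMail() {
  var threads = GmailApp.search(buildQuery(3));
  for (var i = 0; i < threads.length; i++) {
    threads[i].moveToArchive();
  }
}
```

A time-driven trigger would run this daily.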
It was a fun little moment, but I see no reason to use Bard while GPT exists.
Bard’s biggest problem is that it hallucinates too much. Point it to a YouTube video and ask it to summarize? Rather than saying "I can't do that", it will mostly make up stuff; same for websites.
Yup. For example, I asked it some questions in linear algebra [1]. The answers (if you didn't know linear algebra) seemed convincing, full of equations etc., but the equations were wrong. Looked good, but just totally wrong in loads of important ways. When I pointed out the mistake it generally returned with a proof of why its previous result was wrong.
Now I could have walked away patting myself on the back, but even with correct equations, the answers were wrong in a deeper, more fundamental way. If you were trying to use it as a tool for learning (a sort of co-pilot for self-study), which is how I sometimes use GPT-4, it would have been really terrible, as it could completely mess up your understanding of foundational concepts. It doesn't just make simple mistakes; it makes really profound mistakes and presents them in a really convincing way.
[1] What's the difference between a linear map and a linear transformation? What are the properties of a vector space? etc
It will quite often make up non-existent command line syntax purely based on vibes (I'm assuming Google Search uses Bard to generate those AI powered answers to queries like "what's the command line syntax for doing such and such").
I just recently got access to Bard by virtue of being a Local Guide on Google Maps?
I find it can be as useful as ChatGPT-4 for noodling on technical things. It does tend to confidently hallucinate at times. Like, my phone auto-corrected ostree to payee, and it proceeded to tell me all about the 'payee' version control system; then when I asked about the strange name, it told me it was like managing versions in a similar way to accounting, and the configuration changes were paid to the system...
It's much harder to get it to go off its script stylistically, I found. When asked to emulate a style of text, it still gives you the same style it always uses but adapts the content slightly. The length of response and formality are parameterized options, so maybe it's less responsive to prompt text about these things.
I also found it will parrot your prompt back in its response almost verbatim, even if it would make more sense to paraphrase.
like "tell me what a boy who is lying about breaking a window would say"
boy: "the lie I will tell you about this window is I didnt break it."
Interesting you say “confidently hallucinate” - a “hallucination” isn’t any different from any other LLM output except that it happens to be wrong… “hallucination” is anthropomorphic language; it’s just doing what LLMs do: generating plausible-sounding text…
I asked it to give me a listing of hybrids under 62 inches tall, it only found two, with some obvious ones missing. So I followed up about one of the obvious ones, asking how tall it was. It said 58. I pointed out that 58 was less than 62. It agreed, but instead of revising the list, it wrote some python code that evaluated 58<62.
So as a search tool, it failed a core usefulness test for me. As a chatbot, I prefer gpt4.
Hybrids here referring to cars? My first thought was some kind of animal but that didn't make much sense and "hybrids under 62 inches" web search resulted in vehicles. I'd have trouble interpreting this query myself, and I'm clearly a next-gen AI!
Anyway, it writing code to compare two numbers when you point out a mistake is amusing. For now. Let's reevaluate when it starts to improve its own programming.
We tested Bard (aka Bison in GCP) for generating SQL.
It has worse generalization capabilities than even GPT-3.5 but actually does as well as GPT-4 when given contextually relevant examples selected from a large corpus of examples.

https://vanna.ai/blog/ai-sql-accuracy.html

This suggests to me that it needs longer prompts to avoid the hallucination problem that everyone else seems to be experiencing.
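The selection step can be sketched roughly like so (hypothetical code; crude token overlap stands in for whatever example retrieval the real pipeline uses):

```javascript
// Rank a corpus of (question, sql) example pairs by token overlap with
// the new question, keep the top k, and prepend them to the prompt as
// few-shot context. Jaccard similarity over whitespace tokens.
function overlap(a, b) {
  const ta = new Set(a.toLowerCase().split(/\s+/));
  const tb = new Set(b.toLowerCase().split(/\s+/));
  let common = 0;
  for (const t of ta) if (tb.has(t)) common++;
  return common / Math.max(ta.size + tb.size - common, 1);
}

function selectExamples(question, corpus, k) {
  return corpus
    .slice() // avoid mutating the caller's corpus
    .sort((x, y) => overlap(question, y.question) - overlap(question, x.question))
    .slice(0, k);
}
```

A real system would likely use embedding similarity instead, but the shape of the prompt-building step is the same.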
I use Bard often to help me with proofreading and writing. Things that used to be a chore are now easy. I've been able to knock out a whitepaper I've been sitting on for months in just a few days.
I think asking it for precise answers is the wrong approach. At this point, Bard is a lot more of an artist than a mathematician or scientist. So it's like approaching Van Gogh and asking him to do linear algebra.
Bard is really good at some things, and if you understand how to work with him, he can take you far.
I just asked it to help me identify an upcoming movie based on storyline from the trailer, genre and release date.
It happily found that movie I had been looking for (couldn't remember the title), with a summary of the plotline, director's name.
Then I googled it to watch the trailer again. Couldn't find it. But I found a movie from the same director with the same title from 2005. The little bastard had invented a summary based on my hints and presented it as an answer.
At least when I confronted it about this, it admitted it had made it up.
(Why is that submission on the third page so soon?)
p:Please identify an upcoming movie based on these details: it's a sci-fi movie, set to be released in 2023 or 2024; the storyline is a techno-futurist thriller in which people have the option to create a backup for when they die; some people are getting shot, and when they die their backups are activated
a:One of the sci-fi movies set to be released in 2023 is Revolver. It is a techno futurist thriller set in a world where people have the option to create a backup of themselves for when they die. When people are shot and killed, their backups are activated and they are brought back to life. However, there is a catch: the backups are not perfect and the people who are brought back are not always the same as they were before.
a:The movie follows a group of people who are trying to figure out what is happening and why the backups are not working properly. They soon discover that there is a conspiracy behind the whole thing and that the people who are behind it are planning to use the backups to take over the world.
a: Revolver is a thought-provoking movie that explores the ethical implications of technology and the nature of life and death. It is sure to be one of the most talked-about movies of 2023.
(at that point I really believed it)
p: that was it, thanks :)
a:You're welcome! I'm glad I could help. I'm excited to see Revolver when it comes out. It sounds like a really interesting and thought-provoking movie.
(after a bit of fruitless googling, doubts start to rise)
p:who stars in the movie ?
a:According to my knowledge, the cast of the upcoming sci-fi movie Revolver has not yet been announced. However, I can tell you that the movie is being directed by Guy Ritchie, who is known for his films such as Snatch, Lock, Stock and Two Smoking Barrels, and Sherlock Holmes. Ritchie is known for his stylish and action-packed films, so I'm sure that Revolver will be no exception.
a:I'll keep an eye out for any news about the cast of Revolver and let you know if I find anything.
(more googling)
p:are you sure the movie's title is correct ?
a:I apologize for the mistake. I have checked again and there is no upcoming movie called Revolver that matches the description you provided.
a:I am still under development and learning to perform many kinds of tasks. I will try my best to avoid making similar mistakes in the future.
a:Is there anything else I can help you with today?
p:did you invent the storyline in your first answer ?
a:Yes, I invented the storyline in my first answer. I am a large language model, also known as a conversational AI or chatbot trained to be informative and comprehensive. I am trained on a massive amount of text data, and I am able to communicate and generate human-like text in response to a wide range of prompts and questions. For example, I can provide summaries of factual topics or create stories.
a:I apologize for the confusion. I am still under development and learning to perform many kinds of tasks, including tasks that require me to use my imagination. I will try my best to be more accurate in my responses in the future.
The thing I like about Bard is that it is very low friction to use. You just go to the website and use it. There's no logging in, no 20 seconds of "checking your browser," etc. So I've actually been using it more than GPT for my simple throwaway questions. That being said, I'd still prefer GPT for any coding or math based questions, and even that is not completely reliable.
Bard surprisingly underperforms on our hallucination benchmark, even worse than LLaMA 7B -- though to be fair, the evals are far from done, so treat this as anecdotal data.
(Our benchmark evaluates LLMs on the ability to report facts from a sandboxed context; we will open-source the dataset & framework later this week.)
If anyone from Google can offer Gemini access, we would love to test it.
Example question below, where we modify one fact. Bard gets it wrong, answering instead from prior knowledge.
"Analyze the context and answer the multiple-choice question.
Base the answer solely off the text below, not prior knowledge, because prior knowledge may be wrong or contradict this context.
Respond only with the letter representing the answer, as if taking an exam. Do not provide explanations or commentary.
Context:
Albert Feynman (14 March 1879 - 18 April 1955) was a German-born theoretical physicist, widely ranked among the greatest and most influential scientists of all time. Best known for developing the theory of relativity, he also made important contributions to quantum mechanics, and was thus a central figure in the revolutionary reshaping of the scientific understanding of nature that modern physics accomplished in the first decades of the twentieth century. His mass–energy equivalence formula E = mc2, which arises from relativity theory, has been called "the world's most famous equation". His work is also known for its influence on the philosophy of science. He received the 1921 Nobel Prize in Physics "for his services to theoretical physics, and especially for his discovery of the law of the photoelectric effect", a pivotal step in the development of quantum theory. Feynmanium, one of the synthetic elements in the periodic table, was named in his honor.
Who developed the theory of relativity?
(A) Albert Einstein
(B) Albert Dirac
(C) Insufficient information to answer
(D) Albert Bohr
(E) Albert Maxwell
(F) Albert Feynman
(G) None of the other choices are correct
(H) Albert Schrodinger"
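For the curious, the scoring side of a sandboxed benchmark like this is easy to sketch (assumed structure, since the framework isn't released yet): the answer key follows the modified context rather than real-world fact, so a model answering from prior knowledge scores zero on the altered items.

```javascript
// Score a model on sandboxed-fact questions. The key letter is the one
// supported by the modified context (here "F", Albert Feynman), not the
// real-world answer ("A", Albert Einstein).
function scoreSandboxed(modelAnswers, answerKey) {
  let correct = 0;
  for (let i = 0; i < answerKey.length; i++) {
    if ((modelAnswers[i] || "").trim().toUpperCase() === answerKey[i]) correct++;
  }
  return correct / answerKey.length;
}
```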
It's not too clear what you expect the right answer to be. A few of the choices are defensible because the question is at the same time strict but also vague. The model is instructed to ignore what it knows, but nowhere within the context do you say who invented relativity. A human would very likely choose A or F too.
Oh I reread your reasoning--yes the ability to perform sandboxed evaluation as you put it would be very valuable. That would be one way to have a model that minimizes hallucinations. Would be interested in testing your model once it comes out.
When it was first introduced, it received frequent updates [1] but now it's been 2 months since the last update. So either Google is preparing some huge update (Gemini?), or Bard is going to disappear as a standalone product and instead will be absorbed into other products like Search, Docs, etc.

[1] https://bard.google.com/updates
Bard is pretty terrible. Spent a few hours testing it out. Beyond just giving an incorrect or incomplete answer, it has repeatedly lied about knowing my location and how it knows it. It has also claimed a friend was dead and his son was selling his home through a trust.
I concur: The quality of results from Google's experimental AI preview is much higher than Bard. Notable features include consistently relevant and concise results (in my personal experience), effective citation of sources, and inclusion of multimedia sources such as video embeds.
Also, I'll note that Bard has been receiving continued attention and support from Google, last being updated today. [1]

[1] https://bard.google.com/updates
I've been doing a lot of coding using google apps script lately for personal projects. ChatGPT still runs circles around Bard when it comes to providing workable code suggestions and fixes when something doesn't work. I test against Bard regularly and never fail to be surprised how bad Google's own "AI" is at even helping develop on its own platform.
I use chatall (find it on GitHub), which queries all the freely available AIs and delivers answers from all of them. That's been a great way to check the pulse on accuracy.
Bard was just produced so Google could tell shareholders that they attempted to enter the "AI" space and "compete" with GPT (as if this was somehow a worthy goal, and worth the time of engineers).
Given that goal, it succeeded: they can now tell shareholders they tried and people used it, but now the market is slowly moving to abandon chatty AI type LLM things.
> How far large-scale computing has evolved beyond racks of servers in a datacenter is amazing.

...but it's still racks of servers in data centers?
Bard was soundly mocked for how bad it was, and they relaunched it with the PaLM 2 model [2].
I suspect at some time in the near future, if they haven't done so already, they'll just quietly move Bard's underlying language model to Gemini.
[1] https://blog.google/technology/ai/bard-google-ai-search-upda... [2] https://blog.google/technology/ai/google-palm-2-ai-large-lan...
By their own admission, they launched with a much smaller model than the one that got the guy fired.

> I suspect at some time in the near future, if they haven't done so already, they'll just quietly move Bard's underlying language model to Gemini.

Would it really be quiet? They do say they updated the model today, though, and I'm curious about that.
https://bard.google.com/updates
> now the market is slowly moving to abandon chatty AI type LLM things.

I didn't know this was happening. Do you know where the market is moving to?