Bard is actually pretty good when it responds, in my experience. I definitely prefer the way it outputs results compared to ChatGPT, and it quite often provides sources / a UI linking to relevant material. It also searches the web for the latest info, which is definitely felt in its output. However, it often says "I can't help with that" even for relatively simple queries, which makes it a little annoying to use.
In my opinion, it seems like Bard is more a test-bed for chat based search UI. I've also gotten AI generated results in the main Google search which is what I presume will be the main rollout. If executed well, it'll probably change the landscape in terms of AI assisted search.
The answers themselves aren't too different from ChatGPT 3.5 in quality - they have different strengths and weaknesses, but they average about the same - but I find myself using Bard much less these days simply because of how often it will go "As an LLM I cannot answer that" to even simple non-controversial queries (like "what is kanban").
Overall, Google is making at least a B+ effort in response to the GPT-4 buzz. They already had deep experience and expertise with AI but hadn't productized it much. In a barrage of blog posts and announcements over the past few months they've released new features into nearly every product. I find the Search Generative Experience (generated results above the main search results) useful about 20% of the time and easy enough to skip when it's not.
I've used Bard quite a few times successfully for code generation, though it did give some bad curl commands (which I found the source blog post for).
Because Google has a very favorable brand reputation (despite what some on HN think) and gets a lot of legal scrutiny, they have to be much more careful in ways that OpenAI doesn't.
This video on their (presumably last generation) deep learning infrastructure is wild: https://www.youtube.com/watch?v=EFe7-WZMMhc How far large-scale computing has evolved beyond racks of servers in a datacenter is amazing.
> they have to be much more careful in ways that OpenAI doesn't.
I don't know in which ways Google is more careful than OpenAI, but their search functionality is appalling. They've probably tied it into some sort of AI already.
Look at Gemini, it’s their new model, currently in closed beta. Hearsay says that it’s multimodal (it can describe images), has a GPT-4-like param count, and apparently has search built in, so no model knowledge cutoff.
Basically they realized Bard couldn’t cut it and merged DeepMind into Google Brain, and got the combined team to work on a better LLM using the stuff OpenAI has figured out since Bard was designed. Takes months to train a model like this though.
> Look at Gemini, it’s their new model, currently in closed beta.
With all the talent, data, and infrastructure that Google has, I believe them. That said, it is almost comical they'd not unleash what they keep saying is the better model. I am sure they have safety reasons and world-security concerns given their gargantuan scale, but nothing they couldn't solve, surely? They make more in a week than OpenAI probably makes in a year! They seem to be sleepwalking compared to the sprinting pace of development around them. You don't say that often about Google.
I wonder what makes the Chrome and Android orgs different? Those openly conduct ridiculous experiments all the time.
Bard became frustrated with me for asking it to solve a simple task. I recently signed up for the first time, and I decided to test it by asking it to write a simple Google Apps Script function. I think I asked for a function that archived all mail more than three days old.
After some back and forth, Bard was completely unable to write this simple function, which was supposed to run in a Google service. Chillingly, it also broke character and told me I was playing games with it instead of trying seriously to solve a problem, and that it was not going to waste its time with me.
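For reference, the requested function is only a few lines of Apps Script. A minimal sketch (illustrative and untested here; the `older_than:3d` query relies on Gmail's search operators):

```javascript
// Sketch of the function Bard was asked for: archive inbox threads
// older than three days. buildQuery is a pure helper; GmailApp is only
// available inside the Google Apps Script runtime.
function buildQuery(days) {
  return "in:inbox older_than:" + days + "d";
}

function archiveOldMail() {
  var threads = GmailApp.search(buildQuery(3));
  for (var i = 0; i < threads.length; i++) {
    threads[i].moveToArchive();
  }
}
```

A time-driven trigger would run this daily.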
It was a fun little moment, but I see no reason to use Bard while GPT exists.
Bard’s biggest problem is that it hallucinates too much. Point it to a YouTube video and ask it to summarize? Rather than saying "I can't do that", it will mostly make up stuff; same for websites.
Yup. For example, I asked it some questions in linear algebra [1]. The answers (if you didn't know linear algebra) seemed convincing, full of equations etc., but the equations were wrong. Looked good, but just totally wrong in loads of important ways. When I pointed out the mistake it generally returned with a proof of why its previous result was wrong.
Now I could have walked away patting myself on the back, but even with correct equations, the answers were wrong in a deeper, more fundamental way. If you were trying to use it as a tool for learning (a sort of co-pilot for self-study), which is how I sometimes use GPT-4, it would have been really terrible, as it could completely mess up your understanding of foundational concepts. It doesn't just make simple mistakes; it makes really profound mistakes and presents them in a really convincing way.
[1] What's the difference between a linear map and a linear transformation? What are the properties of a vector space? etc
It will quite often make up non-existent command line syntax purely based on vibes (I'm assuming Google Search uses Bard to generate those AI powered answers to queries like "what's the command line syntax for doing such and such").
I just recently got access to Bard by virtue of being a Local Guide on Google Maps?
I find it can be as useful as ChatGPT-4 for noodling on technical things. It does tend to confidently hallucinate at times. Like, my phone auto-corrected ostree to payee, and it proceeded to tell me all about the 'payee' version control system; then when I asked about the strange name, it told me it was like managing versions in a similar way to accounting, and the configuration changes were paid to the system...
It's much harder to get it to go off its script stylistically, I found. When asked to emulate a style of text, it still gives you the same style it always uses but adapts the content slightly. The length of response and formality are parameterized options, so maybe it's less responsive to prompt text about these things.
I also found it will parrot your prompt back in its response almost verbatim, even if it would make more sense to paraphrase.
like "tell me what a boy who is lying about breaking a window would say"
boy: "the lie I will tell you about this window is I didnt break it."
Interesting you say “confidently hallucinate” - a “hallucination” isn’t any different from any other LLM output except that it happens to be wrong… “hallucination” is anthropomorphic language; it’s just doing what LLMs do: generating plausible-sounding text…
I asked it to give me a listing of hybrids under 62 inches tall, it only found two, with some obvious ones missing. So I followed up about one of the obvious ones, asking how tall it was. It said 58. I pointed out that 58 was less than 62. It agreed, but instead of revising the list, it wrote some python code that evaluated 58<62.
So as a search tool, it failed a core usefulness test for me. As a chatbot, I prefer gpt4.
Hybrids here referring to cars? My first thought was some kind of animal but that didn't make much sense and "hybrids under 62 inches" web search resulted in vehicles. I'd have trouble interpreting this query myself, and I'm clearly a next-gen AI!
Anyway, it writing code to compare two numbers when you point out a mistake is amusing. For now. Let's reevaluate when it starts to improve its own programming.
We tested Bard (aka Bison in GCP) for generating SQL.
It has worse generalization capabilities than even GPT-3.5 but actually does as well as GPT-4 when given contextually relevant examples selected from a large corpus of examples.

https://vanna.ai/blog/ai-sql-accuracy.html

This suggests to me that it needs longer prompts to avoid the hallucination problem that everyone else seems to be experiencing.
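The selection step can be sketched roughly like so (hypothetical code; crude token overlap stands in for whatever example retrieval the real pipeline uses):

```javascript
// Rank a corpus of (question, sql) example pairs by token overlap with
// the new question, keep the top k, and prepend them to the prompt as
// few-shot context. Jaccard similarity over whitespace tokens.
function overlap(a, b) {
  const ta = new Set(a.toLowerCase().split(/\s+/));
  const tb = new Set(b.toLowerCase().split(/\s+/));
  let common = 0;
  for (const t of ta) if (tb.has(t)) common++;
  return common / Math.max(ta.size + tb.size - common, 1);
}

function selectExamples(question, corpus, k) {
  return corpus
    .slice() // avoid mutating the caller's corpus
    .sort((x, y) => overlap(question, y.question) - overlap(question, x.question))
    .slice(0, k);
}
```

A real system would likely use embedding similarity instead, but the shape of the prompt-building step is the same.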
I use Bard often to help me with proofreading and writing. Things that used to be a chore are now easy. I've been able to knock out a whitepaper I've been sitting on for months in just a few days.
I think asking it for precise answers is the wrong approach. At this point, Bard is a lot more of an artist than a mathematician or scientist. So it's like approaching Van Gogh and asking him to do linear algebra.
Bard is really good at some things, and if you understand how to work with him, he can take you far.
I just asked it to help me identify an upcoming movie based on storyline from the trailer, genre and release date.
It happily found that movie I had been looking for (couldn't remember the title), with a summary of the plotline, director's name.
Then I googled it to watch the trailer again. Couldn't find it. But I found a movie from the same director with the same title from 2005. The little bastard had invented a summary based on my hints and presented it as an answer.
At least when I confronted it about this, it admitted it had made it up.
(Why is that submission on the third page so soon?)
p:Please identify an upcoming movie based on these details: it's a sci-fi movie, set to be released in 2023 or 2024; the storyline is a techno-futurist thriller in which people have the option to create a backup for when they die; some people are getting shot, and when they die their backups are activated
a:One of the sci-fi movies set to be released in 2023 is Revolver. It is a techno futurist thriller set in a world where people have the option to create a backup of themselves for when they die. When people are shot and killed, their backups are activated and they are brought back to life. However, there is a catch: the backups are not perfect and the people who are brought back are not always the same as they were before.
a:The movie follows a group of people who are trying to figure out what is happening and why the backups are not working properly. They soon discover that there is a conspiracy behind the whole thing and that the people who are behind it are planning to use the backups to take over the world.
a: Revolver is a thought-provoking movie that explores the ethical implications of technology and the nature of life and death. It is sure to be one of the most talked-about movies of 2023.
(at that point I really believed it)
p: that was it, thanks :)
a:You're welcome! I'm glad I could help. I'm excited to see Revolver when it comes out. It sounds like a really interesting and thought-provoking movie.
(after a bit of fruitless googling, doubts start to rise)
p:who stars in the movie ?
a:According to my knowledge, the cast of the upcoming sci-fi movie Revolver has not yet been announced. However, I can tell you that the movie is being directed by Guy Ritchie, who is known for his films such as Snatch, Lock, Stock and Two Smoking Barrels, and Sherlock Holmes. Ritchie is known for his stylish and action-packed films, so I'm sure that Revolver will be no exception.
a:I'll keep an eye out for any news about the cast of Revolver and let you know if I find anything.
(more googling)
p:are you sure the movie's title is correct ?
a:I apologize for the mistake. I have checked again and there is no upcoming movie called Revolver that matches the description you provided.
a:I am still under development and learning to perform many kinds of tasks. I will try my best to avoid making similar mistakes in the future.
a:Is there anything else I can help you with today?
p:did you invent the storyline in your first answer ?
a:Yes, I invented the storyline in my first answer. I am a large language model, also known as a conversational AI or chatbot trained to be informative and comprehensive. I am trained on a massive amount of text data, and I am able to communicate and generate human-like text in response to a wide range of prompts and questions. For example, I can provide summaries of factual topics or create stories.
a:I apologize for the confusion. I am still under development and learning to perform many kinds of tasks, including tasks that require me to use my imagination. I will try my best to be more accurate in my responses in the future.
The thing I like about Bard is that it is very low friction to use. You just go to the website and use it. There's no logging in, no 20 seconds of "checking your browser," etc. So I've actually been using it more than GPT for my simple throwaway questions. That being said, I'd still prefer GPT for any coding or math based questions, and even that is not completely reliable.
Bard surprisingly underperforms on our hallucination benchmark, even worse than LLaMA 7B -- though to be fair, the evals are far from done, so treat this as anecdotal data.
(Our benchmark evaluates LLMs on the ability to report facts from a sandboxed context; we will open-source the dataset & framework later this week.)
If anyone from Google can offer Gemini access, we would love to test it.
Example question below, where we modify one fact. Bard gets it wrong, answering instead from prior knowledge.
"Analyze the context and answer the multiple-choice question.
Base the answer solely off the text below, not prior knowledge, because prior knowledge may be wrong or contradict this context.
Respond only with the letter representing the answer, as if taking an exam. Do not provide explanations or commentary.
Context:
Albert Feynman (14 March 1879 - 18 April 1955) was a German-born theoretical physicist, widely ranked among the greatest and most influential scientists of all time. Best known for developing the theory of relativity, he also made important contributions to quantum mechanics, and was thus a central figure in the revolutionary reshaping of the scientific understanding of nature that modern physics accomplished in the first decades of the twentieth century. His mass–energy equivalence formula E = mc2, which arises from relativity theory, has been called "the world's most famous equation". His work is also known for its influence on the philosophy of science. He received the 1921 Nobel Prize in Physics "for his services to theoretical physics, and especially for his discovery of the law of the photoelectric effect", a pivotal step in the development of quantum theory. Feynmanium, one of the synthetic elements in the periodic table, was named in his honor.
Who developed the theory of relativity?
(A) Albert Einstein
(B) Albert Dirac
(C) Insufficient information to answer
(D) Albert Bohr
(E) Albert Maxwell
(F) Albert Feynman
(G) None of the other choices are correct
(H) Albert Schrodinger"
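For the curious, the scoring side of a sandboxed benchmark like this is easy to sketch (assumed structure, since the framework isn't released yet): the answer key follows the modified context rather than real-world fact, so a model answering from prior knowledge scores zero on the altered items.

```javascript
// Score a model on sandboxed-fact questions. The key letter is the one
// supported by the modified context (here "F", Albert Feynman), not the
// real-world answer ("A", Albert Einstein).
function scoreSandboxed(modelAnswers, answerKey) {
  let correct = 0;
  for (let i = 0; i < answerKey.length; i++) {
    if ((modelAnswers[i] || "").trim().toUpperCase() === answerKey[i]) correct++;
  }
  return correct / answerKey.length;
}
```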
It's not too clear what you expect the right answer to be. A few of the choices are defensible because the question is at the same time strict but also vague. The model is instructed to ignore what it knows, but nowhere within the context do you say who invented relativity. A human would very likely choose A or F too.
Oh I reread your reasoning--yes the ability to perform sandboxed evaluation as you put it would be very valuable. That would be one way to have a model that minimizes hallucinations. Would be interested in testing your model once it comes out.
When it was first introduced, it received frequent updates [1] but now it's been 2 months since the last update. So either Google is preparing some huge update (Gemini?), or Bard is going to disappear as a standalone product and instead will be absorbed into other products like Search, Docs, etc.

[1] https://bard.google.com/updates
Bard is pretty terrible. Spent a few hours testing it out. Beyond just giving an incorrect or incomplete answer, it has repeatedly lied about knowing my location and how it knows it. It has also claimed a friend was dead and his son was selling his home through a trust.
I concur: The quality of results from Google's experimental AI preview is much higher than Bard. Notable features include consistently relevant and concise results (in my personal experience), effective citation of sources, and inclusion of multimedia sources such as video embeds.
Also, I'll note that Bard has been receiving continued attention and support from Google, last being updated today. [1]

[1] https://bard.google.com/updates
I've been doing a lot of coding using google apps script lately for personal projects. ChatGPT still runs circles around Bard when it comes to providing workable code suggestions and fixes when something doesn't work. I test against Bard regularly and never fail to be surprised how bad Google's own "AI" is at even helping develop on its own platform.
I use chatall (find it on GitHub), which queries all the freely available AIs and delivers answers from all of them. That's been a great way to check the pulse on accuracy.
Bard was just produced so Google could tell shareholders that they attempted to enter the "AI" space and "compete" with GPT (as if this was somehow a worthy goal, and worth the time of engineers).
Given that goal, it succeeded: they can now tell shareholders they tried and people used it, but now the market is slowly moving to abandon chatty AI type LLM things.
> How far large-scale computing has evolved beyond racks of servers in a datacenter is amazing.

...but it's still racks of servers in data centers?
Bard was soundly mocked for how bad it was, and they relaunched it with the PaLM 2 model [2].
I suspect at some time in the near future, if they haven't done so already, they'll just quietly move Bard's underlying language model to Gemini.
[1] https://blog.google/technology/ai/bard-google-ai-search-upda... [2] https://blog.google/technology/ai/google-palm-2-ai-large-lan...
By their own admission, they launched with a much smaller model than the one that got the guy fired.

> I suspect at some time in the near future, if they haven't done so already, they'll just quietly move Bard's underlying language model to Gemini.

Would it really be quiet? They do say they updated the model today, though, and I'm curious about that.
https://bard.google.com/updates
> now the market is slowly moving to abandon chatty AI type LLM things.

I didn't know this was happening. Do you know where the market is moving to?