There is generally something about the Gemini models that feels a bit different from Claude, ChatGPT, or Mistral.
I always have the feeling that I'm chatting with a model oriented towards engineering tasks: the seriousness, the lack of interest in being humorous or cool.
I don't know if this is because I interact with Gemini only through AI Studio, which may have different system instructions (apart from those one can add oneself, which I never do) from the ones at gemini.google.com.
I never use gemini.google.com because it lacks a simple export feature. It's not even possible to save a single chat to disk (well, the others can't do that either); I just wish it could.
AI Studio's saving to Google Drive is really useful. It lets you download the chat, strip it of verbose parts like the thinking process, and reuse it in a new chat.
I wish gemini.google.com had a "Save as Markdown" option per answer and for the complete chat (with a toggle to include or exclude the thinking process). Then it would be a no-brainer for me.
It's as if Google Docs had no "Download..." menu entry and you could only "save" documents via Takeout.
You put into words something I've been struggling to describe for a long time. Gemini gives short, succinct responses with whatever information you need and minimal anything else. ChatGPT and Claude both pad their text with mannerisms, formatting, etc.
I didn't realize just how big the difference was until I tested it.
"How do I clear a directory of all executable files on Debian?"
Gemini 2.0 Flash (responses manually formatted):

    find /path/to/directory -type f -executable -delete

Replace /path/to/directory with the actual path.
ChatGPT: (full link [1])
To clear (delete) all executable files from a directory on Debian (or any Linux system), you can use the find command. Here's a safe and effective way to do it:
# [checkmark emoji] Command to delete all executable files in a directory (not recursively): [..]
# [magnifying glass emoji] Want to preview before deleting? [..]
# [caution sign emoji] Caution: [..]
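For reference, the substance behind both answers is the same GNU find invocation. A minimal sketch (the /path/to/directory placeholder is illustrative, -maxdepth 1 restricts it to the top level of the directory as in ChatGPT's non-recursive variant, and the preview step is just the same command with -print instead of -delete):

```shell
# Preview first: list executable regular files without deleting anything
find /path/to/directory -maxdepth 1 -type f -executable -print

# Then delete them once the listing looks right
find /path/to/directory -maxdepth 1 -type f -executable -delete
```

Note that -executable is a GNU find extension, which is fine on Debian but not portable to every Unix.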
I have been using the Obsidian Web Clipper to export chats from the ChatGPT and Claude web versions to nicely formatted Markdown files. You can save the Markdown to Obsidian or download it as a standalone file. It doesn't support Gemini yet, though.
2.5 has been amazing for programming. When I'm lazy, I just send it the entire repo as context and then ask for entire modified files back with the (medium-sized) change. It almost always works! I'd like to start using Cursor or some VS Code extension to do this from the IDE itself.
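The "entire repo as context" step can be sketched as a small shell loop, assuming a git repo; the `*.py` glob and the `context.txt` name are arbitrary choices for illustration. Prefixing each file with its path lets the model refer back to (and return) whole files by name:

```shell
# Concatenate tracked .py files into one context file,
# with a path header before each file's contents
git ls-files '*.py' | while read -r f; do
  printf '===== %s =====\n' "$f"
  cat "$f"
  printf '\n'
done > context.txt
```

You would then paste context.txt into the chat and ask for complete modified files back.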
Is it because Google is feeding the model information about you, so it knows more about the responses you'd like? Just like Google does with search history?
It's interesting that different models evoke such distinct personalities. I agree, sometimes the excessive enthusiasm can be distracting. A concise, focused response is often more valuable, especially for technical tasks. I find that a clear system prompt can really steer the model's behavior, like you mentioned.
When you have to justify your spend to public shareholders, it makes it much more difficult to spend tokens on "great question!", verbal tics, and whatnot.
> Next, in response to a question about the vulnerabilities in the Salt Typhoon description, Sec-Gemini v1 outputs not only vulnerability details (thanks to its integration with OSV data, the open-source vulnerabilities database operated by Google), but also contextualizes the vulnerabilities with respect to threat actors (using Mandiant data).
I remain skeptical about LLMs in this space, although I might be proven wrong, as often happens. Nevertheless, OSV has already been a big advance, so it is great that it gets further commitment.
Is this a "model" as in a set of transformer weights that inherently does security work, or is it a system that combines an LLM (for question interpretation, synthesis, and output presentation) with data lookups and/or other tools?
From the description re data integrations it sounds like the latter, unless the data mentioned is in fact used for training.
The distinction is important because a security-tuned model will have different limitations and uses than an actual pre-built security LLM app. Being an app also makes benchmarking against other "models" less straightforward.
It always blows my mind that nobody at Google thought it would be a good idea to very carefully review the AI's answer.
In the second screenshot, the prompt asks about CVE-2024-3400, and at first glance the answer appears OK.
But in the affected systems section it states:
> Also Hitachi Energy RTU500 firmware and Siemens Ruggedcom APE1808 firmware.
I cannot find any reference indicating that this Hitachi device is vulnerable to that CVE. Hitachi has a nice interface listing all vulnerabilities of their devices, and this CVE is not among them.
Any mention of Hitachi is also missing from the Mitigation section. Almost as if this device is not vulnerable.
There is some more weirdness, too: for example, it doesn't mention that the "portal" feature is also vulnerable.
Thanks for looking in depth at our post. The Hitachi RTU500 mention is not a hallucination; we did check for those. It is mentioned in the Mandiant threat intelligence data.
I'm always torn when it comes to LLMs and analytical tasks. When you perform an analytical task, whether it is something simple like assessing the potential risk and impact of a vulnerability or something complex like analyzing an obfuscated malware sample to determine its capabilities, you have to thoroughly go over the data points available to you and corroborate the data points or evidence you are using to come up with conclusions. LLMs can help with a lot of this, but you still have to go over their (mostly black-box) reasoning or backtrack their work before you can accept their conclusions.
In other words, even with humans, skills and experience are never enough. They have to show the reasoning behind their conclusions and then show that the reasoning is backed up by an independent source of fact. Short of that, you can still perform analysis, but then you must clearly state that your analysis is weak and requires more follow-up work and validation.
So with LLMs, I'm torn because they kind of make your life a lot easier, but does it just feel that way, or are they adding more work and uncertainty where that is intolerable?
Could be great for augmenting a cybersec professional's tasks; I'm certainly interested in trying it. However, I fear it will not be used as just one of the tools in the toolbox, and rather it will be used as something to defer (and consequently shed liability) to.
Using AI systems for high-speed security actions, proactive and reactive, seems necessary but not sufficient:
I expect attackers will also use AI systems, trained on the latest in effective attacks. What about defense would make defenders' AI systems more effective than attackers'?
I think it's necessary because, if the attackers use AI systems then the defenders need to keep up.
Also, we need to be creating far more secure systems to start with. Right now it is, to a degree, security through obscurity: something is secure when attackers can't find the bugs fast enough. Security through obscurity wouldn't seem to work well when the attacker uses AI software.
I read the article, and while it's great that the model can generate relevant output, so what? The article doesn't discuss any action being taken based on that output.
Y_Y | 1 year ago:
I love this. When ChatGPT compliments me on my great question or tries to banter it causes me great despair.
[1] https://chatgpt.com/share/67f055c8-4cc0-8003-85a6-bc1c7eadcc...

HelenePhisher | 1 year ago:
Ask Claude to generate a .md of the conversation; it will do that, with the option to download it or a PDF of it. A lovely but well-hidden feature!
occamschainsaw | 1 year ago:
https://github.com/obsidianmd/obsidian-clipper
amitport | 1 year ago:
Specifically, in their own example they are just citing Mandiant, which may itself be wrong...
https://news.ycombinator.com/item?id=43595294
majestik | 1 year ago:
So what’s the big breakthrough here?