top | item 46031208

Syd – An offline-first, AI-augmented workstation for blue teams

21 points| paul2495 | 3 months ago |sydsec.co.uk

Hi HN, I’m Paul. I’m building Syd, an offline-first forensic workstation that orchestrates tools like YARA and Nmap through a GUI, using a local LLM to analyze the results without leaking data. It runs completely offline on localhost—no data is ever sent to the cloud, making it safe for sensitive investigations.

Here's a demo: https://www.youtube.com/watch?v=8dQV3JbLrRE.

I built this because while tools like YARA are powerful, managing rule sets and decoding hex strings is slow. AI is great at explaining malware signatures, but I couldn't use ChatGPT for my work because pasting potential malware or sensitive logs into a web form is a massive security risk. I needed the intelligence of an LLM but with the privacy of an air-gapped machine.

Under the hood, it’s built on Python 3. I use subprocess to manage the heavy lifting of the scanning engines so the UI (built with CustomTkinter) doesn't freeze. The "secret sauce" isn't the AI itself, but the parser I wrote that converts the unstructured text output from YARA into a structured JSON format that the local LLM can actually understand and reason about.

I’ve been using it to triage files for my own learning. In one case, Syd flagged a file matching a "SilentBanker" rule and the AI pointed out specific API calls for keylogging, saving me about 20 minutes of manual hex-editing. In the demo video linked, you can see this workflow: scanning a directory, hitting on a custom YARA rule, and having the local AI immediately analyze the strings.

Through this process, I learned that "AI wrappers" are easy, but AI orchestration is hard—getting the tools to output clean data for the LLM is the real challenge. I'd love to hear if there are other static analysis tools (like PEStudio or Capa) you consider essential for a workstation like this, or how you currently handle the privacy risk of using AI for log analysis.

5 comments

order

paul2495|3 months ago

Author here. Happy to answer questions!

A bit more context on how Syd works: it uses Dolphin Llama 3 (dolphin-2.9-llama3-8b) running locally via llama-cpp-python. You'll need about 12-14GB RAM when the model is loaded, plus ~8GB disk space for the base system (models, FAISS index, CVE database). The full exploit database is an optional 208GB add-on.

What makes this different from just wrapping an LLM, the core challenge wasn't the AI—it was making security tools output data that an LLM can actually understand tools like YARA, Volatility, and Nmap output unstructured text with inconsistent formats. I built parsers that convert this into structured JSON, which the LLM can then reason about intelligently. Without that layer, you get hallucinations and garbage analysis.

Current tool integrations: - Red Team: Nmap (with CVE correlation), Metasploit, Sliver C2, exploit database lookup - Blue Team: Volatility 3 (memory forensics), YARA (malware detection), Chainsaw (Windows event log analysis), PCAP analysis, Zeek, Suricata - Cross-tool intelligence: YARA detection → CVE lookup → patching steps; Nmap scan → Metasploit modules ready-to-run commands

The privacy angle exists because I couldn't paste potential malware samples, memory dumps, or customer network scans into ChatGPT without violating every security policy. Everything runs on localhost:11434—no data ever leaves your machine. For blue teamers handling sensitive investigations or red teamers on client networks, this is non-negotiable.

Real-world example from the demo syd scans a directory with YARA, hits on a custom ransomware rule, automatically looks up which CVE was exploited(EternalBlue/MS17-010), explains the matched API calls, and generates an incident response workflow—all in about 15 seconds. That beats manual analysis by a significant margin.

What I'd love feedback on:

1. Tool suggestions: What other security tools would you want orchestrated this way? I'm looking at adding Capa(malware capability detection) and potentially Ghidra integration. 2. For SOC/IR folks: How are you currently balancing AI utility with operational security? Are you just avoiding LLMs entirely, or have you found other solutions? 3. Beta testers: If you're actively doing red/blue team work and want to try this on real investigations, I'm looking for people to test and provide feedback. Especially interested in hearing what breaks or what features are missing.

  The goal isn't to replace your expertise—it's to automate the tedious parts (hex decoding, correlating CVEs,explaining regex patterns) so you can focus on the actual analysis. Think of it as having a junior analyst who never gets tired of looking up obscure Windows API calls.

  Check out sydsec.co.uk for more info, or watch the full demo at the YouTube link in the original post.

properbrew|3 months ago

Hey, I watched your video a few times and really like the idea. Is the inferencing being done on the CPU, do you support GPU as well?

The idea is solid and I like the direction you’re going with it, but the demo doesn’t really show it off. There’s a lot of jumping around in the UI and it’s hard to follow what’s happening without any audio. The interesting bit is right at the end when the rule gets generated, but it’s over so fast that you don’t really get a feel for what Syd is actually doing under the hood.

It was a bit hard to follow with no audio, just a simple “here’s the scan running, here’s the parser kicking in, here’s where the model steps in” kind of thing. Even speeding up the slower parts would make it easier to see the flow. Right now it feels more like a screen recording than a walkthrough. When you’ve spent hundreds of hours inside something it all feels obvious, but for someone seeing it for 3 minutes it’s tough to piece together what’s happening. Been there myself.

The automation angle you mentioned in the post is the part that really sells it. If the tool can take a directory, scan it, parse, correlate and then spit out the rule with almost no manual copying, that’s the kind of workflow improvement I (and maybe others?) care about. The video doesn’t quite show that yet, so it’s hard to judge how smooth the actual experience is.

I’m not against backing something like this, especially as it runs locally and handles the annoying parts. £250 is fine, but at the moment the payment page is just a Stripe form with no real signal that the thing is ready or actively maintained. A clearer demo, a roadmap, or even a short narrated “here’s the state of it today” would go a long way in building confidence.

Apologies if this comes across a bit direct. The idea is solid though. Local LLM + structured output from real security tools is genuinely useful. Keep going.

codethief|3 months ago

Came here because I thought this might be related to https://git.sr.ht/~alip/syd / https://gitlab.exherbo.org/sydbox/sydbox , which has been discussed here on HN various times over the years.

paul2495|3 months ago

Thanks for the links different project though. Those are sandboxing and syscall-monitoring tools, while my Syd is an offline AI assistant built for security workflows (DFIR, pentesting, malware triage, tool-output reasoning, etc.).

Completely unrelated codebases, just happens to share the same name.