Launch HN: Overwatch (YC S22): OSINT platform for cyber and fraud risk

164 points| Bisen | 1 year ago

Hey HN! Arjun and Zara here - cofounders of Overwatch (https://www.overwatchdata.ai), a platform to automate OSINT and threat intel, turning it into actionable insights. Check out our clickthrough demo here: https://app.storylane.io/share/qyayvtamapis.

Overwatch began when we were working with risk and threat intel teams at Google, Stripe, and government. We experienced the immense challenge every fraud and cyber threat analyst faces: manually parsing through an ocean of data to find valuable insights and filter out the noise. This included using many of the feeds and tools out there that were often very expensive, noisy, keyword-based, and lacked accurate entity extraction or advanced query features.

Most threat intelligence tools rely on thousands of keywords and teams of analysts who manually sift through torrents of alerts. These alerts are usually individual posts that match some keyword, drawn from news, social media, and deep and dark web sources. This approach produces many false positives, and it takes many hours of wading through them to figure out what intel matters most to our users, why, and what they can do next.

Overwatch uses an alternative approach by layering AI agents and NLP techniques, including a combination of multifarious datasets, cluster analysis, topic modeling, Retrieval Augmented Language Models (RALM) and domain knowledgeable agents.

This allows us to (1) filter through OSINT in real time to identify events and narratives that matter to our users, and write reports on what they could do about it; (2) identify dark web and deep web threats, fraud methods, new tactics, compromised accounts, stolen checks, and credentials affecting our users or their peers; (3) send an alert any time a 3rd party supplier or part of the tech stack is impacted by a widely exploited vulnerability, ransomware attack, or breach; and (4) track malware and ransomware groups that are actively targeting your industry, including Indicators of Compromise (IOCs).

Our intelligence is actionable because the alert comes with the context and important details that an analyst needs to make an informed decision. Being AI-native, we also have a range of chat and data visualization features to effectively function as an intel co-pilot or industry expert. Finally, our in-house intelligence analysts and investigators can assist threat intelligence teams with HUMINT investigations and darkweb acquisition.

Our current customers include internet platforms, financial institutions, and supply chain companies. Within a day of one breach, one of our customers used Overwatch to surface 18,000+ leaked credentials. Another used us to surface fraudulent checks and learn exactly how threat actors were targeting their specific product features.

Our website says “Request a demo” but if you want to poke around on a very basic example of how we’re aggregating dark web, deep web, social, and surface web, log in at https://app.overwatchdata.io/ using these credentials: username: try_overwatch@overwatchdata.io pw: HelloHNWorld

That login is for an un-personalized feed of cyber threat intel (breaches, vulnerabilities, ransomed organizations, and industry updates) that gives you a flavor of not just the kind of information we can collect, but more importantly, how our technology prioritizes, clusters, and summarizes alerts for cyber / fraud analysts. Try the chat agent on the left-hand side to parse through the data.

Or sign up for a longer trial and preview of our email alerts: https://xryl45u9uep.typeform.com/to/pvtZQyS0. You can also check out our clickthrough demo for dark and deep web intelligence: https://app.storylane.io/share/qyayvtamapis.

Integration options range from simple dashboard access to our API for those who want to weave our intelligence directly into other products. Pricing is dependent on how complex a threat landscape our users want to monitor and we’re still figuring out how to standardize this but we’ll always do our best for the HN community.

Since the platform is AI-powered, it can also be used for news monitoring, supply chain disruptions, regulatory monitoring, or social media monitoring. We’ve had a lot of experience wrangling text-based feeds and using numerous AI models (from embeddings and entity extractors to LLMs) to filter, categorize, cluster, and analyze the data into something meaningful - so let us know if you’d like to nerd out or have had any particular challenges. Looking forward to your feedback and questions! Thanks, HN!

89 comments

devmor|1 year ago

Using RAG is definitely a relief factor after reading that you're using AI and NLP for aggregate analysis, but I'm curious how much manual review this actually saves?

Since the model summaries would still need to be validated against the source results manually, your business' actual viability as a product hinges on whether customers perceive a significant time savings in the data provided via these channels over historical aggregation methods (like keyword analysis that you mentioned) and level of false positives.

What do you measure as the largest impact here? Is there a large time savings, is it additional discovery from blindspots that other methods don't cover? Both? Are there additional benefits you see to this model beyond automation and expanded discovery?

Bisen|1 year ago

Some of our customers said they spend around 3 hrs a day on navigating new vulnerabilities alone. Step 1: wading through info from some easy and some hard to access sources. Step 2: trying to bring together all the most relevant information, e.g. detecting that your payroll provider got ransomed is just one step; then you have to research the group, any indicators that you might also be infected, etc. Step 3: what do I do next? E.g. adding relevant hashes to VirusTotal.

We not only help with relevant detection but also automate some of the next two steps, bringing the total time spent down to a few minutes a day.

nycdatasci|1 year ago

I would expect a landing page to show summaries. Largest organizations impacted in last 7 days, most active exploit, etc. Instead all I see are events - apparently including tweets as a source - with minimal context. Just do what you advertise. Show me the latest breaking info from the dark web. Who is impacted, how much, and what was the vector? Better to be sorted by magnitude of impact rather than strictly chronologically. Bonus points if you consider when the user was last logged in to your platform: for people that last viewed your content a month ago, here are the biggest events from the last month. Same for weekly, daily frequencies.

That said, love the initiative and focus on this space and there’s probably an opportunity to sell your data to hedge funds.
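
The ordering suggested in this comment (rank by magnitude of impact, windowed by when the user last logged in) could be sketched roughly as follows. This is only an illustration: the `Event` type, the numeric `impact` score, and the sample events are hypothetical and not taken from Overwatch's product or API.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Event:
    title: str
    occurred_at: datetime
    impact: float  # hypothetical severity score; higher means bigger impact


def digest(events, last_login, now, limit=3):
    """Return the highest-impact events that occurred since the last login."""
    recent = [e for e in events if last_login < e.occurred_at <= now]
    # Sort by impact (descending) rather than strictly chronologically.
    return sorted(recent, key=lambda e: e.impact, reverse=True)[:limit]


# Made-up sample data, for illustration only.
now = datetime(2024, 6, 12)
events = [
    Event("Payroll vendor ransomed", now - timedelta(days=2), 9.1),
    Event("Minor forum leak", now - timedelta(days=1), 3.0),
    Event("Old breach", now - timedelta(days=40), 9.9),
]

# A user who last logged in a week ago sees only the last week's events.
weekly = digest(events, last_login=now - timedelta(days=7), now=now)
```

A user who was last seen two months ago would get the same call with an earlier `last_login`, which pulls the older, higher-impact event to the top of the list.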

Bisen|1 year ago

Great points - can say this is all 'coming soon'! One note: because this profile isn't personalized to a particular user's products, tech stack, and 3rd party / supply chain, it's especially chaotic. For a more tailored profile, the social, news, and dark web posts all cluster around specific events, since there are far fewer critical events of interest to a specific user. No excuses, just sharing for background. Interesting point on the hedge fund use case; we haven't been able to find a good user / persona to interview about that and would love any suggestions if you have any. Thanks again for checking us out.

patchorang|1 year ago

Sort of off topic question. But how would you get into the type of work that uses this tool? I've always thought this type of work would be interesting, but I have no ideas where to start. What are the job titles? Fraud Analyst? Thanks!

PenguinCoder|1 year ago

Titles vary greatly but the general domain is information security and cyber security. Here is a good primer on basic qualifications and duties for a given role (generic, not tuned to a specific company) - https://www.cyberseek.org/pathway.html

This type of information (OSINT of vulns/CVEs, proofs of concept) is useful for the Blue team side of defending against attackers. With easy to access information in a timely manner, the defenders can proactively put roadblocks and alerts into place for vulnerabilities, as opposed to AFTER they are popped/hacked by such.

Prevention is ideal; detection, a must.

Tao3300|1 year ago

There's no getting into anything you haven't already been doing 6+ years right now.

zara2|1 year ago

I love it! We’ve seen the job title change a bit depending on the sector but can be under fraud analyst, threat intel analyst, risk analyst, financial crime analyst, sometimes trust and safety teams.

edm0nd|1 year ago

What is the pricing per monitored keyword?

I know platforms like Flare are cool but when you need to monitor hundreds or even thousands of corporate keywords, domains, and assets, it becomes cheaper for CTI to just write the tools themselves.

What does your platform look like in regards to this and pricing?

For example, your pricing for monitoring 100 keywords and pricing for monitoring 500 keywords.

200k unique telegram channels is an interesting stat.

Each Telegram account (if paid Premium account) can only be in 1k channels and groups max. To monitor 200k unique channels/groups, you have a network of at least 200 paid Telegram accounts continuously monitoring? Are you using Pyrogram or Telethon for this? Are these accounts owned by you (Overwatch) or are you just using a bunch of 3rd party Telegram intel feeds?
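
The back-of-the-envelope estimate in this comment can be checked with a few lines of Python. Both inputs are the commenter's own figures (200k channels, and a cap of 1k channels/groups per paid account); neither is verified here, and `min_accounts` is just a hypothetical helper name.

```python
import math


def min_accounts(channels: int, per_account_limit: int) -> int:
    """Smallest number of accounts needed if each can join at most
    `per_account_limit` channels and no channel is covered twice."""
    return math.ceil(channels / per_account_limit)


# Figures quoted in the comment above (assumptions, not verified limits).
accounts_needed = min_accounts(200_000, 1_000)
print(accounts_needed)  # 200
```

This is a lower bound only: any redundancy, banned accounts, or overlap between accounts would push the real number higher.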

Bisen|1 year ago

We totally hear you and that’s why we don’t really charge by keyword but instead look at how many agents we need to deploy / use cases we build towards. 1,000+ assets is the norm for our users. Would you be interested in connecting on a call so we can better understand your use cases and tell you more about how it works?

sncsy|1 year ago

I worked with Arjun in trust and safety / risk at both Google and Stripe. He’s not only an expert in the space, but is incredibly users-first. If you’re looking for a product like this and want a great partner, Arjun and team are it!

jonnyparris|1 year ago

Congrats on the launch! As design feedback, the demos don't seem to pass the "squint test" for intuitively surfacing the most important information / actions on the screen. Maybe a more specific walkthrough of a redacted / hypothetical scenario that's focused more on the user's decision-making process & actions instead of the kitchen sink of product features would better illustrate how/why things are laid out as they are currently.

ssahoo|1 year ago

Also, Storylane-type demos suck. A simple video that I can rewind would be preferred.

ThinkBeat|1 year ago

This feels a bit like a turbo RSS reader that plows through some easy and some difficult to access information and actively selects and targets it to subscribers?

Bisen|1 year ago

That's definitely the vibe of the example we're presenting. But for specific customers, the agent can cluster and bring additional relevant context to each of the 'events', and even recommend actions / automate certain actions, e.g. if you detect a compromised account, send it to team X.

chicagojoe|1 year ago

FWIW, clicking around, there are some odd display issues in the "References" (https://attack.mitre.org/techniques/T1486)

It looks like you're embedding data from Twitter - are you paying for decahose/enterprise access or just paying for a low volume of high value tweets (i.e. I'm seeing many from FalconFeedsio, DailyDarkWeb)

Bisen|1 year ago

I can jump in on the data source question - we can track specific accounts and keywords on Twitter; we aren't paying for the full firehose yet. We also track the original ransomware and dark web sites and blogs being referenced, and we're still figuring out how best to cluster them all into the same event. Thanks for checking it out!

candiddevmike|1 year ago

What is your detection hit/miss rate? What happens when you miss something?

Seems like this is going to become a cat and mouse game similar to evading AV.

zara2|1 year ago

Great point! From some of our case studies we see users catch 25% more 3 days earlier than other solutions.

To your point, catching every threat or every alert, especially on the dark web, is always a cat and mouse game. We see it as a prioritization problem: how do you mitigate the biggest risks quickly?

The existing OSINT tools we used are keyword search based / pretty noisy, so we’ve been focusing on the idea that, given there’s no way analysts can find or triage every alert, how do you catch the biggest stuff. We do a few things, from AI crawlers that continue to expand data collection to AI categorization, clustering, data extraction, etc., to make it easier to track and cover the most ground.

pbrum|1 year ago

Very interesting. It sounds like the tool is broadly powerful in combining a threat intel dashboard + news digest processor + AI features to better customize the output of the first two. The details of the API output will be important to many of your customers, as will the richness of the sources covered (forums and Telegram channels often die out and the "buzz" starts to happen in a different place, etc). Like some other commenters said, this is a fairly vendor-saturated space, so as a buyer I'd be looking for sharply presented distinction factors, beginning with price rather than AI (which is still a good thing to have).

I have a lot of experience with this kind of tool and workflow from at least three perspectives: internal builds; vendor; and consumer of vendor products such as this one. Happy to talk more if you're interested

Bisen|1 year ago

I'd love to take you up on that offer. HN doesn't reveal emails but we have one listed here that directs straight to me, in case you're able to drop us a line: https://www.overwatchdata.ai/request-a-demo Looking forward to learning from your experience!

ssahoo|1 year ago

Congratulations on the launch. I noticed that you guys are SOC2 type II certified. I wonder how you achieved that so fast?

Bisen|1 year ago

Thank you! It definitely didn't feel quick. We had a 3 month audit window and we used Vanta.

skilled|1 year ago

This looks great, I am seeing a lot of potential use cases here. It also pools together a lot of the stuff that goes unreported in the news.

Would this be a service you would ever offer to regular researchers?

zara2|1 year ago

Definitely! Would love to chat more. A lot of our users want more customized monitoring / agent versions of this, and our dashboards are pretty easy to customize.

Opening up certain components of the platform is something we are definitely looking into and passionate about.

waihtis|1 year ago

Pretty cool, like the fact you tie all types of feeds (threats, vulns etc) together into a single view. How are you guys different from the plethora of other TI platform vendors out there?

Bisen|1 year ago

In 3 important ways besides price:

-Personalization: The platform can be fully customized to your interests, e.g. 3rd party vendors, tech stack, products, peers or industry. It’s like having your personalized threat intel org that cuts through the noise. Each alert is ranked and tailored to your interests.

-Customization: The platform can be used for a range of use cases, with agents undertaking tasks ranging from finding and extracting check, loan, and credit card fraud, risky narratives about your brand, through to breaches or emerging ransomware groups targeting your tech stack or vendors. Those agents can even identify breaking events that could be near your assets.

-So what and now what: Each report provides finished intel, whether finding or extracting relevant indicators of compromise, background on the threat actor and victim, compromised credentials, or compromised credit cards and checks. We can even automate workflows through integrations or creating cases/escalations for specific teams.

abtinf|1 year ago

How is this different from Interpres?

Bisen|1 year ago

In 3 important ways besides price:

-Personalization: The platform can be fully customized to your interests, e.g. 3rd party vendors, tech stack, products, peers or industry. It’s like having your personalized threat intel org that cuts through the noise. Each alert is ranked and tailored to your interests.

-Customization: The platform can be used for a range of use cases, with agents undertaking tasks ranging from finding and extracting check, loan, and credit card fraud, risky narratives about your brand, through to breaches or emerging ransomware groups targeting your tech stack or vendors. Those agents can even identify breaking events that could be near your assets.

-So what and now what: Each report provides finished intel, whether finding or extracting relevant indicators of compromise, background on the threat actor and victim, compromised credentials, or compromised credit cards and checks. We're training our agents to be even more specific with answers e.g. "what IOCs relate to malware groups most active in the airline industry". We can even automate workflows through integrations or creating cases/escalations for specific teams.

redman25|1 year ago

How do you compare to incumbents like Blumira? What does your mitre coverage look like?

Bisen|1 year ago

EDRs are a great way to help secure endpoints, but high fidelity threat intel that is tailored to your environment and org's needs can help increase awareness and shine light on potential security blindspots. This is especially critical when the threats are ever evolving and time to exploit is decreasing year over year. Qualys in a 2023 report stated that "25 percent of these security vulnerabilities were immediately targeted for exploitation, with the exploit being published on the same day as the vulnerability itself was publicly disclosed." They offer some outside-the-perimeter threat coverage, but by reputation it’s a weakness and narrowly targeted to your organization's credentials and vulns, and orgs usually still need a threat intel provider.

For example, one of our users who already uses an EDR may not know about a 3rd party that’s been ransomed by a threat actor, e.g. APT 73. An alert from Overwatch saying a 3rd party has been compromised will also include information about recent IOCs, e.g. hashes and file extensions attributed to that threat actor, so that the user can add them to VirusTotal and scan internally to make sure they haven’t been compromised. This is an example of how EDRs and threat intel can work in concert.
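
The "scan internally" step described above can be illustrated with a minimal sketch: hash local files and compare them against a set of IOC hashes taken from an alert. This is an assumption about what such a check might look like, not Overwatch's or any EDR's actual mechanism; `find_matches` and its parameters are hypothetical names, and only Python's standard library is used.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hash a file in chunks so large files do not need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_matches(root: Path, ioc_hashes: set[str]) -> list[Path]:
    """Return files under `root` whose SHA-256 appears in the IOC set."""
    known = {h.lower() for h in ioc_hashes}  # normalize hex casing
    return [
        p for p in sorted(root.rglob("*"))
        if p.is_file() and sha256_of(p) in known
    ]
```

Calling `find_matches(Path("/some/dir"), hashes_from_alert)` would list any files matching the supplied hashes. A real deployment would more likely push the hashes into existing EDR or scanning tooling than walk the filesystem like this.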

airstrike|1 year ago

How is this different from RecordedFuture? Other than the obvious RAG capabilities

Bisen|1 year ago

RF contracts are heavily services based and cost up to 7 figures for tailored intel across a number - we deliver a similarly personalized experience but for a fraction of the price. AI agents can also do a range of additional and customizable tasks, e.g. bringing together relevant context about a threat actor, tracking fraud methods, compromised checks and cards, narrative analysis, geopolitical disruptions, etc. They can also be automated to create new escalations and actions through integrations. It's like having RF data as well as digital analysts to do a lot of the leg work for you.

scrollaway|1 year ago

Would love to chat with one of the founders. Can you send me an email? (profile)

dang|1 year ago

Just an offtopic heads-up that emails in HN profiles aren't visible to other users, only to admins.

If anyone wants to share an email address with other users, it needs to go in the About box.

thecleaner|1 year ago

Wow. That's a really fresh concept and an important need. Congrats on the launch.

baxtr|1 year ago

Thanks. Who is the target user for this kind of tool? A CISO team member?

Bisen|1 year ago

Our current users are CISO's security ops team, threat intel team, blue team, fraud strategy or fraud intel team. Hope that helps!

guyseneca|1 year ago

Seems like alerting in threat intel is getting disrupted by AI - Cool.

1oooqooq|1 year ago

all my problems with risk analysis tools are false positives. adding AI just sounds like there will be more of them and it will be harder to figure out when they happen.

Bisen|1 year ago

Our whole aim is to make you happy by downranking those false positives while remaining explainable. We blogged about the explainability part in case you're interested: https://www.overwatchdata.ai/blog/the-imperative-of-explaina...

artembugara|1 year ago

Arjun and Zara are amazing! They’re our batch and group office hours mates from YC S22.

We (https://www.newscatcherapi.com/) also serve the same use case but only for the news analysis part. And we don’t really have a UI: it’s all data accessible via an API.

I see a lot of questions here about comparing Overwatch to other OSINT tools. The ability to customize/personalize is a huge difference.

In my experience, clients with the most expensive problems are super underserved because there are no “Palantir-like” solutions.

Don’t get me wrong: you don’t have to do consulting; just tweak the onboarding/setup. Making bespoke solutions for the companies with the biggest problems is a great way to get into the market. And it can work at a huge scale. E.g. Palantir.

An example of a very typical situation at NewsCatcher: a big bank is absolutely blown away because we can actually find news about the private companies they need to track, with minimal false positives. And all we have to do is tweak our entity disambiguation module a bit to work with the data points that the bank actually has.

perch56|1 year ago

Congrats on the launch! Not to sound negative about it but you do realize Overwatch is a trademark of Blizzard Entertainment …

adi_lancey|1 year ago

Great team, great launch!

martinbaun|1 year ago

Congrats, looks very good!

I would hesitate with the name though. Overwatch is also a game series from Blizzard.

And Blizzard is known to be a little sue-addicted.

zdw|1 year ago

AI tool with the same name as a game where a robot uprising is one of the main backstories seems like it could be a bit problematic...

keepamovin|1 year ago

I disagree - I think the name is great.

Consider Venn diagrams: the audiences for these two homonymous products have a small overlap. That is further moderated when you consider that term stickiness to the respective meanings is only high for a small fraction of that audience.

In other words, most people aware of both can cope with a mutual name. And some may even think it's cool. Each name enhancing the other through association and analogy.

Overwatch is definitely the right choice. Consider the 'OG' meaning originates from military terminology. In this context, "overwatch" refers to a tactical position where one unit provides covering fire and surveillance for another unit as it moves forward or performs an action. The overwatch position is typically elevated or strategically placed to have a clear view of the battlefield, allowing the overwatching unit to detect threats and engage enemies to protect the advancing or exposed units.

This concept ensures that the moving or vulnerable unit can operate with reduced risk, as the overwatching unit can neutralize potential dangers and provide critical information about the surroundings. The practice of overwatch is a fundamental tactic in military operations, emphasizing teamwork, communication, and strategic positioning.

zelias|1 year ago

Maybe split the difference by naming Sombra the CTO

marcus0x62|1 year ago

It is also, and more to the point of being a real trademark issue, the name of a managed threat hunting service from Crowdstrike.

technick|1 year ago

Doesn't the government have a program named overwatch as well? Something in the air force cyber transport division...

jacques_chester|1 year ago

Trademarks are namespaced by subject matter.

prakashn27|1 year ago

How many of you thought it was the "Overwatch" multiplayer game? :-)

aodonnell2536|1 year ago

Now there’s “Overwatch” the game, “Overwatch” the anticheat platform, and “Overwatch” the OSINT platform!