
What can we learn from the matrix.org compromise?

84 points | cyber | 6 years ago | medium.com

73 comments


pm90|6 years ago

This is such a poorly written article:

* no detailed analysis of how the attack was undertaken. It's not even clear how the attacker managed to get in (was it a publicly exposed Jenkins? A vulnerable bastion? What?)

* no analysis of what the existing matrix.org security perimeter looked like or how it could be made better.

* repetition of security tropes. Use a VPN. Use GitHub Enterprise (wait, wtf? Why not private repos on GitHub?). Don't use Ansible, use Salt.

Ridiculous. I was looking forward to a nice long read about how this breach was undertaken. Hugely disappointed.

bifrost|6 years ago

If you click through to the GH issues I linked to, there are some pretty good data points as to what happened. I didn't feel the need to copypasta.

But yes, a publicly exposed Jenkins and public repos led to the compromise; not an uncommon story, unfortunately.

Perimeter - I didn't see much evidence of one existing and I didn't go probing their networks to find out.

Security tropes are real for a reason, you don't have to believe me though.

Private repos on GitHub are still publicly hosted and are orders of magnitude easier to get into than an in-perimeter repo. They've leaked before and they'll keep on leaking. GitHub even made it harder for people to fork private repos to their own public accounts, but it still happens.

nobatron|6 years ago

There's a lot wrong with this article.

Firstly having a private network for your infrastructure isn't a one stop solution for keeping attackers out.

Secondly using Github Enterprise or self hosted GitLab doesn't make up for storing secrets in Git.

Looking forward to the proper write-up.

netsectoday|6 years ago

* this idiot claimed "Ansible was used to keep the attacker in the system," when in reality Ansible did what it was supposed to do by writing the correct authorized_keys file; the attacker leveraged an old default in the sshd config. This is an sshd config issue, not an Ansible issue.

The sales-pitch for Salt (against Ansible) is ridiculous and misguided.

I just checked out the Salt SSH module, and even if they had used Salt they would still have had this issue. The answer here is to not use the default /etc/ssh/sshd_config value of #AuthorizedKeysFile .ssh/authorized_keys .ssh/authorized_keys2. Uncomment it and remove authorized_keys2.
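For reference, the fix described above is a one-line change to the stock OpenSSH config (a sketch; the paths shown are the upstream defaults):

```
# /etc/ssh/sshd_config
# Stock default (commented out) honors two key files per user:
#AuthorizedKeysFile .ssh/authorized_keys .ssh/authorized_keys2
# Uncomment and drop authorized_keys2 so sshd only reads the
# file your configuration management actually controls:
AuthorizedKeysFile .ssh/authorized_keys
```

After editing, reload sshd (e.g. systemctl reload sshd) for the change to take effect.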

KirinDave|6 years ago

Why aren't people reporting the fact that Matrix.org actually lost control of their network a second time within hours of their first all clear sounding?

I feel like this is an important part of the story for anyone looking for teachable infosec moments.

bifrost|6 years ago

I guess I technically glossed over that, but I did say "One of the more interesting pieces of this was how Ansible was used to keep the attacker in the system". The attacker persisted via CM and their public repo; I'm actually surprised this doesn't happen more often.

driminicus|6 years ago

Because the second time was a DNS hijack, not a network compromise. I'm a little fuzzy on the details, but it had something to do with Cloudflare's API not revoking some access token.

Either way, a DNS hijack is not great, but not nearly as bad as the initial compromise.

Arathorn|6 years ago

The rebuilt infra wasn’t compromised; what happened was that we rotated the cloudflare API key whilst logged into CF with a personal account but then masquerading as the master admin user. Turns out that rotating the API key rotates your personal one, not the one you’re masquerading as, and we didn’t think to manually compare the secret before confirming it had the right value. Hence the attacker was able to briefly hijack DNS to their defacement site until we fixed it.

We will write this up in a full postmortem in the next 1-2 weeks.

Arathorn|6 years ago

If it wasn’t clear, this article wasn’t written by the Matrix.org team, nor did the author discuss any of it with us to our knowledge.

We’ll publish our own full post-mortem in the next 1-2 weeks.

Arathorn|6 years ago

also, reading this article more carefully, much of this is just plain wrong:

> One of the more interesting pieces of this was how Ansible was used to keep the attacker in the system.

Fwiw the infra that was compromised was not managed by Ansible; if it had been we would likely have spotted the malicious changes much sooner.

nisa|6 years ago

It's been a few years since I last used Saltstack, but if you have access to the master you have instant root on all minions, or did that somehow change? salt '*' cmd.run 'find / -delete' and game over?

bifrost|6 years ago

Very true, however I'd rather have that problem than an ever multiplying number of user accounts on systems that can su/sudo.

ubercow13|6 years ago

Why is it considered safer to expose a VPN to the internet than SSH? Is it just that there is one exposed service for the organisation rather than one per machine?

bifrost|6 years ago

SSH tunneling is handy, but if you want to push anything else over it, it's a pain for the "layperson". You're not going to have a great time supporting people with it. I've done it, it sucks. Scripts and special SSH config files are the pits. VPNs are way easier: they can support multiple access levels and roles, are often not blocked by other people's packet filters and firewalls, and the good ones can even validate that a host is in "compliance" before it's allowed onto the network.

closeparen|6 years ago

You can expose one SSH box per organization (a “bastion”) and deploy SSH configs to clients that make it look like you have direct access to the hosts behind it.
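A minimal sketch of that client-side setup using OpenSSH's ProxyJump directive (the hostnames here are hypothetical):

```
# ~/.ssh/config on each client
Host bastion
    HostName bastion.example.com
    User alice

# Hosts in the internal domain are transparently routed via the
# bastion, so "ssh db1.internal.example.com" looks like direct access:
Host *.internal.example.com
    ProxyJump bastion
```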

acct1771|6 years ago

That'd probably be a solid question for the people implementing and supporting WireGuard in the Linux kernel to cover.

krupan|6 years ago

Can anyone explain the Jenkins vulnerability that was used to initially gain access? Reading the CVEs didn't give me the impression that they enabled remote exploits

bifrost|6 years ago

My 5 second lazy summaries of the CVEs:

CVE-2019-1003001, CVE-2019-1003002 -> Anyone with read access to Jenkins can own the build environment.

CVE-2019-1003000 -> I didn't get a lot of the details on this but it basically looks like "broken sandboxing, you can run bad scripts".

This is also a good resource: https://packetstormsecurity.com/files/152132/Jenkins-ACL-Byp...

zimbatm|6 years ago

The attacker gained network access through Jenkins.

Don't deploy a public-facing Jenkins, especially if it has credentials attached to it. It's really hard to secure, especially if pull-requests can run arbitrary code on your agents.

Jenkins / CI is the sudo access to most organizations.

bifrost|6 years ago

I agree with you 100% here. I would not deploy any CI publicly unless it's heavily fenced off into "read only" territory.

r1ch|6 years ago

One thing I learned was where to modify the Pageant source code (the Windows equivalent of ssh-agent) to make my agent prompt before signing (with the default focus on "no"). This feels much safer and is a very minor inconvenience. I wonder why more agents don't have this built in.

Example: https://twitter.com/R1CH_TL/status/1118559239084158977

forgotmypw|6 years ago

I'd like to take this opportunity to plug my in-development decentralized, distributed, completely open forum, using PGP as the "account" system, and text files as the data store.

So any reasonably competent hacker can re-validate the entire forum's content and votes, reasonably quickly reimplement the whole thing, and/or fork the forum at any time.

http://shitmyself.com/

ficklepickle|6 years ago

This is very interesting! I have so many questions. If you see this, kindly send me an email. It's in my profile. I love the idea!

bifrost|6 years ago

Very Cool! I'll check it out!

inetknght|6 years ago

I have gone on some long verbal rants about the dark patterns (bordering on malicious behavior) exhibited by key agents such as SSH agent, GPG agent, Pageant, and the like.

What can you learn from the compromise? Never use an agent. Kill it with fire^H^H^H^H -9.

bifrost|6 years ago

The attacker still would have gotten their key in. TBH if you kill the agent people are just going to copy their keys with no passphrases around. Ask me how I know...

nine_k|6 years ago

How about using hardware tokens instead? With the right setup, private keys never leave them.

yjftsjthsd-h|6 years ago

Okay, I'll bite. What are you calling a dark pattern in assorted agents? Especially given that dark pattern implies intent to harm. (And I say this as someone looking at using an agent: If there's a gotcha, I'd like to know about it)