Launch HN: Synthetic Minds (YC S18) – Program Synthesis to Protect Dapps
99 points| saurabh20n | 7 years ago | reply
Current use of web-app-like development practices for smart contracts is not working. People have lost over 4.1M ETH due to code bugs in these contracts, aka decentralized applications (Dapps). That equals $1.8B at current exchange rates, or $500M at the time of loss. The execution environment is "deploy once, change never", which resembles hardware and space applications. NASA/Airbus use formal methods to check this kind of software. Dapps need similar tools.
I have worked on formal methods since 2006. In my PhD on program synthesis, I reduced checking code properties to theorems that could be proven mechanically. Incidentally, a full program was not required. The program could have "holes" and the prover would “fill” the holes by synthesizing code. I took those ideas to synthetic biology in my postdoc and previous company.
In 2016 when I started to look at smart contracts, I realized those ideas could help build robust Dapps, and maybe prevent them from losing $100M+/year. In June 2016, a Solidity vulnerability was discussed (June 10), but the DAO was deemed immune (June 12), right before it was exploited (June 17), losing ~$150M. In 2017, another ~$300M was lost to the Parity multisig bug. This year, I built a system to help mitigate future failures.
The core insight requires some background: Imagine you wrote a lossless compression algorithm `K` (e.g., LZ77 or LZW). Would it not be amazing to synthesize the decompressor `D` automatically? The invariant for `K+D` is the identity function. So if we had a `D` with holes, we could use a prover to verify the identity invariant and “solve for D” simultaneously. Now, what if your `K` was a blockchain smart contract, and `D` an arbitrary user contract? If we want K to be immune to double spends, we could ask for `K+D` that violates the invariant “K’s balance at start >= K’s balance at end”. Not only would we formally verify properties, we would synthesize a specific `D` that helps explain any violations.
Here's how it works: Our tool has `prove` and `autocode` modes. For a smart contract, `prove` validates its properties while `autocode` will generate user contracts that break it—if any exist. Our servers will aim for turnaround times of minutes allowing use in CI, instead of weeks-long human audits. Two demo screencasts are online: http://synthetic-minds.com/pages/faq.html.
There are 3 steps under the hood. The 1st step is a source-to-source compiler from Solidity to a shared-memory, non-object, explicit Blockchain-state Intermediate Representation (BIR). BIR is not Solidity specific, so we could compile Ethereum’s Vyper or Tezo’s Liquidity to it, but Solidity’s traction makes it a natural place to start. The 2nd step is a symbolic executor that converts BIR to an “equation”. Think of symbolic execution as execution that traverses every path and without explicit values. E.g., `func a(int x) { y = x + 1; z = y * y; return z; }` would translate to `a(x) = (x + 1).(x + 1)`. Symbolic execution allows us to convert entire programs into an SMT equation. SMT is a theorem proving language used by many automated solvers, e.g., Z3, CVC4. The most novel, 3rd step, augments a user contract with holes for function bodies. That encodes any arbitrary future user contract. The solver can fill holes that (dis)prove combined properties. This simultaneously validates global invariants, and provides readable code if problematic user contracts exist.
Formal methods are sometimes hard to use, but it turns out that for smart contracts we can bypass the major issues (false positives, esoteric bugs found, and difficult-to-explain findings). First, most false positives come from inevitable approximations that formal analyses have to make, so they can terminate while analyzing code that itself might not terminate, i.e., Undecidability of analyzing Turing-complete languages. Gas limits truncate executions, which means we get to skip those approximations. Second, when esoteric (weird and obscure) bugs cause failures in normal web services you have to weigh the time and cost of fixing it against the potential loss, which might be negligible. But because smart contracts are immutable, any esoteric code path can be fatal to the entire Dapp, so there is no such tradeoff. Lastly, in contrast with the infinite variability of interfaces for normal web services, the uniformity of blockchain actors provides a template, enabling us to synthesize code as developer-friendly explanations of behavior.
Over 12+ years of work, there is one question I have repeatedly asked: Can we trade off human insight at the expense of compute power? Current alternatives for securing contracts look for 10-20 anti-patterns, built from human insights observing past failures. In contrast, our approach asks for a functional invariant (e.g., no double spend), which might be 100x or more expensive to solve. Compute is cheap though, and our ability to find future failures is only limited by the amount we expend on `autocode`.
The first prototype is operational now, and I am sending out beta invites next week: http://synthetic-minds.com. Excited to hear thoughts from the community!
[+] [-] russdpale|7 years ago|reply
With that said, I don't want to be completely negative, what you are doing is a damned great idea and I think just about all of us can get behind these types of processes, as someone who is learning about solidity vulnerabilities (blockchain @ berkeley) right now it is kind of funny this post comes around today. I have kind of been fretting about how easy it is to write terrible smart contracts and I kept thinking, well how the hell does NASA do this? What about air frame manufacturing? That is how I want to develop my dApps, I don't want to deploy and hope, you know? I hate that feeling, not just because I don't want to lose anyone funds, but because I want to be proud of my code and use it for good.
I am really looking forward to checking this out, I can't see it at work but will check it out as soon as I get home! I think this is really important work you are doing. I think just about everyone who is responsible in this space knows doing contract development in a web development mind frame is dangerous and stupid. I even heard someone say the other day "agile smart contract development", like NO! I had an old drill sergeant tell me about combat "Slow is smooth, and smooth is fast", well turns out it applies to smart contract and blockchain development as well.
[+] [-] saurabh20n|7 years ago|reply
I care about the major complicated projects succeeding, e.g., 0x, Golem, Augur, Aragon, to name a few. They can go the DAO route, or I can do my part in helping. Help me get in touch with more projects if you're connected.
Open sourcing right off the bat will have the opposite effect. If it works at scale, we will be pitting the attackers against the developers. I would rather get this tool in the hands of developers first. I welcome other people developing the same :) - better yet: I've already developed a version, come help me make it useful for the community.
[+] [-] dumbfoundded|7 years ago|reply
[+] [-] saurabh20n|7 years ago|reply
[+] [-] westoncb|7 years ago|reply
This was pretty mind-blowing for me. (I haven't really looked into program synthesis so I may be slightly more susceptible ;) There was one part of it that seemed problematic from my (perhaps flawed) understanding, though.
So I guess the idea is we have K+D = I (and this is roughly equivalent to 'D(K(s)) = I(s)' where 's' is a string to compress), so if we come up with a proof of 'I'—which would be a path through theorem space starting from I'm not sure what axioms—and we know 'I' is equivalent to 'K+D', and we already know the value of 'K,' then the value of 'D' is implied.
My question is what is the equivalent here to the subtraction operation that would allow us to say D = I - K? It seems very surprising to me that K and D would be structured in such a way that their own proofs would necessarily have any relationship to the proof we derived for I, so that such a subtraction would work so cleanly...
[+] [-] saurabh20n|7 years ago|reply
What you do is solve `\exists D: \forall arr: D(K(arr)) = arr` where `arr` is an arbitrary length array/bitstream. See paper [1]. There is also really old code you can browse at the bottom of my personal page linked in profile (doesn't compile anymore coz of deprecated dependencies.)
The arithmetic analogy break down when solving. So you're not explicitly computing `I - K`, but instead letting the constraint solver pick values that satisfy the equation (hence SMT solving.) Also, you are absolutely right that there is no reason to have their structure be related to identity: But, the "damage" done by `K` is encoded in the state at the interface to `D` and `D` better undo that state to get to identity, which is what the solver would exploit.
[1] Path-based Inductive Synthesis for Program Inversion http://saurabh-srivastava.com/pubs/pldi11-pins.pdf
[+] [-] Animats|7 years ago|reply
In the "dApp" world, the top five by market cap are:[1]
- EOS, a competitor to Etherium. It's a platform for more dApps.
- VeChain Thor, which pivoted from a RFID tracking system for cargo to a "mobile wallet".
- Tron - "The platform is still in its infancy as is expected to reach the “Eternity” by 2023. In its final stage, the platform will allow content creators to directly monetize their content by creating their own token over the TRX blockchain."
- Omise - this is a "white label solution for mobile payments". It doesn't really use "dApps", it just used a token sale to raise money.
- Icon - a blockchain to integrate other blockchains.
None of these actually do anything.
If this is so great, is YC itself buying ICO tokens? Didn't think so.
[1] https://www.newgenapps.com/blog/top-ethereum-dapps-tokens-re...
[+] [-] akharris|7 years ago|reply
In this case, we've known Saurabh for years (this is the second time we funded him) and I've been talking to him about Synthetic Minds for close to a year. If you believe, as he and I do, that the story on crytpo has barely started, then what he is building will have a huge impact on the future state of the world.
That's a bet we'll take whenever we can.
[+] [-] saurabh20n|7 years ago|reply
I am working on a simple premise: The DAO (perpetually running, globally accessible, democratic kickstarter) would have been a good project to see succeed. It failed not because it was a bad idea, but because we rushed into it without tools like this one. Augur, Aragon, 0x, Golem, Filecoin are other projects that might be worthwhile. Also ZCash (doesn't have smart contracts yet, but you should really look into the work Ariel Gabizon and company are doing.)
You wouldn't want to write an linux kernel code without a compiler, would you? You shouldn't write immutable code without formal verifiers. Agree?
[+] [-] CryptoPunk|7 years ago|reply
And the fourth item on your list is wrong. It's OmiseGo, not Omise. Omise is the company behind OmiseGo, which is a project aims to develop an Ethereum (Plasma) side-chain to host a decentralized exchange.
[+] [-] 1996|7 years ago|reply
Some people have made millions, some brand new technology is created after a unique technological advancement that solves the byzantine general problem, even YC is investing --- yet old people cry "it is nothing but a fad!", so much that it becomes comical, like the black knight in the Monty Python (especially when it comes from someone confusing Dapps and Tokens)
Eventually, you guys will have to admit you were wrong. A bit like people who thought horseless carriage were a fad. It's ok, we all get things wrong someones. No bad blood.
Technology is here to stay. Many people are ideologically opposed to the libertarian roots of cryptocurrencyes. Many others regret not investing in time, or selling too early.
I get that. But maybe it's time to move on.
[+] [-] zitterbewegung|7 years ago|reply
What I had so far is at https://github.com/zitterbewegung/Belmod
[+] [-] saurabh20n|7 years ago|reply
[ Edit: Your idea is fantastic, btw. If you can find use cases where the size of contracts is small, it'll be amazing to have synthesis from test cases. ps: Do borrow some ideas from the way Sumit solves Flashfill. ]
My system is closed source, partially because it is computationally heavy and I'll spin up a server fleet for analyses, and also because I don't want it to be used for attacks, i.e., limit use to mostly the original developers of projects.
Connect with the email on the website. I'll look at your code a little later.
[+] [-] detroitcoder|7 years ago|reply
[+] [-] saurabh20n|7 years ago|reply
[+] [-] zebnyc|7 years ago|reply
[+] [-] saurabh20n|7 years ago|reply
The key distinction is building a synthesizer, not just a verifier. These systems are hard to use, and synthesized code is the key to making them accessible. Watch the demo videos. See if you agree.
On the verifier side: QSP is a protocol/marketplace for connecting people needing audits, with human auditors or free open source tools (Oyente and Mythril) [1]. My verifier is more comprehensive and future-proof at the expense of being compute intensive, as discussed on my FAQ [2].
[1] https://quantstamp.com/visual-guide-to-quantstamp
[2] FAQ: "How is this different from Ethereum's other tools ?" https://synthetic-minds.com/pages/faq.html
[+] [-] retr0yrus|7 years ago|reply
Is it not true that program synthesis will work, as long as a "contract" isn't too large or designed adversarially? Because if that weren't true you'd have a debugger for all circuits and programs (suitably restricted in computing power).
[+] [-] saurabh20n|7 years ago|reply
Yes. It should work. There's a good amount of work to be done though. :)
I don't follow what you mean by a "debugger" in this case.
[+] [-] DennisP|7 years ago|reply
[+] [-] saurabh20n|7 years ago|reply
Compare this to the alternatives of paying $20,000 - $500,000 for manual audits.
[+] [-] baby_wipe|7 years ago|reply
[+] [-] saurabh20n|7 years ago|reply
For bugs: The demo videos on the FAQ page show the kinds of bugs it can find, e.g., the one that crashed the DAO, and ones that simpler tools will be unable to find. Of course you are right, the holy grail is finding bugs that escape both human audits and other tools. Haven't run it on very large codebases yet. Working on it!
[+] [-] fembot__|7 years ago|reply
[+] [-] adamnemecek|7 years ago|reply
[+] [-] verdverm|7 years ago|reply
[+] [-] unknown|7 years ago|reply
[deleted]
[+] [-] saurabh20n|7 years ago|reply
[+] [-] ghostmaree|7 years ago|reply
[deleted]
[+] [-] aarushi45|7 years ago|reply
[deleted]
[+] [-] Salaheddin|7 years ago|reply
[deleted]
[+] [-] thebiglebrewski|7 years ago|reply
[+] [-] saurabh20n|7 years ago|reply