Show HN: Psychic - An open-source integration platform for unstructured data
122 points | jasonwcfan | 2 years ago | github.com
For example, we know that the pain of building new API integrations scales with the level of fragmentation and the number of competing "standards". In the current meta, we see this pain with a lot of AI startups who invariably need to connect to their customers' data, but have to support 50+ integrations before they even scale to 50+ customers.
This is the process for an AI startup to add a new integration for a customer:
- Pore over the API docs for each source application and write a connector for each
- Play email tag to find the right stakeholders and get them to share sensitive API keys, or to authorize an OAuth app. It can take 6+ weeks for some platforms to review new OAuth apps
- Normalize data that arrives in different formats from each source (HTML, XML, text dumps, 3 different flavors of markdown, JSON, etc.)
- Figure out what data should be vectorized, what should be stored as SQL, and what should be discarded
- Detect when data has been updated and synchronize it
- Monitor when pipelines break so data doesn’t go stale
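The normalization step above can be sketched in a few lines — reducing payloads that arrive as HTML, JSON, or plain text to one text representation. This is an illustrative sketch, not Psychic's actual code; the function and parameter names are made up:

```python
import json
from html.parser import HTMLParser

class _TextExtractor(HTMLParser):
    """Collects the text content of an HTML payload, dropping markup."""
    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)

def normalize(payload: str, content_type: str) -> str:
    """Reduce a source payload to plain text, whatever format it arrived in."""
    if content_type == "text/html":
        parser = _TextExtractor()
        parser.feed(payload)
        return " ".join(c.strip() for c in parser.chunks if c.strip())
    if content_type == "application/json":
        # Flatten top-level JSON values into a single searchable string.
        data = json.loads(payload)
        return " ".join(str(v) for v in data.values())
    # Assume plain text or markdown is already usable as-is.
    return payload.strip()
```

A real connector would also handle XML, nested JSON, and the markdown dialects, but the shape is the same: every source funnels into one `str` of content.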
This is a LOT of work for something that doesn’t move the needle on product quality.
That’s why we built Psychic.dev to be the fastest and most secure way for startups to connect to their customers’ data. You integrate once with our universal APIs and get N integrations with CRMs, knowledge bases, ticketing systems, and more with no incremental engineering effort.
We abstract away the quirks of each data source into Document and Conversation data models, and try to find a good balance to allow for deep integrations while maintaining broad utility. Since it’s open source, we encourage founders to fork and extend our data models to fit their needs as they evolve, even if it means migrating off our paid version.
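To make the "Document and Conversation data models" idea concrete, here is a minimal sketch of what such models could look like — field names are illustrative assumptions, not Psychic's published schema:

```python
from dataclasses import dataclass, field

@dataclass
class Document:
    """One normalized unit of content from any source."""
    id: str
    source: str       # e.g. "notion", "zendesk" -- which connector produced it
    title: str
    content: str      # plain text, ready for chunking/vectorization
    uri: str = ""     # link back to the original record

@dataclass
class Message:
    """A single utterance inside a Conversation."""
    author: str
    text: str

@dataclass
class Conversation:
    """A threaded exchange, e.g. a support ticket or chat thread."""
    id: str
    source: str
    messages: list[Message] = field(default_factory=list)
```

Every connector, whatever the upstream API looks like, would return `list[Document]` or `list[Conversation]` — which is what lets one downstream pipeline serve N integrations.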
To see an example in action, check out our demo repo here: https://github.com/psychic-api/psychic-langchain-tutorial/
We are also open to contributions — learn more at docs.psychic.dev or by emailing us at [email protected]!
jw1224 | 2 years ago
It’s been a challenge getting my SaaS app connected to fragmented APIs belonging to many of my customers, each with their own use cases.
One of the biggest hurdles I faced was Asana’s API. A customer wanted us to hook into an Asana webhook: when a task was added to their project, they needed to push the data to their account on our platform (and vice-versa).
But because Asana is so “flexible” (ha!), all the field names in their API responses were UUIDs. It was a total nightmare to figure out which key/values were the ones we wanted. I’m not sure if/how Psychic can figure this out.
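(One possible workaround — a sketch that assumes payloads shaped like Asana's custom_field_settings endpoint; the helper names are illustrative — is to fetch the project's field definitions once and build a gid→name lookup to rename keys before they reach your pipeline:)

```python
def build_field_lookup(custom_field_settings: list[dict]) -> dict[str, str]:
    """Map opaque custom-field gids to their human-readable names.

    Assumes the (simplified) shape returned by Asana's
    GET /projects/{gid}/custom_field_settings endpoint.
    """
    return {
        setting["custom_field"]["gid"]: setting["custom_field"]["name"]
        for setting in custom_field_settings
    }

def rename_fields(task_fields: list[dict], lookup: dict[str, str]) -> dict:
    """Replace gid keys with readable names in a task's custom-field values."""
    return {
        lookup.get(f["gid"], f["gid"]): f.get("display_value")
        for f in task_fields
    }
```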
Secondly, maybe it’s just how your landing page is phrased — but this feels like “IFTTT for AI tooling”, rather than “IFTTT powered by AI”.
I see a lot more commercial value in the latter direction. To most prospective customers, your headline “Easy to set up” doesn’t mean a React hook and Python SDK. Just give us a REST API! :)
jasonwcfan | 2 years ago
Definitely worth exploring, but as you've experienced, there are enough problems with extracting and normalizing data across the long tail of SaaS apps for us to get to reasonable scale on that alone.
re: the Asana API issue, that's both hilarious and sad. We do plan to build a transformation layer so that all data is reshaped to a consistent schema before sending it off to customers (hence the "Universal" aspect of the API). These quirks of each data source are exactly the kinds of things we want to solve for so our users don't need to worry about it.
9dev | 2 years ago
Considering I'll probably need to get other data in there soon, I'm in the market for Psychic. The question I have, though, is: can you really reconcile the schemas of several apps into one without settling for the lowest common denominator? And what do you do about platforms like Notion that don't even provide webhooks? We settled on polling, but obviously that won't scale.
jasonwcfan | 2 years ago
Data syncs -> If the source doesn't offer webhooks, we just poll daily, do a diff on our side, and send the updated data to our customer. I'm not aware of any way to avoid polling when webhooks aren't available, but we plan to do the polling ourselves so we can provide a webhook-like experience for customers.
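(A minimal sketch of that poll-and-diff loop — hash each record, and only emit an event when a record is new or its hash changed; the function names and record shape are assumptions, not Psychic's API:)

```python
import hashlib

def poll_and_diff(fetch_all, seen_hashes: dict, on_update):
    """Simulate webhooks for a polling-only source.

    fetch_all:   callable returning {record_id: record_text} for the whole source
    seen_hashes: persisted {record_id: content_hash} from the previous poll
    on_update:   callback fired with (record_id, record_text) for new/changed records
    """
    current = fetch_all()
    for record_id, text in current.items():
        digest = hashlib.sha256(text.encode()).hexdigest()
        if seen_hashes.get(record_id) != digest:
            on_update(record_id, text)          # new or changed record
            seen_hashes[record_id] = digest
    # Forget records that were deleted at the source.
    for record_id in list(seen_hashes):
        if record_id not in current:
            del seen_hashes[record_id]
```

Run daily (or on whatever cadence the source's rate limits allow), this gives downstream consumers the same "only tell me what changed" contract a webhook would.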
jasonwcfan | 2 years ago
We plan to focus strictly on the data layer, helping companies connect to their data sources through a universal API. We already are dogfooding our platform for some customers! By far the most popular use cases are customer support automation and search through workplace apps.
It's fascinating that the build/buy decision has flipped for a lot of companies. As long as they have an engineering team, a lot of companies are trying to build their own AI capabilities in house, I'm guessing because no one wants to miss the boat.
jasonwcfan | 2 years ago
We also don't expect companies to customize the functionality, just to self-host it or use the cloud version, or use it for personal projects.