barefeg|1 year ago
Thanks for sharing your blog post. We had a similar journey: I installed and tried both Langfuse and Phoenix and ended up choosing Langfuse because of versioning conflicts with the Python dependency. I'm curious whether your thoughts change after V3? I also liked that it only depended on Postgres, though the scalable version requires other dependencies.
The thing I liked about Phoenix is that it uses OpenTelemetry. In the end we're building our Agents SDK in a way that the observability platform can be swapped (https://github.com/zetaalphavector/platform/tree/master/agen...), and the abstraction is OpenTelemetry-inspired.
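For anyone curious what a swap-friendly abstraction like that can look like, here is a minimal sketch (illustrative names only, not the actual code from the linked repo): the agent code depends on a tiny tracer interface, and the concrete backend (Phoenix via OTel, Langfuse, a console or no-op tracer for tests, ...) is chosen at wiring time.

```python
from __future__ import annotations

import contextlib
from dataclasses import dataclass, field
from typing import Any, Iterator, Protocol


@dataclass
class SpanData:
    name: str
    attributes: dict[str, Any] = field(default_factory=dict)


class Tracer(Protocol):
    # The only surface the agent code sees; each backend implements it.
    def span(self, name: str, **attributes: Any) -> contextlib.AbstractContextManager[SpanData]: ...


class ConsoleTracer:
    """Trivial backend: just prints spans; a real backend would emit OTel spans."""

    @contextlib.contextmanager
    def span(self, name: str, **attributes: Any) -> Iterator[SpanData]:
        data = SpanData(name, dict(attributes))
        print(f"start {name} {data.attributes}")
        try:
            yield data
        finally:
            print(f"end   {name}")


def run_agent_step(tracer: Tracer, query: str) -> str:
    # Agent code only talks to the Tracer interface, never a vendor SDK.
    with tracer.span("llm_call", model="gpt-4o-mini") as span:
        answer = f"echo: {query}"  # stand-in for the real LLM call
        span.attributes["output"] = answer
        return answer


if __name__ == "__main__":
    run_agent_step(ConsoleTracer(), "hello")
```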
"Langsmith appeared popular, but we had encountered challenges with Langchain from the same company, finding it overly complex for previous NonBioS tooling. We rewrote our systems to remove dependencies on Langchain and chose not to proceed with Langsmith as it seemed strongly coupled with Langchain."
I've never really used Langchain, but setup Langsmith with my own project quite quickly. It's very similar to setting up Langfuse, activated with a wrapper around the OpenAI library. (Though I haven't looked into the metadata and tracing yet.)
Functionally the two seem very similar. I'm looking at both and am having a hard time figuring out differences.
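For what it's worth, the wrapper setup looks roughly the same for both. A hedged sketch (check the current docs, as both SDKs move quickly; credentials are read from environment variables in each case):

```python
# LangSmith: wrap the OpenAI client; traces are sent when
# LANGSMITH_API_KEY / LANGCHAIN_TRACING_V2 are set in the environment.
from openai import OpenAI
from langsmith.wrappers import wrap_openai

ls_client = wrap_openai(OpenAI())
ls_client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "hello"}],
)

# Langfuse: drop-in replacement import; traces are sent when
# LANGFUSE_PUBLIC_KEY / LANGFUSE_SECRET_KEY / LANGFUSE_HOST are set.
from langfuse.openai import OpenAI as LangfuseOpenAI

lf_client = LangfuseOpenAI()
lf_client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "hello"}],
)
```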
calebkaiser|1 year ago
I'm a maintainer of Opik, an open-source LLM evaluation and observability platform. We only launched a few months ago, but we're growing rapidly: https://github.com/comet-ml/opik
resiros|1 year ago
We're OSS and OTel-compliant, with a stronger focus on evals and on enabling collaboration between subject-matter experts and devs.
suninsight|1 year ago
We also picked Langfuse - more details here: https://www.nonbios.ai/post/the-nonbios-llm-observability-pi...
unnikrishnan_r|1 year ago
> "Another notable feature of Langfuse is the use of a model as a judge ... this is not enabled in the free version/self-hosted version"
I think you can add LLM-as-judge to the self-hosted version of Langfuse by defining your own evaluation pipeline: https://langfuse.com/docs/scores/external-evaluation-pipelin...
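Roughly what such an external pipeline can look like with the Python SDK (a sketch, not official code: `fetch_traces` and `score` are from the v2 SDK and may differ in newer versions, and the judge model and prompt are placeholders):

```python
from langfuse import Langfuse
from openai import OpenAI

langfuse = Langfuse()  # reads LANGFUSE_* env vars, incl. your self-hosted LANGFUSE_HOST
judge = OpenAI()

# Pull recent traces, score each one with an LLM judge, push the score back.
traces = langfuse.fetch_traces(limit=50).data
for trace in traces:
    verdict = judge.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{
            "role": "user",
            "content": (
                "Rate this answer between 0 and 1 for helpfulness.\n"
                f"Question: {trace.input}\nAnswer: {trace.output}\n"
                "Reply with just the number."
            ),
        }],
    ).choices[0].message.content

    # The score is attached to the trace and shows up in the Langfuse UI.
    langfuse.score(
        trace_id=trace.id,
        name="llm_judge_helpfulness",
        value=float(verdict.strip()),
        comment="scored by external LLM-as-a-judge pipeline",
    )
```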