Changelog
Latest release updates from the Langfuse team. Check out our Roadmap to see what's next.
Experiments as a First-Class Concept
Experiments now live alongside Datasets as their own top-level feature—run them with or without datasets, compare across runs, and track progress over time.
11 Days Ago
Boolean LLM-as-a-Judge Scores
LLM-as-a-Judge evaluators can now return boolean scores for `true` / `false` decisions.
2 Weeks Ago
Updates to Dashboards
Detailed reference for how dashboards behave differently in "Fast Preview" — trace counts, histograms, filters, and more.
March 23, 2026
Categorical LLM-as-a-Judge Scores
LLM-as-a-Judge evaluators can now return categorical scores in addition to numeric ones.
March 20, 2026
Simplify Langfuse for Scale
Langfuse now delivers faster product performance at scale. See the overview page for rollout details, access, and migration steps.
March 10, 2026
Langfuse CLI
Use Langfuse end-to-end from the CLI. Built for AI agents and power users.
February 17, 2026
Evaluate Individual Operations: Faster, More Precise LLM-as-a-Judge
Observation-level evaluations enable precise operation-specific scoring for production monitoring.
February 13, 2026
Run Experiments on Versioned Datasets
Fetch datasets at specific version timestamps and run experiments on historical dataset versions via UI, API, and SDKs for full reproducibility.
February 11, 2026
Corrected Outputs for Traces and Observations
Capture improved versions of LLM outputs directly in trace views. Build fine-tuning datasets and drive continuous improvement with domain expert feedback.
January 14, 2026
Inline Comments on Observation I/O
Anchor comments to specific text selections within trace and observation input, output, and metadata fields.
January 7, 2026
Filter Observations by Tool Calls and Add Tool Calls to Dashboard Widgets
Add filtering, table columns, and dashboard widgets for analyzing tool usage in your LLM applications.
December 22, 2025
v2 Metrics and Observations API (Beta)
New high-performance v2 APIs for metrics and observations with cursor-based pagination, selective field retrieval, and optimized data architecture.
December 17, 2025
Dataset Item Versioning
Track dataset changes over time with automatic versioning on every addition, update, or deletion of dataset items.
December 15, 2025
OpenAI GPT-5.2 Support
Langfuse now supports OpenAI GPT-5.2 with day 1 support across all major features.
December 12, 2025
Batch Add Observations to Datasets
Select multiple observations from the observations table and add them to a new or existing dataset with flexible field mapping.
December 11, 2025
Pricing Tiers for Accurate Model Cost Tracking
Langfuse now supports pricing tiers for models with context-dependent pricing, enabling accurate cost calculation as pricing changes with context length.
December 2, 2025
Hosted MCP Server for Langfuse Prompt Management
Langfuse now includes a native Model Context Protocol (MCP) server with write capabilities, enabling AI agents to fetch and update prompts directly.
November 20, 2025
OpenAI GPT-5.1 Support
Langfuse now supports OpenAI GPT-5.1 with day 1 support for the LLM playground, LLM-as-a-judge evaluations, and comprehensive cost tracking.
November 14, 2025
Launch Week 4 🚀
Organize Your Datasets in Folders
Use slashes in dataset names to create folders for better organization.
November 8, 2025
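The folder hierarchy comes entirely from the slash-delimited name itself. A minimal sketch of how such a name maps to a folder path (illustrative helper, not Langfuse SDK code):

```python
# Illustrative only: how a slash-delimited dataset name implies folders.
def split_dataset_name(name: str) -> tuple[list[str], str]:
    """Return (folder path segments, leaf dataset name)."""
    *folders, leaf = name.split("/")
    return folders, leaf

folders, leaf = split_dataset_name("chatbot/regression/v1")
print(folders, leaf)  # ['chatbot', 'regression'] v1
```

So naming a dataset `chatbot/regression/v1` places it in a `chatbot/regression` folder in the UI.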
Launch Week 4 🚀
JSON Schema Enforcement for Dataset Items
Define JSON schemas for your dataset inputs and expected outputs to ensure data quality and consistency across your test datasets.
November 8, 2025
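Dataset item schemas use standard JSON Schema. A minimal sketch of an input schema (the field names here are illustrative, not required by Langfuse):

```json
{
  "type": "object",
  "properties": {
    "question": { "type": "string" },
    "context": { "type": "array", "items": { "type": "string" } }
  },
  "required": ["question"]
}
```

Items that fail validation against the schema can then be caught before they pollute your test datasets.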
Launch Week 4 🚀
Score Analytics with Multi-Score Comparison
Validate evaluation reliability and uncover insights with comprehensive score analysis. Compare different evaluation methods, track trends over time, and measure agreement between human annotators and LLM judges.
November 7, 2025
Annotation Support in Experiment Compare View
Add human annotations while reviewing experiment results side-by-side. Review experiment outputs, assign scores, and leave comments while viewing full experiment context.
November 6, 2025
Launch Week 4 🚀
Baseline Support in Experiment Compare View
Compare experiment runs side-by-side with baseline designation to systematically identify regressions and improvements.
November 6, 2025
Launch Week 4 🚀
Filters in Compare View
Filter experiment results in the compare view to focus on specific subsets, such as items where evaluator scores dropped below a threshold.
November 6, 2025
Launch Week 4 🚀
Langfuse for Agents
Trace agents with beautifully rendered tool calls and understand their performance through Agent Evals.
November 5, 2025
Launch Week 4 🚀
Amazon Bedrock AgentCore Integration
Trace AI agents built with Amazon Bedrock AgentCore via OpenTelemetry and Langfuse.
November 4, 2025
Launch Week 4 🚀
@Mentions and Reactions in Comments
Tag teammates with @mentions to notify them instantly, and add emoji reactions to comments for quick acknowledgments.
November 4, 2025
Launch Week 4 🚀
IdP-Initiated SSO Support
Langfuse now supports IdP-initiated SSO, allowing users to start authentication directly from their identity provider (e.g., Okta, Azure AD, Keycloak, JumpCloud).
November 4, 2025
Launch Week 4 🚀
Mixpanel Integration
We teamed up with Mixpanel to integrate LLM-related product metrics into your existing Mixpanel dashboards.
November 4, 2025
Launch Week 4 🚀
Advanced Filtering for Public Traces and Observations API
The traces endpoint in public API now supports complex JSON-based filtering.
November 3, 2025
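A filter payload for the traces endpoint might look like the following. The exact field names are an assumption for illustration; check the API reference for the definitive shape:

```json
[
  {
    "column": "name",
    "operator": "=",
    "value": "chat-completion",
    "type": "string"
  }
]
```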
Launch Week 4 🚀
Filter Sidebar for Tables
Quickly filter tables by column values in the filter sidebar with one click.
November 3, 2025
Langchain v1 Support
Langfuse SDKs now support Langchain v1 for both Python and JS/TS. The integration remains stable and backward compatible.
October 26, 2025
LLM-as-a-Judge Execution Tracing & Enhanced Observability
Every LLM-as-a-Judge evaluator execution now creates a trace, allowing you to inspect the exact prompts, responses, and token usage for each evaluation.
October 16, 2025
Spend Alerts
Monitor your organization's cloud spending and receive notifications when costs exceed predefined monetary thresholds.
October 10, 2025
Natural Language Filtering for Traces
Filter your traces using plain English queries. Powered by Amazon Bedrock with zero data retention.
September 30, 2025
Structured Output Support for Prompt Experiments
Enforce JSON schema response formats in prompt experiments to ensure consistent, parseable outputs for evaluation and analysis.
September 30, 2025
Mutable Score Configs
Score configurations can now be updated after creation. Modify existing configs via API/SDK and UI while keeping all your data safe.
September 29, 2025
Experiment Runner SDK
New high-level SDK abstraction for running experiments on datasets with automatic tracing, concurrent execution, and flexible evaluation.
September 17, 2025
TypeScript SDK v4 (GA)
The new OpenTelemetry-based TypeScript SDK v4 is now generally available with improved DX, modular packages, and seamless integrations.
August 28, 2025
Additional Observation Types for More Meaningful Span Context
New observation types including Agent, Tool, Chain, Retriever, Evaluator, Embedding, and Guardrail provide semantic meaning to your traces.
August 27, 2025
New End-to-End Walkthrough Videos
We've released new comprehensive walkthrough videos covering observability, prompt management, and evaluation to help you get up to speed quickly with Langfuse.
August 26, 2025
Full-Text Search Across Dataset Items
Find dataset items by searching through their actual content with our new full-text search capability.
August 25, 2025
Additional Provider Options for LLM Calls in Playground and Evals
Set additional provider options on your LLM calls in the playground and in LLM-as-a-Judge evaluations.
August 14, 2025
Docs Now Available as Markdown (.md) Endpoints
Append .md to any docs URL to fetch the page as Markdown. Built at compile time for fast, reliable access.
August 7, 2025
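Turning a docs URL into its Markdown endpoint is just a matter of appending `.md` to the path; a quick sketch (the example URL is illustrative):

```python
# Any Langfuse docs URL can be fetched as Markdown by appending ".md".
def markdown_url(docs_url: str) -> str:
    """Map a docs page URL to its Markdown endpoint."""
    return docs_url.rstrip("/") + ".md"

print(markdown_url("https://langfuse.com/docs/prompts"))
# https://langfuse.com/docs/prompts.md
```

Useful for piping docs pages into LLM context windows or offline tooling.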
OpenAI GPT-5 Pricing Now Available in Langfuse
Day 1 support for OpenAI GPT-5, including tracking of token counts and USD spend.
August 7, 2025
Annotation Queue Assignments
Assign users to annotation queues to make it easier for team members to focus on relevant tasks.
August 6, 2025
Slack Integration for Prompt Webhooks
Receive prompt change notifications directly in your Slack channels with our native integration.
July 30, 2025
LLM Playground with Side-by-Side Comparison
The LLM Playground now supports side-by-side prompt comparison with parallel LLM execution.
July 28, 2025
Sessions in Annotation Queues
Annotation Queues now support session-level annotation, making it easier to evaluate multi-turn interactions in your LLM applications.
July 28, 2025
LiveKit Agents Tracing Integration
Trace real-time voice AI agents and multimodal conversations built with LiveKit Agents via OpenTelemetry and Langfuse.
July 25, 2025