June 2026

Flutter · Ente Photos

Ente Photos is an open-source, end-to-end encrypted photo storage app with a Flutter frontend. The benchmark was run against the main branch as of June 2026. It focuses on cross-module flows across UI, services, gateway calls, local DB access, and state propagation.


Benchmark setup

Repository: ente-io/ente

Stack: Flutter / Dart · ~4,061 Dart files

Tessra version: v2.19.x

Date evaluated: June 2026

Branch / snapshot: main branch, June 2026 snapshot

Results summary

Claude Haiku 4.5: core flow → connected risk (directional)

Baseline search found several core mechanisms. Tessra helped connect those findings to cross-module context, callers, and product risk.

Validated case · SelectionState: 2 / 3 → 3 / 3 (+1 pt)

The baseline found the core mechanism with normal repo search. Tessra added targeted symbol context and produced a cleaner architecture-level explanation. This is not a zero-to-perfect case.

Results per case

Case	Question	Baseline result	With Tessra	Observation
01	How does DeduplicationService._getDuplicateFiles() decide whether to make the HTTP call, and how does it group files in two different ways?	Solved with normal repo search	Solved	Simple local lookup; not a strong differentiator
02	In DeleteSuggestionsPage, trace the complete chain from asyncLoader to the local DB: service, gateway, HTTP endpoint, intermediate Dart type, and final DB flag.	Partial chain	More complete chain	Cross-module tracing improved
03	In trashFilesOnServer(), what ownership validation is performed? What fallback runs if collectionID is not owned? What happens if the fallback also fails?	Core fallback found	Fallback + downstream risk	Tessra connected callers, local deletion, and user-visible risk
04	What HTTP endpoint and batch size does the hasMigratedSizes backfill use?	Endpoint or batch size may be found by search	Endpoint + batch context	Concrete implementation detail
05	Why does SelectionState's InheritedWidget have updateShouldNotify=false? How does the state actually propagate?	2 / 3 — found the core mechanism with normal repo search	3 / 3 — stronger architecture-level answer with targeted symbol context	Tessra improved answer quality, focus, and usefulness. This is not a zero-to-perfect case.

Key findings

Context completeness and cognitive leverage for architectural tracing

Baseline search found the core mechanism in several cases. The harder part was connecting those findings to the full flow: UI, services, gateway, local DB, state propagation, and user-visible behavior. Tessra gave the agent a clearer working map of the repo and connected the relevant symbols, which led to more useful engineering explanations.

Two examples: in Case 03, normal search found the fallback, and Tessra connected that fallback to callers, local deletion, and consistency risk. In Case 05, the baseline scored 2/3, and Tessra raised it to 3/3 with targeted symbol context. The win is not that Tessra finds a file; it is that the agent can turn scattered code paths into an engineering-level explanation.
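For readers unfamiliar with the pattern behind Case 05, here is a minimal sketch of one common way a Flutter InheritedWidget can return false from updateShouldNotify and still propagate state. This is not Ente's code and not the benchmark answer. SelectedItems, SelectionScope, and SelectionCount are hypothetical names, and the sketch assumes the inherited widget only exposes a stable reference to a ChangeNotifier that consumers listen to directly.

```dart
import 'package:flutter/widgets.dart';

/// Hypothetical selection model (illustrative name, not from the Ente repo).
class SelectedItems extends ChangeNotifier {
  final Set<int> _ids = <int>{};

  Set<int> get ids => Set.unmodifiable(_ids);

  /// Adds the id if absent, removes it if present, then notifies listeners.
  void toggle(int id) {
    if (!_ids.remove(id)) {
      _ids.add(id);
    }
    notifyListeners();
  }
}

/// Hypothetical scope widget that only hands out the model reference.
class SelectionScope extends InheritedWidget {
  const SelectionScope({
    super.key,
    required this.selection,
    required super.child,
  });

  final SelectedItems selection;

  static SelectionScope? maybeOf(BuildContext context) =>
      context.dependOnInheritedWidgetOfExactType<SelectionScope>();

  // Assumption: the model instance never changes, so rebuilding dependents
  // through the inherited-widget mechanism would be wasted work.
  @override
  bool updateShouldNotify(SelectionScope oldWidget) => false;
}

/// Consumer that rebuilds because it listens to the model, not the scope.
class SelectionCount extends StatelessWidget {
  const SelectionCount({super.key});

  @override
  Widget build(BuildContext context) {
    final selection = SelectionScope.maybeOf(context)!.selection;
    return ListenableBuilder(
      listenable: selection,
      builder: (context, _) => Text('${selection.ids.length} selected'),
    );
  }
}
```

In this sketch, updates travel through notifyListeners() and the ListenableBuilder subscription, so only the widgets that subscribe rebuild when the selection changes.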

Silent file drop — risk beyond the first lookup

Normal search found the core fallback: ownership validation, search for another owned collection, and severe logging when no fallback exists. Tessra went further: it connected that flow to callers, local DB deletion, and user-visible risk. The point was not finding the function; it was understanding what could become inconsistent afterward.
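The three steps named above (ownership validation, fallback to another owned collection, severe logging when none exists) can be sketched in plain Dart. This is an illustration under stated assumptions, not the body of trashFilesOnServer(): FileRef, TrashPlan, planTrash, and the logSevere callback are hypothetical, and the sketch assumes a file with no owned collection is left out of the request.

```dart
// Illustrative sketch only: all types and names are hypothetical, not Ente's API.

class FileRef {
  const FileRef(this.fileId, this.collectionId);
  final int fileId;
  final int collectionId;
}

class TrashPlan {
  final List<FileRef> toTrash = [];
  final List<int> droppedFileIds = [];
}

/// [ownedCollections] maps each owned collection ID to the file IDs it holds.
TrashPlan planTrash(
  List<FileRef> files,
  Map<int, Set<int>> ownedCollections,
  void Function(String message) logSevere,
) {
  final plan = TrashPlan();
  for (final file in files) {
    // Step 1: ownership validation on the collection the caller passed in.
    if (ownedCollections.containsKey(file.collectionId)) {
      plan.toTrash.add(file);
      continue;
    }
    // Step 2: fallback, look for another owned collection holding the file.
    int? fallbackId;
    for (final entry in ownedCollections.entries) {
      if (entry.value.contains(file.fileId)) {
        fallbackId = entry.key;
        break;
      }
    }
    if (fallbackId != null) {
      plan.toTrash.add(FileRef(file.fileId, fallbackId));
    } else {
      // Step 3: no fallback exists; log it and record the dropped file.
      logSevere('No owned collection found for file ${file.fileId}');
      plan.droppedFileIds.add(file.fileId);
    }
  }
  return plan;
}

void main() {
  final logs = <String>[];
  final plan = planTrash(
    const [FileRef(1, 10), FileRef(2, 99), FileRef(3, 99)],
    {10: {1, 2}},
    logs.add,
  );
  // prints [1@10, 2@10]
  print(plan.toTrash.map((f) => '${f.fileId}@${f.collectionId}').toList());
  // prints [3]
  print(plan.droppedFileIds);
}
```

Returning droppedFileIds explicitly is a design choice of the sketch: a caller that deletes local rows for every requested file, without checking which ones were dropped, would end up with the kind of local/server inconsistency this section describes.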

Known limitations

This benchmark should be read as evidence of improved context completeness and cognitive leverage, not as a promise of perfect answers. Some cases are partially or fully solvable with normal repository search. The value is in evaluating whether the agent can connect core mechanisms to cross-module navigation, fallbacks, callers, and downstream risk. Results may vary across repositories and model runs.

