June 2026

Flutter · Ente Photos

Ente Photos is an open-source, end-to-end encrypted photo storage app with a Flutter frontend. The benchmark ran against the main branch as of June 2026 and focuses on flows that cross module boundaries: UI, services, gateway calls, local DB access, and state propagation.

Repository: Ente Photos (GitHub)
Stack: Flutter / Dart · ~4,061 Dart files
Tessra version: v2.19.x
Date evaluated: June 2026
Branch / snapshot: main branch, June 2026 snapshot
Model: Claude Haiku 4.5
Tags: core flow · connected risk · directional
Baseline search found several core mechanisms. Tessra helped connect those findings to cross-module context, callers, and product risk.
Validated case · SelectionState
Baseline: 2 / 3 · With Tessra: 3 / 3 · Change: +1 pt
The baseline found the core mechanism with normal repo search. Tessra added targeted symbol context and produced a cleaner architecture-level explanation. This is not a zero-to-perfect case.
| Case | Question | Baseline result | With Tessra | Observation |
|---|---|---|---|---|
| 01 | How does DeduplicationService._getDuplicateFiles() decide whether to make the HTTP call, and how does it group files in two different ways? | Solved with normal repo search | Solved | Simple local lookup; not a strong differentiator |
| 02 | In DeleteSuggestionsPage, trace the complete chain from asyncLoader to the local DB: service, gateway, HTTP endpoint, intermediate Dart type, and final DB flag. | Partial chain | More complete chain | Cross-module tracing improved |
| 03 | In trashFilesOnServer(), what ownership validation is performed? What fallback runs if collectionID is not owned? What happens if the fallback also fails? | Core fallback found | Fallback + downstream risk | Tessra connected callers, local deletion, and user-visible risk |
| 04 | What HTTP endpoint and batch size does the hasMigratedSizes backfill use? | Endpoint or batch size may be found by search | Endpoint + batch context | Concrete implementation detail |
| 05 | Why does SelectionState's InheritedWidget have updateShouldNotify=false? How does the state actually propagate? | 2 / 3: found the core mechanism with normal repo search | 3 / 3: stronger architecture-level answer with targeted symbol context | Tessra improved answer quality, focus, and usefulness. This is not a zero-to-perfect case. |
Context completeness and cognitive leverage for architectural tracing
Baseline search found core mechanisms in several cases. The harder part was connecting those findings to the full flow: UI, services, gateway, local DB, state propagation, and user-visible behavior.

Tessra added cognitive leverage: it gave the agent a clearer working map of the repo, connected relevant symbols, and produced more useful engineering explanations.

In Case 03, normal search found the fallback. Tessra connected that fallback to callers, local deletion, and consistency risk.
In Case 05, the baseline reached 2/3; Tessra raised it to 3/3 with targeted symbol context.

The win is not that Tessra finds a file. The win is that it helps the agent turn scattered code paths into an engineering-level explanation.
Silent file drop — risk beyond the first lookup
Normal search found the core fallback: ownership validation, search for another owned collection, and severe logging when no fallback exists. Tessra went further: it connected that flow to callers, local DB deletion, and user-visible risk. The point was not finding the function; it was understanding what could become inconsistent afterward.
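To make the shape of that flow concrete, here is a minimal, language-neutral sketch in Python. It is not Ente's Dart code: `FileRef`, `plan_trash_requests`, `owned_collection_ids`, and `collections_by_file` are hypothetical names invented for illustration, and the real `trashFilesOnServer()` may differ in every detail. It only mirrors the three steps described above: validate ownership, look for another owned collection, and log severely when no fallback exists.

```python
import logging
from dataclasses import dataclass

logger = logging.getLogger("trash_sketch")  # hypothetical logger name


@dataclass(frozen=True)
class FileRef:
    """Hypothetical stand-in for a file and the collection it was selected from."""
    file_id: int
    collection_id: int


def plan_trash_requests(files, owned_collection_ids, collections_by_file):
    """Decide which (file_id, collection_id) pairs can be sent to the server.

    Returns (requests, dropped). `dropped` lists file IDs for which neither the
    original collection nor any fallback collection is owned by the user.
    """
    requests = []
    dropped = []
    for f in files:
        # Step 1: ownership validation on the collection the file came from.
        if f.collection_id in owned_collection_ids:
            requests.append((f.file_id, f.collection_id))
            continue

        # Step 2: fallback, any other owned collection that contains the file.
        fallback = next(
            (cid for cid in collections_by_file.get(f.file_id, ())
             if cid in owned_collection_ids),
            None,
        )
        if fallback is not None:
            requests.append((f.file_id, fallback))
            continue

        # Step 3: no owned collection at all. Log severely and leave the file
        # out of the server request.
        logger.critical("no owned collection for file %s; skipping", f.file_id)
        dropped.append(f.file_id)
    return requests, dropped
```

The sketch returns `dropped` explicitly, which is a design choice of this example rather than something the benchmark attributes to the app. If a caller ignored that list and deleted every selected file from the local DB, the local state and the server would disagree about the skipped files, which is the kind of downstream inconsistency this case is about.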
This benchmark should be read as evidence of improved context completeness and cognitive leverage, not as a promise of perfect answers. Some cases are partially or fully solvable with normal repository search. The value is in evaluating whether the agent can connect core mechanisms to cross-module navigation, fallbacks, callers, and downstream risk. Results may vary across repositories and model runs.