KeyKit

Evaluate, compare, monitor, and benchmark any data or model API.

KeyKit runs structured assessments against your specific requirements, not generic benchmarks. Know which provider to choose, and know whether it stays the right choice.

Start an assessment · Get certified

Used by buyers to evaluate and monitor providers. Used by providers to prove and maintain quality.

What KeyKit does

Four capabilities. All live today.

Evaluations

Test against your requirements, not ours.

Define your tolerance for freshness, latency, field fill rate, and reliability. KeyKit runs your data or model API through up to 32 test types and returns PASS, WARN, or FAIL verdicts tied to what you actually need. Use your own trial API key. No vendor involvement required.

Evaluation #007 · COMPLETE
FIT · Historical Depth · 38 mo · req. ≥ 24 mo · 92
FIT · Freshness Lag · 1.8 hr avg · req. ≤ 4 hr · 88
PARTIAL · Field Completeness · 82% full · req. ≥ 90% · 61
FIT · Deduplication Rate · 1.1% dupe · req. ≤ 5% · 94
NO FIT · Rate Limit · 800 req/hr · req. ≥ 5,000 · 18
FIT · Response Latency · 340 ms p95 · req. ≤ 800 ms · 85
Summary: 4 FIT · 1 PARTIAL FIT · 1 NO FIT
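
The evaluation card above can be read as one threshold check per test. The sketch below is illustrative only: the function name `verdict` and the 10% `tolerance` band used to separate PARTIAL from NO FIT are assumptions, not KeyKit's documented scoring rules. It uses the card's FIT / PARTIAL / NO FIT labels.

```python
def verdict(measured: float, required: float, higher_is_better: bool,
            tolerance: float = 0.10) -> str:
    """Return FIT, PARTIAL, or NO FIT for one measurement.

    The 10% tolerance band is a hypothetical choice for illustration.
    """
    if higher_is_better:
        if measured >= required:
            return "FIT"
        return "PARTIAL" if measured >= required * (1 - tolerance) else "NO FIT"
    if measured <= required:
        return "FIT"
    return "PARTIAL" if measured <= required * (1 + tolerance) else "NO FIT"


# Measured values and requirements taken from Evaluation #007 above.
print(verdict(38, 24, higher_is_better=True))     # Historical Depth (months)
print(verdict(82, 90, higher_is_better=True))     # Field Completeness (%)
print(verdict(800, 5000, higher_is_better=True))  # Rate Limit (req/hr)
print(verdict(340, 800, higher_is_better=False))  # Response Latency (ms, p95)
```

With this assumed band, the four calls reproduce the verdicts shown on the card for those tests. A real assessment may weigh tests differently.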
Comparison

Compare providers side by side.

Run the same evaluation suite across multiple providers and see the results head to head. Built for buyers shortlisting two or three options before committing. Built for providers who want to know where they stand.

TEST · PROVIDER A · PROVIDER B
Freshness Lag (diff) · FIT 88 · PARTIAL 61
Field Completeness · FIT 94 · FIT 90
Response Latency (diff) · FIT 85 · NO FIT 22
Availability · FIT 99 · FIT 97
Deduplication · FIT 91 · FIT 88
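
The "diff" markers in the comparison above flag tests where the two providers received different verdicts. A minimal sketch of that check, assuming results are held as plain dictionaries (the data layout and the name `differing_tests` are hypothetical, not a KeyKit API):

```python
# Verdict and score per test, copied from the comparison above.
provider_a = {
    "Freshness Lag": ("FIT", 88),
    "Field Completeness": ("FIT", 94),
    "Response Latency": ("FIT", 85),
    "Availability": ("FIT", 99),
    "Deduplication": ("FIT", 91),
}
provider_b = {
    "Freshness Lag": ("PARTIAL", 61),
    "Field Completeness": ("FIT", 90),
    "Response Latency": ("NO FIT", 22),
    "Availability": ("FIT", 97),
    "Deduplication": ("FIT", 88),
}


def differing_tests(a: dict, b: dict) -> list[str]:
    """List tests present in both results whose verdicts differ."""
    return [test for test in a if test in b and a[test][0] != b[test][0]]


print(differing_tests(provider_a, provider_b))
# → ['Freshness Lag', 'Response Latency']
```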
Health Checks

Know if a provider degrades after you sign.

Schedule evaluations to run daily, weekly, or monthly. KeyKit alerts you when coverage drops, latency spikes, or field fill rates change. Post-purchase monitoring so you are never caught off guard.

Health Check · Weekly · Mondays
Jun 02 · 84
Jun 09 · 81
Jun 16 · 71
Jun 23 · 68
Jun 30 · 77
Score dropped 13 pts Jun 16 · Freshness lag exceeded threshold
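
One way to picture the alert above is as a drop measured against a baseline run. This sketch assumes the baseline is the first scheduled run and that an alert fires at a drop of 10 points or more. Both are hypothetical choices, as is the name `first_alert`, and KeyKit's actual alerting logic may differ.

```python
def first_alert(scores: list[tuple[str, int]], max_drop: int = 10):
    """Return (date, drop) for the first run that falls max_drop or more
    points below the first run, or None if no run does."""
    if not scores:
        return None
    _, baseline = scores[0]
    for date, score in scores[1:]:
        drop = baseline - score
        if drop >= max_drop:
            return date, drop
    return None


# Weekly scores from the health check above.
weekly = [("Jun 02", 84), ("Jun 09", 81), ("Jun 16", 71),
          ("Jun 23", 68), ("Jun 30", 77)]
print(first_alert(weekly))  # → ('Jun 16', 13)
```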
Benchmarks

See how your results compare to everyone else.

KeyKit aggregates anonymized results across all users testing the same provider. Know immediately whether your results are normal or a red flag, without asking anyone.

Benchmark · Provider A · 142 evaluations
Your result · 76
Top quartile · 84
Median · 68
Bottom quartile · 49

Your score is above the median. 2 tests below top-quartile threshold.
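
Placing a result against the aggregate cutoffs above is a simple comparison. In this sketch the function name `position` and the handling of boundary values (a score equal to a cutoff counts as meeting it) are assumptions for illustration.

```python
def position(score: float, bottom_q: float, median: float, top_q: float) -> str:
    """Describe where a score falls relative to three benchmark cutoffs."""
    if score >= top_q:
        return "top quartile"
    if score >= median:
        return "above median"
    if score >= bottom_q:
        return "below median"
    return "bottom quartile"


# Cutoffs and result from the Provider A benchmark above.
print(position(76, bottom_q=49, median=68, top_q=84))  # → above median
```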

For buyers

Choose the right provider before you sign. Comparison shows you the difference.
Protect yourself after you sign. Health checks catch degradation before it becomes a problem.
Know if your results are normal. Benchmarks tell you where you stand in the market.
Start free assessment
For providers

Prove quality at point of sale. Share your evaluation results directly with prospects.
Prove consistency over time. Health checks show buyers you do not degrade after they sign.
See where you stand in the market. Benchmarks show you how you compare to competitors.
Talk to us about certification

KeyKit assessments use the same framework Mulberry applies to Independent Field Assessments. Verified providers are listed on Sourced.cc. Learn about IFAs · Browse Sourced

Currently supported: Think-Pol, Tisane. More providers added regularly.

Request a provider

Verified providers are listed on Sourced.cc, where qualified buyers are already looking.

Evaluation frameworks

32 test types across coverage, data quality, freshness, reliability, compliance, and more.

Every framework scores API performance against your stated requirements, not industry averages. Supported providers include Think-Pol and Tisane. More providers added regularly.

Coverage

2 runs

Does the dataset cover the time range and regions your use case requires?

Historical Depth · Geographic Coverage

Data Quality

4 runs

Are records complete, canonical, and free of duplicates before they hit your pipeline?

Field Completeness · Deduplication Rate · Cross-Query Consistency · Provenance Metadata
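
As a rough illustration of the first two checks, the sketch below computes a field fill rate and a duplicate rate over a batch of records. The record layout, the field names, and the choice of `id` as the duplicate key are hypothetical, and KeyKit's own definitions of these metrics may differ.

```python
def fill_rate(records: list[dict], fields: list[str]) -> float:
    """Share of (record, field) pairs that hold a non-empty value."""
    total = len(records) * len(fields)
    if total == 0:
        return 0.0
    filled = sum(1 for r in records for f in fields
                 if r.get(f) not in (None, ""))
    return filled / total


def duplicate_rate(records: list[dict], key: str) -> float:
    """Share of records that repeat an already-seen key value."""
    if not records:
        return 0.0
    unique = {r.get(key) for r in records}
    return 1 - len(unique) / len(records)


# Made-up sample batch, for illustration only.
batch = [
    {"id": "a", "title": "First", "author": "Kim"},
    {"id": "b", "title": "Second", "author": None},
    {"id": "c", "title": "", "author": "Lee"},
    {"id": "a", "title": "First", "author": "Kim"},
]
print(fill_rate(batch, ["title", "author"]))  # → 0.75
print(duplicate_rate(batch, "id"))            # → 0.25
```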

Determinism

4 runs

Does re-querying the same parameters return the same results? Critical for incremental pipelines.

Result Set Stability · Sort Order Stability · Count Stability · Field Value Stability
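
A determinism check can be pictured as running the same query twice and comparing what comes back. This sketch assumes each run yields an ordered list of record IDs and uses Jaccard overlap as the stability measure. The metric choice and the function names are assumptions, not KeyKit's documented method.

```python
def result_set_overlap(run_a: list[str], run_b: list[str]) -> float:
    """Jaccard overlap of two result sets: 1.0 means identical membership."""
    a, b = set(run_a), set(run_b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def same_order(run_a: list[str], run_b: list[str]) -> bool:
    """True when both runs return the same IDs in the same order."""
    return run_a == run_b


# Made-up record IDs from two runs of the same query.
first = ["d1", "d2", "d3", "d4"]
second = ["d1", "d2", "d3", "d5"]
print(result_set_overlap(first, second))  # → 0.6
print(same_order(first, second))          # → False
```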

Freshness

2 runs

How stale is "live" data? We measure actual ingestion lag against your stated tolerance.

Freshness Lag · Lag Distribution
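
Ingestion lag is the gap between when an item was published and when the API made it available. A minimal sketch, assuming each sampled record carries both timestamps (the timestamps and the 4-hour tolerance below are made-up illustration values):

```python
from datetime import datetime
from statistics import mean


def lag_hours(published_at: datetime, ingested_at: datetime) -> float:
    """Hours between publication and availability through the API."""
    return (ingested_at - published_at).total_seconds() / 3600


# Hypothetical (published, ingested) pairs for two sampled records.
samples = [
    (datetime(2024, 6, 1, 8, 0), datetime(2024, 6, 1, 9, 30)),
    (datetime(2024, 6, 1, 10, 0), datetime(2024, 6, 1, 12, 30)),
]
lags = [lag_hours(pub, ing) for pub, ing in samples]
tolerance_hours = 4  # the buyer's stated tolerance (assumed here)

print(lags)                           # → [1.5, 2.5]
print(mean(lags) <= tolerance_hours)  # → True
```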

Query Complexity

6 runs

Can the API handle the queries your use case actually needs, or only the simple ones in the demo?

Basic Keyword Query · Boolean Logic · Nested Boolean · Wildcard & Fuzzy · Field-Scoped Query · Complex Multi-Clause

Scale & Reliability

3 runs

Performance and stability under realistic load, not cherry-picked conditions.

Response Latency · Rate Limit Discovery · Availability Check
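
Latency requirements are usually stated at a percentile such as p95 rather than as an average, so a few slow requests are not hidden by many fast ones. The sketch below uses the nearest-rank method. That choice is an assumption, and other percentile definitions give slightly different values on small samples.

```python
import math


def percentile(samples: list[float], pct: int) -> float:
    """Nearest-rank percentile: the smallest sample such that at least
    pct percent of all samples are less than or equal to it."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


# Made-up latency samples in milliseconds: 10, 20, ..., 200.
latencies_ms = list(range(10, 210, 10))
print(percentile(latencies_ms, 95))  # → 190
```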

Language & Scripts

1 run

Does multilingual content arrive correctly encoded and attributed?

Language Coverage

Stress & Edge Cases

4 runs

What breaks at the edges? Edge-case testing surfaces failures before production does.

Malformed Query Handling · Empty Result Handling · Rate Limit Breach · Deep Pagination

Compliance & Cost

3 runs

Is sensitive data scoped correctly? Does cost hold at volume?

PII / Sensitive Data Scan · Quota Accounting · Auth & Scope Boundaries

Benchmarking

1 run

Side-by-side scoring against your current vendor or an alternative. Apples to apples.

Category Benchmark
Pricing

Start free. Go Pro when you need more.

Try KeyKit with one evaluation, no credit card required. Upgrade to Pro for unlimited evaluations, health checks, and comparison.

Free Trial
$0

One evaluation. Expires after 7 days.

  • 1 evaluation
    Expires after 7 days.
  • Simple mode only
    Plain-English results: what was tested, what it means, whether it passed.
  • No health checks
  • No comparison
Start free trial
MOST POPULAR
Pro
$349/month

For data buyers evaluating APIs before purchase.

  • Unlimited evaluations
  • Simple and Advanced mode
    Switch between plain-English summaries and full framework-level control.
  • Health checks
    Run any evaluation on a recurring schedule. Daily, weekly, or monthly.
  • Evaluation comparison
    Compare any two completed evaluations side by side, framework by framework.
  • Full framework library
    30+ evaluation frameworks across 10 groups.
  • Request logs and raw output
Start free trial
Vendor
$2,749/month

For data API companies that want to offer evaluation access to their prospects.

  • All Pro features
  • Unlimited prospect trial issuances
    Each trial is scoped to your API and expires after 14 days.
  • Prospect gets Simple mode evaluation
    Against your API. They run it. You don't influence the outcome.
  • Vendor dashboard
    Track trial status: issued, started, completed.
Talk to us