Zenith — College Data Pipeline | Discovery & Pricing

📋

What They're Asking For

Build an automated data pipeline that:

Updates a Google Sheet with 35+ variables per college
Pulls from 4-5 data sources (IPEDS, Common App, College Board, web crawling)
Uses AI agents to extract transfer policies, testing policies from college websites
Covers 50-500+ colleges
Refreshes annually (they mentioned updating 8/1/2026)

35+

Variables per College

4-5

Data Sources

500+

Colleges (Full Scope)

🎯

Discovery Questions

Before pricing, we need to understand the value this data creates for them. These questions reveal that:

"What's this data powering? Is it for internal use, a client-facing product, or something else?"

→ Reveals if this is core to their revenue or a nice-to-have

"How are you getting this data today? Manual research? How many hours does that take?"

→ Quantifies the alternative cost (their current pain)

"If you had perfect data tomorrow, what would you do with it that you can't do today?"

→ Uncovers the value unlock — new products, better service, pricing power

"You mentioned 'top 50 first' — is that Phase 1? What's the full universe of colleges you need?"

→ Scopes the full opportunity

"How often does this need to refresh? Annual? More frequent?"

→ Determines if this is one-time or recurring revenue

"What's your accuracy requirement? Is 90% good enough or does every field need manual verification?"

→ Huge impact on effort — 90% vs 99% is 3x the work

"Do you have access to the Common Data Set PDFs already, or do we need to source those?"

→ Scopes additional work

"Who else has data like this? Are there competitors with similar datasets?"

→ Reveals competitive value — if nobody else has it, it's worth more

💰

Value Analysis

What's the Alternative Cost?

Manual research per college (all variables)	~2-3 hours
500 colleges × 2.5 hrs	1,250 hours
At $40/hr (research assistant rate)	$50,000
Timeline (one person, full-time)	6+ months

Manual Approach

$50,000 in labor
6+ months to complete
Error-prone, inconsistent
Not repeatable (same cost next year)
No transfer policy extraction at scale

Automated Pipeline

One-time build cost
Runs in days, not months
Consistent, validated output
Repeatable (annual refresh = marginal cost)
AI extracts policies automatically

💡 Value-Based Pricing Framing

If we save them $50K in manual labor AND deliver in weeks instead of months AND make it repeatable... what's that worth?

The question isn't "what does this cost us to build?" — it's "what value does this create for them?"

💬

Conversation Flow

Us — Open

"Thanks for the detailed spec — really helpful. Before I put together pricing, I want to make sure I understand the value side. What's this data powering for you?"

Them — Context

[Listen for: SaaS product? Consulting service? Internal tool? Client deliverable?]

Us — Dig Deeper

"Got it. How are you getting this data today? Is someone manually researching each college?"

Them — Pain

[Listen for: hours spent, frustration, errors, things they can't do]

Us — Quantify

"Roughly how many hours does that take for, say, 100 colleges? And how much does that cost you in labor?"

Us — Vision

"If you had perfect, comprehensive data on every college tomorrow — updated automatically every year — what would you do with it that you can't do today?"

Them — Value Unlock

[Listen for: new products, better pricing, competitive advantage, time saved]

Us — Scope

"On the web crawling piece — that's the variable part. Every college structures their site differently. For the top 50, we can QA every extraction. For 500+, we'd need more automation with some manual review. What's your accuracy requirement?"

Us — Budget Discovery

"This is helpful. Before I scope anything — what have you budgeted for this? Or what range are you working within?"

Them — Budget Signal

[Listen for: specific number, range, "we haven't thought about it", deflection]

Us — If They Deflect

"Totally fair. Let me ask it differently — if you were to hire someone to manually maintain this data, what would that cost you annually? That helps me understand the value frame."

💵

Budget Discovery

⚠️ Strategy: Don't Anchor First

We do NOT give them a number. We get them to reveal their budget through value discovery. Let them anchor first.

Questions to reveal their budget:

"What have you budgeted for this project?"

→ Direct ask. Sometimes they'll just tell you.

"Have you gotten quotes from others? What did those look like?"

→ Reveals market anchors they've seen

"If you were to hire someone full-time to maintain this data, what would that cost you annually?"

→ Gets them thinking about the $50-60K alternative

"What's this data worth to your business? If you had perfect data, what revenue or savings does that unlock?"

→ Ties investment to outcome

"Is there a budget range you're working within? I want to make sure I scope something realistic."

→ Soft ask, gives them permission to share

When They Push for Our Number

"It really depends on scope, accuracy requirements, and timeline. Rather than throw out a number that might not fit, I'd rather understand what you're working with and scope something that makes sense. What's the budget range you're targeting?"

If They Absolutely Won't Anchor

Keep pushing for their number. Don't anchor first.

"I hear you. Let me do a technical assessment first — look at your current data, understand the sources, estimate complexity. Then I'll come back with options. But it helps to know what range you're working with so I don't waste your time with something that doesn't fit."

🛡️

Objection Handling

"Just give us a number"

"I will — but I don't want to waste your time with a number that doesn't fit. The scope varies a lot: top 50 colleges is very different from 500. Accuracy requirements matter. Timeline matters. Help me understand what you're working with and I'll give you something realistic."

"We don't have a budget yet"

"That's fine. Let me ask it differently — what would it cost you to NOT have this data? Or to do it manually? That helps me understand the value frame so I can scope something appropriate."

"How accurate is the AI crawling?"

"Depends on scope and investment. For a smaller set, we can QA every record — close to 100%. At scale, we're talking 85-90% automated with human review on edge cases. Web scraping is inherently variable — sites change. We build for maintainability."

"We could hire someone to do this"

"You could. A full-time research assistant is $50-60K/year, plus training, management, and turnover risk. And they'd take 6+ months to get through 500 colleges once. We're delivering a system that runs repeatedly. What's that worth to you?"

"Can we start smaller?"

"Absolutely. We can scope a Phase 1 with the top 50 — prove the approach, validate data quality, then decide on scaling. What budget works for that initial phase?"

✅

Close: Next Steps

Goal: Get Budget Signal + Access

Budget range — Even a rough range helps us scope appropriately
Share access — View access to current spreadsheet + data sources
Technical scoping call — 30 min to walk through IPEDS, Common App doc, crawling complexity
Proposal — We come back with options that fit their budget and scope

Close Line — If No Budget Yet

"Here's what I'd suggest: share access to your current spreadsheet and data sources this week. I'll do a quick technical assessment — understand the IPEDS structure, look at the Common App doc, estimate crawling complexity. Then I'll come back with 2-3 options at different scope levels. You tell me which fits your budget and timeline. Sound good?"

Close Line — If They Gave Budget

"Great — [X budget] gives me something to work with. Let me look at your current data and sources, and I'll scope something that fits. Can you share access this week? I'll have a proposal back within 48 hours."

🎯 What We Need From This Call

Budget signal — Any number or range
Value context — What this data enables for them
Current pain — How they do it today, what it costs
Access — Spreadsheet, IPEDS export, Common App doc
Timeline — When do they need this?