Discovery questions, value framing, and pricing strategy for the college dataset automation project.
Build an automated data pipeline that:
Before pricing, we need to understand the value this data creates for them. These questions reveal that:
"What's this data powering? Is it for internal use, a client-facing product, or something else?"
"How are you getting this data today? Manual research? How many hours does that take?"
"If you had perfect data tomorrow, what would you do with it that you can't do today?"
"You mentioned 'top 50 first' — is that Phase 1? What's the full universe of colleges you need?"
"How often does this need to refresh? Annual? More frequent?"
"What's your accuracy requirement? Is 90% good enough or does every field need manual verification?"
"Do you have access to the Common Data Set PDFs already, or do we need to source those?"
"Who else has data like this? Are there competitors with similar datasets?"
| Manual research per college (all variables) | ~2-3 hours |
| 500 colleges × 2.5 hrs | 1,250 hours |
| At $40/hr (research assistant rate) | $50,000 |
| Timeline (one person, full-time) | 6+ months |
If we save them $50K in manual labor AND deliver in weeks instead of months AND make it repeatable... what's that worth?
The question isn't "what does this cost us to build?" — it's "what value does this create for them?"
"Thanks for the detailed spec — really helpful. Before I put together pricing, I want to make sure I understand the value side. What's this data powering for you?"
[Listen for: SaaS product? Consulting service? Internal tool? Client deliverable?]
"Got it. How are you getting this data today? Is someone manually researching each college?"
[Listen for: hours spent, frustration, errors, things they can't do]
"Roughly how many hours does that take for, say, 100 colleges? And how much does that cost you in labor?"
"If you had perfect, comprehensive data on every college tomorrow — updated automatically every year — what would you do with it that you can't do today?"
[Listen for: new products, better pricing, competitive advantage, time saved]
"On the web crawling piece — that's the variable part. Every college structures their site differently. For the top 50, we can QA every extraction. For 500+, we'd need more automation with some manual review. What's your accuracy requirement?"
"This is helpful. Before I scope anything — what have you budgeted for this? Or what range are you working within?"
[Listen for: specific number, range, "we haven't thought about it", deflection]
"Totally fair. Let me ask it differently — if you were to hire someone to manually maintain this data, what would that cost you annually? That helps me understand the value frame."
We do NOT give them a number. We get them to reveal their budget through value discovery. Let them anchor first.
Questions to reveal their budget:
"What have you budgeted for this project?"
"Have you gotten quotes from others? What did those look like?"
"If you were to hire someone full-time to maintain this data, what would that cost you annually?"
"What's this data worth to your business? If you had perfect data, what revenue or savings does that unlock?"
"Is there a budget range you're working within? I want to make sure I scope something realistic."
"It really depends on scope, accuracy requirements, and timeline. Rather than throw out a number that might not fit, I'd rather understand what you're working with and scope something that makes sense. What's the budget range you're targeting?"
Keep pushing for their number. Don't anchor first.
"I hear you. Let me do a technical assessment first — look at your current data, understand the sources, estimate complexity. Then I'll come back with options. But it helps to know what range you're working with so I don't waste your time with something that doesn't fit."
"I will — but I don't want to waste your time with a number that doesn't fit. The scope varies a lot: top 50 colleges is very different from 500. Accuracy requirements matter. Timeline matters. Help me understand what you're working with and I'll give you something realistic."
"That's fine. Let me ask it differently — what would it cost you to NOT have this data? Or to do it manually? That helps me understand the value frame so I can scope something appropriate."
"Depends on scope and investment. For a smaller set, we can QA every record — close to 100%. At scale, we're talking 85-90% automated with human review on edge cases. Web scraping is inherently variable — sites change. We build for maintainability."
"You could. A full-time research assistant is $50-60K/year, plus training, management, and turnover risk. And they'd take 6+ months to get through 500 colleges once. We're delivering a system that runs repeatedly. What's that worth to you?"
"Absolutely. We can scope a Phase 1 with the top 50 — prove the approach, validate data quality, then decide on scaling. What budget works for that initial phase?"
"Here's what I'd suggest: share access to your current spreadsheet and data sources this week. I'll do a quick technical assessment — understand the IPEDS structure, look at the Common App doc, estimate crawling complexity. Then I'll come back with 2-3 options at different scope levels. You tell me which fits your budget and timeline. Sound good?"
"Great — [X budget] gives me something to work with. Let me look at your current data and sources, and I'll scope something that fits. Can you share access this week? I'll have a proposal back within 48 hours."
Autonomous Technologies — Sales Conversation Guide
February 2026 — Confidential