Methodology

MyUni’s data comes from public records: College Scorecard, IPEDS, Department of Labor H-1B disclosures, and Wikidata. We refresh quarterly and archive every snapshot for rollback.

Refresh cadence

Data is refreshed on the first day of January, April, July, and October. Each refresh runs as a GitHub Action that pulls fresh data from every source, validates it against the prior snapshot (row count within ±20%, no orphaned foreign keys), and opens a pull request with the diff. Old snapshots are archived under data/snapshots/[year]-Q[n]/, so a bad refresh can be rolled back with a revert.
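The two validation rules and the snapshot folder naming can be sketched in a few lines of Python. This is an illustration only, not the actual refresh job: the function names and the `fk_field` parameter are hypothetical, and the real GitHub Action may be structured differently.

```python
from datetime import date


def snapshot_dir(refresh_date: date) -> str:
    """Build the archive folder name in the data/snapshots/[year]-Q[n]/ pattern."""
    quarter = (refresh_date.month - 1) // 3 + 1
    return f"data/snapshots/{refresh_date.year}-Q{quarter}/"


def row_count_ok(prior_rows: int, fresh_rows: int, tolerance: float = 0.20) -> bool:
    """True when the fresh row count is within the tolerance of the prior snapshot."""
    if prior_rows == 0:
        # Assumption: an empty prior snapshot only matches an empty fresh one.
        return fresh_rows == 0
    return abs(fresh_rows - prior_rows) <= tolerance * prior_rows


def orphaned_keys(child_rows, parent_ids, fk_field):
    """Return foreign-key values in child_rows that have no matching parent id."""
    parents = set(parent_ids)
    return sorted({row[fk_field] for row in child_rows if row[fk_field] not in parents})
```

A refresh would pass validation only when row_count_ok returns True for every table and orphaned_keys returns an empty list for every foreign-key relationship.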

Sources

College Scorecard

The U.S. Department of Education’s College Scorecard provides admit rates, net price by income band, post-graduation earnings, and federal aid metrics. We use the institution-level CSV released annually.

Famous alumni

Alumni are pulled from Wikidata via SPARQL. Specifically, we select people whose P69 (educated at) property points at a US university and who have a P27 (country of citizenship) value recorded. We rank by sitelinks (the count of Wikipedia language editions covering the person), so an alumnus with articles in 60 Wikipedias scores higher than one with three. This is the closest proxy for global notability that avoids scraping and paywalled APIs.
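A query of roughly this shape would return the fields described above. It is a sketch under assumptions, not the query MyUni runs: identifying a US school through P17 (country) = Q30 (United States) is our guess at the filter, and `rank_alumni` with its row format is hypothetical. Note that `wikibase:sitelinks` counts sitelinks across Wikimedia projects, so it can be slightly higher than the number of Wikipedia editions alone.

```python
# Assumed query shape for the Wikidata Query Service; not the published MyUni query.
ALUMNI_QUERY = """
SELECT DISTINCT ?person ?school ?sitelinks WHERE {
  ?person wdt:P69 ?school ;                 # educated at
          wdt:P27 ?citizenship ;            # a country of citizenship is recorded
          wikibase:sitelinks ?sitelinks .   # sitelink count used for ranking
  ?school wdt:P17 wd:Q30 .                  # assumption: school's country is the United States
}
"""


def rank_alumni(rows):
    """Sort alumni rows by sitelink count, highest first."""
    return sorted(rows, key=lambda row: row["sitelinks"], reverse=True)


# Made-up rows standing in for query results.
sample = [
    {"person": "Alumnus A", "sitelinks": 3},
    {"person": "Alumnus B", "sitelinks": 60},
]
ranked = rank_alumni(sample)
```

With the sample rows, the alumnus with 60 sitelinks sorts ahead of the one with three.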

Cost & aid

Net price for international students is approximated as sticker price minus the average international financial aid award. Schools without published international aid get a null field; we don’t fabricate a value.
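The approximation and its null rule amount to the following. The function and parameter names are hypothetical, chosen for illustration.

```python
from typing import Optional


def international_net_price(
    sticker_price: float, avg_intl_aid: Optional[float]
) -> Optional[float]:
    """Sticker price minus average international aid, or None when aid is unpublished."""
    if avg_intl_aid is None:
        # No published international aid: leave the field null instead of guessing.
        return None
    return sticker_price - avg_intl_aid
```

For example, a sticker price of 60,000 with an average international award of 25,000 gives 35,000, while a school with no published award gets None.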

Visa outcomes

H-1B sponsorship comes from the quarterly disclosure files published by the U.S. Department of Labor’s Office of Foreign Labor Certification. We aggregate by alma mater (when listed) and report approvals, denials, and median salary by year.
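The aggregation can be sketched as a group-by over (alma mater, year). The record fields used here (`alma_mater`, `year`, `status`, `salary`) and the status values are hypothetical placeholders, not the column names of the disclosure files.

```python
from collections import defaultdict
from statistics import median


def summarize_h1b(records):
    """Count approvals and denials and take the median salary per (alma mater, year)."""
    groups = defaultdict(lambda: {"approved": 0, "denied": 0, "salaries": []})
    for rec in records:
        school = rec.get("alma_mater")
        if not school:
            continue  # alma mater is not always listed; skip those records
        group = groups[(school, rec["year"])]
        if rec["status"] == "approved":
            group["approved"] += 1
        elif rec["status"] == "denied":
            group["denied"] += 1
        if rec.get("salary") is not None:
            group["salaries"].append(rec["salary"])
    return {
        key: {
            "approved": group["approved"],
            "denied": group["denied"],
            "median_salary": median(group["salaries"]) if group["salaries"] else None,
        }
        for key, group in groups.items()
    }
```

Records with no alma mater are dropped from the per-school view, and a group with no salary values gets a null median.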

What we don’t do

We don’t rank universities. The data is provided as facets to filter on (match rate, country, major, admit rate), not as a single number.
We don’t paywall the data itself. The MyUni $39 upgrade is for AI advisor tools, not for database access.