countPaper(); workshops/posters and wrong-conference entries removed. Tick venues below to recompute the ranking.
build 2026-06-25 17:58
This project (data collection, cleaning, analysis, and this web page) was produced end-to-end by Claude Code, an autonomous AI coding agent, with no manual data curation.
All underlying data is derived solely from publicly available sources: DBLP, ORCID, the Mathematics Genealogy Project, and public web pages (institutional homepages, Google Scholar, and similar). No private or non-public personal data is used; names appear only in the context of already-public bibliographic and academic-genealogy records.
The content is provided "as is," for informational and research purposes only, without any warranty of accuracy, completeness, or fitness for a particular purpose. Automated methods (name disambiguation, and inference of PhD year, institution, and advisor) can and do produce errors, so the figures should not be treated as authoritative.
This page is not affiliated with, endorsed by, or sponsored by any institution, conference, journal, or any individual named herein, and nothing here constitutes professional, legal, or career advice. If you are listed and would like a correction or removal, please reach out and it will be addressed.
A qualifying paper is a first-author paper (year ≥ 2000) at one of nine highly selective HPC / systems venues: SC, ICS, HPDC, IPDPS, ASPLOS, MLSys, PPoPP (conferences) and IEEE TPDS, IEEE TC (journals). Their selectivity is reflected in the Google Scholar h5-index:
| Venue | h5-index |
|---|---|
| IEEE Trans. on Parallel and Distributed Systems (TPDS) | 81 |
| IEEE Trans. on Computers (TC) | 61 |
| SC — Supercomputing | 50 |
| IEEE Int'l Symposium on Parallel & Distributed Processing (IPDPS) | 41 |
| PPoPP | 34 |
| ICS — Int'l Conference on Supercomputing | 25 |
| HPDC | 23 |
Values are the Google Scholar h5-index. (ASPLOS and MLSys are also counted, as top architecture / ML-systems venues.)
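For readers unfamiliar with the metric: Google Scholar defines a venue's h5-index as the h-index over articles it published in the last five complete years, i.e. the largest h such that h of those articles have at least h citations each. A minimal sketch of that computation follows; the function name and the citation counts in the usage note are illustrative, not data from this page.

```python
def h_index(citations):
    """Largest h such that h items have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank      # the top `rank` items all have >= rank citations
        else:
            break         # counts only decrease from here on
    return h
```

For example, `h_index([10, 8, 5, 4, 3])` gives 4, since four articles have at least four citations each but there are not five with at least five.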
- DBLP XML dump (~1 GB gzip / 5.2 GB XML), decompressed locally with `dblp.dtd` so named entities (e.g. ü) resolve; this is the same data CSRankings uses.
- […] crossref series matches. The booktitle test removes co-located workshops/companions (IPDPS Workshops, GPGPU@ASPLOS, PMAM@PPoPP, SC Companion…); the crossref test removes different conferences that share a booktitle string: ICS → Int'l Computer Symposium (516 papers) and ITCS (77), SC → Soft Computing (169), SysML → an OSDI workshop, PPoPP → WPMVP.
- […]-c main subpage). Found by a per-year spike audit.
- Homonym suffixes (e.g. Wei Wang 0001) keep different people distinct.
- First author = the first `<author>` in DBLP document order; aggregate; keep authors with ≥ 5.
- PhD year: DBLP `<phdthesis>` → ORCID education API → MathGenealogy (conservative Ph.D.-only name match) → web search (homepage / Scholar / LinkedIn, identity confirmed against the author's paper span).
- Institution: `<school>` field, ORCID organization, MathGenealogy, and homepages.

Result: 18,779 qualifying papers · 276 first authors with ≥ 5 · 40 with ≥ 5 before PhD. DBLP snapshot 2026-06-23.
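The filtering and counting steps can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code behind this page: the sample records, author names, the `conf/other/` key, and the `VENUES` allow-list are hypothetical, the exact-match booktitle test is one possible reading of the rule, and only conference records are covered (journal articles in DBLP carry a `<journal>` element instead of `<booktitle>`).

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical miniature of DBLP-style records; keys and names are invented.
SAMPLE = """<dblp>
<inproceedings key="k1"><author>Ada Example</author><author>Bo Sample</author>
<title>T1</title><year>2019</year><booktitle>PPoPP</booktitle>
<crossref>conf/ppopp/2019</crossref></inproceedings>
<inproceedings key="k2"><author>Ada Example</author>
<title>T2</title><year>2019</year><booktitle>ICS</booktitle>
<crossref>conf/other/2019</crossref></inproceedings>
<inproceedings key="k3"><author>Bo Sample</author>
<title>T3</title><year>2020</year><booktitle>WPMVP@PPoPP</booktitle>
<crossref>conf/ppopp/2020w</crossref></inproceedings>
<inproceedings key="k4"><author>Ada Example</author>
<title>T4</title><year>2021</year><booktitle>ICS</booktitle>
<crossref>conf/ics/2021</crossref></inproceedings>
<inproceedings key="k5"><author>Bo Sample</author>
<title>T5</title><year>1999</year><booktitle>PPoPP</booktitle>
<crossref>conf/ppopp/1999</crossref></inproceedings>
</dblp>"""

# Assumed allow-list: accepted booktitle string -> expected crossref prefix.
VENUES = {"PPoPP": "conf/ppopp/", "ICS": "conf/ics/"}

def qualifies(rec, min_year=2000):
    """Apply the booktitle test, the crossref test, and the year cutoff."""
    booktitle = rec.findtext("booktitle", "")
    crossref = rec.findtext("crossref", "")
    year = int(rec.findtext("year", "0"))
    prefix = VENUES.get(booktitle)  # exact match rejects workshop variants
    return (prefix is not None
            and crossref.startswith(prefix)  # rejects same-name conferences
            and year >= min_year)

def first_author_counts(xml_text):
    """Count qualifying papers per first author (first <author> child)."""
    counts = Counter()
    for rec in ET.fromstring(xml_text):
        if qualifies(rec):
            first = rec.find("author")  # first match in document order
            if first is not None and first.text:
                counts[first.text] += 1
    return counts

def authors_at_or_above(counts, min_papers=5):
    """Authors whose qualifying first-author count meets the threshold."""
    return sorted(a for a, n in counts.items() if n >= min_papers)
```

On the sample, only records `k1` and `k4` survive: `k2` fails the crossref test, `k3` fails the booktitle test, and `k5` predates 2000. The standard-library parser is used here only because the sample has no named entities; the full dump needs a DTD-aware parser, as noted in the first step.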
This section examines which of these first authors reached ≥5 top-venue first-author papers by PhD graduation (for still-enrolled students, all of their papers are counted), and which factors are associated with that outcome.
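The outcome being studied can be expressed as a small predicate. The sketch below assumes that papers dated in the PhD year itself count toward the total (the page does not say whether the cutoff is inclusive) and uses `None` to stand for a still-enrolled student; both function names are illustrative.

```python
from typing import Optional, Sequence

def papers_by_graduation(paper_years: Sequence[int],
                         phd_year: Optional[int]) -> int:
    """Count qualifying first-author papers dated up to the PhD year.

    phd_year=None stands for a still-enrolled student, for whom every
    paper counts. Inclusive cutoff (<=) is an assumption.
    """
    if phd_year is None:
        return len(paper_years)
    return sum(1 for year in paper_years if year <= phd_year)

def reached_bar(paper_years: Sequence[int],
                phd_year: Optional[int],
                bar: int = 5) -> bool:
    """True if the author had at least `bar` papers by graduation."""
    return papers_by_graduation(paper_years, phd_year) >= bar
```

With paper years `[2015, 2016, 2016, 2017, 2018, 2020]`, a 2018 graduate reaches the bar with five papers, a 2017 graduate has four and does not, and a still-enrolled student has all six counted.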
PhD-year sources: phdthesis + ORCID + MathGenealogy + web search (97% coverage, 269/276).
HIGH = the 40 authors who reached ≥5 papers by graduation; CTRL = 84 otherwise-comparable authors who reached only 2–4 (the comparison group). Each row reports a typical value (median or %) for each group; the gap between them is what matters.
| Factor (what it measures) | HIGH | CTRL | Verdict |
|---|---|---|---|
| Early start / runway: years between an author's first-ever paper and their PhD, i.e. how long they had been publishing by graduation. HIGH had about 4 years of runway versus under 2 for CTRL. | 4.1 | 1.7 | strongest lever (2.4×) |
| Branded artifact → paper series: share of authors who repeatedly published on one named system/tool (a "brand" recurring in ≥2 titles, e.g. SlimFly/SlimNoC, Legion), i.e. building one project and shipping a series of papers on it. | 85% | 60% | strongly supported |
| Big team: median number of co-authors on their papers, a proxy for lab size and how much collaborative support they had. | 4.7 | 3.7 | supported |
| Topic focus: how concentrated their topics are, measured as the fraction of papers sharing a common keyword (1.0 = all on one theme). Both groups are focused, so it doesn't separate them. | 0.61 | 0.75 | necessary, not distinguishing |
| Journal leverage: share of their papers that are journal articles (TPDS/TC) rather than conference papers; journals can add an "extended-version" paper per project. Identical across groups. | 26% | 26% | a path, not the path |
| Recent hot wave: share who graduated in 2018 or later, i.e. working in fast-moving areas (serverless, ML-systems, lossy compression) with quick publication cycles. | 60% | n/a | supported |
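Two of the title-based factors in the table, the recurring brand and topic focus, could be approximated along the following lines. The page does not describe how brands or keywords were actually extracted, so everything here is an assumption: a brand is taken to be the name before the first colon in a title, a keyword is any lowercase word outside a small hypothetical stopword list, and the example titles are invented.

```python
import re
from collections import Counter

# Hypothetical stopword list; a real analysis would need a fuller one.
STOPWORDS = {"a", "an", "the", "for", "of", "on", "in", "and", "with", "to"}

def brand_of(title):
    """Assumed heuristic: the name before the first colon, else None."""
    head, sep, _ = title.partition(":")
    return head.strip().lower() if sep else None

def has_recurring_brand(titles, min_titles=2):
    """True if one brand name leads at least `min_titles` titles."""
    brands = Counter(b for b in map(brand_of, titles) if b)
    return any(n >= min_titles for n in brands.values())

def topic_focus(titles):
    """Fraction of titles containing the single most widespread keyword."""
    if not titles:
        return 0.0
    doc_freq = Counter()
    for title in titles:
        words = set(re.findall(r"[a-z]+", title.lower())) - STOPWORDS
        doc_freq.update(words)  # each keyword counted once per title
    if not doc_freq:
        return 0.0
    return max(doc_freq.values()) / len(titles)

# Invented example titles, for illustration only.
EXAMPLE_TITLES = [
    "Foo: fast graph kernels",
    "Foo: scaling graph kernels to clusters",
    "Lossy compression for checkpoints",
    "Graph partitioning on GPUs",
]
```

On the invented titles, `has_recurring_brand` is true because "Foo" leads two titles, and `topic_focus` is 0.75 because "graph" appears in three of four. The colon heuristic would miss brand families that share only a prefix, such as the SlimFly/SlimNoC example in the table, so a real implementation would need fuzzier matching.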
Two advisors each produced 3 of the 40 — Torsten Hoefler (ETH: Besta, Ziogas, Copik) and Devesh Tiwari (Northeastern: Patel, Basu Roy, Baolin Li); 6 elite departments account for 35%. Nearly every advisor is an HPC luminary running a large, machine-rich lab built around a flagship platform (MVAPICH, Globus, Legion, SLATE, SPCL, HPCToolkit…).
The pattern appears driven more by strategy and environment than by raw talent. The authors who reached this bar tended to share four traits:
Topic focus and journal output appear to be accelerators rather than the main driver. Caveats: this is a correlational observation, not causation; n = 40; the advisor analysis has no control group; PhD-year coverage is biased toward catalogued (recent, Western and Chinese) researchers.