Chinese Talent, American Enterprise


Methodological Notes


1. Notes on the prediction of ethnic Chinese patent inventors, including PRC-born inventors:

We use two methods to determine whether an inventor is ethnically Chinese and, further, whether the inventor was born in the PRC:

  • Prediction by algorithm: We use NamePrism, a non-commercial ethnicity/nationality classification tool intended to support academic research (Ye et al. 2017), to help predict the inventors’ ethnicity. The same method has been used by Diamond, McQuade, and Qian (2019). For each name (first, last), the algorithm produces the probability that the name belongs to a particular ethnicity or country, e.g. Celtic English, European, East Asian (including Chinese, Japanese, and Korean), Hispanic, South Asian, etc. The algorithm predicts an inventor to be ethnically Chinese based on the names prevalent in mainland China (which uses Hanyu Pinyin), Taiwan (which mainly uses Tongyong Pinyin, a variation of the Wade-Giles system), Hong Kong (which typically uses Cantonese romanization), and Southeast Asian countries.

  • Prediction by Hanyu Pinyin dictionary: We also construct a Hanyu Pinyin dictionary based on an exhaustive list of common first and last names in Pinyin, the romanization system used in mainland China. The list of last names is from the Chinese census (Bao 2020), and the first names are constructed from Pinyin combinations. We manually customize the Pinyin dictionary by going through all names predicted as ethnic Chinese by the algorithm in order to eliminate false positives that follow the romanization conventions of Hong Kong, Taiwan, and Southeast Asian countries. Our Pinyin dictionary thus provides a narrow definition of ethnic Chinese that includes only inventors whose first and last names are both in the Hanyu Pinyin format. We treat these narrowly defined, dictionary-matched inventors as Chinese from mainland China.
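The dictionary check described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the study's implementation: the names `SURNAMES`, `SYLLABLES`, `segments_into_syllables`, and `is_hanyu_pinyin_name` are hypothetical, the two word lists are tiny illustrative subsets, and the rule that a first name consists of one or two Pinyin syllables is an assumption about how "Pinyin combinations" are built.

```python
# Illustrative subsets only (assumption); the study's dictionary is far larger.
SURNAMES = {"wang", "li", "zhang", "liu", "chen", "yang", "zhao", "huang", "zhou", "wu"}
SYLLABLES = {"wei", "xiao", "ming", "li", "hua", "jian", "guo", "yan", "jun", "hong"}


def segments_into_syllables(text, max_syllables=2):
    """Return True if text splits into 1..max_syllables dictionary syllables."""
    if text in SYLLABLES:
        return True
    if max_syllables <= 1:
        return False
    # Try every split point: a known syllable followed by a valid remainder.
    return any(
        text[:i] in SYLLABLES
        and segments_into_syllables(text[i:], max_syllables - 1)
        for i in range(1, len(text))
    )


def is_hanyu_pinyin_name(first, last):
    """Narrow definition: both the first and the last name must be Hanyu Pinyin."""
    first = first.strip().lower()
    last = last.strip().lower()
    return last in SURNAMES and segments_into_syllables(first)


print(is_hanyu_pinyin_name("Xiaoming", "Wang"))    # Hanyu Pinyin first and last name
print(is_hanyu_pinyin_name("Hsiao-Ming", "Wang"))  # Wade-Giles style first name
print(is_hanyu_pinyin_name("Wei", "Chan"))         # Cantonese-style surname
```

Requiring both names to match is what makes the definition narrow: a name with a Pinyin surname but a first name in another romanization is excluded.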

By cross-checking names predicted by the algorithm against the Hanyu Pinyin dictionary, we find that 49% of the algorithm-predicted names are in the Hanyu Pinyin format. In addition, the false negative rate of the algorithm (the proportion of names in Hanyu Pinyin that are not captured by the algorithm) is about 7%. The narrow Hanyu Pinyin dictionary definition is thus a conservative estimate, which likely understates the number of China-born inventors and their contributions.
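The two cross-check figures can be computed from two sets of names, as in the sketch below. The function name `cross_check` and the toy inputs are hypothetical; the definitions follow the text: the Pinyin share is the fraction of algorithm-predicted names that are also in the dictionary, and the false negative rate is the fraction of dictionary-matched names that the algorithm did not predict.

```python
def cross_check(algorithm_names, pinyin_names):
    """Return (pinyin_share, false_negative_rate) for two collections of names."""
    algorithm_names = set(algorithm_names)
    pinyin_names = set(pinyin_names)

    # Share of algorithm-predicted names that are in Hanyu Pinyin format.
    matched = algorithm_names & pinyin_names
    pinyin_share = len(matched) / len(algorithm_names) if algorithm_names else 0.0

    # Share of Hanyu Pinyin names that the algorithm failed to capture.
    missed = pinyin_names - algorithm_names
    false_negative_rate = len(missed) / len(pinyin_names) if pinyin_names else 0.0

    return pinyin_share, false_negative_rate


# Toy example with made-up names (not data from the study).
predicted = {"wang xiaoming", "chan tai man", "li wei", "lin chih-ming"}
in_dictionary = {"wang xiaoming", "li wei", "zhou jian"}
share, fnr = cross_check(predicted, in_dictionary)
print(f"Pinyin share: {share:.0%}, false negative rate: {fnr:.0%}")
```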

2. Descriptive statistics on the number of inventors, patents, inventors’ residence countries, and assignees in the database:

Table 1a: Number of Biotech Patent Inventors by Ethnicity (1976-2019)

Ethnicity                             Inventors   Share
Celtic English                          220,084     34%
European                                160,733     25%
Hispanic                                 62,173     10%
Chinese                                  47,148      7%
Other East Asian (Japanese/Korean)       88,638     14%
South Asian                              25,666      4%
Russian                                   6,327      1%
Others                                   34,450      5%
Total                                   645,218

Note: 98% of the inventors filed patents as residents of one country, 1.9% filed multiple patents as residents of two countries, and 0.1% as residents of three or more countries. The latter two groups are super-mobile individuals who lived in more than one country, through work assignment or immigration, while filing patents with the USPTO.
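The share column of Table 1a can be reproduced from the inventor counts and the reported total. The sketch below is only an arithmetic check; the names `COUNTS_1A`, `REPORTED_TOTAL_1A`, and `percentage_shares` are hypothetical, and rounding to the nearest whole percent is an assumption about how the table was prepared.

```python
# Inventor counts from Table 1a.
COUNTS_1A = {
    "Celtic English": 220084,
    "European": 160733,
    "Hispanic": 62173,
    "Chinese": 47148,
    "Other East Asian (Japanese/Korean)": 88638,
    "South Asian": 25666,
    "Russian": 6327,
    "Others": 34450,
}
# Total as reported in Table 1a (the rows themselves sum to 645,219).
REPORTED_TOTAL_1A = 645218


def percentage_shares(counts, total):
    """Return each category's share of total, rounded to a whole percent."""
    return {name: round(100 * n / total) for name, n in counts.items()}


for name, pct in percentage_shares(COUNTS_1A, REPORTED_TOTAL_1A).items():
    print(f"{name:<36}{pct:>3}%")
```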

Table 1b: Number of Biotech Patents by Number of Inventors (1976-2019)

Inventors per patent        Patents   Share
Single inventor             300,187     34%
Two inventors               239,225     27%
Three to five inventors     288,590     33%
Six to ten inventors         52,070      6%
Ten or more inventors         4,565      1%
Total                       884,636

Note: 93% of the patents are filed by inventors from one country while 7% of the patents are filed by inventors from two or more countries. The latter indicates the importance of multi-country collaboration in biotechnology.

Table 1c: Number of Biotech Patent Assignees by Type and Country (1976-2019)

Assignee type                     Patents    Share
Single assignee
  Companies
    U.S. companies                463,231   60.60%
    Non-U.S. companies            252,105   33.00%
  Governments
    U.S. government                 6,652    0.90%
    Non-U.S. government               915    0.10%
  Individuals
    U.S. individuals                3,707    0.50%
    Non-U.S. individuals            1,639    0.20%
  Unassigned                            7    0.00%
Two assignees                      32,245    4.20%
Three or more assignees             4,343    0.60%
Total                             764,844

References

Bao, H.-W.-S. 2020. “ChineseNames: Chinese Name Database 1930-2008 [R Package].” https://github.com/psychbruce/ChineseNames.

Cohen, Mark A., and Philip C. Rogers. 2020. “When Sino-American Struggle Disrupts the Supply Chain: Licensing Intellectual Property in a Changing Trade Environment.” World Trade Review, December, 1–20. https://doi.org/10.1017/S1474745620000531.

Diamond, Rebecca, Tim McQuade, and Franklin Qian. 2019. “The Effects of Rent Control Expansion on Tenants, Landlords, and Inequality: Evidence from San Francisco.” American Economic Review 109 (9): 3365–94. https://doi.org/10.1257/aer.20181289.

Ministry of Education. 1987. “Educational Statistics Yearbook of China.” China Statistical Press. http://cdi.cnki.net/Titles/SingleNJ?NJCode=N2019030252.

Partnership for a New American Economy. 2012. “Patent Pending: How Immigrants Are Reinventing the American Economy.” https://www.newamericaneconomy.org/sites/all/themes/pnae/patent-pending.pdf.

Sheehan, Matt. 2019. “Chinese AI Talent in Six Charts.” May 28, 2019. https://macropolo.org/china-ai-research-talent-data/?rp=m.

Trapani, Josh, and Katherine Hale. 2019. “Higher Education in Science and Engineering.” Science and Engineering Indicators. National Science Foundation, National Science Board. https://ncses.nsf.gov/pubs/nsb20197/international-s-e-higher-education.

Ye, Junting, Shuchu Han, Yifan Hu, Baris Coskun, Meizhu Liu, Hong Qin, and Steven Skiena. 2017. “Nationality Classification Using Name Embeddings.” ArXiv:1708.07903 [Cs], August, 1897–1906.
