SEWP Research Conference October 19, 2005

Creating a Longitudinal Research Worker-Establishment Matched Dataset from Patent Data: Description and Application to Understanding International Knowledge Flows.

Presentation Transcript


  1. SEWP Research Conference October 19, 2005 Creating a Longitudinal Research Worker-Establishment Matched Dataset from Patent Data: Description and Application to Understanding International Knowledge Flows Jinyoung Kim (SUNY-Buffalo), Sangjoon John Lee (Alfred University), Gerald Marschke (SUNY-Albany)

  2. Issues • Construction of a longitudinal research worker-establishment matched panel dataset • Knowledge flow across national borders

  3. Idea • Policy implications for immigration, the labor market, and education • Productivity of scientific researchers • Transmission mechanism of knowledge • Technology spillovers appear to be geographically limited • Firms access externally located technology partly through hiring of, and collaboration with, researchers from the outside.

  4. We examined: • Trends in U.S. firms’ access to researchers overseas and to those with foreign research experience, from the late 1980s through the 1990s • Role of research personnel as a pathway for the diffusion of ideas from foreign countries to U.S. innovators • The firm-level determinants of accessing innovations developed overseas.

  5. Main findings: • In recent years, U.S. innovators have increasingly accessed researchers residing in foreign countries • The fraction of U.S. residents with foreign research experience in U.S. firms appears to be falling. • U.S. pharmaceutical and semiconductor firms are increasingly going to foreign countries to employ such researchers • Retaining researchers with overseas research experience seems to facilitate access to innovations developed overseas. • In the semiconductor industry, smaller firms and older firms are more likely to make use of the output of non-U.S. R&D. • In the pharmaceutical industry, younger firms are more likely to make use of the output of non-U.S. R&D.

  6. Outline • Literature Review • Data Construction Process • Empirical findings • Conclusions

  7. Literature Various mechanisms for technology and knowledge transfer across institutional boundaries. • Informal Contact • Agrawal, Cockburn, and McHale (2003), Von Hippel (1988) • Spillovers • Henderson, Jaffe, and Trajtenberg (REStat 1998), Jaffe (AER 1989), Zucker, Darby, and Brewer (AER 1998), Audretsch and Feldman (AER 1996), Mowery and Ziedonis (NBER 2001).

  8. Transmission of Tacit Knowledge • Feldman (1994) • Collaboration and Hiring • Cohen, Nelson, and Walsh (Mgt Science 2002), Almeida and Kogut (Mgt Science 1999), Zucker, Darby, and Armstrong (NBER 2001), Adams, Black, Clemmons, and Stephan (NBER 2004)

  9. Data 1. Patent Bibliographic data (Patents BIB) • U.S. utility patents issued between January 1975 and February 2002. • Patent ID number, patent application and granting, patent assignee, and geographic information (country, state, city, address) on all inventors involved. • The number of patents during this period is 2,493,610 and the number of inventor records is 5,105,754

  10. 2. ProQuest Digital Dissertations Abstracts • Author, title of dissertation, degree conferring institution, date of degree, academic field, and type of degree • From over 1,000 North American graduate schools and European universities. • For those who earned degrees in all natural science and engineering fields between 1945 and 2003 • 1,068,551 degree holders.

  11. 3. The Compact D/SEC • 12,000 publicly traded firms • at least $5 million in assets and at least 500 shareholders • Information obtained from Annual Reports, 10-K and 20-F filings, and Proxy Statements for those companies. • Pharmaceutical and semiconductor firms identified in the Compact D/SEC data by their primary SIC. • Selected only the years 1989 through 1997 due to patent grant lag

  12. 4. Standard & Poor’s Annual Guide to Stocks – Directory of Obsolete Securities • histories of firm ownership changes due to mergers and acquisitions, bankruptcy, dissolution, and name changes, updated through December 2002. 5. NBER Patent-Citations • collected by Hall, Jaffe and Trajtenberg (2001) • all citations made and received by patents granted between 1975 and 1999. (16,522,438 citation records) 6. Thomas Register • Firm founding year

  13. 3 Steps in Data Construction • Identifying the same inventor among ‘same/similar’ names (Patent BIB) • Identifying the Ownership Structure of Subsidiaries (Compact D/SEC, S&P) • Combining Patent-Inventor Data with Firm Data and Patent Citation Data [Diagram: ProQuest + Patent BIB + Compact D/SEC + S&P + Thomas + Citation]

  14. Front page of patent

  15. Step 1: Identifying the Same Inventor • Inventor name variants: Adam Smith vs. Adam Smith? Adam E. Smith vs. Adam Smith? Adam Smyth vs. Adam Smith? ...

  16. The size of data (1975-2002) 2,493,610 patents, 5,105,754 inventor names • Name of the inventor (last, first, middle, surname modifier) • Street address, zip • City, state, country Over 16 million patent citations (A. Jaffe)

  17. How to identify? • Pair each name with every other name and compare: N(N-1)/2 unique pairs = (5,105,754 × 5,105,753) / 2 ≈ 13 trillion pairs • Trajtenberg (2004)
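
The pair count on this slide can be checked directly. The following is a minimal sketch that uses only the inventor-record count quoted in the slides; the variable names are illustrative.

```python
import math

n_names = 5_105_754  # inventor records in the Patents BIB data (from the slides)

# Number of unordered pairs among n names: n(n-1)/2
n_pairs = n_names * (n_names - 1) // 2

# math.comb(n, 2) computes the same quantity and serves as a cross-check
assert n_pairs == math.comb(n_names, 2)

print(f"{n_pairs:,} pairs (about {n_pairs / 1e12:.1f} trillion)")
```

The result is on the order of 13 trillion comparisons, which is why a rule-based shortcut rather than exhaustive comparison is needed.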

  18. How to Identify? a. The pair is a ‘Match’ if • Last names (SOUNDEX coded) and first names in the pair are the same and • at least one of the categories below is the same • Full Address: same street address + city + country • Self Citation: the same name is found in the citing patent • Shared Partner(s): the two names in the pair share the same partner c.f. Strong Criteria (Trajtenberg 2004)

  19. SOUNDEX Coding Method • Codes a last name by the way it sounds rather than the way it is spelled. • Expands the list of similar last names to overcome the potential for inconsistent translations of foreign names into English. PETTIT (P330000), Chang (C520000), Chiang (C520000) • Letters are given numerical values from 1 to 6: 1 for B, F, P, V; 2 for C, G, J, K, Q, S, X, Z; 3 for D, T; 4 for L; 5 for M, N; 6 for R; 0 for vowels, punctuation, H, W, Y
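
The coding table above can be sketched as a small function. This is an illustration, not the authors' implementation: the function name `soundex` is hypothetical, the padding to six digits is inferred from the slide's examples (P330000, C520000), and the treatment of H and W (they do not separate repeated codes) follows the standard American Soundex convention, which the slides do not spell out.

```python
# Letter-to-digit table taken from the slide; every other character codes to nothing.
CODES = {}
for digit, letters in {"1": "BFPV", "2": "CGJKQSXZ", "3": "DT",
                       "4": "L", "5": "MN", "6": "R"}.items():
    for ch in letters:
        CODES[ch] = digit


def soundex(name: str, digits: int = 6) -> str:
    """Return first letter + `digits` code digits, zero-padded (assumed layout)."""
    letters = [c for c in name.upper() if c.isalpha()]
    if not letters:
        return ""
    first = letters[0]
    out = []
    prev = CODES.get(first, "")  # code of the previous coded position
    for ch in letters[1:]:
        code = CODES.get(ch, "")
        if code and code != prev:
            out.append(code)      # new consonant sound
        if ch not in "HW":        # H and W do not break a run of equal codes
            prev = code
    return (first + "".join(out) + "0" * digits)[: digits + 1]


for example in ("PETTIT", "Chang", "Chiang"):
    print(example, soundex(example))
```

Under these assumptions, "Chang" and "Chiang" receive the same code, which is the behavior the slide relies on to group alternative transliterations.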

  20. b. The pair is a ‘Match’ if • Full last names (not SOUNDEX coded) and first names in the pair are the same and • at least one of the categories below is the same • Zip Code • Full Middle Name c.f. Medium Criteria (Trajtenberg 2004) c. The pair is a ‘Mismatch’ if middle name initials are different.
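
Rules a, b, and c can be written as one pairwise decision function. The sketch below makes several assumptions that the slides do not state: the record layout (`InventorRecord` and its field names are hypothetical), that the SOUNDEX code of the last name is precomputed, that a missing middle name never counts as "different", and that the mismatch rule c overrides rules a and b.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class InventorRecord:          # hypothetical layout of one inventor-patent record
    patent_id: str
    last: str
    first: str
    middle: str = ""
    last_soundex: str = ""     # precomputed SOUNDEX code of the last name
    street: str = ""
    city: str = ""
    country: str = ""
    zip_code: str = ""
    coinventors: FrozenSet[str] = frozenset()    # co-inventor names on the patent
    cited_patents: FrozenSet[str] = frozenset()  # patents cited by this patent


def _same(x: str, y: str) -> bool:
    """Case-insensitive equality; empty values never count as equal."""
    x, y = x.strip().casefold(), y.strip().casefold()
    return bool(x) and x == y


def middle_initials_differ(a, b) -> bool:
    # Rule c: only applies when both records carry a middle name (assumption).
    return bool(a.middle) and bool(b.middle) and not _same(a.middle[0], b.middle[0])


def strong_match(a, b) -> bool:
    # Rule a: SOUNDEX last name + first name, plus one corroborating category.
    if not (_same(a.last_soundex, b.last_soundex) and _same(a.first, b.first)):
        return False
    same_address = (_same(a.street, b.street) and _same(a.city, b.city)
                    and _same(a.country, b.country))
    self_citation = a.patent_id in b.cited_patents or b.patent_id in a.cited_patents
    shared_partner = bool(a.coinventors & b.coinventors)
    return same_address or self_citation or shared_partner


def medium_match(a, b) -> bool:
    # Rule b: exact last name + first name, plus zip code or full middle name.
    if not (_same(a.last, b.last) and _same(a.first, b.first)):
        return False
    full_middle = len(a.middle.strip(". ")) > 1 and _same(a.middle, b.middle)
    return _same(a.zip_code, b.zip_code) or full_middle


def is_match(a, b) -> bool:
    if middle_initials_differ(a, b):
        return False
    return strong_match(a, b) or medium_match(a, b)
```

Keeping the three rules as separate functions makes it easy to count how many pairs each criterion contributes, though whether the authors organized their code this way is not known.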

  21. Impose Transitivity • If A is matched to B and B is matched to C, then A is matched to C

  22. An Example • Match: 1:2, 1:5, 1:6, 2:3, 2:4, 2:5, 2:6, 5:6, 3:6 • ID 5 is identified to be the same inventor through transitivity
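
Imposing transitivity amounts to grouping records into connected components of the pairwise-match graph. One common way to do this is a union-find (disjoint-set) structure; the sketch below uses that technique as an assumption, since the slides do not say how the authors implemented it. The record IDs and pairs are hypothetical and are not the example table from the slide.

```python
def group_matches(ids, matched_pairs):
    """Return sorted clusters of IDs linked directly or transitively by matches."""
    parent = {i: i for i in ids}

    def find(x):
        # Walk up to the root, shortening the path along the way (path halving).
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in matched_pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a  # merge the two clusters

    clusters = {}
    for i in ids:
        clusters.setdefault(find(i), []).append(i)
    return sorted(sorted(members) for members in clusters.values())


# Hypothetical pairwise matches among six records
clusters = group_matches(range(1, 7), [(1, 2), (2, 3), (4, 5)])
print(clusters)
```

Records 1 and 3 end up in the same cluster although they were never matched directly. Because transitivity can also link records that violate a mismatch rule, each cluster still needs to be checked afterwards.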

  23. 126 mismatches found after imposing transitivity • 3 categories of Mismatches i) from data error ‘Laszlo Andra Szporny’ vs. ‘Laszlo Eszter Szporny’ ii) Inventor with 2 Middle names iii) same Last and First names appear in the same patent

  24. Matching Results • 2.3 million unique inventors (45%) out of 5.1 million names • c.f. Trajtenberg (2004): 1.6 million distinct inventors (37%) out of 4.3 million names. (Our patent database is larger because it includes additional years, 2000-2002.) • Trajtenberg uses a matching criterion of the same assignee, which can bias measured mobility among inventors, and assigns scores for each matching criterion. • Instead, we apply the criterion that two inventors are not treated as a match if their middle name initials differ. • The SOUNDEX coding system sometimes specifies names so loosely that apparently different last names are considered a match.

  25. Add Dissertation Abstract Information to Inventor data • Match degree holders in the Dissertation Abstract data with the Inventor data. • The Dissertation Abstract data contain each author's full name as a single string • Convert the last, first, and middle names in the inventor data to a single aggregated name string • 64,507 (3 percent) Ph.D. or equivalent degree holders out of 2.3 million uniquely identified inventors
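
The name-string conversion described above might look like the following. This is only a sketch under stated assumptions: the "Last, First Middle" ordering of the dissertation name string, the normalization rules, and all function names are hypothetical, and exact-string matching is a simplification of whatever the authors actually did.

```python
def name_key(full_name: str) -> str:
    """Upper-case a name, replace punctuation with spaces, collapse whitespace."""
    cleaned = "".join(c if c.isalpha() or c.isspace() else " " for c in full_name)
    return " ".join(cleaned.upper().split())


def inventor_key(last: str, first: str, middle: str = "") -> str:
    # Aggregate the separate inventor name fields into one comparable string.
    return name_key(f"{last} {first} {middle}")


def link_degree_holders(inventors, authors):
    """inventors: dict id -> (last, first, middle); authors: full-name strings
    assumed to be ordered 'Last, First Middle'. Returns the matched inventor IDs."""
    author_keys = {name_key(a) for a in authors}
    return {iid for iid, parts in inventors.items()
            if inventor_key(*parts) in author_keys}


# Hypothetical records for illustration
inventors = {1: ("Smith", "Adam", "E."), 2: ("Lee", "Jane", "")}
authors = ["SMITH, ADAM E", "Kim, Min"]
print(link_degree_holders(inventors, authors))
```

Matching on names alone can link unrelated people who share a common name, so a real pipeline would likely need further checks such as field or degree date.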

  26. Step 2: Ownership Structure of Subsidiaries • Necessary when combining firm-level information with the patent data file • Patent Assignee: either a parent firm or one of its subsidiaries. • A firm identifier does not exist. • Frequent changes in firm ownership and corporate names - Between 1989 and 1997, 152 firms were merged, 15 firms were acquired, 145 firms changed their firm names • Firm ownership structure of subsidiaries, M&A, and name change history • Relate each assignee to a firm • Makes it possible to identify the firm for which each inventor is innovating

  27. 1. Select firms in two industries in the Compact D/SEC • Primary SIC 2834 (pharmaceutical preparation) or Primary SIC 3674 (semiconductor and related devices) 2. Use S&P data • whether the change of an inventor’s firm is due to firm-level M&A and/or corporation name changes. 3. List of subsidiaries in the Compact D/SEC throughout the period 1989-1997 • not always complete • if once a subsidiary of the firm, it is treated as a subsidiary throughout 1989-1997 4. Combine with firms’ founding year
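
The rule "if once a subsidiary of the firm, it is a subsidiary throughout 1989-1997" can be sketched as pooling the yearly subsidiary lists into one assignee-to-parent lookup. The data layout, the function name, and the firm names below are hypothetical; the sketch also ignores mergers and name changes, which the authors handle with the S&P data.

```python
def assignee_to_parent(yearly_subsidiaries):
    """yearly_subsidiaries: dict year -> dict parent firm -> list of subsidiaries.
    Returns one mapping assignee name -> parent firm, pooled over all years."""
    mapping = {}
    for year in sorted(yearly_subsidiaries):
        for parent, subsidiaries in yearly_subsidiaries[year].items():
            mapping.setdefault(parent, parent)   # a parent maps to itself
            for sub in subsidiaries:
                # Simplification: the earliest listed parent wins if lists conflict.
                mapping.setdefault(sub, parent)
    return mapping


# Hypothetical yearly subsidiary lists (incomplete in any single year)
lists = {
    1989: {"Acme Pharma": ["Acme Labs"]},
    1993: {"Acme Pharma": ["Acme Biotech"]},
}
lookup = assignee_to_parent(lists)
print(lookup["Acme Labs"], "|", lookup["Acme Biotech"])
```

Pooling across years compensates for incomplete yearly lists, at the cost of assigning a subsidiary to a parent in years when the relationship may not yet have existed.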

  28. Step 3: Combining Inventor data with firm data and Patent Citation data • Combine inventor file with firm-level data • Patent-inventor-firm matched data • Link to Hall, Jaffe, and Trajtenberg citation data (2001) • 16,522,438 citations for all granted patents applied from 1975 through 1999.

  29. Descriptive Statistics 1975 - 2002 • 2,493,610 patents • 2.05 inventors per patent • 2,299,579 unique inventors
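
The ratios quoted in the slides follow from the raw counts. The check below is a small sketch that uses only figures stated in the deck; it assumes "inventors per patent" means inventor records divided by patents.

```python
patents = 2_493_610           # U.S. utility patents, 1975-2002
inventor_records = 5_105_754  # inventor names on those patents
unique_inventors = 2_299_579  # after name matching
phd_holders = 64_507          # matched to the dissertation data

inventors_per_patent = inventor_records / patents
unique_share = unique_inventors / inventor_records
phd_share = phd_holders / unique_inventors

print(f"{inventors_per_patent:.2f} inventor records per patent")
print(f"{unique_share:.0%} of names are unique inventors")
print(f"{phd_share:.0%} of unique inventors hold a Ph.D. or equivalent")
```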

  30. Descriptive Statistics * 3 percent (64,507) of unique inventors are Ph.D. or equivalent degree holders

  31. Number of Patents Granted by Year of Application * Grant lag - 97% of patents are granted within the first 4 years of the application date (Hall, Griliches, and Hausman 1986)

  32. International Knowledge Flow • Trends in U.S. firms’ access to the researchers with overseas research experience • Role of research personnel as a pathway for the diffusion of ideas from foreign to U.S. • The firm-level determinants of accessing innovations developed overseas.

  33. Inventors with Foreign Experience in US Domestic Patents † Resided in foreign countries in the previous 10 years

  34. Patent-Inventor Ratio by Foreign-Experience type

  35. Variable Definition and Sample Statistics

  36. Determinants of Citation to Foreign-Assigned Patents Dependent variable = logit transform of CITE_FRGN Note: Rows show the estimated coefficient and the t statistic for each regressor. The result for a constant term is suppressed. The t statistic is based on the Huber-White sandwich estimator of variance.
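
For reference, the logit transform maps a share p in (0, 1) to log(p / (1 - p)). The sketch below assumes CITE_FRGN is such a share (its definition table is not reproduced in this transcript) and leaves open how values of exactly 0 or 1 were handled, since the slides do not say.

```python
import math


def logit(p: float) -> float:
    """Log-odds of a share strictly between 0 and 1."""
    if not 0.0 < p < 1.0:
        raise ValueError("logit is undefined for p <= 0 or p >= 1")
    return math.log(p / (1.0 - p))


# Hypothetical shares for illustration
for share in (0.1, 0.5, 0.9):
    print(share, round(logit(share), 3))
```

The transform stretches the bounded share onto the whole real line, which is why it is commonly used when a fraction is the dependent variable in a linear regression.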

  37. Conclusion • In recent years, U.S. innovators have increasingly accessed researchers with foreign R&D experience • An increase in U.S. firms’ employment of foreign-residing researchers • The fraction of research-active U.S. residents with foreign research experience appears to be falling • Possibly to capture geographically dispersed knowledge spillovers. • Having researchers with research experience abroad seems to facilitate access to foreign-produced knowledge. • In the semiconductor industry, smaller firms and older firms are more likely to make use of the output of non-U.S. R&D. • In the pharmaceutical industry, younger firms are more likely to make use of the output of non-U.S. R&D.

  38. Future Extension • The consequences of the mobility of R&D personnel for firm R&D. • The impact of the arrival of a researcher with a particular set of R&D experiences on the character and quantity of R&D done by a firm • The importance of inter-firm mobility for technological diffusion. • How firms organize the R&D enterprise, the extent of collaboration among geographically dispersed scientists, and the extent of interaction among scientists with different backgrounds.
