Internships & Fellowships

The following information is maintained by the Graduate Student Issues Committee (GSIC).

If you would like to have your internship or fellowship listed here, please contact our GSIC co-chair, Delwin Carter.

Graduate Student Internships

Internships are a valuable way to link your academic experience with the professional arena. Below is a list of internships that will allow students to go beyond the classroom and conduct practical research with a mentor from a testing company or research agency.

Graduate Student Fellowships

Fellowships provide structured work experience and professional development that include intensive training and experiential learning. Below is a list of fellowships that provide support to the fellow's growth and opportunities to explore a particular field of measurement.

American Board of Internal Medicine 

American Board of Internal Medicine Internship

Summer 2020 Psychometric Internship
Length: 8 Weeks 
Internship Opportunity 

The ABIM’s psychometric internship program is an eight-week summer internship running from Monday, June 1st to Friday, July 24th in Philadelphia, PA. During the program, the intern will take primary ownership of an applied research project related to psychometrics under the guidance of one of the ABIM’s measurement scientists. The intern will also have opportunities to assist psychometric staff on other research projects and to learn about operational processes.

Qualifications
• Doctoral student in an educational measurement (or related field) program with at least two years of coursework completed by the start of the internship 

• Preference will be given to applicants who have experience with item response theory 

• Excellent communication skills 

• Interest in certification testing 

• Eligible to be legally employed in the United States 

Compensation
The ABIM provides a total of $10,000 for the eight-week internship program. This total includes an $8,000 stipend as well as a $2,000 housing allowance. 

Research Projects 

For their primary research project, the intern should expect to perform all stages of the research process, from literature review to discussion and dissemination of results. At the conclusion of the program the intern will be expected to share their results by giving a brief presentation to an audience of psychometric staff. Further, the intern will be encouraged to submit their summer project(s) for presentation at a professional conference and/or for publication. The intern will work with their mentor to select an appropriate project for their experience level and interests. Examples of previous internship projects include: 

1. Proficiency Score Estimation. This project examined how poorly estimated item parameters impact different proficiency estimators. The intern conducted a simulation study to examine how different levels of parameter instability impact Bayesian vs. non-Bayesian estimators as well as pattern vs. summed-score estimators. 

2. Exploring the Impact of Examination Timing. This project examined different ways to determine if a test is speeded. The intern conducted a thorough literature review of methods used to detect and quantify test speededness, then used data visualization, a nonparametric approach that makes no distributional assumptions, to examine operational test data for speededness. This proved to be a viable approach to assessing the impact of examination timing. 

3. Automatic Key Validation. This project developed an automatic method for determining whether there is a problem with an item’s key. The intern collected data from psychometricians regarding which items required key validation and used logistic regression to mimic that professional judgment and automatically flag problematic items. 
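A miniature version of the first project above can be sketched in a few lines: simulate 2PL responses, perturb the item difficulties to mimic unstable calibration, and compare a maximum-likelihood (grid search) proficiency estimator against a Bayesian EAP estimator. Everything here (item counts, perturbation size, the N(0,1) prior) is an illustrative assumption, not ABIM's actual design.

```python
import numpy as np

def p_correct(theta, a, b):
    # 2PL item response function
    return 1.0 / (1.0 + np.exp(-a * (theta - b)))

def estimate_theta(resp, a, b, grid=np.linspace(-4, 4, 161)):
    """Return ML (grid argmax) and Bayesian EAP (N(0,1) prior) proficiency estimates."""
    P = p_correct(grid[:, None], a, b)                       # (grid points, items)
    ll = resp @ np.log(P).T + (1 - resp) @ np.log(1 - P).T   # (persons, grid) log-likelihood
    ml = grid[np.argmax(ll, axis=1)]                         # non-Bayesian estimate
    post = np.exp(ll) * np.exp(-0.5 * grid ** 2)             # unnormalized N(0,1) posterior
    eap = (post * grid).sum(axis=1) / post.sum(axis=1)       # Bayesian estimate
    return ml, eap
```

Feeding both estimators the same perturbed difficulties and comparing RMSE against the generating abilities gives a toy version of the simulation design described above.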


Please submit your curriculum vitae and a letter of interest to Michelle Johnson, Research Program Manager, by Friday, January 31st, 2020. 

Center for Assessment


2020 Summer Internship Program in Educational Assessment and Accountability 

The National Center for the Improvement of Educational Assessment, Inc. (the Center) is a small non-profit organization that occupies a unique and influential niche at the intersection of educational measurement and educational assessment policy. The Center is pleased to offer up to five (5) summer internships for advanced doctoral students in educational measurement and/or assessment/accountability policy who want the opportunity to work with the Center’s professionals on projects with direct implications for state and national educational policy. 

The Center for Assessment 

The Center was formed in 1998 as a not-for-profit corporation with a mission to increase student learning through improved assessment and accountability practices. The Center is located in Dover, NH (10 miles from the seacoast town of Portsmouth, NH and about an hour north of Boston, MA). The Center’s thirteen professional staff members have earned doctorates in psychometrics, curriculum, or statistics and most have worked at high levels in state departments of education (e.g., assessment directors) or in testing companies. The combination of technical expertise and practical experience allows Center professionals to contribute effectively to cutting edge applications in educational measurement and policy. 

The Center works directly with states (current contracts include more than 30 states or entities) and has working relationships with several national research and advocacy organizations such as the Council of Chief State School Officers (CCSSO), Achieve, and KnowledgeWorks. Some sample current projects of the Center include: 

  • Serving as technical leaders in the design and implementation of Innovative Assessment Demonstration Authority (IADA) projects with states pursuing this flexibility under the federal Every Student Succeeds Act (ESSA), 
  • Helping states devise student longitudinal growth systems for school accountability, and analyze the factors affecting the validity and reliability of such systems, 
  • Designing innovative, interactive assessment and accountability reporting systems that yield meaningful interpretations of student and school scores, 
  • Working with multi-state assessment consortia on a variety of issues ranging from assessment design and development to structuring systems for assisting the consortia in receiving relevant and timely technical advice, and 
  • Assisting states in developing comprehensive and coherent systems of assessment that serve summative and formative purposes. For example, the Center has been a national leader in designing systems to support competency-based and personalized learning models.

The Summer Internship Program 

Each intern will work on one major project throughout the summer (to be negotiated between the intern and the Center mentor) and may participate with Center staff on other ongoing projects. The intern will have the opportunity to attend meetings and interact with state assessment personnel. Interns will be expected to produce a written report and a proposal for a research conference (e.g., NCME, AERA) as evidence of successful completion of their project. One of the Center’s senior staff will serve as the intern’s primary mentor, but the interns will interact regularly with many of the Center’s staff. More details about the Center for Assessment and about potential projects can be found on the Center’s website and its Internship page. Potential intern projects for 2020 may include the following:

1. Mitigating the impact of rater inaccuracies on test score scales: As large-scale assessments continue to include greater numbers of cognitively complex assessment tasks that ask students to respond in writing, there are both benefits and challenges. Such tasks offer the opportunity to gather rich and instructionally useful feedback for educators and students, but year-to-year score scale stability becomes harder to achieve due to the potential influence of rater inconsistencies. When greater emphasis is placed on written responses, the impact of rater error on score scales demands heightened scrutiny, because it can threaten the comparability of scores within and across items and over time. This internship provides the opportunity for an advanced doctoral student to examine the psychometric impact of rater error on score scales and possible mitigation procedures. 

2. Improving the interpretability of test score reports. The aesthetics and quality of information presented on test score reports has improved over the last decade, but a survey of state individual student reports conducted by a 2019 Center summer intern (Tanaka, 2019) revealed that error associated with test scores was rarely reported. The Standards for Educational and Psychological Testing (AERA, APA, & NCME, 2014) explicitly call for error or uncertainty of test scores to be included anytime scores are reported. When asked why error is not being reported, many test contractors and state assessment leaders reported that users did not understand how to interpret error and were frustrated trying to make sense of these reports. This internship combines assessment literacy and report design to better understand how we might produce more accurate and useful score reports. This project will involve reviewing assessment literacy research on how best to communicate measurement error, designing report mock-ups, and conducting cognitive laboratories with potential stakeholders to evaluate and refine draft designs. 

3. Innovative Assessment System Additional Validity Evidence Collection and Analysis: New Hampshire received a first-in-the-nation waiver from federal requirements related to state annual achievement testing in the 2014-15 school year. The Performance Assessment of Competency Education (PACE) innovative assessment system is now in its sixth year of implementation, operating under the Innovative Assessment Demonstration Authority (IADA) under the Every Student Succeeds Act, and the collection and analysis of validity evidence continues. This project has two major components to support a special validity study in one participating district: 1) analyze collected student work on performance tasks and bodies of work to examine the extent to which student achievement is accurately reflected in PACE annual determinations, teacher judgment survey ratings, and NH SAS (the state standardized test) results in relation to the achievement level descriptors; and 2) use cognitive labs to compare two designs for collecting teacher judgments of student achievement. The purpose of this project is to analyze the evidence to make an argument about ways to improve the accuracy of teacher judgments about student achievement and add additional information to the validity argument about how well PACE standards represent the depth and breadth of student achievement. 

4. Evaluating assessment accommodations: Guidance from the United States Department of Education (ED) for peer review of state assessment systems specifies that states must ensure that accommodated administrations of assessments are appropriate, effective, and allow for meaningful interpretation and comparison of results for all students (peer review element 5.3). These are challenging criteria to meet. The purpose of this project will be to help identify and document the range of practices and sources of evidence to help developers better address these criteria. It is anticipated this project will involve a literature review, a survey of state assessment practices, and potentially the development of guidance to help assessment leaders understand and document the impact of accommodated conditions on assessment outcomes. 

Additional requirements for this project: Familiarity with special populations including students with disabilities and/or language learners. Good understanding of professional practices in large-scale, standardized assessment that bolster inclusiveness and accessibility. 

5. Test Fairness: Exploring current practices and future directions: Test developers often describe three primary goals for large-scale, high-stakes assessments: validity, reliability, and fairness. We know comparatively less about the last of these three. However, there is emerging scholarship that suggests our understanding of fairness should be broader and more tightly coupled with validity. The chief goal of this project is to better understand the ‘state of the states’ with respect to practices and, chiefly, documentation in support of fairness. Project activities will likely include: 

• A review of the literature to better understand the dimensions of fairness associated with large scale high stakes state tests. 

• An exploration of the leading development and design practices employed by states to support fairness. 

• A study of the prominent sources of evidence that states have collected to document the extent to which fairness has been addressed (e.g., technical manuals, research reports, peer review submissions). 

One outcome of the project may be to identify ways for states to bolster their practices and documentation in support of fairness. 

6. Evaluating the reliability and precision of school accountability performance scores: This project involves analyzing the reliability/precision of states’ school accountability performance scores, and evaluating states’ accountability performance criteria in terms of those reliability/precision characteristics. School performance scores can be produced in a variety of ways and are usually aggregates of several other scores, including status and growth performance on tests of English language arts and mathematics, high school graduation rates, and other measures. However produced, they can be viewed as similar to assessment scores, with technical properties including validity, reliability/precision, and fairness. A key contribution of this project will be to implement advanced modeling methods to determine empirical reliability/precision estimates of school performance scores. 

Additional requirements for this project: Ability to do empirical modeling of complex scores using large data sets in SAS, R, or other programmable statistical software. Knowledge of states’ school accountability systems and federal school accountability requirements (i.e., ESSA) is a plus. 
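As one hedged illustration of what an empirical reliability/precision estimate for aggregate school scores could look like, the sketch below uses a simple one-way variance decomposition (rather than the advanced models the project envisions, and in Python rather than SAS); all simulated quantities are assumptions for illustration.

```python
import numpy as np

def school_score_reliability(scores_by_school):
    """Estimate reliability of school mean scores via a one-way variance decomposition:
    reliability ~= true between-school variance / observed variance of school means."""
    means = np.array([s.mean() for s in scores_by_school])
    ns = np.array([len(s) for s in scores_by_school])
    var_within = np.mean([s.var(ddof=1) for s in scores_by_school])  # pooled within-school variance
    n_h = len(ns) / np.sum(1.0 / ns)                                 # harmonic mean school size
    var_obs = means.var(ddof=1)                                      # observed variance of school means
    var_true = max(var_obs - var_within / n_h, 0.0)                  # subtract sampling-error component
    return var_true / var_obs if var_obs > 0 else 0.0
```

With 50 students per school, a between-school SD of 0.5, and a within-school SD of 1, the theoretical reliability is 0.25 / (0.25 + 1/50), roughly 0.93.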

7. Evaluating interim assessments against intended use and interpretation: This project involves evaluating the validity and usefulness of select commercial and state-developed interim assessments through constructing and analyzing theories of action and interpretive validity arguments, and associated evidence. Interim assessments may be meaningfully differentiated from each other and from other summative or formative assessments by specifying in detail both a theory of action and an interpretive argument (cf., Bennett, Kane, & Bridgeman, 2011; Gong & Dadey, 2019). Differentiated claims and interpretations are inherently represented in the marketing literature and assessment reports (e.g., item, student, class, school, and district) made available by publishers of commercial and state-sponsored interim assessments. This project will involve evaluating several interim assessments against their intended uses and interpretations, using the theory of action/interpretive validity argument approach. Possible interim assessments to analyze include prominent commercial interim assessments, as well as state-sponsored interim assessments. 

Additional requirements for this project: Deep familiarity with a wide variety of instructional uses of interim assessment information; ability to extract and/or construct a theory of action regarding those instructional uses; ability to create interpretive and evaluative validity arguments based on typical published assessment documentation (e.g., score reports, test blueprints, alignment studies); and background in ELA or mathematics, including familiarity with the Common Core State Standards. 

8. Analyzing solutions to complex assessment design problems: The example of NGSS assessments: This project involves analyzing the strengths and limitations of multiple real-world solutions to a complex assessment design problem: how to assess the Next Generation Science Standards. The complex structure and lack of specifications of the NGSS, as well as varying state values and constraints, have led states to develop a number of very different assessment designs for their science assessments. Close analysis of testing programs’ documentation will be used to depict the solutions states have devised for creating an NGSS assessment, how the designs are similar and different, and the strengths/limitations of each design. Particular care will be taken to differentiate different designs to accomplish the same thing from different designs to accomplish different things. The documentation will include, for each testing program, the theory of action, claims, score reports and supporting materials, test blueprints, standard-setting materials, alignment study materials, and other relevant documents to the extent possible. The document analysis will be supplemented by interviews of key design architects of the testing programs. 

Additional requirements for this project: Interest in promoting multiple solution approaches to assessment design problems; understanding of the NGSS’s structure and content sufficient to understand forced assessment design trade-offs; ability to read assessment technical documentation (e.g., test blueprints, standard setting reports, alignment studies) and see implications for an interpretive validity argument. 

Application Information 

General Qualifications 

The intern must have completed at least two years of doctoral course work in educational measurement, curriculum studies, statistics, research methods, or a related field. Interns with documented previous research experience are preferred. Further, interns must document their ability to work independently to complete a long-term project. We have found that successful interns possess most of the following skills and knowledge (the importance of the level of skills and knowledge in each of the areas described below is dependent on the specific project): 

  • Ability to work on a team under a rapid development model 
  • A deep understanding of educational assessment and its uses including policy and practice 
  • Content knowledge in a relevant discipline (e.g. science, mathematics, language arts) 
  • Depending on the project, working knowledge of statistical analysis through multivariate analyses, as well as fluency with one or more statistical packages (e.g., SAS, SPSS, R)
  • A solid understanding of research design
  • Knowledge of psychometrics (both classical test theory and IRT), with a demonstrated understanding of the principles of reliability and validity
  • An interest in applying technical skills and understanding major policy and practical issues
  • Excellent written and competent spoken English skills 


The internship lasts 8 weeks and is based at our offices in Dover, NH for the full duration. The internship will start in early June 2020; the specific date will be determined by the intern and the mentor. 


The Center will provide a stipend of $6,000 as well as a housing allowance and reasonable relocation expenses. 


To apply for the internship program, candidates should submit the following materials electronically: 

  • A letter of interest explaining why the candidate would be a good fit with the Center, what the candidate hopes to gain from the experience, and which project(s) the candidate prefers. Further, the letter should explain both what the candidate could contribute to the preferred project(s) and why the project(s) fit with the candidate’s interests. 
  • Curriculum vitae, and
  • Two letters of recommendation (one must be from the candidate’s academic advisor). 

Of approximately 20-30 applicants, six to eight are identified for a telephone interview. Those interviewed by phone may be asked to submit one recent sole-authored (preferred) or first-authored academic paper. Please do not submit the paper until it is requested. 

Materials must be submitted electronically (including letters of recommendation) to Sandi Chaplin and must be received by February 14, 2020. 

Applicants selected for interviews will be notified by March 6, 2020 regarding their candidacy. 

To learn more about the Center, please visit its website.


The College Board 

The College Board Psychometric Intern - Summer 2020

Named by Fast Company as one of the most innovative education companies, the College Board is a mission-focused organization and a powerful force in the lives of American students. To fulfill our purpose of clearing a path for students to own their future, we offer access, opportunity, and excellence to millions of students each year. Over the past five years, the College Board has been undergoing a transformation. We’ve redesigned the SAT, PSAT/NMSQT, and many AP courses and exams, and we’ve introduced the PSAT 10, PSAT 8/9, and Official SAT Practice on Khan Academy, all with great success. 

The Psychometrics Department is looking for two doctoral summer interns for 2020 and each intern will work with two mentors on a specific project in the area of psychometrics. 

The internship spans 8 weeks, starting on June 8th and ending on July 31st, with an expected weekly full-time workload (40 hours per week). This eight-week internship is designed to provide interns with the opportunity to work closely with psychometricians and gain hands-on working experience with College Board data and projects. Interns are expected to perform a literature review, conduct analysis, write a research report, and present the research to College Board staff at the conclusion of the project. 

To be eligible: 

Interns must be full-time doctoral students at an accredited 4‐year university 

A strong preference will be given to advanced students in the process of completing their dissertations 

Graduate students in psychometrics, measurement, quantitative & mathematical psychology, educational psychology, industrial‐organizational psychology, statistics or related fields are invited to submit applications 

Experience with statistical software (SAS, SPSS, and/or R) is required, and working knowledge of Classical Test Theory and Item Response Theory is desired 

Students must be eligible to be legally employed in the United States (international F1 visa students please read details below) 

o Internship timing: Eight weeks (June 8–July 31, 2020). International students require prior clearance from the College Board before the internship begins (see details below). 

o Positions: 2 

o Hours per week: 40 hours per week 

o Location: Newtown, PA office (Yardley, PA) 

o This is a paid internship 

o A housing stipend can be offered when deemed necessary 

o Application deadline: February 28, 2020. Applicants will be informed about acceptances by March 13, 2020. 

o Please indicate in your Cover Letter which project you would like to be considered for. 

Interns will work with two College Board mentors on a specific project in the area of psychometrics. Possible topics include: 

Project 1: Investigation of Methods to Evaluate Item Fit Plots 

In an operational testing environment, it is imperative that psychometric work be completed in a timely manner. One task that can be time-consuming is the evaluation of item characteristic curve plots for fit. The proposed study will evaluate variants of a method for assessing item fit plots, with the goal of alleviating the need for human review. The study will utilize operational data and requires some advanced programming in R and familiarity with FlexMIRT. Familiarity with aberrancy detection methods is a plus. 

The goal of this project is to investigate an automated method to determine if the empirical data in an item plot fits the item characteristic curve (ICC). If it shows sufficient effectiveness, then the process developed could be used to reduce the amount of time and resources spent reviewing IRT item plots for pretest items. 

The basic procedure involves the following steps: 

1. The calibration data is split into two datasets. The procedure for splitting the data is described below. 

2. Two sets of item parameters are calibrated, one from each dataset. 

3. For each item, the ICCs from the two calibrations are compared. 

4. Items with large differences between their ICCs are removed. 

5. The procedure is repeated until no items are flagged for removal. 

6. The excluded items are tagged as having poor item fit. 
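The loop above can be sketched end to end. The sketch below is only illustrative: it substitutes a crude p-value-based Rasch difficulty estimate for a real IRT calibration (e.g., in FlexMIRT), and the flagging threshold of 0.12 is an arbitrary assumption.

```python
import numpy as np

def rasch_icc(theta, b):
    # Rasch item characteristic curve
    return 1.0 / (1.0 + np.exp(-(theta - b)))

def crude_difficulty(resp):
    # Crude stand-in for a real calibration: difficulty = negative log-odds of the item p-value.
    p = resp.mean(axis=0).clip(0.01, 0.99)
    return -np.log(p / (1 - p))

def flag_misfitting_items(resp_a, resp_b, threshold=0.12):
    """Iteratively flag items whose ICCs disagree across the two calibration splits."""
    theta = np.linspace(-3, 3, 61)
    active = list(range(resp_a.shape[1]))
    flagged = []
    while True:
        b_a = crude_difficulty(resp_a[:, active])
        b_b = crude_difficulty(resp_b[:, active])
        # Re-center each split's difficulties so the two scales are comparable.
        b_a -= b_a.mean()
        b_b -= b_b.mean()
        # Step 3: largest ICC discrepancy over a theta grid, per item.
        diffs = np.array([np.max(np.abs(rasch_icc(theta, x) - rasch_icc(theta, y)))
                          for x, y in zip(b_a, b_b)])
        worst = int(np.argmax(diffs))
        if diffs[worst] <= threshold:          # step 5: stop when nothing is flagged
            return sorted(flagged)             # step 6: flagged items have poor fit
        flagged.append(active.pop(worst))      # step 4: remove the worst item and repeat
```

In a real application the two calibrations would come from the chosen data split (institution size, Lz, random, region), and flagged items would be compared to those caught by the current human review.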

The crux of this study is to identify a procedure that splits the data in a manner that identifies items with poor item fit. A few of the options being considered are: 

1. Institution size 

2. Lz statistic 

3. Random assignment 

4. Administration region 

The final list of options will be identified in collaboration with the intern. We plan to use SAT data from a pretest administration for the analysis. The results will be compared to the list of pretest items flagged under the current review process. 

Project 2: Impact of Population Variation on Equating Error and Scale Stability 

Testing programs usually offer more than one administration throughout a year, which may lead to test-taker population variation across those administrations. Meanwhile, equating results based on different populations may vary and it is important to evaluate the impact of population variation on equating to guide and inform operational equating practice. 

This research study will evaluate whether equating invariance holds for tests administered to test-takers with various characteristics. Data will be simulated in the IRT framework and evaluated by both classical and IRT equating methods. 

The primary research questions are (1) whether equating based on test-takers from different administrations yields similar results, and (2) which equating methods provide robust, stable equating solutions. This study will also investigate the impact of population variation on equating for different kinds of test forms (e.g., with vs. without an essay, easy vs. hard, and high vs. low reliability). 
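A minimal version of the population-invariance check described above might look like the following, using simple linear equating in place of the full classical and IRT toolkit; the two-group setup and all distributional choices are illustrative assumptions.

```python
import numpy as np

def linear_equating(x_scores, y_scores):
    """Linear equating of form X onto form Y: match the first two moments."""
    mu_x, sd_x = x_scores.mean(), x_scores.std()
    mu_y, sd_y = y_scores.mean(), y_scores.std()
    return lambda x: mu_y + (sd_y / sd_x) * (x - mu_x)

def max_invariance_gap(groups, grid):
    """Largest disagreement, over a score grid, between equating functions
    estimated separately within each test-taker subgroup."""
    curves = np.array([linear_equating(x, y)(grid) for x, y in groups])
    return float(np.max(curves.max(axis=0) - curves.min(axis=0)))
```

A small gap suggests the equating function is (approximately) invariant across the subpopulations; a large gap flags population sensitivity of the equating method.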

Skills Required: 

• Equating 

• Item Response theory 

• Strong programming skills in C++, SAS, or R 

To apply: 

International students who are studying at an accredited university under F1 visas are eligible to apply for the summer internship under Curricular Practical Training (CPT) stipulations. Please note that only two internship positions can be offered. International students should not apply for CPT unless accepted as a summer intern. 

Upon acceptance to the summer internship, we urge students to contact their respective international advisers at their host university as soon as possible to apply for a practical training certificate, which permits F1 visa holders to receive compensation from the College Board for the work they will be completing over the summer. The process to clear a student for CPT may take six weeks or longer. Therefore, we urge students to initiate the process as soon as possible. Additionally, all international students must have a social security number in order to receive compensation. 


National Board of Medical Examiners 

National Board of Medical Examiners Internship

Summer 2020 Internships in Assessment Science and Psychometrics

June 1 - July 24, 2020   Philadelphia, PA


The National Board of Medical Examiners® (NBME)® is a mission-driven, not-for-profit organization that serves the public by developing, administering, and conducting research on high-quality assessments for healthcare professionals. 

NBME programs include the United States Medical Licensing Examination®; an extensive offering of achievement tests for courses offered by medical schools; and numerous client examinations in medicine and other health professions. The variety of assessment programs creates numerous opportunities for applied and theoretical research that can impact practice.

The NBME employs approximately 30 doctoral level psychometricians and assessment scientists, as well as several MDs specializing in medical education. Staff is recognized internationally for its expertise in statistical analysis, psychometrics, and test development. 

Interns will interact with other graduate students and NBME staff, and will present completed projects or work-in-progress to NBME staff. Internships typically result in conference presentations (e.g., NCME) and sometimes lead to publication or dissertation topics.

Qualifications
  • Active enrollment in a doctoral program in measurement, statistics, cognitive science, medical education, or a related field; completion of two or more years of graduate coursework.
  • Experience or coursework in one or more of the following: test development, IRT, CTT, statistics, research design, and cognitive science. Advanced knowledge of topics such as equating, generalizability theory, or Bayesian methodology is helpful. Skill in writing and presenting research. Working knowledge of statistical software (e.g., Winsteps, BILOG, SPSS, SAS, or R). 
  • Interns will be assigned to one or more mentors, but must be able to work independently.
  • Must be authorized to work in the US for any employer. If selected, F-1 holders will need to apply for Curricular Practical Training authorization through their school’s international student office, and have a social security number for payroll purposes.

Compensation
Total compensation for the two months is approximately $9,800 and is intended to cover all major expenses (food, housing, travel).   

Research Projects

Interns will help define a research problem; review related studies; conduct data analyses (real and/or simulated data); and write a summary report suitable for presentation. Projects are summarized below. Applicants should identify, by number, the two projects they would prefer to work on.

  1. Application of Natural Language Processing (NLP) in the field of assessment: NLP applications in assessment have led to innovations and changes in how testing organizations design and score tests. Possible projects will investigate novel NLP applications, using real or simulated data, for various processes relevant to an operational testing program (e.g., test construction, key validation, standard setting). Results would inform possible improvements to current best practices.
  2. Modeling answer-change strategy in a high-stakes MCQ examination: In this project, we explore the use of the Rasch Poisson Count model (Rasch, 1960/1980) to extend the hierarchical speed-accuracy model (van der Linden, 2007) to model item revisits and answer-change behavior patterns in a high-stakes examination administered in an experimental setting. We propose to connect the elements of process data available from a computer-based test (correctness, response time, number of revisits to an item, the outcome of each revisit, IRT ability of the examinee, and IRT item characteristics) in a hierarchical latent trait model that explains an examinee’s decision to change an initial response. The relationship between working speed, ability, the number of visits, and the number of answer changes can be modeled using a multidimensional model that conceptualizes them as latent variables. The model should help us better understand the answer-change and cognitive behavior of examinees in a timed high-stakes examination.
  3. Performance Assessments: The intern will pursue research related to improving the precision and accuracy of a performance test involving physician interactions with standardized patients. Possible projects include designing an enhanced process for flagging aberrant ratings by trained raters and supporting research on standardized patients in a high-stakes exam.
  4. Measurement Instrument Revision and Development: This project will involve revising a commonly used measurement instrument so that appropriate inferences can be made with regard to medical students. Duties will include the following: working with subject-matter experts to revise the existing items; conducting think-alouds with medical students; developing a pilot measure of potential items; performing exploratory and confirmatory factor analysis of initial pilot results to gather structural validity evidence; developing a larger survey to gather concurrent and discriminant validity evidence with the revised measure; and administering and evaluating the larger survey.
  5. Characterizing (and Visualizing) Item Pool Health: The health of an item pool can be defined in a number of ways. Our current test development practices use have/need reports broken down by content area, and many content outlines are hierarchical in nature, with several layers of content coding and metadata. The problem is that the have/need ratios are, for the most part, one-dimensional, but details within the “have” portion of these ratios represent multidimensional information that can be used to improve multiple aspects of test development, including form construction, test security, pool management/maintenance, and targeting of item-writing assignments. The aims of this project are two-fold: (1) develop helpful, easily interpretable metrics to assess item pool health; and (2) employ a sophisticated visualization method of item pool health (e.g., via R Shiny, D3.js, .NET languages/libraries, etc.) to assist in improving one or more aspects of test development.
  6. Item Tagging and Mapping with Natural Language Processing: Test content outlines and specifications often change rapidly within cutting-edge domains. In response to these changes, test development teams must “map” pre-existing content onto the new content domains. Such a task is trivial when there are equivalent content domains between the new and old content outlines. However, this direct mapping rarely occurs, leaving item mapping to be done manually, a time-intensive task that is prone to human error and differences in subjective interpretation across raters. This project seeks to utilize and integrate natural language processing (NLP), machine learning (ML), and data visualization to (1) assist subject-matter experts with creating new content outlines; (2) help map items to new content domains; (3) review manual item mappings for accuracy as a quality-control measure; and (4) visually represent the content distribution within a group of items (e.g., a test form, item bank, etc.). A component of this project will be the use of sophisticated data visualization methods to allow subject-matter experts and test development staff to more easily examine items in multiple contexts. Strong candidates for this position will have knowledge of Python or a similar language and of common libraries used in NLP (e.g., Keras, TensorFlow, PyTorch, etc.).
  7. Computer-Assisted Scoring of Constructed Response Test Items: Recently the NBME has developed a computer-assisted scoring program that utilizes natural language processing (NLP). The two main components of the program are (1) ensuring that the information in the constructed response is correctly identified and represented; and (2) building a scoring model based on these concept representations. Current areas of research surrounding this project include (but are not limited to): refining quality-control steps to be taken before an item is used in computer-assisted scoring; linking and equating computer-assisted scores with human rater scores; evaluating a scoring method based on orthogonal arrays; and developing metrics that assess item quality and test reliability when computer-assisted scores and human scores are used to make classification decisions. The final project will be determined based on a combination of intern interest and project importance.
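The have/need ratio described in the item pool health project above can be made concrete with a small sketch. Everything below (the item bank, content codes, and target counts) is hypothetical, illustrative data rather than any testing program's actual metrics:

```python
from collections import Counter

# Hypothetical item bank: each item is tagged with a content-outline code.
# Codes, counts, and targets are made up for illustration.
bank = ["1.A", "1.A", "1.B", "2.A", "2.A", "2.A", "2.B"]
targets = {"1.A": 4, "1.B": 2, "2.A": 3, "2.B": 3}  # items needed per area

def pool_health(bank, targets):
    """Return the have/need ratio per content area; ratios below 1.0 flag shortages."""
    have = Counter(bank)
    return {area: have.get(area, 0) / need for area, need in targets.items()}

health = pool_health(bank, targets)
shortages = sorted(area for area, ratio in health.items() if ratio < 1.0)
```

A richer version would break the ratios down by the deeper layers of the content hierarchy and feed them to an interactive dashboard (e.g., R Shiny or D3.js) rather than a flat dictionary.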


Candidates may apply by going to  A cover letter outlining experience and listing project interests by number, along with a current resume, is required. The application deadline is February 3, 2020.


All applicants will be notified of selection decisions by February 21, 2020.


American Institutes for Research (AIR)

American Institutes for Research (AIR)
2020 NAEP Doctoral Student Internship Program

The mission of the NAEP Doctoral Student Internship Program is to advance and encourage secondary analysis and methodological developments using data from the National Assessment of Educational Progress (NAEP). During the 10-week internship program, interns will work directly with researchers from the National Center for Education Statistics (NCES) and the American Institutes for Research (AIR) on research in one of four topic areas:

Psychometrics & Statistical Methods

Process Data

Policy-Relevant Research

Envisioning Quantitative Information for Digital Media (Data Visualization)



  • Doctoral students in educational measurement, statistics, information science, sociology, psychology, economics, computer science, or other related fields.
  • Two years of coursework completed in a doctoral program OR a master’s degree completed with at least one (1) year of coursework completed in a doctoral program.
  • Must possess strong organizational and interpersonal skills.
  • Experience with applying advanced statistical and/or psychometric methods
  • Sophisticated experience with statistical software packages such as Stata, R, or Mplus.
  • Knowledge of Item Response Theory and/or sampling theory.
  • Experience analyzing data from large-scale surveys and assessment data with complex sampling design is a plus.


Internship Details

  • Application Deadline: February 18, 2020
  • Notification date: mid-March
  • 10 weeks starting May 26, 2020
  • Based in AIR’s office in Crystal City, VA
  • Paid internship (up to $12,000 for 10 weeks)


For more information please view the NAEP Summer Internship website.

Educational Testing Service (ETS)

2020- Educational Testing Service Internship

Interns in this eight-week program participate in research under the guidance of an ETS mentor. Each intern is required to give a brief presentation about the project at the conclusion of the internship. The internship is carried out in the ETS offices in Princeton, N.J. This year, projects may be conducted in the following research areas:

Research Area 1: English Language Learning and Assessment

Research Area 2: Career and Technical Education

Research Area 3: Teacher Diversity and Quality

Research Area 4: Design and Validity for Digital Assessment

Research Area 5: Modeling and Analyzing Examinee Response Processes

Research Area 6: Statistical and Psychometric Foundations

Research Area 7: Group-Score Assessment

Research Area 8: Applied Psychometrics

Research Area 9: Human and Automated Scoring

  • The application deadline is February 1, 2020.
  • Applicants will be notified of selection decisions by March 31, 2020.
  • Eight weeks: June 1, 2020–July 24, 2020
  • $6,000 salary
  • Transportation allowance for relocating to and from the Princeton area
  • Housing will be provided for interns commuting more than 50 miles
  • Current full-time enrollment in a relevant doctoral program
  • Completion of at least two years of coursework toward the doctorate prior to the program start date

For more information please view the ETS Internship Announcement.


Graduate Management Admission Council

Graduate Management Admission Council



 GMAC is pleased to offer a paid eight-week internship this summer to one advanced graduate student motivated to conduct innovative research with the Test Development and Psychometrics Department (TD&P). Possible research topics include (but are not limited to) CAT/MST, test security, non-cognitive assessment, automated item generation, machine learning, and natural language processing. Please visit for general information about the organization. 

Program Details 

  • Openings: 1
  • Duration: June 1 – July 24, 2020
  • Location: GMAC headquarters in Reston, VA – participant is required to be on-site for the duration of the program (remote work is not possible)
  • Stipend: $8,000 all-inclusive (relocation, housing, transportation, and living expenses are not covered separately)
  • Deliverable: In collaboration with an assigned mentor, the participant is expected to produce an original research paper for the GMAC Research Report (RR) series and/or a conference proposal for NCME, AERA, IMPS, etc.


Current enrollment in a doctoral program in Educational Psychology, Quantitative Psychology, Statistics, or a related field 

A minimum of three years of coursework in statistics and psychometrics, particularly covering the fundamentals of measurement theory (both CTT and IRT)

Demonstrated potential for quality research (e.g., completed or on-going projects, master’s thesis, conference presentations, publications) 

Experience in statistical programming (using R, SAS, Python, etc.) and familiarity with adaptive testing strongly preferred 


Materials: (1) CV/resume; (2) brief cover letter highlighting relevant experience and research interests; (3) transcript (unofficial is fine); (4) one letter of recommendation 

Submit all materials as email attachments to 

Deadline: February 16, 2020 - applicants will be notified of decisions by February 28 



The Association of International CPAs

The Association of International CPAs

The American Institute of CPAs (AICPA) Examinations Team continually strives to improve and strengthen the Uniform CPA Examination® (Exam) testing program through research on the use of enhanced technology and advanced measurement models. 

We have a long tradition of working with advanced graduate students through a summer internship program, providing students the opportunity to work on a state-of-the-art operational exam under the supervision of psychometricians at the AICPA. The program has been very successful and productive – internships typically lead to a published internal technical report and a presentation at a national psychometric conference. 

Program Description 

Our internship research opportunities are structured, project-based experiences, and interns should be prepared to work independently with direction from a supervising psychometrician. Psychometricians from the Examinations team will guide interns through the development, execution and write-up of a focused psychometric (or related) research study. Interns will work on topics based on their interests and skills from the team’s research agenda. 

Research Agenda 

Current research agenda topics may include, but not be limited to: 

  • Innovative methods to assess test security for traditional and innovative item formats 
  • Examining psychometric and other properties of items exhibiting drift 
  • Application of machine learning algorithms to assess item qualities, characteristics, and/or outcomes 
  • Examining model-data fit indices in the presence of sparse data matrices due to adaptive testing 
  • Local item dependence of innovative item formats 
  • Different approaches to scoring (3PL versus 2PL and/or Rasch model) and resulting advantages/disadvantages of each 
  • Alternative scoring formats, including summated scores versus IRT-derived scores and compensatory versus conjunctive scoring approaches 

In addition to the above topics, interns working on other/complementary psychometric research topics requiring data should feel free to apply, outlining the specific proposal in the application. 


Applicants should be graduate students in the latter phase of their program in Psychometrics, Educational Measurement, I/O Psychology, Statistics, Computer Science or a related field. Preference will be given to applicants who have completed their coursework and have strong evidence of research experience. 


Successful candidates will be hired for eight weeks during the summer of 2020, with the first week being on-site at the Exam team’s office in Ewing, New Jersey. While in Ewing, interns will work with staff mentors to develop the specifics of their assigned studies. Interns will then spend the remaining seven weeks at their home university, conducting their studies and delivering an agreed-upon set of deliverables that document their results and findings. Although interns will be assigned one or two mentors, they must be able to work independently.


Interns will be paid a $5,000 stipend. Additionally, the AICPA covers all travel expenses, including hotel and per diem costs for on-site visits. 

Additional Information 

For some background on internships and the application process, follow this link: 

How to Apply 

Interested applicants should submit the following: 

1. Cover letter with contact information (address, email, phone) and description of educational background 

2. Statement of research interests, skills and how they relate to the projects listed above (1 – 2 pages, double-spaced) 

3. Current curriculum vitae 

4. Two letters of recommendation 

Applications and questions regarding the internship program should be sent electronically to Matthew T. Schultz, Ph.D.


Applications must be received by Monday February 3, 2020. 


The National Commission on Certification of Physician Assistants 

NCCPA History and Mission

The National Commission on Certification of Physician Assistants (NCCPA) is the only nationally recognized certification organization for physician assistants (PAs). Established as a not-for-profit organization in 1974, NCCPA is dedicated to assuring the public that certified PAs meet established standards of clinical knowledge and cognitive skills upon entry into practice and throughout their careers. All U.S. states, the District of Columbia, and U.S. territories rely on NCCPA certification as one of the criteria for licensure or regulation of physician assistants. As of Dec. 31, 2018, there were approximately 131,000 certified PAs.

2020 Summer Internship Program

NCCPA is offering an eight-week program for students currently working toward their Ph.D. in psychometrics who have completed at least two years of graduate coursework. During the program, the intern will have the opportunity to gain experience in the operational psychometric tasks involved in administering and scoring a certification assessment and to collaborate on a research paper culminating in an NCME or AERA proposal. Past areas of research have included test delivery model research (e.g., ca-MST, CAT, LOFT), automated item generation (AIG), issues related to longitudinal assessment, data mining/machine learning, novel standard-setting approaches, and other applied psychometric research topics.

The program is scheduled to run from early June and end in late July/early August, with the first week being on-site at the NCCPA office located in Johns Creek, GA. The intern will spend the remaining seven weeks at their university or other location to conduct the agreed upon research and develop the deliverables. All finalized deliverables will be provided to NCCPA at the completion of the eight-week internship.

Each candidate should submit

  • a curriculum vitae,
  • a copy of a graduate school transcript (does not need to be official),
  • two letters of recommendation, and
  • a statement of purpose describing their interest in the internship and general research interests.

Application materials may be submitted online or mailed to NCCPA and must be received by February 17, 2020.
To apply for this internship please visit here.

The internship award will be announced by March 9, 2020. The award includes a $6,000 stipend. In addition, travel expenses for the trips to the NCCPA offices will be reimbursed in accordance with NCCPA's policies.



CFA Institute

CFA Institute

Psychometric Intern

Job Description Summary: 

• Conduct dimensionality analyses to determine whether CFA Institute programs' reporting topics are dimensionally distinct.
• Perform analyses to support validity and reliability work regarding item type evaluation.
• Write up and present results of studies.

This paid internship position will be located in the Charlottesville, VA office.

Job Description:


  • Graduate-level students are welcome to apply
  • Ability to program in R, SAS and/or Winsteps
  • Ability to perform dimensionality studies

Pay will be $15 an hour and the internship runs from June 1st – August 7th, 2020. 

The application deadline is January 31st, 2020.
To apply for this opportunity please visit here.

American College Testing (ACT) Psychometric Intern

ACT Psychometric Research Intern


ACT is a nonprofit organization helping people achieve educational and workplace success. Our programs are designed to boost lifelong learning in schools and workplaces around the world. Whether it's guiding students along their learning paths, enabling companies to develop their workforce, fostering parent, teacher, and counselor understanding of student progress, guiding job seekers toward career success, or informing policymakers about education and workforce issues, ACT is passionate about making a difference in all we do.


Position Objective: Conduct a study to investigate the usage of item latency information in detecting aberrant testing behaviors in computer-based tests.

Typical job activities include:

  • designing a research study

  • writing programs and conducting research

  • summarizing the results

  • writing research reports

  • presenting the outcome of the research and submitting a proposal based on the study to NCME/AERA
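The latency-based flagging idea can be illustrated with a toy sketch: compare each examinee's log response times to item-level norms and flag examinees who are consistently far below them. The data, the cutoff, and the function below are all hypothetical, not ACT's actual method:

```python
import math
from statistics import mean, stdev

# Hypothetical response times in seconds (rows = examinees, columns = items).
# The data and the z-score cutoff are illustrative only.
times = {
    "ex1": [42.0, 55.0, 38.0, 61.0],
    "ex2": [40.0, 50.0, 35.0, 58.0],
    "ex3": [3.0, 2.5, 4.0, 3.5],   # suspiciously fast on every item
}

def flag_rapid_responders(times, z_cut=-1.0):
    """Flag examinees whose mean log-latency z-score falls below z_cut.

    Log times are compared to item-level norms; a loose cutoff is used
    here because the toy sample has only three examinees.
    """
    n_items = len(next(iter(times.values())))
    # Item-level norms on the log scale (log latencies are closer to normal).
    norms = []
    for j in range(n_items):
        col = [math.log(row[j]) for row in times.values()]
        norms.append((mean(col), stdev(col)))
    flagged = []
    for examinee, row in times.items():
        zs = [(math.log(t) - m) / s for t, (m, s) in zip(row, norms)]
        if mean(zs) < z_cut:
            flagged.append(examinee)
    return flagged
```

An operational study would use far larger samples, model response times jointly with responses, and tune the cutoff against known aberrant behavior rather than a fixed value.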


Minimum Qualifications

Education: Currently enrolled in a graduate program in educational measurement, statistics, or a related field.

Knowledge, Skills and Abilities:

  • knowledge and research experience in different statistical models

  • computer programming skills in SAS, R, and/or Python

  • good writing skills

For more information please see the job posting.

American College Testing (ACT) Internship: Survey, Validity & Efficacy Research

ACT Survey, Validity & Efficacy Research Intern




Position objective: The summer intern will provide support in furthering ACT’s understanding of the environmental factors that influence students’ attitudes toward, decision-making about, and usage of test preparation. This work will help us better articulate the key levers in test preparation usage through a literature review and with the implementation of a research study. The results of this work will be used as the foundation to investigate the impact of the introduction of modular testing and superscoring on test prep. Co-authorship of publications and submissions to national conferences will also be considered.

Description of type of work/area of focus: The summer intern will capitalize on ecological factors (e.g., systems theory and thinking) in program evaluation to better understand test preparation use. Work will focus on studying ecological factors (e.g., the application of systems theory) in understanding current research on test preparation followed by the design and implementation of a pilot study. The study will help ACT to understand the environmental factors that influence students’ attitudes toward, decision-making about, and usage of test preparation. Work will include: a) the documentation of methodological approaches, b) a framework that applies these methodological approaches to test preparation, c) the implementation of a pilot study that applies this framework, and d) a write-up of findings from the pilot study. A paper will be written that synthesizes this work. Submission to national conferences will also be considered.

Typical work-related activities will include:

  • Conducting a review of the literature on topics of focus

  • Providing support in designing and implementing a study (e.g., development of interview protocols or surveys)

  • Analyzing pilot study data

  • Participating in regular meetings with supervisors

  • Presenting findings to technical and non-technical audiences

  • Coauthoring papers


Minimum Qualifications: Candidates should be currently enrolled in a doctoral program in program evaluation or in a social science (e.g., education, psychology, sociology) research program. Completion of two years of graduate school at the time of the internship is required.

Experience Requirements/Preferences: Currently enrolled in a relevant doctoral program with program evaluation and research design training. Prior experience conducting evaluation/research preferred.

Knowledge, Skills and Abilities: Knowledge of and experience in program evaluation and/or systems theory and research methods (mixed methods preferred), along with good professional writing and presentation skills, are required.

For more information please see the job posting.

American College Testing (ACT) Assessment Transformation Internship

ACT Assessment Transformation Internship




Position Objective: A successful intern will learn about methods for studying the comparability of paper and online assessments and develop statistical analysis programs to examine mode comparability. Through this work, the intern will gain experience with common psychometric analyses supporting large-scale assessment programs. The intern will work with Assessment Transformation staff to write an AERA or NCME proposal describing recent mode comparability studies for the ACT.
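One basic building block of a mode comparability analysis is a standardized mean difference between scores from the two administration modes. The sketch below uses made-up scores and is not ACT's actual procedure:

```python
from statistics import mean, stdev

# Hypothetical scale scores from matched paper and online administrations.
paper  = [20, 22, 19, 24, 21, 23, 20, 22]
online = [21, 23, 20, 25, 22, 24, 21, 23]

def pooled_smd(a, b):
    """Cohen's d: mean difference divided by the pooled standard deviation."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * stdev(a) ** 2 + (nb - 1) * stdev(b) ** 2) / (na + nb - 2)
    return (mean(a) - mean(b)) / pooled_var ** 0.5

d = pooled_smd(paper, online)   # negative here: the online group scored higher
```

An operational comparability study would go well beyond mean differences, also comparing score distributions, item parameters, and equating functions across modes.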


Minimum Qualifications: Currently pursuing a graduate degree in educational measurement, psychometrics, educational statistics, educational psychology, or a related field.

Experience Requirements/Preferences: Two years of doctoral studies in a relevant field

Knowledge, Skills and Abilities:

  • classical test theory

  • item response theory

  • equating

  • statistical analysis and programming

  • data visualization

  • technical writing

  • teamwork

For more information please see the job posting.

ETS Post Doctoral Fellowship 

ETS Post Doctoral Fellowship


Individuals who have earned their doctoral degree within the last three years are invited to apply for a rewarding fellowship experience which combines working on cutting-edge ETS research projects and conducting independent research that is relevant to ETS's goals. The fellowship is carried out in the ETS offices in Princeton, N.J. This year we are seeking applicants with experience in the following areas:

  • Applied Psychometrics
  • Artificial Intelligence Based Automated Scoring
  • Modeling and Scoring Item Responses from Interactive and Simulation-Based Assessments
  • Modeling of Response Processes and Response Times
  • Psychometric Issues in Adaptive Testing Designs
  • Statistical and Psychometric Foundations
  • Statistical and Psychometric Issues in Group-Scored Assessments
Program Goals
  • Provide research opportunities to individuals who hold a doctorate in the fields indicated above
  • Enhance the diversity and inclusion among underrepresented groups in conducting research in educational assessment and related fields
Important Dates
  • March 1, 2020 — deadline for preliminary application
  • April 15, 2020 — deadline for final application materials
Duration of Program

The fellowship is for a period of up to two years, renewable after the first year by mutual agreement.

  • Competitive salary
  • $5,000 one-time relocation incentive for round-trip relocation expenses
  • Employee benefits, vacation, holidays and other paid leave in accordance with ETS policies
  • Doctorate in a relevant discipline within the past three years
  • Evidence of prior independent research

 For more information please visit the ETS Post Doctoral Fellowship announcement.

ETS Harold Gulliksen Psychometric Research Fellowship 

ETS Harold Gulliksen Psychometric Research Fellowship


During the summer, selected fellows are required to participate in the Summer Internship Program in Research for Graduate Students, working under the guidance of an ETS mentor. During the subsequent academic year, fellows study at their universities and carry out research under the supervision of an academic mentor and in consultation with their ETS mentor.

Program Goals

The goal of this program is to increase the number of well-trained scientists in educational assessment, psychometrics and statistics.

Important Dates
    • December 31, 2019 — Deadline for receipt of preliminary application materials
    • January 15, 2020 — Applicants are notified of preliminary application decision
    • February 14, 2020 — Deadline for receipt of final application materials
    • March 31, 2020 — Award recipients are notified
Duration of Program

Appointments are for one year.
Award Value

Each fellow's university receives the following:

  • $20,000 to pay a stipend to the fellow
  • $8,000 to defray the fellow's tuition, fees and work-study program commitments
  • A small grant to facilitate work on the fellow's research project

Selected fellow must participate in the Summer Internship Program in Research for Graduate Students. The fellow will receive the following:

  • $6,000 salary
  • Transportation allowance for relocating to and from the Princeton area
  • Housing will be provided for interns commuting more than 50 miles

At the time of application, candidates must be enrolled in a doctoral program, have completed all coursework toward the doctorate, and be at the dissertation stage of their program. Dissertation topics in the areas of psychometrics, statistics, educational measurement or quantitative methods will be given priority. At the time of application, candidates will also be asked to provide a statement describing any additional financial assistance, such as an assistantship or grant commitment, that they will have during the fellowship period.

For more information please visit the Harold Gulliksen Psychometric Research Fellowship announcement.