
Center for Excellence in Teaching


Short essay question rubric

Sample grading rubric an instructor can use to assess students’ work on short essay questions.


Rubric Best Practices, Examples, and Templates

A rubric is a scoring tool that identifies the different criteria relevant to an assignment, assessment, or learning outcome and states the possible levels of achievement in a specific, clear, and objective way. Use rubrics to assess project-based student work including essays, group projects, creative endeavors, and oral presentations.

Rubrics can help instructors communicate expectations to students and assess student work fairly, consistently and efficiently. Rubrics can provide students with informative feedback on their strengths and weaknesses so that they can reflect on their performance and work on areas that need improvement.

How to Get Started


  • Workshop Recording (Spring 2024)
  • Workshop Registration

Step 1: Analyze the assignment

The first step in the rubric creation process is to analyze the assignment or assessment for which you are creating a rubric. To do this, consider the following questions:

  • What is the purpose of the assignment and your feedback? What do you want students to demonstrate through the completion of this assignment (i.e. what are the learning objectives measured by it)? Is it a summative assessment, or will students use the feedback to create an improved product?
  • Does the assignment break down into different or smaller tasks? Are these tasks equally important as the main assignment?
  • What would an “excellent” assignment look like? An “acceptable” assignment? One that still needs major work?
  • How detailed do you want the feedback you give students to be? Do you want/need to give them a grade?

Step 2: Decide what kind of rubric you will use

Types of rubrics: holistic, analytic/descriptive, single-point

Holistic Rubric. A holistic rubric includes all the criteria (such as clarity, organization, mechanics, etc.) to be considered together and included in a single evaluation. With a holistic rubric, the rater or grader assigns a single score based on an overall judgment of the student’s work, using descriptions of each performance level to assign the score.

Advantages of holistic rubrics:

  • Can place an emphasis on what learners can demonstrate rather than what they cannot
  • Save grader time by minimizing the number of evaluations to be made for each student
  • Can be used consistently across raters, provided they have all been trained

Disadvantages of holistic rubrics:

  • Provide less specific feedback than analytic/descriptive rubrics
  • Can be difficult to choose a score when a student’s work is at varying levels across the criteria
  • Any weighting of criteria cannot be indicated in the rubric

Analytic/Descriptive Rubric. An analytic or descriptive rubric often takes the form of a table with the criteria listed in the left column and with levels of performance listed across the top row. Each cell contains a description of what the specified criterion looks like at a given level of performance. Each of the criteria is scored individually.

Advantages of analytic rubrics:

  • Provide detailed feedback on areas of strength or weakness
  • Each criterion can be weighted to reflect its relative importance

Disadvantages of analytic rubrics:

  • More time-consuming to create and use than a holistic rubric
  • May not be used consistently across raters unless the cells are well defined
  • May result in giving less personalized feedback

Single-Point Rubric. A single-point rubric breaks down the components of an assignment into different criteria, but instead of describing different levels of performance, only the “proficient” level is described. Feedback space is provided for instructors to give individualized comments to help students improve and/or show where they excelled beyond the proficiency descriptors.

Advantages of single-point rubrics:

  • Easier to create than an analytic/descriptive rubric
  • Perhaps more likely that students will read the descriptors
  • Areas of concern and excellence are open-ended
  • May remove a focus on the grade/points
  • May increase student creativity in project-based assignments

Disadvantage of single-point rubrics: Requires more work for instructors writing feedback

Step 3 (Optional): Look for templates and examples.

You might Google “Rubric for persuasive essay at the college level” and see if there are any publicly available examples to start from. Ask your colleagues if they have used a rubric for a similar assignment. Some examples are also available at the end of this article. These rubrics can be a great starting point for you, but consider steps 4, 5, and 6 below to ensure that the rubric matches your assignment description, learning objectives, and expectations.

Step 4: Define the assignment criteria

Make a list of the knowledge and skills you are measuring with the assignment/assessment. Refer to your stated learning objectives, the assignment instructions, past examples of student work, etc. for help.

  Helpful strategies for defining grading criteria:

  • Collaborate with co-instructors, teaching assistants, and other colleagues
  • Brainstorm and discuss with students
  • Evaluate each potential criterion: Can it be observed and measured? Is it important and essential? Is it distinct from other criteria? Is it phrased in precise, unambiguous language?
  • Revise the criteria as needed
  • Consider whether some criteria are more important than others, and how you will weight them.

Step 5: Design the rating scale

Most rating scales include between 3 and 5 levels. Consider the following questions when designing your rating scale:

  • Given what students are able to demonstrate in this assignment/assessment, what are the possible levels of achievement?
  • How many levels would you like to include? (More levels require more detailed descriptions.)
  • Will you use numbers and/or descriptive labels for each level of performance? (for example 5, 4, 3, 2, 1 and/or Exceeds expectations, Accomplished, Proficient, Developing, Beginning, etc.)
  • Don’t use too many columns, and recognize that some criteria can have more columns than others. The rubric needs to be comprehensible and organized. Pick the right number of columns so that the criteria flow logically and naturally across levels.

Step 6: Write descriptions for each level of the rating scale

Artificial intelligence tools like ChatGPT have proven useful for creating a rubric. You will want to engineer the prompt that you provide to the AI assistant to ensure you get what you want. For example, you might provide the assignment description, the criteria you feel are important, and the number of levels of performance you want in your prompt. Use the results as a starting point, and adjust the descriptions as needed.
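For instance, the pieces of such a prompt can be assembled programmatically before you paste it into an AI assistant. The sketch below is only illustrative: the assignment description, criterion names, and level labels are hypothetical placeholders you would replace with your own.

```python
# Minimal sketch: assembling a rubric-generation prompt for an AI assistant.
# The assignment text, criteria, and level labels below are illustrative
# placeholders, not values prescribed by this guide.

assignment_description = "Write a 1,500-word persuasive essay on a current policy issue."
criteria = ["Thesis and argument", "Use of evidence", "Organization", "Grammar and mechanics"]
levels = ["Exceeds expectations", "Proficient", "Developing", "Beginning"]

prompt = (
    "Create an analytic grading rubric for the following assignment.\n"
    f"Assignment: {assignment_description}\n"
    f"Criteria to include: {', '.join(criteria)}\n"
    f"Performance levels (highest to lowest): {', '.join(levels)}\n"
    "For each criterion, write one observable, measurable descriptor per level, "
    "using parallel language across levels."
)

print(prompt)  # paste the result into your AI assistant, then revise the output
```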

Building a rubric from scratch

For a single-point rubric , describe what would be considered “proficient,” i.e., B-level work. You might also include suggestions for students outside of the actual rubric about how they might surpass proficient-level work.

For analytic and holistic rubrics , create statements of expected performance at each level of the rubric.

  • Consider what descriptor is appropriate for each criterion, e.g., presence vs absence, complete vs incomplete, many vs none, major vs minor, consistent vs inconsistent, always vs never. If you have an indicator described in one level, it will need to be described in each level.
  • You might start with the top/exemplary level. What does it look like when a student has achieved excellence for each/every criterion? Then, look at the “bottom” level. What does it look like when a student has not achieved the learning goals in any way? Then, complete the in-between levels.
  • For an analytic rubric , do this for each particular criterion of the rubric so that every cell in the table is filled. These descriptions help students understand your expectations and their performance in regard to those expectations.

Well-written descriptions:

  • Describe observable and measurable behavior
  • Use parallel language across the scale
  • Indicate the degree to which the standards are met

Step 7: Create your rubric

Create your rubric in a table or spreadsheet in Word, Google Docs, Sheets, etc., and then transfer it by typing it into Moodle. You can also use online tools to create the rubric, but you will still have to type the criteria, indicators, levels, etc., into Moodle. Rubric creators: Rubistar, iRubric

Step 8: Pilot-test your rubric

Prior to implementing your rubric on a live course, obtain feedback from:

  • Teaching assistants

Try out your new rubric on a sample of student work. After you pilot-test your rubric, analyze the results to consider its effectiveness and revise accordingly.

  • Limit the rubric to a single page for reading and grading ease
  • Use parallel language. Use similar language and syntax/wording from column to column. Make sure that the rubric can be easily read from left to right or vice versa.
  • Use student-friendly language. Make sure the language is learning-level appropriate. If you use academic language or concepts, you will need to teach those concepts.
  • Share and discuss the rubric with your students. Students should understand that the rubric is there to help them learn, reflect, and self-assess. If students use a rubric, they will understand the expectations and their relevance to learning.
  • Consider scalability and reusability of rubrics. Create rubric templates that you can alter as needed for multiple assignments.
  • Maximize the descriptiveness of your language. Avoid words like “good” and “excellent.” For example, instead of saying, “uses excellent sources,” you might describe what makes a resource excellent so that students will know. You might also consider reducing the reliance on quantity, such as a number of allowable misspelled words. Focus instead, for example, on how distracting any spelling errors are.

Example of an analytic rubric for a final paper

Each criterion is rated on a four-level scale: Above Average (4), Sufficient (3), Developing (2), Needs improvement (1).

Thesis supported by relevant information and ideas

  • Above Average (4): The central purpose of the student work is clear and supporting ideas are always well-focused. Details are relevant and enrich the work.
  • Sufficient (3): The central purpose of the student work is clear and ideas are almost always focused in a way that supports the thesis. Relevant details illustrate the author’s ideas.
  • Developing (2): The central purpose of the student work is identified. Ideas are mostly focused in a way that supports the thesis.
  • Needs improvement (1): The purpose of the student work is not well-defined. A number of central ideas do not support the thesis. Thoughts appear disconnected.

Sequencing of elements/ideas

  • Above Average (4): Information and ideas are presented in a logical sequence which flows naturally and is engaging to the audience.
  • Sufficient (3): Information and ideas are presented in a logical sequence which is followed by the reader with little or no difficulty.
  • Developing (2): Information and ideas are presented in an order that the audience can mostly follow.
  • Needs improvement (1): Information and ideas are poorly sequenced. The audience has difficulty following the thread of thought.

Correctness of grammar and spelling

  • Above Average (4): Minimal to no distracting errors in grammar and spelling.
  • Sufficient (3): The readability of the work is only slightly interrupted by spelling and/or grammatical errors.
  • Developing (2): Grammatical and/or spelling errors distract from the work.
  • Needs improvement (1): The readability of the work is seriously hampered by spelling and/or grammatical errors.
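Where criteria are weighted differently (see Step 4 above), per-criterion ratings from a rubric like this can be combined into a single score. Below is a minimal sketch: the 4-1 levels come from the example rubric above, but the weights and the sample essay's ratings are hypothetical choices, not values prescribed by this guide.

```python
# Minimal sketch of weighted analytic scoring, using the 4-1 scale from the
# example rubric above. The weights and the sample ratings are hypothetical;
# choose weights that reflect each criterion's importance in your own rubric.

weights = {
    "Thesis supported by relevant information and ideas": 0.5,
    "Sequencing of elements/ideas": 0.3,
    "Correctness of grammar and spelling": 0.2,
}

ratings = {  # levels: 4 = Above Average ... 1 = Needs improvement
    "Thesis supported by relevant information and ideas": 3,
    "Sequencing of elements/ideas": 4,
    "Correctness of grammar and spelling": 2,
}

max_level = 4
weighted_score = sum(weights[c] * ratings[c] for c in weights)  # out of 4
percentage = 100 * weighted_score / max_level                   # out of 100

print(f"Weighted score: {weighted_score:.1f} / {max_level} ({percentage:.0f}%)")
# Weighted score: 3.1 / 4 (78%)
```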

Example of a holistic rubric for a final paper

  • The audience is able to easily identify the central message of the work and is engaged by the paper’s clear focus and relevant details. Information is presented logically and naturally. There are minimal to no distracting errors in grammar and spelling.
  • The audience is easily able to identify the focus of the student work, which is supported by relevant ideas and supporting details. Information is presented in a logical manner that is easily followed. The readability of the work is only slightly interrupted by errors.
  • The audience can identify the central purpose of the student work with little difficulty, and supporting ideas are present and clear. The information is presented in an orderly fashion that can be followed with little difficulty. Grammatical and spelling errors distract from the work.
  • The audience cannot clearly or easily identify the central ideas or purpose of the student work. Information is presented in a disorganized fashion, causing the audience to have difficulty following the author’s ideas. The readability of the work is seriously hampered by errors.

Single-Point Rubric

A single-point rubric has three columns: a space for Advanced work (evidence of exceeding standards), the criteria described at a proficient level, and a space for Concerns (things that need work).

  • Criteria #1: Description reflecting achievement of proficient level of performance
  • Criteria #2: Description reflecting achievement of proficient level of performance
  • Criteria #3: Description reflecting achievement of proficient level of performance
  • Criteria #4: Description reflecting achievement of proficient level of performance

Scoring: 90-100 points (Advanced), 80-90 points (Proficient), <80 points (Concerns).

More examples:

  • Single Point Rubric Template (variation)
  • Analytic Rubric Template (make a copy to edit)
  • A Rubric for Rubrics
  • Bank of Online Discussion Rubrics in different formats
  • Mathematical Presentations Descriptive Rubric
  • Math Proof Assessment Rubric
  • Kansas State Sample Rubrics
  • Design Single Point Rubric

Technology Tools: Rubrics in Moodle

  • Moodle Docs: Rubrics
  • Moodle Docs: Grading Guide (use for single-point rubrics)

Tools with rubrics (other than Moodle)

  • Google Assignments
  • Turnitin Assignments: Rubric or Grading Form

Other resources

  • DePaul University (n.d.). Rubrics .
  • Gonzalez, J. (2014). Know your terms: Holistic, Analytic, and Single-Point Rubrics . Cult of Pedagogy.
  • Goodrich, H. (1996). Understanding rubrics. Teaching for Authentic Student Performance, 54(4), 14-17.
  • Miller, A. (2012). Tame the beast: tips for designing and using rubrics.
  • Ragupathi, K., Lee, A. (2020). Beyond Fairness and Consistency in Grading: The Role of Rubrics in Higher Education. In: Sanger, C., Gleason, N. (eds) Diversity and Inclusion in Global Higher Education. Palgrave Macmillan, Singapore.

Essay Rubric: Grading Students Correctly


Lecturers and tutors provide specific requirements for students to meet when writing essays. Basically, an essay rubric helps tutors to analyze the overall quality of compositions written by students. In this case, a rubric refers to a scoring guide used to evaluate performance based on a set of criteria and standards. As such, useful marking schemes make the analysis process simple for lecturers because they focus attention on specific concepts related to the writing process. Moreover, an assessment table lists and organizes all of the criteria in one convenient document. In other instances, students use assessment tables to enhance their writing skills by examining the various requirements. Then, different types of essay rubrics vary from one educational level to another. Essentially, Master’s and Ph.D. grading schemes focus on examining complex thesis statements and other writing mechanics, whereas high school evaluation tables examine basic writing concepts. In turn, the guidelines on a common format for writing a good essay rubric and the corresponding examples provided in this article can help students to evaluate their papers before submitting them to their teachers.

General Aspects

An essay rubric refers to a way for teachers to assess students’ composition writing skills and abilities. Basically, an evaluation scheme provides specific criteria to grade assignments. Moreover, the three basic elements of an essay rubric are criteria, performance levels, and descriptors. In this case, teachers use assessment guidelines to save time when evaluating and grading various papers. Hence, learners must use an essay rubric effectively to achieve desired goals and grades.

What Is an Essay Rubric and Its Purpose

According to its definition, an essay rubric is a structured evaluation tool that educators use to grade students’ compositions in a fair and consistent manner. The main purpose of an essay rubric in writing is to ensure consistent and fair grading by clearly defining what constitutes excellent, good, average, and poor performance (DeVries, 2023). This tool specifies the key criteria for grading various aspects of a written text, including the clarity of a thesis statement, the overall quality of the main argument, the organization of ideas, the use of evidence, and the correctness of grammar and mechanics. Moreover, such an assessment helps students to understand their strengths and weaknesses and guides them in improving their writing skills (Taylor et al., 2024). For teachers, it simplifies the grading process, making it more efficient and less subjective by providing a clear standard to follow. By using an essay rubric, both teachers and students can engage in a transparent, structured, and constructive evaluation process, enhancing the overall educational experience (Stevens & Levi, 2023). In turn, the length of an essay rubric depends on academic levels, types of papers, and specific requirements, while general guidelines are:

High School

  • Length: 1-2 pages
  • Word Count: 300-600 words

College

  • Length: 1-3 pages
  • Word Count: 300-900 words

University (Undergraduate)

  • Length: 2-4 pages
  • Word Count: 600-1,200 words

Master’s

  • Length: 2-5 pages
  • Word Count: 600-1,500 words

Ph.D.

  • Length: 3-6 pages
  • Word Count: 900-1,800 words

Essay rubric

  • Thesis Statement: A well-defined thesis statement is crucial as it sets a particular direction and purpose of an essay, making it clear what a writer intends to argue or explain.
  • Introduction: An introduction captures a reader’s interest and provides a framework for what a paper will cover, setting up a stage for arguments or ideas that follow after an opening paragraph.
  • Content: High-quality content demonstrates thorough understanding and research on a specific topic, providing valuable and relevant information that supports a thesis.
  • Organization: Effective organization ensures an author’s ideas are presented in a clear, well-structured, and logical order, enhancing readability and the overall flow of a central argument.
  • Evidence and Support: Providing strong evidence and detailed analysis is essential for backing up main arguments, adding credibility and depth to a final document.
  • Conclusion: A strong conclusion ties all the main points together, reflects on potential implications of arguments, and reinforces a thesis, leaving a lasting impression on a reader.
  • Grammar and Mechanics: Proper grammar, spelling, and punctuation are vital for clarity and professionalism, making a whole text easy to read and understand.
  • Style and Tone: A writing style and authorial tone appropriate to a paper’s purpose and audience enhance the overall effectiveness of a particular text and engage a reader.
  • Citations and References: Accurate and complete citations and references are crucial for giving credit to sources, avoiding plagiarism, and allowing readers to follow up on the research.

Note: Some elements of an essay rubric can be added, deleted, or combined with each other because different types of papers, their requirements, and instructors’ choices affect a final assessment. To format an essay rubric, people create a table with criteria listed in rows, performance levels in columns, and detailed descriptors in each cell explaining principal expectations for each level of performance (Stevens & Levi, 2023). Besides, the five main criteria in a rubric are thesis statement, content, organization, evidence and support, and grammar and mechanics. In turn, a good essay rubric is clear, specific, aligned with learning objectives, and provides detailed, consistent descriptors for each performance level.

Steps How to Write an Essay Rubric

In writing, the key elements of an essay rubric are clear criteria, defined performance levels, and detailed descriptors for each evaluation; a short sketch after the list below shows one way these pieces fit together.

  • Identify a Specific Purpose and Goals: Determine main objectives of an essay’s assignment and consider what skills and knowledge you want students to demonstrate.
  • List Key Criteria: Identify the essential components that need to be evaluated, such as thesis statement, introduction, content, organization, evidence and support, conclusion, grammar and mechanics, writing style and tone, and citations and references.
  • Define Performance Levels: Decide on a particular scale you will use to measure performance (e.g., Excellent, Good, Fair, Poor) and ensure each level is distinct and clearly defined.
  • Create Descriptors for Each Criterion: Write detailed descriptions for what constitutes each level of performance for every criterion and be specific about what is expected at each level to avoid misunderstanding.
  • Assign Number Values: Determine a specific range for each criterion and performance level and allocate numbers in a way that reflects an actual importance of each criterion in an overall assessment.
  • Review and Revise: Examine a complete rubric to ensure it is comprehensive and clear and adjust any descriptions or number values that seem unclear or disproportionate.
  • Test a Working Essay Rubric: Apply a grading scheme to a few sample compositions to see if it effectively differentiates between different levels of performance and make adjustments as necessary.
  • Involve Peers for Feedback: Share marking criteria with colleagues or peers for feedback and insights on clarity and fairness that you might have overlooked.
  • Provide Examples: Include examples of complete papers or writing excerpts at each performance level and help students to understand what is expected for grading.
  • Communicate With Students: Share a complete rubric with students before they begin an assignment and explain each criterion and performance level so they understand how their work will be evaluated and what they need to do to achieve highest marks.
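To make the relationship between criteria, performance levels, descriptors, and number values concrete, here is a minimal sketch of one way the pieces produced by these steps could be organized. Every criterion name, level label, descriptor, and point value in it is a hypothetical example, not a prescribed scheme.

```python
# Minimal sketch of one way to organize the pieces described above:
# criteria, performance levels, per-level descriptors, and point values.
# Every criterion, level, and point value here is a hypothetical example.

performance_levels = {"Excellent": 8, "Good": 6, "Fair": 4, "Poor": 2}

rubric = {
    "Thesis statement": {
        "Excellent": "Clearly stated, focused, and debatable thesis.",
        "Good": "Clearly stated thesis; focus could be sharper.",
        "Fair": "Thesis present but simple or vaguely worded.",
        "Poor": "Thesis missing or not evident in the introduction.",
    },
    "Organization": {
        "Excellent": "Logical order with clear topic sentences throughout.",
        "Good": "Mostly logical order; most topic sentences are clear.",
        "Fair": "Inconsistent order; some topic sentences unclear.",
        "Poor": "No discernible plan of organization.",
    },
}

def score(awarded_levels):
    """Total the point values for the level awarded to each criterion."""
    return sum(performance_levels[level] for level in awarded_levels.values())

# Example: grade one essay on the two hypothetical criteria above
print(score({"Thesis statement": "Good", "Organization": "Excellent"}))  # 14
```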

Essay Rubric Example

Organization

Excellent/8 points: A submitted essay contains strong topic sentences and a controlled organization.

Very Good/6 points: A paper contains a logical and appropriate organization. An author uses clear topic sentences.

Average/4 points: A composition contains a mostly logical and appropriate organization. Some topic sentences lack clarity.

Needs Improvement/2 points: A provided text has an inconsistent organization.

Unacceptable/0 (zero): A complete document shows an absence of a planned organization.

Grade: ___ .

Excellent/8 points: A submitted essay contains precise and varied sentence structures and strong word choices throughout.

Very Good/6 points: A paper contains precise and varied sentence structures and word choices. 

Average/4 points: A composition follows a limited but mostly correct sentence structure. There are different sentence structures and word choices.

Needs Improvement/2 points: A provided text contains several awkward and unclear sentences. There are some problems with word choices.

Unacceptable/0 (zero): An author does not have apparent control over sentence structures and word choice.

Excellent/8 points: An essay’s content appears sophisticated and contains well-developed ideas.

Very Good/6 points: A paper’s content appears illustrative and balanced.

Average/4 points: A composition contains unbalanced content that requires more analysis.

Needs Improvement/2 points: A provided text contains a lot of research information without analysis or commentary.

Unacceptable/0 (zero): A complete document lacks relevant content and does not fit the thesis statement. Essay rubric rules are not followed.

Excellent/8 points: A submitted essay contains a clearly stated and focused thesis statement.

Very Good/6 points: A paper comprises a clearly stated argument. However, a particular focus would have been sharper.

Average/4 points: A thesis statement phrasing sounds simple and lacks complexity. An author does not word the thesis correctly. 

Needs Improvement/2 points: A thesis statement requires a clear objective and does not fit the theme in a paper’s content.

Unacceptable/0 (zero): A thesis statement is not evident in an introduction paragraph.

Excellent/8 points: A submitted essay is clear and focused. An overall work holds a reader’s attention. Besides, relevant details and quotes enrich a thesis statement.

Very Good/6 points: A paper is mostly focused and contains a few useful details and quotes.

Average/4 points: An author begins a composition by defining an assigned topic. However, a particular development of ideas appears general.

Needs Improvement/2 points: An author fails to define an assigned topic well or focuses on several issues.

Unacceptable/0 (zero): A complete document lacks a clear sense of a purpose or thesis statement. Readers have to make guesses based on sketchy or missing ideas to understand an intended meaning. Essay rubric requirements are missed.

Sentence Fluency

Excellent/8 points: A submitted essay has a natural flow, rhythm, and cadence. Its sentences are well-built and have a wide-ranging and robust structure that enhances reading.

Very Good/6 points: Presented ideas mostly flow and motivate a compelling reading.

Average/4 points: A composition hums along with a balanced beat but tends to be more businesslike than musical. Besides, a particular flow of ideas tends to become more mechanical than fluid.

Needs Improvement/2 points: A provided text appears irregular and hard to read.

Unacceptable/0 (zero): Readers have to go through a complete document several times to give this paper a fair interpretive reading.

Conventions

Excellent/8 points: An author demonstrates proper use of standard writing conventions, like spelling, punctuation, capitalization, grammar, usage, and paragraphing. A person also uses correct protocols in a way that improves an overall readability of an essay.

Very Good/6 points: An author demonstrates proper writing conventions and uses them correctly. One can read a paper with ease, and errors are rare. Few touch-ups can make a submitted composition ready for publishing.

Average/4 points: An author shows reasonable control over a limited range of standard writing conventions. Conventions are sometimes handled in ways that enhance readability, but writing errors in a presented composition tend to distract readers and impair legibility.

Needs Improvement/2 points: An author makes an effort to use various conventions, including spelling, punctuation, capitalization, grammar, usage, and paragraphing. A provided text contains multiple errors.

Unacceptable/0 (zero): An author makes repetitive errors in spelling, punctuation, capitalization, grammar, usage, and paragraphing. Some mistakes distract readers and make it hard to understand discussed concepts. Essay rubric rules are not covered.

Presentation

Excellent/8 points: A particular form and presentation of a text enhance an overall readability of an essay and its flow of ideas.

Very Good/6 points: A chosen format has few mistakes and is easy to read.

Average/4 points: An author’s message is understandable in this format.

Needs Improvement/2 points: An author’s message is only occasionally comprehensible, and a provided text appears disorganized.

Unacceptable/0 (zero): Readers receive a distorted message due to difficulties connecting to a presentation of an entire text.

Final Grade: ___ .

Grading Scheme

  • A+ = 60+ points
  • F = less than 9
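As a worked illustration of this scheme: the example rubric above appears to score eight criteria at up to 8 points each, for a maximum of 64 points, and only the A+ and F thresholds are stated. The sketch below totals a set of hypothetical criterion scores and applies just those two stated cut-offs, leaving the unstated intermediate grades as a placeholder.

```python
# Minimal sketch of totaling the example rubric above: eight criteria scored
# 0-8 each (64 points maximum). Only the A+ and F thresholds are given in the
# grading scheme; the intermediate letter grades are left as a placeholder.

criterion_scores = [8, 6, 6, 4, 8, 6, 4, 6]  # hypothetical scores for the eight criteria
total = sum(criterion_scores)                 # 48 out of 64

if total >= 60:
    grade = "A+"
elif total < 9:
    grade = "F"
else:
    grade = "between F and A+ (threshold not specified in the scheme above)"

print(total, grade)
```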

Differences in Education Levels

The overall quality of various types of texts changes at different education levels. In writing, an essay rubric works by providing a structured framework with specific criteria and performance levels to consistently evaluate and grade a finished paper. For instance, college students must write more varied and complex papers than high school learners (Harrington et al., 2021). In this case, assessment criteria will change for these different education levels. For example, university and college compositions should have a debatable thesis statement with varying points of view (Mewburn et al., 2021). However, high school compositions may have simple phrases as thesis statements. Then, other requirements in a marking rubric will be more straightforward for high school students (DeVries, 2023). For Master’s and Ph.D. works, the writing criteria presented in a scoring evaluation should focus on examining a paper’s complexity. In turn, compositions for these two categories should have thesis statements that demonstrate a detailed analysis of defined topics and advance knowledge in a specific area of study.

Recommendations

When observing any essay rubric, people should remember to ensure clarity and specificity in each criterion and performance level. This clarity helps both an evaluator and a student to understand principal expectations and how a written document will be assessed (Ozfidan & Mitchell, 2022). Consistency in language and terminology across an essay rubric is crucial to avoid confusion and maintain fairness. Further on, it is essential to align a working scheme with learning objectives and goals of an essay’s assignment, ensuring all key components, such as thesis, content, organization, and grammar, are covered comprehensively (Stevens & Levi, 2023). Evaluators should also be aware of the weighting and scoring distribution, making sure they accurately reflect an actual importance of each criterion. Moreover, testing a rubric on sample essays before finalizing it can help to identify any mistakes or imbalances in scores. Essentially, providing concrete examples or descriptions for each performance level can guide students in understanding what is expected for each grade (Taylor et al., 2024). In turn, an essay rubric should be reviewed, revised, and updated after each educational year to remain relevant and aligned with current academic standards. Lastly, sharing and explaining grading assessment with students before they start their composition fosters transparency and helps them to put more of their efforts into meeting defined criteria, ultimately improving their writing and learning experience in general.

Common Mistakes

  • Lack of Specificity: Descriptions for each criterion and performance level are too vague, leading to ambiguity and confusion for both graders and students.
  • Overcomplicating a Rubric: Including too many criteria or overly complex descriptions that make a scoring assessment difficult to use effectively.
  • Unbalanced Weighting: Assigning disproportionate number values to different criteria, which can mislead an overall assessment and not accurately reflect an actual importance of each component.
  • Inconsistent Language: Using inconsistent terminology or descriptors across performance levels, which can confuse users and make a rubric less reliable.
  • Not Aligning With Objectives: Failing to align a particular criteria and performance levels with specific goals and learning outcomes of an assignment.
  • Omitting Key Components: Leaving out important criteria that are essential for evaluating a paper comprehensively, such as citations or a conclusion part.
  • Lack of Examples: Not providing examples or concrete descriptions of what constitutes each performance level, making it harder for students to understand expectations.
  • Ignoring Grammar and Mechanics: Overlooking an actual importance of grammar, spelling, and punctuation, which are crucial for clear and professional writing.
  • Not Updating an Essay Rubric: Using outdated rubrics that do not reflect current educational standards or specific assignment needs.
  • Insufficient Testing: Failing to test a grading scheme on some sample documents to ensure it effectively differentiates between levels of performance and provides fair assessments.

Essay rubrics help teachers, instructors, professors, and tutors to analyze the overall quality of compositions written by students. Basically, an assessment scheme makes the analysis process simple for lecturers, and it lists and organizes all of the criteria in one convenient paper. In other instances, students use such evaluation tools to improve their writing skills. However, they vary from one educational level to another. Master’s and Ph.D. assessment schemes focus on examining complex thesis statements and other writing mechanics, while high school grading criteria examine basic writing concepts. As such, the following are some of the tips that one must consider when preparing any rubric.

  • Include all mechanics that relate to essay writing.
  • Cover different requirements and their relevant grades.
  • Follow clear and understandable statements.

DeVries, B. A. (2023). Literacy assessment and intervention for classroom teachers . Routledge.

Harrington, E. R., Lofgren, I. E., Gottschalk Druschke, C., Karraker, N. E., Reynolds, N., & McWilliams, S. R. (2021). Training graduate students in multiple genres of public and academic science writing: An assessment using an adaptable, interdisciplinary rubric. Frontiers in Environmental Science, 9, 1–13. https://doi.org/10.3389/fenvs.2021.715409

Mewburn, I., Firth, K., & Lehmann, S. (2021). Level up your essays: How to get better grades at university . NewSouth.

Ozfidan, B., & Mitchell, C. (2022). Assessment of students’ argumentative writing: A rubric development. Journal of Ethnic and Cultural Studies, 9(2), 121–133. https://doi.org/10.29333/ejecs/1064

Stevens, D. D., & Levi, A. (2023). Introduction to rubrics: An assessment tool to save grading time, convey effective feedback, and promote student learning . Routledge, Taylor & Francis Group.

Taylor, B., Kisby, F., & Reedy, A. (2024). Rubrics in higher education: An exploration of undergraduate students’ understanding and perspectives. Assessment & Evaluation in Higher Education, 1–11. https://doi.org/10.1080/02602938.2023.2299330


Eberly Center

Teaching Excellence & Educational Innovation

Creating and Using Rubrics

A rubric is a scoring tool that explicitly describes the instructor’s performance expectations for an assignment or piece of work. A rubric identifies:

  • criteria: the aspects of performance (e.g., argument, evidence, clarity) that will be assessed
  • descriptors: the characteristics associated with each dimension (e.g., argument is demonstrable and original, evidence is diverse and compelling)
  • performance levels: a rating scale that identifies students’ level of mastery within each criterion  

Rubrics can be used to provide feedback to students on diverse types of assignments, from papers, projects, and oral presentations to artistic performances and group projects.

Benefiting from Rubrics

For instructors, rubrics can:

  • reduce the time spent grading by allowing instructors to refer to a substantive description without writing long comments
  • help instructors more clearly identify strengths and weaknesses across an entire class and adjust their instruction appropriately
  • help to ensure consistency across time and across graders
  • reduce the uncertainty which can accompany grading
  • discourage complaints about grades

For students, rubrics can help them:

  • understand instructors’ expectations and standards
  • use instructor feedback to improve their performance
  • monitor and assess their progress as they work towards clearly indicated goals
  • recognize their strengths and weaknesses and direct their efforts accordingly

Examples of Rubrics

Here we provide a sample set of rubrics designed by faculty at Carnegie Mellon and other institutions. Although your particular field of study or type of assessment may not be represented, viewing a rubric that is designed for a similar assessment may give you ideas for the kinds of criteria, descriptions, and performance levels you might use in your own rubric.

Paper Assignments

  • Example 1: Philosophy Paper This rubric was designed for student papers in a range of courses in philosophy (Carnegie Mellon).
  • Example 2: Psychology Assignment Short, concept application homework assignment in cognitive psychology (Carnegie Mellon).
  • Example 3: Anthropology Writing Assignments This rubric was designed for a series of short writing assignments in anthropology (Carnegie Mellon).
  • Example 4: History Research Paper. This rubric was designed for essays and research papers in history (Carnegie Mellon).
Projects

  • Example 1: Capstone Project in Design This rubric describes the components and standards of performance from the research phase to the final presentation for a senior capstone project in design (Carnegie Mellon).
  • Example 2: Engineering Design Project This rubric describes performance standards for three aspects of a team project: research and design, communication, and team work.

Oral Presentations

  • Example 1: Oral Exam This rubric describes a set of components and standards for assessing performance on an oral exam in an upper-division course in history (Carnegie Mellon).
  • Example 2: Oral Communication This rubric is adapted from Huba and Freed, 2000.
  • Example 3: Group Presentations This rubric describes a set of components and standards for assessing group presentations in history (Carnegie Mellon).

Class Participation/Contributions

  • Example 1: Discussion Class This rubric assesses the quality of student contributions to class discussions. This is appropriate for an undergraduate-level course (Carnegie Mellon).
  • Example 2: Advanced Seminar This rubric is designed for assessing discussion performance in an advanced undergraduate or graduate seminar.

See also the "Examples and Tools" section of this site for more rubrics.


Essay Rubric

About this printout

This rubric delineates specific expectations about an essay assignment to students and provides a means of assessing completed student essays.

Teaching with this printout


Grading rubrics can be of great benefit to both you and your students. For you, a rubric saves time and decreases subjectivity. Specific criteria are explicitly stated, facilitating the grading process and increasing your objectivity. For students, the use of grading rubrics helps them to meet or exceed expectations, to view the grading process as being “fair,” and to set goals for future learning. In order to help your students meet or exceed expectations of the assignment, be sure to discuss the rubric with your students when you assign an essay. It is helpful to show them examples of written pieces that meet and do not meet the expectations. As an added benefit, because the criteria are explicitly stated, the use of the rubric decreases the likelihood that students will argue about the grade they receive. The explicitness of the expectations helps students know exactly why they lost points on the assignment and aids them in setting goals for future improvement.

More ideas to try

  • Routinely have students score peers’ essays using the rubric as the assessment tool. This increases their level of awareness of the traits that distinguish successful essays from those that fail to meet the criteria. Have peer editors use the Reviewer’s Comments section to add any praise, constructive criticism, or questions.
  • Alter some expectations or add additional traits on the rubric as needed. Students’ needs may necessitate making more rigorous criteria for advanced learners or less stringent guidelines for younger or special needs students. Furthermore, the content area for which the essay is written may require some alterations to the rubric. In social studies, for example, an essay about geographical landforms and their effect on the culture of a region might necessitate additional criteria about the use of specific terminology.
  • After you and your students have used the rubric, have them work in groups to make suggested alterations to the rubric to more precisely match their needs or the parameters of a particular writing assignment.


Sample Essay Rubric for Elementary Teachers


An essay rubric is a way teachers assess students' essay writing by using specific criteria to grade assignments. Essay rubrics save teachers time because all of the criteria are listed and organized into one convenient paper. If used effectively, rubrics can help improve students' writing. Below are two types of rubrics for essays.

How to Use an Essay Rubric

  • The best way to use an essay rubric is to give the rubric to the students before they begin their writing assignment. Review each criterion with the students and give them specific examples of what you want so they will know what is expected of them.
  • Next, assign students to write the essay, reminding them of the criteria and your expectations for the assignment.
  • Once students complete the essay, have them first score their own essay using the rubric, and then switch with a partner. (This peer-editing process is a quick and reliable way to see how well the student did on their assignment. It's also good practice to learn criticism and become a more efficient writer.)
  • Once peer editing is complete, have students hand in their essays. Now it is your turn to evaluate the assignment according to the criteria on the rubric. Make sure to offer students examples if they did not meet the criteria listed.

Informal Essay Rubric

Style, voice, and organization (strongest to weakest):

  • Piece was written in an extraordinary style and voice; very informative and well-organized
  • Piece was written in an interesting style and voice; somewhat informative and organized
  • Piece had little style or voice; gives some new information but is poorly organized
  • Piece had no style or voice; gives no new information and is very poorly organized

Grammar and mechanics (strongest to weakest):

  • Virtually no spelling, punctuation, or grammatical errors
  • Few spelling and punctuation errors, minor grammatical errors
  • A number of spelling, punctuation, or grammatical errors
  • So many spelling, punctuation, and grammatical errors that they interfere with the meaning

Formal Essay Rubric

Ideas (strongest to weakest):

  • Presents ideas in an original manner
  • Presents ideas in a consistent manner
  • Ideas are too general
  • Ideas are vague or unclear

Organization (strongest to weakest):

  • Strong and organized beginning/middle/end
  • Organized beginning/middle/end
  • Some organization; attempt at a beginning/middle/end
  • No organization; lacks a beginning/middle/end

Understanding (strongest to weakest):

  • Writing shows strong understanding
  • Writing shows a clear understanding
  • Writing shows adequate understanding
  • Writing shows little understanding

Word choice (strongest to weakest):

  • Sophisticated use of nouns and verbs makes the essay very informative
  • Nouns and verbs make the essay informative
  • Needs more nouns and verbs
  • Little or no use of nouns and verbs

Sentence structure (strongest to weakest):

  • Sentence structure enhances meaning; flows throughout the piece
  • Sentence structure is evident; sentences mostly flow
  • Sentence structure is limited; sentences need to flow
  • No sense of sentence structure or flow

Mechanics (strongest to weakest):

  • Few (if any) errors
  • Few errors
  • Several errors
  • Numerous errors



PrepScholar


ACT Writing Rubric: Full Analysis and Essay Strategies


What time is it? It's essay time! In this article, I'm going to get into the details of the newly transformed ACT Writing by discussing the ACT essay rubric and how the essay is graded based on that. You'll learn what each item on the rubric means for your essay writing and what you need to do to meet those requirements.

ACT Essay Grading: The Basics

If you've chosen to take the ACT Plus Writing, you'll have 40 minutes to write an essay (after completing the English, Math, Reading, and Science sections of the ACT, of course). Your essay will be evaluated by two graders, who score your essay from 1-6 on each of 4 domains, leading to scores out of 12 for each domain. Your Writing score is calculated by averaging your four domain scores, leading to a total ACT Writing score from 2-12.
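As a worked illustration of that arithmetic, the sketch below combines two hypothetical graders' 1-6 domain ratings into 2-12 domain scores and averages them. The grader scores, the domain labels, and the rounding of the final average to a whole number are assumptions for the example, not official ACT figures.

```python
# Sketch of the scoring arithmetic described above, with hypothetical grader
# scores. Each grader rates the four domains 1-6; the two ratings are summed
# per domain (2-12), and the domain scores are averaged for the final 2-12
# Writing score. Rounding the average to a whole number is an assumption here.

grader_1 = {"Ideas and Analysis": 5, "Development and Support": 4,
            "Organization": 5, "Language Use": 4}
grader_2 = {"Ideas and Analysis": 4, "Development and Support": 4,
            "Organization": 5, "Language Use": 5}

domain_scores = {d: grader_1[d] + grader_2[d] for d in grader_1}  # each 2-12
writing_score = round(sum(domain_scores.values()) / len(domain_scores))

print(domain_scores)   # {'Ideas and Analysis': 9, 'Development and Support': 8, 'Organization': 10, 'Language Use': 9}
print(writing_score)   # 9
```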

The Complete ACT Grading Rubric

Based on ACT, Inc.'s stated grading criteria, I've gathered all the relevant essay-grading criteria into a chart. The information itself is available on the ACT's website, and there's more general information about each of the domains here. The columns in this rubric are titled as per the ACT's own domain areas, with the addition of another category that I named ("Mastery Level").

Score 1: Essays at this level demonstrate little or no skill in writing an argumentative essay.

  • Ideas and Analysis: The writer fails to generate an argument that responds intelligibly to the task. The writer's intentions are difficult to discern. Attempts at analysis are unclear or irrelevant.
  • Development and Support: Ideas lack development, and claims lack support. Reasoning and illustration are unclear, incoherent, or largely absent.
  • Organization: The response does not exhibit an organizational structure. There is little grouping of ideas. When present, transitional devices fail to connect ideas.
  • Language Use: The use of language fails to demonstrate skill in responding to the task. Word choice is imprecise and often difficult to comprehend. Sentence structures are often unclear. Stylistic and register choices are difficult to identify. Errors in grammar, usage, and mechanics are pervasive and often impede understanding.

Score 2: Essays at this level demonstrate weak or inconsistent skill in writing an argumentative essay.

  • Ideas and Analysis: The writer generates an argument that weakly responds to multiple perspectives on the given issue. The argument's thesis, if evident, reflects little clarity in thought and purpose. Attempts at analysis are incomplete, largely irrelevant, or consist primarily of restatement of the issue and its perspectives.
  • Development and Support: Development of ideas and support for claims are weak, confused, or disjointed. Reasoning and illustration are inadequate, illogical, or circular, and fail to fully clarify the argument.
  • Organization: The response exhibits a rudimentary organizational structure. Grouping of ideas is inconsistent and often unclear. Transitions between and within paragraphs are misleading or poorly formed.
  • Language Use: The use of language is inconsistent and often unclear. Word choice is rudimentary and frequently imprecise. Sentence structures are sometimes unclear. Stylistic and register choices, including voice and tone, are inconsistent and are not always appropriate for the rhetorical purpose. Distracting errors in grammar, usage, and mechanics are present, and they sometimes impede understanding.

Score 3: Essays at this level demonstrate some developing skill in writing an argumentative essay.

  • Ideas and Analysis: The writer generates an argument that responds to multiple perspectives on the given issue. The argument's thesis reflects some clarity in thought and purpose. The argument establishes a limited or tangential context for analysis of the issue and its perspectives. Analysis is simplistic or somewhat unclear.
  • Development and Support: Development of ideas and support for claims are mostly relevant but are overly general or simplistic. Reasoning and illustration largely clarify the argument but may be somewhat repetitious or imprecise.
  • Organization: The response exhibits a basic organizational structure. The response largely coheres, with most ideas logically grouped. Transitions between and within paragraphs sometimes clarify the relationships among ideas.
  • Language Use: The use of language is basic and only somewhat clear. Word choice is general and occasionally imprecise. Sentence structures are usually clear but show little variety. Stylistic and register choices, including voice and tone, are not always appropriate for the rhetorical purpose. Distracting errors in grammar, usage, and mechanics may be present, but they generally do not impede understanding.

Score 4: Essays at this level demonstrate adequate skill in writing an argumentative essay.

  • Ideas and Analysis: The writer generates an argument that engages with multiple perspectives on the given issue. The argument's thesis reflects clarity in thought and purpose. The argument establishes and employs a relevant context for analysis of the issue and its perspectives. The analysis recognizes implications, complexities and tensions, and/or underlying values and assumptions.
  • Development and Support: Development of ideas and support for claims clarify meaning and purpose. Lines of clear reasoning and illustration adequately convey the significance of the argument. Qualifications and complications extend ideas and analysis.
  • Organization: The response exhibits a clear organizational strategy. The overall shape of the response reflects an emergent controlling idea or purpose. Ideas are logically grouped and sequenced. Transitions between and within paragraphs clarify the relationships among ideas.
  • Language Use: The use of language conveys the argument with clarity. Word choice is adequate and sometimes precise. Sentence structures are clear and demonstrate some variety. Stylistic and register choices, including voice and tone, are appropriate for the rhetorical purpose. While errors in grammar, usage, and mechanics are present, they rarely impede understanding.

Score 5: Essays at this level demonstrate well-developed skill in writing an argumentative essay.

  • Ideas and Analysis: The writer generates an argument that productively engages with multiple perspectives on the given issue. The argument's thesis reflects precision in thought and purpose. The argument establishes and employs a thoughtful context for analysis of the issue and its perspectives. The analysis addresses implications, complexities and tensions, and/or underlying values and assumptions.
  • Development and Support: Development of ideas and support for claims deepen understanding. A mostly integrated line of purposeful reasoning and illustration capably conveys the significance of the argument. Qualifications and complications enrich ideas and analysis.
  • Organization: The response exhibits a productive organizational strategy. The response is mostly unified by a controlling idea or purpose, and a logical sequencing of ideas contributes to the effectiveness of the argument. Transitions between and within paragraphs consistently clarify the relationships among ideas.
  • Language Use: The use of language works in service of the argument. Word choice is precise. Sentence structures are clear and varied often. Stylistic and register choices, including voice and tone, are purposeful and productive. While minor errors in grammar, usage, and mechanics may be present, they do not impede understanding.

Score 6: Essays at this level demonstrate effective skill in writing an argumentative essay.

  • Ideas and Analysis: The writer generates an argument that critically engages with multiple perspectives on the given issue. The argument's thesis reflects nuance and precision in thought and purpose. The argument establishes and employs an insightful context for analysis of the issue and its perspectives. The analysis examines implications, complexities and tensions, and/or underlying values and assumptions.
  • Development and Support: Development of ideas and support for claims deepen insight and broaden context. An integrated line of skillful reasoning and illustration effectively conveys the significance of the argument. Qualifications and complications enrich and bolster ideas and analysis.
  • Organization: The response exhibits a skillful organizational strategy. The response is unified by a controlling idea or purpose, and a logical progression of ideas increases the effectiveness of the writer's argument. Transitions between and within paragraphs strengthen the relationships among ideas.
  • Language Use: The use of language enhances the argument. Word choice is skillful and precise. Sentence structures are consistently varied and clear. Stylistic and register choices, including voice and tone, are strategic and effective. While a few minor errors in grammar, usage, and mechanics may be present, they do not impede understanding.

ACT Writing Rubric: Item-by-Item Breakdown

Whew. That rubric might be a little overwhelming—there's so much information to process! Below, I've broken down the essay rubric by domain, with examples of what a 3- and a 6-scoring essay might look like.

Ideas and Analysis

The Ideas and Analysis domain is the rubric area most intimately linked with the basic ACT essay task itself. Here's what the ACT website has to say about this domain:

Scores in this domain reflect the ability to generate productive ideas and engage critically with multiple perspectives on the given issue. Competent writers understand the issue they are invited to address, the purpose for writing, and the audience. They generate ideas that are relevant to the situation.

Based on this description, I've extracted the four key things you need to do in your essay to score well in the Ideas and Analysis domain.

#1: Choose a perspective on this issue and state it clearly.

#2: Compare at least one other perspective to the perspective you have chosen.

#3: Demonstrate understanding of the ways the perspectives relate to one another.

#4: Analyze the implications of each perspective you choose to discuss.

There's no cool acronym, sorry. I guess a case could be made for "ACCE," but I wanted to list the points in the order of importance, so "CEAC" it is.

Fortunately, the ACT Writing Test provides you with the three perspectives to analyze and choose from, which will save you some of the time of "generating productive ideas." In addition, "analyzing each perspective" does not mean that you need to argue from each of the points of view. Instead, you need to choose one perspective to argue as your own and explain how your point of view relates to at least one other perspective by evaluating how correct the perspectives you discuss are and analyzing the implications of each perspective.

Note: While it is technically allowable for you to come up with a fourth perspective as your own and to then discuss that point of view in relation to another perspective, we do not recommend it. 40 minutes is already a pretty short time to discuss and compare multiple points of view in a thorough and coherent manner—coming up with new, clearly-articulated perspectives takes time that could be better spent devising a thorough analysis of the relationship between multiple perspectives.

To get deeper into what things fall in the Ideas and Analysis domain, I'll use a sample ACT Writing prompt and the three perspectives provided:

Many of the goods and services we depend on daily are now supplied by intelligent, automated machines rather than human beings. Robots build cars and other goods on assembly lines, where once there were human workers. Many of our phone conversations are now conducted not with people but with sophisticated technologies. We can now buy goods at a variety of stores without the help of a human cashier. Automation is generally seen as a sign of progress, but what is lost when we replace humans with machines? Given the accelerating variety and prevalence of intelligent machines, it is worth examining the implications and meaning of their presence in our lives.

Perspective One : What we lose with the replacement of people by machines is some part of our own humanity. Even our mundane daily encounters no longer require from us basic courtesy, respect, and tolerance for other people.

Perspective Two : Machines are good at low-skill, repetitive jobs, and at high-speed, extremely precise jobs. In both cases they work better than humans. This efficiency leads to a more prosperous and progressive world for everyone.

Perspective Three : Intelligent machines challenge our long-standing ideas about what humans are or can be. This is good because it pushes both humans and machines toward new, unimagined possibilities.

First, in order to "clearly state your own perspective on the issue," you need to figure out what your point of view, or perspective, on this issue is going to be. For the sake of argument, let's say that you agree the most with the second perspective. An essay that scores a 3 in this domain might simply restate this perspective:

I agree that machines are good at low-skill, repetitive jobs, and at high-speed, extremely precise jobs. In both cases they work better than humans. This efficiency leads to a more prosperous and progressive world for everyone.

In contrast, an essay scoring a 6 in this domain would likely have a more complex point of view (with what the rubric calls "nuance and precision in thought and purpose"):

Machines will never be able to replace humans entirely, as creativity is not something that can be mechanized. Because machines can perform delicate and repetitive tasks with precision, however, they are able to take over for humans with regards to low-skill, repetitive jobs and high-skill, extremely precise jobs. This then frees up humans to do what we do best—think, create, and move the world forward.

Next, you must compare at least one other perspective to your perspective throughout your essay, including in your initial argument. Here's what a 3-scoring essay's argument would look like:

I agree that machines are good at low-skill, repetitive jobs, and at high-speed, extremely precise jobs. In both cases they work better than humans. This efficiency leads to a more prosperous and progressive world for everyone. Machines do not cause us to lose our humanity or challenge our long-standing ideas about what humans are or can be.

And here, in contrast, is what a 6-scoring essay's argument (that includes multiple perspectives) would look like:

Machines will never be able to replace humans entirely, as creativity is not something that can be mechanized, which means that our humanity is safe. Because machines can perform delicate and repetitive tasks with precision, however, they are able to take over for humans with regards to low-skill, repetitive jobs and high-skill, extremely precise jobs. Rather than forcing us to challenge our ideas about what humans are or could be, machines simply allow us to BE, without distractions. This then frees up humans to do what we do best—think, create, and move the world forward.

You also need to demonstrate a nuanced understanding of the way in which the two perspectives relate to each other. A 3-scoring essay in this domain would likely be absolute, stating that Perspective Two is completely correct, while the other two perspectives are absolutely incorrect. By contrast, a 6-scoring essay in this domain would provide a more insightful context within which to consider the issue:

In the future, machines might lead us to lose our humanity; alternatively, machines might lead us to unimaginable pinnacles of achievement. I would argue, however, that projecting possible futures does not make them true, and that the evidence we have at present supports the perspective that machines are, above all else, efficient and effective at completing repetitive and precise tasks.

Finally, to analyze the perspectives, you need to consider each aspect of each perspective. In the case of Perspective Two, this means you must discuss that machines are good at two types of jobs, that they're better than humans at both types of jobs, and that their efficiency creates a better world. The analysis in a 3-scoring essay is usually "simplistic or somewhat unclear." By contrast, the analysis of a 6-scoring essay "examines implications, complexities and tensions, and/or underlying values and assumptions."

  • Choose a perspective that you can support.
  • Compare at least one other perspective to the perspective you have chosen.
  • Demonstrate understanding of the ways the perspectives relate to one another.
  • Analyze the implications of each perspective you choose to discuss.

To score well on the ACT essay overall, however, it's not enough to just state your opinions about each part of the perspective; you need to actually back up your claims with evidence to develop your own point of view. This leads straight into the next domain: Development and Support.

Development and Support

Another important component of your essay is that you explain your thinking. While it's obviously important to clearly state what your ideas are in the first place, the ACT essay requires you to demonstrate evidence-based reasoning. As per the description on ACT.org [bolding mine]:

Scores in this domain reflect the ability to discuss ideas, offer rationale, and bolster an argument. Competent writers explain and explore their ideas, discuss implications, and illustrate through examples . They help the reader understand their thinking about the issue.

"Machines are good at low-skill, repetitive jobs, and at high-speed, extremely precise jobs. In both cases they work better than humans. This efficiency leads to a more prosperous and progressive world for everyone."

In your essay, you might start out by copying the perspective directly into your essay as your point of view, which is fine for the Ideas and Analysis domain. To score well in the Development and Support domain and develop your point of view with logical reasoning and detailed examples, however, you're going to have to come up with reasons for why you agree with this perspective and examples that support your thinking.

Here's an example from an essay that would score a 3 in this domain:

Machines are good at low-skill, repetitive jobs and at high-speed, extremely precise jobs. In both cases, they work better than humans. For example, machines are better at printing things quickly and clearly than people are. Prior to the invention of the printing press by Gutenberg people had to write everything by hand. The printing press made it faster and easier to get things printed because things didn't have to be written by hand all the time. In the world today we have even better machines like laser printers that print things quickly.

Essays scoring a 3 in this domain tend to have relatively simple development and tend to be overly general, with imprecise or repetitive reasoning or illustration. Contrast this with an example from an essay that would score a 6:

Machines are good at low-skill, repetitive jobs and at high-speed, extremely precise jobs. In both cases, they work better than humans. Take, for instance, the example of printing. As a composer, I need to be able to create many copies of my sheet music to give to my musicians. If I were to copy out each part by hand, it would take days, and would most likely contain inaccuracies. On the other hand, my printer (a machine) is able to print out multiple copies of parts with extreme precision. If it turns out I made an error when I was entering in the sheet music onto the computer (another machine), I can easily correct this error and print out more copies quickly.

The above example of the importance of machines to composers uses "an integrated line of skillful reasoning and illustration" to support my claim ("Machines are good at low-skill, repetitive jobs and at high-speed, extremely precise jobs. In both cases, they work better than humans"). To develop this example further (and incorporate the "This efficiency leads to a more prosperous and progressive world for everyone" facet of the perspective), I would need to expand my example to explain why it's so important that multiple copies of precisely replicated documents be available, and how this affects the world.


Organization

Essay organization has always been integral to doing well on the ACT essay, so it makes sense that the ACT Writing rubric has an entire domain devoted to this. The organization of your essay refers not just to the order in which you present your ideas in the essay, but also to the order in which you present your ideas in each paragraph. Here's the formal description from the ACT website :

Scores in this domain reflect the ability to organize ideas with clarity and purpose. Organizational choices are integral to effective writing. Competent writers arrange their essay in a way that clearly shows the relationship between ideas, and they guide the reader through their discussion.

Making sure your essay is logically organized relates back to the "development" part of the previous domain. As the above description states, you can't just throw examples and information into your essay willy-nilly, without any regard for the order; part of constructing and developing a convincing argument is making sure it flows logically. A lot of this organization should happen while you are in the planning phase, before you even begin to write your essay.

Let's go back to the machine intelligence essay example again. I've decided to argue for Perspective Two, which is: "Machines are good at low-skill, repetitive jobs, and at high-speed, extremely precise jobs. In both cases they work better than humans. This efficiency leads to a more prosperous and progressive world for everyone."

An essay that scores a 3 in this domain would show a "basic organizational structure," which is to say that each perspective analyzed would be discussed in its own paragraph, "with most ideas logically grouped." A possible organization for a 3-scoring essay:

An essay that scores a 6 in this domain, on the other hand, has a lot more to accomplish. The "controlling idea or purpose" behind the essay should be clearly expressed in every paragraph, and ideas should be ordered in a logical fashion so that there is a clear progression from the beginning to the end. Here's a possible organization for a 6-scoring essay:

In this example, the unifying idea is that machines are helpful (and it's mentioned in each paragraph) and the progression of ideas makes more sense. This is certainly not the only way to organize an essay on this particular topic, or even using this particular perspective. Your essay does, however, have to be organized, rather than consist of a bunch of ideas thrown together.

Here are my Top 5 ACT Writing Organization Rules to follow:

#1: Be sure to include an introduction (with your thesis stating your point of view), paragraphs in which you make your case, and a conclusion that sums up your argument

#2: When planning your essay, make sure to present your ideas in an order that makes sense (and follows a logical progression that will be easy for the grader to follow).

#3: Make sure that you unify your essay with one main idea . Do not switch arguments partway through your essay.

#4: Don't write everything in one huge paragraph. If you're worried you're going to run out of space to write and can't make your handwriting any smaller and still legible, you can try using a paragraph symbol, ¶, at the beginning of each paragraph as a last resort to show the organization of your essay.

#5: Use transitions between paragraphs (usually the last line of the previous paragraph and the first line of the next) to "strengthen the relationships among ideas." This means going above and beyond "First of all...Second...Lastly" at the beginning of each paragraph. Instead, use the transitions between paragraphs as an opportunity to describe how that paragraph relates to your main argument.

Language Use

The final domain on the ACT Writing rubric is Language Use and Conventions. This is the item that includes grammar, punctuation, and general sentence structure issues. Here's what the ACT website has to say about Language Use:

Scores in this domain reflect the ability to use written language to convey arguments with clarity. Competent writers make use of the conventions of grammar, syntax, word usage, and mechanics. They are also aware of their audience and adjust the style and tone of their writing to communicate effectively.

I tend to think of this as the "be a good writer" category, since many of the standards covered in the above description are ones that good writers will automatically meet in their writing. On the other hand, this is probably the area in which non-native English speakers will struggle the most, as you must have a fairly solid grasp of English to score above a 2 in this domain. The good news is that by reading this article, you're already one step closer to improving your "Language Use" on ACT Writing.

There are three main parts of this domain:

#1: Grammar, Usage, and Mechanics

#2: Sentence Structure

#3: Vocabulary and Word Choice

I've listed them (and will cover them) from lowest to highest level. If you're struggling with multiple areas, I highly recommend starting out with the lowest-level issue, as the components tend to build on each other. For instance, if you're struggling with grammar and usage, you need to focus on fixing that before you start to think about precision of vocabulary/word choice.

Grammar, Usage, and Mechanics

At the most basic level, you need to be able to "effectively communicate your ideas in standard written English" ( ACT.org ). First and foremost, this means that your grammar and punctuation need to be correct. On ACT Writing, it's all right to make a few minor errors if the meaning is clear, even on essays that score a 6 in the Language Use domain; however, the more errors you make, the more your score will drop.

Here's an example from an essay that scored a 3 in Language Use:

Machines are good at doing there jobs quickly and precisely. Also because machines aren't human or self-aware they don't get bored so they can do the same thing over & over again without getting worse.

While the meaning of the sentences is clear, there are several errors: the first sentence uses "there" instead of "their," the second sentence is a run-on sentence, and the second sentence also uses the abbreviation "&" in place of "and." Now take a look at an example from a 6-scoring essay:

Machines excel at performing their jobs both quickly and precisely. In addition, since machines are not self-aware they are unable to get "bored." This means that they can perform the same task over and over without a decrease in quality.

This example fixes the abbreviation and "there/their" issues. The second sentence is still missing a comma (after "self-aware"), but the more serious run-on problem is gone.

Our Complete Guide to ACT Grammar might be helpful if you just need a general refresher on grammar rules. In addition, we have several articles that focus on specific grammar rules as they are tested on ACT English; while the specific ways in which ACT English tests you on these rules aren't something you'll need to know for the essay, the explanations of the grammar rules themselves are quite helpful.

Sentence Structure

Once you've gotten down basic grammar, usage, and mechanics, you can turn your attention to sentence structure. Here's an example of what a 3-scoring essay in Language Use (based on sentence structure alone) might look like:

Machines are more efficient than humans at many tasks. Machines are not causing us to lose our humanity. Instead, machines help us to be human by making things more efficient so that we can, for example, feed the needy with technological advances.

The sentence structures in the above example are not particularly varied (two sentences in a row start with "Machines are"), and the last sentence has a very complicated/convoluted structure, which makes it hard to understand. For comparison, here's a 6-scoring essay:

Machines are more efficient than humans at many tasks, but that does not mean that machines are causing us to lose our humanity. In fact, machines may even assist us in maintaining our humanity by providing more effective and efficient ways to feed the needy.

For whatever reason, I find that when I'm under time pressure, my sentences maintain variety in their structures but end up getting really awkward and strange. A real-life example: I once described a method of counteracting dementia as "supporting persons of the elderly persuasion" in a hastily written psychology paper. I've found the best ways to counteract this are as follows:

#1: Look over what you've written and change any weird wordings that you notice.

#2: If you're just writing a practice essay, get a friend, teacher, or relative who is good at writing (in English) to look over what you've written and point out issues (this is how my own awkward wording was caught before I handed in the paper). This obviously does not apply when you're actually taking the ACT, but it is very helpful to have someone else look over any practice essays you write and point out issues you may not notice yourself.

Vocabulary and Word Choice

The icing on the "Language Use" domain cake is skilled use of vocabulary and correct word choice. Part of this means using more complicated vocabulary in your essay. Once more, look at this example from a 3-scoring essay (spelling corrected):

Machines are good at doing their jobs quickly and precisely.

Compare that to this sentence from a 6-scoring essay:

Machines excel at performing their jobs both quickly and precisely.

The 6-scoring essay uses "excel" and "performing" in place of "are good at" and "doing." This is an example of using language that is both more skillful ("excel" is more advanced than "are good at") and more precise ("performing" is a more precise word than "doing"). It's important to make sure that, when you do use more advanced words, you use them correctly. Consider the below sentence:

"Machines are often instrumental in ramifying safety features."

The sentence uses a couple of advanced vocabulary words, but since "ramifying" is used incorrectly, the language use in this sentence is neither skillful nor precise. Above all, your word choice and vocabulary should make your ideas clearer, not make them harder to understand.


How Do I Use the ACT Writing Grading Rubric?

Okay, we've taken a look at the ACTual ACT Writing grading rubric and gone over each domain in detail. To finish up, I'll go over a couple of ways the scoring rubric can be useful to you in your ACT essay prep.

Use the ACT Writing Rubric To...Shape Your Essays

Now that you know what the ACT is looking for in an essay, you can use that to guide what you write about in your essays...and how you develop and organize what you say!

Because I'm an Old™ (not actually trademarked), and because I'm from the East Coast, I didn't really know much about the ACT prior to starting my job at PrepScholar. People didn't really take it in my high school, so when I looked at the grading rubric for the first time, I was shocked to see how different the ACT essay was (as compared to the more familiar SAT essay ).

Basically, by reading this article, you're already doing better than high school me.


Use the ACT Writing Rubric To...Grade Your Practice Essays

The ACT can't really give you an answer key to the essay the way it can give you an answer key to the other sections (Reading, Math, etc.). There are some examples of essays at each score point on the ACT website, but these examples assume that students will be at an equal level in each of the domains, which will not necessarily be true for you. Even if a sample essay is provided as part of a practice test answer key, it will probably use a different context, have a different logical progression, or maybe even argue a different viewpoint.

The ACT Writing rubric is the next best thing to an essay answer key. Use it as a filter through which to view your essay. Naturally, you don't have the time to become an expert at applying the rubric criteria to your essay to make sure you're in line with the ACT's grading principles and standards; that is not your job. Your job is to write the best essay that you can. If you're not confident in your ability to spot grammar, usage, and mechanics issues, I highly recommend asking a friend, teacher, or family member who is really good at (English) writing to look over your practice essays and point out the mistakes.

If you really want custom feedback on your practice essays from experienced essay graders, may I also suggest the PrepScholar test prep platform? As I manage all essay grading, I happen to know a bit about the essay part of this platform, which provides you with both an essay grade and custom feedback. Learn more about PrepScholar ACT Prep and our essay grading here!

What's Next?

Desirous of some more sweet sweet ACT essay articles? Why not start with our comprehensive guide to the ACT Writing test and how to write an ACT essay, step-by-step? (Trick question: obviously you should do this.)

Round out your dive into the details of the ACT Writing test with tips and strategies to raise your essay score, information about the best ACT Writing template, and advice on how to get a perfect score on the ACT essay.

Want actual feedback on your essay? Then consider signing up for our PrepScholar test prep platform. Included in the platform are practice tests and practice essays graded by experts here at PrepScholar.



Laura graduated magna cum laude from Wellesley College with a BA in Music and Psychology, and earned a Master's degree in Composition from the Longy School of Music of Bard College. She scored 99 percentile scores on the SAT and GRE and loves advising students on how to excel in high school.



Grading Rubrics


Goals and objectives are measured by a performance assessment in the courses required for the Philosophy major. Specifically, student performance in writing essays and essay exam questions will be measured using the following standardized grading rubrics.

Essays and essay questions are evaluated with an eye both to the student’s mastery of the specific subject matter covered by the course, and to the student’s mastery of more general skills in philosophical thinking and writing. A higher standard of thinking and writing is required for upper-division than for lower-division courses. In logic courses, students’ competence in formal logic is evaluated through assessment of their performance in weekly problem sets and examinations.


Why use rubrics?

Rubrics provide a readily accessible way of communicating to students our goals and the criteria we use to discern how well students have reached them.

Our department uses two rubrics, displayed on the two charts below:

  • Grading Rubric for Writing Assignments
  • Grading Rubric for Essay and Short Answer Exam Questions, Quizzes, and Homework Assignments

What is a rubric?

Rubrics (or scoring tools) are a way of describing evaluation criteria or grading standards based on the expected outcomes and performances of students. Each rubric consists of a set of scoring criteria and point values associated with these criteria.

In most rubrics the criteria are grouped into categories so the instructor and the student can discriminate among the categories by level of performance. In classroom use, the rubric provides a concrete standard against which student performance may be compared.

Assessment Purposes

  • To improve the reliability of scoring written assignments and oral presentations.
  • To convey goals and performance expectations of students in an unambiguous way.
  • To convey grading standards or point values and relate them to performance goals.
  • To engage students in critical evaluation of their own performance.

Grading Rubric Charts


Jacobs School of Music Bulletin 2024-2025


Admission Requirements


Undergraduate Division

Indiana University Bloomington Requirements for Incoming Freshmen

The standards listed below represent the minimum levels of preparation and achievement necessary to be considered for admission. Most admitted students exceed these minimum levels. Each application is reviewed individually. When making admission decisions, the university is primarily concerned with the breadth and depth of the college-preparatory program including the student’s cumulative grade point average, SAT/ACT scores, academic curriculum and the grades received in those academic courses, grade trends in college-preparatory subjects, class rank, and other additional factors.

High School Graduation

Applicants must earn a diploma from an accredited high school (or must have completed the Indiana High School Equivalency Diploma) to be eligible for admission consideration. Students who are homeschooled or attend an alternative school should submit credentials that demonstrate equivalent levels of achievement and ability.

Academic Preparation

Applicants should complete at least 34 credits of college-preparatory courses, advanced placement courses, and/or college courses in high school, including:

  • 8 credits of English , such as literature, grammar, composition, and journalism
  • 7 credits of mathematics , including 4 credits of algebra and 2 credits of geometry (or an equivalent 6 credits of integrated algebra and geometry), and 1 credit of pre-calculus, trigonometry, or calculus
  • 6 credits of social sciences , including 2 credits of U.S. history, 2 credits of world history/civilization/geography, and 2 additional credits in government, economics, sociology, history, or similar topics
  • 6 credits of sciences , including at least 4 credits of laboratory sciences - biology, chemistry, or physics
  • 4 credits of world languages
  • 3 or more credits of additional college-preparatory courses. Additional mathematics credits are recommended for students intending to pursue a science degree and additional world language credits are recommended for all students.

Alternative college-preparatory courses may be substituted for courses that are not available in the applicant's high school.

Grades in Academic Classes

Cumulative GPA, as well as the grades earned in the 34 courses required for admission, is an important part of the application review process. Weighted GPA is also part of the review, if included on transcript.

Application Essay

An IU-specific essay of 200-400 words is required.

Standardized Test Scores

ACT or SAT scores are accepted as either official or self-reported scores. Self-reported scores can be entered in the Indiana University application. If offered admission, the offer will be contingent upon receipt of official test scores from testing agencies, which must match or be higher than those self-reported during the admissions process.

IU's test-optional admissions policy allows students (both domestic and international) to choose at the point of application whether to have SAT or ACT test scores considered as part of the admissions review. For applicants who choose not to have test scores considered, a greater emphasis will be placed on grades in academic courses and grade trends in the admissions review. Applicants receive equal consideration for admission and scholarship to the Jacobs School of Music, regardless of whether or not they applied under the test-optional policy.

There are several groups of students who will be required to provide SAT or ACT scores. Homeschooled students, students who have attended a school with non-traditional evaluation methods where traditional alpha or numerical grades are not assigned, and student athletes subject to NCAA eligibility standards will be required to submit a standardized test score. Applicants who are at least 21 years old or have been out of high school for three or more years may be considered for admission without standardized SAT and/or ACT test scores.

Information

For additional information, contact the Office of Admissions, Indiana University, Bloomington, IN 47405; (812) 855-0661; [email protected].  

International Students

To be admitted, international students must complete above-average work in their supporting programs. International applicants whose native language is not English must meet the English Proficiency requirements of Indiana University for undergraduate degree-seeking students. A complete description of options to complete the English Proficiency requirement is available at the Office of International Services (OIS) website.

Admitted undergraduate international students are also required to take the Indiana Academic English Test (IAET) and must register for any supplemental English courses prescribed based on the results of this examination or, if necessary, enroll in the intensive English language program.

For additional information, contact the Office of International Services, Indiana University, Ferguson International Center, 330 N. Eagleson Avenue, Bloomington, IN 47405; [email protected] ; (812) 855-9086; http://ois.iu.edu/admissions/index.html .


  • Open access
  • Published: 03 September 2024

Reliability of ChatGPT in automated essay scoring for dental undergraduate examinations

Bernadette Quah, Lei Zheng, Timothy Jie Han Sng, Chee Weng Yong & Intekhab Islam (ORCID: orcid.org/0000-0002-7754-0609)

BMC Medical Education volume 24, Article number: 962 (2024)

This study aimed to answer the research question: How reliable is ChatGPT in automated essay scoring (AES) for oral and maxillofacial surgery (OMS) examinations for dental undergraduate students compared to human assessors?

Sixty-nine undergraduate dental students participated in a closed-book examination comprising two essays at the National University of Singapore. Using pre-created assessment rubrics, three assessors independently performed manual essay scoring, while one separate assessor performed AES using ChatGPT (GPT-4). Data analyses were performed using the intraclass correlation coefficient and Cronbach's α to evaluate the reliability and inter-rater agreement of the test scores among all assessors. The mean scores of manual versus automated scoring were evaluated for similarity and correlations.

A strong correlation was observed for Question 1 ( r  = 0.752–0.848, p  < 0.001) and a moderate correlation was observed between AES and all manual scorers for Question 2 ( r  = 0.527–0.571, p  < 0.001). Intraclass correlation coefficients of 0.794–0.858 indicated excellent inter-rater agreement, and Cronbach’s α of 0.881–0.932 indicated high reliability. For Question 1, the mean AES scores were similar to those for manual scoring ( p  > 0.05), and there was a strong correlation between AES and manual scores ( r  = 0.829, p  < 0.001). For Question 2, AES scores were significantly lower than manual scores ( p  < 0.001), and there was a moderate correlation between AES and manual scores ( r  = 0.599, p  < 0.001).

This study shows the potential of ChatGPT for essay marking. However, an appropriate rubric design is essential for optimal reliability. With further validation, ChatGPT has the potential to aid students in self-assessment or to support large-scale automated marking processes.


Large Language Models (LLMs), such as OpenAI’s GPT-4, LLaMA by META, and Google’s LaMDA (Language Models for Dialogue Applications), have demonstrated tremendous potential in generating outputs based on user-specified instructions or prompts. These models are trained using large amounts of data and are capable of natural language processing tasks. Owing to their ability to comprehend, interpret, and generate natural language text, LLMs allow human-like conversations with coherent contextual responses to prompts. The capability of LLMs to summarize and generate texts that resemble human language allows the creation of task-focused systems that can ease the demands of human labor and improve efficiency.

OpenAI uses a closed application programming interface (API) to process data. Chat Generative Pre-trained Transformer (OpenAI Inc., California, USA, https://chat.openai.com/ ) was introduced globally in 2020 as GPT-3, a generative language model with 175 billion parameters [ 1 ]. It is a generative AI model that can generate new content based on the data on which it has been trained. The latest version, GPT-4, was introduced in 2023 and has demonstrated improved creativity, reasoning, and the ability to process even more complicated tasks [ 2 ].

Since its release in the public domain, ChatGPT has been actively explored by both healthcare professionals and educators in an effort to attain human-like performance in the form of clinical reasoning, image recognition, diagnosis, and learning from medical databases. ChatGPT has proven to be a powerful tool with immense potential to provide students with an interactive platform to deepen their understanding of any given topic [ 3 ]. In addition, it is also capable of aiding in both lesson planning and student assessments [ 4 , 5 ].

The potential of ChatGPT for assessments

Automated Essay Scoring (AES) is not a new concept, and interest in AES has been increasing since the advent of AI. Three main categories of AES programs have been described, utilizing regression, classification, or neural network models [ 6 ]. A known problem of current AES systems is their unreliability in evaluating the content relevance and coherence of essays [ 6 ]. Newer language models such as ChatGPT, however, are potential game changers; they are simpler to learn than current deep learning programs and can therefore improve the accessibility of AES to educators. Mizumoto and Eguchi recently pioneered the potential use of ChatGPT (GPT-3.5 and 4) for AES in the field of linguistics and reported an accuracy level sufficient for use as a supportive tool even when fine-tuning of the model was not performed [ 7 ].

The use of these AI-powered tools may potentially ease the burden on educators in marking large numbers of essay scripts, while providing personalized feedback to students [ 8 , 9 ]. This is especially crucial with larger class sizes and increasing student-to-teacher ratios, where it can be more difficult for educators to actively engage individual students. Additionally, manual scoring by humans can be subjective and susceptible to fatigue, which may put the scoring at risk of being unreliable [ 7 , 10 ]. The use of AI for essay scoring may thus help reduce intra- and inter-rater variability associated with manual scoring by providing a more standardized and reliable scoring process that eases the time- and labor-intensive scoring workload of human assessors [ 10 , 11 ].

The current role of AI in healthcare education

Generative AI has permeated the healthcare industry and provided a diverse range of health enhancements. An example is how AI facilitates radiographic evaluation and clinical diagnosis to improve the quality of patient care [ 12 , 13 ]. In medical and dental education, virtual or augmented reality and haptic simulations are some of the exciting technological tools already implemented to improve student competence and confidence in patient assessment and execution of procedures [ 14 , 15 , 16 ]. The incorporation of ChatGPT into the dental curriculum would thus be the next step in enhancing student learning. The performance of ChatGPT in the United States Medical Licensing Examination (USMLE) was recently validated, with ChatGPT achieving a score equivalent to that of a third-year medical student [ 17 ]. However, no data are available on the performance of ChatGPT in the field of dentistry or oral and maxillofacial surgery (OMS). Furthermore, the reliability of AI-powered language models for the grading of essays in the medical field has not yet been evaluated; in addition to essay structure and language, the evaluation of essay scripts in the field of OMS would require a level of understanding of dentistry, medicine and surgery.

Therefore, this study aimed to evaluate the reliability of ChatGPT for AES in OMS examinations for final-year dental undergraduate students compared to human assessors. Our null hypothesis was that there would be no difference in the scores between the ChatGPT and human assessors. The research question for the study was as follows: How reliable is ChatGPT when used for AES in OMS examinations compared to human assessors?

Materials and methods

This study was conducted in the Faculty of Dentistry, National University of Singapore, under the Department of Oral and Maxillofacial Surgery. The study received ethical approval from the university’s Institutional Review Board (REF: IRB-2023–1051) and was conducted and drafted with guidance from the education interventions critical appraisal worksheet introduced by BestBETs [ 18 ].

Sample size calculation for this study was based on the formula provided by Viechtbauer et al.: n = ln(1 − γ) / ln(1 − π), where n, γ, and π represent the sample size, the level of confidence, and the significance level, respectively [ 19 ]. Based on a 5% margin of error, a 95% confidence level and a 50% outcome response, it was calculated that a minimum sample size of 59 subjects was required. Ultimately, the study recruited 69 participants, all of whom were final-year undergraduate dental students. A closed-book OMS examination was conducted on the Examplify platform (ExamSoft Worldwide Inc., Texas, USA) as part of the end-of-module assessment. The examination comprised two open-ended essay questions based on the topics taught in the module (Table 1).
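As a quick check on the arithmetic behind the quoted minimum of 59, the formula can be evaluated in a few lines of Python. This is only an illustrative sketch, not code from the study; it assumes γ is taken as the 0.95 confidence level and π as 0.05, which is the only assignment consistent with the reported result.

```python
import math

def min_sample_size(confidence: float, probability: float) -> int:
    """Minimum sample size n = ln(1 - gamma) / ln(1 - pi), rounded up to a whole subject."""
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - probability))

# Values as stated in the text: 95% confidence level (gamma) and 5% (pi).
print(min_sample_size(confidence=0.95, probability=0.05))  # prints 59
```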

Creation of standardized assessment

An assessment rubric was created for each question through discussion and collaboration of a workgroup comprising four assessors involved in the study. All members of the work group were academic staff from the faculty (I.I., B.Q., L.Z., T.J.H.S.) (Supplementary Tables S1 and S2) [ 20 ]. An analytic rubric was generated using the strategy outlined by Popham [ 21 ]. The process involved a discussion within the workgroup to agree on the learning outcomes of the essay questions. Two authors (I. I. and B. Q) independently generated the rubric criteria and descriptions for Question 1 (Infection). Similarly, for Question 2 (Trauma), the rubric criteria and descriptions were generated independently by two authors (I.I. and T.J.H.S.). The rubrics were revised until a consensus was reached between each pair. In the event of any disagreement, a third author (L.Z.) provided their opinion to aid in decision making.

Marking categories of Poor (0 marks), Satisfactory (2 marks), Good (3 marks), and Outstanding (4 marks) were allocated to each criterion, with a maximum of 4 marks attainable from each criterion. A criterion for overall essay structure and language was also included, with a maximum of 5 marks attainable from this criterion. The highest score for each question was 40.

Model answers to the essays were prepared by another author (C.W.Y.), who did not participate in the creation of the rubrics. Using the rubrics as a reference, the author modified the model answer to create 5 variants of the answers such that each variant fell within different score ranges of 0–10, 11–20, 21–30, 31–40, 41–50. Subsequently, three authors (B. Q., L. Z., and T.J.H.S) graded the essays using the prepared rubrics. Revisions to the rubrics were made with consensus by all three authors, a process that also helped calibrate these three authors for manual essay scoring.

AES with ChatGPT

Essay scoring was performed using ChatGPT (GPT-4, released March 14, 2023) by one assessor who did not participate in the manual essay scoring exercise (I.I.). Prompts were generated based on a guideline by Giray, and the components of Instruction, Context, Input Data and Output Indication as discussed in the guideline were included in each prompt (Supplementary Tables 3 and 4) [ 22 ]. A prompt template was generated for each question by one assessor (I.I.) with advice from two experts in prompt engineering, based on the marking rubric. The criteria and point allocations were clearly written in prose and point form. For the fine-tuning process, the prompts were input into ChatGPT using variants of the model answers provided by C.W.Y. Minor adjustments were made to the wording of certain parts of the prompts as necessary to correct any potential misinterpretations of the prompts by ChatGPT. Each time, the prompt was entered into a new ChatGPT chat in a browser whose history and cookies had been cleared. Subsequently, the finalized prompts (Supplementary Tables 3 and 4) were used to score the student essays. AES scores were not used to calculate students’ actual essay scores.
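For illustration only, the single-prompt workflow described above can be sketched against the OpenAI chat API. The study used the ChatGPT web interface and its own finalized prompts (Supplementary Tables 3 and 4); the rubric placeholder, prompt wording, and function below are assumptions, arranged according to the Instruction, Context, Input Data, and Output Indication components cited from Giray.

```python
# Hypothetical sketch; the study scored essays via the ChatGPT web interface, not this code.
from openai import OpenAI

client = OpenAI()  # assumes the OPENAI_API_KEY environment variable is set

def score_essay(rubric: str, essay: str, model: str = "gpt-4") -> str:
    """Send a single rubric-based scoring prompt and return the model's reply."""
    prompt = (
        # Instruction
        "Score the student essay below against the marking rubric provided.\n"
        # Context
        "You are grading a final-year dental undergraduate essay on oral and maxillofacial surgery.\n"
        # Input Data
        f"Marking rubric:\n{rubric}\n\nStudent essay:\n{essay}\n"
        # Output Indication
        "For each criterion, state the marks awarded with a one-line justification, "
        "then give the total score out of 40."
    )
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content

# Usage (placeholder strings, not the study's materials):
# print(score_essay(rubric_text, essay_text))
```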

Manual essay scoring

Manual essay scoring was completed independently by three assessors (B.Q., L.Z., and T.J.H.S.) using the assessment rubrics (Supplementary Tables S1 and S2). Calibration was performed during the rubric creation stage. The essays were anonymized to prevent bias during the marking process. The assessors recorded the marks allocated to each criterion, as well as the overall score of each essay, on a pre-prepared Excel spreadsheet. Scoring was performed separately and independently by all assessors before the final collation by a research team member (I.I.) for statistical analyses. The student was considered ‘able to briefly mention’ a criterion if they addressed it without mentioning any of the keywords of the points within the criterion. The student was considered ‘able to elaborate on’ a point within the criterion if they were able to mention the keywords of that point as stated in the rubric, and were thus awarded higher marks in accordance with the rubric (e.g. the student was given a higher mark if they were able to mention the need to check for dyspnea and dysphagia, instead of simply mentioning a need to check the patient’s airway). Grading was performed with only whole marks as specified in the rubrics, and assessors were not allowed to give half marks or subscores.

Data synthesis

The scores given out of 40 per essay by each assessor were compiled. Data analyses were subsequently performed using SPSS® version 29.0.1.0(171) (IBM Corporation, New York, United States). For each essay question, correlations between the essay scores given by each assessor were analyzed and displayed using the inter-item correlation matrix. A correlation coefficient value ( r ) of 0.90–1.00 was indicative of a very strong, 0.70–0.89 indicative of strong, 0.40–0.69 moderate, 0.10–0.39 weak and < 0.10 negligible positive correlation between the scorers [ 23 ]. The cutoff p -value for the significance level was set at p  < 0.05. The intraclass correlation coefficient (ICC) and Cronbach's α were then calculated between all assessors to assess the inter-rater agreement and reliability, respectively [ 24 ]. The ICC was interpreted on a scale of 0 to 1.00, with a higher value indicating a higher level of agreement in scores given by the scorers to each student. A value less than 0.40 was indicative of poor, 0.40–0.59 fair, 0.60–0.74 good, and 0.75–1.00 excellent agreement [ 25 ]. Using Cronbach’s α, reliability was expressed on a range from 0 to 1.00, with a higher number indicating a higher level of consistency between the scorers in their scores given across the students. The reliability was considered ‘Less Reliable’ if the score was less than 0.20, ‘Rather Reliable’ if the score was 0.20–0.40, ‘Quite Reliable’ if 0.40–0.60, ‘Reliable’ if 0.60–0.80 and ‘Very Reliable’ if 0.80–1.00 [ 26 ].

Similarly, the mean scores of the three manual scorers were calculated for each question. The mean manual scores were then analyzed for correlations with AES scores by using Pearson’s correlation coefficient. Student’s t-test was also used to analyze any significant differences in mean scores between manual scoring and AES. A p -value of < 0.05 was required to conclude the presence of a statistically different score between the groups.
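The study ran these analyses in SPSS. Purely as an illustration of the score-level statistics on toy data, the Python sketch below computes Cronbach's α from its definition and the Pearson correlation and mean comparison with SciPy. The ICC model is not reproduced here, and treating the mean comparison as a paired t-test is an assumption on my part, since the text does not state which t-test variant was used.

```python
import numpy as np
from scipy import stats

def cronbach_alpha(scores: np.ndarray) -> float:
    """Cronbach's alpha for an (essays x raters) matrix:
    alpha = k/(k-1) * (1 - sum of rater variances / variance of total scores)."""
    k = scores.shape[1]
    rater_var = scores.var(axis=0, ddof=1).sum()
    total_var = scores.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - rater_var / total_var)

# Toy data: rows are essays; columns are Scorer 1, Scorer 2, Scorer 3, and AES (out of 40).
scores = np.array([
    [14, 15, 13, 14],
    [22, 20, 21, 18],
    [30, 31, 29, 27],
    [10,  9, 11,  8],
    [25, 26, 24, 22],
], dtype=float)

manual_mean = scores[:, :3].mean(axis=1)   # mean of the three manual scorers per essay
aes = scores[:, 3]

print("Cronbach's alpha:", round(cronbach_alpha(scores), 3))
r, p = stats.pearsonr(manual_mean, aes)    # correlation between manual means and AES
print("Pearson r:", round(r, 3), "p =", round(p, 4))
t, p = stats.ttest_rel(manual_mean, aes)   # paired comparison of mean scores
print("t =", round(t, 3), "p =", round(p, 4))
```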

All final-year dental undergraduate students (69/69, 100%) had their essays graded by all manual scorers and AES as part of the study. Table 2 shows the mean scores for each individual assessor as well as the mean scores for the three manual scorers (Scorers 1, 2, and 3).

Analysis of correlation and agreement between all scorers

The inter-item correlation matrices and their respective p -values are listed in Table  3 . For Question 1, there was a strong positive correlation between the scores provided by each assessor (Scorers 1, 2, 3, and AES), with r -values ranging from 0.752–0.848. All p -values were < 0.001, indicating a significant positive correlation between all assessors. For Question 2, there was a strong positive correlation between Scorers 1 and 2 ( r  = 0.829) and Scorers 1 and 3 ( r  = 0.756). There was a moderate positive correlation between Scorers 2 and 3 ( r  = 0.655), as well as between AES and all manual scores ( r -values ranging from 0.527 to 0.571). Similarly, all p -values were < 0.001, indicative of a significant positive correlation between all scorers.

For the analysis of inter-rater agreement, ICC values of 0.858 (95% CI 0.628 – 0.933) and 0.794 (95% CI 0.563 – 0.892) were obtained for Questions 1 and 2, respectively, both of which were indicative of excellent inter-rater agreement. Cronbach’s α was 0.932 for Question 1 and 0.881 for Question 2, both of which were ‘Very Reliable’.

Analysis of correlation between manual scoring versus AES

The results of the Student’s t-test comparing the test score values from manual scoring and AES are shown in Table  2 . For Question 1, the mean manual scores (14.85 ± 4.988) were slightly higher than those of the AES (14.54 ± 5.490). However, these differences were not statistically significant ( p  > 0.05). For Question 2, the mean manual scores (23.11 ± 4.241) were also higher than those of the AES (18.62 ± 4.044); this difference was statistically significant ( p  < 0.001).

The results of the Pearson’s correlation coefficient calculations are shown in Table  4 . For Question 1, there was a strong and significant positive correlation between manual scoring and AES ( r  = 0.829, p  < 0.001). For Question 2, there was a moderate and statistically significant positive correlation between the two groups ( r  = 0.599, p  < 0.001).

Qualitative feedback from AES

Figures 1 , 2 and 3 show three examples of essay feedback and scoring provided by ChatGPT. ChatGPT provided feedback in a concise and systematic manner. Scores were clearly provided for each of the criteria listed in the assessment rubric. This was followed by in-depth feedback on the points within the criterion that the student had discussed and failed to mention. ChatGPT was able to differentiate between a student who briefly mentioned a key point and a student who provided better elaboration on the same point by allocating them two or three marks, respectively.

Figure 1: Example #1 of a marked essay with feedback from ChatGPT for Question 1

Figure 2: Example #2 of a marked essay with feedback from ChatGPT for Question 1

Figure 3: Example #3 of a marked essay with feedback from ChatGPT for Question 1

One limitation of ChatGPT that was identified during the scoring process was its inability to identify content that was not relevant to the essay or that was factually incorrect. This was despite the assessment rubric specifying that incorrect statements should be given 0 marks for that criterion. For example, a student who included points about incision and drainage also incorrectly stated that bone scraping to induce bleeding and packing of local hemostatic agents should be performed. Although these statements were factually incorrect, ChatGPT was unable to identify this and still awarded the student marks for the point. Manual assessors were able to spot this and subsequently penalized the student for the mistake.

Since its recent rise in popularity, many people have been eager to tap into the potential of large language models, such as ChatGPT. In their review, Khan et al. discussed the growing role of ChatGPT in medical education, with promising uses for the creation of case studies and content such as quizzes and flashcards for self-directed practice [ 9 ]. As an LLM, the ability of ChatGPT to thoroughly evaluate sentence structure and clarity may allow it to confront the task of automated essay marking.

Advantages of ChatGPT in AES

This study found significant correlations and excellent inter-rater agreement between ChatGPT and manual scorers, and the mean scores between both groups showed strong to moderate correlations for both essay questions. This suggests that AES has the potential to provide a level of essay marking similar to that of the educators in our faculty. Similar positive findings were reflected in previous studies that compared manual and automated essay scoring ( r  = 0.532–0.766) [ 6 ]. However, there is still a need to further fine-tune the scoring system such that the score provided by AES deviates as little as possible from human scoring. For instance, the mean AES score was lower than that of manual scoring by 5 marks for Question 2. Although the difference may not seem large, it may potentially increase or decrease the final performance grade of students.

Apart from a decent level of reliability in manual essay scoring, there are many other benefits to using ChatGPT for AES. Compared to humans, the response time to prompts is much faster and can thus increase productivity and reduce the burden of a large workload on educational assessors [ 27 ]. In addition, ChatGPT can provide individualized feedback for each essay (Figs. 1 , 2 and 3 ). This helps provide students with comments specific to their essays, a feat that is difficult to achieve for a single educator teaching a large class size.

Similar to previous systems designed for AES, machine scoring is beneficial for removing human inconsistencies that can result from fatigue, mood swings, or bias [ 10 ]. ChatGPT is no exception. Furthermore, ChatGPT is more widely accessible than the conventional AES systems. Its software runs online instead of requiring downloads on a computer, and its user interface is simple to use. With GPT-3.5 being free to use and GPT-4 being 20 USD per month, it is also relatively inexpensive.

Marking the essay is only part of the equation, and the next step is to allow the students to know what went wrong and why. Nicol and Macfarlane-Dick described seven principles for good feedback. ChatGPT can fulfil most of these principles, namely, facilitating self-assessment, encouraging teacher and peer dialogue, clarifying what good performance is, providing opportunities to close the gap between current and desired performance, and delivering high-quality information to students [ 28 ]. In this study, the feedback given by ChatGPT was categorized based on the rubric, and elaboration was provided for each criterion on the points the student mentioned and did not mention. By highlighting the ideal answer and where the student can improve, ChatGPT can clarify performance goals and provide opportunities to close the gap between the student’s current and desired performance. This creates opportunities for self-directed learning and the utilization of blended learning environments. Students can use ChatGPT to review their preparation on topics, self-grade their essays, and receive instant feedback. Furthermore, the simple and interactive nature of the software encourages dialogue, as it can readily respond to any clarification the student wants to make. The importance of effective feedback has been demonstrated to be an essential component in medical education, in terms of enhancing the knowledge of the student without developing negative emotions [ 29 , 30 ].

These potential advantages of engaging ChatGPT for student assessments play well into the humanistic learning theory of medical education [ 31 , 32 ]. Self-directed learning allows students the freedom to learn at their own pace, with educators simply providing a conducive environment and the goals that the student should achieve. ChatGPT has the potential to supplement the role of the educator in self-directed learning, as it can be readily available to provide constructive and tailored feedback for assignments whenever the student is ready for it. This removes the burden that assignment deadlines place on students, which can allow them a greater sense of independence and control over their learning, and lead to greater self-motivation and self-fulfillment.

Potential pitfalls of ChatGPT

Potential pitfalls associated with the use of ChatGPT were identified. First, the ability to achieve reliable scores relies heavily on a well-created marking rubric with clearly defined terms. In this study, the correlations between scorers were stronger for Question 1 compared to Question 2, and the mean scores between the AES and manual scorers were also significantly different for Question 2, but not for Question 1. The lower reliability of the AES for Question 2 may be attributed to its broader nature, use of more complex medical terms, and lengthier scoring rubrics. The broad nature of the question left more room for individual interpretation and variation between humans and AES. The ability of ChatGPT to provide accurate answers may be reduced with lengthier prompts and conversations [ 27 ]. Furthermore, with less specific instructions or complex medical jargon, both automated systems and human scorers may interpret rubrics differently, resulting in varied scores across the board [ 10 , 33 , 34 ]. To circumvent this, the authors recommend restricting the use of ChatGPT for essay scoring to questions that are less broad (e.g. shorter essays), or breaking the task into multiple prompts for each individual criterion to reduce variations in interpretation [ 27 , 35 ]. Furthermore, the rubrics should contain concise and explicit instructions with appropriate grammar and vocabulary to avoid misinterpretation by both ChatGPT and human scorers, and provide a brief explanation to specify what certain medical terms mean (e.g. writing ‘pulse oximetry (SpO2) monitoring’ instead of only ‘SpO2’) for better contextualization [ 35 , 36 ].
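As a sketch of the per-criterion prompting suggested above, the snippet below sends one short, fresh prompt per rubric criterion and leaves the replies to be parsed and summed into a total. The criterion names, descriptors, mark caps, and prompt wording are illustrative placeholders, not the study's rubric.

```python
# Hypothetical per-criterion prompting sketch; criteria and wording are placeholders.
from openai import OpenAI

client = OpenAI()  # assumes the OPENAI_API_KEY environment variable is set

# criterion name -> (maximum marks, short descriptor); illustrative only
criteria = {
    "Airway assessment": (4, "Checks for signs of airway compromise such as dyspnea and dysphagia."),
    "Definitive management": (4, "Describes drainage of the infection and removal of its source."),
    "Essay structure and language": (5, "Clear structure and appropriate language with few errors."),
}

def score_criterion(name: str, max_marks: int, descriptor: str, essay: str) -> str:
    """Send one focused, stateless prompt covering a single criterion."""
    prompt = (
        f"Score only the criterion '{name}' from 0 to {max_marks} marks.\n"
        f"Descriptor: {descriptor}\n\nStudent essay:\n{essay}\n"
        "Reply with the marks awarded and a one-sentence justification."
    )
    response = client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content

# The per-criterion replies would then be parsed and summed into the essay's total score.
```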

Second, prompt engineering is a critical step in producing the desired outcome from ChatGPT [27]. A prompt that is too ambiguous or lacks context can lead to a response that is incomplete, generic, or irrelevant, and a prompt that exhibits bias risks reinforcing that bias in the reply [22, 34]. The prompt’s phrasing must also be checked carefully for spelling, grammatical mistakes, and inconsistencies, since ChatGPT interprets the phrasing literally. For example, a prompt that reads ‘give 3 marks if the student covers one or more coverage points’ will result in ChatGPT awarding the marks only when multiple points are covered, because it reads the plural word ‘points’ literally.
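
As a concrete, hypothetical illustration of this wording issue, the snippet below contrasts the ambiguous instruction with a more explicit rewording that leaves no room for a literal reading of the plural ‘points’; the mark values are invented for the example.

```python
# Hypothetical illustration of the phrasing pitfall described above.
# The mark values and wording are invented for this example.

ambiguous_instruction = (
    "Give 3 marks if the student covers one or more coverage points."
)

# Spelling out the threshold removes the ambiguity introduced by the plural noun.
explicit_instruction = (
    "Give 3 marks if the student covers AT LEAST ONE of the coverage points "
    "listed below. A single point is sufficient for the full 3 marks; "
    "do not require more than one point."
)
```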

Third, irrelevant content may not be penalized during the essay-marking process. Students may ‘trick’ the AES by producing a lengthier essay that hits more relevant points and increases their score. As a result, lower-quality essays containing multiple incorrect or nonsensical statements may still be rewarded with higher scores [10]. Our assessment rubric attempted to penalize this by awarding 0 marks for a criterion if incorrect statements were made; however, no student was penalized. This issue may be resolved as ChatGPT rapidly and continuously gains more medical and dental knowledge. Although data to support the competence of AI in medical education are sparse, the medical knowledge that ChatGPT already possesses is sufficient to achieve a passing mark on the USMLE [5, 37]. In dentistry, when used to disseminate information on endodontics to patients, ChatGPT provided detailed answers with an overall validity of 95% [38]. Over time, LLMs such as ChatGPT may be able to identify when students are not factually correct.

Other comments

The lack of human emotion in machine scoring can be both an advantage and a disadvantage. AES can provide feedback that is entirely factual and less biased than human feedback, and its grades are objective and final [39]. However, human empathy is an essential quality that ChatGPT does not possess. One principle of good feedback is to encourage and motivate students, providing positive learning experiences and building self-esteem [28]. While ChatGPT can provide constructive feedback, it cannot replace the compassion, empathy, or emotional intelligence that a quality educator possesses [40]. In our study, ChatGPT awarded lower mean scores than manual scoring for both questions, at 14.54/40 (36.4%) and 18.62/40 (46.5%). Although objective, some may view automated scoring as harsh because it assigned failing grades to an average student.

This study demonstrates the ability of GPT-4 to evaluate essays without any specialized training or prompting. One long prompt was used to score each essay. Although more technical prompting methods, such as chain-of-thought prompting, could be deployed, the single-prompt approach is more scalable and easier to adopt. As discussed earlier, ChatGPT is most reliable when prompts are short and specific [34]. Ideally, each prompt should therefore task ChatGPT with scoring only one or two criteria rather than the entire 10-criterion rubric. However, in a class of 70, this would require the assessors to run 700 prompts per question (70 students × 10 criteria), which is impractical and unnecessary. With only one prompt, a good correlation was still found between the AES and manual scoring. Further exploration and experimentation with prompting techniques is likely to improve the output.
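
The sketch below shows how such a single-prompt-per-essay workflow might be looped over a class’s scripts; the file layout, model name, and prompt template are assumptions for illustration, not the study’s actual materials.

```python
# Sketch of a single-prompt-per-essay workflow: one request scores a whole
# essay against the full rubric. Paths, model name, and the prompt template
# are illustrative assumptions, not the study's materials.
import csv
from pathlib import Path

from openai import OpenAI

client = OpenAI()

RUBRIC_TEXT = Path("rubric_question1.txt").read_text()  # full 10-criterion rubric
PROMPT_TEMPLATE = (
    "Score the following essay against every criterion in this rubric and "
    "report the mark per criterion and the total out of 40.\n\n"
    "Rubric:\n{rubric}\n\nEssay:\n{essay}"
)

with open("aes_scores_question1.csv", "w", newline="") as out:
    writer = csv.writer(out)
    writer.writerow(["essay_file", "model_output"])
    for essay_path in sorted(Path("essays_question1").glob("*.txt")):  # one file per student
        reply = client.chat.completions.create(
            model="gpt-4",  # placeholder model name
            messages=[{
                "role": "user",
                "content": PROMPT_TEMPLATE.format(
                    rubric=RUBRIC_TEXT, essay=essay_path.read_text()
                ),
            }],
            temperature=0,
        )
        writer.writerow([essay_path.name, reply.choices[0].message.content])
```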

While LLMs have the potential to revolutionize education in healthcare, some precautions must be taken. Artificial hallucination is a widely described phenomenon in which ChatGPT generates seemingly genuine but inaccurate information [41, 42, 43]. Hallucinations have been attributed to biases and limitations of the training data as well as algorithmic limitations [2]. Similarly, randomness in the generated responses has been observed; while this is useful for generating creative content, it may be an issue when ChatGPT is employed for topics requiring scientific or factual accuracy [44]. LLMs are therefore not infallible and still require human subject-matter experts to validate the generated content. Finally, it is essential that educators play an active role in driving the development of dedicated training models to ensure consistency, continuity, and accountability, as overreliance on a corporate-controlled model may place educators at the mercy of algorithm changes.

The ethical implications of using ChatGPT in medical and dental education also need to be explored. As much as LLMs can provide convenience to both students and educators, privacy and data security remain a concern [45]. Robust university privacy policies and informed consent procedures should be in place to protect student data before ChatGPT is used as part of student assessment. Furthermore, if LLMs like ChatGPT were to be used for grading examinations in the future, issues around the fairness and transparency of the grading process would need to be resolved [46]. GPT-4 may have awarded harsh scores in this study, possibly because it did not fully understand certain phrases the students had written; models used in assessment will therefore require sufficient training in the field of healthcare to acquire the relevant medical knowledge and hence understand and grade essays fairly.

As AI continues to develop, ChatGPT may eventually replace human assessors in essay scoring for dental undergraduate examinations. However, given its current limitations and its dependence on a well-formed assessment rubric, relying solely on ChatGPT for exam grading may be inappropriate when the scores can affect a student’s overall module results, career success, and mental health [47]. While this study primarily demonstrates the use of ChatGPT to grade essays, it also points to great potential for its use as an interactive learning tool. A good starting point is essay assignments on pre-set topics that do not count towards final scores, where students can direct their own learning and receive objective feedback on essay structure and content. Students can use rubrics to practice and obtain effective feedback from LLMs in an engaging and stress-free environment. This reduces the burden on educators by easing the time-consuming task of grading essay assignments, and gives students the flexibility to complete and grade their assignments whenever they are ready. Furthermore, repeating assignments with new class cohorts may enable more robust feedback from ChatGPT through machine learning.

Study limitations

The limitations of this study lie partly in its methodology. The study recruited 69 dental undergraduate students; while this is above the minimum calculated sample size of 59, a larger sample would help to increase the generalizability of the findings to larger populations of students and a wider scope of topics. The unique field of OMS also requires knowledge of both medical and dental subjects, so the results obtained from the use of ChatGPT for essay marking in other medical or dental specialties may differ slightly.

The use of rubrics for manual scoring could also be a potential source of bias. While rubrics provide a framework for objective assessment, they cannot eliminate the subjectivity of manual scoring. Variations in the interpretation of students’ answers, leniency errors (whereby one scorer marks more leniently than another), or rater drift (fatigue from assessing many essays affecting leniency and judgment) may still occur. To minimize bias from these errors, multiple assessors were recruited for the manual scoring process and their average scores were used for comparison with the AES.
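
A minimal sketch of how averaged manual scores might be compared against AES scores is shown below; the file name, column names, and choice of statistics are illustrative assumptions, not a reproduction of this study’s analysis.

```python
# Illustrative comparison of averaged manual scores with AES scores.
# File name, column names, and the statistics shown are assumptions.
import pandas as pd
from scipy import stats

df = pd.read_csv("question1_scores.csv")  # columns: rater1, rater2, rater3, aes

# Average the three manual raters per essay, as described above.
df["manual_mean"] = df[["rater1", "rater2", "rater3"]].mean(axis=1)

r, p_r = stats.pearsonr(df["manual_mean"], df["aes"])
rho, p_rho = stats.spearmanr(df["manual_mean"], df["aes"])
t, p_t = stats.ttest_rel(df["manual_mean"], df["aes"])  # paired comparison of mean scores

print(f"Pearson r = {r:.2f} (p = {p_r:.3f})")
print(f"Spearman rho = {rho:.2f} (p = {p_rho:.3f})")
print(f"Paired t-test: t = {t:.2f}, p = {p_t:.3f}")
```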

This study investigated the reliability of ChatGPT in essay scoring for OMS examinations, and found positive correlations between ChatGPT and manual essay scoring. However, ChatGPT tended towards stricter scoring and was not capable of penalizing irrelevant or incorrect content. In its present state, GPT-4 should not be used as a standalone tool for teaching or assessment in the field of medical and dental education but can serve as an adjunct to aid students in self-assessment. The importance of proper rubric design to achieve optimal reliability when employing ChatGPT in student assessment cannot be overemphasized.

Availability of data and materials

The datasets used and/or analyzed during the current study are available from the corresponding author on reasonable request.

References

Floridi L, Chiriatti M. GPT-3: Its nature, scope, limits, and consequences. Mind Mach. 2020;30(4):681–94.


Abd-Alrazaq A, AlSaad R, Alhuwail D, Ahmed A, Healy PM, Latifi S, Aziz S, Damseh R, Alabed Alrazak S, Sheikh J. Large language models in medical education: opportunities, challenges, and future directions. JMIR Med Educ. 2023;9:e48291.

Kasneci E, Sessler K, Küchemann S, Bannert M, Dementieva D, Fischer F, Gasser U, Groh G, Günnemann S, Hüllermeier E, et al. ChatGPT for good? On opportunities and challenges of large language models for education. Learn Individ Differ. 2023;103:102274.

Javaid M, Haleem A, Singh RP, Khan S, Khan IH. Unlocking the opportunities through ChatGPT Tool towards ameliorating the education system. BenchCouncil Transact Benchmarks Standards Eval. 2023;3(2): 100115.

Kung TH, Cheatham M, Medenilla A, Sillos C, De Leon L, Elepaño C, Madriaga M, Aggabao R, Diaz-Candido G, Maningo J, et al. Performance of ChatGPT on USMLE: Potential for AI-assisted medical education using large language models. PLOS Digit Health. 2023;2(2):e0000198.

Ramesh D, Sanampudi SK. An automated essay scoring systems: a systematic literature review. Artif Intell Rev. 2022;55(3):2495–527.

Mizumoto A, Eguchi M. Exploring the potential of using an AI language model for automated essay scoring. Res Methods Appl Linguist. 2023;2(2): 100050.

Erturk S, Tilburg W, Igou E. Off the mark: Repetitive marking undermines essay evaluations due to boredom. Motiv Emot. 2022;46.

Khan RA, Jawaid M, Khan AR, Sajjad M. ChatGPT - Reshaping medical education and clinical management. Pak J Med Sci. 2023;39(2):605–7.

Hussein MA, Hassan H, Nassef M. Automated language essay scoring systems: a literature review. PeerJ Comput Sci. 2019;5:e208.

Blood I. Automated essay scoring: a literature review. Studies in Applied Linguistics and TESOL. 2011;11(2).

Menezes LDS, Silva TP, Lima Dos Santos MA, Hughes MM, Mariano Souza SDR, Leite Ribeiro PM, Freitas PHL, Takeshita WM. Assessment of landmark detection in cephalometric radiographs with different conditions of brightness and contrast using an artificial intelligence software. Dentomaxillofac Radiol. 2023:20230065.

Bennani S, Regnard NE, Ventre J, Lassalle L, Nguyen T, Ducarouge A, Dargent L, Guillo E, Gouhier E, Zaimi SH, et al. Using AI to improve radiologist performance in detection of abnormalities on chest radiographs. Radiology. 2023;309(3): e230860.

Moussa R, Alghazaly A, Althagafi N, Eshky R, Borzangy S. Effectiveness of virtual reality and interactive simulators on dental education outcomes: systematic review. Eur J Dent. 2022;16(1):14–31.

Fanizzi C, Carone G, Rocca A, Ayadi R, Petrenko V, Casali C, Rani M, Giachino M, Falsitta LV, Gambatesa E, et al. Simulation to become a better neurosurgeon. An international prospective controlled trial: The Passion study. Brain Spine. 2024;4:102829.

Lovett M, Ahanonu E, Molzahn A, Biffar D, Hamilton A. Optimizing individual wound closure practice using augmented reality: a randomized controlled study. Cureus. 2024;16(4):e59296.


Gilson A, Safranek CW, Huang T, Socrates V, Chi L, Taylor RA, Chartash D. How does ChatGPT perform on the United States medical licensing examination? the implications of large language models for medical education and knowledge assessment. JMIR Med Educ. 2023;9:e45312.

Educational Intervention Worksheet. BestBets. Accessed 31/03/2024. https://bestbets.org/ca/pdf/educational_intervention.pdf.

Viechtbauer W, Smits L, Kotz D, Budé L, Spigt M, Serroyen J, Crutzen R. A simple formula for the calculation of sample size in pilot studies. J Clin Epidemiol. 2015;68(11):1375–9.

Cox G, Morrison J, Brathwaite B. The Rubric: An Assessment Tool to Guide Students and Markers; 2015.

Popham WJ. What’s Wrong—And What’s Right—With Rubrics. Educ Leadersh. 1997;55(2):72–5.

Giray L. Prompt Engineering with ChatGPT: A Guide for Academic Writers. Ann Biomed Eng. 2023;51:3.

Schober P, Boer C, Schwarte LA. Correlation Coefficients: Appropriate Use and Interpretation. Anesth Analg. 2018;126(5):1763–8.

Liao SC, Hunt EA, Chen W. Comparison between inter-rater reliability and inter-rater agreement in performance assessment. Ann Acad Med Singap. 2010;39(8):613–8.

Cicchetti DV. Guidelines, criteria, and rules of thumb for evaluating normed and standardized assessment instruments in psychology. Psychol Assess. 1994;6(4):284–90.

Hair J, Black W, Babin B, Anderson R. Multivariate Data Analysis: A Global Perspective; 2010.

Nazir A, Wang Z. A Comprehensive Survey of ChatGPT: Advancements, Applications, Prospects, and Challenges. Meta Radiol. 2023;1(2).

Nicol D, Macfarlane D. Rethinking Formative Assessment in HE: a theoretical model and seven principles of good feedback practice; 2004.

Spooner M, Larkin J, Liew SC, Jaafar MH, McConkey S, Pawlikowska T. “Tell me what is ‘better’!” How medical students experience feedback, through the lens of self-regulatory learning. BMC Med Educ. 2023;23(1):895.

Kornegay JG, Kraut A, Manthey D, Omron R, Caretta-Weyer H, Kuhn G, Martin S, Yarris LM. Feedback in medical education: a critical appraisal. AEM Educ Train. 2017;1(2):98–109.

Mukhalalati BA, Taylor A. Adult learning theories in context: a quick guide for healthcare professional educators. J Med Educ Curric Dev. 2019;6:2382120519840332.

Taylor DC, Hamdy H. Adult learning theories: implications for learning and teaching in medical education: AMEE Guide No. 83. Med Teach. 2013;35(11):e1561-1572.

Chakraborty S, Dann C, Mandal A, Dann B, Paul M, Hafeez-Baig A. Effects of Rubric Quality on Marker Variation in Higher Education. Studies in Educational Evaluation. 2021;70.

Heston T, Khun C. Prompt engineering in medical education. Int Med Educ. 2023;2:198–205.

Sun GH. Prompt Engineering for Nurse Educators. Nurse Educ. 2024.

Meskó B. Prompt engineering as an important emerging skill for medical professionals: tutorial. J Med Internet Res. 2023;25:e50638.

Sun L, Yin C, Xu Q, Zhao W. Artificial intelligence for healthcare and medical education: a systematic review. Am J Transl Res. 2023;15(7):4820–8.

Mohammad-Rahimi H, Ourang SA, Pourhoseingholi MA, Dianat O, Dummer PMH, Nosrat A. Validity and reliability of artificial intelligence chatbots as public sources of information on endodontics. Int Endod J. 2023.

Peng X, Ke D, Xu B. Automated essay scoring based on finite state transducer: towards ASR transcription of oral English speech. In: Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics: Long Papers - Volume 1. Jeju Island, Korea: Association for Computational Linguistics; 2012. p. 50–59.

Grassini S. Shaping the future of education: Exploring the Potential and Consequences of AI and ChatGPT in Educational Settings. Educ Sci. 2023;13(7):692.

Limitations. OpenAI. https://openai.com/blog/chatgpt.

Sallam M, Salim NA, Barakat M, Al-Tammemi AB. ChatGPT applications in medical, dental, pharmacy, and public health education: A descriptive study highlighting the advantages and limitations. Narra J. 2023;3(1):e103.

Deng J, Lin Y. The Benefits and Challenges of ChatGPT: An Overview. Front Comput Intell Syst. 2023;2:81–3.

Choi W. Assessment of the capacity of ChatGPT as a self-learning tool in medical pharmacology: a study using MCQs. BMC Med Educ. 2023;23(1):864.

Medina-Romero MÁ, Jinchuña Huallpa J, Flores-Arocutipa J, Panduro W, Chauca Huete L, Flores Limo F, Herrera E, Callacna R, Ariza Flores V, Quispe I, et al. Exploring the ethical considerations of using Chat GPT in university education. Period Eng Nat Sci (PEN). 2023;11:105–15.

Lee H. The rise of ChatGPT: Exploring its potential in medical education. Anat Sci Educ. 2024;17(5):926–31.

Steare T, Gutiérrez Muñoz C, Sullivan A, Lewis G. The association between academic pressure and adolescent mental health problems: A systematic review. J Affect Disord. 2023;339:302–17.


Acknowledgements

We would like to extend our gratitude to Mr Paul Timothy Tan Bee Xian and Mr Jonathan Sim for their invaluable advice on the process of prompt engineering for the effective execution of this study.

Author information

Lei Zheng, Timothy Jie Han Sng and Chee Weng Yong contributed equally to this work.

Authors and Affiliations

Faculty of Dentistry, National University of Singapore, Singapore, Singapore

Bernadette Quah, Lei Zheng, Timothy Jie Han Sng, Chee Weng Yong & Intekhab Islam

Discipline of Oral and Maxillofacial Surgery, National University Centre for Oral Health, 9 Lower Kent Ridge Road, Singapore, Singapore


Contributions

B.Q. contributed to conceptualization, methodology, study execution, validation, formal analysis, and manuscript writing (original draft, review and editing). L.Z., T.J.H.S., and C.W.Y. contributed to methodology, study execution, and manuscript writing (review and editing). I.I. contributed to conceptualization, methodology, study execution, validation, formal analysis, manuscript writing (review and editing), and supervision. All authors made substantial contributions to this manuscript.

Corresponding author

Correspondence to Intekhab Islam.

Ethics declarations

Ethics approval and consent to participate

This study was approved by the Institutional Review Board of the university (REF: IRB-2023–1051). The waiver of consent from students was approved by the University’s Institutional Review Board, as the scores by ChatGPT were not used as the students’ actual grades, and all essay manuscripts were anonymized.

Consent for publication

All the authors reviewed the content of this manuscript and provided consent for publication.

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Supplementary material 1

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/ .


About this article

Cite this article

Quah, B., Zheng, L., Sng, T.J.H. et al. Reliability of ChatGPT in automated essay scoring for dental undergraduate examinations. BMC Med Educ 24, 962 (2024). https://doi.org/10.1186/s12909-024-05881-6


Received : 04 February 2024

Accepted : 09 August 2024

Published : 03 September 2024

DOI : https://doi.org/10.1186/s12909-024-05881-6


Keywords

  • Artificial intelligence
  • Academic performance
  • Educational
  • Educational needs assessment



How I Use AI to Catch Cheaters at School

It's getting harder to spot by the day, but here are some ways you can use ChatGPT to spot student papers using ChatGPT.


It's a tale as old as teaching -- a student, for one reason or another, uses someone else's work to complete their assignment. Only in 2024, that someone else could be an artificial intelligence tool.

The allure is understandable. Away with those shady essay writing services where a student has to plonk down real cash for an unscrupulous person to write them 1,200 words on the fall of the Roman Empire. An AI writing tool can do that for free in 30 seconds flat.


As a professor of strategic communications, I encounter students using AI tools like ChatGPT, Grammarly and EssayGenius on a regular basis. It's usually easy to tell when a student has used one of these tools to draft their entire work. The tell-tale signs include ambiguous language and a super annoying tendency for AI to spit out text with the assignment prompt featured broadly.

For example, a student might use ChatGPT -- an AI tool that uses large language model learning and a conversational question and answer format to provide query results -- to write a short essay response to a prompt by simply copying and pasting the essay question into the tool.

Take this prompt: In 300 words or less, explain how this SWOT and brand audit will inform your final pitch.

This is ChatGPT's result:

[Image: ChatGPT's response to the sample prompt]

I have received responses like this, or ones very close to it, a few times in my tenure as a teacher, and one of the most recognizable red flags is the number of instances in which key terms from the prompt are used in the final product.

Students don't normally repeat key terms from the prompt in their work in this way, and the results read closer to old-school SEO-driven copy meant to define these terms rather than a unique essay meant to demonstrate an understanding of subject matter.

But can teachers use AI tools to catch students using AI tools? I came up with some ways to be smarter in spotting artificial intelligence in papers.

Catching cheaters with AI

Here's how to use AI tools to catch cheaters in your class:

  • Understand AI capabilities: There are AI tools on the market now that can scan an assignment and its grading criteria to provide a fully written, cited and complete piece of work in a matter of moments. Familiarizing yourself with these tools is the first step in the war against AI-driven integrity violations.
  • Do as the cheaters do: Before the semester begins, copy and paste all your assignments into a tool like ChatGPT and ask it to do the work for you. When you have an example of the type of results it provides specifically in response to your assignments, you'll be better equipped to catch robot-written answers. You could also use a tool designed specifically to spot AI writing in papers.
  • Get a real sample of writing: At the beginning of the semester, require your students to submit a simple, fun and personal piece of writing to you. The prompt should be something like "200 words on what your favorite toy was as a child," or "Tell me a story about the most fun you ever had." Once you have a sample of the student's real writing style in hand, you can use it later to have an AI tool review that sample against what you suspect might be AI-written work (a rough sketch of this comparison follows the list).
  • Ask for a rewrite: If you suspect a student of using AI to cheat on their assignment, take the submitted work and ask an AI tool to rewrite the work for you. In most cases I've encountered, an AI tool will rewrite its own work in the laziest manner possible, substituting synonyms instead of changing any material elements of the "original" work.
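
As referenced in the third item above, here's a rough, hypothetical sketch of how that writing-sample comparison could be automated with the OpenAI Python client; the prompt wording, file names and model name are my own assumptions, and the model's verdict is a nudge to look closer, not evidence on its own.

```python
# Hypothetical sketch of the writing-sample comparison described in the list
# above. Prompt wording, file names, and model name are assumptions, and the
# output is a judgment aid, not proof of misconduct.
from pathlib import Path

from openai import OpenAI

client = OpenAI()

known_sample = Path("student_baseline.txt").read_text()  # the fun, personal piece
suspect_work = Path("suspect_essay.txt").read_text()     # the submission in question

prompt = (
    "Text A is a verified writing sample from a student. Text B is a new "
    "submission. Compare vocabulary, sentence structure, and tone, and note "
    "any striking differences that might suggest Text B was not written by "
    "the same person.\n\n"
    f"Text A:\n{known_sample}\n\nText B:\n{suspect_work}"
)

reply = client.chat.completions.create(
    model="gpt-4",  # placeholder model name
    messages=[{"role": "user", "content": prompt}],
)
print(reply.choices[0].message.content)
```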

Here's an example of the rewrite approach:

[Image: the AI tool's rewrite of the earlier AI-generated response]

Now, let's take something an actual human (me) wrote, my CNET bio:

[Image: the AI tool's rewrite of the author's CNET bio]

The phrasing is changed, stripping much of the soul from the writing and replacing it with sentences that are arguably clearer and more straightforward. There are also additions to the writing, presumably for further clarity.

The most important part about catching cheaters who use AI to do their work is having a reasonable amount of evidence to show the student and the administration at your school if it comes to that. Maintaining a skeptical mind when grading is vital, and your ability to demonstrate ease of use and understanding with these tools will make your case that much stronger.

Good luck out there in the new AI frontier, fellow teachers, and try not to be offended when a student turns in work written by their robot collaborator. It's up to us to make the prospect of learning more alluring than the temptation to cheat.
