FAQs on IntelliMetric and Automated Essay Scoring

August 16, 2019

What is IntelliMetric®?

IntelliMetric® is an intelligent scoring system that emulates the process carried out by human scorers. It is theoretically grounded in a cognitive model often referred to as a “brain-based” or “mind-based” model of information processing and understanding. This Automated Essay Scoring (AES) engine draws on the traditions of Cognitive Processing, Artificial Intelligence, Natural Language Understanding, and Computational Linguistics to evaluate written text.


How long has IntelliMetric® been used?

IntelliMetric has been used to score essays and papers since 1998. Many improvements made to the scoring system since then have increased its scoring accuracy. It is used in college placement tests, employment application screening, admissions tests, and scholarship competitions, and it is integrated into textbook platforms and other academic applications.


How do we know that IntelliMetric® works?

To evaluate whether IntelliMetric scores essays accurately, we put IntelliMetric through the same tests that we would give an expert human rater. After a human rater is trained, he or she is asked to score a set of essays that have previously been scored by experts. The scores from the new scorer are compared with the “known” scores to determine the agreement rate. If the human rater meets the criteria for acceptable agreement, the rater is allowed to score new essays. Similarly, after IntelliMetric is trained, it is asked to score a set of essays that were previously scored by experts. Just as in the human scoring process, we look at the agreement between the IntelliMetric scores and the expert scores.

If IntelliMetric meets or exceeds the standards for a human rater, it can be put into use to score new essays. In short, IntelliMetric is treated much like any expert scorer on the team when evaluating for consistency and accuracy. It must meet the same high benchmarks of quality that any human expert must meet.
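The agreement check described above can be sketched in a few lines of Python. This is only an illustration, not the Vantage implementation: the function names, the two agreement measures (exact and within one score point), the threshold values, and the sample scores are all assumptions made for the example.

```python
def agreement_rates(candidate_scores, expert_scores):
    """Return (exact, adjacent) agreement rates between two score lists.

    exact    = share of essays where both scores are identical
    adjacent = share of essays where the scores differ by at most one point
    """
    if len(candidate_scores) != len(expert_scores):
        raise ValueError("score lists must be the same length")
    if not candidate_scores:
        raise ValueError("score lists must not be empty")
    pairs = list(zip(candidate_scores, expert_scores))
    exact = sum(1 for c, e in pairs if c == e) / len(pairs)
    adjacent = sum(1 for c, e in pairs if abs(c - e) <= 1) / len(pairs)
    return exact, adjacent


def meets_standard(candidate_scores, expert_scores,
                   min_exact=0.6, min_adjacent=0.95):
    """Apply the same (hypothetical) thresholds to a human or a machine scorer."""
    exact, adjacent = agreement_rates(candidate_scores, expert_scores)
    return exact >= min_exact and adjacent >= min_adjacent


# Made-up scores on a 1-6 scale, for illustration only.
expert = [4, 3, 5, 2, 4, 3, 1, 6, 4, 3]
engine = [4, 3, 4, 2, 4, 3, 2, 6, 4, 4]

exact, adjacent = agreement_rates(engine, expert)
qualified = meets_standard(engine, expert)
print(exact, adjacent, qualified)
```

The same function is applied to any candidate scorer, human or automated, which mirrors the idea that both are held to one standard.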


How is IntelliMetric® trained?

Just as human raters are trained to score a new prompt, IntelliMetric is given a training set that includes many essays that were previously scored by experts. It typically takes 300-500 human-scored essays to provide a sufficient training set. Our process analyzes that information to determine what characterizes an essay deserving of each score point, as determined by the experts. IntelliMetric establishes a scoring model unique to each training set that best predicts the scores the experts would give to new essays submitted to that same prompt. After the model is created, the IntelliMetric scores are compared to expert scores on a validation set of essays. These essays have been scored by humans but are not part of the training process. If IntelliMetric and the experts agree, the model is ready to be put into use to score new essays submitted to that prompt. If not, we review the rubric, gather more essays, and fine-tune the engine until the results are consistent.
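The train-then-validate workflow can be illustrated with a short Python sketch. Everything here is an assumption made for the example: the 80/20 split, the 0.6 agreement threshold, the synthetic essay records, and especially `train_baseline`, which is a trivial stand-in that always predicts the most common score and has nothing to do with how IntelliMetric actually builds a model.

```python
import random
from collections import Counter


def split_scored_essays(scored_essays, validation_fraction=0.2, seed=0):
    """Split (essay_text, expert_score) pairs into training and validation sets."""
    shuffled = list(scored_essays)
    random.Random(seed).shuffle(shuffled)
    n_validation = max(1, int(len(shuffled) * validation_fraction))
    return shuffled[n_validation:], shuffled[:n_validation]


def train_baseline(training_set):
    """Stand-in 'model' (NOT IntelliMetric): predicts the most common expert score."""
    most_common = Counter(score for _, score in training_set).most_common(1)[0][0]
    return lambda essay_text: most_common


def exact_agreement(score_fn, validation_set):
    """Share of held-out essays where the model matches the expert score."""
    hits = sum(1 for text, score in validation_set if score_fn(text) == score)
    return hits / len(validation_set)


# 400 synthetic records standing in for human-scored essays on one prompt.
essays = [(f"essay text {i}", 1 + i % 6) for i in range(400)]

training_set, validation_set = split_scored_essays(essays)
model = train_baseline(training_set)
rate = exact_agreement(model, validation_set)

# Hypothetical acceptance rule: deploy only if agreement is high enough;
# otherwise review the rubric, gather more essays, and retrain.
ready_for_use = rate >= 0.6
print(len(training_set), len(validation_set), rate, ready_for_use)
```

The point of the held-out validation set is that agreement is measured on essays the model never saw during training.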


How long does it take to train IntelliMetric®?

The actual IntelliMetric model creation process is fast. It takes considerably more time to create the training set, which requires collecting an appropriate number of essays that have been scored by human experts. After the model is created, time is also needed to carefully determine whether the model meets an accepted level of agreement.


Is IntelliMetric® similar to other automated essay scoring products available?

IntelliMetric is a unique automated scoring engine. Its use of Artificial Intelligence and Natural Language Processing, together with its close modeling of the human rating process, distinguishes it from other scoring engines. Research has shown that the use of two or more raters provides a more accurate final score than the use of a single rater, and as such IntelliMetric was developed to include the equivalent of a panel of raters. Specifically, within IntelliMetric there are, in essence, multiple automated scoring systems at work, each using a different approach to scoring. Like human raters, who will often rate essays differently, these approaches emphasize different characteristics of the writing. The resulting scores from each “judge” are combined into a final IntelliMetric score.
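As a rough illustration of the panel idea, the sketch below combines the scores of several independent “judges” into one final score. The source does not say how IntelliMetric combines its internal scores, so the median rule, the 1-6 scale, and the judge names are all assumptions chosen only to make the example concrete.

```python
from statistics import median


def combine_judge_scores(judge_scores, low=1, high=6):
    """Combine several judges' scores into one final score.

    Assumed rule for this sketch: take the median, round halves up,
    and clamp the result to the score scale.
    """
    if not judge_scores:
        raise ValueError("at least one judge score is required")
    combined = median(judge_scores)
    final = int(combined + 0.5)  # round half up for non-negative scores
    return max(low, min(high, final))


# Hypothetical panel: each judge emphasizes different characteristics.
panel = {"judge_a": 4, "judge_b": 5, "judge_c": 4}
final_score = combine_judge_scores(list(panel.values()))
print(final_score)
```

A median is used here because it is not pulled far by a single outlying judge; an average or a weighted vote would be equally plausible choices.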


Another key difference between IntelliMetric and other solutions is its inductive approach to learning how to score essays. Since IntelliMetric is not rule-based or driven by a fixed list of features, the engine is able to score submissions that range from as short as one word to very long pieces of writing. IntelliMetric does not require one solution for short answers and another for long answers, or one for persuasive essays and another for narrative essays. One engine can do it all!


The most important difference between IntelliMetric and other automated essay scoring engines is accuracy. IntelliMetric provides unsurpassed essay scoring accuracy because it is based on the expert human raters who provided the training set.


Does IntelliMetric® score the same as an expert rater?

While IntelliMetric does not read an essay the same way an expert rater does, it is able to score as accurately as, and often more accurately than, a human rater. We determine this by comparing how often IntelliMetric agrees with expert raters against how often expert raters agree with each other. IntelliMetric has consistently been found to agree with expert raters as often as, or more often than, experts agree with each other.
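The comparison described above can be sketched as follows. The scores are invented for illustration and are not IntelliMetric results; the function name and the use of exact agreement as the measure are likewise assumptions.

```python
def exact_agreement(scores_a, scores_b):
    """Share of essays on which two scorers gave the identical score."""
    if len(scores_a) != len(scores_b) or not scores_a:
        raise ValueError("score lists must be non-empty and the same length")
    return sum(a == b for a, b in zip(scores_a, scores_b)) / len(scores_a)


# Made-up scores from two experts and an automated engine on the same essays.
expert_1 = [4, 3, 5, 2, 4, 3, 1, 6]
expert_2 = [4, 4, 5, 2, 3, 3, 2, 6]
engine = [4, 3, 5, 2, 4, 3, 2, 6]

# Baseline: how often do the experts agree with each other?
human_human = exact_agreement(expert_1, expert_2)

# Engine: average agreement with each expert.
engine_human = (exact_agreement(engine, expert_1)
                + exact_agreement(engine, expert_2)) / 2

print(human_human, engine_human, engine_human >= human_human)
```

In this toy data the engine agrees with the experts more often than they agree with each other, which is the pattern the comparison is designed to detect.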


Can IntelliMetric® score on only one rubric?

No. Since IntelliMetric is a learning engine that learns how to score based on a training set of previously scored essays, it is able to score accurately across a variety of rubrics.


Can IntelliMetric® provide domain scores?

Yes. IntelliMetric is able to score holistically as well as for particular domains of writing. Common domains that IntelliMetric is used to score include Focus, Organization, Development, Mechanics, Grammar, Voice, and Language Use.


Can IntelliMetric® be tricked?

Yes. Just as an expert rater can sometimes be tricked, IntelliMetric is not a perfect system. Since we know IntelliMetric can be tricked, we have controls in place to catch non-legitimate essays submitted for scoring. At Vantage, we tend to be on the conservative side, flagging a considerable number of essays for expert review to be certain we catch the non-legitimate essays. For any prompt, approximately 5% of the most aberrant essays will be flagged for expert review. We have had great success in identifying essays that are off-topic or off-task, lack proper development, are written in a language other than the one expected, contain bad syntax, copy the question, or are inappropriate or contain messages of harm.
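Flagging roughly the top 5% of essays for expert review can be sketched as a simple ranking step. How an “aberrance” value is actually computed is not described in this FAQ, so the scores below are placeholders, and the function name and rounding rule are assumptions for the example.

```python
def flag_for_review(aberrance_by_essay, flag_fraction=0.05):
    """Return the IDs of the most aberrant essays, most aberrant first.

    aberrance_by_essay maps an essay ID to a hypothetical aberrance score,
    where a higher value means the essay looks less like a legitimate response.
    """
    n_flag = max(1, round(len(aberrance_by_essay) * flag_fraction))
    ranked = sorted(aberrance_by_essay, key=aberrance_by_essay.get, reverse=True)
    return ranked[:n_flag]


# Placeholder scores for 100 essays submitted to one prompt.
aberrance = {f"essay_{i}": i / 100 for i in range(100)}

flagged = flag_for_review(aberrance)
print(flagged)
```

Flagging a fixed fraction, rather than using an absolute cutoff, guarantees that some essays from every prompt reach a human reviewer, which matches the conservative stance described above.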


Why use IntelliMetric®?

IntelliMetric is a highly accurate scoring engine. It provides accuracy that is superior to that of other scoring engines currently available. Its combination of Artificial Intelligence and Natural Language Processing, along with training that parallels the training of human expert raters, allows IntelliMetric to achieve an agreement rate that is often higher than the rate achieved between two human raters. After IntelliMetric has been trained to score responses written to a specific prompt, it can be used to successfully and consistently score essays. Unlike human scorers, who require considerable time to score each response, IntelliMetric requires minimal time to score a large number of new essays.


The ability to score essays using a multiple scoring systems approach, provide holistic and domain scores, detect illegitimate responses, and score essays written in a variety of languages makes IntelliMetric the most comprehensive scoring engine available.


Learn more:

Intellimetric product website

Intellimetric Introductory Video

Intellimetric Automated Essay Scoring by McCann