Robo-Graders: How Accurate Are They?

I have always felt that it is grossly unfair that teachers are required to spend their leisure time on work that should be paid by the hour; after all, these dedicated individuals have lives apart from the school setting. Fortunately, in response to this concern, a programmer has designed new weapons for the teacher’s arsenal, called autoreaders, that promise to let the often underpaid and underappreciated teachers of America eliminate the countless hours spent reading and grading student essays.

To find out whether the teachers who groom our children and prepare them to be all they can be can actually reduce the time they now spend reviewing student work, I have chosen to evaluate this software. Remember, as you read this review, that this program would be a major component in how your child’s assignments are graded. You should also be aware that, if this program is installed on your computer, your progeny could use it to test their essays prior to submission and see what grade to expect. With my grandchildren in mind, I approached the software with excitement, wanting to learn all I could about how it functions.

As I began my own research, I came across a study of autoreaders that was conducted at the University of Akron by Mark D. Shermis, the Dean of the College of Education. According to the press release and subsequent report (linked below), the study covered six states, used 16,000 essays, and rated nine different autoreaders. To verify the results, the researchers compared the scores assigned by the grading software with the scores educators had given the same essays. The differences in locale, school grading policies, and educator qualifications ensured that the sample covered a variety of grading styles.

The overall conclusion of the study was that there was little, if any, variation between the scores assigned to the essays by the autoreaders and those assigned by the educators. In addition, it was reported that while teachers spent between two and three minutes reading an individual essay, autoreaders could read and grade 16,000 essays in 20 seconds. But does speed really matter? Do people trust having their hard work examined by a machine rather than by a human?
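To put those speed figures in perspective, here is a quick back-of-the-envelope calculation. It uses only the numbers quoted above (16,000 essays, two to three minutes per essay for a teacher, 20 seconds in total for the software); the variable and function names are my own, and the result is a rough illustration rather than anything taken from the study itself.

```python
# Figures quoted in the article; everything else is simple arithmetic.
ESSAYS = 16_000
MACHINE_SECONDS_TOTAL = 20


def human_hours(essays, minutes_per_essay):
    """Total hours one reader would need at a fixed pace per essay."""
    return essays * minutes_per_essay / 60


low = human_hours(ESSAYS, 2)   # fastest quoted pace: 2 minutes per essay
high = human_hours(ESSAYS, 3)  # slowest quoted pace: 3 minutes per essay
machine_ms_per_essay = MACHINE_SECONDS_TOTAL / ESSAYS * 1000

print(f"Human reader: {low:.0f} to {high:.0f} hours")
print(f"Autoreader:   {machine_ms_per_essay:.2f} milliseconds per essay")
```

In other words, the same pile of essays would cost a single teacher somewhere between roughly 533 and 800 hours of reading.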

While essay answers or reports may be more difficult to critique, one must remember that teachers have been using machines to grade multiple choice tests for years. For these types of tests, the automated process requires one to:

  • use nothing other than a number 2 pencil on the answer sheet.
  • make sure that any corrected answer is clearly marked and that the incorrect one is completely erased.
  • make sure that there are no stray marks between the lines, as they will confuse the automated grading process.
  • understand that if two answers are accidentally marked, the answer will be considered incorrect.
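The last rule in that list is simple enough to sketch in a few lines. This is a hypothetical illustration, not how any particular scanning machine works: the `score_sheet` function and the sample data are my own inventions, and a real scanner also has to deal with smudges and erasures, which are ignored here.

```python
def score_sheet(marks, answer_key):
    """Count correct answers on one answer sheet.

    marks: one set of bubbled choices per question (hypothetical format).
    answer_key: the correct choice for each question.
    A question counts only if exactly one bubble is filled in and it
    matches the key, so a double-marked question is scored as incorrect.
    """
    correct = 0
    for bubbled, right in zip(marks, answer_key):
        if len(bubbled) == 1 and right in bubbled:
            correct += 1
    return correct


answer_key = ["B", "D", "A", "C"]
marks = [{"B"}, {"D", "A"}, set(), {"C"}]  # question 2 has two bubbles filled

print(score_sheet(marks, answer_key))  # prints 2
```

Question 2 is marked wrong even though the right answer is among the two bubbles, which is exactly why a changed answer has to be erased completely.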

So can autoreaders replace a teacher’s expertise in grading student essays?

My belief is that they may be able to assist teachers but, as with any software product, there are some limitations on their effectiveness. Some of these limitations were reported in an interview conducted by Michael Winerip of the N.Y. Times with Les Perelman, a director of writing at the Massachusetts Institute of Technology. In the interview, Mr. Perelman noted that there are still several reasons that teachers can’t rely totally on the results of an autoreader. Here are a few of the problems he noted in reference to one autoreader that he tested:

  • The software can be fed erroneous information and still give a passing grade.
  • Short sentences are downgraded, even though the sentences may be grammatically correct and purposeful for the essay.
  • The software was ineffective when analyzing poetry.
  • The software appears to automatically assign higher grades to longer essays.

But what I found really surprising was that Perelman claimed that two of his graduate students convinced him that they could write an Android app that would assign a passing grade to any essay of their choosing.

So what do the results of testing one software product mean?

I personally believe it just goes to show that no software is 100% perfect, no matter which operating system we choose to use. It also doesn’t seem to matter how many firewalls or anti-virus programs we download, since even government computers and credit card information have been found to be vulnerable to the myriad of hackers out there. Knowing that, how can anyone expect that a clever youngster isn’t going to find a way to backdoor a teacher’s autoreader?

Even if we take this as a given, you still need to remember that this software has its value, especially if you, as a parent, use it as a tool to sharpen your child’s writing skills. Additionally, if you like to see how your child is doing before an assignment is turned in, the autoreader will make it easy for you to determine what should be changed to ensure the best grade possible. In this case, though, you may wish to use it only with elementary-age students, who may not have the expertise or knowledge to circumvent the software to change their grade. Then again, when I see our 2.5-year-old granddaughter already using an Apple iPad, where decades ago a child of that age would be playing with wooden blocks, it becomes apparent how technologically advanced our youth have become, meaning that you will have to judge each child on an individual basis.

Comments welcome.

Source: Contrasting State-of-the-Art Automated Scoring of Essays: Analysis, Mark D. Shermis, The University of Akron


Source: N.Y. Times

CC licensed Flickr photo above shared by teachandlearn