Qvatch, from the German Quatsch, means nonsense.
Wednesday, December 5, 2007
Lecturing and the absence of Learning
The punctuated model of learning that we have is historically embedded in our agrarian heritage. We know, for instance, that total immersion works for learning languages; it works to such a degree that learning German in Germany is virtually trivial compared to learning German in one-hour sessions every other day, as usually done in American universities. Clearly, immersion learning in calculus, however strenuous, would work better than what we do now, but no one is espousing it.
We start with the book, the text, the font of all knowledge. Any casual inspection of current textbooks, compared to past efforts, shows that they are significantly improved. Although the reading level has decreased, that decrease does not necessarily lead to intellectual pollution. The idea that students learn to solve canned problems leads critics to shake their heads, but performing at such a minimal level is a necessary first step before progress can be made. We all really learn by re-visiting the material at a spiraling level of sophistication, placing it into its niche within the intellectual space we are building. The re-visiting is important.
When I teach, I tease my students about what they've forgotten (and I've been criticized for it, and in fact punished for it), but the teasing has a purpose. They are studying a subject whose predecessor materials have been learned (and forgotten). They need to re-learn the material, now from the point of view that they need it for the current material they are muddling through. My teasing seems a better method than assigning them to re-learn the material. They're adults; assigning freshman materials is demeaning (IMHO). But teasing them puts them in the position, after re-learning the material, of being able to feel superior (certainly to the freshmen who are struggling with it), since not only have they easily re-learned it, but they did so for a reason that was absent the first time around.
We have all forgotten material that we've learned in the past, partly because this material was never used again, never needed again, and therefore buried under newly learned material which was more important (temporally).
The trick is to change the examining system so that precursor materials need to be known during examinations. This would make the re-learning of material worthwhile, and would motivate students to retain relevant material as they progressed. Like a muscle being exercised, what we know is what we've used recently. The longer the material has lain dormant, the less we remember of it. If our examinations did not excuse precursor ignorance, we would change the culture of learning, so that as we progressed, as we learned more and more (about less and less?), we would retain more of the precursor material we now see disappearing.
Monday, December 3, 2007
Lecturing and Learning
"Lectures were created as a means of transferring information from one person to many, so an obvious topic for research is the retention of the information by the many. The results of three studies (which can be replicated by any faculty member with a strong enough stomach) are instructive. ... Hrepic et al. (2007) ... asked 18 students from an introductory physics class to attempt to answer six questions on the physics of sound and then, primed by that experience, to get the answers to those questions by listening to a 14-minute, highly polished commercial videotaped lecture given by someone who is supposed to be the world's most accomplished physics lecturer. On most of the six questions, no more than one student was able to answer correctly. ... These results do indeed make a lot of sense and probably are generic, based on one of the most well-established (yet widely ignored) results of cognitive science: the extremely limited capacity of the short-term working memory. The research tells us that the human brain can hold a maximum of about seven different items in its short-term working memory and can process no more than about four ideas at once. Exactly what an "item" means when translated from the cognitive science lab into the classroom is a bit fuzzy. But the number of new items that students are expected to remember and process in the typical hour-long science lecture is vastly greater. So we should not be surprised to find that students are able to take away only a small fraction of what is presented to them in that format."
(Wieman, C. 2007. "Why Not Try a Scientific Approach to Science Education?" Change Magazine, September/October; online. See also Wieman & Perkins (2005).)
They (and similar studies) have set up a straw man and then knocked him down. I am not interested in the short-term learning during or just after lecture. I expect the student to review the material covered, test it against reasonableness, and incorporate it (or not) into his/her psyche. If s/he thinks it's wrong, I want the student to come back and argue. If, after a history lesson, which is surely solely memorization, i.e., no concepts whatsoever, a student can't remember a factoid from the first few minutes of the lecture, has the lecture failed?
One forgets that from a scale point of view, lecturing is the only effective method of instruction absent distance learning (an as yet unproven technique). From time immemorial, elders have spoken to youth to instruct them. There was no other method of instruction, absent one-on-one tutoring, which is never cost effective. And one-on-one instruction is not failsafe anyway. There used to be a folk tale concerning Mark Hopkins, a pupil, and a log, but in actuality, one teacher with one student is neither necessary nor sufficient to guarantee learning.
What we forget, IMHO, is that teaching and learning are not the same thing. Teaching means presenting the material. Good teaching means being able to present it in more than one modality, more than one phraseology, more than one viewpoint. But ultimately, teaching means telling a pupil something (perhaps in more than one way) and hoping for learning.
But learning is the pupil's problem, not the teacher's.
Learning is the student's responsibility. S/he can't just absorb it on the fly. Practice makes perfect holds in school as well as in getting to Carnegie Hall.
Most important, it is absurd to think that today's student can learn from the printed page. The texts are now dumbed down enough that anyone who can read can learn from them. But our students can't (IMHO) read for comprehension. They read for pleasure, if at all. They come from households (IMHO) without books in them, with parents who do not read (IMHO) (even a newspaper), and from places where the television is on full time (IMHO). They expect to be entertained, and they expect that learning is fun and games. Even their obsession with video games translates into an inability to learn from the WWW. As adults, we've given up on them, and allowed the bifurcation of students into two groups: those from parents who have some modicum of technical ability and care, and the others.
Consider that most parents cannot help their children with algebra homework, and undermine the school by saying that they "never understood it" at the time. Why should students strain for understanding when their "successful" parents never needed algebra? When we devalue learning as parents, we cannot expect our children to want to learn.
Technically educated parents can hover over their children, correcting their mistakes, knowing the material their children are learning. So what we are developing is a society of two "cultures", which I call those who can do algebra and those who can't.
Just as we see the enormous dropout rate among some students, we see an ever-increasing achievement shown by highly precocious students vying for the limited number of spots at prestige universities. This bifurcation makes second-rate universities into glorified high schools, and prestige universities into elitist institutions. What a problem; but we've drifted away from lecturing and learning. Sorry.
Returning to lecturing, one notes in real lectures that most of the students are not engaged in the slightest. The only time we really have their attention is during examinations, and it is interesting that Computer Assisted Testing proposals have all recognized that only when interactively facing material under test do we get a reasonable facsimile of wholehearted engagement.
Friday, November 30, 2007
Multiple Choice Testing: Fair, Equitable, and Wrong
The Newsweek article on the Spellings Commission's recommendation for testing schools' graduates, to ascertain whether or not they've been educated, used as its graphic a pencil and a multiple choice machine-grading form (a bubble sheet), implying that multiple choice testing will continue forever.
No criticism of this method of examining students is heeded, ever.
It remains an article of faith that machine-graded multiple choice examinations are the only unprejudiced method available for testing. No one remembers Banesh Hoffmann's criticism of MC testing from many years ago, and no one notices that the internet makes available testing modalities which didn't (or couldn't) exist in the paper-and-pencil era. With the continuing flak about "no child left behind", it seems imperative that some discussion of testing the tests for veracity of judgment be carried on, but everyone is silent on this issue.
It is clear (to me) that multiple choice questions are questions in both the subject matter and in reading. Worse, the reading needed for MC questions is specialized; it's not the reading we do for prose or poetry, it's unique.
Furthermore, the reasoning employed in a multiple choice environment is not one which carries over into the normal activities of even the smartest and most creative. Guessing, including enlightened guessing, is encouraged. Many test tutors encourage reasoning schemes which have nothing to do with the logic of the problem actually being addressed.
And continuing in this vein, one notes that MC tests do not exist in the "real world", where answers need be constructed out of nothing, out of air and thoughts. What are called "constructed response" items, in which there are no hints whatsoever concerning the ultimate answer, are much closer to reality than MC responses, which are not meaningful either when they're right or when they're wrong.
The major advantage of paper-and-pencil multiple choice examinations is that the teacher needn't grade them. A machine can do that. Hail to the machine, and to the teacher who has figured out how to lower his/her workload.
We've known for years how to create machine-readable "constructed response" numerical items, and to the best of my knowledge (TTBOMK) these are used only in mathematical competition examinations. I've used them for years (although hand-graded, since no machine existed on campus that could read and interpret the items properly) as a way to force students to answer in a single place in a single format. The scheme mirrors the "cgi-bin" Perl programs I've also used for years to allow students to answer "constructed response" items on the World Wide Web.
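The Perl cgi-bin programs themselves are not reproduced here, but the core of machine-grading a numerical constructed-response item is small enough to sketch. The following Python fragment is a hypothetical stand-in for those originals (the function name and tolerance are my assumptions, not taken from the actual programs): it accepts an answer typed in a single place, in a single format, and judges it against the keyed value within a relative tolerance.

```python
def check_numeric_response(submitted: str, correct: float,
                           rel_tol: float = 0.01) -> bool:
    """Accept a constructed numerical response if it parses as a
    number and falls within a relative tolerance of the keyed answer."""
    try:
        value = float(submitted)
    except ValueError:
        return False  # an unreadable entry is simply wrong
    if correct == 0.0:
        return abs(value) <= rel_tol
    return abs(value - correct) / abs(correct) <= rel_tol

# e.g. keyed answer 6.022e23; a student entering "6.02e23" passes,
# while "5.0e23" or "dunno" does not
```

A real grader would also have to normalize units and notation variants; the one-percent tolerance here is purely illustrative.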
The main criticism of MC testing is that the "distractors" are intentionally chosen to mislead students; this is contrary to all normal human practice in problem solving, and is actually harmful to students. When they've made a mistake and find their (incorrect) answer in the list of possible choices, it is just plain human nature to choose that (incorrect) answer and move on. Deceiving the student into this kind of mistake is cruel, unnatural, and not in the student's best interest!
Wednesday, November 28, 2007
Computer Assisted Testing: the quote that started me thinking
FROM:"College Mathematics: Suggestions on How to Teach It"
Undergraduate Committee on the Teaching
of Undergraduate Mathematics
Mathematical Association of America
March 1979
"Just when you have become established as the student's staunch ally, you are obliged to shift into the role of judge and jury and, it may be, executioner."
This is the statement that turned my attention to Computer Assisted Testing! If we could substitute the machine for the human, students could rely on us always without worrying about our chameleon status. We would always be on their side!
If the computer says a student is wrong, the student is likely not to be angry at the machine, but perhaps at him/herself. Perhaps the student will self-assess and recognize that s/he is not well prepared. Perhaps, perhaps, perhaps.
In essence, the idea of CAT is the idea of making the measurement better rather than eliminating it in favor of "school of education" measures such as portfolios, etc. You can argue about the content of tests, but whatever it is you want students to demonstrate that they know, you need a tool for assessing that knowledge which is unprejudiced, treats all students identically, and judges correctly. It may, repeat may, offer help to students who make a mistake, provided of course that such help is available to any and all students. This last part is solely at the discretion of the examination's creator (with his/her programmer's help).
I imagine rooms filled with old computers which have been hobbled so that they cannot send or receive e-mail or text messages, and whose browsers can only access the examiner's server (URL) for examination questions. Using old, discarded (obsolete) machines solely as browsers, in a closed environment proctored to avoid telephone and interpersonal cheating, students could present themselves when "ready" and take their examinations asynchronously, since it is easy to present different questions to different students in a completely fair manner. For those who do badly, the ability to come back and re-take examinations after more studying would be appropriate.
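The "hobbled browser" constraint is easy to state in code. As a sketch only (the host name is invented, and a real deployment would enforce this at the network or proxy level rather than in Python), the rule each examination machine would apply to every outgoing request is:

```python
from urllib.parse import urlparse

EXAM_SERVER = "exam.example.edu"  # hypothetical examiner's host

def request_allowed(url: str) -> bool:
    """A hobbled browser refuses any request whose host is not the
    examiner's server; e-mail and messaging simply never connect."""
    host = urlparse(url).hostname or ""
    return host == EXAM_SERVER
```

Everything else about the room (proctoring, asynchronous scheduling) is organizational, not technical.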
In a weird way, I consider this application also an investment in re-cycling used and obsolete computer equipment; i.e., the teaching institutions could scavenge boards, hard drives, etc., from machines to use in others as their equipment failed, thereby giving a "second life" (I know, I know) to machines whose desktop use has passed, but which are still fit for specialized service such as this application, where speed is of little to no importance (in fact, the machine is idle most of the time while examinees think, calculate, and ponder).
For technical material, this is the best chance we have to re-rigorize what appears to be a crumbling intellectual edifice. Woe unto us if we fail to bring excellence back into learning.
Computer Assisted Testing, Almost the Original Proposal
Spring 1981
1. Pre-Introduction
This manuscript is a partial reproduction of one I published in Spring 1981, slightly augmented with comments as warranted, given the 20+ years since it was written. Since the original was professionally edited, any errors of grammar and/or usage are mine, induced during the current incarnation. After the CAT discussion, there is an introduction to Computer Guided Reading (CGR), a brainstorm which occurred after many years of failing to get CAT even slightly accepted. CGR is a recognition that the reading skills of our students have attrited just as their math skills have, and that their acceptance of authority is becoming terrifying, i.e., they are ceasing to question things which should be within their intellectual grasp, and are instead accepting them “on faith”! CGR was an attempt to correct this tendency. It failed also. Finally, returning to this material, I am posting it as a form of testament (to the failures of a career). I should add that at the time of writing, Spring 2005, the Math Department has begun implementing CAT in a monitored setting. This means that my prediction of many years ago has come to fruition naturally, of course without my help, but what can you do?
At the time of posting, my personal situation has worsened slightly, from a teaching point of view, but that's another story.
1.1 Introduction
The quality of undergraduate academic preparedness is dropping at the same time as grade inflation continues on its rampage. The lack of standard examinations for fourth-year students in American schools of higher education makes this statement difficult to quantify (the Achievement Test of the College Board and the Graduate Record Examination are voluntary, not mandatory, and are therefore administered to only a fraction of our graduating seniors), but a consensus of colleagues shows that as reading ability drops, as SAT scores drop, and as the amount of material covered in key courses drops, the performance of our students is dropping also.
Of the indices which we possess to measure this decline, the foremost is old examinations. The examinations we gave ten-plus years ago, and regarded then as quite fair, are now too hard. The knowledge that we expected students to bring to our courses then is no longer expected of them now. The belief that the teacher’s standards were reasonable is now superseded by the expectation that the class mean sets the standard for class performance.
Accountability has given rise to teacher rating systems that influence tenure, promotion, and salary decisions. Under these circumstances, faculty members would be fools to insist on high-quality performance based on their own subjective evaluations, and the fact is that the process of erosion continues.
1.2 What’s Wrong?
My particular set of biases comes from 17+ years of teaching physical chemistry at the university level. My biases about any technical course are quite specific, and it is worthwhile to set them out at the beginning, so that they may be examined separately from the question being raised about testing. Technical education is not “life preparation”, culture, or enrichment. At some point, technical education in chemistry, physics, mathematics, and all the other sciences and technologies becomes the serious business of building tools for a lifetime of technical work. Regardless of the vocabulary, that is, the particular science or technology being taught, there is a common thread that winds through all technical education, and that is obsolescence. We are all obsessed by the fact that what we learned when we were students is at least partially obsolete now. Therefore, our teaching now is oriented toward the creation of tools and of mindsets. We do not expect the student to remember much, but we expect him to be able to rapidly re-equip himself with any piece of knowledge which he has seen in the past. We accept that the details are lost over time, but demand that they be re-creatable when needed. Then, when our students are out in the “real world”, they will be equipped to learn or re-learn whatever is needed to function in the environment where they happen to find themselves. It is in the re-creation aspect of knowing that our greatest problem clearly presents itself.
Mathematics appears to lie at the center of this problem. Without taking excessive space to catalog the variety of errors seen on examinations, it suffices to note that many of our junior-year students cannot integrate with any distinction, cannot abstract partial derivatives, cannot successfully carry out error-free elementary algebra, and in general cannot translate mathematical equations into meaningful knowledge, or ideas about the physical world into meaningful mathematics. The errors that we see imply a lack of integration of mathematics into the psyche of our students, so that it remains a foreign language rather than a tool. The fundamental reason for this failure resides (IMHO) in testing and the manner in which we accredit students’ academic progress. One of the reasons that testing results in this incredible ignorance on the part of some of our students is the fragmentation it forces on the course material, into small digestible (examinable) chunks. The scheme we use to organize ourselves allows our students to learn an epsilon of information for a given examination, forget that information within minutes of the end of the exam, and relearn that same epsilon for the final. Come the end of the course, the student expurgates the material in a fit of triumph at having beaten the system once again. Years later (or days?), when the material is suddenly needed, then, and only then, does the student regret his earlier tomfoolery; but, of course, by then it is usually too late.
Given the main problem, that material is distributed in quanta, there exists a host of secondary problems associated with these quanta. First and foremost is the problem of testing on pre-passed quanta. Students regard it as universally unfair to be graded on material which has been “passed” in a previous course. I have had students argue vehemently that I should tell them the formula for the area of a circle, as they are “not responsible in this course for this knowledge”. The fact that this argument, when carried to its logical conclusion, obviates knowing anything never occurs to our students.
Second, if partial credit is assigned to partial answers, it is possible to partially pass multi-quanta questions without ever actually proving knowledge of any single piece of material.
Another reason exists for the lack of preparation that we see in our students. It is that the fraction of material answered on examinations is used as a gauge of quality of performance. This makes sense at the top and bottom of the scale, but in the middle it fails to distinguish the kind of material that is being missed. No distinction between absolutely essential core material and embellishments is made in the normal test situation. As a result, the grade of C, which technically could be construed to mean that a student knew 70 percent of the tested material, actually could mean that the student knew all the material but made many silly errors; or, on the other hand, it could mean that the student didn’t know one entire section on an examination, with that section perhaps being part of the core material. It also could mean that the student functioned at about the 70th percentile in his class, with no reference to knowledge whatsoever.
Further, the time-delayed grading scheme does not help with “stupid errors”, the kind that students feel are infinitely excusable. The lack of unit checking, the algebraic silliness, the entire gamut of errors in areas universally conceived to be absolutely important to proper mathematical functioning, but below a high-level course: these errors detract from the purposes of the testing process and prevent both the student and the teacher from assessing progress and knowledge of subject.
Multiple choice examinations are another source of confusion in this world view I am constructing. Never, after graduation, are there "multiple choices" in technical work (except for decision making). Rather, life poses problems in what educationalists call "constructed response" mode. We've always had the ability to use machine-correction schemes for "constructed response" questions, but they have not been widely employed. Instead, multiple choice has been the choice "du jour". When Newsweek covered Ms. Spellings' proposal concerning testing college graduates, the graphic was a pencil lying on a multiple choice form! Objective testing in America means multiple choice, and this is just plain wrong and bad all around. It's bad for students' intellects, it's bad for the civilization which trusts its results excessively, and it's bad in and of itself!
The main function of testing at this advanced level is not to rank students, but rather to discourage the technically inept from continuing. It is more important to divert the technically incompetent from becoming professional “do-badders” than it is to reward the best students with A’s. For those students who will use their degrees to gain entry into a technical profession, the grades earned are not of paramount importance. The degree is the thing; it almost obliterates the record, and qualifies the owner for the possibility of a technical position in our society. Since that position might be of great importance, the idea that the possession of a degree might not imply technical competence is chilling indeed.
1.3 Some Proposed Changes in How We Examine
If you are primed to read a tirade against multiple-choice testing, you are in for a shock now. We never, never use the multiple choice format for our examinations. Except for the “standard” American Chemical Society examinations, which may be administered by any institution, all physical chemists, to my knowledge, examine using problems and derivations. Problems are of the type:
Compute the energy of a molecule of HD in its ground electronic state, its
second vibrational state, and its 43rd rotational state, assuming that the
molecule is not translating. Use the following atomic and molecular constants in
your computation ...
while derivations might be posed in the form:
For a Dieterici gas, obtain an expression for the critical temperature in terms
of the given constants . . . .
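For orientation on the first problem type: within the standard rigid-rotor/harmonic-oscillator approximation (ordinary textbook material, not the original answer key), the energy requested for HD would be assembled from an expression of the form

```latex
% Rigid-rotor / harmonic-oscillator energy for a non-translating
% diatomic (standard approximation, not the original exam's key):
% vibrational quantum number v, rotational quantum number J
E(v, J) \approx \hbar\,\omega_e\left(v + \tfrac{1}{2}\right) + B_e\,J(J+1)
% Here one would take v = 2 and J = 43 (reading "second vibrational
% state" as v = 2), with \omega_e and B_e among the supplied
% molecular constants for HD.
```

The point of such items is that the student must construct the number; there is nothing to choose among.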
An honest attempt is (usually) made to award partial credit in grading. First, for many-part questions, errors in early stages that propagate forward into later errors are not counted multiply. Second, so-called “stupid errors” are “excused” by penalizing the student only microscopically for errors of arithmetic and perhaps algebra. It is an open question how people grade calculus errors, but whatever is done, students who are harshly graded resent it sorely. Having passed “Calculus”, they feel that being re-graded on it is a form of double jeopardy.
The longer we continue past practices into the future, the more tangled will be the question of how to impose, or re-impose, standards. Without being nostalgic for the past, one must ask whether or not standards have been changing with time. It would surely be a better situation if one could definitely say that students were either as good as they used to be, or better.
What constitutes the ideal examination? My guess as to a perfect examination is, first, that such an examination satisfies both student and examiner that what is to be measured was indeed measured. Second, a perfect examination should be reproducible within reasonable bounds, year after year and student after student. Third, the test should be perfectly unbiased with respect to any and all characteristics of the student, such as sex, age, race, and so forth. Fourth, it should judge student responses in real time, as they are being offered, so that no self-deception is possible. Fifth, it should guide the examinee to correct errors on material that is subordinate to that which is actually under examination. Sixth, the perfect test should be patient with slow or error-prone students. Seventh, it should allow students to give up if they don’t know material, with the implied promise that when they come back to try again, they will not be judged for having tried once before. If this reminds you of a doctoral oral exam, you're right.
1.4 Why Not Examine on Computer?
Why not have a computer pose the problem to be solved, and grade the answer returned as the student watches? Then, if simple, “trivial” errors are committed, the computer can prompt the student with the location and type of error, and ask for a correction. The computer would have a DON’T KNOW button that would enable the student to face the absolute truth. I happen to be familiar with PLATO, and the rest of the discussion will be couched in PLATOeze.
Assuming that our students aren’t fools, we must presume that pushing the dreaded DON’T KNOW button (and leaving the examination for more study) would start a process of review based on demonstrated lack of knowledge in a specific area. Or, it would finally force the student to re-assess his career choice. In either case, as close as is possible to an absolute determination of a state of knowledge has been achieved in an unambiguous and unbiased manner.
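The flow just described, judging each response as it is offered with a DON'T KNOW escape, reduces to a very small state machine. A minimal sketch (the names and the three-try limit are my inventions, not anything PLATO imposed):

```python
def run_item(key, responses, max_tries=3):
    """Judge responses to one exam item as they arrive.
    "DONT_KNOW" ends the item at once, so the student can leave,
    study, and return without being judged for having tried."""
    for attempt, resp in enumerate(responses[:max_tries], start=1):
        if resp == "DONT_KNOW":
            return ("deferred", attempt)  # come back after review
        if resp == key:
            return ("passed", attempt)
    return ("failed", max_tries)
```

A deferred item would re-enter the student's queue later, drawn from equivalent but not identical questions.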
What I am arguing for is a method of facing the student with his or her own lack of knowledge in a manner which is unequivocal. We do not exult in that lack of knowledge, and we do not hold it against the student.
In the PLATO system, it is easy to arrange that the system rarely repeat the exact questions again, so that a student who is required to pass a test before proceeding cannot escalate many failures into a pass. We can control that an entire group takes substantially the same examination without any single member of the group taking the exact same examination. Furthermore, the technology assures us of two very important side effects of this method of testing. First, the method is uniformly applied. There is no question of anyone having any advantage over anyone else. Second, the method is applied consistently regardless of time. This means that the computer doesn’t tire, and start grading easier the further along it is into the process.
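The non-repetition property claimed above is cheap to implement. As a sketch (the function name and pool structure are invented for illustration), each student's past items are excluded before drawing, so a whole class takes substantially the same examination while no student ever sees an exact repeat:

```python
import random

def draw_exam(pool, already_seen, n_items, seed=None):
    """Draw n_items questions the student has never seen.
    All students draw from the same pool, so exams are comparable,
    but a retake can never replay a previously failed item."""
    rng = random.Random(seed)
    fresh = [q for q in pool if q not in already_seen]
    if len(fresh) < n_items:
        raise ValueError("item pool exhausted for this student")
    return rng.sample(fresh, n_items)
```

The only real cost is authoring a pool large enough that many failures cannot exhaust it.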
It is easy to imagine an entire course predicated on this method of testing. Notice, I am not advocating using PLATO to teach the material, only to examine on the material. By using deadlines to pass landmark examinations, one could successfully teach a course without ever falling prey to the numerous misunderstandings which normal testing brings into play. PLATO examination would be the closest thing to having “written-oral” examinations; that is, those in which one expresses himself in words, but in which the examiner’s responses are instantaneous.
Using PLATO, it would be possible to demand perfection in the “trivial error correction mode” for core material before allowing the student to pass, while changing the grading strategy to a more strict variant for optional material required to raise one’s grade above the mere pass.
I need to note that PLATO is, to the best of my knowledge, now dead (2008). Computer Assisted Testing has been implemented in Perl (by me) and in other venues (see WebAssign.net as an excellent example). What we lack is the proctoring that would make security less of an issue. Right now, there is rampant cheating with on-line assessment tools.
If we would like to assure the outside world that our education has produced meaningful results, then an objective measure of knowledge attained is necessary. Such a measure would not assure the “consumer of our products” that the students had acquired the proper education. But it would assure that the student had acquired the education that the institution thought was proper. Perhaps more important, such testing would allow the student to value himself more highly. An absolutely impartial examination that has been passed is a credit to the examinee.
As a final suggestion, and this has little to do with computer grading in real time, I would propose that material from previous lessons or courses that has really not been learned become a penalty to the student. The system should encourage retention of material. In return, we shouldn’t teach or require material that need not be retained.
This brings up a final point, one that is perhaps painful. If we examined students rigorously and demanded minimum competence, then it is quite possible that we would decimate our classes. Therefore, the changeover to such a stark method of examination should also be a time when we re-investigate our course content, to weed out those traditional subtopics which we may be fond of but that could not morally be used as insurmountable barriers to progress in the field. With rigid examinations, we would be constrained to examine on meaningful subjects.
1.5 What’s Wrong With This Method?
For a confirmed addict of tomorrow’s technology, I have to admit that the method of examination described does not solve all our problems. Specifically, it ignores derivations which, for those experienced in the field, are an indication of the amount of intellectual twisting that we have accomplished. One of our goals in physical chemistry is to teach the student to describe his or her ideas about a phenomenon in mathematical terms. Although most of our students cannot be made over into theorists, it is good to convince them that describing reality through equations is a do-able task. It is difficult to see how computer grading can lead to effective testing in this particular area of learning.
Also, PLATO is expensive, and with the advent of highly capable microcomputers and cheap secondary storage, it is clear that alternative technologies exist for carrying out the examination scheme outlined here. Unfortunately, this would mean “reinventing the wheel”, as the PLATO TUTOR language and system capabilities are outstanding tools for concentrating one’s attention on the job at hand, that is, composing lesson/test-ware.
Finally, there will always exist a subset of students who will be unable to deal with non-traditional testing methods, and it follows that traditional methods should be available to them. For some students, typing is an insurmountable chore, and they will not be efficient in front of a terminal. Others fear machines, and still others fear that they will break any machine they touch. So the machine age is not yet with us, even if these proposals are ultimately carried out somewhere.
On the plus side, there is another subset of students who claim that they “do not test well”, and for these students, opening up an alternative testing scheme might be a boon.
But my argument remains that quality education demands standards for students that are uniform, non-varying, and equitable. Computer grading of manipulative questions in chemistry, physics, mathematics, and so forth, would provide unambiguous proof for both the student and the teacher that knowledge was actually attained. Continuation of such a program between courses would allow for follow-up reinforcement of material which should be retained. The ultimate goal of such a program would be to stop graduating technical incompetents into the world. No social pressure of any kind could allow a student to pass through such a program if the student were unable to demonstrate learning of the required material. Consequently, when grades were investigated, one would be measuring by the categories “good”, “better”, and “best”, rather than by our present ambiguous standards.
1.6 Computer Guided Reading, A Logical CAT extension
With the failure of CAT, vide infra, it seems worthwhile to now discuss my other major initiative, Computer Guided Reading, which failed in the same manner, i.e., it was ignored. What follows is a discussion of what failed.
1.7 Introduction To Computer Guided Reading
Over the past few years, it has become apparent that certain skills which used to be part of the graduating college student’s armamentarium were atrophying. In scientific/technical disciplines,
1. the widespread introduction of open book or open notes or open formula list examinations,
2. the introduction of multiple choice testing (including the ACS examinations),
3. the introduction of symbolic mathematical programming into texts (and courses and soon into examinations), and
4. the enthronement of calculators (and demise of slide rules)
have given rise to a culture in the classroom in which learning derivations in Physical Chemistry courses is silly, but learning the “plug and chug” nuts and bolts of doing standardized problems becomes the sine qua non of undergraduate achievement.
Reading assignments of texts are routinely ignored, and students treat their “homework” as doing assigned problems by searching for and cloning the nearest exemplar in their texts.
More and more, our students patiently and politely listen to lectures in which academics derive complicated relations, knowing that this is a form of intellectual masturbation for the academics which has no relevance whatsoever to what real “scientists”, “doctors”, “lawyers”, “engineers”, etc., really do.
Science, from their point of view, appears to consist of religiously accepting that the formulas they’ve been shown are correct, and that doing science consists of using these formulas. For the vast majority of students, the thought of ever doing an original derivation is inconceivable. Students seem willing to accept things, like the Second Law of Thermodynamics, and want to push ahead. There is little doubting, and little intrinsic faith that they, our next generation of scientists, will be called upon to create new equations for new, as yet unknown, phenomena.
To a certain extent, they believe that all that lies ahead of them is using computer programs to do something or other about which they are not too clear. But if something comes out of a computer, it is right, to them.
1.8 The challenge
If one asserts that reading science is different from reading Shakespeare, then it is necessary to clarify that difference in a manner which distinguishes between these two kinds of prose offerings. One of the distinguishing characteristics, which one easily notes, is that there is no need to verify anything written by Shakespeare. Whatever is on the page is OK. It might be that what is written wasn’t written by Shakespeare, i.e., that someone is lying to the student in some way, but even that need not be terribly important.
On the other hand, scientific writing demands of the reader that s/he verify that every non-trivial assertion (of a scientific kind) be supported or proven, or be part of the hypothesis set which drives the prose (and mathematics) forwards. Scientific writing is an attempt to convince the reader that what is written is true. Almost never is it intentionally misleading (although, of course, misleading is in the eye of the reader). Almost always, the author is attempting to convince the reader that what has been written is correct, i.e., a reflection of some kind of objective reality. Brevity easily results in unintended confusion.
But, if one opens a journal, say the Journal of Chemical Physics as an example, one is astounded to realize how many algebra and calculus manipulations have been omitted between adjacent virtually contiguous equations in most (if not all) of these manuscripts. In fact, the brevity of the manuscripts is part of the bravura nature of scientific writing, requiring that the reader fill in the details. Authors assume that the details are (possibly) trivial, and not worth the valuable space on the printed page.
1.9 Scientific Writing and Teaching Scientific Reading
Assuming that all our technical graduate professionals will eventually read journals and have to learn from them, it seems important to make sure that they learn how to read in the critical style which is necessary in reading mathematics-based text. In those texts, equations, interspersed with words, are supposed to lead the reader; but the reader is still expected to be an active one, filling in missing details as necessary so that nothing, at the end, is taken on faith.
In thinking about this problem, of how to get students to read “with a fine toothed comb”, validating each and every equation themselves, I have created a computer guided reading (CGR) scheme which prevents students from “turning the page” and continuing on, when there is something on the page which they do not understand. The thought is that if they could develop the habit of checking what they read as they read, then it would carry on in future life, when text books and teacher/authorities are absent and they are required to learn from the written (journal) page on their own.
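A minimal sketch, in Python, of what I mean by such a page gate; the equation, the sampling point, and the tolerance are illustrative assumptions of mine, not the actual CGR implementation:

```python
import math

# Sketch of the computer-guided-reading gate: the reader cannot "turn the
# page" until every equation on the page has been verified numerically at
# a sampled point supplied by the reader.

def verified(lhs, rhs, tol=1e-9):
    """True when the reader's evaluation of the left side agrees with the
    right side, to within a relative tolerance."""
    return abs(lhs - rhs) <= tol * max(1.0, abs(rhs))

def may_turn_page(checks):
    """checks: list of (reader_value, true_value) pairs, one per equation
    on the page.  All must verify before the page unlocks."""
    return all(verified(reader, truth) for reader, truth in checks)

# Example: the page asserts sin^2(x) + cos^2(x) = 1; the reader
# confirms it numerically at x = 0.3 before continuing.
x = 0.3
page_checks = [(math.sin(x) ** 2 + math.cos(x) ** 2, 1.0)]
assert may_turn_page(page_checks)
```

The habit being trained is the numeric spot-check itself: verify at a point before believing the identity in general.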
1.11 Testing and Reading, a Challenge
In this work, and in my Computer Assisted Testing (CAT) work, the emphasis has been on making a precise measurement. When the computer is doing the “grading” or “judging”, it is imperative that that act be as perfect as possible, especially when the vast majority of students are going to accept the word of the computer as the absolute truth. Little do they suspect that the programming which led to the information on their screens is as flawed as the human who “inputted” the information, but that is another subject.
Accuracy of grading becomes imperative especially when one considers the ease with which one can (mistakenly, but possibly innocently) mislead a student, and thereby cause great (inadvertent) harm, without ever knowing what harm one has done! Consider the following multiple choice examination question: “A ball is thrown up, comes to rest momentarily, and then falls back down. At its highest point, its velocity is:
1. equal to its displacement?
2. equal to its displacement divided by time?
3. at a minimum?
4. at a maximum?”
In a recent discussion amongst physics teachers, this question was discussed, and noted to be flawed in several respects. First, and most distressing to some, is the didactic statement that the ball comes to rest momentarily. This is utter nonsense, and frightening, since the better student will surely become discombobulated by this erroneous statement, and lose time, focus, and who knows what else, struggling with this piece of silliness, which was intended to lead the student to the “right” answer.
Forgetting the misstatement in the problem, some students will certainly be misled into skipping the question, guessing, or thinking about what the teacher (examiner) intended, rather than producing the morsel of physics that the question is intended to elicit. Since the velocity will be negative on the second half of the trajectory, maximally so at the first bounce when the ball hits the floor, the zero velocity at the highest point is actually the largest (signed) velocity of the descent, and therefore “maximum” might be a right answer. But it isn’t what the machine expects, and any grading scheme, using computers, templates, or just human eyes, when reading the student’s response, will be forced to mark it “right” or “wrong” incorrectly, since the question is grotesquely flawed.
The question has to be posed “perfectly” if its “objectivity” is to be exploited. Thus, assuming the offending “comes to rest momentarily” clause is removed, it still remains for the examiner to change velocity to speed and to define it (so there is no question of confusion), i.e., the speed, which is the absolute magnitude of the velocity.
In CAT and CGR, it is preferable to ask the student “what is the speed when the ball has achieved its greatest height?”, and allow a free numerical response (which should be zero). There are no misunderstandings in this case, and all is perfectly clear, and not disconcerting, to the student. And if s/he doesn’t know the answer, then the measurement is perfect! (I should note that zero answers are not the best, since these are prone to the guesswork which is what we’re interested in eliminating.)
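A hedged sketch of how such a free numerical response might be graded; the tolerances are my assumptions, and note the special case for an expected answer of zero, where a relative test is undefined (and where, as noted above, guessing is easiest):

```python
# Sketch of free-numerical-response grading: relative tolerance for
# nonzero expected answers, absolute tolerance near zero.

def grade_numeric(submitted, expected, rel_tol=0.01, abs_tol=1e-6):
    if abs(expected) < abs_tol:            # expected answer is (near) zero
        return abs(submitted) <= abs_tol
    return abs(submitted - expected) <= rel_tol * abs(expected)

# The ball-at-its-highest-point question expects a speed of zero:
assert grade_numeric(0.0, 0.0)
assert not grade_numeric(0.5, 0.0)
```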
1.12 Results and Discussion
This method has failed abominably. “The best laid plans of ...”. I report it here only because “educational experiments” are rarely reported unless their results are glowing. A change might be appreciated.
Granted, only a tiny subset of students used the CGR system for a graduate class in quantum chemistry. But one of the students, after passing the course, returned two years later with a set of corrections to a document that he had claimed to have read as a student. This time, while preparing for his oral examination (in the same field, hence the review of this topic (moments of inertia)), he found several errors. I corrected the errors and asked him to check the revised manuscript again, and he found more errors (truly typographical) but missed a set of technical errors in equations. That means that he still was not reading in the manner we are supposed to. I conclude that CGR in a classroom situation is not effective in making students read “with a fine-toothed comb”.
In fact, I concluded that nothing will work. It appears as if we have changed the ethos attached to learning and scholarship, converting attendance at institutions of higher education into pre-employment certification. Learning for its own sake doesn’t exist any more, if it ever did. Rather, our students are targeted at doing the minimum work required to “get by” passing over, under, or through the barrier to their intended goal.
Recently, in the Times, there was a letter from a business school graduate bemoaning the fact that his class, in seeking to “beat the curve”, could never be expected to compete on a world level, since their only interest was in slightly better than average performance relative to their local cadre. How true.
We have lost our way. Learning, for excellence, doesn’t exist, only its trappings.
Friday, November 16, 2007
Computer Assisted Testing, a Failed Proposal
Note before reading. The equations sometimes show on this blog as whitespace; one needs to click on them to see the associated jpeg file (It seems that this is a FireFox problem, i.e., IE shows equations fine!).
Introduction
In 1981 I wrote a paper (staying anonymous requires that I not cite it here) in which I proposed Computer Assisted Testing in Technical Subjects, with the idea of rigorizing the testing aspect of teaching/learning. More than 20 years later, it is finally coming into existence, with businesses (such as WebAssign) offering such tests, but without the proctoring required to make the system really work.
The idea was that students needed the ability to know when they were wrong during an examination, and needed the ability to come back when they'd studied more. They needed the ability to "give up" or say that "I don't know", so that they themselves would realize their deficiencies vis-a-vis the course requirements.
Many rejected proposals later, I'd given up on CAT but thought that publishing these thoughts might help get people to move more swiftly, if anyone ever reads them, in the direction I'd hoped to champion.
My idea is that schools should use discarded computers to create examination rooms, huge examination rooms, in which students could come to take tests when they were ready, and under proctored conditions, could prove that they knew what they were supposed to have learned.
Some threads of quantum chemistry understanding
It is hard to define what an education actually means, and in the current atmosphere of liberal versus conservative with respect to teaching "values" in the academy, it is amusing that virtually no one disentangles scientific from non-scientific learning. Political Science must be a priori controversial since it attempts a dispassionate examination of a passionate subject.
On the other hand, calculus is value free. No one, to my knowledge, thinks of it as controversial (except for biology which leads to genetics which leads to Darwin and evolution, which remains controversial).
So the discussion of education should bifurcate into two discussions, one concerning politically correct (or otherwise) topics, and the other addressing the engineering, physics, biology, and, of course, chemistry education which the nation needs.
We desperately need large scale, country wide, testing.
I recognize how controversial this is, but ascertaining what our students are learning in algebra, calculus, physics and chemistry (my areas of quasi-expertise) is essential to sustaining what our predecessors made possible through their efforts on our behalf. We need to know that what we expect to be learned has, in fact, been learned (and remembered).
That's different than knowing what was taught. All of us will concede that what's taught is rarely learned. What we need is a measure of what is learned.
Since the material we deal with is hierarchical in nature, there is no way that students can continue to level “b” if their mastery of precursor material taught in level “a” is insufficient to the task at hand. Learned foundational material which has been forgotten is useless, and the “learning” was in vain, i.e., non-existent! Let me give you an example from one of the readings from a quantum chemical discussion of eigenvalues:
Consider the following:
Show that
This identity came up in a molecular orbital problem, and we expect any high school graduate to be able to show that the statement is true. Now, before continuing, I want to make sure you understand what I mean by “be able to show that the statement is true”. I do not mean plucking something out of a multiple choice list. I mean actually writing lines of algebra/mathematics/arithmetic/whatever and arriving at a proof that the statement is true. A multiple choice question of the type:
Which of the following statements is true:
tells us almost nothing when the student gets it wrong, especially if we’ve been extraordinarily devious in constructing “distractors”. Even getting the “right” answer assures us of nothing, since guesses are permissible in this environment! Worse yet, the calculator-equipped student can evade the question completely, evaluating
and then evaluating each of the choices until s/he finds the “correct” one. Multiple choice examination doesn’t test what we think it’s testing, and its results are therefore useless! The entire motivation for Computer Assisted Testing was to make it possible to understand whether or not the examinee could actually do the work without the hints of a set of choices.
The argument was made that the so-called real world does not give multiple choices to scientific/mathematical queries. More important, even listing choices warps the intellectual environment of the measurement!
By the time students are studying "my subject'', quantum chemistry, it is too late to deal with elementary problems in mathematics and/or physics. We know very well that our students have a poor command of physics, and their calculus skills are atrocious.
Chemistry students have been forced to take calculus and physics without a rationale. They are skeptical that this material actually is applicable to chemistry, and Organic Chemistry, in the sophomore year, adds to the prejudice that numbers and mathematics are not part of "real'' chemistry.
Some thoughts on learning and teaching in the local environment
It has become abundantly clear to me that teaching and learning are orthogonal.
OK, maybe not exactly, but the overlap between the two is small at best.
The social tendency not to harm students, the need of institutions of “higher” education to fill their ever-expanding seats, the faculty’s need for time to do the research which really pays the bills, etc., conspire to make teaching the least important aspect of undergraduate schools. No matter what the publicists write (and say), these institutions regard their undergraduate charges as a burden.
Normal chemistry 127-8 is such a burden that faculty are asking for immunity from having to teach it (several already have it de facto). The department has hired permanent sub-faculty to teach it: faculty with little or no research interest, who are dismissible at the stroke of a Dean’s pen based on a single bad teacher evaluation (quoted from the Dean when meeting with the Department).
This permanent cadre of sub-faculty completely relieves the normal faculty of the burden of having to teach freshmen. Several faculty have never taught freshmen anyway, but what the hell, let’s institutionalize the immunity!
The fact that the State is only interested in the teaching of undergraduates, which presumably includes freshman, is irrelevant. What the devil does the State know about Universities, anyway? We know better!
What does it mean to be learned?
In the 12th century, “Bhaskara demonstrated correctly that [equation rendered as an image in the original] — an achievement, I might add, utterly beyond the collective intellectual power, say, of the English Department of Duke University. (It is pleasant to imagine members of the department sitting together in a long lecture hall, Marxists to one side, deconstructionists to the other, abusing one another roundly as they grapple with the problem.)”
D. Berlinski, A Tour of the Calculus, Pantheon Books, New York, 1995, page 38.
We don’t want people writing this about Chemists! George Lang (not so) recently (on the WWW) wrote:
“Most students I have talked to DO NOT link test performance with knowledge of subject.... The message that testing does not measure ability has come through to our students loud and clear. Thus our tests are not learning experiences or opportunities to receive valuable feedback about the students’ depth of understanding. They are GAMES the students must play to break into the professional world.”
CHEMED-L Internet Discussion Group on Chemical Education, Fall 1995.
It follows that it is better to use a third party as examiner, so teachers and pupils can be allies.
"The National Science Foundation agreed (that there was a problem in teaching/learning calculus) and spent $35-million from 1987 to 1995 on dozens of projects to update the teaching of calculus."
...
“Courses often consisted of bland lectures in which students learned how to calculate derivatives and integrals. Students practiced the calculations at home, and on exams professors asked similar problems with different numbers. Students, professors recall, were bored and disengaged.”
...
“This approach (reform calculus instruction) really shies away from anything but superficial use of skills” (Prof. R. L. Cohen, Stanford). “For students who really need to know math and use it, this wasn’t nearly sophisticated or rigorous enough.”
R. Wilson, ``A Decade of Teaching 'Reform Calculus', Has been a Disaster, Critics Charge'', Chronicle of Higher Education, Feb 7, 1997, page A12.
I am convinced that examinations are the key to re-introducing rigor to American classrooms.
Here is a typical examination question (in Mathematics/Calculus):
Show that, given
one can obtain
by elementary methods.
Z. A. Melzak, "Companion to Concrete Mathematics, Mathematical Techniques and Various Applications", Wiley-Interscience, New York, 1973, page 177. (This is akin to the infamous "it can be shown by the serious student ... ".)
Offering help during examinations, CAT mode
If faced with the above question on an oral examination, the prepared student might like a hint. For instance, a hint might be "First integrate over x (on the right hand side) and see what you get."
If the student still needs help, perhaps one should suggest "Try integration by parts."
If the student wrote:

the examiner might query if the examinee has made a sign error, etc., etc., etc..
What we proposed here was that the computer could (and should) act as a surrogate for the oral examiner and probe with intelligent hints and corrections the extent of the student's ability to do the actual problem. Alternatively, we could offer the student a "do not know how to proceed" button which would make the student admit that s/he could not do the problem!
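Such a surrogate oral examiner might be skeletonized as follows. This is only a sketch: the function names and the transcript format are my inventions, and the per-question judging logic (checking a line of algebra, spotting a sign error) is of course the hard part that the sketch hides behind a single callback:

```python
# Skeleton of the proposed CAT dialogue: graded hints are released one at
# a time, and the student may always answer "IDK" (the give-up button),
# which records the deficiency honestly instead of forcing a guess.

def cat_question(check_answer, hints, get_response):
    """check_answer(ans) -> bool; hints: ordered hint strings;
    get_response() -> an answer string, or 'IDK' for the give-up button.
    Returns (passed, transcript)."""
    transcript = []
    for hint in [None] + list(hints):
        if hint is not None:
            transcript.append(("hint", hint))
        ans = get_response()
        if ans == "IDK":
            transcript.append(("gave_up", None))
            return False, transcript
        if check_answer(ans):
            transcript.append(("correct", ans))
            return True, transcript
        transcript.append(("wrong", ans))
    return False, transcript

# Simulated student who succeeds after the first hint:
responses = iter(["wrong guess", "right"])
ok, log = cat_question(lambda a: a == "right",
                       ["First integrate over x on the right-hand side.",
                        "Try integration by parts."],
                       lambda: next(responses))
assert ok
```

The transcript is the point: it records not just right/wrong but how much help was needed, which is exactly what a live oral examiner learns.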
A Spectroscopy problem
Here is a typical spectroscopy question:
Calculate the ratio of intensities expected for the (n=0,J=1 to n=1,J=2) line to the (n=0,J=2 to n=1,J=3) line at 25 °C. Assume that the rotational constant, B, is 10 reciprocal centimeters. If the student answers:
(9 · e^(−B(2·3)/kT)) / (7 · e^(−B(3·4)/kT))
we can respond with a question about appropriate degeneracies of appropriate rotational levels.
If the student answers:
(9 · e^(−B(1·2)/kT)) / (7 · e^(−B(2·3)/kT))
instead, then we can respond with a different question about appropriate degeneracies, etc., etc., etc..
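For completeness, here is the ratio worked out numerically, under the simplifying assumption (mine, for this sketch) that line intensity is proportional only to the lower-state population, (2J+1)·exp(−B·J(J+1)·hc/kT), ignoring transition-moment and frequency factors:

```python
import math

# Intensity ratio of the (n=0,J=1 -> n=1,J=2) line to the
# (n=0,J=2 -> n=1,J=3) line, with B = 10 cm^-1 at 25 C, assuming
# intensity is proportional to the lower-state population alone.

HC_OVER_K = 1.4388  # hc/k, the second radiation constant, in cm*K

def lower_state_population(J, B_cm, T):
    """Relative population of rotational level J: degeneracy (2J+1)
    times the Boltzmann factor for E_J = B*J*(J+1)."""
    return (2 * J + 1) * math.exp(-B_cm * J * (J + 1) * HC_OVER_K / T)

B, T = 10.0, 298.15
ratio = lower_state_population(1, B, T) / lower_state_population(2, B, T)
# ratio = (3/5) * exp(4*B*hc/kT) ~ 0.73: the line out of J=2 is stronger.
print(round(ratio, 3))
```

Notice the correct degeneracies are 3 and 5 (those of the lower levels J=1 and J=2), which is exactly what both hypothetical wrong answers above get wrong.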
A Quantum Chemistry problem
Another question:
What is the value of the 2px wave function at the point x=y=z=1 Angstrom?
Compare this to the question:
What is the value of the 2px wave function at the point r=1, theta= pi/2, phi = -3pi/4?
Should the help text inform the student about the relationship between the two points?
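To see why the help text matters, one can evaluate the hydrogen (Z = 1) 2p_x function at both points. This is a sketch under stated assumptions: I take the second question's r to also be in Angstroms, and I work in units of the Bohr radius a0, so the values come out in units of a0^(−3/2):

```python
import math

# Hydrogen 2p_x orbital (Z = 1), in units of a0:
#   psi = (1/(4*sqrt(2*pi))) * r * exp(-r/2) * sin(theta) * cos(phi)

A0 = 0.529177  # Bohr radius in Angstroms

def psi_2px(r, theta, phi):  # r in units of a0
    radial = (1 / (4 * math.sqrt(2 * math.pi))) * r * math.exp(-r / 2)
    return radial * math.sin(theta) * math.cos(phi)

# Point 1: x = y = z = 1 Angstrom, converted to spherical coordinates.
x = y = z = 1 / A0
r1 = math.sqrt(x * x + y * y + z * z)
theta1 = math.acos(z / r1)
phi1 = math.atan2(y, x)

# Point 2: r = 1 (Angstrom, assumed), theta = pi/2, phi = -3*pi/4.
r2, theta2, phi2 = 1 / A0, math.pi / 2, -3 * math.pi / 4

psi_a = psi_2px(r1, theta1, phi1)
psi_b = psi_2px(r2, theta2, phi2)

# The two points are NOT the same: point 2 lies in the z = 0 plane with
# x < 0, so the two questions probe different values -- and different signs.
print(round(psi_a, 4))   # positive lobe
print(round(psi_b, 4))   # negative lobe
```

So the two questions are not equivalent, and help text that silently treated the points as “the same point in different coordinates” would be exactly the kind of inadvertent harm discussed above.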
If the `perfect' examination is the oral examination with (decent) examiners, then an even-handed reproducible equivalent is Computer Assisted Testing, in which certain errors can be foreseen and "handled".
High Stakes Testing and the wrong debate
The following is lifted from a listserv
More than 30 years ago, the eminent social scientist Donald T. Campbell warned about the perils of measuring effectiveness via a single, highly consequential indicator: “The more any quantitative social indicator is used for social decision making,” he said, “the more subject it will be to corruption pressures and the more apt it will be to distort and corrupt the social processes it is intended to monitor.” High-stakes testing is exactly the kind of process Campbell worried about, since important judgments about student, teacher, and school effectiveness often are based on a single test score. This exaggerated reliance on scores for making judgments creates conditions that promote corruption and distortion. In fact, the overvaluation of this single indicator of school success often compromises the validity of the test scores themselves. Thus, the scores we end up praising and condemning in the press and our legislatures are actually untrustworthy, perhaps even worthless.
Jerry Becker, "The Math Forum@Drexel", 2007
Eliminating the tests, rather than improving them, seems to be the answer "educators" espouse in this regard, much to my regret. The key is Computer Adapted Testing and Computer Assisted Testing, not abandoning testing for portfolios, video tapings, or whatever other silliness the Schools of Education invent.
Postscript to teaching Physical Chemistry after a hiatus
I was allowed to teach P. Chem. in the summer recently, after a long hiatus (reason secret).
For me it was an eye opener. Consider that when I took the course, in 1956, 50+ years ago, the course was taught with closed book, closed notes examinations.
Admittedly, the quantum mechanics was less developed than now; in fact, I actually don’t think there was any in my course, although the newer Prutton and Maron had chapters on it.
Anyway, we learned to carry out derivations, call it memorize if you will, such that we could, from some starting assumptions, derive the equation that we needed to solve a problem. As a result, now, as I teach this material, I need no notes, no preparation, nothing but intellect and the path, i.e., what am I going to "teach".
I develop the equations, one after another, in sequence, one from the other, making a coherent statement of where we were and where we're going.
I point out to students the approximations made, so that one could extend the results to higher accuracy by undoing the approximations and making newer, less restrictive ones, and continuing on, maintaining tractability if possible.
My current students have to have open book, open notes examinations, and as I’ve done for years, I have them do web problems and base my exams on those questions. Usually, I clone n−1 questions from the web and add one more they’ve never seen before, but I’ve learned my lesson, and now just clone questions!
Be that as it may, class time is taken up with doing every single problem out, so that exams become exercises in transcribing from their notes on these problems to the exam booklet. We've reached the ultimate in pandering!
And my students are incapable of carrying out derivations on their own! Worse, during my derivations, it is clear that my students know no calculus whatsoever, i.e., they’ve retained nothing. We are bringing up a generation of incompetents (who will get A’s and B’s).
I’ve lost the desire to teach mainstream P. Chem. Of course every generation believes that its students aren’t up to “snuff”, whatever that expression means. But this is a serious deviation from past practice: the hierarchical nature of the learning is becoming derailed. It implies that the future will be little more than an extension of the present, with no “ah hah!!!!” moments, since our students are being trained rather than educated. Heaven help us. (June 26, 2007)




