The Unz Review - Mobile
A Collection of Interesting, Important, and Controversial Perspectives Largely Excluded from the American Mainstream Media
Standardized Tests


From the New York Times:

As Other Districts Grapple With Segregation, This One Makes Integration Work

MORRISTOWN, N.J. — The 5,226-student district is one of the few in the country created through such a merger as part of a court-ordered integration effort, and one of even fewer that still endure. Even as communities around the country have been debating how to address school segregation, with some proposals for integration meeting fierce opposition, a new report from the Century Foundation, a left-leaning think tank, calls Morris a model of “diversity and togetherness.”

… The Morris district is notable in that it has long been committed to diversity, even as the composition of its student body has changed. Meanwhile, schools nearby and in New York City have remained deeply segregated.

It’s worth noting that according to the Stanford Education Data Archive of school test scores, which I wrote about for Taki’s Magazine last spring, Morris has the 94th worst white-black test score gap out of more than 2,000 school districts nationwide. Morris’s white-Hispanic test score gap is even more of a chasm: Morris comes in 29th worst in the entire country.

Generally speaking, liberal and wealthy districts suffer the worst racial disparities in school achievement. For example, the most gaping white-black disparity in the entire United States is found in Berkeley, CA.

The most impressive combinations of diversity, high average test scores, and modest racial gaps tend to be found in Texas exurbs, such as Frisco, the new home of the Dallas Cowboys.

… Paul Tractenberg, a professor emeritus at Rutgers Law School and the president of the Center for Diversity and Equality in Education, who co-wrote the Century Foundation report, says the [Morris, NJ] district has a “remarkable can-do attitude” that has allowed officials to continuously “rejigger what they are doing to accommodate the demands of the moment.”

The word “rejigger” strikes me as potentially problematic. I suspect that professors younger than emeritus would automatically shy away from saying “rejigger” to the New York Times.

You think you are a widely admired senior statesman in the war on white racism, but you don’t notice all the hungry eyes staring resentfully at your corner office from their cubicles. To them, you are just another Privileged Old White Man who is racistly and sexistly hogging one of the good jobs in the Social Justice Industry. Sure, maybe you’ll get away with using “rejigger” in 2016, but in 2017 you might be quoted on NPR using, say, the word “denigrate,” and a quick Twitterstorm later, you’ll be carrying a cardboard box of your personal effects down the elevator while a younger, nimbler SJW measures the drapes in your old office, little realizing that years later xe will in turn be fired for saying the word “doggerel,” not realizing the outrage that will ensue among CRISPR-enhanced Canine-Americans.

It’s the SJW Circle of Life.


[Chart: 2015 NAEP White-Hispanic Gap by state]

This is taking the average of four 2015 federal NAEP scores: both Math and Reading for both 4th and 8th Grades.


[Chart: NAEP 2015 Asian (orange) and White (blue) scores by state]

Here are the 2015 National Assessment of Educational Progress scores for Asians (orange) and whites (blue). I took a simple average of four scores: Reading and Math for both 4th and 8th grades. The overall sample size for the whole country is about 280,000, which is a lot, although I wouldn’t put too much faith in any one state’s scores, such as Colorado’s outlier score for Asians.
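To make the method concrete, here is a minimal sketch of how such a four-score composite and the resulting gap are computed. The scores below are hypothetical, not actual state results:

```python
# How the composite in these charts is built: a simple average of four
# NAEP scale scores (Reading and Math, grades 4 and 8), then a gap in
# points. All numbers below are hypothetical, not actual state scores.
white = {"read4": 230, "math4": 248, "read8": 273, "math8": 293}
asian = {"read4": 239, "math4": 260, "read8": 280, "math8": 306}

white_avg = sum(white.values()) / len(white)
asian_avg = sum(asian.values()) / len(asian)
gap = asian_avg - white_avg  # positive means the Asian composite is higher
print(white_avg, asian_avg, gap)
```

A simple unweighted average like this treats the four tests as equally important, which is the convention used throughout these charts.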

One observation I’d make is that Hawaii suggests the long-term price of importing farm workers: Hawaii brought in a lot of Japanese and Chinese many generations ago, and in 2015 their descendants are still not scoring impressively.


[Table: 2015 NAEP scores by state, sorted by White-Black Gap in 8th grade math]

Here are the brand new 2015 federal National Assessment of Educational Progress (NAEP) tests scores sorted in order of the size of the White-Black Gap on 8th grade math. The color reflects whether the state went for Obama (blue) or Romney (red) in 2012.

A few comments:

- Although it’s often assumed that The Gap is due to racism, it tends to be bigger in blue Democratic states.

- Gentrifying Washington DC now has enough white children to get a white NAEP score. Sure enough, The Gap in very liberal Washington DC is bigger than in all the states, due to a very high white score in DC and a slightly below average black score.

- German-Americans and Nordic-Americans don’t seem to know how to deal with African-Americans. As I’ve often pointed out, the biggest Gap is in Wisconsin, but in this table Nebraska, Minnesota, and Pennsylvania have the next widest Gaps. (Any relationship between this and Merkel’s Boner is probably not coincidental.)

- The highest black scores are in Dept. of Defense schools (DODEA), followed by military-intensive states like Arizona and Alaska and well-educated liberal states like New Jersey and Massachusetts that also have high white scores.

- The smallest Gap is in West Virginia, which has, by far, the lowest white scores.



It’s widely believed that racial gaps in test scores are just class gaps. And, if that’s not true, then it’s assumed that race is fading away in importance relative to class. But an important study shows that in multiracial California, race is becoming more influential in recent years.

The Growing Correlation between Race and SAT Scores: New Findings from California
October 2015
Saul Geiser
Center for Studies in Higher Education
University of California, Berkeley

This paper presents new and surprising findings on the relationship between race and SAT scores. The findings are based on the population of California residents who applied for admission to the University of California from 1994 through 2011, a sample of over 1.1 million students. The UC data show that socioeconomic background factors – family income, parental education, and race/ethnicity – account for a large and growing share of the variance in students’ SAT scores over the past twenty years. More than a third of the variance in SAT scores can now be predicted by factors known at students’ birth, up from a quarter of the variance in 1994. Of those factors, moreover, race has become the strongest predictor. Rather than declining in salience, race and ethnicity are now more important than either family income or parental education in accounting for test score differences. It must be cautioned that these findings are preliminary, and more research is needed to determine whether the California data reflect a broader national trend. But if these findings are representative, they have important implications for the ongoing debate over both affirmative action and standardized testing in college admissions.

… The regression results show a marked increase since 1994 in the proportion of variance in SAT scores that can be predicted from socioeconomic background factors largely determined at students’ birth. After falling slightly from 25% to 21% between 1994 and 1998, the proportion of explained variance increased each year thereafter, growing to 35% by 2011, the last year for which the author has obtained data. Remarkably, more than a third of the variance in SAT scores among UC applicants can now be predicted by family income, education, and race/ethnicity. This result contrasts sharply with that for high school GPA: Socioeconomic background factors accounted for only 7% of the variance in HSGPA in 1994 and 8% in 2011. …

Nevertheless, even without being able to observe those intermediating experiences directly, regression analysis enables one to assess the relative importance of different socioeconomic factors in predicting test performance. Figure 2 provides standardized regression coefficients, or “beta weights,” for predicting SAT scores conditional on family income, parents’ education, and race/ethnicity. The coefficients show the predictive weight of each factor after controlling for the effects of the other two, thereby providing a measure of the unique contribution of each factor to the prediction.

[Figure 2: standardized regression coefficients for predicting SAT scores, 1994-2011]

In 1994, at the beginning of the period covered in this analysis, parental education was the strongest of the three socioeconomic predictors of test performance. (The standardized regression coefficient of 0.27 in that year means that, for each one standard deviation increase in parental education, SAT scores increased by 0.27 of a standard deviation, when income and underrepresented minority status were held constant.) The predictive weight for parental education has remained about the same since then. The weight for family income has shown a small but steady increase from 0.13 in 1998 to 0.18 in 2011. But the most important change has been the growing salience of race/ethnicity. By 2011, the predictive weight for underrepresented minority status, 0.29, was greater than that for either family income or parental education. When the regression results for the UC sample are pooled across applicant cohorts, race/ethnicity is the strongest predictor of SAT scores over the last four years.

A key implication of this finding is that racial and ethnic group differences in SAT scores are not simply reducible to differences in family income and parental education. At least for the UC sample, there remains a large and growing residual effect of race/ethnicity after those factors are taken into account.

[Figure 8: SAT score gaps by racial/ethnic comparison, 1998-2011]

As shown in Figure 8, the test score gap in California is greatest between black and white SAT takers but has oscillated up and down and shows no consistent trend since 1998. If one were to draw inferences about racial and ethnic differences from the black-white gap alone, one might conclude that there has been little change in this respect.

But that conclusion would be wrong. For all other racial/ethnic comparisons, test score gaps between underrepresented minority and other students have been growing. The Black-Asian, Latino-White, and Latino-Asian test score gaps have increased almost every year since 1998.
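For readers unfamiliar with the "beta weight" machinery Geiser uses, here is a minimal sketch with synthetic data. The UC applicant records are not public, so every variable, distribution, and generating coefficient below is invented purely to illustrate how standardized regression coefficients are computed; the built-in weights are chosen to loosely echo the paper's 2011 figures.

```python
# Sketch of how standardized regression coefficients ("beta weights")
# are computed. All data here are synthetic: the UC applicant records
# are not public, so these variables and coefficients are invented.
import numpy as np

rng = np.random.default_rng(0)
n = 10_000

# Hypothetical predictors: family income ($1000s), parental education
# (years), and an underrepresented-minority indicator.
income = rng.normal(60, 25, n)
par_ed = rng.normal(14, 2, n)
urm = rng.binomial(1, 0.3, n).astype(float)

# Hypothetical outcome, built to loosely echo the paper's 2011 weights
# (about 0.18 for income, 0.27 for parental education, 0.29 for URM).
sat = 1000 + 2.2 * income + 40 * par_ed - 190 * urm + rng.normal(0, 280, n)

def z(x):
    """Convert to z-scores (mean 0, SD 1)."""
    return (x - x.mean()) / x.std()

# Regress the z-scored outcome on z-scored predictors: the coefficients
# are the beta weights -- the predicted SD change in SAT per 1-SD change
# in each factor, holding the other two constant.
X = np.column_stack([z(income), z(par_ed), z(urm)])
betas, *_ = np.linalg.lstsq(X, z(sat), rcond=None)
print(dict(zip(["income", "parental_education", "urm"], betas.round(2))))
```

Because the predictors here are generated independently, the beta weights come out close to each factor's simple correlation with the outcome; with correlated predictors, as in the real UC data, the weights would differ from the raw correlations.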

[Chart: Federal NAEP reading scores, 12th graders, 2013, by parental education]

A general assumption of the moderate conventional wisdom over the last half century is that average black performance is dragged down by specific impediments, such as poverty, crime, culture of poverty, parental taciturnity, lead paint, or whatever. One would therefore expect blacks without those impediments to score equal with whites.

But a close inspection of the social science data suggests that the world doesn’t really look like that. For example, above is the 2013 federal National Assessment of Educational Progress scores for 12th graders in Reading. Blacks who are the children of college graduates average 274, which is the same as whites who are the children of high school dropouts.

The Math Gap is the same:

[Chart: 2013 NAEP 12th grade math scores by parental education]

At the high school dropout level, The Gap in math is 16 points, but at the college graduate level, The Gap is twice as large: 32 points. That’s the opposite of what the conventional wisdom would imply.

So, basically, there are two theories left to account for this. How do we choose between them?

In the past, Western civilization tried to follow Occam’s Razor, under which the Bell Curve theory of regression toward different means would be the most likely explanation.

But the term “Western civilization” is exclusionary and makes people feel bad. These days, we know that the highest form of thought is not using Occam’s Razor but shouting “Occam’s racist!”

So the only viable explanation is the Conspiracy Theory Theory of Pervasive Racism: people who think they are white are constantly destroying black bodies by saying words like “field” and “swing.” Or something. It doesn’t really matter what the specifics of the Conspiracy Theory Theory are since the more unfalsifiable the better.

Because Science.

[Chart: 2013 NAEP 8th grade math achievement levels, Puerto Rico vs. U.S. student groups]

Paul Krugman argues today that Puerto Rico is kind of like West Virginia, Mississippi, and Alabama:

Put it this way: if a region of the United States turns out to be a relatively bad location for production, we don’t expect the population to maintain itself by competing via ultra-low wages; we expect working-age residents to leave for more favorable places. That’s what you see in poor mainland states like West Virginia, which actually looks a fair bit like Puerto Rico in terms of low labor force participation, albeit not quite so much so. (Mississippi and Alabama also have low participation.) … There is much discussion of what’s wrong with Puerto Rico, but maybe we should, at least some of the time, just think of Puerto Rico as an ordinary region of the U.S. …

Okay, but there’s a huge difference in test scores.

The federal government has been administering a special Puerto Rico-customized version of its National Assessment of Educational Progress (NAEP) exam in Spanish to Puerto Rican public school students, and the results have been jaw-droppingly bad.

For example, among Puerto Rican 8th graders tested in mathematics in 2013, 95% scored Below Basic, 5% scored Basic, and (to the limits of rounding) 0% scored Proficient, and 0% scored Advanced. These results were the same in 2011.

In contrast, among American public school students poor enough to be eligible for subsidized school lunches (“NSLP” in the graph above), only 39% scored Below Basic, 41% scored Basic, 17% scored Proficient, and 3% scored Advanced.

Puerto Rico’s test scores are just shamefully low, suggesting that Puerto Rican schools are completely dropping the ball. By way of contrast, in the U.S., among black 8th graders, 38% score Basic, 13% score Proficient, and 2% score Advanced. In the U.S. among Hispanic 8th graders, 41% reach Basic, 18% Proficient, and 3% Advanced.

In Krugman’s bete noire of West Virginia, 42% are Basic, 20% are Proficient, and 3% are Advanced. In Mississippi, 40% are Basic, 18% Proficient, and 3% are Advanced. In Alabama, 40% are Basic, 16% are Proficient, and 3% are Advanced. (Unmentioned by Krugman, the lowest scores among public school students are in liberal Washington D.C.: 35% Basic, 15% Proficient, and 4% Advanced.)

Let me repeat, in Puerto Rico in Spanish, 5% are Basic, and zero zip zilch are Proficient, much less Advanced.

Am I misinterpreting something? I thought I must be, but here’s a press release from the Feds confirming what I just said:

The 2013 Spanish-language mathematics assessment marks the first time that Puerto Rico has been able to use NAEP results to establish a valid comparison to the last assessment in 2011. Prior to 2011, the assessment was carefully redesigned to ensure an accurate assessment of students in Puerto Rico. Results from assessments in Puerto Rico in 2003, 2005 and 2007 cannot be compared, in part because of the larger-than-expected number of questions that students either didn’t answer or answered incorrectly, making it difficult to precisely measure student knowledge and skills. The National Center for Education Statistics, which conducts NAEP, administered the NAEP mathematics assessment in 2011. But those results have not been available until now, as it was necessary to replicate the assessment in 2013 to ensure that valid comparisons could be made.

“The ability to accurately measure student performance is essential for improving education,” said Terry Mazany, chairman of the National Assessment Governing Board, which oversees NAEP. “With the support and encouragement of education officials in Puerto Rico, this assessment achieves that goal. This is a great accomplishment and an important step forward for Puerto Rico’s schools and students.”

NAEP assessments report performance using average scores and percentages of students at or above three achievement levels: Basic, Proficient and Advanced. The 2013 assessment results showed that 11 percent of fourth-graders in Puerto Rico and 5 percent of eighth-graders in public schools performed at or above the Basic level; conversely, 89 percent of fourth-graders and 95 percent of eighth-graders scored below that level. The Basic level denotes partial mastery of the knowledge and skills needed for grade-appropriate work. One percent or fewer of students in either grade scored at or above the Proficient level, which denotes solid academic performance. Only a few students scored at the Advanced level.

The sample size for 8th graders was 5,200 students at 120 public schools in the Territory.

UPDATE: I’ve now discovered Puerto Rico’s scores on the 2012 international PISA test. Puerto Rico came in behind Jordan in math.

Results this abysmal can’t solely be an HBD problem (although it’s an interesting data point in any discussion of hybrid vigor); this has to also be due to a corrupt and incompetent education system in Puerto Rico.

New York Times’ comments aren’t generally very useful for finding out information, but Krugman’s piece did get this comment:

KO’R New York, NY 4 hours ago

My husband and I have had a house in PR for 24 years. For two of those years we taught English and ESL at Interamericana, the second largest PR university. Our neighbors have children in the public grade schools. In a nutshell: the educational system in PR is a joke!!! Bureaucratic and corrupt. Five examples: (1) In the elementary schools near us if a teacher is sick or absent for any reason, there is no class that day. (2) Trying to get a textbook changed at Interamericana requires about a year or more of bureaucratic shinnanigans (3) A colleague at Interamericana told us that he’d taught in Africa (don’t remember where) for a few years and PR was much worse in terms of bureaucracy and politics. (4) The teaching method in PR is for the teacher to stand in front of the class, read from the textbook verbatim, and have the students repeat what he or she read. And I’m not speaking just about English – this goes for all subjects. (5) Interamericana is supposed to be a bi-lingual university. In practice, this means the textbooks are in English, the professor reads the Spanish translation aloud, and the usually minimal discussion is in Spanish. …

Public school spending in Puerto Rico is $7,429 per student versus $10,658 per student in the U.S. Puerto Rico spends more per student than Utah and Idaho and slightly less than Oklahoma.

Puerto Rico spends less than half as much as the U.S. average on Instruction: $3,082 in Puerto Rico vs. $6,520 in America, significantly less than any American state. But Puerto Rico spends more than the U.S. average on Total Support Services ($3,757 vs. $3,700). Puerto Rico is especially lavish when it comes to the shifty-sounding subcategories of General Administration ($699 in PR vs. $212 in America) and Other Support Services ($644 vs. $347). PR spends more per student on General Administration than any state in America, trailing only the notorious District of Columbia school system, and more even than DC and all 50 states on the nebulous Other Support Services.

Being a schoolteacher apparently doesn’t pay well in PR, but it looks like a job cooking the books somewhere in the K-12 bureaucracy could be lucrative.

The NAEP scores for Puerto Rico and the U.S. are for just public school students.

A higher percentage of young people in Puerto Rico attend private schools than in the U.S. The NAEP reported:

In Puerto Rico, about 23 percent of students in kindergarten through 12th grade attended private schools as of the 2011-2012 school year, compared with 10 percent in the United States. Puerto Rico results are not part of the results reported for the NAEP national sample.

So that accounts for part of the gap. But, still, public schools cover 77% of Puerto Ricans v. 90% of Americans, so the overall picture doesn’t change much: the vast majority of Puerto Rican 8th graders are Below Basic in math.

Another contributing factor is likely that quite a few Puerto Ricans summer in America and winter in Puerto Rico and yank their kids back and forth, which is disruptive to their education.

It’s clear that Puerto Ricans consider their own public schools to be terrible and that anybody who can afford private school should get out. The NAEP press release mentions that 100% of Puerto Rican public school students are eligible for subsidized school lunches versus about 50% in the U.S. Heck, Oscar-winner Benicio Del Toro’s lawyer father didn’t just send him to private school, he sent him to a boarding school in Pennsylvania.

Still, these Puerto Rican public school scores are so catastrophic that I also wouldn’t rule out active sabotage by teachers, such as giving students an anti-pep talk, for some local labor reason. For example, a PISA score from Austria was low a couple of tests ago because the teachers’ union told teachers to tell students not to bother working hard on the test. But the diminishment of the Austrian PISA score wasn’t anywhere near this bad. And Puerto Rico students got exactly the same scores in 2011 and 2013.

And here’s Jason Malloy’s meta-analysis of studies of Puerto Rican cognitive performance over the last 90 years.


One of the older, more nagging conundrums for anybody interested in education and demographics is the lack of readily available meaningful data on how high school students do by state and by race on high stakes tests such as the SAT and ACT college admissions tests.

The federal government invests a lot of money in the NAEP test, but that is a low stakes test for students, so it’s more easily manipulable by those states that care about the results. For example, Texas usually manages to have a larger percentage of its less academically inclined students not take the NAEP than does Iowa, which helps contribute to Texas’s sterling NAEP scores.

Or maybe Texas really has figured out an effective, economical system of educating students of all ethnic groups. It’s hard to say, but it’s an important question that deserves study.

A high stakes test, in contrast, is one in which students have motivations for doing their best, which is why I’ve always wanted to look at SAT and ACT scores by state. After all, the NAEP isn’t important in the big picture, while the SAT and ACT are.

But the percentage of 17-year-olds taking one or both college admissions tests varies by state. This, however, is not an insuperable problem, since estimates of what nontakers might have scored can be modeled demographically by looking at the variation in participation rates.

Another difficult problem, but one I believe can be modeled, is that the two tests started out regionally, with the ACT dominating states near its headquarters in Iowa City and the SAT near its headquarters in Princeton and on the West Coast.

In the upper Midwest, traditionally, the only students who took the SAT were ambitious ones looking for admission to national universities on the East or West Coast. As a result, SAT-takers in Iowa and Illinois averaged much higher scores than their counterparts in the East and West.

In recent years, both tests have become less regional, with ACT-taking spreading to the coasts.

That evolution should help an ambitious analyst come up with a reasonable model for estimating combined SAT/ACT scores by state and by race.
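One standard way to attack the nontaker problem is a selection model. The sketch below treats a state's test-takers as roughly the top slice of a normal score distribution and backs out the full-population mean; this is only one possible approach, and the SD and example numbers are invented for illustration:

```python
# Sketch of one classic correction for selective test participation:
# assume scores ~ Normal(mu, pop_sd) statewide and that takers are the
# top `taker_rate` fraction of the distribution. Both assumptions are
# strong simplifications; the numbers below are invented.
from scipy.stats import norm

def population_mean(taker_mean, taker_rate, pop_sd=100.0):
    """Infer a full-population mean from the mean score of the top
    `taker_rate` fraction of a normal distribution."""
    if taker_rate >= 1.0:
        return taker_mean                 # everyone tested: no correction
    z = norm.ppf(1.0 - taker_rate)        # truncation point in SD units
    tail_mean = norm.pdf(z) / taker_rate  # mean of the upper tail, in SDs
    return taker_mean - pop_sd * tail_mean

# A state where only the ambitious 20% take the SAT needs a much bigger
# downward correction than one where 90% take it:
print(population_mean(560, 0.20))
print(population_mean(510, 0.90))
```

A real model would also have to handle ACT/SAT overlap, the concordance between the two scales, and the fact that takers aren't a clean top slice, but the truncation correction is the core idea.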

An iSteve reader (whose identity I have lost in the shuffle) kindly posted average SAT and ACT scores and number taking by state by race each year from 2006-2014 here. He converted the ACT scores into SAT score equivalents, although I don’t know which methodology he used.

Combine this trove of data with the 2010 Census data on the number of 17 year olds by race in each state and you have the raw materials for building a model that will get around the traditional problems that have bogged everybody down.

Me, personally, I’m not going to do all this work, but if somebody out there has the skills and is looking for a topic, this is an important one.

I don’t have the sources for this data, but if you are interested in working with this, post questions in the comments and the person who posted the numbers might respond.



This graph displays the mean of the Math, Science, and Reading test scores from the OECD’s 2012 Programme for International Student Assessment. American scores are red, white countries are blue, East Asian countries are yellow, Muslim countries are green, and Latin American countries are brown.

So, Asian Americans outscored all large Asian countries (with the exception of three rich cities); white Americans outperformed most, but not all, traditionally white countries; and Latino Americans did better than all Latin American countries. African Americans almost certainly scored higher than any black-majority country would have performed.

Bear in mind that many countries did not take part in PISA, such as India, which dropped out after a trial run in two states produced average scores below any seen on this chart. For a broader sampling of Third World scores, see the 2011 TIMSS Math and Science scores.

The reality is that there is not much difference in PISA or TIMSS scores within major racial blocs of countries. The Northeast Asians all tend to score well, the European and white Anglosphere countries tend to score fairly well, the Latin American countries tend to score fair to middling, and on down from there. The rank order of continents is very much like the rank order of racial/ethnic groups on NAEP or SAT or CST tests. Newcomers to the topic like Amanda Ripley, author of The Smartest Kids in the World, get excited about minor differences in PISA scores within continents, but those often are statistical noise.

For more on how to think about PISA scores, see here. And all my postings on PISA are here.
(Republished from iSteve by permission of author or representative)
Psychometrics is a relatively mature field of science, and a politically unpopular one. So you might think there isn’t much money to be made in making up brand new standardized tests. Yet, there is.
From the NYT:

U.S. Asks Educators to Reinvent Student Tests, and How They Are Given

By SAM DILLON

Standardized exams — the multiple-choice, bubble tests in math and reading that have played a growing role in American public education in recent years — are being overhauled.

Over the next four years, two groups of states, 44 in all, will get $330 million to work with hundreds of university professors and testing experts to design a series of new assessments that officials say will look very different from those in use today.

The new tests, which Secretary of Education Arne Duncan described in a speech in Virginia on Thursday, are to be ready for the 2014-15 school year.

They will be computer-based, Mr. Duncan said, and will measure higher-order skills ignored by the multiple-choice exams used in nearly every state, including students’ ability to read complex texts, synthesize information and do research projects.

“The use of smarter technology in assessments,” Mr. Duncan said, “makes it possible to assess students by asking them to design products of experiments, to manipulate parameters, run tests and record data.”

I don’t know what the phrase “design products of experiments” even means, so I suspect that the schoolchildren of 2014-15 won’t be doing much of it.

Okay, I looked up Duncan’s speech, “Beyond the Bubble Tests,” and what he actually said was “design products or experiments,” which almost makes sense, until you stop and think about it. Who is going to assess the products the students design? George Foreman? Donald Trump? (The Donald would be good at grading these tests: tough, but fair. Here’s a video of Ali G pitching the product he designed — the “ice cream glove” — to Trump.)

Because the new tests will be computerized and will be administered several times throughout the school year, they are expected to provide faster feedback to teachers than the current tests about what students are learning and what might need to be retaught.

Both groups will produce tests that rely heavily on technology in their classroom administration and in their scoring, she noted.

Both will provide not only end-of-year tests similar to those in use now but also formative tests that teachers will administer several times a year to help guide instruction, she said.

And both groups’ tests will include so-called performance-based tasks, designed to mirror complex, real-world situations.

In performance-based tasks, which are increasingly common in tests administered by the military and in other fields, students are given a problem — they could be told, for example, to pretend they are a mayor who needs to reduce a city’s pollution — and must sift through a portfolio of tools and write analytically about how they would use them to solve the problem.

Oh, boy …

There is some good stuff here — adaptive tests are a good idea (both the military’s AFQT and the GRE have gone over to them). But there’s obvious trouble, too.

Okay, so these new tests are going to be much more complex, much more subjective, and get graded much faster than fill-in-the-bubble tests? They’ll be a dessert topping and a floor wax!

These sound a lot like the Advanced Placement tests offered to high school students, which usually include lengthy essays. But AP tests take two months to grade, and are only offered once per year (in May, with scores coming back in July), because they use high school teachers on their summer vacations to grade them.

There’s no good reason why fill-in-the-bubble tests can’t be scored quickly. A lot of public school bubble tests are graded slothfully, but they don’t have to be. My son took the ERB’s Independent School Entrance Exam on a Saturday morning and his score arrived at our house in the U.S. Mail the following Friday, six days later.

The only legitimate reason for slow grading is if there are also essays to be read, but in my experience, essay results tend to be dubious, at least below the level of Advanced Placement tests, where there is specific subject matter in common. The Writing test that was added to the SAT in 2005 has largely been a bust, with many colleges refusing to use it in the admissions process.

One often overlooked problem with any kind of writing test, for example, is that graders have a hard time reading kids’ handwriting. You can’t demand that kids type because millions of them can’t. Indeed, writing test results tend to correlate with number of words written, which is often more of a test of handwriting speed than of anything else. Multiple choice tests have obvious weaknesses, but at least they minimize the variance introduced by small motor skills.

And the reference to “performance-based tasks” in which people are supposed to “write analytically” is naive. I suspect that Duncan and the NYT man are confused by all the talk during the Ricci case about the wonders of “assessment centers” in which candidates for promotion are supposed to sort through an in-basket and talk out loud about how they would handle problems. In other words, those are hugely expensive oral tests. The city of New Haven brought in 30 senior fire department officials from out of state to be the judges on the oral part of the test.

And the main point of spending all this money on an oral test is that an oral test can’t be blind-graded. In New Haven, 19 of the 30 oral test judges were minorities, which isn’t something that happens by randomly recruiting senior fire department officials from across the country.

But nobody can afford to rig the testing of 35,000,000 students annually.

Here are some excerpts from Duncan’s speech:

President Obama called on the nation’s governors and state education chiefs “to develop standards and assessments that don’t simply measure whether students can fill in a bubble on a test, but whether they possess 21st century skills like problem-solving and critical thinking and entrepreneurship and creativity.”

You know your chain is being yanked when you hear that schoolteachers are supposed to teach “21st century skills” like “entrepreneurship.” So, schoolteachers are going to teach kids how to be Steve Jobs?

Look, there are a lot of good things to say about teachers, but, generally speaking, people who strive for union jobs with lifetime tenure and summers off are not the world’s leading role models on entrepreneurship.

Further, whenever you hear teachers talk about how they teach “critical thinking,” you can more or less translate that into “I hate drilling brats on their times tables. It’s so boring.” On the whole, teachers aren’t very good critical thinkers. If they were, Ed School would drive them batty. (Here is an essay about Ed School by one teacher who is a good critical thinker.)

And last but not least, for the first time, the new assessments will better measure the higher-order thinking skills so vital to success in the global economy of the 21st century and the future of American prosperity. To be on track today for college and careers, students need to show that they can analyze and solve complex problems, communicate clearly, synthesize information, apply knowledge, and generalize learning to other settings. …

Over the past 19 months, I have visited 42 states to talk to teachers, parents, students, school leaders, and lawmakers about our nation’s public schools. Almost everywhere I went, I heard people express concern that the curriculum had narrowed as more educators “taught to the test,” especially in schools with large numbers of disadvantaged students.

Two words: Disparate Impact.

The higher the intellectual skills that are tested, the larger the gaps between the races will turn out to be. Consider the AP Physics C exam, the harder of the two AP physics tests: In 2008, 5,705 white males earned 5s (the top score) versus six black females.

In contrast, tests of rote memorization, such as having third graders chant the multiplication tables, will have smaller disparate impact than tests of whether students “can analyze and solve complex problems, communicate clearly, synthesize information, apply knowledge, and generalize learning to other settings.” That’s a pretty decent description of what IQ tests measure.

Duncan says that the new tests could replace existing high school exit exams that students must pass to graduate.

Many educators have lamented for years the persistent disconnect between what high schools expect from their students and the skills that colleges expect from incoming freshmen. Yet both of the state consortia that won awards in the Race to the Top assessment competition pursued and got a remarkable level of buy-in from colleges and universities.

… In those MOUs, 188 public colleges and universities and 16 private ones agreed that they would work with the consortium to define what it means to be college-ready on the new high school assessments.

The fact that you can currently graduate from high school without being smart enough for college is not a bug, it’s a feature. Look, this isn’t Lake Wobegon. Half the people in America are below average in intelligence. They aren’t really college material. But they shouldn’t all have to go through life branded as a high school dropout instead of high school graduate because they weren’t lucky enough in the genetic lottery to be college material.

The Gates Foundation and the U. of California ganged up on the LA public schools to get the school board to pass a rule that nobody will be allowed to graduate who hasn’t passed three years of math, including Algebra II. That’s great for UC, not so great for an 85 IQ kid who just wants a high school diploma so employers won’t treat him like (uh oh) a high school dropout. But, nobody gets that.

Another benefit of Duncan’s new high stakes tests will be Smaller Sample Sizes of Questions:

With the benefit of technology, assessment questions can incorporate audio and video. Problems can be situated in real-world environments, where students perform tasks or include multi-stage scenarios and extended essays.

By way of example, the NAEP has experimented with asking eighth-graders to use a hot-air balloon simulation to design and conduct an experiment to determine the relationship between payload mass and balloon altitude. As the balloon rises in the flight box, the student notes the changes in altitude, balloon volume, and time to final altitude. Unlike filling in the bubble on a score sheet, this complex simulation task takes 60 minutes to complete.

So, the NAEP has experimented with this kind of question. How did the experiment work out?

You’ll notice that the problem with using up 60 minutes of valuable testing time on a single multipart problem instead of, say, 60 separate problems is that it radically reduces the sample size. A lot of kids will get off track right away and get a zero for the whole one hour segment. Other kids will have seen a hot air balloon problem the week before and nail the whole thing and get a perfect score for the hour.

That kind of thing is fine for the low stakes NAEP where results are only reported by groups with huge sample sizes (for example, the NAEP reports scores for whites, blacks, and Hispanics, but not for Asians). But for high stakes testing of individual students and of their teachers, it’s too random. AP tests have large problems on them, but they are only given to the top quarter or so of high school students in the country, not the bottom half of grade school students.
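The sampling argument can be made concrete with a toy simulation (my own illustration, not anything from NAEP): give the same student the same kind of test twice and see how much the score wobbles between sittings.

```python
import random

random.seed(42)

def score_60_items(ability):
    """Total of 60 independent one-point items, each answered correctly
    with probability equal to the student's ability."""
    return sum(random.random() < ability for _ in range(60))

def score_one_task(ability):
    """A single all-or-nothing 60-minute task, worth the same 60 points."""
    return 60 if random.random() < ability else 0

def avg_retest_gap(scorer, ability, trials=10_000):
    """Average absolute gap between two sittings by the same student."""
    return sum(abs(scorer(ability) - scorer(ability))
               for _ in range(trials)) / trials

# For a middling student (a 50% shot at any given item or task), the
# 60-item test wobbles by only a few points between sittings, while the
# single big task wobbles by about 30 points out of 60.
print(avg_retest_gap(score_60_items, 0.5))
print(avg_retest_gap(score_one_task, 0.5))
```

Same student, same ability, same total points: the one-big-problem format is far noisier as a measure of the individual, which is exactly the high-stakes objection.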

It’s absurd to think that it’s all that crucial that all American schoolchildren must be able to “analyze and solve complex problems, communicate clearly, synthesize information, apply knowledge, and generalize learning to other settings.” You can be a success in life without being able to do any of that terribly well.

Look, for example, at the Secretary of Education. Arne Duncan has spent 19 months traveling to 42 states, talking about testing with teachers, parents, school leaders, and lawmakers. Yet, has he been able to synthesize information about testing terribly well at all? Has his failure to apply knowledge and generalize learning about testing gotten him fired from the Cabinet?

(Republished from iSteve by permission of author or representative)

Americans have devoted an enormous amount of effort over the centuries to devising useful baseball statistics. In recent years, Americans have talked a lot about devising useful educational statistics.

For example, I’ve pointed out a million times over the last decade that it doesn’t make much sense to judge teachers, schools, or colleges by their students’ test scores. Most of the time, all you are doing is determining which kids were smarter to start with. Logically, it makes more sense to judge their “value added” by comparing how the students score now to how they scored in the past before the people or institutions being measured got their mitts on the students.

Over the last few years, everybody who is anybody in education — Bill Gates, Arne Duncan, you name it — has come around to this perspective (although they won’t use the word “smarter”).

A big problem, however, is that this value added idea remains almost wholly theoretical because almost none of the prominent educational statistics are published in value added form.

In contrast, when Bill James was pointing out 30 years ago that Batting Average, traditionally the most prestigious hitting statistic (the guy with the highest BA was crowned “Batting Champion”), wasn’t as good a measure of hitting contribution as Slugging Average plus On-Base Percentage, he could show you what he meant using real numbers that were available to everybody, even if you had to calculate them yourself from other, more widely published statistics.

Readers would say, “Yeah, he’s right. For example, Matty Alou (career batting average .307, but slugging average .381 and on-base percentage .345) wasn’t anywhere near as good as Mickey Mantle (career batting average only .298, but slugging average .557 and on-base percentage .421). If you add on-base percentage and slugging average together to get “OPS,” then Mickey had a .977 while Matty only had .726. And that sounds about right. Mickey was awesome, but it didn’t always show up in his traditional statistics. Now, we’ve finally got a statistic that matches up with what we all could see from watching lots of Yankee games.”
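Part of OPS’s appeal was that the arithmetic is trivial; a quick sketch using the career figures quoted above (the sums can differ from the commonly listed OPS values in the last digit because the component rates are themselves rounded):

```python
def ops(on_base_pct, slugging_avg):
    """On-base Plus Slugging: simply the sum of the two rates."""
    return on_base_pct + slugging_avg

# Career rates quoted above:
mantle = ops(0.421, 0.557)
alou = ops(0.345, 0.381)

print(f"Mantle {mantle:.3f} vs. Alou {alou:.3f}")  # Mantle 0.978 vs. Alou 0.726
```

Anyone with the two widely published rates could check the claim for any player, which is what made James’s argument persuasive.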

On the other hand, other innovative baseball statistics from that era have faded because they didn’t seem to work as well in practice as in theory. Readers would be rightly skeptical that Glenn Hubbard and Roy Smalley Jr. really were all time greats, as these complicated formulas said they were.

A couple of years ago, Audacious Epigone and I stumbled upon a potentially promising fluke in the federal National Assessment of Educational Progress test scores by state. Since these tests are given every two years to representative samples of fourth and eighth graders, then you ought to be able to roughly estimate how much value the public schools in each state have added from 4th grade to 8th grade by comparing, say, a state’s 2009 8th grade scores to that state’s 2005 4th grade scores.

Granted, people move in and out of states, but if you just look at the scores for non-Hispanic whites, you can cut down the effect of demographic change to what might be a manageable level.

So, how to display this data in a semi-usable form? In the following table, I’ve put the Rank of each state. For example, in NAEP 4th Grade Reading scores in 2005, white public school students in Alabama ranked 48th (out of 52 — the 50 states plus D.C. and the Department of Defense schools for the children of military personnel). By 2009, this cohort of Alabamans was up to 47th in 8th Grade Reading. That’s a Change in Rank of +1. Woo-hoo!

In contrast, in Math, Alabama’s 4th Graders were 50th in 2005 and the state’s 8th Graders were 50th in 2009, so that’s a Change in Rank for Math of zero.

There are measures that are better for some purposes than Rank, but, admit it, ranking all the states is more interesting than using standard deviations or whatever.

A new idea is embodied in the last column, which reports the Difference in Rank between Math and Reading scores for 8th Graders in 2009. Because Alabama was 47th in Reading in 2009, but only 50th in Math in 2009, it gets a Difference in Rank of -3. Boo-hoo …
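The three derived columns are purely mechanical bookkeeping; a minimal sketch using the Alabama figures above (ranks improve as they get smaller, so a change is computed old rank minus new rank):

```python
def rank_columns(read_2005, read_2009, math_2005, math_2009):
    """Return (Chg in Rnk for Reading, Chg in Rnk for Math, Dif in Rnk).
    A positive change means the 2009 rank is better (lower) than the 2005
    rank; a positive difference means the state ranks better in Math than
    in Reading in 2009."""
    return (read_2005 - read_2009,
            math_2005 - math_2009,
            read_2009 - math_2009)

# Alabama: Reading 48th -> 47th, Math 50th -> 50th
print(rank_columns(48, 47, 50, 50))  # (1, 0, -3)
```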

What’s the point of this last measure?

There’s a fair amount of evidence that schools have more impact on Math performance than Reading performance. For example, math scores on a variety of tests have gone up some since hitting rock bottom during the Seventies (in most of America outside of Berkeley, the Seventies were when the Sixties actually happened). In contrast, reading and verbal scores have staggered around despite a huge amount of effort to raise them.

Why have math scores proven more open to improvement by schools than reading scores? One reason, probably, is that kids only spend about 1/5th of their waking hours in school, and almost nobody does math outside of school, but some kids read outside of school. So, if you, say, double the amount of time spent in school on math, then you are increasing the total amount of time kids are spending doing math by about 98%. But if you double the amount of time spent on reading in school, there are some rotten stinker kids who read for fun in their free time, and thus you aren’t doing much for them in terms of total hours devoted to reading.

Not surprisingly, a decade of the No Child Left Behind act, which tells states to hammer on math and reading and don’t worry about that arty stuff like history and science, has seen continued slow improvements in math, but not much in reading — except at the bottom (i.e., the kids who don’t read outside school).

So, by 8th grade, Reading scores would likely be a rough measure of IQ crossed with bookishness (personality and culture). In contrast, 8th Grade Math scores are more amenable to alteration by schools since kids aren’t waiting in line to buy Harry Potter and the Lowest Common Denominator. So, the idea behind the final column is to compare rank on 8th Grade Math to rank on 8th Grade Reading. A positive number means your state has a better (lower) rank on Math than on Reading, which might reflect relatively well on your public schools given the raw materials they have to work with relative to other states.

For example, on the NAEP, Texas ranks 11th among white 8th graders in Reading, which is pretty good for such a huge state. But, it ranks a very impressive 4th among white 8th graders in Math, for a Difference in Ranking score of +7. This suggests Texas is doing something with math that’s worth checking into. Maybe they are just teaching to the test, but this is the NAEP, which isn’t a high-stakes test. And there are worse things than teaching to the test. (Whatever they are doing, they are starting young, because Texas ranks 2nd in Math for white 4th Graders.)

So, here is this huge table:

                     NAEP Reading                NAEP Math
White           2005  2009  09-05        2005  2009  09-05       09-09
                Rank  Rank  Chg in Rnk   Rank  Rank  Chg in Rnk  Dif in Rnk
Alabama           48    47     +1          50    50      0          -3
Alaska            37    31     +6          31    21    +10         +10
Arizona           41    29    +12          36    27     +9          +2
Arkansas          34    46    -12          37    44     -7          +2
California        32    33     -1          25    36    -11          -3
Colorado           9     9      0          13     6     +7          +3
Connecticut        4     2     +2           8     7     +1          -5
Delaware           3    14    -11          11    17     -6          -3
DC                       1     +1                 1     +1           0
DoDEA              8     5     +3          21    16     +5         -11
Florida           16    21     -5          14    37    -23         -16
Georgia           27    38    -11          33    34     -1          +4
Hawaii            40    45     -5          40    48     -8          -3
Idaho             30    35     -5          29    26     +3          +9
Illinois          13    10     +3          28    18    +10          -8
Indiana           43    34     +9          26    29     -3          +5
Iowa              42    41     +1          39    41     -2           0
Kansas            33    19    +14          10    15     -5          +4
Kentucky          46    37     +9          51    49     +2         -12
Louisiana         45    51     -6          41    45     -4          +6
Maine             36    39     -3          42    39     +3           0
Maryland           7     3     +4           7     2     +5          +1
Massachusetts      2     4     -2           3     1     +2          +3
Michigan          28    40    -12          22    42    -20          -2
Minnesota         12     7     +5           4     5     -1          +2
Mississippi       49    48     +1          48    51     -3          -3
Missouri          26    27     -1          45    32    +13          -5
Montana           21    16     +5          35    10    +25          +6
Nebraska          18    20     -2          30    28     +2          -8
Nevada            51    49     +2          44    40     +4          +9
New Hampshire     19    24     -5          20    23     -3          +1
New Jersey         6     1     +5           5     3     +2          -2
New Mexico        35    25    +10          49    38    +11         -13
New York          10     8     +2          16    19     -3         -11
North Carolina    22    28     -6           6     8     -2         +20
North Dakota      20    22     -2          24     9    +15         +13
Ohio              14    12     +2          12    30    -18         -18
Oklahoma          50    50      0          46    46      0          +4
Oregon            44    36     +8          34    31     +3          +5
Pennsylvania      15     6     +9          17    14     +3          -8
Rhode Island      39    43     -4          43    43      0           0
South Carolina    38    44     -6           9    24    -15         +20
South Dakota      29    13    +16          23    12    +11          +1
Tennessee         47    42     +5          47    47      0          -5
Texas             11    11      0           2     4     -2          +7
Utah              31    30     +1          38    33     +5          -3
Vermont           24    18     +6          32    22    +10          -4
Virginia           5    17    -12          15    20     -5          -3
Washington        17    15     +2          19    11     +8          +4
West Virginia     52    52      0          52    52      0           0
Wisconsin         23    26     -3          18    13     +5         +13
Wyoming           25    32     -7          27    35     -8          -3

As J.K. Simmons asks at the end of Burn After Reading, “What did we learn?”

I’m not terribly sure, either. Who knows enough about what goes on within the educational establishments of all the states to know whether these numbers make sense?

But, at least we have some value added numbers and aren’t just still talking about how valuable they’d be if we ever got around to getting any.


Almost a decade ago, President Bush and Senator Kennedy got together and pushed through the No Child Left Behind act, which mandated that every single child in America would score “Proficient” or “Advanced” in reading and math by 2013-2014, and told the states to concoct, administer, and grade their own tests to demonstrate this (nudge, nudge, wink, wink).

Some states got the hint, such as Mississippi, which soon reported that, even with a couple of years left on its Five Year Plan for Educational Awesomeness, 89% of Mississippi 4th grade readers were already Proficient/Advanced. Whether the governor of Mississippi also invited President Bush and Senator Kennedy to float in state down the Mississippi and see all the thriving new schools that he had erected on the banks of that mighty river is lost in the mists of history.

Unfortunately, while Bush and Kennedy were at it, they forgot to abolish the federal National Assessment of Educational Progress test, which has gone on reporting that reading test scores have just kept on keeping on. From today’s Washington Post:

Reading scores stalled under ‘no child’ law, report finds

… progress nationwide has stalled despite huge instructional efforts launched under the No Child Left Behind law.

The 2009 National Assessment of Educational Progress showed that fourth-grade scores for the nation’s public schools stagnated after the law took effect in 2002, rose modestly in 2007, then flatlined. …

The national picture for eighth-grade reading was largely the same: a slight uptick in performance since 2007 but no gain in the seven years when President George W. Bush’s program for school reform was in high gear. …

When Bush signed the law, hopes were high for a revolution in reading. Billions of dollars were spent, especially in early grades, to build fluency, decoding skills, vocabulary, comprehension and a love of books that would propel students in all subjects. The goal was to eliminate racial and ethnic achievement gaps. But Wednesday’s report showed no great leaps for the nation and stubborn disparities in performance between white and black students, among others.

Another way to look at it is that we’re actually doing pretty well. With the demographic riptide running in the wrong direction, just staying in the same place is a tribute to a lot of hard work.

Other notes: the white-black gap in 4th grade reading scores is by far the largest in the most liberal jurisdiction, the District of Columbia. Nationwide, it’s 25 points, but in DC it’s 60 points. The next biggest white-black gaps for 4th graders are in Minnesota (35 points) and Wisconsin (35). The smallest white-black gaps are in West Virginia (12 points — dumb whites), New Hampshire and Vermont (few blacks), and Pentagon-run schools (need a 92 IQ to enlist).

Indeed, DC has by far the highest scoring white kids (15 points ahead of Massachusetts). Its black students are no longer the lowest scoring, being four points ahead of Wisconsin’s. (The worst scoring black 4th graders are in the socially liberal Old Northwest: Wisconsin, Michigan, and Minnesota. This is probably due in part to high welfare payments and easy eligibility requirements in the 1960s attracting the most feckless Southern blacks.)

Unfortunately, there aren’t enough white 8th graders in DC public schools for the NAEP to come up with an adequate sample size of white 8th graders in DC.


From the Washington Post, here are the scores by state on the Preliminary SAT (PSAT) required to make the first cut in the National Merit Scholarship program. (To convert from the three part PSAT score to the traditional two-part SAT Math plus Verbal scores, divide by 3 and multiply by 20: e.g., Arizona requires a 210, which is like a 1400 on the SAT.) It’s a good indication of the number of upper middle class residents by state.

For example, Washington D.C. always trails all 50 states on average National Assessment of Educational Progress scores for public school students, but it ties with Massachusetts (which leads NAEP scores more often than any other state), Maryland, and New Jersey for first on this measure with a 221 (the equivalent of a 1473 on the post-1995 SAT). Montana usually is close behind Massachusetts on the NAEP, but only requires a 204 because it lacks much of a native, childbearing upper middle class. In contrast, California, whose white students do relatively poorly on the NAEP on average, does well on this measure, requiring a 218. The lowest scoring state is Wyoming at 201. I would guess that’s about 2/3rds of a standard deviation behind the top four states.
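The conversion described above is just a rescaling; a minimal sketch:

```python
def psat_to_sat(selection_index):
    """Convert a three-part PSAT Selection Index (max 240) to the
    traditional two-part Math + Verbal SAT scale (max 1600): average the
    three sections (divide by 3), then rescale each section's 20-80 range
    to the SAT's 200-800 (multiply by 20)."""
    return selection_index / 3 * 20

print(psat_to_sat(210))         # 1400.0 -- Arizona's cutoff
print(round(psat_to_sat(221)))  # 1473 -- the top cutoff shared by D.C. et al.
```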

Alaska 211
Arizona 210
Arkansas 203
California 218
Colorado 213
Connecticut 218
Delaware 219
Washington D.C. 221
Florida 211
Georgia 214
Hawaii 214
Idaho 209
Illinois 214
Indiana 211
Iowa 209
Kansas 211
Kentucky 209
Louisiana 207
Maine 213
Maryland 221
Massachusetts 221
Michigan 209
Minnesota 215
Mississippi 203
Missouri 211
Montana 204
Nebraska 206
Nevada 202
New Hampshire 213
New Jersey 221
New Mexico 208
New York 218
North Carolina 214
North Dakota 202
Ohio 211
Oklahoma 207
Oregon 213
Pennsylvania 214
Rhode Island 217
South Carolina 211
South Dakota 205
Tennessee 213
Texas 216
Utah 206
Vermont 213
Virginia 218
Washington 217
West Virginia 203
Wisconsin 207
Wyoming 201

I haven’t quantified this, but I would assume that Blue States average higher scores than Red States on this measure, although Texas does well at 216.

In general, Texas does fairly well on most tests of educational competence, and it’s encouraging that such a huge state seems to perform relatively well both for the average and for the elite. It would be interesting to know how far back this goes in time, since Texas does not have a historical reputation for educational attainment the way Massachusetts does.


by State, 1960 –
You probably remember the notorious “Democratic states have higher IQs” hoax from last May. Well, here, thanks to Prof. Henry Harpending of the U. of Utah anthropology dept., might be the closest thing to a national sample of IQ scores ever: the Project Talent database of 366,000 9th-12th grade students. Unfortunately, it is 44 years old. Nonetheless, it correlates reasonably with 2003 NAEP 8th grade achievement test scores (here are the 2003 scores). As you can see, in this list of kids’ IQs back in the mid-1960s, of the top 10 smartest states, in 2000, Bush and Gore each won five. So, we’re back to my original conclusion: red states and blue states are similar in average IQ, as are, on average, Republican and Democratic voters.

Some caveats: These IQ scores are set with the national mean of the 366,000 high school students equal to 100 and the standard deviation set to 15. But, keep in mind that we are only beginning to explore this huge database, so take everything with a grain of salt.


[The state-by-state IQ table did not survive in this copy; only scattered state names remain.]
weren’t adequate sample sizes from Alaska, Washington DC, and South
Carolina, and I excluded South Dakota because the result was too
different from North Dakota. (I think something might be confused about
both South Carolina and South Dakota — I’ll try to find out more.)

I also looked at whites-only data (unfortunately, the majority of participants don’t have a race recorded). The states with the smartest whites (which I suspect is all that white liberals care about — feeling smarter than white conservatives) were (in descending order): Connecticut, Montana, Nevada (I bet that’s not true anymore!), Idaho, Illinois, Maryland, Massachusetts, Minnesota, Missouri, New Hampshire, New Jersey, New York, Oregon, and Virginia. The dumbest whites were in (in descending order): Georgia, Louisiana, Mississippi, North Carolina, Arkansas, Tennessee, West Virginia, and Kentucky. All of these states voted for Bush in 2000. I suspect, however, that air conditioning and the abolition of the caste system have done some good for the test scores of whites in the South, especially in North Carolina. Here, for purposes of comparison, are the 2003 NAEP public school achievement tests for white 8th graders.

About Steve Sailer

Steve Sailer is a journalist, movie critic for Taki's Magazine, columnist, and founder of the Human Biodiversity discussion group for top scientists and public intellectuals.
