[Figure: 2015 NAEP White-Hispanic gap]

This is taking the average of four 2015 federal NAEP scores: both Math and Reading for both 4th and 8th Grades.

[Figure: Federal NAEP reading scores, 12th graders, 2013]

A general assumption of the moderate conventional wisdom over the last half century is that average black performance is dragged down by specific impediments, such as poverty, crime, culture of poverty, parental taciturnity, lead paint, or whatever. One would therefore expect blacks without those impediments to score equal with whites.

But a close inspection of the social science data suggests that the world doesn’t really look like that. For example, above is the 2013 federal National Assessment of Educational Progress scores for 12th graders in Reading. Blacks who are the children of college graduates average 274, which is the same as whites who are the children of high school dropouts.

The Math Gap is the same:

[Figure: screenshot of the corresponding NAEP math scores]

At the high school dropout level, The Gap in math is 16 points, but at the college graduate level, The Gap is twice as large: 32 points. That’s the opposite of what the conventional wisdom would imply.

So, basically, there are two theories left to account for this. How do we choose between them?

In the past, Western civilization tried to follow Occam’s Razor, which implies the Bell Curve theory of regression toward different means would be most likely.

But the term “Western civilization” is exclusionary and makes people feel bad. These days, we know that the highest form of thought is not using Occam’s Razor but shouting “Occam’s racist!”

So the only viable explanation is the Conspiracy Theory Theory of Pervasive Racism: people who think they are white are constantly destroying black bodies by saying words like “field” and “swing.” Or something. It doesn’t really matter what the specifics of the Conspiracy Theory Theory are since the more unfalsifiable the better.

Because Science.

[Figure: screenshot of NAEP 8th-grade mathematics achievement levels, including the NSLP group]

Paul Krugman argues today that Puerto Rico is kind of like West Virginia, Mississippi, and Alabama:

Put it this way: if a region of the United States turns out to be a relatively bad location for production, we don’t expect the population to maintain itself by competing via ultra-low wages; we expect working-age residents to leave for more favorable places. That’s what you see in poor mainland states like West Virginia, which actually looks a fair bit like Puerto Rico in terms of low labor force participation, albeit not quite so much so. (Mississippi and Alabama also have low participation.) … There is much discussion of what’s wrong with Puerto Rico, but maybe we should, at least some of the time, just think of Puerto Rico as an ordinary region of the U.S. …

Okay, but there’s a huge difference in test scores.

The federal government has been administering a special Puerto Rico-customized version of its National Assessment of Educational Progress (NAEP) exam in Spanish to Puerto Rican public school students, and the results have been jaw-droppingly bad.

For example, among Puerto Rican 8th graders tested in mathematics in 2013, 95% scored Below Basic, 5% scored Basic, and (to the limits of rounding) 0% scored Proficient, and 0% scored Advanced. These results were the same in 2011.

In contrast, among American public school students poor enough to be eligible for subsidized school lunches (“NSLP” in the graph above), only 39% scored Below Basic, 41% scored Basic, 17% scored Proficient, and 3% scored Advanced.

Puerto Rico’s test scores are just shamefully low, suggesting that Puerto Rican schools are completely dropping the ball. By way of contrast, in the U.S., among black 8th graders, 38% score Basic, 13% score Proficient, and 2% score Advanced. In the U.S. among Hispanic 8th graders, 41% reach Basic, 18% Proficient, and 3% Advanced.

In Krugman’s bête noire of West Virginia, 42% are Basic, 20% are Proficient, and 3% are Advanced. In Mississippi, 40% are Basic, 18% Proficient, and 3% are Advanced. In Alabama, 40% are Basic, 16% are Proficient, and 3% are Advanced. (Unmentioned by Krugman, the lowest scores among public school students are in liberal Washington D.C.: 35% Basic, 15% Proficient, and 4% Advanced.)

Let me repeat, in Puerto Rico in Spanish, 5% are Basic, and zero zip zilch are Proficient, much less Advanced.

Am I misinterpreting something? I thought I must be, but here’s a press release from the Feds confirming what I just said:

The 2013 Spanish-language mathematics assessment marks the first time that Puerto Rico has been able to use NAEP results to establish a valid comparison to the last assessment in 2011. Prior to 2011, the assessment was carefully redesigned to ensure an accurate assessment of students in Puerto Rico. Results from assessments in Puerto Rico in 2003, 2005 and 2007 cannot be compared, in part because of the larger-than-expected number of questions that students either didn’t answer or answered incorrectly, making it difficult to precisely measure student knowledge and skills. The National Center for Education Statistics, which conducts NAEP, administered the NAEP mathematics assessment in 2011. But those results have not been available until now, as it was necessary to replicate the assessment in 2013 to ensure that valid comparisons could be made.

“The ability to accurately measure student performance is essential for improving education,” said Terry Mazany, chairman of the National Assessment Governing Board, which oversees NAEP. “With the support and encouragement of education officials in Puerto Rico, this assessment achieves that goal. This is a great accomplishment and an important step forward for Puerto Rico’s schools and students.”

NAEP assessments report performance using average scores and percentages of students at or above three achievement levels: Basic, Proficient and Advanced. The 2013 assessment results showed that 11 percent of fourth-graders in Puerto Rico and 5 percent of eighth-graders in public schools performed at or above the Basic level; conversely, 89 percent of fourth-graders and 95 percent of eighth-graders scored below that level. The Basic level denotes partial mastery of the knowledge and skills needed for grade-appropriate work. One percent or fewer of students in either grade scored at or above the Proficient level, which denotes solid academic performance. Only a few students scored at the Advanced level.

The sample size for 8th graders was 5,200 students at 120 public schools in the Territory.

UPDATE: I’ve now discovered Puerto Rico’s scores on the 2012 international PISA test. Puerto Rico came in behind Jordan in math.

Results this abysmal can’t solely be an HBD problem (although it’s an interesting data point in any discussion of hybrid vigor); this has to also be due to a corrupt and incompetent education system in Puerto Rico.

New York Times’ comments aren’t generally very useful for finding out information, but Krugman’s piece did get this comment:

KO’R New York, NY 4 hours ago

My husband and I have had a house in PR for 24 years. For two of those years we taught English and ESL at Interamericana, the second largest PR university. Our neighbors have children in the public grade schools. In a nutshell: the educational system in PR is a joke!!! Bureaucratic and corrupt. Five examples: (1) In the elementary schools near us if a teacher is sick or absent for any reason, there is no class that day. (2) Trying to get a textbook changed at Interamericana requires about a year or more of bureaucratic shenanigans. (3) A colleague at Interamericana told us that he’d taught in Africa (don’t remember where) for a few years and PR was much worse in terms of bureaucracy and politics. (4) The teaching method in PR is for the teacher to stand in front of the class, read from the textbook verbatim, and have the students repeat what he or she read. And I’m not speaking just about English – this goes for all subjects. (5) Interamericana is supposed to be a bi-lingual university. In practice, this means the textbooks are in English, the professor reads the Spanish translation aloud, and the usually minimal discussion is in Spanish. …

Public school spending in Puerto Rico is $7,429 per student versus $10,658 per student in the U.S. Puerto Rico spends more per student than Utah and Idaho and slightly less than Oklahoma.

Puerto Rico spends less than half as much as the U.S. average on Instruction: $3,082 in Puerto Rico vs. $6,520 in America, significantly less than any American state. But Puerto Rico spends more than the U.S. average on Total Support Services ($3,757 vs. $3,700). Puerto Rico is especially lavish when it comes to the shifty-sounding subcategories of General Administration ($699 in PR vs. $212 in America) and Other Support Services ($644 vs. $347). PR spends more per student on General Administration than any state in America, trailing only the notorious District of Columbia school system, and more even than DC and all 50 states on the nebulous Other Support Services.
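The spending comparison is easier to see as shares of each total. The following is a minimal sketch using only the per-student dollar figures quoted above; the dictionary layout and the function name `share_of_total` are illustrative choices, not anything taken from the Census tables themselves.

```python
# Per-pupil spending figures quoted in the text (dollars per student).
SPENDING = {
    "Puerto Rico": {
        "total": 7429,
        "instruction": 3082,
        "support": 3757,
        "general_admin": 699,
        "other_support": 644,
    },
    "U.S. average": {
        "total": 10658,
        "instruction": 6520,
        "support": 3700,
        "general_admin": 212,
        "other_support": 347,
    },
}


def share_of_total(region: str, category: str) -> float:
    """Return one category's share of total per-pupil spending, in percent."""
    row = SPENDING[region]
    return 100.0 * row[category] / row["total"]


if __name__ == "__main__":
    for region in SPENDING:
        for category in ("instruction", "support", "general_admin"):
            pct = share_of_total(region, category)
            print(f"{region:12s} {category:14s} {pct:5.1f}% of total")
```

The categories listed do not sum to the totals, so the shares are only meaningful one category at a time.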

Being a schoolteacher apparently doesn’t pay well in PR, but it looks like a job cooking the books somewhere in the K-12 bureaucracy could be lucrative.

The NAEP scores for Puerto Rico and the U.S. are for just public school students.

A higher percentage of young people in Puerto Rico attend private schools than in the U.S. The NAEP reported:

In Puerto Rico, about 23 percent of students in kindergarten through 12th grade attended private schools as of the 2011-2012 school year, compared with 10 percent in the United States. Puerto Rico results are not part of the results reported for the NAEP national sample.

So that accounts for part of the gap. But, still, public schools cover 77% of Puerto Ricans v. 90% of Americans, so the overall picture doesn’t change much: the vast majority of Puerto Rican 8th graders are Below Basic in math.
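To make the arithmetic behind that claim explicit, here is a rough bounding sketch. It assumes only the two figures already quoted (77% of students in public schools, 95% of public 8th graders Below Basic) and treats the private-school rate as completely unknown, so it gives a lower and an upper bound rather than an estimate. The function name is hypothetical.

```python
def overall_below_basic_bounds(public_share: float, public_below_basic: float):
    """Bound the all-student Below Basic share when the private rate is unknown.

    public_share       -- fraction of students enrolled in public schools
    public_below_basic -- fraction of public school students Below Basic
    """
    # Lower bound: pretend no private school student is Below Basic.
    low = public_share * public_below_basic
    # Upper bound: pretend every private school student is Below Basic.
    high = low + (1.0 - public_share)
    return low, high


if __name__ == "__main__":
    low, high = overall_below_basic_bounds(0.77, 0.95)
    print(f"Between {low:.1%} and {high:.1%} of all students are Below Basic")
```

Even under the most favorable assumption about private schools, the lower bound stays well above one half.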

Another contributing factor is likely that quite a few Puerto Ricans summer in America and winter in Puerto Rico and yank their kids back and forth, which is disruptive to their education.

It’s clear that Puerto Ricans consider their own public schools to be terrible and that anybody who can afford private school should get out. The NAEP press release mentions that 100% of Puerto Rican public school students are eligible for subsidized school lunches versus about 50% in the U.S. Heck, Oscar-winner Benicio Del Toro’s lawyer father didn’t just send him to private school, they sent him to a boarding school in Pennsylvania.

Still, these Puerto Rican public school scores are so catastrophic that I also wouldn’t rule out active sabotage by teachers, such as giving students an anti-pep talk, for some local labor reason. For example, a PISA score from Austria was low a couple of tests ago because the teachers’ union told teachers to tell students not to bother working hard on the test. But the diminishment of the Austrian PISA score wasn’t anywhere near this bad. And Puerto Rico students got exactly the same scores in 2011 and 2013.

And here’s Jason Malloy’s meta-analysis of studies of Puerto Rican cognitive performance over the last 90 years.

From the Baltimore Sun:

Baltimore second in per-pupil spending, Census Bureau says

May 21, 2013|By Erica L. Green, The Baltimore Sun

The Baltimore school system ranked second among the nation’s 100 largest school districts in how much it spent per pupil in fiscal year 2011, according to data released Tuesday by the U.S. Census Bureau.

The city’s $15,483 per-pupil expenditure was second to New York City’s $19,770. Rounding out the top five were Montgomery County, which spent $15,421; Milwaukee public schools at $14,244; and Prince George’s County public schools, which spent $13,775.

Baltimore City, New York, and Milwaukee test scores are broken out separately in the NAEP test’s Trial Urban District Assessment program. (The other two districts are suburban counties in the rich Washington DC area. Three of the top five most expensive districts in the country are in liberal Maryland.) I’ll look at 8th grade math for black students only:

National (public schools): 51% basic or above, 14% proficient or above, 2% advanced

Baltimore City: 44% basic or above, 10% proficient or above, 1% advanced

New York City: 51% basic or above, 13% proficient, 1% advanced

Milwaukee: 31% basic or above, 4% proficient, NA advanced
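Note that figures given as "basic or above" and "proficient or above" are cumulative, while other passages quote the discrete share at each level. A small sketch of the conversion follows, applied to the national line above. The function name is made up for illustration, and because the inputs are already rounded, the outputs can be off by a point from officially published discrete figures.

```python
def discrete_levels(basic_or_above: int, proficient_or_above: int, advanced: int) -> dict:
    """Convert cumulative 'at or above' percentages into per-level percentages."""
    return {
        "below_basic": 100 - basic_or_above,
        "basic": basic_or_above - proficient_or_above,
        "proficient": proficient_or_above - advanced,
        "advanced": advanced,
    }


if __name__ == "__main__":
    # National (public schools) line quoted above: 51 / 14 / 2.
    national = discrete_levels(51, 14, 2)
    for level, pct in national.items():
        print(f"{level:12s} {pct}%")
```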

So, Baltimore gets more for its money than Milwaukee. (Of course, if you’ve been reading iSteve for long, you’ll know of the amazing dismalness of Milwaukee blacks.)

Longtime readers know I’ve been interested in the question of school test scores in the two biggest states, California and Texas. In the federal National Assessment of Educational Progress scores, Texas routinely beats California across all racial groups. But the NAEP is low stakes to students, which makes it easier for state officials to manipulate results at the margins.

However, looking at an unverified table of high-stakes SAT and ACT college admission average test scores for 2014, white, Hispanic, and black California high schoolers outscore their counterparts in Texas (using a weighted average of SAT and ACT scores). But Texas’s Asians outscore California’s Asians.

Group      CA takers   CA mean   TX takers   TX mean   CA minus TX
All          350,655     1,016     295,583       973            43
AmInd          1,814       982       1,501       992           (9)
Asian         66,385     1,108      18,569     1,126          (18)
Black         20,667       888      37,615       854            33
Hispanic     131,723       905     113,395       891            14
Other         28,357     1,065      12,961     1,003            61
White        101,709     1,113     111,542     1,069            44

Both states are moderately majority SAT: in California, SAT takers outnumber ACT takers 2.1 to 1, and in Texas 1.5 to 1. This appears to be putting everything on the traditional 400 to 1600 scale, rather than the 600 to 2400 scale of the last decade, but that is being phased out soon. The mean was rescaled in 1995 to, ideally, be 1000 with a standard deviation of 200, although both have drifted since then.

So, California’s overall average is 97 points, or a little under a half of a standard deviation, below its white average, while Texas’s overall average is 96 points below its white average.
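The point differences can be restated in standard deviation units. This is a minimal sketch that takes the four means from the table and assumes the nominal standard deviation of 200 mentioned above; since the text notes that the real mean and standard deviation have drifted, the results are approximate.

```python
# Nominal SD of the 400-1600 scale after the 1995 rescaling (an assumption;
# the text says the actual value has drifted since then).
NOMINAL_SD = 200

# Mean scores taken from the table above.
MEANS = {
    "California": {"all": 1016, "white": 1113},
    "Texas": {"all": 973, "white": 1069},
}


def overall_vs_white_gap(state: str):
    """Return (gap in points, gap in nominal SD units) for one state."""
    gap_points = MEANS[state]["white"] - MEANS[state]["all"]
    return gap_points, gap_points / NOMINAL_SD


if __name__ == "__main__":
    for state in MEANS:
        points, sds = overall_vs_white_gap(state)
        print(f"{state}: {points} points, about {sds:.2f} SD")
```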

I’m not going to put too much credence in these numbers: even if the data are valid (which I haven’t checked), my weighted average methodology is crude. On the other hand, the results don’t seem too implausible.

I mostly want to put some numbers out there to provoke somebody interested in this long-running problem of how to synthesize SAT and ACT scores reliably to try to come up with a more sophisticated general model.
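For anyone who wants a starting point, the crude weighted average can be sketched as below. This is only an illustration of the pooling step: it assumes the ACT mean has already been converted to the SAT scale by some concordance chosen by the caller, which is exactly the hard part and is not attempted here. The function name and the example inputs are hypothetical and are not the figures behind the table.

```python
def pooled_mean(groups):
    """Weighted mean of group means that are already on a common scale.

    groups -- iterable of (mean_score, weight) pairs, where the weight is a
              count of test takers or any number proportional to it
    """
    groups = list(groups)
    total_weight = sum(weight for _, weight in groups)
    if total_weight <= 0:
        raise ValueError("total weight must be positive")
    return sum(mean * weight for mean, weight in groups) / total_weight


if __name__ == "__main__":
    # Made-up inputs: an SAT mean of 1000 weighted 2.1 against an
    # ACT-derived mean of 1031 weighted 1.0 (the 2.1 to 1 ratio echoes the
    # California SAT-to-ACT ratio mentioned in the text; the means are invented).
    print(round(pooled_mean([(1000.0, 2.1), (1031.0, 1.0)]), 1))
```

A more serious model would also have to deal with students who take both tests and with differences in who chooses each test.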

The federal government’s National Assessment of Educational Progress test results for 12th graders in readin’ and ‘rithmetic are now out for 2013. The feds have a nice website to display the numbers. I’ve been following these kinds of test score stats for almost as long as I’ve been following baseball statistics, but I have to admit that seldom if ever do any Mike Trouts come along to add excitement to my peculiar hobby.
Above is a graph of the ten states where the NAEP had big enough sample sizes to break out The Gap (white-black, in this graph on the Math test). Of the ten states, the only one where The Gap is notably smaller than in the nation at large is West Virginia. How has West Virginia accomplished this goal that has obsessed policymakers and pundits for most of my lifetime? By having many of the smart white people in West Virginia move to greener pastures in North Carolina, Virginia, Georgia, and so forth.
(Republished from iSteve by permission of author or representative)
Psychometrics is a relatively mature field of science, and a politically unpopular one. So you might think there isn’t much money to be made in making up brand new standardized tests. Yet, there is.
From the NYT:

U.S. Asks Educators to Reinvent Student Tests, and How They Are Given

By SAM DILLON

Standardized exams — the multiple-choice, bubble tests in math and reading that have played a growing role in American public education in recent years — are being overhauled.

Over the next four years, two groups of states, 44 in all, will get $330 million to work with hundreds of university professors and testing experts to design a series of new assessments that officials say will look very different from those in use today.

The new tests, which Secretary of Education Arne Duncan described in a speech in Virginia on Thursday, are to be ready for the 2014-15 school year.

They will be computer-based, Mr. Duncan said, and will measure higher-order skills ignored by the multiple-choice exams used in nearly every state, including students’ ability to read complex texts, synthesize information and do research projects.

“The use of smarter technology in assessments,” Mr. Duncan said, “makes it possible to assess students by asking them to design products of experiments, to manipulate parameters, run tests and record data.”

I don’t know what the phrase “design products of experiments” even means, so I suspect that the schoolchildren of 2014-15 won’t be doing much of it.

Okay, I looked up Duncan’s speech, “Beyond the Bubble Tests,” and what he actually said was “design products or experiments,” which almost makes sense, until you stop and think about it. Who is going to assess the products the students design? George Foreman? Donald Trump? (The Donald would be good at grading these tests: tough, but fair. Here’s a video of Ali G pitching the product he designed, the “ice cream glove,” to Trump.)

Because the new tests will be computerized and will be administered several times throughout the school year, they are expected to provide faster feedback to teachers than the current tests about what students are learning and what might need to be retaught.

Both groups will produce tests that rely heavily on technology in their classroom administration and in their scoring, she noted.

Both will provide not only end-of-year tests similar to those in use now but also formative tests that teachers will administer several times a year to help guide instruction, she said.

And both groups’ tests will include so-called performance-based tasks, designed to mirror complex, real-world situations.

In performance-based tasks, which are increasingly common in tests administered by the military and in other fields, students are given a problem — they could be told, for example, to pretend they are a mayor who needs to reduce a city’s pollution — and must sift through a portfolio of tools and write analytically about how they would use them to solve the problem.

Oh, boy …

There is some good stuff here — adaptive tests are a good idea (both the military’s AFQT and the GRE have gone over to them). But there’s obvious trouble, too.

Okay, so these new tests are going to be much more complex, much more subjective, and get graded much faster than fill-in-the-bubble tests? They’ll be a dessert topping and a floor wax!

These sound a lot like the Advanced Placement tests offered to high school students, which usually include lengthy essays. But AP tests take two months to grade, and are only offered once per year (in May, with scores coming back in July), because they use high school teachers on their summer vacations to grade them.

There’s no good reason why fill-in-the-bubble tests can’t be scored quickly. A lot of public school bubble tests are graded slothfully, but they don’t have to be. My son took the ERB’s Independent School Entrance Exam on a Saturday morning and his score arrived at our house in the U.S. Mail the following Friday, six days later.

The only legitimate reason for slow grading is if there are also essays to be read, but in my experience, essay results tend to be dubious at least below the level of Advanced Placement tests, where there is specific subject matter in common. The Writing test that was added to the SAT around 2003 has largely been a bust, with many colleges refusing to use it in the admissions process.

One often overlooked problem with any kind of writing test, for example, is that graders have a hard time reading kids’ handwriting. You can’t demand that kids type because millions of them can’t. Indeed, writing test results tend to correlate with number of words written, which is often more of a test of handwriting speed than of anything else. Multiple choice tests have obvious weaknesses, but at least they minimize the variance introduced by small motor skills.

And the reference to “performance-based tasks” in which people are supposed to “write analytically” is naive. I suspect that Duncan and the NYT man are confused by all the talk during the Ricci case about the wonders of “assessment centers” in which candidates for promotion are supposed to sort through an in-basket and talk out loud about how they would handle problems. In other words, those are hugely expensive oral tests. The city of New Haven brought in 30 senior fire department officials from out of state to be the judges on the oral part of the test.

And the main point of spending all this money on an oral test is that an oral test can’t be blind-graded. In New Haven, 19 of the 30 oral test judges were minorities, which isn’t something that happens by randomly recruiting senior fire department officials from across the country.

But nobody can afford to rig the testing of 35,000,000 students annually.

Here are some excerpts from Duncan’s speech:

President Obama called on the nation’s governors and state education chiefs “to develop standards and assessments that don’t simply measure whether students can fill in a bubble on a test, but whether they possess 21st century skills like problem-solving and critical thinking and entrepreneurship and creativity.”

You know your chain is being yanked when you hear that schoolteachers are supposed to teach “21st century skills” like “entrepreneurship.” So, schoolteachers are going to teach kids how to be Steve Jobs?

Look, there are a lot of good things to say about teachers, but, generally speaking, people who strive for union jobs with lifetime tenure and summers off are not the world’s leading role models on entrepreneurship.

Further, whenever you hear teachers talk about how they teach “critical thinking,” you can more or less translate that into “I hate drilling brats on their times tables. It’s so boring.” On the whole, teachers aren’t very good critical thinkers. If they were, Ed School would drive them batty. (Here is an essay about Ed School by one teacher who is a good critical thinker.)

And last but not least, for the first time, the new assessments will better measure the higher-order thinking skills so vital to success in the global economy of the 21st century and the future of American prosperity. To be on track today for college and careers, students need to show that they can analyze and solve complex problems, communicate clearly, synthesize information, apply knowledge, and generalize learning to other settings. …

Over the past 19 months, I have visited 42 states to talk to teachers, parents, students, school leaders, and lawmakers about our nation’s public schools. Almost everywhere I went, I heard people express concern that the curriculum had narrowed as more educators “taught to the test,” especially in schools with large numbers of disadvantaged students.

Two words: Disparate Impact.

The higher the intellectual skills that are tested, the larger the gaps between the races will turn out to be. Consider the AP Physics C exam, the harder of the two AP physics tests: In 2008, 5,705 white males earned 5s (the top score) versus six black females.

In contrast, tests of rote memorization, such as having third graders chant the multiplication tables, will have smaller disparate impact than tests of whether students “can analyze and solve complex problems, communicate clearly, synthesize information, apply knowledge, and generalize learning to other settings.” That’s a pretty decent description of what IQ tests measure.

Duncan says that the new tests could replace existing high school exit exams that students must pass to graduate.

Many educators have lamented for years the persistent disconnect between what high schools expect from their students and the skills that colleges expect from incoming freshmen. Yet both of the state consortia that won awards in the Race to the Top assessment competition pursued and got a remarkable level of buy-in from colleges and universities.

… In those MOUs, 188 public colleges and universities and 16 private ones agreed that they would work with the consortium to define what it means to be college-ready on the new high school assessments.

The fact that you can currently graduate from high school without being smart enough for college is not a bug, it’s a feature. Look, this isn’t Lake Wobegon. Half the people in America are below average in intelligence. They aren’t really college material. But they shouldn’t all have to go through life branded as a high school dropout instead of high school graduate because they weren’t lucky enough in the genetic lottery to be college material.

The Gates Foundation and the U. of California ganged up on the LA public schools to get the school board to pass a rule that nobody will be allowed to graduate who hasn’t passed three years of math, including Algebra II. That’s great for UC, not so great for an 85 IQ kid who just wants a high school diploma so employers won’t treat him like (uh oh) a high school dropout. But, nobody gets that.

Another benefit of Duncan’s new high stakes tests will be Smaller Sample Sizes of Questions:

With the benefit of technology, assessment questions can incorporate audio and video. Problems can be situated in real-world environments, where students perform tasks or include multi-stage scenarios and extended essays.

By way of example, the NAEP has experimented with asking eighth-graders to use a hot-air balloon simulation to design and conduct an experiment to determine the relationship between payload mass and balloon altitude. As the balloon rises in the flight box, the student notes the changes in altitude, balloon volume, and time to final altitude. Unlike filling in the bubble on a score sheet, this complex simulation task takes 60 minutes to complete.

So, the NAEP has experimented with this kind of question. How did the experiment work out?

You’ll notice that the problem with using up 60 minutes of valuable testing time on a single multipart problem instead of, say, 60 separate problems is that it radically reduces the sample size. A lot of kids will get off track right away and get a zero for the whole one hour segment. Other kids will have seen a hot air balloon problem the week before and nail the whole thing and get a perfect score for the hour.
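The sample-size point can be put in numbers with the textbook standard error of a proportion. The sketch below is a simplification under stated assumptions: every item is scored right or wrong, items are independent and equally weighted, and the hour-long task is treated as a single all-or-nothing item, as in the scenario just described. Real multipart tasks award partial credit, so this overstates the contrast somewhat.

```python
import math


def standard_error(p_correct: float, n_items: int) -> float:
    """Standard error of the observed fraction correct over n independent items."""
    if n_items < 1:
        raise ValueError("need at least one item")
    return math.sqrt(p_correct * (1.0 - p_correct) / n_items)


if __name__ == "__main__":
    p = 0.5  # a student with a 50% chance on each item (illustrative)
    one_big_task = standard_error(p, 1)
    sixty_items = standard_error(p, 60)
    print(f"1 item:   SE = {one_big_task:.3f}")
    print(f"60 items: SE = {sixty_items:.3f}")
    print(f"ratio:    {one_big_task / sixty_items:.1f}x noisier")
```

Under these assumptions the noise in an individual score shrinks with the square root of the number of items, which is why one long task is so much less reliable than many short ones.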

That kind of thing is fine for the low stakes NAEP where results are only reported by groups with huge sample sizes (for example, the NAEP reports scores for whites, blacks, and Hispanics, but not for Asians). But for high stakes testing of individual students and of their teachers, it’s too random. AP tests have large problems on them, but they are only given to the top quarter or so of high school students in the country, not the bottom half of grade school students.

It’s absurd to think that it’s all that crucial that all American schoolchildren must be able to “analyze and solve complex problems, communicate clearly, synthesize information, apply knowledge, and generalize learning to other settings.” You can be a success in life without being able to do any of that terribly well.

Look, for example, at the Secretary of Education. Arne Duncan has spent 19 months traveling to 42 states, talking about testing with teachers, parents, school leaders, and lawmakers. Yet, has he been able to synthesize information about testing terribly well at all? Has his failure to apply knowledge and generalize learning about testing gotten him fired from the Cabinet?

Americans have devoted an enormous amount of effort over the centuries to devising useful baseball statistics. In recent years, Americans have talked a lot about devising useful educational statistics.

For example, I’ve pointed out a million times over the last decade that it doesn’t make much sense to judge teachers, schools, or colleges by their students’ test scores. Most of the time, all you are doing is determining which kids were smarter to start with. Logically, it makes more sense to judge their “value added” by comparing how the students score now to how they scored in the past before the people or institutions being measured got their mitts on the students.

Over the last few years, everybody who is anybody in education — Bill Gates, Arne Duncan, you name it — has come around to this perspective (although they won’t use the word “smarter”).

A big problem, however, is that this value added idea remains almost wholly theoretical because almost none of the prominent educational statistics are published in value added form.

In contrast, when Bill James was pointing out 30 years ago that Batting Average, traditionally the most prestigious hitting statistic (the guy with the highest BA was crowned “Batting Champion”), wasn’t as good a measure of hitting contribution as Slugging Average plus On-Base Percentage, he could show you what he meant using real numbers that were available to everybody, even if you had to calculate them yourself from other, more widely published statistics.

Readers would say, “Yeah, he’s right. For example, Matty Alou (career batting average .307, but slugging average .381 and on-base percentage .345) wasn’t anywhere near as good as Mickey Mantle (career batting average only .298, but slugging average .557 and on-base percentage .421). If you add on-base percentage and slugging average together to get “OPS,” then Mickey had a .977 while Matty only had .726. And that sounds about right. Mickey was awesome, but it didn’t always show up in his traditional statistics. Now, we’ve finally got a statistic that matches up with what we all could see from watching lots of Yankee games.”
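The calculation is simple enough to check by hand or in a few lines. This sketch just adds the career on-base percentage and slugging average quoted above. Note that adding Mantle's rounded components gives .978 rather than the .977 quoted, presumably because the quoted figure was computed from unrounded inputs.

```python
def ops(on_base_pct: float, slugging_avg: float) -> float:
    """On-base plus slugging, rounded to the customary three decimals."""
    return round(on_base_pct + slugging_avg, 3)


if __name__ == "__main__":
    # Career figures as quoted in the text.
    players = {
        "Matty Alou": (0.345, 0.381),
        "Mickey Mantle": (0.421, 0.557),
    }
    for name, (obp, slg) in players.items():
        print(f"{name}: OPS {ops(obp, slg):.3f}")
```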

On the other hand, other innovative baseball statistics from that era have faded because they didn’t seem to work as well in practice as in theory. Readers would be rightly skeptical that Glenn Hubbard and Roy Smalley Jr. really were all time greats, as these complicated formulas said they were.

A couple of years ago, Audacious Epigone and I stumbled upon a potentially promising fluke in the federal National Assessment of Educational Progress test scores by state. Since these tests are given every two years to representative samples of fourth and eighth graders, you ought to be able to roughly estimate how much value the public schools in each state have added from 4th grade to 8th grade by comparing, say, a state’s 2009 8th grade scores to that state’s 2005 4th grade scores.

Granted, people move in and out of states, but if you just look at the scores for non-Hispanic whites, you can cut down the effect of demographic change to what might be a manageable level.

So, how to display this data in a semi-usable form? In the following table, I’ve put the Rank of each state. For example, in NAEP 4th Grade Reading scores in 2005, white public school students in Alabama ranked 48th (out of 52 — the 50 states plus D.C. and the Department of Defense schools for the children of military personnel). By 2009, this cohort of Alabamans was up to 47th in 8th Grade Reading. That’s a Change in Rank of +1. Woo-hoo!

In contrast, in Math, Alabama’s 4th Graders were 50th in 2005 and the state’s 8th Graders were 50th in 2009, so that’s a Change in Rank for Math of zero.

There are measures that are better for some purposes than Rank, but, admit it, ranking all the states is more interesting than using standard deviations or whatever.

A new idea is embodied in the last column, which reports the Difference in Rank between Math and Reading scores for 8th Graders in 2009. Because Alabama was 47th in Reading in 2009, but only 50th in Math in 2009, it gets a Difference in Rank of -3. Boo-hoo …
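For concreteness, the three derived columns can be sketched as follows, using the Alabama ranks given above. The class and function names are invented for illustration. The sign convention follows the text: a lower rank number is better, so a positive change means the cohort moved up.

```python
from dataclasses import dataclass


@dataclass
class StateRanks:
    """Ranks out of 52 (lower is better) for one state's cohort."""
    reading_4th_2005: int
    reading_8th_2009: int
    math_4th_2005: int
    math_8th_2009: int


def change_in_rank(rank_4th: int, rank_8th: int) -> int:
    """Positive when the cohort ranks better in 8th grade than it did in 4th."""
    return rank_4th - rank_8th


def math_vs_reading(ranks: StateRanks) -> int:
    """Positive when the 8th grade Math rank is better than the Reading rank."""
    return ranks.reading_8th_2009 - ranks.math_8th_2009


if __name__ == "__main__":
    alabama = StateRanks(reading_4th_2005=48, reading_8th_2009=47,
                         math_4th_2005=50, math_8th_2009=50)
    print("Reading change:", change_in_rank(alabama.reading_4th_2005,
                                            alabama.reading_8th_2009))
    print("Math change:   ", change_in_rank(alabama.math_4th_2005,
                                            alabama.math_8th_2009))
    print("Math vs Reading:", math_vs_reading(alabama))
```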

What’s the point of this last measure?

There’s a fair amount of evidence that schools have more impact on Math performance than Reading performance. For example, math scores on a variety of tests have gone up some since hitting rock bottom during the Seventies (in most of America outside of Berkeley, the Seventies were when the Sixties actually happened). In contrast, reading and verbal scores have staggered around despite a huge amount of effort to raise them.

Why have math scores proven more open to improvement by schools than reading scores? One reason probably is that kids only spend about 1/5th of their waking hours in school. And almost nobody does math outside of school, but some kids read outside of school. So, if you, say, double the amount of time spent in school on math, then you are increasing the total amount of time kids are spending doing math by about 98%. But if you double the amount of time spent on reading in school, there are some rotten stinker kids who read for fun in their free time, and thus you aren’t doing much for them in terms of total hours devoted to reading.
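The arithmetic behind that can be sketched roughly as follows. The 98% figure implies that about 98% of a kid's math time happens in school. The 25% in-school share for a keen reader is purely a hypothetical number picked for illustration, not a measured one.

```python
def total_time_increase(share_in_school, school_multiplier):
    """Fractional increase in total time on a subject when in-school time is multiplied.

    share_in_school: fraction of a child's current time on the subject
                     that happens in school (between 0 and 1).
    school_multiplier: 2.0 means in-school time is doubled.
    """
    return share_in_school * (school_multiplier - 1.0)


# Math: the 98% figure corresponds to 98% of math time being in school.
math_increase = total_time_increase(0.98, 2.0)

# Reading: a hypothetical keen reader who does only 25% of reading in school.
reading_increase = total_time_increase(0.25, 2.0)

print(f"math: +{math_increase:.0%}, reading: +{reading_increase:.0%}")
```

The smaller the share of a subject that is done in school, the less a school can move the total by adding class time.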

Not surprisingly, a decade of the No Child Left Behind act, which tells states to hammer on math and reading and don’t worry about that arty stuff like history and science, has seen continued slow improvements in math, but not much in reading — except at the bottom (i.e., the kids who don’t read outside school).

So, by 8th grade, Reading scores would likely be a rough measure of IQ crossed with bookishness (personality and culture). In contrast, 8th Grade Math scores are more amenable to alteration by schools since kids aren’t waiting in line to buy Harry Potter and the Lowest Common Denominator. So, the idea behind the final column is to compare rank on 8th Grade Math to rank on 8th Grade Reading. A positive number means your state has a better (lower) rank on Math than on Reading, which might reflect relatively well on your public schools given the raw materials they have to work with relative to other states.

For example, on the NAEP, Texas ranks 11th among white 8th graders in Reading, which is pretty good for such a huge state. But, it ranks a very impressive 4th among white 8th graders in Math, for a Difference in Ranking score of +7. This suggests Texas is doing something with math that’s worth checking into. Maybe they are just teaching to the test, but this is the NAEP, which isn’t a high-stakes test. And there are worse things than teaching to the test. (Whatever they are doing, they are starting young, because Texas ranks 2nd in Math for white 4th Graders.)

So, here is this huge table:

NAEP            Read                    Math
White           2005  2009  09-05       2005  2009  09-05       09-09
                Rank  Rank  Chg in Rnk  Rank  Rank  Chg in Rnk  Dif in Rnk
Alabama         48    47    +1          50    50    0           -3
Alaska          37    31    +6          31    21    +10         +10
Arizona         41    29    +12         36    27    +9          +2
Arkansas        34    46    -12         37    44    -7          +2
California      32    33    -1          25    36    -11         -3
Colorado        9     9     0           13    6     +7          +3
Connecticut     4     2     +2          8     7     +1          -5
Delaware        3     14    -11         11    17    -6          -3
DC              1           +1          1           +1          0
DoDEA           8     5     +3          21    16    +5          -11
Florida         16    21    -5          14    37    -23         -16
Georgia         27    38    -11         33    34    -1          +4
Hawaii          40    45    -5          40    48    -8          -3
Idaho           30    35    -5          29    26    +3          +9
Illinois        13    10    +3          28    18    +10         -8
Indiana         43    34    +9          26    29    -3          +5
Iowa            42    41    +1          39    41    -2          0
Kansas          33    19    +14         10    15    -5          +4
Kentucky        46    37    +9          51    49    +2          -12
Louisiana       45    51    -6          41    45    -4          +6
Maine           36    39    -3          42    39    +3          0
Maryland        7     3     +4          7     2     +5          +1
Massachusetts   2     4     -2          3     1     +2          +3
Michigan        28    40    -12         22    42    -20         -2
Minnesota       12    7     +5          4     5     -1          +2
Mississippi     49    48    +1          48    51    -3          -3
Missouri        26    27    -1          45    32    +13         -5
Montana         21    16    +5          35    10    +25         +6
Nebraska        18    20    -2          30    28    +2          -8
Nevada          51    49    +2          44    40    +4          +9
New Hampshire   19    24    -5          20    23    -3          +1
New Jersey      6     1     +5          5     3     +2          -2
New Mexico      35    25    +10         49    38    +11         -13
New York        10    8     +2          16    19    -3          -11
North Carolina  22    28    -6          6     8     -2          +20
North Dakota    20    22    -2          24    9     +15         +13
Ohio            14    12    +2          12    30    -18         -18
Oklahoma        50    50    0           46    46    0           +4
Oregon          44    36    +8          34    31    +3          +5
Pennsylvania    15    6     +9          17    14    +3          -8
Rhode Island    39    43    -4          43    43    0           0
South Carolina  38    44    -6          9     24    -15         +20
South Dakota    29    13    +16         23    12    +11         +1
Tennessee       47    42    +5          47    47    0           -5
Texas           11    11    0           2     4     -2          +7
Utah            31    30    +1          38    33    +5          -3
Vermont         24    18    +6          32    22    +10         -4
Virginia        5     17    -12         15    20    -5          -3
Washington      17    15    +2          19    11    +8          +4
West Virginia   52    52    0           52    52    0           0
Wisconsin       23    26    -3          18    13    +5          +13
Wyoming         25    32    -7          27    35    -8          -3

As J.K. Simmons asks at the end of Burn After Reading, “What did we learn?”

I’m not terribly sure, either. Who knows enough about what goes on within the educational establishments of all the states to know whether these numbers make sense?

But, at least we have some value added numbers and aren’t just still talking about how valuable they’d be if we ever got around to getting any.

(Republished from iSteve by permission of author or representative)
Charles Murray blogs:

The narrowing of the black-white gap extends from children born in 1961 through children born in 1973, and it was substantial—from 1.2 standard deviations to about .8. Then the trend goes flat, with a few spikes, for children born over the next 26 years.

John McWhorter posts at The New Republic:

Saletan Responds: OK, Let’s Try This
William Saletan has responded to my comment on his discomfort with No Child Left Behind data being tabulated by race.

I get where he’s coming from. He makes many valid points. One of them is that while I argued that cultural differences determine why black people often don’t do as well as white ones on tests, poor whites do significantly better than poor black ones, despite that we can assume that many of their cultural variables, such as a language culture focused on the oral rather than the printed page and direct-question exchanges like “What is the capital of South Dakota?”, are similar to blacks’.

That question is not to be swatted away.

And to show that I mean it when I say that Saletan makes valid points, I am going to put my money where my mouth is.

Namely: I agree with Saletan that if it turns out that there are no genetic differences at all in intelligence between the races, it will be the unexpected case. At the very least, it is utterly plausible, given indisputable differences between races of other kinds, that intelligence may prove to be one of them. If intelligence is, even if only partly, traceable to configurations of neurons in the brain, then there is no a priori reason to suppose that those configurations are statistically identical between races while other physical configurations — i.e. hair, color, etc. — are not.

Yes, racial differences are a matter of probability–members will exhibit traits to varying degrees, a white individual may well be more X or Y than a black individual. Anyone reading this understands that. However, when issues such as this are brought up, this issue of statistics and probabilities is often brought to bear as if it somehow contradicted what I wrote in the previous paragraph. It does not.

The same goes for other facts such as that race is a squishy concept, that individuals within races differ genetically more than individuals of different races, and so on.

The fact remains that I have a certain complex of genetic factors that expresses itself as a degree of melanin, a kinkiness of hair, a nose shape, and so on, whose clustering typifies what we process as the black race, one which emerged in Africa.

Back to the point: sure, it may turn out that whites and/or Asians have higher intelligence than black people. It’s not news I would love hearing, for all the same reasons few of us would. But it could happen.

However, to me, the evidence suggests that the difference in question, if it exists, would be quite small. Other factors are just as plausibly responsible for most or even all of the gap between poor white and poor black kids on tests like the NAEP.

Okay, but once again, what about the big differences between nonpoor white and nonpoor black kids on the NAEP? What about that SAT study that found that whites in the lowest decile of family income outscored blacks in the top decile? Why do blacks about to graduate from college get an average score on the LSAT that would only fall at the 12th percentile of the white distribution?

Namely, education-wise, all evidence is that to be a poor white kid is different from being a poor black kid, and not just in the texture of your hair. Just for starters, most of us will spontaneously notice that the worst schools in the nation – the violent, understaffed, ramshackle inner-city disasters where little learning happens–don’t have many white kids in them.

Yes, we must do better than that kind of impressionism, however, upon which: Poor black kids are routinely subject to less qualified teachers, who stick around for less time, than poor white kids. A classic study on the question by John Kain and Kraig Singleton addressed the situation in Texas.

Okay, but why do most of the better teachers do everything they can to eventually get themselves out of schools full of poor black kids? Could it have to do with the conduct of the kids? Could it have to do with their potential for learning? After all, the best teachers tend to like to teach the best students, the ones with the greatest capacity for learning. Nobody is surprised that the best golf swing instructors want to be hired by Tiger Woods rather than by me, even though they could shave more strokes off my average score than off Tiger’s.

Or, the typical poor white child is surrounded by fewer poor people than the typical poor black child, and only about 1 in 20 poor white kids go to schools where almost all students are also poor (useful facts on this here).

Notice that I am not claiming (despite sources such as the one I linked because of its handy presentation of other data) that the problem is “segregation”–i.e. that poor black kids are done in by going to school with people the same color as them, a tragic distortion of the meaning and significance of the word segregation in our times which I deplore. “Segregated” KIPP academies are teaching poor black and brown kids brilliantly all over the country (which, itself, is further evidence that the problem is how such kids are taught more than how their brains are configured).

Okay, but how about the Shaker Heights Effect studied by John Ogbu — all the affluent liberal integrated school districts across the country that got together in the early 2000s to study why black students from upper middle class homes performed poorly on average?

The issue is poverty rather than race, and the cultural baggage it often means kids are bringing to school–which the schools poor black kids attend are less adept at compensating for than those attended by the poor white kids. Plus, poor white kids are more likely to have more fortunate students around them to imitate and learn from.

We haven’t seen yet whether addressing these things will close the gaps in question–or maybe narrow them to such an extent that whatever gap was left would be too small to interest anyone but obsessives of sinister motive.

“Obsessives of sinister motive” = citizens interested in finding out what the vast amount of data collected by the federal government for the purposes of enforcing affirmative action actually show.

McWhorter asserts “We haven’t seen yet whether addressing these things will close the gaps in question.” Look, these precise questions have been studied intensively for 45 years. The incentives for any social scientist to be the one who comes up with a breakthrough analytical idea making the race gap disappear are huge.

Now, I take it Saletan is still worried that just such people, such as Steve Sailer, are still a force to be feared. Respectfully, however, I am still not sure why.

Think about it: our public discourse is at a point where when Saletan even entertains the data that makes us so uncomfortable he is excoriated endlessly. Where is the space in this discourse for people like Sailer to acquire any kind of meaningful influence?

Indeed. Wielding Occam’s Butterknife pays a lot better than Occam’s Razor.

Really: we have to think about what we’re proposing as a danger worthy of engagement. What legislation would have Steve Sailer’s imprint? What steps can we imagine – and societal evolution happens generally in steps–via which we would get to a point where black people were routinely herded apart as mental deficients?

Because that is what I’ve routinely advocated? Where? When?

What I’ve routinely advocated are colorblind policies in contrast to the current race obsessed policies imposed by the government under the “disparate impact” theory.

Or whatever dystopian horror we are supposed to be worried about.

Other dystopian horrors I’ve advocated:

- Finally finishing the border fence, like Israel’s border fence (just on our side of the border).

- Adopting a Canadian-style system for picking legal immigrants who will most benefit current American citizens as a whole.

- Paying unemployed illegal immigrants to go home.

- Eliminating the EEOC’s four-fifths rule.

The horror! The horror!

And if you have more imagination than I do, then specify: how would the steps to the scenario you envision initiate from the back-of-the-class mutterings of people like Steve Sailer, given the now deeply-rooted cultural revulsion towards open bigotry in our society?

Yes, it’s still “out there”–but not to an extent that can keep a black man out of the White House, despite what I was repeatedly told all last year all the way up to the second Obama won the election. The issue is not “whether,” but “how much” it’s out there.

I’d much rather see how far we can get with addressing what kind of schools poor kids go to. My money is on poor black kids looking better decade by decade if we do the right things–but that will mean assessing how the kids are doing by race, and publishing the data for all to see including Big Bad Steve.

Well, you might want to start by looking at all the data that has already been published decade by decade…

As for the moral copout [huh] Sailer-types wait for, where we eliminate all efforts to help black people out of a conclusion that they are beyond assistance because of genetic inferiority, again, we’d have to spell out what kind of actual, plausible sociohistorical process we can imagine leading to it.

Yeah, right, because that’s what I’ve always advocated, as opposed to advocating things like, when the fire truck pulls up at your house to save your loved ones’ lives, the fire captain in charge should have been picked by a colorblind process.

And when we’ve done that, then we have to specify something else: why that rather studied possibility is more urgent for us to devote our mental energy to than, well, quite a few other more pressing matters in this world as we know it.

Well, clearly, John McWhorter hasn’t been devoting much mental energy to this subject he keeps writing about, so he’s got that going for him.

But certainly McWhorter is correct that one individual can hardly have much influence just by being right on the social science and by advocating commonsensical colorblind policies based on the social science, when he can be smeared as a “racist” and “bigot” precisely for being right on the social science?

Apparently, McWhorter’s and Saletan’s working definition of a “racist” is a pundit who knows what the hell he’s talking about.

Slate’s “Human Nature” correspondent got such a beat-down from his friends when he said a few things in defense of James D. Watson in 2007 that he’s decided that it’s best just not to think about race anymore:

True, False, or Neither?
The perils of analyzing test scores by race.

By William Saletan

“‘No Child’ Law Is Not Closing a Racial Gap.” That’s the New York Times headline about a report issued this week on school test scores. The Times story begins:

The achievement gap between white and minority students has not narrowed in recent years, despite the focus of the No Child Left Behind law on improving the scores of blacks and Hispanics, according to results of a federal test considered to be the nation’s best measure of long-term trends in math and reading proficiency. Between 2004 and last year, scores for young minority students increased, but so did those of white students, leaving the achievement gap stubbornly wide, despite President George W. Bush’s frequent assertions that the No Child law was having a dramatic effect. Although Black and Hispanic elementary, middle and high school students all scored much higher on the federal test than they did three decades ago, most of those gains were not made in recent years, but during the desegregation efforts of the 1970s and 1980s.

The Times implies that the racial angle is important because it shows the No Child law failed. But the same angle is being touted by exponents of hereditary differences in intelligence. In fact, they’re quoting the Times story to validate their point. “NYT: NAEP Racial Gaps Haven’t Magically Disappeared,” says the headline at Steve Sailer’s blog, which serves as a headquarters for believers in “human biodiversity.” “Study after study, yet no one wants to introduce ol’ reliable Occam,” observes one commenter. Another cites a well-known paper on race, heredity, and IQ, asking: “Why don’t they read this—it explains a lot.”

The Washington Post, in its article about the test-scores report, doesn’t focus on race. ” ‘Nation’s Report Card’ Sees Gains in Elementary, Middle Schools,” says the Post headline. The article begins:

Math and reading scores for 9- and 13-year-olds have risen since the 2002 enactment of No Child Left Behind, providing fuel to those who want to renew the federal law and strengthen its reach in high schools. Performance on the National Assessment of Educational Progress, which offers a long view of U.S. student achievement, shows several bright spots. Nine-year-olds posted the highest scores ever in reading and math in 2008. Black and Hispanic students of that age also reached record reading scores, though they continued to trail white peers. But results released yesterday were disappointing for high school students. Seventeen-year-olds gained some ground in reading since 2004, but their average performance in math and reading has not budged since the early 1970s.

You can find the same information about the racial gap in this summary. But it isn’t the focus. It’s just one detail among many.

Why categorize and measure students by race?

Well, one reason is that the second paragraph of the No Child Left Behind legislation reads:

An Act
To close the achievement gap with accountability, flexibility, and choice, so that no child is left behind.

What is the “achievement gap”? In the NCLB’s Statement of Purpose, it says:

(3) closing the achievement gap between high- and low-performing children, especially the achievement gaps between minority and nonminority students, and between disadvantaged children and their more advantaged peers;

In 2001, Ted Kennedy and George W. Bush, on behalf of all right-thinking people everywhere, placed a bet with, more or less, me — as the public face of the tiny minority of despicable bad people who follow social science statistics the way George W. Bush (or Stephen Jay Gould) followed baseball statistics.

Now, the results are coming in on the bet and Saletan says we should stop counting them.

Saletan continues:

Aren’t there better ways to organize the data? “Lower-performing 9- and 13-year-olds make gains,” says one section of the NAEP report [PDF]. “No significant change for 17-year-olds at any performance level,” says another. “Reading scores improve for 9-year-old public and private school students over long term,” says a third. “Score increases for 17-year-olds whose parents did not finish high school,” says a fourth. These tables organize the data by factors that can help us target and adjust educational policy: kids with low scores, kids in public school, kids in high school, kids whose parents didn’t graduate. I’d like to see tables for income and spending per pupil, too.

It’s not hard to look up NAEP scores yourself. Here are the 2007 8th grade Reading scores broken down by race and income. White kids whose parents are so poor that they are eligible for the National School Lunch Program outscore affluent black kids by four points and affluent Hispanic kids by one point. The gap between poor whites and poor blacks is 19 points, and the gap between nonpoor whites and nonpoor blacks is 21 points. That’s what you normally get — sizable racial gaps any way you slice it. And, of course, the percent of poor blacks and Hispanics is higher, as you’d expect from their lower test scores, since the NAEP and the marketplace measure overlapping abilities.

But race? Does that category really help? And what message does it send to kids when headlines assert a persistent “racial gap”?

On this question, I’m in no position to throw stones. I’ve come to my cautionary view the hard way. Liberal creationists—people who think no genetically based difference can be admitted in average ability between populations—are mistaken. But that doesn’t make race a useful or socially healthy way of categorizing people.

Beware looking and settling for racial analysis when some other combination of categories—economics, culture, genetics—more accurately fits the data. As the NAEP coverage illustrates, that’s a warning worth heeding on the left as well as the right.

The reason people all over the world and of all different ideologies can’t help but be interested in race is that a racial group is, fundamentally, an extended family. So, race is about who your relatives are, which is an inherently interesting topic.

Saletan has been arguing that we should just group people by looking at one gene at a time. (Of course, on average, individual gene differences will tend to follow racial lines.) But, more fundamentally, what he doesn’t get is that racial groups have an existence independent of genetics. They are fundamentally genealogical entities–who begat whom. Unsurprisingly, when you stop and think about it, the genes tag along with the begats.

Although demographics obviously are the driving force in measures of student achievement, it is possible for one state to do a better job than another relative to what it has to work with in terms of student potential. One interesting way to analyze the value added performance of a state’s public schools is to compare 8th grade scores versus 4th grade scores on the National Assessment of Educational Progress. If a state improves from 4th to 8th grade relative to the rest of the country, this could be evidence that it is doing a good job of schooling (at least in the middle years).

The NAEP is also given to 12th graders, but those scores are distorted by the large number of dropouts.

There are lots of data on the NAEP site, so if anybody analyzes it, let me know.

The 2005 National Assessment of Educational Progress scores are now out for eighth grade Science, and the cutting edge state of California, home of Silicon Valley and Cal Tech but also of millions of illegal aliens, ranks second worst out of the 44 states measured, ahead of only Mississippi. In California, only 18% of eighth graders scored at the Proficient or Advanced levels, versus 27% nationwide.

Hawaii was third worst, and then came Alabama, New Mexico, Nevada, Louisiana, Arizona, Florida, and Texas.

The highest scoring state was North Dakota (with 43% scoring Proficient or Advanced), followed by Montana, Vermont, New Hampshire, South Dakota, Massachusetts, Wyoming, Minnesota, and Wisconsin.

As you may have noticed from eyeballing the data, the highest-scoring states don’t have much in common except they tend to be quite … well, northern (if you get my drift). In his obituary for Daniel Patrick Moynihan, George Will coyly wrote:


“The Senate’s Sisyphus, Moynihan was forever pushing uphill a boulder of inconvenient data. A social scientist trained to distinguish correlation from causation, and a wit, Moynihan puckishly said that a crucial determinant of the quality of American schools is proximity to the Canadian border. The barb in his jest was this: High cognitive outputs correlate not with high per-pupil expenditures but with a high percentage of two-parent families. For that, there was the rough geographical correlation that caused Moynihan to suggest that states trying to improve their students’ test scores should move closer to Canada.”


Sure, Dan and George, whatever you say! It must be playing hockey that makes you monogamous and thus smart.

The NAEP also reports the scores broken down by ethnicity. California’s non-Hispanic whites don’t do terribly well either, coming in 8th worst out of 44 states. The bottom of the white barrel is West Virginia, followed by Nevada, Mississippi, Hawaii, Alabama, Tennessee, Louisiana, and California.

The top scoring whites are found in Massachusetts, Colorado, North Dakota, Minnesota, Montana, New Jersey, South Dakota, Virginia, and Wisconsin.

The lowest-scoring Hispanics are in Rhode Island (Cape Verdeans? Brazilians?), California, Nevada, Arizona, Connecticut, Georgia, Washington, New Mexico, Utah, South Carolina, Illinois, and Texas. In general, the more Hispanics in a state, the worse they do, although Texas did fairly well.

The top Hispanics are found in Missouri, Wyoming, Ohio, Virginia, Arkansas, and Delaware. In other words, Hispanics score best when there aren’t many other Hispanics around.

The lowest scoring blacks are in Arkansas, Mississippi, Nevada, Alabama, and Florida.

The highest scoring blacks are in Washington state, Delaware, Virginia, Massachusetts, and Colorado.

Nationwide, whites outscore blacks by 37 points, Hispanics by 32, and Asians by 5.

Similarly, in the 2003 NAEP, in 8th grade Math, California came in 8th worst among the states; and in 8th grade Reading, it came in second worst.

Why do the U.S. Senate and the President want to push the rest of America further down the path being pioneered by California?

