In September 2008, Mann et al reported a “significant development” in paleoclimate reconstructions – a “skillful” reconstruction without tree ring data for over 1300 years.
A skillful EIV reconstruction without tree-ring data is possible even further back, over at least the past 1,300 years, for NH combined land plus ocean temperature (see SI Text). This achievement represents a significant development relative to earlier studies with sparser proxy networks (4) where it was not possible to obtain skillful long-term reconstructions without tree-ring data.
The story was widely covered at the time and the result has been relied upon to marginalize criticism of the reliance of IPCC multiproxy studies on strip bark bulges or tree ring chronologies developed by CRU. Now it turns out that the much vaunted claim to have a “validated” no-dendro reconstruction for the past 1300 years was merely an illusion.
Not only was it an illusion, but recent admissions by Gavin Schmidt show that it foundered on Mann’s much criticized use of the Tiljander sediments – a topic on which the seeming obtuseness of the climate science community to the simplest of issues (e.g. contamination by bridge and agricultural sediments) has mystified third parties over the past two years. Only last month, Schmidt had re-assured readers at Keith Kloor’s that Mann’s misuse of the Tiljander sediments didn’t “matter”. It turns out that it did.
Mann et al 2008
Let’s go back to Mann et al 2008. Claims for its no-dendro reconstruction featured prominently not just in the abstract and conclusions of the article itself, but in associated press releases and promotion.
The abstract to Mann et al 2008 stated:
Recent warmth appears anomalous for at least the past 1,300 years whether or not tree-ring data are used.
The running text stated:
For both methods, we perform reconstructions both with and without dendroclimatic proxies to address any potential sensitivity of our conclusions to issues that have been raised with regard to the reliability of tree-ring data on multicentury time scales (4, 11, 16, 19, 33, 34)…
A skillful EIV reconstruction without tree-ring data is possible even further back, over at least the past 1,300 years, for NH combined land plus ocean temperature (see SI Text). This achievement represents a significant development relative to earlier studies with sparser proxy networks (4) where it was not possible to obtain skillful long-term reconstructions without tree-ring data…
We place greatest confidence in the EIV reconstructions, particularly back to A.D. 700, when a skillful reconstruction as noted earlier is possible without using tree-ring data at all…
Recent warmth exceeds that reconstructed for at least the past 1,800 years in the EIV reconstructions, and this conclusion extends back at least 1,500 years without using tree-ring data.
The point was re-iterated in the caption to its Figure S6.
News Releases for Mann et al 2008
The no-dendro claim of Mann et al 2008 was heavily promoted by Penn State and by the media.
The Penn State news release stated
Results of this study without tree-ring data show that for the Northern Hemisphere, the last 10 years are likely unusually warm for not just the past 1,000 as reported in the 1990s paper and others, but for at least another 300 years going back to about A.D. 700 without using tree-ring data. The same conclusion holds back to A.D. 300 if the researchers include tree-ring data.
and continued
“Ten years ago, we could not simply eliminate all the tree-ring data from our network because we did not have enough other proxy climate records to piece together a reliable global record,” said Mann. “With the considerably expanded networks of data now available, we can indeed obtain a reliable long-term record without using tree rings.“
The no-dendro reconstruction was breathlessly reported at realclimate (a post subsequently used as authority by the recent Tamino post):
Now though, the Northern hemisphere land temperature reconstructions without tree rings can go back to 1500 AD or 1000 AD depending on which of two methodologies are used. For the NH land and ocean target, it’s even possible to get a coherent non-tree ring reconstruction back to 700 AD!
ENN reported the no-dendro results pretty much verbatim from the Penn State news release:
Results of this study without tree-ring data show that for the Northern Hemisphere, the last 10 years are likely unusually warm for not just the past 1,000 as reported in the 1990s paper and others, but for at least another 300 years going back to about A.D. 700 without using tree-ring data.
Numerous other news outlets and blogs picked up the “discovery”.
Criticism
Mann et al 2008 was covered in numerous contemporary Climate Audit posts. It was quickly discovered that the heavily publicized no-dendro reconstruction used Tiljander’s Lake Korttajarvi sediments despite warnings from Tiljander that the sediments had been heavily contaminated by modern construction and farming, making them totally unsuitable for inclusion in the Mann 2008 algorithm. The contamination was so severe that it resulted in Mann et al using the data upside-down relative to the climatic interpretation adopted by the original authors for the pre-contamination portion of the series.
The SI showed that Mann et al were aware of Tiljander’s caveats but used the contaminated sediments (upside-down) anyway, thereby compromising the no-dendro reconstruction. They purported to justify the inclusion of Tiljander sediments on the grounds that their inclusion didn’t “matter” because they could “get” a somewhat similar looking stick without the Tiljander sediments. The obvious question was – if they didn’t “matter”, then why use them, given the explicit caveats of the originating author? That question has never received an answer – only the excuse that the use of the compromised proxies didn’t “matter” – an excuse that is now known to be untrue given the “validation” failure of the no-dendro network without the Tiljander sediments.
In January 2009, a short comment by Ross and me on Mann et al was published by PNAS, a comment which included criticism of Mann’s use of the Tiljander sediments despite the caveats. In their response, Mann et al merely stated that our criticism of their use of the Tiljander proxies was “bizarre” and that their SI:
showed that none of our central conclusions relied on their use.
Given the central role of the claim to have a “validated” no-dendro reconstruction, this claim in their Reply now appears to be untrue.
Over the next year and a half, Mann’s use of the Tiljander sediments was a recurring issue on many blogs. A newcomer to the climate blogosphere (AMac), who entered with no particular preconceptions, became extremely frustrated with the obtuseness of the climate community to what seemed to be a black-and-white simple issue and challenged Mann’s defenders. It seemed obvious to him (and to me and to others) that Mann et al had erred in their usage of their proxy and that it was their responsibility to acknowledge and correct the error. AMac was persistent. The more that the “community” arm-waved, the more frustrated he got.
The issue was re-raised most recently at Keith Kloor’s Collide-a-scape in a lengthy thread where Gavin Schmidt argued that critics of Mann et al 2008 were refusing to listen, while Gavin’s critics countered that Gavin’s arguments didn’t make any sense, a position summarized by Lucia on June 18, 2010 with her characteristic lucidity as follows:
So let’s go back to Gavin’s closing complaint:
Thus what we have is not scientists refusing to engage with serious questions, it is the critics refusing to accept the answer.
What seems to have happened in comments here is a scientists gave what appears to be an answer so flawed that people of good faith could easily consider it to be flat out wrong. Critics refuse to accept the answer given by Gavin– a scientist– because the asnwer appears flat out wrong. People who support Gavin are suggesting the critics refusal to accept the answer somehow reflects badly on the critics. We await to see if Gavin returns to explain why his critics should not consider his answer to willis point (b) either flat out wrong or at best, highly misleading. Because currently, Gavin’s claim appears to be contradicted by the evidence he gave to support it.
A month later (July 22), Tamino inadvertently revived the unresolved dispute, by citing the Mann 2008 no-dendro reconstruction in his attack on Andrew Montford’s Hockey Stick Illusion as follows:
As a great deal of other research has shown, you can even reconstruct past temperature without bristlecone pine tree rings, or without any tree ring data at all, [linking to the RC post on Mann et al 2008 here ] resulting in: a hockey stick.
This was not the first time that a realclimate post had invoked the Mann et al 2008 no-dendro stick. They had done so in the Yamal controversy as well:
Oh. The hockey stick you get when you don’t use tree-rings at all (blue curve)? [again Citing http://www.realclimate.org/index.php/archives/2008/09/progress-in-millennial-reconstructions/]
Tamino’s post sparked renewed discussion of the no-Tilj no-dendro case. In response to a lengthy post at RC by Judy Curry, Gavin pointed to a figure in the SI to “Mann et al 2008” showing a no-Tilj no-dendro CPS reconstruction – not mentioning that the figure was not part of the original article, but a correction to the website posted only in Nov 2009 – after Montford’s book was finalized other than the Climategate portion. Gavin:
[Response: Absolutely untrue in all respects. No, really, have you even read these papers? There is no PCA data reduction step used in that paper at all. And this figure shows the difference between reconstructions without any tree ring data (dark and light blue) compared to the full reconstruction (black). (This is a modified figure from the SI in Mann et al (2008) to show the impact of removing 7 questionable proxies and tree ring data together)
This figure is at Mann's website here, where a change notice dated to November 4, 2009 says:
In the newly corrected figure, we have added the result for NH CPS without both tree-rings *and* the 7 potential "problem series." Each of the various alternative versions where these sub-networks of proxy data have been excluded fall almost entirely within the uncertainties of the full reconstruction for at least the past 1100 years, while larger discrepancies are observed further back for the reconstruction without either tree-ring data or the 7 series in question, owing to the extreme sparseness of the resulting sub-network. The new figure can be downloaded here (PDF)
Later in the day, Gavin reiterated at RC the claim that there was "no material difference" in results with or without the Tiljander sediments.
The Tiljander stuff is moot since the Mann et al (2008) paper showed both with and without and found no material difference [ in line to 171 on 24 July Comment by D. Robinson — 24 July 2010 @ 8:16 AM]
On July 25, I responded here to some of the issues in Tamino’s post, noting once again the circularity of the original no-dendro no-Tilj argument in an inline comment. The point continued to be contested through the thread, including the following inline comment to Phil Clarke on July 28 (who was also commenting at realclimate):
In November 2009, just before Climategate, Mann placed a non-Tiljander non-dendro reconstruction on his website. He did not issue a Corrigendum at PNAS nor did he publish a notice of the new information at realclimate. That Mann did so in late 2009 long after the fact did not refute the claim in respect to Mann et al PNAS 2008. It’s very misleading for Gavin to pretend that a website addition in November 2009 was part of the corpus of Mann et al 2008, that should have been considered in CA commentary on Mann et al 2008 in late 2008 (which was what Montford was reviewing).
This comment was posted at RC by Judy Curry on July 28 occasioning a retort from Gavin who falsely accused Montford of attributing motives to Mann’s failure to present a no-Tilj no-dendro combination (Montford had observed the circularity and had not speculated as to motive). In his inline comment, Schmidt made what, to my knowledge, is the first public notice that the Mann et al 2009 (Science) SI (not Mann et al 2008 PNAS) had reported that the no-dendro reconstruction without the contaminated sediments did not verify prior to 1500. In other words, announcing the demise of the much vaunted “validated” no-dendro reconstruction back for 1300 years. Gavin:
[Response: Pure spin. The additional graph was posted because of inaccurate claims that there was something wrong with the no-dendro reconstruction because of the inclusion of the already-acknowledged-to-be-problematic Tiljander proxies. The sensitivity studies in the original paper didn't include that the no-dendro/no-Tiljander combination but that does not justify the claims made by Montford that such a combination was impossible or was not included because it undermined the results. Indeed, you can do a no-dendro and no-Tiljander reconstruction with the code that was posted with Mann et al (2008), and that was what was added to the figure I showed. Montford was apparently happy to make up results and conclusions in late 2008 that were just not justified, and for this you give him a pass? Curious. For further information, the no-dendro/no-Tiljander sensitivity test is also part of the SI in Mann et al (2009) (figure S8), where it is noted that it doesn't validate prior to 1500 AD. Of course if you remove all data that is imperfect, you will end up with no results. But as Salzer et al point out, there is likely to be useful climate information in the tree rings so I wouldn't throw them out unnecessarily. - gavin]
The sentence conceding that the no-dendro/no-Tiljander test “doesn't validate prior to 1500 AD” should have caught everyone’s attention, but it was mixed in with a number of other contentious issues and passed without further comment. But it was apparently weighing on Gavin’s mind as he referred to the point again yesterday and this time it was noticed. Phil Clarke had recapped (rather disparagingly) my analysis of the “frozen” AD1000 network at RC, prompting Gavin to once again mention the Mann et al 2009 SI and the “validation” failure of the no-dendro reconstruction prior to 1500:
[Response: It's also worth spelling out some of McIntyre's thimble hiding here. First off, after a 7 years you'd think that he would be aware that the reconstructions are done in a step-wise fashion - i.e. you use as much information as is available as far back as you can. Back to 1500 you use everything that goes back that far, back to 1400 a little less etc. So a proper no-dendro/no-Tijl reconstruction will not just be made with what is available in 1000AD. Second, given all of the bluster about validation statistics, he never seems to compute any. Since the no-dendro CPS version only validates until 1500 AD (Mann et al (2008) ), it is hardly likely that the no-dendro/no-Tijl CPS version will validate any further back, so criticising how bad the 1000 AD network is using CPS is hardly germane. Note too that while the EIV no-dendro version does validate to 1000 AD, the no-dendro/no-Tijl only works going back to 1500 AD (Mann et al, 2009, SI). So again, McIntyre is setting up a strawman, not performing any 'due diligence' and simply making stuff up - all in order to demonstrate some statistical prestidigitation to the adoring commenters. - gavin]
This time, the point wasn’t missed. A few minutes later (comment 529, July 31), Nicolas Nierenberg asked Gavin to confirm the surprising information that the no-dendro reconstruction did not validate prior to AD1500:
Gavin, So just to be clear with regard to your response to 525. Under either method (CPS or EIV) it is not possible to get a validated reconstruction to before 1500 without the use of tree rings, or the Tijlander sediments. I understand, of course, that as you remove proxies that the ability to project backward will naturally diminish.
[Response: That appears to be the case with the Mann et al 2008 network. Whether you can say more general things about medieval times using these and other proxies (cf osborn and briffa 2006) is another question. -gavin]
Read that again slowly:
Under either method (CPS or EIV) it is not possible to get a validated reconstruction to before 1500 without the use of tree rings, or the Tijlander sediments.
How many times had Gavin and others said that Mann’s use of Tiljander sediments didn’t “matter”? And now we learn that, without the contaminated sediments, 800 years of “validation” are eliminated. Gavin petulantly tried to close off the issue by saying that “the exact level of the medieval warmth is not a very interesting scientific question”, which was not the position taken in September 2008 when they were issuing press releases about the no-dendro reconstruction.
Mann et al 2009 (Science) was published in November 2009, just before Climategate, and hasn’t been discussed here. In a way, I’d sort of presumed (prematurely, it seems) that people had stopped taking this sort of article in Science (or Nature) seriously.
There was nothing in the text of Mann et al 2009 that stated or even hinted that claims in Mann et al 2008 on the validation of their non-dendro reconstruction were conceded to be no longer valid. Nor did they issue a Corrigendum for Mann et al 2008 at PNAS where the no-dendro claim had actually been made. Nor was the withdrawal of the claim to have a 1300-year validated no-dendro reconstruction reported at the Mann et al 2008 website. Nor were there any press releases withdrawing the claim of a “validated” no-dendro reconstruction with equal prominence to the original press release. However, buried in the SI to Mann et al 2009 was the following admission (with a similar caption to their SI Figure S8):
In addition to the tests described by ref. S1 which removed alternatively (a) all tree-ring data or (b) 7 additional long-term proxy records associated with greater uncertainties or potential documented biases (showing the temperature reconstruction was robust to removal of either of these datasets), we here removed both data sets simultaneously from the predictor network (Fig. S8). This additional test reveals that with the resulting extremely sparse proxy network in earlier centuries, a skillful reconstruction is no longer possible prior to AD 1500. Nonetheless, even in this case, the resulting (unskillful) early reconstruction remains almost entirely within the estimated error bounds of the original reconstruction.
Ironically, attention to the no-dendro reconstruction was revived because of attacks by Tamino and Gavin Schmidt on Andrew Montford’s summary of the original CA discussion of Mann et al 2008 and the Tiljander sediments, a summary that is worth re-reading in light of recent admissions:
It turned out that the twentieth century uptick in Tiljander’s proxies was caused by artificial disturbance of the sediment caused by ditch digging rather than anything climatic. Mann had acknowledged this fact, but then, extraordinarily, rather than reject the series, he had purported to demonstrate that the disturbance didn’t matter. The way he had done this was to perform a sensitivity analysis, showing that you still got a hockey stick without the Tiljander proxies.
Great care is needed when reading scientific papers, particularly in the field of paleoclimate, and this was one of the occasions when one could have come away with an entirely wrong impression if the closest attention had not been paid. The big selling point of Mann’s new paper was that you could get a hockey stick shape without tree rings. However, this claim turned out to rest on a circular argument. Mann had shown that the Tiljander proxies were valid by removing them from the database and showing that you still got a hockey stick. However, when he did this test, the hockey stick shape of the final reconstruction came from the bristlecones. Then he argued that he could remove the tree ring proxies (including the bristlecones) and still get a hockey stick – and of course he could, because in this case the hockey stick shape came from the Tiljander proxies. His arguments therefore rested on having two sets of flawed proxies in the database, but only removing one at a time. He could then argue that he still got a hockey stick either way.
As McIntyre said, you had to watch the pea under the thimble.
Yup.
Roman M and TomRude have observed an interesting letter writing campaign in which Michael Mann contests adverse opinion in provincial newspapers, accusing the letter writers of being “parrots”.
Today (July 31, 2010), Mann sent a letter to the Saint John (New Brunswick) Telegraph Journal objecting to a letter published July 30. Similar letters were sent on July 22 to the Fredericton (New Brunswick) Daily Gleaner and on July 29 to the Minneapolis Star Tribune.
Mann’s July 31 letter to the Telegraph Journal read as follows:
A letter published July 30 did a grave disservice to your readers by making false and defamatory statements about me and other climate scientists. It repeats false allegations (based on illegally hacked emails) of supposed scientific misconduct (e.g. the supposed destruction of data) that have now been rejected as false by three separate investigations in the U.K.
A similar investigation by my university has exonerated me of any of the wrongdoing alleged by climate change deniers. Unfortunately, these exonerations cannot stop individuals from repeating the false allegations.
The writer parrots the false claim that I have advised colleagues “to isolate and ignore scientific journals that publish the views of the global-warming skeptics.” This claim is based on a thorough misrepresentation of a single example: a deeply flawed paper in 2003 published by the journal “Climate Research” by Willie Soon & Sallie Baliunas claiming that recent warming is not unusual.
I did in fact have concerns about the paper and the process that led to its publication. The journal’s editor-in-chief Hans Von Storch found that the paper “was flawed” and “shouldn’t have been published” and half the editorial board quit in protest of its publication.
Climate change deniers object to the term, using instead “skeptic” to describe those who deny the overwhelming evidence of human-caused climate change. “Skepticism” is a good thing in science. But when it is applied in only one direction it is not “skepticism” at all, but indeed, denial.
It is ironic that scientists (including myself) are accused of dishonesty. It is those who spread false information about science and scientists – whether knowingly, or parroting the disinformation of others – who do the greatest harm to the public discourse on vital issues such as climate change.
MICHAEL E. MANN
Professor, Dept. of Meteorology, Penn State University
Director, Penn State Earth System Science Center
On July 22, 2010, Mann wrote the Fredericton Daily Gleaner (in New Brunswick, Canada):
Re: Science and truth
In a piece published in your paper July 20, you allowed Thaddee Renault to do a grave disservice to your readers by making false statements about me and other climate scientists.
Mr. Renault repeats allegations (based on illegally hacked emails) of supposed scientific misconduct by scientists at the Climatic Research Unit of the University of East Anglia that have now been rejected as false by three separate investigations in the U.K. A similar investigation by my university has exonerated me of any of the wrongdoing alleged by climate-change deniers.
Unfortunately, these exonerations can’t stop individuals such as Mr. Renault from repeating the false allegations.
Mr. Renault parrots the false claim that I have advised colleagues “to isolate and ignore scientific journals that publish the views of the global-warming skeptics.” His claim is based on a misrepresentation of a single example: a flawed paper in 2003 published by the journal Climate Research by Willie Soon and Sallie Baliunas claiming that recent warming isn’t unusual.
I did have concerns about the paper and the process that led to its publication. As the Wall Street Journal reported, this study, funded by the fossil-fuel industry, was heavily criticized by a large number of other scientists. The journal’s editor-in-chief Hans Von Storch found that the paper “was flawed” and “shouldn’t have been published.”
Mr. Renault objects to the term “climate-change denier” to describe him and his fellow travellers, favouring instead to be called a skeptic. Skepticism is a good thing in science, but when it’s applied in only one direction (i.e. to question all scientific evidence of the reality of climate change), it’s not skepticism at all, but denial.
Readers interested in the truth behind the science, rather than the falsehoods and smears perpetuated by people such as Mr. Renault, should consult the scientist-run website www.realclimate.org or scientifically based books on the topic such as my Dire Predictions: Understanding Global Warming.
It’s ironic that Mr. Renault accuses scientists of wrongdoing. It’s those such as Mr. Renault who spread false information about science and scientists – whether knowingly or by simply parrotting the disinformation of others – who do the greatest harm to the public discourse on vital issues such as climate change.
Michael Mann, director
Penn State Earth System Science Center
University Park, Penn.
On July 29, Mann wrote the Minneapolis Star Tribune in response to a letter here:
In “Warming alarmists can’t stand the heat” (July 26), the Star Tribune allowed Peter J. Havanac to do a grave disservice to its readers by making false statements about me and other climate scientists.
Havanac repeated false allegations (based on illegally hacked e-mails) of supposed scientific misconduct by scientists at the Climatic Research Unit of the University of East Anglia (for example, the supposed destruction of e-mails) that have now been rejected as false by three separate investigations in the U.K. A similar investigation by my university has exonerated me of any of the wrongdoing alleged by climate-change deniers like Havanac. Unfortunately, these exonerations cannot stop individuals like Havanac from repeating the false allegations. Only the possession of decency can do that.
Havanac parroted the false claim that I sought to “undermine” a journal that “contradicted views held by … global-warming alarmists.” His claim was based on a thorough misrepresentation of a single example: a deeply flawed paper published in 2003 by the journal Climate Research. That paper, by Willie Soon and Sallie Baliunas, claimed that recent warming is not unusual.
I did in fact have concerns about the paper and the process that led to its publication. As the Wall Street Journal reported (“Global warming skeptics are facing storm clouds,” July 31, 2003), this fossil-fuel-industry-funded study was heavily criticized by a large number of other scientists. The editor-in-chief of Climate Research, Hans Von Storch, found that the paper “was flawed” and “shouldn’t have been published.”
Other editors at Climate Research (see “Storm brews over global warming,” Chronicle of Higher Education, Sept. 5, 2003) felt that the editor who had handled the Soon and Baliunas paper had been gaming the system to allow through substandard papers simply because they expressed a contrarian viewpoint regarding climate change. Ultimately, both Von Storch and half of the editorial board quit in protest over the apparent corruption of the peer review process at the journal.
Havanac objects to the term “climate-change denier” to describe him and his fellow travelers. Perhaps he prefers to think of himself as a “skeptic” instead? Well, skepticism is a good thing in science. But when it is applied in only one direction (that is, to reject all evidence of climate change while uncritically accepting transparently flawed arguments against it), it is not skepticism at all, but indeed, denial.
Readers interested in the truth behind the science, rather than the falsehoods and smears perpetuated by people like Havanac, should consult the scientist-run website realclimate.org or scientifically based books on the topic like my “Dire Predictions: Understanding Global Warming.”
It is ironic that Havanac accuses climate scientists of dishonesty. It is those who spread false information about science and scientists – whether knowingly, or by simply uncritically parroting the disinformation of others – who do the greatest harm to the public discourse on vital issues such as climate change.
NASA blogger Gavin Schmidt, as part of his ongoing attempt to rehabilitate Mannian paleoclimate reconstructions (characterized here as dendro-phrenology), has drawn attention to a graphic posted up at Mann’s website in November 2009. In this graphic, Mann responded to criticisms that his “no-dendro” stick had been contaminated by bridge-building sediments despite warnings from the author (warnings noted by Mann himself, although the contaminated data was used anyway). I’ll show this figure at the end of the post, but first I’m going to show the “raw materials” for this “reconstruction” and my results from the same data.
I’m going to show a lot of plots of “proxies” today. The intuitive idea of a proxy is that the thing being measured (tree ring width, sediment thickness, ice core O18, etc) has a linear relationship with a temperature “signal” plus low-order red noise. Therefore, if the temperature “signal” is a hockey stick, the various proxy plots should look like a hockey stick plus low-order red-noise. I encourage readers to look at the no-dendro no-Tilj data for Mann’s November 2009 example with that in mind. If the topics were being discussed by proper statisticians, the properties of the “noise” would be discussed, rather than ignored.
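To make the “signal plus low-order red noise” idea concrete, here is a minimal sketch of what a proxy would look like if that model held. It is an illustration only, not anything from Mann et al 2008: the hockey-stick signal, the scale and offset, and the AR(1) parameters are all made-up assumptions, and the function names are hypothetical.

```python
import random

def ar1_noise(n, rho, sd, rng):
    """Low-order red noise: e[t] = rho * e[t-1] + white noise with std dev sd.
    The first value is a plain white-noise draw."""
    e = [rng.gauss(0.0, sd)]
    for _ in range(n - 1):
        e.append(rho * e[-1] + rng.gauss(0.0, sd))
    return e

def pseudo_proxy(signal, scale, offset, rho, sd, rng):
    """A linear transformation of the signal plus AR(1) noise."""
    noise = ar1_noise(len(signal), rho, sd, rng)
    return [offset + scale * s + e for s, e in zip(signal, noise)]

def correlation(x, y):
    """Pearson correlation, written out to keep the sketch dependency-free."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

# Assumed signal: a flat 900-year shaft followed by a 100-year blade.
signal = [0.0] * 900 + [i / 100 for i in range(100)]
rng = random.Random(0)
proxy = pseudo_proxy(signal, scale=2.0, offset=1.0, rho=0.3, sd=0.05, rng=rng)
print(round(correlation(proxy, signal), 3))
```

Under this model every proxy should visibly resemble the signal, differing only in scale, offset and noise. That is the yardstick to keep in mind when looking at the plots.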
To illustrate the calculation, I’ve picked the AD1000 Mann 2008 data set as an example since it covers the MWP. I’ve used the late-miss version (calibration 1859-1949) to work through, since it will give a look at any potential “divergence problems” in non-dendro data.
There were 29 “proxies” in the data set: 11 sediments, 2 “documentary” (both Chinese), 9 speleo and 7 ice core. Eleven of these were annually resolved; the other 18 were “decadal” resolution. 22 were NH; 7 SH.
The first step in Mann’s algorithm is determining the orientation of speleo and documentary proxies through their after-the-fact correlation to instrumental data. (The orientation of other proxies is presumed to be known a priori). In this network, there were 11 speleo+documentary proxies and 5 of 11 were flipped. (Interestingly, it is possible in Mann’s algorithm for the same proxy to have opposite “significant” orientations depending on the calibration period.)
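The point about orientation depending on the calibration period can be shown with a toy example. This is a sketch under stated assumptions, not Mann’s code: the temperature and proxy series are invented, the two calibration windows are illustrative, the `orientation` helper is hypothetical, and significance testing is left out entirely.

```python
def correlation(x, y):
    """Pearson correlation, written out to keep the sketch dependency-free."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def orientation(proxy, temperature, years, start, end):
    """Return +1 or -1: the sign of the proxy/temperature correlation
    over the calibration window start..end (inclusive)."""
    idx = [i for i, yr in enumerate(years) if start <= yr <= end]
    r = correlation([proxy[i] for i in idx], [temperature[i] for i in idx])
    return 1 if r >= 0 else -1

years = list(range(1850, 1996))
temperature = [0.005 * (yr - 1850) for yr in years]      # steadily rising target (made up)
proxy = [-((yr - 1922) ** 2) / 1000.0 for yr in years]   # rises until 1922, then declines (made up)

early = orientation(proxy, temperature, years, 1850, 1949)
late = orientation(proxy, temperature, years, 1896, 1995)
print(early, late)  # prints: 1 -1
```

The same series is oriented right-side up in one window and upside down in the other, purely because of where the window sits relative to the hump in the data.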
The next step is to screen out proxies that do not have a “significant” correlation to gridcell temperature. Although we’ve heard much invective against the meaningful of r^2 statistics from Mann, Schmidt and others in the context of MBH98, Mann then uses correlation (r) to screen series in Mann et al 2008. (Perhaps it is the squaring of the correlation statistic that Schmidt takes exception to.)
There were 16 proxies that “passed” Mannian significance: – 3 of 11 sediments, both “documentary (Chinese), 7 of 9 speleo and 4 of 7 ice cores. Seven of 11 annually resolved passed; nine of 18 decadally resolved passed. 12 of 22 NH passed; 4 of 7 SH passed.
In the figure below, I’ve plotted all 22 NH “proxies” (standardized), coloring the “rejected” proxies in green. I don’t think that anyone can reasonably look at these 22 series and say that the individual “proxies” can be reasonably interpreted as different linear transformations of a Hockey Stick plus low-order AR1 red noise or that the individual proxies look much like one another. They are a hodge-podge to say the least. This is the problem of proxy inconsistency that I’ve talked about frequently and that Ross and I reported in our comment at PNAS on Mann 2008. Mann either didn’t understand or pretended not to understand the problem, which is fundamental to the entire enterprise of proxy reconstructions and readily apparent merely by plotting the “proxies”.
While “ex post screening” by correlation is accepted as a given by realclimatescientists, ex-post screening by correlation is not a statistical procedure that is recommended or discussed in Draper and Smith or standard statistical texts. The tendency of this procedure to produce sticks from red noise is well known in the technical blogosphere (Jeff Id, David Stockwell, Lubos Motl and myself have all more or less independently noticed and reported the phenomenon, with David publishing a short note in an Australian mining newsletter that Ross and I cited in our PNAS comment). However, professional climate scientists appear unaware of the effect and it remains unreported in the PeerReviewedLiterature.
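The screening effect is easy to reproduce with a toy simulation. The Python sketch below generates AR1 red noise that contains no signal at all, keeps only the series that correlate with a rising “instrumental” target over a short calibration window, and averages the survivors. Every parameter here (series length, persistence, threshold, the linear target) is an assumption chosen for illustration; none is taken from Mann et al 2008.

```python
import random
from math import sqrt

def pearson_r(x, y):
    """Plain Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def ar1_series(n, rho, rng):
    """Red noise: x[t] = rho * x[t-1] + white noise."""
    x = [rng.gauss(0.0, 1.0)]
    for _ in range(n - 1):
        x.append(rho * x[-1] + rng.gauss(0.0, 1.0))
    return x

def composite(series_list):
    """Simple average across series, year by year."""
    return [sum(vals) / len(vals) for vals in zip(*series_list)]

rng = random.Random(0)
N_YEARS, N_CALIB, N_SERIES = 500, 100, 200  # hypothetical sizes
RHO, R_MIN = 0.9, 0.3                       # hypothetical persistence and threshold

# Stand-in "instrumental" record: a plain upward trend in the calibration window.
target = [i / N_CALIB for i in range(N_CALIB)]

noise = [ar1_series(N_YEARS, RHO, rng) for _ in range(N_SERIES)]

# Ex post screening: keep series whose calibration-period correlation passes.
passed = [s for s in noise if pearson_r(s[-N_CALIB:], target) > R_MIN]

r_all = pearson_r(composite(noise)[-N_CALIB:], target)
r_screened = pearson_r(composite(passed)[-N_CALIB:], target)
print(len(passed), "of", N_SERIES, "passed")
print("composite r, unscreened:", round(r_all, 2))
print("composite r, screened:  ", round(r_screened, 2))
```

The screened composite must track the target in the calibration window, since every member was picked for doing so, while before the window the noise averages toward a flat shaft. That is the stick.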
The top left proxy (192) is an interesting one. It is Baker’s speleothem record from Scotland that was discussed at CA in early 2009 and here as an interesting example of Upside-Down Mann. In the orientation applied in Mann’s no-dendro no-Tiljander reconstruction endorsed by Gavin Schmidt, Scotland is shown as having experienced the unique phenomena of the Medieval Cold Period and Little Warm Age – bizarro Hubert Lamb, as it were.
The “proxies” show little evidence of an overall pattern, let alone a Stick.
Figure 1. 22 NH No-Dendro No-Tilj Proxies in M08 AD1000 network, rejected in green.
Next, here is a summary plot of the 12 NH “proxies” that “pass” Mannian screening, this time with flipped proxies shown in red. The top left proxy is still the speleothem with the Scottish Medieval Cold Period and Little Warm Age. These are the same series as the accepted proxies in the graphic above. The proxy with the hockey stick shape here is Fisher’s Agassiz, Ellesmere Island melt series, a proxy which has been around for a long time, used in Bradley and Jones 1993, for example.
Figure 2. 12 “Passing” NH No-Dendro No-Tilj Proxies in M08 AD1000 network, flipped in red.
In the next step in Mann 2008 CPS, the series are Mann-smoothed (Butterworth filter plus Mann endpoints). The smoothed series are then re-standardized on the (short) calibration period. The smoothing of the ternary series in the third column (a Chinese documentary series) has an interesting effect.
Figure 3. 12 “Passing” NH No-Dendro No-Tilj Proxies in M08 AD1000 network, smoothed and re-scaled on (short) calibration period.
The proxy series are then averaged within a gridcell. You’ll notice that some gridcells are identical. This results because the Mann algorithm contains what I called (in 2008) a “stupid pet trick” – if Mann transcribed the location of a proxy as being exactly on the border of a gridcell (e.g. 25E), the proxy is allocated to both gridcells, in effect doubling the weight of the proxy. In the case of the Socotra stalagmite, the stalagmite is not actually located at 25E and the doubling occurs only because of a transcription error – not that the doubling makes any sense in the first place.
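The border effect can be shown with a short Python sketch. It assumes 5-degree gridcells and a membership test that treats cell edges as inclusive on both sides; that is one way to get the doubling described above, not a transcription of Mann's code. The function name and coordinates are hypothetical.

```python
import math

def cells_containing(lon, lat, size=5.0):
    """Return the south-west corners of every cell whose closed bounds
    contain the point. A point exactly on a cell edge lands in two cells
    (or four, if it sits on a corner)."""
    def starts(x):
        lo = math.floor(x / size) * size
        out = [lo]
        if x == lo:
            # Exactly on a boundary: the neighbouring cell also "contains" it.
            out.append(lo - size)
        return out
    return [(a, b) for a in starts(lon) for b in starts(lat)]

# Hypothetical proxy transcribed exactly on a cell border (25E).
on_border = cells_containing(25.0, 12.5)
# Hypothetical proxy located inside a cell.
interior = cells_containing(54.0, 12.5)

print(on_border)  # two cells, so the proxy is counted twice
print(interior)   # one cell
```

With a half-open test (west edge inclusive, east edge exclusive) each proxy would land in exactly one cell and the double weighting would disappear.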
The gridded data are then re-centered and re-scaled to match the mean and standard deviations of the corresponding gridcell instrumental data – thereby yielding an estimate of the gridcell temperature. The 12 NH gridcells are shown below.
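The re-centering and re-scaling step is simple arithmetic. The Python sketch below matches the mean and standard deviation of a gridcell composite to those of the instrumental series over the calibration period and applies the same transformation to the whole series. The function name, the choice of population standard deviation and the toy numbers are assumptions made for illustration.

```python
from statistics import mean, pstdev

def cps_scale(composite, instrumental, n_calib):
    """Re-center and re-scale `composite` so that its last `n_calib` values
    have the same mean and standard deviation as `instrumental`."""
    if len(instrumental) != n_calib:
        raise ValueError("instrumental must cover the calibration period")
    calib = composite[-n_calib:]
    m_c, s_c = mean(calib), pstdev(calib)
    m_i, s_i = mean(instrumental), pstdev(instrumental)
    return [(v - m_c) / s_c * s_i + m_i for v in composite]

# Hypothetical toy data: six "years" of composite, three of instrumental overlap.
composite = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
instrumental = [0.2, 0.4, 0.6]

scaled = cps_scale(composite, instrumental, 3)
print([round(v, 2) for v in scaled])
```

Nothing in this step tests whether the composite has anything to do with temperature; it only guarantees that the calibration-period mean and variance match.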
Figure 4. 12 NH Gridcells from averaging Mannian proxies
Of the resulting 14 gridcells, 8 are north of 30N. Mann attempts to balance the weights through an odd Mannian mechanism of re-gridding the north of 30N cells into 10×10 cells, averaging the data within each gridcell. This reduces the number of gridcells from 14 (NH – 12) to 10 (NH – 8). The series with the Scottish Little Warm Age survives these various operations pretty much unscathed. These again are a sort of temperature estimate.
Figure 5. Eight re-gridded NH gridded series.
Mann then does a weighted average of the gridcells – weighting each by the cos (latitude) – to yield a NH (and SH) estimate.
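The cos(latitude) weighting is just an area weight: cells nearer the pole cover less ground and count for less. A minimal Python sketch with hypothetical gridcell values and latitudes:

```python
from math import cos, radians

def coslat_mean(values, lats_deg):
    """Average of gridcell values, each weighted by cos(latitude)."""
    weights = [cos(radians(lat)) for lat in lats_deg]
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)

# Hypothetical: one cell on the equator (weight 1.0), one at 60N (weight 0.5).
print(round(coslat_mean([0.0, 3.0], [0.0, 60.0]), 6))
```

An unweighted mean of the same two cells would be 1.5; the weighting pulls the result toward the low-latitude cell.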
The figure below shows the No-dendro No-Tilj for the AD1000 network, using Mannian methods endorsed by Gavin Schmidt.
Figure 6. Emulation of Mann 2008 No-Tilj No-Dendro Reconstruction.
Now here is the version at Mann’s website, which looks nothing like my emulation with the 29 proxies (16 screened) from the AD1000 no-Tilj no-dendro network.
Figure 7. Mann Notilj No-dendro reconstruction.
What accounts for the difference? I’m pretty sure that this calculation is pretty close to the M08 calculation for the corresponding step. I’ve groundtruthed my R-emulation against Matlab intermediates calculated by UC and Jean S in 2008. Because the CPS calculations are, at the end of the day, weighted averages, the composite is going to bear some relationship to the proxies – hence the methodical plotting of intermediates at each step to benchmark the calculation. So while there’s always the possibility of a misstep in emulating Mannian calculations, I don’t see how such a misstep would alter the general shape of the AD1000 CPS calculation (since the general shape can be discerned in the average at each stage.)
Here’s where I think the difference lies. Mann’s graphics all show the results of spliced reconstructions rather than what you get with proxies going back to AD1000. The provenance of the network used in Mann’s November 2009 revision of a figure in his SI isn’t described as clearly as it might be. My interpretation of the figure is that the network includes 71 Luterbacher gridded European series which use instrumental temperature data.
It is my surmise that in its latter portion, the stick-ness of the “new” no_tilj no-dendro reconstruction derives from splicing the Luterbacher gridcell data (using instrumental data) onto the horrible no-dendro reconstruction. I’m not 100% sure of this, but that’s my surmise. I’ll experiment with the splicing steps on another occasion.
Make a stick, make a stick, Michael Mann
Make us a stick as only you can
Flip it and smooth it and pick it to be
In the report for IPCC.
Update Aug 1, 2010: Script is http://www.climateaudit.info/scripts/mann.2008/benchmark_manniancps_blog_20100730.txt . I added a couple of operations at the end to calculate the CPS from the composite shown in the post and to calculate verification stats. The script here is used to step through; it is wrapped in a function manniancps that reconciles perfectly through the regrid and very closely to the composite.
A news release on a new tree ring study here (h/t Anthony Watts) reported a reconstruction maxing out in the mid-20th century, with the characteristic late 20th century divergence problem. Their results contrast with CRU’s notorious Yamal chronology:
Following the summer temperature reconstruction on the Kola Peninsula, the researchers compared their results with similar tree-ring studies from Swedish Lapland and from the Yamal and Taimyr Peninsulas in Russian Siberia, which had been published in Holocene in 2002. The reconstructed summer temperatures of the last four centuries from Lapland and the Kola and Taimyr Peninsulas are similar in that all three data series display a temperature peak in the middle of the twentieth century, followed by a cooling of one or two degrees. Only the data series from the Yamal Peninsula differed, reaching its peak later, around 1990. What stands out in the data from the Kola Peninsula is that the highest temperatures were found in the period around 1935 and 1955, and that by 1990 the curve had fallen to the 1870 level, which corresponds to the start of the Industrial Age. Since 1990, however, temperatures have increased again evidently.
Although the reconstruction declined since mid-20th century, the sub-headline reads: “New data indicate rapid temperature rise in the coldest region of mainland Europe”.
The EPA, as expected, has denied the various petitions for reconsideration of their Endangerment Finding. They refer to the various “inquiries” on some points. Interesting reading here
http://epa.gov/climatechange/endangerment/petitions.html
Self-described “Hansen bulldog” Tamino, writing at NASA’s realclimate blog hosted by Hansen’s other bulldog (Gavin), wrote:
As another example, Montford makes the claim that if you eliminate just two of the proxies used for the MBH98 reconstruction since 1400, the Stahle and NOAMER PC1 series, “you got a completely different result — the Medieval Warm Period magically reappeared and suddenly the modern warming didn’t look quite so frightening.” That argument is sure to sell to those who haven’t done so. But I have. I computed my own reconstructions by multiple regression, first using all 22 proxy series in the original MBH98 analysis, then excluding the Stahle and NOAMER PC1 series.
As always with the Team, you have to watch the pea under the thimble. Tamino has totally misrepresented and misinterpreted Montford on this point. Neither Montford (nor I) ever made such an assertion. The only person to do so, as I’ll show below, was Mann himself.
In our 2005 (EE) article where we analyzed the various permutations and combinations, our concern wasn’t with the Stahle-NOAMER PC1 pair, but with the Gaspe-NOAMER PC1 pair. In our 2005 article, we closely examined both the Gaspe and bristlecone proxies – believing then, as now, that if these series had unique capability to interpret world climate fields, then readers should be enabled to learn as much as possible about the unique characteristics of these groves.
The Stahle-NOAMER PC1 combination wasn’t mentioned in our articles. Nor did we (or Montford) present the particular sensitivity combination that Tamino now purports to rebut. This combination arose not in our analyses, but in Mann’s own analyses.
In their November 2003 response to MM2003, Mann presented a graphic that showed elevated early 15th century values with variations of three series (1) no NOAMER PC1; (2) no Stahle PC1 and (3) the shorter archived version of Twisted Tree, rather than the longer grey version used in MBH98 (in which early portions did not have the usual minimum numbers of trees.)
Mann et al 2003, Figure 1. The three datasets in the caption are the NOAMER PC1, the Stahle PC1 and Twisted Tree.
As Montford accurately reports in HS Illusion, I was extremely interested in this particular graphic because Mann et al themselves, in effect, conceded that the differences arose out of only a few series. Montford described this as follows:
Mann may well have felt that he had done enough to fend off McIntyre’s criticisms but McIntyre’s perspective was quite different. Without realising that he’d done it, Mann had inadvertently shone a little light on another murky corner of his famous paper. To McIntyre, what made Mann’s response most interesting was not the fact that Mann had used an undisclosed methodology, but the fact that if you left out just two of the proxy series – the Stahle and NOAMER PC1s – you got a completely different result – the Medieval Warm Period magically reappeared and suddenly the modern warming didn’t look quite so frightening. What this meant was that Mann’s result – that the Medieval Warm Period didn’t exist – seemed to rest on just a tiny fraction of his data. The rest of the series were just ‘noise’. Mann may well have been justified in using a stepwise procedure, but if his conclusions depended on just two PC series, then they could hardly be considered robust.
Note Tamino’s selective quotation from Montford’s book. Montford was describing my reaction to Mann’s 2003 response to MM2003. Neither Montford (nor I) claimed that Mann’s calculation in his November 2003 response was correct. Montford described the impact of Mann’s calculation on me. At the time, we hadn’t isolated the precise difference between our calculation and Mann’s calculations. While Mann attempted at the time – mostly successfully in the climate science community – to distract attention to replication details involving unreported aspects of their methodology (an experience which informs some of my present procedures in dealing with these guys), his own diagram showed me that the differences arose from only a few series. Which we proceeded to analyse in detail.
The Twisted Tree series was quickly seen to be moot as it did not come into play in the AD1400 step. (Mann’s defence of his version was unconvincing to say the least. Our comparison used the archived version which did not go back as far as the grey version. The early portion of the grey version included periods with less than the minimum number of cores for a chronology under Jacoby-d’Arrigo methods.) Given that MBH claimed to have screened chronologies to ensure a minimum number of cores, their insistence in this instance of using a chronology portion that did not meet their reported QC standards seemed odd, to say the least. Needless to say, no one in the “community” cared.
Our analyses also quickly showed that the presence/absence of the Stahle PC1 didn’t matter as it didn’t have a HS shape. This was a non-issue in our own presentations – Tamino’s mention of Stahle was therefore a red flag to both Montford and me. Needless to say, the (bristlecone) NOAMER PC1 did “matter”, as did another series (Cook’s 1983 Gaspe chronology), even though it hadn’t been mentioned in the 2003 MBH response.
It’s hard to say precisely what Mann did in his 2003 diagram showing such a large impact from the NOAMER PC1-Stahle PC1. Our own calculations yielding a high early 15th century also involved Gaspe. At the time of MM2003, we had not fully appreciated the important role of the unique and unreported extrapolation of the Gaspe series, but became aware of it very quickly in late 2003 and were fully aware of the issue when we submitted our 2004 Nature articles. (There is a later unpublished version of Gaspe that doesn’t have a HS shape – an issue that is avoided by the Team.)
I presume that Mann’s 2003 diagram inadvertently used the actual Gaspe data (rather than the version with the unique and unreported extrapolation) and this led to a more dramatic result than he might have intended – but this is only speculation.
Mann re-visited this calculation in a graphic in the unpublished 2004 Mann et al submission to Climatic Change shown below. This has a different result than the 2003 diagram, showing high early 15th century results from a Gaspe-NOAMER PC1 combination – a point on which we were and are in agreement with them.
Mann et al 2004 submission to Climatic Change, Figure 2. “Treeline” in this context meant Cook’s 1983 Gaspe series.
A point that is little understood because of constant disinformation from the self-appointed bulldogs is that our results and those of Wahl and Ammann (or Mann) are in close agreement with sufficiently well-defined calculations – a point that we made in MM2005 (EE) as follows:
We emphasize the consensus between ourselves and Mann et al. on the results of sufficiently well-defined calculations. The PC calculations themselves are replicated between parties to complete accuracy. Differences remain in the emulations of NH temperature (given the PC series), but Mann et al. [2003] showed a calculation with high early 15th century results if the North American PC1 were unavailable; the comments in Mann et al. [2004b] about the effect of the PC4 confirm this overall agreement if assumptions are sufficiently well defined.
In December 2005, as I’ve reported on many occasions, recognizing this point, I proposed to Caspar Ammann that we attempt to write a joint paper accurately setting down points of empirical agreement e.g. the results of 2 covariance PCs versus 5 covariance PCs; the impact of the presence/absence of bristlecones; verification r2’s, etc. Ammann refused, saying that this would be “bad for his career”. Whether or not it would have been bad for Ammann’s career, I think that the “community” would have benefited, if Ammann had accepted my proposal. (The offer was made in writing and included a proviso that the parties could go back to square one if they were unsuccessful in achieving a joint paper; it was a very fair offer.)
Instead, the Team’s approach has always been one of misdirection, Tamino’s post and Gavin’s commentary being only the most recent examples. Tamino’s realclimate post totally misrepresented Montford’s paragraph and purported to rebut a claim that neither Montford (nor I) ever made. It wasn’t Montford (or I) that presented the Stahle-NOAMER PC1 combination that Tamino purports to refute. It was Mann himself.
Reasonable people can disagree as to whether the post should be entitled “Mann Bites Dog” or “Dog Bites Mann”, but surely no one can dispute that They are The Gang That Can’t Shoot Straight.
David Holland’s adventures with Met Office dishonesty are covered in a recent article in a law journal [link] and in a radio segment here (h/t Bishop Hill).
Untruthful answers by the UK Met Office to David Holland’s FOI requests were discussed at CA in 2008. Holland followed up with CRU, thus the “delete any emails” request. (As noted recently, Muir Russell made a totally untruthful characterization of the FOI underpinning of this email.)
Tamino’s realclimate post re-states points that I’ve discussed at length in the past. Here is a re-posting of a 2008 post on Tamino that deals with most of the issues in his realclimate post.
Tamino has recently re-iterated the climate science incantation that Mann’s results have been “verified”. He has done so in the face of the fact that one MBH98 claim after another has been shown to be false. In some cases, the claim has not only been shown to be false, but there is convincing evidence that adverse results were known and not reported.
Today I’m going to look at what constitutes verification of a relationship between proxies and temperature, assessing MBH results in such a context, trying as much as possible to emphasize agreed facts.
Verification
One thing that Tamino and I agree on is that a proposed reconstruction should “pass verification”. Tamino says:
… frankly, that’s the real test of whether or not a reconstruction may be valid or not. If it passes verification, that’s evidence that the relationship between proxies and temperature is a valid one, and that therefore the reconstruction may well reflect reality. If it fails verification, that’s evidence that the reconstruction does not reflect reality. It has the drawback that the data we set aside for verification we must omit from calibration; with less data, the calibration is less precise. But without verification, we can’t really test whether or not the reconstruction has a good chance of being correct.
and later
… it’s the verification statistics that are the real test of whether or not a reconstruction may be valid. Pass verification: probably valid. Fail verification: probably wrong.
While we strongly disagree on what constitutes “verification” and whether the MBH reconstruction “passes” verification, I’m prepared to stipulate to a verification standard.
If the MBH reconstruction can be shown to pass thorough verification testing, including, at a minimum, the steps described below, then, however implausible the notions may seem, I will advise readers to get used to the idea that bristlecones are magic trees, that their tune is a secret recording of world climate history and that Donald Graybill had a unique method of detecting their tune. However, these alleged magic properties should be subjected to (and withstand) scrupulous scientific investigation and verification and I do not agree with Tamino that these magic properties have been “verified”.
Without limiting the range of scientific investigation that any claim of a magical relationship might be subject to, the following verification tests seem to me to be a minimum that any scientist should require prior to grudgingly acquiescing in the view that a magical relationship exists between Graybill’s bristlecone ring width chronologies and world climate. (Similar considerations apply to any reconstruction heavily dependent on a very small number of “key” series.)
Failure in any one of these should result in Tamino rejecting the MBH reconstruction according to the verification standard. I submit that MBH has failed every one of these tests. Indeed, it’s hard to imagine a more dismal verification failure than what we’ve seen with MBH. Worse, efforts to verify their work have been contested and obstructed at every turn, leaving a very unsavory impression of the people involved.
Standard Verification Tests
First, the MBH AD1400 reconstruction failed standard dendroclimatic verification tests (Fritts 1976, 1991; Cook et al 1994; see NAS Panel Box 9.1): verification r2 (0.02 MM2005a; 0.018 Wahl and Ammann); CE ( -0.26 MM2005a; -0.21 Wahl and Ammann). These are not immaterial or irrelevant failures: for example, Eduardo Zorita said that his attitude towards the MBH reconstruction changed when he learned of the verification r2 failure.
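For readers unfamiliar with these statistics, here is a minimal Python sketch of the usual textbook definitions: verification r2 is the squared correlation between observed and reconstructed values in the verification period; RE compares the squared error of the reconstruction to that of the calibration-period mean; CE does the same against the verification-period mean. The numbers are made-up toy values, not MBH data, and the function name is hypothetical.

```python
def verification_stats(obs, est, calib_mean):
    """Return (r2, RE, CE) for a verification period.
    obs, est: observed and reconstructed values in the verification period.
    calib_mean: mean of the observations in the calibration period."""
    n = len(obs)
    mo, me = sum(obs) / n, sum(est) / n
    sse = sum((o - e) ** 2 for o, e in zip(obs, est))
    sxy = sum((o - mo) * (e - me) for o, e in zip(obs, est))
    sxx = sum((o - mo) ** 2 for o in obs)
    syy = sum((e - me) ** 2 for e in est)
    r2 = sxy ** 2 / (sxx * syy)
    re = 1.0 - sse / sum((o - calib_mean) ** 2 for o in obs)
    ce = 1.0 - sse / sxx
    return r2, re, ce

# Toy case: the estimate gets the verification-period mean right
# but not the year-to-year variation.
obs = [-0.3, -0.1, -0.2, -0.4, 0.0]
est = [-0.2, -0.2, -0.1, -0.2, -0.3]
r2, re, ce = verification_stats(obs, est, calib_mean=0.5)
print(round(r2, 2), round(re, 2), round(ce, 2))
```

In this toy case RE comes out high while r2 is low and CE is negative, because RE rewards a reconstruction merely for capturing the difference in mean between the calibration and verification periods. CE can never exceed RE, since the verification-period mean minimizes the denominator.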
Second, while Wahl and Ammann now (after the failure was exposed) argue that these failures don’t “matter”, that it’s all about low-frequency versus high-frequency, these are subtle issues where Wahl and Ammann hardly constitute high statistical authority (or even low authority). Readers are entitled to full disclosure of the adverse results and then judge for themselves whether they are persuaded by the Wahl and Ammann high frequency-low frequency argument. MBH readers were not given this alternative. MBH claimed that their reconstruction had “highly significant reconstructive skill”, not just in the RE statistic, but also in the verification r2 statistic, illustrating this claim in their Figure 3 excerpted below:
Figure 1: MBH98 Figure 3 panels b, c. The running text in MBH98 stated: “Figure 3 shows the spatial patterns of calibration β, and verification β and the squared correlation statistic r2, demonstrating highly significant reconstructive skill over widespread regions of the reconstructed spatial domain [emphasis added]” and later: “β [or RE] is a quite rigorous measure of the similarity between two variables … For comparison, correlation (r) and squared-correlation (r2) statistics are also determined. [emphasis added]”
These claims of statistical “skill” were not an idle puff by MBH, but were relevant to the widespread view that MBH methods represented a new level of sophistication, separating their work from Lamb’s prior work purporting to show a Medieval Warm Period. These claims of statistical skill were relied on by IPCC TAR, which made extensive use of the MBH reconstruction stating:
[MBH] estimated the Northern Hemisphere mean temperature back to AD 1400, a reconstruction which had significant skill in independent cross-validation tests.
The failure of important verification statistics should have been reported in MBH98, but wasn’t. It should have been reported in the 2004 Corrigendum, but wasn’t. Mann told Marcel Crok of Natuurwetenschap & Techniek that his reconstruction passed the verification r2 test:
Our reconstruction passes both RE and R^2 verification statistics if calculated correctly.
Later, Mann was reduced to telling a nonplussed NAS panel, well aware of Figure 3 shown above, that he had never calculated the verification r2 statistic, as that would be “foolish and incorrect reasoning”.
Perhaps Tamino can try, like Wahl and Ammann, to make a strained argument that the verification r2 (and CE) statistics don’t “matter”, but please – no more of this talk that MBH claims of statistical skill in the verification r2 statistic have been vindicated. They haven’t. And if you don’t believe me, look at Table 1S of Wahl and Ammann 2007 (which required a long and unsalubrious history prior to its inclusion in this article).
All of this discussion pertains to separation of in-sample calibration and verification periods – a separation which is complicated by the fact that you already know the results. The relevant test really comes from out-of-sample testing and verification scores, which I’ll discuss below.
“Robustness” to Dendroclimatic Indicators
Third, another important and untrue MBH claim that has not been verified is its supposed “robustness” to the presence/absence of all dendroclimatic indicators. Various issues related to dendroclimatic indicators had been cited in IPCC Second Assessment Report; one of the main selling points of MBH was its multiproxy approach which seemed to offer some protection against potential dendro problems. MBH98 stated:
the long-term trend in NH is relatively robust to the inclusion of dendroclimatic indicators in the network, suggesting that potential tree growth trend biases are not influential in the multiproxy climate reconstructions. (p. 783, emphasis added.)
We have also verified that possible low-frequency bias due to non-climatic influences on dendroclimatic (tree-ring) indicators is not problematic in our temperature reconstructions…
These claims have been demonstrated to be untrue. If a sensitivity analysis is done in which the Graybill bristlecone chronologies are excluded from the AD1400 network, then a materially different reconstruction results – a point made originally in the MM articles [note: also Cook's old Gaspe chronology which has its own serious issues - see below], confirmed by Wahl and Ammann 2007 and noted by the NAS panel. In addition to failing the verification r2 test, a reconstruction without bristlecones fails even the RE test. Wahl and Ammann argue that this is evidence that the bristlecones should be included in the reconstruction; this argument has not been accepted by any third party statistician. However, for the present point, the issue is quite different and has never been confronted by Mannians: the discrepancy between reconstructions with bristlecones and without bristlecones means that the representation that the reconstruction was “robust” to the presence/absence of all dendroclimatic indicators is untrue. This non-robustness was recognized by the NAS panel, which actually cited Wahl and Ammann on this point (STR, 111):
some reconstructions are not robust with respect to the removal of proxy records from individual regions (see, e.g., Wahl and Ammann in press)
There is convincing evidence that Mann et al knew of the impact of Graybill bristlecone chronologies on their reconstruction, as the notorious CENSORED directory shows the results of principal components calculations in which the Graybill chronologies have been “censored” from the network. Long before we identified the non-robustness to bristlecones, this non-robustness was known to Mann et al. While some comments in MBH99 can be construed as somewhat qualifying the robustness claims in MBH98, any such qualifications were undone in Mann et al 2000, which re-iterated the original robustness claims in even stronger terms than MBH98.
Some defenders of the Mann corpus have argued that the claims in Mann et al 2000 were narrowly constructed and referred only to the AD1730 network, which was the one illustrated in the graphic. In my opinion, the robustness claims were not limited to the AD1730 network, but included all networks ["our temperature reconstructions" is the phrase used.] But regardless, if Mann et al knew that the AD1400 network was not robust to the presence/absence of dendroclimatic indicators (which they did), then they had an obligation not to omit this fact (just as they had an obligation not to omit reporting the failed verification r2 statistics for networks prior to AD1820.)
Fifth, there is an important claim about the relative importance of the HS pattern in the North American network that not only has not been verified, but has been refuted. This particular issue has more resonance in terms of our personal experience than to others, but, as the people most directly involved, it was an extremely important matter. In response to MM2003, Mann et al argued that the HS shape of the North American PC1 represented the “dominant component of variance” or “leading component of variance” in the North American tree ring network and that the emulation in MM2003 had omitted this “dominant” component of variance. This was played out pretty loudly at the time. As readers will now recognize, this “dominant” or “leading” pattern was nothing of the sort. It was merely the shape of the Graybill bristlecone chronologies promoted into a far more prominent position in the PC rankings than they deserved, by reason of the erroneous Mann PC methodology.
In Mann’s first Nature reply, he was still holding to the “dominant component of variance” position. However, by the time of his revised Nature reply, he’d realized that the problem was deeper and conceded that the bristlecone shape had been demoted to the PC4 (an observation noted in MM 2005 (GRL, EE)). Instead of continuing to argue that the HS was the “dominant” or “leading” component of variance, he now argued that he could still “get” an HS shape with the bristlecones in the 4th PC if the number of retained PCs was increased to 5, invoking Preisendorfer’s Rule N as a rationale for expanding the roster to include the PC4. Of course, MBH98 had indicated a somewhat different rationale for PC retention in tree ring networks, but the description was vague.
My calculations indicate that it is impossible to obtain observed PC retention patterns using Rule N, with notable discrepancies in some networks. Was Rule N actually used in MBH98 or was it an after the fact effort to rationalize inclusion of the PC4? Wahl and Ammann didn’t touch the issue. With 20-20 hindsight, Mann et al might wish that they had used Rule N in MBH98, but no one’s verified that they did.
Graybill and Gaspé Chronologies
Given the acknowledged dependence of the MBH reconstruction on a very small number of tree ring chronologies, any engineering-quality verification for policy reliance would inevitably include a close examination and assessment of the reliability of these chronologies, including re-sampling if necessary.
The key bristlecone chronologies were taken over 20 years ago. They were all taken by one researcher (Donald Graybill), who was trying to prove the existence of CO2 fertilization. Graybill may well have been eminent in his field but it is ludicrous that major conclusions should be drawn from unreplicated results from one researcher. Particularly when there are also extremely important and unexplained differences in the behavior of Graybill’s chronologies from those of all other North American chronologies. The graphic on the left is a scatter plot comparing the weights of the Graybill chronologies (red) in the MBH PC1 to those of all authors, relative to the difference between the 20th century mean and overall mean. You can tell visually that the Graybill chronologies have a far larger difference in mean than the majority of chronologies (unsurprisingly, this difference in mean is statistically significant under a t-test).
Figure 2. Comparison of MBH98 NOAMER PC1 weights to difference in mean, showing Graybill in red. Left – unsquared; right – squared weights.
Aside from every other issue pertaining to MBH, any examination of this data requires an explanation of why the Graybill chronologies have a difference in mean that is not present in the other chronologies. This issue has nothing to do with PC1 or PC4. It’s really a question of whether there is an “instrumental drift” in the Graybill chronologies.
Let’s suppose that you have 70 satellites, using 8 different instruments, and that one instrument type has a drift relative to the others. If you do a Mannian pseudo-PC analysis on the network, the Mannian PC1 will pick out the instruments with the drift as a distinct pattern. Obviously, that would only be the beginning of the analysis, not the end of it. You then have to analyze the reasons for the drift of one set of instruments relative to the others – maybe the majority of instruments are wrong. But neither Spencer and Christy on the one hand nor Mears and Wentz on the other would simply say that Preisendorfer’s Rule N shows that the instrumental drift is a “distinct pattern” and terminate the analysis at that point. They’d get to the bottom of the problem.
Unfortunately, nothing like that has happened here. Mann and his supporters have paralyzed the debate on esoteric issues like Preisendorfer’s Rule N and “proper” or “correct” or “standard” rules of PC retention and most climate scientists seem to be content with this and have failed to inquire as to the validity of the Graybill chronologies, both as tree ring chronologies and as tree-mometers capable of acting as unique antennae for world temperature.
Updating the Graybill Chronologies
An obvious way of shedding light on potential problems with the Graybill chronologies would simply be to bring them up-to-date and show whether they are valid or not. Mann (and this argument is repeated by supporters) justified the failure to verify the Graybill chronologies on the basis that it is too “expensive” and that the sites are too “remote” – a justification conclusively refuted by our own “Starbucks Hypothesis” in Colorado.
Aside from our own efforts at Almagre in 2007, there is one other reported (but not archived) update, one which happened to be at the most important Graybill chronology – Sheep Mountain, a site which is not merely the most important in the AD1400 network, but one which becomes progressively more important in the longer PCs (especially the Mann and Jones 2003 PC1.) The Sheep Mt chronology was updated by Linah Ababneh, then a PhD student at the University of Arizona in 2003: see Ababneh 2006 (Ph. D. Thesis), 2007 (Quat Int). However, as previously reported at CA here (and related posts), Ababneh failed to replicate the distinctive HS shape of Graybill’s Sheep Mountain chronology, a shape that imprints the MBH reconstruction and, in particular, failed to verify the difference between the 20th century mean and long-term mean that led to the heavy weighting in the PC1. Her reconstruction was based on a far larger sample than Graybill’s. The differences are illustrated below:
Figure 3. Sheep Mountain Chronologies, Graybill versus Ababneh.
Linah Ababneh’s work has definitely not verified the most critical Graybill bristlecone chronology. Quite the contrary. Until the differences between her results and Graybill’s results are definitively reconciled, I do not see how any prudent person can use the Graybill chronologies, regardless of the multivariate method.
In our own work at Almagre, we identified issues related to ring widths in trees with strip bark that compromise statistical analysis, but have nothing to do with CO2 fertilization or previously identified issues. We found (see here and here) that strip bark forms can result in enormous (6-7 standard deviation) growth pulses in one portion of the core that are totally absent from other sections of the core, as illustrated below.
Figure 4. Almagre Tree 31 core samples, showing difference between cores taken only a few cm apart. Black (and red) show 2007 samples.
In a small collection (and “small” here can be as high as 30 or 50 cores), the presence/absence of a few such almost “cancerous” pulses would completely distort the average. The NAS panel said that “strip bark” forms should be “avoided” although they seem to have had in mind the more traditional concerns of CO2 fertilization, rather than what seem to Pete Holzmann and myself to be the problematic “mechanical” issues. Here there are some worrying aspects about the Graybill chronologies that should be of concern to more people than ourselves. First, Graybill and Idso (1993) said that cores were selected for the presence of strip bark so the possibility of a bias is latent in the original article. Second, at Almagre, we identified trees with tag numbers where cores had been taken and are located at the University of Arizona, but Graybill’s archiving was incomplete. Why were cores excluded from the archive? Given that the Graybill chronologies underpin the entire MBH enterprise, these missing invoices are, to say the least, disquieting, given Graybill’s seemingly unique ability to detect 20th century differences.
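The arithmetic behind the distortion can be sketched quickly. The following is a toy calculation on synthetic data, not the Almagre measurements: the 30 cores, the 3 pulsed cores, the 6-sigma pulse size and the 20-year pulse window are all assumed values chosen for illustration. It compares how far a simple average chronology moves against how far a median moves.

```python
import numpy as np

def pulse_shift(n_cores=30, n_pulsed=3, pulse_sigma=6.0,
                n_years=100, pulse_years=20, seed=0):
    """Shift in the mean and median chronology caused by a few growth pulses.

    All inputs are hypothetical: rows are synthetic standardized cores
    (unit variance), not real ring-width measurements.
    """
    rng = np.random.default_rng(seed)
    clean = rng.normal(size=(n_cores, n_years))
    pulsed = clean.copy()
    # Add a pulse of `pulse_sigma` standard deviations to a few cores
    # over the final `pulse_years` years only.
    pulsed[:n_pulsed, -pulse_years:] += pulse_sigma
    mean_shift = (pulsed.mean(axis=0) - clean.mean(axis=0))[-pulse_years:].mean()
    median_shift = (np.median(pulsed, axis=0)
                    - np.median(clean, axis=0))[-pulse_years:].mean()
    return mean_shift, median_shift

mean_shift, median_shift = pulse_shift()
print(f"shift in mean chronology:   {mean_shift:.2f} sigma")
print(f"shift in median chronology: {median_shift:.2f} sigma")
print(f"standard error of a 30-core mean: {1 / np.sqrt(30):.2f} sigma")
```

With these assumed numbers the mean moves by 6 x 3 / 30 = 0.6 sigma, several times the roughly 0.18 sigma standard error of a 30-core average, which is why a handful of pulsed cores matters even in a collection that does not look small.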
Gaspé
As noted elsewhere, there are issues about whether the Gaspé reconstruction has been included in the AD1400 network only through ad hoc, undisclosed and unjustified accounting methods.
But aside from such issues, there is the important problem that, like Sheep Mountain, an update of the Gaspé chronology failed to yield the HS shape of the reconstruction used in MBH98. In this case, the authors of the update (Jacoby and d’Arrigo) failed to report or archive their update and it is through sheer chance that I even know about the update (which has not been reported anywhere other than CA). Again the “key” chronology used in MBH98 has not been verified.
The Bristlecone Divergence Problem
Ultimately the most relevant test of the “relationship between proxies and temperature” is whether updated proxies can reconstruct the temperature history of the 1980s, 1990s and 2000s. Here I mean the exact MBH98-99 proxies used in the AD1400 (and AD1000) networks; not a bait-and-switch. In the AD1400 (and AD1000) MBH case, a few key chronologies have been updated and so we have some insight on how the supposed “relationship” is holding up.
In our own sampling at Almagre, we found that ring widths in the 2000s were not at the record levels predicted by the Mannian relationship – and in fact had declined somewhat – one more instance of the prevalent “divergence problem”, but this example not limited to high latitudes and affecting one of the MBH PC1 proxies. Likewise, the Mann “relationship” at Sheep Mt would call for record ring widths there, but not only did Ababneh not observe such records, as noted above, she raised serious questions about the original Graybill chronology in the first place.
RE Statistic
In the face of all of this, how can Tamino (or anyone else) claim that the MBH reconstruction has been “verified”? Other than uncritical reliance on realclimate?
The main sleight of hand involves the RE statistic. The AD1400 reconstruction with old Sheep Mt and Gaspe chronologies has a high RE statistic. This appears to be the beginning and end of what Tamino (and realclimate) regards as “verification”. No need to verify the individual proxies. No need to pass other verification tests – even ones said to have been used in MBH98. No need to prove the validity of the relationship out-of-sample. All you need is one magic statistic – the RE statistic.
The trouble with the RE statistic, as we observed long ago, is that, meritorious or not, it’s not used in conventional statistics and little is known about its properties. In MM2005 (GRL) we showed that you could get high RE statistics using Mannian methodology on red noise. However, the problem with the RE statistic can be illustrated far more easily than occurred to us at the time. As noted on CA, I checked RE statistics for “reconstructions” using two of the most famous examples of spurious regression in econometrics: 1) Yule (1926), which shows a relationship between mortality and proportion of Church of England marriages; 2) Hendry (1980), which shows a relationship between cumulative U.K. rainfall and inflation. Both classic spurious regressions yield extremely high RE statistics – even higher than MBH98.
So although Mann characterizes the RE test as “rigorous”, it isn’t. It will fail to reject virtually any spurious regression (between co-trending unrelated series). I’m not saying that the RE test shouldn’t be run: I see no harm in using this test, but it’s only one test and is not in itself anywhere near sufficient to constitute verification of a supposed relationship between proxies and temperature. For Mann, Wahl and Ammann or Tamino to argue that passing an RE test is some sort of accomplishment merely sounds goofy to anyone familiar with Yule 1926 or Hendry 1980. You’d think that third party climate scientists would catch onto this by now.
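The point can be sketched with a toy example. This is not the Yule or Hendry data, and not the MBH network: it uses two synthetic series whose noise is independent by construction, which share nothing except a rise during the calibration period. Period lengths, trend sizes and noise levels are assumed values. The usual definitions are taken for the statistics: RE benchmarks the squared verification error against the calibration-period mean, CE benchmarks it against the verification-period mean, and verification r2 is the squared correlation in the verification period.

```python
import numpy as np

def verification_stats(y_cal, y_ver, yhat_ver):
    """Return (RE, CE, verification r2) for a fitted relationship."""
    sse = np.sum((y_ver - yhat_ver) ** 2)
    re = 1.0 - sse / np.sum((y_ver - y_cal.mean()) ** 2)  # benchmark: calibration mean
    ce = 1.0 - sse / np.sum((y_ver - y_ver.mean()) ** 2)  # benchmark: verification mean
    r2 = np.corrcoef(y_ver, yhat_ver)[0, 1] ** 2
    return re, ce, r2

def spurious_example(n_ver=50, n_cal=80, seed=1):
    """Two unrelated series that merely co-trend in the calibration period."""
    rng = np.random.default_rng(seed)
    trend = np.linspace(0.0, 1.0, n_cal)
    # Verification period: both series are flat, with independent noise.
    y_ver = rng.normal(scale=0.1, size=n_ver)
    x_ver = rng.normal(scale=0.2, size=n_ver)
    # Calibration period: both series rise, noise still independent.
    y_cal = 1.0 * trend + rng.normal(scale=0.1, size=n_cal)
    x_cal = 2.0 * trend + rng.normal(scale=0.2, size=n_cal)
    # Calibrate y on x, then "reconstruct" y in the verification period.
    slope, intercept = np.polyfit(x_cal, y_cal, 1)
    yhat_ver = intercept + slope * x_ver
    return verification_stats(y_cal, y_ver, yhat_ver)

re, ce, r2 = spurious_example()
print(f"RE = {re:.2f}, CE = {ce:.2f}, verification r2 = {r2:.2f}")
```

In this setup the "reconstruction" earns a high RE simply because both series sit below their calibration means in the verification period, while CE and the verification r2 show that it has no year-to-year skill at all.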
I don’t think that anything useful can be shown by more and more calculations on the MBH network. At this point, the only relevant testing is the out-of-sample re-sampling, showing that the supposed “relationships between proxies and temperature” can be confirmed. Available information on MBH proxies has not verified these relationships.
Anything Else?
Is there anything else that remotely constitutes verification of MBH? I’d be happy to consider and respond to any suggestions or inquiries.
In the above discussion, I haven’t talked about principal components very much and there’s a reason for that. In our articles, we observed that the Mannian pseudo-PC methodology was severely biased towards picking out HS-shaped series. In the critical NOAMER network, the relationship between the difference in 20th century mean and PC1 weighting is so strong that the MBH PC1 could be described as follows:
Construct the following linear combination of chronologies: assign a weight to each chronology equal to the difference between the 20th century mean and overall mean (with negative weights assigned to negative differences.)
This methodology will regularly deliver HS shaped series from red noise. Mannian pseudo-PC methodology is a poor methodology in that its efforts to locate a HS shape interfere with the operation of the PC algorithm. If there is a very strong “signal” or if the true signal actually is HS-shaped, then the poor methodology doesn’t matter much relative to conventional PC methodology. In the practical situation of the NOAMER network, the net result of the flawed methodology was to deliver a high weight to bristlecones.
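The relationship between PC1 weights and the difference in mean can be checked on synthetic red noise. What follows is a simplified sketch, not the MBH98 algorithm or our emulation of it: it uses AR(1) noise with an assumed coefficient of 0.9, an assumed network of 70 series over 581 years with a 79-year calibration period, and it omits the rescaling steps, keeping only the choice between subtracting the calibration-period mean (short-centering) and subtracting the full-period mean (conventional centering).

```python
import numpy as np

def ar1_noise(n_years, n_series, rho, rng):
    """Independent AR(1) red-noise series, one column per pseudo-chronology."""
    eps = rng.normal(size=(n_years, n_series))
    x = np.zeros((n_years, n_series))
    x[0] = eps[0] / np.sqrt(1.0 - rho ** 2)  # start from the stationary distribution
    for t in range(1, n_years):
        x[t] = rho * x[t - 1] + eps[t]
    return x

def pc1_weights(x, n_cal, short_centered):
    """PC1 weights after subtracting the calibration mean or the overall mean."""
    ref = x[-n_cal:] if short_centered else x
    y = x - ref.mean(axis=0)
    _, _, vt = np.linalg.svd(y, full_matrices=False)
    return vt[0]

def weight_vs_mean_difference(n_years=581, n_series=70, n_cal=79, rho=0.9, seed=0):
    """|correlation| between PC1 weights and (calibration mean - overall mean)."""
    rng = np.random.default_rng(seed)
    x = ar1_noise(n_years, n_series, rho, rng)
    diff = x[-n_cal:].mean(axis=0) - x.mean(axis=0)
    result = {}
    for name, flag in (("short-centered", True), ("centered", False)):
        w = pc1_weights(x, n_cal, flag)
        # The sign of a principal component is arbitrary, so compare magnitudes.
        result[name] = abs(np.corrcoef(w, diff)[0, 1])
    return result

for name, corr in weight_vs_mean_difference().items():
    print(f"{name:>15}: |corr(PC1 weight, difference in mean)| = {corr:.2f}")
```

The reason is visible in the algebra: short-centering leaves each column with a nonzero mean equal to minus its difference in mean, which adds a rank-one term built from those differences to the cross-product matrix, so the leading eigenvector is pulled toward them.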
If the bristlecones are magic trees, then the methodology might be flawed, but, at the end of the day, that wouldn’t “matter”.
If (1) bristlecones are not magic trees and/or the Graybill chronologies have some sort of “instrumental drift” resulting in a spurious regression relationship to world temperature, (2) the Mannian pseudo-PC methodology is flawed and (3) there is some other methodology that avoids the grossest flaws of the Mannian pseudo-PC methodology, but is still inadequate to detect a spurious regression against the Graybill chronologies, then, in a bizarro-world, bizarro-scientists might argue that the flawed methodology didn’t “matter” because they were going to do the calculation incorrectly anyway. Leading bizarro-scientists would perhaps go further, arguing, in addition, that the fact that they could go on to make completely different errors meant that criticisms of the original errors were “wrong”.
At the end of the day, the issue, as the NAS panel realized, is about proxies and verification statistics. That doesn’t mean that the criticisms of the PC methodology are incorrect; they aren’t. Just that the PC issues could be coopered up without settling the key issues on proxies and verification.
Preisendorfer described PC methodology as “exploratory” and this is precisely how we (but not Mann) applied PC methodology. Mannian pseudo-PC methodology identified the most HS-shaped series quite effectively. We used this to explore the NOAMER network and found that its selections were not random – it picked out the Graybill bristlecones. The scientific issue is then whether these are valid proxies – and this is an issue that is not settled by Rule N, but one that requires scientific evidence. And in all the discussion to date, Mann et al have produced no such evidence.
So did the PC error “matter”? Well, it probably mattered in a different way than people think.
Consider what would have happened had MBH not used an erroneous PC methodology. Let’s suppose that they used a centered PC calculation together with Preisendorfer’s Rule N, so that they retained 5 PCs in the AD1400 network, including the bristlecones, and everything reconciled the first time. What would have happened? In 2003, I’d probably have more or less replicated their results and thought no more about it. I would probably not have peered beneath the surface inquiring about the PC4 and bristlecones, verification r2 statistics and so on. I’d be making a handsome living in speculative mining stocks.
I followed the magic flute instead.
Hansen’s twin pit bulls, Tamino and Gavin, have launched into a spirited defence of Mannian paleo-phrenology at realclimate here, with a counter-discussion at Bishop Hill here.
In the Muir Russell report, Richard Horton observed that orthodox medicine “mostly rejects” papers that invoke invisible pathways (meridians of qi):
For example, the world of complementary and alternative medicine (CAM) divides the medical community. Orthodox medicine mostly rejects papers about reflexology, iridology, and acupuncture treatment that invokes invisible pathways (meridians) of qi. CAM is served by a separate class of journals that have little overlap with the more mainstream medical literature. In this instance, ideas are incommensurable.
Unfortunately, the climate science community has been far more accommodating, admitting the paleoclimate equivalent of alternative statistics into orthodox journals. The wider climate science “community” is placed in the awkward position of trying to reassure the public that other parts of their field are, in fact, based on science, while, at the same time, not only not disavowing, but actively defending paleo-phrenologists and the meridians of qi converging on bristlecones in California and the magic larches in Yamal. Given that strip bark bulges, which are most likely merely mechanical, are interpreted as expressions not just of local temperature and precipitation but of world “instrumental training patterns” or “climate fields”, phrenology is a surprisingly apt term.
The problems of strip bark standardization were being discussed in the thread where Climategate was first mentioned – a thread which contains relevant illustrations of the problems in trying to fit strip bark bulges into any statistical framework – let alone the statistical framework stated to underpin MBH98. The picture below shows the sort of phrenological bulge that underpins the strip bark Hockey Stick (see here for further discussion). There is convincing evidence that such bulges are present in strip bark chronologies – one of the reasons why the NAS panel said that they should be avoided in temperature reconstructions. Gavin and Tamino can huff and puff all they want about the 4th principal component, but this is the sort of data that they are importing under the guise of the “right” number of principal components.
All their talk about the “right” number of principal components is simply sleight-of-hand to confuse you – when you watch the pea, the entire purpose of the high-falutin talk about principal components is to “get” strip bark bulges into the reconstruction.
This is an old debate, but the only thing that the Team moves is the pea under the thimble.
MBH98 stated:
Implicit in our approach are at least three fundamental assumptions. (1) The indicators in our multiproxy trainee network are linearly related to one or more of the instrumental training patterns. In the relatively unlikely event that a proxy indicator represents a truly local climate phenomenon which is uncorrelated with larger scale climate variations, or represents a highly nonlinear response to climate variations, this assumption will not be satisfied.
This, of course, is a large part of the problem with strip bark bristlecones (and YAD061 and its cousins.) Actually, the problem with strip bark trees looks even worse – it seems very possible, even likely, that the 6-sigma bulges in strip bark widths are purely mechanical, arising from the formation of strip bark itself. However, these 6-sigma bulges become proof in the hands of paleo-phrenologists using their own alternative statistics.
The failure of the most critical MBH proxies – strip-bark bristlecones – to meet the assumptions of their statistical model was stated as early as our 2004 Nature submission (there is compelling evidence that Jones was the third and very antagonistic reviewer), where we stated:
The NOAMER PC1 thus gets its hockey stick shape from the Graybill-Idso sites, which exhibit a nonclimatic response and/or a nonlinear response to 20th century temperature. Since MBH98 states (p. 780) that their method requires the assumption that proxies exhibit a linear response to temperature, the Graybill-Idso sites, explicitly acknowledged as problematic in Mann et al (1999) (ref. [13]), should have been disqualified as contributors to the NOAMER PC1 in MBH98, let alone as the main determinants of its shape.
Much effort has been spent by paleo-phrenologists to frame the issue as the “right” number of principal components to retain – as opposed to the underlying issue as we had framed it – whether the assumptions of the underlying statistical model had been satisfied. Indeed, we noted that MBH99 had even acknowledged the failure of stripbark bristlecones to satisfy the assumptions of their model:
Mann et al. (1999) themselves pointed out, with reference to these proxies: “A number of the highest elevation chronologies in the western U.S. do appear, however, to have exhibited long-term growth increases that are more dramatic than can be explained by instrumental temperature trends in these regions.”
With the inconsistency that so characterizes the field, after conceding that bristlecones do not meet the assumptions of their statistical model, Mann proceeded to use them anyway. (Despite statements in MBH98 that the reconstruction was “robust” to the presence/absence of all dendro proxies, MBH98 was not “robust” to the presence/absence of bristlecones.) Thus, instead of not using bristlecones because they failed to satisfy the assumption of the statistical model, Mann purported to “adjust” the strip bark bristlecone chronologies – an adjustment convincingly criticized by Jean S last year. Mann’s methodology, here as elsewhere, belongs to what can only be described as alternative statistics, a discipline that, as noted above, has found a home in the climate science sections of otherwise orthodox journals.
Mann responded to our observation that Graybill strip bark bristlecones did not meet the fundamental assumption of his methodology by invoking a supposed relationship to “instrumental training patterns” as opposed to local temperature and precipitation:
MM04 demonstrate their failure to understand our methods by claiming that we required that “proxies follow a linear temperature response”. In fact we specified (MBH98) that indicators should be “linearly related to one or more of the instrumental training patterns2”, not local temperatures.
(Update, Jul 25-6: the criticism of Mannian teleconnections is not refuted by pointing to ENSO. Individual trees respond to local temperature and precipitation etc; they do not respond to abstractions like a PC3. Further, the problematic 6-sigma strip bark bulges that characterize Team reconstructions are not a linear response to climate at all.) Roman Mureika expresses the point in a comment as follows:
What the climate scientists don’t seem to understand is that for teleconnections to be usable in a scientific fashion, there must be a specific real identifiable physical effect which operates at the proxy location. This effect is clearly not local temperature since the proxy has not responded to that. To further assume that this unidentified effect is related in an appropriate equivalent quantitative form to the proxy measurements is a fiction which lends itself to the cherry picking of spuriously correlated series.
[end - update]
In my opinion, if climate scientists in other parts of the community took pains to disavow paleoclimate meridians of qi and alternative statistical methods used to buttress them – which, after all, are an important part of the public face of climate science – there would have been less fall-out for the rest of the discipline in the wake of Climategate.
When the NAS panel said that strip bark bristlecones should not be used in temperature reconstructions, this should have put an end to the use of Graybill bristlecones in temperature reconstructions. However, this didn’t happen. Wahl and Ammann totally ignored the recommendations of the NAS panel, even though their article wasn’t finally published until a year after the NAS panel; the companion paper, Ammann and Wahl 2007, wasn’t even submitted until after the NAS panel. Other members of the Team also continued the use of strip bark after the NAS panel, e.g. Hegerl et al 2007, Juckes et al 2007, Mann et al 2008.
Wahl and Ammann, as discussed in past CA posts, is a sustained exercise in Texas sharpshooting. Their efforts to benchmark RE significance were, of course, a singular contribution to Texas sharpshooting literature. But most of the rest of their article consists of variations on the theme.
Even the longstanding issue of 2 or 5 PCs comes down to Texas sharpshooting. As Jean S reminded readers at Bishop Hill, there was no evidence that Mann used Preisendorfer’s Rule N in determining the number of retained PCs in MBH98. Indeed, the explicit language of the article indicates another rule. Mann has refused to provide source code evidencing the use of this rule in MBH98. Using this rule after the act is simply one more example of Texas sharpshooting – what Wegman called “no statistical integrity”.
Gavin Schmidt’s inline responses to Judy Curry here rely heavily on Wahl and Ammann 2004 2005 2006 2007 and include a complaint that we haven’t published a rebuttal of Wahl and Ammann in the peer-reviewed litchurchur.
Obviously I’ve commented on Wahl and Ammann at length at Climate Audit. I recognize that these comments haven’t been peer reviewed by Jones, Santer, Mann and their associates, but they are still comments that I believe to be thoughtful and ones that are worth reading by someone interested in the topic. There is a separate left-frame category for Wahl and Ammann.
Second, it is very much my belief that, if the points made in these threads and elsewhere are correct (and I believe them to be), then these are the sorts of things that specialists in the field, employed to do these sorts of studies, should be responsible for knowing whether or not I’d written the threads. That I’ve commented should be an assistance to them, but surely not a prerequisite.
Third, although Schmidt complains that we haven’t rebutted Wahl and Ammann in the litchurchur, this is not entirely true. McIntyre and McKitrick (E&E 2005) rebutted many, if not most, of the points at issue in Wahl and Ammann. This may seem a little surprising given that MM2005 (EE) was published prior to Wahl and Ammann. Nonetheless, it is so.
All the key arguments of Wahl and Ammann 2007 – bristlecones in a lower-order PC4, two versus 5 PCs, Mannian inverse regression without PCs – were first put forward in the Mann response to our re-submission, which I’ve placed online here – see, in particular, Mann’s cover letter.
These arguments from Mann’s 2004 response to our Nature submission featured prominently in multiple threads in the opening of realclimate. (It was these pre-emptive attacks on us that led to the opening of climateaudit as a blog in late January 2005, thanks to the suggestion and initiative of John A.)
Although Wahl and Ammann did not cite either the Mann submission to Nature or the realclimate posts (and conspicuously do not even acknowledge Mann), virtually all the main arguments in Wahl and Ammann derive from these prior publications by Mann. Isn’t the failure to acknowledge such priority a form of plagiarism?
In MM2005 (EE), we reported on our examination of the various permutations and combinations of correlation and covariance PCs, the impact of 2 or 5 PCs, etc, that had been previously raised, plus a few others. If you go through the salient cases of Wahl and Ammann, you’ll find that they are already considered in MM2005 (EE). Of course, this isn’t reported either. (Wahl’s awareness of this priority is demonstrated in his Climategate correspondence with Briffa.)
Doubtless it would have made things easier for people if we’d responded to Wahl and Ammann/Ammann and Wahl (the SI to which only became available in summer 2008) and it’s on my list of things to do. But the fact that I haven’t attempted to run the gauntlet of Team reviewers in the litchurchur doesn’t mean that I haven’t responded to Wahl and Ammann. The points have been responded to at considerable length.
Schmidt also grasped the verification r2 nettle – a nettle that he would have been better off leaving ungrasped. This was a battleground issue in 2005. Judy Curry had written:
just because no single significance test is objectively the best in all circumstances does not mean that you can cherry pick significance tests until you find one you like and ignore R2.
Gavin Schmidt replied:
[Response: This is simply insulting. You have absolutely no evidence that this was the case. The RE/CE statistics are perfectly fine at describing what the authors thought were relevant and have a long history in that field (Fritts, 1976) and as we have seen the PCA issue is moot. The idea that people went looking for 'bad statistics' to fix their results is without merit whatsoever. Please withdraw that claim.]
Well, it may be insulting, but the evidence is what it is.
Fritts, 1976 does not stand as an authority for not using verification r2, as it is a test that Fritts recommends prior to doing the RE test. Secondly, Schmidt’s claim that Mann reported an RE/CE pair is untrue. Mann did not report CE results for MBH98. They were first reported in MM2005 (GRL), where we observed that the AD1400 step failed the CE test, as it had the verification r2 test.
However, the most compelling evidence of Mann reporting a verification r2 in a step where it was favorable was, of course, in MBH98 itself, where Figure 3b is clearly labeled “verification r2″ – see below:
While the verification r2 is illustrated geographically in the above graphic, and MBH98 stated that they considered r2 statistics, the SI to MBH98 showed only the RE results and not verification r2 statistics. Mann’s source code, archived in response to the House committee, showed that he calculated verification r2 in the same step as verification RE, a point made at the time and later presented to the NAS panel.
The original Wahl and Ammann submission likewise did not include verification r2 results (even though they had issued a press release that our results were “unfounded”). Our codes and the Wahl-Ammann code reconciled – Wegman waggishly observed that it was more correct to say that Wahl and Ammann replicated our results, than Mann’s. As a reviewer of Wahl and Ammann, I asked that they include verification r2 results. They refused, citing their GRL article as authority (without disclosing to Schneider that their GRL article had already been rejected.)
In December 2005, I suggested to Ammann that we write a joint paper clearly summarizing points of agreement and disagreement. He refused, saying that it would be “bad for his career”. This has led to a great deal of wasted time on everybody’s part. Again, he refused to report verification r2 results. These were reported only after an academic misconduct complaint was filed against Ammann. Needless to say, Ammann got the same negligible verification r2 results that we had.
The NAS offered to examine the verification r2 issue, but Cicerone removed it from the terms of reference of the NAS panel. Nonetheless, panelist Christy asked Mann whether he had calculated verification r2 for the 1400 step and what the result was. Mann denied calculating the verification r2, saying that this would be a “foolish and incorrect” thing to do. Of course, it was known at the time that he had calculated verification r2 statistics, since it was in his code and illustrated for the AD1820 step.
The “dirty laundry” email (in which Mann sent to Briffa and Osborn the residuals that he later refused to send me) had not been available to the NAS panel. With these residuals in hand (or even if the actual reconstruction steps had been made available), it was child’s play to see the failed verification r2, CE and other results.
As it was, the NAS panel was seemingly dumbfounded by Mann’s bald-faced answer and did not follow up. There was supposed to be an opportunity for public discussion after presentations. However, Mann fled the room before anyone from the public e.g. me had an opportunity to ask. I sharply criticized the NAS panel for sitting like bumps on a log and not following up. Nychka came up to me afterwards and said that, just because they didn’t say anything didn’t mean that they didn’t notice. They didn’t say anything in their report either on the topic so they might as well not have noticed.
The Wahl and Ammann attempt to justify the failed verification r2 test was itself one more instance of the Texas sharpshooting. Once the failed verification r2 was exposed (and only after its exposure by third parties), they attempted to re-frame the question by now arguing that verification r2 wasn’t a relevant statistic – notwithstanding their use of the statistic in the illustration when it was to their advantage. Schmidt may find this impolite, but facts are sometimes stubborn.
The failure of the MBH verification r2 results was not as small a result at the time as now portrayed by the Team. Eduardo Zorita told me that his view on MBH changed once he knew of the failed verification r2. If Mann wanted to argue that the failed verification r2 didn’t “matter”, the failed results should have been reported and discussed in the original article.
Thus, while reconstructions relying on strip bark bulges of California bristlecones and magic Yamal larches have been published in orthodox scientific journals, this does not change the fact that the underlying analyses do not rise above phrenology.
One of the more controversial issues in WG2 arose out of Robert Muir-Wood’s calculations on climate-related damages – Pielke Jr taking issue
http://rogerpielkejr.blogspot.com/2010/02/ipcc-mystery-graph-solved.html and http://rogerpielkejr.blogspot.com/2010/01/hot-on-trail-of-ipcc-mystery-graph.html.
Muir-Wood was a Contributing Author to IPCC AR4.
Earlier this year, I attempted to obtain data used in the underlying publications in order to carry out statistical analyses. I wrote on four occasions.
On Jan 27, 2010, I wrote as follows:
Dear Dr Muir-Wood, Could you please provide me with a digital version of the time series used in the production of Figure 12.5 through 12.9 and Tables 12.A.1 and 12.A.2 of Miller, Muir-Wood and Boissonnade, An exploration of trends in normalized weather-related catastrophe losses. Thank you for your attention, Steve McIntyre
Muir-Wood was on holiday at the time of my fourth request and, in contrast to Muir-Wood’s own total failure to respond, a secretary replied as follows:
Robert is on holiday this week. By copying Auguste [Boissonnade], I am hoping he can help you.
Needless to say, Boissonnade didn’t provide the data either.
Muir-Wood does not work for a public agency – he is employed by Risk Management Solutions (who Bob Ward used to work for.) I asked Ward for assistance, but he refused.
Ryan O asked serial Mann coauthor, Caspar Ammann, for supporting data for Ammann et al (PNAS 2007), which was referred to in CCSP (2009c) Past Climate Variability and Change in the Arctic and at High Latitude, an assessment report that was, in turn, cited in the EPA Endangerment Finding. Ryan’s request was as follows:
the gridded monthly temperature anomalies over the entire 850 – 2000 AD period from the NCAR CSM 1.4 experiments used for the 2007 PNAS paper, “Solar influence on climate during the past millennium: Results from transient simulations with the NCAR Climate System Model”, with Dr. Ammann as the lead author.
Ammann demonstrated the new Team openness by simply not acknowledging or replying to Ryan’s request.
The data is not confidential – Ryan observed that Ammann had previously provided the data to people who were not NCAR employees (e.g. Mann, Wahl and Rutherford).
Ryan eventually sent an FOI request to UCAR – a consortium of universities that manages NCAR for the National Science Foundation. (I’ve written on it before; it’s a sort of off-balance sheet method of public expenditure.)
UCAR’s General Counsel, Meg McClellan, refused Ryan’s request for data as follows:
The University Corporation for Atmospheric Research (UCAR) is a private non-profit research organization that operates the National Center for Atmospheric Research (NCAR) and other programs. Neither UCAR nor NCAR are federal agencies. FOIA does not apply to private organizations.
The structure of NCAR is fairly murky. I did posts on UCAR-NCAR a few years ago; see http://climateaudit.org/2006/03/23/inhofe-ucar-and-ncar/ and http://climateaudit.org/2006/03/24/ncar-competition-announcement/.
My impression was that NCAR was owned by the National Science Foundation. I can see how UCAR might have evaded FOI, but I’m a little puzzled as to the argument in respect to NCAR.
Given public sentiment, it seems foolish for Ammann and UCAR to obstruct Ryan’s request.
In the meantime, Ryan has sent an FOI to National Science Foundation, observing:
…
OMB Circular A-110, which defines the requirements by which private, non-profit organizations may accept NSF funds, states under subpart (C):
“(d) (1) In addition, in response to a Freedom of Information Act (FOIA) request for research data relating to published research findings produced under an award that were used by the Federal Government in developing an agency action that has the force and effect of law, the Federal awarding agency shall request, and the recipient shall provide, within a reasonable time, the research data so that they can be made available to the public through the procedures established under the FOIA. If the Federal awarding agency obtains the research data solely in response to a FOIA request, the agency may charge the requester a reasonable fee equaling the full incremental cost of obtaining the research data. This fee should reflect costs incurred by the agency, the recipient, and applicable subrecipients. This fee is in addition to any fees the agency may assess under the FOIA (5 U.S.C. 552(a)(4)(A)).”
Full text available here: http://www.whitehouse.gov/omb/rewrite/circulars/a110/a110.html
…
NSF FOI specialist Leslie Jensen replied as follows:
Dear Sir:
Your request is not perfected as submitted. See the NSF FOIA regulations: http://www.nsf.gov/policies/foia.jsp. In addition, with respect to your request for research data, 2 CFR 215.36. Section 5 CFR 215.36(d) provides:
215.36 Intangible property.
* * *
(d) (1) In addition, in response to a Freedom of Information Act (FOIA) request for research data relating to published research findings produced under an award that was used by the Federal Government in developing an agency action that has the force and effect of law, the Federal awarding agency shall request, and the recipient shall provide, within a reasonable time, the research data so that they can be made available to the public through the procedures established under the FOIA. If the Federal awarding agency obtains the research data solely in response to a FOIA request, the agency may charge the requester a reasonable fee equaling the full incremental cost of obtaining the research data. This fee should reflect costs incurred by the agency, the recipient, and the applicable subrecipients. This fee is in addition to any fees the agency may assess under the FOIA (5 U.S.C. 552(a)(4)(A)).
(2) The following definitions apply for purposes of paragraph (d) of this section:
(i) Research data is defined as the recorded factual material commonly accepted in the scientific community as necessary to validate research findings, but not any of the following: Preliminary analyses, drafts of scientific papers, plans for future research, peer reviews, or communications with colleagues. This “recorded” material excludes physical objects (e.g., laboratory samples). Research data also do not include:
(A) Trade secrets, commercial information, materials necessary to be held confidential by a researcher until they are published, or similar information which is protected under law; and
(B) Personnel and medical information and similar information the disclosure of which would constitute a clearly unwarranted invasion of personal privacy, such as information that could be used to identify a particular person in a research study.
(ii) Published is defined as either when:
(A) Research findings are published in a peer-reviewed scientific or technical journal; or
(B) A Federal agency publicly and officially cites the research findings in support of an agency action that has the force and effect of law.
(iii) Used by the Federal Government in developing an agency action that has the force and effect of law is defined as when an agency publicly and officially cites the research findings in support of an agency action that has the force and effect of law.
This section provides for access to research data (as defined above)
– relating to published (as defined above) research findings
– produced under an (NSF) award (made after the effective date of this provision)
– that was used by the Federal Government in developing an agency action that has the force and effect of law (as defined above).
Please identify the NSF awards that meet the preconditions for research data access set forth in section 215.36(d)(1). I have enclosed NSF abstracts of awards to Dr. Caspar Ammann that may be of interest to you. Please also provide your contact information and your agreement to pay accrued fees. Once you have perfected your request, I will proceed in accordance with the procedures set forth the in above Regulations.
Sincerely,
Leslie A. Jensen
FOIA/Privacy Act Officer
Ryan replied with a re-stated request containing the additional information:
Below is the complete request, with the additional information required:
Pursuant to 2 CFR 215.36, Section 5 (d), I hereby request information that was used in support of Endangerment and Cause or Contribute Findings for Greenhouse Gases under the Clean Air Act (EPA). 2 CFR 215.36, Section 5(d) requires release of research data subject to the following conditions:
1. The data is necessary to validate published, peer-reviewed research findings:
A. Ammann, M. C., F. Joos, D. S. Schimel, B. L. Otto-Bleisner, and R. A. Tomas (2007): Solar influence on climate during the past millennium: Results from transient simulations with the NCAR Climate System Model. PNAS, 104, 3713-3718, doi:10.1073/pnas.0605064103
B. Mann, M. E., S. Rutherford, E. Wahl, and C. M. Ammann (2007): Robustness of proxy-based climate field reconstruction methods. Journal of Geophysical Research, 112, D12109, 1-18, doi:10.1029/2006JD008272
C. Mann, M. E., Z. Zhang, M. K. Hughes, R. S. Bradley, S. K. Miller, S. Rutherford, and F. Ni (2008): Proxy-based reconstructions of hemispheric and global surface temperature variations over the past two millennia. PNAS, 105, 13252-13257, doi:10.1073/pnas.0805721105
The data is directly required to validate A and B. Since C relies on B for evaluating the statistical significance of the reconstruction, the data is required to validate the statistical significance calculations (and, hence, whether the results are physically meaningful) for C.
2. The data was produced under a federal award:
A. Award #8217015 (special allocation from the NCAR Directors Reserve)
B. Award #0542356 (M. E. Mann) and #8217015 (C. M. Ammann)
C. Award #0542356 (M. E. Mann)
All award numbers are National Science Foundation numbers.
3. The published research was used by a federal agency in developing an action with the force and effect of law:
A. Cited in the Technical Support Document of the previously mentioned EPA Endangerment Finding via the following synthesis report:
CCSP (2009c) Past Climate Variability and Change in the Arctic and at High Latitude. A Report by the U.S. Climate Change Program and Subcommittee on Global Change Research [Alley, R.B., Brigham-Grette, J., Miller, G.H., Polyak, L., and White, J.W.C. (coordinating lead authors)]. U.S. Geological Survey, Reston , VA, 461 pp.
C. Cited in the Technical Support Document of the previously mentioned EPA Endangerment Finding via the following synthesis report:
CCSP (2009c) Past Climate Variability and Change in the Arctic and at High Latitude. A Report by the U.S. Climate Change Program and Subcommittee on Global Change Research [Alley, R.B., Brigham-Grette, J., Miller, G.H., Polyak, L., and White, J.W.C. (coordinating lead authors)]. U.S. Geological Survey, Reston , VA, 461 pp.
The data I request is the complete time series of monthly gridded gridded monthly temperature anomalies over the 850 – 2000 AD period from the NCAR CSM 1.4 experiments used directly in Refs. A and B, and indirectly in Ref. C above. I require no specific format for the data; however, if it is available in text format using a common delimiting scheme (such as space, tab, comma, or fixed-width delimitation) I would prefer those formats.
References:
CCSP (2009c) Past Climate Variability and Change in the Arctic and at High Latitude. A Report by the U.S. Climate Change Program and Subcommittee on Global Change Research [Alley, R.B., Brigham-Grette, J., Miller, G.H., Polyak, L., and White, J.W.C. (coordinating lead authors)]. U.S. Geological Survey, Reston , VA, 461 pp. http://downloads.climatescience.gov/sap/sap1-2/sap1-2-final-report-all.pdf
EPA Endangerment Finding http://www.epa.gov/climatechange/endangerment/downloads/Endangerment%20TSD.pdf
The Muir Russell panel blatantly misrepresented the facts surrounding Jones’ notorious request to “delete any emails”, a misrepresentation that, in my opinion, was done, at a minimum, either recklessly or out of gross negligence.
The Muir Russell Report
Muir Russell’s findings on the “delete any emails” incident are contained in chapter 10 paragraph 28. Obviously the issuing of an FOI request affects the right of Jones or anyone else to delete documents. Muir Russell purported to exonerate CRU on this count on the empirical basis that the “delete any emails” request had not occurred in the context of a prior FOI request – a claim that is totally untrue.
There seems clear incitement to delete emails, although we have seen no evidence of any attempt to delete information in respect of a request already made. Two e-mails from Jones to Mann on 2nd February 2005 (1107454306.txt) and 29th May 2008 (in 1212063122.txt) relate to deletion:
2nd February 2005: “The two MMs have been after the CRU station data for years. If they ever hear there is a Freedom of Information Act now in the UK, I think I’ll delete the file rather than send to anyone”.
29th May 2008: “Can you delete any emails you may have had with Keith re AR4? Keith will do likewise. He’s not in at the moment – minor family crisis. Can you also email Gene and get him to do the same? I don’t have his new email address. We will be getting Caspar to do likewise”.
As hundreds, if not thousands of people, know, David Holland had submitted an FOI request (denoted by UEA as 08-31) on May 27, 2008, only two days prior to the “delete any emails” request, a request which covered the correspondence between Eugene Wahl and Keith Briffa that Fred Pearce described as “back-channel communications that were a direct subversion” of IPCC policies of openness and transparency.
Holland’s request initiated a flurry of activity by Climategate participants. The next day (888. 1212009215.txt), Jones emailed FOI officers Palmer and McGarvie and Briffa and Osborn stating that “Keith [Briffa] should say” that the back-channel Wahl-Briffa correspondence didn’t exist. The following day (May 29), Jones sent the notorious email (1212063122.txt) to Mann and Briffa famously asking them to “delete any emails” with Briffa regarding AR4, saying that they planned to also ask Ammann, and asking Mann to contact Wahl to delete his emails.
Holland’s prior FOI request is hardly something that the Muir Russell panel could or should have been unaware of. The UK Information Commissioner was aware of Holland’s request, commenting that it would be “impossible” to contemplate “more cogent prima facie evidence” of a section 77 offence than Jones’ email (while also regretting that poor wording of the legislation meant that the prosecution was time barred under the statute of limitations before the incident had been brought to light.)
The incident had also drawn the attention of the Parliamentary Committee, who stated that the importance of a “conclusive resolution” of the matter meant that the incident should be “thoroughly investigated” regardless of the time bar:
93… There is prima facie evidence that CRU has breached the Freedom of Information Act 2000. It would, however, be premature, without a thorough investigation affording each party the opportunity to make representations, to conclude that UEA was in breach of the Act. In our view, it is unsatisfactory to leave the matter unresolved simply because of the operation of the six month time limit on the initiation of prosecutions. Much of the reputation of CRU hangs on the issue. We conclude that the matter needs to be resolved conclusively— either by the Independent Climate Change Email Review or by the Information Commissioner.
Muir Russell’s blatant misrepresentation of the undisputed factual record on this point meant that the conclusive resolution requested by the Parliamentary Committee obviously didn’t occur.
Nor is there any evidence that Muir Russell carried out the “thorough investigation” of the matter that the Parliamentary Committee requested. The supplementary information on the inquiry website does not contain any answers, oral or written, on the “delete any emails” incident. Geoffrey Boulton and Muir Russell not only appear not to have investigated the incident “thoroughly”; they don’t appear to have investigated it at all.
The March 4 interview with Jones and Briffa by Clarke and Norton (Muir Russell not bothering to attend) was primarily about CRUTEM, but the topic of “suggestions that emails be deleted” was mentioned passim, with Jones saying that he “had not received any specific training”:
Suggestions that e-mails should be deleted
17. Prof Jones, in response to questioning, noted that he had not received any specific training on DPA/FoIA/EIR issues from the UEA.
The only other interview with Jones et al came on April 9, (09 April Jones and Briffa.pdf) this time by Geoffrey Boulton (a vigorous climate campaigner who worked for 18 years at UEA) and Peter Clarke, Muir Russell once again not bothering to attend. There is no transcript of the April 9 meeting. There are partial minutes (“Salient points”), which evidence that FOI had been raised in connection with CRU’s obstruction of Willis Eschenbach’s September 2007 request for CRUTEM data, but “time ran out”. There is no mention of the “delete any emails” request. Boulton sent some follow-up written questions to Jones and Briffa, but none touched on FOI or the notorious “delete any emails” request.
Subsequent to Jones’ March 4 claim that “he had not received any specific training on DPA/FoIA/EIR issues from the UEA”, there was a March 30 interview with FOI officials Palmer and Colam-French, who described FOI training programs for UEA staff. The minutes do not record whether Palmer and Colam-French were asked whether they agreed with Jones’ assertion that he had never been “trained” in FOI (not that a supposed lack of training would justify the “delete any emails” request.)
Rather than the Muir Russell panel “thoroughly investigating” an issue highlighted by the Parliamentary Committee, there is no record of Jones’ answering a single question about the request to “delete any emails” or the equally damning instruction that Briffa “should say” (untruthfully) that there had been no such correspondence between him and Wahl.
The UK Research Councils have a code of conduct here which lists misrepresentation among the forms of misconduct, including the following:
misrepresentation of data, for example suppression of relevant findings and/or data, or knowingly, recklessly or by gross negligence, presenting a flawed interpretation of data;
Note that misconduct arises here (as it does in society in tort law) not just from dishonesty, but through recklessness or gross negligence.
The Muir Russell panel misrepresented the fact that the “delete any emails” request came after an FOI request, leading to a “flawed” interpretation. It was an important issue – one that the Parliamentary Committee had asked them to attend to; the facts were easy to ascertain and known to thousands. If the Muir Russell panel or its members were subject to this code of conduct of the UK Research Councils or an equivalent code of conduct, in my opinion, there is convincing prima facie evidence that their misrepresentation of the facts surrounding Jones’ “delete any emails” request was done “knowingly, recklessly or by gross negligence” and would thus warrant investigation.
It is, of course, possible that the Muir Russell panel is not subject to any code of conduct and can blatantly misrepresent the facts surrounding the “delete any emails” request with impunity.
The most logical way to clear the air would be for the Parliamentary Committee to invite Muir Russell (and Oxburgh) to testify to them about their findings. They had asked Muir Russell to thoroughly investigate the ‘delete any emails’ request and to “conclusively resolve” the matter. They should invite Muir Russell (and obviously Geoffrey Boulton as well) to explain the basis of their findings on the “delete any emails” request, as well as other incongruous findings that I will report on in other posts.
In the Guardian debate, George Monbiot’s opening question (made in good faith on his part) pertained to CRUTEM, George noting that the inquiry had been able to derive a CRUTEM-like result from GHCN data and challenging me that this had somehow rebutted my “crusade” on this point.
I tried to deal with this as quickly as I could, since I did not want to waste an already short 5 minutes to deal with disinformation. My answer – which surprised Monbiot – was that CRUTEM had been little more than a passing interest at Climate Audit on which I’d seldom commented. And that Muir Russell’s finding on the triviality of CRU’s temperature unit simply endorsed a point previously made at Climate Audit. This answer seemed to baffle George and others.
Unfortunately, Monbiot and others had uncritically accepted disinformation from the Muir Russell inquiry, which, on this point (as on some others), instead of examining (with citations) actual criticisms from sources like Climate Audit, preferred instead to construct its own allegations which, in this case, they described as “broad allegations which are prevalent in the public domain”. Lucia has often criticized such Gavinesque behavior in other contexts.
My long-standing position on CRUTEM was that CRU’s obstruction of data requests was most likely due to its desire to conceal that it did so little work on quality control; that the CRU result could be derived so trivially that, in effect, CRU no longer served any useful function in this field. Long before Climategate, I’d recommended that CRU’s responsibilities in this field be transferred to the UK Met Office and that the US Department of Energy re-allocate its funding in this area to improvements at GHCN – a point that should be considered carefully in the US DOE review of their funding of CRU (reported by Jonathan Leake here.)
At the Guardian panel, I observed that CRUTEM was an almost microscopically small issue in the Climategate emails – Climategate was about the Hockey Stick and its handling by IPCC, not CRUTEM. CRUTEM was mentioned in only 25 emails and, even then, often passim.
I’ll review some past CA posts to provide support for this.
In early 2007 here, I’d observed that the HadCRU series for gridcell 57N 77E (containing the single Siberian station, Barabinsk) could be derived from a simple anomaly calculation from a GHCN version of Barabinsk. At the time, we didn’t know what stations CRU used, but, in this case, I observed that the CRU calculation was straightforward – unlike GISS, which had all sorts of weird smoothing and adjusting, which were then a topic of interest. This post contains some interesting plots of differences between various versions for the gridcell and station – the understanding of these differences has underpinned the desire to examine data as used by the various agencies.
In late 2008, long before my own FOI requests for CRU station data, I discussed CRU calculations in more detail here, concluding with the observation that “if, like GISS, they are doing nothing other than trivial sums on GHCN data, one feels that the money would be better spent on beefing up QC and data collection at GHCN.”
The reverse engineering of CRUTEM3 looks almost pathetically easy given that we’ve already waded through step 0 of GISS, where they collate different GHCN versions (dset0) into a single station history (dset1.) CRU doesn’t have the bewildering sequence of smoothing operations that Hansen uses at multiple stages (though Hansen, mercifully, doesn’t use Mannian butterworth smoothing).
To my knowledge, unlike GISS, CRU does not make the slightest attempt to adjust for UHI, relying instead on articles like Jones et al 1990 purporting to show that UHI doesn’t “matter”.
We can already emulate GISS step0 – not that it makes any sense, but it provides a benchmark. Here’s all that seems to be necessary to produce a gridded CRUTEM3 series given a dset1 data set. First, create an anomaly-version of the series. I have a simple function anom on hand and this could be done as follows:
dset1.anom=apply(dset1,2,anom)
Then one could make an average of dset1 series within gridcell i as follows, where info is an information dataset in my usual style containing for each station, inter alia, its lat, long and gridcell number (called “cell” here):
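grid=matrix(NA,nrow(dset1.anom),2592)  # one column per gridcell
for (i in 1:2592) grid[,i]=apply(dset1.anom[,info$cell==i,drop=FALSE],1,mean,na.rm=TRUE)  # drop=FALSE keeps single-station cells as a matrix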
This would yield the CRUTEM3 series. My guess as to why they don’t want to show their work is because they probably use hundreds of lines of bad Fortran code to do something that you can do in a couple of lines in a modern language. Anyway, I’ll experiment with this at some point, but this is my hypothesis on all that’s required to emulate CRUTEM3. CRU has been funded by the US DOE; if, like GISS, they are doing nothing other than trivial sums on GHCN data, one feels that the money would be better spent on beefing up QC and data collection at GHCN.
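For readers who want to try the idea outside R, the two steps in the quoted post (an anomaly for each station, then an average within each gridcell) can be sketched in plain Python. This is only an illustration under stated assumptions: it is not CRU's code and not the anom function referred to in the post. The names anom and grid_average, the (year, month) layout of the data and the use of 1961-1990 monthly means as the base period are all hypothetical choices made for the example. The 2592 in the loop is simply the number of cells in a 5 x 5 degree grid (72 x 36).

```python
def anom(values, times, base=(1961, 1990)):
    """Monthly anomalies: subtract each calendar month's base-period mean.

    values: list of floats, with None for missing months.
    times:  list of (year, month) tuples, same length as values.
    The 1961-1990 default base period is an assumption for this sketch.
    """
    sums, counts = {}, {}
    for v, (year, month) in zip(values, times):
        if v is not None and base[0] <= year <= base[1]:
            sums[month] = sums.get(month, 0.0) + v
            counts[month] = counts.get(month, 0) + 1
    out = []
    for v, (_, month) in zip(values, times):
        if v is None or month not in counts:
            out.append(None)  # no observation, or no normal for this month
        else:
            out.append(v - sums[month] / counts[month])
    return out


def grid_average(station_anoms, station_cell):
    """Unweighted mean of station anomalies per gridcell, skipping missing values.

    station_anoms: {station_id: [anomaly or None, ...]}, all the same length.
    station_cell:  {station_id: gridcell number} (hypothetical lookup table).
    """
    by_cell = {}
    for sid, series in station_anoms.items():
        by_cell.setdefault(station_cell[sid], []).append(series)
    grid = {}
    for cell, members in by_cell.items():
        means = []
        for obs in zip(*members):  # one tuple of station values per time step
            present = [v for v in obs if v is not None]
            means.append(sum(present) / len(present) if present else None)
        grid[cell] = means
    return grid
```

A gridcell with a single station (as with Barabinsk) simply returns that station's anomaly series, which is why such a cell is the easiest place to compare against a published gridded value.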
We re-visited this issue in the Mole post last summer. I observed in a comment (along the lines of my 2008 post):
Nowhere have I encouraged readers to expect any smoking guns in this data set. Quite the opposite. My own best guess as to why they are so obstructive about the data is the specific commercial interest of CRU. My guess is that they spend negligible time on quality control, but derive a lot of funding for a prestigious data set and use the funds for other purposes. They don’t want anyone to see how simplistic their analysis is and how negligible their quality control. Nothing more, nothing less. (But that’s just a guess. The real reason may be different again.)
Reader Adam observed:
The whole CRUTEM / HadCRUT gridded series can be easily reproduced for the most of the globe with the GHCN dataset. I’ve tried it for some gridcells and it worked.
This is not actually true, though it holds for some gridcells (as I’d observed for the Barabinsk gridcell.) For example, HadCRU includes SST data, which is not in the GHCN land data set. In addition, Jones has his own 1961-1990 “normals” that he uses for standardization and an exact replication of CRUTEM cannot be accomplished without these “normals”, though the calculation can be approximated using freshly calculated normals. (Oddly enough, I have a copy of the CRUTEM2 normals from my 2002 correspondence with Jones – before I’d been blacklisted because of the MM2003 criticism of MBH98). In response to Adam, I observed:
I agree with your comments. Like you, I believe that 95% of CRU is obtained from GHCN, with a very few non-GHCN sources, of which Austria is one (Norway, Sweden, Denmark are others.) Like you, I believe that they do relatively trivial manipulations of GHCN data. As I’ve said elsewhere, that is my best guess as to the secret that they don’t want exposed and the only commercial interest that they are protecting.
Also see my post from earlier this year discussing the “end of CRUTEM” and the desirability of this responsibility being taken over by the Met Office, a post in which I review earlier comments to this effect going back a few years.
A couple of other posts at the time of the Mole incident were here and Dr Phil, Confidential Agent, in which I observed that Jones was a temperature accountant:
Jones has spent much of his academic career as a sort of temperature accountant. Commencing in the early 1980s, he collected station data and compiled averages – a useful enterprise, but surely no more than accounting.
Muir Russell’s second of three “broad allegations which are prevalent in the public domain”:
That CRU adjusted the data without scientific justification or adequate explanation. Some allegations imply that this was done to fabricate evidence for recent warming.
I don’t know who they had in mind here, as they’ve followed the Gavin Schmidt practice of not providing a citation. And perhaps somebody somewhere has made an allegation in this form. But this is not an allegation that was made at Climate Audit. My own surmise was not that CRU had adjusted the data, but that they hadn’t adjusted the data for UHI – a surmise that has been verified.
The criticism from Climate Audit was that (1) CRU provided their station data as collated to “friends” but not to potential critics; and (2) that their excuses for not providing station data were what one London reporter (not Jonathan Leake) described to me as “deliberately deceptive”.
Muir Russell did not directly address either issue. Instead, they re-framed both questions.
Postscript: In their Appendix 7, Muir Russell say that they were able to make a concordance of 90% of CRU stations to GHCN stations despite the lack of such a concordance by CRU. This is a lower percentage than the concordance (95.6%) that I’d placed online in December 2007.
Muir Russell stated:
31. The Review Team was able to match 90% of stations given in the CRU list to GHCN (see Appendix 7). CRU has stated in a written submission [15] that that the remaining 10% can be obtained from other sources including the NMOs. Thus substantial work is required to take the CRU published list and assemble 100% of the primary station data from global repositories and NMOs. We make a recommendation for the future below.
When Willis Eschenbach managed to get the list of CRU stations as used, I did semi-automated matching of CRU information to GHCN information, discussed the matter here and posted up my concordance in Dec 2007 here, in which I’d matched 95.6% of CRU stations to GHCN sources.
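To give a sense of what a semi-automated concordance involves, here is a rough Python sketch. It is not the procedure used for the 2007 concordance, whose details are not given above. It assumes, purely for illustration, that each station record is a dict with hypothetical keys id, lat and lon, that a shared identifier is tried first, and that a nearest-neighbour search within a distance tolerance (max_km, also an assumed parameter) is the fallback.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    r = 6371.0  # mean Earth radius in km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def match_stations(cru, ghcn, max_km=25.0):
    """Return {cru_id: matched ghcn_id, or None if nothing qualifies}.

    Step 1: exact identifier match. Step 2: nearest GHCN station within max_km.
    Both the record layout and the tolerance are assumptions for this sketch.
    """
    ghcn_ids = {g["id"] for g in ghcn}
    matches = {}
    for s in cru:
        if s["id"] in ghcn_ids:
            matches[s["id"]] = s["id"]
            continue
        best_id, best_km = None, max_km
        for g in ghcn:
            d = haversine_km(s["lat"], s["lon"], g["lat"], g["lon"])
            if d <= best_km:
                best_id, best_km = g["id"], d
        matches[s["id"]] = best_id
    return matches


def match_rate(matches):
    """Fraction of stations for which a match was found."""
    return sum(v is not None for v in matches.values()) / len(matches)
```

The unmatched remainder is what needs manual inspection (station names, record lengths, overlapping values), which is where the "semi" in semi-automated comes in.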
Sitting on the dais at the Guardian panel, it seemed to me that the most remarkable moment came when the audience laughed out loud at Trevor Davies about Muir Russell. Usually, you can’t hear this sort of thing from an audience, but you could hear it loud and clear at 1:29:12. The Guardian laundered this episode in their account, but it sure seemed like Davies totally lost the audience at this point.
In my talk, I reported Muir Russell’s failure to attend either of the evidence-taking interviews with Phil Jones following the unveiling of the Muir Russell panel (on Feb 11, 2010), a point previously reported at CA here, as follows:
Muir Russell was due to report in spring 2010, but as of the start of April, nobody at CRU had been interviewed on anything to do with the Hockey Stick or IPCC. In fact, Muir Russell does not appear to have even met with Jones or Briffa after the unveiling of the Muir Russell panel in February. Muir Russell didn’t even bother attending the one and only interview with Jones and Briffa on the Hockey Stick and IPCC on April 9. Nor did two other panellists.
Thus, Muir Russell’s non-attendance at the Jones’ interviews was an issue that I’d prominently raised at CA immediately prior to the Guardian debate and you’d think that Trevor Davies, representing the university and the inquiries, would have at least taken a look at Climate Audit given that I was attending and anticipated this issue being raised.
However, the issue, when raised by Sunday Times reporter Jonathan Leake, caused Davies to implode. Here’s an approximate transcript leading up to the Davies’ incident. It started with a question from Jonathan Leake of the Sunday Times, who, oddly enough, Monbiot recognized only as a “gentleman from the front”. Leake (approximate transcript):
Steve, in your presentation, you seemed to say that Phil Jones wasn’t interviewed by Muir Russell at all in person. He was interviewed by some other members of the panel. If that is true, who did? And it seems astonishing that the chairman of the inquiry did not interview .. if that is true, perhaps Trevor Davies can tell us who did… it seems remarkable.
Leading to the following exchange:
McIntyre – I’m going from the minutes of the report. In December, Muir Russell arrived; they had 8 meetings that day, one of which was between Muir Russell and Phil Jones, accompanied by Trevor Davies, at which I presume that no evidence was taken. In January, there was an exploratory meeting. The panel announced on Feb 11, there were two meetings with Phil Jones after that – one on March 4, as I recall, between Norton and Clarke covering CRUTEM series. And other one on April 9 with Boulton and Clarke again, covering the Hockey Stick issues and IPCC. Muir Russell didn’t attend either of the two meetings with Phil Jones after the unveiling of the panel, and which were the only two meetings at which any evidence was taken… Yes, it bewilders me that a responsible chairman of an inquiry did not attend the only material interviews with the people involved in the whole affair. Muir Russell did however have extensive meetings with administrative staff
Monbiot: Trevor, does that chime…
Davies- My memory for these details is not as good as Steve’s. He has confirmed that Muir Russell did indeed interview Phil Jones and in my list, he interviewed Phil Jones. And this info is in the back of Muir Russell report.
McIntyre – Not after the panel was announced and not where any evidence was taken.
Monbiot (speaking to Davies) – can you contradict that?
Long pause –
McIntyre – he can’t contradict that.
Monbiot – just a minute Steve
Davies – Steve will have to remind me when the panel was actually announced…
Uproarious laughter and applause.
McIntyre – Feb 11, Trevor
Davies – yes…
Long pause….
Davies – Steve appears to be…
McIntyre – it’s on page 92 or so…
Davies: there were interviews between Muir Russell, the chairman, and Phil Jones. Later on those interviews were done by the specialists.
Monbiot – when did those interviews take place between Muir Russell and Phil Jones?
Davies – the last one that I can see is on January 27.
Another not-so-good moment for the defenders of the Team that can’t shoot straight.
After the panel, the Guardian held a very pleasant reception at a nearby bar. I walked over with Davies, observing that it couldn’t have been much fun for him the last number of months. He took some offence that Climate Audit had described him as “ever-present” [in the UEA's post-Climategate dealing with the public]. I laughed (I think) cheerfully at this criticism, observing that, in his shoes, I’d have tried to be “ever-present” as well and thus this was hardly a criticism.
I thought about this point when I got home. “Ever-present” is not a word that I use a lot. In fact, I’d only used the adjective once in connection with Trevor Davies.
This, ironically, is the very post in which I itemized Muir Russell’s non-attendance at the Jones’ interview in meticulous detail. Davies presumably had to have read the post in order to object to being described as “ever-present”, but apparently hadn’t taken notice of the post’s actual content – Muir Russell’s non-attendance.
At the reception, Davies challenged me in front of a reporter to withdraw some supposedly inaccurate statements at Climate Audit about the Oxburgh inquiry terms of reference. The reporter seemed to want me to make the changes on the spot at the reception. I said that I would look at the matter when I got home and that I would be more than willing to correct any inaccuracies. They sent me a copy of a statement that they released after Roger Harrabin’s story on the matter, but, thus far, have not responded to two requests to identify what, if anything, at CA requires correction. More on this on another occasion.
Demetris Koutsoyiannis was at two sessions of the 11th International Meeting on Statistical Climatology in Edinburgh last week. The purpose of the meetings is:
The aim of IMSC is to promote good statistical practice in the atmospheric and climate sciences and to maintain and enhance the lines of communication between the atmospheric and statistical science communities.
Geoffrey Boulton’s associate, Gabrielle Hegerl, was an organizer. One of the important sessions was entitled “Reconstructing and understanding climate change over the Holocene”.
Demetris’ presentation is online here, poster here. I urge readers to look closely at his interesting example showing how very high autocorrelation can arise from a compound stochastic process consisting of:
(1) random changes of level persisting for exponentially distributed lengths;
(2) white noise.
It’s a different sort of stochastic process than the all-too-artificial ARMA processes that dominate present-day analyses and well worth paying attention to.
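For readers who want to see the effect for themselves, here is a minimal sketch of a process of this general type. It is not Demetris’ own code or his exact example: the function names, the Gaussian distribution of the levels and all parameter values (a mean persistence of 50 steps, level standard deviation 1.0, noise standard deviation 0.5) are assumptions chosen purely for illustration. Durations are drawn from an exponential distribution and rounded to whole time steps.

```python
import random


def simulate_level_shift_series(n, mean_duration=50.0, level_sd=1.0,
                                noise_sd=0.5, seed=0):
    """Random levels held for exponentially distributed durations, plus white noise.

    All parameter values are illustrative assumptions, not taken from the
    presentation.
    """
    rng = random.Random(seed)
    series = []
    while len(series) < n:
        # (1) draw a new level and how long it persists
        level = rng.gauss(0.0, level_sd)
        duration = max(1, round(rng.expovariate(1.0 / mean_duration)))
        for _ in range(duration):
            # (2) add independent white noise at every time step
            series.append(level + rng.gauss(0.0, noise_sd))
    return series[:n]


def autocorrelation(x, lag):
    """Sample autocorrelation of the sequence x at the given lag."""
    n = len(x)
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x)
    cov = sum((x[i] - mean) * (x[i + lag] - mean) for i in range(n - lag))
    return cov / var


series = simulate_level_shift_series(20000)

# Pure white noise of the same length, for comparison
noise_rng = random.Random(1)
white = [noise_rng.gauss(0.0, 1.0) for _ in range(20000)]

for lag in (1, 10, 50):
    print(lag, round(autocorrelation(series, lag), 3),
          round(autocorrelation(white, lag), 3))
```

Under these assumed settings the compound series stays strongly correlated over many lags while the white noise comparison does not, even though neither ingredient is an ARMA process. Lengthening `mean_duration` or shrinking `noise_sd` should push the autocorrelation higher still.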
Demetris wrote me to say that he attended on Tuesday and Wednesday, including the session “Reconstructing and understanding climate over the Holocene” (see p. 18 in the “final program and logistics”), commending the first presentation by Heinz Wanner (Holocene climate change – facts and mysteries).
He said that Mann’s talk included an interesting cartoon with an ensemble of hockey sticks, one of which is being broken by an angry guy.
He reported that Climategate came up in a talk by Reinhard Böhm, who, according to Demetris, “promoted a thesis that it is dangerous to put raw data open on the internet because some they would misuse them.” Demetris said that, in his own talk the next day, he tried to respond to this (and “to comment on Mann’s cartoon with the ensemble of hockey sticks–but Mann wasn’t there”).
Demetris sends the following extended commentary on his exchange with Böhm.
One of the interesting talks I attended at the 11th International Meeting on Statistical Climatology (http://cccma.seos.uvic.ca/imsc/11imsc.shtml, University of Edinburgh, Scotland, 12-16 July 2010) was that by Reinhard Böhm. He is the author of a recent (2008) book in German, “Heiße Luft – Reizwort Klimawandel: Fakten – Ängste – Geschäfte” (Hot air: the climate change controversy – facts – fears – funding). His talk was given in the session “Climate Data Homogenization and Climate trend/Variability Assessment” and was entitled “Bridging the gap from indirect to direct climate data – experience with homogenizing long climate time series in the early instrumental period”. The long abstract can be seen on p. 90 of the book of abstracts, accessible from http://cccma.seos.uvic.ca/imsc/11imsc/final_program_abstracts.pdf. In his talk he referred to Climategate and discussed the question whether original climatic data should be available to the public or not. His main point was that the original data are contaminated with biases and inhomogeneities and thus need homogenization. Therefore, only processed data are useful and should be available to the public.
I disagree with this thesis and I addressed three questions to him at the end of his talk:
1. If I homogenize a data set of an area, do you think that there might be a possibility that I introduce more biases than were originally contained?
2. If you studied the climate of that area would you rely solely on my processed data or would you retrieve also the original data?
3. Do you think that the original data should be available to the interested scientists or not?
To my question 1 he replied “yes”, which I appreciate, given that I believe that standard procedures for consistency checking and homogenization are strongly affected by inappropriate statistical assumptions (e.g. iid variables with exponential distribution tails), which are invalidated in the real world. To question 2 he replied that if I give an explanation of the procedures I followed he would rely on my processed data. About question 3 he said (if I understood well) that a reply would need a long time, but in brief the raw data should not be available on the internet because some could misuse them, e.g. choosing only a few stations that demonstrate a specific behaviour that they want to advocate.
I was not happy about the last answer and, next day, I found the opportunity to reply indirectly, using the first slide of my own talk (http://www.itia.ntua.gr/en/docinfo/991/), which contains the title and the web link to my presentation. I said that the online availability is not just for this presentation. Rather, in my group we believe in transparency and have agreed that everything we produce, papers, reports, data, etc., should be openly available on the internet. And I continued “Please feel free to misuse them but, also, please be advised that transparency is the most powerful weapon against misuse”.
Stephen Schneider was only a few years older than me and his death seems all too early.
I had a fair bit of contact with him by email in 2004. He seemed very cheerful – a characteristic that I respect – and certainly much more likely to be good company than the fellow climate scientists that I was then encountering – a point that Ross and I discussed at the time. Schneider recalled the exchange in his recent book – a recollection that, unfortunately, was totally inaccurate.
My original contact with Schneider came in the wake of MM2003. He had severely criticized Energy & Environment for not letting Mann review our 2003 article. In keeping with that premise, he asked me to review a 2004 submission to Climatic Change by Mann et al responding to MM2003 – consistent with his public representations. It seemed to me that there was an inherent conflict of interest in such a review but this was obviously known to Schneider and I attempted to separate out my interests as a disputant from my obligations as a reviewer as much as possible.
At the time, I was very fresh to academic exchanges – this was long before Climate Audit. I’d never reviewed an academic article and my approach was informed by ideas of due diligence that were not then characteristic of academic peer reviewing. In my capacity as a reviewer, I asked to see supporting data for Mann’s supposed rebuttal to MM2003 – the topic of his submission – and to see source code to document his allegations that we’d supposedly made grievous mistakes in implementing his methodology – again an important aspect of his submission. (BTW this was all shortly after our 2004 submission to Nature.)
Schneider replied that he had been editor of Climatic Change for 28 years and, during that time, nobody had ever requested supporting data, let alone source code, and he therefore required a policy from his editorial board approving his requesting such information from an author. He observed that he would not be able to get reviewers if they were required to examine supporting data and source code. I replied that I was not suggesting that he make that a condition of all reviews, but that I wished to examine such supporting information as part of my review, was willing to do so in my specific case (and wanted to do so under the circumstances) and asked him to seek approval from his editorial board if that was required.
This episode became an important component of Climategate emails in the first half of 2004. As it turned out (though it was not a point that I thought about at the time), both Phil Jones and Ben Santer were on the editorial board of Climatic Change. Some members of the editorial board (e.g. Pfister) thought that it would be a good idea to require Mann to provide supporting code as well as data. But both Jones and Santer lobbied hard and prevailed on code, but not data. They defeated any requirement that Mann supply source code, but Schneider did adopt a policy requiring authors to supply supporting data.
I therefore re-iterated my request as a reviewer for supporting data – including the residuals that Climategate letters show that Mann had supplied to CRU (described as his “dirty laundry”). The requested supporting data was not supplied by Mann and his coauthors and I accordingly submitted a review to Climatic Change, observing that Mann et al had flouted the new policy on providing supporting data. The submission was not published. I observed on another occasion that Jones and Mann (2004) contained a statement slagging us, based on a check-kiting citation to this rejected article.
During this exchange, I attempted to write thoughtfully to Schneider about processes of due diligence, drawing on my own experience and on Ross’ experience in econometrics. The correspondence was fairly lengthy; Schneider’s responses were chatty and cordial and he seemed fairly engaged, though the Climategate emails of the period perhaps cast a slightly different light on events.
Following the establishment of a data policy at Climatic Change, I requested data from Gordon Jacoby – which led to the “few good men” explanation of non-archiving (see CA in early 2005) and from Lonnie Thompson (leading to the first archiving of any information from Dunde, Guliya and Dasuopu, if only summary 10-year data inconsistent with other versions.) Here Schneider accomplished something that almost no one else has been able to do – get data from Lonnie Thompson, something that, in itself, shows Schneider’s stature in the field.
It was very disappointing to read Schneider’s description of these fairly genial exchanges in his book last year. Schneider stated:
The National Science Foundation has asserted that scientists are not required to present their personal computer codes to peer reviewers and critics, recognizing how much that would inhibit scientific practice.
A serial abuser of legalistic attacks was Stephen McIntyre a statistician who had worked in Canada for a mining company. I had had a similar experience with McIntyre when he demanded that Michael Mann and colleagues publish all their computer codes for peer-reviewed papers previously published in Climatic Change. The journal’s editorial board supported the view that the replication efforts do not extend to personal computer codes with all their undocumented subroutines. It’s an intellectual property issue as well as a major drain on scientists’ productivity, an opinion with which the National Science Foundation concurred, as mentioned.
This was untrue in important particulars and a very unfair account of our 2004 exchange. At the time, Schneider did not express any hint that the exchange was unreasonable. Indeed, the exchange had the positive outcome of Climatic Change adopting data archiving policies for the first time.
To further evidence Schneider’s lack of objection to my conduct as a reviewer at the time, a year later, Schneider invited me once again to act as a reviewer, this time as reviewer of Wahl and Ammann 2004 2005 2006 2007. Needless to say, this once again featured heavily in the Climategate letters. Its story was nicely told by Andrew Montford as “Caspar and the Jesus Paper” – an account that preceded the Climategate Letters. In this case, the experience was not as cordial. (Schneider’s cancer had been reported publicly just before the invitation to review Wahl and Ammann, but I was unaware of his illness until his death.)
Once again, the role of a reviewer was an odd one due to the conflict of interest. Again, I tried to separate as much as possible my adverse interest as someone being criticized from my obligations as a reviewer. In this case, there was much in Wahl and Ammann that could be objectively criticized (e.g. the check-kiting of Ammann and Wahl, submitted to GRL and rejected, and the later replacement of all references to this article by a later article, Ammann and Wahl 2007, not even submitted as at the time of the supposed acceptance of Wahl and Ammann, which was in the last few hours of the last day, with the references to the still unaccepted and soon rejected Ammann and Wahl companion paper very much a loose end).
Climategate documents show that Phil Jones was also a reviewer of Wahl and Ammann, observing:
This paper is to be thoroughly welcomed and is particularly timely with the next IPCC assessment coming along in 2007.
My review was less positive. Schneider terminated me as a reviewer and I didn’t have much further correspondence with him. I did write to him recently pointing out that, although I was included on his blacklist of scientists who had signed various petitions that he disapproved of, I had not actually signed any of the petitions. He replied, in effect, that the public blacklist at Anderegg’s website differed from the private blacklist used for the PNAS article and that I had not been included in the private blacklist, as though that resolved the matter.
Schneider repeatedly invoked medical metaphors in order to urge deference by the public to climate scientists.
In one of his last statements, he said:
It is completely inappropriate, if there’s an announcement of the new cancer drug for pediatric leukemia [with] a panel of three doctors from various hospitals, to then give equal time to the president of the herbalist society, who says that modern medicine is a crock. They wouldn’t even put that person on the air, so why put on petroleum geologists—who know as much about climate as we climatologists know about drilling for oil—because they’ve studied one climate change a hundred million years ago?
In his recent book, he made a similar point:
If all scientists are created equal, then all MDs are likewise equivalent. So I’ll ask my podiatrist to prescribe my heart medicine and ask my cardiologist – who hasn’t touched a scalpel in 30 years – to take off my bad toe nail. My point, of course, is that these are not climate experts, as they do not represent a community expert in the details of climatology. A petroleum geologist can no more tell us about cloud feedback than a climatologist could competently tell us about oil reserves. (p. 146)
Nonetheless, in his own valiant battle against his disease, Schneider did not passively accept dicta from authority, but sought to understand the details as best he could, describing himself as “The Patient from Hell”:
To increase the odds against the disease, mantle cell lymphoma, Dr. Schneider, 60, involved himself in every aspect of his treatment. How he pushed his doctors to experiment with new techniques to control the cancer is the subject of a book he has just completed, tentatively titled “The Patient From Hell: Getting the Best That Modern Medicine Can Offer.” Da Capo Press/Perseus is to publish it in the fall.
As I noted above, at his best, Schneider was engaging and cheerful – qualities that I prefer to remember him by. I was unaware of his personal battles or that he ironically described himself as “The Patient from Hell” – a title that seems an honorable one.
Bishop Hill has cheered up readers with Josh’s cartoons. Josh has started a blog as an album of his climate cartoons here. Here is his most recent entitled The Three Stooges – two lords and an aspiring lord (now merely a sir) – Muir Russell, Oxburgh and Acton.
Arrived back in Canada last evening after a very stimulating and interesting trip to England – two speeches, lots of questions and interviews.
First thanks to the over 250 CA readers who chipped in to support the trip. I ended up not simply with my trip paid for, but a nice appearance fee.
Second, thanks to David Holland and to Josh of Cartoons by Josh (and particularly to their wives, Kate and Liz), each of whom billeted me for 3 nights. I don’t think that I’ve ever billeted with anyone before. One of the drawbacks of business travel in hotels (which isn’t usually to conferences) is that you spend a lot of time by yourself and it can be a bit lonely. Also thanks to Richard Drake for arranging a couple of interesting meetings. It was fun to have excellent company throughout.
Also thanks to the organizers of the two presentations – the Guardian and the Global Warming Foundation. Both were gracious hosts. The Guardian representatives were very impressed by the generosity of CA readers in contributing to my expenses. I thanked them for not paying my expenses, as CA readers were far more generous than they would have been. (Like all newspapers, they are under severe financial constraints, and, in their shoes, I wouldn’t have paid my expenses either.)
I checked into the blogs briefly from time to time but haven’t posted or commented for a week. I paid attention to one observer’s helpful suggestion following the GWF speech and used the lectern at the Guardian panel the next day. He was right. I’m amazed by Anthony’s stamina in keeping up trip reports from Australia, as I generally had no energy left by the end of the day to even think about making reports.
The trip was occasioned by the Guardian’s decision to convene a panel discussion following the release of the report from the Muir Russell inquiry – only a few weeks after convening a panel on the release of Fred Pearce’s book documenting his more diligent inquiry into Climategate. I had decided to make the trip only on July 2. The Muir Russell report was released on the morning of Wednesday, July 7 (with its supplementary information later in the day). I left on Saturday, July 10 and spent most of the intervening few days digesting the Muir Russell report and getting ready for the trip. It takes me quite a bit of time and concentration to organize speeches. It’s not like I have one stump speech that I can give over and over (amortizing the prep effort, so to speak). In this case, the inquiries were the topic of interest and this was new territory for presentation effort, especially Muir Russell, where I’d only done a couple of blog posts scratching the surface. Thus, I was preoccupied right up to both presentations, editing and re-editing right up to show time in each case.
Other than prepping for my two appearances, I didn’t do much writing or notetaking during my trip. On Monday, Thursday and Friday, I’d arranged for a number of meetings and interviews (which were mostly background meetings and not ones that I plan to document.) So I was busy all week.
One major regret of the trip is that I didn’t get to really spend time with the many Climate Audit readers that attended the two presentations. Many readers came to say hello after the presentations, but, in each case, I was whisked off by organizers of the occasion to post-event get-togethers. While these occasions were fun, if I do this sort of trip again, I’ll try to arrange a venue where CA is, in effect, the host, and my obligation is to readers, rather than hosts.
London is, of course, a very gracious city to visit. It’s just that I ended up being pretty busy for the entire trip and didn’t do any sightseeing, other than choosing to sometimes walk from place to place. Billeting also meant that I got more of the perspective of living in the London area if you don’t start your day from a downtown hotel. You have to get from your house to a train station – the number of people using trains is large enough that service is very frequent. Then half an hour or so to one of the hub train stations – in my case, Euston for the first half, Waterloo for the second half. Then, transfer to the tube, which typically gets you to about a 10-minute walk from where you want to go. Then, walk. In Toronto, business offices are typically located right by a subway stop, so, even if you take the subway downtown, you don’t walk as much as in London, where the offices are lower rise and seldom accessible without the standard 10-minute walk.
On Thursday and Friday, I had a little extra time and walked to/from Waterloo to some destinations – about 25-40 minutes depending on whether I (shall-we-say) detoured or not. The weather was ideal for walking – low 20s deg C and partly cloudy. London streetscapes along Regent St to Piccadilly Circus, to Trafalgar Square to the Strand and across to the South Bank are the sort of thing that I like to do as a tourist, so even going from place to place was not necessarily a chore.
I’ll post on the presentations separately, but wanted first to thank my hosts.
As posted by Latimer Alder in my previous post:
Just back from the Climategate debate run by the Guardian tonight. We’re assured that the Guardian website will have a full video of the whole proceeding sometime tomorrow. So just some very sketchy impressions.
Steve obviously read the remarks from last night’s meeting and insisted on speaking from a lectern. This was a good move as it gave him more ‘authority’. And he was (mostly) crisper…making his points more directly. The others spoke while seated.
George Monbiot chaired the meeting and I think he did a fair job of it. He tried hard to be unbiased, and only once or twice strayed into partisan territory. And he managed to keep the speeches and questions mostly to time and to the point.
Fred Pearce took a longer perspective than the others. He spoke well and described Climategate as a tragedy rather than a conspiracy…the tragedy being that the CRU guys had adopted siege mentality. Climategate has certainly widened his perspective.
Trevor Davies representing UEA/CRU was appallingly bad. He mouthed platitudes by the shedload, but was unfamiliar with the details of any of the subjects likely to be raised. And was several times embarrassed by doing so. Apart from the fact that he had a sharp suit, I can find nothing positive to say about him. Struck me as a devious smooth cove.
UPDATES
See the Guardian story at: http://www.guardian.co.uk/environment/2010/jul/15/uea-hacked-emails-climate-change
Video of the event is available here: http://www.guardian.co.uk/environment/blog/2010/jul/15/climategate-public-debate
The audio of the debate is up. http://www.guardian.co.uk/environment/audio/2010/jul/15/guardian-climategate-hacked-emails-debate
I’m doing some housekeeping in Steve’s absence.
Maurizio Morabito attended the conference with Steve McIntyre and David Holland today, and has some short notes on the conference here.
When/if video is available of the press conference, it will be posted here and at WUWT. – Anthony