Uncertainty in the Time of COVID-19, Part 2

Part 2: How Do We Know What We Know?

When a new pathogen first shows up to threaten human lives, ignorance dominates knowledge. The faster we retire our ignorance and maximize our knowledge, the better our response to any novel threat. The good news is that knowledge of what is happening during the current COVID-19 pandemic is accumulating more rapidly than it did during the SARS outbreak, in part because we have new tools available, and in part because Chinese clinicians and scientists are publishing more, and faster, than in 2003. And yet there is still a great deal of ignorance about this pathogen, and that ignorance breeds uncertainty. While it is true that the virus we are now calling SARS-CoV-2 is relatively closely related genetically to the SARS-CoV that emerged in 2002, the resulting disease we call COVID-19 is notably different from SARS. This post will dig into what methods and tools are being used today in diagnosis and tracking, what epidemiological knowledge is accumulating, and which error bars and assumptions are missing, misunderstood, or simply wrong.

First, in all of these posts I will keep a running update of good sources of information. The Atlantic continues its excellent reporting on the lack of testing in the US by digging into the decision-making process, or lack thereof, that resulted in our current predicament. I am finding it useful to read the China CDC Weekly Reports, which constitute source data and anecdotes used in many other articles and reports.

Before diving in any further, I would observe that it is now clear that extreme social distancing works to halt the spread of the virus, at least temporarily, as demonstrated in China. It is also clear that, with widespread testing, the spread can also be controlled with less severe restrictions — but only if you assay the population adequately, which means running tests on as many people as possible, not just those who are obviously sick and in hospital.

Why does any of this matter?

In what follows, I get down into the weeds of sources of error and of sampling strategies. I suggest that the way we are using tests is obscuring, rather than improving, our understanding of what is happening. You might look at this, if you are an epidemiologist or public health person, and say that these details are irrelevant because all we really care about are actions that work to limit or slow the spread. Ultimately, as the goal is to save lives and reduce suffering, and since China has demonstrated that extreme social distancing can work to limit the spread of COVID-19, the argument might be that we should just implement the same measures and be done with it. I am certainly sympathetic to this view, and we should definitely implement measures to restrict the spread of the virus.

But it isn’t that simple. First, because the population infection data is still so poor, even in China (though perhaps not in South Korea, as I explore below), every statement about successful control is in actuality still a hypothesis, yet to be tested. Those tests will come in the form of 1) additional exposure data, such as population serology studies that identify the full extent of viral spread by looking for antibodies to the virus, which persist long after an infection is resolved, and 2) carefully tracking what happens when social distancing and quarantine measures are lifted. Prior pandemics, in particular the 1918 influenza episode, showed waves of infections that recurred for years after the initial outbreak. Some of those waves are clearly attributable to premature reduction in social distancing, and different interpretations of data may have contributed to those decisions. (Have a look at this post by Tomas Pueyo, which is generally quite good, for the section with the heading “Learnings from the 1918 Flu Pandemic”.) Consequently, we need to carefully consider exactly what our current data sets are teaching us about SARS-CoV-2 and COVID-19, and, indeed, whether current data sets are teaching us anything helpful at all.

What is COVID-19?

Leading off the discussion of uncertainty are differences in the most basic description of the disease known as COVID-19. The list of observed symptoms — that is, visible impacts on the human body — from the CDC includes only fever, cough, and shortness of breath, while the WHO website list is more expansive, with fever, tiredness, dry cough, aches and pains, nasal congestion, runny nose, sore throat, or diarrhea. The WHO-China Joint Mission report from last month (PDF) is more quantitative: fever (87.9%), dry cough (67.7%), fatigue (38.1%), sputum production (33.4%), shortness of breath (18.6%), sore throat (13.9%), headache (13.6%), myalgia or arthralgia (14.8%), chills (11.4%), nausea or vomiting (5.0%), nasal congestion (4.8%), diarrhea (3.7%), hemoptysis (0.9%), and conjunctival congestion (0.8%). Note that the preceding list, while quantitative in the sense that it reports the frequency of symptoms, is ultimately a list of qualitative judgements by humans.

The Joint Mission report continues with a slightly more quantitative set of statements:

Most people infected with COVID-19 virus have mild disease and recover. Approximately 80% of laboratory confirmed patients have had mild to moderate disease, which includes non-pneumonia and pneumonia cases, 13.8% have severe disease (dyspnea, respiratory frequency ≥30/minute, blood oxygen saturation ≤93%, PaO2/FiO2 ratio <300, and/or lung infiltrates >50% of the lung field within 24-48 hours) and 6.1% are critical (respiratory failure, septic shock, and/or multiple organ dysfunction/failure).

The rate of hospitalization, seriousness of symptoms, and ultimately the fatality rate depend strongly on age and, in a source of more uncertainty, perhaps on geography, points I will return to below.

What is the fatality rate, and why does it vary so much?

The Economist has a nice article exploring the wide variation in reported and estimated fatality rates, which I encourage you to read (also this means I don’t have to write it). One conclusion from that article is that we are probably misestimating fatalities due to measurement error. The total rate of infection is probably higher than is being reported, and the absolute number of fatalities is probably higher than generally understood. To this miscalculation I would add an additional layer of obfuscation, which I happened upon in my earlier work on SARS and the flu.

It turns out that we are probably significantly undercounting deaths due to influenza. This hypothesis is driven by a set of observed anticorrelations between flu vaccination and deaths ascribed to stroke, myocardial infarction (“heart attack”), and “sudden cardiac death”, where the latter is the largest cause of “natural” death in the United States. Influenza immunization reduces the rate of those causes of death by 50-75%. The authors conclude that the actual number of people who die from influenza infections could be 2.5X-5X higher than the oft cited 20,000-40,000.

How could the standard estimate be so far off? Consider these two situations: First, if a patient is at the doctor or in the hospital due to symptoms of the flu, they are likely to undergo a test to rule in, or out, the flu. But if a patient comes into the ER in distress and then passes away, or if they die before getting to the hospital, then that molecular diagnostic is much less likely to be used. And if the patient is elderly and already suffering from an obvious likely cause of death, for example congestive heart failure, kidney failure, or cancer, then that is likely to be what goes on the death certificate. Consequently, particularly among older people with obvious preexisting conditions, the case fatality rate for influenza is likely to be underestimated, and that is for a pathogen that is relatively well understood and for which there is unlikely to be a shortage of diagnostic kits.

Having set that stage, it is no leap at all to hypothesize that the fatality rate for COVID-19 is likely to be significantly underestimated. And then if you add in insufficient testing, and thus insufficient diagnostics, as I explore below, it seems likely that many fatalities caused by COVID-19 will be attributed to something else, particularly among the elderly. The disease is already quite serious among those diagnosed who are older than 70. I expect that the final toll will be greater in communities that do not get the disease under control.

Fatality rate in China as reported by China CDC.


How is COVID-19 diagnosed?

For most of history, medical diagnoses have been determined by comparing patient symptoms (again, these are human-observable impacts on a patient, usually constituting natural language nouns and adjectives) with lists that doctors together agree define a particular condition. Recently, this qualitative methodology has been slowly amended with quantitative measures as they have become available: e.g., pulse, blood pressure, EEG and EKG, blood oxygen content, “five part diff” (which quantifies different kinds of blood cells), CT, MRI, blood sugar levels, liver enzyme activity, lung and heart pumping volume, viral load, and now DNA and RNA sequencing of tissues and pathogens. These latter tools have become particularly important in genetically tracking the spread of SARS-CoV-2, because by following the sequence around the world you can sort out, at the individual case level, where it came from. And then simply being able to specifically detect viral RNA to provide a diagnosis is important because COVID-19 symptoms (other than fatality rate) are quite similar to those of the seasonal flu.

Beyond differentiating COVID-19 from “influenza like illness”, new tools are being brought to bear that enable near real time quantification of viral RNA, which enables estimating viral load (number of viruses per sample volume), and which in turn facilitates 1) understanding how the disease progresses and 2) understanding how infectious patients are over time. These molecular assays are the result of decades of technology improvement, which has resulted in highly automated systems that take in raw clinical samples, process them, and deliver results electronically, at least in those labs that can afford such devices. Beyond these achievements, novel diagnostic methods based on the relatively recent development of CRISPR as a tool are already in the queue to be approved for use amidst the current pandemic. The pandemic is serving as a shock to the system to move diagnostic technology faster.
We are watching in real time a momentous transition in the history of medicine, which is giving us a glimpse of the future. How are all these tools being applied today?

(Note: My original intention with this post was to look at the error rates of all the steps for each diagnostic method. I will explain why I think this is important, but other matters are more pressing at present, so the detailed error analysis will get short shrift for now.)

Recapitulating an explanation of relevant diagnostics from Part 1 of this series (with a slight change in organization):

There are three primary means of diagnosis:

1. The first is by display of symptoms, which span a long list, from cold-like upper respiratory features (runny nose, fever, sore throat) to much less pleasant, and in some cases deadly, lower respiratory impairment. (I recently heard an expert on the virus say that there are two primary ways that SARS-like viruses can kill you: “Either your lungs fill up with fluid, limiting your access to oxygen, and you drown, or all the epithelial cells in your lungs slough off, limiting your access to oxygen, and you suffocate.” Secondary infections are also more lethal for people experiencing COVID-19 symptoms.)

2. The second method of diagnosis is imaging of lungs, which includes x-ray and CT scans; SARS-CoV-2 causes particular pathologies in the lungs that can be identified on images and that distinguish it from other respiratory viruses.

3. Thirdly, the virus can be diagnosed via two molecular assays, the first of which uses antibodies to directly look for viral proteins in tissue or fluid samples, while the other looks for whether genetic material is present; sophisticated versions can quantify how many copies of viral RNA are present in a sample.

Imaging of lungs via x-ray and CT scan appears to be an excellent means to diagnose COVID-19 due to a distinct set of morphological features that appear throughout infected tissue, though those features also appear to change during the course of the disease. This study also examined diagnosis via PCR assays, and found a surprisingly high rate of false negatives. It is not clear from the text whether all patients had two independent swabs and accompanying tests, so either 10 or 12 total tests were done. If 10 were done, there are two clear false negatives, for a 20% failure rate; if 12 were done, there are up to four false negatives, for a 33% failure rate. The authors observe that “the false negative rate of oropharyngeal swabs seems high.” Note that this study directly compares the molecular assay with imaging, and the swab/PCR combo definitely comes up short. This is important because for us to definitively diagnose even the number of serious cases, let alone start sampling the larger population to track and try to get ahead of the outbreak, imaging is low throughput and expensive; we need rapid, accurate molecular assays. We need to have confidence in testing.
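The failure-rate arithmetic above is simple enough to make explicit. A minimal sketch, using the two readings of the study described in the text (either 10 total tests with 2 false negatives, or 12 with up to 4):

```python
# Illustrative only: the counts come from the two possible readings
# of the swab/PCR study discussed above.

def false_negative_rate(false_negatives: int, total_tests: int) -> float:
    """Fraction of tests on truly infected patients that came back negative."""
    return false_negatives / total_tests

# Reading 1: 10 tests, 2 clear false negatives
print(f"{false_negative_rate(2, 10):.0%}")  # prints "20%"

# Reading 2: 12 tests, up to 4 false negatives
print(f"{false_negative_rate(4, 12):.0%}")  # prints "33%"
```

Either way, a one-in-five to one-in-three miss rate is far too high for an assay meant to anchor population-level surveillance.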

How does “testing” work? First, testing is not some science fiction process that involves pointing a semi-magical instrument like a Tricorder at a patient and instantly getting a diagnosis. In reality, testing involves multiple process steps implemented by humans — humans who sometimes are inadequately trained or who make mistakes. And then each of those process steps has an associated error or failure rate. You almost never hear about the rate of mistakes, errors, or failures in reporting on “testing”, and that is a problem.
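To see why per-step error rates matter, consider a toy model in which each step of the pipeline (swab collection, sample handling, the assay itself) succeeds independently with some probability; the end-to-end detection rate is then the product of the per-step rates. The numbers below are invented for illustration, not measured values from any study:

```python
from math import prod

# Hypothetical per-step success probabilities; these are made-up
# numbers that only illustrate how errors compound across steps.
steps = {
    "swab collection": 0.90,
    "sample transport and handling": 0.95,
    "RT-PCR assay sensitivity": 0.85,
}

overall = prod(steps.values())
print(f"End-to-end probability of catching a true positive: {overall:.1%}")
```

Three individually respectable steps compound to roughly 73%, meaning more than one in four infected patients would be missed. This is why reporting only the assay's nominal sensitivity, without the rest of the process, overstates how well "testing" works.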

Let’s take the testing process in order. For sample collection the CDC Recommendations include nasopharyngeal and oropharyngeal (i.e., nose and throat) swabs. Here is the Wikipedia page on RT-PCR, which is a pretty good place to start if you are new to these concepts.

The Seattle Flu Study and the UW Virology COVID-19 program often rely on home sample collection from nasal and throat swabs. My initial concern about this testing method was motivated in part by the fact that it was quite difficult to develop a swab-PCR for SARS-CoV that delivered consistent results, where part of the difficulty was simply in collecting a good patient sample. I have a nagging fear that not everyone who is collecting these samples today is adequately trained to get a good result, or tested to ensure they are good at this skill. The number of sample takers has clearly expanded significantly around the world in the last couple of weeks, with more expansion to come. So I leave this topic with a question: is there a clinical study that examines the success rate of sample collection by people who are not trained to do this every day?

On to the assays themselves: I am primarily concerned at the moment with the error bars on the detection assays. The RT-PCR assay data in China are not reported with errors (or even variance, which would be an improvement). Imaging is claimed to be 90-95% accurate (against what standard is unclear), and the molecular assays worse than that by some amount. Anecdotal reports are that they have only been 50-70% accurate, with assertions of as low as 10% in some cases. This suggests that, in addition to large probable variation in the detectable viral load, and possible quality variations in the kits themselves, human sample handling and lab error is quite likely the dominant factor in accuracy. There was a report of an automated high throughput testing lab getting set up in a hurry in Wuhan a couple of weeks ago, which might be great if the reagent quality is sorted, but I haven’t seen any reports of whether that worked out. So the idea that the “confirmed” case counts are representative of reality even in hospitals or care facilities is tenuous at best. South Korea has certainly done a better job of adequate testing, but even there questions remain about the accuracy of the testing, as reported by the Financial Times:

Hong Ki-ho, a doctor at Seoul Medical Centre, believed the accuracy of the country’s coronavirus tests was “99 per cent — the highest in the world”. He pointed to the rapid commercial development and deployment of new test kits enabled by a fast-tracked regulatory process. “We have allowed test kits based on WHO protocols and never followed China’s test methods,” Dr Hong said.

However, Choi Jae-wook, a medical professor of preventive medicine at Korea University, remained “worried”. “Many of the kits used at the beginning stage of the outbreak were the same as those in China where the accuracy was questioned . . . We have been hesitating to voice our concern because this could worry the public even more,” Mr Choi said.

At some point (hopefully soon) we will see antibody-based tests being deployed that will enable serology studies of who has been previously infected. The US CDC is developing these serologic tests now, and we should all hope the results are better than the initial round of CDC-produced PCR tests. We may also be fortunate and find that these assays could be useful for diagnosis, as lateral flow assays (like pregnancy tests) can be much faster than PCR assays. Eventually something will work, because this antibody detection is tried and true technology.

To sum up: I had been quite concerned about reports of problems (high error rates) with the PCR assay in China and in South Korea. Fortunately, it appears that more recent PCR data is more trustworthy (as I will discuss below), and that automated infrastructure being deployed in the US and Europe may improve matters further. The automated testing instruments being rolled out in the US should — should — have lower error rates and higher accuracy. I still worry about the error rate on the sample collection. However, detection of the virus may be facilitated because the upper respiratory viral load for SARS-CoV-2 appears to be much higher than for SARS-CoV, a finding with further implications that I will explore below.

How is the virus spread?

(Note: the reporting on asymptomatic spread has changed a great deal just in the last 24 hours. Not all of what appears below is updated to reflect this yet.)

The standard line, if there can be one at this point, has been that the virus is spread by close contact with symptomatic patients. This view is bolstered by claims in the WHO Joint Mission report: “Asymptomatic infection has been reported, but the majority of the relatively rare cases who are asymptomatic on the date of identification/report went on to develop disease. The proportion of truly asymptomatic infections is unclear but appears to be relatively rare and does not appear to be a major driver of transmission.”(p.12) These claims are not consistent with a growing body of clinical observations. Pinning down the rate of asymptomatic, or presymptomatic, infections is important for understanding how the disease spreads. Combining that rate with evidence that patients are infectious while asymptomatic, or presymptomatic, is critical for planning response and for understanding the impact of social distancing.

Two sentences in the Science news piece describing the Joint Mission report undermine all the quantitative claims about impact and control: “A critical unknown is how many mild or asymptomatic cases occur. If large numbers of infections are below the radar, that complicates attempts to isolate infectious people and slow spread of the virus.” Nature picked up this question earlier this week: “How much is coronavirus spreading under the radar?” The answer: probably quite a lot.

A study of cases apparently contracted in a shopping mall in Wenzhou concluded that the most likely explanation for the pattern of spread is “that indirect transmission of the causative virus occurred, perhaps resulting from virus contamination of common objects, virus aerosolization in a confined space, or spread from asymptomatic infected persons.”

Another recent paper, in which the authors built an epidemiological transmission model of all the documented cases in Wuhan, found that, at best, only 41% of the total infections were “ascertained” by diagnosis, while the most likely ascertainment rate was a mere 21%. That is, the model best fits the documented case statistics when 79% of the total infections were unaccounted for by direct diagnosis.
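The relationship between ascertainment rate and total infections is worth making explicit: if only a fraction of infections are ever confirmed, the true count is the confirmed count divided by that fraction. A sketch using the 41% and 21% figures from the modeling study (the confirmed-case number below is a placeholder, not an actual count from Wuhan):

```python
def estimated_total_infections(confirmed: int, ascertainment_rate: float) -> float:
    """Back out total infections from confirmed cases, given an ascertainment rate."""
    return confirmed / ascertainment_rate

confirmed = 10_000  # placeholder value for illustration

for rate in (0.41, 0.21):
    total = estimated_total_infections(confirmed, rate)
    print(f"ascertainment {rate:.0%}: ~{total:,.0f} total infections, "
          f"{1 - rate:.0%} never diagnosed")
```

At 21% ascertainment, every confirmed case implies nearly four more that were never counted, which is why "confirmed cases" and "infections" must not be treated as interchangeable.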

Finally, a recent study of patients early after infection clearly shows “that COVID-19 can often present as a common cold-like illness. SARS-CoV-2 can actively replicate in the upper respiratory tract, and is shed for a prolonged time after symptoms end, including in stool.” The comprehensive virological study demonstrates “active [infectious] virus replication in upper respiratory tract tissues”, which leads to a hypothesis that people can present with cold-like symptoms and be infectious. I will quote more extensively from the abstract, as this bit is crucially important:

Pharyngeal virus shedding was very high during the first week of symptoms (peak at 7.11 X 10^8 RNA copies per throat swab, day 4). Infectious virus was readily isolated from throat- and lung-derived samples, but not from stool samples in spite of high virus RNA concentration. Blood and urine never yielded virus. Active replication in the throat was confirmed by viral replicative RNA intermediates in throat samples. Sequence-distinct virus populations were consistently detected in throat- and lung samples of the same patient. Shedding of viral RNA from sputum outlasted the end of symptoms. Seroconversion occurred after 6-12 days, but was not followed by a rapid decline of viral loads.

That is, you can be sick for a week with minimal- to mild symptoms, shedding infectious virus, before antibodies to the virus are detectable. (This study also found that “Diagnostic testing suggests that simple throat swabs will provide sufficient sensitivity at this stage of infection. This is in stark contrast to SARS.” Thus my comments above about reduced concern about sampling methodology.)

So the virus is easy to detect because it is plentiful in the throat, which unfortunately also means that it is easy to spread. And then even after you begin to have a specific immune response, detectable as the presence of antibodies in blood, viral loads stay high.

The authors conclude, rather dryly, with an observation that “These findings suggest adjustments of current case definitions and re-evaluation of the prospects of outbreak containment.” Indeed.

One last observation from this paper is eye opening, and needs much more study: “Striking additional evidence for independent replication in the throat is provided by sequence findings in one patient who consistently showed a distinct virus in her throat as opposed to the lung.” I am not sure we have seen something like this before. Given the high rate of recombination between strains in this family of betacoronaviruses (see Part 1), I want to flag the infection of different tissues by different strains as a possibly worrying route to more viral innovation, that is, evolution.

STAT+ News summarizes the above study as follows:

The researchers found very high levels of virus emitted from the throat of patients from the earliest point in their illness —when people are generally still going about their daily routines. Viral shedding dropped after day 5 in all but two of the patients, who had more serious illness. The two, who developed early signs of pneumonia, continued to shed high levels of virus from the throat until about day 10 or 11.

This pattern of virus shedding is a marked departure from what was seen with the SARS coronavirus, which ignited an outbreak in 2002-2003. With that disease, peak shedding of virus occurred later, when the virus had moved into the deep lungs.

Shedding from the upper airways early in infection makes for a virus that is much harder to contain. The scientists said at peak shedding, people with Covid-19 are emitting more than 1,000 times more virus than was emitted during peak shedding of SARS infection, a fact that likely explains the rapid spread of the virus. 

Yesterday, CNN joined the chorus of reporting on the role of asymptomatic spread. It is a nice summary, and makes clear that not only is “presymptomatic transmission commonplace”, it is a demonstrably significant driver of infection. Michael Osterholm, director of the Center for Infectious Disease Research and Policy (CIDRAP) at the University of Minnesota, and always ready with a good quote, was given the opportunity to put the nail in the coffin on the denial of asymptomatic spread:

"At the very beginning of the outbreak, we had many questions about how transmission of this virus occurred. And unfortunately, we saw a number of people taking very firm stances about it was happening this way or it wasn't happening this way. And as we have continued to learn how transmission occurs with this outbreak, it is clear that many of those early statements were not correct," he said. 

"This is time for straight talk," he said. "This is time to tell the public what we know and don't know."

There is one final piece of the puzzle that we need to examine to get a better understanding of how the virus is spreading. You may have read about characterizing the infection rate by the basic reproduction number, R0, which is a statistical measure that captures the average dynamics of transmission. There is another metric, the “secondary attack rate”, or SAR, which is a measurement of the rate of transmission in specific cases in which a transmission event is known to have occurred. The Joint Mission report cites an SAR in the range of 5-10% in family settings, which is already concerning. But there is another study (that, to be fair, came out after the Joint Mission report) of nine instances in Wuhan that calculates that the secondary attack rate in specific community settings was 35%. That is, assuming one initially infected person per room attended an event in which spread is known to have happened, on average 35% of those present were infected. In my mind, this is the primary justification for limiting social contacts — this virus appears to spread extremely well when people are in enclosed spaces together for a couple of hours, possibly handling and sharing food.
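The distinction between R0 and the SAR can be made concrete: the pooled SAR is just secondary infections divided by exposed attendees, summed across known transmission events. The event data below are invented to show the computation; they are not the nine Wuhan instances from the study cited above:

```python
# Hypothetical event data: (attendees exposed, secondary infections).
# Made-up numbers for illustration, not the actual Wuhan events.
events = [(20, 7), (15, 5), (30, 11), (12, 4)]

exposed = sum(n for n, _ in events)    # total people in the rooms
infected = sum(k for _, k in events)   # total secondary infections
sar = infected / exposed
print(f"Pooled secondary attack rate: {sar:.0%}")  # prints "Pooled secondary attack rate: 35%"
```

Unlike R0, which averages over an entire population with all its heterogeneous contact patterns, the SAR is conditioned on a known exposure event, which is why it can be so much higher and why it speaks directly to the risk of gatherings in enclosed spaces.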

Many missing pieces must be filled in to understand whether the high reported SAR above is representative globally. For instance, what were the environmental conditions (humidity, temperature) and ventilation like at those events? Was the source of the virus a food handler, or otherwise a focus of attention and close contact, or were they just another person in the room? Social distancing and eliminating public events was clearly important in disrupting the initial outbreak in Wuhan, but without more specific information about how community spread occurs we are just hanging on, hoping old fashioned public health measures will slow the thing down until countermeasures (drugs and vaccines) are rolled out. And when the social control measures are lifted, the whole thing could blow up again. Here is Osterholm again, from the Science news article covering the Joint Mission report:

“There’s also uncertainty about what the virus, dubbed SARS-CoV-2, will do in China after the country inevitably lifts some of its strictest control measures and restarts its economy. COVID-19 cases may well increase again.”

“There’s no question they suppressed the outbreak,” says Mike Osterholm, head of the Center for Infectious Disease Research and Policy at the University of Minnesota, Twin Cities. “That’s like suppressing a forest fire, but not putting it out. It’ll come roaring right back.”

What is the age distribution of infections?

The short answer here is that everyone can get infected. The severity of one’s response appears to depend strongly on age, as does the final outcome of the disease (the “endpoint”, as it is somewhat ominously referred to). Here we run smack into another measurement problem, because in order to truly understand who is infected, we would need to be testing broadly across the population, including a generous sample of those who are not displaying symptoms. Because only South Korea has been sampling so widely, only South Korea appears to have a data set that gives some sense of the age distribution of infections across the whole population. Beyond the sampling problem, I found it difficult to find this sort of demographic data published anywhere on the web.

Below is the only age data I have been able to come up with, admirably cobbled together by Andreas Backhaus from screenshots of data out of South Korea and Italy.

Why would you care about this? Because, in many countries, policy makers have not yet closed schools, restaurants, or pubs that younger and healthier members of the population tend to frequent. If this population is either asymptomatic or mildly symptomatic, but still infectious — as indicated above — then they are almost certainly spreading virus not only amongst themselves, but also to members of their families who may be more likely to experience severe symptoms. Moreover, the different course of the disease in different communities leads me to speculate that the structure of social contacts could be playing a significant role in the spread of the virus. Countries that have a relatively high rate of multi-generational households, in which elderly relatives live under the same roof as young people, could be in for a rough ride with COVID-19. If young people are out in the community, exposed to the virus, then their elderly relatives at home have a much higher chance of contracting the virus. Here is the distribution of multigenerational households by region, according to the UN:

[Chart: share of multigenerational households by region, per the UN.]

The end result of all this is that we — humanity at large, and in particular North America and Europe — need to do a much better job of containment in our own communities in order to reduce morbidity and mortality caused by SARS-CoV-2.

How did we get off track with our response?

It is important to understand how the WHO got the conclusion about the modes of infection wrong. By communicating so clearly that they believed there was a minimal role for asymptomatic spread, the WHO sent a mixed message that, while extreme social distancing works, perhaps it was not so necessary. Some policy makers clearly latched onto the idea that the disease only spreads from very sick people, and that if you aren’t sick then you should continue to head out to the local pub and contribute to the economy. The US CDC seems to have been slow to understand the error (see the CNN story cited above), and the White House just ran with the version of events that seemed like it would be politically most favorable, and least inconvenient economically.

The Joint Mission based the assertion that asymptomatic and presymptomatic infection is “rare” on a study in Guangdong Province. Here is Science again: “To get at this question, the report notes that so-called fever clinics in Guangdong province screened approximately 320,000 people for COVID-19 and only found 0.14% of them to be positive.” Caitlin Rivers, from Johns Hopkins, hit the nail on the head by observing that “Guangdong province was not a heavily affected area, so it is not clear whether [results from there hold] in Hubei province, which was the hardest hit.”

I am quite concerned (and, frankly, disappointed) that the WHO team took at face value that the large scale screening effort in Guangdong that found a very low “asymptomatic count” is somehow representative of anywhere else. Guangdong has a ~50X lower “case count” than Hubei, and a ~400X lower fatality rate, according to the Johns Hopkins Dashboard on 15 March — the disparity was probably even larger when the study was performed. The course of the disease was clearly quite different in Guangdong than in Hubei.

Travel restrictions and social distancing measures appear to have had a significant impact on spread from Hubei to Guangdong, and within Guangdong, which means that we can’t really know how many infected individuals were in Guangdong, or how many of those were really out in the community. A recent study computed the probability of spread from Wuhan to other cities given both the population of the destination city and the number of inbound trips from Wuhan; for Guangzhou, in Guangdong, the number of infections was anomalously low given its very large population. That is, compared with other transmission chains in China, Guangdong wound up with many fewer cases than you would expect, and the case count there is therefore not representative. Consequently, the detected infection rate in Guangdong is not a useful metric for understanding anything but Guangdong. The number relevant for epidemiological modeling is the rate of asymptomatic infection in the *absence* of control measures, because that tells us how the virus behaves without draconian social distancing, and any return to normalcy in the world will not have that sort of control measure in place.
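To make the logic concrete, here is a toy sketch in Python of the kind of proportional reasoning involved. This is not the cited study’s actual model, and every number below is hypothetical; the point is simply that a city falling far below a trips-based expectation suggests its detected case count is unrepresentative.

```python
# Toy sketch (not the cited study's actual model): expected imported
# infections taken as proportional to inbound trips from the source city,
# with the proportionality constant estimated from a pool of destination
# cities. All numbers below are hypothetical, for illustration only.

def expected_cases(trips_from_source, cases_per_trip):
    """Expected imported infections under a simple proportional model."""
    return trips_from_source * cases_per_trip

# Hypothetical destination cities: (inbound trips, observed cases)
cities = [(500_000, 900), (200_000, 380), (100_000, 210)]
total_trips = sum(t for t, _ in cities)
total_cases = sum(c for _, c in cities)
rate = total_cases / total_trips  # pooled cases per inbound trip

# A city with many inbound trips but few observed cases sits far below
# the model's expectation -- evidence that control measures or limited
# detection, not lack of exposure, explain the low count.
observed = 250
expected = expected_cases(400_000, rate)
print(f"expected {expected:.0f}, observed {observed}, ratio {observed/expected:.2f}")
```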

Now, if I am being charitable, it may have been that the only large scale screening data set available to the Joint Mission at the time was from Guangdong. The team needed to publish a report, and saying something about asymptomatic transmission was critically important to telling a comprehensive story, so perhaps they went with the only data they had. But the conclusions smelled wrong to me as soon as they were announced. I wrote as much to several reporters and on Twitter, observing that the WHO report was problematic because it assumed the official case counts approximated the actual number of infections, but I couldn’t put my finger on exactly what bugged me until I could put together the rest of the story above. Nevertheless, the WHO has a lot of smart people working for it; why did the organization so quickly embrace and promulgate a narrative that was so obviously problematic to anyone who knows about epidemiology and statistics?

What went wrong at the WHO?

There are some very strong opinions out there regarding the relationship between China and the WHO, and how that relationship impacts the decisions made by Director-General Dr. Tedros Adhanom. I have not met Dr. Tedros and only know what I read about him. However, I do have personal experience with several individuals now higher up in the chain of command for the WHO coronavirus response, and I have no confidence in them whatsoever. Here is my backstory.

I have wandered around the edges of the WHO for quite a while, and have spent most of my time in Geneva at the UN proper and working with the Biological Weapons Convention Implementation Support Unit. Then, several years ago, I was asked to serve on a committee at WHO HQ. I wasn’t particularly enthusiastic about saying yes, but several current and former high ranking US officials convinced me it was for the common good. So I went. It doesn’t matter which committee at the moment. What does matter is that, when it came time to write the committee report, I found that the first draft embraced a political narrative that was entirely counter to my understanding of the relevant facts, science, and history. I lodged my objections to the draft in a long minority report that pointed out the specific ways in which the text diverged from reality. And then something interesting happened.

I received a letter informing me that my appointment to the committee had been a mistake, and that I was actually supposed to be just a technical advisor. Now, the invitation said “member”, and all the documents that I signed beforehand said “member”, with particular rights and responsibilities, including a say in the text of the report. I inquired with the various officials who had encouraged me to serve, as well as with a diplomat or two, and the unanimous opinion was that I had been retroactively demoted so that the report could be written without addressing my concerns. All of those very experienced people were quite surprised by this turn of events. In other words, someone in the WHO went to surprising lengths to try to ensure that the report reflected a particular political perspective rather than facts, history, and science. Why? I do not know what the political calculations were. But I do know this: the administrative leadership in charge of the WHO committee I served on is now high up in the chain of command for the coronavirus response.

Coda: as it turns out, the final report hewed closely to reality as I understood it, and embraced most of the points I wanted it to make. I infer, but do not know for certain, that one or more other members of the committee — who presumably could not be shunted aside so easily, and who presumably had far more political heft than I do — picked up and implemented my recommended changes. So all’s well that ends well? But the episode definitely contributed to my education (and cynicism) about how the WHO balances politics and science, and I am ill disposed to trust the organization. Posting my account may mean that I am not invited to hang out at the WHO again. This is just fine.

How much bearing does my experience have on what is happening now in the WHO coronavirus response? I don’t know. You have to make up your own mind about this. But having seen the sausage being made, I am all too aware that the organization can be steered by political considerations. And that definitely increases uncertainty about what is happening on the ground. I won’t be writing or saying anything more specific about that particular episode at this time.

Uncertainty in the Time of COVID-19, Part 1

Part 1: Introduction

Times being what they are, in which challenging events abound and good information is hard to come by, I am delving back into writing about infectious disease (ID). While I’ve not been posting here about the intersection of ID, preparedness, and biosecurity, I have continued to work on these problems as a consultant for corporations, the US government, and the WHO. More on that in a bit, because my experience on the ground at the WHO definitely colors my perception of what the organization has said about events in China.

These posts will primarily be a summary of what we do, and do not, know about the current outbreak of the disease named COVID-19, and its causative agent, a coronavirus known officially as SARS-CoV-2 (for “SARS coronavirus-2”). I am interested in 1) what the ground truth is as best we can get to it in the form of data (with error bars), and I am interested in 2) claims that are made that are not supported by that data. You will have read definitive claims that COVID-19 will be no worse than a bad flu, and you will have read definitive claims that the sheer number of severe cases will overwhelm healthcare systems around the world, potentially leading to shocking numbers of fatalities. The problem with any definitive claim at this point is that we still have insufficient concrete information about the basic molecular biology of the virus and the etiology of this disease to have a good idea of what is going to happen. Our primary disadvantage right now is that uncertainty, because uncertainty necessarily complicates both our understanding of the present and our planning for the future.

Good sources of information: If you want to track raw numbers and geographical distribution, the Johns Hopkins Coronavirus COVID-19 Global Cases dashboard is a good place to start, with the caveat that “cases” here means those officially reported by national governments, which data are not necessarily representative of what is happening out in the real world. The ongoing coverage at The Atlantic about testing (here, and here, for starters) is an excellent place to read up on the shortcomings of the current US approach, as well as to develop perspective on what has happened as a result of comprehensive testing in South Korea. Our World In Data has a nice page, updated often, that provides a list of basic facts about the virus and its spread (again with a caveat about “case count”). Nextstrain is a great tool to visualize how the various mutations of SARS-CoV-2 are moving around the world, and changing as they go. That we can sequence the virus so quickly is a welcome improvement in our response, as it allows sorting out how infection is spreading from one person to another, and one country to another. This is a huge advance in human capability to deal with pathogen outbreaks. However, and unfortunately, this is still retrospective information, and means we are chasing the virus, not getting ahead of it.

How did we get here?

My 2006 post, “Nature is Full of Surprises, and We Are Totally Unprepared”, summarizes some of my early work with Bio-era on pandemic preparedness and response planning, which involved looking back at SARS and various influenza epidemics in order to understand future events. One of the immediate observations you make from even a cursory analysis of outbreaks is that pathogen surveillance in both animals and humans needs to be an ongoing priority. Bio-era concluded that humanity would continue to be surprised by zoonotic events in the absence of a concerted effort to build up global surveillance capacity. We recommended to several governments that they address this gap by aggressively rolling out sampling and sequencing of wildlife pathogens. And then not much happened to develop any real surveillance capacity until — guess what — we were surprised again by the 2009 H1N1 (aka Mexican, aka Swine) flu outbreak, which nobody saw coming because nobody was looking in the right place.

In the interval since, particularly in the wake of the “West Africa” Ebola outbreak that started in 2013, global ID surveillance has improved. The following years also saw lots of news about the rise of the Zika virus and the resurgence of Dengue, about which I am certain we have not heard the last. In the US, epidemic planning and response was finally taken seriously at the highest levels of power, and a Global Health and Security team was established within the National Security Council. That office operated until 2018, when the current White House defunded the NSC capability as well as a parallel effort at DHS (read this Foreign Policy article by Laurie Garrett for perspective: “Trump Has Sabotaged America’s Coronavirus Response”). I am unable to be adequately politic about these events just yet, even when swearing like a sailor, so I will mostly leave them aside for now. I will try to write something about US government attitudes about preparing to deal with lethal infectious diseases under separate cover; in the meantime you might get some sense of my thinking from my memorial to virologist Mark Buller.

Surprise? Again?

Outside the US government, surveillance work has continued. The EcoHealth Alliance has been on the ground in China for many years now, sequencing animal viruses, particularly from bats, in the hopes of getting a jump on the next zoonosis. I was fortunate to work with several of the founders of the EcoHealth Alliance, Drs. Peter Daszak and Billy Karesh, during my time with Bio-era. They are good blokes. Colorful, to be sure — which you sort of have to be to get out of bed with the intention of chasing viruses into bat caves and jumping out of helicopters to take blood samples from large predators. The EcoHealth programs have catalogued a great many potential zoonotic viruses over the years, including several that are close relatives of both SARS-CoV (the causative agent of SARS) and SARS-CoV-2. And then there is Ralph Baric, at UNC, who with colleagues in China has published multiple papers over the years pointing to the existence of a cluster of SARS-like viruses circulating in animals in Hubei. See, in particular, “A SARS-like cluster of circulating bat coronaviruses shows potential for human emergence”, which called out in 2015 a worrisome group of viruses to which SARS-CoV-2 belongs. This work almost certainly could not have picked out that specific virus before it jumped to humans, because that would require substantially more field surveillance and more dedicated laboratory testing than has been possible with existing funding. But Baric and colleagues gave a clear heads up that something was brewing. And yet we were “surprised”, again. (Post publication note: For more on what has so far been learned about the origin of the virus, see this absolutely fantastic article in Scientific American that came out today: How China’s “Bat Woman” Hunted Down Viruses from SARS to the New Coronavirus, by Jane Qiu. I will come back to it in later installments of this series. It is really, really good.)

Not only were we warned, we have considerable historical experience that (wildlife consumption + coronavirus + humans) leads to zoonosis, or a disease that jumps from animals to humans. This particular virus still caught us unawares; it snuck up on us because we have not yet done the work needed to understand how viruses jump from animal hosts to humans. Unless we start paying closer attention, it won’t be the last time. The pace of zoonotic events among viruses related to SARS-CoV has accelerated over the last 25 years, as I will explore in a forthcoming post. The primary reason for this acceleration, according to the wildlife veterinarians and virus hunters I talk to, is that humans continue both to encroach on natural habitats and to bring animals from those habitats home to serve for dinner. So in addition to better surveillance, humans could reduce the chance of zoonosis by eating fewer wild animals. Either way, the lesson of being surprised by SARS-CoV-2 is that we must work much harder to stay ahead of nature.

Why is the US, in particular, so unprepared to deal with this virus?

The US government has a long history of giving biological threats and health security inadequate respect. Yes, there have always been individuals and small groups inside various agencies and departments who worked hard to increase our preparedness and response efforts. But people at the top have never fully grasped what is at stake and what needs to be done.

Particularly alarming, we have recently experienced a unilateral disarming in the face of known and obvious threats. See the Laurie Garrett article cited above for details. As reported by The New York Times,

“Mr. Trump had no explanation for why his White House shut down the Directorate for Global Health Security and Biodefense established at the National Security Council in 2016 by President Barack Obama after the 2014 Ebola outbreak.”

Yet this is more complicated than is apparent or is described in the reporting, as I commented on Twitter earlier this week. National security policy in the US has been dominated for many decades by people who grew up intellectually in the Cold War, or were taught by people who fought the Cold War. Cold War security was about nation states and, most importantly, nuclear weapons. When the Iron Curtain fell, the concern about large nations (i.e., the USSR) slipped away for a while, eventually to be replaced by small states, terrorism, and WMDs. But WMD policy, which in principle includes chemical and biological threats, has continued to be dominated by the nuclear security crowd. The argument is always that nuclear (and radiological) weapons are more of a threat and can cause more damage than a mere microbe, whether natural or artificial. And then there is the spending associated with countering the more kinetic threats: the big, shiny, splody objects get all the attention. So biosecurity and pandemic preparedness and response, which often are lumped together as "health security", get short shrift because the people setting priorities have other priorities. This has been a problem for both Democratic and Republican administrations, and demonstrates a history of bipartisan blindness.

Then, after decades of effort, and an increasing number of emergent microbial/health threats, finally a position and office were created within the National Security Council. While far from a panacea, because the USG needs to do much more than have policy in place, this was progress.

And then a new Administration came in, which not only has different overall security priorities but also is dominated by old school security people who are focussed on the intersection of a small number of nation states and nuclear weapons. John Bolton, in particular, is a hardline neocon whose intellectual roots are in Cold War security policy; so he is focussed on nukes. His ascendancy at the NSC coincided not just with the shutdown of the NSC preparedness office, but also with that of a parallel DHS office responsible for implementing policy. And then, beyond the specific mania driving a focus on nation states and nukes as the primary threats to US national security, there is the oft reported war on expertise in the current exec branch and EOP. Add it all up: the USG is now severely understaffed for the current crisis.

Even the knowledgeable professionals still serving in the government have been hamstrung by bad policy in their ability to organize a response. To be blunt: patients are dying because the FDA & CDC could not get out of the way or — imagine it — help in accelerating the availability of testing at a critical time in a crisis. There will be a reckoning. And then public health in the US will need to be rebuilt, and earn trust again. There is a long road ahead. But first we have to deal with SARS-CoV-2.

Who is this beastie, SARS-CoV-2?

Just to get the introductions out of the way, the new virus is classified within order Nidovirales, family Coronaviridae, subfamily Orthocoronavirinae. You may also see it referred to as a betacoronavirus. To give you some sense of the diversity of coronaviruses, here is a nice, clean visual representation of their phylogenetic relationships. It contains names of many familiar human pathogens. If you are wondering why we don’t have a better understanding of this family of viruses given their obvious importance to human health and to economic and physical security, good for you — you should wonder about this. For the cost of a single marginally functional F-35, let alone a white elephant new aircraft carrier, we could fund viral surveillance and basic molecular biology for all of these pathogens for years.

The diversity of pathogenic coronaviruses. Source: Xyzology.


Betacoronaviruses (BCVs) are RNA viruses that are surrounded by a lipid membrane. The membrane is damaged by soap and by ethyl or isopropyl alcohol; without the membrane the virus falls apart. BCVs differ from influenza viruses in both their genome structure and in the way they evolve. Influenza viruses have segmented genomes — the genes are, in effect, organized into chromosomes — and the virus can evolve either through swapping chromosomes with other flu strains or through mutations that happen when the viral polymerase, which copies RNA, makes a mistake. The influenza polymerase makes lots of mistakes, which means that many different sequences are produced during replication. This is a primary driver of the evolution of influenza viruses, and largely explains why new flu strains show up every year. While the core of the copying machinery in Betacoronaviruses is similar to that of influenza viruses, it also contains an additional component called Nsp-14 that corrects copying mistakes. Disable or remove Nsp-14 and you get influenza-like mutation rates in Betacoronaviruses. (For some reason I find that observation particularly fascinating, though I can’t really explain why.)
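A quick back-of-the-envelope calculation shows why that proofreading function matters so much. The error rates below are rough ballpark figures chosen purely for illustration, not measured values for any particular virus:

```python
# Back-of-the-envelope: expected new mutations per genome copy, with and
# without a proofreading exonuclease like Nsp-14. The per-site error
# rates are illustrative round numbers, not measured values.

def expected_mutations(per_site_error_rate, genome_length):
    """Expected number of new mutations introduced in one round of copying."""
    return per_site_error_rate * genome_length

CORONAVIRUS_GENOME = 30_000  # bases, roughly the size of SARS-CoV-2

no_proofreading = expected_mutations(1e-4, CORONAVIRUS_GENOME)    # flu-like rate
with_proofreading = expected_mutations(1e-6, CORONAVIRUS_GENOME)  # ~100x fewer errors

print(no_proofreading, with_proofreading)  # roughly 3 vs 0.03 mutations per copy
```

With a flu-like error rate, every copy carries several new mutations; with proofreading, most copies are perfect, which is consistent with coronaviruses evolving more slowly per replication than influenza.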

There is another important feature of the BCV polymerase in that it facilitates recombination between RNA strands that happen to be floating around nearby. This means that if a host cell happens to be infected with more than one BCV strain at the same time, you can get a relatively high rate of new genomes being assembled out of all the parts floating around. This is one reason why BCV genome sequences can look like they are pasted together from strains that infect different species — they are often assembled exactly that way at the molecular level.
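Conceptually, this copy-choice recombination is simple enough to sketch in a few lines of Python. The sequences here are tiny stand-ins for real ~30 kb genomes:

```python
# Minimal sketch of copy-choice recombination: the polymerase starts
# copying one template, "jumps" to a co-infecting strain's template at a
# crossover point, and finishes there, yielding a chimeric genome.
# Ten-letter strings stand in for real ~30 kb genomes.

def template_switch(template_a, template_b, crossover):
    """Return a chimera: template A up to `crossover`, template B after."""
    assert len(template_a) == len(template_b)
    return template_a[:crossover] + template_b[crossover:]

strain_a = "AAAAAAAAAA"  # genome of one infecting strain
strain_b = "GGGGGGGGGG"  # genome of a co-infecting strain
chimera = template_switch(strain_a, strain_b, 6)
print(chimera)  # AAAAAAGGGG -- "pasted together", as described above
```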

Before digging into the uncertainties around this virus and what is happening in the world, we need to understand how it is detected and diagnosed. There are three primary means of diagnosis. The first is by display of symptoms, which span a long list, from cold-like upper respiratory features (runny nose, fever, sore throat) to much less pleasant, and in some cases deadly, lower respiratory impairment. (I recently heard an expert on the virus say that there are two primary ways that SARS-like viruses can kill you: “Either your lungs fill up with fluid, limiting your access to oxygen, and you drown, or all the epithelial cells in your lungs slough off, limiting your access to oxygen, and you suffocate.” Secondary infections are also more lethal for people experiencing COVID-19 symptoms.) The second method of diagnosis is imaging of the lungs, including x-ray and CT scans; SARS-CoV-2 causes particular pathologies in the lungs that can be identified in images and that distinguish it from other respiratory viruses. Finally, the virus can be diagnosed via two kinds of molecular assay: the first uses antibodies to look directly for viral proteins in tissue or fluid samples, while the second looks for whether viral genetic material is present; sophisticated versions of the latter can quantify how many copies of viral RNA are present in a sample.
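To illustrate the quantitative version of the genetic assay: quantitative RT-PCR roughly doubles the target each cycle, so the cycle number at which signal appears (the Ct value) maps back to the starting copy number via a standard curve. The slope and intercept below are illustrative assumptions, not parameters of any real assay:

```python
# Sketch of how quantitative RT-PCR estimates viral load. Each cycle
# roughly doubles the target, so Ct falls by ~3.32 cycles per 10-fold
# increase in starting copies (at perfect efficiency). The standard-curve
# parameters here are illustrative, not from any real assay.

def copies_from_ct(ct, slope=-3.32, intercept=40.0):
    """Estimate starting RNA copies from a Ct value via a standard curve:
    Ct = slope * log10(copies) + intercept
    =>  copies = 10 ** ((Ct - intercept) / slope)
    """
    return 10 ** ((ct - intercept) / slope)

print(round(copies_from_ct(40.0)))  # ~1 copy, at the detection limit
print(round(copies_from_ct(26.7)))  # ~10,000 copies: lower Ct, more virus
```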

Each of these diagnostic methods is usually described as being “accurate” or “sensitive” to some degree, when instead they should be described as having some error rate, a rate that might depend on when or where the method was applied, or vary with who was applying it. And every time you read how “accurate” or “sensitive” a method is, you should ask: compared to what? And this is where we get into uncertainty.
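Here is a minimal sketch of why “compared to what?” matters so much. Even a test with seemingly excellent sensitivity and specificity produces mostly false positives when the condition is rare in the population being screened. All numbers are illustrative assumptions, not properties of any real assay:

```python
# Why "accurate compared to what?" matters: the meaning of a positive
# result depends on prevalence in the tested population, not just on the
# test's sensitivity and specificity. Numbers below are illustrative.

def positive_predictive_value(sensitivity, specificity, prevalence):
    """P(infected | positive test), by Bayes' rule."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    return true_pos / (true_pos + false_pos)

# Screening a low-prevalence population (0.1% infected):
low = positive_predictive_value(0.95, 0.99, 0.001)
# Testing symptomatic hospital patients (20% infected):
high = positive_predictive_value(0.95, 0.99, 0.20)

print(f"{low:.0%} vs {high:.0%}")  # the same test, very different meanings
```

In the low-prevalence case most positives are false positives, which is exactly why mass-screening numbers like the Guangdong figure need careful interpretation.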

Part 2 of this series will dig into specific sources of uncertainty spanning measurement and diagnosis to recommendations.

A Few Thoughts and References Re Conservation and Synthetic Biology

Yesterday at Synthetic Biology 7.0 in Singapore, we had a good discussion about the intersection of conservation, biodiversity, and synthetic biology. I said I would post a few papers relevant to the discussion, which are below.

These papers are variously: the framing document for the original meeting at the University of Cambridge in 2013 (see also "Harry Potter and the Future of Nature"), sponsored by the Wildlife Conservation Society; follow on discussions from meetings in San Francisco and Bellagio; and my own efforts to try to figure out how quantify the economic impact of biotechnology (which is not small, especially when compared to much older industries) and the economic damage from invasive species and biodiversity loss (which is also not small, measured as either dollars or jobs lost). The final paper in this list is my first effort to link conservation and biodiversity with economic and physical security, which requires shifting our thinking from the national security of nation states and their political boundaries to the natural security of the systems and resources that those nation states rely on for continued existence.

"Is It Time for Synthetic Biodiversity Conservation?", Antoinette J. Piaggio, Gernot Segelbacher, Philip J. Seddon, Luke Alphey, Elizabeth L. Bennett, Robert H. Carlson, Robert M. Friedman, Dona Kanavy, Ryan Phelan, Kent H. Redford, Marina Rosales, Lydia Slobodian, Keith Wheeler, Trends in Ecology & Evolution, Volume 32, Issue 2, February 2017, Pages 97–107

Robert Carlson, "Estimating the biotech sector's contribution to the US economy", Nature Biotechnology, 34, 247–255 (2016), 10 March 2016

Kent H. Redford, William Adams, Rob Carlson, Bertina Ceccarelli, “Synthetic biology and the conservation of biodiversity”, Oryx, 48(3), 330–336, 2014.

"How will synthetic biology and conservation shape the future of nature?", Kent H. Redford, William Adams, Georgina Mace, Rob Carlson, Steve Sanderson, Framing Paper for International Meeting, Wildlife Conservation Society, April 2013.

"From national security to natural security", Robert Carlson, Bulletin of the Atomic Scientists, 11 Dec 2013.

Late Night, Unedited Musings on Synthesizing Secret Genomes

By now you have probably heard that a meeting took place this past week at Harvard to discuss large scale genome synthesis. The headline large genome to synthesize is, of course, that of humans. All 6 billion (duplex) bases, wrapped up in 23 pairs of chromosomes that display incredible architectural and functional complexity that we really don't understand very well just yet. So no one is going to be running off to the lab to crank out synthetic humans. That 6 billion bases, by the way, just for one genome, exceeds the total present global demand for synthetic DNA. This isn't happening tomorrow. In fact, synthesizing a human genome isn't going to happen for a long time.

But, if you believe the press coverage, nefarious scientists are planning to pull a Frankenstein and "fabricate" a human genome in secret. Oh, shit! Burn some late night oil! Burn some books! Wait, better — burn some scientists! Not so much, actually. There are several important points here. I'll take them in no particular order.

First, it's true, the meeting was held behind closed doors. It wasn't intended to be so, originally. The rationale given by the organizers for the change is that a manuscript on the topic is presently under review, and the editor of the journal considering the manuscript made it clear that the journal regards the entire topic as under embargo until the paper is published. This put the organizers in a bit of a pickle. They decided the easiest way to comply with the editor's wishes (which were communicated to the authors well after the attendees had made travel plans) was to hold the meeting under rules even more strict than Chatham House until the paper is published. At that point, they plan to make a full record of the meeting available. It just isn't a big deal. If it sounds boring and stupid so far, it is. The word "secret" was only introduced into the conversation by a notable critic who, as best I can tell, perhaps misconstrued the language around the editor's requirement to respect the embargo. A requirement that is also boring and stupid. But, still, we are now stuck with "secret", and all the press and bloggers who weren't there are seeing Watergate headlines and fame. Still boring and stupid.

Next, it has been reported that there were no press at the meeting. However, I understand that there were several reporters present. It has also been suggested that the press present were muzzled. This is a ridiculous claim if you know anything about reporters. They've simply been asked to respect the embargo, which so far they are doing, just like they do with every other embargo. (Note to self, and to readers: do not piss off reporters. Do not accuse them of being simpletons or shills. Avoid this at all costs. All reporters are brilliant and write like Hemingway and/or Shakespeare and/or Oliver Morton / Helen Branswell / Philip Ball / Carl Zimmer / Erica Check-Hayden. Especially that one over there. You know who I mean. Just sayin'.)

How do I know all this? You can take a guess, but my response is also covered by the embargo.

Moving on: I was invited to the meeting in question, but could not attend. I've checked the various associated correspondence, and there's nothing about keeping it "secret". In fact, the whole frickin' point of coupling the meeting to a serious, peer-reviewed paper on the topic was to open up the conversation with the public as broadly as possible. (How do you miss that unsubtle point, except by trying?) The paper was supposed to come out before, or, at the latest, at the same time as the meeting. Or, um, maybe just a little bit after? But, whoops. Surprise! Academic publishing can be slow and/or manipulated/politicized. Not that this happened here. Anyway, get over it. (Also: Editors! And, reviewers! And, how many times will I say "this is the last time!")

(Psst: an aside. Science should be open. Biology, in particular, should be done in the public view and should be discussed in the open. I've said and written this in public on many occasions. I won't bore you with the references. [Hint: right here.] But that doesn't mean that every conversation you have should be subject to review by the peanut gallery right now. Think of it like a marriage/domestic partnership. You are part of society; you have a role and a responsibility, especially if you have children. But that doesn't mean you publicize your pillow talk. That would be deeply foolish and would inevitably prevent you from having honest conversations with your spouse. You need privacy to work on your thinking and relationships. Science: same thing. Critics: fuck off back to that sewery rag in — wait, what was I saying about not pissing off reporters?)

Is this really a controversy? Or is it merely a controversy because somebody said it is? Plenty of people are weighing in who weren't there or, undoubtedly worse from their perspective, weren't invited and didn't know it was happening. So I wonder if this is more about drawing attention to those doing the shouting. That is probably unfair, this being an academic discussion, full of academics.

Secondly (am I just on secondly?), the supposed ethical issues. Despite what you may read, there is no rush. No human genome, nor any human chromosome, will be synthesized for some time to come. Make no mistake about how hard a technical challenge this is. While we have some success in hand at synthesizing yeast chromosomes, and while that project certainly serves as some sort of model for other genomes, the chromatin in multicellular organisms has proven more challenging to understand or build. Consequently, any near-term progress made in synthesizing human chromosomes is going to teach us a great deal about biology, about disease, and about what makes humans different from other animals. It is still going to take a long time. There isn't any real pressing ethical issue to be had here, yet. Building the übermensch comes later. You can be sure, however, that any federally funded project to build the übermensch will come with a ~2% set aside to pay for plenty of bioethics studies. And that's a good thing. It will happen.

There is, however, an ethical concern here that needs discussing. I care very deeply about getting this right, and about not screwing up the future of biology. As someone who has done multiple tours on bioethics projects in the U.S. and Europe, served as a scientific advisor to various other bioethics projects, and testified before the Presidential Commission on Bioethical Concerns (whew!), I find that many of these conversations are more about the ethicists than the bio. Sure, we need to have public conversations about how we use biology as a technology. It is a very powerful technology. I wrote a book about that. If only we had such involved and thorough ethical conversations about other powerful technologies. Then we would have more conversations about stuff. We would converse and say things, all democratic-like, and it would feel good. And there would be stuff, always more stuff to discuss. We would say the same things about that new stuff. That would be awesome, that stuff, those words. <dreamy sigh> You can quote me on that. <another dreamy sigh>

But on to the technical issues. As I wrote last month, I estimate the global demand for synthetic DNA (sDNA) to be 4.8 billion bases worth of short oligos and ~1 billion bases worth of longer double-stranded DNA (dsDNA), for not quite 6 Gigabases total. That, obviously, is the equivalent of a single human duplex genome. Most of that demand is from commercial projects that must return value within a few quarters, which biotech is now doing at eye-popping rates. Any synthetic human genome project is going to take many years, if not decades, and any commercial return is way, way off in the future. Even if the annual growth in commercial use of sDNA were 20% (which it isn't), this tells you, dear reader, that the commercial biotech use of synthetic DNA is never, ever, going to provide sufficient demand to scale up production to build many synthetic human genomes. Or possibly even a single human genome. The government might step in to provide a market to drive technology, just as it did for the human genome sequencing project, but my judgement is that the scale mismatch is so large as to be insurmountable. Even while sDNA is already a commodity, it has far more value in reprogramming crops and microbes with relatively small tweaks than it has in building synthetic human genomes. So if this story were only about existing use of biology as technology, you could go back to sleep.

But there is a use of DNA that might change this story, which is why we should be paying attention, even at this late hour on a Friday night.

DNA is, by far, the most sophisticated and densest information storage medium humans have ever come across. DNA can be used to store orders of magnitude more bits per gram than anything else humans have come up with. Moreover, the internet is expanding so rapidly that our need to archive data will soon outstrip existing technologies. If we continue down our current path, in coming decades we would need not only exponentially more magnetic tape, disk drives, or flash memory, but exponentially more factories to produce these storage media, and exponentially more warehouses to store them. Even if this is technically feasible it is economically implausible. But biology can provide a solution. DNA exceeds by many times even the theoretical capacity of magnetic tape or solid state storage.
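To put rough numbers on that density claim, here is a back-of-the-envelope sketch. The 330 g/mol per nucleotide and 2 bits per base figures are textbook approximations I am supplying for illustration, not numbers from this post:

```python
# Back-of-the-envelope: how many bits can one gram of ssDNA hold in the ideal case?
AVOGADRO = 6.022e23          # molecules per mole
MW_NUCLEOTIDE = 330.0        # g/mol, rough average mass of a single nucleotide
BITS_PER_BASE = 2.0          # A/C/G/T encodes 2 bits per base, ideally

nucleotides_per_gram = AVOGADRO / MW_NUCLEOTIDE
bits_per_gram = nucleotides_per_gram * BITS_PER_BASE
exabytes_per_gram = bits_per_gram / 8 / 1e18

print(f"{nucleotides_per_gram:.2e} nucleotides per gram")
print(f"{bits_per_gram:.2e} bits per gram")
print(f"~{exabytes_per_gram:.0f} exabytes per gram")
```

At roughly 450 exabytes per gram in the ideal case, even generous real-world overhead for error correction and addressing leaves DNA orders of magnitude ahead of tape or flash.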

A massive warehouse full of magnetic tapes might be replaced by an amount of DNA the size of a sugar cube. Moreover, while tape might last decades, and paper might last millennia, we have found intact DNA in animal carcasses that have spent three-quarters of a million years frozen in the Canadian tundra. Consequently, there is a push to combine our ability to read and write DNA with our accelerating need for more long-term information storage. Encoding and retrieval of text, photos, and video in DNA has already been demonstrated. (Yes, I am working on one of these projects, but I can't talk about it just yet. We're not even to the embargo stage.) 

Governments and corporations alike have recognized the opportunity. Both are funding research to support the scaling up of infrastructure to synthesize and sequence DNA at sufficient rates.

For a “DNA drive” to compete with an archival tape drive today, it needs to be able to write ~2 Gbits/sec, which is about 2 Gbases/sec. That is the equivalent of ~20 synthetic human genomes/min, or nearly 30K sHumans/day, if I must coin a unit of DNA synthesis to capture the magnitude of the change. Obviously this is likely to be in the form of either short ssDNA, or possibly medium-length ss- or dsDNA if enzymatic synthesis becomes a factor. If this sDNA were to be used to assemble genomes, it would first have to be assembled into genes, and then into synthetic chromosomes, a non-trivial task. While this would be hard, and would take a great deal of effort and many PhD theses, it certainly isn't science fiction.
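The unit conversion is easy to check. A minimal sketch, assuming ~1 bit per encoded base (which is what the 2 Gbit/s ≈ 2 Gbase/s equivalence implies) and a ~6.4 Gbase duplex human genome:

```python
# Sketch of the "sHuman" unit arithmetic: how much DNA a hypothetical
# "DNA drive" must write to keep pace with an archival tape drive.
WRITE_RATE_BASES_PER_SEC = 2e9   # ~2 Gbit/s at ~1 bit per encoded base
DUPLEX_HUMAN_GENOME = 6.4e9      # bases: two copies of a ~3.2 Gbase haploid genome

genomes_per_min = WRITE_RATE_BASES_PER_SEC * 60 / DUPLEX_HUMAN_GENOME
genomes_per_day = genomes_per_min * 60 * 24

print(f"~{genomes_per_min:.0f} sHumans/min")
print(f"~{genomes_per_day:,.0f} sHumans/day")
```

Run sustained, ~19 genomes/min works out to 27,000 per day, so the daily figure is in the tens of thousands under these assumptions.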

But here, finally, is the interesting bit: the volume of sDNA necessary to make DNA information storage work, and the necessary price point, would make possible any number of synthetic genome projects. That, dear reader, is definitely something that needs careful consideration by publics. And here I do not mean "the public", the 'them' opposed to scientists and engineers in the know and in the do (and in the doo-doo, just now), but rather the Latiny, rootier sense of "the people". There is no them, here, just us, all together. This is important.

The scale of the demand for DNA storage, and the price at which it must operate, will completely alter the economics of reading and writing genetic information, in the process marginalizing the use by existing multibillion-dollar biotech markets while at the same time massively expanding capabilities to reprogram life. This sort of pull on biotechnology from non-traditional applications will only increase with time. That means whatever conversation we think we are having about the calm and ethical development of biological technologies is about to be completely inundated and overwhelmed by the relentless pull of global capitalism, beyond borders, probably beyond any control. Note that all the hullabaloo so far about synthetic human genomes, and even about CRISPR editing of embryos, etc., has been written by Western commentators, in the Western press. But not everybody lives in the West, and vast resources are pushing development of biotechnology outside of the West. And that is worth an extended public conversation.

So, to sum up, have fun with all the talk of secret genome synthesis. That's boring. I am going off the grid for the rest of the weekend to pester littoral invertebrates with my daughter. You are on your own for a couple of days. Reporters, you are all awesome, make of the above what you will. Also: you are all awesome. When I get back to the lab on Monday I will get right on with fabricating the Übermensch for fun and profit. But — shhh — that's a secret.

Staying Sober about Science

The latest issue of The Hastings Center Report carries an essay of mine, "Staying Sober about Science" (free access after registration), about my thoughts on New Directions: The Ethics of Synthetic Biology and Emerging Technologies (PDF) from The Presidential Commission for the Study of Bioethical Issues.

Here is the first paragraph:

Biology, we are frequently told, is the science of the twenty-first century. Authority informs us that moving genes from one organism to another will provide new drugs, extend both the quantity and quality of life, and feed and fuel the world while reducing water consumption and greenhouse gas emissions. Authority also informs that novel genes will escape from genetically modified crops, thereby leading to herbicide-resistant weeds; that genetically modified crops are an evil privatization of the gene pool that will with certainty lead to the economic ruin of small farmers around the world; and that economic growth derived from biological technologies will cause more harm than good. In other words, we are told that biological technologies will provide benefits and will come with costs--with tales of both costs and benefits occasionally inflated--like every other technology humans have developed and deployed over all of recorded history.

And here are a couple of other selected bits:

Overall, in my opinion, the report is well considered. One must commend President Obama for showing leadership in so rapidly addressing what is seen in some quarters as a highly contentious issue. However, as noted by the commission itself, much of the hubbub is due to hype by both the press and certain parties interested in amplifying the importance of the Venter Institute's accomplishments. Certain scientists want to drive a stake into the heart of vitalism, and perhaps to undermine religious positions concerning the origin of life, while "civil society" groups stoke fears about Frankenstein and want a moratorium on research in synthetic biology. Notably, even when invited to comment by the commission, religious groups had little to say on the matter.

The commission avoided the trap of proscribing from on high the future course of a technology still emerging from the muck. Yet I cannot help the feeling that the report implicitly assumes that the technology can be guided or somehow controlled, as does most of the public discourse on synthetic biology. The broader history of technology, and of its regulation or restriction, suggests that directing its development would be no easy task.8 Often technologies that are encouraged and supported are also stunted, while technologies that face restriction or prohibition become widespread and indispensable.

...The commission's stance favors continued research in synthetic biology precisely because the threats of enormous societal and economic costs are vague and unsubstantiated. Moreover, there are practical implications of continued research that are critical to preparing for future challenges. The commission notes that "undue restriction may not only inhibit the distribution of new benefits, but it may also be counterproductive to security and safety by preventing researchers from developing effective safeguards."12 Continued pursuit of knowledge and capability is critical to our physical and economic security, an argument I have been attempting to inject into the conversation in Washington, D.C., for a decade. The commission firmly embraced a concept woven into the founding fabric of the United States. In the inaugural State of the Union Address in 1790, George Washington told Congress "there is nothing which can better deserve your patronage than the promotion of science and literature. Knowledge is in every country the surest basis of publick happiness."13

The pursuit of knowledge is every bit as important a foundation of the republic as explicit acknowledgment of the unalienable rights of life, liberty, and the pursuit of happiness. Science, literature, art, and technology have played obvious roles in the cultural, economic, and political development of the United States. More broadly, science and engineering are inextricably linked with human progress from a history of living in dirt, disease, and hunger to . . . today. One must of course acknowledge that today's world is imperfect; dirt, disease, and hunger remain part of the human experience. But these ills will always be part of the human experience. Overall, the pursuit of knowledge has vastly improved the human condition. Without scientific inquiry, technological development, and the economic incentive to refine innovations into useful and desirable products, we would still be scrabbling in the dirt, beset by countless diseases, often hungry, slowly losing our teeth.

There's more here.

References:

8. R. Carlson, Biology Is Technology: The Promise, Peril, and New Business of Engineering Life (Cambridge, Mass.: Harvard University Press, 2010).

12. Presidential Commission for the Study of Bioethical Issues, New Directions, 5.

13. G. Washington, "The First State of the Union Address," January 8, 1790, http://ahp.gatech.edu/first_state_union_1790.html.

Shame On You, Portland!

What Happened to March?  I got on a plane this morning headed for New York, but somehow arrived on April 1st.  It's the only explanation for this:

Portland hurts Tibetans
(China Daily)
Updated: 2010-03-11 07:51

While many in the international community are watching with anxiety to see if Washington moves to repair its ties with Beijing, a reckless decision by an American city is rubbing salt into the unhealed wound of the world's most important bilateral relations.

The city of Portland, Oregon, proclaimed Wednesday, March 10, their "Tibet Awareness Day" despite strong opposition from the Chinese government.

While most people and most countries in the world recognize Tibet as part of China, the decision by the American city interferes in China's internal affairs and is an open defiance of China's state sovereignty.

It could have an adverse effect on Sino-US relations, which has yet to recover from major deterioration following Washington's $6.4-billion arms sale to Taiwan and US President Barack Obama's meeting with the Dalai Lama.

The designation of the "Tibet Awareness Day" was apparently orchestrated by the Dalai Lama clique, which has been engaged in activities aimed to separate China and undermine Tibet's stability in the guise of religion.

It is still beyond our belief that politicians in Portland have chosen to celebrate a handful of fanatics trumpeting Tibet independence while turning a blind eye to either history or the status quo of present-day Tibet. History has told us that Tibet has always been a part of China, and there is ample evidence proving the fact that Tibetan people now enjoy a much better life and enjoy the full freedom of religion.

Americans are well-known for putting individual freedom above everything. While the city of Portland entertains a few Tibet separatists, has it ever occurred to its decision-makers that their move are infringing on the interest of 2.8-million Tibetans here in China?

Whither Gene Patents?

Wired and GenomeWeb (subscription only) have a bit of reporting on arguments in a case that will probably substantially affect patents on genes.  The case is Association for Molecular Pathology et al. v. US Patent and Trademark Office, otherwise known as "the BRCA1 case", which seeks to overturn a patent held by Myriad Genetics on a genetic sequence correlated with breast cancer.

Here is a brief summary of what follows: I have never understood how naturally occurring genes can be patentable, but at present patents are the only way to stake out a property right on genes that are hacked or, dare I say it, "engineered".  So until IP law is changed to allow some other form of protection on genes, patents are it.

The ACLU is requesting a summary judgment that the patent in question be overturned without a trial.  Success in that endeavor would have immediate and enormous effect on the biotech industry as a whole, and I doubt the ACLU is going to get that in one go.  (Here is the relevant recent ACLU press release.)

However, the lawsuit explicitly addresses the broader question of whether any patents should have been granted in the first place on human genes.  This gets at the important question of whether isolating and purifying a bit of natural DNA counts as an invention.  Myriad is arguing that moving DNA out of the human genome and into a plasmid vector counts as sufficient innovation.  This has been at the core of arguments supporting patents on naturally occurring genes for decades, and it has never made sense to me for several reasons.  First, changing the context of a naturally occurring substance does not constitute an invention -- purifying oxygen and putting it in a bottle would never be patentable.  US case law is very clear on this matter.  Second, moving the gene to a new context in a plasmid, or putting it into a cell line for expression and culturing, doesn't change its function.  In fact, the whole point of the exercise would be to maintain the function of the gene for study, which is sort of the opposite of invention.  Nonetheless, Myriad wants to maintain its monopoly.  But their arguments just aren't that strong.

GenomeWeb reports that defense attorney Brian Poissant argued that "'women would not even know they had BRCA gene if it weren't discovered' under a system that incentivizes patents."  This is, frankly, and with all due respect, a manifestly stupid argument.  Mr. Poissant is suggesting that all of science and technology would stop without the incentive of patents.  Given that most research doesn't result in a patent, and given that most patent applications are rejected, Mr. Poissant's argument is on its face inconsistent with reality.  He might have tried to argue more narrowly that developing a working diagnostic assay requires a guarantee on investment through the possession of the monopoly granted by a patent.  But he didn't do that.  To be sure, the assertion that the particular gene under debate in this case would have gone undiscovered without patents is an untestable hypothesis.  But does Mr. Poissant really want the judge to believe that scientists around the world would have let investigation into that gene and disease lie fallow without the possibility of a patent?  As I suggested above, it just isn't a strong argument.  But we can grind it further into the dust.

Mr. Poissant also argued "that if a ruling were as broadly applied here as the ACLU would like then it could 'undermine the entire biotechnology sector.'"  This is, at best, an aggressive overgeneralization.  As I have described several times over the past couple of years (here and here, for starters), even drugs are only a small part of the revenues from genetically modified systems.  Without digging into the undoubtedly messy details, a quick troll of Google suggests that molecular diagnostics as a whole generate only $3-4 billion a year, and at a guess DNA tests are probably a good deal less than half of this.  But more importantly, of the ~2% of US GDP (~$220-250 billion) presently derived from biological technologies, the vast majority is from drugs, plants, or bacteria that have been hacked with genes that themselves are hacked.  That is, both the genes and the host organisms have been altered in a way that is demonstrably dependent on human ingenuity.  What all this means is that only a relatively small fraction of "the entire biotechnology sector" is related to naturally occurring genes in the first place.
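A quick sanity check on that fraction, using the round figures quoted above (I take the midpoints of the ranges, which is my own simplifying assumption):

```python
# Rough share of US biotech revenues attributable to molecular diagnostics,
# using midpoints of the ranges quoted in the text.
molecular_dx = 3.5e9       # $3-4 billion/yr, midpoint
biotech_revenue = 235e9    # ~$220-250 billion/yr (~2% of US GDP), midpoint

share = molecular_dx / biotech_revenue
print(f"molecular diagnostics are ~{share:.1%} of biotech revenues")
```

Call it one to two percent, and DNA tests on naturally occurring genes are only a slice of even that.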

I perused some of the court filings (via the Wired article), and the defense needs to up its game.  Perhaps they think the weight of precedent is on their side.  I would not be as confident as they are. 

But neither is the plaintiff putting its best foot forward.  Even though I like the analysis made comparing DNA patents to attempts to patent fresh fruit, it is unclear to me that the ACLU is being sufficiently careful with both its logic and its verbiage.  In the press release, ACLU attorney Chris Hansen is quoted as saying "Allowing patents on genetic material imposes real and severe limits on scientific research, learning and the free flow of information."  GenomeWeb further quotes the ACLU's Hansen as saying "Patenting human genes is like patenting e=mc2, blood, or air."

As described above, I agree that patenting naturally occurring genes doesn't make a lot of sense.  But we need some sort of property right as an incentive for innovators.  Why should I invest in developing a new biological technology, relying on DNA sequences that have never occurred in nature, if anybody can make off with the sequence (and revenues)?  As it happens, I am not a big fan of patents -- they cost too damn much.  At present, the patent we are pursuing at Biodesic is costing about ten times as much as the capital cost of developing the actual product.  Fees paid to lawyers account for 90% of that.  If it were realistically possible to engage the patent office without a lawyer, then the filing fees would be about the same as the capital cost of development, which seems much more reasonable to me.

I go into these issues at length in the book.  Unfortunately, without Congressional action, there doesn't seem to be much hope for improvement.  And, of course, the direction of any Congressional action will be dominated by large corporations and lawyers.  So much for the little guy.

Are We Cutting Off Our GM Nose to Spite Our Face?

News today that a federal judge has rejected the approval of GM sugar beets by the USDA.  The ruling stated that the government should have done an environmental impact statement, and is similar to a ruling two years ago that led to halting the planting of GM alfalfa.  As in that case, according to the New York Times, "the plaintiffs in the [sugar beet] lawsuit said they would press to ban planting of the biotech beets, arguing that Judge White's decision effectively revoked their approval and made them illegal to grow outside of field trials."  The concern voiced by the plaintiffs, and recognized by the judge, is that pollen from the GM beets might spread transgenes that contaminate GM-free beets.

A few other tidbits from the article: sugar beets now supply about half the US sugar demand, and it seems that GM sugar beets account for about 95% of the US crop (I cannot find any data on the USDA site to support the latter claim).  A spokesman for the nation's largest sugar beet processor claims that food companies, and consumers, have completely accepted sugar from the modified beets -- as they should, because it's the same old sugar molecule. 

I got lured into spending most of my day on this because I noticed that the Sierra Club was one of the plaintiffs.  This surprised me, because the Sierra Club is less of a noisemaker on biotech crops than some of the co-plaintiffs, and usually focuses more on climate issues.  Though there is as yet no press release, digging around the Sierra Club site suggests that the organization wants all GM crops to be tested and evaluated with an impact statement before approval.  But my surprise also comes in part because the best review I can find of GM crops suggests that their growing use is coincident with a substantial reduction in soil loss, carbon emissions, energy use, water use, and overall climate impact -- precisely the sort of technological improvement you might expect the Sierra Club to support.  The reductions in environmental impact -- which range from 20% to 70%, depending on the crop -- come from "From Field to Market" (PDF) published earlier this year by the Keystone Alliance, a diverse collection of environmental groups and companies.  Recall that according to USDA data GM crops now account for about 90% of cotton, soy, and corn.  While the Keystone report does not directly attribute the reduction in climate impacts to genetic modification, a VP at Monsanto recently made the connection explicit (PDF of Kevin Eblen's slides at the 2009 International Farm Management Congress).  Here is some additional reporting/commentary.

So I find myself being pulled into exploring the cost/benefit analysis of biotech crops sooner than I had wanted.  I dealt with this issue in Biology is Technology by punting in the afterword:
 

The broader message in this book is that biological technologies are beginning to change both our economy and our interaction with nature in new ways.  The global acreage of genetically modified (GM) crops continues to grow at a very steady rate, and those crops are put to new uses in the economy every day.  One critical question I avoided in the discussion of these crops is the extent to which GM provides an advantage over unmodified plants.  With more than ten years of field and market experience with these crops in Asia and North and South America, the answer would appear to be yes.  Farmers who have the choice to plant GM crops often do so, and presumably they make that choice because it provides them a benefit.  But public debate remains highly polarized.  The Union of Concerned Scientists recently released a review of published studies of GM crop yields in which the author claimed to "debunk" the idea that genetic modification will "play a significant role in increasing food production."  The Biotechnology Industry Organization responded with a press release claiming to "debunk" the original debunking.  The debate continues.

Obviously we will all be talking about biotech crops for years to come.  I don't see how we are going to address the combination of 1) the need for more biomass for fuel and materials, 2) the mandatory increase in crop yields necessary to feed human populations, and 3) the need to reduce our climatic impacts, without deploying biotech crops at even larger scales than we have so far.  But I am also very aware that nobody, but nobody, truly understands how a GM organism will behave when released into the wild.

We do live in interesting times.

And the Innovation Continues...Starting with Shake and Bake Meth!

My first published effort at tracking the pace and proliferation of biological technologies (PDF) was published in 2003.  In that paper, I started following the efforts of the DEA and the DOJ to restrict production and use of methamphetamine, and also started following the response to those efforts as an example of proliferation and innovation driven by proscription.

The story started circa 2002 with 95% of meth production in Mom and Pop operations that made less than 5 kg per year.  Then the US Government decided to restrict access to the precursor chemicals and also to crack down on domestic production.  As I described in 2008, these enforcement actions did sharply reduce the number of "clandestine laboratory incidents" in the US, but those actions also resulted in a proliferation of production across the US border, and a consequently greater flow of drugs across the border.  Domestic consumption continued to increase.  The DEA acknowledged that its efforts contributed to the development of a drug production and distribution infrastructure that is, "[M]ore difficult for local law enforcement agencies to identify, investigate, and dismantle because [it is] typically much more organized and experienced than local independent producers and distributors."  The meth market thus became both bigger and blacker.

Now it turns out that the production infrastructure for meth has been reduced to a 2-liter soda bottle.  As reported by the AP in the last few days, "The do-it-yourself method creates just enough meth for a few hits, allowing users to make their own doses instead of buying mass-produced drugs from a dealer."  The AP reporters found that meth-related busts are on the increase in 2/3 of the states examined.  So we are back to distributed meth production -- using methods that are even harder to track and crack than bathtub labs -- thanks to innovation driven by attempts to restrict/regulate/proscribe access to a technology.

And in Other News...3D Printers for All

Priya Ganapati recently covered the latest in 3D printing for Wired.  The Makerbot looks to cost about a grand, depending on what you order, and how much of it you build yourself.  It prints all sorts of interesting plastics.  According to the wiki, the "plastruder" print head accepts 3mm plastic filament, so presumably the smallest voxel is 3mm on a side.  Alas this is quite macroscopic, but even if I can't yet print microfluidic components I can imagine all sorts of other interesting applications.  The Makerbot is related to the Reprap, which can now (mostly) print itself.  Combine the two, and you can print a pretty impressive -- and always growing -- list of plastic and metal objects (see the Thingiverse and the Reprap Object Library).

How does 3D printing tie into drug proscription?  Oh, just tangentially, I suppose.  I make more of this in the book.  More power to create in more creative people's hands.  Good luck trying to ban anything in the future.

The Origin of Moore's Law and What it May (Not) Teach Us About Biological Technologies

While writing a proposal for a new project, I've had occasion to dig back into Moore's Law and its origins.  I wonder, now, whether I peeled back enough of the layers of the phenomenon in my book.  We so often hear about how more powerful computers are changing everything.  Usually the progress demonstrated by the semiconductor industry (and now, more generally, IT) is described as the result of some sort of technological determinism instead of as the result of a bunch of choices -- by people -- that produce the world we live in.  This is on my mind as I continue to ponder the recent failure of Codon Devices as a commercial enterprise.  In any event, here are a few notes and resources that I found compelling as I went back to reexamine Moore's Law.

What is Moore's Law?

First up is a 2003 article from Ars Technica that does a very nice job of explaining the whys and wherefores: "Understanding Moore's Law".  The crispest statement within the original 1965 paper is "The number of transistors per chip that yields the minimum cost per transistor has increased at a rate of roughly a factor of two per year."  At its very origin, Moore's Law emerged from a statement about cost, and economics, rather than strictly about technology.

I like this summary from the Ars Technica piece quite a lot:

Ultimately, the number of transistors per chip that makes up the low point of any year's curve is a combination of a few major factors (in order of decreasing impact):

  1. The maximum number of transistors per square inch (or, alternately put, the size of the smallest transistor that our equipment can etch),
  2. The size of the wafer,
  3. The average number of defects per square inch,
  4. The costs associated with producing multiple components (i.e., packaging costs, the costs of integrating multiple components onto a PCB, etc.).

In other words, it's complicated.  Notably, the article does not touch on any market-associated factors, such as demand and the financing of new fabs.
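The cost minimum Moore was describing can be illustrated with a toy model: per-chip packaging costs push you toward bigger chips, while defect-driven yield loss pushes you toward smaller ones. All the constants below are illustrative placeholders, not industry data, and the exponential yield curve is the standard Poisson defect approximation:

```python
import math

# Toy model of Moore's 1965 framing: for a fixed process, the cost per good
# transistor has a minimum. Bigger dies amortize per-chip packaging cost,
# but yield falls off exponentially with die area (Poisson defect model).
WAFER_COST = 1000.0          # $ per processed wafer (illustrative)
WAFER_AREA = 7000.0          # usable mm^2 per wafer (illustrative)
DEFECT_DENSITY = 0.005       # defects per mm^2 (illustrative)
AREA_PER_TRANSISTOR = 0.01   # mm^2 per transistor at this node (illustrative)
PACKAGE_COST = 1.0           # $ per packaged chip (illustrative)

def cost_per_transistor(n: int) -> float:
    """Cost per good transistor for chips carrying n transistors each."""
    die_area = n * AREA_PER_TRANSISTOR
    dies_per_wafer = WAFER_AREA / die_area
    yield_fraction = math.exp(-DEFECT_DENSITY * die_area)  # Poisson yield
    good_dies = dies_per_wafer * yield_fraction
    cost_per_good_die = WAFER_COST / good_dies + PACKAGE_COST
    return cost_per_good_die / n

candidates = [2 ** k for k in range(8, 18)]  # 256 .. 131,072 transistors/chip
best = min(candidates, key=cost_per_transistor)
for n in candidates:
    print(f"{n:7d} transistors/chip -> ${cost_per_transistor(n):.2e} each")
print("cheapest per transistor at", best, "transistors per chip")
```

Sweeping the transistor count per chip traces out the U-shaped cost curve Moore described; as defect density falls with process maturity, the minimum shifts toward larger chips, which is one way to read the "factor of two per year" observation.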

The Wiki on Moore's Law has some good information, but isn't very nuanced.

Next, here is an excerpt from an interview Moore did with Charlie Rose in 2005:

Charlie Rose:     ...It is said, and tell me if it's right, that this was part of the assumptions built into the way Intel made its projections. And therefore, because Intel did that, everybody else in the Silicon Valley, everybody else in the business did the same thing. So it achieved a power that was pervasive.

Gordon Moore:   That's true. It happened fairly gradually. It was generally recognized that these things were growing exponentially like that. Even the Semiconductor Industry Association put out a roadmap for the technology for the industry that took into account these exponential growths to see what research had to be done to make sure we could stay on that curve. So it's kind of become a self-fulfilling prophecy.

Semiconductor technology has the peculiar characteristic that the next generation always makes things higher performance and cheaper - both. So if you're a generation behind the leading edge technology, you have both a cost disadvantage and a performance disadvantage. So it's a very non-competitive situation. So the companies all recognize they have to stay on this curve or get a little ahead of it.

Keeping up with 'the Law' is as much about the business model of the semiconductor industry as about anything else.  Growth for the sake of growth is an axiom of western capitalism, but it is actually a fundamental requirement for chipmakers.  Because the cost per transistor is expected to fall exponentially over time, you have to produce exponentially more transistors to maintain your margins and satisfy your investors.  Therefore, Intel set growth as a primary goal early on.  Everyone else had to follow, or be left by the wayside.  The following is from the recent Briefing in The Economist on the semiconductor industry:

...Even the biggest chipmakers must keep expanding. Intel today accounts for 82% of global microprocessor revenue and has annual revenues of $37.6 billion because it understood this long ago. In the early 1980s, when Intel was a $700m company--pretty big for the time--Andy Grove, once Intel's boss, notorious for his paranoia, was not satisfied. "He would run around and tell everybody that we have to get to $1 billion," recalls Andy Bryant, the firm's chief administrative officer. "He knew that you had to have a certain size to stay in business."

Grow, grow, grow

Intel still appears to stick to this mantra, and is using the crisis to outgrow its competitors. In February Paul Otellini, its chief executive, said it would speed up plans to move many of its fabs to a new, 32-nanometre process at a cost of $7 billion over the next two years. This, he said, would preserve about 7,000 high-wage jobs in America. The investment (as well as Nehalem, Intel's new superfast chip for servers, which was released on March 30th) will also make life even harder for AMD, Intel's biggest remaining rival in the market for PC-type processors.

AMD got out of the atoms business earlier this year by selling its fab operations to a sovereign wealth fund run by Abu Dhabi.  We shall see how they fare as a bits-only design firm, having sacrificed the ability to push (and rely on) scale themselves.

Where is Moore's Law Taking Us?

Here are a few other tidbits I found interesting:

Re the oft-forecast end of Moore's Law, here is Michael Kanellos at CNET grinning through his prose: "In a bit of magazine performance art, Red Herring ran a cover story on the death of Moore's Law in February--and subsequently went out of business."

And here is somebody's term paper (no disrespect there -- it is actually quite good, and is archived at Microsoft Research) quoting an interview with Carver Mead:

Carver Mead (now Gordon and Betty Moore Professor of Engineering and Applied Science at Caltech) states that Moore's Law "is really about people's belief system, it's not a law of physics, it's about human belief, and when people believe in something, they'll put energy behind it to make it come to pass." Mead offers a retrospective, yet philosophical explanation of how Moore's Law has been reinforced within the semiconductor community through "living it":

After it's [Moore's Law] happened long enough, people begin to talk about it in retrospect, and in retrospect it's really a curve that goes through some points and so it looks like a physical law and people talk about it that way. But actually if you're living it, which I am, then it doesn't feel like a physical law. It's really a thing about human activity, it's about vision, it's about what you're allowed to believe. Because people are really limited by their beliefs, they limit themselves by what they allow themselves to believe what is possible. So here's an example where Gordon [Moore], when he made this observation early on, he really gave us permission to believe that it would keep going. And so some of us went off and did some calculations about it and said, 'Yes, it can keep going'. And that then gave other people permission to believe it could keep going. And [after believing it] for the last two or three generations, 'maybe I can believe it for a couple more, even though I can't see how to get there'. . . The wonderful thing about [Moore's Law] is that it is not a static law, it forces everyone to live in a dynamic, evolving world.

So the actual pace of Moore's Law is about expectations, human behavior, and, not least, economics, but has relatively little to do with the cutting edge of technology or with technological limits.  Moore's Law as encapsulated by The Economist is about the scale necessary to stay alive in the semiconductor manufacturing business.  To bring this back to biological technologies, what does Moore's Law teach us about playing with DNA and proteins?  Peeling back the veneer of technological determinism enables us (forces us?) to examine how we got where we are today. 

A Few Meandering Thoughts About Biology

Intel makes chips because customers buy chips.  According to The Economist, a new chip fab now costs north of $6 billion.  Similarly, companies make stuff out of, and using, biology because people buy that stuff.  But nothing in biology, and certainly not a manufacturing plant, costs $6 billion.

Even a blockbuster drug, which could bring revenues in the range of $50-100 billion during its commercial lifetime, costs less than $1 billion to develop.  Scale wins in drug manufacturing because drugs require lots of testing, and require verifiable quality control during manufacturing, which costs serious money.

Scale wins in farming because you need...a farm.  Okay, that one is pretty obvious.  Commodities have low margins, and unless you can hitch your wagon to "eat local" or "organic" labels, you need scale (volume) to compete and survive.

But otherwise, it isn't obvious that there are substantial barriers to participating in the bio-economy.  Recalling that this is a hypothesis rather than an assertion, I'll venture back into biofuels to make more progress here.

Scale wins in the oil business because petroleum costs serious money to extract from the ground, because the costs of transporting that oil are reduced by playing a surface-to-volume game, and because thermodynamics dictates that big refineries are more efficient refineries.  It's all about "steel in the ground", as the oil executives say -- in the deserts of the Middle East, in the Straits of Malacca, etc.  But here is something interesting to ponder: oil production may have maxed out at about 90 million barrels a day (see this 2007 article in the FT, "Total chief warns on oil output").  There may be lots of oil in the ground around the world, but our ability to move it to market may be limited.  Last year's report from Bio-era, "The Big Squeeze", observed that since about 2006, the petroleum market has in fact relied on biofuels to supply volumes above the ~90 million barrel per day mark.  This leads to an important consequence for distributed biofuel production that only recently penetrated my thick skull.

Below the 90 million barrel threshold, oil prices fall because supply will generally exceed demand (modulo games played by OPEC, Hugo Chavez, and speculators).  In that environment, biofuels have to compete against the scale of the petroleum markets, and margins on biofuels get squeezed as the price of oil falls.  However, above the 90 million per day threshold, prices start to rise rapidly (perhaps contributing to the recent spike, in addition to the actions of speculators).  In that environment, biofuels are competing not with petroleum, but with other biofuels.  What I mean is that large-scale biofuels operations may have an advantage when oil prices are low because large-scale producers -- particularly those making first-generation biofuels, like corn-based ethanol, that require lots of energy input -- can eke out a bit more margin through surface to volume issues and thermodynamics.  But as prices rise, both the energy to make those fuels and the energy to move those fuels to market get more expensive.  When the price of oil is high, smaller scale producers -- particularly those with lower capital requirements, as might come with direct production of fuels in microbes -- gain an advantage because they can be more flexible and have lower transportation costs (being closer to the consumer).  In this price-volume regime, petroleum production is maxed out and small scale biofuels producers are competing against other biofuels producers since they are the only source of additional supply (for materials, as well as fuels).
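The regime-switching argument above can be condensed into a toy sketch. This is my own illustration, not anything from the report: the 90-million-barrel ceiling is the only number taken from the text, and the function names and the binary "who is the marginal competitor" framing are simplifying assumptions.

```python
# Toy model of the two price regimes described above.
# Assumption: petroleum supply is effectively capped near 90 million
# barrels/day, so incremental demand beyond that cap can only be met
# by biofuels.

PETRO_CAP_BPD = 90e6  # assumed petroleum supply ceiling, barrels/day


def marginal_competitor(total_demand_bpd: float) -> str:
    """Return who a new biofuel producer competes against at the margin.

    Below the cap, biofuels are squeezed by petroleum's scale economies;
    above it, the only source of additional supply is other biofuels.
    """
    if total_demand_bpd <= PETRO_CAP_BPD:
        return "petroleum"
    return "other biofuels"


def favored_producer(total_demand_bpd: float) -> str:
    """Which biofuel producer type the post argues is favored in each regime."""
    if marginal_competitor(total_demand_bpd) == "petroleum":
        # Low-price regime: margin comes from thermodynamics and
        # surface-to-volume advantages, which favor large plants.
        return "large-scale"
    # High-price regime: energy inputs and transport get expensive,
    # favoring flexible, low-capital producers close to consumers.
    return "small-scale"
```

For example, `favored_producer(85e6)` returns `"large-scale"` while `favored_producer(95e6)` returns `"small-scale"`, mirroring the two regimes sketched in the paragraph above.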

This is getting a bit far from Moore's Law -- the section heading does contain the phrase "meandering thoughts" -- I'll try to bring it back.  Whatever the origin of the trends, biological technologies appear to be the same sort of exponential driver for the economy as are semiconductors.  Chips, software, DNA sequencing and synthesis: all are infrastructure that contribute to increases in productivity and capability further along the value chain in the economy.  The cost of production for chips (especially the capital required for a fab) is rising.  The cost of production for biology is falling (even if that progress is uneven, as I observed in the post about Codon Devices).&nb sp; It is generally becoming harder to participate in the chip business, and it is generally becoming easier to participate in the biology business.  Paraphrasing Carver Mead, Moore's Law became an organizing principal of an industry, and a driver of our economy, through human behavior rather than through technological predestination.  Biology, too, will only become a truly powerful and influential technology through human choices to develop and deploy that technology.  But access to both design tools and working systems will be much more distributed in biology than in hardware.  It is another matter whether we can learn to use synthetic biological systems to improve the human condition to the extent we have through relying on Moore's Law.