Seeing The End Of Oil

(Originally posted at Planetary Technologies, 04 October, 2019. Written in advance of the 2019 White House Bioeconomy Summit.)

Summary. The end of petroleum is in sight. The reason is simple: the black goo that powered and built the 20th century is now losing economically to other technologies. Petroleum is facing competition at both ends of the barrel, from low value, high volume commodities such as fuel, up through high value, low volume chemicals. Electric vehicles and renewable energy will be the most visible threats to commodity transportation fuel demand in the short term, gradually outcompeting petroleum via both energy efficiency and capital efficiency. Biotechnology will then deliver the coup de grace, first by displacing high value petrochemicals with products that have lower energy and carbon costs, and then by delivering new carbon negative biochemicals and biomaterials that cannot be manufactured easily or economically, if at all, from petrochemical feedstocks.

Bioeconomy Capital is investing to accelerate, and to profit from, the transition away from petroleum to biomanufacturing. We will continue to pursue this strategy beyond the endgame of oil into the coming era when sophisticated biological technologies completely displace petrochemicals, powered by renewable energy and containing only renewable carbon. We place capital with companies that are building critical infrastructure for the 21st century global economy. There is a great foundation to build on.

Biotechnology is already an enormous industry in the U.S., contributing more than 2% of GDP (see “Estimating the biotech sector's contribution to the US economy”; updates on the Bioeconomy Dashboard). The largest component of the sector, industrial biotechnology, comprises materials, enzymes, and tools, with biochemicals alone generating nearly $100B in revenues in 2017 (note that this figure excludes biofuels). That $100B is already between 1/6 and 1/4 of fine chemicals revenues in the U.S., depending on whether you prefer to use data from industrial associations or from the government. In other words, biochemicals are already outcompeting petrochemicals in some categories. That displacement is a clear indication that the global economy is well into shifting away from petrochemicals.

See the Bioeconomy Dashboard for downloadable graphics and additional analysis.

The common pushback to any story about the end of fossil fuels is to assert that nothing can be cheaper than an energy-rich resource that oozes from a hole in the ground. But, as we shall see, that claim is now simply, demonstrably, false for most petroleum production and refining, particularly when you include the capital required to deliver the end use of that petroleum. It is true that raw petroleum is energy rich. But it is also true that it takes a great deal of energy, and a great deal of capital-intensive infrastructure, to process and separate oil into useful components. Those components have quite different economic value depending on their uses. And it is through examining the economics of those different uses that one can see the end of oil coming.

First, let us be clear: the demise of the petroleum industry as we know it will not come suddenly. Oil became a critical energy and materials feedstock for the global economy over more than a century, and oil is not going to disappear overnight. Nor will the transition be smooth. Revenues from oil are today integral to maintaining many national budgets, and thus governments, around the globe. As oil fades away, governments that continue to rely on petroleum revenues will be forced to reduce spending. Those governments have a relatively brief window to diversify their economies away from heavy reliance on oil, for example by investing in domestic development of biotechnology. Without that diversification, some of those governments may fall because they cannot pay their bills. Yet even when oil’s clear decline becomes apparent to everyone, it will linger for many years. Government revenues for low cost producers (e.g., Iran and Saudi Arabia) will last longer than those for high cost producers (e.g., Brazil and Canada). But the end is coming, and it will be delivered by the interaction of many different technical and economic trends. This post is an outline of how all the parts will come together.

WHAT PRODUCES VALUE IN A BARREL OF OIL? ERGS AND ATOMS

Any analysis of the future of petroleum that purports to make sense of the industry must grapple with two kinds of complexity. Firstly, the industry as a whole is enormously complex, with different economic factors at work in different geographies and subsectors, and with those subsectors in turn relying on a wide variety of technologies and processes. In 2017, The Economist published a useful graphic and accompanying story (“The world in a barrel”) that explored this complexity. Moreover, the cost of recovering a barrel and delivering it to market in different countries varies widely, between $10 and $70. Further complicating analysis, those reported cost estimates also vary widely, depending on both the data source and the analyst: here is The Economist, and here is the WSJ, and note that these articles cite the same source data but report quite different costs. The total market value of petroleum products is about $2T per year, a figure that of course varies with the price of crude.

Secondly, “a barrel of oil” is itself complex; that is, barrels are neither the same nor internally homogeneous. Not only are barrels from different wells composed of different spectra of molecules (see the lower left panel of The Economist’s “Breaking down oil” graphic), but those molecules have very different end uses. Notably, on average, of the approximately 44 gallons of products generated by refining each barrel of petroleum, >90% winds up as heating oil or transportation fuel. Another approximately 5% comprises bitumen and coke. Both of these are low value; bitumen (aka “tar”) gets put on roads and coke is often combined with coal and burned. In other words, about 42 of the 44 gallons of products from a barrel of oil are applied to roads or burned for the energy (the ergs) they contain.

The other 2% of a barrel, or 1-2 gallons depending on where it comes from, comprises the matter (the atoms) from which we build our world today. This includes plastics precursors, lubricants, solvents, aromatic compounds, and other chemical feedstocks. After being further processed via synthetic chemistry into more complex compounds, these feedstocks wind up as constituents of nearly everything we build and buy. It is widely repeated that chemical products are components of 96% of U.S.-manufactured goods. That small volume fraction of a barrel of oil is thus enormously important for the global economy; just ~2% of the barrel produces ~25% of the final economic value of the original barrel of crude oil, to the tune of more than $650B annually.
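
To make that lopsided split concrete, here is the arithmetic implied by the rough fractions above, as a minimal illustrative sketch (it assumes only the ~2% volume and ~25% value figures already quoted; nothing else):

```python
# Illustrative arithmetic: how much more value per gallon the petrochemical
# fraction of a barrel generates, relative to the fuel and bitumen fraction.
# Inputs are the rough fractions quoted in the text above.

feedstock_volume_share = 0.02   # ~2% of refined volume becomes petrochemical feedstock
feedstock_value_share = 0.25    # ~25% of the barrel's final economic value

fuel_volume_share = 1 - feedstock_volume_share
fuel_value_share = 1 - feedstock_value_share

# Value per unit volume, relative to the barrel average (average = 1.0)
feedstock_density = feedstock_value_share / feedstock_volume_share   # ~12.5x
fuel_density = fuel_value_share / fuel_volume_share                  # ~0.77x

print(f"Petrochemical fraction: {feedstock_density:.1f}x average value per gallon")
print(f"Fuel and bitumen:       {fuel_density:.2f}x average value per gallon")
print(f"Ratio:                  ~{feedstock_density / fuel_density:.0f}:1")
```

On a per-gallon basis, in other words, the petrochemical fraction is worth roughly sixteen times the rest of the barrel.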

CHEAPER ERGS

The big news about the ergs in every barrel is that their utility is coming to an end because the internal combustion engine (ICE) is on its way out. Electric vehicles (EVs) are coming in droves. EVs are far more efficient, and have many fewer parts, than ICE powered vehicles. Consequently, maintenance and operating costs for EVs are significantly lower than for ICE vehicles. Even a relatively expensive Tesla Model 3 is cheaper to own and operate over 15 years than is a Honda Accord. Madly chasing Tesla into the EV market, and somewhat late to the game, Volkswagen has announced it is getting out of manufacturing ICEs altogether. Daimler will invest no more in ICE engineering and will produce only electric cars in the future. Daimler is also launching an electric semi truck in an effort to compete with Tesla’s forthcoming freight hauler. Not to be left out, VW just announced its own large investment into electric semi trucks. Adding to the trend, last week Amazon ordered 100,000 electric delivery trucks. Mass transit is also shifting to EVs. Bloomberg reported earlier in 2019 that, by the end of this year, “a cumulative 270,000 barrels a day of diesel demand will have been displaced by electric buses.” In China, total diesel demand is already falling, and gasoline demand may well peak this year (see below). Bloomberg points to EVs as the culprit. Finally, as described in a recent report by Mark Lewis at BNP Paribas, the combination of renewable electricity and EVs is already 6-7X more capital efficient than fossil fuels and ICEs at delivering you to your destination; i.e., oil would have to fall to $10-$20/barrel to be competitive.

From “China Is Winning the Race to Dominate Electric Cars”, Nathaniel Bullard, Bloomberg, 20 September, 2019

Consequently, for the ~75% of the average barrel already directly facing competition from cheaper electricity provided by renewables, the transition away from oil is already well underway. Gregor Macdonald covers much of this ground quite well in his short book Oil Fall, as well as in his newsletter. Macdonald also demonstrates that renewable electricity generation is growing much faster than is EV deployment, which puts any electricity supply concerns to rest. We can roll out EVs as fast as we can build them, and anyone who buys and drives one will save money compared to owning and operating a new ICE vehicle. Forbes put it succinctly: “Economics of Electric Vehicles Mean Oil's Days As A Transport Fuel Are Numbered.”

But it isn’t just the liquid transportation fuel use of oil that is at risk, because it isn’t just ergs that generate value from oil. Here is where the interlocking bits of the so-called “integrated petroleum industry” are going to cause financial problems. Recall that each barrel of oil is complex, composed of many different volume fractions, which have different values, and which can only be separated via refining. You cannot pick and choose which volume fraction to pull out of the ground. As described above, a disproportionate fraction of the final value of a barrel of oil is due to petrochemicals. In order to get a hold of the 2% of a barrel that constitutes petrochemical feedstocks, and thereby produce the 25% of total value derived from those compounds, you have to extract and handle the other 98% of the barrel. And if you are making less money off that 98% due to decreased demand, then the cost of production for the 2% increases. It is possible to interconvert some of the components of a barrel via cracking and synthesis, which might enable lower value compounds to become higher value compounds, but it is also quite expensive and energy intensive to do so. Worse for the petroleum industry, natural gas can be converted into several low cost petrochemical feedstocks, adding to the competitive headwinds the oil industry will face over the coming decade. Still, there is a broad swath of economically and technologically important petroleum compounds that currently have no obvious replacement. So the real question that we have to answer is not what might displace the ergs in a barrel of oil — that is obvious and already happening via electrification. The much harder question is: where do we get all the complex compounds — that is, the atoms, in the form of petrochemicals and feedstocks — from which we currently build our complex economy? The answer is biology.

Biochemicals are already competing with petrochemicals in a ~$650B global market.

RENEWABLE ATOMS

Bioeconomy Fund 1 portfolio companies Arzeda, Synthace, and Zymergen have already demonstrated that they can design, construct, and optimize new metabolic pathways to directly manufacture any molecule derived from a barrel of oil. Again, at least 17%, and possibly as much as 25%, of US fine chemicals revenues are already generated by products of biotechnology. To be sure, there is considerable work to do before biotechnology can capture the entire ~$650B petrochemical revenue stack. We have to build lots of organisms, and lots of manufacturing capacity in which to grow those organisms. But scores of start-ups and Fortune 50 companies alike are pursuing this goal. As metabolic engineering and biomanufacturing mature, an increasing number of these companies will succeed.

The attraction is obvious: the prices for high value petrochemicals are in the range of $10 to $1000 per liter. And whereas the marginal cost of new production capacity for petroleum products is around $20 billion — the cost of a new refinery — the equivalent for biological production looks like a beer brewery, which comes in at between $100,000 and $10 million, depending on the scale. This points to one of the drivers for adopting biotechnology that isn’t yet on the radar for most analysts and investors: the return on capital for biological production will be much higher than for petroleum products, while the risk will be much lower. This gap in understanding the current and future advantages of biology in chemicals manufacturing shows up in overoptimistic growth predictions all across the petroleum industry.
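
As a minimal sketch of that capital granularity argument, using only the round numbers above (the $20 billion refinery and the $100,000 to $10 million brewery-scale plant; this is an illustration, not a cost model):

```python
# Capital granularity: how many brewery-scale biomanufacturing plants could be
# built for the capital cost of a single new refinery? Figures are the rough
# numbers quoted in the text above.

refinery_capex = 20e9            # ~$20B for a new refinery
bio_plant_capex = (1e5, 1e7)     # ~$100K to ~$10M per brewery-scale facility

plants_low = refinery_capex / max(bio_plant_capex)    # ~2,000 plants at $10M each
plants_high = refinery_capex / min(bio_plant_capex)   # ~200,000 plants at $100K each

print(f"One refinery's capex buys roughly {plants_low:,.0f} to {plants_high:,.0f} "
      "brewery-scale plants")
```

The point is not the precise count, but that biological capacity can be added, or idled, in small increments as demand materializes, which is exactly what lowers the financial risk.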

For example, the IEA recently forecast that petrochemicals will account for the largest share of demand growth for the petroleum industry over the next two decades. But the IEA, and the petroleum industry, are likely to be surprised and disappointed by the performance of petrochemicals. This volume fraction is, as noted above, already being replaced by the products of biotechnology. (Expected demand growth in “Passenger vehicles”, “Freight”, and “Industry”, uses that largely comprise transportation fuel and lubricants, will also be disappointing due to electrification.) We should certainly expect the demand for materials to grow, but Bioeconomy Capital is forecasting that by 2030 the bulk of new chemical supply will be provided by biology, and that by 2040 biochemicals will be outcompeting petrochemicals all across the spectrum. This transition could happen faster, depending on how much investment is directed at accelerating the roll out of biological engineering and manufacturing.

Before moving on, we have to address the role of biofuels in the future economy. Because biofuels are very similar to petroleum both technologically and economically — that is, biofuels are high volume, low margin commodities that are burned at low efficiency — they will generally suffer the same fate, and from the same competition, as petroleum. The probable exception is aviation fuel, and perhaps maritime fuel, which may be hard to replace with batteries and electricity for long haul flights and transoceanic surface shipment.

But this likely fate for biofuels points to the use of those atoms in other ways. As of 2019, approximately 10% of U.S. gasoline consumption is contributed by ethanol, as mandated in the Renewable Fuels Standard. That volume is the equivalent of 4% of a barrel of oil, and it is derived from corn kernels. As ethanol demand falls, those renewably-sourced atoms will be useful as feedstocks for products that displace other components of a barrel of oil. The obvious use for those atoms is in the biological manufacture of chemicals. Based on current yields of corn, and ongoing improvements in using more of each corn plant as feedstock, there are more than enough atoms available today just from U.S. corn harvests, let alone other crops, to displace the entire matter stream from oil now used as petrochemical feedstocks.
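
The relative scale is easy to check using only the volume fractions already quoted (ethanol equivalent to ~4% of a barrel, petrochemical feedstocks ~2% of a barrel); a rough sketch:

```python
# Rough comparison of atom streams, per the volume fractions quoted above.

ethanol_share = 0.04     # U.S. corn ethanol, as an equivalent fraction of a barrel
feedstock_share = 0.02   # petrochemical feedstock fraction of a barrel

print(f"Corn ethanol already supplies ~{ethanol_share / feedstock_share:.0f}x the volume "
      "of the petrochemical feedstock fraction")
```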

BEYOND PETROCHEMISTRY

The economic impact of biochemical manufacturing is thus likely to grow significantly over the next decade. Government and private sector investments have resulted in the capability today to biomanufacture not just every molecule that we now derive from a barrel of petroleum, but, using the extraordinary power of protein engineering and metabolic engineering, to also biomanufacture a wide range of new and desirable molecules that cannot plausibly be made using existing chemical engineering techniques. This story is not simply about sustainability. Instead, the power of biology can be used to imbue products with improved properties. There is enormous economic and technical potential here. The resulting new materials, manufactured using biology, will impact a wide range of industries and products, far beyond what has been traditionally considered the purview of biotechnology.

For example, Arzeda is now scaling up the biomanufacturing of a methacrylate compound that can be used to dramatically improve the properties of plexiglass. This compound has long been known by materials scientists, and long been desired by chemical engineers for its utility in improving such properties as temperature resistance and hardness, but no one could figure out how to make it economically in large quantities. Arzeda's biological engineers combined enzymes from different organisms with enzymes that they themselves designed, and that have never existed before, to produce the compound at scale. This new material will shortly find its way into such products as windshields, impact resistant glass, and aircraft canopies.

Similarly, Zymergen is pursuing remarkable new materials that will transform consumer electronics. Zymergen is developing films and coatings, destined for flexible electronics and displays, that have combinations of properties unachievable through synthetic chemistry. These materials simply cannot be made using the existing chemical toolbox; biological engineering gives access to combinations of material properties that cannot be formulated any other way. Biological engineering will bring about a renaissance in materials innovation. Petroleum was the foundation of the technology that built the 20th century. Biology is the technology of the 21st century.

FINANCING RISK

The power and flexibility of biological manufacturing create capabilities that the petroleum industry cannot match. Ultimately, however, the petroleum industry will fade away not because demand for energy and materials suddenly disappears, or because that demand is suddenly met by renewable energy and biological manufacturing. Instead, long before competition to supply ergs and atoms displaces the contents of the barrel, petroleum will die by the hand of finance.

The fact that both ends of the barrel are facing competition from technologically and economically superior alternatives will eventually lead to concerns about oil industry revenues. And that concern will reduce enthusiasm for investment. That investment will falter not because total petroleum volumes see an obvious absolute drop, but rather because the contents of the “marginal barrel” – that is, the next barrel produced – will start to be displaced by electricity and by biology. This is already happening in China and in California, as documented by Bloomberg and by Gregor Macdonald. Thus the first sign of danger for the oil industry is that expected growth will not materialize. Because it is growth prospects that typically keep equities prices high via demand for those equities, no growth will lead to low demand, which will lead to falling stock prices. Eventually, the petroleum industry will fail because it stops making money for investors.

The initial signs of that end are already apparent. In an opinion piece in the LA Times, Jagdeep Singh Bachher, the University of California’s chief investment officer and treasurer, and Richard Sherman, chairman of the UC Board of Regents’ Investments Committee, write that “UC investments are going fossil free. But not exactly for the reasons you may think.” Bachher and Sherman made this decision not based on any story about saving the planet or on reducing carbon emissions. The reason for getting rid of these assets, put simply, is that fossil fuels are no longer a good long-term investment, and that other choices will provide better returns:

We believe hanging on to fossil fuel assets is a financial risk [and that] there are more attractive investment opportunities in new energy sources than in old fossil fuels.

An intriguing case study of perceived value and risk is the 3 year saga of the any-day-now-no-really Saudi Aramco IPO. Among the justifications frequently mooted for the IPO is the need to diversify the country's economy away from oil into industries with a brighter future, including biotechnology, that is, to ameliorate risk:

The listing of the company is at the heart of Prince Mohammed’s ambitious plans to revamp the kingdom’s economy, with tens of billions of dollars urgently needed to fund megaprojects and develop new industries.

There have been a few hiccups with this plan. The challenges that Saudi Aramco is facing in its stock market float are multifold, from physical vulnerability to terrorism, to public perception and industry divestment, through to concerns about the long-term price of oil:

When Saudi Arabia’s officials outlined plans to restore output to maximum capacity after attacks that set two major oil facilities ablaze on Saturday, they were also tasked with convincing the world that the national oil company Saudi Aramco was investable.

The notion that the largest petroleum company in the world might have trouble justifying its IPO, and might have trouble hitting the valuation necessary to raise the cash its current owners are looking for, is eye opening. This uncertainty creates the impression that Aramco may have left it too late. The company’s managers may see less value from their assets than they had hoped, precisely because increased financial risk is reducing that value.

And that is the point — each of the factors discussed in this post increases the financing risk for the petroleum industry. Risk increases the cost of capital, and when financiers find better returns elsewhere they rapidly exit the scene. This story will play out for petroleum investments just as it has for coal. Watch what the bankers do; they don’t like to lose money, and the writing is on the wall already. In 2018, global investment in renewable electricity generation was three times larger than the investment in fossil fuel powered generation. Biotechnology already provides at least 17% of chemical industry revenues in the U.S., and is growing in the range of 10-20% annually (see the inset in Figure 2). If you put the pieces together, you can already see the end of oil coming.

DNA Synthesis and Sequencing Costs and Productivity for 2025

In the run-up to Synbiobeta25, I decided to update the cost and productivity curves.

Here is the prior update, with a description of what they are, and are not, and of my history in developing them. You can follow the thread backwards for comments on comparisons to Moore’s Law.

I was also asked recently to provide an opinion about the feasibility of the Human Genome Project 2 proposal, which led me to dig into the performance of the Ultima UG 100 instrument. I will publish my thoughts on HGP 2 later.

The UG 100 is a truly impressive instrument, capable of sequencing >30,000 human genomes annually at 30x coverage, with only about an hour of human hands-on time required to start a sequencing run. The most recent sequencing price and productivity data points are based on the UG 100.

As usual, please remember where you found them.

The price per base of DNA sequencing and synthesis — reading and writing DNA — based on price surveys and industry interviews. Until recently, most synthetic genes (the red line) were assembled from short oligonucleotides (oligos) synthesized in large volumes on columns (pink line). Now genes can be readily assembled from oligos synthesized in very small volumes on arrays, though data on the usage and price of array oligos is difficult to pin down; prices for array oligos are asserted to fall in the range from $.00001 to $.001 per base.

The productivity of DNA synthesis and sequencing, measured as bases per person per day, using commercially available instruments, and compared to Moore's Law, which is a proxy for IT productivity. Productivity in sequencing DNA has increased much faster than Moore's Law in recent years. Productivity in synthesizing DNA must certainly have increased substantially for privately developed and assembled synthesizers, but no new synthesis instruments, and no relevant performance figures, have been released since 2008.

Written comments for Artificial Intelligence and Automated Laboratories for Biotechnology: Leveraging Opportunities and Mitigating Risks, 3-4 April, 2024

Here are my written comments for the recent NASEM workshop “Artificial Intelligence and Automated Laboratories for Biotechnology: Leveraging Opportunities and Mitigating Risks”, convened at the request of the Congressionally-chartered National Security Commission on Emerging Biotechnology (NSCEB), in April, 2024.

The document is composed of two parts: 1) remarks delivered during the Workshop in response to prompts from NASEM and the National Security Commission on Emerging Biotechnology, and 2) remarks prepared in response to comments arising during the Workshop.

PDF

These comments extend and document my thoughts on the reemergent hallucination that restricting access to DNA synthesis will improve security, and that such regulation will do anything other than create perverse incentives that produce insecurity. DNA synthesis, and biotechnology more broadly, are examples of a particular kind of distributed and democratized technology. In large markets served by distributed and accessible production technologies, restrictions on access to those markets and technologies incentivize piracy and create insecurity. There is no data to suggest regulation of such technologies improves security, and here I document numerous examples of counterproductive regulation, including the perverse incentives already created by the 2010 DNA Synthesis Screening Guidelines.

Let’s not repeat this mistake.

Here are a few excerpts:

Biology is a General Purpose Technology. I didn't hear anyone at this meeting use that phrase, but all of our discussions about what we might manufacture using biology, and the range of applications, make clear that we are talking about just such a thing. The Wikipedia entry on GPTs has a pretty good definition: “General-purpose technologies (GPTs) are technologies that can affect an entire economy (usually at a national or global level). GPTs have the potential to drastically alter societies through their impact on pre-existing economic and social structures.” This definitely describes biology. We are already seeing significant economic impacts from biotechnology in the U.S., and we are only just getting started.

My latest estimate is that biotechnology contributed at least $550B to the U.S. economy in 2021, a total that has steadily grown since 1980 at about 10% annually, much faster than the rest of the economy. Moreover, participants in this workshop outlined a future in which various other technologies—hardware, software, and automation, each of which is also recognized as a General Purpose Technology, and each of which contributes significantly to the economy—will be used to enhance our ability to design and manufacture pathways and organisms that will then themselves be used to manufacture other objects.

The U.S. invests in many fields with the recognition that they inform the development of General Purpose Technologies; we expect that photolithography, or control theory, or indeed machine learning, will each have broad impact across the entire economy and social fabric, and so they have. However, in the U.S., investment in biology has been scattershot and application-specific, and its output has been poorly monitored. I do have some hope that the recent focus on the bioeconomy, and the creation of various Congressional and Executive Branch bodies directed to study and secure the bioeconomy, will help. Yet I am on my third White House trying to get the economic impact of biotechnology measured as well as we measure virtually everything else in our economy, and so far the conversation is still about how hard it is to imagine doing this, if only we could first decide how to go about it.

If we in the U.S. were the only ones playing this game, with no outside pressure, perhaps we could take our time and continue fiddling about as we have for the last forty or fifty years. But the global context today is one of multiple stresses from many sources. We must have better biological engineering and manufacturing in order to deal with threats to, and from, nature, whether these are zoonotic pathogens, invasive species, or ecosystems in need of resuscitating, or even rebooting. We face the real threat of engineered organisms or toxins used as weapons by human adversaries. And some of our competitors, countries with a very different perspective on the interaction of the state and political parties with the populace than we have in the U.S., have made very clear that they intend to use biology as a significant, and perhaps the most important, tool in their efforts to dominate the global economy and the politics of the 21st century. So if we want to compete, we need to do better.

In summary, before implementing restrictions on access to DNA synthesis, or lab automation, or machine learning, we must ask what perverse incentives we will create for adaptation and innovation to escape those restrictions. And we must evaluate how perverse incentives may increase risks.

The call to action here is not to do nothing, but rather to be thoughtful about proposed regulation and consider carefully the implications of taking action. I am concerned that we all too frequently embrace the hypothetical security and safety improvements promised by regulation or proscription without considering that we might recapitulate the very real, historically validated, costs of regulation and proscription. Moreover, given the overwhelming historical evidence, those proposing and promoting regulation should explain how this time it will be different, how this time regulation will improve security rather than create insecurity.

Here I will throw down the nitrile gauntlet: would-be regulators frequently get their thinking backwards on regulatory policy. I have heard more than once the proposition “if you don't propose an alternative, we will regulate this”. But, given prior experience, it is the regulators who must explain how their actions will improve the world, and will increase security, rather than achieve the opposite.1 Put very plainly, it is the regulators' responsibility to not implement policies that make things worse.

1 In conversations in Washington DC I also frequently hear “But Rob, we must do something”. To which I respond: must we? What if every action we contemplate has a greater chance of worsening security than improving it? Dissatisfaction with the status quo is a poor rationale for taking actions that are reasonably expected to be counterproductive. Engaging in security theater that obscures a problem for which we have yet to identify a path forward is no security at all.

DNA Cost and Productivity Data, aka "Carlson Curves"

I have received a number of requests in recent days for my early DNA synthesis and productivity data, so I have decided to post it here for all who are interested. Please remember where you found it.

A bit of history: my efforts to quantify the pace of change in biotech started in the summer of 2000 while I was trying to forecast where the industry was headed. At the time, I was a Research Fellow at the Molecular Sciences Institute (MSI) in Berkeley, and I was working on what became the essay “Open Source Biology and Its Impact on Industry”, originally written in the summer of 2000 for the inaugural Shell/Economist World in 2050 Competition and originally titled “Biological Technology in 2050”. I was trying to conceive of where things were going many decades out, and gathering these numbers seemed like a good way to anchor my thinking. I had the first, very rough, data set by about September of 2000. I presented the curves that summer for the first time to an outside audience in the form of a Global Business Network (GBN) Learning Journey that stopped at MSI to see what we were up to. Among the attendees was Stewart Brand, who I understand soon started referring to the data as “Carlson Curves” in his own presentations. I published the data for the first time in 2003 in a paper with the title “The Pace and Proliferation of Biological Technologies”. Somewhere in there Ray Kurzweil started making reference to the curves, and then a 2006 article in The Economist, “Life 2.0”, brought them to a wider audience and cemented the name. It took me years to get comfortable with “Carlson Curves”, because, even if I did sort it out first, it is just data rather than a law of the universe. But eventually I got it through my thick skull that it is quite good advertising.

The data was very hard to come by when I started. Sequencing was still a labor intensive enterprise, and therefore highly variable in cost, and synthesis was slow, expensive, and relatively rare. I had to call people up to get their rough estimates of how much time and effort they were putting in, and also had to root around in journal articles and technical notes looking for any quantitative data on instrument performance. This was so early in the development of the field that, when I submitted what became the 2003 paper, one of the reviews came back with the criticism that the reviewer – certainly the infamous Reviewer Number 2 – was “unaware of any data suggesting that sequencing is improving exponentially”.

Well, yes, that was the first paper that collected such data.

The review process led to somewhat labored language in the paper asserting the “appearance” of exponential progress when comparing the data to Moore's Law. I also recall showing Freeman Dyson the early data, and he cast a very skeptical eye on the conclusion that there were any exponentials to be written about. The data was, in all fairness, a bit thin at the time. But the trend seemed clear to me, and the paper laid out why I thought the exponential trends would, or would not, continue. Stewart Brand, and Drew Endy at the next lab bench over, grokked it all immediately, which lent some comfort that I wasn’t sticking my neck out so very far.

I've written previously about when the comparison with Moore's Law does, and does not, make sense. (Here, here, and here.) Many people choose to ignore the subtleties. I won't belabor the details here, other than to try to succinctly observe that the role of DNA in constructing new objects is, at least for the time being, fundamentally different than that of transistors. For the last forty years, the improved performance of each new generation of chip and electronic device has depended on those objects containing more transistors, and the demand for greater performance has driven an increase in the number of transistors per object. In contrast, the economic value of synthetic DNA is decoupled from the economic value of the object it codes for; in principle you only need one copy of DNA to produce many billions of objects and many billions of dollars in value.

To be sure, prototyping and screening of new molecular circuits requires quite a bit more than one copy of the DNA in question, but once you have your final sequence in hand, your need for additional synthesis for that object goes to zero. And even while the total demand for synthetic DNA has grown over the years, the price per base has on average fallen about as fast; consequently, as best as I can tell, the total dollar value of the industry hasn't grown much over the last ten years. This makes it very difficult to make money in the DNA synthesis business, and may help explain why so many DNA synthesis companies have gone bankrupt or been folded into other operations. Indeed, most of the companies that provided DNA or gene synthesis as a service no longer exist. Due to similar business model challenges, it is difficult to sell stand-alone synthesis instruments. Thus the productivity data series for synthesis instruments ends several years ago, because it is too difficult to evaluate the performance of proprietary instruments run solely by the remaining service providers. DNA synthesis is likely to remain a difficult business until there is a business model in which the final value of the product, whatever that product is, depends on the actual number of bases synthesized and sold. As I have written before, I think that business model is likely to be DNA data storage. But we shall see.
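
A toy model makes the flat-revenue point explicit; the growth and price-decline rates below are hypothetical, chosen only to illustrate how rising volume and falling price can cancel out:

```python
# Toy model: if the price per base falls about as fast as the volume of bases
# shipped grows, total industry revenue barely moves (here it slowly erodes,
# since 1.2 * 0.8 = 0.96 per year). All numbers are hypothetical.

bases = 1e9            # starting volume, bases per year (arbitrary)
price = 0.10           # starting price, USD per base (arbitrary)
demand_growth = 0.20   # hypothetical 20% annual growth in bases shipped
price_decline = 0.20   # hypothetical 20% annual decline in price per base

for year in range(1, 11):
    bases *= 1 + demand_growth
    price *= 1 - price_decline
    print(f"year {year:2d}: revenue = ${bases * price:,.0f}")
```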

The business of sequencing, of course, is another matter. It's booming. But as far as the “Carlson Curves” go, I long ago gave up trying to track this on my own, because a few years after the 2003 paper came out the NHGRI started tracking and publishing sequencing costs. Everyone should just use that data. I do.

Finally, a word on cost versus price. For normal, healthy businesses, you expect the price of something to exceed its cost, and for the business to make at least a little bit of money. But when it comes to DNA, especially synthesis, it has always been difficult to determine the true cost because it has turned out that the price per base has frequently been below the cost, thereby leading those businesses to go bankrupt. There are some service operations that are intentionally run at negative margins in order to attract business; that is, they are loss leaders for other services, or they are run to maintain sufficient scale so that the company retains access to that scale for its own internal projects. There are a few operations that appear to be priced so that they are at least revenue neutral and don't lose money. Thus there can be a wide range of prices at this point in time, which further complicates sorting out how the technology may be improving and what impact this has on the economics of biotech. Moreover, we might expect the price of synthetic DNA to *increase* occasionally, either because providers can no longer afford to lose money or because competition is reduced. There is no technological determinism here. Just as Moore's Law is ultimately a function of industrial planning and expectations, there is nothing about Carlson Curves that says prices must fall monotonically.

A note on methods and sources: as described in the 2003 paper, this data was generally gathered by calling people up or by extracting what information I could from what little was written down and published at the time. The same is true for later data. The quality of the data is limited primarily by that availability and by how much time I could spend to develop it. I would be perfectly delighted to have someone with more resources build a better data set.

The primary academic references for this work are:

Robert Carlson, “The Pace and Proliferation of Biological Technologies”. Biosecurity and Bioterrorism: Biodefense Strategy, Practice, and Science. Sep, 2003, 203-214. http://doi.org/10.1089/153871303769201851.

Robert Carlson, “The changing economics of DNA synthesis”. Nat Biotechnol 27, 1091–1094 (2009). https://doi.org/10.1038/nbt1209-1091.

Robert Carlson, Biology Is Technology: The Promise, Peril, and New Business of Engineering Life, Harvard University Press, 2011. Amazon.

Here are my latest versions of the figures, followed by the data. Updates and commentary are on the Bioeconomy Dashboard.

Creative Commons image license (Attribution-NoDerivatives 4.0 International, CC BY-ND 4.0) terms:

  • Share — copy and redistribute the material in any medium or format for any purpose, even commercially.

  • Attribution — You must give appropriate credit, provide a link to the license, and indicate if changes were made. You may do so in any reasonable manner, but not in any way that suggests the licensor endorses you or your use.

  • NoDerivatives — If you remix, transform, or build upon the material, you may not distribute the modified material.

Here is the cost data (units in [USD per base]):

(A dash indicates no data for that year.)

Year    DNA Sequencing    Short Oligo (Column)    Gene Synthesis
1990    25                -                       -
1991    -                 -                       -
1992    1                 -                       -
1993    -                 -                       -
1994    -                 -                       -
1995    1                 0.75                    -
1996    -                 -                       -
1997    -                 -                       -
1998    -                 -                       -
1999    -                 -                       25
2000    0.25              0.3                     -
2001    -                 -                       12
2002    -                 -                       8
2003    0.05              0.15                    4
2004    0.025             -                       -
2005    -                 -                       -
2006    0.00075           0.1                     1
2007    -                 -                       0.5
2008    -                 -                       -
2009    8E-06             0.08                    0.39
2010    3.17E-06          0.07                    0.35
2011    2.3E-06           0.07                    0.29
2012    1.6E-06           0.06                    0.2
2013    1.6E-06           0.06                    0.18
2014    1.6E-06           0.06                    0.15
2015    1.6E-09           -                       -
2016    1.6E-09           0.05                    0.03
2017    1.6E-09           0.05                    0.02
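
If it is useful, here is a minimal sketch for plotting these prices on a log scale; it assumes the table above has been saved as a CSV file named dna_costs.csv with the same column headers (the filename is just a placeholder):

```python
# Minimal sketch: plot price per base on a log scale.
# Assumes the table above was saved as "dna_costs.csv" with the columns:
# Year, DNA Sequencing, Short Oligo (Column), Gene Synthesis

import pandas as pd
import matplotlib.pyplot as plt

df = pd.read_csv("dna_costs.csv", na_values="-")    # treat dashes as missing data

fig, ax = plt.subplots()
for column in ["DNA Sequencing", "Short Oligo (Column)", "Gene Synthesis"]:
    series = df[["Year", column]].dropna()          # skip years with no data
    ax.plot(series["Year"], series[column], marker="o", label=column)

ax.set_yscale("log")
ax.set_xlabel("Year")
ax.set_ylabel("Price (USD per base)")
ax.legend()
plt.show()
```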

Here is the productivity data (units in [bases per person per day] and [number of transistors per chip]) — note that commercially available synthesis instruments were not sold new for the decade following 2011, and I have not sat down to figure out the productivity of any of the new boxes that may be for sale as of today:

year    Reading DNA     Writing DNA     Transistors
1971    -               -               2250
1972    -               -               2500
1974    -               -               5000
1978    -               -               29000
1982    -               -               1.20E+05
1985    -               -               2.75E+05
1986    25600           -               -
1988    -               -               1.18E+06
1990    -               200             -
1993    -               -               3.10E+06
1994    62400           -               -
1996    -               -               -
1997    4.22E+05        15320           -
1998    -               -               7.50E+06
1999    576000          -               2.40E+07
2000    -               1.38E+05        4.20E+07
2001    -               -               -
2002    -               -               -
2003    -               -               2.20E+08
2004    -               -               5.92E+08
2005    -               -               -
2006    10000000        -               -
2007    200000000       2500000         -
2008    -               -               2000000000
2009    6000000000      -               -
2010    17000000000     -               -
2011    -               -               2600000000
2012    54000000000     -               -
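
For anyone who wants to put numbers on the comparison to Moore's Law, here is a minimal sketch that fits an exponential to each series and reports a doubling time; it assumes the table above has been saved as dna_productivity.csv (the filename and the simple least-squares fit are my choices, not part of the original data):

```python
# Minimal sketch: estimate a doubling time for each productivity series by
# fitting log2(value) against year with a least-squares line.
# Assumes the table above was saved as "dna_productivity.csv" with the columns:
# year, Reading DNA, Writing DNA, Transistors

import numpy as np
import pandas as pd

df = pd.read_csv("dna_productivity.csv", na_values="-")   # treat dashes as missing data

for column in ["Reading DNA", "Writing DNA", "Transistors"]:
    series = df[["year", column]].dropna()                 # skip empty years
    slope, _ = np.polyfit(series["year"], np.log2(series[column]), 1)
    print(f"{column}: doubling time ~{1 / slope:.1f} years")
```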

A memorial to Mark Buller, PhD, and our response to the propaganda film "Demon in the Freezer".

Earlier this year my friend and colleague Mark Buller passed away. Mark was a noted virologist and a professor at Saint Louis University. He was struck by a car while riding his bicycle home from the lab, and died from his injuries. Here is Mark's obituary as published by the university.

In 2014 and 2015, Mark and I served as advisors to a WHO scientific working group on synthetic biology and the variola virus (the causative agent of smallpox). In 2016, we wrote the following, previously unpublished, response to an "Op-Doc" that appeared in the New York Times. In a forthcoming post I will have more to say about both my experience with the WHO and my thoughts on the recent publication of a synthetic horsepox genome. For now, here is the last version (circa May, 2016) of the response Mark and I wrote to the Op-Doc, published here as my own memorial to Professor Buller.


Variola virus is still needed for the development of smallpox medical countermeasures

On May 17, 2016 Errol Morris presented a short movie entitled “Demon in the Freezer” [note: quite different from the book of the same name by Richard Preston] in the Op-Docs section of the on-line New York Times. The piece purported to present both sides of the long-standing argument over what to do with the remaining laboratory stocks of variola virus, the causative agent of smallpox, which no longer circulates in the human population.

Since 1999, the World Health Organization has on numerous occasions postponed the final destruction of the two variola virus research stocks in Russia and the US in order to support public health related research, including the development of smallpox molecular diagnostics, antivirals, and vaccines.  

“Demon in the Freezer” clearly advocates for destroying the virus. The Op-Doc impugns the motivation of scientists carrying out smallpox research by asking: “If given a free hand, what might they unleash?” The narrative even suggests that some in the US government would like to pursue a nefarious policy goal of “mutually assured destruction with germs”. This portion of the movie is interlaced with irrelevant, hyperbolic images of mushroom clouds. The reality is that in 1969 the US unilaterally renounced the production, storage, or use of biological weapons for any reason whatsoever, including in response to a biological attack from another country. The same cannot be said for ISIS and Al-Qaeda. In 1975 the US ratified the 1925 Geneva Protocol banning chemical and biological agents in warfare and became party to the Biological Weapons Convention that emphatically prohibits the use of biological weapons in warfare.

“Demon in the Freezer” is constructed with undeniable flair, but in the end it is a benighted 21st century video incarnation of a middling 1930s political propaganda mural. It was painted with only black and white pigments, rather than a meaningful palette of colors, and using a brush so broad that it blurred any useful detail. Ultimately, and to its discredit, the piece sought to create fear and outrage based on unsubstantiated accusations.

Maintaining live smallpox virus is necessary for ongoing development and improvement of medical countermeasures. The first-generation US smallpox vaccine was produced in domesticated animals, while the second-generation smallpox vaccine was manufactured in sterile bioreactors; both have the potential to cause serious side effects in 10-20% of the population. The third generation smallpox vaccine has an improved safety profile, and causes minimal side effects. Fourth generation vaccine candidates, based on newer, lower cost, technology, will be even safer and some are in preclinical testing. There remains a need to develop rapid field diagnostics and an additional antiviral therapy for smallpox.

Continued vigilance is necessary because it is widely assumed that numerous undeclared stocks of variola virus exist around the world in clandestine laboratories. Moreover, unsecured variola virus stocks are encountered occasionally in strain collections left behind by long-retired researchers, as demonstrated in 2014 with the discovery of 1950s vintage variola virus in a cold room at the NIH. The certain existence of unofficial stocks makes destroying the official stocks an exercise in declaring “victory” merely for political purposes rather than a substantive step towards increasing security. Unfortunately, the threat does not end with undeclared or forgotten samples.

In 2015 a WHO Scientific Working Group on Synthetic Biology and Variola Virus and Smallpox determined that a “skilled laboratory technician or undergraduate student with experience of working with viruses” would be able to generate variola virus from the widely available genomic sequence in “as little as three months”. Importantly, this Working Group concluded that “there will always be the potential to recreate variola virus and therefore the risk of smallpox happening again can never be eradicated.” Thus, the goal of a variola virus-free future, however laudable, is unattainable. This is sobering guidance on a topic that requires sober consideration.

We welcome increased discussions of the risk of infectious disease and of public health preparedness. In the US these topics have too long languished among second (or third) tier national security conversations. The 2014 West Africa Ebola outbreak and the current Congressional debate over funding to counter the Zika virus exemplify the business-as-usual political approach of throwing half a bucket of water on the nearest burning bush while the surrounding countryside goes up in flames. Lethal infectious diseases are serious public health and global security issues and they deserve serious attention.

The variola virus has killed more humans numerically than any other single cause in history. This pathogen was produced by nature, and it would be the height of arrogance, and very foolish indeed, to assume nothing like it will ever again emerge from the bush to threaten human life and human civilization. Maintenance of variola virus stocks is needed for continued improvement of molecular diagnostics, antivirals, and vaccines. Under no circumstances should we unilaterally cripple those efforts in the face of the most deadly infectious disease ever to plague humans. This is an easy mistake to avoid.

Mark Buller, PhD, was a Professor of Molecular Microbiology & Immunology at Saint Louis University School of Medicine, who passed away on February 24, 2017. Rob Carlson, PhD, is a Principal at the engineering and strategy firm Biodesic and a Managing Director of Bioeconomy Capital.

The authors served as scientific and technical advisors to the 2015 WHO Scientific Working Group on Synthetic Biology and Variola Virus.

Guesstimating the Size of the Global Array Synthesis Market

(Updated, Aug 31, for clarity.)

After chats with a variety of interested parties over the last couple of months, I decided it would be useful to try to sort out how much DNA is synthesized annually on arrays, in part to get a better handle on what sort of capacity it represents for DNA data storage. The publicly available numbers, as usual, are terrible, which is why the title of the post contains the word "guesstimating". Here goes.

First, why is this important? As the DNA synthesis industry grows, and the number of applications expands, new markets are emerging that use that DNA in different ways. Not all that DNA is produced using the same method, and the different methods are characterized by different costs, error rates, lengths, throughput, etc. (The Wikipedia entry on Oligonucleotide Synthesis is actually fairly reasonable, if you want to read more. See also Kosuri and Church, "Large-scale de novo DNA synthesis: technologies and applications".) If we are going to understand the state of the technology, and the economy built on that technology, then we need to be careful about measuring what the technology can do and how much it costs. Once we pin down what the world looks like today, we can start trying to make sensible projections, or even predictions, about the future.

While there is just one basic chemistry used to synthesize oligonucleotides, there are two physical formats that give you two very different products. Oligos synthesized on individual columns, which might be packed into 384-well (or larger) plates, can be manipulated as individual sequences. You can use those individual sequences for any number of purposes, and if you want just one sequence at a time (for PCR or hybridization probes, gene therapy, etc), this is probably how you make it. You can build genes from column oligos by combining them pairwise, or in larger numbers, until you get the size construct you want (typically of order a thousand bases, or a kilobase [kb], at which point you start manipulating the kb fragments). I am not going to dwell on gene assembly and error correction strategies here; you can Google that.

The other physical format is array synthesis, in which synthesis takes place on a solid surface consisting of up to a million different addressable features, where light or charge is used to control which sequence is grown on which feature. Typically, all the oligos are removed from the array at once, which results in a mixed pool. You might insert this pool into a longer backbone sequence to construct a library of different genes that code for slightly different protein sequences, in order to screen those proteins for the characteristics you want. Or, if you are ambitious, you might use the entire pool of array oligos to directly assemble larger constructs such as genes. Again, see Google, Codon Devices, Gen9, Twist, etc. More relevant to my purpose here, a pool of array-synthesized oligos can be used as an extremely dense information storage medium. To get a sense of when that might be a viable commercial product, we need to have an idea of the throughput of the industry, and how far away from practical implementation we might be. 

Next, to recap, last year I made a stab at estimating the size of the gene synthesis market. Much of the industry revenue data came from a Frost & Sullivan report, commissioned by Genscript for its IPO prospectus. The report put the 2014 market for synthetic genes at only $137 million, from which I concluded that the total number of bases shipped as genes that year was 4.8 billion, or a bit less than a duplex human genome. Based on my conversations with people in the industry, I conclude that most of those genes were assembled from oligos synthesized on columns, with a modest, but growing, fraction from array oligos. (See "On DNA and Transistors", and preceding posts, for commentary on the gene synthesis industry and its future.)

The Frost & Sullivan report also claims that the 2014 market for single-stranded oligonucleotides was $241 million. The Genscript IPO prospectus does not specify whether this $241 million was from both array- and column-synthesized oligos, or not. But because Genscript only makes and uses column synthesis, I suspect it referred only to that synthesis format.  At ~$0.01 per base (give or take), this gives you about 24 billion bases synthesized on columns sold in 2014. You might wind up paying as much as $0.05 to $0.10 per base, depending on your specifications, which if prevalent would pull down the total global production volume. But I will stick with $0.01 per base for now. If you add the total number of bases sold as genes and the bases sold as oligos, you get to just shy of 30 billion bases (leaving aside for the moment the fact that an unknown fraction of the genes came from oligos synthesized on arrays).
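
Spelled out, the arithmetic in the two preceding paragraphs looks like this (the ~$0.01 per base price is the working assumption stated above):

```python
# Back-of-the-envelope for 2014, per the Frost & Sullivan figures quoted above.

gene_revenue = 137e6           # $137M market for synthetic genes (2014)
gene_bases = 4.8e9             # bases shipped as genes, as estimated previously
oligo_revenue = 241e6          # $241M market for single-stranded oligos (2014)
oligo_price_per_base = 0.01    # ~$0.01 per base, give or take

oligo_bases = oligo_revenue / oligo_price_per_base     # ~2.4e10 bases on columns
total_bases = gene_bases + oligo_bases                 # just shy of 3e10 bases

print(f"Column oligo bases, 2014: {oligo_bases:.1e}")
print(f"Genes plus oligos, 2014:  {total_bases:.1e}")
```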

So, now, what about array synthesis? If you search the interwebs for information on the market for array synthesis, you get a mess of consulting and marketing research reports that cost between a few hundred and many thousands of dollars. I find this to be an unhelpful corpus of data and analysis, even when I have the report in hand, because most of the reports are terrible at describing sources and methods. However, as there is no other source of data, I will use a rough average of the market sizes from the abstracts of those reports to get started. Many of the reports claim that in 2016 the global market for oligo synthesis was ~$1.3 billion, and that this market will grow to $2.X billion by 2020 or so. Of the $1.3B 2016 revenues, the abstracts assert that approximately half was split evenly between "equipment and reagents". I will note here that this should already make the reader skeptical of the analyses, because who is selling ~$260M worth of synthesis "equipment"? And who is buying it? Seems fishy. But I can see ~$260M in reagents, in the form of various columns, reagents, and purification kit. This trade, after all, is what keeps outfits like Glen Research and TriLink in business.

Forging ahead through swampy, uncertain data, that leaves us with ~$650M in raw oligos. Should we say this is inclusive or exclusive of the $241M figure from Frost & Sullivan? I am going to split the difference and call it $500M, since we are already well into hand waving territory by now, anyway. How many bases does this $500M buy?

Array oligos are a lot cheaper than column oligos. Kosuri and Church write that "oligos produced from microarrays are 2–4 orders of magnitude cheaper than column-based oligos, with costs ranging from $0.00001–0.001 per nucleotide, depending on length, scale and platform." Here we stumble a bit, because cost is not the same thing as price. As a consumer, or as someone interested in understanding how actually acquiring a product affects project development, I care about price. Without knowing a lot more about how this cost range is related to price, and the distribution of prices paid to acquire array oligos, it is hard to know what to do with the "cost" range. The simple average cost would be $0.001 per base, but I also happen to know that you can get oligos en masse for less than that. But I do not know what the true average price is. For the sake of expediency, I will call it $0.0001 per base for this exercise.

Combining the revenue estimate and the price gives us about 5E12 bases per year. From there, assuming roughly 100-mer oligos, you get to 5E10 different sequences. And adding in the number of features per array (between 100,000 and 1M), you get as many as 500,000 arrays run per year, or about 1370 per day. (It is not obvious that you should think of this as 1370 instruments running globally, and after seeing the Agilent oligo synthesis operation a few years ago, I suggest that you not do that.) If the true average price is closer to $0.00001 per base, then you can bump up the preceding numbers by an order of magnitude. But, to be conservative, I won't do that here. Also note that the ~30 billion bases synthesized on columns annually are not even a rounding error on the 5E12 synthesized on arrays.
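
Here is the same guesstimate written out as a few lines of arithmetic, with the assumptions labeled (the $500M revenue figure, the $0.0001 per base price, the 100-mer length, and the feature counts are the guesses made above):

```python
# Guesstimate of global array-synthesis throughput, per the assumptions above.

array_oligo_revenue = 500e6       # ~$500M/year, the split-the-difference guess
price_per_base = 1e-4             # assumed average price, USD per base
oligo_length = 100                # assume ~100-mer oligos
features_per_array = (1e5, 1e6)   # distinct sequences per array

bases_per_year = array_oligo_revenue / price_per_base       # ~5e12 bases/year
sequences_per_year = bases_per_year / oligo_length          # ~5e10 sequences/year
arrays_high = sequences_per_year / min(features_per_array)  # ~500,000 arrays/year
arrays_low = sequences_per_year / max(features_per_array)   # ~50,000 arrays/year

print(f"Bases per year:     {bases_per_year:.0e}")
print(f"Sequences per year: {sequences_per_year:.0e}")
print(f"Arrays per year:    {arrays_low:,.0f} to {arrays_high:,.0f}")
print(f"Arrays per day:     {arrays_low / 365:,.0f} to {arrays_high / 365:,.0f}")
```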

Aside: None of these calculations delve into the mass (or the number of copies) per synthesized sequence. In principle, of course, you only need one perfect copy of each sequence, whether synthesized on columns or arrays, to use DNA in any just about application (except where you need to drive the equilibrium or reaction kinetics). Column synthesis gives you many more copies (i.e., more mass per sequence) than array synthesis. In principle — ignoring the efficiency of the chemical reactions — you could dial down the feature size on arrays until you were synthesizing just one copy per sequence. But then it would become exceedingly important to keep track of that one copy through successive fluidic operations, which sounds like a quite difficult prospect. So whatever the final form factor, an instrument needs to produce sufficient copies per sequence to be useful, but not so many that resources are wasted on unnecessary redundancy/degeneracy.

Just for shits and giggles, and because array synthesis could be important for assembling the hypothetical synthetic human genome, this all works out to be enough DNA to assemble 833 human duplex genomes per year, or about 3 per day, in the absence of any other competing uses, of which there are obviously many. Also if you don't screw up and waste some of the DNA, which is inevitable. Finally, at a density of ~1 bit/base, this is enough to annually store about 5 terabits of data, or roughly 600 GB — on the order of a decent laptop hard drive.
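
The genome and storage arithmetic, for the record (the ~6 billion base duplex genome and the ~1 bit per base density are the rough assumptions used above):

```python
# What ~5e12 bases per year would mean for two hypothetical uses.

bases_per_year = 5e12      # global array-synthesis guesstimate from above
duplex_genome = 6e9        # ~2 x 3 Gb, bases per human duplex genome (rough)
bits_per_base = 1          # rough storage density assumed above

genomes_per_year = bases_per_year / duplex_genome            # ~830
terabytes_per_year = bases_per_year * bits_per_base / 8e12   # ~0.6 TB

print(f"Duplex human genomes per year: {genomes_per_year:.0f}")
print(f"Per day:                       {genomes_per_year / 365:.1f}")
print(f"Data stored per year:          {terabytes_per_year:.2f} TB")
```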

And so, if you have access to the entire global supply of single stranded oligonucleotides, and you have an encoding/decoding and sequencing strategy that can handle significant variations in length and high error rates at scale, you can store enough HD movies and TV to capture most of the new, good stuff that HollyBollyWood churns out every year. Unless, of course, you also need to accommodate the tastes and habits of a tween daughter, in which case your storage budget is blown for now and evermore no matter how much capacity you have at hand. Not to mention your wallet. Hey, put down the screen and practice the clarinet already. Or clean up your room! Or go to the dojo! Yeesh! Kids these days! So many exclamations!

Where was I?

Now that we have some rough numbers in hand, we can try to say something about the future. Based on my experience working on the Microsoft/UW DNA data storage project, I have become convinced that this technology is coming, and it will be based on massive increases in the supply of synthetic DNA. To compete with an existing tape drive (see the last few 'graphs of this post), able to read and write ~2 Gbits a second, a putative DNA drive would need to be able to read and write ~2 Gbases per second, or ~183 Tbits/day, or the equivalent of ~10,000 human genomes a day — per instrument/device. Based on the guesstimate above, which gave a global throughput of just 3 human genomes per day, we are waaaay below that goal.
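
The size of that gap is worth spelling out; a rough sketch using the figures above (the ~2 Gbases per second target and the ~5E12 bases per year of global array supply from the earlier guesstimate):

```python
# Throughput gap between one hypothetical DNA "tape drive" and today's
# estimated global array-synthesis supply.

drive_rate = 2e9                 # ~2 Gbases/s needed to match a ~2 Gbit/s tape drive
seconds_per_day = 86400
global_supply_per_year = 5e12    # bases/year, from the guesstimate above

needed_per_day = drive_rate * seconds_per_day         # ~1.7e14 bases/day, per drive
available_per_day = global_supply_per_year / 365      # ~1.4e10 bases/day, worldwide

print(f"One drive needs:      ~{needed_per_day:.1e} bases/day")
print(f"Global array output:  ~{available_per_day:.1e} bases/day")
print(f"Shortfall:            ~{needed_per_day / available_per_day:,.0f}x")
```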

To be sure, there is probably some demand for a DNA storage technology that can work at lower throughputs: long term cold storage, government archives, film archives, etc. I suspect, however, that the many advantages of DNA data storage will attract an increasing share of the broader archival market once the basic technology is demonstrated on the market. I also suspect that developing the necessary instrumentation will require moving away from the existing chemistry to something new and different, perhaps enzymatically controlled synthesis, perhaps even with the aid of the still hypothetical DNA "synthase", which I first wrote about 17 years ago.

In any event, based on the limited numbers available today, it seems likely that the current oligo array industry has a long way to go before it can supply meaningful amounts of DNA for storage. It will be interesting to see how this all evolves.

A Few Thoughts and References Re Conservation and Synthetic Biology

Yesterday at Synthetic Biology 7.0 in Singapore, we had a good discussion about the intersection of conservation, biodiversity, and synthetic biology. I said I would post a few papers relevant to the discussion, which are below.

These papers are variously: the framing document for the original meeting at the University of Cambridge in 2013 (see also "Harry Potter and the Future of Nature"), sponsored by the Wildlife Conservation Society; follow-on discussions from meetings in San Francisco and Bellagio; and my own efforts to try to figure out how to quantify the economic impact of biotechnology (which is not small, especially when compared to much older industries) and the economic damage from invasive species and biodiversity loss (which is also not small, measured as either dollars or jobs lost). The final paper in this list is my first effort to link conservation and biodiversity with economic and physical security, which requires shifting our thinking from the national security of nation states and their political boundaries to the natural security of the systems and resources that those nation states rely on for continued existence.

"Is It Time for Synthetic Biodiversity Conservation?", Antoinette J. Piaggio1, Gernot Segelbacher, Philip J. Seddon, Luke Alphey, Elizabeth L. Bennett, Robert H. Carlson, Robert M. Friedman, Dona Kanavy, Ryan Phelan, Kent H. Redford, Marina Rosales, Lydia Slobodian, Keith WheelerTrends in Ecology & Evolution, Volume 32, Issue 2, February 2017, Pages 97–107

Robert Carlson, "Estimating the biotech sector's contribution to the US economy", Nature Biotechnology, 34, 247–255 (2016), 10 March 2016

Kent H. Redford, William Adams, Rob Carlson, Bertina Ceccarelli, “Synthetic biology and the conservation of biodiversity”, Oryx, 48(3), 330–336, 2014.

"How will synthetic biology and conservation shape the future of nature?", Kent H. Redford, William Adams, Georgina Mace, Rob Carlson, Steve Sanderson, Framing Paper for International Meeting, Wildlife Conservation Society, April 2013.

"From national security to natural security", Robert Carlson, Bulletin of the Atomic Scientists, 11 Dec 2013.

Warning: Construction Ahead

I am migrating from Movable Type to Squarespace. There was no easy way to do this. Undoubtedly, there are presently all sorts of formatting hiccups, lost media and images, and broken links. If you are looking for something in particular, use the Archive or Search tabs.

If you have a specific link you are trying to follow, and it has dashes between words, try replacing them with underscores. E.g., instead of "www.synthesis.cc/x-y-z", try "www.synthesis.cc/x_y_z". If the URL ends in "/x.html", try replacing that with "/x/".

I will be repairing links, etc., as I find them.