Team:Lethbridge/human practices
From 2013.igem.org
Human Practices
Biosecurity and DNA Synthesis
Current Sequence Screening Methods
In the last 5 years, there has been increased recognition of the powers of gene synthesis. It is now easy and affordable to look up genetic sequences for nearly any organism, design an expression construct, and order that gene from a synthesis company. This allows for the creative projects we see each year at the iGEM jamborees, but it also allows those with malevolent intentions and adequate knowledge to easily order genes that may pose a hazard to others.
The recognition of this potential has led members of governments and large synthesis companies to try to establish a framework for screening these synthesis orders to ensure that potentially hazardous sequences stay in the hands of those who would use them for legitimate research purposes. This effort to regulate the gene synthesis industry has largely come from within. In the late 2000s, both Europe's International Association of Synthetic Biology (IASB) and North America's International Gene Synthesis Consortium (IGSC) put forth reports on the state of synthesis order screening as well as a set of best practices to follow [1-2]. These bodies are made up of individuals from the major gene synthesis companies in each region as well as experts from major universities.
Both groups outline a very similar approach to screening these orders for legitimacy. This entails a two-part approach that first compares the ordered sequence to sequences on a list of known biohazardous agents and then verifies the legitimacy of the customer and their intended use of the final product. In both reports the sequence screening draws on existing pathogen databases, such as the US Select Agents and Toxins List or the Australia Group List, as well as internal pathogen databases, and uses BLAST to compare the submitted sequence against these regulated ones. This first step in screening is conducted automatically. If the similarity between the submitted sequence and one of the sequences on these lists exceeds the specified threshold, human investigation is used to further characterize the sequence [2].
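The two-part approach described above can be sketched as a small triage function. This is only an illustrative sketch, not the IASB or IGSC implementation: the `Order` fields, the flag names, and the 80% identity threshold are hypothetical placeholders (the reports refer only to a "specified threshold"), and the best-hit identity is assumed to come from a separate BLAST search against a regulated pathogen database.

```python
from dataclasses import dataclass


@dataclass
class Order:
    # Hypothetical fields, named for illustration only.
    best_hit_identity: float   # best % identity from a BLAST search (computed elsewhere)
    customer_verified: bool    # name, address and affiliation independently verified
    on_list_of_concern: bool   # customer matched a list of individuals of concern


def review_reasons(order: Order, identity_threshold: float = 80.0) -> list[str]:
    """Return the reasons an order would be passed to a human reviewer.

    An empty list means neither automatic check raised a flag.
    The threshold value is an assumption, not a published figure.
    """
    reasons = []
    # Part 1: automatic sequence screening against regulated sequences.
    if order.best_hit_identity >= identity_threshold:
        reasons.append("sequence_hit")
    # Part 2: customer screening.
    if not order.customer_verified:
        reasons.append("customer_unverified")
    if order.on_list_of_concern:
        reasons.append("customer_on_list_of_concern")
    return reasons


print(review_reasons(Order(95.0, True, False)))
print(review_reasons(Order(10.0, True, False)))
```

Returning reasons rather than a final accept or reject decision reflects the text above: the automatic step only decides whether human investigation is needed.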
Customer screening is arguably the most important aspect of the current gene synthesis security strategies. It is possible that ordering sequences that could be considered hazardous is necessary for research applications and adequate customer screening could determine if this sequence was going to someone at a research facility for legitimate use. European and North American groups recommend collection of the name, mailing address, and institutional affiliation of the customer to ensure that they are individuals working in verifiable positions within companies or academic institutions [1-2]. This information is then independently verified and checked against a number of national and international lists of individuals of concern, such as the US Specially Designated Nationals list.
While these protocols are put forth by consortium members in both Europe and North America, and a set of guidelines has been published by the US Department of Health and Human Services, all of these measures are voluntary [3]. There are no penalties for synthesis companies that do not screen the sequences or customers they deal with, outside of restrictions on international shipment of dual-use goods. This lack of legal regulation has the potential to allow dangerous sequences into the hands of malevolent individuals if any company decides to loosen its security criteria in order to save time or money in processing an order.
Potential Weaknesses of Current Screening Procedures
Although companies included in the IASB and IGSC adhere to the regulations of the Code of Conduct for Best Practices in Gene Synthesis or the Harmonized Screening Protocol, respectively [1-2], these protocols have a few potential weaknesses. Both of these protocols require that all synthesis orders are at minimum screened against a regulated pathogen database. However, these lists are by no means complete and there is a chance that potentially hazardous sequences can be ordered and synthesized without any efforts made to investigate the source of the order. This is currently one of the major weaknesses of screening protocols, and efforts are being made to compile a list of data from organisms on the Select Agents list, the Australia Group List, and other national lists of regulated pathogens. Once complete, this list will provide a more comprehensive database of potential pathogenic and toxic organism sequences as a step toward higher biosecurity.
The following are a few other weaknesses associated with current screening protocols. First, the IASB requires its member companies to screen orders of at least 200 base pairs in length, but larger sequences could be ordered as a series of short oligonucleotide sequences, from one company or multiple companies, and so bypass the screening process entirely. Though it can be more difficult to get direct database hits for shorter sequences, including these types of orders in the screening procedure is still feasible and may only require extra processing time for human investigation of these database matches. Second, though a legitimate customer can be approved for ordering hazardous sequences, the synthesis company cannot be sure of the final end user. There is no way to ensure that the customer does not ship the product to a third-party user that has not been investigated. Finally, and perhaps the most concerning weakness of current screening protocols, is the accountability of DNA synthesis companies. While most of the larger synthesis companies are members of the IASB or IGSC, complying with the standards mandated by these groups is still only a voluntary practice. There are no regulations in place that require a synthesis company to screen its orders for hazardous sequences or to follow up with customer investigations of suspicious orders [4]. Even for orders that do not give a direct match to a hazardous sequence, any additional steps to associate function with the sequence are at the discretion of the company. Minshull and Wagner (representing DNA2.0 and GENEART) suggest that synthesis companies should be subject to routine “tests” of their screening protocols by their respective government bodies to ensure that they are complying with screening protocols and using the most up-to-date screening databases [5].
How elements of our project were used to examine synthesis screening procedures
Our project involves the characterization of pseudoknot RNA secondary structural motifs. These motifs can be used to express dual-coding gene sequences to give protein products whose expression can be regulated by the pseudoknot’s ability to induce ribosomal frameshifting. This method of coding can allow for the expression of a protein which may be encoded by fragments in alternating reading frames. This technology adds another level of complexity in terms of screening for controlled sequences, in that the protein produced from a synthesized construct may not be the product of translating a gene in one continuous reading frame.
It was our goal to investigate the ability of DNA synthesis companies to identify hazardous sequences in their screening procedures in the presence of frameshifting elements. We designed a series of hazardous sequences containing intervening pseudoknots, and two of the leading synthesis companies in North America tested them using their standard screening procedures. These constructs contained all the necessary components to form a dangerous protein product, with DNA segments allocated into different reading frames and successively frameshifted using pseudoknots. The results from this screening test indicate that the current screening methods are successful at identifying hazardous sequences that had been “hidden” in multiple reading frames. The companies expressed their support of our efforts to investigate loopholes and problems in current screening procedures with regard to this new type of technology.
Possible Methods for Bypassing Screening
Codon redundancy
Codon redundancy in the genetic code refers to having multiple codons that code for a single amino acid. This redundancy allows for the DNA sequence of a protein to be changed without altering the resulting amino acid sequence. By utilizing codon redundancy, bioterrorists could drastically change the known DNA sequence of a harmful virus or protein. Fortunately, synthesis companies scan both the DNA sequence and the translated protein sequence of each order submitted for synthesis, and in this way would still be able to identify a harmful sequence that had been changed using codon redundancy. However, this method in conjunction with others, such as frameshifting elements or those listed below, could potentially be used to bypass the DNA and amino acid sequence screening performed by synthesis companies.
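The reason protein-level screening is robust to codon redundancy can be illustrated with a toy example. The sketch below uses the standard genetic code and two short, made-up DNA sequences encoding a harmless five-residue peptide; the sequences and helper names are assumptions chosen for illustration and do not come from our constructs.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str) -> str:
    """Translate a DNA string in frame 1, ignoring any trailing partial codon."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


def identity(a: str, b: str) -> float:
    """Fraction of matching positions between two equal-length strings."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


original = "ATGGCCAAACTGTCT"   # hypothetical toy sequence
recoded = "ATGGCGAAGTTAAGC"    # same peptide, synonymous codons

print(identity(original, recoded))                        # DNA-level identity drops
print(translate(original) == translate(recoded))          # protein is unchanged
```

Because both sequences translate to the same peptide, a comparison at the amino acid level still finds a perfect match even though the DNA-level identity is much lower.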
Utilizing conserved and non-conserved regions of proteins
Homologous proteins are those that are derived from the same ancestor; however, the two proteins do not have to share 100% amino acid identity. Multiple sequence alignments of amino acid sequences of homologous proteins from different organisms can be used to identify functionally important residues in a protein by indicating which residues are absolutely conserved, semi-conserved, and non-conserved. This would allow an individual to alter a controlled protein sequence by changing all or some of the conserved and semi-conserved residues to residues with similar physicochemical properties. In addition, all or some of the non-conserved residues could be substituted with essentially any other amino acid without risking loss of the protein’s function. This method, in combination with utilizing codon redundancy, would allow for more drastic alterations to be made to both the DNA and protein sequence from a pathogenic organism that could bypass screening procedures.
Using “custom” tRNAs
A more complicated means of bypassing screening procedures by decoupling protein sequence from function would be to use a highly engineered system with non-canonical tRNAs. An organism could be designed that uses engineered aminoacyl-tRNA synthetases that recognize non-cognate tRNAs and therefore aminoacylate the tRNA with the incorrect amino acid. By using this alternative genetic code in the engineered organism, the DNA sequence from a pathogenic organism could be altered almost beyond recognition while still producing the protein of interest.
Do-it-yourself synthesis
As time progresses, DNA synthesizers are becoming more affordable to research labs and independent users. Initially this may seem like a good thing, but there are tremendous dangers associated with this development. Owning a DNA synthesizer removes the need for the services of synthesis companies, bypassing their screening procedures directly and giving the owner unrestricted access to synthesize whatever sequence they choose. This would make any techniques to bypass the screening methods of synthesis companies obsolete. As a result, regulations may need to be put in place to limit or restrict access to DNA synthesizers. This could be done, for example, by requiring the owner to upload any sequences they synthesize to a governing body that will scan them for harmful sequences, or by installing software that will screen sequences prior to allowing them to be synthesized. A combination of these two methods as well as additional advances in screening procedures is crucial to ensure the safety of the general public.
Changes recommended for screening protocols
Though commendable biosecurity efforts have been put forward by major international synthesis companies, these groups are aware that standard protocols may not be enough to mitigate the risk of the synthesis and delivery of hazardous sequences. In the IASB Code of Conduct for Best Practices in Gene Synthesis, all member companies are mandated to take part in ongoing efforts to refine and improve the current screening technologies by establishing a review committee to update and expand the Code of Conduct as new or changing threats emerge, maintaining open communication with member companies through the exchange of research and literature searches, and regularly collaborating on best practices and new screening ideas [2]. While these practices are important for synthesis companies to implement, DNA synthesis is becoming less expensive and more accessible to non-professionals. According to Minshull and Wagner, “[a]nyone who is sufficiently motivated could synthesize the gene for a toxin or even an entire viral genome using readily available reagents and without ever going near a specialized synthesizer” [5]. With molecular biology equipment becoming available through avenues such as eBay and other online dealers, individuals with limited molecular biology experience could realistically synthesize their own DNA sequences within the next few years [4]. Screening protocols could thereafter become obsolete. Until then, further steps are required to assure the public, government, and research community that biosecurity is being upheld to the highest standards possible. This may involve expanding the use of online forums, such as VIREP (Virulence Factor Information Repository), to allow researchers to deposit and access information about genes and organisms. Additionally, government regulations may need to be implemented that require all synthesis companies to adhere to standard practices and implement human investigation of suspicious orders [6]. This may best be achieved through the integration of both the IASB and IGSC protocols into an industry-wide Code of Conduct.
References
[1] International Gene Synthesis Consortium. Harmonized screening protocol: gene sequence & customer screening to promote biosecurity. http://www.genesynthesisconsortium.org/wp-content/uploads/2012/02/IGSC-Harmonized-Screening-Protocol1.pdf (2009).
[2] International Association Synthetic Biology. Code of conduct for best practices in gene synthesis. http://www.ia-sb.eu/tasks/sites/synthetic-biology/assets/File/pdf/iasb_code_of_conduct_final.pdf (2009).
[3] U.S. Department of Health and Human Services. Screening Framework Guidance for Providers of Synthetic Double-Stranded DNA. http://www.phe.gov/Preparedness/legal/guidance/syndna/Documents/syndna-guidance.pdf (2010).
[4] Maurer S. M., Fischer M., Schwer H., Stähler C., Stähler P., & Bernauer H. S. Working paper: making commercial biology safer: what the gene synthesis industry has learned about screening customers and orders. http://gspp.berkeley.edu/iths/Maurer_IASB_Screening.pdf (2009).
[5] Minshull J. & Wagner R. Nat. Biotechnol. 27, 800-801 (2009).
[6] Fischer M. & Maurer S. M. Nat. Biotechnol. 28, 20-22 (2010).
Learning to Be Bad
This year, we focused on the implications our frameshifting project might have on biosecurity. In thinking about the ways our pseudoknots could be used to do new, exciting things in synthetic biology, we came up with a use that is more frightening than exciting: bioterrorism.
The idea is this: There are guidelines put forward by a number of industry groups on how DNA synthesis orders should be screened to ensure no biohazardous sequences get into the hands of the wrong people. The standard protocol for screening sequences involves taking the submitted DNA sequences and translating all six reading frames, then using BLAST to compare the DNA and amino acid sequences to those of organisms on a list of controlled agents (see paper below for more detailed explanation).
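The six-frame translation step mentioned above can be sketched in a few lines. This is a minimal illustration using the standard genetic code, not the software used by any synthesis company; the function names and the toy input sequence are assumptions, and the subsequent BLAST comparison against a controlled-agent list is not shown.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of an uppercase A/C/G/T string."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(dna: str) -> str:
    """Translate from the first base, ignoring any trailing partial codon."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


def six_frame_translations(dna: str) -> dict[str, str]:
    """Translate all three offsets on both strands ('+1'..'+3', '-1'..'-3')."""
    dna = dna.upper()
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames


# Toy input chosen for illustration only.
for name, protein in six_frame_translations("ATGGCCAAATAA").items():
    print(name, protein)
```

Each of the six amino acid sequences, together with the DNA sequence itself, would then be compared against the controlled-agent sequences.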
Our pseudoknot enables the ribosome to switch frames mid-translation, essentially splitting the entire protein amongst as many reading frames as there are pseudoknots. If someone were to split a protein from the Ebola virus into small fragments distributed across the reading frames, could they bypass this initial automatic screening step?
Putting our White Hats On
To investigate this potential for abuse of our project, we worked together with major North American synthesis companies to see if we could fool their screening methods using our frameshifting elements. We designed sequences with varying codon changes and varying coding fragment sizes between copies of our PK401 pseudoknot, and submitted them to the synthesis companies we had partnered with. There is a full description of the sequences and a link to the raw data files below.
Click here for sequences used
Sequence ID Number   Sequence Origin        Total Length (bp)   Codon Changes (%)   Length between PK (bp)
1                    CFP                    966                 25                  180
2                    SARS-CoV               1869                0                   210
3                    Ricin                  2392                25                  198
4                    SARS-CoV               2450                25                  102
5                    CFP                    966                 16                  210
6                    CFP                    966                 0                   180
7                    SARS-CoV               2450                0                   102
8                    Ricin                  2392                0                   198
9                    Ebola Matrix Protein   1031                0                   0
10                   CFP                    966                 16                  180
11                   SARS-CoV               2450                0                   102
12                   CFP                    966                 0                   210
13                   SARS-CoV               1869                20                  210
14                   Ricin                  3139                25                  99
15                   SARS-CoV               1869                25                  210
16                   CFP                    966                 25                  210
17                   Ricin                  3139                0                   99