Team:Calgary/Project/OurSensor/Detector

From 2013.igem.org

Revision as of 00:13, 26 September 2013 by Azzucoloto

Detector

Transcription Activator-Like Effectors (TALEs) are naturally occurring proteins that bind to DNA. Because TALEs have a highly conserved modular region for DNA binding, they are particularly amenable to genetic engineering and customization, which makes them a powerful tool in synthetic biology.

For proof of concept, we worked with TALEs from the Parts Registry.

For our detection system itself, we have engineered TALE proteins that bind to two different regions of the Shiga toxin II gene, Stx2, found in E. coli and other enterohemorrhagic bacteria.

Background

Transcription activator-like effectors (TALEs) are proteins produced by bacteria of the genus Xanthomonas and secreted into plant cells. These naturally occurring TALEs play a key role in bacterial infection, as they are responsible for upregulation of the host genes required for pathogenic growth and expansion (Mussolino & Cathomen, 2012). Recently, it was reported that another plant pathogen, Ralstonia solanacearum, produces type III effectors with sequences similar to those of TALEs from Xanthomonas spp. These proteins were therefore named Ralstonia injected protein TALEs, or RipTALEs (De Lange et al., 2013).

TALEs share some common features, such as an N-terminal type III secretion signal, which allows the proteins to be translocated from the bacterium into the plant cell. They also carry nuclear localization signals (NLS) and an acidic activation domain (AAD) in the C-terminus. The central region, also termed the repeat region, mediates DNA recognition through tandem repeats of 33 to 35 amino acid residues each (Bogdanove et al., 2010). The binding domain usually comprises 15.5 to 19.5 repeats (Figure 1). The last repeat, close to the C-terminus, is called a “half-repeat” because it is only around 20 amino acids in length. Although the repeats have conserved sequences, polymorphisms are found in residues 12 and 13, the “repeat-variable di-residue” (RVD). Each RVD is specific for a single nucleotide; therefore, 19.5 repeat units target a specific 20-nucleotide sequence in the DNA (Mussolino & Cathomen, 2012).

Figure 1. (A) Schematic representation of a TAL effector with the DNA binding domain in red. (B) 3D structure of TALEs obtained from our team’s work in Autodesk Maya. To learn more about our modeling, click here.

When bound to DNA, the TALE is oriented with its N-terminus to C-terminus running along the DNA 5’ to 3’ direction. Each repeat consists of two alpha helices connected by a three-residue loop, the RVD loop, two residues of which form the RVD. Although both amino acids 12 and 13 contribute to base specificity, the TALE-DNA interaction occurs through intermolecular bonds between residue 13 and the target base in the major groove, while residue 12 stabilizes the RVD loop (Meckler et al., 2013).

Over 20 different RVDs have been identified in TAL effectors. However, four of them appear in 75% of the repeats: HD, NG, NI and NN (Bogdanove et al., 2010). Quantitative analysis of DNA-TALE interactions by Meckler et al. (2013) revealed that binding affinity is affected by the RVDs in the following order: NG > HD ~ NN >> NI > NK. NG, specific to thymine, and HD, specific to cytosine, are strong RVDs. NN binds both guanine and adenine, but it prefers guanine. NK also interacts with guanine, but with 10³-fold lower affinity. NI is specific for adenine, but it has low affinity compared to strong RVDs such as NG and HD (Meckler et al., 2013). Although less common, another naturally occurring RVD, NH, has been reported to bind strongly to guanine (Cong et al., 2013). NS binds to any of the four bases and is present in naturally occurring TALEs such as AvrBs3 from Xanthomonas campestris (Boch et al., 2009).
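The RVD code described above can be written down as a small lookup table. The following Python sketch is only an illustration, not part of our design pipeline: the names RVD_TO_BASES, BASE_TO_RVD, rvds_for_target and rvds_can_bind are hypothetical, and choosing NN for guanine is an assumption made for the example (NH or NK would also be possible according to the text).

```python
# Bases each RVD can pair with, as summarized in the paragraph above.
RVD_TO_BASES = {
    "NG": "T",
    "HD": "C",
    "NI": "A",
    "NN": "GA",    # prefers G but also binds A
    "NK": "G",     # binds G with much lower affinity
    "NH": "G",
    "NS": "ACGT",  # binds any of the four bases
}

# One RVD per base for designing a repeat array.
# Using NN for G is an assumption made for this example.
BASE_TO_RVD = {"T": "NG", "C": "HD", "A": "NI", "G": "NN"}


def rvds_for_target(target):
    """Return one RVD per nucleotide of the target, read 5' to 3'."""
    return [BASE_TO_RVD[base] for base in target.upper()]


def rvds_can_bind(rvds, site):
    """True if every RVD accepts the base at its position in the site."""
    site = site.upper()
    if len(rvds) != len(site):
        return False
    return all(base in RVD_TO_BASES[rvd] for rvd, base in zip(rvds, site))


array = rvds_for_target("TCAG")       # ['NG', 'HD', 'NI', 'NN']
print(array)
print(rvds_can_bind(array, "TCAA"))   # True: NN also accepts A
print(rvds_can_bind(array, "TCAC"))   # False: NN does not accept C
```

The second check shows why an array containing NN is less specific than one built only from NG and HD: a site with A in place of G is still accepted.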

In addition to RVDs, the DNA binding affinity is also subject to polarity effects. Point mutations at the 5’ end of the target sequence affect TALE-DNA recognition more than those at the 3’ end (Meckler et al., 2013). Taking this into consideration, recommendations for TALE design include incorporating strong RVDs close to the N-terminus (Streubel et al., 2012).

Because TAL effectors can be engineered to bind virtually any DNA sequence, they represent a powerful tool in synthetic biology. They have been extensively used in gene modulation by fusing an activator or a repressor to their C-terminus. The Slovenia 2012 iGEM team created repressor TAL effectors by fusing them to KRAB repressor domains, and activator TALEs by fusing them to the VP16 activation domain.

Besides gene regulation, TALEs can be fused with DNA cleavage domains of endonucleases and serve as restriction enzymes (Beurdeley et al., 2013). These engineered proteins are termed TALENs or Transcription Activator-Like Effector Nucleases. TALENs can also be used in gene knockout as they are able to promote gene disruption (Bogdanove & Voytas, 2011).

Our team, however, proposes an innovative application for TAL effectors: detection of enterohemorrhagic E. coli (EHEC) and other enterohemorrhagic bacteria in the feces of super-shedders in cattle populations. As a sensor, one TALE binds to a specific region of the Shiga toxin II gene (Stx2) and captures the DNA of interest from a feces sample, making it available for a second TALE, whose binding domain is specific for another region of Stx2. This second TALE is connected to a reporter, which makes the TALE-DNA interaction visible in a short period of time. To find out more about our EHEC TALEs, click here.

Proof of Concept

The TALEs are a fundamental part of our project this year. To have a functioning system for E. coli detection, we need proteins that successfully recognize and bind our DNA. However, before we can create and use TALEs that bind to E. coli DNA, we need a proof of concept. On the iGEM Parts Registry we came across TALEs that the Slovenian team synthesized for a previous project. TALEs are very large proteins and take a long time to design and synthesize, so we decided to use these TALEs to test our system. We also saw this as an opportunity to use and build upon parts made by former iGEM teams. Thus, we ordered their three TALEs: TALA (BBa_K782004), TALB (BBa_K782006), and TALD (BBa_K782005).

We used TALA (BBa_K782004) and TALB (BBa_K782006) to build our constructs and test our system, as it requires two different TALE proteins. To test our TALEs, we had to synthesize the target sequences that they recognize and bind. We constructed a series of target sequences in order to test the binding affinity. Some sequences matched the TALE binding site exactly, while others carried specific base pair alterations. These mutations will allow us to test the binding affinity of the TALE when it encounters an inexact target sequence. In this way, we can determine how specific these proteins really are and how they might respond in our test to DNA that does not belong to enterohemorrhagic E. coli (EHEC). This will help us define how specific we expect our final system to be.

When we sequenced TALB (BBa_K782006), we discovered that it had a small mutated segment. We expected the sequence AGCAATGGG at the repeat-variable di-residue of the second repeat. However, the sequence was actually TCCCACGAC. This means that the required target nucleotide at this position is a C, and not a T as the Parts Registry web page shows. The two TALEs also had a Kozak sequence at the front of the coding sequence. Instead of changing the TALE, we decided to create a new series of target sequences for the mutated TALB (BBa_K782006) to bind to. We also used PCR to remove the Kozak sequence at the front of each TALE, so that it would not inhibit expression.
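The effect of this mutation on the target base can be checked by translating the two nine-nucleotide segments. The Python sketch below is only an illustration under stated assumptions: it assumes the nine nucleotides are in frame and that the last two codons encode residues 12 and 13 (the RVD). The codon table is the subset of the standard genetic code needed here, and the names CODON_TO_AA, RVD_TO_BASE and rvd_from_codons are hypothetical.

```python
# Subset of the standard genetic code covering the six codons involved.
CODON_TO_AA = {
    "AGC": "S", "TCC": "S",
    "AAT": "N", "CAC": "H",
    "GGG": "G", "GAC": "D",
}

# Target base of the two RVDs involved (see the Background section).
RVD_TO_BASE = {"NG": "T", "HD": "C"}


def rvd_from_codons(dna):
    """Translate an in-frame segment and return its last two residues.

    Assumption: the last two codons encode residues 12 and 13 of the repeat.
    """
    codons = [dna[i:i + 3] for i in range(0, len(dna), 3)]
    residues = "".join(CODON_TO_AA[codon] for codon in codons)
    return residues[-2:]


expected = rvd_from_codons("AGCAATGGG")   # Ser-Asn-Gly, RVD "NG"
observed = rvd_from_codons("TCCCACGAC")   # Ser-His-Asp, RVD "HD"
print(expected, RVD_TO_BASE[expected])    # NG T
print(observed, RVD_TO_BASE[observed])    # HD C
```

Under these assumptions the expected segment encodes NG (thymine) and the sequenced segment encodes HD (cytosine), which agrees with the change from T to C in the target sequence.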

Having created target sequences for both TALA (BBa_K782004) and TALB (BBa_K782006), we assembled a target sequence construct (BBa_K1189006) to be used in our system. Special primers were synthesized and used to create a linear PCR product containing both the A and B target sequences. We then put it into a pSB1C3 backbone by switching out RFP.

With the TALEs ordered from the registry and our correct TALE target sequences, we have been able to demonstrate a proof of concept by binding TALE proteins with DNA. Next, we will use our designed TALEs to bind to the DNA of EHEC E. coli.

Engineered TALEs

Our system had to be designed to detect only EHEC E. coli. We found that Stx2 is the gene common to all EHEC organisms. This gene is responsible for production of Shiga toxin, which is the factor that causes illness in humans. Therefore, we decided to design two TALEs that bind to two regions of the Stx2 gene. Our TALE-capture system detects DNA; therefore, if the TALEs are not designed to bind exclusively to the Stx2 gene, they could produce a false positive by binding to other DNA, such as that of a grass the cow has eaten, the cow’s own DNA, or the DNA of another microorganism living in the cow’s gut. Furthermore, TALEs are not perfect. They can bind to a DNA segment if its sequence is close enough to their target site. This could be especially problematic if a piece of DNA is highly similar to the nucleotides at the 5’ end of the TALE’s target site.

In addition to specificity, we also tried to design TALEs with the highest possible binding affinity. This is important for both TALEs. The immobilized TALE has to hold a relatively large piece of DNA in place and keep holding on to it when the very large mobile complex (a TALE attached to a ferritin nanoparticle or beta-lactamase) also docks onto the DNA. The mobile TALE is fused to a ferritin nanoparticle, which in turn is fused to eleven other TALEs, so the mobile TALE has to keep a very large complex in place when it binds to the DNA. Thus, if our TALEs do not bind strongly to the DNA, the system would produce false negatives.

To increase the binding affinity of the TALEs, we picked pyrimidine-rich regions of the Stx2 gene. The relative binding affinities of the different RVDs for their respective nucleotides are as follows: NG (1) > NN (0.18) ~ HD (0.16) >> NI (0.0016) > NK (0.00016). NG, NN, HD, NI, and NK bind to T, G, C, A, and G, respectively (Meckler et al., 2013). Therefore, picking regions rich in thymine and cytosine would dramatically increase the binding affinity of the TALE for the DNA segment. Although NN also has a relatively high binding affinity, we avoided it as much as possible, because NN can also bind adenine and would therefore decrease specificity.
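One simple way to look for pyrimidine-rich regions is a sliding window over the gene sequence. The Python sketch below is only an illustration of that idea and not the procedure we actually followed: the function name pyrimidine_rich_windows is hypothetical, and the example sequence is made up and is not taken from Stx2.

```python
def pyrimidine_rich_windows(seq, width, top=3):
    """Rank all windows of a given width by their fraction of C and T.

    Returns up to `top` tuples (fraction, start, window), best first.
    Ties are broken by the earlier start position.
    """
    seq = seq.upper()
    scored = []
    for start in range(len(seq) - width + 1):
        window = seq[start:start + width]
        fraction = sum(base in "CT" for base in window) / width
        scored.append((fraction, start, window))
    scored.sort(key=lambda item: (-item[0], item[1]))
    return scored[:top]


# Made-up sequence for illustration only (not from the Stx2 gene).
example = "AAGGTTCCTTAAGG"
print(pyrimidine_rich_windows(example, width=4, top=1))
# [(1.0, 4, 'TTCC')]
```

A real search would use a window as long as the intended TALE target site and would then pass the best candidates on to the specificity checks.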

Another important factor in picking the target sequences of the two TALEs was the distance between them. The two sites have to be far enough apart that the proteins fused to one TALE do not block the target sequence of the other TALE. Within that constraint, we tried to pick two target sequences that are as close to each other as possible, because DNA can be sheared into smaller pieces by physical factors or cut by endonucleases. For our system to detect EHEC, the two regions of DNA that the TALEs bind to must remain attached to each other. Keeping the target sites as close together as possible decreases the chance of a cut between them.

The TALEs were designed to be extremely specific. Based on the considerations explained above, a number of possible target sites were selected. To find the most specific pair, we ran BLAST searches on each TALE target sequence separately. As expected, we observed a large number of alignments with EHEC strains. We also found some partial alignments in non-EHEC organisms, which we screened to find out whether they could be problematic. Let's call the two selected target sites [1] and [2]. If [1] had a 90% alignment to a region in the Homo sapiens genome, we checked whether [2] could also be found in the Homo sapiens genome. If it was found, we then checked whether [1] and [2] were found on the same chromosome; if they were on different chromosomes, they are not attached, so the system cannot report them as a false positive. If they were found on the same chromosome, we determined which part of [1] or [2] aligned with that match. Remember that TALEs are polar: they bind much more strongly to the 5’ end of their target sequence than to the 3’ end. Therefore, if the alignment included the 0 to 10th nucleotide of the target sequence, that combination of the two target sites was ruled out. To get a better idea of how we went about designing the TALEs, refer to the figure below.
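The screening rule described above can be summarized as a short decision procedure. The Python sketch below is an illustrative reading of that rule and not the tool we used: the Hit class, its field names and the constant FIVE_PRIME_LIMIT are hypothetical, and two points are assumptions, namely that an alignment counts as covering the 5’ end if it starts at or before position 10 of the target site, and that a pair is ruled out if this holds for either site.

```python
from dataclasses import dataclass

# Assumption: positions 0 to 10 of the target site count as the 5' end.
FIVE_PRIME_LIMIT = 10


@dataclass
class Hit:
    """A partial alignment of one target site in a non-EHEC organism."""
    organism: str
    chromosome: str
    site_start: int  # first target-site position covered by the alignment


def covers_five_prime(hit):
    """True if the alignment reaches into the 5' end of the target site."""
    return hit.site_start <= FIVE_PRIME_LIMIT


def pair_ruled_out(hits_site1, hits_site2):
    """Apply the screening rule to the hits of target sites [1] and [2]."""
    for h1 in hits_site1:
        for h2 in hits_site2:
            same_molecule = (
                h1.organism == h2.organism
                and h1.chromosome == h2.chromosome
            )
            # Hits on different chromosomes are not physically attached,
            # so they cannot give a false positive.
            if not same_molecule:
                continue
            # Assumption: one 5'-end match on either site is enough.
            if covers_five_prime(h1) or covers_five_prime(h2):
                return True
    return False


site1 = [Hit("Homo sapiens", "chr7", 3)]
site2_other_chr = [Hit("Homo sapiens", "chr12", 0)]
site2_same_chr = [Hit("Homo sapiens", "chr7", 14)]
print(pair_ruled_out(site1, site2_other_chr))  # False
print(pair_ruled_out(site1, site2_same_chr))   # True
```

In the second call both hits lie on the same chromosome and the hit for site [1] starts at position 3, inside the 5’ end, so the pair is rejected.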