Cloning, expression and purification of difficult to clone, express and purify proteins in E. coli
I have received some mails about the expression of difficult to purify proteins, so I thought of putting together a short list of do's and don'ts. Pure bioinformatics people, please bear with me for a couple of posts. First of all, it is important to know your protein: gather as much information about it as you can. All those small pieces of information help a lot when designing the strategy for cloning, expression and purification. Also find out the source of the protein: eukaryotic, prokaryotic or any other. Basic parameters like the size of the protein, pI, amino acid composition etc. play a vital role in designing the strategy. I have compiled some tools for finding such information on this blog before: http://bioinformatictools.blogspot.in/2014/04/functional-annotation-of-hypothetical.html
and http://bioinformatictools.blogspot.in/2011/11/in-silico-characterization-of-proteins.html.
Look for other sources too. The main theme is to find as much information about the protein as one can.
I am not a big fan of purifying the protein under denaturing conditions. If the protein has to be refolded from denaturing conditions, there are lots of questions that are difficult to answer, such as whether the protein has folded properly, and whether this is the native fold rather than some random refolding. These are difficult to demonstrate experimentally unless you already have an assay in mind. Since I have tried that too, I will end by suggesting what I have learned on that part.
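The basic parameters mentioned above (size and amino acid composition) can be estimated from the sequence alone. Here is a minimal Python sketch, assuming a plain one-letter protein sequence with only the 20 standard residues. The example sequence is hypothetical, and the masses are standard average residue masses; a dedicated tool will be more thorough.

```python
from collections import Counter

# Average residue masses in daltons (free amino acid minus one water).
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.0153  # one water is added back for the free N- and C-termini


def composition(seq):
    """Return {residue: fraction of sequence} for a one-letter sequence."""
    seq = seq.upper()
    counts = Counter(seq)
    return {aa: n / len(seq) for aa, n in counts.items()}


def molecular_weight(seq):
    """Approximate average molecular weight in daltons.

    Raises KeyError for letters outside the 20 standard residues.
    """
    seq = seq.upper()
    return sum(RESIDUE_MASS[aa] for aa in seq) + WATER


if __name__ == "__main__":
    example = "MKTAYIAKQRQISFVKSHFSRQ"  # hypothetical sequence, for illustration only
    print(f"length: {len(example)} aa")
    print(f"approx. MW: {molecular_weight(example) / 1000:.2f} kDa")
    for aa, frac in sorted(composition(example).items(), key=lambda kv: -kv[1]):
        print(f"{aa}: {frac:.1%}")
```

An unusually high fraction of hydrophobic residues or cysteines in the output is the kind of small detail worth keeping in mind before choosing a vector or host.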
Downstream experimental procedures: Before designing the strategy for cloning, expression and purification of the protein, it is wise to decide which downstream experiments you are going to perform, because the strategy mainly depends on this. At times it is possible to purify a very small amount of soluble protein from a very large culture. That is fine if you only need a very small amount of protein for downstream experiments, and in that case you need not go through all the standardization experiments with trials in different vectors and host cells. However, if a large amount of protein is required (such as for crystallization experiments), it is advisable to optimize the whole purification process.
Read as much as you can: There are various resources available with suggestions for cloning, expression and purification of the protein in the soluble fraction (e.g. the QIAexpress handbook). But please keep in mind that in wet lab work it is easy to make suggestions, while it takes a lot of time and energy to perform the experiments the way one wishes to. So try what you think is logical and, more importantly, easily available to you (do-able).
Membrane or membrane associated protein: Check if the selected protein is a membrane or membrane associated protein. This can be done using subcellular localization tools, some of which are listed here http://bioinformatictools.blogspot.in/2007/09/predicting-subcellular-localization-of.html.
Also, check if the protein has a transmembrane domain (TMHMM http://www.cbs.dtu.dk/services/TMHMM/) or signal peptide (SignalP http://www.cbs.dtu.dk/services/SignalP/) in it.
These are hydrophobic regions, which is why membrane proteins are a bit tough to get in soluble form until one removes the transmembrane or signal peptide part. It is logical to remove the initial (normally N-terminal) transmembrane or signal peptide part to get the functional domain or multiple domains in soluble form. (I had a similar problem with a protein I was working on. Removing the signal peptide and transmembrane domain solved everything: the protein went into the soluble fraction, purified like a charm, and I got it crystallized too.)
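To get a quick first look at where the hydrophobic stretches sit before running the dedicated predictors, a sliding-window hydropathy scan can help. The sketch below uses the Kyte-Doolittle scale with a 19-residue window and a mean score of 1.6 as the cut-off, which are the values usually quoted for spotting candidate membrane-spanning segments. It is a crude approximation and not a substitute for TMHMM or SignalP. The function names and example sequence are hypothetical.

```python
# Kyte-Doolittle hydropathy values per residue.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(seq, window=19):
    """Mean hydropathy of every full window along the sequence."""
    scores = [KD[aa] for aa in seq.upper()]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


def hydrophobic_segments(seq, window=19, threshold=1.6):
    """Return merged (start, end) ranges, 1-based and inclusive, covered by
    windows whose mean hydropathy is at or above the threshold."""
    segments = []
    for i, score in enumerate(hydropathy_profile(seq, window)):
        if score < threshold:
            continue
        start, end = i + 1, i + window
        if segments and start <= segments[-1][1] + 1:
            # Overlaps or touches the previous hit: extend it.
            segments[-1] = (segments[-1][0], end)
        else:
            segments.append((start, end))
    return segments


if __name__ == "__main__":
    # Hypothetical sequence: a hydrophobic N-terminal stretch, then a polar region.
    example = "MLLLAVLLIAGLVLAALLAV" + "DEKRSTNQ" * 6
    for start, end in hydrophobic_segments(example):
        print(f"candidate hydrophobic segment: residues {start}-{end}")
```

If the scan flags only an N-terminal stretch, that is a hint that a truncated construct starting after it may be worth cloning, subject to what the proper predictors say.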
Check for functional domains in the protein, if any: This will help in determining the probable function of the protein. It will also point you to other proteins with a similar domain, and to how they behaved during cloning, expression and purification in E. coli. If you can find a protein with a similar domain, use its cloning, expression and purification protocol as a starting point for the target protein. Also, for some proteins the results of sequence-based analysis change once a tag is added, so keep this in mind too; it might change the pI and so on.
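The effect of a tag on the pI can be checked with a few lines of Python. This is a rough sketch: it computes the net charge from the Henderson-Hasselbalch equation and finds the pH where it crosses zero by bisection. The pKa values are one commonly used set and are an assumption here (other tools use different values, so absolute numbers will differ). The target and tag sequences are hypothetical.

```python
# Assumed side-chain and terminal pKa values; other sets exist.
PKA_POSITIVE = {"K": 10.8, "R": 12.5, "H": 6.5}
PKA_NEGATIVE = {"D": 3.9, "E": 4.1, "C": 8.5, "Y": 10.1}
PKA_N_TERM, PKA_C_TERM = 8.6, 3.6


def net_charge(seq, ph):
    """Net charge of the unfolded chain at a given pH."""
    charge = 1 / (1 + 10 ** (ph - PKA_N_TERM))      # free N-terminus
    charge -= 1 / (1 + 10 ** (PKA_C_TERM - ph))     # free C-terminus
    for aa in seq.upper():
        if aa in PKA_POSITIVE:
            charge += 1 / (1 + 10 ** (ph - PKA_POSITIVE[aa]))
        elif aa in PKA_NEGATIVE:
            charge -= 1 / (1 + 10 ** (PKA_NEGATIVE[aa] - ph))
    return charge


def isoelectric_point(seq, tol=0.001):
    """pH at which the net charge is zero, found by bisection on pH 0-14."""
    lo, hi = 0.0, 14.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if net_charge(seq, mid) > 0:
            lo = mid   # still positive: the pI lies at higher pH
        else:
            hi = mid
    return (lo + hi) / 2


if __name__ == "__main__":
    target = "MDEELLKDAGYSEVLNAIEDG"   # hypothetical acidic protein fragment
    his_tag = "MGSSHHHHHHSSG"          # hypothetical N-terminal His-tag and linker
    print(f"pI without tag: {isoelectric_point(target):.2f}")
    print(f"pI with tag:    {isoelectric_point(his_tag + target):.2f}")
```

A shift like this matters when choosing buffer pH or planning an ion exchange step, so it is worth running the analysis on the tagged construct and not only on the native sequence.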
Optimize the temperature: Try different temperatures for growth and induction. The induction temperature is the more crucial of the two.
- Try growing cells at 37 °C and inducing at 37 °C.
- Try growing cells at 37 °C and inducing at 25 °C for a long time.
- Try growing cells at 37 °C and inducing at 16 °C for a long time.
- Try growing cells at 25 °C and inducing at 16 °C for a long time.
- Try growing cells at 37 °C, then chilling at 16 °C for at least one hour before induction.
Low temperature decreases the rate of protein synthesis, and usually more soluble protein is obtained. Also, if the temperature is reduced before the cells are induced, the protein is more likely to end up in the soluble fraction; it somehow diverts the protein from the pathway leading into inclusion bodies (sorry, I do not know how).
Optimize the IPTG concentration: It is a good idea to test a gradient of IPTG at small scale (using a range from 0.1, 0.2, 0.3 … mM) to find the amount required for the optimal expression level of the protein. Normally, IPTG is required at very low levels for optimal expression, and using a higher concentration is not only costly but also doesn't improve the expression level much.
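For setting up the small-scale gradient, the volume of IPTG stock to add follows from C1 × V1 = C2 × V2. A small Python helper, assuming a 1 M stock and 5 mL test cultures (both are assumptions for illustration, as are the concentrations listed), might look like this:

```python
def iptg_volume_ul(final_mm, culture_ml, stock_mm=1000.0):
    """Volume of IPTG stock (in µL) needed to reach final_mm (mM) in
    culture_ml (mL) of culture.

    Uses C1 * V1 = C2 * V2 and ignores the tiny volume added to the culture.
    stock_mm defaults to 1000 mM (a 1 M stock), which is an assumption.
    """
    stock_ml = final_mm * culture_ml / stock_mm
    return stock_ml * 1000.0  # mL -> µL


if __name__ == "__main__":
    culture_ml = 5.0                           # hypothetical test culture volume
    gradient_mm = [0.1, 0.2, 0.3, 0.5, 1.0]    # hypothetical concentrations to test
    for conc in gradient_mm:
        vol = iptg_volume_ul(conc, culture_ml)
        print(f"{conc:.1f} mM in {culture_ml:.0f} mL: add {vol:.2f} µL of 1 M stock")
```

Sub-microlitre volumes are hard to pipette accurately, so for small cultures it is easier to make a diluted working stock and pass its concentration as `stock_mm`.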
Use a large tag, but make sure to make an arrangement to remove it once you have the protein: Larger tags like the intein tag, His-SUMO, GST tag, MBP (maltose binding protein) etc. are known to increase the solubility of proteins. Use them if you have the corresponding vectors easily available.
Change the vector: Using a weaker promoter (e.g. trc instead of T7) and a lower copy number plasmid normally increases the chance of getting the protein in the soluble fraction. Also, the choice of N- and/or C-terminal tags (in various vectors) affects the solubility of the protein, especially for proteins whose folding depends on one of these termini.
Change the host cells: Some E. coli strains handle toxic or membrane proteins better than others. I had a very good experience working with the C41 and C43 strains, which I came to know through this paper http://www.ncbi.nlm.nih.gov/pubmed/15294299.
There are also pLysS versions of these strains; I did not try them, but you can read up and try. Other strains like Rosetta etc. might also be good to try (it depends on the strains you can get your hands on) (So, beg, borrow or steal ;)). For a new protein I usually try as many changes as I can, one at a time, at small scale, and then move the promising ones to large scale. Also, check whether your gene uses codons that are rarely used in E. coli. You can check ‘rare codon usage’ using the different software available.
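A quick rare codon count can also be done by hand in Python before turning to the dedicated software. The sketch below flags the six codons most often listed as rare in E. coli (AGG, AGA, AUA, CUA, CCC, GGA). Treat that list as an assumption, since different tools use different lists and frequency cut-offs. The example coding sequence is hypothetical.

```python
# Codons commonly cited as rare in E. coli (DNA alphabet); an assumed list.
RARE_CODONS = {"AGG", "AGA", "ATA", "CTA", "CCC", "GGA"}


def rare_codon_report(cds):
    """Count rare codons in an in-frame coding sequence (DNA or RNA letters).

    Returns the number of codons, the (position, codon) hits with 1-based
    codon positions, and the fraction of codons that are rare.
    """
    cds = cds.upper().replace("U", "T")
    if len(cds) % 3:
        raise ValueError("coding sequence length must be a multiple of 3")
    codons = [cds[i:i + 3] for i in range(0, len(cds), 3)]
    hits = [(i + 1, c) for i, c in enumerate(codons) if c in RARE_CODONS]
    fraction = len(hits) / len(codons) if codons else 0.0
    return {"n_codons": len(codons), "rare": hits, "fraction": fraction}


if __name__ == "__main__":
    example = "ATGAGGAGACTGCCCATAGGATCTAAATAA"  # hypothetical CDS
    report = rare_codon_report(example)
    print(f"{len(report['rare'])} rare codons out of {report['n_codons']} "
          f"({report['fraction']:.0%})")
    for pos, codon in report["rare"]:
        print(f"  codon {pos}: {codon}")
```

Many hits, or several of them in a row, are a reason to try a strain that supplies the corresponding tRNAs or to consider a codon-optimized gene.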
Change the culture media: After changing and optimizing as many parameters as I could, I was still getting a low level of protein in the soluble fraction in LB media. I read somewhere that someone had a good yield with Terrific Broth, so I tried it, and it gave way more protein in the soluble fraction. I was happy to use it thereafter for any protein I had to purify.
Use auto-induction media: It is worth trying auto-induction. The idea is that instead of adding an inducing agent like IPTG, one relies on the native regulation of the lac operon, which is what controls expression in T7-based systems. If you grow the cells in media containing both glucose and lactose, the cells use the glucose first. As the glucose is depleted, they start using lactose in its place, and this relieves repression of the lac-controlled promoters. That in turn induces expression from your expression vector, and leads to a much more gradual expression than using IPTG.