X-ray Crystallography

The purpose of macromolecular X-ray crystallography is to obtain a three-dimensional model (PDB) of a protein structure. The knowledge gained from analysing such 3D models is of immense value, but obtaining them is rarely easy, fast or cheap. The following steps lead to the 3D protein model:

  • Protein expression.
  • Protein purification.
  • Crystallisation, optimisation and crystal handling.
  • Data collection.
  • Structure solution: the phase problem.
  • Structure analysis.

Protein expression

Most of our projects depend upon recombinant expression of the target gene in E. coli. We try to drive high-level expression from strong promoters on the plasmid that harbours the gene of interest, because protein crystallography is very hungry for protein. The more protein you have, the greater the chance of screening enough conditions to obtain crystals. However, for some projects (e.g. membrane proteins, eukaryotic proteins, multi-protein complexes) making sufficient protein is a problem that is not easily overcome.

Protein purification

We typically recommend a purity of at least 95% as determined by SDS-PAGE. Complementary techniques such as spectroscopy, mass spectrometry, analytical ultracentrifugation, activity assays, static or dynamic light scattering (SLS/DLS) and small-angle X-ray scattering (SAXS) are very valuable for gaining both quantitative and qualitative insights into your protein sample. The last purification step should ideally be gel filtration to ensure that the protein sample is monodisperse. The chances of successful crystallisation increase with protein purity and with chemical and conformational homogeneity. Some proteins degrade during purification, in which case it can be beneficial to use mild trypsinolysis to identify a more stable fragment. With good mass spectrometry it should be possible to identify the domain boundaries and, if need be, redesign the recombinant construct. In most cases, your expression system should yield at least 200 microlitres of pure (>95%) protein at 10 mg/ml.
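
As a rough sanity check on the sample requirement above, the arithmetic can be sketched in Python. The function names and the figure of 100 nl of protein per drop are illustrative assumptions, not values from our protocol; substitute the drop volume your own setup uses.

```python
def protein_mass_mg(volume_ul: float, conc_mg_per_ml: float) -> float:
    """Mass of protein (mg) in a sample of given volume (µl) and concentration (mg/ml)."""
    return volume_ul / 1000.0 * conc_mg_per_ml


def drops_available(volume_ul: float, protein_per_drop_nl: float) -> int:
    """Number of drops a sample can supply, ignoring dead volume and pipetting losses."""
    return int(volume_ul * 1000.0 // protein_per_drop_nl)


# 200 µl at 10 mg/ml, the minimum recommended above
mass = protein_mass_mg(200, 10)

# Assumption for illustration only: 100 nl of protein solution per drop
n_drops = drops_available(200, 100)

print(f"{mass:.1f} mg of protein, enough for about {n_drops} drops")
```

In practice, dead volume in the dispensing equipment and repeats during optimisation reduce the number of usable drops, so the result is best read as an upper bound.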


Crystallisation

The goal of crystallisation is to obtain well-ordered protein crystals that are large enough to diffract X-rays in a useful manner. Finding crystallisation conditions is a process of trial and error. We suggest a starting protein concentration of 10 mg/ml, but sometimes you can be lucky with concentrations as low as 3 mg/ml. If you are unlucky, you may have to concentrate your protein up to 150 mg/ml before it will crystallise! Knowing how the protein behaves whilst concentrating, and its absolute concentration limit, can help to increase the chances of obtaining a crystallisation “hit”. We routinely start the crystallisation process with two sparse matrix screens from Molecular Dimensions (Structure I + II and JCSG) and a systematic grid screen (PACT). The temperature of the crystallisation room is 20 degrees C, but two RUMED vibration-free crystallisation incubators are also available at 4 degrees C. We have invested in an automated system (Rigaku Minstrel/Gallery) for crystal storage (at 4 degrees C) and visualisation. If more than 10% of the drops show heavy amorphous precipitate and fewer than 10% are clear, the protein concentration is in the right range; otherwise the concentration may have to be adjusted. In the absence of crystal “hits”, the protein surface entropy can be reduced chemically or by mutagenesis. The last step of crystallisation is to optimise the crystal hit(s) by varying all possible parameters, including the protein concentration, precipitant, buffer and/or pH, salt, additives, ligand, temperature, drop ratios, crystallisation techniques and nucleation control (nucleant, seeding).
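
The rule of thumb for judging the protein concentration from a screen can be written as a small Python sketch. The score labels ("precipitate", "clear", "other") and the function name are hypothetical; only the two 10% thresholds come from the text above.

```python
from collections import Counter


def concentration_in_range(drop_scores, precipitate_min=0.10, clear_max=0.10):
    """Apply the 10% rule of thumb to a list of per-drop scores.

    drop_scores holds one label per drop. The labels "precipitate" (heavy
    amorphous precipitate) and "clear" are hypothetical names chosen for
    this sketch; any other label is ignored.
    """
    total = len(drop_scores)
    if total == 0:
        raise ValueError("no drops were scored")
    counts = Counter(drop_scores)
    frac_precipitate = counts["precipitate"] / total
    frac_clear = counts["clear"] / total
    return frac_precipitate > precipitate_min and frac_clear < clear_max


# Example: a 96-condition screen scored by eye
scores = ["precipitate"] * 30 + ["clear"] * 5 + ["other"] * 61
print(concentration_in_range(scores))
```

A screen that fails this check suggests adjusting the protein concentration before setting up further conditions.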

Crystal Screening

The crystals obtained ought to be screened for their diffraction properties. Since X-rays can severely damage proteins, crystals are these days exposed to X-rays whilst being maintained at 100 K. Therefore, the first step is to cryoprotect the crystal. We routinely use 20% PEG 400, 25% ethylene glycol, or an oil such as Paratone-N. We try to avoid glycerol at all costs, as glycerol will solubilise your protein and hence your crystal. Additionally, it mimics carbohydrates and interferes with many of our results. The cryoprotection step is crucial and should be repeated in case of poor diffraction quality, to eliminate possible mechanical stress and other handling issues. The crystals are then screened for quality by looking at resolution, ice rings and spot definition. If the quality of the crystals is not satisfactory, they ought to be tested at room temperature, without a cryoprotectant, to determine whether the poor diffraction quality is an inherent property of the crystal.
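
To make "resolution" concrete, the sketch below uses Bragg's law, d = λ / (2 sin θ), to convert the position of a spot on the detector into a d-spacing. It assumes a flat detector perpendicular to the beam; the wavelength, distance and radius in the example are illustrative values, not our beamline settings, and the ice ring d-spacings are approximate values for the three innermost hexagonal ice rings.

```python
import math


def resolution_angstrom(wavelength_a: float, distance_mm: float, radius_mm: float) -> float:
    """d-spacing (Å) of a spot at a given radius from the beam centre.

    Assumes a flat detector perpendicular to the beam, so the scattering
    angle 2θ satisfies tan(2θ) = radius / distance.
    """
    two_theta = math.atan2(radius_mm, distance_mm)
    return wavelength_a / (2.0 * math.sin(two_theta / 2.0))


# Approximate d-spacings (Å) of the three innermost hexagonal ice rings
ICE_RINGS_A = (3.90, 3.67, 3.44)


def near_ice_ring(d_spacing_a: float, tolerance_a: float = 0.03) -> bool:
    """True if a d-spacing falls within the tolerance of a listed ice ring."""
    return any(abs(d_spacing_a - ring) <= tolerance_a for ring in ICE_RINGS_A)


# Illustrative values: 0.9795 Å X-rays, detector at 300 mm, spot 150 mm from the centre
d = resolution_angstrom(0.9795, 300.0, 150.0)
print(f"d-spacing: {d:.2f} Å, near an ice ring: {near_ice_ring(d)}")
```

Data processing programs perform this calculation with the full detector geometry; the sketch is only meant to show where the resolution figure comes from.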

Data Collection and Processing

We usually collect diffraction data from our crystals at the Diamond Light Source synchrotron, UK. The crystals are kept in a storage dewar in liquid nitrogen prior to their transport in a dry shipper to the synchrotron. Diffraction data sets from well-diffracting crystals can be collected on the home X-ray source, but this normally takes 15 to 20 minutes rather than a couple of minutes at the synchrotron. Phasing by SAD or MAD is normally done at a synchrotron, although we have solved SAD/MIRAS/SIRAS cases using in-house derivative data sets.

Structure Solution

Solving the crystallographic “phase” problem can be a major hurdle. When possible, the easiest and fastest way to obtain the phases is by molecular replacement, using a structurally similar model that has already been solved. In the absence of a suitable model for molecular replacement, the phases have to be obtained experimentally by anomalous scattering (SAD, MAD), isomorphous replacement (SIR, MIR), or a combination of the two (SIRAS, MIRAS). An initial model can then be built either automatically or manually, and then refined and rebuilt iteratively to improve the phases until the electron density and the model agree with each other. The completed structure is validated with tools like MolProbity and deposited in the PDB, along with the native diffraction data amplitudes.
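
Why the phases matter can be illustrated with a one-dimensional toy example in Python. This is only a sketch of the principle, not how crystallographic software works: the eight-point "density" is invented, and the discrete Fourier transform stands in for the structure factor calculation. A diffraction experiment records only the amplitudes, and the example shows that amplitudes alone do not give back the density.

```python
import cmath


def structure_factors(density):
    """Discrete Fourier transform of a 1D density (crystallographic sign convention)."""
    n = len(density)
    return [
        sum(rho * cmath.exp(2j * cmath.pi * h * x / n) for x, rho in enumerate(density))
        for h in range(n)
    ]


def density_from(factors):
    """Inverse transform: rebuild the 1D density from complex structure factors."""
    n = len(factors)
    return [
        (sum(f * cmath.exp(-2j * cmath.pi * h * x / n) for h, f in enumerate(factors)) / n).real
        for x in range(n)
    ]


# Invented toy density: a heavy "atom" at position 2 and a lighter one at position 6
density = [0.0, 0.0, 5.0, 0.0, 0.0, 0.0, 2.0, 0.0]

factors = structure_factors(density)
amplitudes = [abs(f) for f in factors]        # what the experiment measures
phases = [cmath.phase(f) for f in factors]    # what the experiment loses

# With the correct phases the density is recovered
with_phases = density_from([cmath.rect(a, p) for a, p in zip(amplitudes, phases)])

# With all phases set to zero the result no longer resembles the density
without_phases = density_from([complex(a, 0.0) for a in amplitudes])

print([round(v, 2) for v in with_phases])
print([round(v, 2) for v in without_phases])
```

Molecular replacement and the experimental phasing methods listed above are different ways of estimating the missing phases so that an interpretable electron density map can be calculated.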

Structure Analysis

Just solving the structure and making a pretty picture is not enough; the goal of the entire process is to understand how the protein works. Sometimes we get lucky, and the protein crystallises in the presence of a ligand that closely mimics reaction substrates or products. Other times, clues to function can be obtained by structural comparison with other related (sometimes distantly related) structures. On rare occasions the structure leaves us scratching our heads, and we have to go back to the wet lab bench and test some hypotheses.