A nuclear localization signal or sequence (NLS) is an amino acid sequence that 'tags' a protein for import into the cell nucleus by nuclear transport. Typically, this signal consists of one or more short sequences of positively charged lysines or arginines exposed on the protein surface. Different nuclear localized proteins may share the same NLS. An NLS has the opposite function of a nuclear export signal (NES), which targets proteins out of the nucleus.
Classical NLSs can be further classified as either monopartite or bipartite. The major structural difference between the two is that the two basic amino acid clusters in a bipartite NLS are separated by a relatively short spacer sequence (hence bipartite: two parts), whereas a monopartite NLS consists of a single cluster. The first NLS to be discovered was the sequence PKKKRKV in the SV40 Large T-antigen (a monopartite NLS). The NLS of nucleoplasmin, KR[PAATKKAGQA]KKKK, is the prototype of the ubiquitous bipartite signal: two clusters of basic amino acids, separated by a spacer of about 10 amino acids. Both signals are recognized by importin α. Importin α itself contains a bipartite NLS, which is specifically recognized by importin β. The latter can be considered the actual import mediator.
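As an illustration of the bipartite layout described above, the following is a minimal Python sketch that splits a sequence into an upstream basic pair, a spacer, and a downstream basic cluster. The function name `split_bipartite`, the spacer range of 8 to 12 residues (standing in for "about 10"), and the requirement of at least three downstream basic residues are assumptions chosen for illustration; they are not a published consensus and should not be read as a predictor of real NLS function.

```python
import re

# Hypothetical pattern for illustration only:
#   two basic residues, a spacer of 8-12 residues (lazy, so the shortest
#   spacer that works is chosen), then a run of at least three basic residues.
BIPARTITE = re.compile(r"([KR]{2})([A-Z]{8,12}?)([KR]{3,})")


def split_bipartite(seq):
    """Return (upstream, spacer, downstream) for the first match, or None."""
    match = BIPARTITE.search(seq.upper())
    if match is None:
        return None
    return match.groups()


if __name__ == "__main__":
    # Nucleoplasmin NLS as quoted in the text, brackets removed.
    print(split_bipartite("KRPAATKKAGQAKKKK"))
    # SV40 Large T-antigen NLS (monopartite): no bipartite split expected.
    print(split_bipartite("PKKKRKV"))
```

With the nucleoplasmin sequence the sketch recovers the same three parts that the bracket notation KR[PAATKKAGQA]KKKK marks out, while the much shorter SV40 signal yields no bipartite split.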
Chelsky et al. proposed the consensus sequence K-K/R-X-K/R for monopartite NLSs. A Chelsky sequence may, therefore, be part of the downstream basic cluster of a bipartite NLS. Makkerh et al. carried out comparative mutagenesis on the nuclear localization signals of SV40 T-antigen (monopartite), c-Myc (monopartite), and nucleoplasmin (bipartite), and showed amino acid features common to all three. They also showed, for the first time, that neutral and acidic amino acids contribute to the efficiency of the NLS.
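The K-K/R-X-K/R consensus translates directly into a regular expression, where X is any residue. The sketch below is a hedged illustration: the function name `find_chelsky_motifs` is hypothetical, and a sequence match only shows that the pattern is present, not that the signal is functional in a given protein.

```python
import re

# K, then K or R, then any residue (X), then K or R.
# The lookahead lets overlapping occurrences be reported as well.
CHELSKY = re.compile(r"(?=(K[KR][A-Z][KR]))")


def find_chelsky_motifs(seq):
    """Return (start_index, motif) for every match, using 0-based indices."""
    return [(m.start(), m.group(1)) for m in CHELSKY.finditer(seq.upper())]


if __name__ == "__main__":
    for name, seq in [
        ("SV40 Large T-antigen", "PKKKRKV"),
        ("c-Myc", "PAAKRVKLD"),
        ("nucleoplasmin", "KRPAATKKAGQAKKKK"),
    ]:
        print(name, find_chelsky_motifs(seq))
```

For the nucleoplasmin sequence the only match lies in the downstream KKKK cluster, which is consistent with the remark that a Chelsky sequence may form part of the downstream basic cluster of a bipartite NLS.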
Rotello et al. compared the nuclear localization efficiencies of eGFP-fused NLSs of SV40 Large T-antigen, nucleoplasmin (AVKRPAATKKAGQAKKKKLD), EGL-13 (MSRRRKANPTKLSENAKKLAKEVEN), c-Myc (PAAKRVKLD) and TUS-protein (KLKIKRPVK) through rapid intracellular protein delivery. They found significantly higher nuclear localization efficiency for the c-Myc NLS than for the SV40 NLS.
There are many other types of NLS, such as the acidic M9 domain of hnRNP A1, the sequence KIPIK in yeast transcription repressor Matα2, and the complex signals of U snRNPs. Most of these NLSs appear to be recognized directly by specific receptors of the importin β family without the intervention of an importin α-like protein.
A class of NLSs known as PY-NLSs has also been proposed, originally by Lee et al. This PY-NLS motif, so named because of the proline-tyrosine amino acid pairing in it, allows the protein to bind to importin β2 (also known as transportin or karyopherin β2), which then translocates the cargo protein into the nucleus. The structural basis for the binding of the PY-NLS by importin β2 has been determined, and an inhibitor of import has been designed.
The presence of the nuclear membrane that sequesters the cellular DNA is the defining feature of eukaryotic cells. The nuclear membrane, therefore, separates the nuclear processes of DNA replication and RNA transcription from the cytoplasmic process of protein production. Proteins required in the nucleus must be directed there by some mechanism. The first direct experimental examination of the ability of nuclear proteins to accumulate in the nucleus was carried out by John Gurdon, who showed that purified nuclear proteins accumulate in the nucleus of frog (Xenopus) oocytes after being micro-injected into the cytoplasm. These experiments were part of a series that subsequently led to studies of nuclear reprogramming, directly relevant to stem cell research.
The presence of several million pore complexes in the oocyte nuclear membrane and the fact that they appeared to admit many different molecules (insulin, bovine serum albumin, gold nanoparticles) led to the view that the pores are open channels and nuclear proteins freely enter the nucleus through the pore and must accumulate by binding to DNA or some other nuclear component. In other words, there was thought to be no specific transport mechanism.
This view was shown to be incorrect by Dingwall and Laskey in 1982. Using a protein called nucleoplasmin, the archetypal 'molecular chaperone', they identified a domain in the protein that acts as a signal for nuclear entry. This work stimulated research in the area, and two years later the first NLS was identified in SV40 Large T-antigen (or SV40, for short). However, a functional NLS could not be identified in another nuclear protein simply on the basis of similarity to the SV40 NLS. In fact, only a small percentage of cellular (non-viral) nuclear proteins contained a sequence similar to the SV40 NLS. A detailed examination of nucleoplasmin identified a sequence with two elements made up of basic amino acids separated by a spacer arm. One of these elements was similar to the SV40 NLS but was not able to direct a protein to the cell nucleus when attached to a non-nuclear reporter protein. Both elements are required. This kind of NLS has become known as a bipartite classical NLS. The bipartite NLS is now known to represent the major class of NLS found in cellular nuclear proteins, and structural analysis has revealed how the signal is recognized by a receptor protein, importin α (the structural basis of some monopartite NLSs is also known). Many of the molecular details of nuclear protein import are now known. This was made possible by the demonstration that nuclear protein import is a two-step process: first, the nuclear protein binds to the nuclear pore complex in a process that does not require energy; this is followed by an energy-dependent translocation of the nuclear protein through the channel of the pore complex. Establishing the presence of two distinct steps made it possible to identify the factors involved, and led to the identification of the importin family of NLS receptors and the GTPase Ran.
Proteins gain entry into the nucleus through the nuclear envelope. The nuclear envelope consists of concentric membranes, the outer and the inner membrane. The inner and outer membranes connect at multiple sites, forming channels between the cytoplasm and the nucleoplasm. These channels are occupied by nuclear pore complexes (NPCs), complex multiprotein structures that mediate the transport across the nuclear membrane.
A protein translated with an NLS binds strongly to importin (also known as karyopherin), and together the complex moves through the nuclear pore. Once the complex is inside the nucleus, Ran-GTP binds to it, which causes importin to lose affinity for the protein. The protein is released, and the Ran-GTP/importin complex moves back out of the nucleus through the nuclear pore. A GTPase-activating protein (GAP) in the cytoplasm stimulates hydrolysis of the Ran-bound GTP to GDP, and this causes a conformational change in Ran that reduces its affinity for importin. Importin is released, and Ran-GDP is recycled back to the nucleus, where a guanine nucleotide exchange factor (GEF) exchanges its GDP for GTP.