The Hedgehog signaling domain was acquired from a prokaryote

The presence of well-conserved N-terminal Hh domains in several bacterial strains indicates that the metazoan Hedgehog (Hh) gene was acquired via horizontal gene transfer from bacteria. The bilaterian and cnidarian Hh proteins are rooted within the bacterial clade, while the cnidarian hedgling protein forms the outgroup. This relationship indicates that the N-terminal signaling part of the Hh proteins was acquired from a bacterium, and not from the hedgling protein.

bioRxiv preprint doi: https://doi.org/10.1101/276295; this version posted March 8, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. It is made available under a CC-BY-NC-ND 4.0 International license.


Introduction
Hedgehog genes play central roles during development of most bilaterians. Hedgehog proteins are synthesized as pro-proteins characterized by two distinct domains: an N-terminal domain that is retained as the ligand (the "Hedge" domain), and a C-terminal domain that is required for autoproteolytic processing (the "Hog" domain). The Hedge domain (also referred to as Hh-N) has all the characteristics of a Zn²⁺ metalloprotease 1, although no proteolytic target has been identified. The Hog domain resembles that of inteins, self-excising proteins that rejoin the adjacent peptides 2.
Both the Hedge and Hog domains are modular. The Hedge domain can be found at the extreme N-terminal end of large cadherin-like membrane proteins in choanoflagellates, sponges, and cnidarians 3, the members of the Hedgling family. Cnidarians also have Hh proteins, as well as intein-like Hog sequences, leading to the hypothesis that Hh genes arose from a rearrangement of the N-domain of Hedgling with a Hog/intein domain from an unknown protein 4.

Results
Here I propose an alternative hypothesis for the origin of the Hh protein. By performing a protein BLAST search against prokaryotic genomes using Shh-N as the query sequence, I found that highly conserved Hh-N domains are encoded in several bacterial genomes (Figure 1A). A BLAST phylogenetic tree (neighbor joining) of the Hh-N-containing proteins, including bacterial, cnidarian, insect and mammalian hedgehog proteins, showed that all metazoan Hh proteins are rooted within the bacterial clade. Cnidarian (Nematostella) hedgling is the outgroup (Figure 1B). In all instances, the Hh-N domain is part of a larger protein. The Hh-N domains in bacteria are located at the C-terminal ends of larger proteins, all of unknown function. In contrast, in metazoans the Hh-N domain is the amino-terminal part of both hedgling and the Hh pro-proteins (Figure 1A).
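The tree-building step can be illustrated with a minimal sketch of the textbook neighbor-joining procedure. This is not the code behind the NCBI BLAST tree view, whose internals are not described here. The taxon labels A to E and the distance matrix are invented placeholders, not values derived from any Hh-N sequence, and the function name `neighbor_joining` is hypothetical.

```python
def neighbor_joining(names, dist):
    """Build an unrooted tree by neighbor joining.

    names: list of taxon labels; dist: symmetric matrix (list of lists).
    Returns a list of (node, neighbor, branch_length) edges.
    """
    # Copy the matrix into a dict of dicts so the caller's data is untouched.
    d = {a: {b: float(dist[i][j]) for j, b in enumerate(names)}
         for i, a in enumerate(names)}
    nodes = list(names)
    edges = []
    next_id = 1
    while len(nodes) > 2:
        n = len(nodes)
        # Net divergence of each node from all other active nodes.
        r = {a: sum(d[a][b] for b in nodes) for a in nodes}
        # Pick the pair that minimizes Q(a, b) = (n - 2) d(a, b) - r(a) - r(b).
        best = None
        for x in range(n):
            for y in range(x + 1, n):
                a, b = nodes[x], nodes[y]
                q = (n - 2) * d[a][b] - r[a] - r[b]
                if best is None or q < best[0]:
                    best = (q, a, b)
        _, a, b = best
        u = f"node{next_id}"  # hypothetical label for the new internal node
        next_id += 1
        # Branch lengths from the joined pair to the new internal node.
        len_a = 0.5 * d[a][b] + (r[a] - r[b]) / (2 * (n - 2))
        len_b = d[a][b] - len_a
        edges.append((u, a, len_a))
        edges.append((u, b, len_b))
        # Distances from the new node to every remaining node.
        d[u] = {u: 0.0}
        for c in nodes:
            if c not in (a, b):
                d[u][c] = d[c][u] = 0.5 * (d[a][c] + d[b][c] - d[a][b])
        nodes = [c for c in nodes if c not in (a, b)] + [u]
    # Connect the last two nodes with their remaining distance.
    a, b = nodes
    edges.append((a, b, d[a][b]))
    return edges


# Invented toy distances between five placeholder taxa (not real data).
toy_names = ["A", "B", "C", "D", "E"]
toy_dist = [
    [0, 5, 9, 9, 8],
    [5, 0, 10, 10, 9],
    [9, 10, 0, 8, 7],
    [9, 10, 8, 0, 3],
    [8, 9, 7, 3, 0],
]
for parent, child, length in neighbor_joining(toy_names, toy_dist):
    print(f"{parent} - {child}: {length:.2f}")
```

Neighbor joining yields an unrooted tree, so statements about which sequences form the outgroup depend on how the tree is subsequently rooted.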
Performing the alignment with only the Hh-N domain (the conserved block in Figure 1A) did not change the overall interpretation. Again, the metazoan Hh proteins rooted within the bacterial clade, with the cnidarian hedgling N-domain being the outgroup (Figure 2). The Hog domain is widely distributed among prokaryotes and eukaryotes, including fungi, plants, and algae. Nevertheless, the eukaryotic Hog domains are more similar to each other than to the bacterial Hog domains. This indicates that a horizontally transferred bacterial Hh-N domain was recombined with a eukaryotic Hog domain to give rise to the ancestral Hh gene.
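The statement that one group of domains is more similar within itself than to another group can be made concrete with a small sketch that compares mean pairwise percent identity within a group and between groups. The comparison in this paper was done with BLAST, so this is only an illustration of the idea. The function names and the short aligned strings are invented placeholders, not real Hog sequences.

```python
from itertools import combinations, product
from statistics import mean


def percent_identity(a, b):
    """Percent identical residues over columns where neither sequence has a gap."""
    if len(a) != len(b):
        raise ValueError("sequences must come from the same alignment")
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    return 100.0 * sum(x == y for x, y in pairs) / len(pairs)


def within_and_between(group_a, group_b):
    """Mean identity within group_a, and between group_a and group_b.

    group_a needs at least two sequences, group_b at least one.
    """
    within = mean(percent_identity(x, y) for x, y in combinations(group_a, 2))
    between = mean(percent_identity(x, y) for x, y in product(group_a, group_b))
    return within, between


# Invented placeholder alignment rows (not real sequences).
group_one = ["ACDEFGHIKL", "ACDEFGHIKV", "ACDEYGHIKL"]
group_two = ["ACNEFWHQKL", "ACNE-WHQRL"]

within, between = within_and_between(group_one, group_two)
print(f"within group one: {within:.1f}%  between groups: {between:.1f}%")
```

A higher within-group value than between-group value is the pattern described above for the eukaryotic and bacterial Hog domains. Percent identity is a crude measure, and BLAST scores additionally account for substitution scores and gaps.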

Discussion
It has been proposed that Hh arose by the recombination of the hedgling N-domain with a eukaryotic Hog domain, likely in a cnidarian ancestor, before the urbilaterian (the last common ancestor of protostomes and deuterostomes) arose 4. This is consistent with the presence of hedgling in choanoflagellates, sponges and cnidarians, but not in bilaterians, and the presence of Hh in cnidarians and bilaterians. However, the Hedge domain of eukaryotic Hh proteins is more similar to several bacterial proteins than to the cnidarian hedgling N-domain. This indicates that a bacterial Hh-N gave rise to the Hh gene by recombination with a cnidarian Hog domain. This would require horizontal gene transfer of the Hh-N domain from a bacterium into the cnidarian genome before the evolution of the urbilaterian.
Although all bacterial Hh-N-containing proteins are hypothetical, they are likely Zn²⁺ peptidase pro-proteins 5. The conservation in this domain is strong, and it is likely that all the Hh-N proteins fold in a similar manner. The near-complete conservation of the residues involved in Zn²⁺ and Ca²⁺ binding further supports this notion and indicates that most Hh-N domains function as peptidases. The targets of the Hh-N peptidase activity remain unknown. Some of the highly conserved residues are found mutated in the congenital birth defect holoprosencephaly, which is caused by aberrant Shh signaling. These include several residues in the large α-helix (Figure 2A), as well as several of the residues involved in Zn²⁺ coordination 6, and thus in peptidase activity. As this demonstrates that these residues are critical for Hh signaling, it is likely that a conserved functional activity is present in bacteria.

Methods
All alignments were performed with protein BLAST at the NCBI website (www.ncbi.nlm.nih.gov). Accession numbers for the proteins used for the lineup and tree in Figure 1.