2.7 Gene family construction
To carry out gene family analysis, the whole protein-coding gene
repertoires from 21 published genomes, including Danio rerio, Ctenopharyngodon idellus, Oreochromis niloticus, Paralichthys olivaceus, Cynoglossus semilaevis, Litopenaeus vannamei, Penaeus monodon, Fenneropenaeus chinensis, Portunus trituberculatus, Procambarus virginalis, Cherax quadricarinatus, Daphnia magna, Locusta migratoria, Aedes aegypti, Drosophila melanogaster, Bombyx mori, Daphnia pulex,
Caenorhabditis elegans, Argopecten purpuratus, Crassostrea gigas, and Branchiostoma floridae, were
retrieved. For genes with multiple alternative isoforms, the longest
transcript of each gene (encoding more than 30 amino acids) was
retained. All-against-all BLASTP (v2.2.26) with an e-value threshold of
1e-7 was performed to assess the similarities among the retained protein
sequences. OrthoMCL (Li et al., 2003) was then used to cluster the
proteins into gene families with the parameter ‘-inflation 1.5’.
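
The two filtering steps described above (keeping the longest isoform per gene and applying the e-value cutoff) can be illustrated with a minimal Python sketch. This is not the pipeline used in the study. The function names, the (gene_id, transcript_id, sequence) input layout, and the assumption that BLASTP hits are in 12-column tabular format with the e-value in column 11 are all hypothetical choices made for illustration. Only the two thresholds (30 amino acids, 1e-7) come from the text.

```python
from typing import Dict, Iterable, List, Tuple

MIN_LENGTH_AA = 30  # from the text: retained proteins encode more than 30 amino acids
MAX_EVALUE = 1e-7   # from the text: BLASTP e-value threshold


def longest_isoforms(
    records: Iterable[Tuple[str, str, str]]
) -> Dict[str, Tuple[str, str]]:
    """Map each gene_id to its longest (transcript_id, protein_seq).

    `records` yields (gene_id, transcript_id, protein_seq) tuples; this
    layout is an assumption, not something specified in the text.
    """
    best: Dict[str, Tuple[str, str]] = {}
    for gene_id, transcript_id, seq in records:
        seq = seq.rstrip("*")  # drop a trailing stop-codon symbol, if present
        if len(seq) <= MIN_LENGTH_AA:
            continue  # too short to be retained
        current = best.get(gene_id)
        if current is None or len(seq) > len(current[1]):
            best[gene_id] = (transcript_id, seq)
    return best


def filter_blast_hits(
    lines: Iterable[str], max_evalue: float = MAX_EVALUE
) -> List[Tuple[str, str, float]]:
    """Keep (query, subject, evalue) for hits at or below `max_evalue`.

    Assumes 12-column tab-separated BLAST output with the e-value in
    column 11; treating the cutoff as inclusive is also an assumption.
    """
    kept: List[Tuple[str, str, float]] = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        evalue = float(fields[10])
        if evalue <= max_evalue:
            kept.append((fields[0], fields[1], evalue))
    return kept
```

In practice the e-value cutoff is normally passed to BLASTP itself, so a post hoc filter like `filter_blast_hits` is only needed when the search was run with a looser threshold.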