2.7 Gene family construction
To carry out gene family analysis, the complete protein-coding gene sets of 21 published genomes were retrieved: Danio rerio, Ctenopharyngodon idellus, Oreochromis niloticus, Paralichthys olivaceus, Cynoglossus semilaevis, Litopenaeus vannamei, Penaeus monodon, Fenneropenaeus chinensis, Portunus trituberculatus, Procambarus virginalis, Cherax quadricarinatus, Daphnia magna, Locusta migratoria, Aedes aegypti, Drosophila melanogaster, Bombyx mori, Daphnia pulex, Caenorhabditis elegans, Argopecten purpuratus, Crassostrea gigas, and Branchiostoma floridae. For genes with multiple alternative isoforms, only the longest transcript was retained, provided that it encoded more than 30 amino acids. An all-against-all BLASTP (v2.2.26) search with an e-value threshold of 1e-7 was performed to assess sequence similarity among the retained proteins. OrthoMCL (Li et al., 2003) was then used to construct gene families with the parameter ‘-inflation 1.5’.
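
The isoform filter described above can be illustrated with a minimal Python sketch. It is not the pipeline used in this study. The function name `longest_isoforms`, the `(gene_id, transcript_id, protein_seq)` record layout, and the stripping of a trailing `*` stop symbol are assumptions made for illustration. How transcripts are mapped to genes depends on each genome's annotation and is not shown.

```python
from typing import Dict, Iterable, Tuple

MIN_LENGTH = 30  # a protein must encode MORE than this many amino acids


def longest_isoforms(
    records: Iterable[Tuple[str, str, str]],
    min_length: int = MIN_LENGTH,
) -> Dict[str, Tuple[str, str]]:
    """Keep one protein per gene: the longest isoform above min_length.

    records: (gene_id, transcript_id, protein_seq) tuples (hypothetical layout).
    Returns a dict mapping gene_id to (transcript_id, protein_seq).
    Ties in length are resolved in favour of the first isoform seen.
    """
    best: Dict[str, Tuple[str, str]] = {}
    for gene_id, transcript_id, protein in records:
        # Assumption: a trailing '*' marks the stop codon and is not counted.
        seq = protein.rstrip("*")
        if len(seq) <= min_length:
            continue  # too short, discard this isoform
        current = best.get(gene_id)
        if current is None or len(seq) > len(current[1]):
            best[gene_id] = (transcript_id, seq)
    return best


# Toy input: geneA has two isoforms, geneB only a short peptide.
example = [
    ("geneA", "geneA.t1", "M" + "A" * 40),
    ("geneA", "geneA.t2", "M" + "A" * 60 + "*"),
    ("geneB", "geneB.t1", "M" + "A" * 10),
]
retained = longest_isoforms(example)
for gene_id, (transcript_id, seq) in sorted(retained.items()):
    print(gene_id, transcript_id, len(seq))
```

In this toy input, `geneA.t2` is kept as the representative of `geneA`, and `geneB` is dropped because its only isoform is not longer than 30 residues. The retained sequences would then be written to FASTA and passed to the all-against-all BLASTP search.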