University of Illinois Chicago

kakapo: easy extraction and annotation of genes from raw RNA-seq reads

journal contribution
posted on 2024-06-09, 19:22 authored by Karolis Ramanauskas, Boris Igić
kakapo (kākāpō) is a Python-based pipeline that allows users to extract and assemble one or more specified genes or gene families. It flexibly uses original RNA-seq read or GenBank SRA accession inputs without performing global assembly of entire transcriptomes or metatranscriptomes. The pipeline identifies open reading frames in the assembled gene transcripts and annotates them. It optionally filters raw reads for ribosomal, plastid, and mitochondrial reads, or reads belonging to non-target organisms (e.g., viral, bacterial, human). kakapo can be employed for targeted assembly, to extract arbitrary loci, such as those commonly used for phylogenetic inference in systematics or candidate genes and gene families in phylogenomic and metagenomic studies. We provide example applications and discuss how its use can offset the declining value of GenBank's single-gene databases and help assemble datasets for a variety of phylogenetic analyses.
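The open reading frame step mentioned above can be illustrated with a minimal sketch. This is not kakapo's implementation: the function names (`find_orfs`, `longest_orf`), the `min_len` parameter, the use of the standard stop codons only, and the ATG-to-stop definition of an ORF are all assumptions made for illustration.

```python
from typing import List, Tuple

# Assumption: standard genetic code stop codons and a canonical ATG start.
STOP_CODONS = {"TAA", "TAG", "TGA"}
START_CODON = "ATG"


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_orfs(seq: str, min_len: int = 6) -> List[Tuple[int, int, str]]:
    """Scan the three forward frames of `seq` for ATG...stop ORFs.

    Returns (frame, start_index, orf_sequence) tuples; each ORF includes
    its stop codon. `min_len` is a hypothetical length cutoff in nucleotides.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == START_CODON:
                start = i  # first in-frame start codon opens an ORF
            elif start is not None and codon in STOP_CODONS:
                orf = seq[start:i + 3]
                if len(orf) >= min_len:
                    orfs.append((frame, start, orf))
                start = None  # look for the next ORF in this frame
    return orfs


def longest_orf(seq: str) -> str:
    """Return the longest ORF found on either strand, or '' if none."""
    candidates = [orf for _, _, orf in find_orfs(seq)]
    candidates += [orf for _, _, orf in find_orfs(reverse_complement(seq))]
    return max(candidates, key=len, default="")


if __name__ == "__main__":
    # Toy transcript, not real data.
    print(longest_orf("CCATGAAATAGGG"))
```

A production pipeline would also need to handle ambiguous bases, partial ORFs at transcript ends, and alternative genetic codes (for example, plastid or mitochondrial), none of which this sketch attempts.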

Funding

SG: Rapid Systematic Search for a Mechanism of Self-Incompatibility | Funder: National Science Foundation | Grant ID: DEB-1655692

Citation

Ramanauskas, K., & Igić, B. (2023). kakapo: easy extraction and annotation of genes from raw RNA-seq reads. PeerJ, 11, e16456. https://doi.org/10.7717/peerj.16456

Publisher

PeerJ

Language

  • en

ISSN

2167-8359