Title: Enumerating Alternative Spliced Transcripts for Understanding Complexity of the Human Genome
Authors: Shinichi Morishita, Jun Ogasawara, Toshihiko Honkura, and Tomoyuki Yamada
Series: Linköping Electronic Articles in Computer and Information Science
ISSN 1401-9841
Issue: Vol. 6 (2001), No. 020
URL: http://www.ep.liu.se/ea/cis/2001/020/

Abstract: According to several recent reports, the number of human genes is estimated to be fewer than 40,000, which, surprisingly, is only double the number (19,000) present in Caenorhabditis elegans. This indicates that the correlation between the complexity of a species and its gene number is loose, motivating studies of other sources of complexity such as alternative splicing and cis-regulatory machinery. The first step in this research direction is to list candidate alternatively spliced transcripts and promoters. With the release and ongoing updates of the human genome, this enumeration can be achieved by aligning millions of expressed sequence tags (ESTs, for short) to a draft human genome and then organizing the alignments into related groups, which is, however, computationally costly. We have developed an efficient algorithm for this purpose that takes only one day to align three million ESTs to a newly revised draft genome from scratch. Analysis of 2.2 million alignments identifies about 8,000 groups of alternatively spliced transcripts. A typical group contains tens of alignments, which calls for a method of selecting representatives. All the results are accessible through the WWW at our Gene Resource Locator web site http://grl.gi.k.u-tokyo.ac.jp/
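
The grouping step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm, which the abstract does not describe; it only assumes that each EST alignment is given as a list of exon intervals in half-open genomic coordinates. The record layout and the function names `group_alignments`, `introns`, and `is_alternatively_spliced` are hypothetical. Alignments on the same chromosome and strand are merged into a group when their genomic spans overlap, and a group is flagged when it contains two distinct introns that overlap each other.

```python
from collections import defaultdict

def group_alignments(alignments):
    """Group alignments on the same chromosome and strand whose spans overlap."""
    by_locus = defaultdict(list)
    for aln in alignments:
        by_locus[(aln["chrom"], aln["strand"])].append(aln)

    groups = []
    for key in sorted(by_locus):
        # Sweep left to right over alignments ordered by start coordinate.
        ordered = sorted(by_locus[key], key=lambda a: a["exons"][0][0])
        current, current_end = [], None
        for aln in ordered:
            start, end = aln["exons"][0][0], aln["exons"][-1][1]
            if current and start > current_end:
                groups.append(current)  # gap found: close the current group
                current, current_end = [], None
            current.append(aln)
            current_end = end if current_end is None else max(current_end, end)
        if current:
            groups.append(current)
    return groups

def introns(aln):
    """Return the set of introns (gaps between consecutive exons)."""
    exons = aln["exons"]
    return {(exons[i][1], exons[i + 1][0]) for i in range(len(exons) - 1)}

def is_alternatively_spliced(group):
    """Flag a group that contains two distinct, overlapping introns."""
    ordered = sorted(set().union(*(introns(a) for a in group)))
    for i, (s1, e1) in enumerate(ordered):
        for s2, _ in ordered[i + 1:]:
            if s2 < e1:  # a different intron starts inside this one
                return True
    return False

# Hypothetical toy data: est2 skips the middle exon of est1.
ests = [
    {"id": "est1", "chrom": "chr1", "strand": "+",
     "exons": [(100, 200), (300, 400), (500, 600)]},
    {"id": "est2", "chrom": "chr1", "strand": "+",
     "exons": [(100, 200), (500, 600)]},
    {"id": "est3", "chrom": "chr1", "strand": "+",
     "exons": [(1000, 1100), (1200, 1300)]},
    {"id": "est4", "chrom": "chr1", "strand": "+",
     "exons": [(1050, 1100), (1200, 1250)]},
    {"id": "est5", "chrom": "chr2", "strand": "-", "exons": [(50, 150)]},
]
groups = group_alignments(ests)
flags = [is_alternatively_spliced(g) for g in groups]
```

In this toy input, only the group containing est1 and est2 is flagged, because the intron of est2 spans the middle exon of est1. A real pipeline would also have to tolerate alignment noise at splice boundaries, which this sketch ignores.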
