A 2-phased approach for detecting multiple loci associations with traits

Authors
Lee, Sunwon; Kang, Jaewoo; Oh, Junho
Issue Date
2012
Publisher
INDERSCIENCE ENTERPRISES LTD
Keywords
TF-IDF; term frequency - inverse document frequency; class association rule mining; GWAS; SNP; bioinformatics; apriori algorithm; data mining
Citation
INTERNATIONAL JOURNAL OF DATA MINING AND BIOINFORMATICS, v.6, no.5, pp.535 - 556
Indexed
SCIE
SCOPUS
Journal Title
INTERNATIONAL JOURNAL OF DATA MINING AND BIOINFORMATICS
Volume
6
Number
5
Start Page
535
End Page
556
URI
https://scholar.korea.ac.kr/handle/2021.sw.korea/109307
DOI
10.1504/IJDMB.2012.049318
ISSN
1748-5673
Abstract
Recent advances in SNP genotyping have significantly reduced the cost of large-scale genotyping and, in turn, dramatically increased the size of SNP genotype data. This growth in data volume, however, poses a major obstacle to conventional analysis techniques, which are typically vulnerable to the high-dimensionality problem. To address this issue, we propose a method that exploits two well-tested models: the document-term model and the transaction analysis model. The proposed method consists of two phases. In the first phase, we reduce the dimensionality of the SNP genotype data by transforming the data according to the document-term model and extracting the significant SNPs. In the second phase, we apply transactional analysis to the reduced-dimension genotype data to discover association rules that capture the relations between SNPs and traits. We validated the discovered rules through a literature survey. Experiments on the HGDP panel data provided by the Foundation Jean Dausset-CEPH support the validity of our method for dimension reduction and for identifying associations between multiple SNPs and traits. This paper is an extended version of our workshop paper presented at the 2010 International Workshop on Data Mining for High Throughput Data from Genome-Wide Association Studies.
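The two phases described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how genotype data are mapped onto documents and terms, so this sketch assumes that each trait class acts as a "document" and each SNP genotype (for example `rs1=AA`) acts as a "term". The toy data, the function names `tfidf_select` and `class_rules`, the natural-log IDF, and all thresholds are hypothetical choices made for illustration only.

```python
from collections import Counter, defaultdict
from itertools import combinations
import math

# Hypothetical toy data: (set of genotype items, trait label) per individual.
SAMPLES = [
    ({"rs1=AA", "rs2=CT", "rs3=GG"}, "case"),
    ({"rs1=AA", "rs2=CT", "rs3=GA"}, "case"),
    ({"rs1=AA", "rs2=CC", "rs3=GG"}, "case"),
    ({"rs1=AG", "rs2=CC", "rs3=GG"}, "control"),
    ({"rs1=AG", "rs2=CT", "rs3=GG"}, "control"),
    ({"rs1=GG", "rs2=CC", "rs3=GG"}, "control"),
]


def tfidf_select(samples, top_k):
    """Phase 1 (assumed mapping): trait class = document, genotype item = term.

    Scores every item by TF-IDF within each class and keeps the top_k items
    ranked by their best score across classes.
    """
    tf = defaultdict(Counter)  # class label -> item -> occurrence count
    for items, label in samples:
        tf[label].update(items)
    n_docs = len(tf)
    # Document frequency: number of classes in which an item occurs.
    df = Counter(item for counts in tf.values() for item in counts)
    best = {}
    for counts in tf.values():
        total = sum(counts.values())
        for item, count in counts.items():
            score = (count / total) * math.log(n_docs / df[item])
            best[item] = max(best.get(item, 0.0), score)
    ranked = sorted(best, key=lambda item: (-best[item], item))
    return set(ranked[:top_k])


def class_rules(samples, keep, min_support, min_confidence, max_len=2):
    """Phase 2: Apriori-style level-wise search for rules `itemset -> label`.

    Returns tuples (items, label, rule_support, confidence).
    """
    reduced = [(frozenset(items & keep), label) for items, label in samples]
    n = len(reduced)
    rules = []
    candidates = {frozenset([i]) for items, _ in reduced for i in items}
    for size in range(1, max_len + 1):
        frequent = set()
        for cand in candidates:
            labels = Counter(lbl for items, lbl in reduced if cand <= items)
            covered = sum(labels.values())
            if covered / n < min_support:
                continue  # infrequent itemsets are pruned, as in Apriori
            frequent.add(cand)
            for label, hits in labels.items():
                if hits / covered >= min_confidence:
                    rules.append(
                        (tuple(sorted(cand)), label, hits / n, hits / covered)
                    )
        # Join step: combine frequent itemsets into candidates one item larger.
        candidates = {
            a | b for a, b in combinations(frequent, 2) if len(a | b) == size + 1
        }
    return sorted(rules)


selected = tfidf_select(SAMPLES, top_k=4)
rules = class_rules(SAMPLES, selected, min_support=0.15, min_confidence=0.9)
for items, label, support, confidence in rules:
    print(f"{' & '.join(items)} -> {label} "
          f"(support={support:.2f}, confidence={confidence:.2f})")
```

In this toy setup, genotypes that occur in every class receive an IDF of zero and are dropped in the first phase, so the rule search in the second phase runs over a much smaller item set. With only two classes this filter is crude; it is meant only to show how the two phases fit together.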
Appears in Collections
Graduate School > Department of Computer Science and Engineering > 1. Journal Articles
