Text-based Experiment Retrieval in Genomic Databases

dc.contributor.author: Sener, Duygu Dede
dc.contributor.author: Ogul, Hasan
dc.contributor.author: Basak, Selen
dc.contributor.orcID: https://orcid.org/0000-0001-6766-4977 [en_US]
dc.date.accessioned: 2023-09-18T13:04:53Z
dc.date.available: 2023-09-18T13:04:53Z
dc.date.issued: 2022
dc.description.abstract: As the volume of genomic data in public repositories grows, efficient search methods have become a basic need for reaching relevant data. Current repositories cannot meet this need because they offer only a limited search option: lexical matching of the textual descriptions or metadata of experiments. This technique is insufficient for detecting similarities between experiments within a large data collection. To address this limitation, we develop a text-based experiment retrieval framework that uses both lexical and semantic similarity approaches to find similarities between experiments, and we compare their retrieval performance. This study is the first attempt to use text-driven semantic analysis approaches for developing a retrieval framework for experiments. An empirical study was conducted on a large collection of textual descriptions of Arabidopsis microarray experiments from the Gene Expression Omnibus database. In the proposed model, Jaccard similarity was used as the lexical similarity approach; Latent Semantic Analysis, Probabilistic Latent Semantic Analysis and Latent Dirichlet Allocation were used as semantic similarity approaches to detect similarities between the textual descriptions of the experiments. According to the experimental results, text-driven semantic similarity approaches retrieve relevant experiments more successfully than the lexical similarity approach. [en_US]
dc.identifier.issn: 0165-5515 [en_US]
dc.identifier.scopus: 2-s2.0-85138261267 [en_US]
dc.identifier.uri: http://hdl.handle.net/11727/10682
dc.identifier.wos: 000849582000001 [en_US]
dc.language.iso: eng [en_US]
dc.relation.isversionof: 10.1177/01655515221118670 [en_US]
dc.relation.journal: JOURNAL OF INFORMATION SCIENCE [en_US]
dc.relation.publicationcategory: Article - International Peer-Reviewed Journal [en_US]
dc.rights: info:eu-repo/semantics/closedAccess [en_US]
dc.subject: Information retrieval [en_US]
dc.subject: lexical similarity [en_US]
dc.subject: microarray experiments [en_US]
dc.subject: semantic similarity [en_US]
dc.subject: text-based retrieval [en_US]
dc.title: Text-based Experiment Retrieval in Genomic Databases [en_US]
dc.type: Article [en_US]
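
The abstract names Jaccard similarity as the lexical baseline for comparing textual descriptions of experiments. The following is a minimal sketch of that idea, not the authors' implementation: the tokenizer, the function names (tokenize, jaccard, rank_by_jaccard), the experiment IDs and the sample descriptions are all hypothetical, chosen only for illustration.

```python
import re


def tokenize(text):
    """Lowercase a description and return its set of alphanumeric tokens.

    Assumption: a simple regex tokenizer; the record does not say how
    the descriptions were preprocessed.
    """
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a, b):
    """Jaccard similarity |A & B| / |A | B| of two token sets.

    Returns 0.0 when both sets are empty to avoid division by zero.
    """
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


def rank_by_jaccard(query, descriptions):
    """Rank experiment IDs by Jaccard similarity to the query, best first."""
    q = tokenize(query)
    scored = [
        (exp_id, jaccard(q, tokenize(text)))
        for exp_id, text in descriptions.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical experiment descriptions, invented for this example.
descriptions = {
    "EXP_A": "Arabidopsis root response to salt stress",
    "EXP_B": "Arabidopsis leaf response to drought stress",
    "EXP_C": "Seed germination under cold treatment",
}

ranking = rank_by_jaccard("salt stress response in Arabidopsis root", descriptions)
for exp_id, score in ranking:
    print(f"{exp_id}\t{score:.3f}")
```

Because the score depends only on shared surface tokens, two descriptions that use different words for the same concept score low. That is the limitation the semantic approaches listed in the abstract (Latent Semantic Analysis, Probabilistic Latent Semantic Analysis, Latent Dirichlet Allocation) are meant to address.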
