Experimental investigation of three machine learning algorithms for ITS dataset

Yearwood, JL; Kang, Byeong; Kelarev, A

File(s) under permanent embargo

conference contribution

posted on 2023-05-23, 04:50 authored by Yearwood, JL; Kang, Byeong; Kelarev, A

The present article is devoted to an experimental investigation of the performance of three machine learning algorithms on the ITS dataset, measured by their ability to achieve agreement with classes previously published in the biological literature. The ITS dataset consists of nuclear ribosomal DNA sequences, where rather sophisticated alignment scores have to be used as a measure of distance. These scores do not form a Minkowski metric, and the sequences cannot be regarded as points in a finite-dimensional space. This is why it is necessary to develop novel machine learning approaches to the analysis of datasets of this sort. This paper introduces a k-committees classifier and compares it with the discrete k-means and Nearest Neighbour classifiers. It turns out that all three machine learning algorithms are efficient and can be used to automate future biologically significant classifications for datasets of this kind. A simplified version of a synthetic dataset, where the k-committees classifier outperforms the k-means and Nearest Neighbour classifiers, is also presented.
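To illustrate why such data call for classifiers that work from pairwise scores alone, the following is a minimal Python sketch, not the authors' implementation. It shows a Nearest Neighbour classifier and a discrete centre (the class member with the smallest total dissimilarity to the others), both driven only by a dissimilarity function. The function `dissimilarity`, built on the standard library's `difflib.SequenceMatcher`, is a hypothetical stand-in for the alignment scores mentioned in the abstract, and the names `nearest_neighbour`, `medoid` and `classify_by_medoid` are assumptions introduced here. The k-committees classifier is not sketched, since the abstract does not describe its construction.

```python
from difflib import SequenceMatcher


def dissimilarity(a: str, b: str) -> float:
    """Hypothetical stand-in for an alignment-based score (an assumption).

    Like an alignment score, it is defined on pairs of sequences only and
    is not guaranteed to behave like a Minkowski metric.
    """
    return 1.0 - SequenceMatcher(None, a, b).ratio()


def nearest_neighbour(query: str, labelled: list[tuple[str, str]]) -> str:
    """Return the label of the training sequence least dissimilar to query."""
    _, label = min(labelled, key=lambda pair: dissimilarity(query, pair[0]))
    return label


def medoid(sequences: list[str]) -> str:
    """Discrete centre: the member with the smallest total dissimilarity.

    No averaging of coordinates is needed, so this works even when the
    sequences cannot be treated as points in a finite-dimensional space.
    """
    return min(sequences, key=lambda s: sum(dissimilarity(s, t) for t in sequences))


def classify_by_medoid(query: str, classes: dict[str, list[str]]) -> str:
    """Assign query to the class whose discrete centre is least dissimilar."""
    centres = {label: medoid(seqs) for label, seqs in classes.items()}
    return min(centres, key=lambda label: dissimilarity(query, centres[label]))
```

Both functions need only the ability to score a pair of sequences, which is the property the abstract identifies as essential for datasets of this kind.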

History

Publication title

Proceedings of Future Generation Information Technology

Editors

Lee YH, Kim TH, Fang WC, Slezak D

Pagination

308-316

ISBN

978-3-642-10508-1

Department/School

School of Information and Communication Technology

Publisher

Springer-Verlag

Place of publication

New York, USA

Event title

Future Generation Information Technology

Event Venue

Jeju Island, Korea

Date of Event (Start Date)

2009-12-10

Date of Event (End Date)

2009-12-12

Rights statement

The original publication is available at http://www.springerlink.com

Repository Status

Restricted

Socio-economic Objectives

Information systems, technologies and services not elsewhere classified

Licence

In Copyright
