Show simple item record

dc.contributor.advisor: Mehta, Dinesh P.
dc.contributor.author: Thiagarajan, Dheivya
dc.date.accessioned: 2017-07-31T16:32:51Z
dc.date.accessioned: 2022-02-03T13:00:56Z
dc.date.available: 2017-07-31T16:32:51Z
dc.date.available: 2022-02-03T13:00:56Z
dc.date.issued: 2017
dc.identifier: Thiagarajan_mines_0052E_11303.pdf
dc.identifier: T 8330
dc.identifier.uri: https://hdl.handle.net/11124/171241
dc.description: Includes bibliographical references.
dc.description: 2017 Summer.
dc.description.abstract: Isomer networks provide a mechanism to understand and interpret relationships between organic molecules, with applications in medicinal chemistry and drug design. Extracting an isomer network is a time- and data-intensive computation. This dissertation contributes a variety of techniques for computing isomer networks more efficiently with respect to both time and memory. Specifically, we describe our efforts to improve the network extraction process by: 1) Using the symmetry present in most molecules to reduce run time and memory, and streamlining the algorithm used to detect duplicate canonical names, a key step in determining the bond count distances between pairs of isomers. Together, these techniques reduce memory use by up to 60% and improve run time by up to a factor of 100. 2) Developing an optimal grouping algorithm to subdivide an all-to-all computation with large memory requirements. The algorithm divides the "big data" problem that arises in the construction of isomer networks into several independent "small data" problems. Our results show that the grouping algorithm can help divide large data sets into independent smaller ones that can be processed in parallel. 3) Generating the isomer network for 1,050,125 isomers of nicotine, with a preliminary analysis of that network, using the cloud computing capabilities of Amazon Web Services and Microsoft Azure. These techniques can also be used to compute isomer networks for other chemical compounds.
dc.format.medium: born digital
dc.format.medium: doctoral dissertations
dc.language: English
dc.language.iso: eng
dc.publisher: Colorado School of Mines. Arthur Lakes Library
dc.relation.ispartof: 2017 - Mines Theses & Dissertations
dc.rights: Copyright of the original work is retained by the author.
dc.subject: cheminformatics
dc.subject: network analytics
dc.subject: big data
dc.subject: symmetry
dc.subject: isomers
dc.title: Faster isomer network generation
dc.type: Text
dc.contributor.committeemember: Han, Qi
dc.contributor.committeemember: Wu, Bo
dc.contributor.committeemember: Ciobanu, Cristian V.
thesis.degree.name: Doctor of Philosophy (Ph.D.)
thesis.degree.level: Doctoral
thesis.degree.discipline: Computer Science
thesis.degree.grantor: Colorado School of Mines


Files in this item

Name: Thiagarajan_mines_0052E_11303.pdf
Size: 1.323Mb
Format: PDF
