Knowledge Discovery: Making Sense of Big Data
Implementing Algorithms for Data Analytics
In an age where we are buried in information, how do we make sense of it?
Wellington Cabrera (Ph.D., ’17), a graduate student in computer science, worked on implementing algorithms for data mining and analytics.
One of technology’s many advantages is the ease with which we can collect information on everything from internet browsing patterns to activity levels recorded by wearable trackers to weather patterns, all of which allow for more accurate insights. This ease of data collection, however, has its own drawback, as the sheer amount of data can be overwhelming.
Data Mining: Teasing Out Hidden Patterns
While working on his graduate degree in computer science, Wellington Cabrera (Ph.D., ’17) sought to address this problem by creating and implementing algorithms for data analytics and mining in database systems. This research was performed under the guidance of Carlos Ordonez, associate professor of computer science in the College of Natural Sciences and Mathematics.
In this deluge of information, with all of its tangled implications, data mining works to tease out the hidden patterns.
“Another name for data mining is ‘knowledge discovery,’” Cabrera noted.
Parallel Computing Increases Speed and Storage Capabilities
Cabrera’s research focused on developing algorithms that could work on parallel database systems. Often, database systems are distributed across multiple computers, a strategy termed parallel computing. Although this increases a database’s speed and storage capabilities, it also requires an adjustment in how tasks are performed.
“Developing an algorithm for a single computer requires a lot of sequential steps,” Cabrera said. “For parallel systems, the challenge is getting these multiple computers to work together to solve problems.”
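To make the idea concrete, here is a minimal Python sketch of one common pattern for parallel work: split the data into partitions, let each worker summarize its own partition independently, then merge the small summaries. This is an illustration only, not Cabrera's algorithm. The function names `partial_stats` and `parallel_mean` are hypothetical, and Python threads stand in for the separate computers of a real parallel database.

```python
from concurrent.futures import ThreadPoolExecutor


def partial_stats(partition):
    """Work done independently on one partition: a row count and a sum."""
    return len(partition), sum(partition)


def parallel_mean(values, workers=4):
    """Split values into partitions, summarize each in parallel, then merge."""
    # Round-robin partitioning (an assumption made for simplicity; a real
    # parallel database might instead partition rows by hashing a key).
    partitions = [values[i::workers] for i in range(workers)]

    # Each worker thread summarizes only its own partition.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(partial_stats, partitions))

    # Merge step: combine the small per-partition summaries into one answer.
    total_count = sum(count for count, _ in partials)
    total_sum = sum(subtotal for _, subtotal in partials)
    return total_sum / total_count


readings = [3.0, 5.0, 7.0, 9.0, 11.0, 13.0]
print(parallel_mean(readings, workers=2))  # prints 8.0
```

The result is the same whether the work is split across one worker or many. Only the small per-partition summaries need to be exchanged, not the raw data, which is what keeps the coordination step cheap in this sketch.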
Scalable Algorithms
Cabrera focused on scalable algorithms, in order to get comparable performance regardless of a database’s size.
“You want an algorithm that can work just as well with two computers as it does with 1,000,” Cabrera said. “When you have many computers working together, you tend to see a degradation. If the algorithm does not coordinate the parallel processing correctly, then the computers cannot work together in the right manner, becoming a mess.”
Overcoming Challenges
During his time as a graduate student, Cabrera faced many of the typical challenges of juggling coursework, research and his responsibilities as a teaching assistant, all while trying to plan for the next step in his career.
“To get a Ph.D., you have to overcome many obstacles,” Cabrera said.
This hard work ultimately paid off, as Cabrera landed several internships, published numerous papers in well-respected journals, and, after graduation, was offered a job in the tech industry.
“I am very stubborn,” Cabrera said. “I don’t like to give up.”
- Rachel Fairbank, College of Natural Sciences and Mathematics
November 30, 2017