Heterogeneous data: Data mining solutions
What we are doing!
About
We define a heterogeneous dataset as a set of complex objects, that is, objects described by several data types such as structured data, images, free text or time series. We envisage that this definition could be extended to other data types. There are currently research gaps in how to deal with such complex data. We have proposed an intermediary fusion approach called SMF, which produces a pairwise matrix of distances between heterogeneous objects by fusing the distances between the individual data types. More precisely, SMF aggregates partial distances that we compute separately for each data type, taking uncertainty into consideration. The outcome is a single fused distance matrix that can be passed to a standard clustering algorithm to produce a clustering. In our practical work, the results show that the SMF approach can improve the clustering configuration when compared with clustering on an individual data type.
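To make the idea of fusing partial distances concrete, here is a minimal sketch in Python. It is not the SMF implementation. It assumes each data type has already been reduced to a normalised pairwise distance matrix, that a missing comparison is marked with None, and that the fusion is a simple weighted average over the available entries. The function name fuse_distances, the weights parameter and the "fraction of missing data types" uncertainty score are all hypothetical choices made for illustration.

```python
from typing import List, Optional, Sequence, Tuple

Matrix = List[List[Optional[float]]]


def fuse_distances(
    partials: Sequence[Matrix],
    weights: Optional[Sequence[float]] = None,
) -> Tuple[List[List[float]], List[List[float]]]:
    """Fuse per-data-type distance matrices into a single matrix.

    Illustrative assumption: the fused distance is a weighted average of
    the partial distances that are available (not None). The second
    returned matrix holds, for each pair, the fraction of data types
    whose distance was missing, used here as a crude uncertainty score.
    """
    n = len(partials[0])
    if weights is None:
        weights = [1.0] * len(partials)

    fused = [[0.0] * n for _ in range(n)]
    missing = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            num, den, absent = 0.0, 0.0, 0
            for w, dm in zip(weights, partials):
                d = dm[i][j]
                if d is None:
                    absent += 1  # this data type cannot compare i and j
                else:
                    num += w * d
                    den += w
            # If no data type is available, the fused distance is undefined.
            fused[i][j] = num / den if den > 0 else float("nan")
            missing[i][j] = absent / len(partials)
    return fused, missing


# Toy example: three objects described by structured data and free text.
structured = [[0.0, 0.2, 0.8], [0.2, 0.0, 0.6], [0.8, 0.6, 0.0]]
text = [[0.0, 0.4, None], [0.4, 0.0, 0.2], [None, 0.2, 0.0]]

fused, missing = fuse_distances([structured, text])
print(fused[0][2])    # 0.8 (only the structured distance is available)
print(missing[0][2])  # 0.5 (one of the two data types is missing)
```

Keeping the uncertainty matrix next to the fused matrix lets a later clustering step see which fused distances rest on incomplete information.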
We then introduce Hk-medoids, a modified version of the standard k-medoids algorithm. The modification extends the algorithm to the problem of clustering our complex heterogeneous objects. Our implementation of Hk-medoids works with the fused distances produced by SMF and deals with the uncertainty in the fusion process. We experimentally evaluate the potential of our proposed algorithm using five datasets with different combinations of data types that define the objects. Our results show the feasibility of our proposed algorithm, and they also show a performance enhancement when compared with applying the original SMF approach without taking uncertainty into account.
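For readers unfamiliar with the baseline, the sketch below shows standard k-medoids (the alternating assign-and-update variant) running directly on a precomputed distance matrix, which is the kind of input a fused matrix provides. It is not Hk-medoids: the uncertainty handling described above is not reproduced here. The function names assign and k_medoids, the initial_medoids parameter and the toy matrix are assumptions made for illustration.

```python
from typing import List, Sequence, Tuple


def assign(dist: Sequence[Sequence[float]], medoids: Sequence[int]) -> List[int]:
    """Label each object with the index of its nearest medoid."""
    return [
        min(range(len(medoids)), key=lambda c: dist[i][medoids[c]])
        for i in range(len(dist))
    ]


def k_medoids(
    dist: Sequence[Sequence[float]],
    initial_medoids: Sequence[int],
    max_iter: int = 100,
) -> Tuple[List[int], List[int]]:
    """Standard k-medoids on a precomputed pairwise distance matrix."""
    medoids = list(initial_medoids)
    for _ in range(max_iter):
        labels = assign(dist, medoids)
        new_medoids = []
        for c, current in enumerate(medoids):
            members = [i for i, lab in enumerate(labels) if lab == c]
            if not members:
                new_medoids.append(current)  # keep the medoid of an empty cluster
                continue
            # The new medoid is the member with the smallest total
            # distance to the other members of its cluster.
            best = min(members, key=lambda m: sum(dist[m][o] for o in members))
            new_medoids.append(best)
        if new_medoids == medoids:
            break  # converged: medoids did not change
        medoids = new_medoids
    return medoids, assign(dist, medoids)


# Toy fused distance matrix: objects 0 and 1 are close, as are 2 and 3.
fused = [
    [0.0, 0.1, 0.9, 0.8],
    [0.1, 0.0, 0.85, 0.9],
    [0.9, 0.85, 0.0, 0.2],
    [0.8, 0.9, 0.2, 0.0],
]
medoids, labels = k_medoids(fused, initial_medoids=[0, 2])
print(medoids, labels)  # [0, 2] [0, 0, 1, 1]
```

Because the algorithm only ever reads pairwise distances, it needs no access to the original images, text or time series once the fused matrix exists.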
Who am I?
I am Aalaa Mojahed. I obtained my BSc in Computer Science at the Faculty of Sciences, King Abdulaziz University (KAU), Jeddah, Saudi Arabia, with First Class Honors in 2004. Three years later I received an MSc degree in Advanced Computing Sciences from the School of Computing Sciences, Faculty of Science, at the University of East Anglia (UEA), Norwich, UK. Since 2012, I have worked as a lecturer for the Faculty of Computing Sciences, KAU. I also joined the machine learning group at the Faculty of Science at UEA and started my PhD research in April 2013 in the field of data mining under the supervision of Dr. Beatriz de La Iglesia and Dr. Wenjia Wang. I have been involved in a range of research work; my main research interests include data mining, multimedia, databases, and algorithm design and analysis.