Jiawen Sun, a PhD student in the School of Electronics, Electrical Engineering and Computer Science, has been working for the last three years to create a software system that can analyse graph-structured data. Such a system can be used to detect insurance fraud.
The data organisations collect is usually represented as graphs, which can be useful for detecting fraud. However, as data sets grow into the trillions of bytes, this can create problems in high-performance computing, making it very hard to use the computer at full capacity.
Jiawen Sun, who is from Tianjin, China, explains:
“The algorithm I have created means that we can now process this information quickly and efficiently, enabling organisations to tackle issues such as insurance fraud.”
Through her research, Jiawen studied how to lay out the data in a computer’s memory and how to assign parts of the computation to different processors.
She also came up with two solutions that change the order in which the data is processed, which allows the computer to be used to its full capacity.
The first solution changes the order in which graph edges are processed, splitting the graph in a way where there is no interference between processors, making the process more efficient. The second solution changes the order of processing vertices, allowing analysis to be completed faster.
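The two ideas above can be illustrated with a minimal Python sketch. It is not Jiawen's algorithm: the function names, the choice to group edges by destination vertex, and the choice to order vertices by in-degree are all assumptions made only for illustration. Grouping edges by destination means each partition writes to its own range of vertices, so partitions handled by different processors would not interfere with one another.

```python
def partition_edges_by_destination(edges, num_vertices, num_parts):
    """Group edges so that each partition owns a disjoint range of
    destination vertices (hypothetical scheme, for illustration only)."""
    chunk = -(-num_vertices // num_parts)  # ceiling division
    parts = [[] for _ in range(num_parts)]
    for src, dst in edges:
        parts[dst // chunk].append((src, dst))
    return parts


def accumulate(parts, values, num_vertices):
    """Sum the values arriving at each vertex along its incoming edges.

    The outer loop runs sequentially here, but each partition only writes
    to its own destination range, so it could be given to a separate
    processor without locking.
    """
    totals = [0.0] * num_vertices
    for part in parts:
        for src, dst in part:
            totals[dst] += values[src]
    return totals


def order_by_in_degree(edges, num_vertices):
    """Return vertex ids sorted from highest to lowest in-degree
    (one possible vertex ordering, assumed for this sketch)."""
    in_degree = [0] * num_vertices
    for _, dst in edges:
        in_degree[dst] += 1
    return sorted(range(num_vertices), key=lambda v: -in_degree[v])


edges = [(0, 1), (0, 2), (1, 2), (2, 0), (3, 2)]
values = [1.0, 2.0, 3.0, 4.0]

parts = partition_edges_by_destination(edges, num_vertices=4, num_parts=2)
totals = accumulate(parts, values, num_vertices=4)
vertex_order = order_by_in_degree(edges, num_vertices=4)
```

In this toy graph, the first partition holds the edges ending at vertices 0 and 1 and the second holds those ending at vertices 2 and 3, so no vertex is updated from two partitions.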
Dr Hans Vandierendonck, who supervised the project, says the findings will have a positive impact on many organisations across the globe.
“These techniques accelerate graph analytics up to 10-fold, which is a game changer for many organisations, allowing them to tap into analysis that they have never used before and at a much faster pace,” said Vandierendonck.
According to Jiawen, the algorithm can outperform the Apache open-source projects GraphX (Spark) by 21x, Giraph by 55x and GraphLab by 37x.
Jiawen recently received a Silver medal at the Association for Computing Machinery Student Research Competition, which is sponsored by Microsoft.