A team of NUS researchers has put Singapore on the global map of Artificial Intelligence (AI) and big data analytics. Their open-source project, called Apache SINGA, “graduated” from the Apache Incubator on 16 October 2019 and is now Southeast Asia’s first Top-Level Project (TLP) under the Apache Software Foundation, the world’s largest open-source software community.
Being recognised as a TLP is no small feat as Apache SINGA now joins the ranks of leading open-source tools such as the Apache HTTP Server and Apache Kafka. While the name may not immediately ring a bell, Apache Kafka powers big data solutions at Airbnb, LinkedIn, Netflix, PayPal, Spotify and many other corporations. The Apache HTTP Server is the most popular web server in the world and currently serves 29 per cent of all active websites on the Internet.
Led by Professor Ooi Beng Chin, Apache SINGA was initiated in 2014 by the Database System Research Group from NUS Computing, together with Zhejiang University and NetEase. The prototype was submitted to the Apache Incubator in March 2015, and the first official release was made in October 2015. Since then, the NUS researchers have received support from the National Research Foundation Singapore, the Ministry of Education, and the Agency for Science, Technology and Research.
Prof Ooi said, “We saw an increasing demand for deep learning and machine learning platforms in 2012, but there was a lack of efficient distributed platforms. The graduation is a mark of recognition for Apache SINGA, but this is just the beginning. We hope that Apache SINGA can make an impact on deep learning the same way Apache HTTP Server did for web servers.”
Deep learning is a subset of machine learning that uses artificial neural networks to extract meaningful insights from large amounts of data. While traditional machine learning typically requires humans to provide structured data, deep learning can structure raw data by itself. Take identifying an image of a cat as an example: traditional machine learning requires human input to define that a cat has features such as whiskers, pointed ears and paws, whereas deep learning analyses many images of cats through various algorithms and determines the features by itself, much like an artificial brain.
However, deep learning is limited by its need for an astronomical amount of data, which in turn demands a lot of computational power. A typical centralised system would require a single supercomputer to process all this information, which is not an option for most organisations. Apache SINGA’s distributed approach removes the need for a single supercomputer by spreading the workload across a large number of regular computers.
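The idea of spreading a training workload across many machines can be sketched in a few lines of Python. This is only an illustration of one common scheme, data parallelism, simulated inside a single process. It does not use Apache SINGA, and the function names (`local_gradient`, `split_into_shards`, `train`), the toy dataset and the learning rate are all assumptions made for the example.

```python
# Illustrative sketch only: simulates data-parallel training in one process.
# None of these names come from Apache SINGA; they are hypothetical.

def local_gradient(w, shard):
    """Gradient of the mean squared error for y ~ w * x on one worker's shard."""
    return sum(2 * (w * x - y) * x for x, y in shard) / len(shard)

def split_into_shards(data, num_workers):
    """Deal the samples round-robin to the simulated workers."""
    return [data[i::num_workers] for i in range(num_workers)]

def train(data, num_workers=4, lr=0.01, steps=200):
    """Fit w so that y ~ w * x, sharing the gradient work between workers."""
    shards = split_into_shards(data, num_workers)  # assumes no shard is empty
    total = sum(len(shard) for shard in shards)
    w = 0.0
    for _ in range(steps):
        # Each worker looks only at its own shard of the data.
        grads = [local_gradient(w, shard) for shard in shards]
        # Combine the results, weighting each worker by its shard size,
        # so the update matches what one big machine would compute.
        g = sum(gr * len(shard) for gr, shard in zip(grads, shards)) / total
        w -= lr * g
    return w

# Toy dataset following y = 3x, so the fitted weight should approach 3.
data = [(float(x), 3.0 * x) for x in range(1, 9)]
w = train(data)
```

In a real distributed system each shard would sit on a separate computer and the gradients would be exchanged over a network; the sketch shows only why several ordinary machines, each handling a slice of the data, can arrive at the same model as one large machine.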
Apache SINGA currently powers applications across multiple sectors including healthcare, banking and finance, software development and cybersecurity. One such application is FoodLG, which uses image recognition to identify a dish based on the photo uploaded by the end-user. Five hospitals in Singapore are currently using different versions of FoodLG to promote healthy living and facilitate disease management for ailments such as diabetes, hypertension and high cholesterol.
The National University Hospital (NUH) and the Singapore General Hospital are also using Apache SINGA to analyse MRI and X-ray images to improve the identification of health problems. In addition, NUH uses models trained on Apache SINGA for disease progression modelling and patient re-admission modelling. In the area of cybersecurity, SecureAge is developing deep learning models with Apache SINGA to detect malware more accurately and to identify new types of malware based on past data. Local banks, meanwhile, are using Apache SINGA to develop and train models for risk modelling and anti-money laundering compliance.
The next steps for Apache SINGA are to enhance the system so that even non-AI experts are able to use it, and to prepare for the age of 5G by streamlining it to run on edge devices.