Data Science decoded at the eighth Roddam Narasimha lecture at IIT Gandhinagar

Gandhinagar: Indian Institute of Technology Gandhinagar (IITGN) organised the eighth Roddam Narasimha Distinguished Lecture on August 5, 2019, on the subject ‘Data Science: The Good, the Bad and the Ugly’. Prof Jayant R Haritsa, a data scientist and Professor in the Department of Computational & Data Sciences at the Indian Institute of Science, Bangalore (IISc Bangalore), delivered the lecture and helped the audience decipher various aspects of data science.

During his talk, Prof Haritsa gave background on data science and examples of its use across sectors, by government entities and by companies. Turning first to the good aspects of data science, he described sectors in which companies use data science and analytics to understand their customers’ preferences and offer them better choices and more advanced services. He cited the power sector, where deploying condition monitoring and predictive analytics solutions helps a power company’s managers take informed maintenance decisions quickly, which averts large losses and improves service delivery.

Explaining this, he said, “Data science is trying to look into the future. It proves to be a win-win situation for the customers as well as for the companies. Data science can play a very direct and positive role for the services in transport, consumer electronics, banking, power sectors etc. There is scope for data science to do a lot of public good and at the same time also help the corporate world in terms of better services and more projects.”

He then introduced the audience to some lesser-known perils and realities of data science. He noted that only a few enterprises actually have curated big data, though many others claim to have it because of the hype surrounding it. Discussing some of the methodological issues of data science, Prof Haritsa said, “Big data encourages to ask the wrong questions. People get big answers to such questions, but coming up with the right questions is more difficult than coming up with the right answers. Nowadays people tend to compare incomparable things because of the big data available on the web.” The audience was amused by an example of one such big data analysis, which claimed to calculate a musician’s age from the genre of music he or she plays, an analysis in which the statistical and probabilistic subtleties are lost in the big picture.

He also gave some examples of design errors in the implementation of big data systems and of how they can lead to the breakdown of critical web infrastructure. Explaining the limitations of data science systems, he said, “Everybody loves big data systems but you cannot test them. So, many big data systems will be prone to failure by definition because they are too difficult to test due to the large scale of the data.”

Prof Haritsa also pointed to the methodological misuse of data science, saying, “Data science can be used to bend us to preconceived biases and reinforce them. The directed behaviour is being made possible because of data science, through news feed algorithms to selectively push an opinion, confuse the issues with fake news, and apply peer pressure on social media.”

In his concluding remarks, Prof Haritsa emphasised how data science should ideally be used for the benefit of mankind, saying, “It is very important to use data science, it should be a tool of last resort, not the first, to validate a hypothesis, and should be used as a support tool, not substitute, for domain expertise. Because data science, like nuclear power, has enormous potential for benefiting mankind, if used with care, otherwise it has equally destructive power for ruining the society.”