Online user-generated content is often multimodal, combining visual, audio, and text channels. Such content is highly valuable to enterprises because of the wealth of information it contains, which can be used to improve user engagement through better recommender systems and ad platforms such as AdSense. However, analyzing this data effectively requires integrating information across multiple modalities, which presents significant technical challenges for the development of AI agents. At our research group, we are dedicated to advancing the field of AI by developing cutting-edge techniques designed to solve complex multimodal tasks such as text-to-audio generation, video generation, and emotion recognition.
During a recent TikTok hearing in the United States Congress, several members expressed concern about the platform's promotion of harmful content to children and teens. Frank Pallone, a member of Congress from New Jersey, cited research indicating that TikTok has been pushing content related to self-harm and eating disorders to young users. However, TikTok is not the only social media platform facing such allegations. Government agencies and NGOs have also called on other platforms, including Facebook, YouTube, and Instagram, to take more proactive measures to safeguard children using their services.
Along with the increased consumption of streamed movies, Zoom meetings, and home-based learning, memes have also gained steady popularity across social media platforms over the course of the pandemic. Usually intended to be funny or satirical in nature, memes are presented as an image with accompanying text. Noting their growing prevalence, TIME published a collection of the best memes of 2020, reflecting the similar lived realities that people from different corners of the world were sharing. However, as much as memes have helped many to cope with the stress of the pandemic, they have also gained notoriety for peddling falsehoods, hate speech, and cyberbullying.
Today’s speech synthesis technology has focused on delivering textual content and is not yet ready for real-time human-machine dialogue. An effective, engaging, and cooperative human-human conversation involves many other aspects: communicating with kindness, expressing empathy, agreeing to disagree, seeking common ground, and seeking understanding more than being right. We believe that all of these must be expressed through appropriate speech prosody, beyond the text itself. At the SUTD Speech & Intelligent Systems (SIS) Lab, Assistant Professor Berrak Sisman and her team study the theory and algorithms that enable such speech synthesis. This research represents an important step toward equipping AI-enabled speech synthesis with human-like emotions and expressiveness.
Critical infrastructure has become a strategic target in the midst of cyber-war. The challenges of securing critical infrastructure differ from those of conventional IT systems, especially in the consequences of a security lapse. Such attacks can damage physical property or severely disrupt people’s daily lives, as in the incident of the nationwide blackout in Ukraine. Governments are investing significantly in response to these risks and challenges, while researchers and vendors are aggressively developing and marketing new technologies aimed at protecting critical infrastructure.
Dialogue systems have multifaceted applications in customer service, virtual assistants, education, mental health, and many more areas. Such applications are often deployed as chatbots due to their flexibility. The market value of chatbots evidences their popularity: according to Revechat, it stood at 2.6 billion USD in 2019 and is projected to grow to 9.4 billion USD by 2024. Creating a human-like conversational system is a long-standing goal of artificial intelligence (AI). However, it is not a trivial task, as we human beings draw on several variables, such as emotions, sentiment, prior assumptions, intent, and personality traits, to participate in dialogues and monologues. These variables control the language that we generate and the way we understand the language that we hear. At the DeClare Lab at SUTD, Asst Prof Soujanya Poria and his team focus on developing human-like dialogue understanding systems by leveraging key factors such as pragmatics, affect, empathy, multimodal cues, and commonsense.
Wireless technology is the stepping-stone for the success of the Internet of Things (IoT). Among others, Bluetooth- and Wi-Fi-enabled devices are the most common in the IoT world. For instance, Bluetooth technology allows a large and highly diverse set of devices to be connected and communicate with each other. In a research project led by Assistant Professor Sudipta Chattopadhyay from the ISTD Pillar of SUTD, the team shows that there exists a significant gap in ensuring the security of wireless protocol implementations.
With the fifth generation (5G) of wireless communications expected to be rolled out in Singapore in 2020, everyone is buzzing with anticipation to experience the larger bandwidth, wider coverage, and lightning speed that 5G promises to offer. Associate Professor Tony Quek, Acting Head of the Information Systems Technology and Design Pillar in SUTD and a leader in the field of wireless networks, explains how 5G will play a key role in Singapore’s digital transformation and ultimately change the way we live.
Over the last few years, there has been steady growth in revenue from digital music. In just six years, revenue from music streaming grew from zero to 40 percent of overall global recorded music industry revenues, according to a report by IFPI. With revenues to the tune of 11.2 billion dollars a year, the digital model is only set to grow. So, “is there still room for a traditional record company?”
Natural Language Processing (NLP) has been around for more than 50 years, but it has lately become better known, thanks to personal assistant applications such as Apple’s Siri and Amazon’s Alexa. NLP has also been the driving force behind language translation applications such as Google Translate, but coming this far has been no easy feat. Enabling computers to understand how humans naturally speak or type remains a complex task.