
Latest news with #Exafluence


Time of India

2 days ago


I see, I hear, I speak, I read

Malaya Rout works as Director of Data Science with Exafluence in Chennai. He is an alumnus of IIM Calcutta and worked with TCS, LatentView Analytics and Verizon prior to his role at Exafluence. He takes pride in sharing his knowledge and insights on diverse topics of Data Science with colleagues and aspiring data scientists.

I am amused at how we have started referring to traditional AI/ML as 'traditional'. I am equally amazed at the presence of 'traditional' LLMs. How fast do you want us to move? The so-called traditional LLMs are entirely text-based. The not-so-traditional LLMs are multimodal by nature: they handle images, videos and audio, as well as textual inputs and outputs.

When you ask an LLM to write a poem for you and it generates a creatively crafted poem, that's unimodal (it's called a text-only LLM; 'unimodal' is technically correct). When you upload an image and ask the LLM to identify whether a person is in the image, that's multimodal (the output is text). When you upload a picture and ask the LLM to change the background from red to yellow, and the LLM returns the required image, that's multimodal (the output is an image). When you instruct the LLM to create a specific image for you, that's multimodal (the output is an image). Using audio or video instead of an image is also a multimodal approach.

Encoders take text, images and audio and transform them into a mathematical format that the AI can understand. This is akin to translating everything into a common language. The fusion module then utilises an input projector to integrate all the various types of processed information into a single, unified representation. The numeric representation of a cat's image, the numeric representation of the word 'cat', the numeric representation of the sound of saying 'cat', and that of a description of what a cat does are all related.

A multimodal LLM is closer to reality than a unimodal one. Human beings deal with multimodality in their day-to-day lives, and the context provided to and extracted from a multimodal LLM is richer. The downside? Multimodality requires intensive computing to process the different types of data. We immediately notice the difference in inferencing speed between text-only and multimodal inputs and outputs. I would think twice before uploading a three-minute video to a paid LLM API service and asking it to explain what it sees in the video; I would exhaust my credits in no time. I would be far more eager to do the same through an open-source, locally downloaded model.

Multimodal LLMs are widely used in various applications. In content moderation, for example, they flag plagiarism, explicit content, toxic content, self-harm and drug use, graphic terrorist content, racial abuse, offensive gestures, legal compliance issues, political preferences and Personally Identifiable Information (PII). They can also be used to build chatbots that answer questions over a repository of videos, audio or text: the user might like the bot to summarise a video, or to determine the presence or absence of something specific, and the bot can automatically take you to the exact position in the artefact that contains the object of interest. They can help health professionals diagnose abnormalities in reports, X-rays and other medical imaging.
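To make the 'common language' idea concrete, here is a toy sketch of a shared text-image embedding space. It uses the open-source CLIP model through the Hugging Face transformers library; the checkpoint name, the cat.jpg placeholder and the captions are illustrative choices of this sketch, not details of any particular multimodal LLM's internals.

```python
# A minimal sketch of a shared embedding space, assuming the Hugging Face
# `transformers` library and the public openai/clip-vit-base-patch32
# checkpoint. "cat.jpg" is a placeholder for any local image file.
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

image = Image.open("cat.jpg")
captions = ["a photo of a cat", "a photo of a dog"]

inputs = processor(text=captions, images=image, return_tensors="pt", padding=True)
with torch.no_grad():
    # Two different encoders project into one shared vector space.
    image_emb = model.get_image_features(pixel_values=inputs["pixel_values"])
    text_emb = model.get_text_features(
        input_ids=inputs["input_ids"], attention_mask=inputs["attention_mask"]
    )

# Normalise, then compare with cosine similarity: the cat caption should
# land measurably closer to the cat photo than the dog caption does.
image_emb = image_emb / image_emb.norm(dim=-1, keepdim=True)
text_emb = text_emb / text_emb.norm(dim=-1, keepdim=True)
scores = (text_emb @ image_emb.T).squeeze(-1)
for caption, score in zip(captions, scores.tolist()):
    print(f"{caption}: {score:.3f}")
```

That measurable closeness between a cat's image and the word 'cat' is exactly the relatedness described above. As for the open-source, locally downloaded route, querying a local multimodal model can be just as brief; this sketch assumes the ollama Python client and a llava model pulled beforehand, again an illustrative choice of tooling:

```python
# A hedged sketch of a local multimodal query, assuming the `ollama`
# Python client and a model fetched beforehand with `ollama pull llava`.
# "photo.jpg" is a placeholder path.
import ollama

response = ollama.chat(
    model="llava",
    messages=[{
        "role": "user",
        "content": "Is there a person in this image? Answer yes or no.",
        "images": ["photo.jpg"],
    }],
)
print(response["message"]["content"])
```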
Multimodal LLMs can also be used in highly creative areas such as music composition and video editing. While we stand at this juncture of technical advancement, multimodality takes us toward more human-like artificial intelligence. The shift from text-only to multimodal LLMs is often underestimated: it has made AI more human-like by embodying a rich interplay of sight, sound and language. Can I say that here we have an AI that thinks, sees, hears, speaks and reads? Haven't we moved multiple steps closer to AGI (Artificial General Intelligence) with this? Think about it. Don't delegate all your thinking to AI, because we don't want the only sharp brain to be the AI's.

Disclaimer: Views expressed above are the author's own.


Time of India

13-06-2025


Giving up?

Sahil was learning to walk. He took tiny steps while trying to balance his cute little body. He had been born to Sanjay and Saundarya thirteen months ago, and he meant the whole world to them. Theirs was a lower-middle-class family, and sometimes finances, or rather the lack of them, gave the family a hard time.

Sanjay was the manager of a shoe shop in Chennai. It was a good business; it paid enough for Sanjay and his employer, and the weekly salary covered the livelihood of both families. But the shop had remained closed for the last four weeks due to the nationwide lockdown aimed at limiting the spread of COVID-19. Hence, there had been no sales for a month. Sales had been poor even before the lockdown was announced, perhaps due to a decline in demand. The lockdown came at the wrong time for Sanjay.

He had not left his house in the last three weeks. Saundarya had gone out a couple of times for vegetables and groceries. Things grew worrisome for the family: they ran out of their cash reserve, and there was no clarity on whether the lockdown, already extended once, would be extended a second time. People guessed that the spread of infection would be under control only after four or five months. The immediate future didn't look promising to Sanjay. The fear weighed heavily on him, and he had grown increasingly silent and thoughtful with each passing day. Of the two, Saundarya was usually the more confident, perhaps because she didn't fully grasp the extent of the pandemic's impact.

Last night, they had fed Sahil dinner; Sanjay and Saundarya had gone to bed on empty stomachs. This morning, Sanjay had gone crazy trying to arrange breakfast for their toddler and themselves. He couldn't manage. He called up his employer, frantically asking for help. Things there were not as bad as in Sanjay's family, but they were not great either. His boss had helped him with money several times in the past; that morning, he couldn't.

Thoughts of coming out of the crisis ran obsessively through Sanjay's head. They troubled him and made him more brooding; it was a vicious cycle, and the lack of nutrition didn't help his mind. He picked up his mobile phone and an empty wallet and stormed out of the house. 'Will the world know me as a failure? Will I be able to face society? Will I be remembered as the one who couldn't feed his family?' Many such worries created havoc inside his head.

He had been walking for thirty minutes now, along smaller streets and, at times, along broader and longer roads. He didn't know where he was going. He didn't know what he was up to. He didn't know whether he wanted to live this life at all. He turned left and went inside a four-storey building. It looked like an office space deserted due to the lockdown. Places and structures that would otherwise be crowded wore a deserted, there-is-no-tomorrow look. Sanjay started climbing the stairs absent-mindedly. 'This is it. This is how it will end. Thank you, God, for the 33 years. Sorry, Saundarya, and sorry, Sahil,' he thought. By the time he reached the fourth floor, he was tired. He didn't climb further. He walked towards the edge and stood motionless.
'Everything is empty. Everything is void. Everything is oblivion,' he murmured. Just as he was gathering all the courage to jump off the building, his phone rang. He picked up the call. 'Sanjay, I have some good news for you. As part of the Indian government's efforts to support people in need during the lockdown, your family has been awarded 50,000 rupees. Could you give me your Aadhaar number, please?' Finally, a ray of hope and life ran through his body. God must have listened to his prayers. He hurriedly took out his wallet to read the Aadhaar number. As he started to read it out to the person over the phone, he lost his balance and fell off the building.

The phone line was still connected. The handset lay around six metres away from the body, the wallet still held tightly in his right hand. The body was motionless and showed no signs of pain or struggle. There was nobody around.

In some sense, the above story tells us what we should not do. One in four Gen AI projects fails (I know that giving up on projects and giving up on life are not comparable). Setbacks shouldn't signal surrender but rather strategic recalibration. Organisations that celebrate failures as learning opportunities cultivate cultures of experimentation, which ultimately drive innovation.

Disclaimer: Views expressed above are the author's own.
