I see, I hear, I speak, I read

Time of India, 2 days ago

Malaya Rout works as Director of Data Science with Exafluence in Chennai. He is an alumnus of IIM Calcutta. He has worked with TCS, LatentView Analytics and Verizon prior to the role at Exafluence. He takes pride in sharing his knowledge and insights on diverse topics of Data Science with colleagues and aspiring data scientists.
I am amused at how we have started referring to classical AI/ML as 'traditional'. I am equally amazed that we already have 'traditional' LLMs. How fast do you want us to move? The so-called traditional LLMs are entirely text-based. The not-so-traditional LLMs are multimodal by nature. They handle images, video, and audio, as well as text, as both inputs and outputs.
When you ask an LLM to write a poem for you and it generates a creatively crafted poem, that's unimodal (such a model is usually called a text-only LLM, though 'unimodal' is technically correct). When you upload an image and ask the LLM whether a person is in it, that's multimodal (the input is an image and the output is text).
When you upload a picture and ask the LLM to change the background from red to yellow, and the LLM returns the edited image, that's multimodal (the output is an image). When you instruct the LLM to create a specific image for you, that's multimodal too (the output is an image). The same holds when audio or video takes the place of the image.
Encoders take text, images, and audio and transform them into a mathematical format that the AI can understand. This is akin to translating everything into a common language. The fusion module utilises an input projector to integrate all the various types of processed information into a single, unified representation.
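The encoder-and-projector idea above can be pictured with a toy sketch. Everything in it is an assumption made for illustration: the feature values are made up, the random matrices merely stand in for weights that a real model would learn, and the names (`make_projector`, `project`, `fuse`, `SHARED_DIM`) are hypothetical. It only shows the shape of the idea: each modality arrives with its own size and leaves in one shared size.

```python
import random

SHARED_DIM = 4  # size of the common representation (arbitrary choice)

def make_projector(in_dim, out_dim, seed):
    """Build a random in_dim x out_dim matrix; a trained model would learn these weights."""
    rng = random.Random(seed)
    return [[rng.uniform(-1.0, 1.0) for _ in range(out_dim)] for _ in range(in_dim)]

def project(features, matrix):
    """Map one encoder output into the shared space (vector-matrix product)."""
    return [
        sum(f * row[j] for f, row in zip(features, matrix))
        for j in range(len(matrix[0]))
    ]

# Made-up encoder outputs: each modality arrives with its own length.
encoded = {
    "text": [0.2, 0.7, 0.1],
    "image": [0.9, 0.3, 0.5, 0.4, 0.8],
    "audio": [0.6, 0.1],
}

# One projector per modality, sized to that modality's encoder output.
projectors = {
    name: make_projector(len(features), SHARED_DIM, seed)
    for seed, (name, features) in enumerate(encoded.items())
}

def fuse(encoded_inputs):
    """Project every modality to SHARED_DIM and stack the results as one sequence."""
    return [project(features, projectors[name]) for name, features in encoded_inputs.items()]

fused = fuse(encoded)
print(len(fused), "vectors of length", len(fused[0]))
```

Once every input has the same length, the language model can treat the projected image and audio vectors the way it treats text tokens.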
The numeric representation of a cat's image, the numeric representation of the word 'cat', the numeric representation of the sound of saying 'cat', and that of the description of what a cat does are all related.
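One common way to express "related" in numbers is cosine similarity between vectors. The sketch below assumes that measure and uses tiny hand-written vectors; they are illustrative placeholders, not the output of any real model, and the variable names are hypothetical.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: close to 1 means pointing the same way."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hand-written toy embeddings (assumed values, for illustration only).
cat_image = [0.90, 0.10, 0.20]
cat_word = [0.80, 0.20, 0.10]
cat_sound = [0.85, 0.15, 0.25]
car_word = [0.10, 0.90, 0.30]

same_concept = cosine_similarity(cat_image, cat_word)
different_concept = cosine_similarity(cat_image, car_word)

print(f"cat image vs word 'cat': {same_concept:.2f}")
print(f"cat image vs word 'car': {different_concept:.2f}")
```

In a well-trained multimodal model, the three 'cat' vectors would sit close together regardless of the modality they came from, while an unrelated concept would sit further away.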
A multimodal LLM is closer to reality than a unimodal LLM. Human beings deal with multimodality in their day-to-day lives. The context provided to, and extracted from, a multimodal LLM is richer.
The downside? Multimodality requires intensive computing to process the different types of data. We immediately notice the difference in inference speed between text-only and multimodal inputs and outputs.
I would think twice before uploading a three-minute video to a paid LLM API service and asking it to explain what it sees in the video. I would reach my innocent credit limits in no time. I would be more eager to do the same with a locally downloaded, open-source LLM.
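A back-of-the-envelope calculation shows how such a bill grows. The sketch assumes the service samples frames from the video, turns each frame into a fixed number of tokens, and bills per input token. Every number below is a placeholder chosen for the arithmetic, not the pricing or behaviour of any real service, and `estimate_video_cost` is a hypothetical helper.

```python
def estimate_video_cost(duration_s, frames_per_second, tokens_per_frame, price_per_million_tokens):
    """Return (total_tokens, cost) for sending sampled video frames to a token-billed API."""
    frames = duration_s * frames_per_second
    tokens = frames * tokens_per_frame
    cost = tokens / 1_000_000 * price_per_million_tokens
    return tokens, cost

# Placeholder assumptions: 3-minute video, 1 sampled frame per second,
# 250 tokens per frame, 3.0 currency units per million input tokens.
tokens, cost = estimate_video_cost(180, 1, 250, 3.0)
print(f"{tokens} tokens, estimated cost {cost:.3f} per question")

# Sampling more densely scales the bill linearly.
dense_tokens, dense_cost = estimate_video_cost(180, 10, 250, 3.0)
print(f"{dense_tokens} tokens, estimated cost {dense_cost:.3f} per question")
```

The point is the scaling rather than the absolute figure: the cost grows with the video's length, the sampling rate, and the number of questions asked, whereas a local model costs only compute time.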
Multimodal LLMs are widely used in various applications. For example, they are used in content moderation.
They are used to flag plagiarism, explicit content, toxic content, self-harm and drug use, graphic terrorism, racial abuse, bad gestures, legal compliance issues, political preferences, and Personally Identifiable Information (PII). Multimodal LLMs can also be utilised to build chatbots that can answer questions related to a repository of videos, audio, or text.
For example, the user might like the bot to summarise videos, or determine the presence or absence of something specific that the user is interested in. The bot can automatically take you to the exact position in the artefact that contains the object of interest.
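One way such a bot could jump to the right position is to compare the user's query with a vector stored for each segment of the video and return the timestamp of the closest match. The sketch below assumes that approach; the segment vectors, the query vector, and the `find_best_segment` helper are hypothetical stand-ins for what a multimodal encoder would produce.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def find_best_segment(query_vec, segments):
    """Return (start_seconds, score) of the segment most similar to the query."""
    best = max(segments, key=lambda seg: cosine_similarity(query_vec, seg["vector"]))
    return best["start_s"], cosine_similarity(query_vec, best["vector"])

# Toy index: one made-up vector per 10-second segment of a video.
segments = [
    {"start_s": 0, "vector": [0.1, 0.9, 0.2]},
    {"start_s": 10, "vector": [0.2, 0.8, 0.3]},
    {"start_s": 20, "vector": [0.9, 0.1, 0.1]},
    {"start_s": 30, "vector": [0.3, 0.3, 0.9]},
]

# Made-up vector for the user's query (e.g. the object they are looking for).
query = [0.8, 0.2, 0.1]

start_s, score = find_best_segment(query, segments)
print(f"Best match starts at {start_s}s (similarity {score:.2f})")
```

A real system would build the index once, when the video is added to the repository, and run only the lookup at question time.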
They can be used to help health professionals spot abnormalities in reports, X-rays, and other medical images. They can be used in highly creative areas such as music composition and video editing.
While we stand at this juncture of technical advancement, multimodality takes us toward more human-like artificial intelligence. The shift from text-only to multimodal LLMs is often underestimated. The shift has made AI more human-like by embodying a rich interplay of sight, sound, and language.
Can I say that here we have an AI that thinks, sees, hears, speaks, and reads? Haven't we moved multiple steps closer to AGI (Artificial General Intelligence) with this?
Think about it. Don't delegate all your thinking to AI, because we don't want the only sharp brain to be the AI's.
Views expressed above are the author's own.


