
Discover the Exciting Comparison: ChatGPT 4o vs Gemini 1.5 Pro – 7 Things You Should Know

Updated On: May 26, 2024
By: Neha Sharma

Discover the differences between ChatGPT 4o and Gemini 1.5 Pro.

OpenAI recently announced GPT-4o, marking a new era for AI language models and for how people interact with them. One of its most impressive features is the ability to sustain a live conversation in ChatGPT even when the user interrupts. Despite some hiccups during the live demo, I am astounded by what the team has accomplished.

Google has also made a significant move in the AI space with the launch of Google Gemini, representing its most robust set of AI tools. However, as expected from Google, Gemini is complex, so I’m here to explain it in detail. Gemini is the new name for all of Google’s AI products, including chatbots, voice assistants, and coding helpers, replacing Google Bard and Duet AI.

OpenAI made the GPT-4o API available immediately after the demonstration. In this post, I provide an independent comparison based on multiple tests I ran on ChatGPT 4o and Gemini 1.5 Pro. In those tests, ChatGPT 4o outperformed Gemini 1.5 Pro in reasoning, code generation, and multimodal understanding. For example, ChatGPT 4o quickly generated a Python game, while Gemini 1.5 Pro did not provide the necessary code.

At the Spring Update event, OpenAI unveiled its flagship GPT-4o model, which is now available for free. Just a day later, at Google I/O 2024, Google brought Gemini 1.5 Pro to consumers through Gemini Advanced. Now that both flagship models are available, let’s compare ChatGPT 4o and Gemini 1.5 Pro to determine which performs better.

What’s new with ChatGPT-4o?

At the core of GPT-4o is the concept of an “omni” model, intended to comprehend and process text, voice, and video seamlessly. OpenAI’s focus has also shifted toward democratizing GPT-4 level intelligence, making it accessible even to free users. This shift in strategy is a boon for the AI community, as it opens up new possibilities for developers and researchers alike.

OpenAI has also improved the quality and speed of GPT-4o in over 50 languages, making for a more inclusive and globally accessible AI experience. This enhancement comes at a lower cost, further demonstrating OpenAI’s commitment to making advanced AI technology accessible to all. The company also said that paying members will get five times the capacity of free users, and that it will offer a desktop version of ChatGPT to enable real-time reasoning across voice, visual, and text interfaces.

ChatGPT 4o vs Gemini 1.5 Pro

Calculate Drying Time

I used a classic reasoning test to assess the intelligence of ChatGPT 4o and Gemini 1.5 Pro. ChatGPT 4o nailed it, whereas the enhanced Gemini 1.5 Pro model failed to grasp the trick question. The trick is that towels dry at the same time, so the count does not change the drying time; Gemini instead treated it as a proportion problem, worked through the arithmetic, and came to an incorrect conclusion.

If it takes an hour to dry 15 towels in the sun, how long will it take to dry 20 towels?

Winner: ChatGPT 4o
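
To make the trick concrete, here is a minimal sketch of the two readings of the question. It assumes all the towels can be laid out in the sun at once (the prompt does not say otherwise); the function names and the optional `capacity` parameter are illustrative and not part of the original test.

```python
def naive_drying_time(base_towels, base_hours, towels):
    """Proportional scaling: the mistaken reading of the question."""
    return base_hours * towels / base_towels


def parallel_drying_time(base_hours, towels, capacity=None):
    """Towels dry simultaneously; only limited space forces extra batches.

    capacity is a hypothetical limit on how many towels fit in the sun at once.
    """
    if capacity is None:
        return base_hours
    batches = -(-towels // capacity)  # ceiling division
    return base_hours * batches


# The proportional answer (about 1 hour 20 minutes) is the trap.
print(naive_drying_time(15, 1, 20))
# With enough space, 20 towels take the same hour as 15.
print(parallel_drying_time(1, 20))
```

Under these assumptions the expected answer is still one hour; a model that scales the time with the number of towels has fallen for the trap.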

Magic Elevator Test

In the magic lift test, the previous ChatGPT 4 model failed to give the correct answer. This time, however, the ChatGPT 4o model responded correctly. Gemini 1.5 Pro also provided the correct answer.

There is a tall structure with a miraculous lift within. When the lift stops on an even floor, it connects to floor 1.
Starting on floor one, I take the magic lift three floors up. I exit the lift and then take the steps three floors higher.
Which floor will I end up on?

Winner: ChatGPT 4o and Gemini 1.5 Pro
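
The puzzle can be traced step by step. The sketch below assumes that "connects to floor 1" means you step out on floor 1 whenever the lift stops on an even floor; the function names are illustrative.

```python
def magic_lift(start_floor, floors_up):
    """Ride the lift up; stopping on an even floor puts you on floor 1."""
    stop = start_floor + floors_up
    return 1 if stop % 2 == 0 else stop


def final_floor():
    floor = 1
    floor = magic_lift(floor, 3)  # 1 + 3 = 4, an even floor, so you exit on floor 1
    floor += 3                    # the stairs are ordinary: 1 + 3 = 4
    return floor


print(final_floor())  # 4
```

Under that reading the answer is floor 4. A model that ignores the even-floor rule would instead add the two climbs and answer floor 7.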

Which is Heavier?

In this commonsense reasoning test, Gemini 1.5 Pro gets the answer wrong, claiming that both weigh the same. ChatGPT 4o, however, correctly points out that the units differ: a kilogram is about 2.2 pounds, so a kilogram of any material weighs more than a pound of any other. It appears that the enhanced Gemini 1.5 Pro model has become dumber over time.

Which is heavier: a kilogram of feathers or a pound of steel?

Winner: ChatGPT 4o
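
The comparison comes down to a unit conversion, sketched below. The conversion factor is the standard definition of the pound; the helper name is illustrative.

```python
KG_PER_LB = 0.45359237  # one pound in kilograms, exact by definition


def mass_in_kg(amount, unit):
    """Convert a mass given in 'kg' or 'lb' to kilograms."""
    if unit == "kg":
        return amount
    if unit == "lb":
        return amount * KG_PER_LB
    raise ValueError(f"unsupported unit: {unit}")


feathers = mass_in_kg(1, "kg")  # a kilogram of feathers
steel = mass_in_kg(1, "lb")     # a pound of steel

# The material is irrelevant; only the units matter.
heavier = "feathers" if feathers > steel else "steel"
print(heavier)  # feathers
```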

Follow User Instructions

I asked ChatGPT 4o and Gemini 1.5 Pro to generate ten sentences that end with the word “mango”. Guess what?

Create ten phrases that finish with the word “mango”

ChatGPT 4o correctly generated all ten sentences; Gemini 1.5 Pro, however, produced only six that ended with the word.
Before GPT-4o, only Llama 3 70B followed this instruction reliably in my tests. The older GPT-4 model also struggled, which indicates that OpenAI has genuinely improved its model.
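
A test like this is easy to score automatically. The sketch below is one possible checker, not the method used for the results above; the sample sentences are made up for illustration and are not the models' actual output.

```python
import string

# Trailing characters to ignore: ASCII punctuation plus curly closing quotes.
TRAILING = string.punctuation + "”’"


def ends_with_word(sentence, word):
    """Return True if the last word of the sentence is `word`, ignoring case."""
    tokens = sentence.strip().rstrip(TRAILING).split()
    return bool(tokens) and tokens[-1].lower() == word.lower()


def score(sentences, word="mango"):
    """Count how many sentences end with the target word."""
    return sum(ends_with_word(s, word) for s in sentences)


# Hypothetical sample responses.
samples = [
    "For dessert, she sliced a ripe mango.",
    "Nothing beats the smell of a fresh Mango!",
    "He bought a crate of mangoes.",
]
print(score(samples))  # 2
```

The third sample fails because "mangoes" is not the requested word, which is the kind of near miss this test is meant to catch.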

Multimodal Image Test

François Fleuret, the author of The Little Book of Deep Learning, ran a basic image analysis test on ChatGPT 4o and shared the results on X (formerly Twitter). He has since edited the post to downplay the finding, stating that it is a general issue with vision models.

To replicate the results, I ran the same test on Gemini 1.5 Pro and ChatGPT 4o. Gemini 1.5 Pro performed significantly worse, answering every question incorrectly. ChatGPT 4o gave one correct answer but failed on the other questions.

This highlights the need for improvement in multimodal models. I was especially disappointed with Gemini’s multimodal capabilities as the answers were far from accurate.

Winner: None

Character Recognition Test

In another multimodal test, I uploaded the specifications of two phones (Pixel 8a and Pixel 8) as screenshots. I did not reveal the phone names, and the screenshots did not include them either. I then asked ChatGPT 4o to recommend which phone I should get.
It successfully extracted the text from the screenshots, compared the features, and correctly advised me to purchase Phone 2, which was the Pixel 8. I also asked it to guess which phone that was, and ChatGPT 4o answered correctly: Pixel 8.

I ran the same test on Gemini 1.5 Pro using Google AI Studio, since Gemini Advanced does not currently support batch image uploads. It failed to extract text from both screenshots and repeatedly requested additional information. Tests like these show that Google lags considerably behind OpenAI when it comes to completing tasks efficiently.

ChatGPT 4o vs Gemini 1.5 Pro: Differences

ChatGPT and Google Gemini have become more similar since the release of Gemini Ultra 1.0, which makes Gemini more competitive with GPT-4. Both offer a free tier and a paid subscription, with virtually identical pricing, similar interfaces, and similar use cases. The distinctions lie primarily in their language models.

However, their training models, data sources, user experiences, and data storage methods differ.

Training Data:
ChatGPT-4: Trained on a large dataset curated for conversational AI. That helps it excel at natural language understanding and response, making it well suited to chatbots, virtual assistants, and other conversational applications.
Google Gemini: Trained on a more general dataset that includes text and code. This broadens its knowledge base, allowing it to take on tasks beyond conversation, such as technical writing, code development, and research support.
Parameter Size and Computing Power:
ChatGPT-4: OpenAI has not disclosed GPT-4’s parameter count; the 175 billion figure often quoted is GPT-3’s. A model of this scale requires significant processing resources, which may limit its usability on less powerful devices.
Google Gemini: Google has not published parameter counts for Gemini Pro or Ultra; the 137 billion figure sometimes cited comes from its earlier LaMDA model. Gemini is offered in several sizes to balance capability against computing overhead.
Strengths and Capabilities:
ChatGPT-4: Excels at creative text generation, producing poems, scripts, musical pieces, and other imaginative formats. Its conversational capabilities make it excellent for applications that require engaging and natural interaction.
Google Gemini: Excels at providing informative responses to open-ended, difficult, or unusual questions. Its ability to access and process information from a variety of sources makes it an effective tool for research and knowledge development.
Access and Availability:
ChatGPT-4: Available through ChatGPT and through OpenAI’s API, which is billed by usage rather than free. This gives developers and researchers direct access and encourages innovation.
Google Gemini: Launched through limited previews that allowed controlled testing and feedback, and now available to consumers through Gemini Advanced and to developers through Google AI Studio.

ChatGPT is based on OpenAI’s GPT-3.5 or GPT-4, depending on whether you use the free edition or the ChatGPT Plus subscription. Gemini comes in three sizes: Gemini Pro for a wide range of tasks, Gemini Ultra for more complex jobs, and Gemini Nano for mobile devices. Ultra 1.0, which powers the subscription Gemini Advanced edition, is quicker and more sophisticated than the Gemini Pro model used in the free edition.

Data Sources

The primary distinction between ChatGPT and Gemini is the data their LLMs draw on. GPT-3.5 relies on training data with a cutoff of January 2022, while GPT-4’s training data runs until April 2023. Gemini can also draw on real-time internet data, and it is designed to select material from sources relevant to the topic at hand, such as coding or the most recent scientific studies.

Neither company has published official figures for its largest models. Unconfirmed reports credit Gemini Ultra with about 1.6 trillion parameters and a training set of 1.56 trillion words, and GPT-4 with around 1.5 trillion parameters and a training set of 13 trillion tokens, which can be individual characters, words, or parts of words.

Google has made strong benchmark claims for Gemini. It says Gemini Ultra is the first model to exceed human experts on MMLU (Massive Multitask Language Understanding), one of the most popular benchmarks for testing AI models’ knowledge and problem-solving abilities. According to Google, the model also outperforms previous models on tasks spanning text, images, code, video, and reasoning.

The Verdict

Gemini 1.5 Pro is significantly behind ChatGPT 4o. Even after months of improvement during preview, the 1.5 Pro model cannot compete with OpenAI’s current GPT-4o model. From commonsense reasoning to multimodal and coding tests, ChatGPT 4o answers more accurately and follows instructions more closely. Not to mention that OpenAI has made ChatGPT 4o free for everyone.

Gemini 1.5 Pro’s only advantage is its huge context window, which can hold up to 1 million tokens. It also accepts video uploads, which is useful. However, because the model is not particularly capable, I doubt many people would choose it solely for the larger context window.

No new frontier models were announced at the Google I/O 2024 event, so the company is stuck with Gemini 1.5 Pro for now. There is no information about Gemini 1.5 Ultra or 2.0. If Google wants to compete with OpenAI, it needs a significant leap.

Neha Sharma

Neha Sharma is a seasoned Technology Content Writer and SEO Analyst at Digital Conqueror, specializing in IoT, Laptops, Chromebooks, Notion Templates and Gaming editorial content. With over four years of industry experience, Neha crafts insightful news articles, reviews, feature pieces, and guides. She holds a B.Tech and an M.Tech in Computer Science and Engineering from Himachal Pradesh University and Jaypee University of Information Technology, respectively. Along with working as a senior content writer, Neha is currently pursuing a PhD in Computer Science at Jaypee University, where her research explores privacy and security in IoT environments.
