xAI's Grok chatbot can now answer questions about objects viewed through a smartphone’s camera, similar to the real-time vision capabilities offered by Google’s Gemini and OpenAI’s ChatGPT.
On Tuesday, xAI announced the launch of Grok Vision, a feature that allows users to point their phone at various objects, such as products, signs, and documents, and ask questions about them. Currently, Grok Vision is only accessible through the Grok app for iOS, with Android app support not yet available.
GROK CAN SEE WHAT YOU SEE—LITERALLY
The voice mode on Grok now includes camera access, allowing users to point their phone at an object and inquire, “What am I looking at?”
The Vision feature on iOS enables the chatbot to analyze real-world objects, text, and environments through your… https://t.co/cmtINP8yp6 pic.twitter.com/N1b6pcYZOi
— Mario Nawfal (@MarioNawfal) April 20, 2025
Alongside Grok Vision, xAI has added multilingual audio and real-time search to Grok’s voice mode. However, Android users can access these features only if they subscribe to xAI’s $30-per-month SuperGrok plan.
Introducing Grok Vision, multilingual audio, and real-time search in Voice Mode. Available now.
Grok is now capable of speaking multiple languages, including Spanish, French, Turkish, Japanese, and Hindi. pic.twitter.com/lcaSyty2n5
— Ebby Amir (@ebbyamir) April 22, 2025
xAI has been steadily adding features to Grok, most recently a “memory” component that allows the bot to recall details from past conversations. The company has also added a canvas-like tool for creating documents and applications.