xAI, the company founded in March 2023 by the famous businessman Elon Musk, has announced the Grok-1.5 model for its chatbot, Grok. The model is reported to be considerably more advanced than the first version and is said to outperform GPT-4, developed by OpenAI. Here are the details about the Grok-1.5 model…
Grok-1.5 model outperforms GPT-4!
xAI announced the Grok-1.5 model on its official website. According to the information shared by the company, the new model can now process visual inputs, including documents, diagrams, charts, screenshots and photographs.
The company states that the model can compete with existing multimodal models. In the examples shared for Grok-1.5, the model stands out with capabilities such as converting a table to CSV, fixing an error in code, turning a diagram into code and explaining a meme.
In the MMMU tests carried out by xAI, Grok-1.5 scored 53.6 percent. For comparison, GPT-4 scored 56.8 percent in the same tests. In the math tests, however, Grok-1.5 scored 52.8 percent and left GPT-4 behind. xAI's new model also challenged its competitors in the AI2D, text reading and comprehension, and real-world understanding tests.
The benchmark results for Grok-1.5 and competing models are as follows:
Benchmark | Grok-1.5V | GPT-4V | Claude 3 Sonnet | Claude 3 Opus | Gemini Pro 1.5 |
---|---|---|---|---|---|
MMMU (Multidisciplinary) | 53.6% | 56.8% | 53.1% | 59.4% | 58.5% |
MathVista (Math) | 52.8% | 49.9% | 47.9% | 50.5% | 52.1% |
AI2D (Diagrams) | 88.3% | 78.2% | 88.7% | 88.1% | 80.3% |
TextVQA (Text reading) | 78.1% | 78.0% | – | – | 73.5% |
ChartQA (Charts) | 76.1% | 78.5% | 81.1% | 80.8% | 81.3% |
DocVQA (Documents) | 85.6% | 88.4% | 89.5% | 89.3% | 86.5% |
RealWorldQA (Real-world understanding) | 68.7% | 61.4% | 51.9% | 49.8% | 67.5% |
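For readers who want to check the comparisons themselves, here is a minimal Python sketch that finds the top-scoring model in each row of the table. The scores are copied from the table above; the names `MODELS`, `SCORES`, `leader` and `leaders` and the shortened row labels are our own illustrative choices, not anything published by xAI, and a missing result (–) is represented as `None`.

```python
# Scores (percent) transcribed from the table above; None marks a missing result.
MODELS = ["Grok-1.5V", "GPT-4V", "Claude 3 Sonnet", "Claude 3 Opus", "Gemini Pro 1.5"]
SCORES = {
    "MMMU": [53.6, 56.8, 53.1, 59.4, 58.5],
    "Math": [52.8, 49.9, 47.9, 50.5, 52.1],
    "AI2D": [88.3, 78.2, 88.7, 88.1, 80.3],
    "Text reading": [78.1, 78.0, None, None, 73.5],
    "ChartQA": [76.1, 78.5, 81.1, 80.8, 81.3],
    "Documents": [85.6, 88.4, 89.5, 89.3, 86.5],
    "Real-world understanding": [68.7, 61.4, 51.9, 49.8, 67.5],
}


def leader(row):
    """Return (model, score) for the highest reported score in one row."""
    reported = [(score, model) for score, model in zip(row, MODELS) if score is not None]
    score, model = max(reported)
    return model, score


def leaders(scores=SCORES):
    """Map each benchmark name to its (model, score) leader."""
    return {bench: leader(row) for bench, row in scores.items()}


if __name__ == "__main__":
    for bench, (model, score) in leaders().items():
        print(f"{bench}: {model} ({score}%)")
```

On these figures, Grok-1.5V takes first place in three of the seven rows (math, text reading and real-world understanding), while other models lead the remaining four.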
xAI announced that it will soon begin testing the Grok-1.5 model with users and will integrate it into the Grok chatbot on X. For those who don't know, you must be an X Premium subscriber to access this bot.
So what do you think? What do you make of the Grok-1.5 model's capabilities and benchmark results? You can share your opinions with us in the comments section below.