Timon Harz
December 1, 2024
New AI model for complex problems: Alibaba researchers present Marco-o1
The Chinese e-commerce company Alibaba conducts its own AI research. Its team, named Marco Polo, has now presented its first large language model.

With Marco-o1, Alibaba's AI team is launching a large language model (LLM) that it says has been designed to solve complex problems. Marco-o1 combines techniques such as chain-of-thought (CoT) fine-tuning, Monte Carlo Tree Search (MCTS), and novel reasoning strategies to improve its logical reasoning capabilities.
Marco-o1 with significant improvement over previous versions
Marco-o1 was built on Qwen2-7B-Instruct and fine-tuned with a combination of filtered CoT datasets, the team emphasises. This combination enables Marco-o1 to explore different reasoning paths and find better solutions to complex tasks.
In tests, Marco-o1 achieved an accuracy improvement of 6.17 per cent on the English MGSM dataset and 5.6 per cent on its Chinese counterpart, underlining its improved reasoning capabilities compared to previous versions. According to the Alibaba researchers, the model also performs well in translation tasks, particularly in the machine translation of colloquial expressions and slang, which it renders accurately.
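The MCTS-guided exploration of reasoning paths mentioned above can be sketched with a toy example. The code below is a generic, minimal MCTS (invented for illustration, not Marco-o1's actual implementation, which scores candidate steps using the model's own token confidences): it searches over "reasoning steps" for a toy problem, reaching a target number by repeatedly applying +1 or *2, to show how MCTS balances exploring new paths against exploiting promising ones.

```python
# Minimal MCTS sketch over candidate "reasoning steps" (toy problem).
import math
import random

STEPS = [("+1", lambda x: x + 1), ("*2", lambda x: x * 2)]
TARGET, MAX_DEPTH = 10, 6

class Node:
    def __init__(self, value, path=(), parent=None):
        self.value, self.path, self.parent = value, path, parent
        self.children, self.visits, self.reward = [], 0, 0.0

    def ucb(self, c=1.4):
        # Upper confidence bound: exploitation term + exploration bonus.
        if self.visits == 0:
            return float("inf")
        return self.reward / self.visits + c * math.sqrt(
            math.log(self.parent.visits) / self.visits)

def rollout(value, depth):
    """Finish the path with random steps; reward 1.0 if the target is hit."""
    while depth < MAX_DEPTH and value < TARGET:
        _, fn = random.choice(STEPS)
        value, depth = fn(value), depth + 1
    return 1.0 if value == TARGET else 0.0

def search(root, iterations=500):
    for _ in range(iterations):
        node = root
        # Selection: descend by UCB until reaching a leaf.
        while node.children:
            node = max(node.children, key=Node.ucb)
        # Expansion: add one child per candidate step.
        if node.visits > 0 and len(node.path) < MAX_DEPTH and node.value < TARGET:
            for name, fn in STEPS:
                node.children.append(Node(fn(node.value), node.path + (name,), node))
            node = random.choice(node.children)
        # Simulation + backpropagation.
        reward = rollout(node.value, len(node.path))
        while node:
            node.visits, node.reward = node.visits + 1, node.reward + reward
            node = node.parent
    # Best path found: follow the most-visited children from the root.
    best, node = [], root
    while node.children:
        node = max(node.children, key=lambda n: n.visits)
        best.append(node)
    return best

random.seed(0)
root = Node(1)
best_path = search(root)
print("steps:", [n.path[-1] for n in best_path],
      "-> value:", best_path[-1].value if best_path else None)
```

In Marco-o1 the analogue of a "step" is a chunk of generated reasoning text, and the rollout reward comes from the model's confidence rather than an exact target check; the search skeleton, however, is the same.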
Open source: Marco-o1 available on GitHub
The development team has made Marco-o1 available on GitHub and Hugging Face so that it is accessible to researchers and developers alike. The model was developed using a combination of open-source CoT data and proprietary synthetic data.
Chain-of-thought data contains explanations or intermediate steps that encourage a model to generate logically reasoned answers instead of jumping straight to a solution. In arithmetic tasks, for example, the model outputs not just the result but the individual calculation steps. CoT data guides models to break tasks down into sub-problems and solve them systematically.
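To make this concrete, here is a hypothetical training sample (invented for illustration, not taken from Marco-o1's actual dataset) contrasting a direct-answer example with a chain-of-thought example:

```python
# A direct sample maps the question straight to the answer.
direct_sample = {
    "question": "A shop sells pens at 3 euros each. How much do 4 pens cost?",
    "answer": "12 euros",
}

# A CoT sample spells out the intermediate reasoning steps as well.
cot_sample = {
    "question": "A shop sells pens at 3 euros each. How much do 4 pens cost?",
    "chain_of_thought": [
        "Each pen costs 3 euros.",
        "We need the price of 4 pens.",
        "4 * 3 = 12.",
    ],
    "answer": "12 euros",
}

def to_training_text(sample: dict) -> str:
    """Flatten a sample into the text a model would be fine-tuned on."""
    steps = "\n".join(sample.get("chain_of_thought", []))
    parts = [f"Q: {sample['question']}"]
    if steps:
        parts.append(f"Reasoning:\n{steps}")
    parts.append(f"A: {sample['answer']}")
    return "\n".join(parts)

print(to_training_text(cot_sample))
```

Fine-tuning on the second form teaches the model to emit the "Reasoning" section before committing to an answer, which is the behaviour CoT data is meant to instil.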
The Marco Polo team is working on further uses for Marco-o1 in various areas, including multilingual translation and inference-time scaling. Inference time is the time a model takes to generate an answer after it has been given an input.
Chinese Deepseek model to challenge OpenAI's o1
Marco-o1 follows the release of the Deepseek R1 Lite Preview by Deepseek, a Chinese AI research lab. It is a reasoning model intended to challenge OpenAI's o1.
The Deepseek model's performance is said to be comparable to OpenAI's o1-preview in rigorous benchmarks such as AIME and MATH, which assess the logical and mathematical reasoning capabilities of LLMs. That said, such comparisons are generally difficult to verify objectively.
Press contact
Timon Harz
oneboardhq@outlook.com