Microsoft has introduced a new class of highly efficient AI models that process text, images and speech simultaneously while requiring significantly less computing power than existing systems. The new Phi-4 models, released today, represent a breakthrough in the development of small language models (SLMs) that deliver capabilities previously reserved for much larger AI systems.
Phi-4-Multimodal, a model with only 5.6 billion parameters, and Phi-4-mini, with 3.8 billion parameters, outperform similarly sized competitors and match or exceed the performance of models twice their size on certain tasks, according to Microsoft's technical report.
“These models are designed to empower developers with advanced AI capabilities,” said Weizhu Chen, vice president of generative AI at Microsoft. “Phi-4-Multimodal, with its ability to process speech, vision and text simultaneously, opens new possibilities for creating innovative and context-aware applications.”
The technical achievement comes at a time when enterprises are increasingly seeking AI models that can run on standard hardware or at the “edge” (directly on devices rather than in cloud data centers) to reduce costs and latency while maintaining data privacy.
How Microsoft built a small AI model that does it all
What sets Phi-4-Multimodal apart is its novel “mixture of LoRAs” technique, which enables it to handle text, image and speech inputs within a single model.
“By leveraging the mixture of LoRAs, Phi-4-Multimodal extends multimodal capabilities while minimizing interference between modalities,” the research paper states. “This approach enables seamless integration and ensures consistent performance across tasks involving text, images, and speech/audio.”
The innovation allows the model to maintain its strong language capabilities while adding vision and speech recognition, without the performance degradation that often occurs when models are adapted to handle multiple input types.
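To make the idea concrete, here is a toy sketch of modality-specific low-rank adapters (LoRA) attached to a single frozen weight matrix. It illustrates the general technique only and is not Microsoft's implementation: the class names, dimensions and the choice to route plain text through the base weights alone are assumptions made for this example.

```python
import random


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def random_matrix(rows, cols, rng, scale=0.1):
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


class LoRAAdapter:
    """Low-rank update: delta(x) = up @ (down @ x), with rank << dim."""

    def __init__(self, dim_in, dim_out, rank, rng):
        self.down = random_matrix(rank, dim_in, rng)  # dim_in -> rank
        self.up = random_matrix(dim_out, rank, rng)   # rank -> dim_out

    def delta(self, x):
        return matvec(self.up, matvec(self.down, x))


class AdaptedLinear:
    """One shared base matrix plus one adapter per extra modality (hypothetical)."""

    def __init__(self, dim_in, dim_out, rank, modalities, seed=0):
        rng = random.Random(seed)
        self.base = random_matrix(dim_out, dim_in, rng)  # frozen, shared by all inputs
        self.adapters = {m: LoRAAdapter(dim_in, dim_out, rank, rng) for m in modalities}

    def forward(self, x, modality="text"):
        y = matvec(self.base, x)
        adapter = self.adapters.get(modality)
        if adapter is None:
            # Assumption: text uses the untouched base weights, so its
            # behavior cannot be disturbed by the other modalities.
            return y
        return [base + extra for base, extra in zip(y, adapter.delta(x))]


layer = AdaptedLinear(dim_in=8, dim_out=8, rank=2, modalities=("vision", "speech"))
x = [1.0] * 8
text_out = layer.forward(x)
vision_out = layer.forward(x, modality="vision")
speech_out = layer.forward(x, modality="speech")
print("text path unchanged by adapters:", text_out == matvec(layer.base, x))
```

Because each adapter only adds a small correction on top of shared weights, training one modality's adapter leaves the text path and the other adapters untouched, which is the kind of isolation the paper describes as minimizing interference.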
The model has claimed the top position on the Hugging Face OpenASR leaderboard with a word error rate of 6.14%, outperforming specialized speech recognition systems such as WhisperV3. It also demonstrates competitive performance on vision tasks such as mathematical and scientific reasoning with images.
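For readers unfamiliar with the metric, word error rate is the word-level edit distance between a reference transcript and the system's output, divided by the number of words in the reference. The sketch below implements that standard definition; the function name and sample sentences are illustrative, and leaderboard evaluations typically also normalize casing and punctuation before scoring, which is omitted here.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] is the edit distance between the reference words processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1       # reference word missing from the output
            insertion = curr[j - 1] + 1  # extra word in the output
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# Illustrative sentences, not taken from any benchmark.
reference = "the quick brown fox jumps over the lazy dog"
hypothesis = "the quick brown fox jumped over lazy dog"
print(f"WER: {word_error_rate(reference, hypothesis):.1%}")
```

In this example one word is substituted and one is dropped out of nine reference words, so the score is 2/9, or roughly 22%. A 6.14% rate corresponds to about six such errors per hundred reference words.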
Compact AI, massive impact: Phi-4-mini sets new performance standards
Despite its compact size, Phi-4-mini demonstrates exceptional capabilities in text-based tasks. Microsoft reports that the model “outperforms similar size models and is on par with models twice larger” across various language-understanding benchmarks.
The model's performance on math and coding tasks is particularly notable. According to the research paper, “Phi-4-mini consists of 32 Transformer layers with hidden state size of 3,072” and incorporates group query attention to optimize memory usage for long-context generation.
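The memory benefit of group query attention can be shown with back-of-the-envelope arithmetic on the key/value cache, which grows with every generated token. In the sketch below, the layer count and hidden size come from the quote above; the head counts (24 query heads sharing 8 key/value heads) and 16-bit storage are assumptions made for illustration, not figures stated in this article.

```python
def kv_cache_bytes_per_token(layers, kv_heads, head_dim, bytes_per_value=2):
    """Bytes cached per token: one key and one value vector per KV head per layer."""
    return 2 * layers * kv_heads * head_dim * bytes_per_value


LAYERS = 32         # from the research paper quote
HIDDEN_SIZE = 3072  # from the research paper quote
QUERY_HEADS = 24    # assumption for illustration
KV_HEADS = 8        # assumption for illustration

head_dim = HIDDEN_SIZE // QUERY_HEADS

# Standard multi-head attention keeps a separate K/V pair for every query head.
mha_bytes = kv_cache_bytes_per_token(LAYERS, QUERY_HEADS, head_dim)
# Group query attention lets several query heads share one K/V pair.
gqa_bytes = kv_cache_bytes_per_token(LAYERS, KV_HEADS, head_dim)

print(f"head dimension:        {head_dim}")
print(f"multi-head cache:      {mha_bytes / 1024:.0f} KiB per token")
print(f"group query cache:     {gqa_bytes / 1024:.0f} KiB per token")
print(f"fraction of MHA cache: {gqa_bytes / mha_bytes:.2f}")
```

Under these assumptions the cache shrinks to one third of its multi-head size, a saving that compounds over long contexts because the cache is stored for every token in the sequence.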
On the GSM-8K math benchmark, Phi-4-mini achieved a score of 88.6%, outperforming most 8-billion-parameter models, while on the MATH benchmark it reached 64%, substantially higher than similarly sized competitors.
“For the Math benchmark, the model outperforms similar sized models with large margins, sometimes more than 20 points. It even outperforms two times larger models' scores,” the technical report notes.
Transformative deployments: Phi-4's real-world efficiency in action
Capacity, an AI answer engine that helps organizations unify diverse datasets, has already used the Phi family to improve the efficiency and accuracy of its platform.
Steve Frederickson, head of product at Capacity, said in a statement: “From our initial experiments, what truly impressed us about the Phi was its remarkable accuracy and the ease of deployment, even before customization. Since then, we've been able to enhance both accuracy and reliability, all while maintaining the cost-effectiveness and scalability we valued from the start.”
Capacity reported a 4.2x cost savings compared with competing workflows while achieving the same or better qualitative results for preprocessing tasks.
AI without limits: Microsoft's Phi-4 models bring advanced intelligence anywhere
For years, AI development has been driven by a singular philosophy: bigger is better. More parameters, larger models, greater computational demands. But Microsoft's Phi-4 models challenge that assumption, proving that power isn't just about scale; it's about efficiency.
Phi-4-Multimodal and Phi-4-mini are designed not for the data centers of tech giants but for the real world, where computing power is limited, privacy concerns are paramount and AI needs to work seamlessly without a constant connection to the cloud. These models are small, but they carry weight. Phi-4-Multimodal integrates speech, vision and text processing into a single system without sacrificing accuracy, while Phi-4-mini delivers math, coding and reasoning performance on par with models twice its size.
This isn't just about making AI more efficient; it's about making it more accessible. Microsoft has positioned Phi-4 for widespread adoption, making it available through Azure AI Foundry, Hugging Face and the Nvidia API Catalog. The goal is clear: AI that isn't locked behind expensive hardware or massive infrastructure, but that can run on standard devices, at the edge of networks and in industries where computing power is scarce.
Masaya Nishimaki, a director at the Japanese AI firm Headwaters Co., Ltd., sees the impact firsthand. “Edge AI demonstrates outstanding performance even in environments with unstable network connections or where confidentiality is paramount,” he said in a statement. That means AI that can function in factories, hospitals and autonomous vehicles: places where real-time intelligence is required but where traditional cloud-based models fall short.
At its core, Phi-4 represents a shift in thinking. AI isn't just a tool for those with the biggest servers and the deepest pockets. It's a capability that, if designed well, can work anywhere, for anyone. The most revolutionary thing about Phi-4 isn't what it can do; it's where it can do it.