OctoTools, a new open-source agent framework developed by researchers at Stanford University, can turbocharge large language models (LLMs) for reasoning tasks by breaking tasks down into subtasks and augmenting the models with tools. Although tool use has already become an important LLM application, OctoTools makes these capabilities much more accessible by removing technical hurdles and allowing developers and enterprises to extend the platform with their own tools and workflows.
Experiments show that OctoTools outperforms classic prompting methods and other LLM application frameworks, making it a promising option for real-world uses of AI models.
LLMs often struggle with reasoning tasks that involve multiple steps, logical decomposition or specialized domain knowledge. One solution is to offload specific steps of the solution to external tools such as calculators, code interpreters, search engines or image-processing tools. In this scenario, the model focuses on higher-level planning while the actual computation and reasoning are done through the tools.
However, tool use comes with its own challenges. For example, conventional LLMs often require substantial fine-tuning or few-shot learning with curated data to adapt to new tools, and once trained, they remain limited to specific domains and tool types.
Tool selection also remains a pain point. LLMs can become good at using one or a few tools, but when a task requires several tools, they can become confused and use them incorrectly.

OctoTools addresses these pain points with a training-free agentic framework that can orchestrate multiple tools without the need to fine-tune or otherwise adjust the models. OctoTools takes a modular approach to planning and reasoning tasks and can use any general-purpose LLM as its backbone.
Among OctoTools’ key components are “tool cards,” which act as wrappers around the tools the system can use, such as Python code interpreters and web search APIs. Tool cards include metadata such as output formats, limitations and best practices for each tool. Developers can add their own tool cards to the framework to adapt it to their applications.
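To make the idea concrete, here is a minimal sketch in Python of what such a tool card wrapper could look like. The class, field and tool names below are illustrative assumptions for this article, not OctoTools’ actual API.

```python
from dataclasses import dataclass
from typing import Any, Callable

@dataclass
class ToolCard:
    """Hypothetical wrapper bundling a tool with the metadata a planner would read."""
    name: str                 # e.g. "Python_Calculator" (made-up name)
    description: str          # what the tool does, in plain language
    output_format: str        # what the tool returns
    limitations: str          # known failure modes the planner should respect
    best_practices: str       # usage hints for the LLM
    run: Callable[..., Any]   # the callable that actually executes the tool

def evaluate_expression(expression: str) -> Any:
    """Toy tool: evaluate a basic arithmetic expression with builtins disabled."""
    return eval(expression, {"__builtins__": {}}, {})

# A developer could register a custom tool by filling in a card:
calculator_card = ToolCard(
    name="Python_Calculator",
    description="Evaluates basic arithmetic expressions.",
    output_format="A numeric result.",
    limitations="Arithmetic only; no symbolic math or variables.",
    best_practices="Use for exact numeric computation instead of estimating.",
    run=evaluate_expression,
)

print(calculator_card.run("3 * (4 + 5)"))  # 27
```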
When a new prompt comes into OctoTools, a “planner” module uses the LLM backbone to generate a high-level plan that summarizes the objective, analyzes the required skills, identifies the relevant tools and notes additional considerations for the task. The planner determines a set of sub-goals the system must complete to accomplish the task and lays them out in a textual action plan.
For each step of the plan, an “action predictor” module refines the sub-goal, specifies the tool required to carry it out and makes sure the step is executable and verifiable.
Once the plan is ready to be executed, a “command generator” maps the textual plan into Python code that invokes the specified tool for each sub-goal and passes the command to a “command executor,” which runs it in a Python environment. The results of each step are validated by a “context verifier” module, and the final result is consolidated by a “solution summarizer.”
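Put together, the flow described above can be pictured as a simple control loop. The sketch below is only an illustration of that description; the function names, prompt wording and the `result` convention are assumptions, not OctoTools’ real implementation, and it assumes `llm` is a callable that takes a prompt string and returns text.

```python
def solve(task, tool_cards, llm):
    """Illustrative control loop following the article's description (not OctoTools' code)."""
    # Planner: draft a high-level plan and break the task into sub-goals.
    plan = llm(
        "Summarize the objective, list the skills and tools needed, and "
        f"break this task into sub-goals, one per line:\n{task}"
    )
    sub_goals = [line.strip() for line in plan.splitlines() if line.strip()]

    context = []  # verified intermediate results
    for sub_goal in sub_goals:
        # Action predictor: pick a tool and make the step executable and verifiable.
        action = llm(
            f"Sub-goal: {sub_goal}\n"
            f"Available tools: {[card.name for card in tool_cards]}\n"
            "Choose one tool and specify its inputs."
        )

        # Command generator: translate the action into runnable Python code
        # that stores its output in a variable named `result` (our convention here).
        command = llm(f"Write Python code calling the chosen tool. Action: {action}")

        # Command executor: run the generated code (a real system would sandbox this).
        namespace = {card.name: card.run for card in tool_cards}
        exec(command, namespace)
        result = namespace.get("result")

        # Context verifier: keep only results that actually satisfy the sub-goal.
        verdict = llm(f"Does {result!r} satisfy the sub-goal '{sub_goal}'? Answer yes or no.")
        if verdict.strip().lower().startswith("yes"):
            context.append((sub_goal, result))

    # Solution summarizer: consolidate the verified steps into a final answer.
    return llm(f"Task: {task}\nVerified results: {context}\nWrite the final answer.")
```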

“By separating strategic planning from command generation, OctoTools reduces errors and increases transparency, which makes the system more reliable and easier to maintain,” the researchers write.
OctoTools also uses an optimization algorithm to select the best subset of tools for each task, which avoids overwhelming the model with irrelevant tools.
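The article does not spell out the algorithm, but one simple way to picture such a selection step is a greedy search that keeps adding whichever tool most improves accuracy on a small validation set. The sketch below shows that generic technique purely as an assumption about how it might be done; `evaluate` is a hypothetical scoring function.

```python
def select_tool_subset(candidate_tools, validation_tasks, evaluate):
    """Greedy tool-subset selection (an illustrative approach, not OctoTools' exact method).

    `evaluate(tools, tasks)` is assumed to return the accuracy the agent achieves
    on `tasks` when it is restricted to using only `tools`.
    """
    selected = []
    best_score = evaluate(selected, validation_tasks)

    improved = True
    while improved:
        improved = False
        best_tool = None
        # Try adding each remaining tool and keep the one that helps the most.
        for tool in candidate_tools:
            if tool in selected:
                continue
            score = evaluate(selected + [tool], validation_tasks)
            if score > best_score:
                best_score, best_tool, improved = score, tool, True
        if improved:
            selected.append(best_tool)

    return selected  # the greedily chosen subset; irrelevant tools never get added
```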
Agentic frameworks
There are several frameworks for building LLM applications and agentic systems, notably Microsoft AutoGen, LangChain and the OpenAI API’s function calling. OctoTools outperforms these platforms on tasks that require reasoning and tool use, according to its developers.

The researchers tested all of the frameworks on several benchmarks covering visual, mathematical and scientific reasoning as well as medical knowledge and agentic tasks. OctoTools achieved an average accuracy gain of 10.6% over AutoGen, 7.5% over GPT-Functions and 7.3% over LangChain when using the same tools. According to the researchers, OctoTools performs better because it distributes tool usage more effectively and decomposes the query into sub-goals more cleanly.
OctoTools offers enterprises a practical way to use LLMs for complex tasks. Its extensible tool integration will help overcome existing obstacles to building advanced AI reasoning applications. The researchers have released the code for OctoTools on GitHub.