Teaching language models to reason algorithmically

🔬 ANALYSEUR SCIENCE & TECH

🤖 Artificial Intelligence
✍️ Author(s)
Hattie Zhou
📅 Published
2023-08-24T12:33:00.002-07:00
📖 Length
800 words
Source: https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiAcRJnal11Q...

📋 Article excerpt
Posted by Hattie Zhou, Graduate Student at MILA, and Hanie Sedghi, Research Scientist, Google

Large language models (LLMs), such as GPT-3 and PaLM, have shown impressive progress in recent years, driven by scaling up models and training data sizes. Nonetheless, a long-standing debate has been whether LLMs can reason symbolically (i.e., manipulate symbols based on logical rules). For example, LLMs can perform simple arithmetic operations when the numbers are small, but struggle with large numbers. This suggests that LLMs have not learned the underlying rules needed to perform these arithmetic operations.

While neural networks have powerful pattern matching capabilities, they are prone to overfitting to spurious statistical patterns in the data. This does not hinder good performance when the training data is large and diverse and the evaluation is in-distribution. However, for tasks that require rule-based reasoning (such as addition), LLMs struggle with out-of-distribution generalization, as spurious correlations in the training data are often much easier to exploit than the true rule-based solution. As a result, despite significant progress in a variety of natural language processing tasks, performance on simple arithmetic tasks like addition has remained a challenge. Even with the modest improvement of GPT-4 on the MATH dataset, errors are still largely due to arithmetic and calculation mistakes. Thus, an important question is whether LLMs are capable of algorithmic reasoning, which involves solving a task by applying a set of abstract rules that define the algorithm.

In “Teaching Algorithmic Reasoning via In-Context Learning”, we describe an approach that leverages in-context learning to enable algorithmic reasoning capabilities in LLMs. In-context learning refers to a model’s ability to perform a task after seeing a few examples of it within the context of the model.
The task is specified to the model using a prompt, without the need for weight updates. We also present a novel algorithmic prompting technique that enables general purpose language models to achieve strong generalization on arithmetic problems that are more difficult than those seen in the prompt. Finally, we demonstrate that a model can reliably execute algorithms on out-of-distribution examples with an appropriate choice of prompting strategy.

Figure: By providing algorithmic prompts, we can teach a model the rules of arithmetic via in-context learning. In this example, the LLM (word predictor) outputs the correct answer when prompted with an easy addition question (e.g., 267+197), but fails when asked a similar addition question with more digits. However, when the more difficult question is appended with an algorithmic prompt for addition (blue box with white + shown below the word predictor), the model is able to answer correctly. Moreover, the model is capable of simulating the multiplication algorithm (X) by composing a series of addition calculations.

Teaching an algorithm as a skill

In order to teach a model an algorithm as a skill, we develop algorithmic prompting, which builds upon other rationale-augmented approaches (e.g., scratchpad and chain-of-thought). Algorithmic prompting extracts algorithmic reasoning abilities from LLMs, and has two notable distinctions compared to other prompting approaches: (1) it solves tasks by outputting the steps needed for an algorithmic solution, and (2) it explains each algorithmic step with sufficient detail so there is no room for misinterpretation by the LLM.

To gain intuition for algorithmic prompting, let’s consider the task of two-number addition. In a scratchpad-style prompt, we process each digit from right to left and keep track of the carry value at each step (i.e., we add 1 to the next digit position when the sum at the current position is greater than 9).
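
The right-to-left procedure with a carry can be sketched in Python as follows. This is only an illustration of the arithmetic being described, not the prompt format used in the paper; the function name `addition_scratchpad` and the wording of each recorded step are assumptions made for this example.

```python
def addition_scratchpad(a: int, b: int) -> tuple[int, list[str]]:
    """Add two non-negative integers digit by digit, recording each step.

    Hypothetical helper: the step strings mimic a scratchpad-style trace,
    but their exact format is an assumption, not taken from the paper.
    """
    xs = [int(d) for d in str(a)][::-1]  # least-significant digit first
    ys = [int(d) for d in str(b)][::-1]
    carry = 0
    digits = []
    steps = []
    for i in range(max(len(xs), len(ys))):
        x = xs[i] if i < len(xs) else 0  # pad the shorter number with zeros
        y = ys[i] if i < len(ys) else 0
        total = x + y + carry
        digit, new_carry = total % 10, total // 10  # carry is 1 when total > 9
        steps.append(f"{x}+{y}+{carry}={total}, digit={digit}, carry={new_carry}")
        digits.append(digit)
        carry = new_carry
    if carry:
        digits.append(carry)  # a leftover carry becomes the leading digit
        steps.append(f"final carry={carry}")
    result = int("".join(str(d) for d in reversed(digits)))
    return result, steps


if __name__ == "__main__":
    result, steps = addition_scratchpad(267, 197)
    for line in steps:
        print(line)
    print("answer:", result)
```

Writing out `total`, `digit`, and `carry` at every position spells out the carry rule at each step instead of leaving it to be inferred from a handful of examples.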
However, the rule of carry is ambiguous after seeing only a few examples of carry values. We find that including explicit equations to describe the rule of carry helps the model focus on the relevant details and interpret the prompt more accurately. We use this insight to develop an algorithmic prompt for two-number addition, where we provide explicit equations for each step of computation and describe various indexing operations in unambiguous formats.

Figure: Illustration of various prompt strategies for addition.

Using only three prompt examples of addition with answer lengths of up to five digits, we evaluate performance on additions of up to 19 digits. Accuracy is measured over 2,000 total examples sampled uniformly over the length of the answer. As shown below, the use of algorithmic prompts maintains high accuracy for questions significantly longer than those seen in the prompt, which demonstrates that the model is indeed solving the task by executing an input-agnostic algorithm.

Figure: Test accuracy on addition questions of increasing length for different prompting methods.

Leveraging algorithmic skills as tool use

To evaluate whether the model can leverage algorithmic reasoning in a broader reasoning process, we evaluate performance using grade school math word problems (GSM8k). We specifically attempt to replace addition calculations from...
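
For the addition evaluation described earlier in this excerpt (2,000 test questions sampled uniformly over answer length, up to 19 digits), a test harness might look like the sketch below. The excerpt does not say how operands were drawn, so the sampling scheme here (choose the sum first, then split it into two operands) is an assumption, and `solver` is a hypothetical stand-in for whatever prompts the model and parses its answer.

```python
import random
from typing import Callable


def sample_addition_question(answer_len: int, rng: random.Random) -> tuple[int, int]:
    """Return (a, b) whose sum has exactly `answer_len` digits.

    Assumption: the sum is drawn first, then split into two operands.
    The excerpt does not specify the operand distribution.
    """
    low = 10 ** (answer_len - 1) if answer_len > 1 else 0
    high = 10 ** answer_len - 1
    total = rng.randint(low, high)
    a = rng.randint(0, total)
    return a, total - a


def accuracy(solver: Callable[[int, int], int],
             max_len: int = 19,
             n_examples: int = 2000,
             seed: int = 0) -> float:
    """Fraction of sampled questions that `solver` answers exactly."""
    rng = random.Random(seed)
    correct = 0
    for _ in range(n_examples):
        answer_len = rng.randint(1, max_len)  # uniform over answer length
        a, b = sample_addition_question(answer_len, rng)
        if solver(a, b) == a + b:
            correct += 1
    return correct / n_examples


if __name__ == "__main__":
    # A perfect reference solver; a real run would query a model instead.
    print(accuracy(lambda a, b: a + b))
```

Sampling the answer length first keeps short and long questions equally represented, which is what makes a per-length accuracy comparison meaningful.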

📖 READ THE FULL ARTICLE AT THIS LINK:

🔗 http://ai.googleblog.com/2023/08/teaching-language-models-to-reason.html
