Fine-grained Controllable Text Generation through In-context Learning with Feedback

Saarland University

Figure 1: Rewriting an input sentence to dependency depth 4 through prompting.

Overview

We explore the task of rewriting texts to enhance comprehension for specific readers by employing Controllable Text Generation with Linguistic Features (CTG-LFs). This approach utilizes a language model to adjust input texts according to predetermined linguistic specifications, such as syntactic complexity, to cater to the cognitive abilities of individual readers. Previous methods have relied on fine-tuning models with extensive parallel data, often limiting application to certain target audiences (grade levels) or languages. Our work investigates a novel implementation of CTG-LFs using in-context learning (ICL), which bypasses the need for a large corpus by leveraging examples within the model’s context to guide text transformation.

We present a new methodology that refines ICL with Chain-of-Thought reasoning and a feedback loop to perform reader-specific text modifications based on nontrivial linguistic features such as dependency depth, number of difficult words, and sentence length. We show that our method performs accurate rewrites, with e.g. 81% of test sentences being rewritten to the exact requested dependency depth. Furthermore, by integrating our CTG-LF model with a model that predicts linguistic feature values for a desired target grade level, we develop an end-to-end system capable of rewriting sentences to match a desired (school) grade level. Our system outperforms previous methods, achieving effective reader-specific rewrites using only five in-context examples, thus eliminating the need for extensive training corpora. Our findings highlight the potential of ICL in expanding the applicability of CTG-LFs to diverse reader groups and languages, enhancing personalized text comprehension.

Method

Our goal is to build a model that takes a sentence w and a specification of a reader as input and rewrites w to be optimal for that type of reader. We will approximate the specification of a reader with school grade levels, which indicate a level of text complexity that is suitable for students of a certain grade in an American school.

We split the process of rewriting w for a target grade level into two steps:

  1. Step-1: Predict linguistic feature values for w according to the given target grade level.
  2. Step-2 (our main contribution): Rewrite w to match the predicted feature values via CTG-LF with ICL.
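The two-step pipeline can be sketched as follows. Both functions are hypothetical placeholders, not the paper's implementation: `predict_feature_values` stands in for the trained decision tree of Step-1, and `rewrite_with_ctg_lf` stands in for the ICL-with-feedback rewriter of Step-2.

```python
# Sketch of the two-step pipeline. Both components are stand-ins:
# the paper uses a decision tree classifier for Step 1 and
# GPT-4o with in-context learning and feedback for Step 2.

def predict_feature_values(sentence: str, target_grade: int) -> dict:
    """Step 1 (placeholder rule): lower grades get shallower trees."""
    return {"dependency_depth": max(2, 8 - target_grade // 2)}

def rewrite_with_ctg_lf(sentence: str, features: dict) -> str:
    """Step 2 (placeholder): rewrite to match the requested feature values."""
    return f"[rewritten to depth {features['dependency_depth']}] {sentence}"

def rewrite_for_grade(sentence: str, target_grade: int) -> str:
    features = predict_feature_values(sentence, target_grade)
    return rewrite_with_ctg_lf(sentence, features)
```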

Step-1: Feature Value Predictor

A model that predicts target values for the linguistic features, given the input sentence w, its source grade level, and the desired target grade level.


Figure 2: Feature value predictor based on decision tree classifier. (SG - Source Grade and TG - desired Target Grade)
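To make the predictor's role concrete, here is a hand-written stand-in for the trained decision tree classifier in Figure 2. The thresholds and the function name are illustrative assumptions, not the learned model: the idea is only that simplifying (TG < SG) shrinks the requested tree depth and complexifying grows it.

```python
def predict_dependency_depth(source_depth: int, source_grade: int,
                             target_grade: int) -> int:
    # Hand-written stand-in for the trained decision tree classifier:
    # simplifying (TG < SG) shrinks the tree, complexifying (TG > SG)
    # grows it. The thresholds here are illustrative, not learned.
    if target_grade < source_grade:
        return max(2, source_depth - (source_grade - target_grade) // 2)
    if target_grade > source_grade:
        return source_depth + (target_grade - source_grade) // 3
    return source_depth
```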


We tailor text complexity and content using established linguistic features that are known to impact text comprehension and cognitive load, such as the maximum dependency depth, the number of difficult words, and the sentence length.

Example Explanation of Linguistic Feature Value Calculation


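As an illustration of how one feature value is calculated: the maximum dependency depth of a sentence is the length of the longest path from the root of its dependency tree to any token. A minimal sketch (the paper uses a dependency parser; here the parse is given as a precomputed head-index array):

```python
def max_dependency_depth(heads: list[int]) -> int:
    """Maximum dependency depth of a sentence, given for each token the
    index of its head token (-1 for the root). The root has depth 1."""
    def depth(i: int) -> int:
        d = 1
        while heads[i] != -1:   # walk up to the root
            i = heads[i]
            d += 1
        return d
    return max(depth(i) for i in range(len(heads)))
```

For "She reads old books" with root "reads", heads of ("She", "reads", "old", "books") are `[1, -1, 3, 1]`, and the maximum depth is 3 (via "old" → "books" → "reads").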
Step-2: CTG-LF with ICL


Figure 3: Workflow of CTG-LF, with a detailed view of the input prompt combining the sentence analysis with the rewriting instruction


Our approach combines two core ideas. First, we include an analysis of the input sentence in the prompt and ask the LLM to generate an analysis of the output sentence, followed by the output sentence itself. By an "analysis", we mean a representation of the sentence that makes a feature value explicit; the analysis takes the role of a thought in CoT reasoning (Wei et al., 2022). Analyses allow us to incorporate explicit syntactic information into the prompting process; note, however, that the output analysis is generated by the LLM.
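The prompt construction might look as follows. This is a hedged sketch of the idea, not the paper's exact prompt template: each in-context example pairs an input and output sentence with their analyses, and the final query supplies the input analysis and the rewriting instruction.

```python
def build_prompt(examples, source: str, source_analysis: str,
                 target_depth: int) -> str:
    """Assemble an ICL prompt: in-context examples with analyses,
    then the query sentence with its analysis and the instruction.
    examples: list of (input, input_analysis, output_analysis, output)."""
    parts = []
    for inp, inp_ana, out_ana, out in examples:
        parts.append(f"Input: {inp}\nInput analysis: {inp_ana}\n"
                     f"Output analysis: {out_ana}\nOutput: {out}")
    parts.append(f"Input: {source}\nInput analysis: {source_analysis}\n"
                 f"Rewrite the sentence with a maximum dependency depth "
                 f"of {target_depth}. First give the output analysis, "
                 f"then the output sentence.")
    return "\n\n".join(parts)
```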

Second, we equip our model with a feedback mechanism (Shinn et al., 2024): after each LLM output, we run an external validator on the generated output sentence to determine its true feature values, e.g. a dependency parser for dependency depth (DD). If the feature value differs from the requested one, the LLM is called again, after amending the prompt with the true analysis of the generated output sentence and a feedback message such as "The maximum dependency depth of the rewritten sentence is 5; please revise it with a depth of 4." All previous LLM queries for this sentence, with the LLM responses and the judgments of the parser, are included in the prompt. We permit up to 10 iterations of this feedback loop; if none yield the correct feature value, we return the output of the final iteration.
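The feedback loop above can be sketched as follows. The `llm` and `validator` callables are assumptions standing in for GPT-4o and an external dependency parser; the loop structure (validate, append feedback, re-query, cap at 10 iterations) follows the description in the text.

```python
def rewrite_with_feedback(llm, validator, prompt: str, target_depth: int,
                          max_iters: int = 10) -> str:
    """Re-query the LLM with the validator's verdict appended to the
    growing prompt until the requested depth is met or iterations run out.
    llm: callable prompt -> output sentence (stand-in for GPT-4o).
    validator: callable sentence -> true feature value (e.g. a parser)."""
    history = prompt
    output = ""
    for _ in range(max_iters):
        output = llm(history)
        true_depth = validator(output)
        if true_depth == target_depth:
            return output
        # Append the attempt and a feedback message, keeping all
        # previous queries and judgments in the prompt.
        history += (f"\n{output}\nThe maximum dependency depth of the "
                    f"rewritten sentence is {true_depth}; please revise "
                    f"it with a depth of {target_depth}.")
    return output  # no iteration succeeded: return the final attempt
```

For example, with a toy validator that counts words, a stubbed LLM whose second attempt has the right "depth" is accepted on the second iteration.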

Values for multiple features can be specified at the same time by concatenating the descriptions and analyses for all the features.

Evaluation

First we evaluate the ability of our CTG model to rewrite to the requested feature values in isolation, and then the ability of the combined model to rewrite to a requested grade level (varying from 1 to 12). We use GPT-4o (version gpt-4o-2024-05-13) as our LLM for all ICL experiments.

Dataset: We utilize the WikiLarge (Zhang and Lapata, 2017) text simplification dataset, which consists of automatically aligned complex-simple sentence pairs from English Wikipedia (EW) and Simple English Wikipedia (SEW). This dataset provides a practical foundation for our research, since simplification studies often adjust each input sentence's complexity to approximate different grade levels.

CTG to Linguistic Features

Table 1 shows results for rewriting every source sentence in the test set with respect to the gold feature values of its corresponding target sentence, using our CTG-LF model. Our proposed method exhibits high accuracy in rewriting sentences to meet specific linguistic feature values, such as dependency depth and the number of difficult words. The combination of ICL and a feedback mechanism allows for precise control over the features, significantly outperforming simpler prompting techniques.

Metrics

Tested Prompt Types

CTG to Grade Levels

Our method demonstrates a significant ability to rewrite sentences to match specified school grade levels (from 1 to 12), achieving state-of-the-art accuracy (results in Table 2). The use of ICL enables the model to adapt texts accurately with minimal data, outperforming the traditional fine-tuning approach of Agrawal and Carpuat (2023).

Metrics

Compared Baselines

References

Sweta Agrawal and Marine Carpuat. 2023. Controlling pre-trained language models for grade-specific text simplification. In Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, pages 12807–12819, Singapore. Association for Computational Linguistics.

Louis Martin, Éric de la Clergerie, Benoît Sagot, and Antoine Bordes. 2020. Controllable sentence simplification. In Proceedings of the Twelfth Language Resources and Evaluation Conference, pages 4689–4698, Marseille, France. European Language Resources Association.

Louis Martin, Angela Fan, Éric de la Clergerie, Antoine Bordes, and Benoît Sagot. 2022. MUSS: Multilingual unsupervised sentence simplification by mining paraphrases.

Jiao Sun, Yufei Tian, Wangchunshu Zhou, Nan Xu, Qian Hu, Rahul Gupta, John Wieting, Nanyun Peng, and Xuezhe Ma. 2023. Evaluating large language models on controlled generation tasks. In Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, pages 3155–3168, Singapore. Association for Computational Linguistics.

Jason Wei, Xuezhi Wang, Dale Schuurmans, Maarten Bosma, Fei Xia, Ed Chi, Quoc V Le, Denny Zhou, et al. 2022. Chain-of-thought prompting elicits reasoning in large language models. Advances in neural information processing systems, 35:24824–24837.

Noah Shinn, Federico Cassano, Ashwin Gopinath, Karthik Narasimhan, and Shunyu Yao. 2024. Reflexion: Language agents with verbal reinforcement learning. Advances in Neural Information Processing Systems, 36.

Xingxing Zhang and Mirella Lapata. 2017. Sentence simplification with deep reinforcement learning. In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pages 584–594, Copenhagen, Denmark. Association for Computational Linguistics.

BibTeX

@misc{thillainathan2024finegrained,
      title={Fine-grained Controllable Text Generation through In-context Learning with Feedback}, 
      author={Sarubi Thillainathan and Alexander Koller},
      year={2024},
      eprint={2406.11338},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}