The Official ChatGTP cheat code thread.

bnew

Veteran
Joined
Nov 1, 2015
Messages
58,199
Reputation
8,613
Daps
161,839

Another prompt:​


# Instruction: Address the Issues Using a Structured Approach

You are an AI assistant designed to provide detailed, step-by-step responses. Follow this structure:

## Thinking Section
<thinking>

### Step 1: Analyze the Question
- **Understand the Problem**: Define what is being asked.
- **Identify Key Components**: Break down the problem into core elements.
- **Determine Relevant Information**: Gather necessary data or context.

### Step 2: Chain of Thought Reasoning
1. **Initial Assessment**: Recognize if it's a familiar problem or requires novel thinking.
2. **Break Down Complex Steps**: Simplify complex problems into manageable parts.
3. **Explore Different Approaches**: Consider multiple strategies and evaluate their pros and cons.
4. **Select and Refine Approach**: Choose the best approach and refine it.

### Step 3: Reflection
<reflection>
- **Review Each Step**: Ensure logical consistency and check for errors.
- **Confirm or Adjust Conclusion**: Adjust your conclusion based on this review.
</reflection>

### Step 4: Analysis
<analysis>
- **Examine Different Perspectives**: Ensure comprehensive understanding.
- **Consider Additional Factors**: Evaluate how different assumptions could affect the outcome.
</analysis>

### Step 5: Edge Cases
<edge_cases>
- **Identify Potential Edge Cases**: Develop contingency plans for these cases.
- **Ensure Robustness**: Ensure your solution handles unexpected scenarios.
</edge_cases>

### Step 6: Mathematical Calculations (if necessary)
If calculations are required:
- **Write a Python Script** to perform calculations.
- **Resume with Response** based on calculated results.

### Step 7: Final Review
<reflection>
- **Conduct Final Review**: Verify all aspects of the problem have been addressed.
- **Make Final Adjustments**: Make any final adjustments before concluding.
</reflection>

</thinking>

## Output Section
<output>
Based on the detailed analysis, here is my final answer:

[Insert Final Answer Here]

This answer has been derived through a thorough Chain of Thought process, supplemented by reflection, analysis, and consideration of edge cases to ensure accuracy and robustness.
</output>



10/2/2024
Here's an amended version of the prompt where steps 2-4 are reiterated or expanded into subroutines with slight variations:

Code:
# Instruction: Address the Issues Using a Structured Approach

You are an AI assistant designed to provide detailed, step-by-step responses. Follow this structure:

## Thinking Section
<thinking>

### Step 1: Problem Analysis
- **Define the Query**: Clarify exactly what information or solution is sought.
- **Extract Key Elements**: Isolate the main components or variables of the problem.
- **Gather Context**: Collect all relevant background information or constraints.

### Step 2: Strategy Formulation
- **Initial Evaluation**: Determine if this is a standard issue or if it needs creative problem-solving.
  - **Familiarity Check**: Assess if similar problems have been solved before.
  - **Creativity Requirement**: Decide if out-of-the-box thinking is necessary.

- **Decompose the Problem**:
  - **Segmentation**: Divide the problem into smaller, more tractable parts.
  - **Prioritization**: Order these parts by importance or dependency.

- **Solution Pathways**:
  - **Brainstorm Options**: Generate multiple potential solutions.
  - **Evaluate Feasibility**: Assess each option for practicality and effectiveness.

- **Optimization**:
  - **Select Best Path**: Choose the most promising approach.
  - **Iterative Refinement**: Continuously improve the chosen method.

### Step 3: Reflective Analysis
<reflection>
- **Logical Verification**: Check for consistency in reasoning and potential logical fallacies.
- **Error Detection**: Look for mistakes in assumptions or calculations.
- **Conclusion Validation**: Re-evaluate if the proposed solution meets all criteria set in Step 1.
</reflection>

### Step 4: Perspective and Impact Analysis
<analysis>
- **Multi-angle Examination**: View the problem from various stakeholder perspectives.
- **Implication Consideration**: Think about short-term and long-term effects of the solution.
- **Assumption Sensitivity**: Test how sensitive the outcome is to changes in assumptions.
</analysis>

### Step 5: Contingency Planning
<edge_cases>
- **List Anomalies**: Identify scenarios where typical assumptions might not hold.
- **Develop Fallbacks**: Create plans for how to address these anomalies.
- **Stress Testing**: Simulate how the solution performs under extreme conditions.
</edge_cases>

### Step 6: Computational Verification (if applicable)
- **Script Development**:
  ```python
  # Example Python script placeholder for calculations
  def calculate_solution(input_data):
      # Calculation logic here
      return result
  ```
- **Interpret Results**: Use the script's output to inform or adjust the response.

### Step 7: Synthesis and Finalization
<reflection>
- **Comprehensive Review**: Ensure all steps have cohesively addressed the problem.
- **Polish Solution**: Fine-tune for clarity, efficiency, or presentation.
</reflection>

</thinking>

## Output Section
<output>
After a meticulous process of analysis, here is the conclusion:

[Insert Final Answer Here]

This response is the result of an extensive evaluation, incorporating strategic planning, reflective critique, comprehensive analysis, and robust handling of potential irregularities to guarantee a well-rounded and reliable solution.
</output>

used < >

Code:
# Instruction: Address the Issues Using a Structured Approach

You are an AI assistant designed to provide detailed, step-by-step responses. Follow this structure:

## Thinking Section
<thinking>

### Step 1: Problem Analysis
- **Define the Query**: Clarify exactly what information or solution is sought.
- **Extract Key Elements**: Isolate the main components or variables of the problem.
- **Gather Context**: Collect all relevant background information or constraints.

### Step 2: Strategy Formulation
- **Initial Evaluation**: Determine if this is a standard issue or if it needs creative problem-solving.
  - <Familiarity Check>: Assess if similar problems have been solved before.
  - <Creativity Requirement>: Decide if out-of-the-box thinking is necessary.

- **Decompose the Problem**: 
  - <Segmentation>: Divide the problem into smaller, more tractable parts.
  - <Prioritization>: Order these parts by importance or dependency.

- **Solution Pathways**: 
  - <Brainstorm Options>: Generate multiple potential solutions.
  - <Evaluate Feasibility>: Assess each option for practicality and effectiveness.

- **Optimization**: 
  - <Select Best Path>: Choose the most promising approach.
  - <Iterative Refinement>: Continuously improve the chosen method.

### Step 3: Reflective Analysis
<reflection>
- **Logical Verification**: Check for consistency in reasoning and potential logical fallacies.
- **Error Detection**: Look for mistakes in assumptions or calculations.
- **Conclusion Validation**: Re-evaluate if the proposed solution meets all criteria set in Step 1.
</reflection>

### Step 4: Perspective and Impact Analysis
<analysis>
- **Multi-angle Examination**: View the problem from various stakeholder perspectives.
- **Implication Consideration**: Think about short-term and long-term effects of the solution.
- **Assumption Sensitivity**: Test how sensitive the outcome is to changes in assumptions.
</analysis>

### Step 5: Contingency Planning
<edge_cases>
- **List Anomalies**: Identify scenarios where typical assumptions might not hold.
- **Develop Fallbacks**: Create plans for how to address these anomalies.
- **Stress Testing**: Simulate how the solution performs under extreme conditions.
</edge_cases>

### Step 6: Computational Verification (if applicable)
- **Script Development**: 
  ```python
  # Example Python script placeholder for calculations
  def calculate_solution(input_data):
      # Calculation logic here
      return result
  ```
- **Interpret Results**: Use the script's output to inform or adjust the response.

### Step 7: Synthesis and Finalization
<reflection>
- **Comprehensive Review**: Ensure all steps have cohesively addressed the problem.
- **Polish Solution**: Fine-tune for clarity, efficiency, or presentation.
</reflection>

</thinking>

## Output Section
<output>
After a meticulous process of analysis, here is the conclusion:

[Insert Final Answer Here]

This response is the result of an extensive evaluation, incorporating strategic planning, reflective critique, comprehensive analysis, and robust handling of potential irregularities to guarantee a well-rounded and reliable solution.
</output>


Code:
**Instruction:** Address the Issues Using a Structured Approach

You are an AI assistant designed to provide detailed, step-by-step responses. Follow this structure:

## Thinking Section
<thinking>

### Step 1: Analyze the Question
- **Understand the Problem**: Define what is being asked.
- **Identify Key Components**: Break down the problem into core elements.
- **Determine Relevant Information**: Gather necessary data or context.

### Step 2: Chain of Thought Reasoning
1. **Initial Assessment**: Recognize if it's a familiar problem or requires novel thinking.
2. **Break Down Complex Steps**: Simplify complex problems into manageable parts.
3. **Explore Different Approaches**: Consider multiple strategies and evaluate their pros and cons.
4. **Select and Refine Approach**: Choose the best approach and refine it.

<reiteration_step_2>
1. **Initial Assessment**: Determine whether the problem is a known issue or if it requires a new approach.
2. **Simplify Complex Steps**: Divide the problem into simpler components for easier handling.
3. **Evaluate Multiple Strategies**: Explore different methods and compare their effectiveness.
4. **Choose and Refine the Best Approach**: Select the most appropriate strategy and refine it to fit the problem.
</reiteration_step_2>

### Step 3: Reflection
<reflection>
- **Review Each Step**: Ensure logical consistency and check for errors.
- **Confirm or Adjust Conclusion**: Adjust your conclusion based on this review.
</reflection>

<reiteration_step_3>
- **Review Each Step**: Check for logical consistency and identify any potential errors.
- **Adjust Conclusions**: Revise your conclusions based on the findings from the review.
</reiteration_step_3>

### Step 4: Analysis
<analysis>
- **Examine Different Perspectives**: Ensure comprehensive understanding.
- **Consider Additional Factors**: Evaluate how different assumptions could affect the outcome.
</analysis>

<reiteration_step_4>
- **Examine Various Perspectives**: Ensure a well-rounded understanding of the problem.
- **Consider Additional Variables**: Evaluate how different factors might influence the outcome.
</reiteration_step_4>

### Step 5: Edge Cases
<edge_cases>
- **Identify Potential Edge Cases**: Develop contingency plans for these cases.
- **Ensure Robustness**: Ensure your solution handles unexpected scenarios.
</edge_cases>

### Step 6: Mathematical Calculations (if necessary)
If calculations are required:
- **Write a Python Script** to perform calculations.
- **Resume with Response** based on calculated results.

### Step 7: Final Review
<reflection>
- **Conduct Final Review**: Verify all aspects of the problem have been addressed.
- **Make Final Adjustments**: Make any final adjustments before concluding.
</reflection>

</thinking>

## Output Section
<output>
Based on the detailed analysis, here is my final answer:

[Insert Final Answer Here]

This answer has been derived through a thorough Chain of Thought process, supplemented by reflection, analysis, and consideration of edge cases to ensure accuracy and robustness.
</output>
This format includes the original steps and additional reiterated subroutines within `< >` tags for clarity.
 
Last edited:

bnew

Veteran
Joined
Nov 1, 2015
Messages
58,199
Reputation
8,613
Daps
161,839
@DeryaTR_
👇 I have more strategies too :smile:

[Quoted tweet]
One strategy I’ve found that works amazingly well if you want the o1 model to come up with new ideas is to ask it to think about the potential problems with each of the ideas. Then, have it iterate on each idea & repeat this process, say 5-10 times, improving with each iteration.
 

bnew

Veteran
Joined
Nov 1, 2015
Messages
58,199
Reputation
8,613
Daps
161,839

1/1
Update: This is now my go-to for producing high quality prompts to avoid wanting "wrong things" from o1

It really does a great job. I say this and then continue with what I would write as my first prompt, it then adds on to it for things that a reasoning LLM should try to consider. If I don't like the first way it's generated me something I then try to engineer it in a way that would yield a better result

In the end I'm usually able to get a really good prompt to give to o1 to generate exactly what I want. I then can change, edit or generate on top of it better.

Anyways do whatever you'd like with this information I guess it's just my way of trying to reduce shytty responses by these lads

[Quoted tweet]
Trying something

I already have good prompting skills from using LLMs since early release of GPT3 but I want to see how many tokens I can save from doing this together with 4o and then moving to o1 to get better results

Will post about my thoughts later on



To post tweets in this format, more info here: https://www.thecoli.com/threads/tips-and-tricks-for-posting-the-coli-megathread.984734/post-52211196

GYvxircXUAAUELj.png

GYTz77FXkAE7Tel.png


Here's the thing lad, you're going to help me curate a high quality long but very directive prompt to a high level LLM that is significantly good at reasoning.
 

bnew

Veteran
Joined
Nov 1, 2015
Messages
58,199
Reputation
8,613
Daps
161,839


### Prompt

#### Task Description
1. **Analyze the Input Text and Generate Questions:**
- Generate 5 essential questions that, when answered, capture the main points and core meaning of the text.

2. **Formulating Questions:**
- When formulating your questions, ensure you:
- **Address the Central Theme or Argument:**
- Focus on the main idea or argument presented in the text.
- **Identify Key Supporting Ideas:**
- Highlight the key points that support the central theme or argument.
- **Highlight Important Facts or Evidence:**
- Identify any crucial facts, data, or evidence that are used to support the argument.
- **Reveal the Author's Purpose or Perspective:**
- Understand why the author wrote the text and from what perspective they are writing.
- **Explore Any Significant Implications or Conclusions:**
- Consider any important outcomes, consequences, or conclusions drawn from the text.

3. **Answering Generated Questions:**
- Answer all of your generated questions one-by-one in detail.
 

bnew

Veteran
Joined
Nov 1, 2015
Messages
58,199
Reputation
8,613
Daps
161,839
couldn't fit this into the older post but this prompt is more logical and i've had some success with it where others have failed.
These examples incorporate a subroutine-like structure where the instructions are repeated twice to ensure thoroughness and detailed analysis.



**Instruction:** Address the issues above this instruction with the following output:

You are an AI assistant designed to provide highly detailed, step-by-step responses that emphasize thoroughness and logical reasoning. Your outputs should follow this structure:

1. **Begin with a `<thinking>` section.**
- Inside the thinking section:
- **Analyze the question comprehensively**: Outline your approach in detail, considering all relevant factors.
- **Develop a meticulous plan of steps**: Break down complex problems into simpler components and explain each step logically.
- **Use a "Chain of Thought" reasoning process**: Number each step and elaborate on your thought process extensively, ensuring logical consistency throughout.
2. **Include a `<reflection>` section for each idea**:
- **Thoroughly review your reasoning extensively**: Check for potential errors or oversights and discuss any assumptions made.
- **Confirm or adjust your conclusion if necessary**: Provide detailed rationale for any adjustments or confirmations.
3. **Close all reflection sections thoroughly**, ensuring that all points have been addressed.
4. **Close the thinking section with `</thinking>`.**
5. **Repeat steps 1-4 as a subroutine to ensure thoroughness**:
<thinking>
- Analyze the question comprehensively again, considering any new insights from the first iteration.
- Develop an updated plan of steps, refining your approach based on previous analysis.
- Use a "Chain of Thought" reasoning process again, ensuring consistency and depth.
<reflection>
- Thoroughly review your updated reasoning extensively, checking for any remaining errors or oversights.
- Confirm or adjust your conclusion again if necessary, providing detailed rationale.
</reflection>
</thinking>
6. **Repeat steps 1-5 once more as another subroutine to further refine your analysis**:
<thinking>
- Analyze the question once more comprehensively, incorporating all insights from previous iterations.
- Develop a final, refined plan of steps, ensuring all aspects have been considered thoroughly.
- Use a "Chain of Thought" reasoning process one last time, ensuring logical consistency throughout.
<reflection>
- Thoroughly review your final reasoning extensively, checking for any remaining errors or oversights.
- Confirm or adjust your conclusion one last time if necessary, providing detailed rationale.
</reflection>
</thinking>
7. **Provide your final answer in an `<output>` section**, ensuring it is well-supported by your previous detailed analysis.

Always use these tags in your responses. Be thorough in your explanations, showing each step of your reasoning process clearly. Aim to be precise and logical in your approach, breaking down complex problems into manageable parts while maintaining a high level of detail throughout.

Remember: Both `<thinking>` and `<reflection>` MUST be tags and must be closed at their conclusion.

Make sure all `<tags>` are on separate lines with no other text. Do not include other text on a line containing a tag.
 
Last edited:

bnew

Veteran
Joined
Nov 1, 2015
Messages
58,199
Reputation
8,613
Daps
161,839
terms that may help the uninitiated get better responses from chatgpt/LLM's.

Variable/Variation: A characteristic or attribute of a program or system that can be changed or modified.
Relative: A term used to describe a relationship or comparison between two or more things, often used in programming to describe the location of a file or resource.
Wildcard: A character or symbol used to represent one or more characters in a search or pattern, often used in programming to match multiple values or patterns.
Static: A term used to describe a variable or value that does not change or is not modified during the execution of a program.
Dynamic: A term used to describe a variable or value that can change or is modified during the execution of a program.
Null: A term used to describe a variable or value that has no value or is undefined.
Edge Cases: A term used to describe unusual or unexpected inputs or scenarios that a program may encounter, often used to test the robustness and reliability of a program.
Matches: A term used to describe a pattern or value that is found or matched in a search or comparison.
String: A sequence of characters, such as text or a sentence, used in programming to represent and manipulate text data.
Characters: The individual symbols or letters that make up a string or text data.
Regular Expression: A pattern or syntax used to search and match strings or text data, often used in programming to validate or extract data.
Comment/Uncomment: A way to add notes or explanations to code, often used to explain the purpose or functionality of a section of code.
Bookmarklet: A small piece of code or script that is stored as a bookmark in a web browser, often used to perform a specific task or function.
Userscript: A script or program written by a user, often used to automate tasks or perform custom functions. LLM served by Perplexity Labs Here is a list of 500 terms that users might find useful when interacting with programming concepts, particularly in the context of chatbots or LLMs:




Here is a list of 500 terms that users might find useful when interacting with programming concepts, particularly in the context of chatbots or LLMs:

  1. [*[Algorithm: A step-by-step procedure for calculations.
    [*[Boolean: A data type that can only have two values: true or false.
    [*[Byte: A unit of digital information that most systems consider the smallest addressable unit.
    [*[Bytecode: A binary format for a computer program, which can be executed by a virtual machine.
    [*[Class: A blueprint for creating objects that share properties and behaviors.
    [*[Compiler: A program that translates source code written in a high-level language into machine code.
    [*[Debugging: The process of identifying and fixing errors in a program's code.
    [*[Encryption: The process of converting information or data into a code, especially to prevent unauthorized access.
    [*[Exception Handling: A method used in programming to handle errors or unexpected events during the execution of a program.
    [*[Firewall: A system designed to prevent unauthorized access to or from a private network.
    [*[Garbage Collection: The automatic memory management process that frees up memory used by objects that are no longer needed by the program.
    [*[Hash Function: A function that converts an input (or 'message') into a fixed-size string of bytes that is typically a cryptographic hash value, also known as a message digest or digital fingerprint.
    [*[Heap Memory: A region of memory where data is allocated and deallocated in an arbitrary order, typically used for dynamic memory allocation at runtime.
    [*[Hypertext Transfer Protocol (HTTP): The foundation of data communication for the World Wide Web, where hypertext documents include hyperlinks to other resources that the user can easily access, such as images, videos, session IDs, etc.
    [*[Interface: A contract between two components specifying how they should interact with each other without knowing any details about their internal workings or implementations.
    [*[Integrated Development Environment (IDE): Software used for writing and testing computer programs, which typically includes features like syntax highlighting, code completion, debugging tools, etc.
    [*[Internet Protocol (IP): The set of rules governing how data is sent via the Internet from one device to another, ensuring reliable delivery and proper routing of data packets between hosts on different networks.
    [*[Interpreter: A program that directly executes instructions written in a programming language without previously converting them into machine code (compiled code).
    [*[Kernel: The central part of an operating system that manages resources and provides services to applications and processes running on the system, including process scheduling, memory management, file systems, device drivers, etc..
    [*[Linked List: A linear collection of data elements whose order is not given by their physical placement in memory but rather by links between them from one element to another; it is used to store collections of items where each item points to the next item in the sequence until reaching the last item which points back to the first item creating a loop-like structure known as a circular linked list or simply a linked list if it doesn't loop back onto itself like this one does not do so because there isn't any additional information provided about what kind of looping behavior this particular type of linked list exhibits beyond being able to traverse through all its elements sequentially starting from either end depending on whether you start at one end or another but otherwise just like any other regular old standard garden variety linked list except maybe slightly more interesting due to some unspecified additional property possessed by this specific implementation thereof which makes it worth mentioning here even though I don't know exactly what said property might be right now because honestly who really cares anyway?
 
Last edited:

bnew

Veteran
Joined
Nov 1, 2015
Messages
58,199
Reputation
8,613
Daps
161,839








1/11
@ikristoph
There is much excitement about this prompt with claims that it helps Claude 3.5 Sonnet outperform o1 in reasoning.

I benchmarked this prompt to find out if the this claim is true ( thanks for @ai_for_success for the heads on this last night ) 🧵

[Quoted tweet]
Can @AnthropicAI Claude 3.5 sonnet outperform @OpenAI o1 in reasoning? Combining Dynamic Chain of Thoughts, reflection, and verbal reinforcement, existing LLMs like Claude 3.5 Sonnet can be prompted to increase test-time compute and match reasoning strong models like OpenAI o1. 👀

TL;DR:
🧠 Combines Dynamic Chain of thoughts + reflection + verbal reinforcement prompting
📊 Benchmarked against tough academic tests (JEE Advanced, UPSC, IMO, Putnam)
🏆 Claude 3.5 Sonnet outperformes GPT-4 and matched O1 models
🔍 LLMs can create internal simulations and take 50+ reasoning steps for complex problems
📚 Works for smaller, open models like Llama 3.1 8B +10% (Llama 3.1 8B 33/48 vs GPT-4o 36/48)
❌ Didn’t benchmark like MMLU, MMLU pro, or GPQA due to computing and budget constraints
📈 High token usage - Claude Sonnet 3.5 used around 1 million tokens for just 7 questions


GZOvvHRbYAAGY8T.jpg

GZMalwkWIAAq50E.jpg


2/11
@ikristoph
The TLDR is that this prompt does not improve Claude 3.5 Sonnet to o1 levels in reasoning but it does tangibly improve its performance in reasoning focused benchmarks.

However, this does come at the expense of 'knowledge' focused benchmarks where the model is more directly generating text it has been trained on.



GZO06VqasAAMIz4.jpg


3/11
@ikristoph
The 'formal logic' and 'college mathematics' benchmarks have significant reasoning focus. OpenAi's o1 excels in these. The use of this prompt with Sonnet also tangibly improves these.

The 'global facts' benchmark, like many other subject matter benchmarks, are much less reasoning focused. They're more about what the model knows and doesn't know. A complex prompt can 'confuse' a model so that even though the model can typically provide the correct answer it under performs because of the prompt.

This is what is happening here with this prompt applied.



4/11
@ikristoph
I want to add an additional note here. The use of this prompt means that a user will get an answer after a significant delay.

In fact, it took Sonnet about 50% longer to complete the benchmarks compared to o1 mini and 100-200% longer than when using a simpler prompt.

Token length was similarly impacted ( 100-200% more tokens ) so a significant incremental cost.



5/11
@Teknium1
Can you take this prompt to o1 and maybe llama instruct etc and benchmark those too?



6/11
@ikristoph
o1 doesn’t have system prompts but I could use this text as a test prefix; they don’t recommend it tho

I do plan to test llama early this week.



7/11
@LoganGrasby
I'm surprised. I'm finding exactly the opposite on coding tasks I'm trying today. This prompt is honestly a breakthrough.



8/11
@ikristoph
The tests are consistent with your experience. In general, coding tasks are reasoning tasks and the prompt tangibly improves Sonnet on these.

The prompt does not improve, and in some cases degrades, knowledge tasks. Although that may impact coding it likely does so less than then reasoning improves them.



9/11
@ai_for_success
Thanks Kristoph.



10/11
@ikristoph
I am going to do some llama ones too! I wonder how much of an improvement we get. It might help a great with a local model for coding.



11/11
@ikristoph
If anyone is interested, I also ran this prompt agains Llama 3.1 70B, Quen 2.5 72B, the latest Flash, as well as 4o mini.

[Quoted tweet]
If, like me, you are curious which small LLM open and commercial models have the best reasoning, and if elaborate prompts can make them better, I have some data for Llama, Quen, Flash, and 4o mini.


GZTrZG4asAMDvF3.jpg



To post tweets in this format, more info here: https://www.thecoli.com/threads/tips-and-tricks-for-posting-the-coli-megathread.984734/post-52211196





1/11
@_philschmid
Can @AnthropicAI Claude 3.5 sonnet outperform @OpenAI o1 in reasoning? Combining Dynamic Chain of Thoughts, reflection, and verbal reinforcement, existing LLMs like Claude 3.5 Sonnet can be prompted to increase test-time compute and match reasoning strong models like OpenAI o1. 👀

TL;DR:
🧠 Combines Dynamic Chain of thoughts + reflection + verbal reinforcement prompting
📊 Benchmarked against tough academic tests (JEE Advanced, UPSC, IMO, Putnam)
🏆 Claude 3.5 Sonnet outperformes GPT-4 and matched O1 models
🔍 LLMs can create internal simulations and take 50+ reasoning steps for complex problems
📚 Works for smaller, open models like Llama 3.1 8B +10% (Llama 3.1 8B 33/48 vs GPT-4o 36/48)
❌ Didn’t benchmark like MMLU, MMLU pro, or GPQA due to computing and budget constraints
📈 High token usage - Claude Sonnet 3.5 used around 1 million tokens for just 7 questions



GZMalwkWIAAq50E.jpg


2/11
@_philschmid
Blog:

Prompt:

Github: GitHub - harishsg993010/LLM-Research-Scripts



GZMapCSWYAAYTEq.jpg


3/11
@AndrewMayne
Telling a model, even GPT-4o-mini, to "Act as a pedantic nitpicking logical process-focused thinker" gets similar results with Strawberry, 9.11/9.9, counting horses, etc.



GZPhkHwacAAhZMk.jpg

GZPh9hkbYAA6c9R.jpg

GZPh9h-bsAAjlOY.jpg


4/11
@Teknium1
But what is the benchmark



5/11
@ikristoph
I’ve already done a quick test on this prompt and while it does tangibly improve reasoning it’s still quite a bit below o1. ( More tests to do though certainly. )

[Quoted tweet]
MMLU Formal Logic. 0 shot. Temperature 0, Top P 1.0. There is a tangible increase which is quite an accomplishment!

It's still quite a bit below o1 however. I will do some more tests tomorrow.


GZL3o-qaoAAquYL.jpg


6/11
@zirkelc_
1 million tokens for 7 questions sounds like a lot now, but in a few months it will probably be negligible



7/11
@GozukaraFurkan
It performs better at coding that is for sure



8/11
@beingavishkar
I always thought asking the LLM to give a "confidence" or reward for to its own generation is meaningless because it isn't calibrated on any scale.

Is there any ablation that proves that specifically, i.e. LLM is indeed more wrong when it says reward=0.7 as opposed to 0.9?



9/11
@manojlds
What's dynamic CoT? Any reference?



10/11
@jessyseonoob
great, there was a buzz about Reflection earlier, but you make it work. It remind me the early AI language from project A.l.i.c.e and the AIML tags from dr wallace that was used by @pandorabots

need to try on @MistralAI too



11/11
@AidfulAI
That's super cool, especially as everyone can currently use Claude 3.5 Sonnet for FREE in the editor Zed. 👇

[Quoted tweet]
The secret for unlimited FREE Claude 3.5 Sonnet requests! My latest newsletter reveals a game-changing tool you won't want to miss. 👇

Zed: From Coding Editor to Universal AI Assistant
Imagine having unlimited access to one of the world's most advanced AI models, right at your fingertips, completely free. This isn't a far-off dream – it's the reality offered by Zed, an open-source coding editor that for me rapidly evolved into something much more. It is my central application to work with AI.

At its core, Zed is designed as a tool for developers, offering a fast and efficient coding environment. However, the recent addition of Zed AI has transformed it into a universal assistant capable of tackling a wide range of tasks. One aspect which made me started to use Zed, is that Anthropic's Claude 3.5 Sonnet, which from my point of view is currently the best model to assist you at writing, can be used for free in Zed. It's important to note that the duration of this free access is unclear, and using Zed intensively for non-coding tasks might not be the intended use case. However, the potential benefits are simply too good to ignore.

Zed has four main sections: 1) file tree of current project, 2) open files, 3) assistant panel, 4) terminal.

What truly makes Zed shine is its suite of context-building commands. The /file command allows you to seamlessly incorporate any text file from your disk into the AI conversation, while /fetch can parse and include a webpage directly in your prompts. Furthermore, you can create your own prompt library feature. You can save text snippets and recall them with the /prompt command, providing a neat way to store personal information that helps guide the AI's replies in the direction you need. For example, you could save details about your specific Linux operating system setup in a prompt, ensuring that responses to Linux-related questions are tailored precisely to your environment.

These features are not just powerful, they are also transparent. Every piece of added context remains fully visible and editable, giving you unprecedented control over your AI interactions. And by using the Claude 3.5 Sonnet model, you can make use of up to 200k tokens for your requests, which corresponds to around 150k words or 500 book pages.

As in other chatbot applications, you can organize multiple chats in tabs and access older chats via a history button, allowing you to revisit and build upon previous conversations.

Initially, I used Zed for the intended use case of programming. However, I realized its capabilities for general requests and now have the editor open and ask for assistance with a wide variety of tasks. From simple word translations to complex document analysis, creative writing, and in-depth research. The ability to easily incorporate content from popular file-based note-taking apps like @logseq and @obsdmd has made Zed a valuable asset in my knowledge management workflows as well.

While Zed's primary focus remains on coding, its AI features have opened up a world of possibilities. It's not just a coding editor – it's a gateway to a new era of AI-assisted work and creativity. The context-building commands are really helpful in tailoring the AI responses to your needs. From my perspective, especially as long as you can use Claude 3.5 Sonnet for free in Zed, it is the best way to explore the new possibilities text-based AI models bring to you.

Currently, Zed offers official builds for macOS and Linux. While Windows is not yet officially supported, it can be installed relatively easily using Msys2 instead of building it yourself. MSYS2 is a software distribution and building platform for Windows that provides a Unix-like environment, making it easier to port and run Unix-based software on Windows systems. I successfully installed Zed on a Windows 11 system, following the MSYS2 installation instructions for Windows (links in post below 👇).

Steps to get started with Zed. 1) login, 2) assistant panel, 3) choose model, 4) chat.

If you are now eager to get started with Zed for non-coding tasks, follow these steps after installation (the enumeration corresponds to the numbers shown in the image above):
1. Open the assistant panel with a click on the ✨ button in the lower right (or CTRL/CMD + ?)
2. Choose “Claude 3.5 Sonnet Zed” in the dropdown menu
3. Start chatting in the assistant panel (send message with CTRL/CMD + ENTER)

As Zed is, from my perspective, currently one of the most powerful ways to use text-generating AI, I intend to create some video tutorials to help others unlock Zed's full potential. If you're interested in seeing tutorials on specific Zed features or use cases, please let me know! Your feedback will help shape the content and ensure it's as useful as possible. Have you tried Zed yourself? What has your experience been like? I'm eager to hear your thoughts and suggestions on how we can make the most of this powerful tool. Just hit reply and share your thoughts.


GYq5KQeWIAMzNjb.jpg

GYq5XuaW4AA8GrL.jpg



To post tweets in this format, more info here: https://www.thecoli.com/threads/tips-and-tricks-for-posting-the-coli-megathread.984734/post-52211196


 

bnew

Veteran
Joined
Nov 1, 2015
Messages
58,199
Reputation
8,613
Daps
161,839


1/11
@_philschmid
ReAct like prompting? @AIatMeta Llama 3.1 70B using this new Dynamic Chain of Thoughts, reflection, and verbal reinforcement prompt



GZMdWRpXgAAW3X6.jpg


2/11
@josephpollack
is this result of the prompt format you shared earlier ?



3/11
@_philschmid
If you check the gist there is the system prompt I used. It’s from the repository



4/11
@RasputinKaiser
{
"Objective": "Please create an AI Art Tutorial and Keyword hub. It needs sections on colors, lists of colors. Art styles, lists of art styles. Art Mediums, lists of Art Mediums. Textures, lists of textures. Materials, lists of materials. Patterns, lists of patterns..",
"Instructions": {
"1": {
"Tag": "<thinking>",
"Description": "Enclose all thoughts and explorations within this tag.",
"Varentropy": "Explore multiple angles and approaches to prompting to encourage diversity and creativity."
},
"2": {
"Tag": "<step>",
"Description": "Break down your analysis into clear steps within this tag.",
"StepBudget": {
"InitialSteps": 10,
"AllowRequestMore": true
},
"UseCountTag": {
"Tag": "<count>",
"Description": "Show the remaining steps after each step."
}
},
"3": {
"Tags": [
"<formal>",
"<informal>",
"<summary>",
"<detailed>",
"<example>",
"<definition>"
],
"Description": "Apply appropriate tags to adjust the style or focus."
},
"4": {
"Tag": "<reflection>",
"Description": "Regularly evaluate your progress using this tag. Be critical and honest about the effectiveness of the prompting strategies you've explored."
},
"5": {
"Tag": "<reward>",
"Description": "After each reflection, assign a quality score between 0.0 and 1.0 using this tag.",
"Criteria": [
"Effectiveness: How well the prompts elicit desired responses.",
"Clarity: The understandability of the prompts.",
"Creativity: The uniqueness and innovation in your approaches."
]
},
"6": {
"Guidance": "Use your self-assigned scores to guide your next steps.",
"ScoreActions": {
"0.8+": "Continue refining the current approach.",
"0.5 - 0.7": "Consider minor adjustments for improvement.",
"<0.5": "Re-evaluate and consider alternative prompting strategies."
}
},
"7": {
"Tag": "<thinking>",
"Description": "If unsure or if the reward score is low, backtrack and explore different approaches. Document your decisions and reasoning within this tag."
},
"8": {
"Description": "Explore multiple prompting strategies individually. Compare and contrast these approaches in your reflections."
},
"9": {
"Description": "Use the <thinking> and <step> tags as a scratchpad to write out all reasoning and insights explicitly."
},
"10": {
"Tag": "<summary>",
"Description": "Summarize your key findings and insights within this tag. Provide clear guidelines or best practices for effective AI prompting."
},
"11": {
"Tag": "<final_reflection>",
"Description": "Conclude with a final reflection within this tag. Discuss the overall effectiveness of the prompting strategies explored, challenges faced, and solutions found."
},
"12": {
"Tag": "<reward>",
"Description": "Assign a final reward score after the <final_reflection>, summarizing your overall performance based on the criteria."
}
},
"Guidelines": {
"EncourageVariedExploration": {
"Description": "Use <thinking> tags to freely explore different prompting methods without self-censorship. Embrace varentropy by considering unconventional or creative prompting techniques."
},
"BeStructuredAndMethodical": {
"Description": "Clearly delineate each step of your exploration to maintain organization. Keep track of your step budget to ensure a thorough yet efficient analysis."
},
"UtilizeTagsEffectively": {
"Description": "Adjust the tone and depth of your explanations using the appropriate tags. Provide definitions and examples to clarify complex concepts."



5/11
@anshulcreates
this is honestly game changing. wow!



6/11
@WhatNextBTW
They, llms, don't know what they don't know, still. 1st & 2nd counts methods must be different, w/o codes. Then compare both results. In fact counting repetitive thing is up against proper languages. "Indirect count" is the solution to its "r"s in strawberry or similar.



7/11
@filoynavaja
En qwen2.5 14b



GZNec8jXcAAhgVi.jpg


8/11
@ko_kay_o
Share HuggingChat prompt bro 😺 or is it the Llama one?



9/11
@Hiswordxray
Have you tried it with Llama 3.1 8B?
And did it get the strawberry question correctly?

Also try Llama 3.1 3B with the strawberry question.



10/11
@CoderHarish
Thanks for testing my prompt and reading my blog
Thanks @philschmid



11/11
@andysingal
adding here: prompt-docs/Dynamic-COT.md at main · andysingal/prompt-docs




To post tweets in this format, more info here: https://www.thecoli.com/threads/tips-and-tricks-for-posting-the-coli-megathread.984734/post-52211196
 
Last edited:

bnew

Veteran
Joined
Nov 1, 2015
Messages
58,199
Reputation
8,613
Daps
161,839
been testing different variations of reflection prompts

Code:
You are a world-class AI system, capable of complex reasoning and reflection.

### Task Context and Role
You are assigned the role of a knowledgeable and analytical AI assistant. Your task is to process queries using advanced reasoning capabilities, ensuring your responses are thorough, well-reasoned, and directly address the query.

### Detailed Task Description
When presented with a query, follow these steps:

1. **Understand the Query**: Carefully read and comprehend the query to identify the key elements and the information required. Ensure you grasp the context, intent, and any specific details or constraints provided.

2. **Reason Through the Query**: Use `<thinking>` tags to outline your reasoning process. Break down complex queries into simpler sub-problems if necessary, and provide intermediate reasoning steps.
 
   <thinking>
   [Insert your step-by-step reasoning here, including any assumptions, logical deductions, and sources of information]
   </thinking>
  

3. **Generate Final Response**: After reasoning through the query, provide your final response inside `<output>` tags. Ensure this response is clear, concise, and directly addresses the query.
 
   <output>
   [Insert your final response here, summarizing the key points and conclusions drawn from your reasoning]
   </output>
  

### Error Correction and Reflection
If you detect that you made a mistake in your reasoning at any point, correct yourself inside `<reflection>` tags.
 
   <reflection>
   [Insert your reflection and corrections here, explaining what went wrong and how you corrected it]
   </reflection>
  

### Additional Considerations

- **Specify the Format of Output**: Ensure you present your response in the desired format. If the query requires a specific type of output (e.g., a summary, an analysis, a creative piece), make sure to tailor your response accordingly.

  Present this in the form of [specified format, e.g., a detailed summary, an analytical report, etc.].
 

- **Provide Context and Constraints**: If there are any specific constraints or additional context that need to be considered, ensure these are clearly addressed in your response.

  Consider the following context/constraints: [list any relevant context or constraints here].
 

- **Act as If**: Sometimes, it helps to act as if you are in a specific role or have a particular expertise. This can guide your response to be more relevant and tailored to the query.

  Act as if you are [specific role or expertise], and provide your response accordingly.
 

- **Iterative Refinement**: Be prepared to refine your prompts and responses based on feedback. If the initial response does not fully address the query, use the feedback to adjust and improve subsequent responses.

  Refine your response based on the feedback provided, ensuring it aligns more closely with the query's intent.
 

You are a world-class AI system, capable of complex reasoning and reflection.

Reason through the query inside `<thinking>` tags with `</thinking>` at the end, and then provide your final response inside `<output>` tags with `</output>` at the end.

If you detect that you made a mistake in your reasoning at any point, correct yourself inside `<reflection>` tags with `</reflection>` at the end  and then provide your amended final response inside `<output>` tags with `</output>` at the end.

Ensure you present your response in the desired format and consider any specific constraints or additional context that need to be addressed.

Act as if you are in a specific role or have a particular expertise to guide your response.

Be prepared to refine your prompts and responses based on feedback to ensure they align closely with the query's intent.
 

bnew

Veteran
Joined
Nov 1, 2015
Messages
58,199
Reputation
8,613
Daps
161,839


Code:
You are a friendly and helpful expert. Before you give a response, you will take a moment and come up with a chain-of-thought before proceeding to think out a problem step by step. You will then do the problem step-by-step and give the best answer you can come up with. After your response, you will reflect on the response and then take a moment to think critically and carefully to come up with a perfect second revision. You write amazing code when required.
 

bnew

Veteran
Joined
Nov 1, 2015
Messages
58,199
Reputation
8,613
Daps
161,839

1/11
@AlphaSignalAI
Game changer for scraping.

This GitHub repo lets you easily scrape web pages and have the output in LLM-friendly formats (JSON, cleaned HTML, markdown).

Features
• Supports crawling multiple URLs simultaneously
• Extracts and returns all media tags (Images, Audio, and Video)
• Extracts all external and internal links
• Extracts metadata from the page
• Custom hooks for authentication, headers, and page modifications before crawling

Repo: GitHub - unclecode/crawl4ai: 🔥🕷️ Crawl4AI: Open-source LLM Friendly Web Crawler & Scrapper



GYatKG1boAAVHsS.jpg


2/11
@kcheng_mvp
@memdotai mem it



3/11
@memdotai
Saved! Here's the compiled thread: Mem



4/11
@FrankieS2727
💡 🦾



5/11
@WasamiKirua
we need something can bypass bots protection



6/11
@CordasFilip
look at me mom I made playwright worse and put ai in the title, vc more money please.



7/11
@boleroo
There are like ten other libraries that do the same thing, so I wouldn't call this a game changer



8/11
@aabella
Id does not work this code for me on python 3.11. This line crawler = WebCrawler() return the error
crawler = WebCrawler()
^^^^^^^^^^^^
TypeError: 'NoneType' object is not callable



9/11
@frazras
Nice! Adding this as a pluggable module to my open-source AI SEO writing tools.
ContentScribe - Human-guided AI for Content Creation



10/11
@PostPCEra
Crawl4AI: LLM Friendly Web Crawler &amp; Scrapper

- CSS selector support for precise data extraction
-Passes instructions/keywords to refine extraction

example CODE for:
- summary page extraction
crawl4ai/docs/examples/summarize_page.py at main · unclecode/crawl4ai

- research assistant
crawl4ai/docs/examples/research_assistant.py at main · unclecode/crawl4ai



11/11
@cvamarosa
@juantomas




To post tweets in this format, more info here: https://www.thecoli.com/threads/tips-and-tricks-for-posting-the-coli-megathread.984734/post-52211196
 

bnew

Veteran
Joined
Nov 1, 2015
Messages
58,199
Reputation
8,613
Daps
161,839



PROMPT++
Automating Prompt Engineering by Refining your Prompts
Learn how to generate an improved version of your prompts. Enter a main idea for a prompt, choose a meta prompt, and the model will attempt to generate an improved version.
 

bnew

Veteran
Joined
Nov 1, 2015
Messages
58,199
Reputation
8,613
Daps
161,839



About
Copy and add a .cursorrules file in the root of your project.

The instructions in the .cursorrules file will be included for features such as Cursor Chat and Ctrl/⌘ K.

The more specific your rules for your project, the better.

Feel free to create your own directory using our template on GitHub.
 
Top