Tag: LLM

  • The LLM Agent Wrote Its Own Agent Loop

    In the previous chapter, I built a partial LLM Agent, one that is missing the agent loop.

    In this chapter, I use ipython to manually tell the LLM Agent to write its own agent loop. It succeeded (with GPT-4.1).

    There are two sections:

    • Write Your Own Agent Loop – contains the ipython session, and
    • Example Use.

    Write Your Own Agent Loop

    In [1]: load "inferscript.py"

    This is where I define the task.

    In [3]: c.push_easy_input_message("The CWD is a Python project of an LLM Agent.
          ⋮  Ignore the md file. Look only at the PY files in the CWD. Determine
          ⋮  what is missing to make an LLM agent.")

    Note: each response is labelled in order.

    In [4]: r1 = infer_iter(c.to_list(), tk, e)
    In [5]: r1
    Out[5]: 
    [{'id': 'omitted',
      'type': 'message',
      'status': 'completed',
      'content': [{'type': 'output_text',
        'annotations': [],
        'logprobs': [],
        'text': "To determine what is missing from your Python project to make a functional LLM (
    Large Language Model) agent, I will need to analyze the actual content of your current workin
    g directory (CWD) and inspect the `.py` files.\n\n**Next Steps:**\n1. List the `.py` files in
     your directory.\n2. Review their contents to see what components exist and what's missing fo
    r a complete LLM agent.\n\nLet me start by listing the `.py` files in your CWD."}],
      'role': 'assistant'}]

    Note: each response is manually added to the context.

    In [6]: for r in r1:
       ...:     c.push_custom(r)
       ...: 
    In [7]: r2 = infer_iter(c.to_list(), tk, e)
    Permission needed to use tool call.
    {'arguments': '{"code":"import os\\n[filename for filename in os.listdir() if '
                  'filename.endswith(\'.py\')]"}',
     'call_id': 'omitted',
     'id': 'omitted',
     'name': 'py_runsource_exec',
     'status': 'completed',
     'type': 'function_call'}
    Grant permission? (y/n): y
    Proceeding...
    In [8]: for r in r2:
       ...:     c.push_custom(r)
       ...: 
    In [9]: r3 = infer_iter(c.to_list(), tk, e)

    Note: that failed. Why?

    In [10]: r3
    Out[10]: 
    [{'id': 'omitted',
      'type': 'message',
      'status': 'completed',
      'content': [{'type': 'output_text',
        'annotations': [],
        'logprobs': [],
        'text': "It looks like there was an issue retrieving the list of `.py` files in your proj
    ect directory (CWD). I need at least the filenames (and ideally the content or structure) of 
    your Python files to analyze what's present and what's missing for an LLM agent.\n\nCould you
     please:\n- Provide the list of `.py` files in your project directory  \nOR\n- Share the main
     code structure or key `.py` files' contents?\n\nOnce I have that information, I can identify
     what is missing for a working LLM agent."}],
      'role': 'assistant'}]

    Let me see why it failed.

    In [11]: c.to_list()
    Out[11]: 
    [{'content': 'The CWD is a Python project of an LLM Agent. Ignore the md file. Look only at t
    he PY files in the CWD. Determine what is missing to make an LLM agent.',
      'role': 'user',
      'type': 'message'},
     {'id': 'omitted',
      'type': 'message',
      'status': 'completed',
      'content': [{'type': 'output_text',
        'annotations': [],
        'logprobs': [],
        'text': "To determine what is missing from your Python project to make a functional LLM (
    Large Language Model) agent, I will need to analyze the actual content of your current workin
    g directory (CWD) and inspect the `.py` files.\n\n**Next Steps:**\n1. List the `.py` files in
     your directory.\n2. Review their contents to see what components exist and what's missing fo
    r a complete LLM agent.\n\nLet me start by listing the `.py` files in your CWD."}],
      'role': 'assistant'},
     {'id': 'omitted',
      'type': 'function_call',
      'status': 'completed',
      'arguments': '{"code":"import os\\n[filename for filename in os.listdir() if filename.endsw
    ith(\'.py\')]"}',
      'call_id': 'omitted',
      'name': 'py_runsource_exec'},
     {'call_id': 'omitted',
      'output': '[Function tool call returned empty result.]',
      'type': 'function_call_output'}]

    Of course: the evaluator only returns what the code prints to stdout/stderr, and the value of a bare expression is never echoed back. The LLM must print the result explicitly.
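    This behaviour can be reproduced with plain exec, which is (roughly) what the evaluator does with the submitted code: capturing stdout yields nothing for a bare expression, but wrapping the expression in print() makes its value visible.

```python
import io
import contextlib

code = "[f for f in ['a.py', 'b.md'] if f.endswith('.py')]"

# A bare expression under exec(): its value is computed, then discarded.
buf = io.StringIO()
with contextlib.redirect_stdout(buf):
    exec(code)
print(repr(buf.getvalue()))   # -> '' : nothing reached stdout

# The same expression wrapped in print(): now the value reaches stdout.
buf = io.StringIO()
with contextlib.redirect_stdout(buf):
    exec(f"print({code})")
print(repr(buf.getvalue()))   # -> "['a.py']\n"
```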

    In [12]: c.push_easy_input_message("The Function tool call returned an empty
           ⋮ result because you did not print the result of the python script you
           ⋮ provided.")

    Try again.

    In [13]: r3 = infer_iter(c.to_list(), tk, e)
    Permission needed to use tool call.
    {'arguments': '{"code":"import os\\nprint([filename for filename in '
                  'os.listdir() if filename.endswith(\'.py\')])"}',
     'call_id': 'omitted',
     'id': 'omitted',
     'name': 'py_runsource_exec',
     'status': 'completed',
     'type': 'function_call'}
    Grant permission? (y/n): y
    Proceeding...

    That looks good. I can see the print function now.

    In [14]: for r in r3:
        ...:     c.push_custom(r)
        ...: 
    In [15]: r4 = infer_iter(c.to_list(), tk, e)
    Permission needed to use tool call.
    {'arguments': '{"code":"with open(\'toolkit.py\') as f:\\n    '
                  'print(f.read())"}',
     'call_id': 'omitted',
     'id': 'omitted',
     'name': 'py_runsource_exec',
     'status': 'completed',
     'type': 'function_call'}
    Grant permission? (y/n): y
    Proceeding...
    Permission needed to use tool call.
    {'arguments': '{"code":"with open(\'evaluator.py\') as f:\\n    '
                  'print(f.read())"}',
     'call_id': 'omitted',
     'id': 'omitted',
     'name': 'py_runsource_exec',
     'status': 'completed',
     'type': 'function_call'}
    Grant permission? (y/n): y
    Proceeding...
    Permission needed to use tool call.
    {'arguments': '{"code":"with open(\'inferscript.py\') as f:\\n    '
                  'print(f.read())"}',
     'call_id': 'omitted',
     'id': 'omitted',
     'name': 'py_runsource_exec',
     'status': 'completed',
     'type': 'function_call'}
    Grant permission? (y/n): y
    Proceeding...
    Permission needed to use tool call.
    {'arguments': '{"code":"with open(\'replscript.py\') as f:\\n    '
                  'print(f.read())"}',
     'call_id': 'omitted',
     'id': 'omitted',
     'name': 'py_runsource_exec',
     'status': 'completed',
     'type': 'function_call'}
    Grant permission? (y/n): y
    Proceeding...
    Permission needed to use tool call.
    {'arguments': '{"code":"with open(\'circularcontext.py\') as f:\\n    '
                  'print(f.read())"}',
     'call_id': 'omitted',
     'id': 'omitted',
     'name': 'py_runsource_exec',
     'status': 'completed',
     'type': 'function_call'}
    Grant permission? (y/n): y
    Proceeding...

    Sanity-check the context size: how many LLM API JSON objects do we have in the context?

    In [16]: len(c.to_list())
    Out[16]: 8
    In [17]: for r in r4:
        ...:     c.push_custom(r)
        ...: 
    In [18]: len(r4)
    Out[18]: 11
    
    In [19]: r5 = infer_iter(c.to_list(), tk, e)
    In [22]: print(r5[0]['content'][0]['text'])

    Note that the following text is the LLM Response.


    Here is a summary of what’s present in your project and what’s missing to make a full LLM agent:


    What You Have

    toolkit.py

    • Defines a Toolkit class for registering and managing tools (e.g., for tool-augmented LLM agents).
    • Includes a function to add a Python REPL tool, allowing the agent to execute Python code.

    evaluator.py

    • Evaluator class launches a subprocess with a persistent REPL, sending code to replscript.py for execution; handles chunked code input/output via base64.
    • Used to execute dynamic Python code safely in isolation from the main process.

    inferscript.py

    • Sets up toolkit, evaluator, and context.
    • Implements permission system for tool calls.
    • Main entry point for “inference iterations” with a function infer_iter, using the toolkit and evaluator based on context and tool predictions.
    • Relies on predict from the context module for next actions.

    replscript.py

    • Implements a custom REPL that can safely evaluate Python code and return output in a chunked, encoded form.
    • Used by Evaluator for persistent code execution.

    circularcontext.py

    • Implements CircularBuffer and CircularContext for limited-size message/context history.
    • Has a function called predict that (as coded) simply wraps an openai_request (but this is not a locally defined model interface).
    • Defines OpenAI API request logic and a skeletal interface for LLM use, with tool support.

    What’s MISSING for a complete LLM Agent

    1. Missing: LLM Prompt Cycle and Agent Logic

    • No main agent loop that accepts user input, pushes messages, and coordinates iterative reasoning (i.e., input → LLM → tool use → next step → output).
    • No clear decision logic for running the agent interactively or as a service.

    2. Model Results Handling

    • The circularcontext.py file has a predict() function that expects the OpenAI API to return a final result directly. However, the actual OpenAI API endpoint used and its formatting may be incorrect/outdated (/v1/responses is not a standard endpoint—should be /v1/chat/completions for chat models).
    • Not enough error handling or model configuration.

    3. Function/Tool Call Parsing and Integration

    • No general handler for parsing LLM output (e.g., function/tool calls as OpenAI-format JSON or similar).
    • No routine for converting raw model outputs into tool calls and iteratively feeding results back.

    4. No User Interface or Entrypoint

    • No CLI, Web, or API interface to interact with the agent.
    • No main application (e.g., main() function) that wires up message input, tool results, and chat context.

    5. Missing Authentication/Secrets Management

    • The OpenAI key is fetched from the environment, but no handling is included for missing or invalid keys.

    6. LLM Output-Driven Memory/Context Feedback

    • The agent lacks a mechanism for using conversation history to drive LLM output over multiple rounds.
    • The CircularContext is present, but there’s no function stitching all pieces into a complete prompt-tool/result-feedback loop.

    Summary Table

    Component                  Present?   Missing?
    OpenAI LLM Integration     Partial    Endpoint outdated/incomplete
    Tooling (Python REPL)      Present
    Context Management         Present    Used, but not in full agent loop
    Agent Loop (main logic)    No         Core interactive/iterative loop
    Function Call Parsing      Partial    Needs full OpenAI format support
    User Interface             No         CLI/webapp or API
    Error Handling             Minimal    Needs improvement

    What to Add

    1. Main Agent Loop: A driver script that:

      • Accepts user input
      • Builds prompt with history/context
      • Sends to LLM
      • Parses model/tool call outputs
      • Executes tools when needed
      • Stores results in context, repeats as needed

    2. Fix/Improve OpenAI API Calls: Switch to /v1/chat/completions with correct format (see OpenAI docs).



    3. Function Call Handling: Add parsing for function/tool calls in the format returned by OpenAI/other LLMs.



    4. Better Error/Edge Case Handling: For missing keys, invalid tool calls, API failures, etc.



    5. Entry Point: A main() or CLI loop for end-to-end interaction.



    Would you like:

    • Example code wiring this together?
    • More details on a specific piece?
    • Recommendations for libraries/frameworks to simplify your agent logic?

    END OF LLM RESPONSE

    In [23]: for r in r5:
        ...:     c.push_custom(r)
        ...: 

    I understand what you have in mind. The first proposed task is just what I want.

    In [24]: c.push_easy_input_message("I want you to create the file agent.py and
           ⋮ add into it the main agent loop.")
    In [25]: r6 = infer_iter(c.to_list(), tk, e)
    In [26]: print(r6[0]['content'][0]['text'])

    The following text is the LLM Response.


    Here is a design for agent.py that wires up your tool/system components into a simple main agent loop. This implementation will:

    • Accept user input via CLI
    • Add it to the circular context buffer
    • Send the context/history to the LLM (using your API/client)
    • Check the response for function/tool calls
    • If present, execute the tool and append the result to the context
    • Otherwise, print/display the LLM output as the agent’s reply
    • Continue until stopped (Ctrl+C or empty input)

    I’ll add clear comments for the key parts.
    Note: This code assumes that the function call output from the LLM follows the OpenAI format.


    # agent.py
    
    import toolkit
    import evaluator
    import inferscript
    import circularcontext as cc
    
    def display(msg):
        print(f"\n=== Agent ===\n{msg}\n")
    
    def main():
        # Initialize toolkit, evaluator, and context
        tk = toolkit.Toolkit()
        evaluator_inst = evaluator.Evaluator()
        context = cc.CircularContext()
    
        # Register tools for the agent (e.g., Python REPL)
        toolkit.add_py_repl_tools(tk)
    
        print("Welcome to your LLM agent. Type 'exit' to quit.\n")
    
        while True:
            try:
                user_input = input("You: ").strip()
            except (EOFError, KeyboardInterrupt):
                print("\nExiting agent.")
                break
    
            if user_input.lower() in ("exit", "quit"):
                break
    
            if not user_input:
                continue
    
            # Add user input to context buffer
            context.push_easy_input_message(content=user_input, role="user")
    
            # Main reasoning and tool-use step
            outputs = inferscript.infer_iter(context.to_list(), tk, evaluator_inst)
    
            # Handle agent outputs (could be message(s) and/or function call output(s))
            agent_reply = ""
            for out in outputs:
                if out.get("type") == "message":
                    # Standard agent message
                    content = out.get("content", "")
                    agent_reply = content
                    context.push_custom(out)
                elif out.get("type") == "function_call_output":
                    # Result from running a tool call
                    output = out.get("output", "")
                    # Optionally, print tool result or just show final answer
                    agent_reply += f"\n[Tool Output]: {output}"
                    context.push_custom(out)
    
            # Display the agent response(s)
            if agent_reply.strip():
                display(agent_reply)
    
        print("Goodbye.")
    
    if __name__ == "__main__":
        main()

    How it works:

    • The loop accepts user input and processes it.
    • It uses inferscript.infer_iter() as the core cycle for reasoning and tool-use, handling both message and tool call outputs.
    • It prints out LLM messages and tool outputs, and adds all outputs to the shared context buffer (“memory”).
    • The conversation/context history is kept short (up to the CircularContext capacity).

    To use:

    1. Save the above to agent.py.
    2. In your terminal, run: python agent.py
    3. You can interact with your agent and use Python tool calls!
    4. To exit, type exit or press Ctrl+C.

    If you want improvements (e.g. more advanced output parsing, formatted output, streaming, etc.), let me know!


    END OF LLM RESPONSE

    In [27]: for r in r6:
        ...:     c.push_custom(r)
        ...: 
    In [28]: c.push_easy_input_message("I approve of your agent.py proposal. I want
           ⋮  you to create and write the agent.py directly to the CWD.")
    In [29]: r7 = infer_iter(c.to_list(), tk, e)
    Permission needed to use tool call.
    {'arguments': '{"code":"AGENT_PY = \'\'\'\\n# agent.py\\n\\nimport '
                  'toolkit\\nimport evaluator\\nimport inferscript\\nimport '
                  'circularcontext as cc\\n\\ndef display(msg):\\n    '
                  'print(f\\"\\\\n=== Agent ===\\\\n{msg}\\\\n\\")\\n\\ndef '
                  'main():\\n    # Initialize toolkit, evaluator, and '
                  'context\\n    tk = toolkit.Toolkit()\\n    evaluator_inst = '
                  'evaluator.Evaluator()\\n    context = '
                  'cc.CircularContext()\\n\\n    # Register tools for the agent '
                  '(e.g., Python REPL)\\n    '
                  'toolkit.add_py_repl_tools(tk)\\n\\n    print(\\"Welcome to your '
                  'LLM agent. Type \'exit\' to quit.\\\\n\\")\\n\\n    while '
                  'True:\\n        try:\\n            user_input = input(\\"You: '
                  '\\").strip()\\n        except (EOFError, '
                  'KeyboardInterrupt):\\n            print(\\"\\\\nExiting '
                  'agent.\\")\\n            break\\n\\n        if '
                  'user_input.lower() in (\\"exit\\", \\"quit\\"):\\n            '
                  'break\\n\\n        if not user_input:\\n            '
                  'continue\\n\\n        # Add user input to context '
                  'buffer\\n        '
                  'context.push_easy_input_message(content=user_input, '
                  'role=\\"user\\")\\n\\n        # Main reasoning and tool-use '
                  'step\\n        outputs = '
                  'inferscript.infer_iter(context.to_list(), tk, '
                  'evaluator_inst)\\n\\n        # Handle agent outputs (could be '
                  'message(s) and/or function call output(s))\\n        '
                  'agent_reply = \\"\\"\\n        for out in '
                  'outputs:\\n            if out.get(\\"type\\") == '
                  '\\"message\\":\\n                # Standard agent '
                  'message\\n                content = out.get(\\"content\\", '
                  '\\"\\")\\n                agent_reply = '
                  'content\\n                '
                  'context.push_custom(out)\\n            elif out.get(\\"type\\") '
                  '== \\"function_call_output\\":\\n                # Result from '
                  'running a tool call\\n                output = '
                  'out.get(\\"output\\", \\"\\")\\n                # Optionally, '
                  'print tool result or just show final answer\\n                '
                  'agent_reply += f\\"\\\\n[Tool Output]: '
                  '{output}\\"\\n                '
                  'context.push_custom(out)\\n\\n        # Display the agent '
                  'response(s)\\n        if agent_reply.strip():\\n            '
                  'display(agent_reply)\\n\\n    print(\\"Goodbye.\\")\\n\\nif '
                  '__name__ == \\"__main__\\":\\n    main()\\n\'\'\'\\n\\nwith '
                  'open(\'agent.py\', \'w\') as f:\\n    f.write(AGENT_PY)"}',
     'call_id': 'omitted',
     'id': 'omitted',
     'name': 'py_runsource_exec',
     'status': 'completed',
     'type': 'function_call'}
    Grant permission? (y/n): y
    Proceeding...

    Example Use

    Setup

    Note that in the final agent.py file, I had to manually correct the newlines: the LLM Agent used \n where it had to use \\n, because the generated code passes through a Python interpreter and one level of escaping is consumed.
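    A minimal demonstration of that double-escaping issue. The source the LLM emits is itself a string that exec() parses before running, so each backslash intended for the final file must be doubled in the generating code:

```python
# Code the LLM emits is a string that exec() parses before running it,
# so each backslash meant for the written file needs to be doubled.
wrong = 'body = "a\\nb"'    # exec sees: body = "a\nb"
right = 'body = "a\\\\nb"'  # exec sees: body = "a\\nb"

env = {}
exec(wrong, env)
print(repr(env["body"]))    # a real newline: the escape was consumed

env = {}
exec(right, env)
print(repr(env["body"]))    # the two-character backslash-n sequence survives
```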

    I also had to do something I forgot earlier: turn inferscript.py into an actual infer.py module.

    So, here is the infer.py file I forgot.

    import json
    
    import toolkit
    import evaluator 
    import circularcontext as cc
    
    from pprint import pprint
    
    def make_functioncalloutput(call_id, content):
        if content == "":
            content = "[Function tool call returned empty result.]"
        return {
            "call_id": call_id,
            "output": content,
            "type": "function_call_output"
        }
    
    def make_functioncalloutput_denied(call_id):
        return {
            "call_id": call_id,
            "output": "[Function tool call permission denied.]",
            "type": "function_call_output"
        }
    
    def ask_for_permission(toolcall):
        print("Permission needed to use tool call.")
        pprint(toolcall)
        while True:
            answer = input("Grant permission? (y/n): ").strip().lower()
            if answer in ('y', 'yes'):
                print("Proceeding...")
                return True
            elif answer in ('n', 'no'):
                print("Aborting function tool call.")
                return False
            else:
                print("Please enter 'y' or 'n'.")
    
    def infer_iter(context, toolkit, evaluator):
        outputs = cc.predict(context=context, tools=toolkit.tools())
        r = outputs.copy()
        for o in outputs:
            if o['type'] == "function_call":
                if toolkit.match(o['name']) == "py_runsource_exec":
                    if ask_for_permission(o):
                        args = json.loads(o['arguments'])
                        result = evaluator.runsource_exec(args['code'])
                        r.append(make_functioncalloutput(o['call_id'], result))
                    else:
                        r.append(make_functioncalloutput_denied(o['call_id']))
        return r

    Problems

    First, I had to manually change the inferscript import to infer in agent.py.

    Then, presumably because the message content is a list of output objects rather than a plain string (so .strip() fails), I had to change

            # Display the agent response(s)
            if agent_reply.strip():
                display(agent_reply)

    to

            # Display the agent response(s)
            if agent_reply:
                display(agent_reply)

    Trying agent.py

    python3 agent.py 
    Welcome to your LLM agent. Type 'exit' to quit.
    
    You: Use Python to calculate the gravitational acceleration at the ISS.
    Permission needed to use tool call.
    {'arguments': '{"code":"# Constants\\nG = 6.67430e-11  # Gravitational '
                  'constant, m^3 kg^-1 s^-2\\nM_earth = 5.972e24  # Mass of the '
                  'Earth, kg\\nR_earth = 6371e3     # Radius of the Earth, '
                  'm\\nh_iss = 420e3        # Average altitude of ISS, m\\n\\n# '
                  'Distance from the center of the Earth to the ISS\\nr_iss = '
                  'R_earth + h_iss\\n\\ng_iss = G * M_earth / '
                  'r_iss**2\\nprint(f\\"Gravitational acceleration at the ISS '
                  'altitude: {g_iss:.2f} m/s^2\\")"}',
     'call_id': 'omitted',
     'id': 'omitted',
     'name': 'py_runsource_exec',
     'status': 'completed',
     'type': 'function_call'}
    Grant permission? (y/n): y
    Proceeding...
    
    === Agent ===
    
    [Tool Output]: Gravitational acceleration at the ISS altitude: 8.64 m/s^2
    
    
    You: exit
    Goodbye.
  • How an Agent Works: The Infer Function

    So far, I have shown the circular context, the toolkit, and the evaluator. Now it is time to use these three components.

    In this chapter, I show infer, a function that uses the circular context, toolkit, and evaluator. I do not automate everything. I still manually manage the messages, but I introduce infer to interact with the three components. I name the function infer for two reasons. One, it is similar to inference. Two, it is similar to inferrer, just as eval is similar to evaluator.

    This chapter has three sections:

    • What is Needed to Make an LLM Agent? – self-explanatory,
    • Basic Implementation – the infer script in Python,
    • Interactive Use – an ipython session to show usage.

    Here are my two goals for this chapter:

    • define and implement the infer function, and
    • illustrate its use for an LLM Agent.

    Note that I will not make an LLM Agent in this chapter.

    What is Needed to Make an LLM Agent?

    What is an LLM?

    Loosely speaking, an LLM takes a text sequence as input and returns another text sequence that completes it. LLMs are trained to model the probability distribution of a text given some prior text. In other words, an LLM can be thought of as a predict function that takes text of size N and returns text of size N + k, such that N + k is less than the LLM context size M, which is defined by the API model.

    It is also worth noting that an LLM API is a non-deterministic process, as the same input may not return the same output, due to output sampling and parallelism.

    LLM context size M
    Index: 0 1 2 3 4 5 6 7 8 9 ... N ... N+k   (N + k < M)
    Value: ? ? ? ? ? ? ? ? ? ? ... ? ...  ?
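    This view can be written down as a toy signature (purely illustrative: real tokenization, probabilities, and sampling are elided).

```python
# Toy illustration of the N -> N + k completion view, with N + k <= M.
def complete(tokens: list, k: int, M: int = 128) -> list:
    assert len(tokens) + k <= M, "completion must fit in the context window"
    return tokens + ["?"] * k   # placeholder continuation of k predicted tokens

print(len(complete(["The", "CWD", "is"], k=4)))   # -> 7
```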

    What is an LLM Agent?

    “An LLM agent runs tools in a loop to achieve a goal”. Reference.

    Tools are external capabilities the LLM can use beyond just generating text. Recall that the LLM API accepts a tools field in the JSON input data. When you provide the API tools, the server constructs a special system prompt. The prompt is designed to instruct the model to use the specified tool(s).

    For example, here is the Claude API example prompt that is constructed for tool use.

    In this environment you have access to a set of tools you can use to answer the
    user's question.
    {{ FORMATTING INSTRUCTIONS }}
    String and scalar parameters should be specified as is, while lists and objects
    should use JSON format. Note that spaces for string values are not stripped.
    The output is not expected to be valid XML and is parsed with regular
    expressions.
    Here are the functions available in JSONSchema format:
    {{ TOOL DEFINITIONS IN JSON SCHEMA }}
    {{ USER SYSTEM PROMPT }}
    {{ TOOL CONFIGURATION }}
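    On the wire, the tools field itself is just JSON. Here is a sketch of a single function-tool definition in the flat shape used by the Responses-style endpoint this book targets; the name matches this book's py_runsource_exec tool, but the description and schema wording are my own illustration.

```python
import json

# Illustrative function-tool definition (flat Responses-API shape; the
# description text and parameter schema here are assumptions for the sketch).
tools = [{
    "type": "function",
    "name": "py_runsource_exec",
    "description": "Run Python source in a persistent REPL; returns stdout/stderr.",
    "parameters": {
        "type": "object",
        "properties": {"code": {"type": "string"}},
        "required": ["code"],
    },
}]

# The definition must survive a JSON round trip to be sent to the API.
print(json.loads(json.dumps(tools))[0]["name"])   # -> py_runsource_exec
```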

    The loop usually looks like:

    • Observe – look at the current state, user requests, or previous results,
    • Think – reason about what to do next,
    • Act – call a tool,
    • Observe,
    • Repeat – until a goal is reached.
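    The steps above can be sketched as a small driver function. This is purely illustrative: llm and run_tool are injected stand-ins, not part of this chapter's code.

```python
def agent_loop(llm, run_tool, context, max_steps=10):
    """Observe-think-act sketch: llm() and run_tool() are injected callables."""
    for _ in range(max_steps):
        action = llm(context)                       # Think
        context.append(action)                      # Observe the model's output
        if action.get("type") == "function_call":   # Act: run the requested tool
            context.append(run_tool(action))        # Observe the tool result
        else:
            break                                   # A plain answer ends the loop
    return context

# Fake components showing one tool round followed by a final answer.
def fake_llm(ctx):
    if any(o.get("type") == "function_call_output" for o in ctx):
        return {"type": "message", "content": "done"}
    return {"type": "function_call", "name": "py_runsource_exec"}

def fake_tool(call):
    return {"type": "function_call_output", "output": "42"}

print(len(agent_loop(fake_llm, fake_tool, [])))   # -> 3
```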

    I will not implement an LLM Agent in this chapter, as I will not implement this loop. Instead, I will do things manually in an ipython session.

    Note that I define only one tool for the LLM API to use: py_runsource_exec.

    Building blocks

    The Infer Function

    The essential part of the LLM Agent will be a function that I call infer.

    Function infer takes a context, a toolkit, and an evaluator as inputs. It returns a new context as the output.

    infer: context toolkit evaluator -> context

    The infer function:

    • interacts with the LLM API,
    • performs tool calls,
    • interacts with the evaluator, and
    • creates a new output context.

    Context

    A context is a list (data structure) that contains input/output objects that the LLM API can understand. The objects are:

    • EasyInputMessage,
    • ResponseOutputMessage,
    • FunctionCallOutput, and
    • ResponseFunctionToolCall.
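    One example of each of the four object types, with shapes taken from the transcripts in these chapters (all field values below are placeholders):

```python
# One example of each context object type (values are placeholders).
context = [
    # EasyInputMessage: plain user input
    {"type": "message", "role": "user",
     "content": "Use Python to calculate 2 + 2."},
    # ResponseOutputMessage: assistant text output
    {"type": "message", "role": "assistant", "status": "completed", "id": "msg_0",
     "content": [{"type": "output_text", "text": "Let me compute that.",
                  "annotations": [], "logprobs": []}]},
    # ResponseFunctionToolCall: the model asks for a tool
    {"type": "function_call", "name": "py_runsource_exec", "id": "fc_0",
     "call_id": "call_0", "status": "completed",
     "arguments": '{"code": "print(2 + 2)"}'},
    # FunctionCallOutput: the tool result fed back
    {"type": "function_call_output", "call_id": "call_0", "output": "4\n"},
]

print([o["type"] for o in context])
```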

    Toolkit

    A toolkit is a data structure that associates tools with programs of the evaluator. It does not perform a tool call.

    The only tool defined is the tool that interacts with the evaluator.

    Evaluator

    An evaluator is a program that takes a program as input and returns a value as output. It has an implicit environment that associates variables with values. It does not perform a tool call; the inferrer does. But the evaluator provides the functionality for performing the tool call.

    LLM API <--> infer <--> evaluator <--> Computer System
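    As a toy, in-process stand-in for the evaluator described above (the real one runs a subprocess; this sketch only shows the code-in, stdout-out contract with a persistent environment):

```python
import io
import contextlib

class ToyEvaluator:
    """In-process sketch: exec code, capture stdout, keep variables around."""

    def __init__(self):
        self.env = {}                       # the implicit environment

    def runsource_exec(self, code: str) -> str:
        buf = io.StringIO()
        with contextlib.redirect_stdout(buf):
            exec(code, self.env)
        return buf.getvalue()

e = ToyEvaluator()
e.runsource_exec("x = 21")                        # defines x; prints nothing
print(e.runsource_exec("print(x * 2)"), end="")   # -> 42
```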

    Basic Implementation of the Infer Function

    Setup

    mkdir infer && cd infer
    python -m venv venv
    source venv/bin/activate
    pip install requests ipython 

    I will write all code into a single file named inferscript.py.

    The Infer Script

    import json
    
    import toolkit
    import evaluator 
    import circularcontext as cc
    
    from pprint import pprint
    
    tk = toolkit.Toolkit()
    e = evaluator.Evaluator()
    c = cc.CircularContext()
    
    toolkit.add_py_repl_tools(tk)
    
    def make_functioncalloutput(call_id, content):
        if content == "":
            content = "[Function tool call returned empty result.]"
        return {
            "call_id": call_id,
            "output": content,
            "type": "function_call_output"
        }
    
    def make_functioncalloutput_denied(call_id):
        return {
            "call_id": call_id,
            "output": "[Function tool call permission denied.]",
            "type": "function_call_output"
        }
    
    def ask_for_permission(toolcall):
        print("Permission needed to use tool call.")
        pprint(toolcall)
        while True:
            answer = input("Grant permission? (y/n): ").strip().lower()
            if answer in ('y', 'yes'):
                print("Proceeding...")
                return True
            elif answer in ('n', 'no'):
                print("Aborting function tool call.")
                return False
            else:
                print("Please enter 'y' or 'n'.")
    
    
    def infer_iter(context, toolkit, evaluator):
        outputs = cc.predict(context=context, tools=toolkit.tools())
        r = outputs.copy()
        for o in outputs:
            if o['type'] == "function_call":
                if toolkit.match(o['name']) == "py_runsource_exec":
                    if ask_for_permission(o):
                        args = json.loads(o['arguments'])
                        result = evaluator.runsource_exec(args['code'])
                        r.append(make_functioncalloutput(o['call_id'], result))
                    else:
                        r.append(make_functioncalloutput_denied(o['call_id']))
        return r

    Interactive Use

    Getting Started

    Make sure that:

    • the terminal is in the proper directory,
    • the Python virtual environment is activated,
    • the files evaluator.py, replscript.py, toolkit.py, and circularcontext.py are in the proper directory, and
    • the proper code is in the inferscript.py file.

    Start an ipython session.

    export OPENAI_API_KEY="your api key..."
    ipython

    Load the code.

    In [1]: load "inferscript.py"

    Push the task for the LLM.

    In [3]: c.push_easy_input_message("Use Python to calculate the gravitational
          ⋮ acceleration at the ISS")

    Create the first response.

    In [4]: r1 = infer_iter(c.to_list(), tk, e)
    Permission needed to use tool call.
    {'arguments': '{"code":"# Gravitational acceleration formula: g = G * M / '
                  'r^2\\n# G is the gravitational constant, M is mass of Earth, r '
                  "is distance from Earth's center to ISS\\nG = 6.67430e-11  # m^3 "
                  'kg^-1 s^-2\\nM = 5.972e24      # kg\\nR_earth = 6371e3  # '
                  "Earth's radius in meters\\nh_ISS = 408e3     # ISS altitude "
                  "above Earth's surface in meters\\nr = R_earth + "
                  'h_ISS\\n\\ng_iss = G * M / r**2\\nprint(f\\"Gravitational '
                  'acceleration at ISS altitude: {g_iss:.2f} m/s^2\\")"}',
     'call_id': (omitted),
     'id': (omitted),
     'name': 'py_runsource_exec',
     'status': 'completed',
     'type': 'function_call'}
    Grant permission? (y/n): y
    Proceeding...

    Push the returned context to the CircularContext instance in variable c.

    In [5]: for r in r1:
       ...:     c.push_custom(r)
       ...: 

    Let the LLM know about the function call output value.

    In [6]: r2 = infer_iter(c.to_list(), tk, e)

    No permission request was needed. This means that no function tool call was used. Let us see the response.

    In [7]: r2
    Out[7]: 
    [{'id': (omitted),
      'type': 'message',
      'status': 'completed',
      'content': [{'type': 'output_text',
        'annotations': [],
        'logprobs': [],
        'text': "The gravitational acceleration at the altitude of the
        International Space Station (ISS) is approximately 8.67 m/s². This is only
    slightly less than the acceleration due to gravity at Earth's surface
        (about 9.81 m/s²)."}],
      'role': 'assistant'}]

    That is the solution to the task. The LLM used the evaluator tool to solve the task. Note that this is not a proper LLM Agent, as the looping part is missing. The reason is that I have not come up with a user interface to control the LLM Agent.

  • Making a REPL with an Evaluator

    In the previous chapter, I have shown a Toolkit component. The Toolkit contains definitions for function tools for the LLM API. But it does not explicitly perform a tool call (that will be done in later chapters).

    In this chapter, I show the evaluator component. The evaluator is a program to which you can send program code. A code interpreter. But the focus here is an evaluator that the LLM can interact with. The LLM sends a function tool call to interact with the evaluator.

    Here are my two goals for this chapter. The evaluator must:

    • manage a separate process for a Python interpreter,
    • provide a method to send code to the interpreter and return as a string the output of the stdout and stderr.

    The evaluator is a complex topic. Perhaps it is best to subdivide the problem. I thought about it for some time, and came up with the following subproblems:

    • echo script,
    • base64-encoded chunks echo script, and
    • interactive interpreter script.

    Each subproblem is dealt with separately. But the last section shows a final evaluator implementation.

    Subproblem: Echo Script

    An echo script reads input from stdin and prints the exact same text back to stdout.

    EchoScript

    Here is the code for the echoscript.py file.

    import sys
    
    while True:
        for line in sys.stdin:
            sys.stdout.write(line)
            sys.stdout.flush()

    (Note: SIGTERM will terminate a Python process running a forever loop. That is, unless the signal handler is overridden or interrupts are disabled.)

    EchoEvaluator

    The following code is written in an echoevaluator.py file.

    import subprocess
    import sys
    
    class EchoEvaluator:
        def __init__(self, python_executable=None, script_path="echoscript.py"):
            if python_executable is None:
                python_executable = sys.executable
            self.p = subprocess.Popen(
                [python_executable, script_path],
                stdin=subprocess.PIPE,
                stdout=subprocess.PIPE,
                stderr=subprocess.STDOUT,
                text=True,
                encoding="utf-8"
            )
    
        def _print(self, chunk: str):
            self.p.stdin.write(chunk + "\n")
            self.p.stdin.flush()
    
        def echo(self, code):
            self._print(f"{code}")
            return self.p.stdout.readline()
    
        def __del__(self):
            self.p.terminate()

    Example Use

    In [1]: load "echoevaluator.py"
    
    In [2]: # %load "echoevaluator.py"
    In [3]: e = EchoEvaluator()
    In [4]: e.echo("print this")
    Out[4]: 'print this\n'

    Subproblem: Encoded Chunk Echo Script

    An encoded chunk echo script is like an echo script, but the text is divided into base64-encoded chunks.

    Base64 Encoded Chunks

    Here is how to encode a string into base64 and split it into chunks of three letters.

    In [1]: import base64
    In [2]: base64.b64encode("Test string".encode("utf-8")).decode("ascii")
    Out[2]: 'VGVzdCBzdHJpbmc='
    In [3]: encoded = base64.b64encode("print this".encode("utf-8")).decode("ascii")
    In [4]: chunks = [
       ...:     encoded[i: i + 3]
       ...:     for i in range(0, len(encoded), 3)
       ...: ]
    In [5]: chunks
    Out[5]: ['cHJ', 'pbn', 'Qgd', 'Ghp', 'cw=', '=']
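    The inverse operation, which the evaluator will need later to decode output, is a join followed by b64decode. A quick round-trip check:

```python
import base64

# Encode, split into chunks of three characters, then reverse the process.
encoded = base64.b64encode("print this".encode("utf-8")).decode("ascii")
chunks = [encoded[i: i + 3] for i in range(0, len(encoded), 3)]

# Joining the chunks restores the full base64 string,
# and decoding recovers the original text exactly.
assert "".join(chunks) == encoded
assert base64.b64decode("".join(chunks)).decode("utf-8") == "print this"
```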

    ChunkEvaluator

    The ChunkEvaluator class is similar to EchoEvaluator. Except it includes a method to encode the chunks.

    import subprocess
    import sys
    import base64
    
    class ChunkEvaluator:
        def __init__(self, python_executable=None, script_path="chunkscript.py"):
            if python_executable is None:
                python_executable = sys.executable
            self.p = subprocess.Popen(
                [python_executable, script_path],
                stdin=subprocess.PIPE,
                stdout=subprocess.PIPE,
                stderr=subprocess.STDOUT,
                text=True,
                encoding="utf-8"
            )
    
        def _print(self, chunk: str):
            self.p.stdin.write(chunk + "\n")
            self.p.stdin.flush()
    
        def _chunk_encode(self, code, size=128):
            e = base64.b64encode(code.encode("utf-8")).decode("ascii")
            chunks = [
                e[i : i + size]
                for i in range(0, len(e), size)
            ]
            return chunks
    
        def echo(self, code):
            chunks = self._chunk_encode(code, size=3)
            self._print(f"chunks {len(chunks)}")
            for c in chunks:
                self._print(f"{c}")
            o = []
            for c in chunks:
                o.append(self.p.stdout.readline())
            return o
    
        def __del__(self):
            self.p.terminate()

    ChunkEvaluator with EchoScript

    First, I copy echoscript.py to chunkscript.py.

    cp echoscript.py chunkscript.py
    ipython
    In [1]: load "chunkevaluator.py"
    
    In [2]: # %load "chunkevaluator.py"
    In [3]: e = ChunkEvaluator()
    In [4]: e.echo("print this")
    Out[4]: ['chunks 6\n', 'cHJ\n', 'pbn\n', 'Qgd\n', 'Ghp\n', 'cw=\n']

    With the EchoScript, I confirm that the first item is the string “chunks 6”. All the other items are base64-encoded chunks, which is correct. Now it is time to write the proper ChunkScript.

    ChunkEvaluator with ChunkScript

    The following code is written to the chunkscript.py file (overwriting all content).

    import sys
    
    def read_chunks(num):
        # Echo each chunk line straight back to stdout.
        for i in range(num):
            line = sys.stdin.readline()
            sys.stdout.write(line)
            sys.stdout.flush()
    
    while True:
        line = sys.stdin.readline()
        keyword, value = line.split()
        read_chunks(int(value))

    In [1]: load "chunkevaluator.py"
    
    In [2]: # %load "chunkevaluator.py"
    In [3]: e = ChunkEvaluator()
    In [4]: e.echo("print this")
    Out[4]: ['cHJ\n', 'pbn\n', 'Qgd\n', 'Ghp\n', 'cw=\n', '=\n']

    Subproblem: Interactive Interpreter

    The next subproblem to tackle is the InteractiveInterpreter, a class defined by the Python code module.

    What is the Interactive Interpreter?

    The Python code module defines a class named InteractiveInterpreter. It is used to implement read-eval-print loops in Python. You can use it to build an interactive REPL, exactly what is needed for the evaluator.

    Here is what help(code.InteractiveInterpreter) says.

    class InteractiveInterpreter(builtins.object)
     |  InteractiveInterpreter(locals=None)
     |
     |  Base class for InteractiveConsole.
     |
     |  This class deals with parsing and interpreter state (the user's
     |  namespace); it doesn't deal with input buffering or prompting or
     |  input file naming (the filename is always passed in explicitly).

    Method runsource takes source code as input and evaluates / executes it.

     |  runsource(self, source, filename='<input>', symbol='single')
     |      Compile and run some source in the interpreter.
     |
     |      Arguments are as for compile_command().
     |
     |      One of several things can happen:
     |
     |      1) The input is incorrect; compile_command() raised an
     |      exception (SyntaxError or OverflowError).  A syntax traceback
     |      will be printed by calling the showsyntaxerror() method.
     |
     |      2) The input is incomplete, and more input is required;
     |      compile_command() returned None.  Nothing happens.
     |
     |      3) The input is complete; compile_command() returned a code
     |      object.  The code is executed by calling self.runcode() (which
     |      also handles run-time exceptions, except for SystemExit).
     |
     |      The return value is True in case 2, False in the other cases (unless
     |      an exception is raised).  The return value can be used to
     |      decide whether to use sys.ps1 or sys.ps2 to prompt the next
     |      line.

    Runsource Output Examples

    In [1]: import code
    In [2]: ii = code.InteractiveInterpreter()

    Case 1: incorrect input

    In [3]: ii.runsource("int(\"hello\")")
    ---------------------------------------------------------------------------
    ValueError                                Traceback (most recent call last)
    File /usr/lib/python3.12/code.py:90, in InteractiveInterpreter.runcode(self, code)
         78 """Execute a code object.
         79 
         80 When an exception occurs, self.showtraceback() is called to
       (...)     87 
         88 """
         89 try:
    ---> 90     exec(code, self.locals)
         91 except SystemExit:
         92     raise
    
    File <input>:1
    
    ValueError: invalid literal for int() with base 10: 'hello'
    Out[3]: False

    Case 2: correct but incomplete input.

    In [4]: ii.runsource("print(")
    Out[4]: True

    Note the return value is True. Nothing happened. Sending more code does not complete the input.

    In [5]: ii.runsource("\"hello\")")
      File <input>:1
        "hello")
               ^
    SyntaxError: unmatched ')'
    
    Out[5]: False

    Case 3: correct and complete input.

    In [6]: ii.runsource("print(\"hello\")")
    hello
    Out[6]: False
    In [7]: ii.runsource("print")
    Out[7]: <function print(*args, sep=' ', end='\n', file=None, flush=False)>
    Out[7]: False

    Defining a variable:

    In [10]: ii.runsource("x = 12")
    Out[10]: False
    In [11]: ii.runsource("print(f\"X: {x}\")")
    X: 12
    Out[11]: False
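    The variable persists because the interpreter keeps one namespace across runsource calls, exposed as the locals attribute. A small check of that persistence:

```python
import code

ii = code.InteractiveInterpreter()
ii.runsource("x = 12")
# The namespace survives between calls; this is what makes
# the evaluator "persistent" from the LLM's point of view.
assert ii.locals["x"] == 12
ii.runsource("x = x + 1")
assert ii.locals["x"] == 13
```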

    The Runsource Symbol Argument

    Method runsource accepts one more argument which I did not mention so far. That argument is called symbol and it takes one of three values:

    • ‘single’,
    • ‘exec’, or
    • ‘eval’.

    Perhaps it is best to see some examples to show how to use the argument.

    Function Call Examples

    In [1]: import code
    
    In [2]: ii = code.InteractiveInterpreter()
    
    In [3]: multi = """
       ...: def hello():
       ...:     print("Hello World")
       ...:     return 10
       ...: hello()
       ...: """
    
    In [4]: single = "hello()"
    
    In [5]: ii.runsource(multi, symbol='exec')
    Hello World
    Out[5]: False
    
    In [6]: ii.runsource(single, symbol='exec')
    Hello World
    Out[6]: False
    
    In [7]: ii.runsource(single, symbol='eval')
    Hello World
    Out[7]: False
    
    In [8]: ii.runsource(single, symbol='single')
    Hello World
    Out[8]: 10
    Out[8]: False

    Only ‘single’ returned the result. Note that calling runsource with multi as the source argument is only error-free with the symbol argument set to exec.

    Symbol value ‘exec’ means the source code input is treated like a Python script. It can contain definitions and multi-line blocks. But it cannot return a result.

    Symbol value ‘eval’ means the source code input is treated as exactly one Python expression. It cannot contain multiple expressions.
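    The docstring for runsource says its arguments are as for compile_command(), so the three symbol values can be sketched with codeop.compile_command directly:

```python
from codeop import compile_command

# 'exec': statements and multi-line blocks compile fine.
assert compile_command("x = 1", symbol="exec") is not None

# Incomplete input returns None, signalling "send more lines".
assert compile_command("def f():", symbol="exec") is None

# 'eval': exactly one expression.
assert compile_command("1 + 1", symbol="eval") is not None
```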

    Unexpected Behavior: Backslashes

    Writing source strings in the IPython REPL can also produce unexpected errors.

    In [1]: import code
    
    In [2]: ii = code.InteractiveInterpreter()
    
    In [3]: source = """
       ...: print("Hello \n World!")
       ...: """
    
    In [4]: ii.runsource(source)
      File <input>:2
        print("Hello
              ^
    SyntaxError: unterminated string literal (detected at line 2)
    
    Out[4]: False

    Backslashes are the most common issue because Python treats them as escape characters. The correct version is here.

    In [5]: source = r"""print("Hello \n World!")"""
    
    In [6]: ii.runsource(source)
    Hello 
     World!
    Out[6]: False
    
    In [7]: source = r"""
       ...: print("Hello \n World!")
       ...: """
    
    In [8]: ii.runsource(source)
    Hello 
     World!
    Out[8]: False
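    To see why the raw string helps, compare what the two literals actually contain:

```python
# Regular triple-quoted string: \n is an escape, so the source handed
# to runsource contains a real newline in the middle of the string literal.
s1 = """print("Hello \n World!")"""

# Raw string: the backslash survives, so runsource sees \n as an escape
# inside the inner string literal, which is valid Python.
s2 = r"""print("Hello \n World!")"""

assert "\\n" not in s1  # no literal backslash-n, just a newline character
assert "\\n" in s2      # backslash + n preserved
```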

    Implementing the Evaluator and ReplScript

    I take the ideas shown in the subproblems, and merge them into the evaluator code and the replscript code.

    The Evaluator

    import base64
    import subprocess
    import sys
    
    class Evaluator():
        def __init__(self, python_executable=None, script_path="replscript.py"):
            if python_executable is None:
                python_executable = sys.executable
            self.p = subprocess.Popen(
                [python_executable, script_path],
                stdin=subprocess.PIPE,
                stdout=subprocess.PIPE,
                stderr=subprocess.STDOUT,
                text=True,
                encoding="utf-8"
            )
    
        def _print(self, chunk: str):
            self.p.stdin.write(chunk + "\n")
            self.p.stdin.flush()
    
        def _input(self):
            return self.p.stdout.readline().strip()
    
        def _chunk_encode(self, code, size=128):
            e = base64.b64encode(code.encode("utf-8")).decode("ascii")
            chunks = [
                e[i : i + size]
                for i in range(0, len(e), size)
            ]
            return chunks
    
        def _chunk_decode(self, chunks):
            b64_data = "".join(chunks)
            decoded = base64.b64decode(b64_data.encode("utf-8"))
            return decoded.decode("utf-8")
    
        def runsource_exec(self, code):
            self._print("symbol exec")
            return self._runsource(code)
    
        def runsource_single(self, code):
            self._print("symbol single")
            return self._runsource(code)
    
        def _runsource(self, code):
            chunks = self._chunk_encode(code)
            self._print(f"chunks {len(chunks)}")
            for c in chunks:
                self._print(f"{c}")
            o = []
            keyword, value = self._input().split()
            for i in range(int(value)):
                o.append(self.p.stdout.readline().strip())
            return self._chunk_decode(o)
    
        def __del__(self):
            self.p.terminate()

    The ReplScript

    import sys
    import base64
    import io
    import code
    import inspect
    import re
    
    from contextlib import redirect_stdout, redirect_stderr
    
    ANSI_RE = re.compile(r'\x1b\[[0-?]*[ -/]*[@-~]')
    
    def run_sources_captured(ii, source, symbol):
        out = io.StringIO()
        err = io.StringIO()
        res = io.StringIO()
    
        # Custom displayhook to capture expression results
        def custom_displayhook(value):
            if value is not None:
                if callable(value):
                    try:
                        sig = inspect.signature(value)
                        print(f"<function {value.__name__}{sig}>", file=res)
                    except (ValueError, TypeError):
                        print(repr(value), file=res)
                else:
                    print(repr(value), file=res)
    
        old_displayhook = sys.displayhook
        sys.displayhook = custom_displayhook
    
        try:
            with redirect_stdout(out), redirect_stderr(err):
                more = ii.runsource(source, symbol=symbol)
                if more:
                    res.write("[incomplete input]\n")
        finally:
            sys.displayhook = old_displayhook
    
        output = out.getvalue() + err.getvalue() + res.getvalue()
        return ANSI_RE.sub('', output)
    
    def chunk_encode(code, size=128):
        e = base64.b64encode(code.encode("utf-8")).decode("ascii")
        chunks = [
            e[i : i + size]
            for i in range(0, len(e), size)
        ]
        return chunks
    
    def chunk_decode(chunks):
        b64_data = "".join(chunks)
        decoded = base64.b64decode(b64_data.encode("utf-8"))
        return decoded.decode("utf-8")
    
    def read_chunks(num):
        chunks = []
        for i in range(num):
            line = sys.stdin.readline()
            chunks.append(line.strip())
        return chunks
    
    def write_chunks(chunks):
        sys.stdout.write(f"chunks {len(chunks)}" + "\n")
        for c in chunks:
            sys.stdout.write(c + "\n")
        sys.stdout.flush()
    
    ii = code.InteractiveInterpreter()
    
    while True:
        line = sys.stdin.readline()
        keyword, value = line.split()
        symbol = value
    
        line = sys.stdin.readline()
        keyword, value = line.split()
        num_chunks = int(value)
    
        chunks = read_chunks(num_chunks)
        decoded = chunk_decode(chunks)
    
        if symbol == "single":
            output = run_sources_captured(ii, decoded, symbol)
        else:
            output = run_sources_captured(ii, decoded, 'exec')
    
        chunks = chunk_encode(output)
        write_chunks(chunks)
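    The capture logic in run_sources_captured can be exercised in isolation. A minimal check that redirect_stdout catches what runsource prints:

```python
import code
import io
from contextlib import redirect_stdout

ii = code.InteractiveInterpreter()
buf = io.StringIO()
with redirect_stdout(buf):
    ii.runsource("print(2 + 2)")

# The print output went to the buffer, not the terminal.
assert buf.getvalue() == "4\n"
```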

    Interactive Use

    In [1]: import evaluator
    
    In [2]: e = evaluator.Evaluator()
    
    In [3]: e.runsource_single("print(\"hello world\")")
    Out[3]: 'hello world\n'
  • Defining Tools with a Toolkit

    In the previous chapter, I have shown a Python program to manually exchange messages with an LLM API. I introduced a data structure that contains inputs and outputs to interact with the LLM.

    In this chapter, I introduce the Toolkit component. The Toolkit contains definitions for function tools for the LLM API. But it does not explicitly perform a tool call (that will be done in later chapters).

    Here are my two goals for this chapter. The Toolkit, that I implement, must:

    • define a Python REPL tool call for an LLM API,
    • export all tools as a list that is ready to be sent to an LLM API.

    Implementing the Toolkit

    I will step back for a moment and consider how LLMs use tools. It is useful to keep that in mind while implementing the toolkit.

    How do LLMs Use Tools?

    Recall that the LLM API accepts a tools field in the JSON input data. When you provide tools to the API, the server constructs a special system prompt. The prompt is designed to instruct the model to use the specified tool(s).

    For example, here is the Claude API example prompt that is constructed for tool use.

    In this environment you have access to a set of tools you can use to answer the
    user's question.
    {{ FORMATTING INSTRUCTIONS }}
    String and scalar parameters should be specified as is, while lists and objects
    should use JSON format. Note that spaces for string values are not stripped.
    The output is not expected to be valid XML and is parsed with regular
    expressions.
    Here are the functions available in JSONSchema format:
    {{ TOOL DEFINITIONS IN JSON SCHEMA }}
    {{ USER SYSTEM PROMPT }}
    {{ TOOL CONFIGURATION }}

    Setup

    mkdir infer_tk && cd infer_tk
    python3 -m venv venv
    source venv/bin/activate
    pip3 install ipython
    ipython

    Starting with the Class

    The basic Toolkit I will implement will not be a function, but a class. This is because it has state. (Functions can carry state too, via closures, but that is not on the agenda here.)

    To store state, the Toolkit class keeps a variable named table.

    In [1]: class Toolkit:
       ...:     def __init__(self):
       ...:         self.table = {}
       ...: 
    
    In [2]: tk = Toolkit()
    
    In [3]: tk.table
    Out[3]: {}
    

    Variable table is a Python dictionary. Inside it will be the tool definitions for the LLM API. But first, I have to recall the schemas for those definitions.

    API Tools Input Schema

    Recall the tools schema for the OpenAI API.

    {
        ...
        "tools": [ properties ...]
        ...
    }

    The properties schema contains a definition of one function tool.

    {
        "name": string,
        "type": "function",
        "description": string,
        "parameters": parameters
    }

    The parameters schema contains the definitions for all arguments.

    {
        "type": "object",
        "properties": {
            arg: {
                "type": string,
                "description": string
            }, ...
        },
        "required": [ strings ... ]
    }

    The objects placed in the toolkit table shall follow the schemas for:

    • properties, and
    • parameters.
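    Here is a hypothetical tool (name and fields invented for illustration) that follows both schemas:

```python
# Hypothetical example entry following the properties and parameters schemas.
tool = {
    "name": "get_time",                      # invented tool name
    "type": "function",
    "description": "Return the current UTC time.",
    "parameters": {
        "type": "object",
        "properties": {
            "tz": {
                "type": "string",
                "description": "IANA timezone name."
            }
        },
        "required": []
    }
}

assert tool["type"] == "function"
assert tool["parameters"]["type"] == "object"
```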

    Defining Tools

    I wish for a method to add new tools to the Toolkit.

    In [5]: class Toolkit:
       ...:     def __init__(self):
       ...:         self.table = {}
       ...:     def deftool(self, name, description, parameters):
       ...:         if name in self.table:
       ...:             raise ValueError(f"Tool '{name}' already defined.")
       ...:         self.table[name] = {
       ...:             "name": name,
       ...:             "type": "function",
       ...:             "description": description,
       ...:             "parameters": parameters
       ...:         }
       ...: 
    In [6]: tk = Toolkit()
    In [7]: tk.deftool("name0", "desc0", {})
    In [8]: tk.table
    Out[8]: 
    {'name0': {'name': 'name0',
      'type': 'function',
      'description': 'desc0',
      'parameters': {}}}

    Note that In [7] printed no Out line: deftool returns None. It seems better to build the entry in a local variable, store it in the dictionary, and return it.

    In [11]: class Toolkit:
        ...:     def __init__(self):
        ...:         self.table = {}
        ...:     def deftool(self, name, description, parameters):
        ...:         if name in self.table:
        ...:             raise ValueError(f"Tool '{name}' already defined.")
        ...:         r = {
        ...:             "name": name,
        ...:             "type": "function",
        ...:             "description": description,
        ...:             "parameters": parameters
        ...:         }
        ...:         self.table[name] = r
        ...:         return r
        ...: 
    In [12]: tk = Toolkit()
    In [13]: tk.deftool("name0", "desc0", {})
    Out[13]: 
    {'name': 'name0', 
     'type': 'function',  
     'description': 'desc0', 
     'parameters': {}}

    That worked.

    Defining an Evaluator Tool

    The requirement is an evaluator tool. A Python REPL for the LLM.

    The function tool that I define here (named py_repl_runsource) is not the function tool used in the final implementation of Toolkit. The reason is that there are two function tools defined and printing and typing all of that is bothersome. I only define py_repl_runsource to illustrate how it is done.

    In [21]: def add_py_repl_tools(toolkit):
        ...:     p = {
        ...:         "type": "object",
        ...:         "properties": {
        ...:             "code": {
        ...:                 "type": "string",
        ...:                 "description": "Python code to execute."
        ...:             }
        ...:         },
        ...:         "required": ["code"]
        ...:     }
        ...:     d = (
        ...:         "Execute Python code script in a persistent environment. "
        ...:         "You must explicitly print evaluation results. "
        ...:         "Returns the stdout and stderr output as one string. "
        ...:     )
        ...:     return toolkit.deftool("py_runsource_exec", d, p)
        ...: 
    In [22]: tk = Toolkit()
    In [23]: add_py_repl_tools(tk)
    Out[23]: 
    {'name': 'py_runsource_exec',
     'type': 'function',
     'description': 'Execute Python code script in a persistent environment. 
    You must explicitly print evaluation results. Returns the stdout and stderr
    output as one string.',
     'parameters': {'type': 'object',
      'properties': {'code': {'type': 'string',
        'description': 'Python code to execute.'}},
      'required': ['code']}}

    Converting to Tools

    The second requirement is that the tools are exported as a list.

    This means that the tool dictionary, inside the Toolkit, must be converted to a list.

    In [24]: class Toolkit:
        ...:     def __init__(self):
        ...:         self.table = {}
        ...:     def deftool(self, name, description, parameters):
        ...:         if name in self.table:
        ...:             raise ValueError(f"Tool '{name}' already defined.")
        ...:         r = {
        ...:             "name": name,
        ...:             "type": "function",
        ...:             "description": description,
        ...:             "parameters": parameters
        ...:         }
        ...:         self.table[name] = r
        ...:         return r
        ...:     def tools(self):
        ...:         return [x for x in self.table.values()]
        ...:     def match(self, name):
        ...:         if name in self.table:
        ...:             return name
        ...:         else:
        ...:             return False
        ...: 
    In [25]: tk = Toolkit()
    In [26]: add_py_repl_tools(tk)
    Out[26]: 
    {'name': 'py_runsource_exec',
     'type': 'function',
     'description': 'Execute Python code script in a persistent environment. 
    You must explicitly print evaluation results. Returns the stdout and stderr
    output as one string.',
     'parameters': {'type': 'object',
      'properties': {'code': {'type': 'string',
        'description': 'Python code to execute.'}},
      'required': ['code']}}
    In [27]: tk.tools()
    Out[27]: 
    [{'name': 'py_runsource_exec',
      'type': 'function',
      'description': 'Execute Python code script in a persistent environment. 
     You must explicitly print evaluation results. Returns the stdout and stderr
     output as one string.',
      'parameters': {'type': 'object',
       'properties': {'code': {'type': 'string',
         'description': 'Python code to execute.'}},
       'required': ['code']}}]
    In [28]: tk.match("py_runsource_exec")
    Out[28]: 'py_runsource_exec'
    In [29]: tk.match("must be False")
    Out[29]: False

    Toolkit Code

    class Toolkit:
        def __init__(self):
            self.table = {}
    
        def deftool(self, name, description, parameters):
            if name in self.table:
                raise ValueError(f"Tool '{name}' already defined.")
            r = {
                "name": name,
                "type": "function",
                "description": description,
                "parameters": parameters
            }
            self.table[name] = r
            return r
    
        def tools(self):
            return [x for x in self.table.values()]
    
        def match(self, name):
            if name in self.table:
                return name
            else:
                return False
    
    def add_py_repl_tools(toolkit):
        p1 = {
            "type": "object",
            "properties": {
                "code": {
                    "type": "string",
                    "description": "Python code to execute."
                }
            },
            "required": ["code"]
        }
        d1 = (
            "Execute Python code script in a persistent environment. "
            "You must explicitly print evaluation results. "
            "Returns the stdout and stderr output as one string. "
        )
        o = [
            toolkit.deftool("py_runsource_exec", d1, p1)
        ]
        return o
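    To connect this back to the previous chapter: the list returned by tools() is what goes into the "tools" field of the request body. A sketch with a hand-written stand-in for the exported list:

```python
import json

# Stand-in for tk.tools(): one function-tool definition, as exported above.
tools = [{
    "name": "py_runsource_exec",
    "type": "function",
    "description": "Execute Python code script in a persistent environment.",
    "parameters": {
        "type": "object",
        "properties": {
            "code": {"type": "string", "description": "Python code to execute."}
        },
        "required": ["code"],
    },
}]

payload = {"model": "gpt-4.1", "input": [], "tools": tools}
body = json.dumps(payload)
assert '"py_runsource_exec"' in body
```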
  • Manual Chat Program with a Circular Context

    In the previous chapter, I have shown how to exchange data with an LLM API. I used CURL as the HTTP client.

    In this chapter, I replace CURL with a Python program. That does not mean that everything is automated. No, I still manually manage the messages, but I introduce a data structure that contains inputs and outputs to interact with the LLM. I name the data structure a circular context, and base it on a circular buffer.

    This chapter has three sections:

    • Limiting the context – a circular buffer limits the number of messages,
    • Python implementation – the program code in Python,
    • Interactive use – an IPython session to show usage.

    Limiting the Context

    Problem Definition

    Loosely speaking, an LLM takes a text sequence as input and returns another text sequence that completes it. LLMs are trained to do so by modelling the probability distribution of a text given some prior text. In other words, an LLM can be thought of as a predict function that takes text of size N and returns text of size N + k, such that N + k is less than the LLM context size M, which is defined by the API model.

    LLM Context size M
    Index 0 1 2 3 4 5 6 7 8 9 ... N ... N + k < M
    Value ? ? ? ? ? ? ? ? ? ? ... ? ... ?

    Circular Buffer Definition

    To limit the number of messages, and thus to never reach the LLM context size, I use a circular buffer data structure. For simplicity, I do not count the number of tokens.

    A circular buffer (CB), limited to k items, is either:

    • the empty CB (of size n = 0),
    • a CB of size n < k, formed by adding a new item to the front of a CB of size n - 1 < k, or
    • a CB of size n = k, formed by adding a new item to the front of a CB of size n - 1, which is formed by removing an item from the back of a CB of size n = k.

    How do you determine k? Randomly. I have not thought of a heuristic. So, I randomly picked 19, which is the 8th prime number, as the default value.
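    For comparison, the standard library's collections.deque with maxlen has the same overwrite-oldest behavior; a sketch with k = 3 for brevity:

```python
from collections import deque

cb = deque(maxlen=3)  # k = 3
for i in range(5):
    cb.append(i)      # once full, each append drops the oldest item

assert list(cb) == [2, 3, 4]  # items 0 and 1 were pushed out
```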

    Python Implementation

    Code Overview

    There are four concepts I use in the code implementation:

    • circular buffer,
    • circular context,
    • context, and
    • predict.

    Circular buffer is the data structure defined in the previous section. It stores LLM specific input and output objects. It is the essential part of the circular context.

    Circular context is a data structure that hides the circular buffer. It defines methods to push new LLM-specific objects, a clear() method to remove all objects, and a to_list() method.

    Context is a list data structure. The only difference is, it stores specific LLM API objects. These objects are the very same objects I have shown in the previous chapter to interact with the LLM. Namely: EasyInputMessage and ResponseOutputText.

    CircularBuffer <--> CircularContext <--> Context <--> LLM API
        Class                Class            List         JSON

    Lastly, predict is the main function that takes a context as input and returns the output of the LLM. It does not (to this end) return a new context.

    Setup

    mkdir llm_api_prog && cd llm_api_prog
    python3 -m venv venv
    source venv/bin/activate
    pip install requests ipython

    I will write all code into a single file circularcontext.py.

    Dependencies

    Because the LLM API uses JSON and HTTP, you need:

    • a JSON package,
    • an HTTP package to send requests and receive responses.

    import json
    import requests
    import os
    
    OPENAI_API_KEY = os.getenv("OPENAI_API_KEY")

    Empty Request

    Recall that the LLM API expects a JSON data object with fields: "model", "input", and "tools".

    def openai_prepare(model, context, tools):
        return {
            "model": model,
            "input": context,
            "tools": tools
        }

    Sending Requests

    def openai_request(model="gpt-4.1", context=[], tools=[]):
        url = "https://api.openai.com/v1/responses"
        headers = { 
            "Authorization": f"Bearer {OPENAI_API_KEY}",
            "Content-Type": "application/json"
        }
        data = openai_prepare(model, context, tools)
        return requests.post(url, headers=headers, json=data)

    Receiving Responses

    def openai_response(response):
        response.raise_for_status()
        data = response.json()
        return data['output']

    Note that better error handling is needed.

    Predict

    def predict(context=[], tools=[]):
        r = openai_response(openai_request(context=context, tools=tools))
        return r

    Circular Buffer

    The implementation of a circular buffer written by an LLM.

    class CircularBuffer:
        def __init__(self, capacity):
            if capacity <= 0:
                raise ValueError("Capacity must be positive")
    
            self.capacity = capacity
            self.buffer = [None] * capacity
            self.head = 0  # points to oldest element
            self.tail = 0  # points to next write position
            self.size = 0
    
        def enqueue(self, item):
            """Add an element to the buffer."""
            self.buffer[self.tail] = item
    
            if self.size == self.capacity:
                # Buffer full → overwrite oldest
                self.head = (self.head + 1) % self.capacity
            else:
                self.size += 1
    
            self.tail = (self.tail + 1) % self.capacity
    
        def dequeue(self):
            """Remove and return the oldest element."""
            if self.size == 0:
                raise IndexError("Dequeue from empty buffer")
    
            item = self.buffer[self.head]
            self.buffer[self.head] = None  # Optional cleanup
            self.head = (self.head + 1) % self.capacity
            self.size -= 1
    
            return item
    
        def peek(self):
            """Return the oldest element without removing it."""
            if self.size == 0:
                raise IndexError("Peek from empty buffer")
            return self.buffer[self.head]
    
        def to_list(self):
            """Return elements as a standard Python list (FIFO order)."""
            result = []
            index = self.head
            for _ in range(self.size):
                result.append(self.buffer[index])
                index = (index + 1) % self.capacity
            return result
    
        def is_empty(self):
            return self.size == 0
    
        def is_full(self):
            return self.size == self.capacity
    
        def __len__(self):
            return self.size
    
        def __repr__(self):
            return f"CircularBuffer({self.to_list()})"
    
        def shallow_clone(self):
            """Return a shallow copy of the circular buffer."""
            cb = CircularBuffer(self.capacity)
            cb.buffer = self.buffer.copy()
            cb.head = self.head
            cb.tail = self.tail
            cb.size = self.size
            return cb
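    To see the overwrite behaviour in action, here is a condensed copy of the class above (only enqueue and to_list) with a short demonstration: enqueuing a fourth item into a capacity-three buffer silently evicts the oldest.

```python
class CircularBuffer:
    """Condensed copy of the class above: enqueue and to_list only."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.buffer = [None] * capacity
        self.head = 0   # oldest element
        self.tail = 0   # next write position
        self.size = 0

    def enqueue(self, item):
        self.buffer[self.tail] = item
        if self.size == self.capacity:
            self.head = (self.head + 1) % self.capacity  # overwrite oldest
        else:
            self.size += 1
        self.tail = (self.tail + 1) % self.capacity

    def to_list(self):
        return [self.buffer[(self.head + i) % self.capacity]
                for i in range(self.size)]

cb = CircularBuffer(3)
for item in ["a", "b", "c", "d"]:  # the fourth enqueue overwrites "a"
    cb.enqueue(item)
print(cb.to_list())  # → ['b', 'c', 'd']
```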

    Circular Context

    A context is a data structure that contains objects which are elements of the input array for the LLM API.

    class CircularContext:
        def __init__(self, capacity=19):
            if capacity <= 0:
                raise ValueError("Capacity must be positive")
    
            self.capacity = capacity
            self.cb = CircularBuffer(self.capacity)
    
        def push_easy_input_message(self, content="", role="user"):
            self.cb.enqueue({"content": content, "role": role, "type": "message"})
    
        def push_function_call_output(self, call_id="", output=""):
            self.cb.enqueue({
                "call_id": call_id,
                "output": output,
                "type": "function_call_output"
                })
    
        def push_custom(self, obj):
            self.cb.enqueue(obj)
    
        def clear(self):
            self.cb = CircularBuffer(self.capacity)
    
        def to_list(self):
            return self.cb.to_list()
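    As an aside, Python's standard library offers the same drop-the-oldest behaviour out of the box: collections.deque with a maxlen argument could serve as a simpler backing store, a possible simplification of the hand-written buffer above.

```python
from collections import deque

# deque(maxlen=n) silently discards the oldest entry once n entries
# are stored, matching CircularBuffer's overwrite-on-full behaviour.
ctx = deque(maxlen=3)
for text in ("Hi!", "Hello!", "Say hi again.", "Hi again!"):
    ctx.append({"content": text, "role": "user", "type": "message"})

print([m["content"] for m in ctx])  # → ['Hello!', 'Say hi again.', 'Hi again!']
```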

    Usage

    Getting Started

    Make sure that:

    • the terminal is in the proper directory,
    • the Python virtual environment is activated,
    • the proper code is in the circularcontext.py file.

    Start an IPython session.

    export OPENAI_API_KEY="your api key..."
    ipython

    Load the code.

    In [1]: load "circularcontext.py"

    Sanity check the OpenAI API key.

    In [3]: OPENAI_API_KEY
    Out[3]: 'your api key...'

    Sanity check an empty request.

    In [4]: openai_prepare("gpt-4.1", [], [])
    Out[4]: {'model': 'gpt-4.1', 'input': [], 'tools': []}

    Sending an Easy Input Message

    In [5]: cc = CircularContext()
    In [6]: cc.push_easy_input_message("Hi!")

    Sanity check a message.

    In [7]: cc.to_list()
    Out[7]: [{'content': 'Hi!', 'role': 'user', 'type': 'message'}]
    In [8]: r = predict(context=cc.to_list())
    In [9]: r
    Out[9]:
    [{'type': 'output_text',
      'annotations': [],
      'logprobs': [],
      'text': 'Hello! How can I help you today? 😊'}]

    Note that the output result is an array.

    Merging Context

    In [10]: for x in r:
                 cc.push_custom(x)
    
    In [11]: cc.push_easy_input_message("Say hi again.")
    

    Sanity check.

    In [12]: cc.to_list()
    Out[12]: 
    [{'content': 'Hi!', 'role': 'user', 'type': 'message'},
     {'id': (omitted),
      'type': 'message',
      'status': 'completed',
      'content': [{'type': 'output_text',
        'annotations': [],
        'logprobs': [],
        'text': 'Hello! How can I help you today? 😊'}],
      'role': 'assistant'},
     {'content': 'Say hi again.', 'role': 'user', 'type': 'message'}]
    In [13]: r = predict(context=cc.to_list())
    
    In [14]: r
    Out[14]: 
    [{'id': (omitted),
      'type': 'message',
      'status': 'completed',
      'content': [{'type': 'output_text',
        'annotations': [],
        'logprobs': [],
        'text': 'Hi again! 👋'}],
      'role': 'assistant'}]
    
  • The Bare Minimum to Chat with Function Calls

    This is a tutorial on using the OpenAI LLM API, focusing on two things: messages and function calls. It does so without Python, TypeScript, or any other programming language. The only requirements are CURL (an HTTP client) and an OpenAI API key.

    Why Bother?

    “Why waste my time, when I can just import an API package?”

    Sure, that works, until you go deeper. What if…

    • you do not have access to / permission for / trust in the API package?
    • you want to avoid software bloat?
    • you want to understand what is happening?
    • you want to make your own AI Agents?

    Before, there was only /chat/completions. Now, there are /responses, function calls, tool calls, computer calls, image calls, search calls, skills, etc.

    I will show the bare minimum to interact with an LLM API:

    • Prompt completions and context, and
    • Function calls.

    Prompt Completions and Context

    In this section, I show messaging an LLM. The provider is OpenAI at:

    https://api.openai.com/v1/responses
    HTTP Method: POST

    Endpoint /responses accepts application/json data. Using CURL, create a POST request with JSON data.

    curl https://api.openai.com/v1/responses \
      -H "Content-Type: application/json" \
      -H "Authorization: Bearer $OPENAI_API_KEY" \
      -d '{ ... json data goes here ... }'

    I wish to clarify two things. Suppose I want to exchange messages with an LLM…

    • What JSON data do I need?
    • May I see an example for exchanging messages?

    What JSON data do I need?

    Input

    Start with an empty JSON object.

    {
      request data go here ...
    }

    Select the model:

    {
      "model": "gpt-4.1"
    }

    Set the "input" field value to an array [ ... ]. (Do not forget the comma after the "model" field.)

    {
      "model": "gpt-4.1",
      "input": [
         array items go here ...
      ]
    }

    The array items will be explained now.

    Input Items

    The API defines many objects you can put in the “input array” [ ... ]. Far too many to list them all. Instead, I show only four. Two object types may be created by your client program:

    • EasyInputMessage, and
    • FunctionCallOutput.

    Two object types may be created by the server:

    • ResponseOutputMessage, and
    • ResponseFunctionToolCall.

    In this section, I show EasyInputMessage and ResponseOutputMessage types. These are enough for prompts with context. In the Function Call section, I will show FunctionCallOutput and ResponseFunctionToolCall types.

    Easy Input Message (Client)

    Your client program sends prompts to the LLM inside an EasyInputMessage. The prompt text goes in the "content" field.

    EasyInputMessage schema:
    { 
      "content": string (this is where your prompt goes),
      "role": "user" | "assistant" | "system" | "developer",
      "type": "message"
    }
    
    Example: 
    {
      "content": "This is a prompt sent to the LLM.",
      "role": "user",
      "type": "message"
    }

    ResponseOutputMessage (Server)

    The LLM answers with a ResponseOutputMessage object type. It is more complex than EasyInputMessage because its "content" field value is an array that may contain two possible object types. The array items are either a ResponseOutputRefusal type (the LLM refused to answer) or a ResponseOutputText type (the LLM answered). I will first show the schemas of these object types, and second, the schema for ResponseOutputMessage.

    ResponseOutputRefusal schema:
    { 
      "refusal": string, 
      "type": "refusal"
    }
    ResponseOutputText schema:
    { 
      "annotations": [ FileCitation | URLCitation | 
                       ContainerFileCitation | FilePath ],
      "logprobs": [ logprobs object ],
      "text": string,
      "type": "output_text"
    }

    The ResponseOutputText schema is non-trivial. The values of the “annotations” and “logprobs” fields are complex. It is best to simply ignore them unless needed.

    With that in mind, here is the schema for ResponseOutputMessage.

    ResponseOutputMessage schema:
    {
      "id": string,
      "content": [ ResponseOutputText | ResponseOutputRefusal ],
      "role": "assistant",
      "status": "in_progress" | "completed" | "incomplete",
      "type": "message" 
    }
    

    To show an example ResponseOutputMessage, I will make an API request and show the response.

    May I see an example for exchanging messages?

    Sending A Single Message

    The prompt is: “Hi!”.

    curl https://api.openai.com/v1/responses \
      -H "Content-Type: application/json" \
      -H "Authorization: Bearer $OPENAI_API_KEY" \
      -d '{
            "model": "gpt-4.1",
            "input": 
            [
              {
                "content": "Hi!", "role": "user", "type": "message"
              }
            ]
          }'

    This is the value of the “output” part of the response.

    ...
      "output": [
        {
      "id": (omitted),
          "type": "message",
          "status": "completed",
          "content": [
            {
              "type": "output_text",
              "annotations": [],
              "logprobs": [],
              "text": "Hello! How can I help you today?"
            }
          ],
          "role": "assistant"
        }
      ],
    ...

    In this simple case, the output is an array of one item that is an object of type ResponseOutputMessage. And that object itself has a “content” field for which the value is an array of one item that is an object of type ResponseOutputText.

    Creating Context

    To continue the LLM conversation, you need to merge the client prompt and the server output response. The merged list is known as a context.

    • Copy the ResponseOutputMessage object from the “output array”.
    • Append a new EasyInputMessage object as the next prompt.

    Make sure to add commas between the items of the “input array” when doing it manually.

    curl https://api.openai.com/v1/responses \
      -H "Content-Type: application/json" \
      -H "Authorization: Bearer $OPENAI_API_KEY" \
      -d '{
            "model": "gpt-4.1",
            "input": 
            [
              {
                "content": "Hi!", "role": "user", "type": "message"
              },
              {
            "id": (omitted),
                "type": "message",
                "status": "completed",
                "content": [
                {
                  "type": "output_text", "annotations": [], "logprobs": [],
                  "text": "Hello! How can I help you today?"
                }
                ],
                "role": "assistant"
              },
              {
                "content": "Say hi again.", "role": "user", "type": "message"
              }
            ]
          }'

    Here is the output.

    ...
      "output": [
        {
          "id": (omitted),
          "type": "message",
          "status": "completed",
          "content": [
            {
              "type": "output_text",
              "annotations": [],
              "logprobs": [],
              "text": "Hi again!"
            }
          ],
          "role": "assistant"
        }
      ],
    ...

    Section Summary

    • The LLM API accepts JSON data as an input and writes JSON data as an output.
    • To exchange messages, a model and an input array must be set.
    • The elements of the input array are JSON objects that follow the EasyInputMessage schema or ResponseOutputMessage schema.

    Function Calls

    Section Overview

    In this section, I show how to exchange messages that are function call ready with an LLM. I will use LLMs by OpenAI, which are available at:

    https://api.openai.com/v1/responses
    HTTP Method: POST

    Endpoint /responses accepts application/json data. Using CURL, create a POST request with JSON data.

    curl https://api.openai.com/v1/responses \
      -H "Content-Type: application/json" \
      -H "Authorization: Bearer $OPENAI_API_KEY" \
      -d '{ ... json data goes here ... }'

    In the previous section, I showed that the request has a required “input” field and the response has an “output” field. The values of these fields are arrays. So far, the only object types in these arrays were EasyInputMessage and ResponseOutputMessage. That will change now.

    Request
    { 
      "model": "gpt-4.1",
      "input": [ ... ]
    }
    
    Response
    { ...
      "output": [ ... ]
      ...
    }

    Two new objects I will show now are:

    • FunctionCallOutput, and
    • ResponseFunctionToolCall.

    I wish to clarify two things. Suppose I want to exchange messages with an LLM and allow it to use some function calls with my client program…

    • What JSON data do I need?
    • May I see an example for exchanging messages with function calls?

    What JSON data do I need?

    To exchange messages that are function call ready, set the “tools” field in the request.

    { 
      "model": "gpt-4.1",
      "input": [ ... ],
      "tools": [ ... ]
    }

    Tool Items

    Each item in the “tools array” is an object { ... }. The API supports several different object types. I will show only one:

    • FunctionTool.

    A FunctionTool object has a name, a description, and parameters, all set by the client program. The name identifies the function, the description explains what the function does, and the parameters describe the function arguments.

    FunctionTool schema:
    {
      "type": "function",
      "name": string,
      "description": string,
      "parameters": object
    }

    Parameters are described in the “properties” field. Each parameter is yet another object.

    FunctionTool Parameters schema:
    {
      "type": "object",
      "properties": object,
      "required": [ strings ]
    }

    The value of the “required” field is an array that contains strings naming parameters that are required.

    FunctionTool Parameters Properties schema:
    {
      argument_name: 
      { 
        "type": argument_type,
        "description": argument_desc
      },
      ...
    }

    The key argument_name is a string that names the function argument. The value argument_type is a string that names the function argument type. The value argument_desc is a string that describes the function argument.
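    Putting the three schemas together: here is what a complete "parameters" value could look like for a hypothetical function with one required and one optional argument. It is sketched as a Python dict for concreteness (the JSON and Python literals coincide here); the function and its arguments are invented for illustration.

```python
# Hypothetical "parameters" value: "base" is required, "verbose" is not.
parameters = {
    "type": "object",
    "properties": {
        "base": {"type": "number", "description": "The base value."},
        "verbose": {"type": "boolean", "description": "Enable extra output."},
    },
    "required": ["base"],
}

print(parameters["required"])  # → ['base']
```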

    Request With Tools

    Now that you have seen the structure of a FunctionTool, here is what an example request that is function call ready looks like:

    {
      "model": "gpt-4.1",
      "input": 
      [
        {
          "content": "Which natural number comes after 1678931?",
          "role": "user", "type": "message"
        }
      ],
      "tools":
      [
        {
          "name": "next_natural",
          "type": "function",
          "description": "next_natural takes as input a natural number.
    Returns the first natural number that is greater than the argument.",
          "parameters": {
            "type": "object",
            "properties": {
              "number" : {
                "type": "number",
                "description": "The input natural number."
              }
            },
            "required": ["number"]
          }
        }
      ]
    }

    The request includes the “tools” field, for which the value is an array with exactly one FunctionTool object. When the request defines a FunctionTool, two things can happen:

    • the FunctionTool may be ignored, or
    • a response to use the FunctionTool may be created.

    Your client program must support both scenarios. It may check the type of the output. If the type is a ResponseOutputMessage, the FunctionTool was ignored. If the type is a ResponseFunctionToolCall, the client must perform the function call.

    In other words, the server returns a response whose “output” field value is an array whose elements are either a:

    • ResponseOutputMessage, or
    • ResponseFunctionToolCall.
    Scenario A:
    
    client --> request: EasyInputMessage and tools                 --> server
    client <--                               ResponseOutputMessage <-- server
    
    Scenario B:
    
    client --> request: EasyInputMessage and tools                 --> server
    client <--                            ResponseFunctionToolCall <-- server
    client --> request: FunctionCallOutput and tools               --> server
    client <--   ResponseOutputMessage or ResponseFunctionToolCall <-- server
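    The branching in the two scenarios above is easy to sketch in code. The following Python fragment (stepping outside this section's CURL-only constraint for a moment) shows the client-side dispatch; next_natural, the TOOLS table, and the fake output item are hypothetical, while the field names follow the schemas in this section.

```python
import json

def next_natural(number):
    # Hypothetical local implementation of the advertised FunctionTool.
    return number + 1

TOOLS = {"next_natural": next_natural}

def dispatch(output_items):
    """Build a FunctionCallOutput for every function call in the output."""
    results = []
    for item in output_items:
        if item["type"] == "function_call":        # Scenario B: run the tool
            args = json.loads(item["arguments"])
            result = TOOLS[item["name"]](**args)
            results.append({
                "call_id": item["call_id"],        # copied from the tool call
                "output": str(result),
                "type": "function_call_output",
            })
        # Scenario A: item["type"] == "message" is the final answer;
        # there is nothing to execute.
    return results

fake_output = [{
    "type": "function_call",
    "name": "next_natural",
    "arguments": "{\"number\": 1678931}",
    "call_id": "call_demo",
}]
print(dispatch(fake_output)[0]["output"])  # → 1678932
```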

    ResponseFunctionToolCall (Server)

    Note that the server creates this object.

    ResponseFunctionToolCall schema:
    { 
      "arguments": string,
      "call_id": string,
      "name": string,
      "type": "function_call",
      "id": string,
      "status": "in_progress" | "completed" | "incomplete"
    }

    FunctionCallOutput (Client)

    Note that the client creates this object. When creating this object, the value of the "call_id" field is copied from the matching ResponseFunctionToolCall object.

    Schema:
    { 
      "call_id": string,
      "output": string | (there is more but I ignore that),
      "type": "function_call_output",
      "id": string (mostly ignore this),
      "status": "in_progress" | "completed" | "incomplete"
    }
    
    Example:
    { 
      "call_id": "call_random123", (generated by server)
      "output": "fizzbuzz",
      "type": "function_call_output",
      "id": "123456",
      "status": "completed"
    }

    May I see an example for exchanging messages with function calls?

    Example FunctionToolCall Request

    curl https://api.openai.com/v1/responses \
      -H "Content-Type: application/json" \
      -H "Authorization: Bearer $OPENAI_API_KEY" \
      -d '{
            "model": "gpt-4.1",
            "input": 
            [
              {
                "content": "Which natural number comes after 1678931?",
                "role": "user", "type": "message"
              }
            ],
            "tools": 
            [
              {
                "name": "next_natural",
                "type": "function",
                "description": "next_natural takes as input a natural number.
    Returns the first natural number that is greater than the argument.",
                "parameters": {
                  "type": "object",
                  "properties": {
                    "number" : {
                      "type": "number",
                      "description": "The input natural number."
                    }
                  },
                  "required": ["number"]
                }
              }
            ]
          }'

    Example FunctionToolCall Response

    ...
      "output": [
        {
          "id": (omitted),
          "type": "function_call",
          "status": "completed",
          "arguments": "{\"number\":1678931}",
          "call_id": (omitted),
          "name": "next_natural"
        }
      ],
    ...

    Example FunctionToolCallOutput Request

    curl https://api.openai.com/v1/responses \
      -H "Content-Type: application/json" \
      -H "Authorization: Bearer $OPENAI_API_KEY" \
      -d '{
            "model": "gpt-4.1",
            "input":
            [
              {
                "content": "Which natural number comes after 1678931?",
                "role": "user", "type": "message"
              },
              {
                "id": (omitted),
                "type": "function_call",
                "status": "completed",
                "arguments": "{\"number\":1678931}",
                "call_id": "call_(same call id)",
                "name": "next_natural"
              },
              {
                "call_id": "call_(same call id)",
                "output": "1678932",
                "type": "function_call_output"
              }
            ],
            "tools":
            [
              {
                "name": "next_natural",
                "type": "function",
                "description": "next_natural takes as input a natural number. 
    Returns the first natural number that is greater than the argument.",
                "parameters":
                {
                  "type": "object",
                  "properties":
                  {
                    "number":
                    {
                      "type": "number",
                      "description": "The input natural number."
                    }
                  },
                  "required": ["number"]
                }
              }
            ]
          }'

    Example FunctionToolCallOutput Response

    ...
    "output": [
        {
          "id": (omitted),
          "type": "message",
          "status": "completed",
          "content": [
            {
              "type": "output_text",
              "annotations": [],
              "logprobs": [],
              "text": "The natural number that comes after 1,678,931 is 1,678,932."
            }
          ],
          "role": "assistant"
        }
      ],
    ...

    Section Summary

    • The LLM API accepts JSON data as an input and writes JSON data as an output.
    • To exchange function call ready messages, a model, an input array, and a tools array must be set.
    • The elements of the input array are JSON objects that follow the EasyInputMessage, ResponseOutputMessage, ResponseFunctionToolCall, or FunctionCallOutput schema.