ISKRA Experiment 07_10_2024-1 - Maze

Hello everyone reading this!

Last time we tested how ChatGPT deals with the task of fighting a stationary enemy in Minecraft. In the long run, that research may benefit the development of intelligent NPCs in games. But unlike an NPC, which usually has full data on everything that happens in the game, the AI was given only what a human Minecraft player can see. It was playing like a human player would.

For the next experiment, I was supposed to continue creating the artificial Minecraft player, but I got a question from someone who read my previous post:
How would AI deal with a maze?

Curiosity and questions first, so let's find out.

But before we start, let me remind you what Project ISKRA is about:
Project ISKRA is my research project meant to see how generative AI can deal with practical tasks and control robots or digital bodies to accomplish objectives. The goal is to do that on default models, without fine-tuning (further training the AI on specific tasks at its internal level) or making AIs from scratch.

This experiment will not involve a robot or a physical body (later iterations, for example with 3D mazes, may tho), but it will still check how AI can deal with the task of solving a maze, and what can be done to make it good at it.

Setup:

Let's first think how to interact with the maze.

Let's try drawing straight lines from point (X_1, Y_1) to point (X_2, Y_2). I created a function which does that.

def draw_line(image, x1, y1, x2, y2):
    d = ImageDraw.Draw(image)
    d.line([(x1, y1), (x2, y2)], fill=(255,0,0), width=4)
    image.save("log/"+str(time.time())+".jpg")
    return image

This function also saves every version of the maze image to the hard drive. Our AI will generate the starting and ending points, which this function will use to draw the line. With this approach, it will solve the maze step by step.

Let's define the AIs. We will use an approach similar to the Minecraft one: a Brain AI, which will be solving the maze, and a Movement AI, which will extract the drawing commands, because it is just easier to use GPT-3.5-turbo for that than to write the text extraction from scratch.

Brain and Movement are the names of the AIs which I used in the Minecraft experiment. I will stick to them, even though Movement's task is not related to moving anything now.

Let's now define the AI agents. I will use my own interface for interacting with the OpenAI API.

Here is the code which defines them and also contains the first versions of their SYSTEM messages (a SYSTEM message is text which determines how the AI is supposed to behave):

system_brain = '''You will recieve a picture of a maze. Solve it by getting from point A to B.
You can draw a straight red line on the maze image by providing start and end X and Y coordinates in pixels. Separate the values with |.
For example, to draw a line from point (100,200) to point (150,300) write:
100|200|150|300
Draw new lines one by one to solve the maze. Do only one step at a time.'''

system_movement = '''You will recieve a text, which will contain a string which determines how to draw a line.
For example, command to draw a line from point (100,200) to point (150,300) is:
100|200|150|300
Extract the command from text given to you. Write only the command, never write anything else.
'''

brain = ImgChatGPTAgent(api_key, 'gpt-4o-mini', 500, system_brain)

movement = ChatGPTAgent(api_key, 'gpt-3.5-turbo', 500, system_movement)

I added a little more code to send the image to the AIs, to see if they work. And apparently they do:

Now let's add a loop that draws the line. The new code looks like this:

while True:
    response = brain.GetResponseWithImg(image,"Continue solving the maze")
    commands = movement.GetResponse(response)
    print(time.time())
    print("Response: "+response)
    print("Commands: "+commands)
    command_table = commands.split("|")
    image = draw_line(image, int(command_table[0]), int(command_table[1]), int(command_table[2]), int(command_table[3]))

Let's now start the AI and see what happens.

Tests:

Sadly, after a longer while, we got something like this:

Let's try again with GPT-4o instead of GPT-4o-mini. GPT-4o is a more advanced model.

The result is sadly not good either, tho it seems to make more sense.

Sadly, GPT-3.5-turbo answered with more than it was supposed to, crashing the program. Maybe the AI would eventually have gotten to B, even if not by solving the maze.

"Commands: There seems to be an error in the input. Let's try again.

Move down:

400|520|400|600"
- GPT-3.5-turbo's (Movement's) response
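One way to make the loop tolerant of such chatty replies is to search the reply for the command pattern instead of assuming the whole reply is the command. This is a minimal sketch, assuming the command is always four whole numbers separated by |. The helper extract_line_command is a hypothetical name and was not part of the code used in this experiment:

```python
import re

# Four non-negative integers separated by "|", e.g. "100|200|150|300".
LINE_COMMAND = re.compile(r"(\d+)\s*\|\s*(\d+)\s*\|\s*(\d+)\s*\|\s*(\d+)")

def extract_line_command(text):
    """Return (x1, y1, x2, y2) from the first command found in text, or None."""
    match = LINE_COMMAND.search(text)
    if match is None:
        return None
    return tuple(int(group) for group in match.groups())

# The reply that crashed the program, copied from the log above.
reply = """There seems to be an error in the input. Let's try again.

Move down:

400|520|400|600"""

print(extract_line_command(reply))       # (400, 520, 400, 600)
print(extract_line_command("No idea."))  # None
```

When the function returns None, the loop could simply ask Movement again instead of crashing.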

Let's lower Movement's TEMPERATURE a little and try again. I will set it to 0.8. This decreases randomness, making unlikely tokens (words) less likely to appear. Fun fact: a very high TEMPERATURE makes ChatGPT speak gibberish until the token (word) limit is reached. This is because the end of a message is a token by itself, and when tokens become very random due to high TEMPERATURE, that invisible "I stopped generating text" token may not manage to appear. A lower TEMPERATURE makes the AI respond in a more generic manner, a higher one makes it more random and therefore more "creative" in writing, and too high results in gibberish. If you want the AI to reliably copy text (like in our case now), I recommend low TEMPERATURE settings. Anyway, Movement's definition is now:

movement = ChatGPTAgent(api_key, 'gpt-3.5-turbo', 500, system_movement, 0.8)
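
To illustrate what TEMPERATURE does, here is a minimal sketch of temperature-scaled softmax, the usual way of turning a model's raw scores (logits) into token probabilities. The three logit values are made up for illustration, and how exactly OpenAI's models apply the setting internally is an assumption here:

```python
import math

def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities; lower temperature sharpens them."""
    scaled = [value / temperature for value in logits]
    highest = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(value - highest) for value in scaled]
    total = sum(exps)
    return [value / total for value in exps]

# Hypothetical scores for three candidate tokens (not real model output).
logits = [2.0, 1.0, 0.0]

for t in (0.5, 1.0, 1.8):
    probs = softmax_with_temperature(logits, t)
    print(t, [round(p, 3) for p in probs])
```

With a lower temperature the most likely token takes a larger share of the probability, which is why low values suit copying tasks like Movement's.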

Let's run the code again now.

Sadly, it makes even less sense now:

Let's lower Brain's TEMPERATURE too, let's say to 0.5, and try again. It's done the same way as with Movement.

The result is very similar to when the TEMPERATURE was the default 1.

Let's try now with TEMPERATURE of 1.4. Hopefully it won't end up writing gibberish.

After giving it the same amount of time, the result did not make sense either.

Maybe trying out different TEMPERATUREs turned out pointless in the end, but it was nice to check how that setting affects the program.

My personal conclusion is: it is not capable of pixel-perfect image analysis, which is why it heavily approximates the starting and ending points of the lines.

It was of course never designed to do such tasks either.

Of course, a solution could be fine-tuning (further training of the AI) for solving mazes, but let's try to do as much as we can on the default AI models available.

Can ChatGPT actually determine coordinates of anything in an image?

Let's first see if GPT-4o or GPT-4o-mini are pixel-perfect. To do that, I will first create an image:

This image, which I just drew in Gnome Paint (I am now working on my second, Linux laptop btw), contains two purple squares, one blue point, one green point and one red pixel, which is likely not visible (around the top left corner). The image is 300x300 pixels.

Let's see if ChatGPT can tell the exact positions of these points. Let's first try the cheaper and faster GPT-4o-mini:

GPT-4o-mini sadly cannot determine the exact coordinates of anything in the image.

Let's try GPT-4o:

It seems better at finding the coordinates. For the sake of accuracy, I will repeat the test with GPT-4o-mini, this time sending the image in every message. After all, in the first test I sent the image only once, so it could have gotten lost in the context:

The results did not change, so how often the image was sent was not the reason GPT-4o did better. Apparently, GPT-4o-mini analyzes images at a lower resolution.

But before drawing any conclusions, let's check if what GPT-4o wrote was correct at all. I will use the online tool PixSpy to find the coordinates of the points myself.

First, the red pixel. The AI claims it found it near point (9,8). The actual location is (41, 38).
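
To put a number on that miss, the distance between the reported and the measured position can be computed directly. A small sketch; pixel_error is a hypothetical helper name:

```python
import math

def pixel_error(reported, actual):
    """Euclidean distance in pixels between two (x, y) points."""
    return math.dist(reported, actual)

error = pixel_error((9, 8), (41, 38))
print(round(error, 1))              # 43.9
print(round(error / 300 * 100, 1))  # 14.6 (percent of the 300-pixel image width)
```

An error of about 44 pixels on a 300-pixel image is far too large for drawing a path between maze walls.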


What about the purple squares? They are around (90, 222) and (180, 240) according to the AI.

Sadly, this is too inaccurate as well:

Conclusion: Neither GPT-4o nor GPT-4o-mini can accurately determine pixel coordinates of objects within images, therefore they can't accurately continue drawing a path thru the maze.

What now?

Since the AI is unable to determine coordinates from the image itself and the knowledge of its size, we need a different approach.

I personally see two options for tackling this:

- Add something to the image to help the AI determine coordinates. For example, on a chessboard every square has its own index, which is used to describe movements and locations. In chess we describe a move by saying, for example, "Knight moved from B1 to A3". We could draw squares and their indexes on the maze image to let the AI describe movements with them.

- Another option is, instead of drawing a line, to let the AI move a point in 2D, making movements just UP, DOWN, LEFT or RIGHT plus how many pixels to move in that direction. Even with imperfect location and distance approximation, the AI should be able to do it.
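
For the first option, the translation between square labels and pixel coordinates could look roughly like this. It is only a sketch: the 50-pixel cell size, the letter-for-column and number-for-row labeling, and both function names are assumptions, and nothing like it was used in this experiment:

```python
import string

CELL = 50  # assumed size of one grid square in pixels

def pixel_to_cell(x, y, cell=CELL):
    """Label of the grid square containing pixel (x, y), e.g. 'B1'."""
    column = string.ascii_uppercase[x // cell]  # supports up to 26 columns
    row = y // cell + 1                         # rows are numbered from 1
    return f"{column}{row}"

def cell_to_pixel(label, cell=CELL):
    """Pixel coordinates of the center of the square named by label."""
    column = string.ascii_uppercase.index(label[0].upper())
    row = int(label[1:]) - 1
    return (column * cell + cell // 2, row * cell + cell // 2)

print(pixel_to_cell(230, 100))  # E3
print(cell_to_pixel("E3"))      # (225, 125)
```

Unlike on a chessboard, rows here are counted from the top, because image Y coordinates grow downward.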

The second option will be faster to prepare, so let's try it first.

First, let's create a function to draw the pawn. Let's make the pawn a red square. I also made the function treat the X and Y coordinates as the center of the drawn square, not its top left corner. The function operates on a copy of the original image. Because the original is never overwritten, the pawn will "move" instead of being duplicated in new places.

def draw_square(image, x, y, wall=40):
    new_image = image.copy()
    d = ImageDraw.Draw(new_image)
    c_x = wall//2
    c_y = wall//2
    if x>c_x and y>c_y:
        d.rectangle((x-c_x , y-c_y , x-c_x+wall , y-c_y+wall), fill=(255,0,0), width=1)
    else:
        d.rectangle((x , y , x+wall , y+wall), fill=(255,0,0), width=1)
    new_image.save("log/"+str(time.time())+".jpg")
    return new_image

Both AI agents need new SYSTEM prompts, and the loop has to be modified too. This is how I changed the code:

current_x=230
current_y=100

new_image = draw_square(image, current_x, current_y)
#image.show()

system_brain = '''You will recieve a picture of a maze. Solve it by getting from point A to B.
You control a red square. Solve the maze by moving it from point A to B.
Control the red square by writing direction and then amount of pixels. Available directions are:
W - up/north
S - down/south
A - left/west
D - right/east

Examples:
Move 150 pixels north:
W150
Move 30 pixels left:
A30

Move only once per iteration. Never make multiple movements at once.
'''

system_movement = '''You will recieve a text, which will contain a letter and number, which indicate movement.
Available directions are:
W - up/north
S - down/south
A - left/west
D - right/east

Examples:
Move 150 pixels north:
W150
Move 30 pixels left:
A30

Extract the command from text given to you. Write only the command, never write anything else.
'''

brain = ImgChatGPTAgent(api_key, 'gpt-4o', 500, system_brain, 1.4)

movement = ChatGPTAgent(api_key, 'gpt-3.5-turbo', 500, system_movement, 0.8)

while True:
    response = brain.GetResponseWithImg(new_image,"Continue solving the maze")
    commands = movement.GetResponse(response)
    print(time.time())
    print("Response: "+response)
    print("Commands: "+commands)
    
    command_table = []
    command_table.append(str(commands[0]).lower())
    command_table.append(int(commands[1:]))
    
    if command_table[0]=="w":
        current_y = current_y - command_table[1]
    elif command_table[0]=="s":
        current_y = current_y + command_table[1]
    elif command_table[0]=="a":
        current_x = current_x - command_table[1]
    elif command_table[0]=="d":
        current_x = current_x + command_table[1]
    else:
        print("An error occured.")
        quit()
    new_image = draw_square(image, current_x, current_y)
        
We save the current position in variables, draw the square near "A" on the maze image, and define the SYSTEM messages for the AI agents. Then the loop starts: we send the current state of the puzzle to Brain, and Movement extracts the direction and the number of pixels to move the pawn. Finally, after moving the pawn, the loop returns to sending the AI a new image of the current state of the puzzle.
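
The position update in the loop can also be expressed as a small pure function, which makes it easy to check without calling the API. This is a sketch equivalent to the if/elif chain above; apply_move and DIRECTIONS are hypothetical names:

```python
# Change in (x, y) per pixel moved; image Y grows downward, so "w" (up) is -1.
DIRECTIONS = {"w": (0, -1), "s": (0, 1), "a": (-1, 0), "d": (1, 0)}

def apply_move(x, y, direction, amount):
    """Return the new (x, y) after moving `amount` pixels in `direction`."""
    try:
        dx, dy = DIRECTIONS[direction.lower()]
    except KeyError:
        raise ValueError(f"Unknown direction: {direction!r}")
    return (x + dx * amount, y + dy * amount)

position = (230, 100)  # starting point used in this post
position = apply_move(*position, "S", 50)
position = apply_move(*position, "D", 50)
print(position)  # (280, 150)
```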

On the first run, we got this sequence:

The pawn is approaching point B, tho not in the way we'd expect.

Let's modify Brain's SYSTEM message to make sure it does not cross the black walls, and let's let it write comments on what it's doing so it leaves more context for itself (in this run it would return the movements and nothing more). Lastly, let's set Brain's TEMPERATURE to the default 1, because I forgot to do that and it was still 1.4 in this run.

"You will recieve a picture of a maze. Solve it by getting from point A to B.
You control a red square. Solve the maze by moving it from point A to B.
Control the red square by writing direction and then amount of pixels. Available directions are:
W - up/north
S - down/south
A - left/west
D - right/east

Examples:
Move 150 pixels north:
W150
Move 30 pixels left:
A30

Move only once per iteration. Never make multiple movements at once.
Write your own thoughts and comments, including on where the red square is now and how it should move next.
Never cross black walls of the maze.
Write only "END" once you reach the point B."

I also added the directive to write "END" once the maze is solved. Let's include that in the code too.
I will just add an IF to check for "END" right after Brain's response. That part of the code looks like this:

while True:
    response = brain.GetResponseWithImg(new_image,"Continue solving the maze")
    
    if str(response.strip()).lower()=="end":
        print("DONE")
        quit()
    
    commands = movement.GetResponse(response)
    ...

I added a copy of that IF statement after Movement's "commands" response.

Let's run the AI again now:

It technically solved the maze, but went to point B directly, ignoring the maze itself.
This is the full log of this attempt:

%Run ISKRAlabirynt.py
Type image file name: maze.jpg
1728501065.2808523
Response: The red square is at the starting point A. Let's move it towards point B.

Move 50 pixels down:
S50
Commands: S50
1728501069.1735294
Response: The red square is currently 50 pixels down from the top left corner.

Move 50 pixels right:
D50
Commands: D50
1728501071.7336936
Response: The red square is at the starting point A. Let's move to solve the maze toward point B.

Move 100 pixels down:
S100
Commands: S100
1728501074.598248
Response: The red square is currently positioned near the top left, labeled A.

Move 100 pixels down:
S100
Commands: S100
1728501077.463576
Response: The red square is currently near the middle on the left side. Let's continue navigating through the maze.

Move 50 pixels down:
S50
Commands: S50
1728501080.1276205
Response: The red square is currently a little below the starting point A.

Move 100 pixels down:
S100
Commands: S100
1728501082.8906786
Response: The red square is near the bottom left. Let's move it along the path to point B.

Move 100 pixels right:
D100
Commands: D100
1728501085.8643894
Response: The red square is positioned a bit down the path on the left side. Let’s continue solving.

Move 50 pixels down:
S50
Commands: S50
1728501088.6266737
Response: The red square is on the left side of the maze. Let's move right.

Move 50 pixels right:
D50
Commands: D50
1728501091.595703
Response: The red square is on the left side of the maze.

Move 50 pixels right:
D50
Commands: D50
1728501094.3633986
Response: The red square has moved slightly to the right. Let's continue navigating.

Move 50 pixels down:
S50
Commands: S50
1728501096.9310725
Response: The red square is currently in the middle of the left side. Let's move further.

Move 100 pixels down:
S100
Commands: S100
1728501099.3823495
Response: The red square is currently near the bottom left corner of the maze.

Move 100 pixels right:
D100
Commands: D100
1728501104.0511148
Response: The red square is near the middle on the left side. Let's continue navigating through the maze.

Move 50 pixels up:
W50
Commands: W50
1728501107.1614254
Response: The red square is in the middle section of the maze. Let's move it down.

Move 50 pixels down:
S50
Commands: S50
1728501109.8225186
Response: The red square is currently positioned on the left side of the maze.

Move 50 pixels right:
D50
Commands: D50
1728501112.690546
Response: The red square is close to point B. Let's move toward it.

Move 50 pixels up:
W50
Commands: W50
1728501115.8683639
Response: The red square is close to point B, on the left side of the inner path. Let’s proceed towards B.

Move 50 pixels up:
W50
Commands: W50
1728501118.9399807
Response: The red square is close to the center, near point B. 

Move 50 pixels right:
D50
Commands: D50
1728501123.2421985
Response: The red square is close to the inner section near point B.

Move 50 pixels left:
A50
Commands: A50
1728501125.9731746
Response: The red square is close to the center and needs to reach point B. Let's move down.

Move 50 pixels down:
S50
Commands: S50
1728501128.7706983
Response: The red square is on the left side, near the center. Let's move it toward point B.

Move 50 pixels right:
D50
Commands: D50
1728501131.3271213
Response: The red square is positioned near the central area on the right side.

Move 100 pixels up:
W100
Commands: W100
1728501134.7072434
Response: The red square is close to point B on the inner path. 

Move 100 pixels down:
S100
Commands: S100
1728501137.5772278
Response: The red square is near point B on the right side. Let's move it closer.

Move 50 pixels up:
W50
Commands: W50
1728501140.239705
Response: The red square is close to point B on the right side. Let's move towards B.

Move 50 pixels left:
A50
Commands: A50
1728501143.0052705
Response: The red square is currently near the center on the left side of the maze. Let's move it towards point B.

Move 50 pixels right:
D50
Commands: D50
1728501145.97357
Response: The red square is currently near the middle, just to the right of point B.

Move 50 pixels up:
W50
Commands: W50
1728501151.4021986
Response: The red square is currently close to the center on the left side of the path. Let's move toward point B.

Move 50 pixels right:
D50
Commands: D50
1728501154.6785378
Response: The red square is currently close to the center at point B. The maze is solved! If there's anything else you need, feel free to ask.
Commands: Great job! If you have any more tasks or questions, feel free to ask.
Traceback (most recent call last):
  File "/home/wojtekb30/Pulpit/ISKRAlabirynt/ISKRAlabirynt.py", line 88, in <module>
    command_table.append(int(commands[1:]))
ValueError: invalid literal for int() with base 10: 'reat job! If you have any more tasks or questions, feel free to ask.'

The long number right before each Response in the log is the Unix time at that moment.
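
As a quick illustration, Python's standard library can turn these numbers into a readable date and tell how long the run took. The two values below are the first and last timestamps from the log above:

```python
from datetime import datetime, timezone

first = 1728501065.2808523  # first timestamp in the log above
last = 1728501154.6785378   # last timestamp in the log above

started = datetime.fromtimestamp(first, tz=timezone.utc)
print(started.strftime("%Y-%m-%d %H:%M:%S UTC"))  # 2024-10-09 19:11:05 UTC
print(round(last - first, 1))                     # 89.4 (seconds for the whole run)
```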

As we can see, the AI technically did the job. Tho it ignored the maze itself, and instead of writing END to finish the program it wrote something else, which the command extractor answered with chatter of its own, crashing the code.
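
A stricter parser would let the program stop cleanly instead of crashing when the extractor returns chatter. This is a minimal sketch, assuming commands always look like one of W, S, A or D followed directly by digits. The helper parse_move is a hypothetical name, not the code that produced the log above:

```python
import re

# One direction letter directly followed by a pixel count, e.g. "W150" or "a30".
MOVE_COMMAND = re.compile(r"\b([WSAD])(\d+)\b", re.IGNORECASE)

def parse_move(text):
    """Return 'end', a (direction, pixels) tuple, or None if nothing matches."""
    if text.strip().lower() == "end":
        return "end"
    match = MOVE_COMMAND.search(text)
    if match is None:
        return None
    return (match.group(1).lower(), int(match.group(2)))

print(parse_move("Move 50 pixels down:\nS50"))  # ('s', 50)
print(parse_move("Great job! If you have any more tasks or questions, feel free to ask."))  # None
print(parse_move("END"))  # end
```

A None result can then be handled explicitly, for example by asking the AI again or by ending the run with a clear message.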

Let's change Brain's SYSTEM again:

"You will recieve a picture of a maze. Solve it by getting from point A to B.
You control a red square. Solve the maze by moving it from point A to B.
Control the red square by writing direction and then amount of pixels. Available directions are:
W - up/north
S - down/south
A - left/west
D - right/east

Examples:
Move 150 pixels north:
W150
Move 30 pixels left:
A30

Move only once per iteration. Never make multiple movements at once.
Write your own thoughts and comments.
Go thru the maze and not directly to point B. Never move in a way that would cross walls of the maze.
Write "END" and nothing more once point B is reached."

I wrote the commands in a different way and removed the specific instructions on what "thoughts and comments" to write, to make them more diverse.

Let's run this:

The behavior was similar, tho now it went much more quickly. The descriptions which were meant for context became even more meaningless: instead of comments on where the square is now, the AI would only write "Let's continue solving the maze".

I decided to give GPT-4o-mini a second chance, changing the AI model for one run. It got completely lost tho. I see no point in providing a GIF of this attempt, it was too hopeless.

Let's return to GPT-4o and remove the mention of the maze's walls from the SYSTEM message, to make sure it is not interpreted wrongly. Removing "Never move in a way that would cross walls of the maze." will be enough.

Not much has changed:
It went quickly and the program properly ended with Brain returning END.

Let's try a more advanced AI agent approach and add yet another AI named Helper, which will tell Brain how to move so that it does not hit any wall and actually moves thru the maze.

We will also tell both of them the current position of the square, so they know it from text and not only from visual data.
This is how I modified the code to add the new AI and the new SYSTEM messages, constantly updated with the current position of the square:

current_x=230
current_y=100

new_image = draw_square(image, current_x, current_y)
#image.show()

system_brain = '''You will recieve a picture of a maze. Solve it by getting from point A to B.
You control a red square. Solve the maze by moving it from point A to B.
Control the red square by writing direction and then amount of pixels. Available directions are:
W - up/north
S - down/south
A - left/west
D - right/east

Examples:
Move 150 pixels north:
W150
Move 30 pixels left:
A30

Move only once per iteration. Never make multiple movements at once.
Write your own thoughts and comments.
Go thru the maze and not directly to point B. 
Write "END" and nothing more once point B is reached.
Current position of the red square:
'''

system_movement = '''You will recieve a text, which will contain a letter and number, which indicate movement.
Available directions are:
W - up/north
S - down/south
A - left/west
D - right/east

Examples:
Move 150 pixels north:
W150
Move 30 pixels left:
A30

Extract the command from text given to you. Write only the command, never write anything else.
'''

system_helper = '''You will recieve a picture of a maze. Help solve it.
The goal is to move the red square from point A to point B thru the maze.
The square moves only once per iteration.
The red square must never move thru a wall.
Write recommendations of only 1 next movement.
Current position of the red square:
'''

brain = ImgChatGPTAgent(api_key, 'gpt-4o', 500, system_brain, 1)

movement = ChatGPTAgent(api_key, 'gpt-3.5-turbo', 500, system_movement, 0.8)

helper = ImgChatGPTAgent(api_key, 'gpt-4o', 255, system_helper, 1)

while True:
    
    updated_system_brain = system_brain + " X: " + str(current_x) + " Y: " + str(current_y)
    updated_system_helper = system_helper + " X: " + str(current_x) + " Y: " + str(current_y)
    
    helper.ChangeSystemMessage(updated_system_helper)
    tip = helper.GetResponseWithImg(new_image, "What should I do now to continue solving the maze? How to move to not move thru walls?")
    print("Tip: "+tip)
    
    brain.ChangeSystemMessage(updated_system_brain)
    response = brain.GetResponseWithImg(new_image, tip)


Nothing changed in the path, since the Helper AI considers the same wrong path correct.

What I noticed is that the AI always goes down first and then gets confused. Let's modify the maze so the red square will already be in a tunnel:
This is what the new maze looks like. It is the same, but now it is fully closed.

After running the AI, nothing changed about the path. 

Let's make the area outside of the maze green, and edit SYSTEMs to not allow the AI to move into green areas.

Let's add to SYSTEMs of Brain and Helper:
"The maze has black walls and white floor.
Never move the red square into a green area."

The movements became smaller and much more careful. Sadly, both Brain and Helper agree to go down and then they get lost. The path remains extremely similar to what it was in previous attempts.

Let's increase the TEMPERATUREs to a high 1.8, to try to make the outputs more diverse.

Sadly, that did not change much either. I decreased it back to 1 after this test.

This post is turning out much longer (and more interesting) than I originally expected. 

I will publish it for now and write a follow-up in the future.

But before ending this, I want to try one last thing, blocking the path down:

Let's launch the program one more time:

Brain understood that it can't move down, even tho Helper recommends that.

The square moves very little, but the path to B is still turning out extremely similar to all previous attempts.

I disabled Helper since it seemed to confuse Brain, but that also had no effect.

Let's add one more SYSTEM directive to Brain and run the code one last time:
"Enter the maze by first moving to the right."

The AI first moved to the right, but then proceeded incorrectly as earlier:

Lastly, let's paste the maze into chatgpt.com to see if it comes up with any idea at all (I should have tried that earlier tbh):

At first the description made sense, but then it seems it started describing a very different maze.

Trying again from scratch resulted in a description similar to what we saw in the practical trials:
The first move is incorrect, then the movements seem to make sense, and then when it is very close to B it goes wrong again.

Conclusions (for now):

Since this post is becoming too long, I am going to end it for now and continue in the next one. I am not going to give up.

As we can see, ChatGPT is not too capable of understanding and solving a maze.

It was an interesting thing to try tho, thank you for the idea!

In a future post we will check even more possibilities, but currently it seems to be something only proper fine-tuning could make successful.

See you all in the next posts!

PS: The final code we're left with:


from APIWojOpenAI import * #my OpenAI API interface, available on my GitHub
from WojSafeAPI import * #safely loads my own API key, you are unable to ever use this library and its function.
from PIL import Image, ImageDraw 
import time

api_key = YourAPIKeyHere("openai")

image = Image.open(str(input("Type image file name: ")))
image = image.convert('RGB')

def draw_line(image, x1, y1, x2, y2):
    d = ImageDraw.Draw(image)
    d.line([(x1, y1), (x2, y2)], fill=(255,0,0), width=4)
    image.save("log/"+str(time.time())+".jpg")
    return image

def draw_square(image, x, y, wall=40):
    new_image = image.copy()
    d = ImageDraw.Draw(new_image)
    c_x = wall//2
    c_y = wall//2
    if x>c_x and y>c_y:
        d.rectangle((x-c_x , y-c_y , x-c_x+wall , y-c_y+wall), fill=(255,0,0), width=1)
    else:
        d.rectangle((x , y , x+wall , y+wall), fill=(255,0,0), width=1)
    new_image.save("log/"+str(time.time())+".jpg")
    return new_image

current_x=230
current_y=100

new_image = draw_square(image, current_x, current_y)
#image.show()

system_brain = '''You will recieve a picture of a maze. Solve it by getting from point A to B.
You control a red square. Solve the maze by moving it from point A to B.
The maze has black walls and white floor.
Never move the red square into a green area.
Control the red square by writing direction and then amount of pixels. Available directions are:
W - up/north
S - down/south
A - left/west
D - right/east

Examples:
Move 150 pixels north:
W150
Move 30 pixels left:
A30

Move only once per iteration. Never make multiple movements at once.
Write your own thoughts and comments.
Go thru the maze and not directly to point B. 
Write "END" and nothing more once point B is reached.
Enter the maze by first moving to the right.
Current position of the red square:
'''

system_movement = '''You will recieve a text, which will contain a letter and number, which indicate movement.
Available directions are:
W - up/north
S - down/south
A - left/west
D - right/east

Examples:
Move 150 pixels north:
W150
Move 30 pixels left:
A30

Extract the command from text given to you. Write only the command, never write anything else.
'''

system_helper = '''You will recieve a picture of a maze. Help solve it.
The goal is to move the red square from point A to point B thru the maze.
The maze has black walls and white floor.
Never move the red square into a green area.
The square moves only once per iteration.
The red square must never move thru a wall.
Write recommendations of only 1 next movement.
Current position of the red square:
'''

brain = ImgChatGPTAgent(api_key, 'gpt-4o', 500, system_brain, 1)

movement = ChatGPTAgent(api_key, 'gpt-3.5-turbo', 500, system_movement, 0.8)

helper = ImgChatGPTAgent(api_key, 'gpt-4o', 500, system_helper, 1)

while True:
    
    updated_system_brain = system_brain + " X: " + str(current_x) + " Y: " + str(current_y)
    updated_system_helper = system_helper + " X: " + str(current_x) + " Y: " + str(current_y)
    
    #helper.ChangeSystemMessage(updated_system_helper)
    #tip = helper.GetResponseWithImg(new_image, "What should I do now to continue solving the maze? How to move to not move thru walls?")
    #print("Tip: "+tip)
    
    brain.ChangeSystemMessage(updated_system_brain)
    response = brain.GetResponseWithImg(new_image, "Continue solving the maze")
    
    if str(response.strip()).lower()=="end":
        print("DONE")
        quit()
    
    commands = movement.GetResponse(response)
    
    if str(commands.strip()).lower()=="end":
        print("DONE")
        quit()
    
    print(time.time())
    print("Response: "+response)
    print("Commands: "+commands)
    
    command_table = []
    command_table.append(str(commands[0]).lower())
    command_table.append(int(commands[1:]))
    
    if command_table[0]=="w":
        current_y = current_y - command_table[1]
    elif command_table[0]=="s":
        current_y = current_y + command_table[1]
    elif command_table[0]=="a":
        current_x = current_x - command_table[1]
    elif command_table[0]=="d":
        current_x = current_x + command_table[1]
    else:
        print("An error occured.")
        quit()
    new_image = draw_square(image, current_x, current_y)
        
