Playing a Maze Game with Q-learning

news/2024/9/23 18:17:26

import pygame
import numpy as np
import random
import sys


# Define the maze environment
class Maze:
    def __init__(self):
        self.size = 10
        self.maze = np.zeros((self.size, self.size))  # 0 = free cell, 1 = wall
        self.start = (0, 0)
        self.goal = (9, 9)
        self.maze[4, 2:7] = 1  # add walls
        self.maze[2, 1] = 1
        self.current_position = self.start

    def reset(self):
        self.current_position = self.start
        return self.current_position

    def manhattan_distance(self):
        return abs(self.current_position[0] - self.goal[0]) + abs(self.current_position[1] - self.goal[1])

    def step(self, action):
        x, y = self.current_position
        if action == 0:  # up
            x -= 1
        elif action == 1:  # right
            y += 1
        elif action == 2:  # down
            x += 1
        elif action == 3:  # left
            y -= 1
        if 0 <= x < self.size and 0 <= y < self.size and self.maze[x, y] == 0:
            # Legal move: advance, and end the episode only at the goal.
            self.current_position = (x, y)
            if self.current_position == self.goal:
                reward = 100
                done = True
            else:
                reward = -1
                done = False
        else:
            # Leaving the grid or hitting a wall: heavy penalty, episode ends,
            # and the position stays unchanged.
            reward = -100
            done = True
        # done = self.current_position == self.goal
        return self.current_position, reward, done

    def render(self, screen):
        for x in range(self.size):
            for y in range(self.size):
                # White = free cell, black = wall, green = agent, red = goal.
                color = (255, 255, 255) if self.maze[x, y] == 0 else (0, 0, 0)
                if (x, y) == self.current_position:
                    color = (0, 255, 0)
                if (x, y) == self.goal:
                    color = (255, 0, 0)
                pygame.draw.rect(screen, color, (y*40, x*40, 40, 40))
        pygame.display.flip()


# Q-learning
class QLearning:
    def __init__(self, env):
        self.env = env
        # One Q-value per (row, column, action).
        self.q_table = np.zeros((env.size, env.size, 4))
        self.gamma = 0.9    # discount factor
        self.epsilon = 0.1  # exploration rate
        self.alpha = 0.1    # learning rate

    def select_action(self, state):
        # Epsilon-greedy: explore with probability epsilon, otherwise act greedily.
        if random.random() < self.epsilon:
            return random.randint(0, 3)
        else:
            x, y = state
            return np.argmax(self.q_table[x, y])

    def update(self, state, action, reward, next_state):
        x, y = state
        nx, ny = next_state
        future_rewards = np.max(self.q_table[nx, ny])
        self.q_table[x, y, action] += self.alpha * (reward + self.gamma * future_rewards - self.q_table[x, y, action])


# Main program
def main():
    pygame.init()
    screen = pygame.display.set_mode((400, 400))
    clock = pygame.time.Clock()
    maze = Maze()
    agent = QLearning(maze)
    for episode in range(10000):
        state = maze.reset()
        done = False
        while not done:
            action = agent.select_action(state)
            next_state, reward, done = maze.step(action)
            agent.update(state, action, reward, next_state)
            state = next_state
            # Keep the window responsive and allow it to be closed.
            for event in pygame.event.get():
                if event.type == pygame.QUIT:
                    pygame.quit()
                    sys.exit()
            # Only draw the last 2000 episodes, once the agent has had time to learn.
            if episode >= 8000:
                screen.fill((0, 0, 0))
                maze.render(screen)
                clock.tick(10)


if __name__ == '__main__':
    main()
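
To make the update rule in the code above concrete, here is a minimal sketch that applies it by hand to a few transitions. The standalone function `q_update` is a hypothetical helper introduced only for this illustration; it is meant to mirror the arithmetic of `QLearning.update`, with the same `alpha = 0.1` and `gamma = 0.9`, and it needs only NumPy (no pygame window).

```python
import numpy as np


def q_update(q_table, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """Apply one tabular Q-learning update in place and return the new value.

    Hypothetical helper for illustration; it mirrors QLearning.update above.
    """
    x, y = state
    nx, ny = next_state
    # Best value currently estimated for the next state (the bootstrap term).
    best_next = np.max(q_table[nx, ny])
    td_target = reward + gamma * best_next
    # Move the old estimate a fraction alpha of the way toward the target.
    q_table[x, y, action] += alpha * (td_target - q_table[x, y, action])
    return q_table[x, y, action]


# A fresh 10x10 table with 4 actions per cell, as in the article.
q = np.zeros((10, 10, 4))

# Ordinary move: (0, 0) --right--> (0, 1), reward -1, all estimates still 0.
step_value = q_update(q, (0, 0), 1, -1, (0, 1))

# Move into the goal: (9, 8) --right--> (9, 9), reward 100.
goal_value = q_update(q, (9, 8), 1, 100, (9, 9))

# One cell earlier: (9, 7) --right--> (9, 8) now bootstraps from goal_value.
prev_value = q_update(q, (9, 7), 1, -1, (9, 8))

print(step_value, goal_value, prev_value)
```

The three values come out at roughly -0.1, 10 and 0.8. The last one shows how the goal reward propagates backwards one cell at a time through the bootstrap term. A common variant drops that term on terminal transitions; the article's code does not, and this sketch follows the article.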

Result when run:

