RL4SE: Reinforcement Learning for AIGAME (LunarLander-v3)

RL4SE stands for Reinforcement Learning for Software Engineering. The project trains an agent with the Proximal Policy Optimization (PPO) algorithm on the LunarLander-v3 environment from Gymnasium (the Farama Foundation's maintained successor to OpenAI Gym), as a worked example of applying reinforcement learning to a nontrivial simulated control task.

📈 Training Demo

Watch the RL4SE agent successfully land on the lunar surface using the PPO algorithm.

📊 Key Performance Metrics

**Eval 1:** Initial evaluation metric showing baseline performance.
**Eval 2:** Secondary evaluation metric indicating progress.
**Rollout 2:** Rollout metrics during training phases.
**Train Entropy Loss:** Entropy term of the loss, which tracks how random the policy's action choices are over time.
**Train Learning Rate:** Learning rate schedule over the course of training.
**Train Loss:** Overall training loss, decreasing over the course of training.
**Train Value Loss:** Value-function loss; lower values indicate that the value estimates are closer to the observed returns.
**Train Approx KL:** Approximate KL divergence between the policy before and after each update.
**Train Clip Fraction:** Fraction of samples whose probability ratio fell outside the PPO clip range and was therefore clipped.
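
The last two metrics above can be reproduced by hand from the log-probabilities of the sampled actions. The sketch below is a minimal, standard-library illustration, not code from this repository: the function name `ppo_diagnostics` and the default `clip_range=0.2` are assumptions, and the KL estimator shown is the commonly used `(ratio - 1) - log(ratio)` form, which should match what Stable Baselines3 logs but is worth checking against the installed version.

```python
import math


def ppo_diagnostics(old_log_probs, new_log_probs, clip_range=0.2):
    """Return (approx_kl, clip_fraction) for one batch of sampled actions.

    old_log_probs: log pi_old(a|s) recorded when the rollout was collected.
    new_log_probs: log pi_new(a|s) for the same actions under the updated policy.
    clip_range:    PPO clipping parameter (0.2 is an assumed default).
    """
    if not old_log_probs or len(old_log_probs) != len(new_log_probs):
        raise ValueError("inputs must be non-empty and of equal length")

    # Probability ratio r = pi_new / pi_old, computed in log space.
    log_ratios = [new - old for old, new in zip(old_log_probs, new_log_probs)]
    ratios = [math.exp(lr) for lr in log_ratios]
    n = len(ratios)

    # Low-variance, non-negative estimator of KL(pi_old || pi_new).
    approx_kl = sum((r - 1.0) - lr for r, lr in zip(ratios, log_ratios)) / n

    # Share of samples whose ratio left the interval [1 - clip, 1 + clip].
    clip_fraction = sum(1 for r in ratios if abs(r - 1.0) > clip_range) / n
    return approx_kl, clip_fraction
```

When the policy has not changed, both values are zero; a clip fraction that stays high across updates suggests the policy is moving further per update than the clip range allows.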

🔍 Description

RL4SE is a reinforcement learning project that applies the Proximal Policy Optimization (PPO) algorithm to the LunarLander-v3 environment from Gymnasium. It uses Stable Baselines3 for the PPO implementation and Weights & Biases (WandB) for experiment tracking and visualization, and it stores large video recordings of agent performance with Git Large File Storage (LFS).

Key Objectives:

  • Demonstrate the effectiveness of PPO in complex environments.
  • Track and visualize training metrics using WandB.
  • Manage large media files efficiently with Git LFS.
  • Provide a modular and scalable codebase for future enhancements.

✨ Features

  • Standard PPO Implementation: Utilizes the PPO algorithm from Stable Baselines3 for training agents.
  • Experiment Tracking: Integrates with WandB to monitor training progress, visualize metrics, and save code snapshots.
  • Video Recording: Records and displays videos of the trained agent's performance.
  • Model Saving: Saves trained models for future use and evaluation.
  • Git LFS Integration: Manages large video files efficiently using Git Large File Storage.
  • Modular Code Structure: Organized scripts and utilities for maintainability and scalability.
  • Configuration Flexibility: Easily adjustable hyperparameters and environment settings.
  • Comprehensive Documentation: Detailed instructions and explanations for ease of use.
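
The features listed above could fit together roughly as in the sketch below. It is an illustration under stated assumptions, not the repository's actual training script: every value in `CONFIG` (timesteps, learning rate, paths, project name) is hypothetical, as are the helper names `linear_schedule` and `train`. It assumes `gymnasium[box2d]`, `stable-baselines3`, and `wandb` are installed; third-party imports are kept inside `train` so the module can be imported without them.

```python
# Hypothetical settings; none of these values are taken from the repository.
CONFIG = {
    "env_id": "LunarLander-v3",
    "policy": "MlpPolicy",
    "total_timesteps": 500_000,
    "initial_lr": 3e-4,
    "model_path": "ppo_lunarlander",  # hypothetical output path
    "wandb_project": "RL4SE",         # hypothetical project name
}


def linear_schedule(initial_value):
    """Learning rate that decays linearly to zero.

    Stable Baselines3 calls the schedule with progress_remaining,
    which goes from 1.0 (start of training) to 0.0 (end).
    """
    def schedule(progress_remaining):
        return progress_remaining * initial_value
    return schedule


def train(config=CONFIG):
    """Train PPO, log to WandB, save the model, and return evaluation stats."""
    import gymnasium as gym
    import wandb
    from stable_baselines3 import PPO
    from stable_baselines3.common.evaluation import evaluate_policy
    from stable_baselines3.common.monitor import Monitor
    from wandb.integration.sb3 import WandbCallback

    # sync_tensorboard forwards SB3's TensorBoard scalars to WandB.
    run = wandb.init(
        project=config["wandb_project"], config=config, sync_tensorboard=True
    )

    # Monitor records episode returns and lengths for logging and evaluation.
    env = Monitor(gym.make(config["env_id"]))

    model = PPO(
        config["policy"],
        env,
        learning_rate=linear_schedule(config["initial_lr"]),
        tensorboard_log=f"runs/{run.id}",
        verbose=1,
    )
    model.learn(total_timesteps=config["total_timesteps"], callback=WandbCallback())

    # Persist the trained policy so it can be reloaded with PPO.load(...).
    model.save(config["model_path"])

    mean_reward, std_reward = evaluate_policy(model, env, n_eval_episodes=10)
    env.close()
    run.finish()
    return mean_reward, std_reward
```

Calling `train()` would start a WandB run, train for the configured number of timesteps, and return the mean and standard deviation of the evaluation reward. Video recording is left out of the sketch to keep it short.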

🚀 Installation

🔧 Prerequisites

Before you begin, ensure you have met the following requirements:

  • Operating System: Windows, macOS, or Linux
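  • Python: Version 3.7 or higher (LunarLander-v3 is provided by Gymnasium 1.0 and later, which requires Python 3.8 or newer, so a recent Python release is the safer choice)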
  • Git: Installed on your system
  • Git LFS: Installed and configured (Installation Guide)
  • Gymnasium: Installed as part of the dependencies (the LunarLander environments additionally need the Box2D extra)

📥 Clone the Repository

git clone https://github.com/evansnyanney/RL4SE.git
cd RL4SE

About

Reinforcement Learning Project for CS4900/5900GameAI
