Leduc Hold'em

Run examples/leducholdemhuman.py to play with the pre-trained Leduc Hold'em model. Leduc Hold'em is a simplified version of Texas Hold'em. Rules can be found here. Reinforcement Learning / AI Bots in Card (Poker) Games - Blackjack, Leduc, Texas, DouDizhu, Mahjong, UNO. JonathanLehner/rlcard.

Fall 2020

Public Reports

Public Video Reports

Other Titles

  • Tic Tac Toe Solver
  • Beating the house in Blackjack
  • Effect of Noise on Learning a Planar Pushing Task using SAC
  • ResQNet: Finding Optimal Fire Rescue Routes
  • COVID Chatbot
  • Regularized Follow-the-Leader in Online MDP for Efficient Topographical Mapping
  • Learning POMDP model parameters from missing observations
  • Reinforcement Learning for No-Limit Texas Hold ‘Em with Bomb Pots
  • Identifying Optimal Locations for Satellite Image Capture
  • Diet Conscious Meal Planner
  • Mitigating Risk of Public Transit during COVID-19
  • Predicting the Match: Using Bayesian Networks to Predict Professional Tennis Outcomes
  • Efficient Single-Agent Capture of a Moving Target
  • Q-Learning Applied to the Taxi Problem
  • Settlers of CATAN
  • Autonomous Snake
  • Q-Learning for Pre-Flop Texas Hold ‘Em
  • Deep RL for Atari Games
  • Simulating a D&D Encounter with Q-Learning
  • Deep RL for Automated Stock Trading
  • Dating Under Uncertainty
  • Retinal Implant Electrical Stimulation via RL
  • Batch Offline RL in the Healthcare Setting
  • Computer Caddy – Using RL to Advise Golfers’ Club Selection
  • RL for Fischer Random Chess
  • Timely Decision Making with Probability Path Model
  • Satellite-Imagery Based Poverty Level Evaluation System in Mexico with Deep RL Approach
  • Pokemon Showdown
  • Deep RL for Space Invaders
  • Learning to Run
  • RL-Based Control of Policy Selection in Near-Accident Scenarios
  • Model Predictive Control for an Aircraft Autopilot
  • Finding Inharmonic Timbres Locally Consonant with Arbitrary Scales
  • Escape Roomba
  • Driving in Traffic
  • Playing Snake
  • The 2020 Flatland Challenge
  • Elevator Scheduling with Neural Q-Learning
  • Optimizing Immunotherapy Treatment using RL
  • Modeling Leduc Hold ‘Em Poker
  • Auto Trading System Using Q-Learning
  • Energy System Modeling
  • Optimizing Fox in the Forest through RL
  • Learning Gin Rummy
  • Car Racing with Deep RL
  • Sequential Decision Making for Mineral Exploration
  • Advanced Driver Assistance Systems
  • Learning to Play Stargunner with Deep Q-Networks
  • A Fourth-and-Goal Football Recommender System
  • Algorithms for Motion Planning
  • Playing Farkle
  • Connect 4: A Survey of Different RL Techniques to Destroy Your Pride
  • Decision Making in the word game, Codenames
  • Reinforcement Learning Approaches for An Adversarial Snake Agent
  • An Attention-Based, Reinforcement-Learned Heuristic Solver for the Double Travelling Salesman Problem With Multiple Stacks
  • Uncertainty Aware Model-Based Policy Optimization
  • Navigating the Four-Way Stop Autonomously
  • Ground Water Remediation Using Sequential Decision Making
  • Final Project: Satellite Collision Avoidance
  • Q Learning for 4th Down Decision Making in the NFL
  • Q-Learning for the Game of Nim: Does The Agent Learn a Combinatorially Optimal Strategy On Its Own?
  • Contextual Bandit Algorithms in Recommender Systems
  • A Comparison of Reinforcement Learning Methods for Autonomous Navigation
  • Reinforcement Learning for Behavior Planning in Intersections for Autonomous Vehicles
  • Reinforcement Learning for Pacman Capture the Flag
  • Comparing Different Optimization Techniques for Learning Continuous Control with Neural Networks
  • Autonomous Exploration in Subterranean Environments
  • Improving Image Denoising through Decision Making
  • Using MDPs to Optimally Allocate Funds
  • Explanations Meet Decision Theory
  • Learning Policies for Adaptive LiDAR Scanning with POMDPs
  • Cautious Markov Games: A New Framework for Human-Robot Interaction
  • Selecting a multibasis community structure for the connectome
  • Reinforcement Learning Techniques for Long-Term Trading and Portfolio Management
  • Optimal Asset Allocation with Markov Decision Processes
  • Symbolic Regression with Bayesian Networks
  • Scheduling battery charging using deep reinforcement learning
  • Online Knapsack Problem Using Reinforcement Learning
  • Policy gradient optimization for
  • Resource Allocation for Wildfire Prevention
  • Using Reinforcement Learning to Play Omaha
  • Fraud Detection for Mobile Payments using Bayesian Network and CNN
  • Neuro-Adaptive Artificial Neural Networks for Reinforcement Learning
  • AI Agent for Qwirkle
  • Learning Optimal Wildfire Suppression Policies With Reinforcement Learning
  • Bid Smart with Uncertainty: An Autonomous Bidder
  • AA228/CS238 Final Report
  • Modeling Identification of Approaching Aircraft as a POMDP
  • Short-Term Trading Policies for Bitcoin Cryptocurrency Using Q-learning
  • Reinforcement Learning of a Battery Power Schedule for a Short-Haul Hybrid-Electric Aircraft Mission
  • Autonomous Helicopter Control for Rocket Recovery
  • Reinforcement Learning Strategies Solving Game Gomoku
  • A Wildfire Evacuation Recommendation System
  • Battleship with Algorithm
  • Developing an Optimal Structure for Breast Cancer Single Cell Classification
  • Utilizing Deep Q Networks to Optimally Execute Stock Market Entrance and Exit Strategies
  • Online Planning for a Grid World POMDP
  • Contingency Manager Agent for Safe UAV Autonomous Operations
  • Solving Mastermind as a POMDP
  • Simulated Drone Flight with Advantaged Actor Critic Reinforced Learning in 2 and 3 Dimensions
  • Solving Queueing Problem Using Monte Carlo Tree Search
  • Bayesian Structure Learning on NFL play data
  • Multi-Agent Rendezvous Using Reinforcement Learning
  • Dynamic Portfolio Optimization
  • Fairness and Efficiency in Multi-Portfolio Liquidation: A Multiple-Agent Deep Reinforcement Learning Approach
  • Evaluating Poker Hands
  • Saving Artificial Intelligence Clinician
  • Evaluation of online trajectory planning methods for autonomous vehicles
  • Solving Leduc Hold’em Counterfactual Regret Minimization
  • From aerospace guidance to COVID-19: Tutorial for the application of the Kalman filter to track COVID-19
  • A Reinforcement Learning Algorithm for Recycling Plants
  • Monte Carlo Tree Search with Repetitive Self-Play for Tic-Tac-Toe
  • Developing a Decision Making Agent to Play RISK

Fall 2019

Public Reports

Other Titles

  • Linear Array Target Motion Analysis Using POMDPs
  • Speed or Safety?: Calculating Urban Walking Routes Based on Probability of Crime and Foot Traffic
  • AlphaGomoku
  • Modelling Uncertainty in Dynamic Real-time Multimodal Routing Problems
  • Reinforcement Learning for Portfolio Allocation
  • Preparation of Papers for AIAA Technical Conferences
  • Autonomous Racing
  • Deep Learning Enabled Uncorrelated Space Observation Association
  • Landing a Lunar Spacecraft with Deep Q-Learning
  • POkerMDP: Decision Making for Poker
  • 1V1 Leduc Hold’em Bot
  • Political Influencers: Using Election Finance Data to Analyze Campaign Success via Bayesian Networks
  • Developing AI Policies for Street Fighter via Q-learning
  • Impact of Market Technical Indicators On Future Stock Prices Using Reinforcement Learning
  • Allocation of Hearts for Transplant as an MDP
  • Multi-Agent Reinforcement Learning in a 2D Environment for Transportation Optimization
  • Planning under Uncertainty for Discrete Robotic Navigation with Partial Observability
  • Deep Reinforcement Learning Applied to Mid-Frequency Trading
  • Application of Subspace Identification for Classification of Neural-Activity during Decision-Making
  • Using Markov-Decision Processes to Design Betting Strategies for the NFL
  • Maneuvering Characteristics Control Systems using Discrete-Time MDPs
  • MDP Based Motion Planning In Unsignaled Intersections
  • Competitive Blackjack Using Reinforcement Learning
  • Modelling Pedestrian Vehicle Interaction at Stop Sign using Markov Decision Process
  • Jeopardy! Wagering Under Uncertainty
  • Love Letters Under Uncertainty
  • Playing The Resistance with a POMDP
  • Robotic Simultaneous Localization and Mapping with 2D Laser Scan
  • Mars Rover: Navigating an Uncertain World
  • Modeling Blood Donations Over Time as a POMDP
  • Reinforcement Learning for Control on OpenAI Gym Environments
  • Playing Connect 4 using Reinforcement Learning
  • Evaluation of Reduced Algorithmic Complexity for Grasping Tasks by Using a Novel Underactuated Curling Grasper with Reinforcement Learning
  • Optimizing Strategies for Settlers of Catan
  • Exploring Search Algorithms for Klondike Solitaire
  • A Sparse Sampling Control Strategy for Risk Minimization during Stretchable Sensor Network Deployment
  • Computing Strategies for the 7 Wonders Board Game
  • POMDP modeling of stochastic Tetris
  • Solving a Maze with Doors and Hidden Tigers
  • Playing “Dominion” with Deep Reinforcement Learning
  • Delivery Vehicle Navigation in Crowd with Reinforcement Learning
  • Capturing Uncertainty in a Multi-Modal Setting With JRMOT: A Real-Time 2D-3D Multi-Object Tracker
  • Decentralized Satellite Network Communication
  • Seismic Network Planning
  • Reinforcement Learning for PaoDeKuai, A Card Game
  • Training A Bai Fen Agent with Reinforcement Learning
  • Decision Making for Launch Cancellation Based Upon Storm Conditions
  • Optimizing for the Competitive Edge: Modeling Sequential Binary Decision Making for Two Competing Firms
  • Datacenter Equipment Maintenance Optimization
  • To Heat Or Not To Heat: Reinforcement Learning for Optimal Residential Water Heater Control
  • Learning to Play Snake Game with Deep Reinforcement Learning
  • Optimal Traffic Light Control for Efficient City Transportation
  • Modeling NBA Point Spread Betting as an MDP
  • Solving a car racing game using Reinforcement Learning
  • Is Uncertainty Really Harmful: Solving Partially Observable Lunar Lander Problem with Deep Reinforcement Learning
  • Autonomous Navigation of an RC Boat Under a POMDP Framework
  • Evaluating the Bayes-Adaptive MDP Framework on Stochastic Gridworld Environments
  • Value Iteration with Enhanced Action Space for Path Planning
  • The Medical Triage Problem: Improving Hospitals’ Admission Decisions
  • Optimal Route Selection for Riders in Toronto
  • Model Free Learning for Optimal Power Grid Management
  • Wasting Less Time on the Road Using MDPs
  • Learning User Preferences to Produce Sequential Movie Recommendations
  • A Comparison of Learning Based Control Methods for Optimal Trajectory Tracking with a Quadrotor
  • Artificial Pancreas: Q-Learning Based Control for Closed-loop Insulin Delivery Systems
  • Navigating in an Uncertain World
  • Teaching an Autonomous Car to Drive through an Intersection with POMDPs
  • Atomic structure minimization using simulated annealing with a MCTS temperature scheme
  • AI Game Player for 2048
  • Deep Q-Learning with GARCH for Stock Volatility Trading
  • Learning to Become President
  • Solving GNSS Integrity-Based Path Planning in Urban Environments via a POMDP Framework
  • Reservoir operation under climate uncertainty
  • Reinforcement Learning for Maze Solving
  • Using Reinforcement Learning to Find Basins of Attraction
  • Planning for Asteroid Prospecting Missions with POMDPs
  • Human-Aware Robot Motion Planner
  • Determining Federal Funds Rate Changes – Hike / Cut / Hold – Under Economic Uncertainty
  • Simulating Work-Life Balance with POMDP
  • Solving 2048 as a Markov Decision Process
  • Accounting for Delay in Dynamic Resource Allocation for Wildfire Suppression – a POMDP Approach
  • Daily Allocation of Assets with Distinct Risk Profiles using Reinforcement Learning
  • LocoNets for Deep Reinforcement Learning
  • Exploring a full joint observability game with Markov decision processes
  • Deep Bayesian Active Learning for Multiple Correct Outputs
  • Convolutionally Reducing Markov Decision Processes
  • Robust Decision Making Agent for Frozen Lake
  • Tic-Tac-Toe How Many In A Row?
  • Turbomachinery Optimization Under Uncertainty
  • Devising a Policy for Liar’s Dice Using Model Free Reinforcement Learning
  • Political Compromises: an Iterative Game of Prisoner’s Dilemma
  • Optimal Home Energy System Management using Reinforcement Learning
  • Drone Tracking in a 2-dimensional Grid using Particle Filter Algorithm
  • Deep Reinforcement Learning for Traffic Signal Control
  • A Deep Reinforcement Learning Approach to Recommender Systems
  • FlyCroTugs – Collaborative Object Manipulation Using Flying Tugs
  • Local Approximation Q-Learning for a Simplified Satellite Reconnaissance Mission
  • Developing Policies for Blackjack Using Reinforcement Learning
  • Applying Q-learning to the Homicidal Chauffeur Problem
  • Optimal Satellite Detumbling through Reinforcement Learning
  • Active Preference-Based Gaussian Process Regression for Reward Learning and Optimization
  • A Comparative Study on Heart Disease Prediction
  • Robot Navigation with Human Intent Uncertainty
  • Conquering the Queen of Spades: A Hearts Agent
  • Using Markov Decision Processes to Predict Soccer Player Market Value
  • Effectiveness of Recurrent Network for Partially-Observable MDPs
  • Capture The Flag
  • Predicting uncertainty
  • Optimal Asset Allocation with Markov Decision Processes
  • Nets on Nets: Using Bayesian Networks to Predict Supplier Links in Economic Networks
  • Playing 2048 With Reinforcement Learning
  • Trading strategies using deep reinforcement learning with news and time series stock data
  • Modeling Contract Bridge as a POMDP
  • Solving Rubik’s Cubes Using Milestones
  • Playing 2048 with Deep Reinforcement Learning
  • An Approximate Dynamic Programming Minimum-Time Guidance Policy for High Altitude Balloons
  • Identifying Bots on Twitter
  • Approaches to Model-Free Blackjack
  • Jumping Robot Simulator: An Exploration of Methods to Teach a Bio-Inspired Frog Robot to Navigate
  • Air Traffic Control Tower Policy for Terminal Environment Operations
  • Managing a Prediction Market Portfolio
  • Applying Partially Observable Markov Decision Making Processes to a Product Recommendation System
  • Self-Driving Under Uncertainty
  • Reinforcement Learning for QWOP
  • Modeling Macroeconomic Phenomena with Multi-Agent Reinforcement Learning
  • Optimal Learning Policy via POMDP planning
  • AI Guidance for Thermalling in a Glider
  • Decision Making For Profit: Portfolio Management using Deep Reinforcement Learning
  • Self-play Reinforcement Learning for Open-face Chinese Poker
  • Feature Constrained Graph Generation with a Modified Multi-Kernel Kronecker Model
  • Sensor Fusion of IMU and LiDAR Data Using a Multirate Extended Kalman Filter
  • Optimizing Empiric Antibiotic Delivery in the Emergency Department
  • The Task Completion Game
  • Optimizing Modified Mini-Metro (M³)
  • Improving Pragmatic Inferences with BERT and Rational Speech Act Framework and Data Augmentation
  • Deep Q-Learning for Playing Hanabi as a POMDP

Fall 2018

Public Reports

Other Titles (excluding optional final projects)

  • Occlusion Handling for Local Path Planning with Stereo Vision
  • Pre-Flop Betting Policy in Poker
  • Optimal Impulsive Maneuver Times for Simultaneous Imaging and Gravity Recovery of an Asteroid
  • Monte-Carlo Planning in Subsurface Resource Development
  • Learning to Win at Go-Stop
  • Police Officer Distribution
  • Optimizing Road Construction to Improve Traffic Conditions Using Reinforcement Learning
  • Q-Learning for Casino Hold’em
  • Modeling a Connected Highway Merge as a POMDP Using Dynamic GPS Error
  • Figure 8 Race Track Optimal and Safe Driving
  • Predictive Maintenance of Trucks using POMDPs
  • Predictive Models for Maximizing Return on Agriculture given Location and Temperature
  • A Policy to Deal With Delay Uncertainty
  • Reinforcement Learning Methods for Energy Microgrids
  • Boom! Tetris for Bot – Designing a Reinforcement Learning Framework for NES Tetris
  • Hidden Markov Models for Economic Cycle Regime Estimation
  • Push Me: Optimizing Notification Timing to Promote Physical Activity
  • Resource Allocation for Floridian Hurricanes
  • Motion Planning in Human-Robot Interaction Using Reinforcement Learning
  • Automated Neural Network Architecture Tuning with Reinforcement Learning
  • Imitation Learning in OpenAI Gym with Reward Shaping
  • Collision Avoidance for Unmanned Rockets using Markov Decision Processes
  • MDP Solvers for a Successful Sushi Go! Agent
  • Uncovering Personalized Mammography Screening Recommendations through the use of POMDP Methods
  • Implementing Particle Filters for Human Tracking
  • Decision Making in the Stock Market: Can Irrationality be Mathematically Modelled?
  • Single and Multi-Agent Autonomous Driving using Value Iteration and Deep Q-Learning
  • Buying and Selling Stock with Q-Learning
  • Application and Analysis of Online, Offline, and Deep Reinforcement Learning Algorithms on Real-World Partially-Observable Markov Decision Processes
  • Reward Augmentation to Model Emergent Properties of Human Driving Behavior Using Imitation Learning
  • Classification and Segmentation of Cancer Under Uncertainty
  • Comparison of Learning Methods for Price Setting of Airfare
  • QMDP Method Comparisons for POMDP Pathfinding
  • Global Value Function Approximation using Matrix Completion
  • Artificial Intelligence Techniques for a Game of 2048
  • Exploring the Boundaries of Art
  • An Iterative Linear Algebra Approach to Dynamic Programming
  • Solving Open AI Gym’s Lunar Lander with Deep Reinforcement Learning
  • Application of Imitation Learning to Modeling Driver Behavior in Generalized Environments
  • Craps Shoot: Beating the House…?
  • Movie Recommendations with Reinforcement Learning
  • Playing Atari 2600 Games Using Deep Learning
  • Traverse Synthesis for Planetary Exploration
  • Optimal operation of an islanded microgrid under a Markov Decision Process framework
  • Implementing Deep Q-learning Extensions in Julia with Flux.jl
  • Learning How to Buy Food
  • Using Dynamic Programming for Optimal Meal Planning
  • Modelling Wildfire Evacuation using MDPs
  • Comparing Multimodal Representations for Robotic Reinforcement Learning Tasks
  • Applying Reinforcement Learning to Packet Routing in Mesh Networks
  • Xs & Os: Creating a Tic-Tac-Toe Foe
  • Doggo Does a Backflip: Deep Reinforcement Learning on a Quadruped Platform
  • GrocerAI: Using Reinforcement Learning to Optimize Supermarket Purchases
  • Reinforcement Learning For The Buying and Selling of Financial Assets
  • Towards Designing a Policy on Automotive GPS Integrity
  • Generalized Kinetic Component Analysis
  • Trading Wheat Futures Contracts
  • Using PCR, Neural Networks, and Reinforcement Learning
  • Reinforcement Learning for Inverted Pendulums
  • Electric Vehicle Charging under Uncertainty
  • Automatic Accompaniment Generator: An MDP Approach
  • Comparison of Methods in Artificial Life
  • Modeling a Better Visual Acuity Test
  • Online Methods Applied to the Game of Euchre
  • Missile Defense Strategy: Towards Optimal Interceptor Allocation
  • Smart Charging of Electric Vehicles under State Uncertainty
  • Learning to Play Atari Breakout Using Deep Q-Learning and Variants
  • Decision making on fault-code
  • Learning FlappyBird with Deep Q-Networks
  • A Fresh Start: Using Reinforcement Learning to Minimize Food Waste and Stock-Outs in Supermarkets
  • Autonomous orbital maneuvering using reinforcement learning
  • Autonomous Decision-Making for Space Debris Avoidance
  • Maximizing Monthly Expenditures Under Uncertainty
  • Modeling Voter Preferences in US General Elections
  • Application of Reinforcement Learning to the Path Planning with Dynamic Obstacles
  • A Decision Making framework for Medical Diagnostics
  • Learning to Walk Using Deep RL
  • Q(λ)-Learning with Boltzmann and ε-greedy Exploration Applied to a Race Car Simulation
  • Reinforcement learning for Glassy/Phase Transitions
  • Proximal Policy Optimization in Julia
  • University Technology Patent and License Decisions: Open- versus Closed-Loop Planning in a Markov Decision Process
  • A Policy Gradient Approach for Continuous-Time Control of Spacecraft Manipulator Systems
  • Applying Techniques in Reinforcement Learning to Motion Planning in Redundant Robotic Manipulators
  • Deep Q-Learning for Atari Pong
  • Adversarial Curiosity for Model-Based Reinforcement Learning
  • A Markov Decision Process Approach to Home Energy Management with Integrated Storage
  • Using Maximum Likelihood Model-Based Reinforcement Learning to Play Skull
  • Cryptocurrency Trading Strategy with Deep Reinforcement Learning
  • Evaluating Multisense Word Embeddings Final Report
  • Near-Earth Object (NEO) Deflection via POMDP
  • Reinforcement Learning for Car Driving
  • Reinforcement Learning for Automatic Wheel Alignment
  • Julia Implementation of Trust Region Policy Optimization
  • Deep Reinforcement Learning with Target and Prediction Networks
  • Playing Tower Defense with Reinforcement Learning
  • Q-Learning agent as a portfolio trader
  • Multi-Robot Rendezvous from Indoor Acoustics
  • Portfolio Asset Allocation using Reinforcement Learning
  • Creating a 2048 AI Solver using Expectimax
  • Robustness of Reinforcement Learning Based Communication Networks in Multi-Modal Multi-Step Games to Input Based Adversarial Attacks
  • Deep Q-Learning with Nearest Neighbors in Sequential Decision-Making for Sepsis Treatment
  • Positioning Archival Radar Data with a Particle Filter
  • Reinforcement Learning for Atari Skiing
  • Understanding Donations with Reinforcement Learning
  • Known and Unknown Discrete Space Exploration Using Deep Q-Learning
  • Speeding Up Reinforcement Learning with Imitation
  • Learning Bandwidth-Limited Communication in Decentralized Multi-agent Reinforcement Learning

Fall 2017

  • 2048 as a MDP
  • A Computational Approach to Employee Resource Allocation between Multiple Projects
  • Accelerated Training of Deep Q Learning Models for Atari Games
  • AlphaOthello: Developing an Othello player through Reinforcement Learning on Deep Neural Networks
  • An Online Approach to Energy Storage Management Optimization
  • An Optimal Basketball Foul Strategy by Value Iteration
  • Annealed Reward Functions in Continuous Control Reinforcement Learning
  • Applications of Inverse Reinforcement Learning for Multi-Feature Path Planning
  • Attributing Authorship in the Case of Anonymous Supreme Court Opinions Utilizing SVMs and Probabilistic Inference on Score Uncertainty
  • Balancing Safety and Performance in Imitation Learning
  • Baseball Pitch Calling as a Markov Decision Process
  • Batch Reinforcement Learning Technological Investment Strategies Utilizing The Contingent Effectiveness Model In A Markov Decision Process
  • Bayesian Learning of Image Transformations from User Preferences for Individualized Automatic Filters
  • BetaMiniMax: An Agent for Cheat
  • Building a Game Agent to Play Resistance
  • Building Trust in Autonomy: Sharing Future Behavior in Reinforcement Learning
  • Car racing with low dimensional input features
  • Comparison of Classical Control Methods and POMDPs for 3D Motion Control
  • Control of a Partially-Observable Linear Actuator
  • DDQN Learning for 2048
  • Deep Q-learning in OpenAI gym
  • Deep Q-Learning with Target Networks for Game Playing
  • Design of A Planning Machinery for Choosing an NBA Team’s Play Style Strategy
  • Detecting Human from Image with Double DQN
  • Determining the Optimal Betting Policy: World Series
  • Disrupting Distributed Consensus (or Not) Using Reinforcement Learning
  • Dominating Dominoes
  • Double A3C: Deep Reinforcement Learning on OpenAI Gym Games
  • Emergent Language in Multi-Agent Co-operative Reinforcement Learning
  • Explore the Frontier of Safe Imitation Learning for Autonomous Driving
  • Fast Operation of Coordinated Distributed Energy Resources without Network Models using Deep Deterministic Policy Gradients
  • Faster Algorithms for Contextual Combinatorial Cascading Bandits
  • Finding a Scent Source with a Soft Growing Robot Using Monte Carlo Tree Search
  • Gaming Bitcoin Leveraging Model-Based Reinforcement Learning
  • Get Ready for Demand Response
  • GlideAI: Optimizing Soaring Strategy with Reinforcement Learning
  • Grid Stability Management and Price Arbitrage for Distributed Energy Storage and Generation via Reinforcement Learning
  • Guiding the management of sepsis with deep reinforcement learning
  • HMMs for Prediction of High-Cost Failures
  • Integrating Mini-Model Evidence into Policy Evaluation
  • Investigating Parametric Insurance Models As Multi-Variable Decision Networks
  • Learning an Optimal Policy for Police Resource Allocation on Freeways
  • Learning Terminal Airspace Models from TSAA Data
  • Learning the Education System
  • Learning the Policy of the Policy: Deep Reinforcement Learning with Model-Based Feedback Controllers
  • Learning to Play a Simplified Version of Monopoly Using Multi-Agent SARSA
  • Learning to Play Othello Without Human Knowledge
  • Limbed Robot Motion Control through Online Reinforcement Learning
  • Linear Approximation Q-Learning to Learn Movement in a 2D Space
  • Locally Optimal Risk Aware Path Planning
  • Massively Parallel Reinforcement Learning Using Trust Region Policy Optimization
  • Model-Free Learning of Casino Blackjack
  • Model-Free Reinforcement Learning of a Modified Helicopter Game
  • Model-Free Reinforcement Learning on Flappy Bird
  • Modeling Disaster Evacuation Paths
  • Modeling Flight Delay and Cancelation
  • Modeling NBA Matchups
  • Modeling Optimally Efficient Earth to Earth Flight Trajectories in Kerbal Space Program with Reinforcement Learning
  • Modeling Real Estate Investment with Deep Reinforcement Learning
  • Multi-Agent Cooperative Language Learning
  • Multi-armed Bandits with Unobserved Confounders
  • Multidisciplinary Design Optimization for Approximating Unsteady Aerodynamics of Flexible Aircraft Structures
  • Navigating Chaos: Autonomous Driving in a Highly Stochastic Environment
  • Optimal Flight Itineraries Under Uncertainty Using a Stochastic Markov Decision Process
  • Optimal Strategy for Two-Player Incremental Classification Games Under Non-Traditional Reward Mechanisms
  • Optimizing sequential time-lapse seismic data collection using a POMDP
  • Personal Portfolio Asset Allocation as An MDP Problem
  • Planetary Lander with Limited Sensor Information and Topographical Uncertainty
  • Playing Flappy Bird Using Deep Reinforcement Learning
  • POMDP and MDP for Underwater Navigation
  • POMDP Modeling of a Simulated Automatic Faucet for Cognitive State and Task Recognition
  • Portfolio Management
  • Power Grid real time optimization using Q-Learning
  • Predicting Congressional Voting Behavior and Party Affiliation using Machine Learning
  • Predicting Income From OkCupid Profiles
  • Predicting NBA Game Outcomes using POMDPs
  • Predicting Subjective Sleep Quality
  • Preparation of Papers for AIAA Technical Journals
  • Pursuit-Evasion Game with an Agent Unaware of its Role
  • Rapid Reinforcement Learning by Injecting Stochasticity into Bellman
  • Real Time Collision Detection and Identification for Robotic Manipulators
  • Reinforcement Learning Applied to Quadcopter Hovering
  • Reinforcement Learning Approaches to Pathfinding in Stochastic Environments
  • Reinforcement Learning For A Reach-Avoid Game
  • Reinforcement Learning for Atari Breakout
  • Reinforcement Learning for Crypto-Currency Arbitrage Bot
  • Reinforcement Learning for Precision Landing of a Physically Simulated Rocket
  • Reinforcement learning in an online multiplayer game
  • Reinforcement Learning of Blackjack Variants
  • Reinforcement training of nonlinear reduced order models
  • Reward Shaping with Dynamic Guidance to Accelerate Learning for Multi-Agent Systems
  • Risk – Bayesian World Conquest
  • Roboat: Reinforcement of Boat’s Optimal Adaptive Trajectory
  • Robotic Arm Motion Planning Based on Reinforcement Learning
  • Robotic Decision Making Under Uncertainty
  • Sensor Selection for Energy-Efficient Path Planning Under Uncertainty
  • ShAIkespeare: Generating Poetry with Reinforcement Learning and Factor Graphs
  • Shared Policies in Aircraft Avoidance
  • Simulated Autonomous Driving with Deep Reinforcement Learning
  • Simulating Coverage Path Planning with Roomba
  • SLAMming into Obstacles: Simultaneous Localization and Mapping the Path of a Turtlebot
  • Smart Health Coach: Using Markov Decision Processes to Optimize Health Advising Strategies
  • Smarter Queues by Reinforcement Learning
  • Solving Real-world Oil Drilling Problem with Multi-Armed Bandit and POMDP Models
  • Stay in Your Lane: Probabilistic Vehicular Automation for DIY Robocars
  • Supervised Learning and Reinforcement Learning for Algorithmic Trading
  • Taking Out the GaRLbage
  • Terrain Relative Navigation and Path Planning for Planetary Rovers
  • Time-Constrained Sample Retrieval in a Martian Gridworld with Unknown Terrain
  • Trade-offs in Connect Four Game-Playing Agents
  • Training an Intelligent Driver on Highway Using Reinforcement Learning
  • UAV Collision Avoidance Using Neural Network-Assisted Q-Learning
  • Understanding Limitations of Network Meta-analysis Approaches to Rank Effectiveness of Treatments
  • Using Bayesian Networks to Impute Missing Data
  • Using Bayesian Networks to Predict Credit Card Default
  • Using Bayesian Networks to Understand and Predict Wildfires
  • Using Classification Models to Represent and Predict Students’ Restaurant Preferences
  • Using Q-Learning to Optimize Lunar Lander Game Play
  • Using the QMDP Method to Determine an Open Ocean Fishing Policy
  • Utilizing Fundamental Factors in Reinforcement Learning for Active Portfolio Management

Fall 2016

  • Model-Free Reinforcement Learning of Blackjack
  • Partially Observable Actions in Solving Markov Decision Processes. The Case for Insulin Dosing Optimization in Diabetic Patients.
  • Using Monte-Carlo Tree Search to Solve the Board Game Hive
  • Blackjack: How to use MDP’s to (nearly) beat the house
  • Cancer Metabolism Mapping: Bayesian Networks and Network Learning Techniques to Understand Cancer Metabolic and Regulatory Pathways
  • Gibbs Sampling in BayesNets.jl
  • UAV Collision Avoidance Policy Optimization with Deep Reinforcement Learning
  • Improving Training Efficiency in Deep Q-Learning for Atari Breakout
  • Monitoring Machine Workload Distribution with Kalman Filter
  • Approximating Transition Functions to Cart Track MDPs via Sub-State Sampling
  • Approaching Quantitative Trading with Machine Learning
  • Structure and Parameter Learning in Bayesian Networks with Applications to Predicting Breast Cancer Tumor Malignancy in a Lower Dimension Feature Space
  • Autonomous Racing by Learning from Professionals
  • Bravo Zulu: Optimizing Teammate Selection for Military and Civilian Applications
  • Investigating Transfer Learning in Deep Reinforcement Learning Context
  • Simultaneous Estimation and Control with MCTS
  • Controlling Soft Robots with POMCP
  • Automatic Learning of Computer Users’ Habits
  • Learning to Play Soccer in the OpenAI Gym
  • Playing Ultimate Tic-Tac-Toe with TD Learning and Monte Carlo Tree Search
  • A Bayesian Network Model of Pilot Response to TCAS Resolution Advisories
  • Improving Head Impact Kinematics Measurement Accuracy using Sensor Fusion
  • Drive Decision Making at Intersections
  • Deterministic and Bayesian Techniques for Spaceborne Vision-Based Non-Stellar Object Detection
  • A Two-Phased Deep Reinforcement Learning Algorithm for High-Frequency Trading
  • Implementation and Experimentation of a DQN solver in Julia for POMDPs.jl
  • Landing on the Moon
  • Deserted Island: Cooperative Behavior in Absence of Explicit Delayed Reward
  • DeepGo.py
  • Managing Groundwater under Uncertain Seasonal Recharge
  • Using Reinforcement Learning to Find Flaws in Collision Avoidance Systems
  • Effectiveness of Bayesian Networks in Building a Prediction Model for Movie Success
  • Data Driven Agent based on Aircraft Intent
  • Deep Q-Learning with Natural Gradients
  • A Shot in the Dark: Beating Battleship with POMCP
  • Accelerated Asynchronous Deep Reinforcement Learning Variant of Advantage Actor-Critic
  • Applying Reinforcement Learning and Online Methods on the Inverted Pendulum Problem
  • Predicting Sentiment with Deep Q-Learning
  • A Lookahead Strategy for Super-Level Set Estimation using Gaussian Processes
  • Modeling Breast Cancer Treatment as a Markov Decision Process
  • Learning 31 using Cross-Entropy Methods
  • Improving Haptic Guidance using Reinforcement Learning
  • NLPLab: Actor-Critic Training in Natural Language Processing
  • Deep Reinforcement Learning on Atari Breakout
  • Reinforcement Learning for LunarLander
  • Reinforcement Learning for AI Machine Playing Hearthstone
  • Using Deep Q-Learning to Automate CNN Training
  • Automatic Continuous Variable Encoder in Bayesian Network
  • Side Channel Analysis using Neural Networks and Random Forests
  • A Decision-Making System for Wildfire Management
  • Decentralized Game Theoretic Methods for the Distributed Graph Coverage Problem
  • Autonomous altitude control for high altitude balloons
  • Neural Network Arbitration for Better Time and Accuracy trade-offs
  • Deep Deterministic Policy Gradient with Robot Soccer
  • Towards a Personal Decision Support System
  • Optimal Gerrymandering under Uncertainty
  • The Ambulance Dilemma: Crossing an Intersection with Monte Carlo Tree Search
  • DeepDominion
  • Developmental Policy Design: an MDP approach
  • Training of a craps betting strategy with Reinforcement Learning Techniques
  • Engineering a Better Monkey
  • Decision Making During a Bicycle Race
  • Using Discrete Pressure Measurements to Understand Subsonic Bluff-Body Dynamic Damping
  • Effective Move Selection in Chess Without Lookahead Search
  • Solving Texas Hold’em with Monte-Carlo Planning
  • Reinforcement Learning of High-Speed Autonomous Driving through Unknown Map
  • Implementation and deployment of particle filter for simulated and real-world localization tasks
  • Tree Augmented Naive Bayes and Backward Simulation
  • Transfer of Q values across tasks in Reinforcement Learning
  • Training Regime Modifications for Deep Q-Network Learning Acceleration
  • Reinforce Optimizer
  • Approximating Ligand Docking Using a Markov Decision Process
  • Breaking Down Social Media Filter Bubbles via Reinforcement Learning
  • Performing an N-Sentiment Classification Task on Tinder Profiles Based On Image Feature Extraction
  • Play Blackjack With Monte Carlo Simulation And Q-learning with Linear Regression
  • Observer-Actor Neural Networks for Self-Play in Imperfect Information Games
  • Using Hybrid Bayes Nets to Model Country Prosperity
  • Solving a Pandemic! Various Approaches for Tackling the Board Game
  • Improved Markov Decision Process Model for Resource Allocation in Disaster Scenarios
  • Learning Chess through Reinforcement Learning
  • Deep Reinforcement Learning For Continuous Control: An Investigation of Techniques and Tricks
  • Computer Vision Through Perception: Semantic Understanding of Novel Scenes through Data Programming
  • Path Planning for Insertion of a Bevel-Tip Needle
  • Modeling human biases through reinforcement learning
  • Bootstrapping Neural Network with Auxiliary Tasks
  • Q-Learning Application in Optimizing Pokémon Battle Strategy
  • Model-based exploration in natural language generation
  • Automated Aircraft Touchdown
  • Longitudinal Vehicle Control using a Markov Decision Process and Deep Neural Network
  • MOMDP-based Aerial Target Search Optimization
  • Greedy Thick-Thinning Structure Learning and Bayesian Network Conditional Independence Implementations in BayesNets.jl
  • Multiagent Planning For Aerial Broadband Internet
  • Viral Marketing as an MDP
  • Neural Soccer – Towards Exploration by the Pursuit of Novelty
  • Locally Weighted Value Iteration in Julia
  • Fully-Nested Interactive POMDPs for Partially-Observable Turn-Based Games
  • Optimal Policy Considerations for Gas Turbine Maintenance
  • Learning Optimal Manipulation of Food Webs
  • Estimating Resource Prospector’s Probability of Failure Using Importance Sampling and Cross Entropy
  • Dynamically Discount Deep Reinforcement Learning
  • Deep Reinforcement Learning: Accelerated Learning with Effective Gradient Ascent Optimization Algorithms
  • Autonomous Human Tracking in Simulated Environment
  • A LQG Library for POMDPs.jl

Fall 2015

  • Mars Hab-Bot: Using MDPs to simulate a robot constructing human-livable habitats on Mars
  • A Value Iteration Study of BlackJack
  • Optimized Store-Stocking via Monte Carlo Tree Search with Stochastic Rewards
  • Trajectory Planning for Map Exploration Using Terrain Features
  • Instruction Following with Deep Reinforcement Learning
  • Using Markov Decision Processes to Minimize Golf Score
  • Reinforcement Learning for Scheduling I/O Requests on Multi-Queue Flash Storage
  • Finding the Perfect ‘Job’ in resource allocation
  • Maximizing Influence in Social Networks
  • A Machine Learning Regression Approach to General Game Playing
  • Modeling GPS Spoofing as a Partially Observable Markov Decision Process
  • Travel Hacking with MDPs
  • Optimal Mission Planning for a Satellite-Based Particle Detector via Online Reinforcement Learning
  • An MDP Approach to Motion Planning for Hopping Rovers on Small Solar System Bodies
  • Sampling Strategies for Deep Reinforcement Learning
  • Descriptive Power of Bayesian Structure Learning in Stock Market
  • Large-Scale Traffic Grid Signal Control Using Fuzzy Logic and Decentralized Reinforcement Learning
  • Simulated Pedestrian-like Navigation with a 1D Kalman Filter with an Accelerometer and the Global Positioning System
  • Search and Track Tradeoff for Multifunction Radars
  • Play Calling in American Football Using Value Iteration
  • Reinforcement learning for commodity trading
  • Learning the Stock Market, a Naive Approach
  • A POMDP Framework for Modelling Robotic Guidance During a Tissue Palpation Task
  • Reinforcement Learning of an Artificially Intelligent Hearts Player
  • Toy Helicopter Control via Deep Reinforcement Learning
  • Gas Refuelling Optimization Modelled as a Markov Decision Process
  • Q-Matrix and Policy Compression via Deep Learning
  • Augmenting Self-Learning In Chess Through Expert Imitation
  • Monte Carlo Tree Search Applied to a Variant of the RockSample Problem
  • Supply Chain Management using POMDPs
  • Online Markov Decision Process Framework for Modeling Efficient Home Robot Cleaners
  • Reinforcement Learning for Path Planning with Soft Robotic Manipulators
  • Exploring POMDPS with Recurrent Neural Networks
  • Tic-tac-toe with reinforcement learning: best strategies and influence of parameters
  • Vehicle Speed Prediction using Long Short-Term Memory Networks
  • Explorations on Learning Bayesian Networks
  • Playing unknown game on a visual world
  • Reinforcement Learning for Atari Games
  • Q-learning in the Game of Mastermind
  • Modeling of a Baseball Inning as MDP
  • Reinforcement Learning for Path Planning with Soft Robotic Manipulators
  • Autonomous Driving on a Multi-lane Highway Using POMDPs
  • Solving a Maze Without Location Data
  • Markov Decision Processes and Optimal Policy Determination for Street Parking
  • Solving an opponent-based match-three mobile game
  • Life begins as a POMDP: improving decision making in the IVF clinic
  • Path Planning for Target-Tracking Unmanned Aerial Vehicle
  • Discrete State Filter Implementation for a Battleships Artificial Intelligence
  • POMDP for Search and Rescue with Obstacle Avoidance: Incorporation of Human in the Loop
  • Application performance over cellular networks
  • An MDP Approach to Motion Planning for Hopping Rovers on Small Solar System Bodies
  • Solving Dudo: beating Liar’s Dice with a POMDP
  • Reinforcement Learning for Tetris
  • Robot Path Planning using Monte Carlo POMDP
  • Reinforcement Learning of an Artificially Intelligent Hearts Player
  • Enhancing Computational Efficiency of PILCO Model-based Reinforcement Learning Algorithm
  • Analysis of UCT Exploration Parameter in Sailing Domain Problems
  • Solving a Search and Rescue Planning problem with MOMDPs
  • Robot Motion Planning in Unknown Environments using Monte Carlo Tree Search
  • Delivery optimization of an on-demand delivery service
  • Solving Multi-Agent Decision Making using MDPs
  • Efficient and Modular Inventory Management Framework for Small Businesses
  • Markov Decision Processes in Board Game Playing
  • Automated Model Selection via Gaussian Processes
  • Predictive Hybrid Vehicle Control Policy
  • Optimal Policies for In-Space Satellite Communications
  • Spacecraft Navigation in Cluttered, Dynamic Environments Using 3D Lidar
  • Playing Chess Endgames using Reinforcement Learning
  • Space Debris Removal
  • Large-Scale Traffic Grid Signal Control Using Fuzzy Logic and Decentralized Reinforcement Learning
  • Relation Extraction from Scratch
  • Lane Merging as a Markov Decision Process
  • Using MDP/POMDP to Help in Search of Survivors of a Plane Crash
  • Applying POMCP to Controlling Partially Observable Diffusion Processes
  • Credit Risk Classification using Bayesian Network

Fall 2014

  • Automating Air Traffic Management for Flight Arrivals
  • Policy Learning for Sokoban
  • Flight Path Optimization Under Constraints Using a Markov Decision Process Approach
  • Visual Localization and POMDP for Autonomous Indoor Navigation
  • Monte Carlo Tree Search for Online Learning in Golf Course Management
  • Pushing on Leaves
  • Beating 2048
  • Improved electrical grid balancing with demand response scheduled by an MDP
  • Multi-Fidelity Model Management in Engineering Design Optimization Using Partially Observable Markov Decision Processes
  • Smarter Generators in Power Markets
  • Beach Paddle Ball
  • Applying POMDP to RockSample problem
  • Targeting Hostile Vehicle Modeled as a Partially Observable Markov Decision Process with State-Dependent Observation Model
  • Reinforcement Learning and Linear Gaussian Dynamics Applied to Multifidelity Optimization of a Supersonic Wedge
  • Approximate POMDP Solutions for Short-Range UAV Traffic Conflict Resolution
  • WorkSmart: The Implementation of a Modified Q-Learning Algorithm for an Intelligent Daily To-Do List Android Application
  • Imminent Obstacle Avoidance with Friction Uncertainty
  • Dynamic Restrictions during Commercial Space Vehicle Launches
  • Autonomous Direct Marketing with Deep Q-Learning
  • Efficient Risk Estimation for Chance-Constrained Robotic Motion Planning Under Uncertainty
  • Probabilistic Aircraft Arrival Rate Prediction
  • Audio Keylogging: Translating Acoustic Signals into Keystrokes
  • Collision Avoidance for Small Multi-Rotor Aircraft using SARSA(