RLog2

Quick Python basic course for OIST new students

en

education

Implement fast 2D physics simulation with Jax

en

RL

physics

※ This article is translated from the Japanese version, with some improvements to the code.

Trying to implement a fast 2D physics simulation with Jax

ja

RL

physics

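Fast physics simulation on GPUs is a hot topic in the reinforcement learning community (even if it is being somewhat overshadowed by RLHF and Offline RL). Besides, it is simply fun to watch a simulation finish blazingly fast on a GPU. There is also NVIDIA IsaacSym, but if you want to speed up the whole reinforcement learning pipeline with jax, brax is convenient. A blog post that previously introduced…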

Try reinforcement learning with equinox

ja

RL

deep

I recently tried equinox, a jax-based library for defining and managing neural nets, and I liked it. So in this blog post I will introduce equinox and demonstrate its use…

Trying reinforcement learning with equinox

ja

RL

deep

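I recently tried equinox, a jax-based library for defining deep learning models, and I found it quite good, so I will introduce it and try some reinforcement learning with it along the way. Other jax-based libraries include Deepmind's haiku and Google Research's flax. In practice, these two libraries do not differ much. That is because jax…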

Understanding what self-attention is doing

en

NLP

deep

ChatGPT is getting a lot of buzz these days. I don’t use it much because I don’t like to worry about prompts, but my friend uses it to write papers, and my mom uses it to…

Let's understand what Attention is doing

ja

NLP

deep

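ChatGPT is creating a huge buzz these days. I don't use it much because coming up with prompts is a hassle (I know…), but a friend of mine uses it to write papers, and my mother apparently uses it as someone to talk to. I do feel a bit sorry for being such an undutiful son.…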

Exercise 5.12 Racetrack from the Reinforcement Learning textbook

en

RL

basic

Here I demonstrate Exercise 5.12 of the textbook Reinforcement Learning: An Introduction by Richard Sutton and Andrew G. Barto, using both a planning method and Monte…

Stay-on-the-GPU learning with Jax, Brax, and Haiku

ja

RL

deep

強化学習若手の会 (a group of young reinforcement learning researchers) Advent Calendar 2021, Day 18

Toward better problem design: understanding what makes reinforcement learning hard

ja

RL

basic

強化学習苦手の会 (a group for those who struggle with reinforcement learning) Advent Calendar 2020, Day 20. 1. Introduction