Obsidian Source: Notes / Unreasonable Effectiveness of RNNS
Summary
Pending synthesis from local Obsidian source.
Original source title: Unreasonable Effectiveness of RNNs
Extracted Preview
Andrej Karpathy, so you know it is going to be good.
What exactly are RNNs?
- RNNs allow us to operate over a sequence of vectors: one-to-one, one-to-many, many-to-one, and many-to-many (both synced and unsynced).
- They combine the input vector with their state vector via a fixed (but learned) function to produce a new state vector (see the sketch after this list).
- Even when the inputs are static, processing them as a sequence of vectors helps a lot (see the DeepMind papers by Ba et al. and Gregor et al.).
- The output vector's contents are influenced by the entire history of inputs fed in so far.
- The forward pass is just the step function applied repeatedly; our task is to find the matrices that give the desired behavior.
- LSTMs are boss: in practice they work much better than vanilla RNNs.
- The recurrent connection lets the network keep track of the context it needs to achieve the task.
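A minimal sketch of the step the notes describe, loosely in the style of Karpathy's post (the class name, weight names, and sizes here are illustrative, not from the source): the state vector is combined with the input vector through fixed but learned matrices, and the forward pass over a sequence is just this step applied repeatedly.

```python
import numpy as np

class VanillaRNN:
    """Illustrative vanilla RNN cell; names and shapes are assumptions."""
    def __init__(self, input_size, hidden_size, output_size):
        self.W_xh = np.random.randn(hidden_size, input_size) * 0.01   # input -> hidden
        self.W_hh = np.random.randn(hidden_size, hidden_size) * 0.01  # hidden -> hidden (the recurrent connection)
        self.W_hy = np.random.randn(output_size, hidden_size) * 0.01  # hidden -> output
        self.h = np.zeros((hidden_size, 1))                           # state vector

    def step(self, x):
        # Combine the input vector with the state vector via a fixed (but learned) function.
        self.h = np.tanh(self.W_xh @ x + self.W_hh @ self.h)
        # The output depends on the state, hence on the entire input history so far.
        return self.W_hy @ self.h

rnn = VanillaRNN(input_size=8, hidden_size=16, output_size=8)
for x in [np.random.randn(8, 1) for _ in range(5)]:  # a sequence of vectors
    y = rnn.step(x)  # the forward pass is repeated application of step()
```

Training would amount to finding W_xh, W_hh, and W_hy that give the desired behavior; the recurrence through `self.h` is what carries context across steps.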
Task - Can we use a Garcon-like tool (TransformerLens only supports transformers) to sample an RNN and find out what all of its activations are doing?
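There is no RNN equivalent of those tools that I know of, but as a crude starting point, PyTorch forward hooks can capture an RNN's per-step activations for inspection. A minimal sketch, assuming a stock nn.LSTM as the model under study (the model, sizes, and hook name are all hypothetical):

```python
import torch
import torch.nn as nn

# Hypothetical two-layer LSTM we want to inspect; sizes are arbitrary.
model = nn.LSTM(input_size=8, hidden_size=16, num_layers=2, batch_first=True)

captured = {}

def capture(name):
    def hook(module, inputs, output):
        # nn.LSTM returns (per-step hidden states, (h_n, c_n)); keep the per-step states.
        captured[name] = output[0].detach()
    return hook

model.register_forward_hook(capture("lstm"))

x = torch.randn(1, 10, 8)      # one sequence of 10 input vectors
out, (h_n, c_n) = model(x)
print(captured["lstm"].shape)  # torch.Size([1, 10, 16]): one activation vector per step
```

Note the limitation: hooking the monolithic nn.LSTM only exposes the top layer's per-step outputs. To see inner activations (per-layer states, gate values), you would have to unroll the network manually with nn.LSTMCell and record each intermediate tensor, which is the kind of plumbing a Garcon-like tool would automate.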
Integration Notes
- Source folder: /home/yashs/Documents/Docs/Obsidian/Research-Notes
- Local source: /home/yashs/Documents/Docs/Obsidian/Research-Notes/Notes/Unreasonable Effectiveness of RNNS.md
- Raw copy: raw/obsidian/research-notes/Notes/Unreasonable Effectiveness of RNNS.md