Obsidian Source: Notes / Unreasonable Effectiveness of RNNS
Summary
Pending synthesis from local Obsidian source.
Original source title: Unreasonable Effectiveness of RNNs
Extracted Preview
Andrej Karpathy, so you know it is going to be good.
What exactly are RNNs?
- RNNs allow us to operate over a sequence of vectors: one-to-one, one-to-many, many-to-one, and many-to-many (both synced and unsynced).
- They combine the input vector with their state vector via a fixed (but learned) function to produce a new state vector (see the sketch after this list).
- Even when the inputs are static, processing them as a sequence of vectors helps a lot (see the DeepMind papers by Ba et al. and Gregor et al.).
- The output vector's contents are influenced by the entire history of inputs fed in so far.
- The forward pass is just the step function applied repeatedly; our task is to find the matrices that give the desired behavior.
- LSTMs are boss: in practice they work much better than vanilla RNNs.
- The recurrent connection lets the network keep track of the context it needs to achieve the task.
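A minimal sketch of the step the notes describe, loosely in the style of Karpathy's post (the class name, weight names, and sizes here are illustrative, not from the source): the state vector is combined with the input vector through fixed but learned matrices, and the forward pass over a sequence is just this step applied repeatedly.

```python
import numpy as np

class VanillaRNN:
    """Illustrative vanilla RNN cell; names and shapes are assumptions."""
    def __init__(self, input_size, hidden_size, output_size):
        self.W_xh = np.random.randn(hidden_size, input_size) * 0.01   # input -> hidden
        self.W_hh = np.random.randn(hidden_size, hidden_size) * 0.01  # hidden -> hidden (the recurrent connection)
        self.W_hy = np.random.randn(output_size, hidden_size) * 0.01  # hidden -> output
        self.h = np.zeros((hidden_size, 1))                           # state vector

    def step(self, x):
        # Combine the input vector with the state vector via a fixed (but learned) function.
        self.h = np.tanh(self.W_xh @ x + self.W_hh @ self.h)
        # The output depends on the state, hence on the entire input history so far.
        return self.W_hy @ self.h

rnn = VanillaRNN(input_size=8, hidden_size=16, output_size=8)
for x in [np.random.randn(8, 1) for _ in range(5)]:  # a sequence of vectors
    y = rnn.step(x)  # the forward pass is repeated application of step()
```

Training would amount to finding W_xh, W_hh, and W_hy that give the desired behavior; the recurrence through `self.h` is what carries context across steps.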
Task - Can we use a Garcon-like tool (TransformerLens only supports transformers) to sample an RNN and find out what all of its activations are doing?
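There is no RNN equivalent of those tools that I know of, but as a crude starting point, PyTorch forward hooks can capture an RNN's per-step activations for inspection. A minimal sketch, assuming a stock nn.LSTM as the model under study (the model, sizes, and hook name are all hypothetical):

```python
import torch
import torch.nn as nn

# Hypothetical two-layer LSTM we want to inspect; sizes are arbitrary.
model = nn.LSTM(input_size=8, hidden_size=16, num_layers=2, batch_first=True)

captured = {}

def capture(name):
    def hook(module, inputs, output):
        # nn.LSTM returns (per-step hidden states, (h_n, c_n)); keep the per-step states.
        captured[name] = output[0].detach()
    return hook

model.register_forward_hook(capture("lstm"))

x = torch.randn(1, 10, 8)      # one sequence of 10 input vectors
out, (h_n, c_n) = model(x)
print(captured["lstm"].shape)  # torch.Size([1, 10, 16]): one activation vector per step
```

Note the limitation: hooking the monolithic nn.LSTM only exposes the top layer's per-step outputs. To see inner activations (per-layer states, gate values), you would have to unroll the network manually with nn.LSTMCell and record each intermediate tensor, which is the kind of plumbing a Garcon-like tool would automate.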
Integration Notes
- Source folder: /home/yashs/Documents/Docs/Obsidian/Research-Notes
- Local source: /home/yashs/Documents/Docs/Obsidian/Research-Notes/Notes/Unreasonable Effectiveness of RNNS.md
- Raw copy: raw/obsidian/research-notes/Notes/Unreasonable Effectiveness of RNNS.md