Email autocomplete memory lab

Watch RNN, LSTM, and GRU compete to remember context.

SequenceLab AI turns recurrent neural networks into a guided, gamified lab: one email draft, three models, live memory curves, rigorous equations, and plain-language explanations.

Live sequence trace

Hi → Maya → during → Monday's → kickoff → the → client → specifically → requested → the → compliance → appendix → but …

Step 1

Choose an email prediction challenge

The app does not auto-run. Pick a preset or write your own email fragment, then decide when the models start processing.

Auto-suggested target: add a longer email fragment to get a suggestion. The target works for a single word or phrase; for presets, edit this field to test a different expected completion.
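As a rough illustration of how a word-or-phrase target could be checked against a model's completion, here is a minimal TypeScript sketch; the function name and the trailing-match rule are assumptions for this example, not the app's actual logic.

```ts
// Hypothetical target check (assumption, not the app's code):
// normalize whitespace and case, then accept if the completion ends with the target phrase.
function matchesTarget(completion: string, target: string): boolean {
  const norm = (s: string) => s.trim().toLowerCase().replace(/\s+/g, " ");
  return norm(completion).endsWith(norm(target));
}
```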

Step 2

Run the three-model arena

All metrics come from the simplified TypeScript simulation. The goal is explainable behavior: memory, confidence, prediction quality, latency, and parameter complexity all arise from the same run.
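As a sketch of what "all metrics arise from the same run" could look like in data terms, the shapes below are illustrative assumptions; the field names and types are not taken from the app.

```ts
// Hypothetical per-step record and run summary (illustrative assumptions only).
interface StepMetrics {
  step: number;            // 1-based position in the token sequence
  token: string;           // word processed at this step
  memoryRetention: number; // 0..1 estimate of how much of the early clue survives
  confidence: number;      // 0..1 confidence in the running prediction
}

interface ModelRun {
  model: "RNN" | "LSTM" | "GRU";
  parameterMultiplier: number; // e.g. 1, 4, or 3 relative to the RNN baseline
  latencyMs: number;           // simulated processing time for the whole run
  steps: StepMetrics[];        // one entry per token; drives the live charts
  prediction?: string;         // revealed only after the final token
}
```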

Long Clue: Meeting Follow-up

Hi Maya, during Monday's kickoff the client specifically requested the compliance appendix, but after several unrelated updates could you please attach the final

Target word or phrase: appendix

Predictions are hidden until the run reaches the final token. This keeps the lab honest: first watch the memory behavior, then reveal each model's completion.

The Step button advances one word at a time for classroom explanation. Watch the glow labels: context fading means the model is losing the earlier clue.
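One plausible way such a glow label could be derived from the retention value, shown purely as an assumption about the mechanism rather than the app's implementation:

```ts
// Hypothetical mapping from memory retention to the glow label (assumption only).
type ContextLabel = "context strong" | "context fading" | "context lost";

function contextLabel(memoryRetention: number): ContextLabel {
  if (memoryRetention > 0.6) return "context strong"; // early clue still dominant
  if (memoryRetention > 0.1) return "context fading"; // clue weakening, as in the cards below
  return "context lost";                              // clue effectively overwritten
}
```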

Recurrent Neural Network

RNN

1x params

An RNN reads a sequence one token at a time and compresses everything it has seen into a hidden state.

Current token: Hi · Context fading · Step 1 / 23

The vanilla RNN updates its hidden state, but older clues fade as newer words overwrite the memory.

Memory retention: 15% · Confidence: 46%
Prediction locked until the run reaches the end. Press Start or Step through the sequence to reveal how this model arrives at its answer.

Long Short-Term Memory

LSTM

4x params

An LSTM adds a cell state and gates so it can choose what to write, forget, and reveal.

Current token: Hi · Context fading · Step 1 / 23

The LSTM balances new input with preserved cell memory, reducing long-context drift.

Memory retention: 16% · Confidence: 40%
Prediction locked until the run reaches the end. Press Start or Step through the sequence to reveal how this model arrives at its answer.
Input gate: 54% · Forget gate: 80% · Output gate: 66%

Gated Recurrent Unit

GRU

3x params

A GRU keeps the gating idea but merges memory and hidden state into a simpler structure.

Current token: Hi · Context fading · Step 1 / 23

The GRU blends previous memory and new context through a compact gated update.

Memory retention: 20% · Confidence: 45%
Prediction locked until the run reaches the end. Press Start or Step through the sequence to reveal how this model arrives at its answer.
Update gate: 67% · Reset gate: 69%

Step 3

Compare the outcome

The charts translate the run into visible evidence: where memory fades, where gates help, and what each model trades off among stability, prediction quality, latency, and complexity.

Complete the arena run to unlock memory, confidence, error, capability, latency, and complexity charts.

Step 4

Theory, math, architecture, strengths, and weaknesses

Each model is explained twice: first in classroom language, then with formal equations suitable for technical study.

Recurrent Neural Network

RNN

An RNN reads a sequence one token at a time and compresses everything it has seen into a hidden state.

Input x_t
Hidden h_t
Output y_t

A single recurrent hidden state carries compressed context forward.

Input x_t

The current word represented as numbers.

Hidden h_t

A compressed memory of what has been read so far.

Output y_t

The model's next-word probability guess.

How to read this math: x_t is the current word, h_t is the memory after reading it, and the softmax output becomes the next-word guess.
h_t = \tanh(W_{xh} x_t + W_{hh} h_{t-1} + b_h)
\hat{y}_t = \operatorname{softmax}(W_{hy} h_t + b_y)
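To connect the formulas to runnable code, here is a minimal TypeScript sketch of one RNN step. The helper functions and shapes are assumptions for illustration; only the update rules mirror the equations above.

```ts
// Minimal RNN step (illustrative sketch, not the app's implementation).
type Vec = number[];
type Mat = number[][];

const matVec = (W: Mat, v: Vec): Vec =>
  W.map(row => row.reduce((sum, w, i) => sum + w * v[i], 0));
const addVecs = (...vs: Vec[]): Vec =>
  vs[0].map((_, i) => vs.reduce((sum, v) => sum + v[i], 0));
const tanhVec = (v: Vec): Vec => v.map(Math.tanh);
const softmax = (v: Vec): Vec => {
  const m = Math.max(...v);
  const e = v.map(x => Math.exp(x - m));
  const z = e.reduce((sum, x) => sum + x, 0);
  return e.map(x => x / z);
};

// h_t = tanh(W_xh x_t + W_hh h_{t-1} + b_h);  y_t = softmax(W_hy h_t + b_y)
function rnnStep(
  xt: Vec, hPrev: Vec,
  Wxh: Mat, Whh: Mat, Why: Mat, bh: Vec, by: Vec
): { h: Vec; y: Vec } {
  const h = tanhVec(addVecs(matVec(Wxh, xt), matVec(Whh, hPrev), bh));
  const y = softmax(addVecs(matVec(Why, h), by));
  return { h, y };
}
```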

Strengths

  • Simple architecture
  • Fast baseline
  • Good for short dependencies

Weaknesses

  • Vanishing gradients
  • Weak long-term memory
  • Hidden state can be overwritten

Use cases

  • Small sequence classification
  • Simple autocomplete demos
  • Educational baselines

Long Short-Term Memory

LSTM

An LSTM adds a cell state and gates so it can choose what to write, forget, and reveal.

Input gate
Forget gate
Cell state
Output gate

Separate gates protect the cell state so important email clues survive longer.

Input gate

Decides how much new information should enter memory.

Forget gate

Decides what old information should be weakened or removed.

Cell state

The long-term memory highway that carries important clues forward.

Output gate

Decides which part of memory should influence the prediction.

How to read this math: The gates are small decision makers. They choose what enters memory, what gets forgotten, and what is exposed for prediction.
f_t = \sigma(W_f [x_t, h_{t-1}] + b_f)
i_t = \sigma(W_i [x_t, h_{t-1}] + b_i)
\tilde{C}_t = \tanh(W_C [x_t, h_{t-1}] + b_C)
C_t = f_t \odot C_{t-1} + i_t \odot \tilde{C}_t
o_t = \sigma(W_o [x_t, h_{t-1}] + b_o)
h_t = o_t \odot \tanh(C_t)
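A matching TypeScript sketch of one LSTM step, reusing the vector helpers from the RNN sketch; the names and shapes are again assumptions, while the gate logic follows the equations above.

```ts
// Minimal LSTM step (illustrative sketch; reuses Vec, Mat, matVec, addVecs, tanhVec from the RNN sketch).
const sigmoidVec = (v: Vec): Vec => v.map(x => 1 / (1 + Math.exp(-x)));
const hadamard = (a: Vec, b: Vec): Vec => a.map((x, i) => x * b[i]);

function lstmStep(
  xt: Vec, hPrev: Vec, cPrev: Vec,
  Wf: Mat, Wi: Mat, Wc: Mat, Wo: Mat,
  bf: Vec, bi: Vec, bc: Vec, bo: Vec
): { h: Vec; c: Vec } {
  const xh = [...xt, ...hPrev];                               // [x_t, h_{t-1}]
  const f = sigmoidVec(addVecs(matVec(Wf, xh), bf));          // forget gate
  const i = sigmoidVec(addVecs(matVec(Wi, xh), bi));          // input gate
  const cTilde = tanhVec(addVecs(matVec(Wc, xh), bc));        // candidate cell
  const c = addVecs(hadamard(f, cPrev), hadamard(i, cTilde)); // new cell state
  const o = sigmoidVec(addVecs(matVec(Wo, xh), bo));          // output gate
  const h = hadamard(o, tanhVec(c));                          // new hidden state
  return { h, c };
}
```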

Strengths

  • Excellent long-context memory
  • Controls forgetting
  • Handles delayed clues well

Weaknesses

  • More parameters
  • Higher latency
  • More complex to explain and tune

Use cases

  • Language modeling
  • Speech recognition
  • Long-range time-series forecasting

Gated Recurrent Unit

GRU

A GRU keeps the gating idea but merges memory and hidden state into a simpler structure.

Update gate
Reset gate
Candidate state
Hidden h_t

Compact gates decide how much old context to keep and how much new evidence to write.

Update gate

Controls how much previous memory should be kept.

Reset gate

Controls how much past context should be ignored for the current update.

Candidate state

A proposed new memory based on the current word.

Hidden h_t

The final compact memory used for prediction.

How to read this math: The update gate decides what to keep, the reset gate decides what to ignore, and the hidden state becomes the prediction memory.
z_t = \sigma(W_z [x_t, h_{t-1}])
r_t = \sigma(W_r [x_t, h_{t-1}])
\tilde{h}_t = \tanh(W [x_t, r_t \odot h_{t-1}])
h_t = (1 - z_t) \odot h_{t-1} + z_t \odot \tilde{h}_t
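And the corresponding GRU step, again as an illustrative TypeScript sketch that reuses the same helpers rather than the app's actual code:

```ts
// Minimal GRU step (illustrative sketch; reuses Vec, Mat, matVec, tanhVec, sigmoidVec, hadamard).
function gruStep(xt: Vec, hPrev: Vec, Wz: Mat, Wr: Mat, W: Mat): Vec {
  const xh = [...xt, ...hPrev];               // [x_t, h_{t-1}]
  const z = sigmoidVec(matVec(Wz, xh));       // update gate
  const r = sigmoidVec(matVec(Wr, xh));       // reset gate
  const xrh = [...xt, ...hadamard(r, hPrev)]; // [x_t, r_t ⊙ h_{t-1}]
  const hTilde = tanhVec(matVec(W, xrh));     // candidate state
  // h_t = (1 - z_t) ⊙ h_{t-1} + z_t ⊙ h̃_t
  return hPrev.map((hp, i) => (1 - z[i]) * hp + z[i] * hTilde[i]);
}
```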

Strengths

  • Fewer parameters than LSTM
  • Strong practical performance
  • Good speed-memory tradeoff

Weaknesses

  • Less explicit memory control than LSTM
  • Can underperform on very long dependencies

Use cases

  • Chat features
  • Mobile NLP
  • Real-time sequence prediction