Gemma Token Analysis

Notebook              Link
Logprobs Generation   Open In Colab
Token Tree Analysis   Open In Colab

This repository explores the internal stochastic nature of Gemma models. By extracting transition scores and logits from the Hugging Face transformers generation loop, we can analyze the model’s confidence levels and visualize “competing” tokens at each step of the sequence.
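As a rough illustration of what "confidence" and "competing" tokens mean here, the sketch below turns one step's raw logits into log-probabilities and ranks the top alternatives. It uses only the standard library and a made-up three-token vocabulary; the function names `log_softmax` and `top_competitors` are hypothetical and are not taken from the notebooks. In transformers, per-step scores are typically obtained from `generate(..., output_scores=True, return_dict_in_generate=True)`, and the notebooks may process them differently.

```python
import math

def log_softmax(logits):
    """Convert raw logits to log-probabilities (numerically stable)."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]

def top_competitors(logits, vocab, k=3):
    """Return the k most likely tokens as (token, logprob, prob) tuples."""
    logprobs = log_softmax(logits)
    ranked = sorted(zip(vocab, logprobs), key=lambda pair: pair[1], reverse=True)
    return [(tok, lp, math.exp(lp)) for tok, lp in ranked[:k]]

# Toy example: logits for a single generation step (illustrative values only).
step_logits = [2.0, 1.0, 0.1]
toy_vocab = ["Paris", "London", "Rome"]
for token, logprob, prob in top_competitors(step_logits, toy_vocab):
    print(f"{token:>8}  logprob={logprob:.3f}  prob={prob:.3f}")
```

A chosen token whose probability is close to that of the runner-up marks a low-confidence step, which is where alternative continuations are most interesting to inspect.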

This repository contains no confidential data/IP and is intended for demonstration and research use.

Features

Dynamic Thresholding Logic

The dynamic thresholding logic in Token_Tree_Analysis.ipynb adjusts how selective the tree search is about branching, based on how full the search queue currently is.

T_{current} = T_{min} + \min\left(1.0, \frac{|Q|}{Q_{limit}}\right) \times (T_{max} - T_{min})

Where:

  * T_{current} is the calculated probability threshold for the current step.
  * T_{min} is the min_branch_threshold (e.g., 0.1).
  * T_{max} is the max_branch_threshold (e.g., 0.5).
  * |Q| is the current length of the queue (number of active paths).
  * Q_{limit} is the soft_queue_limit (target number of active paths).

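Note: The saturation ratio |Q| / Q_{limit} is capped at 1.0, so T_{current} never exceeds T_{max}, even when the queue grows past soft_queue_limit.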
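The formula can be written as a small function. This is a minimal sketch that follows the equation term by term; the function name `dynamic_branch_threshold` and the guard against a non-positive limit are assumptions and may not match the notebook's code.

```python
def dynamic_branch_threshold(queue_len, soft_queue_limit,
                             min_branch_threshold=0.1,
                             max_branch_threshold=0.5):
    """Compute T_current from the current queue length |Q|."""
    if soft_queue_limit <= 0:
        # Assumption: a non-positive limit is treated as a configuration error.
        raise ValueError("soft_queue_limit must be positive")
    # Saturation ratio |Q| / Q_limit, capped at 1.0.
    saturation = min(1.0, queue_len / soft_queue_limit)
    return min_branch_threshold + saturation * (
        max_branch_threshold - min_branch_threshold
    )

# Threshold at a few queue sizes, with soft_queue_limit = 10.
for q in (0, 5, 10, 25):
    print(q, round(dynamic_branch_threshold(q, 10), 3))
```

With the example values T_{min} = 0.1 and T_{max} = 0.5, a half-full queue gives a threshold of about 0.3, and any queue at or above the limit gives 0.5.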

How it works behaviorally:

  1. Near-empty queue: When few paths are active, the threshold stays close to T_{min}. This encourages the search to branch out and explore even low-probability alternatives.
  2. Full queue: As the queue fills up (approaching soft_queue_limit), the threshold rises toward T_{max}. This makes the search very selective, branching only on highly probable tokens so that the number of active paths does not grow exponentially.
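The two regimes above can be seen in a toy breadth-first expansion. This is a sketch under stated assumptions, not the notebook's implementation: `expand_tree` and `toy_next_token_probs` are hypothetical names, the fixed toy distribution stands in for a real model call, and always following the single most likely token (even when it is below the threshold) is an assumption made so that every path reaches full depth.

```python
from collections import deque

def dynamic_branch_threshold(queue_len, soft_queue_limit, t_min=0.1, t_max=0.5):
    """T_current = T_min + min(1, |Q| / Q_limit) * (T_max - T_min)."""
    saturation = min(1.0, queue_len / soft_queue_limit)
    return t_min + saturation * (t_max - t_min)

def toy_next_token_probs(prefix):
    """Hypothetical stand-in for a model call: same distribution at every step."""
    return {"a": 0.55, "b": 0.30, "c": 0.15}

def expand_tree(next_token_probs, max_depth, soft_queue_limit):
    """Breadth-first expansion that branches only on tokens above the threshold."""
    queue = deque([()])  # each entry is a tuple of tokens (one active path)
    finished = []
    while queue:
        prefix = queue.popleft()
        if len(prefix) == max_depth:
            finished.append(prefix)
            continue
        threshold = dynamic_branch_threshold(len(queue), soft_queue_limit)
        probs = next_token_probs(prefix)
        best = max(probs, key=probs.get)
        for token, p in probs.items():
            # Assumption: the top token is always followed; others must clear the threshold.
            if token == best or p >= threshold:
                queue.append(prefix + (token,))
    return finished

loose = expand_tree(toy_next_token_probs, max_depth=3, soft_queue_limit=1000)
tight = expand_tree(toy_next_token_probs, max_depth=3, soft_queue_limit=2)
print(len(loose), len(tight))
```

With a generous soft_queue_limit the threshold stays near T_{min} and every token branches, so the toy tree reaches all 27 depth-3 paths. With a limit of 2 the queue saturates right after the first expansion, the threshold jumps to T_{max}, and only 3 paths survive.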

Repository Structure

Installation

  1. Clone the repository.
  2. Install the required dependencies:
pip install -U torch transformers pandas accelerate numpy huggingface-hub

Usage

  1. Open Logprobs_in_Gemma.ipynb in VS Code or Jupyter Lab.
  2. Ensure you have a Hugging Face account and an access token.
  3. Run the notebook cells to:
  4. Open Token_Tree_Analysis.ipynb to generate and analyze token decision trees.

Visualization

Requirements