The Three States: Working Directory, Staging Area, and Repository

A walkthrough of Git's three-state architecture with diagrams and practical examples, showing how files flow between the working, staging, and committed states.

Reading time: 24 min read | Author: Geek Workbench | Updated: March 31, 2026

Introduction

Git’s three-state architecture is the conceptual foundation that makes Git both powerful and, for beginners, confusing. Every change to a tracked file lives in one of three places at any given time: the working directory (your editable files), the staging area (changes prepared for the next commit), or the repository (recorded history). Understanding how changes move between these states is the key to using Git effectively.

Most version control systems use a simpler two-state model — files are either modified or committed. Git’s staging area adds a deliberate intermediate step that gives you fine-grained control over what becomes part of each commit. This design choice is what enables atomic commits, partial file staging, and the ability to craft clean, reviewable commit histories.

This guide explains the three-state model in depth, with diagrams, practical examples, and real-world scenarios. Once you internalize this model, Git’s commands stop feeling arbitrary and start making logical sense. For a broader introduction to version control, see What Is Version Control?.

When to Use / When Not to Use

Understand the three states when:

  • Learning Git for the first time — this model explains why Git works the way it does
  • Debugging unexpected git status output
  • Crafting clean commits from messy working changes
  • Using interactive staging (git add -p) to split changes into logical commits
  • Understanding why git reset and git checkout behave differently
  • Teaching Git to others — the three-state model is the most important concept to convey

The staging area is less critical when:

  • You commit all changes at once every time — git commit -a bypasses explicit staging
  • Working on solo projects with simple, linear workflows
  • Using GUI Git clients that abstract the staging area away

Core Concepts

The three states represent three snapshots of your project:

Working Directory: The files you see and edit on your filesystem. This is your active workspace where you write code, fix bugs, and make changes. Files here may be untracked (new files Git does not know about), modified (changed since the last commit), or clean (identical to the last commit).

Staging Area (Index): A hidden file (.git/index) that records which changes will be included in the next commit. Think of it as a draft or preparation area — you selectively place changes here using git add, review them with git diff --staged, and only then make them permanent with git commit.

Repository (HEAD): The recorded history of your project. Each commit captures a snapshot of everything in the staging area and links to its parent commit, forming a chain of history. Commit objects are immutable: commands such as git commit --amend do not edit a commit in place, they create a replacement commit and move the branch to it.


graph LR
    A[Working Directory<br/>Your editable files] -->|git add| B[Staging Area<br/>Prepared for commit]
    B -->|git commit| C[Repository<br/>Permanent history]
    C -->|git checkout| A
    C -->|git reset| B
    B -->|git reset| A
    A -->|git restore| A
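
The three snapshots can hold three different versions of the same file at once. The following is a minimal sketch, assuming a throwaway repository created with mktemp; the identity settings and the file name note.txt are illustrative only.

```shell
# Throwaway repository; path, identity, and file name are illustrative.
repo=$(mktemp -d) && cd "$repo" && git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "v1" > note.txt
git add note.txt
git commit -q -m "Add note"            # repository (HEAD) now holds v1

echo "v2" > note.txt
git add note.txt                       # staging area now holds v2

echo "v3" > note.txt                   # working directory now holds v3

head_copy=$(git show HEAD:note.txt)    # version in the last commit
index_copy=$(git show :note.txt)       # version in the staging area
work_copy=$(cat note.txt)              # version on disk

echo "HEAD=$head_copy index=$index_copy working=$work_copy"
# prints: HEAD=v1 index=v2 working=v3
```

Here git diff would compare v3 against v2, and git diff --staged would compare v2 against v1.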

Architecture or Flow Diagram

File State Transitions


stateDiagram-v2
    [*] --> Untracked: New file created
    Untracked --> Staged: git add
    Staged --> Committed: git commit
    Committed --> Modified: Edit file
    Modified --> Staged: git add
    Modified --> Committed: git restore
    Staged --> Modified: git restore --staged
    Committed --> Modified: git reset HEAD~1
    Committed --> Staged: git reset --soft HEAD~1
    Untracked --> [*]: git clean -f

    note right of Untracked
        Git does not track this file
        It will not be committed
    end note

    note right of Staged
        Changes are queued
        for the next commit
    end note

    note right of Committed
        Permanently recorded
        in repository history
    end note

The Complete File Lifecycle


graph TD
    A[Create new file] --> B{Tracked?}
    B -->|No| C[Untracked]
    B -->|Yes| D{Changed?}
    D -->|No| E[Unmodified<br/>Matches HEAD]
    D -->|Yes| F[Modified<br/>Working directory changed]

    C -->|git add| G[Staged<br/>New file]
    F -->|git add| H[Staged<br/>Modified file]

    G -->|git commit| I[Committed<br/>In repository]
    H -->|git commit| I

    I -->|Edit file| F
    I -->|git checkout| E
    F -->|git restore| E
    G -->|git restore --staged| C
    H -->|git restore --staged| F

Step-by-Step Guide / Deep Dive

Understanding Each State Through Examples

State 1: Working Directory

Create a file and observe its state:


# Initialize a repository
git init three-states-demo
cd three-states-demo

# Create a file
echo "Hello, World!" > hello.txt

# Check the status
git status

Output:


On branch main
No commits yet
Untracked files:
  (use "git add <file>..." to include in what will be committed)
 hello.txt
nothing added to commit but untracked files present (use "git add" to track)

The file exists in your working directory but Git does not track it yet. It is untracked — Git knows the file exists but will not include it in any commit until you explicitly add it.

State 2: Staging Area


# Stage the file
git add hello.txt

# Check the status
git status

Output:


On branch main
No commits yet
Changes to be committed:
  (use "git rm --cached <file>..." to unstage)
 new file:   hello.txt

The file has moved to the staging area. It is now “to be committed” — it will be included in the next commit, but it is not yet part of the permanent history.

State 3: Repository


# Commit the staged file
git commit -m "Add hello.txt"

# Check the status
git status

Output:


On branch main
nothing to commit, working tree clean

The file is now in the repository. It is permanently recorded in commit history. The working directory matches the repository — there are no uncommitted changes.

Modifying a Committed File

Now let’s see what happens when you edit a committed file:


# Modify the file
echo "Hello, Git!" > hello.txt

# Check the status
git status

Output:


On branch main
Changes not staged for commit:
  (use "git add <file>..." to update what will be committed)
  (use "git restore <file>..." to discard changes in working directory)
 modified:   hello.txt
no changes added to commit (use "git add" and/or "git commit -a")

The file is now modified in the working directory but not yet staged. Git detects the difference between your working copy and the last committed version. This is the most common state during active development.
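
The same lifecycle can be followed in compact form with git status --porcelain, which prints a two-letter code per file: the first column describes the staging area and the second the working directory. This is a small sketch in a throwaway repository; the file name f.txt and the identity settings are placeholders.

```shell
# Throwaway repository; path, identity, and file name are illustrative.
repo=$(mktemp -d) && cd "$repo" && git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "one" > f.txt
s1=$(git status --porcelain)    # "?? f.txt"  untracked

git add f.txt
s2=$(git status --porcelain)    # "A  f.txt"  new file, staged

git commit -q -m "Add f.txt"
echo "two" > f.txt
s3=$(git status --porcelain)    # " M f.txt"  modified, not staged

git add f.txt
s4=$(git status --porcelain)    # "M  f.txt"  modified, staged

echo "three" > f.txt
s5=$(git status --porcelain)    # "MM f.txt"  staged and modified again

printf '%s\n' "$s1" "$s2" "$s3" "$s4" "$s5"
```

The last line is worth noting: a single file can have a staged change and a further unstaged change at the same time.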

Partial Staging

One of Git’s most powerful features is the ability to stage only some changes in a file:


# Create a file with multiple changes
cat > app.py << 'EOF'
def greet():
    print("Hello")

def farewell():
    print("Goodbye")

def helper():
    print("Helper function")
EOF

git add app.py
git commit -m "Add app.py with three functions"

# Now modify all three functions
cat > app.py << 'EOF'
def greet():
    print("Hello, World!")

def farewell():
    print("See you later!")

def helper():
    print("Updated helper")
EOF

# Stage only the greet and farewell changes interactively
git add -p app.py

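Git presents the changes hunk by hunk and asks whether to stage each one. Because the three edits in this file sit only two lines apart, Git initially shows them as a single hunk; answer s to split it, then y for the greet and farewell hunks and n for the helper hunk. You can then commit the staged changes separately: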


# Commit only staged changes
git commit -m "Update greet and farewell messages"

# Check what remains unstaged
git status
# modified: app.py (the helper change is still in working directory)

# Stage and commit the remaining change
git add app.py
git commit -m "Update helper function"

This produces two clean, atomic commits from a single file with mixed changes.

Moving Files Between States


# Working Directory → Staging Area
git add <file>           # Stage specific file
git add .                # Stage all changes
git add -p <file>        # Stage interactively (hunk by hunk)

# Staging Area → Working Directory (unstage)
git restore --staged <file>   # Unstage specific file
git reset HEAD <file>         # Older syntax, same effect

# Staging Area → Repository
git commit              # Commit all staged changes
git commit -m "message" # Commit with a message

# Working Directory → Last Staged or Committed State (discard unstaged changes)
git restore <file>      # Discard working changes
git checkout -- <file>  # Older syntax, same effect

# Repository → Working Directory (restore old version)
git checkout <commit> -- <file>  # Restore file from specific commit

Production Failure Scenarios

| Scenario | Impact | Mitigation |
| --- | --- | --- |
| Accidentally committing debug code left in working directory | Broken production, exposed debug output | Always review git diff --staged before committing; use pre-commit hooks |
| Forgetting to stage a critical file | Incomplete commit, broken build on remote | Review git status before every commit; use git diff --staged to verify |
| Staging too many unrelated changes | Unreviewable commits, hard to revert specific changes | Stage logically grouped changes; use git add -p for selective staging |
| Losing uncommitted work in working directory | Lost hours of development | Commit frequently (even WIP commits); use git stash for temporary saves |
| git reset --hard on wrong branch | Permanent loss of uncommitted changes | Prefer --soft or --mixed; stash or commit before a hard reset; git reflog can recover lost commits but not uncommitted changes |
| Merge conflict leaves files in partially staged state | Confused state, incomplete resolution | Use git status to identify conflicted files; resolve all before committing |

Trade-off Analysis

| Approach | Advantages | Disadvantages | When to Use |
| --- | --- | --- | --- |
| Explicit staging (git add + git commit) | Full control, atomic commits, reviewable history | More commands to type | Production code, team projects, code review workflows |
| Skip staging (git commit -a) | Faster, fewer steps | Commits all tracked changes together, no selectivity | Solo projects, quick fixes, when all changes belong in one commit |
| Interactive staging (git add -p) | Granular control, clean commits from messy work | Slower, requires understanding of hunks | Refactoring, multi-purpose changes, preparing PRs |
| GUI staging | Visual, intuitive | Abstracts away the model, harder to debug | Beginners, visual thinkers, complex merges |

Implementation Snippets

Visualizing the Three States


# See the complete picture at once
echo "=== Working Directory Changes ==="
git diff                    # Unstaged changes

echo "=== Staging Area ==="
git diff --staged           # Staged changes (what will be committed)

echo "=== Repository Status ==="
git log --oneline -3        # Recent commits

echo "=== Overall Status ==="
git status                  # Summary of all three states

Committing with Review


# The safe commit workflow
git status                  # 1. See what changed
git diff                    # 2. Review unstaged changes
git add <files>             # 3. Stage intended changes
git diff --staged           # 4. Review what will be committed
git commit -m "message"     # 5. Commit
git log -1                  # 6. Verify the commit

Recovering from Mistakes


# Committed but forgot to stage a file
git add forgotten-file.txt
git commit --amend --no-edit   # Adds to the last commit without changing message

# Committed with wrong message
git commit --amend -m "Correct message"

# Committed to wrong branch
git reset --soft HEAD~1        # Undo commit, keep changes staged
git checkout correct-branch    # Switch to correct branch
git commit -m "message"        # Re-commit on correct branch

# Staged something you should not have
git restore --staged <file>    # Unstage without losing changes

Stashing: A Fourth State

Git stash provides a temporary holding area for uncommitted changes:


# Save working directory and staging area changes
git stash push -m "WIP: feature in progress"

# Working directory is now clean
git status
# nothing to commit, working tree clean

# List all stashes
git stash list

# Restore the most recent stash
git stash pop

# Restore a specific stash without removing it
git stash apply stash@{2}

# Drop a stash you no longer need
git stash drop stash@{0}

Observability Checklist

  • Logs: Use git status as your primary observability tool — it shows all three states at once
  • Metrics: Track the ratio of staged to unstaged changes — large unstaged deltas indicate infrequent commits
  • Traces: Use git diff and git diff --staged to trace exactly what will be committed
  • Alerts: Set up pre-commit hooks that block commits exceeding size thresholds or containing patterns like TODO, FIXME, or console.log
  • Audit: Run git log --stat to audit what files each commit touched
  • Health: Periodically run git status to ensure no long-running uncommitted work accumulates
  • Validation: Before committing, run git diff --staged to verify commit contents; before pushing, review the outgoing commits with git log -p @{u}..HEAD
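
The pre-commit hook mentioned in the checklist could look roughly like the sketch below. The blocked patterns, the hook's message, and the file app.js are hypothetical choices for illustration, and the line filter is deliberately simple; a real policy would likely use a dedicated tool.

```shell
# Throwaway repository; path, identity, and file name are illustrative.
repo=$(mktemp -d) && cd "$repo" && git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"

mkdir -p .git/hooks
cat > .git/hooks/pre-commit <<'EOF'
#!/bin/sh
# Hypothetical policy: reject commits whose staged additions contain
# debug markers. File header lines (+++) are filtered out first.
if git diff --staged --unified=0 | grep -v '^+++' \
    | grep -Eq '^\+.*(console\.log|FIXME|TODO)'; then
    echo "pre-commit: remove debug markers before committing" >&2
    exit 1
fi
EOF
chmod +x .git/hooks/pre-commit

echo 'console.log("debug");' > app.js
git add app.js
if git commit -q -m "Add app.js" 2>/dev/null; then blocked=no; else blocked=yes; fi

echo 'export const ready = true;' > app.js
git add app.js
if git commit -q -m "Add app.js" 2>/dev/null; then clean_commit=yes; else clean_commit=no; fi

echo "blocked=$blocked clean_commit=$clean_commit"
```

The hook inspects git diff --staged rather than the working directory, so it checks exactly what the commit would contain.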

Security & Compliance Considerations

  • Staging area is not a security boundary: The index (.git/index) is a binary file that references blobs in .git/objects. Staged content is compressed but not encrypted, and it is protected only by filesystem permissions
  • Committed secrets are persistent: Once a file with secrets is committed, it remains in the repository history even if you delete it in a later commit. Removing it requires rewriting history, and any secret that was pushed should be rotated. Use pre-commit hooks to scan for secrets before they are committed
  • Stash is not encrypted: git stash stores changes in the local repository’s object database. Anyone with access to that repository directory can view stash contents with git stash show -p
  • Audit trails: The staging area enables clean, atomic commits that serve as better audit trails than monolithic commits. Each commit should represent a single logical change for compliance traceability
  • Signed commits: Use git commit -S to cryptographically sign commits, proving that the staged changes were intentionally committed by the claimed author

Common Pitfalls / Anti-Patterns

  • Treating git add . as harmless: It stages everything including accidental debug files, temporary edits, and generated artifacts. Always review with git status before bulk staging
  • Confusing git reset modes: --soft keeps changes staged, --mixed (default) keeps changes in working directory, --hard discards everything. Using the wrong mode causes data loss or confusion
  • Not understanding that git commit -a skips staging: It automatically stages all tracked modified files and commits them. Untracked files are still ignored. This bypasses the review step
  • Leaving the staging area in an inconsistent state: Staging some changes, getting distracted, and coming back days later leads to accidental commits of unrelated changes. Commit or unstage promptly
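  • Using git checkout to unstage: git checkout -- <file> restores the file from the staging area, so it discards your unstaged edits and leaves the staged version in place, which is the opposite of unstaging. git checkout HEAD -- <file> discards both staged and unstaged changes. Use git restore --staged to unstage while keeping working changes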
  • Ignoring the diff before commit: Skipping git diff --staged is the #1 cause of accidental commits with debug code, wrong files, or incomplete changes

Quick Recap Checklist

  • Working Directory: your editable files on the filesystem
  • Staging Area: the preparation zone for the next commit (.git/index)
  • Repository: the permanent, immutable history of commits
  • git add moves changes from working directory to staging area
  • git commit moves staged changes from staging area to repository
  • git status shows the state of all files across all three states
  • git diff shows unstaged changes; git diff --staged shows staged changes
  • git add -p enables interactive, hunk-by-hunk staging
  • git stash provides a temporary fourth state for uncommitted work
  • Always review staged changes with git diff --staged before committing
  • git restore --staged unstages without discarding working changes
  • Commits are awkward to change once shared, so the staging area is your last chance to review

Interview Questions

1. What is the staging area and why does Git have it?

The staging area (also called the index) is an intermediate state between your working directory and the repository. It acts as a preparation zone where you selectively choose which changes will be included in the next commit. Git has it because it enables atomic commits — you can modify ten files but only commit the three that form a complete logical change. Without the staging area, every commit would include all modified files, making it impossible to craft clean, reviewable history from messy working sessions.

2. What is the difference between `git reset --soft`, `--mixed`, and `--hard`?

These three modes control what happens to your changes when you undo a commit. --soft moves HEAD back but keeps all changes staged — perfect for amending a commit. --mixed (the default) moves HEAD back and keeps changes in the working directory but unstaged — useful for restaging selectively. --hard moves HEAD back and discards all changes entirely — dangerous, as working directory modifications are permanently lost. The mnemonic: soft keeps everything, mixed keeps working files, hard keeps nothing.
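
A quick way to compare the three modes is to undo the same commit three times and look at git status --porcelain after each. This is a sketch in a throwaway repository with placeholder names.

```shell
# Throwaway repository; path, identity, and file name are illustrative.
repo=$(mktemp -d) && cd "$repo" && git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "a" > f.txt && git add f.txt && git commit -q -m "first"
echo "b" > f.txt && git commit -q -am "second"

git reset -q --soft HEAD~1
soft=$(git status --porcelain)     # "M  f.txt": change is still staged
git commit -q -m "second"

git reset -q --mixed HEAD~1
mixed=$(git status --porcelain)    # " M f.txt": change is unstaged
git commit -q -am "second"

git reset -q --hard HEAD~1
hard=$(git status --porcelain)     # empty: change is gone from index and disk
content=$(cat f.txt)               # "a"

printf 'soft=[%s] mixed=[%s] hard=[%s] content=%s\n' "$soft" "$mixed" "$hard" "$content"
```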

3. How does `git add -p` work and when should you use it?

git add -p (patch mode) presents each changed hunk (contiguous block of changes) in a file and asks whether to stage it. You respond with y (yes), n (no), s (split the hunk smaller), or e (edit manually). Use it when a single file contains multiple unrelated changes that should be separate commits — for example, fixing a bug and adding a feature in the same file. It produces cleaner, more reviewable commit history.

4. Can a file be in multiple states simultaneously?

Yes. A file can have different parts in different states. For example, lines 1-10 of a file might be staged while lines 11-20 remain modified but unstaged. This is what git add -p makes possible. In that case git status lists the file twice, once under "Changes to be committed" and once under "Changes not staged for commit" (MM in git status --short). The index itself does not track hunks: it stores one complete version of each file. Partial staging works because git add -p writes a version of the file to the index that contains only the hunks you selected.

5. What is `.git/index` and what format does it use?

The staging area is stored in .git/index, a binary file consisting of a header, a list of entries sorted by path, optional extensions, and a trailing checksum. Each entry maps a pathname to a blob object ID, along with the file mode and cached stat data (size, timestamps) that lets Git detect changes quickly. When you run git add, Git computes the SHA-1 hash of your file content, stores the content as a blob object, and records the mapping in the index. The index does not store file content directly; it holds metadata pointing to objects in the object database. You can inspect it with git ls-files --stage.
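
The mapping from path to blob can be inspected directly. The sketch below stages one file in a throwaway repository and compares the index entry with the hash Git computes for the file; hello.txt and the identity settings are placeholders.

```shell
# Throwaway repository; path, identity, and file name are illustrative.
repo=$(mktemp -d) && cd "$repo" && git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"

printf 'hello\n' > hello.txt
git add hello.txt

# Index entry format: <mode> <object id> <stage><TAB><path>
entry=$(git ls-files --stage hello.txt)

blob=$(git hash-object hello.txt)     # object ID Git computes for this content
stored=$(git cat-file -p "$blob")     # the blob already exists, before any commit

echo "$entry"
echo "blob content: $stored"
```

The blob is written to the object database at git add time, which is why staged content can survive even if the working file is later overwritten.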

6. Why is `git commit -a` dangerous in team environments?

git commit -a automatically stages all tracked modified files and commits them in one step, bypassing explicit review of the staging area. In team environments this is risky because it can sweep unrelated edits, debug code, or half-finished work into a commit that reviewers then have to untangle. It removes the deliberate review step that git add + git diff --staged provides. For code review workflows, always prefer explicit staging.

7. What is the difference between `git checkout` and `git restore`?

git restore is the newer, more focused command introduced in Git 2.23. git restore --staged <file> unstages changes without touching the working directory. git restore <file> discards unstaged working changes by restoring the file from the staging area (add --source=HEAD to restore from the last commit instead). In contrast, git checkout is older and overloaded: git checkout <branch> switches branches while git checkout -- <file> restores a file from the staging area. Leaving out the -- is risky because a name that matches both a branch and a file is ambiguous. Prefer git restore for files and git switch for branches.

8. How does `git stash` interact with the three states?

Git stash acts as a fourth holding area by saving both your working directory changes and staged changes onto a stash stack. When you run git stash push, Git records the state of the index and the working directory as commit objects referenced by refs/stash, then resets the index and working tree to match HEAD. Untracked files are left alone unless you pass -u (--include-untracked). git stash list shows the saved entries. git stash pop restores the most recent stash and drops it; git stash apply restores without dropping. Note that stash contents are stored as ordinary objects in .git/objects and are not encrypted.

9. What happens to the staging area when you amend a commit?

When you run git commit --amend, Git creates a new commit with the same parent as the original but replaces the tree with whatever is currently staged. If your staging area is clean, amending just changes the commit message. If you have new staged changes, they get folded into the previous commit. This rewrites history — the original commit is orphaned and will eventually be garbage collected. Never amend commits that have been pushed to a shared branch.
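
The effect of amending can be checked by comparing commit IDs before and after. This is a sketch in a throwaway repository; the file names and identity settings are placeholders.

```shell
# Throwaway repository; path, identity, and file names are illustrative.
repo=$(mktemp -d) && cd "$repo" && git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "a" > a.txt && git add a.txt && git commit -q -m "Add files"
old=$(git rev-parse HEAD)

echo "b" > b.txt
git add b.txt                              # stage the forgotten file
git commit -q --amend --no-edit            # fold it into the previous commit

new=$(git rev-parse HEAD)                  # a different commit ID
files=$(git ls-tree --name-only HEAD | tr '\n' ' ')
previous=$(git rev-parse 'HEAD@{1}')       # the original commit, via the reflog

echo "old=$old"
echo "new=$new"
echo "files in amended commit: $files"
```

The original commit is still reachable through the reflog for a while, which is what makes a mistaken amend recoverable locally.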

10. Can you uncommit after running `git commit`? How?

Yes, using git reset. git reset --soft HEAD~1 moves the branch back one commit and keeps all changes staged, ready to recommit with a new message. git reset --mixed HEAD~1 (the default) moves the branch back and unstages the changes but keeps them in the working directory. git reset --hard HEAD~1 moves the branch back and discards the changes from both the staging area and the working directory. After a hard reset, the dropped commit can still be recovered through git reflog until its reflog entry expires (30 days by default for unreachable commits) and garbage collection removes it. Uncommitted changes discarded by --hard cannot be recovered this way.
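
Recovering a commit after a hard reset might look like the following sketch, which relies on the reflog being enabled (the default for ordinary, non-bare repositories). The repository, file name, and identity settings are placeholders.

```shell
# Throwaway repository; path, identity, and file name are illustrative.
repo=$(mktemp -d) && cd "$repo" && git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "one" > f.txt && git add f.txt && git commit -q -m "one"
echo "two" > f.txt && git commit -q -am "two"
lost=$(git rev-parse HEAD)

git reset -q --hard HEAD~1           # drop the last commit
after_reset=$(cat f.txt)             # "one"

git reflog -n 2                      # HEAD@{1} is where HEAD pointed before the reset
git reset -q --hard 'HEAD@{1}'       # move the branch back to the dropped commit

recovered=$(git rev-parse HEAD)
echo "after reset: $after_reset, after recovery: $(cat f.txt)"
```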

11. What is the order of operations when running `git add`, `git commit`, and `git checkout`?

git add computes the SHA-1 of the file content, stores the content as a blob in .git/objects, then updates the file's entry in .git/index to reference that blob. git commit reads the index, creates tree objects from it, creates a commit object linking to the top-level tree and the parent commit, and moves the current branch (and therefore HEAD) to the new commit. The index is not emptied: it still lists every tracked file, but it now matches HEAD, so nothing shows as staged. git checkout <commit> reads the tree for the given commit, updates the index to match that tree, and writes the files into the working directory.
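
The same sequence can be reproduced with Git's low-level plumbing commands, which makes each step visible. This is a sketch of the idea rather than what git add and git commit literally execute internally; the repository, identity, and file name are placeholders.

```shell
# Throwaway repository; path, identity, and file name are illustrative.
repo=$(mktemp -d) && cd "$repo" && git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"

# Roughly what git add does: store a blob, then record it in the index.
blob=$(printf 'hello\n' | git hash-object -w --stdin)
git update-index --add --cacheinfo 100644,"$blob",hello.txt

# Roughly what git commit does: index -> tree -> commit -> move the branch.
tree=$(git write-tree)
commit=$(git commit-tree "$tree" -m "Add hello.txt")
git update-ref HEAD "$commit"

committed=$(git show HEAD:hello.txt)                      # "hello"
index_entries=$(git ls-files --stage | wc -l | tr -d ' ') # index still has its entry

echo "committed content: $committed, index entries after commit: $index_entries"
```

Because no file was ever written to disk here, git status would report hello.txt as deleted in the working directory, which underlines that the three areas are independent.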

12. Why should you review `git diff --staged` before every commit?

git diff --staged shows exactly what will be committed: the delta between HEAD and the staging area. This is the last line of defense against accidental commits of debug code, wrong files, or incomplete changes. git diff alone shows only unstaged changes (working directory vs staging area), so an empty git diff does not mean the staged changes are correct. This single habit prevents the majority of commit-related incidents in production repositories.

13. How does `git restore` differ from `git reset` when unmodifying a file?

git restore <file> is the modern command to discard working directory changes for a specific file. git reset HEAD <file> was the traditional way to unstage a file. In Git 2.23+, git restore is the recommended approach because it separates the discarding operation from the branch-manipulation semantics of git reset. Use git restore --staged <file> to unstage and git restore <file> to discard working directory changes.

14. How does `git clean` interact with the three states?

git clean removes untracked files from the working directory only; it has no effect on tracked files in any state, and it does not touch the staging area or repository. git clean -n shows what would be deleted (dry run). git clean -f removes untracked files, -d also removes untracked directories, and -x includes ignored files. Combined with git restore or git reset --hard, it returns a working directory to a pristine state. Note: git clean is irreversible. The deleted files were never stored in Git, so there is no built-in way to recover them.
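
The sketch below shows the dry run followed by the real deletion, with a modified tracked file alongside to confirm it is left alone. The file names and the throwaway repository are placeholders.

```shell
# Throwaway repository; path, identity, and file names are illustrative.
repo=$(mktemp -d) && cd "$repo" && git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "tracked" > keep.txt && git add keep.txt && git commit -q -m "Add keep.txt"
echo "edit" >> keep.txt              # tracked file with an unstaged change
echo "scratch" > scratch.tmp         # untracked file

preview=$(git clean -n)              # dry run: "Would remove scratch.tmp"
git clean -f -q                      # actually delete untracked files

remaining=$(git status --porcelain)  # " M keep.txt": the tracked edit survives
echo "$preview"
echo "$remaining"
```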

15. What is a hunk in the context of Git staging?

A hunk is a contiguous block of changed lines (additions and deletions) in a diff, shown together with a few lines of surrounding context. Changes that sit close together end up in the same hunk; proximity, not logical relatedness, decides the grouping. When you run git add -p, Git presents each hunk and asks whether to stage it. The underlying diff uses the Myers algorithm by default, and the diff.algorithm setting selects others such as patience or histogram. Using s (split) breaks a hunk into smaller pieces wherever unchanged lines separate the changes inside it. Using e (edit) lets you manually edit the patch to stage specific lines. Understanding hunks is essential for clean, atomic commits.

16. What happens to file modes (executable) in the staging area?

Git stores the file mode in the index entry along with the blob reference: 100644 for regular files, 100755 for executables, and 120000 for symbolic links. When you git add a file, Git records its mode. A change to the executable bit appears in git diff --staged as "old mode 100644" and "new mode 100755" lines. If core.fileMode is set to false, Git ignores executable-bit differences in the working tree, which is common on filesystems that do not preserve them. This matters for deployment scripts that rely on executable bits. You can verify the stored mode with git ls-files --stage, where the mode is the first field of each entry.
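
A staged mode change can be produced without touching the filesystem by using git update-index --chmod, as in this sketch. The script name deploy.sh and the throwaway repository are placeholders.

```shell
# Throwaway repository; path, identity, and file name are illustrative.
repo=$(mktemp -d) && cd "$repo" && git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"

printf '#!/bin/sh\necho deploy\n' > deploy.sh
git add deploy.sh && git commit -q -m "Add deploy.sh"
before=$(git ls-files --stage deploy.sh | cut -d' ' -f1)   # 100644

git update-index --chmod=+x deploy.sh                      # set the mode in the index only
after=$(git ls-files --stage deploy.sh | cut -d' ' -f1)    # 100755

mode_diff=$(git diff --staged | grep ' mode ')             # old mode / new mode lines
echo "before=$before after=$after"
echo "$mode_diff"
```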

17. Why is the staging area not a security boundary?

The staging area is an ordinary local file, .git/index, that references blobs in .git/objects. Neither is encrypted or access-controlled beyond filesystem permissions, so anyone who can read your working copy can inspect staged content with git ls-files --stage or git show :<path>. The index itself is never pushed, but anything you commit from it is shared with everyone who can read the remote. Sensitive data should never reach the staging area. Use pre-commit hooks with tools like git-secrets or trufflehog to stop secrets before they are committed.

18. What is the relationship between the staging area and a merge conflict?

During a merge, Git records up to three index entries for each conflicted file: the common ancestor version (stage 1), the "ours" version (stage 2), and the "theirs" version (stage 3). Resolving a conflict means editing the file and then staging it with git add, which replaces those entries with a single stage 0 entry and marks the file resolved. git status lists conflicted files under "Unmerged paths". Git does not check the content when you stage it, so if conflict markers are still in the file they will be committed as-is. git merge --abort can undo an in-progress merge and restore the staging area to its pre-merge state.
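
The stage entries are visible with git ls-files -u while a conflict is pending. The sketch below creates a small conflict in a throwaway repository; the branch name feature, the file name, and the identity settings are placeholders.

```shell
# Throwaway repository; path, identity, branch, and file names are illustrative.
repo=$(mktemp -d) && cd "$repo" && git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "base" > f.txt && git add f.txt && git commit -q -m "base"
git checkout -q -b feature
echo "theirs" > f.txt && git commit -q -am "feature change"
git checkout -q -                              # back to the starting branch
echo "ours" > f.txt && git commit -q -am "mainline change"

git merge feature > /dev/null 2>&1 || true     # stops with a conflict in f.txt

# Third field of each unmerged entry is the stage number.
stages=$(git ls-files -u | cut -f1 | cut -d' ' -f3 | tr '\n' ' ')
ancestor=$(git show :1:f.txt)                  # "base"
ours=$(git show :2:f.txt)                      # "ours"
theirs=$(git show :3:f.txt)                    # "theirs"

echo "merged" > f.txt                          # resolve by hand
git add f.txt                                  # collapses stages 1-3 into stage 0
unmerged_after=$(git ls-files -u)              # empty
git commit -q --no-edit

echo "stages: $stages ancestor=$ancestor ours=$ours theirs=$theirs"
```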

19. How does `git reset` differ from `git restore` for unstaging?

git reset HEAD <file> and git restore --staged <file> both copy the HEAD version of the file into the index, and both keep working directory changes intact. When given a path, git reset does not move HEAD or the branch pointer; it is only without a path (for example git reset HEAD~1) that it moves the current branch. That dual role is what makes git reset easy to misuse. git restore never moves a branch, so git restore --staged is the recommended modern syntax for unstaging. Both commands accept directories and pathspecs, so either can affect multiple files.

20. What is the performance implication of the staging area in large repositories?

The staging area (index) is stored in a single binary file that Git must read on every git status, git add, and git commit. It is a flat list of entries sorted by path, not a tree structure, so by default any update rewrites the whole file. In repositories with hundreds of thousands of files this can become a bottleneck, and git status must additionally stat every tracked file to detect changes. Large repositories benefit from git sparse-checkout (optionally with the sparse index), the split index (core.splitIndex), the untracked cache (core.untrackedCache), and the filesystem monitor (core.fsmonitor), along with .gitignore discipline.

Conclusion

Internalize this mental model and Git transforms from a confusing set of incantations into a transparent, predictable system. You will stop guessing what commands do and start reasoning about them in terms of how they move data between these three states.
