Most developers use Git like a magic box: you type something, something gets saved, and history somehow works. Until something goes wrong. To stop treating Git as a black box, start by understanding what actually lives inside the .git/ directory. There are four object types: blob, tree, commit and tag. That is it. Everything else follows from these.
Four object types
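Git is a content-addressed object database. Every object is identified by the SHA-1 of its content, prefixed with a short header that records the object's type and size. (SHA-1 is the default; newer Git versions can also create SHA-256 repositories.) Change one byte and the SHA changes.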
Blob
A blob stores raw file content. It has no name, no path – that is the tree’s job. Two identical files in different directories are one blob.
echo "Hello, World!" | git hash-object -w --stdin
# 8ab686eafeb1f44702738c8b0f24f2567c36da6d
git cat-file -p 8ab686eafeb1f44702738c8b0f24f2567c36da6d
# Hello, World!
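The ID is not magic either. As a minimal sketch, assuming a SHA-1 repository and `sha1sum` from GNU coreutils, the hypothetical helper `blob_sha` below recomputes a blob ID by hand: it hashes the header `blob <size in bytes>`, a NUL byte, and then the raw content.

```shell
# blob_sha FILE: print the object ID Git would assign to FILE's content.
# Hypothetical helper, not part of Git; assumes sha1sum is on PATH.
blob_sha() {
  size=$(wc -c < "$1" | tr -d ' ')     # content length in bytes
  # Header ("blob <size>" + NUL byte) followed by the unmodified content.
  { printf 'blob %s\0' "$size"; cat "$1"; } | sha1sum | cut -d ' ' -f 1
}

tmp=$(mktemp)
printf 'any bytes will do\n' > "$tmp"
blob_sha "$tmp"        # should print the same ID as: git hash-object "$tmp"
blob_sha /dev/null     # e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 (the empty blob)
rm -f "$tmp"
```

Note that the file name never enters the hash, which is why renaming a file does not create a new blob.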
Tree
A tree is a list of entries, each with a file mode, a type (blob or tree), a SHA and a name. It is Git's equivalent of a directory in the file system.
git cat-file -p HEAD^{tree}
# 100644 blob a8a940627d13... README.md
# 100644 blob 1f7391f9274f... composer.json
# 040000 tree 2a1bcad13f8e... src
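The mode column is the least obvious part of that listing. The sketch below decodes it with a hypothetical helper, `describe_mode`, and runs it over the sample listing from above. The modes listed are the ones Git commonly writes; inside the raw tree object a directory is stored as `40000`, and `git cat-file -p` pads it to `040000` for display.

```shell
# describe_mode MODE: explain the mode field of a tree entry.
# Hypothetical helper for illustration only.
describe_mode() {
  case "$1" in
    100644)       echo "regular file" ;;
    100755)       echo "executable file" ;;
    120000)       echo "symbolic link" ;;
    040000|40000) echo "directory (subtree)" ;;
    160000)       echo "submodule (gitlink)" ;;
    *)            echo "unknown mode" ;;
  esac
}

# Sample listing copied from the article; the SHAs are abbreviated there.
while read -r mode type sha name; do
  printf '%s: %s, stored as a %s\n' "$name" "$(describe_mode "$mode")" "$type"
done <<'EOF'
100644 blob a8a940627d13... README.md
100644 blob 1f7391f9274f... composer.json
040000 tree 2a1bcad13f8e... src
EOF
```

In a real repository you could feed it live data with `git cat-file -p 'HEAD^{tree}'` in place of the here-document.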
Commit
A commit points to one tree (the repository root), to zero or more parent commits, and contains author, committer, timestamp and message.
git cat-file -p HEAD
# tree 2a1bcad13f8e8c8d9d2b7d6c4f2a1e9b8c7d6e5f
# parent a3d5e2f1b4c6d7e8f9a0b1c2d3e4f5a6b7c8d9e0
# author Henryk Tews <henryk@tews.pl> 1746531600 +0200
# committer Henryk Tews <henryk@tews.pl> 1746531600 +0200
#
# Add product export service
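The number of `parent` lines tells you what kind of commit you are looking at. A rough sketch, assuming the text format printed by `git cat-file -p` (the helper name `classify_commit` is made up for this example):

```shell
# classify_commit: read raw commit text on stdin and report its kind.
# Hypothetical helper; it only counts "parent" header lines.
classify_commit() {
  # Headers end at the first blank line, so stop there: the word "parent"
  # inside the commit message must not be counted.
  parents=$(awk '/^$/ { exit } $1 == "parent" { n++ } END { print n + 0 }')
  case "$parents" in
    0) echo "root commit" ;;
    1) echo "ordinary commit" ;;
    *) echo "merge commit ($parents parents)" ;;
  esac
}

# The sample commit from the article.
classify_commit <<'EOF'
tree 2a1bcad13f8e8c8d9d2b7d6c4f2a1e9b8c7d6e5f
parent a3d5e2f1b4c6d7e8f9a0b1c2d3e4f5a6b7c8d9e0
author Henryk Tews <henryk@tews.pl> 1746531600 +0200
committer Henryk Tews <henryk@tews.pl> 1746531600 +0200

Add product export service
EOF
# prints: ordinary commit
```

In a repository, `git cat-file -p HEAD | classify_commit` does the same for the current commit.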
Tag
An annotated tag is a separate object pointing to another object (usually a commit). A lightweight tag is just a ref – a file in .git/refs/tags/ containing a SHA.
References – human names for SHAs
A SHA like a3d5e2f1b4c6 is precise but impractical. References are text files containing a SHA, or lines in .git/packed-refs once Git has packed them.
cat .git/refs/heads/main
# a3d5e2f1b4c6d7e8f9a0b1c2d3e4f5a6b7c8d9e0
cat .git/HEAD
# ref: refs/heads/main    <- attached HEAD (points to branch)
# or:
# a3d5e2f1b4c6...         <- detached HEAD (points to commit)
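Following HEAD to a commit is therefore just a matter of reading one or two files. The sketch below does it by hand with a hypothetical helper, `resolve_head`. It assumes the classic file-based ref storage and falls back to `.git/packed-refs`; it is an illustration of the lookup, not a replacement for `git rev-parse HEAD`.

```shell
# resolve_head GITDIR: print the commit SHA that HEAD currently resolves to.
# Hypothetical helper; prints nothing if the branch has no commit yet.
resolve_head() {
  gitdir=$1
  head=$(cat "$gitdir/HEAD")
  case "$head" in
    "ref: "*)
      ref=${head#ref: }                    # e.g. refs/heads/main
      if [ -f "$gitdir/$ref" ]; then
        cat "$gitdir/$ref"                 # loose ref: one file, one SHA
      elif [ -f "$gitdir/packed-refs" ]; then
        # packed ref: lines of the form "<sha> <refname>"
        awk -v r="$ref" '$2 == r { print $1 }' "$gitdir/packed-refs"
      fi
      ;;
    *) echo "$head" ;;                     # detached HEAD: the SHA itself
  esac
}

# Scratch directory that mimics .git/, using the sample SHA from the article.
demo=$(mktemp -d)
mkdir -p "$demo/refs/heads"
echo "ref: refs/heads/main" > "$demo/HEAD"
echo "a3d5e2f1b4c6d7e8f9a0b1c2d3e4f5a6b7c8d9e0" > "$demo/refs/heads/main"
resolve_head "$demo"    # a3d5e2f1b4c6d7e8f9a0b1c2d3e4f5a6b7c8d9e0
rm -rf "$demo"
```

Against a real repository you would call it as `resolve_head .git`.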
Practical implications
A branch is one file containing a SHA. Creating a branch is as cheap as writing one file. A merge that is not a fast-forward creates a new commit with two parents (or more, in an octopus merge).
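Git stores snapshots, not diffs. Every commit is a complete picture: a tree that leads to every blob in the project. Diffs are computed on the fly. Blobs shared between commits are not duplicated; that is the effect of content addressing. (Packfiles do delta-compress objects on disk, but that is a storage detail below the object model.)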
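The deduplication falls out of the naming scheme, and a toy store is enough to see it. The sketch below is deliberately not Git's on-disk format (no header, no zlib compression); `store_blob` is a hypothetical helper that names each stored file after the SHA-1 of its bytes.

```shell
# store_blob STORE FILE: copy FILE into STORE under a name derived from its
# content and print that name. Toy illustration of content addressing.
store_blob() {
  id=$(sha1sum < "$2" | cut -d ' ' -f 1)
  [ -e "$1/$id" ] || cp "$2" "$1/$id"   # same content already stored: nothing to write
  echo "$id"
}

store=$(mktemp -d)
work=$(mktemp -d)
printf 'same bytes\n'  > "$work/a.txt"
printf 'same bytes\n'  > "$work/b.txt"  # different name, identical content
printf 'other bytes\n' > "$work/c.txt"
for f in "$work"/*.txt; do
  store_blob "$store" "$f"              # a.txt and b.txt print the same ID
done
ls "$store" | wc -l                     # 2 objects for 3 files
rm -rf "$store" "$work"
```

The same thing happens across commits: a file that did not change between two snapshots hashes to the same ID, so the second commit simply points at the object that is already there.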
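SHA never lies, for all practical purposes. If two SHAs are identical, the content is identical, and any corruption or tampering shows up as a changed SHA. This is the foundation of Git integrity. SHA-1 collisions have been constructed in research, which is why Git uses a SHA-1 implementation that detects the known collision attacks.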
The .git/ structure
.git/
├── HEAD           # where you are right now
├── config         # repository configuration
├── index          # staging area (binary)
├── objects/       # object database
│   ├── aa/        # first 2 chars of SHA = subdirectory
│   │   └── 3f...
│   └── pack/      # packed objects
└── refs/
    ├── heads/     # local branches
    ├── remotes/   # remote references
    └── tags/
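The `objects/aa/3f...` split is mechanical: the first two hex characters of the SHA name the subdirectory and the remaining characters name the file. A small sketch, with `loose_path` as a hypothetical helper and the blob SHA from the earlier example as input:

```shell
# loose_path SHA: path of a loose object, relative to .git/.
# Hypothetical helper; it only builds the path and does not check the disk.
loose_path() {
  rest=${1#??}             # everything after the first two characters
  dir=${1%"$rest"}         # the first two characters
  echo "objects/$dir/$rest"
}

loose_path 8ab686eafeb1f44702738c8b0f24f2567c36da6d
# objects/8a/b686eafeb1f44702738c8b0f24f2567c36da6d
```

The file may not exist even for a valid object: once Git has packed the repository, the object lives inside a packfile under objects/pack/ instead. A loose object file is also zlib-compressed, so `cat` on it prints binary noise; use `git cat-file -p` to read it.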
Exploration commands
# Object type
git cat-file -t SHA
# Object content
git cat-file -p SHA
# Graphical commit tree
git log --oneline --graph --all
# All files in HEAD
git ls-tree -r HEAD
Summary
Git is four object types plus references. Understanding this structure explains behaviours that look like magic: why branches are cheap, why rebase rewrites SHAs, why two identical files take no more space than one. Next post: commits and history - good commit messages, rebase -i, cherry-pick and bisect.
