'\" t
.\" Title: gitdatamodel
.\" Author: [FIXME: author] [see http://www.docbook.org/tdg5/en/html/author]
.\" Generator: DocBook XSL Stylesheets vsnapshot
.\" Date: 2026-02-01
.\" Manual: Git Manual
.\" Source: Git 2.53.0
.\" Language: English
.\"
.TH "GITDATAMODEL" "7" "2026\-02\-01" "Git 2\&.53\&.0" "Git Manual"
.\" -----------------------------------------------------------------
.\" * Define some portability stuff
.\" -----------------------------------------------------------------
.\" ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.\" http://bugs.debian.org/507673
.\" http://lists.gnu.org/archive/html/groff/2009-02/msg00013.html
.\" ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.ie \n(.g .ds Aq \(aq
.el .ds Aq '
.\" -----------------------------------------------------------------
.\" * set default formatting
.\" -----------------------------------------------------------------
.\" disable hyphenation
.nh
.\" disable justification (adjust text to left margin only)
.ad l
.\" -----------------------------------------------------------------
.\" * MAIN CONTENT STARTS HERE *
.\" -----------------------------------------------------------------
.SH "NAME"
gitdatamodel \- Git\*(Aqs core data model
.SH "SYNOPSIS"
.sp
gitdatamodel
.SH "DESCRIPTION"
.sp
It\(cqs not necessary to understand Git\(cqs data model to use Git, but it\(cqs very helpful when reading Git\(cqs documentation so that you know what it means when the documentation says "object", "reference" or "index"\&.
.sp
Git\(cqs core operations use 4 kinds of data:
.sp
.RS 4
.ie n \{\
\h'-04' 1.\h'+01'\c
.\}
.el \{\
.sp -1
.IP " 1." 4.2
.\}
Objects: commits, trees, blobs, and tag objects
.RE
.sp
.RS 4
.ie n \{\
\h'-04' 2.\h'+01'\c
.\}
.el \{\
.sp -1
.IP " 2." 4.2
.\}
References: branches, tags, remote\-tracking branches, etc
.RE
.sp
.RS 4
.ie n \{\
\h'-04' 3.\h'+01'\c
.\}
.el \{\
.sp -1
.IP " 3." 4.2
.\}
The index, also known as the staging area
.RE
.sp
.RS 4
.ie n \{\
\h'-04' 4.\h'+01'\c
.\}
.el \{\
.sp -1
.IP " 4." 4.2
.\}
Reflogs: logs of changes to references ("ref log")
.RE
.SH "OBJECTS"
.sp
All of the commits and files in a Git repository are stored as "Git objects"\&. Git objects never change after they\(cqre created, and every object has an ID, like \fB1b61de420a21a2f1aaef93e38ecd0e45e8bc9f0a\fR\&.
.sp
This means that if you have an object\(cqs ID, you can always recover its exact contents as long as the object hasn\(cqt been deleted\&.
.sp
Every object has:
.sp
.RS 4
.ie n \{\
\h'-04' 1.\h'+01'\c
.\}
.el \{\
.sp -1
.IP " 1." 4.2
.\}
an
\fBID\fR
(aka "object name"), which is a cryptographic hash of its type and contents\&. It\(cqs fast to look up a Git object using its ID\&. The ID is usually represented in hexadecimal, like
\fB1b61de420a21a2f1aaef93e38ecd0e45e8bc9f0a\fR\&.
.RE
.sp
.RS 4
.ie n \{\
\h'-04' 2.\h'+01'\c
.\}
.el \{\
.sp -1
.IP " 2." 4.2
.\}
a
\fBtype\fR\&. There are 4 types of objects:
commits,
trees,
blobs, and
tag objects\&.
.RE
.sp
.RS 4
.ie n \{\
\h'-04' 3.\h'+01'\c
.\}
.el \{\
.sp -1
.IP " 3." 4.2
.\}
\fBcontents\fR\&. The structure of the contents depends on the type\&.
.RE
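.sp
As a rough illustration of how an ID is derived from an object\(cqs type and contents, the following Python sketch hashes an object by hand\&. It assumes a repository that uses the SHA\-1 object format, where the hashed bytes are the type, a space, the content length in decimal, a NUL byte, and then the contents\&. The function name \fBobject_id\fR is made up for this example and is not part of Git\&.
.sp
.if n \{\
.RS 4
.\}
.nf
```python
import hashlib

NUL = bytes([0])  # a single zero byte, written without an escape sequence


def object_id(obj_type, contents):
    # Assumed SHA1 object format: "<type> <size>" + NUL + contents.
    header = (obj_type + " " + str(len(contents))).encode("ascii") + NUL
    return hashlib.sha1(header + contents).hexdigest()


# Print the IDs of an empty blob and an empty tree.
print(object_id("blob", b""))
print(object_id("tree", b""))
```
.fi
.if n \{\
.RE
.\}
.sp
If the assumption holds, the first line printed should match what \fBgit\fR \fBhash\-object\fR reports for an empty file\&.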
.sp
Here\(cqs how each type of object is structured:
.PP
commit
.RS 4
A commit contains these required fields (though there are other optional fields):
.sp
.RS 4
.ie n \{\
\h'-04' 1.\h'+01'\c
.\}
.el \{\
.sp -1
.IP " 1." 4.2
.\}
The full directory structure of all the files in that version of the repository and each file\(cqs contents, stored as the
\fBtree\fR
ID of the commit\(cqs top\-level directory
.RE
.sp
.RS 4
.ie n \{\
\h'-04' 2.\h'+01'\c
.\}
.el \{\
.sp -1
.IP " 2." 4.2
.\}
Its
\fBparent commit ID(s)\fR\&. The first commit in a repository has 0 parents, regular commits have 1 parent, and merge commits have 2 or more parents
.RE
.sp
.RS 4
.ie n \{\
\h'-04' 3.\h'+01'\c
.\}
.el \{\
.sp -1
.IP " 3." 4.2
.\}
An
\fBauthor\fR
and the time the commit was authored
.RE
.sp
.RS 4
.ie n \{\
\h'-04' 4.\h'+01'\c
.\}
.el \{\
.sp -1
.IP " 4." 4.2
.\}
A
\fBcommitter\fR
and the time the commit was committed
.RE
.sp
.RS 4
.ie n \{\
\h'-04' 5.\h'+01'\c
.\}
.el \{\
.sp -1
.IP " 5." 4.2
.\}
A
\fBcommit message\fR
.sp
Here\(cqs how an example commit is stored:
.sp
.if n \{\
.RS 4
.\}
.nf
tree 1b61de420a21a2f1aaef93e38ecd0e45e8bc9f0a
parent 4ccb6d7b8869a86aae2e84c56523f8705b50c647
author Maya 1759173425 \-0400
committer Maya 1759173425 \-0400
.sp
Add README
.fi
.if n \{\
.RE
.\}
.sp
Like all other objects, commits can never be changed after they\(cqre created\&. For example, "amending" a commit with
\fBgit\fR
\fBcommit\fR
\fB\-\-amend\fR
creates a new commit with the same parent and leaves the original commit unchanged\&.
.sp
Git does not store the diff for a commit: when you ask Git to show the commit with
\fBgit-show\fR(1), it calculates the diff from its parent on the fly\&.
.RE
.RE
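.sp
The required fields of a commit can be pulled out of its text form with a few lines of Python\&. This is only a sketch: it assumes the usual layout in which an empty line separates the header fields from the message, it ignores optional fields, and the names \fBparse_commit\fR and \fBdescribe\fR are invented for this example\&. The sample text mirrors the example commit above\&.
.sp
.if n \{\
.RS 4
.\}
.nf
```python
NEWLINE = chr(10)

SAMPLE = """tree 1b61de420a21a2f1aaef93e38ecd0e45e8bc9f0a
parent 4ccb6d7b8869a86aae2e84c56523f8705b50c647
author Maya 1759173425 -0400
committer Maya 1759173425 -0400

Add README
"""


def parse_commit(text):
    # Header lines come first; the first empty line starts the message.
    lines = text.splitlines()
    split_at = lines.index("") if "" in lines else len(lines)
    fields = {"tree": None, "parents": [], "author": None, "committer": None}
    for line in lines[:split_at]:
        key, _, value = line.partition(" ")
        if key == "parent":
            fields["parents"].append(value)  # a commit may have several
        elif key in fields:
            fields[key] = value
    fields["message"] = NEWLINE.join(lines[split_at + 1:])
    return fields


def describe(fields):
    # Classify the commit by its number of parents.
    count = len(fields["parents"])
    if count == 0:
        return "root commit"
    if count == 1:
        return "regular commit"
    return "merge commit"


commit = parse_commit(SAMPLE)
print(describe(commit), commit["tree"], commit["message"])
```
.fi
.if n \{\
.RE
.\}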
.PP
tree
.RS 4
A tree is how Git represents a directory\&. It can contain files or other trees (which are subdirectories)\&. It lists, for each item in the tree:
.sp
.RS 4
.ie n \{\
\h'-04' 1.\h'+01'\c
.\}
.el \{\
.sp -1
.IP " 1." 4.2
.\}
The
\fBfilename\fR, for example
\fBhello\&.py\fR
.RE
.sp
.RS 4
.ie n \{\
\h'-04' 2.\h'+01'\c
.\}
.el \{\
.sp -1
.IP " 2." 4.2
.\}
The
\fBfile type\fR, which must be one of these five types:
.sp
.RS 4
.ie n \{\
\h'-04'\(bu\h'+03'\c
.\}
.el \{\
.sp -1
.IP \(bu 2.3
.\}
\fBregular file\fR
.RE
.sp
.RS 4
.ie n \{\
\h'-04'\(bu\h'+03'\c
.\}
.el \{\
.sp -1
.IP \(bu 2.3
.\}
\fBexecutable file\fR
.RE
.sp
.RS 4
.ie n \{\
\h'-04'\(bu\h'+03'\c
.\}
.el \{\
.sp -1
.IP \(bu 2.3
.\}
\fBsymbolic link\fR
.RE
.sp
.RS 4
.ie n \{\
\h'-04'\(bu\h'+03'\c
.\}
.el \{\
.sp -1
.IP \(bu 2.3
.\}
\fBdirectory\fR
.RE
.sp
.RS 4
.ie n \{\
\h'-04'\(bu\h'+03'\c
.\}
.el \{\
.sp -1
.IP \(bu 2.3
.\}
\fBgitlink\fR
(for use with submodules)
.RE
.RE
.sp
.RS 4
.ie n \{\
\h'-04' 3.\h'+01'\c
.\}
.el \{\
.sp -1
.IP " 3." 4.2
.\}
The
\fBobject ID\fR
of the object that holds the contents of the file, directory, or gitlink\&.
.sp
For example, this is how a tree containing one directory (\fBsrc\fR) and one file (\fBREADME\&.md\fR) is stored:
.sp
.if n \{\
.RS 4
.\}
.nf
100644 blob 8728a858d9d21a8c78488c8b4e70e531b659141f README\&.md
040000 tree 89b1d2e0495f66d6929f4ff76ff1bb07fc41947d src
.fi
.if n \{\
.RE
.\}
.sp
.RE
.RE
.if n \{\
.sp
.\}
.RS 4
.it 1 an-trap
.nr an-no-space-flag 1
.nr an-break-flag 1
.br
.ps +1
\fBNote\fR
.ps -1
.br
.sp
In the output above, Git displays the file type of each tree entry using a format that\(cqs loosely modelled on Unix file modes (\fB100644\fR is "regular file", \fB100755\fR is "executable file", \fB120000\fR is "symbolic link", \fB040000\fR is "directory", and \fB160000\fR is "gitlink")\&. It also displays the object\(cqs type: \fBblob\fR for files and symlinks, \fBtree\fR for directories, and \fBcommit\fR for gitlinks\&.
.sp .5v
.RE
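.sp
The modes listed in the note above map directly onto a small lookup table\&. The Python sketch below assumes the pretty\-printed listing format shown in the tree example, not the way Git stores a tree on disk, and splits each line into its mode, object type, ID, and filename\&. \fBFILE_TYPES\fR and \fBparse_tree_listing\fR are hypothetical names used only here\&.
.sp
.if n \{\
.RS 4
.\}
.nf
```python
FILE_TYPES = {
    "100644": "regular file",
    "100755": "executable file",
    "120000": "symbolic link",
    "040000": "directory",
    "160000": "gitlink",
}

LISTING = """100644 blob 8728a858d9d21a8c78488c8b4e70e531b659141f README.md
040000 tree 89b1d2e0495f66d6929f4ff76ff1bb07fc41947d src
"""


def parse_tree_listing(text):
    entries = []
    for line in text.splitlines():
        # Split on whitespace three times so names with spaces stay intact.
        mode, obj_type, obj_id, name = line.split(None, 3)
        entries.append((name, FILE_TYPES[mode], obj_type, obj_id))
    return entries


for name, file_type, obj_type, obj_id in parse_tree_listing(LISTING):
    print(name, file_type, obj_type, obj_id[:7])
```
.fi
.if n \{\
.RE
.\}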
.PP
blob
.RS 4
A blob object contains a file\(cqs contents\&.
.sp
When you make a commit, Git stores the full contents of each file that you changed as a blob\&. For example, if you have a commit that changes 2 files in a repository with 1000 files, that commit will create 2 new blobs, and use the previous blob ID for the other 998 files\&. This means that commits can use relatively little disk space even in a very large repository\&.
.RE
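.sp
To see why unchanged files add no new blobs, consider a toy content\-addressed store keyed by blob ID\&. The following Python sketch is not how Git is implemented\&. It assumes the SHA\-1 object format for the ID and uses the invented helpers \fBblob_id\fR and \fBstore_snapshot\fR to count how many new blobs each snapshot adds\&.
.sp
.if n \{\
.RS 4
.\}
.nf
```python
import hashlib

NUL = bytes([0])  # a single zero byte, written without an escape sequence


def blob_id(contents):
    # Assumed SHA1 object format: "blob <size>" + NUL + contents.
    header = ("blob " + str(len(contents))).encode("ascii") + NUL
    return hashlib.sha1(header + contents).hexdigest()


def store_snapshot(store, files):
    # Add each file to the store, keyed by ID; return how many were new.
    added = 0
    for contents in files.values():
        key = blob_id(contents)
        if key not in store:
            store[key] = contents
            added += 1
    return added


store = {}
files = {"file%d.txt" % n: ("contents %d" % n).encode() for n in range(1000)}
first = store_snapshot(store, files)

# Change 2 of the 1000 files, then take a second snapshot.
files["file0.txt"] = b"edited 0"
files["file1.txt"] = b"edited 1"
second = store_snapshot(store, files)

print(first, second, len(store))
```
.fi
.if n \{\
.RE
.\}
.sp
With 1000 distinct files and 2 edits, the script should print \fB1000 2 1002\fR: the second snapshot adds only 2 entries to the store\&.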
.PP
tag object
.RS 4
Tag objects contain these required fields (though there are other optional fields):
.sp
.RS 4
.ie n \{\
\h'-04' 1.\h'+01'\c
.\}
.el \{\
.sp -1
.IP " 1." 4.2
.\}
The
\fBID\fR
of the object it references
.RE
.sp
.RS 4
.ie n \{\
\h'-04' 2.\h'+01'\c
.\}
.el \{\
.sp -1
.IP " 2." 4.2
.\}
The
\fBtype\fR
of the object it references
.RE
.sp
.RS 4
.ie n \{\
\h'-04' 3.\h'+01'\c
.\}
.el \{\
.sp -1
.IP " 3." 4.2
.\}
The
\fBtagger\fR
and tag date
.RE
.sp
.RS 4
.ie n \{\
\h'-04' 4.\h'+01'\c
.\}
.el \{\
.sp -1
.IP " 4." 4.2
.\}
A
\fBtag message\fR, similar to a commit message
.RE
.RE
.sp
Here\(cqs how an example tag object is stored:
.sp
.if n \{\
.RS 4
.\}
.nf
object 750b4ead9c87ceb3ddb7a390e6c7074521797fb3
type commit
tag v1\&.0\&.0
tagger Maya 1759927359 \-0400
.sp
Release version 1\&.0\&.0
.fi
.if n \{\
.RE
.\}
.sp
.if n \{\
.sp
.\}
.RS 4
.it 1 an-trap
.nr an-no-space-flag 1
.nr an-break-flag 1
.br
.ps +1
\fBNote\fR
.ps -1
.br
.sp
All of the examples in this section were generated with \fBgit\fR \fBcat\-file\fR \fB\-p\fR \fI