This module provides an interface to the following objects:

  • Project - represents a repository
  • Commit - represents a commit in a repository
  • Tree - represents a directory and its content. Each Commit has a root tree.
  • File - represents a file path, including all parent directories/trees
  • Blob - Binary Large OBject, represents the content of a file.
  • Author - represents a combination of author name and email.

Commit, Tree and Blob are straightforward representations of the objects Git uses internally. Reading Chapter 2 of the Pro Git book (free and open source) will help in understanding these objects.

Common methods

All objects have a unique key. For git objects (Commit, Tree, Blob) it is the object SHA hash; for Project it is the project URI; for File it is the filename; for Author it is the author name and email. Objects of the same type and having the same key will be considered equivalent:

>>> sha = 'f2a7fcdc51450ab03cb364415f14e634fa69b62c'
>>> Commit(sha) == Commit(sha)
True
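
The key-based equality described above can be sketched without the library. This is only an illustration of the idea, not oscar's implementation: the Keyed, ToyCommit and ToyBlob classes are hypothetical names invented for this example.

```python
class Keyed(object):
    """Hypothetical base class: equality and hashing depend on (type, key)."""

    def __init__(self, key):
        self.key = key

    def __eq__(self, other):
        # Same concrete type and same key means "the same object".
        return type(self) is type(other) and self.key == other.key

    def __ne__(self, other):
        # Needed on Python 2, harmless on Python 3.
        return not self == other

    def __hash__(self):
        # Hash consistently with __eq__ so objects work in sets and dicts.
        return hash((type(self).__name__, self.key))


class ToyCommit(Keyed):
    pass


class ToyBlob(Keyed):
    pass


sha = 'f2a7fcdc51450ab03cb364415f14e634fa69b62c'
same = ToyCommit(sha) == ToyCommit(sha)          # same type, same key
cross_type = ToyCommit(sha) == ToyBlob(sha)      # same key, different type
unique = len({ToyCommit(sha), ToyCommit(sha)})   # duplicates collapse in a set
```

Including the type in the comparison matters because a Commit and a Blob could, in principle, be created from the same key string while representing different things.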

It is possible to iterate over all objects of a given type using .all()

classmethod _Base.all()

Iterate all objects of the given type

This might be useful to get a list of all projects, or a list of all file names.

Returns: a generator of objects of the given type (Project objects in the example below)

E.g. to iterate over all repositories of user2589 on GitHub:

>>> for project in Project.all('user2589_'):
...     print(project.uri)

GitObject methods

These methods are shared by Commit, Tree, Blob.

All git objects are instantiated from either a 40-character hex string SHA or a 20-byte binary SHA. In most cases you will use the hex form; the binary form is needed only in the relatively rare cases where you have to interface with binary data.

>>> Commit('f2a7fcdc51450ab03cb364415f14e634fa69b62c')
>>> Commit('\xf2\xa7\xfc\xdcQE\n\xb0<\xb3dA_\x14\xe64\xfai\xb6,')

Whatever form of SHA was used to instantiate the object, it will have properties:

  • sha - 40-byte hex string
  • bin_sha - 20-byte binary string
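
The relation between the two forms can be shown with the standard library alone. The sketch below is an assumption about how such a conversion might look, not the library's code; normalize_sha is a hypothetical helper. It uses the same SHA as the examples above, written as a bytes literal for the binary form.

```python
import binascii


def normalize_sha(sha):
    """Return (hex_sha, bin_sha) for a 40-char hex SHA or a 20-byte binary SHA."""
    if isinstance(sha, bytes) and len(sha) == 20:
        # Binary form: derive the hex form from it.
        return binascii.hexlify(sha).decode('ascii'), sha
    if isinstance(sha, bytes):
        sha = sha.decode('ascii')  # hex digits passed as a byte string
    if len(sha) != 40:
        raise ValueError('expected a 40-character hex or 20-byte binary SHA')
    # Hex form: derive the binary form from it.
    return sha, binascii.unhexlify(sha)


hex_sha = 'f2a7fcdc51450ab03cb364415f14e634fa69b62c'
bin_sha = b'\xf2\xa7\xfc\xdcQE\n\xb0<\xb3dA_\x14\xe64\xfai\xb6,'

from_hex = normalize_sha(hex_sha)
from_bin = normalize_sha(bin_sha)
```

Both calls produce the same pair, which is why it does not matter which form was used to create the object.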

All git objects, when coerced to str, return their internal representation. This matters mostly for Blob, where it gives access to the file content.

Class reference

class oscar.Project(uri)

Projects are initialized with a URI:
  • GitHub: {user}_{repo}, e.g. user2589_minicms
  • GitLab: gl_{user}_{repo}
  • Bitbucket: bb_{user}_{repo}
  • Bioconductor: bioconductor.org_{user}_{repo}
  • KDE: kde.org_{user}_{repo}
  • Drupal: drupal.org_{user}_{repo}
  • Googlesource: android.googlesource.com_{repo}_{user}
  • Linux kernel: git.kernel.org_{user}_{repo}
  • PostgreSQL: git.postgresql.org_{user}_{repo}
  • GNU Savannah: git.savannah.gnu.org_{user}_{repo}
  • ZX2C4: git.zx2c4.com_{user}_{repo}
  • GNOME: gitlab.gnome.org_{user}_{repo}
  • repo.or.cz: repo.or.cz_{user}_{repo}
  • Salsa: salsa.debian.org_{user}_{repo}
  • SourceForge: sourceforge.net_{user}_{repo}
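
As a rough illustration of the naming scheme above, a URI can be assembled from a platform prefix, a user and a repository name. The project_uri helper and the PREFIXES table below are hypothetical and cover only a few of the listed platforms; Googlesource is left out because its listed pattern puts the repository before the user.

```python
# Prefixes copied from the list above; GitHub URIs carry no prefix.
PREFIXES = {
    'github': '',
    'gitlab': 'gl',
    'bitbucket': 'bb',
    'kde': 'kde.org',
    'salsa': 'salsa.debian.org',
}


def project_uri(platform, user, repo):
    """Build a '{prefix}_{user}_{repo}' URI (hypothetical helper)."""
    prefix = PREFIXES[platform]
    parts = [prefix, user, repo] if prefix else [user, repo]
    return '_'.join(parts)


github_uri = project_uri('github', 'user2589', 'minicms')
gitlab_uri = project_uri('gitlab', 'user', 'repo')
```

The resulting string is what would be passed to Project(...).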

Projects are iterable:

>>> for commit in Project('user2589_minicms'):  # doctest: +SKIP
...     print(commit.sha)

Commits can be checked for membership in a project, either by their SHA hash or by a Commit object itself:

>>> sha = 'e38126dbca6572912013621d2aa9e6f7c50f36bc'
>>> sha in Project('user2589_minicms')
True
>>> Commit(sha) in Project('user2589_minicms')
True

commit_shas: SHA1 hashes of all commits in the project

>>> Project('user2589_django-currencies').commit_shas
...         # doctest: +NORMALIZE_WHITESPACE

commits: a generator of all Commit objects in the project. It has the same effect as iterating over a Project instance itself, with some additional validation of commit dates.

>>> tuple(Project('user2589_django-currencies').commits)
...       # doctest: +NORMALIZE_WHITESPACE
(<Commit: 2dbcd43f077f2b5511cc107d63a0b9539a6aa2a7>,
 <Commit: 7572fc070c44f85e2a540f9a5a05a95d1dd2662d>)

commits_fp: a commit chain obtained by following only the first parent, mimicking git log --first-parent. Thus, you only get a small subset of the full commit tree:

>>> p = Project('user2589_minicms')
>>> set(c.sha for c in p.commits_fp).issubset(p.commit_shas)

In scenarios where branches are not important, it can save a lot of computing.

Note: commits will come in order from the latest to the earliest.
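
The first-parent walk can be sketched on a toy in-memory history. This is an illustration under stated assumptions, not the library's code: the PARENTS mapping and the first_parent_chain helper are hypothetical, and single letters stand in for real SHAs.

```python
# Toy history: each commit maps to a tuple of its parents, first parent first.
PARENTS = {
    'e': ('d', 'c'),  # merge commit: first parent 'd', second parent 'c'
    'd': ('b',),
    'c': ('b',),      # side branch, reachable only through the merge
    'b': ('a',),
    'a': (),          # root commit
}


def first_parent_chain(head, parents):
    """Yield commits from head back to the root, following only first parents."""
    commit = head
    while commit is not None:
        yield commit
        commit_parents = parents[commit]
        commit = commit_parents[0] if commit_parents else None


chain = list(first_parent_chain('e', PARENTS))
```

Here the chain starts at the head ('e') and ends at the root ('a'), latest first, and the side-branch commit 'c' never appears. That omission is where the savings come from when branches are not important.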


head: the HEAD commit of the repository

>>> Project('user2589_minicms').head
<Commit: f2a7fcdc51450ab03cb364415f14e634fa69b62c>
>>> Project('RoseTHERESA_SimpleCMS').head
<Commit: a47afa002ccfd3e23920f323b172f78c5c970250>

tail: the SHA of the first commit, found by following first parents

>>> Project('user2589_minicms').tail

class oscar.Commit(sha)

A git commit object.

Commits have some special properties. Most object properties provided by this project are lazy, i.e. they are computed when you access them for the first time. The following Commit properties are all instantiated at once on the first access to any of them:

  • tree: root Tree of the commit
  • parent_shas: tuple of parent commit sha hashes
  • message: str, first line of the commit message
  • full_message: str, full commit message
  • author: str, Name <email>
  • authored_at: str, unix_epoch+timezone
  • committer: str, Name <email>
  • committed_at: str, unix_epoch+timezone
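
All of these fields live in one raw git commit object, which is one plausible reason for loading them together. The sketch below parses the standard git commit layout into the field names listed above. It is not oscar's parser: parse_commit is a hypothetical helper, the SHAs and the author are made up, and the exact string layout of authored_at and committed_at is an assumption.

```python
FAKE_TREE = '1' * 40    # made-up SHA, for illustration only
FAKE_PARENT = '2' * 40  # made-up SHA, for illustration only

RAW = '\n'.join([
    'tree ' + FAKE_TREE,
    'parent ' + FAKE_PARENT,
    'author Jane Doe <jane@example.com> 1337145807 +1100',
    'committer Jane Doe <jane@example.com> 1337145807 +1100',
    '',
    'Fix typo',
    '',
    'A longer explanation of the change.',
])

TIME_FIELD = {'author': 'authored_at', 'committer': 'committed_at'}


def parse_commit(raw):
    """Split a raw commit into header fields and message in a single pass."""
    header, _, full_message = raw.partition('\n\n')
    fields = {'full_message': full_message,
              'message': full_message.split('\n', 1)[0]}
    parent_shas = []
    for line in header.split('\n'):
        key, _, value = line.partition(' ')
        if key == 'tree':
            fields['tree'] = value
        elif key == 'parent':
            parent_shas.append(value)
        elif key in TIME_FIELD:
            # "Name <email> epoch tz" -> identity plus "epoch tz"
            identity, epoch, tz = value.rsplit(' ', 2)
            fields[key] = identity
            fields[TIME_FIELD[key]] = epoch + ' ' + tz
    fields['parent_shas'] = tuple(parent_shas)
    return fields


commit = parse_commit(RAW)
```

A root commit has no parent lines, so parent_shas would be an empty tuple; a merge commit has two or more.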

blob_shas: SHA hashes of all blobs in the commit

>>> Commit('af0048f4aac8f4760bf9b816e01524d7fb20a3fc').blob_shas
...        # doctest: +NORMALIZE_WHITESPACE

blobs: a generator of Blob objects included in this commit

>>> tuple(Commit('af0048f4aac8f4760bf9b816e01524d7fb20a3fc').blobs)
...              # doctest: +NORMALIZE_WHITESPACE
(<Blob: b2f49ffef1c8d7ce83a004b34035f917713e2766>,
 <Blob: c92011c5ccc32a9248bd929a6e56f846ac5b8072>,
 <Blob: bf3c2d2df2ef710f995b590ac3e2c851b592c871>)

child_shas: SHA hashes of child commits. Essentially, this is the reverse of parent_shas.

>>> Commit('1e971a073f40d74a1e72e07c682e1cba0bae159b').child_shas
('9bd02434b834979bb69d0b752a403228f2e385e8',)
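
The sense in which children are the reverse of parents can be shown by inverting a parent mapping. This is a toy sketch, not how the library stores the relation; PARENTS and invert_parents are hypothetical names, and single letters stand in for SHAs.

```python
from collections import defaultdict

# Toy history: commit -> tuple of parent commits.
PARENTS = {
    'e': ('d', 'c'),
    'd': ('b',),
    'c': ('b',),
    'b': ('a',),
    'a': (),
}


def invert_parents(parents):
    """Build a parent -> children mapping from a child -> parents mapping."""
    children = defaultdict(list)
    for child, commit_parents in parents.items():
        for parent in commit_parents:
            children[parent].append(child)
    # Sort for a deterministic result and freeze the lists into tuples.
    return {parent: tuple(sorted(kids)) for parent, kids in children.items()}


CHILDREN = invert_parents(PARENTS)
```

A commit with several children (here 'b') is a branching point, while a commit missing from the result (here 'e') has no children in this history.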


children: a generator of child Commit objects

>>> tuple(Commit('1e971a073f40d74a1e72e07c682e1cba0bae159b').children)
(<Commit: 9bd02434b834979bb69d0b752a403228f2e385e8>,)


parents: a generator of parent commits. If you only need hashes (and not Commit objects), use .parent_shas instead.

>>> c = Commit('e38126dbca6572912013621d2aa9e6f7c50f36bc')
>>> tuple(c.parents)
(<Commit: ab124ab4baa42cd9f554b7bb038e19d4e3647957>,)


project_names: URIs of projects including this commit. This property can be used to find all forks of a project by its first commit.

>>> c = Commit('f2a7fcdc51450ab03cb364415f14e634fa69b62c')
>>> isinstance(c.project_names, tuple)
True
>>> len(c.project_names) > 0
True
>>> 'user2589_minicms' in c.project_names
True


A generator of Projects in which this commit is included.

class oscar.Tree(sha)

A representation of a git tree object; essentially, a directory.

Trees are iterable. Each element of the iteration is a 3-tuple: (mode, filename, sha)

  • mode is an ASCII string of octal digits, similar to a file mode
    in Unix systems. Subtrees always have mode "40000"
  • filename is a string filename, not including directories
  • sha is a 40-character hex string: the SHA of the Blob holding the file content (or of the subtree)


Iteration is not recursive. For a recursive walk, use Tree.traverse() or Tree.files.
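
The 3-tuples mirror how git itself stores a tree: a sequence of entries, each consisting of the mode, a space, the filename, a NUL byte and the 20-byte binary SHA. The sketch below parses that standard layout. It is not the library's code; parse_tree is a hypothetical helper and the two SHAs are made up.

```python
import binascii


def parse_tree(data):
    """Yield (mode, filename, hex_sha) for each entry of a raw git tree body."""
    pos = 0
    while pos < len(data):
        space = data.index(b' ', pos)    # end of the mode
        nul = data.index(b'\0', space)   # end of the filename
        mode = data[pos:space].decode('ascii')
        filename = data[space + 1:nul].decode('utf-8')
        sha = binascii.hexlify(data[nul + 1:nul + 21]).decode('ascii')
        yield mode, filename, sha
        pos = nul + 21                   # skip the 20-byte binary SHA


FAKE_BLOB = b'\x01' * 20  # made-up binary SHA of a file
FAKE_TREE = b'\x02' * 20  # made-up binary SHA of a subdirectory
raw = (b'100644 .gitignore\0' + FAKE_BLOB +
       b'40000 docs\0' + FAKE_TREE)

entries = list(parse_tree(raw))
```

The "docs" entry carries mode "40000", which marks it as a subtree; its content is not expanded, which is what non-recursive iteration means here.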

Both files and blobs can be checked for membership, either by their id (filename or SHA) or a corresponding object:

>>> tree = Tree("d4ddbae978c9ec2dc3b7b3497c2086ecf7be7d9d")
>>> '.gitignore' in tree
>>> File('.keep') in tree
>>> '83d22195edc1473673f1bf35307aea6edf3c37e3' in tree
>>> Blob('83d22195edc1473673f1bf35307aea6edf3c37e3') in tree

len(tree) returns the number of files under the tree, including files in subtrees but not the subtrees themselves:

>>> len(Tree("d4ddbae978c9ec2dc3b7b3497c2086ecf7be7d9d"))

A tuple of all file content shas, including files in subdirectories


A generator of Blob objects with file content. It does include files in subdirectories.

>>> tuple(Tree('d20520ef8c1537a42628b72d481b8174c0a1de84').blobs
...       )  # doctest: +ELLIPSIS, +NORMALIZE_WHITESPACE
(<Blob: 2bdf5d686c6cd488b706be5c99c3bb1e166cf2f6>, ...,
 <Blob: c006bef767d08b41633b380058a171b7786b71ab>)

A dict of all files and their content/blob sha under this tree. It includes recursive files (i.e. files in subdirectories). It does NOT include subdirectories themselves.


Recursively traverse the tree. This generates 3-tuples of the same format as direct tree iteration, but recursively includes the content of subtrees.

Returns: generator of (mode, filename, blob/tree sha)

>>> c = Commit("1e971a073f40d74a1e72e07c682e1cba0bae159b")
>>> len(list(c.tree.traverse()))
>>> c = Commit('e38126dbca6572912013621d2aa9e6f7c50f36bc')
>>> len(list(c.tree.traverse()))
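
A recursive walk of this kind can be sketched over a toy object store. The details are assumptions made for illustration, not confirmed behaviour of Tree.traverse(): in this sketch subtree entries are yielded too, and filenames are reported as full paths. TREES and traverse are hypothetical names, and short labels stand in for SHAs.

```python
# Toy object store: tree "sha" -> list of (mode, filename, sha) entries.
TREES = {
    'root': [('100644', 'README', 'blob1'),
             ('40000', 'docs', 'docs-tree')],
    'docs-tree': [('100644', 'index.rst', 'blob2')],
}


def traverse(tree_sha, trees, prefix=''):
    """Yield (mode, path, sha) for a tree and, recursively, its subtrees."""
    for mode, name, sha in trees[tree_sha]:
        path = prefix + name
        yield mode, path, sha
        if mode == '40000':  # subtree: descend into it
            for entry in traverse(sha, trees, path + '/'):
                yield entry


entries = list(traverse('root', TREES))
# Keep only files: a path -> blob sha mapping without the subtrees themselves.
files = {path: sha for mode, path, sha in entries if mode != '40000'}
```

Filtering out the "40000" entries leaves a mapping of files only, which matches the description of counting files under the tree but not the subtrees themselves.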

class oscar.File(path)

Files are initialized with a path, starting from a commit root tree:

>>> File('.gitignore')  # doctest: +SKIP
>>> File('docs/Index.rst')  # doctest: +SKIP

commit_shas: SHA1 hashes of all commits changing this file

NOTE: this relation considers only the diff against the first parent, which substantially limits its application

>>> commits = File('minicms/templatetags/').commit_shas
>>> len(commits) > 0
>>> isinstance(commits, tuple)
>>> isinstance(commits[0], str)
>>> len(commits[0]) == 40

commits: all commits changing the file

>>> cs = tuple(File('minicms/templatetags/').commits)
>>> len(cs) > 0
>>> isinstance(cs[0], Commit)

class oscar.Blob(sha)

SHAs of Commits in which this blob has been introduced or modified.

NOTE: commits removing this blob are not included


Commits where this blob has been added or changed

NOTE: commits removing this blob are not included


Content of the blob
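
Because a Blob's key is its SHA, the key is fully determined by the content. In git, a blob SHA is the SHA-1 of a short header ("blob", a space, the content length in bytes, a NUL byte) followed by the content. The sketch below follows that standard rule; git_blob_sha is a hypothetical helper, not a function provided by this module.

```python
import hashlib


def git_blob_sha(content):
    """Return the 40-character hex SHA git assigns to a blob with this content."""
    header = b'blob ' + str(len(content)).encode('ascii') + b'\0'
    return hashlib.sha1(header + content).hexdigest()


empty_sha = git_blob_sha(b'')
# → 'e69de29bb2d1d6434b8b29ae775ad8c2e48c5391' (git's well-known empty blob)
other_sha = git_blob_sha(b'print("hello")\n')
```

Two files with identical content therefore share one blob, regardless of their names or the projects they live in.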

class oscar.Author(full_email)

Authors are initialized with a combination of name and email, as they appear in git configuration.

>>> Author('John Doe <>')  # doctest: +SKIP

At this point we don’t have a relation to map all aliases of the same author, so keep in mind this object represents an alias, not a person.
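
To see why one person can map to several Author objects, it helps to split the key into its name and email parts. The sketch below is a hypothetical helper for illustration, not part of this module, and the names and addresses are made up.

```python
import re

# "Name <email>": the name may be empty and so may the email.
AUTHOR_RE = re.compile(r'^(?P<name>.*?)\s*<(?P<email>[^<>]*)>$')


def split_author(full_email):
    """Split a 'Name <email>' string into a (name, email) pair."""
    match = AUTHOR_RE.match(full_email)
    if match is None:
        raise ValueError('not in "Name <email>" form: %r' % full_email)
    return match.group('name'), match.group('email')


alias_a = 'Jane Doe <jane@example.com>'
alias_b = 'jdoe <jane@example.com>'

name_a, email_a = split_author(alias_a)
name_b, email_b = split_author(alias_b)
same_key = alias_a == alias_b      # different strings, so different Author keys
same_email = email_a == email_b    # yet plausibly the same person
```

Grouping aliases by email, as hinted here, is only a heuristic; shared or empty addresses would merge unrelated people.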


SHA1 of all commits authored by the Author

>>> commits = Author('user2589 <>').commit_shas
>>> len(commits) > 50
>>> isinstance(commits, tuple)
>>> isinstance(commits[0], str)
>>> len(commits[0]) == 40

A generator of all Commit objects authored by the Author

>>> commits = tuple(Author('user2589 <>').commits)
>>> len(commits) > 50
>>> isinstance(commits[0], Commit)