
Get the current git hash in a Python script

by Anton Shumikhin · Feb 8, 2025
TLDR

To get the current Git commit hash with Python, use the following code snippet:

    import subprocess

    print(subprocess.check_output(['git', 'rev-parse', 'HEAD']).decode('ascii').strip())

This code uses the subprocess module to run git rev-parse HEAD, which prints the full hash of the commit that is currently checked out, then decodes the output and strips the trailing newline.

Options and drawbacks

GitPython for streamlined repository access

GitPython exposes a Git repository as Python objects. Install it with pip (pip install GitPython), then read the hash from the repository object:

    from git import Repo

    # Walk up from the current directory until a repository is found
    repo = Repo(search_parent_directories=True)
    commit_hash = repo.head.object.hexsha
    print(commit_hash)

Bear in mind that GitPython can leak system resources in long-running processes, so refer to the GitPython documentation for usage guidance.

Reading the .git directory directly

Avoiding external libraries? Use pathlib to directly read from the .git directory:

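    from pathlib import Path

    # .git/HEAD holds either "ref: refs/heads/<branch>" or, when detached, a bare hash
    head_file = Path('.git/HEAD')
    if head_file.is_file():
        head = head_file.read_text().strip()
        if head.startswith('ref: '):
            # Follow the symbolic ref to the file that stores the branch's hash
            print((head_file.parent / head[len('ref: '):]).read_text().strip())
        else:
            # Detached HEAD: the file already contains the hash
            print(head)

This approach assumes a standard layout. It fails if the branch ref has been packed into .git/packed-refs, and it finds nothing in worktrees or submodules, where .git is a file rather than a directory.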
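As a hedged sketch of a more defensive reader: the helper below handles a detached HEAD and falls back to .git/packed-refs when the loose ref file is missing. The name read_head_hash and its repo_root parameter are illustrative, not part of any library. It still assumes .git is a directory, so worktrees and submodules remain out of scope.

```python
from pathlib import Path
from typing import Optional


def read_head_hash(repo_root: str = ".") -> Optional[str]:
    """Return the commit hash HEAD points to, or None if it cannot be resolved."""
    git_dir = Path(repo_root) / ".git"
    head_file = git_dir / "HEAD"
    if not head_file.is_file():
        return None

    head = head_file.read_text().strip()
    if not head.startswith("ref: "):
        # Detached HEAD: the file contains the hash itself.
        return head

    ref_name = head[len("ref: "):]  # e.g. "refs/heads/main"
    ref_file = git_dir / ref_name
    if ref_file.is_file():
        return ref_file.read_text().strip()

    # The ref may have been packed; each data line is "<hash> <refname>".
    packed = git_dir / "packed-refs"
    if packed.is_file():
        for line in packed.read_text().splitlines():
            if line.startswith(("#", "^")):
                continue  # header comment or peeled-tag line
            parts = line.split(" ", 1)
            if len(parts) == 2 and parts[1].strip() == ref_name:
                return parts[0]
    return None


if __name__ == "__main__":
    print(read_head_hash("."))
```

Returning None instead of raising keeps the caller in control when the script runs outside a repository. When git itself is available, asking git remains the more reliable option.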

Package it up with "git describe"

Want more than just the commit hash? git describe returns a string that combines the nearest tag, the number of commits made since that tag, and an abbreviated hash:

    import subprocess

    # Produces e.g. "<tag>-<commits since tag>-g<short hash>"
    version = subprocess.check_output(['git', 'describe', '--tags', '--always']).decode().strip()
    print(version)

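The --tags flag lets git describe use lightweight tags as well as annotated ones. The --always flag makes it fall back to the abbreviated commit hash when no tag is reachable, instead of exiting with an error.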
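The string that comes back takes one of three shapes: a bare tag when HEAD sits exactly on it, <tag>-<count>-g<hash> when commits have been made since the tag, or just an abbreviated hash when no tag is reachable. A minimal sketch of splitting it apart follows. The function name parse_describe and the dictionary keys are hypothetical, and the "looks like hex" check is a heuristic assumption, since a tag made only of hex characters would be mistaken for a hash.

```python
import re

# "<tag>-<commits since tag>-g<abbreviated hash>", e.g. "v1.2.0-3-g1a2b3c4"
_DESCRIBE_RE = re.compile(r"^(?P<tag>.+)-(?P<count>\d+)-g(?P<hash>[0-9a-f]+)$")


def parse_describe(version: str) -> dict:
    """Split the output of `git describe --tags --always` into its parts."""
    m = _DESCRIBE_RE.match(version)
    if m:
        return {
            "tag": m["tag"],
            "commits_since_tag": int(m["count"]),
            "hash": m["hash"],
        }
    if re.fullmatch(r"[0-9a-f]{4,64}", version):
        # Heuristic: a bare hex string is taken to be the --always fallback.
        return {"tag": None, "commits_since_tag": None, "hash": version}
    # Otherwise HEAD is exactly on a tag, so no suffix was appended.
    return {"tag": version, "commits_since_tag": 0, "hash": None}


if __name__ == "__main__":
    print(parse_describe("v1.2.0-3-g1a2b3c4"))
```

The greedy tag group means tags that themselves contain hyphens, such as release-2024, are still kept whole.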

Create a short hash function

If the full 40-character hash is more than you need, wrap the short form in a function:

    import subprocess

    def get_git_short_hash():
        # --short asks git for an abbreviated, still unique, hash
        return subprocess.check_output(['git', 'rev-parse', '--short', 'HEAD']).decode('ascii').strip()

    print(get_git_short_hash())

The short hash is easier to read in version strings and log output.

Adapting to dynamic environments

When a script may be launched from outside the repository, point git at the right directory before asking for the hash. One way is to change the working directory first:

    import os
    import subprocess

    repo_path = '/path/to/your/repo'
    # git resolves the repository from the current working directory
    os.chdir(repo_path)
    current_hash = subprocess.check_output(['git', 'rev-parse', 'HEAD']).decode('ascii').strip()
    print(current_hash)

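Git resolves the repository from the working directory, so if the script runs somewhere else it will report the hash of whichever repository it happens to be in, or fail outright. Note that os.chdir changes the working directory for the whole process, which also affects any relative paths used later in the script.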
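A sketch of an alternative that leaves the process's working directory alone: subprocess accepts a cwd argument, so git can be run inside the repository without calling os.chdir. The function name get_git_hash is illustrative.

```python
import subprocess


def get_git_hash(repo_path):
    """Return the full hash of HEAD for the repository at repo_path (str or Path)."""
    return subprocess.check_output(
        ["git", "rev-parse", "HEAD"],
        cwd=repo_path,  # run git inside the repo; our own cwd is untouched
        text=True,      # decode stdout to str instead of bytes
    ).strip()
```

To find the repository relative to the script rather than hard-coding a path, something like get_git_hash(Path(__file__).resolve().parent) should work, assuming the script lives inside the repository. Passing -C with a path to git itself is another way to get the same effect.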

Deeper dive and best practices

Consistency in hash retrieval is key

Retrieve the hash the same way everywhere (same command, same full or short form), so that the values you store in version strings, databases, and logs can be compared directly.

Stay on track with dynamic code versions

Adding the git hash to output files or objects links each result to the exact version of the code that produced it, an efficient way to keep track of which code generated which output:

    # get_git_short_hash() is the helper defined earlier
    output_version = get_git_short_hash()
    # Use output_version in file names or metadata
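One way to use the hash in a file name is to insert it before the extension. This is a minimal sketch: stamped_path is a hypothetical helper, and the hyphen separator is an arbitrary choice.

```python
from pathlib import Path


def stamped_path(path, commit_hash: str) -> Path:
    """Insert commit_hash before the file extension of path."""
    p = Path(path)
    return p.with_name(f"{p.stem}-{commit_hash}{p.suffix}")


if __name__ == "__main__":
    print(stamped_path("out/results.csv", "1a2b3c4"))
```

Only the last suffix is treated as the extension, so a name like archive.tar.gz would become archive.tar-<hash>.gz.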

Handling execution and errors

Proper error handling is essential when running git commands, preventing script crash-landings or mysterious error messages:

    import subprocess

    try:
        # Merge stderr into stdout so git's error message is captured
        commit_hash = subprocess.check_output(['git', 'rev-parse', 'HEAD'], stderr=subprocess.STDOUT).decode().strip()
    except subprocess.CalledProcessError as e:
        # git exited with a non-zero status, e.g. not inside a repository
        print(f"Failed to retrieve commit hash: {e.output.decode().strip()}")
        commit_hash = None

These checks capture and relay any errors, facilitating a smooth debugging experience.
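CalledProcessError covers the case where git runs and exits with an error. If the git executable is not installed or not on PATH, subprocess raises FileNotFoundError instead. A hedged sketch that handles both follows. The function name try_git_hash is illustrative, and printing the message is just one option (a logger would work equally well).

```python
import subprocess
from typing import Optional


def try_git_hash() -> Optional[str]:
    """Return the hash of HEAD, or None if git is missing or reports an error."""
    try:
        return subprocess.check_output(
            ["git", "rev-parse", "HEAD"],
            stderr=subprocess.STDOUT,  # capture git's error message as well
            text=True,                 # work with str rather than bytes
        ).strip()
    except FileNotFoundError:
        # The git executable could not be found.
        print("Failed to retrieve commit hash: git executable not found")
    except subprocess.CalledProcessError as e:
        # git ran but exited with a non-zero status.
        print(f"Failed to retrieve commit hash: {e.output.strip()}")
    return None
```

Callers can then treat None as "version unknown" and carry on, rather than crashing in environments such as a deployed package without a .git directory.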