Hello, all.

After a few days of thinking, discovering and working, here it is: the
first working draft of the new git eclass, codenamed 'git-r3'.

First of all, the name is not final. I'm open to ideas. I'm open to
naming it 'git-r1' to put it in line with my other -r1 eclasses :).
I'd definitely like to avoid 'git-3' though, since that version-like
naming was a mistake, as the almost-'python-2' eclass has shown.

Secondly, it's not even final that there will be a new eclass. Most
likely I will commit it as a new eclass since that way is easier for
us, but if you prefer I may try to make it more API-compatible with
git-2 and work on making it an almost-drop-in replacement. After all,
the internals have changed much more than the API.

And now for the major changes:

1. The code has been split into clean 'fetch' and 'checkout' pieces.

That is, it is suited for distinct src_fetch() and src_unpack() phases
that we'll hopefully have in EAPI 6. What's important, the checkout
code does not rely on passing *any* environment variables from the
fetching code. It is also made with concurrency in mind, so multiple
ebuilds using the same repository at the same time shouldn't be
a problem.

2. Public fetch/checkout API.

git-2 has a lot of private functions and just src_unpack(). git-r3 has
git-r3_fetch() and git-r3_checkout(), which are public API intended to
be used in ebuilds that need more than plain fetch+unpack. While this
isn't exactly what multi-repo support pursuers wanted, it should make
supporting multiple repos in one ebuild much cleaner.

3. Clean submodule support with bare clones.

Since the submodules are very straightforward in design, I have
decided to move their support into the eclass directly. As a result,
the new eclass cleanly supports submodules, treating them as
additional repositories and doing submodule fetch/checkout
recursively.

There is no need for non-bare clones anymore (and therefore their
support has been removed to make the code simpler), and submodules
work fine with EVCS_OFFLINE=1.

4. 'Best-effort' shallow clones support.

I did my best to support shallow clones in the eclass. The code is
specifically designed to handle them whenever possible. However,
shallow clones have a few limitations:

a) only branch/tag-based fetches support shallow clones. Fetching by
commit id forces a complete clone (this is what submodules do BTW).

b) there's an EGIT_NONSHALLOW option for users who prefer to have full
clones, and possibly for ebuilds that fail with shallow clones.

c) if shallow clones cause even more trouble than that, I will simply
remove their support from the eclass :).

[see notes about testing at the end]

5. Safer default EGIT_DIR choice. EGIT_PROJECT removed.

Since submodules are cloned as separate repositories as well, we can't
afford having EGIT_PROJECT change the clone dir. Instead, the eclass
uses the full path from the first repo URI (with some preprocessing)
to determine the clone location. This should ensure non-colliding
clones while making it most likely that two ebuilds using the same
repo will share the same clone without any special effort from
the maintainer.

6. Safer default checkout dir. EGIT_SOURCEDIR removed.

git-2 used to default EGIT_SOURCEDIR=${S}. This kinda sucked since if
one wanted to use a subdirectory of the git repo, one needed to set
both EGIT_SOURCEDIR and S. Now, the checkout is done to
${WORKDIR}/${P} by default and ebuilds can safely do
S=${WORKDIR}/${P}/foo.

I may provide EGIT_SOURCEDIR if someone still finds it useful.

API/variables removed:

1. EGIT_SOURCEDIR:

a) if you need it for multiple repos, use the fetch/checkout
functions instead,

b) otherwise, play with S instead,

c) if you really need it, lemme know and I'll put it back.

2. EGIT_HAS_SUBMODULES -> no longer necessary, we autodetect them
(and we don't need as much special magic as we used to).

3. EGIT_OPTIONS -> interfered too much with eclass internals.

4. EGIT_MASTER -> people misused it all the time, and it caused issues
for projects that used a different default branch. Now we just respect
upstream's default branch.

5. EGIT_PROJECT -> should be no longer necessary.

6. EGIT_DIR -> still exported, but a user setting is no longer
respected.

7. EGIT_REPACK, EGIT_PRUNE -> I will probably reintroduce them, or
just provide the ability to set git auto-cleanup options.

8. EGIT_NONBARE -> only bare clones are supported now.

9. EGIT_NOUNPACK -> git-2 is the only eclass calling the default. Does
anyone actually need this? Is adding a custom src_unpack() that hard?

10. EGIT_BOOTSTRAP -> this really belongs in *your* src_prepare().

I've tested the eclass on 113 live packages I'm using. Most of them
work just fine (I've replaced git-2 with the new eclass). Some of them
only require removing the old variables, some need S changed. However,
I noticed the following issues as well:

1. code.google fails with 500 when trying to do a shallow clone
(probably they implemented their own dumb git server),

2. sys-apps/portage wants to play with 'git log' for ChangeLogs.
That's something that definitely is not going to work with shallow
clones ;). Not that I understand why someone would like a few megs of
detailed git backlog.

3. sys-fs/bedup's btrfs-progs submodule says the given commit id is
'not a valid branch point'. Need to investigate what this means.

4. 'git fetch --depth 1' seems to be refetching stuff even when
nothing changed. Need to investigate it. It may be enough to do
an additional 'did anything change?' check.

I will try to look into those issues tomorrow. In the meantime, please
review this eclass and give me your thoughts. Especially if someone
has some more insight on shallow clones. Thanks.

And a fun fact: the LLVM subversion checkout (in svn-src) has around
2.4k files which consume around 220M on btrfs. The LLVM git shallow
clone takes 17M.

-- 
Best regards,
Michał Górny

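To make the 'preprocessing' from major change 5 concrete, here is a minimal standalone sketch of how a repo URI could be turned into a clone directory name. It follows the logic of _git-r3_set_gitdir in the attached eclass, but store_name_for_uri is a hypothetical helper name, the example.org URIs are made up, and tr stands in for the eclass's bash pattern substitution so the sketch also runs under plain sh.

```bash
# Hypothetical helper (not part of the eclass): print the clone
# directory name that would be derived from a single repo URI.
store_name_for_uri() {
	local name
	# drop "scheme://host/" and keep only the path part
	name=${1#*://*/}

	# strip one common prefix so that different URIs of the same
	# repository are more likely to end up with the same name
	case ${name} in
		cgit/*) name=${name#cgit/} ;;
		git/*) name=${name#git/} ;;
		gitroot/*) name=${name#gitroot/} ;;
		p/*) name=${name#p/} ;;
		pub/scm/*) name=${name#pub/scm/} ;;
	esac

	# ensure exactly one .git suffix
	name=${name%.git}.git

	# flatten the path into a single directory name
	printf '%s\n' "${name}" | tr '/' '_'
}

store_name_for_uri "git://example.org/foo/bar.git"    # prints foo_bar.git
store_name_for_uri "https://example.org/git/foo/bar"  # prints foo_bar.git
```

Both example URIs map to the same name, which is what lets two ebuilds pointing at the same repository share one clone.
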
# Copyright 1999-2013 Gentoo Foundation
# Distributed under the terms of the GNU General Public License v2
# $Header: $

# @ECLASS: git-r3.eclass
# @MAINTAINER:
# Michał Górny <mgo...@gentoo.org>
# @BLURB: Eclass for fetching and unpacking git repositories.
# @DESCRIPTION:
# Third generation eclass for easing maintenance of live ebuilds using
# git as remote repository. Eclass supports lightweight (shallow)
# clones, local object deduplication and submodules.

case "${EAPI:-0}" in
	0|1|2|3|4|5)
		;;
	*)
		die "Unsupported EAPI=${EAPI} (unknown) for ${ECLASS}"
		;;
esac

if [[ ! ${_GIT_R3} ]]; then

inherit eutils

fi

EXPORT_FUNCTIONS src_unpack

if [[ ! ${_GIT_R3} ]]; then

# @ECLASS-VARIABLE: EGIT_STORE_DIR
# @DESCRIPTION:
# Storage directory for git sources.
#
# EGIT_STORE_DIR=${DISTDIR}/git3-src

# @ECLASS-VARIABLE: EGIT_REPO_URI
# @REQUIRED
# @DESCRIPTION:
# URIs to the repository, e.g. git://foo, https://foo. If multiple URIs
# are provided, the eclass will consider them as fallback URIs to try
# if the first URI does not work.
#
# It can be overridden via env using ${PN}_LIVE_REPO variable.
#
# Example:
# @CODE
# EGIT_REPO_URI="git://a/b.git https://c/d.git"
# @CODE

# @ECLASS-VARIABLE: EVCS_OFFLINE
# @DEFAULT_UNSET
# @DESCRIPTION:
# If non-empty, this variable prevents any online operations.

# @ECLASS-VARIABLE: EGIT_BRANCH
# @DEFAULT_UNSET
# @DESCRIPTION:
# The branch name to check out. If unset, the upstream default (HEAD)
# will be used.
#
# It can be overridden via env using ${PN}_LIVE_BRANCH variable.

# @ECLASS-VARIABLE: EGIT_COMMIT
# @DEFAULT_UNSET
# @DESCRIPTION:
# The tag name or commit identifier to check out. If unset, newest
# commit from the branch will be used. If set, EGIT_BRANCH will
# be ignored.
#
# It can be overridden via env using ${PN}_LIVE_COMMIT variable.

# @ECLASS-VARIABLE: EGIT_NONSHALLOW
# @DEFAULT_UNSET
# @DESCRIPTION:
# Disable performing shallow fetches/clones. Shallow clones have
# a fair number of limitations. Therefore, if you'd like the eclass to
# perform complete clones instead, set this to a non-null value.
#
# This variable is to be set in make.conf. Ebuilds are not allowed
# to set it.

# @FUNCTION: _git-r3_env_setup
# @INTERNAL
# @DESCRIPTION:
# Set the eclass variables as necessary for operation. This can involve
# setting EGIT_* to defaults or ${PN}_LIVE_* variables.
_git-r3_env_setup() {
	debug-print-function ${FUNCNAME} "$@"

	local esc_pn livevar
	esc_pn=${PN//[-+]/_}

	livevar=${esc_pn}_LIVE_REPO
	EGIT_REPO_URI=${!livevar:-${EGIT_REPO_URI}}
	[[ ${EGIT_REPO_URI} ]] \
		|| die "EGIT_REPO_URI must be set to a non-empty value"
	[[ ${!livevar} ]] \
		&& ewarn "Using ${livevar}, no support will be provided"

	livevar=${esc_pn}_LIVE_BRANCH
	EGIT_BRANCH=${!livevar:-${EGIT_BRANCH}}
	[[ ${!livevar} ]] \
		&& ewarn "Using ${livevar}, no support will be provided"

	livevar=${esc_pn}_LIVE_COMMIT
	EGIT_COMMIT=${!livevar:-${EGIT_COMMIT}}
	[[ ${!livevar} ]] \
		&& ewarn "Using ${livevar}, no support will be provided"

	# git-2 unsupported cruft
	local v
	for v in EGIT_{SOURCEDIR,MASTER,HAS_SUBMODULES,PROJECT} \
		EGIT_{NOUNPACK,BOOTSTRAP}
	do
		[[ ${!v} ]] && die "${v} is not supported."
	done
}

# @FUNCTION: _git-r3_set_gitdir
# @USAGE: <repo-uri>
# @INTERNAL
# @DESCRIPTION:
# Obtain the local repository path and set it as GIT_DIR. Creates
# a new repository if necessary.
#
# <repo-uri> may be used to compose the path. It should therefore be
# a canonical URI to the repository.
_git-r3_set_gitdir() {
	debug-print-function ${FUNCNAME} "$@"

	local repo_name=${1#*://*/}

	# strip common prefixes to make paths more likely to match
	# e.g. git://X/Y.git vs https://X/git/Y.git
	# (but just one of the prefixes)
	case "${repo_name}" in
		# cgit can proxy requests to git
		cgit/*) repo_name=${repo_name#cgit/};;
		# pretty common
		git/*) repo_name=${repo_name#git/};;
		# gentoo.org
		gitroot/*) repo_name=${repo_name#gitroot/};;
		# google code, sourceforge
		p/*) repo_name=${repo_name#p/};;
		# kernel.org
		pub/scm/*) repo_name=${repo_name#pub/scm/};;
	esac
	# ensure a .git suffix, same reason
	repo_name=${repo_name%.git}.git
	# now replace all the slashes
	repo_name=${repo_name//\//_}

	local distdir=${PORTAGE_ACTUAL_DISTDIR:-${DISTDIR}}
	: ${EGIT_STORE_DIR:=${distdir}/git3-src}

	GIT_DIR=${EGIT_STORE_DIR}/${repo_name}

	if [[ ! -d ${EGIT_STORE_DIR} ]]; then
		(
			addwrite /
			mkdir -m0755 -p "${EGIT_STORE_DIR}"
		) || die "Unable to create ${EGIT_STORE_DIR}"
	fi

	addwrite "${EGIT_STORE_DIR}"
	if [[ ! -d ${GIT_DIR} ]]; then
		mkdir "${GIT_DIR}" || die
		git init --bare || die
	fi
}

# @FUNCTION: _git-r3_set_submodules
# @USAGE: <file-contents>
# @INTERNAL
# @DESCRIPTION:
# Parse .gitmodules contents passed as <file-contents>
# (as in "$(cat .gitmodules)"). Composes a 'submodules' array that
# contains in order (name, URL, path) for each submodule.
_git-r3_set_submodules() {
	debug-print-function ${FUNCNAME} "$@"

	local data=${1}

	# ( name url path ... )
	submodules=()

	local l
	while read l; do
		# submodule.<path>.path=<path>
		# submodule.<path>.url=<url>
		[[ ${l} == submodule.*.url=* ]] || continue

		l=${l#submodule.}
		local subname=${l%%.url=*}

		submodules+=(
			"${subname}"
			"$(echo "${data}" | git config -f /dev/fd/0 \
				submodule."${subname}".url)"
			"$(echo "${data}" | git config -f /dev/fd/0 \
				submodule."${subname}".path)"
		)
	done < <(echo "${data}" | git config -f /dev/fd/0 -l)
}

# @FUNCTION: git-r3_fetch
# @USAGE: <repo-uri> <remote-ref> <local-id>
# @DESCRIPTION:
# Fetch new commits to the local clone of repository. <repo-uri> follows
# the syntax of EGIT_REPO_URI and may list multiple (fallback) URIs.
# <remote-ref> specifies the remote ref to fetch (branch, tag
# or commit). <local-id> specifies an identifier that needs to uniquely
# identify the fetch operation in case multiple parallel merges used
# the git repo. <local-id> usually involves using CATEGORY, PN and SLOT.
#
# The fetch operation will only affect the local storage. It will not
# touch the working copy. If the repository contains submodules, they
# will be fetched recursively as well.
git-r3_fetch() {
	debug-print-function ${FUNCNAME} "$@"

	local repos=( ${1} )
	local remote_ref=${2}
	local local_id=${3}
	local local_ref=refs/heads/${local_id}/__main__

	local -x GIT_DIR
	_git-r3_set_gitdir ${repos[0]}

	# try to fetch from the remote
	local r success
	for r in ${repos[@]}; do
		einfo "Fetching ${remote_ref} from ${r} ..."

		# first, try ls-remote to see if ${remote_ref} is a real ref
		# and not a commit id. if it succeeds, we can pass ${remote_ref}
		# to 'fetch'. otherwise, we will just fetch everything

		# split on whitespace
		local ref=(
			$(git ls-remote "${r}" "${remote_ref}")
		)

		local ref_param=()
		if [[ ${ref[0]} ]]; then
			[[ ${EGIT_NONSHALLOW} ]] || ref_param+=( --depth 1 )
			ref_param+=( "${remote_ref}" )
		fi

		# if ${remote_ref} is branch or tag, ${ref[@]} will contain
		# the respective commit id. otherwise, it will be an empty
		# array, so the following won't evaluate to a parameter.
		if git fetch --no-tags "${r}" "${ref_param[@]}"; then
			if ! git branch -f "${local_id}/__main__" "${ref[0]:-${remote_ref}}"
			then
				die "Creating branch failed (${remote_ref} invalid?)"
			fi

			success=1
			break
		fi
	done
	[[ ${success} ]] || die "Unable to fetch from any of EGIT_REPO_URI"

	# recursively fetch submodules
	if git cat-file -e "${local_ref}":.gitmodules &>/dev/null; then
		local submodules
		_git-r3_set_submodules \
			"$(git cat-file -p "${local_ref}":.gitmodules || die)"

		while [[ ${submodules[@]} ]]; do
			local subname=${submodules[0]}
			local url=${submodules[1]}
			local path=${submodules[2]}
			local commit=$(git rev-parse "${local_ref}:${path}")

			if [[ ! ${commit} ]]; then
				die "Unable to get commit id for submodule ${subname}"
			fi

			git-r3_fetch "${url}" "${commit}" "${local_id}/${subname}"

			submodules=( "${submodules[@]:3}" ) # shift
		done
	fi
}

# @FUNCTION: git-r3_checkout
# @USAGE: <repo-uri> <local-id> <path>
# @DESCRIPTION:
# Check the previously fetched commit out to <path> (usually
# ${WORKDIR}/${P}). <repo-uri> follows the syntax of EGIT_REPO_URI
# and will be used to re-construct the local storage path. <local-id>
# is the unique identifier used for the fetch operation and will
# be used to obtain the proper commit.
#
# If the repository contains submodules, they will be checked out
# recursively as well.
git-r3_checkout() {
	debug-print-function ${FUNCNAME} "$@"

	local repos=( ${1} )
	local local_id=${2}
	local out_dir=${3}

	local -x GIT_DIR GIT_WORK_TREE
	_git-r3_set_gitdir ${repos[0]}
	GIT_WORK_TREE=${out_dir}

	einfo "Checking out ${repos[0]} to ${out_dir} ..."

	mkdir -p "${GIT_WORK_TREE}"
	git checkout -f "${local_id}"/__main__ || die

	# diff against previous revision (if any)
	local new_commit_id=$(git rev-parse --verify "${local_id}"/__main__)
	local old_commit_id=$(
		git rev-parse --verify "${local_id}"/__old__ 2>/dev/null
	)

	if [[ ! ${old_commit_id} ]]; then
		echo "GIT NEW branch -->"
		echo " repository: ${repos[0]}"
		echo " at the commit: ${new_commit_id}"
	else
		echo "GIT update -->"
		echo " repository: ${repos[0]}"
		# write out message based on the revisions
		if [[ "${old_commit_id}" != "${new_commit_id}" ]]; then
			echo " updating from commit: ${old_commit_id}"
			echo " to commit: ${new_commit_id}"
		else
			echo " at the commit: ${new_commit_id}"
		fi
	fi
	git branch -f "${local_id}"/{__old__,__main__} || die

	# recursively checkout submodules
	if [[ -f ${GIT_WORK_TREE}/.gitmodules ]]; then
		local submodules
		_git-r3_set_submodules \
			"$(cat "${GIT_WORK_TREE}"/.gitmodules)"

		while [[ ${submodules[@]} ]]; do
			local subname=${submodules[0]}
			local url=${submodules[1]}
			local path=${submodules[2]}

			git-r3_checkout "${url}" "${local_id}/${subname}" \
				"${GIT_WORK_TREE}/${path}"

			submodules=( "${submodules[@]:3}" ) # shift
		done
	fi

	# keep this *after* submodules
	export EGIT_DIR=${GIT_DIR}
	export EGIT_VERSION=${new_commit_id}
}

git-r3_src_fetch() {
	debug-print-function ${FUNCNAME} "$@"

	[[ ${EVCS_OFFLINE} ]] && return

	_git-r3_env_setup
	local branch=${EGIT_BRANCH:+refs/heads/${EGIT_BRANCH}}
	git-r3_fetch "${EGIT_REPO_URI}" \
		"${EGIT_COMMIT:-${branch:-HEAD}}" \
		${CATEGORY}/${PN}/${SLOT}
}

git-r3_src_unpack() {
	debug-print-function ${FUNCNAME} "$@"

	_git-r3_env_setup
	git-r3_src_fetch
	git-r3_checkout "${EGIT_REPO_URI}" \
		${CATEGORY}/${PN}/${SLOT} \
		"${WORKDIR}/${P}"
}

_GIT_R3=1
fi
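
The shallow-versus-full decision inside git-r3_fetch can be isolated into a small standalone sketch. fetch_args_for is a hypothetical helper, not part of the eclass. It takes the commit id that 'git ls-remote' reported (empty when the requested ref is really a commit id), the remote ref, and the EGIT_NONSHALLOW value, and prints the arguments that would follow the URI in 'git fetch'. It builds a plain string rather than the array the eclass uses, a simplification that assumes refs without whitespace.

```bash
# Hypothetical helper (not part of the eclass).
#   $1: commit id from 'git ls-remote', empty if the ref was not found
#   $2: remote ref (branch, tag or commit id)
#   $3: non-empty to force complete clones (as with EGIT_NONSHALLOW)
fetch_args_for() {
	local remote_commit="$1" remote_ref="$2" nonshallow="$3"
	local args=""

	if [ -n "${remote_commit}" ]; then
		# a real branch or tag: fetch only that ref,
		# and keep it shallow unless that was disabled
		[ -n "${nonshallow}" ] || args="--depth 1 "
		args="${args}${remote_ref}"
	fi

	# for a commit id, args stays empty, so everything gets fetched
	printf '%s\n' "${args}"
}

fetch_args_for 0123abc refs/heads/master ""   # prints --depth 1 refs/heads/master
fetch_args_for 0123abc refs/heads/master 1    # prints refs/heads/master
```

With an empty first argument the function prints an empty line, which corresponds to the full fetch that commit-id requests (and therefore submodules) fall back to.
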