Development/One Git Conversion

Introduction
Currently we have 20 git repositories: bootstrap + artwork base calc components extensions extras filters help impress libs-core libs-extern libs-extern-sys libs-gui postprocess sdk testing translations ure writer

For reference, the size of the .git of each of these repositories is:

I concentrate on the size of the .git directory because that is the size that really matters for the performance of most git operations, and especially for a git clone from a remote location.
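For anyone who wants to reproduce such numbers locally, here is a minimal sketch. The function name git_dir_sizes is made up for this page, and it assumes every repository is a plain (non-bare) clone sitting directly under one parent directory.

```shell
# Print the size in kilobytes of the .git directory of every repository
# found directly under the given parent directory (default: current dir),
# biggest first.
# Assumption: each repository is a non-bare clone with its own .git directory.
git_dir_sizes() {
    parent=${1:-.}
    for gitdir in "$parent"/*/.git; do
        # Skip the unexpanded pattern and anything that is not a directory
        [ -d "$gitdir" ] || continue
        size_kb=$(du -sk "$gitdir" | cut -f1)
        repo=$(basename "$(dirname "$gitdir")")
        printf '%s\t%s\n' "$size_kb" "$repo"
    done | sort -rn
}
```

Calling git_dir_sizes with the directory that holds the clones prints one "size, name" line per repository.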

The general idea is to consolidate some of these repositories together using git fetch. This technique has the merit of relative simplicity, but the drawback is that the resulting history, although theoretically complete, is fairly unusable. That is, it would be very cumbersome to check out a point in time prior to the actual fusion and get a complete buildable tree. The core issue is that the 'true' history is in fact represented by 20 parallel histories. To mitigate that, the tags of each imported repository will be renamed by prefixing the tag name with the name of the repository. That way we avoid name conflicts, and the different tags for the same 'history' level in each repository remain accessible. Nevertheless, you can only check out one of these tags at a time, i.e. essentially only one old repository at a time.
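To make the idea concrete, here is a rough sketch of the two steps for a single repository. It is not taken from onegit.sh: the function names, the "repo_tag" naming scheme and the final merge are assumptions made for illustration.

```shell
# Rename every tag of an old repository by prefixing the repository name.
# Assumption: "<prefix>_<tag>" as naming scheme; this page only says that the
# repository name is put in front of the tag name.
prefix_tags() {
    repo_dir=$1
    prefix=$2
    (
        cd "$repo_dir" || exit 1
        for tag in $(git tag -l); do
            # Create the new name on the same object, then drop the old name
            git tag "${prefix}_${tag}" "$tag" &&
            git tag -d "$tag" >/dev/null
        done
    )
}

# Import one (already renamed) repository into the consolidated one.
# repo_dir must be an absolute path because we change directory first.
import_repo() {
    core_dir=$1
    repo_dir=$2
    (
        cd "$core_dir" || exit 1
        # First bring over the prefixed tags, then the checked-out branch,
        # so that FETCH_HEAD ends up naming only that branch.
        git fetch -q "$repo_dir" 'refs/tags/*:refs/tags/*' &&
        git fetch -q "$repo_dir" HEAD &&
        # Assumption: the histories are joined with a plain merge. The flag
        # below is needed with git 2.9 or later; older versions reject it.
        git merge -q --allow-unrelated-histories \
            -m "import $(basename "$repo_dir")" FETCH_HEAD
    )
}
```

After prefix_tags /path/to/writer writer and import_repo /path/to/core /path/to/writer, the core repository holds both histories side by side, and the old writer tags are reachable under their prefixed names.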

Per repository analysis
We will review each repository and discuss what the migration will entail for it. The main guiding principles are: which group is the primary maintainer of the repository, how independent is that repository, what is the rate of change on that repository, and how big is that repository.

bootstrap
bootstrap will be used as the anchor for the whole process. It is left as is and becomes, after all processing, our new 'core' repository.

base calc components extensions extras impress libs-core libs-gui postprocess sdk testing ure writer
These are bread-and-butter code repositories. They will be merged into the new 'core' repository. The only processing step is to rename the tags to avoid name conflicts.

The current version of the onegit.sh script combines all these repositories into the core repository.

artwork
Artwork contains mostly binary objects. Its content is mostly the domain of the graphic design team; on the dev side, changes consist mostly of renaming things and moving things around.

The rate of change is fairly low (46 commits since the beginning of the year). The size of the repository is moderate (with respect to the other repositories).

For simplicity's sake, it would not be too costly to consolidate artwork into core.

artwork will be merged.

help
This repository contains scripts, metadata and data needed to generate the help files. Technically this is an optional repository, and it is mostly managed by the documentation and translation teams. Today it does depend on libs-gui, for instance, to be built. But there are efforts under way to make that repository even more independent.

help will not be merged.

filters
The particularity of filters is that it hosts the binfilter module. Since binfilter is deprecated and will eventually be dropped, we will take advantage of this intrusive reorganization of the git repositories to extract the binfilter module out of the filters repository. This saves about 60% of the size of filters.

The resulting 'lean' filters will have its tags renamed and will be combined with core. The new binfilter git repository will be placed in clone/binfilter and managed the same way auxiliary git repositories are managed today, with the added benefit of being completely optional.
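One possible way to do such a split is git filter-branch; this page does not say which tool onegit.sh actually uses, so take the following as a sketch only. The function name split_module is made up, and the sketch assumes the extracted module ends up at the top level of the new repository, which may not match the final layout.

```shell
# Split one module directory out of a repository.
#   $1 = source repository (rewritten in place, module removed)
#   $2 = module directory name inside the source repository
#   $3 = path of the new repository that will hold only the module
# Assumption: the module's content becomes the root of the new repository.
split_module() {
    src=$1
    module=$2
    dest=$3
    git clone -q --no-hardlinks "$src" "$dest" &&
    (
        # New repository: keep only the history touching the module
        cd "$dest" || exit 1
        FILTER_BRANCH_SQUELCH_WARNING=1 git filter-branch -f --prune-empty \
            --subdirectory-filter "$module" -- --all >/dev/null
    ) &&
    (
        # Source repository: remove the module from every commit
        cd "$src" || exit 1
        FILTER_BRANCH_SQUELCH_WARNING=1 git filter-branch -f --prune-empty \
            --index-filter "git rm -r -q --cached --ignore-unmatch '$module'" \
            -- --all >/dev/null
    )
}
```

Note that filter-branch keeps the old history under refs/original, so the size reduction only shows up once those refs are removed and the repository is garbage-collected or freshly cloned.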

libs-extern
libs-extern is essentially a wrapper to patch and build external libraries. It is a fairly big repository, but in fact a substantial part of the size is due to the fact that once upon a time the tar.gz files of these external libraries were dumped into the libs-extern repository. We do not do that anymore, but the history still reflects it. Once again, since we are doing an intrusive reorganization anyway, we may as well take advantage of it to do some clean-up and put that repository on a diet.

libs-extern will undergo a git-filter to remove historical */download/* instances and to rename the tags. It will then be combined with core.
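Assuming 'git-filter' here means git filter-branch (that is an assumption, as are the function names), removing the historical */download/* entries could look roughly like this, together with a small check that no branch or tag still mentions such a path.

```shell
# Drop every path matching */download/* from all commits of a repository.
# Commits that become empty after the removal are pruned.
purge_downloads() {
    (
        cd "$1" || exit 1
        FILTER_BRANCH_SQUELCH_WARNING=1 git filter-branch -f --prune-empty \
            --index-filter "git rm -r -q --cached --ignore-unmatch '*/download/*'" \
            -- --all >/dev/null
    )
}

# Succeeds if any commit reachable from a branch or a tag touches a path
# containing /download/. The refs/original backup refs left behind by
# filter-branch are deliberately not inspected.
history_has_downloads() {
    (cd "$1" && git log --branches --tags --name-only --pretty=format:) |
        grep -q '/download/'
}
```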

libs-extern-sys
The case of libs-extern-sys is very similar to that of libs-extern, with the added twist that a big part of this repository is used by the 'dictionaries' module. That module is fairly independent of the mainline development.

libs-extern-sys as it stands is a pretty big repository, in fact our biggest, but without dictionaries and with a clean-up of the old historical tar.gz files its size dwindles to 2MB.

The dictionaries module will be extracted from libs-extern-sys and spun off as a separate repository.

The remaining libs-extern-sys will undergo a git-filter to remove historical .tar.gz files, the tags will be renamed and the result combined with core.

translations
The translations repository is a big repository, managed quasi-exclusively by the translation/localization team. It is an optional repository in the sense that it is not required to build the product.

Translations is left untouched and will still be optionally present as clone/translations.

other clean-up
Since the migration will result in a discontinuity of our repositories, it might be a good time to sneak in intrusive clean-ups that require a complete rewrite of the history. The main such proposed clean-ups are the tab/space clean-up (eliminate tabs in sources) and the trailing spaces clean-up.

A custom C program, sandwiched between git fast-export and git fast-import, allows doing that clean-up very efficiently. The overall run-time of the conversion has been measured at about 30 minutes.
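The 'sandwich' can be pictured with the sketch below. The custom C program is not reproduced here; the filter defaults to cat, a pure pass-through, and the function name rewrite_history is made up. A real filter has to keep the fast-export stream consistent (in particular the byte counts of the data blocks it modifies).

```shell
# Rewrite a repository by streaming its whole history through a filter.
#   $1 = source repository
#   $2 = destination repository (created here)
#   $3 = filter command reading a fast-export stream on stdin and writing
#        one on stdout; defaults to 'cat', which changes nothing
rewrite_history() {
    src=$1
    dest=$2
    filter=${3:-cat}
    git init -q "$dest" &&
    (cd "$src" && git fast-export --all) |
        $filter |
        (cd "$dest" && git fast-import --quiet)
}
```

The destination receives all the refs and objects but no checked-out files; a checkout of the wanted branch is still needed afterwards.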

Documentation
The development Wiki needs to be updated to reflect the changes, notably:

Development

Git For LibreOffice Developers

onegit.sh
The onegit.sh script, found in contrib/dev-tools, implements the steps described above. The script will contain every step to convert from a standard set of git repos, including 'translations', to a 'merge' set of repos that will comprise: a 'core' git repo, a 'binfilter' repo (the extraction of binfilter out of filters), a 'dictionaries' git repo (the extraction of the 'dictionaries' module out of libs-extern-sys), a 'help' repo (identical to the current help repo) and a 'translations' git repo (identical to the current 'translations' git repo).

The script will also apply to the final result a set of patches to make the result buildable (mostly fixes in the ex-bootstrap, in configure.in and in the bin/* scripts to take the layout changes into account, and some hard-coded paths in unit tests that refer to clone/xxx).

preliminary result
After conversion our git repositories look like this:

Note that binfilter, dictionaries and translations are 'optional' repositories. They are only needed for the build if the underlying option is activated in autogen.

preamble
The migration is intrusive. Technically, cherry-picking from the existing layout should still be possible, but it should be done in a dedicated set-up, because fetching the existing git repositories into the migrated one in order to cherry-pick will make it grow a lot... It is probably easier to run git format-patch on the commits to cherry-pick and then apply them as patches...

Still, in order to limit the number of such patches, it is preferable to do the migration when 3.4 has stabilized... so after 3.4.2... or, based on the current schedule, sometime in August.

Furthermore, during the actual migration, master on the original repos needs to be shut down somehow. We need to plan for this and make sure that everybody with commit access is aware of the plan and takes the appropriate steps to flush what needs to be flushed out of their local tree.

The migration itself, with the steps described above, takes a couple of hours of processing. So all in all we should be able to do the whole thing, end-to-end, in less than a day.

The result of the conversion should be a libreoffice/core repo plus the additional libreoffice/binfilter and libreoffice/dictionaries repos; libreoffice/help and libreoffice/translations remain untouched.

The build instructions will remain mostly unchanged (except to use 'core' instead of 'bootstrap').

A full dress rehearsal will be conducted to iron out the kinks and, more importantly, to give the community a chance to check the proposed result and detect possible issues. This will also give us a realistic timeline to plan the actual migration and the downtime associated with it.

planning
In case of abort, the freeze on the old-repo master branches will be lifted and the migration rescheduled at a later date.

locking of master
Locking master will be achieved by installing the following change in the update hook on freedesktop.org:

diff --git a/update.hook.old b/update.hook.new
index 92a9939..40ffc4a 100644
--- a/update.hook.old
+++ b/update.hook.new
@@ -14,6 +14,15 @@ recipients="libreoffice-commits@lists.freedesktop.org"
 
 ref_type=$(git cat-file -t "$3")
 
+# One Git Convertsion: make the master branch read-only
+if [ "$1" = "refs/heads/master" ] ; then
+   echo "*** This repository has been migrated to core.git," >&2
+   echo "*** the 'master' branch is now read-only." >&2
+   echo "*** see https://wiki.documentfoundation.org/Development/One_Git_Conversion" >&2
+   echo "*** for more details" >&2
+   exit 1;
+fi
+
 # Only allow annotated tags in a shared repo
 # Remove this code to treat dumb tags the same as everything else
 case "$1","$ref_type" in
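To convince yourself of what this guard does and does not block, the test can be lifted into a small function and tried by hand. This is only a sketch: check_update is a made-up name, and the branch and tag names used with it are hypothetical. An update hook is called with three arguments: the ref name, the old object name and the new object name; only the first one matters here.

```shell
# Same comparison as in the hook change: refuse updates of master only.
#   $1 = name of the ref being updated (the other hook arguments are unused)
check_update() {
    refname=$1
    if [ "$refname" = "refs/heads/master" ]; then
        echo "*** the 'master' branch is now read-only." >&2
        return 1
    fi
    return 0
}
```

check_update refs/heads/master fails, while check_update refs/heads/feature/foo or check_update refs/tags/some-tag still succeeds, so only pushes to master are rejected by this part of the hook.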

Managing a feature branch across the conversion
If a feature branch cannot be merged into master before the conversion, the following procedure can be used to transplant it after the conversion.


 * git pull your old set of repos to get the latest level of the old git repos. At that point you should have a FINAL_MASTER tag in all git repos.
 * rebase your feature branch on the FINAL_MASTER tag (that is, rebase it on the HEAD of master, which should be the same commit)
 * generate patches for everything from FINAL_MASTER to the head of your feature branch (./g format-patch --suffix=-@REPO@.patch --output-dir=/home/ /patches/ FINAL_MASTER..HEAD )
 * set up a new migrated repo
 * check out/create your feature branch and apply all the patches you generated. There should not be any conflicts.
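For a single old repository, the last three steps can be sketched as below. The function name transplant_branch is made up, the temporary patch directory is an arbitrary choice, and the sketch assumes the files keep the same relative paths in the migrated repo and that the branch has already been rebased on FINAL_MASTER.

```shell
# Transplant a feature branch from an old repository to the migrated one.
#   $1 = old repository (must contain the FINAL_MASTER tag and the branch)
#   $2 = migrated repository, checked out at the converted master
#   $3 = name of the feature branch
transplant_branch() {
    old_repo=$1
    new_repo=$2
    branch=$3
    patches=$(mktemp -d) &&
    (
        # One patch file per commit between FINAL_MASTER and the branch head
        cd "$old_repo" || exit 1
        git format-patch -q -o "$patches" "FINAL_MASTER..$branch"
    ) &&
    (
        # Re-create the branch in the migrated repo and replay the patches
        cd "$new_repo" || exit 1
        git checkout -q -b "$branch" &&
        git am -q "$patches"/*.patch
    )
}
```

With several old repositories, the same replay is repeated for each set of generated patches.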