Launchpad Buildout
******************

Launchpad is moving to using the buildout_ (or "zc.buildout") build system.
It will gradually replace the sourcecode directory, and hopefully also the
Launchpad Makefile.

Buildout's biggest strength is managing Python packages.  That will also be
our primary focus for it.  Meanwhile, apt will continue to manage our Python
language installation, as well as some or all of our non-Python dependencies,
such as PostgreSQL.  While Bazaar will continue to be an essential part of our
toolchain, we will move away from using it to incorporate source code trees of
our dependencies.

Even if you are not interested in our `Motivations`_ or in an
`Introduction to zc.buildout`_, you will at least want to read the very brief
sections about the `Set Up`_ and the `Everyday Usage`_.

Developers who manage source dependencies probably should read the general
information about `Managing Dependencies and Scripts`_, but will also find
detailed instructions to `Add a Package`_, to `Upgrade a Package`_, to
`Add a Script`_, to `Add a File Modified By Buildout`_, and to
`Work with Unreleased or Forked Packages`_.

.. _buildout: http://www.buildout.org/

===========
Motivations
===========

These motivations are labeled as "[INTERNAL]" or "[EXTERNAL]" to indicate
whether each is pertinent for internal dependencies, which we on the Launchpad
team create and release ourselves; or external dependencies, which other
parties, in and out of Canonical, create and release.

* We want more careful specification of our dependencies across branches.
  [INTERNAL] [EXTERNAL]

  This is a real concern pertinent both for our "trunks" (devel, stable,
  db-devel, db-stable) and for our development boxes.

  For instance, now, in our trunks, when we want to update a dependency, we
  need to make sure that *all* the current Launchpad trunks work with the
  dependency initially; then submit a new Launchpad branch that uses the
  changed dependency.
  A mistake can even potentially break one or both of the db-* trunks, since
  PQM only tests against one branch (usually devel), and sourcecode changes
  affect all branches at once.

  For simplicity, speed, and safety, we want to be able to submit a single
  branch that incorporates the source dependencies and the associated changes
  at once.

  This is also true, if less critical and easier to work around, on developer
  boxes.  Without care, changes to sourcecode when working on dependencies
  will affect all a developer's branches, polluting test results with false
  negatives or false positives.

* We want to default to using released versions of our software dependencies.
  [EXTERNAL]

  A significant number of projects do not always have a pristine trunk, and
  many also spend extra effort on polish, bug fixes, and compatibility before
  a release.  If we do not desperately need a new feature on trunk, using a
  release is generally regarded as a safer, better practice.  Our current
  usage of bzr branches of the development trunks does not encourage this
  practice.

* We want to be encouraged to make the effort to interact with upstream
  projects to have our patches integrated.  [EXTERNAL]

  Interacting and negotiating with upstream is undeniably more time-consuming
  than our current practice of maintaining local bzr branches with our
  patches, especially short-term.  But our current practice is not good
  open-source community behavior--an ironic characteristic for a project like
  Launchpad.  It also can cause problems down the road, for instance, if the
  patch becomes stale and we want to migrate to new releases.

* We want to be protected from changes and differences in our operating
  system.  [INTERNAL] [EXTERNAL]

  This is a concern both over time and across different Launchpad
  environments.

  First, our operating system, Ubuntu, is driven by many needs and goals.
  Launchpad is among them, but generally Launchpad serves Ubuntu, not the
  reverse.  For instance, recently Jaunty dropped Launchpad's Python version.
  The Ubuntu developers had good reason--Python 2.4 has not been supported by
  the Python developers for some time--but it caused a significant
  inconvenience to the Launchpad team.  Managing our dependencies,
  particularly the Python library dependencies, can help alleviate these
  problems.

  Second, Launchpad developers run a significantly different version of the
  operating system than that run in production.  Maintaining our Python
  library dependencies ourselves can also help alleviate these concerns.

* We want to be able to easily use standard packages of our primary
  programming language, Python.  [EXTERNAL]

  Our Python library dependencies are distributed for many operating
  systems--Windows, Mac, and other flavors of Linux--in a unified location and
  format: PyPI, using distutils.  Using Python library dependencies in their
  standard distributions makes it easier for us to reuse code.

* We want to be encouraged to release Python packages of our open-source code.
  [INTERNAL]

  We are beginning to realize our aspirations of abstracting and releasing
  some of our code.  Much of that code is in Python.  Consuming standard
  Python packages encourages us to follow good practice in releasing our own
  Python packages.  Dogfooding can help keep us honest, and good official
  releases can help us support and encourage a community of users.

Meanwhile, we want to be able to keep certain aspects of the legacy story.

* We need deployment to not need network access.

* We need to be able to use custom releases of our Python dependencies, if
  absolutely necessary.

* We would like to be able to follow security releases in our operating
  system.

We will be able to support the first two of these aspects entirely.  As to the
third, we will continue to follow operating system security releases for most
or all of the dependencies that are not Python packages--a very similar
situation to the one before Buildout integration.

===========================
Introduction to zc.buildout
===========================

Buildout is a relatively simple system that increases in complexity as it is
extended via "recipes".  It works on top of distutils_ and setuptools_.  It
uses declarative ini-style files to direct its work.  The `Buildout site`_
points to a variety of documents describing and documenting its use.

For Launchpad, Buildout is hidden behind a Makefile as of this writing.  If
the Makefile is removed, developers will typically run ``bootstrap.py`` in the
branch, and then run ``bin/buildout``.  After that, the ``bin`` directory will
have the commands to start, stop, test, and debug the software.

If you are interested in example direct usage of Buildout, you may want to
read `the "Hacking" document in the Launchpad wiki`_ that describes the usage
patterns in ``lazr.*`` packages.

.. _`the "Hacking" document in the Launchpad wiki`: https://dev.launchpad.net/Hacking

Launchpad's Buildout usage is roughly of medium complexity.  It is more
complex than that needed by a package with few dependencies and simple usage
(see lazr.uri, for instance), but less complex than that of other large
applications that use the buildout system.  More complexity can come by
building more non-Python tools and by having multiple configuration
variations, for instance.

The documentation below will focus on using Launchpad's buildout.  See the
links given above for a more thorough general review.

.. _distutils: http://docs.python.org/distutils/index.html

.. _setuptools: http://peak.telecommunity.com/DevCenter/setuptools

.. _`Buildout site`: http://www.buildout.org/

======
Set Up
======

If you use the ``rocketfuel-get`` script, run that, and you will be done.

If you don't, you have just a bit more initial set up.  I'll assume you
maintain a pattern similar to what the ``rocketfuel-*`` scripts use: you have
a local, pristine branch of trunk from which you make your other branches.
You manually update the trunk and rsync sourcecode when necessary.  When you
make a branch, you use ``utilities/link-external-sourcecode``.

Developers that take this approach should do the following, where ``trunk`` is
the trunk branch from which you make local branches.

::

    bzr co lp:lp-source-dependencies download-cache

Then run ``make`` in the trunk.

See the `Everyday Usage: Manual`_ section for further instructions on how to
work without the ``rocketfuel-*`` scripts.

.. _`Everyday Usage: Manual`: Manual_

==============
Everyday Usage
==============

``rocketfuel`` Scripts
======================

If you typically use ``rocketfuel-get``, and you don't change source
dependencies, you should not have any further changes, except that
``bin/test`` has replaced ``test.py``.  ``rocketfuel-branch`` and
``link-external-sourcecode`` will Do the Right Thing.

Manual
======

If you don't use the rocketfuel scripts, you will still use
``link-external-sourcecode`` as before.  When a buildout complains that it
cannot find a version of a dependency, do the following, from within the
branch::

    bzr up download-cache

After this, retry your make (or run ``bin/buildout`` from the branch).

That's it for everyday usage.

=================================
Managing Dependencies and Scripts
=================================

What if you need to change or add dependencies or scripts?  As you might
expect, you need to know a bit more about what's going on, although we can
still keep this at a fairly high level.

First let's talk a little about the anatomy of what we have set up.  To be
clear, much of this is based on our own decisions of what to do.  If you see
something problematic, bring it up with the Foundations team.  Maybe together
we can come up with another approach that meets our needs better.

If you saw the top-level Launchpad directory before we started using Buildout,
you might notice seven new items in the checkout.

``bootstrap.py``
    This is the standard bootstrapping file provided by the Buildout
    distribution.  As of this writing, the Makefile uses this file, and you do
    not have to modify it or call it directly.

    Without a Makefile, the first step of developing a source tree that uses
    Buildout is to run this file in the top level directory with the Python
    executable that you wish to use for the source tree.  It creates a ``bin``
    directory, if it does not already exist, and puts a ``buildout``
    executable there.  The next step is to run ``bin/buildout``, which, unless
    you supply a ``-c`` option to specify a configuration file, will look for
    a buildout.cfg file by default to discover what to do.

    Again, as of this writing, the Makefile uses this file, and it does not
    need to be modified, so you need not concern yourself with it further at
    this time.

``ez_setup.py``
    This is another standard file from another project.  In this case, it is
    the file provided by the setuptools project to install setuptools.  It is
    used by the ``setup.py`` file, described below.  It does not need to be
    modified or called directly.

``setup.py``
    This is the file that uses ``distutils``, extended by ``setuptools``, to
    specify direct dependencies, scripts, and other elements of the local
    source tree.

    Buildout uses it, but the reverse is not true: ``setup.py`` does not know
    about Buildout.  This means that packages that use Buildout for
    development do not have to require it when they are being installed in
    other software as a dependency.

    Describing this file in full is well beyond the scope of this document.
    We will give recipes for modifying it for certain tasks below.  For more
    information beyond these recipes, see the setuptools and distutils
    documentation.

``buildout.cfg``
    This is the default configuration file that ``bin/buildout`` will look to
    for instructions.  Describing it in full is well beyond the scope of this
    document.  However, we will give an overview here.

    Configuration files for Buildout are comprised of sections with key-value
    pairs.  Each key-value pair ends at a new line, unless the subsequent line
    is indented, in which case that line continues the value.  The key and the
    value are separated with an equals sign.

    ::

        foo = bar
        baz = bing
              bah
              boo
        sha = zam

    That example shows three keys, 'foo', 'baz', and 'sha'.  The 'baz' key has
    a string with two new lines (which might be interpreted in one of several
    ways, as defined for that key).

    The ``[buildout]`` section is the starting point for Buildout to determine
    what to do.  It looks for an ``extends`` key to find any additional files
    to merge in; we use this for ``versions.cfg``, discussed below.

    In addition to general configuration and initialization such as this, it
    looks in the ``develop`` key to find source trees to develop as part of
    the buildout.  In the standard Launchpad configuration, we develop only
    Launchpad itself (the current directory, or '.').  This means that the
    local ``setup.py`` will be run.  If you want to develop Launchpad while
    you develop another dependency, you can link another source tree in, and
    specify an additional ``develop`` directory in another line::

        [buildout]
        develop = .
                  lazr_uri_branch

    See `Developing a Dependent Library In Parallel`_ for more on this.

    The other basic key in the ``[buildout]`` section that we'll highlight
    here is ``parts``.  This key identifies the other sections in buildout.cfg
    that will be processed.  A section that is not identified in the
    ``[buildout]`` section's ``parts`` key will usually be ignored (unless
    chosen for another role by another key elsewhere).

    Sections other than ``[buildout]`` that are specified as parts always must
    specify a ``recipe``: an identifier that determines what code should
    process that section.  You'll see a variety of recipes in Launchpad's
    buildout.cfg, including ``z3c.recipe.filetemplate``, ``zc.recipe.egg``,
    and others.

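
    As a rough illustration of the keys described above, a stripped-down
    configuration might look like the following sketch.  The part name
    ``scripts`` and the egg name ``lp`` are placeholders chosen for this
    example; Launchpad's real ``buildout.cfg`` has more parts and options, so
    treat this as an assumption-laden outline rather than a copy of it.

    ::

        [buildout]
        # Merge in the pinned versions kept in versions.cfg.
        extends = versions.cfg
        versions = versions
        # Look for distributions only in the shared download cache.
        download-cache = download-cache
        install-from-cache = true
        # Source trees to develop in place; '.' is the current directory.
        develop = .
        # Only the sections named here are processed as parts.
        parts = scripts

        [scripts]
        # Every part must name the recipe that processes it.
        recipe = zc.recipe.egg
        # Placeholder egg name, for illustration only.
        eggs = lp

    Keeping the pins in a separate file that ``extends`` merges in means the
    main configuration rarely changes when a dependency is upgraded.
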
``versions.cfg``
    As mentioned above, ``buildout.cfg`` extends ``versions.cfg`` by
    specifying it in the ``extends`` key of the ``[buildout]`` section.
    ``versions.cfg`` specifies the precise versions of the dependencies we
    use.  This means that we can have several versions of a dependency
    available locally, but we only build the precise one we specify.  We give
    an example of its use below.

    To read about the mechanism used, see the zc.buildout documentation of the
    ``versions`` option in the ``[buildout]`` section.

``eggs``
    The ``eggs`` directory holds the eggs built from the downloaded
    distributions.  Unless you set it up differently yourself, this directory
    is shared by all your branches.  This directory is local to your
    system--we do not manage it in a branch.  One reason for this is that eggs
    are often platform-specific.

``download-cache``
    The ``download-cache`` directory is a set of downloaded
    distributions--that is, exact copies of the items that would typically be
    obtained from the Python Package Index ("PyPI"), or another download
    source.  We manage the download cache as a shared resource across all of
    our developers with a bzr branch in a Launchpad project called
    ``lp-source-dependencies``.

    When we run buildout, Buildout reads a special key and value in the
    ``[buildout]`` section: ``install-from-cache = true``.  This means that,
    by default, Buildout will *not* use network access to find packages, but
    *only* look in the download cache.  This has many advantages.

    - First, it helps us keep our deployment boxes from needing network access
      out to PyPI and other download sites.
    - Second, it makes the buildout much faster, because it does not have to
      look out on the net for every dependency.
    - Third, it makes the buildout more repeatable, because we are more
      insulated from outages at download sites such as PyPI, and poor release
      management.
    - Fourth, it makes our deployments more auditable, because we can tell
      exactly what we are deploying.
    - Fifth, it gives us a single obvious place to put custom package
      distributions, as we'll discuss below.

    The downside is that adding and upgrading packages takes a small
    additional step, as we'll see below.

``buildout-templates``
    The last additional item in the checkout is the ``buildout-templates``
    directory.  This is used to hold templates that are used by the section in
    buildout.cfg that uses the ``z3c.recipe.filetemplate`` recipe.  This can
    be used for many things, but we are using it as an alternate way for
    producing scripts when the zc.recipe.egg approach is insufficient.

In addition to these seven listings, after you have run the Makefile (or
``bin/buildout``), you will see an additional listing:

``bin``
    The ``bin`` directory has already been discussed many times.  After
    running ``bootstrap.py``, it holds the ``buildout`` script which can be
    used to process Buildout configuration files.  In Launchpad's case, after
    running the buildout, it also holds many executables, including scripts to
    test Launchpad; to run it; to run Python or IPython with Launchpad's
    sourcetree and dependencies available; to run harness or iharness (with
    IPython) with the sourcetree, dependencies, and database connections; or
    to perform several other tasks.  For now, the Makefile provides aliases
    for many of these.

Now that you have an introduction to the pertinent files and directories,
we'll move on to trying to perform tasks in the buildout.  We'll discuss
adding a dependency, upgrading a dependency, adding a script, adding an
arbitrary file, and working with unreleased packages.

Add a Package
=============

Let's suppose that we want to add the "lazr.foo" package as a dependency.

1. Add the new package to the ``setup.py`` file in the ``install_requires``
   list.

   Generally, our policy is to only set minimum version numbers in this file,
   or none at all.  It doesn't really matter for an application like
   Launchpad, but it is a good rule for library packages, so we follow it for
   consistency.
   Therefore, we might simply add ``'lazr.foo'`` to install_requires, or
   ``'lazr.foo >= 1.1'`` if we know that we are depending on features
   introduced in version 1.1 of lazr.foo.

2. [OPTIONAL] If you know it, add the desired version number to versions.cfg
   now.

   For instance, if you know you want lazr.foo 1.1.2, add this line to the
   ``[versions]`` section of ``versions.cfg``::

       lazr.foo = 1.1.2

3. [OPTIONAL] Add the desired distribution of lazr.foo 1.1.2 to the
   ``download-cache/dist`` directory.

4. Run the following command (or your variation)::

       ./bin/buildout -v buildout:install-from-cache=false | tee out.txt | grep 'Picked'

   The first part (``./bin/buildout -v buildout:install-from-cache=false``)
   will run buildout, allowing it to download source packages from the
   Internet to ``download-cache/dist``.

   The second part (``tee out.txt``) will dump the full output to the
   ``out.txt`` file in case you need to debug a problem.

   The last part (``grep 'Picked'``) will filter the output so that only
   additional packages--dependencies of your dependency--will be listed.

   Look at the output to see whether the new package pulled in further
   dependencies, some of which may be indirect dependencies.  Here's an
   imaginary example output::

       Picked: ipython = 0.9.1
       Picked: lazr.foom = 1.4
       Picked: zope.bar = 3.6.1
       Picked: z3c.shazam = 2.0.1

   At this time, note that the output will include at least one, and possibly
   more, spurious "Picked:" listings.  ipython, in particular, shows up
   repeatedly.

   In our example, other than the spurious ``ipython`` listing, this means
   that these packages have also been added to your ``download-cache/dist``
   directory.  You also need to add those versions to the ``versions.cfg``
   file::

       lazr.foom = 1.4
       zope.bar = 3.6.1
       z3c.shazam = 2.0.1

5. Test.

   [TODO] Note that you can tell ec2test to include all uncommitted
   distributions from the local download-cache in its tests.  If you do this,
   you cannot use the ec2test feature to submit on test success.
   Also, if you have uncommitted distributions and you do *not* explicitly
   tell ec2test to include or ignore the uncommitted distributions, it will
   refuse to start an instance.

6. Check old versions in the download-cache.  If you are sure that they are
   not in use any more, *anywhere*, then remove them to save checkout space.
   More explicitly, check with the LOSAs to see if they are in use in
   production and send an email to launchpad-dev@lists.launchpad.net before
   deleting anything if you are unsure.

7. Commit the changes (``cd download-cache``, ``bzr up``,
   ``bzr commit -m 'Add lazr.foo 1.1.2 and dependencies to the download cache'``)
   to the shared download cache when you are sure it is what you want.

*Never* modify a package in the download-cache.

Upgrade a Package
=================

Sometimes you need to upgrade a dependency.  This may require additional
dependency additions or upgrades.

If you already know what version you want, the simplest thing to try is to
modify versions.cfg to specify the new version and run steps 4, 5, and 6 of
the `Add a Package`_ instructions.

If you don't know what version you want, but just want to see what happens
when you upgrade to the most recent release, you can clear out the versions of
the packages for upgrade and give it a try (that is, run steps 4, 5, and 6 of
the `Add a Package`_ instructions).

Note that, when not given an explicit version number, our buildout is set to
prefer final releases over alpha and beta releases.  If you want to
temporarily override this behavior, include ``buildout:prefer-final=false`` as
another argument to ``bin/buildout``.

Add a Script
============

We often need scripts that are run in a certain environment defined by Python
dependencies, and sometimes even different Python executables.

Several of the scripts we have are specified using the setuptools-based
spelling that the ``zc.recipe.egg`` recipe supports.
For the common case, in ``setup.py``, add a string in the ``console_scripts``
list of the ``entry_points`` argument.  Here's an example string::

    'run = canonical.launchpad.scripts.runlaunchpad:start_launchpad'

This will create a script named ``run`` in the ``bin`` directory that calls
the ``start_launchpad`` function in the
``canonical.launchpad.scripts.runlaunchpad`` module.

See the `zc.recipe.egg documentation`_ for more information on how to add
scripts using this method.

.. _`zc.recipe.egg documentation`: http://pypi.python.org/pypi/zc.recipe.egg

Add a File Modified By Buildout
===============================

Sometimes we need more control for the way our scripts are generated, or we
need other files processed during a buildout.  Writing a custom zc.buildout
recipe is one way, but well out of the scope of this document.  Read the
zc.buildout documentation for direction.

A much easier, and more limited, approach is to use
`z3c.recipe.filetemplate`_ to build the file.  The recipe uses the
``buildout-templates`` directory, which is a mirror of the Launchpad source
tree.  The recipe searches the tree for files ending in '.in', performs
variable substitution on them, and then copies them into the Launchpad source
tree.

To add a file using the recipe, simply create mirrors of the source tree
directories that you need under ``buildout-templates/``, and create a .in file
template at the desired location.

Take a look at ``buildout-templates/bin/`` for examples of what is possible.

.. _`z3c.recipe.filetemplate`: http://pypi.python.org/pypi/z3c.recipe.filetemplate

Work with Unreleased or Forked Packages
=======================================

Sometimes you need to work with unreleased or forked packages.
FeedValidator_, for instance, makes nightly zip releases but other than that
only offers svn access.  Similarly, we may require a patched or unreleased
version of a package for some purpose.  Hopefully, these situations will be
rare, but they do occur.

While `other answers`_ are available for Buildout, our solution is to use the
download-cache.  Basically, make a custom source distribution with a unique
suffix in the name, and use it (and its version name) for the normal process
of adding or updating a package, as described above.  Because the custom
package is in the download-cache, it will be found and used.

Here's an example of making a custom distribution of FeedValidator.

FeedValidator is a Subversion project.  We check it out::

    svn co http://feedvalidator.googlecode.com/svn/trunk/feedvalidator/src feedvalidator

Next, we ``cd feedvalidator``, and, using a Python that has setuptools
installed, we run the following command::

    python setup.py egg_info -r -bDEV sdist

For this example, imagine that the current revision of the repository is 1049.
Because setuptools has built-in Subversion support, the command above will
create a tar.gz in the ``dist`` directory named
``feedvalidator-0.0.0DEV-r1049.tar.gz``.  The -r option specifies that the
subversion revision should be part of the package name.  The -bDEV option
specifies that the 'DEV' suffix should be added to the version number.

We could then put the tar.gz file in Launchpad's ``download-cache/dist``
directory, specify ``feedvalidator = 0.0.0DEV-r1049`` in the ``versions.cfg``
file, and proceed with the usual steps to update or add a new package.

If you use a bzr branch, you might use the ``-d`` option instead of the ``-r``
option when you create the distribution.  This will add the date instead of
the revision::

    python setup.py egg_info -d -bDEV sdist

For instance, this might produce a distribution for the ``lazr.restful``
project with a name like this: ``lazr.restful-0.9.1DEV-20090512.tar.gz``.

See the setuptools documentation for more information about
`the egg_info command`_.

It is also possible that setup.py does not support the egg_info command and it
is sufficient to just run the sdist command.
In that case, the ``sdist`` command itself adds the current revision of the
bzr branch to the version number::

    python setup.py sdist

For instance, this might produce a distribution for the ``testtools`` project
with a name like this: ``testtools-0.9.12-r228.tar.gz``.

.. _FeedValidator: http://feedvalidator.org/

.. _`other answers`: http://pypi.python.org/pypi/zc.buildoutsftp

.. _`the egg_info command`: http://peak.telecommunity.com/DevCenter/setuptools#tagging-and-daily-build-or-snapshot-releases

Developing a Dependent Library In Parallel
==========================================

Sometimes you need to iterate on a change to a library used by Launchpad that
is managed by buildout as an egg.  You could just edit what is in the ``eggs``
directory, but it is harder to produce a patch while doing this.  You could
instead grab a branch of the library, produce an egg every time you make a
change, and make buildout use the new egg, but this is slow.

Buildout defaults to only using the current directory as code that will be
used without creating a distribution.  We can arrange for it to use other
paths as well, so we can use a checkout of any code we like, with changes
being picked up instantly without us having to create a distribution.

To do this, add the extra paths to the ``develop`` key in the ``[buildout]``
section of ``buildout.cfg``::

    [buildout]
    develop = .
              path/to/branch

and re-run ``make``.

Now any changes you make in that path will be picked up, and you are free to
make the changes you need and test them in the Launchpad environment.

Once you are finished you can produce a distribution as above for inclusion in
Launchpad, as well as sending your patch upstream.  At that point you are free
to revert the configuration to only develop Launchpad.  You should not submit
your branch with this change in the configuration as it will not work for
others.

Be aware that you may have to change the version for the package in
``versions.cfg`` if there is a difference between the version in the ``eggs``
directory and the one in the ``setup.py`` that you pointed to in the
``develop`` key.

One thing to be wary of is that setuptools treats "develop eggs" created by
this process with the same precedence as system packages.  That means that if
the version in the ``setup.py`` at the path that you put in the ``develop``
key is the same as the version installed system-wide, setuptools may pick the
wrong one.  If that happens, increase the version in that ``setup.py`` and
setuptools will pick up the develop egg.

=====================
Possible Future Goals
=====================

- No longer use system site-packages.
- No longer use make.
- Get rid of the sourcecode directory.