Background

The SDK is distributed as a VirtualBox image, and developers make changes to adjust the SDK to their needs. These changes include installing tools and libraries, changing system configuration files, and adding content to their workspace. There is one VirtualBox image for each version of the SDK, and currently a version upgrade requires each developer to manually migrate their SDK customization to the new version. This migration is time-consuming and involves frustrating, repetitive work.

An additional problem is that some product teams need to support different versions of the SDK at the same time. The main challenge in this scenario is keeping the developer’s customizations synchronized between multiple VirtualBox images.

The goal of this document is to define a model that decouples developer customization from SDK images, thus allowing the developer to keep the workspace, configuration, and binary packages (libraries and tools) persistent across different SDK images.

Use cases

  • SDK developer wants to share the workspace among different SDK images with minimal effort. In particular, the user doesn't want to have to rely on manually copying the workspace across SDK images in order to keep them in sync.

  • SDK developer wants a simple way to share custom system configuration (i.e. changes to /etc) across SDK images.

  • SDK developer wants to keep the selection of tools and libraries in sync across different SDK images.

Solution

To address workspace persistence, and to partially address tool and library synchronization across different SDK images, the following options were considered:

  • Use VirtualBox shared folders as mount points for /home and /opt directories
  • Ship a preconfigured second disk as part of the SDK images using the OVF format (.ova files)
  • Use a second (optional) disk with partitions for /home and /opt directories and leave it to the developer to set up the disk. Helper scripts would then be provided to help the developer set up the disk (e.g. create partitions and mount points, copy existing content of /home and /opt directories, etc.)

The use of shared folders would be ideal here, given that the setup would be simpler while also allowing the developer to easily share data between the host and the guest (SDK). The problem with shared folders is that they don't support the creation of symlinks, which are essential for development because they are frequently used when configuring a source tree for a build.

The symlink issue does not arise with a second disk, as the disk can be partitioned and formatted with a filesystem that supports symlinks, making it a viable option here.
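Whether a given mount point supports symlinks can be checked before relying on it. The following is a minimal sketch of such a probe; the function name is hypothetical, and it assumes the caller has write access to the directory being tested.

```python
import os
import tempfile


def supports_symlinks(directory: str) -> bool:
    """Return True if a symlink can be created inside `directory`."""
    # Work in a scratch directory so the probe never touches existing files.
    with tempfile.TemporaryDirectory(dir=directory) as scratch:
        target = os.path.join(scratch, "target")
        link = os.path.join(scratch, "link")
        with open(target, "w") as handle:
            handle.write("probe\n")
        try:
            os.symlink(target, link)
        except (OSError, NotImplementedError):
            # The filesystem (or platform) refused to create the link.
            return False
        return os.path.islink(link)


if __name__ == "__main__":
    for path in (tempfile.gettempdir(),):
        state = "supported" if supports_symlinks(path) else "NOT supported"
        print(f"{path}: symlinks {state}")
```

Running the probe against a shared-folder mount point and against a partition on the second disk should make the difference between the two options visible.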

While the option to ship a preconfigured second disk as part of the SDK images (using the OVF format) seems like a better approach at first, it brings some limitations:

  • The disk/partitions size would be limited to what is preconfigured during image build
  • Although some workarounds exist for using .vdi images (the native VirtualBox image format) in .ova files, this is not officially supported, and VirtualBox converts any .vdi file to the .vmdk format when exporting an appliance in the OVF format
  • In order to allow the same disk to be used by multiple virtual machines concurrently, VirtualBox requires the disk to be made shareable, which in turn requires a fixed-size disk (not dynamically allocated). While this may not be a common use case, some developers may still want it to be supported, in which case the SDK images would grow considerably in size, increasing download time and bandwidth use.

That said, we recommend the use of a second disk configured by the developers themselves. This gives the developer more flexibility while avoiding the limitations of the OVF format. Helper scripts could also be provided to ease the work of setting up the second disk. Another advantage of this solution is that current SDK users can rely on it in the same way as new users.

However, it is important to note that this option would also impact QA, which would need to support two different setups (with and without a second disk) for proper testing.

It is also important to note that while this solution partially addresses tool and library synchronization among different SDK images, it does not cover tools and libraries installed outside the developer workspace or the /opt directory. Supporting synchronization for any tool or library, regardless of where it is installed, would be quite complex and not practically viable for several reasons, such as the fact that dpkg uses a single database for installed packages.

For that reason, we recommend that developers who want to keep their package installation fully in sync among different images do so manually.

To address synchronization of system configuration changes (i.e. /etc) the following options were considered:

  • Use OverlayFS on top of /etc
  • Use symlinks in the second disk (e.g. on /opt/etc) for each configuration file changed

Although OverlayFS seems simpler at first, it has drawbacks: after an update, changes stored in the developer customization layer are likely to cause dependency issues and to hide changes to the content and structure of the base system. For example, if a developer upgrades an existing SDK image (or downloads a new one) and sets up the second disk/partition as an overlay for this image's /etc, any changes the new image makes to configuration files that are also present in the overlay would simply be ignored, and the user would be unlikely to notice.
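The shadowing problem can be illustrated with a toy model. The sketch below is not OverlayFS itself: it represents each layer as a plain dictionary from path to file content, with hypothetical function names, and only shows why an updated base file becomes invisible once the same path exists in the customization layer.

```python
def overlay_view(lower: dict, upper: dict) -> dict:
    """Merge two layers the way an overlay presents them: upper entries win."""
    merged = dict(lower)
    merged.update(upper)
    return merged


def shadowed_updates(old_lower: dict, new_lower: dict, upper: dict) -> list:
    """List paths changed by the new base image but hidden by the upper layer."""
    return sorted(
        path
        for path in upper
        if path in new_lower and old_lower.get(path) != new_lower[path]
    )


if __name__ == "__main__":
    # Hypothetical file contents, for illustration only.
    old_image = {"/etc/cntlm.conf": "defaults v1"}
    new_image = {"/etc/cntlm.conf": "defaults v2", "/etc/new.conf": "added in v2"}
    custom = {"/etc/cntlm.conf": "defaults v1 + proxy settings"}

    print(overlay_view(new_image, custom)["/etc/cntlm.conf"])
    print(shadowed_updates(old_image, new_image, custom))
```

In this model the merged view still serves the customized copy derived from the old defaults, and nothing signals that the new image changed the file unless something like `shadowed_updates` is run explicitly.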

The other option would be to use symlinks to the second disk for each configuration file changed. While this requires a bit more effort to set up, it gives the user more control over which configuration files are used. It should also make changes in the default image configuration easier to notice, given that the user is likely to check the original system configuration files before replacing them with symlinks.

With this option, the user would still have to manually create the symlinks in every SDK image that should share the configuration, but that process could be eased with helper scripts that create and set up the symlinks.

Note that this approach may also cause some issues, such as the fact that some specific software may not work with symlinked configuration files or that early boot could potentially break if there are symlinks to e.g. /opt.

Given that the most common use case for customizing system configuration would be to set up things like a system proxy (e.g. cntlm), and that not many customizations are expected, the recommended approach is to use symlinks, as this gives the user more control over the changes.

No single solution works for all use cases, so developers should evaluate the best approach based on their requirements.

Implementation notes

To set up a new second disk, the following would be required:

  • Create a new empty disk image
  • Add the disk to the SDK image in question using the VirtualBox UI
  • Partition the disk and format it with a filesystem that supports symlinks
  • Set up mount points (i.e. /etc/fstab) such that the disk is mounted during boot
  • Copy existing content of /home and /opt to the respective new disk partitions - such that things like the default user files/folders are properly populated on the new disk

Optionally, if the developer plans to use the same disk across multiple SDK instances at the same time, the disk created in the first step must be a fixed-size disk and must be marked as shareable using the VirtualBox UI.
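The same steps can also be performed from the host command line with VBoxManage. The sketch below only builds and prints the commands rather than running them. The disk path, size, and VM names are hypothetical, and it assumes each VM has a storage controller named "SATA" with port 1 free; check the controller name and port against the actual VM settings before using the output.

```python
import shlex


def shared_disk_commands(disk_path, size_mb, vm_names, controller="SATA", port=1):
    """Build VBoxManage commands for a fixed-size, shareable second disk."""
    commands = [
        # Shareable disks must be fixed-size, hence the Fixed variant.
        ["VBoxManage", "createmedium", "disk", "--filename", disk_path,
         "--size", str(size_mb), "--variant", "Fixed"],
        ["VBoxManage", "modifymedium", "disk", disk_path, "--type", "shareable"],
    ]
    for vm in vm_names:
        # Attach the same disk to every SDK virtual machine that should use it.
        commands.append(
            ["VBoxManage", "storageattach", vm, "--storagectl", controller,
             "--port", str(port), "--device", "0", "--type", "hdd",
             "--medium", disk_path]
        )
    return commands


if __name__ == "__main__":
    # Hypothetical names, for illustration only.
    for command in shared_disk_commands("sdk-data.vdi", 20480, ["sdk-a", "sdk-b"]):
        print(shlex.join(command))
```

Printing the commands first lets the developer review them; they can then be pasted into a host terminal or executed through `subprocess.run`.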

To set up an existing disk on a new SDK image, the following would be required:

  • Add the existing disk to the SDK image in question using the VirtualBox UI
  • Set up mount points (i.e. /etc/fstab) such that the disk is mounted during boot
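The mount point step is the same in both procedures and lends itself to scripting. The following is a minimal sketch that adds the entries to the text of /etc/fstab without duplicating them. It assumes the partitions were formatted as ext4 with the hypothetical labels "sdk-home" and "sdk-opt", and it uses the `nofail` mount option so that an image still boots when the second disk is not attached.

```python
def fstab_line(label, mountpoint, fstype="ext4", options="defaults,nofail"):
    """Format one fstab entry that identifies the partition by label."""
    return f"LABEL={label}\t{mountpoint}\t{fstype}\t{options}\t0\t2"


def add_mounts(fstab_text, mounts):
    """Return fstab_text with an entry appended for each unconfigured mount point."""
    lines = fstab_text.splitlines()
    configured = set()
    for line in lines:
        fields = line.split()
        # Skip blank lines and comments; field 2 is the mount point.
        if len(fields) >= 2 and not fields[0].startswith("#"):
            configured.add(fields[1])
    for label, mountpoint in mounts:
        if mountpoint not in configured:
            lines.append(fstab_line(label, mountpoint))
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical existing content and labels, for illustration only.
    current = "UUID=abcd / ext4 errors=remount-ro 0 1\n"
    print(add_mounts(current, [("sdk-home", "/home"), ("sdk-opt", "/opt")]), end="")
```

Because the function works on text and skips mount points that are already configured, running it a second time leaves the file unchanged.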

As mentioned above, helper scripts could be provided to ease this work. For example, a script running in setup mode could partition and format the disk, set up the mount points, and copy existing content to the new partitions; otherwise, it would only set up the mount points. It could also let the user optionally specify partition sizes and other configuration options.

For system configuration changes, considering the recommended approach, the same or another script could also be used to set up the symlinks based on the content of /opt/etc when setting up the disk. It is recommended that the content of /opt/etc mirror the directory structure and filenames of the original files in /etc, such that a script can walk through all directories and files in /opt/etc and create the corresponding symlinks in /etc.
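Such a walk could look like the sketch below. The function name and the ".orig" backup suffix are assumptions: keeping the image's original file next to the symlink is one way to let the user compare it against the customized version, but some services may react to unexpected files in their configuration directories, so the suffix handling may need adjusting.

```python
import os


def link_config_tree(source_root, target_root, backup_suffix=".orig"):
    """Symlink every file under source_root into the same place under target_root."""
    source_root = os.path.abspath(source_root)
    created = []
    for dirpath, _dirnames, filenames in os.walk(source_root):
        relative = os.path.relpath(dirpath, source_root)
        target_dir = os.path.normpath(os.path.join(target_root, relative))
        os.makedirs(target_dir, exist_ok=True)
        for name in filenames:
            source = os.path.join(dirpath, name)
            target = os.path.join(target_dir, name)
            if os.path.islink(target):
                if os.readlink(target) == source:
                    continue  # already linked, nothing to do
                os.remove(target)
            elif os.path.exists(target):
                # Keep the image's original file for later comparison.
                os.replace(target, target + backup_suffix)
            os.symlink(source, target)
            created.append(target)
    return created
```

On an SDK image this would be called as `link_config_tree("/opt/etc", "/etc")` with root privileges; trying it against scratch directories first is advisable.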

The user would still have to manually install the packages living outside /opt or the user workspace, but that can be easily done by retrieving the list of installed packages in one image (e.g. using dpkg --get-selections) and using that to install the packages in other images.
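As a rough sketch of that manual step, the output of `dpkg --get-selections` from a reference image can be compared with the output from the current image to find what still needs to be installed. The function names and file names below are hypothetical.

```python
def parse_selections(text):
    """Return package names marked "install" in `dpkg --get-selections` output."""
    packages = []
    for line in text.splitlines():
        fields = line.split()
        # Each line holds a package name and its selection state.
        if len(fields) == 2 and fields[1] == "install":
            packages.append(fields[0])
    return packages


def missing_packages(reference_text, current_text):
    """List packages selected in the reference image but not in the current one."""
    present = set(parse_selections(current_text))
    return [name for name in parse_selections(reference_text) if name not in present]


if __name__ == "__main__":
    # Hypothetical sample data; real input would come from files such as
    # reference.txt and current.txt, each saved with `dpkg --get-selections`.
    reference = "bash\t\t\tinstall\ncntlm\t\t\tinstall\nold-tool\t\t\tdeinstall\n"
    current = "bash\t\t\tinstall\n"
    missing = missing_packages(reference, current)
    if missing:
        print("apt-get install " + " ".join(missing))
```

The printed command can then be reviewed and run as root on the target image. Packages that are not available in the new image's repositories would still have to be handled by hand.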
