guixsd in lxd container
Eddy Pronk
2017-06-07 13:08:29 UTC
Hello guix!

I'm trying to run guixsd in an lxd container.
My lxd containers run on an Ubuntu server 16.04.

I took the usb-installer image and imported it as an lxd image.

When a container starts, it runs /sbin/init.
In guixsd, PID 1 (/proc/1) is shepherd, but a lot happens before shepherd
is started.

I've set a few things in the environment matching values in /proc/1/environ.
The argument of --load in grub.cfg is a guile program.

I found some details about the kernel loading guile here:
https://lists.gnu.org/archive/html/guix-devel/2016-12/msg00704.html

To get some logging during guixsd startup, I'm trying to run the boot
program from a shell script as root.

====
cat /sbin/start
export HOME=/
export TERM=linux
export BOOT_IMAGE="/gnu/store/fqc2kg4lq1lz1ymk41080jzb5q90icg0-linux-libre-4.11/bzImage --root=gnu-disk-image --system=/gnu/store/kq71yhydfgc0nksvmmn66cbvbj5a3mvf-system --load=/gnu/store/kq71yhydfgc0nksvmmn66cbvbj5a3mvf-system/boot"
export PATH=/gnu/store/crvb68g89b479n4h44r8l42hy39axhg2-shadow-4.4/sbin/
cd $HOME
/gnu/store/sa7zrdfqglnb5rvvr11qdj0rspbs292v-profile/bin/ln -s \
  /gnu/store/kq71yhydfgc0nksvmmn66cbvbj5a3mvf-system /run/current-system
/gnu/store/zk41gmzbibvpx9dpsm5gs8p0liz8shy0-guile-2.0.14/bin/guile \
  --no-auto-compile /gnu/store/kq71yhydfgc0nksvmmn66cbvbj5a3mvf-system/boot
===

When I run the start script I get the following output.

$ lxc exec guixsd -- \
    /gnu/store/sa7zrdfqglnb5rvvr11qdj0rspbs292v-profile/bin/bash -c \
    "/sbin/start 2>&1"
/gnu/store/sa7zrdfqglnb5rvvr11qdj0rspbs292v-profile/bin/ln: failed to create symbolic link '/run/current-system/kq71yhydfgc0nksvmmn66cbvbj5a3mvf-system': File exists
making '#f' the current system...
Backtrace:
In ice-9/boot-9.scm:
160: 13 [catch #t #<catch-closure 938020> ...]
In unknown file:
?: 12 [apply-smob/1 #<catch-closure 938020>]
In ice-9/boot-9.scm:
66: 11 [call-with-prompt prompt0 ...]
In ice-9/eval.scm:
432: 10 [eval # #]
In ice-9/boot-9.scm:
2412: 9 [save-module-excursion #<procedure 95c900 at ice-9/boot-9.scm:4084:3 ()>]
4089: 8 [#<procedure 95c900 at ice-9/boot-9.scm:4084:3 ()>]
1734: 7 [%start-stack load-stack #<procedure 9663a0 at ice-9/boot-9.scm:4080:10 ()>]
1739: 6 [#<procedure 96ebd0 ()>]
In unknown file:
?: 5 [primitive-load "/gnu/store/kq71yhydfgc0nksvmmn66cbvbj5a3mvf-system/boot"]
In ice-9/eval.scm:
432: 4 [eval # ()]
In unknown file:
?: 3 [primitive-load "/gnu/store/9j944zjslsihhsgipa7gz7x046fkcjm7-activate"]
In ice-9/eval.scm:
432: 2 [eval # ()]
In ./gnu/build/activation.scm:
456: 1 [activate-current-system #f]
In unknown file:
?: 0 [symlink #f "/run/current-system.new"]

ERROR: In procedure symlink:
ERROR: Wrong type (expecting string): #f
===
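
A note on the ln message at the top of that output: when the link name already exists and points to a directory, GNU ln tries to create the new link inside that directory, which is where the nested '/run/current-system/...-system' path comes from. A minimal sketch of a re-runnable alternative, assuming an ln that supports -n (GNU coreutils does). The helper name and the scratch paths are made up for illustration:

```shell
#!/bin/sh
# Hypothetical helper: (re)point a symlink at a directory, safe to run twice.
# -s symbolic link, -f replace an existing link, -n do not follow an
# existing link that points to a directory.
relink() {
    ln -sfn "$1" "$2"
}

# Demo in a scratch directory instead of /run.
demo=$(mktemp -d)
mkdir "$demo/system-a" "$demo/system-b"
relink "$demo/system-a" "$demo/current-system"
relink "$demo/system-b" "$demo/current-system"   # second call replaces the link
readlink "$demo/current-system"
```

In the start script this would correspond to passing -sfn instead of -s to ln.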

Ignoring the errors above, I'll now try to start shepherd to see how far I
get.

***@ubuntu16041:~/guixsd$ lxc exec guixsd -- \
    /gnu/store/sa7zrdfqglnb5rvvr11qdj0rspbs292v-profile/bin/bash -c \
    "/gnu/store/q49si29djfcrpzibqg6cg8k6xixxvd2f-shepherd-0.3.2/bin/shepherd --config /gnu/store/df56ad2rw1ayjyhs1kqadskf5zsmsc5l-shepherd.conf 2>&1"
Service root has been started.
starting services...
Service root-file-system has been started.
Service user-file-systems has been started.
Service file-system-/tmp has been started.
failed to start service 'file-systems' <<== first problem.
failed to start service 'file-system-/dev/pts'
Service file-system-/dev/shm has been started.
failed to start service 'file-system-/gnu/store'
failed to start service 'user-processes'
Service host-name has been started.
failed to start service 'user-homes'
failed to start service 'nscd'
failed to start service 'ssh-daemon'
waiting for udevd...
waiting for udevd...
waiting for udevd...
waiting for udevd...
Service udev has been started.
Service gpm could not be started.
failed to start service 'console-font-tty1'
failed to start service 'console-font-tty2'
failed to start service 'console-font-tty3'
failed to start service 'console-font-tty4'
failed to start service 'console-font-tty5'
failed to start service 'console-font-tty6'
failed to start service 'guix-daemon'
failed to start service 'syslogd'
failed to start service 'term-tty6'
failed to start service 'term-tty5'
failed to start service 'term-tty4'
failed to start service 'term-tty3'
failed to start service 'term-tty2'
failed to start service 'term-tty1'


C-c C-c^CExiting shepherd...
unmounting '/dev'...
failed to unmount '/dev': Device or resource busy
unmounting '/dev/null'...
failed to unmount '/dev/null': Device or resource busy
Service user-file-systems has been stopped.
Service host-name has been stopped.
Service file-system-/dev/shm has been stopped.
Service file-system-/tmp has been stopped.
Service udev has been stopped.
closing log
===

See also:
https://lists.gnu.org/archive/html/guix-devel/2016-12/msg00733.html

I would appreciate some help solving this puzzle.



Cheers,
Eddy
Ludovic Courtès
2017-06-09 21:54:23 UTC
Hi Eddy,
Post by Eddy Pronk
I'm trying to run guixsd in an lxd container.
My lxd containers run on an Ubuntu server 16.04.
I took the usb-installer image and imported in as an lxd image.
When a container start it runs /sbin/init.
You mean LXD expects to run /sbin/init, right?
Post by Eddy Pronk
In guixsd /proc/1 is shepherd, but a lot of stuff happens before shepherd
is started.
I've set a few things in the environment matching values in /proc/1/environ.
The argument of --load in grub.cfg is a guile program.
https://lists.gnu.org/archive/html/guix-devel/2016-12/msg00704.html
To be able to get some logging during startup of guixsd I'm trying to run
it from a shell script as root.
====
cat /sbin/start
export HOME=/
export TERM=linux
export
BOOT_IMAGE="/gnu/store/fqc2kg4lq1lz1ymk41080jzb5q90icg0-linux-libre-4.11/bzImage
--root=gnu-disk-image
--system=/gnu/store/kq71yhydfgc0nksvmmn66cbvbj5a3mvf-system
--load=/gnu/store/kq71yhydfgc0nksvmmn66cbvbj5a3mvf-system/boot"
export PATH=/gnu/store/crvb68g89b479n4h44r8l42hy39axhg2-shadow-4.4/sbin/
cd $HOME
/gnu/store/sa7zrdfqglnb5rvvr11qdj0rspbs292v-profile/bin/ln -s
/gnu/store/kq71yhydfgc0nksvmmn66cbvbj5a3mvf-system /run/current-system
/gnu/store/zk41gmzbibvpx9dpsm5gs8p0liz8shy0-guile-2.0.14/bin/guile
--no-auto-compile /gnu/store/kq71yhydfgc0nksvmmn66cbvbj5a3mvf-system/boot
===
When I run the start script I get the following output.
$ lxc exec guixsd --
/gnu/store/sa7zrdfqglnb5rvvr11qdj0rspbs292v-profile/bin/bash -c
"/sbin/start 2>&1"
/gnu/store/sa7zrdfqglnb5rvvr11qdj0rspbs292v-profile/bin/ln: failed to
create symbolic link
'/run/current-system/kq71yhydfgc0nksvmmn66cbvbj5a3mvf-system': File exists
making '#f' the current system...
[...]
Post by Eddy Pronk
456: 1 [activate-current-system #f]
?: 0 [symlink #f "/run/current-system.new"]
ERROR: Wrong type (expecting string): #f
The line that’s printed here comes from (gnu build activation):

https://git.savannah.gnu.org/cgit/guix.git/tree/gnu/build/activation.scm#n442

As you can see, it takes the value of ‘system’ either from the kernel’s
‘--system’ command-line argument (/proc/cmdline), or from the
‘GUIX_NEW_SYSTEM’ environment variable.

So you’d have to set ‘GUIX_NEW_SYSTEM’ in your case to fix this.
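
The lookup described above can be paraphrased in shell. This is only a sketch of the logic as described here, not the code in (gnu build activation): the function name find_system is made up, its argument stands in for the contents of /proc/cmdline, and the order in which the two sources are checked is an assumption.

```shell
#!/bin/sh
# Hypothetical paraphrase of the lookup: use GUIX_NEW_SYSTEM if it is set,
# otherwise look for a '--system=' argument on the kernel command line.
# (Which of the two takes precedence is an assumption of this sketch.)
find_system() {
    cmdline=$1
    if [ -n "${GUIX_NEW_SYSTEM:-}" ]; then
        printf '%s\n' "$GUIX_NEW_SYSTEM"
        return 0
    fi
    for arg in $cmdline; do
        case $arg in
            --system=*)
                printf '%s\n' "${arg#--system=}"
                return 0
                ;;
        esac
    done
    return 1   # nothing found: this is the '#f' case from the backtrace
}
```

Inside an LXD container, `find_system "$(cat /proc/cmdline)"` sees the host kernel's command line, which has no '--system' argument, so without the environment variable nothing is found.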
Post by Eddy Pronk
Service file-system-/tmp has been started.
failed to start service 'file-systems' <<== first problem.
failed to start service 'file-system-/dev/pts'
What ‘guix system container’ does to work around this is to try to mount
only file systems that can really be mounted inside a container, with
the right options:

https://git.savannah.gnu.org/cgit/guix.git/tree/gnu/system/linux-container.scm#n37
https://git.savannah.gnu.org/cgit/guix.git/tree/gnu/system/file-systems.scm#n325

HTH!

BTW, did you consider using ‘guix system container’ directly instead of
LXC? It’s not perfect but probably worth a try:

https://www.gnu.org/software/guix/manual/html_node/Invoking-guix-system.html

Ludo’.
Eddy Pronk
2017-06-10 04:53:24 UTC
Post by Ludovic Courtès
Post by Eddy Pronk
When a container start it runs /sbin/init.
You mean LXD expects to run /sbin/init, right?
Yes, and the LXD container can be configured to run any other program instead.
Post by Ludovic Courtès
https://git.savannah.gnu.org/cgit/guix.git/tree/gnu/build/activation.scm#n442
As you can see, it takes the value of ‘system’ either from the kernel’s
‘--system’ command-line argument (/proc/cmdline), or from the
‘GUIX_NEW_SYSTEM’ environment variable.
So you’d have to set ‘GUIX_NEW_SYSTEM’ in your case to fix this.
Added this to my start script:
export GUIX_NEW_SYSTEM=/gnu/store/kq71yhydfgc0nksvmmn66cbvbj5a3mvf-system

Great. That works. This got me 2 steps further.

I had to comment out some snippets from activation.scm:

;; activate-modprobe
;; activate-firmware
;; activate-ptrace-attach

(I actually edited the generated one-line snippets in the store, not
activation.scm itself.)

Now the boot script starts shepherd.
Post by Ludovic Courtès
Post by Eddy Pronk
Service file-system-/tmp has been started.
failed to start service 'file-systems' <<== first problem.
failed to start service 'file-system-/dev/pts'
What ‘guix system container’ does to work around this is to try to mount
only file systems that can really be mounted inside a container, with
https://git.savannah.gnu.org/cgit/guix.git/tree/gnu/system/linux-container.scm#n37
https://git.savannah.gnu.org/cgit/guix.git/tree/gnu/system/file-systems.scm#n325
I'd like to get some logging out of the "failed to start service
'file-systems'".
When /sbin/init starts the usb-install image is already mounted on '/'.

What I see in pstree is:

---shepherd-+-udevd
            `-{shepherd}

failed to start service 'term-tty1' is the last thing I see.
Can someone post a full log of the shepherd startup?
Post by Ludovic Courtès
BTW, did you consider using ‘guix system container’ directly instead of
I'll give that a try. Maybe just to learn how it does it.

For Ubuntu users (or users of other distros with LXD) it would be a nice
managed way of trying out GuixSD if I get this to work.


Cheers,
Eddy
Jan Nieuwenhuizen
2017-06-10 05:30:11 UTC
Post by Eddy Pronk
Post by Eddy Pronk
When a container start it runs /sbin/init.
As you can see, it takes the value of ‘system’ either from the kernel’s
‘--system’ command-line argument (/proc/cmdline), or from the
‘GUIX_NEW_SYSTEM’ environment variable.
So you’d have to set ‘GUIX_NEW_SYSTEM’ in your case to fix this.
Nice to know, kind of obvious when you do... I saw
Post by Eddy Pronk
Post by Eddy Pronk
making '#f' the current system...
which even gave me a smile...that won't work ;-)
Post by Eddy Pronk
export GUIX_NEW_SYSTEM=/gnu/store/kq71yhydfgc0nksvmmn66cbvbj5a3mvf-system
Great. That works. This got me 2 steps further.
Yay!
Post by Eddy Pronk
failed to start service 'term-tty1' is the last thing I see.
Can someone post a full log of the shepherd startup?
PFA
Post by Eddy Pronk
For Ubuntu users (or others distros with LXD) it would be a nice
managed way of trying out GuixSD if I get this to work.
Great work, thanks!

Greetings, janneke
Ludovic Courtès
2017-06-11 20:26:58 UTC
Hi,
[...]
Post by Eddy Pronk
Post by Ludovic Courtès
So you’d have to set ‘GUIX_NEW_SYSTEM’ in your case to fix this.
export GUIX_NEW_SYSTEM=/gnu/store/kq71yhydfgc0nksvmmn66cbvbj5a3mvf-system
Great. That works. This got me 2 steps further.
;; activate-modprobe
;; activate-firmware
;; activate-ptrace-attach
Yeah, ‘guix system container’ does that too:

http://git.savannah.gnu.org/cgit/guix.git/tree/gnu/system.scm#n418
http://git.savannah.gnu.org/cgit/guix.git/tree/gnu/services.scm#n466

I think we should look for ways that would allow you to reuse what ‘guix
system container’ does.

HTH,
Ludo’.
Eddy Pronk
2017-06-16 12:21:59 UTC
Post by Ludovic Courtès
Post by Eddy Pronk
;; activate-modprobe
;; activate-firmware
;; activate-ptrace-attach
http://git.savannah.gnu.org/cgit/guix.git/tree/gnu/system.scm#n418
http://git.savannah.gnu.org/cgit/guix.git/tree/gnu/services.scm#n466
I think we should look for ways that would allow you to reuse what ‘guix
system container’ does.
Shall I open a bug for this so it can be tracked?

Eddy
Ludovic Courtès
2017-06-19 11:41:06 UTC
Hi Eddy,

Sorry for the delay.
Post by Eddy Pronk
Post by Ludovic Courtès
Post by Eddy Pronk
;; activate-modprobe
;; activate-firmware
;; activate-ptrace-attach
http://git.savannah.gnu.org/cgit/guix.git/tree/gnu/system.scm#n418
http://git.savannah.gnu.org/cgit/guix.git/tree/gnu/services.scm#n466
I think we should look for ways that would allow you to reuse what ‘guix
system container’ does.
Shall I open a bug for this so it can be tracked?
Before, I’d like to make sure we have a good understanding of what we
need.

My first question would be, do we really need to have a mechanism other
than ‘guix system container’? I guess that if the host system runs LXD,
it’s better to use it than to spawn the script that ‘guix system
container’ generates.

In that case, what about adding an LXD backend to ‘guix system
container’? AIUI LXD has a REST API¹; by doing a POST on /1.0/images,
we should be able to register our container image, though it’s not clear
to me what image format is expected. If we can figure out these
details, it might not be that hard to implement.

Dave, there’s also some overlap with your work on ‘guix deploy’ I think.
Thoughts?

Ludo’.

¹ https://github.com/lxc/lxd/blob/master/doc/rest-api.md
Eddy Pronk
2017-06-20 12:35:19 UTC
Post by Ludovic Courtès
Post by Eddy Pronk
Shall I open a bug for this so it can be tracked?
Before, I’d like to make sure we have a good understanding of what we
need.
My first question would be, do we really need to have a mechanism other
than ‘guix system container’? I guess that if the host system runs LXD,
it’s better to use it than to spawn the script that ‘guix system
container’ generates.
In that case, what about adding an LXD backend to ‘guix system
container’? AIUI LXD has a REST API¹; by doing a POST on /1.0/images,
we should be able to register our container image, though it’s not clear
to me what image format is expected. If we can figure out these
details, it might not be that hard to implement.
LXD expects a root file system and a traditional 'init' process.
For this experiment I'm using a bash script to play the role of /sbin/init.

I'll need to spend some time to see what 'guix system container' generates.
I'm very new to guix, so that will be my homework for my spare time
this week. :-)

This weekend I set up a VM on Google cloud with lxd.

Below is the log of all the steps I did for this experiment.

(I can give access to anyone who wants to experiment in this environment.
Just send me your SSH public key.)


My recipe so far:

$ wget https://alpha.gnu.org/gnu/guix/guixsd-vm-image-0.13.0.x86_64-linux.xz

$ xz -d guixsd-vm-image-0.13.0.x86_64-linux.xz

$ qemu-img convert guixsd-vm-image-0.13.0.x86_64-linux image.raw

We need the sector size and start sector for the right offset:
$ fdisk -l image.raw

Sector size (logical/physical): 512 bytes / 512 bytes

Device     Boot   Start     End Sectors Size Id Type
image.raw1 *       2048 4093952 4091905   2G 83 Linux
image.raw2      4093953 4175873   81921  40M ef EFI (FAT-12/16/32)

Create a loopback device with an offset pointing to the Linux partition:

$ sudo losetup /dev/loop0 image.raw -o $((2048 * 512))
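
The offset is just the partition's start sector times the logical sector size, both taken from the fdisk listing. A small sketch of that arithmetic; the function name is made up, and the losetup call is left as a comment because it needs root:

```shell
#!/bin/sh
# Hypothetical helper: byte offset of a partition inside a raw disk image.
partition_offset() {
    start_sector=$1
    sector_size=$2
    echo $((start_sector * sector_size))
}

offset=$(partition_offset 2048 512)
echo "$offset"   # 1048576, i.e. 1 MiB
# sudo losetup --find --show --offset "$offset" image.raw
```

With --find --show, util-linux losetup picks the first free loop device and prints its name, which avoids hard-coding /dev/loop0.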

Mount it. Now we have the content of the vm image on /mnt.
$ sudo mount /dev/loop0 /mnt

I hope this preserves links and timestamps in the right way.
$ sudo tar cpf ./rootfs.tar -C /mnt/ .
tar: ./dev/log: socket ignored
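
To check whether tar keeps symlinks as symlinks without touching the real image, one can archive a scratch directory that contains one and look at the listing. A sketch with the same tar invocation shape as above; all paths are temporary and made up:

```shell
#!/bin/sh
# Build a tiny tree with one regular file and one symlink to it.
scratch=$(mktemp -d)
mkdir "$scratch/root"
echo hello > "$scratch/root/file"
ln -s file "$scratch/root/link"

# Same options as for the rootfs: c = create, p = preserve permissions.
tar cpf "$scratch/rootfs.tar" -C "$scratch/root" .

# In the verbose listing, a symlink entry starts with 'l' and shows its target.
listing=$(tar tvf "$scratch/rootfs.tar")
echo "$listing"
```

For a root file system it may also be worth adding GNU tar's --numeric-owner so that ownership is recorded by numeric ID; that is a suggestion, not something tried in this experiment.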

=== metadata.yaml ===
architecture: "x86_64"
creation_date: 1424284563
properties:
  description: "GuixSD Intel 64bit"
  os: "guixsd"
  release: "0.0"
===

lxc imports an image from 2 tarballs:
$ tar cf metadata.tar metadata.yaml
$ lxc image import metadata.tar rootfs.tar --alias guixsd-vm
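
The metadata step can be scripted. A sketch that writes metadata.yaml with the current time as creation_date and packs it into a tarball; the function name and the scratch directory are made up, and the import itself is left as a comment because it needs a running LXD:

```shell
#!/bin/sh
# Hypothetical helper: write an LXD image metadata.yaml into directory $1.
write_metadata() {
    cat > "$1/metadata.yaml" <<EOF
architecture: "x86_64"
creation_date: $(date +%s)
properties:
  description: "GuixSD Intel 64bit"
  os: "guixsd"
  release: "0.0"
EOF
}

workdir=$(mktemp -d)
write_metadata "$workdir"
tar cf "$workdir/metadata.tar" -C "$workdir" metadata.yaml
# lxc image import "$workdir/metadata.tar" rootfs.tar --alias guixsd-vm
```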

***@instance-1:~$ lxc image list
+-----------+--------------+--------+--------------------+--------+----------+------------------------------+
| ALIAS     | FINGERPRINT  | PUBLIC | DESCRIPTION        | ARCH   | SIZE     | UPLOAD DATE                  |
+-----------+--------------+--------+--------------------+--------+----------+------------------------------+
| guixsd-vm | c9eeb3dfcee7 | no     | GuixSD Intel 64bit | x86_64 | 883.92MB | Jun 17, 2017 at 5:43am (UTC) |
+-----------+--------------+--------+--------------------+--------+----------+------------------------------+

Create container called guixsd from guixsd-vm image:

$ lxc launch guixsd-vm guixsd

This barfs: there is no /sbin/init in the image yet.

***@instance-1:~$ lxc list
+--------+---------+------+------+------------+-----------+
| NAME   | STATE   | IPV4 | IPV6 | TYPE       | SNAPSHOTS |
+--------+---------+------+------+------------+-----------+
| guixsd | STOPPED |      |      | PERSISTENT | 0         |
+--------+---------+------+------+------------+-----------+

***@instance-1:~$ sudo ls /var/lib/lxd/containers/guixsd/rootfs
bin boot dev etc gnu home lost+found mnt root run tmp var

***@instance-1:~$ sudo find /var/lib/lxd/containers/guixsd/rootfs -name sleep
/var/lib/lxd/containers/guixsd/rootfs/gnu/store/xniak294s1x03zssfj2xzvfkcny1gn0x-profile/bin/sleep
(other entries omitted)

I don't know how to see the output of /sbin/init. For now all
/sbin/init does is sleep.
We start shepherd manually in later steps.

=== /sbin/init ===
#!/gnu/store/xniak294s1x03zssfj2xzvfkcny1gn0x-profile/bin/bash
/gnu/store/xniak294s1x03zssfj2xzvfkcny1gn0x-profile/bin/sleep 99999
===
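
Since the output of /sbin/init is not visible from the host, one option is to have the script send its own output to a file inside the container. A sketch of that idea; the helper name and the log file are made up, and a real /sbin/init would need the bash store path from above as its interpreter instead of /bin/sh:

```shell
#!/bin/sh
# Hypothetical helper: from this point on, append stdout and stderr to a file.
log_to() {
    exec >>"$1" 2>&1
}

# Demo in a subshell with a scratch file, so the caller's output is unaffected.
logfile=$(mktemp)
(
    log_to "$logfile"
    echo "init: starting"
    echo "init: a warning" >&2
)
```

The file could then be read with lxc exec, or directly under the container's rootfs directory on the host.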

$ sudo mkdir /var/lib/lxd/containers/guixsd/rootfs/sbin
$ sudo cp init /var/lib/lxd/containers/guixsd/rootfs/sbin/init

$ lxc start guixsd
$ lxc list
+--------+---------+------+------+------------+-----------+
| NAME   | STATE   | IPV4 | IPV6 | TYPE       | SNAPSHOTS |
+--------+---------+------+------+------------+-----------+
| guixsd | RUNNING |      |      | PERSISTENT | 0         |
+--------+---------+------+------+------------+-----------+

Now that the container is in a running state I can attach bash as a
process to it:
***@instance-1:~$ lxc exec guixsd \
    /gnu/store/xniak294s1x03zssfj2xzvfkcny1gn0x-profile/bin/bash
bash-4.4#

'start' is a blueprint for /sbin/init.

=== /sbin/start ===
export HOME=/
export TERM=linux

export PATH=/gnu/store/crvb68g89b479n4h44r8l42hy39axhg2-shadow-4.4/sbin/
export GUIX_NEW_SYSTEM=/gnu/store/4pr317614r1ff1bi6vd1q0jjdca5h78s-system
cd $HOME
/gnu/store/zk41gmzbibvpx9dpsm5gs8p0liz8shy0-guile-2.0.14/bin/guile \
  --no-auto-compile $GUIX_NEW_SYSTEM/boot
===

Run start script via bash, so we can see stderr and stdout from host OS.

$ lxc exec guixsd -- \
    /gnu/store/xniak294s1x03zssfj2xzvfkcny1gn0x-profile/bin/bash -c \
    "/sbin/start 2>&1"

Error #1:

?: 2 [primitive-load "/gnu/store/ysvjgjb9ph1vg0m4y67lfrj06wc5gdx4-activate-service"]
In ice-9/boot-9.scm:
893: 1 [call-with-output-file "/sys/module/firmware_class/parameters/path" ...]
In unknown file:
?: 0 [open-file "/sys/module/firmware_class/parameters/path" "w" #:encoding #f]

$ sudo chmod +w /var/lib/lxd/containers/guixsd/rootfs/gnu/store/ysvjgjb9ph1vg0m4y67lfrj06wc5gdx4-activate-service

Comment out the failing expression with ;;
$ sudo emacs /var/lib/lxd/containers/guixsd/rootfs/gnu/store/ysvjgjb9ph1vg0m4y67lfrj06wc5gdx4-activate-service

Error #2:

?: 3 [primitive-load "/gnu/store/nz2wixyg218l9j56vb21w0whnvdrnmh5-activate-service"]
In ice-9/eval.scm:
432: 2 [eval # ()]
In ice-9/boot-9.scm:
893: 1 [call-with-output-file "/proc/sys/kernel/modprobe" ...]
In unknown file:
?: 0 [open-file "/proc/sys/kernel/modprobe" "w" #:encoding #f]


Commented out expression in
/gnu/store/nz2wixyg218l9j56vb21w0whnvdrnmh5-activate-service
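
Both backtraces come from opening a kernel parameter file for writing. Instead of hitting such files one at a time, they can be probed up front from a shell inside the container. A sketch; the function name is made up, and the list holds only the two paths from the errors above:

```shell
#!/bin/sh
# Hypothetical helper: report whether each given path is writable.
# [ -w ] only asks whether write access would be granted, so treat the
# result as a hint rather than a guarantee.
probe_writable() {
    for path in "$@"; do
        if [ -w "$path" ]; then
            echo "writable:     $path"
        else
            echo "not writable: $path"
        fi
    done
}

probe_writable /sys/module/firmware_class/parameters/path \
               /proc/sys/kernel/modprobe
```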

After working around errors #1 and #2, shepherd starts, but reports
services that didn't start.

The first service that reports an issue is file-systems.

Next I tried to start some services manually:

***@instance-1:~$ lxc exec guixsd \
    /gnu/store/xniak294s1x03zssfj2xzvfkcny1gn0x-profile/bin/bash
bash-4.4#

***@gnu ~# herd start guix-daemon
herd start guix-daemon herd: exception caught while executing 'start'
on service 'file-system-/gnu/store': ERROR:
In procedure mount: mount "/gnu/store" on "///gnu/store": Permission denied

This is how far I got.

I hope this gives some idea of what the image looks like and what I
tried in order to start it.


Cheers,
Eddy
