This repository has been archived by the owner on Feb 26, 2020. It is now read-only.

Add CONFIG_KALLSYMS_ALL check #6

Closed
jberzins opened this issue May 31, 2010 · 20 comments

@jberzins

xbmc@xbmc:~/spl-0.4.9$ uname -a
Linux xbmc 2.6.32-22-generic #33-Ubuntu SMP Wed Apr 28 13:28:05 UTC 2010 x86_64 GNU/Linux

Steps:

  1. ./configure --with-linux=/usr/src/linux-headers-2.6.32-22-generic/
  2. make
  3. sudo make install
  4. sudo modprobe splat
    FATAL: Could not load /lib/modules/2.6.32-22-generic/modules.dep: No such file or directory
  5. sudo depmod -a
  6. sudo modprobe splat
    FATAL: Error inserting splat (/lib/modules/2.6.32-22-generic/addon/spl/splat/splat.ko): Cannot assign requested address
  7. tail /var/log/messages
    Jun 1 00:09:51 xbmc kernel: [ 5482.698506] SPL: Failed user helper '/bin/sh -c awk '{ if ( $3 == "kallsyms_lookup_name") { print $1 } }' /proc/kallsyms >/proc/sys/kernel/spl/kallsyms_lookup_name', rc = 512
    Jun 1 00:09:51 xbmc kernel: [ 5482.698537] SPL: Failed to Load Solaris Porting Layer v0.4.9, rc = -99

The test system is 64-bit Ubuntu 10.04: minimal CD install + updates + build-essential + psmisc + linux-headers-$(uname -r)

Regards,
jb

@behlendorf
Contributor

Thanks for the bug report, jberzins. I have a hunch this was caused by 'awk' not being installed on your system due to the minimal install. Can you double-check that /usr/bin/awk exists? If it doesn't, please pull in the gawk package and try again.

@jberzins
Author

jberzins commented Jun 2, 2010

Hi, thanks for the reply.

awk does exist; here are the results of the script:

  1. xbmc@xbmc:~$ sh
    $ awk '{ if ( $3 == "kallsyms_lookup_name") { print $1 } }' /proc/kallsyms

    ffffffff810a2270

  2. $ awk '{ if ( $3 == "kallsyms_lookup_name") { print $1 } }' /proc/kallsyms >/proc/sys/kernel/spl/kallsyms_lookup_name
    sh: cannot create /proc/sys/kernel/spl/kallsyms_lookup_name: Directory nonexistent

  3. $ sudo awk '{ if ( $3 == "kallsyms_lookup_name") { print $1 } }' /proc/kallsyms >/proc/sys/kernel/spl/kallsyms_lookup_name
    sh: cannot create /proc/sys/kernel/spl/kallsyms_lookup_name: Directory nonexistent

Not sure about the internals, but it seems the spl directory does not exist when the helper script is executed.

If this does not make sense, I can install full Ubuntu and try again. I do not have much experience with modules, but I can follow instructions.

@behlendorf
Contributor

Interesting. The way this is designed to work is that the missing directory is created when the spl module loads. Once the module knows the directory exists, it calls out to user space to extract the address from kallsyms and pass it back to the kernel through the proc file. In your case this fails for some reason, so the module stops loading and removes the directory, which is why you don't see it.

It would be good if you could verify that the file exists in the right place during the module load. You might be able to catch it if you're watching /proc/ carefully when the module loads. Better yet, you can modify the following lines in the spl source to call a helper script which calls awk, and then add some debugging to the helper script to see what the issue is. Simply change the following macro like this, recompile the spl, and add the helper script; let's call it /tmp/spl-awk.sh. Then try to load the module and look at the contents of /tmp/spl-awk.log to see what went wrong.

spl-generic.c:336, original macro:
#define GET_KALLSYMS_ADDR_CMD                                           \
        "awk '{ if ( $3 == \"kallsyms_lookup_name\") { print $1 } }' "  \
        "/proc/kallsyms >/proc/sys/kernel/spl/kallsyms_lookup_name"

Replace it with:
#define GET_KALLSYMS_ADDR_CMD "/tmp/spl-awk.sh"

/tmp/spl-awk.sh:
#! /bin/sh
ls -l /proc/sys/kernel/spl/ >/tmp/spl-awk.log
awk '{ if ( $3 == "kallsyms_lookup_name") { print $1 } }' /proc/kallsyms >/proc/sys/kernel/spl/kallsyms_lookup_name
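
A slightly more verbose variant of the helper script above might also record awk's stderr and exit status. This is only a sketch: the function name write_kallsyms_addr and its three path arguments are hypothetical and not part of the spl source. It assumes the helper may be started without a usable stderr, so it points stderr at the log file, which could itself change the behaviour being debugged.

```shell
#!/bin/sh
# Hypothetical debugging helper; the function name and arguments are
# illustrative only and do not exist in the spl source.
# usage: write_kallsyms_addr SYMS_FILE OUT_FILE LOG_FILE
write_kallsyms_addr() {
    syms=$1
    out=$2
    log=$3
    {
        echo "target directory listing:"
        ls -l "$(dirname "$out")"
        # awk's stdout goes to OUT_FILE; its stderr goes to the log
        # via the redirection on the enclosing group.
        awk '{ if ( $3 == "kallsyms_lookup_name") { print $1 } }' "$syms" >"$out"
        echo "awk exit status: $?"
    } >>"$log" 2>&1
}
```

Saved as /tmp/spl-awk.sh, the script would end with a call such as `write_kallsyms_addr /proc/kallsyms /proc/sys/kernel/spl/kallsyms_lookup_name /tmp/spl-awk.log`.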

@jberzins
Author

jberzins commented Jun 4, 2010

Thanks for the advice. Unfortunately, the results are irrational.

Does NOT work:
#! /bin/sh
awk '{ if ( $3 == "kallsyms_lookup_name") { print $1 } }' /proc/kallsyms >/proc/sys/kernel/spl/kallsyms_lookup_name

Works:
#! /bin/sh
awk '{ if ( $3 == "kallsyms_lookup_name") { print $1 } }' /proc/kallsyms >/tmp/spl-awk.tmp
cat /tmp/spl-awk.tmp >/proc/sys/kernel/spl/kallsyms_lookup_name

All tests passed:
splat -a

Cheers,
jb

@behlendorf
Contributor

I hate bugs like this. OK, thanks for all your debugging. Let's leave the bug open for now until I can set up a similar system. Perhaps there's a bug in the proc handler or some such.

@chexum

chexum commented Jun 6, 2010

It sounds like a stdio buffering difference.
@jberzins - can you please show what "strace -e write ..." reports for your awk command line (when writing to proc)?

@jberzins
Author

jberzins commented Jun 7, 2010

I'm using VMware Server; maybe it contributes to the issue.
Here you go.

strace -e write -o /tmp/spl-strace.log
spl-strace.log:
write(1, "ffffffff810a2270\n", 17) = 17

I'm not familiar with the tool, so here is a more desperate strace:
strace -e write=1,2,3,4,5,6,7,8,9 -o /tmp/spl-strace.log
spl-strace.log:
execve("/usr/bin/awk", ["awk", "{ if ( $3 == \"kallsyms_lookup_na"..., "/proc/kallsyms"], [/* 4 vars */]) = 0
brk(0) = 0x24d1000
access("/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No such file or directory)
mmap(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f915ff28000
access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory)
open("/etc/ld.so.cache", O_RDONLY) = 0
fstat(0, {st_mode=S_IFREG|0644, st_size=14452, ...}) = 0
mmap(NULL, 14452, PROT_READ, MAP_PRIVATE, 0, 0) = 0x7f915ff24000
close(0) = 0
access("/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No such file or directory)
open("/lib/libm.so.6", O_RDONLY) = 0
read(0, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\360>\0\0\0\0\0\0"..., 832) = 832
fstat(0, {st_mode=S_IFREG|0644, st_size=534832, ...}) = 0
mmap(NULL, 2629864, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 0, 0) = 0x7f915fa87000
mprotect(0x7f915fb09000, 2093056, PROT_NONE) = 0
mmap(0x7f915fd08000, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 0, 0x81000) = 0x7f915fd08000
close(0) = 0
access("/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No such file or directory)
open("/lib/libc.so.6", O_RDONLY) = 0
read(0, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0`\355\1\0\0\0\0\0"..., 832) = 832
fstat(0, {st_mode=S_IFREG|0755, st_size=1568136, ...}) = 0
mmap(NULL, 3676200, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 0, 0) = 0x7f915f705000
mprotect(0x7f915f87d000, 2097152, PROT_NONE) = 0
mmap(0x7f915fa7d000, 20480, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 0, 0x178000) = 0x7f915fa7d000
mmap(0x7f915fa82000, 18472, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x7f915fa82000
close(0) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f915ff23000
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f915ff22000
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f915ff21000
arch_prctl(ARCH_SET_FS, 0x7f915ff22700) = 0
mprotect(0x7f915fa7d000, 16384, PROT_READ) = 0
mprotect(0x7f915fd08000, 4096, PROT_READ) = 0
mprotect(0x61a000, 4096, PROT_READ) = 0
mprotect(0x7f915ff2a000, 4096, PROT_READ) = 0
munmap(0x7f915ff24000, 14452) = 0
brk(0) = 0x24d1000
brk(0x24f2000) = 0x24f2000
open("/proc/kallsyms", O_RDONLY) = 0
ioctl(0, SNDCTL_TMR_TIMEBASE or TCGETS, 0x7fffe19cb600) = -1 ENOTTY (Inappropriate ioctl for device)
read(0, "0000000000000000 D per_cpu__irq_"..., 4096) = 4080

/*content listing skipped*/

read(0, "rt_spi]\nffffffffa0004820 d dev_a"..., 4048) = 4048
read(0, "9e0 r .LC5\t[scsi_transport_spi]\n"..., 4083) = 688
read(0, "", 3395)                       = 0
write(1, "ffffffff810a2270\n", 17)      = 17
 | 00000  66 66 66 66 66 66 66 66  38 31 30 61 32 32 37 30  ffffffff 810a2270 |
 | 00010  0a                                                .                 |
close(1)                                = 0
munmap(0x7f915ff27000, 4096)            = 0
close(2)                                = -1 EBADF (Bad file descriptor)
exit_group(2)    

jb

@behlendorf
Contributor

Well, that's interesting. Everything worked fine writing to the spl proc file; the EBADF error occurred while closing file descriptor 2. Since I never see this file descriptor being opened in the log, the error makes sense, but it seems like this is an issue with awk and not the spl. If the spl were to (wrongly) ignore the error, everything should be fine.
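
The "rc = 512" in the original log lines up with the exit_group(2) at the end of the trace if rc is read as a wait(2)-style status word, where the exit code sits in the second byte. A minimal sketch of that arithmetic, assuming that encoding (the function name decode_helper_rc is hypothetical):

```shell
#!/bin/sh
# Hypothetical helper: interpret rc as a wait(2)-style status word.
# Bits 8-15 hold the exit code, the low 7 bits hold a terminating signal.
decode_helper_rc() {
    rc=$1
    echo "exit=$(( ($rc >> 8) & 255 )) signal=$(( $rc & 127 ))"
}

decode_helper_rc 512   # prints "exit=2 signal=0"
```

Under that reading, the helper ran to completion and awk itself exited with status 2.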

@nergdron

nergdron commented Jun 7, 2010

Since it seems to be something with awk, I thought I'd see how I could rewrite it. Here's my diff, which compiles and works fine on my system that was experiencing this problem. splat -a passes all tests as well:

-       "awk '{ if ( $3 == \"kallsyms_lookup_name\") { print $1 } }' "  \
-       "/proc/kallsyms >/proc/sys/kernel/spl/kallsyms_lookup_name"
+       "grep ' kallsyms_lookup_name' /proc/kallsyms | cut -d ' ' -f 1 " \
+       "> /proc/sys/kernel/spl/kallsyms_lookup_name"
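
One difference between the two commands in the diff above is that awk compares the whole third field, while the grep pattern matches any symbol whose name begins with kallsyms_lookup_name. A rough comparison on sample data follows; the sample file is illustrative, the symbol kallsyms_lookup_name_example is made up, and the anchored variant is only a suggestion, not part of the diff.

```shell
#!/bin/sh
# Sample lines in /proc/kallsyms format. The second symbol is invented
# to show what a prefix match would pick up.
sample=$(mktemp)
cat >"$sample" <<'EOF'
ffffffff810a2270 T kallsyms_lookup_name
ffffffff810a2300 T kallsyms_lookup_name_example
ffffffff810a2400 T kallsyms_lookup_size_offset
EOF

# Exact match on the third field (the original command).
by_awk() { awk '{ if ( $3 == "kallsyms_lookup_name") { print $1 } }' "$1"; }

# Prefix match (the command from the diff).
by_grep() { grep ' kallsyms_lookup_name' "$1" | cut -d ' ' -f 1; }

# Suggested variant: anchor the pattern at end of line.
by_grep_anchored() { grep ' kallsyms_lookup_name$' "$1" | cut -d ' ' -f 1; }

by_awk "$sample"
by_grep "$sample"
by_grep_anchored "$sample"
```

On this sample, by_awk and by_grep_anchored each print one address, while by_grep prints two. Whether a real kernel exposes such a second symbol is not established in this thread.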

@behlendorf
Contributor

This turns out to be an issue with awk. It does not behave properly when called from the kernel usermode helper; this may have something to do with how awk handles its argument parsing. The reason I never saw this is that I was using gawk on all the platforms I tested, and gawk does not suffer from this issue. I've updated the ./configure script to check explicitly for gawk and make it a mandatory requirement (1814251). This seems like a pretty reasonable requirement, so I'm closing the bug as fixed.

@ghost

ghost commented Feb 9, 2011

Hi,
I am using 2.6.37-gentoo with gawk 3.1.6 and am seeing the same error in dmesg:

SPL: Failed user helper '/bin/sh -c gawk '{ if ( $3 == "kallsyms_lookup_name") { print $1 } }' /proc/kallsyms >/proc/sys/kernel/spl/kallsyms_lookup_name', rc = 512
SPL: Failed to Load Solaris Porting Layer v0.6.0, rc = -99

I tried gawk 3.1.8 too, with the same results.

Should I use another version of gawk? Is Gentoo using some bizarre gawk version?

Thanks

@behlendorf
Contributor

You can try an earlier gawk version; 3.1.7 works fine for me. But before trying that, please try this simple experiment. Run the following gawk command and see if it returns something reasonable. You should see something like ffffffff81157680, which is the memory address of the kallsyms_lookup_name function for your kernel. If you don't, we'll have to investigate whether that symbol is missing on your system or is just being misreported somehow.

gawk '{ if ( $3 == "kallsyms_lookup_name") { print $1 } }' /proc/kallsyms
ffffffff81157680

@ghost

ghost commented Feb 9, 2011

Hi, thanks for the reply.

Apparently I don't have /proc/kallsyms:

gawk: cmd. line:1: fatal: cannot open file `/proc/kallsyms' for reading (No such file or directory)

So maybe the problem is the kernel version?

Many thanks!

@behlendorf
Contributor

That is absolutely the problem. Make sure your kernel is built with CONFIG_KALLSYMS=y and CONFIG_KALLSYMS_ALL=y.

@ghost

ghost commented Feb 10, 2011

It works with CONFIG_KALLSYMS, but I didn't have an entry for CONFIG_KALLSYMS_ALL in my .config. Should I still add it?
Now zfs has a problem compiling, but that is another issue (it did compile with a previous version).

Many thanks

@behlendorf
Contributor

CONFIG_KALLSYMS_ALL isn't strictly required; most distros just happen to define both in their default kernels. I wouldn't worry about it. Thanks for letting me know it worked!

@user318

user318 commented Dec 7, 2011

> That is absolutely the problem. Make sure your kernel is built with CONFIG_KALLSYMS=y and CONFIG_KALLSYMS_ALL=y.

If there are checks for other kernel config flags during configure, then maybe add a check for this option too?

@behlendorf
Contributor

In fact there are checks for other CONFIG_* options and checking for this at build time is a good idea. I'll reopen the issue until this check gets added.
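
As a rough idea of what such a check could look at, the sketch below tests whether an option is set to y in a kernel config file. It is not the check that was added to the spl configure script; the function name kconfig_enabled is hypothetical, and the location of the config file is an assumption that varies by distribution.

```shell
#!/bin/sh
# Hypothetical helper, not taken from the spl configure script.
# usage: kconfig_enabled OPTION CONFIG_FILE
# Succeeds only if the file contains a line that is exactly "OPTION=y".
kconfig_enabled() {
    grep -q "^$1=y\$" "$2"
}

# Example (the path is an assumption; some systems keep the config as
# .config in the kernel build directory instead):
#   kconfig_enabled CONFIG_KALLSYMS "/boot/config-$(uname -r)" \
#       || echo "CONFIG_KALLSYMS is not enabled"
```

Matching the full line keeps CONFIG_KALLSYMS from being satisfied by CONFIG_KALLSYMS_ALL=y alone, and skips "# CONFIG_KALLSYMS is not set" comment lines.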

@behlendorf behlendorf reopened this Dec 7, 2011
@egon010

egon010 commented Aug 17, 2012

I'm having the same problem. As far as I can tell, I have gawk and kallsyms (the gawk command works), the helper command line works, and the modules load after the boot process switches to an ext4 rootFS.

@behlendorf
Contributor

If you're using zfs as your root filesystem, are you sure you have gawk in your initramfs?
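
One way to answer that question is to search a listing of the initramfs contents for a gawk binary. The sketch below only filters a listing supplied on stdin; the function name listing_has_gawk is hypothetical, and the tool that produces the listing depends on the distribution (lsinitramfs from initramfs-tools on Debian and Ubuntu is assumed in the example).

```shell
#!/bin/sh
# Hypothetical helper: reads a file listing on stdin, one path per line,
# and succeeds if some entry is named exactly "gawk".
listing_has_gawk() {
    grep -Eq '(^|/)gawk$'
}

# Example (assumes initramfs-tools and its default image path):
#   lsinitramfs "/boot/initrd.img-$(uname -r)" | listing_has_gawk \
#       || echo "gawk is missing from the initramfs"
```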
