ixpdimm_cli: encounter "segmentation fault" while creating namespace in app direct mode of Apache_Pass(intel) #11
Comments
Might be the same issue as #10. To confirm, could you run the command with the '-o verbose' option:
ixpdimm-cli create -o verbose -namespace -pool 51428255-2a16-6258-b876-f9a2d33cb55b persistentmemorytype=appdirect
Also check for the sysfs file mentioned in that issue.
@juston-li It reports "label-less". What should I do then? As for the sysfs file, I am not clear what to check. Can you be more specific?
What kernel are you running? Linux automatically initializes namespaces without labels, which is why namespaces already show up after creating a pool. You will need an updated/patched kernel to support creating labeled namespaces with our tool.
I have confirmed that the kernel is rather old, thanks a lot. Next I will try to patch the kernel.
My kernel is [output not preserved]. Where can I get the patch rpm package so that I can install it myself? My friend who is familiar with the kernel is quite busy right now, so I have to patch it myself. Thanks.
@juston-li could you please help me with the above questions?
I'm afraid I can't help much with a patch rpm. I just compile and install the kernel from source. Maybe check with your distro whether there are updated kernels. I think the patches got backported to CentOS 7.4, but I will need to double check.
I have installed kernel 4.15.1, which is a stable version, on CentOS 7.4. However, creating a namespace still results in "segmentation fault (core dumped)". Additionally, is it true that kernels newer than 4.10 will not have /dev/pmem upon creation of a pool? This time /dev/pmem exists upon creation of the pool, and I don't know if that is correct. Here is the log:
[root@localhost ~]# ixpdimm-cli create -o verbose -namespace -pool 51428255-2a16-6258-b876-f9a2d33cb55b persistentmemorytype=appdirect
[root@localhost ~]# uname -a
@juston-li any idea?
For newer kernels, the label-less namespaces will still appear if labels have not been initialized. ixpdimm should be able to handle that case and disable label-less namespaces when creating namespaces. One other thing to try is creating a goal again. From 2399+, create goal will initialize labels and only use labeled namespaces, so there won't be label-less namespaces.
I tried to recreate a goal, but the problem still appears. When I execute coredumpctl info, it shows no coredumps found. Below is the dmesg log:
It seems that something is wrong with libndctl.so.6.7.0. Do you have any idea?
Then I installed kernel version 4.14.17. Surprisingly, no namespace appears after creating the goal, which implies that it is labeled. However, creating a namespace takes a lot of time, and deleting the namespace sometimes causes a soft lockup. By the way, the rpm I used was compiled under EulerOS. Do I have to rebuild it on CentOS?
You should be able to get a backtrace of that fault if you install the debug symbols and backtrace the core dump. However, since you report it fixed with the latest kernel, I suspect it is ixpdimm-cli not checking for missing functionality on an older kernel, which causes it to trigger a fault in libndctl. The soft lockup is benign, but it needs to be fixed up similar to how the init softlockups were addressed. Can you post the softlockup report from the namespace teardown?
@djbw Thanks for your reply. How do I get the softlockup report? Additionally, after the soft lockup, I found that /dev/pmem doesn't exist anymore, yet the "ixpdimm-cli show -namespace" command can still show the namespace. Then I ran "ixpdimm-cli delete -namespace" again, and the namespace was deleted successfully. So it is functional, but there might be some problems with it.
@insanecoderr just paste the output of dmesg after the softlockup triggers.
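For readers who would rather post only the relevant part of the kernel log, the request above could be narrowed with a small filter. This is a minimal sketch, not something from the thread: the function name softlockup_report and the 40 lines of trailing context are assumptions, and it assumes the watchdog message contains the words "soft lockup".

```shell
# Hypothetical helper (name and context size are assumptions):
# print every line mentioning a soft lockup plus the 40 lines that follow,
# which normally covers the register dump and call trace.
# Reads the file(s) given as arguments, or stdin when none are given.
softlockup_report() {
    grep -i -A 40 'soft lockup' "$@"
}

# Typical use on the affected machine (may need root):
#   dmesg | softlockup_report
```

If the filter prints nothing, the full dmesg output is still the safer thing to paste.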
Message from syslogd@localhost at Feb 6 11:39:13 ...
Env info:
The detailed dmesg log after the soft lockup:
Moreover, when creating a namespace, it gets stuck here for about 20 minutes. The namespace is created eventually, but it takes too much time:
I'm still looking for the softlockup backtrace. Those messages you posted are not from the kernel. Is the behavior any different with:
@djbw The above dmesg is not from the kernel? I pasted the dmesg output here when the soft lockup happened. I posted two comments, so maybe you skipped the first one?
@insanecoderr oops, sorry I missed it. Thanks, that's what I need.
@djbw Do you have any idea about the soft lockup?
Yes, it is benign; you can ignore it for now. I have posted a fix for it here:
excellent
The cond_resched() currently in the setup path needs to be duplicated in the teardown path. Rather than require each instance of for_each_device_pfn() to open code the same sequence, embed it in the helper.

Link: intel/ixpdimm_sw#11
Cc: "Jérôme Glisse" <jglisse@redhat.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: <stable@vger.kernel.org>
Fixes: 7138970 ("mm, zone_device: Replace {get, put}_zone_device_page()...")
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Fix released in the v4.16-rc4 kernel:
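A quick way to see whether a machine is likely to carry this fix is to compare the running kernel version against 4.16. The following is a rough sketch under stated assumptions: the helper name version_ge is made up for illustration, it relies on GNU sort's -V option, and it ignores -rc suffixes, so a 4.16-rc1 to -rc3 kernel would be reported as new enough even though the fix landed in -rc4. Older kernels may also carry the fix as a stable or distro backport, which this check cannot detect.

```shell
# Hypothetical helper: succeeds when version $1 is >= version $2.
# Uses GNU sort -V (version sort); the smaller version sorts first.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Drop the distro/rc suffix, e.g. "4.15.1-1.el7.x86_64" -> "4.15.1"
running=$(uname -r | cut -d- -f1)

if version_ge "$running" 4.16; then
    echo "kernel $running: at or past v4.16"
else
    echo "kernel $running: older than v4.16, fix present only if backported"
fi
```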
The cond_resched() currently in the setup path needs to be duplicated in the teardown path. Rather than require each instance of for_each_device_pfn() to open code the same sequence, embed it in the helper.

Orabug: 27663570
Link: intel/ixpdimm_sw#11
Cc: "Jérôme Glisse" <jglisse@redhat.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: <stable@vger.kernel.org>
Fixes: 7138970 ("mm, zone_device: Replace {get, put}_zone_device_page()...")
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
(cherry picked from commit 949b932)
Signed-off-by: Jane Chu <jane.chu@oracle.com>
Reviewed-by: Larry Bassel <larry.bassel@oracle.com>
Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
The cond_resched() currently in the setup path needs to be duplicated in the teardown path. Rather than require each instance of for_each_device_pfn() to open code the same sequence, embed it in the helper.

Link: intel/ixpdimm_sw#11
Cc: "Jérôme Glisse" <jglisse@redhat.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: <stable@vger.kernel.org>
Fixes: 7138970 ("mm, zone_device: Replace {get, put}_zone_device_page()...")
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: atndko <z1281552865@gmail.com>
After successfully compiling the master branch of ixpdimm-sw, I installed the rpms on EulerOS (basically the same as CentOS 7) once the dependencies were resolved.
Creating the app direct pool succeeds (2-1-1 population on a single CPU). However, when I try to create an appdirect namespace, a "segmentation fault" occurs. Curiously, I found that /dev/pmem2 already exists as soon as the app direct pool is created.
[root@localhost ~]# ixpdimm-cli show -pool
PoolID PersistentMemoryType Capacity FreeCapacity
51428255-2a16-6258-b876-f9a2d33cb55b AppDirect 252.0 GiB 252.0 GiB
[root@localhost ~]# ixpdimm-cli create -namespace -pool 51428255-2a16-6258-b876-f9a2d33cb55b persistentmemorytype=appdirect
Segmentation fault
[root@localhost ~]# ixpdimm-cli show -namespace
No results
[root@localhost ~]# ls /dev/pmem*
/dev/pmem2
[root@localhost ~]#
@juston-li