
Faulted after upgrade on linux - files intact on BSD ZFS #984

Closed
MiceMiceRabies opened this issue Sep 23, 2012 · 12 comments
Labels: Type: Documentation

Comments

@MiceMiceRabies

I just upgraded ZFS on Linux, and after a reboot my pool is faulted with "too many errors" on all disks.
I am running Debian squeeze with kernel 3.2.0. I'm not too sure where to go from here... I can supply any info needed to help track this bug.

@dechamps
Contributor

Start with zpool status?

@MiceMiceRabies
Author

I tried an export first, thinking that re-importing might correct the error. I was wrong. Also, zpool import -fD tank does not import; here's the output, which is the same as what status shows.

bill@workbox:~$ sudo zpool import
pool: tank
id: 13503037218073075533
state: UNAVAIL
status: One or more devices are faulted.
action: The pool cannot be imported due to damaged devices or data.
config:

tank                                               UNAVAIL  insufficient replicas
  raidz1-0                                         UNAVAIL  insufficient replicas
    scsi-SATA_SAMSUNG_HD103SJS246J90Z150704-part1  FAULTED  too many errors
    scsi-SATA_SAMSUNG_HD103SJS2AEJ1BZ406106-part1  FAULTED  too many errors
    scsi-SATA_SAMSUNG_HD103SJS2AEJ1BZ406107-part1  FAULTED  too many errors
    scsi-SATA_SAMSUNG_HD204UIS2H7J1BZC11220-part1  FAULTED  too many errors
    scsi-SATA_SAMSUNG_HD204UIS2H7J1BZC11225-part1  FAULTED  too many errors

@cwedgwood
Contributor

@MiceMiceRabies are all the devices available at the time /etc/init.d/zfs runs?

I have a system here where the SAS controller takes maybe 20 seconds to find all the disks, so it typically faults on import.
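
If that's what is happening here, a manual workaround is roughly the following (the grep pattern is just illustrative):

udevadm settle                             # wait for udev to finish creating the device links
ls /dev/disk/by-id/ | grep -i samsung      # confirm all pool members are actually visible
zpool import -d /dev/disk/by-id tank       # then retry the import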

@MiceMiceRabies
Author

@cwedgwood - I just tried restarting via init.d with no luck.

I did boot into the latest NAS4FREE and was able to import successfully with:
# zpool import -f tank
All files are still intact, so I know my disks are not actually faulted.

Another interesting thing to note: during my last upgrade I noticed that the modules did not get built for the other kernels on the system.

@ryao
Contributor

ryao commented Sep 24, 2012

Have you tried importing the pool after exporting it from FreeBSD?
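
Roughly, assuming the pool is currently imported on the NAS4FREE side:

zpool export tank                      # on BSD: release the pool cleanly
zpool import -d /dev/disk/by-id tank   # back on Linux: search by-id and import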

@MiceMiceRabies
Author

@ryao, yes, that was the next step I tried; I still get the same output on the Linux side: state UNAVAIL, and all drives faulted.

@Ukko-Ylijumala

ZoL seems to be trying to import the ZFS pool from partitions. Are the disks in GPT format? Did you initially make the pool on whole disks or on partitions?

CONFIG_EFI_PARTITION=y enabled in kernel config?

Do issues #94, #489 or #955 have relevance for this? The most likely candidate for your problem might be the last one (AF disk detection). Which ZoL version are you running? Is your /etc/zfs/zpool.cache stale?
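
A few quick things to check (paths and commands assume a Debian-style setup):

grep CONFIG_EFI_PARTITION /boot/config-$(uname -r)   # GPT/EFI partition support in the running kernel
dpkg -l | grep zfs                                   # which ZoL packages/version are actually installed
ls -l /etc/zfs/zpool.cache                           # a stale cache can point the import at the wrong devices
dmesg | grep -iE 'zfs|spl'                           # confirm the modules loaded for this kernel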

@MiceMiceRabies
Author

The disks are GPT. The layout is three 1 TB drives and two 2 TB drives. The 2 TB drives are partitioned in half, so I have a total of five devices in my pool; the leftover partitions are used for LVG.

CONFIG_EFI_PARTITION=y is enabled in the kernel config.

As for ZoL - Version: 0.6.0.80-0ubuntu2~lucid1

/etc/zfs/zpool.cache doesn't exist; the only file in /etc/zfs/ is zdev.conf.

I think this looks more like issue #94 than the others because of the 1049kB offset. As for the others, I am not too sure, as I am very new to ZFS.

Here's my drive layout:

For the 2TB drives (quantity 2)

Model: ATA SAMSUNG HD204UI (scsi)
Disk /dev/sd[b-c]: 2000GB
Sector size (logical/physical): 512B/512B
Partition Table: gpt
Number  Start   End     Size    File system  Name              Flags
 1      1049kB  1000GB  1000GB  zfs          zfs
 9      1000GB  1000GB  8389kB
 2      1000GB  2000GB  1000GB  zfs          Linux filesystem

And the 1TB drives (quantity 3)

Model: ATA SAMSUNG HD103SJ (scsi)
Disk /dev/sd[d-f]: 1000GB
Sector size (logical/physical): 512B/512B
Partition Table: gpt
Number  Start   End     Size    File system  Name  Flags
 1      1049kB  1000GB  1000GB  zfs          zfs
 9      1000GB  1000GB  8389kB
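
In case it's useful, the ZFS labels on partition 1 of each disk can be dumped directly to confirm they still look like valid members of the pool (device name below is just an example):

sudo zdb -l /dev/sdd1    # prints the vdev labels: pool name, GUIDs, and the cached config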

@dechamps
Contributor

I think this looks more like issue #94 than the others because of the 1049kB offset.

No. I'll just repeat what I said in #94:

Just FYI, parted doesn't have any unit calculation issues; it just defaults to SI units (kB) rather than IEC units (kiB):

(parted) print                                                            
Model: ATA Hitachi HDP72502 (scsi)
 Disk /dev/sdj: 250GB
 Sector size (logical/physical): 512B/512B
 Partition Table: gpt

Number  Start   End    Size    File system  Name  Flags
 1      1049kB  250GB  250GB   zfs          zfs
 9      250GB   250GB  8389kB

(parted) unit kib                                                         
(parted) print                                                            
Model: ATA Hitachi HDP72502 (scsi)
 Disk /dev/sdj: 244198584kiB
 Sector size (logical/physical): 512B/512B
 Partition Table: gpt

Number  Start         End           Size          File system  Name  Flags
 1      1024kiB       244190208kiB  244189184kiB  zfs          zfs
 9      244190208kiB  244198400kiB  8192kiB

Sure, choosing kB instead of kiB as the default unit is a surprising decision from the parted folks, but that's not the issue here.

There is no "1049kB offset" issue. 1049kB is 1024kiB. Everything is fine.
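
To spell out the arithmetic: 1024 kiB = 1024 × 1024 B = 1,048,576 B, and 1,048,576 B ÷ 1000 = 1048.576 kB, which parted rounds and prints as 1049 kB.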

@MiceMiceRabies
Author

So then it's something else I'm missing. Good to know it's not #94.

@MiceMiceRabies
Author

Fixed!
I had done my upgrade while running a Xen kernel, which caused the issue.

To fix it, I purged the ubuntu-zfs package, rebooted into a non-Xen kernel (3.2 bpo), and reinstalled the ubuntu-zfs package.
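
In commands, that was roughly (assuming the zfs-native PPA / DKMS packaging):

sudo apt-get purge ubuntu-zfs        # remove the packages whose modules were built under the Xen kernel
# reboot into the non-Xen 3.2 backports kernel, then:
sudo apt-get install ubuntu-zfs      # DKMS rebuilds the modules against the running kernel
sudo zpool import tank               # in case the init script doesn't import it automatically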

Now all my disks appear online again with no issues:

pool: tank
state: ONLINE
scan: none requested
config:

NAME                                               STATE     READ WRITE CKSUM
tank                                               ONLINE       0     0     0
  raidz1-0                                         ONLINE       0     0     0
    scsi-SATA_SAMSUNG_HD103SJS246J90Z150704-part1  ONLINE       0     0     0
    scsi-SATA_SAMSUNG_HD103SJS2AEJ1BZ406106-part1  ONLINE       0     0     0
    scsi-SATA_SAMSUNG_HD103SJS2AEJ1BZ406107-part1  ONLINE       0     0     0
    scsi-SATA_SAMSUNG_HD204UIS2H7J1BZC11220-part1  ONLINE       0     0     0
    scsi-SATA_SAMSUNG_HD204UIS2H7J1BZC11225-part1  ONLINE       0     0     0

errors: No known data errors

@behlendorf
Contributor

Great, I'm glad you got it sorted out.
