Creation of ceph_test_rados_io_sequence application #1

Conversation

JonBailey1993
Owner

The application creates IO sequences using seeded random numbers. The sequences vary in IO offset, length, order and data pattern in order to stress erasure coding specifically and to detect defects earlier in the development cycle.
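
To illustrate why seeding matters, here is a minimal Python sketch (not the tool's C++ code) of a generator whose offsets, lengths and operation order depend only on the seed, so a failing sequence can be replayed exactly. The names `IoOp` and `generate_sequence` and the block-based units are assumptions made for this example.

```python
import random
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class IoOp:
    kind: str    # "write" or "read"
    offset: int  # start position, in blocks
    length: int  # number of blocks


def generate_sequence(seed: int, object_blocks: int, count: int) -> List[IoOp]:
    """Return `count` ops whose offset, length and order depend only on `seed`."""
    rng = random.Random(seed)  # private RNG, so global random state is untouched
    ops = []
    for _ in range(count):
        offset = rng.randrange(object_blocks)
        # Keep the op inside the object: at least 1 block, at most up to the end.
        length = rng.randint(1, object_blocks - offset)
        kind = rng.choice(["write", "read"])
        ops.append(IoOp(kind, offset, length))
    return ops


first = generate_sequence(seed=42, object_blocks=16, count=5)
again = generate_sequence(seed=42, object_blocks=16, count=5)
```

Running the generator twice with the same seed yields the same list of operations, which is what makes a reported failure reproducible from the seed alone.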

Contribution Guidelines

  • To sign and title your commits, please refer to Submitting Patches to Ceph.

  • If you are submitting a fix for a stable branch (e.g. "quincy"), please refer to Submitting Patches to Ceph - Backports for the proper workflow.

  • When filling out the below checklist, you may click boxes directly in the GitHub web UI. When entering or editing the entire PR message in the GitHub web UI editor, you may also select a checklist item by adding an x between the brackets: [x]. Spaces and capitalization matter when checking off items this way.

Checklist

  • Tracker (select at least one)
    • References tracker ticket
    • Very recent bug; references commit where it was introduced
    • New feature (ticket optional)
    • Doc update (no ticket needed)
    • Code cleanup (no ticket needed)
  • Component impact
    • Affects Dashboard, opened tracker ticket
    • Affects Orchestrator, opened tracker ticket
    • No impact that needs to be tracked
  • Documentation (select at least one)
    • Updates relevant documentation
    • No doc update is appropriate
  • Tests (select at least one)
Show available Jenkins commands
  • jenkins retest this please
  • jenkins test classic perf
  • jenkins test crimson perf
  • jenkins test signed
  • jenkins test make check
  • jenkins test make check arm64
  • jenkins test submodules
  • jenkins test dashboard
  • jenkins test dashboard cephadm
  • jenkins test api
  • jenkins test docs
  • jenkins render docs
  • jenkins test ceph-volume all
  • jenkins test ceph-volume tox
  • jenkins test windows
  • jenkins test rook e2e

mchangir and others added 30 commits March 30, 2023 21:17
…m selection operator

Document about the '$' operator which is the random selection operator.

Fixes: https://tracker.ceph.com/issues/55198
Signed-off-by: Milind Changire <mchangir@redhat.com>
if relevant config settings result in no output.

Signed-off-by: Igor Fedotov <igor.fedotov@croit.io>
Signed-off-by: Venky Shankar <vshankar@redhat.com>
Signed-off-by: Venky Shankar <vshankar@redhat.com>
LogSegment::try_to_expire() batches backtrace updates for inodes in the
dirty_parent_inodes list. If a backtrace update operation fails for
one inode because its data pool is missing (removed), the failure is
specially handled by treating the operation as a success. However, the
errno (-ENOENT) is still stored by the gather context and passed on as
the return value to subsequent operations (even for successful backtrace
update operations in the same gather context).
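
The failure mode can be sketched in a few lines of Python. This is only an illustration of how one stored errno can leak into the shared result, not the MDS code and not the fix itself; `GatherContext`, `complete_sub` and `finish_backtrace_update` are hypothetical names.

```python
import errno


class GatherContext:
    """Collects sub-operation results and keeps the first error seen."""

    def __init__(self):
        self.result = 0

    def complete_sub(self, r: int) -> None:
        if r < 0 and self.result == 0:
            self.result = r


def finish_backtrace_update(r: int, pool_exists: bool) -> int:
    """Treat -ENOENT on a removed data pool as success before it is recorded."""
    if r == -errno.ENOENT and not pool_exists:
        return 0
    return r


# One update hits a removed pool, the other succeeds.
results = [(-errno.ENOENT, False), (0, True)]

leaky = GatherContext()
for r, pool_exists in results:
    leaky.complete_sub(r)  # raw errno is recorded, so the whole gather fails

filtered = GatherContext()
for r, pool_exists in results:
    filtered.complete_sub(finish_backtrace_update(r, pool_exists))
```

In the first loop the shared result stays at -ENOENT even though the second update succeeded; in the second loop the expected error is filtered out before the gather sees it.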

Fixes: http://tracker.ceph.com/issues/63259
Signed-off-by: Venky Shankar <vshankar@redhat.com>
…a pool

Signed-off-by: Venky Shankar <vshankar@redhat.com>
Implementing necessary changes for the NFS module to align with the new export block format introduced in nfs-ganesha-V5.6.
The purpose of these changes is to enhance memory efficiency for exports. To achieve this goal, we have introduced a new field
called cmount_path under the FSAL block of export. Initially, this is applicable only to CephFS-based exports.

Furthermore, the newly created CephFS exports will now share the same user_id and secret_access_key, which are determined based
on the NFS cluster name and filesystem name. This results in each export on the same filesystem using a shared connection,
thereby optimizing resource usage.

Signed-off-by: avanthakkar <avanjohn@gmail.com>

mgr/nfs: fix a unit test failure

Signed-off-by: John Mulligan <jmulligan@redhat.com>

mgr/nfs: fix a unit test failure

Signed-off-by: John Mulligan <jmulligan@redhat.com>

mgr/nfs: fix a unit test failure

Signed-off-by: John Mulligan <jmulligan@redhat.com>

mgr/nfs: enable user management on a per-fs basis

Add back the ability to create a user for a cephfs export but do
it only for a cluster+filesystem combination. According to the
ganesha devs this ought to continue sharing a cephfs client connection
across multiple exports.

Signed-off-by: John Mulligan <jmulligan@redhat.com>

mgr/nfs: add more unit tests with cmount_path

Add more unit tests for CEPHFS based NFS exports with
newly added cmount_path field under FSAL.
Signed-off-by: avanthakkar <avanjohn@gmail.com>

mgr/nfs: fix rgw nfs export when no existing exports

Signed-off-by: avanthakkar <avanjohn@gmail.com>

mgr/nfs: generate user_id & access_key for apply_export(CEPHFS)

Generate user_id & secret_access_key for CephFS based exports
for apply export. Also ensure the export FSAL block has
`cmount_path`.

Fixes: https://tracker.ceph.com/issues/63377
Signed-off-by: avanthakkar <avanjohn@gmail.com>

mgr/nfs: simplify validation, update f-strings, add generate_user_id function

- Improved validation to check cmount_path as a superset of path
- Replaced format method with f-strings
- Added generate_user_id function using SHA-1 hash
- Enhanced error handling and integrated new user_id generation
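
As a rough sketch of what a SHA-1 based `generate_user_id` could look like: the inputs (cluster name and filesystem name), the 8-character truncation and the `nfs.<cluster>.<fs>.<hash>` layout are all assumptions for illustration, not the format used by the module.

```python
import hashlib


def generate_user_id(cluster_id: str, fs_name: str) -> str:
    """Derive a stable user id from the cluster and filesystem names."""
    key = f"{cluster_id}:{fs_name}".encode("utf-8")
    # SHA-1 is used here only to get a short, stable suffix, not for security.
    suffix = hashlib.sha1(key).hexdigest()[:8]
    return f"nfs.{cluster_id}.{fs_name}.{suffix}"


same_a = generate_user_id("mynfs", "cephfs_a")
same_b = generate_user_id("mynfs", "cephfs_a")
other = generate_user_id("mynfs", "cephfs_b")
```

Because the id is a pure function of its inputs, every export on the same cluster and filesystem resolves to the same user, which is what allows the exports to share one connection.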

Signed-off-by: Avan Thakkar <athakkar@redhat.com>
Signed-off-by: Avan Thakkar <athakkar@redhat.com>
Signed-off-by: Avan Thakkar <athakkar@redhat.com>
Signed-off-by: Avan Thakkar <athakkar@redhat.com>
Signed-off-by: Avan Thakkar <athakkar@redhat.com>
Add unit tests for unique user ID generation, deletion and `cmount_path` handling in FSAL exports

- Ensure unique user ID generation for different FSAL blocks when creating exports.
- Test deletion behavior when multiple exports share the same user ID and one has a unique ID.
- Test default behavior when no `cmount_path` is provided (defaults to `/`).
- Add tests to validate error handling for invalid `cmount_path` values.
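
The behaviour these tests describe might be sketched as follows. This assumes "superset of path" means that `cmount_path` must be the export path itself or one of its ancestors; the helper name `resolve_cmount_path` is hypothetical.

```python
import posixpath
from typing import Optional


def resolve_cmount_path(path: str, cmount_path: Optional[str] = None) -> str:
    """Return the cmount_path to use for an export, defaulting to '/'."""
    if cmount_path is None:
        return "/"
    if not cmount_path.startswith("/"):
        raise ValueError(f"cmount_path must be absolute: {cmount_path!r}")
    norm_path = posixpath.normpath(path)
    norm_cmount = posixpath.normpath(cmount_path)
    is_ancestor = (
        norm_cmount == "/"
        or norm_path == norm_cmount
        or norm_path.startswith(norm_cmount + "/")
    )
    if not is_ancestor:
        raise ValueError(f"{cmount_path!r} does not contain export path {path!r}")
    return norm_cmount
```

Comparing against `norm_cmount + "/"` rather than the bare prefix avoids accepting `/volumes/a` as a parent of `/volumes/ab`.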

Signed-off-by: Avan Thakkar <athakkar@redhat.com>
The v1 endpoint is deprecated; using v2 is recommended instead.

Fixes: https://tracker.ceph.com/issues/67183
Signed-off-by: Nizamudeen A <nia@redhat.com>
Fixes: https://tracker.ceph.com/issues/65829
Signed-off-by: Milind Changire <mchangir@redhat.com>
Signed-off-by: Xuehan Xu <xuxuehan@qianxin.com>
Fixes https://tracker.ceph.com/issues/67853

When EC pools are created with a device class specified, the pools are created with just 1 PG and the autoscaler does not work.
The PG autoscaler not working on a cluster where pools have multiple overlapping roots is a known issue, and a bug has been raised for it.

The issue is already documented: https://docs.ceph.com/en/reef/rados/operations/placement-groups/#viewing-pg-scaling-recommendations

Also renames "let ceph decide" option to "All devices" in crush rule and ec profile component.
Updates unit tests for ec profile modal
Signed-off-by: Afreen Misbah <afreen23.git@gmail.com>
Fixes: https://tracker.ceph.com/issues/68062

Signed-off-by: Ivo Almeida <ialmeida@redhat.com>
Fixes: https://tracker.ceph.com/issues/68024

Signed-off-by: Ivo Almeida <ialmeida@redhat.com>
The intent of 42721c0 was to address an issue where boolean
parameters weren't handled correctly.

I noticed that a parameter (`tpm2`) was missed, which made me realize
that maintaining a list of these boolean parameters is necessary.

To simplify things, we should only accept `"true"` or `"false"` (in any case),
allowing us to avoid the need to maintain a list of boolean parameters.

This change introduces a `list_drive_group_spec_bool_arg` to store boolean
arguments related to drive group specifications, simplifying the validation
process for boolean values by directly checking if the values are 'true' or 'false'.
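
The "accept only true or false, in any case" rule can be sketched like this. It is an illustration of the validation idea only, not the code from the change; `parse_bool_arg` is a hypothetical helper.

```python
def parse_bool_arg(name: str, value: str) -> bool:
    """Accept only 'true' or 'false' (case-insensitive) for a boolean argument."""
    lowered = value.lower()
    if lowered == "true":
        return True
    if lowered == "false":
        return False
    raise ValueError(f"{name}: expected 'true' or 'false', got {value!r}")
```

Rejecting everything else (for example `yes` or `1`) keeps the check independent of which parameter is being parsed.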

Fixes: https://tracker.ceph.com/issues/68045

Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
Fixes: https://tracker.ceph.com/issues/68063

Signed-off-by: Ivo Almeida <ialmeida@redhat.com>
Signed-off-by: Xuehan Xu <xuxuehan@qianxin.com>
Signed-off-by: Xuehan Xu <xuxuehan@qianxin.com>
"PG::complete_error_log" interruptible

Fixes: https://tracker.ceph.com/issues/67800
Signed-off-by: Xuehan Xu <xuxuehan@qianxin.com>
An indentation of five spaces relative to the previous line creates a
command that is copyable with a single mouse click. This commit adds
those copyable commands to the procedure in the section "Building
Ceph".

Signed-off-by: Zac Dover <zac.dover@proton.me>
Fixes: https://tracker.ceph.com/issues/67651
Signed-off-by: Pedro Gonzalez Gomez <pegonzal@redhat.com>
mgr/dashboard: add service management for oauth2-proxy

Reviewed-by: afreen23 <NOT@FOUND>
Reviewed-by: Ankush Behl <cloudbehl@gmail.com>
Reviewed-by: Redouane Kachach <rkachach@redhat.com>
…e-test-and-overrides-to-correct-location

qa: relocate subvol creation overrides and test

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Reviewed-by: Rishabh Dave <ridave@redhat.com>
nbalacha and others added 14 commits September 17, 2024 14:40
Allows a namespace in a pool to be mirrored to a differently named
namespace in the secondary cluster.

Signed-off-by: N Balachandran <nibalach@redhat.com>
Includes the mirror_uuid in the mirror pool info command
output.

Signed-off-by: N Balachandran <nibalach@redhat.com>
…form-fix

mgr/dashboard: fix table column pipe transform

Reviewed-by: Afreen Misbah <afreen23.git@gmail.com>
…ap-id

qa: do the set/get attribute on the remote filesystem

Reviewed-by: Venky Shankar <vshankar@redhat.com>
* refs/pull/55421/head:
	qa/cephfs: add test to verify backtrace update failure on deleted data pool
	mds: batch backtrace updates by pool-id when expiring a log segment
	mds: dump log segment in segment expiry callback
	mds: dump log segment end along with offset

Reviewed-by: Xiubo Li <xiubli@redhat.com>
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
- fixes table in table structure's unusual padding
- fixes the ceph fs details page where table was getting hidden
- improving the subvolume list page by changing the spacings there
- hide the refresh button where it shouldn't be shown.

Fixes: https://tracker.ceph.com/issues/68050
Signed-off-by: Nizamudeen A <nia@redhat.com>
kv/rocksdb: return error for dump_objectstore_kv_stats asok command

Reviewed-by: Adam Kupczyk <akupczyk@ibm.com>
mgr/nfs: generate user_id & access_key for apply_export(CephFS)

Reviewed-by: Adam King <adking@redhat.com>
Reviewed-by: Dhairya Parmar <dparmar@redhat.com>
Reviewed-by: John Mulligan <jmulligan@redhat.com>
…-md-command-formatting

doc/README.md: create selectable commands

Reviewed-by: Anthony D'Atri <anthony.datri@gmail.com>
…r-v2

qa/task: update alertmanager endpoints version

Reviewed-by: Adam King <adking@redhat.com>
mgr/dashboard: fix minor issues in carbon tables

Reviewed-by: Ankush Behl <cloudbehl@gmail.com>
Reviewed-by: ivoalmeida <NOT@FOUND>
* refs/pull/59111/head:
	doc: document earmark option for subvolume and new commands
	qa/cephfs: update tests for test_volumes & unit-test for earmarking
	mgr/volumes: add earmarking for subvol

Reviewed-by: Venky Shankar <vshankar@redhat.com>
Reviewed-by: Kotresh Hiremath Ravishankar <khiremat@redhat.com>
Reviewed-by: John Mulligan <jmulligan@redhat.com>
…d-decorator

mgr/dashboard: remove orch required decorator from host UI router (list) 

Reviewed-by: afreen23 <NOT@FOUND>
Reviewed-by: dnyanee1997 <NOT@FOUND>
@JonBailey1993 JonBailey1993 force-pushed the JonBailey1993/ceph_test_rados_io_sequence branch from 36a5542 to 95273ea Compare September 18, 2024 11:17
crimson/osd/pg: do_osd_ops_execute fixes

Reviewed-by: Matan Breizman <mbreizma@redhat.com>
Reviewed-by: Yingxin Cheng <yingxin.cheng@intel.com>
@JonBailey1993 JonBailey1993 force-pushed the JonBailey1993/ceph_test_rados_io_sequence branch from 95273ea to 609b4a5 Compare September 18, 2024 11:23
idryomov and others added 2 commits September 18, 2024 15:13
rbd-mirror: allow mirroring to a different namespace

Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
@JonBailey1993 JonBailey1993 force-pushed the JonBailey1993/ceph_test_rados_io_sequence branch 2 times, most recently from 5ce5fed to 41fd338 Compare September 18, 2024 15:52
Creation of ceph_test_rados_io_sequence tool.
The application creates IO sequences using seeded random numbers.
The sequences vary in IO offset, length, order and data pattern in order to stress erasure coding specifically and to detect defects earlier in the development cycle.

We feel that creating this as a new tool is beneficial for several reasons:
* Existing test tools either generate completely random I/O (with different offsets and lengths) or test very uniform I/O (either whole-object reads/writes or reads/writes with a fixed length). This is very inefficient (it relies on brute force) for testing boundary conditions. The I/O sequence test tool generates sequences of I/Os to specifically test boundary conditions.
* Existing test tools only generate RADOS requests that contain a single read or write request. Recent regressions found in EC show that clients (e.g. RBD with LUKS) can generate RADOS requests that contain multiple reads or writes. This tool will sometimes generate requests with 2 or 3 I/O operations.
* The quality of data patterns when writing to objects, and of data validation when reading, is not as good as desired in existing test tools. This means that code bugs where the wrong part of an object is read/written may not be detected. This tool creates more sophisticated data patterns and keeps a model of the object to track the expected contents of the object.
* For erasure coding there are many choices of erasure code profile; this tool can create pools and exercise multiple different profiles in parallel to speed up testing.
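
The idea of pairing position-dependent data patterns with an object model can be sketched as below. This is a simplified Python illustration, not the io_exerciser implementation; `block_pattern`, `ObjectModel` and the 16-byte block size are assumptions chosen to keep the example small.

```python
import hashlib

BLOCK_SIZE = 16  # bytes per block, kept tiny for illustration


def block_pattern(seed: int, block: int) -> bytes:
    """Deterministic content that differs for every (seed, block index) pair."""
    return hashlib.sha256(f"{seed}:{block}".encode()).digest()[:BLOCK_SIZE]


class ObjectModel:
    """Remembers which seed last wrote each block of an object."""

    def __init__(self, num_blocks: int):
        self.block_seeds = [None] * num_blocks  # None means never written

    def write(self, offset: int, length: int, seed: int) -> bytes:
        """Record the write and return the payload to send to the object."""
        blocks = range(offset, offset + length)
        for b in blocks:
            self.block_seeds[b] = seed
        return b"".join(block_pattern(seed, b) for b in blocks)

    def expected(self, offset: int, length: int) -> bytes:
        """Bytes a correct read of this range should return (zeros if unwritten)."""
        seeds = self.block_seeds[offset:offset + length]
        return b"".join(
            bytes(BLOCK_SIZE) if s is None else block_pattern(s, b)
            for b, s in enumerate(seeds, start=offset)
        )

    def verify(self, offset: int, length: int, data: bytes) -> bool:
        return data == self.expected(offset, length)


model = ObjectModel(num_blocks=8)
payload = model.write(offset=2, length=3, seed=7)

good = bytearray(8 * BLOCK_SIZE)               # stand-in for the stored object
good[2 * BLOCK_SIZE:5 * BLOCK_SIZE] = payload  # data landed where it should

misplaced = bytearray(8 * BLOCK_SIZE)
misplaced[3 * BLOCK_SIZE:6 * BLOCK_SIZE] = payload  # data landed one block late
```

Because each block's content encodes its own index, `model.verify(0, 8, bytes(misplaced))` fails even though the bytes themselves are intact, which a uniform fill pattern would not detect.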

The test tool is located in src/test/osd, as the new tool is used explicitly for testing the erasure coding implementation of OSDs.
A lot of the logic for generating IO sequences has been stored in a new location, io_exerciser, located in src/common. We placed the code there because we feel it is useful to share with other applications.
It is abstracted to a level where it should not be necessary to use it only with the new ceph_test_rados_io_sequence application, and in future PRs we would like to integrate it with other tools such as ceph_test_rados or possibly rados bench to enhance the testing done with those tools.

Signed-off-by: Jon Bailey <jonathan.bailey1@ibm.com>
@JonBailey1993 JonBailey1993 force-pushed the JonBailey1993/ceph_test_rados_io_sequence branch from 41fd338 to afb6399 Compare September 18, 2024 17:00
Removes use of const members to allow possibility of moving.
Changes strings that were passed by value to now be const ref.
Adds readability improvements to shorten repeated use of long namespace names in cpp files.
Use std::generator for initialisation of seeds instead of previous for loop

Signed-off-by: Jon Bailey <jonathan.bailey1@ibm.com>
Removed dataGenerationSingleton.
Changed stringstream to use of fmt::format
Made enums into enum classes

Signed-off-by: Jon Bailey <jonathan.bailey1@ibm.com>
JonBailey1993 pushed a commit that referenced this pull request Sep 30, 2024
…n async task

test_messenger_thrash UT shows,

```
==461141==ERROR: AddressSanitizer: stack-use-after-return on address 0xffffb0b37c20 at pc 0xaaaad7239508 bp 0xffffeb113c50 sp 0xffffeb113c48
READ of size 4 at 0xffffb0b37c20 thread T0
    #0 0xaaaad7239504 in (anonymous namespace)::SyntheticWorkload::wait_for_done()::'lambda0'()::operator()() const /home/jenkins-build/build/workspace/ceph-pull-requests-arm64/src/test/crimson/test_messenger_thrash.cc:455:13
    #1 0xaaaad723a1c0 in seastar::internal::do_until_state<(anonymous namespace)::SyntheticWorkload::wait_for_done()::'lambda'(), (anonymous namespace)::SyntheticWorkload::wait_for_done()::'lambda0'()>::run_and_dispose() /home/jenkins-build/build/workspace/ceph-pull-requests-arm64/src/seastar/include/seastar/core/loop.hh:303:26
    #2 0xaaaadacfb790 in seastar::reactor::run_tasks(seastar::reactor::task_queue&) /home/jenkins-build/build/workspace/ceph-pull-requests-arm64/src/seastar/src/core/reactor.cc:2653:14
    #3 0xaaaadad04288 in seastar::reactor::run_some_tasks() /home/jenkins-build/build/workspace/ceph-pull-requests-arm64/src/seastar/src/core/reactor.cc:3123:9
    #4 0xaaaadad07cd0 in seastar::reactor::do_run() /home/jenkins-build/build/workspace/ceph-pull-requests-arm64/src/seastar/src/core/reactor.cc:3291:9
    #5 0xaaaadad05d60 in seastar::reactor::run() /home/jenkins-build/build/workspace/ceph-pull-requests-arm64/src/seastar/src/core/reactor.cc:3181:16
    #6 0xaaaadaa860d8 in seastar::app_template::run_deprecated(int, char**, std::function<void ()>&&) /home/jenkins-build/build/workspace/ceph-pull-requests-arm64/src/seastar/src/core/app-template.cc:276:31
    #7 0xaaaadaa83fb0 in seastar::app_template::run(int, char**, std::function<seastar::future<int> ()>&&) /home/jenkins-build/build/workspace/ceph-pull-requests-arm64/src/seastar/src/core/app-template.cc:167:12
    #8 0xaaaad7203d88 in main /home/jenkins-build/build/workspace/ceph-pull-requests-arm64/src/test/crimson/test_messenger_thrash.cc:669:14
    #9 0xffffb32773f8 in __libc_start_call_main csu/../sysdeps/nptl/libc_start_call_main.h:58:16
    #10 0xffffb32774c8 in __libc_start_main csu/../csu/libc-start.c:392:3
    #11 0xaaaad712546c in _start (/home/jenkins-build/build/workspace/ceph-pull-requests-arm64/build/bin/unittest-seastar-messenger-thrash+0x3b5546c) (BuildId: b0048d750e057d178816f94b3ce0459971785191)
```

Address 0xffffb0b37c20 is located in stack of thread T0 at offset 32 in frame
    #0 0xaaaad7493ed8 in ceph::buffer::v15_2_0::list::buffers_t::clear_and_dispose() /home/jenkins-build/build/workspace/ceph-pull-requests-arm64/src/include/buffer.h:596

Signed-off-by: arm7star <arm7star@qq.com>
JonBailey1993 pushed a commit that referenced this pull request Dec 2, 2024
https://vote.heliosvoting.org/helios/elections/e03494ce-e04c-41d0-bb05-ec5ccc632ce4/view

Question #1
    Update election requirements for Ceph Executive Council Elections?
        Remove "ranked-choice" requirement	13
        Keep "ranked-choice" requirement (no change)	16

Question #2
    Require periodic elections in governance charter?
        No (no change)	8
        Annual	15
        Semi-annual	3
        Quarterly	2

Question #3
    Update the Ceph Executive Council term length?
        Change to 3 years	14
        Keep 2 years (no change)	14

Question #4
    Amend governance document to require a supermajority of votes for amendments to the governance model? The current requirement is a simple majority.
        Require a supermajority	20
        Require a simple majority (no change)	9

Question #5
    Clarify "supermajority" and "majority" election requirements?
        Of members voting on a given question (abstaining does not bias the vote)	18
        Of members voting on the election (abstaining is an implicit "no")	6
        Of members in the CSC	3

Signed-off-by: Patrick Donnelly <pdonnell@ibm.com>
JonBailey1993 pushed a commit that referenced this pull request Jan 7, 2025
https://vote.heliosvoting.org/helios/elections/e03494ce-e04c-41d0-bb05-ec5ccc632ce4/view

Question #1
    Update election requirements for Ceph Executive Council Elections?
        Remove "ranked-choice" requirement	13
        Keep "ranked-choice" requirement (no change)	16

Question #2
    Require periodic elections in governance charter?
        No (no change)	8
        Annual	15
        Semi-annual	3
        Quarterly	2

Question #3
    Update the Ceph Executive Council term length?
        Change to 3 years	14
        Keep 2 years (no change)	14

Question #4
    Amend governance document to require a supermajority of votes for amendments to the governance model? The current requirement is a simple majority.
        Require a supermajority	20
        Require a simple majority (no change)	9

Question #5
    Clarify "supermajority" and "majority" election requirements?
        Of members voting on a given question (abstaining does not bias the vote)	18
        Of members voting on the election (abstaining is an implicit "no")	6
        Of members in the CSC	3

Signed-off-by: Patrick Donnelly <pdonnell@ibm.com>
JonBailey1993 pushed a commit that referenced this pull request Feb 7, 2025
…lable

When we have a socket failure or connection issue, we may send a mon command
and never check if it completed. If we resend the command to another monitor,
the resent command may complete before the first sent command. This can cause
users to send the command twice, which can lead to issues in automated
environments. For example:

We have 2 monitors: mon.a and mon.b

1. Send command to delete pool - monclient targets mon.a
2. A socket failure occurs, and mon.a has a delay in response
3. Monclient hunts for another monitor to resend the delete pool command
   and finds mon.b
4. Mon.b removes the pool and sends an acknowledgment
5. The user script now sends a create pool command, but mon.a now sends the
   acknowledgment for the pool delete from step #1

We end up without a pool, as mon.a deleted it.

The mon_client_hunt_on_resent configuration was added to control the behavior of
retrying commands on monitor connection failures.
By default, this option is enabled to prevent situations where a command is retried
on the same monitor, potentially missing better monitor candidates.
Clients experiencing specific conditions that require retrying on the same monitor
can disable this feature by setting the configuration to false.
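
The race in the numbered steps above can be replayed with a toy model. This is a deliberately simplified sketch that treats the cluster's pool list as one shared set and applies commands in the order they take effect; it does not model the monitor client or the `mon_client_hunt_on_resent` option, and the `run` helper is hypothetical.

```python
def run(timeline, pools):
    """Apply (source, op, pool) events in order to one shared set of pools."""
    state = set(pools)
    for _source, op, name in timeline:
        if op == "create":
            state.add(name)
        elif op == "delete":
            state.discard(name)
    return state


# Order in which the commands take effect in the scenario described above.
race = [
    ("resent delete, handled via mon.b", "delete", "data"),
    ("user script", "create", "data"),
    ("delayed original delete, handled via mon.a", "delete", "data"),
]
without_late_delete = race[:2]

after_race = run(race, {"data"})
after_clean = run(without_late_delete, {"data"})
```

With the late delete applied, the freshly created pool is gone again; without it, the pool survives, which is the outcome the user script expected.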

Fixes: https://tracker.ceph.com/issues/63789
Signed-off-by: Nitzan Mordechai <nmordec@redhat.com>