parent 9fedc1e
author sparvekar <[email protected]> 1695407418 -0400
committer sparvekar <[email protected]> 1696791578 -0400

docs/cephadm: fix broken links in cephadm docs

In the documentation https://docs.ceph.com/en/quincy/cephadm/services/osd/, there is a broken link in "List Devices" chapter. This change fixes the documentation to point to the correct link https://docs.ceph.com/en/quincy/rados/operations/devices

Fixes: https://tracker.ceph.com/issues/55763

Signed-off-by: sparvekar <[email protected]>

rgw: remove Bucket::update_container_stats()

callers use Bucket::read_stats() to load bucket stats

Signed-off-by: Casey Bodley <[email protected]>

rgw/admin: 'buckets list' takes --marker

Signed-off-by: Casey Bodley <[email protected]>

rgw/sal: list_buckets() returns RGWBucketEnts

`sal::User::list_buckets()` no longer returns a map of `sal::Bucket`
handles. it now uses `std::span<RGWBucketEnt>` for input and output.
`RGWBucketEnt` contains all of the information we need to satisfy
ListBuckets requests, and also stores the `rgw_bucket` key for use with
`Driver::get_bucket()` where a `sal::Bucket` handle is necessary

`sal::BucketList` contains the span of results and the `next_marker`.
the `is_truncated` flag was removed in favor of `!next_marker.empty()`

the checks for `user->get_max_buckets()` on bucket creation now use a
paginated `check_user_max_buckets()` helper function that limits the
number of allocated entries to `rgw_list_buckets_max_chunk`
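
A minimal sketch of the marker-based pagination described above, in
Python rather than the C++ of the actual change. `list_page`,
`count_buckets` and the in-memory bucket list are hypothetical
stand-ins; only the ideas of treating an empty `next_marker` as "not
truncated" and of capping each page at a chunk size come from this
commit.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class BucketList:
    # Hypothetical stand-in for sal::BucketList: results plus next_marker.
    buckets: List[str] = field(default_factory=list)
    next_marker: str = ""


def list_page(all_buckets, marker, max_entries):
    """Return up to max_entries bucket names that sort after `marker`."""
    remaining = sorted(b for b in all_buckets if b > marker)
    page = remaining[:max_entries]
    # An empty next_marker plays the role of the removed is_truncated flag.
    more = len(remaining) > max_entries
    return BucketList(page, page[-1] if more else "")


def count_buckets(all_buckets, chunk):
    """Count buckets page by page so at most `chunk` entries are held at once."""
    total, marker = 0, ""
    while True:
        result = list_page(all_buckets, marker, chunk)
        total += len(result.buckets)
        if not result.next_marker:
            return total
        marker = result.next_marker
```

A max-buckets check could then compare the running total against the
user's limit without ever allocating more than one chunk of entries.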

Signed-off-by: Casey Bodley <[email protected]>

rgw/sal: StoreBucket no longer wraps RGWBucketEnt

`sal::Bucket` no longer needs to wrap `RGWBucketEnt` to support user
bucket listings, so can be represented by `RGWBucketInfo` alone. the
bucket stats interfaces that relied on RGWBucketEnt internally now
return their result as either `RGWBucketEnt` or `RGWStorageStats`

Signed-off-by: Casey Bodley <[email protected]>

debian: Build-Depend on g++ 11 or greater

Rely on the packaging system to provide a suitable g++ of version 11
or greater, and remove the corresponding hard-coding from
debian/rules, since cmake will then find a suitable version. This
seems better than trying to hard-code a particular version in
debian/rules, and Debian package building tools like e.g. sbuild will
then do the right thing.

This enables Reef (v18.2.0) to build on Debian bookworm in a clean
chroot.

Fixes: https://tracker.ceph.com/issues/61845

Signed-off-by: Matthew Vernon <[email protected]>

debian: specify interpreters for ceph-mon and ceph-osd postinsts

These were previously missing. The requirement for interpreters is in
Debian policy section 10.4:
https://www.debian.org/doc/debian-policy/ch-files.html#s-scripts

Debian's packaging already adds the #! to these two postinsts. In
practice, a text executable without a #! line will likely be executed
by the calling shell, so a lot of the time we'd get away with it
unless the administrator is using an incompatible shell like tcsh.

This behaviour of shells is documented in POSIX section 1(e)(i)(b)
here:
https://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#tag_18_09_01_01

Signed-off-by: Matthew Vernon <[email protected]>

debian: remove {Build-,}Depends on essential packages

Unless there's a version requirement (which there isn't here),
packages should not declare a Build-Depends: or Depends: relationship
on essential packages. Policy link:

https://www.debian.org/doc/debian-policy/ch-binary.html#dependencies

Signed-off-by: Matthew Vernon <[email protected]>

debian: add missing item separators in debian/control

Signed-off-by: Matthew Vernon <[email protected]>

debian/copyright: update syntax, maintainer, add license stanzas

Update the header paragraph to link to the canonical URL for the
format, and point to [email protected] as the Contact.

Also add License: stanzas to reflect the licences in use (and refer to
fuller versions in /usr/share/common-licenses/ as appropriate).

This means that packages containing this copyright file are better in
compliance with the licences concerned.

Signed-off-by: Matthew Vernon <[email protected]>

debian: dh compat to 12, necessary init/systemd adjustments

Bring the dh compat level to 12, the most recent supported by the
oldest supported Ubuntu LTS release, 20.04. This necessitates changes
to how initscripts & systemd packaging are done.

Signed-off-by: Matthew Vernon <[email protected]>

debian: correct maintainer address

This means that debian/control matches changelog entries, and that the
Maintainer address is up to date.

Signed-off-by: Matthew Vernon <[email protected]>

debian: remove obsolete ceph-base.docs, restore dh_installdocs

debian/ceph-base.docs only referred to a README that doesn't exist, so
remove it. Because dpkg-source doesn't reflect deletions from debian/
cf the orig.tar.gz, also remove the file in dh_auto_clean.

Then remove the empty override of dh_installdocs; the main benefit
here is that debian/copyright gets installed in all of the built
packages, which otherwise lack a copyright file.

Signed-off-by: Matthew Vernon <[email protected]>

debian: specify a dependency on python3 for cephadm

cephadm is a compressed zipapp, and dh_python3 doesn't understand
this sort of binary file, so fails to produce the required python3
dependency. So specify this explicitly in debian/control

Signed-off-by: Matthew Vernon <[email protected]>

debian: radosgw.init to installinit, remove auto_build override

Installation of init scripts properly belongs with dh_installinit, so
move the installation there.

That means we no longer need the override of dh_auto_build, which
simplifies the rules file.

Signed-off-by: Matthew Vernon <[email protected]>

debian: call dh_python3 for ceph-{base,common,fuse,volume}

In the cases of ceph-base, ceph-common, and ceph-fuse, this picks up
that these packages contain python scripts and adds a necessary
python3 dependency. In the case of ceph-volume it additionally parses
the requirements.txt file.

Signed-off-by: Matthew Vernon <[email protected]>

test: unmount the mountpoint just before exiting

Without this the qa test may fail by evicting the unresponsive
client after 300 seconds.

Fixes: https://tracker.ceph.com/issues/61394
Signed-off-by: Xiubo Li <[email protected]>

run-make-check.sh: use clang-17 if available

now that clang-17 has been released, let's use it if available.

Signed-off-by: Kefu Chai <[email protected]>

valgrind: UninitCondition under __run_exit_handlers suppression

required in CentOS / RHEL 9 & Ubuntu 22.04.1 LTS

Fixes: https://tracker.ceph.com/issues/62141

Signed-off-by: Mark Kogan <[email protected]>

doc/architecture: "Edit HA Auth"

Rewrite the explanation of how a client authenticates against a monitor.
This is a rewrite of a single paragraph, and has been set apart in its
own PR so that it can receive the maximum amount of scrutiny that the
upstream Ceph community can muster.

Co-authored-by: Anthony D'Atri <[email protected]>
Signed-off-by: Zac Dover <[email protected]>

cephadm: fix unit tests executing FileLock type

The FileLock type doesn't play much of a role when running tests so
to prevent issues, always mock it out when using with_cephadm_ctx.

In particular, a future patch revealed a problem with the FileLock code
that I can not understand how it was not hit before, or why this simple
refactoring - not directly related to file locking - triggered it. But
in short, the FakeFilesystem mocking utility only covers some syscalls.
In fact, the fake filesystem was returning an fd that was then passed to
real calls (fcntl and os.close).  The latter then triggered issues when
pytest was trying to clean up after it applied its magic to stdio
objects in sys. The fix is easy; understanding why and how it happens
was hard.  I still don't understand why it popped up when it did, only
that this is necessary to implement the following patches.
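
A minimal sketch of the "always mock it out" idea, using only
unittest.mock. The `FileLock` class, `run_command` and `with_fake_ctx`
below are hypothetical stand-ins written for illustration; they are not
the cephadm test helpers themselves.

```python
import contextlib
from unittest import mock


class FileLock:
    """Hypothetical lock whose real methods must never run under a fake fs."""

    def __init__(self, name):
        self.name = name

    def acquire(self):
        raise RuntimeError("real locking reached from a test")

    def release(self):
        raise RuntimeError("real locking reached from a test")


def run_command(name):
    # Looks up FileLock at call time, so a patched global takes effect.
    lock = FileLock(name)
    lock.acquire()
    try:
        return f"ran {name}"
    finally:
        lock.release()


@contextlib.contextmanager
def with_fake_ctx():
    # Always replace FileLock so tests never reach real fd-based calls.
    with mock.patch.dict(globals(), {"FileLock": mock.MagicMock()}):
        yield
```

Inside `with with_fake_ctx():` the call `run_command("x")` completes
without touching the real lock; outside it, the real class is restored.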

Signed-off-by: John Mulligan <[email protected]>

cephadm: move a pair of systemd unit status funcs to systemd.py

Signed-off-by: John Mulligan <[email protected]>

cephadm: move CephContainer/similar to new container_types.py

Part of general cephadm split-up refactoring. I am not happy with the
name 'container_types' but none of the alternatives I could think
of were much better.

Signed-off-by: John Mulligan <[email protected]>

cephadm: black format systemd.py

Signed-off-by: John Mulligan <[email protected]>

cephadm: black format container_types.py

Signed-off-by: John Mulligan <[email protected]>

mgr/cephadm: removing double quotes from the generated nvmeof config

Fixes: https://tracker.ceph.com/issues/62838

Signed-off-by: Redouane Kachach <[email protected]>

doc/architecture: edit "HA Authentication"

Edit "High Availability Authentication" in doc/architecture.rst.

Co-authored-by: Anthony D'Atri <[email protected]>
Signed-off-by: Zac Dover <[email protected]>

mgr/dashboard: fix prometheus queries subscriptions

Fixes: https://tracker.ceph.com/issues/62868
Signed-off-by: Pedro Gonzalez Gomez <[email protected]>

mgr/dashboard: remove empty popover when there are no health warns

Fixes: https://tracker.ceph.com/issues/62846
Signed-off-by: Nizamudeen A <[email protected]>

cephadm: start decorators.py in cephadmlib

Originally, wanted to move all the decorators into
their own files. Unfortunately, that isn't possible
at this time as most of them depend on things that
are still within cephadm.py. This includes

list_daemons
_rm_cluster
is_fsid
termcolor
ContainerInfo
Ceph

and I'm sure I'm missing some others. We'll have to
revisit this again later when more of these things
have moved, or they can be slowly moved as their
dependencies are.

Signed-off-by: Adam King <[email protected]>

cephadm: black format initial decorators.py

Signed-off-by: Adam King <[email protected]>

cephadm: create host_facts.py in cephadmlib

For storing classes/functions related to gathering
information about the hosts such as disk enclosures
and networks

Signed-off-by: Adam King <[email protected]>

cephadm: format black host_facts.py

Signed-off-by: Adam King <[email protected]>

mgr/cephadm: add ability to zap OSDs' devices while draining host

Currently, when cephadm drains a host, it will remove all OSDs on
the host, but provides no option to zap the OSD's devices afterwards.
Given users are draining the host likely to remove it from the cluster,
it makes sense some users would want to clean up the devices on the
host that were being used for OSDs. Cephadm already supports zapping
devices outside of host draining, so it shouldn't take much to
add that functionality to the host drain as well.

Fixes: https://tracker.ceph.com/issues/61593

Signed-off-by: Adam King <[email protected]>

pybind/mgr/pg_autoscaler: noautoscale flag retains individual pool configs

Problem:

The pg_autoscaler `noautoscale flag` doesn't retain the individual pool
states of `autoscale mode`. For example, if you turn the flag `ON` and
then `OFF` again, all the pools will have `autoscale mode on`. This is
inconvenient because sometimes the user just wants to temporarily disable
the autoscaler on all pools and enable it again after a period of time
while retaining the individual pool states of `autoscale mode`.

Solution:

We store the noautoscale flag in the OSDMAP so that it is
persistent. We then get rid of the noautoscale MODULE OPTION
in the pg_autoscaler module since we do not need it anymore.
Every time we set, unset or get the flag we look it up in
the OSDMAP. We do this to avoid inconsistent values of the
`noautoscale flag`, because the flag
can easily be set by doing `ceph osd set noautoscale`.

Fixes: https://tracker.ceph.com/issues/61922

Signed-off-by: Kamoltat <[email protected]>

qa/workunits: modified tests for noautoscale flag change

modified:

`qa/workunits/mon/test_noautoscale_flag.sh`
`qa/workunits/cephtool/test.sh`

adding test coverage to files mentioned above

Fixes: https://tracker.ceph.com/issues/61922

Signed-off-by: Kamoltat <[email protected]>

doc/architecture: edit "SDEH"

Edit the front matter of the "Smart Daemons Enable Hyperscale" section
of doc/architecture.rst.

Co-authored-by: Anthony D'Atri <[email protected]>
Signed-off-by: Zac Dover <[email protected]>

crimson/os/seastore/cache: don't add EXIST_CLEAN extents to lru

Signed-off-by: Xuehan Xu <[email protected]>

crimson/os/seastore/cache: replace is_clean by is_stable_clean wherever
possible

Signed-off-by: Xuehan Xu <[email protected]>

crimson/os/seastore/transaction_manager: move intermediate_key by
"remap_offset" when remapping the "back" half of the original pin

Signed-off-by: Xuehan Xu <[email protected]>

os/bluestore: Only capture time of oldest operation

Used to capture entire oldest operation.
Now only captures oldest operation time.
Changes parsing to quickly exit when op is not "osd_op".

Signed-off-by: Adam Kupczyk <[email protected]>

os/bluestore: Fix setting osd_op_history_size

To make OSD immediately react to new config one must set value directly
instead of changing configuration.

Signed-off-by: Adam Kupczyk <[email protected]>

os/bluestore: scraper, fix sleep time

Typo with +/- caused problem with calculation of osd.ready_time
causing target OSD to be excluded from processing cycle.

Signed-off-by: Adam Kupczyk <[email protected]>

os/bluestore: scraper: make fixed history duration

Set a fixed 2s duration for OSD history ops.
The OSD behaves weirdly when the ops history is at full size.
Make the duration small to let ops leave the capture window quickly.

Signed-off-by: Adam Kupczyk <[email protected]>

os/bluestore: scraper: Make window size calculation dumb

The OSD has weird handling of ops when the window is full.
Make window size adaptation really-really simple.
Now window is set to 1.5 * captured ops.
Window is never shortened.
Deleted unused code that related to periodic window size update.
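
The sizing rule above can be sketched as follows. This is an
illustrative Python version, not the scraper's code; the function name
and the truncation to an integer are assumptions.

```python
def next_window_size(current: int, captured_ops: int) -> int:
    """Grow the window to 1.5x the captured ops; never shorten it."""
    # int() truncation is an assumption made for this sketch.
    return max(current, int(1.5 * captured_ops))
```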

Signed-off-by: Adam Kupczyk <[email protected]>

fixup

Signed-off-by: Adam Kupczyk <[email protected]>

pybind/mgr/pg_autoscaler: fix warn when not too few pgs

Problem:

When `pg_num_final` is equal to `pg_num_target`,
we get "too many PGs" warnings in ceph health while
the autoscaler is in `warn` mode.

Solution:

Get rid of `else` condition and add an
`elif p['pg_num_final'] < p['pg_num_target']`
instead
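
A simplified sketch of the resulting control flow. Only the two
`pg_num_*` keys and the `elif` condition come from this commit; the
function name, the `pool_name` key and the mapping of the first branch
to "too few" are assumptions made for illustration.

```python
def classify_pools(pools):
    """Split pools into too-few and too-many lists; equal counts warn on neither."""
    too_few, too_many = [], []
    for p in pools:
        if p['pg_num_final'] > p['pg_num_target']:
            too_few.append(p['pool_name'])
        elif p['pg_num_final'] < p['pg_num_target']:
            too_many.append(p['pool_name'])
        # pg_num_final == pg_num_target: no warning. A bare `else` here
        # would have reported this pool as having too many PGs.
    return too_few, too_many
```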

Fixes: https://tracker.ceph.com/issues/61570

Signed-off-by: Kamoltat <[email protected]>

osd: fix manifest object not to be promoted when references_chunk called

When a cls_cas_references_chunk() is called on a chunked metadata object,
it makes the object's chunks be promoted in maybe_handle_manifest_detail().
It happens, for instance, while doing a chunk scrub job.

However, this operation doesn't need to get evicted data.
It only needs metadata information that already exists in metadata object.
To prevent this object promotion, this commit adds an exception handling for this.

Signed-off-by: Sungmin Lee <[email protected]>

client: Add multi_target_id to handle libcephfs mds_spec=* command

To fix the issue of multi-target MDS command results being overwritten, add a multi_target_id feature to CommandTable.
Also add multi_target_id to CommandOp to track when all the multi-target commands end, in order to produce a formatted result.

Signed-off-by: Jimyeong Lee <[email protected]>

client: Apply multi target MDS result handling

To fix the issue that multi-target MDS results are overwritten

Signed-off-by: Jimyeong Lee <[email protected]>

client: Fix 1 active, multi standby mds condition timing issue

When there is 1 active MDS and several standby MDSs, and the result from a standby MDS comes after the active MDS's,
the closing square bracket cannot be added.
To fix this issue, add one more condition.

Signed-off-by: Jimyeong Lee <[email protected]>

client: Adjust multi_targets map assert condition

When accessing the map, the set is initialized even if no element is added to it,
so make the assert condition reasonable.

Signed-off-by: Jimyeong Lee <[email protected]>

client: Add trailing closing square bracket once in handle_command_reply

Delete unnecessary method, add multi_id to logs

Signed-off-by: Jimyeong Lee <[email protected]>

test: client: Add multi target MDSs command tests

Signed-off-by: Jinmyeong Lee <[email protected]>

mon/ConfigMonitor: Show localized name in "config dump --format json" output

The "ceph config dump" command without the json formatted output shows
the localized option names and their values. An example of a normalized
vs localized option is shown below:

Normalized: mgr/dashboard/ssl_server_port (maintained within Option struct)
Localized: mgr/dashboard/x/ssl_server_port (maintained in mon store)

But the "ceph config dump --format json*" output showed the normalized
option names which was not consistent with the "config dump" output.
The output of the command along with variations for pretty printing must
show the same content.

This commit introduces a new member within the ConfigMap's MaskedOption
struct called "localized_name". This is initialized to the localized name
as part of ConfigMonitor::load_config() method.

The MaskedOption::dump() used for the json formatting is modified to
display the localized_name instead of the normalized name.

Fixes: https://tracker.ceph.com/issues/62379
Signed-off-by: Sridhar Seshasayee <[email protected]>

PendingReleaseNotes: Note change to 'ceph config dump' pretty-print output.

Signed-off-by: Sridhar Seshasayee <[email protected]>

msg/AsyncMessenger: re-evaluate the stop condition when woken up in 'wait()'

Signed-off-by: Leonid Usov <[email protected]>
Fixes: https://tracker.ceph.com/issues/62395

os/bluestore: add some slow count for bluestore

Add slow count as below:
- l_bluestore_slow_aio_wait_count
- l_bluestore_slow_committed_kv_count
- l_bluestore_slow_read_onode_meta_count
- l_bluestore_slow_read_wait_aio_count

We can get a count of how often bluestore operations are slow;
in some cases, this is more useful than average latency.
Add it to prometheus so we can get it from the dashboard.

Signed-off-by: Yite Gu <[email protected]>

tools: add std:: qualifiers to 'move'

to silence compiler warnings.
e.g. ceph_dedup_tool.cc:1104:32: warning: unqualified call to
'std::move' [-Wunqualified-std-cast-call]
     estimate_threads.push_back(move(ptr));
                               ^
                               std::

Signed-off-by: Ronen Friedman <[email protected]>

rgw/test: add std:: qualifiers to 'move'

to silence compiler warnings.

Signed-off-by: Ronen Friedman <[email protected]>

doc/architecture: edit "OSDs service clients directly"

Edit "OSDs service clients directly" in the list in
"Smart Daemons Enable Hyperscale" in doc/architecture.rst.

Signed-off-by: Zac Dover <[email protected]>

cephfs-shell: drop LooseVersion for version.parse

Fixes: https://tracker.ceph.com/issues/62739
Signed-off-by: Jos Collin <[email protected]>

doc: update colorama, packaging

Fixes: https://tracker.ceph.com/issues/62739
Signed-off-by: Jos Collin <[email protected]>

mgr/dashboard: fix cephfs forms validations

1. CephFS Edit Form didn't have any validation for name even though the
   Create form had it. So reused the Create form to display the Edit as well

2. Add Name Validations to Subvolume and Subvolume group forms

3. Removed the datePipe from the cephfs list template since we are using
   the relativeDate.

Fixes: https://tracker.ceph.com/issues/62939
Signed-off-by: Nizamudeen A <[email protected]>

mgr/dashboard: update to angular v15

- The scss import was broken because of the ~ symbol. Looks like it's not
needed.

- Login username/password label was somehow broken because of the
placeholder class and color. instead of applying the color through a
class I applied the color directly to the attribute and it worked

- Typescript 4.9 uses ES2022 and it complains about using some items
  before their initialization. There were other typescript fixes that
  needed to be delivered because of this change.

- Reverting back the badge to rectangular shape (because I feel like the
  round leaves out some empty spaces)

Fixes: https://tracker.ceph.com/issues/62844
Signed-off-by: Nizamudeen A <[email protected]>

mgr/dashboard: update nodejs to 18.17.0

the latest npm doesn't support setting python as a config like `npm
config set python3`. Instead it needs to be set either in node-gyp
explicitly using the node-gyp command or through an environment
variable.
Since we are calling the node-gyp through npm, we need to set the
environment variable which is documented here: https://github.com/nodejs/node-gyp?tab=readme-ov-file#configuring-python-dependency

Accordingly the CMakeLists.txt for dashboard is adapted

Fixes: https://tracker.ceph.com/issues/62844
Signed-off-by: Nizamudeen A <[email protected]>

mgr/dashboard: adapt and refactor jest test files

Use `configureTestBed` as the placeholder for adding the
declarations, imports... that are required for the unit tests to run

Fixes: https://tracker.ceph.com/issues/62844
Signed-off-by: Nizamudeen A <[email protected]>

mgr/dashboard: upgrade to cypress 12

Looks like chrome 117 will need cypress >=12.15.0
https://github.com/cypress-io/cypress-documentation/issues/5479

Signed-off-by: Nizamudeen A <[email protected]>

exporter: add ceph_daemon labels to labeled counters as well

Exporter missed adding the `ceph_daemon` or `instance_id` labels
(in the case of rgw metrics) to the new labeled performance counters.

Fixes: https://tracker.ceph.com/issues/62874
Signed-off-by: avanthakkar <[email protected]>

common/tracer: remove is_enabled check in add_span methods

when tracing is disabled globally, new spans won't be added
to existing traces, because of that if condition.
this can happen also if a specific trace was enabled by lua script

so even when tracing is disabled, the tracer will create a new span
if its parent span is not a noop span, regardless of tracer state
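
A rough Python model of that rule, for illustration only. The class
and method names below are hypothetical and do not mirror the C++
tracer's signatures; the point is that `add_span` consults the parent
span, not the global enabled state.

```python
class NoopSpan:
    """Placeholder span that records nothing."""


class Span:
    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent


class Tracer:
    def __init__(self, enabled):
        self.enabled = enabled

    def start_trace(self, name, trace_enabled=False):
        # A single trace may be switched on even if tracing is off globally.
        if self.enabled or trace_enabled:
            return Span(name)
        return NoopSpan()

    def add_span(self, name, parent):
        # No global is_enabled check here: follow the parent span instead.
        if isinstance(parent, NoopSpan):
            return NoopSpan()
        return Span(name, parent)
```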

Signed-off-by: Omri Zeneva <[email protected]>

rgw: add test case to reproduce bucket check stats bug for versioned bucket

Reproduces a regression where radosgw-admin bucket check incorrectly counts
objects that started as unversioned and later transitioned to versioned.

Signed-off-by: Cory Snyder <[email protected]>

rgw: fix radosgw-admin bucket check stat calculation bug

Fixes a regression with radosgw-admin bucket check stat
calculation and bucket reshard stat calculation when
there are objects that have transitioned from unversioned
to versioned. The bug was introduced in
152aadb71b61c53a4832a1c8cf82fce3d64b68d1.

Signed-off-by: Cory Snyder <[email protected]>

rgw: fix output formatting of bucket index check admin api

The bucket index check admin API was previously returning invalid
JSON.

Signed-off-by: Cory Snyder <[email protected]>

crimson/vstart: add --seastore-device-size option in vstart.sh command line

default seastore_device_size will be out of space for smp >28

Signed-off-by: chunmei <[email protected]>

qa/rgw/sts: keycloak task installs java manually

java was installed automatically before centos 9. add an
override to install the jdk-17 packages manually

Fixes: https://tracker.ceph.com/issues/62536

Signed-off-by: Casey Bodley <[email protected]>

doc/architecture: edit "OSD Membership and Status"

Edit "OSD Membership and Status" in the "Smart Daemons Enable
Hyperscale" section of doc/architecture.rst.

Co-authored-by: Anthony D'Atri <[email protected]>
Signed-off-by: Zac Dover <[email protected]>

qa: fix "no orch backend set" in nfs suite

Fixes: https://tracker.ceph.com/issues/62870
Signed-off-by: Dhairya Parmar <[email protected]>

doc/architecture: edit "Data Scrubbing"

Edit the "Data Scrubbing" listitem in the list of benefits conferred by
the use by OSDs of the aggregate power of the cluster, in the section
"Smart Daemons Enable Hyperscale" in doc/architecture.rst.

Co-authored-by: Anthony D'Atri <[email protected]>
Signed-off-by: Zac Dover <[email protected]>

script/backport-resolve-issue: Update script with latest versions

Signed-off-by: Sayantani Saha <[email protected]>

doc/architecture: edit "Replication"

Edit "Replication" in the "Smart Daemons Enable Hyperscale" section of
doc/architecture.rst.

Signed-off-by: Zac Dover <[email protected]>

cephadm: move a logging line closer to where the data is used

Move a logging line closer to where the data being logged is
used. This avoids having a dependency on logging in a fairly
simple function and should make moving the function in a future
commit easier.

Signed-off-by: John Mulligan <[email protected]>

cephadm: move context based getters to context_getters.py

Move functions that exist mainly to pull information out of the
CephadmContext in various ways to a new context_getters.py module.

Signed-off-by: John Mulligan <[email protected]>

cephadm: rename fetch_tcp_ports to fetch_endpoints

Rename fetch_tcp_ports to fetch_endpoints to more closely match what
the function is doing.

Signed-off-by: John Mulligan <[email protected]>

cephadm: black format context_getters.py

Signed-off-by: John Mulligan <[email protected]>

cephadm: remove (doc)string

Remove a now-irrelevant (IMO) docstring that might have been
associated with the recently moved `cached_stdin` global. It's not
really clear how helpful it is in light of the new "compiled"
cephadm, so I am opting to remove it rather than move it.

Signed-off-by: John Mulligan <[email protected]>

cephadm: move pathify & get_file_timestamp to file_utils

Signed-off-by: John Mulligan <[email protected]>

mgr/cephadm: fix REFRESHED column of orch ps being unpopulated

The way the daemon ls data was processed was changed in
https://github.com/ceph/ceph/commit/1fd4132c7c03602719f29230732b12c8afa04779
and it seems that commit removed a line that set the
last_refresh field. This commit just adds it back
in the new location after the change.

Without this in "ceph orch ps" the REFRESHED column
for every daemon just reports "-"

Fixes: https://tracker.ceph.com/issues/62954

Signed-off-by: Adam King <[email protected]>

mgr/cephadm: add unit test for _process_ls_output

This is a weird function to make a unit test for
since it's essentially just moving data from a
list of dicts into a list of DaemonDescriptions,
but wanted to have some coverage to lower the
chance of breaking something again.

Signed-off-by: Adam King <[email protected]>

cephadm: move more funcs into data_utils.py

Signed-off-by: Adam King <[email protected]>

cephadm: re-format black data_utils.py

Signed-off-by: Adam King <[email protected]>

cephadm: move logging from registry_login to command_registry_login

So that registry_login can be moved to container_engines.py
without creating a dependency on logging there

Signed-off-by: Adam King <[email protected]>

cephadm: move registry_login to container_engines.py

Signed-off-by: Adam King <[email protected]>

cephadm: re-format black container_engines.py

Signed-off-by: Adam King <[email protected]>

cephadm: move more funcs into net_utils.py

Signed-off-by: Adam King <[email protected]>

cephadm: add unit test for get_ipv6_address

I wanted to modify this function slightly
to try to make both black and flake8 happy
with it, so adding a unit test to make sure
I don't break it.

Signed-off-by: Adam King <[email protected]>

cephadm: re-format black net_utils.py

There was a conflict here between what black
and flake8 were okay with. After running
format-black flake8 would report

cephadmlib/net_utils.py:211:29: E203 whitespace before ':'
cephadmlib/net_utils.py:259:25: E203 whitespace before ':'
cephadmlib/net_utils.py:272:27: E203 whitespace before ':'

but removing the whitespace before the ":" would
cause black to complain. For parse_mon_ip and
parse_mon_addrv, it was doing array slicing with
a start of "0" so I believe we can just remove the
start point without affecting anything (since "0" is
just the beginning of the string anyway). For
get_ipv6_address it had to actually be altered in
a way that had the potential to be done incorrectly,
so I added a unit test for it in a previous commit
in order to make sure we maintain the behavior.
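
To illustrate why dropping the "0" start is safe, here is a small
Python sketch. The two helper functions are hypothetical and are not
the cephadm code; they only show that both slice spellings return the
same result.

```python
def head_explicit(text: str, offset: int) -> str:
    # Explicit start of 0. black spaces the colon in this form,
    # and flake8 then reports E203.
    return text[0 : offset + 1]


def head_implicit(text: str, offset: int) -> str:
    # Start omitted: same slice, with no whitespace before the ':'.
    return text[: offset + 1]
```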

Signed-off-by: Adam King <[email protected]>

doc/architecture: edit several sections

Edit the following sections in doc/architecture.rst:

 1. Dynamic Cluster Management
 2. About Pools
 3. Mapping PGs to OSDs

The tone of "Dynamic Cluster Management" remains a bit too close to the
tone of marketing material, in my opinion, but I will return to firm it
up when I have finished a once-over of architecture.rst.

Co-authored-by: Anthony D'Atri <[email protected]>
Signed-off-by: Zac Dover <[email protected]>

mgr: add throttle policy for DaemonServer

This commit fixes the throttle parameters of the osd not taking effect for the mgr.

Fixes: https://tracker.ceph.com/issues/61942

Signed-off-by: ericqzhao <[email protected]>

osd: Report health error if OSD public address is not within subnet

In a containerized environment after an OSD node reboot, due to
some race condition in systemd, some OSDs registered their
v1/v2 public addresses on the cluster network instead of on the
defined public_network. Report this inconsistency as a health
error, as RADOS clients fail to connect to the cluster.

Fixes: https://tracker.ceph.com/issues/56057

Signed-off-by: Prashant D <[email protected]>

RGW | Bucket Notification: migrating old entries to support persistency control

Signed-off-by: Ali Masarwa <[email protected]>

test: corrected control reaches end by adding a return

Signed-off-by: Patty8122 <[email protected]>

doc/architecture: edit "Calculating PG IDs"

Edit the section "Calculating PG IDs" in doc/architecture.rst.

Signed-off-by: Zac Dover <[email protected]>

cephadm: fix haproxy version with certain containers

Some builds of haproxy containers' output
from "haproxy -v" start with

HAProxy version

rather than

HA-Proxy version

no reason on our end not to accept both
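
A minimal sketch of accepting both prefixes with one regular
expression. The function name and the sample version strings are
illustrative assumptions, not the cephadm implementation or real
container output.

```python
import re
from typing import Optional

# The optional hyphen covers both "HA-Proxy version" and "HAProxy version".
_VERSION_RE = re.compile(r'^HA-?Proxy version (\S+)')


def parse_haproxy_version(output: str) -> Optional[str]:
    """Return the version token from `haproxy -v` style output, if any."""
    match = _VERSION_RE.match(output.strip())
    return match.group(1) if match else None
```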

Signed-off-by: Adam King <[email protected]>

cephadm: remove get_unit_name_by_instance func

As it is one line, quite simple, and only
had a single caller, it was decided we'd remove
this function as part of the cephadm refactor.

Signed-off-by: Adam King <[email protected]>

rgw: Fix bucket validation against POST policies

It's possible that user could provide a form part as a part of a POST
object upload that uses 'bucket' as a key; in this case, it was
overriding what was being set in the validation env (which is the real
bucket being modified). The result of this is that a user could actually
upload to any bucket accessible by the specified access key by matching
the bucket in the POST policy in said POST form part.

Fix this simply by setting the bucket to the correct value after the
POST form parts are processed, ignoring the form part above if
specified.
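
The ordering fix can be sketched in Python as follows. The function
and the dict-based "validation env" are hypothetical simplifications
of the RGW code; only the idea of assigning the real bucket after the
form parts have been processed comes from this commit.

```python
def build_policy_env(real_bucket, form_parts):
    """Build the env checked against the POST policy conditions."""
    env = {}
    for key, value in form_parts.items():
        env[key.lower()] = value
    # Set the bucket last so a user-supplied 'bucket' form part
    # cannot override the bucket actually being modified.
    env['bucket'] = real_bucket
    return env
```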

Fixes: https://tracker.ceph.com/issues/63004

Signed-off-by: Joshua Baergen <[email protected]>

rgw: fix unwatch crash at radosgw startup

During radosgw initialization, if an exception in init_watch causes the watcher registration to fail,
a crash occurs when finalize_watch is executed, because it tries to unregister a watch that was never registered.

Fixes: https://tracker.ceph.com/issues/60094

Signed-off-by: lichaochao <[email protected]>

rgw/async: use optional_yield for keystone and kms requests

Signed-off-by: Casey Bodley <[email protected]>

rgw/keystone: EC2Engine uses reject() for ERR_SIGNATURE_NO_MATCH

ERR_SIGNATURE_NO_MATCH means that we found the given access key in
keystone, so we should use reject() instead of deny() to prevent
other engines like LocalEngine from looking up the access key again

this change causes us to return the SignatureDoesNotMatch error expected
by s3test case test_list_buckets_bad_auth()
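
A simplified model of the difference between deny and reject in a
chain of auth engines, written in Python for illustration. The `Status`
enum, the `authenticate` loop and the two example engines are
hypothetical; in particular, the "InvalidAccessKeyId" error returned by
the local engine is an assumption about what a later lookup might
report.

```python
from enum import Enum


class Status(Enum):
    GRANTED = 1
    DENIED = 2    # this engine cannot decide; let the next engine try
    REJECTED = 3  # definitive failure; stop the chain


def authenticate(engines, request):
    """Run engines in order and return (status, error)."""
    last_error = None
    for engine in engines:
        status, error = engine(request)
        if status is Status.GRANTED:
            return status, None
        if status is Status.REJECTED:
            return status, error  # later engines are not consulted
        last_error = error
    return Status.DENIED, last_error


def keystone_bad_signature(request):
    # The access key was found, but the signature did not match.
    return Status.REJECTED, "SignatureDoesNotMatch"


def local_unknown_key(request):
    return Status.DENIED, "InvalidAccessKeyId"
```

With the reject, the signature error from the first engine is the one
the client sees, and the second engine never looks the key up again.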

Fixes: https://tracker.ceph.com/issues/62989

Signed-off-by: Casey Bodley <[email protected]>

rgw/multisite: call drain before flushing markers in incremental sync

Signed-off-by: Shilpa Jagannath <[email protected]>

rgw: fix rgw rate limiting RGWRateLimitInfo class decode_json max_read_bytes and max_write_bytes field mismatch

Fixes: https://tracker.ceph.com/issues/62955
Signed-off-by: xiangrui meng <[email protected]>

rgw: s3website doesn't prefetch for web_dir() check

this function only needs to check for existence of the given path.
the sal::Object is destroyed before the function returns, so it's
wasteful to prefetch its data

Fixes: https://tracker.ceph.com/issues/62938

Signed-off-by: Casey Bodley <[email protected]>

rgw: fix SignatureDoesNotMatch when extra headers

Headers that start with 'x-amz' but not with 'x-amz-' should not be included in the list of CanonicalHeaders.
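
A minimal sketch of the prefix check, assuming a hypothetical helper name and an example header (`x-amzn-trace-id`) that is not taken from the commit. It only models the prefix test, not the full SigV4 canonicalization.

```python
def is_amz_header(name: str) -> bool:
    """Return True only for headers in the 'x-amz-' namespace."""
    return name.lower().startswith("x-amz-")


def is_amz_header_loose(name: str) -> bool:
    """The looser check, which also matches names such as 'x-amzn-...'."""
    return name.lower().startswith("x-amz")


headers = ["X-Amz-Date", "x-amz-content-sha256", "x-amzn-trace-id"]
strict = [h for h in headers if is_amz_header(h)]
loose = [h for h in headers if is_amz_header_loose(h)]
```

The loose check picks up an extra header that the client did not sign, which is one way the two sides can end up with different canonical requests.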

Signed-off-by: rui ma <[email protected]>

rgw: improve the efficiency of buffer list utilization of chunk upload

Reduce waste of buffer::ptr by receiving multiple chunks and filling them into the buffer.

AWSv4ComplMulti::recv_body() receives just one chunk and fills it into the buffer.
Each 4MB buffer therefore holds only 64KB of data, leading to frequent buffer allocations.
~800GB of virtual memory consumption has been observed.
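
The effect on allocation counts can be estimated with a short calculation. The sketch assumes that "4MB" and "64KB" mean 4 MiB and 64 KiB, and `buffers_needed` is a hypothetical helper, not part of RGW.

```python
BUFFER_SIZE = 4 * 1024 * 1024  # 4 MiB buffer (assumed binary units)
CHUNK_SIZE = 64 * 1024         # 64 KiB chunk (assumed binary units)


def buffers_needed(total_bytes, fill_buffer):
    """Count buffer allocations needed to receive total_bytes."""
    chunks = -(-total_bytes // CHUNK_SIZE)  # ceiling division
    if not fill_buffer:
        return chunks  # one buffer allocated per chunk
    chunks_per_buffer = BUFFER_SIZE // CHUNK_SIZE
    return -(-chunks // chunks_per_buffer)


one_gib = 1024 ** 3
before = buffers_needed(one_gib, fill_buffer=False)
after = buffers_needed(one_gib, fill_buffer=True)
```

Under these assumptions, filling each buffer completely cuts the number of allocations by a factor of 64.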

Signed-off-by: liubingrun <[email protected]>

rgw/lc: remove_bucket_config() doesn't update xattrs on bucket delete

we're deleting the bucket instance metadata anyway, so there's no reason
to send an additional write to remove the RGW_ATTR_LC xattr first. this
write bumps the cls_version and can cause the actual delete op to fail
with ECANCELED

Fixes: https://tracker.ceph.com/issues/62411

Signed-off-by: Casey Bodley <[email protected]>

rgw/lc: bucket delete only calls remove_bucket_config() if RGW_ATTR_LC

if there's no RGW_ATTR_LC, don't try to do any lifecycle-related cleanup

Signed-off-by: Casey Bodley <[email protected]>

rgw/file: make setattr(...) a no-op on buckets

Shallow fix for apparent unstable behavior after nfs "chown" on
an RGW bucket via RGW NFS.  While we allow buckets to be created
(and subject to ordinary rules, deleted), chown against a bucket
hasn't been tested and potentially is not valid.  Prevent it
altogether for now--if permissions would allow it, chown will
succeed but won't have any effect.

Fixes: https://tracker.ceph.com/issues/61689

Signed-off-by: Matt Benjamin <[email protected]>

rgw/ops-log: explicitly specify object name in the log entry

Pseudo-directories can be used when naming an object (key): e.g.,
"my/weird/object.txt". This would lead to cases where the name of the object
cannot be derived deterministically. For example, if an object name uses
pseudo-directories and starts with "<bucket_name>/" and virtual-host style is
used when accessing object, we cannot figure out the name of the object. Note
that in DNS (virtual host) style bucket naming, since bucket name is specified as
a part of the host name, URI doesn't contain any reference to the bucket.

import boto3
from botocore.config import Config

boto_client = boto3.client(..., config=Config(s3={'addressing_style': 'virtual'}))

boto_client.put_object(
        Body=b"this is the data",
        Key="my-bucket/my-object",
        Bucket="my-bucket"
)

The corresponding log entry is

{   ...,
    "bucket":"my-bucket", ...,
    "uri":"PUT /my-bucket/my-object HTTP/1.1",
    ...
}

We can falsely conclude that the name of the object is "my-object".

By having the name of the object listed in the log-entry, we address
this ambiguity.
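
The ambiguity can be reproduced with a small sketch. It assumes, as the sample entry above suggests, that the logged URI always carries the bucket as its first path component, whichever addressing style the client used. `logged_uri_path` is a hypothetical helper that ignores URL encoding and query strings.

```python
def logged_uri_path(bucket, key):
    """Path recorded in the ops-log URI field for an object request."""
    return "/" + bucket + "/" + key


def naive_object_name(bucket, path):
    """Guess the object name by stripping every leading '<bucket>/'."""
    name = path.lstrip("/")
    while name.startswith(bucket + "/"):
        name = name[len(bucket) + 1:]
    return name


plain = logged_uri_path("my-bucket", "my-object")
# A client that percent-encodes the slash in the key would log the second
# object as ".../my-bucket%2Fmy-object"; the log alone cannot tell apart
# "my-object" and a key that itself begins with "my-bucket/".
tricky_key = "my-bucket/my-object"
guess = naive_object_name("my-bucket", "/my-bucket/" + tricky_key)
```

The guess collapses the key to "my-object", which is why the log entry now states the object name explicitly.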

Signed-off-by: Oguzhan Ozmen <[email protected]>

qa/suites/krbd: rename singleton to singleton-msgr-failures

A "singleton without msgr-failures" is wanted in the next commit.

Signed-off-by: Ilya Dryomov <[email protected]>

qa/suites/krbd: stress test for recovering from watch errors

Fixes: https://tracker.ceph.com/issues/63010
Signed-off-by: Ilya Dryomov <[email protected]>

rgw/test: fix compiler warning

Fixing a compiler warning regarding ambiguity of
the overloaded operator '==' (as it allows a one-sided
const operand)

Signed-off-by: Ronen Friedman <[email protected]>

cls_lock: expired lock before unlock and start check

If the lock expired, the stat check shouldn't return -ENOENT.
Change the lock duration to prevent the lock from expiring before the
stat check.

Fixes: https://tracker.ceph.com/issues/56575
Signed-off-by: Nitzan Mordechai <[email protected]>

osd: correct unsigned/signed compiler warning

    /home/pdonnell/ceph/src/osd/OSD.cc: In member function ‘void OSD::ShardedOpWQ::stop_for_fast_shutdown()’:
    /home/pdonnell/ceph/src/osd/OSD.cc:11143:41: warning: comparison of integer expressions of different signedness: ‘int’ and ‘uint32_t’ {aka ‘unsigned int’} [-Wsign-compare]
    11143 |   for (int shard_index = 0; shard_index < osd->num_shards; shard_index++) {

Fixes: https://tracker.ceph.com/issues/62851
Fixes: 210dbd4ff19ea66fd2f0109cc15aad53349be52f
Signed-off-by: Patrick Donnelly <[email protected]>

mgr/dashboard: fix cephfs form validator

Number is not allowed as the starting character of the mds service

Fixes: https://tracker.ceph.com/issues/63005
Signed-off-by: Nizamudeen A <[email protected]>

mgr/dashboard: allow tls 1.2 with a config option

Provide the option to allow tls1.2

`ceph dashboard set-enable-unsafe-tls-v1-2 True` followed by a mgr
restart will enable tls 1.2.

With tls1.2 enabled
```
╰─$ nmap -sV --script ssl-enum-ciphers -p 11000 127.0.0.1
Starting Nmap 7.93 ( https://nmap.org ) at 2023-09-27 16:56 IST
Nmap scan report for localhost (127.0.0.1)
Host is up (0.00018s latency).

PORT      STATE SERVICE  VERSION
11000/tcp open  ssl/http CherryPy wsgiserver
|_http-server-header: Ceph-Dashboard
| ssl-enum-ciphers:
|   TLSv1.2:
|     ciphers:
|       TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (ecdh_x25519) - A
|       TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305_SHA256 (ecdh_x25519) - A
|       TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (ecdh_x25519) - A
|       TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256 (ecdh_x25519) - A
|       TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA (ecdh_x25519) - A
|       TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA (ecdh_x25519) - A
|       TLS_RSA_WITH_AES_256_GCM_SHA384 (rsa 2048) - A
|       TLS_RSA_WITH_AES_256_CCM (rsa 2048) - A
|       TLS_RSA_WITH_AES_128_GCM_SHA256 (rsa 2048) - A
|       TLS_RSA_WITH_AES_128_CCM (rsa 2048) - A
|       TLS_RSA_WITH_AES_256_CBC_SHA256 (rsa 2048) - A
|       TLS_RSA_WITH_AES_128_CBC_SHA256 (rsa 2048) - A
|       TLS_RSA_WITH_AES_256_CBC_SHA (rsa 2048) - A
|       TLS_RSA_WITH_AES_128_CBC_SHA (rsa 2048) - A
|     compressors:
|       NULL
|     cipher preference: server
|   TLSv1.3:
|     ciphers:
|       TLS_AKE_WITH_AES_256_GCM_SHA384 (ecdh_x25519) - A
|       TLS_AKE_WITH_CHACHA20_POLY1305_SHA256 (ecdh_x25519) - A
|       TLS_AKE_WITH_AES_128_GCM_SHA256 (ecdh_x25519) - A
|       TLS_AKE_WITH_AES_128_CCM_SHA256 (ecdh_x25519) - A
|     cipher preference: server
|_  least strength: A

Service detection performed. Please report any incorrect results at https://nmap.org/submit/ .
Nmap done: 1 IP address (1 host up) scanned in 16.55 seconds
```

Without tls1.2 enabled (which defaults to tls 1.3)
```
╰─$ nmap -sV --script ssl-enum-ciphers -p 11000 127.0.0.1
Starting Nmap 7.93 ( https://nmap.org ) at 2023-09-27 16:54 IST
Nmap scan report for localhost (127.0.0.1)
Host is up (0.000075s latency).

PORT      STATE SERVICE  VERSION
11000/tcp open  ssl/http CherryPy wsgiserver
| ssl-enum-ciphers:
|   TLSv1.3:
|     ciphers:
|       TLS_AKE_WITH_AES_256_GCM_SHA384 (ecdh_x25519) - A
|       TLS_AKE_WITH_CHACHA20_POLY1305_SHA256 (ecdh_x25519) - A
|       TLS_AKE_WITH_AES_128_GCM_SHA256 (ecdh_x25519) - A
|       TLS_AKE_WITH_AES_128_CCM_SHA256 (ecdh_x25519) - A
|     cipher preference: server
|_  least strength: A
|_http-server-header: Ceph-Dashboard
```
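
The two configurations shown above correspond to different minimum protocol versions. The sketch below uses Python's standard `ssl` module to show the idea; it is not the dashboard's actual code, and `make_context` and its `allow_tls_1_2` parameter are hypothetical names.

```python
import ssl


def make_context(allow_tls_1_2: bool) -> ssl.SSLContext:
    """Build a server-side context with the chosen minimum TLS version."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    if allow_tls_1_2:
        # Opt-in: accept TLS 1.2 as well as TLS 1.3.
        ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    else:
        # Default in this sketch: TLS 1.3 only.
        ctx.minimum_version = ssl.TLSVersion.TLSv1_3
    return ctx


strict_ctx = make_context(allow_tls_1_2=False)
relaxed_ctx = make_context(allow_tls_1_2=True)
```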

Fixes: https://tracker.ceph.com/issues/62940
Signed-off-by: Nizamudeen A <[email protected]>

mgr/dashboard: fix the landing page layout issues

We were following a row-col grid layout for the landing page.
First row includes Details, Status and Capacity
Second row for Inventory and Cluster Utilization

So if one of the items in the first row grows, it pushes the entire
second row downwards.

To fix this, I made a col-row grid.

First col has Details and Inventory in two rows.
Second col has Status and Capacity as a col and Cluster Utilization as a
single row

Fixes: https://tracker.ceph.com/issues/62961

Signed-off-by: Nizamudeen A <[email protected]>
Co-authored-by: cloudbehl <[email protected]>

mds/FSMap: allow upgrades if no up mds

This is to support the fail_fs scenario for cephadm where max_mds >= 1
and all MDS are down.

Fixes: https://tracker.ceph.com/issues/62682
Signed-off-by: Patrick Donnelly <[email protected]>

cephadm: start ssh.py in cephadmlib

As part of the cephadm refactoring process
to split cephadm into multiple python files,
start "ssh.py" that includes some functions used
for setting up and testing ssh connections,
primarily as part of bootstrap.

Signed-off-by: Adam King <[email protected]>

cephadm: format black cephadmlib/ssh.py

Signed-off-by: Adam King <[email protected]>

mgr/dashboard: show a message to restart the rgw daemons after moving from single-site to multi-site

Fixes: https://tracker.ceph.com/issues/62984

Signed-off-by: Aashish Sharma <[email protected]>

mgr/dashboard: enable protect option if layering enabled

Fixes: https://tracker.ceph.com/issues/63076
Signed-off-by: avanthakkar <[email protected]>

osd: fix read balancer logic to avoid redundant primary assignment

Fixes: https://tracker.ceph.com/issues/62833
Signed-off-by: Laura Flores <[email protected]>

osd/OSDMonitor: check svc is writeable before changing pending

Fixes: https://tracker.ceph.com/issues/59813
Signed-off-by: Patrick Donnelly <[email protected]>

mon: refactor loop variable names

To make it easier to read.

Signed-off-by: Patrick Donnelly <[email protected]>

mgr/dashboard: Rgw Multi-site naming improvements

Fixes: https://tracker.ceph.com/issues/62721

Signed-off-by: Aashish Sharma <[email protected]>

mgr/dashboard: rbd image hide usage bar when disk usage is not provided

Fixes: https://tracker.ceph.com/issues/63037
Signed-off-by: Pedro Gonzalez Gomez <[email protected]>

mds: add option mds_bal_overload_epochs

Add an option to configure the number of epochs an overload must last before migrating.
Setting it to a higher value can avoid frequent migrations caused by load fluctuations.
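
The idea of requiring a sustained overload can be sketched as follows. This is a simplified model rather than the MDS balancer's implementation; `should_migrate`, `threshold` and the sample loads are hypothetical, and `overload_epochs` is assumed to be at least 1.

```python
def should_migrate(load_history, threshold, overload_epochs):
    """Return True once load exceeded threshold for the last N epochs."""
    if len(load_history) < overload_epochs:
        return False
    recent = load_history[-overload_epochs:]
    return all(load > threshold for load in recent)


fluctuating = [1.2, 0.8, 1.3]  # overloaded, recovered, overloaded again
sustained = [0.8, 1.2, 1.3]    # overloaded for two epochs in a row

eager = should_migrate(fluctuating, threshold=1.0, overload_epochs=1)
patient = should_migrate(fluctuating, threshold=1.0, overload_epochs=2)
steady = should_migrate(sustained, threshold=1.0, overload_epochs=2)
```

With a window of one epoch the fluctuating load triggers a migration; with a window of two it does not, while a sustained overload still does.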

Signed-off-by: Zhansong Gao <[email protected]>

mds: fix stray CInodes' use-after-free bug when submit ELid entry

Submitting a journal log entry could start a new segment and
advance the stray CInodes, which have been released just before
it. Just skip advancing the stray dentries when the MDS is
shutting down.

Reported-by: Patrick Donnelly <[email protected]>
Fixes: commit 5a537476544("mds: introduce ELid event to create/close log")
Fixes: https://tracker.ceph.com/issues/62861
Signed-off-by: Xiubo Li <[email protected]>

doc/rados: edit ops/control.rst (2 of x)

Edit doc/rados/operations/control.rst (2 of x).

Co-authored-by: Cole Mitchell <[email protected]>
Co-authored-by: Anthony D'Atri <[email protected]>
Signed-off-by: Zac Dover <[email protected]>

mgr/dashboard: Fix user/bucket count in rgw overview dashboard

Do not consider the bucket/user counts from daemons that have a similar realm
name

Fixes: https://tracker.ceph.com/issues/62964

Signed-off-by: Aashish Sharma <[email protected]>

test/allocator_replay_test: add assess_free command.

This makes it possible to estimate the amount of free space for the given
allocation unit.

Signed-off-by: Igor Fedotov <[email protected]>

test/hybrid_allocator_test: a couple broken cases

Signed-off-by: Igor Fedotov <[email protected]>

os/bluestore: fix edge case for bitmap alloc's claim_free_to_left(0)
call.

This improperly marked the first 64 chunks as allocated.
Apparently not critical for production since offset(0) is never
released.

Signed-off-by: Igor Fedotov <[email protected]>

os/bluestore: Hybrid Allocator might unexpectedly return ENOSPC

This happened when the secondary allocator returned no additional
extents while the primary one still provided some, but fewer than requested.
This is a rather non-critical issue, but it violated our informal
agreement that allocators can return less space than requested.
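
The behaviour can be modelled with a small sketch. It is not BlueStore code: `hybrid_allocate`, its parameters and the `fixed` switch are hypothetical, and the allocators are reduced to free-space counters.

```python
import errno


def hybrid_allocate(want, primary_free, secondary_free, fixed=True):
    """Return the amount allocated, or -ENOSPC if nothing is available."""
    from_primary = min(want, primary_free)
    from_secondary = min(want - from_primary, secondary_free)
    if not fixed and from_primary < want and from_secondary == 0:
        # Behaviour described above: the partial result from the
        # primary allocator was discarded.
        return -errno.ENOSPC
    allocated = from_primary + from_secondary
    return allocated if allocated > 0 else -errno.ENOSPC


old_result = hybrid_allocate(100, primary_free=30, secondary_free=0, fixed=False)
new_result = hybrid_allocate(100, primary_free=30, secondary_free=0, fixed=True)
empty_result = hybrid_allocate(100, primary_free=0, secondary_free=0, fixed=True)
```

Returning the partial amount keeps the convention that callers must handle short allocations, and reserves ENOSPC for the case where nothing could be allocated.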

Signed-off-by: Igor Fedotov <[email protected]>

test/allocator_replay_test: proper command line options setup

Signed-off-by: Igor Fedotov <[email protected]>

mgr/dashboard: fix bootstrap script for cephadm installation

Fixes: https://tracker.ceph.com/issues/62827
Signed-off-by: avanthakkar <[email protected]>

dashboard: regression, make install fails w/dashboard disabled

https://tracker.ceph.com/issues/63100

Signed-off-by: Matt Benjamin <[email protected]>

mgr/dashboard: fix rgw inventory card and broken shadows

Fixes a regression introduced by the dashboard landing page layout fixes PR

Fixes: http://tracker.ceph.com/issues/62961
Signed-off-by: Nizamudeen A <[email protected]>

RGW: add the missing help print for command 'topic stats'

Signed-off-by: Ali Masarwa <[email protected]>

rgw: Add coverity annotation for warning about tautological comparison

Signed-off-by: Vedansh Bhartia <[email protected]>

doc/rados: edit troubleshooting.rst

Edit doc/rados/troubleshooting.rst to remove some language that sounds
quite close to marketing language.

Signed-off-by: Zac Dover <[email protected]>

vstart: exclude default route during cluster setup

"ip route list" may list default route, and that needs to be excluded
during cluster setup.
Typical output of ip route list:
$ ip route list
default via 10.8.159.254 dev eno1 proto dhcp src 10.8.152.13 metric 100
10.8.152.0/21 dev eno1 proto kernel scope link src 10.8.152.13 metric 100
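
The filtering can be sketched in Python against the sample output above. vstart itself is a shell script, so this is only an illustration of the logic; `local_networks` is a hypothetical helper.

```python
def local_networks(ip_route_output):
    """Return route destinations, skipping the default route."""
    networks = []
    for line in ip_route_output.splitlines():
        fields = line.split()
        if not fields or fields[0] == "default":
            continue  # blank line or default route: not a local network
        networks.append(fields[0])
    return networks


sample = (
    "default via 10.8.159.254 dev eno1 proto dhcp src 10.8.152.13 metric 100\n"
    "10.8.152.0/21 dev eno1 proto kernel scope link src 10.8.152.13 metric 100\n"
)
networks = local_networks(sample)
```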

Signed-off-by: Sachin Punadikar <[email protected]>

mgr/dashboard: fixed cephfs snapshot & Quota list

fixes: https://tracker.ceph.com/issues/63007

Signed-off-by: cloudbehl <[email protected]>

mds: disable delegating inode ranges to clients

Fixes: http://tracker.ceph.com/issues/63103
Signed-off-by: Venky Shankar <[email protected]>

qa: start testing mds_client_delegate_inos_pct config

Signed-off-by: Venky Shankar <[email protected]>

PendingReleaseNotes: add a note about disallowing delegating inodes

Signed-off-by: Venky Shankar <[email protected]>

cephadm: add some unit test coverage for deploying nfs, snmp

Signed-off-by: John Mulligan <[email protected]>

cephadm: add daemon_form.py: bases and funcs for daemon forms

Create daemon_form.py containing the DaemonForm class and a few
subclasses and utility functions for working with DaemonForms.
In a future commit, DaemonForm will become the base class for
the current assortment of classes named after the daemon or
family of daemon they help manage.

A daemon form, think "form" as in "template" or "mold", assists
in setting up, creating, and managing daemons controlled with
cephadm. Because cephadm supports a variety of services the
DaemonForm is an abstract base class and the module also supports
additional ABCs that may be used by DaemonForms to implement
optional features.

The daemon forms that are expected to be used directly must be
registered using the provided decorator. This is an explicit extra
step so that common bases that inherit from DaemonForm can be
implemented. Plus explicit is better than implicit. :-)
All DaemonForm subclasses are expected to provide a small set
of standard methods so that the types can be chosen, instantiated,
and used in a common manner.

Signed-off-by: John Mulligan <[email protected]>

cephadm: introduce daemon forms to cephadm.py

Introduce the DaemonForm base class to cephadm.py and make various
daemon-type classes into fully fledged daemon form classes.

Some classes already had a semi-standard `init` classmethod for
instantiation. In these cases the new `create` classmethod is a thin
wrapper over the existing method. In cases where the class was not
already being instantiated a minimal set of methods are added.

Signed-off-by: John Mulligan <[email protected]>

cephadm: add test_daemon_form.py

Signed-off-by: John Mulligan <[email protected]>

cephadm: remove direct daemon-type deps from sysctl

Using the appropriate daemon form we can break the direct dependency
that the sysctl setup function has on particular classes and use
a generic interface.

Signed-off-by: John Mulligan <[email protected]>

cephadm: move sysctl specific functions to sysctl.py

Signed-off-by: John Mulligan <[email protected]>

cephadm: remove direct daemon-class deps from firewall

Signed-off-by: John Mulligan <[email protected]>

cephadm: move firewalld related items to firewalld.py

Signed-off-by: John Mulligan <[email protected]>

cephadm: move DeploymentType to deploy.py

The DeploymentType is used by a number of other classes and functions
and has no dependencies beyond enum and is safe to move.

Signed-off-by: John Mulligan <[email protected]>

cephadm: add ContainerDaemonForm

Add a supplemental DaemonForm subclass that helps deploy container
based daemons in a standard fashion. Most of these methods are
optional and should have sensible defaults.

Signed-off-by: John Mulligan <[email protected]>

cephadm: add func to deploy any generic ContainerDaemonForm

While there are no ContainerDaemonForms implemented yet, add a function
that uses the ContainerDaemonForm methods to construct a deployment
for the container based daemons.

Signed-off-by: John Mulligan <[email protected]>

cephadm: convert NFSGanesha to a ContainerDaemonForm

Signed-off-by: John Mulligan <[email protected]>

cephadm: convert CustomContainer to a ContainerDaemonForm

Signed-off-by: John Mulligan <[email protected]>

cephadm: convert SNMPGateway to a ContainerDaemonForm

Signed-off-by: John Mulligan <[email protected]>

cephadm: convert cephadm agent to a daemon form

The cephadm agent is a bit special in that it will not be converted
to a ContainerDaemonForm (it is not containerized), but we still want
to have it registered as a DaemonForm so that the daemon_type can be
passed to create and have it resolve correctly.

Signed-off-by: John Mulligan <[email protected]>

ceph orch add fails when ipv6 address is surrounded by square brackets.
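
One way to accept bracketed IPv6 addresses is to strip the brackets before validating the address. The sketch below uses Python's standard `ipaddress` module; `normalize_addr` is a hypothetical helper and not necessarily how this change implements the fix.

```python
import ipaddress


def normalize_addr(addr: str) -> str:
    """Strip optional square brackets and validate the IP address."""
    if addr.startswith("[") and addr.endswith("]"):
        addr = addr[1:-1]
    # Raises ValueError if the remainder is not a valid IPv4/IPv6 address.
    return str(ipaddress.ip_address(addr))


bracketed = normalize_addr("[2001:db8::1]")
bare = normalize_addr("2001:db8::1")
v4 = normalize_addr("192.0.2.10")
```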

fixes: https://tracker.ceph.com/issues/61885
fixes: https://bugzilla.redhat.com/show_bug.cgi?id=2153448

Signed-off-by: Teoman ONAY <[email protected]>

doc: remove egg fragment from dev/developer_guide/running-tests-locally

DEPRECATION: git+https://github.com/ceph/teuthology#egg=teuthology
[test] contains an egg fragment with a non-PEP 508 name pip 25.0 will enforce
this behaviour change. A possible replacement is to use the req @ url syntax,
and remove the egg fragment. Discussion can be found at
https://github.com/pypa/pip/issues/11617

Signed-off-by: Dhairya Parmar <[email protected]>

rgw: Add coverity annotations for missing mutex locks

Signed-off-by: Vedansh Bhartia <[email protected]>

osd: fix: slow scheduling when item_cost is large

We use the iops and bandwidth tested by
`ceph tell osd.0 bench 10737418240 204800 204800 100`
to verify the QoS function. iops was 400 and bandwidth was 80MiB/s.
When osd_mclock_scheduler_client_lim is set to 1,
the sequential write bandwidth is only half of the capacity.
Therefore, we believe that it should not unconditionally increase
osd_bandwidth_cost_per_io for each IO, but take the maximum of the two.
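
The numbers above can be checked with a short calculation. The sketch assumes that osd_bandwidth_cost_per_io equals bandwidth divided by IOPS and that the old behaviour added it to the IO size; both are interpretations of the description, and the function names are hypothetical.

```python
MIB = 1024 * 1024


def cost_additive(cost_per_io, io_size):
    """Assumed old behaviour: per-IO cost is always added to the size."""
    return cost_per_io + io_size


def cost_max(cost_per_io, io_size):
    """Proposed behaviour: take the larger of the two."""
    return max(cost_per_io, io_size)


cost_per_io = 80 * MIB / 400  # 80 MiB/s at 400 IOPS, from the bench run
io_size = 204800              # block size used in the bench command

slowdown = cost_additive(cost_per_io, io_size) / cost_max(cost_per_io, io_size)
```

Under these assumptions the additive cost is almost twice the maximum, which is consistent with the observed sequential write bandwidth being only about half of the capacity.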

Fixes: https://tracker.ceph.com/issues/62812
co-author: yanghonggang <[email protected]>
co-author: zhangjianwei <[email protected]>
Signed-off-by: Jrchyang Yu <[email protected]>

script: add option for debug build

See: https://github.com/ceph/ceph-build/pull/2167

Signed-off-by: Patrick Donnelly <[email protected]>

doc/architecture: edit "Peering and Sets"

Edit the English in the section "Peering and Sets" in the file
doc/architecture.rst.

Co-authored-by: Anthony D'Atri <[email protected]>
Signed-off-by: Zac Dover <[email protected]>

doc/architecture: repair RBD sentence

Improve an ambiguous sentence in doc/architecture.rst.

The problem presented by the original sentence is that the phrasal verb
"to provide with" is implicated in one of its possible readings.
Interpreted in that way, the sentence seems to express the incorrect
idea that RBD furnishes block devices with snapshotting and cloning, as
though snapshotting and cloning are being delivered to the block
devices. In fact, snapshotting and cloning are just features of RBD, and
are features that are described on this page:
https://docs.ceph.com/en/quincy/rbd/rbd-snapshot/.

Signed-off-by: Zac Dover <[email protected]>

doc/rados: edit troubleshooting-mon.rst (3 of x)

Edit doc/rados/troubleshooting/troubleshooting-mon.rst.

Follows https://github.com/ceph/ceph/pull/52827

Co-authored-by: Anthony D'Atri <[email protected]>
Signed-off-by: Zac Dover <[email protected]>

doc/rados: edit troubleshooting/community.rst

Edit doc/rados/troubleshooting/community.rst.

Co-authored-by: Anthony D'Atri <[email protected]>
Signed-off-by: Zac Dover <[email protected]>

docs/cephadm: fix broken links in cephadm docs

In the documentation https://docs.ceph.com/en/quincy/cephadm/services/osd/, there is a broken link in the "List Devices" chapter. This change fixes the documentation to point to the correct internal link under ceph/doc/rados/operations/devices

Fixes: https://tracker.ceph.com/issues/55763

Signed-off-by: sparvekar <[email protected]>
shrutiparvekar committed Oct 8, 2023
1 parent 9fedc1e commit 02f66ac
Showing 307 changed files with 12,322 additions and 34,331 deletions.
8 changes: 8 additions & 0 deletions PendingReleaseNotes
@@ -50,6 +50,9 @@
recommend that users with versioned buckets, especially those that existed
on prior releases, use these new tools to check whether their buckets are
affected and to clean them up accordingly.
CephFS: Disallow delegating preallocated inode ranges to clients. Config
`mds_client_delegate_inos_pct` defaults to 0 which disables async dirops
in the kclient.

>=18.0.0

@@ -231,6 +234,11 @@
also change read/write permissions in a capability that the entity already
holds. If the capability passed by user is same as one of the capabilities
that the entity already holds, idempotency is maintained.
* `ceph config dump --format <json|xml>` output will display the localized
option names instead of its normalized version. For e.g.,
"mgr/prometheus/x/server_port" will be displayed instead of
"mgr/prometheus/server_port". This matches the output of the non pretty-print
formatted version of the command.

>=17.2.1

1 change: 0 additions & 1 deletion debian/ceph-base.docs

This file was deleted.

1 change: 1 addition & 0 deletions debian/ceph-mon.postinst
@@ -1,3 +1,4 @@
#!/bin/sh
# vim: set noet ts=8:
# postinst script for ceph-mon
#
1 change: 1 addition & 0 deletions debian/ceph-osd.postinst
@@ -1,3 +1,4 @@
#!/bin/sh
# vim: set noet ts=8:
# postinst script for ceph-osd
#
2 changes: 1 addition & 1 deletion debian/compat
@@ -1 +1 @@
-9
+12
18 changes: 7 additions & 11 deletions debian/control
@@ -4,7 +4,7 @@ Priority: optional
Homepage: http://ceph.com/
Vcs-Git: git://github.com/ceph/ceph.git
Vcs-Browser: https://github.com/ceph/ceph
Maintainer: Ceph Maintainers <ceph-maintainers@lists.ceph.com>
Maintainer: Ceph Maintainers <ceph-maintainers@ceph.io>
Uploaders: Ken Dreyer <[email protected]>,
Alfredo Deza <[email protected]>,
Build-Depends: automake,
@@ -20,8 +20,7 @@ Build-Depends: automake,
git,
golang,
gperf,
g++ (>= 7),
hostname <pkg.ceph.check>,
g++ (>= 11),
javahelper,
jq <pkg.ceph.check>,
jsonnet <pkg.ceph.check>,
@@ -136,9 +135,6 @@ Package: ceph-base
Architecture: linux-any
Depends: binutils,
ceph-common (= ${binary:Version}),
debianutils,
findutils,
grep,
logrotate,
parted,
psmisc,
@@ -190,6 +186,7 @@ Package: cephadm
Architecture: linux-any
Recommends: podman (>= 2.0.2) | docker.io | docker-ce
Depends: lvm2,
python3,
${python3:Depends},
Description: cephadm utility to bootstrap ceph daemons with systemd and containers
Ceph is a massively scalable, open-source, distributed
@@ -432,7 +429,6 @@ Depends: ceph-osd (= ${binary:Version}),
e2fsprogs,
lvm2,
parted,
util-linux,
xfsprogs,
${misc:Depends},
${python3:Depends}
@@ -760,7 +756,7 @@ Architecture: any
Section: debug
Priority: extra
Depends: libsqlite3-mod-ceph (= ${binary:Version}),
libsqlite3-0-dbgsym
libsqlite3-0-dbgsym,
${misc:Depends},
Description: debugging symbols for libsqlite3-mod-ceph
A SQLite3 VFS for storing and manipulating databases stored on Ceph's RADOS
@@ -1208,14 +1204,14 @@ Description: Java Native Interface library for CephFS Java bindings
Package: rados-objclass-dev
Architecture: linux-any
Section: libdevel
Depends: librados-dev (= ${binary:Version}) ${misc:Depends},
Depends: librados-dev (= ${binary:Version}), ${misc:Depends},
Description: RADOS object class development kit.
.
This package contains development files needed for building RADOS object class plugins.

Package: cephfs-shell
Architecture: all
Depends: ${misc:Depends}
Depends: ${misc:Depends},
${python3:Depends}
Description: interactive shell for the Ceph distributed file system
Ceph is a massively scalable, open-source, distributed
@@ -1228,7 +1224,7 @@ Description: interactive shell for the Ceph distributed file system

Package: cephfs-top
Architecture: all
Depends: ${misc:Depends}
Depends: ${misc:Depends},
${python3:Depends}
Description: This package provides a top(1) like utility to display various
filesystem metrics in realtime.