Commit

…l/git/axboe/linux-block into rust-dev

Conflicts:
	rust/bindings/bindings_helper.h
fbq committed Jul 9, 2024
2 parents 9df2572 + 83215e8 commit db01a9d
Showing 203 changed files with 6,573 additions and 4,208 deletions.
1 change: 1 addition & 0 deletions .mailmap
@@ -689,6 +689,7 @@ Vivien Didelot <[email protected]> <[email protected]>
Vlad Dogaru <[email protected]> <[email protected]>
Vladimir Davydov <[email protected]> <[email protected]>
Vladimir Davydov <[email protected]> <[email protected]>
Weiwen Hu <[email protected]> <[email protected]>
WeiXiong Liao <[email protected]> <[email protected]>
Wen Gong <[email protected]> <[email protected]>
Wesley Cheng <[email protected]> <[email protected]>
53 changes: 53 additions & 0 deletions Documentation/ABI/stable/sysfs-block
@@ -21,6 +21,59 @@ Description:
device is offset from the internal allocation unit's
natural alignment.

What: /sys/block/<disk>/atomic_write_max_bytes
Date: February 2024
Contact: Himanshu Madhani <[email protected]>
Description:
[RO] This parameter specifies the maximum atomic write
size reported by the device. This parameter is relevant
for merging of writes, where a merged atomic write
operation must not exceed this number of bytes.
This parameter may be greater than the value in
atomic_write_unit_max_bytes as
atomic_write_unit_max_bytes will be rounded down to a
power-of-two and atomic_write_unit_max_bytes may also be
limited by some other queue limits, such as max_segments.
This parameter - along with atomic_write_unit_min_bytes
and atomic_write_unit_max_bytes - will not be larger than
max_hw_sectors_kb, but may be larger than max_sectors_kb.


What: /sys/block/<disk>/atomic_write_unit_min_bytes
Date: February 2024
Contact: Himanshu Madhani <[email protected]>
Description:
[RO] This parameter specifies the smallest block which can
be written atomically with an atomic write operation. All
atomic write operations must begin at a
atomic_write_unit_min boundary and must be multiples of
atomic_write_unit_min. This value must be a power-of-two.


What: /sys/block/<disk>/atomic_write_unit_max_bytes
Date: February 2024
Contact: Himanshu Madhani <[email protected]>
Description:
[RO] This parameter defines the largest block which can be
written atomically with an atomic write operation. This
value must be a multiple of atomic_write_unit_min and must
be a power-of-two. This value will not be larger than
atomic_write_max_bytes.


What: /sys/block/<disk>/atomic_write_boundary_bytes
Date: February 2024
Contact: Himanshu Madhani <[email protected]>
Description:
[RO] A device may need to internally split an atomic write I/O
which straddles a given logical block address boundary. This
parameter specifies the size in bytes of the atomic boundary if
one is reported by the device. This value must be a
power-of-two and at least the size as in
atomic_write_unit_max_bytes.
Any attempt to merge atomic write I/Os must not result in a
merged I/O which crosses this boundary (if any).
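
The relationships stated in the four descriptions above can be checked mechanically. The following is a minimal userspace sketch, not kernel code: `struct atomic_write_limits`, `limits_consistent()` and `write_fits()` are hypothetical names introduced only for illustration, and the checks cover only what the text above states (the kernel may enforce further constraints on atomic writes that this hunk does not describe).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical holder for the four sysfs values; not a kernel structure. */
struct atomic_write_limits {
	uint64_t unit_min_bytes;  /* atomic_write_unit_min_bytes */
	uint64_t unit_max_bytes;  /* atomic_write_unit_max_bytes */
	uint64_t max_bytes;       /* atomic_write_max_bytes */
	uint64_t boundary_bytes;  /* atomic_write_boundary_bytes, 0 if none */
};

static bool is_pow2(uint64_t v)
{
	return v != 0 && (v & (v - 1)) == 0;
}

/* Check the relationships the descriptions state between the limits. */
static bool limits_consistent(const struct atomic_write_limits *l)
{
	/* unit_min and unit_max must both be powers of two. */
	if (!is_pow2(l->unit_min_bytes) || !is_pow2(l->unit_max_bytes))
		return false;
	/* unit_max must be a multiple of unit_min. */
	if (l->unit_max_bytes % l->unit_min_bytes != 0)
		return false;
	/* unit_max will not be larger than atomic_write_max_bytes. */
	if (l->unit_max_bytes > l->max_bytes)
		return false;
	/* A reported boundary is a power of two and at least unit_max. */
	if (l->boundary_bytes != 0 &&
	    (!is_pow2(l->boundary_bytes) ||
	     l->boundary_bytes < l->unit_max_bytes))
		return false;
	return true;
}

/* Would a single write of len bytes at offset satisfy the stated rules? */
static bool write_fits(const struct atomic_write_limits *l,
		       uint64_t offset, uint64_t len)
{
	/* A unit_min of 0 is treated here as "no atomic write support". */
	if (l->unit_min_bytes == 0 || len == 0)
		return false;
	/* Must start on a unit_min boundary and be a multiple of unit_min. */
	if (offset % l->unit_min_bytes != 0 || len % l->unit_min_bytes != 0)
		return false;
	/* Must not exceed the largest atomically writable block. */
	if (len > l->unit_max_bytes)
		return false;
	/* Must not straddle the atomic boundary, if one is reported. */
	if (l->boundary_bytes != 0 &&
	    offset / l->boundary_bytes !=
	    (offset + len - 1) / l->boundary_bytes)
		return false;
	return true;
}
```

For example, with a unit minimum of 4096, a unit maximum of 16384 and a boundary of 65536, an 8192-byte write at offset 61440 is rejected by `write_fits()` because its last byte falls past the 64 KiB boundary, while the same write at offset 4096 is accepted.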


What: /sys/block/<disk>/diskseq
Date: February 2021
49 changes: 3 additions & 46 deletions Documentation/block/data-integrity.rst
@@ -153,18 +153,11 @@ bio_free() will automatically free the bip.
4.2 Block Device
----------------

Because the format of the protection data is tied to the physical
disk, each block device has been extended with a block integrity
profile (struct blk_integrity). This optional profile is registered
with the block layer using blk_integrity_register().

The profile contains callback functions for generating and verifying
the protection data, as well as getting and setting application tags.
The profile also contains a few constants to aid in completing,
merging and splitting the integrity metadata.
Block devices can set up the integrity information in the integrity
sub-struture of the queue_limits structure.

Layered block devices will need to pick a profile that's appropriate
for all subdevices. blk_integrity_compare() can help with that. DM
for all subdevices. queue_limits_stack_integrity() can help with that. DM
and MD linear, RAID0 and RAID1 are currently supported. RAID4/5/6
will require extra work due to the application tag.

@@ -250,42 +243,6 @@ will require extra work due to the application tag.
integrity upon completion.


5.4 Registering A Block Device As Capable Of Exchanging Integrity Metadata
--------------------------------------------------------------------------

To enable integrity exchange on a block device the gendisk must be
registered as capable:

`int blk_integrity_register(gendisk, blk_integrity);`

The blk_integrity struct is a template and should contain the
following::

static struct blk_integrity my_profile = {
.name = "STANDARDSBODY-TYPE-VARIANT-CSUM",
.generate_fn = my_generate_fn,
.verify_fn = my_verify_fn,
.tuple_size = sizeof(struct my_tuple_size),
.tag_size = <tag bytes per hw sector>,
};

'name' is a text string which will be visible in sysfs. This is
part of the userland API so chose it carefully and never change
it. The format is standards body-type-variant.
E.g. T10-DIF-TYPE1-IP or T13-EPP-0-CRC.

'generate_fn' generates appropriate integrity metadata (for WRITE).

'verify_fn' verifies that the data buffer matches the integrity
metadata.

'tuple_size' must be set to match the size of the integrity
metadata per sector. I.e. 8 for DIF and EPP.

'tag_size' must be set to identify how many bytes of tag space
are available per hardware sector. For DIF this is either 2 or
0 depending on the value of the Control Mode Page ATO bit.

----------------------------------------------------------------------

2007-12-24 Martin K. Petersen <[email protected]>
67 changes: 38 additions & 29 deletions Documentation/block/writeback_cache_control.rst
@@ -46,41 +46,50 @@ worry if the underlying devices need any explicit cache flushing and how
the Forced Unit Access is implemented. The REQ_PREFLUSH and REQ_FUA flags
may both be set on a single bio.

Feature settings for block drivers
----------------------------------

Implementation details for bio based block drivers
--------------------------------------------------------------
For devices that do not support volatile write caches there is no driver
support required, the block layer completes empty REQ_PREFLUSH requests before
entering the driver and strips off the REQ_PREFLUSH and REQ_FUA bits from
requests that have a payload.

These drivers will always see the REQ_PREFLUSH and REQ_FUA bits as they sit
directly below the submit_bio interface. For remapping drivers the REQ_FUA
bits need to be propagated to underlying devices, and a global flush needs
to be implemented for bios with the REQ_PREFLUSH bit set. For real device
drivers that do not have a volatile cache the REQ_PREFLUSH and REQ_FUA bits
on non-empty bios can simply be ignored, and REQ_PREFLUSH requests without
data can be completed successfully without doing any work. Drivers for
devices with volatile caches need to implement the support for these
flags themselves without any help from the block layer.
For devices with volatile write caches the driver needs to tell the block layer
that it supports flushing caches by setting the

BLK_FEAT_WRITE_CACHE

Implementation details for request_fn based block drivers
---------------------------------------------------------
flag in the queue_limits feature field. For devices that also support the FUA
bit the block layer needs to be told to pass on the REQ_FUA bit by also setting
the

For devices that do not support volatile write caches there is no driver
support required, the block layer completes empty REQ_PREFLUSH requests before
entering the driver and strips off the REQ_PREFLUSH and REQ_FUA bits from
requests that have a payload. For devices with volatile write caches the
driver needs to tell the block layer that it supports flushing caches by
doing::
BLK_FEAT_FUA

flag in the features field of the queue_limits structure.

Implementation details for bio based block drivers
--------------------------------------------------

For bio based drivers the REQ_PREFLUSH and REQ_FUA bit are simply passed on to
the driver if the driver sets the BLK_FEAT_WRITE_CACHE flag and the driver
needs to handle them.

*NOTE*: The REQ_FUA bit also gets passed on when the BLK_FEAT_FUA flags is
_not_ set. Any bio based driver that sets BLK_FEAT_WRITE_CACHE also needs to
handle REQ_FUA.

blk_queue_write_cache(sdkp->disk->queue, true, false);
For remapping drivers the REQ_FUA bits need to be propagated to underlying
devices, and a global flush needs to be implemented for bios with the
REQ_PREFLUSH bit set.

and handle empty REQ_OP_FLUSH requests in its prep_fn/request_fn. Note that
REQ_PREFLUSH requests with a payload are automatically turned into a sequence
of an empty REQ_OP_FLUSH request followed by the actual write by the block
layer. For devices that also support the FUA bit the block layer needs
to be told to pass through the REQ_FUA bit using::
Implementation details for blk-mq drivers
-----------------------------------------

blk_queue_write_cache(sdkp->disk->queue, true, true);
When the BLK_FEAT_WRITE_CACHE flag is set, REQ_OP_WRITE | REQ_PREFLUSH requests
with a payload are automatically turned into a sequence of a REQ_OP_FLUSH
request followed by the actual write by the block layer.

and the driver must handle write requests that have the REQ_FUA bit set
in prep_fn/request_fn. If the FUA bit is not natively supported the block
layer turns it into an empty REQ_OP_FLUSH request after the actual write.
When the BLK_FEAT_FUA flags is set, the REQ_FUA bit is simply passed on for the
REQ_OP_WRITE request, else a REQ_OP_FLUSH request is sent by the block layer
after the completion of the write request for bio submissions with the REQ_FUA
bit set.
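
The behaviour described in the added blk-mq text can be modelled as a small decision function. This is a userspace sketch under stated assumptions, not the block layer's implementation: the flag values, `enum step` and `plan_write()` are hypothetical and exist only for illustration, and the handling of an empty flush on a device with a write cache (a single flush step reaching the driver) is an assumption rather than something the added text spells out.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative bit values only; the kernel's real definitions differ. */
enum { FEAT_WRITE_CACHE = 1 << 0, FEAT_FUA = 1 << 1 };
enum { FLAG_PREFLUSH = 1 << 0, FLAG_FUA = 1 << 1 };

/* What the driver is asked to do, in order. */
enum step { STEP_FLUSH, STEP_WRITE, STEP_WRITE_FUA };

/*
 * Fill steps[] (room for three entries) with the sequence a blk-mq driver
 * would see for one submission and return how many entries were written.
 */
static size_t plan_write(unsigned int features, unsigned int flags,
			 int has_payload, enum step steps[3])
{
	size_t n = 0;

	if (!(features & FEAT_WRITE_CACHE)) {
		/*
		 * No volatile cache: an empty flush completes before it
		 * reaches the driver, and both bits are stripped from
		 * requests that carry a payload.
		 */
		if (has_payload)
			steps[n++] = STEP_WRITE;
		return n;
	}

	/* A preflush becomes a separate flush ahead of the write. */
	if (flags & FLAG_PREFLUSH)
		steps[n++] = STEP_FLUSH;
	if (!has_payload)
		return n;

	if (!(flags & FLAG_FUA)) {
		steps[n++] = STEP_WRITE;
	} else if (features & FEAT_FUA) {
		/* FUA is passed through on the write itself. */
		steps[n++] = STEP_WRITE_FUA;
	} else {
		/* FUA is emulated by a flush after the write completes. */
		steps[n++] = STEP_WRITE;
		steps[n++] = STEP_FLUSH;
	}
	return n;
}
```

With only the write cache feature set, a write carrying both bits expands to flush, write, flush in this model; setting the FUA feature as well collapses the last two steps into a single FUA write.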
16 changes: 15 additions & 1 deletion MAINTAINERS
@@ -3780,6 +3780,20 @@ F: include/linux/blk*
F: kernel/trace/blktrace.c
F: lib/sbitmap.c

BLOCK LAYER DEVICE DRIVER API [RUST]
M: Andreas Hindborg <[email protected]>
R: Boqun Feng <[email protected]>
L: [email protected]
L: [email protected]
S: Supported
W: https://rust-for-linux.com
B: https://github.com/Rust-for-Linux/linux/issues
C: https://rust-for-linux.zulipchat.com/#narrow/stream/Block
T: git https://github.com/Rust-for-Linux/linux.git rust-block-next
F: drivers/block/rnull.rs
F: rust/kernel/block.rs
F: rust/kernel/block/

BLOCK2MTD DRIVER
M: Joern Engel <[email protected]>
L: [email protected]
@@ -11569,7 +11583,7 @@ F: include/linux/iosys-map.h

IO_URING
M: Jens Axboe <[email protected]>
R: Pavel Begunkov <[email protected]>
M: Pavel Begunkov <[email protected]>
L: [email protected]
S: Maintained
T: git git://git.kernel.dk/linux-block
3 changes: 2 additions & 1 deletion arch/m68k/emu/nfblock.c
@@ -71,7 +71,7 @@ static void nfhd_submit_bio(struct bio *bio)
len = bvec.bv_len;
len >>= 9;
nfhd_read_write(dev->id, 0, dir, sec >> shift, len >> shift,
page_to_phys(bvec.bv_page) + bvec.bv_offset);
bvec_phys(&bvec));
sec += len;
}
bio_endio(bio);
@@ -98,6 +98,7 @@ static int __init nfhd_init_one(int id, u32 blocks, u32 bsize)
{
struct queue_limits lim = {
.logical_block_size = bsize,
.features = BLK_FEAT_ROTATIONAL,
};
struct nfhd_device *dev;
int dev_id = id - NFHD_DEV_OFFSET;
53 changes: 20 additions & 33 deletions arch/um/drivers/ubd_kern.c
@@ -447,43 +447,31 @@ static int bulk_req_safe_read(
return n;
}

/* Called without dev->lock held, and only in interrupt context. */
static void ubd_handler(void)
static void ubd_end_request(struct io_thread_req *io_req)
{
int n;
int count;

while(1){
n = bulk_req_safe_read(
thread_fd,
irq_req_buffer,
&irq_remainder,
&irq_remainder_size,
UBD_REQ_BUFFER_SIZE
);
if (n < 0) {
if(n == -EAGAIN)
break;
printk(KERN_ERR "spurious interrupt in ubd_handler, "
"err = %d\n", -n);
return;
}
for (count = 0; count < n/sizeof(struct io_thread_req *); count++) {
struct io_thread_req *io_req = (*irq_req_buffer)[count];

if ((io_req->error == BLK_STS_NOTSUPP) && (req_op(io_req->req) == REQ_OP_DISCARD)) {
blk_queue_max_discard_sectors(io_req->req->q, 0);
blk_queue_max_write_zeroes_sectors(io_req->req->q, 0);
}
blk_mq_end_request(io_req->req, io_req->error);
kfree(io_req);
}
if (io_req->error == BLK_STS_NOTSUPP) {
if (req_op(io_req->req) == REQ_OP_DISCARD)
blk_queue_disable_discard(io_req->req->q);
else if (req_op(io_req->req) == REQ_OP_WRITE_ZEROES)
blk_queue_disable_write_zeroes(io_req->req->q);
}
blk_mq_end_request(io_req->req, io_req->error);
kfree(io_req);
}

static irqreturn_t ubd_intr(int irq, void *dev)
{
ubd_handler();
int len, i;

while ((len = bulk_req_safe_read(thread_fd, irq_req_buffer,
&irq_remainder, &irq_remainder_size,
UBD_REQ_BUFFER_SIZE)) >= 0) {
for (i = 0; i < len / sizeof(struct io_thread_req *); i++)
ubd_end_request((*irq_req_buffer)[i]);
}

if (len < 0 && len != -EAGAIN)
pr_err("spurious interrupt in %s, err = %d\n", __func__, len);
return IRQ_HANDLED;
}

@@ -847,6 +835,7 @@ static int ubd_add(int n, char **error_out)
struct queue_limits lim = {
.max_segments = MAX_SG,
.seg_boundary_mask = PAGE_SIZE - 1,
.features = BLK_FEAT_WRITE_CACHE,
};
struct gendisk *disk;
int err = 0;
@@ -893,8 +882,6 @@ static int ubd_add(int n, char **error_out)
goto out_cleanup_tags;
}

blk_queue_flag_set(QUEUE_FLAG_NONROT, disk->queue);
blk_queue_write_cache(disk->queue, true, false);
disk->major = UBD_MAJOR;
disk->first_minor = n << UBD_SHIFT;
disk->minors = 1 << UBD_SHIFT;
5 changes: 4 additions & 1 deletion arch/xtensa/platforms/iss/simdisk.c
@@ -263,6 +263,9 @@ static const struct proc_ops simdisk_proc_ops = {
static int __init simdisk_setup(struct simdisk *dev, int which,
struct proc_dir_entry *procdir)
{
struct queue_limits lim = {
.features = BLK_FEAT_ROTATIONAL,
};
char tmp[2] = { '0' + which, 0 };
int err;

@@ -271,7 +274,7 @@ static int __init simdisk_setup(struct simdisk *dev, int which,
spin_lock_init(&dev->lock);
dev->users = 0;

dev->gd = blk_alloc_disk(NULL, NUMA_NO_NODE);
dev->gd = blk_alloc_disk(&lim, NUMA_NO_NODE);
if (IS_ERR(dev->gd)) {
err = PTR_ERR(dev->gd);
goto out;
8 changes: 2 additions & 6 deletions block/Kconfig
@@ -62,6 +62,8 @@ config BLK_DEV_BSGLIB

config BLK_DEV_INTEGRITY
bool "Block layer data integrity support"
select CRC_T10DIF
select CRC64_ROCKSOFT
help
Some storage devices allow extra information to be
stored/retrieved to help protect the data. The block layer
@@ -72,12 +74,6 @@ config BLK_DEV_INTEGRITY
T10/SCSI Data Integrity Field or the T13/ATA External Path
Protection. If in doubt, say N.

config BLK_DEV_INTEGRITY_T10
tristate
depends on BLK_DEV_INTEGRITY
select CRC_T10DIF
select CRC64_ROCKSOFT

config BLK_DEV_WRITE_MOUNTED
bool "Allow writing to mounted block devices"
default y
3 changes: 1 addition & 2 deletions block/Makefile
@@ -26,8 +26,7 @@ obj-$(CONFIG_MQ_IOSCHED_KYBER) += kyber-iosched.o
bfq-y := bfq-iosched.o bfq-wf2q.o bfq-cgroup.o
obj-$(CONFIG_IOSCHED_BFQ) += bfq.o

obj-$(CONFIG_BLK_DEV_INTEGRITY) += bio-integrity.o blk-integrity.o
obj-$(CONFIG_BLK_DEV_INTEGRITY_T10) += t10-pi.o
obj-$(CONFIG_BLK_DEV_INTEGRITY) += bio-integrity.o blk-integrity.o t10-pi.o
obj-$(CONFIG_BLK_MQ_PCI) += blk-mq-pci.o
obj-$(CONFIG_BLK_MQ_VIRTIO) += blk-mq-virtio.o
obj-$(CONFIG_BLK_DEV_ZONED) += blk-zoned.o