Skip to content

Conversation

@DemiMarie
Copy link
Contributor

This adds support for a generic vhost-user-generic device. All information about this device must be provided to Cloud Hypervisor via the command-line or API.

Fixes #7189

@DemiMarie DemiMarie requested a review from a team as a code owner July 25, 2025 02:22
@DemiMarie DemiMarie force-pushed the generic-vhost-user branch 6 times, most recently from aed497d to cd6b328 Compare July 26, 2025 02:33
@DemiMarie DemiMarie marked this pull request as draft July 28, 2025 19:35
@DemiMarie DemiMarie force-pushed the generic-vhost-user branch 4 times, most recently from b101b21 to 64b241c Compare August 8, 2025 01:13
@DemiMarie DemiMarie force-pushed the generic-vhost-user branch 2 times, most recently from 2b46ed3 to 5c8e3f7 Compare August 9, 2025 17:54
@rbradford
Copy link
Member

Can you give an example of what a vhost-user-generic device is?

@DemiMarie DemiMarie marked this pull request as ready for review August 11, 2025 04:26
@DemiMarie
Copy link
Contributor Author

A vhost-user-generic device is a vhost-user device that is implemented entirely by the backend. All parameters, such as the maximum number of queues and the features, must be passed to Cloud Hypervisor via API or the command line. This allows using vhost-user devices that Cloud Hypervisor does not know about, such as sound or (in the future) GPU. It is also possible for other devices to wrap the vhost-user-generic implementation if they do not need any specific handling.

@liuw
Copy link
Member

liuw commented Aug 14, 2025

The QEMU patch referenced in #7189 -- https://lore.kernel.org/qemu-devel/20230901110018.3704459-1-alex.bennee@linaro.org/ -- shows that it is still in RFC status. What's the latest status in QEMU?

There is a flag VHOST_USER_PROTOCOL_F_PROBE defined in the QEMU patch, but that's not accepted. Your patches don't use that flag.

How will this work in practice? Do you have a working backend to support this already?

@DemiMarie
Copy link
Contributor Author

DemiMarie commented Aug 15, 2025

The QEMU patch referenced in #7189 -- https://lore.kernel.org/qemu-devel/20230901110018.3704459-1-alex.bennee@linaro.org/ -- shows that it is still in RFC status. What's the latest status in QEMU?

I don’t think the patch was accepted. QEMU does have a generic vhost-user device, but (like this patch) it requires parameters to be passed by the user rather than fetching them from the backend. QEMU also requires a trivial source-code patch to enable the feature, which this patch doesn’t need.

There is a flag VHOST_USER_PROTOCOL_F_PROBE defined in the QEMU patch, but that's not accepted. Your patches don't use that flag.

You are correct. VHOST_USER_PROTOCOL_F_PROBE is needed to automatically obtain various information from the backend. This patch requires the information to be provided at device creation time instead.

How will this work in practice? Do you have a working backend to support this already?

This works with an unmodified virtiofsd, but the caller must explicitly provide all of the parameters needed. If I did not make any copy-and-paste mistakes, I used the following option to make a virtio-fs device:

--vhost-user-generic \
"socket=build/virtiofsd.sock,virtio_id=26,avail_features=$((0x5970000000)),avail_protocol_features=$((0x920b)),queue_size=1024,num_queues=1,min_queues=1,id=virtiofs0"

This interface is deliberately low-level to allow maximum flexibility. Callers are expected to know the parameters for their particular device. Higher-level tooling is expected to provide a user-friendly interface.

@alyssais
Copy link
Member

Needs some clippy fixes.

@DemiMarie DemiMarie force-pushed the generic-vhost-user branch 2 times, most recently from 595b465 to 986efa7 Compare November 22, 2025 02:58
DemiMarie added a commit to DemiMarie/cloud-hypervisor that referenced this pull request Dec 31, 2025
This was breaking CI for PR cloud-hypervisor#7221.

Signed-off-by: Demi Marie Obenour <demiobenour@gmail.com>
DemiMarie added a commit to DemiMarie/cloud-hypervisor that referenced this pull request Dec 31, 2025
This breaks CI for cloud-hypervisor#7221.

Signed-off-by: Demi Marie Obenour <demiobenour@gmail.com>
DemiMarie added a commit to DemiMarie/cloud-hypervisor that referenced this pull request Dec 31, 2025
This breaks CI for cloud-hypervisor#7221.

Signed-off-by: Demi Marie Obenour <demiobenour@gmail.com>
DemiMarie added a commit to DemiMarie/cloud-hypervisor that referenced this pull request Dec 31, 2025
This breaks CI for cloud-hypervisor#7221.

Signed-off-by: Demi Marie Obenour <demiobenour@gmail.com>
DemiMarie added a commit to DemiMarie/cloud-hypervisor that referenced this pull request Dec 31, 2025
This breaks CI for cloud-hypervisor#7221.

Signed-off-by: Demi Marie Obenour <demiobenour@gmail.com>
DemiMarie added a commit to DemiMarie/cloud-hypervisor that referenced this pull request Dec 31, 2025
This breaks CI for cloud-hypervisor#7221.

Signed-off-by: Demi Marie Obenour <demiobenour@gmail.com>
DemiMarie added a commit to DemiMarie/cloud-hypervisor that referenced this pull request Jan 1, 2026
This breaks CI for cloud-hypervisor#7221.

Signed-off-by: Demi Marie Obenour <demiobenour@gmail.com>
DemiMarie added a commit to DemiMarie/cloud-hypervisor that referenced this pull request Jan 1, 2026
This breaks CI for cloud-hypervisor#7221.

Signed-off-by: Demi Marie Obenour <demiobenour@gmail.com>
DemiMarie added a commit to DemiMarie/cloud-hypervisor that referenced this pull request Jan 1, 2026
This breaks CI for cloud-hypervisor#7221.

Signed-off-by: Demi Marie Obenour <demiobenour@gmail.com>
Copy link
Member

@alyssais alyssais left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some proofreading issues in the commit message:

  • "vhost-user-vhost"
  • "generic vhost-user-generic device"

pub const SYNTAX: &'static str = "virtio-vhost-user parameters \
\"virtio_id=<id>,socket=<socket_path>,num_queues=<number_of_queues>,\
queue_size=<size_of_each_queue>,id=<device_id>,pci_segment=<segment_id>,\
min_queues=<minimum_number_of_queues>,avail_features=<available features>\"";
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wouldn't! That's maintenance burden. If we find flexibility is needed later, it's a lot easier to add it then than it is to remove unnecessary flexibility down the line.

Ok(())
}

fn validate_always_identifier(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's rename it to something we understand, then?

@DemiMarie DemiMarie force-pushed the generic-vhost-user branch 2 times, most recently from 7e1c247 to a00688b Compare January 9, 2026 09:36
Copy link
Contributor Author

@DemiMarie DemiMarie left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tested this with virtiofsd and it does not work. The culprit apepars to be configuration space access: the read_config and write_config functions are never called, as shown by a panic!() in them not triggering. Also, I have no idea how much config space is needed, so I don't know how much to save and restore. Therefore, I added a todo!() in the restore code.

Update: Cause of the problem was not adding the MetaVirtioDevice to the list of devices.

@DemiMarie DemiMarie marked this pull request as draft January 9, 2026 09:39
@DemiMarie DemiMarie requested a review from alyssais January 9, 2026 09:39
@DemiMarie DemiMarie force-pushed the generic-vhost-user branch 2 times, most recently from a2cb9f0 to 17930d1 Compare January 9, 2026 22:44
@DemiMarie
Copy link
Contributor Author

Culprit is commit a8d1411, which had an incompatible change to internal APIs.

@DemiMarie DemiMarie marked this pull request as ready for review January 9, 2026 22:44
@DemiMarie
Copy link
Contributor Author

This is ready for review, but I do not expect migration to work. The reason is that the configuration space is not stored anywhere. I don’t know how large configuration space should be so I don’t know how large to make it.

It should always succeed and is apparently implicitly called by libc or
some dependency somewhere.

Signed-off-by: Demi Marie Obenour <demiobenour@gmail.com>
This adds support for a generic vhost-user-generic device.  All
information about this device must be provided to Cloud Hypervisor via
the command-line or API.

Allowing gettid() is needed to prevent seccomp violations.  It has
negligible attack surface.

Fixes cloud-hypervisor#7189

Signed-off-by: Demi Marie Obenour <demiobenour@gmail.com>
@rbradford
Copy link
Member

This is ready for review, but I do not expect migration to work. The reason is that the configuration space is not stored anywhere. I don’t know how large configuration space should be so I don’t know how large to make it.

Great! I think we can still move forward with integrating but we will need to generate an error on an attempt to migrate.

@alyssais
Copy link
Member

The reason is that the configuration space is not stored anywhere.

Given we just pass through configuration space reads and writes to the backend, wouldn't it be the responsibility of the backend to restore its configuration space when restarted?

.map_err(Error::VhostUserGetQueueMaxNum)?
as usize
} else {
default_queues.into()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wasn't this supposed to be inferred from the queue sizes? Why do we need another parameter? It's starting to feel like every time I look at this PR there's a new redundant way to specify the number of queues! 😅

@rbradford
Copy link
Member

The reason is that the configuration space is not stored anywhere.

Given we just pass through configuration space reads and writes to the backend, wouldn't it be the responsibility of the backend to restore its configuration space when restarted?

Makes sense - let's just make sure that we don't get a silent failure when trying to migrate (I don't mind that migration is unsupported - I just don't want it to blow up on the destination, I think that's a bit user hostile!)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Support for generic vhost-user devices

4 participants