On Wed, Mar 30, 2022 at 01:15:58PM +0100, Jonathan Cameron wrote:
> On Tue, 29 Mar 2022 12:53:51 -0700
> Davidlohr Bueso <[email protected]> wrote:
>
> > On Tue, 29 Mar 2022, Adam Manzanares wrote:
> > >> +typedef struct cxl_device_state {
> > >> + MemoryRegion device_registers;
> > >> +
> > >> + /* mmio for device capabilities array - 8.2.8.2 */
> > >> + MemoryRegion device;
> > >> + MemoryRegion caps;
> > >> +
> > >> + /* mmio for the mailbox registers 8.2.8.4 */
> > >> + MemoryRegion mailbox;
> > >> +
> > >> + /* memory region for persistent memory, HDM */
> > >> + uint64_t pmem_size;
> > >
> > >Can we switch this to mem_size and drop the persistent comment? It is my
> > >understanding that HDM is independent of persistence.
> >
> > Agreed, but ideally both volatile and persistent capacities would have been
> > supported in this patchset. I'm also probably missing specific reasons as to
> > why this isn't the case.
>
> Whilst it doesn't add a huge amount of complexity it does add some
> and the software paths in Linux we were developing this for are pmem focused.
> Hence volatile is on the todo list rather than in this first patch set.
> Not sensible to aim for feature complete in one go.
Makes complete sense. We can help with the Linux development for the volatile
side. I will add a couple of folks on cc. In addition, we would like to help
the CXL ecosystem in general so I anticipate we will have more reviews and
patches for CXL in general.
>
> >
> > Looking at it briefly could it be just a matter of adding to cxl_type3_dev
> > a new hostmem along with it's AddressSpace for the volatile? If so, I'm
> > thinking something along these lines:
> >
> > @@ -123,8 +123,8 @@ typedef struct cxl_device_state {
> > uint64_t host_set;
> > } timestamp;
> >
> > - /* memory region for persistent memory, HDM */
> > - uint64_t pmem_size;
> > + /* memory region for persistent and volatile memory, HDM */
> > + uint64_t pmem_size, mem_size;
> > } CXLDeviceState;
> >
> > /* Initialize the register block for a device */
> > @@ -235,9 +235,9 @@ typedef struct cxl_type3_dev {
> > PCIDevice parent_obj;
> >
> > /* Properties */
> > - AddressSpace hostmem_as;
> > + AddressSpace hostmem_as, hostmemv_as;
> > uint64_t size;
> > - HostMemoryBackend *hostmem;
> > + HostMemoryBackend *hostmem, *hostmemv;
> > HostMemoryBackend *lsa;
> > uint64_t sn;
> >
> > Then for cxl_setup_memory(), with ct3d->hostmem and/or ct3d->hostmemv
> > non-nil, set the respective MemoryRegions:
> >
> > + if (ct3d->hostmem) {
> > + memory_region_set_nonvolatile(mr, true);
> > + memory_region_set_enabled(mr, true);
> > + host_memory_backend_set_mapped(ct3d->hostmem, true);
> > + address_space_init(&ct3d->hostmem_as, mr, name);
> > + ct3d->cxl_dstate.pmem_size = ct3d->hostmem->size;
> > + }
> > + if (ct3d->hostmemv) {
> > + memory_region_set_nonvolatile(mrv, false);
> > + memory_region_set_enabled(mrv, true);
> > + host_memory_backend_set_mapped(ct3d->hostmemv, true);
> > + address_space_init(&ct3d->hostmem_as, mrv, name);
> > + ct3d->cxl_dstate.pmem_size = ct3d->hostmem->size;
> > + }
> >
> > For corresponding MB commands, it's mostly IDENTIFY_MEMORY_DEVICE that needs
> > updating:
> >
> > @@ -281,7 +281,7 @@ static ret_code cmd_identify_memory_device(struct
> > cxl_cmd *cmd,
> >
> > CXLType3Dev *ct3d = container_of(cxl_dstate, CXLType3Dev, cxl_dstate);
> > CXLType3Class *cvc = CXL_TYPE3_DEV_GET_CLASS(ct3d);
> > - uint64_t size = cxl_dstate->pmem_size;
> > + uint64_t size = cxl_dstate->pmem_size + cxl_dstate->mem_size;
> >
> > if (!QEMU_IS_ALIGNED(size, 256 << 20)) {
> > return CXL_MBOX_INTERNAL_ERROR;
> > @@ -290,11 +290,11 @@ static ret_code cmd_identify_memory_device(struct
> > cxl_cmd *cmd,
> > id = (void *)cmd->payload;
> > memset(id, 0, sizeof(*id));
> >
> > - /* PMEM only */
> > snprintf(id->fw_revision, 0x10, "BWFW VERSION %02d", 0);
> >
> > id->total_capacity = size / (256 << 20);
> > - id->persistent_capacity = size / (256 << 20);
> > + id->persistent_capacity = cxl_dstate->pmem_size / (256 << 20);
> > + id->volatile_capacity = cxl_dstate->mem_size / (256 << 20);
> > id->lsa_size = cvc->get_lsa_size(ct3d);
> >
> > *len = sizeof(*id);
> > @@ -312,16 +312,16 @@ static ret_code cmd_ccls_get_partition_info(struct
> > cxl_cmd *cmd,
> > uint64_t next_pmem;
> > } QEMU_PACKED *part_info = (void *)cmd->payload;
> > QEMU_BUILD_BUG_ON(sizeof(*part_info) != 0x20);
> > - uint64_t size = cxl_dstate->pmem_size;
> > + uint64_t psize = cxl_dstate->pmem_size;
> > + uint64_t vsize = cxl_dstate->mem_size;
> >
> > - if (!QEMU_IS_ALIGNED(size, 256 << 20)) {
> > + if (!QEMU_IS_ALIGNED(psize + vsize, 256 << 20)) {
> > return CXL_MBOX_INTERNAL_ERROR;
> > }
> >
> > - /* PMEM only */
> > - part_info->active_vmem = 0;
> > - part_info->next_vmem = 0;
> > - part_info->active_pmem = size / (256 << 20);
> > + part_info->active_vmem = vsize / (256 << 20);
> > + part_info->next_vmem = part_info->active_vmem;
> > + part_info->active_pmem = psize / (256 << 20);
> > part_info->next_pmem = part_info->active_pmem;
> >
> > Then for reads/writes, both cxl_type3_write() and _read() would, after
> > computing the dpa_offset,
> > first try the volatile region then upon error attempt the same in the
> > persistent memory - this
> > assuming the DPA space is consistent among both types of memory. Ie:
> >
> > address_space_read(&ct3d->hostmemv_as, dpa_offset, attrs, data, size)
> > or
> > address_space_read(&ct3d->hostmem_as, dpa_offset, attrs, data, size)
> >
> > ... but then again all this is probably just wishful thinking.
>
> Without looking in detail, will indeed be something along those lines.
> Gets more fiddly if you include partitioning control that Alison was
> interested
> in adding.
>
> Also, we probably need to support multiple HDM decoders. Also not a huge
> complexity to add, but left for now as main focus is to get the base
> patch set merged.
>
> So I'm happy to queue stuff up on top of this series and carry it forward
> but I don't want to add features to what we try to merge initially.
> This set is already huge and hard to review even with what think is a
> minimum set of features to be useful.
>
> Note I'm already carrying a few features on top if this on the gitlab
> branch gitlab.com/jic23/qemu (DOE + CDAT and serial numbers) and
> have a few other things out of tree for now (SPDM, emulating most
> of the PCI Config space controls).
>
Thanks for the updates. Do you have any suggestions on how to coordinate
efforts? Ideally we can have a list of features that need to be developed and
some names of people that will lead the work.
> Jonathan
>
> >
> > Thanks,
> > Davidlohr
>