Re: [Qemu-devel] [PATCH v6 00/16] KVM platform device passthrough

2014-09-16 Thread Eric Auger
On 09/16/2014 10:51 PM, Alex Williamson wrote:
> On Tue, 2014-09-16 at 00:01 +0200, Eric Auger wrote:
>> On 09/12/2014 01:05 AM, Christoffer Dall wrote:
>>> On Thu, Sep 11, 2014 at 04:51:14PM -0600, Alex Williamson wrote:
>>>> On Thu, 2014-09-11 at 15:23 -0700, Christoffer Dall wrote:
>>>>> On Thu, Sep 11, 2014 at 04:14:09PM -0600, Alex Williamson wrote:
>>>>>> On Tue, 2014-09-09 at 08:31 +0100, Eric Auger wrote:
>>>>>>> This RFC series aims at enabling KVM platform device passthrough.
>>>>>>> It implements a VFIO platform device, derived from VFIO PCI device.
>>>>>>>
>>>>>>> The VFIO platform device uses the host VFIO platform driver which must
>>>>>>> be bound to the assigned device prior to the QEMU system start.
>>>>>>>
>>>>>>> - the guest can directly access the device register space
>>>>>>> - assigned device IRQs are transparently routed to the guest by
>>>>>>>   QEMU/KVM (3 methods currently are supported: user-level eventfd
>>>>>>>   handling, irqfd, forwarded IRQs)
>>>>>>> - iommu is transparently programmed to prevent the device from
>>>>>>>   accessing physical pages outside of the guest address space
>>>>>>>
>>>>>>> This patch series is made of the following patch files:
>>>>>>>
>>>>>>> 1-7) Modifications to PCI code to prepare for VFIO platform device
>>>>>>> 8) split of PCI specific code and generic code (move)
>>>>>>> 9-11) creation of the VFIO calxeda xgmac platform device, without irqfd
>>>>>>>   support (MMIO direct access and IRQ assignment).
>>>>>>> 12) fake injection test modality (to test multiple IRQ)
>>>>>>> 13) addition of irqfd/virqfd support
>>>>>>> 14-16) forwarded IRQ
>>>>>>>
>>>>>>> Dependency List:
>>>>>>>
>>>>>>> QEMU dependencies:
>>>>>>> [1] [PATCH v2 0/9] Dynamic sysbus device allocation support, Alex Graf
>>>>>>> http://lists.gnu.org/archive/html/qemu-ppc/2014-07/msg00047.html
>>>>>>> [2] [RFC v3] machvirt dynamic sysbus device instantiation, Eric Auger
>>>>>>> [3] [PATCH v2 0/2] actual checks of KVM_CAP_IRQFD and
>>>>>>> KVM_CAP_IRQFD_RESAMPLE, Eric Auger
>>>>>>> http://lists.nongnu.org/archive/html/qemu-devel/2014-09/msg00589.html
>>>>>>> [4] [RFC] vfio: migration to trace points, Eric Auger
>>>>>>> http://lists.nongnu.org/archive/html/qemu-devel/2014-09/msg00569.html
>>>>>>>
>>>>>>> Kernel Dependencies:
>>>>>>> [5] [RFC Patch v6 0/20] VFIO support for platform devices,
>>>>>>> Antonios Motakis
>>>>>>> https://www.mail-archive.com/kvm@vger.kernel.org/msg103247.html
>>>>>>> [6] [PATCH v3] ARM: KVM: add irqfd support, Eric Auger
>>>>>>> https://lkml.org/lkml/2014/9/1/141
>>>>>>> [7] arm/arm64: KVM: Various VGIC cleanups and improvements,
>>>>>>> Christoffer Dall
>>>>>>> http://comments.gmane.org/gmane.linux.ports.arm.kernel/340430
>>>>>>> [8] [RFC v2 0/9] KVM-VFIO IRQ forward control, Eric Auger
>>>>>>> https://lkml.org/lkml/2014/9/1/344
>>>>>>> [9] [RFC PATCH 0/9] ARM: Forwarding physical interrupts to a guest VM,
>>>>>>> Marc Zyngier
>>>>>>> http://lwn.net/Articles/603514/
>>>>>>>
>>>>>>> kernel pieces can be found at:
>>>>>>> http://git.linaro.org/people/eric.auger/linux.git
>>>>>>> (branch 3.17rc3_irqfd_forward_integ_v2)
>>>>>>> QEMU pieces can be found at:
>>>>>>> http://git.linaro.org/people/eric.auger/qemu.git (branch vfio_integ_v6)
>>>>>>>
>>>>>>> The patch series was tested on Calxeda Midway (ARMv7), where one xgmac
>>>>>>> is assigned to the KVM host while the second one is assigned to the guest.
>>>>>>> The reworked PCI device is not tested.
>>>>>>>
>>>>>>> Wiki for Calxeda Midway setup:
>>>>>>>

Re: [Qemu-devel] [RFC] vfio: migration to trace points

2014-09-18 Thread Eric Auger
Hi,

For those who would like to try this patch and are not familiar with
trace points, here are some very basic instructions to get started.
- in your qemu configure command line, add
  --enable-trace-backends=stderr
  This enables the stderr trace backend.

- create an events.txt file in the directory from which you launch qemu
  and add the trace points you want to observe.
  The list of trace points can be found in trace-events (qemu root dir).
  In the VFIO device code, the corresponding calls are named
  trace_<event name>.

  for instance add:
  vfio_intx_interrupt
  vfio_eoi

  Wildcards seem to work as well. Lines can be commented out with #

- when launching qemu, add
  -trace events=events.txt

Complete details can be found in docs/tracing.txt
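
As a rough illustration of how such an events.txt file could be assembled,
here is a minimal Python sketch. It assumes that each non-comment line of
trace-events consists of optional properties, the event name and an opening
parenthesis, and it approximates wildcard matching with shell-style globs;
QEMU's own matching rules may differ. The helper name select_events and the
sample format strings are made up for this example.

```python
import fnmatch

# Sample lines in the style of trace-events; the format strings are
# illustrative only, not copied from the patch.
SAMPLE = """\
# hw/misc/vfio.c
vfio_intx_interrupt(int domain, int bus, int slot, int fn, char line) "Pin %c"
vfio_eoi(int domain, int bus, int slot, int fn) "EOI"
disable virtio_notify(void *vdev, void *vq) "vdev %p vq %p"
"""


def select_events(trace_events_text, patterns):
    """Return event names matching any glob pattern (hypothetical helper)."""
    selected = []
    for line in trace_events_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        head = line.split("(", 1)[0].split()
        if not head:
            continue
        name = head[-1]  # the last word before "(" is the event name
        if any(fnmatch.fnmatchcase(name, p) for p in patterns):
            selected.append(name)
    return selected


if __name__ == "__main__":
    # The printed lines could be saved as events.txt.
    print("\n".join(select_events(SAMPLE, ["vfio_*"])))
```

Listing the names explicitly, rather than relying on a wildcard, keeps the
file valid even if wildcard handling turns out to behave differently.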

Best Regards

Eric



On 09/03/2014 10:45 AM, Eric Auger wrote:
> This patch removes all DPRINTF calls and replaces them with trace points.
> A few DPRINTF calls used in error cases were converted to error_report.
> 
> Signed-off-by: Eric Auger 
> 
> ---
> 
> - __func__ is removed since trace point name does the same job
> - HWADDR_PRIx were replaced by PRIx64
> 
> Besides those changes, format strings were kept the same. In a few
> cases, however, I was forced to change them due to parsing errors
> (always related to parenthesis handling). This is indicated in
> trace-events. Cases that are not correctly handled are given below:
> - "(%04x:%02x:%02x.%x)" needs to be replaced by " (%04x:%02x:%02x.%x)"
> - "%s read(%04x:%02x:%02x.%x:BAR%d+0x%"PRIx64", %d) = 0x%"PRIx64 ->
>   "%s read(%04x:%02x:%02x.%x:BAR%d+0x%"PRIx64", %d = 0x%"PRIx64
> - "%s write(%04x:%02x:%02x.%x:BAR%d+0x%"PRIx64", 0x%"PRIx64", %d)" ->
>   "%s write(%04x:%02x:%02x.%x:BAR%d+0x%"PRIx64", 0x%"PRIx64", %d"
> This is a temporary fix.
> 
> - This leads to a large number of trace points, some of which may not
>   be good candidates for trace points; I am not sure.
> - This transformation has only been compile-tested on PCI. It was tested
>   on platform, with qemu configured with --enable-trace-backends=stderr
> - In the future, format strings and calls may be simplified by using a
>   single name argument instead of domain, bus, slot, function.
> ---
>  hw/misc/vfio.c | 403 
> +
>  trace-events   |  79 +++
>  2 files changed, 285 insertions(+), 197 deletions(-)
> 
> diff --git a/hw/misc/vfio.c b/hw/misc/vfio.c
> index 40dcaa6..6b6dee9 100644
> --- a/hw/misc/vfio.c
> +++ b/hw/misc/vfio.c
> @@ -40,15 +40,7 @@
>  #include "sysemu/kvm.h"
>  #include "sysemu/sysemu.h"
>  #include "hw/misc/vfio.h"
> -
> -/* #define DEBUG_VFIO */
> -#ifdef DEBUG_VFIO
> -#define DPRINTF(fmt, ...) \
> -do { fprintf(stderr, "vfio: " fmt, ## __VA_ARGS__); } while (0)
> -#else
> -#define DPRINTF(fmt, ...) \
> -do { } while (0)
> -#endif
> +#include "trace.h"
>  
>  /* Extra debugging, trap acceleration paths for more logging */
>  #define VFIO_ALLOW_MMAP 1
> @@ -365,9 +357,9 @@ static void vfio_intx_interrupt(void *opaque)
>          return;
>      }
>  
> -    DPRINTF("%s(%04x:%02x:%02x.%x) Pin %c\n", __func__, vdev->host.domain,
> -            vdev->host.bus, vdev->host.slot, vdev->host.function,
> -            'A' + vdev->intx.pin);
> +    trace_vfio_intx_interrupt(vdev->host.domain, vdev->host.bus,
> +                              vdev->host.slot, vdev->host.function,
> +                              'A' + vdev->intx.pin);
>  
>      vdev->intx.pending = true;
>      pci_irq_assert(&vdev->pdev);
> @@ -384,8 +376,8 @@ static void vfio_eoi(VFIODevice *vdev)
>          return;
>      }
>  
> -    DPRINTF("%s(%04x:%02x:%02x.%x) EOI\n", __func__, vdev->host.domain,
> -            vdev->host.bus, vdev->host.slot, vdev->host.function);
> +    trace_vfio_eoi(vdev->host.domain, vdev->host.bus,
> +                   vdev->host.slot, vdev->host.function);
>  
>      vdev->intx.pending = false;
>      pci_irq_deassert(&vdev->pdev);
> @@ -454,9 +446,8 @@ static void vfio_enable_intx_kvm(VFIODevice *vdev)
>  
>      vdev->intx.kvm_accel = true;
>  
> -    DPRINTF("%s(%04x:%02x:%02x.%x) KVM INTx accel enabled\n",
> -            __func__, vdev->host.domain, vdev->host.bus,
> -            vdev->host.slot, vdev->host.function);
> +    trace_vfio_enable_intx_kvm(vdev->host.domain, vdev->host.bus,
> +                               vdev->host.slot, vdev->host.function);
>  
>      return;
>  
> @@ -508,9 +499,8 @@ static void vfio_disable_intx_kvm(VFIODevice *vdev)
>  /* If we've mis

[Qemu-devel] [question] virtio-net-device and multi-queue option

2014-09-18 Thread Eric Auger
Hi,


I am currently running some benchmarks with virtio-net-device over
virtio-mmio (non-PCI), using qemu-system-arm with KVM. The purpose is to
compare with device passthrough performance.

I heard about the availability of a multi-queue option that greatly
improves performance, but so far I have failed to enable it.

Could someone please explain to me how to turn that feature on? My current
virtio-net options simply are:

-netdev tap,id=tap0,ifname="tap0" \
-device virtio-net-device,netdev=tap0

Also, are there any "easy" tunings I can play with to try to reach the
best performance?

Thank you in advance

Best Regards

Eric



Re: [Qemu-devel] [PATCH v6 00/16] KVM platform device passthrough

2014-09-18 Thread Eric Auger
On 09/16/2014 11:23 PM, Alex Williamson wrote:
> On Tue, 2014-09-16 at 14:51 -0600, Alex Williamson wrote:
>> On Tue, 2014-09-16 at 00:01 +0200, Eric Auger wrote:
>>> On 09/12/2014 01:05 AM, Christoffer Dall wrote:
>>>> On Thu, Sep 11, 2014 at 04:51:14PM -0600, Alex Williamson wrote:
>>>>> On Thu, 2014-09-11 at 15:23 -0700, Christoffer Dall wrote:
>>>>>> On Thu, Sep 11, 2014 at 04:14:09PM -0600, Alex Williamson wrote:
>>>>>>> On Tue, 2014-09-09 at 08:31 +0100, Eric Auger wrote:

Re: [Qemu-devel] [question] virtio-net-device and multi-queue option

2014-09-18 Thread Eric Auger
On 09/19/2014 03:42 AM, Gonglei (Arei) wrote:
>> Subject: [Qemu-devel] [question] virtio-net-device and multi-queue option
>>
>> Hi,
>>
>>
>> I am currently doing some benchmarks using virtio-net-device using
>> virtio-mmio (non PCI) with qemu_system_arm with KVM. Purpose is to
>> compare with device passthrough performance.
>>
>> I heard about the availability of a multi-queue option that greatly
>> improves the performance but I currently fail in enabling it.
>>
>> Please could someone explain me how to turn that feature on? My current
>> virtio-net options simply are:
>>
>> -netdev tap,id=tap0,ifname="tap0" \
>> -device virtio-net-device,netdev=tap0
>>
> 
> There are some steps:
> 
>  1. create a multi-queue netdev device, such as a tap device (you can
>     use the "ip tuntap" command, or libvirt), something like the below:
>       ip tuntap add tap_1 mode tap multi_queue
>  2. pass the corresponding parameters on the QEMU command line:
>       -netdev type=tap,ifname=tap_q,id=net1,vhost=on,vhostforce=on,queues=4,script= \
>       -device virtio-net-device,netdev=net1,mq=on,vectors=9

Hi Gonglei,

Thanks for your quick reply.

I definitely have not gone through step 1! Nevertheless, I am a bit
dubious, since the mq property does not seem to exist for my
virtio-net-device. I get

qemu-system-arm: -device
virtio-net-device,netdev=tap0,mq=on,mac=52:54:00:12:34:56: Property
'.mq' not found

Best Regards

Eric
>  
>> Also are there any "easy" tunings I can play with to try to reach the
>> best performance.
>>
> 
> You can consider using IRQ binding, core binding, vhost-net, etc.
> 
> Best regards,
> -Gonglei
> 
>> Thank you in advance
>>
>> Best Regards
>>
>> Eric
> 




Re: [Qemu-devel] [RFC] vfio: migration to trace points

2014-09-19 Thread Eric Auger
Hi Stefan,

Thanks for asking. Actually I think this is a bit early. I would like
some VFIO PCI users (typically Alex) to try it out a little bit and
confirm they are happy with it.

Also, as I mentioned in the commit message, I identified some parsing
issues that forced me to change a few format strings. I don't know if you
have time or are willing to fix those (you may be more efficient doing
those fixes than I would be ;-). Nevertheless, if you can't spare the
time, I will have a look at the Python code.

For convenience I list the issues again, all related to parentheses:

Cases that are not correctly handled are given below:
- "(%04x:%02x:%02x.%x)" needs to be replaced by " (%04x:%02x:%02x.%x)"
- "%s read(%04x:%02x:%02x.%x:BAR%d+0x%"PRIx64", %d) = 0x%"PRIx64
  replaced by
  "%s read(%04x:%02x:%02x.%x:BAR%d+0x%"PRIx64", %d = 0x%"PRIx64
- "%s write(%04x:%02x:%02x.%x:BAR%d+0x%"PRIx64", 0x%"PRIx64", %d)"
  replaced by
  "%s write(%04x:%02x:%02x.%x:BAR%d+0x%"PRIx64", 0x%"PRIx64", %d"

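A minimal Python sketch of why the leading-space workaround helps, assuming
the trace-events parser uses a pattern of the following shape. The group
names (props, name, args, fmt_trans, fmt) and the sample event line are
assumptions made for this illustration, not text taken from trace-events.

```python
import re

# Assumed shape of the parser pattern: a greedy ".*" for properties,
# then the event name, then the parenthesised argument list.
OLD_CRE = re.compile(r"((?P<props>.*)\s+)?"
                     r"(?P<name>[^(\s]+)"
                     r"\((?P<args>[^)]*)\)"
                     r"\s*"
                     r'(?:(?:(?P<fmt_trans>".+),)?\s*(?P<fmt>".+))?')

ARGS = "int domain, int bus, int slot, int fn"
BAD = 'vfio_eoi(' + ARGS + ') "(%04x:%02x:%02x.%x) EOI"'
GOOD = 'vfio_eoi(' + ARGS + ') " (%04x:%02x:%02x.%x) EOI"'


def parse(line):
    """Return (props, name, args) as the assumed pattern sees them."""
    m = OLD_CRE.match(line)
    return m.group("props"), m.group("name"), m.group("args")


if __name__ == "__main__":
    print(parse(BAD))   # the name is taken from inside the format string
    print(parse(GOOD))  # the name is vfio_eoi again
```

With the format string starting directly with "(", the greedy properties
group swallows the real event name and the double quote is taken as the
name. A space before the parenthesis leaves no candidate name there, so the
match falls back to the intended one.
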
Best Regards

Eric




On 09/19/2014 11:03 AM, Stefan Hajnoczi wrote:
> On Wed, Sep 03, 2014 at 09:45:14AM +0100, Eric Auger wrote:
>> This patch removes all DPRINTF calls and replaces them with trace points.
>> A few DPRINTF calls used in error cases were converted to error_report.
>>
>> Signed-off-by: Eric Auger 
> 
> The subject line says "RFC".  Are you proposing this patch for merge?
> 
> Did you want me to take it into the tracing tree?
> 
> Stefan
> 




Re: [Qemu-devel] [question] virtio-net-device and multi-queue option

2014-09-19 Thread Eric Auger
On 09/19/2014 04:30 AM, Gonglei (Arei) wrote:
>> From: Eric Auger [mailto:eric.au...@linaro.org]
>> Sent: Friday, September 19, 2014 10:23 AM
>> To: Gonglei (Arei); qemu list; Michael S. Tsirkin
>> Subject: Re: [Qemu-devel] [question] virtio-net-device and multi-queue option
>>
>> On 09/19/2014 03:42 AM, Gonglei (Arei) wrote:
>>>> Subject: [Qemu-devel] [question] virtio-net-device and multi-queue option
>>>>
>>>> Hi,
>>>>
>>>>
>>>> I am currently doing some benchmarks using virtio-net-device using
>>>> virtio-mmio (non PCI) with qemu_system_arm with KVM. Purpose is to
>>>> compare with device passthrough performance.
>>>>
>>>> I heard about the availability of a multi-queue option that greatly
>>>> improves the performance but I currently fail in enabling it.
>>>>
>>>> Please could someone explain me how to turn that feature on? My current
>>>> virtio-net options simply are:
>>>>
>>>> -netdev tap,id=tap0,ifname="tap0" \
>>>> -device virtio-net-device,netdev=tap0
>>>>
>>>
>>> There are some steps:
>>>
>>>  1. create multi-queue netdev device, such as tap device (you can
>>>use "ip tuntap" command, or libvirt), something like as below:
>>> ip tuntap add tap_1 mode tap multi_queue
>>>  2. pass corresponding parameters in QEMU command line:
>>>-netdev
>> type=tap,ifname=tap_q,id=net1,vhost=on,vhostforce=on,queues=4,script= \
>>>-device virtio-net-device,netdev=net1,mq=on,vectors=9
>>
>> Hi Gonglei,
>>
>> Thanks for your quick reply.
>>
>> Definitively I have not gone through step 1! Nethertheless I am a but
>> dubious about the fact the mq property does not seem to exist for my
>> virtio-net-device. I get
>>
>> qemu-system-arm: -device
>> virtio-net-device,netdev=tap0,mq=on,mac=52:54:00:12:34:56: Property
>> '.mq' not found
>>
> Sorry, my typo. :(
> 
> It should be "virtio-net-pci", not "virtio-net-device"
> 
> BTW, you can use the help option to get a device's properties:
> 
> # ./qemu-system-x86_64 -device virtio-net-pci,? 
> [...] 
> virtio-net-pci.mq=on/off
> [...]
Hi Gonglei,

I fear I can only use virtio-net-device since I only have VIRTIO-MMIO in
my machine file and no PCI bus.

The mq property does not seem to be supported for virtio-net-device as
reported by

sudo arm-softmmu/qemu-system-arm -M virt -device virtio-net-device,?
virtio-net-device.tx=str
virtio-net-device.x-txburst=int32
virtio-net-device.x-txtimer=uint32
virtio-net-device.bootindex=int32
virtio-net-device.netdev=netdev
virtio-net-device.vlan=vlan
virtio-net-device.mac=macaddr

Thanks

Best Regards

Eric

> 
> Best regards,
> -Gonglei
> 
>> Best Regards
>>
>> Eric
>>>
>>>> Also are there any "easy" tunings I can play with to try to reach the
>>>> best performance.
>>>>
>>>
>>> You can consider using IRQ binding, core binding, vhost-net, etc.
>>>
>>> Best regards,
>>> -Gonglei
>>>
>>>> Thank you in advance
>>>>
>>>> Best Regards
>>>>
>>>> Eric
>>>
> 




Re: [Qemu-devel] [PATCH] trace: tighten up trace-events regex to fix bad parse

2014-09-22 Thread Eric Auger
Dear all,

Many thanks for the fix. I am currently travelling but I will test it
early next week with the vfio PCI & platform cases. Also, following Alex's
advice, I will turn [RFC] vfio: migration to trace points into a PATCH.

Best Regards

Eric


On 09/22/2014 07:35 PM, Lluís Vilanova wrote:
> Stefan Hajnoczi writes:
> 
>> Use \w for properties and trace event names since they are both drawn
>> from [a-zA-Z0-9_] character sets.
> 
>> The .* for matching properties was too aggressive and caused the
>> following failure with foo(int rc) "(this is a test)":
> 
>>   Traceback (most recent call last):
>> File "scripts/tracetool.py", line 139, in <module>
>>   main(sys.argv)
>> File "scripts/tracetool.py", line 134, in main
>>   binary=binary, probe_prefix=probe_prefix)
>> File "scripts/tracetool/__init__.py", line 334, in generate
>>   events = _read_events(fevents)
>> File "scripts/tracetool/__init__.py", line 262, in _read_events
>>   res.append(Event.build(line))
>> File "scripts/tracetool/__init__.py", line 225, in build
>>   return Event(name, props, fmt, args, arg_fmts)
>> File "scripts/tracetool/__init__.py", line 185, in __init__
>>   % ", ".join(unknown_props))
>>   ValueError: Unknown properties: foo(int, rc)
> 
>> Cc: Lluís Vilanova 
>> Reported-by: Eric Auger 
>> Signed-off-by: Stefan Hajnoczi 
>> ---
>>  scripts/tracetool/__init__.py | 4 ++--
>>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
>> diff --git a/scripts/tracetool/__init__.py b/scripts/tracetool/__init__.py
>> index 36c789d..474f11b 100644
>> --- a/scripts/tracetool/__init__.py
>> +++ b/scripts/tracetool/__init__.py
>> @@ -140,8 +140,8 @@ class Event(object):
>>  The format strings for each argument.
>>  """
>  
>> -    _CRE = re.compile("((?P<props>.*)\s+)?"
>> -                      "(?P<name>[^(\s]+)"
>> +    _CRE = re.compile("((?P<props>\w*)\s+)?"
>> +                      "(?P<name>\w+)"
>>                        "\((?P<args>[^)]*)\)"
>>                        "\s*"
>>                        "(?:(?:(?P<fmt_trans>\".+),)?\s*(?P<fmt>\".+))?"
> 
> The previous implementation allowed multiple properties. Maybe this should be
> instead (which still allows multiple properties):
> 
> "((?P<props>[\w\s]+)\s+)?"
> "(?P<name>\w+)\s*"
> ...
> 
> 
> Thanks,
>   Lluis
> 




Re: [Qemu-devel] [PATCH v2] trace: tighten up trace-events regex to fix bad parse

2014-09-29 Thread Eric Auger
Dear all,

this patch fixes the issues I reported (related to VFIO trace points).

Many thanks for that.

Best Regards

Eric

On 09/23/2014 12:37 PM, Stefan Hajnoczi wrote:
> Use \w for properties and trace event names since they are both drawn
> from [a-zA-Z0-9_] character sets.
> 
> The .* for matching properties was too aggressive and caused the
> following failure with foo(int rc) "(this is a test)":
> 
>   Traceback (most recent call last):
> File "scripts/tracetool.py", line 139, in <module>
>   main(sys.argv)
> File "scripts/tracetool.py", line 134, in main
>   binary=binary, probe_prefix=probe_prefix)
> File "scripts/tracetool/__init__.py", line 334, in generate
>   events = _read_events(fevents)
> File "scripts/tracetool/__init__.py", line 262, in _read_events
>   res.append(Event.build(line))
> File "scripts/tracetool/__init__.py", line 225, in build
>   return Event(name, props, fmt, args, arg_fmts)
> File "scripts/tracetool/__init__.py", line 185, in __init__
>   % ", ".join(unknown_props))
>   ValueError: Unknown properties: foo(int, rc)
> 
> Cc: Lluís Vilanova 
> Reported-by: Eric Auger 
> Signed-off-by: Stefan Hajnoczi 
> ---
> v2:
>  * Fix regex to allow multiple properties [Lluís]
> 
>  scripts/tracetool/__init__.py | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/scripts/tracetool/__init__.py b/scripts/tracetool/__init__.py
> index 36c789d..6ee6af7 100644
> --- a/scripts/tracetool/__init__.py
> +++ b/scripts/tracetool/__init__.py
> @@ -140,8 +140,8 @@ class Event(object):
>  The format strings for each argument.
>  """
>  
> -    _CRE = re.compile("((?P<props>.*)\s+)?"
> -                      "(?P<name>[^(\s]+)"
> +    _CRE = re.compile("((?P<props>[\w\s]+)\s+)?"
> +                      "(?P<name>\w+)"
>                        "\((?P<args>[^)]*)\)"
>                        "\s*"
>                        "(?:(?:(?P<fmt_trans>\".+),)?\s*(?P<fmt>\".+))?"
> 
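
The effect of the change can be checked with a short Python sketch that
compares the original pattern, the first revision (\w* for properties) and
this v2 (\[\w\s\]+ for properties). The group names and the second sample
line with two made-up properties are assumptions for illustration only.

```python
import re

# Common tail of the pattern: argument list and optional format strings.
TAIL = (r"\((?P<args>[^)]*)\)"
        r"\s*"
        r'(?:(?:(?P<fmt_trans>".+),)?\s*(?P<fmt>".+))?')

# Group names are assumed for this sketch.
OLD = re.compile(r"((?P<props>.*)\s+)?(?P<name>[^(\s]+)" + TAIL)
V1 = re.compile(r"((?P<props>\w*)\s+)?(?P<name>\w+)" + TAIL)
V2 = re.compile(r"((?P<props>[\w\s]+)\s+)?(?P<name>\w+)" + TAIL)

FAILING = 'foo(int rc) "(this is a test)"'
MULTI = 'prop_a prop_b foo(int rc) "rc=%d"'  # hypothetical property names


def parse(cre, line):
    """Return (props, name), or None when the pattern does not match."""
    m = cre.match(line)
    return (m.group("props"), m.group("name")) if m else None


if __name__ == "__main__":
    for label, cre in (("old", OLD), ("v1", V1), ("v2", V2)):
        print(label, parse(cre, FAILING), parse(cre, MULTI))
```

In this sketch the original pattern reads "foo(int rc)" as the properties,
which is where the "Unknown properties: foo(int, rc)" message comes from.
The first revision parses the failing line but rejects a line with two
properties, and v2 handles both.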




[Qemu-devel] [PATCH v2] vfio: migration to trace points

2014-09-29 Thread Eric Auger
This patch removes all DPRINTF calls and replaces them with trace points.
A few DPRINTF calls used in error cases were converted to error_report.

Signed-off-by: Eric Auger 

---

- __func__ is removed since trace point name does the same job
- HWADDR_PRIx were replaced by PRIx64
- this transformation has only been compile-tested on PCI, with
  qemu configured with --enable-trace-backends=stderr
- in the future, format strings and calls may be simplified by using a
  single name argument instead of domain, bus, slot, function.

v1 (RFC) -> v2 (PATCH):
- restore original format strings since parsing now is OK after
  commit f9bbba9,
  [PATCH v2] trace: tighten up trace-events regex to fix bad parse
---
 hw/misc/vfio.c | 403 +
 trace-events   |  75 ++-
 2 files changed, 280 insertions(+), 198 deletions(-)

diff --git a/hw/misc/vfio.c b/hw/misc/vfio.c
index d66f3d2..4ed5cf3 100644
--- a/hw/misc/vfio.c
+++ b/hw/misc/vfio.c
@@ -40,15 +40,7 @@
 #include "sysemu/kvm.h"
 #include "sysemu/sysemu.h"
 #include "hw/misc/vfio.h"
-
-/* #define DEBUG_VFIO */
-#ifdef DEBUG_VFIO
-#define DPRINTF(fmt, ...) \
-do { fprintf(stderr, "vfio: " fmt, ## __VA_ARGS__); } while (0)
-#else
-#define DPRINTF(fmt, ...) \
-do { } while (0)
-#endif
+#include "trace.h"
 
 /* Extra debugging, trap acceleration paths for more logging */
 #define VFIO_ALLOW_MMAP 1
@@ -365,9 +357,9 @@ static void vfio_intx_interrupt(void *opaque)
         return;
     }
 
-    DPRINTF("%s(%04x:%02x:%02x.%x) Pin %c\n", __func__, vdev->host.domain,
-            vdev->host.bus, vdev->host.slot, vdev->host.function,
-            'A' + vdev->intx.pin);
+    trace_vfio_intx_interrupt(vdev->host.domain, vdev->host.bus,
+                              vdev->host.slot, vdev->host.function,
+                              'A' + vdev->intx.pin);
 
     vdev->intx.pending = true;
     pci_irq_assert(&vdev->pdev);
@@ -384,8 +376,8 @@ static void vfio_eoi(VFIODevice *vdev)
         return;
     }
 
-    DPRINTF("%s(%04x:%02x:%02x.%x) EOI\n", __func__, vdev->host.domain,
-            vdev->host.bus, vdev->host.slot, vdev->host.function);
+    trace_vfio_eoi(vdev->host.domain, vdev->host.bus,
+                   vdev->host.slot, vdev->host.function);
 
     vdev->intx.pending = false;
     pci_irq_deassert(&vdev->pdev);
@@ -454,9 +446,8 @@ static void vfio_enable_intx_kvm(VFIODevice *vdev)
 
     vdev->intx.kvm_accel = true;
 
-    DPRINTF("%s(%04x:%02x:%02x.%x) KVM INTx accel enabled\n",
-            __func__, vdev->host.domain, vdev->host.bus,
-            vdev->host.slot, vdev->host.function);
+    trace_vfio_enable_intx_kvm(vdev->host.domain, vdev->host.bus,
+                               vdev->host.slot, vdev->host.function);
 
     return;
 
@@ -508,9 +499,8 @@ static void vfio_disable_intx_kvm(VFIODevice *vdev)
     /* If we've missed an event, let it re-fire through QEMU */
     vfio_unmask_intx(vdev);
 
-    DPRINTF("%s(%04x:%02x:%02x.%x) KVM INTx accel disabled\n",
-            __func__, vdev->host.domain, vdev->host.bus,
-            vdev->host.slot, vdev->host.function);
+    trace_vfio_disable_intx_kvm(vdev->host.domain, vdev->host.bus,
+                                vdev->host.slot, vdev->host.function);
 #endif
 }
 
@@ -529,9 +519,9 @@ static void vfio_update_irq(PCIDevice *pdev)
         return; /* Nothing changed */
     }
 
-    DPRINTF("%s(%04x:%02x:%02x.%x) IRQ moved %d -> %d\n", __func__,
-            vdev->host.domain, vdev->host.bus, vdev->host.slot,
-            vdev->host.function, vdev->intx.route.irq, route.irq);
+    trace_vfio_update_irq(vdev->host.domain, vdev->host.bus,
+                          vdev->host.slot, vdev->host.function,
+                          vdev->intx.route.irq, route.irq);
 
     vfio_disable_intx_kvm(vdev);
 
@@ -607,8 +597,8 @@ static int vfio_enable_intx(VFIODevice *vdev)
 
     vdev->interrupt = VFIO_INT_INTx;
 
-    DPRINTF("%s(%04x:%02x:%02x.%x)\n", __func__, vdev->host.domain,
-            vdev->host.bus, vdev->host.slot, vdev->host.function);
+    trace_vfio_enable_intx(vdev->host.domain, vdev->host.bus,
+                           vdev->host.slot, vdev->host.function);
 
     return 0;
 }
@@ -630,8 +620,8 @@ static void vfio_disable_intx(VFIODevice *vdev)
 
     vdev->interrupt = VFIO_INT_NONE;
 
-    DPRINTF("%s(%04x:%02x:%02x.%x)\n", __func__, vdev->host.domain,
-            vdev->host.bus, vdev->host.slot, vdev->host.function);
+    trace_vfio_disable_intx(vdev->host.domain, vdev->host.bus,
+                            vdev->host.slot, vdev->host.function);
 }
 
 /*
@@ -658,9 +648,9 @@ static void vfio_msi_inter

Re: [Qemu-devel] [PATCH 1/5] Platform: Add platform device class

2014-06-19 Thread Eric Auger
On 06/04/2014 02:28 PM, Alexander Graf wrote:
> This patch adds a new device class called "platform device". This is an
> abstract class for consumption of actual classes that implement devices.
> 
> The new thing about platform devices is that they have awareness of all
> memory regions and IRQs that the device exposes. That gives us the ability
> to manually specify things using properties from the command line.

Hi Alex,

thanks for this series. I am currently reworking the vfio-platform device
to inherit from this device instead of SysBusDevice.

Best Regards

Eric
> 
> Signed-off-by: Alexander Graf 
> ---
>  hw/Makefile.objs |   1 +
>  hw/platform/Makefile.objs|   1 +
>  hw/platform/device.c | 108 
> +++
>  include/hw/platform/device.h |  45 ++
>  4 files changed, 155 insertions(+)
>  create mode 100644 hw/platform/Makefile.objs
>  create mode 100644 hw/platform/device.c
>  create mode 100644 include/hw/platform/device.h
> 
> diff --git a/hw/Makefile.objs b/hw/Makefile.objs
> index d178b65..f300f68 100644
> --- a/hw/Makefile.objs
> +++ b/hw/Makefile.objs
> @@ -20,6 +20,7 @@ devices-dirs-$(CONFIG_SOFTMMU) += nvram/
>  devices-dirs-$(CONFIG_SOFTMMU) += pci/
>  devices-dirs-$(CONFIG_PCI) += pci-bridge/ pci-host/
>  devices-dirs-$(CONFIG_SOFTMMU) += pcmcia/
> +devices-dirs-$(CONFIG_PLATFORM) += platform/
>  devices-dirs-$(CONFIG_SOFTMMU) += scsi/
>  devices-dirs-$(CONFIG_SOFTMMU) += sd/
>  devices-dirs-$(CONFIG_SOFTMMU) += ssi/
> diff --git a/hw/platform/Makefile.objs b/hw/platform/Makefile.objs
> new file mode 100644
> index 000..824356b
> --- /dev/null
> +++ b/hw/platform/Makefile.objs
> @@ -0,0 +1 @@
> +common-obj-$(CONFIG_PLATFORM) += device.o
> diff --git a/hw/platform/device.c b/hw/platform/device.c
> new file mode 100644
> index 000..9e23370
> --- /dev/null
> +++ b/hw/platform/device.c
> @@ -0,0 +1,108 @@
> +/*
> + * Platform Device that can expose its memory regions and IRQ lines
> + *
> + * Copyright 2014 Freescale Semiconductor, Inc.
> + *
> + * Authors: Alexander Graf,   
> + *
> + * This is free software; you can redistribute it and/or modify
> + * it under the terms of  the GNU General  Public License as published by
> + * the Free Software Foundation;  either version 2 of the  License, or
> + * (at your option) any later version.
> + *
> + *
> + * This is an abstract platform device, so you really only want to use it
> + * as parent class for platform devices.
> + *
> + * It ensures that all boilerplate is properly abstracted away from children
> + * and consistent across devices.
> + *
> + * When instantiating a platform device you can optionally always specify 2
> + * properties which otherwise get populated automatically:
> + *
> + *  regions: Offsets in the platform hole the device's memory regions get mapped
> + *           to.
> + *  irqs: IRQ pins in the linear platform IRQ range the device's IRQs get mapped
> + *        to.
> + */
> +
> +#include "qemu-common.h"
> +#include "hw/platform/device.h"
> +
> +static void fixup_regions(PlatformDeviceState *s)
> +{
> +    uint64_t *addrs = g_new(uint64_t, s->num_regions);
> +    int i;
> +
> +    /* Treat memory offsets that the user did not specify as dynamic */
> +    for (i = 0; i < s->num_regions; i++) {
> +        if (s->num_plat_region_addrs > i) {
> +            addrs[i] = s->plat_region_addrs[i];
> +        } else {
> +            addrs[i] = PLATFORM_DYNAMIC;
> +        }
> +    }
> +
> +    s->plat_region_addrs = addrs;
> +    s->num_plat_region_addrs = s->num_regions;
> +}
> +
> +static void fixup_irqs(PlatformDeviceState *s)
> +{
> +    uint32_t *irqs = g_new(uint32_t, s->num_irqs);
> +    int i;
> +
> +    /* Treat IRQs that the user did not specify as dynamic */
> +    for (i = 0; i < s->num_irqs; i++) {
> +        if (s->num_plat_irqs > i) {
> +            irqs[i] = s->plat_irqs[i];
> +        } else {
> +            irqs[i] = PLATFORM_DYNAMIC;
> +        }
> +    }
> +
> +    s->plat_irqs = irqs;
> +    s->num_plat_irqs = s->num_irqs;
> +}
> +
> +static void platform_device_realize(DeviceState *dev, Error **errp)
> +{
> +    PlatformDeviceState *s = PLATFORM_DEVICE(dev);
> +
> +    fixup_regions(s);
> +    fixup_irqs(s);
> +}
> +
> +static Property platform_device_properties[] = {
> +    /* memory regions for a device */
> +    DEFINE_PROP_ARRAY("regions", PlatformDeviceState, num_plat_region_addrs,
> +                      plat_region_addrs, qdev_prop_uint64, uint64_t),
> +    /* interrupts for a device */
> +    DEFINE_PROP_ARRAY("irqs", PlatformDeviceState, num_plat_irqs,
> +                      plat_irqs, qdev_prop_uint32, uint32_t),
> +    DEFINE_PROP_END_OF_LIST(),
> +};
> +
> +static void platform_device_class_init(ObjectClass *oc, void *data)
> +{
> +    DeviceClass *dc = DEVICE_CLASS(oc);
> +
> +    dc->realize = platform_device_realize;
> +    dc->props = platform_device_properties;
> +}
> +
> +static const TypeInfo platfo
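
The defaulting rule implemented by fixup_regions() and fixup_irqs() above
(keep what the user specified, mark every remaining slot as dynamic) can be
modelled in a few lines of Python. This is only a sketch: the value used for
PLATFORM_DYNAMIC is a placeholder, since the header that defines it is not
shown here, and the function name fixup is made up.

```python
# Placeholder sentinel; the real PLATFORM_DYNAMIC is defined in
# include/hw/platform/device.h, which is not part of this excerpt.
PLATFORM_DYNAMIC = -1


def fixup(user_values, count):
    """Return `count` slots: user-specified values first, then dynamic ones."""
    return [user_values[i] if i < len(user_values) else PLATFORM_DYNAMIC
            for i in range(count)]


if __name__ == "__main__":
    # A device with three regions where only the first offset was given.
    print(fixup([0x1000], 3))
```

As in the C code, entries beyond the number of regions or IRQs the device
actually exposes are dropped, because the result always has exactly `count`
slots.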

Re: [Qemu-devel] [PATCH 4/5] PPC: e500: Support platform devices

2014-06-19 Thread Eric Auger
On 06/04/2014 02:28 PM, Alexander Graf wrote:
> For e500 our approach to supporting platform devices is to create a simple
> bus from the guest's point of view within which we map platform devices
> dynamically.
> 
> We allocate memory regions always within the "platform" hole in address
> space and map IRQs to predetermined IRQ lines that are reserved for platform
> device usage.
> 
> This maps really nicely into device tree logic, so we can just tell the
> guest about our virtual simple bus in device tree as well.
Hi Alex,

"qemu_add_machine_init_done_notifier" is the QEMU mechanism I was
missing in my patch. You light my way ;-)

A first comment: it would make a lot of sense to reuse your code in ARM
virt.c too. I am currently doing the exercise. Do you think it would be
possible to share common helper code, outside of the e500 machine code?

Thank you in advance

Best Regards

Eric
> 
> Signed-off-by: Alexander Graf 
> ---
>  default-configs/ppc-softmmu.mak   |   1 +
>  default-configs/ppc64-softmmu.mak |   1 +
>  hw/ppc/e500.c | 221 
> ++
>  hw/ppc/e500.h |   1 +
>  hw/ppc/e500plat.c |   1 +
>  5 files changed, 225 insertions(+)
> 
> diff --git a/default-configs/ppc-softmmu.mak b/default-configs/ppc-softmmu.mak
> index 33f8d84..d6ec8b9 100644
> --- a/default-configs/ppc-softmmu.mak
> +++ b/default-configs/ppc-softmmu.mak
> @@ -45,6 +45,7 @@ CONFIG_PREP=y
>  CONFIG_MAC=y
>  CONFIG_E500=y
>  CONFIG_OPENPIC_KVM=$(and $(CONFIG_E500),$(CONFIG_KVM))
> +CONFIG_PLATFORM=y
>  # For PReP
>  CONFIG_MC146818RTC=y
>  CONFIG_ETSEC=y
> diff --git a/default-configs/ppc64-softmmu.mak 
> b/default-configs/ppc64-softmmu.mak
> index 37a15b7..06677bf 100644
> --- a/default-configs/ppc64-softmmu.mak
> +++ b/default-configs/ppc64-softmmu.mak
> @@ -45,6 +45,7 @@ CONFIG_PSERIES=y
>  CONFIG_PREP=y
>  CONFIG_MAC=y
>  CONFIG_E500=y
> +CONFIG_PLATFORM=y
>  CONFIG_OPENPIC_KVM=$(and $(CONFIG_E500),$(CONFIG_KVM))
>  # For pSeries
>  CONFIG_XICS=$(CONFIG_PSERIES)
> diff --git a/hw/ppc/e500.c b/hw/ppc/e500.c
> index 33d54b3..bc26215 100644
> --- a/hw/ppc/e500.c
> +++ b/hw/ppc/e500.c
> @@ -36,6 +36,7 @@
>  #include "exec/address-spaces.h"
>  #include "qemu/host-utils.h"
>  #include "hw/pci-host/ppce500.h"
> +#include "hw/platform/device.h"
>  
>  #define EPAPR_MAGIC(0x45504150)
>  #define BINARY_DEVICE_TREE_FILE"mpc8544ds.dtb"
> @@ -47,6 +48,14 @@
>  
>  #define RAM_SIZES_ALIGN(64UL << 20)
>  
> +#define E500_PLATFORM_BASE 0xF000ULL
> +#define E500_PLATFORM_HOLE (128ULL * 1024 * 1024) /* 128 MB */
> +#define E500_PLATFORM_PAGE_SHIFT   12
> +#define E500_PLATFORM_HOLE_PAGES   (E500_PLATFORM_HOLE >> \
> +E500_PLATFORM_PAGE_SHIFT)
> +#define E500_PLATFORM_FIRST_IRQ5
> +#define E500_PLATFORM_NUM_IRQS 10
> +
>  /* TODO: parameterize */
>  #define MPC8544_CCSRBAR_BASE   0xE000ULL
>  #define MPC8544_CCSRBAR_SIZE   0x0010ULL
> @@ -122,6 +131,62 @@ static void dt_serial_create(void *fdt, unsigned long 
> long offset,
>  }
>  }
>  
> +typedef struct PlatformDevtreeData {
> +void *fdt;
> +const char *mpic;
> +int irq_start;
> +const char *node;
> +} PlatformDevtreeData;
> +
> +static int platform_device_create_devtree(Object *obj, void *opaque)
> +{
> +PlatformDevtreeData *data = opaque;
> +Object *dev;
> +PlatformDeviceState *pdev;
> +
> +dev = object_dynamic_cast(obj, TYPE_PLATFORM_DEVICE);
> +pdev = (PlatformDeviceState *)dev;
> +
> +if (!pdev) {
> +/* Container, traverse it for children */
> +return object_child_foreach(obj, platform_device_create_devtree, 
> data);
> +}
> +
> +return 0;
> +}
> +
> +static void platform_create_devtree(void *fdt, const char *node, uint64_t 
> addr,
> +const char *mpic, int irq_start,
> +int nr_irqs)
> +{
> +const char platcomp[] = "qemu,platform\0simple-bus";
> +PlatformDevtreeData data;
> +
> +/* Create a /platform node that we can put all devices into */
> +
> +qemu_fdt_add_subnode(fdt, node);
> +qemu_fdt_setprop(fdt, node, "compatible", platcomp, sizeof(platcomp));
> +qemu_fdt_setprop_string(fdt, node, "device_type", "platform");
> +
> +/* Our platform hole is less than 32bit big, so 1 cell is enough for 
> address
> +   and size */
> +qemu_fdt_setprop_cells(fdt, node, "#size-cells", 1);
> +qemu_fdt_setprop_cells(fdt, node, "#address-cells", 1);
> +qemu_fdt_setprop_cells(fdt, node, "ranges", 0, addr >> 32, addr,
> +   E500_PLATFORM_HOLE);
> +
> +qemu_fdt_setprop_phandle(fdt, node, "interrupt-parent", mpic);
> +
> +/* Loop through all devices and create nodes for known ones */
> +
> +data.fdt = fdt;
> +data.mpic = mpic;
> +data.irq_start = irq_start;
> +data.node = node;
> +
> +   

[Qemu-devel] [PATCH v5 04/10] hw/vfio/pci: Introduce VFIORegion

2014-08-09 Thread Eric Auger
This structure is going to be shared by VFIOPCIDevice and
VFIOPlatformDevice. VFIOBAR includes it.

vfio_eoi becomes an op of VFIODevice, specialized by the parent device.
This makes it possible to transform vfio_bar_write/read into generic
vfio_region_write/read, which will be used by VFIOPlatformDevice too.

vfio_mmap_bar becomes vfio_mmap_region.
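
For readers following the refactoring, here is a minimal sketch of the
pattern described above: a region holds a back pointer to an embedded base
device, and the generic access path reaches the parent-specific EOI through
an ops table. All Demo* names and the 0x1000 offset are hypothetical
stand-ins chosen for illustration; they are not the QEMU definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* All Demo* names are hypothetical stand-ins, not the QEMU definitions. */
#define demo_container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

typedef struct DemoDevice DemoDevice;

typedef struct DemoDeviceOps {
    void (*eoi)(DemoDevice *vbasedev); /* specialized by the parent device */
} DemoDeviceOps;

struct DemoDevice {
    const char *name;
    const DemoDeviceOps *ops;
};

typedef struct DemoRegion {
    DemoDevice *vbasedev; /* back pointer to the base device */
    uint64_t fd_offset;   /* offset of the region within the device fd */
    uint8_t nr;           /* region number, kept for debug output */
} DemoRegion;

typedef struct DemoPCIDevice {
    int intx_pending;
    DemoDevice vbasedev;  /* embedded base device */
    DemoRegion bar0;
} DemoPCIDevice;

/* PCI-specific EOI: recover the parent object from the base device. */
static void demo_pci_eoi(DemoDevice *vbasedev)
{
    DemoPCIDevice *vdev = demo_container_of(vbasedev, DemoPCIDevice, vbasedev);

    vdev->intx_pending = 0;
}

static const DemoDeviceOps demo_pci_ops = { .eoi = demo_pci_eoi };

static void demo_pci_init(DemoPCIDevice *vdev, const char *name)
{
    vdev->intx_pending = 0;
    vdev->vbasedev.name = name;
    vdev->vbasedev.ops = &demo_pci_ops;
    vdev->bar0.vbasedev = &vdev->vbasedev; /* must be initialized per region */
    vdev->bar0.fd_offset = 0x1000;         /* arbitrary illustrative offset */
    vdev->bar0.nr = 0;
}

/*
 * Generic region access: it only knows the region and the base device.
 * It signals an EOI through the ops table and returns the offset within
 * the device fd that a real implementation would pass to pread/pwrite.
 */
static uint64_t demo_region_access(DemoRegion *region, uint64_t addr)
{
    DemoDevice *vbasedev = region->vbasedev;

    assert(vbasedev != NULL && vbasedev->ops != NULL);
    vbasedev->ops->eoi(vbasedev);
    return region->fd_offset + addr;
}
```

The point of the design is that demo_region_access never names the PCI
type, so a platform device could reuse it by supplying its own ops.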

Signed-off-by: Eric Auger 

---

v4->v5:
- remove fd field from VFIORegion
- change error_report format string in vfio_region_write/read
- remove #ifdef DEBUG_VFIO in the same function
- correct missing initialization of bar region's vbasedev field
- change Object * parameter name of vfio_mmap_region and remove
  useless OBJECT()
---
 hw/vfio/pci.c | 194 +++---
 1 file changed, 103 insertions(+), 91 deletions(-)

diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
index ae827c5..1a24398 100644
--- a/hw/vfio/pci.c
+++ b/hw/vfio/pci.c
@@ -86,15 +86,19 @@ typedef struct VFIOQuirk {
 } data;
 } VFIOQuirk;
 
-typedef struct VFIOBAR {
-off_t fd_offset; /* offset of BAR within device fd */
-int fd; /* device fd, allows us to pass VFIOBAR as opaque data */
+typedef struct VFIORegion {
+struct VFIODevice *vbasedev;
+off_t fd_offset; /* offset of region within device fd */
 MemoryRegion mem; /* slow, read/write access */
 MemoryRegion mmap_mem; /* direct mapped access */
 void *mmap;
 size_t size;
 uint32_t flags; /* VFIO region flags (rd/wr/mmap) */
-uint8_t nr; /* cache the BAR number for debug */
+uint8_t nr; /* cache the region number for debug */
+} VFIORegion;
+
+typedef struct VFIOBAR {
+VFIORegion region;
 bool ioport;
 bool mem64;
 QLIST_HEAD(, VFIOQuirk) quirks;
@@ -214,6 +218,7 @@ typedef struct VFIODevice {
 struct VFIODeviceOps {
 bool (*vfio_compute_needs_reset)(VFIODevice *vdev);
 int (*vfio_hot_reset_multi)(VFIODevice *vdev);
+void (*vfio_eoi)(VFIODevice *vdev);
 };
 
 typedef struct VFIOPCIDevice {
@@ -397,8 +402,10 @@ static void vfio_intx_interrupt(void *opaque)
 }
 }
 
-static void vfio_eoi(VFIOPCIDevice *vdev)
+static void vfio_eoi(VFIODevice *vbasedev)
 {
+VFIOPCIDevice *vdev = container_of(vbasedev, VFIOPCIDevice, vbasedev);
+
 if (!vdev->intx.pending) {
 return;
 }
@@ -408,7 +415,7 @@ static void vfio_eoi(VFIOPCIDevice *vdev)
 
 vdev->intx.pending = false;
 pci_irq_deassert(&vdev->pdev);
-vfio_unmask_irqindex(&vdev->vbasedev, VFIO_PCI_INTX_IRQ_INDEX);
+vfio_unmask_irqindex(vbasedev, VFIO_PCI_INTX_IRQ_INDEX);
 }
 
 static void vfio_enable_intx_kvm(VFIOPCIDevice *vdev)
@@ -563,7 +570,7 @@ static void vfio_update_irq(PCIDevice *pdev)
 vfio_enable_intx_kvm(vdev);
 
 /* Re-enable the interrupt in cased we missed an EOI */
-vfio_eoi(vdev);
+vfio_eoi(&vdev->vbasedev);
 }
 
 static int vfio_enable_intx(VFIOPCIDevice *vdev)
@@ -1101,10 +1108,11 @@ static void vfio_update_msi(VFIOPCIDevice *vdev)
 /*
  * IO Port/MMIO - Beware of the endians, VFIO is always little endian
  */
-static void vfio_bar_write(void *opaque, hwaddr addr,
+static void vfio_region_write(void *opaque, hwaddr addr,
uint64_t data, unsigned size)
 {
-VFIOBAR *bar = opaque;
+VFIORegion *region = opaque;
+VFIODevice *vbasedev = region->vbasedev;
 union {
 uint8_t byte;
 uint16_t word;
@@ -1127,21 +1135,16 @@ static void vfio_bar_write(void *opaque, hwaddr addr,
 break;
 }
 
-if (pwrite(bar->fd, &buf, size, bar->fd_offset + addr) != size) {
-error_report("%s(,0x%"HWADDR_PRIx", 0x%"PRIx64", %d) failed: %m",
- __func__, addr, data, size);
+if (pwrite(vbasedev->fd, &buf, size, region->fd_offset + addr) != size) {
+error_report("%s(%s:region%d+0x%"HWADDR_PRIx", 0x%"PRIx64
+ ",%d) failed: %m",
+ __func__, vbasedev->name, region->nr,
+ addr, data, size);
 }
 
-#ifdef DEBUG_VFIO
-{
-VFIOPCIDevice *vdev = container_of(bar, VFIOPCIDevice, bars[bar->nr]);
-
-DPRINTF("%s(%04x:%02x:%02x.%x:BAR%d+0x%"HWADDR_PRIx", 0x%"PRIx64
-", %d)\n", __func__, vdev->host.domain, vdev->host.bus,
-vdev->host.slot, vdev->host.function, bar->nr, addr,
-data, size);
-}
-#endif
+DPRINTF("%s(%s:region%d+0x%"HWADDR_PRIx", 0x%"PRIx64
+", %d)\n", __func__, vbasedev->name,
+region->nr, addr, data, size);
 
 /*
  * A read or write to a BAR always signals an INTx EOI.  This will
@@ -1151,13 +1154,15 @@ static void vfio_bar_write(void *opaque, hwaddr addr,
  * which access will service the interrupt, so we're potentially
  * getting quite a few host interrupts per gu

[Qemu-devel] [PATCH v5 00/10] KVM platform device passthrough

2014-08-09 Thread Eric Auger
This RFC series aims at enabling KVM platform device passthrough.
It implements a VFIO platform device, derived from VFIO PCI device.

The VFIO platform device uses the host VFIO platform driver which must
be bound to the assigned device prior to the QEMU system start.

- the guest can directly access the device register space
- assigned device IRQs are transparently routed to the guest by
  QEMU/KVM (2 methods are currently supported)
- iommu is transparently programmed to prevent the device from
  accessing physical pages outside of the guest address space
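
As an illustration of the user-level eventfd method of IRQ routing, here is
a minimal Linux-only sketch of the signal/test-and-clear cycle on an
eventfd. The demo_* helpers are hypothetical names; in the real setup the
eventfd is handed to the VFIO driver, which signals it from the host IRQ
handler, and that part is only simulated here by demo_irq_signal().

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <sys/eventfd.h>
#include <unistd.h>

/* Create a non-blocking eventfd used as the IRQ notifier. */
static int demo_irq_eventfd_create(void)
{
    return eventfd(0, EFD_NONBLOCK | EFD_CLOEXEC);
}

/* Stand-in for the host side signalling that the device IRQ fired. */
static int demo_irq_signal(int fd)
{
    uint64_t one = 1;

    assert(fd >= 0);
    return write(fd, &one, sizeof(one)) == (ssize_t)sizeof(one) ? 0 : -1;
}

/*
 * User-side handler step: returns 1 if an interrupt was pending (and
 * clears it), 0 if nothing was pending, -1 on any other error.
 */
static int demo_irq_test_and_clear(int fd)
{
    uint64_t count;
    ssize_t len = read(fd, &count, sizeof(count));

    if (len == (ssize_t)sizeof(count)) {
        return 1;
    }
    if (len < 0 && errno == EAGAIN) {
        return 0;
    }
    return -1;
}
```

With irqfd, the same file descriptor would instead be consumed by KVM, so
the interrupt is injected without a round trip through user space.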

This series relies on the following QEMU patch series:

- Alex Graf's "Dynamic sysbus device allocation support"
  http://lists.gnu.org/archive/html/qemu-ppc/2014-07/msg00047.html
  (up to "sysbus: Make devices spawnable via -device")
- [RFC v2] machvirt dynamic sysbus device instantiation

This patch series is made of the following patch files:

1-5) Modifications to PCI code to prepare for VFIO platform device
6) split of PCI specific code and generic code (move)
7) creation of the VFIO platform device, without irqfd support
   (MMIO direct access and IRQ assignment).
8-9) addition of irqfd/virqfd support
10) capability to dynamically instantiate the device

v4->v5:
- rebase on v2.1.0 PCI code
- take into account Alex Williamson's comments on PCI code rework
  - trace updates in vfio_region_write/read
  - remove fd from VFIORegion
  - get/put cleanup
- bug fix: the BAR region's vbasedev field is now duly initialized
- misc cleanups in platform device
- device tree node generation removed from device and handled in
  hw/arm/dyn_sysbus_devtree.c
- remove "hw/vfio: add an example calxeda_xgmac": with the removal of
  device tree node generation there is not much left to implement in
  that derived device yet. It may be reintroduced later if needed,
  typically for reset/migration.
- no GSI routing table anymore

v3->v4 changes (Eric Auger, Alvise Rigo)
- rebase on last VFIO PCI code (v2.1.0-rc0)
- full git history rework to ease PCI code change review
- mv include files in hw/vfio
- DPRINTF reformatting temporarily moved out
- support of VFIO virq (removal of resamplefd handler on user-side)
- integration with sysbus dynamic instantiation framework
- removal of unrealize and cleanup routines until it is better
  understood what is really needed
- Support of VFIO for Amba devices should be handled in an inherited
  device to specialize the device tree generation (clock handle currently
  missing in framework however)
- "Always use eventfd as notifying mechanism" temporarily moved out
- static instantiation is not mainstream (although it remains possible)
  note: if static instantiation is used, irqfd must be set up in the
  machine file once the virtual IRQ is known
- create the GSI routing table on qemu side

v2->v3 changes (Alvise Rigo, Eric Auger):
- Following Alex W's recommendations, further efforts to factorize the
  code between PCI and platform: introduction of VFIODevice and
  VFIORegion as base classes
- unique reset handler for platform and PCI
- cleanup following Kim's comments
- multiple IRQ support mechanics should be in place although not
  tested
- Better handling of MMIO multiple regions
- New features and fixes by Alvise (multiple compat string, exec
  flag, force eventfd usage, amba device tree support)
- irqfd support

v1->v2 changes (Kim Phillips, Eric Auger):
- IRQ initial support (legacy mode where eventfds are handled on
  user side)
- hacked dynamic instantiation

v1 (Kim Phillips):
- initial split between PCI and platform
- MMIO support only
- static instantiation

This patch has the following kernel side dependencies:

- [RFC Patch v6 0/20] VFIO support for platform devices
https://www.mail-archive.com/kvm@vger.kernel.org/msg103247.html
- [Patch] ARM: KVM: Handle IPA unmapping on memory region deletion
https://patches.linaro.org/27691/
- [PATCH RFC] ARM: KVM: add irqfd support
http://www.gossamer-threads.com/lists/linux/kernel/1981144
- arm/arm64: KVM: Various VGIC cleanups and improvements
http://comments.gmane.org/gmane.linux.ports.arm.kernel/340430
- [PATCH] ARM: KVM: Enable the KVM-VFIO device
https://lists.cs.columbia.edu/pipermail/kvmarm/2014-March/008629.html

Those kernel pieces can be found at:
git://git.linaro.org/people/eric.auger/linux.git (branch irqfd_integ_v4)

QEMU patch files and dependencies can be found at:
git://git.linaro.org/people/eric.auger/qemu.git (branch vfio_integ_v5)

The patch series was tested on Calxeda Midway (ARMv7) where one xgmac
is assigned to the KVM host while the second one is assigned to the guest.
Unfortunately, only a single IRQ is exercised. The reworked PCI device is
not tested.

https://wiki.linaro.org/LEG/Engineering/Virtualization/Platform_Device_Passthrough_on_Midway

Best Regards

Eric




Eric Auger (9):
  hw/vfio/pci: Rename VFIODevice into VFIOPCIDevice
  hw/vfio/pci: introduce VFIODevice
  hw/vfio/pci: Introduce VFIORegion
  hw/vfio/pci: split vfio_get_

[Qemu-devel] [PATCH v5 02/10] hw/vfio/pci: Rename VFIODevice into VFIOPCIDevice

2014-08-09 Thread Eric Auger
This prepares for the introduction of VFIOPlatformDevice

Signed-off-by: Eric Auger 
---
 hw/vfio/pci.c | 209 +-
 1 file changed, 105 insertions(+), 104 deletions(-)

diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
index 188fdd2..c2cdd73 100644
--- a/hw/vfio/pci.c
+++ b/hw/vfio/pci.c
@@ -56,11 +56,11 @@
 #define VFIO_ALLOW_KVM_MSI 1
 #define VFIO_ALLOW_KVM_MSIX 1
 
-struct VFIODevice;
+struct VFIOPCIDevice;
 
 typedef struct VFIOQuirk {
 MemoryRegion mem;
-struct VFIODevice *vdev;
+struct VFIOPCIDevice *vdev;
 QLIST_ENTRY(VFIOQuirk) next;
 struct {
 uint32_t base_offset:TARGET_PAGE_BITS;
@@ -131,7 +131,7 @@ typedef struct VFIOMSIVector {
  */
 EventNotifier interrupt;
 EventNotifier kvm_interrupt;
-struct VFIODevice *vdev; /* back pointer to device */
+struct VFIOPCIDevice *vdev; /* back pointer to device */
 int virq;
 bool use;
 } VFIOMSIVector;
@@ -193,7 +193,7 @@ typedef struct VFIOMSIXInfo {
 void *mmap;
 } VFIOMSIXInfo;
 
-typedef struct VFIODevice {
+typedef struct VFIOPCIDevice {
 PCIDevice pdev;
 int fd;
 VFIOINTx intx;
@@ -211,7 +211,7 @@ typedef struct VFIODevice {
 VFIOBAR bars[PCI_NUM_REGIONS - 1]; /* No ROM */
 VFIOVGA vga; /* 0xa, 0x3b0, 0x3c0 */
 PCIHostDeviceAddress host;
-QLIST_ENTRY(VFIODevice) next;
+QLIST_ENTRY(VFIOPCIDevice) next;
 struct VFIOGroup *group;
 EventNotifier err_notifier;
 uint32_t features;
@@ -226,13 +226,13 @@ typedef struct VFIODevice {
 bool has_pm_reset;
 bool needs_reset;
 bool rom_read_failed;
-} VFIODevice;
+} VFIOPCIDevice;
 
 typedef struct VFIOGroup {
 int fd;
 int groupid;
 VFIOContainer *container;
-QLIST_HEAD(, VFIODevice) device_list;
+QLIST_HEAD(, VFIOPCIDevice) device_list;
 QLIST_ENTRY(VFIOGroup) next;
 QLIST_ENTRY(VFIOGroup) container_next;
 } VFIOGroup;
@@ -276,16 +276,16 @@ static QLIST_HEAD(, VFIOGroup)
 static int vfio_kvm_device_fd = -1;
 #endif
 
-static void vfio_disable_interrupts(VFIODevice *vdev);
+static void vfio_disable_interrupts(VFIOPCIDevice *vdev);
 static uint32_t vfio_pci_read_config(PCIDevice *pdev, uint32_t addr, int len);
 static void vfio_pci_write_config(PCIDevice *pdev, uint32_t addr,
   uint32_t val, int len);
-static void vfio_mmap_set_enabled(VFIODevice *vdev, bool enabled);
+static void vfio_mmap_set_enabled(VFIOPCIDevice *vdev, bool enabled);
 
 /*
  * Common VFIO interrupt disable
  */
-static void vfio_disable_irqindex(VFIODevice *vdev, int index)
+static void vfio_disable_irqindex(VFIOPCIDevice *vdev, int index)
 {
 struct vfio_irq_set irq_set = {
 .argsz = sizeof(irq_set),
@@ -301,7 +301,7 @@ static void vfio_disable_irqindex(VFIODevice *vdev, int 
index)
 /*
  * INTx
  */
-static void vfio_unmask_intx(VFIODevice *vdev)
+static void vfio_unmask_intx(VFIOPCIDevice *vdev)
 {
 struct vfio_irq_set irq_set = {
 .argsz = sizeof(irq_set),
@@ -315,7 +315,7 @@ static void vfio_unmask_intx(VFIODevice *vdev)
 }
 
 #ifdef CONFIG_KVM /* Unused outside of CONFIG_KVM code */
-static void vfio_mask_intx(VFIODevice *vdev)
+static void vfio_mask_intx(VFIOPCIDevice *vdev)
 {
 struct vfio_irq_set irq_set = {
 .argsz = sizeof(irq_set),
@@ -346,7 +346,7 @@ static void vfio_mask_intx(VFIODevice *vdev)
  */
 static void vfio_intx_mmap_enable(void *opaque)
 {
-VFIODevice *vdev = opaque;
+VFIOPCIDevice *vdev = opaque;
 
 if (vdev->intx.pending) {
 timer_mod(vdev->intx.mmap_timer,
@@ -359,7 +359,7 @@ static void vfio_intx_mmap_enable(void *opaque)
 
 static void vfio_intx_interrupt(void *opaque)
 {
-VFIODevice *vdev = opaque;
+VFIOPCIDevice *vdev = opaque;
 
 if (!event_notifier_test_and_clear(&vdev->intx.interrupt)) {
 return;
@@ -378,7 +378,7 @@ static void vfio_intx_interrupt(void *opaque)
 }
 }
 
-static void vfio_eoi(VFIODevice *vdev)
+static void vfio_eoi(VFIOPCIDevice *vdev)
 {
 if (!vdev->intx.pending) {
 return;
@@ -392,7 +392,7 @@ static void vfio_eoi(VFIODevice *vdev)
 vfio_unmask_intx(vdev);
 }
 
-static void vfio_enable_intx_kvm(VFIODevice *vdev)
+static void vfio_enable_intx_kvm(VFIOPCIDevice *vdev)
 {
 #ifdef CONFIG_KVM
 struct kvm_irqfd irqfd = {
@@ -471,7 +471,7 @@ fail:
 #endif
 }
 
-static void vfio_disable_intx_kvm(VFIODevice *vdev)
+static void vfio_disable_intx_kvm(VFIOPCIDevice *vdev)
 {
 #ifdef CONFIG_KVM
 struct kvm_irqfd irqfd = {
@@ -516,7 +516,7 @@ static void vfio_disable_intx_kvm(VFIODevice *vdev)
 
 static void vfio_update_irq(PCIDevice *pdev)
 {
-VFIODevice *vdev = DO_UPCAST(VFIODevice, pdev, pdev);
+VFIOPCIDevice *vdev = DO_UPCAST(VFIOPCIDevice, pdev, pdev);
 PCIINTxRoute route;
 
 if (vdev->interrupt != VFIO_INT_INTx) {
@@ -547,7 +547,7 @@ static void vfio_update_irq(PCIDevice *pdev)
 vfio_eoi(vdev);
 }
 
-static int vfio_enable_intx(VFIODevice

[Qemu-devel] [PATCH v5 01/10] vfio: move hw/misc/vfio.c to hw/vfio/pci.c Move vfio.h into include/hw/vfio

2014-08-09 Thread Eric Auger
From: Kim Phillips 

This is done in preparation for the addition of VFIO platform
device support.

Signed-off-by: Kim Phillips 
---
 LICENSE  | 2 +-
 MAINTAINERS  | 2 +-
 hw/Makefile.objs | 1 +
 hw/misc/Makefile.objs| 1 -
 hw/ppc/spapr_pci_vfio.c  | 2 +-
 hw/vfio/Makefile.objs| 3 +++
 hw/{misc/vfio.c => vfio/pci.c}   | 2 +-
 include/hw/{misc => vfio}/vfio.h | 0
 8 files changed, 8 insertions(+), 5 deletions(-)
 create mode 100644 hw/vfio/Makefile.objs
 rename hw/{misc/vfio.c => vfio/pci.c} (99%)
 rename include/hw/{misc => vfio}/vfio.h (100%)

diff --git a/LICENSE b/LICENSE
index da70e94..0e0b4b9 100644
--- a/LICENSE
+++ b/LICENSE
@@ -11,7 +11,7 @@ option) any later version.
 
 As of July 2013, contributions under version 2 of the GNU General Public
 License (and no later version) are only accepted for the following files
-or directories: bsd-user/, linux-user/, hw/misc/vfio.c, hw/xen/xen_pt*.
+or directories: bsd-user/, linux-user/, hw/vfio/, hw/xen/xen_pt*.
 
 3) The Tiny Code Generator (TCG) is released under the BSD license
(see license headers in files).
diff --git a/MAINTAINERS b/MAINTAINERS
index 906f252..866e3c6 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -619,7 +619,7 @@ F: tests/usb-hcd-ehci-test.c
 VFIO
 M: Alex Williamson 
 S: Supported
-F: hw/misc/vfio.c
+F: hw/vfio/*
 
 vhost
 M: Michael S. Tsirkin 
diff --git a/hw/Makefile.objs b/hw/Makefile.objs
index 52a1464..73afa41 100644
--- a/hw/Makefile.objs
+++ b/hw/Makefile.objs
@@ -26,6 +26,7 @@ devices-dirs-$(CONFIG_SOFTMMU) += ssi/
 devices-dirs-$(CONFIG_SOFTMMU) += timer/
 devices-dirs-$(CONFIG_TPM) += tpm/
 devices-dirs-$(CONFIG_SOFTMMU) += usb/
+devices-dirs-$(CONFIG_SOFTMMU) += vfio/
 devices-dirs-$(CONFIG_VIRTIO) += virtio/
 devices-dirs-$(CONFIG_SOFTMMU) += watchdog/
 devices-dirs-$(CONFIG_SOFTMMU) += xen/
diff --git a/hw/misc/Makefile.objs b/hw/misc/Makefile.objs
index 86f6243..9b77554 100644
--- a/hw/misc/Makefile.objs
+++ b/hw/misc/Makefile.objs
@@ -21,7 +21,6 @@ common-obj-$(CONFIG_MACIO) += macio/
 
 ifeq ($(CONFIG_PCI), y)
 obj-$(CONFIG_KVM) += ivshmem.o
-obj-$(CONFIG_LINUX) += vfio.o
 endif
 
 obj-$(CONFIG_REALVIEW) += arm_sysctl.o
diff --git a/hw/ppc/spapr_pci_vfio.c b/hw/ppc/spapr_pci_vfio.c
index d3bddf2..144912b 100644
--- a/hw/ppc/spapr_pci_vfio.c
+++ b/hw/ppc/spapr_pci_vfio.c
@@ -20,7 +20,7 @@
 #include "hw/ppc/spapr.h"
 #include "hw/pci-host/spapr.h"
 #include "linux/vfio.h"
-#include "hw/misc/vfio.h"
+#include "hw/vfio/vfio.h"
 
 static Property spapr_phb_vfio_properties[] = {
 DEFINE_PROP_INT32("iommu", sPAPRPHBVFIOState, iommugroupid, -1),
diff --git a/hw/vfio/Makefile.objs b/hw/vfio/Makefile.objs
new file mode 100644
index 000..31c7dab
--- /dev/null
+++ b/hw/vfio/Makefile.objs
@@ -0,0 +1,3 @@
+ifeq ($(CONFIG_LINUX), y)
+obj-$(CONFIG_PCI) += pci.o
+endif
diff --git a/hw/misc/vfio.c b/hw/vfio/pci.c
similarity index 99%
rename from hw/misc/vfio.c
rename to hw/vfio/pci.c
index ba08adb..188fdd2 100644
--- a/hw/misc/vfio.c
+++ b/hw/vfio/pci.c
@@ -39,7 +39,7 @@
 #include "qemu/range.h"
 #include "sysemu/kvm.h"
 #include "sysemu/sysemu.h"
-#include "hw/misc/vfio.h"
+#include "hw/vfio/vfio.h"
 
 /* #define DEBUG_VFIO */
 #ifdef DEBUG_VFIO
diff --git a/include/hw/misc/vfio.h b/include/hw/vfio/vfio.h
similarity index 100%
rename from include/hw/misc/vfio.h
rename to include/hw/vfio/vfio.h
-- 
1.8.3.2




[Qemu-devel] [PATCH v5 03/10] hw/vfio/pci: introduce VFIODevice

2014-08-09 Thread Eric Auger
Introduce the VFIODevice struct that is going to be shared by
VFIOPCIDevice and VFIOPlatformDevice.

Additional fields will be added there later on for review
convenience.

The group's device_list becomes a list of VFIODevice.

This requires reworking the reset_handler, which becomes generic and
calls VFIODevice ops that are specialized in each parent object.
Functions that iterate on this list must also take into account that
the devices can be something other than VFIOPCIDevice. The type field
is used to discriminate them.

We take the opportunity to change the prototypes of
vfio_unmask_intx, vfio_mask_intx and vfio_disable_irqindex, which now
apply to VFIODevice. They are renamed to *_irqindex.
The index is passed as a parameter to anticipate their use for
platform IRQs.
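
To make the generic reset handler idea concrete, here is a minimal sketch
of a handler that walks a list of base devices and only calls per-type ops.
The Demo* types, the plain singly linked list (instead of QLIST) and the
two-pass structure are assumptions made for illustration; the real handler
in the patch may differ in detail.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the device type enum in the patch. */
enum { DEMO_TYPE_PCI = 0, DEMO_TYPE_PLATFORM = 1 };

typedef struct DemoDevice DemoDevice;

typedef struct DemoDeviceOps {
    bool (*compute_needs_reset)(DemoDevice *vbasedev);
    int (*hot_reset_multi)(DemoDevice *vbasedev);
} DemoDeviceOps;

struct DemoDevice {
    DemoDevice *next;      /* simple list link, stands in for QLIST_ENTRY */
    int type;              /* discriminates the parent object type */
    bool needs_reset;
    int reset_count;       /* illustration only: counts resets performed */
    const DemoDeviceOps *ops;
};

/* In this sketch a PCI device always asks for a reset. */
static bool demo_pci_compute_needs_reset(DemoDevice *d)
{
    d->needs_reset = true;
    return d->needs_reset;
}

/* In this sketch a platform device never asks for one. */
static bool demo_platform_compute_needs_reset(DemoDevice *d)
{
    d->needs_reset = false;
    return d->needs_reset;
}

static int demo_hot_reset_multi(DemoDevice *d)
{
    d->reset_count++;
    d->needs_reset = false;
    return 0;
}

static const DemoDeviceOps demo_pci_ops = {
    .compute_needs_reset = demo_pci_compute_needs_reset,
    .hot_reset_multi = demo_hot_reset_multi,
};

static const DemoDeviceOps demo_platform_ops = {
    .compute_needs_reset = demo_platform_compute_needs_reset,
    .hot_reset_multi = demo_hot_reset_multi,
};

/* Generic handler: no knowledge of PCI or platform specifics. */
static void demo_reset_handler(DemoDevice *head)
{
    DemoDevice *d;

    for (d = head; d != NULL; d = d->next) {
        d->ops->compute_needs_reset(d);
    }
    for (d = head; d != NULL; d = d->next) {
        if (d->needs_reset) {
            d->ops->hot_reset_multi(d);
        }
    }
}

/* Code that only cares about one kind of device filters on the type. */
static int demo_count_type(const DemoDevice *head, int type)
{
    int n = 0;

    for (; head != NULL; head = head->next) {
        n += (head->type == type);
    }
    return n;
}
```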

Signed-off-by: Eric Auger 

---

v4->v5:
- fix style issues
- in vfio_initfn, rework allocation of vdev->vbasedev.name and
  replace snprintf by g_strdup_printf
---
 hw/vfio/pci.c | 239 +++---
 1 file changed, 146 insertions(+), 93 deletions(-)

diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
index c2cdd73..ae827c5 100644
--- a/hw/vfio/pci.c
+++ b/hw/vfio/pci.c
@@ -56,6 +56,11 @@
 #define VFIO_ALLOW_KVM_MSI 1
 #define VFIO_ALLOW_KVM_MSIX 1
 
+enum {
+VFIO_DEVICE_TYPE_PCI = 0,
+VFIO_DEVICE_TYPE_PLATFORM = 1,
+};
+
 struct VFIOPCIDevice;
 
 typedef struct VFIOQuirk {
@@ -193,9 +198,27 @@ typedef struct VFIOMSIXInfo {
 void *mmap;
 } VFIOMSIXInfo;
 
+typedef struct VFIODeviceOps VFIODeviceOps;
+
+typedef struct VFIODevice {
+QLIST_ENTRY(VFIODevice) next;
+struct VFIOGroup *group;
+char *name;
+int fd;
+int type;
+bool reset_works;
+bool needs_reset;
+VFIODeviceOps *ops;
+} VFIODevice;
+
+struct VFIODeviceOps {
+bool (*vfio_compute_needs_reset)(VFIODevice *vdev);
+int (*vfio_hot_reset_multi)(VFIODevice *vdev);
+};
+
 typedef struct VFIOPCIDevice {
 PCIDevice pdev;
-int fd;
+VFIODevice vbasedev;
 VFIOINTx intx;
 unsigned int config_size;
 uint8_t *emulated_config_bits; /* QEMU emulated bits, little-endian */
@@ -211,20 +234,16 @@ typedef struct VFIOPCIDevice {
 VFIOBAR bars[PCI_NUM_REGIONS - 1]; /* No ROM */
 VFIOVGA vga; /* 0xa, 0x3b0, 0x3c0 */
 PCIHostDeviceAddress host;
-QLIST_ENTRY(VFIOPCIDevice) next;
-struct VFIOGroup *group;
 EventNotifier err_notifier;
 uint32_t features;
 #define VFIO_FEATURE_ENABLE_VGA_BIT 0
 #define VFIO_FEATURE_ENABLE_VGA (1 << VFIO_FEATURE_ENABLE_VGA_BIT)
 int32_t bootindex;
 uint8_t pm_cap;
-bool reset_works;
 bool has_vga;
 bool pci_aer;
 bool has_flr;
 bool has_pm_reset;
-bool needs_reset;
 bool rom_read_failed;
 } VFIOPCIDevice;
 
@@ -232,7 +251,7 @@ typedef struct VFIOGroup {
 int fd;
 int groupid;
 VFIOContainer *container;
-QLIST_HEAD(, VFIOPCIDevice) device_list;
+QLIST_HEAD(, VFIODevice) device_list;
 QLIST_ENTRY(VFIOGroup) next;
 QLIST_ENTRY(VFIOGroup) container_next;
 } VFIOGroup;
@@ -285,7 +304,7 @@ static void vfio_mmap_set_enabled(VFIOPCIDevice *vdev, bool 
enabled);
 /*
  * Common VFIO interrupt disable
  */
-static void vfio_disable_irqindex(VFIOPCIDevice *vdev, int index)
+static void vfio_disable_irqindex(VFIODevice *vbasedev, int index)
 {
 struct vfio_irq_set irq_set = {
 .argsz = sizeof(irq_set),
@@ -295,37 +314,37 @@ static void vfio_disable_irqindex(VFIOPCIDevice *vdev, 
int index)
 .count = 0,
 };
 
-ioctl(vdev->fd, VFIO_DEVICE_SET_IRQS, &irq_set);
+ioctl(vbasedev->fd, VFIO_DEVICE_SET_IRQS, &irq_set);
 }
 
 /*
  * INTx
  */
-static void vfio_unmask_intx(VFIOPCIDevice *vdev)
+static void vfio_unmask_irqindex(VFIODevice *vbasedev, int index)
 {
 struct vfio_irq_set irq_set = {
 .argsz = sizeof(irq_set),
 .flags = VFIO_IRQ_SET_DATA_NONE | VFIO_IRQ_SET_ACTION_UNMASK,
-.index = VFIO_PCI_INTX_IRQ_INDEX,
+.index = index,
 .start = 0,
 .count = 1,
 };
 
-ioctl(vdev->fd, VFIO_DEVICE_SET_IRQS, &irq_set);
+ioctl(vbasedev->fd, VFIO_DEVICE_SET_IRQS, &irq_set);
 }
 
 #ifdef CONFIG_KVM /* Unused outside of CONFIG_KVM code */
-static void vfio_mask_intx(VFIOPCIDevice *vdev)
+static void vfio_mask_irqindex(VFIODevice *vbasedev, int index)
 {
 struct vfio_irq_set irq_set = {
 .argsz = sizeof(irq_set),
 .flags = VFIO_IRQ_SET_DATA_NONE | VFIO_IRQ_SET_ACTION_MASK,
-.index = VFIO_PCI_INTX_IRQ_INDEX,
+.index = index,
 .start = 0,
 .count = 1,
 };
 
-ioctl(vdev->fd, VFIO_DEVICE_SET_IRQS, &irq_set);
+ioctl(vbasedev->fd, VFIO_DEVICE_SET_IRQS, &irq_set);
 }
 #endif
 
@@ -389,7 +408,7 @@ static void vfio_eoi(VFIOPCIDevice *vdev)
 
 vdev->intx.pending = false;
 pci_irq_deassert(&vdev->pdev);
-vfio_unmask_intx(vdev);
+vfio_unmask_irqindex(&vdev->vbasedev, VFIO_PCI_INTX_IRQ_I

[Qemu-devel] [PATCH v5 05/10] hw/vfio/pci: split vfio_get_device

2014-08-09 Thread Eric Auger
vfio_get_device now takes a VFIODevice as argument. The function is split
into 4 functional parts: dev_info query, device check, region populate
and interrupt populate. The last 3 are specialized by the parent device
and are added to VFIODeviceOps.

3 new fields (num_irqs, num_regions, flags) are introduced in VFIODevice
to store dev_info.

vfio_put_base_device is created.
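
The four-step split can be sketched as follows. This is an illustration
under stated assumptions, not the patch code: the Demo* names, the
DemoDevInfo struct (standing in for the result of the device info ioctl),
DEMO_FLAG_PCI and DEMO_MIN_PCI_REGIONS are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_FLAG_PCI        (1u << 1) /* hypothetical flag value */
#define DEMO_MIN_PCI_REGIONS 8u        /* hypothetical minimum */

typedef struct DemoDevice DemoDevice;

typedef struct DemoDeviceOps {
    int (*check_device)(DemoDevice *vbasedev);
    int (*populate_regions)(DemoDevice *vbasedev);
    int (*populate_interrupts)(DemoDevice *vbasedev);
} DemoDeviceOps;

/* Stand-in for what the device info query would return. */
typedef struct DemoDevInfo {
    unsigned int flags;
    unsigned int num_regions;
    unsigned int num_irqs;
} DemoDevInfo;

struct DemoDevice {
    int fd;
    unsigned int num_irqs;
    unsigned int num_regions;
    unsigned int flags;
    int regions_populated; /* illustration only */
    int irqs_populated;    /* illustration only */
    const DemoDeviceOps *ops;
};

static int demo_pci_check_device(DemoDevice *d)
{
    if (!(d->flags & DEMO_FLAG_PCI)) {
        return -1; /* not a PCI device */
    }
    if (d->num_regions < DEMO_MIN_PCI_REGIONS) {
        return -1; /* unexpected number of regions */
    }
    return 0;
}

static int demo_pci_populate_regions(DemoDevice *d)
{
    d->regions_populated = 1;
    return 0;
}

static int demo_pci_populate_interrupts(DemoDevice *d)
{
    d->irqs_populated = 1;
    return 0;
}

static const DemoDeviceOps demo_pci_ops = {
    .check_device = demo_pci_check_device,
    .populate_regions = demo_pci_populate_regions,
    .populate_interrupts = demo_pci_populate_interrupts,
};

/* Undo what the generic part of demo_get_device set up. */
static void demo_put_base_device(DemoDevice *d)
{
    d->fd = -1;
}

static int demo_get_device(DemoDevice *d, int fd, const DemoDevInfo *info)
{
    assert(d->ops != NULL);
    d->fd = fd;

    /* 1) dev_info query: cache the generic information in the base device */
    d->flags = info->flags;
    d->num_regions = info->num_regions;
    d->num_irqs = info->num_irqs;

    /* 2) device check, 3) region populate, 4) interrupt populate */
    if (d->ops->check_device(d) ||
        d->ops->populate_regions(d) ||
        d->ops->populate_interrupts(d)) {
        demo_put_base_device(d);
        return -1;
    }
    return 0;
}
```

Keeping the cleanup at the demo_get_device level means the three
specialized callbacks only have to report failure, not unwind state.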

---

v4->v5:
- cleanup of error handling and get/put operations in
  vfio_check_device, vfio_populate_regions, vfio_populate_interrupts and
  vfio_get_device.
  - correct misuse of errno
  - vfio_populate_regions always returns 0
  - VFIODevice .name deallocation done in vfio_put_device instead of
vfio_put_base_device
  - vfio_put_base_device done at vfio_get_device level.

Signed-off-by: Eric Auger 
---
 hw/vfio/pci.c | 181 ++
 1 file changed, 120 insertions(+), 61 deletions(-)

diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
index 1a24398..5f218b7 100644
--- a/hw/vfio/pci.c
+++ b/hw/vfio/pci.c
@@ -213,12 +213,18 @@ typedef struct VFIODevice {
 bool reset_works;
 bool needs_reset;
 VFIODeviceOps *ops;
+unsigned int num_irqs;
+unsigned int num_regions;
+unsigned int flags;
 } VFIODevice;
 
 struct VFIODeviceOps {
 bool (*vfio_compute_needs_reset)(VFIODevice *vdev);
 int (*vfio_hot_reset_multi)(VFIODevice *vdev);
 void (*vfio_eoi)(VFIODevice *vdev);
+int (*vfio_check_device)(VFIODevice *vdev);
+int (*vfio_populate_regions)(VFIODevice *vdev);
+int (*vfio_populate_interrupts)(VFIODevice *vdev);
 };
 
 typedef struct VFIOPCIDevice {
@@ -305,6 +311,10 @@ static uint32_t vfio_pci_read_config(PCIDevice *pdev, 
uint32_t addr, int len);
 static void vfio_pci_write_config(PCIDevice *pdev, uint32_t addr,
   uint32_t val, int len);
 static void vfio_mmap_set_enabled(VFIOPCIDevice *vdev, bool enabled);
+static void vfio_put_base_device(VFIODevice *vbasedev);
+static int vfio_check_device(VFIODevice *vbasedev);
+static int vfio_populate_regions(VFIODevice *vbasedev);
+static int vfio_populate_interrupts(VFIODevice *vbasedev);
 
 /*
  * Common VFIO interrupt disable
@@ -3608,6 +3618,9 @@ static VFIODeviceOps vfio_pci_ops = {
 .vfio_compute_needs_reset = vfio_pci_compute_needs_reset,
 .vfio_hot_reset_multi = vfio_pci_hot_reset_multi,
 .vfio_eoi = vfio_eoi,
+.vfio_check_device = vfio_check_device,
+.vfio_populate_regions = vfio_populate_regions,
+.vfio_populate_interrupts = vfio_populate_interrupts,
 };
 
 static void vfio_reset_handler(void *opaque)
@@ -3949,54 +3962,52 @@ static void vfio_put_group(VFIOGroup *group)
 }
 }
 
-static int vfio_get_device(VFIOGroup *group, const char *name,
-   VFIOPCIDevice *vdev)
+static int vfio_check_device(VFIODevice *vbasedev)
 {
-struct vfio_device_info dev_info = { .argsz = sizeof(dev_info) };
-struct vfio_region_info reg_info = { .argsz = sizeof(reg_info) };
-struct vfio_irq_info irq_info = { .argsz = sizeof(irq_info) };
-int ret, i;
-
-ret = ioctl(group->fd, VFIO_GROUP_GET_DEVICE_FD, name);
-if (ret < 0) {
-error_report("vfio: error getting device %s from group %d: %m",
- name, group->groupid);
-error_printf("Verify all devices in group %d are bound to vfio-pci "
- "or pci-stub and not already in use\n", group->groupid);
-return ret;
+if (!(vbasedev->flags & VFIO_DEVICE_FLAGS_PCI)) {
+error_report("vfio: Um, this isn't a PCI device");
+goto error;
 }
-
-vdev->vbasedev.fd = ret;
-vdev->vbasedev.group = group;
-QLIST_INSERT_HEAD(&group->device_list, &vdev->vbasedev, next);
-
-/* Sanity check device */
-ret = ioctl(vdev->vbasedev.fd, VFIO_DEVICE_GET_INFO, &dev_info);
-if (ret) {
-error_report("vfio: error getting device info: %m");
+if (vbasedev->num_regions < VFIO_PCI_CONFIG_REGION_INDEX + 1) {
+error_report("vfio: unexpected number of io regions %u",
+ vbasedev->num_regions);
 goto error;
 }
-
-DPRINTF("Device %s flags: %u, regions: %u, irgs: %u\n", name,
-dev_info.flags, dev_info.num_regions, dev_info.num_irqs);
-
-if (!(dev_info.flags & VFIO_DEVICE_FLAGS_PCI)) {
-error_report("vfio: Um, this isn't a PCI device");
+if (vbasedev->num_irqs < VFIO_PCI_MSIX_IRQ_INDEX + 1) {
+error_report("vfio: unexpected number of irqs %u",
+ vbasedev->num_irqs);
 goto error;
 }
+return 0;
+error:
+return -1;
+}
 
-vdev->vbasedev.reset_works = !!(dev_info.flags & VFIO_DEVICE_FLAGS_RESET);
+static int vfio_populate_interrupts(VFIODevice *vbasedev)
+{
+VFIOPCIDevice *vdev = container_of(vbasedev, VFIOPCIDevice, vbasedev);
+int ret

[Qemu-devel] [PATCH v5 06/10] hw/vfio: create common module

2014-08-09 Thread Eric Auger
A new common module is created. It implements all functions
that are not specific to a device type (PCI, platform).

This patch only moves code (no functional changes).

Signed-off-by: Kim Phillips 
Signed-off-by: Eric Auger 

---

v4 -> v5:
- integrate "sPAPR/IOMMU: Fix TCE entry permission"
- VFIOdevice .name dealloc removed from vfio_put_base_device
- add some includes according to vfio inclusion policy

v3 -> v4:
[Eric Auger]
The move is done after all PCI modifications, to anticipate
VFIO platform needs. The purpose is to ease the whole
review process.

<= v3
First split done by Kim Phillips
---
 hw/vfio/Makefile.objs |1 +
 hw/vfio/common.c  |  990 ++
 hw/vfio/pci.c | 1070 +
 include/hw/vfio/vfio-common.h |  151 ++
 4 files changed, 1147 insertions(+), 1065 deletions(-)
 create mode 100644 hw/vfio/common.c
 create mode 100644 include/hw/vfio/vfio-common.h

diff --git a/hw/vfio/Makefile.objs b/hw/vfio/Makefile.objs
index 31c7dab..e31f30e 100644
--- a/hw/vfio/Makefile.objs
+++ b/hw/vfio/Makefile.objs
@@ -1,3 +1,4 @@
 ifeq ($(CONFIG_LINUX), y)
+obj-$(CONFIG_SOFTMMU) += common.o
 obj-$(CONFIG_PCI) += pci.o
 endif
diff --git a/hw/vfio/common.c b/hw/vfio/common.c
new file mode 100644
index 000..297c508
--- /dev/null
+++ b/hw/vfio/common.c
@@ -0,0 +1,990 @@
+/*
+ * generic functions used by VFIO devices
+ *
+ * Copyright Red Hat, Inc. 2012
+ *
+ * Authors:
+ *  Alex Williamson 
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2.  See
+ * the COPYING file in the top-level directory.
+ *
+ * Based on qemu-kvm device-assignment:
+ *  Adapted for KVM by Qumranet.
+ *  Copyright (c) 2007, Neocleus, Alex Novik (a...@neocleus.com)
+ *  Copyright (c) 2007, Neocleus, Guy Zana (g...@neocleus.com)
+ *  Copyright (C) 2008, Qumranet, Amit Shah (amit.s...@qumranet.com)
+ *  Copyright (C) 2008, Red Hat, Amit Shah (amit.s...@redhat.com)
+ *  Copyright (C) 2008, IBM, Muli Ben-Yehuda (m...@il.ibm.com)
+ */
+
+#include 
+#include 
+#include 
+
+#include "hw/vfio/vfio-common.h"
+#include "hw/vfio/vfio.h"
+#include "exec/address-spaces.h"
+#include "exec/memory.h"
+#include "hw/hw.h"
+#include "qemu/error-report.h"
+#include "sysemu/kvm.h"
+
+QLIST_HEAD(, VFIOGroup)
+group_list = QLIST_HEAD_INITIALIZER(group_list);
+
+QLIST_HEAD(, VFIOAddressSpace) vfio_address_spaces =
+QLIST_HEAD_INITIALIZER(vfio_address_spaces);
+
+#ifdef CONFIG_KVM
+/*
+ * We have a single VFIO pseudo device per KVM VM.  Once created it lives
+ * for the life of the VM.  Closing the file descriptor only drops our
+ * reference to it and the device's reference to kvm.  Therefore once
+ * initialized, this file descriptor is only released on QEMU exit and
+ * we'll re-use it should another vfio device be attached before then.
+ */
+static int vfio_kvm_device_fd = -1;
+#endif
+
+/*
+ * Common VFIO interrupt disable
+ */
+void vfio_disable_irqindex(VFIODevice *vbasedev, int index)
+{
+struct vfio_irq_set irq_set = {
+.argsz = sizeof(irq_set),
+.flags = VFIO_IRQ_SET_DATA_NONE | VFIO_IRQ_SET_ACTION_TRIGGER,
+.index = index,
+.start = 0,
+.count = 0,
+};
+
+ioctl(vbasedev->fd, VFIO_DEVICE_SET_IRQS, &irq_set);
+}
+
+void vfio_unmask_irqindex(VFIODevice *vbasedev, int index)
+{
+struct vfio_irq_set irq_set = {
+.argsz = sizeof(irq_set),
+.flags = VFIO_IRQ_SET_DATA_NONE | VFIO_IRQ_SET_ACTION_UNMASK,
+.index = index,
+.start = 0,
+.count = 1,
+};
+
+ioctl(vbasedev->fd, VFIO_DEVICE_SET_IRQS, &irq_set);
+}
+
+#ifdef CONFIG_KVM /* Unused outside of CONFIG_KVM code */
+void vfio_mask_irqindex(VFIODevice *vbasedev, int index)
+{
+struct vfio_irq_set irq_set = {
+.argsz = sizeof(irq_set),
+.flags = VFIO_IRQ_SET_DATA_NONE | VFIO_IRQ_SET_ACTION_MASK,
+.index = index,
+.start = 0,
+.count = 1,
+};
+
+ioctl(vbasedev->fd, VFIO_DEVICE_SET_IRQS, &irq_set);
+}
+#endif
+
+/*
+ * IO Port/MMIO - Beware of the endians, VFIO is always little endian
+ */
+void vfio_region_write(void *opaque, hwaddr addr,
+   uint64_t data, unsigned size)
+{
+VFIORegion *region = opaque;
+VFIODevice *vbasedev = region->vbasedev;
+union {
+uint8_t byte;
+uint16_t word;
+uint32_t dword;
+uint64_t qword;
+} buf;
+
+switch (size) {
+case 1:
+buf.byte = data;
+break;
+case 2:
+buf.word = data;
+break;
+case 4:
+buf.dword = data;
+break;
+default:
+hw_error("vfio: unsupported write size, %d bytes", size);
+break;
+}
+
+if (pwrite(vbasedev->fd, &buf, size, region->fd_offset + addr) != size) {
+error_report(&

[Qemu-devel] [PATCH v5 10/10] hw/arm/dyn_sysbus_devtree: enable simple VFIO dynamic instantiation

2014-08-09 Thread Eric Auger
Generates the device tree node of each VFIO device instantiated with
the -device option. VFIO devices that require more complex node
generation can be handled beforehand.
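
The compatible-string handling described in the format_compat comment of this patch (replace '/' by ',' and ';' by NUL) can be sketched as follows. This is a minimal standalone sketch, not the code from the patch: the helper name compat_to_fdt_list is hypothetical, and it assumes the user-supplied string lives in a writable buffer. It returns the total length including the final NUL, since the property length can no longer be obtained with strlen() once NULs have been inserted.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: convert a user-supplied compatible string in place.
 * '/' becomes ',' and ';' becomes NUL, so the buffer ends up holding a
 * NUL-separated string list. Returns the length in bytes of the whole
 * list, including the final NUL.
 */
static size_t compat_to_fdt_list(char *compat)
{
    size_t len, i;

    assert(compat != NULL);

    /* Measure before inserting NULs; strlen() would stop early afterwards. */
    len = strlen(compat) + 1;

    /* Walk by index so that every separator is seen, not only the first. */
    for (i = 0; i + 1 < len; i++) {
        if (compat[i] == '/') {
            compat[i] = ',';
        } else if (compat[i] == ';') {
            compat[i] = '\0';
        }
    }
    return len;
}
```

With an illustrative input such as "vendor/dev-a;vendor/dev-b", the buffer would hold the two strings "vendor,dev-a" and "vendor,dev-b" back to back, and the returned length is what would be passed along with the buffer when setting the property.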

Signed-off-by: Eric Auger 
---
 hw/arm/dyn_sysbus_devtree.c | 138 
 1 file changed, 138 insertions(+)

diff --git a/hw/arm/dyn_sysbus_devtree.c b/hw/arm/dyn_sysbus_devtree.c
index 56af62f..ac34f07 100644
--- a/hw/arm/dyn_sysbus_devtree.c
+++ b/hw/arm/dyn_sysbus_devtree.c
@@ -1,6 +1,139 @@
 #include "hw/arm/dyn_sysbus_devtree.h"
 #include "qemu/error-report.h"
 #include "sysemu/device_tree.h"
+#include "hw/vfio/vfio-platform.h"
+
+static void vfio_fdt_add_device_node(SysBusDevice *sbdev, void *opaque);
+
+static char *format_compat(char * compat)
+{
+char *str_ptr, *corrected_compat;
+/*
+ * process compatibility property string passed by end-user
+ * replaces / by , and ; by NUL character
+ */
+corrected_compat = g_strdup(compat);
+/*
+ * the total length of the string has to include also the last
+ * NUL char.
+ */
+
+str_ptr = corrected_compat;
+while ((str_ptr = strchr(str_ptr, '/')) != NULL) {
+*str_ptr = ',';
+}
+
+/* substitute ";" with the NUL char */
+str_ptr = corrected_compat;
+while ((str_ptr = strchr(str_ptr, ';')) != NULL) {
+*str_ptr = '\0';
+}
+
+return corrected_compat;
+}
+
+static void wrap_vfio_fdt_add_node(SysBusDevice *sbdev, void *opaque)
+{
+PlatformDevtreeData *data = opaque;
+VFIOPlatformDevice *vdev = VFIO_PLATFORM_DEVICE(sbdev);
+VFIODevice *vbasedev = &vdev->vbasedev;
+gchar irq_number_prop[8];
+Object *obj = OBJECT(sbdev);
+char *corrected_compat;
+uint64_t irq_number;
+int compat_str_len = strlen(vdev->compat)+1;
+int i;
+
+corrected_compat = format_compat(vdev->compat);
+snprintf(vdev->compat, compat_str_len, "%s", corrected_compat);
+g_free(corrected_compat);
+
+vfio_fdt_add_device_node(sbdev, opaque);
+
+for (i = 0; i < vbasedev->num_irqs; i++) {
+snprintf(irq_number_prop, sizeof(irq_number_prop), "irq[%d]", i);
+irq_number = object_property_get_int(obj, irq_number_prop, NULL)
+ + data->irq_start;
+/*
+ * for setting irqfd up we must provide the virtual IRQ number
+ * which is the sum of irq_start and actual platform bus irq
+ * index. At realize point we do not have this info.
+ */
+if (vdev->irqfd_allowed) {
+vfio_setup_irqfd(sbdev, i, irq_number);
+}
+}
+}
+
+static void vfio_fdt_add_device_node(SysBusDevice *sbdev, void *opaque)
+{
+PlatformDevtreeData *data = opaque;
+void *fdt = data->fdt;
+const char *parent_node = data->node;
+int compat_str_len;
+char *nodename;
+int i, ret;
+uint32_t *irq_attr;
+uint64_t *reg_attr;
+uint64_t mmio_base;
+uint64_t irq_number;
+gchar mmio_base_prop[8];
+gchar irq_number_prop[8];
+VFIOPlatformDevice *vdev = VFIO_PLATFORM_DEVICE(sbdev);
+VFIODevice *vbasedev = &vdev->vbasedev;
+Object *obj = OBJECT(sbdev);
+
+mmio_base = object_property_get_int(obj, "mmio[0]", NULL);
+
+nodename = g_strdup_printf("%s/%s@%" PRIx64, parent_node,
+   vbasedev->name,
+   mmio_base);
+
+qemu_fdt_add_subnode(fdt, nodename);
+
+compat_str_len = strlen(vdev->compat) + 1;
+qemu_fdt_setprop(fdt, nodename, "compatible",
+vdev->compat, compat_str_len);
+
+reg_attr = g_new(uint64_t, vbasedev->num_regions*4);
+
+for (i = 0; i < vbasedev->num_regions; i++) {
+snprintf(mmio_base_prop, sizeof(mmio_base_prop), "mmio[%d]", i);
+mmio_base = object_property_get_int(obj, mmio_base_prop, NULL);
+reg_attr[4*i] = 1;
+reg_attr[4*i+1] = mmio_base;
+reg_attr[4*i+2] = 1;
+reg_attr[4*i+3] = memory_region_size(&vdev->regions[i]->mem);
+}
+
+ret = qemu_fdt_setprop_sized_cells_from_array(fdt, nodename, "reg",
+ vbasedev->num_regions*2, reg_attr);
+if (ret < 0) {
+error_report("could not set reg property of node %s", nodename);
+}
+
+irq_attr = g_new(uint32_t, vbasedev->num_irqs*3);
+
+for (i = 0; i < vbasedev->num_irqs; i++) {
+snprintf(irq_number_prop, sizeof(irq_number_prop), "irq[%d]", i);
+irq_number = object_property_get_int(obj, irq_number_prop, NULL)
+ + data->irq_start;
+irq_attr[3*i] = cpu_to_be32(0);
+irq_attr[3*i+1] = cpu_to_be32(irq_number);
+irq_attr[3*i+2] = cpu_to_b

[Qemu-devel] [PATCH v5 07/10] hw/vfio/platform: add vfio-platform support

2014-08-09 Thread Eric Auger
Minimal VFIO platform implementation supporting
- register space user mapping,
- IRQ assignment based on eventfds handled on qemu side.

irqfd kernel acceleration comes in a subsequent patch.
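
As a rough illustration of the user-level eventfd path mentioned above, here is a minimal sketch using plain Linux eventfd(2). It is not the QEMU code: the helper names irq_signal and irq_drain are hypothetical, the signalling side stands in for whatever fires the eventfd when the physical IRQ hits, and the draining side stands in for the handler QEMU would register on the file descriptor.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <sys/eventfd.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: signal the eventfd once, as the trigger side would. */
static int irq_signal(int efd)
{
    uint64_t one = 1;

    assert(efd >= 0);
    return write(efd, &one, sizeof(one)) == (ssize_t)sizeof(one) ? 0 : -1;
}

/*
 * Hypothetical helper: drain the eventfd from the handler side.
 * Returns the number of signals accumulated since the last drain,
 * 0 if nothing is pending (non-blocking fd), or -1 on error.
 */
static int64_t irq_drain(int efd)
{
    uint64_t count;
    ssize_t n;

    assert(efd >= 0);
    n = read(efd, &count, sizeof(count));
    if (n == (ssize_t)sizeof(count)) {
        return (int64_t)count;
    }
    return (n < 0 && errno == EAGAIN) ? 0 : -1;
}
```

The descriptor is assumed to be created with eventfd(0, EFD_NONBLOCK). Reading a (non-semaphore) eventfd returns the accumulated counter and resets it, so several signals that arrive before the handler runs are coalesced into one wakeup.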

Signed-off-by: Kim Phillips 
Signed-off-by: Eric Auger 

---

v4 -> v5:
- vfio-platform.h included first
- cleanup error handling in *populate*, vfio_get_device,
  vfio_enable_intp
- vfio_put_device not called anymore
- add some includes to follow vfio policy

v3 -> v4:
[Eric Auger]
- merge of "vfio: Add initial IRQ support in platform device"
  to get a fully functional patch, although performance is limited.
- removal of the unrealize function, since as I currently understand
  it is only used by the device hot-plug feature.

v2 -> v3:
[Eric Auger]
- further factorization between PCI and platform (VFIORegion,
  VFIODevice). same level of functionality.

<= v2:
[Kim Phillips]
- Initial Creation of the device supporting register space mapping
---
 hw/vfio/Makefile.objs   |   1 +
 hw/vfio/platform.c  | 517 
 include/hw/vfio/vfio-platform.h |  77 ++
 3 files changed, 595 insertions(+)
 create mode 100644 hw/vfio/platform.c
 create mode 100644 include/hw/vfio/vfio-platform.h

diff --git a/hw/vfio/Makefile.objs b/hw/vfio/Makefile.objs
index e31f30e..c5c76fe 100644
--- a/hw/vfio/Makefile.objs
+++ b/hw/vfio/Makefile.objs
@@ -1,4 +1,5 @@
 ifeq ($(CONFIG_LINUX), y)
 obj-$(CONFIG_SOFTMMU) += common.o
 obj-$(CONFIG_PCI) += pci.o
+obj-$(CONFIG_SOFTMMU) += platform.o
 endif
diff --git a/hw/vfio/platform.c b/hw/vfio/platform.c
new file mode 100644
index 000..f1a1b55
--- /dev/null
+++ b/hw/vfio/platform.c
@@ -0,0 +1,517 @@
+/*
+ * vfio based device assignment support - platform devices
+ *
+ * Copyright Linaro Limited, 2014
+ *
+ * Authors:
+ *  Kim Phillips 
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2.  See
+ * the COPYING file in the top-level directory.
+ *
+ * Based on vfio based PCI device assignment support:
+ *  Copyright Red Hat, Inc. 2012
+ */
+
+#include 
+#include 
+
+#include "hw/vfio/vfio-platform.h"
+#include "qemu/error-report.h"
+#include "qemu/range.h"
+#include "sysemu/sysemu.h"
+#include "exec/memory.h"
+#include "qemu/queue.h"
+#include "hw/sysbus.h"
+
+extern const MemoryRegionOps vfio_region_ops;
+extern const MemoryListener vfio_memory_listener;
+extern QLIST_HEAD(, VFIOGroup) group_list;
+extern QLIST_HEAD(, VFIOAddressSpace) vfio_address_spaces;
+void vfio_put_device(VFIOPlatformDevice *vdev);
+
+/*
+ * It is mandatory to pass a VFIOPlatformDevice since VFIODevice
+ * is not a QOM Object and cannot be passed to memory region functions
+*/
+static void vfio_map_region(VFIOPlatformDevice *vdev, int nr)
+{
+VFIORegion *region = vdev->regions[nr];
+unsigned size = region->size;
+char name[64];
+
+if (!size) {
+return;
+}
+
+snprintf(name, sizeof(name), "VFIO %s region %d",
+ vdev->vbasedev.name, nr);
+
+/* A "slow" read/write mapping underlies all regions */
+memory_region_init_io(&region->mem, OBJECT(vdev), &vfio_region_ops,
+  region, name, size);
+
+strncat(name, " mmap", sizeof(name) - strlen(name) - 1);
+
+if (vfio_mmap_region(OBJECT(vdev), region, &region->mem,
+ &region->mmap_mem, &region->mmap, size, 0, name)) {
+error_report("%s unsupported. Performance may be slow", name);
+}
+}
+
+static void print_regions(VFIOPlatformDevice *vdev)
+{
+int i;
+
+DPRINTF("Device \"%s\" counts %d region(s):\n",
+ vdev->vbasedev.name, vdev->vbasedev.num_regions);
+
+for (i = 0; i < vdev->vbasedev.num_regions; i++) {
+DPRINTF("- region %d flags = 0x%lx, size = 0x%lx, "
+"fd= %d, offset = 0x%lx\n",
+vdev->regions[i]->nr,
+(unsigned long)vdev->regions[i]->flags,
+(unsigned long)vdev->regions[i]->size,
+vdev->regions[i]->vbasedev->fd,
+(unsigned long)vdev->regions[i]->fd_offset);
+}
+}
+
+static int vfio_populate_regions(VFIODevice *vbasedev)
+{
+struct vfio_region_info reg_info = { .argsz = sizeof(reg_info) };
+int i, ret = 0;
+VFIOPlatformDevice *vdev =
+container_of(vbasedev, VFIOPlatformDevice, vbasedev);
+
+vdev->regions = g_malloc0(sizeof(VFIORegion *) * vbasedev->num_regions);
+
+for (i = 0; i < vbasedev->num_regions; i++) {
+vdev->regions[i] = g_malloc0(sizeof(VFIORegion));
+reg_info.index = i;
+ret = ioctl(vbasedev->fd, VFIO_DEVICE_GET_REGION_INFO, &reg_info);
+if (ret) {
+error_report("vfio: Error getting region %d info: %m", i);
+goto error;
+}
+

[Qemu-devel] [PATCH v5 08/10] hw/intc/arm_gic_kvm: advertise irqfd

2014-08-09 Thread Eric Auger
set kvm_irqfds_allowed

Signed-off-by: Eric Auger 
---
 hw/intc/arm_gic_kvm.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/hw/intc/arm_gic_kvm.c b/hw/intc/arm_gic_kvm.c
index 5038885..08b7bf9 100644
--- a/hw/intc/arm_gic_kvm.c
+++ b/hw/intc/arm_gic_kvm.c
@@ -576,6 +576,8 @@ static void kvm_arm_gic_realize(DeviceState *dev, Error **errp)
 KVM_DEV_ARM_VGIC_GRP_ADDR,
 KVM_VGIC_V2_ADDR_TYPE_CPU,
 s->dev_fd);
+
+kvm_irqfds_allowed = true;
 }
 
 static void kvm_arm_gic_class_init(ObjectClass *klass, void *data)
-- 
1.8.3.2




[Qemu-devel] [PATCH v5 09/10] hw/vfio/platform: Add irqfd support

2014-08-09 Thread Eric Auger
This patch optimizes IRQ handling using the irqfd framework.

Instead of being handled in user space, the eventfds are handled on
the kernel side using
- the KVM irqfd framework,
- the VFIO driver virqfd framework.

The virtual IRQ completion is trapped at the interrupt controller
instead of on the guest's first access to any region after the IRQ hits.
This removes the need for the fast/slow path swap.

Overall this brings significant performance improvements.

It depends on the host kernel's KVM irqfd/GSI routing capability.

Signed-off-by: Alvise Rigo 
Signed-off-by: Eric Auger 

---
v4 -> v5:
- addition of sysemu/kvm.h header

v3 -> v4:
[Alvise Rigo]
Use of VFIO Platform driver v6 unmask/virqfd feature and removal
of resamplefd handler. Physical IRQ unmasking is now done in
VFIO driver.

v3:
[Eric Auger]
initial support with resamplefd handled on QEMU side since the
unmask was not supported on VFIO platform driver v5.
---
 hw/vfio/platform.c | 94 ++
 1 file changed, 94 insertions(+)

diff --git a/hw/vfio/platform.c b/hw/vfio/platform.c
index f1a1b55..e5c652c 100644
--- a/hw/vfio/platform.c
+++ b/hw/vfio/platform.c
@@ -23,6 +23,7 @@
 #include "exec/memory.h"
 #include "qemu/queue.h"
 #include "hw/sysbus.h"
+#include "sysemu/kvm.h"
 
 extern const MemoryRegionOps vfio_region_ops;
 extern const MemoryListener vfio_memory_listener;
@@ -367,6 +368,99 @@ static int vfio_populate_interrupts(VFIODevice *vbasedev)
 return 0;
 }
 
+static void vfio_enable_intp_kvm(VFIOINTp *intp)
+{
+#ifdef CONFIG_KVM
+struct kvm_irqfd irqfd = {
+.fd = event_notifier_get_fd(&intp->interrupt),
+.gsi = intp->virtualID,
+.flags = KVM_IRQFD_FLAG_RESAMPLE,
+};
+
+struct vfio_irq_set *irq_set;
+int ret, argsz;
+int32_t *pfd;
+VFIODevice *vbasedev = &intp->vdev->vbasedev;
+
+if (!kvm_irqfds_enabled() ||
+!kvm_check_extension(kvm_state, KVM_CAP_IRQFD_RESAMPLE)) {
+return;
+}
+
+/* Get to a known interrupt state */
+qemu_set_fd_handler(irqfd.fd, NULL, NULL, NULL);
+vfio_mask_irqindex(vbasedev, intp->pin);
+intp->state = VFIO_IRQ_INACTIVE;
+qemu_set_irq(intp->qemuirq, 0);
+
+/* Get an eventfd for resample/unmask */
+if (event_notifier_init(&intp->unmask, 0)) {
+error_report("vfio: Error: event_notifier_init failed eoi");
+goto fail;
+}
+
+/* KVM triggers it, VFIO listens for it */
+irqfd.resamplefd = event_notifier_get_fd(&intp->unmask);
+
+if (kvm_vm_ioctl(kvm_state, KVM_IRQFD, &irqfd)) {
+error_report("vfio: Error: Failed to setup resample irqfd: %m");
+goto fail_irqfd;
+}
+
+argsz = sizeof(*irq_set) + sizeof(*pfd);
+
+irq_set = g_malloc0(argsz);
+irq_set->argsz = argsz;
+irq_set->flags = VFIO_IRQ_SET_DATA_EVENTFD | VFIO_IRQ_SET_ACTION_UNMASK;
+irq_set->index = intp->pin;
+irq_set->start = 0;
+irq_set->count = 1;
+pfd = (int32_t *)&irq_set->data;
+
+*pfd = irqfd.resamplefd;
+
+ret = ioctl(vbasedev->fd, VFIO_DEVICE_SET_IRQS, irq_set);
+g_free(irq_set);
+if (ret) {
+error_report("vfio: Error: Failed to setup IRQ unmask fd: %m");
+goto fail_vfio;
+}
+
+vfio_unmask_irqindex(vbasedev, intp->pin);
+
+intp->kvm_accel = true;
+
+DPRINTF("%s irqfd pin=%d to virtID = %d fd=%d, resamplefd=%d)\n",
+__func__, intp->pin, intp->virtualID,
+irqfd.fd, irqfd.resamplefd);
+return;
+
+fail_vfio:
+irqfd.flags = KVM_IRQFD_FLAG_DEASSIGN;
+kvm_vm_ioctl(kvm_state, KVM_IRQFD, &irqfd);
+fail_irqfd:
+event_notifier_cleanup(&intp->unmask);
+fail:
+qemu_set_fd_handler(irqfd.fd, vfio_intp_interrupt, NULL, intp);
+vfio_unmask_irqindex(vbasedev, intp->pin);
+#endif
+}
+
+void vfio_setup_irqfd(SysBusDevice *s, int index, int virq)
+{
+VFIOPlatformDevice *vdev = container_of(s, VFIOPlatformDevice, sbdev);
+VFIOINTp *intp;
+
+QLIST_FOREACH(intp, &vdev->intp_list, next) {
+if (intp->pin == index) {
+intp->virtualID = virq;
+DPRINTF("enable irqfd for irq index %d (virtual IRQ %d)\n",
+index, virq);
+vfio_enable_intp_kvm(intp);
+}
+}
+}
+
 static VFIODeviceOps vfio_platform_ops = {
 .vfio_compute_needs_reset = vfio_platform_compute_needs_reset,
 .vfio_hot_reset_multi = vfio_platform_hot_reset_multi,
-- 
1.8.3.2




[Qemu-devel] VFIO-PCI testing after VFIO-platform rework

2014-08-11 Thread Eric Auger
Dear all,

I would like to test VFIO-PCI after the rework done for [PATCH v5 00/10] KVM
platform device passthrough.

Do you have any advice on the best test environment? (I was thinking about a
TCG system platform suitable for VFIO PCI assignment.)

Any input is welcome.

Thank you in advance

Best Regards

Eric



Re: [Qemu-devel] [PATCH v5 10/10] hw/arm/dyn_sysbus_devtree: enable simple VFIO dynamic instantiation

2014-08-11 Thread Eric Auger
On 08/11/2014 11:40 AM, Alexander Graf wrote:
> 
> On 09.08.14 16:25, Eric Auger wrote:
>> Generates the device node of VFIO devices, if any are invoked in
>> -device option. In case VFIO devices require more complex node
>> generation, they can be handled before.
>>
>> Signed-off-by: Eric Auger 
>> ---
>>   hw/arm/dyn_sysbus_devtree.c | 138
>> 
>>   1 file changed, 138 insertions(+)
>>
>> diff --git a/hw/arm/dyn_sysbus_devtree.c b/hw/arm/dyn_sysbus_devtree.c
>> index 56af62f..ac34f07 100644
>> --- a/hw/arm/dyn_sysbus_devtree.c
>> +++ b/hw/arm/dyn_sysbus_devtree.c
>> @@ -1,6 +1,139 @@
>>   #include "hw/arm/dyn_sysbus_devtree.h"
>>   #include "qemu/error-report.h"
>>   #include "sysemu/device_tree.h"
>> +#include "hw/vfio/vfio-platform.h"
>> +
>> +static void vfio_fdt_add_device_node(SysBusDevice *sbdev, void *opaque);
>> +
>> +static char *format_compat(char * compat)
>> +{
>> +char *str_ptr, *corrected_compat;
>> +/*
>> + * process compatibility property string passed by end-user
>> + * replaces / by , and ; by NUL character
>> + */
>> +corrected_compat = g_strdup(compat);
>> +/*
>> + * the total length of the string has to include also the last
>> + * NUL char.
>> + */
>> +
>> +str_ptr = corrected_compat;
>> +while ((str_ptr = strchr(str_ptr, '/')) != NULL) {
>> +*str_ptr = ',';
>> +}
>> +
>> +/* substitute ";" with the NUL char */
>> +str_ptr = corrected_compat;
>> +while ((str_ptr = strchr(str_ptr, ';')) != NULL) {
>> +*str_ptr = '\0';
>> +}
>> +
>> +return corrected_compat;
>> +}
>> +
>> +static void wrap_vfio_fdt_add_node(SysBusDevice *sbdev, void *opaque)
>> +{
>> +PlatformDevtreeData *data = opaque;
>> +VFIOPlatformDevice *vdev = VFIO_PLATFORM_DEVICE(sbdev);
>> +VFIODevice *vbasedev = &vdev->vbasedev;
>> +gchar irq_number_prop[8];
>> +Object *obj = OBJECT(sbdev);
>> +char *corrected_compat;
>> +uint64_t irq_number;
>> +int compat_str_len = strlen(vdev->compat)+1;
>> +int i;
>> +
>> +corrected_compat = format_compat(vdev->compat);
>> +snprintf(vdev->compat, compat_str_len, "%s", corrected_compat);
>> +g_free(corrected_compat);
>> +
>> +vfio_fdt_add_device_node(sbdev, opaque);
>> +
>> +for (i = 0; i < vbasedev->num_irqs; i++) {
>> +snprintf(irq_number_prop, sizeof(irq_number_prop), "irq[%d]",
>> i);
>> +irq_number = object_property_get_int(obj, irq_number_prop, NULL)
>> + + data->irq_start;
>> +/*
>> + * for setting irqfd up we must provide the virtual IRQ number
>> + * which is the sum of irq_start and actual platform bus irq
>> + * index. At realize point we do not have this info.
>> + */
>> +if (vdev->irqfd_allowed) {
>> +vfio_setup_irqfd(sbdev, i, irq_number);
>> +}
>> +}
>> +}
>> +
>> +static void vfio_fdt_add_device_node(SysBusDevice *sbdev, void *opaque)
>> +{
>> +PlatformDevtreeData *data = opaque;
>> +void *fdt = data->fdt;
>> +const char *parent_node = data->node;
>> +int compat_str_len;
>> +char *nodename;
>> +int i, ret;
>> +uint32_t *irq_attr;
>> +uint64_t *reg_attr;
>> +uint64_t mmio_base;
>> +uint64_t irq_number;
>> +gchar mmio_base_prop[8];
>> +gchar irq_number_prop[8];
>> +VFIOPlatformDevice *vdev = VFIO_PLATFORM_DEVICE(sbdev);
>> +VFIODevice *vbasedev = &vdev->vbasedev;
>> +Object *obj = OBJECT(sbdev);
>> +
>> +mmio_base = object_property_get_int(obj, "mmio[0]", NULL);
>> +
>> +nodename = g_strdup_printf("%s/%s@%" PRIx64, parent_node,
>> +   vbasedev->name,
>> +   mmio_base);
>> +
>> +qemu_fdt_add_subnode(fdt, nodename);
>> +
>> +compat_str_len = strlen(vdev->compat) + 1;
>> +qemu_fdt_setprop(fdt, nodename, "compatible",
>> +vdev->compat, compat_str_len);
>> +
>> +reg_attr = g_new(uint64_t, vbasedev->num_regi

Re: [Qemu-devel] [PATCH v5 08/10] hw/intc/arm_gic_kvm: advertise irqfd

2014-08-11 Thread Eric Auger
On 08/11/2014 11:37 AM, Alexander Graf wrote:
> 
> On 09.08.14 16:25, Eric Auger wrote:
>> set kvm_irqfds_allowed
>>
>> Signed-off-by: Eric Auger 
>> ---
>>   hw/intc/arm_gic_kvm.c | 2 ++
>>   1 file changed, 2 insertions(+)
>>
>> diff --git a/hw/intc/arm_gic_kvm.c b/hw/intc/arm_gic_kvm.c
>> index 5038885..08b7bf9 100644
>> --- a/hw/intc/arm_gic_kvm.c
>> +++ b/hw/intc/arm_gic_kvm.c
>> @@ -576,6 +576,8 @@ static void kvm_arm_gic_realize(DeviceState *dev,
>> Error **errp)
>>   KVM_DEV_ARM_VGIC_GRP_ADDR,
>>   KVM_VGIC_V2_ADDR_TYPE_CPU,
>>   s->dev_fd);
>> +
>> +kvm_irqfds_allowed = true;
> 
> Is this always true? If it is, why not enable it separately while making
> vhost-net work for example?

Hi Alex,

Yes, I think so. As soon as KVM is enabled, KVM/arm would enable
injection through irqfd. It definitely makes sense to test it with
vhost-net too. Well, a matter of priority ;-)

Best Regards

Eric

> 
> Alex
> 




Re: [Qemu-devel] [PATCH v5 08/10] hw/intc/arm_gic_kvm: advertise irqfd

2014-08-11 Thread Eric Auger
On 08/11/2014 02:05 PM, Alexander Graf wrote:
> 
> On 11.08.14 14:04, Eric Auger wrote:
>> On 08/11/2014 11:37 AM, Alexander Graf wrote:
>>> On 09.08.14 16:25, Eric Auger wrote:
>>>> set kvm_irqfds_allowed
>>>>
>>>> Signed-off-by: Eric Auger 
>>>> ---
>>>>hw/intc/arm_gic_kvm.c | 2 ++
>>>>1 file changed, 2 insertions(+)
>>>>
>>>> diff --git a/hw/intc/arm_gic_kvm.c b/hw/intc/arm_gic_kvm.c
>>>> index 5038885..08b7bf9 100644
>>>> --- a/hw/intc/arm_gic_kvm.c
>>>> +++ b/hw/intc/arm_gic_kvm.c
>>>> @@ -576,6 +576,8 @@ static void kvm_arm_gic_realize(DeviceState *dev,
>>>> Error **errp)
>>>>KVM_DEV_ARM_VGIC_GRP_ADDR,
>>>>KVM_VGIC_V2_ADDR_TYPE_CPU,
>>>>s->dev_fd);
>>>> +
>>>> +kvm_irqfds_allowed = true;
>>> Is this always true? If it is, why not enable it separately while making
>>> vhost-net work for example?
>> Hi Alex,
>>
>> yes I think so. As soon as KVM is enabled, KVM/arm would enable
>> injection though irqfd. Defintively makes sense to test it with
>> vhost-net too. Well a matter of priority ;-)
> 
> More a matter of accuracy. What if you use new QEMU on old KVM which
> does have in-kernel GIC support, but no irqfd support?

Hi Alex,

VFIO device code also calls kvm_check_extension(kvm_state,
KVM_CAP_IRQFD_RESAMPLE) which would return false if IRQFD is not enabled
in old kernels. But with respect to vhost-net irqfd usage I cannot
comment yet and you may be right ;-)
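
To make the fallback explicit, the decision taken at the top of vfio_enable_intp_kvm in patch 09/10 can be modelled roughly as below. This is only a sketch under assumptions: the enum and function names are hypothetical, and the two booleans stand for the results of kvm_irqfds_enabled() and kvm_check_extension(kvm_state, KVM_CAP_IRQFD_RESAMPLE).

```c
#include <stdbool.h>

/* Hypothetical names, used only for this illustration. */
typedef enum {
    IRQ_PATH_USER_EVENTFD,  /* eventfd handled on the QEMU side */
    IRQ_PATH_IRQFD          /* eventfd handled by KVM irqfd and VFIO virqfd */
} IrqPath;

/*
 * irqfd is only set up when both checks pass; otherwise the device
 * keeps the user-level eventfd handler installed earlier.
 */
static IrqPath choose_irq_path(bool irqfds_enabled, bool has_resample_cap)
{
    if (!irqfds_enabled || !has_resample_cap) {
        return IRQ_PATH_USER_EVENTFD;
    }
    return IRQ_PATH_IRQFD;
}
```

So even with kvm_irqfds_allowed set unconditionally, a host kernel lacking the resample capability would leave the VFIO platform device on the slower user-level path rather than fail. Whether other irqfd users behave the same way is not covered by this sketch.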

Best Regards

Eric
> 
> 
> Alex
> 




Re: [Qemu-devel] [RFC v2 1/7] hw/misc/dyn_sysbus_binding: helpers for sysbus device dynamic binding

2014-08-11 Thread Eric Auger
On 08/11/2014 03:08 PM, Alexander Graf wrote:
> 
> On 08.08.14 17:03, Eric Auger wrote:
>> This new module implements routines which help in dynamic device
>> binding (mmio regions, irq). They are supposed to be used by machine
>> files that support dynamic sysbus instantiation.
>>
>> ---
>>
>> v1 -> v2:
>> - platform_devices renamed into dyn_sysbus_binding
>> - PlatformParams renamed into DynSysbusParams
>> - PlatformBusNotifier renamed into DynSysbusNotifier
>> - platform_bus_map_irq, platform_bus_map_mmio, sysbus_device_check,
>>platform_bus_init become static
>> - PlatformBusInitData becomes private to the module
>> - page_shift becomes a member of DynSysbusParams
>>
>> v1: Dynamic sysbus device allocation fully written by Alex Graf.
>> Those functions were initially in ppc e500 machine file. Now moved to a
>> separate module.
>> PPCE500Params is replaced by a generic struct named PlatformParams
>>
>> Signed-off-by: Alexander Graf 
>> Signed-off-by: Eric Auger 
>> ---
>>   hw/misc/Makefile.objs|   1 +
>>   hw/misc/dyn_sysbus_binding.c | 163
>> +++
>>   include/hw/misc/dyn_sysbus_binding.h |  24 ++
>>   3 files changed, 188 insertions(+)
>>   create mode 100644 hw/misc/dyn_sysbus_binding.c
>>   create mode 100644 include/hw/misc/dyn_sysbus_binding.h
>>
>> diff --git a/hw/misc/Makefile.objs b/hw/misc/Makefile.objs
>> index 979e532..86f6243 100644
>> --- a/hw/misc/Makefile.objs
>> +++ b/hw/misc/Makefile.objs
>> @@ -41,3 +41,4 @@ obj-$(CONFIG_SLAVIO) += slavio_misc.o
>>   obj-$(CONFIG_ZYNQ) += zynq_slcr.o
>> obj-$(CONFIG_PVPANIC) += pvpanic.o
>> +obj-y += dyn_sysbus_binding.o
>> diff --git a/hw/misc/dyn_sysbus_binding.c b/hw/misc/dyn_sysbus_binding.c
>> new file mode 100644
>> index 000..0f34f0b
>> --- /dev/null
>> +++ b/hw/misc/dyn_sysbus_binding.c
>> @@ -0,0 +1,163 @@
> 
> This file is missing a license header.
OK
> 
>> +#include "hw/misc/dyn_sysbus_binding.h"
>> +#include "qemu/error-report.h"
>> +
>> +typedef struct PlatformBusInitData {
>> +unsigned long *used_irqs;
>> +unsigned long *used_mem;
>> +MemoryRegion *mem;
>> +qemu_irq *irqs;
>> +int device_count;
>> +DynSysbusParams *params;
>> +} PlatformBusInitData;
>> +
>> +
>> +static int platform_bus_map_irq(DynSysbusParams *params,
>> +SysBusDevice *sbdev,
>> +int n, unsigned long *used_irqs,
>> +qemu_irq *platform_irqs)
>> +{
>> +int max_irqs = params->platform_bus_num_irqs;
>> +char *prop = g_strdup_printf("irq[%d]", n);
>> +int irqn = object_property_get_int(OBJECT(sbdev), prop, NULL);
>> +
>> +if (irqn == SYSBUS_DYNAMIC) {
>> +/* Find the first available IRQ */
>> +irqn = find_first_zero_bit(used_irqs, max_irqs);
>> +}
>> +
>> +if ((irqn >= max_irqs) || test_and_set_bit(irqn, used_irqs)) {
>> +hw_error("IRQ %d is already allocated or no free IRQ left",
>> irqn);
>> +}
>> +
>> +sysbus_connect_irq(sbdev, n, platform_irqs[irqn]);
>> +object_property_set_int(OBJECT(sbdev), irqn, prop, NULL);
>> +
>> +g_free(prop);
>> +return 0;
>> +}
>> +
>> +static int platform_bus_map_mmio(DynSysbusParams *params,
>> + SysBusDevice *sbdev,
>> + int n, unsigned long *used_mem,
>> + MemoryRegion *pmem)
>> +{
>> +MemoryRegion *device_mem = sbdev->mmio[n].memory;
>> +uint64_t size = memory_region_size(device_mem);
>> +uint64_t page_size = (1 << params->page_shift);
>> +uint64_t page_mask = page_size - 1;
>> +uint64_t size_pages = (size + page_mask) >> params->page_shift;
>> +uint64_t max_size = params->platform_bus_size;
>> +uint64_t max_pages = max_size >> params->page_shift;
>> +char *prop = g_strdup_printf("mmio[%d]", n);
>> +hwaddr addr = object_property_get_int(OBJECT(sbdev), prop, NULL);
>> +int page;
>> +int i;
>> +
>> +page = addr >> params->page_shift;
>> +if (addr == SYSBUS_DYNAMIC) {
>> +uint64_t size_pages_align;
>> +
>> +/* Align the region to at least its own size granularity */
>> +   

Re: [Qemu-devel] [RFC v2 2/7] hw/arm/dyn_sysbus_devtree: helpers for sysbus device dynamic dt node generation

2014-08-11 Thread Eric Auger
On 08/11/2014 03:16 PM, Alexander Graf wrote:
> 
> On 08.08.14 17:03, Eric Auger wrote:
>> This module will be used by ARM machine files to generate
>> device tree nodes of dynamically instantiated sysbus devices (ie.
>> those instantiated with -device option).
>>
>> Signed-off-by: Alexander Graf 
>> Signed-off-by: Eric Auger 
>>
>> ---
>>
>> v2:
>> - Code moved in an arch specific file to accommodate architecture
>>dependent specificities.
>> - remove platform_bus_base from PlatformDevtreeData
>>
>> v1: code originally written by Alex Graf in e500.c and reused for ARM
>>  [Eric Auger]
>>  code originally moved in hw/misc/platform_devices and device itself
>> ---
>>   hw/arm/Makefile.objs|  1 +
>>   hw/arm/dyn_sysbus_devtree.c | 66
>> +
>>   include/hw/arm/dyn_sysbus_devtree.h | 18 ++
>>   3 files changed, 85 insertions(+)
>>   create mode 100644 hw/arm/dyn_sysbus_devtree.c
>>   create mode 100644 include/hw/arm/dyn_sysbus_devtree.h
>>
>> diff --git a/hw/arm/Makefile.objs b/hw/arm/Makefile.objs
>> index 6088e53..bc5e014 100644
>> --- a/hw/arm/Makefile.objs
>> +++ b/hw/arm/Makefile.objs
>> @@ -3,6 +3,7 @@ obj-$(CONFIG_DIGIC) += digic_boards.o
>>   obj-y += integratorcp.o kzm.o mainstone.o musicpal.o nseries.o
>>   obj-y += omap_sx1.o palm.o realview.o spitz.o stellaris.o
>>   obj-y += tosa.o versatilepb.o vexpress.o virt.o xilinx_zynq.o z2.o
>> +obj-y += dyn_sysbus_devtree.o
>> obj-y += armv7m.o exynos4210.o pxa2xx.o pxa2xx_gpio.o pxa2xx_pic.o
>>   obj-$(CONFIG_DIGIC) += digic.o
>> diff --git a/hw/arm/dyn_sysbus_devtree.c b/hw/arm/dyn_sysbus_devtree.c
>> new file mode 100644
>> index 000..56af62f
>> --- /dev/null
>> +++ b/hw/arm/dyn_sysbus_devtree.c
>> @@ -0,0 +1,66 @@
>> +#include "hw/arm/dyn_sysbus_devtree.h"
>> +#include "qemu/error-report.h"
>> +#include "sysemu/device_tree.h"
>> +
>> +int sysbus_device_create_devtree(Object *obj, void *opaque)
>> +{
>> +PlatformDevtreeData *data = opaque;
>> +Object *dev;
>> +SysBusDevice *sbdev;
>> +bool matched = false;
>> +
>> +dev = object_dynamic_cast(obj, TYPE_SYS_BUS_DEVICE);
>> +sbdev = (SysBusDevice *)dev;
>> +
>> +if (!sbdev) {
>> +/* Container, traverse it for children */
>> +return object_child_foreach(obj,
>> sysbus_device_create_devtree, data);
>> +}
>> +
>> +if (!matched) {
>> +error_report("Device %s is not supported by this machine yet.",
>> + qdev_fw_name(DEVICE(dev)));
>> +exit(1);
>> +}
>> +
>> +return 0;
>> +}
>> +
>> +void platform_bus_create_devtree(DynSysbusParams *params,
>> + void *fdt, const char *mpic)
>> +{
>> +gchar *node = g_strdup_printf("/platform@%"PRIx64,
>> +  params->platform_bus_base);
>> +const char platcomp[] = "qemu,platform\0simple-bus";
>> +PlatformDevtreeData data;
>> +Object *container;
>> +uint64_t addr = params->platform_bus_base;
>> +uint64_t size = params->platform_bus_size;
>> +int irq_start = params->platform_bus_first_irq;
>> +
>> +/* Create a /platform node that we can put all devices into */
>> +
>> +qemu_fdt_add_subnode(fdt, node);
>> +qemu_fdt_setprop(fdt, node, "compatible", platcomp,
>> sizeof(platcomp));
>> +
>> +/* Our platform bus region is less than 32bit big, so 1 cell is
>> enough for
>> +   address and size */
>> +qemu_fdt_setprop_cells(fdt, node, "#size-cells", 1);
>> +qemu_fdt_setprop_cells(fdt, node, "#address-cells", 1);
>> +qemu_fdt_setprop_cells(fdt, node, "ranges", 0, addr >> 32, addr,
>> size);
>> +
>> +qemu_fdt_setprop_phandle(fdt, node, "interrupt-parent", mpic);
>> +
>> +/* Loop through all devices and create nodes for known ones */
>> +data.fdt = fdt;
>> +data.mpic = mpic;
>> +data.irq_start = irq_start;
>> +data.node = node;
>> +
>> +container = container_get(qdev_get_machine(), "/peripheral");
>> +sysbus_device_create_devtree(container, &data);
>> +container = container_get(qdev_get_machine(), "/peripheral-anon");
>> +sysbus_device_create_devtre

Re: [Qemu-devel] [RFC v2 5/7] hw/arm/boot: load_dtb becomes non static

2014-08-11 Thread Eric Auger
On 08/11/2014 03:10 PM, Alexander Graf wrote:
> 
> On 08.08.14 17:03, Eric Auger wrote:
>> load_dtb will be used by machvirt for dynamic instantiation of
>> platform devices
>>
>> Signed-off-by: Eric Auger 
>> ---
>>   hw/arm/boot.c| 2 +-
>>   include/hw/arm/arm.h | 1 +
>>   2 files changed, 2 insertions(+), 1 deletion(-)
>>
>> diff --git a/hw/arm/boot.c b/hw/arm/boot.c
>> index 1241761..53b43e8 100644
>> --- a/hw/arm/boot.c
>> +++ b/hw/arm/boot.c
>> @@ -312,7 +312,7 @@ static void set_kernel_args_old(const struct
>> arm_boot_info *info)
>>   }
>>   }
>>   -static int load_dtb(hwaddr addr, const struct arm_boot_info *binfo)
>> +int load_dtb(hwaddr addr, const struct arm_boot_info *binfo)
> 
> Please rename it arm_load_dtb then.
OK
thanks

Eric
> 
> 
> Alex
> 
>>   {
>>   void *fdt = NULL;
>>   int size, rc;
>> diff --git a/include/hw/arm/arm.h b/include/hw/arm/arm.h
>> index cbbf4ca..fe58dc0 100644
>> --- a/include/hw/arm/arm.h
>> +++ b/include/hw/arm/arm.h
>> @@ -68,6 +68,7 @@ struct arm_boot_info {
>>   hwaddr entry;
>>   };
>>   void arm_load_kernel(ARMCPU *cpu, struct arm_boot_info *info);
>> +int load_dtb(hwaddr addr, const struct arm_boot_info *binfo);
>> /* Multiplication factor to convert from system clock ticks to qemu timer
>>  ticks.  */
> 




Re: [Qemu-devel] VFIO-PCI testing after VFIO-platform rework

2014-08-11 Thread Eric Auger
On 08/11/2014 03:50 PM, Will Deacon wrote:
> I'm playing with PCI device assignment with kvmtool, so I could do some
> basic testing if you like. Can you put the patches on a git tree somewhere
> please?

Hi Will,

Thanks for your proposal,

the kernel part is at:
git://git.linaro.org/people/eric.auger/linux.git (branch irqfd_integ_v4)

the qemu part is at:
git://git.linaro.org/people/eric.auger/qemu.git (branch vfio_integ_v5)

Please do not hesitate to contact me for any support.

Best Regards

Eric

> 
> Will




Re: [Qemu-devel] [PATCH v5 07/10] hw/vfio/platform: add vfio-platform support

2014-08-11 Thread Eric Auger
On 08/11/2014 10:13 PM, Alex Williamson wrote:
> On Sat, 2014-08-09 at 15:25 +0100, Eric Auger wrote:
>> Minimal VFIO platform implementation supporting
>> - register space user mapping,
>> - IRQ assignment based on eventfds handled on qemu side.
>>
>> irqfd kernel acceleration comes in a subsequent patch.
>>
>> Signed-off-by: Kim Phillips 
>> Signed-off-by: Eric Auger 
>>
>> ---
>>
>> v4 -> v5:
>> - vfio-platform.h included first
>> - cleanup error handling in *populate*, vfio_get_device,
>>   vfio_enable_intp
>> - vfio_put_device not called anymore
>> - add some includes to follow vfio policy
>>
>> v3 -> v4:
>> [Eric Auger]
>> - merge of "vfio: Add initial IRQ support in platform device"
>>   to get a full functional patch although perfs are limited.
>> - removal of unrealize function since I currently understand
>>   it is only used with device hot-plug feature.
>>
>> v2 -> v3:
>> [Eric Auger]
>> - further factorization between PCI and platform (VFIORegion,
>>   VFIODevice). same level of functionality.
>>
>> <= v2:
>> [Kim Phillips]
>> - Initial Creation of the device supporting register space mapping
>> ---
>>  hw/vfio/Makefile.objs   |   1 +
>>  hw/vfio/platform.c  | 517
>>  include/hw/vfio/vfio-platform.h |  77 ++
>>  3 files changed, 595 insertions(+)
>>  create mode 100644 hw/vfio/platform.c
>>  create mode 100644 include/hw/vfio/vfio-platform.h
>>
>> diff --git a/hw/vfio/Makefile.objs b/hw/vfio/Makefile.objs
>> index e31f30e..c5c76fe 100644
>> --- a/hw/vfio/Makefile.objs
>> +++ b/hw/vfio/Makefile.objs
>> @@ -1,4 +1,5 @@
>>  ifeq ($(CONFIG_LINUX), y)
>>  obj-$(CONFIG_SOFTMMU) += common.o
>>  obj-$(CONFIG_PCI) += pci.o
>> +obj-$(CONFIG_SOFTMMU) += platform.o
>>  endif
>> diff --git a/hw/vfio/platform.c b/hw/vfio/platform.c
>> new file mode 100644
>> index 000..f1a1b55
>> --- /dev/null
>> +++ b/hw/vfio/platform.c
>> @@ -0,0 +1,517 @@
>> +/*
>> + * vfio based device assignment support - platform devices
>> + *
>> + * Copyright Linaro Limited, 2014
>> + *
>> + * Authors:
>> + *  Kim Phillips 
>> + *
>> + * This work is licensed under the terms of the GNU GPL, version 2.  See
>> + * the COPYING file in the top-level directory.
>> + *
>> + * Based on vfio based PCI device assignment support:
>> + *  Copyright Red Hat, Inc. 2012
>> + */
>> +
>> +#include 
>> +#include 
>> +
>> +#include "hw/vfio/vfio-platform.h"
>> +#include "qemu/error-report.h"
>> +#include "qemu/range.h"
>> +#include "sysemu/sysemu.h"
>> +#include "exec/memory.h"
>> +#include "qemu/queue.h"
>> +#include "hw/sysbus.h"
>> +
>> +extern const MemoryRegionOps vfio_region_ops;
>> +extern const MemoryListener vfio_memory_listener;
>> +extern QLIST_HEAD(, VFIOGroup) group_list;
>> +extern QLIST_HEAD(, VFIOAddressSpace) vfio_address_spaces;
>> +void vfio_put_device(VFIOPlatformDevice *vdev);
>> +
>> +/*
>> + * It is mandatory to pass a VFIOPlatformDevice since VFIODevice
>> + * is not a QOM Object and cannot be passed to memory region functions
>> +*/
>> +static void vfio_map_region(VFIOPlatformDevice *vdev, int nr)
>> +{
>> +VFIORegion *region = vdev->regions[nr];
>> +unsigned size = region->size;
>> +char name[64];
>> +
>> +if (!size) {
>> +return;
>> +}
>> +
>> +snprintf(name, sizeof(name), "VFIO %s region %d",
>> + vdev->vbasedev.name, nr);
>> +
>> +/* A "slow" read/write mapping underlies all regions */
>> +memory_region_init_io(&region->mem, OBJECT(vdev), &vfio_region_ops,
>> +  region, name, size);
>> +
>> +strncat(name, " mmap", sizeof(name) - strlen(name) - 1);
>> +
>> +if (vfio_mmap_region(OBJECT(vdev), region, &region->mem,
>> + &region->mmap_mem, &region->mmap, size, 0, name)) {
>> +error_report("%s unsupported. Performance may be slow", name);
>> +}
>> +}
>> +
>> +static void print_regions(VFIOPlatformDevice *vdev)
>> +{
>> +int i;
>> +
>> +DPRINTF("Device \"%s\" counts %d region(s):\n",
>> + vdev->vbasedev.name, vdev-

Re: [Qemu-devel] [PATCH v5 06/10] hw/vfio: create common module

2014-08-11 Thread Eric Auger
On 08/11/2014 09:20 PM, Alex Williamson wrote:
> On Sat, 2014-08-09 at 15:25 +0100, Eric Auger wrote:
>> A new common module is created. It implements all functions
>> that have no device specificity (PCI, Platform).
>>
>> This patch consists only of code movement (no functional changes).
>>
>> Signed-off-by: Kim Phillips 
>> Signed-off-by: Eric Auger 
>>
>> ---
>>
>> v4 -> v5:
>> - integrate "sPAPR/IOMMU: Fix TCE entry permission"
>> - VFIOdevice .name dealloc removed from vfio_put_base_device
>> - add some includes according to vfio inclusion policy
>>
>> v3 -> v4:
>> [Eric Auger]
>> move done after all PCI modifications to anticipate for
>> VFIO Platform needs. Purpose is to alleviate the whole
>> review process.
>>
>> <= v3
>> First split done by Kim Phillips
>> ---
>>  hw/vfio/Makefile.objs |1 +
>>  hw/vfio/common.c  |  990 ++
>>  hw/vfio/pci.c | 1070 +
>>  include/hw/vfio/vfio-common.h |  151 ++
>>  4 files changed, 1147 insertions(+), 1065 deletions(-)
>>  create mode 100644 hw/vfio/common.c
>>  create mode 100644 include/hw/vfio/vfio-common.h
>>
>> diff --git a/hw/vfio/Makefile.objs b/hw/vfio/Makefile.objs
>> index 31c7dab..e31f30e 100644
>> --- a/hw/vfio/Makefile.objs
>> +++ b/hw/vfio/Makefile.objs
>> @@ -1,3 +1,4 @@
>>  ifeq ($(CONFIG_LINUX), y)
>> +obj-$(CONFIG_SOFTMMU) += common.o
>>  obj-$(CONFIG_PCI) += pci.o
>>  endif
>> diff --git a/hw/vfio/common.c b/hw/vfio/common.c
>> new file mode 100644
>> index 000..297c508
>> --- /dev/null
>> +++ b/hw/vfio/common.c
>> @@ -0,0 +1,990 @@
>> +/*
>> + * generic functions used by VFIO devices
>> + *
>> + * Copyright Red Hat, Inc. 2012
>> + *
>> + * Authors:
>> + *  Alex Williamson 
>> + *
>> + * This work is licensed under the terms of the GNU GPL, version 2.  See
>> + * the COPYING file in the top-level directory.
>> + *
>> + * Based on qemu-kvm device-assignment:
>> + *  Adapted for KVM by Qumranet.
>> + *  Copyright (c) 2007, Neocleus, Alex Novik (a...@neocleus.com)
>> + *  Copyright (c) 2007, Neocleus, Guy Zana (g...@neocleus.com)
>> + *  Copyright (C) 2008, Qumranet, Amit Shah (amit.s...@qumranet.com)
>> + *  Copyright (C) 2008, Red Hat, Amit Shah (amit.s...@redhat.com)
>> + *  Copyright (C) 2008, IBM, Muli Ben-Yehuda (m...@il.ibm.com)
>> + */
>> +
>> +#include 
>> +#include 
>> +#include 
>> +
>> +#include "hw/vfio/vfio-common.h"
>> +#include "hw/vfio/vfio.h"
>> +#include "exec/address-spaces.h"
>> +#include "exec/memory.h"
>> +#include "hw/hw.h"
>> +#include "qemu/error-report.h"
>> +#include "sysemu/kvm.h"
>> +
>> +QLIST_HEAD(, VFIOGroup)
>> +group_list = QLIST_HEAD_INITIALIZER(group_list);
>> +
>> +QLIST_HEAD(, VFIOAddressSpace) vfio_address_spaces =
>> +QLIST_HEAD_INITIALIZER(vfio_address_spaces);
>> +
>> +#ifdef CONFIG_KVM
>> +/*
>> + * We have a single VFIO pseudo device per KVM VM.  Once created it lives
>> + * for the life of the VM.  Closing the file descriptor only drops our
>> + * reference to it and the device's reference to kvm.  Therefore once
>> + * initialized, this file descriptor is only released on QEMU exit and
>> + * we'll re-use it should another vfio device be attached before then.
>> + */
>> +static int vfio_kvm_device_fd = -1;
>> +#endif
>> +
>> +/*
>> + * Common VFIO interrupt disable
>> + */
>> +void vfio_disable_irqindex(VFIODevice *vbasedev, int index)
>> +{
>> +struct vfio_irq_set irq_set = {
>> +.argsz = sizeof(irq_set),
>> +.flags = VFIO_IRQ_SET_DATA_NONE | VFIO_IRQ_SET_ACTION_TRIGGER,
>> +.index = index,
>> +.start = 0,
>> +.count = 0,
>> +};
>> +
>> +ioctl(vbasedev->fd, VFIO_DEVICE_SET_IRQS, &irq_set);
>> +}
>> +
>> +void vfio_unmask_irqindex(VFIODevice *vbasedev, int index)
>> +{
>> +struct vfio_irq_set irq_set = {
>> +.argsz = sizeof(irq_set),
>> +.flags = VFIO_IRQ_SET_DATA_NONE | VFIO_IRQ_SET_ACTION_UNMASK,
>> +.index = index,
>> +.start = 0,
>> +.count = 1,
>> +};
>> +
>> +ioctl(vbasedev->fd, VFIO_DEVICE_SET_IRQS, &irq_set);
>> +}

Re: [Qemu-devel] [PATCH v5 06/10] hw/vfio: create common module

2014-08-11 Thread Eric Auger
On 08/11/2014 09:25 PM, Alex Williamson wrote:
> On Sat, 2014-08-09 at 15:25 +0100, Eric Auger wrote:
>> diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h
>> new file mode 100644
>> index 000..4684ee5
>> --- /dev/null
>> +++ b/include/hw/vfio/vfio-common.h
>> @@ -0,0 +1,151 @@
>> +/*
>> + * common header for vfio based device assignment support
>> + *
>> + * Copyright Red Hat, Inc. 2012
>> + *
>> + * Authors:
>> + *  Alex Williamson 
>> + *
>> + * This work is licensed under the terms of the GNU GPL, version 2.  See
>> + * the COPYING file in the top-level directory.
>> + *
>> + * Based on qemu-kvm device-assignment:
>> + *  Adapted for KVM by Qumranet.
>> + *  Copyright (c) 2007, Neocleus, Alex Novik (a...@neocleus.com)
>> + *  Copyright (c) 2007, Neocleus, Guy Zana (g...@neocleus.com)
>> + *  Copyright (C) 2008, Qumranet, Amit Shah (amit.s...@qumranet.com)
>> + *  Copyright (C) 2008, Red Hat, Amit Shah (amit.s...@redhat.com)
>> + *  Copyright (C) 2008, IBM, Muli Ben-Yehuda (m...@il.ibm.com)
>> + */
>> +#ifndef HW_VFIO_VFIO_COMMON_H
>> +#define HW_VFIO_VFIO_COMMON_H
>> +
>> +#include "qemu-common.h"
>> +#include "exec/address-spaces.h"
>> +#include "exec/memory.h"
>> +#include "qemu/queue.h"
>> +#include "qemu/notify.h"
>> +
>> +/*#define DEBUG_VFIO*/
>> +#ifdef DEBUG_VFIO
>> +#define DPRINTF(fmt, ...) \
>> +do { fprintf(stderr, "vfio: " fmt, ## __VA_ARGS__); } while (0)
>> +#else
>> +#define DPRINTF(fmt, ...) \
>> +do { } while (0)
>> +#endif
> 
> 
> DPRINTF also need to be renamed to avoid conflicting namespace issues.
Hi Alex,

OK.

Since I am going to touch the traces anyway:
- are you OK if I use the new .name field to simplify format strings
such as this one?

DPRINTF("%s(%04x:%02x:%02x.%x) Pin %c\n", __func__, vdev->host.domain,
vdev->host.bus, vdev->host.slot, vdev->host.function,
'A' + vdev->intx.pin);
- Alex also suggested using trace points. What is your position on
that? I am also not 100% sure what it consists of: is it the trace
events documented in docs/tracing.txt?
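
For the .name question, a minimal sketch of what a name-based format
string could look like, assuming the device name is held as a plain C
string. The struct and helper below (ExamplePciDev, example_format_pin)
are hypothetical stand-ins and not part of the patch:

```c
#include <stdio.h>

/* Hypothetical stand-in for a device that carries a preformatted name,
 * e.g. "0000:01:00.0" for a PCI device. Not the actual QEMU struct. */
typedef struct ExamplePciDev {
    const char *name;
    int intx_pin;   /* 0 => INTA, 1 => INTB, ... */
} ExamplePciDev;

/* Format a trace line from the name field instead of passing
 * domain/bus/slot/function separately. Returns snprintf()'s result. */
static int example_format_pin(char *buf, size_t len, const char *func,
                              const ExamplePciDev *dev)
{
    return snprintf(buf, len, "%s(%s) Pin %c\n", func, dev->name,
                    'A' + dev->intx_pin);
}
```

The same format string would then work for any device type that fills
in a name, which is the point of the question.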

Thanks

Eric



> Thanks,
> 
> Alex
> 




Re: [Qemu-devel] VFIO-PCI testing after VFIO-platform rework

2014-08-11 Thread Eric Auger
On 08/11/2014 07:57 PM, Will Deacon wrote:
> On Mon, Aug 11, 2014 at 03:28:40PM +0100, Eric Auger wrote:
>> On 08/11/2014 03:50 PM, Will Deacon wrote:
>>> I'm playing with PCI device assignment with kvmtool, so I could do some
>>> basic testing if you like. Can you put the patches on a git tree somewhere
>>> please?
>>
>> Hi Will,
>>
>> Thanks for your proposal,
>>
>> the kernel part is at:
>> git://git.linaro.org/people/eric.auger/linux.git (branch irqfd_integ_v4)
> 
> I run into a bunch of horrible vgic conflicts if I try to merge this with my
> testing branch -- can you take a look at rebasing on top of Marc's GICv3
> code, please? (that should all be in mainline now).

Hi Will,

Sure. This merge will be done soon along with the integration of the
vgic cleanup series (http://www.spinics.net/lists/kvm-arm/msg10325.html).

Best Regards

Eric

> 
> Will
> 




Re: [Qemu-devel] [PATCH v5 05/10] hw/vfio/pci: split vfio_get_device

2014-08-11 Thread Eric Auger
On 08/12/2014 04:41 AM, David Gibson wrote:
> On Sat, Aug 09, 2014 at 03:25:44PM +0100, Eric Auger wrote:
>> vfio_get_device now takes a VFIODevice as argument. The function is split
>> into 4 functional parts: dev_info query, device check, region populate
>> and interrupt populate. the last 3 are specialized by parent device and
>> are added into DeviceOps.
> 
> Why is splitting these up into 4 stages useful, rather than having a
> single sub-class specific callback?

Hi David,

VFIOPlatformDevice already inherits from SysBusDevice and hence cannot
inherit from another VFIODevice. Same for VFIOPCIDevice that inherits
from PCIDevice. This is why I created this non QOM struct. But did you
mean something else?

As for splitting into 4: this was to share some code between platform
and PCI (the dev_info query), and vfio_get_device was already quite big.
I thought it made sense to split it into functional parts.
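
A minimal sketch of that split, with one shared query step followed by
three bus-specific hooks. All type and function names here are
hypothetical, not the ones used in the patch:

```c
#include <stddef.h>

typedef struct ExampleVdev ExampleVdev;

/* Bus-specific hooks; the names are illustrative only. */
typedef struct ExampleVdevOps {
    int (*check_device)(ExampleVdev *vdev);
    int (*populate_regions)(ExampleVdev *vdev);
    int (*populate_interrupts)(ExampleVdev *vdev);
} ExampleVdevOps;

struct ExampleVdev {
    const ExampleVdevOps *ops;
    int num_regions;    /* filled in by the common query step */
    int num_irqs;
    int regions_ready;
    int irqs_ready;
};

/* Common step shared by all bus types (stands in for the dev_info query). */
static int example_query_dev_info(ExampleVdev *vdev)
{
    vdev->num_regions = 1;
    vdev->num_irqs = 1;
    return 0;
}

/* Run the shared step, then the three specialized ones, stopping at the
 * first failure. */
static int example_get_device(ExampleVdev *vdev)
{
    int ret;

    ret = example_query_dev_info(vdev);
    if (ret) {
        return ret;
    }
    ret = vdev->ops->check_device(vdev);
    if (ret) {
        return ret;
    }
    ret = vdev->ops->populate_regions(vdev);
    if (ret) {
        return ret;
    }
    return vdev->ops->populate_interrupts(vdev);
}

/* A trivial "platform" specialization, for illustration. */
static int example_plat_check(ExampleVdev *vdev)
{
    return vdev->num_regions > 0 ? 0 : -1;
}

static int example_plat_regions(ExampleVdev *vdev)
{
    vdev->regions_ready = 1;
    return 0;
}

static int example_plat_irqs(ExampleVdev *vdev)
{
    vdev->irqs_ready = 1;
    return 0;
}

static const ExampleVdevOps example_plat_ops = {
    .check_device = example_plat_check,
    .populate_regions = example_plat_regions,
    .populate_interrupts = example_plat_irqs,
};
```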

Best Regards

Eric
> 




Re: [Qemu-devel] The status about vhost-net on kvm-arm?

2014-08-12 Thread Eric Auger
On 08/12/2014 04:41 AM, Li Liu wrote:
> Hi all,
> 
> Can anyone tell me the current status of vhost-net on kvm-arm?
> 
> Half a year has passed since Isa Ansharullah asked this question:
> http://www.spinics.net/lists/kvm-arm/msg08152.html
> 
> I have found two patches which have provided the kvm-arm support of
> eventfd and irqfd:
> 
> 1) [RFC PATCH 0/4] ARM: KVM: Enable the ioeventfd capability of KVM on ARM
> http://lists.gnu.org/archive/html/qemu-devel/2014-01/msg01770.html
> 
> 2) [RFC,v3] ARM: KVM: add irqfd and irq routing support
> https://patches.linaro.org/32261/

Hi Li,

The patch below uses Paul Mackerras' work and removes the use of the GSI
routing table. It is a simpler alternative to 2):
http://www.spinics.net/lists/kvm/msg106535.html

> 
> And there's a rough patch for qemu to support eventfd from Ying-Shiuan Pan:
> 
> [Qemu-devel] [PATCH 0/4] ioeventfd support for virtio-mmio
> https://lists.gnu.org/archive/html/qemu-devel/2014-02/msg00715.html
> 
> But there are no comments on this patch, and I can find nothing about qemu
> supporting irqfd. Did I lose track of something?

Actually, I am using irqfd in the QEMU VFIO platform device:
https://lists.nongnu.org/archive/html/qemu-devel/2014-08/msg01455.html

Best Regards

Eric

> 
> If nobody is working on it, we plan to complete it: virtio-mmio
> supporting irqfd and multiqueue.
> 
> 
> 
> 
> 
> 
> 




Re: [Qemu-devel] [PATCH v5 07/10] hw/vfio/platform: add vfio-platform support

2014-08-12 Thread Eric Auger
On 08/12/2014 09:59 AM, bharat.bhus...@freescale.com wrote:
> 
> 
>> -Original Message-
>> From: Alexander Graf [mailto:ag...@suse.de]
>> Sent: Monday, August 11, 2014 3:06 PM
>> To: Eric Auger; eric.au...@st.com; christoffer.d...@linaro.org; qemu-
>> de...@nongnu.org; Phillips Kim-R1AAHA; a.r...@virtualopensystems.com
>> Cc: will.dea...@arm.com; kvm...@lists.cs.columbia.edu;
>> alex.william...@redhat.com; Bhushan Bharat-R65777; peter.mayd...@linaro.org;
>> Yoder Stuart-B08248; a.mota...@virtualopensystems.com; patc...@linaro.org;
>> joel.sch...@amd.com; Kim Phillips
>> Subject: Re: [PATCH v5 07/10] hw/vfio/platform: add vfio-platform support
>>
>>
>> On 09.08.14 16:25, Eric Auger wrote:
>>> Minimal VFIO platform implementation supporting
>>> - register space user mapping,
>>> - IRQ assignment based on eventfds handled on qemu side.
>>>
>>> irqfd kernel acceleration comes in a subsequent patch.
>>>
>>> Signed-off-by: Kim Phillips 
>>> Signed-off-by: Eric Auger 
>>>
>>> ---
>>>
>>> v4 -> v5:
>>> - vfio-platform.h included first
>>> - cleanup error handling in *populate*, vfio_get_device,
>>>vfio_enable_intp
>>> - vfio_put_device not called anymore
>>> - add some includes to follow vfio policy
>>>
>>> v3 -> v4:
>>> [Eric Auger]
>>> - merge of "vfio: Add initial IRQ support in platform device"
>>>to get a full functional patch although perfs are limited.
>>> - removal of unrealize function since I currently understand
>>>it is only used with device hot-plug feature.
>>>
>>> v2 -> v3:
>>> [Eric Auger]
>>> - further factorization between PCI and platform (VFIORegion,
>>>VFIODevice). same level of functionality.
>>>
>>> <= v2:
>>> [Kim Phillips]
>>> - Initial Creation of the device supporting register space mapping
>>> ---
>>>   hw/vfio/Makefile.objs   |   1 +
>>>   hw/vfio/platform.c  | 517
>>>   include/hw/vfio/vfio-platform.h |  77 ++
>>>   3 files changed, 595 insertions(+)
>>>   create mode 100644 hw/vfio/platform.c
>>>   create mode 100644 include/hw/vfio/vfio-platform.h
>>>
>>> diff --git a/hw/vfio/Makefile.objs b/hw/vfio/Makefile.objs
>>> index e31f30e..c5c76fe 100644
>>> --- a/hw/vfio/Makefile.objs
>>> +++ b/hw/vfio/Makefile.objs
>>> @@ -1,4 +1,5 @@
>>>   ifeq ($(CONFIG_LINUX), y)
>>>   obj-$(CONFIG_SOFTMMU) += common.o
>>>   obj-$(CONFIG_PCI) += pci.o
>>> +obj-$(CONFIG_SOFTMMU) += platform.o
>>>   endif
>>> diff --git a/hw/vfio/platform.c b/hw/vfio/platform.c
>>> new file mode 100644
>>> index 000..f1a1b55
>>> --- /dev/null
>>> +++ b/hw/vfio/platform.c
>>> @@ -0,0 +1,517 @@
>>> +/*
>>> + * vfio based device assignment support - platform devices
>>> + *
>>> + * Copyright Linaro Limited, 2014
>>> + *
>>> + * Authors:
>>> + *  Kim Phillips 
>>> + *
>>> + * This work is licensed under the terms of the GNU GPL, version 2.  See
>>> + * the COPYING file in the top-level directory.
>>> + *
>>> + * Based on vfio based PCI device assignment support:
>>> + *  Copyright Red Hat, Inc. 2012
>>> + */
>>> +
>>> +#include 
>>> +#include 
>>> +
>>> +#include "hw/vfio/vfio-platform.h"
>>> +#include "qemu/error-report.h"
>>> +#include "qemu/range.h"
>>> +#include "sysemu/sysemu.h"
>>> +#include "exec/memory.h"
>>> +#include "qemu/queue.h"
>>> +#include "hw/sysbus.h"
>>> +
>>> +extern const MemoryRegionOps vfio_region_ops;
>>> +extern const MemoryListener vfio_memory_listener;
>>> +extern QLIST_HEAD(, VFIOGroup) group_list;
>>> +extern QLIST_HEAD(, VFIOAddressSpace) vfio_address_spaces;
>>> +void vfio_put_device(VFIOPlatformDevice *vdev);
>>> +
>>> +/*
>>> + * It is mandatory to pass a VFIOPlatformDevice since VFIODevice
>>> + * is not a QOM Object and cannot be passed to memory region functions
>>> +*/
>>> +static void vfio_map_region(VFIOPlatformDevice *vdev, int nr)
>>> +{
>>> +VFIORegion *region = vdev->regions[nr];
>>> +unsigned size = region->size;
>>> +cha

Re: [Qemu-devel] [PATCH v5 10/10] hw/arm/dyn_sysbus_devtree: enable simple VFIO dynamic instantiation

2014-08-19 Thread Eric Auger
On 08/18/2014 11:54 PM, Joel Schopp wrote:
> 
> +static void vfio_fdt_add_device_node(SysBusDevice *sbdev, void *opaque)
> +{
> +PlatformDevtreeData *data = opaque;
> +void *fdt = data->fdt;
> +const char *parent_node = data->node;
> +int compat_str_len;
> +char *nodename;
> +int i, ret;
> +uint32_t *irq_attr;
> +uint64_t *reg_attr;
> +uint64_t mmio_base;
> +uint64_t irq_number;
> +gchar mmio_base_prop[8];
> +gchar irq_number_prop[8];
> +VFIOPlatformDevice *vdev = VFIO_PLATFORM_DEVICE(sbdev);
> +VFIODevice *vbasedev = &vdev->vbasedev;
> +Object *obj = OBJECT(sbdev);
> +
> +mmio_base = object_property_get_int(obj, "mmio[0]", NULL);
> +
> +nodename = g_strdup_printf("%s/%s@%" PRIx64, parent_node,
> +   vbasedev->name,
> +   mmio_base);
> +
> +qemu_fdt_add_subnode(fdt, nodename);
> +
> +compat_str_len = strlen(vdev->compat) + 1;
> +qemu_fdt_setprop(fdt, nodename, "compatible",
> +vdev->compat, compat_str_len);
> +
> +reg_attr = g_new(uint64_t, vbasedev->num_regions*4);
> +
> +for (i = 0; i < vbasedev->num_regions; i++) {
> +snprintf(mmio_base_prop, sizeof(mmio_base_prop), "mmio[%d]", i);
> +mmio_base = object_property_get_int(obj, mmio_base_prop, NULL);
> +reg_attr[2*i] = 1;
> +reg_attr[2*i+1] = mmio_base;
> +reg_attr[2*i+2] = 1;
> +reg_attr[2*i+3] = memory_region_size(&vdev->regions[i]->mem);
> +}
> 
> This should be 4 instead of 2. 
Hi Joel,

Yes, definitely! I forgot to restore the original value after trying
different qemu_fdt_setprop_* functions. Sorry for that.
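
For reference, a minimal sketch of the corrected indexing, assuming four
64-bit slots per region laid out as in the quoted loop. The function name
and the bases/sizes arrays are hypothetical:

```c
#include <stddef.h>
#include <stdint.h>

/* Number of 64-bit slots used per region in this sketch:
 * (address cell count, base, size cell count, size). */
#define SLOTS_PER_REGION 4

/* reg_attr must hold at least num_regions * SLOTS_PER_REGION entries. */
static void example_fill_reg_attr(uint64_t *reg_attr, size_t num_regions,
                                  const uint64_t *bases,
                                  const uint64_t *sizes)
{
    size_t i;

    for (i = 0; i < num_regions; i++) {
        reg_attr[SLOTS_PER_REGION * i]     = 1;        /* address cells */
        reg_attr[SLOTS_PER_REGION * i + 1] = bases[i];
        reg_attr[SLOTS_PER_REGION * i + 2] = 1;        /* size cells */
        reg_attr[SLOTS_PER_REGION * i + 3] = sizes[i];
    }
}
```

With a stride of 2, the entry for i = 1 would start at index 2 and
overwrite the second half of the entry for i = 0.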

Best Regards

Eric
> Also, to support 64 bit systems I think this should be 2 instead of 1.
> 




Re: [Qemu-devel] [PATCH v5 10/10] hw/arm/dyn_sysbus_devtree: enable simple VFIO dynamic instantiation

2014-08-19 Thread Eric Auger
On 08/19/2014 12:11 AM, Peter Maydell wrote:
> On 18 August 2014 22:54, Joel Schopp  wrote:
>>
>> +static void vfio_fdt_add_device_node(SysBusDevice *sbdev, void *opaque)
>> +{
>> +PlatformDevtreeData *data = opaque;
>> +void *fdt = data->fdt;
>> +const char *parent_node = data->node;
>> +int compat_str_len;
>> +char *nodename;
>> +int i, ret;
>> +uint32_t *irq_attr;
>> +uint64_t *reg_attr;
>> +uint64_t mmio_base;
>> +uint64_t irq_number;
>> +gchar mmio_base_prop[8];
>> +gchar irq_number_prop[8];
>> +VFIOPlatformDevice *vdev = VFIO_PLATFORM_DEVICE(sbdev);
>> +VFIODevice *vbasedev = &vdev->vbasedev;
>> +Object *obj = OBJECT(sbdev);
>> +
>> +mmio_base = object_property_get_int(obj, "mmio[0]", NULL);
>> +
>> +nodename = g_strdup_printf("%s/%s@%" PRIx64, parent_node,
>> +   vbasedev->name,
>> +   mmio_base);
>> +
>> +qemu_fdt_add_subnode(fdt, nodename);
>> +
>> +compat_str_len = strlen(vdev->compat) + 1;
> 
> At this point you've already substituted the NULs in,
> so you can't call strlen(), I think.
Hi Peter,

Yes, you're right. Thanks.
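
To make the pitfall concrete, a minimal sketch, assuming the compatible
list starts out as one C string whose separator character is later
replaced by NULs. The separator and the helper name are assumptions, not
taken from the patch:

```c
#include <stddef.h>
#include <string.h>

/* Replace every 'sep' in s with NUL, in place, and return the number of
 * bytes to store in the property, including the final terminator.
 * The length is taken before substituting, since strlen() afterwards
 * would stop at the first embedded NUL. */
static size_t example_compat_to_stringlist(char *s, char sep)
{
    size_t len = strlen(s) + 1;
    size_t i;

    for (i = 0; i + 1 < len; i++) {
        if (s[i] == sep) {
            s[i] = '\0';
        }
    }
    return len;
}
```

The returned length, rather than a later strlen(), is what would be
passed as the property size.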
> 
>> +qemu_fdt_setprop(fdt, nodename, "compatible",
>> +vdev->compat, compat_str_len);
>> +
>> +reg_attr = g_new(uint64_t, vbasedev->num_regions*4);
>> +
>> +for (i = 0; i < vbasedev->num_regions; i++) {
>> +snprintf(mmio_base_prop, sizeof(mmio_base_prop), "mmio[%d]", i);
>> +mmio_base = object_property_get_int(obj, mmio_base_prop, NULL);
>> +reg_attr[2*i] = 1;
>> +reg_attr[2*i+1] = mmio_base;
>> +reg_attr[2*i+2] = 1;
>> +reg_attr[2*i+3] = memory_region_size(&vdev->regions[i]->mem);
>> +}
>>
>> This should be 4 instead of 2.
>> Also, to support 64 bit systems I think this should be 2 instead of 1.
> 
> Actually it depends entirely on what the board has done to
> create the device tree node that we're inserting this child
> node into. For ARM boot.c sets both #address-cells and
> #size-cells to 2 regardless of whether the system is 32
> or 64 bits, for simplicity. I imagine PPC does something
> different. If we're editing a dtb that the user passed in (which
> I think would be pretty lunatic so we shouldn't do this)
> we'd have to actually walk the dtb to try to figure out what
> the semantics of the reg property should be.

Putting size=1 was the only solution I found to use an offset relative
to the parent bus instead of an absolute base address. I think the
reason is that, in platform_bus_create_devtree, the function that creates
the "platform bus" node, #address-cells and #size-cells are currently
set to 1. I assume the motivation was that the bus size is expected to be
smaller than 4GB. I guess the problem then shifts to how the platform
bus is included in any ARM platform.
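
As an illustration of the 1-cell layout, a minimal sketch, assuming the
parent bus node uses #address-cells = 1 and #size-cells = 1. The helper
names are hypothetical and not part of QEMU:

```c
#include <stddef.h>
#include <stdint.h>

/* Store v as one big-endian 32-bit device tree cell. */
static void example_put_cell(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)(v >> 24);
    p[1] = (uint8_t)(v >> 16);
    p[2] = (uint8_t)(v >> 8);
    p[3] = (uint8_t)v;
}

/* Encode a "reg" entry for a child of a bus node that has
 * #address-cells = 1 and #size-cells = 1. 'offset' is relative to the
 * start of the bus window. Returns 0 on success, or -1 if a value does
 * not fit in one cell. */
static int example_encode_reg(uint8_t out[8], uint64_t offset, uint64_t size)
{
    if (offset > UINT32_MAX || size > UINT32_MAX) {
        return -1;
    }
    example_put_cell(out, (uint32_t)offset);
    example_put_cell(out + 4, (uint32_t)size);
    return 0;
}
```

The -1 path is the 4GB limit mentioned above: anything larger needs two
cells on the bus node.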

Thanks

Eric
> 
> thanks
> -- PMM
> 




Re: [Qemu-devel] [PATCH v5 10/10] hw/arm/dyn_sysbus_devtree: enable simple VFIO dynamic instantiation

2014-08-19 Thread Eric Auger
On 08/19/2014 12:26 AM, Joel Schopp wrote:
> 
> On 08/18/2014 05:11 PM, Peter Maydell wrote:
>> On 18 August 2014 22:54, Joel Schopp  wrote:
>>> +static void vfio_fdt_add_device_node(SysBusDevice *sbdev, void *opaque)
>>> +{
>>> +PlatformDevtreeData *data = opaque;
>>> +void *fdt = data->fdt;
>>> +const char *parent_node = data->node;
>>> +int compat_str_len;
>>> +char *nodename;
>>> +int i, ret;
>>> +uint32_t *irq_attr;
>>> +uint64_t *reg_attr;
>>> +uint64_t mmio_base;
>>> +uint64_t irq_number;
>>> +gchar mmio_base_prop[8];
>>> +gchar irq_number_prop[8];
>>> +VFIOPlatformDevice *vdev = VFIO_PLATFORM_DEVICE(sbdev);
>>> +VFIODevice *vbasedev = &vdev->vbasedev;
>>> +Object *obj = OBJECT(sbdev);
>>> +
>>> +mmio_base = object_property_get_int(obj, "mmio[0]", NULL);
>>> +
>>> +nodename = g_strdup_printf("%s/%s@%" PRIx64, parent_node,
>>> +   vbasedev->name,
>>> +   mmio_base);
>>> +
>>> +qemu_fdt_add_subnode(fdt, nodename);
>>> +
>>> +compat_str_len = strlen(vdev->compat) + 1;
>> At this point you've already substituted the NULs in,
>> so you can't call strlen(), I think.
>>
>>> +qemu_fdt_setprop(fdt, nodename, "compatible",
>>> +vdev->compat, compat_str_len);
>>> +
>>> +reg_attr = g_new(uint64_t, vbasedev->num_regions*4);
>>> +
>>> +for (i = 0; i < vbasedev->num_regions; i++) {
>>> +snprintf(mmio_base_prop, sizeof(mmio_base_prop), "mmio[%d]", i);
>>> +mmio_base = object_property_get_int(obj, mmio_base_prop, NULL);
>>> +reg_attr[2*i] = 1;
>>> +reg_attr[2*i+1] = mmio_base;
>>> +reg_attr[2*i+2] = 1;
>>> +reg_attr[2*i+3] = memory_region_size(&vdev->regions[i]->mem);
>>> +}
>>>
>>> This should be 4 instead of 2.
>>> Also, to support 64 bit systems I think this should be 2 instead of 1.
>> Actually it depends entirely on what the board has done to
>> create the device tree node that we're inserting this child
>> node into. For ARM boot.c sets both #address-cells and
>> #size-cells to 2 regardless of whether the system is 32
>> or 64 bits, for simplicity. I imagine PPC does something
>> different. If we're editing a dtb that the user passed in (which
>> I think would be pretty lunatic so we shouldn't do this)
>> we'd have to actually walk the dtb to try to figure out what
>> the semantics of the reg property should be.
> For the index [2*i],[2*i+1], etc is clearly a bug as when i = 1 it will
> overwrite two of the values.  Changing that to [4*i],[4*i+1],etc fixes it.
> 
> I think you are right on the size.  I also wonder if the user doesn't
> pass in a dtb if qemu should try to recreate the device-tree entry from
> the platform device entry in the host kernel?  If so would that best be
> done by recreating the values from /proc/device-tree ?
Antonios recently submitted a patch to retrieve dt info from the vfio
platform device.
[RFC 0/4] VFIO: PLATFORM: Return device tree info for a platform device node
https://www.mail-archive.com/kvm@vger.kernel.org/msg106282.html

Best Regards

Eric
> 
> I also wish that qemu had a flag to output the generated dtb to a file
> much like lkvm (kvmtool) has.
> 




Re: [Qemu-devel] [PATCH v5 05/10] hw/vfio/pci: split vfio_get_device

2014-08-29 Thread Eric Auger
On 08/13/2014 05:32 AM, David Gibson wrote:
> On Tue, Aug 12, 2014 at 08:54:34AM +0200, Eric Auger wrote:
>> On 08/12/2014 04:41 AM, David Gibson wrote:
>>> On Sat, Aug 09, 2014 at 03:25:44PM +0100, Eric Auger wrote:
>>>> vfio_get_device now takes a VFIODevice as argument. The function is split
>>>> into 4 functional parts: dev_info query, device check, region populate
>>>> and interrupt populate. the last 3 are specialized by parent device and
>>>> are added into DeviceOps.
>>>
>>> Why is splitting these up into 4 stages useful, rather than having a
>>> single sub-class specific callback?
>>
>> Hi David,
>>
>> VFIOPlatformDevice already inherits from SysBusDevice and hence cannot
>> inherit from another VFIODevice. Same for VFIOPCIDevice that inherits
>> from PCIDevice. This is why I created this non QOM struct. But did you
>> mean something else?
> 
> Ah, yes, sorry, I missed that, though it's obvious now I think about
> it.
> 
>> Then splitting into 4: This was to share some code between platform and
>> PCI (dev_info query) and vfio_get_device was quite big already. I
>> thought it makes sense to split it into functional parts.
> 
> Hm, ok.  So splitting out dev_info_query certainly makes sense then.
> But does splitting the two populate sections make sense?  Is it
> plausible that two different VFIO capable busses would share one of
> these functions but not the other?

Hi David,

Coming back to you on that topic. There is no justification for
splitting the code into 3 functions other than having shorter functions
with reduced functionality. But I acknowledge that a single function
would simplify the diff between the original code and the new one, so I
intend to keep a single specialized function instead of 3.

Best Regards

Eric

> 




Re: [Qemu-devel] [RFC v2 1/4] Add EXEC_FLAG to VFIO DMA mappings

2014-05-23 Thread Eric Auger
On 05/11/2014 07:13 PM, Alvise Rigo wrote:
> The flag is mandatory for the ARM SMMU so we always add it if the MMIO
> handles it.

Hi Alvise,

Referring to the root problem explanation found in
https://lkml.org/lkml/2014/2/8/176, I understand the problem is specific
to devices that fetch instructions from executable memory region
sections (XN=0).

Typically this is not the case for the Midway xgmac, which does not need
executable regions and hence does not need that change.

In
http://lists.linuxfoundation.org/pipermail/iommu/2013-November/007095.html,
Will says most IOMMU mappings should be XN.

I am not knowledgeable enough about memory mapping settings to understand
the consequences of always setting XN=0, even for devices that do not
need it.

Does anyone have an opinion on this?

Best Regards

Eric

> 
> Signed-off-by: Alvise Rigo 
> ---
>  hw/vfio/common.c   | 9 +
>  hw/vfio/vfio-common.h  | 1 +
>  linux-headers/linux/vfio.h | 2 ++
>  3 files changed, 12 insertions(+)
> 
> diff --git a/hw/vfio/common.c b/hw/vfio/common.c
> index 9d1f723..a805c5d 100644
> --- a/hw/vfio/common.c
> +++ b/hw/vfio/common.c
> @@ -107,6 +107,11 @@ static int vfio_dma_map(VFIOContainer *container, hwaddr iova,
>  map.flags |= VFIO_DMA_MAP_FLAG_WRITE;
>  }
>  
> +/* add exec flag */
> +if (container->iommu_data.has_exec_cap) {
> +map.flags |= VFIO_DMA_MAP_FLAG_EXEC;
> +}
> +
>  /*
>   * Try the mapping, if it fails with EBUSY, unmap the region and try
>   * again.  This shouldn't be necessary, but we sometimes see it in
> @@ -352,6 +357,10 @@ static int vfio_connect_container(VFIOGroup *group)
>  return -errno;
>  }
>  
> +if (ioctl(fd, VFIO_CHECK_EXTENSION, VFIO_IOMMU_PROT_EXEC)) {
> +container->iommu_data.has_exec_cap = true;
> +}
> +
>  container->iommu_data.type1.listener = vfio_memory_listener;
>  container->iommu_data.release = vfio_listener_release;
>  
> diff --git a/hw/vfio/vfio-common.h b/hw/vfio/vfio-common.h
> index 21148ef..1abbd1a 100644
> --- a/hw/vfio/vfio-common.h
> +++ b/hw/vfio/vfio-common.h
> @@ -35,6 +35,7 @@ typedef struct VFIOContainer {
>  union {
>  VFIOType1 type1;
>  };
> +bool has_exec_cap; /* support of exec capability by the IOMMU */
>  void (*release)(struct VFIOContainer *);
>  } iommu_data;
>  QLIST_HEAD(, VFIOGroup) group_list;
> diff --git a/linux-headers/linux/vfio.h b/linux-headers/linux/vfio.h
> index 17c58e0..95a02c5 100644
> --- a/linux-headers/linux/vfio.h
> +++ b/linux-headers/linux/vfio.h
> @@ -24,6 +24,7 @@
>  #define VFIO_TYPE1_IOMMU 1
>  #define VFIO_SPAPR_TCE_IOMMU 2
>  
> +#define VFIO_IOMMU_PROT_EXEC 5
>  /*
>   * The IOCTL interface is designed for extensibility by embedding the
>   * structure length (argsz) and flags into structures passed between
> @@ -392,6 +393,7 @@ struct vfio_iommu_type1_dma_map {
>   __u32   flags;
>  #define VFIO_DMA_MAP_FLAG_READ (1 << 0)  /* readable from device */
>  #define VFIO_DMA_MAP_FLAG_WRITE (1 << 1) /* writable from device */
> +#define VFIO_DMA_MAP_FLAG_EXEC (1 << 2)  /* executable from device */
>   __u64   vaddr;  /* Process virtual address */
>   __u64   iova;   /* IO virtual address */
>   __u64   size;   /* Size of mapping (bytes) */
> 




Re: [Qemu-devel] [PATCH v3 6/6] hw/arm/virt: Support dynamically spawned sysbus devices

2014-10-20 Thread Eric Auger
On 09/09/2014 01:11 PM, Paolo Bonzini wrote:
> Il 09/09/2014 09:54, Eric Auger ha scritto:
>> Allows sysbus devices to be instantiated from command line by
>> using -device option
>>
>> Signed-off-by: Alexander Graf 
>> Signed-off-by: Eric Auger 
>>
>> ---
>>
>> v2 -> v3
>> - renaming of arm_platform_bus_create_devtree and arm_load_dtb
>> - add copyright in hw/arm/dyn_sysbus_devtree.c
>>
>> v1 -> v2:
>> - remove useless vfio-platform.h include file
>> - s/MACHVIRT_PLATFORM_HOLE/MACHVIRT_PLATFORM_SIZE
>> - use dyn_sysbus_binding and dyn_sysbus_devtree
>> - dynamic sysbus platform bus size shrunk to 4MB and
>>   moved between RTC and MMIO
>>
>> v1:
>>
>> Inspired from what Alex Graf did in ppc e500
>> https://lists.gnu.org/archive/html/qemu-ppc/2014-07/msg00012.html
>> ---
>>  hw/arm/dyn_sysbus_devtree.c | 26 +++
>>  hw/arm/virt.c   | 62 +++--
>>  2 files changed, 86 insertions(+), 2 deletions(-)
>>
>> diff --git a/hw/arm/dyn_sysbus_devtree.c b/hw/arm/dyn_sysbus_devtree.c
>> index 6375024..61e5b5f 100644
>> --- a/hw/arm/dyn_sysbus_devtree.c
>> +++ b/hw/arm/dyn_sysbus_devtree.c
>> @@ -1,7 +1,30 @@
>> +/*
>> + * ARM Platform Bus device tree generation helpers 
>> + *
>> + * Copyright (c) 2014 Linaro Limited
>> + *
>> + * This program is free software; you can redistribute it and/or modify it
>> + * under the terms and conditions of the GNU General Public License,
>> + * version 2 or later, as published by the Free Software Foundation.
>> + *
>> + * This program is distributed in the hope it will be useful, but WITHOUT
>> + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
>> + * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
>> + * more details.
>> + *
>> + * You should have received a copy of the GNU General Public License along with
>> + * this program.  If not, see <http://www.gnu.org/licenses/>.
>> + *
>> + */
>> +
>>  #include "hw/arm/dyn_sysbus_devtree.h"
>>  #include "qemu/error-report.h"
>>  #include "sysemu/device_tree.h"
>>  
>> +/**
>> + * arm_sysbus_device_create_devtree - create the node of devices
>> + * attached to the platform bus
>> + */
>>  static int arm_sysbus_device_create_devtree(Object *obj, void *opaque)
>>  {
>>  PlatformDevtreeData *data = opaque;
>> @@ -27,6 +50,9 @@ static int arm_sysbus_device_create_devtree(Object *obj, void *opaque)
>>  return 0;
>>  }
>>  
>> +/**
>> + * arm_platform_bus_create_devtree - create the platform bus node
>> + */
>>  void arm_platform_bus_create_devtree(DynSysbusParams *params,
>>   void *fdt, const char *intc)
>>  {
> 
> All this should go in patch 2.  For the documentation comments, please
> follow the model of include/hw/memory.h (including putting the
> documentation in the header for public functions).
> 
>> diff --git a/hw/arm/virt.c b/hw/arm/virt.c
>> index 9085b88..16acf44 100644
>> --- a/hw/arm/virt.c
>> +++ b/hw/arm/virt.c
>> @@ -40,6 +40,8 @@
>>  #include "exec/address-spaces.h"
>>  #include "qemu/bitops.h"
>>  #include "qemu/error-report.h"
>> +#include "hw/misc/dyn_sysbus_binding.h"
>> +#include "hw/arm/dyn_sysbus_devtree.h"
>>  
>>  #define NUM_VIRTIO_TRANSPORTS 32
>>  
>> @@ -57,6 +59,14 @@
>>  #define GIC_FDT_IRQ_PPI_CPU_START 8
>>  #define GIC_FDT_IRQ_PPI_CPU_WIDTH 8
>>  
>> +#define MACHVIRT_PLATFORM_BASE 0x940
>> +#define MACHVIRT_PLATFORM_SIZE (4ULL * 1024 * 1024) /* 4 MB */
>> +#define MACHVIRT_PLATFORM_PAGE_SHIFT   12
>> +#define MACHVIRT_PLATFORM_SIZE_PAGES   (MACHVIRT_PLATFORM_SIZE >> \
>> +MACHVIRT_PLATFORM_PAGE_SHIFT)
>> +#define MACHVIRT_PLATFORM_FIRST_IRQ 48
>> +#define MACHVIRT_PLATFORM_NUM_IRQS 20
>> +
>>  enum {
>>  VIRT_FLASH,
>>  VIRT_MEM,
>> @@ -66,6 +76,7 @@ enum {
>>  VIRT_UART,
>>  VIRT_MMIO,
>>  VIRT_RTC,
>> +VIRT_PLATFORM,
>>  };
>>  
>>  typedef struct MemMapEntry {
>> @@ -105,16 +116,27 @@ static const MemMapEntry a15memmap[] = {
>>  [VIRT_GIC_CPU] ={ 0x0801, 0x0001 },
>>  [VIRT_UART] =   { 0x0900, 0x1000 },
>>  [

[Qemu-devel] dynamic sysbus instantiation and load_dtb implementation

2014-10-23 Thread Eric Auger
Dear all,

The goal of this mail is to summarize how dynamic sysbus device tree
nodes were created on ARM with "machvirt dynamic sysbus device
instantiation",
https://lists.gnu.org/archive/html/qemu-devel/2014-09/msg01626.html
and to request some advice after commit "hw/arm/boot: load DTB as a ROM
image", which calls the current implementation into question.

When sysbus devices are dynamically instantiated from the QEMU command
line, the complete device tree cannot be built at machine init. At that
time we lack key information about those devices (base address, IRQ
binding, ...).

Dynamic sysbus devices are "realized" after the machine init, when the
"-device" options are parsed. Only at that time is the information
about the devices collected.

The QEMU binding of the devices is performed in the platform_bus
machine_init_done_notifier. Only at that time are the base address and
the IRQ number of each device chosen.

The original idea was to create the dynamic sysbus device tree nodes in
a reset callback (registered through qemu_register_reset). The device
tree was fully re-created at that time and the new sysbus device nodes
were added too. Finally the architecture-specific load_dtb was called.

On ppc/e500 this works since load_dtb uses cpu_physical_memory_write.
This was the case on ARM too until recently, but commit "hw/arm/boot: load
DTB as a ROM image" changed the ARM load_dtb implementation. It now uses
rom_add_blob_fixed. By the time the reset callback is called, vl.c has
already called rom_load_done(), which prevents the ROM content from being
changed. Hence the current callback mechanism does not work anymore.

A solution I foresee to fix the issue:
construct the device tree nodes in a machine_init_done_notifier,
before rom_load_done is called. I would propose that the platform bus
device (hw/core/platform-bus.c in [PATCH v3 0/7] Dynamic sysbus device
allocation support,
http://lists.nongnu.org/archive/html/qemu-devel/2014-09/msg04833.html)
registers another machine_init_done_notifier whose role would be to
initiate the dt update. I would add a function to the platform bus to
pass opaque data that allows the notifier to call an architecture-specific
dt implementation, if needed (on ARM only).
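
To illustrate why the timing matters, here is a toy model of a blob that can only be written until loading is declared done. It is only a sketch: ToyRom, toy_rom_set and toy_rom_load_done are hypothetical names and do not reproduce the QEMU loader API; they only mimic the constraint described above.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-in for a ROM blob: writable until loading is declared done. */
typedef struct ToyRom {
    unsigned char data[64];
    size_t len;
    bool load_done;
} ToyRom;

/* Returns 0 on success, -1 if the ROM is frozen or the blob is too large. */
int toy_rom_set(ToyRom *rom, const void *blob, size_t len)
{
    if (rom->load_done || len > sizeof(rom->data)) {
        return -1;
    }
    memcpy(rom->data, blob, len);
    rom->len = len;
    return 0;
}

/* After this call the content can no longer be replaced. */
void toy_rom_load_done(ToyRom *rom)
{
    rom->load_done = true;
}
```

A device tree written from a machine init done notifier lands before the freeze and is accepted; one written from a reset callback arrives after it and is rejected, which is the failure described above.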

I understand that reverting to the previous cpu_physical_memory_write
implementation on ARM is not the right direction.

Do you have any comments on the proposed solution, or any other suggestions?

Thanks in advance

Best Regards

Eric








Re: [Qemu-devel] dynamic sysbus instantiation and load_dtb implementation

2014-10-23 Thread Eric Auger
Hi,

Thanks everyone for entering the thread & reading my long email.

Alex, I indeed can register the notifier in the machine file after the
platform bus instantiation. This indeed guarantees the notifiers are
called in the right order ...

Thanks

Best Regards

Eric

On 10/23/2014 01:26 PM, Alexander Graf wrote:
> 
> 
> On 23.10.14 13:24, Peter Maydell wrote:
>> On 23 October 2014 12:23, Alexander Graf  wrote:
>>> On 23.10.14 12:19, Ard Biesheuvel wrote:
>>>> The reason for this change was that, before, the DTB would only be
>>>> generated once, and after a reset, the machine would go through the
>>>> kernel boot protocol as before but the DTB pointer would point to
>>>> garbage. Any idea how ppc deals with this? Do they recreate the device
>>>> tree after each reset?
>>>
>>> Yes, we regenerate the device tree on each reset.
>>
>> Any particular reason? Surely it's always the same...
> 
> We have the code in place anyway, it's not a performance critical code
> path and putting it into a rom would be a waste of RAM, as it'd keep yet
> another copy of something we can easily regenerate.
> 
> It's a matter of personal preference I guess.
> 
> 
> Alex
> 




Re: [Qemu-devel] dynamic sysbus instantiation and load_dtb implementation

2014-10-23 Thread Eric Auger
On 10/23/2014 01:41 PM, Eric Auger wrote:
> Hi,
> 
> Thanks everyone for entering the thread & reading my long email.
> 
> Alex, I indeed can register the notifier in the machine file after the
s/after/before
Eric
> platform bus instantiation. This indeed guarantees the notifiers are
> called in the right order ...

> 
> Thanks
> 
> Best Regards
> 
> Eric
> 
> On 10/23/2014 01:26 PM, Alexander Graf wrote:
>>
>>
>> On 23.10.14 13:24, Peter Maydell wrote:
>>> On 23 October 2014 12:23, Alexander Graf  wrote:
>>>> On 23.10.14 12:19, Ard Biesheuvel wrote:
>>>>> The reason for this change was that, before, the DTB would only be
>>>>> generated once, and after a reset, the machine would go through the
>>>>> kernel boot protocol as before but the DTB pointer would point to
>>>>> garbage. Any idea how ppc deals with this? Do they recreate the device
>>>>> tree after each reset?
>>>>
>>>> Yes, we regenerate the device tree on each reset.
>>>
>>> Any particular reason? Surely it's always the same...
>>
>> We have the code in place anyway, it's not a performance critical code
>> path and putting it into a rom would be a waste of RAM, as it'd keep yet
>> another copy of something we can easily regenerate.
>>
>> It's a matter of personal preference I guess.
>>
>>
>> Alex
>>
> 




[Qemu-devel] [PATCH v7 12/16] hw/arm/sysbus-fdt: enable vfio-calxeda-xgmac dynamic instantiation

2014-10-31 Thread Eric Auger
vfio-calxeda-xgmac can now be instantiated using the -device option.
The node creation function generates a very basic dt node composed
of the compatible, reg and interrupts properties.
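
As an illustration of the interrupts property layout, here is a standalone sketch that packs one <type number flags> triplet of big-endian cells per IRQ. The helper names put_be32_cell and encode_interrupts are hypothetical and are not part of the patch. The constants 0 and 0x4 are the ones the patch writes; to my understanding of the ARM GIC binding they denote an SPI and an active-high level-sensitive trigger.

```c
#include <stddef.h>
#include <stdint.h>

/* Store a 32-bit value as one big-endian device tree cell. */
void put_be32_cell(uint8_t *dst, uint32_t v)
{
    dst[0] = (uint8_t)(v >> 24);
    dst[1] = (uint8_t)(v >> 16);
    dst[2] = (uint8_t)(v >> 8);
    dst[3] = (uint8_t)v;
}

/*
 * Fill buf with one <type number flags> triplet per IRQ.
 * Returns the number of bytes written, or 0 if buf is too small.
 */
size_t encode_interrupts(uint8_t *buf, size_t buf_len,
                         const uint32_t *irqs, size_t num_irqs)
{
    size_t needed = num_irqs * 3 * sizeof(uint32_t);
    size_t i;

    if (needed > buf_len) {
        return 0;
    }
    for (i = 0; i < num_irqs; i++) {
        uint8_t *cell = buf + i * 3 * sizeof(uint32_t);

        put_be32_cell(cell, 0);           /* interrupt type, as in the patch */
        put_be32_cell(cell + 4, irqs[i]); /* interrupt number */
        put_be32_cell(cell + 8, 0x4);     /* trigger flags, as in the patch */
    }
    return needed;
}
```

The byte count returned is what a raw property setter would be given as the property length.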

Signed-off-by: Eric Auger 

---

v6 -> v7:
- compat string re-formatting removed since compat string is not exposed
  anymore as a user option
- VFIO IRQ kick-off removed from sysbus-fdt and moved to VFIO platform
  device
---
 hw/arm/sysbus-fdt.c | 88 +
 1 file changed, 88 insertions(+)

diff --git a/hw/arm/sysbus-fdt.c b/hw/arm/sysbus-fdt.c
index d5476f1..f8b310b 100644
--- a/hw/arm/sysbus-fdt.c
+++ b/hw/arm/sysbus-fdt.c
@@ -27,6 +27,8 @@
 #include "hw/platform-bus.h"
 #include "sysemu/sysemu.h"
 #include "hw/platform-bus.h"
+#include "hw/vfio/vfio-platform.h"
+#include "hw/vfio/vfio-calxeda-xgmac.h"
 
 /*
  * internal struct that contains the information to create dynamic
@@ -54,8 +56,11 @@ typedef struct NodeCreationPair {
 int (*add_fdt_node_fn)(SysBusDevice *sbdev, void *opaque);
 } NodeCreationPair;
 
+static int add_basic_vfio_fdt_node(SysBusDevice *sbdev, void *opaque);
+
 /* list of supported dynamic sysbus devices */
 NodeCreationPair add_fdt_node_functions[] = {
+{TYPE_VFIO_CALXEDA_XGMAC, add_basic_vfio_fdt_node},
 {"", NULL}, /*last element*/
 };
 
@@ -86,6 +91,89 @@ static int add_fdt_node(SysBusDevice *sbdev, void *opaque)
 }
 
 /**
+ * add_basic_vfio_fdt_node - generates the most basic node for a VFIO node
+ *
+ * set properties are:
+ * - compatible string
+ * - regs
+ * - interrupts
+ */
+static int add_basic_vfio_fdt_node(SysBusDevice *sbdev, void *opaque)
+{
+PlatformBusFdtData *data = opaque;
+PlatformBusDevice *pbus = data->pbus;
+void *fdt = data->fdt;
+const char *parent_node = data->pbus_node_name;
+int compat_str_len;
+char *nodename;
+int i, ret;
+uint32_t *irq_attr;
+uint64_t *reg_attr;
+uint64_t mmio_base;
+uint64_t irq_number;
+VFIOPlatformDevice *vdev = VFIO_PLATFORM_DEVICE(sbdev);
+VFIODevice *vbasedev = &vdev->vbasedev;
+Object *obj = OBJECT(sbdev);
+
+mmio_base = object_property_get_int(obj, "mmio[0]", NULL);
+
+nodename = g_strdup_printf("%s/%s@%" PRIx64, parent_node,
+   vbasedev->name,
+   mmio_base);
+
+qemu_fdt_add_subnode(fdt, nodename);
+
+compat_str_len = strlen(vdev->compat) + 1;
+qemu_fdt_setprop(fdt, nodename, "compatible",
+  vdev->compat, compat_str_len);
+
+reg_attr = g_new(uint64_t, vbasedev->num_regions*4);
+
+for (i = 0; i < vbasedev->num_regions; i++) {
+mmio_base = platform_bus_get_mmio_addr(pbus, sbdev, i);
+reg_attr[4*i] = 1;
+reg_attr[4*i+1] = mmio_base;
+reg_attr[4*i+2] = 1;
+reg_attr[4*i+3] = memory_region_size(&vdev->regions[i]->mem);
+}
+
+ret = qemu_fdt_setprop_sized_cells_from_array(fdt, nodename, "reg",
+ vbasedev->num_regions*2, reg_attr);
+if (ret < 0) {
+error_report("could not set reg property of node %s", nodename);
+goto fail;
+}
+
+irq_attr = g_new(uint32_t, vbasedev->num_irqs*3);
+
+for (i = 0; i < vbasedev->num_irqs; i++) {
+irq_number = platform_bus_get_irqn(pbus, sbdev , i)
+ + data->irq_start;
+irq_attr[3*i] = cpu_to_be32(0);
+irq_attr[3*i+1] = cpu_to_be32(irq_number);
+irq_attr[3*i+2] = cpu_to_be32(0x4);
+}
+
+   ret = qemu_fdt_setprop(fdt, nodename, "interrupts",
+ irq_attr, vbasedev->num_irqs*3*sizeof(uint32_t));
+if (ret < 0) {
+error_report("could not set interrupts property of node %s",
+ nodename);
+goto fail;
+}
+
+g_free(nodename);
+g_free(irq_attr);
+g_free(reg_attr);
+
+return 0;
+
+fail:
+
+   return -1;
+}
+
+/**
  * add_all_platform_bus_fdt_nodes - create all the platform bus nodes
  *
  * builds the parent platform bus node and all the nodes of dynamic
-- 
1.8.3.2




[Qemu-devel] [PATCH v2 RESEND] vfio: migration to trace points

2014-10-31 Thread Eric Auger
This patch removes all DPRINTFs and replaces them with trace points.
A few DPRINTFs used in error cases were turned into error_report calls.
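
For readers unfamiliar with the difference, here is a simplified model of what a trace point buys over DPRINTF: the event is switched at run time instead of at build time, and the event name takes over the role of __func__. This is only a sketch; toy_trace_vfio_eoi and toy_trace_vfio_eoi_enabled are hypothetical names and do not show the code the tracing framework actually generates.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Runtime switch; a DPRINTF-style macro would need a rebuild instead. */
bool toy_trace_vfio_eoi_enabled;

/*
 * Toy stand-in for a generated trace helper. Formats the event into buf
 * and returns its length, or 0 when the event is disabled.
 */
int toy_trace_vfio_eoi(char *buf, size_t len, unsigned domain,
                       unsigned bus, unsigned slot, unsigned fn)
{
    if (!toy_trace_vfio_eoi_enabled) {
        return 0;
    }
    /* The event name prefix replaces the __func__ argument of DPRINTF. */
    return snprintf(buf, len, "vfio_eoi (%04x:%02x:%02x.%x) EOI\n",
                    domain, bus, slot, fn);
}
```

The call site stays a plain function call with typed arguments, so the compiler checks them even when the event is disabled.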

Signed-off-by: Eric Auger 

---

- __func__ is removed since the trace point name does the same job
- HWADDR_PRIx was replaced by PRIx64
- this transformation has only been compile-tested, on PCI, with
  qemu configured with --enable-trace-backends=stderr
- in the future, format strings and calls may be simplified by using a single
  name argument instead of domain, bus, slot, function.

v1 (RFC) -> v2 (PATCH):
- restore original format strings since parsing now is OK after
  commit f9bbba9,
  [PATCH v2] trace: tighten up trace-events regex to fix bad parse
---
 hw/misc/vfio.c | 403 +
 trace-events   |  75 ++-
 2 files changed, 280 insertions(+), 198 deletions(-)

diff --git a/hw/misc/vfio.c b/hw/misc/vfio.c
index 75bfa1c..cdf4922 100644
--- a/hw/misc/vfio.c
+++ b/hw/misc/vfio.c
@@ -40,15 +40,7 @@
 #include "sysemu/kvm.h"
 #include "sysemu/sysemu.h"
 #include "hw/misc/vfio.h"
-
-/* #define DEBUG_VFIO */
-#ifdef DEBUG_VFIO
-#define DPRINTF(fmt, ...) \
-do { fprintf(stderr, "vfio: " fmt, ## __VA_ARGS__); } while (0)
-#else
-#define DPRINTF(fmt, ...) \
-do { } while (0)
-#endif
+#include "trace.h"
 
 /* Extra debugging, trap acceleration paths for more logging */
 #define VFIO_ALLOW_MMAP 1
@@ -365,9 +357,9 @@ static void vfio_intx_interrupt(void *opaque)
 return;
 }
 
-DPRINTF("%s(%04x:%02x:%02x.%x) Pin %c\n", __func__, vdev->host.domain,
-vdev->host.bus, vdev->host.slot, vdev->host.function,
-'A' + vdev->intx.pin);
+trace_vfio_intx_interrupt(vdev->host.domain, vdev->host.bus,
+  vdev->host.slot, vdev->host.function,
+  'A' + vdev->intx.pin);
 
 vdev->intx.pending = true;
 pci_irq_assert(&vdev->pdev);
@@ -384,8 +376,8 @@ static void vfio_eoi(VFIODevice *vdev)
 return;
 }
 
-DPRINTF("%s(%04x:%02x:%02x.%x) EOI\n", __func__, vdev->host.domain,
-vdev->host.bus, vdev->host.slot, vdev->host.function);
+trace_vfio_eoi(vdev->host.domain, vdev->host.bus,
+   vdev->host.slot, vdev->host.function);
 
 vdev->intx.pending = false;
 pci_irq_deassert(&vdev->pdev);
@@ -454,9 +446,8 @@ static void vfio_enable_intx_kvm(VFIODevice *vdev)
 
 vdev->intx.kvm_accel = true;
 
-DPRINTF("%s(%04x:%02x:%02x.%x) KVM INTx accel enabled\n",
-__func__, vdev->host.domain, vdev->host.bus,
-vdev->host.slot, vdev->host.function);
+trace_vfio_enable_intx_kvm(vdev->host.domain, vdev->host.bus,
+   vdev->host.slot, vdev->host.function);
 
 return;
 
@@ -508,9 +499,8 @@ static void vfio_disable_intx_kvm(VFIODevice *vdev)
 /* If we've missed an event, let it re-fire through QEMU */
 vfio_unmask_intx(vdev);
 
-DPRINTF("%s(%04x:%02x:%02x.%x) KVM INTx accel disabled\n",
-__func__, vdev->host.domain, vdev->host.bus,
-vdev->host.slot, vdev->host.function);
+trace_vfio_disable_intx_kvm(vdev->host.domain, vdev->host.bus,
+vdev->host.slot, vdev->host.function);
 #endif
 }
 
@@ -529,9 +519,9 @@ static void vfio_update_irq(PCIDevice *pdev)
 return; /* Nothing changed */
 }
 
-DPRINTF("%s(%04x:%02x:%02x.%x) IRQ moved %d -> %d\n", __func__,
-vdev->host.domain, vdev->host.bus, vdev->host.slot,
-vdev->host.function, vdev->intx.route.irq, route.irq);
+trace_vfio_update_irq(vdev->host.domain, vdev->host.bus,
+  vdev->host.slot, vdev->host.function,
+  vdev->intx.route.irq, route.irq);
 
 vfio_disable_intx_kvm(vdev);
 
@@ -606,8 +596,8 @@ static int vfio_enable_intx(VFIODevice *vdev)
 
 vdev->interrupt = VFIO_INT_INTx;
 
-DPRINTF("%s(%04x:%02x:%02x.%x)\n", __func__, vdev->host.domain,
-vdev->host.bus, vdev->host.slot, vdev->host.function);
+trace_vfio_enable_intx(vdev->host.domain, vdev->host.bus,
+   vdev->host.slot, vdev->host.function);
 
 return 0;
 }
@@ -629,8 +619,8 @@ static void vfio_disable_intx(VFIODevice *vdev)
 
 vdev->interrupt = VFIO_INT_NONE;
 
-DPRINTF("%s(%04x:%02x:%02x.%x)\n", __func__, vdev->host.domain,
-vdev->host.bus, vdev->host.slot, vdev->host.function);
+trace_vfio_disable_intx(vdev->host.domain, vdev->host.bus,
+vdev->host.slot, vdev->host.function);
 }
 
 /*
@@ -657,9 +647,9 @@ static void vfio_msi_inter

[Qemu-devel] [PATCH v7 05/16] hw/vfio/pci: split vfio_get_device

2014-10-31 Thread Eric Auger
vfio_get_device now takes a VFIODevice as argument. The function is split
into 2 parts: vfio_get_device which is generic and vfio_populate_device
which is bus specific.

3 new fields are introduced in VFIODevice to store dev_info.

vfio_put_base_device is created.
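
The shape of the split can be sketched as follows. This is a toy model under stated assumptions, not the patch code: the Toy* types, TOY_FLAG_PCI and TOY_PCI_MIN_REGIONS are hypothetical, and the device info is passed in directly where the real code would obtain it from the VFIO_DEVICE_GET_INFO ioctl.

```c
typedef struct ToyDevice ToyDevice;

typedef struct ToyDeviceOps {
    int (*populate_device)(ToyDevice *dev); /* bus-specific part */
} ToyDeviceOps;

struct ToyDevice {
    const ToyDeviceOps *ops;
    unsigned int num_irqs;    /* cached device info, as in the new fields */
    unsigned int num_regions;
    unsigned int flags;
};

#define TOY_FLAG_PCI        (1u << 1) /* hypothetical flag value */
#define TOY_PCI_MIN_REGIONS 8u        /* hypothetical lower bound */

/* Generic part: cache the device info, then delegate to the bus code. */
int toy_get_device(ToyDevice *dev, unsigned int flags,
                   unsigned int num_regions, unsigned int num_irqs)
{
    dev->flags = flags;
    dev->num_regions = num_regions;
    dev->num_irqs = num_irqs;
    return dev->ops->populate_device(dev);
}

/* Bus-specific part: sanity-check the cached info for a PCI-like device. */
int toy_pci_populate_device(ToyDevice *dev)
{
    if (!(dev->flags & TOY_FLAG_PCI)) {
        return -1;
    }
    if (dev->num_regions < TOY_PCI_MIN_REGIONS) {
        return -1;
    }
    return 0;
}

const ToyDeviceOps toy_pci_ops = {
    .populate_device = toy_pci_populate_device,
};
```

A platform device would supply its own populate_device callback and reuse toy_get_device unchanged.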

---

v5->v6:
- simplifies the split for vfio_get_device:
  vfio_check_device, vfio_populate_regions, vfio_populate_interrupts
  are now gathered into a unique specialization function dubbed
  vfio_populate_device

v4->v5:
- cleanup of error handling and get/put operations in
  vfio_check_device, vfio_populate_regions, vfio_populate_interrupts and
  vfio_get_device.
  - correct misuse of errno
  - vfio_populate_regions always returns 0
  - VFIODevice .name deallocation done in vfio_put_device instead of
vfio_put_base_device
  - vfio_put_base_device done at vfio_get_device level.

Signed-off-by: Eric Auger 
---
 hw/vfio/pci.c | 130 +++---
 trace-events  |  10 ++---
 2 files changed, 83 insertions(+), 57 deletions(-)

diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
index 186dfd0..0ee6f7f 100644
--- a/hw/vfio/pci.c
+++ b/hw/vfio/pci.c
@@ -205,12 +205,16 @@ typedef struct VFIODevice {
 bool reset_works;
 bool needs_reset;
 VFIODeviceOps *ops;
+unsigned int num_irqs;
+unsigned int num_regions;
+unsigned int flags;
 } VFIODevice;
 
 struct VFIODeviceOps {
 bool (*vfio_compute_needs_reset)(VFIODevice *vdev);
 int (*vfio_hot_reset_multi)(VFIODevice *vdev);
 void (*vfio_eoi)(VFIODevice *vdev);
+int (*vfio_populate_device)(VFIODevice *vdev);
 };
 
 typedef struct VFIOPCIDevice {
@@ -297,6 +301,8 @@ static uint32_t vfio_pci_read_config(PCIDevice *pdev, uint32_t addr, int len);
 static void vfio_pci_write_config(PCIDevice *pdev, uint32_t addr,
   uint32_t val, int len);
 static void vfio_mmap_set_enabled(VFIOPCIDevice *vdev, bool enabled);
+static void vfio_put_base_device(VFIODevice *vbasedev);
+static int vfio_populate_device(VFIODevice *vbasedev);
 
 /*
  * Common VFIO interrupt disable
@@ -3611,6 +3617,7 @@ static VFIODeviceOps vfio_pci_ops = {
 .vfio_compute_needs_reset = vfio_pci_compute_needs_reset,
 .vfio_hot_reset_multi = vfio_pci_hot_reset_multi,
 .vfio_eoi = vfio_eoi,
+.vfio_populate_device = vfio_populate_device,
 };
 
 static void vfio_reset_handler(void *opaque)
@@ -3952,70 +3959,45 @@ static void vfio_put_group(VFIOGroup *group)
 }
 }
 
-static int vfio_get_device(VFIOGroup *group, const char *name,
-   VFIOPCIDevice *vdev)
+static int vfio_populate_device(VFIODevice *vbasedev)
 {
-struct vfio_device_info dev_info = { .argsz = sizeof(dev_info) };
+VFIOPCIDevice *vdev = container_of(vbasedev, VFIOPCIDevice, vbasedev);
 struct vfio_region_info reg_info = { .argsz = sizeof(reg_info) };
 struct vfio_irq_info irq_info = { .argsz = sizeof(irq_info) };
-int ret, i;
-
-ret = ioctl(group->fd, VFIO_GROUP_GET_DEVICE_FD, name);
-if (ret < 0) {
-error_report("vfio: error getting device %s from group %d: %m",
- name, group->groupid);
-error_printf("Verify all devices in group %d are bound to vfio-pci "
- "or pci-stub and not already in use\n", group->groupid);
-return ret;
-}
-
-vdev->vbasedev.fd = ret;
-vdev->vbasedev.group = group;
-QLIST_INSERT_HEAD(&group->device_list, &vdev->vbasedev, next);
+int i, ret = -1;
 
 /* Sanity check device */
-ret = ioctl(vdev->vbasedev.fd, VFIO_DEVICE_GET_INFO, &dev_info);
-if (ret) {
-error_report("vfio: error getting device info: %m");
-goto error;
-}
-
-trace_vfio_get_device_irq(name, dev_info.flags,
-  dev_info.num_regions, dev_info.num_irqs);
-
-if (!(dev_info.flags & VFIO_DEVICE_FLAGS_PCI)) {
+if (!(vbasedev->flags & VFIO_DEVICE_FLAGS_PCI)) {
 error_report("vfio: Um, this isn't a PCI device");
 goto error;
 }
 
-vdev->vbasedev.reset_works = !!(dev_info.flags & VFIO_DEVICE_FLAGS_RESET);
-
-if (dev_info.num_regions < VFIO_PCI_CONFIG_REGION_INDEX + 1) {
+if (vbasedev->num_regions < VFIO_PCI_CONFIG_REGION_INDEX + 1) {
 error_report("vfio: unexpected number of io regions %u",
- dev_info.num_regions);
+ vbasedev->num_regions);
 goto error;
 }
 
-if (dev_info.num_irqs < VFIO_PCI_MSIX_IRQ_INDEX + 1) {
-error_report("vfio: unexpected number of irqs %u", dev_info.num_irqs);
+if (vbasedev->num_irqs < VFIO_PCI_MSIX_IRQ_INDEX + 1) {
+error_report("vfio: unexpected number of irqs %u", vbasedev->num_irqs);
 goto error;
 }
 
 for (i = VFIO_PCI_BAR0_REGION_INDEX; i < V

[Qemu-devel] [PATCH v7 09/16] hw/vfio/platform: add vfio-platform support

2014-10-31 Thread Eric Auger
Minimal VFIO platform implementation supporting
- register space user mapping,
- IRQ assignment based on eventfds handled on qemu side.

irqfd kernel acceleration comes in a subsequent patch.
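
For reference, the user-side eventfd handling boils down to the pattern below. It is a Linux-only sketch with hypothetical helper names (toy_irq_eventfd_*); the signal helper stands in for the kernel driver, and the real handler would additionally inject the virtual IRQ.

```c
#include <stdint.h>
#include <sys/eventfd.h>
#include <sys/types.h>
#include <unistd.h>

/* Create the eventfd that would be handed to the VFIO driver as trigger. */
int toy_irq_eventfd_create(void)
{
    return eventfd(0, EFD_NONBLOCK);
}

/* Stand-in for the kernel side: signal one occurrence of the IRQ. */
int toy_irq_eventfd_signal(int fd)
{
    uint64_t one = 1;

    return write(fd, &one, sizeof(one)) == (ssize_t)sizeof(one) ? 0 : -1;
}

/*
 * User-side handler: consume the pending count.
 * Returns the number of signals since the last read, 0 if none.
 */
uint64_t toy_irq_eventfd_consume(int fd)
{
    uint64_t count = 0;

    if (read(fd, &count, sizeof(count)) != (ssize_t)sizeof(count)) {
        return 0;
    }
    return count;
}
```

Every interrupt costs a round trip through user space in this scheme, which is the overhead irqfd is meant to remove.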

Signed-off-by: Kim Phillips 
Signed-off-by: Eric Auger 

---
v6 -> v7:
- compat is not exposed anymore as a user option. Rationale is
  the vfio device became abstract and a specialization is needed
  anyway. The derived device must set the compat string.
- in v6 vfio_start_irq_injection was exposed in vfio-platform.h.
  A new function dubbed vfio_register_irq_starter replaces it. It
  registers a machine init done notifier that programs & starts
  all dynamic VFIO device IRQs. This function is supposed to be
  called by the machine file, before the creation of the platform
  bus device. A set of static helper routines is added too.

v5 -> v6:
- vfio_device property renamed into host property
- correct error handling of VFIO_DEVICE_GET_IRQ_INFO ioctl
  and remove PCI related comment
- remove declaration of vfio_setup_irqfd and irqfd_allowed
  property. Both belong to the next patch (irqfd)
- remove declaration of vfio_intp_interrupt in vfio-platform.h
- functions that can be static are now static
- remove declarations of vfio_region_ops, vfio_memory_listener,
  group_list, vfio_address_spaces. All are moved to vfio-common.h
- remove vfio_put_device declaration and definition
- print_regions removed. code moved into vfio_populate_regions
- replace DPRINTF by trace events
- new helper routine to set the trigger eventfd
- dissociate intp init from the injection enablement:
  vfio_enable_intp renamed into vfio_init_intp and new function
  named vfio_start_eventfd_injection
- injection start moved to vfio_start_irq_injection (not anymore
  in vfio_populate_interrupt)
- new start_irq_fn field in VFIOPlatformDevice corresponding to
  the function that will be used for starting injection
- user handled eventfd:
  x add mutex to protect IRQ state & list manipulation,
  x correct misleading comment in vfio_intp_interrupt.
  x Fix bugs thanks to fake interrupt modality
- VFIOPlatformDeviceClass becomes abstract
- add error_setg in vfio_platform_realize

v4 -> v5:
- vfio-platform.h included first
- cleanup error handling in *populate*, vfio_get_device,
  vfio_enable_intp
- vfio_put_device not called anymore
- add some includes to follow vfio policy

v3 -> v4:
[Eric Auger]
- merge of "vfio: Add initial IRQ support in platform device"
  to get a fully functional patch, although performance is limited.
- removal of unrealize function since I currently understand
  it is only used with the device hot-plug feature.

v2 -> v3:
[Eric Auger]
- further factorization between PCI and platform (VFIORegion,
  VFIODevice). same level of functionality.

<= v2:
[Kim Phillips]
- Initial Creation of the device supporting register space mapping
---
 hw/vfio/Makefile.objs   |   1 +
 hw/vfio/platform.c  | 672 
 include/hw/vfio/vfio-common.h   |   1 +
 include/hw/vfio/vfio-platform.h |  87 ++
 trace-events|  12 +
 5 files changed, 773 insertions(+)
 create mode 100644 hw/vfio/platform.c
 create mode 100644 include/hw/vfio/vfio-platform.h

diff --git a/hw/vfio/Makefile.objs b/hw/vfio/Makefile.objs
index e31f30e..c5c76fe 100644
--- a/hw/vfio/Makefile.objs
+++ b/hw/vfio/Makefile.objs
@@ -1,4 +1,5 @@
 ifeq ($(CONFIG_LINUX), y)
 obj-$(CONFIG_SOFTMMU) += common.o
 obj-$(CONFIG_PCI) += pci.o
+obj-$(CONFIG_SOFTMMU) += platform.o
 endif
diff --git a/hw/vfio/platform.c b/hw/vfio/platform.c
new file mode 100644
index 000..9f66610
--- /dev/null
+++ b/hw/vfio/platform.c
@@ -0,0 +1,672 @@
+/*
+ * vfio based device assignment support - platform devices
+ *
+ * Copyright Linaro Limited, 2014
+ *
+ * Authors:
+ *  Kim Phillips 
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2.  See
+ * the COPYING file in the top-level directory.
+ *
+ * Based on vfio based PCI device assignment support:
+ *  Copyright Red Hat, Inc. 2012
+ */
+
+#include 
+#include 
+
+#include "hw/vfio/vfio-platform.h"
+#include "qemu/error-report.h"
+#include "qemu/range.h"
+#include "sysemu/sysemu.h"
+#include "exec/memory.h"
+#include "qemu/queue.h"
+#include "hw/sysbus.h"
+#include "trace.h"
+#include "hw/platform-bus.h"
+
+static void vfio_intp_interrupt(VFIOINTp *intp);
+typedef void (*eventfd_user_side_handler_t)(VFIOINTp *intp);
+static int vfio_set_trigger_eventfd(VFIOINTp *intp,
+eventfd_user_side_handler_t handler);
+
+/*
+ * Functions only used when eventfd are handled on user-side
+ * ie. without irqfd
+ */
+
+/**
+ * vfio_platform_eoi - IRQ completion routine
+ * @vbasedev: the VFIO device
+ *
+ * de-asserts the active virtual IRQ and unmask the physical IRQ
+ * (masked by the  VFIO driver). Handl

[Qemu-devel] [PATCH v7 04/16] hw/vfio/pci: Introduce VFIORegion

2014-10-31 Thread Eric Auger
This structure is going to be shared by VFIOPCIDevice and
VFIOPlatformDevice. VFIOBAR includes it.

vfio_eoi becomes an ops of VFIODevice specialized by the parent device.
This makes it possible to transform vfio_bar_write/read into generic
vfio_region_write/read that will be used by VFIOPlatformDevice too.

vfio_mmap_bar becomes vfio_mmap_region.
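
The embedding pattern behind this change can be sketched as follows. The Toy* types and the toy_container_of macro are simplified, hypothetical stand-ins for the real structures and for container_of; they only show how a generic callback recovers its specialized device.

```c
#include <stddef.h>

/* Simplified container_of: recover the outer struct from a member pointer. */
#define toy_container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

typedef struct ToyRegion { unsigned int nr; } ToyRegion;

/* The BAR embeds the generic region and adds PCI-only state. */
typedef struct ToyBAR {
    int ioport;
    ToyRegion region;
} ToyBAR;

typedef struct ToyBaseDevice { int fd; } ToyBaseDevice;

/* The PCI device embeds the generic device. */
typedef struct ToyPCIDevice {
    int intx_pending;
    ToyBaseDevice vbasedev;
} ToyPCIDevice;

/* Generic callback signature, PCI-specific body. */
void toy_pci_eoi(ToyBaseDevice *vbasedev)
{
    ToyPCIDevice *vdev = toy_container_of(vbasedev, ToyPCIDevice, vbasedev);

    vdev->intx_pending = 0;
}

/* Region handlers get a ToyRegion and can still reach the BAR if needed. */
ToyBAR *toy_bar_from_region(ToyRegion *region)
{
    return toy_container_of(region, ToyBAR, region);
}
```

Generic region code never needs to know about ToyBAR, while PCI-only code can still get back to it.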

Signed-off-by: Eric Auger 

---

v4->v5:
- remove fd field from VFIORegion
- change error_report format string in vfio_region_write/read
- remove #ifdef DEBUG_VFIO in the same function
- correct missing initialization of bar region's vbasedev field
- change Object * parameter name of vfio_mmap_region and remove
  useless OBJECT()

Conflicts:
hw/vfio/pci.c
---
 hw/vfio/pci.c | 193 ++
 trace-events  |   4 +-
 2 files changed, 103 insertions(+), 94 deletions(-)

diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
index 0531744..186dfd0 100644
--- a/hw/vfio/pci.c
+++ b/hw/vfio/pci.c
@@ -78,15 +78,19 @@ typedef struct VFIOQuirk {
 } data;
 } VFIOQuirk;
 
-typedef struct VFIOBAR {
-off_t fd_offset; /* offset of BAR within device fd */
-int fd; /* device fd, allows us to pass VFIOBAR as opaque data */
+typedef struct VFIORegion {
+struct VFIODevice *vbasedev;
+off_t fd_offset; /* offset of region within device fd */
 MemoryRegion mem; /* slow, read/write access */
 MemoryRegion mmap_mem; /* direct mapped access */
 void *mmap;
 size_t size;
 uint32_t flags; /* VFIO region flags (rd/wr/mmap) */
-uint8_t nr; /* cache the BAR number for debug */
+uint8_t nr; /* cache the region number for debug */
+} VFIORegion;
+
+typedef struct VFIOBAR {
+VFIORegion region;
 bool ioport;
 bool mem64;
 QLIST_HEAD(, VFIOQuirk) quirks;
@@ -206,6 +210,7 @@ typedef struct VFIODevice {
 struct VFIODeviceOps {
 bool (*vfio_compute_needs_reset)(VFIODevice *vdev);
 int (*vfio_hot_reset_multi)(VFIODevice *vdev);
+void (*vfio_eoi)(VFIODevice *vdev);
 };
 
 typedef struct VFIOPCIDevice {
@@ -389,8 +394,10 @@ static void vfio_intx_interrupt(void *opaque)
 }
 }
 
-static void vfio_eoi(VFIOPCIDevice *vdev)
+static void vfio_eoi(VFIODevice *vbasedev)
 {
+VFIOPCIDevice *vdev = container_of(vbasedev, VFIOPCIDevice, vbasedev);
+
 if (!vdev->intx.pending) {
 return;
 }
@@ -400,7 +407,7 @@ static void vfio_eoi(VFIOPCIDevice *vdev)
 
 vdev->intx.pending = false;
 pci_irq_deassert(&vdev->pdev);
-vfio_unmask_irqindex(&vdev->vbasedev, VFIO_PCI_INTX_IRQ_INDEX);
+vfio_unmask_irqindex(vbasedev, VFIO_PCI_INTX_IRQ_INDEX);
 }
 
 static void vfio_enable_intx_kvm(VFIOPCIDevice *vdev)
@@ -553,7 +560,7 @@ static void vfio_update_irq(PCIDevice *pdev)
 vfio_enable_intx_kvm(vdev);
 
 /* Re-enable the interrupt in cased we missed an EOI */
-vfio_eoi(vdev);
+vfio_eoi(&vdev->vbasedev);
 }
 
 static int vfio_enable_intx(VFIOPCIDevice *vdev)
@@ -1090,10 +1097,11 @@ static void vfio_update_msi(VFIOPCIDevice *vdev)
 /*
  * IO Port/MMIO - Beware of the endians, VFIO is always little endian
  */
-static void vfio_bar_write(void *opaque, hwaddr addr,
-   uint64_t data, unsigned size)
+static void vfio_region_write(void *opaque, hwaddr addr,
+  uint64_t data, unsigned size)
 {
-VFIOBAR *bar = opaque;
+VFIORegion *region = opaque;
+VFIODevice *vbasedev = region->vbasedev;
 union {
 uint8_t byte;
 uint16_t word;
@@ -1116,20 +1124,14 @@ static void vfio_bar_write(void *opaque, hwaddr addr,
 break;
 }
 
-if (pwrite(bar->fd, &buf, size, bar->fd_offset + addr) != size) {
-error_report("%s(,0x%"HWADDR_PRIx", 0x%"PRIx64", %d) failed: %m",
- __func__, addr, data, size);
+if (pwrite(vbasedev->fd, &buf, size, region->fd_offset + addr) != size) {
+error_report("%s(%s:region%d+0x%"HWADDR_PRIx", 0x%"PRIx64
+ ",%d) failed: %m",
+ __func__, vbasedev->name, region->nr,
+ addr, data, size);
 }
 
-#ifdef DEBUG_VFIO
-{
-VFIOPCIDevice *vdev = container_of(bar, VFIOPCIDevice, bars[bar->nr]);
-
-trace_vfio_bar_write(vdev->host.domain, vdev->host.bus,
- vdev->host.slot, vdev->host.function,
- region->nr, addr, data, size);
-}
-#endif
+trace_vfio_region_write(vbasedev->name, region->nr, addr, data, size);
 
 /*
  * A read or write to a BAR always signals an INTx EOI.  This will
@@ -1139,13 +1141,14 @@ static void vfio_bar_write(void *opaque, hwaddr addr,
  * which access will service the interrupt, so we're potentially
  * getting quite a few host interrupts per guest interrupt.
  */
-vfio_eoi(container_of(bar, VFIOPCIDevi

[Qemu-devel] [PATCH v7 11/16] hw/arm/virt: add support for VFIO devices

2014-10-31 Thread Eric Auger
VFIO devices are dynamic sysbus devices. They could already be
instantiated. However, for them to be functional, IRQ injection must
be programmed and started. This programming must happen after the
sysbus devices are attached to the platform bus and their IRQs are bound.
Only at that time are the GSIs they are connected to identified, so
that irqfd can be programmed.

Binding happens in a machine init done notifier registered by the
platform bus init. The IRQ start is done in another notifier that
must be registered before the platform bus creation.

This patch adds the registration of the IRQ start notifier in machvirt.
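
The ordering constraint can be modelled with a toy notifier list. The sketch assumes, as I understand the QEMU notifier list to behave, that the most recently registered notifier is called first; all names (ToyNotifierList, bind_devices, start_irqs, ...) are hypothetical.

```c
#include <stddef.h>

#define MAX_NOTIFIERS 8

typedef void (*NotifyFn)(void *opaque);

/* Toy notifier list: the most recently added notifier is called first. */
typedef struct ToyNotifierList {
    NotifyFn fn[MAX_NOTIFIERS];
    size_t count;
} ToyNotifierList;

/* Records the order in which the steps ran. */
typedef struct OrderLog {
    char steps[MAX_NOTIFIERS];
    size_t n;
} OrderLog;

int toy_notifier_add(ToyNotifierList *l, NotifyFn fn)
{
    if (l->count == MAX_NOTIFIERS) {
        return -1;
    }
    l->fn[l->count++] = fn;
    return 0;
}

void toy_notifier_notify(ToyNotifierList *l, void *opaque)
{
    /* Walk from the last registered entry back to the first. */
    for (size_t i = l->count; i > 0; i--) {
        l->fn[i - 1](opaque);
    }
}

static void log_step(OrderLog *log, char step)
{
    if (log->n < MAX_NOTIFIERS) {
        log->steps[log->n++] = step;
    }
}

void bind_devices(void *opaque) { log_step(opaque, 'B'); }
void start_irqs(void *opaque)   { log_step(opaque, 'S'); }

/* Mimics the machine file: IRQ starter first, then the platform bus. */
void toy_machine_init(ToyNotifierList *l)
{
    toy_notifier_add(l, start_irqs);   /* registered first, runs last */
    toy_notifier_add(l, bind_devices); /* registered by the bus, runs first */
}
```

With this registration order the binding step runs before the IRQ start step, which is what the IRQ programming requires.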

Signed-off-by: Eric Auger 

---

The registration of the IRQ start notifier could also happen in
the platform bus.
---
 hw/arm/virt.c | 9 +
 1 file changed, 9 insertions(+)

diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index 3a09d58..911dbfc 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -44,6 +44,7 @@
 #include "qemu/error-report.h"
 #include "hw/arm/sysbus-fdt.h"
 #include "hw/platform-bus.h"
+#include "hw/vfio/vfio-platform.h"
 
 #define NUM_VIRTIO_TRANSPORTS 32
 
@@ -546,6 +547,14 @@ static void create_platform_bus(VirtBoardInfo *vbi, qemu_irq *pic,
 MemoryRegion *sysmem = get_system_memory();
 
 /*
+ * Registers a notifier that starts VFIO IRQ injection. The notifier
+ * must be registered before the platform bus device creation. This
+ * latter registers another notifier that binds the dynamic sysbus
+ * devices to the platform bus.
+ */
+vfio_register_irq_starter(system_params->platform_bus_first_irq);
+
+/*
  * register the notifier that will update the device tree with
  * the platform bus and device tree nodes. Must be done before
  * the instantiation of the platform bus device that registers
-- 
1.8.3.2
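
The ordering constraint in the 11/16 commit message (register the IRQ start
notifier before the platform bus is created, so that it runs after binding)
can be illustrated with a small standalone model. This is only a sketch: it
assumes the notifier list invokes its entries in reverse registration order,
which is what the "must be registered before" requirement implies. The types
and names below (Notifier, NotifierList, Trace, bind_devices, start_irqs) are
hypothetical stand-ins, not QEMU code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified notifier list; not QEMU's implementation. */
typedef struct Notifier Notifier;
struct Notifier {
    void (*notify)(Notifier *n, void *data);
    Notifier *next;
};

typedef struct NotifierList {
    Notifier *head;
} NotifierList;

/* Insert at the head: the most recently registered notifier runs first. */
static void notifier_list_add(NotifierList *list, Notifier *n)
{
    assert(n->notify != NULL);
    n->next = list->head;
    list->head = n;
}

static void notifier_list_notify(NotifierList *list, void *data)
{
    for (Notifier *n = list->head; n != NULL; n = n->next) {
        n->notify(n, data);
    }
}

/* Records the order in which the notifiers fire. */
typedef struct Trace {
    char steps[8];
    int len;
} Trace;

static void record(Trace *t, char step)
{
    assert(t->len < (int)sizeof(t->steps));
    t->steps[t->len++] = step;
}

/* 'B' = bind sysbus devices to the platform bus, 'S' = start IRQ injection */
static void bind_devices(Notifier *n, void *data) { (void)n; record(data, 'B'); }
static void start_irqs(Notifier *n, void *data) { (void)n; record(data, 'S'); }

static void run_machine_init_done(Trace *t)
{
    NotifierList list = { NULL };
    Notifier irq_start = { start_irqs, NULL };
    Notifier bind = { bind_devices, NULL };

    notifier_list_add(&list, &irq_start); /* registered first, runs last */
    notifier_list_add(&list, &bind);      /* registered by "bus creation" */
    notifier_list_notify(&list, t);
}
```

Calling run_machine_init_done() on a zeroed Trace records the binding step
before the IRQ start step, which is the order the patch relies on.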




[Qemu-devel] [PATCH v7 13/16] hw/vfio/platform: Add irqfd support

2014-10-31 Thread Eric Auger
This patch optimizes IRQ handling using the irqfd framework.

Instead of handling the eventfds on the user side, they are handled on
the kernel side using
- the KVM irqfd framework,
- the VFIO driver virqfd framework.

The virtual IRQ completion is trapped at the interrupt controller.
This removes the need for fast/slow path swap.

Overall this brings significant performance improvements.

It depends on host kernel KVM irqfd.

Signed-off-by: Alvise Rigo 
Signed-off-by: Eric Auger 

---
v5 -> v6
- rely on kvm_irqfds_enabled() and kvm_resamplefds_enabled()
- guard KVM code with #ifdef CONFIG_KVM

v3 -> v4:
[Alvise Rigo]
Use of VFIO Platform driver v6 unmask/virqfd feature and removal
of resamplefd handler. Physical IRQ unmasking is now done in
VFIO driver.

v3:
[Eric Auger]
initial support with resamplefd handled on QEMU side since the
unmask was not supported on VFIO platform driver v5.

Conflicts:
hw/vfio/platform.c
---
 hw/vfio/platform.c              | 96 +
 include/hw/vfio/vfio-platform.h |  1 +
 trace-events                    |  2 +
 3 files changed, 99 insertions(+)

diff --git a/hw/vfio/platform.c b/hw/vfio/platform.c
index 9f66610..bdd5c93 100644
--- a/hw/vfio/platform.c
+++ b/hw/vfio/platform.c
@@ -25,6 +25,7 @@
 #include "hw/sysbus.h"
 #include "trace.h"
 #include "hw/platform-bus.h"
+#include "sysemu/kvm.h"
 
 static void vfio_intp_interrupt(VFIOINTp *intp);
 typedef void (*eventfd_user_side_handler_t)(VFIOINTp *intp);
@@ -236,6 +237,83 @@ static int vfio_start_eventfd_injection(VFIOINTp *intp)
 }
 
 /*
+ * Functions used for irqfd
+ */
+
+#ifdef CONFIG_KVM
+
+/**
+ * vfio_set_resample_eventfd - sets the resamplefd for an IRQ
+ * @intp: the IRQ struct pointer
+ * programs the VFIO driver to unmask this IRQ when the
+ * intp->unmask eventfd is triggered
+ */
+static int vfio_set_resample_eventfd(VFIOINTp *intp)
+{
+VFIODevice *vbasedev = &intp->vdev->vbasedev;
+struct vfio_irq_set *irq_set;
+int argsz, ret;
+int32_t *pfd;
+
+argsz = sizeof(*irq_set) + sizeof(*pfd);
+irq_set = g_malloc0(argsz);
+irq_set->argsz = argsz;
+irq_set->flags = VFIO_IRQ_SET_DATA_EVENTFD | VFIO_IRQ_SET_ACTION_UNMASK;
+irq_set->index = intp->pin;
+irq_set->start = 0;
+irq_set->count = 1;
+pfd = (int32_t *)&irq_set->data;
+*pfd = event_notifier_get_fd(&intp->unmask);
+qemu_set_fd_handler(*pfd, NULL, NULL, intp);
+ret = ioctl(vbasedev->fd, VFIO_DEVICE_SET_IRQS, irq_set);
+if (ret < 0) {
+error_report("vfio: Failed to set resample eventfd: %m");
+qemu_set_fd_handler(*pfd, NULL, NULL, NULL);
+}
+g_free(irq_set);
+return ret;
+}
+
+/**
+ * vfio_start_irqfd_injection - starts irqfd injection for an IRQ
+ * programs VFIO driver with both the trigger and resamplefd
+ * programs KVM with the gsi, trigger & resample eventfds
+ */
+static int vfio_start_irqfd_injection(VFIOINTp *intp)
+{
+struct kvm_irqfd irqfd = {
+.fd = event_notifier_get_fd(&intp->interrupt),
+.resamplefd = event_notifier_get_fd(&intp->unmask),
+.gsi = intp->virtualID,
+.flags = KVM_IRQFD_FLAG_RESAMPLE,
+};
+
+if (kvm_vm_ioctl(kvm_state, KVM_IRQFD, &irqfd)) {
+error_report("vfio: Error: Failed to assign the irqfd: %m");
+goto fail_irqfd;
+}
+if (vfio_set_trigger_eventfd(intp, NULL) < 0) {
+goto fail_vfio;
+}
+if (vfio_set_resample_eventfd(intp) < 0) {
+goto fail_vfio;
+}
+
+intp->kvm_accel = true;
+trace_vfio_platform_start_irqfd_injection(intp->pin, intp->virtualID,
+ irqfd.fd, irqfd.resamplefd);
+return 0;
+
+fail_vfio:
+irqfd.flags = KVM_IRQFD_FLAG_DEASSIGN;
+kvm_vm_ioctl(kvm_state, KVM_IRQFD, &irqfd);
+fail_irqfd:
+return -1;
+}
+
+#endif
+
+/*
  * Functions used whatever the injection method
  */
 
@@ -314,6 +392,13 @@ static VFIOINTp *vfio_init_intp(VFIODevice *vbasedev, unsigned int index)
 error_report("vfio: Error: trigger event_notifier_init failed ");
 return NULL;
 }
+/* Get an eventfd for resample/unmask */
+ret = event_notifier_init(&intp->unmask, 0);
+if (ret) {
+g_free(intp);
+error_report("vfio: Error: resample event_notifier_init failed eoi");
+return NULL;
+}
 
 /* store the new intp in qlist */
 QLIST_INSERT_HEAD(&vdev->intp_list, intp, next);
@@ -542,7 +627,17 @@ static void vfio_platform_realize(DeviceState *dev, Error **errp)
 
 vbasedev->type = VFIO_DEVICE_TYPE_PLATFORM;
 vbasedev->ops = &vfio_platform_ops;
+
+#ifdef CONFIG_KVM
+if (kvm_irqfds_enabled() && kvm_resamplefds_enabled() &&
+vdev->irqfd_allowed) {
+vdev->start_irq_fn = vfio_sta

[Qemu-devel] [PATCH v4 0/6] machvirt dynamic sysbus device instantiation

2014-10-31 Thread Eric Auger
This patch series enables machvirt to dynamically instantiate sysbus
devices from the command line (using the -device option).

All those sysbus devices are plugged onto a platform bus. The platform
bus device is instantiated in machvirt and takes care of binding its
child sysbus devices in a machine init done notifier. The device
tree node generation for the child dynamic sysbus devices also happens
in a notifier, which must be executed after the binding one.
machvirt registers that notifier before the platform bus creation to
make sure the notifiers are executed in the right order: dt generation
after actual QOM binding.

Very few sysbus devices are supposed to be instantiated that way.
VFIO devices belong to them.

Node creation really is architecture specific. On ARM the dynamic
sysbus device node creation is implemented in a new C module,
hw/arm/sysbus-fdt.c and not in the machine file.

This series applies on top of Alex Graf's series
[PATCH v3 0/7] Dynamic sysbus device allocation support
http://lists.nongnu.org/archive/html/qemu-devel/2014-09/msg04860.html

Machvirt transformations and sysbus-fdt are largely inspired by Alex's work.

The patch series can be found at:
http://git.linaro.org/people/eric.auger/qemu.git (branch vfio_integ_v7)

Best Regards

Eric

v3 -> v4:
- dyn_sysbus_binding removed since binding is now implemented by
  the platform bus device
- due to a change in the ARM load_dtb implementation using rom_add_blob_fixed,
  the dt is no longer generated in a reset notifier but in a
  machine init done notifier
- the augmented device tree is not generated from scratch anymore but is
  added using a modify_dtb function. This required some small changes in
  boot.c
- the case where the user provides a dtb file is now handled
- some cleanup in virt additions
- implement a list of dynamically instantiable devices in sysbus-fdt

v2 -> v3:
- patch now applies on top of Alex full patchset
- dyn_sysbus_devtree: add arm_prefix to emphasize the fact those
  functions are arm specific; arm_sysbus_device_create_devtree
  becomes static
- load_dtb renamed into arm_load_dtb
- add copyright in hw/arm/dyn_sysbus_devtree.c

v1 -> v2:
- device node generation no longer in sysbus device but in
  dyn_sysbus_devtree
- VFIO region shrunk to 4MB and relocated in machvirt to avoid PCI
  shrink (dynamic vfio-mmio support might come later)
- platform_bus_base removed from PlatformDevtreeData

Eric Auger (6):
  hw/arm/boot: load_dtb becomes non static arm_load_dtb
  hw/arm/boot: dtb start and limit moved in arm_boot_info
  hw/arm/boot: do not free VirtBoardInfo fdt in arm_load_dtb
  hw/arm: add a new modify_dtb_opaque field in arm_boot_info
  hw/arm/sysbus-fdt: helpers for platform bus nodes addition
  hw/arm/virt: add dynamic sysbus device support

 hw/arm/Makefile.objs        |   1 +
 hw/arm/boot.c               |  48 +++-
 hw/arm/sysbus-fdt.c         | 181 
 hw/arm/virt.c               |  59 +++
 include/hw/arm/arm.h        |   7 ++
 include/hw/arm/sysbus-fdt.h |  50 
 6 files changed, 326 insertions(+), 20 deletions(-)
 create mode 100644 hw/arm/sysbus-fdt.c
 create mode 100644 include/hw/arm/sysbus-fdt.h

-- 
1.8.3.2




[Qemu-devel] [PATCH v7 16/16] hw/vfio/platform: add forwarded irq support

2014-10-31 Thread Eric Auger
Tests whether the forwarded IRQ modality is available.
If so, device IRQs are forwarded. This control is
achieved with the KVM-VFIO device. With such a modality, injection
is still handled through irqfds. However, end of interrupt is
not trapped anymore. As soon as the guest completes its virtual
IRQ, the corresponding physical IRQ is completed and the same
physical IRQ can hit again.

A new x-forward property makes it possible to force forwarding off
even when it is supported by the kernel.

Signed-off-by: Eric Auger 
---
 hw/vfio/platform.c              | 52 +
 include/hw/vfio/vfio-platform.h |  2 ++
 trace-events                    |  1 +
 3 files changed, 55 insertions(+)

diff --git a/hw/vfio/platform.c b/hw/vfio/platform.c
index bdd5c93..f7ed209 100644
--- a/hw/vfio/platform.c
+++ b/hw/vfio/platform.c
@@ -237,6 +237,52 @@ static int vfio_start_eventfd_injection(VFIOINTp *intp)
 }
 
 /*
+ * Functions used with forwarding capability
+ */
+
+#ifdef CONFIG_KVM
+
+static bool has_kvm_vfio_forward_capability(void)
+{
+struct kvm_device_attr attr = {
+ .group = KVM_DEV_VFIO_DEVICE,
+ .attr = KVM_DEV_VFIO_DEVICE_FORWARD_IRQ};
+
+if (ioctl(vfio_kvm_device_fd, KVM_HAS_DEVICE_ATTR, &attr) == 0) {
+return true;
+} else {
+return false;
+}
+}
+
+static int vfio_set_forwarding(VFIOINTp *intp)
+{
+int ret = 0;
+struct kvm_device_attr attr = {
+ .group = KVM_DEV_VFIO_DEVICE,
+ .attr = KVM_DEV_VFIO_DEVICE_FORWARD_IRQ};
+
+intp->fwd_irq = g_malloc0(sizeof(*intp->fwd_irq));
+intp->fwd_irq->fd = intp->vdev->vbasedev.fd;
+intp->fwd_irq->index = intp->pin;
+intp->fwd_irq->gsi = intp->virtualID;
+
+attr.addr = (uint64_t)(unsigned long)intp->fwd_irq;
+
+if (ioctl(vfio_kvm_device_fd, KVM_SET_DEVICE_ATTR, &attr)) {
+error_report("Failed to forward IRQ %d through KVM VFIO device",
+ intp->pin);
+g_free(intp->fwd_irq);
+return -errno;
+}
+trace_vfio_start_fwd_injection(intp->pin);
+
+return ret;
+}
+
+#endif
+
+/*
  * Functions used for irqfd
  */
 
@@ -288,6 +334,11 @@ static int vfio_start_irqfd_injection(VFIOINTp *intp)
 .flags = KVM_IRQFD_FLAG_RESAMPLE,
 };
 
+if (has_kvm_vfio_forward_capability() &&
+ intp->vdev->forward_allowed) {
+vfio_set_forwarding(intp);
+}
+
 if (kvm_vm_ioctl(kvm_state, KVM_IRQFD, &irqfd)) {
 error_report("vfio: Error: Failed to assign the irqfd: %m");
 goto fail_irqfd;
@@ -737,6 +788,7 @@ static Property vfio_platform_dev_properties[] = {
 DEFINE_PROP_UINT32("mmap-timeout-ms", VFIOPlatformDevice,
mmap_timeout, 1100),
 DEFINE_PROP_BOOL("x-irqfd", VFIOPlatformDevice, irqfd_allowed, true),
+DEFINE_PROP_BOOL("x-forward", VFIOPlatformDevice, forward_allowed, true),
 DEFINE_PROP_END_OF_LIST(),
 };
 
diff --git a/include/hw/vfio/vfio-platform.h b/include/hw/vfio/vfio-platform.h
index 26ddba7..d22eb0e 100644
--- a/include/hw/vfio/vfio-platform.h
+++ b/include/hw/vfio/vfio-platform.h
@@ -42,6 +42,7 @@ typedef struct VFIOINTp {
 bool kvm_accel; /* set when QEMU bypass through KVM enabled */
 uint8_t pin; /* index */
 uint8_t virtualID; /* virtual IRQ */
+struct kvm_arch_forwarded_irq *fwd_irq;
 } VFIOINTp;
 
 typedef int (*start_irq_fn_t)(VFIOINTp *intp);
@@ -59,6 +60,7 @@ typedef struct VFIOPlatformDevice {
 start_irq_fn_t start_irq_fn;
 QemuMutex  intp_mutex;
 bool irqfd_allowed; /* debug option to force irqfd on/off */
+bool forward_allowed; /* debug option to force forwarding on/off */
 } VFIOPlatformDevice;
 
 
diff --git a/trace-events b/trace-events
index a05ed80..df3b71b 100644
--- a/trace-events
+++ b/trace-events
@@ -1429,6 +1429,7 @@ vfio_get_device(const char * name, unsigned int flags, unsigned int num_regions,
 vfio_put_base_device(int fd) "close vdev->fd=%d"
 
 # hw/vfio/platform.c
+vfio_start_fwd_injection(int pin) "forwarding set for IRQ pin %d"
 vfio_platform_eoi(int pin, int fd) "EOI IRQ pin %d (fd=%d)"
 vfio_platform_mmap_set_enabled(bool enabled) "fast path = %d"
 vfio_platform_intp_mmap_enable(int pin) "IRQ #%d still active, stay in slow path"
-- 
1.8.3.2
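
Taken together, patches 13/16 and 16/16 amount to a small decision about which
injection method a given IRQ ends up using. The sketch below restates that
decision as a pure function. The struct and enum names are hypothetical; the
conditions mirror the checks visible in the diffs (kvm_irqfds_enabled(),
kvm_resamplefds_enabled(), irqfd_allowed, the KVM-VFIO forward capability and
forward_allowed). Falling back to user-level eventfd handling when irqfd is
not usable is an assumption, since the corresponding hunk is truncated in this
archive.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names; only the conditions come from the patches. */
typedef enum {
    INJECT_USER_EVENTFD, /* eventfd handled in QEMU (assumed fallback) */
    INJECT_IRQFD,        /* KVM irqfd + VFIO virqfd */
    INJECT_FORWARDED     /* irqfd injection, guest EOI not trapped */
} InjectMode;

typedef struct HostCaps {
    bool kvm_irqfds;      /* models kvm_irqfds_enabled() */
    bool kvm_resamplefds; /* models kvm_resamplefds_enabled() */
    bool kvm_forward_cap; /* models has_kvm_vfio_forward_capability() */
} HostCaps;

typedef struct DeviceProps {
    bool irqfd_allowed;   /* "x-irqfd" property */
    bool forward_allowed; /* "x-forward" property */
} DeviceProps;

static InjectMode choose_inject_mode(HostCaps host, DeviceProps props)
{
    /* irqfd needs both KVM features and must not be disabled by the user */
    if (!(host.kvm_irqfds && host.kvm_resamplefds && props.irqfd_allowed)) {
        return INJECT_USER_EVENTFD;
    }
    /* forwarding is only attempted on top of the irqfd path */
    if (host.kvm_forward_cap && props.forward_allowed) {
        return INJECT_FORWARDED;
    }
    return INJECT_IRQFD;
}

/* Small usage example */
static void demo_choose_inject_mode(void)
{
    HostCaps full = { true, true, true };
    DeviceProps defaults = { true, true };
    DeviceProps no_forward = { true, false };

    assert(choose_inject_mode(full, defaults) == INJECT_FORWARDED);
    assert(choose_inject_mode(full, no_forward) == INJECT_IRQFD);
}
```

Note that in this model x-forward=off only has an effect when irqfd is in use,
because the forwarding test sits inside vfio_start_irqfd_injection().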




[Qemu-devel] [PATCH v7 14/16] linux-headers: Update KVM headers from linux-next tag ToBeFilled

2014-10-31 Thread Eric Auger
Sync up KVM-related linux headers from the linux-next tree using
scripts/update-linux-headers.sh.

Integrate the updated KVM-VFIO API related to forwarded IRQs.

Signed-off-by: Eric Auger 
---
 linux-headers/linux/kvm.h | 9 +
 1 file changed, 9 insertions(+)

diff --git a/linux-headers/linux/kvm.h b/linux-headers/linux/kvm.h
index 2669938..239b380 100644
--- a/linux-headers/linux/kvm.h
+++ b/linux-headers/linux/kvm.h
@@ -947,6 +947,12 @@ struct kvm_device_attr {
__u64   addr;   /* userspace address of attr data */
 };
 
+struct kvm_arch_forwarded_irq {
+__u32 fd; /* file descriptor of the VFIO device */
+__u32 index; /* VFIO device IRQ index */
+__u32 gsi; /* gsi, i.e. virtual IRQ number */
+};
+
 #define KVM_DEV_TYPE_FSL_MPIC_20   1
 #define KVM_DEV_TYPE_FSL_MPIC_42   2
 #define KVM_DEV_TYPE_XICS  3
@@ -954,6 +960,9 @@ struct kvm_device_attr {
 #define  KVM_DEV_VFIO_GROUP                  1
 #define   KVM_DEV_VFIO_GROUP_ADD             1
 #define   KVM_DEV_VFIO_GROUP_DEL             2
+#define  KVM_DEV_VFIO_DEVICE                 2
+#define   KVM_DEV_VFIO_DEVICE_FORWARD_IRQ    1
+#define   KVM_DEV_VFIO_DEVICE_UNFORWARD_IRQ  2
 #define KVM_DEV_TYPE_ARM_VGIC_V2   5
 #define KVM_DEV_TYPE_FLIC  6
 
-- 
1.8.3.2
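
The new struct is handed to the kernel by address through the addr field of
struct kvm_device_attr, as vfio_set_forwarding() does in patch 16/16. The
following sketch shows only that pointer-to-u64 round trip. To stay
self-contained it uses local mirror structs (fwd_irq, dev_attr) instead of the
real kernel headers; those mirror names and the two helpers are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Local mirror of struct kvm_arch_forwarded_irq from the patch above. */
struct fwd_irq {
    uint32_t fd;    /* file descriptor of the VFIO device */
    uint32_t index; /* VFIO device IRQ index */
    uint32_t gsi;   /* virtual IRQ number */
};

/* Local mirror of struct kvm_device_attr. */
struct dev_attr {
    uint32_t flags;
    uint32_t group; /* e.g. KVM_DEV_VFIO_DEVICE */
    uint64_t attr;  /* e.g. KVM_DEV_VFIO_DEVICE_FORWARD_IRQ */
    uint64_t addr;  /* userspace address of the attribute data */
};

/* Store a userspace pointer in the 64-bit addr field, 32/64-bit safe. */
static void attr_set_payload(struct dev_attr *a, const struct fwd_irq *p)
{
    a->addr = (uint64_t)(uintptr_t)p;
}

/* Recover the pointer, as the receiving side conceptually does. */
static const struct fwd_irq *attr_get_payload(const struct dev_attr *a)
{
    return (const struct fwd_irq *)(uintptr_t)a->addr;
}

/* Small usage example */
static void demo_attr_payload(void)
{
    struct fwd_irq irq = { 7, 0, 42 };
    struct dev_attr attr = { 0, 2, 1, 0 };

    attr_set_payload(&attr, &irq);
    assert(attr_get_payload(&attr) == &irq);
    assert(attr_get_payload(&attr)->gsi == 42);
}
```

Going through uintptr_t avoids sign or width surprises when a 32-bit userspace
pointer is widened to the 64-bit field.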




[Qemu-devel] [PATCH v7 08/16] hw/vfio: create common module

2014-10-31 Thread Eric Auger
A new common module is created. It implements all functions
that are not device specific (PCI, Platform).

This patch consists only of code movement (no functional changes).

Signed-off-by: Kim Phillips 
Signed-off-by: Eric Auger 

---
v6 -> v7:
- integrate Revert "vfio: Make BARs native endian"
- remove VFIO_DEVICE_TYPE_PLATFORM in vfio-common.h,
  will come in next patch

v5 -> v6:
- follow all evolutions of original PCI code from v5 to V6
- move declaration of vfio_region_ops, vfio_memory_listener,
  vfio_group_list, vfio_address_spaces into vfio-common.h

v4 -> v5:
- integrate "sPAPR/IOMMU: Fix TCE entry permission"
- VFIOdevice .name dealloc removed from vfio_put_base_device
- add some includes according to vfio inclusion policy

v3 -> v4:
[Eric Auger]
move done after all PCI modifications to anticipate for
VFIO Platform needs. Purpose is to alleviate the whole
review process.

<= v3
First split done by Kim Phillips

Conflicts:
hw/vfio/pci.c
---
 hw/vfio/Makefile.objs         |    1 +
 hw/vfio/common.c              |  958 ++
 hw/vfio/pci.c                 | 1028 +
 include/hw/vfio/vfio-common.h |  151 ++
 trace-events                  |    1 +
 5 files changed, 1112 insertions(+), 1027 deletions(-)
 create mode 100644 hw/vfio/common.c
 create mode 100644 include/hw/vfio/vfio-common.h

diff --git a/hw/vfio/Makefile.objs b/hw/vfio/Makefile.objs
index 31c7dab..e31f30e 100644
--- a/hw/vfio/Makefile.objs
+++ b/hw/vfio/Makefile.objs
@@ -1,3 +1,4 @@
 ifeq ($(CONFIG_LINUX), y)
+obj-$(CONFIG_SOFTMMU) += common.o
 obj-$(CONFIG_PCI) += pci.o
 endif
diff --git a/hw/vfio/common.c b/hw/vfio/common.c
new file mode 100644
index 000..fbd9e7f
--- /dev/null
+++ b/hw/vfio/common.c
@@ -0,0 +1,958 @@
+/*
+ * generic functions used by VFIO devices
+ *
+ * Copyright Red Hat, Inc. 2012
+ *
+ * Authors:
+ *  Alex Williamson 
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2.  See
+ * the COPYING file in the top-level directory.
+ *
+ * Based on qemu-kvm device-assignment:
+ *  Adapted for KVM by Qumranet.
+ *  Copyright (c) 2007, Neocleus, Alex Novik (a...@neocleus.com)
+ *  Copyright (c) 2007, Neocleus, Guy Zana (g...@neocleus.com)
+ *  Copyright (C) 2008, Qumranet, Amit Shah (amit.s...@qumranet.com)
+ *  Copyright (C) 2008, Red Hat, Amit Shah (amit.s...@redhat.com)
+ *  Copyright (C) 2008, IBM, Muli Ben-Yehuda (m...@il.ibm.com)
+ */
+
+#include 
+#include 
+#include 
+
+#include "hw/vfio/vfio-common.h"
+#include "hw/vfio/vfio.h"
+#include "exec/address-spaces.h"
+#include "exec/memory.h"
+#include "hw/hw.h"
+#include "qemu/error-report.h"
+#include "sysemu/kvm.h"
+#include "trace.h"
+
+struct vfio_group_head vfio_group_list =
+QLIST_HEAD_INITIALIZER(vfio_group_list);
+struct vfio_as_head vfio_address_spaces =
+QLIST_HEAD_INITIALIZER(vfio_address_spaces);
+
+#ifdef CONFIG_KVM
+/*
+ * We have a single VFIO pseudo device per KVM VM.  Once created it lives
+ * for the life of the VM.  Closing the file descriptor only drops our
+ * reference to it and the device's reference to kvm.  Therefore once
+ * initialized, this file descriptor is only released on QEMU exit and
+ * we'll re-use it should another vfio device be attached before then.
+ */
+static int vfio_kvm_device_fd = -1;
+#endif
+
+/*
+ * Common VFIO interrupt disable
+ */
+void vfio_disable_irqindex(VFIODevice *vbasedev, int index)
+{
+struct vfio_irq_set irq_set = {
+.argsz = sizeof(irq_set),
+.flags = VFIO_IRQ_SET_DATA_NONE | VFIO_IRQ_SET_ACTION_TRIGGER,
+.index = index,
+.start = 0,
+.count = 0,
+};
+
+ioctl(vbasedev->fd, VFIO_DEVICE_SET_IRQS, &irq_set);
+}
+
+void vfio_unmask_irqindex(VFIODevice *vbasedev, int index)
+{
+struct vfio_irq_set irq_set = {
+.argsz = sizeof(irq_set),
+.flags = VFIO_IRQ_SET_DATA_NONE | VFIO_IRQ_SET_ACTION_UNMASK,
+.index = index,
+.start = 0,
+.count = 1,
+};
+
+ioctl(vbasedev->fd, VFIO_DEVICE_SET_IRQS, &irq_set);
+}
+
+void vfio_mask_irqindex(VFIODevice *vbasedev, int index)
+{
+struct vfio_irq_set irq_set = {
+.argsz = sizeof(irq_set),
+.flags = VFIO_IRQ_SET_DATA_NONE | VFIO_IRQ_SET_ACTION_MASK,
+.index = index,
+.start = 0,
+.count = 1,
+};
+
+ioctl(vbasedev->fd, VFIO_DEVICE_SET_IRQS, &irq_set);
+}
+
+/*
+ * IO Port/MMIO - Beware of the endians, VFIO is always little endian
+ */
+void vfio_region_write(void *opaque, hwaddr addr,
+   uint64_t data, unsigned size)
+{
+VFIORegion *region = opaque;
+VFIODevice *vbasedev = region->vbasedev;
+union {
+uint8_t byte;
+uint16_t word;
+uint32_t dword;
+uint64_t qword;
+} buf;
+
+switch (size) {
+
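
The vfio_region_write() hunk is cut off in this archive at the switch on the
access size. Given the comment that VFIO is always little endian, the idea of
storing a value of 1, 2, 4 or 8 bytes in little-endian byte order can be
sketched independently of QEMU as follows. store_le() is a hypothetical
helper, not a function from the patch, and it does not claim to reproduce the
missing switch body.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Write the low `size` bytes of `data` into `out` in little-endian order.
 * Returns the number of bytes written, or 0 for an unsupported size.
 */
static size_t store_le(uint8_t *out, uint64_t data, unsigned size)
{
    assert(out != NULL);

    switch (size) {
    case 1:
    case 2:
    case 4:
    case 8:
        break;
    default:
        return 0; /* only power-of-two access sizes up to 8 are handled */
    }

    for (unsigned i = 0; i < size; i++) {
        out[i] = (uint8_t)(data >> (8 * i)); /* least significant byte first */
    }
    return size;
}

/* Small usage example */
static void demo_store_le(void)
{
    uint8_t buf[8] = { 0 };

    assert(store_le(buf, 0x11223344u, 4) == 4);
    assert(buf[0] == 0x44 && buf[1] == 0x33 && buf[2] == 0x22 && buf[3] == 0x11);
}
```

Writing byte by byte keeps the result independent of the host byte order.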

[Qemu-devel] [PATCH v4 5/6] hw/arm/sysbus-fdt: helpers for platform bus nodes addition

2014-10-31 Thread Eric Auger
This new C module will be used by ARM machine files to generate
the platform bus node and its dynamic sysbus device tree nodes.

Dynamic sysbus device node addition is done in a machine init
done notifier. arm_register_platform_bus_fdt_creator registers
this notifier and is supposed to be called by
ARM machine files that support a platform bus and its dynamic
sysbus devices. Addition of dynamic sysbus nodes is done only if the
user did not provide any dtb.

Signed-off-by: Alexander Graf 
Signed-off-by: Eric Auger 

---

v3 -> v4:
- dyn_sysbus_devtree.c renamed into sysbus-fdt.c
- use new PlatformBusDevice object
- the dtb upgrade is done through modify_dtb. Before the fdt
  was recreated from scratch. When the user provided a dtb this
  latter was overwritten which was not correct.
- an array contains the association between device type names
  and their node creation function
- I must acknowledge I did not find any cleaner way to implement
  a FDT_BUILDER interface, as suggested by Paolo. The class method
  would need to be initialized somewhere and since it cannot
  happen in the device itself - according to Alex & Peter comments -,
  I don't see when I shall associate the device type and its
  interface implementation.

v2 -> v3:
- add arm_ prefix
- arm_sysbus_device_create_devtree becomes static

v1 -> v2:
- Code moved into an arch-specific file to accommodate
  architecture-dependent specificities.
- remove platform_bus_base from PlatformDevtreeData

v1: code originally written by Alex Graf in e500.c and reused for
ARM [Eric Auger]
---
 hw/arm/Makefile.objs        |   1 +
 hw/arm/sysbus-fdt.c         | 181 
 include/hw/arm/sysbus-fdt.h |  50 
 3 files changed, 232 insertions(+)
 create mode 100644 hw/arm/sysbus-fdt.c
 create mode 100644 include/hw/arm/sysbus-fdt.h

diff --git a/hw/arm/Makefile.objs b/hw/arm/Makefile.objs
index 6088e53..0cc63e1 100644
--- a/hw/arm/Makefile.objs
+++ b/hw/arm/Makefile.objs
@@ -3,6 +3,7 @@ obj-$(CONFIG_DIGIC) += digic_boards.o
 obj-y += integratorcp.o kzm.o mainstone.o musicpal.o nseries.o
 obj-y += omap_sx1.o palm.o realview.o spitz.o stellaris.o
 obj-y += tosa.o versatilepb.o vexpress.o virt.o xilinx_zynq.o z2.o
+obj-y += sysbus-fdt.o
 
 obj-y += armv7m.o exynos4210.o pxa2xx.o pxa2xx_gpio.o pxa2xx_pic.o
 obj-$(CONFIG_DIGIC) += digic.o
diff --git a/hw/arm/sysbus-fdt.c b/hw/arm/sysbus-fdt.c
new file mode 100644
index 000..d5476f1
--- /dev/null
+++ b/hw/arm/sysbus-fdt.c
@@ -0,0 +1,181 @@
+/*
+ * ARM Platform Bus device tree generation helpers
+ *
+ * Copyright (c) 2014 Linaro Limited
+ *
+ * Authors:
+ *  Alex Graf 
+ *  Eric Auger 
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2 or later, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ * You should have received a copy of the GNU General Public License along with
+ * this program.  If not, see <http://www.gnu.org/licenses/>.
+ *
+ */
+
+#include "hw/arm/sysbus-fdt.h"
+#include "qemu/error-report.h"
+#include "sysemu/device_tree.h"
+#include "hw/platform-bus.h"
+#include "sysemu/sysemu.h"
+#include "hw/platform-bus.h"
+
+/*
+ * internal struct that contains the information to create dynamic
+ * sysbus device node
+ */
+typedef struct PlatformBusFdtData {
+void *fdt; /* device tree handle */
+int irq_start; /* index of the first IRQ usable by platform bus devices */
+const char *pbus_node_name; /* name of the platform bus node */
+PlatformBusDevice *pbus;
+} PlatformBusFdtData;
+
+/*
+ * struct used when calling the machine init done notifier
+ * that constructs the fdt nodes of platform bus devices
+ */
+typedef struct PlatformBusFdtNotifierParams {
+ARMPlatformBusFdtParams *fdt_params;
+Notifier notifier;
+} PlatformBusFdtNotifierParams;
+
+/* struct that associates a device type name and a node creation function */
+typedef struct NodeCreationPair {
+const char *typename;
+int (*add_fdt_node_fn)(SysBusDevice *sbdev, void *opaque);
+} NodeCreationPair;
+
+/* list of supported dynamic sysbus devices */
+NodeCreationPair add_fdt_node_functions[] = {
+{"", NULL}, /*last element*/
+};
+
+/**
+ * add_fdt_node - add the device tree node of a dynamic sysbus device
+ *
+ * @sbdev: handle to the sysbus device
+ * @opaque: handle to the PlatformBusFdtData
+ *
+ * Checks whether the sysbus device type belongs to the list of device
+ * types that are dynamically instantiable and, if so, calls the node
+ * creation function.
+ */
+static int add_fdt_node(SysBusDevice *sbdev, void *opaque

[Qemu-devel] [PATCH v4 1/6] hw/arm/boot: load_dtb becomes non static arm_load_dtb

2014-10-31 Thread Eric Auger
load_dtb is renamed to arm_load_dtb and becomes non static.
It will be used by machvirt for dynamic instantiation of
platform devices.

Signed-off-by: Eric Auger 

---

v2 -> v3:
load_dtb renamed into arm_load_dtb

Conflicts:
hw/arm/boot.c
---
 hw/arm/boot.c        | 12 ++--
 include/hw/arm/arm.h |  2 ++
 2 files changed, 8 insertions(+), 6 deletions(-)

diff --git a/hw/arm/boot.c b/hw/arm/boot.c
index bffbea5..f5714ea 100644
--- a/hw/arm/boot.c
+++ b/hw/arm/boot.c
@@ -313,7 +313,7 @@ static void set_kernel_args_old(const struct arm_boot_info *info)
 }
 
 /**
- * load_dtb() - load a device tree binary image into memory
+ * arm_load_dtb() - load a device tree binary image into memory
  * @addr:   the address to load the image at
  * @binfo:  struct describing the boot environment
  * @addr_limit: upper limit of the available memory area at @addr
@@ -330,8 +330,8 @@ static void set_kernel_args_old(const struct arm_boot_info *info)
  *  0 if the image size exceeds the limit,
  *  -1 on errors.
  */
-static int load_dtb(hwaddr addr, const struct arm_boot_info *binfo,
-hwaddr addr_limit)
+int arm_load_dtb(hwaddr addr, const struct arm_boot_info *binfo,
+ hwaddr addr_limit)
 {
 void *fdt = NULL;
 int size, rc;
@@ -504,7 +504,7 @@ void arm_load_kernel(ARMCPU *cpu, struct arm_boot_info *info)
 /* If we have a device tree blob, but no kernel to supply it to,
  * copy it to the base of RAM for a bootloader to pick up.
  */
-if (load_dtb(info->loader_start, info, 0) < 0) {
+if (arm_load_dtb(info->loader_start, info, 0) < 0) {
 exit(1);
 }
 }
@@ -572,7 +572,7 @@ void arm_load_kernel(ARMCPU *cpu, struct arm_boot_info *info)
 if (elf_low_addr < info->loader_start) {
 elf_low_addr = 0;
 }
-if (load_dtb(info->loader_start, info, elf_low_addr) < 0) {
+if (arm_load_dtb(info->loader_start, info, elf_low_addr) < 0) {
 exit(1);
 }
 }
@@ -637,7 +637,7 @@ void arm_load_kernel(ARMCPU *cpu, struct arm_boot_info *info)
  */
 hwaddr dtb_start = QEMU_ALIGN_UP(info->initrd_start + initrd_size,
  4096);
-if (load_dtb(dtb_start, info, 0) < 0) {
+if (arm_load_dtb(dtb_start, info, 0) < 0) {
 exit(1);
 }
 fixupcontext[FIXUP_ARGPTR] = dtb_start;
diff --git a/include/hw/arm/arm.h b/include/hw/arm/arm.h
index cefc9e6..5fdae7b 100644
--- a/include/hw/arm/arm.h
+++ b/include/hw/arm/arm.h
@@ -68,6 +68,8 @@ struct arm_boot_info {
 hwaddr entry;
 };
 void arm_load_kernel(ARMCPU *cpu, struct arm_boot_info *info);
+int arm_load_dtb(hwaddr addr, const struct arm_boot_info *binfo,
+ hwaddr addr_limit);
 
 /* Multiplication factor to convert from system clock ticks to qemu timer
ticks.  */
-- 
1.8.3.2




[Qemu-devel] [PATCH v7 15/16] hw/vfio/common: vfio_kvm_device_fd moved in the common header

2014-10-31 Thread Eric Auger
The KVM VFIO device fd is now also used by the platform code for forwarded
IRQ setup.

Signed-off-by: Eric Auger 
---
 hw/vfio/common.c  | 3 ++-
 include/hw/vfio/vfio-common.h | 5 +
 2 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index fbd9e7f..99ff89a 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -44,9 +44,10 @@ struct vfio_as_head vfio_address_spaces =
  * initialized, this file descriptor is only released on QEMU exit and
  * we'll re-use it should another vfio device be attached before then.
  */
-static int vfio_kvm_device_fd = -1;
+int vfio_kvm_device_fd = -1;
 #endif
 
+
 /*
  * Common VFIO interrupt disable
  */
diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h
index 83c7876..0ae0153 100644
--- a/include/hw/vfio/vfio-common.h
+++ b/include/hw/vfio/vfio-common.h
@@ -41,6 +41,11 @@
 #define VFIO_ALLOW_KVM_MSI 1
 #define VFIO_ALLOW_KVM_MSIX 1
 
+#ifdef CONFIG_KVM
+extern int vfio_kvm_device_fd;
+#endif
+
+
 enum {
 VFIO_DEVICE_TYPE_PCI = 0,
 VFIO_DEVICE_TYPE_PLATFORM = 1,
-- 
1.8.3.2




[Qemu-devel] [PATCH v7 10/16] hw/vfio: calxeda xgmac device

2014-10-31 Thread Eric Auger
The platform device class has become abstract. The device can be
instantiated on the command line using the following option:

-device vfio-calxeda-xgmac,host="fff51000.ethernet"

Signed-off-by: Eric Auger 

---

v5 -> v6
- restored, following Alex Graf's advice
- fix a bug related to compat override

v4 -> v5:
removed since device tree was moved to hw/arm/dyn_sysbus_devtree.c

v4: creation for device tree specialization
---
 hw/vfio/Makefile.objs                |  1 +
 hw/vfio/calxeda_xgmac.c              | 54 
 include/hw/vfio/vfio-calxeda-xgmac.h | 41 +++
 3 files changed, 96 insertions(+)
 create mode 100644 hw/vfio/calxeda_xgmac.c
 create mode 100644 include/hw/vfio/vfio-calxeda-xgmac.h

diff --git a/hw/vfio/Makefile.objs b/hw/vfio/Makefile.objs
index c5c76fe..913ab14 100644
--- a/hw/vfio/Makefile.objs
+++ b/hw/vfio/Makefile.objs
@@ -2,4 +2,5 @@ ifeq ($(CONFIG_LINUX), y)
 obj-$(CONFIG_SOFTMMU) += common.o
 obj-$(CONFIG_PCI) += pci.o
 obj-$(CONFIG_SOFTMMU) += platform.o
+obj-$(CONFIG_SOFTMMU) += calxeda_xgmac.o
 endif
diff --git a/hw/vfio/calxeda_xgmac.c b/hw/vfio/calxeda_xgmac.c
new file mode 100644
index 000..199e076
--- /dev/null
+++ b/hw/vfio/calxeda_xgmac.c
@@ -0,0 +1,54 @@
+/*
+ * calxeda xgmac example VFIO device
+ *
+ * Copyright Linaro Limited, 2014
+ *
+ * Authors:
+ *  Eric Auger 
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2.  See
+ * the COPYING file in the top-level directory.
+ *
+ */
+
+#include "hw/vfio/vfio-calxeda-xgmac.h"
+
+static void calxeda_xgmac_realize(DeviceState *dev, Error **errp)
+{
+VFIOPlatformDevice *vdev = VFIO_PLATFORM_DEVICE(dev);
+VFIOCalxedaXgmacDeviceClass *k = VFIO_CALXEDA_XGMAC_DEVICE_GET_CLASS(dev);
+
+vdev->compat = g_strdup("calxeda,hb-xgmac");
+
+k->parent_realize(dev, errp);
+}
+
+static const VMStateDescription vfio_platform_vmstate = {
+.name = TYPE_VFIO_CALXEDA_XGMAC,
+.unmigratable = 1,
+};
+
+static void vfio_calxeda_xgmac_class_init(ObjectClass *klass, void *data)
+{
+DeviceClass *dc = DEVICE_CLASS(klass);
+VFIOCalxedaXgmacDeviceClass *vcxc =
+VFIO_CALXEDA_XGMAC_DEVICE_CLASS(klass);
+vcxc->parent_realize = dc->realize;
+dc->realize = calxeda_xgmac_realize;
+dc->desc = "VFIO Calxeda XGMAC";
+}
+
+static const TypeInfo vfio_calxeda_xgmac_dev_info = {
+.name = TYPE_VFIO_CALXEDA_XGMAC,
+.parent = TYPE_VFIO_PLATFORM,
+.instance_size = sizeof(VFIOCalxedaXgmacDevice),
+.class_init = vfio_calxeda_xgmac_class_init,
+.class_size = sizeof(VFIOCalxedaXgmacDeviceClass),
+};
+
+static void register_calxeda_xgmac_dev_type(void)
+{
+type_register_static(&vfio_calxeda_xgmac_dev_info);
+}
+
+type_init(register_calxeda_xgmac_dev_type)
diff --git a/include/hw/vfio/vfio-calxeda-xgmac.h b/include/hw/vfio/vfio-calxeda-xgmac.h
new file mode 100644
index 000..1529cf5
--- /dev/null
+++ b/include/hw/vfio/vfio-calxeda-xgmac.h
@@ -0,0 +1,41 @@
+/*
+ * VFIO calxeda xgmac device
+ *
+ * Copyright Linaro Limited, 2014
+ *
+ * Authors:
+ *  Eric Auger 
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2.  See
+ * the COPYING file in the top-level directory.
+ *
+ */
+
+#ifndef HW_VFIO_VFIO_CALXEDA_XGMAC_H
+#define HW_VFIO_VFIO_CALXEDA_XGMAC_H
+
+#include "hw/vfio/vfio-platform.h"
+
+#define TYPE_VFIO_CALXEDA_XGMAC "vfio-calxeda-xgmac"
+
+typedef struct VFIOCalxedaXgmacDevice {
+VFIOPlatformDevice vdev;
+} VFIOCalxedaXgmacDevice;
+
+typedef struct VFIOCalxedaXgmacDeviceClass {
+/*< private >*/
+VFIOPlatformDeviceClass parent_class;
+/*< public >*/
+DeviceRealize parent_realize;
+} VFIOCalxedaXgmacDeviceClass;
+
+#define VFIO_CALXEDA_XGMAC_DEVICE(obj) \
+ OBJECT_CHECK(VFIOCalxedaXgmacDevice, (obj), TYPE_VFIO_CALXEDA_XGMAC)
+#define VFIO_CALXEDA_XGMAC_DEVICE_CLASS(klass) \
+ OBJECT_CLASS_CHECK(VFIOCalxedaXgmacDeviceClass, (klass), \
+TYPE_VFIO_CALXEDA_XGMAC)
+#define VFIO_CALXEDA_XGMAC_DEVICE_GET_CLASS(obj) \
+ OBJECT_GET_CLASS(VFIOCalxedaXgmacDeviceClass, (obj), \
+  TYPE_VFIO_CALXEDA_XGMAC)
+
+#endif
-- 
1.8.3.2
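
The class_init in patch 10/16 saves the parent's realize callback and installs
its own, which sets the compat string and then chains to the saved one. That
override-and-chain pattern can be shown without QOM using plain function
pointers. Everything below (Device, DeviceClass, platform_realize and the
xgmac_* names) is a hypothetical simplification, not the QEMU object model.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, QOM-free stand-ins for the device and its class. */
typedef struct Device {
    const char *compat;
    int realized;
} Device;

typedef void (*RealizeFn)(Device *dev);

typedef struct DeviceClass {
    RealizeFn realize;        /* entry point called at realize time */
    RealizeFn parent_realize; /* saved parent implementation */
} DeviceClass;

static DeviceClass xgmac_class;

/* Models the abstract platform device realize: it needs compat to be set. */
static void platform_realize(Device *dev)
{
    dev->realized = (dev->compat != NULL);
}

/* Subclass realize: specialize first, then chain to the parent. */
static void xgmac_realize(Device *dev)
{
    dev->compat = "calxeda,hb-xgmac";
    xgmac_class.parent_realize(dev);
}

/* Models class_init: remember the inherited realize, install the override. */
static void xgmac_class_init(DeviceClass *k)
{
    k->parent_realize = k->realize;
    k->realize = xgmac_realize;
}

/* Small usage example */
static void demo_parent_realize(void)
{
    Device dev = { NULL, 0 };

    xgmac_class.realize = platform_realize; /* inherited from the parent */
    xgmac_class_init(&xgmac_class);
    xgmac_class.realize(&dev);

    assert(dev.realized == 1);
    assert(strcmp(dev.compat, "calxeda,hb-xgmac") == 0);
}
```

Setting compat before chaining matters in this model because the parent
realize depends on it, which matches the call order in calxeda_xgmac_realize().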




[Qemu-devel] [PATCH v7 06/16] hw/vfio/pci: rename group_list into vfio_group_list

2014-10-31 Thread Eric Auger
Rename group_list to vfio_group_list to better fit the rest of the namespace.

Signed-off-by: Eric Auger 
---
 hw/vfio/pci.c | 22 +++---
 1 file changed, 11 insertions(+), 11 deletions(-)

diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
index 0ee6f7f..2216bd4 100644
--- a/hw/vfio/pci.c
+++ b/hw/vfio/pci.c
@@ -283,7 +283,7 @@ static const VFIORomBlacklistEntry romblacklist[] = {
 #define MSIX_CAP_LENGTH 12
 
 static QLIST_HEAD(, VFIOGroup)
-group_list = QLIST_HEAD_INITIALIZER(group_list);
+vfio_group_list = QLIST_HEAD_INITIALIZER(vfio_group_list);
 
 #ifdef CONFIG_KVM
 /*
@@ -3454,7 +3454,7 @@ static int vfio_pci_hot_reset(VFIOPCIDevice *vdev, bool single)
 continue;
 }
 
-QLIST_FOREACH(group, &group_list, next) {
+QLIST_FOREACH(group, &vfio_group_list, next) {
 if (group->groupid == devices[i].group_id) {
 break;
 }
@@ -3501,7 +3501,7 @@ static int vfio_pci_hot_reset(VFIOPCIDevice *vdev, bool single)
 
 /* Determine how many group fds need to be passed */
 count = 0;
-QLIST_FOREACH(group, &group_list, next) {
+QLIST_FOREACH(group, &vfio_group_list, next) {
 for (i = 0; i < info->count; i++) {
 if (group->groupid == devices[i].group_id) {
 count++;
@@ -3515,7 +3515,7 @@ static int vfio_pci_hot_reset(VFIOPCIDevice *vdev, bool single)
 fds = &reset->group_fds[0];
 
 /* Fill in group fds */
-QLIST_FOREACH(group, &group_list, next) {
+QLIST_FOREACH(group, &vfio_group_list, next) {
 for (i = 0; i < info->count; i++) {
 if (group->groupid == devices[i].group_id) {
 fds[reset->count++] = group->fd;
@@ -3550,7 +3550,7 @@ out:
 continue;
 }
 
-QLIST_FOREACH(group, &group_list, next) {
+QLIST_FOREACH(group, &vfio_group_list, next) {
 if (group->groupid == devices[i].group_id) {
 break;
 }
@@ -3625,13 +3625,13 @@ static void vfio_reset_handler(void *opaque)
 VFIOGroup *group;
 VFIODevice *vbasedev;
 
-QLIST_FOREACH(group, &group_list, next) {
+QLIST_FOREACH(group, &vfio_group_list, next) {
 QLIST_FOREACH(vbasedev, &group->device_list, next) {
 vbasedev->ops->vfio_compute_needs_reset(vbasedev);
 }
 }
 
-QLIST_FOREACH(group, &group_list, next) {
+QLIST_FOREACH(group, &vfio_group_list, next) {
 QLIST_FOREACH(vbasedev, &group->device_list, next) {
 if (vbasedev->needs_reset) {
 vbasedev->ops->vfio_hot_reset_multi(vbasedev);
@@ -3880,7 +3880,7 @@ static VFIOGroup *vfio_get_group(int groupid, AddressSpace *as)
 char path[32];
 struct vfio_group_status status = { .argsz = sizeof(status) };
 
-QLIST_FOREACH(group, &group_list, next) {
+QLIST_FOREACH(group, &vfio_group_list, next) {
 if (group->groupid == groupid) {
 /* Found it.  Now is it already in the right context? */
 if (group->container->space->as == as) {
@@ -3922,11 +3922,11 @@ static VFIOGroup *vfio_get_group(int groupid, AddressSpace *as)
 goto close_fd_exit;
 }
 
-if (QLIST_EMPTY(&group_list)) {
+if (QLIST_EMPTY(&vfio_group_list)) {
 qemu_register_reset(vfio_reset_handler, NULL);
 }
 
-QLIST_INSERT_HEAD(&group_list, group, next);
+QLIST_INSERT_HEAD(&vfio_group_list, group, next);
 
 vfio_kvm_device_add_group(group);
 
@@ -3954,7 +3954,7 @@ static void vfio_put_group(VFIOGroup *group)
 close(group->fd);
 g_free(group);
 
-if (QLIST_EMPTY(&group_list)) {
+if (QLIST_EMPTY(&vfio_group_list)) {
 qemu_unregister_reset(vfio_reset_handler, NULL);
 }
 }
-- 
1.8.3.2




[Qemu-devel] [PATCH v7 00/16] KVM platform device passthrough

2014-10-31 Thread Eric Auger
This RFC series aims at enabling KVM platform device passthrough.
It implements a VFIO platform device, derived from VFIO PCI device.

The VFIO platform device uses the host VFIO platform driver which must
be bound to the assigned device prior to the QEMU system start.

- the guest can directly access the device register space
- assigned device IRQs are transparently routed to the guest by
  QEMU/KVM (three methods are currently supported: user-level eventfd
  handling, irqfd, forwarded IRQs)
- iommu is transparently programmed to prevent the device from
  accessing physical pages outside of the guest address space
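
As a rough illustration of the first injection method (user-level eventfd
handling), the following self-contained C sketch signals an eventfd and then
drains it from a user-space handler. The irq_sketch type and the irq_sketch_*
helpers are hypothetical names invented for this example. They are not part
of this series or of QEMU, and no real VFIO or KVM call is made.

```c
#include <assert.h>
#include <stdint.h>
#include <sys/eventfd.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical per-IRQ state; not a QEMU structure. */
typedef struct {
    int fd;            /* eventfd signalled when the "physical" IRQ fires */
    uint64_t injected; /* how many times the handler simulated an injection */
} irq_sketch;

/* Create a non-blocking eventfd. Returns 0 on success, -1 on failure. */
static int irq_sketch_init(irq_sketch *irq)
{
    irq->injected = 0;
    irq->fd = eventfd(0, EFD_NONBLOCK);
    return irq->fd < 0 ? -1 : 0;
}

/* Stand-in for the host side signalling the eventfd. */
static int irq_sketch_signal(irq_sketch *irq)
{
    uint64_t one = 1;
    return write(irq->fd, &one, sizeof(one)) == (ssize_t)sizeof(one) ? 0 : -1;
}

/*
 * User-level handler: drain the eventfd counter, then simulate one
 * injection. Returns 1 if an event was consumed, 0 if nothing was pending.
 */
static int irq_sketch_handle(irq_sketch *irq)
{
    uint64_t count;

    if (read(irq->fd, &count, sizeof(count)) != (ssize_t)sizeof(count)) {
        return 0; /* counter was zero (EAGAIN) or the read failed */
    }
    irq->injected++;
    return 1;
}
```

In this sketch several signals that arrive before the handler runs collapse
into a single read, which is the default (non-semaphore) eventfd behavior.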

This patch series is made of the following patch file groups:

1-8) PCI modifications to prepare for platform device introduction
9-12) VFIO platform device without irqfd support
13) VFIO platform device with irqfd support
14-16) VFIO platform device with IRQ forwarding support

Each group is independent and should be separately upstreamable.

Dependency List:

QEMU dependencies:
[1] [PATCH v3 0/7] Dynamic sysbus device allocation support, Alex Graf
http://lists.nongnu.org/archive/html/qemu-devel/2014-09/msg04860.html
[2] [PATCH v4] machvirt dynamic sysbus device instantiation, Eric Auger
[3] [PATCH v3 0/2] actual checks of KVM_CAP_IRQFD and KVM_CAP_IRQFD_RESAMPLE,
Eric Auger
http://lists.nongnu.org/archive/html/qemu-devel/2014-09/msg00589.html
[4] [PATCH v2] vfio: migration to trace points, Eric Auger
https://patchwork.ozlabs.org/patch/394785/

Kernel Dependencies:
[5] [PATCH v9 00/19] VFIO support for platform and AMBA devices on ARM
http://comments.gmane.org/gmane.linux.kernel.iommu/7096
[6] [PATCH v3] ARM: KVM: add irqfd support, Eric Auger
https://lkml.org/lkml/2014/9/1/141
[8] [RFC v2 0/9] KVM-VFIO IRQ forward control, Eric Auger
https://lkml.org/lkml/2014/9/1/344
[9] [RFC PATCH 0/9] ARM: Forwarding physical interrupts to a guest VM,
Marc Zyngier
http://lwn.net/Articles/603514/

- kernel pieces can be found at:
  http://git.linaro.org/people/eric.auger/linux.git (branch 3.17rc7-v8)
- QEMU pieces can be found at:
  http://git.linaro.org/people/eric.auger/qemu.git (branch vfio_integ_v7)

The patch series was tested on Calxeda Midway (ARMv7) where one xgmac
is assigned to KVM host while the second one is assigned to the guest.
Reworked PCI device is not tested.

Wiki for Calxeda Midway setup:
https://wiki.linaro.org/LEG/Engineering/Virtualization/Platform_Device_Passthrough_on_Midway

History:
v6->v7:
- fake injection test modality removed
- VFIO_DEVICE_TYPE_PLATFORM only introduced with VFIO platform
- new helper functions to start VFIO IRQ on machine init done notifier
  (introduced in hw/vfio/platform: add vfio-platform support and notifier
  registration invoked in hw/arm/virt: add support for VFIO devices).
  vfio_start_irq_injection is replaced by vfio_register_irq_starter.

v5->v6:
- rebase on 2.1rc5 PCI code
- forwarded IRQ first integration
- vfio_device property renamed into host property
- split IRQ setup in different functions that match the 3 supported
  injection techniques (user handled eventfd, irqfd, forwarded IRQ):
  removes dynamic switch between injection methods
- introduce fake interrupts as a test modality:
  x makes it possible to test user-side handling of multiple IRQs.
  x this is a test feature only: it allows triggering an fd as if the
    real physical IRQ had hit. No virtual IRQ is injected into the guest
    but handling is simulated so that the state machine can be tested
- user handled eventfd:
  x add mutex to protect IRQ state & list manipulation,
  x correct misleading comment in vfio_intp_interrupt.
  x Fix bugs using fake interrupt modality
- irqfd no longer advertised in this patchset (handled in [3])
- VFIOPlatformDeviceClass becomes abstract and Calxeda xgmac device
  and class is re-introduced (as per v4)
- all DPRINTF removed in platform and replaced by trace-points
- corrects compilation with configure --disable-kvm
- simplifies the split for vfio_get_device and introduce a unique
  specialized function named vfio_populate_device
- group_list renamed into vfio_group_list
- hw/arm/dyn_sysbus_devtree.c currently only supports vfio-calxeda-xgmac
  instantiation. Needs to be specialized for other VFIO devices
- fix 2 bugs in dyn_sysbus_devtree (reg_attr index and compat)

v4->v5:
- rebase on v2.1.0 PCI code
- take into account Alex Williamson comments on PCI code rework
  - trace updates in vfio_region_write/read
  - remove fd from VFIORegion
  - get/put cleanup
- bug fix: BAR region's vbasedev field is now duly initialized
- misc cleanups in platform device
- device tree node generation removed from device and handled in
  hw/arm/dyn_sysbus_devtree.c
- remove "hw/vfio: add an example calxeda_xgmac": with removal of
  device tree node generation we do not have so many things to
  implement in that derived device yet. May be re-introduced later
  on if needed typically for reset/migration.
- no GSI routing table a

[Qemu-devel] [PATCH RESEND] vfio: migration to trace points

2014-10-31 Thread Eric Auger
This patch removes all DPRINTF calls and replaces them with trace points.
A few DPRINTF calls used in error cases were converted to error_report.
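
To make the conversion concrete, here is a minimal, self-contained C sketch
of what a trace helper for the EOI message could look like once __func__ is
dropped and the trace point name identifies the call site. This is a
hand-written stand-in: sketch_trace_vfio_eoi and sketch_trace_buf are
hypothetical names, the output format is an assumption of this example, and
real trace helpers are generated from trace-events rather than written by
hand.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical sink for the sketch; a real backend writes elsewhere. */
static char sketch_trace_buf[128];

/*
 * Stand-in for a generated trace helper. The trace point name replaces
 * the __func__ argument that the old DPRINTF call used to pass.
 */
static void sketch_trace_vfio_eoi(unsigned domain, unsigned bus,
                                  unsigned slot, unsigned fn)
{
    snprintf(sketch_trace_buf, sizeof(sketch_trace_buf),
             "vfio_eoi (%04x:%02x:%02x.%x) EOI", domain, bus, slot, fn);
}
```

A call site then only passes the device address fields, for example
sketch_trace_vfio_eoi(0, 1, 2, 3).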

Signed-off-by: Eric Auger 

---

- __func__ is removed since trace point name does the same job
- HWADDR_PRIx were replaced by PRIx64
- this transformation is only compile-tested on PCI,
  with qemu configured with --enable-trace-backends=stderr
- in future, format strings and calls may be simplified by using a single
  name argument instead of domain, bus, slot, function.

v1 (RFC) -> v2 (PATCH):
- restore original format strings since parsing now is OK after
  commit f9bbba9,
  [PATCH v2] trace: tighten up trace-events regex to fix bad parse
---
 hw/misc/vfio.c | 403 +
 trace-events   |  75 ++-
 2 files changed, 280 insertions(+), 198 deletions(-)

diff --git a/hw/misc/vfio.c b/hw/misc/vfio.c
index 75bfa1c..cdf4922 100644
--- a/hw/misc/vfio.c
+++ b/hw/misc/vfio.c
@@ -40,15 +40,7 @@
 #include "sysemu/kvm.h"
 #include "sysemu/sysemu.h"
 #include "hw/misc/vfio.h"
-
-/* #define DEBUG_VFIO */
-#ifdef DEBUG_VFIO
-#define DPRINTF(fmt, ...) \
-do { fprintf(stderr, "vfio: " fmt, ## __VA_ARGS__); } while (0)
-#else
-#define DPRINTF(fmt, ...) \
-do { } while (0)
-#endif
+#include "trace.h"
 
 /* Extra debugging, trap acceleration paths for more logging */
 #define VFIO_ALLOW_MMAP 1
@@ -365,9 +357,9 @@ static void vfio_intx_interrupt(void *opaque)
 return;
 }
 
-DPRINTF("%s(%04x:%02x:%02x.%x) Pin %c\n", __func__, vdev->host.domain,
-vdev->host.bus, vdev->host.slot, vdev->host.function,
-'A' + vdev->intx.pin);
+trace_vfio_intx_interrupt(vdev->host.domain, vdev->host.bus,
+  vdev->host.slot, vdev->host.function,
+  'A' + vdev->intx.pin);
 
 vdev->intx.pending = true;
 pci_irq_assert(&vdev->pdev);
@@ -384,8 +376,8 @@ static void vfio_eoi(VFIODevice *vdev)
 return;
 }
 
-DPRINTF("%s(%04x:%02x:%02x.%x) EOI\n", __func__, vdev->host.domain,
-vdev->host.bus, vdev->host.slot, vdev->host.function);
+trace_vfio_eoi(vdev->host.domain, vdev->host.bus,
+   vdev->host.slot, vdev->host.function);
 
 vdev->intx.pending = false;
 pci_irq_deassert(&vdev->pdev);
@@ -454,9 +446,8 @@ static void vfio_enable_intx_kvm(VFIODevice *vdev)
 
 vdev->intx.kvm_accel = true;
 
-DPRINTF("%s(%04x:%02x:%02x.%x) KVM INTx accel enabled\n",
-__func__, vdev->host.domain, vdev->host.bus,
-vdev->host.slot, vdev->host.function);
+trace_vfio_enable_intx_kvm(vdev->host.domain, vdev->host.bus,
+   vdev->host.slot, vdev->host.function);
 
 return;
 
@@ -508,9 +499,8 @@ static void vfio_disable_intx_kvm(VFIODevice *vdev)
 /* If we've missed an event, let it re-fire through QEMU */
 vfio_unmask_intx(vdev);
 
-DPRINTF("%s(%04x:%02x:%02x.%x) KVM INTx accel disabled\n",
-__func__, vdev->host.domain, vdev->host.bus,
-vdev->host.slot, vdev->host.function);
+trace_vfio_disable_intx_kvm(vdev->host.domain, vdev->host.bus,
+vdev->host.slot, vdev->host.function);
 #endif
 }
 
@@ -529,9 +519,9 @@ static void vfio_update_irq(PCIDevice *pdev)
 return; /* Nothing changed */
 }
 
-DPRINTF("%s(%04x:%02x:%02x.%x) IRQ moved %d -> %d\n", __func__,
-vdev->host.domain, vdev->host.bus, vdev->host.slot,
-vdev->host.function, vdev->intx.route.irq, route.irq);
+trace_vfio_update_irq(vdev->host.domain, vdev->host.bus,
+  vdev->host.slot, vdev->host.function,
+  vdev->intx.route.irq, route.irq);
 
 vfio_disable_intx_kvm(vdev);
 
@@ -606,8 +596,8 @@ static int vfio_enable_intx(VFIODevice *vdev)
 
 vdev->interrupt = VFIO_INT_INTx;
 
-DPRINTF("%s(%04x:%02x:%02x.%x)\n", __func__, vdev->host.domain,
-vdev->host.bus, vdev->host.slot, vdev->host.function);
+trace_vfio_enable_intx(vdev->host.domain, vdev->host.bus,
+   vdev->host.slot, vdev->host.function);
 
 return 0;
 }
@@ -629,8 +619,8 @@ static void vfio_disable_intx(VFIODevice *vdev)
 
 vdev->interrupt = VFIO_INT_NONE;
 
-DPRINTF("%s(%04x:%02x:%02x.%x)\n", __func__, vdev->host.domain,
-vdev->host.bus, vdev->host.slot, vdev->host.function);
+trace_vfio_disable_intx(vdev->host.domain, vdev->host.bus,
+vdev->host.slot, vdev->host.function);
 }
 
 /*
@@ -657,9 +647,9 @@ static void vfio_msi_inter

[Qemu-devel] [PATCH v3 1/2] KVM_CAP_IRQFD and KVM_CAP_IRQFD_RESAMPLE checks

2014-10-31 Thread Eric Auger
Compute kvm_irqfds_allowed by checking the KVM_CAP_IRQFD extension.
Remove direct settings in architecture specific files.

Add a new kvm_resamplefds_allowed variable, initialized by
checking the KVM_CAP_IRQFD_RESAMPLE extension. Add a corresponding
kvm_resamplefds_enabled() function.
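
The idea can be sketched in isolation as follows: both flags are derived
from what the kernel reports instead of being hard-coded per architecture.
Everything prefixed with sketch_ or SKETCH_ is a hypothetical stand-in
written for this example (including the capability enum and the state
struct); only the "> 0 means supported" convention mirrors the
kvm_check_extension() usage in the diff below.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical capability IDs; the real ones come from <linux/kvm.h>. */
enum sketch_cap {
    SKETCH_CAP_IRQFD,
    SKETCH_CAP_IRQFD_RESAMPLE,
    SKETCH_CAP_MAX
};

/* Hypothetical stand-in for KVMState: what the "kernel" reports. */
typedef struct {
    int ext[SKETCH_CAP_MAX];
} sketch_kvm_state;

static bool sketch_irqfds_allowed;
static bool sketch_resamplefds_allowed;

/* Stand-in for kvm_check_extension(): a value > 0 means supported. */
static int sketch_check_extension(const sketch_kvm_state *s,
                                  enum sketch_cap cap)
{
    return s->ext[cap];
}

/* Common init: compute both flags once, from the reported capabilities. */
static void sketch_kvm_init(const sketch_kvm_state *s)
{
    sketch_irqfds_allowed =
        (sketch_check_extension(s, SKETCH_CAP_IRQFD) > 0);
    sketch_resamplefds_allowed =
        (sketch_check_extension(s, SKETCH_CAP_IRQFD_RESAMPLE) > 0);
}
```

With this shape, architecture-specific code no longer needs to set the irqfd
flag itself, which is what the removals in the diff below reflect.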

A special note for s390: KVM_CAP_IRQFD was not immediately
advertised when the irqfd capability was introduced in the kernel.
KVM_CAP_IRQ_ROUTING was advertised instead.

This was fixed in "KVM: s390: announce irqfd capability",
ebc3226202d5956a5963185222982d435378b899 whereas irqfd support
was brought in 84223598778ba08041f4297fda485df83414d57e,
"KVM: s390: irq routing for adapter interrupts".  Both commits
first appear in 3.15 so there should not be any kernel
version impacted by this QEMU modification.

Signed-off-by: Eric Auger 

---

v2->v3:
- changed the commit message only
---
 hw/intc/openpic_kvm.c |  1 -
 hw/intc/xics_kvm.c|  1 -
 include/sysemu/kvm.h  | 10 ++
 kvm-all.c |  7 +++
 target-i386/kvm.c |  1 -
 target-s390x/kvm.c|  1 -
 6 files changed, 17 insertions(+), 4 deletions(-)

diff --git a/hw/intc/openpic_kvm.c b/hw/intc/openpic_kvm.c
index e3bce04..6cef3b1 100644
--- a/hw/intc/openpic_kvm.c
+++ b/hw/intc/openpic_kvm.c
@@ -229,7 +229,6 @@ static void kvm_openpic_realize(DeviceState *dev, Error **errp)
 kvm_irqchip_add_irq_route(kvm_state, i, 0, i);
 }
 
-kvm_irqfds_allowed = true;
 kvm_msi_via_irqfd_allowed = true;
 kvm_gsi_routing_allowed = true;
 
diff --git a/hw/intc/xics_kvm.c b/hw/intc/xics_kvm.c
index 20b19e9..c15453f 100644
--- a/hw/intc/xics_kvm.c
+++ b/hw/intc/xics_kvm.c
@@ -448,7 +448,6 @@ static void xics_kvm_realize(DeviceState *dev, Error **errp)
 }
 
 kvm_kernel_irqchip = true;
-kvm_irqfds_allowed = true;
 kvm_msi_via_irqfd_allowed = true;
 kvm_gsi_direct_mapping = true;
 
diff --git a/include/sysemu/kvm.h b/include/sysemu/kvm.h
index b0cd657..a23ddab 100644
--- a/include/sysemu/kvm.h
+++ b/include/sysemu/kvm.h
@@ -45,6 +45,7 @@ extern bool kvm_async_interrupts_allowed;
 extern bool kvm_halt_in_kernel_allowed;
 extern bool kvm_eventfds_allowed;
 extern bool kvm_irqfds_allowed;
+extern bool kvm_resamplefds_allowed;
 extern bool kvm_msi_via_irqfd_allowed;
 extern bool kvm_gsi_routing_allowed;
 extern bool kvm_gsi_direct_mapping;
@@ -102,6 +103,15 @@ extern bool kvm_readonly_mem_allowed;
 #define kvm_irqfds_enabled() (kvm_irqfds_allowed)
 
 /**
+ * kvm_resamplefds_enabled:
+ *
+ * Returns: true if we can use resamplefds to inject interrupts into
+ * a KVM CPU (ie the kernel supports resamplefds and we are running
+ * with a configuration where it is meaningful to use them).
+ */
+#define kvm_resamplefds_enabled() (kvm_resamplefds_allowed)
+
+/**
  * kvm_msi_via_irqfd_enabled:
  *
  * Returns: true if we can route a PCI MSI (Message Signaled Interrupt)
diff --git a/kvm-all.c b/kvm-all.c
index 44a5e72..0a3139f 100644
--- a/kvm-all.c
+++ b/kvm-all.c
@@ -120,6 +120,7 @@ bool kvm_async_interrupts_allowed;
 bool kvm_halt_in_kernel_allowed;
 bool kvm_eventfds_allowed;
 bool kvm_irqfds_allowed;
+bool kvm_resamplefds_allowed;
 bool kvm_msi_via_irqfd_allowed;
 bool kvm_gsi_routing_allowed;
 bool kvm_gsi_direct_mapping;
@@ -1566,6 +1567,12 @@ static int kvm_init(MachineState *ms)
 kvm_eventfds_allowed =
 (kvm_check_extension(s, KVM_CAP_IOEVENTFD) > 0);
 
+kvm_irqfds_allowed =
+(kvm_check_extension(s, KVM_CAP_IRQFD) > 0);
+
+kvm_resamplefds_allowed =
+(kvm_check_extension(s, KVM_CAP_IRQFD_RESAMPLE) > 0);
+
 ret = kvm_arch_init(s);
 if (ret < 0) {
 goto err;
diff --git a/target-i386/kvm.c b/target-i386/kvm.c
index ccf36e8..3a3dfc4 100644
--- a/target-i386/kvm.c
+++ b/target-i386/kvm.c
@@ -2563,7 +2563,6 @@ void kvm_arch_init_irq_routing(KVMState *s)
  * irqchip, so we can use irqfds, and on x86 we know
  * we can use msi via irqfd and GSI routing.
  */
-kvm_irqfds_allowed = true;
 kvm_msi_via_irqfd_allowed = true;
 kvm_gsi_routing_allowed = true;
 }
diff --git a/target-s390x/kvm.c b/target-s390x/kvm.c
index 5b10a25..9ae1958 100644
--- a/target-s390x/kvm.c
+++ b/target-s390x/kvm.c
@@ -1294,7 +1294,6 @@ void kvm_arch_init_irq_routing(KVMState *s)
  * have to override the common code kvm_halt_in_kernel_allowed setting.
  */
 if (kvm_check_extension(s, KVM_CAP_IRQ_ROUTING)) {
-kvm_irqfds_allowed = true;
 kvm_gsi_routing_allowed = true;
 kvm_halt_in_kernel_allowed = false;
 }
-- 
1.8.3.2




[Qemu-devel] [PATCH v7 03/16] hw/vfio/pci: introduce VFIODevice

2014-10-31 Thread Eric Auger
Introduce the VFIODevice struct that is going to be shared by
VFIOPCIDevice and VFIOPlatformDevice.

Additional fields will be added there later on for review
convenience.

The group's device_list becomes a list of VFIODevice.

This requires reworking the reset handler, which becomes generic and
calls VFIODevice ops that are specialized in each parent object.
Functions that iterate over this list must also take into account that
the devices may be something other than VFIOPCIDevice. The type field
is used to discriminate between them.

This step is also used to change the prototypes of vfio_unmask_intx,
vfio_mask_intx and vfio_disable_irqindex, which now apply to
VFIODevice. The first two are renamed to vfio_unmask_irqindex and
vfio_mask_irqindex. The index is passed as a parameter in anticipation
of their use for platform IRQs.
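
The pattern described above (a base device embedded in a parent object, an
ops table, and a generic two-pass reset handler) can be sketched in
self-contained C as follows. All sketch_* names are hypothetical and the
list is a plain singly linked list rather than a QLIST. The two passes
mirror the structure of the generic reset handler in this series: first
compute which devices need a reset, then reset those that do.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct sketch_dev sketch_dev;

/* Per-type callbacks, specialized by each parent object. */
typedef struct {
    bool (*compute_needs_reset)(sketch_dev *dev);
    int (*hot_reset_multi)(sketch_dev *dev);
} sketch_dev_ops;

/* Generic base device, shared by every device type. */
struct sketch_dev {
    sketch_dev *next; /* simple singly linked device list */
    int type;
    bool needs_reset;
    const sketch_dev_ops *ops;
};

/* Hypothetical parent object embedding the base device. */
typedef struct {
    int resets;      /* counts resets, for illustration only */
    sketch_dev base;
} sketch_pci_dev;

/* Recover the parent object from a pointer to its embedded member. */
#define SKETCH_CONTAINER_OF(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

static bool sketch_pci_compute_needs_reset(sketch_dev *dev)
{
    dev->needs_reset = true; /* the sketch always asks for a reset */
    return dev->needs_reset;
}

static int sketch_pci_hot_reset_multi(sketch_dev *dev)
{
    sketch_pci_dev *pci = SKETCH_CONTAINER_OF(dev, sketch_pci_dev, base);

    pci->resets++;
    dev->needs_reset = false;
    return 0;
}

static const sketch_dev_ops sketch_pci_ops = {
    .compute_needs_reset = sketch_pci_compute_needs_reset,
    .hot_reset_multi = sketch_pci_hot_reset_multi,
};

/* Generic handler: it only touches the base device and its ops. */
static void sketch_reset_handler(sketch_dev *head)
{
    sketch_dev *d;

    for (d = head; d; d = d->next) {
        d->ops->compute_needs_reset(d);
    }
    for (d = head; d; d = d->next) {
        if (d->needs_reset) {
            d->ops->hot_reset_multi(d);
        }
    }
}
```

The handler never needs to know the parent type; only the callbacks
convert back to the parent object.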

Signed-off-by: Eric Auger 

---

v4->v5:
- fix style issues
- in vfio_initfn, rework allocation of vdev->vbasedev.name and
  replace snprintf by g_strdup_printf
---
 hw/vfio/pci.c | 241 +++---
 1 file changed, 147 insertions(+), 94 deletions(-)

diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
index 93181bf..0531744 100644
--- a/hw/vfio/pci.c
+++ b/hw/vfio/pci.c
@@ -48,6 +48,11 @@
 #define VFIO_ALLOW_KVM_MSI 1
 #define VFIO_ALLOW_KVM_MSIX 1
 
+enum {
+VFIO_DEVICE_TYPE_PCI = 0,
+VFIO_DEVICE_TYPE_PLATFORM = 1,
+};
+
 struct VFIOPCIDevice;
 
 typedef struct VFIOQuirk {
@@ -185,9 +190,27 @@ typedef struct VFIOMSIXInfo {
 void *mmap;
 } VFIOMSIXInfo;
 
+typedef struct VFIODeviceOps VFIODeviceOps;
+
+typedef struct VFIODevice {
+QLIST_ENTRY(VFIODevice) next;
+struct VFIOGroup *group;
+char *name;
+int fd;
+int type;
+bool reset_works;
+bool needs_reset;
+VFIODeviceOps *ops;
+} VFIODevice;
+
+struct VFIODeviceOps {
+bool (*vfio_compute_needs_reset)(VFIODevice *vdev);
+int (*vfio_hot_reset_multi)(VFIODevice *vdev);
+};
+
 typedef struct VFIOPCIDevice {
 PCIDevice pdev;
-int fd;
+VFIODevice vbasedev;
 VFIOINTx intx;
 unsigned int config_size;
 uint8_t *emulated_config_bits; /* QEMU emulated bits, little-endian */
@@ -203,20 +226,16 @@ typedef struct VFIOPCIDevice {
 VFIOBAR bars[PCI_NUM_REGIONS - 1]; /* No ROM */
 VFIOVGA vga; /* 0xa, 0x3b0, 0x3c0 */
 PCIHostDeviceAddress host;
-QLIST_ENTRY(VFIOPCIDevice) next;
-struct VFIOGroup *group;
 EventNotifier err_notifier;
 uint32_t features;
 #define VFIO_FEATURE_ENABLE_VGA_BIT 0
 #define VFIO_FEATURE_ENABLE_VGA (1 << VFIO_FEATURE_ENABLE_VGA_BIT)
 int32_t bootindex;
 uint8_t pm_cap;
-bool reset_works;
 bool has_vga;
 bool pci_aer;
 bool has_flr;
 bool has_pm_reset;
-bool needs_reset;
 bool rom_read_failed;
 } VFIOPCIDevice;
 
@@ -224,7 +243,7 @@ typedef struct VFIOGroup {
 int fd;
 int groupid;
 VFIOContainer *container;
-QLIST_HEAD(, VFIOPCIDevice) device_list;
+QLIST_HEAD(, VFIODevice) device_list;
 QLIST_ENTRY(VFIOGroup) next;
 QLIST_ENTRY(VFIOGroup) container_next;
 } VFIOGroup;
@@ -277,7 +296,7 @@ static void vfio_mmap_set_enabled(VFIOPCIDevice *vdev, bool enabled);
 /*
  * Common VFIO interrupt disable
  */
-static void vfio_disable_irqindex(VFIOPCIDevice *vdev, int index)
+static void vfio_disable_irqindex(VFIODevice *vbasedev, int index)
 {
 struct vfio_irq_set irq_set = {
 .argsz = sizeof(irq_set),
@@ -287,37 +306,37 @@ static void vfio_disable_irqindex(VFIOPCIDevice *vdev, int index)
 .count = 0,
 };
 
-ioctl(vdev->fd, VFIO_DEVICE_SET_IRQS, &irq_set);
+ioctl(vbasedev->fd, VFIO_DEVICE_SET_IRQS, &irq_set);
 }
 
 /*
  * INTx
  */
-static void vfio_unmask_intx(VFIOPCIDevice *vdev)
+static void vfio_unmask_irqindex(VFIODevice *vbasedev, int index)
 {
 struct vfio_irq_set irq_set = {
 .argsz = sizeof(irq_set),
 .flags = VFIO_IRQ_SET_DATA_NONE | VFIO_IRQ_SET_ACTION_UNMASK,
-.index = VFIO_PCI_INTX_IRQ_INDEX,
+.index = index,
 .start = 0,
 .count = 1,
 };
 
-ioctl(vdev->fd, VFIO_DEVICE_SET_IRQS, &irq_set);
+ioctl(vbasedev->fd, VFIO_DEVICE_SET_IRQS, &irq_set);
 }
 
 #ifdef CONFIG_KVM /* Unused outside of CONFIG_KVM code */
-static void vfio_mask_intx(VFIOPCIDevice *vdev)
+static void vfio_mask_irqindex(VFIODevice *vbasedev, int index)
 {
 struct vfio_irq_set irq_set = {
 .argsz = sizeof(irq_set),
 .flags = VFIO_IRQ_SET_DATA_NONE | VFIO_IRQ_SET_ACTION_MASK,
-.index = VFIO_PCI_INTX_IRQ_INDEX,
+.index = index,
 .start = 0,
 .count = 1,
 };
 
-ioctl(vdev->fd, VFIO_DEVICE_SET_IRQS, &irq_set);
+ioctl(vbasedev->fd, VFIO_DEVICE_SET_IRQS, &irq_set);
 }
 #endif
 
@@ -381,7 +400,7 @@ static void vfio_eoi(VFIOPCIDevice *vdev)
 
 vdev->intx.pending = false;
 pci_irq_deassert(&vdev->pdev);
-vfio_unmask_intx(vdev);
+vfio_unmask_irqindex(&vdev->vbasedev, VFIO_PCI_INTX_IRQ_I

[Qemu-devel] [PATCH v7 02/16] hw/vfio/pci: Rename VFIODevice into VFIOPCIDevice

2014-10-31 Thread Eric Auger
This prepares for the introduction of VFIOPlatformDevice

Signed-off-by: Eric Auger 
---
 hw/vfio/pci.c | 210 +-
 1 file changed, 106 insertions(+), 104 deletions(-)

diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
index 8514b9e..93181bf 100644
--- a/hw/vfio/pci.c
+++ b/hw/vfio/pci.c
@@ -48,11 +48,11 @@
 #define VFIO_ALLOW_KVM_MSI 1
 #define VFIO_ALLOW_KVM_MSIX 1
 
-struct VFIODevice;
+struct VFIOPCIDevice;
 
 typedef struct VFIOQuirk {
 MemoryRegion mem;
-struct VFIODevice *vdev;
+struct VFIOPCIDevice *vdev;
 QLIST_ENTRY(VFIOQuirk) next;
 struct {
 uint32_t base_offset:TARGET_PAGE_BITS;
@@ -123,7 +123,7 @@ typedef struct VFIOMSIVector {
  */
 EventNotifier interrupt;
 EventNotifier kvm_interrupt;
-struct VFIODevice *vdev; /* back pointer to device */
+struct VFIOPCIDevice *vdev; /* back pointer to device */
 int virq;
 bool use;
 } VFIOMSIVector;
@@ -185,7 +185,7 @@ typedef struct VFIOMSIXInfo {
 void *mmap;
 } VFIOMSIXInfo;
 
-typedef struct VFIODevice {
+typedef struct VFIOPCIDevice {
 PCIDevice pdev;
 int fd;
 VFIOINTx intx;
@@ -203,7 +203,7 @@ typedef struct VFIODevice {
 VFIOBAR bars[PCI_NUM_REGIONS - 1]; /* No ROM */
 VFIOVGA vga; /* 0xa, 0x3b0, 0x3c0 */
 PCIHostDeviceAddress host;
-QLIST_ENTRY(VFIODevice) next;
+QLIST_ENTRY(VFIOPCIDevice) next;
 struct VFIOGroup *group;
 EventNotifier err_notifier;
 uint32_t features;
@@ -218,13 +218,13 @@ typedef struct VFIODevice {
 bool has_pm_reset;
 bool needs_reset;
 bool rom_read_failed;
-} VFIODevice;
+} VFIOPCIDevice;
 
 typedef struct VFIOGroup {
 int fd;
 int groupid;
 VFIOContainer *container;
-QLIST_HEAD(, VFIODevice) device_list;
+QLIST_HEAD(, VFIOPCIDevice) device_list;
 QLIST_ENTRY(VFIOGroup) next;
 QLIST_ENTRY(VFIOGroup) container_next;
 } VFIOGroup;
@@ -268,16 +268,16 @@ static QLIST_HEAD(, VFIOGroup)
 static int vfio_kvm_device_fd = -1;
 #endif
 
-static void vfio_disable_interrupts(VFIODevice *vdev);
+static void vfio_disable_interrupts(VFIOPCIDevice *vdev);
 static uint32_t vfio_pci_read_config(PCIDevice *pdev, uint32_t addr, int len);
 static void vfio_pci_write_config(PCIDevice *pdev, uint32_t addr,
   uint32_t val, int len);
-static void vfio_mmap_set_enabled(VFIODevice *vdev, bool enabled);
+static void vfio_mmap_set_enabled(VFIOPCIDevice *vdev, bool enabled);
 
 /*
  * Common VFIO interrupt disable
  */
-static void vfio_disable_irqindex(VFIODevice *vdev, int index)
+static void vfio_disable_irqindex(VFIOPCIDevice *vdev, int index)
 {
 struct vfio_irq_set irq_set = {
 .argsz = sizeof(irq_set),
@@ -293,7 +293,7 @@ static void vfio_disable_irqindex(VFIODevice *vdev, int index)
 /*
  * INTx
  */
-static void vfio_unmask_intx(VFIODevice *vdev)
+static void vfio_unmask_intx(VFIOPCIDevice *vdev)
 {
 struct vfio_irq_set irq_set = {
 .argsz = sizeof(irq_set),
@@ -307,7 +307,7 @@ static void vfio_unmask_intx(VFIODevice *vdev)
 }
 
 #ifdef CONFIG_KVM /* Unused outside of CONFIG_KVM code */
-static void vfio_mask_intx(VFIODevice *vdev)
+static void vfio_mask_intx(VFIOPCIDevice *vdev)
 {
 struct vfio_irq_set irq_set = {
 .argsz = sizeof(irq_set),
@@ -338,7 +338,7 @@ static void vfio_mask_intx(VFIODevice *vdev)
  */
 static void vfio_intx_mmap_enable(void *opaque)
 {
-VFIODevice *vdev = opaque;
+VFIOPCIDevice *vdev = opaque;
 
 if (vdev->intx.pending) {
 timer_mod(vdev->intx.mmap_timer,
@@ -351,7 +351,7 @@ static void vfio_intx_mmap_enable(void *opaque)
 
 static void vfio_intx_interrupt(void *opaque)
 {
-VFIODevice *vdev = opaque;
+VFIOPCIDevice *vdev = opaque;
 
 if (!event_notifier_test_and_clear(&vdev->intx.interrupt)) {
 return;
@@ -370,7 +370,7 @@ static void vfio_intx_interrupt(void *opaque)
 }
 }
 
-static void vfio_eoi(VFIODevice *vdev)
+static void vfio_eoi(VFIOPCIDevice *vdev)
 {
 if (!vdev->intx.pending) {
 return;
@@ -384,7 +384,7 @@ static void vfio_eoi(VFIODevice *vdev)
 vfio_unmask_intx(vdev);
 }
 
-static void vfio_enable_intx_kvm(VFIODevice *vdev)
+static void vfio_enable_intx_kvm(VFIOPCIDevice *vdev)
 {
 #ifdef CONFIG_KVM
 struct kvm_irqfd irqfd = {
@@ -462,7 +462,7 @@ fail:
 #endif
 }
 
-static void vfio_disable_intx_kvm(VFIODevice *vdev)
+static void vfio_disable_intx_kvm(VFIOPCIDevice *vdev)
 {
 #ifdef CONFIG_KVM
 struct kvm_irqfd irqfd = {
@@ -506,7 +506,7 @@ static void vfio_disable_intx_kvm(VFIODevice *vdev)
 
 static void vfio_update_irq(PCIDevice *pdev)
 {
-VFIODevice *vdev = DO_UPCAST(VFIODevice, pdev, pdev);
+VFIOPCIDevice *vdev = DO_UPCAST(VFIOPCIDevice, pdev, pdev);
 PCIINTxRoute route;
 
 if (vdev->interrupt != VFIO_INT_INTx) {
@@ -537,7 +537,7 @@ static void vfio_update_irq(PCIDevice *pdev)
 vfio_eoi(vdev);
 }
 
-static int vfio_enable_intx(VFIODevice

[Qemu-devel] [PATCH v7 01/16] vfio: move hw/misc/vfio.c to hw/vfio/pci.c Move vfio.h into include/hw/vfio

2014-10-31 Thread Eric Auger
From: Kim Phillips 

This is done in preparation for the addition of VFIO platform
device support.

Signed-off-by: Kim Phillips 
---
 LICENSE  | 2 +-
 MAINTAINERS  | 2 +-
 hw/Makefile.objs | 1 +
 hw/misc/Makefile.objs| 1 -
 hw/ppc/spapr_pci_vfio.c  | 2 +-
 hw/vfio/Makefile.objs| 3 +++
 hw/{misc/vfio.c => vfio/pci.c}   | 2 +-
 include/hw/{misc => vfio}/vfio.h | 0
 8 files changed, 8 insertions(+), 5 deletions(-)
 create mode 100644 hw/vfio/Makefile.objs
 rename hw/{misc/vfio.c => vfio/pci.c} (99%)
 rename include/hw/{misc => vfio}/vfio.h (100%)

diff --git a/LICENSE b/LICENSE
index da70e94..0e0b4b9 100644
--- a/LICENSE
+++ b/LICENSE
@@ -11,7 +11,7 @@ option) any later version.
 
 As of July 2013, contributions under version 2 of the GNU General Public
 License (and no later version) are only accepted for the following files
-or directories: bsd-user/, linux-user/, hw/misc/vfio.c, hw/xen/xen_pt*.
+or directories: bsd-user/, linux-user/, hw/vfio/, hw/xen/xen_pt*.
 
 3) The Tiny Code Generator (TCG) is released under the BSD license
(see license headers in files).
diff --git a/MAINTAINERS b/MAINTAINERS
index 94366ef..3f2db91 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -656,7 +656,7 @@ F: hw/usb/dev-serial.c
 VFIO
 M: Alex Williamson 
 S: Supported
-F: hw/misc/vfio.c
+F: hw/vfio/*
 
 vhost
 M: Michael S. Tsirkin 
diff --git a/hw/Makefile.objs b/hw/Makefile.objs
index 52a1464..73afa41 100644
--- a/hw/Makefile.objs
+++ b/hw/Makefile.objs
@@ -26,6 +26,7 @@ devices-dirs-$(CONFIG_SOFTMMU) += ssi/
 devices-dirs-$(CONFIG_SOFTMMU) += timer/
 devices-dirs-$(CONFIG_TPM) += tpm/
 devices-dirs-$(CONFIG_SOFTMMU) += usb/
+devices-dirs-$(CONFIG_SOFTMMU) += vfio/
 devices-dirs-$(CONFIG_VIRTIO) += virtio/
 devices-dirs-$(CONFIG_SOFTMMU) += watchdog/
 devices-dirs-$(CONFIG_SOFTMMU) += xen/
diff --git a/hw/misc/Makefile.objs b/hw/misc/Makefile.objs
index 979e532..e47fea8 100644
--- a/hw/misc/Makefile.objs
+++ b/hw/misc/Makefile.objs
@@ -21,7 +21,6 @@ common-obj-$(CONFIG_MACIO) += macio/
 
 ifeq ($(CONFIG_PCI), y)
 obj-$(CONFIG_KVM) += ivshmem.o
-obj-$(CONFIG_LINUX) += vfio.o
 endif
 
 obj-$(CONFIG_REALVIEW) += arm_sysctl.o
diff --git a/hw/ppc/spapr_pci_vfio.c b/hw/ppc/spapr_pci_vfio.c
index d3bddf2..144912b 100644
--- a/hw/ppc/spapr_pci_vfio.c
+++ b/hw/ppc/spapr_pci_vfio.c
@@ -20,7 +20,7 @@
 #include "hw/ppc/spapr.h"
 #include "hw/pci-host/spapr.h"
 #include "linux/vfio.h"
-#include "hw/misc/vfio.h"
+#include "hw/vfio/vfio.h"
 
 static Property spapr_phb_vfio_properties[] = {
 DEFINE_PROP_INT32("iommu", sPAPRPHBVFIOState, iommugroupid, -1),
diff --git a/hw/vfio/Makefile.objs b/hw/vfio/Makefile.objs
new file mode 100644
index 000..31c7dab
--- /dev/null
+++ b/hw/vfio/Makefile.objs
@@ -0,0 +1,3 @@
+ifeq ($(CONFIG_LINUX), y)
+obj-$(CONFIG_PCI) += pci.o
+endif
diff --git a/hw/misc/vfio.c b/hw/vfio/pci.c
similarity index 99%
rename from hw/misc/vfio.c
rename to hw/vfio/pci.c
index cdf4922..8514b9e 100644
--- a/hw/misc/vfio.c
+++ b/hw/vfio/pci.c
@@ -39,8 +39,8 @@
 #include "qemu/range.h"
 #include "sysemu/kvm.h"
 #include "sysemu/sysemu.h"
-#include "hw/misc/vfio.h"
 #include "trace.h"
+#include "hw/vfio/vfio.h"
 
 /* Extra debugging, trap acceleration paths for more logging */
 #define VFIO_ALLOW_MMAP 1
diff --git a/include/hw/misc/vfio.h b/include/hw/vfio/vfio.h
similarity index 100%
rename from include/hw/misc/vfio.h
rename to include/hw/vfio/vfio.h
-- 
1.8.3.2




[Qemu-devel] [PATCH v7 07/16] hw/vfio/pci: use name field in format strings

2014-10-31 Thread Eric Auger
Signed-off-by: Eric Auger 

Conflicts:
trace-events
---
 hw/vfio/pci.c | 213 --
 trace-events  | 109 --
 2 files changed, 116 insertions(+), 206 deletions(-)

diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
index 2216bd4..6584425 100644
--- a/hw/vfio/pci.c
+++ b/hw/vfio/pci.c
@@ -387,9 +387,7 @@ static void vfio_intx_interrupt(void *opaque)
 return;
 }
 
-trace_vfio_intx_interrupt(vdev->host.domain, vdev->host.bus,
-  vdev->host.slot, vdev->host.function,
-  'A' + vdev->intx.pin);
+trace_vfio_intx_interrupt(vdev->vbasedev.name, 'A' + vdev->intx.pin);
 
 vdev->intx.pending = true;
 pci_irq_assert(&vdev->pdev);
@@ -408,8 +406,7 @@ static void vfio_eoi(VFIODevice *vbasedev)
 return;
 }
 
-trace_vfio_eoi(vdev->host.domain, vdev->host.bus,
-   vdev->host.slot, vdev->host.function);
+trace_vfio_eoi(vbasedev->name);
 
 vdev->intx.pending = false;
 pci_irq_deassert(&vdev->pdev);
@@ -478,8 +475,7 @@ static void vfio_enable_intx_kvm(VFIOPCIDevice *vdev)
 
 vdev->intx.kvm_accel = true;
 
-trace_vfio_enable_intx_kvm(vdev->host.domain, vdev->host.bus,
-   vdev->host.slot, vdev->host.function);
+trace_vfio_enable_intx_kvm(vdev->vbasedev.name);
 
 return;
 
@@ -531,8 +527,7 @@ static void vfio_disable_intx_kvm(VFIOPCIDevice *vdev)
 /* If we've missed an event, let it re-fire through QEMU */
 vfio_unmask_irqindex(&vdev->vbasedev, VFIO_PCI_INTX_IRQ_INDEX);
 
-trace_vfio_disable_intx_kvm(vdev->host.domain, vdev->host.bus,
-vdev->host.slot, vdev->host.function);
+trace_vfio_disable_intx_kvm(vdev->vbasedev.name);
 #endif
 }
 
@@ -551,8 +546,7 @@ static void vfio_update_irq(PCIDevice *pdev)
 return; /* Nothing changed */
 }
 
-trace_vfio_update_irq(vdev->host.domain, vdev->host.bus,
-  vdev->host.slot, vdev->host.function,
+trace_vfio_update_irq(vdev->vbasedev.name,
   vdev->intx.route.irq, route.irq);
 
 vfio_disable_intx_kvm(vdev);
@@ -628,8 +622,7 @@ static int vfio_enable_intx(VFIOPCIDevice *vdev)
 
 vdev->interrupt = VFIO_INT_INTx;
 
-trace_vfio_enable_intx(vdev->host.domain, vdev->host.bus,
-   vdev->host.slot, vdev->host.function);
+trace_vfio_enable_intx(vdev->vbasedev.name);
 
 return 0;
 }
@@ -651,8 +644,7 @@ static void vfio_disable_intx(VFIOPCIDevice *vdev)
 
 vdev->interrupt = VFIO_INT_NONE;
 
-trace_vfio_disable_intx(vdev->host.domain, vdev->host.bus,
-vdev->host.slot, vdev->host.function);
+trace_vfio_disable_intx(vdev->vbasedev.name);
 }
 
 /*
@@ -679,9 +671,7 @@ static void vfio_msi_interrupt(void *opaque)
 abort();
 }
 
-trace_vfio_msi_interrupt(vdev->host.domain, vdev->host.bus,
- vdev->host.slot, vdev->host.function,
- nr, msg.address, msg.data);
+trace_vfio_msi_interrupt(vbasedev->name, nr, msg.address, msg.data);
 #endif
 
 if (vdev->interrupt == VFIO_INT_MSIX) {
@@ -788,9 +778,7 @@ static int vfio_msix_vector_do_use(PCIDevice *pdev, unsigned int nr,
 VFIOMSIVector *vector;
 int ret;
 
-trace_vfio_msix_vector_do_use(vdev->host.domain, vdev->host.bus,
-  vdev->host.slot, vdev->host.function,
-  nr);
+trace_vfio_msix_vector_do_use(vdev->vbasedev.name, nr);
 
 vector = &vdev->msi_vectors[nr];
 
@@ -876,9 +864,7 @@ static void vfio_msix_vector_release(PCIDevice *pdev, unsigned int nr)
 VFIOPCIDevice *vdev = DO_UPCAST(VFIOPCIDevice, pdev, pdev);
 VFIOMSIVector *vector = &vdev->msi_vectors[nr];
 
-trace_vfio_msix_vector_release(vdev->host.domain, vdev->host.bus,
-   vdev->host.slot, vdev->host.function,
-   nr);
+trace_vfio_msix_vector_release(vdev->vbasedev.name, nr);
 
 /*
  * There are still old guests that mask and unmask vectors on every
@@ -941,8 +927,7 @@ static void vfio_enable_msix(VFIOPCIDevice *vdev)
 error_report("vfio: msix_set_vector_notifiers failed");
 }
 
-trace_vfio_enable_msix(vdev->host.domain, vdev->host.bus,
-   vdev->host.slot, vdev->host.function);
+trace_vfio_enable_msix(vdev->vbasedev.name);
 }
 
 static void vfio_enable_msi(VFIOPCIDevice *vdev)
@@ -1018,9 +1003,7 @@ retry:
 return;
 }
 
-trace_vfio_enable_msi(vdev->host.domain, vdev->host.bus,
- 

[Qemu-devel] [PATCH v4 6/6] hw/arm/virt: add dynamic sysbus device support

2014-10-31 Thread Eric Auger
Allows sysbus devices to be instantiated from command line by
using -device option. Machvirt creates a platform bus at init.
The dynamic sysbus devices are attached to a platform bus device.

The platform bus device registers a machine init done notifier
whose role will be to bind the dynamic sysbus devices. Indeed
dynamic sysbus devices are created after machine init.

machvirt also registers a notifier that will start the VFIO
dynamic device IRQ handling.
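
The ordering constraint described above (bind the dynamic sysbus devices
first, start VFIO IRQ handling afterwards) can be sketched with a toy
notifier list. This is only an illustration: InitDoneList, the callback
names, and the "run in registration order" policy are assumptions of the
sketch, not QEMU's notifier implementation, whose call order should be
checked separately.

```c
#include <stddef.h>

/* Hypothetical stand-in for a "machine init done" notifier list. */
typedef void (*InitDoneFn)(void *opaque);

typedef struct {
    InitDoneFn fn;
    void *opaque;
} InitDoneNotifier;

#define MAX_NOTIFIERS 8

typedef struct {
    InitDoneNotifier slots[MAX_NOTIFIERS];
    size_t count;
} InitDoneList;

/* Append a callback; returns 0 on success, -1 if the list is full. */
static int init_done_register(InitDoneList *list, InitDoneFn fn, void *opaque)
{
    if (list->count == MAX_NOTIFIERS) {
        return -1;
    }
    list->slots[list->count].fn = fn;
    list->slots[list->count].opaque = opaque;
    list->count++;
    return 0;
}

/* Run every callback in registration order (an assumption of this sketch). */
static void init_done_notify(InitDoneList *list)
{
    for (size_t i = 0; i < list->count; i++) {
        list->slots[i].fn(list->slots[i].opaque);
    }
}

/* Each step appends one character to a log so the order is observable. */
typedef struct {
    char log[8];
    size_t len;
} StepLog;

static void log_step(StepLog *l, char c)
{
    if (l->len + 1 < sizeof(l->log)) {
        l->log[l->len++] = c;
        l->log[l->len] = '\0';
    }
}

/* Hypothetical callbacks: 'B' = bind devices, 'V' = start VFIO IRQs. */
static void bind_sysbus_devices(void *opaque) { log_step(opaque, 'B'); }
static void start_vfio_irqs(void *opaque) { log_step(opaque, 'V'); }
```

Registering bind_sysbus_devices before start_vfio_irqs and then calling
init_done_notify() leaves "BV" in the log, i.e. IRQ handling starts only
once the devices are bound.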

Signed-off-by: Alexander Graf 
Signed-off-by: Eric Auger 

---
v3 -> v4:
- use platform bus object, instantiated in create_platform_bus
- device tree generation for platform bus and children dynamic
  sysbus devices is no longer handled at reset but in a
  machine_init_done_notifier (due to the change in implementation
  of ARM load dtb using rom_add_blob_fixed).
- device tree enhancement now takes into account the case of
  user provided dtb. Before the user dtb was overwritten which
  was wrong. However in case the dtb is provided by the user,
  dynamic sysbus nodes are not added there.
- renaming of MACHVIRT_PLATFORM defines
- MACHVIRT_PLATFORM_PAGE_SHIFT and SIZE_PAGES not needed anymore,
  hence removed.
- DynSysbusParams struct renamed into ARMPlatformBusSystemParams
  and above params removed.
- separation of dt creation and QEMU binding is not mandated anymore
  since the device tree is not created from scratch anymore. Instead
  the modify_dtb function is used.
- create_platform_bus registers another machine init done notifier
  to start VFIO IRQ handling. This latter executes after the
  dynamic sysbus device binding.

v2 -> v3:
- renaming of arm_platform_bus_create_devtree and arm_load_dtb
- add copyright in hw/arm/dyn_sysbus_devtree.c

v1 -> v2:
- remove useless vfio-platform.h include file
- s/MACHVIRT_PLATFORM_HOLE/MACHVIRT_PLATFORM_SIZE
- use dyn_sysbus_binding and dyn_sysbus_devtree
- dynamic sysbus platform bus size shrunk to 4MB and
  moved between RTC and MMIO

v1:

Inspired from what Alex Graf did in ppc e500
https://lists.gnu.org/archive/html/qemu-ppc/2014-07/msg00012.html

Conflicts:
hw/arm/sysbus-fdt.c
---
 hw/arm/virt.c | 59 +++
 1 file changed, 59 insertions(+)

diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index 78f618d..3a09d58 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -42,6 +42,8 @@
 #include "exec/address-spaces.h"
 #include "qemu/bitops.h"
 #include "qemu/error-report.h"
+#include "hw/arm/sysbus-fdt.h"
+#include "hw/platform-bus.h"
 
 #define NUM_VIRTIO_TRANSPORTS 32
 
@@ -59,6 +61,11 @@
 #define GIC_FDT_IRQ_PPI_CPU_START 8
 #define GIC_FDT_IRQ_PPI_CPU_WIDTH 8
 
+#define PLATFORM_BUS_BASE 0x940
+#define PLATFORM_BUS_SIZE (4ULL * 1024 * 1024)
+#define PLATFORM_BUS_FIRST_IRQ 48
+#define PLATFORM_BUS_NUM_IRQS 20
+
 enum {
 VIRT_FLASH,
 VIRT_MEM,
@@ -68,6 +75,7 @@ enum {
 VIRT_UART,
 VIRT_MMIO,
 VIRT_RTC,
+VIRT_PLATFORM_BUS,
 };
 
 typedef struct MemMapEntry {
@@ -107,6 +115,7 @@ static const MemMapEntry a15memmap[] = {
 [VIRT_GIC_CPU] ={ 0x0801, 0x0001 },
 [VIRT_UART] =   { 0x0900, 0x1000 },
 [VIRT_RTC] ={ 0x0901, 0x1000 },
+[VIRT_PLATFORM_BUS] = {PLATFORM_BUS_BASE , PLATFORM_BUS_SIZE},
 [VIRT_MMIO] =   { 0x0a00, 0x0200 },
 /* ...repeating for a total of NUM_VIRTIO_TRANSPORTS, each of that size */
 /* 0x1000 .. 0x4000 reserved for PCI */
@@ -117,6 +126,14 @@ static const int a15irqmap[] = {
 [VIRT_UART] = 1,
 [VIRT_RTC] = 2,
 [VIRT_MMIO] = 16, /* ...to 16 + NUM_VIRTIO_TRANSPORTS - 1 */
+[VIRT_PLATFORM_BUS] = PLATFORM_BUS_FIRST_IRQ,
+};
+
+ARMPlatformBusSystemParams platform_bus_params = {
+.platform_bus_base = PLATFORM_BUS_BASE,
+.platform_bus_size = PLATFORM_BUS_SIZE,
+.platform_bus_first_irq = PLATFORM_BUS_FIRST_IRQ,
+.platform_bus_num_irqs = PLATFORM_BUS_NUM_IRQS,
 };
 
 static VirtBoardInfo machines[] = {
@@ -519,6 +536,45 @@ static void create_flash(const VirtBoardInfo *vbi)
 g_free(nodename);
 }
 
+static void create_platform_bus(VirtBoardInfo *vbi, qemu_irq *pic,
+ARMPlatformBusSystemParams *system_params)
+{
+DeviceState *dev;
+SysBusDevice *s;
+int i;
+ARMPlatformBusFdtParams *fdt_params = g_new(ARMPlatformBusFdtParams, 1);
+MemoryRegion *sysmem = get_system_memory();
+
+/*
+ * register the notifier that will update the device tree with
+ * the platform bus and device tree nodes. Must be done before
+ * the instantiation of the platform bus device that registers
+ * the notifier that instantiates the dynamic sysbus devices
+ */
+fdt_params->system_params = system_params;
+fdt_params->binfo = &vbi->bootinfo;
+fdt_params->intc = "/intc";
+arm_register_platform_bus_fdt_creator(fdt_params);
+
+dev =

[Qemu-devel] [PATCH v4 4/6] hw/arm: add a new modify_dtb_opaque field in arm_boot_info

2014-10-31 Thread Eric Auger
This field can be used by any modify_dtb() function to pass the
additional arguments it needs to build the modified dtb. This
is needed for creating the platform bus dynamic sysbus nodes.
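
As an illustration of the opaque-pointer pattern this field enables, here
is a minimal sketch. The struct and function names (boot_info_sketch,
bus_params_sketch, fdt_sketch, add_bus_node, run_modify_dtb) are
hypothetical stand-ins, not the actual arm_boot_info layout or any QEMU
helper; the point is only that the hook recovers its private parameters
from the opaque field instead of from extra function arguments.

```c
#include <stddef.h>

/* Hypothetical, trimmed-down boot info: only the fields the sketch needs. */
struct boot_info_sketch {
    void (*modify_dtb)(const struct boot_info_sketch *info, void *fdt);
    void *modify_dtb_opaque;   /* extra arguments for modify_dtb */
};

/* Hypothetical parameters a platform-bus hook might want. */
struct bus_params_sketch {
    unsigned long base;
    unsigned long size;
};

/* Stand-in for the device tree: records what the hook was told to add. */
struct fdt_sketch {
    unsigned long node_base;
    unsigned long node_size;
    int nodes_added;
};

/* The hook casts the opaque pointer back to its own parameter type. */
static void add_bus_node(const struct boot_info_sketch *info, void *fdt)
{
    const struct bus_params_sketch *p = info->modify_dtb_opaque;
    struct fdt_sketch *tree = fdt;

    if (p == NULL) {
        return;   /* nothing to add without parameters */
    }
    tree->node_base = p->base;
    tree->node_size = p->size;
    tree->nodes_added++;
}

/* The generic loader only knows about the callback, not its parameters. */
static void run_modify_dtb(const struct boot_info_sketch *info, void *fdt)
{
    if (info->modify_dtb) {
        info->modify_dtb(info, fdt);
    }
}
```

The generic code stays unaware of the platform bus parameters; only the
hook that installed the opaque pointer interprets it.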

Signed-off-by: Eric Auger 
---
 include/hw/arm/arm.h | 4 
 1 file changed, 4 insertions(+)

diff --git a/include/hw/arm/arm.h b/include/hw/arm/arm.h
index 5f1ecb7..ff776fa 100644
--- a/include/hw/arm/arm.h
+++ b/include/hw/arm/arm.h
@@ -68,6 +68,10 @@ struct arm_boot_info {
 hwaddr dtb_start; /* start address of the dtb */
 hwaddr dtb_limit; /* upper RAM limit the dtb cannot overshoot */
 hwaddr entry;
+/* in case modify_dtb requires additional parameters to create the
+ * new nodes, use the following opaque
+ */
+void *modify_dtb_opaque;
 };
 void arm_load_kernel(ARMCPU *cpu, struct arm_boot_info *info);
 int arm_load_dtb(const struct arm_boot_info *binfo);
-- 
1.8.3.2




[Qemu-devel] [PATCH v3 2/2] vfio: use kvm_resamplefds_enabled()

2014-10-31 Thread Eric Auger
Use the kvm_resamplefds_enabled function

Signed-off-by: Eric Auger 
---
 hw/misc/vfio.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/hw/misc/vfio.c b/hw/misc/vfio.c
index b5e7981..75bfa1c 100644
--- a/hw/misc/vfio.c
+++ b/hw/misc/vfio.c
@@ -406,7 +406,7 @@ static void vfio_enable_intx_kvm(VFIODevice *vdev)
 
 if (!VFIO_ALLOW_KVM_INTX || !kvm_irqfds_enabled() ||
 vdev->intx.route.mode != PCI_INTX_ENABLED ||
-!kvm_check_extension(kvm_state, KVM_CAP_IRQFD_RESAMPLE)) {
+!kvm_resamplefds_enabled()) {
 return;
 }
 
@@ -568,8 +568,7 @@ static int vfio_enable_intx(VFIODevice *vdev)
  * Only conditional to avoid generating error messages on platforms
  * where we won't actually use the result anyway.
  */
-if (kvm_irqfds_enabled() &&
-kvm_check_extension(kvm_state, KVM_CAP_IRQFD_RESAMPLE)) {
+if (kvm_irqfds_enabled() && kvm_resamplefds_enabled()) {
 vdev->intx.route = pci_device_route_intx_to_irq(&vdev->pdev,
 vdev->intx.pin);
 }
-- 
1.8.3.2




[Qemu-devel] [PATCH v4 3/6] hw/arm/boot: do not free VirtBoardInfo fdt in arm_load_dtb

2014-10-31 Thread Eric Auger
Currently arm_load_dtb frees the fdt handle regardless of whether it was
allocated by load_device_tree or allocated externally.

When adding dynamic sysbus nodes after the first dtb load, we would like
to reuse the fdt used during the first load instead of re-creating the
whole device tree. If the fdt is destroyed, this is not possible.
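
The ownership rule can be sketched as follows: the loader frees the blob
only when it allocated it itself (a file name was given) and leaves an
externally provided blob alone so that it can be reused later. All names
here (loader_sketch, load_dtb_sketch, frees_done) are hypothetical, and
calloc() merely stands in for loading a file.

```c
#include <stdlib.h>

struct loader_sketch {
    const char *dtb_filename;  /* non-NULL: blob is allocated locally */
    void *external_fdt;        /* blob owned by the caller otherwise */
    size_t external_size;
};

static int frees_done;  /* counts local frees so the behaviour is observable */

/* Returns the blob size, or -1 on error. */
static long load_dtb_sketch(const struct loader_sketch *l)
{
    void *fdt;
    size_t size;

    if (l->dtb_filename) {
        /* Stand-in for reading the file: the loader owns this buffer. */
        size = 64;
        fdt = calloc(1, size);
        if (!fdt) {
            return -1;
        }
    } else {
        /* The caller owns this buffer and may want to modify it later. */
        fdt = l->external_fdt;
        size = l->external_size;
        if (!fdt) {
            return -1;
        }
    }

    /* ... the blob would be copied into guest memory here ... */

    if (l->dtb_filename) {
        free(fdt);      /* only free what was allocated in this function */
        frees_done++;
    }
    return (long)size;
}
```

With an external blob the caller keeps a valid handle after the call and
remains responsible for releasing it.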

Signed-off-by: Eric Auger 
---
 hw/arm/boot.c | 8 ++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/hw/arm/boot.c b/hw/arm/boot.c
index 9f0662e..0e4b078 100644
--- a/hw/arm/boot.c
+++ b/hw/arm/boot.c
@@ -427,12 +427,16 @@ int arm_load_dtb(const struct arm_boot_info *binfo)
  */
 rom_add_blob_fixed("dtb", fdt, size, binfo->dtb_start);
 
-g_free(fdt);
+if (binfo->dtb_filename) {
+g_free(fdt);
+}
 
 return size;
 
 fail:
-g_free(fdt);
+if (binfo->dtb_filename) {
+g_free(fdt);
+}
 return -1;
 }
 
-- 
1.8.3.2




[Qemu-devel] [PATCH v4 2/6] hw/arm/boot: dtb start and limit moved in arm_boot_info

2014-10-31 Thread Eric Auger
Two fields are added in arm_boot_info (dtb_start and dtb_limit). The
prototype of arm_load_dtb is changed to only use arm_boot_info.

The rationale for this change is that, when dealing with dynamic sysbus
devices, we need to update the device tree with dynamic device nodes
after the dtb is already loaded. Storing those parameters in
arm_boot_info avoids recomputing dtb_start and dtb_limit, as done in
arm_load_kernel.
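
A minimal sketch of how the two stored fields drive the size check is
given below. dtb_window_sketch and dtb_fits are hypothetical names; the
rule itself follows the arm_load_dtb documentation comment: a limit that
is not strictly greater than the start address means "no limit".

```c
#include <stdint.h>

struct dtb_window_sketch {
    uint64_t dtb_start;  /* address the blob is loaded at */
    uint64_t dtb_limit;  /* upper bound; <= dtb_start means "no limit" */
};

/* Returns 1 if a blob of `size` bytes may be placed at dtb_start, else 0. */
static int dtb_fits(const struct dtb_window_sketch *w, uint64_t size)
{
    if (w->dtb_limit > w->dtb_start &&
        size > w->dtb_limit - w->dtb_start) {
        return 0;   /* would overshoot the limit */
    }
    return 1;
}
```

Because both values travel with the boot info, a later reload of the
device tree can apply the same check without recomputing them.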

Signed-off-by: Eric Auger 
---
 hw/arm/boot.c| 38 +-
 include/hw/arm/arm.h |  5 +++--
 2 files changed, 24 insertions(+), 19 deletions(-)

diff --git a/hw/arm/boot.c b/hw/arm/boot.c
index f5714ea..9f0662e 100644
--- a/hw/arm/boot.c
+++ b/hw/arm/boot.c
@@ -314,24 +314,21 @@ static void set_kernel_args_old(const struct 
arm_boot_info *info)
 
 /**
  * arm_load_dtb() - load a device tree binary image into memory
- * @addr:   the address to load the image at
  * @binfo:  struct describing the boot environment
- * @addr_limit: upper limit of the available memory area at @addr
  *
  * Load a device tree supplied by the machine or by the user  with the
- * '-dtb' command line option, and put it at offset @addr in target
- * memory.
+ * '-dtb' command line option, and put it at offset binfo->dtb_start in
+ * target memory.
  *
- * If @addr_limit contains a meaningful value (i.e., it is strictly greater
- * than @addr), the device tree is only loaded if its size does not exceed
- * the limit.
+ * If binfo->dtb_limit contains a meaningful value (i.e., it is strictly
+ * greater than binfo->dtb_start), the device tree is only loaded if its
+ * size does not exceed this upper limit.
  *
  * Returns: the size of the device tree image on success,
  *  0 if the image size exceeds the limit,
  *  -1 on errors.
  */
-int arm_load_dtb(hwaddr addr, const struct arm_boot_info *binfo,
- hwaddr addr_limit)
+int arm_load_dtb(const struct arm_boot_info *binfo)
 {
 void *fdt = NULL;
 int size, rc;
@@ -360,7 +357,8 @@ int arm_load_dtb(hwaddr addr, const struct arm_boot_info 
*binfo,
 }
 }
 
-if (addr_limit > addr && size > (addr_limit - addr)) {
+if (binfo->dtb_limit > binfo->dtb_start &&
+size > (binfo->dtb_limit - binfo->dtb_start)) {
 /* Installing the device tree blob at addr would exceed addr_limit.
  * Whether this constitutes failure is up to the caller to decide,
  * so just return 0 as size, i.e., no error.
@@ -427,7 +425,7 @@ int arm_load_dtb(hwaddr addr, const struct arm_boot_info 
*binfo,
 /* Put the DTB into the memory map as a ROM image: this will ensure
  * the DTB is copied again upon reset, even if addr points into RAM.
  */
-rom_add_blob_fixed("dtb", fdt, size, addr);
+rom_add_blob_fixed("dtb", fdt, size, binfo->dtb_start);
 
 g_free(fdt);
 
@@ -504,7 +502,10 @@ void arm_load_kernel(ARMCPU *cpu, struct arm_boot_info 
*info)
 /* If we have a device tree blob, but no kernel to supply it to,
  * copy it to the base of RAM for a bootloader to pick up.
  */
-if (arm_load_dtb(info->loader_start, info, 0) < 0) {
+info->dtb_start = info->loader_start;
+info->dtb_limit = 0;
+
+if (arm_load_dtb(info) < 0) {
 exit(1);
 }
 }
@@ -572,7 +573,9 @@ void arm_load_kernel(ARMCPU *cpu, struct arm_boot_info 
*info)
 if (elf_low_addr < info->loader_start) {
 elf_low_addr = 0;
 }
-if (arm_load_dtb(info->loader_start, info, elf_low_addr) < 0) {
+info->dtb_start = info->loader_start;
+info->dtb_limit = elf_low_addr;
+if (arm_load_dtb(info) < 0) {
 exit(1);
 }
 }
@@ -635,12 +638,13 @@ void arm_load_kernel(ARMCPU *cpu, struct arm_boot_info 
*info)
  * kernels will trash anything in the 4K page the initrd
  * ends in, so make sure the DTB isn't caught up in that.
  */
-hwaddr dtb_start = QEMU_ALIGN_UP(info->initrd_start + initrd_size,
- 4096);
-if (arm_load_dtb(dtb_start, info, 0) < 0) {
+info->dtb_start = QEMU_ALIGN_UP(info->initrd_start + initrd_size,
+4096);
+info->dtb_limit = 0;
+if (arm_load_dtb(info) < 0) {
 exit(1);
 }
-fixupcontext[FIXUP_ARGPTR] = dtb_start;
+fixupcontext[FIXUP_ARGPTR] = info->dtb_start;
 } else {
 fixupcontext[FIXUP_ARGPTR] = info->loader_start + KERNEL_ARGS_ADDR;
 if (info->ram_size >= (1ULL << 32)) {
diff --git a/include/hw/ar

[Qemu-devel] [PATCH v3 0/2] actual checks of KVM_CAP_IRQFD and KVM_CAP_IRQFD_RESAMPLE

2014-10-31 Thread Eric Auger
This patch series replaces direct settings of kvm_irqfds_allowed
in architecture specific files with an actual check of the KVM_CAP_IRQFD
extension in kvm-all.c.

Also, a new kvm_resamplefds_enabled() function allows checking
KVM_CAP_IRQFD_RESAMPLE. In the second patch, the vfio device
becomes the first user of kvm_resamplefds_enabled().

Eric Auger (2):
  KVM_CAP_IRQFD and KVM_CAP_IRQFD_RESAMPLE checks
  vfio: use kvm_resamplefds_enabled()

 hw/intc/openpic_kvm.c |  1 -
 hw/intc/xics_kvm.c|  1 -
 hw/misc/vfio.c|  5 ++---
 include/sysemu/kvm.h  | 10 ++
 kvm-all.c |  7 +++
 target-i386/kvm.c |  1 -
 target-s390x/kvm.c|  1 -
 7 files changed, 19 insertions(+), 7 deletions(-)

-- 
1.8.3.2




Re: [Qemu-devel] [PATCH RESEND] vfio: migration to trace points

2014-10-31 Thread Eric Auger
On 10/31/2014 03:35 PM, Alex Williamson wrote:
> On Fri, 2014-10-31 at 13:44 +0000, Eric Auger wrote:
>> This patch removes all DPRINTF and replace them by trace points.
>> A few DPRINTF used in error cases were transformed into error_report.
>>
>> Signed-off-by: Eric Auger 
>>
>> ---
> 
> I've already got this one:
> 
> http://lists.gnu.org/archive/html/qemu-devel/2014-10/msg02323.html

Hi Alex

Yes, you're right. My sendmail --subject-prefix was wrong; it is the
same patch. I just resent it with the correct prefix.

Best Regards

Eric
> 
>>
>> - __func__ is removed since trace point name does the same job
>> - HWADDR_PRIx were replaced by PRIx64
>> - this transformation just is tested compiled on PCI.
>>   qemu configured with --enable-trace-backends=stderr
>> - in future, format strings and calls may be simplified by using a single
>>   name argument instead of domain, bus, slot, function.
>>
>> v1 (RFC) -> v2 (PATCH):
>> - restore original format strings since parsing now is OK after
>>   commit f9bbba9,
>>   [PATCH v2] trace: tighten up trace-events regex to fix bad parse
>> ---
>>  hw/misc/vfio.c | 403 
>> +
>>  trace-events   |  75 ++-
>>  2 files changed, 280 insertions(+), 198 deletions(-)
>>
>> diff --git a/hw/misc/vfio.c b/hw/misc/vfio.c
>> index 75bfa1c..cdf4922 100644
>> --- a/hw/misc/vfio.c
>> +++ b/hw/misc/vfio.c
>> @@ -40,15 +40,7 @@
>>  #include "sysemu/kvm.h"
>>  #include "sysemu/sysemu.h"
>>  #include "hw/misc/vfio.h"
>> -
>> -/* #define DEBUG_VFIO */
>> -#ifdef DEBUG_VFIO
>> -#define DPRINTF(fmt, ...) \
>> -do { fprintf(stderr, "vfio: " fmt, ## __VA_ARGS__); } while (0)
>> -#else
>> -#define DPRINTF(fmt, ...) \
>> -do { } while (0)
>> -#endif
>> +#include "trace.h"
>>  
>>  /* Extra debugging, trap acceleration paths for more logging */
>>  #define VFIO_ALLOW_MMAP 1
>> @@ -365,9 +357,9 @@ static void vfio_intx_interrupt(void *opaque)
>>  return;
>>  }
>>  
>> -DPRINTF("%s(%04x:%02x:%02x.%x) Pin %c\n", __func__, vdev->host.domain,
>> -vdev->host.bus, vdev->host.slot, vdev->host.function,
>> -'A' + vdev->intx.pin);
>> +trace_vfio_intx_interrupt(vdev->host.domain, vdev->host.bus,
>> +  vdev->host.slot, vdev->host.function,
>> +  'A' + vdev->intx.pin);
>>  
>>  vdev->intx.pending = true;
>>  pci_irq_assert(&vdev->pdev);
>> @@ -384,8 +376,8 @@ static void vfio_eoi(VFIODevice *vdev)
>>  return;
>>  }
>>  
>> -DPRINTF("%s(%04x:%02x:%02x.%x) EOI\n", __func__, vdev->host.domain,
>> -vdev->host.bus, vdev->host.slot, vdev->host.function);
>> +trace_vfio_eoi(vdev->host.domain, vdev->host.bus,
>> +   vdev->host.slot, vdev->host.function);
>>  
>>  vdev->intx.pending = false;
>>  pci_irq_deassert(&vdev->pdev);
>> @@ -454,9 +446,8 @@ static void vfio_enable_intx_kvm(VFIODevice *vdev)
>>  
>>  vdev->intx.kvm_accel = true;
>>  
>> -DPRINTF("%s(%04x:%02x:%02x.%x) KVM INTx accel enabled\n",
>> -__func__, vdev->host.domain, vdev->host.bus,
>> -vdev->host.slot, vdev->host.function);
>> +trace_vfio_enable_intx_kvm(vdev->host.domain, vdev->host.bus,
>> +   vdev->host.slot, vdev->host.function);
>>  
>>  return;
>>  
>> @@ -508,9 +499,8 @@ static void vfio_disable_intx_kvm(VFIODevice *vdev)
>>  /* If we've missed an event, let it re-fire through QEMU */
>>  vfio_unmask_intx(vdev);
>>  
>> -DPRINTF("%s(%04x:%02x:%02x.%x) KVM INTx accel disabled\n",
>> -__func__, vdev->host.domain, vdev->host.bus,
>> -vdev->host.slot, vdev->host.function);
>> +trace_vfio_disable_intx_kvm(vdev->host.domain, vdev->host.bus,
>> +vdev->host.slot, vdev->host.function);
>>  #endif
>>  }
>>  
>> @@ -529,9 +519,9 @@ static void vfio_update_irq(PCIDevice *pdev)
>>  return; /* Nothing changed */
>>  }
>>  
>> -DPRINTF("%s(%04x:%02x:%02x.%x) IRQ moved %d -> %d\n", __func__,
>> -vdev-

[Qemu-devel] [PATCH] KVM_CAP_IRQFD and KVM_CAP_IRQFD_RESAMPLE checks

2014-08-29 Thread Eric Auger
Compute kvm_irqfds_allowed by checking the KVM_CAP_IRQFD extension.
Remove direct settings in architecture specific files.

Add a new kvm_resamplefds_allowed variable, initialized by
checking the KVM_CAP_IRQFD_RESAMPLE extension. Add a corresponding
kvm_resamplefds_enabled() function.
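
The idea of probing the capability once at init and caching the result
in a flag can be sketched like this. Everything here is a stand-in:
cap_sketch replaces the real KVM_CAP_* constants, check_extension_fn
replaces the kernel query, and the two fake "kernels" exist only so the
behaviour can be exercised without /dev/kvm.

```c
#include <stdbool.h>

/* Hypothetical capability identifiers; the real ones are KVM_CAP_* values. */
enum cap_sketch { CAP_IRQFD_SKETCH, CAP_IRQFD_RESAMPLE_SKETCH };

/* Stand-in for the kernel query: a positive value means "supported". */
typedef int (*check_extension_fn)(enum cap_sketch cap);

static bool irqfds_allowed;
static bool resamplefds_allowed;

/* Probe once at init time and cache the answers. */
static void caps_init(check_extension_fn check)
{
    irqfds_allowed = check(CAP_IRQFD_SKETCH) > 0;
    resamplefds_allowed = check(CAP_IRQFD_RESAMPLE_SKETCH) > 0;
}

/* Cheap accessors used by device code afterwards. */
static bool irqfds_enabled(void) { return irqfds_allowed; }
static bool resamplefds_enabled(void) { return resamplefds_allowed; }

/* Two fake kernels for illustration. */
static int kernel_without_resample(enum cap_sketch cap)
{
    return cap == CAP_IRQFD_SKETCH ? 1 : 0;
}

static int kernel_with_both(enum cap_sketch cap)
{
    (void)cap;
    return 1;
}
```

Device code then tests the cached flags instead of repeating the
extension query or relying on per-architecture assignments.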

Signed-off-by: Eric Auger 

---

in practice KVM_CAP_IRQFD_RESAMPLE seems to be always enabled
as soon as kernel has HAVE_KVM_IRQFD so the resamplefd check
may be unnecessary.
---
 hw/intc/openpic_kvm.c |  1 -
 hw/intc/xics_kvm.c|  1 -
 include/sysemu/kvm.h  | 10 ++
 kvm-all.c |  7 +++
 target-i386/kvm.c |  1 -
 target-s390x/kvm.c|  1 -
 6 files changed, 17 insertions(+), 4 deletions(-)

diff --git a/hw/intc/openpic_kvm.c b/hw/intc/openpic_kvm.c
index e3bce04..6cef3b1 100644
--- a/hw/intc/openpic_kvm.c
+++ b/hw/intc/openpic_kvm.c
@@ -229,7 +229,6 @@ static void kvm_openpic_realize(DeviceState *dev, Error 
**errp)
 kvm_irqchip_add_irq_route(kvm_state, i, 0, i);
 }
 
-kvm_irqfds_allowed = true;
 kvm_msi_via_irqfd_allowed = true;
 kvm_gsi_routing_allowed = true;
 
diff --git a/hw/intc/xics_kvm.c b/hw/intc/xics_kvm.c
index 20b19e9..c15453f 100644
--- a/hw/intc/xics_kvm.c
+++ b/hw/intc/xics_kvm.c
@@ -448,7 +448,6 @@ static void xics_kvm_realize(DeviceState *dev, Error **errp)
 }
 
 kvm_kernel_irqchip = true;
-kvm_irqfds_allowed = true;
 kvm_msi_via_irqfd_allowed = true;
 kvm_gsi_direct_mapping = true;
 
diff --git a/include/sysemu/kvm.h b/include/sysemu/kvm.h
index 174ea36..69c4d0f 100644
--- a/include/sysemu/kvm.h
+++ b/include/sysemu/kvm.h
@@ -45,6 +45,7 @@ extern bool kvm_async_interrupts_allowed;
 extern bool kvm_halt_in_kernel_allowed;
 extern bool kvm_eventfds_allowed;
 extern bool kvm_irqfds_allowed;
+extern bool kvm_resamplefds_allowed;
 extern bool kvm_msi_via_irqfd_allowed;
 extern bool kvm_gsi_routing_allowed;
 extern bool kvm_gsi_direct_mapping;
@@ -102,6 +103,15 @@ extern bool kvm_readonly_mem_allowed;
 #define kvm_irqfds_enabled() (kvm_irqfds_allowed)
 
 /**
+ * kvm_resamplefds_enabled:
+ *
+ * Returns: true if we can use resamplefds to inject interrupts into
+ * a KVM CPU (ie the kernel supports resamplefds and we are running
+ * with a configuration where it is meaningful to use them).
+ */
+#define kvm_resamplefds_enabled() (kvm_resamplefds_allowed)
+
+/**
  * kvm_msi_via_irqfd_enabled:
  *
  * Returns: true if we can route a PCI MSI (Message Signaled Interrupt)
diff --git a/kvm-all.c b/kvm-all.c
index 1402f4f..fdc97d6 100644
--- a/kvm-all.c
+++ b/kvm-all.c
@@ -116,6 +116,7 @@ bool kvm_async_interrupts_allowed;
 bool kvm_halt_in_kernel_allowed;
 bool kvm_eventfds_allowed;
 bool kvm_irqfds_allowed;
+bool kvm_resamplefds_allowed;
 bool kvm_msi_via_irqfd_allowed;
 bool kvm_gsi_routing_allowed;
 bool kvm_gsi_direct_mapping;
@@ -1548,6 +1549,12 @@ int kvm_init(MachineClass *mc)
 kvm_eventfds_allowed =
 (kvm_check_extension(s, KVM_CAP_IOEVENTFD) > 0);
 
+kvm_irqfds_allowed =
+(kvm_check_extension(s, KVM_CAP_IRQFD) > 0);
+
+kvm_resamplefds_allowed =
+(kvm_check_extension(s, KVM_CAP_IRQFD_RESAMPLE) > 0);
+
 ret = kvm_arch_init(s);
 if (ret < 0) {
 goto err;
diff --git a/target-i386/kvm.c b/target-i386/kvm.c
index 097fe11..4bc2d80 100644
--- a/target-i386/kvm.c
+++ b/target-i386/kvm.c
@@ -2447,7 +2447,6 @@ void kvm_arch_init_irq_routing(KVMState *s)
  * irqchip, so we can use irqfds, and on x86 we know
  * we can use msi via irqfd and GSI routing.
  */
-kvm_irqfds_allowed = true;
 kvm_msi_via_irqfd_allowed = true;
 kvm_gsi_routing_allowed = true;
 }
diff --git a/target-s390x/kvm.c b/target-s390x/kvm.c
index a32d91a..4d2bca6 100644
--- a/target-s390x/kvm.c
+++ b/target-s390x/kvm.c
@@ -1281,7 +1281,6 @@ void kvm_arch_init_irq_routing(KVMState *s)
  * have to override the common code kvm_halt_in_kernel_allowed setting.
  */
 if (kvm_check_extension(s, KVM_CAP_IRQ_ROUTING)) {
-kvm_irqfds_allowed = true;
 kvm_gsi_routing_allowed = true;
 kvm_halt_in_kernel_allowed = false;
 }
-- 
1.8.3.2




Re: [Qemu-devel] [PATCH] KVM_CAP_IRQFD and KVM_CAP_IRQFD_RESAMPLE checks

2014-09-01 Thread Eric Auger
On 09/01/2014 12:49 PM, Paolo Bonzini wrote:
> Il 29/08/2014 19:38, Eric Auger ha scritto:
>> Compute kvm_irqfds_allowed by checking the KVM_CAP_IRQFD extension.
>> Remove direct settings in architecture specific files.
>>
>> Add a new kvm_resamplefds_allowed variable, initialized by
>> checking the KVM_CAP_IRQFD_RESAMPLE extension. Add a corresponding
>> kvm_resamplefds_enabled() function.
> 
> Please add a user too (in hw/misc/vfio.c).

Hi Paolo,

OK sure. Thanks

Best Regards

Eric
> 
> Otherwise looks good, thanks!
> 
> Paolo
> 
>> Signed-off-by: Eric Auger 
>>
>> ---
>>
>> in practice KVM_CAP_IRQFD_RESAMPLE seems to be always enabled
>> as soon as kernel has HAVE_KVM_IRQFD so the resamplefd check
>> may be unnecessary.
>> ---
>>  hw/intc/openpic_kvm.c |  1 -
>>  hw/intc/xics_kvm.c|  1 -
>>  include/sysemu/kvm.h  | 10 ++
>>  kvm-all.c |  7 +++
>>  target-i386/kvm.c |  1 -
>>  target-s390x/kvm.c|  1 -
>>  6 files changed, 17 insertions(+), 4 deletions(-)
>>
>> diff --git a/hw/intc/openpic_kvm.c b/hw/intc/openpic_kvm.c
>> index e3bce04..6cef3b1 100644
>> --- a/hw/intc/openpic_kvm.c
>> +++ b/hw/intc/openpic_kvm.c
>> @@ -229,7 +229,6 @@ static void kvm_openpic_realize(DeviceState *dev, Error 
>> **errp)
>>  kvm_irqchip_add_irq_route(kvm_state, i, 0, i);
>>  }
>>  
>> -kvm_irqfds_allowed = true;
>>  kvm_msi_via_irqfd_allowed = true;
>>  kvm_gsi_routing_allowed = true;
>>  
>> diff --git a/hw/intc/xics_kvm.c b/hw/intc/xics_kvm.c
>> index 20b19e9..c15453f 100644
>> --- a/hw/intc/xics_kvm.c
>> +++ b/hw/intc/xics_kvm.c
>> @@ -448,7 +448,6 @@ static void xics_kvm_realize(DeviceState *dev, Error 
>> **errp)
>>  }
>>  
>>  kvm_kernel_irqchip = true;
>> -kvm_irqfds_allowed = true;
>>  kvm_msi_via_irqfd_allowed = true;
>>  kvm_gsi_direct_mapping = true;
>>  
>> diff --git a/include/sysemu/kvm.h b/include/sysemu/kvm.h
>> index 174ea36..69c4d0f 100644
>> --- a/include/sysemu/kvm.h
>> +++ b/include/sysemu/kvm.h
>> @@ -45,6 +45,7 @@ extern bool kvm_async_interrupts_allowed;
>>  extern bool kvm_halt_in_kernel_allowed;
>>  extern bool kvm_eventfds_allowed;
>>  extern bool kvm_irqfds_allowed;
>> +extern bool kvm_resamplefds_allowed;
>>  extern bool kvm_msi_via_irqfd_allowed;
>>  extern bool kvm_gsi_routing_allowed;
>>  extern bool kvm_gsi_direct_mapping;
>> @@ -102,6 +103,15 @@ extern bool kvm_readonly_mem_allowed;
>>  #define kvm_irqfds_enabled() (kvm_irqfds_allowed)
>>  
>>  /**
>> + * kvm_resamplefds_enabled:
>> + *
>> + * Returns: true if we can use resamplefds to inject interrupts into
>> + * a KVM CPU (ie the kernel supports resamplefds and we are running
>> + * with a configuration where it is meaningful to use them).
>> + */
>> +#define kvm_resamplefds_enabled() (kvm_resamplefds_allowed)
>> +
>> +/**
>>   * kvm_msi_via_irqfd_enabled:
>>   *
>>   * Returns: true if we can route a PCI MSI (Message Signaled Interrupt)
>> diff --git a/kvm-all.c b/kvm-all.c
>> index 1402f4f..fdc97d6 100644
>> --- a/kvm-all.c
>> +++ b/kvm-all.c
>> @@ -116,6 +116,7 @@ bool kvm_async_interrupts_allowed;
>>  bool kvm_halt_in_kernel_allowed;
>>  bool kvm_eventfds_allowed;
>>  bool kvm_irqfds_allowed;
>> +bool kvm_resamplefds_allowed;
>>  bool kvm_msi_via_irqfd_allowed;
>>  bool kvm_gsi_routing_allowed;
>>  bool kvm_gsi_direct_mapping;
>> @@ -1548,6 +1549,12 @@ int kvm_init(MachineClass *mc)
>>  kvm_eventfds_allowed =
>>  (kvm_check_extension(s, KVM_CAP_IOEVENTFD) > 0);
>>  
>> +kvm_irqfds_allowed =
>> +(kvm_check_extension(s, KVM_CAP_IRQFD) > 0);
>> +
>> +kvm_resamplefds_allowed =
>> +(kvm_check_extension(s, KVM_CAP_IRQFD_RESAMPLE) > 0);
>> +
>>  ret = kvm_arch_init(s);
>>  if (ret < 0) {
>>  goto err;
>> diff --git a/target-i386/kvm.c b/target-i386/kvm.c
>> index 097fe11..4bc2d80 100644
>> --- a/target-i386/kvm.c
>> +++ b/target-i386/kvm.c
>> @@ -2447,7 +2447,6 @@ void kvm_arch_init_irq_routing(KVMState *s)
>>   * irqchip, so we can use irqfds, and on x86 we know
>>   * we can use msi via irqfd and GSI routing.
>>   */
>> -kvm_irqfds_allowed = true;
>>  kvm_msi_via_irqfd_allowed = true;
>>  kvm_gsi_routing_allowed = true;
>>  }
>> diff --git a/target-s390x/kvm.c b/target-s390x/kvm.c
>> index a32d91a..4d2bca6 100644
>> --- a/target-s390x/kvm.c
>> +++ b/target-s390x/kvm.c
>> @@ -1281,7 +1281,6 @@ void kvm_arch_init_irq_routing(KVMState *s)
>>   * have to override the common code kvm_halt_in_kernel_allowed setting.
>>   */
>>  if (kvm_check_extension(s, KVM_CAP_IRQ_ROUTING)) {
>> -kvm_irqfds_allowed = true;
>>  kvm_gsi_routing_allowed = true;
>>  kvm_halt_in_kernel_allowed = false;
>>  }
>>
> 




Re: [Qemu-devel] [PATCH v5 06/10] hw/vfio: create common module

2014-09-01 Thread Eric Auger
On 08/13/2014 09:59 PM, Alex Williamson wrote:
> On Tue, 2014-08-12 at 08:09 +0200, Eric Auger wrote:
>> On 08/11/2014 09:25 PM, Alex Williamson wrote:
>>> On Sat, 2014-08-09 at 15:25 +0100, Eric Auger wrote:
>>>> diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h
>>>> new file mode 100644
>>>> index 000..4684ee5
>>>> --- /dev/null
>>>> +++ b/include/hw/vfio/vfio-common.h
>>>> @@ -0,0 +1,151 @@
>>>> +/*
>>>> + * common header for vfio based device assignment support
>>>> + *
>>>> + * Copyright Red Hat, Inc. 2012
>>>> + *
>>>> + * Authors:
>>>> + *  Alex Williamson 
>>>> + *
>>>> + * This work is licensed under the terms of the GNU GPL, version 2.  See
>>>> + * the COPYING file in the top-level directory.
>>>> + *
>>>> + * Based on qemu-kvm device-assignment:
>>>> + *  Adapted for KVM by Qumranet.
>>>> + *  Copyright (c) 2007, Neocleus, Alex Novik (a...@neocleus.com)
>>>> + *  Copyright (c) 2007, Neocleus, Guy Zana (g...@neocleus.com)
>>>> + *  Copyright (C) 2008, Qumranet, Amit Shah (amit.s...@qumranet.com)
>>>> + *  Copyright (C) 2008, Red Hat, Amit Shah (amit.s...@redhat.com)
>>>> + *  Copyright (C) 2008, IBM, Muli Ben-Yehuda (m...@il.ibm.com)
>>>> + */
>>>> +#ifndef HW_VFIO_VFIO_COMMON_H
>>>> +#define HW_VFIO_VFIO_COMMON_H
>>>> +
>>>> +#include "qemu-common.h"
>>>> +#include "exec/address-spaces.h"
>>>> +#include "exec/memory.h"
>>>> +#include "qemu/queue.h"
>>>> +#include "qemu/notify.h"
>>>> +
>>>> +/*#define DEBUG_VFIO*/
>>>> +#ifdef DEBUG_VFIO
>>>> +#define DPRINTF(fmt, ...) \
>>>> +do { fprintf(stderr, "vfio: " fmt, ## __VA_ARGS__); } while (0)
>>>> +#else
>>>> +#define DPRINTF(fmt, ...) \
>>>> +do { } while (0)
>>>> +#endif
>>>
>>>
>>> DPRINTF also need to be renamed to avoid conflicting namespace issues.
>> Hi Alex,
>>
>> OK.
>>
>> As I am going to touch at traces,
>> - are you OK if I use the new .name field to simply format strings?
> 
> Sure, that's fine.
> 
>> DPRINTF("%s(%04x:%02x:%02x.%x) Pin %c\n", __func__, vdev->host.domain,
>> vdev->host.bus, vdev->host.slot, vdev->host.function,
>> 'A' + vdev->intx.pin);
>> - Also Alex was suggesting to use trace points. What is your position
>> about that? Also I am not 100% sure of what it consists in? is it trace
>> events as documented in docs/tracing.txt
> 
> I think it would be a great conversion, but it's not required.  Thanks,

Hi Alex,

I am currently progressing on the conversion to trace points (I did it
for platform and common and am now doing it for PCI). I wonder whether it
makes sense to convert all DPRINTFs into trace points or only a
subset (state transitions, ...). Would you accept a mixture of DPRINTFs
and trace points, or do you advise converting everything?

Also the tracing.txt doc says we should use the name of the function as
prefix. That being said it could be interesting to trace all pci* or all
platform* and wildcard seems to work fine to select the trace-events. So
my second question is would you accept using pci__* as a
generic pattern.
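
To illustrate why a common prefix makes wildcard selection convenient,
here is a small sketch using POSIX fnmatch(). The event names are
hypothetical examples, not entries from trace-events, and QEMU's own
pattern matcher may behave differently from fnmatch().

```c
#include <fnmatch.h>
#include <stddef.h>

/* Count the names that match a shell-style wildcard pattern. */
static size_t count_matching(const char *pattern,
                             const char *const *names, size_t n)
{
    size_t hits = 0;

    for (size_t i = 0; i < n; i++) {
        if (fnmatch(pattern, names[i], 0) == 0) {
            hits++;
        }
    }
    return hits;
}

/* Hypothetical trace event names sharing per-bus prefixes. */
static const char *const sketch_events[] = {
    "vfio_pci_read_config",
    "vfio_pci_write_config",
    "vfio_platform_eoi",
    "vfio_region_write",
};
```

With such prefixes, one pattern selects all PCI events, another all
platform events, and a shorter one everything VFIO-related.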

Thanks in advance

Best Regards

Eric
> 
> Alex
> 




Re: [Qemu-devel] [PATCH v5 06/10] hw/vfio: create common module

2014-09-02 Thread Eric Auger
On 09/01/2014 07:41 PM, Alexander Graf wrote:
> 
> 
>> Am 01.09.2014 um 18:31 schrieb Eric Auger :
>>
>>> On 08/13/2014 09:59 PM, Alex Williamson wrote:
>>>> On Tue, 2014-08-12 at 08:09 +0200, Eric Auger wrote:
>>>>> On 08/11/2014 09:25 PM, Alex Williamson wrote:
>>>>>> On Sat, 2014-08-09 at 15:25 +0100, Eric Auger wrote:
>>>>>> diff --git a/include/hw/vfio/vfio-common.h 
>>>>>> b/include/hw/vfio/vfio-common.h
>>>>>> new file mode 100644
>>>>>> index 000..4684ee5
>>>>>> --- /dev/null
>>>>>> +++ b/include/hw/vfio/vfio-common.h
>>>>>> @@ -0,0 +1,151 @@
>>>>>> +/*
>>>>>> + * common header for vfio based device assignment support
>>>>>> + *
>>>>>> + * Copyright Red Hat, Inc. 2012
>>>>>> + *
>>>>>> + * Authors:
>>>>>> + *  Alex Williamson 
>>>>>> + *
>>>>>> + * This work is licensed under the terms of the GNU GPL, version 2.  See
>>>>>> + * the COPYING file in the top-level directory.
>>>>>> + *
>>>>>> + * Based on qemu-kvm device-assignment:
>>>>>> + *  Adapted for KVM by Qumranet.
>>>>>> + *  Copyright (c) 2007, Neocleus, Alex Novik (a...@neocleus.com)
>>>>>> + *  Copyright (c) 2007, Neocleus, Guy Zana (g...@neocleus.com)
>>>>>> + *  Copyright (C) 2008, Qumranet, Amit Shah (amit.s...@qumranet.com)
>>>>>> + *  Copyright (C) 2008, Red Hat, Amit Shah (amit.s...@redhat.com)
>>>>>> + *  Copyright (C) 2008, IBM, Muli Ben-Yehuda (m...@il.ibm.com)
>>>>>> + */
>>>>>> +#ifndef HW_VFIO_VFIO_COMMON_H
>>>>>> +#define HW_VFIO_VFIO_COMMON_H
>>>>>> +
>>>>>> +#include "qemu-common.h"
>>>>>> +#include "exec/address-spaces.h"
>>>>>> +#include "exec/memory.h"
>>>>>> +#include "qemu/queue.h"
>>>>>> +#include "qemu/notify.h"
>>>>>> +
>>>>>> +/*#define DEBUG_VFIO*/
>>>>>> +#ifdef DEBUG_VFIO
>>>>>> +#define DPRINTF(fmt, ...) \
>>>>>> +do { fprintf(stderr, "vfio: " fmt, ## __VA_ARGS__); } while (0)
>>>>>> +#else
>>>>>> +#define DPRINTF(fmt, ...) \
>>>>>> +do { } while (0)
>>>>>> +#endif
>>>>>
>>>>>
>>>>> DPRINTF also need to be renamed to avoid conflicting namespace issues.
>>>> Hi Alex,
>>>>
>>>> OK.
>>>>
>>>> As I am going to touch at traces,
>>>> - are you OK if I use the new .name field to simply format strings?
>>>
>>> Sure, that's fine.
>>>
>>>>DPRINTF("%s(%04x:%02x:%02x.%x) Pin %c\n", __func__, vdev->host.domain,
>>>>vdev->host.bus, vdev->host.slot, vdev->host.function,
>>>>'A' + vdev->intx.pin);
>>>> - Also Alex was suggesting to use trace points. What is your position
>>>> about that? Also I am not 100% sure of what it consists in? is it trace
>>>> events as documented in docs/tracing.txt
>>>
>>> I think it would be a great conversion, but it's not required.  Thanks,
>>
>> Hi Alex,
>>
>> I am currently progressing on the conversion to trace points (I did it
>> for platform and common and now do the job for PCI). I wonder whether it
>> makes sense I convert all DPRINTF into trace-points or only convert a
>> subset (state transitions, ...). Would you accept a mixture of DPRINTFs
>> and trace-points or do you advise to convert everything?
> 
> Yeah, it's perfectly good to even just not introduce new dprintfs.
ok thanks
> 
>>
>> Also the tracing.txt doc says we should use the name of the function as
>> prefix. That being said, it could be interesting to trace all pci* or all
>> platform* events, and wildcards seem to work fine to select trace events. So
>> my second question is: would you accept using pci__* as a
>> generic pattern?
> 
> Not sure - maybe be more explicit and call it vfio_pci_...?
Well, maybe as a first draft I will follow the tracing.txt guideline, and
both Alexes can tell me what you think of the outcome. Anyway, it
is not a big deal to change it afterwards.

Thanks

Eric
> 
> 
> Alex
> 
>>
>> Thanks in advance
>>
>> Best Regards
>>
>> Eric
>>>
>>> Alex
>>
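
The prefix question above comes down to how a trailing wildcard selects
events by name. A minimal sketch in C, assuming a pattern with at most one
'*' at the end; event_matches() is a hypothetical helper written for
illustration, and how QEMU itself matches patterns is not shown in this
thread:

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical helper, not QEMU code: match a trace event name against
 * a pattern that may end in a single '*' wildcard.
 */
bool event_matches(const char *pattern, const char *name)
{
    assert(pattern != NULL && name != NULL);
    size_t len = strlen(pattern);

    if (len > 0 && pattern[len - 1] == '*') {
        /* Compare everything before the trailing '*' as a prefix. */
        return strncmp(pattern, name, len - 1) == 0;
    }
    /* No wildcard: the names must be identical. */
    return strcmp(pattern, name) == 0;
}
```

With a common vfio_pci_ prefix, a pattern such as "vfio_pci_*" would select
every PCI event while leaving platform events alone, which is the property
the explicit prefix is meant to provide.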




[Qemu-devel] [RFC] vfio: migration to trace points

2014-09-03 Thread Eric Auger
This patch removes all DPRINTFs and replaces them with trace points.
A few DPRINTFs used in error cases were transformed into error_report.

Signed-off-by: Eric Auger 

---

- __func__ is removed since trace point name does the same job
- HWADDR_PRIx were replaced by PRIx64

Besides those changes, format strings were kept the same. In a few
cases, however, I was forced to change them due to parsing errors
(always related to parenthesis handling). This is indicated in
trace-events. Cases that are not correctly handled are given below:
- "(%04x:%02x:%02x.%x)" needs to be replaced by " (%04x:%02x:%02x.%x)"
- "%s read(%04x:%02x:%02x.%x:BAR%d+0x%"PRIx64", %d) = 0x%"PRIx64 ->
  "%s read(%04x:%02x:%02x.%x:BAR%d+0x%"PRIx64", %d = 0x%"PRIx64
- "%s write(%04x:%02x:%02x.%x:BAR%d+0x%"PRIx64", 0x%"PRIx64", %d)" ->
  "%s write(%04x:%02x:%02x.%x:BAR%d+0x%"PRIx64", 0x%"PRIx64", %d"
This is a temporary fix.

- This leads to a large number of trace points, some of which may not
  be suitable as trace points; I am not sure.
- This transformation is compile-tested only on PCI. It was tested on
  platform, with qemu configured with --enable-trace-backends=stderr.
- In future, format strings and calls may be simplified by using a single
  name argument instead of domain, bus, slot, function.
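
The last point can be sketched as follows. This is only an illustration
under assumptions: HostAddr, host_addr_name() and format_intx_event() are
hypothetical names, not part of the patch, and the real trace functions are
generated from trace-events rather than written by hand.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-in for the host address fields used in the patch. */
typedef struct HostAddr {
    unsigned domain, bus, slot, function;
} HostAddr;

/* Format "dddd:bb:ss.f" once; returns the snprintf() result. */
int host_addr_name(const HostAddr *h, char *buf, size_t len)
{
    assert(h != NULL && buf != NULL);
    return snprintf(buf, len, "%04x:%02x:%02x.%x",
                    h->domain, h->bus, h->slot, h->function);
}

/*
 * Illustrative trace-style helper: one name argument replaces the four
 * domain/bus/slot/function arguments in every call site.
 */
int format_intx_event(char *out, size_t len, const char *name, int pin)
{
    assert(out != NULL && name != NULL);
    return snprintf(out, len, " (%s) Pin %c", name, 'A' + pin);
}
```

The name would be built once at device init time, so each call site and
each format string shrinks to a single %s.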
---
 hw/misc/vfio.c | 403 +
 trace-events   |  79 +++
 2 files changed, 285 insertions(+), 197 deletions(-)

diff --git a/hw/misc/vfio.c b/hw/misc/vfio.c
index 40dcaa6..6b6dee9 100644
--- a/hw/misc/vfio.c
+++ b/hw/misc/vfio.c
@@ -40,15 +40,7 @@
 #include "sysemu/kvm.h"
 #include "sysemu/sysemu.h"
 #include "hw/misc/vfio.h"
-
-/* #define DEBUG_VFIO */
-#ifdef DEBUG_VFIO
-#define DPRINTF(fmt, ...) \
-do { fprintf(stderr, "vfio: " fmt, ## __VA_ARGS__); } while (0)
-#else
-#define DPRINTF(fmt, ...) \
-do { } while (0)
-#endif
+#include "trace.h"
 
 /* Extra debugging, trap acceleration paths for more logging */
 #define VFIO_ALLOW_MMAP 1
@@ -365,9 +357,9 @@ static void vfio_intx_interrupt(void *opaque)
 return;
 }
 
-DPRINTF("%s(%04x:%02x:%02x.%x) Pin %c\n", __func__, vdev->host.domain,
-vdev->host.bus, vdev->host.slot, vdev->host.function,
-'A' + vdev->intx.pin);
+trace_vfio_intx_interrupt(vdev->host.domain, vdev->host.bus,
+  vdev->host.slot, vdev->host.function,
+  'A' + vdev->intx.pin);
 
 vdev->intx.pending = true;
 pci_irq_assert(&vdev->pdev);
@@ -384,8 +376,8 @@ static void vfio_eoi(VFIODevice *vdev)
 return;
 }
 
-DPRINTF("%s(%04x:%02x:%02x.%x) EOI\n", __func__, vdev->host.domain,
-vdev->host.bus, vdev->host.slot, vdev->host.function);
+trace_vfio_eoi(vdev->host.domain, vdev->host.bus,
+   vdev->host.slot, vdev->host.function);
 
 vdev->intx.pending = false;
 pci_irq_deassert(&vdev->pdev);
@@ -454,9 +446,8 @@ static void vfio_enable_intx_kvm(VFIODevice *vdev)
 
 vdev->intx.kvm_accel = true;
 
-DPRINTF("%s(%04x:%02x:%02x.%x) KVM INTx accel enabled\n",
-__func__, vdev->host.domain, vdev->host.bus,
-vdev->host.slot, vdev->host.function);
+trace_vfio_enable_intx_kvm(vdev->host.domain, vdev->host.bus,
+   vdev->host.slot, vdev->host.function);
 
 return;
 
@@ -508,9 +499,8 @@ static void vfio_disable_intx_kvm(VFIODevice *vdev)
 /* If we've missed an event, let it re-fire through QEMU */
 vfio_unmask_intx(vdev);
 
-DPRINTF("%s(%04x:%02x:%02x.%x) KVM INTx accel disabled\n",
-__func__, vdev->host.domain, vdev->host.bus,
-vdev->host.slot, vdev->host.function);
+trace_vfio_disable_intx_kvm(vdev->host.domain, vdev->host.bus,
+vdev->host.slot, vdev->host.function);
 #endif
 }
 
@@ -529,9 +519,9 @@ static void vfio_update_irq(PCIDevice *pdev)
 return; /* Nothing changed */
 }
 
-DPRINTF("%s(%04x:%02x:%02x.%x) IRQ moved %d -> %d\n", __func__,
-vdev->host.domain, vdev->host.bus, vdev->host.slot,
-vdev->host.function, vdev->intx.route.irq, route.irq);
+trace_vfio_update_irq(vdev->host.domain, vdev->host.bus,
+  vdev->host.slot, vdev->host.function,
+  vdev->intx.route.irq, route.irq);
 
 vfio_disable_intx_kvm(vdev);
 
@@ -607,8 +597,8 @@ static int vfio_enable_intx(VFIODevice *vdev)
 
 vdev->interrupt = VFIO_INT_INTx;
 
-DPRINTF("%s(%04x:%02x:%02x.%x)\n", __func__, vdev->host.domain,

[Qemu-devel] [PATCH v2 0/2] actual checks of KVM_CAP_IRQFD and KVM_CAP_IRQFD_RESAMPLE

2014-09-03 Thread Eric Auger
This patch series replaces direct settings of kvm_irqfds_allowed
with actual checks of the KVM_CAP_IRQFD extension. In addition, a new
kvm_resamplefds_enabled() makes it possible to check KVM_CAP_IRQFD_RESAMPLE.

In the second patch, the vfio device is the first user of
kvm_resamplefds_enabled().

Eric Auger (2):
  KVM_CAP_IRQFD and KVM_CAP_IRQFD_RESAMPLE checks
  vfio: use kvm_resamplefds_enabled()

 hw/intc/openpic_kvm.c |  1 -
 hw/intc/xics_kvm.c|  1 -
 hw/misc/vfio.c|  5 ++---
 include/sysemu/kvm.h  | 10 ++
 kvm-all.c |  7 +++
 target-i386/kvm.c |  1 -
 target-s390x/kvm.c|  1 -
 7 files changed, 19 insertions(+), 7 deletions(-)

-- 
1.8.3.2




[Qemu-devel] [PATCH v2 1/2] KVM_CAP_IRQFD and KVM_CAP_IRQFD_RESAMPLE checks

2014-09-03 Thread Eric Auger
Compute kvm_irqfds_allowed by checking the KVM_CAP_IRQFD extension.
Remove direct settings in architecture specific files.

Add a new kvm_resamplefds_allowed variable, initialized by
checking the KVM_CAP_IRQFD_RESAMPLE extension. Add a corresponding
kvm_resamplefds_enabled() function.

Signed-off-by: Eric Auger 

---

In practice, KVM_CAP_IRQFD_RESAMPLE seems to be always enabled
as soon as the kernel has HAVE_KVM_IRQFD, so the resamplefd check
may be unnecessary.
---
 hw/intc/openpic_kvm.c |  1 -
 hw/intc/xics_kvm.c|  1 -
 include/sysemu/kvm.h  | 10 ++
 kvm-all.c |  7 +++
 target-i386/kvm.c |  1 -
 target-s390x/kvm.c|  1 -
 6 files changed, 17 insertions(+), 4 deletions(-)

diff --git a/hw/intc/openpic_kvm.c b/hw/intc/openpic_kvm.c
index e3bce04..6cef3b1 100644
--- a/hw/intc/openpic_kvm.c
+++ b/hw/intc/openpic_kvm.c
@@ -229,7 +229,6 @@ static void kvm_openpic_realize(DeviceState *dev, Error **errp)
 kvm_irqchip_add_irq_route(kvm_state, i, 0, i);
 }
 
-kvm_irqfds_allowed = true;
 kvm_msi_via_irqfd_allowed = true;
 kvm_gsi_routing_allowed = true;
 
diff --git a/hw/intc/xics_kvm.c b/hw/intc/xics_kvm.c
index 20b19e9..c15453f 100644
--- a/hw/intc/xics_kvm.c
+++ b/hw/intc/xics_kvm.c
@@ -448,7 +448,6 @@ static void xics_kvm_realize(DeviceState *dev, Error **errp)
 }
 
 kvm_kernel_irqchip = true;
-kvm_irqfds_allowed = true;
 kvm_msi_via_irqfd_allowed = true;
 kvm_gsi_direct_mapping = true;
 
diff --git a/include/sysemu/kvm.h b/include/sysemu/kvm.h
index 174ea36..69c4d0f 100644
--- a/include/sysemu/kvm.h
+++ b/include/sysemu/kvm.h
@@ -45,6 +45,7 @@ extern bool kvm_async_interrupts_allowed;
 extern bool kvm_halt_in_kernel_allowed;
 extern bool kvm_eventfds_allowed;
 extern bool kvm_irqfds_allowed;
+extern bool kvm_resamplefds_allowed;
 extern bool kvm_msi_via_irqfd_allowed;
 extern bool kvm_gsi_routing_allowed;
 extern bool kvm_gsi_direct_mapping;
@@ -102,6 +103,15 @@ extern bool kvm_readonly_mem_allowed;
 #define kvm_irqfds_enabled() (kvm_irqfds_allowed)
 
 /**
+ * kvm_resamplefds_enabled:
+ *
+ * Returns: true if we can use resamplefds to inject interrupts into
+ * a KVM CPU (ie the kernel supports resamplefds and we are running
+ * with a configuration where it is meaningful to use them).
+ */
+#define kvm_resamplefds_enabled() (kvm_resamplefds_allowed)
+
+/**
  * kvm_msi_via_irqfd_enabled:
  *
  * Returns: true if we can route a PCI MSI (Message Signaled Interrupt)
diff --git a/kvm-all.c b/kvm-all.c
index b240bf8..d635942 100644
--- a/kvm-all.c
+++ b/kvm-all.c
@@ -116,6 +116,7 @@ bool kvm_async_interrupts_allowed;
 bool kvm_halt_in_kernel_allowed;
 bool kvm_eventfds_allowed;
 bool kvm_irqfds_allowed;
+bool kvm_resamplefds_allowed;
 bool kvm_msi_via_irqfd_allowed;
 bool kvm_gsi_routing_allowed;
 bool kvm_gsi_direct_mapping;
@@ -1548,6 +1549,12 @@ int kvm_init(MachineClass *mc)
 kvm_eventfds_allowed =
 (kvm_check_extension(s, KVM_CAP_IOEVENTFD) > 0);
 
+kvm_irqfds_allowed =
+(kvm_check_extension(s, KVM_CAP_IRQFD) > 0);
+
+kvm_resamplefds_allowed =
+(kvm_check_extension(s, KVM_CAP_IRQFD_RESAMPLE) > 0);
+
 ret = kvm_arch_init(s);
 if (ret < 0) {
 goto err;
diff --git a/target-i386/kvm.c b/target-i386/kvm.c
index ddedc73..2320920 100644
--- a/target-i386/kvm.c
+++ b/target-i386/kvm.c
@@ -2544,7 +2544,6 @@ void kvm_arch_init_irq_routing(KVMState *s)
  * irqchip, so we can use irqfds, and on x86 we know
  * we can use msi via irqfd and GSI routing.
  */
-kvm_irqfds_allowed = true;
 kvm_msi_via_irqfd_allowed = true;
 kvm_gsi_routing_allowed = true;
 }
diff --git a/target-s390x/kvm.c b/target-s390x/kvm.c
index a85a480..f937568 100644
--- a/target-s390x/kvm.c
+++ b/target-s390x/kvm.c
@@ -1290,7 +1290,6 @@ void kvm_arch_init_irq_routing(KVMState *s)
  * have to override the common code kvm_halt_in_kernel_allowed setting.
  */
 if (kvm_check_extension(s, KVM_CAP_IRQ_ROUTING)) {
-kvm_irqfds_allowed = true;
 kvm_gsi_routing_allowed = true;
 kvm_halt_in_kernel_allowed = false;
 }
-- 
1.8.3.2




[Qemu-devel] [PATCH v2 2/2] vfio: use kvm_resamplefds_enabled()

2014-09-03 Thread Eric Auger
Use kvm_resamplefds_enabled() instead of calling kvm_check_extension()
directly for KVM_CAP_IRQFD_RESAMPLE.

Signed-off-by: Eric Auger 
---
 hw/misc/vfio.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/hw/misc/vfio.c b/hw/misc/vfio.c
index 40dcaa6..24f6a3a 100644
--- a/hw/misc/vfio.c
+++ b/hw/misc/vfio.c
@@ -406,7 +406,7 @@ static void vfio_enable_intx_kvm(VFIODevice *vdev)
 
 if (!VFIO_ALLOW_KVM_INTX || !kvm_irqfds_enabled() ||
 vdev->intx.route.mode != PCI_INTX_ENABLED ||
-!kvm_check_extension(kvm_state, KVM_CAP_IRQFD_RESAMPLE)) {
+!kvm_resamplefds_enabled()) {
 return;
 }
 
@@ -568,8 +568,7 @@ static int vfio_enable_intx(VFIODevice *vdev)
  * Only conditional to avoid generating error messages on platforms
  * where we won't actually use the result anyway.
  */
-if (kvm_irqfds_enabled() &&
-kvm_check_extension(kvm_state, KVM_CAP_IRQFD_RESAMPLE)) {
+if (kvm_irqfds_enabled() && kvm_resamplefds_enabled()) {
 vdev->intx.route = pci_device_route_intx_to_irq(&vdev->pdev,
 vdev->intx.pin);
 }
-- 
1.8.3.2




Re: [Qemu-devel] [PATCH v2 1/2] KVM_CAP_IRQFD and KVM_CAP_IRQFD_RESAMPLE checks

2014-09-03 Thread Eric Auger
On 09/03/2014 02:50 PM, Christian Borntraeger wrote:
> On 03/09/14 11:54, Eric Auger wrote:
> [...]
>> --- a/kvm-all.c
>> +++ b/kvm-all.c
> [...]
>> @@ -1548,6 +1549,12 @@ int kvm_init(MachineClass *mc)
>>  kvm_eventfds_allowed =
>>  (kvm_check_extension(s, KVM_CAP_IOEVENTFD) > 0);
>>
>> +kvm_irqfds_allowed =
>> +(kvm_check_extension(s, KVM_CAP_IRQFD) > 0);
>^
>> +
>> +kvm_resamplefds_allowed =
>> +(kvm_check_extension(s, KVM_CAP_IRQFD_RESAMPLE) > 0);
>> +
> [...]
> 
>> diff --git a/target-s390x/kvm.c b/target-s390x/kvm.c
>> index a85a480..f937568 100644
>> --- a/target-s390x/kvm.c
>> +++ b/target-s390x/kvm.c
>> @@ -1290,7 +1290,6 @@ void kvm_arch_init_irq_routing(KVMState *s)
>>   * have to override the common code kvm_halt_in_kernel_allowed setting.
>>   */
>>  if (kvm_check_extension(s, KVM_CAP_IRQ_ROUTING)) {
>   ^^^
>> -kvm_irqfds_allowed = true;
>>  kvm_gsi_routing_allowed = true;
>>  kvm_halt_in_kernel_allowed = false;
>>  }
> 
> 
> 
> I first thought that this is wrong, because on s390 we forgot to announce
> IRQFD when we introduced it. As you can see in both hunks, we check for
> KVM_CAP_IRQ_ROUTING and not for CAP_IRQFD.
>
> Luckily, the kernel got fixed with commit
> ebc3226202d5956a5963185222982d435378b899
> ("KVM: s390: announce irqfd capability") and the capability was introduced
> with commit 84223598778ba08041f4297fda485df83414d57e ("KVM: s390: irq
> routing for adapter interrupts").
> It looks like both patches first appeared in 3.15, so there should be no
> kernel version that is affected by this change. You might want to add to
> your changelog, to help distros, that
> 84223598778ba08041f4297fda485df83414d57e also needs
> ebc3226202d5956a5963185222982d435378b899.
Hi Christian,

I will add this information to the changelog. I had indeed checked that
the cap was advertised on the s390 arch, but I was not aware of the history.

Thanks

Best Regards

Eric
> 
> 
> 
> Christian
> 




[Qemu-devel] [PATCH v6 00/16] KVM platform device passthrough

2014-09-09 Thread Eric Auger
This RFC series aims at enabling KVM platform device passthrough.
It implements a VFIO platform device, derived from VFIO PCI device.

The VFIO platform device uses the host VFIO platform driver which must
be bound to the assigned device prior to the QEMU system start.

- the guest can directly access the device register space
- assigned device IRQs are transparently routed to the guest by
  QEMU/KVM (3 methods currently are supported: user-level eventfd
  handling, irqfd, forwarded IRQs)
- iommu is transparently programmed to prevent the device from
  accessing physical pages outside of the guest address space

This patch series is made of the following patch files:

1-7) Modifications to PCI code to prepare for VFIO platform device
8) split of PCI specific code and generic code (move)
9-11) creation of the VFIO calxeda xgmac platform device, without irqfd
  support (MMIO direct access and IRQ assignment).
12) fake injection test modality (to test multiple IRQ)
13) addition of irqfd/virqfd support
14-16) forwarded IRQ

Dependency List:

QEMU dependencies:
[1] [PATCH v2 0/9] Dynamic sysbus device allocation support, Alex Graf
http://lists.gnu.org/archive/html/qemu-ppc/2014-07/msg00047.html
[2] [RFC v3] machvirt dynamic sysbus device instantiation, Eric Auger
[3] [PATCH v2 0/2] actual checks of KVM_CAP_IRQFD and KVM_CAP_IRQFD_RESAMPLE,
Eric Auger
http://lists.nongnu.org/archive/html/qemu-devel/2014-09/msg00589.html
[4] [RFC] vfio: migration to trace points, Eric Auger
http://lists.nongnu.org/archive/html/qemu-devel/2014-09/msg00569.html

Kernel Dependencies:
[5] [RFC Patch v6 0/20] VFIO support for platform devices, Antonios Motakis
https://www.mail-archive.com/kvm@vger.kernel.org/msg103247.html
[6] [PATCH v3] ARM: KVM: add irqfd support, Eric Auger
https://lkml.org/lkml/2014/9/1/141
[7] arm/arm64: KVM: Various VGIC cleanups and improvements, Christoffer Dall
http://comments.gmane.org/gmane.linux.ports.arm.kernel/340430
[8] [RFC v2 0/9] KVM-VFIO IRQ forward control, Eric Auger
https://lkml.org/lkml/2014/9/1/344
[9] [RFC PATCH 0/9] ARM: Forwarding physical interrupts to a guest VM,
Marc Zyngier
http://lwn.net/Articles/603514/

kernel pieces can be found at:
http://git.linaro.org/people/eric.auger/linux.git
(branch 3.17rc3_irqfd_forward_integ_v2)
QEMU pieces can be found at:
http://git.linaro.org/people/eric.auger/qemu.git (branch vfio_integ_v6)

The patch series was tested on Calxeda Midway (ARMv7) where one xgmac
is assigned to KVM host while the second one is assigned to the guest.
Reworked PCI device is not tested.

Wiki for Calxeda Midway setup:
https://wiki.linaro.org/LEG/Engineering/Virtualization/Platform_Device_Passthrough_on_Midway

History:

v5->v6:
- rebase on 2.1rc5 PCI code
- forwarded IRQ first integration
- vfio_device property renamed into host property
- split IRQ setup in different functions that match the 3 supported
  injection techniques (user handled eventfd, irqfd, forwarded IRQ):
  removes dynamic switch between injection methods
- introduce fake interrupts as a test modality:
  x makes it possible to test user-side handling of multiple IRQs.
  x this is a test feature only: it allows triggering an fd as if the
    real physical IRQ hit. No virtual IRQ is injected into the guest
    but handling is simulated so that the state machine can be tested
- user handled eventfd:
  x add mutex to protect IRQ state & list manipulation,
  x correct misleading comment in vfio_intp_interrupt.
  x Fix bugs using fake interrupt modality
- irqfd is no longer advertised in this patchset (handled in [3])
- VFIOPlatformDeviceClass becomes abstract and Calxeda xgmac device
  and class is re-introduced (as per v4)
- all DPRINTF removed in platform and replaced by trace-points
- corrects compilation with configure --disable-kvm
- simplifies the split for vfio_get_device and introduce a unique
  specialized function named vfio_populate_device
- group_list renamed into vfio_group_list
- hw/arm/dyn_sysbus_devtree.c currently only supports vfio-calxeda-xgmac
  instantiation. Needs to be specialized for other VFIO devices
- fix 2 bugs in dyn_sysbus_devtree (reg_attr index and compat)
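
The fake interrupt modality in the list above relies on signalling an
eventfd as if the physical IRQ had fired. A minimal sketch of that
mechanism using the Linux eventfd(2) interface; the fake_irq_* names are
hypothetical and this is not code from the series:

```c
#include <assert.h>
#include <stdint.h>
#include <sys/eventfd.h>
#include <unistd.h>

/* Create a non-blocking eventfd with an initial counter of 0. */
int fake_irq_open(void)
{
    return eventfd(0, EFD_NONBLOCK);
}

/* Signal the fd as if the physical IRQ hit: adds 1 to the counter. */
int fake_irq_trigger(int fd)
{
    uint64_t one = 1;

    assert(fd >= 0);
    return write(fd, &one, sizeof(one)) == (ssize_t)sizeof(one) ? 0 : -1;
}

/*
 * User-side handler step: read and reset the counter. Returns 0 and
 * stores the number of triggers seen, or -1 if nothing was pending.
 */
int fake_irq_test_and_clear(int fd, uint64_t *count)
{
    assert(count != NULL);
    return read(fd, count, sizeof(*count)) == (ssize_t)sizeof(*count) ? 0 : -1;
}
```

Because the counter accumulates, two triggers before one read are reported
as a single wakeup with a count of 2, which is one reason the user-side
handler needs its own pending-state tracking.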

v4->v5:
- rebase on v2.1.0 PCI code
- take into account Alex Williamson comments on PCI code rework
  - trace updates in vfio_region_write/read
  - remove fd from VFIORegion
  - get/put cleanup
- bug fix: proper initialization of the bar region's vbasedev field
- misc cleanups in platform device
- device tree node generation removed from device and handled in
  hw/arm/dyn_sysbus_devtree.c
- remove "hw/vfio: add an example calxeda_xgmac": with removal of
  device tree node generation we do not have so many things to
  implement in that derived device yet. May be re-introduced later
  on if needed typically for reset/migration.
- no GSI routing table anymore

v3->v4 changes (Eric Auger, Alvise Rigo)
- rebase on last VFIO PCI code (v2.1.0-rc0)
- full git history rework to ease PCI cod

[Qemu-devel] [PATCH v6 02/16] hw/vfio/pci: Rename VFIODevice into VFIOPCIDevice

2014-09-09 Thread Eric Auger
This prepares for the introduction of VFIOPlatformDevice

Signed-off-by: Eric Auger 
---
 hw/vfio/pci.c | 209 +-
 1 file changed, 105 insertions(+), 104 deletions(-)

diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
index 7e6a1bc..ad5da4b 100644
--- a/hw/vfio/pci.c
+++ b/hw/vfio/pci.c
@@ -48,11 +48,11 @@
 #define VFIO_ALLOW_KVM_MSI 1
 #define VFIO_ALLOW_KVM_MSIX 1
 
-struct VFIODevice;
+struct VFIOPCIDevice;
 
 typedef struct VFIOQuirk {
 MemoryRegion mem;
-struct VFIODevice *vdev;
+struct VFIOPCIDevice *vdev;
 QLIST_ENTRY(VFIOQuirk) next;
 struct {
 uint32_t base_offset:TARGET_PAGE_BITS;
@@ -123,7 +123,7 @@ typedef struct VFIOMSIVector {
  */
 EventNotifier interrupt;
 EventNotifier kvm_interrupt;
-struct VFIODevice *vdev; /* back pointer to device */
+struct VFIOPCIDevice *vdev; /* back pointer to device */
 int virq;
 bool use;
 } VFIOMSIVector;
@@ -185,7 +185,7 @@ typedef struct VFIOMSIXInfo {
 void *mmap;
 } VFIOMSIXInfo;
 
-typedef struct VFIODevice {
+typedef struct VFIOPCIDevice {
 PCIDevice pdev;
 int fd;
 VFIOINTx intx;
@@ -203,7 +203,7 @@ typedef struct VFIODevice {
 VFIOBAR bars[PCI_NUM_REGIONS - 1]; /* No ROM */
 VFIOVGA vga; /* 0xa, 0x3b0, 0x3c0 */
 PCIHostDeviceAddress host;
-QLIST_ENTRY(VFIODevice) next;
+QLIST_ENTRY(VFIOPCIDevice) next;
 struct VFIOGroup *group;
 EventNotifier err_notifier;
 uint32_t features;
@@ -218,13 +218,13 @@ typedef struct VFIODevice {
 bool has_pm_reset;
 bool needs_reset;
 bool rom_read_failed;
-} VFIODevice;
+} VFIOPCIDevice;
 
 typedef struct VFIOGroup {
 int fd;
 int groupid;
 VFIOContainer *container;
-QLIST_HEAD(, VFIODevice) device_list;
+QLIST_HEAD(, VFIOPCIDevice) device_list;
 QLIST_ENTRY(VFIOGroup) next;
 QLIST_ENTRY(VFIOGroup) container_next;
 } VFIOGroup;
@@ -268,16 +268,16 @@ static QLIST_HEAD(, VFIOGroup)
 static int vfio_kvm_device_fd = -1;
 #endif
 
-static void vfio_disable_interrupts(VFIODevice *vdev);
+static void vfio_disable_interrupts(VFIOPCIDevice *vdev);
 static uint32_t vfio_pci_read_config(PCIDevice *pdev, uint32_t addr, int len);
 static void vfio_pci_write_config(PCIDevice *pdev, uint32_t addr,
   uint32_t val, int len);
-static void vfio_mmap_set_enabled(VFIODevice *vdev, bool enabled);
+static void vfio_mmap_set_enabled(VFIOPCIDevice *vdev, bool enabled);
 
 /*
  * Common VFIO interrupt disable
  */
-static void vfio_disable_irqindex(VFIODevice *vdev, int index)
+static void vfio_disable_irqindex(VFIOPCIDevice *vdev, int index)
 {
 struct vfio_irq_set irq_set = {
 .argsz = sizeof(irq_set),
@@ -293,7 +293,7 @@ static void vfio_disable_irqindex(VFIODevice *vdev, int index)
 /*
  * INTx
  */
-static void vfio_unmask_intx(VFIODevice *vdev)
+static void vfio_unmask_intx(VFIOPCIDevice *vdev)
 {
 struct vfio_irq_set irq_set = {
 .argsz = sizeof(irq_set),
@@ -307,7 +307,7 @@ static void vfio_unmask_intx(VFIODevice *vdev)
 }
 
 #ifdef CONFIG_KVM /* Unused outside of CONFIG_KVM code */
-static void vfio_mask_intx(VFIODevice *vdev)
+static void vfio_mask_intx(VFIOPCIDevice *vdev)
 {
 struct vfio_irq_set irq_set = {
 .argsz = sizeof(irq_set),
@@ -338,7 +338,7 @@ static void vfio_mask_intx(VFIODevice *vdev)
  */
 static void vfio_intx_mmap_enable(void *opaque)
 {
-VFIODevice *vdev = opaque;
+VFIOPCIDevice *vdev = opaque;
 
 if (vdev->intx.pending) {
 timer_mod(vdev->intx.mmap_timer,
@@ -351,7 +351,7 @@ static void vfio_intx_mmap_enable(void *opaque)
 
 static void vfio_intx_interrupt(void *opaque)
 {
-VFIODevice *vdev = opaque;
+VFIOPCIDevice *vdev = opaque;
 
 if (!event_notifier_test_and_clear(&vdev->intx.interrupt)) {
 return;
@@ -370,7 +370,7 @@ static void vfio_intx_interrupt(void *opaque)
 }
 }
 
-static void vfio_eoi(VFIODevice *vdev)
+static void vfio_eoi(VFIOPCIDevice *vdev)
 {
 if (!vdev->intx.pending) {
 return;
@@ -384,7 +384,7 @@ static void vfio_eoi(VFIODevice *vdev)
 vfio_unmask_intx(vdev);
 }
 
-static void vfio_enable_intx_kvm(VFIODevice *vdev)
+static void vfio_enable_intx_kvm(VFIOPCIDevice *vdev)
 {
 #ifdef CONFIG_KVM
 struct kvm_irqfd irqfd = {
@@ -462,7 +462,7 @@ fail:
 #endif
 }
 
-static void vfio_disable_intx_kvm(VFIODevice *vdev)
+static void vfio_disable_intx_kvm(VFIOPCIDevice *vdev)
 {
 #ifdef CONFIG_KVM
 struct kvm_irqfd irqfd = {
@@ -506,7 +506,7 @@ static void vfio_disable_intx_kvm(VFIODevice *vdev)
 
 static void vfio_update_irq(PCIDevice *pdev)
 {
-VFIODevice *vdev = DO_UPCAST(VFIODevice, pdev, pdev);
+VFIOPCIDevice *vdev = DO_UPCAST(VFIOPCIDevice, pdev, pdev);
 PCIINTxRoute route;
 
 if (vdev->interrupt != VFIO_INT_INTx) {
@@ -537,7 +537,7 @@ static void vfio_update_irq(PCIDevice *pdev)
 vfio_eoi(vdev);
 }
 
-static int vfio_enable_intx(VFIODevice

[Qemu-devel] [PATCH v6 03/16] hw/vfio/pci: introduce VFIODevice

2014-09-09 Thread Eric Auger
Introduce the VFIODevice struct that is going to be shared by
VFIOPCIDevice and VFIOPlatformDevice.

Additional fields will be added there later on for review
convenience.

The group's device_list becomes a list of VFIODevice.

This requires reworking the reset_handler, which becomes generic and
calls VFIODevice ops that are specialized in each parent object.
Functions that iterate on this list must also take care that the
devices can be something other than VFIOPCIDevice. The type is used
to discriminate them.

We take advantage of this step to change the prototype of
vfio_unmask_intx, vfio_mask_intx and vfio_disable_irqindex, which now
apply to VFIODevice. They are renamed as *_irqindex.
The index is passed as a parameter to anticipate their usage for
platform IRQs.
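
The layout described above, a shared base struct embedded in each
specialized device with an ops table for the specialized behaviour, can be
sketched as below. All names (BaseDev, PciDev, the reset policy) are
hypothetical simplifications for illustration; the actual fields are in
the diff that follows.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Recover the enclosing struct from a pointer to one of its members. */
#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

enum { DEV_TYPE_PCI = 0, DEV_TYPE_PLATFORM = 1 };

typedef struct BaseDev BaseDev;

typedef struct BaseDevOps {
    bool (*compute_needs_reset)(BaseDev *dev);
} BaseDevOps;

/* Shared part: what generic code iterating a group's list can see. */
struct BaseDev {
    int type;
    bool needs_reset;
    const BaseDevOps *ops;
};

/* Specialized device embedding the shared part. */
typedef struct PciDev {
    int pci_state;      /* placeholder for PCI-only fields */
    BaseDev base;
    bool has_flr;
} PciDev;

/* PCI specialization; the policy here is made up for the example. */
bool pci_compute_needs_reset(BaseDev *dev)
{
    PciDev *pdev = container_of(dev, PciDev, base);

    dev->needs_reset = !pdev->has_flr;
    return dev->needs_reset;
}

const BaseDevOps pci_ops = { .compute_needs_reset = pci_compute_needs_reset };

/* Generic pass: only touches BaseDev and dispatches through ops. */
int count_devices_needing_reset(BaseDev **devs, size_t n)
{
    int count = 0;

    for (size_t i = 0; i < n; i++) {
        assert(devs[i] != NULL && devs[i]->ops != NULL);
        if (devs[i]->ops->compute_needs_reset(devs[i])) {
            count++;
        }
    }
    return count;
}
```

The generic loop never needs to know the concrete device type; code that
does need PCI-specific state checks the type field first and then uses
container_of to get back to the enclosing struct.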

Signed-off-by: Eric Auger 

---

v4->v5:
- fix style issues
- in vfio_initfn, rework allocation of vdev->vbasedev.name and
  replace snprintf by g_strdup_printf
---
 hw/vfio/pci.c | 241 +++---
 1 file changed, 147 insertions(+), 94 deletions(-)

diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
index ad5da4b..e2caa08 100644
--- a/hw/vfio/pci.c
+++ b/hw/vfio/pci.c
@@ -48,6 +48,11 @@
 #define VFIO_ALLOW_KVM_MSI 1
 #define VFIO_ALLOW_KVM_MSIX 1
 
+enum {
+VFIO_DEVICE_TYPE_PCI = 0,
+VFIO_DEVICE_TYPE_PLATFORM = 1,
+};
+
 struct VFIOPCIDevice;
 
 typedef struct VFIOQuirk {
@@ -185,9 +190,27 @@ typedef struct VFIOMSIXInfo {
 void *mmap;
 } VFIOMSIXInfo;
 
+typedef struct VFIODeviceOps VFIODeviceOps;
+
+typedef struct VFIODevice {
+QLIST_ENTRY(VFIODevice) next;
+struct VFIOGroup *group;
+char *name;
+int fd;
+int type;
+bool reset_works;
+bool needs_reset;
+VFIODeviceOps *ops;
+} VFIODevice;
+
+struct VFIODeviceOps {
+bool (*vfio_compute_needs_reset)(VFIODevice *vdev);
+int (*vfio_hot_reset_multi)(VFIODevice *vdev);
+};
+
 typedef struct VFIOPCIDevice {
 PCIDevice pdev;
-int fd;
+VFIODevice vbasedev;
 VFIOINTx intx;
 unsigned int config_size;
 uint8_t *emulated_config_bits; /* QEMU emulated bits, little-endian */
@@ -203,20 +226,16 @@ typedef struct VFIOPCIDevice {
 VFIOBAR bars[PCI_NUM_REGIONS - 1]; /* No ROM */
 VFIOVGA vga; /* 0xa, 0x3b0, 0x3c0 */
 PCIHostDeviceAddress host;
-QLIST_ENTRY(VFIOPCIDevice) next;
-struct VFIOGroup *group;
 EventNotifier err_notifier;
 uint32_t features;
 #define VFIO_FEATURE_ENABLE_VGA_BIT 0
 #define VFIO_FEATURE_ENABLE_VGA (1 << VFIO_FEATURE_ENABLE_VGA_BIT)
 int32_t bootindex;
 uint8_t pm_cap;
-bool reset_works;
 bool has_vga;
 bool pci_aer;
 bool has_flr;
 bool has_pm_reset;
-bool needs_reset;
 bool rom_read_failed;
 } VFIOPCIDevice;
 
@@ -224,7 +243,7 @@ typedef struct VFIOGroup {
 int fd;
 int groupid;
 VFIOContainer *container;
-QLIST_HEAD(, VFIOPCIDevice) device_list;
+QLIST_HEAD(, VFIODevice) device_list;
 QLIST_ENTRY(VFIOGroup) next;
 QLIST_ENTRY(VFIOGroup) container_next;
 } VFIOGroup;
@@ -277,7 +296,7 @@ static void vfio_mmap_set_enabled(VFIOPCIDevice *vdev, bool enabled);
 /*
  * Common VFIO interrupt disable
  */
-static void vfio_disable_irqindex(VFIOPCIDevice *vdev, int index)
+static void vfio_disable_irqindex(VFIODevice *vbasedev, int index)
 {
 struct vfio_irq_set irq_set = {
 .argsz = sizeof(irq_set),
@@ -287,37 +306,37 @@ static void vfio_disable_irqindex(VFIOPCIDevice *vdev, int index)
 .count = 0,
 };
 
-ioctl(vdev->fd, VFIO_DEVICE_SET_IRQS, &irq_set);
+ioctl(vbasedev->fd, VFIO_DEVICE_SET_IRQS, &irq_set);
 }
 
 /*
  * INTx
  */
-static void vfio_unmask_intx(VFIOPCIDevice *vdev)
+static void vfio_unmask_irqindex(VFIODevice *vbasedev, int index)
 {
 struct vfio_irq_set irq_set = {
 .argsz = sizeof(irq_set),
 .flags = VFIO_IRQ_SET_DATA_NONE | VFIO_IRQ_SET_ACTION_UNMASK,
-.index = VFIO_PCI_INTX_IRQ_INDEX,
+.index = index,
 .start = 0,
 .count = 1,
 };
 
-ioctl(vdev->fd, VFIO_DEVICE_SET_IRQS, &irq_set);
+ioctl(vbasedev->fd, VFIO_DEVICE_SET_IRQS, &irq_set);
 }
 
 #ifdef CONFIG_KVM /* Unused outside of CONFIG_KVM code */
-static void vfio_mask_intx(VFIOPCIDevice *vdev)
+static void vfio_mask_irqindex(VFIODevice *vbasedev, int index)
 {
 struct vfio_irq_set irq_set = {
 .argsz = sizeof(irq_set),
 .flags = VFIO_IRQ_SET_DATA_NONE | VFIO_IRQ_SET_ACTION_MASK,
-.index = VFIO_PCI_INTX_IRQ_INDEX,
+.index = index,
 .start = 0,
 .count = 1,
 };
 
-ioctl(vdev->fd, VFIO_DEVICE_SET_IRQS, &irq_set);
+ioctl(vbasedev->fd, VFIO_DEVICE_SET_IRQS, &irq_set);
 }
 #endif
 
@@ -381,7 +400,7 @@ static void vfio_eoi(VFIOPCIDevice *vdev)
 
 vdev->intx.pending = false;
 pci_irq_deassert(&vdev->pdev);
-vfio_unmask_intx(vdev);
+vfio_unmask_irqindex(&vdev->vbasedev, VFIO_PCI_INTX_IRQ_I

[Qemu-devel] [PATCH v6 01/16] vfio: move hw/misc/vfio.c to hw/vfio/pci.c Move vfio.h into include/hw/vfio

2014-09-09 Thread Eric Auger
From: Kim Phillips 

This is done in preparation for the addition of VFIO platform
device support.

Signed-off-by: Kim Phillips 
---
 LICENSE  | 2 +-
 MAINTAINERS  | 2 +-
 hw/Makefile.objs | 1 +
 hw/misc/Makefile.objs| 1 -
 hw/ppc/spapr_pci_vfio.c  | 2 +-
 hw/vfio/Makefile.objs| 3 +++
 hw/{misc/vfio.c => vfio/pci.c}   | 2 +-
 include/hw/{misc => vfio}/vfio.h | 0
 8 files changed, 8 insertions(+), 5 deletions(-)
 create mode 100644 hw/vfio/Makefile.objs
 rename hw/{misc/vfio.c => vfio/pci.c} (99%)
 rename include/hw/{misc => vfio}/vfio.h (100%)

diff --git a/LICENSE b/LICENSE
index da70e94..0e0b4b9 100644
--- a/LICENSE
+++ b/LICENSE
@@ -11,7 +11,7 @@ option) any later version.
 
 As of July 2013, contributions under version 2 of the GNU General Public
 License (and no later version) are only accepted for the following files
-or directories: bsd-user/, linux-user/, hw/misc/vfio.c, hw/xen/xen_pt*.
+or directories: bsd-user/, linux-user/, hw/vfio/, hw/xen/xen_pt*.
 
 3) The Tiny Code Generator (TCG) is released under the BSD license
(see license headers in files).
diff --git a/MAINTAINERS b/MAINTAINERS
index 206bf7e..8683f62 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -625,7 +625,7 @@ F: tests/usb-*-test.c
 VFIO
 M: Alex Williamson 
 S: Supported
-F: hw/misc/vfio.c
+F: hw/vfio/*
 
 vhost
 M: Michael S. Tsirkin 
diff --git a/hw/Makefile.objs b/hw/Makefile.objs
index 52a1464..73afa41 100644
--- a/hw/Makefile.objs
+++ b/hw/Makefile.objs
@@ -26,6 +26,7 @@ devices-dirs-$(CONFIG_SOFTMMU) += ssi/
 devices-dirs-$(CONFIG_SOFTMMU) += timer/
 devices-dirs-$(CONFIG_TPM) += tpm/
 devices-dirs-$(CONFIG_SOFTMMU) += usb/
+devices-dirs-$(CONFIG_SOFTMMU) += vfio/
 devices-dirs-$(CONFIG_VIRTIO) += virtio/
 devices-dirs-$(CONFIG_SOFTMMU) += watchdog/
 devices-dirs-$(CONFIG_SOFTMMU) += xen/
diff --git a/hw/misc/Makefile.objs b/hw/misc/Makefile.objs
index 86f6243..9b77554 100644
--- a/hw/misc/Makefile.objs
+++ b/hw/misc/Makefile.objs
@@ -21,7 +21,6 @@ common-obj-$(CONFIG_MACIO) += macio/
 
 ifeq ($(CONFIG_PCI), y)
 obj-$(CONFIG_KVM) += ivshmem.o
-obj-$(CONFIG_LINUX) += vfio.o
 endif
 
 obj-$(CONFIG_REALVIEW) += arm_sysctl.o
diff --git a/hw/ppc/spapr_pci_vfio.c b/hw/ppc/spapr_pci_vfio.c
index d3bddf2..144912b 100644
--- a/hw/ppc/spapr_pci_vfio.c
+++ b/hw/ppc/spapr_pci_vfio.c
@@ -20,7 +20,7 @@
 #include "hw/ppc/spapr.h"
 #include "hw/pci-host/spapr.h"
 #include "linux/vfio.h"
-#include "hw/misc/vfio.h"
+#include "hw/vfio/vfio.h"
 
 static Property spapr_phb_vfio_properties[] = {
 DEFINE_PROP_INT32("iommu", sPAPRPHBVFIOState, iommugroupid, -1),
diff --git a/hw/vfio/Makefile.objs b/hw/vfio/Makefile.objs
new file mode 100644
index 000..31c7dab
--- /dev/null
+++ b/hw/vfio/Makefile.objs
@@ -0,0 +1,3 @@
+ifeq ($(CONFIG_LINUX), y)
+obj-$(CONFIG_PCI) += pci.o
+endif
diff --git a/hw/misc/vfio.c b/hw/vfio/pci.c
similarity index 99%
rename from hw/misc/vfio.c
rename to hw/vfio/pci.c
index 3d32657..7e6a1bc 100644
--- a/hw/misc/vfio.c
+++ b/hw/vfio/pci.c
@@ -39,8 +39,8 @@
 #include "qemu/range.h"
 #include "sysemu/kvm.h"
 #include "sysemu/sysemu.h"
-#include "hw/misc/vfio.h"
 #include "trace.h"
+#include "hw/vfio/vfio.h"
 
 /* Extra debugging, trap acceleration paths for more logging */
 #define VFIO_ALLOW_MMAP 1
diff --git a/include/hw/misc/vfio.h b/include/hw/vfio/vfio.h
similarity index 100%
rename from include/hw/misc/vfio.h
rename to include/hw/vfio/vfio.h
-- 
1.8.3.2




[Qemu-devel] [PATCH v6 04/16] hw/vfio/pci: Introduce VFIORegion

2014-09-09 Thread Eric Auger
This structure is going to be shared by VFIOPCIDevice and
VFIOPlatformDevice. VFIOBAR includes it.

vfio_eoi becomes an op of VFIODevice, specialized by the parent device.
This makes it possible to transform vfio_bar_write/read into generic
vfio_region_write/read that will be used by VFIOPlatformDevice too.

vfio_mmap_bar becomes vfio_map_region
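
A generic region write boils down to converting the value to little-endian
and writing it at the region's offset within the device fd. A minimal
sketch under assumptions: the Region struct below is hypothetical and
stores the fd directly, whereas the patch reaches it through
region->vbasedev; error reporting is reduced to a message on stderr.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical, simplified region descriptor. */
typedef struct Region {
    int fd;             /* device fd (via vbasedev in the patch) */
    off_t fd_offset;    /* offset of the region within the fd */
    size_t size;        /* region length in bytes */
} Region;

/* Write 'size' bytes of 'data' at 'addr', always little-endian. */
int region_write(const Region *r, uint64_t addr, uint64_t data, unsigned size)
{
    uint8_t buf[8];

    assert(r != NULL);
    if (size != 1 && size != 2 && size != 4 && size != 8) {
        return -1;
    }
    if (addr + size > r->size) {
        return -1;
    }
    /* Serialize least significant byte first, whatever the host order. */
    for (unsigned i = 0; i < size; i++) {
        buf[i] = (uint8_t)(data >> (8 * i));
    }
    if (pwrite(r->fd, buf, size, r->fd_offset + (off_t)addr) != (ssize_t)size) {
        fprintf(stderr, "region write at 0x%llx failed\n",
                (unsigned long long)addr);
        return -1;
    }
    return 0;
}
```

Since nothing in the function is PCI-specific, the same routine can serve
a PCI BAR or a platform device region, which is the point of moving these
fields out of VFIOBAR.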

Signed-off-by: Eric Auger 

---

v4->v5:
- remove fd field from VFIORegion
- change error_report format string in vfio_region_write/read
- remove #ifdef DEBUG_VFIO in the same function
- correct missing initialization of bar region's vbasedev field
- change Object * parameter name of vfio_mmap_region and remove
  useless OBJECT()
---
 hw/vfio/pci.c | 193 ++
 trace-events  |   4 +-
 2 files changed, 103 insertions(+), 94 deletions(-)

diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
index e2caa08..5e34504 100644
--- a/hw/vfio/pci.c
+++ b/hw/vfio/pci.c
@@ -78,15 +78,19 @@ typedef struct VFIOQuirk {
 } data;
 } VFIOQuirk;
 
-typedef struct VFIOBAR {
-off_t fd_offset; /* offset of BAR within device fd */
-int fd; /* device fd, allows us to pass VFIOBAR as opaque data */
+typedef struct VFIORegion {
+struct VFIODevice *vbasedev;
+off_t fd_offset; /* offset of region within device fd */
 MemoryRegion mem; /* slow, read/write access */
 MemoryRegion mmap_mem; /* direct mapped access */
 void *mmap;
 size_t size;
 uint32_t flags; /* VFIO region flags (rd/wr/mmap) */
-uint8_t nr; /* cache the BAR number for debug */
+uint8_t nr; /* cache the region number for debug */
+} VFIORegion;
+
+typedef struct VFIOBAR {
+VFIORegion region;
 bool ioport;
 bool mem64;
 QLIST_HEAD(, VFIOQuirk) quirks;
@@ -206,6 +210,7 @@ typedef struct VFIODevice {
 struct VFIODeviceOps {
 bool (*vfio_compute_needs_reset)(VFIODevice *vdev);
 int (*vfio_hot_reset_multi)(VFIODevice *vdev);
+void (*vfio_eoi)(VFIODevice *vdev);
 };
 
 typedef struct VFIOPCIDevice {
@@ -389,8 +394,10 @@ static void vfio_intx_interrupt(void *opaque)
 }
 }
 
-static void vfio_eoi(VFIOPCIDevice *vdev)
+static void vfio_eoi(VFIODevice *vbasedev)
 {
+VFIOPCIDevice *vdev = container_of(vbasedev, VFIOPCIDevice, vbasedev);
+
 if (!vdev->intx.pending) {
 return;
 }
@@ -400,7 +407,7 @@ static void vfio_eoi(VFIOPCIDevice *vdev)
 
 vdev->intx.pending = false;
 pci_irq_deassert(&vdev->pdev);
-vfio_unmask_irqindex(&vdev->vbasedev, VFIO_PCI_INTX_IRQ_INDEX);
+vfio_unmask_irqindex(vbasedev, VFIO_PCI_INTX_IRQ_INDEX);
 }
 
 static void vfio_enable_intx_kvm(VFIOPCIDevice *vdev)
@@ -553,7 +560,7 @@ static void vfio_update_irq(PCIDevice *pdev)
 vfio_enable_intx_kvm(vdev);
 
 /* Re-enable the interrupt in cased we missed an EOI */
-vfio_eoi(vdev);
+vfio_eoi(&vdev->vbasedev);
 }
 
 static int vfio_enable_intx(VFIOPCIDevice *vdev)
@@ -1090,10 +1097,11 @@ static void vfio_update_msi(VFIOPCIDevice *vdev)
 /*
  * IO Port/MMIO - Beware of the endians, VFIO is always little endian
  */
-static void vfio_bar_write(void *opaque, hwaddr addr,
-   uint64_t data, unsigned size)
+static void vfio_region_write(void *opaque, hwaddr addr,
+  uint64_t data, unsigned size)
 {
-VFIOBAR *bar = opaque;
+VFIORegion *region = opaque;
+VFIODevice *vbasedev = region->vbasedev;
 union {
 uint8_t byte;
 uint16_t word;
@@ -1116,20 +1124,14 @@ static void vfio_bar_write(void *opaque, hwaddr addr,
 break;
 }
 
-if (pwrite(bar->fd, &buf, size, bar->fd_offset + addr) != size) {
-error_report("%s(,0x%"HWADDR_PRIx", 0x%"PRIx64", %d) failed: %m",
- __func__, addr, data, size);
+if (pwrite(vbasedev->fd, &buf, size, region->fd_offset + addr) != size) {
+error_report("%s(%s:region%d+0x%"HWADDR_PRIx", 0x%"PRIx64
+ ",%d) failed: %m",
+ __func__, vbasedev->name, region->nr,
+ addr, data, size);
 }
 
-#ifdef DEBUG_VFIO
-{
-VFIOPCIDevice *vdev = container_of(bar, VFIOPCIDevice, bars[bar->nr]);
-
-trace_vfio_bar_write(vdev->host.domain, vdev->host.bus,
- vdev->host.slot, vdev->host.function,
- region->nr, addr, data, size);
-}
-#endif
+trace_vfio_region_write(vbasedev->name, region->nr, addr, data, size);
 
 /*
  * A read or write to a BAR always signals an INTx EOI.  This will
@@ -1139,13 +1141,14 @@ static void vfio_bar_write(void *opaque, hwaddr addr,
  * which access will service the interrupt, so we're potentially
  * getting quite a few host interrupts per guest interrupt.
  */
-vfio_eoi(container_of(bar, VFIOPCIDevice, bars[bar->nr]));
+vbas

[Qemu-devel] [PATCH v6 10/16] hw/vfio: calxeda xgmac device

2014-09-09 Thread Eric Auger
The platform device class has become abstract. The device can be
instantiated on the command line using an option such as:

-device vfio-calxeda-xgmac,host="fff51000.ethernet"
The compat string is hardcoded unless the user overrides it.

Signed-off-by: Eric Auger 

---

v5 -> v6
- reintroduced, following Alex Graf's advice
- fix a bug related to compat override

v4 -> v5:
removed since device tree was moved to hw/arm/dyn_sysbus_devtree.c

v4: creation for device tree specialization
---
 hw/vfio/Makefile.objs|  1 +
 hw/vfio/calxeda_xgmac.c  | 57 
 include/hw/vfio/vfio-calxeda-xgmac.h | 41 ++
 3 files changed, 99 insertions(+)
 create mode 100644 hw/vfio/calxeda_xgmac.c
 create mode 100644 include/hw/vfio/vfio-calxeda-xgmac.h

diff --git a/hw/vfio/Makefile.objs b/hw/vfio/Makefile.objs
index c5c76fe..913ab14 100644
--- a/hw/vfio/Makefile.objs
+++ b/hw/vfio/Makefile.objs
@@ -2,4 +2,5 @@ ifeq ($(CONFIG_LINUX), y)
 obj-$(CONFIG_SOFTMMU) += common.o
 obj-$(CONFIG_PCI) += pci.o
 obj-$(CONFIG_SOFTMMU) += platform.o
+obj-$(CONFIG_SOFTMMU) += calxeda_xgmac.o
 endif
diff --git a/hw/vfio/calxeda_xgmac.c b/hw/vfio/calxeda_xgmac.c
new file mode 100644
index 000..5e655ae
--- /dev/null
+++ b/hw/vfio/calxeda_xgmac.c
@@ -0,0 +1,57 @@
+/*
+ * calxeda xgmac example VFIO device
+ *
+ * Copyright Linaro Limited, 2014
+ *
+ * Authors:
+ *  Eric Auger 
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2.  See
+ * the COPYING file in the top-level directory.
+ *
+ */
+
+#include "hw/vfio/vfio-calxeda-xgmac.h"
+
+static void calxeda_xgmac_realize(DeviceState *dev, Error **errp)
+{
+VFIOPlatformDevice *vdev = VFIO_PLATFORM_DEVICE(dev);
+VFIOCalxedaXgmacDeviceClass *k = VFIO_CALXEDA_XGMAC_DEVICE_GET_CLASS(dev);
+const char compat[] = "calxeda,hb-xgmac";
+
+if (vdev->compat == NULL) {
+vdev->compat = g_strdup(compat);
+} /* else use user-provided compat string */
+
+k->parent_realize(dev, errp);
+}
+
+static const VMStateDescription vfio_platform_vmstate = {
+.name = TYPE_VFIO_CALXEDA_XGMAC,
+.unmigratable = 1,
+};
+
+static void vfio_calxeda_xgmac_class_init(ObjectClass *klass, void *data)
+{
+DeviceClass *dc = DEVICE_CLASS(klass);
+VFIOCalxedaXgmacDeviceClass *vcxc =
+VFIO_CALXEDA_XGMAC_DEVICE_CLASS(klass);
+vcxc->parent_realize = dc->realize;
+dc->realize = calxeda_xgmac_realize;
+dc->desc = "VFIO Calxeda XGMAC";
+}
+
+static const TypeInfo vfio_calxeda_xgmac_dev_info = {
+.name = TYPE_VFIO_CALXEDA_XGMAC,
+.parent = TYPE_VFIO_PLATFORM,
+.instance_size = sizeof(VFIOCalxedaXgmacDevice),
+.class_init = vfio_calxeda_xgmac_class_init,
+.class_size = sizeof(VFIOCalxedaXgmacDeviceClass),
+};
+
+static void register_calxeda_xgmac_dev_type(void)
+{
+type_register_static(&vfio_calxeda_xgmac_dev_info);
+}
+
+type_init(register_calxeda_xgmac_dev_type)
diff --git a/include/hw/vfio/vfio-calxeda-xgmac.h b/include/hw/vfio/vfio-calxeda-xgmac.h
new file mode 100644
index 000..1529cf5
--- /dev/null
+++ b/include/hw/vfio/vfio-calxeda-xgmac.h
@@ -0,0 +1,41 @@
+/*
+ * VFIO calxeda xgmac device
+ *
+ * Copyright Linaro Limited, 2014
+ *
+ * Authors:
+ *  Eric Auger 
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2.  See
+ * the COPYING file in the top-level directory.
+ *
+ */
+
+#ifndef HW_VFIO_VFIO_CALXEDA_XGMAC_H
+#define HW_VFIO_VFIO_CALXEDA_XGMAC_H
+
+#include "hw/vfio/vfio-platform.h"
+
+#define TYPE_VFIO_CALXEDA_XGMAC "vfio-calxeda-xgmac"
+
+typedef struct VFIOCalxedaXgmacDevice {
+VFIOPlatformDevice vdev;
+} VFIOCalxedaXgmacDevice;
+
+typedef struct VFIOCalxedaXgmacDeviceClass {
+/*< private >*/
+VFIOPlatformDeviceClass parent_class;
+/*< public >*/
+DeviceRealize parent_realize;
+} VFIOCalxedaXgmacDeviceClass;
+
+#define VFIO_CALXEDA_XGMAC_DEVICE(obj) \
+ OBJECT_CHECK(VFIOCalxedaXgmacDevice, (obj), TYPE_VFIO_CALXEDA_XGMAC)
+#define VFIO_CALXEDA_XGMAC_DEVICE_CLASS(klass) \
+ OBJECT_CLASS_CHECK(VFIOCalxedaXgmacDeviceClass, (klass), \
+TYPE_VFIO_CALXEDA_XGMAC)
+#define VFIO_CALXEDA_XGMAC_DEVICE_GET_CLASS(obj) \
+ OBJECT_GET_CLASS(VFIOCalxedaXgmacDeviceClass, (obj), \
+  TYPE_VFIO_CALXEDA_XGMAC)
+
+#endif
-- 
1.8.3.2




[Qemu-devel] [PATCH v6 09/16] hw/vfio/platform: add vfio-platform support

2014-09-09 Thread Eric Auger
Minimal VFIO platform implementation supporting
- register space user mapping,
- IRQ assignment based on eventfds handled on qemu side.

irqfd kernel acceleration comes in a subsequent patch.
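
As a rough illustration of what "eventfds handled on qemu side" means, the sketch below uses a plain Linux eventfd outside of any VFIO or QEMU context. The helper names (signal_irq, drain_irq, demo_eventfd) are hypothetical; in the real flow the kernel driver signals the eventfd and QEMU's main loop runs the handler.

```c
#include <assert.h>
#include <stdint.h>
#include <sys/eventfd.h>
#include <sys/types.h>
#include <unistd.h>

/* Signal the eventfd n times, as a kernel driver would on each IRQ. */
static int signal_irq(int fd, unsigned int n)
{
    uint64_t one = 1;

    while (n--) {
        if (write(fd, &one, sizeof(one)) != (ssize_t)sizeof(one)) {
            return -1;
        }
    }
    return 0;
}

/* User-side handler: one read returns the accumulated count and resets it. */
static uint64_t drain_irq(int fd)
{
    uint64_t count = 0;

    if (read(fd, &count, sizeof(count)) != (ssize_t)sizeof(count)) {
        return 0; /* nothing pending (EAGAIN) or error */
    }
    return count;
}

/* Create an eventfd, signal it n times, and report what the handler saw. */
static uint64_t demo_eventfd(unsigned int n)
{
    int fd = eventfd(0, EFD_NONBLOCK);
    uint64_t seen;

    if (fd < 0) {
        return 0;
    }
    if (signal_irq(fd, n) < 0) {
        close(fd);
        return 0;
    }
    seen = drain_irq(fd);
    close(fd);
    return seen;
}
```

Note that several signals coalesce into a single counter value, so a user-side handler cannot assume one wakeup per interrupt.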

Signed-off-by: Kim Phillips 
Signed-off-by: Eric Auger 

---

v5 -> v6:
- vfio_device property renamed into host property
- correct error handling of VFIO_DEVICE_GET_IRQ_INFO ioctl
  and remove PCI related comment
- remove declaration of vfio_setup_irqfd and irqfd_allowed
  property. Both belong to the next patch (irqfd)
- remove declaration of vfio_intp_interrupt in vfio-platform.h
- functions that can be static are now declared static
- remove declarations of vfio_region_ops, vfio_memory_listener,
  group_list, vfio_address_spaces. All are moved to vfio-common.h
- remove vfio_put_device declaration and definition
- print_regions removed. code moved into vfio_populate_regions
- replace DPRINTF by trace events
- new helper routine to set the trigger eventfd
- dissociate intp init from the injection enablement:
  vfio_enable_intp renamed into vfio_init_intp and new function
  named vfio_start_eventfd_injection
- injection start moved to vfio_start_irq_injection (no longer
  in vfio_populate_interrupt)
- new start_irq_fn field in VFIOPlatformDevice corresponding to
  the function that will be used for starting injection
- user handled eventfd:
  x add mutex to protect IRQ state & list manipulation,
  x correct misleading comment in vfio_intp_interrupt,
  x fix bugs found thanks to the fake interrupt modality
- VFIOPlatformDeviceClass becomes abstract
- add error_setg in vfio_platform_realize

v4 -> v5:
- vfio-platform.h included first
- cleanup error handling in *populate*, vfio_get_device,
  vfio_enable_intp
- vfio_put_device not called anymore
- add some includes to follow vfio policy

v3 -> v4:
[Eric Auger]
- merge of "vfio: Add initial IRQ support in platform device"
  to get a fully functional patch, although performance is limited.
- removal of the unrealize function since, as I currently understand
  it, it is only used with the device hot-plug feature.

v2 -> v3:
[Eric Auger]
- further factorization between PCI and platform (VFIORegion,
  VFIODevice). same level of functionality.

<= v2:
[Kim Phillips]
- Initial Creation of the device supporting register space mapping
---
 hw/vfio/Makefile.objs   |   1 +
 hw/vfio/platform.c  | 599 
 include/hw/vfio/vfio-platform.h |  79 ++
 trace-events|  12 +
 4 files changed, 691 insertions(+)
 create mode 100644 hw/vfio/platform.c
 create mode 100644 include/hw/vfio/vfio-platform.h

diff --git a/hw/vfio/Makefile.objs b/hw/vfio/Makefile.objs
index e31f30e..c5c76fe 100644
--- a/hw/vfio/Makefile.objs
+++ b/hw/vfio/Makefile.objs
@@ -1,4 +1,5 @@
 ifeq ($(CONFIG_LINUX), y)
 obj-$(CONFIG_SOFTMMU) += common.o
 obj-$(CONFIG_PCI) += pci.o
+obj-$(CONFIG_SOFTMMU) += platform.o
 endif
diff --git a/hw/vfio/platform.c b/hw/vfio/platform.c
new file mode 100644
index 000..9987b25
--- /dev/null
+++ b/hw/vfio/platform.c
@@ -0,0 +1,599 @@
+/*
+ * vfio based device assignment support - platform devices
+ *
+ * Copyright Linaro Limited, 2014
+ *
+ * Authors:
+ *  Kim Phillips 
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2.  See
+ * the COPYING file in the top-level directory.
+ *
+ * Based on vfio based PCI device assignment support:
+ *  Copyright Red Hat, Inc. 2012
+ */
+
+#include 
+#include 
+
+#include "hw/vfio/vfio-platform.h"
+#include "qemu/error-report.h"
+#include "qemu/range.h"
+#include "sysemu/sysemu.h"
+#include "exec/memory.h"
+#include "qemu/queue.h"
+#include "hw/sysbus.h"
+#include "trace.h"
+
+static void vfio_intp_interrupt(VFIOINTp *intp);
+typedef void (*eventfd_user_side_handler_t)(VFIOINTp *intp);
+static int vfio_set_trigger_eventfd(VFIOINTp *intp,
+eventfd_user_side_handler_t handler);
+
+/*
+ * Functions only used when eventfd are handled on user-side
+ * ie. without irqfd
+ */
+
+/**
+ * vfio_platform_eoi - IRQ completion routine
+ * @vbasedev: the VFIO device
+ *
+ * de-asserts the active virtual IRQ and unmask the physical IRQ
+ * (masked by the  VFIO driver). Handle pending IRQs if any.
+ * eoi function is called on the first access to any MMIO region
+ * after an IRQ was triggered. It is assumed this access corresponds
+ * to the IRQ status register reset. With such a mechanism, a single
+ * IRQ can be handled at a time since there is no way to know which
+ * IRQ was completed by the guest (we would need additional details
+ * about the IRQ status register mask)
+ */
+static void vfio_platform_eoi(VFIODevice *vbasedev)
+{
+VFIOINTp *intp;
+VFIOPlatformDevice *vdev =
+container_of(vbasedev, VFIOPlatformDevice, vbasedev);
+
+qemu_mutex_lock(&vdev->intp_mutex);
+QLIST_FOREACH(intp, &am

[Qemu-devel] [PATCH v6 05/16] hw/vfio/pci: split vfio_get_device

2014-09-09 Thread Eric Auger
vfio_get_device now takes a VFIODevice as argument. The function is split
into 2 parts: vfio_get_device which is generic and vfio_populate_device
which is bus specific.

3 new fields are introduced in VFIODevice to store dev_info.

vfio_put_base_device is created.
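
The split can be pictured with the following standalone sketch. It is only an approximation under stated assumptions: DemoDevice, DemoOps, demo_info, DEMO_FLAG_PCI and DEMO_MIN_REGIONS are hypothetical stand-ins, and the device info is passed in directly rather than fetched with an ioctl.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the VFIO flag and the PCI minimum region count. */
#define DEMO_FLAG_PCI    (1u << 1)
#define DEMO_MIN_REGIONS 8u

/* Stand-in for the device info the kernel would report. */
struct demo_info {
    unsigned int flags;
    unsigned int num_regions;
    unsigned int num_irqs;
};

typedef struct DemoDevice DemoDevice;

typedef struct DemoOps {
    int (*populate)(DemoDevice *dev); /* bus-specific part */
} DemoOps;

struct DemoDevice {
    const DemoOps *ops;
    unsigned int flags;       /* cached by the generic part */
    unsigned int num_regions;
    unsigned int num_irqs;
};

/* Bus-specific half: only consults the cached fields. */
static int demo_pci_populate(DemoDevice *dev)
{
    if (!(dev->flags & DEMO_FLAG_PCI)) {
        return -1;
    }
    if (dev->num_regions < DEMO_MIN_REGIONS) {
        return -1;
    }
    return 0;
}

static const DemoOps demo_pci_ops = { .populate = demo_pci_populate };

/* Generic half: caches the info, then defers to the bus-specific op. */
static int demo_get_device(DemoDevice *dev, const struct demo_info *info)
{
    assert(dev->ops != NULL && dev->ops->populate != NULL);
    dev->flags = info->flags;
    dev->num_regions = info->num_regions;
    dev->num_irqs = info->num_irqs;
    return dev->ops->populate(dev);
}
```

Caching the three fields in the base device is what lets the bus-specific function run without issuing its own info query.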

---

v5->v6:
- simplify the split of vfio_get_device:
  vfio_check_device, vfio_populate_regions, vfio_populate_interrupts
  are now gathered into a single specialization function named
  vfio_populate_device

v4->v5:
- cleanup of error handling and get/put operations in
  vfio_check_device, vfio_populate_regions, vfio_populate_interrupts and
  vfio_get_device.
  - correct misuse of errno
  - vfio_populate_regions always returns 0
  - VFIODevice .name deallocation done in vfio_put_device instead of
vfio_put_base_device
  - vfio_put_base_device done at vfio_get_device level.

Signed-off-by: Eric Auger 
---
 hw/vfio/pci.c | 130 +++---
 trace-events  |  10 ++---
 2 files changed, 83 insertions(+), 57 deletions(-)

diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
index 5e34504..d48ca04 100644
--- a/hw/vfio/pci.c
+++ b/hw/vfio/pci.c
@@ -205,12 +205,16 @@ typedef struct VFIODevice {
 bool reset_works;
 bool needs_reset;
 VFIODeviceOps *ops;
+unsigned int num_irqs;
+unsigned int num_regions;
+unsigned int flags;
 } VFIODevice;
 
 struct VFIODeviceOps {
 bool (*vfio_compute_needs_reset)(VFIODevice *vdev);
 int (*vfio_hot_reset_multi)(VFIODevice *vdev);
 void (*vfio_eoi)(VFIODevice *vdev);
+int (*vfio_populate_device)(VFIODevice *vdev);
 };
 
 typedef struct VFIOPCIDevice {
@@ -297,6 +301,8 @@ static uint32_t vfio_pci_read_config(PCIDevice *pdev, uint32_t addr, int len);
 static void vfio_pci_write_config(PCIDevice *pdev, uint32_t addr,
   uint32_t val, int len);
 static void vfio_mmap_set_enabled(VFIOPCIDevice *vdev, bool enabled);
+static void vfio_put_base_device(VFIODevice *vbasedev);
+static int vfio_populate_device(VFIODevice *vbasedev);
 
 /*
  * Common VFIO interrupt disable
@@ -3611,6 +3617,7 @@ static VFIODeviceOps vfio_pci_ops = {
 .vfio_compute_needs_reset = vfio_pci_compute_needs_reset,
 .vfio_hot_reset_multi = vfio_pci_hot_reset_multi,
 .vfio_eoi = vfio_eoi,
+.vfio_populate_device = vfio_populate_device,
 };
 
 static void vfio_reset_handler(void *opaque)
@@ -3952,70 +3959,45 @@ static void vfio_put_group(VFIOGroup *group)
 }
 }
 
-static int vfio_get_device(VFIOGroup *group, const char *name,
-   VFIOPCIDevice *vdev)
+static int vfio_populate_device(VFIODevice *vbasedev)
 {
-struct vfio_device_info dev_info = { .argsz = sizeof(dev_info) };
+VFIOPCIDevice *vdev = container_of(vbasedev, VFIOPCIDevice, vbasedev);
 struct vfio_region_info reg_info = { .argsz = sizeof(reg_info) };
 struct vfio_irq_info irq_info = { .argsz = sizeof(irq_info) };
-int ret, i;
-
-ret = ioctl(group->fd, VFIO_GROUP_GET_DEVICE_FD, name);
-if (ret < 0) {
-error_report("vfio: error getting device %s from group %d: %m",
- name, group->groupid);
-error_printf("Verify all devices in group %d are bound to vfio-pci "
- "or pci-stub and not already in use\n", group->groupid);
-return ret;
-}
-
-vdev->vbasedev.fd = ret;
-vdev->vbasedev.group = group;
-QLIST_INSERT_HEAD(&group->device_list, &vdev->vbasedev, next);
+int i, ret = -1;
 
 /* Sanity check device */
-ret = ioctl(vdev->vbasedev.fd, VFIO_DEVICE_GET_INFO, &dev_info);
-if (ret) {
-error_report("vfio: error getting device info: %m");
-goto error;
-}
-
-trace_vfio_get_device_irq(name, dev_info.flags,
-  dev_info.num_regions, dev_info.num_irqs);
-
-if (!(dev_info.flags & VFIO_DEVICE_FLAGS_PCI)) {
+if (!(vbasedev->flags & VFIO_DEVICE_FLAGS_PCI)) {
 error_report("vfio: Um, this isn't a PCI device");
 goto error;
 }
 
-vdev->vbasedev.reset_works = !!(dev_info.flags & VFIO_DEVICE_FLAGS_RESET);
-
-if (dev_info.num_regions < VFIO_PCI_CONFIG_REGION_INDEX + 1) {
+if (vbasedev->num_regions < VFIO_PCI_CONFIG_REGION_INDEX + 1) {
 error_report("vfio: unexpected number of io regions %u",
- dev_info.num_regions);
+ vbasedev->num_regions);
 goto error;
 }
 
-if (dev_info.num_irqs < VFIO_PCI_MSIX_IRQ_INDEX + 1) {
-error_report("vfio: unexpected number of irqs %u", dev_info.num_irqs);
+if (vbasedev->num_irqs < VFIO_PCI_MSIX_IRQ_INDEX + 1) {
+error_report("vfio: unexpected number of irqs %u", vbasedev->num_irqs);
 goto error;
 }
 
 for (i = VFIO_PCI_BAR0_REGION_INDEX; i < V

[Qemu-devel] [PATCH v6 11/16] hw/arm/dyn_sysbus_devtree: enable vfio-calxeda-xgmac dynamic instantiation

2014-09-09 Thread Eric Auger
vfio-calxeda-xgmac now can be instantiated using the -device option

Signed-off-by: Eric Auger 

---

v2 -> v3:
- correct bug of reg_attr[2*i] in vfio_fdt_add_device_node
- fix a bug related to compat_str_len computed on original compat
  instead of corrected compat
- wrap_vfio_fdt_add_node takes a node creation function: this function
  needs to be specialized for each VFIO device. The wrap function must be
  called in sysbus_device_create_devtree
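
For reference, the compat conversion ("/" becomes ",", ";" becomes NUL) and the resulting property length can be sketched as below. This is a standalone illustration, not the patch's format_compat: the name format_compat_sketch and the explicit length out-parameter are assumptions. It walks the string by index because, once a ";" has been replaced by NUL, strchr and strlen on the corrected buffer would stop at the first entry.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Returns a malloc'ed copy of compat with '/' replaced by ',' and ';'
 * replaced by '\0'. The total property length (every entry plus its
 * terminating NUL) is stored in *len. Returns NULL on allocation failure.
 */
static char *format_compat_sketch(const char *compat, size_t *len)
{
    size_t n = strlen(compat) + 1; /* includes the final NUL */
    char *out = malloc(n);
    size_t i;

    if (out == NULL) {
        return NULL;
    }
    for (i = 0; i < n; i++) {
        char c = compat[i];

        if (c == '/') {
            out[i] = ',';
        } else if (c == ';') {
            out[i] = '\0';
        } else {
            out[i] = c;
        }
    }
    *len = n;
    return out;
}
```

The returned length equals the length of the original string plus one, which is the size a multi-entry "compatible" property would need.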
---
 hw/arm/dyn_sysbus_devtree.c | 141 
 1 file changed, 141 insertions(+)

diff --git a/hw/arm/dyn_sysbus_devtree.c b/hw/arm/dyn_sysbus_devtree.c
index 61e5b5f..3ef9430 100644
--- a/hw/arm/dyn_sysbus_devtree.c
+++ b/hw/arm/dyn_sysbus_devtree.c
@@ -20,6 +20,141 @@
 #include "hw/arm/dyn_sysbus_devtree.h"
 #include "qemu/error-report.h"
 #include "sysemu/device_tree.h"
+#include "hw/vfio/vfio-platform.h"
+#include "hw/vfio/vfio-calxeda-xgmac.h"
+
+typedef void (*vfio_fdt_add_device_node_t)(SysBusDevice *sbdev, void *opaque);
+
+static char *format_compat(char * compat)
+{
+char *str_ptr, *corrected_compat;
+/*
+ * process compatibility property string passed by end-user
+ * replaces / by , and ; by NUL character
+ */
+corrected_compat = g_strdup(compat);
+
+str_ptr = corrected_compat;
+while ((str_ptr = strchr(str_ptr, '/')) != NULL) {
+*str_ptr = ',';
+}
+
+/* substitute ";" with the NUL char */
+str_ptr = corrected_compat;
+while ((str_ptr = strchr(str_ptr, ';')) != NULL) {
+*str_ptr = '\0';
+}
+
+/*
+ * corrected compat includes a "\0" before or at the same location
+ * as compat's one
+ */
+return corrected_compat;
+}
+
+static void wrap_vfio_fdt_add_node(SysBusDevice *sbdev, void *opaque,
+   vfio_fdt_add_device_node_t add_node_fn)
+{
+PlatformDevtreeData *data = opaque;
+VFIOPlatformDevice *vdev = VFIO_PLATFORM_DEVICE(sbdev);
+VFIODevice *vbasedev = &vdev->vbasedev;
+gchar irq_number_prop[8];
+Object *obj = OBJECT(sbdev);
+char *corrected_compat;
+uint64_t irq_number;
+int corrected_compat_str_len, i;
+
+corrected_compat = format_compat(vdev->compat);
+corrected_compat_str_len = strlen(corrected_compat) + 1;
+/* we copy the corrected_compat string + its "\0" */
+snprintf(vdev->compat, corrected_compat_str_len, "%s", corrected_compat);
+g_free(corrected_compat);
+
+add_node_fn(sbdev, opaque);
+
+for (i = 0; i < vbasedev->num_irqs; i++) {
+snprintf(irq_number_prop, sizeof(irq_number_prop), "irq[%d]", i);
+irq_number = object_property_get_int(obj, irq_number_prop, NULL)
+ + data->irq_start;
+/*
+ * for setting irqfd up we must provide the virtual IRQ number
+ * which is the sum of irq_start and actual platform bus irq
+ * index. At realize point we do not have this info.
+ */
+vfio_start_irq_injection(sbdev, i, irq_number);
+}
+}
+
+static void vfio_basic_fdt_add_device_node(SysBusDevice *sbdev,
+void *opaque)
+{
+PlatformDevtreeData *data = opaque;
+void *fdt = data->fdt;
+const char *parent_node = data->node;
+int compat_str_len;
+char *nodename;
+int i, ret;
+uint32_t *irq_attr;
+uint64_t *reg_attr;
+uint64_t mmio_base;
+uint64_t irq_number;
+gchar mmio_base_prop[8];
+gchar irq_number_prop[8];
+VFIOPlatformDevice *vdev = VFIO_PLATFORM_DEVICE(sbdev);
+VFIODevice *vbasedev = &vdev->vbasedev;
+Object *obj = OBJECT(sbdev);
+
+mmio_base = object_property_get_int(obj, "mmio[0]", NULL);
+
+nodename = g_strdup_printf("%s/%s@%" PRIx64, parent_node,
+   vbasedev->name,
+   mmio_base);
+
+qemu_fdt_add_subnode(fdt, nodename);
+
+compat_str_len = strlen(vdev->compat) + 1;
+qemu_fdt_setprop(fdt, nodename, "compatible",
+  vdev->compat, compat_str_len);
+
+reg_attr = g_new(uint64_t, vbasedev->num_regions*4);
+
+for (i = 0; i < vbasedev->num_regions; i++) {
+snprintf(mmio_base_prop, sizeof(mmio_base_prop), "mmio[%d]", i);
+mmio_base = object_property_get_int(obj, mmio_base_prop, NULL);
+reg_attr[4*i] = 1;
+reg_attr[4*i+1] = mmio_base;
+reg_attr[4*i+2] = 1;
+reg_attr[4*i+3] = memory_region_size(&vdev->regions[i]->mem);
+}
+
+ret = qemu_fdt_setprop_sized_cells_from_array(fdt, nodename, "reg",
+ vbasedev->num_regions*2, reg_attr);
+if (ret < 0) {
+error_report("could not set reg

[Qemu-devel] [PATCH v6 13/16] hw/vfio/platform: Add irqfd support

2014-09-09 Thread Eric Auger
This patch optimizes IRQ handling using the irqfd framework.

Instead of handling the eventfds on user-side they are handled on
kernel side using
- the KVM irqfd framework,
- the VFIO driver virqfd framework.

The virtual IRQ completion is trapped at the interrupt controller.
This removes the need for fast/slow path swap.

Overall this brings significant performance improvements.

It depends on host kernel KVM irqfd support.
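
Since the accelerated path depends on host support, the choice between the two injection methods can be sketched as a simple capability check. The names below (select_start_fn and the two stub functions) are hypothetical; the three boolean inputs stand for kvm_irqfds_enabled(), kvm_resamplefds_enabled() and the irqfd_allowed property.

```c
#include <assert.h>
#include <stdbool.h>

typedef int (*start_irq_fn_t)(unsigned int pin);

/* Hypothetical stand-ins for the two injection paths. */
static int start_eventfd_injection(unsigned int pin)
{
    (void)pin;
    return 0; /* user-side eventfd handling */
}

static int start_irqfd_injection(unsigned int pin)
{
    (void)pin;
    return 1; /* kernel-side irqfd handling */
}

/* Use irqfd only when every prerequisite holds; otherwise fall back. */
static start_irq_fn_t select_start_fn(bool irqfds, bool resamplefds,
                                      bool irqfd_allowed)
{
    if (irqfds && resamplefds && irqfd_allowed) {
        return start_irqfd_injection;
    }
    return start_eventfd_injection;
}
```

Falling back to the user-side handler keeps the device usable on hosts without irqfd or resamplefd support, at the cost of the slower path.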

Signed-off-by: Alvise Rigo 
Signed-off-by: Eric Auger 

---

v5 -> v6
- rely on kvm_irqfds_enabled() and kvm_resamplefds_enabled()
- guard KVM code with #ifdef CONFIG_KVM

v3 -> v4:
[Alvise Rigo]
Use of VFIO Platform driver v6 unmask/virqfd feature and removal
of resamplefd handler. Physical IRQ unmasking is now done in
VFIO driver.

v3:
[Eric Auger]
initial support with resamplefd handled on QEMU side since the
unmask was not supported on VFIO platform driver v5.
---
 hw/vfio/platform.c  | 96 +
 include/hw/vfio/vfio-platform.h |  1 +
 trace-events|  2 +
 3 files changed, 99 insertions(+)

diff --git a/hw/vfio/platform.c b/hw/vfio/platform.c
index 93aa94a..a59a842 100644
--- a/hw/vfio/platform.c
+++ b/hw/vfio/platform.c
@@ -24,6 +24,7 @@
 #include "qemu/queue.h"
 #include "hw/sysbus.h"
 #include "trace.h"
+#include "sysemu/kvm.h"
 
 #define MAX_FAKE_INTP 5
 
@@ -323,6 +324,83 @@ static void vfio_fake_intp_injection(void *opaque)
 }
 
 /*
+ * Functions used for irqfd
+ */
+
+#ifdef CONFIG_KVM
+
+/**
+ * vfio_set_resample_eventfd - sets the resamplefd for an IRQ
+ * @intp: the IRQ struct pointer
+ * programs the VFIO driver to unmask this IRQ when the
+ * intp->unmask eventfd is triggered
+ */
+static int vfio_set_resample_eventfd(VFIOINTp *intp)
+{
+VFIODevice *vbasedev = &intp->vdev->vbasedev;
+struct vfio_irq_set *irq_set;
+int argsz, ret;
+int32_t *pfd;
+
+argsz = sizeof(*irq_set) + sizeof(*pfd);
+irq_set = g_malloc0(argsz);
+irq_set->argsz = argsz;
+irq_set->flags = VFIO_IRQ_SET_DATA_EVENTFD | VFIO_IRQ_SET_ACTION_UNMASK;
+irq_set->index = intp->pin;
+irq_set->start = 0;
+irq_set->count = 1;
+pfd = (int32_t *)&irq_set->data;
+*pfd = event_notifier_get_fd(&intp->unmask);
+qemu_set_fd_handler(*pfd, NULL, NULL, intp);
+ret = ioctl(vbasedev->fd, VFIO_DEVICE_SET_IRQS, irq_set);
+g_free(irq_set);
+if (ret < 0) {
+error_report("vfio: Failed to set resample eventfd: %m");
+qemu_set_fd_handler(*pfd, NULL, NULL, NULL);
+}
+return ret;
+}
+
+/**
+ * vfio_start_irqfd_injection - starts irqfd injection for an IRQ
+ * programs VFIO driver with both the trigger and resamplefd
+ * programs KVM with the gsi, trigger & resample eventfds
+ */
+static int vfio_start_irqfd_injection(VFIOINTp *intp)
+{
+struct kvm_irqfd irqfd = {
+.fd = event_notifier_get_fd(&intp->interrupt),
+.resamplefd = event_notifier_get_fd(&intp->unmask),
+.gsi = intp->virtualID,
+.flags = KVM_IRQFD_FLAG_RESAMPLE,
+};
+
+if (kvm_vm_ioctl(kvm_state, KVM_IRQFD, &irqfd)) {
+error_report("vfio: Error: Failed to assign the irqfd: %m");
+goto fail_irqfd;
+}
+if (vfio_set_trigger_eventfd(intp, NULL) < 0) {
+goto fail_vfio;
+}
+if (vfio_set_resample_eventfd(intp) < 0) {
+goto fail_vfio;
+}
+
+intp->kvm_accel = true;
+trace_vfio_platform_start_irqfd_injection(intp->pin, intp->virtualID,
+ irqfd.fd, irqfd.resamplefd);
+return 0;
+
+fail_vfio:
+irqfd.flags = KVM_IRQFD_FLAG_DEASSIGN;
+kvm_vm_ioctl(kvm_state, KVM_IRQFD, &irqfd);
+fail_irqfd:
+return -1;
+}
+
+#endif
+
+/*
  * Functions used whatever the injection method
  */
 
@@ -418,6 +496,13 @@ static VFIOINTp *vfio_init_intp(VFIODevice *vbasedev, unsigned int index)
 error_report("vfio: Error: trigger event_notifier_init failed ");
 return NULL;
 }
+/* Get an eventfd for resample/unmask */
+ret = event_notifier_init(&intp->unmask, 0);
+if (ret) {
+g_free(intp);
+error_report("vfio: Error: resample event_notifier_init failed eoi");
+return NULL;
+}
 
 /* store the new intp in qlist */
 QLIST_INSERT_HEAD(&vdev->intp_list, intp, next);
@@ -660,7 +745,17 @@ static void vfio_platform_realize(DeviceState *dev, Error **errp)
 
 vbasedev->type = VFIO_DEVICE_TYPE_PLATFORM;
 vbasedev->ops = &vfio_platform_ops;
+
+#ifdef CONFIG_KVM
+if (kvm_irqfds_enabled() && kvm_resamplefds_enabled() &&
+vdev->irqfd_allowed) {
+vdev->start_irq_fn = vfio_start_irqfd_injection;
+} else {
+vdev->start_irq_fn = vfio_start_eventfd_injection;
+}
+#else
 vdev->start_ir

[Qemu-devel] [PATCH v6 07/16] hw/vfio/pci: use name field in format strings

2014-09-09 Thread Eric Auger
Signed-off-by: Eric Auger 
---
 hw/vfio/pci.c | 213 --
 trace-events  | 105 ++---
 2 files changed, 111 insertions(+), 207 deletions(-)

diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
index 5623539..c617b79 100644
--- a/hw/vfio/pci.c
+++ b/hw/vfio/pci.c
@@ -387,9 +387,7 @@ static void vfio_intx_interrupt(void *opaque)
 return;
 }
 
-trace_vfio_intx_interrupt(vdev->host.domain, vdev->host.bus,
-  vdev->host.slot, vdev->host.function,
-  'A' + vdev->intx.pin);
+trace_vfio_intx_interrupt(vdev->vbasedev.name, 'A' + vdev->intx.pin);
 
 vdev->intx.pending = true;
 pci_irq_assert(&vdev->pdev);
@@ -408,8 +406,7 @@ static void vfio_eoi(VFIODevice *vbasedev)
 return;
 }
 
-trace_vfio_eoi(vdev->host.domain, vdev->host.bus,
-   vdev->host.slot, vdev->host.function);
+trace_vfio_eoi(vbasedev->name);
 
 vdev->intx.pending = false;
 pci_irq_deassert(&vdev->pdev);
@@ -478,8 +475,7 @@ static void vfio_enable_intx_kvm(VFIOPCIDevice *vdev)
 
 vdev->intx.kvm_accel = true;
 
-trace_vfio_enable_intx_kvm(vdev->host.domain, vdev->host.bus,
-   vdev->host.slot, vdev->host.function);
+trace_vfio_enable_intx_kvm(vdev->vbasedev.name);
 
 return;
 
@@ -531,8 +527,7 @@ static void vfio_disable_intx_kvm(VFIOPCIDevice *vdev)
 /* If we've missed an event, let it re-fire through QEMU */
 vfio_unmask_irqindex(&vdev->vbasedev, VFIO_PCI_INTX_IRQ_INDEX);
 
-trace_vfio_disable_intx_kvm(vdev->host.domain, vdev->host.bus,
-vdev->host.slot, vdev->host.function);
+trace_vfio_disable_intx_kvm(vdev->vbasedev.name);
 #endif
 }
 
@@ -551,8 +546,7 @@ static void vfio_update_irq(PCIDevice *pdev)
 return; /* Nothing changed */
 }
 
-trace_vfio_update_irq(vdev->host.domain, vdev->host.bus,
-  vdev->host.slot, vdev->host.function,
+trace_vfio_update_irq(vdev->vbasedev.name,
   vdev->intx.route.irq, route.irq);
 
 vfio_disable_intx_kvm(vdev);
@@ -628,8 +622,7 @@ static int vfio_enable_intx(VFIOPCIDevice *vdev)
 
 vdev->interrupt = VFIO_INT_INTx;
 
-trace_vfio_enable_intx(vdev->host.domain, vdev->host.bus,
-   vdev->host.slot, vdev->host.function);
+trace_vfio_enable_intx(vdev->vbasedev.name);
 
 return 0;
 }
@@ -651,8 +644,7 @@ static void vfio_disable_intx(VFIOPCIDevice *vdev)
 
 vdev->interrupt = VFIO_INT_NONE;
 
-trace_vfio_disable_intx(vdev->host.domain, vdev->host.bus,
-vdev->host.slot, vdev->host.function);
+trace_vfio_disable_intx(vdev->vbasedev.name);
 }
 
 /*
@@ -679,9 +671,7 @@ static void vfio_msi_interrupt(void *opaque)
 abort();
 }
 
-trace_vfio_msi_interrupt(vdev->host.domain, vdev->host.bus,
- vdev->host.slot, vdev->host.function,
- nr, msg.address, msg.data);
+trace_vfio_msi_interrupt(vbasedev->name, nr, msg.address, msg.data);
 #endif
 
 if (vdev->interrupt == VFIO_INT_MSIX) {
@@ -788,9 +778,7 @@ static int vfio_msix_vector_do_use(PCIDevice *pdev, unsigned int nr,
 VFIOMSIVector *vector;
 int ret;
 
-trace_vfio_msix_vector_do_use(vdev->host.domain, vdev->host.bus,
-  vdev->host.slot, vdev->host.function,
-  nr);
+trace_vfio_msix_vector_do_use(vdev->vbasedev.name, nr);
 
 vector = &vdev->msi_vectors[nr];
 
@@ -876,9 +864,7 @@ static void vfio_msix_vector_release(PCIDevice *pdev, unsigned int nr)
 VFIOPCIDevice *vdev = DO_UPCAST(VFIOPCIDevice, pdev, pdev);
 VFIOMSIVector *vector = &vdev->msi_vectors[nr];
 
-trace_vfio_msix_vector_release(vdev->host.domain, vdev->host.bus,
-   vdev->host.slot, vdev->host.function,
-   nr);
+trace_vfio_msix_vector_release(vdev->vbasedev.name, nr);
 
 /*
  * There are still old guests that mask and unmask vectors on every
@@ -941,8 +927,7 @@ static void vfio_enable_msix(VFIOPCIDevice *vdev)
 error_report("vfio: msix_set_vector_notifiers failed");
 }
 
-trace_vfio_enable_msix(vdev->host.domain, vdev->host.bus,
-   vdev->host.slot, vdev->host.function);
+trace_vfio_enable_msix(vdev->vbasedev.name);
 }
 
 static void vfio_enable_msi(VFIOPCIDevice *vdev)
@@ -1018,9 +1003,7 @@ retry:
 return;
 }
 
-trace_vfio_enable_msi(vdev->host.domain, vdev->host.bus,
-  vdev->host.slo

[Qemu-devel] [PATCH v6 12/16] vfio/platform: add fake injection modality

2014-09-09 Thread Eric Auger
This code is aimed at testing multiple IRQ injection with
user-side handled eventfds. The principle is that a timer periodically
triggers an IRQ at VFIO driver level. This IRQ then follows the
regular path: VFIO driver -> eventfd trigger -> user-side eventfd handler.
The IRQ is not injected into the guest. The IRQ is completed
on another timer timeout to emulate EOI on write/read access.

For instance, the following options
 x-fake-irq[0]=1,x-fake-period[0]=10,x-fake-duration[0]=50,
x-fake-irq[1]=2,x-fake-period[1]=20,x-fake-duration[1]=100
set VFIO platform IRQs #1 and #2 as fake IRQs.

Signed-off-by: Eric Auger 

---

This modality was used to test calxeda xgmac assignment with the
main IRQ generated by the HW and IRQs #1 and #2 as fake IRQs.
---
 hw/vfio/platform.c  | 131 +++-
 include/hw/vfio/vfio-platform.h |  13 
 trace-events|   3 +
 3 files changed, 145 insertions(+), 2 deletions(-)

diff --git a/hw/vfio/platform.c b/hw/vfio/platform.c
index 9987b25..93aa94a 100644
--- a/hw/vfio/platform.c
+++ b/hw/vfio/platform.c
@@ -25,6 +25,8 @@
 #include "hw/sysbus.h"
 #include "trace.h"
 
+#define MAX_FAKE_INTP 5
+
 static void vfio_intp_interrupt(VFIOINTp *intp);
 typedef void (*eventfd_user_side_handler_t)(VFIOINTp *intp);
 static int vfio_set_trigger_eventfd(VFIOINTp *intp,
@@ -141,6 +143,27 @@ static void vfio_intp_mmap_enable(void *opaque)
 }
 
 /**
+ * vfio_fake_intp_index - returns the fake IRQ index
+ *
+ * @intp the interrupt struct pointer
+ * if the IRQ is not fake, returns < 0
+ * if it is fake returns the index of the fake IRQ
+ * ie the index i for which x-fake-irq[i]=intp->pin
+ */
+static int vfio_fake_intp_index(VFIOINTp *intp)
+{
+VFIOPlatformDevice *vdev = intp->vdev;
+int i;
+
+for (i = 0; i < MAX_FAKE_INTP; i++) {
+if (intp->pin == vdev->fake_intp_index[i]) {
+return i;
+}
+}
+return -1;
+}
+
+/**
  * vfio_intp_interrupt - The user-side eventfd handler
  * @opaque: opaque pointer which in practice is the VFIOINTp*
  *
@@ -199,8 +222,18 @@ static void vfio_intp_interrupt(VFIOINTp *intp)
 /* sets slow path */
 vfio_mmap_set_enabled(vdev, false);
 
-/* trigger the virtual IRQ */
-qemu_set_irq(intp->qemuirq, 1);
+if (intp->fake_intp_index < 0) {
+/* trigger the virtual IRQ */
+qemu_set_irq(intp->qemuirq, 1);
+} else {
+/*
+ * the vIRQ is not triggered but we emulate a handling
+ * duration
+ */
+timer_mod(intp->fake_eoi_timer,
+  qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL) +
+  intp->fake_intp_duration);
+}
 
 /* schedule the mmap timer which will restore mmap path after EOI*/
 if (vdev->mmap_timeout) {
@@ -231,9 +264,64 @@ static int vfio_start_eventfd_injection(VFIOINTp *intp)
 return ret;
 }
 vfio_unmask_irqindex(vbasedev, intp->pin);
+
+/* in case of fake irq, starts its injection */
+if (intp->fake_intp_index >= 0) {
+timer_mod(intp->fake_intp_timer,
+  qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL) +
+  intp->fake_intp_period);
+}
 return 0;
 }
 
+/**
+ * vfio_fake_intp_eoi - fake interrupt completion routine
+ * @opaque: actually is an IRQ struct pointer
+ *
+ * called on timer handler context
+ */
+static void vfio_fake_intp_eoi(void *opaque)
+{
+    VFIOINTp *intp = (VFIOINTp *)opaque;
+    trace_vfio_fake_intp_eoi(intp->pin);
+    vfio_platform_eoi(&intp->vdev->vbasedev);
+}
+
+/**
+ * vfio_fake_intp_injection - fake interrupt injection routine
+ * @opaque: actually is an IRQ struct pointer
+ *
+ * called on timer context
+ * use the VFIO loopback mode, ie. triggers the eventfd
+ * associated to the intp->pin although no physical IRQ hit.
+ */
+static void vfio_fake_intp_injection(void *opaque)
+{
+    VFIOINTp *intp = (VFIOINTp *)opaque;
+    VFIODevice *vbasedev = &intp->vdev->vbasedev;
+    struct vfio_irq_set *irq_set;
+    int argsz, ret;
+    int32_t *pfd;
+
+    argsz = sizeof(*irq_set) + sizeof(*pfd);
+    irq_set = g_malloc0(argsz);
+    irq_set->argsz = argsz;
+    irq_set->flags = VFIO_IRQ_SET_DATA_NONE | VFIO_IRQ_SET_ACTION_TRIGGER;
+    irq_set->index = intp->pin;
+    irq_set->start = 0;
+    irq_set->count = 1;
+    ret = ioctl(vbasedev->fd, VFIO_DEVICE_SET_IRQS, irq_set);
+    g_free(irq_set);
+    if (ret < 0) {
+        error_report("vfio: Failed to trigger fake IRQ: %m");
+    } else {
+        trace_vfio_fake_intp_injection(intp->pin);
+        timer_mod(intp->fake_intp_timer,
+                  qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL) +
+                  intp->fake_intp_period);
+    }
+}
+
 /*
  * Functions used whatever the injection method
  */
@@ -304,6 +392,23 @@ static VFIOINTp *vfio_init_intp(VFIODevice *vbasedev, 
unsigned in
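
For readers skimming the archive: the lookup done by vfio_fake_intp_index() above can be modelled outside QEMU roughly as follows. This is only a sketch under stated assumptions, not the patch's code: struct fake_irq_table, NO_FAKE_PIN and fake_irq_slot() are hypothetical names, and the idea that unused x-fake-irq slots hold a negative sentinel is an assumption the patch excerpt does not confirm.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_FAKE_INTP 5      /* same bound as in the patch */
#define NO_FAKE_PIN   (-1)   /* hypothetical sentinel for an unused slot */

/* Hypothetical stand-in for the device's x-fake-irq[] property array. */
struct fake_irq_table {
    int pin[MAX_FAKE_INTP];
};

/*
 * Returns the slot i for which table->pin[i] == pin, or -1 when the pin
 * is not listed, i.e. when it is a real (non-fake) IRQ.
 * Pins are assumed to be non-negative so they never match the sentinel.
 */
static int fake_irq_slot(const struct fake_irq_table *table, int pin)
{
    assert(table != NULL);
    assert(pin >= 0);

    for (int i = 0; i < MAX_FAKE_INTP; i++) {
        if (table->pin[i] == pin) {
            return i;
        }
    }
    return -1;
}
```

A caller would branch on the sign of the result, in the same way vfio_intp_interrupt() either raises the virtual IRQ or arms the fake EOI timer.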

[Qemu-devel] [PATCH v6 14/16] linux-headers: Update KVM headers from linux-next tag ToBeFilled

2014-09-09 Thread Eric Auger
Sync up the KVM-related Linux headers from the linux-next tree using
scripts/update-linux-headers.sh.

Integrate the updated KVM-VFIO API related to forwarded IRQs.

Signed-off-by: Eric Auger 
---
 linux-headers/linux/kvm.h | 9 +
 1 file changed, 9 insertions(+)

diff --git a/linux-headers/linux/kvm.h b/linux-headers/linux/kvm.h
index f5d2c38..42128d5 100644
--- a/linux-headers/linux/kvm.h
+++ b/linux-headers/linux/kvm.h
@@ -940,6 +940,12 @@ struct kvm_device_attr {
__u64   addr;   /* userspace address of attr data */
 };
 
+struct kvm_arch_forwarded_irq {
+	__u32 fd; /* file descriptor of the VFIO device */
+	__u32 index; /* VFIO device IRQ index */
+	__u32 gsi; /* gsi, ie. virtual IRQ number */
+};
+
 #define KVM_DEV_TYPE_FSL_MPIC_20   1
 #define KVM_DEV_TYPE_FSL_MPIC_42   2
 #define KVM_DEV_TYPE_XICS  3
@@ -947,6 +953,9 @@ struct kvm_device_attr {
 #define  KVM_DEV_VFIO_GROUP                    1
 #define   KVM_DEV_VFIO_GROUP_ADD               1
 #define   KVM_DEV_VFIO_GROUP_DEL               2
+#define  KVM_DEV_VFIO_DEVICE                   2
+#define   KVM_DEV_VFIO_DEVICE_FORWARD_IRQ      1
+#define   KVM_DEV_VFIO_DEVICE_UNFORWARD_IRQ    2
 #define KVM_DEV_TYPE_ARM_VGIC_V2   5
 #define KVM_DEV_TYPE_FLIC  6
 
-- 
1.8.3.2
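
As an illustration of how the new attribute might be filled in by userspace, here is a minimal sketch. Assumptions: struct fwd_irq and struct dev_attr are local mirrors of kvm_arch_forwarded_irq and kvm_device_attr so the snippet builds without the updated headers, the DEV_VFIO_* constants copy the values from the hunk above, and build_forward_attr() is a hypothetical helper that is not part of this series. The actual request would presumably go through the KVM_SET_DEVICE_ATTR ioctl on the KVM-VFIO device fd, which is left out here.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Local mirror of struct kvm_arch_forwarded_irq from the patch. */
struct fwd_irq {
    uint32_t fd;     /* file descriptor of the VFIO device */
    uint32_t index;  /* VFIO device IRQ index */
    uint32_t gsi;    /* virtual IRQ number */
};

/* Local mirror of struct kvm_device_attr. */
struct dev_attr {
    uint32_t flags;
    uint32_t group;
    uint64_t attr;
    uint64_t addr;   /* userspace address of attr data */
};

/* Values copied from the kvm.h hunk above. */
#define DEV_VFIO_DEVICE                2
#define DEV_VFIO_DEVICE_FORWARD_IRQ    1
#define DEV_VFIO_DEVICE_UNFORWARD_IRQ  2

/*
 * Hypothetical helper: fills *irq and points *attr at it.
 * forward != 0 selects FORWARD_IRQ, otherwise UNFORWARD_IRQ.
 * irq must stay valid until the attribute has been consumed.
 */
static void build_forward_attr(struct dev_attr *attr, struct fwd_irq *irq,
                               uint32_t vfio_fd, uint32_t index,
                               uint32_t gsi, int forward)
{
    assert(attr != NULL && irq != NULL);

    irq->fd = vfio_fd;
    irq->index = index;
    irq->gsi = gsi;

    memset(attr, 0, sizeof(*attr));
    attr->group = DEV_VFIO_DEVICE;
    attr->attr = forward ? DEV_VFIO_DEVICE_FORWARD_IRQ
                         : DEV_VFIO_DEVICE_UNFORWARD_IRQ;
    attr->addr = (uint64_t)(uintptr_t)irq;
}
```

Keeping the payload in a caller-owned struct avoids an allocation, at the cost of requiring the caller to keep it alive across the ioctl.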




[Qemu-devel] [PATCH v6 15/16] VFIO: COMMON: vfio_kvm_device_fd moved in the common header

2014-09-09 Thread Eric Auger
vfio_kvm_device_fd is now also used by the platform code for forwarded IRQ setup.

Signed-off-by: Eric Auger 
---
 hw/vfio/common.c  | 3 ++-
 include/hw/vfio/vfio-common.h | 5 +
 2 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index 252c0b8..466b0e8 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -44,9 +44,10 @@ struct vfio_as_head vfio_address_spaces =
  * initialized, this file descriptor is only released on QEMU exit and
  * we'll re-use it should another vfio device be attached before then.
  */
-static int vfio_kvm_device_fd = -1;
+int vfio_kvm_device_fd = -1;
 #endif
 
+
 /*
  * Common VFIO interrupt disable
  */
diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h
index 83c7876..0ae0153 100644
--- a/include/hw/vfio/vfio-common.h
+++ b/include/hw/vfio/vfio-common.h
@@ -41,6 +41,11 @@
 #define VFIO_ALLOW_KVM_MSI 1
 #define VFIO_ALLOW_KVM_MSIX 1
 
+#ifdef CONFIG_KVM
+extern int vfio_kvm_device_fd;
+#endif
+
+
 enum {
     VFIO_DEVICE_TYPE_PCI = 0,
     VFIO_DEVICE_TYPE_PLATFORM = 1,
-- 
1.8.3.2



