Herbert,
> I don't think this is safe unless you do some kind of locking
> which would slow down the data path. The easiest fix would be
> to keep the old tfm around forever, or use RCU if RCU read locking
> is acceptable to your use-case.
You're right. There's a small race there.
Patch serie
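
For illustration, the "keep the old tfm around forever" option quoted above can be sketched in userspace C11. This is not kernel code and not the patch from this thread: struct crc_impl, crc_impl_publish(), crc_active(), crc_compute() and the two toy implementations are hypothetical names, and the byte sum is only a placeholder for a real CRC. The point is that the data path does a single atomic load and takes no lock, which is safe only because a replaced implementation is never freed.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a crypto transform: one function pointer. */
struct crc_impl {
    uint16_t (*update)(uint16_t crc, const unsigned char *buf, size_t len);
};

/* Placeholder checksums (plain byte sums, NOT a real CRC). Two different
 * code paths with the same result, like a generic and an accelerated tfm. */
static uint16_t sum_forward(uint16_t crc, const unsigned char *buf, size_t len)
{
    for (size_t i = 0; i < len; i++)
        crc = (uint16_t)(crc + buf[i]);
    return crc;
}

static uint16_t sum_backward(uint16_t crc, const unsigned char *buf, size_t len)
{
    while (len > 0)
        crc = (uint16_t)(crc + buf[--len]);
    return crc;
}

static const struct crc_impl generic_impl = { sum_forward };
static const struct crc_impl fast_impl = { sum_backward };

/* Currently published implementation; NULL until the first publish. */
static _Atomic(const struct crc_impl *) active_impl;

/* Updater: publish a new implementation. The previous one is deliberately
 * never freed, so a reader that already loaded the old pointer stays safe. */
static void crc_impl_publish(const struct crc_impl *impl)
{
    assert(impl != NULL && impl->update != NULL);
    atomic_store_explicit(&active_impl, impl, memory_order_release);
}

/* Reader helper: a single atomic load, no lock. */
static const struct crc_impl *crc_active(void)
{
    return atomic_load_explicit(&active_impl, memory_order_acquire);
}

/* Data path: use whichever implementation is published right now. */
static uint16_t crc_compute(uint16_t crc, const unsigned char *buf, size_t len)
{
    const struct crc_impl *impl = crc_active();

    assert(impl != NULL);
    return impl->update(crc, buf, len);
}
```

The cost of this approach is the leaked old object. Freeing it safely needs a way to wait for readers to finish, which is what the RCU alternative mentioned in the quote provides.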
On Fri, Aug 24, 2018 at 05:46:15PM -0400, Martin K. Petersen wrote:
>
> +#ifdef CONFIG_MODULES
> + struct module *mod = data;
> +
> + if (val != MODULE_STATE_LIVE ||
> + strncmp(mod->name, "crct10dif", strlen("crct10dif")))
> + return 0;
> +
> + /* Fall back to libra
Ard,
> This looks like it should work, yes. It does rely on the module name
> to start with 'crct10dif' but I guess that is reasonable, and matches
> the current state on all architectures.
Yep, I verified the module names on ARM and Power. There really wasn't
much I could key off of other than
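
The prefix test discussed above can be tried outside the kernel. Below is a minimal userspace sketch of the same strncmp() check from the quoted proof of concept; the helper name is_crct10dif_module() and the CRCT10DIF_PREFIX macro are made up for illustration, and the module names you feed it are up to you.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define CRCT10DIF_PREFIX "crct10dif"

/* True when name starts with "crct10dif". Comparing only the first
 * strlen(prefix) bytes is what turns strncmp() into a prefix match. */
static bool is_crct10dif_module(const char *name)
{
    assert(name != NULL);
    return strncmp(name, CRCT10DIF_PREFIX, strlen(CRCT10DIF_PREFIX)) == 0;
}
```

A name shorter than the prefix, such as "crct10", does not match, because strncmp() stops at its terminating NUL and reports a difference.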
On 24 August 2018 at 22:46, Martin K. Petersen wrote:
>
> Ard,
>
>> I'd prefer to handle this without help from userland.
>>
>> It shouldn't be too difficult to register a module notifier that only
>> sets a flag (or the static key, even?), and to free and re-allocate
>> the crc_t10dif transform i
Ard,
> I'd prefer to handle this without help from userland.
>
> It shouldn't be too difficult to register a module notifier that only
> sets a flag (or the static key, even?), and to free and re-allocate
> the crc_t10dif transform if the flag is set.
Something like this proof of concept?
diff
On 24 August 2018 at 17:29, Martin K. Petersen wrote:
>
> Ard,
>
>> Would it be possible to allocate the crypto transform upon first use
>> instead of from an initcall? If crc_t10dif() is mostly called from
>> non-process context, that would not really work, but otherwise, we
>> could simply defer
Jeffrey,
> So it seems the CONFIG_CRYPTO_CRCT10DIF_PCLMUL=y provides the best
> performance.
Thanks for confirming!
> Are there any negative side effect to this config option?
Other than kernel image size, not really.
--
Martin K. Petersen Oracle Linux Engineering
Ard,
> Would it be possible to allocate the crypto transform upon first use
> instead of from an initcall? If crc_t10dif() is mostly called from
> non-process context, that would not really work, but otherwise, we
> could simply defer it (and occasional calls from non-process context
> that do o
> Subject: Re: [PATCH] Performance Improvement in CRC16 Calculations.
>
> On Tue, Aug 21, 2018 at 09:40:34PM -0400, Martin K. Petersen wrote:
>> W
On Tue, Aug 21, 2018 at 09:40:34PM -0400, Martin K. Petersen wrote:
> When crc-t10dif is initialized, the crypto infrastructure will pick
> the al
On Tue, Aug 21, 2018 at 09:40:34PM -0400, Martin K. Petersen wrote:
> When crc-t10dif is initialized, the crypto infrastructure will pick the
> algorithm with the highest priority currently registered. Both block and
> SCSI will cause crc-t10dif to be compiled as a built-in so this
> selection happ
> These days we obviously use the hardware-accelerated CRC calculation
> so the software table approach mostly serves as a reference
> implementation.
I was puzzled as to why WDC's tests did not seem to use the hardware-
accelerated CRC calculation whereas tests on my end worked fine. Turns
out
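
For readers unfamiliar with the "software table approach" mentioned in the quote, here is a minimal userspace sketch. It is not the kernel's crc-t10dif source. It assumes the usual T10-DIF parameters (polynomial 0x8BB7, initial value 0, no bit reflection, no final XOR), and the function and table names are invented for this example. It shows a bit-at-a-time reference and a 256-entry table version built from it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define T10DIF_POLY 0x8BB7u

/* Bit-at-a-time reference: slow, but easy to check against the spec. */
static uint16_t crc_t10dif_bitwise(uint16_t crc, const unsigned char *buf,
                                   size_t len)
{
    assert(buf != NULL || len == 0);
    for (size_t i = 0; i < len; i++) {
        crc ^= (uint16_t)(buf[i] << 8);
        for (int bit = 0; bit < 8; bit++)
            crc = (crc & 0x8000u) ? (uint16_t)((crc << 1) ^ T10DIF_POLY)
                                  : (uint16_t)(crc << 1);
    }
    return crc;
}

/* t10dif_table[b] holds the CRC of the single byte b. */
static uint16_t t10dif_table[256];

/* Must be called once before crc_t10dif_table(). */
static void crc_t10dif_table_init(void)
{
    for (unsigned int b = 0; b < 256; b++) {
        unsigned char byte = (unsigned char)b;

        t10dif_table[b] = crc_t10dif_bitwise(0, &byte, 1);
    }
}

/* Byte-at-a-time version: one table lookup per input byte. */
static uint16_t crc_t10dif_table(uint16_t crc, const unsigned char *buf,
                                 size_t len)
{
    assert(buf != NULL || len == 0);
    for (size_t i = 0; i < len; i++)
        crc = (uint16_t)((crc << 8) ^ t10dif_table[(crc >> 8) ^ buf[i]]);
    return crc;
}
```

Both functions should agree on any input. Each table lookup depends on the CRC produced by the previous byte, and that serial dependency is the bottleneck the hardware-accelerated versions avoid.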
> With regard to your comment about slice (table ?) size, that is
> partially addressed by a kernel build time option shown in the above
> patch. That could be taken a bit further with a sysfs knob (where ?)
> to reduce the effective table size from that which the kernel is built
> with. To incre
On Fri, Aug 10, 2018 at 02:12:11PM -0500, Jeff Lien wrote:
This p
On Fri, Aug 10, 2018 at 02:12:11PM -0500, Jeff Lien wrote:
This patch provides a performance improvement for the CRC16
calculations done in read/write workloads using the T10 Type 1/2/3
guard field.
On Fri, Aug 10, 2018 at 02:12:11PM -0500, Jeff Lien wrote:
This p
On Fri, Aug 10, 201
Hi!
> This patch provides a performance improvement for the CRC16 calculations done
> in read/write
> workloads using the T10 Type 1/2/3 guard field. For example, today with
> sequential write
> workloads (one thread/CPU of IO) we consume 100% of the CPU because of the
> CRC16 computation
> bo
On 08/10/2018
On 08/10/2018 12:12 PM, Jeff Lien wrote:
> This patch provides a performance improvement for the CRC16 calculations done
> in read/write
> workloads using the T10 Type 1/2/3 guard field. For example, today with
> sequential write
> workloads (one thread/CPU of IO) we consume 100% of the CPU beca
On Sat, 2018-08-11 at 02:04 -0700, Joe Perches wrote:
> On Fri, 2018-08-10 at 22:39 -0400, Douglas Gilbert wrote:
> > but below is a copy and paste of a table 27 from draft SBC-4
> > revision 15 in chapter
> A more interesting version would be to generate the lookup table
> for a byte followed by 3 zero bytes.
> You could then run four separate register dependency chains using the
> same 256 entry lookup table.
Not sure that works with a table lookup :-(
David
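
To make "the lookup table for a byte followed by 3 zero bytes" concrete, here is a minimal userspace sketch of how such tables are used in the common slice-by-4 technique. It is neither the patch under review nor the single-shared-table variant suggested in the quote: it uses four tables, the names are invented, and it assumes the usual T10-DIF parameters (polynomial 0x8BB7, initial value 0, no reflection). slice[k][b] is the CRC of byte b followed by k zero bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define T10DIF_POLY 0x8BB7u

/* Bit-at-a-time reference, used to build the tables and to cross-check. */
static uint16_t crc16_bitwise(uint16_t crc, const unsigned char *buf, size_t len)
{
    assert(buf != NULL || len == 0);
    for (size_t i = 0; i < len; i++) {
        crc ^= (uint16_t)(buf[i] << 8);
        for (int bit = 0; bit < 8; bit++)
            crc = (crc & 0x8000u) ? (uint16_t)((crc << 1) ^ T10DIF_POLY)
                                  : (uint16_t)(crc << 1);
    }
    return crc;
}

/* slice[k][b] = CRC of byte b followed by k zero bytes, k = 0..3. */
static uint16_t slice[4][256];

/* Must be called once before crc16_slice4(). */
static void slice_init(void)
{
    static const unsigned char zero = 0;

    for (unsigned int b = 0; b < 256; b++) {
        unsigned char byte = (unsigned char)b;
        uint16_t crc = crc16_bitwise(0, &byte, 1);

        for (int k = 0; k < 4; k++) {
            slice[k][b] = crc;
            crc = crc16_bitwise(crc, &zero, 1); /* append one zero byte */
        }
    }
}

/* Consume 4 bytes per step. The four lookups do not depend on each other,
 * only on the CRC from the previous step, so they can overlap in the CPU. */
static uint16_t crc16_slice4(uint16_t crc, const unsigned char *buf, size_t len)
{
    assert(buf != NULL || len == 0);
    while (len >= 4) {
        crc = (uint16_t)(slice[3][(crc >> 8) ^ buf[0]] ^
                         slice[2][(crc & 0xFFu) ^ buf[1]] ^
                         slice[1][buf[2]] ^
                         slice[0][buf[3]]);
        buf += 4;
        len -= 4;
    }
    while (len-- > 0) /* tail: ordinary byte-at-a-time update */
        crc = (uint16_t)((crc << 8) ^ slice[0][(crc >> 8) ^ *buf++]);
    return crc;
}
```

This works because the CRC is linear over XOR when the initial value is 0: the CRC of a 4-byte block is the XOR of the CRCs of each byte padded with the zero bytes that follow it, and the incoming 16-bit CRC folds into the first two bytes.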
From: Jeff Lien
> Sent: 10 August 2018 20:12
>
> This patch provides a performance improvement for the CRC16 calculations done
> in read/write
> workloads using the T10 Type 1/2/3 guard field. For example, today with
> sequential write
> workloads (one thread/CPU of IO) we consume 100% of the C
On 8/10/18, 12:13 PM, "linux-block-ow...@vger.kernel.org on behalf of Jeff
Lien" wrote:
This patch provides a performance improvement for the CRC16 calculations
done in read/write
workloads using the T10 Type 1/2/3 guard field. For example, today with
sequential write
workloads
On Sun, 2018-08-12 at 23:36 -0400, Douglas Gilbert wrote:
> On 2018-08-10 08:11 PM, Joe Perches wrote:
> > On Fri, 2018-08-10 at 16:02 -0400, Nicolas Pitre wrote:
> > > On Fri, 10 Aug 2018, Joe Perches wrote:
> > >
> > > > On Fri, 2018-08-10 at 14:12 -0500, Jeff Lien wrote:
> > > > > This patch pr
On 2018-08-10 08:11 PM, Joe Perches wrote:
On Fri, 2018-08-10 at 16:02 -0400, Nicolas Pitre wrote:
On Fri, 10 Aug 2018, Joe Perches wrote:
On Fri, 2018-08-10 at 14:12 -0500, Jeff Lien wrote:
This patch provides a performance improvement for the CRC16 calculations done
in read/write
workloads
On Sat, 2018-08-11 at 11:36 -0400, Martin K. Petersen wrote:
> Jeff,
>
> > This patch provides a performance improvement for the CRC16
> > calculations done in read/write workloads using the T10 Type 1/2/3
> > guard field. For example, today with sequential write workloads (one
> > thread/CPU of
Jeff,
> This patch provides a performance improvement for the CRC16
> calculations done in read/write workloads using the T10 Type 1/2/3
> guard field. For example, today with sequential write workloads (one
> thread/CPU of IO) we consume 100% of the CPU because of the CRC16
> computation bottl
On Sat, 2018-08-11 at 02:04 -0700, Joe Perches wrote:
> On Fri, 2018-08-10 at 22:39 -0400, Douglas Gilbert wrote:
> > but below is a copy and paste of a table 27 from draft SBC-4
> > revision 15 in chapter 4.22.4.4 on page 87.
>
> The posted code returns the proper crc for each
> CONFIG_CRYPTO_CRC
On Fri, 2018-08-10 at 22:39 -0400, Douglas Gilbert wrote:
> but below is a copy and paste of a table 27 from draft SBC-4
> revision 15 in chapter 4.22.4.4 on page 87.
The posted code returns the proper crc for each
CONFIG_CRYPTO_CRCT10DIF_TABLE_SIZE value from
1 to 5 for these arrays.
On Fri, 10 Aug 2018, Joe Perches wrote:
> On Fri, 2018-08-10 at 16:02 -0400, Nicolas Pitre wrote:
> > On Fri, 10 Aug 2018, Joe Perches wrote:
> >
> > > On Fri, 2018-08-10 at 14:12 -0500, Jeff Lien wrote:
> > > > This patch provides a performance improvement for the CRC16
> > > > calculations don
On Fri, 2018-08-10 at 16:02 -0400, Nicolas Pitre wrote:
> On Fri, 10 Aug 2018, Joe Perches wrote:
>
> > On Fri, 2018-08-10 at 14:12 -0500, Jeff Lien wrote:
> > > This patch provides a performance improvement for the CRC16 calculations
> > > done in read/write
> > > workloads using the T10 Type 1/
On 2018-08-10 03:12 PM, Jeff Lien wrote:
This patch provides a performance improvement for the CRC16 calculations done
in read/write
workloads using the T10 Type 1/2/3 guard field. For example, today with
sequential write
workloads (one thread/CPU of IO) we consume 100% of the CPU because of t
On Fri, Aug 10, 2018 at 02:12:11PM -0500, Jeff Lien wrote:
> This patch provides a performance improvement for the CRC16 calculations done
> in read/write
> workloads using the T10 Type 1/2/3 guard field. For example, today with
> sequential write
> workloads (one thread/CPU of IO) we consume 10
On Fri, 10 Aug 2018, Jeff Lien wrote:
> This patch provides a performance improvement for the CRC16 calculations done
> in read/write
> workloads using the T10 Type 1/2/3 guard field. For example, today with
> sequential write
> workloads (one thread/CPU of IO) we consume 100% of the CPU becaus
On Fri, 10 Aug 2018, Joe Perches wrote:
> On Fri, 2018-08-10 at 14:12 -0500, Jeff Lien wrote:
> > This patch provides a performance improvement for the CRC16 calculations
> > done in read/write
> > workloads using the T10 Type 1/2/3 guard field. For example, today with
> > sequential write
> >
On Fri, 2018-08-10 at 14:12 -0500, Jeff Lien wrote:
> This patch provides a performance improvement for the CRC16 calculations done
> in read/write
> workloads using the T10 Type 1/2/3 guard field. For example, today with
> sequential write
> workloads (one thread/CPU of IO) we consume 100% of t