On Friday, 18 November 2016, 12:31:10 CET, Stephan Mueller wrote:
Hi Herbert,
> Hi Herbert,
>
> Once in a while I seem to trigger a bug in the blkcipher_walk code which I
> cannot track down. This bug happens sporadically where I assume that it has
> something to do with the memory management in the slow path of
> blkcipher_walk.
>
> I am using the CTR DRBG code that in turn uses the ctr-aes-aesni
> implementation. The bug only appears when I want to obtain a random number
> that is less than the CTR AES block size. In my particular case, I want 4
> bytes from the DRBG.
>
> The bug happens in arch/x86/crypto/aesni-intel_glue.c:ctr_crypt_final() at
> the line:
>
> memcpy(dst, keystream, nbytes);
>
> The bug looks like the following:
>
> [ 12.328676] BUG: unable to handle kernel paging request at ffffa17ae418b988
> [ 12.328680] IP: [<ffffffff82060eea>] ctr_crypt+0x19a/0x1c0
> [ 12.328681] PGD 66fed067
> [ 12.328681] PUD 0
> [ 12.328681]
> [ 12.328683] Oops: 0002 [#1] SMP
> [ 12.328692] Modules linked in: bridge(+) stp llc ebtable_nat ip6table_raw
> ip6table_security ip6table_mangle iptable_raw iptable_security
> iptable_mangle ebtable_filter ebtables ip6table_filter ip6_tables
> crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcspkr i2c_piix4
> virtio_net virtio_balloon acpi_cpufreq sch_fq_codel virtio_console
> virtio_blk virtio_pci virtio_ring serio_raw crc32c_intel virtio
> [ 12.328693] CPU: 0 PID: 521 Comm: modprobe Not tainted 4.9.0-rc1+ #253
> [ 12.328694] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
> 1.9.1-1.fc24 04/01/2014
> [ 12.328694] task: ffffa17ab8453fc0 task.stack: ffffbdafc0744000
> [ 12.328696] RIP: 0010:[<ffffffff82060eea>] [<ffffffff82060eea>]
> ctr_crypt+0x19a/0x1c0
> [ 12.328696] RSP: 0018:ffffbdafc0747a60 EFLAGS: 00010002
> [ 12.328697] RAX: 0000000032e455a6 RBX: 0000000000000004 RCX:
> 0000000000000002
> [ 12.328697] RDX: 0000000000000001 RSI: 0000000000000086 RDI:
> 0000000000000086
> [ 12.328698] RBP: ffffbdafc0747b28 R08: ffffa17abc16e900 R09:
> 0000000000000019
> [ 12.328698] R10: ffffa17a764f68b0 R11: 000000000002e918 R12:
> ffffbdafc0747b38
> [ 12.328698] R13: ffffa17a764f6840 R14: ffffa17ae418b988 R15:
> ffffbdafc0747a70
> [ 12.328699] FS: 00007f55f57a6700(0000) GS:ffffa17abfc00000(0000) knlGS:
> 0000000000000000
> [ 12.328700] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 12.328700] CR2: ffffa17ae418b988 CR3: 0000000079b26000 CR4:
> 00000000003406f0
> [ 12.328703] Stack:
> [ 12.328705] ffffa17abc16e900 ffffa17ab845fd80 2ae7e40732e455a6
> 3a224612a8f9841d
> [ 12.328706] fffffb4e81e117c0 ffffa17ab845fd80 fffffb4e829062c0
> ffffa17ae418b988
> [ 12.328707] ffffbdafc0747ba8 ffffffff00000d80 ffffffff00000004
> ffffbdafc0747bc8
> [ 12.328708] Call Trace:
> [ 12.328712] [<ffffffff823e5fd3>] __ablk_encrypt+0x43/0x50
> [ 12.328714] [<ffffffff823e6012>] ablk_encrypt+0x32/0xc0
> [ 12.328716] [<ffffffff823c4f2e>] skcipher_encrypt_ablkcipher+0x5e/0x60
> [ 12.328717] [<ffffffff823dbb80>] drbg_kcapi_sym_ctr+0xb0/0x130
> [ 12.328719] [<ffffffff823de153>] drbg_ctr_generate+0x53/0x80
>
>
> Now, the interesting part is the following: the original memory pointer that
> shall be processed by the DRBG is in my example ffffffffc018b988 -- this
> pointer is used until the DRBG invokes crypto_skcipher_encrypt. However,
> when I print out the buffer pointer that is used as dst in the memcpy of
> ctr_crypt_final, I see ffffa17ae418b988 -- i.e. the buffer that causes
> paging failure.
>
> While tracing the blkcipher_walk code, I see that the slow code path is used
> when the request size is smaller than the block size. That slow code path
> allocates new memory that will be used for the dst pointer in
> ctr_crypt_final.
>
> May I ask you to check whether the allocation and the memory pointer
> logic has an issue that would cause a paging failure?

Following up on this issue, I found the location where the wrong memory
pointer is produced. The following call chain is used:

1. set up of SGL with proper pointer

2. skcipher_encrypt_ablkcipher with SGL

3. invocation of ctr_crypt from arch/x86/crypto/aesni-intel_glue.c

4. blkcipher_walk_virt_block

5. blkcipher_walk_first

6. blkcipher_walk_next (this code does not use the code path to allocate a
   page)

7. blkcipher_next_fast

	walk->dst.virt.addr = walk->src.virt.addr;

   -> copy src virt address into dst address pointer

   Now, the diff path is used:

	if (diff) {
		walk->flags |= BLKCIPHER_WALK_DIFF;
		blkcipher_map_dst(walk);
	}

8. blkcipher_map_dst

	walk->dst.virt.addr = scatterwalk_map(&walk->out);

   ==> this pointer is wrong

The interesting point is that step 8 gets the low and high bits right, but not
the bits in the middle:

The real data pointer for the dst buffer is ffffffffc0332988. The data pointer
used by the crypto API is ffff96a995332988. Every time I see the issue, the
two pointer values show this same similarity.
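
To make the comparison of the two pointer values concrete, here is a small
userspace C sketch (not kernel code) that counts how many low-order and
high-order bits two 64-bit addresses have in common. The helper names
low_bits_equal, high_bits_equal and print_shared_bits are made up for this
illustration; the only inputs taken from the report are the two addresses
themselves.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Count the identical low-order bits of two 64-bit addresses. */
unsigned int low_bits_equal(uint64_t a, uint64_t b)
{
	uint64_t diff = a ^ b;	/* set bits mark positions that differ */
	unsigned int n = 0;

	if (diff == 0)
		return 64;
	while ((diff & 1) == 0) {
		diff >>= 1;
		n++;
	}
	return n;
}

/* Count the identical high-order bits of two 64-bit addresses. */
unsigned int high_bits_equal(uint64_t a, uint64_t b)
{
	uint64_t diff = a ^ b;
	unsigned int n = 0;

	if (diff == 0)
		return 64;
	while ((diff & (UINT64_C(1) << 63)) == 0) {
		diff <<= 1;
		n++;
	}
	return n;
}

/* Hypothetical helper: print the comparison for one pointer pair. */
void print_shared_bits(uint64_t expected, uint64_t observed)
{
	printf("expected %016" PRIx64 " observed %016" PRIx64
	       ": low %u bits and high %u bits identical\n",
	       expected, observed,
	       low_bits_equal(expected, observed),
	       high_bits_equal(expected, observed));
}
```

Calling print_shared_bits(0xffffffffc0332988ULL, 0xffff96a995332988ULL)
reports 24 identical low-order bits and 17 identical high-order bits, so the
offset within the page (the low 12 bits) survives intact and only the bits in
between differ.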

Please note that the caller uses a static variable as the dst buffer.

Thanks
Stephan