Hi Siarhei,

The board on which I collected those results is an old PC-style evaluation board with a MIPS 74Kc CPU running at 1 GHz. More detailed information on the caches of this board:

- Primary instruction cache: 32kB, VIPT, 4-way, linesize 32 bytes
- Primary data cache: 32kB, 4-way, PIPT, no aliases, linesize 32 bytes
- MIPS secondary cache: 512kB, 8-way, linesize 32 bytes

Your concern about memory bandwidth makes sense, but I don’t think it is related to the memory timings or the memory clock frequency configuration. This evaluation board uses old SDRAM, which is much slower than modern DDR2/DDR3 memory chips, and this limits the overall peak memory bandwidth.

Thanks,
Nemanja Lukic

-----Original Message-----
From: Siarhei Siamashka [mailto:[email protected]]
Sent: Thursday, February 16, 2012 5:21 PM
To: Lukic, Nemanja
Cc: [email protected]; [email protected]
Subject: Re: [Pixman] [PATCH 2/2] MIPS: DSPr2: Added fast-paths for SRC operation.

On Fri, Feb 10, 2012 at 5:02 PM, Nemanja Lukic <[email protected]> wrote:
> From: Nemanja Lukic <[email protected]>
>
> Following fast-path functions are implemented (routines 4, 5 and 6 utilize
> same fast-memcpy routine):
> 1. src_x888_8888
> 2. src_8888_0565
> 3. src_0565_8888
> 4. src_0565_0565
> 5. src_8888_8888
> 6. src_0888_0888

Nice. That's a good choice of useful functions to optimize.

> Performance numbers before/after on MIPS-74kc @ 1GHz
>
> Optimized (with these optimizations):
>
> lowlevel-blt-bench:
> src_x888_8888 = L1: 369.50 L2: 99.37 M: 27.19 (145.07%) HT: 20.24
> VT: 19.48 R: 19.00 RT: 10.22 ( 63Kops/s)
> src_8888_0565 = L1: 105.65 L2: 67.87 M: 25.41 (101.00%) HT: 20.78
> VT: 19.84 R: 18.52 RT: 9.81 ( 63Kops/s)
> src_0565_8888 = L1: 77.10 L2: 63.04 M: 23.37 ( 92.90%) HT: 20.29
> VT: 19.37 R: 18.14 RT: 10.02 ( 63Kops/s)
> src_0565_0565 = L1: 519.02 L2: 241.32 M: 62.35 (166.34%) HT: 33.74
> VT: 27.63 R: 26.12 RT: 11.70 ( 67Kops/s)
> src_8888_8888 = L1: 390.48 L2: 113.99 M: 30.32 (161.77%) HT: 19.55
> VT: 17.05 R: 17.13 RT: 10.19 ( 63Kops/s)
> src_0888_0888 = L1: 349.74 L2: 156.68 M: 40.68 (162.78%) HT: 25.58
> VT: 20.57 R: 20.20 RT: 9.96 ( 63Kops/s)

Maybe this would be interesting for you.
I'm getting the following numbers on my Asus RT-N16 router (MIPS 74K @480 MHz) with your optimizations applied:

src_x888_8888 = L1: 149.94 L2: 37.43 M: 39.00 (146.51%)
src_8888_0565 = L1: 50.05 L2: 24.53 M: 23.77 ( 66.62%)
src_8888_8888 = L1: 173.30 L2: 70.62 M: 79.89 (299.11%)

Looks like your hardware has a CPU that is roughly twice as fast and some amount of L2 cache (?), but shows ~2.6x worse peak memory bandwidth. Could it have memory timings and/or memory clock frequency misconfigured?

--
Best regards,
Siarhei Siamashka
_______________________________________________
Pixman mailing list
[email protected]
http://lists.freedesktop.org/mailman/listinfo/pixman
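
For reference, the "roughly twice" and "~2.6x" figures in the message above can be recomputed from the numbers quoted in this thread. The following is only a minimal sketch: the variable names are illustrative and not part of pixman, and it compares just the three operations for which both machines were reported.

```python
# "M" (memory-bound) columns of lowlevel-blt-bench, copied from the two
# reports quoted in this thread. Names below are illustrative only.
m_board = {   # MIPS 74Kc @ 1 GHz evaluation board
    "src_x888_8888": 27.19,
    "src_8888_0565": 25.41,
    "src_8888_8888": 30.32,
}
m_router = {  # Asus RT-N16, MIPS 74K @ 480 MHz
    "src_x888_8888": 39.00,
    "src_8888_0565": 23.77,
    "src_8888_8888": 79.89,
}

# CPU clock ratio, board relative to router (1000 MHz vs 480 MHz).
clock_ratio = 1000 / 480

# Memory-bound throughput ratio, router relative to board, per operation.
m_ratio = {name: m_router[name] / m_board[name] for name in m_router}

for name, ratio in m_ratio.items():
    print(f"{name}: router/board M ratio = {ratio:.2f}")
print(f"CPU clock ratio (board/router) = {clock_ratio:.2f}")
```

With these inputs, the ~2.6x gap corresponds to src_8888_8888 (the plain copy case). The other two operations differ much less in the M column, so the comparison seems most meaningful for the copy path.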
