Nicko van Someren writes:

> I used the number 10^9 for the factor base size (compared to about
> 6*10^6 for the break of the 512 bit challenge) and 10^11 for the
> weight of the matrix (compared to about 4*10^8 for RSA512). Again
> these were guesses and they certainly could be out by an order of
> magnitude.
In his paper Bernstein uses a relatively larger factor base than in typical current parameter choices. It is likely that the factor bases used in the past have been "too small" in the sense that the linear algebra step is limited by machine size rather than runtime, because of the difficulty of parallelizing it. For example, http://www.loria.fr/~zimmerma/records/RSA155 reports that the sieving took 8000 MIPS-years while the linear algebra took 224 CPU-hours on a 2 GB Cray. If a larger machine were available for the matrix solution, the whole process could be accelerated, and that is what Bernstein's figures assume. Specifically, he uses a factor base of size L^.7904, where L for 1024-bit keys is approximately 2^45. That gives a matrix size of about 50 billion, 50 times larger than your estimate. So a closer order-of-magnitude estimate would be 10^11 for the factor base size and 10^13 for the weight of the matrix.

> The matrix reduction cells are pretty simple and my guess was
> that we could build the cells plus inter-cell communication
> in about 1000 transistors. I felt that, for a first order guess,
> we could ignore the transistors in the edge drivers since for a
> chip with N cells there are only order N^(1/2) edge drivers.
> Thus I guessed 10^14 transistors which might fit onto about 10^7
> chips which in volume (if you own the fabrication facility) cost
> about $10 each, or about $10^8 for the chips. Based on past work
> in estimating the cost of large systems I then multiplied this
> by three or four to get a build cost.

The assumption of the larger factor base needed for the big asymptotic speedups would increase this cost estimate by a factor of about 50. Instead of several hundred million dollars, it would be perhaps 10-50 billion dollars. Of course, at this level of discussion it is just as easy to assume that the adversary spends $50 billion as $500 million; it is all completely hypothetical.
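The factor base arithmetic above can be checked with a short script. The exponent 0.7904 and L ~ 2^45 are the values quoted in this discussion, not independently derived:

```python
# Sketch of the factor-base size estimate discussed above.
# Assumptions (from the discussion of Bernstein's paper):
#   L ~ 2^45 for 1024-bit keys, factor base size ~ L^0.7904.
L = 2.0 ** 45
fb_size = L ** 0.7904      # ~5e10, i.e. "about 50 billion"
ratio = fb_size / 1e9      # vs. the earlier 10^9 guess -> roughly 50x
print(f"factor base size ~ {fb_size:.1e}")
print(f"ratio vs. 10^9 estimate ~ {ratio:.0f}x")
```

This is where the "50 times larger" scaling of both the matrix size and the cost estimate comes from.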
> As far at the speed goes, this machine can compute a dot product
> in about 10^6 cycles.

Actually, the sort algorithm described takes 8*sqrt(10^11), or about 2.5 * 10^6, cycles, and there are three sorts per dot product, so 10^7 cycles would be a better estimate. Using the larger factor base with 10^13 entries would imply a sort time of 10^8 cycles, by the same reasoning.

> Initially I thought that the board to
> board communication would be slow and we might only have a 1MHz
> clock for the long haul communication, but I messed up the total
> time and got that out as a 1 second matrix reduction. In fact to
> compute a kernel takes about 10^11 times longer. Fortunately it
> turns out that you can drive from board to board probably at a
> few GHz or better (using GMII type interfaces from back planes
> of network switches). If we can get this up to 10GHz (we do have
> lots to spend on R&D here) we should be able to find a kernel in
> somewhere around 10^7 seconds, which is 16 weeks or 4 months.

Taking into account that the sort algorithm takes about 8 times longer than you assumed, and that "a few" minimal polynomials have to be calculated to get the actual one, this adds a factor of about 20 to your estimate. Instead of 4 months it would be more like 7 years, which is pretty clearly impractical. Apparently Ian Goldberg expressed concerns about the interconnections when the machine was going to run at 1 MHz; now it is projected to run 10,000 times faster. That is an aggressive design, and if this speed cannot be achieved the run time goes up still further: if only 1 GHz can be achieved rack to rack, the machine takes 70 years for one factorization. Needless to say, any bit error anywhere will destroy a result that may have taken years to produce, so error correction will be needed, adding cost and possibly lowering the effective clock rate.
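The cycle and runtime arithmetic above can be sketched in a few lines. The cost model (8*sqrt(n) cycles per mesh sort, three sorts per dot product, a factor of ~20 over the 4-month estimate) is taken from this discussion, not independently derived:

```python
import math

# Assumption (from the discussion): a mesh sort over n matrix
# entries costs 8*sqrt(n) cycles, and one dot product needs
# three such sorts.
def dot_product_cycles(n):
    return 3 * 8 * math.sqrt(n)

small = dot_product_cycles(1e11)   # ~7.6e6, i.e. roughly 10^7 cycles
large = dot_product_cycles(1e13)   # ~7.6e7, i.e. roughly 10^8 cycles

# Runtime impact: the 8x-slower sort plus "a few" minimal-polynomial
# computations add a factor of ~20 to the original 10^7-second
# (4-month) estimate.
seconds = 20 * 1e7
years = seconds / 3.156e7          # seconds per year; ~6.3 years
print(f"{small:.1e} cycles, {large:.1e} cycles, {years:.1f} years")
```

Rounding up, this is the "more like 7 years" figure; the same scaling applied to the 1 GHz rack-to-rack case gives the 70-year figure.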
Using the larger factor base from the Bernstein paper would increase the time to something like 10^11 seconds, i.e. thousands of years, which is out of the question.

> Lastly, I want to reiterate that these are just estimates. I
> give them here because you ask. I don't expect them to be used
> for the design of any real machines; much more research is
> needed before that. I do however think that they are rather
> more accurate than my first estimates.

These estimates are very helpful; thanks for providing them. It seems that, with the factor base size derived from Bernstein's asymptotic estimates, the machine is not feasible and would take thousands of years to solve a matrix. If the 50-times-smaller factor base can be used, the machine is on the edge of feasibility, but it appears that it would still take years to factor a single value.

---------------------------------------------------------------------
The Cryptography Mailing List
Unsubscribe by sending "unsubscribe cryptography" to [EMAIL PROTECTED]
