Hi Dan,

> You could use the record's id, and then add a checksum digit using the
> Luhn or Verhoeff algorithm, and then convert the resulting number to a
> base 36 string.
>
> There are three advantages to this approach.
>
> 1) you don't have to worry about generating a random value and
> dealing with collisions since the database handles it for you
> 2) you can detect typos and mistakes without having to hit the
> database
> 3) people won't be able to guess the URLs unless they are familiar
> with the exact algo you're using

I've just been playing around with the luhnacy and oklasoft-verhoeff
gems. The main problem now seems to be with point 3: because only the
last digit is changing, there is very little difference in the
resulting strings for bigger integers, e.g.:

    def short_url(id)
      Verhoeff.checksum_of(id).to_i.to_s(36)
    end

    short_url(12897) => "2rip"
    short_url(12898) => "2riv"

To get round this I had a go at adding a number at the beginning, then
reversing the digits before adding the checksum digit:

    def short_url(id)
      n = (id + 13).to_s.reverse.to_i
      Verhoeff.checksum_of(n).to_i.to_s(36)
    end

This seems to do the trick:

    short_url(12897) => "etp"
    short_url(12898) => "2jzf"

My only worry now is whether I have compromised point 1: are the
values still unique? I think they are, but I will need to have a bit
more of a think about the possibilities.

Point 2 is a bonus: being able to check a URL for authenticity before
hitting the database to search for it.

So I think this might work ... thanks to everybody for their help and
suggestions!

DAZ

ps - would I still store this as type UUID, or just a string?

On Mar 13, 12:06 am, "Dan Kubb (dkubb)" <[email protected]> wrote:
> DAZ,
>
> > I definitely need short strings - 6-8 characters for the url.
> > It is for the url of e-cards that people send - they don't have to be
> > secret urls, but it would be nice if people couldn't easily guess
> > other urls and read other peoples cards, so just using the
> > auto-incrementing id isn't really an option :(
>
> You could use the record's id, and then add a checksum digit using the
> Luhn or Verhoeff algorithm, and then convert the resulting number to a
> base 36 string. There are libraries to handle the checksum generation
> and testing so it would only take a couple of lines of code for both
> operations.
>
> A determined hacker could just brute force things too, I don't see any
> way for 100% protection in those cases. The best thing you can hope
> for is to discourage casual exploration of the URL space.
>
> > How likely is rand(36**8).to_s(36) to have a collision compared to
> > truncating UUIDTools::UUID.random_create?
>
> It's probably the same.
>
> > I realise that with smaller strings the chances of collision are
> > larger. How do sites like disqus and bit.ly make their short urls?
>
> I don't know precisely. I'd guess they do something like above, I
> don't see how they could do it any other way at the scales they are
> working at.
>
> --
>
> Dan
> (dkubb)
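
On the uniqueness question about the add-13-then-reverse step, here is a rough, self-contained sketch for experimenting without the gems. Everything in it is an assumption for illustration: the Verhoeff tables are hand-rolled, `with_check_digit` is a stand-in for what `Verhoeff.checksum_of` appears to do (append the check digit), and `valid_check_digit?` and `plausible_slug?` are hypothetical helper names. It suggests that reversing the digits can lose uniqueness, because `to_i` drops the leading zeros that the reversal produces.

```ruby
# Hand-rolled Verhoeff tables (a stand-in for the gem, not its actual code).
D = [
  [0, 1, 2, 3, 4, 5, 6, 7, 8, 9], [1, 2, 3, 4, 0, 6, 7, 8, 9, 5],
  [2, 3, 4, 0, 1, 7, 8, 9, 5, 6], [3, 4, 0, 1, 2, 8, 9, 5, 6, 7],
  [4, 0, 1, 2, 3, 9, 5, 6, 7, 8], [5, 9, 8, 7, 6, 0, 4, 3, 2, 1],
  [6, 5, 9, 8, 7, 1, 0, 4, 3, 2], [7, 6, 5, 9, 8, 2, 1, 0, 4, 3],
  [8, 7, 6, 5, 9, 3, 2, 1, 0, 4], [9, 8, 7, 6, 5, 4, 3, 2, 1, 0]
].freeze
P = [
  [0, 1, 2, 3, 4, 5, 6, 7, 8, 9], [1, 5, 7, 6, 2, 8, 3, 0, 9, 4],
  [5, 8, 0, 3, 7, 9, 6, 1, 4, 2], [8, 9, 1, 6, 0, 4, 3, 5, 2, 7],
  [9, 4, 5, 3, 1, 2, 6, 8, 7, 0], [4, 2, 8, 6, 5, 7, 3, 9, 0, 1],
  [2, 7, 9, 3, 8, 0, 6, 4, 1, 5], [7, 0, 4, 6, 9, 1, 3, 2, 5, 8]
].freeze
INV = [0, 4, 3, 2, 1, 5, 6, 7, 8, 9].freeze

# Returns n with its Verhoeff check digit appended, e.g. 236 -> 2363.
def with_check_digit(n)
  c = 0
  n.to_s.reverse.each_char.with_index do |ch, i|
    c = D[c][P[(i + 1) % 8][ch.to_i]]
  end
  n * 10 + INV[c]
end

# True if the last digit of n is a valid check digit for the digits before it.
def valid_check_digit?(n)
  c = 0
  n.to_s.reverse.each_char.with_index do |ch, i|
    c = D[c][P[i % 8][ch.to_i]]
  end
  c.zero?
end

# The add-13-then-reverse scheme from the message.
def short_url(id)
  n = (id + 13).to_s.reverse.to_i
  with_check_digit(n).to_s(36)
end

# Point 2: reject mistyped slugs before hitting the database.
def plausible_slug?(slug)
  slug.match?(/\A[0-9a-z]+\z/) && valid_check_digit?(slug.to_i(36))
end

puts short_url(12897)        # etp
puts short_url(12898)        # 2jzf
puts short_url(1278)         # etp again: 1291 and 12910 both reverse to 1921
puts plausible_slug?("etp")  # true
puts plausible_slug?("etq")  # false
```

So ids 1278 and 12897 end up with the same slug. One option that should avoid this is to zero-pad to a fixed width before reversing (for example `(id + 13).to_s.rjust(8, "0").reverse.to_i`), since reversing fixed-width strings never maps two different inputs to the same output. The trade-off is that the slugs get a little longer.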
