Now that I have got a big hunk of documents indexed with Solr, I would like to try some machine learning tools to extract bibliographic references from them. Does anyone have recommendations for toolkits that would be good to play with for something like this?