Hello,

I finally managed to isolate the difference in cdhit output that causes the segfaults in provean. It seems that with cdhit >= 4.8.1-4 the output contains partial IDs in place of the full FASTA headers:

diff -r /home/andrius/provean/good/cdhit.cluster /home/andrius/provean/bad/cdhit.cluster
1c1
< >gi|119610548|gb|EAW90142.1| tumor protein p53 (Li-Fraumeni syndrome), isoform CRA_c
---
> >EAW90142.1 tumor protein p53 (Li-Fraumeni syndrome), isoform CRA_c [Homo sapiens]

I need to look deeper into whether cdhit can be persuaded to use the old output format. If not, provean will have to be adjusted to cope with the new one.
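
To illustrate what an adjustment could look like, here is a minimal Python sketch that extracts the same accession from both header styles shown in the diff above. It is only an illustration: provean is not written in Python, the function name extract_accession is hypothetical, and I am assuming the old style always follows the gi|number|db|accession| layout seen in this one example.

```python
def extract_accession(header: str) -> str:
    """Return the accession from either header style seen in the diff.

    Hypothetical helper for illustration; not part of provean or cdhit.
    """
    # Drop the leading '>' and keep only the first whitespace-delimited token.
    tokens = header.lstrip(">").split(None, 1)
    if not tokens:
        return ""
    first = tokens[0]

    # Old style (assumed layout): gi|<number>|<db>|<accession>|
    if first.startswith("gi|"):
        fields = first.split("|")
        if len(fields) >= 4 and fields[3]:
            return fields[3]

    # New style: the first token is already the accession.
    return first


old = ">gi|119610548|gb|EAW90142.1| tumor protein p53 (Li-Fraumeni syndrome), isoform CRA_c"
new = ">EAW90142.1 tumor protein p53 (Li-Fraumeni syndrome), isoform CRA_c [Homo sapiens]"
print(extract_accession(old) == extract_accession(new))
```

Comparing on the accession alone would make the two cluster files equivalent for this entry, but whether that is enough to avoid the segfault still has to be checked against the provean code.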

Andrius
