There are several heuristics that 'file' could use to tell Haskell files from Java files:

- Java files always have {}s in them*. Haskell files may have {}s, but
mostly don't.
- Java 'import' statements end in ';'. Haskell 'import' statements are
allowed to end in ';' as part of the alternate syntax, but in
practice very rarely do. I have hundreds** of Haskell source
repositories, and grepping through them all, I found 12 repos*** using
semicolons with import statements.
- Haskell has some unusual syntax that Java lacks or rarely uses: '::' in
type signatures, for example, or operators like '>>=' (legal in Java as
a shift-assignment, but uncommon).
- Haskell modules usually start with Haskell comments, '--' or '{- -}',
which differ significantly from Java comments, '//' or '/* */'.
- Or they start with the 'module' keyword, where Java would start with
a visibility modifier (public/private/protected), possibly 'abstract'
or 'interface', and the 'class' keyword.
- And of course, Haskell files are usually suffixed .hs or .lhs, while
Java files are more usually .java.

Between these 6 differences, I think 'file' could quite reliably
distinguish Java from Haskell files.
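
As a rough illustration, the checks above could be combined into a
single score. This is only a Python sketch, not how 'file' works
internally (it matches patterns from its magic database); the name
haskell_score and every weight below are made up for the example.

```python
import re


def haskell_score(text, filename=""):
    """Positive result suggests Haskell, negative suggests Java.

    All weights are arbitrary assumptions chosen for illustration.
    """
    score = 0
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]

    # Braces, ignoring the '{-' that opens a Haskell comment or pragma.
    if re.search(r"\{(?!-)", text):
        score -= 1
    else:
        score += 1

    # Java imports end in ';'; Haskell imports almost never do.
    imports = [ln for ln in lines if ln.startswith("import ")]
    if imports:
        with_semi = sum(1 for ln in imports if ln.endswith(";"))
        score += 2 if with_semi == 0 else -2

    # A top-level type signature such as 'main :: IO ()'.
    if re.search(r"^[a-z_][\w']*\s*::\s", text, re.MULTILINE):
        score += 2
    # Weak hint only: Java also accepts '>>=' as a shift-assignment.
    if ">>=" in text:
        score += 1

    # What the first non-blank line looks like.
    if lines:
        first = lines[0]
        if first.startswith(("--", "{-", "module ")):
            score += 2
        elif first.startswith(("//", "/*", "package ", "public ", "class ")):
            score -= 2

    # File suffix.
    lower = filename.lower()
    if lower.endswith((".hs", ".lhs")):
        score += 3
    elif lower.endswith(".java"):
        score -= 3

    return score
```

No single check decides the result; a Haskell file that uses record
syntax loses a point for its braces but should still come out positive
on the other checks.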

* #java on Freenode tells me that a Java source file which consists
only of 'import' statements can get away with no {}s, but such a file
will do nothing useful. And 'package-info.java' files - some sort of
autogenerated file - can also apparently be {}-free. But these are
very much edge-cases and I think can be disregarded.
** roughly 970
*** hlint, ddc, jhc, open-witnesses, nobench, buddha, cabal,
language-c, bnfc, protocol-buffers, lhc, hera

-- 
gwern


