On Mon, Jan 18, 2021 at 10:53 PM Tim Meehan <[email protected]> wrote:
>
> Say that I have a strange character group that I want to find in a binary
> file.
> I wanted to use something like this:
>
> (define needle (list->string (map integer->char (list #xab #xcd #xef))))
> (define needle-offset
>   (call-with-input-file "big_binary_blob.bin"
>     #:mode 'binary
>     (λ (p)
>       (regexp-match-positions (regexp needle) p))))
>
> The "regexp-match-positions" returns #f (even though I know that needle is in
> there, I put it there). Is there a better way to go about this? The binary
> blob is about 100 MiB or so, if that helps.
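
Your needle is a character string, so (regexp needle) is a character
regexp. Matched against a port, it searches for the UTF-8 encoding of
the Unicode code point sequence U+00AB, U+00CD, U+00EF (the bytes
C2 AB C3 8D C3 AF), but I think you're trying to search for the raw
byte sequence AB CD EF. You can build the needle as a byte string and
use a byte regexp. So:
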
(define needle (list->bytes (list #xab #xcd #xef)))
(define needle-offset
  (call-with-input-file "big_binary_blob.bin"
    #:mode 'binary
    (λ (p)
      (regexp-match-positions (byte-regexp needle) p))))
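
One caveat if the needle is arbitrary: byte-regexp treats its argument as
a pattern, so a needle containing bytes such as "." or "*" would be read
as regexp syntax. The bytes AB CD EF are not special, so the code above is
fine as written. A minimal sketch of a literal search, assuming a
hypothetical helper named find-bytes (not from the original post) and an
in-memory byte string standing in for the file:

```racket
#lang racket/base

;; Hypothetical helper for illustration: find the first occurrence of a
;; literal byte sequence in an input port.
(define (find-bytes needle in)
  ;; regexp-quote escapes regexp metacharacters in the byte string,
  ;; so the needle is matched byte-for-byte rather than as a pattern.
  (regexp-match-positions (byte-regexp (regexp-quote needle)) in))

;; In-memory stand-in for the binary file.
(define blob (bytes #x00 #x01 #xab #xcd #xef #x02))

(find-bytes (bytes #xab #xcd #xef) (open-input-bytes blob))
;; => '((2 . 5))
```

The result is a list with one (start . end) pair of byte offsets counted
from the port's starting position, or #f if the needle is not found.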