Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

optimize cleanUpBSNulls function #25

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

Paczesiowa
Copy link

this uses a lot less memory than the concatMap one. in our app there are queries with length of 10k chars (inserting files to db) and the concatMap one was creating huge amounts of small bytestrings, which would consume additional 1gb of memory and a few extra seconds for gc.

@dcoutts
Copy link
Contributor

dcoutts commented Apr 9, 2014

Here's a simpler version using the new bytestring builder:

import qualified Data.ByteString as B
import qualified Data.ByteString.Lazy as BL
import qualified Data.ByteString.Builder as B
import qualified Data.ByteString.Builder.Prim as BP
import Data.ByteString.Builder.Prim ((>$<), (>*<))
import Data.Word (Word8)

escapeBSNulls :: B.ByteString -> BL.ByteString
escapeBSNulls = B.toLazyByteString . BP.primMapByteStringBounded conv
  where
    conv :: BP.BoundedPrim Word8
    conv = BP.condB (==0) (BP.liftFixedToBounded replacement)
                          (BP.liftFixedToBounded BP.word8)
    replacement :: BP.FixedPrim a
    replacement = const ('\\', ('0', ('0', '0')))
              >$< BP.char8 >*< BP.char8 >*< BP.char8 >*< BP.char8

In my benchmark on a 13M data file criterion says it is about 20x faster than the low-level version from this pull req.

@dcoutts
Copy link
Contributor

dcoutts commented Apr 9, 2014

Note that the large factor is likely an effect of using test data with a lot of \0s (an executable file) not from using such a large input (it's actually an even bigger factor for a 10k prefix of the same binary test file).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants