Filtering a List based on State in Haskell

ghz 8months ago ⋅ 102 views

I've been looking at the implementations of ListT to find a good way to do something like this:

func list = do
  set <- get -- Data.Set
  el <- list
  guard $ Set.notMember el set
  return el

I know I could use ListT.fromFoldable, but I want to be able to use this stream as part of a larger processing pipeline without converting from list to stream and back at every step. Perhaps something like this:

func' list = (ListT.toReverseList . ListT.take 5 . func . ListT.fromFoldable) list

From what I understand a streaming approach should be used here. But I don't understand how to do this using e.g. the list-t package. Can a traverse somehow filter out results from the stream? I don't see people asking about this so maybe the approach itself is flawed?

Answers

You're correct that a streaming approach can be more efficient than converting between lists and streams repeatedly, especially if you're dealing with large datasets. Using the list-t package (or similar streaming libraries like pipes or conduit) can help you achieve this.

To implement a streaming version of your function, you can use the ListT monad transformer to represent the streaming computation. However, filtering out elements from the stream directly inside the computation might not be straightforward since monads like ListT typically represent nondeterministic computations, and filtering would usually happen after the computation has finished.

One approach to achieve what you want is to define a custom monadic operation that filters out elements from the stream. Here's a possible implementation using ListT:

import Control.Monad.Trans.List
import Control.Monad.Trans.Class (lift)
import qualified Data.Set as Set

customFilter :: (Monad m, Ord a) => Set.Set a -> ListT m a -> ListT m a
customFilter set m = do
  el <- m
  lift $ guard $ Set.notMember el set
  return el

func :: (Monad m, Ord a) => Set.Set a -> ListT m a -> ListT m a
func set list = customFilter set list

In this implementation, customFilter takes a Set and a ListT computation as input and filters out elements from the stream based on whether they are present in the set. You can then use this func function in your processing pipeline.

Here's how you can use it:

import Control.Monad.Trans.List
import Control.Monad.Trans.Class (lift)
import qualified Data.Set as Set

main :: IO ()
main = do
  let inputList = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
      set = Set.fromList [2, 4, 6, 8, 10]
      result = runListT $ func set $ ListT.fromFoldable inputList
  print result

This will print a stream of elements from inputList filtered to exclude elements present in the set.