A distributed data processing framework in pure Haskell. Inspired by Apache Spark.
- An example: /examples/gh/Main.hs
- API documentation: https://utdemir.github.io/distributed-dataset/
- Introduction blogpost: https://utdemir.com/posts/ann-distributed-dataset.html
This package provides a Dataset type which lets you express and execute transformations on a distributed multiset. Its API is heavily inspired by Apache Spark.
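To make the idea of transformations on a multiset concrete, here is a minimal local sketch. It does not use the library's API: Partitioned, pMap, pFilter, pCountByKey and the sample data are hypothetical names invented for this illustration, and the partitions are plain in-memory lists rather than data held by remote executors. It only assumes the containers package that ships with GHC.

```haskell
import qualified Data.Map.Strict as Map

-- | Hypothetical stand-in for a distributed multiset: a list of partitions.
-- In a real deployment each partition would live on a separate executor.
newtype Partitioned a = Partitioned [[a]]

-- | Element-wise transformation; every partition is processed independently.
pMap :: (a -> b) -> Partitioned a -> Partitioned b
pMap f (Partitioned ps) = Partitioned (map (map f) ps)

-- | Keep only the elements satisfying the predicate, partition by partition.
pFilter :: (a -> Bool) -> Partitioned a -> Partitioned a
pFilter p (Partitioned ps) = Partitioned (map (filter p) ps)

-- | Count occurrences of each key. Partial counts are computed per partition
-- and then merged; the merge is the step where data would be exchanged
-- between executors.
pCountByKey :: Ord k => Partitioned k -> [(k, Int)]
pCountByKey (Partitioned ps) =
  Map.toList (Map.unionsWith (+) (map countOne ps))
  where
    countOne = Map.fromListWith (+) . map (\k -> (k, 1))

-- | Made-up (author, event type) pairs spread over three partitions.
sampleEvents :: Partitioned (String, String)
sampleEvents =
  Partitioned
    [ [("alice", "push"), ("bot", "push")]
    , [("bob", "fork"), ("alice", "push")]
    , [("bob", "push")]
    ]

-- | Number of push events per author.
pushCounts :: [(String, Int)]
pushCounts =
  pCountByKey . pMap fst . pFilter ((== "push") . snd) $ sampleEvents
```

Loading this in GHCi and evaluating pushCounts gives [("alice",2),("bob",1),("bot",1)]. Because the input is a multiset, the result does not depend on how the elements are split across partitions.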
It uses pluggable Backends for spawning executors and shuffle stores for exchanging information. See 'distributed-dataset-aws' for an implementation using AWS Lambda and S3.
It also exposes a more primitive module which lets you run IO actions remotely. It is especially useful when your task is embarrassingly parallel.
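The shape of such a workload, many independent IO actions whose results are gathered at the end, can be sketched locally. The code below is only an analogy that uses threads from base instead of remote executors; runAll, sumTo and demo are hypothetical names and are not part of this package.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import Control.Exception (SomeException, try)

-- | Run every action in its own thread and collect the results in input
-- order. A failing action yields a Left instead of aborting the others.
-- Local stand-in only: a remote backend would ship each action to an executor.
runAll :: [IO a] -> IO [Either SomeException a]
runAll actions = do
  slots <- mapM spawn actions
  mapM takeMVar slots
  where
    spawn action = do
      slot <- newEmptyMVar
      _ <- forkIO (try action >>= putMVar slot)
      pure slot

-- | Hypothetical workload: each task is independent of all the others.
sumTo :: Int -> IO Int
sumTo n = pure $! sum [1 .. n]

-- | Run three independent tasks and keep the successful results.
demo :: IO [Int]
demo = do
  results <- runAll (map sumTo [10, 100, 1000])
  pure [r | Right r <- results]
```

Running demo in GHCi returns [55,5050,500500]. No task reads another task's output, which is what makes this kind of job easy to spread over many executors.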
This package provides a backend for 'distributed-dataset' using AWS services. Currently it supports running functions on AWS Lambda and using an S3 bucket as a shuffle store.
Another package provides Datasets that read from public open datasets. Currently it can fetch GitHub event data from GH Archive.
Running the example
- Clone the repository.
$ git clone https://github.com/utdemir/distributed-dataset
$ cd distributed-dataset
- Make sure that you have AWS credentials set up. The easiest way is to install the AWS command line interface and run:
$ aws configure
- Create an S3 bucket to put the deployment artifact in. You can use the console or the CLI:
$ aws s3api create-bucket --bucket my-s3-bucket
- Build and run the example:
- If you use Nix on Linux:
- (Recommended) Use my binary cache on Cachix to reduce compilation times:
$ $(nix-build -A cachix https://cachix.org/api/v1/install)/bin/cachix use utdemir
$ $(nix-build -A example-gh)/bin/example-gh my-s3-bucket
- If you use stack (requires Docker, works on Linux and macOS):
$ stack run --docker-mount $HOME/.aws/ --docker-env HOME=$HOME example-gh my-s3-bucket
Experimental. Expect lots of missing features, bugs, instability and API changes. You will probably need to modify the source if you want to do anything serious. See issues.
I am open to contributions; any issue, PR or opinion is more than welcome.
- In order to develop distributed-dataset, you can use:
- On Linux: Nix or stack.
- On macOS: stack with Docker.
- Use ormolu to format source code.
- You can use my binary cache on Cachix so that you don't recompile half of Hackage.
- nix-shell will drop you into a shell with ghcid alongside all required Haskell and system dependencies. You can use cabal new-* commands there.
- There is a ./make.sh at the root folder with some utilities, such as formatting the source code. Run ./make.sh --help to see the usage.
- Make sure that you have Docker installed.
- Use stack as usual; it will automatically use a Docker image.
- Run ./make.sh stack-build before you send a PR to test different resolvers.
- Towards Haskell in the Cloud by Jeff Epstein, Andrew P. Black, Simon L. Peyton Jones
- Resilient Distributed Datasets: A Fault-Tolerant Abstraction for In-Memory Cluster Computing by Matei Zaharia, et al.