Popularity

9.2

Stable

Activity

0.0

Stable

Stars 114

Watchers 13

Forks 5

Last Commit almost 4 years ago

Monthly Downloads: 17

Programming language: Haskell

License: BSD 3-clause "New" or "Revised" License

Tags: Control Distributed

distributed-fork alternatives and similar packages

Based on the "distributed" category.

distributed-closure

9.5 3.9 distributed-fork VS distributed-closure

Serializable closures for distributed programming.
distributed-process-platform

9.1 0.0 distributed-fork VS distributed-process-platform

DEPRECATED (Cloud Haskell Platform) in favor of distributed-process-extras, distributed-process-async, distributed-process-client-server, distributed-process-registry, distributed-process-supervisor, distributed-process-task and distributed-process-execution

distributed-static

8.2 0.0 distributed-fork VS distributed-static

Support for static values
distributed-process-simplelocalnet

7.9 5.9 distributed-fork VS distributed-process-simplelocalnet

Simple cloud haskell backend for local networks
distributed-process-async

7.9 2.8 distributed-fork VS distributed-process-async

Cloud Haskell - Asynchronous Execution
distributed-process-client-server

7.8 0.0 distributed-fork VS distributed-process-client-server

Cloud Haskell - gen_server implementation
distributed-process-extras

7.6 0.0 distributed-fork VS distributed-process-extras

Core utilities for distributed-process-platform
distributed-process-supervisor

7.5 0.0 distributed-fork VS distributed-process-supervisor

Cloud Haskell Supervision Trees
distributed-process-registry

6.9 0.0 distributed-fork VS distributed-process-registry

Extended Process Registry
distributed-process-execution

6.9 0.0 distributed-fork VS distributed-process-execution

Cloud Haskell Process Execution Framework
distributed-process-monad-control

6.9 0.0 distributed-fork VS distributed-process-monad-control

distributed-process-monad-control
distributed-process-fsm

5.3 0.0 distributed-fork VS distributed-process-fsm

Cloud Haskell implementation of Erlang's gen_statem (ish)
distributed-process-azure

- - distributed-fork VS distributed-process-azure

DISCONTINUED. Microsoft Azure backend for Cloud Haskell
distributed-fork-aws-lambda

- distributed-fork VS distributed-fork-aws-lambda

AWS Lambda backend for distributed-fork.

README

distributed-dataset

A distributed data processing framework in pure Haskell. Inspired by Apache Spark.

An example: /examples/gh/Main.hs
API documentation: https://utdemir.github.io/distributed-dataset/
Introduction blogpost: https://utdemir.com/posts/ann-distributed-dataset.html

Packages

distributed-dataset

This package provides a Dataset type which lets you express and execute transformations on a distributed multiset. Its API is heavily inspired by Apache Spark.
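
To make the idea of a partitioned multiset concrete, here is a minimal in-process sketch using only base. It is not the library's API: ToyDataset, fromList, mapD, filterD and countByKey are hypothetical names invented for this illustration, and everything runs locally in a single process, whereas the real Dataset is executed on remote executors.

```haskell
module Main where

import Data.Function (on)
import Data.List (groupBy, sortOn)

-- Hypothetical toy model: a multiset stored as a list of partitions.
-- In a real deployment each partition would live on a separate executor.
newtype ToyDataset a = ToyDataset [[a]]

-- Distribute elements over n partitions round-robin (n must be >= 1).
fromList :: Int -> [a] -> ToyDataset a
fromList n xs =
  ToyDataset [[x | (i, x) <- zip [0 ..] xs, i `mod` n == k] | k <- [0 .. n - 1]]

-- Element-wise transformations touch each partition independently.
mapD :: (a -> b) -> ToyDataset a -> ToyDataset b
mapD f (ToyDataset ps) = ToyDataset (map (map f) ps)

filterD :: (a -> Bool) -> ToyDataset a -> ToyDataset a
filterD p (ToyDataset ps) = ToyDataset (map (filter p) ps)

-- Count occurrences per key: pre-aggregate inside every partition,
-- then merge the partial counts (the step that needs data exchange).
countByKey :: Ord k => ToyDataset k -> [(k, Int)]
countByKey (ToyDataset ps) =
  combine (concatMap (combine . map (\x -> (x, 1))) ps)
  where
    combine :: Ord k => [(k, Int)] -> [(k, Int)]
    combine kvs =
      [ (k, sum (map snd g))
      | g@((k, _) : _) <- groupBy ((==) `on` fst) (sortOn fst kvs)
      ]

main :: IO ()
main = do
  let events = ["push", "fork", "push", "watch", "push", "fork", "issue"]
      ds = filterD (/= "issue") (fromList 3 events)
  print (countByKey ds)
  -- prints [("fork",2),("push",3),("watch",1)]
```

Pre-aggregating within each partition before merging keeps the amount of data that has to cross partition boundaries small, which is the same reason Spark-style frameworks combine locally before a shuffle.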

It uses pluggable Backends for spawning executors and ShuffleStores for exchanging information. See 'distributed-dataset-aws' for an implementation using AWS Lambda and S3.

It also exposes a more primitive Control.Distributed.Fork module which lets you run IO actions remotely. It is especially useful when your task is embarrassingly parallel.
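
As a rough picture of what "embarrassingly parallel" means here, the sketch below fans independent IO actions out and collects their results. It deliberately does not use Control.Distributed.Fork: spawnLocal and sumOfSquares are hypothetical helpers, and local threads from base stand in for remote executors. The real module has to ship actions to other machines, which a thread cannot model, and error handling is omitted.

```haskell
module Main where

import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (MVar, newEmptyMVar, putMVar, takeMVar)

-- Hypothetical local stand-in for "run this IO action somewhere else":
-- start the action on another thread and return a handle to its result.
-- If the action throws, the handle is never filled (no error handling).
spawnLocal :: IO a -> IO (MVar a)
spawnLocal action = do
  result <- newEmptyMVar
  _ <- forkIO (action >>= putMVar result)
  pure result

-- An independent unit of work: no task needs another task's output.
sumOfSquares :: Int -> Int -> IO Int
sumOfSquares lo hi = pure $! sum [x * x | x <- [lo .. hi]]

main :: IO ()
main = do
  -- Fan out: one task per input range.
  handles <-
    mapM (spawnLocal . uncurry sumOfSquares) [(1, 100), (101, 200), (201, 300)]
  -- Fan in: wait for every result, then combine them.
  partials <- mapM takeMVar handles
  print (sum partials)
  -- prints 9045050
```

Because the tasks share nothing, the only coordination is the final collection step, which is why this style of workload maps well onto short-lived remote workers.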

distributed-dataset-aws

This package provides a backend for 'distributed-dataset' using AWS services. Currently it supports running functions on AWS Lambda and using an S3 bucket as a shuffle store.

distributed-dataset-opendatasets

Provides Datasets that read from public open datasets. Currently it can fetch GitHub event data from GH Archive.

Running the example

Clone the repository.

  $ git clone https://github.com/utdemir/distributed-dataset
  $ cd distributed-dataset

Make sure that you have AWS credentials set up. The easiest way is to install the AWS command line interface and run:

  $ aws configure

Create an S3 bucket to put the deployment artifact in. You can use the console or the CLI:

  $ aws s3api create-bucket --bucket my-s3-bucket

Build and run the example:

If you use Nix on Linux:
(Recommended) Use my binary cache on Cachix to reduce compilation times:

  $ nix-env -i cachix # or your preferred installation method
  $ cachix use utdemir

Then:

  $ nix run -f ./default.nix example-gh -c example-gh my-s3-bucket

If you use stack (requires Docker, works on Linux and macOS):

  $ stack run --docker-mount $HOME/.aws/ --docker-env HOME=$HOME example-gh my-s3-bucket

Stability

Experimental. Expect lots of missing features, bugs, instability and API changes. You will probably need to modify the source if you want to do anything serious. See issues.

Contributing

I am open to contributions; any issue, PR or opinion is more than welcome.

To develop distributed-dataset, you can use:
- On Linux: Nix, cabal-install or stack.
- On macOS: stack with Docker.
Use ormolu to format source code.

Nix

You can use my binary cache on Cachix so that you don't recompile half of Hackage.
nix-shell will drop you into a shell with ormolu, cabal-install and steeloverseer, along with all required Haskell and system dependencies. You can use cabal new-* commands there.
The easiest way to get a development environment is to run sos at the top-level directory inside a nix-shell.

Stack

Make sure that you have Docker installed.
Use stack as usual; it will automatically use a Docker image.
Run ./make.sh stack-build before you send a PR to test different resolvers.

Related Work

Papers

Towards Haskell in the Cloud by Jeff Epstein, Andrew P. Black, Simon L. Peyton Jones
Resilient Distributed Datasets: A Fault-Tolerant Abstraction for In-Memory Cluster Computing by Matei Zaharia, et al.

Projects

Apache Spark.
Sparkle: Run Haskell on top of Apache Spark.
HSpark: Another attempt at porting Apache Spark to Haskell.