Apache Fluo™ is a distributed processing system that lets users make incremental updates to large data sets.



With Apache Fluo, users can set up workflows that execute cross-node transactions when data changes. These workflows enable users to continuously join new data into large existing data sets without reprocessing all data. Apache Fluo is built on Apache Accumulo.

Take the Fluo tour if you are interested in learning more. Feel free to contact us if you have questions.

Major Features

Reduced Latency

When combining new data with existing data, Fluo offers reduced latency compared to batch processing frameworks (e.g., Spark, MapReduce).

Transactions

Incremental updates are implemented using transactions, which allow thousands of updates to happen concurrently without corrupting data.

Core API

The core Fluo API supports simple, cross-node transactional updates using get/set methods.
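A get/set update might look like the following minimal sketch. It assumes a running Fluo application; the `fluo.properties` file, the row key `doc:0001`, and the `stats:wordCount` column are illustrative placeholders, not part of Fluo itself.

```java
import java.io.File;

import org.apache.fluo.api.client.FluoClient;
import org.apache.fluo.api.client.FluoFactory;
import org.apache.fluo.api.client.Transaction;
import org.apache.fluo.api.config.FluoConfiguration;
import org.apache.fluo.api.data.Column;

public class GetSetExample {
  public static void main(String[] args) {
    // Connection settings for an existing Fluo application (placeholder path).
    FluoConfiguration config = new FluoConfiguration(new File("fluo.properties"));

    try (FluoClient client = FluoFactory.newClient(config)) {
      try (Transaction tx = client.newTransaction()) {
        // Read the current value; gets() returns null if the cell is absent.
        String count = tx.gets("doc:0001", new Column("stats", "wordCount"));

        // Write the incremented value back with set().
        int next = (count == null) ? 1 : Integer.parseInt(count) + 1;
        tx.set("doc:0001", new Column("stats", "wordCount"), Integer.toString(next));

        // The read and write commit atomically; a conflicting concurrent
        // update causes the commit to fail rather than corrupt data.
        tx.commit();
      }
    }
  }
}
```

If another client updates the same cell between the get and the commit, the transaction fails and can simply be retried, which is how Fluo keeps thousands of concurrent updates consistent.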

Avoid Reprocessing Data

Combine new data with existing data without having to reprocess the entire dataset.

General Purpose

Fluo applications consist of a series of observers that execute user code when observed data is updated.
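An observer can be sketched as below, assuming the Fluo 1.x `Observer` interface; the `doc:content` and `doc:length` columns and the length-computing logic are hypothetical examples, not part of Fluo.

```java
import org.apache.fluo.api.client.TransactionBase;
import org.apache.fluo.api.data.Bytes;
import org.apache.fluo.api.data.Column;
import org.apache.fluo.api.observer.Observer;

// Runs user code transactionally whenever the observed column is updated.
public class ContentObserver implements Observer {
  @Override
  public void process(TransactionBase tx, Bytes row, Column col) throws Exception {
    // Read the data that changed (placeholder column).
    String content = tx.gets(row.toString(), new Column("doc", "content"));
    if (content != null) {
      // Derive and write new data; this write can in turn trigger
      // other observers, chaining incremental updates.
      tx.set(row.toString(), new Column("doc", "length"),
          Integer.toString(content.length()));
    }
  }
}
```

Writes made by an observer can notify further observers, so an application is a chain of small transactional steps rather than one batch job.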

Recipes API

The Fluo Recipes API builds on the core API to offer complex transactional updates.