Apache Fluo - Large-scale Incremental Processing

Apache Fluo™ is a distributed processing system that lets users make incremental updates to large data sets.

Overview

With Apache Fluo, users can set up workflows that execute cross-node transactions when data changes. These workflows let users continuously join new data into large existing data sets without reprocessing all of the data. Apache Fluo is built on Apache Accumulo.

Take the Fluo tour if you are interested in learning more. Feel free to contact us if you have questions.

Latest News

Apache Fluo 2.0.0 (Apr 2023)

How Fluo Leveraged Scan Executors (Sep 2019)

Apache Fluo YARN 1.0.0 (Mar 2018)

Apache Fluo Recipes 1.2.0 (Mar 2018)

Apache Fluo 1.2.0 (Feb 2018)

View all posts in the news archive

Major Features

Reduced Latency

When combining new data with existing data, Fluo offers lower latency than batch processing frameworks (e.g., Spark, MapReduce).

Reliable

Incremental updates are implemented using transactions, which allow thousands of updates to happen concurrently without corrupting data.

Core API

The core Fluo API supports simple, cross-node transactional updates using get/set methods.
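As a rough illustration of what transactional get/set updates involve, here is a toy, single-process Java sketch. It does not use the Fluo API: ToyStore, ToyTransaction, and ToyTxDemo are hypothetical names invented for this example, and the version check at commit time is a generic optimistic-concurrency scheme assumed for illustration, not a description of how Fluo detects conflicts.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model only: every class here is hypothetical and is NOT part of the Fluo API.
final class ToyTxDemo {
    // Read-modify-write with retry: re-run the transaction until its commit succeeds.
    static int increment(ToyStore store, String key) {
        while (true) {
            ToyTransaction tx = new ToyTransaction(store);
            String current = tx.get(key);
            int next = (current == null ? 0 : Integer.parseInt(current)) + 1;
            tx.set(key, Integer.toString(next));
            if (tx.commit()) {
                return next;
            }
        }
    }

    public static void main(String[] args) {
        ToyStore store = new ToyStore();
        // Two transactions read the same key before either one commits.
        ToyTransaction a = new ToyTransaction(store);
        ToyTransaction b = new ToyTransaction(store);
        a.get("count");
        b.get("count");
        a.set("count", "1");
        b.set("count", "1");
        System.out.println("a committed: " + a.commit());
        System.out.println("b committed: " + b.commit());
        System.out.println("count after retry: " + increment(store, "count"));
    }
}

// Shared state: each key holds a value plus a version that grows on every write.
final class ToyStore {
    static final class Versioned {
        final String value;  // null when the key has never been written
        final long version;  // 0 when the key has never been written

        Versioned(String value, long version) {
            this.value = value;
            this.version = version;
        }
    }

    private final Map<String, Versioned> data = new HashMap<>();

    synchronized Versioned read(String key) {
        return data.getOrDefault(key, new Versioned(null, 0L));
    }

    // Apply all writes atomically, but only if nothing the transaction read has changed.
    synchronized boolean commit(Map<String, Long> readVersions, Map<String, String> writes) {
        for (Map.Entry<String, Long> r : readVersions.entrySet()) {
            if (read(r.getKey()).version != r.getValue()) {
                return false; // another transaction committed a write to this key first
            }
        }
        for (Map.Entry<String, String> w : writes.entrySet()) {
            long next = read(w.getKey()).version + 1;
            data.put(w.getKey(), new Versioned(w.getValue(), next));
        }
        return true;
    }
}

// A transaction buffers its writes and remembers the version of every key it read.
final class ToyTransaction {
    private final ToyStore store;
    private final Map<String, Long> readVersions = new HashMap<>();
    private final Map<String, String> writes = new HashMap<>();

    ToyTransaction(ToyStore store) {
        this.store = store;
    }

    String get(String key) {
        if (writes.containsKey(key)) {
            return writes.get(key); // a transaction sees its own uncommitted writes
        }
        ToyStore.Versioned v = store.read(key);
        readVersions.putIfAbsent(key, v.version);
        return v.value;
    }

    void set(String key, String value) {
        writes.put(key, value);
    }

    boolean commit() {
        return store.commit(readVersions, writes);
    }
}
```

In this sketch, two transactions that read the same key and then write it cannot both commit. The one that loses retries against the newly committed value, so no increment is lost.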

Avoid Reprocessing Data

Combine new data with existing data without having to reprocess the entire data set.

General Purpose

Fluo applications consist of a series of observers that execute user code when observed data is updated.
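The observer idea can be sketched with a toy, in-memory Java model. None of the names below (ToyObserver, ToyTable, ToyObserverDemo) come from the Fluo API; they are hypothetical and exist only to show the pattern: writing to an observed column queues a notification, and a worker later runs user code for that cell. The example keeps a running word total up to date by applying a delta for the changed document instead of recounting every document.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;

// Toy model only: hypothetical types, NOT Fluo's observer interfaces.
final class ToyObserverDemo {
    // Builds a table whose "content" column is observed by a word-count updater.
    static ToyTable newWordCountTable() {
        ToyTable table = new ToyTable();
        table.observe("content", (t, row, column) -> {
            String text = t.get(row, column).trim();
            int newCount = text.isEmpty() ? 0 : text.split("\\s+").length;

            String old = t.get(row, "wordCount");
            int oldCount = old == null ? 0 : Integer.parseInt(old);

            String tot = t.get("all", "total");
            int total = tot == null ? 0 : Integer.parseInt(tot);

            // Incremental update: adjust the total by this row's delta only.
            t.set(row, "wordCount", Integer.toString(newCount));
            t.set("all", "total", Integer.toString(total + newCount - oldCount));
        });
        return table;
    }

    public static void main(String[] args) {
        ToyTable table = newWordCountTable();
        table.set("doc1", "content", "a b c");
        table.set("doc2", "content", "d e");
        table.runObservers();
        table.set("doc1", "content", "a b"); // changed data triggers the observer again
        table.runObservers();
        System.out.println("total words: " + table.get("all", "total"));
    }
}

// User code that runs when an observed cell is updated.
interface ToyObserver {
    void process(ToyTable table, String row, String column);
}

final class ToyTable {
    private final Map<String, String> cells = new HashMap<>();
    private final Map<String, ToyObserver> observers = new HashMap<>();
    private final Queue<String[]> pending = new ArrayDeque<>();

    private static String key(String row, String column) {
        return row + '\0' + column; // assumes rows and columns contain no NUL character
    }

    void observe(String column, ToyObserver observer) {
        observers.put(column, observer);
    }

    String get(String row, String column) {
        return cells.get(key(row, column));
    }

    void set(String row, String column, String value) {
        cells.put(key(row, column), value);
        if (observers.containsKey(column)) {
            pending.add(new String[] {row, column}); // queue a notification for later
        }
    }

    // Drain queued notifications; observers may write cells that queue more work.
    void runObservers() {
        String[] n;
        while ((n = pending.poll()) != null) {
            observers.get(n[1]).process(this, n[0], n[1]);
        }
    }
}
```

Because an observer can write to columns that are themselves observed, several small observers can be chained into a larger workflow. This toy version runs in one thread with no transactions, which a real distributed system would need.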

Recipes API

The Fluo Recipes API builds on the core API to offer complex transactional updates.