Apache Fluo is a distributed processing system that lets users make incremental updates to large data sets. With Apache Fluo, users can set up workflows that execute cross-node transactions when data changes. These workflows enable users to continuously join new data into large existing data sets without reprocessing all of the data. Apache Fluo is built on [Apache Accumulo].

Below are resources for this release:

Apache Fluo follows semantic versioning (semver) for its API. The API consists of everything under the org.apache.fluo.api package. Code outside of this package can change at any time. If your project uses Fluo code that falls outside of the API, consider initiating a discussion about adding it to the API.
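As a rough illustration of the package boundary described above, the following sketch classifies fully qualified class names by whether they fall under org.apache.fluo.api. The `FluoApiBoundary` class and its methods are hypothetical helpers written for this example and are not part of Fluo; the class names in `main` are only sample inputs.

```java
import java.util.List;

/** Hypothetical helper (not part of Fluo): flags class names outside the public API package. */
final class FluoApiBoundary {
  // Package prefix that the release notes name as covered by semver.
  static final String API_PREFIX = "org.apache.fluo.api.";
  // Prefix shared by all Fluo code, API or not.
  static final String FLUO_PREFIX = "org.apache.fluo.";

  /** True when the fully qualified class name is under org.apache.fluo.api. */
  static boolean isPublicApi(String className) {
    return className.startsWith(API_PREFIX);
  }

  /** True when the class belongs to Fluo but sits outside its public API. */
  static boolean isInternalFluo(String className) {
    return className.startsWith(FLUO_PREFIX) && !isPublicApi(className);
  }

  public static void main(String[] args) {
    // Sample inputs only; a real check would read the imports of your project.
    List<String> imports = List.of(
        "org.apache.fluo.api.client.FluoClient",
        "org.apache.fluo.core.impl.Environment",
        "java.util.Map");
    for (String name : imports) {
      if (isInternalFluo(name)) {
        System.out.println("Not covered by semver: " + name);
      }
    }
  }
}
```

A check along these lines could run in a build to surface dependencies on code that may change between releases.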

Notable changes

The major changes in 2.0.0 are highlighted here. For the complete list of changes, see the 2.0.0 Milestone on GitHub.

  • Many performance improvements and bug fixes were made.
  • Fluo was updated to work with Accumulo 2.1.0, Java 11, and Hadoop 3.
  • Added scan authorizations to snapshots. (#1120)
  • Added asynchronous get methods to snapshots for reading data. (#969)
  • Added an asynchronous submit method to LoaderExecutor. (#1100)
  • Added an option to Fluo’s scan command to show notifications. (#1026)
  • Fluo now sets Accumulo scan hints. These can be used to optimize Fluo’s server-side scan execution in Accumulo. (#1072)
  • Summaries of Fluo’s metadata are generated in Accumulo. These could be used to select files for compaction in Accumulo. (#1071)
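
To show why asynchronous get methods can help, here is a minimal sketch of the general pattern: issue several reads before waiting on any of them, then combine the results. It does not use the Fluo API. The `AsyncReader` interface, its `getsAsync` method, and the in-memory implementation are hypothetical stand-ins so the example runs without a cluster; consult the Fluo API Javadoc for the real method names and signatures.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;

final class AsyncGetSketch {
  /** Hypothetical stand-in for a snapshot with an async read; NOT the Fluo API. */
  interface AsyncReader {
    CompletableFuture<String> getsAsync(String row, String column);
  }

  /** In-memory reader keyed by "row:column" so the sketch runs without Fluo. */
  static AsyncReader inMemory(Map<String, String> cells) {
    return (row, column) ->
        CompletableFuture.supplyAsync(() -> cells.getOrDefault(row + ":" + column, ""));
  }

  /** Starts both reads before waiting on either, then joins the two values. */
  static CompletableFuture<String> fullName(AsyncReader reader, String row) {
    CompletableFuture<String> first = reader.getsAsync(row, "first");
    CompletableFuture<String> last = reader.getsAsync(row, "last");
    return first.thenCombine(last, (f, l) -> f + " " + l);
  }

  public static void main(String[] args) {
    AsyncReader reader =
        inMemory(Map.of("user1:first", "Ada", "user1:last", "Lovelace"));
    // join() blocks only once, after both reads have been issued.
    System.out.println(fullName(reader, "user1").join());
  }
}
```

With blocking reads the two lookups would run one after the other; with futures they can overlap, which matters when each read is a round trip to a remote server.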