Fluo is a transaction layer that enables incremental processing on big data.
Fluo is an implementation of Percolator built on Accumulo that runs in YARN. It is not recommended for production use yet.
Getting Started
There are several ways to run Fluo (listed in order of increasing difficulty):
- quickstart - Starts a MiniFluo instance that is configured to run a word count application
- MiniFluo - Sets up a minimal Fluo instance that writes its data to a single directory
- fluo-dev - Command-line tool for running Fluo and its dependencies on a single machine
- Zetten - Command-line tool that launches an AWS cluster and deploys Fluo and its dependencies to it
- Production - Sets up Fluo on a cluster where Accumulo, Hadoop, and ZooKeeper are running
Except for quickstart, all of the options above set up a Fluo application that stays idle until you create client and observer code for it. You can either create your own application or configure your Fluo application to run one of the examples below:
- phrasecount - Computes phrase counts for unique documents
- fluo-stress - Computes the number of unique integers by building a bitwise trie
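To give a feel for the idea behind the fluo-stress example, here is a minimal single-process sketch of counting unique integers with a bitwise trie. It is not the fluo-stress code and uses no Fluo API; the names `BitTrie` and `count_unique` and the `bits` parameter are hypothetical, chosen only for illustration. Fluo would maintain such a structure incrementally across transactions, which this sketch does not attempt to show.

```python
class BitTrie:
    """Binary trie over fixed-width non-negative integers (illustrative only)."""

    def __init__(self, bits=32):
        self.bits = bits      # every value is stored as a path of exactly `bits` branches
        self.root = {}        # each node is a dict mapping a bit (0 or 1) to a child node
        self.unique = 0       # number of distinct values inserted so far

    def insert(self, value):
        """Insert `value`; return True if it had not been seen before."""
        if not 0 <= value < (1 << self.bits):
            raise ValueError("value does not fit in %d bits" % self.bits)
        node = self.root
        is_new = False
        # Walk from the most significant bit down to the least significant bit.
        for shift in range(self.bits - 1, -1, -1):
            bit = (value >> shift) & 1
            if bit not in node:
                node[bit] = {}
                is_new = True  # a freshly created branch means the value is new
            node = node[bit]
        if is_new:
            self.unique += 1
        return is_new


def count_unique(values, bits=32):
    """Count distinct integers in `values` by inserting them into a BitTrie."""
    trie = BitTrie(bits)
    for v in values:
        trie.insert(v)
    return trie.unique
```

Because every value follows a full-depth path, a value is a duplicate exactly when no new branch had to be created while inserting it.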
Implementation
- Architecture - Overview of Fluo’s architecture
- Contributing - Documentation for developers who want to contribute to Fluo
- Metrics - Fluo metrics are visible via JMX by default but can be configured to report to Graphite or Ganglia