Author : Keith Turner
30 Sep 2019
Accumulo 2.0 introduced Scan Executors, giving control over the processing of scans in Accumulo tablet servers. Fluo has a good use case for scan executors: notification scans. Fluo workers continually scan for notifications to find transactions to execute. All workers continually scanning for notifications puts load on Accumulo tablet servers, which could negatively impact transactions. Scan executors provide a way to limit this load.
Fluo utilizes this feature by setting a scan hint of the form scan_type=fluo-ntfy on notification scans. These hints are passed to Accumulo tablet servers and are ignored by default. For scans carrying this hint, Accumulo can be configured to send them to a special thread pool, to prioritize them differently within a thread pool, or both. Below is an example of Accumulo shell commands that set up a special executor for notification scans.
config -s tserver.scan.executors.fnotify.threads=1
config -t fluo_table -s table.scan.dispatcher=org.apache.accumulo.core.spi.scan.SimpleScanDispatcher
config -t fluo_table -s table.scan.dispatcher.opts.executor.fluo-ntfy=fnotify
The system setting tserver.scan.executors.fnotify.threads=1 creates a single-threaded scan executor named fnotify in each tablet server. The two per-table settings configure a scan dispatcher (the SimpleScanDispatcher is built into Accumulo) on the table fluo_table. The scan dispatcher is configured such that when a scan hint of scan_type=fluo-ntfy is seen, the scan runs on the executor fnotify. All other scans run on the default executor. This has the effect of running all notification scans on a single dedicated thread in each tablet server.
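The dispatch rule described above can be modeled in a few lines of plain Java. This is only a sketch of the idea, not Accumulo's SimpleScanDispatcher: the class ScanDispatchSketch, the method dispatch, and the executor name "default" are hypothetical names chosen for illustration, and the two maps stand in for the scan's hints and the table's executor mapping.

```java
import java.util.Map;

// Toy model of hint-based scan dispatch. All names are hypothetical;
// this does not use or reproduce any Accumulo class.
final class ScanDispatchSketch {
    // Assumed name for the executor that handles scans with no matching hint.
    static final String DEFAULT_EXECUTOR = "default";

    // Picks an executor name from a scan's hints and a per-table mapping
    // of scan type to executor name.
    static String dispatch(Map<String, String> hints,
                           Map<String, String> executorByScanType) {
        String scanType = hints.get("scan_type");
        if (scanType == null) {
            // The scan carried no scan_type hint.
            return DEFAULT_EXECUTOR;
        }
        // An unmapped scan type is treated the same as no hint at all.
        return executorByScanType.getOrDefault(scanType, DEFAULT_EXECUTOR);
    }

    public static void main(String[] args) {
        // Mirrors the mapping of fluo-ntfy to fnotify in the shell commands.
        Map<String, String> tableConfig = Map.of("fluo-ntfy", "fnotify");

        // A notification scan goes to the dedicated executor.
        System.out.println(dispatch(Map.of("scan_type", "fluo-ntfy"), tableConfig));
        // A scan without hints goes to the default executor.
        System.out.println(dispatch(Map.of(), tableConfig));
        // With no mapping configured, the hint has no effect.
        System.out.println(dispatch(Map.of("scan_type", "fluo-ntfy"), Map.of()));
    }
}
```

Running main prints fnotify, default, and default. The last case corresponds to hints being ignored when no dispatcher mapping is configured.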
The above settings were tested in a scenario where 20 Fluo workers were run against a single tablet server with 20 tablets. The Fluo stress test was run with a low ingest rate, resulting in continual notification scanning by the 20 workers. While the test was running, top was used to inspect the tablet server. This inspection revealed that notification scans were all running in a single thread, which was using 100% of a single core. This left all of the other cores free to process transactions. Further testing is needed to see how this impacts throughput. Judging from the worker debug logs, all of the workers seemed to complete notification scans quickly and find new work.
Fluo took a descriptive approach to using scan hints, where it described to Accumulo the type of scan it was running. However, Fluo does not care what, if anything, Accumulo does with that information. This allows administrators to configure Accumulo in many different ways to handle notification scans, without any changes to Fluo.
For my first pass at using scan executors, I tried a prescriptive approach. I attempted to use scan hints to explicitly name an executor for notification scans. I realized this would require Fluo configuration to provide the name of the scan executor. Forcing a user to specify Accumulo and Fluo configuration was very cumbersome so I abandoned the prescriptive approach. The descriptive approach I settled on in its place is less cumbersome (it only requires Accumulo config) and more flexible (it supports executors and/or prioritization instead of only executors).
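To illustrate why the descriptive approach is flexible, the sketch below handles the same scan_type hint through prioritization instead of a dedicated executor, with no change on the client side. It is a toy model only: the class HintPrioritySketch, its methods, the default priority of 1, and the convention that lower numbers run first are all assumptions made for this example and do not describe Accumulo's actual prioritizer.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;

// Toy model of hint-based scan prioritization. All names and the
// "lower number runs first" convention are hypothetical.
final class HintPrioritySketch {
    // Assumed priority for scans whose type has no configured priority.
    static final int DEFAULT_PRIORITY = 1;

    // A queued scan: an id for display plus the hints the client attached.
    static final class Scan {
        final String id;
        final Map<String, String> hints;

        Scan(String id, Map<String, String> hints) {
            this.id = id;
            this.hints = hints;
        }
    }

    // Looks up a priority from the scan_type hint, falling back to the default.
    static int priority(Map<String, String> hints,
                        Map<String, Integer> priorityByScanType) {
        String scanType = hints.get("scan_type");
        if (scanType == null) {
            return DEFAULT_PRIORITY;
        }
        return priorityByScanType.getOrDefault(scanType, DEFAULT_PRIORITY);
    }

    // Returns scan ids in the order they would run. List.sort is stable,
    // so scans with equal priority keep their arrival order.
    static List<String> runOrder(List<Scan> queued,
                                 Map<String, Integer> priorityByScanType) {
        List<Scan> ordered = new ArrayList<>(queued);
        ordered.sort(Comparator.comparingInt(
            (Scan s) -> priority(s.hints, priorityByScanType)));
        List<String> ids = new ArrayList<>();
        for (Scan s : ordered) {
            ids.add(s.id);
        }
        return ids;
    }

    public static void main(String[] args) {
        List<Scan> queued = List.of(
            new Scan("ntfy-1", Map.of("scan_type", "fluo-ntfy")),
            new Scan("tx-a", Map.of()),
            new Scan("ntfy-2", Map.of("scan_type", "fluo-ntfy")),
            new Scan("tx-b", Map.of()));

        // Administrator deprioritizes notification scans.
        System.out.println(runOrder(queued, Map.of("fluo-ntfy", 10)));
        // No configuration: the hint changes nothing.
        System.out.println(runOrder(queued, Map.of()));
    }
}
```

With the first configuration the order is tx-a, tx-b, ntfy-1, ntfy-2; with no configuration the scans keep their arrival order. In both cases the client sends exactly the same hint, which is the point of describing the scan and leaving the choice of executor to the server side.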
At the time of this writing, no released version of Fluo supports Accumulo 2.0. Once Fluo 1.3.0 is released with Accumulo 2.0, Hadoop 3.0, and Java 11 support, it will include support for scan executors.