Apache NiFi is a powerful tool for moving and transforming data through interactive flows that you build and control from a GUI. It suits most workloads that involve data streaming, data ingestion, and data transformation.
Below are a few best practices that can help you design efficient flow patterns in Apache NiFi.
Trigger the Flow Using ListenHTTP
Most NiFi processors need an incoming flow file to trigger them. For example, picking up a file with the FetchSFTP processor, or doing a complex data format conversion with the ConvertRecord processor, requires an incoming flow file to start the work. In other scenarios, the flow file content is streamed in from other activities. In the first scenario, you can put a ListenHTTP processor at the start of the flow and make a REST call to it with the file name, so that the flow starts when an event outside the scope of the NiFi flow happens.
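As a rough illustration of such a REST call, here is a minimal Python sketch using only the standard library. The host, port, and file name are hypothetical. The URL assumes ListenHTTP is left at its default base path (contentListener), and the "filename" header only becomes a flow file attribute if the processor is configured to accept that header as an attribute. The request body becomes the flow file content.

```python
import urllib.request


def build_trigger_request(listener_url: str, filename: str) -> urllib.request.Request:
    """Build a POST request that asks ListenHTTP to create one flow file.

    The body (the file name) becomes the flow file content. The "filename"
    header is an assumption: it is only turned into an attribute if the
    processor is configured to receive that header as an attribute.
    """
    return urllib.request.Request(
        url=listener_url,
        data=filename.encode("utf-8"),
        method="POST",
        headers={"Content-Type": "text/plain", "filename": filename},
    )


def trigger_flow(listener_url: str, filename: str, timeout: float = 10.0) -> int:
    """Send the trigger request and return the HTTP status code."""
    request = build_trigger_request(listener_url, filename)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status
```

An external scheduler or script could then call trigger_flow("http://nifi-host:8081/contentListener", "daily_report.csv") once the outside event has happened, and a downstream processor such as FetchSFTP could read the file name from the flow file.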
Increase the Concurrent Tasks of a Processor for Parallelism
When the number of incoming flow files increases, a processor can become a bottleneck, which raises latency and reduces throughput. The flow files then queue up in the connection that feeds the processor. In this case, you can raise the number of concurrent tasks for that processor (the Concurrent Tasks setting on its Scheduling tab) so that it works on several flow files in parallel.
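The same setting can also be changed outside the GUI. The sketch below only builds the JSON body that, as far as I know, the NiFi REST API expects in a PUT to /nifi-api/processors/{id}; the processor id and revision version shown are hypothetical, and you should check the field names against the REST API documentation for your NiFi version. The processor typically has to be stopped before its configuration can be updated.

```python
import json


def build_concurrency_update(processor_id: str, revision_version: int,
                             concurrent_tasks: int) -> dict:
    """Build a processor update body that changes only the concurrent task count.

    revision_version must match the processor's current revision, which can
    be read with a GET on the same processor resource beforehand.
    """
    if concurrent_tasks < 1:
        raise ValueError("concurrent_tasks must be at least 1")
    return {
        "revision": {"version": revision_version},
        "component": {
            "id": processor_id,
            "config": {"concurrentlySchedulableTaskCount": concurrent_tasks},
        },
    }


# Hypothetical id and revision, for illustration only.
body = build_concurrency_update("hypothetical-processor-id", 3, 4)
payload = json.dumps(body)
```

Raise the value gradually: every extra task competes for the same thread pool, so a very high number on one processor can starve the others.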
DistributeLoad Processor
You can also use the DistributeLoad processor to spread flow files evenly across multiple downstream processors, in the order in which they arrive at the DistributeLoad processor. You can choose among the round robin, next available, and load distribution service strategies.
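To make the difference between the first two strategies concrete, here is a simplified Python model. It is not NiFi's implementation: relationships are plain integers, and the free_slots argument is a hypothetical stand-in for how much room each downstream queue has before back pressure applies.

```python
from itertools import cycle


def round_robin(flowfiles, relationship_count):
    """Assign flow files to relationships 1..N in strict rotation."""
    targets = cycle(range(1, relationship_count + 1))
    return [(flowfile, next(targets)) for flowfile in flowfiles]


def next_available(flowfiles, free_slots):
    """Rotate through relationships, skipping any whose queue is full.

    free_slots maps relationship -> remaining capacity (hypothetical model).
    Flow files that find no free relationship stay unassigned.
    """
    relationships = list(free_slots)
    remaining = dict(free_slots)
    pointer = 0
    assignments = []
    for flowfile in flowfiles:
        for offset in range(len(relationships)):
            index = (pointer + offset) % len(relationships)
            relationship = relationships[index]
            if remaining[relationship] > 0:
                remaining[relationship] -= 1
                assignments.append((flowfile, relationship))
                pointer = (index + 1) % len(relationships)
                break
        else:
            break  # every relationship is full, so stop assigning
    return assignments
```

In this model, round robin keeps a strict rotation regardless of how busy each target is, while next available lets a slow or full target be skipped so that the remaining targets keep receiving work.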