Performance & Debugging tool

Grain offers two configurable modes that can be set to gain deeper insights into pipeline execution and identify potential issues.

# @test {"output": "ignore"}
!pip install grain

Visualization mode

To get an overview of your dataset pipeline's structure and a clear understanding of how data flows through it, enable visualization mode. This logs a visual representation of the pipeline, making it easy to identify the transformation stages and how they relate to each other. To enable visualization mode, set the flag --grain_py_dataset_visualization_output_dir="" or call grain.config.update("py_dataset_visualization_output_dir", "").

# @test {"output": "ignore"}
import grain.python as grain

grain.config.update("py_dataset_visualization_output_dir", "")
ds = (
    grain.MapDataset.range(20)
    .seed(seed=42)
    .shuffle()
    .batch(batch_size=2)
    .map(lambda x: x)
    .to_iter_dataset()
)
it = iter(ds)

# Visualization graph is constructed once the dataset produces the first element
for _ in range(10):
  next(it)
Grain Dataset graph:

RangeMapDataset(start=0, stop=20, step=1)
  ││
  ││  
  ││
  ╲╱
"<class 'int'>[]"

  ││
  ││  WithOptionsMapDataset
  ││
  ╲╱
"<class 'int'>[]"

  ││
  ││  ShuffleMapDataset
  ││
  ╲╱
"<class 'int'>[]"

  ││
  ││  BatchMapDataset(batch_size=2, drop_remainder=False)
  ││
  ╲╱
'int64[2]'

  ││
  ││  MapMapDataset(transform=<lambda> @ <ipython-input-1-930f8fd1bf7d>:9)
  ││
  ╲╱
'int64[2]'

  ││
  ││  PrefetchDatasetIterator(read_options=ReadOptions(num_threads=16, prefetch_buffer_size=500), allow_nones=False)
  ││
  ╲╱
'int64[2]'

Debug mode

To troubleshoot performance issues in your dataset pipeline, enable debug mode. This logs a real-time execution summary of the pipeline at one-minute intervals. The summary provides detailed information on each transformation stage, such as processing time and number of elements processed, which helps identify the slower stages in the pipeline. To enable debug mode, set the flag --grain_py_debug_mode=true or call grain.config.update("py_debug_mode", True).

import time


# Define a dummy slow preprocessing function
def _dummy_slow_fn(x):
  time.sleep(10)
  return x
# @test {"output": "ignore"}
import time

grain.config.update("py_debug_mode", True)

ds = (
    grain.MapDataset.range(20)
    .seed(seed=42)
    .shuffle()
    .batch(batch_size=2)
    .map(_dummy_slow_fn)
    .to_iter_dataset()
    .map(_dummy_slow_fn)
)
it = iter(ds)

for _ in range(10):
  next(it)
Grain Dataset Execution Summary:

NOTE: Before analyzing the `MapDataset` nodes, ensure that the `total_processing_time` of the `PrefetchDatasetIterator` node indicates it is a bottleneck. The `MapDataset` nodes are executed in multiple threads and thus, should not be compared to the `total_processing_time` of `DatasetIterator` nodes.

|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| id | name                           | inputs | percent wait time | total processing time | min processing time | max processing time | avg processing time | num produced elements |
|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| 6  | RangeMapDataset(start=0, stop= | []     | 0.00%             | 86.92us               | 1.00us              | 53.91us             | 4.35us              | 20                    |
|    | 20, step=1)                    |        |                   |                       |                     |                     |                     |                       |
|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| 5  | WithOptionsMapDataset          | [6]    | 0.00%             | N/A                   | N/A                 | N/A                 | N/A                 | N/A                   |
|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| 4  | ShuffleMapDataset              | [5]    | 0.00%             | 15.95ms               | 42.40us             | 2.28ms              | 797.35us            | 20                    |
|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| 3  | BatchMapDataset(batch_size=2,  | [4]    | 0.00%             | 803.14us              | 47.04us             | 290.24us            | 80.31us             | 10                    |
|    | drop_remainder=False)          |        |                   |                       |                     |                     |                     |                       |
|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| 2  | MapMapDataset(transform=_dummy | [3]    | 16.68%            | 100.08s               | 10.00s              | 10.01s              | 10.01s              | 10                    |
|    | _slow_fn @ <ipython-input-2-23 |        |                   |                       |                     |                     |                     |                       |
|    | 02a47a813f>:4)                 |        |                   |                       |                     |                     |                     |                       |
|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| 1  | PrefetchDatasetIterator(read_o | [2]    | N/A               | 10.02s                | 12.40us             | 10.02s              | 1.67s               | 6                     |
|    | ptions=ReadOptions(num_threads |        |                   |                       |                     |                     |                     |                       |
|    | =16, prefetch_buffer_size=500) |        |                   |                       |                     |                     |                     |                       |
|    | , allow_nones=False)           |        |                   |                       |                     |                     |                     |                       |
|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| 0  | MapDatasetIterator(transform=_ | [1]    | 83.32%            | 50.05s                | 10.01s              | 10.01s              | 10.01s              | 5                     |
|    | dummy_slow_fn @ <ipython-input |        |                   |                       |                     |                     |                     |                       |
|    | -2-2302a47a813f>:4)            |        |                   |                       |                     |                     |                     |                       |
|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|

In the execution summary above, the MapDatasetIterator node (id 0) accounts for 83.32% of the wait time (see the percent wait time column), which makes it the slowest stage of the pipeline.

Note that, judging by total_processing_time alone, MapMapDataset (id 2) might appear to be the slowest stage. However, the nodes with ids 2 to 6 are executed in multiple threads, so their total_processing_time should not be compared directly with the total_processing_time of the iterator nodes (such as id 0).
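
To make the comparison concrete, the following minimal sketch copies three rows from the execution summary above into plain Python tuples and ranks them two ways. It does not call any Grain API; the `rows` list and the variable names are illustrative assumptions, and the numbers are transcribed by hand from the table.

```python
# Rows transcribed by hand from the execution summary above (not a Grain API):
# (id, name, percent wait time, total processing time in seconds).
# The PrefetchDatasetIterator row reports N/A for percent wait time, hence None.
rows = [
    (2, "MapMapDataset", 16.68, 100.08),
    (1, "PrefetchDatasetIterator", None, 10.02),
    (0, "MapDatasetIterator", 83.32, 50.05),
]

# Ranking by total processing time sums the work done across all threads,
# so it favors the multi-threaded MapMapDataset node.
by_total = max(rows, key=lambda row: row[3])

# Ranking by percent wait time points at the stage the pipeline actually
# spends its time waiting on.
by_wait = max(
    (row for row in rows if row[2] is not None),
    key=lambda row: row[2],
)

print("Largest total processing time:", by_total[1])
print("Largest percent wait time:", by_wait[1])
```

The two rankings disagree: total processing time singles out MapMapDataset, while percent wait time singles out MapDatasetIterator, which matches the conclusion drawn from the summary.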