Frugal Sketching

Usage

  • Step: Advances the simulation by sampling one or more random values from the stream. All sampled values are inserted into each Frugal sketch.
  • Step Size: The number of values that will be sampled from the stream and inserted into each sketch as the simulation advances. This is unrelated to the 'step' internal to the Frugal algorithms. Step size may never be less than one.
  • Auto Scale: Automatically adjusts the step size based on the current length of the stream. This provides a detailed view at the beginning of the stream and smooths the visualization as the number of items in the stream increases.
  • Play: Starts the simulation. Random values will be sampled from the stream and inserted into each Frugal sketch with a delay between each step. Clicking the button again pauses the progress of the simulation.
  • Delay: The number of milliseconds to wait between steps when Play is pressed. Reduce it to make the simulation play faster, or increase it to make it play slower.
  • Reset: Resets the state of the stream and all of the Frugal sketches. The algorithm being displayed, the distribution generating stream elements, and the distribution parameters are not reset.
  • Initial Guess: Sets the starting value for all Frugal sketches. When the reset button is pressed, all Frugal sketches will be initialized to this value.
  • Frugal Algorithm: Chooses the Frugal algorithm to display. When the 2U algorithm is selected, multiple choices of step function are available. All Frugal sketches are fed the same stream of data.
  • Distribution Selection: Selects the distribution currently being sampled from, and changes the parameters of the currently selected distribution. Distributions and choices of parameters may be changed as the length of the stream increases. Changing the active distribution while the simulation runs is allowed and encouraged.
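
To make the Step and Step Size controls concrete, here is a minimal Python sketch of one simulation step. It assumes the Frugal-1U update rule as it is commonly described (move the estimate up or down by one unit with a probability tied to the target quantile); the simulation's own code may differ, and the names frugal_1u_update, run_step, and sample are hypothetical.

```python
import random


def frugal_1u_update(estimate, value, quantile, rng):
    """Return the new estimate after seeing one stream value (Frugal-1U style)."""
    r = rng.random()
    if value > estimate and r > 1.0 - quantile:
        return estimate + 1  # nudge up with probability `quantile`
    if value < estimate and r > quantile:
        return estimate - 1  # nudge down with probability `1 - quantile`
    return estimate


def run_step(estimates, stream, step_size, sample, rng):
    """Sample `step_size` values and insert each one into every sketch.

    `estimates` maps a quantile (e.g. 0.5) to its current estimate.
    `sample` is a hypothetical callable that draws one value from the
    active distribution.
    """
    if step_size < 1:
        raise ValueError("step size may never be less than one")
    for _ in range(step_size):
        value = sample(rng)
        stream.append(value)
        for q in estimates:
            estimates[q] = frugal_1u_update(estimates[q], value, q, rng)


if __name__ == "__main__":
    rng = random.Random(0)
    initial_guess = 0
    estimates = {0.5: initial_guess, 0.9: initial_guess}
    stream = []
    for _ in range(200):
        run_step(estimates, stream, 100, lambda r: r.randint(0, 1000), rng)
    print(estimates)
```

Every sketch sees the same sampled value, which mirrors the note above that all Frugal sketches are fed the same stream of data.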

Simulation Elements

  • Stream Value: The last randomly generated value that was inserted into the Frugal sketches. When step size is greater than 1, this is the most recent value added to the stream in the latest batch of stream elements.
  • Actual Value: The exact quantile computed over the entire stream so far.
  • Frugal Estimate: The Frugal quantile estimate of the stream.
  • % Error: The difference between the estimated and actual quantile, divided by the actual quantile. The sparkline displays the change in the error over time. Sparklines are color-coded by quantile and match the plot of the estimated values below.
  • Estimated Values: The value of the Frugal sketches for each quantile as the length of the stream increases. The color of each line corresponds to the color of the sparklines in the error table.
  • Histogram: The distribution of the values in the stream. The vertical axis is shared with the line plot so that the current position of the time series is reflected on the histogram. Each element sampled from the selected distribution updates this plot as well as the Frugal sketches.
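
The Actual Value and % Error columns can be reproduced with a few lines of Python. This is a sketch under stated assumptions: the nearest-rank quantile definition, the signed error, and the handling of a zero actual value are all choices made here for illustration, not details confirmed by the simulation. The names actual_quantile and percent_error are hypothetical.

```python
import math


def actual_quantile(stream, quantile):
    """Exact quantile of the whole stream (assumed nearest-rank definition)."""
    if not stream:
        raise ValueError("stream is empty")
    ordered = sorted(stream)
    rank = max(1, math.ceil(quantile * len(ordered)))
    return ordered[rank - 1]


def percent_error(estimate, actual):
    """Signed error of the estimate relative to the actual quantile, in percent."""
    if actual == 0:
        return math.nan  # assumption: the ratio is undefined when actual is 0
    return 100.0 * (estimate - actual) / actual
```

Note that computing the exact quantile requires keeping the whole stream, which is precisely the cost a Frugal sketch avoids.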

Footnotes

  • Refer to the corresponding blog post for more information about Frugal Sketching.
  • Frugal sketches are very good at estimating the quantiles of the currently active distribution. However, they're memoryless: they keep only their current estimate and no history of the stream. Select a few different distributions as the stream grows and watch the estimators adjust accordingly.
  • While watching the Frugal estimators quickly approach the quantiles of a new distribution, take note of how fast or slow the actual quantiles are changing. The actual quantiles are based on the entire stream, and may not appear to be affected by a new sampling distribution immediately (or at all).
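
The effect described in these footnotes can be reproduced outside the simulation. The sketch below assumes a Frugal-1U style median update with a unit step, and uses two made-up uniform integer distributions with arbitrary stream lengths; simulate_switch and frugal_median_update are hypothetical names.

```python
import random
import statistics


def frugal_median_update(estimate, value, rng):
    """Move the median estimate one unit toward the value, half of the time."""
    if value > estimate and rng.random() > 0.5:
        return estimate + 1
    if value < estimate and rng.random() > 0.5:
        return estimate - 1
    return estimate


def simulate_switch(seed=0):
    """Feed 20000 values from one distribution, then 4000 from another.

    Returns (frugal_estimate, actual_median_of_entire_stream).
    """
    rng = random.Random(seed)
    estimate = 0  # initial guess
    stream = []
    phases = [(20000, 0, 100), (4000, 900, 1000)]  # (count, low, high)
    for count, low, high in phases:
        for _ in range(count):
            value = rng.randint(low, high)
            stream.append(value)
            estimate = frugal_median_update(estimate, value, rng)
    return estimate, statistics.median(stream)


if __name__ == "__main__":
    frugal, actual = simulate_switch()
    print("frugal estimate:", frugal, "actual median:", actual)
```

After the switch, the Frugal estimate climbs into the range of the new distribution, while the median of the entire stream stays in the old range because most of the stream still comes from the first distribution.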

Have fun and happy sketching!

Ben Linsay