Handling Millions of Data Points in Cross-Platform .NET Applications

October 8, 2024

Uno Platform Team

Ever wondered how to visualize millions of data points without your app breaking a sweat?

In this tutorial, we’ll tackle a common challenge faced by many developers: how to effectively visualize massive datasets across different platforms without sacrificing performance. We’ll explore a solution that allows you to render millions of data points smoothly, whether you’re working on real-time analytics dashboards, scientific data processing tools, or any application that demands high-performance data visualization.

Throughout this guide, you’ll learn techniques for optimizing data rendering, managing memory efficiently, and creating responsive UIs even when dealing with large datasets.

By the time we’re done, you’ll have an app capable of:

Visualizing millions of data points
Connecting to an SQLite database for persistent storage
Clearing the chart and switching between ScottPlot chart types with one click

Before Getting Started

First things first, let’s make sure you’re set up with Uno Platform. If you haven’t already, head over to our Get Started guide to get your environment ready.

New to ScottPlot? No worries! Scott’s got you covered with an Uno Platform Quickstart that’ll get you up to speed in no time.

Project Structure

For this tutorial, we’ll be working with the following packages and versions:

				
    <PackageVersion Include="ScottPlot.WinUI" Version="5.0.39" />
    <PackageVersion Include="MessagePack" Version="2.5.172" />
    <PackageVersion Include="sqlite-net-pcl" Version="1.9.172" />
    <PackageVersion Include="Uno.SQLitePCLRaw.provider.wasm" Version="3.2.0" />
    <PackageVersion Include="SQLitePCLRaw.bundle_green" Version="2.1.10" />

Our project consists of several key files:

MainPage.xaml: Defines the UI
MainPage.xaml.cs: Contains the main logic for the application
DataService.cs: Handles database operations
PlotSettings.cs: Defines the structure for storing plot settings
Series.cs: Defines the data model

Step 1: User Interface (MainPage.xaml)

Let’s start by visualizing our UI.

Visual Structure

Title: Displays the purpose of the chart
Data Visualization: uses ScottPlot:WinUIPlot control
User Interaction Buttons:
Horizontal StackPanel containing three buttons:
- Add random data
- Clear the plot
- Change chart type
Information Display: a TextBlock that shows the current:
- Data Points
- Chart Type

				
┌─────────────────────────────────┐
│             Title               │
├─────────────────────────────────┤
│                                 │
│                                 │
│        Data Visualization       │
│     (ScottPlot:WinUIPlot)       │
│                                 │
│                                 │
├─────────────────────────────────┤
│ [Add Data] [Clear] [Change Type]│
├─────────────────────────────────┤
│      Information Display        │
└─────────────────────────────────┘

				
<Page
     ...
      xmlns:um="using:Uno.Material"
      xmlns:ScottPlot="using:ScottPlot.WinUI"
      Background="{ThemeResource BackgroundBrush}">

  <Grid>
    <Grid.RowDefinitions>
      <!-- Title -->
      <RowDefinition Height="Auto" />
      <!-- Plot -->
      <RowDefinition Height="*" />
      <!-- Buttons -->
      <RowDefinition Height="Auto" />
      <!-- Status Text -->
      <RowDefinition Height="Auto" />
    </Grid.RowDefinitions>

    <!-- Title -->
    <TextBlock Grid.Row="0"
               Text="Plotting 5 Million Points"
               FontSize="24"
               FontWeight="Bold"
               HorizontalAlignment="Center"
               Margin="0,10" />

    <!-- Plot -->
    <ScottPlot:WinUIPlot Grid.Row="1"
                         x:Name="WinUIPlot1" />

    <!-- Buttons -->
    <StackPanel Grid.Row="2"
                Orientation="Horizontal"
                HorizontalAlignment="Center"
                Margin="0,10"
                Spacing="5">

      <Button x:Name="AddRandomDataButton"
              Content="Add Random Data"
              Click="AddRandomDataButton_Click" />

      <Button x:Name="ClearPlotButton"
              Content="Clear Plot"
              Click="ClearPlotButton_Click" />

      <Button x:Name="ChangeChartTypeButton"
              Content="Change Chart Type"
              Click="ChangeChartTypeButton_Click" />
    </StackPanel>

    <!-- Status Text -->
    <TextBlock Grid.Row="3"
               x:Name="StatusTextBlock"
               HorizontalAlignment="Center"
               Margin="0,0,0,10" />
  </Grid>

</Page>

Step 2: Implementing the Main Logic

Let’s start by looking at the namespaces and class variables used in our main file:

				
using Microsoft.UI.Dispatching;
using Windows.Storage;
using SQLite;
using ScottPlot;

namespace ScottPlotDataPersistedSample
{
    public sealed partial class MainPage : Page
    {
        // Data-related fields
        private DataService _dataService;
        private readonly Random _random = new();
        private readonly List<Series> _cachedSeries = new();
        private readonly object _seriesLock = new();

        // Plot-related fields
        private string currentPlotType = "SignalPlot";
        private readonly string[] _plotTypes = { "SignalPlot", "SignalConst", "Heatmap", "ScatterDownsample" };

        // UI-related fields
        private readonly DispatcherQueue _dispatcherQueue;

        // State-tracking fields
        private bool _isInitialized = false;
        private const int _batchSize = 5;
        private int _currentBatchIndex = 0;
        private int _currentChartTypeIndex = 0;

        // ... rest of the class
    }
}

This section sets up the foundation of our application:
- We import necessary namespaces for UI dispatching, storage access, SQLite operations, and ScottPlot functionality.
- The class variables are organized into categories: data-related, plot-related, UI-related, and state-tracking.
- _dataService will handle our database operations.
- _cachedSeries stores our data in memory, protected by _seriesLock for thread safety.
- _plotTypes defines the available chart types, with currentPlotType tracking the current selection.
- _dispatcherQueue ensures UI updates happen on the correct thread.
- State-tracking variables help manage initialization, batch loading, and chart type cycling.
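The click handler that cycles the chart type is not shown in this tutorial, so here is a minimal sketch of how an index like _currentChartTypeIndex could step through _plotTypes. The helper name NextPlotType and the standalone variables are hypothetical; only the array contents come from the class above.

```csharp
using System;

// Same chart type names as the _plotTypes field above.
string[] plotTypes = { "SignalPlot", "SignalConst", "Heatmap", "ScatterDownsample" };
int currentIndex = 0;

// Hypothetical helper: advance to the next chart type, wrapping to the first after the last.
string NextPlotType()
{
    currentIndex = (currentIndex + 1) % plotTypes.Length;
    return plotTypes[currentIndex];
}

for (int n = 0; n < 5; n++)
{
    Console.WriteLine(NextPlotType());
}
// prints SignalConst, Heatmap, ScatterDownsample, SignalPlot, SignalConst (one per line)
```

The modulo keeps the index inside the array, so the fifth click lands back on the second entry rather than running past the end.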

2.1 Initialization and Asynchronous Programming

				
public sealed partial class MainPage : Page
{
    public MainPage()
    {
        this.InitializeComponent();
        this.Loaded += MainPage_Loaded;
        _dispatcherQueue = DispatcherQueue.GetForCurrentThread();
    }

    private async void MainPage_Loaded(object sender, RoutedEventArgs e)
    {
        await UpdateUIStatusAsync("Loading initial data, please wait...");
        try
        {
            await InitializeDatabaseAsync();
            currentPlotType = _dataService.GetLastUsedPlotType() ?? currentPlotType;
            await LoadNextDataBatchAsync();
            _isInitialized = true;
            await InitializePlotAsync();
            UpdateStatusText(_cachedSeries.Sum(s => s.DataPoints.Length));
        }
        catch (Exception ex)
        {
            await UpdateUIStatusAsync($"Initialization failed: {ex.Message}");
        }
    }

    private async Task InitializeDatabaseAsync()
    {
        _dataService = new DataService();
        await _dataService.InitializeAsync();
    }
}

This section demonstrates the initialization process of our application:

The constructor sets up the UI components and attaches the MainPage_Loaded event handler.
MainPage_Loaded is marked as async, allowing us to use await for asynchronous operations. This ensures our UI remains responsive during potentially time-consuming initialization tasks.
We start by updating the UI to inform the user that data is loading.
The initialization process is wrapped in a try-catch block to handle any exceptions that might occur during startup.
We initialize the database, load the last used plot type (if any), load the initial batch of data, and set up the plot.
If any step fails, we catch the exception and display an error message to the user.

This approach allows for a smooth startup process, even when dealing with large datasets or slow storage systems. By using asynchronous methods, we ensure the UI thread isn’t blocked, maintaining a responsive application throughout the initialization phase.

2.2 Lazy Loading and Memory Management

				
private async Task LoadNextDataBatchAsync()
{
    List<Series> nextBatch;
    do
    {
        nextBatch = await _dataService.GetSeriesBatchAsync(_currentBatchIndex, _batchSize);
        lock (_seriesLock)
        {
            _cachedSeries.AddRange(nextBatch);
        }
        _currentBatchIndex += nextBatch.Count;
        await UpdateUIStatusAsync(nextBatch.Count > 0
            ? $"Loaded {_cachedSeries.Count} series so far..."
            : $"All series loaded. Total: {_cachedSeries.Count} series.");
    } while (nextBatch.Count == _batchSize);
}

This method implements lazy loading, a crucial technique for managing large datasets:

We load data in small batches (defined by _batchSize) rather than all at once. This approach helps manage memory usage, especially important for platforms with limited resources like mobile devices or web browsers.
The do-while loop continues fetching batches until we receive fewer items than the batch size, indicating we’ve reached the end of the data.
We use a lock (_seriesLock) when adding new data to _cachedSeries to ensure thread safety. This is important as data loading happens asynchronously and could potentially conflict with other operations accessing the cached data.
After each batch is loaded, we update the UI to show progress. This keeps the user informed and provides a sense of responsiveness, even when loading large amounts of data.

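Loading in batches keeps each database read small and lets the UI report progress between reads. Note that this sample still adds every batch to _cachedSeries, so the whole dataset ends up in memory; for data that does not fit, you would fetch batches on demand rather than in a single loop.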
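To see how the loop's exit condition behaves, here is a minimal sketch that runs the same do-while shape against an in-memory list. The list and the GetBatch and LoadAll helpers are hypothetical stand-ins for the SQLite table and GetSeriesBatchAsync, used only so the example runs without a database.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the database table: 12 rows.
List<int> table = Enumerable.Range(1, 12).ToList();

// Hypothetical stand-in for GetSeriesBatchAsync: up to batchSize rows starting at startIndex.
List<int> GetBatch(int startIndex, int batchSize) =>
    table.Skip(startIndex).Take(batchSize).ToList();

// Same loop shape as LoadNextDataBatchAsync: keep reading until a short batch arrives.
// Returns the number of reads that were issued.
int LoadAll(List<int> cache, int batchSize)
{
    int index = 0;
    int reads = 0;
    List<int> batch;
    do
    {
        batch = GetBatch(index, batchSize);
        cache.AddRange(batch);
        index += batch.Count;
        reads++;
    } while (batch.Count == batchSize);
    return reads;
}

var loaded = new List<int>();
int readCount = LoadAll(loaded, 5);
Console.WriteLine($"Loaded {loaded.Count} rows in {readCount} reads");
// prints "Loaded 12 rows in 3 reads"
```

One detail worth knowing: when the row count is an exact multiple of the batch size, the loop issues one extra read that returns an empty batch before it stops.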

2.3 ScottPlot Integration and Multiple Chart Types

				
private async Task PlotDataAsync()
{
    await Task.Run(() =>
    {
        // Snapshot the cached series so the lock is held only for the copy.
        List<Series> localSeriesList;
        lock (_seriesLock)
        {
            localSeriesList = _cachedSeries.ToList();
        }

        // The plot control must be updated on the UI thread.
        _dispatcherQueue.TryEnqueue(() =>
        {
            WinUIPlot1.Plot.Clear();
            var palette = new ScottPlot.Palettes.Category10();

            for (int i = 0; i < localSeriesList.Count; i++)
            {
                // DataPoints deserializes on every access, so read it once per series.
                double[] ys = localSeriesList[i].DataPoints;
                var color = palette.GetColor(i);

                switch (currentPlotType)
                {
                    case "SignalPlot":
                        var signalPlot = WinUIPlot1.Plot.Add.Signal(ys);
                        signalPlot.Color = color;
                        break;
                    case "SignalConst":
                        var signalConstPlot = WinUIPlot1.Plot.Add.SignalConst(ys);
                        signalConstPlot.LineWidth = 2;
                        signalConstPlot.Color = color;
                        break;
                    case "Heatmap":
                        double[,] heatmapData = GenerateHeatmapData(ys);
                        WinUIPlot1.Plot.Add.Heatmap(heatmapData);
                        break;
                    case "ScatterDownsample":
                        var xs = Enumerable.Range(0, ys.Length).Select(x => (double)x).ToArray();
                        var scatterPlot = WinUIPlot1.Plot.Add.Scatter(xs, ys);
                        scatterPlot.Color = color;
                        break;
                }
            }
            WinUIPlot1.Plot.Axes.AutoScale();
            WinUIPlot1.Refresh();
        });
    });
}

This method is the core of our data visualization process:

We use Task.Run to copy the cached series off the UI thread, so the caller is not blocked while waiting for the lock.
The _seriesLock ensures thread-safe access to our data.
We create a local copy of the series list to avoid holding the lock for too long.
The actual plotting is done within a _dispatcherQueue.TryEnqueue call, ensuring all UI updates happen on the main thread.
We support multiple chart types (SignalPlot, SignalConst, Heatmap, ScatterDownsample), each optimized for different data visualization needs:
- SignalPlot and SignalConst are efficient for large, evenly-spaced datasets.
- Heatmap is used for 2D grid data visualization.
- ScatterDownsample is meant for non-uniform data. Note that the branch shown calls Add.Scatter on every point, so any downsampling has to be applied to the data before it is plotted.
We use a color palette to distinguish between different series in the plot.
After adding all series, we auto-scale the axes and refresh the plot to ensure all data is visible.

This flexible approach allows us to switch between different visualization types easily, catering to various data analysis needs while maintaining performance.
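If you want to reduce the point count before handing data to a scatter plot, one common approach is min/max decimation: split the data into buckets and keep only the lowest and highest point of each, so peaks survive. The sketch below is a generic version of that technique, not something taken from ScottPlot or from the original sample; MinMaxIndices is a hypothetical helper name.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: returns the indices of the minimum and maximum value
// in each bucket, in ascending order.
int[] MinMaxIndices(double[] ys, int buckets)
{
    if (buckets < 1) throw new ArgumentOutOfRangeException(nameof(buckets));
    var kept = new List<int>();
    for (int b = 0; b < buckets; b++)
    {
        // Integer bucket bounds; long arithmetic avoids overflow for large arrays.
        int start = (int)((long)b * ys.Length / buckets);
        int end = (int)((long)(b + 1) * ys.Length / buckets);
        if (start >= end) continue; // empty bucket when buckets > ys.Length

        int minIdx = start, maxIdx = start;
        for (int i = start + 1; i < end; i++)
        {
            if (ys[i] < ys[minIdx]) minIdx = i;
            if (ys[i] > ys[maxIdx]) maxIdx = i;
        }

        // Emit in x order; a flat bucket contributes a single point.
        kept.Add(Math.Min(minIdx, maxIdx));
        if (minIdx != maxIdx) kept.Add(Math.Max(minIdx, maxIdx));
    }
    return kept.ToArray();
}

// Example: 1,000,000 points of a sine wave reduced to at most 2,000 points.
double[] data = new double[1_000_000];
for (int n = 0; n < data.Length; n++) data[n] = Math.Sin(n * 0.001);

int[] idx = MinMaxIndices(data, 1000);
double[] xsSmall = Array.ConvertAll(idx, k => (double)k);
double[] ysSmall = Array.ConvertAll(idx, k => data[k]);
Console.WriteLine($"{data.Length} points reduced to {idx.Length}");
```

The reduced xsSmall and ysSmall arrays could then be passed to Plot.Add.Scatter in place of the full arrays. The bucket count is a trade-off: roughly one bucket per horizontal pixel is a reasonable starting point.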

2.4 User Interactions and Memory Management

				
private async void AddRandomDataButton_Click(object sender, RoutedEventArgs e)
{
    if (!_isInitialized)
    {
        await UpdateUIStatusAsync("Initialization in progress. Please wait.");
        return;
    }

    var newSeries = GenerateRandomWalk(100000, _cachedSeries.Sum(s => s.DataPoints.Length));
    lock (_seriesLock)
    {
        _cachedSeries.Add(newSeries);
    }

    await _dataService.AddSeriesBatchAsync(new List<Series> { newSeries });
    await PlotDataAsync();
    UpdateStatusText(_cachedSeries.Sum(s => s.DataPoints.Length));
    GC.Collect();
}

This method handles the “Add Random Data” button click:

We first check if the application is fully initialized to prevent premature data addition.
We generate a new random walk series with 100,000 points, starting from the current total number of points.
The new series is added to our cached data in a thread-safe manner using a lock.
We then persist this new data to the database asynchronously.
After adding the data, we update the plot and the status text.
Finally, we call GC.Collect() to force a garbage collection. After a large data operation, this reclaims the temporary arrays that are no longer referenced sooner than the runtime otherwise would.

While explicit garbage collection should be used judiciously, in this case, it helps ensure our application doesn’t consume excessive memory, especially on resource-constrained devices.
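GenerateRandomWalk is not listed in this tutorial, so here is a minimal sketch of what a random walk generator can look like. The name RandomWalk, its parameters, and the step range are assumptions for illustration; it returns a plain double[] rather than a Series so that it runs on its own.

```csharp
using System;

// Hypothetical sketch: each value is the previous value plus a random step in [-0.5, 0.5).
double[] RandomWalk(Random rng, int count, double start = 0)
{
    var values = new double[count];
    double current = start;
    for (int i = 0; i < count; i++)
    {
        current += rng.NextDouble() - 0.5;
        values[i] = current;
    }
    return values;
}

var seeded = new Random(42); // fixed seed so repeated runs produce the same walk
double[] walk = RandomWalk(seeded, 100_000);
Console.WriteLine($"Generated {walk.Length} points");
// prints "Generated 100000 points"
```

In the app, the resulting array would be assigned to the DataPoints property of a new Series before it is cached and saved.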

Step 3: Data Persistence (DataService.cs)

				
public class DataService
{
    private SQLiteConnection _db;
    private static StorageFolder _localFolder = Windows.Storage.ApplicationData.Current.LocalFolder;

    public async Task InitializeAsync()
    {
        StorageFolder folder = await _localFolder.CreateFolderAsync("ScottPlotDatabase", CreationCollisionOption.OpenIfExists);
        string dbPath = Path.Combine(folder.Path, "seriesData.db");
        _db = new SQLiteConnection(dbPath);
        _db.CreateTable<Series>();
        _db.CreateTable<PlotSettings>();
    }

    public async Task AddSeriesBatchAsync(List<Series> seriesList)
    {
        await Task.Run(() =>
        {
            _db.RunInTransaction(() =>
            {
                foreach (var series in seriesList)
                {
                    _db.Insert(series);
                }
            });
        });
    }

    public async Task<List<Series>> GetSeriesBatchAsync(int startIndex, int batchSize)
    {
        return await Task.Run(() =>
        {
            return _db.Table<Series>()
                .Skip(startIndex)
                .Take(batchSize)
                .ToList();
        });
    }

    // Other methods omitted for brevity
}

The DataService class manages our SQLite database operations:

InitializeAsync sets up the SQLite database, creating necessary tables if they don’t exist.
AddSeriesBatchAsync efficiently inserts multiple series into the database using a transaction, which improves performance for batch inserts.
GetSeriesBatchAsync retrieves a batch of series from the database, supporting our lazy loading approach.

By using SQLite, we achieve efficient local storage of large datasets. The use of async methods ensures database operations don’t block the UI thread, maintaining application responsiveness even during intensive I/O operations.

Step 4: Settings Persistence (PlotSettings.cs)

Create a new file PlotSettings.cs and add this code:

				
using SQLite;

namespace ScottPlotDataPersistedSample;

public class PlotSettings
{
    [PrimaryKey, AutoIncrement]
    public int Id { get; set; }
    public string? LastUsedPlotType { get; set; }
}

This simple class defines our plot settings structure:

The [PrimaryKey, AutoIncrement] attribute on Id tells SQLite to use this as a unique, auto-incrementing primary key.
LastUsedPlotType stores the user’s last selected plot type, allowing us to restore their preference between sessions.

By persisting these settings, we enhance the user experience, making the application feel more personalized and remembering user choices across different uses of the application.

Step 5: Data Model (Series.cs)

Create a new file Series.cs and add this code:

				
using SQLite;
using MessagePack;

namespace ScottPlotDataPersistedSample;

[MessagePackObject]
public class Series
{
    [PrimaryKey, AutoIncrement]
    public int Id { get; set; }

    public string DataPointsSerialized { get; set; }

    public double Origin { get; set; }

    [Ignore]
    [IgnoreMember]
    public double[] DataPoints
    {
        get => MessagePackSerializer.Deserialize<double[]>(Convert.FromBase64String(DataPointsSerialized));
        set => DataPointsSerialized = Convert.ToBase64String(MessagePackSerializer.Serialize(value));
    }
}

The Series class is a crucial component of our application, serving as the data model for our time series data. Let’s break down its key features:

MessagePack Serialization: The class is decorated with the [MessagePackObject] attribute, indicating that we’re using MessagePack for efficient serialization. MessagePack is a binary serialization format that’s faster and more compact than JSON, which is particularly beneficial when dealing with large datasets.
SQLite Integration: The Id property is marked with [PrimaryKey, AutoIncrement] attributes, allowing SQLite to manage it as a unique identifier for each series.
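Data Storage: Instead of storing the raw array of data points, we store a serialized string representation (DataPointsSerialized), which keeps each series in a single column of a single row. Note that Base64 encoding adds about one third to the size of the binary payload, so a byte[] column would be more compact if storage size matters.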
Origin: The Origin property stores the starting point of the series, which can be useful for certain types of data analysis or visualization.
DataPoints Property: This property is the interface for accessing the actual data points. It’s marked with [Ignore] for SQLite (so it’s not stored directly in the database) and [IgnoreMember] for MessagePack (so it’s not included in the serialization).
- The getter deserializes the stored string back into a double[] array.
- The setter serializes the double[] array into a Base64 string for storage.

This design offers several advantages:

Efficient Storage: By serializing the data points, we can store large arrays more efficiently in the database.
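Lazy Deserialization: The actual deserialization only happens when the DataPoints property is accessed, saving memory and processing time if the raw data isn’t needed. As written, every access deserializes again, so code that reads DataPoints repeatedly should keep the returned array in a local variable.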
Cross-Platform Compatibility: This approach ensures our data can be easily stored, retrieved, and transmitted across different platforms and devices.

By using MessagePack and this serialization strategy, we’ve optimized our data model for both performance and storage efficiency, which is crucial when working with large datasets in a cross-platform environment.
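To make the round trip through Base64 concrete, here is a minimal sketch of the same getter and setter pattern. To keep it dependency-free, it swaps MessagePack for a plain byte copy of the doubles; ToBytes and FromBytes are hypothetical stand-ins for MessagePackSerializer.Serialize and Deserialize, so the byte counts will differ slightly from what MessagePack produces.

```csharp
using System;

// Stand-in for MessagePackSerializer.Serialize: copies the raw 8-byte doubles.
byte[] ToBytes(double[] values)
{
    var bytes = new byte[values.Length * sizeof(double)];
    Buffer.BlockCopy(values, 0, bytes, 0, bytes.Length);
    return bytes;
}

// Stand-in for MessagePackSerializer.Deserialize<double[]>.
double[] FromBytes(byte[] bytes)
{
    var values = new double[bytes.Length / sizeof(double)];
    Buffer.BlockCopy(bytes, 0, values, 0, values.Length * sizeof(double));
    return values;
}

double[] original = { 1.5, -2.25, 3.0, 1e-9 };

// Setter direction: array -> bytes -> Base64 string (what the database column holds).
string stored = Convert.ToBase64String(ToBytes(original));

// Getter direction: Base64 string -> bytes -> array.
double[] restored = FromBytes(Convert.FromBase64String(stored));

Console.WriteLine($"binary: {original.Length * sizeof(double)} bytes, Base64: {stored.Length} chars");
// prints "binary: 32 bytes, Base64: 44 chars"
```

The 32-to-44 growth shows the Base64 overhead on a tiny array; the same ratio of roughly four characters per three bytes applies to series with hundreds of thousands of points.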

Time to Test It Out!

We’ve covered a lot of ground, so let’s take a moment to recap. We’ve built a data visualization app that handles large datasets across multiple platforms. Here are the key takeaways:

Improved application responsiveness using asynchronous programming, ensuring the UI remains interactive even during intensive operations.
Used lazy loading to manage large datasets efficiently, allowing our application to handle substantial amounts of data without overwhelming system resources.
Integrated SQLite for efficient, cross-platform data persistence, providing a reliable storage solution regardless of the device.
Leveraged ScottPlot for versatile data visualization, incorporating support for multiple chart types to adapt to various data presentation needs.
Applied memory management techniques, including strategic use of garbage collection, to optimize performance, particularly for resource-constrained environments.
Ensured thread safety when dealing with shared resources in an asynchronous context, preventing data conflicts during multitasking operations.
Added user preference persistence, enhancing the overall user experience by remembering their choices across sessions.

With these techniques, you’ll be able to build powerful, cross-platform apps that not only handle large datasets with ease but also deliver smooth performance and rich data visualization, all while keeping your users’ experience front and center.