Building Smarter .NET Applications with ONNX ML Models

One of the most recent and exciting areas in mobile application development is the integration of machine learning models to add intelligent capabilities. With the ability to process large amounts of data, identify patterns, and make predictions, ML models can enhance the functionality and usability of mobile applications in various ways, such as image recognition, natural language processing, and personalized recommendations. Furthermore, with frameworks such as ONNX, it has become easier to integrate machine learning models into mobile .NET applications, making it a viable option for developers who want to create smart and personalized user experiences.

This four-part series outlines how to consume your pre-trained models (trained using TensorFlow and PyTorch) in a .NET application built with Uno Platform. Each article explores a different machine-learning task.

This series’ primary objective is to illustrate, step by step, how to consume an ML model in an Uno Platform application. It does not aim to teach you how to train ML models or how to resolve the common issues encountered while training and improving them, such as over-generalization.

Objectives of this Article

The first article of this series aims to show readers how to convert pre-trained models to ONNX format, which will subsequently be consumed in an Uno Platform application. The article and accompanying code base will explain how to train, evaluate, test, and convert your models to the ONNX format. Itemized below are the models we will go over: 

  1. MNIST CLASSIFIER: Trained with TensorFlow

  2. BERT QnA: Pre-Trained with PyTorch 

By the end of this article, you should have a basic understanding of how to train, test, evaluate and convert your ML models to the ONNX format in preparation for consumption in the next set of articles in this series. 

What is ONNX?

ONNX (Open Neural Network Exchange) is an open standard for representing and exchanging machine learning models. It provides a unified format for storing and sharing models among different ML frameworks, including but not limited to PyTorch, TensorFlow, Caffe2 and scikit-learn. With ONNX, developers can use a host of ML frameworks to train and deploy their models. ONNX also allows developers to optimize and accelerate their models using hardware accelerators, e.g. GPUs. ONNX is supported by several major technology companies and has an ever-growing community of users and contributors.



MNIST (Modified National Institute of Standards and Technology) is a dataset of images of handwritten digits that is widely used in machine learning and computer vision research. It consists of a training set of 60,000 images and a test set of 10,000 images, each of which is a 28×28 pixel grayscale image of a single handwritten digit (0-9). The MNIST dataset is considered a benchmark in machine learning and has been widely used to evaluate and compare the performance of different machine learning algorithms. The dataset can be downloaded for free from several online sources. It was developed by Yann LeCun (Courant Institute of Mathematical Sciences, New York University), Corinna Cortes and Christopher J. C. Burges, and was first published in 1998.

For this article’s source code, I used the Python programming language and Google Colab to run my code in interactive Python notebooks (.ipynb). Links to all resources can be found at the end of this article.


Install and Import all prerequisite packages as shown below:

Once this was done, I unpacked the dataset and split it into training images, training labels, test images and test labels. I then flattened the training and test images by performing a reshape operation on them.
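The flattening step can be sketched with NumPy alone. The array below is random data standing in for the real MNIST images, and the division by 255 is an assumed (though common) normalization that the text does not mention.

```python
import numpy as np

# Random stand-in for a batch of MNIST images: 5 grayscale images of 28x28 pixels.
# (Assumption: the real arrays share this layout, with uint8 values in 0..255.)
rng = np.random.default_rng(seed=0)
train_images = rng.integers(0, 256, size=(5, 28, 28), dtype=np.uint8)

# Flatten each 28x28 image into a single row of 784 values.
flat_images = train_images.reshape(-1, 28 * 28)

# Assumed extra step: scale pixel values to the range [0, 1].
flat_images = flat_images.astype(np.float32) / 255.0

print(flat_images.shape)  # (5, 784)
```

Whatever scaling is applied here has to be repeated on every input later on, including in the C# application that consumes the ONNX model.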


I proceeded to define a simple sequential model as shown below: 

This model accepts an input of dimension (784,), where 28 × 28 = 784 is the number of pixels in each flattened image (height times width). The output is of dimension (1, 10), representing a score for each of the 10 possible digits, 0-9.
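To make these shapes concrete, here is a small NumPy sketch of a single dense layer followed by a softmax. It is not the article's Keras model: the weights are random and untrained, and the one-layer architecture is an assumption chosen only to show how a (1, 784) input becomes a (1, 10) output.

```python
import numpy as np

INPUT_DIM = 28 * 28   # 784 flattened pixels
NUM_CLASSES = 10      # digits 0-9

rng = np.random.default_rng(seed=0)
# Hypothetical, untrained parameters for one dense layer.
weights = rng.normal(scale=0.01, size=(INPUT_DIM, NUM_CLASSES))
bias = np.zeros(NUM_CLASSES)

def predict(flat_image):
    """Map a (1, 784) input to a (1, 10) row of class probabilities."""
    logits = flat_image @ weights + bias
    # Numerically stable softmax over the class axis.
    exp = np.exp(logits - logits.max(axis=1, keepdims=True))
    return exp / exp.sum(axis=1, keepdims=True)

x = rng.random((1, INPUT_DIM))   # one fake flattened image
probs = predict(x)
print(probs.shape)               # (1, 10)
print(int(probs.argmax(axis=1)[0]))  # index of the highest-scoring digit
```

The predicted digit is simply the index of the largest value in the output row.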


I commenced the training of the model by executing this snippet of code below:

This took a little time, enough for me to grab a drink and some cashew nuts to snack on 🙃. 

While training your model, you can create checkpoints. The concept of checkpoints in ML Training is similar to saved checkpoints in your favourite video game; they serve as a tool with which ML Engineers & Developers can save their model undergoing training at a specific point in time to be resumed later or evaluated against other checkpoints. 


Once training was completed, I evaluated my newly trained model using the test images as input and the test labels as expected output to compare with my model’s predicted output. 



Post evaluation, I proceeded to save my model in a TensorFlow format using this code snippet below:

This saves the model in a TensorFlow format which will be converted to the ONNX format using the code snippet below: 
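One way to perform this conversion is the tf2onnx command-line converter, invoked here from Python. This is a sketch, not necessarily the exact call used in the notebook: the directory name `mnist_saved_model`, the output file `mnist.onnx` and the opset number are placeholders.

```python
import subprocess
import sys

def build_convert_command(saved_model_dir, onnx_path, opset=13):
    """Build the tf2onnx command line for converting a TensorFlow SavedModel."""
    return [
        sys.executable, "-m", "tf2onnx.convert",
        "--saved-model", saved_model_dir,
        "--output", onnx_path,
        "--opset", str(opset),
    ]

def convert(saved_model_dir, onnx_path, opset=13):
    """Run the conversion; requires tensorflow and tf2onnx to be installed."""
    subprocess.run(build_convert_command(saved_model_dir, onnx_path, opset), check=True)

# Placeholder paths; print the command instead of running it.
command = build_convert_command("mnist_saved_model", "mnist.onnx")
print(" ".join(command[1:]))
```

Calling `convert("mnist_saved_model", "mnist.onnx")` would then write the ONNX file, assuming the SavedModel directory exists.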


Now that I have my ONNX model, I can check that the converted model works. I achieved this by first installing and importing the “tf2onnx” and “onnxruntime” packages, then executing the code snippets below to load the ONNX model and print a summary of its parameters.

You can see that the summary is similar to the one printed for the original model. I then evaluated my ONNX model using the same test images as input and the same test labels as expected output. The code snippet below shows the code executed and its output, which shows that the ONNX model’s accuracy is similar to that of the original saved model.
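An evaluation along these lines could look like the sketch below. `run_onnx_model` uses the ONNX Runtime Python API and assumes the model has a single input and that its first output holds the class scores; `accuracy` is a hypothetical helper, demonstrated here on a tiny hand-written array rather than real model output.

```python
import numpy as np

def run_onnx_model(onnx_path, flat_images):
    """Run a batch through an ONNX model and return its first output."""
    import onnxruntime as ort  # imported lazily so `accuracy` works without it
    session = ort.InferenceSession(onnx_path, providers=["CPUExecutionProvider"])
    input_name = session.get_inputs()[0].name  # assumes a single model input
    outputs = session.run(None, {input_name: flat_images.astype(np.float32)})
    return outputs[0]

def accuracy(scores, labels):
    """Fraction of rows whose highest-scoring class equals the label."""
    predicted = np.argmax(scores, axis=1)
    return float(np.mean(predicted == labels))

# Synthetic scores for 4 samples and 3 classes, for illustration only.
scores = np.array([[0.1, 0.9, 0.0],
                   [0.8, 0.1, 0.1],
                   [0.2, 0.3, 0.5],
                   [0.6, 0.3, 0.1]])
labels = np.array([1, 0, 2, 1])
print(accuracy(scores, labels))  # 0.75
```

With a real model, `accuracy(run_onnx_model("mnist.onnx", test_images), test_labels)` would give a figure to compare against the original model's, where the file name is again a placeholder.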

I then tested my ONNX model with an image of my own handwriting, using a collection of Python packages to load the image, pre-process the input and post-process the output (this will be explained in the subsequent articles from a C# perspective).
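The pre- and post-processing involved can be sketched as follows. The helper names are hypothetical, Pillow is assumed for image loading, and the inversion and scaling steps assume the model was trained on MNIST-style input (light digit on a dark background, values in [0, 1]).

```python
import numpy as np

def load_grayscale(path):
    """Load an image file as a 28x28 grayscale array (requires Pillow)."""
    from PIL import Image  # imported lazily; only needed for real image files
    return np.asarray(Image.open(path).convert("L").resize((28, 28)))

def preprocess(gray_28x28, invert=True):
    """Turn a 28x28 uint8 image into the (1, 784) float32 input the model expects."""
    pixels = gray_28x28.astype(np.float32)
    if invert:
        # Handwriting is usually dark ink on light paper; MNIST is the opposite.
        pixels = 255.0 - pixels
    pixels /= 255.0
    return pixels.reshape(1, 28 * 28)

def postprocess(scores):
    """Return the digit with the highest score."""
    return int(np.argmax(scores, axis=1)[0])

# A blank white page stands in for a real photo here.
blank_page = np.full((28, 28), 255, dtype=np.uint8)
model_input = preprocess(blank_page)
print(model_input.shape)  # (1, 784)
```

The same three steps (load, pre-process, post-process) are what the C# code in the later articles has to reproduce.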



BERT (Bidirectional Encoder Representations from Transformers) is a deep learning model developed by Google that has been trained on a large dataset of text and can be used for a variety of natural language processing tasks. It is designed to understand the context and relationships between words in a sentence, rather than just processing the individual words in isolation. This allows BERT to perform well on tasks such as language translation, language modelling, and text classification. BERT has achieved state-of-the-art performance on a number of benchmarks and is widely used in natural language processing research and in industry applications. It is implemented in Python and is available as an open-source library.  

BERT is designed to pre-train deep bidirectional representations from unlabelled text by jointly conditioning on both left and right context in all layers. This allows it to capture the context of a word based on the words that come before and after it, rather than just the words that come before, as is the case with many traditional language models. BERT is trained on a large dataset of unlabelled text and is then fine-tuned on a specific task, such as sentiment analysis or question answering.


Install and import the required packages, then retrieve the pretrained BERT QnA model. I achieved this using these code snippets.

I use a pretrained model here, and retrieving it could take a while as it is quite a large download. Once that was sorted, I generated dummy inputs as shown below in the code snippet:
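Dummy inputs only need the right shape and type, because the exporter traces the model with them. The sketch below is an assumption about how this could be done, not the notebook's code: the input and output names follow the usual BERT question-answering convention, the sequence length is arbitrary, and `export_to_onnx` expects a PyTorch model that returns a tuple of start and end logits.

```python
import numpy as np

INPUT_NAMES = ["input_ids", "attention_mask", "token_type_ids"]
OUTPUT_NAMES = ["start_logits", "end_logits"]

def make_dummy_inputs(batch_size=1, seq_len=256):
    """Create placeholder integer inputs of shape (batch_size, seq_len)."""
    shape = (batch_size, seq_len)
    return {name: np.ones(shape, dtype=np.int64) for name in INPUT_NAMES}

def export_to_onnx(model, onnx_path, seq_len=256):
    """Export a PyTorch QnA model to ONNX (requires torch)."""
    import torch  # imported lazily so the dummy-input helper works without it
    dummy = make_dummy_inputs(seq_len=seq_len)
    args = tuple(torch.from_numpy(dummy[name]) for name in INPUT_NAMES)
    # Let batch size and sequence length vary at inference time.
    dynamic_axes = {name: {0: "batch", 1: "sequence"}
                    for name in INPUT_NAMES + OUTPUT_NAMES}
    model.eval()
    torch.onnx.export(model, args, onnx_path,
                      input_names=INPUT_NAMES,
                      output_names=OUTPUT_NAMES,
                      dynamic_axes=dynamic_axes)

dummy = make_dummy_inputs(seq_len=8)
print({name: array.shape for name, array in dummy.items()})
```

The names given at export time are the ones a consumer of the .onnx file has to use when feeding inputs to the model.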

NB: The generated .onnx model, around 1.5GB, will need to be added to the sample Uno Platform codebase as an embedded resource for the BERT NLP Sample to work in the application. The 4th article in this series will walk you through how to add the .onnx file as an embedded resource. 


Since I am using a pre-trained model here, I went straight into defining pre-processing, post-processing and conversion helper functions, amongst others. These help me transform my input into an appropriate format for my model, transform my model’s output into a human-readable format, convert the model to the ONNX format, and evaluate an input on both the PyTorch model and the ONNX model.
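As an illustration of the post-processing step, a question-answering model produces one start score and one end score per token, and the answer is the token span between the best start and the best end. The function below is a simplified sketch with hand-written tokens and scores; it ignores details such as merging WordPiece sub-tokens, and `max_answer_len` is an assumed parameter.

```python
import numpy as np

def extract_answer(tokens, start_logits, end_logits, max_answer_len=30):
    """Pick the highest-scoring start, then the best end at or after it."""
    start = int(np.argmax(start_logits))
    window_end = min(start + max_answer_len, len(tokens))
    end = start + int(np.argmax(end_logits[start:window_end]))
    return " ".join(tokens[start:end + 1])

# Question and context as a single token sequence (illustrative only).
tokens = ["[CLS]", "what", "runs", "the", "model", "[SEP]",
          "onnx", "runtime", "runs", "the", "model", "[SEP]"]
start_logits = np.zeros(len(tokens))
end_logits = np.zeros(len(tokens))
start_logits[6] = 5.0   # "onnx" is the most likely start
end_logits[7] = 6.0     # "runtime" is the most likely end

print(extract_answer(tokens, start_logits, end_logits))  # onnx runtime
```

Restricting the end search to positions at or after the start avoids producing an empty or reversed span.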


I finally proceeded to test both the PyTorch model and ONNX model on the same pre-defined input, as shown in the code snippets below:
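Beyond comparing the final answers by eye, the raw outputs of the two models can be compared numerically. The helper below is a hypothetical sketch: the tolerances are assumptions (small floating-point differences between runtimes are expected), and the arrays are made-up values rather than real model output.

```python
import numpy as np

def outputs_match(reference, candidate, rtol=1e-3, atol=1e-4):
    """Check that two output arrays have the same shape and nearly equal values."""
    reference = np.asarray(reference)
    candidate = np.asarray(candidate)
    if reference.shape != candidate.shape:
        return False
    return bool(np.allclose(reference, candidate, rtol=rtol, atol=atol))

# Made-up logits standing in for the PyTorch and ONNX outputs.
pytorch_logits = np.array([[0.12, 3.40, -1.05]])
onnx_logits = pytorch_logits + 1e-5   # tiny numerical drift

print(outputs_match(pytorch_logits, onnx_logits))        # True
print(outputs_match(pytorch_logits, pytorch_logits + 1)) # False
```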

Next Steps

With this article, I set the stage by showing you how to build a fresh model from scratch and convert it to the ONNX format, and how to take a pre-trained model and convert it to an ONNX model. In the following article, I will review the fundamentals of ONNX Runtime from a C# mobile developer’s point of view. Specifically, I will illustrate how to load a model, pre-process the input to be fed to the model, and post-process the output into a human-readable format.

Resources Used





Google CoLab 

Welcome to Collaboratory – Collaboratory ( 


Python Interactive Notebooks (.ipynb) 

The Jupyter Notebook — IPython 



Welcome to 








ONNX Runtime 

ONNX Runtime | Home 

Resources Generated





Source Code 



ONNX Models 


To upgrade to the latest release of Uno Platform, please update your packages to 4.6 via your Visual Studio NuGet package manager! If you are new to Uno Platform, following our official getting started guide is the best way to get started. (5 min to complete)

Authored by Paula Aliu

