We’ve all heard the dire warnings about artificial intelligence deployments and their often dismal ROI. 2023 is the year companies look to overcome lackluster results in AI and demonstrate the value of their investments. Edge deployment is one critical piece of this mission, but many companies need help to create a deployment-ready workflow. At NVIDIA’s latest GTC conference, Brandon Johnson, Senior Solutions Architect, Edge AI (NVIDIA); and Yuval Zukerman, Director, Technical Alliances (Domino Data Lab), demonstrated one such workflow.
Why edge computing is such a significant component of these workflows
Edge computing is critical for deploying artificial intelligence (AI) deep learning models because it addresses some of the challenges of running these models in a centralized cloud environment. It addresses:
- The latency caused by massive data volumes required to power AI models: Edge computing processes data locally on the device or in a nearby server rather than sending it to the cloud. This can improve the speed and responsiveness of the application and help reduce network congestion.
- Bandwidth hogging and waste: By the same token, edge computing also reduces strain on bandwidth, helping increase the efficiency of AI tools and apps.
- Security concerns specific to AI: Deep learning models are data-hungry, and some of that data is bound to be sensitive. Processing it at the edge reduces exposure because less of it has to travel to a central location.
- Reliability of various AI models and tools: That same data hunger also requires a constant flow of input. Because edge computing processes data locally on the device or on a nearby server, the application can keep running even if the cloud connection is lost or the network is disrupted.
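The reliability point can be illustrated with a store-and-forward pattern: inference runs on the device, and results are queued until the uplink is reachable again. This is a minimal sketch, not something shown in the workshop. `EdgeNode`, `model`, and `uplink` are hypothetical names, and the uplink is assumed to raise `ConnectionError` when the cloud is unreachable.

```python
from collections import deque


class EdgeNode:
    """Runs inference locally and buffers results while the uplink is down."""

    def __init__(self, model, uplink, max_buffered=1000):
        self.model = model    # hypothetical callable: sample -> prediction, runs on the device
        self.uplink = uplink  # hypothetical callable: sends one prediction, raises ConnectionError offline
        # Bounded buffer: once full, the oldest unsent results are dropped.
        self.pending = deque(maxlen=max_buffered)

    def handle(self, sample):
        """Predict locally, then try to forward everything that is waiting."""
        prediction = self.model(sample)  # never depends on the network
        self.pending.append(prediction)
        self.flush()
        return prediction

    def flush(self):
        """Send buffered results in order; stop at the first failure."""
        while self.pending:
            try:
                self.uplink(self.pending[0])
            except ConnectionError:
                return False  # still offline, keep the backlog for later
            self.pending.popleft()
        return True
```

The caller always gets a prediction back from `handle`, whether or not the network is up. Only the reporting to the cloud is delayed.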
Why do companies struggle to build a successful edge-to-cloud workflow?
There are many reasons companies struggle to build an edge-to-cloud workflow for their AI deployments. Some of these obstacles include:
- A lack of expertise: Competition for technical talent is still fierce, and not all companies can afford to hire new specialists or even attract them. Many are relying on reskilling and upskilling to fill the gap, but this takes time.
- Existing complexity and integration issues: Companies may struggle to understand how to connect edge devices to the cloud, how to manage and analyze data at scale, and how to build and deploy AI models effectively, especially if legacy systems are still an active part of the ecosystem.
- Concerns about cost: AI can both save and create costs, depending on the situation. Companies may struggle to justify the expense of building new stacks and upgrading hardware and software, particularly if they are unsure of the technology’s long-term benefits.
- Security concerns: Maintaining required security protocols across a cloud ecosystem can be challenging, leading to alert fatigue and accidental loopholes with updates. Companies may struggle to ensure that data is protected at the edge and in the cloud.
Building a successful edge-to-cloud workflow for AI requires careful planning, investment, and expertise. Companies that can overcome the challenges associated with this technology can unlock significant benefits, including improved efficiency, better insights, and greater agility in responding to changing market conditions.
How MLOps platform Domino approaches this challenge
Zukerman uses the fictional company GlobalCo Chemicals to demonstrate how optimized tools can help companies overcome these barriers in Domino Data Lab’s workshop, “Deploy a Deep Learning Model from the Cloud to the Edge.” In this example, the company is investing in robotics to help ease risks associated with working on the factory floor.
GlobalCo has decided to purchase robots that let workers perform tasks with a combination of voice commands and manual intervention. The models behind the voice commands need to be trained. To make this seamless, speech recognition needs to respond within milliseconds, and the robots must be deployed in a way that tolerates the inconsistent connectivity typical of Wi-Fi networks.
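A millisecond requirement like this is easier to reason about once it is measured. The sketch below times repeated calls and compares the 95th percentile against a budget. It is an illustration under assumptions: `infer` is a hypothetical stand-in for the deployed speech model, and the budget is whatever figure the team chooses, since the workshop does not state one.

```python
import math
import time


def latency_report(infer, samples, budget_ms):
    """Time each call to `infer` and compare the slow tail against a budget."""
    samples = list(samples)
    if not samples:
        raise ValueError("need at least one sample to time")

    timings_ms = []
    for sample in samples:
        start = time.perf_counter()
        infer(sample)
        timings_ms.append((time.perf_counter() - start) * 1000.0)

    timings_ms.sort()
    # Nearest-rank 95th percentile: the smallest timing that at least
    # 95% of all measurements are less than or equal to.
    rank = max(1, math.ceil(0.95 * len(timings_ms)))
    p95 = timings_ms[rank - 1]
    return {
        "p95_ms": p95,
        "max_ms": timings_ms[-1],
        "within_budget": p95 <= budget_ms,
    }
```

Reporting a high percentile instead of the mean is a deliberate choice here. A voice command that is usually fast but occasionally stalls would still feel unreliable on a factory floor.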
The fictional data science team on this project will build the models that interpret the robots’ voice commands. Engineers will use MATLAB to develop models and leverage the Domino platform for collaboration. In this scenario, the chosen model is a network pretrained for audio.
The development workflow for this project follows the MLOps model lifecycle.
- Data exploration: The first step in developing an AI model is exploring the data. In this case, the team uses MATLAB to explore and reshape data and convert audio files into mel spectrograms. This step involves understanding the characteristics of the data, identifying any patterns or trends, and cleaning and preparing the data for use in the model.
- Model development and training: Once the data has been explored and prepared, the team can begin developing and training the model. In this case, the team uses a pretrained YAMNet neural network trained on audio signals. The team uses GPU infrastructure to train the model and adjusts the structure of network layers to improve model performance. MATLAB automates pre-processing steps, training the network, and assessing the model’s effectiveness.
- Model API and container packaging: Once the model has been trained and assessed, the team can package it as a model API and container. To do this, MATLAB’s Compiler SDK generates a Python package wrapping the model, and the package is published as a Domino model API. The team creates a Python driver function to act as the Model API entry point and saves the file to the Domino file system. Other applications can then call the model API, which makes it easier to integrate the model into different workflows.
- Container registry: The container running the model is published to NVIDIA’s Fleet Command container registry. This step involves adding the container to the registry, which acts as a central repository for all containers used in the project. Using a container registry, the team can easily manage and deploy containers and ensure that all containers are up-to-date and consistent across the project.
- Model deployment to the edge: The final step is to deploy the model to the edge locations connected to the platform using Fleet Command’s container registry. The team selects the location and the model from the drop-down list and deploys it. Fleet Command takes care of the rest, ensuring that the model is deployed to the correct location and can be used by other applications. This step is critical for ensuring that the model is available where needed and can be used to improve business processes and decision-making.
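The Python driver function from the packaging step might look roughly like the sketch below. This is not the workshop’s code, and everything in it is an assumption made for illustration. `predict`, `COMMANDS`, and `_stand_in_model` are hypothetical names, and the stand-in takes the place of the package generated by MATLAB’s Compiler SDK so the example runs without MATLAB Runtime.

```python
# Hypothetical label set; the workshop's actual command vocabulary is not given.
COMMANDS = ("stop", "go", "left", "right")


def _stand_in_model(spectrogram):
    """Placeholder for the compiled MATLAB model (an assumption for this sketch).

    Produces one score per command from the mean of the input, so the
    example is deterministic and needs no external runtime.
    """
    flat = [value for row in spectrogram for value in row]
    mean = sum(flat) / len(flat)
    target = int(mean) % len(COMMANDS)
    return [1.0 if i == target else 0.0 for i in range(len(COMMANDS))]


def predict(spectrogram, model=_stand_in_model):
    """Entry point that a model API could call with the parsed request payload."""
    # Validate early so malformed requests fail with a clear message.
    if (
        not spectrogram
        or not spectrogram[0]
        or any(len(row) != len(spectrogram[0]) for row in spectrogram)
    ):
        raise ValueError("spectrogram must be a non-empty rectangular 2-D list")

    scores = model(spectrogram)
    best = max(range(len(scores)), key=scores.__getitem__)
    # Return only plain types so the result can be serialized as JSON.
    return {"command": COMMANDS[best], "score": float(scores[best])}
```

In a real project the `model` argument would wrap a call into the generated package. Keeping it injectable also lets the entry point be tested without the compiled model present.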
AI at the edge is not only possible but mission-critical
The fictional team leveraged a unified platform to collaborate across tools and operationalize AI. They abstracted away the complexity that often stands in the way of companies building a cloud-to-edge workflow and deployed AI more easily to edge locations, taking full advantage of the value AI can offer.
See the presentation and watch as they demonstrate the workflow with NVIDIA’s on-demand workshops.
Elizabeth Wallace is a Nashville-based freelance writer with a soft spot for data science and AI and a background in linguistics. She spent 13 years teaching language in higher ed and now helps startups and other organizations explain – clearly – what it is they do.