How to Install & Run SAP-RPT-1-OSS Locally?

by Ayush Kumar | November 12, 2025


sap-rpt-1-oss is SAP’s table-native, semantics-aware in-context learner for classification and regression. It embeds column names and cell values directly, so no manual preprocessing is needed. It also handles missing data, and its quality improves as context size and bagging increase. The highest-accuracy settings need a GPU with a lot of VRAM; on smaller GPUs, or when speed matters more, reduce the context size and bagging.

Description

Implementation of the deep learning model with the inference pipeline described in the paper “ConTextTab: A Semantics-Aware Tabular In-Context Learner”.

Column Name	Data Type / Embedding	Example Value	Notes
Acquisition date	Date embedding	16/08/2024	Encoded using date embeddings (dark purple); contributes to temporal context understanding.
Price ($)	Numerical embedding	1792.00	Encoded using numerical embeddings (blue); numeric scale handled automatically.
Description	Text embedding	Laptop	Encoded using text embeddings (light lavender); adds semantic meaning to categorical text.
Received	Boolean / Text embedding	TRUE	Treated as short text or categorical; encoded with text embeddings.
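As a rough illustration of the per-cell typing in the table above, the sketch below sorts raw string values into the three embedding kinds. It is not the model's actual preprocessing, which this article does not describe. The helper name `guess_embedding_kind` and the day/month/year date format are assumptions made for the example.

```python
from datetime import datetime


def guess_embedding_kind(value: str) -> str:
    """Return 'date', 'numerical', or 'text' for a raw cell value.

    Hypothetical helper for illustration only; the real model makes
    this decision internally.
    """
    text = value.strip()
    # Dates: assume the day/month/year layout used in the example table.
    try:
        datetime.strptime(text, "%d/%m/%Y")
        return "date"
    except ValueError:
        pass
    # Numbers: anything float() can parse.
    try:
        float(text)
        return "numerical"
    except ValueError:
        pass
    # Everything else, including booleans such as TRUE, is treated as text.
    return "text"


row = {
    "Acquisition date": "16/08/2024",
    "Price ($)": "1792.00",
    "Description": "Laptop",
    "Received": "TRUE",
}
for column, cell in row.items():
    print(f"{column}: {guess_embedding_kind(cell)}")
```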

GPU Configuration Table

Tier / Use case	Max context size	Bagging	Precision	Min VRAM (approx.)	Suggested NVIDIA GPUs	Notes / Tips
Best quality (large tables, benchmarks)	8192	8	FP16/BF16	≥80 GB	H100 80G, A100 80G	Official “best” setting; highest accuracy/robustness on wide or long tables.
High quality (most projects, wide tables)	4096	4	FP16/BF16	~48 GB	RTX A6000 48G, L40S 48G	Strong results with ~40–60% less memory than 8k/8.
Balanced (prod inference on single GPU)	4096	2	FP16/BF16	~32–40 GB	A6000 48G (headroom), A5000 24G*	Good balance of speed/quality. If OOM on 24G, drop bagging to 1.
Fast single-GPU (cost-efficient)	2048	1–2	FP16/BF16	~24 GB	L4 24G, RTX 4090 24G, A10 24G	Recommended starting point for 24G cards. Increase to bagging=2 if VRAM allows.
Lightweight trials / small tables	1024–1536	1	FP16/BF16	12–16 GB	RTX 3060 12G, T4 16G	For quick demos. If unstable, reduce batch size or rows per call.
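To turn the table into a starting point for your own card, the sketch below maps available VRAM to a context size and bagging value. The thresholds are copied from the table; where the table gives a range, the lower bound is used, which is our assumption. `suggest_settings` is a hypothetical helper, not part of the SAP package.

```python
# (min VRAM in GB, max_context_size, bagging), copied from the table above.
# Where the table gives a range, the lower bound is used (an assumption).
TIERS = [
    (80, 8192, 8),
    (48, 4096, 4),
    (32, 4096, 2),
    (24, 2048, 1),
    (12, 1024, 1),
]


def suggest_settings(vram_gb: float) -> tuple[int, int]:
    """Return (max_context_size, bagging) for the largest tier that fits."""
    for min_vram, context_size, bagging in TIERS:
        if vram_gb >= min_vram:
            return context_size, bagging
    raise ValueError(f"{vram_gb} GB is below the smallest tier in the table")


print(suggest_settings(24))  # (2048, 1)
print(suggest_settings(80))  # (8192, 8)
```

Treat the result as a first guess and adjust after watching real memory usage, since the VRAM figures in the table are approximate.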

Resources

Link: https://huggingface.co/SAP/sap-rpt-1-oss

Step-by-Step Process to Install & Run SAP-RPT-1-OSS Locally

For the purpose of this tutorial, we will use a GPU-powered Virtual Machine offered by NodeShift; however, you can replicate the same steps with any other cloud provider of your choice. NodeShift provides the most affordable Virtual Machines at a scale that meets GDPR, SOC2, and ISO27001 requirements.

Step 1: Sign Up and Set Up a NodeShift Cloud Account

Visit the NodeShift Platform and create an account. Once you’ve signed up, log into your account.

Follow the account setup process and provide the necessary details and information.

Step 2: Create a GPU Node (Virtual Machine)

GPU Nodes are NodeShift’s GPU Virtual Machines, on-demand resources equipped with diverse GPUs ranging from H200s to A100s. These GPU-powered VMs provide enhanced environmental control, allowing configuration adjustments for GPUs, CPUs, RAM, and Storage based on specific requirements.

Navigate to the menu on the left side and select the GPU Nodes option. In the Dashboard, click the Create GPU Node button to deploy your first Virtual Machine.

Step 3: Select a Model, Region, and Storage

In the “GPU Nodes” tab, select a GPU Model and Storage according to your needs and the geographical region where you want to launch your model.

We will use 1 x H100 SXM GPU for this tutorial to achieve the fastest performance. However, you can choose a more affordable GPU with less VRAM if that better suits your requirements.

Step 4: Select Authentication Method

There are two authentication methods available: Password and SSH Key. SSH keys are a more secure option. To create them, please refer to our official documentation.

Step 5: Choose an Image

In our previous blogs, we used pre-built images from the Templates tab when creating a Virtual Machine. However, for running SAP-RPT-1-OSS, we need a more customized environment with full CUDA development capabilities. That’s why, in this case, we switched to the Custom Image tab and selected a specific Docker image that meets all runtime and compatibility requirements.

We chose the following image:

nvidia/cuda:12.1.1-devel-ubuntu22.04

This image is essential because it includes:

Full CUDA toolkit (including nvcc)
Proper support for building and running GPU-based models like SAP-RPT-1-OSS.
Compatibility with CUDA 12.1.1 required by certain model operations
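Once the VM is running, you can confirm that the toolkit in the image is the release you expect. The sketch below reads the release number from the output of nvcc --version. The helper names are ours, and the parsing assumes the usual "release X.Y" wording in that output.

```python
import re
import shutil
import subprocess


def parse_nvcc_release(output):
    """Extract (major, minor) from nvcc --version text, or None if absent."""
    match = re.search(r"release (\d+)\.(\d+)", output)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def installed_cuda_release():
    """Return the CUDA toolkit release found on PATH, or None."""
    if shutil.which("nvcc") is None:
        return None  # nvcc is not installed or not on PATH
    result = subprocess.run(
        ["nvcc", "--version"], capture_output=True, text=True, check=False
    )
    return parse_nvcc_release(result.stdout)


print("CUDA toolkit release:", installed_cuda_release())
```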

Launch Mode

We selected:

Interactive shell server

This gives us SSH access and full control over terminal operations — perfect for installing dependencies, running benchmarks, and launching models like SAP-RPT-1-OSS.

Docker Repository Authentication

We left all fields empty here.

Since the Docker image is publicly available on Docker Hub, no login credentials are required.

Identification

Template Name:

nvidia/cuda:12.1.1-devel-ubuntu22.04

CUDA and cuDNN images from gitlab.com/nvidia/cuda. Devel version contains full cuda toolkit with nvcc.

This setup ensures that the SAP-RPT-1-OSS runs in a GPU-enabled environment with proper CUDA access and high compute performance.

After choosing the image, click the ‘Create’ button, and your Virtual Machine will be deployed.

Step 6: Virtual Machine Successfully Deployed

You will get visual confirmation that your node is up and running.

Step 7: Connect to GPUs using SSH

NodeShift GPUs can be connected to and controlled through a terminal using the SSH key provided during GPU creation.

Once your GPU Node deployment is successfully created and has reached the ‘RUNNING’ status, you can navigate to the page of your GPU Deployment Instance. Then, click the ‘Connect’ button in the top right corner.

Now open your terminal and paste the proxy SSH IP or direct SSH IP.

Next, if you want to check the GPU details, run the command below:

nvidia-smi
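If you would rather read the GPU name and total memory from a script, for example to choose a context size automatically, the following sketch wraps nvidia-smi's CSV query mode. The function names are hypothetical; the --query-gpu and --format options are standard nvidia-smi flags.

```python
import shutil
import subprocess


def parse_gpu_query(output):
    """Parse 'name, memory.total' CSV lines into (name, total_mib) tuples."""
    gpus = []
    for line in output.strip().splitlines():
        # Split on the last comma so the GPU name is kept whole.
        name, memory_mib = line.rsplit(",", 1)
        gpus.append((name.strip(), int(memory_mib.strip())))
    return gpus


def query_gpus():
    """Return the GPUs visible to nvidia-smi, or an empty list if unavailable."""
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=name,memory.total",
         "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=False,
    )
    if result.returncode != 0:
        return []
    return parse_gpu_query(result.stdout)


for name, total_mib in query_gpus():
    print(f"{name}: {total_mib / 1024:.1f} GiB")
```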

Step 8: Install Python 3.11 and Pip (VM already has Python 3.10; We Update It)

Run the following command to check the available Python version:

python3 --version

The system ships with Python 3.10.12 by default. To install a newer version of Python, you’ll need to use the deadsnakes PPA.

Run the following commands to add the deadsnakes PPA:

apt update && apt install -y software-properties-common curl ca-certificates
add-apt-repository -y ppa:deadsnakes/ppa
apt update

Now, run the following commands to install Python 3.11, Pip and Wheel:

apt install -y python3.11 python3.11-venv python3.11-dev
python3.11 -m ensurepip --upgrade
python3.11 -m pip install --upgrade pip setuptools wheel
python3.11 --version
python3.11 -m pip --version

Step 10: Create and Activate a Virtual Environment

Set up a clean Python 3.11 environment for running the SAP-RPT-1-OSS model.

mkdir -p ~/sap-rpt && cd ~/sap-rpt
python3.11 -m venv .venv && source .venv/bin/activate
pip install --upgrade pip setuptools wheel
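A quick way to confirm that the environment is active is to ask the interpreter itself. This sketch assumes Python 3.11 is the target version, as in this tutorial; both helper names are ours.

```python
import sys


def in_virtualenv() -> bool:
    """True when the interpreter is running inside a venv."""
    return sys.prefix != sys.base_prefix


def meets_minimum(version, minimum=(3, 11)) -> bool:
    """True when (major, minor) of `version` is at least `minimum`."""
    return tuple(version[:2]) >= minimum


print("Python:", sys.version.split()[0])
print("Inside venv:", in_virtualenv())
print("Python 3.11 or newer:", meets_minimum(sys.version_info))
```

Run it with python (not python3.11) after activating the environment, so that it reports on the interpreter the later steps will actually use.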

Step 11: Authenticate with Hugging Face to Enable Model Checkpoint Downloads

The SAP-RPT-1-OSS model pulls its weights directly from Hugging Face. You must log in once on your VM to authorize downloads.

pip install huggingface_hub
python -c "from huggingface_hub import login; login()"
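The command above prompts for a token interactively. On a headless VM or in a setup script, you may prefer to pass the token through an environment variable instead. The sketch below assumes you have exported HF_TOKEN beforehand; `get_hf_token` and `login_from_env` are hypothetical helper names, while `login(token=...)` is the huggingface_hub call.

```python
import os


def get_hf_token(env=os.environ) -> str:
    """Read the access token from HF_TOKEN, failing loudly if it is missing."""
    token = env.get("HF_TOKEN", "").strip()
    if not token:
        raise RuntimeError("Set HF_TOKEN to a Hugging Face access token first")
    return token


def login_from_env() -> None:
    """Log in to Hugging Face without an interactive prompt."""
    # Imported lazily so get_hf_token() works even before the package is installed.
    from huggingface_hub import login

    login(token=get_hf_token())


if __name__ == "__main__":
    if os.environ.get("HF_TOKEN"):
        login_from_env()
    else:
        print("HF_TOKEN is not set; use the interactive login instead")
```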

Step 12: Clone and Install the SAP-RPT-1-OSS Repository

Now that your Hugging Face login is set up, clone the official SAP sample repository and install its dependencies locally.

git clone https://github.com/SAP-samples/sap-rpt-1-oss.git
cd sap-rpt-1-oss
pip install -r requirements.txt
pip install -e .
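Before moving on, it can help to confirm that the packages used in the next steps are importable from the active environment. This is a small sketch; `is_installed` is a hypothetical helper, and the module list simply reflects what this tutorial imports.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """True when a top-level module can be found without importing it."""
    return importlib.util.find_spec(module_name) is not None


# Modules this tutorial relies on in the following steps.
for name in ("sap_rpt_oss", "sklearn", "huggingface_hub"):
    status = "ok" if is_installed(name) else "MISSING"
    print(f"{name}: {status}")
```

If sklearn shows as missing, installing scikit-learn with pip should resolve it.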

Step 13: Connect to Your GPU VM with a Code Editor

Before you start running the model script, it’s a good idea to connect your GPU virtual machine (VM) to a code editor of your choice. This makes writing, editing, and running code much easier.

You can use popular editors like VS Code, Cursor, or any other IDE that supports SSH remote connections.
In this example, we’re using the Cursor code editor.
Once connected, you’ll be able to browse files, edit scripts, and run commands directly on your remote server, just like working locally.

Why do this?
Connecting your VM to a code editor gives you a powerful, streamlined workflow for Python development, allowing you to easily manage your code, install dependencies, and experiment with large models.

Step 14: Create the Script

Create a file (e.g., app.py) and add the following code:

from sklearn.datasets import load_breast_cancer
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sap_rpt_oss import SAP_RPT_OSS_Classifier

# Load a small binary-classification dataset and split it 50/50.
X, y = load_breast_cancer(return_X_y=True)
Xtr, Xte, ytr, yte = train_test_split(X, y, test_size=0.5, random_state=42)

# Larger max_context_size and bagging improve quality but need more VRAM.
clf = SAP_RPT_OSS_Classifier(max_context_size=4096, bagging=2)  # adjust to your VRAM
clf.fit(Xtr, ytr)  # provide the training rows as context
pred = clf.predict(Xte)  # predict labels for the held-out rows
print("Accuracy:", accuracy_score(yte, pred))

What This Script Does

Loads a sample dataset (breast_cancer) from scikit-learn for binary classification.
Splits the data into training and testing sets (50% each).
Initializes the SAP_RPT_OSS_Classifier, setting max_context_size=4096 and bagging=2 to control model memory and performance.
Fits the model on the training data (fit()). Because SAP-RPT-1-OSS is an in-context learner, the training rows serve as context for prediction, which lets the model use column and value semantics.
Generates predictions on the test data and prints the overall accuracy score — verifying that the model runs successfully on your GPU VM.

Step 15: Run the Script

python3 app.py
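If the run stops with an out-of-memory error, the GPU table suggests shrinking the context size or dropping bagging. The sketch below automates that retry loop. It assumes CUDA out-of-memory errors surface as a RuntimeError whose message mentions "out of memory", as PyTorch's typically do. `run_with_fallback` is a hypothetical helper, and `fake_run` stands in for the real model call so the example runs anywhere.

```python
def run_with_fallback(run, settings):
    """Call run(max_context_size, bagging) with each setting until one fits.

    `run` is a caller-supplied function that builds the classifier, fits it,
    and returns predictions. Returns (predictions, settings_used).
    """
    last_error = None
    for max_context_size, bagging in settings:
        try:
            return run(max_context_size, bagging), (max_context_size, bagging)
        except RuntimeError as error:
            # Assumption: OOM shows up as RuntimeError mentioning "out of memory".
            if "out of memory" not in str(error).lower():
                raise
            last_error = error
            print(f"OOM at context={max_context_size}, bagging={bagging}; retrying smaller")
    raise RuntimeError("All settings ran out of memory") from last_error


# Stand-in for the real model call (simulates a GPU that fits context*bagging <= 4096).
def fake_run(max_context_size, bagging):
    if max_context_size * bagging > 4096:
        raise RuntimeError("CUDA out of memory (simulated)")
    return "predictions"


result, used = run_with_fallback(fake_run, [(4096, 2), (4096, 1), (2048, 1)])
print(result, used)  # predictions (4096, 1)
```

In app.py, the `run` function would wrap the SAP_RPT_OSS_Classifier construction, fit(), and predict() calls from Step 14.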

Conclusion

With these simple steps, you’ve successfully installed and run SAP-RPT-1-OSS — SAP’s advanced table-native, semantics-aware model designed for tabular classification and regression. Running it on a NodeShift GPU Virtual Machine ensures optimal performance, scalability, and compliance with enterprise-grade standards like GDPR, SOC2, and ISO27001.

By combining the semantic understanding of real-world data with the efficiency of in-context learning, SAP-RPT-1-OSS sets a new benchmark for tabular AI. Whether you’re experimenting with research datasets or building production-grade analytics pipelines, this setup gives you a powerful, GPU-ready foundation to explore the future of tabular intelligence.
