Filter AI Models

Use the filters below to narrow down the AI models based on your criteria. Combine multiple filters to get precise information tailored to your needs!

  • Model: Filter by model to view data for specific architectures such as CNNs, DNNs, and Transformers.
  • Phase: Filter by phase to view data on models during inference or training stages.
  • Layers: Focus your study on specific layers within the models.
  • Operation: Filter based on operations performed by the models.
  • Sparsity: Select models based on their sparsity levels.
  • Data Type: Choose models using specific data types in inference or training.
  • Memory Footprint: Filter models by their memory usage to match your hardware capabilities.
  • Production Hardware Platforms: Find models compatible with specific hardware platforms.
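Combining filters this way amounts to applying conjunctive (AND) predicates over the model records. A minimal sketch in Python, assuming hypothetical field names (`model`, `phase`, `data_type`, `memory_gb`) that mirror the filter categories above rather than the site's actual schema:

```python
# Minimal sketch of combinable filters over model records.
# Field names and values are hypothetical illustrations of the filters above.
records = [
    {"model": "CNN", "phase": "Inference", "data_type": "FP16", "memory_gb": 0.5},
    {"model": "CNN", "phase": "Training", "data_type": "FP32", "memory_gb": 8.0},
    {"model": "Transformers", "phase": "Inference", "data_type": "BF16", "memory_gb": 40.0},
]

def apply_filters(rows, **criteria):
    """Keep only rows matching every given criterion (AND semantics)."""
    return [r for r in rows if all(r.get(k) == v for k, v in criteria.items())]

# Combine two filters: model AND phase.
cnn_inference = apply_filters(records, model="CNN", phase="Inference")
print(cnn_inference)
```

Each additional keyword argument narrows the result further, which is the same behavior as stacking filters in the interface.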

AI Models Data Visualization

Explore our interactive charts that showcase the latest data on memory footprint, number of papers per year, data types used in inference and training, and hardware platforms distribution. Our charts are updated regularly to reflect the most current trends and findings in AI research.

Training - Hardware Platform and Memory Footprint

Chart showing memory footprint during training across different hardware platforms.

Inference - Hardware Platform and Memory Footprint

Chart showing memory footprint during inference across different hardware platforms.

Number of Papers in the Database Per Year

Chart displaying the number of papers added to the database each year.

Data Types Used in Inference and Training

Chart illustrating data types used in inference and training phases.
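The papers-per-year chart is, underneath, a simple aggregation of publication years into counts. A quick sketch of that aggregation with made-up counts (the real values come from the interactive chart):

```python
from collections import Counter

# Hypothetical publication years; the actual per-year counts are in the
# interactive chart above, these values are illustrative only.
paper_years = [2019, 2020, 2020, 2021, 2021, 2021, 2022, 2022, 2022, 2022]

counts = Counter(paper_years)
for year in sorted(counts):
    print(f"{year}: {'#' * counts[year]} ({counts[year]})")
```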

Comprehensive AI Models Table

Interact with our detailed table containing extensive information on AI models, including references to research papers verified by our team.

Each model below has one entry per phase (Inference, Training). The table's columns are shown as labeled fields: Layers, Operations, Sparsity, Data Type, Memory Footprint (GB), Production Hardware Platforms, Features, and Industry Usage.

Convolutional Neural Network (CNN) - Inference
  • Layers: Convolution; Neural Network (Activation functions, Weights and bias operations); Pooling
  • Operations: ReLU, Sigmoid or Tanh; Vector-matrix multiply, Vector-vector add
  • Sparsity: Dense vector, Dense matrix; Dense vector, Dense matrix
  • Data Type: 1-bit, 2-bit, 8-bit, FP16, FP32, INT8
  • Memory Footprint: CPU, GPU: 0.1 to 1 GB; CPU, GPU: 1 to 10 GB
  • Production Hardware Platforms: CPU, GPU
  • Features: Real-time requirements, High memory BW, Compute intensive
  • Industry Usage: Apple: Face ID, Apple Neural Engine; Google: Object detection; Instagram: Automatically recognize and tag images; Amazon: Summarizing customer feedback, forecasting

Convolutional Neural Network (CNN) - Training
  • Layers: Backpropagation; Batch normalization; Dropout; Forward pass
  • Operations: Partial derivatives over a vector, vector-vector add, vector-vector multiply, random binary vector generation, all inference operations
  • Sparsity: Dense vector; Dense vector, Dense vector
  • Data Type: BF16, FP16, FP32
  • Memory Footprint: GPU: 0.1 to 1 GB; GPU: 1 to 10 GB; GPU: 10 to 100 GB
  • Production Hardware Platforms: GPU
  • Features: Memory footprint, Memory bandwidth, Scalability
  • Industry Usage: Apple: Face ID, Apple Neural Engine; Google: Object detection; Instagram: Automatically recognize and tag images; Amazon: Summarizing customer feedback, forecasting

Deep Learning Recommendation Systems (DLRS) - Inference
  • Layers: Embedding (Embedding lookup, Lookup aggregation); Neural Network (Activation functions, Weights and bias operations)
  • Operations: Embedding lookup: Load from memory; Lookup aggregation: Vector-vector add or Vector-vector concat; ReLU; Matrix-matrix multiply, Vector-vector add
  • Sparsity: Embedding lookup: N/A; Lookup aggregation: Dense vectors; Dense matrix; Dense matrix, Dense matrix; Dense vector, Dense vector
  • Data Type: INT4, INT8, INT32, BF16, FP16, FP32
  • Memory Footprint: MTIA, GPU, CPU: 10 to 100 GB; MTIA, GPU, CPU: 100 to 1000 GB; MTIA, GPU: 1000 to 10000 GB; MTIA, GPU: 10000+ GB
  • Production Hardware Platforms: CPU, GPU, MTIA
  • Features: Real-time requirements; Neural network compute intensive; Embedding: light compute with high memory capacity and bandwidth needs
  • Industry Usage: Amazon, Google, eBay, Alibaba: Advertisements and product recommendation; Facebook, Instagram: Social networking service; YouTube, Spotify, Fox, Pinterest: Media/music/image recommendations; LinkedIn: News feed

Deep Learning Recommendation Systems (DLRS) - Training
  • Layers: Backpropagation; Forward pass
  • Operations: Partial derivatives over a vector, vector-vector add, all inference operations
  • Sparsity: Dense vector, Dense vector
  • Data Type: INT8, INT32, BF16, FP16, FP32
  • Memory Footprint: GPU, MTIA: 10 to 100 GB; GPU, MTIA: 100 to 1000 GB; GPU, MTIA: 1000 to 10000 GB; GPU, MTIA: 10000+ GB
  • Production Hardware Platforms: GPU, MTIA
  • Features: Neural network compute intensive; Embedding: light compute with high memory capacity and bandwidth needs
  • Industry Usage: Amazon, Google, eBay, Alibaba: Advertisements and product recommendation; Facebook, Instagram: Social networking service; YouTube, Spotify, Fox, Pinterest: Media/music/image recommendations; LinkedIn: News feed

Deep Neural Network (DNN) - Inference
  • Layers: Neural Network (Activation functions, Weights and bias operations)
  • Operations: ReLU, Sigmoid or Tanh; Vector-matrix multiply, Vector-vector add
  • Sparsity: N/A; Dense vector, Dense matrix; Dense vector, Dense vector
  • Data Type: 1-bit, 2-bit, 8-bit, INT8, FP16, FP32
  • Memory Footprint: CPU, GPU: 0.1 to 1 GB; CPU, GPU: 1 to 10 GB
  • Production Hardware Platforms: CPU, GPU
  • Features: Real-time requirements, High memory BW, Compute intensive
  • Industry Usage: Apple: Face ID, Apple Neural Engine; Google: Object detection, voice recognition; Amazon: Summarizing customer feedback, voice assistants

Deep Neural Network (DNN) - Training
  • Layers: Backpropagation; Batch normalization; Dropout; Forward pass
  • Operations: Partial derivatives over a vector, vector-vector add, vector-vector multiply, random binary vector generation, all inference operations
  • Sparsity: Dense vector; Dense vector, Dense vector
  • Data Type: BF16, FP16, FP32
  • Memory Footprint: GPU: 0.1 to 1 GB; GPU: 1 to 10 GB
  • Production Hardware Platforms: GPU
  • Features: Memory footprint, Memory bandwidth, Scalability
  • Industry Usage: Apple: Face ID, Apple Neural Engine; Google: Object detection, voice recognition; Amazon: Summarizing customer feedback, voice assistants

Graph Neural Network (GNN) - Inference
  • Layers: Aggregation (multiple vectors aggregated through an operator); Combination (Activation functions, Weights and bias operations)
  • Operations: Aggregation: Matrix-matrix multiply; ReLU, Sigmoid or Tanh; Matrix-matrix multiply, Vector-vector add
  • Sparsity: Aggregation: Sparse matrix, Dense matrix; N/A; Dense matrix, Dense matrix; Dense vector, Dense vector
  • Data Type: INT32, FP32
  • Memory Footprint: GPU, CPU: 100 to 1000 GB
  • Production Hardware Platforms: GPU, CPU
  • Features: Compute intensity, High memory BW, Real-time response
  • Industry Usage: Pinterest, Amazon: Recommender systems; Social network analysis; Molecular structure prediction; Twitter: Temporal Graph Networks (TGNs); Facebook: Social network graphs

Graph Neural Network (GNN) - Training
  • Layers: Backpropagation; Dropout; Forward pass
  • Operations: Partial derivatives over a vector, vector-vector add, random binary vector generation, vector-vector multiply, all inference operations
  • Sparsity: Dense vector; Dense vector, Dense vector
  • Data Type: FP32
  • Memory Footprint: GPU, CPU: 100 to 1000 GB
  • Production Hardware Platforms: GPU, CPU
  • Features: Compute intensity, High memory BW, Load imbalance
  • Industry Usage: Pinterest, Amazon: Recommender systems; Social network analysis; Molecular structure prediction; Twitter: Temporal Graph Networks (TGNs); Facebook: Social network graphs

Recurrent Neural Network (RNN) - Inference
  • Layers: Embedding; Recurrent layers (Activation functions, Weights and bias operations)
  • Operations: ReLU, Sigmoid, or Tanh; Vector-matrix multiply, Vector-vector add
  • Sparsity: Dense vector; Dense vector, Dense matrix; Dense vector, Dense vector
  • Data Type: FP32, INT8, FP16, BF16
  • Memory Footprint: GPU, TPU: 0.1 to 1 GB
  • Production Hardware Platforms: GPU, TPU
  • Features: Real-time requirements, Sequential execution dependencies
  • Industry Usage: Google: NLP, Translation, Speech recognition, Autonomous driving; Baidu: Speech recognition; Amazon: Speech recognition

Recurrent Neural Network (RNN) - Training
  • Layers: Backpropagation through time; Forward pass
  • Operations: Partial derivatives over a vector, vector-vector add, all inference operations
  • Sparsity: Dense vector; Dense vector, Dense vector
  • Data Type: FP32, FP16
  • Memory Footprint: GPU, TPU: 0.1 to 1 GB
  • Production Hardware Platforms: GPU, TPU
  • Features: Can exploit model and data parallelism
  • Industry Usage: Google: NLP, Translation, Speech recognition, Autonomous driving; Baidu: Speech recognition; Amazon: Speech recognition

Transformers - Inference
  • Layers: Concatenation; Embedding; Multi-head attention (Linear, Probability-value multiply, Query-key multiply, Softmax); Neural Network (Activation functions, Linear); Normalization; Positional encoding
  • Operations: Linear: Matrix-matrix add, Matrix-matrix multiply; Probability-value multiply: Matrix-matrix multiply; Query-key multiply: Matrix-matrix multiply; Softmax: Scalar-vector power, Vector-scalar divide, Vector-to-scalar reduction; ReLU, SwiGLU, GeLU, GeGLU
  • Sparsity: Linear: Dense matrix, Dense matrix; Probability-value multiply: Dense matrix, Dense matrix or Dense matrix, Sparse matrix; Query-key multiply: Dense matrix, Dense matrix or Dense matrix, Sparse matrix; Softmax: Dense vector
  • Data Type: FP32, FP16, BF16
  • Memory Footprint: TPU, GPU: 0.1 to 1 GB; 1 to 10 GB; 10 to 100 GB; 100 to 1000 GB; 1000 to 10000 GB
  • Production Hardware Platforms: TPU, GPU
  • Features: Real-time requirements, High memory BW requirements, Good scalability
  • Industry Usage: Google: NLP, Video processing, Image analysis, Protein structure prediction; OpenAI: NLP, Image generation, Image analysis; Facebook: NLP; Microsoft: NLP, Image generation; NVIDIA: NLP

Transformers - Training
  • Layers: Backpropagation; Dropout; Forward pass
  • Operations: Partial derivatives over a vector, vector-vector add, random binary vector generation, vector-vector multiply, all inference operations
  • Sparsity: Dense vector, Dense vector; Dense vector
  • Data Type: BF16, FP16, FP32
  • Memory Footprint: TPU, GPU: 0.1 to 1 GB; 1 to 10 GB; 10 to 100 GB; 100 to 1000 GB; 10000+ GB
  • Production Hardware Platforms: GPU, TPU
  • Features: High memory BW requirements, Good scalability
  • Industry Usage: Google: NLP, Video processing, Image analysis, Protein structure prediction; OpenAI: NLP, Image generation, Image analysis; Facebook: NLP; Microsoft: NLP, Image generation; NVIDIA: NLP
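A few primitives recur throughout the Operations column above: vector-matrix multiply plus vector-vector add (a dense layer), the ReLU activation, and softmax decomposed as scalar-vector power, vector-to-scalar reduction, and vector-scalar divide. A minimal NumPy sketch of those primitives, with illustrative shapes and values:

```python
import numpy as np

# Dense vector and matrix operands, as in the Sparsity column.
x = np.array([1.0, -2.0, 3.0])   # dense input vector
W = np.ones((3, 2))              # dense weight matrix (illustrative values)
b = np.array([0.5, -0.5])        # bias vector

# Vector-matrix multiply followed by vector-vector add.
z = x @ W + b

# ReLU activation.
relu = np.maximum(z, 0.0)

# Softmax as decomposed in the table: scalar-vector power (e raised to z),
# vector-to-scalar reduction (sum), then vector-scalar divide.
e = np.exp(z - z.max())          # shifted by the max for numerical stability
softmax = e / e.sum()

print(relu, softmax)
```

The training rows add the reverse of these steps (partial derivatives over a vector, vector-vector multiply) on top of the inference operations.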

All data is backed by research papers verified by our team. You can view references for each cell, export them, or view all references together.

Continuously Updated Database

Our AI models database grows weekly as we add more references through our thorough verification process. Stay up-to-date with the latest AI research and models.

Learn more about our data collection and verification process on our About Page.