Filter AI Models
Use the filters below to narrow down the AI models based on your criteria. Combine multiple filters to get precise information tailored to your needs!
- Model: Filter by model to view data tied to specific models like CNNs, DNNs, Transformers, etc.
- Phase: Filter by phase to view data on models during inference or training stages.
- Layers: Focus your study on specific layers within the models.
- Operation: Filter based on operations performed by the models.
- Sparsity: Select models based on their sparsity levels.
- Data Type: Choose models using specific data types in inference or training.
- Memory Footprint: Filter models by their memory usage to match your hardware capabilities.
- Production Hardware Platforms: Find models compatible with specific hardware platforms.
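Combining filters like the ones above can be sketched programmatically against an exported copy of the table. This is a minimal illustration only: the record fields, sample values, and function names below are our own assumptions modeled on the table columns, not the site's actual export schema or API.

```python
from dataclasses import dataclass

@dataclass
class ModelRecord:
    """One table row; fields mirror a subset of the table columns (assumed schema)."""
    model: str
    phase: str            # "Inference" or "Training"
    data_types: list[str]
    hardware: list[str]
    memory_gb_max: float  # upper bound of the row's memory-footprint bucket

# Two illustrative rows drawn from the table below.
RECORDS = [
    ModelRecord("CNN", "Inference", ["INT8", "FP16", "FP32"], ["CPU", "GPU"], 10),
    ModelRecord("Transformers", "Training", ["BF16", "FP16", "FP32"], ["GPU", "TPU"], 10000),
]

def filter_models(records, phase=None, hardware=None, max_memory_gb=None):
    """Apply any combination of filters; a None argument means 'no constraint'."""
    out = []
    for r in records:
        if phase is not None and r.phase != phase:
            continue
        if hardware is not None and hardware not in r.hardware:
            continue
        if max_memory_gb is not None and r.memory_gb_max > max_memory_gb:
            continue
        out.append(r)
    return out

# Example: inference-capable models that run on a GPU.
print([r.model for r in filter_models(RECORDS, phase="Inference", hardware="GPU")])
```

Because each filter is independent, stacking them narrows the result set exactly as the site's combined filters do.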
AI Models Data Visualization
Explore our interactive charts that showcase the latest data on memory footprint, number of papers per year, data types used in inference and training, and hardware platforms distribution. Our charts are updated regularly to reflect the most current trends and findings in AI research.
Training - Hardware Platform and Memory Footprint
Inference - Hardware Platform and Memory Footprint
Number of Papers in the Database Per Year
Data Types Used in Inference and Training
Comprehensive AI Models Table
Interact with our detailed table containing extensive information on AI models, including references to research papers verified by our team.
Model | Phase | Layers | Operations | Sparsity | Data Type | Memory Footprint (GB) | Production Hardware Platforms | Features | Industry Usage |
---|---|---|---|---|---|---|---|---|---|
Convolutional Neural Network (CNN) | Inference | Convolution; Neural Network (Activation functions, Weights and bias operations); Pooling | ReLU, Sigmoid, or Tanh; Vector-matrix multiply; Vector-vector add | Dense vector, Dense matrix; Dense vector, Dense matrix | 1-bit, 2-bit, 8-bit, FP16, FP32, INT8 | CPU, GPU: <1 GB; CPU, GPU: 1 to 10 GB | CPU, GPU | Real-time requirements, High memory BW, Compute intensive | Apple: Face ID, Apple Neural Engine; Google: Object detection; Instagram: Automatically recognize and tag images; Amazon: Summarizing customer feedback, forecasting |
Convolutional Neural Network (CNN) | Training | Backpropagation; Batch normalization; Dropout; Forward pass | Partial derivatives over a vector; vector-vector add; vector-vector multiply; random binary vector generation; all inference operations | Dense vector; Dense vector, Dense vector | BF16, FP16, FP32 | GPU: <1 GB; GPU: 1 to 10 GB; GPU: 10 to 100 GB | GPU | Memory footprint, Memory bandwidth, Scalability | Apple: Face ID, Apple Neural Engine; Google: Object detection; Instagram: Automatically recognize and tag images; Amazon: Summarizing customer feedback, forecasting |
Deep Learning Recommendation Systems (DLRS) | Inference | Embedding (Embedding lookup, Lookup aggregation); Neural Network (Activation functions, Weights and bias operations) | Embedding lookup: Load from memory; Lookup aggregation: Vector-vector add or Vector-vector concat; ReLU; Matrix-matrix multiply, Vector-vector add | Embedding lookup: N/A; Lookup aggregation: Dense vectors; Dense matrix; Dense matrix, Dense matrix; Dense vector, Dense vector | INT4, INT8, INT32, BF16, FP16, FP32 | MTIA, GPU, CPU: 10 to 100 GB; MTIA, GPU, CPU: 100 to 1000 GB; MTIA, GPU: 1000 to 10000 GB; MTIA, GPU: 10000+ GB | CPU, GPU, MTIA | Real-time requirements; Neural network compute intensive; Embedding: light compute with high memory capacity and bandwidth needs | Amazon, Google, eBay, Alibaba: Advertisements and product recommendation; Facebook, Instagram: Social networking service; YouTube, Spotify, Fox, Pinterest: Media/music/image recommendations; LinkedIn: News feed |
Deep Learning Recommendation Systems (DLRS) | Training | Backpropagation; Forward pass | Partial derivatives over a vector; vector-vector add; all inference operations | Dense vector, Dense vector | INT8, INT32, BF16, FP16, FP32 | GPU, MTIA: 10 to 100 GB; GPU, MTIA: 100 to 1000 GB; GPU, MTIA: 1000 to 10000 GB; GPU, MTIA: 10000+ GB | GPU, MTIA | Neural network compute intensive; Embedding: light compute with high memory capacity and bandwidth needs | Amazon, Google, eBay, Alibaba: Advertisements and product recommendation; Facebook, Instagram: Social networking service; YouTube, Spotify, Fox, Pinterest: Media/music/image recommendations; LinkedIn: News feed |
Deep Neural Network (DNN) | Inference | Neural Network (Activation functions, Weights and bias operations) | ReLU, Sigmoid, or Tanh; Vector-matrix multiply, Vector-vector add | N/A; Dense vector, Dense matrix; Dense vector, Dense vector | 1-bit, 2-bit, 8-bit, INT8, FP16, FP32 | CPU, GPU: <1 GB; CPU, GPU: 1 to 10 GB | CPU, GPU | Real-time requirements, High memory BW, Compute intensive | Apple: Face ID, Apple Neural Engine; Google: Object detection, voice recognition; Amazon: Summarizing customer feedback, voice assistants |
Deep Neural Network (DNN) | Training | Backpropagation; Batch normalization; Dropout; Forward pass | Partial derivatives over a vector; vector-vector add; vector-vector multiply; random binary vector generation; all inference operations | Dense vector; Dense vector, Dense vector | BF16, FP16, FP32 | GPU: <1 GB; GPU: 1 to 10 GB | GPU | Memory footprint, Memory bandwidth, Scalability | Apple: Face ID, Apple Neural Engine; Google: Object detection, voice recognition; Amazon: Summarizing customer feedback, voice assistants |
Graph Neural Network (GNN) | Inference | Aggregation (Multiple vectors aggregated through an operator); Combination (Activation functions, Weights and bias operations) | Aggregation: Matrix-matrix multiply; ReLU, Sigmoid, or Tanh; Matrix-matrix multiply, Vector-vector add | Aggregation: Sparse matrix, Dense matrix; N/A; Dense matrix, Dense matrix; Dense vector, Dense vector | INT32, FP32 | GPU, CPU: 100 to 1000 GB | GPU, CPU | Compute intensity, High memory BW, Real-time response | Pinterest, Amazon: Recommender systems; Social network analysis; Molecular structure prediction; Twitter: Temporal Graph Networks (TGNs); Facebook: Social network graphs |
Graph Neural Network (GNN) | Training | Backpropagation; Dropout; Forward pass | Partial derivatives over a vector; vector-vector add; random binary vector generation; vector-vector multiply; all inference operations | Dense vector; Dense vector, Dense vector | FP32 | GPU, CPU: 100 to 1000 GB | GPU, CPU | Compute intensity, High memory BW, Load imbalance | Pinterest, Amazon: Recommender systems; Social network analysis; Molecular structure prediction; Twitter: Temporal Graph Networks (TGNs); Facebook: Social network graphs |
Recurrent Neural Network (RNN) | Inference | Embedding; Recurrent layers (Activation functions, Weights and bias operations) | ReLU, Sigmoid, or Tanh; Vector-matrix multiply, Vector-vector add | Dense vector; Dense vector, Dense matrix; Dense vector, Dense vector | FP32, INT8, FP16, BF16 | GPU, TPU: <1 GB | GPU, TPU | Real-time requirements, Sequential execution dependencies | Google: NLP, Translation, Speech recognition, Autonomous driving; Baidu: Speech recognition; Amazon: Speech recognition |
Recurrent Neural Network (RNN) | Training | Backpropagation through time; Forward pass | Partial derivatives over a vector; vector-vector add; all inference operations | Dense vector; Dense vector, Dense vector | FP32, FP16 | GPU, TPU: <1 GB | GPU, TPU | Can exploit model and data parallelism | Google: NLP, Translation, Speech recognition, Autonomous driving; Baidu: Speech recognition; Amazon: Speech recognition |
Transformers | Inference | Concatenation; Embedding; Multi-head attention (Linear, Probability-value multiply, Query-key multiply, Softmax); Neural Network (Activation functions, Linear); Normalization; Positional encoding | Linear: Matrix-matrix add, Matrix-matrix multiply; Probability-value multiply: Matrix-matrix multiply; Query-key multiply: Matrix-matrix multiply; Softmax: Scalar-vector power, Vector-scalar divide, Vector-to-scalar reduction; ReLU, SwiGLU, GeLU, GeGLU | Linear: Dense matrix, Dense matrix; Probability-value multiply: Dense matrix, Dense matrix or Dense matrix, Sparse matrix; Query-key multiply: Dense matrix, Dense matrix or Dense matrix, Sparse matrix; Softmax: Dense vector | FP32, FP16, BF16 | TPU, GPU: <1 GB; TPU, GPU: 1 to 10 GB; TPU, GPU: 10 to 100 GB; TPU, GPU: 100 to 1000 GB; TPU, GPU: 1000 to 10000 GB | TPU, GPU | Real-time requirements, High memory BW requirements, Good scalability | Google: NLP, Video processing, Image analysis, Protein structure prediction; OpenAI: NLP, Image generation, Image analysis; Facebook: NLP; Microsoft: NLP, Image generation; NVIDIA: NLP |
Transformers | Training | Backpropagation; Dropout; Forward pass | Partial derivatives over a vector; vector-vector add; random binary vector generation; vector-vector multiply; all inference operations | Dense vector, Dense vector; Dense vector | BF16, FP16, FP32 | TPU, GPU: <1 GB; TPU, GPU: 1 to 10 GB; TPU, GPU: 10 to 100 GB; TPU, GPU: 100 to 1000 GB; TPU, GPU: 10000+ GB | GPU, TPU | High memory BW requirements, Good scalability | Google: NLP, Video processing, Image analysis, Protein structure prediction; OpenAI: NLP, Image generation, Image analysis; Facebook: NLP; Microsoft: NLP, Image generation; NVIDIA: NLP |
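The Memory Footprint column groups each model and hardware pairing into order-of-magnitude buckets, from under 1 GB up to 10000+ GB. As a small illustration, a concrete footprint can be mapped to such a bucket with a sorted-bounds lookup; the function name and label strings here are our own, chosen only to mirror the table's ranges:

```python
import bisect

def footprint_bucket(gb: float) -> str:
    """Map a memory footprint in GB to one of the table's decade-wide buckets."""
    bounds = [1, 10, 100, 1000, 10000]  # bucket upper edges, in GB
    labels = ["<1 GB", "1 to 10 GB", "10 to 100 GB",
              "100 to 1000 GB", "1000 to 10000 GB", "10000+ GB"]
    # bisect_right finds how many upper edges the value meets or exceeds.
    return labels[bisect.bisect_right(bounds, gb)]

print(footprint_bucket(50))  # falls in the 10 to 100 GB bucket
```

The same lookup works for any monotone bucketing scheme; only the `bounds` and `labels` lists would change.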
All data is backed by research papers verified by our team. You can view references for each cell, export them, or view all references together.
Continuously Updated Database
Our AI models database grows weekly as we add more references through our thorough verification process. Stay up-to-date with the latest AI research and models.
Learn more about our data collection and verification process on our About Page.