Basis of the Incubation Program
Where GenAI Needs Computing Power:
Model Pre-training:
Creating pre-trained AI models capable of generating content requires massive computing power. The process trains a model on large amounts of data so that it learns and internalizes content-generation patterns. Large models such as the GPT series require hundreds of high-end GPUs running for weeks or months. Even so, pre-trained models may not apply directly to specific domains, because they often lack exposure to domain-specific data.
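The scale of "hundreds of GPUs for weeks or months" can be sketched with the common "6 × parameters × tokens" training-FLOPs heuristic. The model size, token count, GPU throughput, and utilization below are illustrative assumptions, not figures for any specific model:

```python
# Back-of-the-envelope pre-training cost using the widely used
# "6 * parameters * tokens" FLOPs approximation for dense transformers.
# All concrete numbers here are assumptions for illustration.

def pretraining_gpu_days(params: float, tokens: float,
                         gpu_tflops: float, utilization: float = 0.4) -> float:
    """Estimate single-GPU days needed to pre-train a dense transformer."""
    total_flops = 6 * params * tokens             # forward + backward passes
    effective = gpu_tflops * 1e12 * utilization   # sustained FLOP/s per GPU
    seconds = total_flops / effective
    return seconds / 86400                        # convert to GPU-days

# Hypothetical 7B-parameter model trained on 1T tokens with
# 312-TFLOPS (half-precision tensor) data-center GPUs:
days = pretraining_gpu_days(7e9, 1e12, 312)
print(f"{days:,.0f} GPU-days")   # ~3,900 GPU-days, i.e. roughly a week on 512 GPUs
```

Larger models and token budgets scale this estimate linearly, which is why frontier-scale pre-training quickly reaches hundreds of GPUs for months.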
Content Service:
After training is complete, generating content with the model still requires computing power, though usually far less than the training phase. At this stage the model applies the knowledge learned during training to create new content: text generation, image creation, audio synthesis, and so on.
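The gap between training and serving cost can be illustrated with the common "2 × parameters" FLOPs-per-generated-token approximation for dense transformers. The model size and GPU figures are hypothetical assumptions:

```python
# Rough compute-bound ceiling on generation speed, using the common
# "2 * parameters" FLOPs-per-token approximation (forward pass only).
# The model size and GPU throughput below are illustrative assumptions.

def tokens_per_second(params: float, gpu_tflops: float,
                      utilization: float = 0.3) -> float:
    """Compute-bound upper limit on tokens/s for one GPU."""
    flops_per_token = 2 * params                  # one forward pass per token
    effective = gpu_tflops * 1e12 * utilization   # sustained FLOP/s
    return effective / flops_per_token

# Hypothetical 7B model on a 35-TFLOPS (FP32) consumer GPU:
print(f"{tokens_per_second(7e9, 35):,.0f} tokens/s")
# In practice, autoregressive decoding is memory-bandwidth bound,
# so real throughput is well below this compute ceiling.
```

Compared with the 6 × parameters × tokens cost of training over trillions of tokens, serving each generated token is orders of magnitude cheaper, which is why inference can run on far smaller GPU footprints.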
Reinforcement Training & Fine-tuning:
A pre-trained large model is further trained on domain-specific datasets to better meet the requirements of particular tasks and domains. Reinforcement training and fine-tuning do not require as much computing power as pre-training: typically 4 or 8 cards, with per-card memory requirements ranging from 10 GB to 40 GB depending on the algorithm. By fine-tuning a large model, you can leverage the pre-trained model's knowledge and feature-representation capabilities to achieve better performance and generalization on specific tasks.
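The memory side of those card counts can be sketched with a common rule of thumb for full fine-tuning with Adam in mixed precision: roughly 16 bytes per parameter (fp16 weights and gradients, fp32 master weights, and two fp32 optimizer states). The model size is a hypothetical example, and the rule ignores activations and sharding overhead:

```python
import math

# Rough GPU-memory estimate for full fine-tuning with Adam in mixed
# precision, using the common ~16 bytes/parameter rule of thumb.
# The 7B model size is an illustrative assumption, and activation
# memory and sharding overhead are ignored.

def finetune_memory_gb(params: float, bytes_per_param: float = 16) -> float:
    """Approximate weights + gradients + optimizer-state memory in GB."""
    return params * bytes_per_param / 1e9

def cards_needed(params: float, card_mem_gb: float) -> int:
    """Minimum cards to hold that state when sharded evenly across GPUs."""
    return math.ceil(finetune_memory_gb(params) / card_mem_gb)

# A hypothetical 7B model needs ~112 GB of model/optimizer state alone,
# which is why multi-card setups (or parameter-efficient methods such as
# LoRA, which train only small adapter matrices) are the norm:
print(cards_needed(7e9, 40))   # cards at 40 GB each
```

Parameter-efficient methods shrink the trainable-state term dramatically, which is how fine-tuning fits in the 10 GB-40 GB per-card range the text describes.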
Real-time Generation:
For applications that must generate content in real time, such as chatbots, games, or live translation, the computing power requirements can be higher because the system must produce high-quality content with low latency.
Evaluation and Verification:
During content generation, algorithms are usually needed to evaluate and verify the quality and accuracy of the generated output. This may involve additional models and algorithms, which also consume computing power.
Personalization & Customization:
In some cases, AI-generated content must be personalized or customized to user preferences or needs. This may involve additional algorithms that adapt or modify the generated content, further increasing computing power requirements.
P1X and AINUR observe that today's mainstream computing platforms are built on NVIDIA GPU servers. NVIDIA directs its AI optimization toward the expensive data-center parts that large platforms use, such as the A100/H100: their Tensor Cores, memory capacity, memory bandwidth, acceleration units, and GPU interconnects give them a massive advantage over consumer-grade RTX 30/40-series GPUs. However, for the value of large models to rest in the hands of people and communities (DAOs) rather than centers of power, AINUR and P1X aim to support innovation in the AI + Web3 field by providing cheaper, more accessible AI computing power.
Take the consumer-grade NVIDIA RTX 3090 as an example: its half-precision tensor throughput is roughly half that of the A100, and given NVIDIA's product-segmentation practices, real-world performance may be lower still. Even so, it can provide usable computing power for AI innovation. P1X and AINUR hope to provide practical support for AI innovation using consumer-grade GPUs.
We will tap into and reorganize the potential of RTX 3090 GPUs. We now offer two services for AI innovators and, following George Hotz's philosophy, plan to build with them a powerful, cost-effective, and sustainable AI computing power infrastructure on the tinygrad framework.