1. Case Summary
Takming University of Science and Technology (TMUST) is located in the Neihu District of Taipei City and has a close relationship with the Neihu Software Science Park. It has set out to form the "AI Industry Intelligent Application Alliance", which is committed to providing a practical environment where industry can learn artificial intelligence technology. Small and medium-sized enterprises that want to adopt AI can learn AI technologies at TMUST, with the aim of greatly lowering the barrier to AI adoption for enterprises.
- Artificial Intelligence Teaching Scenario
- NVIDIA Tesla V100, GPU, Kubernetes, Docker, Gemini AI Console
2. Pain Points and Challenges
- 8 GPU cards must support 50 students at the same time, and each student's resources must stay independent of everyone else's
- A friendly interface is required so that students can enter the AI development environment quickly, without spending time learning system operations
- Resources must be easy to add flexibly, so that other hardware resources in the university can be integrated in the future
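The sizing constraint in the first point can be made concrete with a little arithmetic. The following is a minimal sketch, not a description of how the platform actually partitions GPUs: it assumes each GPU is split into equal slices and each student gets exactly one slice. The function names `slices_per_gpu` and `assign_slices` are hypothetical and used only for illustration.

```python
import math


def slices_per_gpu(num_gpus: int, num_users: int) -> int:
    """Smallest number of equal slices per GPU so that every user gets one slice."""
    return math.ceil(num_users / num_gpus)


def assign_slices(num_gpus: int, num_users: int) -> dict[int, tuple[int, int]]:
    """Assign each user a distinct (gpu_index, slice_index) pair, round-robin.

    Assumption: equal-sized slices, one slice per user. A real scheduler
    would also account for memory limits and varying workloads.
    """
    assignment = {}
    for user in range(num_users):
        gpu = user % num_gpus        # spread users evenly across GPUs
        slot = user // num_gpus      # next free slice on that GPU
        assignment[user] = (gpu, slot)
    return assignment


# Figures from the case: 8 GPU cards, 50 concurrent students.
per_gpu = slices_per_gpu(8, 50)
assignment = assign_slices(8, 50)
print(f"slices needed per GPU: {per_gpu}")
print(f"share of one GPU per student: {1 / per_gpu:.1%}")
```

With 8 cards and 50 students, at least 7 slices per card are needed, because 6 slices per card would serve only 48 students. Under this equal-split assumption each student receives roughly one seventh of a card.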
3. Architecture Design Features
- Provide GPU Partitioning technology, which allows a single GPU card to be allocated to multiple containers and users at the same time while keeping their resources independent.
- Provide an effective three-tier, multi-project management mechanism.
- Provide AI frameworks such as TensorFlow, Keras, DIGITS, and PyTorch, with the ability to add other AI frameworks
- Built-in AI Console Web Portal: students can start an AI development environment on the platform by themselves and develop directly in the Jupyter web IDE
- Built-in Multi-Cloud integration architecture, which the university can use if it needs to integrate other resources in the future
- Students can quickly open the AI computing environment and IDE editor through the web interface, saving a lot of time otherwise spent learning commands and Docker operations
- When multiple people use computing resources, the resources remain independent, unlike the previous solution, where one container affected other containers on the same GPU
- Even with only a single GPU server, there are still 8 GPUs to manage, and the administrator can easily allocate them through this system
- The Computer Center of TMUST has successfully built a shared AI computing platform. In the future, it will work with the "AI Industry Intelligent Application Alliance" to provide an ideal environment for learning and adopting AI across academia and industry