Browser-based AI for Biology: Fine-tuned ProtBERT for Subcellular Localization Prediction
How This Works:
Uses a fine-tuned version of ProtBERT (a protein language model from Rostlab, available on Hugging Face) that predicts the subcellular compartment in which a protein resides (10 classes, including nucleus, cytoplasm, and mitochondrion)
Converted from PyTorch to ONNX format using torch.onnx.export so that the model can run efficiently in web browsers via Transformers.js
Applied INT8 quantization using ONNX Runtime to reduce model size from 1.6 GB to 404 MB (about a 75% reduction), enabling faster download and inference while maintaining comparable accuracy
Testing indicates that the model's limited accuracy is not caused by quantization: the original unquantized model performs similarly, which points to limitations of the fine-tuned model itself
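The export and quantization steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual script: the function names, the file paths, the use of BertForSequenceClassification, the choice of dynamic (weight-only) quantization, and the opset version are all assumptions. The sequence formatting follows the Rostlab/prot_bert model card, which expects space-separated residues with U, Z, O and B mapped to X.

```python
import re


def format_protein_sequence(sequence: str) -> str:
    """Prepare a raw amino-acid string for the ProtBERT tokenizer.

    ProtBERT expects uppercase residues separated by spaces, with the rare
    residues U, Z, O and B replaced by X.
    """
    cleaned = re.sub(r"\s+", "", sequence).upper()
    cleaned = re.sub(r"[UZOB]", "X", cleaned)
    return " ".join(cleaned)


def export_and_quantize(checkpoint_dir: str, fp32_path: str, int8_path: str) -> None:
    """Export a fine-tuned classifier to ONNX, then quantize its weights to INT8.

    All arguments are hypothetical paths chosen for illustration.
    """
    # Heavy dependencies are imported lazily so the helper above works alone.
    import torch
    from onnxruntime.quantization import QuantType, quantize_dynamic
    from transformers import BertForSequenceClassification, BertTokenizer

    tokenizer = BertTokenizer.from_pretrained(checkpoint_dir, do_lower_case=False)
    model = BertForSequenceClassification.from_pretrained(checkpoint_dir)
    model.eval()
    model.config.return_dict = False  # return plain tuples, simpler to trace

    # A short dummy input is enough; dynamic axes keep batch and length flexible.
    example = tokenizer(format_protein_sequence("MKTAYIAKQR"), return_tensors="pt")

    torch.onnx.export(
        model,
        (example["input_ids"], example["attention_mask"]),
        fp32_path,
        input_names=["input_ids", "attention_mask"],
        output_names=["logits"],
        dynamic_axes={
            "input_ids": {0: "batch", 1: "sequence"},
            "attention_mask": {0: "batch", 1: "sequence"},
            "logits": {0: "batch"},
        },
        opset_version=14,  # assumed; any opset supported by the runtime works
    )

    # Dynamic quantization (assumed here) stores weights as INT8 and
    # quantizes activations on the fly at inference time.
    quantize_dynamic(fp32_path, int8_path, weight_type=QuantType.QInt8)
```

A call such as export_and_quantize("./finetuned-protbert", "model.onnx", "model_quantized.onnx") would produce both files; the input names and file layout that the browser runtime expects may differ from this sketch.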
Purpose: This demo shows that biology foundation models can run entirely in the browser, which simplifies deployment and removes backend server costs. However, model size must be kept small enough that download and load times remain acceptable to users.
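The size figures quoted above can be sanity-checked with a few lines of arithmetic. The sketch below assumes 1 GB = 1000 MB and that nearly all of the model's size comes from weights stored as 32-bit floats before quantization and 8-bit integers after; under those assumptions the ideal reduction is 75%, which is close to what was measured.

```python
FP32_BYTES_PER_WEIGHT = 4  # 32-bit float weights before quantization
INT8_BYTES_PER_WEIGHT = 1  # 8-bit integer weights after quantization


def percent_reduction(original: float, reduced: float) -> float:
    """Return the size reduction as a percentage of the original size."""
    return 100.0 * (original - reduced) / original


# Ideal case: every weight shrinks from 4 bytes to 1 byte.
ideal = percent_reduction(FP32_BYTES_PER_WEIGHT, INT8_BYTES_PER_WEIGHT)

# Reported sizes: 1.6 GB (taken as 1600 MB) down to 404 MB.
reported = percent_reduction(1600.0, 404.0)

print(f"ideal reduction:    {ideal:.2f}%")     # 75.00%
print(f"reported reduction: {reported:.2f}%")  # 74.75%
```

The small gap between the two numbers is expected, since some tensors and metadata are typically left unquantized.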