I trained all the generative models from scratch. Pretrained models are not that helpful when it’s important to accurately capture very domain-specific features.
One of the classifiers I tried was based on zoobot with a custom head. Assuming the publications around zoobot are truthful, it was trained exclusively on similar data from a multitude of different sky surveys.
That data is also publicly available (of course), so a model could be trained on it. I’d love to say I doubt Google/YouTube would ever do that, but at this point nothing would surprise me.