takethepoints
Helluva Engineer
You've put your finger on why I won't be the person to do this. I don't know enough about machine learning to pull it off. My bad: this is where a lot of research is going, mainly because it's relatively easier to find and analyze massive amounts of text than to try to get the stats agencies to collect more data. I'm sticking to those relatively more reliable sources in my own research; recent work shows that perhaps as much as 40% of internet traffic to sites involving "hot" issues is generated by bots. But none of that solves this problem. Perhaps there's a grad student in sports management somewhere …
This is an interesting problem. There's substantial published work, probably quite similar, in politics on questions like approval or likeability, and in marketing on brand perception. These tend to use word-vector-based machine learning models. In many ways the blueprint is there to take up, and tuning the model for 'energy' would be a worthwhile exercise if you want to get into machine learning on language datasets. Of course, you'd need a data source, like LexisNexis, and some machine learning compute resources. The first is more expensive. Would be fun to hack around on.
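For anyone curious what "tuning a word-vector model for 'energy'" might look like at toy scale, here is a minimal sketch. Everything in it is an assumption made for illustration: the three-dimensional vectors are invented, the seed words are arbitrary, and the names `TOY_VECTORS`, `energy_axis`, and `energy_score` are hypothetical. A real attempt would load pretrained embeddings and validate the seeds against labeled text.

```python
import math

# Toy 3-d "word vectors", invented for illustration only. A real model
# would load pretrained embeddings instead of this hand-made table.
TOY_VECTORS = {
    "excited": [0.9, 0.1, 0.0],
    "hyped": [0.8, 0.2, 0.1],
    "bored": [-0.8, 0.1, 0.2],
    "apathetic": [-0.9, 0.0, 0.1],
    "stadium": [0.0, 0.9, 0.1],
    "season": [0.0, 0.8, 0.3],
}

# Hypothetical seed words marking the two ends of the "energy" scale.
HIGH_SEEDS = ["excited", "hyped"]
LOW_SEEDS = ["bored", "apathetic"]


def mean_vector(words):
    """Average the vectors of the given in-vocabulary words."""
    vectors = [TOY_VECTORS[w] for w in words]
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def cosine(a, b):
    """Cosine similarity, returning 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def energy_axis():
    """Direction pointing from the low-energy seeds to the high-energy seeds."""
    high = mean_vector(HIGH_SEEDS)
    low = mean_vector(LOW_SEEDS)
    return [h - l for h, l in zip(high, low)]


def energy_score(text):
    """Score text in [-1, 1] by projecting its mean word vector on the axis."""
    tokens = [t.strip(".,!?;:'\"").lower() for t in text.split()]
    known = [t for t in tokens if t in TOY_VECTORS]
    if not known:
        return 0.0  # nothing in the vocabulary, so no signal either way
    return cosine(mean_vector(known), energy_axis())


if __name__ == "__main__":
    for sample in ["Fans are excited and hyped!", "A bored, apathetic crowd."]:
        print(sample, round(energy_score(sample), 3))
```

The seed-axis approach is only one of several ways to get a score out of embeddings; it is used here because it needs no training data, which is the scarce resource in this problem.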
I think the larger model you'd build after making a hype metric would be problematic, though. I suspect that trying to control for resources with a pretty limited data set would be tough.