Suggest tags based on the content of a question posted on Stack Overflow.
1] We model with fewer data points (0.5M) and give more weight to the title.
2] We limit the tags to 500.
3] The above steps reduce the time needed to train the model.
4] Training on the whole dataset would require substantial computational resources.
5] With 500 tags we cover 90.956% of questions.
6] When we apply OneVsRest logistic regression on BOW, we get a macro F1 score of 0.3338.
7] TF-IDF performs better than BOW on this dataset.
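
Points 1, 2 and 5 can be illustrated with a minimal sketch in plain Python. It rests on assumptions not stated above: the title is weighted by repeating it before the body, the kept tags are the most frequent ones, and a question counts as covered when at least one of its tags is kept. The function names, the title_weight parameter and the toy data are hypothetical and exist only for illustration.

```python
from collections import Counter


def weight_title(title, body, title_weight=3):
    """Repeat the title so its tokens count more in BOW/TF-IDF features.

    Repetition is an assumed weighting scheme, not necessarily the one used.
    """
    return " ".join([title] * title_weight + [body])


def top_k_tags(tag_lists, k):
    """Return the k most frequent tags across all questions."""
    counts = Counter(tag for tags in tag_lists for tag in tags)
    return [tag for tag, _ in counts.most_common(k)]


def coverage(tag_lists, kept_tags):
    """Percentage of questions that retain at least one kept tag (assumed definition)."""
    kept = set(kept_tags)
    covered = sum(1 for tags in tag_lists if kept.intersection(tags))
    return 100.0 * covered / len(tag_lists)


# Toy data: one list of tags per question (hypothetical).
question_tags = [
    ["python", "pandas"],
    ["python"],
    ["java"],
    ["c++", "templates"],
    ["java", "spring"],
]

kept = top_k_tags(question_tags, k=2)
pct = coverage(question_tags, kept)
text = weight_title("merge two dataframes", "how do I join them on a key")
print(kept, pct)
print(text)
```

On the real data the same coverage function would be called with k=500. Questions left with no kept tag contribute no positive label, which is why the coverage figure matters when the tag set is truncated.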