ASReview Model Selection Guide
Tags: ASReview · Machine Learning · Systematic Reviews
Earlier this year, I wrote a guest post for the ASReview blog on how to choose models within ASReview.
Since model selection is one of the most common questions from new users, I wanted to make sure it's referenced here as well.
Highlights from the blog
- ASReview is a Swiss Army knife: it supports multiple feature extractors and classifiers, giving researchers the freedom to tailor the pipeline to their project.
- Feature extractors:
  - TF-IDF → lightweight, fast, and interpretable, but ignores context.
  - Doc2Vec → learns semantics from scratch, more context-aware, but computationally heavy.
  - SBERT → transformer-based, multilingual, powerful on semantics, but memory-intensive.
- Classifiers:
  - Naive Bayes → very fast, pairs well with TF-IDF.
  - Random Forest → robust ensemble, balances accuracy and stability.
  - SVM → effective in high dimensions, but slower on large datasets.
  - Logistic Regression → solid statistical baseline, efficient.
  - Neural Networks → powerful, but require more data and compute.
- Processing times vary widely: computing TF-IDF features takes seconds, Doc2Vec takes minutes, and SBERT can take hours. The classifiers themselves usually run in under 10 seconds per cycle.
- Best model? → There isn’t one. The “best” choice depends on your dataset and research question. ASReview offers verified combinations (like TF-IDF + Naive Bayes, or SBERT + XGBoost) as starting points.
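To make the TF-IDF + Naive Bayes starting point concrete, here is a minimal scikit-learn sketch of that combination. This is an illustration of the underlying idea, not ASReview's actual implementation: the abstracts and labels are made up, and a real active-learning run would retrain after every labeling decision.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Hypothetical labeled abstracts: 1 = relevant, 0 = irrelevant.
labeled_texts = [
    "active learning for systematic review screening",
    "machine learning reduces title and abstract screening workload",
    "surgical outcomes after knee replacement",
    "randomized trial of a new hypertension drug",
]
labels = [1, 1, 0, 0]

# The TF-IDF + Naive Bayes combination as a single pipeline.
model = make_pipeline(TfidfVectorizer(stop_words="english"), MultinomialNB())
model.fit(labeled_texts, labels)

# Score unlabeled records by predicted relevance; an active-learning tool
# surfaces the highest-scoring records for the reviewer first.
unlabeled = [
    "text screening with machine learning",
    "drug dosing in hypertension patients",
]
scores = model.predict_proba(unlabeled)[:, 1]
for text, score in sorted(zip(unlabeled, scores), key=lambda pair: -pair[1]):
    print(f"{score:.2f}  {text}")
```

Swapping in a different extractor or classifier is a one-line change to the pipeline, which is what makes comparing the combinations above so cheap to try.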
👉 You can read the full post on the ASReview Blog.