Xubin Wang's Homepage

About me

Welcome to my page! My research interests include advanced machine learning, AI techniques, data analysis, and dynamic optimization. I develop novel machine learning approaches, focusing on foundation models and their scientific applications. My expertise covers data mining for high-dimensional data, feature selection, and knowledge extraction from large datasets, as well as optimizing AI models for efficiency and performance.

I have published in top conferences and journals, including ICML, TKDE, and CSUR, and my work has been featured by MIT Technology Review. My knowledge base construction and management system has also earned more than 200 stars on GitHub.

By drawing on interdisciplinary knowledge in AI, data science, and algorithm design, I aim to advance machine intelligence in ways that benefit a range of scientific and practical domains. You can reach me by email at wangxubin [at] kindlab [dot] site. Any information or suggestions are greatly appreciated.

Research Topics

Large Language Models and Their Applications

We tackle the challenges of text classification in few-shot prompting scenarios by introducing the Reinforced Diverse Example Selector (RDES). RDES uses a reinforcement learning framework, specifically Q-learning, to optimize the selection of diverse reference examples, so that the chosen examples give a balanced representation of the data and improve classification accuracy. We also explore integrating Chain-of-Thought reasoning into the selection process, which further boosts the model's predictive performance. In parallel, we present RAG-QA-Generator, an automated tool for constructing and managing knowledge bases in Retrieval-Augmented Generation (RAG) systems. The tool processes document data and uses large language models to generate high-quality question-answer pairs, automating the development of RAG knowledge bases. Together, these contributions show how such methods can address the complexities of text classification and knowledge management.
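To make the idea of Q-learning for diverse example selection concrete, here is a minimal sketch in Python. It is not the RDES implementation: the function name `select_diverse_examples`, the state encoding (number of picks so far plus the set of labels already covered), the reward (1 when a pick adds a new label, 0 otherwise), and all hyperparameter values are assumptions chosen only to keep the example small.

```python
import random
from collections import defaultdict


def select_diverse_examples(pool, k, episodes=3000, alpha=0.5,
                            gamma=0.9, epsilon=0.3, seed=0):
    """Pick k indices from pool (a list of (text, label) pairs) with tabular Q-learning.

    Hypothetical setup, not the RDES method: the reward only measures
    label coverage, as a stand-in for a real diversity or accuracy signal.
    """
    if k > len(pool):
        raise ValueError("k must not exceed the pool size")
    rng = random.Random(seed)
    q = defaultdict(float)  # (state, action) -> estimated return

    def run_episode(learn):
        chosen, covered = [], frozenset()
        while len(chosen) < k:
            state = (len(chosen), covered)
            actions = [i for i in range(len(pool)) if i not in chosen]
            # Epsilon-greedy: explore at random while learning, else act greedily.
            if learn and rng.random() < epsilon:
                action = rng.choice(actions)
            else:
                action = max(actions, key=lambda i: q[(state, i)])
            label = pool[action][1]
            reward = 0.0 if label in covered else 1.0
            chosen.append(action)
            covered = covered | {label}
            if learn:
                next_state = (len(chosen), covered)
                if len(chosen) < k:
                    future = max(q[(next_state, i)] for i in actions if i != action)
                else:
                    future = 0.0  # episode ends after k picks
                # Standard Q-learning update.
                q[(state, action)] += alpha * (
                    reward + gamma * future - q[(state, action)]
                )
        return chosen

    for _ in range(episodes):
        run_episode(learn=True)
    return run_episode(learn=False)  # greedy rollout with the learned Q-table


# Toy candidate pool; texts and labels are made up for illustration.
POOL = [
    ("great movie", "pos"), ("loved it", "pos"),
    ("terrible plot", "neg"), ("waste of time", "neg"),
    ("it was fine", "neutral"), ("average at best", "neutral"),
]

if __name__ == "__main__":
    for idx in select_diverse_examples(POOL, k=3):
        print(POOL[idx])
```

In a realistic setting the reward would more plausibly come from the downstream classifier's behavior on held-out examples, and the state would need a richer description of the current prompt; the tabular form here only illustrates the update rule.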

Evolutionary Machine Learning and Its Applications

Our research on Evolutionary Machine Learning and its applications addresses challenges in complex domains such as feature selection, biomarker identification, and cancer classification. We have proposed several algorithms and frameworks: [SaWDE, KBS'22], a self-adaptive differential evolution approach for large-scale feature selection; [MEL, TKDE'24], multi-task evolutionary learning through information sharing; [FWPSO, BIBM'22], feature weighting particle swarm optimization for efficient biomarker gene identification from microarray data; [EODE, TCBB'24], an ensemble approach to improved cancer screening through optimized feature selection, modeling, and classification; and [HSNOE, ESWA'24], which combines a hybrid sampling technique with ant colony-based feature selection in an ensemble to better identify hidden responders in imbalanced biological data. Extensive experiments demonstrate the superior and robust performance of these approaches on challenging machine learning problems across domains, while mitigating issues such as local optima, high dimensionality, and poor generalization across datasets.
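As a rough illustration of how differential evolution can be applied to feature selection, the sketch below runs the classic DE/rand/1/bin scheme over real-valued vectors and decodes each vector into a feature mask by thresholding at 0.5. It is not SaWDE or any of the methods listed above: there is no self-adaptation or feature weighting, and the function names, the fixed `f` and `cr` values, and the toy fitness function are all assumptions made for the example.

```python
import random


def decode(vector):
    """Turn a real-valued vector in [0, 1] into a boolean feature mask."""
    return [x > 0.5 for x in vector]


def de_feature_selection(fitness, n_features, pop_size=30, generations=100,
                         f=0.5, cr=0.9, seed=0):
    """Maximize fitness(mask) with plain DE/rand/1/bin (illustrative sketch only)."""
    rng = random.Random(seed)
    pop = [[rng.random() for _ in range(n_features)] for _ in range(pop_size)]
    scores = [fitness(decode(ind)) for ind in pop]
    history = []  # best score after each generation
    for _ in range(generations):
        for i in range(pop_size):
            # Three distinct individuals, all different from the target i.
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            j_rand = rng.randrange(n_features)  # forces at least one mutated gene
            trial = []
            for j in range(n_features):
                if rng.random() < cr or j == j_rand:
                    v = pop[a][j] + f * (pop[b][j] - pop[c][j])
                    trial.append(min(1.0, max(0.0, v)))  # clip to [0, 1]
                else:
                    trial.append(pop[i][j])
            trial_score = fitness(decode(trial))
            # Greedy selection: the trial replaces the target if it is no worse.
            if trial_score >= scores[i]:
                pop[i], scores[i] = trial, trial_score
        history.append(max(scores))
    best = max(range(pop_size), key=scores.__getitem__)
    return decode(pop[best]), scores[best], history


# Toy fitness (an assumption): reward three "informative" features and
# penalize every other selected feature. A real fitness would typically be
# the cross-validated accuracy of a classifier trained on the selected columns.
INFORMATIVE = {0, 3, 5}


def toy_fitness(mask):
    hits = sum(1 for i, on in enumerate(mask) if on and i in INFORMATIVE)
    noise = sum(1 for i, on in enumerate(mask) if on and i not in INFORMATIVE)
    return hits - 0.5 * noise


if __name__ == "__main__":
    mask, score, _ = de_feature_selection(toy_fitness, n_features=8)
    print([i for i, on in enumerate(mask) if on], score)
```

Because selection is greedy, the best score in the population never decreases from one generation to the next, which is why the returned history is useful as a quick convergence check.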

Selected Publications

The full list of publications is available on Google Scholar, ResearchGate, DBLP, or PUBLICATIONS.

[1] Wang, X., Wu, J., Yuan, Y., Li, M., Cai, D., & Jia, W. (2025). Demonstration Selection for In-Context Learning via Reinforcement Learning. In Forty-Second International Conference on Machine Learning (ICML). PMLR. (CCF A) [PDF] [Poster] [Slides] [Blog] [News1, News2]
[2] Wang, X., Tang, Z., Guo, J., Meng, T., Wang, C., Wang, T., & Jia, W. (2025). Empowering Edge Intelligence: A Comprehensive Survey on On-Device AI Models. ACM Computing Surveys, 57(9), 1-39. (SCI I) [PDF] [Audio] [Blog] [News1, News2]
[3] Wang, X., Shangguan, H., Huang, F., Wu, S., & Jia, W. (2024). MEL: Efficient Multi-Task Evolutionary Learning for High-Dimensional Feature Selection. IEEE Transactions on Knowledge and Data Engineering, 36(8), 4020-4033. (CCF A) [PDF] [Supplementary] [Code] [Blog]
[4] Wang, X., Wang, Y., Ma, Z., Wong, K. C., & Li, X. (2024). Exhaustive Exploitation of Nature-inspired Computation for Cancer Screening in an Ensemble Manner. IEEE/ACM Transactions on Computational Biology and Bioinformatics, 21(5), 1366-1379. (CCF B) [PDF] [Code] [Blog]
[5] Wang, X., Wang, Y., Ma, Z., Wong, K. C., & Li, X. (2024). Evolving Pathway Activation from Cancer Gene Expression Data using Nature-inspired Ensemble Optimization. Expert Systems with Applications, 248, 123469. (SCI I) [PDF] [Code]
[6] Wang, X., Wang, Y., Wong, K. C., & Li, X. (2022). A self-adaptive weighted differential evolution approach for large-scale feature selection. Knowledge-Based Systems, 235, 107633. (SCI I) [PDF] [Code] [Poster]
[7] Wang, X., & Jia, W. (2022). A Feature Weighting Particle Swarm Optimization Method to Identify Biomarker Genes. In 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM) (pp. 830-834). IEEE. (CCF B) [PDF] [Code] [Video]

Research Interests

Advanced Machine Learning and AI Techniques

Data Analysis and Knowledge Discovery

Optimization and Model Efficiency

Contact

wangxubin [at] kindlab [dot] site