Welcome to my personal website, which serves as a platform for self-presentation and communication with others. Here, you can learn about my research progress in Intelligent Computing and other areas of interest. If you have any suggestions, please feel free to let me know.
Welcome to my page! My research interests include advanced machine learning, AI techniques, data analysis, and dynamic optimization. I develop novel approaches in machine learning, focusing on foundation models and their scientific applications. My expertise encompasses data mining for high-dimensional data, feature selection, and knowledge extraction from large datasets, along with optimizing AI models for better efficiency and performance. I have published in top conferences and journals, such as ICML, TKDE, and CSUR, and my work has been featured by MIT Technology Review. Additionally, my knowledge base construction and management system has garnered more than 200 stars on GitHub, reflecting its value and utility within the community. By leveraging interdisciplinary knowledge in AI, data science, and algorithm design, I aim to achieve breakthroughs in machine intelligence that benefit various scientific and practical domains. You can reach me at email: wangxubin [at] kindlab [dot] site. Any information or suggestions would be greatly appreciated.
We tackle the challenges of text classification, particularly in few-shot prompting scenarios, by introducing the Reinforced Diverse Example Selector (RDES). RDES employs a reinforcement learning framework, specifically Q-learning, to optimize the selection of diverse reference examples, ensuring a balanced representation of the data that improves classification accuracy. We also explore integrating Chain-of-Thought reasoning into the selection process, which further boosts the model's predictive performance. In parallel, we present the RAG-QA-Generator, an automated tool for constructing and managing knowledge bases within Retrieval-Augmented Generation (RAG) systems. This tool processes document data and uses large language models to generate high-quality question-answer pairs, automating the development of RAG knowledge bases. Together, these contributions highlight the potential of advanced methodologies in addressing the complexities of text classification and knowledge management.
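To make the idea of Q-learning over example selection concrete, here is a minimal sketch. It is not the RDES implementation: the state encoding (step index plus the set of labels already covered), the reward (1 for an example whose label is not yet covered, 0 otherwise), and the names `train_diverse_selector` and `select_examples` are all assumptions chosen for illustration.

```python
import random
from collections import defaultdict


def train_diverse_selector(labels, k, episodes=2000, alpha=0.5,
                           gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning for picking k examples with diverse labels.

    Hypothetical toy setup (not the RDES method itself):
    state = (step, labels covered so far), action = index of an example,
    reward = 1.0 if the chosen example adds a new label, else 0.0.
    Requires k <= len(labels).
    """
    rng = random.Random(seed)
    q = defaultdict(float)
    n = len(labels)
    for _ in range(episodes):
        chosen, covered = [], frozenset()
        for step in range(k):
            state = (step, covered)
            avail = [i for i in range(n) if i not in chosen]
            # Epsilon-greedy exploration over the examples not yet chosen.
            if rng.random() < epsilon:
                action = rng.choice(avail)
            else:
                action = max(avail, key=lambda i: q[(state, i)])
            reward = 1.0 if labels[action] not in covered else 0.0
            chosen.append(action)
            covered = covered | {labels[action]}
            # Bootstrap from the best remaining action unless the episode ends.
            future = 0.0
            if step + 1 < k:
                nxt = (step + 1, covered)
                future = max(q[(nxt, i)] for i in range(n) if i not in chosen)
            q[(state, action)] += alpha * (reward + gamma * future - q[(state, action)])
    return q


def select_examples(q, labels, k):
    """Follow the learned Q-table greedily to pick k example indices."""
    chosen, covered = [], frozenset()
    for step in range(k):
        state = (step, covered)
        avail = [i for i in range(len(labels)) if i not in chosen]
        action = max(avail, key=lambda i: q[(state, i)])
        chosen.append(action)
        covered = covered | {labels[action]}
    return chosen


# Usage: an imbalanced pool where a naive "first k" pick would repeat label "a".
pool_labels = ["a", "a", "a", "b", "b", "c"]
q_table = train_diverse_selector(pool_labels, k=3)
picked = select_examples(q_table, pool_labels, k=3)
```

A real selector would presumably reward downstream classification accuracy rather than label coverage alone; the coverage reward is used here only because it keeps the sketch self-contained.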
Our research at the intersection of Evolutionary Machine Learning and its applications has focused on addressing prominent challenges in complex domains such as feature selection, biomarker identification, and cancer classification. We have proposed novel algorithms and frameworks such as [SaWDE, KBS'22] for large-scale feature selection using a self-adaptive differential evolution approach; [MEL, TKDE'24] for multi-task evolutionary learning through information sharing; [FWPSO, BIBM'22] for efficient biomarker gene identification from microarray data via feature weighting particle swarm optimization; [EODE, TCBB'24] for ensemble-based improved cancer screening through optimized feature selection, modeling, and classification; and [HSNOE, ESWA'24], which leverages a hybrid sampling technique and ant colony-based feature selection within an ensemble to enhance identification of hidden responders in imbalanced biological data. Extensive experiments demonstrate the strong and robust performance of our proposed approaches, validating their ability to provide effective solutions for challenging machine learning problems across domains while overcoming issues such as local optima, high dimensionality, and generalization across datasets.
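As a rough illustration of evolutionary feature selection, the sketch below applies plain differential evolution (DE/rand/1/bin) to real-valued vectors that are thresholded into feature masks. It is not SaWDE and has no self-adaptive component; the function name `de_feature_selection`, the 0.5 threshold, the parameter defaults, and the toy fitness function are all assumptions made for this example.

```python
import random


def de_feature_selection(fitness, n_features, pop_size=20, generations=60,
                         f=0.5, cr=0.9, seed=0):
    """Maximize fitness(mask) with basic DE/rand/1/bin (requires pop_size >= 4).

    Each individual is a vector in [0, 1]^n_features; a feature counts as
    selected when its entry exceeds 0.5 (a hypothetical encoding choice).
    """
    rng = random.Random(seed)

    def to_mask(vec):
        return [v > 0.5 for v in vec]

    pop = [[rng.random() for _ in range(n_features)] for _ in range(pop_size)]
    scores = [fitness(to_mask(ind)) for ind in pop]
    history = [max(scores)]  # best score after each generation
    for _ in range(generations):
        for i in range(pop_size):
            # Three distinct individuals, all different from the target i.
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            j_rand = rng.randrange(n_features)  # forces at least one mutated gene
            trial = []
            for j in range(n_features):
                if j == j_rand or rng.random() < cr:
                    v = pop[a][j] + f * (pop[b][j] - pop[c][j])
                    v = min(1.0, max(0.0, v))  # clip back into [0, 1]
                else:
                    v = pop[i][j]
                trial.append(v)
            s = fitness(to_mask(trial))
            # Greedy replacement, applied in place within the generation.
            if s >= scores[i]:
                pop[i], scores[i] = trial, s
        history.append(max(scores))
    best = max(range(pop_size), key=lambda i: scores[i])
    return to_mask(pop[best]), scores[best], history


# Toy fitness (assumed): reward selecting features 0 and 3, penalize mask size.
RELEVANT = {0, 3}


def toy_fitness(mask):
    hits = sum(1 for j, on in enumerate(mask) if on and j in RELEVANT)
    return hits - 0.1 * sum(mask)


best_mask, best_score, best_history = de_feature_selection(toy_fitness, n_features=8)
```

In practice the fitness would be something like the cross-validated accuracy of a classifier trained on the selected features; the toy function stands in for that so the sketch runs without external data or libraries.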