Welcome to my personal website, where I present my work and connect with others. Here you can follow my research progress in Intelligent Computing and other areas of interest. If you have any suggestions, please feel free to let me know.
I work on foundation models, agent systems, and data-centric optimization, with a practical goal: to convert model capability into reliable product value under real-world constraints (quality, latency, memory, privacy, cost). My work studies how better selection mechanisms across data, prompts, and deployment paths can improve both model reasoning and system efficiency.
Technically, I combine reinforcement learning, evolutionary optimization, and robust representation learning to build an end-to-end stack from data curation to LLM/agent deployment. This includes: (1) demonstration selection for in-context learning (RDES, ICML 2025), (2) lightweight meta-selection for scalable prompting (Meta-Sel, 2026), (3) robust data-centric pipelines for imbalanced and noisy structured data (MEL/EODE/HSNOE/GMR), and (4) system-level optimization for on-device and edge intelligence (On-Device AI, CSUR 2025; Cognitive Edge Computing, 2026).
For frontier foundation-model and agent roles, I map this foundation onto four capability directions: data engine quality (selection and filtering for reliable training and evaluation sets), post-training and reasoning efficiency (prompt and demonstration optimization), agent reliability (tool-use orchestration and controllable autonomy), and evaluation and deployment (joint measurement of quality, efficiency, and safety trade-offs). My publications appear in ICML, TKDE, TCBB, CSUR, KBS, and related venues.
Contact: xubin.wang [at] kindlab.site · Collaboration welcome.
This framework organizes all studies (2022-2026) into one systematic line: building intelligence under resource constraints through selection. The trajectory evolves from data/feature optimization to LLM and agent intelligence, and translates into role-relevant capabilities: data engine quality, reasoning efficiency, agent orchestration, and reliable evaluation/deployment.
FWPSO and SaWDE establish search-space reduction, supporting efficient data filtering and curation.
MEL/EODE/HSNOE improve transferability and robustness, enabling higher-quality training/eval data pipelines.
RDES and Meta-Sel target low-cost, high-quality reasoning through better demonstration selection.
On-device and Cognitive Edge studies connect model quality, system constraints, and reliable deployment.
Data engine quality: selection, denoising, and imbalance-aware data processing for robust training and evaluation sets.
Reasoning efficiency: prompt/demo optimization to improve reasoning quality under strict latency and token budgets.
Agent orchestration: from single-model inference to controllable multi-agent workflows for practical task completion.
Reliable evaluation/deployment: joint optimization of quality, latency, memory, privacy, and operational safety in real-world systems.
The research program is organized into two core directions with clear translational value for frontier model and agent product work: (1) Large Language Models, Agents and Their Applications, and (2) Data-Centric AI and Its Applications.
Core Focus: Build efficient and deployable intelligence for LLM-based reasoning and autonomous agent collaboration under real-world constraints.
Role-Relevant Translation: inference-time optimization, post-training style selection, controllable tool-use, and robust agent execution.
Representative Works:
[RDES, ICML 2025]
[Cognitive Edge Computing, 2026]
[On-Device AI, CSUR 2025]
[Meta-Sel, arXiv 2026]
Application Value: intelligent text annotation, resource-aware in-context reasoning, and edge-side assistants/agents with better cost-latency-privacy trade-offs.
Core Focus: Improve learning reliability by selecting and refining informative features and samples from high-dimensional, noisy, and imbalanced data.
Role-Relevant Translation: data quality engineering, benchmark robustness, and stable generalization for model development and evaluation.
Representative Works:
[SaWDE, KBS 2022]
[FWPSO, BIBM 2022]
[MEL, TKDE 2024]
[EODE, TCBB 2024]
[HSNOE, ESWA 2024]
[GMR, arXiv 2026]
Application Value: robust biomedical data processing, biomarker/pathway discovery, and stable decision support for imbalanced structured learning tasks.