Sheng-Chieh (Jack) Lin

Ph.D. Student, University of Waterloo

About Me

Hi, I’m a Ph.D. student in the David R. Cheriton School of Computer Science at the University of Waterloo, where I have been supervised by Jimmy Lin since 2020. My research focuses on dense retrieval for text search, including how to integrate lexical and semantic features into dense representations and how to apply dense retrieval to broader scenarios such as conversational, multilingual, and multimodal search.

Previously, I was a senior engineer at TSMC, where I worked on improving CMOS image sensors through data analysis and semiconductor domain knowledge. This experience shapes my research philosophy: human prior knowledge is the key to telling the story behind data and solving problems.

Publications

FLAME: Factuality-Aware Alignment for Large Language Models.
Sheng-Chieh Lin*, Luyu Gao, Barlas Oguz, Wenhan Xiong, Jimmy Lin, Wen-tau Yih, Xilun Chen*.
NeurIPS (just accepted), Dec 2024. [arxiv]

Unifying Multimodal Retrieval via Document Screenshot Embedding.
Xueguang Ma, Sheng-Chieh Lin, Minghan Li, Wenhu Chen, Jimmy Lin.
EMNLP (just accepted), Nov 2024. [arxiv]

mAggretriever: A Simple yet Effective Approach to Zero-Shot Multilingual Dense Retrieval.
Sheng-Chieh Lin, Amin Ahmad, and Jimmy Lin.
EMNLP, Nov 2023. [code]

How to Train Your DRAGON: Diverse Augmentation Towards Generalizable Dense Retrieval.
Sheng-Chieh Lin*, Akari Asai, Minghan Li, Barlas Oguz, Jimmy Lin, Yashar Mehdad, Wen-tau Yih, and Xilun Chen*.
EMNLP Findings, Nov 2023. [code][arxiv]

One Blade for One Purpose: Advancing Math Information Retrieval using Hybrid Search.
Wei Zhong, Sheng-Chieh Lin, Jheng-Hong Yang, and Jimmy Lin.
SIGIR, Jul 2023.

SLIM: Sparsified Late Interaction for Multi-Vector Retrieval with Inverted Indexes.
Minghan Li, Sheng-Chieh Lin, Xueguang Ma, and Jimmy Lin.
SIGIR, Jul 2023.

Improving Conversational Passage Re-ranking with View Ensemble.
Jia-Huei Ju, Sheng-Chieh Lin, Ming-Feng Tsai, Chuan-Ju Wang.
SIGIR, Jul 2023. [arxiv]

CITADEL: Conditional Token Interaction via Dynamic Lexical Routing for Efficient and Effective Multi-Vector Retrieval.
Minghan Li, Sheng-Chieh Lin, Barlas Oguz, Asish Ghoshal, Jimmy Lin, Yashar Mehdad, Wen-tau Yih, Xilun Chen.
ACL, Jul 2023. [code][arxiv]

Aggretriever: A Simple Approach to Aggregate Textual Representation for Robust Dense Passage Retrieval.
Sheng-Chieh Lin, Minghan Li, Jimmy Lin.
Transactions of the Association for Computational Linguistics, May 2023. [code][arxiv]

A Dense Representation Framework for Lexical and Semantic Matching.
Sheng-Chieh Lin, Jimmy Lin.
ACM Transactions on Information Systems, Apr 2023. [code][arxiv]

Contextualized Query Embeddings for Conversational Search.
Sheng-Chieh Lin, Jheng-Hong Yang, Jimmy Lin.
EMNLP, Nov 2021. [code][Pyserini][arxiv]

In-Batch Negatives for Knowledge Distillation with Tightly-Coupled Teachers for Dense Retrieval.
Sheng-Chieh Lin*, Jheng-Hong Yang*, Jimmy Lin.
ACL Workshop on Representation Learning for NLP (RepL4NLP), Aug 2021. [code][Pyserini][arxiv]

Multi-Stage Conversational Passage Retrieval: An Approach to Fusing Term Importance Estimation and Neural Query Rewriting.
Sheng-Chieh Lin*, Jheng-Hong Yang*, Rodrigo Nogueira, Ming-Feng Tsai, Chuan-Ju Wang, and Jimmy Lin.
ACM Transactions on Information Systems, Aug 2021. [code][arxiv]

Efficiently Teaching an Effective Dense Retriever with Balanced Topic Aware Sampling.
Sebastian Hofstätter, Sheng-Chieh Lin, Jheng-Hong Yang, Jimmy Lin, and Allan Hanbury.
SIGIR, July 2021. [code]

Chatty Goose: A Python Framework for Conversational Search.
Edwin Zhang, Sheng-Chieh Lin, Jheng-Hong Yang, Ronak Pradeep, Rodrigo Nogueira, and Jimmy Lin.
SIGIR (Demonstrations), July 2021. [code]

Pyserini: A Python Toolkit for Reproducible Information Retrieval Research with Sparse and Dense Representations.
Jimmy Lin, Xueguang Ma, Sheng-Chieh Lin, Jheng-Hong Yang, Ronak Pradeep, and Rodrigo Nogueira.
SIGIR (Resource), July 2021. [code]

Designing Templates for Eliciting Commonsense Knowledge from Pretrained Sequence-to-Sequence Models.
Jheng-Hong Yang*, Sheng-Chieh Lin*, Rodrigo Nogueira, Ming-Feng Tsai, Chuan-Ju Wang, and Jimmy Lin.
COLING, December 2020. [arxiv]

Personalized TV Recommendation: Fusing User Behavior and Preferences.
Sheng-Chieh Lin*, Ting-Wei Lin*, Jing-Kai Lou, Ming-Feng Tsai, Chuan-Ju Wang.
arXiv:2104.08707, August 2020.

Negative-Aware Collaborative Filtering.
Sheng-Chieh Lin, Yu-Neng Chuang, Sheng-Fang Yang, Ming-Feng Tsai, and Chuan-Ju Wang.
RecSys (Late-Breaking Results), September 2019.