Preprints
FLM-101B: An Open LLM and How to Train It with $100K Budget
Xiang Li, Yiqun Yao, Xin Jiang, Xuezhi Fang, Xuying Meng, Siqi Fan, Peng Han, Jing Li, Li Du, Bowen Qin, Zheng Zhang, Aixin Sun and Yequan Wang
PDF  
Code  
Abstract  
BibTex
Large language models (LLMs) have achieved remarkable success in NLP and multimodal tasks. Despite these successes, their development faces two main challenges: (i) high computational cost; and (ii) difficulty in conducting fair and objective evaluations. LLMs are prohibitively expensive, making it feasible for only a few major players to undertake their training, thereby constraining both research and application opportunities. This underscores the importance of cost-effective LLM training. In this paper, we utilize a growth strategy to significantly reduce LLM training cost. We demonstrate that an LLM with 101B parameters and 0.31TB tokens can be trained on a $100K budget. We also adopt a systematic evaluation paradigm for the IQ evaluation of LLMs, in complement to existing evaluations that focus more on knowledge-oriented abilities. We introduce a benchmark that evaluates important aspects of intelligence, including symbolic mapping, rule understanding, pattern mining, and anti-interference. Such evaluations minimize the potential impact of memorization. Experimental results show that our model FLM-101B, trained with a budget of $100K, achieves comparable performance to powerful and well-known models, e.g., GPT-3 and GLM-130B, especially in the IQ benchmark evaluations with contexts unseen in training data. The checkpoint of FLM-101B will be open-sourced at https://huggingface.co/CofeAI/FLM-101B.
@article{DBLP:journals/corr/abs-2309-03852,
author = {Xiang Li and
Yiqun Yao and
Xin Jiang and
Xuezhi Fang and
Xuying Meng and
Siqi Fan and
Peng Han and
Jing Li and
Li Du and
Bowen Qin and
Zheng Zhang and
Aixin Sun and
Yequan Wang},
title = {{FLM-101B:} An Open {LLM} and How to Train It with {\textdollar}100K
Budget},
journal = {CoRR},
volume = {abs/2309.03852},
year = {2023},
url = {https://doi.org/10.48550/arXiv.2309.03852},
doi = {10.48550/arXiv.2309.03852},
eprinttype = {arXiv},
}
Function-to-Style Guidance of LLMs for Code Translation
Longhui Zhang, Bin Wang, Jiahao Wang, Xiaofeng Zhao, Min Zhang, Hao Yang, Meishan Zhang, Yu Li, Jing Li, Jun Yu, Min Zhang
ICML-25- The Forty-Second International Conference on Machine Learning, 2025.
PDF  
Code  
Abstract  
BibTex
Large language models (LLMs) have made significant strides in code translation tasks. However, ensuring both the correctness and readability of translated code remains a challenge, limiting their effective adoption in real-world software development. In this work, we propose F2STrans, a function-to-style guiding paradigm designed to progressively improve the performance of LLMs in code translation. Our approach comprises two key stages: (1) Functional learning, which optimizes translation correctness using high-quality source-target code pairs mined from online programming platforms, and (2) Style learning, which improves translation readability by incorporating both positive and negative style examples. Additionally, we introduce a novel code translation benchmark that includes up-to-date source code, extensive test cases, and manually annotated ground-truth translations, enabling comprehensive functional and stylistic evaluations. Experiments on both our new benchmark and existing datasets demonstrate that our approach significantly improves code translation performance. Notably, our approach enables Qwen-1.5B to outperform prompt-enhanced Qwen-32B and GPT-4 on average across 20 diverse code translation scenarios.
Few-Shot Learner Generalizes Across AI-Generated Image Detection
Shiyu Wu, Jing Liu, Jing Li, Yequan Wang
ICML-25- The Forty-Second International Conference on Machine Learning, 2025.
PDF  
Code  
Abstract  
BibTex
Current fake image detectors trained on large synthetic image datasets perform satisfactorily on limited studied generative models. However, these detectors suffer a notable performance decline over unseen models. Besides, collecting adequate training data from online generative models is often expensive or infeasible. To overcome these issues, we propose Few-Shot Detector (FSD), a novel AI-generated image detector which learns a specialized metric space to effectively distinguish unseen fake images by utilizing very few samples. Experiments show that FSD achieves state-of-the-art performance in average ACC on the GenImage dataset. More importantly, our method is better capable of capturing the intra-category common features in unseen images without further training.
Knowledge Editing with Dynamic Knowledge Graphs for Multi-hop Question Answering
Yifan Lu, Yigeng Zhou, Jing Li, Yequan Wang, Xuebo Liu, Daojing He, Fangming Liu, Min Zhang
AAAI-25- The Thirty-Ninth AAAI Conference on Artificial Intelligence, 2025.
PDF  
Code  
Abstract  
BibTex
Multi-hop question answering (MHQA) poses a significant challenge for large language models (LLMs) due to the extensive knowledge demands involved. Knowledge editing, which aims to precisely modify the LLMs to incorporate specific knowledge without negatively impacting other unrelated knowledge, offers a potential solution for addressing MHQA challenges with LLMs. However, current solutions struggle to effectively resolve issues of knowledge conflicts. Most parameter-preserving editing methods are hindered by inaccurate retrieval and overlook secondary editing issues, which can introduce noise into the reasoning process of LLMs. In this paper, we introduce KEDKG, a novel knowledge editing method that leverages a dynamic knowledge graph for MHQA, designed to ensure the reliability of answers. KEDKG involves two primary steps: dynamic knowledge graph construction and knowledge graph augmented generation. Initially, KEDKG autonomously constructs a dynamic knowledge graph to store revised information while resolving potential knowledge conflicts. Subsequently, it employs a fine-grained retrieval strategy coupled with an entity and relation detector to enhance the accuracy of graph retrieval for LLM generation. Experimental results on benchmarks show that KEDKG surpasses previous state-of-the-art models, delivering more accurate and reliable answers in environments with dynamic information.
@inproceedings{kedkg_aaai_25,
title={Knowledge Editing with Dynamic Knowledge Graphs for Multi-hop Question Answering},
author = {Yifan Lu and Yigeng Zhou and Jing Li and Yequan Wang and Xuebo Liu and Daojing He and Fangming Liu and Min Zhang},
booktitle = {The Thirty-Ninth AAAI Conference on Artificial Intelligence (AAAI)},
year={2025}
}
Impromptu Cybercrime Euphemism Detection
Xiang Li, Yucheng Zhou, Laiping Zhao, Jing Li, Fangming Liu
COLING-25- The 31st International Conference on Computational Linguistics, 2025.
PDF  
Abstract  
BibTex
Detecting euphemisms is essential for content security on various social media platforms, but existing methods designed for detecting euphemisms are ineffective on impromptu euphemisms. In this work, we make a first attempt to explore impromptu euphemism detection and introduce the Impromptu Cybercrime Euphemisms Detection (ICED) dataset. Moreover, we propose a detection framework tailored to this problem, which employs context augmentation modeling and multi-round iterative training. Our detection framework mainly consists of a coarse-grained and a fine-grained classification model. The coarse-grained classification model removes most of the harmless content in the corpus to be detected. The fine-grained model, the impromptu euphemism detector, integrates context augmentation and multi-round iterative training to better predict the actual meaning of a masked token. In addition, we leverage ChatGPT to evaluate the model's capability. Experimental results demonstrate that our approach achieves a remarkable 76-fold improvement compared to the previous state-of-the-art euphemism detector.
@inproceedings{xianglicoling25,
title={Impromptu Cybercrime Euphemism Detection},
author = {Xiang Li and Yucheng Zhou and Laiping Zhao and Jing Li and Fangming Liu},
booktitle = {The 31st International Conference on Computational Linguistics (COLING)},
year={2025}
}
SMSMO: Learning to generate multimodal summary for scientific papers
Xinyi Zhong, Zusheng Tan, Shen Gao, Jing Li, Jiaxing Shen, Jingyu Ji, Jeff Tang, Billy Chiu
ELSEVIER KBS-25- Knowledge-Based Systems, Volume 310, 2025
PDF  
Abstract  
BibTex
Nowadays, publishers like Elsevier increasingly use graphical abstracts (i.e., a pictorial paper summary) along with textual abstracts to facilitate the reading of scientific papers. In such a case, automatically identifying a representative image and generating a suitable textual summary for individual papers can help editors and readers save time when reading and understanding papers. To tackle this case, we introduce the dataset for Scientific Multimodal Summarization with Multimodal Output (SMSMO). Unlike other multimodal tasks, which are performed on generic, medium-size contents (e.g., news), SMSMO needs to tackle longer multimodal contents in papers, with finer-grained multimodality interactions and semantic alignments between images and text. For this, we propose a cross-modality, multi-task learning summarizer (CMT-Sum). It captures the intra- and inter-modality interactions between images and text through a cross-fusion module, and models the finer-grained image–text semantic alignment by jointly generating the text summary, selecting the key image and matching the text and image. Extensive experiments conducted on two newly introduced datasets on the SMSMO task showcase our model’s effectiveness.
@article{zhong2024smsmo,
title={SMSMO: Learning to generate multimodal summary for scientific papers},
author={Zhong, Xinyi and Tan, Zusheng and Gao, Shen and Li, Jing and Shen, Jiaxing and Ji, Jingyu and Tang, Jeff and Chiu, Billy},
journal={Knowledge-Based Systems},
volume={310},
pages={112908},
year={2025},
publisher={Elsevier}
}
Parameter Competition Balancing for Model Merging
Guodong Du, Junlin Lee, Jing Li, Runhua Jiang, Yifei Guo, Shuyang Yu, Hanting Liu, Sim Kuan Goh, Ho-Kin Tang, Daojing He, Min Zhang
NeurIPS-24- The Thirty-eighth Annual Conference on Neural Information Processing Systems, 2024.
PDF  
Code  
Abstract  
BibTex
While fine-tuning pretrained models has become common practice, these models often underperform outside their specific domains. Recently developed model merging techniques enable the direct integration of multiple models, each fine-tuned for distinct tasks, into a single model. This strategy promotes multitasking capabilities without requiring retraining on the original datasets. However, existing methods fall short in addressing potential conflicts and complex correlations between tasks, especially in parameter-level adjustments, posing a challenge in effectively balancing parameter competition across various tasks. This paper introduces an innovative technique named PCB-Merging (Parameter Competition Balancing), a lightweight and training-free technique that adjusts the coefficients of each parameter for effective model merging. PCB-Merging employs intra-balancing to gauge parameter significance within individual tasks and inter-balancing to assess parameter similarities across different tasks. Parameters with low importance scores are dropped, and the remaining ones are rescaled to form the final merged model. We assessed our approach in diverse merging scenarios, including cross-task, cross-domain, and cross-training configurations, as well as out-of-domain generalization. The experimental results reveal that our approach achieves substantial performance enhancements across multiple modalities, domains, model sizes, number of tasks, fine-tuning forms, and large language models, outperforming existing model merging methods.
@inproceedings{guodong24neurips,
title={Parameter Competition Balancing for Model Merging},
author = {Guodong Du and
Junlin Lee and Jing Li and Runhua Jiang and Yifei Guo and Shuyang Yu and Hanting Liu and Sim Kuan Goh and Ho-Kin Tang and Daojing He and Min Zhang},
booktitle = {The Thirty-eighth Annual Conference on Neural Information Processing Systems (NeurIPS)},
year={2024}
}
Multimodal Reasoning with Multimodal Knowledge Graph
Junlin Lee, Yequan Wang, Jing Li and Min Zhang
ACL-24- The 62nd Annual Meeting of the Association for Computational Linguistics, 2024.
PDF  
Abstract  
BibTex
Multimodal reasoning with large language models (LLMs) often suffers from hallucinations and the presence of deficient or outdated knowledge within LLMs. Some approaches have sought to mitigate these issues by employing textual knowledge graphs, but their singular modality of knowledge limits comprehensive cross-modal understanding. In this paper, we propose the Multimodal Reasoning with Multimodal Knowledge Graph (MR-MKG) method, which leverages multimodal knowledge graphs (MMKGs) to learn rich and semantic knowledge across modalities, significantly enhancing the multimodal reasoning capabilities of LLMs. In particular, a relation graph attention network is utilized for encoding MMKGs and a cross-modal alignment module is designed for optimizing image-text alignment. An MMKG-grounded dataset is constructed to equip LLMs with initial expertise in multimodal reasoning through pretraining. Remarkably, MR-MKG achieves superior performance while training on only a small fraction of parameters, approximately 2.25% of the LLM's parameter size. Experimental results on multimodal question answering and multimodal analogy reasoning tasks demonstrate that our MR-MKG method outperforms previous state-of-the-art models.
@inproceedings{junlin24acl,
title={Multimodal Reasoning with Multimodal Knowledge Graph},
author = {Junlin Lee and
Yequan Wang and
Jing Li and
Min Zhang},
booktitle = {The 62nd Annual Meeting of the Association for Computational Linguistics (ACL)},
year={2024}
}
Knowledge Fusion By Evolving Weights of Language Models
Guodong Du, Jing Li, Hanting Liu, Runhua Jiang, Shuyang Yu, Yifei Guo, Sim Kuan Goh and Ho-Kin Tang
ACL-24- Findings of the 62nd Annual Meeting of the Association for Computational Linguistics, 2024.
PDF  
Code  
Abstract  
BibTex
Fine-tuning pre-trained language models, particularly large language models, demands extensive computing resources and can result in varying performance outcomes across different domains and datasets. This paper examines the approach of integrating multiple models from diverse training scenarios into a unified model. This unified model excels across various data domains and exhibits the ability to generalize well on out-of-domain data. We propose a knowledge fusion method named Evolver, inspired by evolutionary algorithms, which does not need further training or additional training data. Specifically, our method involves aggregating the weights of different language models into a population and subsequently generating offspring models through mutation and crossover operations. These offspring models are then evaluated against their parents, allowing for the preservation of those models that show enhanced performance on development datasets. Importantly, our model evolving strategy can be seamlessly integrated with existing model merging frameworks, offering a versatile tool for model enhancement. Experimental results on mainstream language models (i.e., encoder-only, decoder-only, encoder-decoder) reveal that Evolver outperforms previous state-of-the-art models by large margins. The code is publicly available at https://github.com/duguodong7/model-evolution.
@inproceedings{guodong24acl,
title={Knowledge Fusion By Evolving Weights of Language Models},
author = {Guodong Du and
Jing Li and
Hanting Liu and
Runhua Jiang and
Shuyang Yu and
Yifei Guo and
Sim Kuan Goh and
Ho-Kin Tang},
booktitle = {Findings of The 62nd Annual Meeting of the Association for Computational Linguistics (ACL)},
year={2024}
}
Masked Structural Growth for 2x Faster Language Model Pre-training
Yiqun Yao, Zheng Zhang, Jing Li and Yequan Wang
ICLR-24- The Twelfth International Conference on Learning Representations, 2024.
PDF  
Code  
Abstract  
BibTex
Acceleration of large language model pre-training is a critical issue in present NLP research. In this paper, we focus on speeding up pre-training by progressively growing from a small Transformer structure to a large one. There are two main research problems related to progressive growth: growth schedule and growth operator. For growth schedule, existing work has explored multi-stage expansion of depth and feedforward layers. However, the impact of each dimension on the schedule's efficiency is still an open question. For growth operator, existing work relies on the initialization of new weights to inherit knowledge, and achieves only non-strict function preservation, limiting further optimization of training dynamics. To address these issues, we propose Masked Structural Growth (MSG), including growth schedules involving all possible dimensions and strictly function-preserving growth operators that are independent of the initialization of new weights. Experiments show that MSG is significantly faster than related work: we achieve a speed-up of 80% for Bert-base and 120% for Bert-large pre-training. Moreover, MSG is able to improve fine-tuning performances at the same time.
@inproceedings{yao24iclr,
title={Masked Structural Growth for 2x Faster Language Model Pre-training},
author = {Yiqun Yao and
Zheng Zhang and
Jing Li and
Yequan Wang},
booktitle = {The Twelfth International Conference on Learning Representations (ICLR)},
year={2024}
}
Few-Shot Relation Extraction With Dual Graph Neural Network Interaction
Jing Li, Shanshan Feng and Billy Chiu
IEEE TNNLS-24- IEEE Transactions on Neural Networks and Learning Systems, 35(10): 14396-14408, 2024.
PDF  
Abstract  
BibTex
Recent advances in relation extraction with deep neural architectures have achieved excellent performance. However, current models still suffer from two main drawbacks: 1) they require enormous volumes of training data to avoid model overfitting and 2) there is a sharp decrease in performance when the data distribution shifts from one domain to another between training and testing. It is thus vital to reduce the data requirement in training and explicitly model the distribution difference when transferring knowledge from one domain to another. In this work, we concentrate on few-shot relation extraction under domain adaptation settings. Specifically, we propose a novel graph neural network (GNN) based approach for few-shot relation extraction. The approach leverages an edge-labeling dual graph (i.e., an instance graph and a distribution graph) to explicitly model the intraclass similarity and interclass dissimilarity in each individual graph, as well as the instance-level and distribution-level relations across graphs. A dual graph interaction mechanism is proposed to adequately fuse the information between the two graphs in a cyclic flow manner. We extensively evaluate on FewRel1.0 and FewRel2.0 benchmarks under four few-shot configurations. The experimental results demonstrate that it can match or outperform previously published approaches. We also perform experiments to further investigate the parameter settings and architectural choices, and we offer a qualitative analysis.
@article{jing24dualgraph,
title={Few-Shot Relation Extraction With Dual Graph Neural Network Interaction},
author={Li, Jing and Feng, Shanshan and Chiu, Billy},
journal={IEEE Transactions on Neural Networks and Learning Systems (TNNLS)},
volume = {35},
number = {10},
pages = {14396--14408},
year = {2024},
publisher={IEEE}
}
Chain of Thought with Explicit Evidence Reasoning for Few-shot Relation Extraction
Xilai Ma, Jing Li and Min Zhang
EMNLP-23- Findings of The 2023 Conference on Empirical Methods in Natural Language Processing, 2023.
PDF  
Abstract  
BibTex
Few-shot relation extraction involves identifying the type of relationship between two specific entities within a text, using a limited number of annotated samples. A variety of solutions to this problem have emerged by applying meta-learning and neural graph techniques which typically necessitate a training process for adaptation. Recently, the strategy of in-context learning has been demonstrating notable results without the need for training. A few studies have already utilized in-context learning for zero-shot information extraction. Unfortunately, the evidence for inference is either not considered or implicitly modeled during the construction of chain-of-thought prompts. In this paper, we propose a novel approach for few-shot relation extraction using large language models, named CoT-ER, chain-of-thought with explicit evidence reasoning. In particular, CoT-ER first induces large language models to generate evidence using task-specific and concept-level knowledge. Then this evidence is explicitly incorporated into chain-of-thought prompting for relation extraction. Experimental results demonstrate that our CoT-ER approach (with 0% training data) achieves competitive performance compared to the fully-supervised (with 100% training data) state-of-the-art approach on the FewRel1.0 and FewRel2.0 datasets.
@inproceedings{DBLP:conf/emnlp/MaLZ23a,
author = {Xilai Ma and
Jing Li and
Min Zhang},
title = {Chain of Thought with Explicit Evidence Reasoning for Few-shot Relation
Extraction},
booktitle = {Findings of the Association for Computational Linguistics (EMNLP)},
pages = {2334--2352},
year = {2023},
url = {https://aclanthology.org/2023.findings-emnlp.153},
}
Rethinking Document-Level Relation Extraction: A Reality Check
Jing Li, Yequan Wang, Shuai Zhang and Min Zhang
ACL-23- Findings of The 61st Annual Meeting of the Association for Computational Linguistics, 2023.
PDF  
Abstract  
BibTex
Recently, numerous efforts have continued to push up performance boundaries of document-level relation extraction (DocRE) and have claimed significant progress in DocRE. In this paper, we do not aim at proposing a novel model for DocRE. Instead, we take a closer look at the field to see if these performance gains are actually true. By conducting a comprehensive literature review and a thorough examination of popular DocRE datasets, we find that these performance gains are achieved upon a strong or even untenable assumption in common: all named entities are perfectly localized, normalized, and typed in advance. Next, we construct four types of entity mention attacks to examine the robustness of typical DocRE models by behavioral probing. We also closely check model usability in a more realistic setting. Our findings reveal that most current DocRE models are vulnerable to entity mention attacks and difficult to deploy in real-world end-user NLP applications. Our study calls for future research to stop simplifying problem setups, and to model DocRE in the wild rather than in an unrealistic Utopian world.
@inproceedings{li2023rethinking,
title={Rethinking Document-Level Relation Extraction: A Reality Check},
author={Li, Jing and Wang, Yequan and Zhang, Shuai and Zhang, Min},
pages= {5715--5730},
booktitle = {Findings of The 61st Annual Meeting of the Association for Computational Linguistics (ACL)},
year={2023}
}
Few-Shot Named Entity Recognition via Meta-Learning (Extended Abstract)
Jing Li, Billy Chiu, Shanshan Feng and Hao Wang
ICDE-23- The 39th IEEE International Conference on Data Engineering, 2023.
PDF  
Abstract  
BibTex
toupdate
toupdate
A Survey on Deep Learning for Named Entity Recognition (Extended Abstract)
Jing Li, Aixin Sun, Jianglei Han and Chenliang Li
ICDE-23- The 39th IEEE International Conference on Data Engineering, 2023.
PDF  
Abstract  
BibTex
toupdate
toupdate
GRLSTM: Trajectory Similarity Computation with Graph-based Residual LSTM
Silin Zhou, Jing Li, Hao Wang, Shuo Shang, Peng Han
AAAI-23- The Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023.
PDF  
Abstract  
BibTex
The computation of trajectory similarity is a crucial task in many spatial data analysis applications. However, existing methods have been designed primarily for trajectories in Euclidean space, which overlooks the fact that real-world trajectories are often generated on road networks. This paper addresses this gap by proposing a novel framework, called GRLSTM (Graph-based Residual LSTM). To jointly capture the properties of trajectories and road networks, the proposed framework incorporates knowledge graph embedding (KGE), graph neural network (GNN), and the residual network into the multi-layer LSTM (Residual-LSTM). Specifically, the framework constructs a point knowledge graph to study the multi-relation of points, as points may belong to both the trajectory and the road network. KGE is introduced to learn point embeddings and relation embeddings to build the point fusion graph, while GNN is used to capture the topology structure information of the point fusion graph. Finally, Residual-LSTM is used to learn the trajectory embeddings. To further enhance the accuracy and robustness of the final trajectory embeddings, we introduce two new neighbor-based point loss functions, namely, graph-based point loss function and trajectory-based point loss function. The GRLSTM is evaluated using two real-world trajectory datasets, and the experimental results demonstrate that GRLSTM outperforms all the state-of-the-art methods significantly.
@inproceedings{DBLP:conf/aaai/Zhou0WS023,
author = {Silin Zhou and
Jing Li and
Hao Wang and
Shuo Shang and
Peng Han},
editor = {Brian Williams and
Yiling Chen and
Jennifer Neville},
title = {{GRLSTM:} Trajectory Similarity Computation with Graph-Based Residual
{LSTM}},
booktitle = {Thirty-Seventh {AAAI} Conference on Artificial Intelligence, {AAAI}
2023, Thirty-Fifth Conference on Innovative Applications of Artificial
Intelligence, {IAAI} 2023, Thirteenth Symposium on Educational Advances
in Artificial Intelligence},
pages = {4972--4980},
year = {2023},
}
Sequence Labeling with Meta-Learning
Jing Li, Peng Han, Xiangnan Ren, Jilin Hu, Lisi Chen and Shuo Shang
IEEE TKDE-23- IEEE Transactions on Knowledge and Data Engineering, 35(3): 3072-3086, 2023.
PDF  
Abstract  
BibTex
Recent neural architectures in sequence labeling have yielded state-of-the-art performance on single domain data such as newswires. However, they still suffer from (i) requiring massive amounts of training data to avoid overfitting; (ii) huge performance degradation when there is a domain shift in the data distribution between training and testing. In this paper, we investigate the problem of domain adaptation for sequence labeling under homogeneous and heterogeneous settings. We propose MetaSeq, a novel meta-learning approach for domain adaptation in sequence labeling. Specifically, MetaSeq incorporates meta-learning and adversarial training strategies to encourage robust, general and transferable representations for sequence labeling. The key advantage of MetaSeq is that it is capable of adapting to new unseen domains with a small amount of annotated data from those domains. We extensively evaluate MetaSeq on named entity recognition, part-of-speech tagging and slot filling tasks under homogeneous and heterogeneous settings. The experimental results show that MetaSeq achieves state-of-the-art performance against eight baselines. Impressively, MetaSeq surpasses the in-domain performance using only 16.17% and 7% of target domain data on average for homogeneous settings, and 34.76%, 24%, 22.5% of target domain data on average for heterogeneous settings.
@article{jing23seq,
author = {Jing Li and Peng Han and Xiangnan Ren and Jilin Hu and Lisi Chen and Shuo Shang},
title = {Sequence Labeling with Meta-Learning},
journal = {IEEE Transactions on Knowledge and Data Engineering (TKDE)},
volume = {35},
number = {3},
pages = {3072--3086},
year = {2023},
url = {https://doi.org/10.1109/TKDE.2021.3118469},
doi = {10.1109/TKDE.2021.3118469},
}
A Dual-Channel Framework for Sarcasm Recognition by Detecting Sentiment Conflict
Yiyi Liu, Yequan Wang, Aixin Sun, Xuying Meng, Jing Li, Jiafeng Guo
NAACL-22- Findings of 2022 Annual Conference of the North American Chapter of the Association for Computational Linguistics.
PDF  
Abstract  
BibTex
Sarcasm employs ambivalence, where one says something positive but actually means negative, and vice versa. The essence of sarcasm, which is also a sufficient and necessary condition, is the conflict between literal and implied sentiments expressed in one sentence. However, it is difficult to recognize such sentiment conflict because the sentiments are mixed or even implicit. As a result, the recognition of sophisticated and obscure sentiment poses a great challenge to sarcasm detection. In this paper, we propose a Dual-Channel Framework by modeling both literal and implied sentiments separately. Based on this dual-channel framework, we design the Dual-Channel Network (DC-Net) to recognize sentiment conflict. Experiments on political debates (i.e., IAC-V1 and IAC-V2) and Twitter datasets show that our proposed DC-Net achieves state-of-the-art performance on sarcasm recognition. Our code is released to support research: https://github.com/yiyi-ict/dual-channel-for-sarcasm.
@inproceedings{DBLP:conf/naacl/LiuWSMLG22,
author = {Yiyi Liu and
Yequan Wang and
Aixin Sun and
Xuying Meng and
Jing Li and
Jiafeng Guo},
editor = {Marine Carpuat and
Marie{-}Catherine de Marneffe and
Iv{\'{a}}n Vladimir Meza Ru{\'{\i}}z},
title = {A Dual-Channel Framework for Sarcasm Recognition by Detecting Sentiment
Conflict},
booktitle = {Findings of the Association for Computational Linguistics: {NAACL}
2022, Seattle, WA, United States, July 10-15, 2022},
pages = {1670--1680},
publisher = {Association for Computational Linguistics},
year = {2022},
url = {https://doi.org/10.18653/v1/2022.findings-naacl.126},
doi = {10.18653/v1/2022.findings-naacl.126},
timestamp = {Tue, 31 Jan 2023 17:06:57 +0100},
biburl = {https://dblp.org/rec/conf/naacl/LiuWSMLG22.bib},
bibsource = {dblp computer science bibliography, https://dblp.org}
}
Interactive Information Extraction by Semantic Information Graph
Siqi Fan, Yequan Wang, Jing Li, Zheng Zhang, Shuo Shang, Peng Han
IJCAI-ECAI-22- The 31st International Joint Conference on Artificial Intelligence and the 25th European Conference on Artificial Intelligence, 2022. Acceptance rate: 15%.
PDF  
Abstract  
BibTex
Information extraction (IE) mainly focuses on three highly correlated subtasks, i.e., entity extraction, relation extraction and event extraction. Recently, there have been studies using Abstract Meaning Representation (AMR) to utilize the intrinsic correlations among these three subtasks. AMR-based models are capable of building the relationship of arguments. However, they struggle to deal with relations. In addition, the noise in AMR (i.e., tags unrelated to IE tasks, nodes with unconcerned conception, and edge types with complicated hierarchical structures) disturbs the decoding process of IE. As a result, the decoding process limited by the AMR cannot work effectively. To overcome these shortcomings, we propose an Interactive Information Extraction (InterIE) model based on a novel Semantic Information Graph (SIG). SIG can guide our InterIE model to tackle the three subtasks jointly. Furthermore, the well-designed SIG without noise is capable of enriching entity and event trigger representation, and capturing the edge connection between the information types. Experimental results show that our InterIE achieves state-of-the-art performance on all IE subtasks on the benchmark datasets (i.e., ACE05-E+ and ACE05-E). More importantly, the proposed model is not sensitive to the decoding order, which goes beyond the limitations of AMR-based methods.
@inproceedings{DBLP:conf/ijcai/FanWLZSH22,
author = {Siqi Fan and
Yequan Wang and
Jing Li and
Zheng Zhang and
Shuo Shang and
Peng Han},
editor = {Luc De Raedt},
title = {Interactive Information Extraction by Semantic Information Graph},
booktitle = {Proceedings of the Thirty-First International Joint Conference on
Artificial Intelligence, {IJCAI} 2022, Vienna, Austria, 23-29 July
2022},
pages = {4100--4106},
publisher = {ijcai.org},
year = {2022},
url = {https://doi.org/10.24963/ijcai.2022/569},
doi = {10.24963/ijcai.2022/569},
timestamp = {Wed, 27 Jul 2022 16:43:00 +0200},
biburl = {https://dblp.org/rec/conf/ijcai/FanWLZSH22.bib},
bibsource = {dblp computer science bibliography, https://dblp.org}
}
FOGS: First-Order Gradient Supervision with Learning-based Graph for Traffic Flow Forecasting
Xuan Rao, Hao Wang, Liang Zhang, Jing Li, Shuo Shang, Peng Han
IJCAI-ECAI-22- The 31st International Joint Conference on Artificial Intelligence and the 25th European Conference on Artificial Intelligence, 2022. Acceptance rate: 15%.
Traffic flow forecasting plays a vital role in the transportation domain. Existing studies usually manually construct correlation graphs and design sophisticated models for learning spatial and temporal features to predict future traffic states. However, manually constructed correlation graphs cannot accurately extract the complex patterns hidden in the traffic data. In addition, it is challenging for the prediction model to fit traffic data due to its irregularly-shaped distribution. To solve the above-mentioned problems, in this paper, we propose a novel learning-based method to learn a spatial-temporal correlation graph, which could make good use of the traffic flow data. Moreover, we propose First-Order Gradient Supervision (FOGS), a novel method for traffic flow forecasting. FOGS utilizes first-order gradients, rather than specific flows, to train prediction model, which effectively avoids the problem of fitting irregularly-shaped distributions. Comprehensive numerical evaluations on four real-world datasets reveal that the proposed methods achieve state-of-the-art performance and significantly outperform the benchmarks.
@inproceedings{DBLP:conf/ijcai/RaoWZLS022,
author = {Xuan Rao and
Hao Wang and
Liang Zhang and
Jing Li and
Shuo Shang and
Peng Han},
editor = {Luc De Raedt},
title = {{FOGS:} First-Order Gradient Supervision with Learning-based Graph
for Traffic Flow Forecasting},
booktitle = {Proceedings of the Thirty-First International Joint Conference on
Artificial Intelligence, {IJCAI} 2022, Vienna, Austria, 23-29 July
2022},
pages = {3926--3932},
publisher = {ijcai.org},
year = {2022},
url = {https://doi.org/10.24963/ijcai.2022/545},
doi = {10.24963/ijcai.2022/545},
timestamp = {Sun, 02 Oct 2022 16:08:04 +0200},
biburl = {https://dblp.org/rec/conf/ijcai/RaoWZLS022.bib},
bibsource = {dblp computer science bibliography, https://dblp.org}
}
A Survey on Deep Learning for Named Entity Recognition
Jing Li, Aixin Sun, Jianglei Han and Chenliang Li
IEEE TKDE-22- IEEE Transactions on Knowledge and Data Engineering, 34(1): 50-70, 2022.
Named entity recognition (NER) is the task to identify text spans that mention named entities, and to classify them into predefined categories such as person, location, organization etc. NER serves as the basis for a variety of natural language applications such as question answering, text summarization, and machine translation. Although early NER systems are successful in producing decent recognition accuracy, they often require much human effort in carefully designing rules or features. In recent years, deep learning, empowered by continuous real-valued vector representations and semantic composition through nonlinear processing, has been employed in NER systems, yielding state-of-the-art performance. In this paper, we provide a comprehensive review on existing deep learning techniques for NER. We first introduce NER resources, including tagged NER corpora and off-the-shelf NER tools. Then, we systematically categorize existing works based on a taxonomy along three axes: distributed representations for input, context encoder, and tag decoder. Next, we survey the most representative methods for recent applied techniques of deep learning in new NER problem settings and applications. Finally, we present readers with the challenges faced by NER systems and outline future directions in this area.
@article{jing22nersurvey,
author = {Jing Li and Aixin Sun and Jianglei Han and Chenliang Li},
title = {A Survey on Deep Learning for Named Entity Recognition},
journal = {IEEE Transactions on Knowledge and Data Engineering (TKDE)},
volume = {34},
number = {1},
pages = {50--70},
year = {2022},
url = {https://doi.org/10.1109/TKDE.2020.2981314},
doi = {10.1109/TKDE.2020.2981314},
}
Neural Text Segmentation and Its Application to Sentiment Analysis
Jing Li, Billy Chiu, Shuo Shang and Ling Shao
IEEE TKDE-22- IEEE Transactions on Knowledge and Data Engineering, 34(2): 828-842, 2022.
Text segmentation is a fundamental task in natural language processing. Depending on the levels of granularity, the task can be defined as segmenting a document into topical segments, or segmenting a sentence into elementary discourse units (EDUs). Traditional solutions to the two tasks heavily rely on carefully designed features. The recently proposed neural models do not need manual feature engineering, but they either suffer from sparse boundary tags or cannot efficiently handle the issue of variable size output vocabulary. In light of such limitations, we propose a generic end-to-end segmentation model, namely SEGBOT, which first uses a bidirectional recurrent neural network to encode an input text sequence. SEGBOT then uses another recurrent neural network, together with a pointer network, to select text boundaries in the input sequence. In this way, SEGBOT does not require any hand-crafted features. More importantly, SEGBOT inherently handles the issue of variable size output vocabulary and the issue of sparse boundary tags. In our experiments, SEGBOT outperforms state-of-the-art models on two tasks: document-level topic segmentation and sentence-level EDU segmentation. As a downstream application, we further propose a hierarchical attention model for sentence-level sentiment analysis based on the outcomes of SEGBOT. The hierarchical model can make full use of both word-level and EDU-level information simultaneously for sentence-level sentiment analysis. In particular, it can effectively exploit EDU-level information, such as the inner properties of EDUs, which cannot be fully encoded in word-level features. Experimental results show that our hierarchical model achieves new state-of-the-art results on the Movie Review and Stanford Sentiment Treebank benchmarks.
@article{li22segsenti,
author = {Jing Li and Billy Chiu and Shuo Shang and Ling Shao},
title = {Neural Text Segmentation and Its Application to Sentiment Analysis},
journal = {IEEE Transactions on Knowledge and Data Engineering (TKDE)},
volume = {34},
number = {2},
pages = {828--842},
year = {2022},
url = {https://doi.org/10.1109/TKDE.2020.2983360},
doi = {10.1109/TKDE.2020.2983360},
}
Few-Shot Named Entity Recognition via Meta-Learning
Jing Li, Billy Chiu, Shanshan Feng and Hao Wang
IEEE TKDE-22- IEEE Transactions on Knowledge and Data Engineering, 34(9): 4245-4256, 2022.
Few-shot learning under the N-way K-shot setting (i.e., K annotated samples for each of N classes) has been widely studied in relation extraction (e.g., FewRel) and image classification (e.g., Mini-ImageNet). Named entity recognition (NER) is typically framed as a sequence labeling problem where the entity classes are inherently entangled together because the entity number and classes in a sentence are not known in advance, leaving the N-way K-shot NER problem so far unexplored. In this paper, we first formally define a more suitable N-way K-shot setting for NER. Then we propose FewNER, a novel meta-learning approach for few-shot NER. FewNER separates the entire network into a task-independent part and a task-specific part. During training in FewNER, the task-independent part is meta-learned across multiple tasks and a task-specific part is learned for each single task in a low-dimensional space. At test time, FewNER keeps the task-independent part fixed and adapts to a new task via gradient descent by updating only the task-specific part, resulting in it being less prone to overfitting and more computationally efficient. The results demonstrate that FewNER achieves state-of-the-art performance against nine baseline methods by significant margins on three adaptation experiments.
@article{li20fewshot,
author = {Jing Li and Billy Chiu and Shanshan Feng and Hao Wang},
title = {Few-Shot Named Entity Recognition via Meta-Learning},
journal = {IEEE Transactions on Knowledge and Data Engineering (TKDE)},
volume = {34},
number = {9},
pages = {4245--4256},
year = {2022},
url = {https://doi.org/10.1109/TKDE.2020.3038670},
doi = {10.1109/TKDE.2020.3038670},
}
Neural Named Entity Boundary Detection
Jing Li, Aixin Sun and Yukun Ma
IEEE TKDE-21- IEEE Transactions on Knowledge and Data Engineering, 33(4): 1790-1795, 2021.
In this paper, we focus on named entity boundary detection, which is to detect the start and end boundaries of an entity mention in text, without predicting its type. The detected entities are input to entity linking or fine-grained typing systems for semantic enrichment. We propose BdryBot, a recurrent neural network encoder-decoder framework with a pointer network to detect entity boundaries from a given sentence. The encoder considers both character-level representations and word-level embeddings to represent the input words. In this way, BdryBot does not require any hand-crafted features. Because of the pointer network, BdryBot overcomes the problem of variable size output vocabulary and the issue of sparse boundary tags. We conduct two sets of experiments, in-domain detection and cross-domain detection, on six datasets. Our results show that BdryBot achieves state-of-the-art performance against five baselines. In addition, our proposed approach can be further enhanced when incorporating contextualized language embeddings into token representations.
@article{li21bdrybot,
author = {Jing Li and Aixin Sun and Yukun Ma},
title = {Neural Named Entity Boundary Detection},
journal = {IEEE Transactions on Knowledge and Data Engineering (TKDE)},
volume = {33},
number = {4},
pages = {1790--1795},
year = {2021},
url = {https://doi.org/10.1109/TKDE.2020.2981329},
doi = {10.1109/TKDE.2020.2981329},
}
Domain Generalization for Named Entity Boundary Detection via Meta-Learning
Jing Li, Shuo Shang and Lisi Chen
IEEE TNNLS-21- IEEE Transactions on Neural Networks and Learning Systems, 32(9): 3819-3830, 2021.
Named entity recognition (NER) aims to recognize mentions of rigid designators from text belonging to predefined semantic types, such as person, location, and organization. In this article, we focus on a fundamental subtask of NER, named entity boundary detection, which aims at detecting the start and end boundaries of an entity mention in the text, without predicting its semantic type. The entity boundary detection is essentially a sequence labeling problem. Existing sequence labeling methods either suffer from sparse boundary tags (i.e., entities are rare and nonentities are common) or they cannot well handle the issue of variable size output vocabulary (i.e., need to retrain models with respect to different vocabularies). To address these two issues, we propose a novel entity boundary labeling model that leverages pointer networks to effectively infer boundaries depending on the input sequence. On the other hand, training models on source domains that generalize to new target domains at the test time are a challenging problem because of the performance degradation. To alleviate this issue, we propose METABDRY, a novel domain generalization approach for entity boundary detection without requiring any access to target domain information. Especially, adversarial learning is adopted to encourage domain-invariant representations. Meanwhile, metalearning is used to explicitly simulate a domain shift during training so that metaknowledge from multiple resource domains can be effectively aggregated. As such, METABDRY explicitly optimizes the capability of “learning to generalize,” resulting in a more general and robust model to reduce the domain discrepancy. We first conduct experiments to demonstrate the effectiveness of our novel boundary labeling model. We then extensively evaluate METABDRY on eight data sets under domain generalization settings. The experimental results show that METABDRY achieves state-of-the-art results against the recent seven baselines.
@article{li21domaingen,
author = {Jing Li and Shuo Shang and Lisi Chen},
title = {Domain Generalization for Named Entity Boundary Detection via Metalearning},
journal = {IEEE Transactions on Neural Networks and Learning Systems (TNNLS)},
volume = {32},
number = {9},
pages = {3819--3830},
year = {2021},
url = {https://doi.org/10.1109/TNNLS.2020.3015912},
doi = {10.1109/TNNLS.2020.3015912},
}
Leveraging Official Content and Social Context to Recommend Software Documentation
Jing Li, Zhenchang Xing and Muhammad Ashad Kabir
IEEE TSC-21- IEEE Transactions on Services Computing, 14(2): 472-486, 2021.
For an unfamiliar Application Programming Interface (API), software developers often access the official documentation to learn its usage, and post questions related to this API on social question and answering (Q&A) sites to seek solutions. The official software documentation often captures the information about functionality and parameters, but lacks detailed descriptions in different usage scenarios. On the contrary, the discussions about APIs on social Q&A sites provide enriching usages. Moreover, existing code search engines and information retrieval systems cannot effectively return relevant software documentation when the issued query does not contain code snippets or API-like terms. In this paper, we present CnCxL2R, a software documentation recommendation strategy incorporating the content of official documentation and the social context on Q&A into a learning-to-rank schema. In the proposed strategy, the content, local context and global context of documentation are considered to select candidate documents. Then four types of features are extracted to learn a ranking model. We conduct a large-scale automatic evaluation on Java documentation recommendation. The results show that CnCxL2R achieves state-of-the-art performance over the eight baseline models. We also compare the CnCxL2R with Google search. The results show that CnCxL2R can recommend more relevant software documentation, and can effectively capture the semantic between the high-level intent in developers’ queries and the low-level implementation in software documentation.
@article{TSCLiXK21,
author = {Jing Li and
Zhenchang Xing and
Muhammad Ashad Kabir},
title = {Leveraging Official Content and Social Context to Recommend Software
Documentation},
journal = {{IEEE} Trans. Serv. Comput.},
volume = {14},
number = {2},
pages = {472--486},
year = {2021},
url = {https://doi.org/10.1109/TSC.2018.2812729},
doi = {10.1109/TSC.2018.2812729},
}
HME: A Hyperbolic Metric Embedding Approach for Next-POI Recommendation
Shanshan Feng, Lucas Vinh Tran, Gao Cong, Lisi Chen, Jing Li and Fan Li
SIGIR-20- The 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval, 2020. Acceptance rate: 147/555 (26%).
With the increasing popularity of location-aware social media services, next-Point-of-Interest (POI) recommendation has gained significant research interest. The key challenge of next-POI recommendation is to precisely learn users' sequential movements from sparse check-in data. To this end, various embedding methods have been proposed to learn the representations of check-in data in the Euclidean space. However, their ability to learn complex patterns, especially hierarchical structures, is limited by the dimensionality of the Euclidean space. To this end, we propose a new research direction that aims to learn the representations of check-in activities in a hyperbolic space, which yields two advantages. First, it can effectively capture the underlying hierarchical structures, which are implied by the power-law distributions of user movements. Second, it provides high representative strength and enables the check-in data to be effectively represented in a low-dimensional space. Specifically, to solve the next-POI recommendation task, we propose a novel hyperbolic metric embedding (HME) model, which projects the check-in data into a hyperbolic space. The HME jointly captures sequential transition, user preference, category and region information in a unified approach by learning embeddings in a shared hyperbolic space. To the best of our knowledge, this is the first study to explore a non-Euclidean embedding model for next-POI recommendation. We conduct extensive experiments on three check-in datasets to demonstrate the superiority of our hyperbolic embedding approach over the state-of-the-art next-POI recommendation algorithms. Moreover, we conduct experiments on another four online transaction datasets for next-item recommendation to further demonstrate the generality of our proposed model.
@inproceedings{DBLP:conf/sigir/FengTCCLL20,
author = {Shanshan Feng and
Lucas Vinh Tran and
Gao Cong and
Lisi Chen and
Jing Li and
Fan Li},
title = {{HME:} {A} Hyperbolic Metric Embedding Approach for Next-POI Recommendation},
booktitle = {Proceedings of the 43rd International {ACM} {SIGIR} conference on
research and development in Information Retrieval (SIGIR)},
pages = {1429--1438},
publisher = {{ACM}},
year = {2020},
url = {https://doi.org/10.1145/3397271.3401049},
doi = {10.1145/3397271.3401049},
}
Contextualized Point-of-Interest Recommendation
Peng Han, Zhongxiao Li, Yong Liu, Peilin Zhao, Jing Li, Hao Wang and Shuo Shang
IJCAI-PRICAI-20- The 29th International Joint Conference on Artificial Intelligence and the 17th Pacific Rim International Conference on Artificial Intelligence, 2020. Acceptance rate: 592/4717 (12.6%).
Point-of-interest (POI) recommendation has become an increasingly important sub-field of recommendation system research. Previous methods employ various assumptions to exploit the contextual information for improving the recommendation accuracy. The common property among them is that similar users are more likely to visit similar POIs and similar POIs would like to be visited by the same user. However, none of existing methods utilize similarity explicitly to make recommendations. In this paper, we propose a new framework for POI recommendation, which explicitly utilizes similarity with contextual information. Specifically, we categorize the context information into two groups, i.e., global and local context, and develop different regularization terms to incorporate them for recommendation. A graph Laplacian regularization term is utilized to exploit the global context information. Moreover, we cluster users into different groups, and let the objective function constrain the users in the same group to have similar predicted POI ratings. An alternating optimization method is developed to optimize our model and get the final rating matrix. The results in our experiments show that our algorithm outperforms all the state-of-the-art methods.
@inproceedings{DBLP:conf/ijcai/HanLLZLWS20,
author = {Peng Han and
Zhongxiao Li and
Yong Liu and
Peilin Zhao and
Jing Li and
Hao Wang and
Shuo Shang},
title = {Contextualized Point-of-Interest Recommendation},
booktitle = {Proceedings of the Twenty-Ninth International Joint Conference on
Artificial Intelligence (IJCAI)},
pages = {2484--2490},
year = {2020},
url = {https://doi.org/10.24963/ijcai.2020/344},
doi = {10.24963/ijcai.2020/344},
}
MetaNER: Named Entity Recognition with Meta-Learning
Jing Li, Shuo Shang and Ling Shao
WWW-20- The Web Conference, 2020. Acceptance rate: 217/1129 (19.2%).
Recent neural architectures in named entity recognition (NER) have yielded state-of-the-art performance on single domain data such as newswires. However, they still suffer from (i) requiring massive amounts of training data to avoid overfitting; (ii) huge performance degradation when there is a domain shift in the data distribution between training and testing. In this paper, we investigate the problem of domain adaptation for NER under homogeneous and heterogeneous settings. We propose MetaNER, a novel meta-learning approach for domain adaptation in NER. Specifically, MetaNER incorporates meta-learning and adversarial training strategies to encourage robust, general and transferable representations for sequence labeling. The key advantage of MetaNER is that it is capable of adapting to new unseen domains with a small amount of annotated data from those domains. We extensively evaluate MetaNER on multiple datasets under homogeneous and heterogeneous settings. The experimental results show that MetaNER achieves state-of-the-art performance against eight baselines. Impressively, MetaNER surpasses the in-domain performance using only 16.17% and 34.76% of target domain data on average for homogeneous and heterogeneous settings, respectively.
Pay Your Trip for Traffic Congestion: Dynamic Pricing in Traffic-Aware Road Networks
Lisi Chen, Shuo Shang, Bin Yao and Jing Li
AAAI-20- The Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020. Acceptance rate: 1591/7737 (20.6%).
Pricing is essential in optimizing transportation resource allocation. Congestion pricing is widely used to reduce urban traffic congestion. We propose and investigate a novel Dynamic Pricing Strategy (DPS) to price travelers' trips in intelligent transportation platforms (e.g., DiDi, Lyft, Uber). The trips are charged according to their “congestion contributions” to global urban traffic systems. The dynamic pricing strategy retrieves a matching between n travelers' trips and the potential travel routes (each trip has k potential routes) to minimize the global traffic congestion. We believe that DPS holds the potential to benefit society and the environment, such as reducing traffic congestion and enabling smarter and greener transportation. The DPS problem is challenging due to its high computation complexity (there exist k^n matching possibilities). We develop an efficient and effective approximate matching algorithm based on local search, as well as pruning techniques to further enhance the matching efficiency. The accuracy and efficiency of the dynamic pricing strategy are verified by extensive experiments on real datasets.
@inproceedings{DBLP:conf/aaai/ChenSYL20,
author = {Lisi Chen and
Shuo Shang and
Bin Yao and
Jing Li},
title = {Pay Your Trip for Traffic Congestion: Dynamic Pricing in Traffic-Aware
Road Networks},
booktitle = {The Thirty-Fourth {AAAI} Conference on Artificial Intelligence (AAAI)},
pages = {582--589},
year = {2020},
url = {https://aaai.org/ojs/index.php/AAAI/article/view/5397},
}
Adversarial Transfer for Named Entity Boundary Detection with Pointer Networks
Jing Li, Deheng Ye and Shuo Shang
IJCAI-19- The 28th International Joint Conference on Artificial Intelligence, Pages 5053-5059, 2019. Acceptance rate: 850/4752 (17.9%).
In this paper, we focus on named entity boundary detection, which aims to detect the start and end boundaries of an entity mention in text, without predicting its type. A more accurate and robust detection approach is desired to alleviate error propagation in downstream applications, such as entity linking and fine-grained typing systems. Here, we first develop a novel entity boundary labeling approach with pointer networks, where the output dictionary size depends on the input, which is variable. Furthermore, we propose AT-Bdry, which incorporates adversarial transfer learning into an end-to-end sequence labeling model to encourage domain-invariant representations. More importantly, AT-Bdry can reduce domain difference in data distributions between the source and target domains, via an unsupervised transfer learning approach (i.e., no annotated target-domain data is necessary). We conduct Formal Text to Formal Text, Formal Text to Informal Text and ablation evaluations on five benchmark datasets. Experimental results show that AT-Bdry achieves state-of-the-art transferring performance against recent baselines.
@inproceedings{li19advt,
author = {Jing Li and Deheng Ye and Shuo Shang},
title = {Adversarial Transfer for Named Entity Boundary Detection with Pointer Networks},
booktitle = {Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence (IJCAI)},
pages = {5053--5059},
year = {2019},
url = {https://doi.org/10.24963/ijcai.2019/702},
}
Neural Discourse Segmentation
Jing Li
IJCAI-19- The 28th International Joint Conference on Artificial Intelligence, Pages 6539-6541, 2019. (Demo)
Identifying discourse structures and coherence relations in a piece of text is a fundamental task in natural language processing. The first step of this process is segmenting sentences into clause-like units called elementary discourse units (EDUs). Traditional solutions to discourse segmentation heavily rely on carefully designed features. In this demonstration, we present SEGBOT, a system to split a given piece of text into sequence of EDUs by using an end-to-end neural segmentation model. Our model does not require hand-crafted features or external knowledge except word embeddings, yet it outperforms state-of-the-art solutions to discourse segmentation.
@inproceedings{li19segdemo,
author = {Jing Li},
title = {Neural Discourse Segmentation},
booktitle = {Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence (IJCAI)},
pages = {6539--6541},
year = {2019},
url = {https://doi.org/10.24963/ijcai.2019/949},
}
LinkLive: Discovering web learning resources for developers from Q&A discussions
Jing Li, Zhenchang Xing and Aixin Sun
WWWJ-19- World Wide Web. 22(4), Pages 1699-1725, Springer, 2019.
Software developers need access to correlated information (e.g., API documentation, Wikipedia pages, Stack Overflow questions and answers) which are often dispersed among different Web resources. This paper is concerned with the situation where a developer is visiting a Web page, but at the same time is willing to explore correlated Web resources to extend his/her knowledge or to satisfy his/her curiosity. Specifically, we present an item-based collaborative filtering technique, named LinkLive, for automatically recommending a list of correlated Web resources for a particular Web page. The recommendation is done by exploiting hyperlink associations from the crowdsourced knowledge on Stack Overflow. We motivate our research using an exploratory study of hyperlink dissemination patterns on Stack Overflow. We then present our LinkLive technique that uses multiple features, including hyperlink co-occurrences in Q&A discussions, locations (e.g., question, answer, or comment) in which hyperlinks are referenced, and votes for posts/comments in which hyperlinks are referenced. Experiments using 7 years of Stack Overflow data show that, our technique recommends correlated Web resources with promising accuracy in an open setting. A user study of 6 participants suggests that practitioners find the recommended Web resources useful for Web discovery.
@article{LiXS19,
author = {Jing Li and Zhenchang Xing and Aixin Sun},
title = {LinkLive: discovering Web learning resources for developers from Q{\&}A discussions},
journal = {World Wide Web},
volume = {22},
number = {4},
pages = {1699--1725},
year = {2019},
url = {https://doi.org/10.1007/s11280-018-0621-y},
doi = {10.1007/s11280-018-0621-y},
}
DLocRL: A Deep Learning Pipeline for Fine-Grained Location Recognition and Linking in Tweets
Canwen Xu, Jing Li, Xiangyang Luo, Jiaxin Pei, Chenliang Li, Donghong Ji
WWW-19- The Web Conference, Pages 3391-3397, ACM, 2019. (Short)
In recent years, with the prevalence of social media and smart devices, people casually reveal their locations such as shops, hotels, and restaurants in their tweets. Recognizing and linking such fine-grained location mentions to well-defined location profiles are beneficial for retrieval and recommendation systems. In this paper, we propose DLocRL, a new deep learning pipeline for fine-grained location recognition and linking in tweets, and verify its effectiveness on a real-world Twitter dataset.
@inproceedings{DBLP:conf/www/XuLLPLJ19,
author = {Canwen Xu and
Jing Li and
Xiangyang Luo and
Jiaxin Pei and
Chenliang Li and
Donghong Ji},
title = {DLocRL: {A} Deep Learning Pipeline for Fine-Grained Location Recognition
and Linking in Tweets},
booktitle = {The World Wide Web Conference (WWW)},
pages = {3391--3397},
year = {2019},
url = {https://doi.org/10.1145/3308558.3313491},
doi = {10.1145/3308558.3313491},
}
Spatial Keyword Search: A Survey
Lisi Chen, Shuo Shang, Chengcheng Yang and Jing Li
GeoInformatica-19- GeoInformatica. Springer, July 2019.
Spatial keyword search has been playing an indispensable role in personalized route recommendation and geo-textual information retrieval. In this light, we conduct a survey on existing studies of spatial keyword search. We categorize existing works of spatial keyword search based on the types of their input data, output results, and methodologies. For each category, we summarize their common features in terms of input data, output result, indexing scheme, and search algorithms. In addition, we provide detailed description regarding each study of spatial keyword search. This survey summarizes the findings of existing spatial keyword search studies, thus uncovering new insights that may guide software engineers as well as further research.
@article{DBLP:journals/geoinformatica/ChenSYL20,
author = {Lisi Chen and
Shuo Shang and
Chengcheng Yang and
Jing Li},
title = {Spatial keyword search: a survey},
journal = {GeoInformatica},
volume = {24},
number = {1},
pages = {85--106},
year = {2020},
url = {https://doi.org/10.1007/s10707-019-00373-y},
doi = {10.1007/s10707-019-00373-y},
}
Subtopic-Driven Multi-Document Summarization
Xin Zheng, Aixin Sun, Jing Li and Karthik Muthuswamy
EMNLP-IJCNLP-19- 2019 Conference on Empirical Methods in Natural Language Processing and 9th International Joint Conference on Natural Language Processing, Pages 3144-3153, 2019. Acceptance rate: 684/2877 (23.8%).
In multi-document summarization, a set of documents to be summarized is assumed to be on the same topic, known as the underlying topic in this paper. That is, the underlying topic can be collectively represented by all the documents in the set. Meanwhile, different documents may cover various different subtopics and the same subtopic can be across several documents. Inspired by topic model, the underlying topic of a document set can also be viewed as a collection of different subtopics of different importance. In this paper, we propose a summarization model called STDS. The model generates the underlying topic representation from both document view and subtopic view in parallel. The learning objective is to minimize the distance between the representations learned from the two views. The contextual information is encoded through a hierarchical RNN architecture. Sentence salience is estimated in a hierarchical way with subtopic salience and relative sentence salience, by considering the contextual information. Top ranked sentences are then extracted as a summary. Note that the notion of subtopic enables us to bring in additional information (e.g. comments to news articles) that is helpful for document summarization. Experimental results show that the proposed solution outperforms state-of-the-art methods on benchmark datasets.
@inproceedings{DBLP:conf/emnlp/ZhengSLM19,
author = {Xin Zheng and
Aixin Sun and
Jing Li and
Karthik Muthuswamy},
title = {Subtopic-driven Multi-Document Summarization},
booktitle = {Proceedings of the 2019 Conference on Empirical Methods in Natural
Language Processing and the 9th International Joint Conference on
Natural Language Processing (EMNLP-IJCNLP)},
pages = {3151--3160},
publisher = {Association for Computational Linguistics},
year = {2019},
url = {https://doi.org/10.18653/v1/D19-1311},
doi = {10.18653/v1/D19-1311},
}
To Do or Not To Do: Distill Crowdsourced Negative Caveats to Augment API Documentation
Jing Li, Aixin Sun and Zhenchang Xing
JASIST-18- Journal of the Association for Information Science and Technology. Volume 69, Issue 12, Pages 1460-1475, Wiley, 2018.
PDF  
Abstract  
BibTex
Negative caveats of application programming interfaces (APIs) are about “how not to use an API,” which are often absent from the official API documentation. When these caveats are overlooked, programming errors may emerge from misusing APIs, leading to heavy discussions on Q&A websites like Stack Overflow. If the overlooked caveats could be mined from these discussions, they would help programmers avoid misusing APIs. However, this is challenging because the discussions are informal, redundant, and diverse. To this end, we propose Disca, a novel approach for automatically Distilling desirable API negative caveats from unstructured Q&A discussions. Through sentence selection and prominent term clustering, Disca ensures that distilled caveats are context‐independent, prominent, semantically diverse, and nonredundant. Quantitative evaluation in our experiments shows that the proposed Disca significantly outperforms four text‐summarization techniques. We also show through qualitative analysis that the distilled API negative caveats could greatly augment API documentation.
@article{LiSX18,
author = {Jing Li and Aixin Sun and Zhenchang Xing},
title = {To Do or Not To Do: Distill crowdsourced negative caveats to augment {API} documentation},
journal = {J. Assoc. Inf. Sci. Technol.},
volume = {69},
number = {12},
pages = {1460--1475},
year = {2018},
url = {https://doi.org/10.1002/asi.24067},
doi = {10.1002/asi.24067},
}
SegBot: A Generic Neural Text Segmentation Model with Pointer Network
Jing Li, Aixin Sun and Shafiq Joty
IJCAI-18-The 27th International Joint Conference on Artificial Intelligence and the 23rd European Conference on Artificial Intelligence. Pages 4166-4172, 2018. Acceptance rate: 710/3470 (20.5%).
PDF  
Abstract  
BibTex  
Demo
Text segmentation is a fundamental task in natural language processing that comes in two levels of granularity: (i) segmenting a document into a sequence of topical segments (topic segmentation), and (ii) segmenting a sentence into a sequence of elementary discourse units (EDU segmentation). Traditional solutions to the two tasks rely heavily on carefully designed features. The recently proposed neural models do not need manual feature engineering, but they either suffer from sparse boundary tags or cannot handle a variable-size output vocabulary well. We propose a generic end-to-end segmentation model called SegBot. SegBot uses a bidirectional recurrent neural network to encode the input text sequence. The model then uses another recurrent neural network together with a pointer network to select text boundaries in the input sequence. In this way, SegBot does not require hand-crafted features. More importantly, our model inherently handles both the variable-size output vocabulary and the sparse boundary tags. In our experiments, SegBot outperforms state-of-the-art models on both topic and EDU segmentation tasks.
@inproceedings{LiSJ18segbot,
author = {Jing Li and Aixin Sun and Shafiq R. Joty},
title = {SegBot: {A} Generic Neural Text Segmentation Model with Pointer Network},
booktitle = {Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence (IJCAI)},
pages = {4166--4172},
year = {2018},
url = {https://doi.org/10.24963/ijcai.2018/579},
doi = {10.24963/ijcai.2018/579},
}
API Caveat Explorer: Surfacing Negative Usages from Practice
Jing Li, Aixin Sun, Zhenchang Xing and Lei Han
SIGIR-18-The 41st International ACM SIGIR Conference on Research and Development in Information Retrieval, Pages 1293-1296. ACM, 2018. (Demo)
PDF  
Abstract  
BibTex  
Demo
Application programming interface (API) documentation well describes an API and how to use it. However, official documentation does not describe "how not to use it" or the different kinds of errors when an API is used wrongly. Programming caveats are negative usages of an API. When these caveats are overlooked, errors may emerge, leading to heavy discussions on Q&A websites like Stack Overflow. In this demonstration, we present API Caveat Explorer, a search system to explore API caveats that are mined from large-scale unstructured discussions on Stack Overflow. API Caveat Explorer takes API-oriented queries such as "HashMap" and retrieves API caveats by text summarization techniques. API caveats are represented by sentences, which are context-independent, prominent, semantically diverse and non-redundant. The system provides a web-based interface that allows users to interactively explore the full picture of all discovered caveats of an API, and the details of each. The potential users of API Caveat Explorer are programmers and educators for learning and teaching APIs.
@inproceedings{LiSXH18,
author = {Jing Li and
Aixin Sun and
Zhenchang Xing and
Lei Han},
title = {{API} Caveat Explorer - Surfacing Negative Usages from Practice: An
{API}-oriented Interactive Exploratory Search System for Programmers},
booktitle = {The 41st International {ACM} {SIGIR} Conference on Research {\&}
Development in Information Retrieval},
pages = {1293--1296},
year = {2018},
url = {https://doi.org/10.1145/3209978.3210170},
doi = {10.1145/3209978.3210170},
}
Learning to Answer Programming Questions with Software Documentation through Social Context Embedding
Jing Li, Aixin Sun and Zhenchang Xing
INS-18- Information Sciences. Volumes 448–449, Pages 36-52, June 2018, Elsevier.
PDF  
Abstract  
BibTex
Official software documentation provides a comprehensive overview of software usage, but does not address specific programming tasks or use cases. Often there is a mismatch between the documentation and a question on a specific programming task because of different wordings. We observe from Stack Overflow that the best answers to programmers’ questions often contain links to formal documentation. In this paper, we propose a novel deep-learning-to-answer framework, named QDLinker, for answering programming questions with software documentation. QDLinker learns from the large volume of discussions on community-based question answering sites to bridge the semantic gap between programmers’ questions and software documentation. Specifically, QDLinker learns question-documentation semantic representations from these question answering discussions with a four-layer neural network, and incorporates semantic and content features into a learning-to-rank schema. Our approach does not require manual feature engineering or external resources to infer the degree of relevance between a question and documentation. Extensive experiments show that QDLinker effectively answers programming questions with direct links to software documentation. QDLinker significantly outperforms the baselines based on traditional retrieval models and Web search services dedicated to software documentation retrieval. The user study shows that QDLinker effectively bridges the semantic gap between the intent of a programming question and the content of software documentation.
@article{L2ALiSX18,
author = {Jing Li and
Aixin Sun and
Zhenchang Xing},
title = {Learning to answer programming questions with software documentation
through social context embedding},
journal = {Information Sciences},
volume = {448-449},
pages = {36--52},
year = {2018},
url = {https://doi.org/10.1016/j.ins.2018.03.014},
doi = {10.1016/j.ins.2018.03.014},
}
HDSKG: Harvesting Domain Specific Knowledge Graph from Content of Webpages
Xuejiao Zhao, Zhenchang Xing, Muhammad Ashad Kabir, Naoya Sawada, Jing Li and Shangwei Lin
SANER-17-The 24th IEEE International Conference on Software Analysis, Evolution, and Reengineering. Acceptance rate: 34/140 (24.3%).
PDF  
Abstract  
BibTex
Knowledge graphs are useful in many domains, such as search result ranking, recommendation, and exploratory search. A knowledge graph integrates structural information about concepts across multiple information sources and links these concepts together. The extraction of domain-specific relation triples (subject, verb phrase, object) is one of the important techniques for domain-specific knowledge graph construction. In this research, an automatic method named HDSKG is proposed to discover domain-specific concepts and their relation triples from the content of webpages. We combine a dependency parser with a rule-based method to chunk the candidate relation triples, then extract advanced features of these candidates to estimate their domain relevance with a machine learning algorithm. To evaluate our method, we apply HDSKG to Stack Overflow (a Q&A website about computer programming). As a result, we construct a knowledge graph of the software engineering domain with 35279 relation triples, 44800 concepts, and 9660 unique verb phrases. The experimental results show that both the precision and recall of HDSKG (0.78 and 0.7, respectively) are much higher than those of openIE (0.11 and 0.6, respectively). The performance is particularly efficient in the case of complex sentences. Furthermore, with the self-training technique used in the classifier, HDSKG can be applied to other domains easily with less training data.
@inproceedings{DBLP:conf/wcre/ZhaoXKSLL17,
author = {Xuejiao Zhao and
Zhenchang Xing and
Muhammad Ashad Kabir and
Naoya Sawada and
Jing Li and
Shang{-}Wei Lin},
editor = {Martin Pinzger and
Gabriele Bavota and
Andrian Marcus},
title = {{HDSKG:} Harvesting domain specific knowledge graph from content of
webpages},
booktitle = {{IEEE} 24th International Conference on Software Analysis, Evolution
and Reengineering (SANER)},
pages = {56--67},
publisher = {{IEEE} Computer Society},
year = {2017},
url = {https://doi.org/10.1109/SANER.2017.7884609},
doi = {10.1109/SANER.2017.7884609},
}
From Discussion to Wisdom: Web Resource Recommendation for Hyperlinks in Stack Overflow
Jing Li, Zhenchang Xing, Deheng Ye and Xuejiao Zhao
SAC-16-The 31st ACM Symposium on Applied Computing, 2016. Acceptance rate: 252/1047 (24.07%).
PDF  
Abstract  
BibTex
@inproceedings{SACLiXYZ16,
author = {Jing Li and
Zhenchang Xing and
Deheng Ye and
Xuejiao Zhao},
editor = {Sascha Ossowski},
title = {From discussion to wisdom: web resource recommendation for hyperlinks
in stack overflow},
booktitle = {Proceedings of the 31st Annual {ACM} Symposium on Applied Computing (SAC)},
pages = {1127--1133},
year = {2016},
url = {https://doi.org/10.1145/2851613.2851815},
doi = {10.1145/2851613.2851815},
}
BPMiner: Mining Developers' Behavior Patterns from Screen-Captured Task Videos
Jing Li, Lingfeng Bao, Zhenchang Xing, Xinyu Wang and Bo Zhou
SAC-16-The 31st ACM Symposium on Applied Computing, 2016. Acceptance rate: 252/1047 (24.07%).
PDF  
Abstract  
BibTex
Many user studies of software development use screen-capture software to record developers' behavior for post-mortem analysis. However, extracting behavioral patterns from screen-captured videos requires manual transcription and coding of videos, which is often tedious and error-prone. Automatically extracting Human-Computer Interaction (HCI) data from screen-captured videos and systematically analyzing behavioral data will help researchers analyze developers' behavior in software development more effectively and efficiently. In this paper, we present BPMiner, a novel behavior analysis approach to mine developers' behavior patterns from screen-captured videos using computer vision techniques and exploratory sequential pattern analysis. We have implemented a proof-of-concept prototype of BPMiner, and applied the BPMiner prototype to study the developers' online search behavior during software development. Our study suggests that the BPMiner approach can open up new ways to study developers' behavior in software development.
@inproceedings{SACLiBXWZ16,
author = {Jing Li and
Lingfeng Bao and
Zhenchang Xing and
Xinyu Wang and
Bo Zhou},
editor = {Sascha Ossowski},
title = {BPMiner: mining developers' behavior patterns from screen-captured
task videos},
booktitle = {Proceedings of the 31st Annual {ACM} Symposium on Applied Computing (SAC)},
pages = {1371--1377},
year = {2016},
url = {https://doi.org/10.1145/2851613.2851771},
doi = {10.1145/2851613.2851771},
}
Software-specific Part-of-speech Tagging: An Experimental Study on Stack Overflow
Deheng Ye, Zhenchang Xing, Jing Li and Nachiket Kapre
SAC-16-The 31st ACM Symposium on Applied Computing, 2016. Acceptance rate: 252/1047 (24.07%).
PDF  
Abstract  
BibTex
Part-of-speech (POS) tagging performance degrades on out-of-domain data due to the lack of domain knowledge. Software engineering knowledge, embodied in textual documentation, bug reports and online forum discussions, is expressed in natural language, but is full of domain terms, software entities and software-specific informal language. Such software texts call for software-specific POS tagging. In the software engineering community, there have been several attempts to leverage POS tagging techniques to help solve software engineering tasks. However, little work has been done on POS tagging of software natural language texts. In this paper, we build a software-specific POS tagger, called S-POS, for processing the textual discussions on Stack Overflow. We target Stack Overflow because it has become an important developer-generated knowledge repository for software engineering. We define a POS tagset that is suitable for describing software engineering knowledge, select a corpus, develop a custom tokenizer, annotate data, design features for supervised model training, and demonstrate that the tagging accuracy of S-POS outperforms that of the Stanford POS Tagger when tagging software texts. Our work presents a feasible roadmap for building a software-specific POS tagger for the socio-professional content on Stack Overflow, and reveals challenges and opportunities for advanced software-specific information extraction.
@inproceedings{DBLP:conf/sac/YeXLK16,
author = {Deheng Ye and
Zhenchang Xing and
Jing Li and
Nachiket Kapre},
editor = {Sascha Ossowski},
title = {Software-specific part-of-speech tagging: an experimental study on
stack overflow},
booktitle = {Proceedings of the 31st Annual {ACM} Symposium on Applied Computing (SAC)},
pages = {1378--1385},
publisher = {{ACM}},
year = {2016},
url = {https://doi.org/10.1145/2851613.2851772},
doi = {10.1145/2851613.2851772},
}
Extracting and Analyzing Time-Series HCI Data from Screen-Captured Task Videos
Lingfeng Bao, Jing Li, Zhenchang Xing, Xinyu Wang, Xin Xia and Bo Zhou
EMSE-16- Empirical Software Engineering, Springer, Pages 1-41, 2016.
PDF  
Abstract  
BibTex
Recent years have witnessed an increasing emphasis on human aspects in software engineering research and practices. Our survey of existing studies on human aspects in software engineering shows that screen-captured videos have been widely used to record developers’ behavior and study software engineering practices. The screen-captured videos provide direct information about which software tools the developers interact with and which content they access or generate during the task. Such Human-Computer Interaction (HCI) data can help researchers and practitioners understand and improve software engineering practices from a human perspective. However, extracting time-series HCI data from screen-captured task videos requires manual transcribing and coding of videos, which is tedious and error-prone. In this paper we report a formative study to understand the challenges in manually transcribing screen-captured videos into time-series HCI data. We then present a computer-vision based video scraping technique to automatically extract time-series HCI data from screen-captured videos. We also present a case study of our scvRipper tool that implements the video scraping technique using 29 hours of task videos of 20 developers in two development tasks. The case study not only evaluates the runtime performance and robustness of the tool, but also performs a detailed quantitative analysis of the tool’s ability to extract time-series HCI data from screen-captured task videos. We also study the developers’ micro-level behavior patterns in software development from the quantitative analysis.
@article{DBLP:journals/ese/BaoLXWXZ17,
author = {Lingfeng Bao and
Jing Li and
Zhenchang Xing and
Xinyu Wang and
Xin Xia and
Bo Zhou},
title = {Extracting and analyzing time-series {HCI} data from screen-captured
task videos},
journal = {Empir. Softw. Eng.},
volume = {22},
number = {1},
pages = {134--174},
year = {2017},
url = {https://doi.org/10.1007/s10664-015-9417-1},
doi = {10.1007/s10664-015-9417-1},
}
Learning to Extract API Mentions from Informal Natural Language Discussions
Deheng Ye, Zhenchang Xing, Chee Yong Foo, Jing Li and Nachiket Kapre
ICSME-16-The 32nd International Conference on Software Maintenance and Evolution. Acceptance rate: 37/125 (29%).
PDF  
Abstract  
BibTex
When discussing programming issues on social platforms (e.g., Stack Overflow, Twitter), developers often mention APIs in natural language texts. Extracting API mentions in natural language texts is a prerequisite for effective indexing and searching for API-related information in software engineering social content. However, the informal nature of social discussions creates two fundamental challenges for API extraction: common-word polysemy and sentence-format variations. Common-word polysemy refers to the ambiguity between the API sense of a common word and the normal sense of the word (e.g., append, apply and merge). Sentence-format variations refer to the lack of a consistent sentence writing format for inferring API mentions. Existing API extraction techniques fall short of addressing these two challenges, because they assume distinct API naming conventions (e.g., camel case, underscore) or structured sentence format (e.g., code-like phrase, API annotation, or full API name). In this paper, we propose a semi-supervised machine-learning approach that exploits name synonyms and rich semantic context of API mentions to extract API mentions in informal social text. The key innovation of our approach is to exploit two complementary unsupervised language models learned from the abundant unlabeled text to model sentence-format variations and to train a robust model with a small set of labeled data and an iterative self-training process. The evaluation of 1,205 API mentions of the three libraries (Pandas, Numpy, and Matplotlib) in Stack Overflow texts shows that our approach significantly outperforms existing API extraction techniques based on language-convention and sentence-format heuristics and our earlier machine-learning based method for named-entity recognition.
@inproceedings{DBLP:conf/icsm/YeXFLK16,
author = {Deheng Ye and
Zhenchang Xing and
Chee Yong Foo and
Jing Li and
Nachiket Kapre},
title = {Learning to Extract {API} Mentions from Informal Natural Language
Discussions},
booktitle = {2016 {IEEE} International Conference on Software Maintenance and Evolution (ICSME)},
pages = {389--399},
year = {2016},
url = {https://doi.org/10.1109/ICSME.2016.11},
doi = {10.1109/ICSME.2016.11},
}
Software-specific Named Entity Recognition in Software Engineering Social Content
Deheng Ye, Zhenchang Xing, Chee Yong Foo, Zi Qun Ang, Jing Li and Nachiket Kapre
SANER-16-The 23rd IEEE International Conference on Software Analysis, Evolution, and Reengineering. Acceptance rate: 52/140 (37%).
PDF  
Abstract  
BibTex
Software engineering social content, such as Q&A discussions on Stack Overflow, has become a wealth of information on software engineering. This textual content is centered around software-specific entities, and their usage patterns, issues-solutions, and alternatives. However, existing approaches to analyzing software engineering texts treat software-specific entities in the same way as other content, and thus cannot support the recent advance of entity-centric applications, such as direct answers and knowledge graphs. The first step towards enabling these entity-centric applications for software engineering is to recognize and classify software-specific entities, which is referred to as Named Entity Recognition (NER) in the literature. Existing NER methods are designed for recognizing persons, locations and organizations in formal and social texts, and are not applicable to NER in software engineering. Existing information extraction methods for software engineering are limited to API identification and linking for a particular programming language. In this paper, we formulate the research problem of NER in software engineering. We identify the challenges in designing a software-specific NER system and propose a machine learning based approach applied to software engineering social content. Our NER system, called S-NER, is general for software engineering in that it can recognize a broad category of software entities for a wide range of popular programming languages, platforms, and libraries. We conduct systematic experiments to evaluate our machine learning based S-NER against a well-designed baseline, and to study the effectiveness of widely-adopted NER techniques and features in the face of the unique characteristics of software engineering social content.
@inproceedings{DBLP:conf/wcre/YeXFALK16,
author = {Deheng Ye and
Zhenchang Xing and
Chee Yong Foo and
Zi Qun Ang and
Jing Li and
Nachiket Kapre},
title = {Software-Specific Named Entity Recognition in Software Engineering
Social Content},
booktitle = {{IEEE} 23rd International Conference on Software Analysis, Evolution,
and Reengineering (SANER)},
pages = {90--101},
publisher = {{IEEE} Computer Society},
year = {2016},
url = {https://doi.org/10.1109/SANER.2016.10},
doi = {10.1109/SANER.2016.10},
}
scvRipper: Video Scraping Tool for Modeling Developers' Behavior Using Interaction Data
Lingfeng Bao, Jing Li, Zhenchang Xing, Xinyu Wang and Bo Zhou
ICSE-15-The 37th International Conference on Software Engineering Tool Demonstrations, Vol.2, Pages 673-676, 2015.
PDF  
Abstract  
Demo  
BibTex
Screen-capture tools can record a user's interaction with software and application content as a stream of screenshots, which is usually stored in a video format. Researchers have used screen-captured videos to study the programming activities that developers carry out. In these studies, screen-captured videos had to be manually transcribed to extract software usage and application content data for the study purpose. This paper presents a computer-vision based video scraping tool (called scvRipper) that can automatically transcribe a screen-captured video into time-series interaction data according to the analyst's needs. This tool can address the increasing need for automatic behavioral data collection methods in studies of human aspects of software engineering.
@inproceedings{DBLP:conf/icse/BaoLXWZ15,
author = {Lingfeng Bao and
Jing Li and
Zhenchang Xing and
Xinyu Wang and
Bo Zhou},
title = {scvRipper: Video Scraping Tool for Modeling Developers' Behavior Using
Interaction Data},
booktitle = {37th {IEEE/ACM} International Conference on Software Engineering,
{ICSE} 2015, Florence, Italy, May 16-24, 2015, Volume 2},
pages = {673--676},
publisher = {{IEEE} Computer Society},
year = {2015},
url = {https://doi.org/10.1109/ICSE.2015.220},
doi = {10.1109/ICSE.2015.220},
}
Reverse Engineering Time-Series Interaction Data from Screen-Captured Videos
Lingfeng Bao, Jing Li, Zhenchang Xing, Xinyu Wang and Bo Zhou
SANER-15-The 22nd IEEE International Conference on Software Analysis, Evolution, and Reengineering, Pages 399-408, 2015. Acceptance rate: 46/144 (32%).
PDF  
Abstract  
BibTex
In recent years the amount of research on human aspects of software engineering has increased. Many studies use screen-capture software (e.g., Snagit) to record developers' behavior as they work on software development tasks. The recorded task videos capture direct information about which activities the developers carry out with which content and in which applications during the task. Such behavioral data can help researchers and practitioners understand and improve software engineering practices from a human perspective. However, extracting time-series interaction data (software usage and application content) from screen-captured videos requires manual transcribing and coding of videos, which is tedious and error-prone. In this paper we present a computer-vision based video scraping technique to automatically reverse-engineer time-series interaction data from screen-captured videos. We report the usefulness, effectiveness and runtime performance of our video scraping technique using a case study of 29 hours of task videos of 20 developers in two development tasks.
@inproceedings{DBLP:conf/wcre/BaoLXWZ15,
author = {Lingfeng Bao and
Jing Li and
Zhenchang Xing and
Xinyu Wang and
Bo Zhou},
editor = {Yann{-}Ga{\"{e}}l Gu{\'{e}}h{\'{e}}neuc and
Bram Adams and
Alexander Serebrenik},
title = {Reverse engineering time-series interaction data from screen-captured
videos},
booktitle = {22nd {IEEE} International Conference on Software Analysis, Evolution,
and Reengineering (SANER)},
pages = {399--408},
year = {2015},
url = {https://doi.org/10.1109/SANER.2015.7081850},
doi = {10.1109/SANER.2015.7081850},
}