ACM/IEEE Joint Conference on Digital Libraries 2017
University of Toronto
JCDL 2017 | #JCDL@2017
Wednesday, June 21 • 11:00 - 12:30
Paper Session 04: Citation Analysis

Saeed-Ul Hassan, Anam Akram and Peter Haddawy. Identifying Important Citations using Contextual Information from Full Text (Full)

*VB Best Paper Award Nominee

In this paper we address the problem of classifying cited work into important and non-important to the developments presented in a research publication. This task is vital for the algorithmic techniques that detect and follow emerging research topics and to qualitatively measure the impact of publications in increasingly growing scholarly big data. We consider cited work as important to a publication if that work is used or extended in some way. If a reference is cited as background work or for the purpose of comparing results, the cited work is considered to be non-important. By employing five classification techniques (Support Vector Machine, Naïve Bayes, Decision Tree, K-Nearest Neighbors and Random Forest) on an annotated dataset of 465 citations, we explore the effectiveness of eight previously published features and six novel features (including context based, cue words based and textual based). Within this set, our new features are among the best performing. Using the Random Forest classifier we achieve an overall classification accuracy of 0.91 AUC.

Luca Weihs and Oren Etzioni. Learning to Predict Citation-Based Impact Measures (Full)
Citations implicitly encode a community's judgment of a paper's importance and thus provide a unique signal by which to study scientific impact. Efforts in understanding and refining this signal are reflected in the probabilistic modeling of citation networks and the proliferation of citation-based impact measures such as Hirsch's h-index. While these efforts focus on understanding the past and present, they leave open the question of whether scientific impact can be predicted into the future. Recent work addressing this deficiency has employed linear and simple probabilistic models; we show that these results can be handily outperformed by leveraging non-linear techniques. In particular, we find that these AI methods can predict measures of scientific impact for papers and authors, namely citation rates and h-indices, with surprising accuracy, even 10 years into the future. Moreover, we demonstrate how existing probabilistic models for paper citations can be extended to better incorporate refined prior knowledge. Of course, predictions of "scientific impact" should be approached with healthy skepticism, but our results improve upon prior efforts and form a baseline against which future progress can be easily judged.

Mayank Singh, Ajay Jaiswal, Priya Shree, Arindam Pal, Animesh Mukherjee and Pawan Goyal. Understanding the Impact of Early Citers on Long-Term Scientific Impact (Full)
Our current knowledge of scholarly plagiarism is largely based on the similarity between full-text research articles. In this paper, we propose a novel conceptualization of scholarly plagiarism in the form of reuse of explicit citation sentences in scientific research articles. Note that while full-text plagiarism is an indicator of a gross-level behavior, copying of citation sentences is a more nuanced micro-scale phenomenon observed even for well-known researchers. The current work poses several interesting questions and attempts to answer them by empirically investigating a large bibliographic text dataset from computer science containing millions of lines of citation sentences. In particular, we report evidence of massive copying behavior. We also present several striking real examples throughout the paper to showcase widespread adoption of this undesirable practice. In contrast to the popular perception, we find that copying tendency increases as an author matures. The copying behavior is reported to exist in all fields of computer science; however, the theoretical fields indicate more copying than the applied fields.

Saeed-Ul Hassan

Assistant Professor, Information Technology University

Mayank Singh

Indian Institute of Technology Kharagpur

Luca Weihs

University of Washington

Wednesday June 21, 2017 11:00 - 12:30
Innis Town Hall 2 Sussex Ave, Toronto, ON M5S 1J5
