source code Latest Research Papers

An analysis between exact and approximate algorithms for the k-center problem in graphs

This research focuses on the k-center problem and its applications. Different methods for solving this problem are analyzed. Implementations of an exact algorithm and of an approximate algorithm are presented, and their source code and computational complexity are analyzed. The multitasking mode of the operating system is taken into account when measuring the execution time of the algorithms. The results show that the approximate algorithm finds solutions that are no worse than twice the optimal. In some cases these solutions are very close to the optimal ones, but this holds only for graphs with a small number of nodes. As the number of nodes in the graph increases (and with it the number of edges), the approximate solutions deviate from the optimal ones but remain acceptable. These results give reason to conclude that, for graphs with a small number of nodes, the approximate algorithm finds solutions comparable to those found by the exact algorithm.
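
The comparison described in this abstract can be illustrated with a minimal sketch. The listing does not show the paper's own source code, so the code below is an assumption: it pairs a brute-force exact search with the standard farthest-first greedy heuristic, which is one well-known way to obtain a solution within a factor of two of the optimum on shortest-path distances. The function names and the six-node example graph are hypothetical.

```python
from itertools import combinations

INF = float("inf")

def all_pairs_shortest_paths(n, edges):
    """Floyd-Warshall on an undirected weighted graph with nodes 0..n-1."""
    dist = [[0 if i == j else INF for j in range(n)] for i in range(n)]
    for u, v, w in edges:
        dist[u][v] = min(dist[u][v], w)
        dist[v][u] = min(dist[v][u], w)
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
    return dist

def radius(dist, centers):
    """Largest distance from any node to its nearest center."""
    return max(min(row[c] for c in centers) for row in dist)

def exact_k_center(dist, k):
    """Try every k-subset of nodes; feasible only for small graphs."""
    nodes = range(len(dist))
    return min(combinations(nodes, k), key=lambda cs: radius(dist, cs))

def greedy_k_center(dist, k, start=0):
    """Farthest-first traversal: keep adding the node farthest from all centers."""
    centers = [start]
    while len(centers) < k:
        farthest = max(range(len(dist)),
                       key=lambda v: min(dist[v][c] for c in centers))
        centers.append(farthest)
    return centers

# Hypothetical connected example graph: (u, v, weight)
edges = [(0, 1, 2), (1, 2, 2), (2, 3, 7), (3, 4, 1), (4, 5, 3), (0, 5, 9)]
dist = all_pairs_shortest_paths(6, edges)
opt = radius(dist, exact_k_center(dist, 2))
approx = radius(dist, greedy_k_center(dist, 2))
print("exact radius:", opt, "greedy radius:", approx)
```

The exact search examines every combination of k nodes, so its cost grows combinatorially with the graph size, while the greedy pass needs only k sweeps over the distance matrix. That trade-off is the kind of gap the abstract's execution-time comparison is concerned with.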

Documentation Matters: Human-Centered AI System to Assist Data Science Code Documentation in Computational Notebooks

ACM Transactions on Computer-Human Interaction ◽

10.1145/3489465 ◽

2022 ◽

Vol 29 (2) ◽

pp. 1-33

Author(s):

April Yi Wang ◽

Dakuo Wang ◽

Jaimie Drozdal ◽

Michael Muller ◽

Soya Park ◽

...

Keyword(s):

Machine Learning ◽

Deep Learning ◽

Data Science ◽

Source Code ◽

Generation System ◽

Document Code ◽

Human Data ◽

Within Subjects ◽

The Creation ◽

Api Documentation

Computational notebooks allow data scientists to express their ideas through a combination of code and documentation. However, data scientists often pay attention only to the code, and neglect creating or updating their documentation during quick iterations. Inspired by human documentation practices learned from 80 highly-voted Kaggle notebooks, we design and implement Themisto, an automated documentation generation system to explore how human-centered AI systems can support human data scientists in the machine learning code documentation scenario. Themisto facilitates the creation of documentation via three approaches: a deep-learning-based approach to generate documentation for source code, a query-based approach to retrieve online API documentation for source code, and a user prompt approach to nudge users to write documentation. We evaluated Themisto in a within-subjects experiment with 24 data science practitioners, and found that automated documentation generation techniques reduced the time for writing documentation, reminded participants to document code they would have ignored, and improved participants’ satisfaction with their computational notebook.

Why Do Developers Reject Refactorings in Open-Source Projects?

ACM Transactions on Software Engineering and Methodology ◽

10.1145/3487062 ◽

2022 ◽

Vol 31 (2) ◽

pp. 1-23

Author(s):

Jevgenija Pantiuchina ◽

Bin Lin ◽

Fiorella Zampetti ◽

Massimiliano Di Penta ◽

Michele Lanza ◽

...

Keyword(s):

Open Source ◽

Software Quality ◽

Good Practice ◽

Source Code ◽

Code Review ◽

Code Quality ◽

Shed Light

Refactoring operations are behavior-preserving changes aimed at improving source code quality. While refactoring is largely considered a good practice, refactoring proposals in pull requests are often rejected after code review. Understanding the reasons behind the rejection of refactoring contributions can shed light on how such contributions can be improved, ultimately benefiting software quality. This article reports a study in which we manually coded rejection reasons inferred from 330 refactoring-related pull requests from 207 open-source Java projects. We surveyed 267 developers to assess their perceived prevalence of these identified rejection reasons, further complementing the reasons. Our study resulted in a comprehensive taxonomy consisting of 26 refactoring-related rejection reasons and 21 process-related rejection reasons. The taxonomy, accompanied by representative examples and highlighted implications, provides developers with valuable insights on how to ponder and polish their refactoring contributions, and indicates a number of directions researchers can pursue toward better refactoring recommenders.

What You See is What it Means! Semantic Representation Learning of Code based on Visualization and Transfer Learning

ACM Transactions on Software Engineering and Methodology ◽

10.1145/3485135 ◽

2022 ◽

Vol 31 (2) ◽

pp. 1-34

Author(s):

Patrick Keller ◽

Abdoul Kader Kaboré ◽

Laura Plein ◽

Jacques Klein ◽

Yves Le Traon ◽

...

Keyword(s):

Transfer Learning ◽

Language Processing ◽

State Of The Art ◽

Semantic Representation ◽

Source Code ◽

Visual Representations ◽

Representation Learning ◽

Classification Problem ◽

Semantic Code ◽

Code Clone

Recent successes in training word embeddings for Natural Language Processing (NLP) tasks have encouraged a wave of research on representation learning for source code, which builds on similar NLP methods. The overall objective is to produce code embeddings that capture as much of the program semantics as possible. State-of-the-art approaches invariably rely on a syntactic representation (i.e., raw lexical tokens, abstract syntax trees, or intermediate representation tokens) to generate embeddings, which are criticized in the literature as non-robust or non-generalizable. In this work, we investigate a novel embedding approach based on the intuition that source code has visual patterns of semantics. We further use these patterns to address the outstanding challenge of identifying semantic code clones. We propose the WySiWiM (“What You See Is What It Means”) approach, where visual representations of source code are fed into powerful pre-trained image classification neural networks from the field of computer vision to benefit from the practical advantages of transfer learning. We evaluate the proposed embedding approach on the task of vulnerable code prediction in source code and on two variations of the task of semantic code clone identification: code clone detection (a binary classification problem) and code classification (a multi-class classification problem). We show with experiments on BigCloneBench (Java) and Open Judge (C) that, although simple, our WySiWiM approach performs as effectively as state-of-the-art approaches such as ASTNN or TBCNN. We also show with data from NVD and SARD that the WySiWiM representation can be used to learn a vulnerable code detector with reasonable performance (accuracy ∼90%). We further explore the influence of different steps in our approach, such as the choice of visual representations or the classification algorithm, and finally discuss the promises and limitations of this research direction.

On the Reproducibility and Replicability of Deep Learning in Software Engineering

ACM Transactions on Software Engineering and Methodology ◽

10.1145/3477535 ◽

2022 ◽

Vol 31 (1) ◽

pp. 1-46

Author(s):

Chao Liu ◽

Cuiyun Gao ◽

Xin Xia ◽

David Lo ◽

John Grundy ◽

...

Keyword(s):

Deep Learning ◽

Software Engineering ◽

Source Code ◽

Experimental Results ◽

Supervised Machine Learning ◽

Optimization Process ◽

Experimental Result ◽

Experimental Setup ◽

High Quality ◽

Two Factors

Context: Deep learning (DL) techniques have gained significant popularity among software engineering (SE) researchers in recent years. This is because they can often solve many SE challenges without enormous manual feature engineering effort and complex domain knowledge. Objective: Although many DL studies have reported substantial advantages over other state-of-the-art models on effectiveness, they often ignore two factors: (1) reproducibility: whether the reported experimental results can be obtained by other researchers using the authors’ artifacts (i.e., source code and datasets) with the same experimental setup; and (2) replicability: whether the reported experimental results can be obtained by other researchers using their re-implemented artifacts with a different experimental setup. We observed that DL studies commonly overlook these two factors and declare them as minor threats or leave them for future work. This is mainly due to high model complexity, with many manually set parameters and a time-consuming optimization process, unlike classical supervised machine learning (ML) methods (e.g., random forest). This study aims to investigate the urgency and importance of reproducibility and replicability for DL studies on SE tasks. Method: In this study, we conducted a literature review on 147 DL studies recently published in 20 SE venues and 20 AI (Artificial Intelligence) venues to investigate these issues. We also re-ran four representative DL models in SE to investigate important factors that may strongly affect the reproducibility and replicability of a study. Results: Our statistics show the urgency of investigating these two factors in SE, where only 10.2% of the studies investigate any research question to show that their models can address at least one issue of replicability and/or reproducibility. More than 62.6% of the studies do not even share high-quality source code or complete data to support the reproducibility of their complex models. Meanwhile, our experimental results show the importance of reproducibility and replicability: the reported performance of a DL model could not be reproduced when the optimization process was unstable. Replicability could be substantially compromised if the model training is not convergent, or if performance is sensitive to the size of the vocabulary and testing data. Conclusion: It is urgent for the SE community to provide a long-lasting link to a high-quality reproduction package, enhance DL-based solution stability and convergence, and avoid performance sensitivity on different sampled data.
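
The role of an unstable optimization process can be made concrete with a toy sketch. This is not code from the study; it assumes a hypothetical noisy gradient-descent "training" routine (`train_once`) and only illustrates why fixing the random seed makes a run repeatable while different seeds expose run-to-run variance.

```python
import random
import statistics

def train_once(seed, steps=200, lr=0.05):
    """Toy stochastic optimizer: noisy gradient descent on f(w) = (w - 3)^2."""
    rng = random.Random(seed)      # all randomness flows from this one seed
    w = rng.uniform(-10, 10)       # random initialization
    for _ in range(steps):
        noisy_grad = 2 * (w - 3) + rng.gauss(0, 1)
        w -= lr * noisy_grad
    return (w - 3) ** 2            # final loss

# Same seed, same setup: the result is identical on every run.
same_seed = [train_once(seed=42) for _ in range(3)]

# Different seeds: the reported metric becomes a distribution, not a number.
different_seeds = [train_once(seed=s) for s in range(10)]

print("same seed:", same_seed)
print("mean loss:", statistics.mean(different_seeds))
print("std dev  :", statistics.stdev(different_seeds))
```

Reporting the mean and spread over several seeds, rather than a single run, is one simple way to show whether a result depends on a lucky initialization.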

A Survey of Flaky Tests

ACM Transactions on Software Engineering and Methodology ◽

10.1145/3476105 ◽

2022 ◽

Vol 31 (1) ◽

pp. 1-74

Author(s):

Owain Parry ◽

Gregory M. Kapfhammer ◽

Michael Hilton ◽

Phil McMinn

Keyword(s):

Software Testing ◽

Source Code ◽

Daily Basis ◽

Research Area ◽

The Body ◽

Clear Indication ◽

Software Developers ◽

Software Bugs ◽

Detection Strategies ◽

Test Suites

Tests that fail inconsistently, without changes to the code under test, are described as flaky. Flaky tests do not give a clear indication of the presence of software bugs and thus limit the reliability of the test suites that contain them. A recent survey of software developers found that 59% claimed to deal with flaky tests on a monthly, weekly, or daily basis. As well as being detrimental to developers, flaky tests have also been shown to limit the applicability of useful techniques in software testing research. In general, one can think of flaky tests as being a threat to the validity of any methodology that assumes the outcome of a test only depends on the source code it covers. In this article, we systematically survey the body of literature relevant to flaky test research, amounting to 76 papers. We split our analysis into four parts: addressing the causes of flaky tests, their costs and consequences, detection strategies, and approaches for their mitigation and repair. Our findings and their implications have consequences for how the software-testing community deals with test flakiness, pertinent to practitioners and of interest to those wanting to familiarize themselves with the research area.
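
The definition at the start of this abstract suggests the simplest detection strategy: rerun a test against unchanged code and check whether the outcomes disagree. The sketch below is an illustration under that assumption, not a technique taken from the survey; `classify_by_rerun` and the two example tests are hypothetical names.

```python
def classify_by_rerun(test_fn, reruns=20):
    """Run test_fn repeatedly on unchanged code; mixed outcomes mean flaky.

    Assumes reruns >= 1 and that failures surface as AssertionError.
    """
    outcomes = set()
    for _ in range(reruns):
        try:
            test_fn()
            outcomes.add("pass")
        except AssertionError:
            outcomes.add("fail")
    if outcomes == {"pass", "fail"}:
        return "flaky"
    return "stable-" + outcomes.pop()

def make_state_leaking_test():
    """Hypothetical test polluted by state left over from earlier runs."""
    state = {"calls": 0}
    def test():
        state["calls"] += 1
        assert state["calls"] % 3 != 0   # fails on every third run
    return test

def deterministic_test():
    assert sorted([3, 1, 2]) == [1, 2, 3]

print(classify_by_rerun(make_state_leaking_test()))   # prints "flaky"
print(classify_by_rerun(deterministic_test))          # prints "stable-pass"
```

Rerunning can only confirm flakiness, never rule it out: a test that passed on every rerun may still fail under a different schedule, machine, or test order.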

DDA-SKF: Predicting Drug–Disease Associations Using Similarity Kernel Fusion

Frontiers in Pharmacology ◽

10.3389/fphar.2021.784171 ◽

2022 ◽

Vol 12 ◽

Author(s):

Chu-Qiao Gao ◽

Yuan-Ke Zhou ◽

Xiao-Hong Xin ◽

Hui Min ◽

Pu-Feng Du

Keyword(s):

Computational Model ◽

State Of The Art ◽

Drug Repositioning ◽

Source Code ◽

Orphan Drugs ◽

Kernel Fusion ◽

Disease Associations ◽

Laplacian Regularized Least Squares ◽

Novel Drug ◽

Similarity Information

Drug repositioning provides a promising and efficient strategy to discover potential associations between drugs and diseases. Many systematic computational drug-repositioning methods have been introduced, which are based on various similarities of drugs and diseases. In this work, we propose a new computational model, DDA-SKF (drug–disease associations prediction using similarity kernel fusion), which can predict novel drug indications by utilizing similarity kernel fusion (SKF) and Laplacian regularized least squares (LapRLS) algorithms. DDA-SKF integrates multiple similarities of drugs and diseases. The prediction performance of DDA-SKF is better than, or at least comparable to, that of all state-of-the-art methods. DDA-SKF can work without sufficient similarity information between drug indications. This allows us to predict new purposes for orphan drugs. The source code and benchmarking datasets are deposited in a GitHub repository (https://github.com/GCQ2119216031/DDA-SKF).

Implicit implementation of the nonlocal operator method: an open source code

Engineering With Computers ◽

10.1007/s00366-021-01537-x ◽

2022 ◽

Author(s):

Yongzheng Zhang ◽

Huilong Ren

Keyword(s):

Open Source ◽

Variational Principles ◽

Source Code ◽

Gradient Elasticity ◽

Operator Method ◽

Nonlinear Problems ◽

Energy Functional ◽

Benchmark Problems ◽

Nonlocal Operator ◽

Open Source Code

In this paper, we present an open-source code for the first-order and higher-order nonlocal operator method (NOM), including a detailed description of the implementation. The NOM is based on the so-called support, dual-support, and nonlocal operators, and an operator energy functional ensuring stability. The nonlocal operator is a generalization of the conventional differential operators. Combined with the method of weighted residuals and variational principles, NOM establishes the residual and tangent stiffness matrix of the operator energy functional through simple matrix operations, without the need for shape functions as in other classical computational methods such as FEM. NOM only requires the definition of the energy, which drastically simplifies its implementation. For the sake of conciseness, the implementation in this paper focuses on linear elastic solids, though the NOM can handle more complex nonlinear problems. The NOM is flexible and efficient for solving partial differential equations (PDEs), and it is also quite easy for readers to use the NOM and extend it to other complicated physical phenomena described by one PDE or a set of PDEs. Finally, we present some classical benchmark problems, including the classical cantilever beam and plate-with-a-hole problems, and we also extend the method to complicated problems including phase-field fracture modeling and gradient elasticity materials.

Sensitive visualization of SARS-CoV-2 RNA with CoronaFISH

Life Science Alliance ◽

10.26508/lsa.202101124 ◽

2022 ◽

Vol 5 (4) ◽

pp. e202101124

Author(s):

Elena Rensen ◽

Stefano Pietropaoli ◽

Florian Mueller ◽

Christian Weber ◽

Sylvie Souquere ◽

...

Keyword(s):

Human Tissue ◽

Rna Virus ◽

Source Code ◽

Initial Application ◽

Green Monkey ◽

Positive Sense ◽

Therapeutic Molecules ◽

Infected Cells ◽

Detailed Protocol ◽

Transcription And Replication

The current COVID-19 pandemic is caused by the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). The positive-sense single-stranded RNA virus contains a single linear RNA segment that serves as a template for transcription and replication, leading to the synthesis of positive and negative-stranded viral RNA (vRNA) in infected cells. Tools to visualize vRNA directly in infected cells are critical to analyze the viral replication cycle, screen for therapeutic molecules, or study infections in human tissue. Here, we report the design, validation, and initial application of FISH probes to visualize positive or negative RNA of SARS-CoV-2 (CoronaFISH). We demonstrate sensitive visualization of vRNA in African green monkey and several human cell lines, in patient samples and human tissue. We further demonstrate the adaptation of CoronaFISH probes to electron microscopy. We provide all required oligonucleotide sequences, source code to design the probes, and a detailed protocol. We hope that CoronaFISH will complement existing techniques for research on SARS-CoV-2 biology and COVID-19 pathophysiology, drug screening, and diagnostics.

Evolution Process and Supply Chain Adaptation of Smart Contracts in Blockchain

Journal of Mathematics ◽

10.1155/2022/2839566 ◽

2022 ◽

Vol 2022 ◽

pp. 1-13

Author(s):

Yue Wu ◽

Junxiang Li ◽

Jiru Zhou ◽

Shichang Luo ◽

Liwei Song

Keyword(s):

Supply Chain ◽

Business Process ◽

Source Code ◽

Evolution Process ◽

Future Research ◽

Chain Model ◽

Smart Contracts ◽

Application Field ◽

Smart Contract ◽

Block Chain

Because of its decentralization, encryption, reliability, and tamper resistance, the blockchain system frees smart contracts from the lack of a trusted environment, and their application field keeps expanding. We read the source code and official documents of Bitcoin, Ethereum, and Hyperledger to explore the operating principles and implementation modes of smart contracts. By analyzing the evolution of smart contracts in blockchain and the sequence in which their functions expanded, and according to the multirole business process of the supply chain, we design a semipublic smart contract chain model based on Ethereum and Hyperledger, in order to provide useful inspiration and help for future research on smart contracts in blockchain applied to the supply chain.

source code
Recently Published Documents

An analysis between exact and approximate algorithms for the k-center problem in graphs

Documentation Matters: Human-Centered AI System to Assist Data Science Code Documentation in Computational Notebooks

Why Do Developers Reject Refactorings in Open-Source Projects?

What You See is What it Means! Semantic Representation Learning of Code based on Visualization and Transfer Learning

On the Reproducibility and Replicability of Deep Learning in Software Engineering

A Survey of Flaky Tests

DDA-SKF: Predicting Drug–Disease Associations Using Similarity Kernel Fusion

Implicit implementation of the nonlocal operator method: an open source code

Sensitive visualization of SARS-CoV-2 RNA with CoronaFISH

Evolution Process and Supply Chain Adaptation of Smart Contracts in Blockchain
