nearest neighbours
Recently Published Documents


TOTAL DOCUMENTS: 681 (five years: 207)
H-INDEX: 39 (five years: 5)

2022 ◽  
Vol 2161 (1) ◽  
pp. 012017
Author(s):  
Krishnaraj Chadaga ◽  
Srikanth Prabhu ◽  
K Vivekananda Bhat ◽  
Shashikiran Umakanth ◽  
Niranjana Sampathila

Abstract Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2), colloquially known as Coronavirus, surfaced in late 2019 and causes an extremely dangerous disease. RT-PCR (reverse transcription polymerase chain reaction) tests are extensively used in COVID-19 diagnosis. However, they are prone to false negatives and erroneous results. Hence, alternate methods are being researched and discovered for the detection of this infectious disease. In this paper, we diagnose and forecast COVID-19 with the help of routine blood tests and artificial intelligence. The COVID-19 patient dataset was obtained from Israelita Albert Einstein Hospital, Brazil. Logistic regression, random forest, k-nearest neighbours and XGBoost were the classifiers used for prediction. Since the dataset was extremely unbalanced, a technique called SMOTE was used to perform oversampling. Random forest obtained optimal results with an accuracy of 92%. The most important parameters according to the study were leukocytes, eosinophils, platelets and monocytes. This preliminary COVID-19 detection can be utilised in conjunction with RT-PCR testing to improve sensitivity, as well as in further pandemic outbreaks.
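The SMOTE oversampling step mentioned in the abstract can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation (they would likely use a library such as imblearn); the function name and parameters here are assumptions for illustration. Each synthetic sample interpolates between a minority-class point and one of its k nearest minority-class neighbours:

```python
import math
import random

def smote(minority, k=3, n_synthetic=10, seed=0):
    """Minimal SMOTE-style oversampling sketch: each synthetic sample
    lies on the segment between a minority point and one of its k
    nearest minority-class neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_synthetic):
        x = rng.choice(minority)
        # k nearest minority neighbours of x (excluding x itself)
        neighbours = sorted(
            (p for p in minority if p is not x),
            key=lambda p: math.dist(x, p),
        )[:k]
        nn = rng.choice(neighbours)
        u = rng.random()  # interpolation factor in [0, 1)
        synthetic.append(tuple(a + u * (b - a) for a, b in zip(x, nn)))
    return synthetic
```

Because every synthetic point is a convex combination of two real minority points, the oversampled class stays inside the convex hull of the original minority samples.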


E-methodology ◽  
2021 ◽  
Vol 7 (7) ◽  
pp. 71-84
Author(s):  
ANDRZEJ BUDA ◽  
KATARZYNA KUŹMICZ

Aim: In our research, we examine universal properties of a global network whose structure represents a real-world network and might later be extended to social media, the commodity market, or countries under the influence of diseases such as Covid-19 or ASF. Methods: We propose a quasi-epidemiological agent-based model of virus spread on a network. Firstly, we consider countries represented by subnetworks that have a scale-free structure, achieved by the preferential attachment construction with a node hierarchy and binary edges. The global network of countries is a complete, directed, weighted network of these subnetworks connected by their capitals and divided by cultural and geographical proximity. Viruses with a defined strength or aggressiveness occur independently at one of the nodes of a selected subnetwork and correspond to a piece of products, messages, or diseases. Results and conclusion: We analyse dynamics set by varying parameter values and observe a variety of phenomena, including local and global pandemics and the existence of an epidemic threshold in the subnetworks. These phenomena have also been shown from individual users' points of view, because the removal of a node from the network may affect its nearest neighbours differently. Selective participation in the global network is proposed here to avoid side effects when the global network has become fully connected and is no longer divided into clusters.
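The preferential attachment construction the abstract relies on for its scale-free subnetworks can be sketched in a few lines. This is a generic Barabási–Albert-style growth process, not the authors' exact model (their version adds a node hierarchy and country structure); the function name and parameters are assumptions:

```python
import random

def preferential_attachment(n, m=2, seed=0):
    """Grow a scale-free network: each new node attaches to m existing
    nodes chosen with probability proportional to their current degree."""
    rng = random.Random(seed)
    edges = set()
    # start from a small clique of m + 1 nodes
    for a in range(m + 1):
        for b in range(a + 1, m + 1):
            edges.add((a, b))
    # degree-weighted sampling via a list with one entry per edge endpoint
    repeated = [v for e in edges for v in e]
    for new in range(m + 1, n):
        chosen = set()
        while len(chosen) < m:
            chosen.add(rng.choice(repeated))  # high-degree nodes picked more often
        for t in chosen:
            edges.add((min(new, t), max(new, t)))
            repeated.extend([new, t])
    return edges
```

The repeated-endpoint list makes degree-proportional sampling trivial: a node of degree d appears d times, so `rng.choice` implements "rich get richer" directly.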


Sensors ◽  
2021 ◽  
Vol 22 (1) ◽  
pp. 139
Author(s):  
Yu Miao ◽  
Alan Hunter ◽  
Ioannis Georgilas

OctoMap is an efficient probabilistic mapping framework for building occupancy maps from point clouds, representing 3D environments with cubic nodes in an octree. However, the map update policy in OctoMap has limitations. All nodes containing points are assigned the same probability regardless of whether those points are noise, and the probability of such a node can only be increased by a single measurement. In addition, potentially occupied nodes with points inside but traversed by rays cast from the sensor to endpoints will be marked as free. To overcome these limitations in OctoMap, the current work presents a mapping method that uses the context of neighbouring points to update nodes containing points, with the occupancy information of a point represented by the average distance from the point to its k-Nearest Neighbours. A relationship between this distance and the change in probability is defined with the cumulative distribution function of average distances, potentially decreasing the probability of a node despite points being present inside it. Experiments are conducted on 20 data sets to compare the proposed method with OctoMap. Results show that our method can achieve up to 10% improvement over the optimal performance of OctoMap.
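The core idea, average k-nearest-neighbour distance mapped through a CDF into a log-odds occupancy update, can be sketched as follows. The clamping bounds, the toy CDF, and the function names are assumptions for illustration, not the paper's actual parameters:

```python
import math

def avg_knn_distance(point, cloud, k=3):
    """Average Euclidean distance from `point` to its k nearest
    neighbours in `cloud` (a proxy for local point density)."""
    dists = sorted(math.dist(point, q) for q in cloud if q != point)
    return sum(dists[:k]) / k

def occupancy_update(logodds, d, cdf):
    """Map an average kNN distance through a CDF to an occupancy
    probability: small distances (dense support) push the node towards
    occupied, large ones (likely noise) towards free."""
    p_occ = 1.0 - cdf(d)                 # dense neighbourhood -> high occupancy
    p_occ = min(max(p_occ, 0.05), 0.95)  # clamp, as in standard log-odds maps
    return logodds + math.log(p_occ / (1.0 - p_occ))
```

Unlike a fixed hit/miss update, the increment here varies continuously with point density, so a node's probability can decrease even when points fall inside it.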


Author(s):  
Syed Saad Amer ◽  
Gurleen Wander ◽  
Manmeet Singh ◽  
Rami Bahsoon ◽  
Nicholas R. Jennings ◽  
...  

Heart disease kills more people around the world than any other disease, and it is one of the leading causes of death in the UK, triggering up to 74,000 deaths per year. An essential part of preventing deaths from heart disease, and thus heart disease itself, is the analysis of biomedical markers to determine the risk of a person developing heart disease. Much research has been conducted to assess the accuracy of detecting heart disease by analyzing biomedical markers. However, no previous study has attempted to identify the biomedical markers which are most important in this identification. To solve this problem, we proposed a machine learning-based intelligent heart disease prediction system called BioLearner for the determination of vital biomedical markers. This study aims to improve upon the accuracy of predicting heart disease and to identify the most essential biological markers, with the intention of composing the set of markers that impacts the development of heart disease the most. Multiple factors determine whether or not a person develops heart disease. These factors are thought to include age, history of chest pain (of different types), fasting blood sugar, heart rate, smoking, and other essential factors. The dataset is analyzed, and the different aspects are compared. Various machine learning models such as k-Nearest Neighbours, Neural Networks, and Support Vector Machine (SVM) are trained and used to determine the accuracy of our prediction for future heart disease development. BioLearner is able to predict the risk of heart disease with an accuracy of 95%, much higher than the baseline methods.
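The k-Nearest Neighbours classifier named among the models reduces to a majority vote over the closest training points. A minimal sketch (the function name and data layout are assumptions, not BioLearner's API):

```python
import math
from collections import Counter

def knn_predict(train, query, k=3):
    """k-Nearest Neighbours majority vote: `train` is a list of
    (feature_vector, label) pairs; the query inherits the most common
    label among its k closest training points."""
    neighbours = sorted(train, key=lambda pair: math.dist(pair[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]
```

In a real risk model the features (age, heart rate, blood sugar, etc.) would first be scaled, since raw Euclidean distance lets large-valued features dominate the vote.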


F1000Research ◽  
2021 ◽  
Vol 10 ◽  
pp. 1274
Author(s):  
Nurulhuda Mustafa ◽  
Lew Sook Ling ◽  
Siti Fatimah Abdul Razak

Background: Customer churn is a term that refers to the rate at which customers leave a business. Churn could be due to various factors, including switching to a competitor, cancelling a subscription because of poor customer service, or discontinuing all contact with a brand due to insufficient touchpoints. Long-term relationships with customers are more effective than trying to attract new customers. A rise of 5% in customer satisfaction is followed by a 95% increase in sales. By analysing past behaviour, companies can anticipate future revenue. This article examines which variables in the Net Promoter Score (NPS) dataset influence customer churn in Malaysia's telecommunications industry. The aim of this study was to identify the factors behind customer churn and to propose a churn prediction framework currently lacking in the telecommunications industry. Methods: This study applied data mining techniques to the NPS dataset from a Malaysian telecommunications company for September 2019 and September 2020, analysing 7776 records with 30 fields to determine which variables were significant for the churn prediction model. We developed a propensity model for customer churn using Logistic Regression, Linear Discriminant Analysis, the K-Nearest Neighbours Classifier, Classification and Regression Trees (CART), Gaussian Naïve Bayes, and a Support Vector Machine, using 33 variables. Results: Customer churn is elevated for customers with a low NPS. However, an immediate helpdesk can act as a neutral party to ensure that customer needs are met and to determine an employee's ability to achieve customer satisfaction. Conclusions: It can be concluded that CART gives the most accurate churn prediction (98%). However, the research was prohibited from accessing personal customer information under Malaysia's data protection policy. The results are expected to help other businesses measure potential customer churn using NPS scores gathered from customer feedback.
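CART, the best-performing model here, grows a tree by repeatedly choosing the split that minimises weighted Gini impurity. A single-feature split search can be sketched as below (an illustrative fragment of the CART idea, not the study's pipeline; names are assumptions):

```python
def gini(labels):
    """Gini impurity of a label multiset."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = {}
    for y in labels:
        counts[y] = counts.get(y, 0) + 1
    return 1.0 - sum((c / n) ** 2 for c in counts.values())

def best_split(xs, ys):
    """CART-style split search on one feature: try midpoints between
    consecutive sorted values, keep the threshold minimising the
    weighted Gini impurity of the two resulting sides."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    xs = [xs[i] for i in order]
    ys = [ys[i] for i in order]
    best_t, best_score = None, float("inf")
    for i in range(1, len(xs)):
        if xs[i] == xs[i - 1]:
            continue
        t = (xs[i] + xs[i - 1]) / 2
        left, right = ys[:i], ys[i:]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(ys)
        if score < best_score:
            best_t, best_score = t, score
    return best_t, best_score
```

A full CART implementation applies this search over every feature at every node and recurses until a purity or depth criterion is met.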


Algorithms ◽  
2021 ◽  
Vol 14 (12) ◽  
pp. 356
Author(s):  
Szabolcs Szekér ◽  
Ágnes Vathy-Fogarassy

An essential criterion for the proper implementation of case-control studies is selecting appropriate case and control groups. In this article, a new simulated annealing-based control group selection method is proposed, which solves the problem of selecting individuals for the control group as a distance optimization task. The proposed algorithm pairs the individuals in the n-dimensional feature space by minimizing the weighted distances between them. The weights of the dimensions are based on the odds ratios calculated from the logistic regression model fitted on the variables describing the probability of membership in the treated group. For finding the optimal pairing of the individuals, simulated annealing is utilized. The effectiveness of the newly proposed Weighted Nearest Neighbours Control Group Selection with Simulated Annealing (WNNSA) algorithm is demonstrated through two Monte Carlo studies. Results show that the WNNSA method can outperform the widely applied greedy propensity score matching method in feature spaces where only a few covariates characterize individuals and the covariates can only take a few values.
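The pairing-as-optimization idea can be sketched as a small simulated annealing loop over case-control assignments. This is a generic sketch, not the WNNSA algorithm itself: the cooling schedule, the swap proposal, and the function names are assumptions, and the weights here stand in for the odds-ratio-derived weights the paper describes:

```python
import math
import random

def sa_pairing(cases, controls, weights, iters=2000, seed=0):
    """Simulated-annealing matching sketch (assumes equal group sizes):
    start from an arbitrary case-control assignment, propose swapping
    the controls of two cases, and accept worse totals with a
    temperature-dependent probability."""
    rng = random.Random(seed)

    def wdist(a, b):
        # weighted Euclidean distance between two feature vectors
        return math.sqrt(sum(w * (x - y) ** 2 for w, x, y in zip(weights, a, b)))

    def total(assign):
        return sum(wdist(c, controls[assign[i]]) for i, c in enumerate(cases))

    assign = list(range(len(cases)))  # case i paired with controls[assign[i]]
    cost = total(assign)
    for step in range(iters):
        temp = (1 - step / iters) + 1e-9  # linear cooling
        i, j = rng.sample(range(len(cases)), 2)
        assign[i], assign[j] = assign[j], assign[i]
        new_cost = total(assign)
        if new_cost <= cost or rng.random() < math.exp((cost - new_cost) / temp):
            cost = new_cost
        else:
            assign[i], assign[j] = assign[j], assign[i]  # reject: undo swap
    return assign, cost
```

Unlike greedy matching, the occasional acceptance of worse pairings lets the search escape locally optimal but globally poor assignments.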


Sensors ◽  
2021 ◽  
Vol 21 (23) ◽  
pp. 8091
Author(s):  
Khadijat A. Olorunlambe ◽  
Zhe Hua ◽  
Duncan E. T. Shepherd ◽  
Karl D. Dearn

Acoustic emission (AE) testing detects the onset and progression of mechanical flaws. AE as a diagnostic tool is gaining traction for providing a tribological assessment of human joints and orthopaedic implants. There is potential for using AE as a tool for diagnosing joint pathologies such as osteoarthritis and implant failure, but the signal analysis must differentiate between wear mechanisms, a challenging problem! In this study, we use supervised learning to classify AE signals from adhesive and abrasive wear under controlled joint conditions. Uncorrelated AE features were derived using principal component analysis and classified using three methods: logistic regression, k-nearest neighbours (KNN), and a back-propagation (BP) neural network. The BP network performed best, with a classification accuracy of 98%, representing an exciting development for the clustering and supervised classification of AE signals as a bio-tribological diagnostic tool.
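The decorrelation step, principal component analysis, amounts to finding the dominant eigenvectors of the feature covariance matrix. A dependency-free sketch using power iteration for the first component (an illustration of the technique, not the study's code; names are assumptions):

```python
import random

def top_principal_component(data, iters=200, seed=0):
    """First principal component via power iteration on the sample
    covariance matrix (no eigen-solver needed for this sketch)."""
    n, d = len(data), len(data[0])
    means = [sum(row[j] for row in data) / n for j in range(d)]
    centred = [[row[j] - means[j] for j in range(d)] for row in data]
    # sample covariance matrix C = X^T X / (n - 1)
    cov = [[sum(r[a] * r[b] for r in centred) / (n - 1) for b in range(d)]
           for a in range(d)]
    rng = random.Random(seed)
    v = [rng.random() for _ in range(d)]
    for _ in range(iters):
        # multiply by C, then renormalise; converges to the top eigenvector
        w = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    return v
```

Projecting the AE features onto the leading components yields the uncorrelated inputs fed to the classifiers.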


2021 ◽  
Vol 13 (2) ◽  
pp. 1199-1208
Author(s):  
N. Ajaypradeep ◽  
Dr.R. Sasikala

Autism is a developmental disorder which affects the cognitive, social and behavioural functioning of a person. When a person is affected by autism spectrum disorder, he or she will exhibit peculiar behaviours, and those symptoms begin in the patient's childhood. Early diagnosis of autism is an important and challenging task. Behavioural analysis, a well-known therapeutic practice, can be adopted for earlier diagnosis of autism. Machine learning is a computational methodology which can be applied to a wide range of applications in order to obtain efficient outputs; at present it is especially applied in medical applications such as disease prediction. In our study we evaluated various machine learning algorithms (Naïve Bayes (NB), Support Vector Machines (SVM) and k-Nearest Neighbours (KNN)) with k-fold cross validation on 3 datasets retrieved from the UCI repository. Additionally, we validated the effective accuracy of the estimated results using a clustered cross validation strategy. Employing clustered cross validation scrutinises the parameters which contribute the most importance in the dataset. The strategy induces hyper-parameter tuning, which yields trusted results as it involves double validation. On application of clustered cross validation to an SVM-based model, we obtained an accuracy of 99.6% on the autism child dataset.
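The k-fold cross validation underpinning the evaluation partitions the samples into k disjoint test folds, training on the remainder each time. A minimal index-splitting sketch (libraries such as scikit-learn provide this; the function name here is an assumption):

```python
def k_fold_splits(n_samples, k=5):
    """Yield (train_indices, test_indices) for k-fold cross validation;
    every sample appears in exactly one test fold."""
    indices = list(range(n_samples))
    fold_size, remainder = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        # spread the remainder over the first folds so sizes differ by at most 1
        stop = start + fold_size + (1 if fold < remainder else 0)
        test = indices[start:stop]
        train = indices[:start] + indices[stop:]
        yield train, test
        start = stop
```

Averaging a model's score over the k test folds gives the cross-validated accuracy; the clustered variant described in the abstract additionally groups related samples so they never straddle a train/test boundary.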


2021 ◽  
Vol 2145 (1) ◽  
pp. 012065
Author(s):  
K Ketthong ◽  
S Pulpirom ◽  
L Rianthakool ◽  
K Prasanai ◽  
C Na Takuathung ◽  
...  

Abstract We simulate wave propagation through various media using a graph-theoretical path-finding algorithm. The media are discretized into square lattices, where each node is connected to up to 4 of its nearest neighbours. The edge connecting any 2 nodes is weighted by the time of flight of the wave between the nodes, calculated from the Euclidean distance between the nodes divided by the average velocity at the positions of those nodes. According to Fermat’s principle of least time, wave propagation between 2 nodes will follow the path with minimal weight. We thus use the path-finding algorithm to find such a path. We apply our method to simulate wave propagation from a point source through a homogeneous medium. By defining a wavefront as a contour of nodes with the same time of flight, we obtain a spherical wave as expected. We next investigate wave propagation through the boundary of 2 media with different wave velocities. The result shows wave refraction that exactly follows Snell’s law. Finally, we apply the algorithm to determine the velocity model in a wood sample, where the wave velocity is determined by the angle between the propagation direction and the radial direction from its pith. By comparing the time of flight from our simulation with measurements, the parameters in the velocity model can be obtained. The advantage of our method is its simplicity and straightforwardness. In all the above simulations, the same simple path-finding code is used, regardless of the complexity of the wave velocity model of the media. We expect that our method can be useful in practice when an investigation of wave propagation in a complex medium is needed.
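The least-time formulation maps directly onto Dijkstra's shortest-path algorithm. The sketch below assumes a 4-connected lattice with edge weight = spacing divided by the mean velocity of the two endpoints, as the abstract describes; the function name and grid layout are assumptions, not the paper's code:

```python
import heapq

def travel_time(velocity, src, dst, h=1.0):
    """Least time of flight on a 4-connected lattice (Fermat's principle
    as a shortest-path problem). `velocity` is a 2D grid of wave speeds;
    edge weight = node spacing h / mean velocity of the two endpoints."""
    rows, cols = len(velocity), len(velocity[0])
    dist = {src: 0.0}
    heap = [(0.0, src)]
    while heap:
        t, (r, c) = heapq.heappop(heap)
        if (r, c) == dst:
            return t
        if t > dist.get((r, c), float("inf")):
            continue  # stale heap entry
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols:
                nt = t + h / ((velocity[r][c] + velocity[nr][nc]) / 2)
                if nt < dist.get((nr, nc), float("inf")):
                    dist[(nr, nc)] = nt
                    heapq.heappush(heap, (nt, (nr, nc)))
    return float("inf")
```

Running this from a point source and contouring `dist` by equal time reproduces the wavefronts the abstract describes; a heterogeneous `velocity` grid needs no change to the code.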


Information ◽  
2021 ◽  
Vol 12 (12) ◽  
pp. 491
Author(s):  
Erjon Skenderi ◽  
Jukka Huhtamäki ◽  
Kostas Stefanidis

In this paper, we consider the task of assigning relevant labels to studies in the social science domain. Manual labelling is an expensive process and prone to human error. Various multi-label text classification machine learning approaches have been proposed to resolve this problem. We introduce a dataset obtained from the Finnish Social Science Archive, comprising the metadata of 2968 research studies. The metadata of each study includes attributes such as the “abstract” and the “set of labels”. We used Bag of Words (BoW), TF-IDF term weighting and pretrained word embeddings obtained from FastText and BERT models to generate the text representations for each study’s abstract field. Our selection of multi-label classification methods includes a Naive approach, Multi-label k Nearest Neighbours (ML-kNN), Multi-Label Random Forest (ML-RF), X-BERT and Parabel. The methods were combined with the text representation techniques and their performance was evaluated on our dataset. We measured the classification accuracy of the combinations using Precision, Recall and F1 metrics. In addition, we used the Normalized Discounted Cumulative Gain to measure the label ranking performance of the selected methods combined with the text representation techniques. The results showed that the ML-RF model achieved a higher classification accuracy with the TF-IDF features and, based on the ranking score, the Parabel model outperformed the other methods.
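The TF-IDF weighting that worked best with ML-RF can be sketched in a few lines: term frequency within a document scaled by the log of the inverse document frequency. A minimal illustration (the function name and the unsmoothed IDF variant are assumptions; production code would typically use a library vectoriser):

```python
import math
from collections import Counter

def tfidf(docs):
    """TF-IDF weights for tokenised documents: term frequency scaled by
    log(N / document frequency). Returns one {term: weight} dict per doc."""
    n = len(docs)
    df = Counter()
    for doc in docs:
        df.update(set(doc))  # count each term once per document
    weights = []
    for doc in docs:
        tf = Counter(doc)
        weights.append({t: (c / len(doc)) * math.log(n / df[t])
                        for t, c in tf.items()})
    return weights
```

With this unsmoothed IDF, a term appearing in every document gets weight zero, which is exactly why TF-IDF downweights boilerplate vocabulary shared across all abstracts.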

