Stacking-Based Ensemble Learning of Self-Media Data for Marketing Intention Detection

2019 ◽  
Vol 11 (7) ◽  
pp. 155 ◽  
Author(s):  
Yufeng Wang ◽  
Shuangrong Liu ◽  
Songqian Li ◽  
Jidong Duan ◽  
Zhihao Hou ◽  
...  

Social network services for self-media, such as Weibo, Blog, and WeChat Public, constitute a powerful medium that allows users to publish posts every day. Because information transparency is insufficient, malicious marketing in self-media posts can harm society. It is therefore necessary to identify news with marketing intentions. We treat the identification of marketing intentions as a text classification problem. Although methods for intention detection exist, two challenges remain: how to extract text features that reflect semantic information, and how to reduce the time and space complexity of the recognition model. To this end, this paper proposes a machine learning method to identify marketing intentions from large-scale We-Media data. First, the proposed Latent Semantic Indexing (LSI)-Word2vec model captures semantic features. Second, the decision tree model is simplified by pruning to save computing resources and reduce time complexity. Third, the paper examines the effects of classifier combinations and uses the optimal configuration to help people identify marketing intentions efficiently. Finally, a detailed experimental evaluation on several metrics shows that the approach is effective and efficient. The F1 value can be increased by about 5%, and the running time is increased by 20%, which indicates that the proposed method can effectively improve the accuracy of marketing news recognition.
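The abstract reports its gain as an F1 value. As a reminder of how that metric is computed, here is a minimal sketch in Python; the function name `f1_score` and all counts are hypothetical and are not taken from the paper.

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """Return F1, the harmonic mean of precision and recall."""
    if tp == 0:
        # No true positives: precision and recall are both zero (or undefined).
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical confusion counts, for illustration only (not the paper's data).
baseline = f1_score(tp=80, fp=20, fn=20)
improved = f1_score(tp=85, fp=15, fn=15)
print(f"baseline F1 = {baseline:.3f}, improved F1 = {improved:.3f}")
```

With these made-up counts precision and recall are equal in each case, so F1 equals that shared value.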

Author(s):  
Ajay Kumar Gupta

This chapter presents an overview of spam email as a serious problem on the internet and builds a spam filter that addresses weaknesses of earlier filters, providing better identification accuracy with less complexity. The J48 decision tree is a widely used classification technique thanks to its simple structure, high classification accuracy, and low time complexity, so it is used here as the spam mail classifier. With such a low-complexity model, however, it becomes difficult to achieve high accuracy on a large number of records. To overcome this problem, particle swarm optimization is used to optimize the spam base dataset, thereby optimizing the decision tree model and reducing the time complexity. Once the records have been standardized, the decision tree is used again to check the classification accuracy. The chapter also surveys spam-related issues, the filters in use, related work, and the potential scope of spam filtering.


Author(s):  
Yunwei Zhao ◽  
Can Wang ◽  
Chi-Hung Chi ◽  
Kwok-Yan Lam ◽  
Sen Wang

The availability of massive social media data has enabled the prediction of people’s future behavioral trends at an unprecedented scale. The study of information cascades on Twitter has been an integral part of behavior analysis. A number of methods based on transactional features (such as keyword frequency) and semantic features (such as sentiment) have been proposed to predict future cascading trends. However, an in-depth understanding of the pros and cons of semantic and transactional models is lacking. This paper conducts a comparative study of both approaches in predicting information diffusion with three mechanisms: retweet cascade, url cascade, and hashtag cascade. Experiments on Twitter data show that the semantic model outperforms the transactional model when the exterior pattern is less directly observable (i.e. hashtag cascade). When the pattern is more directly observable (i.e. retweet and url cascades), the semantic method delivers only comparable accuracy (url cascade) or even worse accuracy (retweet cascade). Further, we demonstrate that the transactional and semantic models are not independent, and that performance improves greatly when the two are combined.


Author(s):  
Avijit Kumar Chaudhuri ◽  
Deepankar Sinha ◽  
Dilip K. Banerjee ◽  
Anirban Das

Algorithms ◽  
2021 ◽  
Vol 14 (6) ◽  
pp. 176
Author(s):  
Wei Zhu ◽  
Xiaoyang Zeng

Applications have different preferences for caches, sometimes even across their own running phases. Caches with fixed parameters may therefore compromise the performance of a system. To solve this problem, we propose a real-time adaptive reconfigurable cache based on the decision tree algorithm, which can optimize the average memory access time of the cache without modifying the cache coherence protocol. By monitoring the running state of the application, the cache associativity is periodically tuned to the optimal value determined by the decision tree model. This paper implements the proposed decision tree-based adaptive reconfigurable cache in the GEM5 simulator and designs the key modules using Verilog HDL. The simulation results show that the proposed cache reduces the average memory access time compared with other adaptive algorithms.
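The quantity being optimized, average memory access time (AMAT), is conventionally defined as hit time plus miss rate times miss penalty. The sketch below only illustrates that metric and a brute-force choice among associativities; it is not the paper's decision tree. The hit times, miss rates, miss penalty, and the helper `best_associativity` are assumptions made up for illustration.

```python
def amat(hit_time: float, miss_rate: float, miss_penalty: float) -> float:
    """Average memory access time = hit time + miss rate * miss penalty."""
    return hit_time + miss_rate * miss_penalty


# Hypothetical per-associativity measurements: ways -> (hit time in cycles, miss rate).
# Higher associativity usually lowers the miss rate but lengthens the hit time.
CANDIDATES = {1: (1.0, 0.10), 2: (1.2, 0.06), 4: (1.5, 0.05), 8: (2.0, 0.045)}
MISS_PENALTY = 50.0  # cycles, assumed


def best_associativity(candidates: dict, miss_penalty: float) -> int:
    """Pick the associativity with the lowest AMAT by exhaustive comparison."""
    return min(candidates, key=lambda ways: amat(*candidates[ways], miss_penalty))


best = best_associativity(CANDIDATES, MISS_PENALTY)
print(best, amat(*CANDIDATES[best], MISS_PENALTY))
```

An adaptive cache cannot measure every configuration at run time, which is presumably why the paper predicts the best associativity from monitored state rather than searching as this sketch does.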


Diagnostics ◽  
2021 ◽  
Vol 11 (6) ◽  
pp. 1094
Author(s):  
Michael Wong ◽  
Nikolaos Thanatsis ◽  
Federica Nardelli ◽  
Tejal Amin ◽  
Davor Jurkovic

Background and aims: Postmenopausal endometrial polyps are commonly managed by surgical resection; however, expectant management may be considered for some women due to the presence of medical co-morbidities, failed hysteroscopies or patient’s preference. This study aimed to identify patient characteristics and ultrasound morphological features of polyps that could aid in the prediction of underlying pre-malignancy or malignancy in postmenopausal polyps. Methods: Consecutive women with postmenopausal polyps diagnosed on ultrasound and removed surgically were recruited prospectively between October 2015 and October 2018. Polyps were defined on ultrasound as focal lesions with a regular outline, surrounded by normal endometrium. On Doppler examination, there was either a single feeder vessel or no detectable vascularity. Polyps were classified histologically as benign (including hyperplasia without atypia), pre-malignant (atypical hyperplasia), or malignant. A Chi-squared automatic interaction detection (CHAID) decision tree analysis was performed with a range of demographic, clinical, and ultrasound variables as independent variables and the presence of pre-malignancy or malignancy in polyps as the dependent variable. A 10-fold cross-validation method was used to estimate the model’s misclassification risk. Results: There were 240 women included, 181 of whom presented with postmenopausal bleeding. Their median age was 60 (range of 45–94); 18/240 (7.5%) women were diagnosed with pre-malignant or malignant polyps. In our decision tree model, the polyp mean diameter (≤13 mm or >13 mm) on ultrasound was the most important predictor of pre-malignancy or malignancy. If the tree was allowed to grow, the patient’s body mass index (BMI) and cystic/solid appearance of the polyp classified women further into low-risk (≤5%), intermediate-risk (>5%–≤20%), or high-risk (>20%) groups.
Conclusions: Our decision tree model may serve as a guide to counsel women on the benefits and risks of surgery for postmenopausal endometrial polyps. It may also assist clinicians in prioritizing women for surgery according to their risk of malignancy.
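The three risk bands quoted in the abstract can be written as a small lookup. This is a minimal sketch that encodes only the stated thresholds (≤5%, >5% to ≤20%, >20%); the function name `risk_group` is hypothetical, and the leaf probabilities that would feed it are not given in the abstract.

```python
def risk_group(probability: float) -> str:
    """Map a predicted probability of pre-malignancy or malignancy to a risk band."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must lie in [0, 1]")
    if probability <= 0.05:
        return "low"
    if probability <= 0.20:
        return "intermediate"
    return "high"


# Overall prevalence reported in the abstract: 18 of 240 women.
prevalence = 18 / 240
print(f"prevalence = {prevalence:.1%}")

# Hypothetical leaf probabilities, for illustration only.
for p in (0.02, 0.12, 0.35):
    print(p, risk_group(p))
```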


IEEE Access ◽  
2019 ◽  
Vol 7 ◽  
pp. 114851-114861 ◽  
Author(s):  
Zhiguang Zhou ◽  
Xinlong Zhang ◽  
Xiaoyun Zhou ◽  
Yuhua Liu

2017 ◽  
Vol 2017 ◽  
pp. 1-6 ◽  
Author(s):  
Zhong Xin ◽  
Lin Hua ◽  
Xu-Hong Wang ◽  
Dong Zhao ◽  
Cai-Guo Yu ◽  
...  

We reanalyzed previous data to develop a simpler decision tree model as a screening tool for unrecognized diabetes, using basic information in Beijing community health records. The model was then validated in another rural town. The new model uses only three non-laboratory-based risk factors (age, BMI, and presence of hypertension) and has fewer branches. The sensitivity, specificity, positive predictive value, negative predictive value, and area under the curve (AUC) for detecting diabetes were calculated. The AUC values in the internal and external validation groups were 0.708 and 0.629, respectively. Subjects at high risk of diabetes had significantly higher HOMA-IR, but no significant difference in HOMA-B was observed. This simple tool will help general practitioners and residents assess the risk of diabetes quickly and easily. This study also validates the strong association between insulin resistance and the early stage of diabetes, suggesting that more attention should be paid to the current model in rural Chinese adult populations.
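The screening metrics listed in the abstract follow directly from a 2x2 confusion table, and AUC can be read as the probability that a randomly chosen case scores higher than a randomly chosen non-case. The sketch below shows both computations in Python; the function names, counts, and scores are hypothetical and do not reproduce the study's data.

```python
def screening_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Sensitivity, specificity, PPV, and NPV from confusion-table counts."""
    return {
        "sensitivity": tp / (tp + fn),  # cases correctly flagged
        "specificity": tn / (tn + fp),  # non-cases correctly cleared
        "ppv": tp / (tp + fp),          # flagged subjects who are cases
        "npv": tn / (tn + fn),          # cleared subjects who are non-cases
    }


def auc(scores_pos: list, scores_neg: list) -> float:
    """AUC as the fraction of (case, non-case) pairs ranked correctly; ties count half."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


# Hypothetical counts and risk scores, for illustration only.
metrics = screening_metrics(tp=30, fp=70, tn=130, fn=20)
example_auc = auc([0.9, 0.6, 0.4], [0.5, 0.4, 0.2, 0.1])
print(metrics, example_auc)
```

The pairwise loop is quadratic in sample size, which is fine for a sketch; a rank-based formula gives the same value more cheaply on large cohorts.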

