online queries

2021 ◽  
Author(s):  
Kuvat Momynaliev ◽  
Dimash Khoroshun ◽  
Vasiliy Akimkin

Monitoring online queries can provide early and accurate information about the spread of COVID-19 in the population and about the effectiveness of COVID-19 epidemic control measures. Purpose of the study. To assess the significance of online queries regarding smell impairment for evaluating the epidemiological status and the effectiveness of COVID-19 epidemic control measures. Materials and methods. Weekly online queries from Russian Yandex users regarding smell impairment were analysed in regions and large cities of Russia from 16/3/2020 to 21/2/2021. A total of 81 regions of Russia and several large cities, such as Moscow, St. Petersburg, and Nizhny Novgorod, were included in the study. Results. A strong positive direct correlation (r > 0.7) was found between the number of smell-related queries in Yandex and the number of new cases of COVID-19 in 59 out of 85 Russian regions and large cities (70%). During the "first" peak of COVID-19 incidence in Russia (April-May 2020), the increase in the number of smell-related queries outpaced the increase in the number of new cases by 1-2 weeks in 23 out of 59 regions of Russia. During the "second" peak of COVID-19 incidence in Russia (October-December 2020), the increase in the number of smell-related queries outpaced the increase in the number of new cases by 1-2 weeks in 36 regions of Russia, including Moscow. We also estimated the increase in the query/new case ratio during the "second" peak of incidence for 45 regions. It was found that the query/new case ratio increased by more than 100% in 24 regions. The regions where the increase in queries was more than 160% compared to new infection cases during the "second" peak of incidence demonstrated significantly higher search activity related to levofloxacin than the regions where the increase in queries was lower than 160% compared to the increase in new infection cases. Conclusion. 
The sudden interest in smell impairment and growing frequency of online queries among the population can be used as an indicator of the spread of coronavirus infection among the population as well as for evaluation of the effectiveness of COVID-19 epidemic control measures. Keywords: COVID-19, SARS-CoV-2 coronavirus, Yandex.Wordstat, correlation, query, sense of smell.
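
The lead of 1-2 weeks reported above can be illustrated with a lagged Pearson correlation. The following is only a minimal sketch, not the authors' procedure: the function names `pearson` and `best_lead`, the `max_lead` parameter, and the weekly counts are all hypothetical and chosen for illustration.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def best_lead(queries, cases, max_lead=4):
    """Return (lead, r) for the lead (in weeks) with the highest correlation.

    A lead of k pairs the query count of week t with the case count of
    week t + k, so a positive lead means queries rise before cases do.
    """
    scores = {}
    for lead in range(max_lead + 1):
        shifted_queries = queries[: len(queries) - lead]
        later_cases = cases[lead:]
        scores[lead] = pearson(shifted_queries, later_cases)
    lead = max(scores, key=scores.get)
    return lead, scores[lead]


# Hypothetical weekly counts (not data from the study): cases follow the
# same curve as queries, two weeks later.
weekly_queries = [10, 20, 40, 80, 60, 30, 15, 10, 8, 6]
weekly_cases = [5, 5, 10, 20, 40, 80, 60, 30, 15, 10]

lead, r = best_lead(weekly_queries, weekly_cases)
print(f"queries lead cases by {lead} week(s), r = {r:.3f}")
```

In this toy series the case curve is an exact copy of the query curve shifted by two weeks, so the best lead is 2; real regional series would be noisier and would need a significance check as well.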


2021 ◽  
Vol 4 ◽  
Author(s):  
Janan Arslan ◽  
Kurt K. Benke

The COVID-19 pandemic produced a very sudden and serious impact on public health around the world, greatly adding to the burden of overloaded professionals and national medical systems. Recent medical research has demonstrated the value of using online systems to predict emerging spatial distributions of transmittable diseases. Concerned internet users often resort to online sources in an effort to explain their medical symptoms. This raises the prospect that incidence of COVID-19 may be tracked online by search queries and social media posts analyzed by advanced methods in data science, such as Artificial Intelligence. Online queries can provide early warning of an impending epidemic, which is valuable information needed to support planning timely interventions. Identification of the location of clusters geographically helps to support containment measures by providing information for decision-making and modeling.


2020 ◽  
Vol 14 (3) ◽  
pp. 351-363
Author(s):  
Yue Wang ◽  
Ruiqi Xu ◽  
Zonghao Feng ◽  
Yulin Che ◽  
Lei Chen ◽  
...  

Measuring similarities among different nodes is important in graph analysis. SimRank is one of the most popular similarity measures. Given a graph G(V, E) and a source node u, a single-source SimRank query returns the similarities between u and each node v ∈ V. This type of query is often used in link prediction, personalized recommendation and spam detection. Since dealing with a large graph is beyond the ability of a single machine due to its limited memory and computational power, it is necessary to process single-source SimRank queries in a distributed environment, where the graph is partitioned and distributed across multiple machines. However, most current solutions are based on a shared-memory model, where the whole graph is loaded into a shared memory and all processors can access the graph randomly. It is difficult to deploy such algorithms on a shared-nothing model. In this paper, we present DISK, a distributed framework for processing single-source SimRank queries. DISK follows the linearized formulation of SimRank, and consists of offline and online phases. In the offline phase, a tree-based method is used to estimate the diagonal correction matrix of SimRank accurately, and in the online phase, single-source similarities are computed iteratively. Under this framework, we propose different optimization techniques to boost the indexing and queries. DISK guarantees both accuracy and parallel scalability, which distinguishes it from existing solutions. Its accuracy, efficiency, parallel scalability and scalability are also verified by extensive experimental studies. The experiments show that DISK scales up to graphs of billions of nodes and edges, and answers online queries within seconds, while ensuring the accuracy bounds.
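
To make the notion of a single-source SimRank query concrete, here is a minimal single-machine sketch of the textbook iterative SimRank definition. It is not DISK and does not use the linearized formulation or the diagonal correction matrix mentioned above; the function names, the decay factor `c=0.6`, and the toy graph are assumptions made for illustration.

```python
def simrank(in_neighbors, c=0.6, iterations=10):
    """Naive all-pairs SimRank by fixed-point iteration.

    in_neighbors maps every node to the list of nodes that point to it.
    s(a, a) = 1; for a != b, s(a, b) is c times the average similarity
    over all pairs of in-neighbors, and 0 if either node has none.
    """
    nodes = list(in_neighbors)
    sim = {a: {b: 1.0 if a == b else 0.0 for b in nodes} for a in nodes}
    for _ in range(iterations):
        new_sim = {a: {} for a in nodes}
        for a in nodes:
            for b in nodes:
                if a == b:
                    new_sim[a][b] = 1.0
                    continue
                in_a, in_b = in_neighbors[a], in_neighbors[b]
                if not in_a or not in_b:
                    new_sim[a][b] = 0.0
                    continue
                total = sum(sim[i][j] for i in in_a for j in in_b)
                new_sim[a][b] = c * total / (len(in_a) * len(in_b))
        sim = new_sim
    return sim


def single_source_simrank(in_neighbors, u, c=0.6, iterations=10):
    """Similarities between source node u and every node of the graph."""
    return simrank(in_neighbors, c=c, iterations=iterations)[u]


# Hypothetical toy graph: a -> b, a -> c, b -> d, c -> d.
toy_graph = {"a": [], "b": ["a"], "c": ["a"], "d": ["b", "c"]}
print(single_source_simrank(toy_graph, "b"))
```

The naive iteration touches every pair of nodes in every round, which is exactly the cost that makes it unsuitable for graphs with billions of nodes and motivates distributed approaches.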


Algorithms ◽  
2020 ◽  
Vol 13 (11) ◽  
pp. 276
Author(s):  
Paniz Abedin ◽  
Arnab Ganguly ◽  
Solon P. Pissis ◽  
Sharma V. Thankachan

Let T[1,n] be a string of length n and T[i,j] be the substring of T starting at position i and ending at position j. A substring T[i,j] of T is a repeat if it occurs more than once in T; otherwise, it is a unique substring of T. Repeats and unique substrings are of great interest in computational biology and information retrieval. Given string T as input, the Shortest Unique Substring problem is to find a shortest substring of T that does not occur elsewhere in T. In this paper, we introduce the range variant of this problem, which we call the Range Shortest Unique Substring problem. The task is to construct a data structure over T answering the following type of online queries efficiently. Given a range [α,β], return a shortest substring T[i,j] of T with exactly one occurrence in [α,β]. We present an O(n log n)-word data structure with O(log_w n) query time, where w = Ω(log n) is the word size. Our construction is based on a non-trivial reduction allowing us to apply a recently introduced optimal geometric data structure [Chan et al., ICALP 2018]. Additionally, we present an O(n)-word data structure with O(√n log^ϵ n) query time, where ϵ > 0 is an arbitrarily small constant. The latter data structure relies heavily on another geometric data structure [Nekrich and Navarro, SWAT 2012].
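
The query itself can be pinned down with a brute-force baseline. The sketch below is not the data structure from the paper and has none of its query-time guarantees; it also assumes one particular reading of the definition, namely that occurrences are counted only when they lie entirely inside T[α,β]. The function name `range_sus` is hypothetical.

```python
def range_sus(text, alpha, beta):
    """Brute-force Range Shortest Unique Substring query.

    alpha and beta are 1-indexed and inclusive, as in the abstract.
    ASSUMPTION: an occurrence counts only if it lies fully inside
    text[alpha..beta]. Returns 1-indexed (i, j) of the leftmost shortest
    substring with exactly one such occurrence, or None for an empty range.
    """
    window = text[alpha - 1 : beta]
    m = len(window)
    for length in range(1, m + 1):
        # Count how often each substring of this length occurs in the window.
        counts = {}
        for start in range(m - length + 1):
            piece = window[start : start + length]
            counts[piece] = counts.get(piece, 0) + 1
        # Report the leftmost substring that occurs exactly once.
        for start in range(m - length + 1):
            piece = window[start : start + length]
            if counts[piece] == 1:
                i = alpha + start
                return i, i + length - 1
    return None


print(range_sus("abab", 1, 4))  # shortest unique substring inside "abab"
print(range_sus("abab", 1, 3))  # shortest unique substring inside "aba"
```

Each query here costs roughly cubic time in the range length, which is why a precomputed data structure with polylogarithmic query time is worth the extra space.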


10.2196/19788 ◽  
2020 ◽  
Vol 22 (9) ◽  
pp. e19788
Author(s):  
Atina Husnayain ◽  
Eunha Shim ◽  
Anis Fuad ◽  
Emily Chia-Yu Su

Background South Korea is among the best-performing countries in tackling the coronavirus pandemic by using mass drive-through testing, face mask use, and extensive social distancing. However, understanding the patterns of risk perception could also facilitate effective risk communication to minimize the impacts of disease spread during this crisis. Objective We attempt to explore patterns of community health risk perceptions of COVID-19 in South Korea using internet search data. Methods Google Trends (GT) and NAVER relative search volumes (RSVs) data were collected using COVID-19–related terms in the Korean language and were retrieved according to time, gender, age groups, types of device, and location. Online queries were compared to the number of daily new COVID-19 cases and tests reported in the Kaggle open-access data set for the time period of December 5, 2019, to May 31, 2020. Time-lag correlations calculated by Spearman rank correlation coefficients were employed to assess whether correlations between new COVID-19 cases and internet searches were affected by time. We also constructed a prediction model of new COVID-19 cases using the number of COVID-19 cases, tests, and GT and NAVER RSVs in lag periods (of 1-3 days). Single and multiple regressions were employed using backward elimination and a variance inflation factor of <5. Results The numbers of COVID-19–related queries in South Korea increased during local events including local transmission, approval of coronavirus test kits, implementation of coronavirus drive-through tests, a face mask shortage, and a widespread campaign for social distancing as well as during international events such as the announcement of a Public Health Emergency of International Concern by the World Health Organization. Online queries were also stronger in women (r=0.763-0.823; P<.001) and age groups ≤29 years (r=0.726-0.821; P<.001), 30-44 years (r=0.701-0.826; P<.001), and ≥50 years (r=0.706-0.725; P<.001). 
In terms of spatial distribution, internet search data were higher in affected areas. Moreover, greater correlations were found in mobile searches (r=0.704-0.804; P<.001) compared to those of desktop searches (r=0.705-0.717; P<.001), indicating changing behaviors in searching for online health information during the outbreak. These varied internet searches related to COVID-19 represented community health risk perceptions. In addition, as a country with a high number of coronavirus tests, results showed that adults perceived coronavirus test–related information as being more important than disease-related knowledge. Meanwhile, younger and older age groups had different perceptions. Moreover, NAVER RSVs can potentially be used for health risk perception assessments and disease predictions. Adding COVID-19–related searches provided by NAVER could increase the performance of the model compared to that of the COVID-19 case–based model and potentially be used to predict epidemic curves. Conclusions The use of both GT and NAVER RSVs to explore patterns of community health risk perceptions could be beneficial for targeting risk communication from several perspectives, including time, population characteristics, and location.
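
The time-lag Spearman analysis described in this abstract can be sketched in plain Python. This is an illustrative sketch only, not the authors' code: the function names, the choice of lags, and the daily series are hypothetical, and ties are handled with average ranks, which is one common convention.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared_rank = (pos + end) / 2 + 1
        for k in order[pos : end + 1]:
            ranks[k] = shared_rank
        pos = end + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in rx))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in ry))
    return cov / (sd_x * sd_y)


def lagged_spearman(rsv, cases, lags=(0, 1, 2, 3)):
    """Correlate search volume on day t with new cases on day t + lag."""
    return {lag: spearman(rsv[: len(rsv) - lag], cases[lag:]) for lag in lags}


# Hypothetical daily series (not data from the study): cases on day t + 1
# are a monotone function of the search volume on day t.
daily_rsv = [3, 8, 5, 1, 9, 4, 7, 2]
daily_cases = [0, 9, 64, 25, 1, 81, 16, 49]

for lag, rho in lagged_spearman(daily_rsv, daily_cases).items():
    print(f"lag {lag} day(s): rho = {rho:.3f}")
```

Because Spearman's coefficient works on ranks, the one-day lag in this toy example scores a perfect correlation even though the relationship between the two series is not linear.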


2020 ◽  
Vol 54 (5) ◽  
pp. 1509-1524 ◽  
Author(s):  
Albert Cohen ◽  
Wolfgang Dahmen ◽  
Ronald DeVore ◽  
James Nichols

Reduced bases have been introduced for the approximation of parametrized PDEs in applications where many online queries are required. Their numerical efficiency for such problems has been theoretically confirmed in Binev et al. (SIAM J. Math. Anal. 43 (2011) 1457–1472) and DeVore et al. (Constructive Approximation 37 (2013) 455–466), where it is shown that the reduced basis space V_n of dimension n, constructed by a certain greedy strategy, has approximation error similar to that of the optimal space associated to the Kolmogorov n-width of the solution manifold. The greedy construction of the reduced basis space is performed in an offline stage which requires at each step a maximization of the current error over the parameter space. For the purpose of numerical computation, this maximization is performed over a finite training set obtained through a discretization of the parameter domain. To guarantee a final approximation error ε for the space generated by the greedy algorithm requires in principle that the snapshots associated to this training set constitute an approximation net for the solution manifold with accuracy of order ε. Hence, the size of the training set is the ε-covering number of the solution manifold M, and this covering number typically behaves like exp(Cε^(−1/s)) for some C > 0 when the solution manifold has n-width decay O(n^(−s)). Thus, the sheer size of the training set prohibits implementation of the algorithm when ε is small. The main result of this paper shows that, if one is willing to accept results which hold with high probability, rather than with certainty, then for a large class of relevant problems one may replace the fine discretization by a random training set of size polynomial in ε^(−1). Our proof of this fact is established by using inverse inequalities for polynomials in high dimensions.


2020 ◽  
Author(s):  
Atina Husnayain ◽  
Eunha Shim ◽  
Anis Fuad ◽  
Emily Chia-Yu Su

BACKGROUND South Korea is among the best-performing countries in tackling the coronavirus pandemic by using mass drive-through testing, face mask use, and extensive social distancing. However, understanding the patterns of risk perception could also facilitate effective risk communication to minimize the impacts of disease spread during this crisis. OBJECTIVE We attempt to explore patterns of community health risk perceptions of COVID-19 in South Korea using internet search data. METHODS Google Trends (GT) and NAVER relative search volumes (RSVs) data were collected using COVID-19–related terms in the Korean language and were retrieved according to time, gender, age groups, types of device, and location. Online queries were compared to the number of daily new COVID-19 cases and tests reported in the Kaggle open-access data set for the time period of December 5, 2019, to May 31, 2020. Time-lag correlations calculated by Spearman rank correlation coefficients were employed to assess whether correlations between new COVID-19 cases and internet searches were affected by time. We also constructed a prediction model of new COVID-19 cases using the number of COVID-19 cases, tests, and GT and NAVER RSVs in lag periods (of 1-3 days). Single and multiple regressions were employed using backward elimination and a variance inflation factor of <5. RESULTS The numbers of COVID-19–related queries in South Korea increased during local events including local transmission, approval of coronavirus test kits, implementation of coronavirus drive-through tests, a face mask shortage, and a widespread campaign for social distancing as well as during international events such as the announcement of a Public Health Emergency of International Concern by the World Health Organization. 
Online queries were also stronger in women (r=0.763-0.823; P<.001) and age groups ≤29 years (r=0.726-0.821; P<.001), 30-44 years (r=0.701-0.826; P<.001), and ≥50 years (r=0.706-0.725; P<.001). In terms of spatial distribution, internet search data were higher in affected areas. Moreover, greater correlations were found in mobile searches (r=0.704-0.804; P<.001) compared to those of desktop searches (r=0.705-0.717; P<.001), indicating changing behaviors in searching for online health information during the outbreak. These varied internet searches related to COVID-19 represented community health risk perceptions. In addition, as a country with a high number of coronavirus tests, results showed that adults perceived coronavirus test–related information as being more important than disease-related knowledge. Meanwhile, younger and older age groups had different perceptions. Moreover, NAVER RSVs can potentially be used for health risk perception assessments and disease predictions. Adding COVID-19–related searches provided by NAVER could increase the performance of the model compared to that of the COVID-19 case–based model and potentially be used to predict epidemic curves. CONCLUSIONS The use of both GT and NAVER RSVs to explore patterns of community health risk perceptions could be beneficial for targeting risk communication from several perspectives, including time, population characteristics, and location.


Author(s):  
Mustafa Khairallah

In this paper, we study a group of AEAD schemes that use rekeying as a technique to increase efficiency by reducing the state size of the algorithm. We provide a unified model to study the behavior of the keys used in these schemes, called Rekey-and-Chain (RaC). This model helps understand the design of several AEAD schemes. We show generic attacks on these schemes based on the existence of certain types of weak keys. We also show that the borderline between multi-key and single-key analyses of these schemes is not solid and the analysis can be performed independent of the master key, leading sometimes to practical attacks in the multi-key setting. More importantly, the multi-key analysis can be applied in the single-key setting, since each message is encrypted with a different key. Consequently, we show gaps in the security analysis of COMET and mixFeed in the single-key setting, which led the designers to provide overly optimistic security claims. In the case of COMET, full key recovery can be performed with 2^64 online queries and 2^64 offline queries in the single-key setting, or 2^46 online queries per user and 2^64 offline queries in the multi-key setting with ∼0.5 million users. In the case of mixFeed, we enhance the forgery adversarial advantage in the single-key setting by a factor of 2^67 compared to what the designers claim. More importantly, our result is just a lower bound of this advantage, since we show that the gap in the analysis of mixFeed depends on properties of the AES Key Schedule that are not well understood and require more cryptanalytic efforts to find a tighter advantage. After reporting these findings, the designers updated their security analyses and accommodated the proposed attacks.


2019 ◽  
Vol 9 (1) ◽  
Author(s):  
Mark Bun ◽  
Thomas Steinke ◽  
Jonathan Ullman

We consider the problem of answering queries about a sensitive dataset subject to differential privacy. The queries may be chosen adversarially from a larger set $Q$ of allowable queries in one of three ways, which we list in order from easiest to hardest to answer: Offline: The queries are chosen all at once and the differentially private mechanism answers the queries in a single batch. Online: The queries are chosen all at once, but the mechanism only receives the queries in a streaming fashion and must answer each query before seeing the next query. Adaptive: The queries are chosen one at a time and the mechanism must answer each query before the next query is chosen. In particular, each query may depend on the answers given to previous queries. Many differentially private mechanisms are just as efficient in the adaptive model as they are in the offline model. Meanwhile, most lower bounds for differential privacy hold in the offline setting. This suggests that the three models may be equivalent. We prove that these models are all, in fact, distinct. Specifically, we show that there is a family of statistical queries such that exponentially more queries from this family can be answered in the offline model than in the online model. We also exhibit a family of search queries such that exponentially more queries from this family can be answered in the online model than in the adaptive model. We also investigate whether such separations might hold for simple queries like threshold queries over the real line.
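
The interaction pattern of the online model (answer each query before seeing the next) can be sketched with the standard Laplace mechanism for counting queries. This is a generic illustration under stated assumptions, not a construction from the paper, and it says nothing about the separations the authors prove: the class name, the even split of the privacy budget across `max_queries` (basic composition), and the example data are all hypothetical.

```python
import random


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponentials."""
    return rng.expovariate(1 / scale) - rng.expovariate(1 / scale)


class OnlineCountingMechanism:
    """Answers counting queries one at a time, as they arrive.

    ASSUMPTION: the total budget epsilon is split evenly over at most
    max_queries queries. A counting query has sensitivity 1, so each
    answer receives Laplace noise of scale max_queries / epsilon.
    """

    def __init__(self, dataset, epsilon, max_queries, seed=None):
        self.dataset = list(dataset)
        self.per_query_epsilon = epsilon / max_queries
        self.remaining = max_queries
        self.rng = random.Random(seed)

    def answer(self, predicate):
        """Return a noisy count of rows satisfying predicate."""
        if self.remaining == 0:
            raise RuntimeError("privacy budget exhausted")
        self.remaining -= 1
        true_count = sum(1 for row in self.dataset if predicate(row))
        return true_count + laplace_noise(1 / self.per_query_epsilon, self.rng)


# Hypothetical sensitive dataset of ages and a stream of two queries.
ages = [23, 35, 41, 52, 67, 29, 48]
mechanism = OnlineCountingMechanism(ages, epsilon=1.0, max_queries=2, seed=7)
query_stream = [lambda age: age >= 40, lambda age: age < 30]
for query in query_stream:
    # Each query is answered before the next one is read from the stream.
    print(round(mechanism.answer(query), 2))
```

The same object would also serve an adaptive analyst, since nothing in `answer` depends on how the next query is chosen; that matches the abstract's remark that many mechanisms are as efficient in the adaptive model as in the offline one.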


2019 ◽  
Author(s):  
Amaryllis Mavragani ◽  
Gabriela Ochoa

Internet data are being increasingly integrated into health informatics research and are becoming a useful tool for exploring human behavior. The most popular tool for examining online behavior is Google Trends, an open tool that provides information on trends and the variations of online interest in selected keywords and topics over time. Online search traffic data from Google have been shown to be useful in analyzing human behavior toward health topics and in predicting disease occurrence and outbreaks. Despite the large number of Google Trends studies during the last decade, the literature on the subject lacks a specific methodology framework. This article aims at providing an overview of the tool and data and at presenting the first methodology framework in using Google Trends in infodemiology and infoveillance, including the main factors that need to be taken into account for a strong methodology base. We provide a step-by-step guide for the methodology that needs to be followed when using Google Trends and the essential aspects required for valid results in this line of research. At first, an overview of the tool and the data are presented, followed by an analysis of the key methodological points for ensuring the validity of the results, which include selecting the appropriate keyword(s), region(s), period, and category. Overall, this article presents and analyzes the key points that need to be considered to achieve a strong methodological basis for using Google Trends data, which is crucial for ensuring the value and validity of the results, as the analysis of online queries is extensively integrated in health research in the big data era.

